
Showing 60 open source projects for "pdf data mining"

  • 1
    Umbrel

    A beautiful personal server OS for Raspberry Pi or any Linux distro

    ...They’re a part of your private life, and now they can all be stored by you, in your home, on your Umbrel. The Bitcoin network is made up of thousands of nodes that verify every single transaction in the blockchain. Some of them mine Bitcoin too, but unlike a mining node, running a non-mining node doesn’t require expensive hardware. Achieve unparalleled privacy by connecting your wallet directly to the Bitcoin node on your Umbrel.
    Downloads: 77 This Week
  • 2
    Data Crow

    The ultimate cataloguer

    Data Crow allows you to use the standard movie & video (divx, xvid, DVD, Blu-ray, etc), book (and eBooks), images, board games, comic books, games & software, music (mp3 and other music files) cataloguing modules. Besides these modules, which you can change to fit your requirements, you can create new modules (want to catalogue your stamps, equipment, or anything else?). The GUI is skinnable. Reporting (using JasperReports and their community edition JasperSoft Developer Studio ), loan...
    Downloads: 267 This Week
  • 3
    ArchiveBox

    Open source self-hosted web archiving

    ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline. Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and they can't save every type of content. ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data....
    Downloads: 9 This Week
  • 4
    JupyterLab

    JupyterLab computational environment

    ...Documents and activities integrate with each other, enabling new workflows for interactive computing. JupyterLab also offers a unified model for viewing and handling data formats. JupyterLab understands many file formats (images, CSV, JSON, Markdown, PDF, Vega, Vega-Lite, etc.) and can also display rich kernel output in these formats. See File and Output Formats for more information. To navigate the user interface, JupyterLab offers customizable keyboard shortcuts and the ability to use key maps from vim, emacs, and Sublime Text in the text editor.
    Downloads: 272 This Week
  • 5
    File Converter

    Simple tool which allows you to convert and compress files

    File Converter is a minimalist open‑source tool (GPL‑3.0) that lets users convert and compress one or multiple files directly via the Windows Explorer context menu. It integrates with powerful back-end utilities—FFmpeg, ImageMagick, Ghostscript—to handle a broad range of media and document transformations. File Converter is a personal open source project started in 2014. I have put hundreds of hours adding, refining and tuning File Converter with the goal of making the conversion and...
    Downloads: 29 This Week
  • 6
    Jina

    Build cross-modal and multimodal applications on the cloud

    ...Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub. ...
    Downloads: 1 This Week
  • 7
    Perf Book

    The book "Performance Analysis and Tuning on Modern CPU"

    This project is a practical guide to performance analysis and tuning on modern CPUs, bridging microarchitecture details with hands-on profiling. It explains how caches, TLBs, prefetchers, branch predictors, and out-of-order execution influence real program speed, then connects those concepts to concrete optimization strategies. Readers learn how to design trustworthy benchmarks, avoid measurement traps (warmup, turbo, frequency scaling), and interpret hardware performance counters. The book...
    Downloads: 1 This Week
  • 8
    pdfcrack is a command line, password recovery tool for PDF-files.
    Downloads: 399 This Week
  • 9
    Drive Health Analyzer - SSD/HDD Monitor

    Monitor disk health, predict failures, track SSD/HDD SMART attributes

    Drive Health Analyzer is a comprehensive disk monitoring solution designed to prevent data loss by tracking the health status of SSDs and HDDs. The software reads SMART attributes, monitors temperature, analyzes disk performance, and predicts potential drive failures before they occur. It supports all major storage types including NVMe, SATA, and IDE drives. Features real-time alerts, detailed health reports, and automatic background monitoring. The intuitive dashboard displays critical...
    Downloads: 171 This Week
  • 10
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 7 This Week
  • 11
    recovery is a Live DVD/USB aimed at troubleshooting, disk partitioning, system rescue, backup and restore of data, and desktop use. It is a customized version of Debian Live. It contains GParted, Clonezilla, Boot-Repair, LibreOffice and many tools such as ddrescue, Nwipe, TestDisk, DejaDup and more. Version 2.4 (19 September 2025): recovery is now modular in design, meaning programs can be installed simply by double-clicking on module files. Added modules: - borg 1.4.1 - calibre...
    Downloads: 60 This Week
  • 12
    WP 34s

    Scientific/engineering firmware repurposing HP business calculators!

    This project has created scientific firmware for the HP-20b and HP-30b business calculators. WP 34S turns either of these calculators into a powerful keystroke programmable scientific device. According to our customers, it's the most powerful and fastest RPN scientific pocket calculator ever built. WP 34S has been alive and stable since 2011. We have succeeded in satisfying the pickiest users; read about their experiences at http://www.hpmuseum.org/forum/forum-8.html. Since 2014, WP...
    Downloads: 55 This Week
  • 13
    Open Crypto Tracker

    Bitcoin Alts portfolio tracker, email / text / alexa / telegram alerts

    100% FREE / open source / PRIVATE cryptocurrency portfolio tracker. Email / text / alexa / telegram price alerts, price charts, mining calcs, leverage / gain / loss / balance stats, news feeds +more. Privately track Bitcoin / Ethereum / unlimited cryptocurrencies. Customize as many assets / markets / alerts / charts as you want. Over 50 Exchanges / 40 Trading Pairs Supported (exchanges / pairings list at bottom of README.txt): https://tinyurl.com/ct-readme Nearly Unlimited Assets...
    Downloads: 4 This Week
  • 14
    Free Weighbridge Software with CCTV

    Smart Weighbridge Software with CCTV camera & WhatsApp Integration

    For any query, contact info@eagleweigh.com or visit www.eagleweigh.com. This is smart, easy-to-operate weighbridge software suitable for all kinds of weighbridges / Dharam Kanta. It provides secure, fraud-free operation of weighbridges with its advanced design and fraud detection features. It is a solution for enterprises and industries looking for an easy, transparent way to automate their weighbridge platforms. It is provided with lifetime...
    Downloads: 18 This Week
  • 15
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze image files locally and propose human-readable rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze the text content of documents to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common text files....
    Downloads: 231 This Week
  • 16
    dktools - Dirk Krauses tools

    Drawing, graphics conversion, software development, administration.

    GUI and command line tools for advanced users and administrators: wxdkdraw - Minimalistic drawing application for use with LaTeX, wxd2lat - Convert wxdkdraw files to LaTeX, bitmap2pp - Convert PNG/JPEG/TIFF/NetPBM to (E)PS or PDF, fig2lat - Convert XFig files to LaTeX, htmlbook - publish HTML like a book, dkcpre - C debugging and tracing preprocessor, itadmin - manage your IT using a MySQL/MariaDB database, dk-fic - file integrity checker, dk-ls - list files, output column order is configurable, dk-cat, dk-sort, dk-lines - text tools for *x and Windows, dk-send, dk-recv - transmit data stream, dk-t2h, dk-t2l - text to HTML or LaTeX conversion.
    Downloads: 20 This Week
  • 17
    Monitoring AIX, VMware,Oracle, Nutanix

    AIX, Linux, VMware, Nutanix, Oracle, RHV, Cloud performance monitoring

    The tool offers you end-to-end views of your server environment and can save you significant money in operation monitoring by predicting utilization bottlenecks in your virtualized environment. You can also generate policy-based alerts, capacity reports and load forecasts. The product supports these virtualization platforms: - IBM Power Systems - VMware - Nutanix - Proxmox - Huawei FusionCompute - OracleVM - Oracle Solaris LDOM, CDOM, Zone - oVirt / RedHat Virtualization...
    Downloads: 71 This Week
  • 18
    Monitoring Storage, SAN, LAN

    Storage, SAN, LAN Performance Monitoring: IBM,NetApp,Hitachi,HPE,EMC

    The tool offers you end-to-end views of your storage environment, including LAN and SAN, and can save you significant money in operation monitoring and by predicting utilization bottlenecks in your virtualized environment. You can also generate policy-based alerts, view the overall health status of your systems, reduce service downtime, and use capacity and forecasting data. Features: real-time storage performance visibility in multi-vendor LAN and SAN environments; historical reporting (graph, CSV, PDF); alerting based on performance thresholds; storage event monitoring. The tool supports enterprise-class storage devices from major storage vendors such as IBM, Dell EMC, NetApp, HPE, Hitachi, Lenovo, Pure Storage, Huawei, Dot Hill, INFINIDAT, Fujitsu, DataCore, Quantum, QNAP, FalconStor, Ceph, Synology, RAIDIX, Qumulo, Inspur, and Veritas. SAN monitoring supports Brocade, Qlogic, and Cisco SAN switches. LAN network monitoring is also covered. Try the demo at http://demo.stor2rrd.com
    Downloads: 18 This Week
  • 19
    RY GeoIP 3

    User-friendly network & geolocation tools, APIs integration and more!

    RY GeoIP 3 is a powerful application that combines network and geolocation tools for comprehensive analysis. With its user-friendly interface and integration with Google Maps API and API Ninja DNS Lookups service, you can perform a wide range of operations, from geolocation lookups and ping tests to DNS analysis, traceroute, SSL certificate inspection, header data retrieval, and open port scanning. The ability to save data as PDFs and maps as images further enhances the utility of the...
    Downloads: 2 This Week
  • 20
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/) . ...
    Downloads: 2 This Week
  • 21
    Form OCR Testing Tool

    A set of tools to use in Microsoft Azure Form Recognizer

    An open source labeling tool for Form Recognizer, part of the Form OCR Test Toolset (FOTT). This is a MAIN branch of the Tool. It contains all the newest features available. This is NOT the most stable version since this is a preview. The purpose of this repo is to allow customers to test the tools available when working with Microsoft Forms and OCR services. Currently, Labeling tool is the first tool we present here. Users could provide feedback, and make customer-specific changes to meet...
    Downloads: 2 This Week
  • 22
    DynaQ

    Innovative text document search. http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system to explore the personal information space, supporting you with the searching paradigm 'orienteering'. DynaQ is a (desktop)search engine with enhanced functionality for file, email and blog search. Look at our GitLab homepage for sourcecode and documentation: http://dynaq.opendfki.de
    Downloads: 0 This Week
  • 23
    FixSit

    Fix Mac Type/Creator data for MacOS on Linux

    This little tool scans your .HFS filesystem for files with "????UNIX" codes and tries to correct them. It currently supports .sit, .zip, and .pdf.
    Downloads: 0 This Week
  • 24
    sar2html
    Sar2html is a web-based frontend for performance monitoring. It converts sar binary data to graphical format and keeps historical data in its database. The project homepage is https://github.com/cemtan/sar2html.git. Supported operating systems: HPUX 11.11, 11.23, 11.31; Solaris 5.9, 5.10, 5.11; Redhat 3, 4, 5, 6, 7; Suse 8, 9, 10, 11, 12; Ubuntu 18, 20. If you have customers facing performance problems on the operating systems listed above, you may send sar2ascii to collect performance data. ...
    Downloads: 1 This Week
  • 25
    Xena - Digital Preservation Software

    Xena transforms files into open data formats

    Xena transforms files into open data formats for long-term digital preservation, encodes content in Base64 and wraps in XML metadata. Formats supported include MBOX, PST, MSG, DOC, XLS, PPT, RTF, PNG, XML, PDF, JPG, TIFF, PCX, WAV, MP3 and more. NO LONGER MAINTAINED, NO LONGER SUPPORTED
    Downloads: 1 This Week