[go: up one dir, main page]

Showing 118 open source projects for "php extract text pdf"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • Infor M3 ERP Icon
    Infor M3 ERP

    Enterprise manufacturers and distributors requiring a solution to manage and execute complex processes

    Efficiently executing the complex processes of enterprise manufacturers and distributors. Infor M3 is a cloud-based, manufacturing and distribution ERP system that leverages the latest technologies to provide an exceptional user experience and powerful analytics in a multicompany, multicountry, and multisite platform. Infor M3 and related CloudSuite™ industry solutions include industry-leading functionality for the chemical, distribution, equipment, fashion, food and beverage, and industrial manufacturing industries. Staying ahead of the competition means staying agile. Our new capabilities bring improved data-driven insights and streamlined workflows to help you make informed decisions and take quick action.
    Learn More
  • 1
    PdfPig

    PdfPig

    Read and extract text and other content from PDFs in C#

    This project allows users to read and extract text and other content from PDF files. In addition the library can be used to create simple PDF documents containing text and geometrical shapes.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 2
    py-pdf-parser

    py-pdf-parser

    A Python tool to help extracting information from structured PDFs

    py-pdf-parser is a Python tool designed to help extract information from structured PDFs. It provides a simple interface to define parsing rules and extract data from PDF documents. ​
    Downloads: 8 This Week
    Last Update:
    See Project
  • 3
    PDFsam

    PDFsam

    PDFsam, a desktop application to split, merge, mix, rotate PDF files

    PDFsam Basic is our free and open-source desktop application to split, merge, extract pages, rotate and mix PDF files. PDFsam Visual is a powerful tool to visually compose PDF files, reorder pages, delete pages, split, merge, rotate, encrypt, decrypt, extract text, convert to grayscale, crop PDF files. PDFsam Basic is written using JavaFX. Since version 4 it is released as a self-contained application and bundles a jlinked JDK while version 3 requires a Java Runtime Environment 8 with JavaFx installed in order to run.
    Downloads: 124 This Week
    Last Update:
    See Project
  • 4
    PyPDF

    PyPDF

    A pure-python PDF library capable of splitting, merging, cropping

    pypdf is a pure Python library for working with PDF files, allowing developers to split, merge, rotate, encrypt, and extract content from PDFs. It’s an actively maintained fork of PyPDF2, improving performance, compatibility, and support for modern PDF standards. Suitable for both automation scripts and full-featured applications, pypdf handles PDFs without requiring external dependencies.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Our xDM platform turns business users into data champions. Icon
    Our xDM platform turns business users into data champions.

    Discover the Intelligent Data Hub unique platform for Master Data Management

    It empowers organizations of any size to build trusted data applications quickly, with fast time to value using a single software platform for governance, master data, reference data, data quality, enrichment, and workflows.
    Learn More
  • 5
    pdfly

    pdfly

    CLI tool to extract (meta)data from PDF and manipulate PDF files

    A Python library designed for manipulating PDF files with functionalities for extraction, transformation, and document generation.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6
    Scribe.js

    Scribe.js

    JavaScript OCR and text extraction for images and PDFs

    Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    unipdf

    unipdf

    Golang PDF library for creating and processing PDF files (pure go)

    UniDoc UniPDF is a PDF library for Go (golang) with capabilities for creating and reading, processing PDF files. The library is written and supported by FoxyUtils.com, where the library is used to power many of its services. Every release of our libraries is automatically tested against known vulnerabilities and do not pass unless everything is remediated. All changes are carefully reviewed by our team.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    iLovePDF Api

    iLovePDF Api

    iLovePDF Rest Api - PHP Library

    Develop and automate PDF processing tasks like Compress PDF, merging PDF, Split PDF, converting Office to PDF, PDF to JPG, Images to PDF, adding Page Numbers, Rotate PDF, Unlocking PDF, stamping a Watermark, and Repair PDF. Each one with several settings to get your desired results. Strong infrastructure to offer the best-dedicated processing power. You might know us from ilovepdf.com where we process millions of PDFs daily. We offer a simple and concise API Reference and Guide as well as...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    PDFCraft

    PDFCraft

    PDFCraft is a free, privacy-focused PDF toolkit

    PDFCraft is an extensible toolkit for creating, editing, and transforming PDF documents with both a graphical interface and a scripting API, making it useful for users ranging from casual editors to automated document processors. At its core, the project provides a clean, modern UI where you can rearrange pages, annotate text, insert images, fill forms, and export to multiple formats, all without needing a heavyweight commercial PDF suite.
    Downloads: 23 This Week
    Last Update:
    See Project
  • Bitdefender Ultimate Small Business Security Icon
    Bitdefender Ultimate Small Business Security

    Protect the big future of your small business

    Get exceptional protection against all digital threats for your business and employees.
    Learn More
  • 10
    PyMuPDF

    PyMuPDF

    Python bindings for MuPDF's rendering library.

    MuPDF is a lightweight PDF, XPS, and E-book viewer. MuPDF consists of a software library, command line tools, and viewers for various platforms. The renderer in MuPDF is tailored for high-quality anti-aliased graphics. It renders text with metrics and spacing accurate to within fractions of a pixel for the highest fidelity in reproducing the look of a printed page on the screen. The viewer is small, fast, yet complete. It supports many document formats, such as PDF, XPS, OpenXPS, CBZ, EPUB,...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 11

    Image To Text tools

    ITTT is a Free tool designed to Scan and extract Text from Images.

    Image To Text Tools is a 100% Free user-friendly tool designed to Scan and extract containing text in images into editable text formats. Whether you need to extract text from scanned documents, photographs, or other image files, Image To Text Tools provides accurate and reliable Optical Character Recognition (OCR) capabilities to meet your needs.
    Downloads: 25 This Week
    Last Update:
    See Project
  • 12
    Umi-OCR

    Umi-OCR

    OCR software, free and offline

    Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines. The software supports flexible usage patterns including screenshot capture OCR, batch processing of large sets of images or documents, PDF parsing, QR code detection, and layout-aware paragraph output. ...
    Downloads: 51 This Week
    Last Update:
    See Project
  • 13
    "A free, open-source PDF editor for basic editing tasks"
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Image Toolbox

    Image Toolbox

    Image Toolbox is an powerful picture editor, which can crop

    Image Toolbox is a powerful picture editor, which can crop, apply filters, add some drawings, erase background, edit EXIF, or even create a PDF file.
    Downloads: 22 This Week
    Last Update:
    See Project
  • 15
    deepdoctection

    deepdoctection

    A Repo For Document AI

    DeepDoctection is a document AI framework that applies deep learning techniques to analyze and extract structured data from scanned documents, PDFs, and images. deepdoctection is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models but enables you to build pipelines using highly acknowledged libraries for object detection, OCR and selected NLP tasks and provides an integrated frameworks for...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    Sprint PDF Editor (Smarter PDF Solution)

    Sprint PDF Editor (Smarter PDF Solution)

    Edit, Convert, Extract , Export, Secure and PDF Imposition.

    Sprint PDF Editor® The Productive, Modern, Innovative, Clean & Colourful GUI. Faster, Smarter & Seamless workflows, with 50+ functions. Sprint PDF Editor & Reader, Complete PDF Solution, Supercharge Your Workflows With Imposition, Extract, Compress, Watermark, Protect & Secure, Split & Merge, Crop Pages, Printing, Stamp & more. Your Privacy, Our Priority Protect Your Data with Complete Confidence. Our software is designed to keep your information 100% secure. Unlike cloud-based...
    Downloads: 17 This Week
    Last Update:
    See Project
  • 17
    PDF Reader for Windows 7

    PDF Reader for Windows 7

    Free PDF reader for Windows 7

    PDF Reader for Windows 7 is a lightweight yet powerful tool designed to provide a smooth and efficient PDF viewing experience. This user-friendly software allows users to effortlessly open, view, and print PDF documents with a sleek, intuitive interface that makes navigation a breeze. Perfectly optimized for the Windows 7 environment, it offers rapid performance and compatibility with various PDF files, ensuring that even large documents load quickly without lag. In addition to its clean...
    Downloads: 59 This Week
    Last Update:
    See Project
  • 18
    Super PDF Editor (a Batch PDF Processor)

    Super PDF Editor (a Batch PDF Processor)

    Create, Edit, Delete, Organize , Convert, Export, Secure & Sign PDF.

    Super PDF Editor - Powerful, superfast, lightweight PDF processor. All-in-one PDF solution, PDF editing with 80+ tools and functions. The easy-to-use software is complete with editing tools for modifying PDF files your way. Most comprehensive, powerful, process-based and lightning-fast batch processor software. OCR PDF. PDF Imposition, Reverse Pages, Resize Page, Scale Page, Booklet, N-up Pages, Merge, Split by page, Extract Page, Rotate Page. Replace Page, Insert Page, Delete Page....
    Leader badge">
    Downloads: 23 This Week
    Last Update:
    See Project
  • 19
    TextExtractor

    TextExtractor

    Extracts plain text from a variety of different file types

    TextExtractor extracts plain text from hundreds of different file types, storing the text extracted in suitably named text files. TextExtractor 1.10 works in six different modes :- Instant Mode - Just select any file and extract the text from it. Batch Mode - Select a group of files and extract the text from all of them in one go. Polling Mode - Watch a folder location, processing new files as they appear there. Hierarchical Mode - Extract Text from files in a directory...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 20
    DataExtract

    DataExtract

    Extracts Data Types Like Email Addresses From All Kinds Of Files

    DataExtract is a program that scans files of many different types - text, PDF, Word, Excel etc, extracting all kinds of structured patterns, like email addresses and phone numbers, from them.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    pdf combiner merger converter splitter

    pdf combiner merger converter splitter

    PDF Combiner is a user-friendly, GUI-based tool built in

    PDF Combiner is a user-friendly open source free to use, GUI-based tool for combining, pdf to excel, pdf to word, image to pdf, zip, unzip annotate and splitting PDF files. It is easy to use, supports multiple file insert and delete and process, and allows you to adjust the order of files before combining.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 22
    DocWire SDK

    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20AI driven data processing tool, has received award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 23
    MyBox

    MyBox

    Easy Tools of PDF, Image, File, Network, Data, and Medias

    javafx-desktop-apps pdf image ocr icc barcode color-palette text bytes markdown html archive compress digest video audio editor converter media https://github.com/Mararsh/MyBox Self-contain packages need not java env nor installation. Jar packages need Java 16 or higher.
    Downloads: 55 This Week
    Last Update:
    See Project
  • 24
    Bible SuperSearch

    Bible SuperSearch

    Web-Based Bible Search Engine

    Advanced web-based Bible passage lookup and search script. Search entire Bible or only within the specified reference(s). Keeps users on your website and does not send them to our website. Install our API and run completely on your website! Full Boolean search with parentheses, proximity search. Look up multiple passages, parallel Bibles. Bible downloads also available: * PDF * Plain Text * MySQL dumps * JSON *...
    Leader badge">
    Downloads: 476 This Week
    Last Update:
    See Project
  • 25

    Path Lister

    Get list of files and folders tree

    Folder digging tool to retrieve the list of files based on a search criteria. You can get on the file name, or full path. You can get the file count based on a file mask.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next