Showing 307 open source projects for "pdf to xml"

  • 1
    tableExport.jquery.plugin

    jQuery plugin to export an HTML table to JSON, XML, CSV, TSV, TXT, SQL

    jQuery plugin to export an HTML table to JSON, XML, CSV, TSV, TXT, SQL, Word, Excel, PNG, and PDF.
    Downloads: 1 This Week
  • 2
    carbone

    Fast and simple report generator, from JSON to PDF, XLSX, DOCX, ODT

    Turn your JSON into PDF, DOCX, XLSX, PPTX, ODS, and many more formats. A fast, simple, and powerful report generator that produces PDF, DOCX, XLSX, ODT, PPTX, ODS, XML, and CSV output from templates, with your JSON data as input.
    Downloads: 0 This Week
  • 3
    GROBID

    Machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents, with a particular focus on technical and scientific publications. Development started in 2008 as a hobby. In 2011 the tool was made available as open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from articles in PDF format. ...
    Downloads: 18 This Week
  • 4
    PyMuPDF

    Python bindings for MuPDF's rendering library.

    MuPDF is a lightweight PDF, XPS, and E-book viewer. MuPDF consists of a software library, command line tools, and viewers for various platforms. The renderer in MuPDF is tailored for high-quality anti-aliased graphics. It renders text with metrics and spacing accurate to within fractions of a pixel for the highest fidelity in reproducing the look of a printed page on the screen. The viewer is small, fast, yet complete. It supports many document formats, such as PDF, XPS, OpenXPS, CBZ, EPUB,...
    Downloads: 11 This Week
  • 5
    ExifTool

    ExifTool meta information reader/writer

    ExifTool is a battle-tested Perl application for reading, writing, and batch-editing metadata in thousands of file types—images, videos, audio, documents, and more. It understands major standards like EXIF, IPTC, and XMP as well as an enormous range of camera maker notes and container formats (for example, QuickTime/MP4, PDF, TIFF). Typical workflows include extracting metadata to JSON/CSV/XML, renaming files from timestamps or tags, shifting capture times, copying tags between files, and cleaning or normalizing fields for archives. The tool is meticulous about character encodings, time zones, and tag groups, helping avoid silent corruption when moving metadata between ecosystems. ...
    Downloads: 87 This Week
  • 6
    dvisvgm

    A fast DVI, EPS, and PDF to SVG converter

    The command-line utility dvisvgm is a tool for TeX/LaTeX users. It converts DVI, EPS, and PDF files to the XML-based vector graphics format SVG. In contrast to bitmap graphics, vector graphics are arbitrarily scalable without loss of quality. All modern web browsers support a large part of the SVG 1.1 standard. SVG files can also be displayed with the Java-based Squiggle SVG browser, which is part of the Apache Batik project, and with the free vector graphics editor Inkscape.
    Downloads: 2 This Week
  • 7
    OmniTools

    Self-hosted collection of powerful web-based tools for everyday tasks

    ...A key design choice is that file processing happens entirely on the client side, meaning your data stays in your browser instead of being sent to the backend. The tool catalog spans both technical and non-technical needs, including image, video, audio, PDF, text, date/time, math, and data format utilities like JSON/CSV/XML helpers. It’s also packaged for straightforward self-hosting, with a lightweight Docker image and simple run commands, so it can be deployed quickly on a homelab or internal network.
    Downloads: 2 This Week
  • 8
    changedetection.io

    The best free open source website change detection and restock service

    ...Monitor and track PDF file changes, and know when a PDF file has text changes. Know when your favourite product is on sale, or other special deals are announced before anyone else. Detect and monitor changes in JSON API responses.
    Downloads: 1 This Week
  • 9
    SILE

    The SILE Typesetter — Simon’s Improved Layout Engine

    SILE is a typesetting system; its job is to produce beautiful printed documents. Conceptually, SILE is similar to TeX (from which it borrows some concepts and even syntax and algorithms), but the similarities end there. Rather than being a derivative of the TeX family, SILE is a new typesetting and layout engine written from the ground up using modern technologies and borrowing some ideas from graphical systems such as InDesign.
    Downloads: 4 This Week
  • 10
    LaTeXML

    A TeX and LaTeX to XML/HTML/ePub/MathML translator

    LaTeXML is a tool that converts LaTeX documents into structured formats like HTML, MathML, and ePub. Unlike traditional TeX-to-PDF processors, LaTeXML preserves semantic content, making it suitable for web publishing, accessibility, and content reuse. It supports a wide range of LaTeX packages and is designed to enable high-quality rendering of mathematical and scientific documents.
    Downloads: 1 This Week
  • 11
    unioffice

    Pure Go library for creating and processing Office Word documents

    unioffice is a library for creation of Office Open XML documents (.docx, .xlsx and .pptx). Its goal is to be the most compatible and highest-performance Go library for the creation and editing of docx/xlsx/pptx files. Every release of our libraries is automatically tested against known vulnerabilities and does not pass unless everything is remediated. All changes are carefully reviewed by our team. Our SDK libraries are flexible and developed to solve common problems. We provide high-level...
    Downloads: 3 This Week
  • 12

    efactuur-pdf-nl

    PDF generation for Dutch UBL and SETU invoices

    The EfactuurNL2PDF project provides the following functionality: PDF generation for UBL or SETU Invoice documents, Schematron validation stylesheets, and Genericode validation stylesheets. The following HR-XML-NL and UBL-NL message versions are currently supported in this project: NLCIUS (si-ubl-2.0.1), UBL Invoice 1.9, UBL Invoice 1.8, UBL Invoice 1.7, UBL Invoice 1.6.3, UBL Invoice 1.6.2, UBL Invoice 1.1, SETU Invoice 2.0, SETU Invoice 1.8.1, SETU Invoice 1.7, SETU Invoice 1.6.4, SETU Invoice 1.1.
    Downloads: 1 This Week
  • 13
    bridgex

    Convert files like docx, xlsx, pptx, html, and more to Markdown

    ... - Support for multiple input formats. - Lightweight editing prior to saving. Supported formats: Bridgex supports conversion of PDF (.pdf), Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls, .csv), Outlook Messages (.msg), Text (.txt, .text), Markdown (.md, .markdown), JSON (.json, .jsonl), XML (.xml), RSS/Atom (.rss, .atom), HTML/MHTML (.html, .htm, .mhtml), ePub (.epub), Compressed files (.zip), Jupyter Notebooks (.ipynb), and other formats supported by Markitdown. Bridgex is not an IDE, text editor, Markdown editor, or document viewer.
    Downloads: 7 This Week
  • 14
    A program for creating PDF files with Data Matrix labels from a CSV file exported from CRPT (ЦРПТ). It can also generate XML files for putting codes into circulation and for transferring and accepting codes between legal entities and sole proprietors.
    Downloads: 72 This Week
  • 15
    Hypernomicon

    Hypertext-infused philosophy personal database software

    Hypernomicon is a personal productivity/database application for researchers that combines structured note-taking, mind-mapping, management of files (e.g., PDFs) and folders, and reference management into an integrated environment that organizes all of the above into semantic networks or hierarchies in terms of debates, positions, arguments, labels, terminology/concepts, and user-defined keywords by means of database relations and automatically generated hyperlinks (hence ‘Hyper’ in the...
    Downloads: 16 This Week
  • 16
    GanttProject

    Free project scheduling and management tool

    GanttProject is a project scheduling application written in Java and featuring Gantt chart, resource management, calendaring, import/export (MS Project, HTML, PDF, spreadsheets). Learn more on http://ganttproject.biz
    Downloads: 328 This Week
  • 17
    VDP Mail Merge 1.7

    High Volume Variable Data Printing & Mail Merge in PDF format

    Free and user-friendly variable data printing tool for imposing variable text onto PDF documents (mail merge, numbered tickets, Bates numbering, mass-mail addresses, customized brochures and postcards, contact information, voter lists, etc.). Capable of processing more than 2,000,000 pages or 10,000,000 records.
    Downloads: 6 This Week
  • 18
    SignServer
    SignServer is an application for server-side signatures called by other systems. It is flexible and can be customized to specific needs. SignServer has a ready-to-use timestamp server and signers for PDF, XML, ODF, PGP, OOXML and MRTD (ePassport DS).
    Downloads: 11 This Week
  • 19
    TEXminer

    Text Mining Classification for Texts in ASCII, Unicode and PDF Format.

    TEXminer uses generic text mining methods to analyze Unicode files as plain text or PDF. The text database can be saved in XML, where the original text, the sentence and word lists, and additional parameters (e.g. abbreviations) are stored. TEXminer allows language detection by letter frequency analysis, finding important words by co-occurrence analysis, determination of central expressions, thematic text classification (also semantic groups), fingerprint comparison, and word frequency. ...
    Downloads: 2 This Week
  • 20
    docconv

    Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

    A Go wrapper library to convert PDF, DOC, DOCX, XML, HTML, RTF, ODT, Pages documents and images (see optional dependencies below) to plain text. See go help install for details on the installation location of the installed docd executable. Make sure that the full path to the executable is in your PATH environment variable. To add image support to the docconv library you first need to install and build gosseract.
    Downloads: 2 This Week
  • 21
    C# ECG Toolkit

    ECG Toolkit support for: SCP-ECG, DICOM, HL7 aECG, ISHNE & MUSE-XML

    C# ECG Toolkit is an open source software toolkit to convert, view and print electrocardiograms. The toolkit is developed using C# .NET Framework 2.0 and later (code also supports netstandard2.0). Support for ECG formats: SCP-ECG, DICOM, HL7 aECG, ISHNE, MUSE-XML and OmronECG.
    Downloads: 26 This Week
  • 22

    FOray

    Modular XSL-FO Implementation for Java.

    FOray is an open-source XSL-FO publishing system that is suitable for converting XML content into PDF and other document formats. Although not yet fully conformant with the XSL-FO standard, it is very useful for many applications.
    Downloads: 4 This Week
  • 23

    JCS CFD (común)

    Common library of the fiscal receipts module (JCS CFD).

    Library for issuing and certifying fiscal receipts (CFDI, Annex 20). It also manages the files (XML and PDF) and sends them by email. If you need help using this library or support integrating it with another system, write to me by email. Your comments and feedback are important for strengthening and growing this project. Donations are welcome, and you can buy me a cup of hot chocolate at buymeacoffee.com/sjaguar.
    Downloads: 0 This Week
  • 24

    DocJGenerator

    Wiki generator and Java Help System

    Generates a wiki (interlinked HTML files) from a set of XML-formatted files. It can also add a help system to a Swing or JavaFX application. Instead of a wiki, it can generate a PDF, Word (docx), or EPUB document. The tool also provides a visual editor for editing the wiki, and the project supports both MediaWiki and Markdown syntax.
    Downloads: 60 This Week
  • 25
    KSeF API | Faktury KSeF

    Professional integration with the National e-Invoice System (KSeF)

    At ksefapi.pl we specialize in comprehensive solutions for the National e-Invoice System (Krajowy System e-Faktur, KSeF). We offer integration with KSeF, generation and sending of e-invoices in XML format, and invoice visualization (PDF or HTML). We provide fast and effective handling of KSeF invoices, including searching for and downloading sales and cost invoices. We help companies adapt to the new KSeF regulations by offering tools and technical support. Currently available tools: - Direct REST API calls: https://ksefapi.pl/api-test/swagger-ui/index.html#/ - PHP, JAVA, .NET programming libraries: https://ksefapi.pl/pobierz-ksef-php-java-net/ - Full and detailed documentation: https://ksef24.com/ - Mobile app (coming soon) - Web app (coming soon) Depending on your requirements, you can use our tools to communicate directly from your own system (on-premise) or through our architecture (SaaS).
    Downloads: 8 This Week