Search Results for "data extraction of website"

Showing 140 open source projects for "data extraction of website"

  • 1
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 13 This Week
  • 2
    Smile

    Statistical machine intelligence and learning engine

    Smile is a fast and comprehensive machine learning engine. With advanced data structures and algorithms, Smile delivers state-of-the-art performance. Compared to this third-party benchmark, Smile significantly outperforms R, Python, Spark, H2O, and xgboost. Smile is a couple of times faster than the closest competitor. The memory usage is also very efficient. If we can train advanced machine learning models on a PC, why buy a cluster? Write applications quickly in Java, Scala, or any JVM...
    Downloads: 6 This Week
  • 3
    SoapUI

    SoapUI is a free and open source cross-platform functional testing

    ...SoapUI Open Source is the simplest and easiest way to begin your API testing journey. APIs, or Application Programming Interfaces, have become the center of software development, connecting and transferring data and logic across disparate systems and applications. Luckily, testing them can greatly improve the efficiency of your testing strategy as a whole, helping you deliver software faster than ever.
    Downloads: 73 This Week
  • 4
    GeoServer

    GeoServer repository

    GeoServer is an open-source software server written in Java that allows users to share and edit geospatial data. Designed for interoperability, it publishes data from any major spatial data source using open standards. Being a community-driven project, GeoServer is developed, tested, and supported by a diverse group of individuals and organizations from around the world. GeoServer is the reference implementation of the Open Geospatial Consortium (OGC) Web Feature Service (WFS) and Web...
    Downloads: 3 This Week
  • 5
    Facebook SDK for Android

    Used to integrate Android apps with Facebook Platform

    ...The current version of the Facebook SDK for Android is version 11.2.0 and requires the Android API 15. Code and samples for the Facebook SDK for Android are available on GitHub. A secure and convenient way for people to log into your app or website by using their Facebook credentials. Enable people to post to Facebook from your app. People can share, send a message, and share to stories. Understand people's actions in your app and measure the effectiveness of your Mobile App Ads. Get data in and out of Facebook's social graph, query data, post stories, upload photos and perform other tasks.
    Downloads: 12 This Week
  • 6
    PublicCMS

    PublicCMS is an open source Java CMS system

    ...Supports full-site static generation, SSI, and partial static rendering of dynamic pages, improving the website's concurrent access capacity and SEO friendliness. It easily supports tens of millions of records and tens of millions of page views (PVs), and provides a lightweight solution with low resource consumption, high performance, and high availability for massive data access.
    Downloads: 1 This Week
  • 7
    ANTLR

    Parser generator to read, process, or translate structured text

    ...It’s widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with over 2 billion queries a day. The languages for Hive and Pig, the data warehouse and analysis systems for Hadoop, both use ANTLR. Lex Machina uses ANTLR for information extraction from legal texts. Oracle uses ANTLR within SQL Developer IDE and their migration tools. NetBeans IDE parses C++ with ANTLR. The HQL language in the Hibernate object-relational mapping framework is built with ANTLR.
    Downloads: 8 This Week
  • 8
    WFDownloader App

    Free batch downloader for image, wallpaper, video, audio, document,

    Use as an image gallery, wallpaper, audio/music, video, document, and other media bulk downloader from supported websites. Also use to download sequential website urls that have a certain pattern (e.g. image01.png to image100.png). Also use app's built-in site crawler for advanced link search or extraction. There is also special support for forum media and open directory downloading. It's a programmable downloader and also works with password protected sites. Say goodbye to downloading one by one. ...
    Downloads: 276 This Week
  • 9
    ProviewR
    ProviewR is a modern, powerful and general process control system. It contains all functions normally required for successful sequential control, adjustment, data acquisition, communication, supervision, etc. See also our website www.proviewr.se
    Downloads: 84 This Week
  • 10
    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby and MySQL Database - Written in Java Cross Platform Also See Free email Sender : https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application https://www.microsoft.com/openjdk
    Downloads: 32 This Week
  • 11
    Hibernate

    An object relational-mapping (ORM) library for Java

    The Hibernate projects offer a suite of powerful Java libraries to work with data. It is best known for Hibernate ORM, which provides relational persistence for Java models and is an implementation of the Jakarta Persistence specification. Hibernate projects do not consistently release binaries or documentation to SourceForge anymore. For up-to-date information, refer to the Hibernate website: * Hibernate ORM: https://hibernate.org/orm/ * Hibernate Validator: https://hibernate.org/validator/ * Hibernate Search: https://hibernate.org/search/ That website will also be updated with newer projects, such as Hibernate Reactive.
    Downloads: 472 This Week
  • 12
    M. Vezelis Draw

    Project productivity & cost estimations tool which creates Gantt Chart

    A project development productivity and cost estimation tool that presents its results as multiple custom diagrams, including a Gantt chart. The tool provides succinct project-metrics information, plus on-demand information under each diagram. It lets the user calculate Function Points and estimate cost based on the COCOMO Basic and Intermediate models.
    Downloads: 0 This Week
  • 13

    LegacyInsight

    Legacy reverse engineering tool

    LegacyInsight is an AI-powered reverse engineering platform that transforms legacy software systems into comprehensible business logic. Using cutting-edge GenAI, it analyzes legacy code and extracts core operations, business rules, and data transformations, all translated into natural language. LegacyInsight supports enterprise-grade systems built on Java, COBOL, .NET and other legacy stacks, helping organizations reclaim understanding of business-critical code.
    Downloads: 0 This Week
  • 14
    FullSync

    Easy file synchronization for everyone

    FullSync is a powerful tool that helps you keep multiple copies of various data in sync. For example, it can update your website using (S)FTP, back up your data, or refresh a working copy from a remote server. It offers flexible rules, a scheduler, and more.
    Downloads: 15 This Week
  • 15
    JavaComm

    Desktop communication tool: chat, telephony, multimedia

    ...The JavaComm Server runs on any Jakarta EE Web Profile 10 platform with the required extensions. Tomcat 10.1.44 is used as the reference server for development and deployment. A SQL database stores global system data, with MariaDB 10.5.29 serving as the reference database via a standard JDBC environment. User documentation, written from the end-user perspective, and detailed administrator guides are available on the JavaComm homepage. Visit the website for installation instructions, configuration help, and full feature descriptions.
    Downloads: 8 This Week
  • 16
    TigerSafe

    Free open source password manager

    TigerSafe is a free open source password manager. It lets you store passwords in a file, without an internet connection, by encrypting them with a single password. You can then use a different password for every website and only have to remember one: the password used to encrypt and decrypt the file storing the others. It is highly recommended to back up the password file created with TigerSafe, for example by copying it to USB flash drives or cloud drives like Google Drive or Dropbox, since users are solely responsible for their data. ...
    Downloads: 0 This Week
  • 17
    uniCenta POS

    uniCenta oPOS - dynamically evolving POS project

    ...Get the latest uniCenta oPOS 5.4.0 https://unicenta.com/download-files/ if you would like to make a contribution and support the project or need business support help. 📢 uniCenta oPOS 5.4.0 is fully integrated with WooCommerce! ✅ Run your website and store with the same data ✅ Support table ordering at your restaurant ✅ Click and Collect ✅ Pop-up events with your online products ✅ Centralised customers for loyalty programs ✅ Centralised customers for discounts / promotions Please see the changelog here: https://unicenta.com/support/changelog-unicenta-opos/ uniCenta oPOS is a multi-lingual (17 languages), commercial-grade POS (Point-Of-Sale) software. ...
    Downloads: 856 This Week
  • 18
    The Systems Biology Format Converter (SBFC) is a Java generic framework aiming to translate any systems biology model format into another. The SBFC Web Site providing detailed feature description and instructions for installing, using and developing new modules is: http://sbfc.sourceforge.net/ . Currently, SBFC allows the user to convert SBML models into the formats: BioPax, Matlab, Octave, XPP, DOT, and SBGN. Due to its modular design fast development and addition of new converters...
    Downloads: 0 This Week
  • 19
    FluentLenium

    FluentLenium is a web & mobile automation framework

    FluentLenium is a React-ready website automation framework that extends Selenium to write readable, reusable, reliable and resilient UI functional tests. It’s written and maintained by people who are automating browser-based tests on a daily basis. FluentLenium provides a Java-fluent interface to Selenium, and brings some magic to avoid common issues faced by Selenium users. FluentLenium is shipped with adapters for JUnit4, JUnit5, TestNG, Spock, Spring TestNG, Cucumber and Kotest, but it...
    Downloads: 0 This Week
  • 20
    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Please follow this link to get latest version https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby OR MySQL Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby Database - Written in Java Cross Platform See also Free Email Sender in this link: https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application https://www.microsoft.com/openjdk
    Downloads: 3 This Week
  • 21

    OpenVigil

    Open pharmacovigilance data extraction, mining and analysis tool

    OpenVigil provides a web interface to analyse pharmacovigilance data, i.e., spontaneous or systematic collections of treatments (drugs) and observed adverse events ("drug side effects"). FDA Adverse Event Reporting System (AERS) and other pharmacovigilance data (e.g., Canadian or German) are supported. The OpenVigil web-based analysis tools offer several analysis modes, such as extraction, filtering, and mining of data, analyses via measures of disproportionality (such as proportional reporting ratios or reporting odds ratios), and export to spreadsheet programs like Microsoft Excel or statistics programs like R. ...
    Downloads: 17 This Week
  • 22
    JKlubTV

    Java Schach Turnierverwaltung / Java Chess Tournament Management

    Manage chess round-robin tournaments with JKlubTV version 3.0.0-beta. This application stores round-robin tournament data in a SQLite database. It is intended for webmasters who want to easily manage a club tournament in their own chess club. The application generates the HTML tables to be published on the website. It automatically calculates each player's total points, Sonneborn-Berger points, and follow-up DWZ rating, and sorts the HTML table by the calculated ranking. Furthermore, the tables can be saved as a PDF file. ...
    Downloads: 11 This Week
  • 23
    Perceptron

    The birth of modern video feedback art.

    Perceptron is a video feedback engine with a variety of extraordinary graphical effects. Perceptron is an endless flow of transforming visuals. Perceptron * recursively transforms images and video streams in realtime and produces a combination of Julia fractals, IFS fractals, and chaotic patterns due to video feedback * evolves geometric patterns into the realm of infinite details and deepens the thought * records animations (movies) * saves and opens presets...
    Downloads: 4 This Week
  • 24
    SiteofSiteIDE

    Static site IDE is a static site generator, also known as a static site editor

    Static website generator instead of php/asp for maximum speed (an element valued by SEO strategies). In reality, a minimum of php/asp code is used to establish the browser language and cookie management. Support for the GDPR is included as an example (it should be modified according to the processing of the website owner's data).
    Downloads: 1 This Week
  • 25
    PDFLayoutTextStripper

    Converts a pdf file into a text file while keeping the layout

    Converts a PDF file into a text file while keeping the layout of the original PDF. Useful to extract the content from a table or a form in a PDF file. PDFLayoutTextStripper is a subclass of PDFTextStripper class (from the Apache PDFBox library).
    Downloads: 0 This Week