Showing 26 open source projects for "website crawler"

  • 1
    SiteOne Crawler

    SiteOne Crawler is a website analyzer and exporter

    SiteOne Crawler is a very useful and easy-to-use tool you'll ♥ as a Dev/DevOps, website owner or consultant. Works on all popular platforms - Windows, macOS, and Linux (x64 and arm64 too). It will crawl your entire website in depth, analyze and report problems, show useful statistics and reports, generate an offline version of the website, generate sitemaps, or send reports via email.
    Downloads: 7 This Week
  • 2
    crwlr

    Library for Rapid (Web) Crawler and Scraper Development

    This library provides a kind of framework and many ready-to-use "steps" that you can use as building blocks for your own crawlers and scrapers. Before diving into the library, let's have a look at the terms crawling and scraping. For most real-world use cases, those two things go hand in hand, which is why this library helps with and combines both. A (web) crawler is a program that (down)loads documents and follows the links in them to load those as well. A crawler...
    Downloads: 4 This Week
  • 3
    SiteOne Crawler (desktop app)

    A free, feature-rich web analyzer and exporter/cloner you will love!

    A free in-depth website analyzer providing audits of security, performance, SEO, accessibility and other technical aspects. Available as a desktop application for Windows/macOS/Linux and as a CLI tool for advanced users and CI/CD processes. It also includes an offline web page exporter (website clone, mirror).
    Downloads: 8 This Week
  • 4
    DocSearch

    The easiest way to add search to your documentation

    Initially created to fulfill our own developers' needs, DocSearch quickly evolved into a successful community project. Over the years, we've explored new ways to address the complexities of search for the open-source community. DocSearch understands how the user input fits into the context of your project and instantly presents the most relevant content with fewer interactions than any other method. With a design very close to the native experience on mobile, we leverage users acquaintance...
    Downloads: 4 This Week
  • 5

    ahCrawler

    A PHP search engine for your website and web analytics tool. GNU GPL3

    ahCrawler is a set of tools for implementing your own search on your website and an analyzer for your web content. It can be used on shared hosting. It consists of:
      * crawler (spider) and indexer
      * search for your website(s)
      * search statistics
      * website analyzer (HTTP headers, short titles and keywords, link checker, ...)
    You need to install it on your own server, so all crawled data stays in your environment.
    Downloads: 0 This Week
  • 6
    WFDownloader App

    Free batch downloader for image, wallpaper, video, audio, document,

    Use it as a bulk downloader for image galleries, wallpapers, audio/music, video, documents, and other media from supported websites. It can also download sequential website URLs that follow a certain pattern (e.g. image01.png to image100.png), and its built-in site crawler handles advanced link search or extraction. There is also special support for forum media and open directory downloading. It's a programmable downloader and also works with password-protected sites. Say goodbye to downloading one by one. ...
    Downloads: 276 This Week
  • 7
    ScrapBot 1.40 64bits

    Task automation software for accessing and manipulating website data.

    ScrapBot is a task automation software that allows you to access, authenticate, extract, and insert data on any website. The software utilizes JavaScript to execute tasks, eliminating the need for server or additional software installations. The system can control the accessed webpage through JavaScript, and the entire navigation can be viewed in the program window. The main.js script runs in a separate frame from the navigation frame but can access all page content without any restrictions.
    Downloads: 0 This Week
  • 8

    PHP mini vulnerability suite

    Multiple server/webapp vulnerability scanner

    github: https://github.com/samedog/phpmvs
    Downloads: 0 This Week
  • 9

    sponge

    A website crawler and links downloader command line tool

    sponge is a website crawler and links downloader command line tool
    Downloads: 0 This Week
  • 10
    ShadowSocksShare

    Python ShadowSocks framework

    This project obtains shared SS(R) accounts by crawling SS(R) sharing websites, parses them and verifies their connectivity, then redistributes the accounts and generates a subscription link. Because Google+ closes on April 2, 2019, and almost all of the available accounts crawled so far come from Google+, please keep an eye on this project's updates if you are building your own website, and redeploy using the latest source code.
    Downloads: 0 This Week
  • 11
    phoneutria
    A Java web crawler: multi-threaded, scalable, high-performance, extensible and polite. It can be used to crawl and index any web or enterprise domain and is configurable through an XML configuration file.
    Downloads: 0 This Week
  • 12
    ToroSearch Search Engine
    The "ToroSearch Search Engine" is a project I created in my spare time. You can add websites to your search engine, or pages of your website, and then search for websites on your own search engine or for pages of your website. ATTENTION: This is not a crawler. It just lists websites or pages. Originally I hosted it myself, and nobody could see the source code. But now I no longer have the time to host and develop it myself, and on SourceForge anyone can see it and change it for themselves. ...
    Downloads: 1 This Week
  • 13
    Products of the project: Java HTMLParser - VietSpider Web Data Extractor - Extractor VietSpider News. Click on "Show project details" to see more features of each product.
    Downloads: 0 This Week
  • 14
    Addons for IOSEC - DoS HTTP Security

    IOSec Addons are enhancements for web security and crawler detection

    ...Added Facebook Bot Support for "Facebot/1.0". Add this code to your website to prevent unauthorized stealing of your valuable content & block malicious bots from crawling your site.
    Downloads: 0 This Week
  • 15

    sitecheck

    Modular web site spider for web developers.

    More than just a link checker, sitecheck is a website spider (also known as a crawler) which can assist with SEO by testing an entire site plus both inbound links from search engines and outbound links to other sites for the following issues: looping redirects (HTTP 301/302), broken links (HTTP 404), server errors (HTTP 500), spelling mistakes, low readability scores (using the Flesch Reading Ease test), missing/empty/duplicate meta tags, duplicate content, slow page speed, W3C validation errors and accessibility errors. ...
    Downloads: 0 This Week
  • 16
    Zoozle Search & Download Suchmaschine

    Zoozle 2008 - 2010 Webpage, Tools and SQL Files

    Download search engine and directory with Rapidshare and Torrent: zoozle Download Suchmaschine. All the files that ran the world-leading German download search engine in 2010, with 500,000 unique visitors a day, and all the tools you need to set up a clone. The code contains: PHP files for zoozle; a Perl crawler for gathering new content into the database; and all the other cool tools I have...
    Downloads: 0 This Week
  • 17

    Aerc Crawler

    Software used to crawl websites and make sitemaps.

    Developed in Delphi X2 but fully compatible with Delphi 7, this tool scans a site specified by the user and, through the "<a href='..." tags, starts crawling it. Because the information is stored in a SQLli database, a map of the website can then be created.
    Downloads: 0 This Week
  • 18
    This fifth-generation Selenium web crawler crawls through the pages of a host website, searching for static and dynamic links, and is able to detect honeypot links.
    Downloads: 0 This Week
  • 19
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.
    Downloads: 10 This Week
  • 20
    Ex-Crawler
    Ex-Crawler is divided into 3 subprojects (crawler daemon, distributed GUI client, (web) search engine) which together provide a flexible and powerful search engine supporting distributed computing. More information: http://ex-crawler.sourceforge.net
    Downloads: 0 This Week
  • 21
    bee-rain is a web crawler that harvests and indexes files over the network. You can see the results on the bee-rain website: http://bee-rain.internetcollaboratif.info/
    Downloads: 1 This Week
  • 22
    The Java Sitemap Parser can parse a website's Sitemap (http://www.sitemaps.org/). This is useful for web crawlers that want to discover URLs from a website that is using the Sitemap Protocol. This project has been incorporated into crawler-commons (https://github.com/crawler-commons/crawler-commons) and is no longer being maintained.
    Downloads: 0 This Week
  • 23
    APC Anti Crawler is a PHP 5 class based on APC that can be used to limit the number of HTTP requests per IP. It stops web crawlers from downloading your entire website.
    Downloads: 0 This Week
  • 24
    LogCrawler is an Ant task for automatic testing of web applications. Using an HTTP crawler, it visits all pages of a website and checks the server logfiles for errors. Use it as a "smoke test" with a CI system such as CruiseControl.
    Downloads: 1 This Week
  • 25
    PHP Crawler is a simple website search script for small-to-medium websites. The only requirements are PHP and MySQL; no shell access is required.
    Downloads: 0 This Week