[go: up one dir, main page]

Showing 35 open source projects for "hadoop project"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • EasySend is a no-code platform that transforms customer journeys Icon
    EasySend is a no-code platform that transforms customer journeys

    Defy form limits. 
Create digital experiences.

    Evolve forms into smart, AI-powered digital workflows that streamline your data intake and elevate customer experiences.
    Learn More
  • 1
    Apache Bigtop

    Apache Bigtop

    Bigtop is an Apache Foundation project for Infrastructure Engineers

    Apache Bigtop is a project focused on building and packaging the Hadoop ecosystem and related big data components. It provides a consistent framework for testing, packaging, and deploying Hadoop distributions, including tools like HDFS, YARN, Spark, Hive, HBase, and more. By maintaining cross-platform builds (RPMs, DEBs, Docker images, and Kubernetes support), Bigtop makes it easier for organizations to deploy big data stacks in different environments.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    SeaweedFS

    SeaweedFS

    Distributed storage system for blobs, objects, files, and data lake

    ...Blob store has O(1) disk seek, local tiering, cloud tiering. Filer supports cross-cluster active-active replication, Kubernetes, POSIX, S3 API, encryption, Erasure Coding for warm storage, FUSE mount, Hadoop, WebDAV. SeaweedFS is an independent Apache-licensed open source project with its ongoing development made possible because of the community. SeaweedFS is a simple and highly scalable distributed file system. There are two objectives, to store billions of files, and to serve the files fast! SeaweedFS started as an Object Store to handle small files efficiently. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3
    SZT-bigdata

    SZT-bigdata

    SZT‑bigdata is an open source project

    SZT‑bigdata is an open-source project analyzing real Shenzhen metro (subway) card usage data using big‑data frameworks like Spark, Hadoop, Hive, Kafka, Flink, ClickHouse, HBase, and Elasticsearch. Aimed at exploring transit passenger flow patterns and system optimization using a variety of Scala-based technologies.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    Open Source Data Quality and Profiling

    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    ...It also had Hadoop ( Big data ) support to move files to/from Hadoop Grid, Create, Load and Profile Hive Tables. This project is also known as "Aggregate Profiler" Resful API for this project is getting built as (Beta Version) https://sourceforge.net/projects/restful-api-for-osdq/ apache spark based data quality is getting built at https://sourceforge.net/projects/apache-spark-osdq/
    Downloads: 7 This Week
    Last Update:
    See Project
  • Attack Surface Management | Criminal IP ASM Icon
    Attack Surface Management | Criminal IP ASM

    For security operations, threat-intelligence and risk teams wanting a tool to get access to auto-monitored assets exposed to attack surfaces

    Criminal IP’s Attack Surface Management (ASM) is a threat-intelligence–driven platform that continuously discovers, inventories, and monitors every internet-connected asset associated with an organization, including shadow and forgotten resources, so teams see their true external footprint from an attacker’s perspective. The solution combines automated asset discovery with OSINT techniques, AI enrichment and advanced threat intelligence to surface exposed hosts, domains, cloud services, IoT endpoints and other Internet-facing vectors, capture evidence (screenshots and metadata), and correlate findings to known exploitability and attacker tradecraft. ASM prioritizes exposures by business context and risk, highlights vulnerable components and misconfigurations, and provides real-time alerts and dashboards to speed investigation and remediation.
    Learn More
  • 5

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    ...Instead, MarDRe takes advantage of the MapReduce programming model to significantly improve ParDRe performance on distributed systems, especially on cloud-based infrastructures. Written in pure Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for Big Data processing.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    apache spark data pipeline osDQ

    apache spark data pipeline osDQ

    osDQ dedicated to create apache spark based data pipeline using JSON

    This is an offshoot project of open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub project will create apache spark based data pipeline where JSON based metadata (file) will be used to run data processing , data pipeline , data quality and data preparation and data modeling features for big data. This uses java API of apache spark.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    X-RIME is a open source project devoted to provide Hadoop based solution for large scale social network analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8

    RSS Atom Feed Analytics With MapReduce

    This is a data analytics project for RSS feeds using hadoop MapReduce

    This project accepts the output of jatomrss project as the input. It applies the MR logic on the same to perform the analytics
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    ankus

    ankus

    Data Mining and Machine Learning Algorithms based on MapReduce

    [The feature of ankus] * ankus is a 'web-based big data mining project and tool'. - MapReduce-based data mining/machine learning algorithms library - Hadoop-based distributed bigdata system - offering a web-based GUI for easy use [The ankus project & License] * The ankus project consists of three as an open source. * ankus has Dual licensed under the community and commercial licenses
    Downloads: 0 This Week
    Last Update:
    See Project
  • ACI Learning: Internal Audit, Cybersecurity, and IT Training Icon
    ACI Learning: Internal Audit, Cybersecurity, and IT Training

    Proven skill building for every aspect of your support or IT team.

    Traditional training doesn't equip employees with the practical skills they need to drive business success. ACI Learning provides hands-on IT and cybersecurity training designed to build real-world, on-the-job skills. Our outcome-based programs empower employees with certification prep, industry-recognized credentials, and flexible learning options. With expert-led video training, labs, and scalable solutions, we help businesses, individuals, governments, and academic institutions develop a skilled workforce, align with business goals, and stay ahead in a rapidly evolving digital world.
    Learn More
  • 10
    hmrjp-maven-plugin

    hmrjp-maven-plugin

    Hadoop mapreduce maven plugin

    hmrjp-maven-plugin is a maven plugin which helps creating, running and verifying hadoop mapreduce jobs remotely just like any other java project which is built using maven.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    ...The command line tools of Hadoop-BAM should be understandable to all users, but they are limited in scope. See the SeqPig project for a higher-level interface to the file formats supported by Hadoop-BAM: http://seqpig.sourceforge.net See Seal for Hadoop-based read alignment tools, Seal: http://biodoop-seal.sourceforge.net
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12

    Volt

    Volt is pure JAVA NGS mapping soft which run on Hadoop 2.0 env

    The project move to VoltMR http://voltmr.sourceforge.net
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Hadoop

    Hadoop

    Use Hadoop in Scientific Workflows

    This project provides with an application to integrate Hadoop with WS-PGRADE workflows. It uses Openstack cloud to create user specified Hadoop clusters and execute jobs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Flamingo Project

    Flamingo Project

    Workflow Designer, Hive Editor, Pig Editor, File System Browser

    Flamingo is a open-source Big Data Platform that combine a Ajax Rich Web Interface + Workflow Engine + Workflow Designer + MapReduce + Hive Editor + Pig Editor. 1. Easy Tool for big data 2. Use comfortable in Hadoop EcoSystem projects 3. Based GPL V3 License Supporting Pig IDE, Hive IDE, HDFS Browser, Scheduler, Hadoop Job Monitoring, Workflow Engine, Workflow Designer, MapReduce.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Cascalog

    Cascalog

    Data processing on Hadoop without the hassle

    ...The best way to get started with Cascalog is experiment with the toy datasets that ship with the project.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    UI To the Hadoop HBase Project
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    HDFSFileTransfer

    File transfer from local FS to HDFS

    The HDFSFileTransfer project was created and developed to ease Hadoop users quickly copying varied files such as: flat, structured, unstructured, big and small from linux to Hadoop File System (HDFS). It allows users to transfer files: - within the same physical machine - from local file system (linux) into HDFS - between two physical machines - copy files from local file system (linux) with HDFS cluster installed to another HDFS cluster.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Aspose for Hadoop

    Aspose for Hadoop

    This project holds source code for Aspose for Hadoop project.

    Aspose for Hadoop project enables Apache Hadoop / MapReduce developers to work with various binary file formats. The developers can create and convert binary sequence files into text sequence files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Crowbar

    Crowbar

    A complete operations platform to deploy, maintain and scale clusters.

    The Crowbar Project is an effort to build a complete, easy to use operational platform for everyone. It allows for any number of physical nodes to be moved from bare-metal to production cluster within hours. Specific applications include (but are not limited to) Hadoop and OpenStack.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    BIRT Report Designer

    BIRT Report Designer

    Open Source Reporting & Data Visualization Platform

    BIRT is an open source technology platform used to create data visualizations and reports that can be embedded into rich client and web applications. Developers who use BIRT Designer are able to access information from multiple data sources easily and quickly in order to create reports and applications with stunning data visualizations. Actuate now provides a free report server, BIRT iHub F-Type, to deploy BIRT content so developers don't have to build their own infrastructure. With a...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 21
    Jxtadoop

    Jxtadoop

    This project aims to provide P2P capabilities with Hadoop DFS.

    Hadoop is designed to work in large datacenters with thousands of servers connected to each others in the Hadoop cloud. This project focuses on the Distributed File System part of Hadoop (HDFS). The goal of this project is to provide an alternative to direct IP connectivity required for Hadoop. Instead, the DFS layer has been modified to use a Peer-2-Peer framework which allows direct connectivity in datacenters as well as indirect connectivity to bypass firewall constraints. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    This project aims to reduce the data read redundancy in the Apache Hadoop.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23

    testHadoop

    Project to work with binary files on hadoop

    Aspose for Hadoop will enable hadoop developers to work with binary file formats on Hadoop by converting binary sequence files into text sequence files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    JUMMP

    JUMMP

    JUMMP: Job Uninterrupted Maneuverable MapReduce Platform

    ...Use of this project for research requires citation of our paper. W.C. Moody; L. Ngo; E. Duffy; A. Apon; "JUMMP: Job Uninterrupted Maneuverable MapReduce Platform", Cluster Computing (CLUSTER), 2013 IEEE International Conference on , 23-27 Sept. 2013
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Standalone HDFS
    Hadoop is a great project for deep analytics based on the MapReduce features. It also includes a powerful distributed file system designed to ensure that the analytics workloads can locally access the data to be processed to minimize the network bandwidth impact. I found this filesystem very useful to leverage storage from all my PCs and even from some of my online storage such as S3.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next