[go: up one dir, main page]

Data Pipeline Tools for ChromeOS

Browse free open source Data Pipeline tools and projects for ChromeOS below. Use the toggles on the left to filter open source Data Pipeline tools by OS, license, language, programming language, and project status.

  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • La version gratuite d'Auth0 s'enrichit ! Icon
    La version gratuite d'Auth0 s'enrichit !

    Gratuit pour 25 000 utilisateurs avec intégration Okta illimitée : concentrez-vous sur le développement de vos applications.

    Vous l'avez demandé, nous l'avons fait ! Les versions gratuite et payante d'Auth0 incluent des options qui vous permettent de développer, déployer et faire évoluer vos applications en toute sécurité. Utilisez Auth0 dès maintenant pour découvrir tous ses avantages.
    Essayez Auth0 gratuitement
  • 1
    Pentaho

    Pentaho

    Pentaho offers comprehensive data integration and analytics platform.

    Pentaho couples data integration with business analytics in a modern platform to easily access, visualize and explore data that impacts business results. Use it as a full suite or as individual components that are accessible on-premise, in the cloud, or on-the-go (mobile). Pentaho enables IT and developers to access and integrate data from any source and deliver it to your applications all from within an intuitive and easy to use graphical tool. The Pentaho Enterprise Edition Free Trial can be obtained from https://pentaho.com/download/
    Leader badge">
    Downloads: 1,650 This Week
    Last Update:
    See Project
  • 2
    CloverDX

    CloverDX

    Design, automate, operate and publish data pipelines at scale

    Please, visit www.cloverdx.com for latest product versions. Data integration platform; can be used to transform/map/manipulate data in batch and near-realtime modes. Suppors various input/output formats (CSV,FIXLEN,Excel,XML,JSON,Parquet, Avro,EDI/X12,HL7,COBOL,LOTUS, etc.). Connects to RDBMS/JMS/Kafka/SOAP/Rest/LDAP/S3/HTTP/FTP/ZIP/TAR. CloverDX offers 100+ specialized components which can be further extended by creation of "macros" - subgraphs - and libraries, shareable with 3rd parties. Simple data manipulation jobs can be created visually. More complex business logic can be implemented using Clover's domain-specific-language CTL, in Java or languages like Python or JavaScript. Through its DataServices functionality, it allows to quickly turn data pipelines into REST API endpoints. The platform allows to easily scale your data job across multiple cores or nodes/machines. Supports Docker/Kubernetes deployments and offers AWS/Azure images in their respective marketplace
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3
    GenStage

    GenStage

    Producer and consumer actors with back-pressure for Elixir

    GenStage is a specification and set of behaviours for building demand-driven data pipelines on the BEAM. It formalizes the roles of producers, consumers, and producer-consumers, using back-pressure so that fast producers don’t overwhelm downstream stages. Developers implement callbacks like handle_demand and handle_events to control how items are emitted, transformed, and consumed across asynchronous boundaries. Because stages are OTP processes, you gain fault tolerance, supervised restarts, and concurrency tuned via configurable demand and partitioning. GenStage underpins higher-level libraries like Flow and Broadway, but it can also be used directly for custom pipelines where timing and throughput matter. Its clear separation of concerns encourages testable, composable stages that can be rearranged as requirements evolve. In production, this leads to predictable, resilient dataflows for event ingestion, batching, and parallel processing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Union Pandera

    Union Pandera

    Light-weight, flexible, expressive statistical data testing library

    The open-source framework for precision data testing for data scientists and ML engineers. Pandera provides a simple, flexible, and extensible data-testing framework for validating not only your data but also the functions that produce them. A simple, zero-configuration data testing framework for data scientists and ML engineers seeking correctness. Access a comprehensive suite of built-in tests, or easily create your own validation rules for your specific use cases. Validate the functions that produce your data by automatically generating test cases for them. Integrate seamlessly with the Python ecosystem. Overcome the initial hurdle of defining a schema by inferring one from clean data, then refine it over time. Identify the critical points in your data pipeline, and validate data going in and out of them. Build confidence in the quality of your data by defining schemas for complex data objects.
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    apache spark data pipeline osDQ

    apache spark data pipeline osDQ

    osDQ dedicated to create apache spark based data pipeline using JSON

    This is an offshoot project of open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub project will create apache spark based data pipeline where JSON based metadata (file) will be used to run data processing , data pipeline , data quality and data preparation and data modeling features for big data. This uses java API of apache spark. It can run in local mode also. Get json example at https://github.com/arrahtech/osdq-spark How to run Unzip the zip file Windows : java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c .\example\samplerun.json Mac UNIX java -cp ./lib/*:./osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ./example/samplerun.json For those on windows, you need to have hadoop distribtion unzipped on local drive and HADOOP_HOME set. Also copy winutils.exe from here into HADOOP_HOME\bin
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next