[go: up one dir, main page]

Scala Data Quality Tools

View 150 business solutions

Browse free open source Scala Data Quality Tools and projects below. Use the toggles on the left to filter open source Scala Data Quality Tools by OS, license, language, programming language, and project status.

  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • Easily build robust connections between Salesforce and any platform Icon
    Easily build robust connections between Salesforce and any platform

    We help companies using Salesforce connect their data with a no-code Salesforce-native solution.

    Like having Postman inside Salesforce! Declarative Webhooks allows users to quickly and easily configure bi-directional integrations between Salesforce and external systems using a point-and-click interface. No coding is required, making it a fast and efficient and as a native solution, Declarative Webhooks seamlessly integrates with Salesforce platform features such as Flow, Process Builder, and Apex. You can also leverage the AI Integration Agent feature to automatically build your integration templates by providing it with links to API documentation.
    Learn More
  • 1
    Feathr

    Feathr

    A scalable, unified data and AI engineering platform for enterprise

    Feathr is a data and AI engineering platform that is widely used in production at LinkedIn for many years and was open sourced in 2022. It is currently a project under LF AI & Data Foundation. Define data and feature transformations based on raw data sources (batch and streaming) using Pythonic APIs. Register transformations by names and get transformed data(features) for various use cases including AI modeling, compliance, go-to-market and more. Share transformations and data(features) across team and company. Feathr is particularly useful in AI modeling where it automatically computes your feature transformations and joins them to your training data, using point-in-time-correct semantics to avoid data leakage, and supports materializing and deploying your features for use online in production.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    apache spark data pipeline osDQ

    apache spark data pipeline osDQ

    osDQ dedicated to create apache spark based data pipeline using JSON

    This is an offshoot project of open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub project will create apache spark based data pipeline where JSON based metadata (file) will be used to run data processing , data pipeline , data quality and data preparation and data modeling features for big data. This uses java API of apache spark. It can run in local mode also. Get json example at https://github.com/arrahtech/osdq-spark How to run Unzip the zip file Windows : java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c .\example\samplerun.json Mac UNIX java -cp ./lib/*:./osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ./example/samplerun.json For those on windows, you need to have hadoop distribtion unzipped on local drive and HADOOP_HOME set. Also copy winutils.exe from here into HADOOP_HOME\bin
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next