Showing 66 open source projects for "data"

  • 1
    Synthetic Data Vault (SDV)

    Synthetic Data Generation for tabular, relational and time series data

    The Synthetic Data Vault (SDV) is a Synthetic Data Generation ecosystem of libraries that allows users to easily learn single-table, multi-table and time series datasets and later generate new synthetic data that has the same format and statistical properties as the original dataset. Synthetic data can then be used to supplement, augment and in some cases replace real data when training Machine Learning models.
    Downloads: 3 This Week
  • 2
    CTGAN

    Conditional GAN for generating synthetic tabular data

    CTGAN is a collection of Deep Learning based synthetic data generators for single table data, which are able to learn from real data and generate synthetic data with high fidelity. If you're just getting started with synthetic data, we recommend installing the SDV library which provides user-friendly APIs for accessing CTGAN. The SDV library provides wrappers for preprocessing your data as well as additional usability features like constraints. ...
    Downloads: 6 This Week
  • 3
    SDGym

    Benchmarking synthetic data generation methods

    The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking.
    Downloads: 2 This Week
  • 4
    YData Synthetic

    Synthetic data generators for tabular and time-series data

    A package to generate synthetic tabular and time-series data leveraging state-of-the-art generative models. Synthetic data is artificially generated data that is not collected from real-world events. It replicates the statistical components of real data without containing any identifiable information, ensuring individuals' privacy. This repository contains material related to Generative Adversarial Networks for synthetic data generation, in particular regular tabular data and time-series. ...
    Downloads: 1 This Week
  • 5
    LlamaIndex

    Central interface to connect your LLMs with external data

    LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data. LlamaIndex is a simple, flexible interface between your external data and LLMs. It provides the following tools in an easy-to-use fashion. Provides indices over your unstructured and structured data for use with LLMs. These indices help to abstract away common boilerplate and pain points for in-context learning. Dealing with prompt limitations (e.g. 4096 tokens for Davinci) when the context is too big. ...
    Downloads: 2 This Week
  • 6
    BertViz

    BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

    BertViz is an interactive tool for visualizing attention in Transformer language models such as BERT, GPT2, or T5. It can be run inside a Jupyter or Colab notebook through a simple Python API that supports most Huggingface models. BertViz extends the Tensor2Tensor visualization tool by Llion Jones, providing multiple views that each offer a unique lens into the attention mechanism. The head view visualizes attention for one or more attention heads in the same layer. It is based on the...
    Downloads: 3 This Week
  • 7
    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 7 This Week
  • 8
    PaddleNLP

    Easy-to-use and powerful NLP library with Awesome model zoo

    PaddleNLP is a natural language processing development library for PaddlePaddle. It has three major features: an easy-to-use text-domain API, application examples for multiple scenarios, and high-performance distributed training. It aims to improve PaddlePaddle developers' modeling and development efficiency in the text domain and provides rich examples of NLP applications. It offers industry-level preset task capabilities (Taskflow) and a full-process text-domain API: the Dataset API supports loading a rich set of Chinese datasets, the Data API handles data preprocessing flexibly and efficiently, the Embedding API presets 60+ pre-trained word vectors, and the Transformer API provides 100+ pre-trained models. Together these can greatly improve the efficiency of NLP task modeling.
    Downloads: 7 This Week
  • 9
    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    Deep Lake (formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos.
    Downloads: 1 This Week
  • 10
    Regex

    Generate matching and non matching strings based on regex patterns

    ...Follow the link to Online IDE with created project: JDoodle. Enter your pattern and see the results. By design a+, a* and a{n,} patterns in regex imply an infinite number of characters should be matched. When generating data, that would mean values of infinite length might be generated. It is highly doubtful anyone would require a string of infinite length, thus I've artificially limited repetitions in such patterns to 100 symbols when generating random values. Use a{n,m} if you require some specific number of repetitions. It is suggested to avoid using such infinite patterns to generate data based on regex.
    Downloads: 5 This Week
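
The capping rule described for the Regex project above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's own code or API: the names `MAX_UNBOUNDED_REPEATS`, `repetition_bounds` and `generate` are hypothetical, and the handling of `{n,}` when n exceeds 100 is an assumption made only to keep the range valid.

```python
import random

# Assumption: 100 mirrors the cap mentioned in the project description.
# Everything below is a hypothetical sketch, not the library's API.
MAX_UNBOUNDED_REPEATS = 100


def repetition_bounds(quantifier):
    """Map a quantifier such as '+', '*', '{2,}' or '{2,5}' to (min, max)."""
    if quantifier == "+":
        return 1, MAX_UNBOUNDED_REPEATS
    if quantifier == "*":
        return 0, MAX_UNBOUNDED_REPEATS
    if quantifier.startswith("{") and quantifier.endswith("}"):
        low, sep, high = quantifier[1:-1].partition(",")
        if not sep:
            # '{n}': exact count
            return int(low), int(low)
        if high == "":
            # '{n,}': open-ended, so the cap applies (assumed behaviour
            # for n > 100: keep the range valid by using n as the maximum)
            return int(low), max(int(low), MAX_UNBOUNDED_REPEATS)
        # '{n,m}': caller-chosen bounds, no cap needed
        return int(low), int(high)
    raise ValueError(f"unsupported quantifier: {quantifier!r}")


def generate(atom, quantifier, rng=random):
    """Repeat `atom` a random number of times allowed by `quantifier`."""
    low, high = repetition_bounds(quantifier)
    return atom * rng.randint(low, high)


if __name__ == "__main__":
    print(len(generate("a", "+")))      # somewhere between 1 and 100
    print(len(generate("a", "{2,5}")))  # somewhere between 2 and 5
```

With explicit `{n,m}` bounds the caller controls the length directly, which is why the description suggests preferring that form over open-ended patterns when generating data.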
  • 11
    Langflow

    Low-code app builder for RAG and multi-agent AI applications

    Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
    Downloads: 14 This Week
  • 12
    pwa-asset-generator

    Automates PWA asset generation and image declaration

    Automates PWA asset generation and image declaration. Automatically generates icon and splash screen images, favicons and mstile images. Updates manifest.json and index.html files with the generated images according to Web App Manifest specs and Apple Human Interface guidelines. When you build a PWA with the goal of providing native-like experiences on multiple platforms and stores, your PWA assets need to meet the criteria of those platforms and stores; icon sizes and splash...
    Downloads: 6 This Week
  • 13
    marqo

    Tensor search for humans

    ...Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and text-to-image search and analytics. Marqo adapts and stores your data in a fully schemaless manner. It combines tensor search with a query DSL that provides efficient pre-filtering. Tensor search allows you to go beyond keyword matching and search based on the meaning of text, images and other unstructured data. Be a part of the tribe and help us revolutionize the future of search. Whether you are a contributor, a user, or simply have questions about Marqo, we got your back.
    Downloads: 1 This Week
  • 14
    NVIDIA NeMo

    Toolkit for conversational AI

    ...NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node mixed-precision training. Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. ...
    Downloads: 3 This Week
  • 15
    Swirl

    Swirl queries any number of data sources with APIs

    ...It's intended for use by developers and data scientists who want to solve multi-silo search problems, from enterprise search to new monitoring and alerting solutions that push information to users continuously. Built on the Python/Django/RabbitMQ stack, SWIRL includes connectors to Apache Solr, ChatGPT, Elastic, OpenSearch, PostgreSQL and Google BigQuery, plus generic HTTP/GET/JSON with configurations for premium services.
    Downloads: 1 This Week
  • 16
    Haystack

    Haystack is an open source NLP framework to interact with your data

    Apply the latest NLP technology to your own data with the use of Haystack's pipeline architecture. Implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications. Evaluate components and fine-tune models. Ask questions in natural language and find granular answers in your documents using the latest QA models with the help of Haystack pipelines.
    Downloads: 3 This Week
  • 17
    'Lightweight' GAN

    Implementation of 'lightweight' GAN, proposed in ICLR 2021

    ...Quoting the one-line summary "converge on single gpu with few hours' training, on 1024 resolution sub-hundred images". Augmentation is essential for Lightweight GAN to work effectively in a low data setting. You can test and see how your images will be augmented before they pass into a neural network (if you use augmentation). The general recommendation is to use suitable augs for your data and as many as possible, then after some time of training disable the most destructive (for image) augs. You can turn on automatic mixed precision with one flag --amp. ...
    Downloads: 2 This Week
  • 18
    Orion

    A machine learning library for detecting anomalies in signals

    Orion is a machine-learning library built for unsupervised time series anomaly detection. Such signals are generated by a wide variety of systems; a few examples include telemetry data generated by satellites, signals from wind turbines, and even stock market price tickers. We built this to provide one place where users can find the latest and greatest in the machine learning and deep learning world, including our own innovations, and to abstract away from users the nitty-gritty of preprocessing, finding the best pipeline, and postprocessing. ...
    Downloads: 1 This Week
  • 19
    Albumentations

    Fast image augmentation library and an easy-to-use wrapper

    ...Albumentations supports different computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, and pose estimation. Albumentations works well with data from different domains: photos, medical images, satellite imagery, manufacturing and industrial applications, Generative Adversarial Networks. Albumentations can work with various deep learning frameworks such as PyTorch and Keras.
    Downloads: 1 This Week
  • 20
    Simple StyleGan2 for Pytorch

    Simplest working implementation of Stylegan2

    Simple Pytorch implementation of Stylegan2 that can be completely trained from the command-line, no coding needed. You will need a machine with a GPU and CUDA installed. You can also specify the location where intermediate results and model checkpoints should be stored. You can increase the network capacity (which defaults to 16) to improve generation results, at the cost of more memory. By default, if the training gets cut off, it will automatically resume from the last checkpointed file....
    Downloads: 4 This Week
  • 21
    ChatGPT.Net

    Unofficial .Net Client for ChatGPT

    The ChatGPT.Net Unofficial .Net API for ChatGPT is a C# library that allows developers to access ChatGPT, a chat-based language model. With this API, developers can send queries to ChatGPT and receive responses in real-time, making it easy to integrate ChatGPT into their own applications. The new method operates without a browser by utilizing a server that has implemented bypass methods to function as a proxy. The library sends requests to the server, which then redirects the request to...
    Downloads: 2 This Week
  • 22
    OpenAI DALL·E AsyncImage SwiftUI

    OpenAI swift async text to image for SwiftUI app using OpenAI

    ...They are Markov chains trained using variational inference. The goal of diffusion models is to learn the latent structure of a dataset by modeling the way in which data points diffuse through the latent space.
    Downloads: 1 This Week
  • 23
    AppFlowy

    Bring projects, wikis, and teams together with AI.

    AppFlowy is an AI collaborative workspace where you can achieve more without losing control of your data. It is the best open source alternative to Notion, offering a 100% offline mode and self-hosting with a cloud service of your choice. Build a centralized workspace for your wiki, projects, and notes with AppFlowy. It allows you to organize and visualize your data in tables, Kanban boards, calendars, and more. You can filter and sort your data in any way you want. ...
    Downloads: 48 This Week
  • 24
    KoboldCpp

    Run GGUF models easily with a UI or API. One File. Zero Install.

    KoboldCpp is an easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original KoboldAI. It's a single self-contained distributable that builds off llama.cpp and adds many additional powerful features.
    Downloads: 419 This Week
  • 25

    uweb browser: unlimited power

    minimal suckless android web browser with unlimited power

    ...and .js files as commands). - user-defined site-specific JS/CSS/HTML/preprocessing. - Online play/preview/preprocess for downloadable resources. - Multiple type profiles: switch any data including logins/config orthogonally - web automation, crontab (alarm clock)
    Downloads: 6 This Week