
Browse free open source Python Generative AI projects below.

  • 1
    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid image generation. It runs on Windows, Mac, and Linux machines, and on GPU cards with as little as 4 GB of RAM. Built for professionals and enthusiasts alike, InvokeAI generates visual media using the latest AI-driven technologies. It offers a web interface and an interactive command-line interface, and also serves as the foundation for multiple commercial products. Linux users can use either an Nvidia-based card (with CUDA support) or an AMD card (using the ROCm driver). We do not recommend the GTX 1650 or 1660 series video cards: they are unable to run in half-precision mode and do not have sufficient VRAM to render 512x512 images.
    Downloads: 26 This Week
  • 2
    KoboldCpp

    Run GGUF models easily with a UI or API. One File. Zero Install.

    KoboldCpp is easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original KoboldAI. It is a single self-contained distributable that builds on llama.cpp and adds many powerful features.
    Downloads: 419 This Week
  • 3
    GIMP ML

    AI for GNU Image Manipulation Program

    This repository introduces GIMP3-ML, a set of Python plugins for the widely popular GNU Image Manipulation Program (GIMP). It brings recent advances in computer vision to the conventional image editing pipeline. Deep learning applications such as monocular depth estimation, semantic segmentation, mask generative adversarial networks, image super-resolution, de-noising, and coloring have been incorporated into GIMP through Python-based plugins. Operations on images such as edge detection and color clustering have also been added. GIMP-ML relies on standard Python packages such as numpy, scikit-image, pillow, pytorch, open-cv, and scipy. It also aims to bring the benefits of deep learning networks used for computer vision tasks to routine image processing workflows.
    Downloads: 10 This Week
  • 4
    gpt-j-api

    API for the GPT-J language model, including a FastAPI backend

    An API to interact with the GPT-J language model and variants. The endpoints of the public API require no authentication. To run your own server, SSH into a TPU VM; the code was tested on both the v2-8 and v3-8 variants.
    Downloads: 9 This Week
  • 5
    revChatGPT

    Reverse engineered ChatGPT API

    Reverse-engineered API for OpenAI's ChatGPT, extensible for chatbots and similar uses. This is not an official OpenAI product.
    Downloads: 9 This Week
  • 6
    Make-A-Video - Pytorch (wip)

    Implementation of Make-A-Video, new SOTA text to video generator

    Implementation of Make-A-Video, the new SOTA text-to-video generator from Meta AI, in Pytorch. The authors combine pseudo-3d convolutions (axial convolutions) and temporal attention and show much better temporal fusion. Pseudo-3d convolutions are not a new concept; they have been explored before in other contexts, for example in protein contact prediction as "dimensional hybrid residual networks". The gist of the paper is this: take a SOTA text-to-image model (here they use DALL-E2, but the same learning points would easily apply to Imagen), make a few minor modifications for attention across time and other ways to skimp on the compute cost, do frame interpolation correctly, and get a great video model out. When images are passed in (if one were to pretrain on images first), both temporal convolution and attention are automatically skipped. In other words, you can use this straightforwardly in your 2d Unet and then port it over to a 3d Unet once that phase of the training is done.
    Downloads: 8 This Week
  • 7
    website-to-gif

    Turn your website into a GIF

    This GitHub Action automatically creates an animated GIF or WebP from a given web page to display on your project README (or anywhere else). In your GitHub repo, create a workflow file or extend an existing one. You also have to include a step to check out and commit to the repo. You can use the example gif.yml as a starting point; make sure to modify the url value and add any other input you want to use. WebP rendering takes a long time, in exchange for lossless quality and file size optimization.
    Downloads: 8 This Week
  • 8
    Finetune Transformer LM

    Code for "Improving Language Understanding by Generative Pre-Training"

    finetune-transformer-lm is a research codebase that accompanies the paper “Improving Language Understanding by Generative Pre-Training,” providing a minimal implementation focused on fine-tuning a transformer language model for evaluation tasks. The repository centers on reproducing the ROCStories Cloze Test result and includes a single-command training workflow to run the experiment end to end. It documents that runs are non-deterministic due to certain GPU operations and reports a median accuracy over multiple trials that is slightly below the single-run result in the paper, reflecting expected variance in practice. The project ships lightweight training, data, and analysis scripts, keeping the footprint small while making the experimental pipeline transparent. It is provided as archived, research-grade code intended for replication and study rather than continuous development.
    Downloads: 7 This Week
  • 9
    GPT Neo

    An implementation of model parallel GPT-2 and GPT-3-style models

    An implementation of model and data parallel GPT3-like models using the mesh-tensorflow library. If you're just here to play with our pre-trained models, we strongly recommend you try out the HuggingFace Transformer integration. Training and inference are officially supported on TPU and should work on GPU as well. This repository will be (mostly) archived as we move focus to our GPU-specific repo, GPT-NeoX. NB: while neo can technically run a training step at 200B+ parameters, it is very inefficient at those scales. This, as well as the fact that many GPUs became available to us, among other things, prompted us to move development over to GPT-NeoX. All evaluations were done using our evaluation harness. Some results for GPT-2 and GPT-3 are inconsistent with the values reported in the respective papers. We are currently looking into why, and would greatly appreciate feedback and further testing of our eval harness.
    Downloads: 7 This Week
  • 10
    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 7 This Week
  • 11
    PaddleNLP

    Easy-to-use and powerful NLP library with Awesome model zoo

    PaddleNLP is a natural language processing development library for PaddlePaddle. Its three major features are an easy-to-use text-domain API, application examples for multiple scenarios, and high-performance distributed training. It aims to improve the modeling efficiency of PaddlePaddle developers working with text and provides rich examples of NLP applications. It offers industry-grade preset task capabilities through Taskflow, along with a full-process text-domain API: a Dataset API that supports loading a rich set of Chinese datasets, a Data API for flexible and efficient data preprocessing, an Embedding API with 60+ preset pre-trained word vectors, and a Transformer API providing 100+ pre-trained models. Together these can greatly improve the efficiency of NLP task modeling.
    Downloads: 7 This Week
  • 12
    Video Diffusion - Pytorch

    Implementation of Video Diffusion Models

    Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation, in Pytorch. It uses a special space-time factored U-net, extending generation from 2D images to 3D videos. 14k for difficult moving MNIST (converging much faster and better than NUWA) - wip. Any new developments for text-to-video synthesis will be centralized at Imagen-pytorch. For conditioning on text, the authors derived text embeddings by first passing the tokenized text through BERT-large. You can also directly pass in the descriptions of the video as strings, if you plan on using BERT-base for text conditioning. This repository also contains a handy Trainer class for training on a folder of gifs. Each gif must have the correct dimensions, image_size and num_frames.
    Downloads: 7 This Week
  • 13
    AudioLM - Pytorch

    Implementation of AudioLM audio generation model in Pytorch

    Implementation of AudioLM, a Language Modeling Approach to Audio Generation out of Google Research, in Pytorch. It also extends the work for conditioning with classifier free guidance with T5. This allows one to do text-to-audio or TTS, not offered in the paper. Yes, this means VALL-E can be trained from this repository. It is essentially the same. This repository now also contains an MIT licensed version of SoundStream. It is also compatible with EnCodec; be aware, however, that EnCodec has a more restrictive non-commercial license if you choose to use it.
    Downloads: 6 This Week
  • 14
    CTGAN

    Conditional GAN for generating synthetic tabular data

    CTGAN is a collection of Deep Learning based synthetic data generators for single table data, which are able to learn from real data and generate synthetic data with high fidelity. If you're just getting started with synthetic data, we recommend installing the SDV library, which provides user-friendly APIs for accessing CTGAN. The SDV library provides wrappers for preprocessing your data as well as additional usability features like constraints. When using the CTGAN library directly, you may need to manually preprocess your data into the correct format: continuous data must be represented as floats, discrete data must be represented as ints or strings, and the data should not contain any missing values.
    Downloads: 6 This Week
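The three format rules in the CTGAN description (floats for continuous columns, ints or strings for discrete columns, no missing values) can be checked before training. The following is a minimal sketch in plain Python; `check_table` is a hypothetical helper written for illustration, not part of CTGAN or SDV, and the sample rows are made up.

```python
import math


def check_table(rows, discrete_columns):
    """Return a list of format problems for a table given as a list of dicts.

    Hypothetical helper, not a CTGAN API: it only mirrors the three
    rules stated in the project description.
    """
    problems = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            # Rule: the data should not contain missing values (None or NaN).
            if value is None or (isinstance(value, float) and math.isnan(value)):
                problems.append(f"row {i}, {column}: missing value")
            elif column in discrete_columns:
                # Rule: discrete data must be ints or strings.
                if not isinstance(value, (int, str)):
                    problems.append(f"row {i}, {column}: discrete value must be int or str")
            # Rule: continuous data must be floats.
            elif not isinstance(value, float):
                problems.append(f"row {i}, {column}: continuous value must be float")
    return problems


# Made-up sample data: the second row breaks two of the rules.
rows = [
    {"age": 39.0, "income": 52000.0, "occupation": "engineer"},
    {"age": 41, "income": float("nan"), "occupation": "teacher"},
]
for problem in check_table(rows, discrete_columns={"occupation"}):
    print(problem)
```

Running a check like this first makes it easier to see which columns need casting or imputation before the table is handed to a synthesizer.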
  • 15
    ChatFred

    Alfred workflow using ChatGPT, DALL·E 2 and other models for chatting

    Alfred workflow using ChatGPT, DALL·E 2 and other models for chatting, image generation and more. Access ChatGPT, DALL·E 2, and other OpenAI models. Language models often give wrong information, so verify answers if they are important. Talk with ChatGPT via the cf keyword; answers will show as Large Type. Alternatively, use the Universal Action, Fallback Search, or Hotkey. To generate text with InstructGPT models and see results in-line, use the cft keyword. Install the workflow from the Alfred Gallery or download it from GitHub, then add your OpenAI API key. If you have used ChatGPT or DALL·E 2, you already have an OpenAI account. Otherwise, you can sign up; you will receive $5 in free credit, and no payment data is required. Afterward you can create your API key. To start a conversation with ChatGPT, use the keyword cf, set up the workflow as a fallback search in Alfred, or create a custom hotkey that sends the clipboard content directly to ChatGPT.
    Downloads: 6 This Week
  • 16
    CodiumAI PR-Agent

    AI-Powered tool for automated pull request analysis

    CodiumAI PR-Agent is an open-source tool that aims to help developers review pull requests faster and more efficiently. It automatically analyzes the pull request and supports several types of commands. See the Usage Guide for instructions on how to run the different tools from the CLI, online, or by automatically triggering them when a new PR is opened. You can try the GPT-4 powered PR-Agent on your public GitHub repository instantly: just mention @CodiumAI-Agent and add the desired command in any PR comment. The agent will generate a response based on your command.
    Downloads: 6 This Week
  • 17
    Diffusers

    State-of-the-art diffusion models for image and audio generation

    Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or training your own diffusion models, Diffusers is a modular toolbox that supports both. Our library is designed with a focus on usability over performance, simple over easy, and customizability over abstractions. It offers three core components: state-of-the-art diffusion pipelines that can be run in inference with just a few lines of code; interchangeable noise schedulers for different diffusion speeds and output quality; and pretrained models that can be used as building blocks, and combined with schedulers, for creating your own end-to-end diffusion systems. We recommend installing Diffusers in a virtual environment from PyPI or Conda. For more details about installing PyTorch and Flax, please refer to their official documentation.
    Downloads: 6 This Week
  • 18
    Phenaki - Pytorch

    Implementation of Phenaki Video, which uses Mask GIT

    Implementation of Phenaki Video, which uses Mask GIT to produce text-guided videos of up to 2 minutes in length, in Pytorch. It will also combine another technique involving a token critic for potentially even better generations. A new paper suggests that instead of relying on the predicted probabilities of each token as a measure of confidence, one can train an extra critic to decide what to iteratively mask during sampling. This repository will also endeavor to allow the researcher to train on text-to-image and then text-to-video. Similarly, for unconditional training, the researcher should be able to first train on images and then fine-tune on video.
    Downloads: 6 This Week
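The difference between confidence-based masking and critic-based masking, as described for Phenaki, comes down to which scores decide what gets re-masked at each sampling step. The sketch below is a toy illustration in plain Python, not the repository's implementation; the tokens, the scores, and the helper `lowest_scoring_positions` are all hypothetical.

```python
def lowest_scoring_positions(scores, num_to_mask):
    """Return the positions of the num_to_mask lowest scores, in position order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(ranked[:num_to_mask])


tokens = ["a", "dog", "runs", "blue", "grass"]
# Hypothetical numbers: the generator's own probability for each sampled token...
generator_confidence = [0.9, 0.8, 0.7, 0.95, 0.6]
# ...and a separately trained critic's estimate that each token is plausible.
critic_score = [0.9, 0.85, 0.8, 0.2, 0.7]

for name, scores in [("confidence", generator_confidence), ("critic", critic_score)]:
    remask = lowest_scoring_positions(scores, num_to_mask=2)
    masked = ["[MASK]" if i in remask else token for i, token in enumerate(tokens)]
    print(name, masked)
```

In this toy case the generator is confident about "blue" and keeps it, while the critic scores it lowest and sends it back for resampling, which is the behavior the token-critic idea is after.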
  • 19
    DocsGPT

    GPT-powered chat for documentation search & assistance

    DocsGPT is a cutting-edge open-source solution that streamlines the process of finding information in project documentation. With its integration of powerful GPT models, developers can easily ask questions about a project and receive accurate answers. Say goodbye to time-consuming manual searches, and let DocsGPT help you quickly find the information you need. Try it out and see how it revolutionizes your project documentation experience. Contribute to its development and be a part of the future of AI-powered assistance.
    Downloads: 5 This Week
  • 20
    FID score for PyTorch

    Compute FID scores with PyTorch

    This is a port of the official implementation of Fréchet Inception Distance to PyTorch. FID is a measure of similarity between two datasets of images. It was shown to correlate well with human judgement of visual quality and is most often used to evaluate the quality of samples of Generative Adversarial Networks. FID is calculated by computing the Fréchet distance between two Gaussians fitted to feature representations of the Inception network. The weights and the model are exactly the same as in the official TensorFlow implementation, and were tested to give very similar results (e.g. 0.08 absolute error and 0.0009 relative error on LSUN, using ProGAN generated images). However, due to differences in the image interpolation implementation and library backends, FID results still differ slightly from the original implementation. In contrast to the official implementation, you can choose to use a different feature layer of the Inception network instead of the default pool3 layer.
    Downloads: 5 This Week
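The Fréchet distance between two Gaussians mentioned in the FID description has a closed form. As a reference sketch, write (μ_r, Σ_r) and (μ_g, Σ_g) for the mean and covariance of the Gaussians fitted to the Inception features of the two image sets; this notation is an assumption chosen here, not taken from the project:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

The first term compares the feature means and the trace term compares the covariances, so two identical feature distributions give a distance of zero.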
  • 21
    GPT-Code UI

    An open source implementation of OpenAI's ChatGPT Code interpreter

    An open source implementation of OpenAI's ChatGPT Code interpreter. Simply ask the OpenAI model to do something and it will generate & execute the code for you. You can put a .env in the working directory to load the OPENAI_API_KEY environment variable. For Azure OpenAI Services, there are also other configurable variables like deployment name. See .env.azure-example for more information. Note that model selection on the UI is currently not supported for Azure OpenAI Services.
    Downloads: 5 This Week
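The .env mechanism mentioned for GPT-Code UI can be illustrated with a minimal sketch. The code below is not the project's loader; `parse_env` and `load_env_file` are hypothetical helpers that handle only plain KEY=VALUE lines, written to show how a file in the working directory can supply OPENAI_API_KEY.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines into a dict (hypothetical helper, minimal rules)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_file(path=".env"):
    """Load variables from path into os.environ without overriding existing ones."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        values = parse_env(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


load_env_file()  # looks for .env in the current working directory
if "OPENAI_API_KEY" not in os.environ:
    print("OPENAI_API_KEY is not set")
```

Using `setdefault` means a variable already exported in the shell takes precedence over the file, a common convention that is assumed here rather than confirmed for this project.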
  • 22
    Old Photo Restoration

    Bringing Old Photos Back to Life (CVPR 2020 oral)

    We propose to restore old photos that suffer from severe degradation through a deep learning approach. Unlike conventional restoration tasks that can be solved through supervised learning, the degradation in real photos is complex, and the domain gap between synthetic images and real old photos makes the network fail to generalize. Therefore, we propose a novel triplet domain translation network that leverages real photos along with massive synthetic image pairs. Specifically, we train two variational autoencoders (VAEs) to transform old photos and clean photos, respectively, into two latent spaces. The translation between these two latent spaces is learned with synthetic paired data. This translation generalizes well to real photos because the domain gap is closed in the compact latent space. In addition, to address multiple degradations mixed in one old photo, we design a global branch with a partial nonlocal block targeting structured defects, such as scratches and dust spots.
    Downloads: 5 This Week
  • 23
    Petals

    Run 100B+ language models at home, BitTorrent-style

    Petals runs large language models like BLOOM-176B collaboratively: you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning. Single-batch inference runs at ≈ 1 sec per step (token), up to 10x faster than offloading and fast enough for chatbots and other interactive apps. Parallel inference reaches hundreds of tokens/sec. Beyond classic language model APIs, you can employ any fine-tuning and sampling methods, execute custom paths through the model, or see its hidden states. You get the comforts of an API with the flexibility of PyTorch. You can also host BLOOMZ, a version of BLOOM fine-tuned to follow human instructions in the zero-shot regime: just replace bloom-petals with bloomz-petals.
    Downloads: 5 This Week
  • 24
    VALL-E

    PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

    We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech, which is hundreds of times larger than existing systems. VALL-E exhibits emergent in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. Experiment results show that VALL-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity. In addition, we find that VALL-E can preserve the speaker's emotion and the acoustic environment of the acoustic prompt in synthesis.
    Downloads: 5 This Week
  • 25
    gpt-2-simple

    Python package to easily retrain OpenAI's GPT-2 text-generating model

    A simple Python package that wraps existing model fine-tuning and generation scripts for OpenAI's GPT-2 text generation model (specifically the "small" 124M and "medium" 355M hyperparameter versions). Additionally, this package allows easier generation of text, generating to a file for easy curation, and allowing for prefixes to force the text to start with a given phrase. For finetuning, it is strongly recommended to use a GPU, although you can generate using a CPU (albeit much more slowly). If you are training in the cloud, using a Colaboratory notebook or a Google Compute Engine VM w/ the TensorFlow Deep Learning image is strongly recommended, as the GPT-2 model is hosted on GCP. You can use gpt-2-simple to retrain a model using a GPU for free in a Colaboratory notebook, which also demos additional features of the package. Note: Development on gpt-2-simple has mostly been superseded by aitextgen, which has similar AI text generation capabilities with more efficient training time.
    Downloads: 5 This Week