Showing 116 open source projects for "video ai"

  • 1
    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time...
    Downloads: 8 This Week
  • 2
    Make-A-Video - Pytorch (wip)

    Implementation of Make-A-Video, new SOTA text to video generator

    Implementation of Make-A-Video, the SOTA text-to-video generator from Meta AI, in PyTorch. The authors combine pseudo-3D convolutions (axial convolutions) with temporal attention and show much better temporal fusion. Pseudo-3D convolutions are not a new concept; they have been explored before in other contexts, for example in protein contact prediction as "dimensional hybrid residual networks".
    Downloads: 8 This Week
  • 3
    Video Diffusion - Pytorch

    Implementation of Video Diffusion Models

    Implementation of Video Diffusion Models, Jonathan Ho's paper extending DDPMs to video generation, in PyTorch. It uses a special space-time factored U-net, extending generation from 2D images to 3D videos. 14k for difficult moving mnist (converging much faster and better than NUWA) - wip. Any new developments for text-to-video synthesis will be centralized at...
    Downloads: 7 This Week
  • 4
    AI YouTube Shorts Generator

    A python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies subtitle overlays, producing a polished short video without manual editing. ...
    Downloads: 11 This Week
  • 5
    MoneyPrinterTurbo

    Generate short videos with one click using AI LLM

    MoneyPrinterTurbo is an AI-driven tool that enables users to generate high-definition short videos with minimal input. By providing a topic or keyword, the system automatically creates video scripts, sources relevant media assets, adds subtitles, and incorporates background music, resulting in a polished video ready for distribution.
    Downloads: 22 This Week
  • 6
    Director

    AI video agents framework for next-gen video interactions

    Director is a video database management system designed to organize, search, and retrieve large collections of video content efficiently.
    Downloads: 1 This Week
  • 7
    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    Wan2.2 is a major upgrade to the Wan series of open and advanced large-scale video generative models, incorporating cutting-edge innovations to boost video generation quality and efficiency. It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting,...
    Downloads: 155 This Week
  • 8
    Wan2.1

    Wan2.1: Open and Advanced Large-Scale Video Generative Model

    Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports...
    Downloads: 73 This Week
  • 9
    Story Flicks

    Generate high-definition story short videos with one click using AI

    Story Flicks is an open-source project in the AI-assisted video generation and editing space, focused on creating short, story-style videos from script or prompt inputs. It aims to let users generate high-definition short movies or video stories with minimal manual effort, using AI models under the hood to assemble visuals, timing, and possibly narration or subtitles. For creators who want to produce narrative short-form content, whether for social media, storytelling, or prototyping video ideas, Story Flicks offers a lightweight, code-backed alternative to complex video editing suites. ...
    Downloads: 5 This Week
  • 10
    AutoClip

    AI-powered video clipping and highlight generation

    AutoClip is an open-source, AI-powered video processing system designed to automate the extraction of “highlight” segments from full-length videos — ideal for creators who want to generate bite-sized clips, compilations, or highlight reels without manually sifting through hours of footage. The system supports downloading videos from major platforms (e.g. YouTube, Bilibili), or accepting local uploads, and then applies AI analysis to identify segments worth clipping based on content (e.g. high energy moments, speech, or other heuristics). ...
    Downloads: 10 This Week
  • 11
    Step-Video-T2V

    State-of-the-art (SoTA) text-to-video pre-trained model

    Step-Video-T2V is a state-of-the-art text-to-video foundation model developed to generate videos from natural-language prompts; its 30B-parameter architecture is designed to produce coherent, temporally extended video sequences — up to around 204 frames — based on input text. Under the hood it uses a compressed latent representation (a Video-VAE) to reduce spatial and temporal redundancy, and a denoising diffusion (or similar) process over that latent space to generate smooth, plausible...
    Downloads: 0 This Week
  • 12
    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video...
    Downloads: 48 This Week
  • 13
    ComfyUI-LTXVideo

    LTX-Video Support for ComfyUI

    ComfyUI-LTXVideo is a bridge between ComfyUI’s node-based generative workflow environment and the LTX-Video multimedia processing framework, enabling creators to orchestrate complex video tasks within a visual graph paradigm. Instead of writing code to apply effects, transitions, edits, and data flows, users can assemble nodes that represent video inputs, transformations, and outputs, letting them prototype and automate video production pipelines visually. This integration empowers...
    Downloads: 5 This Week
  • 14
    Open-Sora

    Open-Sora: Democratizing Efficient Video Production for All

    Open-Sora is an open-source initiative aimed at democratizing high-quality video production. It offers a user-friendly platform that simplifies the complexities of video generation, making advanced video techniques accessible to everyone. The project embraces open-source principles, fostering creativity and innovation in content creation. Open-Sora provides tools, models, and resources to create high-quality videos, aiming to lower the entry barrier for video production and support diverse...
    Downloads: 18 This Week
  • 15
    Phenaki - Pytorch

    Implementation of Phenaki Video, which uses Mask GIT

    Implementation of Phenaki Video, which uses MaskGIT to produce text-guided videos of up to 2 minutes in length, in PyTorch. It will also combine another technique involving a token critic for potentially even better generations. A new paper suggests that instead of relying on the predicted probabilities of each token as a measure of confidence, one can train an extra critic to decide what to iteratively mask during sampling. This repository will also endeavor to allow the researcher to...
    Downloads: 6 This Week
  • 16
    VMZ (Video Model Zoo)

    VMZ: Model Zoo for Video Modeling

    The codebase was designed to help researchers and practitioners quickly reproduce FAIR’s results and leverage robust pre-trained backbones for downstream tasks. It also integrates Gradient Blending, an audio-visual modeling method that fuses modalities effectively (available in the Caffe2 implementation). Although VMZ is now archived and no longer actively maintained, it remains a valuable reference for understanding early large-scale video model training, transfer learning, and multimodal...
    Downloads: 0 This Week
  • 17
    HunyuanVideo

    HunyuanVideo: A Systematic Framework For Large Video Generation Model

    HunyuanVideo is a cutting-edge framework designed for large-scale video generation, leveraging advanced AI techniques to synthesize videos from various inputs. It is implemented in PyTorch, providing pre-trained model weights and inference code for efficient deployment. The framework aims to push the boundaries of video generation quality, incorporating multiple innovative approaches to improve the realism and coherence of the generated content.
    Downloads: 5 This Week
  • 18
    QualityScaler

    Image/video AI upscaler app (BSRGAN)

    QualityScaler is a Windows app that uses the BSRGAN AI model to enhance, enlarge, and denoise photographs and videos. It is written entirely in Python, from the backend to the front end. Features: upscaling of single images, lists of images, and videos; drag-and-drop of files (image, multiple images, or video); automatic image tiling and merging to avoid GPU VRAM limitations; resizing of images and videos before upscaling; multiple-GPU support. Compatible images: png, jpeg, bmp, webp, tif. Compatible...
    Downloads: 145 This Week
  • 19
    HunyuanCustom

    Multimodal-Driven Architecture for Customized Video Generation

    HunyuanCustom is a multimodal video customization framework by Tencent Hunyuan, aimed at generating customized videos featuring particular subjects (people, characters) under flexible conditions, while maintaining subject/identity consistency. It supports conditioning via image, audio, video, and text, and can perform subject replacement in videos, generate avatars speaking given audio, or combine multiple subject images. The architecture builds on HunyuanVideo, with added modules for...
    Downloads: 1 This Week
  • 20
    HunyuanVideo-I2V

    A Customizable Image-to-Video Model based on HunyuanVideo

    HunyuanVideo-I2V is a customizable image-to-video generation framework from Tencent Hunyuan, built on their HunyuanVideo foundation. It extends video generation so that given a static reference image plus an optional prompt, it generates a video sequence that preserves the reference image’s identity (especially in the first frame) and allows stylized effects via LoRA adapters. The repository includes pretrained weights, inference and sampling scripts, training code for LoRA effects, and...
    Downloads: 1 This Week
  • 21
    x-unet

    Implementation of a U-net complete with efficient attention

    Implementation of a U-net complete with efficient attention as well as the latest research findings. For 3d (video or CT / MRI scans).
    Downloads: 3 This Week
  • 22
    Vidi2

    Large Multimodal Models for Video Understanding and Editing

    Vidi is a family of large multimodal models developed for deep video understanding and editing tasks, integrating vision, audio, and language to allow sophisticated querying and manipulation of video content. It’s designed to process long-form, real-world videos and answer complex queries such as “when in this clip does X happen?” or “where in the frame is object Y during that moment?” — offering temporal retrieval, spatio-temporal grounding (i.e. locating objects over time + space), and...
    Downloads: 1 This Week
  • 23
    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    LTX-2 is a powerful, open-source toolkit developed by Lightricks that provides a modular, high-performance base for building real-time graphics and visual effects applications. It is architected to give developers low-level control over rendering pipelines, GPU resource management, shader orchestration, and cross-platform abstractions so they can craft visually compelling experiences without starting from scratch. Beyond basic rendering scaffolding, LTX-2 includes optimized math libraries,...
    Downloads: 60 This Week
  • 24
    Generative AI for Beginners (Version 3)

    21 Lessons, Get Started Building with Generative AI

    ...The course covers everything from model selection, prompt engineering, and chat/text/image app patterns to secure development practices and UX for AI. It also walks through modern application techniques such as function calling, RAG with vector databases, working with open source models, agents, fine-tuning, and using SLMs. Each lesson includes a short video, a written guide, runnable samples for Azure OpenAI, the GitHub Marketplace Model Catalog, and the OpenAI API, plus a “Keep Learning” section for deeper study.
    Downloads: 1 This Week
  • 25
    CogVideo

    text and image to video generation: CogVideoX (2024) and CogVideo

    CogVideo is an open source text-/image-/video-to-video generation project that hosts the CogVideoX family of diffusion-transformer models and end-to-end tooling. The repo includes SAT and Diffusers implementations, turnkey demos, and fine-tuning pipelines (including LoRA) designed to run across a wide range of NVIDIA GPUs, from desktop cards (e.g., RTX 3060) to data-center hardware (A100/H100). Current releases cover CogVideoX-2B, CogVideoX-5B, and the upgraded CogVideoX1.5-5B variants, plus...
    Downloads: 23 This Week