Secure open source cloud runtime for AI apps & AI agents
Python inference and LoRA trainer package for the LTX-2 audio–video
Large Multimodal Models for Video Understanding and Editing
21 Lessons, Get Started Building with Generative AI
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Convert AI papers to GUI
Video-based AI memory library. Store millions of text chunks in MP4
Multimodal Diffusion with Representation Alignment
Make videos programmatically with React
Implementation of Recurrent Interface Network (RIN)
Official code for StoryMem: Multi-shot Long Video Storytelling
Video understanding codebase from FAIR for reproducing video models
Official SeedVR2 Video Upscaler for ComfyUI
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Repo for SeedVR2 & SeedVR
The most powerful and modular diffusion model GUI, API and backend
End-to-end pipeline converting generative videos
Motion-controllable Video Generation via Latent Trajectory Guidance
A suite of advanced multi-modal LLMs
Video, Image and GIF upscale/enlarge (Super-Resolution)
A real-time, self-hosted recipe app for families & friends
The Python library for real-time communication
Advancing Open-source World Models
Streaming Real-time Audio-Driven Avatar Generation
Open source AI agent CLI tool to bring Gemini into your terminal