A fast, local neural text-to-speech system
A deep learning toolkit for Text-to-Speech, battle-tested in research
State-of-the-art TTS model under 25MB
MARS5 is a fully open-source, hyper-realistic text-to-speech (TTS) model.
LLM Frontend for Power Users
SoTA open-source TTS
A Gradio web UI for running Large Language Models like LLaMA
Conversational voice AI agents
Examples and guides for using the Gemini API
Toolkit for conversational AI
Implementation of AudioLM audio generation model in PyTorch
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
An intelligent desktop virtual assistant based on Live2D technology
PDF to Podcast transforms any PDF document into podcast-ready audio
A Java app for chatting with a generative AI (using TTS and STT)
A walk along memory lane
Verbot 5 with V-SDK and CCS
VibeVoice: Open-source multi-speaker long-form text-to-speech model
Dia-1.6B generates lifelike English dialogue and vocal expressions