Transcribe any audio to text, translate and edit subtitles 100% locally
A text-to-speech, speech-to-text and speech-to-speech library
Chat & pretrained large audio language model proposed by Alibaba Cloud
Text transcription & slicing tool with visual timeline and WAV output
Speech recognition module for Python
Crowdsourcing platform for full text transcription and tagging
Repo of Qwen2-Audio chat & pretrained large audio language model
A free, open source, and extensible speech-to-text application
Audio foundation model excelling in audio understanding
Open-source framework for intelligent speech interaction
Transcribe and translate audio offline on your personal computer
LLM-based Reinforcement Learning audio edit model
Oobabooga - The definitive Web UI for local AI, with powerful features
Multi-modal large language model designed for audio understanding
Code for openai.fm, a demo for the OpenAI Speech API
A gallery that showcases on-device ML/GenAI use cases
A Family of Open Sourced Music Foundation Models
Speech-to-text, text-to-speech, and speaker recognition
Generate audiobooks from EPUBs, PDFs and text with captions
LilyPond sheet music text editor
Transcribe on your own
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Comprehensive Gradio WebUI for audio processing
Audiocraft is a library for audio processing and generation