VibeVoice-1.5B is Microsoft's open-source text-to-speech (TTS) model for generating expressive, long-form, multi-speaker conversational audio such as podcasts. It is built to scale to long recordings while keeping each speaker's voice consistent and turn-taking natural, supporting up to 90 minutes of continuous speech with as many as four distinct speakers. A key innovation is its pair of continuous speech tokenizers, one acoustic and one semantic, which operate at an ultra-low frame rate of 7.5 Hz. The low frame rate keeps long sequences short enough to process efficiently while preserving audio fidelity. The model combines a Qwen2.5-1.5B large language model, which captures conversational context, with a diffusion head that produces realistic acoustic detail. Training used a curriculum of progressively longer sequences, up to 65K tokens, so the model can handle very long dialogues. To mitigate misuse, all generated audio carries an audible disclaimer and an imperceptible watermark.
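
The relationship between the 7.5 Hz frame rate, the 90-minute limit, and the 65K-token training length can be checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not a description of the model's internals: it assumes one speech frame occupies one sequence position, reads "65K" as 65,536, and uses 50 Hz purely as an illustrative comparison rate that does not come from the description above.

```python
# Rough sequence-length estimate. Assumption: one speech frame = one sequence position.
FRAME_RATE_HZ = 7.5        # tokenizer frame rate stated in the description
MAX_MINUTES = 90           # maximum audio length stated in the description
CONTEXT_TOKENS = 65_536    # assumption: "65K tokens" read as 2**16


def frames_for(minutes: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Number of speech frames needed for `minutes` of audio at `frame_rate_hz`."""
    return int(minutes * 60 * frame_rate_hz)


speech_frames = frames_for(MAX_MINUTES)       # 90 * 60 * 7.5 = 40500
headroom = CONTEXT_TOKENS - speech_frames     # positions left for text and other tokens

# Illustrative comparison only: 50 Hz is a hypothetical higher frame rate.
frames_at_50hz = frames_for(MAX_MINUTES, 50)  # 90 * 60 * 50 = 270000

print(f"{speech_frames} frames at 7.5 Hz, {headroom} positions of headroom")
print(f"{frames_at_50hz} frames at 50 Hz")
```

Under these assumptions, 90 minutes of speech needs 40,500 frames, which fits within a 65,536-token sequence with room left over for the script text, whereas the same audio at the hypothetical 50 Hz rate would need 270,000 frames.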

Features

  • Open-source TTS model for expressive, long-form conversational speech
  • Generates up to 90 minutes of audio with up to 4 distinct speakers
  • Continuous acoustic & semantic tokenizers at 7.5 Hz for fidelity and efficiency
  • Integrates Qwen2.5-1.5B LLM with a diffusion head for context and realism
  • Curriculum-trained on sequences up to 65K tokens for long dialogues
  • Embedded audible disclaimer and imperceptible watermark in all outputs
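
The four-speaker limit is something a calling script can check before any audio is generated. The sketch below assumes a hypothetical plain-text script format in which each line reads `Speaker N: text`; the format, the `parse_script` helper, and the regular expression are illustrative assumptions and are not taken from the project.

```python
import re

MAX_SPEAKERS = 4  # limit stated in the feature list

# Hypothetical script format: "Speaker <number>: <utterance>"
LINE_RE = re.compile(r"^Speaker (\d+):\s*(.+)$")


def parse_script(text: str) -> list[tuple[int, str]]:
    """Parse a dialogue script into (speaker_id, utterance) turns.

    Raises ValueError on a malformed line or when the script uses
    more than MAX_SPEAKERS distinct speakers.
    """
    turns = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between turns
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized line: {line!r}")
        turns.append((int(match.group(1)), match.group(2)))

    speakers = {speaker for speaker, _ in turns}
    if len(speakers) > MAX_SPEAKERS:
        raise ValueError(
            f"script uses {len(speakers)} speakers; at most {MAX_SPEAKERS} are supported"
        )
    return turns


demo = "Speaker 1: Welcome to the show.\nSpeaker 2: Glad to be here."
print(parse_script(demo))
```

Validating the script up front means a dialogue with too many speakers fails immediately rather than partway through a long generation run.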

Categories

AI Models

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python AI Models

Registered

2025-12-08