GLM-TTS is a text-to-speech synthesis system built on large language model technology. It targets high-quality, expressive, and controllable speech, with features such as emotion modulation and zero-shot voice cloning. It uses a two-stage architecture: a generative LLM first converts text into intermediate speech-token sequences, and a flow-based neural model then converts those tokens into audio waveforms, preserving rich prosody and voice character even for unseen speakers. The system introduces a multi-reward reinforcement learning framework that jointly optimizes voice similarity, emotional expressiveness, pronunciation, and intelligibility, yielding output that the project says can rival commercial options in naturalness and expressiveness. GLM-TTS also supports phoneme-level control and hybrid text + phoneme input, giving developers precise control over pronunciation, which is critical for multilingual text and polyphone-rich languages.
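
To make the multi-reward idea concrete, the sketch below folds the four objectives named above into one scalar reward. It is a minimal illustration, not GLM-TTS code: the description does not say how the rewards are combined, so the weighted average, the `RewardScores` type, the `combined_reward` function, and the equal weights are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RewardScores:
    """Hypothetical per-utterance scores, each assumed to lie in [0, 1]."""
    similarity: float        # closeness to the prompt speaker's voice
    emotion: float           # emotional expressiveness
    pronunciation: float     # pronunciation accuracy
    intelligibility: float   # how well the words can be understood


# Assumed weights for illustration only; the project does not publish
# these values in its description.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "similarity": 0.25,
    "emotion": 0.25,
    "pronunciation": 0.25,
    "intelligibility": 0.25,
}


def combined_reward(scores: RewardScores,
                    weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Return the weighted average of the individual reward scores."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * getattr(scores, name) for name, w in weights.items())
    return weighted_sum / total_weight
```

A scalar of this kind is what a policy-gradient style trainer would typically maximize; raising one weight, for example `pronunciation`, would bias the trade-off toward that objective.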

Features

  • Zero-shot voice cloning from short prompt audio
  • Multi-reward reinforcement learning for expressive prosody
  • Two-stage LLM + flow-based audio generation pipeline
  • Support for phoneme-level control and hybrid inputs
  • High-quality synthesis comparable to commercial TTS systems
  • Streaming real-time speech synthesis
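
As a rough illustration of hybrid text + phoneme input, the sketch below splits a string into plain-text and phoneme segments. The curly-brace notation, the `split_hybrid` function, and the phoneme symbols are hypothetical and chosen only for this example; the input syntax GLM-TTS actually accepts is not given here and may differ.

```python
import re
from typing import List, Tuple

# Hypothetical markup: phonemes are written inline between curly braces,
# e.g. "I {r eh d} the book". This is NOT the documented GLM-TTS syntax.
PHONEME_SPAN = re.compile(r"\{([^{}]*)\}")


def split_hybrid(text: str) -> List[Tuple[str, str]]:
    """Split hybrid input into ("text", ...) and ("phoneme", ...) segments."""
    segments: List[Tuple[str, str]] = []
    pos = 0
    for match in PHONEME_SPAN.finditer(text):
        # Emit any plain text that precedes this phoneme span.
        if match.start() > pos:
            segments.append(("text", text[pos:match.start()]))
        segments.append(("phoneme", match.group(1).strip()))
        pos = match.end()
    # Emit trailing plain text after the last phoneme span.
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments


if __name__ == "__main__":
    for kind, value in split_hybrid("I {r eh d} the book"):
        print(f"{kind:8s} {value!r}")
```

Keeping the two segment kinds separate lets a front end send text segments through normal grapheme-to-phoneme conversion while passing the explicit phonemes through unchanged, which is the kind of pronunciation override the feature list describes.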

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Text to Speech Software, Python AI Models

Registered

2026-01-20