Open Source TypeScript Text to Speech Software

Browse free open source TypeScript Text to Speech Software and projects below.

  • 1
    OpenAI.fm

    Code for openai.fm, a demo for the OpenAI Speech API

    OpenAI.fm is an official interactive demo application built to showcase the OpenAI Speech API and its advanced text-to-speech capabilities, providing developers and creators with a hands-on web interface to convert text into high-quality, customizable audio using state-of-the-art TTS models. Developed using Next.js and the OpenAI Speech API, this demo illustrates how the latest neural voice models can produce natural, expressive speech with adjustable styles and voices, highlighting features like emotional range, tone, and real-time playback. Users can experiment with different input text and voice options directly in their browser, gaining a sense of how high-fidelity AI audio can be integrated into applications ranging from podcasts and narration to accessibility tools and interactive agents. Although the web demo is free to explore, production use of the underlying API requires an OpenAI API key and may incur costs based on usage.
    Downloads: 185 This Week
  • 2
    TTS-Vue

    Microsoft speech synthesis tool, built with Electron

    TTS-Vue is a desktop text-to-speech application built with Electron, Vue, ElementPlus, and Vite, focused on using Microsoft’s official Speech API for high-quality neural synthesis. It wraps the Microsoft TTS WebSocket interface in a clean UI so users can paste or load text, choose voices, tweak parameters, and export audio without touching raw API calls. The app supports SSML (Speech Synthesis Markup Language), letting power users specify fine-grained control over pronunciation, pauses, prosody, and emphasis using XML-like markup. It includes batch conversion: users can select multiple .txt files and convert them into audio in one go, making it handy for large text collections or repetitive tasks. For long texts or big files, TTS-Vue automatically slices content into manageable segments, converts them separately, and then stitches them back into a single audio file, avoiding the usual length or timeout issues with TTS APIs.
    Downloads: 35 This Week
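The slicing step described for TTS-Vue can be sketched roughly as follows. This is a minimal illustration, not TTS-Vue's actual code: `splitIntoSegments` and the `maxLen` limit are hypothetical names, and the sketch assumes segments should break at sentence-ending punctuation where possible. Stitching the resulting audio back together is outside its scope.

```typescript
// Hypothetical helper (not taken from TTS-Vue): split long text into segments
// of at most maxLen characters, preferring sentence boundaries.
function splitIntoSegments(text: string, maxLen: number): string[] {
  if (maxLen <= 0) {
    throw new RangeError("maxLen must be positive");
  }
  // Each match is either a sentence ending in Latin or CJK punctuation
  // (plus trailing whitespace) or the unterminated tail of the text,
  // so concatenating the matches reproduces the input.
  const sentences: string[] =
    text.match(/[^.!?。！？]*[.!?。！？]+\s*|[^.!?。！？]+$/g) ?? [];

  const segments: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    let rest = sentence;
    // A single sentence longer than the limit is hard-split.
    while (rest.length > maxLen) {
      if (current) {
        segments.push(current);
        current = "";
      }
      segments.push(rest.slice(0, maxLen));
      rest = rest.slice(maxLen);
    }
    // Start a new segment when the next sentence would overflow this one.
    if (current.length + rest.length > maxLen) {
      segments.push(current);
      current = "";
    }
    current += rest;
  }
  if (current) {
    segments.push(current);
  }
  return segments;
}
```

Because the segments concatenate back to the original text, each one can be sent to a TTS service separately and the audio joined in the same order.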
  • 3
    Readest

    Readest is a modern, feature-rich ebook reader

    Readest is an ebook reader that integrates reading tools with AI-powered assistance to support reading and study. The repository is sparsely documented, but it appears to target reading comprehension, likely combining OCR or text retrieval, translation, note-taking, or summarization for eBooks, articles, and PDFs. The apparent goal is to let users load arbitrary reading material and interact with it through highlighting, translation, lookup, and possibly TTS or summarization. That orients it toward learners, researchers, and people working with multilingual documents who need to digest or reference large amounts of text quickly. The design seems to prioritize flexible input formats, possibly including OCR or uploaded documents, and interactive tools for navigating and annotating them.
    Downloads: 14 This Week
  • 4
    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    Voicebox is a local-first voice synthesis studio that aims to bring professional, DAW-like voice generation workflows to a desktop app while keeping models and voice data entirely on your machine. It positions itself as an open-source alternative to cloud voice platforms by emphasizing privacy, offline use, and freedom from subscriptions or usage caps. The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the results using studio-oriented editing concepts. A standout capability is its multi-track timeline editor and supporting audio tools (like trimming and conversation mixing), which let creators compose multi-voice scenes instead of generating single clips in isolation. It is API-first, meaning you can use it as an app for production work or integrate its speech generation into your own software via an API layer.
    Downloads: 8 This Week
  • 5
    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    easyVoice is an open-source text-to-speech platform aimed at turning long-form text and novels into high-quality audio, with a strong focus on usability and scalability. It provides a web interface where users can paste or upload large texts and generate speech and subtitles in a single workflow, even for works exceeding 100,000 characters. The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure parameters such as rate, pitch, and volume per role. It offers streaming playback so audio starts almost immediately, even for very long inputs, and automatically generates subtitle files suitable for video production or translation workflows. Under the hood, easyVoice uses a modern stack with Vue 3 and Element Plus on the front end, Node.js and Express on the back end, and TTS engines such as Microsoft Azure TTS and OpenAI-compatible APIs, orchestrated through ffmpeg.
    Downloads: 4 This Week
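To illustrate the subtitle generation mentioned for easyVoice, here is a small sketch that renders timed cues as SubRip (SRT) text. The listing does not say which subtitle format easyVoice emits, so SRT is an assumption here, and the `Cue` type and both function names are hypothetical.

```typescript
// Hypothetical cue shape: one subtitle line with start and end times in milliseconds.
interface Cue {
  startMs: number;
  endMs: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function zeroPad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Format milliseconds as an SRT timestamp: HH:MM:SS,mmm
function formatSrtTime(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  return (
    zeroPad(hours, 2) + ":" + zeroPad(minutes, 2) + ":" +
    zeroPad(seconds, 2) + "," + zeroPad(millis, 3)
  );
}

// Render cues as SRT: a 1-based index, a time range, the text,
// and a blank line between consecutive entries.
function toSrt(cues: Cue[]): string {
  return cues
    .map((cue, i) =>
      (i + 1) + "\n" +
      formatSrtTime(cue.startMs) + " --> " + formatSrtTime(cue.endMs) + "\n" +
      cue.text + "\n"
    )
    .join("\n");
}
```

In a long-text workflow, the cue times would typically come from the duration of each synthesized segment, accumulated in order.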
  • 6
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all necessary dependencies, so users can focus on experimenting with voices instead of managing tooling. It offers both a Gradio backend and an optional React frontend, which can be accessed on separate ports and even run inside Docker for more reproducible deployments. An extension system lets you enable extra models and tools, install community extensions from a catalog, and manage them via a dedicated GUI or CLI extension manager.
    Downloads: 2 This Week
  • 7
    Amica

    Amica is an open source interface for interactive communication

    Amica is an open source interface for interacting with fully animated 3D characters that combine voice chat, vision, and an emotion engine into a single experience. It lets you hold natural conversations with AI characters that can see, listen, and speak, while expressing emotional states through facial expressions and body language. Users can import VRM character models, adjust their appearance, tune the voice to match the character, and define behavior using different large language models and TTS backends. Under the hood, Amica leverages modern web and desktop technologies: three.js and three-vrm for 3D rendering, Transformers.js for running models in the browser, Whisper and Silero VAD for speech recognition and voice-activity detection, and a variety of LLM backends such as llama.cpp servers, ChatGPT-compatible APIs, Ollama, KoboldCpp, and others. It also integrates multiple text-to-speech providers, including ElevenLabs, OpenAI, Coqui, RVC, and AllTalkTTS.
    Downloads: 1 This Week
  • 8
    Polyglot

    Cross-platform AI language practice app

    Polyglot is a cross-platform AI language practice application that runs as a desktop app and also offers a web version. It is built around conversational large language models and Azure-based text-to-speech services, turning them into an interactive environment for speaking practice in multiple languages. Users can define custom AI personas, choose languages, and configure their own OpenAI and Azure keys so they retain control over which backends they use. The app supports speech recognition with quick keyboard shortcuts, allowing learners to hold down a key to speak and release it to submit for recognition and response. It includes translation features, dark mode, playback of the user’s own recorded speech, and word highlighting that tracks the progress of synthesized audio to make following along easier. Polyglot also integrates additional AI providers, supports configurable conversation scenarios, and lets users personalize avatars, making the experience more engaging and flexible.
    Downloads: 1 This Week
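The word highlighting described for Polyglot can be approximated with a lookup like the one below. This is a sketch under stated assumptions, not Polyglot's implementation: it assumes the TTS backend reports a start offset for each spoken word, and `WordBoundary` and `activeWordIndex` are hypothetical names.

```typescript
// Hypothetical shape: the start offset (in milliseconds of audio) of each word,
// sorted in ascending order of offsetMs.
interface WordBoundary {
  offsetMs: number;
  text: string;
}

// Return the index of the word being spoken at timeMs, i.e. the last boundary
// whose offset is <= timeMs, or -1 if playback has not reached the first word.
// Binary search keeps the lookup cheap when called on every playback tick.
function activeWordIndex(boundaries: WordBoundary[], timeMs: number): number {
  let lo = 0;
  let hi = boundaries.length - 1;
  let result = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (boundaries[mid].offsetMs <= timeMs) {
      result = mid; // candidate; a later word may also have started
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return result;
}
```

A UI could call this from the audio element's time-update handler and apply a highlight style to the word at the returned index.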