
Showing 120 open source projects for "ai voice"

  • 1
    Qwen3-ASR

    Qwen3-ASR is an open-source series of ASR models

    ...The architecture combines advanced neural acoustic modeling with context-aware language prediction so that outputs maintain both fidelity to the original speech and grammatical coherence. This makes Qwen3-ASR suitable for voice-driven applications like AI assistants, dictation tools, speech analytics pipelines, and accessibility features, where accurate and fluid transcription is critical.
    Downloads: 9 This Week
  • 2
    Qwen-Audio

    Chat & pretrained large audio language model proposed by Alibaba Cloud

    Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and to output text. There is also an instruction-tuned version called Qwen-Audio-Chat, which supports multi-round conversational interaction, audio + text input, creative tasks, and reasoning over audio. It uses multi-task training over more than 30 different audio tasks and achieves strong performance across multiple benchmarks...
    Downloads: 1 This Week
  • 3
    PyGPT

    Open source personal AI Assistant for Linux, Windows and Mac

    PyGPT is a desktop application that lets you talk to OpenAI's LLMs such as GPT-4 and GPT-3 using your own computer and the OpenAI API. It supports chat mode and completion mode, and can generate images using DALL-E 2. PyGPT also gives GPT access to the Internet via the Google Custom Search API and Wikipedia API, and includes voice synthesis using the Microsoft Azure Text-to-Speech API. Moreover, the application has implemented context memory support, context storage,...
    Downloads: 2 This Week
  • 4
    Note67

    A private, local meeting notes assistant

    note67 is a private, local meeting notes assistant application that combines audio capture, transcription, and AI-powered summarization to help users document conversations and meetings on their own devices without relying on cloud services. Built with a cross-platform architecture using Rust (via Tauri) for backend logic and a TypeScript/React frontend, it prioritizes privacy by performing audio transcription locally with Whisper models and generating summaries with locally-hosted AI, eliminating the need to send sensitive meeting content to external servers. ...
    Downloads: 4 This Week
  • 5

    VOIP-VOICE-TO-TEXT&ANALYS

    Convert VoIP calls to text and analyze them with AI

    The VoIP voice-to-text software for Issabel is an AI-based solution that converts calls into accurate Persian text. After each call, the audio file is sent to the GPT-4o AI engine, which produces an editable transcript. The software also provides AI-powered call analysis, extracting key points, customer requests, satisfaction levels, and sensitive topics, all of which are stored in the database.
    Downloads: 13 This Week
  • 6
    Voqal

    Natural speech programming assistant for software developers

    Voqal is a programming assistant built for software developers looking to enhance their productivity with natural speech programming. Using Voqal, you can navigate, write, run, and debug software in JetBrains IDEs using your voice. Write code faster, reduce repetitive strain injuries, and improve focus and productivity. Voqal is promptable and privacy-focused, allowing you to customize your experience and control your data.
    Downloads: 1 This Week
  • 7
    Voice Accounting For Blind & Mute People

    Free & Easy AI Voice Accounting Software For Blind & Speechless People

    Download the zip file above, extract it, and open the index.html file in a web browser such as Firefox (preferred) or Google Chrome. My full collection of software for people with disabilities, which also includes the Voice Accounting Software, is available here: https://sourceforge.net/projects/softwares-for-disabled-people/
    Downloads: 1 This Week
  • 8
    Languine

    Translate your application with Languine CLI powered by AI.

    Languine is an AI-powered localization platform designed to automate the translation process for applications and fit into existing development workflows. It offers context-aware translations across more than 100 languages while keeping brand voice and tone consistent. It provides a command-line interface and CI/CD integration, allowing developers to manage translations directly or automate them within existing pipelines. ...
    Downloads: 1 This Week
  • 9
    Tock

    Tock, the open source conversational AI toolkit

    Complete and autonomous NLU solution leveraging open source libraries such as OpenNLP, Stanford, Duckling and more. Supports web, mobile, social networks, smart speakers and more. Create your bot once, then connect it progressively to multiple channels as you need them. Simple graphical interfaces let you build stories and models, manage multilingual and multichannel bots, and better understand users with analytics. Program complex stories using the provided Kotlin, Python or Node.js components, or integrate with any...
    Downloads: 2 This Week
  • 10
    Step-Audio 2

    Multi-modal large language model designed for audio understanding

    Step-Audio2 is an advanced, end-to-end multimodal large language model designed for high-fidelity audio understanding and natural speech conversation: unlike many pipelines that separate speech recognition, processing, and synthesis, Step-Audio2 processes raw audio, reasons about semantic and paralinguistic content (like emotion, speaker characteristics, non-verbal cues), and can generate contextually appropriate responses — including potentially generating or transforming audio output. It...
    Downloads: 0 This Week
  • 11
    Chatterbox

    SoTA open-source TTS

    Chatterbox is Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs and is consistently preferred in side-by-side evaluations. Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Try it now on our...
    Downloads: 15 This Week
  • 12
    Pedalboard

    A Python library for audio

    ...It supports the most popular audio file formats and a number of common audio effects out of the box and also allows the use of VST3® and Audio Unit formats for loading third-party software instruments and effects. pedalboard was built by Spotify’s Audio Intelligence Lab to enable using studio-quality audio effects from within Python and TensorFlow. Internally at Spotify, pedalboard is used for data augmentation to improve machine learning models and to help power features like Spotify’s AI DJ and AI Voice Translation. pedalboard also helps in the process of content creation, making it possible to add effects to audio without using a Digital Audio Workstation.
    Downloads: 7 This Week
  • 13
    OpenAI Realtime Embedded

    Instructions on how to use the Realtime API on Microcontrollers

    openai-realtime-embedded is a repository that provides resources, SDKs, and example links for using OpenAI’s Realtime API on embedded hardware platforms (e.g. microcontrollers). The goal is to enable low-latency conversational agents (e.g. voice-based assistants) running directly on constrained devices, by leveraging WebRTC and streaming APIs to communicate with OpenAI systems. The repo includes pointers to an ESP32 implementation (maintained as esp32 branch) and documentation that Espressif...
    Downloads: 1 This Week
  • 14
    Coqui TTS

    A deep learning toolkit for Text-to-Speech, battle-tested in research

    TTS is a library for advanced Text-to-Speech generation. Built on the latest research, it was designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. High-performance Deep Learning models for Text2Speech tasks. Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech). Speaker Encoder to compute speaker embeddings...
    Downloads: 47 This Week
  • 15
    Voxal voice changer

    Transform your voice in real time with Voxal Voice Changer

    Voxal Voice Changer is a program that lets you modify your voice by applying various effects (e.g. pitch change, echo) in real time. Effects can be added in any sequence and in any combination, allowing you to distort your voice beyond recognition. Take your audio to the next level! Our powerful Voice Changer software lets you morph your voice in real time with AI-powered quality.
    Downloads: 19 This Week
  • 16
    elevenlabs-api

    elevenlabs-api is an open source Java wrapper around the ElevenLabs

    ...The most realistic and versatile AI speech software, ever. Eleven brings the most compelling, rich and lifelike voices to creators and publishers seeking the ultimate tools for storytelling. Generate top-quality spoken audio in any voice and style with the most advanced and multipurpose AI speech tool out there. Our deep learning model renders human intonation and inflections with unprecedented fidelity and adjusts delivery based on context.
    Downloads: 10 This Week
  • 17
    Auto Synced & Translated Dubs

    Automatically translates the text of a video based on a subtitle file

    Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken...
    Downloads: 4 This Week
  • 18
    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    Step-Audio-EditX is an open-source, 3 billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer) — combining a linguistic token stream and a semantic (prosody/emotion/style) token stream — thereby abstracting audio editing into high-level...
    Downloads: 0 This Week
  • 19
    GHOST-AI CHAT

    GHOST AI CHAT

    Introducing our brand-new GHOST AI chat app for Windows Steam devices! Enjoy seamless conversations with our advanced AI technology. Personalize your chat experience with customizable settings. Explore a wide range of topics with our diverse knowledge base.
    https://discord.gg/nmupE7Xhuk
    https://ghostai.pro/
    https://play.google.com/store/apps/details?id=com.ghostai.chat
    Downloads: 7 This Week
  • 20
    BlackBelt WASTE - ipv4/Tor/i2p +AI+Voice

    Modern, AI-Smart, WASTE p2p for ipv4, Tor and i2p + Voice Conference.

    Open source (GPLv3, including images). A WASTE client: download it and create your own WASTE networks. Move thousands of GB at 100 MB+ per second (800 Mbit/s), with full pause and resume. Hold voice conferences, chat, transfer files, and participate in forums in a secure environment. For Windows XP 32/64, Vista 32/64, Win7 32/64, Win8 32/64, Win 10, Win 11, and Linux (WINE).
    *** User Based Access Control - for voice, chats, file transfers and uploads (useful in NULLNETS)
    *** Distributed Autonomic-Performance-Tuning - A Goal-Seeking Swarming-Semiotic AI
    *** AI Connect - AI Managed Connections
    *** Self-Organising Anti-Spoofing Technology
    *** Geared Multi-threading, providing the smoothest performance possible
    *** Advanced Threat Detect and Manage Technology
    *** Voice Conferencing Over WASTE
    *** RNN - Recurrent Neural Network - AI Noise Reduction
    *** Differential File Transfer - Seriously fast data backups
    Downloads: 19 This Week
  • 21
    Amica

    Amica is an open source interface for interactive communication

    Amica is an open source interface for interacting with fully animated 3D characters that combine voice chat, vision, and an emotion engine into a single experience. It lets you hold natural conversations with AI characters that can see, listen, and speak, while expressing emotional states through facial expressions and body language. Users can import VRM character models, adjust their appearance, tune the voice to match the character, and define behavior using different large language models and TTS backends. ...
    Downloads: 1 This Week
  • 22
    VoiceClip

    VoiceClip is a user assistance application

    VoiceClip is a user assistance application designed to integrate seamlessly into your work environment, providing quick and efficient access to various functions through voice and text commands. Presented as a toolbar that always stays visible in the foreground, VoiceClip aims to simplify common tasks, improve productivity, and make it easier to interact with your operating system and with advanced artificial intelligence technologies.
    Downloads: 7 This Week
  • 23
    comfyui-mixlab-nodes

    Workflow and speech recognition app

    ...The project also brings Real-time Design features like screen capture and floating video nodes, enabling creative pipelines that mix live screen content, generative models, and visual effects. For audio and speech, it provides nodes for SpeechRecognition and SpeechSynthesis, plus workflows that combine voice generation with real-time face swapping and other audio-visual effects. On the AI side, it integrates multiple LLM providers (cloud and local), supports OpenAI-compatible endpoints and Siliconflow models, and includes prompt-focused utilities for random prompt generation, Chinese prompts, and CLIP interrogation.
    Downloads: 1 This Week
  • 24
    Piper TTS

    A fast, local neural text to speech system

    Piper is a fast, local neural text-to-speech (TTS) system developed by the Rhasspy team. Optimized for devices like the Raspberry Pi 4, Piper enables high-quality speech synthesis without relying on cloud services, making it ideal for privacy-conscious applications. It utilizes ONNX models trained with VITS to deliver natural-sounding voices across various languages and accents. Piper is particularly suited for offline voice assistants and embedded systems.
    Downloads: 477 This Week
  • 25
    Personal A.I Assistant
    An open source personal A.I. assistant based on the Google Gemini API that is fully customizable for your needs. Ask questions, request real-time data and information, play music, launch programs, and open websites on your PC with voice commands. ***Requires your Google Gemini API key to work***
    Downloads: 13 This Week