sound to text free download

Showing 442 open source projects for "sound to text"

1

Sound to Text

Convert a sound file to text to analyze it

Convert a sound file to text you can analyze to spot hidden words for psychological or universal-philosophical interpretation. The program will convert a sound file to text, with the purpose of analyzing it to spot intelligible words. Use the program with songs, movie or TV dialogue, private recordings, to reveal the hidden text messages of the sound. The resulting text will be a long string of mostly repeating characters but, every so often, you will notice an intelligible word, either...

Downloads: 1 This Week

Last Update: 2023-12-09
See Project
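
The description above does not say how the program turns audio into characters. As a purely illustrative sketch, and not the project's actual method, one simple mapping that yields "a long string of mostly repeating characters" is to bucket each unsigned 8-bit sample into one of 26 letters. The function name `samples_to_text` and the 8-bit sample format are assumptions made for this example.

```python
import string


def samples_to_text(samples: bytes) -> str:
    """Map unsigned 8-bit samples (0..255) onto the letters a..z.

    Hypothetical mapping for illustration only; the listed project
    does not document its conversion scheme here.
    """
    letters = string.ascii_lowercase
    # Scale each sample from the 0..255 range into a 0..25 letter index.
    return "".join(letters[s * len(letters) // 256] for s in samples)


# Unsigned 8-bit PCM silence sits at 128, so quiet passages repeat one letter.
quiet = bytes([128] * 8)
print(samples_to_text(quiet))  # prints nnnnnnnn
```

Under this assumed mapping, near-silent stretches collapse into runs of a single letter, which matches the repetitive output the description mentions.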
2

Frescobaldi

LilyPond sheet music text editor

Frescobaldi is a free and open source LilyPond sheet music text editor. Designed to be powerful yet lightweight and easy to use, Frescobaldi offers a host of useful features such as a music view with advanced two-way Point & Click, MIDI capturing to enter music, a Snippet Manager and many more. Frescobaldi is named after Girolamo Frescobaldi (1583-1643), an Italian composer of keyboard music in the late Renaissance and early Baroque period.

2 Reviews

Downloads: 46 This Week

Last Update: 2025-08-08
See Project
3

p5.js

Client-side JS platform for artists, designers and students to express

... objects for text, input, video, webcam, and sound. p5.js is an interpretation of Processing for today’s web. We hold events and operate with support from the Processing Foundation. For self-learners and animators, artists, game makers, creative technologists, curriculum planners, designers, graphic designers, graphics editors, learning experience designers, project managers, software engineers, students, teachers, university faculty members, visualization researchers, etc.

Downloads: 20 This Week

Last Update: 2025-09-01
See Project
4

Spoon

Metaprogramming library to analyze and transform Java source code

Spoon is an open-source library to analyze, rewrite, transform, and transpile Java source code. It parses source files to build a well-designed AST with a powerful analysis and transformation API. It supports modern Java versions up to Java 20. Spoon is an official Inria open-source project and a member of the OW2 open-source consortium.

Downloads: 15 This Week

Last Update: 2025-08-10
See Project
5

Axmol Engine

Multi-platform Engine for Desktop, XBOX (UWP) and Mobile games

Axmol is a modern C++ game engine forked from Cocos2d-x, designed to support high-performance 2D and lightweight 3D game development across multiple platforms. It improves upon the original Cocos2d-x with a cleaner architecture, better tooling, and support for modern C++ standards. Axmol supports scripting with Lua and JavaScript, and is suitable for both indie developers and studios targeting mobile, desktop, and web platforms. With an active community and frequent updates, Axmol is a solid...

Downloads: 12 This Week

Last Update: 2025-10-05
See Project
6

AudioCraft

AudioCraft is a library for audio processing and generation

AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides...

Downloads: 5 This Week

Last Update: 4 hours ago
See Project
7

Scribble Deluxe

Efficient, internationalized, text renderer for GameMaker

A modern text renderer for GameMaker 2023.8 by Juju Adams. Efficient, internationalized, multi-effects text renderer for GameMaker. Scribble Deluxe is a comprehensive text rendering library designed to replace GameMaker’s native draw_text() functions without adding unnecessary complexity. Scribble’s design should feel familiar and intuitive for GameMaker users. Scribble Deluxe supports all GameMaker export platforms, with the exception of HTML5.

Downloads: 5 This Week

Last Update: 2025-09-23
See Project
8

SpeechRecognition

Speech recognition module for Python

Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...

Downloads: 5 This Week

Last Update: 2025-05-12
See Project
9

The Arcade Library

Easy to use Python library for creating 2D arcade games

Arcade is an easy-to-use Python library for creating 2D video games. It provides a modern and straightforward API, enabling developers to craft engaging games and graphical applications efficiently. Arcade supports rendering shapes, handling user input, and managing game physics, making it suitable for both beginners and experienced developers.

Downloads: 6 This Week

Last Update: 3 days ago
See Project
10

Podcastfy.ai

Transforming Multimodal Content into Captivating Multilingual Audio

Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, and YouTube videos as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources enabling customization...

Downloads: 3 This Week

Last Update: 2024-11-16
See Project
11

SmallBASIC

SmallBASIC is a fast and easy to learn BASIC language interpreter

SmallBASIC is a lightweight and powerful BASIC interpreter designed for simplicity and speed, suitable for hobbyists, educators, and retro computing enthusiasts. It offers a traditional text-based programming experience reminiscent of early microcomputers, while including modern features such as structured programming, graphics, and file I/O. SmallBASIC runs on multiple platforms, including Windows, Linux, Android, and DOS, making it accessible across a wide range of systems.

Downloads: 4 This Week

Last Update: 2025-05-28
See Project
12

Tagify

Lightweight, efficient Tags input component in Vanilla JS

Transforms an input field or a textarea into a Tags component in an easy, customizable way, with great performance and a small code footprint, packed with features. Customizable HTML templates for the different areas of the component (wrapper, tags, dropdown, dropdown item, dropdown header, dropdown footer). Shows a suggestions list (flexible settings & styling) at full (component) width or next to the typed text (caret). Allows setting suggestions' aliases for easier fuzzy-searching....

Downloads: 4 This Week

Last Update: 2025-08-28
See Project
13

Qwen2-Audio

Repo of Qwen2-Audio chat & pretrained large audio language model

Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...

Downloads: 3 This Week

Last Update: 2025-09-23
See Project
14

HunyuanVideo-Foley

Multimodal Diffusion with Representation Alignment

HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional use...

Downloads: 3 This Week

Last Update: 2025-09-23
See Project
15

MLX42

Codam's own fixed, functioning alternative of the miniLibX

MLX42 is a modern C graphics and windowing library built on top of GLFW and inspired by the original MLX library used in 42 school projects. It aims to provide a higher-level, beginner-friendly abstraction for students learning about graphical programming, while also embracing modern practices like event-driven input, texture rendering, and transparency. MLX42 is structured to reduce boilerplate and simplify the creation of games or interactive applications in C, making it an excellent...

Downloads: 3 This Week

Last Update: 2025-03-27
See Project
16

Qwen-Audio

Chat & pretrained large audio language model proposed by Alibaba Cloud

Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and output text. There is also an instruction-tuned version called Qwen-Audio-Chat which supports conversational interaction (multi-round), audio + text input, creative tasks and reasoning over audio. It uses multi-task training over many different audio tasks (30+), and achieves strong multi-benchmarks performance...

Downloads: 1 This Week

Last Update: 2025-09-23
See Project
17

ImageBind

ImageBind: One Embedding Space to Bind Them All

... any modality can be compared or retrieved against any other (e.g., matching sound to text or depth to image). The model is trained using large-scale contrastive learning, leveraging diverse datasets from natural images, videos, audio clips, and sensor data. Once trained, it can perform cross-modal retrieval, zero-shot classification, and multimodal composition without additional fine-tuning.

Downloads: 1 This Week

Last Update: 6 days ago
See Project
18

Moshi

A speech-text foundation model for real time dialogue

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi compresses 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, in a fully streaming manner (latency of 80 ms, the frame size), yet performs better than existing non-streaming codecs like SpeechTokenizer (50 Hz, 4 kbps) or SemantiCodec (50 Hz, 1.3 kbps). Moshi models two streams of audio: one corresponds to Moshi...

Downloads: 0 This Week

Last Update: 2024-11-05
See Project
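
The Mimi figures quoted in the Moshi description can be cross-checked with a little arithmetic. The sketch below uses only the numbers stated there (24 kHz input, 12.5 Hz frame rate, 1.1 kbps); the variable names are illustrative, and the bits-per-frame value is a derived quantity, not a figure from the project.

```python
# Figures taken from the description above.
SAMPLE_RATE_HZ = 24_000   # input audio sample rate
FRAME_RATE_HZ = 12.5      # codec frames per second
BITRATE_BPS = 1_100       # 1.1 kbps

# One frame every 1/12.5 s, which is the stated 80 ms latency.
frame_ms = 1000 / FRAME_RATE_HZ

# Number of raw audio samples that each codec frame summarizes.
samples_per_frame = SAMPLE_RATE_HZ / FRAME_RATE_HZ

# Bit budget available per frame at the stated bandwidth (derived).
bits_per_frame = BITRATE_BPS / FRAME_RATE_HZ

print(frame_ms, samples_per_frame, bits_per_frame)  # prints 80.0 1920.0 88.0
```

So each 80 ms frame covers 1920 input samples, and at 1.1 kbps that leaves roughly 88 bits per frame.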
19

SFML.Net

Official binding of SFML for .Net languages

SFML.Net is the official .NET binding for the Simple and Fast Multimedia Library (SFML), providing C# developers with access to a powerful multimedia and game development framework. It wraps SFML’s C++ API into a user-friendly .NET interface, making it easy to build 2D games, multimedia apps, and simulations with graphics, sound, windowing, and input support. SFML.Net keeps the design idiomatic to C#, maintaining SFML's performance and portability while providing seamless integration with .NET...

Downloads: 0 This Week

Last Update: 2025-03-26
See Project
20

Text to Chord

Turn words into chords

Convert words and sentences to 5 note chords you can use to inspire music creation. Have fun turning your name, your city name, your friends' names, your team's name, your pet's name into wild and original harmonies that go beyond serialism and classic jazz.

Downloads: 5 This Week

Last Update: 2024-09-19
See Project
21

Text to Waveform

Create synth presets from words

Convert words to waveforms you can load into a synthesizer oscillator to create synth presets. Have fun turning your name, your friends' names, your city name, your pet's name, your team's name into synth presets you can use to produce a track.

Downloads: 0 This Week

Last Update: 2023-12-09
See Project
22

Omilo - a text to speech application

Omilo is a simple text to speech application

Omilo is a simple text to speech application for Windows and Linux using Festival, Flite, Marytts and Piper voices.

3 Reviews

Downloads: 12 This Week

Last Update: 2024-09-20
See Project
23

Russian Text-to-speech programs

Chitanie, Chtenie, Govorenie

Chitanie, Chtenie and Govorenie are three programs that attempt to convert Russian text to Russian speech. They run on Windows and may also work on Linux through Wine. Donations are welcome at paypal.me/alkbab

Downloads: 1 This Week

Last Update: 2025-05-07
See Project
24

Equalizer APO

A system-wide equalizer for Windows 7 / 8 / 8.1 / 10 / 11

... in conjunction with Room EQ Wizard (http://www.roomeqwizard.com/), because it can read its filter text file format. Please have a look at the Wiki (http://sourceforge.net/p/equalizerapo/wiki/) for instructions on installation and configuration

280 Reviews

Downloads: 67,706 This Week

Last Update: 2025-05-09
See Project
25

Tux Paint

An award-winning drawing program for children of all ages

Tux Paint is a free, award-winning drawing program originally created for children ages 3 to 12, but enjoyed by all! It combines an easy-to-use interface, fun sound effects, and an encouraging cartoon mascot who guides children as they use the program. Children are presented with a blank canvas and a variety of drawing tools to help them be creative. Along with paintbrush, shapes and text, Tux Paint includes a "stamp" feature to add pre-drawn or photographic imagery to pictures, and a set...

111 Reviews

Downloads: 19,283 This Week

Last Update: 2025-09-28
See Project