Showing 442 open source projects for "sound to text"

  • 1
    Sound to Text

    Convert a sound file to text to analyze it

    Convert a sound file to text that you can analyze to spot hidden words for psychological or universal-philosophical interpretation. Use the program with songs, movie or TV dialogue, or private recordings to reveal hidden text messages in the sound. The resulting text will be a long string of mostly repeating characters but, every so often, you will notice an intelligible word, either...
    Downloads: 1 This Week
  • 2
    Frescobaldi

    LilyPond sheet music text editor

    Frescobaldi is a free and open source LilyPond sheet music text editor. Designed to be powerful yet lightweight and easy to use, Frescobaldi offers a host of useful features such as a music view with advanced two-way Point & Click, MIDI capture to enter music, a Snippet Manager, and much more. Frescobaldi is named after Girolamo Frescobaldi (1583-1643), an Italian composer of keyboard music in the late Renaissance and early Baroque period.
    Downloads: 46 This Week
  • 3
    p5.js

    Client-side JS platform for artists, designers and students to express

    ... objects for text, input, video, webcam, and sound. p5.js is an interpretation of Processing for today’s web. We hold events and operate with support from the Processing Foundation. For self-learners and animators, artists, game makers, creative technologists, curriculum planners, designers, graphic designers, graphics editors, learning experience designers, project managers, software engineers, students, teachers, university faculty members, visualization researchers, etc.
    Downloads: 20 This Week
  • 4
    Spoon

    Metaprogramming library to analyze and transform Java source code

    Spoon is an open-source library to analyze, rewrite, transform, and transpile Java source code. It parses source files to build a well-designed AST with a powerful analysis and transformation API. It supports modern Java versions up to Java 20. Spoon is an official Inria open-source project and a member of the OW2 open-source consortium.
    Downloads: 15 This Week
  • 5
    Axmol Engine

    Multi-platform Engine for Desktop, XBOX (UWP) and Mobile games

    Axmol is a modern C++ game engine forked from Cocos2d-x, designed to support high-performance 2D and lightweight 3D game development across multiple platforms. It improves upon the original Cocos2d-x with a cleaner architecture, better tooling, and support for modern C++ standards. Axmol supports scripting with Lua and JavaScript, and is suitable for both indie developers and studios targeting mobile, desktop, and web platforms. With an active community and frequent updates, Axmol is a solid...
    Downloads: 12 This Week
  • 6
    AudioCraft

    AudioCraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides...
    Downloads: 5 This Week
  • 7
    Scribble Deluxe

    Efficient, internationalized, text renderer for GameMaker

    Scribble Deluxe, by Juju Adams, is a modern, efficient, internationalized, multi-effects text renderer for GameMaker 2023.8. It is a comprehensive text rendering library designed to replace GameMaker’s native draw_text() functions without adding unnecessary complexity. Scribble’s design should feel familiar and intuitive for GameMaker users. Scribble Deluxe supports all GameMaker export platforms, with the exception of HTML5.
    Downloads: 5 This Week
  • 8
    SpeechRecognition

    Speech recognition module for Python

    A library for performing speech recognition, with support for several engines and APIs, online and offline. It can recognize speech input from the microphone, transcribe an audio file, and save audio data to an audio file. It can also show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details), and listen to a microphone in the background, among other useful recognizer features. The easiest way to install this is using...
    Downloads: 5 This Week
  • 9
    The Arcade Library

    Easy-to-use Python library for creating 2D arcade games

    Arcade is an easy-to-use Python library for creating 2D video games. It provides a modern and straightforward API, enabling developers to craft engaging games and graphical applications efficiently. Arcade supports rendering shapes, handling user input, and managing game physics, making it suitable for both beginners and experienced developers.
    Downloads: 6 This Week
  • 10
    Podcastfy.ai

    Transforming Multimodal Content into Captivating Multilingual Audio

    Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, and YouTube videos, as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources enabling customization...
    Downloads: 3 This Week
  • 11
    SmallBASIC

    SmallBASIC is a fast and easy to learn BASIC language interpreter

    SmallBASIC is a lightweight and powerful BASIC interpreter designed for simplicity and speed, suitable for hobbyists, educators, and retro computing enthusiasts. It offers a traditional text-based programming experience reminiscent of early microcomputers, while including modern features such as structured programming, graphics, and file I/O. SmallBASIC runs on multiple platforms, including Windows, Linux, Android, and DOS, making it accessible across a wide range of systems.
    Downloads: 4 This Week
  • 12
    Tagify

    Lightweight, efficient Tags input component in Vanilla JS

    Transforms an input field or a textarea into a Tags component in an easy, customizable way, with great performance, a small code footprint, and a wide range of features. Offers customizable HTML templates for the different areas of the component (wrapper, tags, dropdown, dropdown item, dropdown header, dropdown footer). Shows a suggestions list (flexible settings & styling) at full (component) width or next to the typed text (caret). Allows setting suggestions' aliases for easier fuzzy-searching....
    Downloads: 4 This Week
  • 13
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...
    Downloads: 3 This Week
  • 14
    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. It produces high-quality 48 kHz audio output suitable for professional use...
    Downloads: 3 This Week
  • 15
    MLX42

    Codam's own fixed, functioning alternative to miniLibX

    MLX42 is a modern C graphics and windowing library built on top of GLFW and inspired by the original MLX library used in 42 school projects. It aims to provide a higher-level, beginner-friendly abstraction for students learning about graphical programming, while also embracing modern practices like event-driven input, texture rendering, and transparency. MLX42 is structured to reduce boilerplate and simplify the creation of games or interactive applications in C, making it an excellent...
    Downloads: 3 This Week
  • 16
    Qwen-Audio

    Chat & pretrained large audio language model proposed by Alibaba Cloud

    Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and to output text. There is also an instruction-tuned version called Qwen-Audio-Chat, which supports multi-round conversational interaction, audio + text input, creative tasks, and reasoning over audio. It uses multi-task training over more than 30 different audio tasks and achieves strong performance across multiple benchmarks...
    Downloads: 1 This Week
  • 17
    ImageBind

    ImageBind: One Embedding Space to Bind Them All

    ... any modality can be compared or retrieved against any other (e.g., matching sound to text or depth to image). The model is trained using large-scale contrastive learning, leveraging diverse datasets from natural images, videos, audio clips, and sensor data. Once trained, it can perform cross-modal retrieval, zero-shot classification, and multimodal composition without additional fine-tuning.
    Downloads: 1 This Week
  • 18
    Moshi

    A speech-text foundation model for real time dialogue

    Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, in a fully streaming manner (latency of 80 ms, the frame size), yet performs better than existing non-streaming codecs like SpeechTokenizer (50 Hz, 4 kbps) or SemantiCodec (50 Hz, 1.3 kbps). Moshi models two streams of audio: one corresponds to Moshi...
    Downloads: 0 This Week
  • 19
    SFML.Net

    Official binding of SFML for .NET languages

    SFML.Net is the official .NET binding for the Simple and Fast Multimedia Library (SFML), providing C# developers with access to a powerful multimedia and game development framework. It wraps SFML’s C++ API into a user-friendly .NET interface, making it easy to build 2D games, multimedia apps, and simulations with graphics, sound, windowing, and input support. SFML.Net keeps the design idiomatic to C#, maintaining SFML's performance and portability while providing seamless integration with .NET...
    Downloads: 0 This Week
  • 20
    Text to Chord

    Turn words into chords

    Convert words and sentences to five-note chords you can use to inspire music creation. Have fun turning your name, your city name, your friends' names, your team's name, or your pet's name into wild and original harmonies that go beyond serialism and classic jazz.
    Downloads: 5 This Week
  • 21
    Text to Waveform

    Create synth presets from words

    Convert words to waveforms you can load into a synthesizer oscillator to create synth presets. Have fun turning your name, your friends' names, your city name, your pet's name, your team's name into synth presets you can use to produce a track.
    Downloads: 0 This Week
  • 22

    Omilo - a text to speech application

    Omilo is a simple text to speech application

    Omilo is a simple text-to-speech application for Windows and Linux using Festival, Flite, MaryTTS, and Piper voices.
    Downloads: 12 This Week
  • 23

    Russian Text-to-speech programs

    Chitanie, Chtenie, Govorenie

    Three Russian text-to-speech programs (Chitanie, Chtenie, and Govorenie) that attempt to convert Russian text into Russian speech. They run on Windows and may also work on Linux through Wine. If you want to donate: paypal.me/alkbab
    Downloads: 1 This Week
  • 24

    Equalizer APO

    A system-wide equalizer for Windows 7 / 8 / 8.1 / 10 / 11

    ... in conjunction with Room EQ Wizard (http://www.roomeqwizard.com/), because it can read its filter text file format. Please have a look at the Wiki (http://sourceforge.net/p/equalizerapo/wiki/) for instructions on installation and configuration.
    Downloads: 67,706 This Week
  • 25
    Tux Paint

    An award-winning drawing program for children of all ages

    Tux Paint is a free, award-winning drawing program originally created for children ages 3 to 12, but enjoyed by all! It combines an easy-to-use interface, fun sound effects, and an encouraging cartoon mascot who guides children as they use the program. Children are presented with a blank canvas and a variety of drawing tools to help them be creative. Along with a paintbrush, shapes, and text, Tux Paint includes a "stamp" feature to add pre-drawn or photographic imagery to pictures, and a set...
    Downloads: 19,283 This Week
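
The Moshi entry in the listing above quotes several figures for the Mimi codec: 24 kHz input audio, a 12.5 Hz frame rate, a 1.1 kbps bandwidth, and an 80 ms frame size. As a rough sanity check, the arithmetic relating those figures can be sketched as follows. The helper name `frame_stats` and its parameters are hypothetical, chosen only for illustration, and are not part of Moshi or Mimi.

```python
def frame_stats(sample_rate_hz: float, frame_rate_hz: float, bitrate_bps: float) -> dict:
    """Relate a codec's sample rate, frame rate, and bitrate.

    Illustrative helper only; this is not an API of Moshi or Mimi.
    """
    return {
        # Input audio samples summarized by each codec frame.
        "samples_per_frame": sample_rate_hz / frame_rate_hz,
        # Duration of one frame in milliseconds.
        "frame_duration_ms": 1000.0 / frame_rate_hz,
        # Bits available to encode each frame at the given bitrate.
        "bits_per_frame": bitrate_bps / frame_rate_hz,
    }


# Figures quoted in the Moshi listing: 24 kHz audio, 12.5 Hz frames, 1.1 kbps.
mimi = frame_stats(24_000, 12.5, 1_100)
print(mimi)
```

With the listed figures, each frame covers 1920 input samples and lasts 80 ms, which matches the quoted latency, and carries 88 bits.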