[go: up one dir, main page]

Showing 748 open source projects for "sound"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Easy-to-use online form builder for every business. Icon
    Easy-to-use online form builder for every business.

    Create online forms and publish them. Get an email for each response. Collect data.

    Easy-to-use online form builder for every business. Create online forms and publish them. Get an email for each response. Collect data. Design professional looking forms with JotForm Online Form Builder. Customize with advanced styling options to match your branding. Speed up and simplify your daily work by automating complex tasks with JotForm’s industry leading features. Securely and easily sell products. Collect subscription fees and donations. Being away from your computer shouldn’t stop you from getting the information you need. No matter where you work, JotForm Mobile Forms lets you collect data offline with powerful forms you can manage from your phone or tablet. Get the full power of JotForm at your fingertips. JotForm PDF Editor automatically turns collected form responses into professional, secure PDF documents that you can share with colleagues and customers. Easily generate custom PDF files online!
    Learn More
  • 1
    spotDL

    spotDL

    Download your Spotify playlists and songs along with album art

    spotDL finds songs from Spotify playlists on YouTube and downloads them - along with album art, lyrics and metadata.
    Downloads: 162 This Week
    Last Update:
    See Project
  • 2
    Audiomentations

    Audiomentations

    A Python library for audio data augmentation

    ...Tensorflow/Keras or Pytorch. Has helped people get world-class results in Kaggle competitions. Is used by companies making next-generation audio products. Mix in another sound, e.g. a background noise. Useful if your original sound is clean and you want to simulate an environment where background noise is present. A folder of (background noise) sounds to be mixed in must be specified. These sounds should ideally be at least as long as the input sounds to be transformed. Otherwise, the background sound will be repeated, which may sound unnatural. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 3
    SpeechRecognition

    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...
    Downloads: 20 This Week
    Last Update:
    See Project
  • 4
    Podcastfy.ai

    Podcastfy.ai

    Transforming Multimodal Content into Captivating Multilingual Audio

    Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, youtube videos as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources enabling...
    Downloads: 10 This Week
    Last Update:
    See Project
  • Searching for a better way to ship ecommerce? We can help Icon
    Searching for a better way to ship ecommerce? We can help

    ShipHero gives you the tools that give you ecommerce fulfillment super powers.

    ShipHero is built for multi-channel commerce. With a few clicks, you can connect your stores. ShipHero will download new products, as well as sync existing ones. When changes are made to your inventory all connected stores will be updated.
    Learn More
  • 5
    PersonaPlex

    PersonaPlex

    PersonaPlex code

    PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 6
    You-Get

    You-Get

    Dumb downloader that scrapes the web

    You-Get is a small command-line utility for downloading media (video, audio and images) from the Web when there are no other means to do so. It can download video and audio files from such popular web sites as YouTube, Twitter, Niconico, Vimeo, Flickr, Instagram and a whole lot more. You-Get is a great option for when you want to enjoy your favorite videos, audio or images from the internet without having to open any web browsers or get interrupted by ads. It’s also a good choice for...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 7
    yami

    yami

    An open-source music player with simple UI

    Yami is a lightweight, open-source music player built in Python. It focuses on simplicity and ease of use, providing an intuitive user interface (UI) for users to manage and play their music. Whether you're playing local files or downloading from online sources using spotdl, Yami offers a seamless experience. This project is designed for users who want a minimalistic, cross-platform music player with the ability to integrate external sources like Spotify/YouTube Music.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Quod Libet

    Quod Libet

    Music player and music library manager for Linux, Windows, and macOS

    Quod Libet is a cross-platform audio/music management program. It provides many ways to view your local library, and supports streaming audio and feeds (podcasts, etc). It has extremely flexible metadata editing and searching capabilities. With over 90 plugins included, you can extend and integrate with almost anything, or write your own! Ex Falso is a bare-bones tag editor with the same editing interface as Quod Libet. Quod Libet is a GTK+-based audio player written in Python, using the...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 9
    Tauon

    Tauon

    The music player of today

    Tauon is a modern, streamlined music player app that's packed with features! An emphasis on playlists and drag-and-drop importing puts you in control of your music library. Faded volume control, 24-bit FLAC support, and gapless playback provide the ultimate listening experience. Excellent CUE sheet support, an original smart playlist system, and network playback from koel or Airsonic servers. Last.fm, Listenbrainz, and Maloja scribbling. MPRIS2 support for desktop integration. Tauon is a...
    Downloads: 12 This Week
    Last Update:
    See Project
  • The Apple Device Management and Security Platform Icon
    The Apple Device Management and Security Platform

    For IT teams at organizations that run on Apple

    Achieve harmony across your Apple device fleet with Kandji's unmatched management and security capabilities.
    Learn More
  • 10
    Librosa

    Librosa

    Python library for audio and music analysis

    Librosa is a powerful Python library for analyzing and processing audio and music signals. Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 11
    FeelUOwn

    FeelUOwn

    Trying to be a robust, user-friendly and hackable music player

    FeelUOwn is a user-friendly, and hackable music player.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 12
    Swing Music

    Swing Music

    Swing Music is a beautiful, self-hosted music player

    Swing Music is a beautiful, self-hosted music player and streaming server that lets you bring your personal audio library online with a modern browser-based interface, giving you a private alternative to mainstream streaming services. Designed to be both elegant and powerful, the project scans your local music files (like MP3s or FLACs), organizes metadata, and streams them on-demand to any device with a browser or its Android client. It includes features like folder browsing, playlist...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    Kimi-Audio

    Kimi-Audio

    Audio foundation model excelling in audio understanding

    ...It uses a novel model setup that combines continuous acoustic features with discrete semantic tokens to richly capture sound and meaning across speech, music, and environmental audio.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Moshi

    Moshi

    A speech-text foundation model for real time dialogue

    Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio, down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, in a fully streaming manner (latency of 80ms, the frame size), yet performs better than existing, non-streaming, codecs like SpeechTokenizer (50 Hz, 4kbps), or SemantiCodec (50 Hz, 1.3kbps). Moshi models two streams of audio: one corresponds to Moshi, and...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    HunyuanVideo-Foley

    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional use. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    AudioCraft

    AudioCraft

    Audiocraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides inference scripts, checkpoints, and simple Python APIs so you can generate clips from prompts or incorporate the models into applications. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Ren'Py

    Ren'Py

    The Ren'Py Visual Novel Engine

    Ren’Py is a free and open-source visual novel engine that makes telling interactive stories easy by combining narrative scripting with multimedia support for images, sound, and music in a way that’s accessible to both beginners and experienced developers. Its scripting language is designed to be readable and intuitive, allowing writers and creators to define scenes, dialogues, branching choices, character expressions, and events without deep programming knowledge, while still offering the ability to embed Python for advanced control and custom gameplay features. ...
    Downloads: 68 This Week
    Last Update:
    See Project
  • 18
    NovaSR

    NovaSR

    A lightning fast audio upsampler

    NovaSR is an extremely lightweight and high-performance audio upsampling model that transforms low-quality 16 kHz audio into clearer, high-fidelity 48 kHz audio with remarkable speed and efficiency. At only about 50 KB in size, the model is orders of magnitude smaller than typical audio super-resolution networks, yet it achieves high quality and realtime performance thanks to its compact architecture and efficient convolutional design. NovaSR is especially valuable for post-processing tasks...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    Speakr

    Speakr

    Speakr is a personal, self-hosted web application

    Speakr is an open-source, real-time text-to-speech (TTS) web application that allows users to convert written text into natural-sounding speech in just a few clicks. It provides a clean, user-friendly interface where users can input text, choose a voice style or language, and immediately hear the output, making it ideal for accessibility, content creation, and learning applications. Behind the scenes, Speakr leverages modern TTS engines and streaming audio technologies to deliver smooth and...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    The Arcade Library

    The Arcade Library

    Easy to use Python library for creating 2D arcade games

    Arcade is an easy-to-use Python library for creating 2D video games. It provides a modern and straightforward API, enabling developers to craft engaging games and graphical applications efficiently. Arcade supports rendering shapes, handling user input, and managing game physics, making it suitable for both beginners and experienced developers.
    Downloads: 19 This Week
    Last Update:
    See Project
  • 21
    Qwen2-Audio

    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    ...It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound classification, emotion, etc.), and offers pretrained models (e.g. 7B) released via ModelScope and Hugging Face. Code & examples provided with Hugging Face transformers, and usage via AutoProcessor, model classes etc. High performance on many standard benchmarks: ASR, speech-emotion recognition, vocal sound classification, speech translation etc.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    LiveAvatar

    LiveAvatar

    Streaming Real-time Audio-Driven Avatar Generation

    LiveAvatar is an open-source research and implementation project that provides a unified framework for real-time, streaming, interactive avatar video generation driven by audio and other control signals. It implements techniques from state-of-the-art diffusion-based avatar modeling to support infinite-length continuous video generation with low latency, enabling interactive AI avatars that maintain continuity and realism over extended sessions. The project co-designs algorithms and system...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    MoviePy

    MoviePy

    Video editing with Python

    MoviePy is a Python module for video editing, which can be used for basic operations (like cuts, concatenations, title insertions), video compositing (a.k.a. non-linear editing), video processing, or to create advanced effects. It can read and write the most common video formats, including GIF. MoviePy is an open source software originally written by Zulko and released under the MIT licence. It works on Windows, Mac, and Linux, with Python 2 or Python 3. The code is hosted on Github, where...
    Downloads: 45 This Week
    Last Update:
    See Project
  • 24
    AI Chatbot Framework

    AI Chatbot Framework

    Python chatbot framework with Natural Language Understanding

    Building a chatbot can sound daunting, but it’s totally doable. AI Chatbot Framework is an AI powered conversational dialog interface built in Python. With this tool, it’s easy to create Natural Language conversational scenarios with no coding efforts whatsoever. The smooth UI makes it effortless to create and train conversations to the bot and it continuously gets smarter as it learns from conversations it has with people.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 25
    OSWorld

    OSWorld

    Benchmarking Multimodal Agents for Open-Ended Tasks

    OSWorld is an open-source synthetic world environment designed for embodied AI research and multi-agent learning. It provides a richly simulated 3D world where multiple agents can interact, perform tasks, and learn complex behaviors. OSWorld emphasizes multi-modal interaction, enabling agents to process visual, auditory, and symbolic data for grounded learning in a simulated world.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next