Showing 54 open source projects for "layered voice analysis"

  • 1
    Rasa

    Open source machine learning framework to automate text conversations

    Rasa is an open source machine learning framework to automate text- and voice-based conversations. With Rasa, you can build contextual assistants on Facebook Messenger, Slack, Google Hangouts, Webex Teams, Microsoft Bot Framework, Rocket.Chat, Mattermost, Telegram, and Twilio or on your own custom conversational channels. Rasa helps you build contextual assistants capable of having layered conversations with lots of back-and-forths.
    Downloads: 18 This Week
  • 2
    Joern

    Open-source code analysis platform for C/C++/Java/Binary/Javascript

    Joern is a platform for analyzing source code, bytecode, and binary executables. It generates code property graphs (CPGs), a graph representation of code for cross-language code analysis. Code property graphs are stored in a custom graph database. This allows code to be mined using search queries formulated in a Scala-based domain-specific query language. Joern is developed with the goal of providing a useful tool for vulnerability discovery and research in static program analysis.
    Downloads: 4 This Week
  • 3
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice-only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound classification, emotion, etc.), and offers pretrained models (e.g. 7B) released via ModelScope and Hugging Face. ...
    Downloads: 1 This Week
  • 4
    Luna AI

    Virtual AI anchor that combines state-of-the-art technology

    Luna AI is a virtual AI streamer framework designed to power an interactive VTuber that can go live on major platforms and chat with viewers in real time. It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE),...
    Downloads: 6 This Week
  • 5
    MiniCPM-o

    A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

    MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports...
    Downloads: 9 This Week
  • 6
    Big-AGI

    AI suite powered by state-of-the-art models and providing advanced AI

    Big-AGI is a comprehensive, open-source AI workspace built to serve as a powerful multi-model interface for developers, researchers, and professionals who want deep control over generative AI workflows and outputs. It unifies access to multiple large language models (LLMs) and AI services through a modern web UI that emphasizes efficient interaction, flexibility, and extensibility, enabling users to conduct multi-model chats, execute code, generate images, and perform voice or text-based...
    Downloads: 3 This Week
  • 7
    Qwen-Audio

    Chat & pretrained large audio language model proposed by Alibaba Cloud

    Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and to output text. There is also an instruction-tuned version called Qwen-Audio-Chat, which supports multi-round conversational interaction, audio + text input, creative tasks, and reasoning over audio. It uses multi-task training over more than 30 different audio tasks, and achieves strong performance across multiple benchmarks...
    Downloads: 1 This Week
  • 8
    Continuous Claude v3

    Context management for Claude Code. Hooks maintain state via ledgers

    ...The project orchestrates many specialized agents and skills—109 skills and 32 agents—so that complex coding tasks can be broken down, analyzed, and executed collaboratively by different components. It also includes a layered code analysis pipeline to reduce token usage and maintain relevant context efficiently. This continuous learning environment enables workflows such as bug fixing, refactoring, planning, and exploratory investigation while minimizing the need to re-explain context manually.
    Downloads: 2 This Week
  • 9
    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 3 This Week
  • 10
    Eliza

    Autonomous agents for everyone

    Build and deploy autonomous AI agents with consistent personalities across Discord, Twitter, and Telegram. Full support for voice, text, and media interactions. Built-in RAG memory system, document processing, media analysis, and autonomous trading capabilities. Supports multiple AI models including Llama, GPT-4, and Claude. Create custom actions, add new platform integrations, and extend functionality through a modular plugin system. Full TypeScript support.
    Downloads: 2 This Week
  • 11
    Auto-Commenter

    A Claude skill that automatically posts personalized comments

    Auto-Commenter is a Claude-oriented automation project built to help users write and post comments that sound natural and context-aware in targeted online communities. It centers on learning a user’s writing style from their real comment history, then applying that style to generate responses that feel consistent with the user rather than generic template text. The workflow emphasizes deeper post analysis so the system can respond to what is actually being discussed, instead of replying with...
    Downloads: 1 This Week
  • 12
    Open Interpreter

    A natural language interface for computers

    Open Interpreter is an open-source tool that provides a natural-language interface for interacting with your computer. It lets large language models (LLMs) run code locally (Python, JavaScript, shell, etc.), enabling you to ask your computer to do tasks like data analysis, file manipulation, browsing, etc. in human terms (“chat with your computer”), with safeguards. Runs locally or via configured remote LLM servers/inference backends, giving flexibility to use models you trust or have...
    Downloads: 11 This Week
  • 13
    VOIP-VOICE-TO-TEXT&ANALYS

    Convert VoIP calls to text and analyze them with AI

    The VoIP voice-to-text software for Issabel is an intelligent, AI-based solution that converts calls into accurate Persian text. After each call, the audio file is sent to the GPT-4o AI engine, which produces an editable transcript. The software also provides AI-powered call analysis, extracting key points, customer requests, satisfaction levels, and sensitive topics, all of which are stored in the database.
    Downloads: 13 This Week
  • 14
    Recorder

    HTML5 js recording mp3 wav ogg webm amr format

    Supports microphone recording and real-time processing in most mobile and PC browsers that implement getUserMedia, mainly including Chrome, Firefox, Safari, iOS 14.3+, Android WebView, Tencent Android X5 kernel (QQ, WeChat, Mini Program WebView), uni-app (App, H5), and the built-in browsers of most Android phones updated after 2021. Not supported: UC-based kernels (typically Alipay), the built-in browsers of most older domestic mobile phones that have not been updated, and any other...
    Downloads: 3 This Week
  • 15
    Amphion

    Toolkit for audio, music, and speech generation

    Amphion is a toolkit from OpenMMLab dedicated to audio, music, and speech generation, aimed at both reproducible research and helping newcomers get started in generative audio. It provides standardized implementations and recipes for classic and state-of-the-art generative models in audio, including TTS, music generation, and voice conversion. A distinctive feature of Amphion is its emphasis on visualization: it offers interactive visualizations of model architectures and generation...
    Downloads: 0 This Week
  • 16
    LINE Solver

    Queueing Theory Algorithms

    LINE is an open-source software package to analyze queueing models via analytical methods and simulation. The solver is available for Java/Kotlin, MATLAB, and Python. LINE features algorithms for the solution of open queueing systems (e.g., M/M/1, M/M/k, M/G/1, ...), open and closed queueing networks, and layered queueing networks. Additional details are available on the project website: http://line-solver.sf.net.
    Downloads: 9 This Week
  • 17
    Feishu ChatGPT

    Voice dialogue, role-playing, multi-topic discussion, picture creation

    Feishu × (GPT-3.5 + DALL·E + Whisper) = a work experience that feels like flying. Features: voice dialogue, role-playing, multi-topic discussion, picture creation, table analysis, and document export. Golang, it goes without saying! Master the gin framework, and developing the backend is as natural as breathing! Familiarity with the SDKs of DingTalk, Feishu, Qiwei, and other platforms lets you develop and integrate a series of amazing functions!
    Downloads: 10 This Week
  • 18
    DoSA-2D

    2D open source actuator simulation software

    DoSA-2D is two-dimensional open source software for magnetic force analysis of actuators and solenoids. Both individuals and companies can use the program for free and participate in its development. The program environment is designed to resemble a product development workflow, so even product developers who have not majored in analysis can easily analyze the magnetic force of actuators or solenoids. DoSA-2D is responsible for an easy working...
    Downloads: 0 This Week
  • 19
    DoSA-3D

    3D open source actuator simulation software

    DoSA-3D is 3D open source software for magnetic force analysis of actuators and solenoids. Both individuals and companies can use the program for free and participate in its development. The program environment is designed to resemble a product development workflow, so even product developers who have not majored in analysis can easily analyze the magnetic force of actuators. In DoSA-3D, three programs are connected and operated as follows. - DoSA-3D :...
    Downloads: 2 This Week
  • 20
    InstrumentalMusic

    Application which detects musical notes from the microphone.

    Application which detects musical notes from the microphone. It listens to the microphone and plays the detected notes to the output (as MIDI). Features: multilanguage support, zoom, dark mode option, and JDK 17 compatibility. As of v1.2 it includes a pitch shifter (making the voice lower or higher through a slider). A demo video showing how it works can be opened from the Help menu of the application. You can also see the pitch-shifter demo version...
    Downloads: 0 This Week
  • 21
    Layered-Wall Cylindrical Pressure Vessel

    An Excel workbook suitable for preliminary design and analysis

    LWCPV.3.zip includes: 1. A macro-enabled Excel workbook (LWCPV.xlsm) for preliminary design and analysis of long cylindrical, layered-wall pressure vessels (PVs) subjected to external hydrostatic pressure. The following four types of layered-wall construction are addressed: a. Two-layer isotropic. b. Three-layer isotropic sandwich-wall; and c. Three-layer sandwich-wall with orthotropic outer layers and isotropic core. ...
    Downloads: 1 This Week
  • 22
    ...Microservice code features include logging, service registration and discovery, registry, rate limiting, circuit breaker, tracing, metrics monitoring, pprof performance analysis, statistics, caching, and CI/CD. The code uses a decoupled layered structure, so it is easy to add or replace functional code. As an efficiency-enhancing tool, it generates commonly repeated code automatically, and only the business logic needs to be filled in based on the generated template code examples.
    Downloads: 0 This Week
  • 23
    Bareflank Hypervisor

    Lightweight hypervisor SDK written in C++

    The Bareflank Hypervisor is an open source hypervisor Software Development Toolkit (SDK) for Rust and C++, led by Assured Information Security, Inc. (AIS), that provides the tools needed to rapidly prototype and create your own hypervisor on 64-bit versions of Intel and AMD (ARMv8 CPUs, RISC-V and PowerPC also planned). The Bareflank SDK is intended for instructional/research purposes as it only provides enough virtualization support to start/stop a hypervisor. Bareflank can also be used as...
    Downloads: 4 This Week
  • 24
    VoiceFixer

    General Speech Restoration

    VoiceFixer is a machine-learning framework for “speech restoration”: given a degraded or distorted audio recording — with noise, clipping, low sampling rate, reverberation, or other artifacts — it attempts to recover high-fidelity, clean speech. The architecture works in two stages: first an analysis stage that tries to extract “clean” intermediate features from the noisy audio (e.g. removing noise, denoising, dereverberation, upsampling), and then a neural vocoder-based synthesis stage that reconstructs a high-quality waveform from those features. Unlike many single-purpose noise reduction tools, VoiceFixer targets a “general speech restoration” problem (GSR), capable of handling multiple types of distortions at once, which makes it suitable for old recordings, phone-call audio, amateur voice recordings, or archival media. ...
    Downloads: 11 This Week
  • 25
    vocoder_chung
    vocoder_chung is a small educational vocoder that uses the discrete Fourier transform (FFT) spectrum, written in easy, fast, compiled FreeBASIC. (24/12/2019) Uses the fast and accurate FFTdll.dll. (28/03/2020) Algorithmic voice cloning / change / morphing experiment added.
    Downloads: 0 This Week