Showing 25 open source projects for "speech"

  • 1
    Kaldi

    kaldi-asr/kaldi is the official location of the Kaldi project

    Kaldi is an open source toolkit for speech recognition research. It provides a powerful framework for building state-of-the-art automatic speech recognition (ASR) systems, with support for deep neural networks, Gaussian mixture models, hidden Markov models, and other advanced techniques. The toolkit is widely used in both academia and industry due to its flexibility, extensibility, and strong community support.
    Downloads: 2 This Week
  • 2
    Olares

    Olares: An Open-Source Sovereign Cloud OS for Local AI

    Olares is an open-source sovereign cloud operating system designed for running AI locally on your own hardware.
    Downloads: 6 This Week
  • 3
    fairseq2

    FAIR Sequence Modeling Toolkit 2

    fairseq2 is a modern, modular sequence modeling framework developed by Meta AI Research as a complete redesign of the original fairseq library. Built from the ground up for scalability, composability, and research flexibility, fairseq2 supports a broad range of language, speech, and multimodal content generation tasks, including instruction fine-tuning, reinforcement learning from human feedback (RLHF), and large-scale multilingual modeling. Unlike the original fairseq—which evolved into a large, monolithic codebase—fairseq2 introduces a clean, plugin-oriented architecture designed for long-term maintainability and rapid experimentation. ...
    Downloads: 4 This Week
  • 4
    A series of open source files and programs available to use for developing programs to work with the WowWee Robotics RSMedia Robot. These include a USB serial console, a cross-compiler, a firmware dump program, text-to-speech and source code.
    Downloads: 24 This Week
  • 5
    SPTK is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.
    Downloads: 9 This Week
  • 6
    SVoice (Speech Voice Separation)

    We provide a PyTorch implementation of the paper Voice Separation

    SVoice is a PyTorch-based implementation of Facebook Research’s study on speaker voice separation as described in the paper “Voice Separation with an Unknown Number of Multiple Speakers.” This project presents a deep learning framework capable of separating mixed audio sequences where several people speak simultaneously, without prior knowledge of how many speakers are present. The model employs gated neural networks with recurrent processing blocks that disentangle voices over multiple...
    Downloads: 2 This Week
  • 7
    Rhasspy

    Offline private voice assistant for many human languages

    Rhasspy (ˈɹæspi) is an open-source, fully offline set of voice assistant services for many human languages that works well with Hermes protocol-compatible services (Snips.AI), Home Assistant and Hass.io, Node-RED, Jeedom, OpenHAB. Rhasspy will produce JSON events that can trigger action in home automation software, such as a Node-RED flow. Rhasspy comes with a snazzy web interface that lets you configure, program, and test your voice assistant remotely from your web browser. All of the web...
    Downloads: 0 This Week
  • 8
    vinuxproject

    Vinux is an Ubuntu-derived distribution for blind and visually impaired users.

    Vinux provides software text-to-speech and Braille support from boot-up to shutdown. Users can install it independently from the installation medium, with no sighted assistance required. Vinux supports speech in the command-line and desktop environments, as well as magnification features. It comes with an accessible suite of software and has an excellent mailing-list support group.
    Downloads: 8 This Week
  • 9
    MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
    Downloads: 0 This Week
  • 10
    MP3 library, advanced ID3v1 and ID3v2 tagger, and player. Organizes a large MP3 library of over 40,000 songs. Includes speech synthesis and tag backup utilities, plus scripts to maintain and organize song files.
    Downloads: 23 This Week
  • 11
    Hemera is a virtual intelligent system that aggregates several advanced artificial intelligence technologies (speech, speech recognition, form recognition, motion recognition, ...), with applications in daily tasks, domotics, and robotics.
    Downloads: 0 This Week
  • 12

    pdf2mp3

    Simply convert your PDF files into audio books

    Summary: Are your eyes tired of staring at a tablet or cell-phone screen while reading ebooks? Do you have difficulty reading from an LCD screen, especially in a moving vehicle? This software is for you: it converts your PDF files to MP3 audio books. Special features (compared to similar projects): each page is saved as a separate MP3 file; the MP3 files carry ID3v2 tags showing the book name and page number; conversion is multi-threaded, so all CPU cores are used and conversion is several times faster.
    Downloads: 2 This Week
  • 13

    symposia

    Bringing Plato's Symposium to VR

    A multimodal, multiliteracy-aware, programmed approach to creating virtual realities inspired by Plato's Symposium and the surrounding, mostly later, texts. Designed for the author's doctoral dissertation.
    Downloads: 0 This Week
  • 14
    DPRK pull is a script that pulls the English-language North Korean news articles from the KCNA website and puts them into one file for reading by a text-to-speech program.
    Downloads: 0 This Week
  • 15
    Intended eventually to be a live CD, LinVision allows the blind to use a computer to: 1) Organise books and read them aloud. 2) Organise and play music. 3) Teach and test keyboard skills. 4) Write and save or email work. 5) Browse the Internet.
    Downloads: 0 This Week
  • 16
    Tested on Ubuntu Maverick. Creates audiobooks from eBooks, text, or pictures. Reads eBooks or text aloud while scrolling through pages.
    Downloads: 2 This Week
  • 17
    A script for producing a collection of audio files containing your emails.
    Downloads: 0 This Week
  • 18
    Tools for DSS (Digital Speech Standard) files used in Olympus Digital Voice Recorders.
    Downloads: 0 This Week
  • 19
    VoxForge collects user-submitted speech audio files for the creation of Acoustic Models for Free and Open Source Speech Recognition Engines such as HTK, Julius, ISIP and Sphinx.
    Downloads: 0 This Week
  • 20
    DTW is intended to be a "voice in -> pictures + text out" program written in Java using Sphinx from CMU. It is intended to be useful to people who have good oral/visual literacy skills but poor written literacy skills.
    Downloads: 0 This Week
  • 21
    Kathak is a Bangla text-to-speech synthesizer that produces speech from Unicode Bangla text input. We are developing the system based on the Festvox framework. The Festival Speech Synthesis System was used as a base for developing Kathak.
    Downloads: 0 This Week
  • 22
    Reads the full HTML page behind each RSS feed item, then scrapes and summarizes just the article content, stripped of ads, etc. Converts the summaries to speech (ogg/mp3) and creates a podcast of all of them. Works with Slashdot, weather, CNN, NewsForge, Groklaw, Pirillo and more!
    Downloads: 0 This Week
  • 23
    The Pawn will make it possible for you to tell the computer exactly what you would like it to do. Fiction? No, it's reality now. The highly customizable Slackware will be the base for Pawn.
    Downloads: 0 This Week
  • 24
    The software will be a collection of voice-related applications and solutions. Final releases will be in a cram format. The Yopy is a Linux-based PDA device. Yoobin uses the Flite text-to-speech engine to create audio output from keyboard and menu b
    Downloads: 0 This Week
  • 25
    VoiFax is a program that manages voice/data/fax modems in the same manner as vgetty and mgetty. VoiFax is intended for use in small businesses as well as in enterprises. It is fully compatible with vgetty scripts and manages (via efax) fax modems 1.0/1.1/2.0
    Downloads: 0 This Week