Showing 48 open source projects for "recognition"

  • 1
    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    whisper.cpp is a lightweight C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model, designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp; the rest of the code is part of the ggml machine learning library. Its quick-start command downloads the base.en model converted to the custom ggml format and runs inference on all .wav samples in the samples folder. whisper.cpp also supports integer quantization of the Whisper ggml models. ...
    Downloads: 392 This Week
  • 2
    Kaldi

    kaldi-asr/kaldi is the official location of the Kaldi project

    Kaldi is an open source toolkit for speech recognition research. It provides a powerful framework for building state-of-the-art automatic speech recognition (ASR) systems, with support for deep neural networks, Gaussian mixture models, hidden Markov models, and other advanced techniques. The toolkit is widely used in both academia and industry due to its flexibility, extensibility, and strong community support.
    Downloads: 2 This Week
  • 3
    Seamless Communication

    Foundational Models for State-of-the-Art Speech and Text Translation

    ...The system architecture includes a real-time multimodal signal pipeline for audio, video, and sensor data, a dialog manager that can decide when to act (speak, gesture, point) or query, and a cross-modal reasoning layer that fuses perception with semantic context. The research prototype includes components for visual grounding (understanding when a user references something in view), gesture recognition and synthesis, and turn-taking mechanisms that mirror human conversational timing. Because latency and synchronization are critical, the codebase invests in asynchronous scheduling, overlap of perception and reasoning, and fast fallback responses.
    Downloads: 1 This Week
  • 4
    audioFlux

    A library for audio and music analysis, feature extraction

    audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It can be used in deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, and other fields. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. The extracted features can be fed to deep learning networks for training and used to study tasks in the audio field such as classification, separation, music information retrieval (MIR), and ASR.
    Downloads: 8 This Week
  • 5
    TEN Framework

    TEN, a voice agent framework to create conversational AI.

    TEN (Transformative Extensions Network) is a voice agent framework for creating conversational AI applications, focusing on high performance and modularity.
    Downloads: 1 This Week
  • 6
    ZBar

    ZBar is an open source software suite for reading bar codes

    ZBar is an open source software suite for reading bar codes from various sources, including webcams. As its development stopped in 2012, I took on the task of keeping it updated with the V4L2 API. This is the main repository for it. The library is useful for applications that need barcode scanning.
    Downloads: 31 This Week
  • 7
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze image files locally and propose human-readable rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze the text content of documents to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common text files....
    Downloads: 231 This Week
  • 8
    Mozilla JPEG Encoder Project

    Improved JPEG encoder

    MozJPEG improves JPEG compression efficiency achieving higher visual quality and smaller file sizes at the same time. It is compatible with the JPEG standard, and the vast majority of the world's deployed JPEG decoders. MozJPEG is compatible with the libjpeg API and ABI. It is intended to be a drop-in replacement for libjpeg. MozJPEG is a strict superset of libjpeg-turbo's functionality. All MozJPEG's improvements can be disabled at run time, and in that case it behaves exactly like...
    Downloads: 2 This Week
  • 9
    Darknet

    Convolutional Neural Networks

    ...The repository provides pre-trained models, configuration files, and tools for training custom object detection models. With GPU acceleration via CUDA and OpenCV integration, it achieves high performance in image recognition tasks. Its simplicity, combined with powerful capabilities, has made Darknet one of the most influential projects in the computer vision community.
    Downloads: 42 This Week
  • 10

    cuneiformplus

    Fork of OCR software cuneiform

    Fork of the OCR software cuneiform. For the original software by Cognitive Technologies and Jussi Pakkanen, see https://launchpad.net/cuneiform-linux. Other open source OCR software: Tesseract by Ray Smith (using the Leptonica image library), GOCR, and OCRAD.
    Downloads: 2 This Week
  • 11
    Speech Recognition in English & Polish

    Speech recognition software for English & Polish languages

    Software for speech recognition in English & Polish languages.
    Basic versions of SkryBot:
    1. SkryBot Home Speech (English Language) - https://sourceforge.net/projects/skrybotdomowy/files/ReleasesEnglish/InstalatorSkryBotHomeSpeechDemo-2.6.9.18117.exe/download
    2. SkryBot DoMowy (Polish Language) - https://sourceforge.net/projects/skrybotdomowy/files/ReleasesPolish/InstalatorSkryBotDoMowyDemo-2.4.9.18117.exe/download
    More help: https://sourceforge.net/p/skrybotdomowy/wiki/
    Domain advanced versions (Polish Language): 1. ...
    Downloads: 2 This Week
  • 12
    OpenPR
    OpenPR stands for Open Pattern Recognition project and is intended to be an open source library for algorithms of image processing, computer vision, natural language processing, pattern recognition, machine learning and the related fields.
    Downloads: 0 This Week
  • 13
    Animal is AN IMAging Library written in C. Its simple API supports over 80 image formats, and is intended to make massive use of other image processing libraries. Animal aims at image analysis and recognition. It is mainly the C basis of the SIP toolbox.
    Downloads: 2 This Week
  • 14
    LeapInto

    Simplified interface to Leap Motion designed for art and music apps

    LeapInto provides a simplified interface to the Leap Motion hand sensor input device. Multiple hand recognition is simplified to several stable categories, and coordinates are normalised. The interface currently comes in two flavours: an open broadcast system using the OSC protocol and a plugin for the Csound audio/music programming language.
    Downloads: 0 This Week
  • 15
    gWaei, Japanese Dictionary for GNOME
    ...gWaei is an easy-to-use yet powerful dictionary program for Japanese to English translation. It organizes results by relevance and supports regex searches, tabs, spell checking, kanji handwriting recognition, and a console interface.
    Downloads: 9 This Week
  • 16
    Handwritten Digits Recognition On Android
    Downloads: 0 This Week
  • 17
    Technical analysis library with indicators like ADX, MACD, RSI, Stochastic, TRIX... It also includes candlestick pattern recognition. Useful for trading application developers using Excel, .NET, Mono, Java, Perl, or C/C++.
    Downloads: 15,713 This Week
  • 18
    This software records and replays user interaction with the computer. It can be interfaced through voice commands.
    Downloads: 0 This Week
  • 19
    World Voice Recognition is an open source speech recognition program whose goal is to link together multiple modules created by any developer (a microphone module, a speech recognition module, a module that makes the computer speak, or plugins such as a weather plugin). The SDK is compatible with any programming language (ASM, C++, Ada, Java...) on all platforms (Windows, Mac, and Linux).
    Downloads: 0 This Week
  • 20
    Vedvarsha is an application with two purposes: 1. Handwriting script recognition that extracts recognized letters into documents. 2. OCR (Optical Character Recognition) that works only for non-cursive and isolated characters. It depends upon libsyntactic,
    Downloads: 0 This Week
  • 21
    Windows hunter is an open source screen scraping program. It is still under heavy development.
    Downloads: 0 This Week
  • 22
    Application for language and encoding recognition based on machine learning methods. Command-line interface.
    Downloads: 0 This Week
  • 23
    Arabisc is a speaker-independent, large-vocabulary continuous speech recognizer for the Arabic language, released under the GNU license. It is also a collection of open source tools that allows researchers and developers to build speech recognition systems for Arab
    Downloads: 0 This Week
  • 24
    1.) Investigation of the cosine transform and inverse transform algorithms, with some voice recognition code. 2.) Translator: Croatian, English. 3.) 2D-to-3D picture algorithm (principle) and new 2D-to-3D video conversion code with AviSynth video scripting
    Downloads: 0 This Week
  • 25
    UCR is the project name for the development of handwritten character recognition for the Korean language. The goal is to create a UCR library for handwriting as well as OCR from off-line and on-line data. We also plan to build a UCR library for mobile.
    Downloads: 0 This Week