Browse free open source Java linguistics software and projects below.

  • 1

    Wordcorr

    Data management for comparative linguistics

    Wordcorr automates the tedious and risky process of tabulating and managing the sound correspondences used in working out the historical development of natural languages. Initial support was from NSF.
    Downloads: 10 This Week
  • 2
    oopinyinguide
    OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
    Downloads: 10 This Week
  • 3

    HermeneutiX

    Your graphical tool for Syntactic/Semantic Structure Analysis of texts

    HermeneutiX is a tool for diagramming the syntactic and semantic structures of complex texts, which need not be in a foreign language (e.g. Bible passages or other historical excerpts). HermeneutiX is now part of SciToS (the scientific tool set). Starting with version 2.0.0, HermeneutiX can be found on GitHub.
    Release summary: https://github.com/scientific-tool-set/scitos/releases
    Introductory video: https://youtu.be/uQjewyG0Ad8
    Running a Java application such as HermeneutiX (i.e. SciToS) requires a Java Runtime Environment (JRE). HermeneutiX is currently built to be compatible down to JRE version 6. The current JRE can be downloaded from http://www.java.com/en/download
    Downloads: 3 This Week
  • 4

    TextComparer

    Small Java program to compare two texts

    Small Java program to compare two texts, originally designed to find quotations in a Byzantine anthology. It can quite likely also be used to detect plagiarism between two texts. A graphical interface allows easy navigation between corresponding parts of the two texts; it uses the Java TextArea from http://software.jessies.org/salma-hayek/ for this. There is probably much room to improve both the program and the algorithm it uses.
    Downloads: 1 This Week
  • 5
    Board Game Language
    Board Game Language (BGL, pronounced "bagel") is a natural language syntax programming language for first-time programmers. It uses board games as a metaphor for programming concepts, with the goal of teaching users the foundations of programming.
    Downloads: 0 This Week
  • 6
    CHALICE
    Connecting Historical Authorities with Links, Contexts and Entities. CHALICE is a historic placename gazetteer for the UK, published as Linked Data and linked to other widely-used sources of placename reference information on the semantic web.
    Downloads: 0 This Week
  • 7

    CLEiM

    Cross Lingual Education in Medicine

    CLEiM (Cross Lingual Education in Medicine) is an open source version of an intelligent system that extracts concepts from medical texts and provides qualified information, integrating information from various sources. The system was developed by the Intelligent System Group GSI (http://www.esi.uem.es/gsi/) at UEM University. Named entity recognition (NER) is based on the GATE platform. Installation is simple: the system can be used as a web application and has been tested under Apache Tomcat. The original system has been used successfully to carry out active learning activities with medical students, and it could also be of interest in many other fields of knowledge.
    Downloads: 0 This Week
  • 8

    FALCON - Text Search Java Project

    JSON based text search Java Project

    What is it? "Falcon Search" is a Java API and tool for searching inside documents. It was originally started under the project "HAWK Search" to search the content of PDF files. Searching with this tool is query-based, not word-based as in most document search tools or document readers. It also tolerates jumbled word order within a query and spelling mistakes. Techniques commonly used in this project are Natural Language Processing, Information Extraction and Question-Answering Architecture.
    Latest version: details of the latest version can be found on the project website, http://geekdadaji.com
    Contact details:
    Creator: Swapnil A Jadhav (saj1919)
    Email: dadajibudhau@gmail.com
    Website: http://geekdadaji.com
    License: CC BY-NC 4.0
    Downloads: 0 This Week
  • 9

    HAWK - PDF Text Search Java Project

    No longer supported; see Falcon Search instead

    This project is no longer supported. See Falcon Search instead: https://sourceforge.net/projects/falcontextsearch/
    Downloads: 0 This Week
  • 10
    A system to perform analysis of large documents for the purpose of cataloging similar documents. Similarity is based upon contextual analysis of these documents done by identifying common words and proper nouns.
    Downloads: 0 This Week
  • 11
    Java program to create a (potentially multilingual) glossary of the unique words in any given Lojban text. Note that the SourceForge page for this project has been superseded by the Bitbucket repository: https://bitbucket.org/pretoriusjf/vlastezba/overview Any further updates will be made there.
    Downloads: 0 This Week
  • 12
    Maui is a multi-purpose automatic topic indexing algorithm. Given a document, Maui automatically identifies its topics. Depending on the task, topics are tags, keywords, keyphrases, vocabulary terms, descriptors or Wikipedia titles.
    Downloads: 0 This Week
  • 13

    Musaheb

    An Arabic collocation extraction tool

    “Musaheb” is an Arabic collocation extraction tool designed and implemented to overcome the limitations of existing collocation extraction tools. It can extract n-gram collocations up to 5-grams, and it can also extract the collocates of nodes (the word types whose collocates are sought) within a window size of zero to 15 words. Moreover, it provides eight collocation statistics for calculating the strength of a collocation, and permits the input of various constraints during node selection and collocate extraction. Based on the user's preferences for node, concordance and collocate selection, the tool saves all nodes and their associated collocates in an XML file, allowing easy conversion to different formats.
    Downloads: 0 This Week
  • 14
    A tool to aid the study of linguistics/phonology, specifically the distinctive features of possible language sounds. It comprises both a Visual C++ .NET version and a Java-based web applet version. The C++ version has all but been ab
    Downloads: 0 This Week
  • 15

    Porter Stemmer

    Java version of Porter's Stemming algorithm

    The Stemmer class transforms a word into its root form. The input word is supplied through the add() methods. The stem() method returns the stem, as does toString() once stem() has been called. The clear() method wipes the Stemmer buffer so that a new word can be input. This version extends Martin Porter's original stemming algorithm by allowing capital letters in words. It should also be usable wherever the old algorithm is used, with few accommodations necessary. The code in this version is more readable (in my opinion) than the old version. There is a main method at the bottom that shows how to use the Stemmer.
    Downloads: 0 This Week
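
The add() / stem() / toString() / clear() cycle described for the Porter Stemmer entry can be sketched roughly as follows. This is not the project's code: `ToyStemmer` is a hypothetical class written for illustration, it implements only step 1a of Porter's algorithm (plural endings), and lowercasing the input is an assumption about how capital letters might be handled.

```java
import java.util.Locale;

// Hypothetical illustration class; NOT the Stemmer class from the project.
// It covers only Porter step 1a: SSES -> SS, IES -> I, SS -> SS, S -> "".
final class ToyStemmer {
    private final StringBuilder buffer = new StringBuilder();
    private String result = "";

    /** Appends one character of the word to be stemmed. */
    void add(char ch) {
        buffer.append(ch);
    }

    /** Appends a whole word (or a fragment of one). */
    void add(String word) {
        buffer.append(word);
    }

    /** Applies step 1a to the buffered word and returns the result. */
    String stem() {
        // Assumption: capital letters are handled by lowercasing first.
        String w = buffer.toString().toLowerCase(Locale.ROOT);
        if (w.length() > 2) { // very short words are left unchanged
            if (w.endsWith("sses")) {
                w = w.substring(0, w.length() - 2); // caresses -> caress
            } else if (w.endsWith("ies")) {
                w = w.substring(0, w.length() - 2); // ponies -> poni
            } else if (w.endsWith("ss")) {
                // caress -> caress (no change)
            } else if (w.endsWith("s")) {
                w = w.substring(0, w.length() - 1); // cats -> cat
            }
        }
        result = w;
        return result;
    }

    /** Returns the last stem computed by stem(), or "" if none. */
    @Override
    public String toString() {
        return result;
    }

    /** Wipes the buffer so that a new word can be added. */
    void clear() {
        buffer.setLength(0);
        result = "";
    }

    public static void main(String[] args) {
        ToyStemmer stemmer = new ToyStemmer();
        for (String word : new String[] {"Caresses", "ponies", "cats"}) {
            stemmer.add(word);
            stemmer.stem();
            System.out.println(word + " -> " + stemmer); // uses toString()
            stemmer.clear();
        }
    }
}
```

Calling clear() between words matters in this sketch because add() appends to the buffer; without it, the next word would be concatenated onto the previous one.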
  • 16

    TML - Text Mining Library for LSA & CMM

    TML is a Java Library for LSA and extracting Concept Maps from text

    TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
    Downloads: 0 This Week
  • 17
    jWords is a Java port of WORDS, a free Latin-to-English dictionary program written in Ada by William Whitaker. In addition, the dictionary will be translated into German.
    Downloads: 0 This Week
  • 18
    semantic term annotation and description
    This project extends the ASV Toolbox from the Wortschatz project at the University of Leipzig. It annotates terms extracted by the "TE" (Terminology Extraction) and "Namerec" modules with semantic resources.
    Downloads: 0 This Week
  • 19
    stocleka is a project divided into a UI and a library for cleaning user stories and converting them to ARFF files (used by Weka). It is mainly intended for research and scientific purposes.
    Downloads: 0 This Week