cocoNLP

cocoNLP is a lightweight natural-language processing toolkit for practical information extraction from raw text, especially Chinese and mixed Chinese–English content. Rather than requiring a heavy pipeline, it extracts names, places, organizations, emails, phone numbers, and dates directly from unstructured sentences. The project combines pattern-based methods with NLP heuristics, aiming for predictable results on real-world text such as chats, comments, and other user-generated content. Its API is intentionally simple, so you can drop it into scripts, ETL jobs, or dashboards without deep ML expertise. Because it favors utility over complexity, it suits prototyping data products and building lightweight text analytics where large models would be overkill. The repository also includes examples and test snippets that show expected inputs and typical outputs, which shortens the learning curve for newcomers.

Features

Ready-made extractors for names, locations, organizations, emails, phones, and dates
Chinese and mixed-language text handling for common real-world corpora
Lightweight API surface that integrates into scripts and services quickly
Pattern-driven approach for predictable behavior and easy customization
Works well in ETL and data-cleaning pipelines without GPU dependencies
Examples and test snippets to validate usage and outputs
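
To make the pattern-driven approach concrete, here is a minimal sketch of regex-based email and phone extraction from mixed Chinese–English text. It uses only Python's standard `re` module. The patterns and the function names `extract_emails` and `extract_phones` are assumptions made for illustration; they are not cocoNLP's own API or the patterns it ships.

```python
import re

# Illustrative patterns only: assumptions for this sketch,
# not the patterns cocoNLP itself uses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
# Mainland-China mobile numbers: 11 digits, starting with 1,
# second digit 3-9, not embedded in a longer digit run.
PHONE_RE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")


def extract_emails(text):
    """Return every email-like substring found in text (hypothetical helper)."""
    return EMAIL_RE.findall(text)


def extract_phones(text):
    """Return every 11-digit mobile-like number found in text (hypothetical helper)."""
    return PHONE_RE.findall(text)


sample = "请联系张三,邮箱 zhangsan@example.com,手机 13812345678。"
emails = extract_emails(sample)
phones = extract_phones(sample)
print(emails)
print(phones)
```

Because the character classes are ASCII-only, surrounding Chinese characters and full-width punctuation act as natural boundaries. The digit lookarounds keep the phone pattern from matching inside longer numeric IDs. Patterns like these are easy to adjust for other formats, which is the kind of customization a pattern-driven design allows.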

Additional Project Details

Programming Language

Python

Related Categories

Python Natural Language Processing (NLP) Tool

Registered

4 days ago

A Chinese information extraction tool
