An analytical analysis of text stemming methodologies in information retrieval and natural language processing systems
Jabbar et al., 2023
- Document ID
- 9830345911035463322
- Author
- Jabbar A
- Iqbal S
- Tamimy M
- Rehman A
- Bahaj S
- Saba T
- Publication year
- 2023
- Publication venue
- IEEE Access
Snippet
The exponential increase in textual unstructured digital data creates significant demand for advanced and smart stemming systems. As a preprocessing stage, stemming is applied in various research fields such as information retrieval (IR), domain vocabulary analysis, and …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G06F17/2881—Natural language generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2755—Morphological analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jabbar et al. | | An analytical analysis of text stemming methodologies in information retrieval and natural language processing systems |
Alwaneen et al. | | Arabic question answering system: a survey |
Singh et al. | | Text stemming: Approaches, applications, and challenges |
IJntema et al. | | A lexico-semantic pattern language for learning ontology instances from text |
US11068653B2 (en) | | System and method for context-based abbreviation disambiguation using machine learning on synonyms of abbreviation expansions |
US20140082003A1 (en) | | Document mining with relation extraction |
Sarwadnya et al. | | Marathi extractive text summarizer using graph based model |
Sarveswaran et al. | | Thamizhi Morph: A morphological parser for the Tamil language |
Elayeb | | Arabic word sense disambiguation: a review |
Jabbar et al. | | An improved Urdu stemming algorithm for text mining based on multi-step hybrid approach |
Jabbar et al. | | A survey on Urdu and Urdu like language stemmers and stemming techniques |
JP2011118689A (en) | | Retrieval method and system |
Prokopidis et al. | | A neural NLP toolkit for Greek |
Farhan et al. | | Survey of automatic query expansion for Arabic text retrieval |
Wu et al. | | Semantic segment extraction and matching for internet FAQ retrieval |
Sindhu et al. | | Text summarization: a technical overview and research perspectives |
Dave et al. | | A Systematic Review of Stemmers of Indian and Non-Indian Vernacular Languages |
Demilie et al. | | Automated all in one misspelling detection and correction system for Ethiopian languages |
Rahat et al. | | A recursive algorithm for open information extraction from Persian texts |
Lyashevskaya et al. | | An HMM-based PoS Tagger for Old Church Slavonic |
Lezama Sanchez et al. | | A Behavior Analysis of the Impact of Semantic Relationships on Topic Discovery |
Varga et al. | | Natural language processing of large parallel corpora |
Molendini et al. | | LIT: Rule based Italian lemmatizer |
Kahlawi | | An ontology-driven DBpedia quality enhancement to support Entity Annotation for Arabic Text |
Dione | | Finite-state tokenization for a deep Wolof LFG grammar |