Chan et al., 2024 - Google Patents
Exploring Part-of-Speech Tagging Models in Malaysia's MultilingualChan et al., 2024
View PDF- Document ID
- 15781958182188107037
- Author
- Chan Y
- Leong F
- Lim T
- Tan C
- Publication year
- Publication venue
- 2024 3rd International Conference on Digital Transformation and Applications (ICDXA)
External Links
Snippet
This study delves into the realm of part-of-speech (POS) tagging within Malaysia's multilingual context. We investigate the efficacy of CRF-QTAG, semi-supervised CRF models and rule-based systems within Bahasa Rojak Analytics—a Malay-based language …
- 238000005516 engineering process 0 abstract description 14
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
- G06F17/278—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
- G06F17/2223—Handling non-latin characters, e.g. kana-to-kanji conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2755—Morphological analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US10409911B2 (en) | Systems and methods for text analytics processor | |
| Shahi et al. | Natural language processing for Nepali text: A review | |
| Priyadarshi et al. | Towards the first Maithili part of speech tagger: Resource creation and system development | |
| Jabbar et al. | An improved Urdu stemming algorithm for text mining based on multi-step hybrid approach | |
| Soe | Improving accuracy of part-of-speech (POS) tagging using hidden markov model and morphological analysis for Myanmar Language | |
| Ali et al. | SiNER: A large dataset for Sindhi named entity recognition | |
| Belay et al. | Impacts of homophone normalization on semantic models for amharic | |
| Chen et al. | Korean named entity recognition based on language-specific features | |
| Ullah et al. | A deep learning-based approach for part of speech (PoS) tagging in the pashto language | |
| Mekki et al. | Named entity recognition of tunisian arabic using the Bi-LSTM-CRF model | |
| Inuwa-Dutse | Naijanlp: A survey of nigerian low-resource languages | |
| Maddu et al. | A survey on NLP tasks, resources and techniques for low-resource Telugu-English code-mixed text | |
| Nobel et al. | Next word prediction in Bangla using hybrid approach | |
| Habib | The challenges of Persian user-generated textual content: A machine learning-based approach | |
| Marovac et al. | A survey of resources and methods for natural language processing of serbian language | |
| Khorjuvenkar et al. | Parts of speech tagging for Konkani language | |
| Hellwig | Coarse semantic classification of rare nouns using cross-lingual data and recurrent neural networks | |
| Feng et al. | Multi-level cross-lingual attentive neural architecture for low resource name tagging | |
| Abdelghany et al. | Doc2vec: An approach to identify hadith similarities | |
| Room | Named entity recognition | |
| Chan et al. | Exploring Part-of-Speech Tagging Models in Malaysia's Multilingual | |
| Dehkharghani et al. | A novel approach to sentiment analysis in Persian using discourse and external semantic information | |
| Mihi et al. | Automatic sarcasm detection in dialectal arabic using bert and tf-idf | |
| Almanea | Deep learning in written arabic linguistic studies: A comprehensive survey | |
| Turki Khemakhem et al. | POS tagging without a tagger: Using aligned corpora for transferring knowledge to under-resourced languages |