WO2009145605A3 - A document categorization system - Google Patents

A document categorization system

Info

Publication number
WO2009145605A3
WO2009145605A3 PCT/MY2009/000065 MY2009000065W WO2009145605A3 WO 2009145605 A3 WO2009145605 A3 WO 2009145605A3 MY 2009000065 W MY2009000065 W MY 2009000065W WO 2009145605 A3 WO2009145605 A3 WO 2009145605A3
Authority
WO
WIPO (PCT)
Prior art keywords
document
ontology
objects
text
attributes
Prior art date
Legal status
Ceased
Application number
PCT/MY2009/000065
Other languages
French (fr)
Other versions
WO2009145605A2 (en)
Inventor
Yuan Kai Chow
Nagendran Perumal
Current Assignee
Mimos Bhd
Original Assignee
Mimos Bhd
Priority date
Filing date
Publication date
Application filed by Mimos Bhd
Publication of WO2009145605A2
Publication of WO2009145605A3
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The document categorization system is a tool for matching objects in an ontology (101) against text documents. A relationship-based object document matcher (100) parses the text document and returns the name of the object in the ontology (101) to which the document is related. The matcher (100) compares the attributes of every object in the ontology against the contents of the text document; these attributes comprise data objects and their relationships, which are either object relationships or data type relationships. Each time an object's attributes are matched in the text, the document matching probability index increases. For the document with the highest document matching probability index, a new relationship is created in the ontology (101) between the document and the matched object.
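The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the document matching probability index is computed, so the sketch assumes it is the fraction of an object's attributes whose words all occur in the document. The names `OntologyObject`, `tokenize`, `matching_index`, and `categorize` are hypothetical, and the ontology is reduced to a flat list of objects with string attributes.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OntologyObject:
    """Hypothetical stand-in for an ontology object and its attributes."""
    name: str
    attributes: List[str]
    related_documents: List[str] = field(default_factory=list)


def tokenize(text: str) -> set:
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matching_index(obj: OntologyObject, text: str) -> float:
    """Assumed index: share of the object's attributes found in the text.

    An attribute counts as matched when every one of its words appears
    somewhere in the document.
    """
    if not obj.attributes:
        return 0.0
    words = tokenize(text)
    hits = 0
    for attribute in obj.attributes:
        tokens = tokenize(attribute)
        if tokens and tokens <= words:
            hits += 1
    return hits / len(obj.attributes)


def categorize(doc_id: str, text: str,
               ontology: List[OntologyObject]) -> Optional[str]:
    """Link the document to the best-matching object and return its name.

    Returns None when no attribute of any object matches the text.
    """
    best = max(ontology, key=lambda o: matching_index(o, text), default=None)
    if best is None or matching_index(best, text) == 0.0:
        return None
    # Record the new document-object relationship on the matched object.
    best.related_documents.append(doc_id)
    return best.name
```

Under these assumptions, a document mentioning "laser", "toner", and "paper tray" would be linked to an object carrying those three attributes rather than to one whose attributes never appear in the text. A real ontology would also expose object and data type relationships as attributes, which this sketch flattens into plain strings.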
PCT/MY2009/000065 2008-05-30 2009-05-29 A document categorization system Ceased WO2009145605A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI20081851 2008-05-30
MYPI20081851A MY158574A (en) 2008-05-30 2008-05-30 A document categorization system

Publications (2)

Publication Number Publication Date
WO2009145605A2 WO2009145605A2 (en) 2009-12-03
WO2009145605A3 WO2009145605A3 (en) 2010-02-25

Family

ID=41377819

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2009/000065 Ceased WO2009145605A2 (en) 2008-05-30 2009-05-29 A document categorization system

Country Status (2)

Country Link
MY (1) MY158574A (en)
WO (1) WO2009145605A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040088157A1 (en) * 2002-10-30 2004-05-06 Motorola, Inc. Method for characterizing/classifying a document
US7213205B1 (en) * 1999-06-04 2007-05-01 Seiko Epson Corporation Document categorizing method, document categorizing apparatus, and storage medium on which a document categorization program is stored

Also Published As

Publication number Publication date
MY158574A (en) 2016-10-14
WO2009145605A2 (en) 2009-12-03

Similar Documents

Publication Publication Date Title
Read A pruned problem transformation method for multi-label classification
WO2011159516A3 (en) Semantic content searching
WO2009003072A3 (en) Integrated platform for user input of digital ink
WO2012068238A3 (en) Shipping system and method with taxonomic tariff harmonization
WO2005076161A3 (en) System and method for comparative analysis of textual documents
WO2007103352A3 (en) Systems and methods for document annotation
WO2006008733A3 (en) A method for determining near duplicate data objects
WO2010105216A3 (en) System and method for automatic semantic labeling of natural language texts
WO2008049023A9 (en) Method and system for offline indexing of content and classifying stored data
WO2009023344A3 (en) Managing status of search index generation in handheld book reader device
WO2008027503A3 (en) Semantic search engine
WO2008031062A3 (en) System and method for building and retriving a full text index
WO2007008492A3 (en) Processing collocation mistakes in documents
WO2005038668A3 (en) Computer implemented methods and systems for representing multiple schemas and transferring data between different data schemas within a contextual ontology
WO2009006030A3 (en) A compliance management system
GB0823706D0 (en) Fast data entry
WO2011119410A3 (en) A system and methods thereof for mining web based user generated content for creation of term taxonomies
WO2009145605A3 (en) A document categorization system
Chase Lipton et al. Thresholding classifiers to maximize F1 score
WO2008126262A1 (en) Content explanation apparatus and method
Casella et al. Declining near-infrared flux from the black-hole candidate MAXI J1820+ 070 (ASASSN-18ey) in transition
Elberrichi Text mining using n-grams
WO2008114316A1 (en) Electronic document management device and electronic document management program
Yanagimoto et al. Information filtering using Kullback-Leibler divergence
Mizuno et al. Informing a robot of object location with both hand-gesture and verbal cues

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 09755068

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 09755068

Country of ref document: EP

Kind code of ref document: A2