A syntactic approach to document segmentation
Viswanathan et al., 1990
- Document ID: 802677816540492164
- Authors: Viswanathan M, Krishnamoorthy M
- Publication year: 1990
- Publication venue: Structural Pattern Analysis
Snippet
A document image is a bit-map produced by raster-digitizing (scanning) a printed page. It can contain fields of text, equations, tables and figures. This article describes a method to identify the spatial structure of a document image and label various components without …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2205—Storage facilities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/211—Formatting, i.e. changing of presentation of document
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2247—Tree structured documents; Markup, e.g. Standard Generalized Markup Language [SGML], Document Type Definition [DTD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00442—Document analysis and understanding; Document recognition
- G06K9/00463—Document analysis by extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics, paragraphs, words or letters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
- G06K9/34—Segmentation of touching or overlapping patterns in the image field
- G06K9/342—Cutting or merging image elements, e.g. region growing, watershed, clustering-based techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K2209/00—Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/42—Syntactic analysis
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Krishnamoorthy et al. | | Syntactic segmentation and labeling of digitized pages from technical journals |
EP1457917B1 (en) | | Apparatus and methods for converting network drawings from raster format to vector format |
JP3640972B2 (en) | | A device that decodes or interprets documents |
Wong et al. | | Document analysis system |
Fateman et al. | | Optical character recognition and parsing of typeset mathematics |
EP0702322B1 (en) | | Method and apparatus for identifying words described in a portable electronic document |
EP0439743B1 (en) | | Constraint driven on-line recognition of handwritten characters and symbols |
JP2618832B2 (en) | | Method and system for analyzing logical structure of document |
JPH09198456A (en) | | Method for recognizing table based on graph |
US20040194035A1 (en) | | Systems and methods for automatic form segmentation for raster-based passive electronic documents |
Ha et al. | | Understanding mathematical expressions from document images |
US5699453A (en) | | Method and apparatus for logically tagging of document elements in the column by major white region pattern matching |
Conway | | Page grammars and page parsing. A syntactic approach to document layout recognition |
US20020016796A1 (en) | | Document processing method, system and medium |
Peels et al. | | Document architecture and text formatting |
US7639881B2 (en) | | Application of grammatical parsing to visual recognition tasks |
Viswanathan | | Analysis of scanned documents: A syntactic approach |
Rekers | | On the use of Graph Grammars for defining the Syntax of Graphical Languages |
Viswanathan et al. | | A syntactic approach to document segmentation |
Hu et al. | | A mixed approach toward an efficient logical structure recognition from document images |
US20020085755A1 (en) | | Method for region analysis of document image |
Lin | | Header and footer extraction by page association |
Viswanathan et al. | | A SYNTACTIC APPROACH TO DOCUMENT SEGMENTATION |
Handley | | Document recognition |
Viswanathan et al. | | Document recognition: an attribute grammar approach |