Viswanathan et al., 1990 - Google Patents

A syntactic approach to document segmentation

Document ID
802677816540492164
Author
Viswanathan M
Krishnamoorthy M
Publication year
1990
Publication venue
Structural Pattern Analysis

Snippet

A document image is a bit-map produced by raster-digitizing (scanning) a printed page. It can contain fields of text, equations, tables and figures. This article describes a method to identify the spatial structure of a document image and label various components without …
Continue reading at www.worldscientific.com (other versions)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G06F17/2205 Storage facilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/211 Formatting, i.e. changing of presentation of document
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G06F17/2247 Tree structured documents; Markup, e.g. Standard Generalized Markup Language [SGML], Document Type Definition [DTD]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G06F17/277 Lexical analysis, e.g. tokenisation, collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00442 Document analysis and understanding; Document recognition
    • G06K9/00463 Document analysis by extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics, paragraphs, words or letters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20 Image acquisition
    • G06K9/34 Segmentation of touching or overlapping patterns in the image field
    • G06K9/342 Cutting or merging image elements, e.g. region growing, watershed, clustering-based techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/68 Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K2209/00 Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformations of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis

Similar Documents

Publication Publication Date Title
Krishnamoorthy et al. Syntactic segmentation and labeling of digitized pages from technical journals
EP1457917B1 (en) Apparatus and methods for converting network drawings from raster format to vector format
JP3640972B2 (en) A device that decodes or interprets documents
Wong et al. Document analysis system
Fateman et al. Optical character recognition and parsing of typeset mathematics
EP0702322B1 (en) Method and apparatus for identifying words described in a portable electronic document
EP0439743B1 (en) Constraint driven on-line recognition of handwritten characters and symbols
JP2618832B2 (en) Method and system for analyzing logical structure of document
JPH09198456A (en) Method for recognizing table based on graph
US20040194035A1 (en) Systems and methods for automatic form segmentation for raster-based passive electronic documents
Ha et al. Understanding mathematical expressions from document images
US5699453A (en) Method and apparatus for logically tagging of document elements in the column by major white region pattern matching
Conway Page grammars and page parsing. A syntactic approach to document layout recognition
US20020016796A1 (en) Document processing method, system and medium
Peels et al. Document architecture and text formatting
US7639881B2 (en) Application of grammatical parsing to visual recognition tasks
Viswanathan Analysis of scanned documents—A syntactic approach
Rekers On the use of Graph Grammars for defining the Syntax of Graphical Languages
Viswanathan et al. A syntactic approach to document segmentation
Hu et al. A mixed approach toward an efficient logical structure recognition from document images
US20020085755A1 (en) Method for region analysis of document image
Lin Header and footer extraction by page association
Viswanathan et al. A SYNTACTIC APPROACH TO DOCUMENT SEGMENTATION
Handley Document recognition
Viswanathan et al. Document recognition: an attribute grammar approach