[go: up one dir, main page]

Aldea et al., 2018 - Google Patents

Document layout analysis system

Aldea et al., 2018

View PDF
Document ID
4413293659802677758
Author
Aldea A
Coriu R
Prajică Å
Brînzea R
Boiangiu C
Publication year
Publication venue
Journal of Information Systems & Operations Management

External Links

Snippet

The need to process large amounts of printed physical data has led to the development of automated solutions for scanning and converting such documents into an editable text format. Following the layout analysis process, the different areas (blocks) of the document …
Continue reading at www.researchgate.net (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20Image acquisition
    • G06K9/34Segmentation of touching or overlapping patterns in the image field
    • G06K9/342Cutting or merging image elements, e.g. region growing, watershed, clustering-based techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20Image acquisition
    • G06K9/32Aligning or centering of the image pick-up or image-field
    • G06K9/3233Determination of region of interest
    • G06K9/325Detection of text region in scene imagery, real life image or Web pages, e.g. licenses plates, captions on TV images
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46Extraction of features or characteristics of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00442Document analysis and understanding; Document recognition
    • G06K9/00463Document analysis by extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics, paragraphs, words or letters
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/68Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20Image acquisition
    • G06K9/2054Selective acquisition/locating/processing of specific regions, e.g. highlighted text, fiducial marks, predetermined fields, document type identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6201Matching; Proximity measures
    • G06K9/6202Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00624Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30244Information retrieval; Database structures therefor; File system structures therefor in image databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K2209/00Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06QDATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR

Similar Documents

Publication Publication Date Title
US10943105B2 (en) Document field detection and parsing
US8290268B2 (en) Segmenting printed media pages into articles
Minetto et al. SnooperText: A text detection system for automatic indexing of urban scenes
Fabrizio et al. Text detection in street level images
Chazalon et al. ICDAR 2021 competition on historical map segmentation
US9798711B2 (en) Method and system for generating a graphical organization of a page
Narang et al. Line segmentation of Devanagari ancient manuscripts
Rahman et al. Text Information Extraction from Digital Image Documents Using Optical Character Recognition
Aldea et al. Document layout analysis system
WO2025003209A1 (en) Image analysis
Feild et al. Scene text recognition with bilateral regression
Andersen et al. Features for neural net based region identification of newspaper documents
Thiyam et al. Integrated document segmentation and region identification: textual, equation and graphical
Baskaran et al. Comic character recognition (CCR): extraction of speech balloon context and character of interest in comics
Humied Segmentation accuracy for offline Arabic handwritten recognition based on bounding box algorithm
US20240378913A1 (en) Information processing device, information processing method, and computer program product
Huang et al. Chinese historic image threshold using adaptive K-means cluster and Bradley’s
Jain et al. Digitization of Handwritten Documents Using Image Processing
Dalve et al. A survey on real time text detection and recognition from traffic panels
Yang et al. Local projection-based character segmentation method for historical Chinese documents
Baskaran et al. Comic Character Recognition (CCR): Extraction of Speech Balloon Context
Ezat et al. A Systematic Review of Vehicle License Plate Recognition Algorithms Based on Image Segmentation
Pun et al. A Survey on Change Detection Techniques in Document Images
Sarkar et al. A font invariant character segmentation technique for printed bangla word images
Zhang et al. Using Orientation Voting to Extract Text Lines with Various Mixed Directions from a Document Image