Baró et al., 2019 - Google Patents

Towards a generic unsupervised method for transcription of encoded manuscripts

Document ID: 8313424091033996085
Author: Baró A; Chen J; Fornés A; Megyesi B
Publication year: 2019
Publication venue: Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage

Snippet

Historical ciphers, a special type of manuscript, contain encrypted information that is important for the interpretation of our history. The first step towards decipherment is to transcribe the images, either manually or by automatic image processing techniques. Despite the …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/62—Methods or arrangements for recognition using electronic means
          - G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
            - G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
              - G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
          - G06K9/6267—Classification techniques
          - G06K9/6201—Matching; Proximity measures
            - G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
        - G06K9/20—Image acquisition
          - G06K9/2054—Selective acquisition/locating/processing of specific regions, e.g. highlighted text, fiducial marks, predetermined fields, document type identification
          - G06K9/22—Image acquisition using hand-held instruments
        - G06K9/00852—Recognising whole cursive words
        - G06K9/03—Detection or correction of errors, e.g. by rescanning the pattern
        - G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
          - G06K9/38—Quantising the analogue image signal, e.g. histogram thresholding for discrimination between background and foreground patterns
          - G06K9/46—Extraction of features or characteristics of the image
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/20—Handling natural language data
          - G06F17/27—Automatic analysis, e.g. parsing
            - G06F17/2765—Recognition

Similar Documents

Publication	Publication Date	Title
Baviskar et al.	2021	Efficient automated processing of the unstructured documents using artificial intelligence: A systematic literature review and future directions
Baró et al.	2019	Towards a generic unsupervised method for transcription of encoded manuscripts
Kantharaj et al.	2022	OpenCQA: Open-ended question answering with charts
CN107004140B (en)	2021-01-12	Text recognition method and computer program product
Xue et al.	2018	A better way to attend: Attention with trees for video question answering
US11868313B1 (en)	2024-01-09	Apparatus and method for generating an article
Dölek et al.	2022	A deep learning model for Ottoman OCR
Lai et al.	2022	SemEval 2022 task 12: Symlink-linking mathematical symbols to their descriptions
Toselli et al.	2018	Transcribing a 17th-century botanical manuscript: Longitudinal evaluation of document layout detection and interactive transcription
Nguyen et al.	2019	OCR error correction for unconstrained Vietnamese handwritten text
Almanea	2021	Automatic methods and neural networks in Arabic texts diacritization: a comprehensive survey
Alrasheed et al.	2021	Evaluation of deep learning techniques for content extraction in Spanish colonial notary records
Kasem et al.	2023	Advancements and challenges in Arabic optical character recognition: A comprehensive survey
Yasin et al.	2023	Transformer-based neural machine translation for post-OCR error correction in cursive text
Xia et al.	2023	A sequence-to-sequence approach with mixed pointers to topic segmentation and segment labeling
Khan et al.	2024	OCR approaches for humanities: Applications of artificial intelligence/machine learning on transcription and transliteration of historical documents
Bender et al.	2019	Learning fine-grained image representations for mathematical expression recognition
Sundaram et al.	2015	Bigram language models and reevaluation strategy for improved recognition of online handwritten Tamil words
Tasdemir et al.	2024	Automatic transcription of Ottoman documents using deep learning
Sinclair et al.	2022	Handwriting recognition for Scottish Gaelic
CN116611428B (en)	2025-08-19	Non-autoregressive decoding Vietnam text regularization method based on editing alignment algorithm
Andrés et al.	2023	Search for hyphenated words in probabilistic indices: a machine learning approach
Almanea	2024	Deep learning in written Arabic linguistic studies: A comprehensive survey
Villegas et al.	2016	Exploiting existing modern transcripts for historical handwritten text recognition
Villanova-Aparisi et al.	2024	Reading Order Independent Metrics for Information Extraction in Handwritten Documents