Textometry on audiovisual corpora
Pincemin et al., 2020
- Document ID
- 2960331535709790072
- Author
- Pincemin B
- Heiden S
- Decorde M
- Publication year
- 2020
- Publication venue
- 15th International Conference on Statistical Analysis of Textual Data JADT 2020
Snippet
Textometry is applied to audiovisual corpora, such as transcripts of semi-directed interviews or the "Actualités françaises" newsreel archive. A workflow using assisted or automatic transcription software is an efficient way to obtain a rich encoding. New features are added to …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/24—Editing, e.g. insert/delete
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G06F17/30023—Querying
- G06F17/30038—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/211—Formatting, i.e. changing of presentation of document
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30011—Document retrieval systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30817—Information retrieval; Database structures therefor; File system structures therefor of video data using information manually generated or using information not derived from the video content, e.g. time and location information, usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Aksan et al. | Construction of the Turkish National Corpus (TNC). | |
Austin | Data and language documentation | |
US20130305149A1 (en) | Document reader and system for extraction of structural and semantic information from documents | |
Alfuraih | The undergraduate learner translator corpus: a new resource for translation studies and computational linguistics | |
Ädel | Corpus compilation | |
Fromont et al. | ONZE Miner: the development of a browser-based research tool | |
Pfanzelter et al. | Digital interfaces of historical newspapers: opportunities, restrictions and recommendations | |
Nagy et al. | Extending ELAN into variationist sociolinguistics | |
Pancur et al. | The siParl corpus of Slovene parliamentary proceedings | |
Landert et al. | Corpus pragmatics | |
Lindström et al. | The online database of the University of Tartu Archives of Estonian Dialects and Kindred Languages and the Corpus of Estonian Dialects | |
Primorac et al. | Distant reading two decades on: Reflections on the digital turn in the study of literature | |
Handler et al. | ClioQuery: Interactive query-oriented text analytics for comprehensive investigation of historical news archives | |
Jong et al. | Access to recorded interviews: A research agenda | |
de Campos et al. | An integrated system for managing the andalusian parliament's digital library | |
Thieberger | Building a lexical database with multiple outputs: Examples from legacy data and from multimodal fieldwork | |
Midrigan-Ciochina et al. | Resources in underrepresented languages: Building a representative romanian corpus | |
Smyth et al. | Exploring the possibilities of Thomson’s fourth paradigm transformation—The case for a multimodal approach to digital oral history? | |
Arnob et al. | IndicDialogue: A dataset of subtitles in 10 Indic languages for Indic language modeling | |
Miyagawa | Digitization of Coptic manuscripts and digital humanities: Tools and methods for Coptic studies | |
Liu | Parallel Bilingual CI Corpus Construction | |
Resch et al. | Encoding Crime and Punishment in TEI: The Digital Processing of Early Modern Broadsheets from Vienna | |
Lahaussois | Introduction: Section on Grammar writing, documentation and data collection | |
Gillaspy | Innovation fuels sustainability: how innovative elements promoted longevity in three digital humanities projects in music |