Hoegen et al., 2019 - Google Patents
An end-to-end conversational style matching agent
- Document ID
- 18158496443418276179
- Author
- Hoegen R
- Aneja D
- McDuff D
- Czerwinski M
- Publication year
- 2019
- Publication venue
- Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents
Snippet
We present an end-to-end voice-based conversational agent that is able to engage in naturalistic multi-turn dialogue and align with the interlocutor's conversational style. The system uses a series of deep neural network components for speech recognition, dialogue …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Hoegen et al. | An end-to-end conversational style matching agent | |
Ram et al. | Conversational AI: The science behind the Alexa Prize | |
US20230118412A1 (en) | Stylizing Text-to-Speech (TTS) Voice Response for Assistant Systems | |
Thomas et al. | Style and alignment in information-seeking conversation | |
Nie et al. | Long dialogue emotion detection based on commonsense knowledge graph guidance | |
Murray et al. | Predicting group performance in task-based interaction | |
Aneja et al. | Understanding conversational and expressive style in a multimodal embodied conversational agent | |
Mazeland | Conversation analysis | |
Scissors et al. | In CMC we trust: The role of similarity | |
Gweon et al. | Measuring prevalence of other-oriented transactive contributions using an automated measure of speech style accommodation | |
Axelsson et al. | Modeling feedback in interaction with conversational agents—a review | |
Choi et al. | Offline and online satisfaction prediction in open-domain conversational systems | |
DeVault et al. | Can I finish? Learning when to respond to incremental interpretation results in interactive dialogue | |
Priva et al. | Distinct behaviors in convergence across measures | |
Tsai et al. | A study of multimodal addressee detection in human-human-computer interaction | |
Sapru et al. | Automatic recognition of emergent social roles in small group interactions | |
Ganesh et al. | A survey of challenges and methods in the computational modeling of multi-party dialog | |
Irfan et al. | Dynamic emotional language adaptation in multiparty interactions with agents | |
Bowden et al. | Entertaining and opinionated but too controlling: a large-scale user study of an open domain Alexa Prize system | |
Levitan | Developing an integrated model of speech entrainment | |
Shen et al. | EmpathicStories++: A multimodal dataset for empathy towards personal experiences | |
Fu et al. | Improving meeting inclusiveness using speech interruption analysis | |
Wei et al. | Multimodal user satisfaction recognition for non-task oriented dialogue systems | |
Mawalim et al. | Personality trait estimation in group discussions using multimodal analysis and speaker embedding | |
Woo et al. | IAVA: Interactive and Adaptive Virtual Agent |