JP2012093972A

JP2012093972A - Conversation processing device

Info

Publication number: JP2012093972A
Application number: JP2010240953A
Authority: JP
Inventors: Takashi Tanaka; 高士田中; Yasuhiko Oshima; 靖彦大島; Yoshihide Asano; 佳秀浅野
Original assignee: MTI JAPAN CO Ltd
Current assignee: MTI JAPAN CO Ltd
Priority date: 2010-10-27
Filing date: 2010-10-27
Publication date: 2012-05-17
Anticipated expiration: 2030-10-27
Also published as: JP5539842B2

Abstract

PROBLEM TO BE SOLVED: To provide a conversation processing device capable of creating a response sentence that encourages a change in the user's emotional state. SOLUTION: A conversation processing device, which recognizes the emotion of a user and expresses emotion to the user using a knowledge database, performs processing for obtaining, from the result of recognizing the user's emotion, a current point in a coordinate system whose components are the type of emotion and the strength of emotion, and for determining a passing point that is nearer to a targeted convergence state of the emotion than the current point in that coordinate system. The knowledge database holds a plurality of words and typical sentences, each associated with parameters corresponding to the type and strength of emotion. Emotion is expressed to the user by searching the knowledge database with the parameters corresponding to the type and strength of the emotion at the passing point, obtaining at least one of the words and typical sentences to be used for conversation, and creating a response sentence from the obtained result.

Description

The present invention relates to a dialogue processing device that creates a response sentence in reply to a user's words.

In conventional dialogue processing, the device typically either speaks one-sidedly, as an ATM does, or answers the user's questions by inserting stock fixed phrases, as an automated telephone service does. Such responses give the user a mechanical, businesslike impression and leave no room for enjoying a dialogue with the device.
In recent years, however, dialogue processing devices have been expected to find use in games, pet-like toys, and communication robots intended for nursing care and comfort. Such applications call for dialogue processing that gives the user the impression of talking face to face with a human being.

One example of a communication robot is disclosed in Patent Document 1. Patent Document 2 discloses a communication robot that performs communication behavior suited to the user's personality.

Patent Document 1: International Publication No. 2005/014242
Patent Document 2: JP 2008-278981 A

However, the robot described in Patent Document 2 does not take the user's emotion into account and return a response that encourages that emotional state to change.
An object of the present invention is to provide a dialogue processing device that can create a response sentence that encourages a change in the user's emotional state.

To achieve the above object, a dialogue processing device according to the present invention is a dialogue processing device that recognizes a user's emotion and expresses emotion to the user using a knowledge database. The device includes processing that obtains, from the result of recognizing the user's emotion, a current point in a coordinate system whose components are the type of emotion and the strength of emotion, and that determines a waypoint in this coordinate system that is closer to the target convergence state of the emotion than the current point. The knowledge database holds a plurality of vocabulary items and fixed phrases, each associated with a parameter corresponding to the type and strength of emotion. Emotion is expressed to the user by searching the knowledge database with the parameter corresponding to the type and strength of the emotion at the waypoint, thereby obtaining at least one of a vocabulary item and a fixed phrase to be used in the dialogue, and by creating a response sentence from the result.

The dialogue processing device according to the present invention creates a response sentence using vocabulary that corresponds to the emotion type and emotion strength at a waypoint that is closer to the target convergence state than the user's current emotion. This encourages the user's emotion to change so that it gradually approaches the target convergence state. A response that encourages such a gradual emotional change can give the user the impression of talking face to face with a human being.

FIG. 1 is a diagram showing how the dialogue processing device according to the present embodiment is used.
FIG. 2 is a block diagram showing the internal configuration of the dialogue processing device 1.
FIG. 3 is a diagram showing an example of an emotion vocabulary cluster file.
FIG. 4 is a diagram showing a Mahalanobis table whose population is the vocabulary and fixed phrases recorded in the emotion vocabulary cluster.
FIG. 5 is a block diagram showing the detailed configuration of the emotion expression control unit 13.
FIG. 6 is a diagram showing the data structure of temporary learning data.
FIG. 7(a) is a diagram showing the relationship between a polar coordinate system, in which the θ component is divided into 10 equal parts and 10 types of emotion are assigned, and the unit vectors representing the emotion elements (joy, normal, sadness, anger) of the emotion information output by ST Emotion; FIG. 7(b) is a diagram showing the procedure for determining the current point of the emotion in the polar coordinate system from a composite vector V_3 obtained by combining the emotion information output by ST Emotion; and FIG. 7(c) is a diagram showing the relationship between the current point and the waypoint.
FIG. 8 is a diagram showing determination of the θ component of the waypoint using learning data fθ.
FIG. 9 is a diagram showing determination of the θ component of the waypoint using learning data fR.
FIG. 10(a) is a diagram showing the condition under which temporary learning continues even when a difference arises between the waypoint and the current point after the response; FIG. 10(b) is a diagram showing the condition under which temporary learning ends because a difference arises between the waypoint and the current point after the response.
FIG. 11 is a flowchart showing the operation procedure of the dialogue processing device 1 according to the present embodiment.
FIG. 12 is a flowchart showing details of the Mahalanobis acquisition processing.
FIG. 13 is a flowchart showing details of the waypoint calculation and temporary learning processing.
FIG. 14 is a flowchart showing details of the emotion vocabulary narrowing processing.
FIG. 15 is a flowchart showing details of the emotion sequence end processing.
FIG. 16 is a diagram showing the internal configuration of a dialogue processing device according to a modification.
FIG. 17 is a diagram showing a method of determining the current point in the dialogue processing device according to the modification.

An embodiment of the dialogue processing device according to the present invention is described below with reference to the drawings.
FIG. 1 is a diagram showing how the dialogue processing device according to the present embodiment is used. The dialogue processing device 1 according to the present invention is a stuffed-animal-like toy with a built-in microphone, speaker, and microcomputer system. It carries on a dialogue with the user by repeatedly outputting a spoken response sentence to each user utterance. When an undesirable emotion such as strong anger or strong sadness is recognized in the user's words during this dialogue, the dialogue processing device 1 uses, in its response sentence, vocabulary or fixed phrases that soothe the user's emotion or encourage it to change to a more desirable one. Using vocabulary or fixed phrases so that the user's emotion converges to a target emotional state during the dialogue is referred to as the emotion expression processing of the dialogue processing device 1. In the emotion expression processing, one "trial" consists of recognizing the user's emotion from the user's utterance and outputting a response sentence to it. The dialogue processing device 1 repeats trials and ends the emotion expression processing when the user's emotion has converged to the target emotional state.
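The repetition of trials described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `recognize_emotion`, `respond`, and `in_target_range` are hypothetical callables standing in for the units described later, the `(angle, strength)` tuple is a hypothetical representation of an emotion, and the `max_trials` cap is an assumption that does not appear in the source.

```python
from typing import Callable, Tuple

# Hypothetical representation: (emotion type as an angle, emotion strength).
Point = Tuple[float, float]


def run_emotion_expression(
    recognize_emotion: Callable[[], Point],
    respond: Callable[[Point], None],
    in_target_range: Callable[[Point], bool],
    max_trials: int = 20,  # assumption: safety cap, not part of the source
) -> int:
    """Repeat trials until the recognized emotion reaches the target state.

    Returns the number of response sentences that were output.
    """
    responses = 0
    while responses < max_trials:
        point = recognize_emotion()   # recognize the emotion in the latest utterance
        if in_target_range(point):    # converged: end the emotion expression processing
            break
        respond(point)                # output a response sentence for this trial
        responses += 1
    return responses
```

In this sketch the loop stops as soon as a recognized emotion falls inside the target state, so the final recognition is not followed by a response.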

Next, the internal configuration of the dialogue processing device 1 is described. FIG. 2 is a block diagram showing the internal configuration of the dialogue processing device 1. The dialogue processing device 1 includes a microcomputer system 10, a microphone 20, and a speaker 30.
The microphone 20 generates an audio signal from the input sound, converts it into audio data in PCM WAVE format, and outputs the data.

The speaker 30 converts audio data in PCM WAVE format into sound and outputs it.
The microcomputer system 10 consists of a CPU, RAM, and ROM. The CPU executes a program recorded in the ROM to realize the functions of the speech recognition unit 11, the emotion recognition unit 12, the emotion expression control unit 13, the knowledge database 14, the response sentence creation unit 15, and the speech synthesis unit 16, which are shown inside the broken line in the figure.

The speech recognition unit 11 is a functional block realized by executing speech recognition software. It analyzes the audio data input from the microphone 20 and outputs the words spoken by the user as text data.
The emotion recognition unit 12 analyzes features such as the intonation of the audio data input from the microphone 20 and outputs emotion information as the result of recognizing the speaker's emotion. The emotion recognition unit can be realized using, for example, "ST Emotion", emotion recognition software made by AGI Corporation. ST Emotion detects the intensity of each of six emotion elements: "anger", "joy", "sadness", "normal", "laughter", and "excitement". "Anger", "joy", "sadness", "normal", and "excitement" are each detected on an 11-level scale from 0 to 10, and "laughter" is detected on a two-level scale (present or absent). In the present embodiment, ST Emotion is assumed to be used, and the emotion information output by the emotion recognition unit 12 consists of the detected values of four of these emotion elements: "anger", "joy", "sadness", and "normal".

The emotion expression control unit 13 acquires, from the knowledge database, vocabulary and fixed phrases containing the emotional expressions to be used in the response sentence, and presents them to the response sentence creation unit 15. Specifically, when the emotion information output by the emotion recognition unit 12 indicates strong anger or strong sadness, the unit acquires vocabulary and fixed phrases that soothe or comfort the user from the knowledge database 14 and outputs the result to the response sentence creation unit 15. When multiple vocabulary items or fixed phrases are acquired from the knowledge database 14, the emotion expression control unit 13 ranks them by likelihood before outputting them to the response sentence creation unit 15.

The knowledge database 14 is a vocabulary management database used for responses in dialogue processing. It collects vocabulary and fixed phrases for various conversation categories (hereinafter, "clusters") and creates and manages one XML file per cluster. In each cluster file, the vocabulary and fixed phrases are classified by semantic attribute. The knowledge database 14 also attaches related information, such as co-occurrence frequency, to the vocabulary and fixed phrases. FIG. 3 is a diagram showing an example of an emotion vocabulary cluster file. The emotion vocabulary cluster file contains vocabulary and fixed phrases that carry emotional expression (hereinafter collectively, "emotion vocabulary"). Each emotion vocabulary item is classified using, as its semantic attributes, the emotion type and emotion strength of a conversation partner for whom that item would be desirable to use in a response sentence.

The knowledge database 14 also has a function for searching the vocabulary and fixed phrases recorded in this data structure for those to be used in a response sentence. Specifically, it provides two search functions: a neutral vocabulary search function, which returns the vocabulary and fixed phrases that are most probable, based on Mahalanobis distance and co-occurrence frequency, for the vocabulary used in the input text; and an emotion vocabulary search function, which returns emotion vocabulary suited as a response to the user's emotion.

In the emotion vocabulary search function, vocabulary is selected using a Mahalanobis distance whose population is the emotion vocabulary recorded in the emotion vocabulary cluster file. When a search Mahalanobis distance is input, the knowledge database 14 returns a predetermined number (for example, four) of the emotion vocabulary items recorded in the emotion vocabulary cluster file, starting with those whose Mahalanobis distance is closest to the input.
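A lookup of this kind, returning the few entries whose stored distance is nearest to the query, might look like the sketch below. The `EmotionEntry` class and its `distance` field are hypothetical; the source does not specify how the cluster file stores or indexes these values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EmotionEntry:
    text: str        # vocabulary item or fixed phrase
    distance: float  # Mahalanobis distance stored for this entry (hypothetical field)


def search_emotion_vocabulary(
    entries: List[EmotionEntry], query_distance: float, limit: int = 4
) -> List[EmotionEntry]:
    """Return up to `limit` entries whose distance is closest to the query."""
    ranked = sorted(entries, key=lambda e: abs(e.distance - query_distance))
    return ranked[:limit]
```

The default `limit=4` mirrors the example count given in the text; a linear scan with a sort is used here only for clarity.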

The Mahalanobis distance used here is calculated by Equation 1 below, where A is the covariance matrix of the population in an orthogonal coordinate system whose X and Y components are the emotion type and the emotion strength, and X is the vector representing the emotion type and strength being searched for.

$$\text{Mahalanobis distance} = X^{\mathsf{T}} A^{-1} X \quad \text{(Equation 1)}$$

As long as the emotion vocabulary recorded in the emotion vocabulary cluster file does not change, this Mahalanobis distance is determined uniquely by the type and strength of the emotion being searched for. A Mahalanobis table, which associates each emotion type and strength with the Mahalanobis distance calculated from the emotion vocabulary cluster file, is therefore created in advance and provided to the emotion expression control unit 13, which issues queries through the emotion vocabulary search function. FIG. 4 is a diagram showing a Mahalanobis table whose population is the vocabulary and fixed phrases recorded in the emotion vocabulary cluster. By using the Mahalanobis table, the emotion expression control unit 13 can easily obtain a Mahalanobis distance from the emotion type and strength of the vocabulary it wants to use in the response sentence, and search for emotion vocabulary with it.
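As a rough illustration of Equation 1 and of precomputing a table from it, consider the following sketch. It evaluates the quadratic form exactly as Equation 1 is written (no mean subtraction and no square root, unlike the textbook Mahalanobis distance). Encoding the emotion type as an index from 0 to 9, using the population covariance, and the table layout keyed by `(type index, strength)` are all assumptions made for this example; the actual layout of FIG. 4 is not reproduced here.

```python
from typing import Dict, Sequence, Tuple

Matrix2 = Tuple[Tuple[float, float], Tuple[float, float]]


def covariance_2x2(points: Sequence[Tuple[float, float]]) -> Matrix2:
    """Population covariance matrix A of (type, strength) pairs."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    return ((sxx, sxy), (sxy, syy))


def equation_1(x: Tuple[float, float], cov: Matrix2) -> float:
    """Evaluate X^T A^-1 X for a 2-component vector X."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    inv = ((d / det, -b / det), (-c / det, a / det))
    return (x[0] * (inv[0][0] * x[0] + inv[0][1] * x[1])
            + x[1] * (inv[1][0] * x[0] + inv[1][1] * x[1]))


def build_mahalanobis_table(cov: Matrix2) -> Dict[Tuple[int, int], float]:
    """Precompute Equation 1 for every (type index, strength) pair."""
    return {
        (t, s): equation_1((float(t), float(s)), cov)
        for t in range(10)      # assumption: 10 emotion types indexed 0..9
        for s in range(0, 11)   # assumption: strengths 0..10
    }
```

Because the table depends only on the covariance of the recorded vocabulary, it needs to be rebuilt only when the cluster file changes, which matches the reason given in the text for preparing it in advance.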

The response sentence creation unit 15 takes the text data output by the speech recognition unit 11 as input and creates a response sentence to the words spoken by the user. It first uses the neutral vocabulary search function provided by the knowledge database 14 to create a value-neutral sentence, one that carries no emotional expression, in reply to the input text. If the emotion expression control unit 13 then provides vocabulary or fixed phrases containing emotional expression, the response sentence creation unit 15 selects an appropriate item from among them according to the likelihood ranking assigned by the emotion expression control unit 13 and the co-occurrence frequency, and inserts the selected item into the value-neutral sentence to complete the response sentence. If the emotion expression control unit 13 provides no such vocabulary or fixed phrase, the value-neutral sentence is used as the response sentence as is. The response sentence created in this way is output to the speech synthesis unit 16 as response sentence text data.

The speech synthesis unit 16 generates audio data in PCM WAVE format from the response sentence text data output by the response sentence creation unit 15.
The above is an outline of the internal configuration of the dialogue processing device 1.

<Details of Emotion Expression Control Unit 13>
A characteristic function of the dialogue processing device 1 according to the present embodiment is the emotion expression processing, which is realized by inserting emotion vocabulary into the response sentence. The following therefore describes in detail the emotion expression control unit 13, which determines the emotion vocabulary to insert into the response sentence.

FIG. 5 is a block diagram showing the detailed configuration of the emotion expression control unit 13. The emotion expression control unit 13 includes the following functional blocks: a coordinate conversion unit 101, a waypoint calculation unit 102, a learning data management unit 103, a vocabulary selection unit 104, and a Mahalanobis table holding unit 105.
The coordinate conversion unit 101 is a functional block that converts the emotion information, which indicates the intensity of each of the four emotion elements, into a single composite vector, and thereby determines the current point representing the user's emotion in a polar coordinate system whose θ component is the emotion type and whose R component is the emotion strength.

The waypoint calculation unit 102 is a functional block that determines a waypoint. A waypoint is the emotional state to which a single trial aims to move the user, and is expressed as one point in the polar coordinate system. The waypoint is set to a point in the polar coordinate system that is closer to the target convergence state of the emotion than the current point.
The learning data management unit 103 is a functional block that realizes reinforcement learning of waypoint determination by accumulating and updating learning data. The learning data is stored as two sequences: fθ {θ1, θ2, ..., θn}, which lists the θ components of the waypoints in a past episode in order from the first trial, and fR {R1, R2, ..., Rn}, which lists the R components of those waypoints in order from the first trial. Here, an "episode" is the series of trials repeated from the first trial until the user's emotion converges to the target range.

To update the learning data, the learning data management unit 103 accumulates the waypoint data of every trial in each episode as temporary learning data in a work area on the RAM, as shown in FIG. 6. When the user's emotion converges to the target range, it compares the number of trials in the temporary learning data with the number of trials in the existing learning data. If the temporary learning data has fewer trials, the learning data management unit 103 replaces the learning data with the temporary learning data.

The learning data management unit 103 holds ten sets of θ-component learning data fθ and ten sets of R-component learning data fR, one for each of the regions to which the emotions "shame", "fear", "sorrow", "dislike", "anger", "excitement", "surprise", "joy", "fondness", and "calm" are assigned, with that region as the starting position. To manage and update the learning data per region in this way, the learning data management unit 103 holds the current point of the first trial as the conversation start position until the episode ends, and, at the end of the episode, compares the temporary learning data against, and updates, the learning data for the region to which the conversation start position belongs.
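The per-region update rule described in the last two paragraphs can be sketched as follows. The class name `LearningDataStore`, the use of a list of `(θ, R)` pairs in place of the two separate sequences fθ and fR, and the choice to store the first converged episode when a region has no data yet are assumptions made for this illustration.

```python
from typing import Dict, List, Optional, Tuple

Waypoint = Tuple[float, float]  # (theta, R) of one trial's waypoint
Episode = List[Waypoint]        # waypoints in trial order; stands in for f-theta and fR


class LearningDataStore:
    """Hypothetical container keeping one episode per starting region."""

    def __init__(self) -> None:
        self._by_region: Dict[str, Episode] = {}

    def get(self, start_region: str) -> Optional[Episode]:
        return self._by_region.get(start_region)

    def finish_episode(self, start_region: str, temporary: Episode) -> bool:
        """Keep `temporary` if it converged in fewer trials; return True if stored."""
        existing = self._by_region.get(start_region)
        # Assumption: with no data for the region yet, the first converged
        # episode is stored as the initial learning data.
        if existing is None or len(temporary) < len(existing):
            self._by_region[start_region] = list(temporary)
            return True
        return False
```

For example, after an episode that started in the "anger" region converges, calling `finish_episode("anger", temporary)` replaces the stored data only when the new episode needed fewer trials.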

The vocabulary selection unit 104 is a functional block that provides the response sentence creation unit 15 with a list of the emotion vocabulary to be used in the response sentence. The vocabulary selection unit 104 first refers to the Mahalanobis table of FIG. 4, held in the Mahalanobis table holding unit 105, to obtain the Mahalanobis distance corresponding to the emotion type θ and emotion strength R of the waypoint. Searching the knowledge database 14 for emotion vocabulary with this Mahalanobis distance yields multiple emotion vocabulary items whose Mahalanobis distances are close to it. However, the items obtained in this way include some whose emotion type differs from that of the waypoint even though their Mahalanobis distance is close. The vocabulary selection unit 104 therefore registers in the list only those retrieved items whose emotion type matches that of the waypoint, ranked in order of closeness to the waypoint's Mahalanobis distance, and provides the list to the response sentence creation unit 15.
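The lookup, search, filter, and ranking steps of the vocabulary selection unit could be arranged as in the sketch below. The `EmotionEntry` fields, the table keyed by `(emotion name, strength)`, and the `search` callable standing in for the knowledge database query are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class EmotionEntry:
    text: str        # vocabulary item or fixed phrase
    emotion: str     # emotion type attribute of the entry
    distance: float  # Mahalanobis distance of the entry (hypothetical field)


def select_vocabulary(
    waypoint_emotion: str,
    waypoint_strength: int,
    table: Dict[Tuple[str, int], float],
    search: Callable[[float], List[EmotionEntry]],
) -> List[EmotionEntry]:
    """Return entries of the waypoint's emotion type, nearest distance first."""
    target = table[(waypoint_emotion, waypoint_strength)]  # Mahalanobis table lookup
    hits = search(target)                                  # query the knowledge database
    matching = [e for e in hits if e.emotion == waypoint_emotion]
    return sorted(matching, key=lambda e: abs(e.distance - target))
```

The filtering step exists because, as the text notes, a close distance alone does not guarantee that an entry belongs to the waypoint's emotion type.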

The Mahalanobis table holding unit 105 is a functional block that holds the Mahalanobis table, and is realized by a recording area reserved in rewritable nonvolatile memory such as FeRAM.
The above is the detailed configuration of the emotion expression control unit 13. Next, the method by which the coordinate conversion unit 101 determines the current point and the method by which the waypoint calculation unit 102 determines the waypoint are described in detail.

<Determination Method of Current Point in Coordinate Conversion Unit 101>
The current point is determined as one point in a polar coordinate system, whose θ component is the emotion type and whose R component is the emotion strength, by converting the emotion information, which indicates the intensity of each of the four emotion elements, into a single composite vector.

In the polar coordinate system used to represent the current point of the user's emotion, the θ component is divided into ten regions of 36 degrees each, as shown in FIG. 7(a). The ten emotion types classified in the "Emotion Expression Dictionary" (Akira Nakamura, Tokyodo Shuppan, 1993) are assigned to these regions counterclockwise in the order "shame", "fear", "sorrow", "dislike", "anger", "excitement", "surprise", "joy", "fondness", "calm". Hereinafter, counterclockwise is the positive θ direction and clockwise is the negative θ direction.
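The division of the θ axis into ten 36-degree regions can be illustrated with a small mapping function. The text gives the counterclockwise order of the emotions but not the angle at which the first region begins, so placing "shame" at 0 to 36 degrees is an assumption of this sketch, as are the function names.

```python
# Counterclockwise order of the ten emotion regions, as listed in the text.
EMOTIONS = ["shame", "fear", "sorrow", "dislike", "anger",
            "excitement", "surprise", "joy", "fondness", "calm"]
SECTOR_DEG = 36.0


def emotion_for_angle(theta_deg: float) -> str:
    """Return the emotion whose 36-degree region contains theta_deg."""
    # Assumption: region 0 ("shame") spans [0, 36) degrees.
    index = int((theta_deg % 360.0) // SECTOR_DEG) % len(EMOTIONS)
    return EMOTIONS[index]


def sector_center(emotion: str) -> float:
    """Return the angle at the center of the region assigned to `emotion`."""
    return EMOTIONS.index(emotion) * SECTOR_DEG + SECTOR_DEG / 2.0
```

Under this assumed layout, `sector_center("anger")` is 162 degrees and `emotion_for_angle(270.0)` is "joy".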

To convert the emotion information into a single composite vector in this polar coordinate system, four unit vectors are defined, one for each emotion element of the emotion information. Specifically, the unit vector for the "normal" element points from the origin toward the center of the "calm" region, the unit vector for the "anger" element points from the origin toward the center of the "anger" region, the unit vector for the "joy" element points from the origin toward the center of the "joy" region, and the unit vector for the "sadness" element points from the origin toward the center of the "sorrow" region. Based on these unit vectors, the four emotion elements indicated by the emotion information can be expressed as four vectors scaled by their intensities, as shown in FIG. 7(b).

Of these four vectors, the "joy" vector V_p and the "normal" vector V_n are combined into a vector V_1, and the "anger" vector V_a and the "sadness" vector V_s are combined into a vector V_2. Combining V_1 and V_2 gives the composite vector V_3, which indicates the current point.
Concretely, let θ1 be the argument of the "joy" direction, θ2 the argument of the "normal" direction, θ3 the argument of the "sadness" direction, and θ4 the argument of the "anger" direction. The x and y components of the unit vectors are then

$$
\begin{aligned}
\text{joy}(x) &= \cos\theta_1, & \text{joy}(y) &= \sin\theta_1 \\
\text{normal}(x) &= \cos\theta_2, & \text{normal}(y) &= \sin\theta_2 \\
\text{sadness}(x) &= \cos\theta_3, & \text{sadness}(y) &= \sin\theta_3 \\
\text{anger}(x) &= \cos\theta_4, & \text{anger}(y) &= \sin\theta_4
\end{aligned}
$$

Multiplying each component by the intensity (0 to 10) of the corresponding emotion element in the emotion information and summing the results gives X, the x component of the composite vector, and Y, its y component. From these x and y components, the argument θt of the composite vector, that is, the θ component of the current point, is obtained by Equation 2 below.

$$\theta_t = \tan^{-1}(Y / X) \quad \text{(Equation 2)}$$

The magnitude of the composite vector, $\sqrt{X^2 + Y^2}$, is the R component of the current point. The above is the detailed method by which the coordinate conversion unit 101 determines the current point.
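The conversion from the four intensities to a current point can be sketched as below. Two points are assumptions: the direction angles in `UNIT_ANGLES_DEG` follow from placing the first region ("shame") at 0 to 36 degrees, which the text does not state, and `math.atan2` is used in place of the plain arctangent of Equation 2 so that the quadrant of the composite vector is preserved.

```python
import math
from typing import Dict, Tuple

# Assumed directions (degrees) of the four unit vectors: the centers of the
# "joy", "calm", "sorrow", and "anger" regions when "shame" starts at 0 degrees.
UNIT_ANGLES_DEG: Dict[str, float] = {
    "joy": 270.0,
    "normal": 342.0,
    "sadness": 90.0,
    "anger": 162.0,
}


def current_point(intensity: Dict[str, float]) -> Tuple[float, float]:
    """Return (theta in degrees, R) of the composite vector.

    `intensity` maps each of the four emotion elements to a value from 0 to 10.
    """
    x = sum(intensity[name] * math.cos(math.radians(angle))
            for name, angle in UNIT_ANGLES_DEG.items())
    y = sum(intensity[name] * math.sin(math.radians(angle))
            for name, angle in UNIT_ANGLES_DEG.items())
    r = math.hypot(x, y)                            # R component: sqrt(X^2 + Y^2)
    theta = math.degrees(math.atan2(y, x)) % 360.0  # quadrant-aware form of Equation 2
    return theta, r
```

When all intensities cancel out, R is (numerically) zero and the returned angle carries no meaning.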

＜経由点算出部１０２における経由点の決定方法＞
次に経由点決定方法の詳細について説明する。本実施形態に係る対話処理装置１は、θ成分が安、好、喜の何れかの感情が割り当てられた領域であるか、感情の強さを示すＲ成分が１以下である状態にユーザ感情を収束させることを目的としており、図７の（ｃ）に示すように、経由点は極座標系において現在点よりも目標とする感情の収束状態に近い点、即ち、現在点よりも安、好、喜の領域に近いか、現在点よりもＲ成分が小さい点に設定される。以下、θ成分が安、好、喜の何れかの感情が割り当てられた領域、及び感情の強さを示すＲ成分が１以下である領域を合わせた図７の（ｃ）で斜線で示した領域を、「目標範囲」という。 θt = tan ⁻¹ (Y / X) Equation 2
The magnitude of the composite vector, that is, the square root of (X² + Y²), is the R component of the current point. The above is the details of how the coordinate conversion unit 101 determines the current point.
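The current-point calculation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the direction angles in `DIRECTIONS_DEG` are hypothetical placeholders (the actual angles follow from the region layout of FIG. 7, which this passage does not state), and `math.atan2` is used in place of a bare tan⁻¹(Y/X) as an assumption, so that the quadrant of the composite vector is resolved.

```python
import math

# Hypothetical unit-vector directions in degrees. The real values depend on
# the region layout of FIG. 7 and are not given in this passage.
DIRECTIONS_DEG = {"joy": 45.0, "normal": 135.0, "sorrow": 225.0, "anger": 315.0}


def current_point(intensities, directions_deg=DIRECTIONS_DEG):
    """Return (theta_deg, r) of the composite vector.

    intensities maps each emotion element to its strength (0 to 10).
    """
    # X and Y: each unit-vector component scaled by its intensity, then summed.
    x = sum(v * math.cos(math.radians(directions_deg[k]))
            for k, v in intensities.items())
    y = sum(v * math.sin(math.radians(directions_deg[k]))
            for k, v in intensities.items())
    # Equation 2, with atan2 so that all four quadrants are handled (assumption).
    theta = math.degrees(math.atan2(y, x)) % 360.0
    # R component: magnitude of the composite vector.
    r = math.hypot(x, y)
    return theta, r
```

With these placeholder angles, an input containing only “anger” at strength 10 yields a current point on the “anger” direction with R = 10, while equal “joy” and “sorrow” strengths cancel each other because their placeholder directions are opposite.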

<Method for Determining Via Point in Via Point Calculation Unit 102>
Next, the via-point determination method is described in detail. The dialogue processing device 1 according to the present embodiment aims to make the user's emotion converge to a state in which the θ component lies in a region assigned to one of the emotions “ease” (安), “fondness” (好), or “joy” (喜), or in which the R component, which indicates the strength of the emotion, is 1 or less. As shown in FIG. 7(c), the via point is therefore set to a point that is closer in the polar coordinate system to the target convergence state than the current point is, that is, a point that is closer to the “ease”, “fondness”, and “joy” regions than the current point, or whose R component is smaller than that of the current point. Hereinafter, the hatched region in FIG. 7(c), which combines the regions in which the θ component is assigned to “ease”, “fondness”, or “joy” with the region in which the R component is 1 or less, is referred to as the “target range”.

経由点の決定には２つの方法がある。１つ目は初期状態での経由点決定方法であり、２つ目は学習データを利用した経由点決定方法である。ユーザの発話音声の取得と応答文の出力とを繰り返す対話処理によって、以前にユーザ感情が目標範囲に収束したことがある場合、学習データ管理部１０３には、対話処理の開始からユーザ感情が収束するまでの一連の経過が学習データとして記録されている。初期状態とは、このような学習データが蓄積されていない状態である。 There are two ways to determine the waypoint. The first is a waypoint determination method in the initial state, and the second is a waypoint determination method using learning data. If the user emotion has previously converged to the target range by the interactive processing that repeats the acquisition of the user's speech and the output of the response sentence, the learning data management unit 103 causes the user emotion to converge from the start of the interactive processing. A series of processes up to this time is recorded as learning data. The initial state is a state in which such learning data is not accumulated.

＜初期状態での経由点決定方法＞
初期状態での経由点決定では、θ成分とＲ成分とを独立に決定する。１度の経過点決定で現在点からθ成分を変化させる幅は、０〜３６度であり、正規分布等の確率密度関数を用いて、試行の度に変化量を決定する。またθ成分を変化させる方向は、現在点が「哀」、「怖」、「恥」の何れかが割り当てられた領域及び「厭」が割り当てられた領域の中央より「安」側にあれば、「安」が割り当てられた領域に近づく方向、即ち、負方向である。逆に、現在点が「驚」、「昂」、「怒」の何れかが割り当てられた領域及び「厭」が割り当てられた領域の中央より「喜」側にあれば、「喜」が割り当てられた領域に近づく方向、即ち、正方向に変化させる。 <Via-point determination method in the initial state>
In the initial state, the θ component and the R component of the via point are determined independently. In a single via-point determination, the θ component is changed from the current point by 0 to 36 degrees, and the amount of change is decided at each trial using a probability density function such as a normal distribution. As for the direction of change, if the current point lies in a region assigned to “sorrow” (哀), “fear” (怖), or “shame” (恥), or on the “ease” (安) side of the center of the region assigned to “disgust” (厭), the θ component is changed toward the “ease” region, that is, in the negative direction. Conversely, if the current point lies in a region assigned to “surprise” (驚), “excitement” (昂), or “anger” (怒), or on the “joy” (喜) side of the center of the “disgust” region, the θ component is changed toward the “joy” region, that is, in the positive direction.

１度の経過点決定で現在点からＲ成分を減少させる幅は、０〜１であり、正規分布等の確率密度関数を用いて、試行の度に変化量を決定する。
＜学習データを利用した経由点決定方法＞
学習データを利用した経由点決定においても、θ成分とＲ成分とを独立に決定する。この点は、初期状態での経由点決定と同様であり、学習データは、θ成分若しくはＲ成分の決定について用いられる。 In a single via-point determination, the R component is decreased from the current point by 0 to 1, and the amount of change is decided at each trial using a probability density function such as a normal distribution.
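The initial-state rule can be illustrated with a short sketch. The text only says that a probability density function “such as a normal distribution” is used; the mean and standard deviation below, the clamping into the permitted range, and the function names are all assumptions made for illustration. The direction of the θ change (negative toward “ease”, positive toward “joy”) is passed in by the caller, since deciding it requires the region layout of FIG. 7.

```python
import random


def sample_clamped(rng, mean, sigma, low, high):
    """Draw from a normal distribution and clamp the value into [low, high]."""
    return min(high, max(low, rng.gauss(mean, sigma)))


def initial_via_point(theta, r, direction, rng=None):
    """Return (via_theta, via_r) for a current point (theta in degrees, r).

    direction is -1 (toward the "ease" region) or +1 (toward the "joy" region).
    """
    if rng is None:
        rng = random.Random()
    # theta moves by 0 to 36 degrees; mean and sigma are hypothetical choices.
    d_theta = sample_clamped(rng, 18.0, 9.0, 0.0, 36.0)
    # R decreases by 0 to 1; mean and sigma are hypothetical choices.
    d_r = sample_clamped(rng, 0.5, 0.25, 0.0, 1.0)
    return (theta + direction * d_theta) % 360.0, max(0.0, r - d_r)
```

Because the two components are sampled separately, the sketch mirrors the statement that θ and R are determined independently.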
<Via-point determination method using learning data>
Also in the waypoint determination using the learning data, the θ component and the R component are determined independently. This is the same as the waypoint determination in the initial state, and the learning data is used for determining the θ component or the R component.

学習データは、過去のエピソードでの経由点のθ成分、若しくはＲ成分を、初回試行から順にならべた数列ｆθ｛θ１、θ２、・・・、θｎ｝、ｆＲ｛Ｒ１、Ｒ２、・・・、Ｒｎ｝である。学習データ管理部１０３には、「恥」、「怖」、「哀」、「厭」、「怒」、「昂」、「驚」、「喜」、「好」、「安」の感情が割り当てられた各領域を開始位置とする１０のθ成分学習データｆθ、及び１０のＲ成分学習データｆＲが蓄積されている。 The learning data consists of sequences fθ {θ1, θ2, ..., θn} and fR {R1, R2, ..., Rn}, which list the θ components or the R components of the via points in a past episode in order from the first trial. The learning data management unit 103 stores ten sets of θ-component learning data fθ and ten sets of R-component learning data fR, one for each start position among the regions assigned to the emotions “shame” (恥), “fear” (怖), “sorrow” (哀), “disgust” (厭), “anger” (怒), “excitement” (昂), “surprise” (驚), “joy” (喜), “fondness” (好), and “ease” (安).

図８は、学習データｆθを用いた経由点θ成分の決定を示す図である。本図において矢印は、現在点のθ成分を示しており、白丸は数列ｆθのｔ番目の値である。経由点算出部１０２では、目標範囲外である現在点が最初に入力された試行を１回目の試行として、この現在点が属する領域を開始位置とする数列ｆθを選択し、数列ｆθの１番目の値を経由点のθ成分とする。続く２回目、３回目、４回目の試行においても同じ数列ｆθを用いて、２番目、３番目、４番目、の値を経由点のθ成分とする。学習データｆＲを用いて経由点Ｒ成分を決定する場合も同様に初回試行での現在点が属する領域を開始位置とする数列ｆＲを選択し、その後は図９に示すように、１回目試行の現在点θ成分が属する領域を開始位置とする数列ｆＲを選択し、数列ｆＲの１番目、２番目、３番目、４番目の値を、それぞれ１回目、２回目、３回目、４回目の試行での経由点Ｒ成分に用いる。 FIG. 8 illustrates how the θ component of the via point is determined using the learning data fθ. In the figure, each arrow indicates the θ component of the current point, and each white circle is the t-th value of the sequence fθ. The via-point calculation unit 102 treats the trial in which a current point outside the target range is first input as the first trial, selects the sequence fθ whose start position is the region to which this current point belongs, and uses the first value of fθ as the θ component of the via point. In the subsequent second, third, and fourth trials, the same sequence fθ is used, and its second, third, and fourth values become the θ components of the via points. The R component of the via point is determined from the learning data fR in the same way: as shown in FIG. 9, the sequence fR whose start position is the region to which the θ component of the current point in the first trial belongs is selected, and the first, second, third, and fourth values of fR are used as the R components of the via points in the first, second, third, and fourth trials, respectively.
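The lookup described for FIG. 8 and FIG. 9 amounts to indexing a stored sequence by trial number. The sketch below assumes the learning data is held as a dictionary from start region to a list of values; that layout, the function name, and the choice to return `None` when no sequence exists or the trial number runs past the end of the sequence are illustrative assumptions, since the text does not say what happens in those cases.

```python
def via_component_from_learning(learning_data, start_region, trial):
    """Return the trial-th (1-based) stored value for start_region, or None.

    learning_data maps a start region name to a sequence such as
    f_theta = [theta1, theta2, ...] or f_R = [R1, R2, ...].
    """
    seq = learning_data.get(start_region)
    # No sequence for this region, or the sequence is exhausted (assumed handling).
    if seq is None or not 1 <= trial <= len(seq):
        return None
    return seq[trial - 1]
```

The same function serves both components: it would be called once with the fθ table and once with the fR table, each keyed by the region of the first trial's current point.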

ここで、ｔ回目試行での経由点とｔ+１回目試行で認識される現在点とでθ成分の差が所定の閾値（例えば、３６度）を超える場合や、Ｒ成分の差が所定の閾値（例えば、２．５）を超える場合、ｔ回目試行を「無効試行」と呼ぶ。図１０の（ａ）の例では４回目の試行が無効試行となっている。無効試行の次の試行、図１０の（ａ）の例では５回目の試行が無効試行でない場合には、従前の学習データを用いて経由点決定を継続する。しかし、図１０の（ｂ）のように無効試行が２回連続した場合、２回目の無効試行の次の試行では、その試行での現在点が属する領域を開始位置とする学習データを新たに選びなおし、新たな初回試行として経由点決定を行う。このような無効試行の判定、学習データの選び直しは、θ成分とＲ成分とで独立して実行される。 Here, when the difference of the θ component exceeds a predetermined threshold (for example, 36 degrees) between the transit point in the t-th trial and the current point recognized in the t + 1-th trial, the difference in the R component is a predetermined value. When the threshold value (for example, 2.5) is exceeded, the t-th trial is referred to as an “invalid trial”. In the example of FIG. 10A, the fourth trial is an invalid trial. In the case of the trial next to the invalidation trial, in the example of FIG. 10A, if the fifth trial is not the invalid trial, the waypoint determination is continued using the previous learning data. However, when invalid trials are continued twice as shown in FIG. 10B, in the trial next to the second invalid trial, new learning data having a start position in the region to which the current point belongs in the trial is newly added. Re-select and make waypoint determination as a new first trial. Such determination of invalid trial and reselection of learning data are performed independently for the θ component and the R component.
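The invalid-trial rule can be sketched as a small counter that is kept separately for θ and R. The thresholds 36 degrees and 2.5 are the example values from the text; the class name, the wrap-around handling of angles in `angle_gap`, and the decision to clear the counter when a reselection is signalled are assumptions for illustration.

```python
THETA_LIMIT_DEG = 36.0  # example threshold from the text
R_LIMIT = 2.5           # example threshold from the text


def angle_gap(a, b):
    """Smallest difference between two angles in degrees (wrap-around assumed)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def plain_gap(a, b):
    """Absolute difference, used for the R component."""
    return abs(a - b)


class InvalidTrialCounter:
    """Counts consecutive invalid trials for one component."""

    def __init__(self, limit, gap=plain_gap):
        self.limit = limit
        self.gap = gap
        self.count = 0

    def update(self, prev_via, current):
        """Return True when the learning data should be reselected."""
        if self.gap(prev_via, current) > self.limit:
            self.count += 1   # previous trial was invalid
        else:
            self.count = 0    # a valid trial breaks the run
        if self.count >= 2:   # two invalid trials in a row
            self.count = 0
            return True
        return False
```

One instance would be created with `THETA_LIMIT_DEG` and `angle_gap`, and another with `R_LIMIT`, matching the statement that the judgment is made independently for the two components.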

以上が座標変換部１０１における経由点の決定方法の詳細である。
ここまで説明した感情表現制御部１３の機能により、ユーザ感情を目標範囲に近づけるような感情語彙を選択することができ、このような感情語彙を応答文に挿入することで、本実施形態に係る対話処理装置１による感情表現処理が実現される。

＜対話処理装置１の動作＞
続いて、本実施形態に係る対話処理装置１の動作を説明する。 The details of the waypoint determination method in the coordinate conversion unit 101 have been described above.
By the function of the emotion expression control unit 13 described so far, it is possible to select an emotion vocabulary that brings the user emotion closer to the target range, and by inserting such an emotion vocabulary into the response sentence, according to the present embodiment Emotion expression processing by the dialogue processing device 1 is realized.

<Operation of Dialogue Processing Device 1>
Subsequently, the operation of the dialogue processing apparatus 1 according to the present embodiment will be described.

図１１は、本実施形態に係る対話処理装置１の動作手順を示すフローチャートである。
対話処理装置１の動作では、ステップＳ１で対話処理装置１の初期化処理が実行され、ユーザの発話が待ち受けられる。初期化処理とは、学習データ管理部１０３が一時学習データと会話開始位置とを管理するための記録領域を、ＲＡＭ上の作業領域に確保する処理である。一時学習データ記録領域と会話開始位置の記録領域とは、θ成分用及びＲ成分用がそれぞれ別に確保される。 FIG. 11 is a flowchart showing an operation procedure of the dialogue processing apparatus 1 according to the present embodiment.
In the operation of the dialog processing device 1, initialization processing of the dialog processing device 1 is executed in step S1, and the user's speech is awaited. The initialization process is a process in which the learning data management unit 103 secures a recording area for managing temporary learning data and a conversation start position in a work area on the RAM. A temporary learning data recording area and a conversation starting position recording area are separately provided for the θ component and the R component.

初期化後、ユーザが発話するとマイク２０を介して音声入力データが取得される（ステップＳ２）と、この音声入力データが感情認識部１２により解析され４つの感情要素について強度を示す感情情報が取得される（ステップＳ３）。この感情情報に基づいて感情表現制御部１３では、マハラノビス取得処理（ステップＳ４）が実行される。
ここで、感情情報を変換して得られるユーザ感情の現在点が目標範囲内にある場合、詳細を後述するマハラノビス取得処理においてマハラノビス距離は取得されない（ステップＳ５：Ｎｏ）。このような場合、応答文作成部１５には感情語彙が提供されることがなく、応答文作成部１５は、感情語彙を挿入させることなくステップＳ８にて応答文を作成する。例えばユーザの発言が「Ｔチームが負けた。」という言葉であり、感情の現在点が目標範囲を外れるような強い感情が認識されなかった場合、応答文作成部１５は、「Ｔチームは、負けましたね。」という感情語彙を含まない応答文を作成する。 After the initialization, when the user speaks, voice input data is acquired through the microphone 20 (step S2). This voice input data is analyzed by the emotion recognition unit 12, and emotion information indicating the strength of the four emotion elements is acquired. (Step S3). Based on this emotion information, the emotion expression control unit 13 executes a Mahalanobis acquisition process (step S4).
Here, when the current point of the user's emotion obtained by converting the emotion information is within the target range, no Mahalanobis distance is acquired in the Mahalanobis acquisition process, which is described in detail later (step S5: No). In this case, no emotion vocabulary is provided to the response sentence creation unit 15, which creates a response sentence in step S8 without inserting any emotion vocabulary. For example, when the user says “Team T lost.” and no emotion strong enough to put the current point outside the target range is recognized, the response sentence creation unit 15 creates the response “Team T lost, didn't it.”, which contains no emotion vocabulary.

一方、感情情報を変換して得られるユーザ感情の現在点が目標範囲外にある場合には、マハラノビス取得処理において何らかのマハラノビス距離が取得される（ステップＳ５：Ｙｅｓ）。このような場合、感情表現制御部１３内の語彙選択部１０４によって、取得されたマハラノビス距離を用いて知識データベース１４が検索され、マハラノビス距離が近い複数の感情語彙が取得される（ステップＳ６）。複数の感情語彙を取得した語彙選択部１０４は、詳細を後述する感情語彙絞り込み処理により不適切な感情語彙を除外し、適切な感情語彙のみをマハラノビス距離が近い順に並べた出力リストを作成し、応答文作成部１５へ出力する（ステップＳ７）。感情語彙のリストが提供された応答文作成部１５は、ステップＳ８にて感情語彙を挿入した応答文を作成する。 On the other hand, if the current point of user emotion obtained by converting emotion information is outside the target range, some Mahalanobis distance is acquired in the Mahalanobis acquisition process (step S5: Yes). In such a case, the knowledge database 14 is searched using the acquired Mahalanobis distance by the vocabulary selection unit 104 in the emotion expression control unit 13, and a plurality of emotional vocabularies having a short Mahalanobis distance are acquired (step S6). The vocabulary selection unit 104 that has acquired a plurality of emotional vocabularies excludes inappropriate emotional vocabulary by the emotional vocabulary narrowing process, which will be described in detail later, and creates an output list in which only appropriate emotional vocabularies are arranged in order of closest Mahalanobis distance, It outputs to the response sentence preparation part 15 (step S7). The response sentence creation unit 15 provided with the emotion vocabulary list creates a response sentence in which the emotion vocabulary is inserted in step S8.

例として、ユーザの「Ｔチームが負けた。」という言葉の抑揚等から目標範囲を外れるような強い感情が認識され、この認識結果から決定された経由点のマハラノビス距離が２．２であった場合を想定する。図４を参照すると、マハラノビス距離が２．２に近いのは、マハラノビス距離が２．３である感情種別「哀」強度「４」の感情語彙、マハラノビス距離が２．０である感情種別「哀」強度「５」の感情語彙、マハラノビス距離が２．４である感情種別「怒」強度「４」の感情語彙、及びマハラノビス距離が２．２である感情種別「怒」強度「５」の感情語彙である。本例では、これらの属性の感情語彙が図３に示す感情語彙クラスタファイルから抽出され、語彙選択部１０４に取得される。ここでユーザ感情の現在点が「哀」領域に含まれるのもであったなら、抽出された語彙のうち感情種別「怒」の感情語彙は除外され、マハラノビス距離が２．３の感情種別「哀」強度「４」の語彙「残念だけど」「気にすることはないですよ」と、マハラノビス距離が２．０の感情種別「哀」強度「５」の語彙「悲しいけど」「まだなんとかなりますよ」とがこの順番で出力リストに登録され、応答文作成部１５へ出力される。応答文作成部１５では、出力リストに登録された感情語彙からマハラノビス距離や共起頻度を用いて使用する語彙を選択する。ここで感情語彙「残念だけど」が選択された場合は、「Ｔチームは、残念だけど負けましたね。」という感情語彙「残念だけど」を挿入した応答文が作成される。こうして作成された応答文は、スピーカ３０を介して音声としてユーザに提供される。 As an example, suppose that a strong emotion outside the target range is recognized from, for instance, the intonation of the user's words “Team T lost.”, and that the Mahalanobis distance of the via point determined from this recognition result is 2.2. Referring to FIG. 4, the entries whose Mahalanobis distance is close to 2.2 are the emotion vocabulary of emotion type “sorrow”, strength “4” (Mahalanobis distance 2.3); emotion type “sorrow”, strength “5” (2.0); emotion type “anger”, strength “4” (2.4); and emotion type “anger”, strength “5” (2.2). In this example, the emotion vocabulary with these attributes is extracted from the emotion vocabulary cluster file shown in FIG. 3 and acquired by the vocabulary selection unit 104. If the current point of the user's emotion lies in the “sorrow” region, the vocabulary of emotion type “anger” is excluded from the extracted vocabulary. The words “It's a shame, but” (残念だけど) and “There's no need to worry about it” (気にすることはないですよ) of emotion type “sorrow”, strength “4” (distance 2.3), followed by the words “It's sad, but” (悲しいけど) and “It can still work out” (まだなんとかなりますよ) of emotion type “sorrow”, strength “5” (distance 2.0), are then registered in the output list in this order and output to the response sentence creation unit 15. The response sentence creation unit 15 selects the word to use from the emotion vocabulary registered in the output list, using the Mahalanobis distance and the co-occurrence frequency. If “It's a shame, but” is selected, a response sentence with this emotion vocabulary inserted is created: “It's a shame, but Team T lost, didn't it.” The response sentence created in this way is provided to the user as speech through the speaker 30.

以上の動作手順においてステップＳ２〜ステップＳ８が１回の試行となり、ユーザの発話の度に繰り返される。
＜マハラノビス取得処理＞
続いて、マハラノビス取得処理の詳細について説明する。図１２は、マハラノビス取得処理の詳細を示すフローチャートである。 In the above operation procedure, steps S2 to S8 are one trial and are repeated every time the user speaks.
<Mahalanobis acquisition processing>
Next, details of the Mahalanobis acquisition process will be described. FIG. 12 is a flowchart showing details of the Mahalanobis acquisition process.

マハラノビス取得処理では、先ず座標変換部１０１が感情情報を合成ベクトルに変換し、図４の（ｂ）で示したようにユーザ感情の現在点を決定する（ステップＳ１１）。次に、詳細を後述する経由点算出・一時学習処理を経由点算出部１０２が実行し、現在点に基づいて経由点を決定する（ステップＳ１２）。
ここで現在点が目標範囲外である場合（ステップＳ１３:Ｎｏ）、語彙選択部１０４は、図４のマハラノビステーブルを参照し（ステップＳ１４）、ステップＳ１１で決定された経由点のθ成分、Ｒ成分を基にマハラノビス値を取得する（ステップＳ１５）。 In the Mahalanobis acquisition process, first, the coordinate conversion unit 101 converts emotion information into a composite vector, and determines the current point of user emotion as shown in FIG. 4B (step S11). Next, the waypoint calculation unit 102 executes a waypoint calculation / temporary learning process described later in detail, and determines a waypoint based on the current point (step S12).
Here, when the current point is outside the target range (step S13: No), the vocabulary selection unit 104 refers to the Mahalanobis table of FIG. 4 (step S14), and the θ component of the waypoint determined in step S11, R A Mahalanobis value is acquired based on the component (step S15).

現在点が目標範囲に含まれている場合（ステップＳ１３:Ｙｅｓ）、語彙選択部１０４がマハラノビス値を取得しなかったことを応答文作成部１５へ通知し（ステップＳ１６）、その後、学習データ管理部１０３により、感情シーケンス終了処理が実行される。
以上がマハラノビス取得処理の詳細である。
＜経由点算出・一時学習処理＞
続いて、経由点算出・一時学習処理の詳細について説明する。図１３は、経由点算出・一時学習処理の詳細の詳細を示すフローチャートである。 When the current point is included in the target range (step S13: Yes), the vocabulary selection unit 104 notifies the response sentence creation unit 15 that the Mahalanobis value has not been acquired (step S16), and then learning data management The emotion sequence end process is executed by the unit 103.
The above is the details of the Mahalanobis acquisition process.
<Via-point calculation / temporary learning process>
Next, details of the waypoint calculation / temporary learning process will be described. FIG. 13 is a flowchart showing details of the waypoint calculation / temporary learning process.

経由点算出・一時学習処理では、先ずステップＳ２１、ステップＳ２２において、学習データ管理部１０３により一時学習データの蓄積回数が確認される。一時学習データの蓄積回数が所定の閾値Ｔｈ１（例えば１０回）を超えていれば（ステップＳ２１:Ｙｅｓ）、一時学習データがリセットされる（ステップＳ３５）。一時学習データのリセットとは、一時学習データ記録領域と会話開始位置の記録領域に記録されているデータを全て消去する処理である。 In the waypoint calculation / temporary learning process, first, in step S21 and step S22, the learning data management unit 103 confirms the number of accumulations of temporary learning data. If the accumulated number of temporary learning data exceeds a predetermined threshold Th1 (for example, 10 times) (step S21: Yes), the temporary learning data is reset (step S35). The temporary learning data reset is a process of erasing all data recorded in the temporary learning data recording area and the recording area of the conversation start position.

一時学習データとして経由点が１つも記録されていない場合（ステップＳ２２:Ｙｅｓ）、即ち、一時学習データリセット直後や初回試行時である場合、学習データ管理部１０３が現在点を会話開始位置の記録領域に記録し（ステップＳ２３）、処理がステップＳ３０へ移行する。
一時学習データとして経由点が記録されている場合（ステップＳ２２:Ｎｏ）、即ち２回目以降の試行である場合は、ステップＳ２４〜ステップＳ２９の処理手順で、経由点算出部１０２によって前回試行が無効試行であったかが判定される。前回試行の経由点と現在点とのＲ成分の差が２．５より大きい場合（ステップＳ２４:Ｙｅｓ）や、前回試行の経由点と現在点とのθ成分の差が３６度より大きい場合（ステップＳ２７:Ｙｅｓ）には、前回試行が無効試行であったとして、それぞれの成分について範囲外回数を記録する変数がインクリメントされる（ステップＳ２６、ステップＳ２９）。前回試行が無効試行ではなかった場合（ステップＳ２４：Ｎｏ、ステップＳ２７：Ｎｏ）には、それぞれの成分について範囲外回数を記録する変数が０に初期化される（ステップＳ２５、ステップＳ２８）。更に経由点算出部１０２は、ステップＳ３０において、範囲外回数を記録する変数が２以上であるかを判定する。範囲外回数が２回以上であれば、一時学習データがリセットされる（ステップＳ３６）。尚、範囲外回数を記録する変数は、一時学習データリセット処理において、０に初期化される。 When no transit point is recorded as temporary learning data (step S22: Yes), that is, immediately after the temporary learning data is reset or at the first trial, the learning data management unit 103 records the current point as the conversation start position. The area is recorded (step S23), and the process proceeds to step S30.
If via points are recorded as temporary learning data (step S22: No), that is, in the second or a later trial, the via-point calculation unit 102 determines whether the previous trial was an invalid trial through steps S24 to S29. When the difference in the R component between the via point of the previous trial and the current point is greater than 2.5 (step S24: Yes), or when the difference in the θ component between the via point of the previous trial and the current point is greater than 36 degrees (step S27: Yes), the previous trial is judged to have been an invalid trial, and the variable that records the out-of-range count for the corresponding component is incremented (steps S26 and S29). When the previous trial was not an invalid trial (step S24: No, step S27: No), the variable that records the out-of-range count for the corresponding component is reset to 0 (steps S25 and S28). In step S30, the via-point calculation unit 102 then determines whether the out-of-range count is 2 or more; if it is, the temporary learning data is reset (step S36). Note that the out-of-range count variables are initialized to 0 in the temporary learning data reset process.

その後、学習データ管理部１０３が現在点を一時学習データ記録領域に追加記録し、現在点が目標範囲外の位置する場合（ステップＳ３２：Ｎｏ）、経由点算出部１０２が、ステップＳ３３、及びステップＳ３４において今回試行の経由点としてＲ成分とθ成分を決定する。ステップＳ３３、及びステップＳ３４における経由点の決定では、「経由点算出部１０２における経由点の決定方法」として既に説明した初期状態での経由点決定方法、若しくは、学習データを利用した経由点決定方法が利用される。 Thereafter, when the learning data management unit 103 additionally records the current point in the temporary learning data recording area and the current point is located outside the target range (step S32: No), the waypoint calculation unit 102 performs steps S33 and S33. In S34, the R component and the θ component are determined as the waypoints for the current trial. In the determination of the waypoint in step S33 and step S34, the waypoint determination method in the initial state already described as “the waypoint determination method in the waypoint calculation unit 102” or the waypoint determination method using learning data Is used.

以上が経由点算出・一時学習処理の詳細である。
＜感情語彙絞り込み処理＞
続いて、感情語彙絞り込み処理の詳細について説明する。図１４は、感情語彙絞り込み処理の詳細の詳細を示すフローチャートである。
感情語彙絞り込み処理は、語彙選択部１０４によって実行される処理であり、先ず、経由点のθ成分に基づいて経由点が属する領域に割り当てられた感情の種別を取得し（ステップＳ４１）、この感情の種別を、意味属性文字列に設定する（ステップＳ４２）。その後、ＲＡＭ上の出力用テーブル記録領域を初期化し（ステップＳ４３）する。 The above is the details of the waypoint calculation / temporary learning process.
<Emotion vocabulary narrowing process>
Next, details of the emotion vocabulary narrowing process will be described. FIG. 14 is a flowchart showing details of emotion vocabulary narrowing processing.
The emotion vocabulary narrowing process is executed by the vocabulary selection unit 104. First, the type of emotion assigned to the region to which the via point belongs is acquired based on the θ component of the via point (step S41), and this emotion type is set as the semantic attribute character string (step S42). Thereafter, the output table recording area on the RAM is initialized (step S43).

出力用テーブル記録領域を初期化した後は、ステップＳ４５〜ステップＳ４６の処理を知識データベース１４から検出された全ての感情語彙について繰り返す。ステップＳ４５の処理は、知識データベース１４から検出された感情語彙で未処理のもののうち、経由点とマハラノビス距離が最も近いものを選択し、（ステップＳ４５）、この感情語彙の意味属性である感情種別と、ステップＳ４２で設定した意味属性文字列を比較する（ステップＳ４６）。比較の結果が一致すれば（ステップＳ４６:Ｙｅｓ）、選択した感情語彙を、出力用テーブル記録領域に追加記録する（ステップＳ４７）。 After the output table recording area is initialized, the processing of steps S45 to S46 is repeated for all of the emotion vocabulary retrieved from the knowledge database 14. In step S45, from among the not-yet-processed emotion vocabulary retrieved from the knowledge database 14, the entry whose Mahalanobis distance is closest to that of the via point is selected, and the emotion type that is the semantic attribute of this entry is compared with the semantic attribute character string set in step S42 (step S46). If they match (step S46: Yes), the selected emotion vocabulary is added to the output table recording area (step S47).
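The narrowing loop of steps S41 to S47 can be sketched as a sort followed by a filter. The record layout (dictionaries with `word`, `type`, and `distance` keys) and the function name are assumptions for illustration. The “sorrow” words and all distances are taken from the worked example in the description, whereas the two “anger” words are hypothetical placeholders because the text does not list them.

```python
def narrow_vocabulary(candidates, via_type, via_distance):
    """Return words of the via point's emotion type, closest distance first."""
    # S45: visit candidates in order of closeness to the via point's distance.
    ordered = sorted(candidates, key=lambda c: abs(c["distance"] - via_distance))
    # S46/S47: keep only entries whose emotion type matches the via point's type.
    return [c["word"] for c in ordered if c["type"] == via_type]


CANDIDATES = [
    {"word": "悲しいけど", "type": "sorrow", "distance": 2.0},
    {"word": "まだなんとかなりますよ", "type": "sorrow", "distance": 2.0},
    {"word": "残念だけど", "type": "sorrow", "distance": 2.3},
    {"word": "気にすることはないですよ", "type": "sorrow", "distance": 2.3},
    {"word": "(hypothetical anger word A)", "type": "anger", "distance": 2.4},
    {"word": "(hypothetical anger word B)", "type": "anger", "distance": 2.2},
]

output_list = narrow_vocabulary(CANDIDATES, "sorrow", 2.2)
```

For a via point of type “sorrow” with distance 2.2, the distance-2.3 words come before the distance-2.0 words and the “anger” entries are dropped, which matches the order of the output list in the worked example.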

以上が感情語彙絞り込み処理の詳細である。
＜感情シーケンス終了処理＞
続いて、感情シーケンス終了処理の詳細について説明する。図１５は、感情シーケンス終了処理の詳細の詳細を示すフローチャートである。
感情シーケンス終了処理は、試行を繰り返すことでユーザ感情を示す現在点が目標範囲に収束した場合に実行される処理である。 The above is the details of the emotion vocabulary narrowing process.
<Emotion sequence end processing>
Next, the details of the emotion sequence end process will be described. FIG. 15 is a flowchart showing details of emotion sequence end processing.
The emotion sequence end process is a process executed when the current point indicating the user emotion converges to the target range by repeating trials.

感情シーケンス終了処理において学習データ管理部１０３は、会話開始位置の記録領域に記録されているθ成分の値を取得し（ステップＳ５１）、この値が「恥」、「怖」、「哀」、「厭」、「怒」、「昂」、「驚」、「喜」、「好」、「安」の何れの種類の感情が割り当てられた領域に含まれるかを判定し（ステップＳ５２）、この領域を開始位置とする学習データが以前に蓄積されていない場合（ステップＳ５３：Ｎｏ）、一時学習データ記録領域に記録されている経由点を、新たな学習データとして蓄積する（ステップＳ５４）。 In the emotion sequence end process, the learning data management unit 103 acquires the value of the θ component recorded in the recording area for the conversation start position (step S51) and determines which emotion region contains this value, among “shame” (恥), “fear” (怖), “sorrow” (哀), “disgust” (厭), “anger” (怒), “excitement” (昂), “surprise” (驚), “joy” (喜), “fondness” (好), and “ease” (安) (step S52). If no learning data with this region as its start position has been stored before (step S53: No), the via points recorded in the temporary learning data recording area are stored as new learning data (step S54).

ステップＳ５２で判定した領域を開始位置とする学習データが以前に蓄積されている場合（ステップＳ５３:Ｙｅｓ）には、この学習データの経由点の数と、一時学習データ記録領域に記録されている経由点の数とを比較する（ステップＳ５５）。一時学習データ記録領域に記録されている経由点の数の方が少なければ（ステップＳ５５:Ｙｅｓ）、既存の学習データを破棄し、一時学習データ記録領域に記録されている経由点を、新たな学習データとして蓄積する。 When the learning data having the area determined in step S52 as the start position has been accumulated before (step S53: Yes), the number of via points of the learning data and the temporary learning data recording area are recorded. The number of via points is compared (step S55). If the number of waypoints recorded in the temporary learning data recording area is smaller (step S55: Yes), the existing learning data is discarded, and the waypoints recorded in the temporary learning data recording area are replaced with new ones. Accumulate as learning data.
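The update rule of steps S53 to S56 keeps, for each start region, the episode that reached the target range in the fewest via points. A minimal sketch follows; the dictionary layout and the function name are illustrative assumptions, and the temporary buffer is modelled as a plain list.

```python
def finish_emotion_sequence(learning_data, start_region, temp_via_points):
    """Store the finished episode if it is new or shorter, then reset the buffer."""
    existing = learning_data.get(start_region)
    # S53/S54: nothing stored yet for this region. S55: the new episode is shorter.
    if existing is None or len(temp_via_points) < len(existing):
        learning_data[start_region] = list(temp_via_points)
    # S56: reset the temporary learning data.
    temp_via_points.clear()
```

Preferring the shorter sequence means that, over repeated conversations, the stored path for each start region tends toward the one that brought the user to the target range in the fewest trials.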

最後にステップＳ５６において一時学習データをリセットする。以上が感情シーケンス終了処理の詳細である。
以上、本実施形態に係る対話処理装置によれば、現在のユーザ感情よりも目標とする感情の収束状態に近い経由点での感情種別、及び感情の強さと対応するような語彙を用いて、応答文を作成することができる。これにより、目標とする感情の収束状態に緩やかに近づけるようなユーザ感情の変化を促すことができる。このような緩やかな感情変化を促す応答は、人間と対面しているかのような印象をユーザに与えることができる。 Finally, temporary learning data is reset in step S56. The above is the details of the emotion sequence end process.
As described above, according to the dialogue processing apparatus according to the present embodiment, using the vocabulary corresponding to the emotion type at the waypoint closer to the convergence state of the target emotion than the current user emotion, and the strength of the emotion, Response sentences can be created. As a result, it is possible to promote a change in user emotion that gently approaches the target emotion convergence state. Such a response that prompts a gentle emotional change can give the user the impression that they are facing humans.

（変形例）
以上、本発明に係る対話処理装置について、実施形態に基づいて説明したが、本発明はこれらの実施形態に限られない。例えば、以下のような変形例が考えられる。
（１）本発明は、各実施形態で説明したフローチャートの処理手順が開示する記録方法であるとしてもよい。また、これらの方法をコンピュータにより実現するプログラムであるとしてもよいし、前記プログラムからなるデジタル信号であるとしてもよい。
(Modification)
As mentioned above, although the dialogue processing apparatus concerning the present invention was explained based on the embodiment, the present invention is not limited to these embodiments. For example, the following modifications can be considered.
(1) The present invention may be a recording method disclosed by the processing procedure of the flowchart described in each embodiment. Further, the present invention may be a program that realizes these methods by a computer, or may be a digital signal composed of the program.

また、本発明は、前記プログラム又は前記デジタル信号をコンピュータ読み取り可能な記録媒体、例えば、フレキシブルディスク、ハードディスク、ＣＤ−ＲＯＭ、ＭＯ、ＤＶＤ、ＤＶＤ−ＲＯＭ、ＤＶＤ−ＲＡＭ、ＢＤ（Ｂｌｕ−ｒａｙＤｉｓｃ）、半導体メモリなど、に記録したものとしてもよい。
また、本発明は、前記プログラム又は前記デジタル信号を、電気通信回線、無線又は有線通信回線、インターネットを代表とするネットワーク等を経由して伝送するものとしてもよい。 The present invention also provides a computer-readable recording medium for the program or the digital signal, such as a flexible disk, hard disk, CD-ROM, MO, DVD, DVD-ROM, DVD-RAM, BD (Blu-ray Disc). It may be recorded in a semiconductor memory or the like.
In the present invention, the program or the digital signal may be transmitted via an electric communication line, a wireless or wired communication line, a network represented by the Internet, or the like.

また、前記プログラム又は前記デジタル信号を前記記録媒体に記録して移送することにより、又は前記プログラム又は前記デジタル信号を前記ネットワーク等を経由して移送することにより、独立した他のコンピュータシステムにより実施するとしてもよい。
（２）経由点は現在のユーザ感情よりも目標とする感情の収束状態に近い点であればよく、上記実施形態とは異なる手法により決定してもよい。 In addition, the program or the digital signal is recorded on the recording medium and transferred, or the program or the digital signal is transferred via the network or the like, and is executed by another independent computer system. It is good.
(2) The via point may be a point closer to the target emotion convergence state than the current user emotion, and may be determined by a method different from the above embodiment.

例えば、先ず現在点からθ成分だけ変化させ、Ｒ成分は現在点と同じ値に設定した経由点を決定し、このような経由点を用いて試行を繰り返した結果、無効試行が所定回数（例えば２回）連続した場合に、次の試行からは現在点からＲ成分だけ減少させ、θ成分は現在点と同じ値に設定した経由点を決定するという手法を用いてもよい。
（３）上記実施形態では、学習データを、数列ｆθ｛θ１、θ２、・・・、θｎ｝、数列ｆＲ｛Ｒ１、Ｒ２、・・・、Ｒｎ｝というデータ形式で保持するとしたが、学習データは他のデータ形式であってもよい。 For example, first, the θ component is changed from the current point, and the R component is determined as a via point set to the same value as the current point. As a result of repeating the trial using such a via point, invalid trials are performed a predetermined number of times (for example, In the case of two consecutive times, a method may be used in which the R component is decreased from the current point by the next trial and the via point is set to the same value as the current point for the θ component.
(3) In the above embodiment, the learning data is held in the data format of the sequence fθ {θ1, θ2,..., Θn} and the sequence fR {R1, R2,. May be in other data formats.

例えば、横軸を試行回数ｔ、縦軸をθ又はＲとした直交座標に過去のエピソードでの経由点をプロットし、これらから最小二乗法により近似直線θ＝ａｔ＋ｂ、又はＲ＝ａｔ＋ｂを求めることができる。このｆ（ｔ）＝ａｔ＋ｂを学習データとして、ｔ回目のθ、Ｒを決定してもよい。
また他の例として、上述の近似直線の傾きａを学習データとしてもよい。この場合には、初回試行の現在点を切片とした関数ｆ（ｔ）＝ａｔ＋θ１、ｆ（ｔ）＝ａｔ＋Ｒ１を生成して経由点を決定することができる。ここでθ１は初回試行での現在点のθ成分、Ｒ１は、初回試行での現在点Ｒ成分である。 For example, plotting via points in past episodes on Cartesian coordinates with the horizontal axis representing the number of trials t and the vertical axis representing θ or R, and obtaining an approximate line θ = at + b or R = at + b from these by the least square method Can do. Using this f (t) = at + b as learning data, the tth θ and R may be determined.
As another example, the inclination a of the above approximate line may be used as learning data. In this case, a function point f (t) = at + θ1 and f (t) = at + R1 with the current point of the first trial as an intercept can be generated to determine the via point. Here, θ1 is the θ component of the current point in the first trial, and R1 is the R component of the current point in the first trial.
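The two linear variants of modification (3) can be sketched with an ordinary least-squares fit over the points (t, value), t = 1, 2, ..., n. The function names are hypothetical, and the sketch assumes at least two recorded via points. For the slope-only variant it evaluates f(t) = at + θ1 exactly as the formula is written in the text.

```python
def fit_line(values):
    """Least-squares line through (t, values[t-1]) for t = 1..n; returns (a, b)."""
    n = len(values)
    ts = range(1, n + 1)
    mean_t = sum(ts) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, values))
    a = sxy / sxx
    return a, mean_v - a * mean_t


def predict(a, b, t):
    """Evaluate f(t) = a*t + b for the t-th trial."""
    return a * t + b


def predict_from_slope(a, first_value, t):
    """Slope-only variant: f(t) = a*t + first_value, as written in the text."""
    return a * t + first_value
```

For a past episode whose via-point θ components were 200, 180, 160, and 140 degrees, `fit_line` gives a slope of -20 and an intercept of 220, so the fitted line reproduces the recorded values at t = 1 to 4.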

（４）知識データベースを、対話処理装置の外部に設けてもよい。例えば、インターネットに接続されたサーバ装置を知識データベースとして機能させ、図１６に示すように対話処理装置に通信インターフェースを設け、無線ＬＡＮ等のネットワークを介して知識データベースと通信する構成としてもよい。このような変形例では、インターネットを介して知識データベースの収集語彙を適時更新することが容易になる。 (4) A knowledge database may be provided outside the dialogue processing apparatus. For example, a server device connected to the Internet may function as a knowledge database, and a communication interface may be provided in the dialog processing device as shown in FIG. 16 to communicate with the knowledge database via a network such as a wireless LAN. In such a modification, it becomes easy to update the collected vocabulary of the knowledge database in a timely manner via the Internet.

（５）上記実施形態では、ＳＴｅｍｏｔｉｏｎの出力を極座標上に合成して現在点を得るとした。しかしながら、ユーザ感情の現在点を表現する方法は、必ずしも極座標に限定されるものではない。
例えば、上記実施形態で説明した極座標に、ＳＴｅｍｏｔｉｏｎの出力値のうち「興奮」の値を対応させたｚ軸を追加することで、ユーザ感情の現在点を平面座標ではなく、空間座標で表現することもできる。 (5) In the above embodiment, the current point is obtained by combining the output of ST emotion on the polar coordinates. However, the method of expressing the current point of user emotion is not necessarily limited to polar coordinates.
For example, by adding a z-axis corresponding to the value of “excitement” among the output values of ST emotion to the polar coordinates described in the above embodiment, the current point of user emotion is expressed in spatial coordinates instead of plane coordinates You can also

（６）各実施形態及び変形例をそれぞれ組み合わせるとしてもよい。 (6) Each embodiment and modification may be combined.

本発明は、ユーザとの対話を行う玩具、ロボット装置等に有用である。 INDUSTRIAL APPLICABILITY The present invention is useful for toys, robot devices, and the like that interact with a user.

Description of Symbols

1 Dialogue processing apparatus
10 Microcomputer system
11 Speech recognition unit
12 Emotion recognition unit
13 Emotion expression control unit
14 Knowledge database
15 Response sentence creation unit
16 Speech synthesis unit
20 Microphone
30 Speaker
101 Coordinate conversion unit
102 Via-point calculation unit
103 Learning data management unit
104 Vocabulary selection unit
105 Mahalanobis table holding unit

Claims

A dialogue processing apparatus that recognizes a user's emotion and expresses emotion to the user using a knowledge database,
the apparatus including processing for obtaining, from the recognition result of the user's emotion, a current point in a coordinate system whose components are the type of emotion and the strength of emotion, and for determining, in this coordinate system, a via point that is closer than the current point to a target emotion convergence state, wherein
the knowledge database holds each of a plurality of vocabulary items and fixed phrases in association with a parameter corresponding to the type of emotion and the strength of emotion, and
the emotion is expressed to the user by searching the knowledge database using a parameter corresponding to the type of emotion and the strength of emotion at the via point to obtain at least one of a vocabulary item and a fixed phrase to be used in the dialogue, and by creating a response sentence using the obtained result.

The dialogue processing apparatus according to claim 1, wherein
the coordinate system is a polar coordinate system in which a type of emotion is assigned to each region of the θ component and the strength of emotion is the R component, and
the parameter is a Mahalanobis distance, in an orthogonal coordinate system whose X and Y components are the type of emotion and the strength of emotion, with the vocabulary items and fixed phrases recorded in the knowledge database as the population.

The dialogue processing apparatus according to claim 2, wherein the types of emotion indicated by the θ component of the polar coordinate system are ease, liking, joy, surprise, jealousy, anger, jealousy, sadness, fear, and shame, arranged clockwise in this order in the polar coordinate system.

The dialogue processing apparatus according to claim 3, wherein the via point is determined by executing at least one of:
a first process of setting, as the θ component of the via point, a value closer than the θ component of the current point to the regions to which the emotion types of ease, liking, and joy are assigned; and a second process of setting, as the R component of the via point, a value obtained by weakening the strength of emotion relative to the R component of the current point.

The dialogue processing apparatus according to claim 2, wherein the dialogue processing apparatus has a learning function that learns, by reinforcement learning, the method of determining the via point by repeating trials in which the created response sentence is presented to the user and the user's emotion is newly recognized after the presentation.

The dialogue processing apparatus according to claim 5, wherein the learning function
stores the via point as temporary learning data when the difference between the θ component of the via point and the θ component of a new current point obtained from the newly recognized user emotion, and the difference between the R component of the via point and the R component of the new current point, are each within a specified value, and
sets the temporary learning data as a new learning result if, when the recognized user emotion reaches the target emotion convergence state, the number of via points stored as temporary learning data is smaller than that of the existing learning result.

The dialogue processing apparatus according to claim 6, wherein
the learning result is stored as functions θ = f(t) and R = f(t) that are derived from the trial number t, and
the via point is determined by obtaining θ and R from these functions.
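
The learning rule recited for the learning function (storing via points as temporary learning data, then adopting them when they reach the target state in fewer steps) can be sketched as below. This is only an illustrative reading: the names `is_close` and `update_learning`, the tuple layout `(theta, R)`, the inclusive tolerance test, and the plain absolute difference for θ (which ignores angle wrap-around) are assumptions introduced here.

```python
def is_close(via, new_current, theta_tol, r_tol):
    """True if the newly recognized current point landed near the via point.

    via and new_current are (theta, R) tuples. The theta difference is a
    plain absolute difference; wrap-around at 360 degrees is ignored in
    this sketch.
    """
    return (abs(via[0] - new_current[0]) <= theta_tol
            and abs(via[1] - new_current[1]) <= r_tol)


def update_learning(existing, temporary):
    """Adopt the temporary via points as the new learning result if they
    reached the target state with fewer via points than the existing
    result. Adopting them when no result exists yet is an assumption."""
    if existing is None or len(temporary) < len(existing):
        return list(temporary)
    return existing


# Hypothetical run: the user followed two via points to the target state.
temporary = []
for via, observed in [((150.0, 6.0), (155.0, 6.5)), ((100.0, 4.0), (98.0, 4.2))]:
    if is_close(via, observed, theta_tol=10.0, r_tol=1.0):
        temporary.append(via)
print(update_learning([(170.0, 7.0), (140.0, 5.0), (110.0, 3.0)], temporary))
```

A shorter list of via points means the user was guided to the target emotion convergence state in fewer trials, which is why the shorter sequence replaces the longer one.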