JP7746447B2

JP7746447B2 - electronic equipment

Info

Publication number: JP7746447B2
Application number: JP2024062824A
Authority: JP
Inventors: 正義孫
Original assignee: SoftBank Group Corp
Current assignee: SoftBank Group Corp
Priority date: 2023-04-20
Filing date: 2024-04-09
Publication date: 2025-09-30
Anticipated expiration: 2044-04-09

Description

The disclosed embodiments relate to an electronic device.

Techniques for determining an appropriate robot action in response to a user's state have been disclosed (see, for example, Patent Document 1). Patent Document 1 discloses a robot that recognizes the user's reaction when the robot performs a specific action and, if it cannot determine an action in response to the recognized reaction, updates its behavior by receiving from a server information about actions suited to the recognized state of the user.

Patent Document 1: Japanese Patent No. 6053847

However, the conventional technology leaves room for improvement in executing appropriate actions in response to the user's behavior.

The present invention has been made in view of the above, and aims to provide an electronic device that can execute appropriate actions.

An electronic device according to one aspect of the embodiment recognizes the behavior of a user giving a presentation, determines its own behavior corresponding to the recognized behavior of the user, and controls a control target based on the determined behavior.

According to one aspect of the embodiment, appropriate actions can be executed.

FIG. 1 is a diagram schematically illustrating an example of a control system according to this embodiment.
FIG. 2 is a diagram schematically illustrating the functional configuration of the robot.
FIG. 3 is a diagram schematically illustrating an example of an operational flow for determining an action in the robot.
FIG. 4 is a diagram schematically illustrating an example of the hardware configuration of a computer that functions as the robot and the server.
FIG. 5 is a diagram showing an emotion map onto which multiple emotions are mapped.
FIG. 6 is a diagram showing another example of an emotion map.
FIG. 7 is a diagram showing an example of an emotion table.
FIG. 8 is a diagram showing an example of an emotion table.

The present invention is described below through embodiments, but the following embodiments do not limit the invention as claimed. Not all combinations of the features described in the embodiments are necessarily essential to the solution of the invention. In this embodiment, a "robot" is used as an example of the electronic device. Besides a robot, the electronic device may be a stuffed toy, a portable terminal device such as a smartphone, an input device such as a smart speaker, or the like.

FIG. 1 is a diagram schematically illustrating an example of a control system 1 according to this embodiment. As shown in FIG. 1, the control system 1 includes a plurality of robots 100, a linked device 400, and a server 300. Each of the robots 100 is managed by a user.

The robot 100 converses with the user and provides video to the user. In doing so, the robot 100 cooperates with the server 300 and other devices with which it can communicate via a communication network 20. For example, the robot 100 not only learns appropriate conversation on its own, but also cooperates with the server 300 to learn how to converse with the user more appropriately. The robot 100 also has the server 300 record video data of the user that it has captured, requests the video data from the server 300 as needed, and provides it to the user.

The robot 100 also has emotion values representing the types of its own emotions. For example, the robot 100 has an emotion value representing the intensity of each of the following emotions: "joy," "anger," "sorrow," "pleasure," "comfort," "discomfort," "peace of mind," "anxiety," "sadness," "excitement," "worry," "relief," "fulfillment," "emptiness," and "neutral." For example, when the robot 100 converses with the user while its emotion value for excitement is high, it speaks at a fast pace. In this way, the robot 100 can express its own emotions through its actions.

The robot 100 may also be configured to determine its behavior in response to the emotions of the user 10 by combining a sentence generation model (a so-called AI (Artificial Intelligence) chat engine) with an emotion engine. Specifically, the robot 100 may be configured to recognize the behavior of the user 10, determine the emotion of the user 10 associated with that behavior, and determine the behavior of the robot 100 corresponding to the determined emotion.

More specifically, when the robot 100 recognizes the behavior of the user 10, it uses a preset sentence generation model to automatically generate the action the robot 100 should take in response to that behavior. The sentence generation model may be understood as the algorithms and computations for automatic text-based dialogue processing. Sentence generation models are publicly known, as disclosed in, for example, JP 2018-081444 A and ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>), so a detailed description is omitted. Such a sentence generation model is built on a large language model (LLM). By combining a large language model with an emotion engine, this embodiment can reflect the emotions of the user 10 and the robot 100, together with a wide range of linguistic information, in the behavior of the robot 100. In other words, according to this embodiment, combining a sentence generation model with an emotion engine yields a synergistic effect.

The robot 100 also has a function for recognizing the user's behavior. The robot 100 recognizes the user's behavior by analyzing the user's facial image acquired with its camera function and the user's voice acquired with its microphone function. Based on the recognized behavior and other information, the robot 100 determines the action it will perform.

The robot 100 stores rules that define the actions it performs based on the user's emotions, the robot 100's own emotions, and the user's behavior, and performs various actions in accordance with those rules.

Specifically, the robot 100 has reaction rules for determining its behavior based on the user's emotions, the robot 100's emotions, and the user's behavior. For example, the reaction rules define "laughing" as the robot 100's behavior when the user's behavior is "laughing." They define "apologizing" as the robot 100's behavior when the user's behavior is "getting angry," "answering" when the user's behavior is "asking a question," and "calling out" when the user's behavior is "being sad."

When the robot 100 recognizes, based on the reaction rules, that the user's behavior is "getting angry," it selects the behavior "apologizing" defined in the reaction rules as the action to perform. For example, when the robot 100 selects "apologizing," it performs an apologizing motion and outputs speech expressing words of apology.

The reaction rules also define that when the robot 100's emotion is "neutral" (that is, "joy" = 0, "anger" = 0, "sorrow" = 0, "pleasure" = 0) and the condition that the user's state is "alone and looking lonely" is satisfied, the robot 100's emotion changes to "worried" and the robot 100 can perform the action "calling out."

When the robot 100 recognizes, based on the reaction rules, that its current emotion is "neutral" and that the user is alone and looks lonely, it increases its emotion value for "sorrow." The robot 100 also selects the behavior "calling out" defined in the reaction rules as the action to perform toward the user. For example, when the robot 100 selects "calling out," it converts the words "What's wrong?", which express concern, into a worried-sounding voice and outputs it.
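The reaction rules described above can be pictured as a small lookup plus one state-based rule. The following is a minimal sketch, not the patented implementation: the names `Robot`, `BEHAVIOR_RULES`, and `react`, the string labels, and the increment of 1 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; the source does not specify a data layout.
EMOTIONS = ("joy", "anger", "sorrow", "pleasure")

# User behavior -> robot behavior, taken from the examples in the text.
BEHAVIOR_RULES = {
    "laugh": "laugh",
    "get angry": "apologize",
    "ask a question": "answer",
    "be sad": "call out",
}


@dataclass
class Robot:
    # Each emotion value is an intensity; all zeros means "neutral".
    emotions: dict = field(default_factory=lambda: {e: 0 for e in EMOTIONS})

    def is_neutral(self) -> bool:
        return all(v == 0 for v in self.emotions.values())

    def react(self, user_behavior=None, user_state=None):
        """Return the selected robot behavior, or None if no rule matches."""
        # State-based rule: neutral robot + lonely user -> raise sorrow, call out.
        if user_state == "alone and lonely" and self.is_neutral():
            self.emotions["sorrow"] += 1  # step size is an assumption
            return "call out"
        return BEHAVIOR_RULES.get(user_behavior)
```

In this sketch the state-based rule fires only while every emotion value is zero, which mirrors the "neutral" precondition in the text; a real rule set would presumably also condition on the user's emotion value.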

The robot 100 also transmits to the server 300 user reaction information indicating that this action elicited a positive reaction from the user. The user reaction information includes, for example, the user behavior "getting angry," the robot 100's behavior "apologizing," the fact that the user's reaction was positive, and the user's attributes.

The server 300 stores the user reaction information received from each robot 100. The server 300 then analyzes the user reaction information from the robots 100 and updates the reaction rules.

The robot 100 receives the updated reaction rules from the server 300 by querying the server 300 for them. The robot 100 incorporates the updated reaction rules into the reaction rules it stores. This allows the robot 100 to incorporate reaction rules acquired by other robots 100 into its own. Alternatively, when the reaction rules are updated, the server 300 may send them to the robot 100 automatically.
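The report, analyze, and merge cycle described above can be sketched as follows. The source does not say how the server analyzes reactions, so this sketch assumes a simple positive-reaction rate per (user behavior, robot behavior) pair; `RuleServer`, `merge_rules`, and `min_reports` are hypothetical names, and user attributes are left out for brevity.

```python
from collections import defaultdict


class RuleServer:
    """Toy stand-in for the server 300: aggregates reactions, publishes rules."""

    def __init__(self, min_reports=2):
        self.min_reports = min_reports  # assumed cutoff, not from the source
        # (user_behavior, robot_behavior) -> [positive_count, total_count]
        self.stats = defaultdict(lambda: [0, 0])

    def report(self, user_behavior, robot_behavior, positive):
        """Record one piece of user reaction information from a robot."""
        entry = self.stats[(user_behavior, robot_behavior)]
        entry[0] += int(positive)
        entry[1] += 1

    def updated_rules(self):
        """Per user behavior, pick the robot behavior with the best positive rate."""
        best = {}
        for (user_b, robot_b), (pos, total) in self.stats.items():
            if total < self.min_reports:
                continue  # too few reports to trust
            rate = pos / total
            if user_b not in best or rate > best[user_b][1]:
                best[user_b] = (robot_b, rate)
        return {user_b: robot_b for user_b, (robot_b, _) in best.items()}


def merge_rules(local_rules, server_rules):
    """Robot side: fold the server's rules into a copy of the local rules."""
    merged = dict(local_rules)
    merged.update(server_rules)
    return merged
```

Because every robot reports to the same server, a rule learned from one robot's users reaches the others the next time they call something like `merge_rules`.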

The robot 100 can also perform actions in cooperation with linked devices 400. Examples of linked devices 400 include karaoke machines, wine cellars, refrigerators, terminal devices (PCs (personal computers), smartphones, tablets, and the like), washing machines, automobiles, cameras, toilet equipment, electric toothbrushes, televisions, displays, furniture (closets and the like), medicine boxes, musical instruments, lighting equipment, and exercise toys (unicycles and the like). These linked devices 400 are communicably connected to the robot 100 via the communication network 20 and exchange information with the robot 100. With this configuration, a linked device 400 controls itself, converses with the user, and so on in accordance with instructions from the robot 100.

This disclosure describes an example in which a terminal device serving as a linked device 400 (a PC (personal computer) 400a, a smartphone 400b, a tablet 400c, or the like) cooperates with the robot 100 to perform various actions for the user.

The robot 100 recognizes the behavior of a user giving a presentation, determines its own behavior corresponding to the recognized behavior of the user, and controls the control target based on the determined behavior. Specifically, when the user practices delivering a presentation, the robot 100 performs actions that raise a predetermined degree of completeness of the content of the user's presentation.

As described above, when the user practices delivering a presentation, the robot 100 performs actions that raise the completeness of the presentation, such as listening to the presentation, reacting to it, pointing out issues in its content, and suggesting improvements. For example, the robot 100 analyzes the content of the presentation during practice, extracts the parts where a predetermined completeness indicator is below a threshold, and determines a predetermined feedback action for the extracted parts. Specifically, the robot 100 uses any one or a combination of the following as the completeness indicator: typos, omitted characters, factual errors, and richness of the presentation content, and the user's voice volume, delivery speed, line of sight, and changes in emotion. It then extracts the parts where the indicator is below the threshold. For example, as an action that raises completeness, the robot 100 listens to the practice session and analyzes the content of the presentation, and gives feedback on any part where a preset indicator is below a predetermined threshold. Feedback here means comments, guidance, and suggestions for raising the completeness of the user's presentation, and includes, for example, correcting typos, changing wording, deleting or adding content, changing the structure, and improving voice volume, posture, and demeanor during delivery.
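The threshold-based extraction described above can be sketched as follows. This is an illustration only: the source names the indicators but not how they are scored, so the 0-to-1 score scale, the `Section` type, and the values in `THRESHOLDS` and `FEEDBACK` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Section:
    title: str
    scores: dict  # indicator name -> score in [0, 1]; the scale is an assumption


# Hypothetical thresholds and feedback messages for three of the indicators.
THRESHOLDS = {"spelling": 0.9, "pace": 0.6, "volume": 0.5}
FEEDBACK = {
    "spelling": "Correct the typos.",
    "pace": "Explain a little more slowly.",
    "volume": "Speak a little louder.",
}


def extract_feedback(sections):
    """Return (section title, indicator, message) for each score below its threshold."""
    results = []
    for section in sections:
        for indicator, threshold in THRESHOLDS.items():
            score = section.scores.get(indicator)
            # Indicators that were not measured for a section are skipped.
            if score is not None and score < threshold:
                results.append((section.title, indicator, FEEDBACK[indicator]))
    return results
```

Keeping thresholds and messages in separate tables means a new indicator, such as line of sight, would need an entry in both tables and no change to the extraction loop.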

When the robot 100 recognizes that the user is practicing a presentation, it also determines an action that lifts the spirits of the user giving the presentation. Specifically, when the robot 100 recognizes that the user is practicing a presentation, it can say things to the user such as "That's a wonderful presentation!" or "Your voice is loud and easy to understand!" as an action that lifts the user's spirits.

The robot 100 can also comment on the content of the presentation at the moment the user finishes practicing. Specifically, when the user has finished practicing the presentation, the robot 100 can say things such as "That's a wonderful presentation!" or "Your voice is loud and easy to understand!" as an action that lifts the user's spirits. When the user has finished practicing, the robot 100 can also say things such as "It would be even better if you explained it a little more slowly" or "The XX part was wrong, so let's fix it" in order to improve the content of the presentation.

When the robot 100 receives speech from the user asking it to raise the completeness of the presentation content, it determines an action that does so. Specifically, during or after the user's presentation practice, the robot 100 can determine an action that raises the completeness of the presentation content based on what the user says. For example, when the user asks, "Is there anything that can be improved in the content of the presentation?", the robot 100 can make suggestions such as "It would be even better if you explained it a little more slowly" or "The XX part was wrong, so let's fix it."

The robot 100 can also identify the content of the presentation and determine a predetermined feedback action tailored to the identified content. Specifically, the robot 100 can analyze the content of the user's presentation and determine an action that raises the completeness of that content. For example, as an action that raises completeness, the robot 100 acquires and analyzes the content of the presentation and, if a preset indicator is below a predetermined threshold, gives the feedback described above on the relevant part. Note that "acquire" here is not limited to spoken input such as the user's presentation practice; an information processing device or the like may read in and analyze data containing the presentation content.

When the user practices delivering a presentation, the robot 100 also performs, based on the user's state and behavior, actions that raise the predetermined degree of completeness of the presentation content. For example, when the user is holding a terminal device and the robot 100 recognizes that "the user's delivery speed in the presentation practice is too fast," it can talk to the user conversationally and suggest improvements to the presentation.

In this way, by acting in cooperation with a terminal device (a PC, smartphone, tablet, or the like), the robot 100 in the present disclosure can perform actions that raise the completeness of the user's presentation; that is, it can practice the presentation together with the user. In other words, the robot 100 according to the present disclosure can execute appropriate actions for the user.

FIG. 2 is a diagram schematically illustrating the functional configuration of the robot 100. The robot 100 is configured by a control unit that includes a sensor unit 200, a sensor module unit 210, a storage unit 220, a user state recognition unit 230, an emotion determination unit 232, a behavior recognition unit 234, a behavior determination unit 236, a memory control unit 238, a behavior control unit 250, a control target 252, and a communication processing unit 280.

The control target 252 includes a display device, a speaker, LEDs in the eyes, and motors that drive the arms, hands, legs, and the like. The posture and gestures of the robot 100 are controlled by controlling the motors of the arms, hands, legs, and the like. Some of the robot 100's emotions can be expressed by controlling these motors. The facial expression of the robot 100 can also be expressed by controlling the light emission state of the LEDs in its eyes. The display device is provided, for example, on the chest of the robot 100. The facial expression of the robot 100 can also be expressed by controlling what the display device shows. The display device may display the content of a conversation with the user as text. The posture, gestures, and facial expressions of the robot 100 are examples of the robot 100's demeanor.

The sensor unit 200 includes a microphone 201, a 3D depth sensor 202, a 2D camera 203, a distance sensor 204, an acceleration sensor 205, a thermal sensor 206, and a touch sensor 207. The microphone 201 continuously detects sound and outputs audio data. The microphone 201 may be provided on the head of the robot 100 and may have a binaural recording function. The 3D depth sensor 202 detects the contour of an object by continuously projecting an infrared pattern and analyzing that pattern in infrared images captured continuously by an infrared camera. The 2D camera 203 is an example of an image sensor. The 2D camera 203 captures images using visible light and generates visible-light video information. The contour of an object may instead be detected from the video information generated by the 2D camera 203. The distance sensor 204 detects the distance to an object by emitting, for example, a laser or ultrasonic waves. The acceleration sensor 205 is, for example, a gyro sensor, and detects the acceleration of the robot 100. The thermal sensor 206 detects the temperature around the robot 100. The touch sensor 207 detects touch operations by the user and is placed, for example, on the head and hands of the robot 100. The sensor unit 200 may also include a clock, sensors for motor feedback, and the like.

Of the components of the robot 100 shown in FIG. 2, those other than the control target 252 and the sensor unit 200 are examples of components of the behavior control system of the robot 100. The behavior control system of the robot 100 controls the control target 252.

The storage unit 220 includes reaction rules 221 and history data 222. The history data 222 includes the history of the user's past emotion values and behavior. This history is recorded per user, for example by associating it with the user's identification information. At least part of the storage unit 220 is implemented by a storage medium such as memory. The storage unit 220 may also include a person DB that stores users' facial images, user attribute information, and the like. The functions of the components of the robot 100 shown in FIG. 2, other than the control target 252, the sensor unit 200, and the storage unit 220, can be realized by a CPU operating according to a program. For example, the functions of these components can be implemented as CPU operations by an operating system (OS) and programs running on the OS.

The sensor module unit 210 includes a voice emotion recognition unit 211, a speech understanding unit 212, a facial expression recognition unit 213, and a face recognition unit 214. Information detected by the sensor unit 200 is input to the sensor module unit 210. The sensor module unit 210 analyzes that information and outputs the analysis results to the user state recognition unit 230.

The voice emotion recognition unit 211 of the sensor module unit 210 analyzes the user's voice detected by the microphone 201 and recognizes the user's emotion. For example, the voice emotion recognition unit 211 extracts features such as the frequency components of the voice and recognizes the user's emotion based on the extracted features. The speech understanding unit 212 analyzes the user's voice detected by the microphone 201 and outputs text information representing what the user said. For example, the speech understanding unit 212 can analyze a question posed to the robot 100, such as "Is there anything that can be improved in the content of the presentation?", and output text information representing what the user said.

The facial expression recognition unit 213 recognizes the user's facial expression and emotion from images of the user captured by the 2D camera 203. For example, the facial expression recognition unit 213 recognizes the user's facial expression and emotion based on the shape and positional relationship of the eyes and mouth. For example, it can recognize the user's facial expression and emotion while the user is giving a presentation or asking the robot 100 a question.

The face recognition unit 214 recognizes the user's face. It identifies the user by matching facial images stored in the person DB (not shown) against the facial image of the user captured by the 2D camera 203.

The user state recognition unit 230 recognizes the user's state based on the information analyzed by the sensor module unit 210. For example, it uses the analysis results of the sensor module unit 210 to perform processing that mainly concerns perception. For example, the user state recognition unit 230 generates perceptual information such as "The user is practicing a presentation" or "The user's speaking speed exceeds a predetermined threshold," and then performs processing to understand the meaning of the generated perceptual information. For example, the user state recognition unit 230 generates semantic information such as "The user's delivery speed in the presentation practice is too fast."
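The two-stage flow described above, perceptual information first and semantic information second, can be sketched as follows. The functions `perceive` and `interpret`, the words-per-minute measure, and the 160 wpm default are assumptions for illustration; the source only says the speaking speed is compared with a "predetermined threshold."

```python
def perceive(transcript: str, duration_s: float, max_wpm: float = 160.0) -> dict:
    """Build perceptual facts from a transcript and its duration in seconds.

    max_wpm is an assumed threshold; the source only calls it "predetermined".
    """
    words_per_minute = len(transcript.split()) / duration_s * 60.0
    return {
        # Assumed to be supplied by upstream analysis in a real system.
        "practicing_presentation": True,
        "speech_rate_exceeds_threshold": words_per_minute > max_wpm,
    }


def interpret(percepts: dict):
    """Combine perceptual facts into one semantic statement, or None."""
    if percepts["practicing_presentation"] and percepts["speech_rate_exceeds_threshold"]:
        return "The user's delivery speed in the presentation practice is too fast."
    return None
```

Separating the two steps keeps each perceptual fact reusable: the same speech-rate flag could feed other semantic statements without being recomputed.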

The emotion determination unit 232 determines an emotion value indicating the user's emotion based on the information analyzed by the sensor module unit 210 and the user's state recognized by the user state recognition unit 230. For example, it inputs the analyzed information and the recognized user state into a pre-trained neural network and obtains an emotion value indicating the user's emotion.

Here, the emotion value indicating the user's emotion is a value that expresses whether the emotion is positive or negative. If the user's emotion is a bright emotion accompanied by pleasure or ease, such as "joy," "pleasure," "comfort," "peace of mind," "excitement," "relief," or "fulfillment," the value is positive, and the brighter the emotion, the larger the value. If the user's emotion is an unpleasant one, such as "anger," "sorrow," "discomfort," "anxiety," "sadness," "worry," or "emptiness," the value is negative, and the more unpleasant the emotion, the larger the absolute value of the negative value. If the user's emotion is none of the above ("neutral"), the value is 0.
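The sign convention described above can be sketched as follows. The source obtains the value from a pre-trained neural network, so this lookup is not how the value is computed; it only illustrates the positive, negative, and zero cases. The function name and the `intensity` argument are assumptions.

```python
# Emotion labels grouped as in the text.
POSITIVE = {"joy", "pleasure", "comfort", "peace of mind",
            "excitement", "relief", "fulfillment"}
NEGATIVE = {"anger", "sorrow", "discomfort", "anxiety",
            "sadness", "worry", "emptiness"}


def user_emotion_value(emotion: str, intensity: float) -> float:
    """Signed emotion value: positive for bright emotions, negative for unpleasant ones.

    intensity is a non-negative strength; its scale is an assumption.
    """
    if intensity < 0:
        raise ValueError("intensity must be non-negative")
    if emotion in POSITIVE:
        return float(intensity)
    if emotion in NEGATIVE:
        return -float(intensity)
    return 0.0  # "neutral" or any label not listed above
```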

また、感情決定部２３２は、センサモジュール部２１０で解析された情報、及びユーザ状態認識部２３０によって認識されたユーザの状態に基づいて、ロボット１００の感情を示す感情値を決定する。 In addition, the emotion determination unit 232 determines an emotion value indicating the emotion of the robot 100 based on the information analyzed by the sensor module unit 210 and the user's state recognized by the user state recognition unit 230.

ロボット１００の感情値は、複数の感情分類の各々に対する感情値を含み、例えば、「喜」、「怒」、「哀」、「楽」それぞれの強さを示す値（０～５）である。 The emotion value of the robot 100 includes emotion values for each of multiple emotion categories, and is a value (0 to 5) indicating the strength of each of the emotions, for example, "joy," "anger," "sadness," and "happiness."

具体的には、感情決定部２３２は、センサモジュール部２１０で解析された情報、及びユーザ状態認識部２３０によって認識されたユーザの状態に対応付けて定められた、ロボット１００の感情値を更新するルールに従って、ロボット１００の感情を示す感情値を決定する。 Specifically, the emotion determination unit 232 determines an emotion value indicating the emotion of the robot 100 in accordance with rules for updating the emotion value of the robot 100 that are determined in association with the information analyzed by the sensor module unit 210 and the user's state recognized by the user state recognition unit 230.

例えば、感情決定部２３２は、ユーザ状態認識部２３０によってユーザが寂しそうと認識された場合、ロボット１００の「哀」の感情値を増大させる。また、ユーザ状態認識部２３０によってユーザが笑顔になったと認識された場合、ロボット１００の「喜」の感情値を増大させる。 For example, if the user state recognition unit 230 recognizes that the user looks lonely, the emotion determination unit 232 increases the "sadness" emotion value of the robot 100. Also, if the user state recognition unit 230 recognizes that the user is smiling, the emotion determination unit 232 increases the "joy" emotion value of the robot 100.

なお、感情決定部２３２は、ロボット１００の状態を更に考慮して、ロボット１００の感情を示す感情値を決定してもよい。例えば、ロボット１００のバッテリー残量が少ない場合やロボット１００の周辺環境が真っ暗な場合等に、ロボット１００の「哀」の感情値を増大させてもよい。更にバッテリー残量が少ないにも関わらず継続して話しかけてくるユーザの場合は、「怒」の感情値を増大させても良い。 The emotion determination unit 232 may determine the emotion value indicating the emotion of the robot 100 by further considering the state of the robot 100. For example, if the battery level of the robot 100 is low or if the surrounding environment of the robot 100 is pitch black, the emotion value of "sadness" of the robot 100 may be increased. Furthermore, if the user continues to talk to the robot 100 despite the battery level being low, the emotion value of "anger" may be increased.
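The rule-based update of the robot's emotion values described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the condition names, the increment of 1 per rule, and the clamping to the 0 to 5 range are hypothetical choices, since the description states only that values lie in 0 to 5 and that certain conditions increase certain emotions.

```python
# Emotion categories of the robot (喜, 怒, 哀, 楽), each valued 0 to 5.
EMOTIONS = ("joy", "anger", "sadness", "happiness")
MIN_VALUE, MAX_VALUE = 0, 5

# Hypothetical rule table: recognized condition -> (emotion, increment).
# The condition labels and the increment size are assumptions for illustration.
UPDATE_RULES = {
    "user_looks_lonely": ("sadness", 1),
    "user_smiled": ("joy", 1),
    "battery_low": ("sadness", 1),
    "surroundings_dark": ("sadness", 1),
    "user_keeps_talking_on_low_battery": ("anger", 1),
}


def update_robot_emotion(values, conditions):
    """Return a new emotion dict after applying every matching rule."""
    updated = dict(values)
    for condition in conditions:
        rule = UPDATE_RULES.get(condition)
        if rule is None:
            continue  # unknown conditions leave the emotions unchanged
        emotion, delta = rule
        # Clamp so that values stay inside the 0 to 5 range.
        updated[emotion] = max(MIN_VALUE, min(MAX_VALUE, updated[emotion] + delta))
    return updated
```

Returning a new dictionary rather than mutating the input keeps the previous emotion values available, which is convenient if a history of the robot's past emotion values is also kept.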

行動認識部２３４は、センサモジュール部２１０で解析された情報、及びユーザ状態認識部２３０によって認識されたユーザの状態に基づいて、ユーザの行動を認識する。例えば、センサモジュール部２１０で解析された情報、及び認識されたユーザの状態を、予め学習されたニューラルネットワークに入力し、予め定められた複数の行動分類（例えば、「笑う」、「怒る」、「質問する」、「悲しむ」）の各々の確率を取得し、最も確率の高い行動分類を、ユーザの行動として認識する。例えば、行動認識部２３４は、ユーザの「端末装置を操作する」、「プレゼンテーションの内容を発話する」、「発表中に話す内容を考える」、「質問に回答する」等といったユーザの行動を認識する。 The behavior recognition unit 234 recognizes user behavior based on the information analyzed by the sensor module unit 210 and the user's state recognized by the user state recognition unit 230. For example, the information analyzed by the sensor module unit 210 and the recognized user's state are input into a pre-trained neural network, the probability of each of multiple predetermined behavior categories (e.g., "laughing," "angry," "asking a question," "sad") is obtained, and the behavior category with the highest probability is recognized as the user's behavior. For example, the behavior recognition unit 234 recognizes user behavior such as "operating a terminal device," "speaking the content of a presentation," "thinking about what to say during a presentation," "answering a question," etc.
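The final selection step of the behavior recognition can be illustrated with a minimal sketch. The neural network itself is not modeled here; the function simply assumes that a probability per behavior category has already been obtained, and the function name is hypothetical.

```python
def recognize_behavior(probabilities):
    """Pick the behavior category with the highest probability.

    `probabilities` maps each predetermined behavior category to the
    probability produced by a pre-trained model (not modeled here).
    """
    if not probabilities:
        raise ValueError("at least one behavior category is required")
    return max(probabilities, key=probabilities.get)


# Example with made-up probabilities for four categories.
example = {"laugh": 0.05, "get angry": 0.10, "ask a question": 0.70, "be sad": 0.15}
recognized = recognize_behavior(example)
```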

以上のように、本実施形態では、ロボット１００は、ユーザを特定したうえでユーザの発話内容を取得するが、当該発話内容の取得と利用等に際してはユーザから法令に従った必要な同意を取得するほか、本実施形態に係るロボット１００の行動制御システムは、ユーザの個人情報及びプライバシーの保護に配慮する。 As described above, in this embodiment, the robot 100 identifies the user and then acquires the user's speech content. However, when acquiring and using the speech content, the robot 100 obtains the necessary consent from the user in accordance with laws and regulations. Furthermore, the behavior control system of the robot 100 in this embodiment takes into consideration the protection of the user's personal information and privacy.

行動決定部２３６は、感情決定部２３２により決定されたユーザの現在の感情値と、ユーザの現在の感情値が決定されるよりも前に感情決定部２３２により決定された過去の感情値の履歴データ２２２と、ロボット１００の感情値とに基づいて、行動認識部２３４によって認識されたユーザの行動に対応する行動を決定する。本実施形態では、行動決定部２３６は、ユーザの過去の感情値として、履歴データ２２２に含まれる直近の１つの感情値を用いる場合について説明するが、開示の技術はこの態様に限定されない。例えば、行動決定部２３６は、ユーザの過去の感情値として、直近の複数の感情値を用いてもよいし、一日前等の単位期間の分だけ前の感情値を用いてもよい。また、行動決定部２３６は、ロボット１００の現在の感情値だけでなく、ロボット１００の過去の感情値の履歴を更に考慮して、ユーザの行動に対応する行動を決定してもよい。行動決定部２３６が決定する行動は、ロボット１００が行うジェスチャー又はロボット１００の発話内容を含む。 The behavior determination unit 236 determines a behavior corresponding to the user's behavior recognized by the behavior recognition unit 234 based on the user's current emotion value determined by the emotion determination unit 232, the history data 222 of past emotion values determined by the emotion determination unit 232 before the user's current emotion value was determined, and the emotion value of the robot 100. This embodiment describes a case in which the behavior determination unit 236 uses the single most recent emotion value included in the history data 222 as the user's past emotion value, but the disclosed technology is not limited to this form. For example, the behavior determination unit 236 may use multiple most recent emotion values as the user's past emotion value, or may use an emotion value from one unit period earlier, such as one day ago. Furthermore, the behavior determination unit 236 may determine a behavior corresponding to the user's behavior by taking into consideration not only the robot 100's current emotion value but also the history of the robot 100's past emotion values. The behavior determined by the behavior determination unit 236 includes a gesture performed by the robot 100 or the content of speech uttered by the robot 100.

なお、行動決定部２３６は、ロボット１００の感情に基づいて、ユーザの行動に対応する行動を決定してもよい。例えば、ロボット１００がユーザから暴言をかけられた場合や、ユーザに横柄な態度をとられている場合（すなわち、ユーザの反応が不良である場合）、周囲の騒音が騒がしくユーザの音声を検出できない場合、ロボット１００のバッテリー残量が少ない場合などにおいて、ロボット１００の「怒」や「哀」の感情値が増大した場合、行動決定部２３６は、「怒」や「哀」の感情値の増大に応じた行動を、ユーザの行動に対応する行動として決定してもよい。また、ユーザの反応が良好である場合や、ロボット１００のバッテリー残量が多い場合などにおいて、ロボット１００の「喜」や「楽」の感情値が増大した場合、行動決定部２３６は、「喜」や「楽」の感情値の増大に応じた行動を、ユーザの行動に対応する行動として決定してもよい。また、行動決定部２３６は、ロボット１００の「怒」や「哀」の感情値を増大させたユーザに対する行動とは異なる行動を、ロボット１００の「喜」や「楽」の感情値を増大させたユーザに対する行動として決定してもよい。このように、行動決定部２３６は、ロボット自身の感情そのものや、ユーザの行動によってユーザがロボット１００の感情をどのように変化させたかに応じて、異なる行動を決定すればよい。 The behavior determination unit 236 may determine a behavior corresponding to the user's behavior based on the emotions of the robot 100. For example, if the robot 100's emotion value of "anger" or "sadness" increases, such as when the user verbally abuses the robot 100 or treats it arrogantly (i.e., the user's reaction is poor), when ambient noise is too loud for the user's voice to be detected, or when the robot 100's battery level is low, the behavior determination unit 236 may determine, as the behavior corresponding to the user's behavior, a behavior that reflects the increase in the emotion value of "anger" or "sadness." Also, if the robot 100's emotion value of "joy" or "happiness" increases, such as when the user's reaction is good or when the robot 100's battery level is high, the behavior determination unit 236 may determine, as the behavior corresponding to the user's behavior, a behavior that reflects the increase in the emotion value of "joy" or "happiness." Furthermore, the behavior determination unit 236 may determine, for a user who has increased the robot 100's emotion value of "joy" or "happiness," a behavior different from the behavior for a user who has increased the robot 100's emotion value of "anger" or "sadness." In this way, the behavior determination unit 236 may determine different behaviors depending on the robot's own emotions and on how the user's behavior has changed the robot 100's emotions.

本実施形態に係る行動決定部２３６は、ユーザの行動に対応する行動として、ユーザの過去の感情値と現在の感情値の組み合わせと、ロボット１００の感情値と、ユーザの行動と、反応ルール２２１とに基づいて、ロボット１００の行動を決定する。例えば、行動決定部２３６は、ユーザの過去の感情値が正の値であり、かつ現在の感情値が負の値である場合、ユーザの行動に対応する行動として、ユーザの感情値を正に変化させるための行動を決定する。 The behavior determination unit 236 according to this embodiment determines the behavior of the robot 100 as a behavior corresponding to the user's behavior, based on a combination of the user's past and current emotional values, the emotional value of the robot 100, the user's behavior, and the reaction rules 221. For example, if the user's past emotional value is a positive value and the current emotional value is a negative value, the behavior determination unit 236 determines, as a behavior corresponding to the user's behavior, a behavior that will change the user's emotional value to a positive value.

反応ルール２２１には、ユーザの過去の感情値と現在の感情値の組み合わせと、ロボット１００の感情値と、ユーザの行動とに応じたロボット１００の行動が定められている。例えば、ユーザの過去の感情値が正の値であり、かつ現在の感情値が負の値であり、ユーザの行動が悲しむである場合、ロボット１００の行動として、ジェスチャーを交えてユーザを励ます問いかけを行う際のジェスチャーと発話内容との組み合わせが定められている。 The reaction rules 221 define the behavior of the robot 100 according to a combination of the user's past and current emotional values, the robot's 100 emotional value, and the user's behavior. For example, if the user's past emotional value is a positive value and the current emotional value is a negative value, and the user's behavior is sad, the robot 100's behavior is defined as a combination of gestures and speech content when asking a question to encourage the user with gestures.

例えば、反応ルール２２１には、ロボット１００の感情値のパターン（「喜」、「怒」、「哀」、「楽」の値「０」～「５」の６値の４乗である１２９６パターン）、ユーザの過去の感情値と現在の感情値の組み合わせのパターン、ユーザの行動パターンの全組み合わせに対して、ロボット１００の行動が定められる。すなわち、ロボット１００の感情値のパターン毎に、ユーザの過去の感情値と現在の感情値の組み合わせが、負の値と負の値、負の値と正の値、正の値と負の値、正の値と正の値、負の値と普通、及び普通と普通等のように、複数の組み合わせのそれぞれに対して、ユーザの行動パターンに応じたロボット１００の行動が定められる。なお、行動決定部２３６は、例えば、ユーザが「この前に話したあの話題について話したい」というような過去の話題から継続した会話を意図する発話を行った場合に、履歴データ２２２を用いてロボット１００の行動を決定する動作モードに遷移してもよい。 For example, the reaction rules 221 define the behavior of the robot 100 for every combination of the robot 100's emotion value patterns (1,296 patterns, i.e., six possible values "0" to "5" for each of "joy," "anger," "sadness," and "happiness," raised to the fourth power), the patterns of combinations of the user's past and current emotion values, and the user's behavior patterns. That is, for each pattern of the robot 100's emotion values, the behavior of the robot 100 is defined according to the user's behavior pattern for each of multiple combinations of the user's past and current emotion values, such as negative and negative, negative and positive, positive and negative, positive and positive, negative and neutral, and neutral and neutral. Note that the behavior determination unit 236 may transition to an operating mode that determines the behavior of the robot 100 using the history data 222 when the user makes an utterance intended to continue a conversation from a past topic, such as "I want to talk about that topic we talked about last time."
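A lookup over such reaction rules might be organized as a table keyed by the robot's emotion pattern, the signs of the user's past and current emotion values, and the user's behavior. The sketch below is an assumption about one possible data layout, not the layout used in the embodiment; the single rule entry and its gesture and speech labels are placeholders.

```python
from itertools import product

EMOTIONS = ("joy", "anger", "sadness", "happiness")

# Every robot emotion pattern: four emotions, each taking a value 0 to 5.
ROBOT_EMOTION_PATTERNS = list(product(range(6), repeat=len(EMOTIONS)))


def sign_class(value):
    """Classify a user emotion value as positive, negative, or neutral (0)."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


# Hypothetical rule table with one placeholder entry:
# (robot pattern, past sign, current sign, user behavior) -> robot behavior.
REACTION_RULES = {
    ((0, 0, 1, 0), "positive", "negative", "sad"): {
        "gesture": "encouraging gesture",
        "speech": "encouraging question",
    },
}


def decide_behavior(robot_emotion, past_value, current_value, user_behavior):
    """Look up the robot behavior; returns None when no rule is defined."""
    key = (
        tuple(robot_emotion[name] for name in EMOTIONS),
        sign_class(past_value),
        sign_class(current_value),
        user_behavior,
    )
    return REACTION_RULES.get(key)
```

Enumerating the patterns with `itertools.product` reproduces the count of 6 to the fourth power, that is, 1,296 robot emotion patterns.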

なお、反応ルール２２１には、ロボット１００の感情値のパターン（１２９６パターン）の各々に対して、最大で一つずつ、ロボット１００の行動としてジェスチャー及び発言内容の少なくとも一方が定められていてもよい。あるいは、反応ルール２２１には、ロボット１００の感情値のパターンのグループの各々に対して、ロボット１００の行動としてジェスチャー及び発言内容の少なくとも一方が定められていてもよい。 Note that the reaction rules 221 may define, for each of the patterns (1,296 patterns) of the robot 100's emotion values, at least one of a gesture and speech content as the behavior of the robot 100, with at most one of each. Alternatively, the reaction rules 221 may define at least one of a gesture and speech content as the behavior of the robot 100 for each group of patterns of the robot 100's emotion values.

反応ルール２２１に定められているロボット１００の行動に含まれる各ジェスチャーには、当該ジェスチャーの強度が予め定められている。反応ルール２２１に定められているロボット１００の行動に含まれる各発話内容には、当該発話内容の強度が予め定められている。 The strength of each gesture included in the behavior of the robot 100 defined in the reaction rules 221 is predetermined. The strength of each utterance included in the behavior of the robot 100 defined in the reaction rules 221 is predetermined.

例えば、反応ルール２２１には、端末装置を操作する場合、プレゼンテーションの内容を発話する場合、発表中に話す内容を考える場合、質問に回答する場合、ユーザの要望に関する発話等の行動パターンに対応するロボット１００の行動が定められている。なお、ユーザの要望に関する発話の一例としては、「プレゼンテーションの内容で改善点はありますか？」といったロボット１００への問いかけ等である。 For example, the reaction rules 221 define the behavior of the robot 100 corresponding to behavioral patterns such as when operating a terminal device, speaking the contents of a presentation, thinking about what to say during a presentation, answering a question, and utterances related to the user's requests. An example of an utterance related to a user's requests is a question to the robot 100 such as, "Is there anything in the content of the presentation that could be improved?"

記憶制御部２３８は、行動決定部２３６によって決定された行動に対して予め定められた行動の強度と、感情決定部２３２により決定されたロボット１００の感情値とに基づいて、ユーザの行動を含むデータを履歴データ２２２に記憶するか否かを決定する。 The storage control unit 238 determines whether or not to store data including the user's behavior in the history data 222 based on the predetermined intensity of the behavior determined by the behavior determination unit 236 and the emotion value of the robot 100 determined by the emotion determination unit 232.

具体的には、ロボット１００の複数の感情分類の各々に対する感情値の総和と、行動決定部２３６によって決定された行動が含むジェスチャーに対して予め定められた強度と、行動決定部２３６によって決定された行動が含む発話内容に対して予め定められた強度との和である強度の総合値が、閾値以上である場合、ユーザの行動を含むデータを履歴データ２２２に記憶すると決定する。 Specifically, the total intensity value is the sum of (i) the sum of the emotion values for each of the robot 100's multiple emotion categories, (ii) the predetermined intensity of the gesture included in the behavior determined by the behavior determination unit 236, and (iii) the predetermined intensity of the speech content included in the behavior determined by the behavior determination unit 236. If this total intensity value is equal to or greater than a threshold, it is determined that data including the user's behavior is to be stored in the history data 222.
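The storage decision reduces to a sum followed by a threshold comparison, which the following minimal sketch illustrates. The function names and the numeric values in the example are assumptions; the description does not state concrete intensities or a concrete threshold.

```python
def total_intensity(robot_emotions, gesture_intensity, speech_intensity):
    """Sum of all robot emotion values plus the gesture and speech intensities."""
    return sum(robot_emotions.values()) + gesture_intensity + speech_intensity


def should_store(robot_emotions, gesture_intensity, speech_intensity, threshold):
    """Store the event only when the total intensity reaches the threshold."""
    total = total_intensity(robot_emotions, gesture_intensity, speech_intensity)
    return total >= threshold


# Example with made-up values: emotions sum to 6, intensities add 3.
emotions = {"joy": 3, "anger": 0, "sadness": 1, "happiness": 2}
example_total = total_intensity(emotions, gesture_intensity=2, speech_intensity=1)
```

With these example values the total is 9, so the event would be stored for any threshold of 9 or lower.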

記憶制御部２３８は、ユーザの行動を含むデータを履歴データ２２２に記憶すると決定した場合、行動決定部２３６によって決定された行動と、現時点から一定期間前までの、センサモジュール部２１０で解析された情報（例えば、その場の音声、画像、におい等のデータ等のあらゆる周辺情報）、及びユーザ状態認識部２３０によって認識されたユーザの状態（例えば、ユーザの表情、感情等）を、履歴データ２２２に記憶する。 When the storage control unit 238 decides to store data including the user's behavior in the history data 222, it stores in the history data 222 the behavior determined by the behavior determination unit 236, information analyzed by the sensor module unit 210 from the present time up to a certain period of time ago (e.g., all peripheral information such as data on the sound, images, and smells of the scene), and the user's state recognized by the user state recognition unit 230 (e.g., the user's facial expression, emotions, etc.).

行動制御部２５０は、行動決定部２３６が決定した行動に基づいて、制御対象２５２を制御する。例えば、行動制御部２５０は、行動決定部２３６が発話することを含む行動を決定した場合に、制御対象２５２に含まれるスピーカから音声を出力させる。このとき、行動制御部２５０は、ロボット１００の感情値に基づいて、音声の発声速度を決定してもよい。例えば、行動制御部２５０は、ロボット１００の感情値が大きいほど、速い発声速度を決定する。このように、行動制御部２５０は、感情決定部２３２が決定した感情値に基づいて、行動決定部２３６が決定した行動の実行形態を決定する。具体的には、行動制御部２５０は、ユーザが行うプレゼンテーションとして発表の練習を行う際、発表内容の聴講、反応、内容への指摘、改善提案等のプレゼンテーションの完成度を高める行動を実行する。 The behavior control unit 250 controls the control target 252 based on the behavior determined by the behavior determination unit 236. For example, when the behavior determination unit 236 determines a behavior that includes speaking, the behavior control unit 250 outputs sound from a speaker included in the control target 252. At this time, the behavior control unit 250 may determine the speaking rate of the sound based on the emotion value of the robot 100. For example, the behavior control unit 250 determines a faster speaking rate the higher the emotion value of the robot 100. In this way, the behavior control unit 250 determines the execution form of the behavior determined by the behavior determination unit 236 based on the emotion value determined by the emotion determination unit 232. Specifically, when the user practices giving a presentation, the behavior control unit 250 performs behaviors that improve the completeness of the presentation, such as listening to the presentation, reacting to it, pointing out issues in the content, and suggesting improvements.
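One way to make the speaking rate grow with the robot's emotion value is a linear mapping, sketched below. The description does not say which emotion value is used or how the rate is computed, so the use of the sum of all four values, the linear form, and the rate bounds of 1.0 and 1.5 are all assumptions made for illustration.

```python
def speaking_rate(robot_emotions, base_rate=1.0, max_rate=1.5, max_value=5):
    """Map the summed robot emotion values linearly onto a speaking-rate factor.

    A factor of `base_rate` is returned when all emotion values are 0 and
    `max_rate` when all of them are at `max_value`.
    """
    if not robot_emotions:
        return base_rate
    total = sum(robot_emotions.values())
    max_total = max_value * len(robot_emotions)
    return base_rate + (max_rate - base_rate) * total / max_total


calm = {"joy": 0, "anger": 0, "sadness": 0, "happiness": 0}
excited = {"joy": 5, "anger": 5, "sadness": 5, "happiness": 5}
```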

例えば、行動制御部２５０は、発表の練習におけるプレゼンテーションの内容を解析して、所定の完成度の指標が閾値未満の箇所を抽出し、抽出した箇所について所定のフィードバックに関する行動を決定する。また、行動制御部２５０は、プレゼンテーションの内容に含まれる誤字、脱字、内容の誤り、内容の充実度、ユーザの声量、発表速度、目線、ユーザの感情の変化のうちいずれか１つ又は複数の組み合わせを所定の完成度の指標として、完成度の指標が閾値未満の箇所を抽出する。また、行動制御部２５０は、完成度を高める行動として、発表練習の聴講およびプレゼンテーションの内容の解析を行い、予め設定された指標が所定の閾値未満の場合に、係る箇所についてフィードバックを行う。なお、ここでいうフィードバックとはユーザのプレゼンテーションの完成度を高めるための指摘、指導、提案のことで、例えば、誤字の訂正、表現の変更、内容の削除又は追記、構成の変更、発表時の声量、姿勢、態度の改善等が含まれる。 For example, the behavior control unit 250 analyzes the content of a presentation during a practice session, extracts sections where a predetermined completeness indicator is below a threshold, and determines a behavior related to predetermined feedback for the extracted sections. The behavior control unit 250 also uses any one or a combination of typos, omissions, content errors, content completeness, the user's voice volume, presentation speed, eye contact, and changes in the user's emotions as the predetermined completeness indicator and extracts sections where the indicator is below the threshold. As a behavior to improve completeness, the behavior control unit 250 also listens to the practice presentation and analyzes the content of the presentation, and provides feedback on the relevant sections if a preset indicator is below a predetermined threshold. Note that feedback here refers to comments, guidance, and suggestions for improving the completeness of the user's presentation, and includes, for example, correcting typos, changing expressions, deleting or adding content, changing the structure, and improving voice volume, posture, and attitude during the presentation.
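The extraction of sections whose completeness indicators fall below a threshold can be sketched as a simple filter. How each indicator is scored is not specified in the description, so the sketch assumes that every section already carries a numeric score per indicator where a higher score is better; the `Section` type, indicator names, and threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Section:
    name: str
    scores: Dict[str, float]  # indicator name -> score (higher is better)


def sections_needing_feedback(
    sections: List[Section], thresholds: Dict[str, float]
) -> List[Tuple[str, List[str]]]:
    """Return (section name, indicators below threshold) for flagged sections."""
    flagged = []
    for section in sections:
        low = [
            indicator
            for indicator, threshold in thresholds.items()
            if indicator in section.scores and section.scores[indicator] < threshold
        ]
        if low:
            flagged.append((section.name, low))
    return flagged


# Example with made-up scores and thresholds.
slides = [
    Section("introduction", {"voice_volume": 0.8, "pace": 0.9}),
    Section("results", {"voice_volume": 0.4, "pace": 0.3}),
]
limits = {"voice_volume": 0.5, "pace": 0.5}
flagged_sections = sections_needing_feedback(slides, limits)
```

Keeping the list of failing indicators per section is what would let the feedback name a specific place and a specific issue rather than a general remark.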

また、行動制御部２５０は、ユーザによる発表の練習を認識した場合、練習中や練習が終わったタイミングで、プレゼンテーションを行うユーザの感情を高める行動として「素晴らしい発表ですね！」、「声が大きくてわかりやすいです！」といったようなユーザに対する発話ができる。また、行動制御部２５０は、練習中や練習が終わったタイミングで、プレゼンテーションの内容を改善するために「もう少しゆっくり説明するとなおよいです。」や、「〇〇の部分が間違っていたので直しましょう。」といったようなユーザに対する発話ができる。 Furthermore, when the behavior control unit 250 recognizes that the user is practicing a presentation, it can make utterances to the user during or after practice, such as "That was a great presentation!" or "Your voice is loud and easy to understand!", as actions to enhance the emotions of the user giving the presentation. Furthermore, the behavior control unit 250 can make utterances to the user during or after practice, such as "It would be better if you explained it a little more slowly" or "There was a mistake in part XX, so let's fix that" in order to improve the content of the presentation.

また、行動制御部２５０は、ユーザのプレゼンテーションの内容を解析し、誤字、脱字、内容の誤り、内容の充実度、ユーザの声量、発表速度、目線、ユーザの感情の変化等の予め設定された指標が所定の閾値未満の場合に、係る箇所についてフィードバックを行う。なお、行動制御部２５０は、ユーザによる発表練習といった発話形式ではなく、プレゼンテーションの内容にかかるデータを読み込んで解析を行ってもよい。 The behavior control unit 250 also analyzes the content of the user's presentation and provides feedback on the relevant sections if preset indicators such as typos, omissions, content errors, content completeness, the user's voice volume, presentation speed, eye contact, and changes in the user's emotions are below a predetermined threshold. Note that, instead of relying on spoken input such as the user's practice presentation, the behavior control unit 250 may read in data on the content of the presentation and analyze that data.

また、行動制御部２５０は、ユーザによる発表練習中または発表練習終了後に、ユーザからプレゼンテーションの内容の完成度を高めることを要求する音声を受け付けた場合、プレゼンテーションの内容の完成度が高まるような行動を決定できる。例えば、行動制御部２５０は、ユーザから「プレゼンテーションの内容で改善点はありますか？」という発話があった場合に、「もう少しゆっくり説明するとなおよいです。」、「〇〇の部分が間違っていたので直しましょう。」といったような提案を行うことができる。 Furthermore, when the behavior control unit 250 receives a voice request from the user during or after the user has finished practicing the presentation requesting that the content of the presentation be made more complete, it can determine an action that will improve the completeness of the content of the presentation. For example, when the user says, "Is there anything that can be improved about the content of the presentation?" the behavior control unit 250 can make suggestions such as, "It would be better if you explained it a little more slowly," or "There was a mistake in part XX, so let's fix it."

行動制御部２５０は、行動決定部２３６が決定した行動を実行したことに対するユーザの感情の変化を認識してもよい。例えば、ユーザの音声や表情に基づいて感情の変化を認識してよい。その他、センサ部２００に含まれるタッチセンサで衝撃が検出されたことに基づいて、ユーザの感情の変化を認識してよい。センサ部２００に含まれるタッチセンサで衝撃が検出された場合に、ユーザの感情が悪くなったと認識したり、センサ部２００に含まれるタッチセンサの検出結果から、ユーザの反応が笑っている、あるいは、喜んでいる等と判断される場合には、ユーザの感情が良くなったと認識したりしてもよい。ユーザの反応を示す情報は、通信処理部２８０に出力される。 The behavior control unit 250 may recognize changes in the user's emotions in response to the execution of the behavior determined by the behavior determination unit 236. For example, changes in emotions may be recognized based on the user's voice or facial expression. Alternatively, changes in the user's emotions may be recognized based on the detection of an impact by a touch sensor included in the sensor unit 200. If an impact is detected by a touch sensor included in the sensor unit 200, it may recognize that the user's emotions have worsened, or if the detection result of the touch sensor included in the sensor unit 200 indicates that the user's reaction is smiling or happy, it may recognize that the user's emotions have improved. Information indicating the user's reaction is output to the communication processing unit 280.

また、行動制御部２５０は、行動決定部２３６が決定した行動をロボット１００の感情に応じて決定した実行形態で実行した後、感情決定部２３２は、当該行動が実行されたことに対するユーザの反応に基づいて、ロボット１００の感情値を更に変化させる。具体的には、感情決定部２３２は、行動決定部２３６が決定した行動を行動制御部２５０が決定した実行形態でユーザに対して行ったことに対するユーザの反応が不良でなかった場合に、ロボット１００の「喜」の感情値を増大させる。また、感情決定部２３２は、行動決定部２３６が決定した行動を行動制御部２５０が決定した実行形態でユーザに対して行ったことに対するユーザの反応が不良であった場合に、ロボット１００の「哀」の感情値を増大させる。 Furthermore, after the behavior control unit 250 executes the behavior determined by the behavior determination unit 236 in an execution form determined according to the emotion of the robot 100, the emotion determination unit 232 further changes the emotion value of the robot 100 based on the user's reaction to the execution of the behavior. Specifically, the emotion determination unit 232 increases the "joy" emotion value of the robot 100 when the user's reaction to the behavior, performed in the execution form determined by the behavior control unit 250, is not poor. The emotion determination unit 232 increases the "sadness" emotion value of the robot 100 when the user's reaction to the behavior, performed in the execution form determined by the behavior control unit 250, is poor.

更に、行動制御部２５０は、決定したロボット１００の感情値に基づいて、ロボット１００の感情を表現する。例えば、行動制御部２５０は、ロボット１００の「喜」の感情値を増加させた場合、制御対象２５２を制御して、ロボット１００に喜んだ仕草を行わせる。また、行動制御部２５０は、ロボット１００の「哀」の感情値を増加させた場合、ロボット１００の姿勢がうなだれた姿勢になるように、制御対象２５２を制御する。 Furthermore, the behavior control unit 250 expresses the emotion of the robot 100 based on the determined emotion value of the robot 100. For example, when the "joy" emotion value of the robot 100 is increased, the behavior control unit 250 controls the control target 252 to make the robot 100 perform a joyful gesture. When the "sadness" emotion value of the robot 100 is increased, the behavior control unit 250 controls the control target 252 so that the robot 100 assumes a drooping posture.

更に、行動制御部２５０は、上記したロボット１００の感情の変化に基づいて、ロボット１００の行動を変化させる。例えば、行動制御部２５０は、ユーザによる発表練習を聴講したロボット１００が「喜」の感情値を増加させた場合、「とても素晴らしい発表で、内容がよく分かるよ！」といったようにユーザの発表を積極的に褒める行動を取ることができる。他方で、例えば、制御対象２５２は、行動制御部２５０は、「哀」の感情値を増加させた場合、「発表がちょっとわかりにくいかな、でも一緒に改善していこう！」といったようにユーザを励ます行動をとることができる。 Furthermore, the behavior control unit 250 changes the behavior of the robot 100 based on the changes in the emotions of the robot 100 described above. For example, if the robot 100, having listened to the user's practice presentation, increases its "joy" emotion value, the behavior control unit 250 can take a behavior that actively praises the user's presentation, such as saying, "That was a great presentation, and I could understand the content very well!" On the other hand, if the "sadness" emotion value is increased, the behavior control unit 250 can, via the control target 252, take a behavior that encourages the user, such as saying, "Your presentation is a little hard to understand, but let's work on improving it together!"

通信処理部２８０は、サーバ３００との通信を担う。上述したように、通信処理部２８０は、ユーザ反応情報をサーバ３００に送信する。また、通信処理部２８０は、更新された反応ルールをサーバ３００から受信する。通信処理部２８０がサーバ３００から、更新された反応ルールを受信すると、反応ルール２２１を更新する。通信処理部２８０は、連携機器４００との間で情報を送受信できる。 The communication processing unit 280 is responsible for communication with the server 300. As described above, the communication processing unit 280 transmits user reaction information to the server 300. The communication processing unit 280 also receives updated reaction rules from the server 300. When the communication processing unit 280 receives updated reaction rules from the server 300, it updates the reaction rules 221. The communication processing unit 280 can send and receive information to and from the linked device 400.

サーバ３００は、各ロボット１００とサーバ３００との間の通信を行い、ロボット１００から送信されたユーザ反応情報を受信し、ポジティブな反応が得られた行動を含む反応ルールに基づいて、反応ルールを更新する。 The server 300 communicates between each robot 100 and the server 300, receives user reaction information sent from the robot 100, and updates the reaction rules based on reaction rules that include actions that have received positive reactions.

図３は、ロボット１００において行動を決定する動作に関する動作フローの一例を概略的に示す図である。図３に示す動作フローは、繰り返し実行される。このとき、センサモジュール部２１０で解析された情報が入力されているものとする。なお、動作フロー中の「Ｓ」は、実行されるステップを表す。 Figure 3 is a diagram that shows an example of an operation flow related to the operation of determining the behavior of the robot 100. The operation flow shown in Figure 3 is executed repeatedly. At this time, it is assumed that information analyzed by the sensor module unit 210 has been input. Note that "S" in the operation flow indicates the step that is executed.

まず、ステップＳ１０１において、ユーザ状態認識部２３０は、センサモジュール部２１０で解析された情報に基づいて、ユーザの状態を認識する。例えば、ユーザ状態認識部２３０は、「ユーザがプレゼンテーションの発表練習を行っています。」、「ユーザの発話速度が所定の閾値を超えています。」等の知覚情報を生成し、生成された知覚情報の意味を理解する処理を行う。例えば、ユーザ状態認識部２３０は、「ユーザが行うプレゼンテーションの発表練習での発表速度が速すぎます。」等の意味情報を生成する。 First, in step S101, the user state recognition unit 230 recognizes the user's state based on the information analyzed by the sensor module unit 210. For example, the user state recognition unit 230 generates perceptual information such as "The user is practicing for a presentation" or "The user's speaking speed exceeds a predetermined threshold," and performs processing to understand the meaning of the generated perceptual information. For example, the user state recognition unit 230 generates semantic information such as "The user's speaking speed during the presentation practice is too fast."

ステップＳ１０２において、感情決定部２３２は、センサモジュール部２１０で解析された情報、及びユーザ状態認識部２３０によって認識されたユーザの状態に基づいて、ユーザの感情を示す感情値を決定する。 In step S102, the emotion determination unit 232 determines an emotion value indicating the user's emotion based on the information analyzed by the sensor module unit 210 and the user's state recognized by the user state recognition unit 230.

ステップＳ１０３において、感情決定部２３２は、センサモジュール部２１０で解析された情報、及びユーザ状態認識部２３０によって認識されたユーザの状態に基づいて、ロボット１００の感情を示す感情値を決定する。感情決定部２３２は、決定したユーザの感情値を履歴データ２２２に追加する。 In step S103, the emotion determination unit 232 determines an emotion value indicating the emotion of the robot 100 based on the information analyzed by the sensor module unit 210 and the user's state recognized by the user state recognition unit 230. The emotion determination unit 232 adds the determined user's emotion value to the history data 222.

ステップＳ１０４において、行動認識部２３４は、センサモジュール部２１０で解析された情報及びユーザ状態認識部２３０によって認識されたユーザの状態に基づいて、ユーザの行動分類を認識する。例えば、行動認識部２３４は、ユーザの「端末装置を操作する」、「プレゼンテーションの内容を発話する」、「発表中に話す内容を考える」、「質問に回答する」等といったユーザの行動を認識する。 In step S104, the behavior recognition unit 234 recognizes the user's behavior classification based on the information analyzed by the sensor module unit 210 and the user's state recognized by the user state recognition unit 230. For example, the behavior recognition unit 234 recognizes user behavior such as "operating a terminal device," "speaking the contents of the presentation," "thinking about what to say during the presentation," "answering a question," etc.

ステップＳ１０５において、行動決定部２３６は、ステップＳ１０２で決定されたユーザの現在の感情値及び履歴データ２２２に含まれる過去の感情値の組み合わせと、ロボット１００の感情値と、行動認識部２３４によって認識されたユーザの行動と、反応ルール２２１とに基づいて、ロボット１００の行動を決定する。 In step S105, the behavior determination unit 236 determines the behavior of the robot 100 based on a combination of the user's current emotion value determined in step S102 and the past emotion values included in the history data 222, the emotion value of the robot 100, the user's behavior recognized by the behavior recognition unit 234, and the reaction rules 221.

ステップＳ１０６において、行動制御部２５０は、行動決定部２３６により決定された行動に基づいて、制御対象２５２を制御する。例えば、行動制御部２５０は、ユーザが行うプレゼンテーションとして発表の練習を行う際、発表内容の聴講、反応、内容への指摘、改善提案等のプレゼンテーションの完成度を高める行動を実行する。 In step S106, the behavior control unit 250 controls the control target 252 based on the behavior determined by the behavior determination unit 236. For example, when the user practices a presentation, the behavior control unit 250 performs behaviors to improve the quality of the presentation, such as listening to the content of the presentation, reacting, pointing out the content, and suggesting improvements.

ステップＳ１０７において、記憶制御部２３８は、行動決定部２３６によって決定された行動に対して予め定められた行動の強度と、感情決定部２３２により決定されたロボット１００の感情値とに基づいて、強度の総合値を算出する。 In step S107, the storage control unit 238 calculates a total intensity value based on the predetermined intensity of the behavior determined by the behavior determination unit 236 and the emotion value of the robot 100 determined by the emotion determination unit 232.

ステップＳ１０８において、記憶制御部２３８は、強度の総合値が閾値以上であるか否かを判定する。強度の総合値が閾値未満である場合には、ユーザの行動を含むデータを履歴データ２２２に記憶せずに、当該処理を終了する。一方、強度の総合値が閾値以上である場合には、ステップＳ１０９へ移行する。 In step S108, the storage control unit 238 determines whether the total intensity value is equal to or greater than the threshold value. If the total intensity value is less than the threshold value, the processing ends without storing data including the user's behavior in the history data 222. On the other hand, if the total intensity value is equal to or greater than the threshold value, the processing proceeds to step S109.

ステップＳ１０９において、行動決定部２３６によって決定された行動と、現時点から一定期間前までの、センサモジュール部２１０で解析された情報、及びユーザ状態認識部２３０によって認識されたユーザの状態と、を、履歴データ２２２に記憶する。 In step S109, the behavior determined by the behavior determination unit 236, the information analyzed by the sensor module unit 210 from the present time up to a certain period of time ago, and the user's state recognized by the user state recognition unit 230 are stored in the history data 222.
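Steps S101 to S109 can be summarized as one processing cycle, sketched below with every unit passed in as a plain function. This is an illustrative outline under assumptions, not the embodiment's implementation: the `units` dictionary, the `history` layout, the action fields, and the choice to read the most recent past emotion value before appending the current one are all hypothetical.

```python
def run_cycle(analysis, units, history, threshold):
    """Run one pass of the S101 to S109 flow and return the decided action."""
    state = units["recognize_state"](analysis)                # S101
    user_emotion = units["user_emotion"](analysis, state)     # S102
    robot_emotion = units["robot_emotion"](analysis, state)   # S103
    # Assumption: the most recent past value is read before the current
    # value is added to the history; 0 (neutral) is used when none exists.
    past = history["user_emotions"][-1] if history["user_emotions"] else 0
    history["user_emotions"].append(user_emotion)
    behavior = units["recognize_behavior"](analysis, state)   # S104
    action = units["decide"](past, user_emotion, robot_emotion, behavior)  # S105
    units["control"](action)                                  # S106
    total = (                                                 # S107
        sum(robot_emotion.values())
        + action["gesture_intensity"]
        + action["speech_intensity"]
    )
    if total >= threshold:                                    # S108
        history["events"].append(                             # S109
            {"action": action, "analysis": analysis, "state": state}
        )
    return action
```

Passing the units in as functions keeps the outline independent of how recognition or rule lookup is actually implemented, so each step can be replaced or tested on its own.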

以上説明したように、ロボット１００は、プレゼンテーションを行うユーザの行動を認識し、認識したユーザの行動に対応する自身の行動を決定し、決定した自身の行動に基づいて制御対象を制御する制御部を備える。このように、ロボット１００は、ユーザが行うプレゼンテーションの発表練習ごとに適切な行動を取ることで、ユーザのプレゼンテーションの内容の完成度を高める効果を実現する。 As described above, robot 100 recognizes the actions of the user giving a presentation, determines its own actions corresponding to the recognized user actions, and includes a control unit that controls the control target based on its own determined actions. In this way, robot 100 achieves the effect of improving the quality of the content of the user's presentation by taking appropriate actions for each practice presentation the user gives.

具体的には、ロボット１００の制御部は、ユーザがプレゼンテーションとして発表の練習を行う場合に、ユーザのプレゼンテーションの内容の所定の完成度が高める行動を行う。例えば、ロボット１００の制御部は、発表の練習におけるプレゼンテーションの内容を解析して、所定の完成度の指標が閾値未満の箇所を抽出し、抽出した箇所について所定のフィードバックに関する行動を決定する。具体例として、ロボット１００は、プレゼンテーションの内容に含まれる誤字、脱字、内容の誤り、内容の充実度、ユーザの声量、発表速度、目線、ユーザの感情の変化のうちいずれか１つ又は複数の組み合わせを所定の完成度の指標として、完成度の指標が閾値未満の箇所を抽出する。これにより、ロボット１００は、ユーザが行うプレゼンテーションとして発表の練習を行う際、発表内容の聴講、反応、内容への指摘、改善提案等のプレゼンテーションの完成度を高める行動を実行することができる。また、ロボット１００は、ユーザによるプレゼンテーションの発表において、抽象的な改善点の指摘ではなく、具体的な箇所および内容を指摘することを可能とする。このようにして、ロボット１００は、ユーザの発表練習に一緒に参加して、双方向のコミュニケーションを取りながらユーザのプレゼンテーションの内容の完成度を高める効果を実現する。 Specifically, when the user practices giving a presentation, the control unit of the robot 100 performs a behavior that raises a predetermined degree of completeness of the content of the user's presentation. For example, the control unit of the robot 100 analyzes the content of the presentation during practice, extracts sections where a predetermined completeness indicator is below a threshold, and determines a behavior related to predetermined feedback for the extracted sections. As a specific example, the robot 100 uses any one or a combination of typos, omissions, content errors, content completeness, the user's voice volume, presentation speed, eye contact, and changes in the user's emotions as the predetermined completeness indicator and extracts sections where the indicator is below the threshold. As a result, when the user practices giving a presentation, the robot 100 can perform behaviors that improve the completeness of the presentation, such as listening to the presentation, reacting to it, pointing out issues in the content, and suggesting improvements. Furthermore, the robot 100 can point out specific sections and specific content in the user's presentation, rather than only indicating abstract areas for improvement. In this way, the robot 100 takes part in the user's presentation practice and, through two-way communication, improves the completeness of the content of the user's presentation.

また、ロボット１００の制御部は、ユーザによるプレゼンテーションとして発表の練習を認識した場合、プレゼンテーションを行うユーザの感情を高める行動を決定する。これにより、ロボット１００は、プレゼンテーションとして発表の練習を行うユーザに対して「素晴らしい発表ですね！」、「声が大きくてわかりやすいです！」といったようなユーザの感情を高める発話を可能とする。これにより、ロボット１００は、ユーザの発表練習に一緒に参加して、双方向のコミュニケーションを取りながらユーザの感情やプレゼンテーションに対するモチベーションを高めることで、プレゼンテーションの完成度を高める効果を実現する。 Furthermore, when the control unit of the robot 100 recognizes that the user is rehearsing a presentation, it determines an action that lifts the emotions of the user giving the presentation. As a result, the robot 100 can address the user rehearsing the presentation with utterances that lift the user's emotions, such as "That was a great presentation!" or "Your voice is loud and easy to understand!". The robot 100 thus takes part in the user's rehearsal and, through two-way communication, raises the user's emotions and motivation for the presentation, thereby improving the quality of the presentation.

また、ロボット１００の制御部は、ユーザが発表の練習が終わったタイミングで、プレゼンテーションの内容に関する発話を行う。これにより、ロボット１００は、プレゼンテーションとして発表の練習が終わった場合にユーザに対して「素晴らしい発表ですね！」、「声が大きくてわかりやすいです！」といったユーザの感情を高める発話を行うことができる。他方、ロボット１００は、プレゼンテーションとして発表の練習が終わった場合にユーザに対して「もう少しゆっくり説明するとなおよいです。」や「〇〇の部分が間違っていたので直しましょう。」といったような完成度を高めるための提案をユーザに対する発話できる。これにより、ロボット１００は、ユーザの発表練習に一緒に参加して、双方向のコミュニケーションを取りながらユーザの感情やプレゼンテーションに対するモチベーションを高めつつ、改善の提案を行うことでプレゼンテーションの完成度を高める効果を実現する。 Furthermore, the control unit of the robot 100 speaks about the content of the presentation at the point when the user finishes the rehearsal. As a result, when the rehearsal ends, the robot 100 can say things that lift the user's emotions, such as "That was a great presentation!" or "Your voice is loud and easy to understand!". Alternatively, when the rehearsal ends, the robot 100 can offer the user suggestions for improving the quality of the presentation, such as "It would be better if you explained it a little more slowly" or "There was a mistake in part XX, so let's fix that." In this way, the robot 100 takes part in the user's rehearsal and, through two-way communication, raises the user's emotions and motivation for the presentation while suggesting improvements, thereby improving the quality of the presentation.

また、ロボット１００の制御部は、ユーザからプレゼンテーションの内容の完成度を高めることを要求する音声を受け付けた場合、プレゼンテーションの内容の完成度を高める行動を決定する。このように、ロボット１００は、ユーザから「プレゼンテーションの内容で改善点はありますか？」という発話があった場合に、「もう少しゆっくり説明するとなおよいです。」、「〇〇の部分が間違っていたので直しましょう。」といった提案を行うことができる。このようにして、ロボット１００は、ユーザの発表練習に一緒に参加して、双方向のコミュニケーションを取りながらユーザのプレゼンテーションの内容の完成度を高める効果を実現する。 Furthermore, when the control unit of the robot 100 receives a voice request from the user requesting that the content of the presentation be improved, it determines an action to improve the quality of the presentation content. In this way, when the user says, "Is there anything that can be improved about the content of the presentation?", the robot 100 can make suggestions such as, "It would be better if you explained it a little more slowly," or "There was a mistake in part XX, so let's fix that." In this way, the robot 100 participates in the user's presentation practice together, achieving the effect of improving the quality of the content of the user's presentation through two-way communication.

また、ロボット１００の制御部は、プレゼンテーションの内容を特定し、特定したプレゼンテーションの内容に合わせた所定のフィードバックに関する行動を決定する。これにより、ロボット１００は、プレゼンテーションの内容を取得して解析を行い、予め設定された指標が所定の閾値未満の場合に、係る箇所についてフィードバックを可能とする。従って、ロボット１００は、発表練習の実施が難しいユーザに対しても、当該ユーザが準備したプレゼンテーションにかかるデータを用いることで、的確な改善の提案が可能となる。このようにして、ロボット１００は、ユーザの属性によらずプレゼンテーションの完成度を高める効果を実現する。 The control unit of the robot 100 also identifies the content of the presentation and determines behavior related to predetermined feedback that matches the identified content of the presentation. This allows the robot 100 to acquire and analyze the content of the presentation, and if a preset indicator is below a predetermined threshold, it can provide feedback on the relevant part. Therefore, even for users who have difficulty practicing their presentation, the robot 100 can accurately suggest improvements by using data related to the presentation prepared by the user. In this way, the robot 100 achieves the effect of improving the quality of presentations regardless of the user's attributes.

上記実施形態では、ロボット１００は、ユーザの顔画像を用いてユーザを認識する場合について説明したが、開示の技術はこの態様に限定されない。例えば、ロボット１００は、ユーザが発する音声、ユーザのメールアドレス、ユーザのＳＮＳのＩＤ又はユーザが所持する無線ＩＣタグが内蔵されたＩＤカード等を用いてユーザを認識してもよい。 In the above embodiment, the robot 100 recognizes the user using a facial image of the user, but the disclosed technology is not limited to this. For example, the robot 100 may recognize the user using a voice uttered by the user, the user's email address, the user's SNS ID, or an ID card with a built-in wireless IC tag that the user possesses.

なお、ロボット１００は、行動制御システムを備える電子機器の一例である。行動制御システムの適用対象は、ロボット１００に限られず、様々な電子機器に行動制御システムを適用できる。また、サーバ３００の機能は、１以上のコンピュータによって実装されてよい。サーバ３００の少なくとも一部の機能は、仮想マシンによって実装されてよい。また、サーバ３００の機能の少なくとも一部は、クラウドで実装されてよい。 The robot 100 is an example of an electronic device equipped with a behavior control system. The application of the behavior control system is not limited to the robot 100, and the behavior control system can be applied to a variety of electronic devices. The functions of the server 300 may be implemented by one or more computers. At least some of the functions of the server 300 may be implemented by a virtual machine. At least some of the functions of the server 300 may be implemented in the cloud.

図４は、ロボット１００及びサーバ３００として機能するコンピュータ１２００のハードウェア構成の一例を概略的に示す図である。コンピュータ１２００にインストールされたプログラムは、コンピュータ１２００を、本実施形態に係る装置の１又は複数の「部」として機能させ、又はコンピュータ１２００に、本実施形態に係る装置に関連付けられるオペレーション又は当該１又は複数の「部」を実行させることができ、および／又はコンピュータ１２００に、本実施形態に係るプロセス又は当該プロセスの段階を実行させることができる。そのようなプログラムは、コンピュータ１２００に、本明細書に記載のフローチャートおよびブロック図のブロックのうちのいくつか又は全てに関連付けられた特定のオペレーションを実行させるべく、ＣＰＵ１２１２によって実行されてよい。 Figure 4 is a diagram schematically illustrating an example of the hardware configuration of a computer 1200 functioning as the robot 100 and the server 300. A program installed on the computer 1200 can cause the computer 1200 to function as one or more "parts" of an apparatus according to this embodiment, or to perform operations or one or more "parts" associated with an apparatus according to this embodiment, and/or to perform a process or steps of a process according to this embodiment. Such a program may be executed by the CPU 1212 to cause the computer 1200 to perform specific operations associated with some or all of the blocks in the flowcharts and block diagrams described herein.

本実施形態によるコンピュータ１２００は、ＣＰＵ１２１２、ＲＡＭ１２１４、およびグラフィックコントローラ１２１６を含み、それらはホストコントローラ１２１０によって相互に接続されている。コンピュータ１２００はまた、通信インタフェース１２２２、記憶装置１２２４、ＤＶＤドライブ、およびＩＣカードドライブのような入出力ユニットを含み、それらは入出力コントローラ１２２０を介してホストコントローラ１２１０に接続されている。ＤＶＤドライブは、ＤＶＤ－ＲＯＭドライブおよびＤＶＤ－ＲＡＭドライブ等であってよい。記憶装置１２２４は、ハードディスクドライブおよびソリッドステートドライブ等であってよい。コンピュータ１２００はまた、ＲＯＭ１２３０およびキーボードのような入出力ユニットを含み、それらは入出力チップ１２４０を介して入出力コントローラ１２２０に接続されている。 The computer 1200 according to this embodiment includes a CPU 1212, RAM 1214, and a graphics controller 1216, which are interconnected by a host controller 1210. The computer 1200 also includes input/output units such as a communications interface 1222, a storage device 1224, a DVD drive, and an IC card drive, which are connected to the host controller 1210 via an input/output controller 1220. The DVD drive may be a DVD-ROM drive, a DVD-RAM drive, or the like. The storage device 1224 may be a hard disk drive, a solid-state drive, or the like. The computer 1200 also includes input/output units such as a ROM 1230 and a keyboard, which are connected to the input/output controller 1220 via an input/output chip 1240.

ＣＰＵ１２１２は、ＲＯＭ１２３０およびＲＡＭ１２１４内に格納されたプログラムに従い動作し、それにより各ユニットを制御する。グラフィックコントローラ１２１６は、ＲＡＭ１２１４内に提供されるフレームバッファ等又はそれ自体の中に、ＣＰＵ１２１２によって生成されるイメージデータを取得し、イメージデータがディスプレイデバイス１２１８上に表示されるようにする。 The CPU 1212 operates according to programs stored in the ROM 1230 and RAM 1214, thereby controlling each unit. The graphics controller 1216 acquires image data generated by the CPU 1212 into a frame buffer provided in the RAM 1214 or into itself, and causes the image data to be displayed on the display device 1218.

通信インタフェース１２２２は、ネットワークを介して他の電子デバイスと通信する。記憶装置１２２４は、コンピュータ１２００内のＣＰＵ１２１２によって使用されるプログラムおよびデータを格納する。ＤＶＤドライブは、プログラム又はデータをＤＶＤ－ＲＯＭ等から読み取り、記憶装置１２２４に提供する。ＩＣカードドライブは、プログラムおよびデータをＩＣカードから読み取り、および／又はプログラムおよびデータをＩＣカードに書き込む。 The communication interface 1222 communicates with other electronic devices via a network. The storage device 1224 stores programs and data used by the CPU 1212 in the computer 1200. The DVD drive reads programs or data from a DVD-ROM or the like and provides them to the storage device 1224. The IC card drive reads programs and data from an IC card and/or writes programs and data to an IC card.

ＲＯＭ１２３０はその中に、アクティブ化時にコンピュータ１２００によって実行されるブートプログラム等、および／又はコンピュータ１２００のハードウェアに依存するプログラムを格納する。入出力チップ１２４０はまた、様々な入出力ユニットをＵＳＢポート、パラレルポート、シリアルポート、キーボードポート、マウスポート等を介して、入出力コントローラ１２２０に接続してよい。 ROM 1230 stores therein boot programs and the like that are executed by computer 1200 upon activation, and/or programs that depend on the hardware of computer 1200. I/O chip 1240 may also connect various I/O units to I/O controller 1220 via USB ports, parallel ports, serial ports, keyboard ports, mouse ports, etc.

プログラムは、ＤＶＤ－ＲＯＭ又はＩＣカードのようなコンピュータ可読記憶媒体によって提供される。プログラムは、コンピュータ可読記憶媒体から読み取られ、コンピュータ可読記憶媒体の例でもある記憶装置１２２４、ＲＡＭ１２１４、又はＲＯＭ１２３０にインストールされ、ＣＰＵ１２１２によって実行される。これらのプログラム内に記述される情報処理は、コンピュータ１２００に読み取られ、プログラムと、上記様々なタイプのハードウェアリソースとの間の連携をもたらす。装置又は方法が、コンピュータ１２００の使用に従い情報のオペレーション又は処理を実現することによって構成されてよい。 The programs are provided on a computer-readable storage medium such as a DVD-ROM or IC card. The programs are read from the computer-readable storage medium, installed in storage device 1224, RAM 1214, or ROM 1230, which are also examples of computer-readable storage media, and executed by CPU 1212. The information processing described in these programs is read by computer 1200, resulting in cooperation between the programs and the various types of hardware resources described above. An apparatus or method may be configured by implementing the operation or processing of information in accordance with the use of computer 1200.

例えば、通信がコンピュータ１２００および外部デバイス間で実行される場合、ＣＰＵ１２１２は、ＲＡＭ１２１４にロードされた通信プログラムを実行し、通信プログラムに記述された処理に基づいて、通信インタフェース１２２２に対し、通信処理を命令してよい。通信インタフェース１２２２は、ＣＰＵ１２１２の制御の下、ＲＡＭ１２１４、記憶装置１２２４、ＤＶＤ－ＲＯＭ、又はＩＣカードのような記録媒体内に提供される送信バッファ領域に格納された送信データを読み取り、読み取られた送信データをネットワークに送信し、又はネットワークから受信した受信データを記録媒体上に提供される受信バッファ領域等に書き込む。 For example, when communication is performed between the computer 1200 and an external device, the CPU 1212 may execute a communication program loaded into the RAM 1214 and instruct the communication interface 1222 to perform communication processing based on the processing described in the communication program. Under the control of the CPU 1212, the communication interface 1222 reads transmission data stored in a transmission buffer area provided in a recording medium such as the RAM 1214, the storage device 1224, a DVD-ROM, or an IC card, and transmits the read transmission data to the network, or writes data received from the network to a reception buffer area or the like provided on the recording medium.

また、ＣＰＵ１２１２は、記憶装置１２２４、ＤＶＤドライブ（ＤＶＤ－ＲＯＭ）、ＩＣカード等のような外部記録媒体に格納されたファイル又はデータベースの全部又は必要な部分がＲＡＭ１２１４に読み取られるようにし、ＲＡＭ１２１４上のデータに対し様々なタイプの処理を実行してよい。ＣＰＵ１２１２は次に、処理されたデータを外部記録媒体にライトバックしてよい。 The CPU 1212 may also cause all or a necessary portion of a file or database stored on an external recording medium such as the storage device 1224, a DVD drive (DVD-ROM), an IC card, etc. to be read into the RAM 1214, and perform various types of processing on the data on the RAM 1214. The CPU 1212 may then write the processed data back to the external recording medium.

様々なタイプのプログラム、データ、テーブル、およびデータベースのような様々なタイプの情報が記録媒体に格納され、情報処理を受けてよい。ＣＰＵ１２１２は、ＲＡＭ１２１４から読み取られたデータに対し、本開示の随所に記載され、プログラムの命令シーケンスによって指定される様々なタイプのオペレーション、情報処理、条件判断、条件分岐、無条件分岐、情報の検索／置換等を含む、様々なタイプの処理を実行してよく、結果をＲＡＭ１２１４に対しライトバックする。また、ＣＰＵ１２１２は、記録媒体内のファイル、データベース等における情報を検索してよい。例えば、各々が第２の属性の属性値に関連付けられた第１の属性の属性値を有する複数のエントリが記録媒体内に格納される場合、ＣＰＵ１２１２は、当該複数のエントリの中から、第１の属性の属性値が指定されている条件に一致するエントリを検索し、当該エントリ内に格納された第２の属性の属性値を読み取り、それにより予め定められた条件を満たす第１の属性に関連付けられた第２の属性の属性値を取得してよい。 Various types of information, such as various types of programs, data, tables, and databases, may be stored on the recording medium and may undergo information processing. CPU 1212 may perform various types of processing on data read from RAM 1214, including various types of operations, information processing, conditional judgment, conditional branching, unconditional branching, information search/replacement, etc., as described throughout this disclosure and specified by the program's instruction sequence, and write the results back to RAM 1214. CPU 1212 may also search for information in files, databases, etc. on the recording medium. For example, if multiple entries each having an attribute value of a first attribute associated with an attribute value of a second attribute are stored on the recording medium, CPU 1212 may search for an entry whose attribute value of the first attribute matches a specified condition from among the multiple entries, read the attribute value of the second attribute stored in the entry, and thereby obtain the attribute value of the second attribute associated with the first attribute that satisfies a predetermined condition.
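The entry search described above (find entries whose first attribute satisfies a condition, then read the associated second attribute) can be sketched in a few lines of Python. The entry layout as `(first, second)` tuples and the example attribute values are assumptions for illustration; the specification does not prescribe a storage format.

```python
from typing import Callable, List, Tuple


def lookup_second_attribute(
    entries: List[Tuple[str, str]], condition: Callable[[str], bool]
) -> List[str]:
    """Return the second attribute of every entry whose first attribute matches."""
    return [second for first, second in entries if condition(first)]


# Assumed example entries: (first attribute, second attribute).
entries = [("user_a", "teacher"), ("user_b", "student"), ("user_c", "teacher")]
result = lookup_second_attribute(entries, lambda first: first == "user_b")
```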

上記したプログラム又はソフトウエアモジュールは、コンピュータ１２００上又はコンピュータ１２００近傍のコンピュータ可読記憶媒体に格納されてよい。また、専用通信ネットワーク又はインターネットに接続されたサーバシステム内に提供されるハードディスク又はＲＡＭのような記録媒体が、コンピュータ可読記憶媒体として使用可能であり、それによりプログラムを、ネットワークを介してコンピュータ１２００に提供する。 The above-mentioned programs or software modules may be stored on computer-readable storage media on or near computer 1200. Recording media such as a hard disk or RAM provided within a server system connected to a dedicated communications network or the Internet can also be used as computer-readable storage media, thereby providing the programs to computer 1200 via the network.

本実施形態におけるフローチャートおよびブロック図におけるブロックは、オペレーションが実行されるプロセスの段階又はオペレーションを実行する役割を持つ装置の「部」を表してよい。特定の段階および「部」が、専用回路、コンピュータ可読記憶媒体上に格納されるコンピュータ可読命令と共に供給されるプログラマブル回路、および／又はコンピュータ可読記憶媒体上に格納されるコンピュータ可読命令と共に供給されるプロセッサによって実装されてよい。専用回路は、デジタルおよび／又はアナログハードウェア回路を含んでよく、集積回路（ＩＣ）および／又はディスクリート回路を含んでよい。プログラマブル回路は、例えば、フィールドプログラマブルゲートアレイ（ＦＰＧＡ）、およびプログラマブルロジックアレイ（ＰＬＡ）等のような、論理積、論理和、排他的論理和、否定論理積、否定論理和、および他の論理演算、フリップフロップ、レジスタ、並びにメモリエレメントを含む、再構成可能なハードウェア回路を含んでよい。 The blocks in the flowcharts and block diagrams in this embodiment may represent stages of a process in which an operation is performed or "parts" of a device responsible for performing the operation. Particular stages and "parts" may be implemented by dedicated circuitry, programmable circuitry provided with computer-readable instructions stored on a computer-readable storage medium, and/or a processor provided with computer-readable instructions stored on a computer-readable storage medium. Dedicated circuitry may include digital and/or analog hardware circuitry, and may include integrated circuits (ICs) and/or discrete circuits. Programmable circuitry may include reconfigurable hardware circuitry including AND, OR, XOR, NAND, NOR, and other logical operations, flip-flops, registers, and memory elements, such as field programmable gate arrays (FPGAs) and programmable logic arrays (PLAs).

コンピュータ可読記憶媒体は、適切なデバイスによって実行される命令を格納可能な任意の有形なデバイスを含んでよく、その結果、そこに格納される命令を有するコンピュータ可読記憶媒体は、フローチャート又はブロック図で指定されたオペレーションを実行するための手段を作成すべく実行され得る命令を含む、製品を備えることになる。コンピュータ可読記憶媒体の例としては、電子記憶媒体、磁気記憶媒体、光記憶媒体、電磁記憶媒体、半導体記憶媒体等が含まれてよい。コンピュータ可読記憶媒体のより具体的な例としては、フロッピー（登録商標）ディスク、ディスケット、ハードディスク、ランダムアクセスメモリ（ＲＡＭ）、リードオンリメモリ（ＲＯＭ）、消去可能プログラマブルリードオンリメモリ（ＥＰＲＯＭ又はフラッシュメモリ）、電気的消去可能プログラマブルリードオンリメモリ（ＥＥＰＲＯＭ）、静的ランダムアクセスメモリ（ＳＲＡＭ）、コンパクトディスクリードオンリメモリ（ＣＤ－ＲＯＭ）、デジタル多用途ディスク（ＤＶＤ）、ブルーレイ（登録商標）ディスク、メモリスティック、集積回路カード等が含まれてよい。 A computer-readable storage medium may include any tangible device capable of storing instructions executed by a suitable device, such that a computer-readable storage medium having instructions stored thereon comprises an article of manufacture, including instructions that can be executed to create means for performing the operations specified in the flowchart or block diagram. Examples of computer-readable storage media may include electronic storage media, magnetic storage media, optical storage media, electromagnetic storage media, semiconductor storage media, etc. More specific examples of computer-readable storage media may include floppy disks, diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), electrically erasable programmable read-only memory (EEPROM), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile disc (DVD), Blu-ray disc, memory stick, integrated circuit card, etc.

コンピュータ可読命令は、アセンブラ命令、命令セットアーキテクチャ（ＩＳＡ）命令、マシン命令、マシン依存命令、マイクロコード、ファームウェア命令、状態設定データ、又はＳｍａｌｌｔａｌｋ（登録商標）、ＪＡＶＡ（登録商標）、Ｃ＋＋等のようなオブジェクト指向プログラミング言語、および「Ｃ」プログラミング言語又は同様のプログラミング言語のような従来の手続型プログラミング言語を含む、１又は複数のプログラミング言語の任意の組み合わせで記述されたソースコード又はオブジェクトコードのいずれかを含んでもよい。 The computer-readable instructions may include either assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk®, JAVA®, C++, etc., and conventional procedural programming languages such as the "C" programming language or similar programming languages.

コンピュータ可読命令は、汎用コンピュータ、特殊目的のコンピュータ、若しくは他のプログラム可能なデータ処理装置のプロセッサ、又はプログラマブル回路が、フローチャート又はブロック図で指定されたオペレーションを実行するための手段を生成するために当該コンピュータ可読命令を実行すべく、ローカルに又はローカルエリアネットワーク（ＬＡＮ）、インターネット等のようなワイドエリアネットワーク（ＷＡＮ）を介して、汎用コンピュータ、特殊目的のコンピュータ、若しくは他のプログラム可能なデータ処理装置のプロセッサ、又はプログラマブル回路に提供されてよい。プロセッサの例としては、コンピュータプロセッサ、処理ユニット、マイクロプロセッサ、デジタル信号プロセッサ、コントローラ、マイクロコントローラ等を含む。 The computer-readable instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, or to a programmable circuit, either locally or via a local area network (LAN) or a wide area network (WAN) such as the Internet, so that the processor or programmable circuit executes the computer-readable instructions to create means for performing the operations specified in the flowcharts or block diagrams. Examples of processors include computer processors, processing units, microprocessors, digital signal processors, controllers, and microcontrollers.

(その他の実施形態) (Other embodiments)
なお、上述したロボット１００は、ぬいぐるみに搭載してもよく、あるいは、ぬいぐるみに搭載された制御対象機器（スピーカやカメラ）に無線又は有線で接続された制御装置に適用してもよい。 The robot 100 described above may be mounted on a stuffed toy, or may be applied to a control device connected wirelessly or by wire to a control target device (speaker or camera) mounted on the stuffed toy.

感情決定部２３２は、特定のマッピングに従い、ユーザの感情を決定してよい。具体的には、感情決定部２３２は、特定のマッピングである感情マップ（図５参照）に従い、ユーザの感情を決定してよい。 The emotion determination unit 232 may determine the user's emotion in accordance with a specific mapping. Specifically, the emotion determination unit 232 may determine the user's emotion in accordance with an emotion map (see Figure 5), which is a specific mapping.

図５は、複数の感情がマッピングされる感情マップ７００を示す図である。感情マップ７００において、感情は、中心から放射状に同心円に配置されている。同心円の中心に近いほど、原始的状態の感情が配置されている。同心円のより外側には、心境から生まれる状態や行動を表す感情が配置されている。感情とは、情動や心的状態も含む概念である。同心円の左側には、概して脳内で起きる反応から生成される感情が配置されている。同心円の右側には概して、状況判断で誘導される感情が配置されている。同心円の上方向及び下方向には、概して脳内で起きる反応から生成され、かつ、状況判断で誘導される感情が配置されている。また、同心円の上側には、「快」の感情が配置され、下側には、「不快」の感情が配置されている。このように、感情マップ７００では、感情が生まれる構造に基づいて複数の感情がマッピングされており、同時に生じやすい感情が、近くにマッピングされている。 Figure 5 shows an emotion map 700 on which multiple emotions are mapped. In the emotion map 700, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions that are placed there. Emotions representing states and actions that arise from a state of mind are placed toward the outside of the concentric circles. Here, "emotion" is a concept that also includes affect and mental states. Emotions that are generally generated from reactions occurring in the brain are placed on the left side of the concentric circles. Emotions that are generally induced by situational judgment are placed on the right side. Emotions that are generally both generated from reactions occurring in the brain and induced by situational judgment are placed toward the top and bottom of the concentric circles. Furthermore, emotions of "pleasure" are placed in the upper part of the concentric circles, and emotions of "discomfort" in the lower part. In this way, the emotion map 700 maps multiple emotions based on the structure by which emotions arise, and emotions that tend to occur together are mapped close to each other.

（１）例えばロボット１００の感情決定部２３２である感情エンジンが、１００ｍｓｅｃ程度で感情を検知している場合、ロボット１００の反応動作（例えば相槌）の決定は、頻度が少なくとも、感情エンジンの検知頻度（１００ｍｓｅｃ）と同様のタイミングに設定してよく、これよりも早いタイミングに設定してもよい。感情エンジンの検知頻度はサンプリングレートと解釈してよい。 (1) For example, if the emotion engine, which is the emotion determination unit 232 of the robot 100, detects emotions at approximately 100 msec, the frequency of determining the robot 100's reaction action (e.g., a backchannel) may be set to at least the same timing as the emotion engine's detection frequency (100 msec), or may be set to an earlier timing. The emotion engine's detection frequency may be interpreted as the sampling rate.

１００ｍｓｅｃ程度で感情を検知し、即時に連動して反動動作（例えば相槌）を行うことで、不自然な相槌ではなくなり、自然な空気を読んだ対話を実現できる。ロボット１００感情マップ７００の曼荼羅の方向性とその度合い（強さ）に応じて、反動動作（相槌等）を行う。なお、感情エンジンの検知頻度（サンプリングレート）は、１００ｍｓに限定されず、シチュエーション（スポーツをしている場合等）、ユーザの年齢等に応じて、変更してもよい。 By detecting emotions at intervals of about 100 msec and immediately performing a linked reaction (e.g., a backchannel response), the robot avoids unnatural backchannel responses and can achieve natural dialogue that reads the atmosphere. The robot 100 performs the reaction (a backchannel response or the like) according to the direction and degree (strength) of the emotion on the mandala of the emotion map 700. The detection frequency (sampling rate) of the emotion engine is not limited to 100 msec and may be changed according to the situation (for example, while playing sports), the user's age, and so on.

（２）感情マップ７００と照らし合わせ、感情の方向性とその度合いの強さを予め設定しておき、相槌の動き及び相槌の強弱を設定してよい。例えば、ロボット１００が安定感、安心等を感じている場合、ロボット１００は、頷いて話を聞き続ける。ロボット１００が不安、迷い、怪しい感じを覚えている場合、ロボット１００は、首をかしげてもよく、首振りを止めてもよい。 (2) The direction of an emotion and the strength of its degree may be set in advance with reference to the emotion map 700, and the movement and strength of the backchannel response may be set accordingly. For example, when the robot 100 feels a sense of stability, relief, or the like, the robot 100 nods and continues listening. When the robot 100 feels anxious, hesitant, or suspicious, the robot 100 may tilt its head or may stop moving its head.

これらの感情は、感情マップ７００の３時の方向に分布しており、普段は安心と不安のあたりを行き来する。感情マップ７００の右半分では、内部的な感覚よりも状況認識の方が優位に立つため、落ち着いた印象になる。 These emotions are distributed in the 3 o'clock direction on emotion map 700, and usually fluctuate between relief and anxiety. In the right half of emotion map 700, situational awareness takes precedence over internal sensations, resulting in a calm impression.

（３）ロボット１００が褒められて快感を覚えた場合、「あー」というフィラーが台詞の前に入り、きつい言葉をもらって痛感を覚えた場合、「うっ！」というフィラーが台詞の前に入ってよい。また、ロボット１００が「うっ！」と言いつつうずくまる仕草等の身体的な反応を含めてよい。これらの感情は、感情マップ７００の９時あたりに分布している。 (3) If the robot 100 feels good after being praised, the filler "ah" may be inserted before the line, and if the robot 100 feels pain after receiving harsh words, the filler "ugh!" may be inserted before the line. A physical reaction, such as the robot 100 crouching down while saying "ugh!", may also be included. These emotions are distributed around 9 o'clock on the emotion map 700.

（４）感情マップ７００の左半分では、状況認識よりも内部的な感覚（反応）の方が優位に立つ。よって、思わず反応してしまった印象を与え得る。 (4) In the left half of the emotion map 700, internal sensations (reactions) take precedence over situational awareness. This can give the impression of an unconscious reaction.

ロボット１００が納得感という内部的な感覚（反応）を覚えながら状況認識においても好感を覚える場合、ロボット１００は、相手を見ながら深く頷いてよく、また「うんうん」と発してよい。このように、ロボット１００は、相手へのバランスのとれた好感、すなわち、相手への許容や寛容といった行動を生成してよい。このような感情は、感情マップ７００の１２時あたりに分布している。 When the robot 100 feels an internal sense (reaction) of satisfaction and also feels a favorable impression in its situational awareness, the robot 100 may nod deeply while looking at the other person, or may say "uh-huh." In this way, the robot 100 may generate behavior that shows a balanced favorable impression toward the other person, that is, acceptance and tolerance toward the other person. Such emotions are distributed around 12 o'clock on the emotion map 700.

逆に、ロボット１００が不快感という内部的な感覚（反応）を覚えながら状況認識においても、ロボット１００は、嫌悪を覚えるときには首を横に振る、憎しみを覚えるくらいになると、目のＬＥＤを赤くして相手を睨んでもよい。このような感情は、感情マップ７００の６時あたりに分布している。 Conversely, when the robot 100 feels an internal sensation (reaction) of discomfort and its situational awareness is likewise unfavorable, the robot 100 may shake its head when it feels disgust, and when the feeling grows into hatred, it may turn the LEDs of its eyes red and glare at the other person. These emotions are distributed around 6 o'clock on the emotion map 700.

（５）感情マップ７００の内側は心の中、感情マップ７００の外側は行動を表すため、感情マップ７００の外側に行くほど、感情が目に見える（行動に表れる）ようになる。 (5) The inside of the emotion map 700 represents what is going on in the mind, and the outside of the emotion map 700 represents behavior, so the further out you go on the emotion map 700, the more visible (expressed in behavior) the emotion becomes.

（６）感情マップ７００の３時付近に分布する安心を覚えながら、人の話を聞く場合、ロボット１００は、軽く首を縦に振って「ふんふん」と発する程度であるが、１２時付近の愛の方になると、首を深く縦に振るような力強い頷きをしてよい。 (6) When listening to someone while feeling the relief distributed around 3 o'clock on the emotion map 700, the robot 100 only nods lightly and says "hmm," but as the emotion moves toward love, around 12 o'clock, it may nod vigorously with deep vertical movements of its head.
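Items (1) to (6) can be summarized as a lookup from an emotion to a reaction, gated by how far outward the emotion sits on the map. The following Python sketch is only an illustration: the table contents paraphrase the examples given above, while the emotion keys, the 0 to 1 `intensity` scale, and the `visible_from` cutoff are hypothetical and do not appear in the specification.

```python
from typing import Optional

# emotion -> (approximate clock position on the emotion map, reaction)
REACTIONS = {
    "relief": (3, "nod lightly and keep listening"),
    "anxiety": (3, "tilt head"),
    "pleasure": (9, 'say "ah" before the line'),
    "pain": (9, 'say "ugh!" before the line'),
    "love": (12, "nod deeply"),
    "disgust": (6, "shake head"),
}


def choose_reaction(
    emotion: str, intensity: float, visible_from: float = 0.3
) -> Optional[str]:
    """Pick a reaction for an emotion, or None if it stays internal.

    `intensity` stands in for the distance from the map center (assumed 0..1).
    Emotions near the center are treated as not yet visible in behavior.
    """
    if emotion not in REACTIONS or intensity < visible_from:
        return None
    _position, reaction = REACTIONS[emotion]
    return reaction


reaction = choose_reaction("love", 0.8)
```

In a running system such a function would be called once per detection interval of the emotion engine (about 100 msec in the example above), so that the reaction follows the detected emotion without noticeable delay.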

感情決定部２３２は、センサモジュール部２１０で解析された情報、及び認識されたユーザ１０の状態を、予め学習されたニューラルネットワークに入力し、感情マップ７００に示す各感情を示す感情値を取得し、ユーザ１０の感情を決定する。このニューラルネットワークは、センサモジュール部２１０で解析された情報、及び認識されたユーザ１０の状態と、感情マップ７００に示す各感情を示す感情値との組み合わせである複数の学習データに基づいて予め学習されたものである。また、このニューラルネットワークは、図６に示す感情マップ９００のように、近くに配置されている感情同士は、近い値を持つように学習される。図６は、感情マップの他の例を示す図である。図６では、「安心」、「安穏」、「心強い」という複数の感情が、近い感情値となる例を示している。 The emotion determination unit 232 inputs the information analyzed by the sensor module unit 210 and the recognized state of the user 10 into a pre-trained neural network, obtains emotion values indicating each emotion shown in the emotion map 700, and determines the emotion of the user 10. This neural network is pre-trained based on multiple pieces of training data that are combinations of the information analyzed by the sensor module unit 210, the recognized state of the user 10, and emotion values indicating each emotion shown in the emotion map 700. Furthermore, this neural network is trained so that emotions that are located close to each other have similar values, as in the emotion map 900 shown in Figure 6. Figure 6 is a diagram showing another example of an emotion map. Figure 6 shows an example in which multiple emotions, such as "relieved," "calm," and "reassuring," have similar emotion values.

また、感情決定部２３２は、特定のマッピングに従い、ロボット１００の感情を決定してよい。具体的には、感情決定部２３２は、センサモジュール部２１０で解析された情報、ユーザ状態認識部２３０によって認識されたユーザ１０の状態、及びロボット１００の状態を、予め学習されたニューラルネットワークに入力し、感情マップ７００に示す各感情を示す感情値を取得し、ロボット１００の感情を決定する。このニューラルネットワークは、センサモジュール部２１０で解析された情報、認識されたユーザ１０の状態、及びロボット１００の状態と、感情マップ７００に示す各感情を示す感情値との組み合わせである複数の学習データに基づいて予め学習されたものである。例えば、タッチセンサ２０７の出力から、ロボット１００がユーザ１０になでられていると認識される場合に、「嬉しい」の感情値「３」となることを表す学習データや、加速度センサ２０５の出力から、ロボット１００がユーザ１０に叩かれていると認識される場合に、「怒」の感情値「３」となることを表す学習データに基づいて、ニューラルネットワークが学習される。また、このニューラルネットワークは、図６に示す感情マップ９００のように、近くに配置されている感情同士は、近い値を持つように学習される。 The emotion determination unit 232 may also determine the emotion of the robot 100 according to a specific mapping. Specifically, the emotion determination unit 232 inputs the information analyzed by the sensor module unit 210, the state of the user 10 recognized by the user state recognition unit 230, and the state of the robot 100 into a pre-trained neural network, obtains emotion values indicating each emotion shown in the emotion map 700, and determines the emotion of the robot 100. This neural network has been trained in advance on multiple pieces of training data, each combining the information analyzed by the sensor module unit 210, the recognized state of the user 10, and the state of the robot 100 with emotion values indicating each emotion shown in the emotion map 700. For example, the neural network is trained on training data indicating that the emotion value of "happy" is "3" when the output of the touch sensor 207 indicates that the robot 100 is being stroked by the user 10, and on training data indicating that the emotion value of "anger" is "3" when the output of the acceleration sensor 205 indicates that the robot 100 is being hit by the user 10. Furthermore, this neural network is trained so that emotions located close to each other have similar values, as in the emotion map 900 shown in Figure 6.

また、感情決定部２３２は、文章生成モデルによって生成されたロボット１００の行動内容に基づいて、ロボット１００の感情を決定してもよい。具体的には、感情決定部２３２は、文章生成モデルによって生成されたロボット１００の行動内容を、予め学習されたニューラルネットワークに入力し、感情マップ７００に示す各感情を示す感情値を取得し、取得した各感情を示す感情値と、現在のロボット１００の各感情を示す感情値とを統合し、ロボット１００の感情を更新する。例えば、取得した各感情を示す感情値と、現在のロボット１００の各感情を示す感情値とをそれぞれ平均して、統合する。このニューラルネットワークは、文章生成モデルによって生成されたロボット１００の行動内容を表すテキストと、感情マップ７００に示す各感情を示す感情値との組み合わせである複数の学習データに基づいて予め学習されたものである。 The emotion determination unit 232 may also determine the emotion of the robot 100 based on the behavioral content of the robot 100 generated by the sentence generation model. Specifically, the emotion determination unit 232 inputs the behavioral content of the robot 100 generated by the sentence generation model into a pre-trained neural network, obtains emotion values indicating each emotion shown in the emotion map 700, and integrates the obtained emotion values indicating each emotion with emotion values indicating each current emotion of the robot 100 to update the emotion of the robot 100. For example, the emotion values indicating each obtained emotion and emotion values indicating each current emotion of the robot 100 are averaged and integrated. This neural network is pre-trained based on multiple learning data that are combinations of text indicating the behavioral content of the robot 100 generated by the sentence generation model and emotion values indicating each emotion shown in the emotion map 700.

例えば、文章生成モデルによって生成されたロボット１００の行動内容として、ロボット１００の発話内容「それはよかったね。ラッキーだったね。」が得られた場合には、この発話内容を表すテキストをニューラルネットワークに入力すると、感情「嬉しい」の感情値として高い値が得られ、感情「嬉しい」の感情値が高くなるように、ロボット１００の感情が更新される。 For example, if the utterance content of robot 100, "That's great. You're lucky," is obtained as the behavioral content of robot 100 generated by the sentence generation model, when the text representing this utterance content is input into the neural network, a high value is obtained as the emotion value of the emotion "happy," and the emotion of robot 100 is updated so that the emotion value of the emotion "happy" becomes higher.
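The integration step, in which newly obtained emotion values are averaged with the robot's current emotion values, can be sketched as follows. The dictionary representation, the emotion names, and the numeric values are assumptions for illustration; treating an emotion missing from one side as 0.0 is also an assumption, since the specification does not say how such a case is handled.

```python
from typing import Dict


def integrate_emotions(
    current: Dict[str, float], obtained: Dict[str, float]
) -> Dict[str, float]:
    """Average current and newly obtained emotion values, emotion by emotion."""
    emotions = set(current) | set(obtained)
    return {
        emotion: (current.get(emotion, 0.0) + obtained.get(emotion, 0.0)) / 2
        for emotion in emotions
    }


# Assumed values: the generated utterance scores high on "happy".
current = {"happy": 1.0, "anger": 2.0}
obtained = {"happy": 5.0, "anger": 0.0}
updated = integrate_emotions(current, obtained)
```

With these assumed inputs the "happy" value rises from 1.0 to 3.0, which matches the direction of the update described in the example above.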

行動決定部２３６は、ユーザの行動と、ユーザの感情、ロボットの感情とを表すテキストに、ユーザの行動に対応するロボットの行動内容を質問するための固定文を追加して、対話機能を有する文章生成モデルに入力することにより、ロボットの行動内容を生成する。 The behavior determination unit 236 generates the robot's behavior by adding fixed sentences to ask about the robot's behavior corresponding to the user's behavior to text representing the user's behavior, the user's emotions, and the robot's emotions, and inputting this into a sentence generation model with dialogue capabilities.

例えば、行動決定部２３６は、感情決定部２３２によって決定されたロボット１００の感情から、図７に示すような感情テーブルを用いて、ロボット１００の状態を表すテキストを取得する。図７は、感情テーブルの一例を示す図である。ここで、感情テーブルには、感情の種類毎に、各感情値に対してインデックス番号が付与されており、インデックス番号毎に、ロボット１００の状態を表すテキストが格納されている。 For example, the behavior determination unit 236 obtains text representing the state of the robot 100 from the emotion of the robot 100 determined by the emotion determination unit 232, using an emotion table such as that shown in FIG. 7. FIG. 7 is a diagram showing an example of an emotion table. Here, in the emotion table, an index number is assigned to each emotion value for each type of emotion, and text representing the state of the robot 100 is stored for each index number.

If the emotion of the robot 100 determined by the emotion determination unit 232 corresponds to index number "2", the text "very happy state" is obtained. If the emotion of the robot 100 corresponds to multiple index numbers, multiple pieces of text representing the state of the robot 100 are obtained.
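The table lookup described above could look roughly like the sketch below. Only the pairing of index number 2 with "very happy state" comes from the text; the emotion types, the emotion values, the other index numbers and texts, and all identifiers are hypothetical placeholders, since the actual contents of FIG. 7 are not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical table: (emotion type, emotion value) -> index number.
EMOTION_INDEX: Dict[Tuple[str, int], int] = {
    ("happy", 3): 1,
    ("happy", 5): 2,
    ("sad", 4): 7,
}

# Index number -> state text. Only index 2 is taken from the description.
INDEX_TEXT: Dict[int, str] = {
    1: "somewhat happy state",
    2: "very happy state",
    7: "sad state",
}


def state_texts(emotions: Dict[str, int]) -> List[str]:
    """Return one state text per emotion that maps to an index number.

    Emotions are visited in alphabetical order so the result is stable;
    emotions with no table entry are skipped (an assumption).
    """
    texts = []
    for name, value in sorted(emotions.items()):
        index = EMOTION_INDEX.get((name, value))
        if index is not None:
            texts.append(INDEX_TEXT[index])
    return texts


print(state_texts({"happy": 5}))            # single index number
print(state_texts({"happy": 5, "sad": 4}))  # multiple index numbers
```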

An emotion table such as that shown in FIG. 8 is also prepared for the emotions of the user 10. FIG. 8 is a diagram showing an example of an emotion table. Suppose the user's behavior is "speaking to the robot, saying 'AAA'", the emotion of the robot 100 corresponds to index number "2", and the emotion of the user 10 corresponds to index number "3". In this case, the following text is input into the sentence generation model to obtain the robot's behavior content: "The robot is in a very happy state. The user is in a moderately happy state. The user spoke to the robot, saying 'AAA'. How would you respond as the robot?" The behavior determination unit 236 determines the robot's behavior from this behavior content. Note that "AAA" is the name (nickname) that the user has given to the robot 100.
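As a rough illustration of how such an input might be assembled, the sketch below joins the two state texts, a sentence describing the user's behavior, and the fixed question. The function name, its parameters, and the English wording of the strings are assumptions for the example; the call to the sentence generation model itself is deliberately left out.

```python
# Fixed sentence that asks the model for the robot's behavior
# (English rendering of the example in the description).
FIXED_QUESTION = "How would you respond as the robot?"


def build_prompt(robot_state: str, user_state: str, user_event: str) -> str:
    """Concatenate state texts, the user's behavior, and the fixed question."""
    parts = [
        f"The robot is in a {robot_state}.",
        f"The user is in a {user_state}.",
        user_event,
        FIXED_QUESTION,
    ]
    return " ".join(parts)


prompt = build_prompt(
    robot_state="very happy state",       # index number 2 in the robot's table
    user_state="moderately happy state",  # index number 3 in the user's table
    user_event="The user spoke to the robot, saying 'AAA'.",
)
print(prompt)
```

Keeping the fixed question separate from the state texts means only the variable parts change between turns, which matches the description's notion of a fixed sentence appended to the situation text.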

In this way, the robot 100 can change its behavior according to the index number corresponding to its emotion. This gives the user the impression that the robot 100 has a heart, which encourages the user to take actions such as talking to the robot.

The behavior determination unit 236 may also generate the robot's behavior content by adding text representing the contents of the history data 222 to the text representing the user's behavior, the user's emotion, and the robot's emotion, then appending the fixed sentence that asks what behavior the robot should take in response to the user's behavior, and inputting the result into the sentence generation model with a dialogue function. This allows the robot 100 to change its behavior according to history data representing the user's emotions and behavior, which gives the user the impression that the robot has individuality and encourages the user to take actions such as talking to the robot. The history data may further include the robot's emotions and behavior.
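One way to picture the history-augmented input is the sketch below. The format of the history data 222 is not specified in this passage, so representing each entry as a plain sentence, limiting the prompt to the most recent entries, and the "Past interactions:" label are all assumptions introduced for illustration.

```python
from typing import Sequence


def build_prompt_with_history(state_text: str,
                              history: Sequence[str],
                              question: str,
                              max_entries: int = 3) -> str:
    """Insert recent history entries between the state text and the question.

    Assumption: each history entry is already a sentence describing a past
    emotion or behavior; only the last `max_entries` entries are used.
    """
    recent = list(history)[-max_entries:] if max_entries > 0 else []
    parts = [state_text]
    if recent:
        parts.append("Past interactions: " + " ".join(recent))
    parts.append(question)
    return " ".join(parts)


example = build_prompt_with_history(
    state_text="The robot is in a very happy state.",
    history=["The user was sad yesterday.", "The user praised the robot."],
    question="How would you respond as the robot?",
)
print(example)
```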

The present invention has been described above using embodiments, but the technical scope of the present invention is not limited to the scope described in those embodiments. It will be clear to those skilled in the art that various modifications and improvements can be made to the above embodiments. It is clear from the claims that forms incorporating such modifications or improvements may also fall within the technical scope of the present invention.

It should be noted that the processes, such as operations, procedures, steps, and stages, in the devices, systems, programs, and methods shown in the claims, specification, and drawings may be executed in any order, unless the order is explicitly indicated by expressions such as "before" or "prior to", or unless the output of an earlier process is used in a later process. Even if an operational flow in the claims, specification, or drawings is described using "first", "next", and the like for convenience, this does not mean that the processes must be performed in that order.

1 Control system
20 Communication network
100 Robot
200 Sensor unit
201 Microphone
202 3D depth sensor
203 2D camera
204 Distance sensor
205 Acceleration sensor
206 Thermosensor
207 Touch sensor
210 Sensor module unit
211 Voice emotion recognition unit
212 Speech understanding unit
213 Facial expression recognition unit
214 Face recognition unit
220 Storage unit
221 Response rules
222 History data
230 User state recognition unit
232 Emotion determination unit
234 Behavior recognition unit
236 Behavior determination unit
238 Memory control unit
250 Behavior control unit
252 Control target
280 Communication processing unit
300 Server
400 Linked device

Claims

An electronic device comprising a control unit that recognizes an action of a user giving a presentation, determines its own action corresponding to the recognized action of the user, and controls a control target based on the determined action, wherein
the control unit analyzes the content of the presentation and, if a preset index is less than a predetermined threshold, uses a sentence generation model to generate action content that raises the content of the user's presentation to a predetermined level of completeness, and
upon receiving a voice from the user requesting that the content of the presentation be made more complete, the control unit uses the sentence generation model to generate action content for making the content of the presentation more complete.

The electronic device according to claim 1, wherein, when the user practices the presentation, the control unit performs an action to raise the content of the presentation to a predetermined level of completeness.

The electronic device according to claim 2, wherein, when the presentation by the user is recognized as a practice presentation, the control unit determines an action to enhance the emotion of the user giving the presentation.

The electronic device according to claim 2, wherein the control unit identifies the content of the presentation and determines a predetermined feedback action tailored to the identified content of the presentation.

The electronic device according to claim 2, wherein, when the user has finished practicing the presentation, the control unit makes an utterance related to the content of the presentation.

The electronic device according to claim 2, wherein the control unit analyzes the content of the presentation during the presentation practice, extracts portions where a predetermined indicator of completeness is less than a threshold, and determines an action regarding predetermined feedback for the extracted portions.

The electronic device according to claim 2, wherein, upon receiving a voice from the user requesting that the content of the presentation be improved, the control unit determines an action to improve the content of the presentation.

The electronic device according to claim 2 or 6, wherein the control unit uses, as the predetermined indicator of completeness of the presentation, one or a combination of typos, omissions, errors in content, completeness of content, the user's voice volume, presentation speed, line of sight, and changes in the user's emotions, and extracts portions where the indicator of completeness is less than a threshold value.

The electronic device according to claim 1, wherein the electronic device is mounted on a stuffed toy, or is connected wirelessly or by wire to a control target device mounted on the stuffed toy.