JP2004070876A

JP2004070876A - Conversation system and conversation processing program

Info

Publication number: JP2004070876A
Application number: JP2002233090A
Authority: JP
Inventors: Takashi Matsuda; 松田　隆
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2002-08-09
Filing date: 2002-08-09
Publication date: 2004-03-04

Abstract

PROBLEM TO BE SOLVED: To reduce the burden of creating a database and to realize conversation that flows naturally, is unique, and is enjoyable for the user.
SOLUTION: Sentences usable as conversation are extracted from existing text information that contains conversational sentences, such as novels, and time-related information indicating the temporal and topical distance between the sentences is attached to them to create a conversation sentence database 18. An article sentence database 19 is created in the same way from existing text information that contains no conversational sentences, such as news articles. In response to a user's utterance, a sentence appropriate as conversation is selected from the conversation sentence database 18 or the article sentence database 19 on the basis of the time-related information and is uttered. Using existing text information in this way makes the databases easy to create, and conversation using them flows naturally and is unique.
[Selection] Figure 1

Description

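【０００１】
【Technical Field of the Invention】
The present invention relates to a conversation system used in toys such as conversational robots, in video game machines and the like, with which a user can obtain enjoyment and comfort by conversing with a computer.
【０００２】
【Prior Art】
Most conversation systems used in video game machines, toys and the like conduct conversation along a predetermined scenario (hereinafter the "scenario method"). For the sake of feasibility, such a scenario is written so that the system first makes an utterance that narrows the topic, and so that the subsequent conversation branches as little as possible. As a result, the human (user) cannot take the lead, and the conversation tends to be commonplace or unnatural.
【０００３】
Two alternatives to the scenario method have therefore been considered: "artificial intelligence" style conversation systems and "artificial incompetence" (jinko muno, i.e. chatterbot) style conversation systems. An "artificial intelligence" style system parses the user's utterance to extract its meaning, understands its intent, and tries to produce a reply on that basis. Because it realizes conversation through processing close to human intelligence, it requires advanced technology and is difficult to adapt to every field. In practice it can be applied only to fields in which conversation is restricted, such as ticket sales or information retrieval.
【０００４】
An "artificial incompetence" style system, in contrast, mainly uses the approach commonly nicknamed "artificial incompetence". It does not parse the user's utterance or extract its meaning; it produces something that looks like conversation on the surface. In other words, it does not understand the user's utterance, yet the exchange still holds together as conversation. Such a system finds a specific pattern (keyword) in the user's utterance, compares it with the patterns registered in a database prepared in advance, and outputs the reply data associated with the matching pattern. For example, suppose the database holds the pattern 「野球」 (baseball) with the reply data 「私はＡＢＣチームのファンです。」 (I am a fan of team ABC). If the user says 「僕は野球が好きだ」 (I like baseball), the system searches the database using the keyword 「野球」 contained in the utterance and answers 「私はＡＢＣチームのファンです」.
【０００５】
Because an "artificial incompetence" style system needs no complex processing such as parsing and relies on pattern matching against a database, it can handle the elliptical and grammatically ambiguous sentences of real conversation. It also provides a basic mechanism for producing replies to the user's ordinary utterances (utterances not in a fixed form, as the scenario method requires), so the user can lead the conversation naturally.
【０００６】
【Problem to Be Solved by the Invention】
In the "artificial incompetence" style system described above, the content and size of the database strongly affect the quality of the conversation. If only dull content is registered, only dull conversation is possible; if little is registered, the same exchanges repeat. Building a database that is rich in both quality and quantity, however, takes an enormous amount of work. Moreover, because only a limited number of engineers take part in building it, the topics the system can respond to may be limited to those these people know well. It is also very difficult to anticipate the many ways a conversation can unfold and build the database accordingly, so the conversation often fails to flow naturally.
【０００７】
The present invention was made in view of the above, and its object is to provide a conversation system and a conversation processing program that reduce the burden of creating the database and realize conversation that flows naturally, is unique, and is enjoyable for the user.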
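【０００８】
【Means for Solving the Problem】
A conversation system of the present invention conducts conversation with a user and comprises: sentence extraction means for extracting sentences usable as conversation from existing text information; time-related information calculation means for calculating time-related information indicating the temporal and topical distance between the sentences extracted by the sentence extraction means; a database that stores each extracted sentence with the calculated time-related information attached; and conversation processing means for selecting from the database, on the basis of the time-related information, a sentence appropriate as conversation in response to the user's utterance, and uttering it.
【０００９】
With this configuration, the source is either existing text information that contains conversational sentences (for example novels, plays, film and drama scripts, records of rakugo or manzai, and records of actual conversations) or existing text information that contains none (for example news articles). Sentences usable as conversation are extracted from it and registered in the database with time-related information, indicating the temporal and topical distance between sentences, attached. In response to the user's utterance, a sentence appropriate as conversation is selected from this database on the basis of the time-related information and uttered. Using existing text information in this way makes the database easy to create, and conversation using it flows naturally and is unique.
【００１０】
Another conversation system of the present invention conducts conversation with a user and comprises: sentence extraction means for extracting sentences usable as conversation from existing text information; time-related information calculation means for calculating time-related information indicating the temporal and topical distance between the extracted sentences; a database that stores each extracted sentence with the calculated time-related information attached; keyword extraction means for extracting a keyword from the user's utterance; selection means for searching the database for a sentence containing the extracted keyword and selecting, as utterance candidates, sentences whose time-related information differs from that of the found sentence by no more than a predetermined value; and utterance processing means for making an utterance using a sentence selected by the selection means.
【００１１】
With this configuration, the database is built from existing text information in the same way as described above. In response to the user's utterance, a keyword that can trigger conversation is extracted from the utterance, the database is searched for a sentence containing that keyword, and sentences temporally and topically close to it are selected on the basis of the time-related information and used for the utterance. The database is therefore easy to create, and because a sentence appropriate as conversation is selected from it using the time-related information, the conversation flows naturally and is unique.
【００１２】
In the above conversation system, information indicating the date and time of the last utterance may be attached to each sentence in the database together with the time-related information, and the selection means may select utterance candidates only from sentences that have not been uttered within a predetermined number of days, as judged from this last-utterance information. This avoids repeating the same line frequently, for example by excluding sentences uttered within the last three days.
【００１３】
In the above conversation system, the utterance processing means may delete parts that are inappropriate as conversation from the sentence selected by the selection means before uttering it. When news articles are used as the existing text information, for example, expressions peculiar to news articles that sound unnatural in conversation are removed before the utterance, so the utterance sounds reasonably natural even though it is borrowed from a news article.
【００１４】
In the above conversation system, the utterance processing means may delete parts that are inappropriate as conversation from the selected sentence, divide the remaining sentence into several pieces according to the number of keywords it contains, and utter one of the pieces. When news articles are used, for example, cutting out part of the article sentence so that its amount of information approaches that of conversation makes it possible to produce conversation-like utterances from written text that was never intended for conversation.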
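【００１５】
Another conversation system of the present invention conducts conversation with a user and comprises: storage means for storing specific text information in which conversational and non-conversational sentences are mixed; search means for searching the stored text information for a sentence containing a keyword included in the user's utterance; judgment means for judging whether the sentence found is a conversational or a non-conversational sentence; first utterance processing means for, when the sentence is judged to be conversational, calculating time-related information indicating temporal and topical distance for the conversational sentences that follow it, selecting a sentence appropriate as conversation on the basis of that information, and uttering it; and second utterance processing means for, when the sentence is judged to be non-conversational, calculating the time-related information for the non-conversational sentences that follow it, selecting a sentence appropriate as conversation on the basis of that information, deleting parts inappropriate as conversation from it, and uttering it.
【００１６】
With this configuration, utterances can be produced from specific text information in which conversational and non-conversational sentences are mixed. Such text information is, for example, an electronic book, in which conversational sentences are mixed with narrative text. If the keyword in the user's utterance is found in a conversational sentence, time-related information is calculated over conversational sentences, and a sentence appropriate as conversation is selected on that basis and uttered. If the keyword is found in a non-conversational sentence, time-related information is calculated over non-conversational sentences, a sentence appropriate as conversation is selected on that basis, and parts inappropriate as conversation are deleted from it before it is uttered.
【００１７】
Another conversation system of the present invention conducts conversation with a user and comprises: storage means for storing specific dictionary information consisting of headwords and corresponding explanations; search means for searching the stored dictionary information for a headword containing a keyword included in the user's utterance; and first utterance processing means for extracting from the dictionary information the explanation corresponding to the headword found, deleting parts inappropriate as conversation from it, and uttering it.
【００１８】
With this configuration, utterances can be produced from specific dictionary information such as a Japanese-language dictionary or an encyclopedia, which consists of headwords and corresponding explanations. In response to the user's utterance, a headword containing the keyword in the utterance is searched for, the corresponding explanation is extracted from the dictionary information, and parts inappropriate as conversation are deleted from it before it is uttered.
【００１９】
The above conversation system may further comprise second utterance processing means that, when no headword containing the keyword in the user's utterance exists in the dictionary information, selects a headword at random from the dictionary information and continues the conversation by making an utterance using the corresponding explanation. The conversation can thus continue even when the keyword in the user's utterance is not among the headwords.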
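【００２０】
【Embodiments of the Invention】
Embodiments of the present invention are described below with reference to the drawings.
【００２１】
(First Embodiment)
FIG. 1 is a block diagram showing the hardware configuration of a conversation system according to a first embodiment of the present invention. The system responds to the user's utterances as if a human were replying and so carries the conversation forward; it is installed in, for example, toys such as conversational robots or video game machines.
【００２２】
FIG. 1 shows the basic configuration when the system is realized on a general-purpose computer. It consists of a CPU 11, a voice input unit 12, an A/D converter 13, a voice output unit 14, a D/A converter 15, a work memory 16, and a nonvolatile memory 17.
【００２３】
The CPU 11 reads a program stored in the nonvolatile memory 17 or elsewhere and executes predetermined processing according to the procedure described in it. The voice input unit 12 is a microphone for inputting the user's voice during conversation. The user's voice (analog data) input from the voice input unit 12 is converted into digital data by the A/D converter 13 and taken into the CPU 11. The CPU 11 performs processing using the work memory 16 and outputs a reply to the user's utterance via the D/A converter 15. The D/A converter 15 converts the voice data generated by the CPU 11 into analog data and supplies it to the voice output unit 14, which is a speaker that outputs it externally.
【００２４】
The nonvolatile memory 17 is, for example, a flash memory: a rewritable memory whose contents are retained when the power is turned off. In addition to a program 17a for realizing the conversation system of the present invention, it holds the information needed for conversation processing: a conversation sentence database 18, an article sentence database 19, a keyword history table 20, a previous-utterance article sentence buffer 21, and a cumulative uttered character counter 22. The program 17a includes a program for executing the database creation processes described later.
【００２５】
The conversation sentence database 18 is created by extracting only the conversational parts from existing text information that contains conversational sentences, such as novels, plays, film and drama scripts, records of rakugo or manzai, and records of actual conversations. The article sentence database 19 is created by extracting the parts usable as conversation from existing text information that contains no conversational sentences (written-language text), such as news articles.
【００２６】
The keyword history table 20 holds, as a history, the keywords found in the user's utterances together with the date and time at which each was found. Each time a keyword is found during conversation, it is written to the keyword history table 20. When the table becomes full, the entries with the oldest dates are overwritten first. The previous-utterance article sentence buffer 21 holds the article text whenever an utterance (reply) to the user is made using text registered in the article sentence database 19. The cumulative uttered character counter 22 counts the number of characters of article text that have been used in utterances.
【００２７】
In this configuration, the user's voice input from the voice input unit 12 is converted into digital data by the A/D converter 13 and supplied to the CPU 11. The CPU 11 returns a reply to the user's utterance by performing "speech recognition processing", "conversation processing", and "read-aloud processing" in that order. First, speech recognition processing converts the voice into characters and produces a sentence in text form; kana-kanji conversion is assumed to be performed at the same time. Next, conversation processing is applied to this text to create a reply to the user's utterance, which read-aloud processing then reads aloud. The voice data generated by the CPU 11 as the reply is converted into analog data by the D/A converter 15 and output through the voice output unit 14.
【００２８】
Speech recognition processing and read-aloud processing are assumed to use generally known techniques, and their detailed description is omitted. The following description centers on conversation processing.
【００２９】
First, the processes for creating the conversation sentence database 18 and the article sentence database 19 used in conversation processing are described. These database creation processes may be executed as a function of the CPU 11 of this system, or on, for example, a personal computer connected to this system. In the latter case, the conversation sentence database 18 and article sentence database 19 created there are written into the nonvolatile memory 17 of this system and referred to during conversation processing. In this embodiment, the CPU 11 of this system is assumed to perform the creation processes described below by reading the program 17a.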
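【００３０】
(a) Conversation sentence database creation process
The conversation sentence database creation process builds a database (the conversation sentence database 18) from the conversational sentences contained in works such as novels, plays, film and drama scripts, records of rakugo or manzai, and records of actual conversations, all of which were originally written with no relation to the conversation at hand. The conversational parts are extracted from this kind of existing text information, and information indicating the temporal and topical distance between the conversational sentences (hereinafter, time-related information) is calculated and registered in the conversation sentence database 18, so that an appropriate conversational sentence can be selected on the basis of the time-related information during conversation processing.
【００３１】
FIG. 2 is a flowchart showing the conversation sentence database creation process of the conversation system in the first embodiment. To keep the description simple, the text file of a novel is assumed to reside in, for example, the nonvolatile memory 17 and to be read in for processing. When the target file is a play, a script, a conversation record or the like, the details of the processing differ slightly.
【００３２】
There are several novel text files, and they are processed one at a time. When the target is a novel, the conversational parts can be extracted using the quotation brackets 「」 and 『』. Splitting into sentences can use the full stop (。) and similar marks. The time-related information indicates how close conversational sentences are in time and in topic: when two values are close, the sentences were probably uttered at close times and concern the same topic. In a novel the time at which each conversational sentence was uttered is not known, so the information is derived from line-feed codes and the like. Each line feed is taken to weaken the temporal and semantic connection between sentences slightly. A blank line in a novel is often placed to mark a break between passages, so a blank line separates the time-related information of the sentences before and after it by a large amount (for example 10). A change of file means a completely different topic, so an even larger value (for example 100) is added.
【００３３】
The conversation sentence database creation process is now described in detail.
【００３４】
As shown in FIG. 2, the CPU 11 first sets the time-related information to its initial value 0, opens the first text file (step A11), and reads the text from the beginning up to the next line-feed code (step A12). The CPU 11 then checks whether the text read is a blank line (step A13). If it is not a blank line (NO in step A13), the CPU 11 extracts the conversational parts from the text using the quotation marks 「」 and 『』 (step A14). If there is a conversational part (YES in step A15), the CPU 11 splits it into sentences using full stops and similar marks, attaches the time-related information and a last-utterance date and time to each sentence, and registers them in the conversation sentence database 18 (step A16). The last-utterance date and time is used in conversation processing; at this point empty data is registered.
【００３５】
Next, the CPU 11 adds 1 to the current time-related information (step A17), reads the text up to the next line-feed code, and repeats the same processing (steps A18 to A12). If the text read is a blank line (YES in step A13), the CPU 11 adds 10 to the time-related information (step A19) and moves to the next text.
【００３６】
When all the text has been processed (YES in step A18) and an unprocessed text file remains (YES in step A20), the same processing is repeated for that file. Because the topic is then completely different, the CPU 11 adds 100 to the time-related information when it opens the next text file (step A21).
【００３７】
In this way, conversational sentences are extracted from text files that contain them and registered sentence by sentence in the conversation sentence database 18, together with the time-related information and the last-utterance date and time.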
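The steps A11 to A21 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Record` type, the function name `build_conversation_db`, and the choice of sentence-ending marks (。？！) are hypothetical, and each file is passed in as an already decoded string.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumption: speech is enclosed in 「」 or 『』 and never spans a line feed.
QUOTED = re.compile(r"「([^」]*)」|『([^』]*)』")
# Assumption: a sentence ends at 。, ？ or ！ (the source only names 。 "and the like").
SENTENCE = re.compile(r"[^。？！]+[。？！]?")


@dataclass
class Record:
    text: str
    time_info: int                     # time-related information
    last_spoken: Optional[str] = None  # empty at creation (step A16)


def build_conversation_db(files: List[str]) -> List[Record]:
    db: List[Record] = []
    t = 0                                  # step A11: initial value 0
    for index, content in enumerate(files):
        if index > 0:
            t += 100                       # step A21: new file, new topic
        for line in content.splitlines():  # step A12: text up to a line feed
            if not line.strip():           # step A13: blank line
                t += 10                    # step A19
                continue
            for match in QUOTED.finditer(line):            # step A14
                speech = match.group(1) or match.group(2) or ""
                for sentence in SENTENCE.findall(speech):  # step A16
                    db.append(Record(sentence, t))
            t += 1                         # step A17
    return db
```

Sentences from the same line share one value, a narrative line between two quoted lines adds 1, and a blank line adds 10, so the gap in `time_info` grows with the structural distance in the source text.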
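【００３８】
FIG. 3 shows an example of a novel text file to be processed by the conversation sentence database creation process (an excerpt from Osamu Dazai's "Good-bye"). Each row is the text up to a line-feed code, and "…" indicates that there is further data before and after the text shown. A row as printed or displayed differs from a row delimited by line-feed codes: a sentence too long to fit on one row is wrapped when displayed or printed even though it contains no line-feed code. The time-related information in the figure is shown for reference only and is not present in the actual text file. The values shown are calculated on the assumption that the time-related information of the sentence 「ケンカするほど深い仲、ってね。」 is 5000. 「怪力（四）」 is a chapter title. A chapter boundary of this kind may be recognized and a somewhat larger value (for example 30) added to the time-related information. In this example the title always has blank lines before and after it, and those are used instead.
【００３９】
FIG. 4 shows an example of the conversation sentence database 18 created from this text file. The conversational parts of the novel shown in FIG. 3 are registered in it sentence by sentence, each with time-related information and a last-utterance date and time attached. The last-utterance date and time is used in conversation processing and is blank here.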
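【００４０】
(b) Article sentence database creation process
The article sentence database creation process extracts sentences suitable for conversation from news articles and the like and builds a database (the article sentence database 19). As in the conversation sentence database creation process, sentences are extracted from the text information, and time-related information indicating the temporal and topical distance between them is calculated and registered in the article sentence database 19, so that an appropriate sentence can be chosen on the basis of the time-related information during conversation processing. News articles, however, use expressions peculiar to news and, unlike novels, contain no conversational sentences, so the text has to be edited into a form suitable for conversation.
【００４１】
FIG. 5 is a flowchart showing the article sentence database creation process of the conversation system in the first embodiment. The description assumes that news articles have been downloaded from news sites on the Internet and already reside in, for example, the nonvolatile memory 17 as several files in HTML (HyperText Markup Language) format.
【００４２】
Web page data contains HTML tags in addition to the displayed characters, so the tags are deleted. Besides the sentences that express the content of the article, a page also contains headlines of other articles with links, names of other web pages and so on. These contain no full stop (。), which is used here to identify and delete them. Web articles often use blank lines even where there is no major break in content, so the value added to the time-related information when a blank line is found is 1. Where there is a major break in content, several blank lines are used together with headings and the like, and the time-related information is incremented for each of them, so this causes no problem. In addition, to show a process that differs from the creation of the conversation sentence database 18, the text up to a line-feed code is recorded in a single record even if it consists of several sentences.
【００４３】
The article sentence database creation process is now described concretely.
【００４４】
As shown in FIG. 5, the CPU 11 first sets the time-related information to its initial value 0, opens the first HTML file (step B11), and removes the tags from it so that only the text of the news article displayed on the screen remains (step B12).
【００４５】
The CPU 11 then reads the text up to the next line-feed code (step B13) and checks whether it is a blank line (step B14). If it is not a blank line (NO in step B14), the CPU 11 checks whether the text contains a full stop (step B15). If it does (YES in step B15), the text is taken to be a sentence, and the CPU 11 extracts it, attaches the time-related information and a last-utterance date and time, and registers it in the article sentence database 19 (step B16). The last-utterance date and time is used in conversation processing; at this point empty data is registered. If there is no full stop (NO in step B15), the text is judged to be a headline or a link and is discarded, and processing proceeds to step B17.
【００４６】
Next, the CPU 11 adds 1 to the current time-related information (step B17), reads the text up to the next line-feed code, and repeats the same processing (steps B18 to B13).
【００４７】
When all the text has been processed (YES in step B18) and an unprocessed HTML file remains (YES in step B19), the same processing is repeated for that file. Because it is a completely different article, the CPU 11 adds 100 to the time-related information when it opens the next HTML file (step B20).
【００４８】
In this way, text usable as conversation is extracted from news articles and the like and registered in the article sentence database 19 together with the time-related information and the last-utterance date and time. FIG. 6 shows an example of the article sentence database 19 created by this process. The records whose time-related information is 10000 and 10001 and the records whose time-related information runs from 10102 to 10106 come from articles in different files (articles at different addresses on the web). The values from 10102 to 10106 increase in steps of two because these sentences, although they form one continuous passage, were separated by blank lines.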
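The article-side steps B11 to B20 can be sketched in the same style. This is a rough illustration under assumptions: tags are stripped with a simple regular expression (a real page would call for a proper HTML parser), records are plain dictionaries, and the name `build_article_db` is hypothetical.

```python
import re
from html import unescape
from typing import Dict, List

# Assumption: a crude tag stripper is enough for illustration.
TAG = re.compile(r"<[^>]+>")


def build_article_db(html_files: List[str]) -> List[Dict]:
    db: List[Dict] = []
    t = 0                                   # step B11: initial value 0
    for index, html in enumerate(html_files):
        if index > 0:
            t += 100                        # step B20: different article
        text = unescape(TAG.sub("", html))  # step B12: keep displayed text
        for line in text.splitlines():      # step B13
            line = line.strip()
            # Steps B14/B15: keep only non-blank lines containing a full stop;
            # headlines and link captions have none and are discarded.
            if line and "。" in line:
                # Step B16: the whole line is one record, even if it
                # holds several sentences.
                db.append({"text": line, "time_info": t, "last_spoken": None})
            t += 1                          # step B17 (blank lines also add 1)
    return db
```

Because both kept and blank lines add 1, two paragraphs separated by one blank line end up two apart, which matches the step-of-two pattern described for FIG. 6.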
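【００４９】
(c) Conversation processing
Next, conversation processing using the conversation sentence database 18 and the article sentence database 19 is described.
【００５０】
FIGS. 7 and 8 are flowcharts showing the conversation processing of the conversation system in the first embodiment. FIGS. 9 to 11 are flowcharts showing the record selection process, the article sentence cutting and utterance process, and the article sentence inappropriate-part deletion process included in this conversation processing.
【００５１】
Before the processing shown in these flowcharts is described, the conversation processing of this system is explained with concrete examples to make it easier to understand. A "keyword" is a word that suggests the content of an utterance more strongly than other words. In this embodiment, to keep the description simple, a word consisting of two or more kanji, katakana characters, digits, or a combination of these is recognized as a keyword that can trigger conversation.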
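The keyword rule just stated (two or more kanji, katakana, digits, or a mix) can be approximated with a single regular expression. The character ranges below are an assumption about what counts as kanji and katakana (the CJK Unified Ideographs block, the iteration mark 々, katakana ァ to ヶ plus the long-vowel mark ー, and half-width and full-width digits); the source does not define them, and the name `extract_keywords` is hypothetical.

```python
import re
from typing import List

# Assumption: these Unicode ranges stand in for "kanji, katakana, digits".
KEYWORD = re.compile(r"[\u4e00-\u9fff々ァ-ヶー0-9０-９]{2,}")


def extract_keywords(utterance: str) -> List[str]:
    """Return every maximal run of two or more keyword characters."""
    return KEYWORD.findall(utterance)
```

A single kanji followed by hiragana, such as 僕 in 「僕は」, is not picked up, which is why short replies like 「うん」 yield no keyword at all.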
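【００５２】
For example, if the user says 「僕は音楽が好きだ」 (I like music), the keyword in it is 「音楽」 (music). The conversation sentence database 18 is therefore searched for sentence data containing the keyword 「音楽」. With the example database shown in FIG. 4 (assuming that the last-utterance fields are empty as in the figure and that no sentence containing 「音楽」 exists other than those shown), the record selection process (FIG. 9) first extracts the sentences 「あなたにも音楽がわかるの？」 and 「ばか、僕の音楽通を知らんな、君は。」, and also extracts 「音痴みたいな顔をしているけど。」, 「名曲ならば、一日一ぱいでも聞いていたい。」, 「あの曲は、何？」 and 「ショパン。」, whose time-related information differs from that of those sentences by 2 or less.
【００５３】
One of these is chosen at random and uttered. All of them are remarks about music, so any of them suits the topic. This gives the user the illusion that the conversation system has understood the utterance. The wording itself is unique because it is excerpted from a novel, and the conversation flows naturally.
【００５４】
One possible method would be to reply using only the sentences of the character who spoke the sentence containing the keyword. In this embodiment, however, the reply is sought among the sentences of both that character and the character conversing with him or her. This yields more variety in replies, and a sentence containing the keyword itself (「音楽」 in the example) is sometimes returned, so the user strongly feels understood and is satisfied. If the keyword always appeared in the reply, it would feel unnatural; in this embodiment, sentences that follow the keyword sentence but do not contain the keyword are also uttered, so that unnaturalness does not arise. Such replies are still drawn from the flow of a conversation that contains the keyword, so their content is usually natural.
【００５５】
The time-related information attached to each conversational sentence in the conversation sentence database 18 appropriately expresses the temporal and topical difference between sentences. For example, if the user's utterance contains the keyword 「ケンカ」 (quarrel) and 「ケンカするほど深い仲、ってね。」 becomes a candidate, the record immediately after it, 「ピアノが聞こえるね。」, does not become a candidate.
【００５６】
In this embodiment, what a novel treats as a single utterance, such as 「あなたにも音楽がわかるの？音痴みたいな顔をしているけど。」, is split into two sentences and only one of them is uttered. Two sentences naturally carry more information than one, and the more information there is, the more likely it is to include something that conflicts with the user's utterance. Uttering sentence by sentence lowers that likelihood, so conversation from a novel, written with no relation to the conversation the system is holding, can be used in many situations.
【００５７】
The date and time at which a sentence was last uttered is recorded in the conversation sentence database 18 and used in record selection, which prevents the same sentence from being uttered so often that the user tires of it.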
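【００５８】
When no reply can be found in the conversation sentence database 18, the system uses the article sentence database 19, which covers a wider range of topics. The sentence is selected in the same way as for the conversation sentence database 18. If the selected sentence were uttered as it is, however, it would sound very unnatural as conversation, so the article sentence cutting and utterance process (FIG. 10) reshapes it into something natural.
【００５９】
First, expressions peculiar to news that would be unnatural if used directly in conversation are deleted by the article sentence inappropriate-part deletion process (FIG. 11).
【００６０】
For example, suppose the user says 「僕は＊＊＊＊が好きなんだ」 (I like ＊＊＊＊) and the record selected in response is 「米大リーグ、ア・リーグの最優秀選手（ＭＶＰ）に選出された＊＊＊＊＊の＊＊＊＊外野手（２８）が９日、関西空港着の航空機で＊＊夫人とともに帰国した。首位打者と盗塁王を獲得し、シーズン２４２安打の新人最多安打記録を９０年ぶりに更新。」 (roughly: outfielder ＊＊＊＊ (28) of ＊＊＊＊＊, chosen as the American League's most valuable player (MVP), returned home with his wife on the 9th on a flight arriving at Kansai Airport; he won the batting and stolen-base titles and, with 242 hits in the season, broke the rookie hit record for the first time in 90 years). The parts written ＊＊＊＊ are actually names of people, places and so on; they are masked with asterisks here to avoid printing proper nouns.
【００６１】
In this example, 「（ＭＶＰ）」, 「（２８）」 and 「９日」 are deleted as inappropriate parts. Reading a parenthesis aloud as "parenthesis" is plainly unnatural in conversation. Even if the parentheses themselves were skipped automatically, reading their contents would still be unnatural: in everyday conversation people do not follow a noun with an appositive noun, nor state a person's age immediately after the noun that refers to that person. A date expression such as 「９日」 (the 9th) is meaningful within the month in which the news appeared but meaningless in any other month, so it too is deleted.
【００６２】
After these deletions, the text becomes the following.
【００６３】
「米大リーグ、ア・リーグの最優秀選手に選出された＊＊＊＊＊の＊＊＊＊外野手が、関西空港着の航空機で＊＊夫人とともに帰国した。首位打者と盗塁王を獲得し、シーズン２４２安打の新人最多安打記録を９０年ぶりに更新。」
Even so, this is still entirely unnatural as an utterance in conversation. The main reason is that it carries too much information for a single utterance; in everyday conversation one utterance carries very little. In this embodiment the amount of information in an utterance is measured by the number of keywords.
【００６４】
For example, 「あなたにも音楽がわかるの？音痴みたいな顔をしているけど。」 contains only two keywords (「音楽」 and 「音痴」). The news text quoted above clearly contains far more than two. Therefore a character position is chosen at random, and the clause containing it is cut out using punctuation (if the sentence containing the chosen position has no comma, the whole sentence is cut out). Here a "clause" means a part delimited by punctuation marks that contains no further punctuation.
【００６５】
This cuts out, for example, 「米大リーグ」 or 「ア・リーグの最優秀選手に選出された＊＊＊＊＊の＊＊＊＊外野手が」. If 「米大リーグ」 is cut out, it contains only one keyword and is used as the reply as it is. If the clause 「ア・リーグの最優秀選手に選出された＊＊＊＊＊の＊＊＊＊外野手が」 is cut out, it contains five keywords, which is still too many, so a character position within the clause is again chosen at random and this time the part containing that character is cut out using keywords as delimiters. If the part cut out still carries too much information, this is repeated until a part such as 「ア・リーグの最優秀選手に」 or 「＊＊＊＊＊の＊＊＊＊外野手が」 is obtained.
【００６６】
A clause cannot be divided further at punctuation, but because it is cut at keyword boundaries the resulting part is relatively natural Japanese. Used as a reply, it gives a somewhat curt and incomplete impression, but it is a vague reply in which any mismatch with the user's utterance is inconspicuous. The part cut out may also be edited by adding a sentence ending, as in 「米大リーグだな」 or 「＊＊＊＊＊の＊＊＊＊外野手がね」.
【００６７】
Because this technique utters only part of the related information, the user comes to want to hear the rest. For example, if the reply to the user's remark about 「野球」 (baseball) is 「＊＊＊＊＊の＊＊＊＊外野手が」, the user wants to hear what follows, and if the reply is 「９０年ぶりに更新」 (broken for the first time in 90 years), the user wonders what was broken after 90 years. Deliberately limiting the amount of information in this way adds a touch of mystery to the conversation and arouses the user's curiosity.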
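【００６８】
If no reply can be produced from either the conversation sentence database 18 or the article sentence database 19 using the keyword in the user's utterance, or if the utterance contains no keyword, and the system simply does not respond, it may turn out to be a very taciturn conversation system. The process shown in FIG. 8 as conversation processing (2) handles these cases.
【００６９】
When an utterance is made using the article sentence database 19, the text selected for that utterance is recorded in the previous-utterance article sentence buffer 21 and used for later utterances. An utterance based on the article sentence database 19 uses only a small part of the whole text in order to keep its amount of information conversational, so the remaining parts are uttered later.
【００７０】
This has three main effects.
(1) It gives some satisfaction to the interest and curiosity aroused by the mysterious quality of utterances based on the article sentence database 19.
(2) It gives the user the feeling of conversing continuously about one topic.
(3) Even if the selected article text contains information that conflicts with the user's utterance, releasing it little by little reduces the resulting sense of incongruity.
【００７１】
Effect (1) is achieved as the information that was withheld is gradually revealed. Effect (2) removes the defect, typical of "artificial incompetence", of the topic jumping too far. Users find it unpleasant when the topic is abruptly changed right after they speak, but much less so when the topic drifts vaguely over time. And when a mismatch appears within information that is limited and released little by little, users tend to wonder what it means and want to ask about it rather than feel displeasure. Effect (3) exploits these tendencies. If all the information in the article text were revealed, however, much of it would conflict with the intent of the user's utterance, and effect (3) could no longer cover for it. A limit is therefore set, and the utterances stop when it is reached. In this embodiment the cumulative uttered character counter 22 counts the uttered characters, and the utterances stop when the count reaches one quarter of the whole text.
【００７２】
With this mechanism, the earlier example allows a conversation such as the following.
【００７３】
User: 「僕は＊＊＊＊選手が好きなんだ」 (I like the player ＊＊＊＊.)
Conversation system: 「＊＊＊＊＊の＊＊＊＊外野手が」 (Outfielder ＊＊＊＊ of ＊＊＊＊＊...)
User: 「君も知っているんだね」 (So you know him too.) (no keyword)
Conversation system: 「首位打者と盗塁王を」 (The batting title and the stolen-base title...) (a conventional "artificial incompetence" system would probably have moved to a completely different topic or fallen silent here, whereas the present invention can continue the same topic)
User: 「うん」 (Yeah.) (no keyword)
Conversation system: 「関西空港着の航空機で」 (On a flight arriving at Kansai Airport...)
User: 「はあ？」 (Huh?) (no keyword)
Conversation system: 「９０年ぶりに更新」 (Broken for the first time in 90 years.)
User: 「え？」 (What?) (no keyword)
Conversation system: 「シーズン２４２安打」 (242 hits in the season.)
User: 「すごい記録だよね」 (That's an amazing record, isn't it?) (the next utterance is produced using the keyword 「記録」)
In this way, although some parts are unclear or mismatched (and those parts in particular give the user a sense of mystery and stimulate curiosity), the utterances stay on the same topic, and a conversation responding to the opening 「僕は＊＊＊＊選手が好きなんだ」 is established. Because each utterance is vague, the user also tends to fill in the missing information in a convenient way. The article quoted here is a news report that the baseball player ＊＊＊＊ returned home. It was written with no relation to the user's utterance 「僕は＊＊＊＊選手が好きなんだ」. Text written with no relation to the conversation can thus be used in conversation.
【００７４】
If the previous-utterance article sentence buffer 21 holds no text, an utterance is made using a recent keyword recorded in the keyword history table 20. This further lowers the likelihood of ending without an utterance, and at the same time gives the user the impression of conversing continuously about the same topic. This too serves to reduce the defect of the topic jumping too far. For example, in the continuation of the conversation that began with 「僕は＊＊＊＊選手が好きなんだ」, after an exchange based on the keyword 「記録」 the system can resume conversation based on the keyword 「＊＊＊＊選手」 that remains in the keyword history table 20.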
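【００７５】
The concrete procedure for realizing the conversation system described above is explained in detail below with reference to the flowcharts of FIGS. 7 to 11. The CPU 11 of this system executes the processing shown in these flowcharts by reading the program.
【００７６】
As shown in FIG. 7, when the conversation processing of this system is started, the CPU 11 first extracts keywords from the user's utterance (step C11). Specifically, it looks for keywords that can trigger conversation in the text data obtained by speech recognition of the user's voice data input through the voice input unit 12. A keyword here is a word that suggests the content of the user's utterance more strongly than other words. In this embodiment, a word consisting of two or more kanji, katakana characters, digits, or a combination of these is extracted as a keyword. If the user's utterance contains such a keyword (YES in step C12), the CPU 11 writes the extracted keyword, together with the current date and time, into the keyword history table 20 in the nonvolatile memory 17 (step C13), and then performs the record selection process on the conversation sentence database 18 using that keyword (step C14). The keyword history table 20 is used in conversation processing (2) of FIG. 8, described later.
【００７７】
As shown in FIG. 9, in the record selection process the CPU 11 searches the conversation sentence database 18 using the extracted keyword (step D11). If the sentence data registered in the conversation sentence database 18 includes a record containing the keyword (YES in step D12), the CPU 11 extracts that record and also extracts the records whose time-related information is close to it, limited to records whose last-utterance date and time is not within a predetermined number of days (step D13). A record with close time-related information is one that is temporally and topically close; concretely, one whose time-related information differs from that of the found record by 2 or less. A record whose last-utterance date and time is not within the predetermined number of days is one that has not been uttered recently; concretely, one that has not been uttered within the last three days. If matching records can be extracted from the conversation sentence database 18 (YES in step D14), the CPU 11 takes them as utterance candidates and selects one of them at random (step D15). If only one record was extracted, that record is selected.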
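A possible shape for the record selection process of FIG. 9 is sketched below. It is an illustration under assumptions, not the flowchart itself: records are dictionaries with hypothetical keys `text`, `time_info` and `last_spoken`, candidates are taken only from the keyword sentence onward (as the first embodiment is described elsewhere in this document), and the sample time values in the usage note are invented because FIG. 4 is not reproduced here.

```python
import random
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def candidates(db: List[Dict], keyword: str, now: datetime,
               max_diff: int = 2, quiet_days: int = 3) -> List[Dict]:
    # Steps D11/D12: records whose text contains the keyword.
    hit_times = [r["time_info"] for r in db if keyword in r["text"]]
    result = []
    for r in db:
        # Step D13: at most max_diff after a keyword record (difference 0
        # keeps the keyword record itself).
        near = any(0 <= r["time_info"] - t <= max_diff for t in hit_times)
        spoken = r["last_spoken"]
        recently = spoken is not None and now - spoken < timedelta(days=quiet_days)
        if near and not recently:
            result.append(r)
    return result


def select_record(db: List[Dict], keyword: str, now: datetime,
                  rng=random) -> Optional[Dict]:
    found = candidates(db, keyword, now)   # step D14
    if not found:
        return None
    chosen = rng.choice(found)             # step D15: random pick
    chosen["last_spoken"] = now            # bookkeeping done after uttering
    return chosen
```

With six music-related sentences at hypothetical values 5015 to 5018 and the 「ケンカ」 sentence at 5000, the keyword 「音楽」 yields all six music sentences, while 「ケンカ」 yields only its own sentence because the next record lies more than 2 away.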
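【００７８】
Returning to FIG. 7, when the record selection process has selected a record from the conversation sentence database 18 as the utterance candidate, the CPU 11 utters the sentence data of that record as the reply to the user (step C15). Specifically, it generates voice data corresponding to the sentence data, converts it into an analog waveform in the D/A converter 15, and reads it aloud through the voice output unit 14. The CPU 11 then writes the current date and time into the last-utterance date and time field of the selected record in the conversation sentence database 18 (step C16). If the record selection process selected no record from the conversation sentence database 18, the CPU 11 performs the record selection process on the article sentence database 19 using the same keyword (step C17). This record selection process is the same as in FIG. 9 except that the article sentence database 19 takes the place of the conversation sentence database 18.
【００７９】
If a record is selected from the article sentence database 19 as the utterance candidate, the CPU 11 applies the article sentence cutting and utterance process to its sentence data, reshaping it into a form that is natural as conversation before uttering it as the reply to the user (step C18). The CPU 11 then writes the current date and time into the last-utterance date and time field of the selected record in the article sentence database 19 (step C19).
【００８０】
As shown in FIG. 10, in the article sentence cutting and utterance process the CPU 11 first stores the sentence data (article text) extracted from the article sentence database 19 in the previous-utterance article sentence buffer 21 in the nonvolatile memory 17 (step E11). The CPU 11 then performs the article sentence inappropriate-part deletion process on the sentence data held in the buffer and deletes the parts that are inappropriate as conversation (step E12). Specifically, as shown in FIG. 11, the CPU 11 looks for parentheses in the sentence data and deletes the parenthesis symbols and the part enclosed by them (step F11). It also looks for expressions of date and time in the sentence data and deletes them (step F12), and further looks for wording peculiar to news, such as 「＊＊＊＊通信社によると」 (according to the ＊＊＊＊ news agency) or 「＊＊＊＊新聞社の調べたところによると」 (according to an investigation by the ＊＊＊＊ newspaper), and deletes it (step F13).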
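Steps F11 to F13 can be approximated with three regular expressions. The patterns below are assumptions chosen to reproduce the examples in this document (parenthesized asides, day-of-month expressions such as 「９日」, and attributions ending in 「によると」); the source does not specify how such expressions are recognized, and the function name is hypothetical.

```python
import re

# Step F11: full-width or half-width parentheses and their contents.
PARENTHESES = re.compile(r"（[^（）]*）|\([^()]*\)")
# Step F12 (assumption): "N日" optionally preceded by "N月".
DATES = re.compile(r"(?:[0-9０-９]+月)?[0-9０-９]+日")
# Step F13 (assumption): an attribution clause ending in によると.
ATTRIBUTION = re.compile(r"[^、。]*によると、?")


def remove_inappropriate_parts(text: str) -> str:
    """Delete news-specific expressions that sound unnatural when spoken."""
    text = PARENTHESES.sub("", text)
    text = DATES.sub("", text)
    text = ATTRIBUTION.sub("", text)
    return text
```

The date pattern is deliberately narrow: it leaves 「９０年ぶり」 and 「２４２安打」 untouched because neither ends in 日, and it does not touch kanji numerals such as 「一日」.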
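【００８１】
After the parts inappropriate as conversation have been deleted, the CPU 11 counts the keywords contained in the remaining sentence data. If there are n or more (here n = 3) (YES in step E13), it judges the text inappropriate as a conversational sentence and shortens it into a suitable form as follows.
【００８２】
The CPU 11 designates a character position in the sentence data at random and cuts out the clause or sentence containing that character, using punctuation (step E14). It then counts the keywords in the clause or sentence cut out, and if there are n or more (YES in step E15), it cuts the clause still shorter, this time using keywords as delimiters (step E16). This is repeated until the number of keywords is less than n, that is, until it is 2 or fewer.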
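The shortening in steps E13 to E16 might look like the sketch below. It makes several assumptions that the source leaves open: keywords are detected with a hypothetical regular expression, only 、 and 。 count as punctuation, and instead of repeating the keyword-boundary cut, the fragment is taken in one step as at most n-1 consecutive keyword segments starting at the one that contains the random position.

```python
import random
import re
from typing import List

# Assumption: same keyword rule as in the text (2+ kanji/katakana/digits).
KEYWORD = re.compile(r"[\u4e00-\u9fff々ァ-ヶー0-9０-９]{2,}")
PUNCT = "、。"


def count_keywords(text: str) -> int:
    return len(KEYWORD.findall(text))


def clause_at(text: str, pos: int) -> str:
    """Return the punctuation-delimited span containing index pos."""
    start = max(text.rfind(p, 0, pos) for p in PUNCT) + 1
    ends = [i for i in (text.find(p, pos) for p in PUNCT) if i != -1]
    return text[start:min(ends) if ends else len(text)]


def keyword_segments(text: str) -> List[str]:
    """Split so that each segment starts at a keyword and holds exactly one."""
    starts = [m.start() for m in KEYWORD.finditer(text)]
    if not starts:
        return [text]
    starts[0] = 0  # leading non-keyword text joins the first segment
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


def cut_for_speech(text: str, n: int = 3, rng=random) -> str:
    if count_keywords(text) < n:            # step E13: already short enough
        return text
    positions = [i for i, c in enumerate(text) if c not in PUNCT]
    piece = clause_at(text, rng.choice(positions))   # step E14
    if count_keywords(piece) < n:           # step E15
        return piece
    segments = keyword_segments(piece)      # step E16
    pos, seen = rng.randrange(len(piece)), 0
    for index, segment in enumerate(segments):
        seen += len(segment)
        if pos < seen:
            break
    return "".join(segments[index:index + n - 1])
```

Because each segment holds exactly one keyword and keeps the particles that follow it, a two-segment fragment such as 「ア・リーグの最優秀選手に」 stays under the limit of n = 3 while still ending on a particle.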
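【００８３】
The CPU 11 utters the sentence, clause or fragment finally obtained in this way as the reply to the user (step E17). The CPU 11 also adds the number of characters uttered this time to the cumulative uttered character counter 22 (step E18) and judges whether the value of the counter is at or below a predetermined value (step E19). Specifically, it judges whether the cumulative number of uttered characters is at or below one quarter of the length of the article text in the previous-utterance article sentence buffer 21. The reason is as follows. When the same article text is released little by little in conversation processing (2), described later, the user at first tends to wonder what it means and want to ask about it; but if all the information in the same article were revealed, much of it would conflict with the intent of the user's utterance and would instead cause a sense of incongruity. Therefore, to stop making utterances from the same article text once the cumulative character count reaches one quarter of the whole text (NO in step E19), the CPU 11 clears the cumulative uttered character counter 22 to 0 and empties the previous-utterance article sentence buffer 21 (step E20).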
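The bookkeeping in steps E18 to E20 can be pictured as a small state holder. The class and method names are hypothetical, and two details are assumptions: the counter restarts only when a different article is stored, and the cut-off test is "more than one quarter", following the "at or below" wording of step E19.

```python
class ArticleContinuation:
    """Sketch of buffer 21 and counter 22 working together."""

    def __init__(self) -> None:
        self.buffer = ""        # previous-utterance article sentence buffer 21
        self.spoken_chars = 0   # cumulative uttered character counter 22

    def store(self, article: str) -> None:
        # Step E11. Assumption: the count restarts only for a new article.
        if article != self.buffer:
            self.buffer = article
            self.spoken_chars = 0

    def record_utterance(self, fragment: str) -> bool:
        """Add the fragment length; return True while the article stays usable."""
        self.spoken_chars += len(fragment)               # step E18
        if self.spoken_chars > len(self.buffer) / 4:     # step E19: NO branch
            self.spoken_chars = 0                        # step E20
            self.buffer = ""
            return False
        return True
```

For a 40-character article, fragments totalling up to 10 characters keep the buffer alive; the first fragment that pushes the total past 10 empties it, so later keyword-free turns fall through to the keyword history instead.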
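【００８４】
When no reply can be produced from either the conversation sentence database 18 or the article sentence database 19 using the keyword in the user's utterance (nothing selected in steps C14 and C17 of FIG. 7), or when the user's utterance contains no keyword (NO in step C12 of FIG. 7), conversation processing (2) shown in FIG. 8 is executed.
【００８５】
As shown in FIG. 8, in conversation processing (2) the CPU 11 checks whether the previous-utterance article sentence buffer 21 holds text (step C20). If it holds text (the article text selected previously) (YES in step C20), the CPU 11 uses that text to perform the article sentence cutting and utterance process of FIG. 10 and makes an utterance (step C21). If the buffer holds no text (NO in step C20), the CPU 11 extracts a recently recorded keyword from the keyword history table 20 (step C22). Keywords recorded in the current turn are excluded, as are keywords recorded more than a predetermined time ago. If the keyword history table 20 contains a matching keyword (YES in step C23), the CPU 11 uses it to extract a reply sentence from the conversation sentence database 18 or the article sentence database 19 once more and makes an utterance (steps C24 to C29). Steps C24 to C29 are the same as steps C14 to C19 of FIG. 7. If no reply can be produced here either, the conversation processing ends without replying to the user's utterance.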
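The overall fallback order of FIGS. 7 and 8 can be sketched as one dispatch function. Everything here is illustrative: the three lookup callables stand in for the database and buffer processes described above, the ten-minute expiry is an assumed value for the "predetermined time", and `KeywordHistory` models table 20 as a fixed-size queue that drops its oldest entry when full.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Callable, List, Optional


class KeywordHistory:
    """Sketch of keyword history table 20 (oldest entries are overwritten)."""

    def __init__(self, capacity: int = 100) -> None:
        self.entries = deque(maxlen=capacity)

    def add(self, keyword: str, when: datetime) -> None:
        self.entries.append((keyword, when))

    def recent(self, now: datetime, max_age: timedelta,
               exclude: List[str]) -> Optional[str]:
        # Step C22: newest first, skipping this turn's keywords and stale ones.
        for keyword, when in reversed(self.entries):
            if keyword not in exclude and now - when <= max_age:
                return keyword
        return None


def respond(keywords: List[str], now: datetime, history: KeywordHistory,
            from_conversation_db: Callable[[str], Optional[str]],
            from_article_db: Callable[[str], Optional[str]],
            continue_article: Callable[[], Optional[str]],
            max_age: timedelta = timedelta(minutes=10)) -> Optional[str]:
    for keyword in keywords:
        history.add(keyword, now)                       # step C13
    for keyword in keywords:                            # steps C14 to C19
        reply = from_conversation_db(keyword) or from_article_db(keyword)
        if reply:
            return reply
    reply = continue_article()                          # steps C20, C21
    if reply:
        return reply
    old = history.recent(now, max_age, exclude=keywords)    # steps C22, C23
    if old:                                             # steps C24 to C29
        return from_conversation_db(old) or from_article_db(old)
    return None                                         # no reply this turn
```

Passing the lookups in as callables keeps the ordering logic separate from how each database is searched, which makes the priority (conversation database, article database, buffered article, older keyword) easy to see and to test with stubs.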
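【００８６】
As described above, the conversation system of the present invention finds a conversational sentence in the conversation sentence database 18 on the basis of a keyword in the user's utterance and replies with a conversational sentence whose time-related information differs little from it. Although these sentences were originally written with no relation to the conversation at hand, the system can therefore produce replies that are connected in content and natural in expression. In a novel the time interval between conversational sentences cannot be known, but by using the line-feed codes of narrative text (text that is not conversation) and chapter titles as well as those of the conversational sentences, and by treating blank lines as marking breaks in content, time-related information can be calculated that appropriately expresses the temporal and topical relation between conversational sentences. Because conversational sentences are selected using this information, an appropriate sentence that is connected to the user's utterance in both time and content can be selected and uttered.
【００８７】
Several sentences that a novel treats as a single utterance (sentences enclosed in one pair of quotation marks) are split, and each sentence is treated as one utterance. This makes clear mismatches with the user's utterance less likely, so the sentences can be applied to conversation in many situations.
【００８８】
The system uses not only the conversational sentences that reply to the sentence containing the keyword, but also the sentences of the speaker of that keyword sentence. Many conversational sentences can therefore be selected, and the utterances are varied. Replies that contain the user's keyword and replies that do not are both possible, so the system can give the user the strong satisfaction of feeling understood without any sense of unnaturalness.
【００８９】
The identification and extraction of conversational sentences and the calculation of time-related information are done when the conversation sentence database is created, which reduces the processing load on the CPU 11 during conversation. And because the conversation sentence database 18 holds only the extracted conversational sentences, conversation-like conversation can be realized with a small amount of memory.
【００９０】
When an utterance is made using the article sentence database 19 rather than the conversation sentence database 18, expressions peculiar to news articles that are unnatural in conversation are removed first, so the utterance sounds reasonably natural even though it is borrowed from a news article. Part of the article text is cut out so that its amount of information approaches that of conversation, so conversation-like utterances can be produced from written text that was never intended for conversation. Because the text is cut at punctuation and keywords, the cut points are not unnatural. Because the utterances are fragmentary, their meaning is vague and open to interpretation, which raises the likelihood that the user will interpret them in a way that suits the flow of the conversation.
【００９１】
Because the number of keywords is used as the measure of the amount of information, there is no need to analyze the text grammatically, and a roughly correct estimate is obtained simply. The fragmentary, vague utterances arouse the user's interest and curiosity and the desire to continue the conversation. The news text used for an utterance is stored and other parts of it are cut out for later utterances, so the user's interest and curiosity can be satisfied little by little. This also allows several utterances on the same topic, giving the user the feeling of conversing continuously about one topic. As a result, an appropriate utterance can be produced even when no keyword is found in the user's current utterance, or when a keyword is found but no utterance can be produced from it.
【００９２】
Mismatches between the news article and the user's utterance are revealed only little by little, which is easier for the user to accept than having the whole article revealed at once, and raises the likelihood that the user will tolerate them and carry on the conversation.
【００９３】
For utterances based on news text, the system tracks what proportion of the text's information has already been uttered and stops at a fixed proportion rather than uttering the whole text. The full content of the news text is therefore never revealed, and mismatches with the user's utterance remain inconspicuous.
【００９４】
Utterances based on the article sentence database 19 alone might give a fragmentary or curt impression, but because they are mixed with utterances based on the conversation sentence database 18, the conversation as a whole gives a natural impression. Conversely, when the conversation sentence database 18 is created from novels, plays and the like, using it alone can make the system seem somewhat too talkative; combining it with utterances from the article sentence database 19 weakens that impression.
【００９５】
Keywords contained in the user's utterances are kept as a history and used for utterances later in the conversation, so the user feels that the topic is being maintained. This also makes it possible to produce an appropriate utterance when no keyword is found in the user's utterance, or when a keyword is found but no utterance can be produced from it.
【００９６】
In a conventional "artificial incompetence" style system, the relation between a pattern registered in the database (a "registered pattern") and the reply data associated with it (a "registered reply") is simple and fixed. In the present invention, the sentence data registered in the database corresponds to the registered reply, but it also contains registered patterns itself, and the algorithm relates registered patterns and registered replies in a many-to-many fashion. For example, in the sentence data of the article sentence database 19 shown in FIG. 6, 「＊＊＊＊＊軍放送などによると、〜戦車は底辺の部分が裂けたという。」 is a registered reply for the many registered patterns (keywords) contained in the record immediately before it, 「＊＊＊＊＊自治区＊＊中部の＊＊＊人入植地近くで１４日夜、〜今後、紛争の激化は必至だ。」, and at the same time contains many registered patterns whose registered reply is the record immediately after it, 「軍放送によると、〜と声明を出したという。」. It is also a registered reply for the many registered patterns it contains itself. Many appropriate registered replies can thus correspond to a variety of registered patterns. Registering such relations in the manner of a conventional "artificial incompetence" style system would be difficult and would require a large amount of storage.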
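【００９７】
Although the above embodiment creates the conversation sentence database 18 from a novel, plays, film and drama scripts, records of rakugo or manzai, records of actual conversations and the like may also be used. In that case the format peculiar to each is used to identify conversational sentences and to calculate the time-related information. In a play, for example, lines are often not enclosed in 「」. The name of the role speaking the line is often placed at the start of the row, followed by a fixed amount of space and then the line. Text that is not conversation (stage directions) may be enclosed in parentheses. For such a play, conversational sentences can be extracted using these conventions instead of 「」.
【００９８】
Although the article sentence database 19 is created here from news articles on the Internet, anything that contains text, such as books, may be used.
【００９９】
Although line-feed codes and blank lines are used to calculate the time-related information, full stops (。), quotation marks (「」), and headings of chapters, paragraphs or items may also be used. The number of characters may also be used (for example, the number of characters between the first characters of successive sentences).
【０１００】
Although time-related information is held for every record in the example, only part of it may be held, for example only where its value changes greatly. It may also be held in a non-numeric form; for example, a blank record may be placed where the time-related information jumps.
【０１０１】
Although a keyword is defined as a word of two or more kanji, katakana characters, digits, or a combination of these, other definitions are possible. For example, the number of characters required may be changed, and phrases that mix kanji and hiragana or consist only of hiragana may be included.
【０１０２】
Although keywords are extracted from the text after speech recognition and kanji conversion, they may be extracted from hiragana text before kanji conversion, or at the stage of recognition as sound, as in keyword spotting. In these cases, information for determining which phrases are keywords should be provided; for example, a database in which only keywords are registered may be used.
【０１０３】
Although the example searches the conversation sentence database 18 and the article sentence database 19 for a keyword found in the user's utterance, more complex patterns in the user's utterance may be searched for: for example, AND and OR combinations of several keywords, word-order constraints, patterns with wildcard characters, and constraints on the kind of phrase such as part of speech. The pattern used to search the conversation sentence database 18 or the article sentence database 19 need not be the pattern found in the user's utterance itself; it may be a pattern derived from it or one chosen in correspondence with it.
【０１０４】
Although the amount of information is measured by the number of keywords, other measures may be used, for example the number of characters, of kanji, of readings, of nouns, of verbs, or combinations of these.
【０１０５】
Although the amount of information in an utterance made from article text is always kept at or below a fixed value, utterances with more information may occasionally be made while the amount is kept low overall. For example, the number of keywords in an utterance may be chosen at random in the range 1 to 5.
【０１０６】
Although the process of cutting out part of a sentence to reduce the information in a single utterance is applied only to article sentences, it may also be applied to conversational sentences. Plays and the like contain very long lines, for which it can be effective. Although the article sentence database 19 is used only when no utterance could be made from the conversation sentence database 18, an utterance based on the article sentence database 19 may be made with some probability even when one can be created from the conversation sentence database 18. It is sufficient that utterances from the conversation sentence database 18 take priority overall.
【０１０７】
Although the candidates for utterance are the sentences from the keyword sentence onward, earlier sentences may also be included as candidates on condition that the difference in time-related information is within a certain range. For example, with data such as that of the conversation sentence database 18 shown in FIG. 4, when the keyword 「名曲」 is found in the user's utterance, the first embodiment would never select 「音痴みたいな顔をしているけど。」; records for which the difference in time-related information is negative and small in absolute value may be made candidates so that it can be selected.
【０１０８】
Although one record of the article sentence database 19 holds the text up to a line-feed code, it may hold one sentence, as in the conversation sentence database 18. Conversely, one record of the conversation sentence database 18 may hold the text up to a line-feed code, or up to the closing bracket 」 (one whole utterance).
【０１０９】
Although the previous-utterance article sentence buffer 21 stores the text selected for the previous utterance itself, it may instead store information designating the corresponding record in the article sentence database 19.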
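【０１１０】
(Second Embodiment)
Next, a second embodiment of the present invention is described.
【０１１１】
FIG. 12 is a block diagram showing the hardware configuration of a conversation system according to the second embodiment of the present invention. Parts identical to those in FIG. 1 (the first embodiment) carry the same reference numerals and are not described again; only the differences are described here. In addition to its primary conversation function, the conversation system of the second embodiment has an electronic book reader function and an electronic dictionary function. As shown in FIG. 12, various electronic book data 31 is stored in the electronic book data area of the nonvolatile memory 17, and the system can read it aloud at the user's request. Various electronic dictionary data 32, such as a Japanese-language dictionary and an encyclopedia, is stored in the electronic dictionary data area of the nonvolatile memory 17, and the system can read aloud the content of the relevant entry in answer to the user's question. The electronic dictionary data 32 consists of headwords and the explanations corresponding to them.
【０１１２】
FIGS. 13 and 14 are flowcharts showing the conversation processing of the conversation system in the second embodiment. Before the processing shown in these flowcharts is described, an outline of the conversation processing of the second embodiment is given to make it easier to understand.
【０１１３】
The second embodiment is characterized by producing utterances from the electronic book data 31 and the electronic dictionary data 32. The electronic book data 31 is the book information provided as an electronic book as it is. Unlike the conversation sentence database 18, it is specific text information (text data) in which conversational sentences are mixed with narrative text that is not conversation (non-conversational sentences), and also with blank lines, the author's name, the title, the table of contents and so on. It contains no time-related information either. Therefore, when a keyword from the user's utterance is found in the electronic book data 31, the system judges whether the text found is a sentence (for example, by checking line by line for punctuation or quotation marks). If it is not (an author's name, a table of contents and so on), it is excluded. If it is a sentence and is conversational, processing equivalent to the conversation sentence database creation process of the first embodiment (FIG. 2) is performed; if it is narrative text, processing equivalent to the article sentence database creation process of the first embodiment (FIG. 5) is performed.
【０１１４】
Time-related information is calculated on the spot, only for the neighborhood of the sentence in which the keyword was found, as values relative to that sentence. If the sentence in which the keyword was found is conversational, only conversational sentences are considered; if it is narrative (non-conversational), only narrative sentences are considered. This is because conversational and narrative text usually differ completely in content even when adjacent. In the example text of FIG. 3, for instance, the narrative sentence 「彼は、いよいよキザになる。眼を細めて、遠くのラジオに耳を傾ける。」 describes a character, whereas the conversational sentences before and after it are about music.
【０１１５】
When the keyword in the user's utterance is found among the headwords of the electronic dictionary data 32, the explanation of that headword is basically processed in a manner equivalent to the article sentence database creation process of the first embodiment (FIG. 5). In place of the article sentence inappropriate-part deletion process described above (FIG. 11), however, a process is performed that deletes the dictionary-specific notation and information that would be inappropriate in an utterance. Stock phrases to be deleted are registered in advance in an inappropriate phrase database 33 in the nonvolatile memory 17, and any phrase matching one of them is deleted automatically. Because the format of a dictionary is consistent, the format too is used to delete information inappropriate for an utterance.
【０１１６】
For example, suppose that the keyword 「腹黒」 is used and data in the following form is extracted from the Japanese-language dictionary in the electronic dictionary data 32.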
【０１１７】
はらぐろ・い［腹黒い］
（形）［文］ク　はらぐろ・し
心がねじけている。心の中に悪巧みや陰謀をもっている。
「−・い人間」
［派生］――さ（名）
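Everything in this data other than 「心がねじけている。心の中に悪巧みや陰謀をもっている。」 (having a warped heart; harboring schemes and plots) is deleted. In this example, that can be achieved by applying the rule that text without punctuation is deleted. Alternatively, （形）, ［文］, ［派生］, （名） and the like are notations whose roles are predefined in this dictionary, so they may be registered in the inappropriate phrase database 33 in advance and deleted by reference to it. In this example the rule that parts enclosed in brackets are deleted would also work. Rules such as deleting text that contains 「−」, deleting quoted examples enclosed in 「」, and deleting the headword row and the row after it may also be combined as appropriate.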
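The line-based rules just listed can be sketched as a small filter. This is a hedged illustration only: the function name is hypothetical, the rule set is one possible combination of those mentioned above, and the last row of the sample entry in the usage note is abbreviated.

```python
def clean_dictionary_entry(entry: str) -> str:
    """Keep only explanation lines that can be read aloud as they are."""
    kept = []
    for line in entry.splitlines():
        line = line.strip()
        if "。" not in line:          # rule: no full stop, not a sentence
            continue
        if line.startswith("「"):     # rule: quoted usage examples
            continue
        if "−" in line:               # rule: lines abbreviating the headword
            continue
        kept.append(line)
    return "".join(kept)
```

Applied to the 「腹黒い」 entry, only the explanation row survives, because the headword row, the grammatical row, the usage example and the derivation row all lack a full stop.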
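【０１１８】
The text extracted in this way, 「心がねじけている。心の中に悪巧みや陰謀をもっている。」, is then cut to reduce its amount of information in the same way as in the article sentence cutting and utterance process of the first embodiment. This makes possible an exchange such as:
User: 「腹黒い奴だな」 (What a black-hearted fellow.)
Conversation system: 「心がねじけている」 (Having a warped heart.)
【０１１９】
This concludes the outline of the conversation processing in the second embodiment. The second embodiment can also realize a kind of conversation that was essentially impossible in the first. In the first embodiment, as in ordinary "artificial incompetence" style systems, the conversation system (its program) does not understand the meaning of the sentences registered in the conversation sentence database 18 and the article sentence database 19, which are the material for its utterances. The meaning of the electronic dictionary data 32, however, can be understood in outline, because in a Japanese-language dictionary or an encyclopedia what follows a headword is normally the explanation of that headword. Conversation that makes use of this becomes possible. This part of the conversation processing is conversation processing (2) shown in FIG. 14.
【０１２０】
In conversation processing (2), when no reply to the user's utterance could be created from the electronic book data 31 or the electronic dictionary data 32, the conversation system itself can make a coherent series of utterances using an explanation from the electronic dictionary data 32, as follows.
【０１２１】
Conversation system: 「ところで、腹黒いってどういう意味だか知っている？」 (By the way, do you know what 腹黒い means?)
User: 「知らない」 (I don't know.) (the case where the user did not know)
Conversation system: 「心がねじけている。心の中に悪巧みや陰謀をもっている。ということなんだよ。どう？勉強になった？」 (It means having a warped heart, harboring schemes and plots. Well? Did you learn something?)
A coherent flow of conversation is thus realized without the effort of writing a scenario each time, as the scenario method requires. Here too, inappropriate phrases are deleted, so the conversation is not unnatural even though it borrows from the electronic dictionary data 32. Because an utterance can always be produced with this technique, a talkative conversation system can be realized. And because the enormous number of entries in the electronic dictionary data 32 is available, the technique can supply a fresh topic every time, however often it is used.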
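【０１２２】
The concrete procedure for realizing the conversation system of the second embodiment is explained in detail below with reference to the flowcharts of FIGS. 13 and 14. The CPU 11 of this system executes the processing shown in these flowcharts by reading the program.
【０１２３】
As shown in FIG. 13, when the conversation processing of this system is started, the CPU 11 first extracts the keyword contained in the user's utterance and looks for it in the electronic book data 31 (step G11). If the electronic book data 31 contains a sentence with the keyword (YES in step G12), the CPU 11 judges whether that sentence is conversational, using for example the 「」 marks (step G13).
【０１２４】
If it is conversational (YES in step G13), the CPU 11 finds the conversational sentences that follow it in the electronic book data 31 and calculates their relative time-related information (step G14). The CPU 11 then selects the conversational sentence to use as the reply on the basis of the calculated time-related information and utters it (step G15). If the sentence is not conversational (NO in step G13), the CPU 11 finds the sentences that follow it in the electronic book data 31 and calculates their relative time-related information (step G16). The CPU 11 then selects the sentence to use as the reply on the basis of the calculated time-related information (step G17), applies processing similar to the article sentence cutting and utterance process to it, deleting the parts inappropriate as conversation, and utters it (step G18).
【０１２５】
If the keyword in the user's utterance is not present in the electronic book data 31 (NO in step G12), the CPU 11 next looks for it among the headwords of the electronic dictionary data 32 (step G19). If there is a headword that matches the user's keyword (YES in step G20), the CPU 11 deletes the symbols and expressions peculiar to electronic dictionaries from the explanation corresponding to that headword (step G21), applies processing similar to the article sentence cutting and utterance process to the remaining explanation, deleting the parts inappropriate as conversation, and utters it (step G22).
【０１２６】
If the user's keyword is not in the electronic dictionary data 32 either and no reply can be produced (NO in step G20), conversation processing (2) shown in FIG. 14 is executed.
【０１２７】
As shown in FIG. 14, in conversation processing (2) the CPU 11 selects an entry at random from the electronic dictionary data 32 (step G23). With the headword of the selected entry denoted W, the CPU 11 generates a question to the user that quotes W, for example 「ところで、Ｗってどういう意味だか知っている？」 (By the way, do you know what W means?), and utters it (step G24). If the user's answer is affirmative, such as 「知っている」 (I know) (YES in step G25), the CPU 11 utters a stock line such as 「知ってるんだったらいいや」 (If you know, never mind) (step G26) and ends the process.
【０１２８】
If the user's answer is negative, such as 「知らない」 (I don't know) (NO in step G25), the CPU 11 extracts the explanation corresponding to the headword W from the electronic dictionary data 32, deletes the phrases inappropriate as conversation from it and reads it aloud, and finally utters a stock line such as 「ということなんだよ。どう？勉強になった？」 (That's what it means. Well? Did you learn something?) (step G28), and ends the process.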
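The question-and-answer exchange of FIG. 14 can be sketched with three small functions. The names are hypothetical, the dictionary is modeled as a plain mapping from headword to an already cleaned explanation, and the test for a negative answer (looking for 「知らない」 or 「しらない」) is an assumption, since the source does not say how affirmative and negative replies are told apart.

```python
import random
from typing import Dict


def pick_headword(dictionary: Dict[str, str], rng=random) -> str:
    """Step G23: choose an entry at random."""
    return rng.choice(sorted(dictionary))


def question_for(headword: str) -> str:
    """Step G24: ask the user about the headword W."""
    return f"ところで、{headword}ってどういう意味だか知っている？"


def follow_up(user_reply: str, explanation: str) -> str:
    """Steps G25 to G28: stock line, or the explanation plus a stock line."""
    # Assumption: a reply containing 知らない / しらない counts as negative.
    if "知らない" in user_reply or "しらない" in user_reply:
        return f"{explanation}ということなんだよ。どう？勉強になった？"
    return "知ってるんだったらいいや"
```

Since the headword and its explanation come straight from the dictionary, the two system turns stay consistent with each other without any hand-written scenario.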
【０１２９】
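Thus, although the electronic book data 31 and the electronic dictionary data 32 originally have nothing to do with the conversation the system is holding with the user, they can be used to produce conversation-like utterances. Because the conversation is carried out using the electronic book data 31 and the electronic dictionary data 32, it can be realized without separate conversation data, and the enormous amount of information registered in them makes it possible to handle a wide range of topics at all times.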
【０１３０】
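In the second embodiment, the conversation system reads out the electronic book data 31 and the electronic dictionary data 32, but a display unit may be provided to display them instead. Also, a conversation sentence is chosen for the utterance when the sentence in which the keyword was found is a conversation sentence, and narrative text when it is narrative text; however, sentences may be chosen without this distinction. In that case the conversation will take many eccentric turns.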
【０１３１】
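In short, the present invention is not limited to the above embodiments and can be modified in various ways at the implementation stage without departing from its gist. Furthermore, the embodiments include inventions at various stages, and various inventions can be extracted by appropriately combining the disclosed constituent elements. For example, even if some constituent elements are deleted from all those shown in an embodiment, the configuration from which they are deleted can be extracted as an invention, provided that the problem described in "Problems to be solved by the invention" can be solved and the effects described in "Effects of the invention" are obtained.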
【０１３２】
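The methods described in the above embodiments can also be written, as a program executable by a computer, to a recording medium such as a magnetic disk (flexible disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory and applied to various devices, or the program itself can be transmitted over a transmission medium such as a network and applied to various devices. A computer that realizes this device reads the program recorded on the recording medium or provided via the transmission medium, and executes the processing described above with its operation controlled by the program.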
【０１３３】
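[Effects of the invention]
As described in detail above, according to the present invention, a database can be created easily by using existing text information that includes conversation sentences, such as novels, plays, movie and drama scenarios, records of rakugo and manzai, and records of actual conversations, or existing text information that does not include conversation sentences, such as news articles. In response to a user's utterance, a sentence appropriate as conversation is selected from this database based on the time-related information and uttered, so that a unique conversation with a natural flow can be realized.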
【０１３４】
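It is also possible to produce conversation-like utterances using material that originally has nothing to do with the conversation the system is holding with the user, such as an electronic book or an electronic dictionary.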
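[Brief description of the drawings]
[FIG. 1] A block diagram showing the hardware configuration of a conversation system according to the first embodiment of the present invention.
[FIG. 2] A flowchart for explaining the conversation sentence database creation process of the conversation system in the first embodiment.
[FIG. 3] A diagram showing an example of a novel text file that is the target of the conversation sentence database creation process.
[FIG. 4] A diagram showing an example of the conversation sentence database created from the text file of FIG. 3.
[FIG. 5] A flowchart for explaining the article sentence database creation process of the conversation system in the first embodiment.
[FIG. 6] A diagram showing an example of the article sentence database created by the article sentence database creation process.
[FIG. 7] A flowchart for explaining the conversation processing of the conversation system in the first embodiment.
[FIG. 8] A flowchart for explaining the conversation process (2) of the conversation system in the first embodiment.
[FIG. 9] A flowchart for explaining in detail the record selection process included in the conversation processing.
[FIG. 10] A flowchart for explaining in detail the article sentence disconnection utterance process included in the conversation processing.
[FIG. 11] A flowchart for explaining in detail the article sentence inappropriate part deletion process included in the conversation processing.
[FIG. 12] A block diagram showing the hardware configuration of a conversation system according to the second embodiment of the present invention.
[FIG. 13] A flowchart for explaining the conversation processing of the conversation system in the second embodiment.
[FIG. 14] A flowchart for explaining the conversation process (2) of the conversation system in the second embodiment.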
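[Explanation of reference numerals]
11 ... CPU
12 ... Voice input unit
13 ... A/D conversion unit
14 ... Voice output unit
15 ... D/A conversion unit
16 ... Work memory
17 ... Nonvolatile memory
17a ... Program
18 ... Conversation sentence database
19 ... Article sentence database
20 ... Keyword history table
21 ... Previous utterance article sentence buffer
22 ... Cumulative utterance character counter
31 ... Electronic book data
32 ... Electronic dictionary data
33 ... Inappropriate phrase database
[0001]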
[Technical field of the invention]
The present invention relates to a conversation system used in toys such as conversational robots, in video game machines, and the like, in which a user can obtain enjoyment and comfort by conversing with a computer.
[0002]
[Prior art]
Conventionally, many conversation systems used in video game machines, toys, and the like employ a method in which the conversation follows a predetermined scenario (hereinafter referred to as the “scenario method”). With feasibility in mind, a scenario used in this “scenario method” is written so that the conversation system first makes an utterance that limits the topic and so that the subsequent conversation branches as little as possible. As a result, the human (user) cannot take the initiative in the conversation, and the flow of the conversation tends to be mediocre or unnatural.
[0003]
Therefore, “artificial intelligence” conversation systems and “artificial incompetence” conversation systems have been considered as alternatives to the “scenario method”. An “artificial intelligence” conversation system parses the user's utterance to extract its meaning, understands the intention of the utterance, and creates an answer based on that meaning. Because such a system realizes conversation through processing close to human intelligence, it requires advanced technology and is difficult to apply to arbitrary fields. In practice it can be applied only to fields where the conversation is constrained, such as “ticket sales” and “information search”.
[0004]
In contrast, an “artificial incompetence” conversation system mainly uses the approach commonly called “artificial incompetence”. It produces what looks on the surface like a conversation without parsing the user's utterance or extracting its meaning. In other words, the system does not understand what the user says, yet the exchange still holds together as a conversation. The system finds a specific pattern (keyword) in the user's utterance, compares it with the patterns registered in a database prepared in advance, and outputs a reply from the reply data group registered for the matching pattern. For example, if the database registers “I am a fan of the ABC team” as reply data for the pattern “baseball”, then when the user says “I like baseball”, the conversation system searches the database with the keyword “baseball” contained in the utterance and answers “I am a fan of the ABC team”.
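The pattern-matching behaviour described here can be illustrated with a few lines of Python. This is only a sketch of the general idea: the dictionary `REPLY_DB`, the function name `reply`, and the choice to return the first reply in a group are assumptions made for the example.

```python
# Hypothetical pattern database: each pattern (keyword) maps to a group of replies.
REPLY_DB = {
    "baseball": ["I am a fan of the ABC team"],
}


def reply(utterance, db=REPLY_DB):
    """Return a canned reply for the first registered pattern found in the utterance."""
    lowered = utterance.lower()
    for pattern, replies in db.items():
        if pattern in lowered:  # plain substring match; no parsing, no semantics
            return replies[0]
    return None  # no registered pattern matched


print(reply("I like baseball"))
```

The quality of such a system depends entirely on what has been put into the database, which is the weakness the following paragraphs address.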
[0005]
Thus, because the “artificial incompetence” conversation system uses database pattern matching and needs no complicated processing such as parsing, it can handle even ambiguous sentences. In addition, because it has a basic mechanism for generating a reply to an ordinary utterance by the user (one not in a fixed format, as in the “scenario method”), the user can naturally take the lead in the conversation.
[0006]
[Problems to be solved by the invention]
In the “artificial incompetence” conversation system described above, the content and size of the database greatly affect the quality of the conversation. If only boring content is registered in the database, only boring conversations result, and if little is registered, the same exchanges are repeated. However, creating a database that has both quality and quantity requires an enormous amount of work. In addition, since only a limited number of engineers are involved in creating the database, the topics the conversation system can respond to may be limited to those the engineers are familiar with. Furthermore, it is very difficult to create a database that anticipates the various ways a conversation may flow, so the conversation often fails to flow naturally.
[0007]
The present invention has been made in view of the above points, and its object is to provide a conversation system and a conversation processing program that reduce the burden of database creation and realize a unique conversation that flows naturally and that the user can enjoy.
[0008]
[Means for Solving the Problems]
The conversation system of the present invention is a conversation system for conversing with a user, comprising: sentence extraction means for extracting sentences usable as conversation from existing text information; time-related information calculating means for calculating time-related information indicating the temporal and topical distances between the sentences extracted by the sentence extraction means; a database that stores each sentence extracted by the sentence extraction means with the time-related information calculated by the time-related information calculating means added to it; and conversation processing means for selecting from the database, based on the time-related information, a sentence appropriate as conversation in response to a user's utterance, and uttering it.
[0009]
According to such a conversation system, sentences usable as conversation are extracted from existing text information that includes conversation sentences, such as novels, plays, movie and drama scenarios, records of rakugo and manzai, and records of actual conversations, or from existing text information that does not include conversation sentences, such as news articles. Time-related information indicating the temporal and topical distance between the sentences is added to them, and they are registered in the database. Then, in response to the user's utterance, a sentence appropriate as conversation is selected from the database based on the time-related information and uttered. In this way, the database can be created easily from existing text information, and a unique conversation with a natural flow can be held using it.
[0010]
The conversation system according to the present invention is a conversation system for conversing with a user, comprising: sentence extraction means for extracting sentences usable as conversation from existing text information; time-related information calculating means for calculating time-related information indicating the temporal and topical distances between the sentences extracted by the sentence extraction means; a database that stores each sentence extracted by the sentence extraction means with the time-related information calculated by the time-related information calculating means added to it; keyword extracting means for extracting a keyword from a user's utterance; selecting means for searching the database for a sentence that includes the keyword extracted by the keyword extracting means and selecting, as utterance candidates, sentences whose time-related information differs from that of the found sentence by a predetermined value or less; and utterance processing means for making an utterance using a sentence selected by the selecting means.
[0011]
According to such a conversation system, sentences usable as conversation are extracted from existing text information that includes conversation sentences, such as novels, plays, movie and drama scenarios, records of rakugo and manzai, and records of actual conversations, or from existing text information that does not include conversation sentences, such as news articles. Time-related information indicating the temporal and topical distance between the sentences is added to them, and they are registered in the database. Then, for a user's utterance, a keyword that serves as a starting point for conversation is extracted from the utterance, a sentence including the keyword is searched for in the database, and sentences that are temporally and topically close to that sentence according to the time-related information are selected and used for the utterance. In this way, the database can be created easily from existing text information, and a unique conversation with a natural flow can be held by selecting sentences appropriate as conversation from the database using the time-related information.
[0012]
Further, in the conversation system having the above-described configuration, information indicating the date and time of the previous utterance is added to each sentence in the database together with the time-related information, and the selecting means selects as utterance candidates, based on this information, only sentences that have not been uttered within a predetermined number of days. Accordingly, by excluding, for example, sentences uttered within the last three days, the system can avoid repeating the same line frequently.
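A filter of this kind could be sketched as follows. The record layout (a dict with a `last_uttered` field that is `None` until the sentence is first used) and the function name are assumptions for illustration; the three-day default simply mirrors the example in the text.

```python
from datetime import datetime, timedelta


def filter_recent(records, now, days=3):
    """Keep records never uttered, or last uttered more than `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [
        r for r in records
        if r["last_uttered"] is None or r["last_uttered"] < cutoff
    ]


now = datetime(2002, 8, 9, 12, 0)
records = [
    {"text": "a", "last_uttered": None},
    {"text": "b", "last_uttered": datetime(2002, 8, 8)},
    {"text": "c", "last_uttered": datetime(2002, 8, 1)},
]
print([r["text"] for r in filter_recent(records, now)])
```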
[0013]
Further, in the conversation system having the above-described configuration, the utterance processing means deletes parts inappropriate as conversation from the sentence selected by the selecting means. As a result, when news articles are used as the existing text information, for example, expressions peculiar to news articles that would be unnatural in conversation are removed before the utterance is made, so the utterance sounds reasonably natural even though existing text is being reused.
[0014]
Further, in the conversation system having the above-described configuration, the utterance processing means deletes parts inappropriate as conversation from the sentence selected by the selecting means, divides the resulting sentence into a plurality of parts according to the number of keywords it contains, and utters one of the divided parts. As a result, when news articles are used as the existing text information, for example, an article sentence is partly cut out so that it approaches the amount of information a conversational utterance carries, which makes it possible to produce conversation-like utterances from written text not originally intended for conversation.
[0015]
The conversation system of the present invention is a conversation system for conversing with a user, comprising: storage means for storing specific text information in which conversation sentences and non-conversation sentences are mixed; search means for searching the text information stored in the storage means for a sentence that includes a keyword contained in the user's utterance; determination means for determining whether the sentence found by the search means is a conversation sentence or a non-conversation sentence; first utterance processing means for, when the determination means determines that the sentence is a conversation sentence, calculating time-related information indicating temporal and topical distance for the conversation sentences that follow it, selecting a sentence appropriate as conversation based on that time-related information, and uttering it; and second utterance processing means for, when the determination means determines that the sentence is a non-conversation sentence, calculating time-related information indicating temporal and topical distance for the non-conversation sentences that follow it, selecting a sentence appropriate as conversation based on that time-related information, deleting the parts inappropriate as conversation from that sentence, and uttering the result.
[0016]
According to the conversation system having such a configuration, utterances in a conversation can be created using specific text information in which conversation sentences and non-conversation sentences are mixed. The specific text information is, for example, an electronic book, in which conversation sentences and narrative text (non-conversation sentences) are mixed. If the keyword contained in the user's utterance is in a conversation sentence, time-related information is calculated for conversation sentences, and a sentence appropriate as conversation is selected based on it and uttered. If the keyword is in a non-conversation sentence, time-related information is calculated for non-conversation sentences, a sentence appropriate as conversation is selected based on it, and the parts inappropriate as conversation are deleted before it is uttered.
[0017]
The conversation system of the present invention is a conversation system for conversing with a user, comprising: storage means for storing specific dictionary information consisting of headwords and their corresponding explanations; search means for searching the dictionary information stored in the storage means for a headword that includes a keyword contained in the user's utterance; and first utterance processing means for extracting from the dictionary information the explanation corresponding to the headword found by the search means, deleting the parts inappropriate as conversation from the explanation, and uttering the result.
[0018]
According to the conversation system having such a configuration, utterances in a conversation can be created using specific dictionary information such as a “Japanese dictionary” or an “encyclopedia”. This dictionary information consists of headwords and their corresponding explanations. In response to a user's utterance, a headword that includes a keyword contained in the utterance is searched for, the explanation corresponding to that headword is extracted from the dictionary information, and the parts inappropriate as conversation are deleted from it before it is uttered.
[0019]
Further, the conversation system having the above configuration comprises second utterance processing means for, when the dictionary information contains no headword that includes a keyword contained in the user's utterance, selecting a headword at random from the dictionary information and continuing the conversation by making an utterance that uses the explanation corresponding to that headword. As a result, even if the keyword contained in the user's utterance is not among the headwords of the dictionary information, the conversation can be continued by speaking with the explanation of a randomly selected headword.
[0020]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0021]
(First embodiment)
FIG. 1 is a block diagram showing the hardware configuration of a conversation system according to the first embodiment of the present invention. This conversation system advances a conversation by replying to the user's utterances as if a human were responding, and is mounted in, for example, toys such as conversational robots or video game machines.
[0022]
FIG. 1 shows the basic configuration when this system is realized by a general-purpose computer. It comprises a CPU 11, a voice input unit 12, an A/D conversion unit 13, a voice output unit 14, a D/A conversion unit 15, a work memory 16, and a nonvolatile memory 17.
[0023]
The CPU 11 reads a program stored in the nonvolatile memory 17 or the like and performs predetermined processing according to the procedure described in the program. The voice input unit 12 is a microphone for inputting the user's voice during conversation. The user's voice (analog data) input from the voice input unit 12 is converted into digital data by the A/D conversion unit 13 and taken into the CPU 11. The CPU 11 performs processing using the work memory 16 and outputs a reply to the user's utterance via the D/A conversion unit 15. The D/A conversion unit 15 converts the voice data generated by the CPU 11 into analog data and supplies it to the voice output unit 14. The voice output unit 14 is a speaker that outputs it to the outside.
[0024]
The nonvolatile memory 17 is a rewritable memory, composed of flash memory for example, whose contents are retained even when the power is turned off. In addition to the program 17a for realizing the conversation system of the present invention, the nonvolatile memory 17 holds, as information needed for conversation processing, the conversation sentence database 18, the article sentence database 19, the keyword history table 20, the previous utterance article sentence buffer 21, and the cumulative utterance character counter 22. The program 17a includes a program for executing the database creation processing described later.
[0025]
The conversation sentence database 18 is a database created by extracting only the sentences of the conversation parts from existing text information that includes conversation sentences, such as novels, plays, movie and drama scenarios, records of rakugo and manzai, and records of actual conversations. The article sentence database 19 is a database created by extracting the sentences usable as conversation from existing text information that does not include conversation sentences (text in written language), such as news articles.
[0026]
The keyword history table 20 holds, as a history, the keywords found in the user's utterances together with the date and time at which each was found. Each time a keyword is found during conversation, it is written into the keyword history table 20. When the table is full, the entries with the oldest date and time are overwritten first. The previous utterance article sentence buffer 21 holds the sentence used when the system replies to the user with an article sentence registered in the article sentence database 19 during conversation. The cumulative utterance character counter 22 counts the number of characters of article text used for utterances.
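The overwrite-oldest-first behaviour of the keyword history table can be mimicked with a bounded queue. The class below is a hypothetical illustration (the source does not describe the table's layout); it relies on `collections.deque` with `maxlen`, which discards the oldest item when a new one is appended to a full queue.

```python
from collections import deque
from datetime import datetime


class KeywordHistory:
    """Bounded history of (keyword, found_at) pairs; oldest entries drop out first."""

    def __init__(self, capacity):
        self._entries = deque(maxlen=capacity)

    def record(self, keyword, found_at):
        # Appending to a full deque silently discards the oldest entry.
        self._entries.append((keyword, found_at))

    def keywords(self):
        return [keyword for keyword, _ in self._entries]


history = KeywordHistory(capacity=2)
history.record("music", datetime(2002, 8, 9, 10, 0))
history.record("Chopin", datetime(2002, 8, 9, 10, 1))
history.record("baseball", datetime(2002, 8, 9, 10, 2))
print(history.keywords())
```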
[0027]
In the conversation system having such a configuration, the user's voice input from the voice input unit 12 is converted into digital data by the A/D conversion unit 13 and then given to the CPU 11. The CPU 11 performs “speech recognition processing”, “conversation processing”, and “read-out processing” in that order and returns a reply to the user's utterance. That is, first, “speech recognition processing” converts the speech into characters to produce a sentence in text form. Kana-kanji conversion is performed at the same time in the “speech recognition processing”. Next, “conversation processing” is applied to the text-form sentence to create a reply to the user's utterance, and the reply is read out by “read-out processing”. At this time, the voice data generated by the CPU 11 as the reply to the user is converted into analog data by the D/A conversion unit 15 and then output through the voice output unit 14.
[0028]
Here, “speech recognition processing” and “read-out processing” are not described in detail because they use generally known methods. In the following, the explanation will be focused on “conversation processing”.
[0029]
First, the processes for creating the conversation sentence database 18 and the article sentence database 19 used in “conversation processing” are described. These database creation processes may be executed as a function of the CPU 11 provided in this system, or by, for example, a personal computer connected to this system. In the latter case, the conversation sentence database 18 and the article sentence database 19 created there are written to the nonvolatile memory 17 of this system and referred to during conversation processing. In this embodiment, it is assumed that the CPU 11 provided in this system reads the program 17a and creates the conversation sentence database 18 and the article sentence database 19 as described below.
[0030]
(A) Conversation sentence database creation processing
In the conversation sentence database creation process, a database (the conversation sentence database 18) is created using the text of works that were originally written with no relation whatsoever to the conversation at hand, such as novels, plays, movie and drama scenarios, records of rakugo and manzai, and records of actual conversations. The sentences of the conversation parts are extracted from such existing text information, information indicating the temporal and topical distance between these conversation sentences (hereinafter, time-related information) is calculated, and both are registered in the conversation sentence database 18, so that an appropriate conversation sentence can be selected based on the time-related information during conversation processing.
[0031]
FIG. 2 is a flowchart showing the flow of conversation text database creation processing of the conversation system in the first embodiment. Here, in order to simplify the description, it is assumed that a novel text file is present in, for example, the nonvolatile memory 17 and the text file is read and processed. If the target file is a drama, a scenario, a conversation record, etc., the details are slightly different.
[0032]
There are multiple novel text files, and they are processed one by one. When the target is a novel, the conversation parts may be extracted using the quotation marks that mark dialogue, such as 「」. A period (。) or the like may be used to divide the text into sentences. The time-related information indicates how close conversation sentences are to one another in time and in topic. When two values of the time-related information are close, the sentences were spoken at about the same time and are likely to concern the same topic. In a novel, the time at which each conversation sentence is spoken is not known, so the information is derived from line feed codes and the like. A line feed code slightly weakens the temporal and semantic connection between the text before and after it. In novels, a blank line is often inserted where the text breaks, so when a blank line occurs, the time-related information values of the preceding and following text are separated by a larger amount (for example, “10”). When the file changes, the topic is completely different, so a still larger value (for example, “100”) is added.
[0033]
The conversation sentence database creation process will be described in detail.
[0034]
As shown in FIG. 2, the CPU 11 first opens the first text file with the time-related information set to the initial value “0” (step A11), and reads the text from the beginning to the next line feed code (step A12). The CPU 11 then checks whether the text read is a blank line (step A13). If it is not a blank line (NO in step A13), the CPU 11 extracts the sentences of the conversation part from the text using dialogue quotation marks such as 「」 (step A14). If there is a conversation part (YES in step A15), the CPU 11 divides it into sentences using periods and the like, adds the time-related information and the previous utterance date and time to each sentence, and registers them in the conversation sentence database 18 (step A16). The previous utterance date and time is data used in conversation processing, and empty data is registered at this point.
[0035]
Subsequently, the CPU 11 adds “1” to the current time-related information (step A17), reads the text up to the next line feed code, and performs the same processing (step A18 → A12). If the read text is a blank line (YES in step A13), “10” is added to the time related information (step A19), and the process proceeds to the next text.
[0036]
When the processing for all the texts is completed (Yes in Step A18), if there is an unprocessed text file (Yes in Step A20), the same processing as described above is repeated for the text file. At that time, since the topic is completely different, the CPU 11 adds “100” to the time related information when the next text file is opened (step A21).
[0037]
In this way, a sentence as a conversation sentence is extracted from a text file including the conversation sentence, and is registered in the conversation sentence database 18 together with the time-related information and the last utterance date and time on a sentence basis.
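The flow of FIG. 2 can be sketched in Python as below. This is a simplified illustration under stated assumptions: each file is passed in as one string, dialogue is assumed to be enclosed in 「」 on a single line, sentences are split on 「。」, and the record layout and function name are hypothetical. The increments 1, 10 and 100 are the example values given in the text.

```python
import re

QUOTE = re.compile(r"「([^」]*)」")   # text inside dialogue quotation marks
SENTENCE = re.compile(r"[^。]+。?")   # one sentence, keeping its final period


def build_conversation_db(files):
    """files: list of strings, each holding the full text of one novel."""
    db = []
    t = 0                                   # time-related information (A11)
    for index, text in enumerate(files):
        if index > 0:
            t += 100                        # new file: unrelated topic (A21)
        for line in text.splitlines():      # text up to each line feed (A12)
            if line.strip() == "":
                t += 10                     # blank line: larger break (A19)
                continue
            for quoted in QUOTE.findall(line):          # A14
                for sentence in SENTENCE.findall(quoted):
                    # last_uttered stays empty until the sentence is used (A16)
                    db.append({"text": sentence, "time": t, "last_uttered": None})
            t += 1                          # ordinary line feed (A17)
    return db


novels = ["彼は言った。「音楽は好きか。うん。」\n\n「そう。」\n", "「別の話。」"]
for record in build_conversation_db(novels):
    print(record["time"], record["text"])
```

Sentences from the same quoted passage share one value, while a blank line or a change of file pushes the following sentences further away.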
[0038]
FIG. 3 shows an example of a novel text file that is the target of the conversation sentence database creation process (an excerpt from Osamu Dazai's “Goodbye”). Each line shown is the text up to one line feed code. "..." indicates that there is data before and after the text shown here. Note that one line as printed or displayed is not the same as one line up to a line feed code: text that does not fit on one line wraps onto the next line even though there is no line feed code. The time-related information is shown for reference only and is not in the actual text file. Here, the time-related information of the sentence “Don't be deep enough to fight” is assumed to be “5000”, and the other values are calculated from it. "Mysterious power (4)" is a chapter title. A chapter change of this kind may be recognized and a somewhat larger value (for example, “30”) added to the time-related information. In this example there is always a blank line before and after the title, so the blank lines are used instead.
[0039]
FIG. 4 shows an example of the conversation sentence database 18 created from this text file. In the conversation sentence database 18, sentence data of the conversation part of the novel shown in FIG. 3 is registered in units of sentences. These sentence data are appended with time-related information and the date and time of the last statement. The previous utterance date and time is data used in conversation processing, and is blank data here.
[0040]
(B) Article sentence database creation processing
In the article sentence database creation process, a database (the article sentence database 19) is created by extracting sentences suitable for conversation from news articles and the like. As in the conversation sentence database creation process, sentences usable as conversation are extracted from the text information, and time-related information indicating the temporal and topical distance between them is calculated and registered in the article sentence database 19, so that an appropriate sentence can be selected based on the time-related information during conversation processing. However, news articles and the like use expressions peculiar to news articles and, unlike novels, contain no conversation sentences, so the text must be edited into a form suitable for conversation.
[0041]
FIG. 5 is a flowchart showing the flow of the article sentence database creation process of the conversation system in the first embodiment. Here, it is assumed that news articles have been downloaded from a news site or the like on the Internet and already exist in, for example, the nonvolatile memory 17 as a plurality of files in HTML (HyperText Markup Language) format.
[0042]
Homepage data contains HTML tags in addition to the displayed characters, so the tags are deleted. Besides the sentences that express the content of an article, the page also contains headlines of other articles with links, names of other homepages, and the like. Since these do not contain a period (。), they are identified by that and deleted. In homepage articles, blank lines are often used even where there is no major break in content, so the value added to the time-related information when a blank line is found is set to “1”. This causes no problem, because where the content breaks substantially a headline or the like appears together with several blank lines, and the time-related information is incremented for each of them. Also, to show a process different from the creation of the conversation sentence database 18, the text up to a line feed code is stored in a single record even when it consists of several sentences.
[0043]
The article database creation process will be specifically described.
[0044]
As shown in FIG. 5, the CPU 11 first opens the first HTML file with the time-related information set to the initial value “0” (step B11) and removes the tags from the HTML file, leaving only the text of the news article as displayed on the screen (step B12).
[0045]
Here, the CPU 11 reads the text up to the next line feed code (step B13) and checks whether the text read is a blank line (step B14). If it is not a blank line (NO in step B14), the CPU 11 checks whether the text contains a period (step B15). If it does (YES in step B15), the text is regarded as a sentence, and the CPU 11 extracts it, adds the time-related information and the previous utterance date and time, and registers it in the article sentence database 19 (step B16). The previous utterance date and time is data used in conversation processing, and empty data is registered at this point. If there is no period (NO in step B15), the text is judged to be a headline or a link and is discarded, and the process proceeds to step B17.
[0046]
Subsequently, the CPU 11 adds “1” to the current time-related information (step B17), reads the text up to the next line feed code, and performs the same processing as described above (step B18 → B13).
[0047]
When the processing of all the text is completed (YES in step B18), if there is an unprocessed HTML file (YES in step B19), the same processing as described above is repeated for that HTML file. Because it is a completely different article, the CPU 11 adds “100” to the time-related information when the next HTML file is opened (step B20).
[0048]
In this way, sentences from news articles and the like that can be used in conversation are extracted and registered in the article sentence database 19 together with the time-related information and the previous utterance date and time. FIG. 6 shows an example of the article sentence database 19 created by the article sentence database creation process. The records with time-related information “10000” and “10001” belong to a different article (an article at a different address on the homepage) from the records with time-related information “10102” to “10106”. The fact that the time-related information increases in steps of two from “10102” to “10106” means that these sentences form one continuous passage with blank lines in between.
[0049]
(C) Conversation processing
Next, conversation processing using the conversation sentence database 18 and the article sentence database 19 will be described.
[0050]
FIGS. 7 and 8 are flowcharts showing the flow of conversation processing of the conversation system in the first embodiment. FIGS. 9 to 11 are flowcharts showing the flow of the record selection process, the article sentence cut-out utterance process, and the article sentence inappropriate part deletion process included in the conversation process.
[0051]
Before the processing shown in these flowcharts is explained, the conversation processing of this system is described with a specific example to facilitate understanding. Here, a “keyword” is a word that suggests the content of a remark more strongly than other words. In this embodiment, to simplify the explanation, a word consisting of two or more kanji characters, katakana characters, numbers, or a combination of these is recognized as a keyword for starting conversation.
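The simplified keyword rule above (a run of two or more kanji, katakana, or digits) can be approximated with a regular expression. The sketch below is only an illustration: the name `extract_keywords` and the exact Unicode ranges chosen for kanji, katakana, and digits are assumptions made here, not something specified in the text.

```python
import re
from typing import List

# Assumed character classes: CJK ideographs, the iteration mark, katakana
# (including the long-vowel mark), and ASCII or full-width digits.
KEYWORD_RE = re.compile(
    r"[\u4e00-\u9fff\u3005\u30a1-\u30fa\u30fc0-9\uff10-\uff19]{2,}"
)


def extract_keywords(utterance: str) -> List[str]:
    """Return every run of two or more kanji/katakana/digit characters."""
    return KEYWORD_RE.findall(utterance)
```

For instance, `extract_keywords("音楽が好きです")` yields only `["音楽"]`, because the single kanji in “好き” is followed by hiragana and so does not form a run of two characters.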
[0052]
For example, if the user says “I like music”, the keyword in this remark is “music”. Sentence data containing the keyword “music” is therefore searched for in the conversation sentence database 18. With the example of the conversation sentence database 18 shown in FIG. 4 (assuming, as shown in the figure, that no previous utterance date and time has been recorded and that no sentence data containing “music” exists other than what is shown), the record selection process (FIG. 9) extracts sentence data such as “Do you understand music?” and “Don't show me my music, don't show me music”. Sentences whose time-related information differs from that of these sentences by 2 or less, namely “I have a fool-like face.”, “If it is a masterpiece, I want to hear it once a day.”, “What is that song?”, and “Chopin.”, are also extracted.
[0053]
One of these is chosen at random and spoken. All of them relate to music, so they are appropriate as replies. The conversation system can thereby give the user the illusion that it understands what he or she said. Moreover, because the wording of the reply is taken from a novel, it is unique, and the flow of the conversation is natural.
[0054]
One conceivable method is to reply using only the sentences spoken by the conversation partner of the character who uttered the keyword-containing sentence in the conversation sentence database 18. In this embodiment, however, replies are drawn from the sentences of both the character who uttered the keyword-containing sentence and the character that person is talking to, so many variations of reply can be obtained. At the same time, because the sentence containing the keyword (“music” in the above example) can itself be used as a reply, the user feels strongly that the system has understood the remark and is satisfied. A reply that always contains the keyword may feel unnatural, but in this embodiment sentences that follow the keyword-containing sentence and do not themselves contain the keyword (“music”) are also spoken, so no such unnaturalness arises. Moreover, because the reply is selected from the flow of a conversation that includes the keyword (“music”), its content is natural in most cases.
[0055]
Here, the time-related information attached to each conversation sentence registered in the conversation sentence database 18 appropriately expresses the differences in time and content between the conversation sentences. For example, even if the user's remark contains the keyword “Kenka” (fight) and the sentence “Deep enough to fight” becomes a candidate, the record immediately after it, which mentions “Piano”, does not become a candidate.
[0056]
Moreover, in this embodiment, a passage that forms a single utterance in the novel, such as the one beginning “Do you understand the music?”, is divided into its individual sentences, and only one of them is spoken at a time. Two sentences naturally carry more information than one, and the more information a reply carries, the more likely it is to include something that conflicts with the user's remarks. Speaking in units of single sentences avoids this and lowers the chance of a discrepancy with the user's remarks, so that sentences from a novel, written with no relation whatsoever to the conversation the system is conducting, can be used in a wide variety of situations.
[0057]
Further, the previous utterance date and time of each spoken sentence is recorded in the conversation sentence database 18 and used during record selection, which prevents the same sentence from being spoken so often that the user becomes bored.
[0058]
If no reply is found in the conversation sentence database 18, the system uses the article sentence database 19, which covers a wider range of topics. The process of selecting a sentence is the same as for the conversation sentence database 18. However, if the selected sentence were spoken as it is, it would be very unnatural as conversation. It is therefore transformed into something natural as conversation by the article sentence cut-out utterance process (FIG. 10).
[0059]
First, expressions that are peculiar to news and would be unnatural if used as they are in conversation are deleted by the article sentence inappropriate part deletion process (FIG. 11).
[0060]
For example, suppose the user says “I like ***” and the record selected in response is: “The ******* outfielder (28), selected as the most valuable player (MVP) of the US Major League's A League, returned home with his wife ** on an aircraft arriving at Kansai Airport on the 9th. He won the top batter and stolen-base king titles, and updated the hit record with 242 hits in the season for the first time in 90 years.” Note that “***” actually contains the name of a person, the name of a place, or the like, but it is masked here with the * symbol to avoid writing proper nouns.
[0061]
In the above example sentence, “(MVP)”, “(28)”, and “on the 9th” are deleted as inappropriate parts. Reading the parentheses aloud as “parenthesis” would clearly be unnatural in conversation. Even if the parenthesis symbols themselves are not read aloud, reading out their contents is still unnatural: in everyday conversation, a noun is not immediately followed by another noun of equal rank, and a person's age is not abruptly stated right after the noun referring to that person. Date expressions such as “the 9th” are meaningful within the month in which the news was issued but meaningless in any other month, and are therefore also deleted.
[0062]
By deleting these, the sentence becomes:
[0063]
“The ******* outfielder, selected as the most valuable player of the US Major League's A League, returned home with his wife ** on an aircraft arriving at Kansai Airport. He won the top batter and stolen-base king titles, and updated the hit record with 242 hits in the season for the first time in 90 years.”
However, this is still quite unnatural as a remark in conversation. The biggest cause is that a single remark carries too much information. In everyday conversation, the amount of information contained in a single remark is very small. In the present embodiment, the amount of information in a remark is measured by the number of keywords.
[0064]
For example, “Do you understand the music? I have a face that seems like an idiot.” contains only two keywords (“music” and “sense of humor”). In contrast, the article sentence quoted above about the outfielder's return home clearly contains more than two keywords. Therefore, a character position is selected at random, and the clause containing that position, delimited by punctuation marks, is cut out (or the whole sentence, if the sentence containing the selected position has no internal punctuation mark). Here, a “clause” is a span delimited by punctuation marks that contains no further punctuation marks.
[0065]
As a result, for example, “US Major League” may be cut out, or “******* outfielder selected as the best player in A League” may be cut out. When “US Major League” is cut out, it contains only one keyword, so it is used as the reply as it is. When the clause “******* outfielder selected as the best player in A League” is cut out, it still contains five keywords, so a character position is again selected at random within this clause, and this time the part containing that character is cut out using keywords as delimiters. If the cut-out part still carries too much information, this is repeated until finally, for example, “To the best player in A League” or “****** outfielder” is obtained.
[0066]
Although these parts are not cut out at punctuation marks as clauses are, they are cut using keywords as delimiters, so the resulting fragments are relatively natural Japanese. Replying with such a fragment gives a somewhat brusque and half-finished impression, but the reply is vague, so any discrepancy with the user's remark is inconspicuous. Note that a sentence ending may be appended to the cut-out part, such as “National League” or “****** outfielder”, before it is spoken.
[0067]
With this method, only part of the related information is spoken, so the user wants to hear the rest. For example, if the user says “baseball” and the reply is “******* outfielder”, the user wants to hear what comes next, and if the reply is “updated for the first time in 90 years”, the user wonders what it is that happened for the first time in 90 years. Deliberately limiting the amount of information in this way adds a touch of mystery to the conversation and arouses the user's curiosity.
[0068]
If no reply can be made from the conversation sentence database 18 or the article sentence database 19 using the keyword in the user's remark, or if the user's remark contains no keyword, the conversation system could end up being very taciturn. The process prepared for such cases is shown as conversation process (2) in FIG. 8.
[0069]
That is, when a remark is made using the article sentence database 19, the sentence selected for that remark is recorded in the previous utterance article sentence buffer 21, and this sentence is used for the next remark. A remark made using the article sentence database 19 speaks only a very small part of the whole sentence, in order to keep the amount of information conversation-like, so the remaining parts can be spoken afterwards.
[0070]
As a result, the following three main effects are obtained.
(1) A certain degree of satisfaction is given to the interest and curiosity caused by the mysterious taste of the remarks using the article sentence database 19.
(2) Giving the user a feeling of continuing conversation about one topic.
(3) Even if the selected article contains information that conflicts with the user's remarks, the resulting discomfort is reduced by releasing the information little by little.
[0071]
Effect (1) is achieved by gradually revealing information that was limited and hidden. Effect (2) eliminates a drawback inherent in “artificial incompetence” systems, namely that the topic jumps around too much. Regarding effect (3), a user feels uncomfortable when the topic is abruptly changed immediately after he or she speaks, but much less so when the topic drifts gradually over time. Also, when a discrepancy appears while information is being released in small, limited amounts, the user tends to wonder what is meant and ask about it rather than feel uncomfortable. Effect (3) exploits these tendencies. However, if all the information in the article were revealed, a great deal of information inconsistent with the user's intent would emerge, and effect (3) could no longer compensate for it. A limit is therefore set, and speaking from the article stops when that limit is reached. In the present embodiment, the number of spoken characters is counted by the cumulative spoken character counter 22, and speaking stops when the count reaches 1/4 of the entire sentence.
[0072]
If the above mechanism is used, in the previous example, for example, the following conversation is possible.
[0073]
User: “I like ***** players”
Conversation system: “******* outfielder of *****”
User: “You know it” (no keyword)
Conversation system: “The top batter and the stolen-base king” (a conventional “artificial incompetence” system would likely switch to a completely different topic or fall silent here, but the present invention can continue the same topic)
User: “Yes” (no keyword)
Conversation system: “On an aircraft arriving at Kansai Airport”
User: “Huh?” (no keyword)
Conversation system: “Updated for the first time in 90 years”
User: “Eh?” (no keyword)
Conversation system: “Season 242 hits”
User: “It's a great record” (the next remark is made using the keyword “record”)
In this way, although there are some unclear points and discrepancies (which in fact lend the conversation a touch of mystery and stimulate the user's curiosity), the system keeps to the same topic, and a conversation responding to the initial remark “I like ***** players” is established. Since each remark is ambiguous, the user also tends to fill in the missing information and interpret it favorably. The article quoted here is a news article reporting that a baseball player had returned home; it was written with no relation whatsoever to the user's remark “I like ***** players”. In this way, sentences written with no relation to the conversation can be used in conversation.
[0074]
If there is no sentence in the previous utterance article sentence buffer 21, a remark is made using the most recent keyword recorded in the keyword history table 20. This further reduces the chance that the system ends up saying nothing. It also gives the user the impression that the conversation is continuing on the same topic, which mitigates the drawback of the topic jumping around too much. For example, in the continuation of the conversation that began with “I like ***** players”, once the exchange based on the keyword “record” is over, the conversation can be resumed using the keyword “***** player” remaining in the keyword history table 20.
[0075]
Hereinafter, specific processing procedures for realizing the above-described conversation system will be described in detail with reference to the flowcharts shown in FIGS. The processing shown in these flowcharts is executed by the CPU 11 provided in the present system reading a program.
[0076]
As shown in FIG. 7, when the conversation processing of this system starts, the CPU 11 first extracts keywords from the user's remark (step C11). Specifically, it searches for a keyword for starting conversation in the text data obtained by performing speech recognition on the user's voice data input through the voice input unit 12. A keyword here is a word that suggests the content of the user's remark more strongly than other words. In the present embodiment, a word composed of two or more kanji characters, katakana characters, numbers, or a combination of these is extracted as a keyword. If the user's remark contains such a keyword (YES in step C12), the CPU 11 writes the extracted keyword, together with the current date and time, into the keyword history table 20 provided in the nonvolatile memory 17 (step C13), and then performs the record selection process on the conversation sentence database 18 using the keyword (step C14). The keyword history table 20 is used in conversation process (2) shown in FIG. 8.
[0077]
As shown in FIG. 9, in the record selection process, the CPU 11 searches the conversation sentence database 18 using the extracted keyword (step D11). If the sentence data registered in the conversation sentence database 18 includes a record containing the keyword (YES in step D12), the CPU 11 extracts, from that record and the records whose time-related information is close to it, those whose previous utterance date and time is not within a predetermined number of days (step D13). A record with close time-related information is one that is close in time and topic; specifically, it is a record whose time-related information differs from that of the keyword-containing record by 2 or less. A record whose previous utterance date and time is not within the predetermined number of days is one that has not been spoken recently; specifically, one that has not been spoken within the last 3 days. If such records can be extracted from the conversation sentence database 18 (YES in step D14), the CPU 11 selects one of them at random as the utterance candidate (step D15). If only one record is extracted, that record is selected.
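The record selection process of FIG. 9 might be sketched as below. The `Record` type, the function name `select_record`, and the in-memory list standing in for the database are assumptions for illustration; the thresholds of 2 (time-related information) and 3 days (previous utterance) are the values given in the text.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Record:
    sentence: str
    time_info: int
    last_spoken: Optional[datetime] = None  # previous utterance date and time


def select_record(db: List[Record], keyword: str, now: datetime,
                  rng=random, max_diff: int = 2, quiet_days: int = 3) -> Optional[Record]:
    """Hypothetical sketch of steps D11 to D15."""
    # Steps D11, D12: records whose sentence contains the keyword.
    hits = [r for r in db if keyword in r.sentence]
    if not hits:
        return None
    cutoff = now - timedelta(days=quiet_days)
    # Step D13: the hits and their neighbours in time-related information,
    # excluding anything spoken within the last `quiet_days` days.
    candidates = [
        r for r in db
        if any(abs(r.time_info - h.time_info) <= max_diff for h in hits)
        and (r.last_spoken is None or r.last_spoken < cutoff)
    ]
    # Steps D14, D15: pick one candidate at random, if any remain.
    return rng.choice(candidates) if candidates else None
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which is convenient when checking that only neighbouring records are ever returned.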
[0078]
Returning to FIG. 7, when a record is selected from the conversation sentence database 18 as the utterance candidate by the record selection process, the CPU 11 speaks the sentence data of that record as the reply to the user (step C15). Specifically, it generates voice data corresponding to the sentence data, converts it into an analog waveform with the D/A converter 15, and outputs it through the voice output unit 14. At this time, the CPU 11 writes the current date and time into the previous utterance date and time field of the selected record in the conversation sentence database 18 (step C16). On the other hand, when no utterance candidate is selected from the conversation sentence database 18, the CPU 11 performs the record selection process on the article sentence database 19 using the keyword (step C17). This record selection process is the same as that in FIG. 9, except that the article sentence database 19 takes the place of the conversation sentence database 18.
[0079]
When a record is selected from the article sentence database 19 as the utterance candidate, the CPU 11 applies the article sentence cut-out utterance process to the sentence data of that record, reshaping it into a form natural for conversation before speaking it as the reply to the user (step C18). The CPU 11 then writes the current date and time into the previous utterance date and time field of the selected record in the article sentence database 19 (step C19).
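The order in which FIG. 7 consults the two databases can be summarised in a small dispatcher. Everything here is a hypothetical sketch: the four callables are placeholders supplied by the caller, and trying several keywords in order is an assumption, since the text does not say how more than one keyword in a remark is handled.

```python
from typing import Callable, List, Optional


def conversation_process_1(
    keywords: List[str],
    select_from_conversation_db: Callable[[str], Optional[str]],
    select_from_article_db: Callable[[str], Optional[str]],
    shorten_article: Callable[[str], str],
    fallback: Callable[[], Optional[str]],
) -> Optional[str]:
    """Hypothetical outline of steps C12 to C18 with the fallback of FIG. 8."""
    for keyword in keywords:  # an empty list means no keyword (step C12)
        reply = select_from_conversation_db(keyword)  # step C14
        if reply is not None:
            return reply  # spoken as it is (step C15)
        article = select_from_article_db(keyword)  # step C17
        if article is not None:
            return shorten_article(article)  # cut down before speaking (step C18)
    # No keyword, or nothing selected from either database:
    # hand over to conversation process (2).
    return fallback()
```

Writing the previous utterance date and time back to the selected record (steps C16 and C19) is left to the selector callables in this outline.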
[0080]
As shown in FIG. 10, in the article sentence cut-out utterance process, the CPU 11 first stores the sentence data (article sentence) extracted from the article sentence database 19 in the previous utterance article sentence buffer 21 provided in the nonvolatile memory 17 (step E11). The CPU 11 then performs the article sentence inappropriate part deletion process on the sentence data held in the previous utterance article sentence buffer 21, deleting the parts that are inappropriate for conversation (step E12). Specifically, as shown in FIG. 11, the CPU 11 searches the sentence data for parentheses and deletes the parenthesis symbols together with the text enclosed between them (step F11). The CPU 11 further searches the sentence data for words expressing a date or time and deletes them (step F12), and also searches for and deletes expressions peculiar to news, such as “according to the **** News Agency” or “according to an investigation by the *** newspaper” (step F13).
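Steps F11 to F13 amount to three successive deletions, which could be sketched with regular expressions as follows. The three patterns are assumptions chosen for illustration only: the text does not specify how date words or news-specific phrases are recognised, so the date pattern (an optional month followed by a day number and “日”) and the “…によると” (“according to …”) pattern are hypothetical stand-ins.

```python
import re

# Step F11: full-width or ASCII parentheses and whatever they enclose.
PAREN_RE = re.compile(r"[（(][^（）()]*[）)]")
# Step F12 (assumed pattern): day-of-month expressions such as "9日" or "8月9日".
DATE_RE = re.compile(r"(?:\d+月)?\d+日")
# Step F13 (assumed pattern): a leading "... によると" attribution phrase.
NEWS_RE = re.compile(r"[^、。]*によると、?")


def delete_inappropriate_parts(sentence: str) -> str:
    """Hypothetical sketch of the inappropriate part deletion of FIG. 11."""
    sentence = PAREN_RE.sub("", sentence)  # step F11
    sentence = DATE_RE.sub("", sentence)   # step F12
    sentence = NEWS_RE.sub("", sentence)   # step F13
    return sentence
```

Note that the date pattern deliberately leaves expressions such as “90年ぶり” untouched, since a span of years is not a schedule-related word in the sense described above.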
[0081]
After the parts inappropriate for conversation have been deleted from the sentence data, the CPU 11 counts the keywords contained in the remaining sentence data. If the number of keywords is n or more (here, n = 3) (YES in step E13), the sentence is judged unsuitable as a remark and is shortened into a form suitable for conversation as follows.
[0082]
That is, the CPU 11 selects a character position in the sentence data at random and cuts out the clause or sentence containing that character, using punctuation marks as delimiters (step E14). It then counts the keywords contained in the cut-out sentence or clause, and if there are n or more (YES in step E15), it cuts the clause into a still shorter fragment using keywords as delimiters (step E16). This is repeated until the number of keywords is less than n, that is, 2 or less.
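One possible reading of steps E13 to E16 is sketched below. The names `cut_out`, `clause_at`, and `count_keywords` are hypothetical, the keyword rule reuses the simplified character classes as an assumption, and the way a fragment is narrowed (cutting at the start of a randomly chosen keyword and keeping the side that holds a randomly chosen character) is only one interpretation of “using keywords as delimiters”.

```python
import random
import re

KEYWORD_RE = re.compile(
    r"[\u4e00-\u9fff\u3005\u30a1-\u30fa\u30fc0-9\uff10-\uff19]{2,}"
)
PUNCT = "、。"


def count_keywords(text: str) -> int:
    return len(KEYWORD_RE.findall(text))


def clause_at(text: str, pos: int) -> str:
    """Return the punctuation-delimited clause containing index pos."""
    start = max(text.rfind(p, 0, pos) for p in PUNCT) + 1
    ends = [i for i in (text.find(p, pos) for p in PUNCT) if i != -1]
    end = min(ends) if ends else len(text)
    return text[start:end]


def cut_out(sentence: str, n: int = 3, rng=None) -> str:
    """Hypothetical sketch of steps E13 to E16."""
    rng = rng or random.Random()
    if count_keywords(sentence) < n:  # step E13: already short enough
        return sentence
    positions = [i for i, ch in enumerate(sentence) if ch not in PUNCT]
    fragment = clause_at(sentence, rng.choice(positions))  # step E14
    while count_keywords(fragment) >= n:  # step E15
        # Candidate cut points: the start of every keyword except one at index 0.
        cuts = [m.start() for m in KEYWORD_RE.finditer(fragment) if m.start() > 0]
        if not cuts:
            break
        cut = rng.choice(cuts)
        pos = rng.randrange(len(fragment))
        # Keep the side of the cut that contains the chosen character (step E16).
        fragment = fragment[:cut] if pos < cut else fragment[cut:]
    return fragment
```

Because cuts are made only at keyword boundaries, no keyword is ever split, and each pass strictly shortens the fragment, so the loop ends once fewer than n keywords remain.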
[0083]
The CPU 11 speaks the sentence, clause, or fragment finally obtained in this way as the reply to the user (step E17). The CPU 11 then adds the number of characters spoken this time to the cumulative spoken character counter 22 (step E18) and determines whether the value of the counter is equal to or less than a predetermined value (step E19). Specifically, it determines whether the cumulative number of spoken characters is equal to or less than 1/4 of the length of the article sentence in the previous utterance article sentence buffer 21. The reason is as follows. When the same article sentence is released little by little over several remarks in conversation process (2), described later, the user at first tends to wonder what is meant and ask about it. If all the information were revealed, however, a great deal of information inconsistent with the intent of the user's remark would emerge and cause a sense of incongruity. Therefore, to stop speaking from the same article sentence once the cumulative number of characters reaches 1/4 of the entire sentence (NO in step E19), the CPU 11 empties the previous utterance article sentence buffer 21 (step E20).
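The bookkeeping in steps E18 to E20 can be illustrated with a small class that combines the buffer and the counter. The class and method names are hypothetical, and resetting the counter whenever a different article sentence is stored is an assumption, since the text does not say when the counter is cleared.

```python
from dataclasses import dataclass


@dataclass
class PreviousArticleBuffer:
    """Hypothetical stand-in for buffer 21 together with counter 22."""
    text: str = ""
    spoken_chars: int = 0

    def load(self, article_sentence: str) -> None:
        # Step E11. Assumption: the counter restarts only for a new sentence.
        if article_sentence != self.text:
            self.text = article_sentence
            self.spoken_chars = 0

    def note_spoken(self, fragment: str, limit_ratio: float = 0.25) -> None:
        self.spoken_chars += len(fragment)  # step E18
        # Step E19: continue while the total is at most 1/4 of the sentence.
        if self.spoken_chars > len(self.text) * limit_ratio:
            self.text = ""  # step E20: empty the buffer
            self.spoken_chars = 0

    def has_sentence(self) -> bool:
        return bool(self.text)
```

Once the buffer has been emptied, `has_sentence()` returns `False`, so later remarks without a keyword would fall through to the keyword history instead of continuing with the same article.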
[0084]
When no reply can be made from either the conversation sentence database 18 or the article sentence database 19 using the keyword contained in the user's remark (no record is selected in steps C14 and C17 of FIG. 7), or when the user's remark contains no keyword (NO in step C12 of FIG. 7), conversation process (2) shown in FIG. 8 is executed.
[0085]
As shown in FIG. 8, in conversation process (2), the CPU 11 checks whether there is a sentence in the previous utterance article sentence buffer 21 (step C20). If there is a sentence (the article sentence selected last time) in the buffer (YES in step C20), the CPU 11 uses that sentence to perform the article sentence cut-out utterance process of FIG. 10 (step C21). On the other hand, if there is no sentence in the buffer (NO in step C20), the CPU 11 extracts the most recently recorded keyword from the keyword history table 20 (step C22). The keyword recorded for the current remark is excluded, as are keywords recorded more than a predetermined time ago. If the keyword history table 20 contains such a keyword (YES in step C23), the CPU 11 again uses it to extract a reply sentence from the conversation sentence database 18 or the article sentence database 19 and speaks it (steps C24 to C29). Steps C24 to C29 are the same as steps C14 to C19 in FIG. 7. If no reply can be made here either, the conversation process ends without replying to the user's remark.
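The keyword lookup in steps C22 and C23 could look roughly like this. The function name, the representation of the keyword history table as a list of (keyword, recorded time) pairs, and the `max_age` parameter standing in for the unspecified “predetermined time” are all assumptions made for this sketch.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def recent_history_keyword(
    history: List[Tuple[str, datetime]],
    current_keywords: List[str],
    now: datetime,
    max_age: timedelta,
) -> Optional[str]:
    """Hypothetical sketch of steps C22 and C23."""
    # Walk the history from the newest entry to the oldest.
    for keyword, recorded_at in sorted(history, key=lambda e: e[1], reverse=True):
        if keyword in current_keywords:
            continue  # the keyword recorded for the current remark is excluded
        if now - recorded_at > max_age:
            continue  # too old to be treated as the current topic
        return keyword
    return None
```

A result of `None` corresponds to the NO branch of step C23, in which the process ends without a reply.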
[0086]
As described above, the conversation system of the present invention finds, in the conversation sentence database 18, a conversation sentence containing the keyword in the user's remark, and replies with a conversation sentence whose time-related information differs little from that sentence. Although these conversation sentences were originally written with no relation whatsoever to the conversation at hand, the reply is connected in content to the user's remark and is natural in expression as conversation. A novel does not state the time interval between conversation sentences, but time-related information that appropriately expresses the temporal and topical relationship between conversation sentences can be created by taking into account the line feed codes within conversation sentences, the line feed codes of narrative text (sentences that are not conversation sentences) and chapter titles, and the blank lines that mark larger breaks in content. Because conversation sentences are selected using this information, an appropriate sentence that is connected in time and content to the user's remark can be selected and spoken.
[0087]
In addition, an utterance in a novel that consists of multiple sentences (the sentences enclosed in one pair of quotation marks) is divided so that each sentence becomes a separate remark. This limits the amount of information in each remark, makes clear discrepancies less likely, and allows the sentences to be used in conversations in many situations.
[0088]
Since the system uses not only the sentences that reply to a conversation sentence containing the keyword in the user's remark but also the sentences of the speaker of that conversation sentence, many conversation sentences can be selected, and the replies are rich in variety. This also makes possible both replies that contain the spoken keyword itself and replies that do not, so the user can be given, without any unnaturalness, a strong sense of satisfaction that the conversation system has understood what he or she said.
[0089]
Since the identification and extraction of conversation sentences and the calculation of time-related information are performed when the conversation sentence database is created, the processing load on the CPU 11 during conversation is reduced. In addition, since only conversation sentences are extracted and held in the conversation sentence database 18, conversation-like exchanges can be realized with a small amount of memory.
[0090]
Meanwhile, when a remark is made using the article sentence database 19, which is separate from the conversation sentence database 18, expressions peculiar to news articles that would be unnatural in conversation are removed first, so the remark is not very unnatural even though a news article is being reused. Moreover, since only part of the article sentence is cut out and used so that its amount of information approaches that of a conversational remark, conversation-like remarks can be produced from written text that was never intended for conversation. Because the sentence is cut at punctuation marks and keywords, the cut points are not unnatural. Since the remarks are fragmentary, their meaning is vague and leaves a great deal of room for interpretation, which makes it more likely that the user will interpret them favorably within the flow of the conversation.
[0091]
Further, since the number of keywords is used as the measure of the amount of information, the sentence need not be analyzed grammatically, and the amount of information can be estimated easily. Since fragmentary remarks are ambiguous in meaning, they arouse the user's interest and curiosity and make the user want to continue the conversation. Since the text of the news article used for a remark is stored and other parts of it are cut out for later remarks, the user's interest and curiosity can be satisfied little by little. This also makes it possible to speak several times on the same topic, giving the user the feeling of continuing a conversation on one topic. As a result, an appropriate remark can be produced even when no keyword is found in the user's current remark, or when a keyword is found but no remark can be produced with it.
[0092]
Also, since any discrepancy between the news article and the user's remarks is revealed only little by little, it is easier for the user to accept than if the entire content of the news article were revealed at once, and the user is more likely to accept it and carry the conversation forward.
[0093]
In addition, for remarks that use the text of a news article, the system tracks how much of the information in the text has already been spoken and stops once a certain proportion has been spoken, without speaking the whole sentence. The entire content of the news article is therefore never revealed, and discrepancies with the user's remarks are kept inconspicuous.
[0094]
In addition, remarks made using the article sentence database 19 might, on their own, give a fragmentary or brusque impression, but because they are mixed with remarks made using the conversation sentence database 18, the conversation as a whole gives a natural impression. Likewise, when the conversation sentence database 18 is created from a novel or a play and used alone, the conversation system may give the impression of being a little too harsh, and combining it with remarks from the article sentence database 19 can weaken that impression as well.
[0095]
In addition, since the keywords contained in the user's remarks are kept as a history and used to make remarks after the conversation has moved on, the user can be made to feel that the topic is being maintained as the conversation proceeds. This also makes it possible to produce an appropriate remark even when no keyword is found in the user's remark, or when a keyword is found but no remark can be produced with it.
[0096]
In a conventional “artificial incompetence” conversation system, the relationship between a pattern registered in the database (called a “registered pattern”) and the reply data corresponding to that pattern (called a “registered reply”) is simple and fixed. In the present invention, the sentence data registered in the database corresponds to a “registered reply”, but it also contains “registered patterns” within itself. “Registered patterns” and “registered replies” are associated by the algorithm in a many-to-many relationship. For example, in the article sentence database 19 shown in FIG. 6, the sentence “according to ****** military broadcasts, the tank is torn at the bottom” is a “registered reply” for the many “registered patterns” (keywords) contained in the record immediately before it, “******* Autonomous Region ** in the central part of **** near the colony of *** people on the night of the 14th to the intensification of conflict in the future”, and also for the many “registered patterns” contained in the record immediately after it, “according to military broadcasts, said that the statement was issued”. It is, in addition, a “registered reply” for the many “registered patterns” that it contains itself. As a result, many appropriate “registered replies” can be provided for a wide variety of “registered patterns”. Registering such relationships in a conventional “artificial incompetence” conversation system would be difficult and would require a large storage area.
[0097]
In the embodiment described above, the conversation sentence database 18 is created from a novel, but a play, a movie or drama script, a record of rakugo or comics, a record of actual conversation, or the like may be used instead. When these are used, the format specific to each is used to identify conversation sentences and to calculate time-related information. In a play, for example, lines are often not enclosed in quotation marks (“”). In many cases, the name of the character speaking is written at the beginning of the line, followed by a fixed amount of space and then the line itself. Text that is not a line of dialogue is sometimes enclosed in parentheses (). For such a play, conversation sentences can therefore be extracted using these formats instead of quotation marks.
[0098]
Moreover, although an example in which the article sentence database 19 is created from news articles on the Internet has been shown, any source that contains sentences, such as a book, may be used.
[0099]
In addition, although line feed codes and blank lines are used to calculate time-related information, punctuation marks (.), quotation marks (“ ”), or chapter, paragraph, or item headings may be used instead. The number of characters may also be used (for example, the number of characters between the first characters of successive sentences).
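The character-count variant mentioned above can be sketched as follows. This Python example simply records, for each sentence, the offset of its first character in the source text, so that the difference between two offsets is the number of characters between the first characters of the two sentences. The function name and the sentence-splitting rule are assumptions; the weighting of line feeds and blank lines used in the first embodiment is not modeled here.

```python
import re

# A sentence is assumed to run up to and including "。" or ".".
SENTENCE = re.compile(r"[^。.]+[。.]?")

def char_offset_records(text: str) -> list[tuple[str, int]]:
    """Return (sentence, offset of its first character) for each sentence."""
    records = []
    for match in SENTENCE.finditer(text):
        chunk = match.group()
        sentence = chunk.strip()
        if sentence:
            leading = len(chunk) - len(chunk.lstrip())
            records.append((sentence, match.start() + leading))
    return records

records = char_offset_records("It rained. We stayed in.\n\nNext day it cleared.")
```

With offsets stored this way, a larger gap (for example across a blank line or a long passage) shows up directly as a larger difference between the values of neighbouring records.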
[0100]
Moreover, although an example in which every record has time-related information has been shown, only some records may have it, for example only those at which the value of the time-related information changes greatly. Alternatively, the time-related information may be held in a non-numerical form. For example, a blank record or the like may be inserted where the time-related information is widely separated.
[0101]
Further, although a keyword is defined as a word of two or more characters consisting of kanji, katakana, numerals, or a combination thereof, other definitions may be used. For example, the limit on the number of characters may be changed, or phrases that mix kanji and hiragana, or phrases that consist only of hiragana, may be included.
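The keyword definition above (two or more consecutive kanji, katakana, or numerals) can be approximated with a regular expression. The following Python sketch is an assumption about how such a rule might be written; the Unicode ranges chosen and the function name are illustrative, and the sample utterance is invented.

```python
import re

# Two or more consecutive characters drawn from:
#   U+4E00-U+9FFF  common kanji
#   U+30A1-U+30FA  katakana letters, U+30FC the long-vowel mark
#   0-9 and U+FF10-U+FF19  ASCII and full-width digits
KEYWORD = re.compile(r"[\u4e00-\u9fff\u30a1-\u30fa\u30fc0-9\uff10-\uff19]{2,}")

def extract_keywords(utterance: str) -> list[str]:
    """Return keyword candidates in order of appearance."""
    return KEYWORD.findall(utterance)

keywords = extract_keywords("腹黒い人間は2002年にテレビを見た")
```

Hiragana breaks a run, so a single kanji followed by hiragana (such as the verb at the end of the sample) is not extracted under this rule.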
[0102]
The keywords are extracted from the sentence after speech recognition and kanji conversion. However, they may instead be extracted in the hiragana state, before kanji conversion is performed, or at the stage where the input is recognized as sound, as in keyword spotting. In these cases, information for determining which words are keywords may be provided. For example, a database in which only keywords are registered may be used.
[0103]
Moreover, although an example in which keywords found in the user's utterance are searched for in the conversation sentence database 18 and the article sentence database 19 has been shown, more complex patterns in the user's utterance may be looked for. For example, AND and OR combinations of a plurality of keywords, word order, wildcard characters, or word classes such as parts of speech may be specified. The pattern used to search the conversation sentence database 18 and the article sentence database 19 need not be a pattern taken directly from the user's utterance; it may be a pattern created from it or a pattern selected in correspondence with it.
[0104]
Moreover, although the amount of information is measured by the number of keywords, other measures may be used. For example, the number of characters, the number of kanji characters, the number of readings, the number of nouns, the number of verbs, or a combination thereof may be used.
[0105]
In addition, although the amount of information in a statement taken from an article is always kept below a certain value, statements with a large amount of information may occasionally be made while the amount of information is kept small on the whole. For example, the number of keywords included in a statement may be chosen at random in the range of 1 to 5.
[0106]
In addition, the process of cutting out part of a sentence to reduce the amount of information in one statement is applied only to article sentences, but it may also be applied to conversational sentences. Plays sometimes contain very long lines, and applying the process to them may be effective. Also, although the article sentence database 19 is used only when no statement can be produced from the conversation sentence database 18, a statement may be taken from the article sentence database 19 with a certain probability even when the conversation sentence database 18 can produce one. It is sufficient that, on the whole, statements from the conversation sentence database 18 are given priority.
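The probabilistic choice between the two databases described above might look like the following Python sketch. The function name, the default probability of 0.2, and the representation of candidates as plain lists of strings are all assumptions for illustration.

```python
import random

def choose_statement(conversation_candidates, article_candidates,
                     article_probability=0.2, rng=random):
    """Prefer conversation sentences, but sometimes pick an article sentence.

    article_probability is a hypothetical tuning value; a small value keeps
    the conversation sentence database preferred on the whole.
    """
    use_article = bool(article_candidates) and (
        not conversation_candidates or rng.random() < article_probability
    )
    pool = article_candidates if use_article else conversation_candidates
    return rng.choice(pool) if pool else None
```

Setting `article_probability` to 0 reproduces the behaviour of the first embodiment, in which the article sentence database is consulted only when the conversation sentence database yields nothing.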
[0107]
In addition, although sentences that follow the sentence containing the keyword in the user's utterance are taken as statement candidates, sentences that precede it may also be included among the candidates, on condition that the difference in time-related information is within a certain range. For example, with data such as that in the conversation sentence database 18 of FIG. 4, when the keyword “masterpiece” is found in the user's utterance, the first embodiment does not select “However, ”; such a candidate, whose difference in time-related information is negative but small in absolute value, may nevertheless be made selectable.
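The candidate rule above can be sketched in Python as a window on the difference in time-related information. The record layout (a list of `(sentence, time_value)` pairs), the function name, the thresholds, and the sample values are assumptions; the actual values in FIG. 4 are not reproduced here.

```python
def select_candidates(records, keyword, max_forward=3, max_backward=0):
    """Collect sentences near any sentence that contains the keyword.

    A sentence is a candidate when its time-related value exceeds that of
    the hit by at most max_forward, or (optionally) precedes the hit by at
    most max_backward. max_backward=0 keeps only following sentences.
    """
    candidates = []
    for hit_text, hit_time in records:
        if keyword not in hit_text:
            continue
        for text, time_value in records:
            diff = time_value - hit_time
            if 0 < diff <= max_forward or -max_backward <= diff < 0:
                if text not in candidates:
                    candidates.append(text)
    return candidates

records = [("A", 1), ("B masterpiece", 2), ("C", 3), ("D", 10)]
forward_only = select_candidates(records, "masterpiece")
with_preceding = select_candidates(records, "masterpiece", max_backward=1)
```

In the sample, "D" is excluded in both cases because its value is too far from the hit, while "A" becomes selectable only when a small negative difference is allowed.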
[0108]
Although the text up to a line feed code is treated as one record of the article sentence database 19, records may instead be formed sentence by sentence, as in the conversation sentence database 18. Conversely, one record in the conversation sentence database 18 may extend up to a line feed code or up to a closing bracket (that is, one whole remark).
[0109]
Moreover, although the sentence itself that was selected at the time of the last utterance is stored in the last utterance article sentence buffer 21, information designating the corresponding record of the article sentence database 19 may be stored instead.
[0110]
(Second Embodiment)
Next, a second embodiment of the present invention will be described.
[0111]
FIG. 12 is a block diagram showing a hardware configuration of a conversation system according to the second embodiment of the present invention. The same parts as those in FIG. 1 (first embodiment) are denoted by the same reference numerals and their description is omitted; only the differences are described here. The conversation system in the second embodiment has an electronic book reader function and an electronic dictionary function in addition to its original conversation function. That is, as shown in FIG. 12, various electronic book data 31 are stored in the electronic book data area of the nonvolatile memory 17, and the conversation system can read them aloud in response to an instruction from the user. In addition, various electronic dictionary data 32 such as a “Japanese dictionary” and an “encyclopedia” are stored in the electronic dictionary data area of the nonvolatile memory 17, and the content of the entry corresponding to the user's question can be read aloud. The electronic dictionary data 32 consists of headwords and the explanatory text corresponding to each headword.
[0112]
FIGS. 13 and 14 are flowcharts showing the flow of conversation processing of the conversation system in the second embodiment. Before the processing shown in these flowcharts is explained, an outline of the conversation processing of the second embodiment is given to aid understanding.
[0113]
The second embodiment is characterized in that statements in a conversation are created using the electronic book data 31 and the electronic dictionary data 32. Because the electronic book data 31 is the book information itself as supplied in an electronic book, unlike the conversation sentence database 18 it is text information (text data) in which conversational sentences are mixed with non-conversational sentences (narrative sentences), blank lines, the author name, the title, the table of contents, and the like. It contains no time-related information. Therefore, when a keyword in the user's utterance is found in the electronic book data 31, it is first determined whether the text in question is a sentence (for example, by checking each line for punctuation marks or quotation marks). If it is not a sentence (an author name, a table of contents, and so on), it is excluded. If it is a sentence, processing corresponding to the conversation sentence database creation process (FIG. 2) of the first embodiment is performed when it is a conversational sentence, and processing corresponding to the article sentence database creation process (FIG. 5) of the first embodiment is performed when it is not.
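The line-by-line determination outlined above can be illustrated with a small Python classifier. The category names, the character sets used for the tests, and the function name are assumptions; the text only states that punctuation marks and quotation marks are checked for each line.

```python
QUOTE_MARKS = '「」“”"'
PUNCTUATION = "。、.,!?！？"

def classify_line(line: str) -> str:
    """Classify a line of e-book text with simple, assumed heuristics."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if any(ch in QUOTE_MARKS for ch in stripped):
        return "conversation"   # treated like the conversation sentence database
    if any(ch in PUNCTUATION for ch in stripped):
        return "narrative"      # treated like the article sentence database
    return "other"              # author name, title, table of contents, etc.
```

Lines classified as "other" or "blank" would be excluded, and the remaining two categories would be routed to the processing that corresponds to FIG. 2 or FIG. 5 respectively.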
[0114]
As for the time-related information, it is calculated at the time of use, only for the vicinity of the sentence in which the keyword was found and relative to that sentence. Here, when the sentence in which the keyword was found is a conversational sentence, only conversational sentences are targeted, and when it is a narrative sentence (non-conversational sentence), only narrative sentences are targeted. This is because conversational sentences and narrative sentences usually differ completely in content even when they are adjacent. For example, in the sentence example shown in FIG. 3, the conversational sentences before and after the narrative sentence “He will finally become a bruise. Squint his eyes and listen to a distant radio” are on the topic of music, whereas the narrative sentence itself is a depiction of a character.
[0115]
On the other hand, when a keyword included in the user's utterance is found among the headwords of the electronic dictionary data 32, processing that basically corresponds to the article sentence database creation process (FIG. 5) of the first embodiment is applied to the explanatory text of that headword. At this time, instead of the article sentence inappropriate part deletion process (FIG. 11) described above, a process is performed that deletes dictionary-specific symbols, information, and the like that would be inappropriate if included in a statement. Fixed words and phrases to be deleted may be registered in advance in the inappropriate word/phrase database 33 provided in the nonvolatile memory 17, and any word or phrase that matches may be deleted automatically. Since the format of a dictionary is consistent, the format is also used to delete information that is inappropriate for a statement.
[0116]
For example, assume that data in the following format is extracted from the “Japanese dictionary” of the electronic dictionary data 32 using the keyword “black-hearted” (haraguroi).
[0117]
Haraguroi [black-hearted]
(adjective) [literary] ku haraguro-shi
The mind is twisted. There are tricks and intrigues in the mind.
“--i person”
[derived] --sa (noun)
In this data, everything other than “The mind is twisted. There are tricks and intrigues in the mind.” is deleted. In this example, this can be realized by applying a rule of deleting lines that contain no sentence punctuation. Alternatively, since (adjective), [literary], [derived], (noun), and the like are notations whose roles are fixed in advance in this dictionary, they may be registered beforehand in the inappropriate word/phrase database 33 and deleted by referring to it. In this example, a rule of deleting portions enclosed in brackets may also be used. A combination of rules may also be used, such as deleting lines containing “-”, deleting quoted examples enclosed in “ ”, and deleting the headword line and the line that follows it.
[0118]
Then, to reduce the amount of information in the same way as the article sentence cut-out utterance process in the first embodiment, “The mind is twisted” is cut out from the extracted “The mind is twisted. There are tricks and intrigues in the mind.” This makes it possible to realize, for example, a conversation such as the following.
User: “What a black-hearted guy”
Conversation system: “The mind is twisted”
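As a minimal sketch of the dictionary clean-up described above, the following Python example applies the rule of discarding lines without sentence punctuation and then cuts out the first sentence. The entry is an English rendering of the example and is used here only as sample data; the function names and the choice of sentence-ending characters are assumptions.

```python
ENTRY = """Haraguroi [black-hearted]
(adjective) [literary] ku haraguro-shi
The mind is twisted. There are tricks and intrigues in the mind.
"--i person"
[derived] --sa (noun)"""

SENTENCE_ENDERS = (".", "。")

def keep_definition_lines(entry: str) -> list[str]:
    """Keep only lines that end with a sentence-ending mark."""
    lines = (line.strip() for line in entry.splitlines())
    return [line for line in lines if line.endswith(SENTENCE_ENDERS)]

def first_sentence(text: str) -> str:
    """Cut out the text before the first sentence-ending mark."""
    for index, ch in enumerate(text):
        if ch in SENTENCE_ENDERS:
            return text[:index]
    return text

definition = keep_definition_lines(ENTRY)
statement = first_sentence(definition[0])
```

The other rules mentioned in the text (deleting bracketed portions, lines containing a dash, or quoted examples) could be added as further filters and combined in the same way.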
[0119]
The above is an outline of the conversation processing in the second embodiment. Furthermore, the second embodiment can realize a kind of conversation that was essentially impossible in the first embodiment. That is, in the first embodiment, as in a general “artificial incompetence” conversation system, the conversation system does not understand the meaning of the sentences registered in the conversation sentence database 18 and the article sentence database 19, which are the material for generating statements. However, it can roughly understand the electronic dictionary data 32. This is because, in a “Japanese dictionary” or an “encyclopedia”, for example, there is normally a headword, and the description that follows is an explanation of that headword. A conversation that makes use of this fact therefore becomes possible. This part of the conversation processing is the conversation process (2) shown in FIG. 14.
[0120]
In the conversation process (2), when a reply to the user's utterance cannot be created from the electronic book data 31 or the electronic dictionary data 32, the conversation system can use the explanatory text of the electronic dictionary data 32 to make a series of statements such as the following.
[0121]
Conversation system: “By the way, do you know what ‘black-hearted’ means?”
User: “I don't know” (if the user did not know)
Conversation system: “The mind is twisted. It means that there are tricks and intrigues in the mind. How about that? Did you learn something?”
In this way, a conversation with a distinctive flow is realized, yet there is no need to write scenarios one by one as in the “scenario method”. Here too, since inappropriate words and phrases are deleted, no unnatural conversation arises even though the electronic dictionary data 32 is used. In addition, since the system can make statements of its own by this method, a talkative conversation system can be realized. Furthermore, since the enormous number of entries in the electronic dictionary data 32 can be used, a fresh topic can be provided every time, no matter how many times this method is used.
[0122]
Hereinafter, a specific processing procedure for realizing the conversation system of the second embodiment will be described in detail with reference to the flowcharts shown in FIGS. 13 and 14. The processing shown in these flowcharts is executed by the CPU 11 of this system reading the program.
[0123]
As shown in FIG. 13, when the conversation processing of this system is started, the CPU 11 first extracts a keyword included in the user's utterance and searches the electronic book data 31 for the keyword (step G11). If the electronic book data 31 contains a sentence including the keyword (YES in step G12), the CPU 11 determines whether the sentence is a conversational sentence, for example by checking for quotation marks (“ ”) (step G13).
[0124]
If it is a conversational sentence (YES in step G13), the CPU 11 finds the conversational sentences that follow it in the electronic book data 31 and calculates their relative time-related information (step G14). The CPU 11 then selects the conversational sentence to reply with based on the calculated time-related information and speaks it (step G15). If the sentence is not a conversational sentence (NO in step G13), the CPU 11 finds the non-conversational sentences that follow it in the electronic book data 31 and calculates their relative time-related information (step G16). After selecting the sentence to reply with based on the calculated time-related information (step G17), the CPU 11 applies processing similar to the article sentence cut-out utterance process to that sentence, deletes the parts inappropriate for conversation, and speaks (step G18).
[0125]
On the other hand, when no sentence in the electronic book data 31 contains the keyword included in the user's utterance (NO in step G12), the CPU 11 searches the headwords of the electronic dictionary data 32 for that keyword (step G19). If a headword matching the user's keyword exists (YES in step G20), the CPU 11 deletes the symbols and expressions specific to the electronic dictionary from the explanatory text corresponding to the headword (step G21), then applies processing similar to the article sentence cut-out utterance process to the remaining explanatory text, deletes the parts inappropriate for conversation, and speaks (step G22).
[0126]
Further, when there is no user keyword in the electronic dictionary data 32 and no reply can be made (NO in step G20), the conversation process (2) shown in FIG. 14 is executed.
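The branching of FIG. 13 described in the preceding paragraphs can be summarized in a simplified Python sketch. Several simplifications are assumptions made here: conversational sentences are recognized only by a leading quotation mark, the selection based on time-related information (steps G14 to G17) is replaced by "take the next sentence of the same kind", the deletion of inappropriate parts (steps G18, G21, G22) is omitted, and a hit with no following sentence of the same kind falls through to the dictionary. All names and sample data are hypothetical.

```python
def is_conversation(sentence: str) -> bool:
    """Assumed test for step G13: the sentence opens with a quotation mark."""
    return sentence.startswith(("「", "“", '"'))

def reply(keyword, ebook_sentences, dictionary):
    """Return (source, text) following the order of FIG. 13."""
    # Steps G11-G12: search the e-book text for the keyword.
    for index, sentence in enumerate(ebook_sentences):
        if keyword in sentence:
            wanted = is_conversation(sentence)          # step G13
            # Steps G14-G17, simplified: next sentence of the same kind.
            for later in ebook_sentences[index + 1:]:
                if is_conversation(later) == wanted:
                    return ("ebook", later)
            break
    # Steps G19-G20: search the dictionary headwords.
    if keyword in dictionary:
        return ("dictionary", dictionary[keyword])
    # NO in step G20: hand over to the conversation process (2).
    return ("process2", None)

ebook = [
    '"Did you hear the concert?"',
    "He squinted.",
    '"The pianist was superb."',
    "Rain fell on the roof.",
]
entries = {"black-hearted": "The mind is twisted."}
```

For instance, `reply("concert", ebook, entries)` skips the narrative line and answers with the next quoted line, while a keyword found nowhere leads to the `"process2"` branch.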
[0127]
As shown in FIG. 14, in the conversation process (2), the CPU 11 randomly selects an entry from the electronic dictionary data 32 (step G23). Letting W be the headword of the selected entry, the CPU 11 generates a question for the user that quotes the headword W, for example “Do you know what W means?”, and speaks it (step G24). If the user's reply is an affirmative sentence such as “I know” (YES in step G25), the CPU 11 speaks a closing line such as “I want to know” (step G26), and the processing here ends.
[0128]
If the user's reply is a negative sentence such as “I don't know” (NO in step G25), the CPU 11 extracts the explanatory text corresponding to the headword W from the electronic dictionary data 32, deletes the words and phrases inappropriate for conversation from it, reads it aloud, and finally speaks a closing line such as “How about that? Did you learn something?” (step G28), and the processing ends.
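The conversation process (2) can be sketched as a short Python function. The way a negative reply is detected (a simple substring test), the wording of the closing lines, the callback used to obtain the user's reply, and the function name are assumptions for illustration; the deletion of inappropriate words before reading the explanation aloud is omitted.

```python
import random

def conversation_process_2(dictionary, get_user_reply, rng=random):
    """Return the list of statements the system makes in one run."""
    headword = rng.choice(sorted(dictionary))                        # step G23
    said = [f"By the way, do you know what {headword} means?"]       # step G24
    answer = get_user_reply(said[0])
    if "don't know" in answer.lower():                               # NO in step G25
        # Read out the explanation, then a hypothetical closing line (step G28).
        said.append(dictionary[headword] + " How about that? Did you learn something?")
    else:                                                            # YES in step G25
        said.append("I want to know")                                # step G26
    return said

entries = {"black-hearted": "The mind is twisted."}
taught = conversation_process_2(entries, lambda question: "I don't know")
skipped = conversation_process_2(entries, lambda question: "I know")
```

Because the headword is drawn at random from the whole dictionary, repeated runs over a large dictionary would normally raise a different topic each time.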
[0129]
As described above, conversation-like statements can be created even from the electronic book data 31 and the electronic dictionary data 32, which in themselves have nothing to do with the conversation between the conversation system and the user. In addition, since the conversation is carried out using the electronic book data 31 and the electronic dictionary data 32, it can be realized without separately holding data for conversation. Furthermore, because the vast amount of information registered in them can be used, there are advantages such as always being able to handle a wide range of topics.
[0130]
In the second embodiment, the conversation system reads the electronic book data 31 and the electronic dictionary data 32 aloud, but a display unit may be provided to display them. In addition, when the sentence in which the keyword is found is a conversational sentence, a conversational sentence is selected as the sentence to be used, and when it is a narrative sentence, a narrative sentence is selected. However, the sentence may be selected without this distinction. In that case, the conversation takes many unexpected turns.
[0131]
In short, the present invention is not limited to the above-described embodiments, and various modifications can be made at the implementation stage without departing from the scope of the invention. Furthermore, the above embodiments include inventions at various stages, and various inventions can be extracted by appropriately combining the plurality of disclosed constituent elements. For example, even if some constituent elements are deleted from all the constituent elements shown in the embodiments, as long as the problems described in the “Problems to be solved by the invention” section can be solved and the effects described in the “Effects of the invention” section are obtained, the configuration from which those constituent elements have been deleted can be extracted as an invention.
[0132]
In addition, the methods described in the above embodiments can be written, as a program executable by a computer, to a recording medium such as a magnetic disk (flexible disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory and applied to various apparatuses, or the program itself can be transmitted over a transmission medium such as a network and applied to various apparatuses. A computer that implements this apparatus reads the program recorded on the recording medium or provided via the transmission medium, and executes the above-described processing with its operation controlled by this program.
[0133]
[Effects of the invention]
As described in detail above, according to the present invention, a database can be created easily by using existing text information that includes conversational sentences, such as novels, plays, movie and drama scenarios, records of rakugo and comics, and actual conversation records, or existing text information that does not include conversational sentences, such as news articles. By selecting from this database, based on the time-related information, a sentence appropriate as a reply to the user's statement and speaking it, a conversation that flows naturally and is also unique can be realized.
[0134]
In addition, conversation-like statements can be created by using material that in itself has nothing to do with conversation with the user, such as an electronic book or an electronic dictionary.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a hardware configuration of a conversation system according to a first embodiment of the present invention.
FIG. 2 is a flowchart for explaining conversation sentence database creation processing of the conversation system according to the first embodiment.
FIG. 3 is a diagram showing an example of a novel text file that is a target of the conversation sentence database creation process.
FIG. 4 is a diagram showing an example of a conversation sentence database created from the text file of FIG. 3.
FIG. 5 is a flowchart for explaining article sentence database creation processing of the conversation system in the first embodiment.
FIG. 6 is a diagram showing an example of an article sentence database created by the article sentence database creation process.
FIG. 7 is a flowchart for explaining conversation processing of the conversation system in the first embodiment.
FIG. 8 is a flowchart for explaining conversation processing (2) of the conversation system in the first embodiment.
FIG. 9 is a flowchart for explaining in detail a record selection process included in the conversation process.
FIG. 10 is a flowchart for explaining in detail an article sentence cut-out utterance process included in the conversation process.
FIG. 11 is a flowchart for explaining in detail an article sentence inappropriate part deletion process included in the conversation process.
FIG. 12 is a block diagram showing a hardware configuration of a conversation system according to the second embodiment of the present invention.
FIG. 13 is a flowchart for explaining conversation processing of the conversation system in the second embodiment.
FIG. 14 is a flowchart for explaining conversation processing (2) of the conversation system in the second embodiment.
[Explanation of symbols]
11 ... CPU
12 ... Voice input part
13 ... A / D converter
14 ... Audio output section
15 ... D / A converter
16 ... Work memory
17 ... Non-volatile memory
17a ... Program
18 ... Conversation sentence database
19 ... Article sentence database
20 ... Keyword history table
21 ... Last utterance article sentence buffer
22 ... Cumulative message counter
31 ... Electronic book data
32 ... Electronic dictionary data
33 ... Inappropriate phrase database

Claims

A conversation system for conversation with a user,
A sentence extraction means for extracting sentences usable as conversation from existing sentence information;
Time-related information calculating means for calculating time-related information indicating temporal and topical distances between the sentences extracted by the sentence extracting means;
A database that stores the time-related information calculated by the time-related information calculating means added to each sentence extracted by the text extracting means;
A conversation system comprising conversation processing means for selecting a sentence appropriate for a conversation from the database based on the time-related information in response to a user's comment.

A conversation system for conversation with a user,
A sentence extraction means for extracting sentences usable as conversation from existing sentence information;
Time-related information calculating means for calculating time-related information indicating temporal and topical distances between the sentences extracted by the sentence extracting means;
A database that stores the time-related information calculated by the time-related information calculating means added to each sentence extracted by the text extracting means;
Keyword extraction means for extracting keywords from the user's comments;
Search means for searching a sentence including the keyword extracted by the keyword extraction means from the database, and selecting a sentence whose difference from time related information of the sentence is a predetermined value or less as a speech candidate;
A conversation system characterized by comprising speech processing means for speaking using the text selected by the selection means.

The conversation system according to claim 2, wherein information indicating the date and time of the previous utterance is appended to each sentence of the database together with the time-related information, and
the selection means selects, as a speech candidate, a sentence that has not been spoken within a predetermined number of days, based on the previous utterance date and time information.

The conversation system according to claim 2, wherein the speech processing means deletes parts inappropriate for conversation from the sentence selected by the selection means.

The conversation system according to claim 2, wherein the speech processing means deletes parts inappropriate for conversation from the sentence selected by the selection means, divides the sentence into a plurality of parts according to the number of keywords included in the sentence after the deletion, and speaks any one of the divided parts.

A conversation system for conversation with a user,
Storage means for storing specific sentence information in which a conversational sentence and a non-conversational sentence are mixed;
Search means for searching for sentences containing keywords included in the user's remarks from the sentence information stored in the storage means;
A determination means for determining whether the sentence searched by the search means is a conversational sentence or a non-conversational sentence;
First speech processing means for, when the determination means determines that the sentence is a conversational sentence, calculating time-related information indicating temporal and topical distance for the subsequent conversational sentences, selecting a sentence appropriate for conversation based on the time-related information, and speaking;
A conversation system comprising second speech processing means for, when the sentence is determined to be a non-conversational sentence, calculating time-related information indicating temporal and topical distance for the subsequent non-conversational sentences, selecting a sentence appropriate for conversation based on the time-related information, deleting parts inappropriate for conversation from that sentence, and speaking.

A conversation system for conversation with a user,
Storage means for storing specific dictionary information consisting of headwords and corresponding explanatory text;
Search means for searching for a headword including a keyword included in the user's remarks from dictionary information stored in the storage means;
A conversation system comprising first speech processing means for extracting, from the dictionary information, the explanatory text corresponding to the headword found by the search means, deleting parts inappropriate for conversation from the explanatory text, and speaking.

The conversation system according to claim 7, further comprising second speech processing means for, when no headword including a keyword included in the user's utterance exists in the dictionary information, randomly selecting a headword from the dictionary information and continuing the conversation by speaking using the explanatory text corresponding to that headword.

A conversation processing program used in a computer for conversation with a user,
In the computer,
A function to extract sentences that can be used as conversations from existing sentence information;
A function for calculating time-related information indicating temporal and topical distances between the extracted sentences;
A function of adding the time-related information to each extracted sentence and registering it in a database;
A conversation processing program for executing a function of selecting and speaking a sentence appropriate for conversation from the database based on the time-related information in response to a user's comment.

A conversation processing program used in a computer for conversation with a user,
In the computer,
A function to extract sentences that can be used as conversations from existing sentence information;
A function for calculating time-related information indicating temporal and topical distances between the extracted sentences;
A function of adding the time-related information to each sentence and registering it in a database;
A function of extracting keywords from the user's statements;
A function of searching the database for a sentence including the extracted keyword, and selecting, as a speech candidate, a sentence whose difference in time-related information from that sentence is a predetermined value or less;
A conversation processing program for executing a function of speaking using the selected sentence.

A conversation processing program used in a computer having a function of performing conversation with a user and a memory storing specific sentence information in which a conversation sentence and a non-conversation sentence are mixed,
In the computer,
A function of searching for a sentence including a keyword included in a user's remarks from specific sentence information stored in the memory;
A function for determining whether the retrieved sentence is a conversational sentence or a non-conversational sentence;
A function of, when the sentence is determined to be a conversational sentence, calculating time-related information indicating temporal and topical distance for the subsequent conversational sentences, selecting a sentence appropriate for conversation based on the time-related information, and speaking;
A conversation processing program for executing a function of, when the sentence is determined to be a non-conversational sentence, calculating time-related information indicating temporal and topical distance for the subsequent non-conversational sentences, selecting a sentence appropriate for conversation based on the time-related information, deleting parts inappropriate for conversation from that sentence, and speaking.

A conversation processing program used for a computer having a function of performing conversation with a user and a memory storing specific dictionary information including a headword and an explanatory text corresponding thereto,
In the computer,
A function for searching for a headword including a keyword included in a user's remarks from dictionary information stored in the memory;
A conversation processing program for executing a function of extracting, from the dictionary information, the explanatory text corresponding to the headword found by the search, deleting parts inappropriate for conversation from the explanatory text, and speaking.