JP2014167517A

JP2014167517A - Conversation providing system, game providing system, conversation providing method, game providing method, and program

Info

Publication number: JP2014167517A
Application number: JP2013038837A
Authority: JP
Inventors: Yuji Oishi; 雄二大石; Yutaka Kunida; 豊國田; Shinya Ishihara; 晋也石原
Original assignee: Nippon Telegraph and Telephone Corp; Nippon Telegraph and Telephone East Corp
Current assignee: Nippon Telegraph and Telephone Corp; Nippon Telegraph and Telephone East Corp
Priority date: 2013-02-28
Filing date: 2013-02-28
Publication date: 2014-09-11

Abstract

[Problem] To enable a user to communicate smoothly with others online.
[Solution] A user's voice input to a user terminal is speech-recognized, and the recognition result is transmitted to the user's conversation partner. The system may include a dictionary storage unit that stores language models corresponding to user attributes; a language model may be selected, according to the user's attribute, from the plurality of language models stored in the dictionary storage unit, and the user's speech data may be recognized using the selected language model.
[Selected drawing] FIG. 3

Description

The present invention relates to conversation technology that uses speech recognition.

In recent years, online games have become widespread, and techniques for communicating through games, for example by chat, have been proposed. For example, Patent Document 1 discloses a technique in which messages input by each player are collected by a game server device, distributed to each video game device, and displayed in a chat window.

Patent Document 1: JP 2003-325983 A

However, entering chat messages while playing a game is very difficult, which makes smooth communication hard to achieve. This problem is not limited to communication in games; it is common to technologies in general that achieve communication through online chat.
In view of the above circumstances, an object of the present invention is to provide a technology that enables a user to communicate smoothly with others online.

One aspect of the present invention is a conversation providing system comprising: a speech recognition unit that performs speech recognition on a user's voice input to a user terminal; and a communication unit that transmits the recognition result of the speech recognition unit to the user's conversation partner.

One aspect of the present invention is the conversation providing system described above, further comprising a dictionary storage unit that stores language models corresponding to user attributes, wherein the speech recognition unit selects a language model, according to the user's attribute, from the plurality of language models stored in the dictionary storage unit, and performs speech recognition on the user's speech data using the selected language model.

One aspect of the present invention is a game providing system comprising: a game control unit that provides an online game to a user; a speech recognition unit that performs speech recognition on the user's voice input to a user terminal; and a communication unit that transmits the recognition result of the speech recognition unit to the user's conversation partner.

One aspect of the present invention is a conversation providing method comprising: a speech recognition step of performing speech recognition on a user's voice input to a user terminal; and a communication step of transmitting the recognition result of the speech recognition step to the user's conversation partner.

One aspect of the present invention is a game providing method comprising: a game control step of providing an online game to a user; a speech recognition step of performing speech recognition on the user's voice input to a user terminal; and a communication step of transmitting the recognition result of the speech recognition step to the user's conversation partner.

One aspect of the present invention is a computer program for causing a computer to execute: a speech recognition step of performing speech recognition on a user's voice input to a user terminal; and a communication step of transmitting the recognition result of the speech recognition step to the user's conversation partner.

One aspect of the present invention is a computer program for causing a computer to execute: a game control step of providing an online game to a user; a speech recognition step of performing speech recognition on the user's voice input to a user terminal; and a communication step of transmitting the recognition result of the speech recognition step to the user's conversation partner.

According to the present invention, a user can communicate smoothly with others online.

FIG. 1 is a diagram showing a configuration example of a system according to an embodiment of the present invention.
FIG. 2 is a schematic block diagram showing the functional configuration of the user terminal 100.
FIG. 3 is a schematic block diagram showing the functional configuration of the conversation providing system 300.
FIG. 4 is a diagram showing a specific example of the user information table.
FIG. 5 is a diagram showing a specific example of the conversation information table.
FIG. 6 is a diagram showing a specific example of the dictionary table.
FIG. 7 is a flowchart showing a specific example of the operation flow of the conversation providing system 300.
FIG. 8 is a schematic block diagram showing the functional configuration of the conversation providing system 300a.
FIG. 9 is a diagram showing a system configuration example of the online game providing system 400 into which the conversation providing system 300 is incorporated.

Hereinafter, a conversation providing system according to an embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a diagram showing a configuration example of a system according to an embodiment of the present invention. The conversation providing system 300 is connected to a plurality of user terminals 100 via the network 200. Each user terminal 100 is used by a user who converses with others.

The user terminal 100 is configured using an information processing device such as a personal computer, a tablet device, a smartphone, a notebook computer, a workstation, a television receiver, or a videophone device.
FIG. 2 is a schematic block diagram showing the functional configuration of the user terminal 100.
The voice input unit 101 inputs the content of the user's utterance to the user terminal 100. Specifically, the voice input unit 101 receives the sound waves produced when the user speaks and generates an analog signal corresponding to those sound waves. The voice input unit 101 outputs the generated analog signal to the signal processing unit 102.

The signal processing unit 102 converts the analog signal generated by the voice input unit 101 into digital voice data.
The transmission/reception unit 103 attaches the user ID assigned to the user to the voice data generated by the signal processing unit 102. The transmission/reception unit 103 then transmits the voice data with the attached user ID to the conversation providing system 300. The transmission/reception unit 103 also receives conversation data from the conversation providing system 300. The conversation data includes character strings representing the utterances of the user of this user terminal 100 and character strings representing the utterances of users of other user terminals 100.
The display unit 104 displays the content of the conversation data received from the conversation providing system 300. The user of the user terminal 100 can converse with others (text chat) by reading the displayed character strings and speaking.
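The terminal-side path described above (digitize the signal, attach the user ID, transmit) can be sketched as follows. This is only a minimal illustration under stated assumptions: the 16-bit PCM encoding, the `VoicePacket` type, and the function names are hypothetical and are not specified in the source.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VoicePacket:
    """Voice data tagged with the sender's user ID (hypothetical format)."""
    user_id: str   # ID assigned to the user, e.g. "U001"
    pcm: bytes     # 16-bit little-endian PCM samples (assumed encoding)


def quantize(samples: List[float]) -> bytes:
    """Role of the signal processing unit 102: analog-like samples in
    [-1.0, 1.0] are clipped and converted to 16-bit digital voice data."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(s * 32767) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def build_packet(user_id: str, samples: List[float]) -> VoicePacket:
    """Role of the transmission/reception unit 103: attach the user ID."""
    return VoicePacket(user_id=user_id, pcm=quantize(samples))


packet = build_packet("U001", [0.0, 0.5, -1.0, 2.0])
```

The out-of-range sample `2.0` is clipped before conversion, so the packet always carries valid 16-bit values.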

FIG. 3 is a schematic block diagram showing the functional configuration of the conversation providing system 300. The conversation providing system 300 is configured from one or more information processing devices. For example, when the conversation providing system 300 is configured from a single information processing device, that device includes a CPU (Central Processing Unit), a memory, an auxiliary storage device, and the like connected by a bus, and executes a conversation providing program. By executing the conversation providing program, the information processing device functions as a device including a communication unit 301, a user information storage unit 302, a conversation information storage unit 303, a conversation control unit 304, a dictionary storage unit 305, and a speech recognition unit 306.
All or some of the functions of the conversation providing system 300 may be implemented using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The conversation providing system 300 may also be implemented by dedicated hardware. The conversation providing program may be recorded on a computer-readable recording medium. A computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. The conversation providing program may also be transmitted and received via a telecommunication line.

The communication unit 301 communicates with the user terminals 100 via the network 200.
The user information storage unit 302 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The user information storage unit 302 stores a user information table.
The conversation information storage unit 303 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The conversation information storage unit 303 stores a conversation information table.

The conversation control unit 304 distributes a message uttered by a user to the user terminals 100 of the members of the conversation group to which that user belongs. Specifically, it operates as follows. When voice data of a message uttered by a user is received, the conversation control unit 304 reads the user ID attached to the voice data. The conversation control unit 304 determines the attribute information corresponding to that user ID based on the user information table. The conversation control unit 304 outputs the user ID, the attribute information, and the voice data to the speech recognition unit 306. On receiving the recognition result (text data and user ID) from the speech recognition unit 306, the conversation control unit 304 determines the delivery destinations corresponding to the user ID. The conversation control unit 304 then transmits the recognition result (conversation data) to the determined delivery destinations.

The dictionary storage unit 305 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The dictionary storage unit 305 stores a dictionary table, an acoustic model, and a plurality of language models. A language model is prepared in advance for each item of attribute information.
On receiving the user ID, the attribute information, and the voice data from the conversation control unit 304, the speech recognition unit 306 performs speech recognition processing using the language model corresponding to the attribute information. The speech recognition unit 306 then outputs, as the recognition result, the text data obtained by the speech recognition processing together with the user ID associated with the processed voice data to the conversation control unit 304.

FIG. 4 is a diagram showing a specific example of the user information table. In the user information table, attribute information and an address are registered in association with each user ID. The user ID is identification information assigned to the user of each user terminal 100. The attribute information represents an attribute of each user. Specific examples of attribute information include the user's gender, the user's generation, the user's hobbies, and a character selected by the user (for example, a character that appears in a game, a comic, or an animation). The address is the address of the user terminal 100 used by each user.
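As a rough illustration, the user information table can be modeled as a mapping from user ID to an attribute and an address. The rows below are hypothetical (the concrete values of FIG. 4 are not reproduced in this text), and the names `UserRecord`, `USER_TABLE`, and `lookup_attribute` are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class UserRecord:
    attribute: str  # e.g. gender, generation, hobby, or a chosen character
    address: str    # address of the terminal the user is using


# Hypothetical rows; addresses come from the documentation range 192.0.2.0/24.
USER_TABLE: Dict[str, UserRecord] = {
    "U001": UserRecord(attribute="female", address="192.0.2.10"),
    "U002": UserRecord(attribute="samurai_character", address="192.0.2.11"),
}


def lookup_attribute(user_id: str) -> Optional[str]:
    """Return the attribute registered for user_id, or None if unknown."""
    record = USER_TABLE.get(user_id)
    return record.attribute if record else None
```

Returning `None` for an unregistered ID is a design choice of this sketch; the source does not say how unknown users are handled.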

FIG. 5 is a diagram showing a specific example of the conversation information table. In the conversation information table, the user IDs of the users participating in each conversation group are registered in association with that group. In the example of FIG. 5, four users (U001, U003, U004, U005) participate in the conversation group with group ID G001, and two users (U002, U006) participate in the conversation group with group ID G002.
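The group memberships given for FIG. 5 are enough to sketch how delivery destinations could be derived from a speaker's user ID. The function name and the choice to include the speaker among the recipients are assumptions of this sketch, not statements from the source.

```python
from typing import Dict, List

# Conversation information table from the FIG. 5 example:
# group ID -> user IDs of the participants.
CONVERSATION_TABLE: Dict[str, List[str]] = {
    "G001": ["U001", "U003", "U004", "U005"],
    "G002": ["U002", "U006"],
}


def delivery_targets(speaker_id: str) -> List[str]:
    """Return the members of every group the speaker participates in.

    The speaker is included so that their own utterance also appears
    in their chat view (an assumption of this sketch).
    """
    targets: List[str] = []
    for members in CONVERSATION_TABLE.values():
        if speaker_id in members:
            targets.extend(m for m in members if m not in targets)
    return targets
```

For example, an utterance by U002 would be delivered to U002 and U006, while an unregistered ID yields an empty list.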

FIG. 6 is a diagram showing a specific example of the dictionary table. In the dictionary table, a language model ID is registered in association with each item of attribute information. The language model ID is identification information that identifies a language model. Each item of attribute information is thus associated with the identification information of a language model prepared in advance for it. For example, when the attribute information represents a woman, the language model is prepared in advance based on statistical information such as ways of speaking common among women and words often used by women. When the attribute information represents a generation, the language model is prepared in advance based on statistical information such as ways of speaking common in that generation and buzzwords that were popular in that generation. When the attribute information represents a character, the language model is prepared in advance based on statistical information such as the way of speaking defined for the character and words the character often uses.
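The dictionary table amounts to a lookup from attribute information to a language model ID. The sketch below uses hypothetical attribute names and model IDs; the fallback model for unregistered attributes is also an assumption, since the source does not describe that case.

```python
from typing import Dict

# Dictionary table: attribute information -> language model ID.
# All keys and IDs are hypothetical placeholders.
DICTIONARY_TABLE: Dict[str, str] = {
    "female": "LM_F",
    "teens": "LM_TEEN",
    "samurai_character": "LM_SAMURAI",
}

# Assumption: a general-purpose model is used when no entry matches.
DEFAULT_MODEL_ID = "LM_GENERAL"


def select_language_model(attribute: str) -> str:
    """Return the ID of the language model prepared for this attribute."""
    return DICTIONARY_TABLE.get(attribute, DEFAULT_MODEL_ID)
```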

FIG. 7 is a flowchart showing a specific example of the operation flow of the conversation providing system 300. First, the communication unit 301 receives voice data from a user terminal 100 via the network 200 (step S101). The conversation control unit 304 reads the user ID attached to the received voice data. The conversation control unit 304 determines the attribute information corresponding to that user ID based on the user information table. The conversation control unit 304 outputs the user ID, the attribute information, and the voice data to the speech recognition unit 306.

The speech recognition unit 306 determines, based on the dictionary table, the language model corresponding to the attribute information determined by the conversation control unit 304 (step S102). The speech recognition unit 306 performs speech recognition processing using the acoustic model and the language model determined in step S102 (step S103). The speech recognition unit 306 outputs, as the recognition result, the text data obtained by the speech recognition processing together with the user ID associated with the processed voice data to the conversation control unit 304.

On receiving the recognition result (text data and user ID) from the speech recognition unit 306, the conversation control unit 304 determines the delivery destinations corresponding to the user ID (step S104). The conversation control unit 304 then transmits the recognition result (conversation data) to the determined delivery destinations (step S105).
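Steps S101 to S105 can be strung together as in the sketch below. It is not the patented implementation: the recognizer is passed in as a stub callable because no concrete speech recognition engine is named in the source, and all table contents, names, and the message format are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

USER_ATTR: Dict[str, str] = {"U001": "female", "U002": "teens"}        # user information table (attributes only)
GROUPS: Dict[str, List[str]] = {"G001": ["U001", "U003"], "G002": ["U002"]}  # conversation information table
LM_BY_ATTR: Dict[str, str] = {"female": "LM_F", "teens": "LM_TEEN"}    # dictionary table

Recognizer = Callable[[bytes, str], str]   # (voice data, language model ID) -> text
Sender = Callable[[str, dict], None]       # (destination user ID, conversation data)


def handle_voice(user_id: str, audio: bytes,
                 recognize: Recognizer, send: Sender) -> None:
    """Process one utterance that has already been received (S101)."""
    attribute = USER_ATTR[user_id]                # look up the user's attribute
    model_id = LM_BY_ATTR[attribute]              # S102: choose the language model
    text = recognize(audio, model_id)             # S103: speech recognition
    targets = [m for members in GROUPS.values()   # S104: decide the destinations
               if user_id in members for m in members]
    for target in targets:                        # S105: deliver conversation data
        send(target, {"user_id": user_id, "text": text})


# Usage with a stub recognizer and an in-memory "network".
sent: List[Tuple[str, dict]] = []
handle_voice("U001", b"\x00\x01",
             recognize=lambda audio, model: f"<{model}:{len(audio)} bytes>",
             send=lambda target, data: sent.append((target, data)))
```

Passing `recognize` and `send` as parameters keeps the control flow testable without committing to a particular recognizer or transport.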

In the conversation providing system 300 configured in this way, the user can enter a message by speaking, without operating an input device such as a keyboard or mouse. The user can therefore communicate smoothly with others online.

In addition, in the conversation providing system 300, the user's utterance is recognized using the language model corresponding to the attribute selected by the user. This makes it possible to recognize the user's utterance more accurately.

<Modification>
The language model does not necessarily have to be prepared so that it reproduces the user's utterance (voice data) word for word; it may be prepared so that some wording is replaced with wording the character often uses. For example, the language model may be prepared so that, when the voice data represents the phrase "私の名はAである" ("My name is A" in a plain register), the text data "拙者の名はAでござる" (the same statement in an archaic, samurai-style register) is obtained as the result of speech recognition. Preparing the language model in this way makes it possible to reproduce the feel of the character selected by the user more faithfully.
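The effect of this modification can be imitated with a simple post-processing step. Note that this swaps in a different technique: the source builds the substitution into the language model itself, whereas the sketch below applies string replacement to already-recognized text. The rewrite rules and names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical rewrite rules for a samurai-style character:
# (plain wording, character wording), applied in order.
SAMURAI_REWRITES: List[Tuple[str, str]] = [
    ("私", "拙者"),
    ("である", "でござる"),
]


def apply_style(text: str, rewrites: List[Tuple[str, str]]) -> str:
    """Replace plain wording with the character's wording."""
    for plain, styled in rewrites:
        text = text.replace(plain, styled)
    return text


styled = apply_style("私の名はAである", SAMURAI_REWRITES)
```

A rule list like this is easy to inspect, but unlike a language model it has no notion of context, so it would also rewrite occurrences where the substitution is not appropriate.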

The conversation providing system 300 may be configured together with the user terminal 100 by implementing some of its functions in the user terminal 100. For example, the speech recognition unit 306 and the dictionary storage unit 305 may be implemented in the user terminal 100. In this case, the speech recognition processing is executed in the user terminal 100 and the text data is obtained there. The user terminal 100 transmits the obtained text data and the user ID set in the terminal itself to the conversation control unit 304 as conversation data. On receiving conversation data from the user terminal 100, the conversation control unit 304 determines the delivery destinations of the conversation data according to the user ID and distributes the conversation data.

The conversation providing system may be configured to further include a translation dictionary storage unit and a translation unit. The configuration of such a modified conversation providing system (conversation providing system 300a) is described below.
FIG. 8 is a schematic block diagram showing the functional configuration of the conversation providing system 300a. The conversation providing system 300a differs from the conversation providing system 300 in that it further includes a translation dictionary storage unit 311 and a translation unit 312 and in that it includes a conversation control unit 304a in place of the conversation control unit 304; the rest of its configuration is the same as that of the conversation providing system 300.

The translation dictionary storage unit 311 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The translation dictionary storage unit 311 stores the translation dictionary data needed to translate between a plurality of languages. Any existing format may be used for the translation dictionary data.
On receiving the recognition result (text data and user ID) from the speech recognition unit 306, the translation unit 312 translates the text data into another language designated in advance. When performing translation processing, the translation unit 312 uses the translation dictionary data stored in the translation dictionary storage unit 311. When the translation processing is complete, the translation unit 312 outputs the translation result (translated text data and user ID) to the conversation control unit 304a. The target language of the translation unit 312 (the other language designated in advance) may be designated by the user of the user terminal 100, may be designated in advance in the conversation providing system 300a, or may be designated in some other way. When the designated language is the same as the language of the recognition result output from the speech recognition unit 306, the translation unit 312 may output the recognition result unchanged as the translation result without performing translation processing.
The conversation control unit 304a transmits to the delivery destinations the translation result output from the translation unit 312 rather than the recognition result output from the speech recognition unit 306.
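The behavior of the translation unit, including the pass-through when the source and target languages match, can be sketched as below. The toy whole-phrase dictionary and all function names are hypothetical; the source leaves the format of the translation dictionary data open.

```python
from typing import Callable, Dict, Tuple

# Toy translation dictionary keyed by (source language, target language).
# Entries are hypothetical and match whole phrases only.
TRANSLATION_DICT: Dict[Tuple[str, str], Dict[str, str]] = {
    ("ja", "en"): {"こんにちは": "hello"},
}


def dictionary_translate(text: str, src: str, dst: str) -> str:
    """Look the phrase up; return it unchanged if no entry exists."""
    return TRANSLATION_DICT.get((src, dst), {}).get(text, text)


def translate_result(text: str, user_id: str, src: str, dst: str,
                     translate: Callable[[str, str, str], str] = dictionary_translate
                     ) -> Tuple[str, str]:
    """Return (translated text, user ID) for a recognition result."""
    if src == dst:
        # Same language: pass the recognition result through untouched.
        return text, user_id
    return translate(text, src, dst), user_id
```

The user ID is carried through unchanged so that the conversation control unit can still determine the delivery destinations from the translation result.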

<Application example>
The conversation providing system 300 may be configured to operate in cooperation with other services. For example, the conversation providing system 300 may be configured to operate in cooperation with an online game providing system. FIG. 9 is a diagram showing a system configuration example of an online game providing system 400 into which the conversation providing system 300 is incorporated.

The game providing system 400 is configured from one or more information processing devices. For example, when the game providing system 400 is configured from a single information processing device, that device includes a CPU, a memory, an auxiliary storage device, and the like connected by a bus, and executes a game program. By executing the game program, the information processing device functions as a device including a communication unit 301, a user information storage unit 302, a conversation information storage unit 303, a conversation control unit 304, a dictionary storage unit 305a, a speech recognition unit 306, a control unit 401, and a game control unit 402.
All or some of the functions of the game providing system 400 may be implemented using hardware such as an ASIC, a PLD, or an FPGA. The game providing system 400 may also be implemented by dedicated hardware. The game program may be recorded on a computer-readable recording medium. A computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. The game program may also be transmitted and received via a telecommunication line.

In FIG. 9, functional units that are the same as those shown in FIG. 3 are given the same reference numerals.
The control unit 401 controls each functional unit of the game providing system 400. For example, when voice data is received from a user terminal 100, the control unit 401 outputs the received voice data to the conversation control unit 304. When data related to a game is received, the control unit 401 outputs the received data to the game control unit 402. When the control unit 401 receives conversation data from the conversation control unit 304 or game-related data from the game control unit 402, it transmits the data to the destination of that data. The display unit 104 of the user terminal 100 displays the game screen according to the data provided by the game control unit 402 and also displays the character strings of the conversation data. For example, the display unit 104 may display the character strings of the conversation data superimposed on the game screen.
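The inbound routing performed by the control unit 401 can be pictured as a small dispatcher that forwards voice data to the conversation side and game data to the game side. The message kinds `"voice"` and `"game"` and the function names are assumptions of this sketch; the source does not define a message format.

```python
from typing import Any, Callable, Dict, List, Tuple

Handler = Callable[[Any], None]


def make_control_unit(conversation_control: Handler,
                      game_control: Handler) -> Callable[[str, Any], None]:
    """Build a dispatcher that routes received data by its kind."""
    handlers: Dict[str, Handler] = {
        "voice": conversation_control,  # voice data -> conversation control unit
        "game": game_control,           # game data  -> game control unit
    }

    def dispatch(kind: str, payload: Any) -> None:
        if kind not in handlers:
            raise ValueError(f"unknown data kind: {kind}")
        handlers[kind](payload)

    return dispatch


# Usage with handlers that only record what they receive.
log: List[Tuple[str, Any]] = []
dispatch = make_control_unit(lambda p: log.append(("conversation", p)),
                             lambda p: log.append(("game", p)))
dispatch("voice", b"\x01")
dispatch("game", {"move": "left"})
```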

The game control unit 402 stores one or more game programs in advance and provides games to the users of the user terminals 100.
The dictionary storage unit 305a stores an acoustic model and a language model prepared in advance for the game provided by the game control unit 402. When the game control unit 402 stores a plurality of game programs, the dictionary storage unit 305a stores a language model for each game program.

In the game providing system 400 configured in this way, the user can enter a message in an online game by speaking. The user can therefore devote input devices such as the keyboard and mouse entirely to playing the game, and can communicate smoothly with others in the online game.

In addition, in the game providing system 400, a language model is prepared for each type of game. This makes it possible to accurately recognize game-specific terms and the like.
An embodiment of the present invention has been described in detail above with reference to the drawings. However, the specific configuration is not limited to this embodiment, and designs and the like that do not depart from the gist of the present invention are also included.

Description of reference numerals
100: user terminal, 200: network, 300: conversation providing system, 301: communication unit, 302: user information storage unit, 303: conversation information storage unit, 304: conversation control unit, 305: dictionary storage unit, 306: speech recognition unit

Claims

1. A conversation providing system comprising:
a speech recognition unit that performs speech recognition on a user's voice input to a user terminal; and
a communication unit that transmits the recognition result of the speech recognition unit to the user's conversation partner.

2. The conversation providing system according to claim 1, further comprising a dictionary storage unit that stores language models corresponding to user attributes,
wherein the speech recognition unit selects a language model, according to the user's attribute, from the plurality of language models stored in the dictionary storage unit, and performs speech recognition on the user's speech data using the selected language model.

3. A game providing system comprising:
a game control unit that provides an online game to a user;
a speech recognition unit that performs speech recognition on the user's voice input to a user terminal; and
a communication unit that transmits the recognition result of the speech recognition unit to the user's conversation partner.

4. A conversation providing method comprising:
a speech recognition step of performing speech recognition on a user's voice input to a user terminal; and
a communication step of transmitting the recognition result of the speech recognition step to the user's conversation partner.

5. A game providing method comprising:
a game control step of providing an online game to a user;
a speech recognition step of performing speech recognition on the user's voice input to a user terminal; and
a communication step of transmitting the recognition result of the speech recognition step to the user's conversation partner.

6. A computer program for causing a computer to execute:
a speech recognition step of performing speech recognition on a user's voice input to a user terminal; and
a communication step of transmitting the recognition result of the speech recognition step to the user's conversation partner.

7. A computer program for causing a computer to execute:
a game control step of providing an online game to a user;
a speech recognition step of performing speech recognition on the user's voice input to a user terminal; and
a communication step of transmitting the recognition result of the speech recognition step to the user's conversation partner.