JP2025037354A

JP2025037354A - Communication support device and communication support method

Info

Publication number: JP2025037354A
Application number: JP2023144235A
Authority: JP
Inventors: 真二古澤 (Shinji Furusawa); 隆昭木村 (Takaaki Kimura); 史人小林 (Fumito Kobayashi); 伊織久野 (Iori Kuno)
Original assignee: Saxa Inc
Current assignee: Saxa Inc
Priority date: 2023-09-06
Filing date: 2023-09-06
Publication date: 2025-03-18

Abstract

[Problem] To appropriately support each speaker (participant) in a two-party call, telephone conference, or online conference, thereby improving convenience for all speakers and the quality of the call or conference. [Solution] A speech recognition unit 1211 converts voice information from each speaker into text data. An ambiguous utterance detection unit 1212 analyzes the text data from the speech recognition unit 1211 and detects a predetermined portion that requires support. A keyword extraction unit 1213, a search execution unit 123, and a correction sentence creation unit 125 operate to create a message corresponding to the predetermined portion, and a correction provision unit 126 provides the message to the speaker either as text information or after converting it into voice information. [Selected Figure] Figure 2

Description

This invention relates to a device and a method for supporting each speaker when a call is made in which voice information is sent and received between a plurality of speakers over a network.

Patent Document 1, cited below, discloses an invention relating to a sales support system. This sales support system transmits the conversation between a customer and a salesperson as voice data from a communication terminal device via a network to a sales support server. The server corrects errors in what was said in real time and transmits the result to the communication terminal device as an error correction message. As a result, any error in what was said can be corrected immediately. This avoids the trouble caused when a business negotiation proceeds on the basis of incorrect information, and prevents inconvenience to the customer.

Patent Document 1: JP 2002-288352 A

The sales support system disclosed in Patent Document 1 is intended to support the salesperson who deals with the customer, and does not go so far as to correct the customer's mistakes. In recent years, telephone conferences and online conferences (Web conferences) have come to be held frequently. In a telephone conference, a plurality of participants, each at a remote location, send and receive call voice to and from one another through telephone terminals connected via telephone lines, and hold a conference in real time. In an online conference (Web conference), remote participants are connected using an Internet environment and devices such as a PC (Personal Computer) or a mobile communication terminal such as a smartphone, voice information and video information are sent and received between the participants, and a conference is held in real time.

In a telephone conference or Web conference, the participants are not limited to two people, a caller and a callee, as in conventional telephone communication; a conference can be held among two or more people. For this reason, detecting errors in the call content for only one person, for example the organizer of the conference, and notifying only that organizer does not improve convenience for all participants and does not improve the quality of the conference as a whole. In addition to correcting errors in the call content, it is also necessary to supplement, with accurate information, unclear matters that come up in the conference, such as personal names or events. Furthermore, considering the participants and the content of the conference, it is also necessary to prevent the transmission of remarks that are inappropriate from the viewpoint of compliance violations or harassment.

In view of the above, an object of the present invention is to appropriately support each speaker (participant) in, for example, a two-party call, telephone conference, or online conference, thereby improving convenience for all speakers and improving the quality of the two-party call, telephone conference, or online conference.

In order to solve the above problem, the communication support device of the invention described in claim 1 is
a communication support device that relays voice information from each speaker when a call is made in which voice information is sent and received between a plurality of speakers over a network, comprising:
a speech recognition means for converting the voice information from each speaker into text data;
a detection means for analyzing the text data from the speech recognition means and detecting a predetermined portion that requires support;
a creating means for, when the detection means detects the predetermined portion, creating a message corresponding to the predetermined portion, or creating processed voice information by processing the part of the voice information that corresponds to the predetermined portion; and
a providing means for providing the message, as text information or after conversion into voice information, to at least the speaker who provided the voice information in which the predetermined portion was detected, or for providing the processed voice information to a speaker other than the speaker who provided the voice information in which the predetermined portion was detected.

According to the communication support device of the invention described in claim 1, the communication support device relays voice information from each speaker when a call is made in which voice information is sent and received between a plurality of speakers over a network. In the communication support device, the speech recognition means converts the voice information from each speaker into text data. The detection means analyzes the text data from the speech recognition means and detects a predetermined portion that requires support.

When the detection means detects the predetermined portion, the creating means creates a message corresponding to the predetermined portion, or creates processed voice information by processing the part of the speaker's voice information that corresponds to the predetermined portion. The providing means provides the message created by the creating means, as text data or after conversion into voice information, to at least the speaker who provided the voice information in which the predetermined portion was detected, or provides the processed voice information to a speaker other than the speaker who provided the voice information in which the predetermined portion was detected.
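The message path of claim 1 (recognize, detect, create, provide) can be sketched as follows. This is a minimal illustration and not the implementation described in the embodiments: the function name `relay_with_support` and the three callables are hypothetical stand-ins for the speech recognition, detection, and creating means, and only the message path is shown, not the processed-voice path.

```python
from typing import Callable, Optional, Tuple


def relay_with_support(
    speaker: str,
    audio: bytes,
    recognize: Callable[[bytes], str],
    detect: Callable[[str], Optional[str]],
    create_message: Callable[[str], str],
) -> Optional[Tuple[str, str]]:
    """Return (recipient, message) when support is needed, otherwise None.

    All names here are hypothetical; the callables stand in for the
    speech recognition, detection, and creating means of claim 1.
    """
    text = recognize(audio)            # voice information -> text data
    portion = detect(text)             # predetermined portion, or None
    if portion is None:
        return None                    # nothing to support; the call is simply relayed
    message = create_message(portion)  # message corresponding to the portion
    # Provide the message at least to the speaker who produced the utterance.
    return (speaker, message)


# Usage with trivial stand-in callables (assumptions, for illustration only).
result = relay_with_support(
    "A",
    b"raw-audio",
    recognize=lambda audio: "it takes maybe 30 minutes",
    detect=lambda text: "maybe 30 minutes" if "maybe" in text else None,
    create_message=lambda portion: f"Please confirm: {portion}",
)
no_support = relay_with_support(
    "B",
    b"raw-audio",
    recognize=lambda audio: "hello",
    detect=lambda text: None,
    create_message=lambda portion: portion,
)
```

Passing the three stages in as callables keeps the sketch independent of any particular speech recognition engine or dictionary format.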

According to this invention, each speaker (participant) can be appropriately supported in a two-party call, telephone conference, or online conference. This improves convenience for all speakers and improves the quality of the two-party call, telephone conference, or online conference.

FIG. 1 is a diagram for explaining a configuration example of a call support system according to an embodiment.
FIG. 2 is a block diagram for explaining a configuration example of a telephone control device to which a first embodiment of a communication support device according to the present invention is applied.
FIG. 3 is a sequence diagram for explaining processing in a call support system configured using the telephone control device of the first embodiment.
FIG. 4 is a block diagram for explaining a configuration example of a telephone control device to which a second embodiment of a communication support device according to the present invention is applied.
FIG. 5 is a sequence diagram for explaining processing in a call support system configured using the telephone control device of the second embodiment.
FIG. 6 is a block diagram for explaining a configuration example of a telephone control device to which a third embodiment of a communication support device according to the present invention is applied.
FIG. 7 is a sequence diagram for explaining processing in a call support system configured using the telephone control device of the third embodiment.

Embodiments of the device and method according to the present invention are described below with reference to the drawings. The device and method according to the present invention can be applied to, for example, telephone control devices such as a main unit that performs telephone call control, a PBX (Private Branch eXchange), or a SIP (Session Initiation Protocol) server, as well as to a conference server for holding online conferences, a cloud PBX, and the like. Here, a telephone control device such as a main unit, a PBX, or a SIP server is a device installed on the premises of a company or the like to build a so-called business phone system, and relays the call voice of both parties to a call.

A conference server is a server device provided on the Internet to realize online conferences, and relays the voices of all participants in an online conference. A cloud PBX implements PBX functions on the Internet (in the cloud) so that business phone functions can be used over an Internet line, and relays the call voice of both parties to a call. A telephone conference can be realized by providing a telephone control device or a cloud PBX with a function for serving as the access point of the telephone conference.

The device and method according to the present invention are applicable to various devices that relay the voice information of all speakers taking part in a call or conference, such as a telephone control device, a conference server, or a cloud PBX. In other words, the device and method according to the present invention are applicable to any device that relays voice information from each speaker when a call is made in which voice information is sent and received between a plurality of speakers over various networks. In the embodiments described below, for simplicity, the device and method according to the present invention are applied to a telephone control device installed on the premises of an office or the like to build a so-called business phone system, and a call between two parties is used as the example.

In the narrow sense, the term "call" means talking on the telephone. In this specification, however, the term "call" includes conversation in which voice information is sent and received in real time between a plurality of speakers over a given network. That is, in this specification, the term "call" means not only telephone communication between two parties over a telephone line, but also cases in which a plurality of speakers hold a meeting or conference by voice in real time, such as a telephone conference or an online conference.

[Configuration example of the call support system]
FIG. 1 is a diagram for explaining a configuration example of the call support system according to the embodiment. In FIG. 1, the wide area network 6 shown in the center includes an external telephone network 61 and an IP (Internet Protocol) network 62. The external telephone network 61 includes the public switched telephone network, mobile phone networks, and the like, and mainly provides voice call services. The IP network 62 is a computer network interconnected using Internet protocol suite technology, and is equivalent to the so-called "Internet."

A plurality of internal telephone terminals (hereinafter referred to as telephone terminals) 3(1), 3(2), 3(3), ... are connected via an internal telephone network 2 to the telephone control devices 1A, 1B, 1C, which are connected to the external telephone network 61 and the IP network 62, forming a business phone system. The telephone control devices 1A, 1B, 1C control connections between internal and external lines and connections between internal lines. That is, the telephone control devices 1A, 1B, 1C have the plurality of telephone terminals 3(1), 3(2), 3(3), ... connected to them, and perform so-called call control, such as connecting, disconnecting, and transferring communications between internal and external lines and between internal lines.

The telephone control devices 1A, 1B, 1C are also connected to the IP network 62 so that IP telephone services using VoIP technology can be used through the IP network 62. The telephone control devices are distinguished as 1A, 1B, and 1C because, in addition to the call control function, each has a different function. The different functions of the telephone control devices 1A, 1B, 1C are described in detail later.

An internal information server 5(1) is also connected to the telephone control devices 1A, 1B, 1C via a LAN (Local Area Network) 4. The internal information server 5(1) is a server device that manages various information used within the company, such as a customer information DB (Data Base), a transaction information DB, and a product information DB. The internal information server 5(1) is not limited to one unit; a plurality of internal information servers 5(1), 5(2), ... may be provided, each managing different information. Because the telephone control devices 1A, 1B, 1C are connected to the internal information server 5(1) and the like through the LAN 4 in this way, they can obtain necessary information from the internal information server 5(1) and the like.

As shown in FIG. 1, an external telephone terminal 7(1) and an information providing server 8(1) are connected to the wide area network 6. Although not shown in FIG. 1, a mobile communication terminal such as a smartphone can also be connected via a base station of a mobile phone network. Only one external telephone terminal 7(1) is shown in FIG. 1, but in practice many are connected, such as external telephone terminals 7(2), 7(3), .... An external telephone terminal 7(1), 7(2), 7(3), ... may be connected directly to the wide area network 6, like a fixed-line telephone terminal installed in a home, or may be connected via another telephone control device and used as part of a business phone system.

As noted above, the information providing server 8(1) is connected to the wide area network 6 as shown in FIG. 1. Only one information providing server 8(1) is shown in FIG. 1, but in practice many are connected, such as information providing servers 8(2), 8(3), .... The information providing servers 8(1) and the like are server devices that provide various kinds of information, for example train transfer information, past and current weather information (history) and future weather forecasts, or past news information.

The telephone control devices 1A, 1B, 1C of the call support system shown in FIG. 1 do more than control the connection of call lines between internal and external lines and between internal lines. The telephone control devices 1A, 1B, 1C of this embodiment analyze the call voice from both speakers and provide new functions: (1) post-correction (correction of ambiguous remarks), (2) advance supplementation (supplementation of unclear remarks), and (3) advance prevention (overwriting of inappropriate remarks). Of course, a single telephone control device may have all three of the functions shown as (1) to (3).

In the following, however, for simplicity, each of the telephone control devices 1A, 1B, 1C is described as having a different one of the functions (1) to (3) above. That is, the telephone control device 1A of the first embodiment has the (1) post-correction function (correction of ambiguous remarks). The telephone control device 1B of the second embodiment has the (2) advance supplementation function (supplementation of unclear remarks). The telephone control device 1C of the third embodiment has the (3) advance prevention function (overwriting of inappropriate remarks).
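The assignment of functions (1) to (3) to the telephone control devices 1A, 1B, 1C can be written down as a small lookup table. The sketch below is illustrative only; the enum `SupportFunction`, the table `DEVICE_FUNCTIONS`, and the helper `supports` are hypothetical names that do not appear in the specification.

```python
from enum import Enum


class SupportFunction(Enum):
    POST_CORRECTION = 1      # (1) correction of ambiguous remarks
    ADVANCE_SUPPLEMENT = 2   # (2) supplementation of unclear remarks
    ADVANCE_PREVENTION = 3   # (3) overwriting of inappropriate remarks


# Simplified assignment used in the description: one function per device.
# A single device may also hold all three functions.
DEVICE_FUNCTIONS = {
    "1A": {SupportFunction.POST_CORRECTION},
    "1B": {SupportFunction.ADVANCE_SUPPLEMENT},
    "1C": {SupportFunction.ADVANCE_PREVENTION},
}


def supports(device: str, function: SupportFunction) -> bool:
    """Return True if the named device is configured with the function."""
    return function in DEVICE_FUNCTIONS.get(device, set())
```

Using a set per device means a device that combines all three functions only needs a larger set, with no change to `supports`.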

The telephone terminals 3(1), 3(2), 3(3), ... shown in FIG. 1 all have the same basic configuration. In the following description, therefore, unless they need to be distinguished, the telephone terminals 3(1), 3(2), 3(3), ... are collectively referred to as the telephone terminal 3. Similarly, the external telephone terminals 7(1), 7(2), 7(3), ... all have the same basic configuration, so unless they need to be distinguished, they are collectively referred to as the external telephone terminal 7.

As described above, the call support system shown in FIG. 1 allows both calls between a telephone terminal 3 and an external telephone terminal 7 (calls between an internal line and an external line) and calls between telephone terminals 3 (calls between internal lines). In the following, however, for simplicity, the description uses the example in which a call line is connected between a telephone terminal 3 and an external telephone terminal 7 and a call is made (a call between an internal line and an external line). The telephone control devices 1A, 1B, 1C of the first, second, and third embodiments are each described in detail below.

[First embodiment]
<Configuration example of telephone control device 1A of the first embodiment>
The telephone control device 1A of the first embodiment has a function of analyzing the call voice from both speakers who are talking over a connected telephone line and performing post-correction (correction of ambiguous remarks). FIG. 2 is a block diagram for explaining a configuration example of the telephone control device 1A to which the first embodiment of the communication support device according to the present invention is applied.

In FIG. 2, the connection end 101T is the connection end to the external telephone network 61, and the telephone network I/F (Interface) 101 performs communication processing through the external telephone network 61. That is, the telephone network I/F 101 converts signals addressed to the device that arrive via the external telephone network 61 into signals in a format the device can process, and takes them in. The telephone network I/F 101 also converts signals to be sent from the device to the intended destination into signals in a transmission format, and sends them out to the external telephone network 61 for delivery to the destination.

The control unit 102 is a microprocessor that includes, although not shown, a CPU (Central Processing Unit), ROM (Read Only Memory), RAM (Random Access Memory), non-volatile memory, and the like, and controls each part of the telephone control device 1A. The storage device 103 is a device unit consisting of a recording medium and its driver, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), and records, reads, modifies, and deletes various data on the recording medium. In addition to storing and holding necessary data and programs, the storage device 103 is also used as a work area for temporarily storing intermediate data generated in various processes.

The terminal information file 104 is created in a recording device unit such as an HDD or SSD. It stores and holds various information about the telephone terminals 3 connected to (accommodated by) the device, including their extension numbers, as well as information indicating the current state of each telephone terminal 3. Call control for the telephone terminals 3 is performed using the information held in the terminal information file 104.

The ambiguous utterance dictionary 105A is created in a recording device unit such as an HDD or SSD, and holds various dictionary data for detecting ambiguous remarks in the call voice of both speakers talking over a connected call line. Examples of the dictionary data include ambiguous expressions such as "maybe," "I think," "if I remember correctly," and "probably," as well as expressions containing numbers such as "XX minutes," "XX hours," and "XX yen." Expressions containing numbers are included because the numerical part is not necessarily correct.
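Dictionary-based detection of this kind can be sketched as below. This is a minimal illustration under stated assumptions, not the detection logic of the ambiguous utterance detection unit 1212: the phrase list mirrors the English examples above, while the names `HEDGE_PHRASES`, `NUMERIC_PATTERN`, and `find_ambiguous_portions`, the regular expression, and the simple substring matching are all hypothetical.

```python
import re
from typing import List

# Hypothetical dictionary entries, taken from the examples in the text.
HEDGE_PHRASES = ["maybe", "i think", "if i remember correctly", "probably"]

# Hypothetical pattern for expressions containing numbers (minutes, hours, yen).
NUMERIC_PATTERN = re.compile(
    r"\b\d+(?:\.\d+)?\s*(?:minutes?|hours?|yen)\b", re.IGNORECASE
)


def find_ambiguous_portions(text: str) -> List[str]:
    """Return hedging phrases and numeric expressions found in the text."""
    lowered = text.lower()
    # Plain substring matching; a real system would need proper tokenization.
    hits = [phrase for phrase in HEDGE_PHRASES if phrase in lowered]
    hits += [match.group(0) for match in NUMERIC_PATTERN.finditer(text)]
    return hits
```

For example, `find_ambiguous_portions("I think it takes 30 minutes")` returns `["i think", "30 minutes"]`, flagging both the hedge and the number that may need checking.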

The search destination determination dictionary 106 is also created in a recording device unit such as an HDD or SSD. It holds dictionary data for determining, when the accurate content for an ambiguous remark is to be looked up, whether to search the information stored in the internal information servers 5(1), ... or the information stored in the information providing servers 8(1), ... on the IP network 62. Put simply, the dictionary data consists of pairs of a search keyword and information indicating whether the search destination is internal or external, such as "search keyword: internal" and "search keyword: external".

Examples of the dictionary data in the search destination determination dictionary 106 are "Product number 1234: internal", "Sales representative: internal", "From XX station to YY station: external", and "Weather on month XX, day YY: external". Here, "internal" means the internal information servers 5(1), ... connected to the LAN 4, and "external" means the information providing servers 8(1), ... on the IP network 62. Narrowing down the search destination with the search destination determination dictionary 106 makes it possible to look up the accurate content for an ambiguous remark appropriately and quickly.
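The keyword-to-destination pairing can be sketched as a simple lookup. The entries below paraphrase the examples in the text; the names `Destination`, `SEARCH_DESTINATIONS`, and `choose_destination`, and the substring matching rule, are assumptions made for illustration and are not taken from the specification.

```python
from enum import Enum
from typing import Optional


class Destination(Enum):
    INTERNAL = "internal"   # internal information servers 5(1), ... on the LAN 4
    EXTERNAL = "external"   # information providing servers 8(1), ... on the IP network 62


# Hypothetical dictionary data: search keyword -> search destination.
SEARCH_DESTINATIONS = {
    "product number": Destination.INTERNAL,
    "sales representative": Destination.INTERNAL,
    "station": Destination.EXTERNAL,
    "weather": Destination.EXTERNAL,
}


def choose_destination(keyword: str) -> Optional[Destination]:
    """Return the destination for the first dictionary key found in the keyword."""
    lowered = keyword.lower()
    for key, destination in SEARCH_DESTINATIONS.items():
        if key in lowered:
            return destination
    return None  # no entry: the caller must decide how to handle this case
```

Returning `None` for unknown keywords leaves the fallback policy (search both, or skip the search) to the caller, since the text does not specify one.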

The connection end 107T is the connection end to the internal telephone network 2. The internal line I/F (Interface) 107 enables communication between the telephone control device 1A and each of the telephone terminals 3 accommodated by the telephone control device 1A through the internal telephone network 2. Accordingly, signals from a telephone terminal 3 are converted by the internal line I/F 107 into signals in a format the device can process, and are taken in. Signals from the telephone control device 1A to a telephone terminal 3 are converted by the internal line I/F 107 into signals in a transmission format and sent to the telephone terminal 3. Communication through the internal telephone network 2 is thus carried out through the connection end 107T and the internal line I/F 107.

The connection end 108T is the connection end to the LAN 4. The LAN I/F (Interface) 108 enables communication between the telephone control device 1A and the internal information servers 5(1), ... through the LAN 4. Accordingly, signals from the telephone control device 1A to the internal information servers 5(1), ... are converted by the LAN I/F 108 into signals in a transmission format and sent to the internal information servers 5(1), .... Signals from the internal information servers 5(1), ... are converted by the LAN I/F 108 into signals in a format the device can process, and are taken in. Communication through the LAN 4 is thus carried out through the connection end 108T and the LAN I/F 108.

Under the control of the control unit 102, the call control unit 109 uses the management information in the terminal information file 104 to perform call control for the telephone terminals 3, such as outgoing calls, incoming calls, answering, and disconnection. As shown in FIG. 2, the call control unit 109 includes an outgoing call control unit 109S and an incoming call control unit 109R. When the call control unit 109 accepts an outgoing call (outgoing call request) from a subordinate telephone terminal 3, the outgoing call control unit 109S operates to call the specified destination and, when the destination answers, connects the telephone line to enable the call.

また、呼制御部１０９では、自機宛ての着信（相手先からの発信通知）を受け付けた場合には、着信制御部１０９Ｒが機能して、配下の電話端末３に着信通知を行う。これにより、電話端末３では、放音部（リンガ）より呼び出し音が放音され、着信の発生が通知される。電話端末３のいずれかにおいて、着信に応答する操作（オフフック操作）がなされると、着信制御部１０９Ｒは、これを検知して、着信に応答し、オフフックがされた電話端末３との間に通話回線を接続して通話を可能にする。 When the call control unit 109 receives an incoming call addressed to itself (a call notification from the other party), the incoming call control unit 109R functions to notify the subordinate telephone terminal 3 of the incoming call. This causes the telephone terminal 3 to emit a ring tone from the sound emission unit (ringer), notifying the occurrence of the incoming call. When an operation to answer the incoming call (off-hook operation) is performed at any of the telephone terminals 3, the incoming call control unit 109R detects this, answers the incoming call, and connects a call line to the off-hook telephone terminal 3 to enable the call.

この後、接続した電話回線を保留にしたり、転送したり、解放したりする処理は、配下の電話端末３からの要求に応じて、制御部１０２の制御の下に処理される。なお、制御部１０２は、配下の電話端末３が備えるＬＥＤ（Light Emitting Diode）の点灯／消灯制御やディスプレイへの表示のための制御なども行う。 After this, the processes of putting the connected telephone line on hold, transferring, and releasing are processed under the control of the control unit 102 in response to requests from the subordinate telephone terminals 3. The control unit 102 also controls the turning on and off of LEDs (Light Emitting Diodes) equipped in the subordinate telephone terminals 3 and controls display on the display.

接続端１１０Ｔは、ＩＰ網６２への接続端部を構成する。通信Ｉ／Ｆ（Interface）１１０は、ＩＰ網６２を通じての通信処理を行う部分である。これにより、電話制御装置１Ａは、通信Ｉ／Ｆ１１０及び接続端１１０Ｔを通じてＩＰ網６２上の情報提供サーバ８（１）、…等にアクセスし、検索を行うようにして必要な情報の提供を受けることができる。 The connection end 110T constitutes the connection end to the IP network 62. The communication I/F (Interface) 110 is a part that performs communication processing through the IP network 62. As a result, the telephone control device 1A can access the information providing server 8(1), ..., etc. on the IP network 62 through the communication I/F 110 and the connection end 110T, and can receive the necessary information by performing a search.

会話支援処理部１２０Ａが、電話回線を接続して通話を行う双方の話者からの通話音声を解析し、事後訂正（あいまいな発言部分の訂正）を行う機能を実現する部分となる。会話支援処理部１２０Ａは、図２に示すように、音声認識部１２１１と、あいまい発言検出部１２１２と、キーワード抽出部１２１３とからなる音声処理部１２１Ａを備える。更に、会話支援処理部１２０Ａは、検索先判定部１２２と、検索実行部１２３と、正誤判定部１２４と、訂正文作成部１２５と、訂正提供部１２６とを備える。 The conversation support processing unit 120A is the part that realizes the function of analyzing the voices of both speakers who are connected via telephone lines and performing post-correction (correction of ambiguous utterances). As shown in FIG. 2, the conversation support processing unit 120A has a voice processing unit 121A consisting of a voice recognition unit 1211, an ambiguous utterance detection unit 1212, and a keyword extraction unit 1213. Furthermore, the conversation support processing unit 120A has a search destination determination unit 122, a search execution unit 123, a correctness determination unit 124, a correction sentence creation unit 125, and a correction providing unit 126.

音声認識部１２１１は、通話回線を接続して、通話を行う双方の話者からのそれぞれの通話音声をテキストデータに変換する処理を行う。この場合、音声認識部１２１１は、電話端末３からの通話音声と外線電話端末７からの通話音声とのそれぞれについて、区別できるようにしてテキストデータに変換する。あいまい発言検出部１２１２は、それぞれの話者からの通話音声から変換されたそれぞれのテキストデータを、文節や単語に区切るようにして解析し、あいまい発言辞書１０５Ａを参照して、あいまい発言部分を検出する。 The voice recognition unit 1211 converts the call voice of each of the two speakers talking over a connected call line into text data. In doing so, the voice recognition unit 1211 converts the call voice from the telephone terminal 3 and the call voice from the outside telephone terminal 7 into text data in such a way that the two can be distinguished. The ambiguous utterance detection unit 1212 analyzes the text data converted from each speaker's call voice by dividing it into phrases and words, and detects ambiguous utterance portions by referring to the ambiguous utterance dictionary 105A.

キーワード抽出部１２１３は、音声認識部１２１１で変換されたテキストデータから、あいまい発言検出部１２１２で検出されたあいまい発言部分についての正確な内容を検索するための検索用キーワードを抽出する。検索先判定部１２２は、キーワード抽出部１２１３で抽出された検索用キーワードを用いて検索先判定辞書１０６を参照し、内部情報サーバ５（１）、…を検索先とするか、ＩＰ網６２上の情報提供サーバ８（１）、…を検索先とするかを判別する。 The keyword extraction unit 1213 extracts search keywords from the text data converted by the speech recognition unit 1211 to search for the exact content of the ambiguous utterance portion detected by the ambiguous utterance detection unit 1212. The search destination determination unit 122 refers to the search destination determination dictionary 106 using the search keywords extracted by the keyword extraction unit 1213, and determines whether to search the internal information server 5(1), ... or the information providing server 8(1), ... on the IP network 62.

検索実行部１２３は、検索先判定部１２２で判定された検索先に蓄積されている情報を検索対象として、キーワード抽出部１２１３で抽出された検索用キーワードを用いて検索を実行する処理を行う。この場合、検索先が内部情報サーバ５（１）、…である場合には、所定の検索プログラムを実行し、内部情報サーバ５（１）、…に蓄積されている情報の中から検索用キーワードに合致する情報（検索結果）を得る。また、検索先がＩＰ網６２上の情報提供サーバ８（１）、…である場合には、検索実行部１２３は、所定のブラウザ（Ｗｅｂページ閲覧ソフト）を実行し、検索用キーワードを用いて検索を実行して、当該検索キーワードに合致する情報（検索結果）を得る。 The search execution unit 123 performs processing to execute a search using the search keywords extracted by the keyword extraction unit 1213, with the information stored in the search destination determined by the search destination determination unit 122 as the search target. In this case, if the search destination is the internal information server 5(1), ..., a specified search program is executed to obtain information (search results) that matches the search keywords from among the information stored in the internal information server 5(1), .... Also, if the search destination is an information providing server 8(1), ... on the IP network 62, the search execution unit 123 executes a specified browser (web page viewing software), executes a search using the search keywords, and obtains information (search results) that matches the search keywords.

正誤判定部１２４は、あいまい発言検出部１２１２で検出されたあいまい発言部分の内容と、検索実行部１２３で取得された検索結果とを比較して、あいまい発言部分の内容が正しいか誤りかの正誤判定を行う。正誤判定部１２４において、あいまい発言検出部１２１２で検出されたあいまい発言部分の内容が正しいと判定された場合には、当該あいまい発言部分に対する処理は終了する。しかし、正誤判定部１２４において、あいまい発言検出部１２１２で検出されたあいまい発言部分の内容が誤りであると判定された場合には、訂正文作成部１２５が機能する。 The correctness determination unit 124 compares the content of the ambiguous utterance part detected by the ambiguous utterance detection unit 1212 with the search results obtained by the search execution unit 123, and performs a correctness determination of whether the content of the ambiguous utterance part is correct or incorrect. If the correctness determination unit 124 determines that the content of the ambiguous utterance part detected by the ambiguous utterance detection unit 1212 is correct, the processing for that ambiguous utterance part ends. However, if the correctness determination unit 124 determines that the content of the ambiguous utterance part detected by the ambiguous utterance detection unit 1212 is incorrect, the correction sentence creation unit 125 functions.

訂正文作成部１２５は、検索実行部１２３の検索の結果得られた検索結果と、検索用キーワードなどの情報を考慮して、検出された当該あいまい発言部分の内容を訂正するための訂正文を作成する。訂正提供部１２６は、訂正文作成部１２５で作成された訂正文（テキストデータ）を音声情報に変換し、通話回線を接続している話者に対して、当該通話回線を通じて提供する。すなわち、発信元の話者と着信先の話者との双方に、訂正文を音声情報として提供できる。これにより、発信元の話者と着信先の話者との双方に、誤った発言部分について同時に訂正をすることができる。 The correction sentence creation unit 125 creates a correction sentence to correct the content of the detected ambiguous utterance portion, taking into consideration the search results obtained from the search execution unit 123 and information such as search keywords. The correction providing unit 126 converts the correction sentence (text data) created by the correction sentence creation unit 125 into voice information and provides it to the speaker connected to the telephone line via the telephone line. In other words, the correction sentence can be provided as voice information to both the originating speaker and the receiving speaker. This allows the erroneous utterance portion to be corrected simultaneously for both the originating speaker and the receiving speaker.

なお、上述もしたように、音声処理部１２１Ａは、通話回線を接続して、通話を行う双方の話者からのそれぞれの通話音声を処理対象とし、処理対象についてどちらの話者の通話音声なのかを区別可能に処理する。簡単には、電話端末３と外線電話端末７との間に通信回線が接続された場合、音声処理部１２１Ａは、電話端末３からの通話音声なのか、外線電話端末７からの通話音声なのかを区別可能にして処理する。このため、訂正文を作成したあいまい発言部分は、電話端末３からの通話音声の部分なのか、外線電話端末７からの通話音声の部分なのかの区別はできている。 As described above, the voice processing unit 121A processes the call voice of each of the two speakers talking over a connected call line, and does so in a way that makes it possible to distinguish which speaker each call voice belongs to. Simply put, when a communication line is connected between the telephone terminal 3 and the outside telephone terminal 7, the voice processing unit 121A processes the audio so that call voice from the telephone terminal 3 can be distinguished from call voice from the outside telephone terminal 7. It is therefore known whether the ambiguous utterance portion for which a correction sentence was created came from the telephone terminal 3 or from the outside telephone terminal 7.

そこで、訂正提供部１２６は、訂正文作成部１２５で作成された訂正文（テキストデータ）を音声情報に変換し、誤った発言をした話者だけに提供することもできる。この場合には、誤った発言を行った話者が、自身の音声により、自身の発言の誤りを訂正し、他の話者に対して提供できる。この場合には、誤った発言を行った話者自身も納得感を得られ、他方の話者も訂正を容易に受け入れることができるなど、よりソフトな対応とすることができる。 The correction providing unit 126 can therefore also convert the correction sentence (text data) created by the correction sentence creation unit 125 into voice information and provide it only to the speaker who made the erroneous statement. In this case, that speaker can correct the error in their own voice and convey the correction to the other speaker. This allows a softer approach: the speaker who made the erroneous statement can accept the correction more readily, and the other speaker can also accept it easily.
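The two delivery modes described above (to both speakers, or only to the speaker who made the statement) can be sketched as a small selection function. This is an illustrative sketch only: the function name `select_recipients`, the `only_source` flag, and the string identifiers for the terminals are assumptions introduced here and do not appear in the specification.

```python
from typing import List, Sequence


def select_recipients(participants: Sequence[str],
                      source_speaker: str,
                      only_source: bool) -> List[str]:
    """Pick who receives the correction audio (hypothetical helper).

    participants   -- identifiers of the speakers on the call
    source_speaker -- the speaker whose utterance is being corrected
    only_source    -- True for the softer mode that informs only that speaker
    """
    if source_speaker not in participants:
        raise ValueError("source speaker is not on this call")
    if only_source:
        return [source_speaker]
    return list(participants)


call = ["telephone_terminal_3", "outside_telephone_terminal_7"]
print(select_recipients(call, "telephone_terminal_3", only_source=True))
print(select_recipients(call, "telephone_terminal_3", only_source=False))
```

This only works because the voice processing unit keeps the two audio streams distinguishable, so the source of each ambiguous utterance is known.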

＜第１の実施の形態の通話支援システムでの処理＞ <Processing in the call support system according to the first embodiment>
図３は、第１の実施の形態の電話制御装置１Ａが用いられて構成された通話支援システムでの処理を説明するためのシーケンス図である。上述もしたように、電話端末３から外線電話端末７に電話を掛けることにより、あるいは、外線電話端末７から電話端末３に電話を掛けることにより、電話端末３と外線電話端末７との間に通話回線が接続され、通話が開始されているものとする（ステップＳ１）。 FIG. 3 is a sequence diagram for explaining the processing in the call support system configured using the telephone control device 1A of the first embodiment. As described above, it is assumed that a call line has been connected between the telephone terminal 3 and the outside telephone terminal 7, either by a call from the telephone terminal 3 to the outside telephone terminal 7 or by a call from the outside telephone terminal 7 to the telephone terminal 3, and that a call has started (step S1).

図１を用いて説明したように、電話端末３は、電話制御装置１Ａの配下の電話端末であるため、電話端末３と外線電話端末７との間の通話音声は、全て電話制御装置１Ａを介して送受される。すなわち、電話制御装置１Ａは、電話端末３と外線電話端末７との間の全ての通話音声を中継する。このため、電話制御装置１Ａでは、制御部１０２の制御により、通話音声の転送と音声認識が開始される（ステップＳ２）。 As explained using FIG. 1, telephone terminal 3 is a telephone terminal subordinate to telephone control device 1A, so all voice communication between telephone terminal 3 and outside telephone terminal 7 is sent and received via telephone control device 1A. In other words, telephone control device 1A relays all voice communication between telephone terminal 3 and outside telephone terminal 7. For this reason, in telephone control device 1A, transfer of voice communication and voice recognition are started under the control of control unit 102 (step S2).

具体的に、ステップＳ２では、接続端１０１Ｔ及び電話網Ｉ／Ｆ１０１を通じて受信した外線電話端末７からの通話音声を、内線Ｉ／Ｆ１０７及び接続端１０７Ｔを通じて電話端末３に転送（送信）する。また、ステップＳ２では、接続端１０７Ｔ及び内線Ｉ／Ｆ１０７を通じて受信した電話端末３からの通話音声を、電話網Ｉ／Ｆ１０１及び接続端１０１Ｔを通じて外線電話端末７に転送（送信）する。更に、ステップＳ２では、制御部１０２の制御の下、音声処理部１２１Ａの音声認識部１２１１が機能して、電話端末３からの通話音声と、外線電話端末７からの通話音声のそれぞれについて、テキストデータに変換する処理を開始する。 Specifically, in step S2, the call voice from the outside telephone terminal 7 received through the connection end 101T and the telephone network I/F 101 is transferred (sent) to the telephone terminal 3 through the extension I/F 107 and the connection end 107T. Also, in step S2, the call voice from the telephone terminal 3 received through the connection end 107T and the extension I/F 107 is transferred (sent) to the outside telephone terminal 7 through the telephone network I/F 101 and the connection end 101T. Furthermore, in step S2, under the control of the control unit 102, the voice recognition unit 1211 of the voice processing unit 121A functions to start a process of converting the call voice from the telephone terminal 3 and the call voice from the outside telephone terminal 7 into text data.

同時に、制御部１０２は、音声処理部１２１Ａのあいまい発言検出部１２１２と、キーワード抽出部１２１３を制御し、あいまい発言部分の検出と、検索用キーワードの抽出とを行う（ステップＳ３）。ステップＳ３において、あいまい発言検出部１２１２は、音声認識部１２１１からのテキストデータの提供を受けて、文節や単語を検出し、あいまい発言辞書１０５Ａを参照して、あいまい発言部分を検出する。あいまい発言部分は、上述もしたように、例えば、「たぶん」、「～と思う」、「記憶が正しければ」、「おそらく」などのあいまいな表現を含む部分や、「〇〇分」、［××時間］、「△△円」といった数字を含む部分などである。 At the same time, the control unit 102 controls the ambiguous utterance detection unit 1212 and the keyword extraction unit 1213 of the voice processing unit 121A to detect ambiguous utterance portions and extract search keywords (step S3). In step S3, the ambiguous utterance detection unit 1212 receives text data from the voice recognition unit 1211, detects phrases and words, and detects ambiguous utterance portions by referring to the ambiguous utterance dictionary 105A. As described above, ambiguous utterance portions are, for example, portions that include ambiguous expressions such as "maybe," "I think," "if I remember correctly," and "probably," and portions that include numbers such as "○○ minutes," "×× hours," and "△△ yen."
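The detection in step S3 can be pictured as a dictionary lookup plus a pattern for numeric quantities. The following is a minimal sketch under stated assumptions: `HEDGE_PHRASES` is a hypothetical English stand-in for the ambiguous utterance dictionary 105A, the regular expression covers only the units mentioned in the text (minutes, hours, yen), and the `Utterance` type and speaker labels are invented for illustration. The specification segments Japanese text into phrases and words; plain substring matching is swapped in here for brevity.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical stand-in for the ambiguous utterance dictionary 105A.
HEDGE_PHRASES = ("probably", "maybe", "i think", "if i remember correctly")

# A number followed by a unit, e.g. "5 minutes", "2 hours", "300 yen".
QUANTITY = re.compile(r"\b\d+(?:\.\d+)?\s*(?:minutes?|hours?|yen)\b",
                      re.IGNORECASE)


@dataclass
class Utterance:
    speaker: str  # which side of the call the text came from
    text: str     # recognized text for that speaker


def find_ambiguous_parts(utt: Utterance) -> List[str]:
    """Return hedging phrases and numeric quantities found in the text."""
    lowered = utt.text.lower()
    hits = [phrase for phrase in HEDGE_PHRASES if phrase in lowered]
    hits += [m.group(0) for m in QUANTITY.finditer(utt.text)]
    return hits


u = Utterance("telephone_terminal_3",
              "The time required from Machida to Hashimoto is 5 minutes.")
print(find_ambiguous_parts(u))
```

Keeping the speaker label on each utterance mirrors the requirement that the two call voices remain distinguishable after recognition.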

また、ステップＳ３において、キーワード抽出部１２１３は、例えば、「山の日の／次の祝日は／たぶん／秋分の日／だと思う。」といった通話音声に応じたテキストデータが存在したとする。この場合、「たぶん／秋分の日／だと思う。」といった部分があいまい発言部分として抽出される。このため、キーワード抽出部１２１３は、当該あいまい部分の直前の部分も考慮し、「秋分の日」と等価（イコール）となる部分である「山の日の次の祝日は」という文言部分を、検索用キーワードとして抽出する。 In step S3, suppose, for example, that the text data corresponding to the call voice is "The next national holiday / after Mountain Day is / probably / Autumnal Equinox Day / I think." In this case, the portion "probably / Autumnal Equinox Day / I think" is extracted as the ambiguous utterance portion. The keyword extraction unit 1213 therefore also considers the portion immediately before the ambiguous portion, and extracts the phrase "the next national holiday after Mountain Day is," which is the portion equated with "Autumnal Equinox Day," as the search keyword.

また、ステップＳ３において、キーワード抽出部１２１３は、例えば、「町田から橋本までの所要時間は、５分です。」といった通話音声に応じたテキストデータが存在したとする。この場合、「５分です。」といった部分があいまい発言部分として抽出される。このため、キーワード抽出部１２１３は、当該あいまい部分の直前の部分も考慮し、「５分」と等価（イコール）となる部分である「町田から橋本までの所要時間は」という文言部分を、検索用キーワードとして抽出する。 Likewise in step S3, suppose that the text data corresponding to the call voice is "The time required from Machida to Hashimoto is 5 minutes." In this case, the portion "is 5 minutes." is extracted as the ambiguous utterance portion. The keyword extraction unit 1213 therefore also considers the portion immediately before the ambiguous portion, and extracts the phrase "the time required from Machida to Hashimoto is," which is the portion equated with "5 minutes," as the search keyword.
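The two examples above share one rule: the search keyword is the phrase immediately preceding the ambiguous portion. A minimal sketch of that rule follows, assuming English word order and a simple prefix split; the function name `extract_search_query` is hypothetical, and a real implementation for Japanese would work on segmented phrases rather than raw string positions.

```python
from typing import Optional


def extract_search_query(text: str, ambiguous_part: str) -> Optional[str]:
    """Return the text that precedes the ambiguous part, or None.

    The preceding phrase is what the ambiguous part is equated with,
    so it serves as the search keyword.
    """
    idx = text.find(ambiguous_part)
    if idx <= 0:
        # Not found, or nothing precedes the ambiguous part.
        return None
    query = text[:idx].strip(" ,.")
    return query or None


print(extract_search_query(
    "The time required from Machida to Hashimoto is 5 minutes.",
    "5 minutes"))
```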

次に、制御部１０２の制御の下、検索先判定部１２２が機能し、キーワード抽出部１２１３で抽出された検索用キーワードに基づいて、検索先判定辞書１０６を参照し、検索先を判定する処理を行う（ステップＳ４）。具体的に、検索先判定部１２２は、ＬＡＮ４上の内部情報サーバ５（１）、…を検索先とするか、ＩＰ網６２上の情報提供サーバ８（１）、…を検索先とするかを判別する。 Next, under the control of the control unit 102, the search destination determination unit 122 functions to refer to the search destination determination dictionary 106 and perform a process of determining the search destination based on the search keywords extracted by the keyword extraction unit 1213 (step S4). Specifically, the search destination determination unit 122 determines whether the search destination should be the internal information server 5(1), ... on the LAN 4, or the information providing server 8(1), ... on the IP network 62.

例えば、上述した「山の日の次の祝日は」という検索用キーワードの場合、「山の日」や「祝日」といった単語は、一般的な単語であり、内部（社内）で特に用いられる文言ではないため、ＩＰ網６２上の情報提供サーバ８（１）、…が検索先であると判定する。これに対して、例えば、検索キーワードに含まれる単語が、「製品番号１２３４」や「株式会社○○○様の営業担当者」のように、内部（社内）で特に用いられる文言である場合には、ＬＡＮ４上の内部情報サーバ５（１）、…が検索先であると判定する。 For example, in the case of the above-mentioned search keyword "What is the next national holiday after Mountain Day?", words such as "Mountain Day" and "national holiday" are general words and are not phrases that are particularly used internally (within a company), so it is determined that the information providing server 8 (1), ... on the IP network 62 is the search destination. In contrast, if the words included in the search keyword are phrases that are particularly used internally (within a company), such as "product number 1234" or "sales representative of XXX Co., Ltd.", it is determined that the internal information server 5 (1), ... on the LAN 4 is the search destination.

また、検索用キーワードが「町田から橋本までの所要時間」の場合には、ＩＰ網６２上の所定の鉄道会社のＷｅｂページを検索先とするなど、検索用キーワードに基づいて、検索先（問い合わせ先）となるＷｅｂページ自体を特定することも可能である。同様に、検索用キーワードが「製品番号１２３４」や「株式会社○○○様の営業担当者」などの得意先を示す情報の場合には、ＬＡＮ４上の製品情報ＤＢ（Data Base）や顧客情報ＤＢなどのように、検索先となるデータベース自体を特定することも可能である。 In addition, if the search keyword is "travel time from Machida to Hashimoto," it is possible to specify the web page itself to be searched (inquiry destination) based on the search keyword, such as searching a specific railway company's web page on IP network 62. Similarly, if the search keyword is information indicating a customer, such as "product number 1234" or "sales representative of XXX Co., Ltd.", it is also possible to specify the database itself to be searched, such as a product information DB (Data Base) or customer information DB on LAN 4.
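The destination decision in step S4 amounts to checking the search keyword against terms that are specific to the organization. The sketch below assumes a hypothetical `INTERNAL_TERMS` tuple standing in for the search destination determination dictionary 106; the enum names and the matching rule (case-insensitive substring) are illustrative choices, not details from the specification.

```python
from enum import Enum


class Destination(Enum):
    INTERNAL = "internal information server 5 on LAN 4"
    EXTERNAL = "information providing server 8 on IP network 62"


# Hypothetical entries standing in for the search destination
# determination dictionary 106: terms used mainly inside the company.
INTERNAL_TERMS = ("product number", "sales representative")


def choose_destination(query: str) -> Destination:
    """Route company-specific queries to the LAN, everything else outward."""
    lowered = query.lower()
    if any(term in lowered for term in INTERNAL_TERMS):
        return Destination.INTERNAL
    return Destination.EXTERNAL


print(choose_destination("product number 1234"))
print(choose_destination("the next national holiday after Mountain Day is"))
```

The same lookup could return a more specific target (a particular database or web page), as the text notes, by mapping each term to a target instead of to a boolean.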

次に、制御部１０２の制御の下、検索実行部１２３が機能して、検索先判定部１２２で判定された検索先に対して、キーワード抽出部１２１３で抽出された検索用キーワードを用いて検索を実行する（ステップＳ５）。検索実行部１２３は、検索先がＩＰ網６２上の情報提供サーバ８（１）、…である場合には、所定のブラウザ（Ｗｅｂページ閲覧ソフト）を用いて、検索用キーワードを入力するようにして、ＩＰ網６２上の情報提供サーバ８（１）、…を検索先として検索を行う。これにより、具体例を示すと、例えば、「山の日の次の祝日は」という検索用キーワードを用いた場合には、「敬老の日」という検索結果が得られる。また、「町田から橋本までの所要時間は」という検索用キーワードが用いられた場合には、「快速１１分、各停１４分」という検索結果が得られる。 Next, under the control of the control unit 102, the search execution unit 123 functions to execute a search using the search keyword extracted by the keyword extraction unit 1213 for the search destination determined by the search destination determination unit 122 (step S5). When the search destination is the information providing server 8 (1), ... on the IP network 62, the search execution unit 123 uses a specified browser (web page viewing software) to input the search keyword, and executes a search using the information providing server 8 (1), ... on the IP network 62 as the search destination. As a result, for example, when the search keyword "What is the next national holiday after Mountain Day?" is used, the search result "Respect for the Aged Day" is obtained. Also, when the search keyword "How long does it take to get from Machida to Hashimoto?" is used, the search result "11 minutes for rapid train, 14 minutes for local train" is obtained.

また、検索実行部１２３は、検索先がＬＡＮ４上の内部情報サーバ５（１）、…である場合には、所定の検索プログラムを実行し、検索用キーワードを入力するようにして、ＬＡＮ４上の内部情報サーバ５（１）、…を検索先として検索を行う。これにより、具体例を示すと、例えば、「製品番号１２３４」という検索用キーワードを用いた場合には、製品情報ＤＢが参照されて、製品番号１２３４の仕様などの詳細情報が得られる。また、「株式会社〇〇様の営業担当者」という検索用キーワードが用いられた場合には、顧客情報先ＤＢが参照され、「営業１課鈴木太郎」のように、「株式会社〇〇様」の営業担当者の氏名が得られる。 Furthermore, when the search destination is the internal information server 5(1), ... on the LAN 4, the search execution unit 123 executes a specified search program and inputs a search keyword, and performs a search on the internal information server 5(1), ... on the LAN 4 as the search destination. As a result, to give a specific example, when the search keyword "product number 1234" is used, the product information DB is referenced and detailed information such as the specifications of product number 1234 is obtained. When the search keyword "sales representative of XX Co., Ltd." is used, the customer information DB is referenced and the name of the sales representative of XX Co., Ltd. is obtained, such as "Sales Division 1, Suzuki Taro."

次に、制御部１０２の制御の下、正誤判定部１２４が機能し、あいまい発言検出部１２１２で検出されたあいまい発言部分と検索実行部１２３で取得された検索結果とを比較して、あいまい発言部分の内容が正しいか誤りかの正誤判定を行う（ステップＳ６）。具体的には、あいまい発言部分が「たぶん／秋分の日／だと思う。」である場合の検索結果は、上述したように「敬老の日」であるので、あいまい発言部分の誤りであると判定される。また、あいまい発言部分が「５分」である場合の検索結果は、上述したように「快速１１分、各停１４分」であるので、誤りであると判定できる。 Next, under the control of the control unit 102, the correctness determination unit 124 functions to compare the ambiguous utterance portion detected by the ambiguous utterance detection unit 1212 with the search result obtained by the search execution unit 123, and determines whether the content of the ambiguous utterance portion is correct or incorrect (step S6). Specifically, when the ambiguous utterance portion is "probably / Autumnal Equinox Day / I think," the search result is "Respect for the Aged Day" as described above, so the ambiguous utterance portion is determined to be incorrect. Likewise, when the ambiguous utterance portion is "5 minutes," the search result is "11 minutes by rapid train, 14 minutes by local train" as described above, so the portion can be determined to be incorrect.
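One simple way to picture the comparison in step S6 is to check whether the claimed value appears as a whole token in the search result. This is an assumption made for illustration; the specification does not state how the comparison is implemented. The function name `is_claim_correct` is hypothetical.

```python
import re


def is_claim_correct(claimed: str, search_result: str) -> bool:
    """Return True if the claimed value occurs in the search result.

    Lookarounds require the match to start and end on a token boundary,
    so "4 minutes" does not match inside "14 minutes".
    """
    claim = claimed.strip().lower()
    if not claim:
        return False
    pattern = r"(?<!\w)" + re.escape(claim) + r"(?!\w)"
    return re.search(pattern, search_result.lower()) is not None


result = "11 minutes by rapid train, 14 minutes by local train"
print(is_claim_correct("5 minutes", result))   # claim from the example
print(is_claim_correct("11 minutes", result))  # a claim that matches
```

A claim judged correct needs no further handling, while a claim judged incorrect is passed on to correction sentence creation.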

ステップＳ６で、誤りではないと判定された場合には、事後訂正の必要はないので、当該あいまい発言部分についての処理は終了し、次のあいまい発言部分の処理に移ることになる。ステップＳ６判定処理において、誤りであると判定されたとする。この場合、制御部１０２の制御の下、訂正文作成部１２５が機能して、訂正文を作成する処理を行う（ステップＳ７）。ステップＳ７において、訂正文作成部１２５は、検索実行部１２３の検索結果と、キーワード抽出部１２１３で抽出された検索用キーワードなどの情報を考慮して、検出された当該あいまい発言部分の内容を訂正するための訂正文を作成する。 If it is determined in step S6 that the ambiguous utterance portion is not in error, no post-correction is necessary, so processing of that portion ends and processing moves on to the next ambiguous utterance portion. Now assume that the determination in step S6 finds an error. In this case, under the control of the control unit 102, the correction sentence creation unit 125 functions to create a correction sentence (step S7). In step S7, the correction sentence creation unit 125 creates a correction sentence for correcting the content of the detected ambiguous utterance portion, taking into account the search result from the search execution unit 123 and information such as the search keywords extracted by the keyword extraction unit 1213.

例えば、上述した前者の例の場合には、「山の日の次の祝日は、敬老の日です。」という訂正文を作成することになる。また、上述した後者の例の場合には、「町田から橋本までの所要時間は、快速１１分、各停１４分です。」といった訂正文を作成することになる。この後、制御部１０２の制御の下、訂正提供部１２６が機能し、訂正文作成部１２５で作成された訂正文（テキストデータ）を音声データに変換し、通話回線を接続している話者に対して、提供する処理を行う（ステップＳ８）。 For example, in the former example described above, the correction sentence "The next national holiday after Mountain Day is Respect for the Aged Day." is created. In the latter example, a correction sentence such as "The time required from Machida to Hashimoto is 11 minutes by rapid train and 14 minutes by local train." is created. After this, under the control of the control unit 102, the correction providing unit 126 functions to convert the correction sentence (text data) created by the correction sentence creation unit 125 into voice data and provide it to the speakers on the connected call line (step S8).
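Both correction sentences above are formed by joining the search keyword with the search result. A minimal sketch of that composition follows, assuming English phrasing; the function name `build_correction` is hypothetical and the punctuation handling is an illustrative choice.

```python
def build_correction(query: str, search_result: str) -> str:
    """Join the search keyword phrase and the search result into a sentence."""
    subject = query.strip().rstrip(" ,.")
    answer = search_result.strip().rstrip(".")
    return f"{subject} {answer}."


print(build_correction("The next national holiday after Mountain Day is",
                       "Respect for the Aged Day"))
```

The resulting text would then go to a text-to-speech stage before delivery, which is outside the scope of this sketch.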

この場合、訂正文に応じた音声データ（訂正音声データ）は、例えば、双方の話者の通話音声が途切れたことを例えば制御部１０２において検出した場合に、訂正提供部１２６が双方の話者に対して提供する（ステップＳ９、ステップＳ１０）。訂正文に応じた音声データの提供が通話を阻害することなく、訂正文に応じた音声データを適切に提供するためである。これにより、ステップＳ３で検出されたあいまいな発言部分について、当該発言部分が誤りあった場合には、電話制御装置１Ａの機能によって自動的に話者双方に対して訂正を行うことができる。 In this case, the voice data corresponding to the correction sentence (correction voice data) is provided to both speakers by the correction providing unit 126 when, for example, the control unit 102 detects a pause in the call voice of both speakers (steps S9 and S10). This is so that the voice data corresponding to the correction sentence can be provided appropriately without disturbing the call. As a result, if an ambiguous utterance portion detected in step S3 turns out to be in error, the telephone control device 1A can automatically provide the correction to both speakers.

この後、制御部１０２は、訂正提供部１２６からの訂正文の音声データの送出が終了すると、当該訂正文の音声データを消去して、通常の通話に戻るようにする（ステップＳ１１）。これにより、通話音声の送受信が中断されることなく継続され（ステップＳ１２）、上述したステップＳ２からの処理が繰り返すようにされる。 After this, when the correction providing unit 126 finishes sending the voice data of the correction sentence, the control unit 102 erases the voice data of the correction sentence and returns to the normal call (step S11). This allows the transmission and reception of the voice of the call to continue without interruption (step S12), and the process from step S2 described above is repeated.
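The timing described in steps S9 to S11 (hold the correction until both speakers pause, send it, then erase it) can be sketched as a small queue. The class name `CorrectionQueue`, the per-frame boolean speaking flags, and the use of strings in place of audio buffers are all assumptions made for illustration; how the control unit 102 actually detects the pause is not specified.

```python
from collections import deque
from typing import Deque, List


class CorrectionQueue:
    """Holds correction audio until neither speaker is talking."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()

    def add(self, correction_audio: str) -> None:
        """Queue a synthesized correction (a string stands in for audio)."""
        self._pending.append(correction_audio)

    def on_frame(self, terminal_speaking: bool,
                 outside_speaking: bool) -> List[str]:
        """Return corrections to play now; empty while anyone is speaking."""
        if terminal_speaking or outside_speaking or not self._pending:
            return []
        ready = list(self._pending)
        self._pending.clear()  # corresponds to erasing the data after sending
        return ready


queue = CorrectionQueue()
queue.add("The next national holiday after Mountain Day is Respect for the Aged Day.")
print(queue.on_frame(terminal_speaking=True, outside_speaking=False))
print(queue.on_frame(terminal_speaking=False, outside_speaking=False))
```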

このようにして、第１の実施の形態の電話制御装置１Ａでは、電話回線を接続して通話を行う双方の話者からの通話音声を解析し、事後訂正（あいまいな発言部分の訂正）を行うことができる。これにより、間違いを間違いのままとすることが無く、正しい情報に自動的に訂正することができるので、正確でない発言をしてしまい、相手に不都合を生じさせるといったことを防止することができる。従って、不正確な発言により、不利益が生じてしまうようなことを防止できる。 In this way, the telephone control device 1A of the first embodiment can analyze the voices of both speakers who are connected over a telephone line and make post-correction (correction of ambiguous statements). This allows mistakes to be automatically corrected to the correct information without being left as errors, preventing inaccurate statements from being made and causing inconvenience to the other party. This prevents disadvantages from occurring due to inaccurate statements.

［第２の実施の形態］ [Second embodiment]
＜第２の実施の形態の電話制御装置１Ｂの構成例＞ <Configuration example of telephone control device 1B according to the second embodiment>
第２の実施の形態の電話制御装置１Ｂは、電話回線を接続して通話を行う双方の話者からの通話音声を解析して、事前補足（不明発言部分の補足）を行う機能を備えるものである。図４は、この発明による通話支援装置の第２の実施の形態が適用された電話制御装置１Ｂの構成例を説明するためのブロック図である。図４に示す電話制御装置１Ｂにおいて、図２を用いて説明した第１の実施の形態の電話制御装置１Ａと同様に構成される部分には、同じ参照符号を付し、当該部分の詳細な説明については重複するので省略する。 The telephone control device 1B of the second embodiment has a function of analyzing the call voice of both speakers talking over a connected telephone line and performing advance supplementation (supplementation of unclear utterance portions). FIG. 4 is a block diagram for explaining a configuration example of the telephone control device 1B to which the second embodiment of the call support device of the present invention is applied. In the telephone control device 1B shown in FIG. 4, parts configured in the same way as in the telephone control device 1A of the first embodiment described with reference to FIG. 2 are given the same reference numerals, and detailed descriptions of those parts are omitted to avoid duplication.

図４に示すように、第２の実施の形態の電話制御装置１Ｂは、不明状況辞書１０５Ｂを備える。不明状況辞書１０５Ｂは、ＨＤＤやＳＳＤなどの記録装置部に作成され、通話回線を接続して通話を行う双方の話者の通話音声における、不明な発言部分を検出するための種々の辞書データを保持する。辞書データの一例を挙げれば、例えば、「えーと、えーと、…」、「えー、……」、「～は、何でしたっけ。」、「～は、何分でしたっけ。」といった、言葉が出てこない場合の表現や、相手に質問する表現などである。 As shown in FIG. 4, the telephone control device 1B of the second embodiment includes an unknown situation dictionary 105B. The unknown situation dictionary 105B is created in a recording device such as an HDD or SSD, and holds various dictionary data for detecting unclear utterances in the voices of both speakers who are talking over a connected telephone line. Examples of dictionary data include expressions for when you can't find the words, such as "Um, um, ...", "Um, ...", "What was it again?" and "How many minutes was it again?", as well as expressions for asking the other party questions.

会話支援処理部１２０Ｂが、電話回線を接続して通話を行う双方の話者からの通話音声を解析し、事前補足（不明発言部分の補足）を行う機能を実現する部分となる。会話支援処理部１２０Ｂは、図２に示すように、音声認識部１２１１と、不明発言検出部１２１４と、キーワード抽出部１２１５とからなる音声処理部１２１Ｂを備える。更に、会話支援処理部１２０Ｂは、検索先判定部１２２と、検索実行部１２３と、検索成否判定部１２７と、補足文作成部１２８と、補足提供部１２９とを備える。 The conversation support processing unit 120B is the unit that realizes the function of analyzing the voices of both speakers who are connected via telephone lines and performing advance supplementation (supplementation of unclear utterances). As shown in FIG. 2, the conversation support processing unit 120B includes a voice processing unit 121B that is made up of a voice recognition unit 1211, an unclear utterance detection unit 1214, and a keyword extraction unit 1215. Furthermore, the conversation support processing unit 120B includes a search destination determination unit 122, a search execution unit 123, a search success/failure determination unit 127, a supplementary sentence creation unit 128, and a supplement provision unit 129.

音声認識部１２１１は、上述もしたように、通話回線を接続して、通話を行う双方の話者からのそれぞれの通話音声をテキストデータに変換する処理を行う。この場合、音声認識部１２１１は、電話端末３からの通話音声と外線電話端末７からの通話音声とのそれぞれについて、区別できるようにしてテキストデータに変換する。不明発言検出部１２１４は、それぞれの話者からの通話音声から変換されたそれぞれのテキストデータを、文節や単語に区切るようにして解析し、不明状況辞書１０５Ｂを参照して、不明発言部分を検出する。また、不明発言検出部１２１４は、例えば、「製品番号は、・・・・」のように、通話音声（発言）が、途中で途切れ、その後に所定時間以上（例えば２秒以上）の無音が生じた場合には、当該部分を不明発言部分として検出する。 As described above, the voice recognition unit 1211 converts the call voice of each of the two speakers talking over a connected call line into text data. In doing so, the voice recognition unit 1211 converts the call voice from the telephone terminal 3 and the call voice from the outside telephone terminal 7 into text data in such a way that the two can be distinguished. The unclear utterance detection unit 1214 analyzes the text data converted from each speaker's call voice by dividing it into phrases and words, and detects unclear utterance portions by referring to the unknown situation dictionary 105B. In addition, when the call voice (utterance) breaks off midway, as in "The product number is ...", and is followed by silence lasting a predetermined time or longer (for example, 2 seconds or longer), the unclear utterance detection unit 1214 detects that portion as an unclear utterance portion.
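The two triggers described above (a dictionary hit, or an utterance that breaks off and is followed by silence) can be combined in one predicate. This is a sketch under stated assumptions: `FILLERS` is a hypothetical English stand-in for the unknown situation dictionary 105B, the 2-second threshold is the example value from the text, and the `sentence_complete` flag is assumed to be supplied by some upstream component, since the specification does not say how a mid-sentence break is recognized.

```python
import re

# Hypothetical stand-in for the unknown situation dictionary 105B.
FILLERS = ("um", "uh", "what was it again", "how many minutes was it again")
FILLER_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b",
    re.IGNORECASE,
)

SILENCE_THRESHOLD_SEC = 2.0  # example value given in the text


def is_unclear(text: str, sentence_complete: bool,
               trailing_silence_sec: float) -> bool:
    """Detect an unclear utterance by dictionary hit or trailing silence."""
    if FILLER_PATTERN.search(text):
        return True
    broke_off = not sentence_complete
    return broke_off and trailing_silence_sec >= SILENCE_THRESHOLD_SEC


print(is_unclear("The product number is", False, 2.5))
print(is_unclear("The product number is 1234.", True, 3.0))
```

Word-boundary matching matters here: a plain substring test for "um" would wrongly fire on "number".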

キーワード抽出部１２１５は、音声認識部１２１１で変換されたテキストデータから、不明発言検出部１２１４で検出された不明発言部分についての内容を検索するための検索用キーワードを抽出する。検索先判定部１２２は、キーワード抽出部１２１５で抽出された検索用キーワードを用いて検索先判定辞書１０６を参照し、内部情報サーバ５（１）、…を検索先とするか、ＩＰ網６２上の情報提供サーバ８（１）、…を検索先とするかを判別する。 The keyword extraction unit 1215 extracts search keywords for searching the content of the unclear utterance portion detected by the unclear utterance detection unit 1214 from the text data converted by the voice recognition unit 1211. The search destination determination unit 122 refers to the search destination determination dictionary 106 using the search keywords extracted by the keyword extraction unit 1215, and determines whether to search the internal information server 5(1), ... or the information providing server 8(1), ... on the IP network 62.

検索実行部１２３は、検索先判定部１２２で判定された検索先に蓄積されている情報を検索対象として、キーワード抽出部１２１５で抽出された検索用キーワードを用いて検索を実行する処理を行う。この場合、検索先が内部情報サーバ５（１）、…である場合には、所定の検索プログラムを実行し、内部情報サーバ５（１）、…に蓄積されている情報の中から検索用キーワードに合致する情報（検索結果）を得る。また、検索先がＩＰ網６２上の情報提供サーバ８（１）、…である場合には、検索実行部１２３は、所定のブラウザ（Ｗｅｂページ閲覧ソフト）を実行し、検索用キーワードを用いて検索を実行し、当該検索キーワードに合致する情報（検索結果）を得る。 The search execution unit 123 performs processing to execute a search using the search keywords extracted by the keyword extraction unit 1215, with the information stored in the search destination determined by the search destination determination unit 122 as the search target. In this case, if the search destination is the internal information server 5(1), ..., a specified search program is executed, and information matching the search keywords (search results) is obtained from the information stored in the internal information server 5(1), .... Also, if the search destination is an information providing server 8(1), ... on the IP network 62, the search execution unit 123 executes a specified browser (web page viewing software), executes a search using the search keywords, and obtains information matching the search keywords (search results).

検索成否判定部１２７は、検索実行部１２３での検索結果に基づいて、不明発言部分に対応する解答が検索できたか否かの判定を行う。検索成否判定部１２７での判定結果に応じて、補足文作成部１２８が機能する。すなわち、補足文作成部１２８は、検索成否判定部１２７での判定結果が、不明発言部分に対応する解答が検索できなかったことを示すものである場合には、不明部分の解答が得られなかった旨を通知する補足文を作成する。また、補足文作成部１２８は、検索成否判定部１２７での判定結果が、不明発言部分に対応する解答が検索できたことを示すものである場合には、検索実行部１２３での検索結果を含む補足文を作成する。 The search success/failure determination unit 127 determines whether or not an answer corresponding to the unclear statement portion has been found based on the search result by the search execution unit 123. The supplementary sentence creation unit 128 functions according to the determination result by the search success/failure determination unit 127. That is, if the determination result by the search success/failure determination unit 127 indicates that an answer corresponding to the unclear statement portion has not been found, the supplementary sentence creation unit 128 creates a supplementary sentence notifying the user that an answer to the unclear portion has not been obtained. Also, if the determination result by the search success/failure determination unit 127 indicates that an answer corresponding to the unclear statement portion has been found, the supplementary sentence creation unit 128 creates a supplementary sentence including the search result by the search execution unit 123.
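The branch described above, where the supplementary sentence either carries the search result or reports that no answer was found, can be sketched as follows. The function name `build_supplement` and the wording of both messages are assumptions for illustration; an empty or missing result is treated here as a failed search.

```python
from typing import Optional


def build_supplement(query: str, search_result: Optional[str]) -> str:
    """Create the supplementary sentence for an unclear utterance."""
    subject = query.strip().rstrip(" ,.")
    if not search_result or not search_result.strip():
        # Search failed: notify that no answer was obtained.
        return f"No answer could be found for: {subject}."
    answer = search_result.strip().rstrip(".")
    return f"{subject} {answer}."


print(build_supplement("The product number is", "1234"))
print(build_supplement("The product number is", None))
```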

補足提供部１２９は、補足文作成部１２８で作成された補足文（テキストデータ）を音声情報に変換し、通話回線を接続している話者に対して、当該通話回線を通じて提供する。すなわち、発信元の話者と着信先の話者との双方に、補足文を音声情報として提供できる。これにより、発信元の話者と着信先の話者との双方に、不明発言部分について同時に補足をすることができる。 The supplementary provision unit 129 converts the supplementary sentence (text data) created by the supplementary sentence creation unit 128 into audio information and provides it, via the call line, to the speakers connected on that line. In other words, the supplementary sentence can be provided as audio information to both the calling speaker and the called speaker. This makes it possible to supplement the unclear utterance portion for both the calling speaker and the called speaker at the same time.

なお、上述もしたように、音声処理部１２１Ｂは、通話回線を接続して、通話を行う双方の話者からのそれぞれの通話音声を処理対象とし、処理対象についてどちらの話者の通話音声なのかを区別可能に処理する。簡単には、例えば、電話端末３と外線電話端末７との間に通信回線が接続された場合、音声処理部１２１Ｂは、電話端末３からの通話音声なのか、外線電話端末７からの通話音声なのかを区別可能にして処理する。このため、補足文を作成した不明発言部分は、電話端末３からの通話音声の部分なのか、外線電話端末７からの通話音声の部分なのかの区別はできている。 As described above, the voice processing unit 121B connects a telephone line and processes the voices of both speakers in a call, and processes the voices in a manner that makes it possible to distinguish which speaker's voice is being processed. Simply put, for example, when a communication line is connected between telephone terminal 3 and outside telephone terminal 7, the voice processing unit 121B processes the voices in a manner that makes it possible to distinguish whether they are the voices from telephone terminal 3 or the voices from outside telephone terminal 7. For this reason, it is possible to distinguish whether the unclear utterance portion for which the supplementary sentence was created is the part of the voice from telephone terminal 3 or the part of the voice from outside telephone terminal 7.

そこで、補足提供部１２９は、補足文作成部１２８で作成された補足文（テキストデータ）を音声情報に変換し、不明発言をした話者だけに提供することもできる。この場合には、不明発言を行った話者が、自身の音声により、自身の発言における不明部分を補足し、他の話者に対して提供できる。この場合には、不明発言を行った話者自身も納得感を得られ、他方の話者も補足を容易に受け入れることができるなど、よりソフトな対応とすることができる。 The supplementary provision unit 129 can therefore also convert the supplementary sentence (text data) created by the supplementary sentence creation unit 128 into audio information and provide it only to the speaker who made the unclear utterance. That speaker can then fill in the unclear part of their own utterance in their own voice and convey it to the other speaker. This allows a softer response: the speaker who made the unclear utterance is left satisfied, and the other speaker can readily accept the supplement.

＜第２の実施の形態の通話支援システムでの処理＞ <Processing in the communication support system according to the second embodiment>
図５は、第２の実施の形態の電話制御装置１Ｂが用いられて構成された通話支援システムでの処理を説明するためのシーケンス図である。図５においても、電話端末３から外線電話端末７に電話を掛けることにより、あるいは、外線電話端末７から電話端末３に電話を掛けることにより、電話端末３と外線電話端末７との間に通話回線が接続され、通話が開始されているものとする（ステップＳ２１）。 Fig. 5 is a sequence diagram for explaining the process of the call support system configured using the telephone control device 1B of the second embodiment. In Fig. 5, it is assumed that a call line is connected between the telephone terminal 3 and the outside telephone terminal 7 by making a call from the telephone terminal 3 to the outside telephone terminal 7, or by making a call from the outside telephone terminal 7 to the telephone terminal 3, and a call is started (step S21).

図１を用いて説明したように、電話端末３は、電話制御装置１Ｂの配下の電話端末であるため、電話端末３と外線電話端末７との間の通話音声は、全て電話制御装置１Ｂを介して送受される。すなわち、電話制御装置１Ｂは、電話端末３と外線電話端末７との間の全ての通話音声を中継する。このため、電話制御装置１Ｂでは、制御部１０２の制御により、通話音声の転送と音声認識が開始される（ステップＳ２２）。このように、図５のステップＳ２１の処理は、図３に示したステップＳ１の処理と同様の処理であり、図５のステップＳ２２の処理は、図３に示したステップＳ２の処理と同様の処理である。 As explained using FIG. 1, telephone terminal 3 is a telephone terminal under telephone control device 1B, so all call voice between telephone terminal 3 and outside telephone terminal 7 is sent and received via telephone control device 1B. That is, telephone control device 1B relays all call voice between telephone terminal 3 and outside telephone terminal 7. For this reason, in telephone control device 1B, transfer of call voice and voice recognition are started under the control of control unit 102 (step S22). In this way, the process of step S21 in FIG. 5 is the same process as the process of step S1 shown in FIG. 3, and the process of step S22 in FIG. 5 is the same process as the process of step S2 shown in FIG. 3.

同時に、制御部１０２は、音声処理部１２１Ｂの不明発言検出部１２１４と、キーワード抽出部１２１５を制御し、不明発言部分の検出と、検索用キーワードの抽出とを行う（ステップＳ２３）。ステップＳ２３において、不明発言検出部１２１４は、音声認識部１２１１からのテキストデータの提供を受けて、文節や単語を検出し、不明状況辞書１０５Ｂを参照して、不明発言部分を検出する。不明発言部分は、上述もしたように、例えば、「えーと、えーと、…」、「～は、何でしたっけ。」、「～は、何分でしたっけ。」といった、言葉が出てこない場合の表現を含む部分や、相手に質問する表現を含む部分である。更には、通話音声（発言）が、途中で途切れ、その後に所定時間以上（例えば２秒以上）の無音が生じた部分なども、不明発言部分として検出する。 At the same time, the control unit 102 controls the unclear utterance detection unit 1214 and the keyword extraction unit 1215 of the voice processing unit 121B to detect unclear utterance parts and extract search keywords (step S23). In step S23, the unclear utterance detection unit 1214 receives text data from the voice recognition unit 1211, detects phrases and words, and detects unclear utterance parts by referring to the unknown situation dictionary 105B. As described above, the unclear utterance parts are parts that include expressions when the words cannot be found, such as "Um, um, ...", "What was it?", and "How many minutes was it?", and parts that include expressions to ask the other party. Furthermore, parts where the voice (utterance) of a call is interrupted midway and then silence occurs for a predetermined time or more (for example, 2 seconds or more) are also detected as unclear utterance parts.
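
As a rough illustration of the detection in step S23, the following Python sketch flags an utterance either because it matches a hesitation or question pattern, or because it broke off and was followed by at least 2 seconds of silence. The pattern list merely stands in for the dictionary 105B, and the `Segment` type, its field names, and `find_unclear` are assumptions made for this example rather than anything disclosed in the text.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-in for the contents of dictionary 105B (English examples).
UNCLEAR_PATTERNS = [
    re.compile(r"\bum\b(?:,\s*um\b)*", re.IGNORECASE),            # hesitation fillers
    re.compile(r"what was it", re.IGNORECASE),                    # asking the other party
    re.compile(r"how many minutes (?:was|is) it", re.IGNORECASE),
]
SILENCE_THRESHOLD_S = 2.0  # "for example, 2 seconds or more" in the description


@dataclass
class Segment:
    text: str                # recognized text of one utterance
    trailing_silence: float  # seconds of silence that followed it
    complete: bool = True    # False if the utterance broke off midway


def find_unclear(segment: Segment) -> Optional[Tuple[str, str]]:
    """Return (reason, matched_text) for an unclear portion, or None."""
    for pattern in UNCLEAR_PATTERNS:
        match = pattern.search(segment.text)
        if match:
            return ("phrase", match.group(0))
    # An utterance that stops midway and is followed by a long silence
    # is also treated as an unclear portion.
    if not segment.complete and segment.trailing_silence >= SILENCE_THRESHOLD_S:
        return ("silence", "")
    return None
```

A real implementation would work on segmented Japanese text; simple English patterns are used here only to keep the sketch short.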

例えば、「〇〇様に先月納品したアダプタの型番は、えーと、えーと、…」といった通話音声があった場合に、「えーと、えーと、…」という部分が不明発言部分として不明発言検出部１２１４により抽出されたとする。この場合、キーワード抽出部１２１５は、当該不明発言部分の直前の部分も考慮し、「〇〇様」、「先月納品」、「アダプタの型番」といった検索用キーワードを抽出する。また、「羽田から福岡までの飛行時間は、何分だっけ。」といった通話音声があった場合に、「何分だっけ」という部分が不明発言部分として不明発言検出部１２１４により検出されたとする。この場合、キーワード抽出部１２１５は、当該不明発言部分の直前の部分も考慮し、「羽田から福岡までの飛行時間」といった検索用キーワードを抽出する。 For example, suppose the call voice includes the statement "The model number of the adapter delivered to Mr./Ms. XX last month is, um, um, ...", and the portion "um, um, ..." is extracted by the unclear utterance detection unit 1214 as an unclear utterance portion. In this case, the keyword extraction unit 1215 also takes into account the portion immediately preceding the unclear utterance portion and extracts search keywords such as "Mr./Ms. XX", "Delivered last month", and "Adapter model number". Likewise, suppose the call voice includes the statement "How many minutes is the flight time from Haneda to Fukuoka?", and the portion "How many minutes is it?" is detected by the unclear utterance detection unit 1214 as an unclear utterance portion. In this case, the keyword extraction unit 1215 also takes into account the portion immediately preceding the unclear utterance portion and extracts a search keyword such as "Flight time from Haneda to Fukuoka".

次に、制御部１０２の制御の下、検索先判定部１２２が機能し、キーワード抽出部１２１３で抽出された検索用キーワードに基づいて、検索先判定辞書１０６を参照し、検索先を判定する処理を行う（ステップＳ２４）。ステップＳ２４では、検索先判定部１２２が、キーワード抽出部１２１３で抽出された検索用キーワードに基づいて、検索先判定辞書１０６を参照し、検索先を内部情報サーバ５（１）、…にするか、ＩＰ網６２上の情報提供サーバ８（１）、…にするかを判別する。 Next, under the control of the control unit 102, the search destination determination unit 122 functions to refer to the search destination determination dictionary 106 based on the search keywords extracted by the keyword extraction unit 1213 and perform a process to determine the search destination (step S24). In step S24, the search destination determination unit 122 refers to the search destination determination dictionary 106 based on the search keywords extracted by the keyword extraction unit 1213 and determines whether the search destination should be the internal information server 5 (1), ... or the information providing server 8 (1), ... on the IP network 62.

例えば、検索キーワードに含まれる単語が、上述した「〇〇様」、「先月納品」、「アダプタの型番」のように、内部（社内）で特に用いられる文言である場合には、ＬＡＮ４上の内部情報サーバ５（１）、…が検索先であると判定する。これに対して、上述した「羽田から福岡までの飛行時間」という検索用キーワードの場合、「羽田」、「福岡」、「飛行時間」といった単語は、一般的な単語であり、内部（社内）で特に用いられる文言ではない。このため、この場合には、ＩＰ網６２上の情報提供サーバ８（１）、…が検索先である判定する。 For example, if the words included in the search keyword are phrases that are used especially internally (within a company), such as the above-mentioned "Mr./Ms. XX," "Delivered last month," and "Adapter model number," then it is determined that the internal information server 5 (1), ... on the LAN 4 is the search destination. In contrast, in the case of the above-mentioned search keyword "Flight time from Haneda to Fukuoka," words such as "Haneda," "Fukuoka," and "Flight time" are general words and are not phrases that are used especially internally (within a company). For this reason, in this case, it is determined that the information providing server 8 (1), ... on the IP network 62 is the search destination.

また、検索用キーワードが「○○様」や「アダプタの型番」などの得意先を示す情報の場合には、ＬＡＮ４上の顧客情報ＤＢや製品情報ＤＢなどのように、検索先となるデータベース自体を特定することも可能である。同様に、検索用キーワードが「羽田から福岡までの飛行時間」の場合には、ＩＰ網６２上の所定の航空会社のＷｅｂページを検索先とするなど、検索用キーワードに基づいて、検索先（問い合わせ先）となるＷｅｂページ自体を特定することも可能である。 In addition, when the search keyword is information indicating a customer, such as "Mr./Ms. XX" or "adapter model number," it is possible to specify the database itself to be searched, such as a customer information DB or product information DB on LAN 4. Similarly, when the search keyword is "flight time from Haneda to Fukuoka," it is also possible to specify the web page itself to be searched (inquiry destination) based on the search keyword, such as searching the web page of a specific airline on IP network 62.
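
The destination decision of step S24 can be pictured with a small Python sketch: if any keyword is registered as an internal term, the internal information server is chosen, and otherwise the external information providing server is chosen. The table below is only a hypothetical stand-in for the search destination determination dictionary 106, and the mapping of each term to a particular database is an assumption made for illustration.

```python
from enum import Enum
from typing import List, Tuple


class Destination(Enum):
    INTERNAL = "internal information server on LAN 4"
    EXTERNAL = "information providing server on IP network 62"


# Hypothetical dictionary entries: internal term -> database to query.
INTERNAL_TERMS = {
    "Mr./Ms. XX": "customer information DB",
    "Delivered last month": "transaction information DB",
    "Adapter model number": "product information DB",
}


def determine_destination(keywords: List[str]) -> Tuple[Destination, List[str]]:
    """Pick the search destination and, for internal searches, the candidate DBs."""
    targets = [INTERNAL_TERMS[k] for k in keywords if k in INTERNAL_TERMS]
    if targets:
        return Destination.INTERNAL, targets
    # No internally used term was found, so treat the keywords as general words.
    return Destination.EXTERNAL, []
```

With this table, the adapter example resolves to the internal server, while "Flight time from Haneda to Fukuoka" falls through to the external server.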

次に、制御部１０２の制御の下、検索実行部１２３が機能して、検索先判定部１２２で判定された検索先に対して、キーワード抽出部１２１５抽出された検索用キーワードを用いて検索を実行する（ステップＳ２５）。検索実行部１２３は、検索先がＩＰ網６２上の情報提供サーバ８（１）、…である場合には、所定のブラウザ（Ｗｅｂページ閲覧ソフト）を用いて、検索用キーワードを入力するようにして、ＩＰ網６２上の情報提供サーバ８（１）、…を検索先として検索を行う。これにより、具体例を示すと、例えば、「羽田から福岡までの飛行時間」という検索用キーワードを用いた場合には、「約１時間５０分」という検索結果が得られる。 Next, under the control of the control unit 102, the search execution unit 123 functions to execute a search for the search destination determined by the search destination determination unit 122 using the search keyword extracted by the keyword extraction unit 1215 (step S25). When the search destination is an information providing server 8(1), ... on the IP network 62, the search execution unit 123 uses a specified browser (web page viewing software) to input the search keyword, and executes a search for the information providing server 8(1), ... on the IP network 62 as the search destination. As a result, for example, when the search keyword "flight time from Haneda to Fukuoka" is used, the search result "about 1 hour and 50 minutes" is obtained.

また、検索実行部１２３は、検索先がＬＡＮ４上の内部情報サーバ５（１）、…である場合には、所定の検索プログラムを実行し、検索用キーワードを入力するようにして、ＬＡＮ４上の内部情報サーバ５（１）、…を検索先として検索を行う。これにより、具体例を示すと、例えば、「〇〇様」、「先月納品」、「アダプタの型番」という検索用キーワードが用いられたとする。この場合には、取引情報ＤＢが参照され、得意先である「○○様」に対して、「先月納品」した「アダプタの型番」である、例えば「ＡＤＰ１２３４」という検索結果が得られる。 Furthermore, when the search destination is the internal information server 5 (1), ... on the LAN 4, the search execution unit 123 executes a specified search program and inputs search keywords, and performs a search on the internal information server 5 (1), ... on the LAN 4 as the search destination. As a specific example, assume that the search keywords "Mr. XX", "Delivered last month", and "Adapter model number" are used. In this case, the transaction information DB is referenced, and a search result of, for example, "ADP1234", which is the "Adapter model number" "Delivered last month" to the customer "Mr. XX", is obtained.

次に、制御部１０２の制御の下、検索成否判定部１２７が機能し、検索実行部１２３による検索の成否が判定される（ステップＳ２６）。具体的には、検索実行部１２３により、検索結果が得られたか否かが判定される。次に、制御部１０２の制御の下、補足文作成部１２８が機能し、補足文を作成する処理を行う（ステップＳ２７）。ステップＳ２６において、検索成否判定部１２７が検索結果は得られていない（検索不成功）であると判定した場合、ステップＳ２７において、補足文作成部１２８は、例えば、「検索できませんでした。」といった補足文を作成する。逆に、ステップＳ２６において、検索成否判定部１２７が、検索結果が得られた（検索成功）と判定した場合、ステップＳ２７において、補足文作成部１２８は、検索実行部１２３の検索処理の結果に基づいて、補足文を作成する。 Next, under the control of the control unit 102, the search success/failure determination unit 127 functions to determine whether the search performed by the search execution unit 123 was successful (step S26). Specifically, it is determined whether the search execution unit 123 obtained a search result. Next, under the control of the control unit 102, the supplementary sentence creation unit 128 functions to create a supplementary sentence (step S27). If the search success/failure determination unit 127 determines in step S26 that no search result was obtained (search unsuccessful), the supplementary sentence creation unit 128 creates, in step S27, a supplementary sentence such as "Search failed." Conversely, if the search success/failure determination unit 127 determines in step S26 that a search result was obtained (search successful), the supplementary sentence creation unit 128 creates, in step S27, a supplementary sentence based on the result of the search performed by the search execution unit 123.

具体例を示せば、「羽田から福岡までの飛行時間」という検索用キーワードを用いて、「約１時間５０分」という検索結果が得られた場合には、「羽田から福岡までの飛行時間は、約１時間５０分です。」という補足文を作成する。また、「〇〇様」、「先月納品」、「アダプタの型番」という検索用キーワードが用いて、「ＡＤＰ１２３４」という検索結果が得られた場合には、「○○様に先月納品したアダプタの型番は、ＡＤＰ１２３４です。」といった補足文を作成する。 To give a specific example, if the search keyword "flight time from Haneda to Fukuoka" is used and the search result is "approximately 1 hour 50 minutes," a supplemental sentence such as "Flight time from Haneda to Fukuoka is approximately 1 hour 50 minutes" is created. Similarly, if the search keywords "Mr./Ms. XX," "Delivered last month," and "Adapter model number" are used and the search result is "ADP1234," a supplemental sentence such as "The model number of the adapter delivered to Mr./Ms. XX last month is ADP1234" is created.
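
Steps S26 and S27 amount to a branch on whether a result exists, followed by filling a sentence template. A minimal Python sketch, assuming the subject phrase of the unclear utterance is already available as a string and that a missing result is represented by `None` (both assumptions of this example):

```python
from typing import Optional


def create_supplement(subject: str, result: Optional[str]) -> str:
    """Build the supplementary sentence for one unclear utterance portion."""
    if result is None:
        # Step S26 judged the search unsuccessful.
        return "Search failed."
    # Step S26 judged the search successful: embed the result in a sentence.
    return f"{subject} is {result}."


print(create_supplement("Flight time from Haneda to Fukuoka",
                        "approximately 1 hour 50 minutes"))
print(create_supplement("The model number of the adapter delivered to Mr./Ms. XX last month",
                        "ADP1234"))
```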

この後、制御部１０２の制御の下、補足提供部１２９が機能し、補足文作成部１２８で作成された補足文（テキストデータ）を音声データに変換し、通話回線を接続している話者に対して提供するする処理を行う（ステップＳ２８）。この場合、例えば、双方の話者の通話音声が途切れたことを例えば制御部１０２において検出した場合に、補足提供部１２９が双方の話者に対して提供する（ステップＳ２９、ステップＳ３０）。補足文に応じた音声データの提供が通話を阻害することなく、補足文に応じた音声データを適切に提供するためである。これにより、ステップＳ２３で検出された不明発言部分について、電話制御装置１Ｂの機能によって自動的に、話者双方に対して当該発言部分を補足する（補う）情報を提供できる。 After this, under the control of the control unit 102, the supplementary provision unit 129 functions to convert the supplementary sentence (text data) created by the supplementary sentence creation unit 128 into audio data and provide it to the speakers connected on the telephone line (step S28). In this case, for example, if the control unit 102 detects that the voice of both speakers has been interrupted, the supplementary provision unit 129 provides it to both speakers (steps S29, S30). This is to provide appropriate audio data corresponding to the supplementary sentence without disrupting the call. As a result, for the unclear utterance portion detected in step S23, the function of the telephone control device 1B can automatically provide both speakers with information that supplements (complements) that portion of the utterance.

この後、制御部１０２は、補足提供部１２９からの補足文の音声データの送出が終了すると、当該補足文の音声データを消去して、通常の通話に戻るようにする（ステップＳ３１）。ステップＳ３１の処理の後においては、音声処理部１２１Ｂが機能して、音声認識部１２１１が通話音声をテキストデータに変換し、不明発言検出部１２１４が機能して、提供した補足文についての評価を示す部分を抽出する。例えば、補足文の提供直後に、「そうじゃないでしょう。」、「違いますよね。」、「ちょっと腑に落ちないですね。」といった否定的な文言が抽出された場合には、制御部１０２は、ステップＳ２５からの処理に戻り、再度検索を行って、別の検索結果を用いて、追加の補足を行うようにする（ステップＳ３２）。 After this, when the supplementary provision unit 129 has finished sending the voice data of the supplementary sentence, the control unit 102 erases that voice data and returns to the normal call (step S31). After the processing of step S31, the voice processing unit 121B functions: the voice recognition unit 1211 converts the call voice into text data, and the unclear utterance detection unit 1214 extracts any portion that indicates an evaluation of the provided supplementary sentence. For example, if negative wording such as "That's not it," "It's not right," or "That doesn't make sense" is extracted immediately after the supplementary sentence is provided, the control unit 102 returns to the processing from step S25, searches again, and provides an additional supplement using a different search result (step S32).

これに対して、例えば、補足文の提供直後に、「あっ、そうでしたね。」、「なるほど。」、「やっと、すっきりしました。」といった肯定的な文言が抽出された場合には、今回の不明部分についての検索を終了する（ステップＳ３３）。また、補足文の提供直後に、「この件は、後で調べてご連絡します。」、「また、別の機会に」といった補足不要であることを示す文言が抽出された場合にも、今回の不明部分についての検索を終了する（ステップＳ３３）。この後、通話音声の送受信が継続され（ステップＳ３４）、上述したステップＳ２３からの処理が繰り返すようにされる。 In contrast to this, for example, if positive words such as "Ah, that's right," "I see," or "Finally, I feel relieved" are extracted immediately after the supplementary text is provided, the search for the current unclear portion is terminated (step S33). Also, if words indicating that no supplement is necessary such as "I will look into this matter and get back to you later" or "Maybe another time" are extracted immediately after the supplementary text is provided, the search for the current unclear portion is terminated (step S33). After this, the transmission and reception of voice communication continues (step S34), and the process from step S23 described above is repeated.

なお、ステップＳ３２やステップＳ３３においては、図示しないが、例えば、記憶装置１０３に格納されている「否定的文言辞書」が用いられ、ステップＳ３３においては、図示しないが、例えば、記憶装置１０３に格納されている「肯定的文言辞書」が用いられる。 Note that in steps S32 and S33, although not shown, for example, a "negative phrase dictionary" stored in the storage device 103 is used, and in step S33, although not shown, for example, a "positive phrase dictionary" stored in the storage device 103 is used.
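
The follow-up handling in steps S32 and S33 can be sketched as a lookup against phrase lists. In the Python sketch below, the three tuples are hypothetical stand-ins for the negative and positive phrase dictionaries, using the English example phrases from the description. Treating a reply that matches nothing as "continue the call" is an assumption of this example, since the text does not state that case.

```python
# Hypothetical dictionary contents (lower-case English examples).
NEGATIVE = ("that's not it", "it's not right", "that doesn't make sense")
POSITIVE = ("ah, that's right", "i see", "finally, i feel relieved")
NOT_NEEDED = ("get back to you later", "maybe another time")


def next_action(reply: str) -> str:
    """Decide what to do with the utterance that follows a supplementary sentence."""
    text = reply.lower()
    if any(phrase in text for phrase in NEGATIVE):
        return "search_again"    # step S32: return to the search of step S25
    if any(phrase in text for phrase in POSITIVE + NOT_NEEDED):
        return "end_search"      # step S33: stop searching for this portion
    return "continue_call"       # assumption: no evaluation was detected
```

Negative phrases are checked first so that a reply containing both kinds of wording triggers another search rather than silently ending it; that ordering is a design choice of this sketch.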

このようにして、第２の実施の形態の電話制御装置１Ｂでは、電話回線を接続して通話を行う双方の話者からの通話音声を解析し、事前補足（不明発言部分の補足）を行うことができる。これにより、通話中に不明なことが発信し、話が先に進まないような場合でも、自動的に不明発言部分について、解答が示されるので、話の遅滞を防止でき、スムーズに通話（会話）を行うことができる。 In this way, the telephone control device 1B of the second embodiment can analyze the voices of both speakers who are connected to the telephone line and can perform advance supplementation (supplementation of unclear utterances). As a result, even if something unclear is said during a call and the conversation does not progress, an answer is automatically provided for the unclear utterance, preventing delays in the conversation and allowing the call (conversation) to proceed smoothly.

［第３の実施の形態］ [Third embodiment]
＜第３の実施の形態の電話制御装置１Ｃの構成例＞ <Configuration Example of Telephone Control Device 1C of Third Embodiment>
第３の実施の形態の電話制御装置１Ｃは、電話回線を接続して通話を行う双方の話者からの通話音声を解析し、事前防止（不適切発言部分の上書き）を行う機能を備えるものである。図６は、この発明による通話支援装置の第３の実施の形態が適用された電話制御装置１Ｃの構成例を説明するためのブロック図である。図６に示す電話制御装置１Ｃにおいて、図２を用いて説明した第１の実施の形態の電話制御装置１Ａと同様に構成される部分には、同じ参照符号を付し、当該部分の詳細な説明については省略する。 The telephone control device 1C of the third embodiment has a function of analyzing the voices of both speakers who are connected to a telephone line and performing preventive measures (overwriting inappropriate remarks). Fig. 6 is a block diagram for explaining a configuration example of the telephone control device 1C to which the third embodiment of the communication support device of the present invention is applied. In the telephone control device 1C shown in Fig. 6, parts configured similarly to the telephone control device 1A of the first embodiment described using Fig. 2 are given the same reference numerals, and detailed explanations of those parts are omitted.

図６に示すように、第３の実施の形態の電話制御装置１Ｃは、不適切文言辞書１０５Ｃを備える。不適切文言辞書１０５Ｃは、ＨＤＤやＳＳＤなどの記録装置部に作成され、通話回線を接続して通話を行う双方の話者の通話音声における、不適切な発言部分を検出するための種々の辞書データを保持する。不適切文言辞書１０５Ｃには、相手を馬鹿にする言葉、差別的な言葉、不穏当な言葉、公序良俗に反する言葉など、相手が不快に感じるような種々の文言が登録されている。 As shown in FIG. 6, the telephone control device 1C of the third embodiment includes an inappropriate word dictionary 105C. The inappropriate word dictionary 105C is created in a recording device such as an HDD or SSD, and holds various dictionary data for detecting inappropriate utterances in the voices of both speakers who are connected to a telephone line and making a call. The inappropriate word dictionary 105C stores various words that may make the other party feel uncomfortable, such as words that make the other party feel insulted, discriminatory words, inappropriate words, and words that go against public order and morals.

会話支援処理部１２０Ｃが、電話回線を接続して通話を行う双方の話者からの通話音声を解析し、事前防止（不適切発言部分の上書き）を行う機能を実現する部分となる。会話支援処理部１２０Ｃは、図２に示すように、音声認識部１２１１と、不適切発言検出部１２１６と、不適切発言上書き部１２１７とからなる音声処理部１２１Ｃを備える。更に、会話支援処理部１２０Ｃは、ガイダンス作成部１３１と、ガイダンス提供部１３２とを備える。 The conversation support processing unit 120C is the part that realizes the function of analyzing the voices of both speakers who are connected to the telephone line and preventing inappropriate remarks in advance (overwriting inappropriate remarks). As shown in FIG. 2, the conversation support processing unit 120C includes a voice processing unit 121C that is made up of a voice recognition unit 1211, an inappropriate remark detection unit 1216, and an inappropriate remark overwriting unit 1217. Furthermore, the conversation support processing unit 120C includes a guidance creation unit 131 and a guidance provision unit 132.

この第３の実施の形態の電話制御装置１Ｃは、自己の配下の電話端末３からの通話音声に、不適切な発言部分が含まれていた場合に、当該部分を外線電話端末７には提供しないようにする。さらに、電話制御装置１Ｃは、不適切な発言部分を含む通話音声の送信元である電話端末３に対しては、注意喚起を促すガイダンスメッセージを提供する。従って、外線電話端末７からの通話音声に不適切な発言部分が含まれていたとしても、これはそのまま電話端末３に提供される。 When the call voice from a subordinate telephone terminal 3 contains an inappropriate remark, the telephone control device 1C of the third embodiment does not pass that portion on to the outside telephone terminal 7. Furthermore, the telephone control device 1C provides a guidance message urging caution to the telephone terminal 3, which is the source of the call voice containing the inappropriate remark. Because only the call voice from the telephone terminal 3 is checked in this way, an inappropriate remark in the call voice from the outside telephone terminal 7 is delivered to the telephone terminal 3 as is.

すなわち、音声認識部１２１１は、電話端末３からの通話音声と外線電話端末７からの通話音声とのそれぞれについて、区別できる。そこで、この第３の実施の形態の電話制御装置においては、通話回線を接続して、通話を行う電話端末３からの通話音声をテキストデータに変換する処理を行う。不適切発言検出部１２１６は、電話端末３の話者からの通話音声から変換されたテキストデータを、文節や単語に区切るようにして解析し、不適切文言辞書１０５Ｃを参照して、不適切文言部分を検出する。 That is, the voice recognition unit 1211 can distinguish the call voice from the telephone terminal 3 from the call voice from the outside telephone terminal 7. The telephone control device of the third embodiment therefore converts into text data the call voice from the telephone terminal 3, which is on a call over a connected call line. The inappropriate remark detection unit 1216 analyzes the text data converted from the call voice of the speaker at the telephone terminal 3 by dividing it into phrases and words, and detects inappropriate wording by referring to the inappropriate word dictionary 105C.

不適切発言上書き部１２１７は、不適切発言検出部１２１６で検出された不適切な発言部分に対応する通話音声部分を、この実施の形態の電話制御装置１Ｃでは無音に置き換えて、相手先である外線電話端末７に対して送信するようにする処理を行う。このように、送受される通話音声は、必ず音声処理部１２１Ｃを通じて、相手先に送信するようにされる。ガイダンス作成部１３１は、不適切な発言部分が検出された通話音声の提供元である電話端末３の話者に対して注意喚起を促すガイダンスメッセージを作成する。当該ガイダンスメッセージは、例えば、「不適切な発言がありました。当該部分の音声情報は相手先には送信されていません。」などといったものとなる。 The inappropriate remarks overwriting unit 1217 replaces the portion of the telephone call audio corresponding to the inappropriate remarks detected by the inappropriate remarks detection unit 1216 with silence in the telephone control device 1C of this embodiment, and transmits this to the outside telephone terminal 7, which is the other party. In this way, the transmitted and received telephone call audio is always transmitted to the other party via the audio processing unit 121C. The guidance creation unit 131 creates a guidance message that calls attention to the speaker of the telephone terminal 3, which is the source of the telephone call audio in which the inappropriate remarks were detected. The guidance message is, for example, "An inappropriate remark was made. The audio information for that portion has not been transmitted to the other party."

ガイダンス提供部１３２は、ガイダンス作成部１３１で作成されたガイダンスメッセージ（テキストデータ）を音声情報に変換し、当該不適切発言を行った電話端末３の話者に対してのみ提供する処理を行う。すなわち、ガイダンス提供部１３２は、ガイダンス作成部１３１で作成されたガイダンスメッセージ（テキストデータ）を内線Ｉ／Ｆ１０７及び接続端１０７Ｔを通じて、電話回線を接続している配下の電話端末３に提供する。 The guidance providing unit 132 converts the guidance message (text data) created by the guidance creating unit 131 into voice information and provides it only to the speaker of the telephone terminal 3 who made the inappropriate remark. In other words, the guidance providing unit 132 provides the guidance message (text data) created by the guidance creating unit 131 to the subordinate telephone terminal 3 connected to the telephone line via the extension I/F 107 and the connection end 107T.

なお、この第３の実施の形態の電話制御装置１Ｃでは、配下の電話端末３からの通話音声に対して、不適切な発言部分の検出を行うようにした。しかし、この第３の実施の形態の電話制御装置１Ｃの音声処理部１２１Ｃにおいても、通話回線を接続して、通話を行う双方の話者からのそれぞれの通話音声を処理対象とし、処理対象についてどちらの話者の通話音声なのかを区別可能に処理することができる。このため、外線電話端末７からの通話音声についても、電話端末３からの通話音声を処理対象とする場合と同様に不適切な発言部分を検出し、当該部分の通話音声は電話端末３には提供しないようにできる。また、この場合には、不適切発言があったことを通知するガイダンスメッセージを形成して、外線電話端末７に対して提供することもできる。 In the telephone control device 1C of the third embodiment, inappropriate remarks are detected in the call voice from the subordinate telephone terminal 3. However, the voice processing unit 121C of the telephone control device 1C can also take as its processing targets the call voices of both speakers on a connected call line, and can process them so that each call voice is attributable to its speaker. Therefore, inappropriate remarks can also be detected in the call voice from the outside telephone terminal 7, in the same way as for the call voice from the telephone terminal 3, and the corresponding portion of the call voice can be withheld from the telephone terminal 3. In this case, a guidance message notifying that an inappropriate remark was made can also be generated and provided to the outside telephone terminal 7.

＜第３の実施の形態の通話支援システムでの処理＞ <Processing in the communication support system according to the third embodiment>
図７は、第３の実施の形態の電話制御装置１Ｃが用いられて構成された通話支援システムでの処理を説明するためのシーケンス図である。図７においても、電話端末３から外線電話端末７に電話を掛けることにより、あるいは、外線電話端末７から電話端末３に電話を掛けることにより、電話端末３と外線電話端末７との間に通話回線が接続され、通話が開始されているものとする（ステップＳ４１）。 Fig. 7 is a sequence diagram for explaining the process of the call support system configured using the telephone control device 1C of the third embodiment. In Fig. 7, it is assumed that a call line is connected between the telephone terminal 3 and the outside telephone terminal 7 by making a call from the telephone terminal 3 to the outside telephone terminal 7, or by making a call from the outside telephone terminal 7 to the telephone terminal 3, and a call is started (step S41).

図１を用いて説明したように、電話端末３は、電話制御装置１Ｃの配下の電話端末であるため、電話端末３と外線電話端末７との間の通話音声は、全て電話制御装置１Ｃを介して送受される。すなわち、電話制御装置１Ｃは、電話端末３と外線電話端末７との間の全ての通話音声を中継する。このため、電話制御装置１Ｃでは、制御部１０２の制御により、通話音声の転送と音声認識が開始される（ステップＳ４２）。 As explained using FIG. 1, telephone terminal 3 is a telephone terminal subordinate to telephone control device 1C, and therefore all call voice between telephone terminal 3 and outside telephone terminal 7 is sent and received via telephone control device 1C. In other words, telephone control device 1C relays all call voice between telephone terminal 3 and outside telephone terminal 7. For this reason, in telephone control device 1C, transfer of call voice and voice recognition are started under the control of control unit 102 (step S42).

第３の実施の形態の電話制御装置１Ｃでは、例えば、制御部１０２が機能して、音声認識部１２１１で通話音声から変換されたテキストデータに基づいて、挨拶の内容などから、通話の相手先である外線電話端末７の話者に関する情報を抽出する（ステップＳ４３）。また、ステップＳ４３において、制御部１０２は、相手先である外線電話端末７に割り当てられている電話番号をも用いて外線電話端末７の話者に関する情報を抽出する。外線電話端末７の電話番号は、外線電話端末７が発信元である場合には、当該外線電話端末７から提供される発信元電話番号を用いることができ、外線電話端末７が着信先である場合には、発信元である電話端末３から提供される着信先電話番号を用いることができる。 In the telephone control device 1C of the third embodiment, for example, the control unit 102 functions to extract information about the speaker of the outside telephone terminal 7, which is the other end of the call, from the content of the greeting, etc., based on the text data converted from the call voice by the voice recognition unit 1211 (step S43). In addition, in step S43, the control unit 102 also uses the telephone number assigned to the outside telephone terminal 7, which is the other end of the call, to extract information about the speaker of the outside telephone terminal 7. When the outside telephone terminal 7 is the caller, the caller telephone number provided by the outside telephone terminal 7 can be used as the telephone number of the outside telephone terminal 7, and when the outside telephone terminal 7 is the callee, the callee telephone number provided by the caller telephone terminal 3 can be used.

ステップＳ４３で抽出される外線電話端末７の話者に関する情報は、不適切な発言部分の抽出処理において用いられる。例えば、外線電話端末７の話者が、得意先の話者である場合には、失礼が無いように高いレベルで不適切な発言部分の検出を行わなければならない。また、外線電話端末７の話者が、発注先である場合には、特にハラスメント的な発言には注意を要する。このように、外線電話端末７の話者が、誰なのかによって、不適切な発言の内、どの分野の発言により注意が必要なのかが異なる場合があるため、外線電話端末７の話者に関する情報も考慮することになる。ステップＳ４３の処理の後においては、電話制御装置１Ｃの音声認識部１２１１で行われる音声認識は、電話制御装置１Ｃの配下の電話端末３からの通話音声だけに絞り込んでもよい。 The information about the speaker at the outside telephone terminal 7 extracted in step S43 is used in the process of detecting inappropriate remarks. For example, if the speaker at the outside telephone terminal 7 is a customer, inappropriate remarks must be detected strictly, at a high level, so as to avoid any discourtesy. If the speaker at the outside telephone terminal 7 is a supplier, particular attention must be paid to harassing remarks. Because the category of inappropriate remarks that needs the most attention can thus differ depending on who the speaker at the outside telephone terminal 7 is, information about that speaker is also taken into account. After the process of step S43, the voice recognition performed by the voice recognition unit 1211 of the telephone control device 1C may be limited to the call voice from the telephone terminal 3 subordinate to the telephone control device 1C.

この後、電話制御装置１Ｃの音声処理部１２１Ｃでは、電話端末３からの通話音声を処理対象とする（ステップＳ４４）。従って、電話端末３からの通話音声について、音声認識部１２１１が機能して通話音声をテキストデータに変換し、このテキストデータについて、不適切発言検出部１２１６が、不適切文言辞書１０５Ｃを参照して、不適切な発言部分の検出を行う（ステップＳ４５）。なお、ステップＳ４５において、不適切発言検出部１２１６は、相手先である外線電話端末７の話者が誰かに応じて、不適切な発言となるレベルを調整できる。 Then, the voice processing unit 121C of the telephone control device 1C takes the call voice from the telephone terminal 3 as its processing target (step S44). Accordingly, the voice recognition unit 1211 converts the call voice from the telephone terminal 3 into text data, and the inappropriate remark detection unit 1216 refers to the inappropriate word dictionary 105C to detect inappropriate remarks in this text data (step S45). Note that in step S45, the inappropriate remark detection unit 1216 can adjust the level at which a remark is treated as inappropriate according to who the speaker at the outside telephone terminal 7, the other party, is.

このため、不適切文言辞書１０５Ｃに登録されている辞書データには、例えば、レベルを示す情報が付加されており、レベルに応じて不適切な発言部分を検出することができるようにされる。例えば、相手先が得意先である場合には、全範囲で高いレベルで厳しく不適切発言を検出し、相手先が発注先である場合には、ハラスメントの範囲について厳しく不適切発言を検出するなどのことができる。 For this reason, the dictionary data registered in the inappropriate word dictionary 105C carries, for example, information indicating a level, so that inappropriate remarks can be detected according to that level. For example, if the other party is a customer, inappropriate remarks can be detected strictly, at a high level, across all categories; if the other party is a supplier, inappropriate remarks can be detected strictly in the harassment category.
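
One way to picture level-dependent detection is to attach a category and a level to each dictionary entry and to choose a per-category threshold from the type of the other party. The Python sketch below does this; the entries, the 1 to 3 level scale, the category names, and the threshold table are all hypothetical and only illustrate the idea, since the description does not specify how levels are encoded.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    phrase: str
    category: str  # e.g. "insult" or "harassment" (assumed category names)
    level: int     # 1 = mild ... 3 = severe (assumed scale)


# Hypothetical stand-in for dictionary 105C.
DICTIONARY = [
    Entry("stupid", "insult", 3),
    Entry("whatever", "insult", 1),
    Entry("do it or else", "harassment", 2),
]

# Lowest level that is flagged, per counterpart type and category.
THRESHOLDS = {
    "customer": {"default": 1},                   # strict across all categories
    "supplier": {"harassment": 1, "default": 3},  # strict for harassment
    "unknown": {"default": 2},
}


def detect(text: str, counterpart: str) -> List[str]:
    """Return the dictionary phrases in text that are flagged for this counterpart."""
    rules = THRESHOLDS.get(counterpart, THRESHOLDS["unknown"])
    lowered = text.lower()
    hits = []
    for entry in DICTIONARY:
        threshold = rules.get(entry.category, rules["default"])
        if entry.level >= threshold and entry.phrase in lowered:
            hits.append(entry.phrase)
    return hits
```

The same sentence can therefore produce different hits depending on whether the other party is a customer or a supplier.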

ステップＳ４５において、電話端末３からの通話音声において、不適切な発言部分が検出されたとする。この場合、不適切発言検出部１２１６から不適切発言上書き部１２１７に対して、当該電話端末からの通話音声の当該不適切な発言部分を示す情報が提供される。ここで、通話音声の不適切な発言部分を示す情報は、例えば、通話開始時点を始点とする時間情報、あるいは、通話音声の先頭から割り当てられたアドレス情報（ポインタ情報）などである。 In step S45, it is assumed that an inappropriate remark portion is detected in the call audio from telephone terminal 3. In this case, the inappropriate remark detection unit 1216 provides information indicating the inappropriate remark portion of the call audio from the telephone terminal to the inappropriate remark overwriting unit 1217. Here, the information indicating the inappropriate remark portion of the call audio is, for example, time information starting from the start of the call, or address information (pointer information) assigned from the beginning of the call audio.

不適切発言上書き部１２１７は、不適切発言検出部１２１６からの情報に基づいて、電話端末３からの通話音声の不適切な発言部分を無音で上書きし、この上書きされた通話音声を、通話の相手先である外線電話端末７に送信する（ステップＳ４６）。これにより、通話の相手先である外線電話端末７に対しては、電話端末３からの通話音声であって、不適切な発言部分が無音で上書きされた通話音声が提供される（ステップＳ４７）。 Based on the information from the inappropriate remarks detection unit 1216, the inappropriate remarks overwriting unit 1217 overwrites the inappropriate remarks in the voice of the call from the telephone terminal 3 with silence, and transmits this overwritten voice of the call to the outside telephone terminal 7, which is the other end of the call (step S46). As a result, the voice of the call from the telephone terminal 3 in which the inappropriate remarks have been overwritten with silence is provided to the outside telephone terminal 7, which is the other end of the call (step S47).
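As one illustration of step S46, the following sketch overwrites the indicated portions of a call voice buffer with silence. It assumes that the call voice is available as 16-bit signed PCM samples and that the inappropriate portions are given as time spans in seconds measured from the start of the buffer; the function name and the data layout are hypothetical and are not specified in the embodiment.

```python
from array import array


def overwrite_with_silence(samples, sample_rate, spans):
    """Return a copy of `samples` in which each (start_s, end_s) span is zeroed.

    samples     -- 16-bit signed PCM samples (assumed representation)
    sample_rate -- samples per second
    spans       -- time spans in seconds from the start of the buffer
    """
    out = array("h", samples)  # copy, so the original call voice is left untouched
    for start_s, end_s in spans:
        first = max(0, int(start_s * sample_rate))
        last = min(len(out), int(end_s * sample_rate))
        for i in range(first, last):
            out[i] = 0  # a zero-valued sample is silence in signed PCM
    return out
```

Address information (pointer information) could be used in the same way by passing sample indices directly instead of converting from time.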

更に、電話制御装置１Ｃでは、ガイダンス作成部１３１が機能して、不適切な発言があったことを通知するガイダンスメッセージ（テキストデータ）が形成される（ステップＳ４８）。 Furthermore, in the telephone control device 1C, the guidance creation unit 131 functions to generate a guidance message (text data) notifying the user that an inappropriate comment has been made (step S48).

ガイダンス作成部１３１で形成されたガイダンスメッセージは、ガイダンス提供部１３２において音声情報に変換されて、通話回線を接続して通話を行っている不適切な発言の送信元である電話端末３に送信される（ステップＳ４９）。これにより、当該電話端末３に対して、ガイダンスメッセージが提供される（ステップＳ５０）。これにより、電話端末３の話者は、自分が不適切な発言をしてしまったことを認識し、以後の発言について注意を払うことができる。 The guidance message created by the guidance creation unit 131 is converted into audio information by the guidance provision unit 132 and transmitted to the telephone terminal 3 that is the sender of the inappropriate remarks and is currently speaking over a connected telephone line (step S49). As a result, the guidance message is provided to the telephone terminal 3 (step S50). This allows the speaker of the telephone terminal 3 to recognize that they have made an inappropriate remark and to be careful about future remarks.

この後、制御部１０２は、通常の通話に戻るように各部を制御する（ステップＳ５１）。これにより、通話音声の送受信が中断されることなく継続され（ステップＳ５２）、上述したステップＳ４４からの処理が繰り返される。このようにして、電話制御装置１Ｃの配下の電話端末３を利用する話者は、自己の不適切な発言を、相手先に通知することなく、自己が不適切な発言を行ったことをガイダンスメッセージで認識して、以後の発言に注意を払うことができる。従って、通話の相手に対して、不用意に不快な思いをさせることが無い。 The control unit 102 then controls each unit to return to a normal call (step S51). As a result, the transmission and reception of call voice continues without interruption (step S52), and the processing from step S44 described above is repeated. In this way, the inappropriate remark of a speaker using a telephone terminal 3 under the telephone control device 1C is not conveyed to the other party, and the speaker can recognize from the guidance message that he or she has made an inappropriate remark and be careful about subsequent remarks. Therefore, the other party to the call is not inadvertently made to feel uncomfortable.

なお、この第３の実施の形態では、電話制御装置１Ｃの配下の電話端末３からの通話音声についてだけ、不適切な発言部分を検出し、無音で置き換えるようにしたが、これに限るものではない。上述もしたように、外線電話端末７からの通話音声についても、電話端末３からの通話音声を処理対象とする場合と同様に不適切な発言部分を検出し、当該部分の通話音声は電話端末３には提供しないようにできる。また、この場合には、不適切発言があったことを通知するガイダンスメッセージを形成して、外線電話端末７に対して提供することもできる。 In the third embodiment, inappropriate remarks are detected and replaced with silence only in the call voice from the telephone terminal 3 under the telephone control device 1C, but the invention is not limited to this. As described above, inappropriate remarks can also be detected in the call voice from the outside telephone terminal 7 in the same way as for the call voice from the telephone terminal 3, so that the corresponding portion of the call voice is not provided to the telephone terminal 3. In this case, a guidance message notifying that an inappropriate remark has been made can also be formed and provided to the outside telephone terminal 7.

従って、内線内で通話する場合、例えば、電話端末３（１）と電話端末３（３）で通話するような場合であっても、双方からの通話音声について不適切な発言部分を検出し、その部分の通話音声を無音で置き換えて、相手先に提供するようにできる。また、この場合においても、不適切な発言を行ったのは、どちらの電話端末の話者であるかを把握できる。単に通話経路の違いだけでなく、通話音声（音声データ）に付加されている送信元を示す情報に基づいて送信元を特定できる。これにより、不適切な発言を行った話者の電話端末に対して、ガイダンスメッセージを提供できる。 Therefore, when making an internal call, for example, when a call is made between telephone terminal 3(1) and telephone terminal 3(3), it is possible to detect inappropriate remarks in the call voice from both parties, replace those parts of the call voice with silence, and provide the silenced parts to the other party. Even in this case, it is possible to determine which telephone terminal made the inappropriate remark. The sender can be identified based not only on differences in the call paths, but also on information indicating the sender that is added to the call voice (voice data). This makes it possible to provide a guidance message to the telephone terminal of the speaker who made the inappropriate remark.
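The routing implied here, in which the sender is identified from information attached to the voice data, can be sketched as follows. The `Frame` type, its field names, and the `route` function are hypothetical, and for brevity the sketch silences a whole flagged frame rather than only the inappropriate portion within it.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    source_id: str  # sender information attached to the voice data (assumed field)
    audio: bytes    # voice payload of this frame
    flagged: bool   # True if the detection stage found an inappropriate remark


def route(frame, participants, guidance_audio):
    """Return {terminal_id: [payloads]} describing what each terminal receives."""
    deliveries = {p: [] for p in participants}
    # The other parties receive silence in place of a flagged frame.
    outgoing = bytes(len(frame.audio)) if frame.flagged else frame.audio
    for p in participants:
        if p != frame.source_id:
            deliveries[p].append(outgoing)
    # Only the terminal that sent the inappropriate remark receives the guidance.
    if frame.flagged and frame.source_id in deliveries:
        deliveries[frame.source_id].append(guidance_audio)
    return deliveries
```

Because the decision depends only on `source_id`, the same routine would cover an internal call between telephone terminal 3(1) and telephone terminal 3(3) as well as a conference with more participants.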

また、ガイダンスメッセージは、上述したものに限るものではなく、例えば、「○○〇…は、不適切な文言です。」といったように、話者に対して、当該話が発したどの文言が、不適切なのかを、ガイダンスメッセージで通知することも可能である。これにより、不適切な発言を意識せずに行ってしまった話者は、何が不適切な発言だったのかを明確に把握することができ、以降、その発言を行うことが無いように意識することができる。 In addition, the guidance message is not limited to the above, and it is also possible to use a guidance message to inform the speaker which words uttered by the speaker are inappropriate, for example, "XXX... is an inappropriate statement." This allows a speaker who has made an inappropriate statement without realizing it to clearly understand what was inappropriate and to be conscious of not making that statement in the future.

［電話会議、オンライン会議への適用］ [Application to telephone conferences and online conferences]
上述もしたように、この発明は、２者間の通話を中継する電話制御装置だけでなく、電話会議やオンライン会議のように、２名以上の複数の話者が参加した会議を行うこともできる。電話会議は、電話制御装置やクラウドＰＢＸに電話会議機能が設けられていれば実現できる。また、オンライン会議の場合には、ＩＰ網６２上の会議サーバにより実現される。 As described above, the present invention can be applied not only to a telephone control device that relays a call between two parties, but also to conferences in which two or more speakers participate, such as telephone conferences and online conferences. A telephone conference can be realized if the telephone control device or cloud PBX is provided with a telephone conference function. An online conference is realized by a conference server on the IP network 62.

このため、電話会議機能が設けられた電話制御装置やクラウドＰＢＸに、また、ＩＰ網６２上の会議サーバに、上述した電話制御装置１Ａ、１Ｂ、１Ｃの会話支援処理部１２０Ａ、１２０Ｂ、１２０Ｃを設け、各話者からの通話音声を処理対象とすればよい。これにより、２名以上の複数の話者が参加した電話会議、オンライン会議の場合であっても、（１）事後訂正（あいまいな発言部分の訂正）、（２）事前補足（不明発言部分の補足）、（３）事前防止（不適切発言部分の上書き）の各機能を用いるようにできる。 For this reason, the conversation support processing units 120A, 120B, and 120C of the telephone control devices 1A, 1B, and 1C described above can be provided in a telephone control device or cloud PBX equipped with a telephone conference function, or in a conference server on the IP network 62, and the call voices from each speaker can be processed. This makes it possible to use the following functions even in a telephone conference or online conference in which two or more speakers participate: (1) post-correction (correction of ambiguous remarks), (2) advance supplementation (supplementation of unclear remarks), and (3) advance prevention (overwriting of inappropriate remarks).

［実施の形態の効果］ [Effects of the embodiment]
この発明によれば、２者間の通話や電話会議、オンライン会議を行う場合に、各話者（参加者）を適切に支援できる。これにより、話者全員の利便性を向上させると共に、２者間の通話、電話会議、オンライン会議の質の向上を実現できる。 According to the present invention, when a two-party call, a telephone conference, or an online conference is held, each speaker (participant) can be appropriately supported. This improves the convenience of all speakers and improves the quality of the two-party call, the telephone conference, or the online conference.

［変形例］ [Modifications]
上述した第１、第２、第３の実施の形態の電話制御装置は、それぞれ異なる機能を備えるものとして説明したが、これに限るものではない。会話支援処理部１２０Ａ、１２０Ｂ、１２０Ｃを備えることにより、（１）事後訂正（あいまいな発言部分の訂正）機能、（２）事前補足（不明発言部分の補足）機能、（３）事前防止（不適切発言部分の上書き）機能の３つの機能を備えた電話制御装置を実現できる。もちろん、（１）事後訂正機能、（２）事前補足機能、（３）事前防止機能の内の２つの機能を備えるようにすることも可能である。電話会議機能を備えた電話制御装置やクラウドＰＢＸ、また、オンライン会議を実現するための会議サーバについても同様である。 The telephone control devices of the first, second, and third embodiments have been described as each having a different function, but the invention is not limited to this. By providing the conversation support processing units 120A, 120B, and 120C together, a telephone control device having all three functions, namely (1) the post-correction function (correction of ambiguous remarks), (2) the advance supplement function (supplementation of unclear remarks), and (3) the advance prevention function (overwriting of inappropriate remarks), can be realized. Of course, it is also possible to provide any two of these three functions. The same applies to telephone control devices and cloud PBXs with a telephone conference function, and to conference servers for realizing online conferences.

また、上述した実施の形態では、電話制御装置、クラウドＰＢＸ、会議サーバなどにこの発明が適用可能であることを説明したが、これに限るものではない。例えば、社内とIＰ網とを接続するゲートウェイ装置にこの発明を適用することもできる。要は、複数の話者間で、種々のネットワークを通じて音声情報を送受する通話を行う場合に、各話者からの音声情報を中継する種々の装置に対して、この発明を適用することができる。 In addition, in the above-mentioned embodiment, it has been explained that the present invention can be applied to telephone control devices, cloud PBXs, conference servers, etc., but the present invention is not limited to these. For example, the present invention can also be applied to a gateway device that connects a company to an IP network. In short, when a call is made between multiple speakers in which voice information is sent and received through various networks, the present invention can be applied to various devices that relay voice information from each speaker.

また、上述した第３の実施の形態においては、不適切な発言は、相手を馬鹿にする言葉、差別的な言葉、不穏当な言葉、公序良俗に反する言葉など、相手が不快に感じるような種々の文言であるものとして説明した。しかし、これに限るものではない。これらの言葉に該当しなくても、相手先に伝えたくない文言がある場合には、これを不適切文言辞書１０５Ｃに登録しておくことにより、これを相手先に伝えないようにすることができる。例えば、相手先が誰かは、電話番号や接続ＩＤなどの各話者に固有の識別情報により分かる。このため、相手先に応じて、相手先が気にしたり、嫌がったりする文言を不適切文言辞書１０５Ｃに登録しておくことで、相手先に応じて不適切となる文言を伝えないようにすることができる。 In the third embodiment described above, inappropriate remarks are described as various words that make the other party feel uncomfortable, such as words that make the other party feel insulted, discriminatory words, inappropriate words, and words that go against public order and morals. However, this is not limited to these. Even if there is a word that does not fall under these words but that you do not want to convey to the other party, you can register it in the inappropriate word dictionary 105C so that it will not be conveyed to the other party. For example, the identity of the other party can be determined from identification information unique to each speaker, such as a telephone number or a connection ID. Therefore, by registering words that the other party may be concerned about or dislike in the inappropriate word dictionary 105C depending on the other party, you can prevent inappropriate words from being conveyed to the other party.

また、第１、第２の実施の形態においても、電話番号や接続ＩＤなどの各話者に固有の識別情報により、あるいは、通話の始めの挨拶などの通話音声のテキストデータを解析することにより、通話の相手先は誰なのかを把握することができる。当該解析は、あいまい発言検出部１２１２や不明発言検出部１２１４で行えばよい。このため、第１の実施の形態の電話制御装置１Ａにおいては、通話の相手先が誰なのかを考慮して、あいまいな発言部分を検出したり、キーワードを抽出したり、検索を行ったりすることができる。また、第２の実施の形態の電話制御装置１Ｂにおいては、通話の相手先が誰なのかを考慮して、不明な発言部分を検出したり、キーワードを抽出したり、検索を行ったりすることができる。 Also, in the first and second embodiments, the other party to the call can be identified from identification information unique to each speaker, such as a telephone number or a connection ID, or by analyzing the text data of the call voice, such as the greeting at the beginning of the call. This analysis can be performed by the ambiguous utterance detection unit 1212 or the unclear utterance detection unit 1214. Therefore, in the telephone control device 1A of the first embodiment, ambiguous utterance portions can be detected, keywords extracted, and searches performed while taking into account who the other party is. Likewise, in the telephone control device 1B of the second embodiment, unclear utterance portions can be detected, keywords extracted, and searches performed while taking into account who the other party is.

また、第１の実施の形態では、訂正文（テキストデータ）を音声情報に変換して提供した。また、第２の実施の形態では、補足文（テキストデータ）を音声情報に変換して、提供した。また、第３の実施の形態では、ガイダンスメッセージを音声情報に変換して提供した。しかし、これに限るものではない。訂正文、補足文、ガイダンスメッセージの提供先が、テキストデータを受信して、表示出力したり、音声情報に変換して出力したりすることができる機能を備える場合には、訂正文、補足文、ガイダンスメッセージをテキストデータとして提供先に提供してもよい。 In the first embodiment, the correction text (text data) is converted into audio information and provided. In the second embodiment, the supplementary text (text data) is converted into audio information and provided. In the third embodiment, the guidance message is converted into audio information and provided. However, this is not limited to this. If the recipient of the correction text, supplementary text, or guidance message has a function that can receive text data and display it or convert it into audio information and output it, the correction text, supplementary text, or guidance message may be provided to the recipient as text data.

特に、会議サーバを通じて、ＰＣ（Personal Computer）やタブレットＰＣ、スマートフォンを用いてオンライン会議を行う場合には、訂正文、補足文、ガイダンスメッセージをテキストデータとして提供先に提供する。これにより、提供先においては、訂正文、補足文、ガイダンスメッセージを、用いているＰＣ、タブレットＰＣ、スマートフォンのディスプレイに表示し、使用者に提供できる。これにより、音声による会議の邪魔をすることなく、訂正文、補足文、ガイダンスメッセージをテキストデータとして、適切なタイミングで目的とする提供先の使用者に提供できる。 In particular, when an online conference is held using a PC (Personal Computer), tablet PC, or smartphone through a conference server, corrections, supplementary text, and guidance messages are provided to the intended recipient as text data. As a result, the recipient can display the corrections, supplementary text, and guidance messages on the display of the PC, tablet PC, or smartphone being used and provide them to the user. This allows corrections, supplementary text, and guidance messages to be provided to the intended recipient user as text data at the appropriate time without disrupting the audio conference.
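The choice between text and audio delivery can be sketched as follows. The `Recipient` type, its `accepts_text` flag, and the `synthesize` callback are assumptions made for the example; the embodiment does not specify how a terminal's capability is recorded or which speech synthesis function is used.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass
class Recipient:
    terminal_id: str
    accepts_text: bool  # hypothetical flag: terminal can receive and display text data


def deliver(
    message: str,
    recipient: Recipient,
    synthesize: Callable[[str], bytes],
) -> Tuple[str, Union[str, bytes]]:
    """Return ("text", message) or ("audio", synthesized bytes) for this recipient."""
    if recipient.accepts_text:
        # PCs, tablet PCs, and smartphones in an online conference: show the message on screen.
        return ("text", message)
    # Ordinary telephone terminals: convert the message to voice information first.
    return ("audio", synthesize(message))
```

Passing the synthesis function in as a parameter keeps the sketch independent of any particular text-to-speech engine.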

［その他］ [Others]
上述した実施の形態の説明からも分かるように、請求項の音声認識手段の機能は、実施の形態の電話制御装置１Ａ、１Ｂ、１Ｃの音声認識部１２１１が実現している。また、請求項の検出手段の機能は、電話制御装置１Ａのあいまい発言検出部１２１２が、電話制御装置１Ｂの不明発言検出部１２１４が、電話制御装置１Ｃの不適切発言検出部１２１６がそれぞれ実現している。また、請求項の作成手段の機能は、電話制御装置１Ａでは、キーワード抽出部１２１３、検索実行部１２３、訂正文作成部１２５が実現している。同様に、請求項の作成手段の機能は、電話制御装置１Ｂでは、キーワード抽出部１２１３、検索実行部１２３、補足文作成部１２８が、電話制御装置１Ｃでは、不適切発言上書き部１２１７が、それぞれ実現している。また、請求項の提供手段の機能は、電話制御装置１Ａの訂正提供部１２６が、電話制御装置１Ｂの補足提供部１２９が、電話制御装置１Ｃの不適切発言上書き部１２１７がそれぞれ実現している。 As can be seen from the above description of the embodiment, the function of the voice recognition means in the claims is realized by the voice recognition unit 1211 of the telephone control devices 1A, 1B, and 1C in the embodiment. Also, the function of the detection means in the claims is realized by the ambiguous remark detection unit 1212 of the telephone control device 1A, the unclear remark detection unit 1214 of the telephone control device 1B, and the inappropriate remark detection unit 1216 of the telephone control device 1C. Also, the function of the creation means in the claims is realized by the keyword extraction unit 1213, the search execution unit 123, and the correction sentence creation unit 125 in the telephone control device 1A. Similarly, the function of the creation means in the claims is realized by the keyword extraction unit 1213, the search execution unit 123, and the supplementary sentence creation unit 128 in the telephone control device 1B, and the inappropriate remark overwriting unit 1217 in the telephone control device 1C. The function of the providing means in the claims is realized by the correction providing unit 126 of the telephone control device 1A, the supplement providing unit 129 of the telephone control device 1B, and the inappropriate comment overwriting unit 1217 of the telephone control device 1C, respectively.

また、図３、図５、図７のシーケンス図を用いて説明した電話制御装置１Ａ、１Ｂ、１Ｃで行われる処理が、この発明の通話支援方法の一実施の形態が適用されたものである。 The processes performed by the telephone control devices 1A, 1B, and 1C described using the sequence diagrams in Figures 3, 5, and 7 are an embodiment of the call support method of the present invention.

１Ａ、１Ｂ、１Ｃ…電話制御装置、１０１Ｔ…接続端、１０１…電話網Ｉ／Ｆ、１０２…制御装置、１０３…記憶装置、１０４…端末管理ファイル、１０５Ａ…あいまい発言辞書、１０５Ｂ…不明状況辞書、１０５Ｃ…不適切文言辞書、１０６…検索先判定辞書、１０７Ｔ…接続端、１０７…内線Ｉ／Ｆ、１０８Ｔ…接続端、１０８…ＬＡＮＩ／Ｆ、１０９…呼制御部、１０９Ｓ…発信制御部、１０９Ｒ…着信制御部、１１０Ｔ…接続端、１１０…通信Ｉ／Ｆ、１２０Ａ…会話支援処理部、１２１Ａ…音声処理部、１２１１…音声認識部、１２１２…あいまい発言検出部、１２１３…キーワード抽出部、１２２…検索先判定部、１２３…検索実行部、１２４…正誤判定部、１２５…訂正文作成部、１２６…訂正制御部、１２０Ｂ…会話支援処理部、１２１４…不明発言検出部、１２１５…キーワード抽出部、１２７…検索成否判定部、１２８…補足文作成部、１２９…補足提供部、１２０Ｃ…会話支援処理部、１２１６…不適切発言検出部、１２１７…不適切発言上書き部、１３１…ガイダンス作成部、１３２…ガイダンス提供部、２…内線電話網、３、３（１）、３（２）、３（３）…電話端末、４…ＬＡＮ、５（１）…内部情報サーバ、６…広域ネットワーク、６１…外線電話網、６２…ＩＰ網、７、７（１）…外線電話端末、８（１）…情報提供サーバ 1A, 1B, 1C...telephone control device, 101T...connection end, 101...telephone network I/F, 102...control device, 103...storage device, 104...terminal management file, 105A...ambiguous speech dictionary, 105B...unknown situation dictionary, 105C...inappropriate word dictionary, 106...search destination determination dictionary, 107T...connection end, 107...extension I/F, 108T...connection end, 108...LAN I/F, 109...call control unit, 109S...outgoing call control unit, 109R...incoming call control unit, 110T...connection end, 110...communication I/F, 120A...conversation support processing unit, 121A...voice processing unit, 1211...voice recognition unit, 1212...ambiguous speech detection unit, 1213...keyword extraction unit, 122...search destination determination unit, 123...search execution unit, 124...correction judgment unit, 125...correction sentence creation unit, 126...correction control unit, 120B...conversation support processing unit, 1214...unclear remark detection unit, 1215...keyword extraction unit, 127...search success/failure judgment unit, 128...supplement sentence creation unit, 129...supplement provision unit, 120C...conversation support processing unit, 1216...inappropriate remark detection unit, 1217...inappropriate remark overwriting unit, 131...guidance creation unit, 132...guidance provision unit, 2...extension telephone network, 3, 3(1), 3(2), 3(3)...telephone terminal, 4...LAN, 5(1)...internal information server, 6...wide area network, 61...external telephone network, 62...IP network, 7, 7(1)...external telephone terminal, 8(1)...information provision server

Claims

A communication support device that relays voice information from each speaker when a call is made between a plurality of speakers by transmitting and receiving voice information over a network, comprising:
a speech recognition means for converting speech information from each speaker into text data;
a detection means for analyzing the text data from the speech recognition means and detecting a predetermined portion requiring assistance;
a creating means for creating a message corresponding to the predetermined portion when the detecting means detects the predetermined portion, or for creating processed voice information by processing a portion of the voice information corresponding to the predetermined portion;
and a providing means for converting the message into text information or into audio information and providing the converted audio information to a speaker who provided at least the audio information from which the specified portion was detected, and for providing the processed audio information to a speaker other than the speaker who provided the audio information from which the specified portion was detected.

The communication support device according to claim 1,
The detection means detects an ambiguous utterance portion,
The creating means includes:
a first extraction means for extracting a search keyword for searching for an accurate content of the ambiguous utterance portion from the text data from the voice recognition means;
a first search means for searching information disclosed on a predetermined network by using the search keyword extracted by the first extraction means;
and a correction message creation means for creating a correction message as the message based on a search result from the first search means,
wherein the providing means provides the correction sentence as text information or by converting the correction sentence into voice information.

The communication support device according to claim 1,
The detection means detects an unclear utterance portion,
The creating means includes:
a second extraction means for extracting search keywords for searching for appropriate content for the unclear utterance portion from the text data from the speech recognition means;
a second search means for performing a search for information disclosed on a predetermined network by using the search keyword extracted by the second extraction means;
and a supplemental text creating means for creating a supplemental text as the message based on a search result from the second search means,
wherein the providing means provides the supplementary sentence as text information or by converting the supplementary sentence into audio information.

The communication support device according to claim 1,
The detection means detects an inappropriate comment portion,
The creating means functions as an overwriting means for creating the processed voice information by overwriting the portion of the voice information from one speaker that corresponds to the inappropriate comment portion with silence or other information.

A method for supporting a call used in a call support device that relays voice information from each speaker when a call is made between a plurality of speakers by transmitting and receiving voice information over a network, comprising:
a speech recognition step in which speech recognition means converts speech information from each speaker into text data;
a detection step in which a detection means analyzes the text data converted in the speech recognition step and detects a predetermined portion requiring assistance;
a creating step in which a creating means creates a message corresponding to the predetermined portion when the predetermined portion is detected in the detecting step, or creates processed voice information by processing a portion of the voice information corresponding to the predetermined portion;
a providing step in which a providing means converts the message into text information or into voice information and provides the message to a speaker who provided at least the voice information from which the specified portion was detected, or provides the processed voice information to a speaker other than the speaker who provided the voice information from which the specified portion was detected.