JP2001092845A

JP2001092845A - Document acquisition method and recording medium

Info

Publication number: JP2001092845A
Application number: JP27193199A
Authority: JP
Inventors: Akihiko Sugikawa; 明彦杉川
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1999-09-27
Filing date: 1999-09-27
Publication date: 2001-04-06

Abstract

(57) [Abstract] (amended)

[Problem] To provide a document acquisition method that recognizes a user's voice and acquires the next linked document over a network.

[Solution] The method comprises: a dictionary creation step of analyzing an acquired document and registering, in a speech recognition dictionary, corresponding word information that can be associated with a next link destination; a table creation step of associating the corresponding word information with an identifier for acquiring the next linked document and registering the pair in a management table; a speech recognition step of recognizing speech input by the user; an identifier acquisition step of obtaining an identifier from the management table; a narrowing change step of changing the contents of the speech recognition dictionary and the management table; and a narrowing step of narrowing down the number of identifiers based on the changed speech recognition dictionary and management table.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires the next linked document from a document, such as hypertext, obtained over a telecommunication line such as the Internet, and to a recording medium on which a program for the method is recorded.

[0002]

[Prior art] In recent years, hypertext has been used as a way to easily obtain information on the WWW (World Wide Web), and access to related information is provided through a client called a browser.

[0003] The browser requests a server to send the information by using a unique identifier called a URL.

[0004] Hypertext is defined by a grammar called HTML (Hyper Text Markup Language). It consists of the on-screen layout and style of objects such as text and photographs, definitions of anchors (symbols that indicate the existence of a link to other information), the correspondence between each anchor and the URL that identifies its link destination, and scripts that perform specific control.

[0005] The browser distinguishes each of these elements by means of tags, and processes the content in the range delimited by a tag's start and end symbols according to the interpretation defined for that tag.

[0006] When the browser obtains an HTML document, it analyzes the document, creates data for output to the display means based on the tags, and displays part or all of that information.

[0007] An anchor is rendered differently from other content so that the user can easily recognize it as an anchor. Usually, it is displayed with a different font color or with an underline. The browser also manages the link destination information (URL) associated with each anchor. When the user designates a specific anchor with a pointing device or keyboard, the browser uses the URL corresponding to that anchor to request the server uniquely determined by the URL to send the information.

[0008] In this way, the user can access related information in succession by designating anchors. Such a system is usually implemented as a program on an information processing apparatus having communication means, distributed on a recording medium or over a communication line, and stored in the storage means of the information processing apparatus.

[0009] Recently, apparatuses and methods that use speech recognition as the means of designating an anchor have been proposed. This is because, on a portable information device, it is difficult to secure a place to operate a pointing device, or the device is difficult to operate.

[0010] One proposal addressing this problem is Conventional Example 1 (Japanese Patent Laid-Open No. H10-124293). Conventional Example 1 is described with reference to the flowchart of FIG. 13.

[0011] In Conventional Example 1, an HTML document is received (step 401); the HTML document is analyzed to obtain each anchor and its link destination URL (step 402); a reading dictionary database is accessed to create reading data for the anchor character string (step 403); the obtained reading data is registered in the recognition dictionary (step 404); a pair of the reading data and the link destination information is created and registered in a table (step 405); and the system waits for speech input from the user (step 406).

[0012] When the user reads out the desired anchor, the speech recognition means processes the input speech signal to create data for recognition, compares it with the data registered in the recognition dictionary, and assigns a score to each entry. If a predetermined threshold is exceeded, the entry with the highest score is output as the recognition result (step 407).

[0013] The system searches the above table using the recognition result as a key, obtains the link destination information (step 408), and uses it to send an information request to the specific server (step 409).
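Steps 407 and 408 of Conventional Example 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the scores are taken as already computed by a recognizer (not shown), and the names `LINK_TABLE`, `pick_result`, and `resolve_link`, the romanized reading, the example URL, and the threshold value are all hypothetical. Step 409 (sending the request to the server) is not covered.

```python
from typing import Optional

# Hypothetical table from step 405: reading data -> link destination URL.
LINK_TABLE = {"honjitsu no toppu": "http://www.example.com/news1.htm"}


def pick_result(scores: dict[str, float], threshold: float) -> Optional[str]:
    """Step 407: return the highest-scoring entry if it exceeds the threshold."""
    if not scores:
        return None
    best = max(scores, key=lambda reading: scores[reading])
    return best if scores[best] > threshold else None


def resolve_link(scores: dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Step 408: use the recognition result as the key into the table."""
    reading = pick_result(scores, threshold)
    if reading is None:
        return None  # recognition rejected; no request would be sent
    return LINK_TABLE.get(reading)
```

Because the whole anchor text is the only key, a long anchor has to be read out in full for the lookup to succeed.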

[0014] Related information can thus be accessed by speech input. However, the user must read out the anchor text itself. When the anchor is long, this burdens the user and increases the likelihood of misreading, which may cause speech recognition to fail.

[0015] To solve this problem, Conventional Example 2 (Japanese Patent Laid-Open No. H11-25098) was proposed. Conventional Example 2 is described with reference to the flowchart of FIG. 14.

[0016] Conventional Example 2 does not use the anchor text as it is. Instead, the HTML document is analyzed to obtain the anchors (step 411); morphological analysis is performed on each anchor to divide its sentence into multiple words and phrases, from which only the noun phrases are selected (step 412); from the selected noun phrases, a single noun phrase that uniquely determines the anchor text is chosen (step 413); reading data is added to the chosen word (step 414); the reading data is registered in the speech recognition dictionary (step 415); and a table is created by pairing it with the link destination information (step 414).
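The unique noun phrase selection of step 413 can be sketched as below. The source does not say how the unique phrase is chosen, so the greedy "first noun phrase not yet used by an earlier anchor" rule is an assumption, as are the function name `choose_unique_nouns` and the English stand-in data. The noun phrases are assumed to have been extracted already by a morphological analyzer, which is not shown.

```python
from typing import Optional


def choose_unique_nouns(anchors: dict[str, list[str]]) -> dict[str, Optional[str]]:
    """Pick, for each anchor, one noun phrase not used by any earlier anchor.

    `anchors` maps anchor text to its noun phrases in document order.
    An anchor maps to None when no unused noun phrase remains.
    """
    used: set[str] = set()
    chosen: dict[str, Optional[str]] = {}
    for anchor, nouns in anchors.items():
        pick = next((noun for noun in nouns if noun not in used), None)
        chosen[anchor] = pick
        if pick is not None:
            used.add(pick)
    return chosen


# Hypothetical page with three anchors.
example = {
    "today's top": ["today", "top"],
    "today's news": ["today", "news"],
    "today": ["today"],
}
result = choose_unique_nouns(example)
```

In this example the third anchor ends up with `None`, because its only noun phrase was already consumed by the first anchor. An anchor with an empty noun phrase list would end up with `None` in the same way.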

[0017] As a result, the user need only utter the noun phrase rather than read out the whole anchor text, which reduces both the burden on the user and the occurrence of recognition errors compared with the previous system.

[0018] Conventional Example 2 also proposes, as a feature, emphasizing the characters of the relevant part of the anchor or changing their color so that the noun phrase is easy to identify.

[0019] However, this approach cannot cope with problems that arise in actual use: only the noun phrase presented by the system is accepted; if all the noun phrases contained in an anchor have already been used to identify preceding anchors, no noun phrase can be selected to uniquely determine that anchor; an anchor may contain no noun phrase at all; reading data for the unique noun phrase may not exist; and reading data for infrequently used words such as personal names, place names, and technical terms may be missing because of dictionary size limits.

[0020]

[Problems to be solved by the invention] As described above, the conventional techniques have the problem that only a specific noun phrase can be used, and the problem that the link destination cannot be accessed when no valid noun phrase can be selected.

[0021] The present invention therefore provides an invention in which, when speech input by a user is recognized and the next linked document is acquired, based on the recognition result, from a document obtained over a telecommunication line, the result can be narrowed down to a single identifier even if a plurality of identifiers are obtained from the information the user specified by voice.

[0022] It also provides an invention that can obtain the identifier even when the analyzed corresponding word information does not correspond to the identifier.

[0023] It further provides an invention that can present only the necessary information to the user according to the content that is displayed, or not displayed, on the display unit.

[0024] An object of the present invention is to provide a method that realizes access to a link destination by accepting one or more words or phrases, and that realizes access to link destination information even when no valid word or phrase can be selected for an anchor.

[0025]

[Means for solving the problems] The invention of claim 1 is a document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires the next linked document from a document obtained over a telecommunication line, the method comprising: a dictionary registration step of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation step of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with an identifier for acquiring the next linked document; a speech recognition step of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition step of obtaining an identifier from the management table based on the recognition result of the speech recognition step; the method being characterized by further comprising: a narrowing change step of, when the identifier acquisition step yields a plurality of identifiers, changing the contents of the speech recognition dictionary and the management table based on the plurality of identifiers so that the number of identifiers can be further narrowed down from among them; and a narrowing step of narrowing down the number of identifiers by performing the speech recognition step and the identifier acquisition step based on the changed speech recognition dictionary and management table.

[0026] The invention of claim 2 is a document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires the next linked document from a document obtained over a telecommunication line, the method comprising: a dictionary registration step of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation step of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with an identifier for acquiring the next linked document; a speech recognition step of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition step of obtaining an identifier from the management table based on the recognition result of the speech recognition step; the method being characterized by further comprising: a correspondence determination step of determining whether the analyzed corresponding word information can be associated with an identifier; and a reading information creation step of, when the association is not possible, creating special reading information, which is data to be registered in the speech recognition dictionary and corresponds to the analyzed corresponding word information, and registering it in the speech recognition dictionary.

[0027] The invention of claim 3 is a document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires the next linked document from a document obtained over a telecommunication line, the method comprising: a dictionary registration step of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation step of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with an identifier for acquiring the next linked document; a speech recognition step of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition step of obtaining an identifier from the management table based on the recognition result of the speech recognition step; the method being characterized by further comprising: a display target determination step of determining whether content is displayed on the screen of a display unit that presents the document to the user; and a display registration change step of changing the corresponding word information registered in the speech recognition dictionary according to the content that is displayed, or not displayed, on the screen of the display unit.

[0028] The invention of claim 4 is a recording medium on which is recorded a program implementing a document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires the next linked document from a document obtained over a telecommunication line, the program implementing: a dictionary registration function of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation function of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with an identifier for acquiring the next linked document; a speech recognition function of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition function of obtaining an identifier from the management table based on the recognition result of the speech recognition function; the recording medium being characterized in that the recorded program further implements: a narrowing change function of, when the identifier acquisition function yields a plurality of identifiers, changing the contents of the speech recognition dictionary and the management table based on the plurality of identifiers so that the number of identifiers can be further narrowed down from among them; and a narrowing function of narrowing down the number of identifiers by performing the speech recognition function and the identifier acquisition function based on the changed speech recognition dictionary and management table.

[0029] The invention of claim 5 is a recording medium on which is recorded a program implementing a document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires the next linked document from a document obtained over a telecommunication line, the program implementing: a dictionary registration function of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation function of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with an identifier for acquiring the next linked document; a speech recognition function of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition function of obtaining an identifier from the management table based on the recognition result of the speech recognition function; the recording medium being characterized in that the recorded program further implements: a correspondence determination function of determining whether the analyzed corresponding word information can be associated with an identifier; and a reading information creation function of, when the association is not possible, creating special reading information, which is data to be registered in the speech recognition dictionary and corresponds to the analyzed corresponding word information, and registering it in the speech recognition dictionary.

[0030] The invention of claim 6 is a recording medium on which is recorded a program implementing a document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires the next linked document from a document obtained over a telecommunication line, the program implementing: a dictionary registration function of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation function of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with an identifier for acquiring the next linked document; a speech recognition function of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition function of obtaining an identifier from the management table based on the recognition result of the speech recognition function; the recording medium being characterized in that the recorded program further implements: a display target determination function of determining whether content is displayed on the screen of a display unit that presents the document to the user; and a display registration change function of changing the corresponding word information registered in the speech recognition dictionary according to the content that is displayed, or not displayed, on the screen of the display unit.

[0031] According to the inventions of claims 1 and 4, even if a plurality of identifiers are obtained from the corresponding word information the user specified by voice, the identifiers can be narrowed down further by changing the contents of the management table and having the user input speech again in accordance with the changed contents.

[0032] According to the inventions of claims 2 and 5, even when the analyzed corresponding word information does not correspond to an identifier, special reading information is created, and when the user utters this special reading information, the identifier corresponding to it can be obtained.

[0033] According to the inventions of claims 3 and 6, corresponding word information is registered according to the content that is displayed, or not displayed, on the display unit, so only the necessary information is presented to the user.

[0034]

[Embodiments of the invention] (First Embodiment) A first embodiment of the present invention is described with reference to FIGS. 1 to 7.

[0035] 1. Configuration

FIG. 2 is an example of a block diagram showing the configuration of the information processing apparatus of the first embodiment.

[0036] The embodiment is implemented as a program that runs on an information processing apparatus comprising, as shown in FIG. 2, arithmetic means 1, display means 2, storage means 3, communication means 4, a keyboard 5, instruction means 6, and voice input means 7. The program is read into the apparatus from a recording medium such as a CD-ROM or FD, or via the communication means 4.

[0037] The display means 2 comprises a video memory that temporarily stores display contents, a D/A converter, and a CRT, liquid crystal display, or similar device that presents the display contents to the user.

[0038] The storage means 3 uses devices such as RAM, used for temporary data storage, and a hard disk for permanent storage. The RAM holds this program and working variables such as the recognition dictionary data and the link management table. The hard disk stores a cache of received HTML documents, the phoneme data dictionary used for speech recognition, the reading data dictionary, and the like.

[0039] The communication means 4 is implemented with a cabled connection such as wired LAN, RS232C, Centronics, SCSI, USB, or IEEE1394, or with a cableless connection such as wireless LAN, infrared communication, or PHS data communication. It communicates, for example, over the Internet.

[0040] The instruction means 6 is implemented with a mouse, a trackball, an AccuPoint, pen input overlaid on the display means, or the like.

[0041] The voice input means 7 is connected to a microphone or an external amplifier and converts the input audio signal into a digital signal by A/D conversion. The input data is stored in memory.

[0042] The arithmetic means 1, also called a CPU, is connected to each of the above means via a bus 9 and exchanges data with them. It also serves as the speech recognition means.

[0043] Analyzing an HTML document and displaying it on the screen can be done by using a componentized browser module.

[0044] 2. Processing

FIG. 1 shows a flowchart that is one example of how the first embodiment may be implemented.

[0045] The program acquires the designated HTML document from the server 8 via the communication means 4 over a telecommunication line such as the Internet, and records it in memory (step 101).

[0046] Next, the HTML document is analyzed based on its tag information (step 102).

[0047] FIG. 3 shows an example of an HTML document used to explain this embodiment. To simplify the description, tag information not related to this embodiment is omitted.

[0048] For example, analyzing the following element from the example

<A href="http://www.zzz.com.news/news1.htm">本日のトップ</A>

yields the anchor "本日のトップ" ("Today's Top") and the link destination identifier (URL) "http://www.zzz.com.news/news1.htm".

[0049] Each piece of data can be extracted by searching the character string for symbols such as <, >, and ".
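The anchor and URL extraction of step 102 can be sketched as follows. Note that this is not the symbol-by-symbol string search the text describes: the sketch swaps in Python's standard `html.parser` module for robustness. The class name `AnchorExtractor` and the function `extract_anchors` are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class AnchorExtractor(HTMLParser):
    """Collect (anchor text, URL) pairs from <A href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []    # completed (anchor_text, url) pairs
        self._href = None  # URL of the anchor currently open, if any
        self._text = []    # text fragments seen inside the open anchor

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF> matches too.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_anchors(html):
    parser = AnchorExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


sample = '<A href="http://www.zzz.com.news/news1.htm">本日のトップ</A>'
pairs = extract_anchors(sample)
# pairs == [("本日のトップ", "http://www.zzz.com.news/news1.htm")]
```

Anchors without an `href` attribute are skipped, since they have no link destination to register.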

[0050] Next, the anchor character string is compared with dictionary data consisting of grammar and part-of-speech information, and morphological analysis is performed. Taking "本日のトップ" as an example, it is divided into "本日" (today), "の" (of), and "トップ" (top); selecting only the noun phrases yields the words "本日" and "トップ" as keywords (step 103).

[0051] The words "本日" and "トップ" selected in step 103 are temporarily recorded in a work area and compared with a reading dictionary of words and phrases to obtain the reading data of each word, "ほんじつ" (honjitsu) and "とっぷ" (toppu) (step 104).

【００５２】ステップ１０４で取得した読みデータを音
声認識用の辞書のデータに順次追加する（ステップ１０
５）。音声認識用の辞書のデータは、表示するＨｔｍｌ
文章が変更された場合に、全て削除する。また、同一の
単語を登録しても無意味なので登録前に、重複検査を行
う。これは辞書に登録した単語を作業エリアに記録し、
登録する単語をキーにして検索することにより実現す
る。The reading data obtained in step 104 is added sequentially to the speech recognition dictionary (step 105). All dictionary data is deleted whenever the displayed HTML document changes. Because registering the same word twice is pointless, a duplicate check is performed before registration: the words already registered in the dictionary are recorded in the work area, and that record is searched using the word to be registered as the key.
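
A minimal sketch of the registration in step 105 with its duplicate check might look like this. The class name `RecognitionDictionary` and the use of a set as the work-area record are assumptions made for illustration.

```python
class RecognitionDictionary:
    """Word list for the recognizer, with a duplicate check before registration."""

    def __init__(self):
        self.words = []     # readings in registration order
        self._seen = set()  # work-area record of words already registered

    def register(self, reading: str) -> bool:
        """Add a reading unless it is already registered; return True if added."""
        if reading in self._seen:
            return False
        self._seen.add(reading)
        self.words.append(reading)
        return True

    def clear(self):
        """Delete everything, as is done when the displayed document changes."""
        self.words.clear()
        self._seen.clear()


dictionary = RecognitionDictionary()
added = [dictionary.register(r) for r in ["ほんじつ", "にゅーす", "にゅーす"]]
```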

【００５３】図４に音声認識用の辞書に登録する単語の
データを示す。FIG. 4 shows word data to be registered in the dictionary for speech recognition.

【００５４】Ｈｔｍｌ文章の解析により取得した”ほん
じつ”、”とっぷ”、”にゅーす”、”こくない”、”
かいがい”などのキーワードのほかに、システム制御用
の単語もあわせて登録しておく。制御用単語の役割につ
いては後で説明する。In addition to keywords obtained by analyzing the HTML document, such as "ほんじつ" (honjitsu), "とっぷ" (toppu), "にゅーす" (nyūsu), "こくない" (kokunai), and "かいがい" (kaigai), words for system control are also registered. The role of the control words is described later.

【００５５】読みデータの取得後、読みデータ、識別
子、アンカー文字列を１組として管理テーブルに登録す
る（ステップ１０６）。図５に管理テーブルに登録する
情報を示す。After obtaining the read data, the read data, the identifier, and the anchor character string are registered as a set in the management table (step 106). FIG. 5 shows information registered in the management table.

【００５６】管理テーブルへの登録は、読みデータをキ
ーにして登録することにより後で高速に検索することが
可能となる。なお、読みデータを登録せず、この読みデ
ータに対応するインデックスを登録しておいてもよい。Entries are registered in the management table with the reading data as the key, which allows fast lookup later. Instead of the reading data itself, an index corresponding to the reading data may be registered.
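
One way to picture a management table keyed by reading data is a hash map from each reading to its (identifier, anchor) pairs. The sketch below is an assumption about the layout, not the table of FIG. 5, and the anchor strings "国内ニュース" and "海外ニュース" are hypothetical examples inferred from the readings.

```python
from collections import defaultdict

NEWS1 = "http://www.zzz.com.news/news1.htm"
NEWS2 = "http://www.zzz.com.news/news2.htm"
NEWS3 = "http://www.zzz.com.news/news3.htm"

# (reading data, identifier, anchor string); anchors for news2/news3 are assumed.
ROWS = [
    ("ほんじつ", NEWS1, "本日のトップ"),
    ("とっぷ", NEWS1, "本日のトップ"),
    ("こくない", NEWS2, "国内ニュース"),
    ("にゅーす", NEWS2, "国内ニュース"),
    ("かいがい", NEWS3, "海外ニュース"),
    ("にゅーす", NEWS3, "海外ニュース"),
]


def build_management_table(rows):
    """Step 106: group (identifier, anchor) pairs under their reading."""
    table = defaultdict(list)
    for reading, url, anchor in rows:
        table[reading].append((url, anchor))
    return table


table = build_management_table(ROWS)
```

Keying on the reading means the recognizer's output can be used directly as the lookup key, which is the speed benefit the paragraph describes.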

【００５７】ステップ１０３からステップ１０６までの
手順をアンカー内の名詞句の個数分実行する。全ての名
詞句の検査終了後にステップ１０２にもどり、Ｈｔｍｌ
文章の解析を継続する。Steps 103 to 106 are executed once for each noun phrase in the anchor. After all noun phrases have been processed, the process returns to step 102 and continues analyzing the HTML document.

【００５８】Ｈｔｍｌ文章の全ての内容を解析後、演算
手段１の音声認識手段を有効にする命令を送信する。こ
れによりユーザが音声を発話するまでシステムは待機状
態となる（ステップ１０７）。After analyzing all the contents of the HTML text, a command to activate the voice recognition means of the arithmetic means 1 is transmitted. As a result, the system enters a standby state until the user speaks (step 107).

【００５９】演算手段１の音声認識手段における音声認
識の実現方法の該略を説明する。The outline of the method of implementing speech recognition in the speech recognition means of the arithmetic means 1 will be described.

【００６０】一例として、隠れマルコフモデルを使用す
る。As an example, a hidden Markov model is used.

【００６１】ユーザの発話により音声入力手段７からの
信号の入力があると、AD変換を介してデジタルデータと
してメモリに記録し、予め決められた時間に相当する信
号をフーリエ変換により周波数信号に変換し、特徴デー
タを作成する。When a signal is input from the voice input means 7 by the user's utterance, the signal is recorded in the memory as digital data via AD conversion, and a signal corresponding to a predetermined time is converted into a frequency signal by Fourier transform. And create feature data.

【００６２】また、音声認識用の辞書に登録した単語
と、音素モデルから各単語の音素モデルを作成し、前記
の複数のデータからなる特徴データ列との距離計算を行
い、各単語についてのスコアーを求める。A model of each word is built from the words registered in the speech recognition dictionary and the phoneme models, the distance between each word model and the feature data sequence described above is calculated, and a score is obtained for each word.

【００６３】計算結果が予め決められた閾値を超える場
合には、最良のスコアが得られた単語を認識結果として
出力する。閾値に達しない場合にはエラーを出力する。If the calculation result exceeds a predetermined threshold, the word with the best score is output as the recognition result. If the threshold is not reached, an error is output.
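
The threshold decision can be sketched as below. The scores are made-up sample values, and the sketch assumes that a higher score means a better match; the source only says that the best score must exceed a predetermined threshold.

```python
def pick_result(scores: dict[str, float], threshold: float):
    """Return the best-scoring word, or None (an error) if it does not exceed the threshold."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None


accepted = pick_result({"ほんじつ": 0.9, "とっぷ": 0.4}, threshold=0.5)
rejected = pick_result({"ほんじつ": 0.3, "とっぷ": 0.2}, threshold=0.5)
```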

【００６４】認識結果がエラーの場合には、ステップ１
０７に戻って待機状態となる。If the recognition result is an error, the process returns to step 107 and the system waits again.

【００６５】認識が成功したならば、認識結果の単語を
キーにして上述したテーブルの検索を行う（ステップ１
０９）。If recognition succeeds, the table described above is searched using the recognized word as the key (step 109).

【００６６】識別子が１つの場合検索した結果、該当する識別子が１つの場合には、得ら
れた識別子を用いて情報要求を行う命令を通信手段６を
通じて送信する（ステップ１１０）。Case of a single identifier: if the search yields exactly one matching identifier, a command requesting the information at that identifier is transmitted through the communication means 6 (step 110).

【００６７】例えば、利用者が”本日”と発話すると、
音声認識手段からは”ほんじつ”という情報が得られ
る。これをキーにしてテーブルを検索すると、識別子、 "http://www.zzz.com.news/news1.htm" が得られる。For example, when the user says "本日" ("today"), the speech recognition means returns the reading "ほんじつ" (honjitsu). Searching the table with this reading as the key yields the identifier "http://www.zzz.com.news/news1.htm".

【００６８】これにより、次のリンク先のＨｔｍｌ文章
を得ることができる。As a result, the next linked HTML document can be obtained.

【００６９】複数の識別子が存在する場合複数の識別子が存在する場合には、ステップ１１１から
ステップ１１６を実行する。Case of multiple identifiers: if more than one identifier matches, steps 111 to 116 are executed.

【００７０】例えば、利用者が”ニュース”と発話した
場合は、音声認識手段からは”にゅーす”という情報が
得られる。これをキーにして管理テーブルを検索する
と、２つの識別子、 "http://www.zzz.com.news/news2.htm" と "http://www.zzz.com.news/news3.htm" が得られる。For example, when the user says "ニュース" ("news"), the speech recognition means returns the reading "にゅーす" (nyūsu). Searching the management table with this reading as the key yields two identifiers, "http://www.zzz.com.news/news2.htm" and "http://www.zzz.com.news/news3.htm".
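
The branch between the single-identifier and multiple-identifier cases can be sketched as follows. The function name `resolve`, the action labels, and the table layout (reading mapped to a list of identifiers) are assumptions for illustration.

```python
NEWS1 = "http://www.zzz.com.news/news1.htm"
NEWS2 = "http://www.zzz.com.news/news2.htm"
NEWS3 = "http://www.zzz.com.news/news3.htm"

# Simplified management table: reading -> identifiers (assumed layout).
TABLE = {
    "ほんじつ": [NEWS1],
    "とっぷ": [NEWS1],
    "こくない": [NEWS2],
    "かいがい": [NEWS3],
    "にゅーす": [NEWS2, NEWS3],
}


def resolve(table, reading):
    """Step 109: look up the reading and decide what to do next."""
    urls = list(dict.fromkeys(table.get(reading, [])))  # drop duplicates, keep order
    if not urls:
        return ("error", [])
    if len(urls) == 1:
        return ("fetch", urls)   # step 110: request the document
    return ("narrow", urls)      # steps 111 to 116: ask the user to narrow down


single = resolve(TABLE, "ほんじつ")
multiple = resolve(TABLE, "にゅーす")
```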

【００７１】得られた２つの識別子をキーにして、２つ
の識別子から一の識別子に絞り込みができるように、こ
れら識別子に対応した他の読みデータを取得する。そし
て、得られた読みデータから音声認識で得られた読みデ
ータを除く読みデータで辞書の内容を更新する（ステッ
プ１１１）。Using the obtained two identifiers as keys, other read data corresponding to these identifiers is obtained so that the two identifiers can be narrowed down to one identifier. Then, the contents of the dictionary are updated with the read data excluding the read data obtained by speech recognition from the obtained read data (step 111).

【００７２】具体的には、上記２つの識別子は”にゅー
す”によって得られた識別子であるので、ここからさら
に絞り込むために”にゅーす”以外の読みデータを使用
する。すなわち、”こくない”、”にゅーす”、”かい
がい”、”にゅーす”の読みデータが取得されるが、”
にゅーす”は認識された単語なので使用しない。辞書変
更前には、音声認識無効信号を送信しておく。Specifically, because the two identifiers above were obtained from "にゅーす" (nyūsu), reading data other than "にゅーす" is used to narrow them down further. The reading data retrieved for the two identifiers is "こくない" (kokunai), "にゅーす", "かいがい" (kaigai), and "にゅーす"; "にゅーす" is the word that was just recognized, so it is not used. A signal disabling speech recognition is transmitted before the dictionary is changed.

【００７３】また、アンカーが”ニュース”のみの場合
には、上記のアルゴリズムを適用すると他の有効な読み
データがなくなってしまう。これを防ぐために各組合せ
に対し、通し番号を付与し、この通し番号の読みを用い
ることによりこの問題は回避できる。If an anchor consists only of "ニュース", applying the algorithm above leaves it with no other usable reading data. To prevent this, a serial number is assigned to each combination, and the reading of that serial number is used as well.

【００７４】図６に示すように、辞書には、識別子をキ
ーにしたテーブル検索により得られた”こくない”、”
かいがい”の他に”いちばん”、”にばん”の語句を追
加している。本実施例では通し番号を付与したが、アル
ファベット等を付与してもかまわない。As shown in FIG. 6, in addition to "こくない" (kokunai) and "かいがい" (kaigai), which were obtained by searching the table with the identifiers as keys, the words "いちばん" (ichiban, "number one") and "にばん" (niban, "number two") are added to the dictionary. Serial numbers are used in this embodiment, but letters of the alphabet or the like may be assigned instead.
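
The dictionary update of step 111 can be sketched as below: collect the other readings attached to the candidate identifiers, drop the word just recognized, and add serial-number readings. The table layout and function name are assumptions; "いちばん" and "にばん" come from the text, while the later serial readings are added here only to make the sketch general.

```python
NEWS1 = "http://www.zzz.com.news/news1.htm"
NEWS2 = "http://www.zzz.com.news/news2.htm"
NEWS3 = "http://www.zzz.com.news/news3.htm"

# Simplified management table: reading -> identifiers (assumed layout).
TABLE = {
    "ほんじつ": [NEWS1],
    "とっぷ": [NEWS1],
    "こくない": [NEWS2],
    "かいがい": [NEWS3],
    "にゅーす": [NEWS2, NEWS3],
}

# Readings of serial numbers; entries beyond the second are assumptions.
SERIAL_READINGS = ["いちばん", "にばん", "さんばん", "よんばん", "ごばん"]


def narrowing_table(table, recognized, candidates):
    """Build reading -> identifiers for the second utterance (steps 111 and 112)."""
    narrowed = {}
    for reading, urls in table.items():
        if reading == recognized:
            continue  # the recognized word cannot distinguish the candidates
        for url in urls:
            if url in candidates:
                narrowed.setdefault(reading, []).append(url)
    # Serial numbers guarantee every candidate stays selectable.
    # zip stops at the shorter list, so at most five candidates get a number here.
    for number, url in zip(SERIAL_READINGS, candidates):
        narrowed.setdefault(number, []).append(url)
    return narrowed


narrowed = narrowing_table(TABLE, "にゅーす", [NEWS2, NEWS3])
```

The keys of `narrowed` are the words that would replace the contents of the recognition dictionary for the second utterance.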

【００７５】上記の読みデータ、識別子、アンカー文字
を組にして管理テーブルの内容を更新する（ステップ１
１２）。The management table is updated with entries that combine the reading data, identifier, and anchor string described above (step 112).

【００７６】図７に示すように管理テーブルには、本来
得られる読み、識別子の組合せの他に、番号の読みデー
タとの組合せも登録している。As shown in FIG. 7, in the management table, in addition to the originally obtained combination of the reading and the identifier, the combination with the reading data of the number is also registered.

【００７７】図８は、表示手段２に示されたユーザへの
絞り込み選択を促す画面の例であり、重複している選択
肢を番号付きで列挙することにより、アンカーと番号と
の対応付けがわかるようにしている（ステップ１１
３）。FIG. 8 is an example of the screen shown on the display means 2 that prompts the user to narrow the selection. The overlapping choices are listed with numbers so that the correspondence between anchors and numbers is clear (step 113).

【００７８】以上のステップ１１１からステップ１１３
まで実行後、音声認識手段を有効にする命令を送信す
る。これによりユーザが音声を発話するまでシステムは
待機状態となる。After steps 111 to 113 have been executed, a command enabling the speech recognition means is transmitted. The system then waits until the user speaks.

【００７９】以下、ステップ１１４からステップ１１７
は、ステップ１０７からステップ１０９と同様の処理を
実行する。すなわち、絞り込み選択を促す画面に基づい
てユーザに再度発話を行ってもらい、この再度の音声デ
ータの内容と、変更した辞書との内容に基づいて複数の
識別子から一の識別子を絞り込む。Steps 114 to 117 then perform the same processing as steps 107 to 109. That is, the user is asked to speak again in response to the narrowing screen, and a single identifier is selected from the multiple identifiers based on this second utterance and the updated dictionary.

【００８０】そして、再び複数の識別子が得られた場合
には、ステップ１１１に戻り同様の手順をくり返すこと
により最終的には１つのみの識別子の選択を可能とす
る。If a plurality of identifiers are obtained again, the procedure returns to step 111 and the same procedure is repeated to finally select only one identifier.

【００８１】認識結果が制御命令の場合認識結果が制御命令”もどる”、”ほーむ”の場合に、
各命令に対応した処理を実行する。Case of a control command: if the recognition result is a control command such as "もどる" (modoru, "back") or "ほーむ" (hōmu, "home"), the processing corresponding to that command is executed.

【００８２】”もどる”の場合には１つ前に取得したUR
Lに対応するＨｔｍｌ文章を再度表示する処理手順を実
行する。For "もどる" ("back"), the HTML document corresponding to the previously acquired URL is displayed again.

【００８３】”ほーむ”の場合には、プログラムを実行
した場合に最初に表示すると予め設定したＨｔｍｌ文章
を再度表示する。Ｈｔｍｌ文章取得、表示完了後はステ
ップ１０１から処理手順を継続する。For "ほーむ" ("home"), the HTML document preset as the first page shown when the program starts is displayed again. Once the HTML document has been acquired and displayed, processing continues from step 101.
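
The two control commands can be sketched with a simple history list. The `Navigator` class is hypothetical, and using news1.htm as the preset start page is an assumption made only so the example has a URL to work with.

```python
NEWS1 = "http://www.zzz.com.news/news1.htm"
NEWS2 = "http://www.zzz.com.news/news2.htm"
NEWS3 = "http://www.zzz.com.news/news3.htm"


class Navigator:
    """Tracks visited URLs so "もどる" and "ほーむ" can be handled."""

    def __init__(self, home_url):
        self.home_url = home_url
        self.history = [home_url]

    def open(self, url):
        """Record a newly displayed document and return its URL."""
        self.history.append(url)
        return url

    def command(self, reading):
        """Return the URL to display for a recognized control word."""
        if reading == "もどる":
            if len(self.history) > 1:
                self.history.pop()  # drop the current page
            return self.history[-1]
        if reading == "ほーむ":
            return self.open(self.home_url)
        raise ValueError(f"not a control word: {reading}")


nav = Navigator(NEWS1)
nav.open(NEWS2)
after_back = nav.command("もどる")
nav.open(NEWS3)
after_home = nav.command("ほーむ")
```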

【００８４】ステップ１１５において認識結果が”きゃ
んせる”の場合には、ステップ１０２に戻る。これは音
声認識結果が間違ってしまい、ユーザが所望しない状況
に陥った場合にユーザが発話する。認識結果が間違った
かどうかは、絞り込み画面に表示されたアンカーにより
ユーザは理解できる。If the recognition result in step 115 is "きゃんせる" (kyanseru, "cancel"), the process returns to step 102. The user says this when a misrecognition has led to an unwanted state. The user can tell whether recognition went wrong from the anchors shown on the narrowing screen.

【００８５】３．第１の実施例の変更例第１の実施例において、管理テーブルの内容を更新する
だけでなく、音声認識用の辞書を更新する理由は、認識
候補を減らすことにより誤認識が発生する可能性を低く
するためであり、認識手段の性能がよければ変更しなく
てもよい。言い換えれば、ステップ１１１を省いてもか
まわない。3. Modification of the first embodiment: in the first embodiment, the speech recognition dictionary is updated in addition to the management table in order to reduce the number of recognition candidates and thereby lower the chance of misrecognition. If the recognition means performs well enough, the dictionary need not be changed. In other words, step 111 may be omitted.

【００８６】４．第１の実施例の効果アンカーから名詞句のみを選択するのは、ユーザになれ
ない作業を要求するので表示方法を他と変更することは
有効であると考えられる。しかし、特開平１１−２５０
９８号に述べられている、強調表示やフォントの色を変
更することは、もともとアンカーが強調表示されていた
り、アンカーに背景色を考慮した色が付加されている場
合には問題となる。ふりがなを付与する方法は、読みデ
ータが長い場合には本来の画面のレイアウトを著しく損
なう可能性がある。4. Effects of the first embodiment: selecting only the noun phrases from an anchor asks the user to do something unfamiliar, so displaying those phrases differently from the rest of the text is considered effective. However, highlighting or changing the font color, as described in Japanese Unexamined Patent Publication No. H11-25098, is problematic when the anchor is already highlighted or has been given a color chosen with the background color in mind. Adding furigana may significantly disrupt the original screen layout when the reading data is long.

【００８７】そこで、本実施例においては、名詞句の
前、あるいは前後に特定の記号を挿入する方式を用い
る。この方法は、従来例と比較して表現方式の影響を受
けない、新たに挿入するデータは少ないので、レイアウ
トを著しく損ねないという利点がある。Therefore, this embodiment inserts a specific symbol before, or before and after, each noun phrase. Compared with the conventional examples, this method is not affected by how the text is rendered, and because little data is newly inserted, it does not significantly disrupt the layout.

【００８８】以上説明したように第１の実施例により、
利用者はアンカー内の所望のキーワードを発話すること
によりリンク先情報にアクセス可能となる。また、発話
した単語が複数のアンカーに含まれる場合でも、絞り込
みを行うステップ１１１からステップ１１６を有するこ
とにより対処可能となる。As described above, according to the first embodiment,
The user can access the link destination information by speaking a desired keyword in the anchor. In addition, even when the uttered word is included in a plurality of anchors, it is possible to cope by having steps 111 to 116 for narrowing down.

【００８９】（第２の実施例）本発明の第２の実施例を
図９〜図１０を用いて説明する。(Second Embodiment) A second embodiment of the present invention will be described with reference to FIGS.

【００９０】この説明は、第１の実施例をベースにして
第１の実施例と異なる点のみを行う。This explanation is based on the first embodiment, and only different points from the first embodiment will be described.

【００９１】なお説明には、第１の実施例を用いるが第
１の実施例を実現するために必要条件全てを満たす必要
はない。また、読みデータを有する名詞句が１つもアン
カー内に存在しないアンカーを「例外データ」と呼ぶこ
ととする。Although the first embodiment is used for the description, it is not necessary to satisfy all the necessary conditions for realizing the first embodiment. An anchor in which no noun phrase having reading data exists in the anchor is referred to as “exception data”.

【００９２】図９のフローチャートにおけるステップ２
０３のアンカー文の形態素解析実行後、名詞句の個数の
計測を行う。ここで名詞句の個数が０の場合は、ステッ
プ２１１１の処理を行う。名詞句の個数が１以上の場合
は、存在する名詞句の個数分、ステップ２０４からステ
ップ２０６の処理を実行後、有効な読みデータを有する
名詞句の計測を行う。計測した結果が０の場合は上記の
同様にステップ２１１の処理を行う。After the morphological analysis of the anchor text in step 203 of the flowchart in FIG. 9, the number of noun phrases is counted. If the count is 0, step 211 is performed. If the count is 1 or more, steps 204 to 206 are executed for each noun phrase, and then the noun phrases that have valid reading data are counted. If that count is 0, step 211 is likewise performed.

【００９３】ステップ２１１では識別子とアンカーの情
報を対応づけて例外用管理テーブルに登録する。また、
ブラウザに表示されているアンカー情報の内容を変更す
る。これにより利用者は、そのアンカーに対しては、特
別な処理が必要なことを知る。In step 211, the identifier and the anchor information are associated with each other and registered in the exception management table. The anchor as displayed in the browser is also altered, which tells the user that this anchor requires special handling.

【００９４】例外データの音声による指定方法の一例を
説明する。An example of a method of specifying exception data by voice will be described.

【００９５】予め特定単語、ここでは仮に”よみなし”
と仮定する。ステップ２０７の音声入力待ち状態で、利
用者が”読みなし”と発話し、認識結果から”よみな
し”情報を取得した場合には、ステップ７で例外処理判
定を行う。”よみなし”の指示と判定した場合には、ス
テップ２１３からステップ２１７の処理を実行する。A specific word is defined in advance; here it is assumed to be "よみなし" (yominashi, "no reading"). If the user says "読みなし" while the system is waiting for voice input in step 207 and the recognition result is "よみなし", an exception-processing determination is made in step 7. If the utterance is judged to be the "よみなし" instruction, steps 213 to 217 are executed.

【００９６】ステップ２１３では、音声認識用の辞書の
内容を全て削除し例外データの個数分の通し番号の読み
データをこの辞書に登録する。また、第１の実施例と同
様に”きゃんせる”などの制御命令を追加しておく。In step 213, all the contents of the dictionary for speech recognition are deleted, and the read data of the serial numbers corresponding to the number of the exception data are registered in this dictionary. Also, a control command such as "cancel" is added in the same manner as in the first embodiment.
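
Step 213 can be sketched as below: the dictionary is rebuilt from scratch with one serial-number reading per exception item plus the control words. The exception URLs and the second anchor are hypothetical sample data, and the serial readings beyond "にばん" are assumptions.

```python
# Readings of serial numbers; entries beyond the second are assumptions.
SERIAL_READINGS = ["いちばん", "にばん", "さんばん", "よんばん", "ごばん"]
CONTROL_WORDS = ["きゃんせる"]

# Hypothetical exception management table: (identifier, anchor).
EXCEPTIONS = [
    ("http://www.zzz.com.news/ex1.htm", "もれなし"),
    ("http://www.zzz.com.news/ex2.htm", ">>"),
]


def build_exception_dictionary(exceptions):
    """Return (dictionary words, serial reading -> identifier) for step 213."""
    if len(exceptions) > len(SERIAL_READINGS):
        raise ValueError("not enough serial-number readings in this sketch")
    by_number = {
        SERIAL_READINGS[i]: url for i, (url, _anchor) in enumerate(exceptions)
    }
    # The old dictionary contents are discarded; only these words remain.
    words = list(by_number) + CONTROL_WORDS
    return words, by_number


words, by_number = build_exception_dictionary(EXCEPTIONS)
```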

【００９７】ステップ２１４では、図１０に示すよう
に、例外データのアンカーの文頭に対応した通し番号を
付与して選択画面を表示手段２に提示する。具体的に
は、図１０において”もれなし”を指示するために、通
し番号の”１”をユーザが発話する。In step 214, as shown in FIG. 10, a serial number is prepended to the anchor of each exception-data item, and the resulting selection screen is presented on the display means 2. Specifically, to select "もれなし" in FIG. 10, the user says its serial number, "1".

【００９８】ステップ２１６では、音声認識手段で通し
番号のデータ（特別読みデータ）を取得した場合には、
その番号に対応する例外データの識別子を選択する（ス
テップ２１７）。At step 216, when the serial number data (special reading data) is obtained by the voice recognition means,
The identifier of the exception data corresponding to the number is selected (step 217).

【００９９】その後、ステップ２１０で選択した識別子
を用いてリンク情報の要求を送信する。Thereafter, a request for link information is transmitted using the identifier selected in step 210.

【０１００】以上のような本実施例では有効な読みデー
タを有する名詞句がアンカー内に１つも存在しない状況
に対しては、例外データかどうかの判定を行い、例外デ
ータの場合には、別途の選択手段である特別読みデータ
（上記例では通し番号）を準備することにより対処可能
である。As described above, in this embodiment, when an anchor contains no noun phrase with valid reading data, the anchor is identified as exception data and is handled by preparing special reading data (serial numbers in the example above) as a separate means of selection.

【０１０１】本実施例では通し番号を例外データの識別
用に用いたが、他にアルファベット、５０音、特殊な記
号など使用する方法も考えられる。In this embodiment, the serial number is used for identifying exception data. However, a method using alphabets, Japanese syllabary, special symbols, and the like may be used.

【０１０２】別の指定方法としては、ステップ２０３で
有効登録なしと判定した場合には、アンカーの文頭に通
し番号を追加する形で表示内容の変更を行い、かつ音声
認識辞書に番号の読みデータを登録し、管理テーブルに
は、読みデータ、リンク識別子、アンカーを対応して登
録する方法も考えられる。As another designation method, when it is determined in step 203 that there is no valid registration, the display content is changed by adding a serial number to the beginning of the anchor, and the reading data of the number is stored in the speech recognition dictionary. A method of registering and registering the reading data, the link identifier, and the anchor correspondingly to the management table is also conceivable.

【０１０３】（第３の実施例）本発明の第３の実施例を
図１１〜図１２を用いて説明する。(Third Embodiment) A third embodiment of the present invention will be described with reference to FIGS.

【０１０４】第３の実施例は、本発明において１つのア
ンカーに読みデータを有する複数の名詞句が存在する場
合には、認識辞書の登録する単語が増加し、誤認識の可
能性が増加する問題に関わるものである。The third embodiment addresses the problem that, when a single anchor contains multiple noun phrases with reading data, the number of words registered in the recognition dictionary grows and the chance of misrecognition increases.

【０１０５】ところで、表示手段２の表示用の有効領域
が、Ｈｔｍｌ文章を解析により作成した表示画面より小
さい場合がある。この場合ブラウザはスクロールバーを
画面表示し、有効領域に含まれない画面へのアクセスを
可能とする。In some cases, the display effective area of the display means 2 is smaller than the display screen created by analyzing the HTML text. In this case, the browser displays a scroll bar on the screen and enables access to a screen that is not included in the effective area.

【０１０６】利用者から見れば、表示領域が狭く、か
つ、一度も表示されていないアンカーに関しては、アク
セスの対象として考慮しない。それゆえ、表示されない
アンカーに含まれる読みデータを認識用の辞書に登録す
るのは、誤認識の可能性を増やすこととなり有益でな
い。From the user's point of view, when the display area is small, an anchor that has never been displayed is not considered a target for access. Registering the reading data of undisplayed anchors in the recognition dictionary therefore only increases the chance of misrecognition and is of no benefit.

【０１０７】第３の実施例に関わる処理手順を図１、図
１２のフローチャートを用い、第１の実施例をベースに
して異なる点のみ説明する。なお、説明には、第１の実
施例を用いるが第１の実施例を実現するために必要条件
全てを満たす必要はない。The processing procedure according to the third embodiment will be described with reference to the flowcharts of FIGS. 1 and 12 and only the differences from the first embodiment. Although the first embodiment is used for the description, it is not necessary to satisfy all the necessary conditions for realizing the first embodiment.

【０１０８】図１のステップ１０５の音声認識用の辞書
への登録は、第３の実施例では実行しない。また、ステ
ップ１０６の管理テーブルには、読みデータ、識別子、
アンカーの他に、図１１に示すように表示手段２におけ
る表示開始ｘ座標、ｙ座標、アンカーの幅、アンカーの
高さも併せて登録する。In the third embodiment, the registration in the speech recognition dictionary in step 105 of FIG. 1 is not performed. In addition to the reading data, identifier, and anchor, the management table of step 106 also records the display start x-coordinate and y-coordinate, the anchor width, and the anchor height on the display means 2, as shown in FIG. 11.

【０１０９】各アンカーの表示に関する情報は、表示画
面作成時に計算したものを使用する。あるいはDocument
Object Modelと呼ばれる技術を用いて、ブラウザから
取得することも可能である。The display information for each anchor is the information calculated when the display screen was created. Alternatively, it can be obtained from the browser using the technology known as the Document Object Model.

【０１１０】図１のステップ１０７の音声入力待ちの状
況で、利用者がスクロールバーを操作することによりブ
ラウザに表示手段２の表示領域の変更を指示した場合、
図１２に示す手順を実行する。When the user instructs the browser to change the display area of the display means 2 by operating the scroll bar in the state of waiting for voice input in step 107 in FIG.
The procedure shown in FIG. 12 is performed.

【０１１１】ステップ３０１では、表示手段２の現在の
有効表示領域のサイズと、どの領域が表示されているか
に関する情報を取得する。ここでは仮に、領域のサイズ
を幅３２０、高さ４００とし、作成した画面データ内の
（０，１４０）から幅３２０、高さ４００の情報が提示
されているものとか仮定する。これは作成した画面の座
標で表現すると（０，１４０）−（３１９，５３９）の
領域となる。In step 301, information on the size of the current effective display area of the display means 2 and which area is being displayed is obtained. Here, it is assumed that the size of the area is assumed to be width 320 and height 400, and information of width 320 and height 400 from (0, 140) in the created screen data is presented. This is an area of (0,140)-(319,539) when represented by the coordinates of the created screen.

【０１１２】ステップ３０２では、音声認識用の辞書に
登録している内容を全て削除する。In step 302, all the contents registered in the dictionary for voice recognition are deleted.

【０１１３】ステップ３０３とステップ３０４は、管理
テーブルに記録されている項目の数だけ実行する。ステ
ップ３０３ではＮ番目の要素の内容を取得し、現在アン
カーが表示されているかどうか検討する。Steps 303 and 304 are executed by the number of items recorded in the management table. In step 303, the content of the Nth element is obtained, and it is determined whether the anchor is currently displayed.

【０１１４】例では項目１，２は領域（１００，４８
０）−（１９５，４９５）を占めており、上記有効表示
領域に含まれているので表示されていると判定する。同
様に項目３，４は表示されていると判定されるが、項目
５，６に関しては有効領域外なので表示されていないと
判定される。In the example, items 1 and 2 occupy the region (100, 480)-(195, 495), which lies within the effective display region above, so they are judged to be displayed. Items 3 and 4 are likewise judged to be displayed, whereas items 5 and 6 lie outside the effective region and are judged not to be displayed.

【０１１５】ステップ３０４では、ステップ３０３で表
示されていると判定されている項目の読みデータのみ音
声認識用の辞書に登録する。In step 304, only the reading data of the item determined to be displayed in step 303 is registered in the dictionary for speech recognition.
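
Steps 301 to 304 can be sketched as a rectangle test over the management table. The coordinates for items 1 and 2 and for the viewport come from the example above; the off-screen item's coordinates are hypothetical, and treating "displayed" as "fully contained in the viewport" is an assumption, since the text does not say how partially visible anchors are handled.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    reading: str
    x: int       # display start x-coordinate
    y: int       # display start y-coordinate
    width: int
    height: int


def is_displayed(entry, view_x, view_y, view_w, view_h):
    """Step 303: True if the anchor rectangle lies entirely inside the viewport."""
    return (
        view_x <= entry.x
        and view_y <= entry.y
        and entry.x + entry.width <= view_x + view_w
        and entry.y + entry.height <= view_y + view_h
    )


def visible_readings(entries, view):
    """Steps 302 to 304: rebuild the dictionary from displayed entries only."""
    return [e.reading for e in entries if is_displayed(e, *view)]


ENTRIES = [
    Entry("ほんじつ", 100, 480, 96, 16),   # item 1: (100, 480)-(195, 495)
    Entry("とっぷ", 100, 480, 96, 16),     # item 2: same anchor
    Entry("かいがい", 100, 600, 96, 16),   # hypothetical off-screen item
]
VIEW = (0, 140, 320, 400)                  # covers (0, 140)-(319, 539)

dictionary_words = visible_readings(ENTRIES, VIEW)
```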

【０１１６】もちろん、最初に画面にデータを表示する
場合でも図１２に示す手順を実行する。これにより画面
変更指示を行わない場合でも対処できる。Of course, the procedure shown in FIG. 12 is executed even when data is first displayed on the screen. Accordingly, it is possible to cope with the case where the screen change instruction is not issued.

【０１１７】以上のような処理により、音声認識用の辞
書に登録する読み情報を必要最低限にすることが可能と
なり、誤認識の可能性が低減できる。With the above processing, the reading information registered in the dictionary for voice recognition can be minimized, and the possibility of erroneous recognition can be reduced.

【０１１８】本説明では、表示されている領域に含まれ
るアンカーの単語を音声認識用の辞書に登録したが、一
度表示した単語を登録する方法も考えられる。In the present description, the word of the anchor included in the displayed area is registered in the dictionary for speech recognition, but a method of registering the word once displayed may be considered.

【０１１９】この場合、図１１に示すテーブルに登録す
るデータに表示済みかそうでないかのフラグの項目を追
加し、音声認識用の辞書に登録する判定基準として該フ
ラグを使用することにより実現できる。This can be achieved by adding to the data in the table of FIG. 11 a flag indicating whether the item has already been displayed, and using that flag as the criterion for registration in the speech recognition dictionary.

【０１２０】さらに、上記説明とは反対に、表示されて
いない領域に含まれるアンカーの単語を音声認識用の辞
書に登録してもよい。Further, contrary to the above description, the word of the anchor included in the non-displayed area may be registered in the dictionary for speech recognition.

【０１２１】これは、ユーザが画面をめくってその頁の
表示をしなくなった場合に、その頁を再度表示すること
なしに、識別子を絞り込むことができる。According to this, when the user turns the screen and stops displaying the page, the identifier can be narrowed down without displaying the page again.

【０１２２】[0122]

【発明の効果】請求項１，４の発明であると、ユーザが
音声で指示した対応単語情報に基づいて、複数の識別子
が得られても、管理テーブルの内容を変更して、その変
更した内容に従ってユーザに再度音声を入力してもらう
ことにより、さらに識別子を絞り込むことができる。According to the first and fourth aspects of the present invention, even if a plurality of identifiers are obtained based on the corresponding word information instructed by the user by voice, the contents of the management table are changed and the contents are changed. By having the user input the voice again according to the content, the identifier can be further narrowed down.

【０１２３】請求項２，５の発明であると、解析した対
応単語情報と識別子が対応していない場合でも、特別読
み情報を作成して、この特別読み情報をユーザが音声で
指示すれば、その特別読み情報に対応する識別子を得る
ことができる。According to the second and fifth aspects of the present invention, even if the analyzed corresponding word information and the identifier do not correspond to each other, special reading information is created, and if this special reading information is indicated by a voice, An identifier corresponding to the special reading information can be obtained.

【０１２４】請求項３，６の発明であると、表示部の表
示された内容、または、表示されていない内容に対応し
て対応単語情報と登録するために、必要な情報のみがユ
ーザに提示できる。According to the inventions of claims 3 and 6, corresponding word information is registered according to the content that is displayed, or not displayed, on the display unit, so that only the necessary information is presented to the user.

[Brief description of the drawings]

【図１】本発明の第１の実施例の処理手順であるフロー
チャートである。FIG. 1 is a flowchart illustrating a processing procedure according to a first embodiment of the present invention.

【図２】第１の実施例の情報処理装置の構成を示すブロ
ック図である。FIG. 2 is a block diagram illustrating a configuration of the information processing apparatus according to the first embodiment.

【図３】Ｈｔｍｌ文章の１つの例を示す説明図である。FIG. 3 is an explanatory diagram showing an example of an HTML sentence.

【図４】音声認識用の辞書のデータを示す説明図であ
る。FIG. 4 is an explanatory view showing dictionary data for speech recognition.

【図５】読みデータ、識別子の管理テーブルへの登録を
示す説明図である。FIG. 5 is an explanatory diagram showing registration of read data and an identifier in a management table.

【図６】変更後の音声認識用の辞書のデータを示す説明
図である。FIG. 6 is an explanatory diagram showing data of a dictionary for speech recognition after a change.

【図７】変更後の読みデータ、識別子の管理テーブルの
登録の内容を示す説明図である。FIG. 7 is an explanatory diagram showing the contents of registration of read data and identifier management tables after change.

【図８】絞り込み用候補画面の説明図である。FIG. 8 is an explanatory diagram of a narrowing-down candidate screen.

【図９】第２の実施例の処理手順であるフローチャート
である。FIG. 9 is a flowchart illustrating a processing procedure according to the second embodiment;

【図１０】例外データ選択画面の説明図である。FIG. 10 is an explanatory diagram of an exception data selection screen.

【図１１】読みデータ、識別子、座標データの管理テー
ブルへの登録を示す説明図である。FIG. 11 is an explanatory diagram showing registration of reading data, an identifier, and coordinate data in a management table.

【図１２】第３の実施例の処理手順であるフローチャー
トである。FIG. 12 is a flowchart illustrating a processing procedure according to a third embodiment;

【図１３】従来例１の処理手順を示すフローチャートで
ある。FIG. 13 is a flowchart showing a processing procedure of Conventional Example 1.

【図１４】従来例２の処理手順を示すフローチャートで
ある。FIG. 14 is a flowchart illustrating a processing procedure of Conventional Example 2.

[Explanation of symbols]

１演算手段２表示手段３記憶手段４キーボード５指示手段６通信手段７音声入力手段８サーバ９バス DESCRIPTION OF SYMBOLS 1 Calculation means 2 Display means 3 Storage means 4 Keyboard 5 Instruction means 6 Communication means 7 Voice input means 8 Server 9 Bus

フロントページの続き (51)Int.Cl.⁷ 識別記号ＦＩテーマコート゛(参考）Ｇ１０Ｌ 15/00 Ｇ１０Ｌ 3/00 ５５１Ｐ５Ｋ０３４ 15/28 Ｈ０４Ｌ 13/00 ３０７Ｚ９Ａ００１Ｈ０４Ｌ 29/08 Ｆターム(参考） 5B009 KB00 5B075 PP07 5B082 AA11 EA07 GC04 5B089 GA25 GB03 HA10 JA24 KA03 KB07 KC15 KH16 5D015 GG01 GG03 KK02 LL10 5K034 AA18 BB06 FF01 FF17 HH01 HH02 HH14 HH17 HH26 LL01 9A001 BB03 BB04 CC07 HH17 JJ25 JJ26 JJ27 JJ72 KK56 Continued on the front page (51) Int.Cl. ⁷ Identification symbol FI Theme coat II (Reference) G10L 15/00 G10L 3/00 551P 5K034 15/28 H04L 13/00 307Z 9A001 H04L 29/08 F-term (Reference) 5B009 KB00 5B075 PP07 5B082 AA11 EA07 GC04 5B089 GA25 GB03 HA10 JA24 KA03 KB07 KC15 KH16 5D015 GG01 GG03 KK02 LL10 5K034 AA18 BB06 FF01 FF17 HH01 HH02 HH14 HH17 HH26 LL01 9A001 BB03 JJ03HJ BB03

Claims

[Claims]

1. A document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires a next link-destination document from a document acquired through a telecommunication line, the method comprising: a dictionary registration step of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation step of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with identifiers for acquiring the next link-destination documents; a speech recognition step of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition step of obtaining an identifier from the management table based on the recognition result of the speech recognition step, characterized by further comprising: a narrowing change step of, when a plurality of identifiers are obtained in the identifier acquisition step, changing the contents of the speech recognition dictionary and the management table based on the plurality of identifiers so that the plurality of identifiers can be narrowed down further; and a narrowing step of narrowing down the number of identifiers by performing the speech recognition step and the identifier acquisition step based on the changed speech recognition dictionary and management table.

2. A document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires a next link-destination document from a document acquired through a telecommunication line, the method comprising: a dictionary registration step of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation step of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with identifiers for acquiring the next link-destination documents; a speech recognition step of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition step of obtaining an identifier from the management table based on the recognition result of the speech recognition step, characterized by further comprising: a correspondence determination step of determining whether the analyzed corresponding word information can be associated with an identifier; and a reading information creation step of, when the association is not possible, creating special reading information corresponding to the analyzed corresponding word information as data to be registered in the speech recognition dictionary, and registering the special reading information in the speech recognition dictionary.

3. A document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires a next link-destination document from a document acquired through a telecommunication line, the method comprising: a dictionary registration step of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation step of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with identifiers for acquiring the next link-destination documents; a speech recognition step of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition step of obtaining an identifier from the management table based on the recognition result of the speech recognition step, characterized by further comprising: a display target determination step of determining whether content is displayed on the screen of a display unit that presents the document to the user; and a display registration change step of changing the corresponding word information registered in the speech recognition dictionary according to the content that is displayed, or the content that is not displayed, on the screen of the display unit.

4. A recording medium on which is recorded a program for realizing a document acquisition method that recognizes speech input by a user and, based on the recognition result, acquires a next link-destination document from a document acquired through a telecommunication line, the program realizing: a dictionary registration function of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation function of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with identifiers for acquiring the next link-destination documents; a speech recognition function of recognizing the speech input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition function of obtaining an identifier from the management table based on the recognition result of the speech recognition function, characterized in that the program further realizes: a narrowing change function of, when a plurality of identifiers are obtained by the identifier acquisition function, changing the contents of the speech recognition dictionary and the management table based on the plurality of identifiers so that the plurality of identifiers can be narrowed down further; and a narrowing function of narrowing down the number of identifiers by performing the speech recognition function and the identifier acquisition function based on the changed speech recognition dictionary and management table.

5. A recording medium on which is recorded a program for realizing a document acquisition method for recognizing a voice input from a user and, based on the recognition result, acquiring a next linked document from a document acquired through a telecommunication line, the program realizing: a dictionary registration function of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation function of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with an identifier for acquiring the next linked document; a speech recognition function of recognizing the voice input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition function of obtaining an identifier from the management table based on the recognition result of the speech recognition function, the recording medium being characterized in that the recorded program further realizes: a correspondence determination function of determining whether or not the analyzed corresponding word information and the identifier can be associated with each other; and a reading information creation function of creating special reading information corresponding to the analyzed corresponding word information and registering the special reading information in the speech recognition dictionary.

6. A recording medium on which is recorded a program for realizing a document acquisition method for recognizing a voice input from a user and, based on the recognition result, acquiring a next linked document from a document acquired through a telecommunication line, the program realizing: a dictionary registration function of analyzing the acquired document and registering, in a speech recognition dictionary, corresponding word information on words that can be associated with a next link destination; a table creation function of creating a management table by associating the corresponding word information registered in the speech recognition dictionary with an identifier for acquiring the next linked document; a speech recognition function of recognizing the voice input by the user based on the corresponding word information registered in the speech recognition dictionary; and an identifier acquisition function of obtaining an identifier from the management table based on the recognition result of the speech recognition function, the recording medium being characterized in that the recorded program further realizes: a display target determination function of determining whether or not content is displayed on a screen of a display unit that presents the document to the user; and a display registration change function of changing the corresponding word information registered in the speech recognition dictionary according to the content that is displayed on the screen of the display unit or the content that is not displayed.
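
For illustration only, the following Python sketch shows one way the functions recited in the claims above could fit together: a speech recognition dictionary and management table built from a document's links, registration restricted to links shown on screen, and narrowing when one spoken word matches several identifiers. It is not the claimed implementation. The names Link, build_tables, lookup and narrow, the example URLs, and the rule of dropping words shared by all candidates are assumptions made for this sketch, and speech recognition itself is replaced by a plain string lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    text: str      # anchor text of the link (source of corresponding words)
    url: str       # identifier used to acquire the next linked document
    visible: bool  # whether the link is currently displayed on screen


def build_tables(links, visible_only=False):
    """Return (dictionary, table): the recognizable words and word -> URLs."""
    table = {}
    for link in links:
        if visible_only and not link.visible:
            continue  # display-dependent registration (assumed policy)
        for word in link.text.lower().split():
            urls = table.setdefault(word, [])
            if link.url not in urls:
                urls.append(link.url)
    return set(table), table


def lookup(word, dictionary, table):
    """Stand-in for speech recognition plus identifier acquisition."""
    word = word.lower()
    return list(table[word]) if word in dictionary else []


def narrow(links, candidates):
    """Rebuild dictionary and table from the candidate links only."""
    wanted = set(candidates)
    subset = [link for link in links if link.url in wanted]
    _, table = build_tables(subset)
    # Words shared by every candidate cannot tell them apart, so drop them.
    shared = [w for w, urls in table.items() if len(urls) == len(wanted)]
    for word in shared:
        del table[word]
    return set(table), table


links = [
    Link("Sports news", "http://example.com/sports", True),
    Link("World news", "http://example.com/world", True),
    Link("Weather", "http://example.com/weather", False),
]

dictionary, table = build_tables(links, visible_only=True)
hits = lookup("news", dictionary, table)         # two identifiers match
off_screen = lookup("weather", dictionary, table)  # not registered: off screen

narrowed_dictionary, narrowed_table = narrow(links, hits)
choice = lookup("world", narrowed_dictionary, narrowed_table)
```

In this sketch, saying "news" yields two candidate identifiers, so the dictionary and table are rebuilt to contain only the distinguishing words "sports" and "world", and a second utterance selects a single identifier. The word "weather" is not recognized at first because its link is not on screen.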