JP2000268045A

JP2000268045A - Information terminal equipment

Info

Publication number: JP2000268045A
Application number: JP11070098A
Authority: JP
Inventors: Yuichi Yoshida; 祐一吉田
Original assignee: Olympus Optical Co Ltd
Current assignee: Olympus Corp
Priority date: 1999-03-16
Filing date: 1999-03-16
Publication date: 2000-09-29

Abstract

PROBLEM TO BE SOLVED: To provide an information terminal capable of reducing the operation burden on an operator in constructing a database that utilizes voice recognition. SOLUTION: An evaluation point calculating part 38, a code book 16 and a field processing part 39 collate a label string provided by a vector quantizing part 36 with a grammar corresponding to a designated voice input target field and determine, on the basis of the collation result, the field in which to store the data (input data) expressed by that label string. Grammars are assigned in advance to each of the plural fields of a personal information management (PIM) database 20 and are provided as grammar data 18 containing symbols expressing the characteristics of the fields and specific models expressing the characters that can be input to those fields.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to an information terminal device that recognizes speech uttered by an operator and uses the recognition result as input data for various databases, and more specifically to a portable personal information management (PIM) device for managing personal information.

[0002]

[Prior art] In recent years, speech recognition technologies such as dictation, spoken dialogue, and automatic speech translation have attracted attention. Several devices have been proposed that use such technology to recognize speech uttered by an operator and use the recognition result in much the same way as input data obtained from, for example, a keyboard or a mouse. For example, the computer system described in Japanese Patent Application Laid-Open No. H6-348452 converts utterances into commands for a computer program; the portable information terminal described in Japanese Patent Application Laid-Open No. H8-227398 accepts by voice the contents of events registered by a schedule management function; and the system described in Japanese Patent Application Laid-Open No. H10-70613 creates name and address data with a high speech recognition rate when accepting telephone orders.

[0003] Speech recognition in such conventional systems either presupposes the combined use of a keyboard, a mouse, or the like, or merely replaces command operations performed with such input means by voice input. There is therefore a demand to further improve the operability of input using speech recognition and to improve the user interface.

[0004] For example, among the information terminal devices that have spread rapidly in recent years, portable ones in particular have a small housing and small input means, such as a keyboard or a mouse, typically provided on the housing. Operating such input means for a long time to enter a large amount of data places a heavy operation burden on the operator. In addition, some information terminal devices have no input means such as a keyboard at all; a database created on a personal computer or the like is transferred to them, and no data is entered on the terminal itself. Speech recognition that presupposes the combined use of a keyboard, a mouse, or the like cannot be applied to such devices.

[0005] Moreover, when the number of input data items is large, cursor operations and the like for moving between fields while creating a database are generally very cumbersome. This operation burden in data input applies not only to portable information terminal devices but also to other devices.

[0006]

[Problem to be solved by the invention] The present invention has been made in view of the above circumstances, and its object is to provide an information terminal device capable of reducing the operation burden on an operator in constructing a database using speech recognition.

[0007]

[Means for solving the problem] To solve the above problem and achieve the object, the present invention is configured as follows.

[0008] [1] An information terminal device of the present invention constructs a database storing a plurality of records, each consisting of at least a plurality of fields, and manages information. The device comprises: designating means for having an operator designate, before voice input, the fields to be targeted for voice input from among the plurality of fields of the record; recognition means for recognizing speech uttered by the operator; determining means for comparing the recognition result obtained by the recognition means against the grammar that, among the grammars given in advance to each of the plurality of fields, corresponds to a field designated by the designating means as a voice input target, and for determining, based on the comparison result, the field in which the recognition result is to be stored; and writing means for writing input data based on the recognition result into the determined field in the same or a different record of the database.

[0009] [2] A personal information management device of the present invention comprises: a personal information database storing a plurality of records, each consisting of at least a plurality of fields; display means for displaying an input screen for entering data into the plurality of fields of a given record of the personal information database; input means for inputting speech uttered by an operator; and control means for controlling data input to the personal information database. The control means comprises: designating means for causing the display means to display the input screen and having the operator designate, before voice input, the fields to be targeted for voice input from among the plurality of fields; recognition means for recognizing the speech input from the input means; determining means for comparing the recognition result obtained by the recognition means against the grammar that, among the grammars given in advance to each of the plurality of fields, corresponds to a field designated by the designating means as a voice input target, and for determining, based on the comparison result, the field in which the recognition result is to be stored; writing means for writing input data based on the recognition result into the determined field in the same or a different record of the personal information database; and display control means for displaying the input data on the input screen.

[0010] [3] A recording medium of the present invention records a program for constructing a database storing a plurality of records, each consisting of at least a plurality of fields, and managing information. The recorded program executes: designating means for having an operator designate, before voice input, the fields to be targeted for voice input from among the plurality of fields of the record; recognition means for recognizing speech uttered by the operator; determining means for comparing the recognition result obtained by the recognition means against the grammar that, among the grammars given in advance to each of the plurality of fields, corresponds to a field designated by the designating means as a voice input target, and for determining, based on the comparison result, the field in which the recognition result is to be stored; and writing means for writing input data based on the recognition result into the determined field in the same or a different record of the database.

[0011]

[Embodiments of the invention] [Configuration] FIG. 1 is a block diagram showing the schematic configuration of an information terminal device according to one embodiment of the present invention. The information terminal device shown in the figure performs various kinds of database processing, such as entering information into a database and searching for, extracting, and displaying desired information. It comprises control means 1, a database 2, display means 3, designating means 4, and voice input means 5. This embodiment concerns in particular the construction of the database 2, and more specifically data input to the database 2 based on recognition of the operator's speech.

[0012] The database 2 stores a plurality of records, each consisting of at least a plurality of fields. The display means 3 displays an input screen for entering data into the plurality of fields of a given record of the database 2, and is, for example, a display. The designating means 4 has the operator designate, before data input based on speech recognition, the fields into which that data is to be entered, and is, for example, a keyboard or a mouse. If the fields for such data input are fixed, the designating means 4 may be omitted.

[0013] The voice input means 5 inputs speech uttered by the operator, and is, for example, a microphone. As described later, when data is input to the database 2 in a batch from a file in which a series of voice input data has been recorded, means for reading the voice input data from that file is provided in place of the voice input means 5.

[0014] The control means 1 controls the display means 3, the designating means 4, and the voice input means 5 in connection with data input to the database 2, and includes a CPU. The control means 1 may be implemented as software recorded on a computer-readable recording medium.

[0015] Data input to a database based on recognition of the operator's speech is described below using a more specific embodiment of the present invention.

[0016] FIG. 2 is a block diagram showing the configuration of an embodiment in which the present invention is applied to a personal information management (PIM) device. The personal information management device of this embodiment comprises a main body 10, a microphone 12, an A/D converter 14, a codebook 16, word HMMs (Hidden Markov Models) 18, a PIM database 20, a display 22, a keyboard or mouse 24, and a control unit 30.

[0017] The control unit 30 provided in the main body 10 consists of a CPU (not shown) and has an initial setting unit 32, a frequency analysis unit 34, a vector quantization unit 36, an evaluation point calculation unit 38, and a field processing unit 39, each provided as a program running on the CPU.

[0018] FIG. 3 shows the external appearance of the main body of the personal information management device of this embodiment.

[0019] The personal information management device of this embodiment can manage, for example, four kinds of personal information, "To_Do_List", "Schedule", "Phonebook", and "Memo", and has four applications, one for each kind. These applications are selectively launched from a specific menu screen shown on the display 22 of the main body 10. I1 to I4 are the icons the operator uses to select and launch the desired application: I1 is the icon for the "To_Do_List" application, I2 for the "Schedule" application, I3 for the "Phonebook" application, and I4 for the "Memo" application. A launched application performs database processing on the PIM database 20. B1 to B4 are buttons for launching these applications without using the display 22, and correspond, in order, to the "To_Do_List", "Schedule", "Phonebook", and "Memo" applications. The PIM database 20 has a storage area for one or more records for each of "To_Do_List", "Schedule", "Phonebook", and "Memo". Each record consists of one or more fields specific to its record type.
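The record layout described in this paragraph can be pictured with a small sketch. The "Phonebook" field names are the ones mentioned for FIG. 4 and in paragraph [0033]; the field names of the other record types, the class name `PimDatabase`, and its methods are hypothetical and serve only to make the example complete.

```python
from dataclasses import dataclass, field

# Field names per record type. Only the "Phonebook" names come from the
# source text; the other entries are hypothetical placeholders.
SCHEMA = {
    "To_Do_List": ["task"],                    # hypothetical
    "Schedule": ["date", "time", "event"],     # hypothetical
    "Phonebook": ["name", "telephone number (home)", "E_Mail", "address", "memo"],
    "Memo": ["text"],                          # hypothetical
}


@dataclass
class PimDatabase:
    # One storage area (a list of records) per record type.
    areas: dict = field(default_factory=lambda: {t: [] for t in SCHEMA})

    def add_record(self, record_type):
        """Append an empty record and return its index within its area."""
        record = {name: "" for name in SCHEMA[record_type]}
        self.areas[record_type].append(record)
        return len(self.areas[record_type]) - 1

    def write(self, record_type, index, field_name, value):
        """Write input data into one field; unknown field names are rejected."""
        record = self.areas[record_type][index]
        if field_name not in record:
            raise KeyError(field_name)
        record[field_name] = value


db = PimDatabase()
first = db.add_record("Phonebook")
db.write("Phonebook", first, "address", "Hachioji-shi")
```

Keeping the field list per record type in one table mirrors the statement that each record type has its own fixed set of fields.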

[0020] FIG. 4 shows an example of a record input screen for the PIM database. The record input screen shown on the display 22 has an area 40 for displaying field names and for entering and displaying the data of those fields, and a (check box) area 42 for setting the voice input target fields.

[0021] The initial setting unit 32 shown in FIG. 2 handles, as initial setting processing, the application launch described above and the setting of the voice input target fields. The voice input target fields are set when the operator uses the keyboard or mouse 24 to check any of the boxes in the check box area 42, as shown in FIG. 4, thereby designating the desired fields. FIG. 4 shows the case where the operator has designated the "name", "telephone number (home)", "E_Mail", and "address" fields for voice input.

[0022] The microphone 12 shown in FIG. 2 inputs speech uttered by the operator. The analog audio signal obtained by the microphone 12 is converted into a digital signal by the A/D converter 14 and supplied to the frequency analysis unit 34.

[0023] The frequency analysis unit 34 and the vector quantization unit 36 are recognition means that operate on the input audio signal. The frequency analysis unit 34 performs frequency analysis on the audio signal and obtains a frequency spectrum as the result. The vector quantization unit 36 obtains the vector values of the frequency spectrum obtained by the frequency analysis unit 34 and, referring to the codebook 16, performs vector quantization by representing the elements of each vector collectively with a single code.

[0024] The evaluation point calculation unit 38, the grammar data 18, and the field processing unit 39 constitute determining means that compares the label sequence obtained by the vector quantization unit 36 against the grammar corresponding to a designated voice input target field and, based on the comparison result, determines the field in which the data represented by the label sequence (that is, the input data) is to be stored.

[0025] A grammar is given in advance to each of the plurality of fields of the PIM database 20. The grammars are provided as grammar data 18, which contains symbols representing the characteristics of each field and a specific model (here, a hidden Markov model) representing the characters that can be entered in the field. Table 1 below shows the grammar data of this embodiment schematically. The grammars shown in Table 1 are not fixed; a dedicated editor may be provided so that the operator can edit them freely (add, change, delete, and so on).

[0026]

[Table 1]

[0027] The evaluation point calculation unit 38 evaluates the label sequence obtained by the vector quantization unit 36 using the hidden Markov models contained in the grammar data 18 and calculates an evaluation point (likelihood).

[0028] The representation of a word by a hidden Markov model (HMM) is explained here.

[0029] When the word "SUM" is represented by an HMM, the result is a finite-state model in which the transitions among the three labels "s", "u", and "m" are shown as arcs, as in FIG. 5. The transition probabilities are calculated in advance using a speech database or the like. In this example, when the model is in the state of label "s", the probability of remaining at "s" is 82%, the probability of a transition to label "z" is 5%, to label "sh" 3%, and to label "u" 10%. Although omitted from FIG. 5, a more practical model has roughly 200 arcs per label. In speech recognition using such a hidden Markov model, a plurality of candidate words are obtained from the label time series derived from the input speech, and the candidate with the highest transition probability (likelihood) is taken as the recognition result.
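As a worked illustration of the figures above, the following sketch scores a label sequence by multiplying the transition probabilities along it (summed as logarithms). It follows the text in treating labels as states, which is a simplification of a full HMM with separate hidden states and emissions. The probabilities out of "s" are the ones quoted above; those out of "u" and "m" are not given in the source and are hypothetical, as is the function name `log_likelihood`.

```python
import math

# Transitions out of "s" use the percentages quoted in the text.
# Transitions out of "u" and "m" are NOT in the source: hypothetical values.
TRANSITIONS = {
    "s": {"s": 0.82, "z": 0.05, "sh": 0.03, "u": 0.10},
    "u": {"u": 0.80, "m": 0.20},   # hypothetical
    "m": {"m": 0.90},              # hypothetical
}


def log_likelihood(labels, transitions=TRANSITIONS):
    """Sum of log transition probabilities along a label sequence.

    Returns -inf when the sequence uses a transition that has no arc.
    """
    total = 0.0
    for prev, cur in zip(labels, labels[1:]):
        p = transitions.get(prev, {}).get(cur, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


# "s" held for one extra frame, then a move to "u": 0.82 * 0.10
score = log_likelihood(["s", "s", "u"])
```

Working in log space avoids numerical underflow when sequences are long, which matters once a model has on the order of 200 arcs per label.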

[0030] The evaluation point calculation unit 38 of this embodiment thus evaluates the label sequence obtained by the vector quantization unit 36 using the hidden Markov models contained in the grammar data 18 and calculates an evaluation point (likelihood).

[0031] The field processing unit 39 determines, based on the evaluation points obtained by the evaluation point calculation unit 38, the field in which the recognition result is to be stored, performs grammar state transition processing according to the grammar data 18, and writes input data based on the recognition result into the appropriate field in the same or a different record of the PIM database 20.

[0032] There are several ways for the operator to speak when entering data by voice, ranging from uttering each word separately to so-called dictation, in which the operator speaks continuously in natural spoken language. This embodiment places no particular restriction on the operator's utterance, but to raise the recognition rate and prevent field determination errors, it is preferable to pause at least between fields and between the family name and given name in "name".

[0033] When, for example, an "address" field is followed by a "memo" field, the boundary between the fields is hard to determine. In this case it is preferable that the operator be able to designate the field directly by uttering "memo", that is, the field name, on finishing the voice input for "address". Specifically, to enter "memo: right next to the station" after "address: Hachioji-shi ...", the operator says "juusho ... Hachioji-shi ... memo ... eki no sugu soba" ("address ... Hachioji-shi ... memo ... right next to the station"). The device is configured so that, as soon as the field name "memo" is recognized, the "memo" field is chosen as the input field with top priority. This prevents field determination errors and makes data input more reliable. The device may also be configured so that uttering a predetermined reading of the field delimiter ("esc" in the example of Table 1), for example "escape" or "esc", forcibly ends the field. The fields the operator can designate directly need not be limited to the immediately following field as in this example; any field may be designated (direct field designation by voice command).
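The direct field designation described above can be sketched as a routing pass over already-recognized utterance segments. This is an illustration under stated assumptions, not the patented implementation: the segments are English glosses of the example utterance, `assign_segments` is a hypothetical helper, and segments that follow a delimiter are collected under the key `None` to stand for "field still to be decided by grammar matching".

```python
FIELD_NAMES = {"address", "memo"}        # spoken field names (subset, for illustration)
DELIMITER_READINGS = {"escape", "esc"}   # readings of the "esc" field delimiter


def assign_segments(segments, start_field):
    """Route recognized segments to fields.

    A spoken field name switches the target field with top priority;
    a delimiter reading closes the current field.
    """
    current = start_field
    result = {}
    for seg in segments:
        if seg in FIELD_NAMES:
            current = seg
        elif seg in DELIMITER_READINGS:
            current = None  # next field must be decided by grammar matching
        else:
            result.setdefault(current, []).append(seg)
    return result


routed = assign_segments(
    ["address", "Hachioji-shi", "memo", "right next to the station"],
    start_field="name",
)
```

In this sketch a spoken field name always wins over whatever field was current, which is one way to read "top priority" in the paragraph above.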

[0034] [Operation] FIG. 6 is a flowchart showing the operation of database construction in the personal information management device of this embodiment.

[0035] [Step S1] First, the operator designates an application by operating one of the icons I1 to I4 shown on the display 22, or one of the buttons B1 to B4, as described above. In response, the initial setting unit 32 launches the corresponding application. This determines the database to be processed.

[0036] [Step S2] Next, as shown in FIG. 4, the operator designates the voice input target fields on the record input screen of the PIM database.

[0037] [Step S3] Voice data is acquired up to a pause, and after field processing, which includes speech recognition processing, the voice input data is stored in the PIM database 20.

[0038] [Step S4] If the operator instructs that the database construction processing by speech recognition be ended, the processing ends; otherwise, it returns to step S3 and database construction continues. Although not described in detail here, data input that does not rely on speech recognition (for example, keyboard input) into fields other than the voice input fields designated in step S2 is performed from step S4 onward.

[0039] FIG. 7 is a flowchart showing the flow of the speech recognition processing in step S3.

[0040] [Step S21] The A/D converter 14 converts the analog audio signal obtained by the microphone 12 into a digital signal (here, for example, in PCM format). The resulting digital signal is supplied to the frequency analysis unit 34.

[0041] [Step S22] The frequency analysis unit 34 performs frequency analysis on the digital audio signal obtained in step S21. A frequency spectrum is obtained as the result.
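The source does not say which frequency analysis method the frequency analysis unit 34 uses. As one possible illustration, the sketch below computes a magnitude spectrum for a single frame of samples with a naive discrete Fourier transform; the choice of the DFT and the function name `magnitude_spectrum` are assumptions, not details from the patent.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT: magnitudes of bins 0 .. N/2 for one frame of samples."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append(abs(acc))
    return spectrum


# A frame holding exactly one cosine cycle: all the energy lands in bin 1.
frame = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = magnitude_spectrum(frame)
```

A practical implementation would more likely use a fast Fourier transform over overlapping windowed frames, but the naive form keeps the sketch dependency-free.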

[0042] [Step S23] The vector quantization unit 36 obtains the vector values of the frequency spectrum obtained by the frequency analysis unit 34 and, referring to the codebook 16, performs vector quantization by representing the elements of each vector collectively with a single code. Through vector quantization, the frequency spectra are classified into about 100 groups and can be represented by the names of those groups (a label sequence).
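Vector quantization against a codebook can be sketched as a nearest-entry lookup. The three-entry codebook, its two-dimensional vectors, and the squared Euclidean distance below are all assumptions made for illustration; the source only states that spectra are classified into about 100 groups with reference to the codebook 16.

```python
# Hypothetical codebook: label -> representative spectrum vector.
CODEBOOK = {
    "s": (1.0, 0.0),
    "u": (0.0, 1.0),
    "m": (0.5, 0.5),
}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames, codebook=CODEBOOK):
    """Replace each spectrum vector with the label of its nearest codebook entry."""
    return [
        min(codebook, key=lambda label: squared_distance(frame, codebook[label]))
        for frame in frames
    ]


labels = quantize([(0.9, 0.1), (0.1, 0.8), (0.5, 0.6)])
```

The output is the label sequence that later stages score against the field grammars.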

[0043] [Step S24] The evaluation point calculation unit 38 evaluates the label sequence obtained by the vector quantization unit 36 using the hidden Markov models (HMMs) contained in the grammar data 18 (FIG. 7, for example, shows the HMMs for "supermarket", "pharmacy", and "bank") and calculates an evaluation point (likelihood). More specifically, an evaluation is performed for each word to be recognized (stored in the codebook 16), the occurrence probability of each is calculated, and the word with the highest probability is taken as the recognition result. A low evaluation point in this case is regarded as meaning that the probability of the recognition result being correct is low.

[0044] The speech recognition processing that yields such a recognition result is executed within the field processing described next.

[0045] FIG. 8 is a flowchart showing the flow of the field processing and database writing processing in step S3.

[0046] [Step S31] Let I be the processing counter for the records of the database to be constructed and J the processing counter for fields; both counters are initialized by assigning 1 to them. The grammar of the first of the designated voice input target fields is also loaded. Because the operator does not necessarily begin with voice input for the first field, this load is provisional.

[0047] [Step S32] The speech recognition processing described with reference to FIG. 7 is started, and a recognition result for the speech uttered by the operator is obtained. Instead of performing the speech recognition processing in real time, a series of speech recognition results may be obtained in advance and processed together (batch processing). In that case, a data file in which the series of speech recognition results has been recorded is used as input.

[0048] [Step S33] The operation of this flow ends when the operator has given an end instruction and no speech recognition result is obtained even after a predetermined number of repetitions or the lapse of a predetermined time, or, in the case of the batch processing described above, when the end of the voice data file is detected.

[0049] [Step S34] The evaluation point (recognition score) of the recognition result obtained in step S33 is judged. If the evaluation point exceeds a predetermined threshold, the processing proceeds to step S35; if it is below the threshold, the processing proceeds to step S39.

[0050] [Step S35] Because the recognition score in step S34 is high, the current field is determined to be the input field, and the grammar transition for that field is started. The end of this grammar transition means the end of the field. The recognition result obtained in step S32 is written into the J-th field of the I-th record in the PIM database 20. FIG. 9 shows an example of a grammar transition.

[0051] [Step S36] The grammar transition in the field is advanced. For example, in FIG. 9, if the recognition score for "PM" (or "AM") is high in state st1, the state transitions to st2.
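A grammar transition of this kind can be pictured as a small state table. In the sketch below, only the st1 to st2 transition on "AM" or "PM" comes from the description; the remaining states (`st3`, `END`), the token names `HOUR` and `MINUTE`, the scores, and the threshold are hypothetical and are included only to show how reaching a final state would mark the end of the field.

```python
# Hypothetical transition table for a time-of-day field.
# Only ("st1", "AM"/"PM") -> "st2" is taken from the description;
# every other entry is assumed for illustration.
TRANSITIONS = {
    ("st1", "AM"): "st2",
    ("st1", "PM"): "st2",
    ("st2", "HOUR"): "st3",
    ("st3", "MINUTE"): "END",
}


def advance(state: str, token: str, score: float, threshold: float = 0.6) -> str:
    """Advance the grammar if the token is allowed here and scored highly."""
    nxt = TRANSITIONS.get((state, token))
    if nxt is None or score <= threshold:
        return state  # token does not fit the grammar at this state
    return nxt


state = "st1"
for token, score in [("PM", 0.9), ("HOUR", 0.8), ("MINUTE", 0.85)]:
    state = advance(state, token, score)
print(state)  # END
```

Reaching `END` in this sketch corresponds to the end of the field, which is the condition step S37 tests.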

[0052] [Step S37] Whether the field has ended is judged based on the state transition of the grammar. If the field is still in progress, the process returns to step S32 and input to the field continues. If the field has ended, the process proceeds to step S38.

[0053] [Step S38] The field processing counter J is incremented by one. That is, the next field is assumed, its grammar is loaded, and the operations from step S32 onward are repeated. When the counter J has reached its maximum value (J_MAX), the field is judged to be the last field in the record, the record processing counter I is incremented by one, and the field processing counter J is initialized. In other words, the next record is processed.
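The counter handling in step S38 can be sketched as below. The function name and the initial field index of 1 are assumptions; the description states only that J is incremented, and that at J_MAX the record counter I is incremented and J is initialized.

```python
def advance_counters(i: int, j: int, j_max: int, j_init: int = 1) -> tuple[int, int]:
    """Return the (record, field) counters after a field has been completed.

    j_init is the assumed initial value of the field counter J.
    """
    if j >= j_max:
        # Last field of the record: move to the next record and reset J.
        return i + 1, j_init
    return i, j + 1


print(advance_counters(1, 2, 5))  # (1, 3)
print(advance_counters(1, 5, 5))  # (2, 1)
```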

[0054] [Step S39] If the recognition score fell below the threshold in step S34, the fields near the current field (three to five fields ahead) are searched for a field that yields a high recognition score.

[0055] [Step S40] If the search succeeds and a field yielding a high recognition score is found, the process proceeds to step S41. If no field with a high recognition score is found, the recognition result is stored in one of the fields despite its low score, and that field is made identifiable, for example by attaching a mark to it. Alternatively, a message indicating that the field cannot be determined may be displayed on the display 22, and the operator may be asked to select the field in which the result should be stored.
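Steps S39 and S40 can be sketched as a bounded look-ahead search with a marked fallback. Everything named here is an assumption: `score_for` stands in for re-scoring the utterance against a field's grammar, the look-ahead of five fields is the upper end of the range given above, and storing a low-score result in the current field is just one possible reading of "one of the fields".

```python
from typing import Callable, Optional, Tuple


def search_nearby_field(score_for: Callable[[int], float], current: int,
                        last_field: int, threshold: float,
                        lookahead: int = 5) -> Optional[int]:
    """Return the best-scoring field ahead of `current`, or None if none
    of the fields within `lookahead` exceeds the threshold (step S39)."""
    best, best_score = None, threshold
    for j in range(current + 1, min(current + lookahead, last_field) + 1):
        s = score_for(j)
        if s > best_score:
            best, best_score = j, s
    return best


def choose_field(score_for: Callable[[int], float], current: int,
                 last_field: int, threshold: float) -> Tuple[int, bool]:
    """Return (field, marked). `marked` flags a low-confidence placement."""
    found = search_nearby_field(score_for, current, last_field, threshold)
    if found is not None:
        return found, False
    return current, True  # assumed fallback: keep the current field, marked


# Hypothetical per-field scores for one utterance.
scores = {3: 0.2, 4: 0.3, 5: 0.9, 6: 0.1}
print(choose_field(lambda j: scores.get(j, 0.0), 3, 8, 0.6))  # (5, False)
```

A marked result could later be shown to the operator for review, in line with the alternative of letting the operator choose the field.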

[0056] [Step S41] The field processing counter J is incremented by one, and the grammar corresponding to the new value of J is loaded.

[0057] [Step S42] The current field is determined to be the input field, and the grammar transition for that field is started. The recognition result obtained in step S32 is written to the I-th record and J-th field in the PIM database 20.

[0058] In steps S35 and S42 above, the field in which the speech recognition result is to be stored is determined, and the recognition result is converted into input data consisting of character data and symbol data. This input data is displayed on the display 22, and at this point it can be edited by the operator. After such editing, the input data is written to the appropriate field of the PIM database 20 in response to an instruction from the operator.

[0059] According to the embodiment described above, the speech recognition result is compared against the grammar representing the characteristics of a field, and the field in which to store the input data represented by the recognition result is determined based on the comparison result. There is therefore no need to move between fields using a keyboard or the like. In other words, the operator designates the voice-input target field in advance and then simply utters the input data in order; no operation for moving between fields is required.

[0060] Therefore, when constructing a database with an enormous number of fields and records, the operational burden on the operator can be greatly reduced.

[0061] The present invention is not limited to the embodiment described above and can be implemented with various modifications. For example, although the embodiment described above applies the present invention to a personal information management device, the present invention may also be applied to so-called personal computers (PCs), workstations, and various other information terminal devices.

[0062] Further, in entering an "address", for example, if a postal code or the exchange code of a telephone number has been entered, candidates such as place names can be narrowed down closely based on that information. Accordingly, a postal code to place name database or an exchange code to place name database may additionally be provided to prevent erroneous characters when selecting the kanji for a place name.
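The narrowing described above can be sketched as a simple lookup filter. The table contents are placeholders rather than real postal data, and the names `POSTAL_TO_PLACES` and `narrow_candidates` are hypothetical; an exchange code to place name table would be handled the same way.

```python
# Placeholder entries for illustration only; not real postal data.
POSTAL_TO_PLACES = {
    "000-0001": {"EXAMPLE-TOWN-A"},
    "000-0002": {"EXAMPLE-TOWN-B", "EXAMPLE-TOWN-C"},
}


def narrow_candidates(candidates: list[str], postal_code: str) -> list[str]:
    """Keep only the place-name candidates consistent with the postal code.

    If the code is unknown, or the filter would remove every candidate,
    the original list is returned unchanged.
    """
    allowed = POSTAL_TO_PLACES.get(postal_code)
    if not allowed:
        return list(candidates)
    narrowed = [c for c in candidates if c in allowed]
    return narrowed or list(candidates)


print(narrow_candidates(["EXAMPLE-TOWN-A", "EXAMPLE-TOWN-B"], "000-0001"))
```

Falling back to the unfiltered list is a design choice made here so that a missing or mistaken code never discards the recognizer's own candidates.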

[0063]

[Effects of the Invention] As described above, the present invention can provide an information terminal device capable of reducing the operational burden on the operator in constructing a database using speech recognition.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a schematic configuration of an information terminal device according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of an embodiment in which the present invention is applied to a personal information management (PIM) device.

FIG. 3 is a view showing the appearance of the main body of the personal information management device according to the embodiment.

FIG. 4 is a diagram showing an example of a record input screen of the PIM database.

FIG. 5 is a diagram explaining the representation of a word by a hidden Markov model (HMM).

FIG. 6 is a flowchart showing operations related to database construction in the personal information management device according to the embodiment.

FIG. 7 is a flowchart showing the flow of the speech recognition process.

FIG. 8 is a flowchart showing the flow of field processing and database writing processing.

FIG. 9 is a diagram showing an example of a grammar transition.

[Explanation of symbols]

1 ... Control means
2 ... Database
3 ... Display means
4 ... Designating means
5 ... Voice input means
10 ... Main body
12 ... Microphone
14 ... A/D converter
16 ... Code book
18 ... Grammar data
20 ... PIM database
22 ... Display
24 ... Keyboard or mouse

Continued from the front page: (51) Int.Cl.7, identification symbol, FI, theme code (reference): G10L 15/18 G06F 15/40 310G 15/00 G10L 3/00 535A 15/22 537Z // G06F 15/02 325 551B 561C

Claims


1. An information terminal device that manages information by constructing a database storing a plurality of records each including at least a plurality of fields, the device comprising: designating means for allowing an operator to designate in advance, before voice input, a field to be voice-input among the plurality of fields of the record; recognition means for recognizing a voice uttered by the operator; determining means for comparing the recognition result obtained by the recognition means against, among the grammars previously given to each of the plurality of fields, the grammar corresponding to the voice-input target field designated by the designating means, and for determining, based on the comparison result, a field in which to store the recognition result; and writing means for writing input data based on the recognition result to the determined field in the same or a different record of the database.

2. The information terminal device according to claim 1, further comprising: storage means for storing the grammar previously given to each of the plurality of fields as grammar data representing a symbol that expresses the characteristics of the field and a specific model for recognizing voice corresponding to the symbols, characters, and character strings that can be input to the field according to those characteristics; and reading means for reading the grammar data from the storage means, wherein the determining means performs the comparison with the recognition result based on a state transition of the grammar based on the grammar data read by the reading means.

3. The information terminal device according to claim 2, wherein the recognition result obtained by the recognition means is a label sequence obtained by frequency-analyzing the voice of the operator and vector-quantizing the spectrum obtained as the analysis result, the specific model is a hidden Markov model, and the determining means performs the comparison based on a likelihood obtained by evaluating the label sequence with the hidden Markov model.

4. The information terminal device according to claim 2, further comprising changing means for changing the contents of the grammar data in the storage means.

5. The information terminal device according to claim 1, further comprising: recognition result storage means for storing a series of speech recognition results obtained by the recognition means; and batch control means for reading the series of speech recognition results stored in the storage means and collectively performing, for the read recognition results, the determination of a field by the determining means and the writing of input data by the writing means.

6. A personal information management device comprising: a personal information database storing a plurality of records each including at least a plurality of fields; display means for displaying an input screen for inputting data to a plurality of fields of a given record of the personal information database; input means for inputting a voice uttered by an operator; and control means for controlling data input to the personal information database, wherein the control means includes: designating means for displaying the input screen on the display means and allowing the operator to designate in advance, before voice input, a field to be voice-input among the plurality of fields; recognition means for recognizing the voice input from the input means; determining means for comparing the recognition result obtained by the recognition means against, among the grammars previously given to each of the plurality of fields, the grammar corresponding to the voice-input target field designated by the designating means, and for determining, based on the comparison result, a field in which to store the recognition result; writing means for writing input data based on the recognition result to the determined field in the same or a different record of the personal information database; and display control means for displaying the input data on the input screen.

7. A recording medium on which is recorded a program for managing information by constructing a database storing a plurality of records each including at least a plurality of fields, the program causing execution of: designating means for allowing an operator to designate in advance, before voice input, a field to be voice-input among the plurality of fields of the record; recognition means for recognizing a voice uttered by the operator; determining means for comparing the recognition result obtained by the recognition means against, among the grammars previously given to each of the plurality of fields, the grammar corresponding to the voice-input target field designated by the designating means, and for determining, based on the comparison result, a field in which to store the recognition result; and writing means for writing input data based on the recognition result to the determined field in the record.