JP2005091991A

JP2005091991A - Personal identification system, personal identification method, and storage device used for personal identification system

Info

Publication number: JP2005091991A
Application number: JP2003327472A
Authority: JP
Inventors: Tetsuo Konno; 哲郎今野
Original assignee: Seiko Precision Inc
Current assignee: Seiko Precision Inc
Priority date: 2003-09-19
Filing date: 2003-09-19
Publication date: 2005-04-07

Abstract

PROBLEM TO BE SOLVED: To provide a personal identification system with a high recognition rate that reduces the memory capacity required of the storage device, keeps the cost of the storage device low, and is little affected by the condition of the person being identified.
SOLUTION: The system comprises a portable storage device that stores in advance a first time interval between a plurality of predetermined characters, the plurality of characters, and a predetermined frequency, all obtained from the voice uttered by a specific individual for a predetermined series of characters; and an information processing apparatus having reading means for reading the first time interval, the plurality of characters, and the frequency stored in the storage device, voice input means for inputting the voice uttered by the person to be identified, voice data extraction means for extracting, from the plurality of characters in the voice input by the voice input means, a second time interval corresponding to the first time interval and a frequency corresponding to the predetermined frequency, and collation means for collating the second time interval and frequency extracted by the voice data extraction means against the first time interval and frequency read by the reading means.
[Selected drawing] Figure 1

Description

The present invention relates to a personal identification system that identifies a specific individual by collating against information on that individual stored in advance in a storage device, to a personal identification method, and to a storage device used in the personal identification system.

Conventionally, one method of identifying a specific individual was the information processing apparatus described in JP-A-5-174203. This apparatus used information in an IC card to grant access to a specific service through voice recognition, and consisted of a voice input device and an IC card detachable from it. In addition to the specific individual's information and unique information, the IC card held collation means for deciding whether use was permitted.

Another example was the personal identification mechanism described in JP-A-6-215017. This mechanism compared the voice signal input from a voice input mechanism against the user's pre-registered voice data. It also analyzed and recognized the input voice signal, converted it into character codes as a keyword, and compared the result against a registered keyword, thereby determining whether the user was the registered person.

A further example was the personal identification system described in JP-A-5-18147. In this system, data on physical or chemical characteristics obtained from the card carrier was collated against data encrypted and registered in advance in an IC memory built into the card, to determine whether the card carrier was the specific individual registered in the card's memory.
JP-A-5-174203, JP-A-6-215017, JP-A-5-18147

In the information processing apparatus described in JP-A-5-174203, the IC card stores collation means in addition to the specific individual's information and unique information, so its storage capacity had to be large and the cost of the IC card was high.

In the personal identification mechanism described in JP-A-6-215017, identification requires both the user's own voice data and a keyword obtained by converting the user's voice signal into character codes, so storing this voice data and keyword required a storage medium with large capacity and high cost.

Furthermore, in the personal identification system described in JP-A-5-18147, the physical or chemical characteristics change with the condition of the card carrier, so the identification rate for a specific individual could fluctuate widely. In addition, data on physical or chemical characteristics requires a large storage capacity even when compressed, which raised the cost of the card.

To solve the above problems, the personal identification system of the present invention is characterized by comprising: a portable storage device that stores in advance a first time interval between a plurality of predetermined characters, together with those characters, obtained from the voice uttered by a specific individual for a predetermined series of characters; and an information processing apparatus having reading means for reading the first time interval and the plurality of characters stored in the storage device, voice input means for inputting the voice uttered by the person to be identified, voice data extraction means for extracting, from the plurality of characters in the voice input by the voice input means, a second time interval corresponding to the first time interval, and collation means for collating the second time interval extracted by the voice data extraction means against the first time interval read by the reading means.

The personal identification system of the present invention is also characterized by comprising: a portable storage device that stores in advance a first time interval between a plurality of predetermined characters, the plurality of characters, and a predetermined first frequency, all obtained from the voice uttered by a specific individual for a predetermined series of characters; and an information processing apparatus having reading means for reading the first time interval, the plurality of characters, and the first frequency stored in the storage device, voice input means for inputting the voice uttered by the person to be identified, voice data extraction means for extracting, from the plurality of characters in the voice input by the voice input means, a second time interval corresponding to the first time interval and a second frequency corresponding to the first frequency, and collation means for collating the second time interval and second frequency extracted by the voice data extraction means against the first time interval and first frequency read by the reading means.

The personal identification method of the present invention is characterized by comprising the steps of: storing, in a portable storage device, a predetermined first time interval, a plurality of characters, and a predetermined first frequency obtained from the voice uttered by a specific individual for a predetermined series of characters; reading the first time interval, the plurality of characters, and the predetermined first frequency stored in the storage device; inputting the voice uttered by the person to be identified; extracting, from the plurality of characters in the voice input by the voice input means, a second time interval corresponding to the first time interval and a second frequency corresponding to the first frequency; and collating the second time interval and second frequency extracted by the voice data extraction means against the first time interval and first frequency read by the reading means.

Preferably, the first time interval between the plurality of predetermined characters is the time interval between the utterances of the first and last characters of the predetermined series of characters.

Preferably, the first frequency is the highest of the predetermined formant frequencies in the voice uttered by the specific individual for the predetermined series of characters.

Preferably, the specific individual utters the predetermined series of characters a plurality of times, and the mean and standard deviation of the first time interval or the first frequency over those utterances are stored in the storage device.

Preferably, the storage device is an RF-ID card, an RF-ID tag, or a contact IC card.

Preferably, the storage device and the information processing apparatus have means for exchanging information wirelessly.

Preferably, the storage device is provided with an extraction storage device that stores the first time interval and the first frequency.

Preferably, the storage device used in the personal identification system of the present invention has means for generating drive power by electromagnetic induction.

According to the present invention, what is stored in the storage device is the cardholder's personal information and unique information extracted by predetermined processing, so the memory capacity can be reduced and the cost of the storage device kept low. In addition, because a specific individual is identified using both the time interval between a plurality of authentication characters and a frequency selected according to a predetermined criterion, the recognition rate can be raised. Furthermore, because the person being identified is judged to be the cardholder whenever the time interval between the authentication characters and the predetermined frequency in that person's utterance fall within predetermined ranges, the result is less affected by the condition of the person being identified.

An example in which an embodiment of the present invention is used for security when entering and leaving a room is described in detail below with reference to the drawings. The present invention is not limited to room access security and can also be used, for example, to identify computer users. The identification system of this embodiment identifies an individual based on information stored in a portable RF-ID card 10 (FIG. 1). It has the RF-ID card 10; a data extraction device 20 (FIG. 2) that extracts the cardholder's information and stores it on the RF-ID card; and an information processing apparatus 40 (FIG. 1) that reads the information stored on the RF-ID card and compares it with input data to determine whether the card carrier is the cardholder.

As shown in FIG. 1, the RF-ID card (storage device) 10 contains: personal information storage means 11, which stores personal information about the cardholder (the specific individual); unique information storage means 12, which stores the cardholder's unique information (biometric information); communication/control/power generation means 13, which enables communication with an externally provided information processing apparatus, controls the storing and reading of information in the personal information storage means 11 and the unique information storage means 12, and generates drive power by electromagnetic induction from received radio waves; and a transmission/reception antenna 14. The personal information storage means 11 stores personal information such as the cardholder's name, sex, date of birth, and registration code number. The unique information storage means 12 stores unique information (biometric information) based on the cardholder's voice as information specific to the cardholder's body. The communication/control/power generation means 13 and the transmission/reception antenna 14 are those provided in existing RF-ID cards; with them, information can be exchanged with an information processing apparatus placed at a distance of several millimeters to several meters. To store information in the personal information storage means 11 and the unique information storage means 12, the RF-ID card 10 is connected, with or without contact, to the data extraction device 20 shown in FIG. 2. Although this embodiment uses an RF-ID card with a wireless function as the storage medium, other storage media such as a contact IC card can of course be used, and information may be exchanged by connecting to the information processing apparatus 40 rather than wirelessly.

As shown in FIG. 2, the data extraction device 20 has: a microphone (voice input means) 21 for inputting the voice of the cardholder uttering a predetermined series of characters; an input processing unit 22 that converts the voice input from the microphone 21 into a current value and binarizes that value; a voice CODEC unit 23 that converts the voice data obtained by the input processing unit 22 into character data; a timing processing unit 24; an authentication character extraction unit 25; a timing data extraction unit 26; and a voice frequency component extraction unit 27. The timing processing unit 24 detects, from the data obtained by the input processing unit 22 and the voice CODEC unit 23, the start and end timing of the voice data for each character uttered by the cardholder. The authentication character extraction unit 25 uses the timing obtained by the timing processing unit 24 to extract, from the data output by the voice CODEC unit 23, the data corresponding to the plurality of characters needed for authentication. The timing data extraction unit 26 uses the extraction result of the authentication character extraction unit 25 to extract, from the timing obtained by the timing processing unit 24, the time interval between the authentication characters (the first time interval). The voice frequency component extraction unit 27 uses the timing obtained by the timing processing unit 24 to calculate the frequency corresponding to the voice data of each target character, and extracts a single frequency (the first frequency) according to a predetermined condition. The extraction method is described later.

The plurality of characters extracted by the authentication character extraction unit 25 are set in advance using authentication character setting means 29 connected to the timing processing unit 24. For example, the first and last characters of the voice to be input are entered from input means (not shown), such as a keyboard, provided in the authentication character setting means 29. It is also possible for the data extraction device to designate, as authentication characters, characters that the voice CODEC unit 23 can easily convert from voice to text.

The frequency extracted by the voice frequency component extraction unit 27 is set in advance using a frequency setting unit 30 connected to the voice frequency component extraction unit 27. The description below assumes, as an example, that the voice frequency component extraction unit 27 extracts the voice data having the highest of the frequencies corresponding to the individual characters. Several setting methods may be prepared and one selected from among them. In that case, the selected method is preferably stored in the unique information storage means 12 of the RF-ID card 10.

The voice is input a plurality of times. For the time intervals (second time interval) extracted by the timing data extraction unit 26 from each input, a calculation unit 31 computes the mean and standard deviation. Likewise, the voice frequency component extraction unit 27 extracts the voice frequency (second frequency) from each input, and the calculation unit 31 determines the mean and standard deviation of the highest voice frequency. Any value that indicates the spread of the repeated measurements (for example, the variance) may be used in place of the standard deviation.

In this way, the predetermined series of characters, the plurality of authentication characters, the calculated mean and standard deviation of the time interval between the authentication characters, and the mean and standard deviation of the highest voice frequency are arranged into a predetermined data format by an output unit 32 as the cardholder's unique information, and are stored in the unique information storage means 12 of the RF-ID card 10, which is connected to the output unit 32 with or without contact at the time of storage. Because the unique information storage means 12 consists of nonvolatile memory in the RF-ID card 10, the stored information is retained after the RF-ID card 10 is removed from the output unit 32. Since the unique information storage means 12 holds only the mean and standard deviation of the time interval between the authentication characters and the mean and standard deviation of the highest voice frequency, a very small storage capacity is sufficient. If the predetermined series of characters is not needed for checking the cardholder, it need not be stored in the unique information storage means 12, which reduces the required capacity further. The capacity can be reduced further still by storing only the mean time interval and the mean voice frequency; in that case, a predetermined value is set separately in place of the standard deviation. The personal information is stored in the personal information storage means 11 by an existing tag storage device.
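To give a feel for how little memory such a record needs, the following is a minimal sketch of one possible layout for the unique information. The class name `VoiceTemplate`, the field names, the byte layout, and the numeric values are all assumptions made for illustration; the patent only says the data is arranged into "a predetermined data format".

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceTemplate:
    """Unique information written to the card (hypothetical layout)."""
    phrase: str            # predetermined series of characters
    auth_chars: str        # authentication characters, e.g. first and last
    interval_mean: float   # mean interval between authentication characters (s)
    interval_std: float    # standard deviation of that interval (s)
    freq_mean: float       # mean of the extracted highest frequency (Hz)
    freq_std: float        # standard deviation of that frequency (Hz)

    def pack(self) -> bytes:
        # Two length-prefixed UTF-8 strings followed by four 32-bit floats.
        phrase = self.phrase.encode("utf-8")
        chars = self.auth_chars.encode("utf-8")
        header = struct.pack("<BB", len(phrase), len(chars))
        stats = struct.pack("<4f", self.interval_mean, self.interval_std,
                            self.freq_mean, self.freq_std)
        return header + phrase + chars + stats


# Illustrative values only; they are not taken from the patent.
template = VoiceTemplate("ひらけごま", "ひま", 0.80, 0.02, 320.0, 8.0)
record = template.pack()
print(len(record), "bytes")
```

Under these assumptions the whole record occupies a few tens of bytes, and omitting the phrase or the standard deviations, as the text allows, would shrink it further.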

As shown in FIG. 1, the information processing apparatus 40 has communication/control means 42, voice data extraction means 45, and collation means 46. The communication/control means 42 enables communication with the RF-ID card 10 via a transmission/reception antenna 41, and controls the reading of the information stored in the personal information storage means 11 and the unique information storage means 12 by the personal information/unique information confirmation means (reading means) 43 connected to it. The voice data extraction means 45 extracts the time interval between the authentication characters and the highest voice frequency from the voice that the person carrying the RF-ID card 10 (the person to be identified) inputs through a microphone (voice input means) 44. The collation means 46 collates the personal information held by the apparatus against the information stored in the personal information storage means 11, and collates the unique information stored in the unique information storage means 12 against the information extracted by the voice data extraction means 45. The voice input through the microphone 44 is an utterance of the same predetermined series of characters that was input through the microphone 21 when the unique information was registered.

If the collation means 46 determines that the person carrying the RF-ID card 10 is the owner of the card, it outputs a control signal to the electric lock 50 of the doorway connected to it, and the electric lock 50 opens automatically in response. If the collation means 46 determines that the person carrying the RF-ID card 10 is not the owner of the card, it outputs no control signal to the electric lock 50, and the electric lock 50 remains closed.

Because identification uses the time interval between a plurality of authentication characters and the highest frequency of the voice, it is little affected by the condition of the card carrier and can be performed with high probability. In addition, the information needed for identification that is stored on the RF-ID card 10 is limited to the authentication characters, the time interval between them, and the highest voice frequency, so the memory capacity can be small. Furthermore, because no separate power source is required, the carried card can be made smaller and thinner.

Next, the identification operation using the identification system of this embodiment is described.
Before identification is performed, the cardholder's personal information and unique information are stored on the RF-ID card 10 in advance using the data extraction device 20. In this embodiment, 「ひらけごま」 (hirake goma, "open sesame") is chosen as the series of characters the cardholder utters, and the first character 「ひ」 (hi) and the last character 「ま」 (ma) are set as the authentication characters by the authentication character setting means 29. The frequency setting unit 30 is set to extract the highest frequency. Any series of two or more characters can be chosen for the cardholder to utter. The authentication characters can be any characters at any positions in the series, and any number of authentication characters can be set as long as there are at least two. Besides the highest frequency, the lowest frequency or the average frequency, for example, can be selected as the frequency to extract, and the extraction range may cover all of the uttered characters or, for example, only the authentication characters.

When the cardholder's utterance of 「ひらけごま」 is input through the microphone 21, the timing processing unit 24 calculates the timing of each of 「ひ」, 「ら」, 「け」, 「ご」, and 「ま」, and the timing data extraction unit 26 extracts the time interval between the data for the authentication characters, the first 「ひ」 and the last 「ま」. This time interval is taken between the utterance start timings of 「ひ」 and 「ま」, although the interval between the utterance end timings or between the peaks of the utterance volume may of course be used instead. The time interval data is stored in the calculation unit 31.
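The interval extraction described above can be sketched as follows. This is a minimal illustration that assumes per-character start and end times are already available from some upstream recognizer; the type `CharTiming`, the function `auth_interval`, and the timing values are hypothetical and not taken from the patent.

```python
from typing import NamedTuple


class CharTiming(NamedTuple):
    char: str
    start: float  # utterance start (s)
    end: float    # utterance end (s)


def auth_interval(timings, first_char, last_char):
    """Interval between the utterance starts of the two authentication characters."""
    first = next(t for t in timings if t.char == first_char)
    # Search from the end so a repeated character resolves to its last occurrence.
    last = next(t for t in reversed(timings) if t.char == last_char)
    return last.start - first.start


# Illustrative timings for one utterance of the phrase.
timings = [
    CharTiming("ひ", 0.00, 0.15),
    CharTiming("ら", 0.20, 0.35),
    CharTiming("け", 0.40, 0.55),
    CharTiming("ご", 0.60, 0.75),
    CharTiming("ま", 0.80, 1.00),
]
print(auth_interval(timings, "ひ", "ま"))
```

Using the end times or the volume peaks instead, as the text permits, would only change which field is subtracted.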

The voice frequency, meanwhile, is extracted by the voice frequency component extraction unit 27, which extracts the frequency corresponding to the voice data of each of 「ひ」, 「ら」, 「け」, 「ご」, and 「ま」, compares them, and selects the highest. The frequency data is stored in the calculation unit 31.
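A minimal sketch of this selection step follows, assuming one representative frequency per character has already been computed; how that frequency is derived from the audio is outside the sketch. The function name `select_frequency`, the method labels, and the frequency values are hypothetical. The "lowest" and "mean" options correspond to the alternatives the embodiment mentions.

```python
from statistics import mean


def select_frequency(char_freqs, method="highest"):
    """Reduce per-character frequencies (Hz) to the single value used for identification."""
    values = list(char_freqs.values())
    if method == "highest":
        return max(values)
    if method == "lowest":
        return min(values)
    if method == "mean":
        return mean(values)
    raise ValueError(f"unknown selection method: {method}")


# Illustrative per-character frequencies for one utterance.
char_freqs = {"ひ": 320.0, "ら": 280.0, "け": 300.0, "ご": 250.0, "ま": 270.0}
print(select_frequency(char_freqs))
```

Whichever method is chosen at registration must also be used at identification, which is why the text suggests storing the selected method on the card.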

This extraction is repeated a plurality of times (for example, five times). The calculation unit 31 calculates the mean and standard deviation of the time interval data and of the frequency data, and the results are output from the output unit 32 and stored in the unique information storage means 12 together with the series of characters and the authentication characters.
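The registration statistics can be sketched with the Python standard library as below. The five sample values are invented for illustration, and the use of the population standard deviation (`pstdev`) is an assumption: the text does not say whether the population or the sample form is intended.

```python
from statistics import mean, pstdev


def summarize(samples):
    """Return (mean, standard deviation) over the repeated registration utterances."""
    return mean(samples), pstdev(samples)


# Illustrative measurements from five utterances of the phrase.
intervals = [0.80, 0.82, 0.78, 0.81, 0.79]       # seconds
frequencies = [320.0, 316.0, 324.0, 318.0, 322.0]  # Hz

interval_mean, interval_std = summarize(intervals)
freq_mean, freq_std = summarize(frequencies)
print(interval_mean, interval_std)
print(freq_mean, freq_std)
```

Swapping `pstdev` for `stdev` gives the sample standard deviation, which is slightly larger for so few utterances and would therefore widen the acceptance band used at collation.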

Next, the identification operation on entering a room equipped with the identification system of this embodiment is described with reference to FIG. 3.
When a person carrying an RF-ID card 10 on which personal information and unique information have been stored as described above wishes to enter the room, that person holds the RF-ID card 10 over the information processing apparatus 40 located near the electric lock 50 (step S101). The information processing apparatus 40, which has been in standby, then becomes active, and communication takes place between the transmission/reception antenna 41 and the transmission/reception antenna 14.

The card carrier then utters the predetermined series of characters into the microphone 44 to input the voice (step S102). Next, the uttered voice is checked against the set series of characters (step S103). If the voice data extraction means 45 determines that the uttered characters differ from the set series, identification ends without further processing (NO in step S103).

If the uttered characters are determined to be the series of characters read from the card (YES in step S103), the voice data extraction means 45 extracts the authentication characters read from the card (「ひ」 and 「ま」 in this embodiment), measures the time interval between them, and extracts the highest frequency from the voice data of the characters (step S104).

Step S103 may also be omitted. In that case, after voice input (step S102), two predetermined characters are extracted, the time interval between them is measured, and the voice frequency component data is extracted (step S104), without checking whether the card holder's voice matches the set series of characters. The voice data extraction means 45 extracts the authentication characters from the voice produced by the card holder and measures the time interval between them. If this extraction fails, identification ends without any further processing.

Meanwhile, the personal information/unique information confirmation means 43 reads the unique information stored in the unique information storage means 12 via the transmission/reception antenna 41 and the communication/control means 42 (step S105). This unique information consists of the mean and standard deviation of the time interval and the mean and standard deviation of the highest voice frequency. The collation means 46 collates the unique information thus read with the time interval data and highest frequency data extracted from the voice uttered by the card holder (step S106).

The collation is performed in the collation means 46 as follows.
First, the differences between the time interval and highest frequency extracted from the voice uttered by the card holder and the means of the time interval and highest frequency stored in the unique information storage means 12 are obtained. The absolute value of each difference is then multiplied by 10 and divided by the corresponding standard deviation stored in the unique information storage means 12. Adding 50 to each result gives the deviation value of the extracted time interval and frequency relative to the voice data stored in the unique information storage means 12. If this deviation value is 45 or more and 55 or less, the time interval and highest frequency extracted from the card holder's voice are regarded as close to the mean time interval and mean highest frequency stored in the unique information storage means 12, and the card holder is determined to be the card owner. This deviation value range can be set freely according to the required security level and similar considerations.
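The deviation value calculation described above can be sketched as follows. This is only an illustrative sketch: the names `UniqueInfo`, `deviation_score`, and `is_card_owner`, the units (milliseconds and hertz), and the sample numbers are assumptions made for the example and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class UniqueInfo:
    """Hypothetical record of the unique information read from the card."""
    interval_mean: float  # mean time interval between the authentication characters (ms)
    interval_sd: float    # standard deviation of that interval (ms)
    freq_mean: float      # mean highest frequency (Hz)
    freq_sd: float        # standard deviation of the highest frequency (Hz)


def deviation_score(measured: float, mean: float, std_dev: float) -> float:
    """Deviation value as described: 50 + 10 * |measured - mean| / std_dev."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return 50.0 + 10.0 * abs(measured - mean) / std_dev


def is_card_owner(interval: float, frequency: float, info: UniqueInfo,
                  low: float = 45.0, high: float = 55.0) -> bool:
    """Accept only if both deviation values fall inside [low, high]."""
    scores = (
        deviation_score(interval, info.interval_mean, info.interval_sd),
        deviation_score(frequency, info.freq_mean, info.freq_sd),
    )
    return all(low <= s <= high for s in scores)


# Made-up enrollment statistics and measurements, for illustration only.
info = UniqueInfo(interval_mean=800.0, interval_sd=50.0, freq_mean=750.0, freq_sd=20.0)
accepted = is_card_owner(820.0, 760.0, info)   # both values within half a standard deviation
rejected = is_card_owner(900.0, 760.0, info)   # interval is two standard deviations away
```

Because the calculation as described takes the absolute value of the difference, the score in this sketch never falls below 50. The range of 45 to 55 is therefore equivalent to requiring each measurement to lie within half a standard deviation of its stored mean.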

The collation of the highest frequency extracted from the voice is described in more detail below with reference to FIGS. 4 and 5.
When words are spoken, the voice consists of vowels and consonants. Consonants do not have a frequency spectrum with stable peaks, whereas the peaks in the frequency spectrum of a vowel are comparatively stable. These peaks are generally called formant frequencies. Because formant frequencies depend on the shape of the human trachea, they differ from person to person. The present embodiment uses formant frequencies, which have this property, to identify individuals.

FIG. 4 is a block diagram showing the detailed configuration of the voice data extraction means 45 and the collation means 46 shown in FIG. 1. As shown in FIG. 4, the voice data extraction means 45 has an audio CODEC unit 51, an FFT or autocorrelation function calculation circuit 52, and a storage device 53. The collation means 46 has a target frequency component setting circuit 61, a selection circuit 62, and a comparison circuit 63.

In FIG. 4, the analog voice signal input from the microphone 44 is converted into digital voice data by the audio CODEC unit 51, and the FFT or autocorrelation function calculation circuit 52 obtains the frequency spectrum from the voice data by successive FFTs or by an autocorrelation function. This spectrum analysis yields the stable frequency peaks of the vowels over time. For each of the first through fifth vowels, the first three frequency peaks, namely the first, second, and third formant frequencies, are stored in the storage device 53.
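As a rough illustration of picking the first three spectral peaks from one frame of voice data, the following sketch uses a naive DFT on a synthetic signal. It is a toy stand-in, not the circuit 52 of the embodiment: the function names, the 8000 Hz sample rate, the 400-sample frame, the peak threshold, and the three test tones are all assumptions, and real formant estimation on speech normally needs windowing and spectral-envelope smoothing that are omitted here.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), adequate for a short frame)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def lowest_peaks(samples, sample_rate, count=3, rel_threshold=0.1):
    """Return the `count` lowest-frequency local maxima above a relative threshold."""
    mags = magnitude_spectrum(samples)
    floor = rel_threshold * max(mags)
    peaks = []
    for k in range(1, len(mags) - 1):
        if mags[k] >= floor and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]:
            peaks.append(k * sample_rate / len(samples))  # bin index -> Hz
    return peaks[:count]


def synth_frame(components, sample_rate, length):
    """Sum of sinusoids given as (frequency_hz, amplitude) pairs."""
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in components)
            for i in range(length)]


# Synthetic frame with three made-up spectral peaks standing in for a vowel.
SAMPLE_RATE = 8000
frame = synth_frame([(700, 1.0), (1200, 0.6), (2600, 0.3)], SAMPLE_RATE, 400)
peaks = lowest_peaks(frame, SAMPLE_RATE)
```

The three values in `peaks` play the role of the first, second, and third formant frequencies that the description stores in the storage device 53.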

FIG. 5 shows, for each of the vowels "a" (あ), "i" (い), "u" (う), "e" (え), and "o" (お), example mean values of the three lowest-frequency peaks of the frequency spectrum (the first, second, and third formant frequencies).

The frequency to be collated is selected by the selection circuit 62, in accordance with the target frequency component setting circuit 61, from the formant frequencies stored in the storage device 53, and the frequency data is output to the comparison circuit 63. In the present embodiment the highest frequency is selected from the first formant frequency region, but the highest frequency may instead be selected from the second or third formant frequency region.

Meanwhile, the comparison circuit 63 reads, from the personal information/unique information confirmation means 43, the individual's specific formant frequency data (personal information data) registered in advance on the RF-ID card 10, compares it with the selected frequency data input from the selection circuit 62, and generates a match output if the two agree.

If the above collation determines that the card holder is the card owner (YES in step S107), and the collation of personal information (not shown) also confirms this, a control signal is output from the collation means 46 to the electric lock 50 and the electric lock 50 opens automatically. If, on the other hand, the card holder is determined not to be the card owner (NO in step S107), the voice is input again and the collation is repeated. It is preferable to provide a counting unit in the collation means 46 to count the number of collations and, if the card holder has still not been determined to be the card owner after a predetermined number of collations, to accept no further voice input from the microphone 44. This excludes the case in which someone is eventually determined to be the card owner simply by inputting voice a very large number of times.
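The retry limit described above could be sketched as follows. The function name `identify_with_limit`, the `verify` callback, and the limit of three attempts are hypothetical; the specification only states that a predetermined number of collations is allowed.

```python
from itertools import islice
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def identify_with_limit(utterances: Iterable[T],
                        verify: Callable[[T], bool],
                        max_attempts: int = 3) -> bool:
    """Collate at most `max_attempts` utterances and stop at the first success.

    Returns True when an utterance passes collation (the point at which the
    control signal would be sent to the electric lock), and False once the
    allowed attempts are used up and no further voice input is accepted.
    """
    for utterance in islice(utterances, max_attempts):
        if verify(utterance):
            return True
    return False


# Toy check standing in for steps S102 to S107: only the string "owner" passes.
def passes(utterance: str) -> bool:
    return utterance == "owner"


opened = identify_with_limit(["noise", "owner"], passes)                       # success on 2nd try
locked_out = not identify_with_limit(["a", "b", "c", "owner"], passes)         # 4th try never happens
```

Capping the number of attempts in this way is what prevents acceptance by sheer repetition of voice input.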

An identification scheme that considers only the time interval between characters in the voice uttered by the specific individual may also be used. Such a system has a comparatively relaxed security level and can be realized at low cost.
Although the present invention has been described with reference to the above embodiment, it is not limited to that embodiment and may be improved or modified for the purpose of improvement or within the spirit of the invention.

FIG. 1 is a diagram showing the configuration of the RF-ID card and the information processing device according to the embodiment of the present invention.
FIG. 2 is a diagram showing the configuration of the data extraction storage device according to the embodiment of the present invention.
FIG. 3 is a flowchart showing the flow of personal identification according to the embodiment of the present invention.
FIG. 4 is a block diagram showing the configuration of the voice data extraction means and the collation means used for collation of the voice frequency according to the embodiment of the present invention.
FIG. 5 is a table showing example mean values of the first three frequency peaks of the vowels in the voice frequency data according to the embodiment of the present invention.

Explanation of symbols

10 RF-ID card (storage device)
11 Personal information storage means
12 Unique information storage means
20 Extraction storage device
40 Information processing device
43 Personal information/unique information confirmation means (reading means)
44 Microphone (voice input means)
45 Voice data extraction means
46 Collation means
50 Electric lock at the entrance

Claims

1. A personal identification system comprising:
a portable storage device that stores in advance a plurality of predetermined characters and a first time interval between those characters, obtained from a voice uttered by a specific individual for a predetermined series of characters; and
an information processing apparatus having: reading means for reading the first time interval and the plurality of characters stored in the storage device; voice input means for inputting a voice uttered by a person to be identified; voice data extracting means for extracting, from the plurality of characters in the voice input by the voice input means, a second time interval corresponding to the first time interval; and collation means for collating the second time interval extracted by the voice data extracting means with the first time interval read by the reading means.

2. A personal identification system comprising:
a portable storage device that stores in advance a plurality of predetermined characters, a first time interval between those characters, and a predetermined first frequency, obtained from a voice uttered by a specific individual for a predetermined series of characters; and
an information processing apparatus having: reading means for reading the first time interval, the plurality of characters, and the first frequency stored in the storage device; voice input means for inputting a voice uttered by a person to be identified; voice data extracting means for extracting, from the plurality of characters in the voice input by the voice input means, a second time interval corresponding to the first time interval and a second frequency corresponding to the first frequency; and collation means for collating the second time interval and the second frequency extracted by the voice data extracting means with the first time interval and the first frequency read by the reading means.

3. The personal identification system according to claim 1, wherein the first time interval between the predetermined plurality of characters is a time interval of speech of the first and last characters in the predetermined series of characters.

4. The personal identification system according to claim 2, wherein the first frequency is a highest frequency among predetermined formant frequencies included in the voice uttered by a specific individual for the predetermined series of characters.

5. The personal identification system according to claim 4, wherein the specific individual utters the predetermined series of characters a plurality of times, and the mean and standard deviation of the first time interval or the first frequency over those utterances are stored in the storage device.

6. The personal identification system according to claim 1, wherein the storage device is an RF-ID card, an RF-ID tag, or a contact IC card.

7. The personal identification system according to claim 1, wherein the storage device and the information processing device have means for exchanging information wirelessly.

8. The personal identification system according to claim 1, wherein the storage device includes an extraction storage device that stores the first time interval and the first frequency.

9. A storage device used in the personal identification system according to claim 1, further comprising means for generating drive power by electromagnetic induction.

10. A personal identification method comprising:
storing in advance, in a portable storage device, a first time interval between a plurality of predetermined characters, the plurality of characters, and a predetermined first frequency, obtained from a voice uttered by a specific individual for a predetermined series of characters;
reading the first time interval, the plurality of characters, and the first frequency stored in the storage device;
inputting a voice uttered by a person to be identified;
extracting, from the plurality of characters in the input voice, a second time interval corresponding to the first time interval and a second frequency corresponding to the first frequency; and
collating the extracted second time interval and second frequency with the first time interval and the first frequency that were read.