
WO2015087372A1 - Unidirectional close-talking microphone - Google Patents


Info

Publication number
WO2015087372A1
WO2015087372A1 PCT/JP2013/007335 JP2013007335W WO2015087372A1 WO 2015087372 A1 WO2015087372 A1 WO 2015087372A1 JP 2013007335 W JP2013007335 W JP 2013007335W WO 2015087372 A1 WO2015087372 A1 WO 2015087372A1
Authority
WO
WIPO (PCT)
Prior art keywords
noise
microphone
processing means
data
speech
Prior art date
Application number
PCT/JP2013/007335
Other languages
French (fr)
Japanese (ja)
Inventor
睦朗 古閑
良平 上瀧
泰彦 野村
齊藤 哲也
Original Assignee
救救com株式会社
プロトラスト株式会社
日本エムエムアイテクノロジー株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 救救com株式会社, プロトラスト株式会社, 日本エムエムアイテクノロジー株式会社
Priority to PCT/JP2013/007335
Publication of WO2015087372A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R3/04: Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/14: Throat mountings for microphones
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/20: Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32: Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/326: Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only for microphones

Definitions

  • the present invention relates to a microphone for collecting speech, and in particular to a unidirectional close-talking microphone that can collect sound while removing noise to such an extent that a speaker's voice can be clearly recognized, even when ambient noise is loud.
  • sound is not always collected in a quiet place; a microphone may also be used where noise is loud, such as under an overpass or inside an emergency vehicle.
  • an ordinary directional microphone also picks up noise arriving from the direction of the speaker, so it cannot collect sufficiently clear speech.
  • Japanese Patent Application Laid-Open No. 2011-13403 discloses a technique that detects and removes noise while reducing the number of noise-detection microphones to limit cost and without enlarging the loudspeaker housing.
  • in that technique, a loudspeaker is driven by outputting an acoustic signal, the electromotive force generated in the loudspeaker by ambient noise is detected, and a signal in antiphase with the ambient noise is generated from that electromotive force and added to the acoustic signal to cancel the ambient noise.
  • the signal being canceled is not necessarily an unwanted signal such as noise; in some cases the output of wanted sound may also be reduced, so noise cancellation does not always work as intended.
  • Japanese Patent Application Laid-Open No. 2012-231468 discloses a technique for an audio headset that removes noise from the speech signal uttered by the wearer so that it can be transmitted to a remote listener.
  • a first speech signal is output from non-acoustic voice vibrations transmitted by internal bone conduction
  • a second speech signal is output from the wearer's voice vibrations
  • the first and second speech signals are combined to output a third signal representing the speech uttered by the wearer.
  • the first speech signal is also used in means for calculating a cutoff frequency and the probability that speech is absent.
  • to solve the above problems, the present invention relates to a microphone that collects speech and converts it into an electrical signal, and in particular to a unidirectional close-talking microphone that reduces stress on the user and can collect and convert sound while removing noise to such an extent that the speaker's voice can be clearly recognized, even when ambient noise is loud.
  • a unidirectional close-talking microphone comprises a single microphone that collects sound and converts it into an electrical signal, and a noise processing means that removes noise.
  • the supply current to the drain of the N-channel junction field-effect transistor, which is connected with its source grounded, is set to 90 μA to 100 μA.
  • the unidirectional close-talking microphone is further equipped with: speech extraction means; feature extraction means for extracting features of the speech from which noise has been removed by the speech extraction means and the noise processing means; recognition processing means for recognizing speech by applying acoustic model data delivered from an acoustic model data section to vocabulary data selected by a vocabulary processing section and to grammar data whose applicability has been determined by a grammar processing section; post-recognition processing means for generating character data; and output means for outputting the generated character data.
  • because the present invention is configured as described above, it has the following effects. 1. Because a single microphone is used, the structure does not become overly complicated and is easy to build. Because an optimization circuit is added to the noise processing means, maintenance is easy. Because the supply current to the drain of the source-grounded N-channel junction field-effect transistor is set to 90 μA to 100 μA, noise other than the speaker's voice can be canceled easily and to a sufficient degree, and the audio data can be optimized easily.
  • FIG. 1 is a perspective view of a unidirectional close-talking microphone according to the present invention
  • FIG. 2 is a schematic circuit diagram of the unidirectional close-talking microphone
  • FIG. 3 is a graph showing an optimization area of voice information
  • FIG. 4 is an operation flowchart of the voice recognition system.
  • the unidirectional close-talking microphone according to the present invention comprises a main body 10, a microphone 20, a noise processing means 30, and an earphone 40. It is used with the main body 10 fixed to the speaker's neck or the like, and it is a microphone that can perform noise cancellation with a simple structure.
  • the main body 10 is a main part of a unidirectional close-talking microphone that is fixedly installed on the speaker's neck or the like. As shown in FIG. 1, in this embodiment, the main body 10 has an elastic semicircular shape.
  • the microphone 20 is held integrally at the tip of the main body 10. On the side of the main body 10 opposite the microphone 20, a port 14 is provided for connecting the noise processing means 30 and the earphone 40 described later. A pair of fixing members 12 is provided on the inner side of both ends of the semicircle to clamp the neck or the like.
  • the main body 10 is made of plastic in the present embodiment, but it is not limited to this; any elastic material may be used, including a wide range of resin materials. The overall shape is likewise not limited to a neck-worn form.
  • the microphone 20 is a member for collecting the speaker's voice. As shown in FIG. 1, in this embodiment it is a single microphone with a sound collecting unit 22 at its end. When the speaker speaks toward the sound collecting unit 22, the microphone picks up the voice, converts it into an electrical signal, and transmits that signal to the outside.
  • the microphone 20 is bent in a dogleg shape (like the character く) and is connected to the main body 10 so that it can rotate. Rotating the microphone 20 about its connection to the main body 10 brings the microphone 20 closer to the speaker's mouth when the main body 10 is worn on the neck or the like.
  • the noise processing means 30 is a member that takes in the audio data collected by the microphone 20 and converted into an electrical signal, and removes the noise contained in that data.
  • in this embodiment, as shown in FIG. 1, the noise processing means 30 is detachably attached via the cable 16 to the port 14, which is on the side of the main body 10 opposite where the microphone 20 is connected.
  • the noise processing means 30 is composed of an impedance converter 32 and an optimization circuit 34 as shown in FIG.
  • the impedance converter 32 is a device for transmitting an electrical signal, and includes an N-channel junction field effect transistor (FET) 32a.
  • the optimization circuit 34 is a device for removing noise and optimizing audio data, and is connected to the impedance converter 32 and processes the audio data transmitted from the impedance converter 32.
  • the optimization circuit 34 is composed of a plurality of bridge-connected resistors 34a and 34b.
  • the noise processing by the noise processing means 30 is performed by cooperation of the impedance converter 32 and the optimization circuit 34.
  • the resistance values, current values, and so on of the noise processing means 30 are adjusted so that noise is effectively eliminated by electrical processing from the audio data, which consists of the electrical signal collected and converted by the microphone 20.
  • the supply current to the drain of the source-grounded N-channel junction field-effect transistor 32a is adjusted to fall within the range of 90 μA to 100 μA.
  • the supply current to the drain of the N-channel junction field-effect transistor 32a, normally 500 μA, is lowered to 90 μA by adjusting the potential applied to the gate. According to actual measurements, 90 μA was the optimum value at which sensitivity to noise from distant sources is low.
  • as a result, the signal-to-noise (S/N) ratio is markedly improved: the speaker's voice is output reliably, and noise from distant sources is reliably removed.
  • the supply current to the drain of the N-channel junction field-effect transistor 32a is 90 μA in this embodiment, but it can be given a certain width as shown in FIG. Measurements have shown that the value is preferably in the range of 90 μA to 100 μA.
  • the sound pressure of the voice uttered by the speaker is desirably 100 dB or more, and the microphone 20 should be used as close to the speaker's mouth as possible. Because the initial sound pressure of the voice often exceeds 100 dB, actual measurements have shown that the speaker's voice can be clearly recognized if the speaker's mouth is sufficiently close to the microphone.
  • the earphone 40 is a member for conveying voice information from the outside to the speaker. As shown in FIG. 1, it is detachably attached via a cable 44 to a port 42 provided at the end of the main body 10 opposite where the microphone 20 is connected.
  • a receiving device (not shown) is provided inside the main body 10 so that a speaker can acquire voice data received by the receiving device via the earphone 40.
  • in addition to the noise processing means 30, the unidirectional close-talking microphone further includes a speech extraction means 50, a feature extraction means 60, a recognition processing means 70, a post-recognition processing means 80, and an output means 90.
  • the voice extraction means 50 is a means for further analyzing and extracting a sound recognized as a voice from the data subjected to the noise removal process by the noise processing means 30. This makes it possible to configure basic data for converting voice data into text (character data).
  • the feature extraction unit 60 is a unit for extracting the features of the speech from which noise has been removed by the speech extraction unit 50 and the noise processing unit 30. It distinguishes and identifies each sound from the characteristics of the individual sounds in the voice, which completes the preparation for text conversion.
  • the data processed by the feature extraction means 60 is transmitted to the recognition processing means 70.
  • the recognition processing means 70 is a means for recognizing speech by applying acoustic model data to vocabulary data and grammar data.
  • the vocabulary data is selected by the vocabulary processing unit 72 that analyzes the speech data and performs vocabulary selection processing, and the grammar data is processed by the grammar processing unit 74 that performs appropriate grammar application determination processing from each selected vocabulary.
  • the data processed by the recognition processing means 70 is transmitted to the post-recognition processing means 80.
  • the post-recognition processing unit 80 is a unit that generates character data. That is, the speech data recognized correctly by the recognition processing means 70 is analyzed to convert the speech into text.
  • the character data generated by the post-recognition processing unit 80 is transmitted to the output unit 90.
  • the output unit 90 performs processing for outputting the generated character data to an external PC or storage medium. This makes it possible to use the voice data that has been converted into text for any purpose.
  • because the voice data optimized by the noise processing means 30 is used for text conversion, more accurate conversion of voice into character data can be realized. That is, even in a place with loud noise such as the inside of an emergency vehicle, the voice uttered by the speaker can be conveyed accurately and then converted accurately into text (character data).

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

[Problem] To provide a unidirectional close-talking microphone for collecting audio in a manner that reduces stress for a user by removing noise when collecting audio to such an extent that a speaker's voice can be clearly recognized, in particular, even if there is a considerable amount of unwanted noise. [Solution] Provided is a unidirectional close-talking microphone consisting of a single microphone and a noise processing means. The noise processing means consists of an impedance converter consisting of an N-channel junction field effect transistor, and an optimizer circuit for optimizing audio data. An electrical current supplied to a drain of the N-channel junction field effect transistor consisting of a grounded source is 90 to 100 μA.

Description

Unidirectional close-talking microphone
The present invention relates to a microphone for collecting speech, and in particular to a unidirectional close-talking microphone that can collect sound while removing noise to such an extent that the speaker's voice can be clearly recognized, even when ambient noise is loud.
Many microphones that collect sound and convert it into an electrical signal already exist, and microphones with various functions and characteristics have been developed and used for different applications. In particular, many microphones have been developed that remove noise so that only the speaker's clear voice is collected.
Many microphones have a structure or mechanism that increases directivity so that the speaker's voice can be picked up precisely, which allows the voice to be collected reliably. However, sound is not always collected in a quiet place; use in places with loud noise, such as under an overpass or inside an emergency vehicle, must also be expected. In such cases an ordinary directional microphone also picks up noise arriving from the direction of the speaker, so it cannot be said to collect clear speech adequately.
As a technique for solving this problem, JP 2011-13403 A, for example, discloses detecting and removing noise while reducing the number of noise-detection microphones to limit cost and without enlarging the loudspeaker housing. In that technique, a loudspeaker is driven by outputting an acoustic signal, the electromotive force generated in the loudspeaker by ambient noise is detected, and a signal in antiphase with the ambient noise is generated from that electromotive force and added to the acoustic signal, thereby canceling the ambient noise.
This technique can cancel noise by generating a cancellation signal that corresponds to the electromotive force generated in the loudspeaker. However, the signal being canceled is not necessarily an unwanted signal such as noise; in some cases the output of wanted sound may also be reduced, so noise cancellation does not always work as intended.
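The antiphase addition described for JP 2011-13403 A, and the drawback noted for it, can be illustrated with a minimal sketch. It is not taken from either publication: the sample rate, tone frequencies, amplitudes, the 30 % leakage figure, and all function names are hypothetical values chosen only for illustration.

```python
import math

RATE = 8000  # samples per second (hypothetical)

def tone(freq_hz, amplitude, n=RATE):
    """One second of a sine tone, as a list of samples."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / RATE) for i in range(n)]

def add_antiphase(signal, estimate):
    """Add the phase-inverted estimate to the signal, sample by sample."""
    return [s + (-e) for s, e in zip(signal, estimate)]

def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))

wanted = tone(200, 1.0)    # stands in for the wanted sound
ambient = tone(50, 0.5)    # stands in for ambient noise
mixed = [w + a for w, a in zip(wanted, ambient)]

# Ideal case: the estimate contains only the ambient noise.
ideal = add_antiphase(mixed, ambient)
residual_ideal = rms([o - w for o, w in zip(ideal, wanted)])

# Leaky case (assumed): 30 % of the wanted sound leaks into the estimate,
# so the wanted sound itself is attenuated by the cancellation.
leaky_estimate = [a + 0.3 * w for a, w in zip(ambient, wanted)]
leaky = add_antiphase(mixed, leaky_estimate)
wanted_gain_leaky = rms(leaky) / rms(wanted)
```

In the ideal case the residual is at floating-point rounding level, while in the leaky case the wanted sound comes out at about 0.7 of its original level, which is the kind of unintended reduction the paragraph above describes.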
JP 2012-231468 A discloses a technique for an audio headset that removes noise from the speech signal uttered by the wearer so that it can be transmitted to a remote listener. A first speech signal is output from non-acoustic voice vibrations transmitted by internal bone conduction, a second speech signal is output from the wearer's voice vibrations, and the first and second speech signals are combined to output a third signal representing the speech uttered by the wearer. The first speech signal is also used in means for calculating a cutoff frequency and the probability that speech is absent.
This technique can extract the speaker's speech signal and is thought to make noise removal possible, but the structure of the device becomes complicated and difficult to build. In addition, because the device must be held in close contact with the speaker, using it places more stress on the speaker.
If speech can be collected without picking up noise even in places with loud noise, accurate communication becomes possible in any environment. In particular, in places with a high noise level, such as inside an ambulance, the risk of conveying incorrect information can be reduced. The accuracy of text conversion, in which speech is converted into character data, also improves.
There has therefore been a demand for a unidirectional close-talking microphone with a high-precision noise-canceling function whose structure is not overly complicated, that is easy to use, and that enables accurate communication in any environment.
JP 2011-13403 A; JP 2012-231468 A
To solve the above problems, the present invention relates to a microphone that collects speech and converts it into an electrical signal, and in particular to a unidirectional close-talking microphone that reduces stress on the user and can collect and convert sound while removing noise to such an extent that the speaker's voice can be clearly recognized, even when ambient noise is loud.
To achieve this object, the unidirectional close-talking microphone according to the present invention comprises a single microphone that collects speech and converts it into an electrical signal, and a noise processing means that removes noise. The noise processing means is composed of an impedance converter, which consists of an N-channel junction field-effect transistor that transmits the electrical signal, and an optimization circuit that removes noise and optimizes the audio data. To eliminate ambient noise effectively, the supply current to the drain of the N-channel junction field-effect transistor, which is connected with its source grounded, is set to 90 μA to 100 μA.
The unidirectional close-talking microphone is further equipped with: speech extraction means; feature extraction means for extracting features of the speech from which noise has been removed by the speech extraction means and the noise processing means; recognition processing means for recognizing speech by applying acoustic model data delivered from an acoustic model data section to vocabulary data selected by a vocabulary processing section and to grammar data whose applicability has been determined by a grammar processing section; post-recognition processing means for generating character data; and output means for outputting the generated character data.
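The order in which these means hand data to one another can be sketched as a chain of functions. The sketch shows only the data flow; every stage body is a toy placeholder, and the function names, the threshold, the frame data, and the lexicon are hypothetical and do not come from the disclosure. No real acoustic model, vocabulary data, or grammar data is involved.

```python
from typing import Callable, Dict, List, Sequence

Frame = List[float]

def extract_speech(frames: Sequence[Frame], threshold: float = 0.1) -> List[Frame]:
    """Speech extraction stand-in: keep frames whose mean absolute level exceeds a threshold."""
    return [f for f in frames if sum(abs(s) for s in f) / len(f) > threshold]

def extract_features(frames: Sequence[Frame]) -> List[int]:
    """Feature extraction stand-in: one integer feature per frame (peak level in tenths)."""
    return [int(round(10 * max(abs(s) for s in f))) for f in frames]

def recognize(features: Sequence[int], lexicon: Dict[int, str]) -> List[str]:
    """Recognition stand-in: look each feature up in a toy lexicon, skipping unknowns."""
    return [lexicon[x] for x in features if x in lexicon]

def post_process(words: Sequence[str]) -> str:
    """Post-recognition stand-in: join the recognized words into character data."""
    return " ".join(words)

def output(text: str, sink: Callable[[str], None]) -> str:
    """Output stand-in: hand the character data to an external sink and return it."""
    sink(text)
    return text

def transcribe(frames: Sequence[Frame], lexicon: Dict[int, str],
               sink: Callable[[str], None] = print) -> str:
    """Run the stages in the order given in the text."""
    speech = extract_speech(frames)
    features = extract_features(speech)
    words = recognize(features, lexicon)
    return output(post_process(words), sink)

# Hypothetical toy data: two quiet frames that are dropped, two louder frames that are kept.
toy_lexicon = {5: "pulse", 9: "stable"}
toy_frames = [[0.0, 0.01], [0.5, -0.4], [0.02, 0.0], [0.9, -0.8]]
```

Calling `transcribe(toy_frames, toy_lexicon)` passes the two louder frames through each stage in turn and prints the joined text; a real system would replace each placeholder with its own processing.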
Because the present invention is configured as described above, it has the following effects.
1. Because a single microphone is used, the structure does not become overly complicated and is easy to build. Because an optimization circuit is added to the noise processing means, maintenance is easy. Because the supply current to the drain of the source-grounded N-channel junction field-effect transistor is set to 90 μA to 100 μA, noise other than the speaker's voice can be canceled easily and to a sufficient degree, and the audio data can be optimized easily.
2. Because speech features are extracted after the speech is extracted, and grammar data and acoustic model data are applied to that speech information to recognize the speech and generate character data, the speaker's voice can be converted into usable character data even in places with loud noise.
The unidirectional close-talking microphone according to the present invention is described in detail below with reference to the embodiment shown in the drawings. FIG. 1 is a perspective view of the unidirectional close-talking microphone according to the present invention, and FIG. 2 is a schematic circuit diagram of the microphone. FIG. 3 is a graph showing the optimization region of the speech information, and FIG. 4 is an operation flowchart of the speech recognition system.
The unidirectional close-talking microphone of the present invention comprises a main body 10, a microphone 20, a noise processing means 30, and an earphone 40. It is used with the main body 10 fixed to the speaker's neck or the like, and it is a microphone that can perform noise cancellation with a simple structure.
The main body 10 is the body of the unidirectional close-talking microphone and is fixed to the speaker's neck or the like. As shown in FIG. 1, in this embodiment it has an elastic semicircular shape, and the microphone 20 is held integrally at the tip of the main body 10. On the side of the main body 10 opposite the microphone 20, a port 14 is provided for connecting the noise processing means 30 and the earphone 40 described later. A pair of fixing members 12 is provided on the inner side of both ends of the semicircle to clamp the neck or the like. In this embodiment the main body 10 is made of plastic, but it is not limited to this; any elastic material may be used, including a wide range of resin materials. The overall shape is likewise not limited to a neck-worn form.
The microphone 20 is a member for collecting the speaker's voice. As shown in FIG. 1, in this embodiment it is a single microphone with a sound collecting unit 22 at its end. When the speaker speaks toward the sound collecting unit 22, the microphone picks up the voice, converts it into an electrical signal, and transmits that signal to the outside. The microphone 20 is bent in a dogleg shape (like the character く) and is connected to the main body 10 so that it can rotate. Rotating the microphone 20 about its connection to the main body 10 brings the microphone 20 closer to the speaker's mouth when the main body 10 is worn on the neck or the like.
The noise processing means 30 is a member that takes in the audio data collected by the microphone 20 and converted into an electrical signal, and removes the noise contained in that data. In this embodiment, as shown in FIG. 1, the noise processing means 30 is detachably attached via the cable 16 to the port 14, which is on the side of the main body 10 opposite where the microphone 20 is connected.
As shown in FIG. 2, the noise processing means 30 is composed of an impedance converter 32 and an optimization circuit 34. The impedance converter 32 is a component for transmitting the electrical signal and consists of an N-channel junction field-effect transistor (FET) 32a.
The optimization circuit 34 is a component for removing noise and optimizing the audio data. It is connected to the impedance converter 32 and processes the audio data transmitted from it. In this embodiment the optimization circuit 34 is composed of a plurality of bridge-connected resistors 34a and 34b.
Noise processing by the noise processing means 30 is performed by the impedance converter 32 and the optimization circuit 34 working together. That is, the resistance values, current values, and so on of the noise processing means 30 are adjusted so that noise is effectively eliminated by electrical processing from the audio data, which consists of the electrical signal collected and converted by the microphone 20. The supply current to the drain of the source-grounded N-channel junction field-effect transistor 32a is adjusted to fall within the range of 90 μA to 100 μA.
Some prior-art devices perform noise removal on collected sound by providing two microphone units and connecting them in antiphase to perform cancellation. With this arrangement, the difference between the distances from the point of utterance to each microphone is large for the speaker's voice, so an output corresponding to the phase difference appears at the output of the microphone units. Ambient noise, on the other hand, is nearly in phase at the two microphones because the difference between the distances from the noise source to each microphone is small, and the antiphase connection makes it easy for the two signals to cancel each other.
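The near/far behaviour of such a two-unit antiphase connection can be illustrated with a simple free-field model. The sketch below is an assumption-laden illustration, not the prior-art circuit itself: it treats the source as an on-axis point source whose pressure falls off as 1/r, and the speed of sound, the 2 cm unit spacing, the 200 Hz test frequency, and the source distances are all hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def pressure(distance_m: float, freq_hz: float) -> complex:
    """Complex pressure of a point source at a given distance (1/r spreading plus phase delay)."""
    k = 2 * math.pi * freq_hz / SPEED_OF_SOUND
    return cmath.exp(-1j * k * distance_m) / distance_m

def differential_level_db(source_distance_m: float, spacing_m: float, freq_hz: float) -> float:
    """Level of (front unit minus rear unit) relative to the front unit alone, in dB."""
    front = pressure(source_distance_m, freq_hz)
    rear = pressure(source_distance_m + spacing_m, freq_hz)
    return 20 * math.log10(abs(front - rear) / abs(front))

# Hypothetical geometry: units 2 cm apart, mouth at 2 cm, noise source at 2 m.
near_db = differential_level_db(0.02, 0.02, 200.0)
far_db = differential_level_db(2.0, 0.02, 200.0)
```

Under these assumptions `near_db` is much closer to 0 dB than `far_db`, that is, the close voice largely survives the subtraction while the distant source is strongly attenuated. The gap narrows at higher frequencies, where the phase difference across the spacing is no longer small.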
However, even under ideal conditions in which the phases match and the noise cancels, this method cancels at most about 20 dB of sound pressure, which cannot be called sufficient noise removal. In particular, in very noisy places such as under an elevated structure or inside an emergency vehicle, the noise removal processing often makes the voice itself difficult to recognize, so noise removal good enough for practical speech recognition was not possible.
In the configuration of the noise processing means 30 of this embodiment, the supply current to the drain of the N-channel junction field-effect transistor 32a, normally 500 μA, is lowered to 90 μA by adjusting the potential applied to the gate. Measurements showed 90 μA to be the optimum value for making the microphone insensitive to noise from distant sources. As the graph in FIG. 3 shows, operating the noise processing means 30 with the drain supply current adjusted to this value removes ambient noise effectively.
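How a gate potential maps to a drain current can be sketched with the standard square-law (Shockley) approximation for a JFET in saturation, I_D = I_DSS (1 - V_GS / V_P)^2. The application gives no device parameters, so treating the normal 500 μA as I_DSS and using a pinch-off voltage of -1.0 V are assumptions made purely for illustration, as are the function names.

```python
import math


def drain_current(v_gs: float, i_dss: float, v_p: float) -> float:
    """Square-law drain current of an N-channel JFET in saturation.

    v_gs and v_p are in volts (v_p < 0 for an N-channel device),
    i_dss is the drain current at v_gs = 0, in amperes.
    """
    return i_dss * (1.0 - v_gs / v_p) ** 2


def gate_source_voltage(i_d: float, i_dss: float, v_p: float) -> float:
    """Gate-source voltage that yields drain current i_d (inverse of the above)."""
    if not 0.0 < i_d <= i_dss:
        raise ValueError("i_d must satisfy 0 < i_d <= i_dss")
    return v_p * (1.0 - math.sqrt(i_d / i_dss))


# Hypothetical device: I_DSS = 500 uA, V_P = -1.0 V (not from the application).
for target in (90e-6, 100e-6):
    v_gs = gate_source_voltage(target, i_dss=500e-6, v_p=-1.0)
    print(f"I_D = {target * 1e6:.0f} uA  ->  V_GS = {v_gs:.3f} V")
```

A real design would take I_DSS and V_P from the transistor's datasheet and account for their spread between devices. The sketch only shows why a modest change in gate bias is enough to move the drain current between 500 μA and the 90 μA to 100 μA range.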
In addition, as shown in FIG. 3, this adjustment lowers the gain from the microphone 20 by about 30 dB compared with normal operation, so the audio signal is not distorted up to about 140 dB SPL. As a result, the signal-to-noise ratio (S/N ratio) improves markedly: the speaker's voice is output reliably, while noise from distant sources is reliably removed.
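To put the decibel figures in the text into linear terms, the following sketch converts a level change in dB into an amplitude ratio using the standard relation ratio = 10^(dB/20). The helper name `db_to_amplitude_ratio` is hypothetical; only the 20 dB and 30 dB figures come from the description.

```python
def db_to_amplitude_ratio(level_db: float) -> float:
    """Convert a level change in dB to a linear amplitude (pressure or voltage) ratio."""
    return 10.0 ** (level_db / 20.0)


# About 20 dB of cancellation (prior art) and about 30 dB of gain reduction (this embodiment).
for change_db in (-20.0, -30.0):
    ratio = db_to_amplitude_ratio(change_db)
    print(f"{change_db:+.0f} dB  ->  amplitude x {ratio:.4f}")
```

A 30 dB reduction therefore scales the signal amplitude to roughly 3% of its normal value, which is consistent with the description's statement that the signal stays undistorted up to a much higher sound pressure level.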
Although the supply current to the drain of the N-channel junction field-effect transistor 32a is 90 μA in this embodiment, some latitude is possible, as shown in FIG. 3, and measurements showed that a drain supply current in the range of 90 μA to 100 μA is desirable. The sound pressure of the speaker's voice is desirably 100 dB or more, so the microphone 20 should be used with the speaker's mouth as close to it as possible. Because the initial sound pressure of speech often exceeds 100 dB, measurements have confirmed that the voice is recognized clearly provided the speaker's mouth is sufficiently close to the microphone.
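Why the mouth should be kept close to the microphone can be illustrated with the inverse-distance law for a point source in free field, under which the level falls by 20·log10(r / r_ref) dB. This model and the reference distance of 1 cm are assumptions for illustration. The application states only that the voice should be at 100 dB or more at the microphone.

```python
import math


def spl_at_distance(spl_ref_db: float, r_ref_m: float, r_m: float) -> float:
    """Sound pressure level at distance r_m, given the level at r_ref_m.

    Assumes a point source radiating into free field (inverse-distance law).
    """
    if r_ref_m <= 0.0 or r_m <= 0.0:
        raise ValueError("distances must be positive")
    return spl_ref_db - 20.0 * math.log10(r_m / r_ref_m)


# Hypothetical reference: 100 dB at 1 cm from the mouth.
for distance_m in (0.01, 0.05, 0.10, 1.0):
    level = spl_at_distance(100.0, r_ref_m=0.01, r_m=distance_m)
    print(f"{distance_m * 100:5.0f} cm: {level:5.1f} dB")
```

Under this simplified model every tenfold increase in distance costs 20 dB, so a voice that reaches 100 dB at the microphone when spoken close up falls well below that level at arm's length, while distant noise sources change little over the same few centimetres.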
The earphone 40 conveys audio information from outside to the speaker. As shown in FIG. 1, it is detachably connected via a cable 44 to a port 42 provided at the end of the main body 10 opposite the end to which the microphone 20 is connected. When the unidirectional close-talking microphone according to the present invention is used, the speaker must be able to receive information such as instructions from outside easily, even in very noisy places. In this embodiment, a receiving device (not shown) is provided inside the main body 10, and the speaker can hear the audio data received by the receiving device through the earphone 40.
As shown in FIG. 4, the unidirectional close-talking microphone according to the present invention further includes, in addition to the noise processing means 30, speech extraction means 50, feature extraction means 60, recognition processing means 70, post-recognition processing means 80, and output means 90.
The speech extraction means 50 further analyzes the data from which the noise processing means 30 has removed noise and extracts the sounds recognized as speech. This makes it possible to assemble the basic data for converting the audio data into text (character data). The feature extraction means 60 extracts the features of the speech from which noise has been removed by the speech extraction means 50 and the noise processing means 30. With this means, each individual sound is identified from its features, which completes the preparation for conversion into text.
The data processed by the feature extraction means 60 is passed to the recognition processing means 70. As shown in FIG. 4, the recognition processing means 70 recognizes speech by applying acoustic model data to vocabulary data and grammar data. The vocabulary data is selected by the vocabulary processing unit 72, which analyzes the audio data and performs vocabulary selection. The grammar data is determined by the grammar processing unit 74, which decides on the appropriate grammar to apply to the selected vocabulary. Applying the acoustic model delivered from the acoustic model data unit 76 to these makes it possible to recognize the content of the speech accurately.
The data processed by the recognition processing means 70 is passed to the post-recognition processing means 80, which generates character data. That is, it analyzes the audio data accurately recognized by the recognition processing means 70 and converts the speech into text. The character data generated by the post-recognition processing means 80 is passed to the output means 90, which outputs the generated character data to an external PC or storage medium. The audio data converted into text can then be used for any purpose.
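The order in which the means 50 to 90 hand data to one another can be sketched as a chain of functions. The sketch shows only the stage ordering described above. Every function body is a toy stand-in (energy thresholds instead of real speech extraction, acoustic models, vocabulary, or grammar), and all names and thresholds are hypothetical rather than taken from the application.

```python
from typing import List

Frame = List[float]  # one short block of audio samples (assumed non-empty)


def mean_energy(frame: Frame) -> float:
    """Average squared sample value of one frame."""
    return sum(x * x for x in frame) / len(frame)


def extract_speech(frames: List[Frame], threshold: float = 0.01) -> List[Frame]:
    """Stand-in for speech extraction means 50: keep frames above an energy threshold."""
    return [f for f in frames if mean_energy(f) > threshold]


def extract_features(frames: List[Frame]) -> List[float]:
    """Stand-in for feature extraction means 60: one feature (mean energy) per frame."""
    return [mean_energy(f) for f in frames]


def recognize(features: List[float]) -> List[str]:
    """Stand-in for recognition processing means 70: map each feature to a toy label."""
    return ["loud" if e > 0.25 else "soft" for e in features]


def to_text(words: List[str]) -> str:
    """Stand-in for post-recognition processing means 80: build the character data."""
    return " ".join(words)


def transcribe(frames: List[Frame]) -> str:
    """Run the stages in the order 50 -> 60 -> 70 -> 80; the caller acts as output means 90."""
    return to_text(recognize(extract_features(extract_speech(frames))))


print(transcribe([[0.0, 0.0], [0.2, 0.2], [1.0, 1.0]]))
```

Keeping each stage as a separate function mirrors the separation into distinct means in the description, so that any one stage could be replaced by a real implementation without changing the others.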
Because the conversion into text uses the audio data optimized by the noise processing means 30, speech can be converted into character data more accurately. That is, even in an extremely noisy place such as the inside of an emergency vehicle, the voice uttered by the speaker can be conveyed accurately and, further, converted accurately into text (character data).
FIG. 1 is a perspective view of the unidirectional close-talking microphone according to the present invention.
FIG. 2 is a schematic circuit diagram of the unidirectional close-talking microphone.
FIG. 3 is a graph showing the optimization region of the audio information.
FIG. 4 is an operation flow diagram of the speech recognition system.
Description of Symbols
10 Main body
12 Fixing part
14 Port
16 Cable
20 Microphone
22 Sound collecting part
30 Noise processing means
32 Impedance converter
32a N-channel junction field-effect transistor
34 Optimization circuit
34a Resistor
40 Earphone
42 Port
44 Cable
50 Speech extraction means
60 Feature extraction means
70 Recognition processing means
80 Post-recognition processing means
90 Output means

Claims (2)

1. A unidirectional close-talking microphone comprising a single microphone that collects sound and converts it into an electrical signal, and noise processing means for removing noise, wherein the noise processing means comprises an impedance converter consisting of an N-channel junction field-effect transistor that transmits the electrical signal, and an optimization circuit that removes noise and optimizes audio data, and wherein, in order to effectively eliminate ambient noise, the supply current to the drain of the common-source N-channel junction field-effect transistor is set to 90 μA to 100 μA.
2. The unidirectional close-talking microphone according to claim 1, further comprising: speech extraction means; feature extraction means for extracting features of the speech from which noise has been removed by the speech extraction means and the noise processing means; recognition processing means for recognizing speech by applying acoustic model data delivered from an acoustic model data unit to vocabulary data selected in a vocabulary processing unit and grammar data whose application has been determined in a grammar processing unit; post-recognition processing means for generating character data; and output means for outputting the generated character data.
PCT/JP2013/007335 2013-12-12 2013-12-12 Unidirectional close-talking microphone WO2015087372A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2013/007335 WO2015087372A1 (en) 2013-12-12 2013-12-12 Unidirectional close-talking microphone

Publications (1)

Publication Number Publication Date
WO2015087372A1 true WO2015087372A1 (en) 2015-06-18

Family

ID=53370721

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/007335 WO2015087372A1 (en) 2013-12-12 2013-12-12 Unidirectional close-talking microphone

Country Status (1)

Country Link
WO (1) WO2015087372A1 (en)

Legal Events

Code Title (Description)
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13899343; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 13899343; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)