CN1326112C - Voice recognition device and integrated circuit realizing method - Google Patents

Voice recognition device and integrated circuit realizing method

Info

Publication number
CN1326112C
Authority
CN
China
Prior art keywords
template
processing unit
port
test
control module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2005100337656A
Other languages
Chinese (zh)
Other versions
CN1664926A (en
Inventor
贺前华
李韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CNB2005100337656A
Publication of CN1664926A
Application granted
Publication of CN1326112C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Image Analysis (AREA)
  • Complex Calculations (AREA)

Abstract

The invention provides a speech recognition device comprising an embedded processor and a dynamic time warping (DTW) algorithm module. The DTW algorithm module is connected to the embedded processor through a control bus. The embedded processor is connected to a program memory, a data memory, and a display, and to a microphone through an analog-to-digital conversion module; it is also provided with keys and an RS232 interface. The DTW algorithm module comprises a calculation control module, a processing unit array, and a template buffer. The calculation control module generates a "movement direction", and the reference template feature parameters and the test template feature parameters are fed into the processing unit array from two opposite directions, so that each processing unit can still match the two templates correctly when the lengths of the reference template and the test template vary, while the hardware resources and the complexity of the input interface control logic are greatly reduced.

Figure 200510033765

Description

A speech recognition device and its integrated circuit implementation method

Technical field

The invention relates to the fields of speech recognition and integrated circuit design, and in particular to a speech recognition device and a method of implementing it as an integrated circuit.

Background

At present, speech recognition algorithms are generally implemented in one of two ways: in software or as an integrated circuit. A software implementation is either a program written for a general-purpose computer or embedded software. A program on a general-purpose computer has ample resources available, and the algorithm is easy to optimize and upgrade, but a computer is inconvenient to carry and cannot be used on the move. Embedded software suffers from limited resources, and processor speed becomes a bottleneck. An integrated circuit implementation realizes the algorithm in logic circuits. It is fast enough for real-time operation, and the resulting recognition device is small, light, and portable; it can also be built as a module into other intelligent devices that need human-computer interaction, such as voice-dialing mobile phones and telephones.

The Dynamic Time Warping (DTW) algorithm is one of the commonly used speech pattern recognition algorithms and is mainly applied to small- and medium-vocabulary speech recognition.
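For orientation, the following is a minimal software sketch of the textbook DTW recurrence, not the circuit described in this patent. The Euclidean local distance, the three-neighbor step pattern with unit weights, and the name `dtw_distance` are assumptions made for illustration; the patent does not specify them.

```python
import math


def dtw_distance(reference, test):
    """Accumulated DTW distance between two sequences of feature vectors.

    Assumptions (not from the patent): Euclidean local distance and the
    symmetric step pattern (i-1, j), (i, j-1), (i-1, j-1) with unit weights.
    """
    n, m = len(reference), len(test)
    # acc[i][j]: best accumulated cost aligning reference[:i] with test[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(reference[i - 1], test[j - 1])
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

Because the recurrence only requires each sequence to be non-empty, the two templates may have different numbers of frames, which is the property the background discussion below is concerned with.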

Integrated circuit implementations of speech recognition algorithms usually take one of the following two forms:

(1) A microcontroller plus a single processing unit. A microcontroller controls the data input, computation, and output of the processing unit, which combines the computational strength of the hardware processing unit with the control strength of the microcontroller. Recognition is considerably faster than with a pure software implementation on a general-purpose processor, but because the microcontroller's processing speed is limited, this structure still struggles to meet real-time requirements.

(2) A processing unit array. This structure is orders of magnitude faster than implementations that do not use a processing array. In terms of processor array classification, almost all processing unit arrays currently used to implement the DTW algorithm are systolic arrays. A systolic array is usually suitable only for problems whose computation size is fixed. Existing systolic array implementations fix the number of frames in the reference template and in the test template, and the two frame counts must be equal. In practice, however, the pronunciation of each character or word has a different duration, and even the same word spoken by the same person at different times differs in duration. Fixing the frame counts therefore not only increases the computation and complexity of feature extraction but, more importantly, increases the distortion of the speech feature representation, which hampers subsequent recognition. As a result, speech recognition implemented with a processing unit array is not very general.

Embedding the speech recognition algorithm in an integrated circuit chip not only makes applications more convenient but also greatly broadens the range of uses for speech recognition. Typical speech recognition chips used in existing devices internationally include the RSC speech recognition chip from Sensory and the Hello IC speech recognition chip from Philips. The RSC chip can recognize 16 preset words; with an external SRAM (data memory) and speaker-dependent operation, the vocabulary can be expanded to about 100 words after training. The drawback of these chips is that they are all based on an 8-bit MCU (Micro Controller Unit) plus one speech processing unit, so the number of recognizable commands is very limited and the requirements of most practical applications are hard to meet.

Integrated circuit design in China has developed substantially only in recent years and still lags behind. Research on the design of dedicated speech recognition chips is rarely reported, and no dedicated speech recognition chip has been reported so far.

Summary of the invention

The purpose of the invention is to overcome the drawbacks and problems of existing speech recognition implementations and to provide a general, practical, parameter-configurable speech recognition device that uses the dynamic time warping algorithm.

A further purpose of the invention is to provide an integrated circuit implementation method for the above DTW algorithm.

These purposes are achieved by the following technical solution. The speech recognition device comprises an embedded processor and a DTW algorithm module. The DTW algorithm module is connected to the embedded processor through a control bus. The embedded processor is connected to a program memory, a data memory, and a display, and to a microphone through an analog-to-digital conversion module; it is also provided with keys and an RS232 interface. The DTW algorithm module comprises a calculation control module, a processing unit array, and a template buffer, and is provided with a system clock port, a reset signal port, a control signal port, a reference template valid port, a test template valid port, a feature data port, a system busy flag port, an output data valid port, and a matching result output port. The processing unit array is a one-dimensional linear array whose processing units are connected in sequence. The system clock port and the reset signal port are connected to every processing unit, and the last processing unit is connected to the matching result output port. The calculation control module is connected to the first and last processing units and to the template buffer, the system clock port, the reset signal port, the control signal port, the reference template valid port, the test template valid port, the system busy flag port, and the output data valid port. The template buffer is connected to the first and last processing units and to the system clock port, the reset signal port, and the feature data port.

The template buffer comprises a test template buffer and a reference template buffer.

The control bus is connected to the system clock port, the reset signal port, the control signal port, the reference template valid port, the test template valid port, the feature data port, the system busy flag port, the output data valid port, and the matching result output port. The control signal port is connected to three control signal lines of the control bus, which can represent 8 different control states, allowing the embedded processor to control the DTW algorithm module.
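The figure of 8 control states follows directly from three binary lines (2^3 = 8). The short sketch below simply enumerates the bit patterns; it does not assign commands to them, because the patent does not list the encoding. The names `CONTROL_LINES` and `control_states` are illustrative only.

```python
from itertools import product

CONTROL_LINES = 3  # the text states three control signal lines


def control_states(lines=CONTROL_LINES):
    """Enumerate every bit pattern that the given number of lines can carry."""
    return ["".join(bits) for bits in product("01", repeat=lines)]


states = control_states()
print(len(states), states)
```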

The integrated circuit implementation method of the above speech recognition device using the DTW algorithm comprises the following steps:

First, when the calculation control module detects a reset-template-buffer signal on the control bus, it resets the template buffer and waits for data input.

Second, when the calculation control module detects that the test template valid port or the reference template valid port is high, it outputs the corresponding address and control signals and stores the test template feature parameters or reference template feature parameters present on the feature data port in the template buffer, completing the data input.

Third, when the calculation control module detects a start-matching command on the control bus, it outputs the corresponding address and control signals to the template buffer, reads out the reference template feature parameters and the test template feature parameters, and sends them to the first processing unit and the last processing unit respectively. At the same time it calculates the movement direction for the next time step and sends it to the first processing unit, and it drives the system busy flag port high to indicate that the system is working.

Fourth, according to the movement direction, the calculation control module reads the required reference template and test template data from the template buffer in turn and sends them to the processing units for processing.

Fifth, when processing is complete, the last processing unit outputs the weighted matching distance between the test template and the reference template through the matching result output port. The calculation control module then outputs a positive pulse on the output data valid port and drives the system busy flag port low.

Sixth, on receiving the positive pulse from the output data valid port, the embedded processor reads the matching result into the data memory and buffers it, then repeats the second to fifth steps for the next match, and so on until all reference templates have been matched.

The movement direction is determined by the calculation control module and gives the search direction of the first processing unit at the next matching time step. Exploiting the parallelism of the search paths of the multiple processing units, the movement direction signal flows through the systolic array together with the reference template features, which allows templates of unequal length to be matched.

In the third step, the reference template feature parameters are passed down in sequence starting from the first processing unit, while the test template feature parameters are input at the last processing unit and flow in the direction opposite to the reference template parameters, so that the reference and test template parameters move toward each other from two opposite directions.

The working principle of the speech recognition device based on the DTW algorithm is as follows:

(1) Generating reference templates. The training function is selected with the keys, and the following steps are performed:

1. The speaker's voice is converted into an electrical signal by the microphone and then into a digital speech signal by the analog-to-digital conversion module;

2. The embedded processor preprocesses the digital speech signal and then extracts LPC (linear predictive coding) feature parameters as the reference template feature parameters;

3. The reference template feature parameters are saved in the program memory, forming a reference template for use during recognition.
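The patent does not say how the LPC feature parameters are computed, nor which order, frame length, or preprocessing is used. As a hedged illustration of step 2, the sketch below uses the common autocorrelation method with the Levinson-Durbin recursion on a single frame; the function names and the choice of method are assumptions.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Prediction coefficients a[1..order] such that x[t] ~ sum(a[k] * x[t-k]).

    Assumed method: autocorrelation + Levinson-Durbin (not stated in the patent).
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        if err <= 0:
            break  # silent or perfectly predictable frame: stop early
        # reflection coefficient for order i
        k_i = (r[i] - sum(a[k] * r[i - k] for k in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= 1.0 - k_i * k_i
    return a[1:]
```

Applying `lpc` to each frame of an utterance yields one feature vector per frame, so an utterance of T frames produces a template of T vectors whose length varies with how long the word was spoken.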

(2) Speech recognition. The recognition function is selected with the keys, and the following steps are performed:

1. The speaker's voice is converted into an electrical signal by the microphone and then into a digital speech signal by the analog-to-digital conversion module;

2. The embedded processor preprocesses the digital speech signal, extracts LPC feature parameters as the test template feature parameters, and saves them in the data memory, forming the test template;

3. The embedded processor reads the test template feature parameters from the data memory and the reference template feature parameters from the program memory, and sends them to the test template buffer and the reference template buffer, respectively, of the template buffer in the DTW algorithm module;

4. The embedded processor sends the start-matching command, and the DTW algorithm module matches the test template against the reference template. After obtaining the result, the embedded processor resets the reference template buffer of the DTW algorithm module, fetches the next reference template's parameters from the program memory, places them in the reference template buffer, and starts the next match, and so on until all reference templates have been matched;

5. The embedded processor makes the recognition decision from the weighted matching distances between the test template and all reference templates, obtains the recognition result, and shows it on the display, completing the speech recognition process.
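The text does not spell out the decision rule used in step 5. A natural reading, assumed here, is a nearest-template rule: the word whose reference template has the smallest weighted matching distance is reported. The function name `recognize` and the dictionary layout are illustrative.

```python
def recognize(distances):
    """Return the label whose reference template is closest to the test template.

    `distances` maps a word label to its matching distance. The minimum-distance
    rule is an assumption; the patent only says a decision is made from the distances.
    """
    if not distances:
        raise ValueError("no reference templates were matched")
    return min(distances, key=distances.get)


print(recognize({"open": 12.5, "close": 3.0, "stop": 7.1}))
```

A practical system would probably also apply a rejection threshold so that out-of-vocabulary speech is not forced onto the nearest word, but no such threshold is described in the source.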

In the invention, the embedded processor and the DTW algorithm module are implemented in a single FPGA (Field Programmable Gate Array).

The integrated circuit implementation of the DTW algorithm is described in VHDL (Very High Speed Integrated Circuit Hardware Description Language).

Compared with the prior art, the invention has the following advantages and beneficial effects:

(1) The concept of a "movement direction" is introduced, and the DTW algorithm is implemented with a one-dimensional systolic array. To maximize recognition speed, the invention has multiple processing units, i.e. a processing unit array, compute simultaneously. This is orders of magnitude faster than implementations without a processing array and suits applications with demanding real-time requirements. Existing circuit implementations of the DTW algorithm fix the frame counts of the reference template and the test template and require them to be equal, so they are not very general. To address this, the invention adds a calculation control module (cal_ctrl) to the DTW implementation. The calculation control module generates the "movement direction", which determines the search direction of the first processing unit at the next matching time step. Exploiting the parallelism of the search paths of the multiple processing units, the movement direction signal flows through the systolic array together with the reference template features, and the movement direction controls the reverse flow of the test template features through the array, so that every processing unit can still match the two templates correctly when the lengths of the reference and test templates vary.

(2) During DTW matching on the one-dimensional parallel array, the test template feature parameters are input at the last processing unit and flow in the direction opposite to the reference template parameters, while the reference template feature parameters are passed down in sequence starting from the first processing unit. The template feature parameters thus enter from two opposite directions. Compared with conventional parallel arrays, in which the template feature parameters enter from one direction and are stored before being computed on, this greatly reduces the hardware resources and the complexity of the input interface control logic.

Description of drawings

Fig. 1 is a schematic diagram of the structure of the speech recognition device of the invention;

Fig. 2 is a schematic diagram of the internal structure of the DTW algorithm module shown in Fig. 1;

Fig. 3 is a schematic diagram of the external structure of the DTW algorithm module shown in Fig. 1;

Fig. 4 is a schematic diagram of the matching process between the test template and the reference template shown in Fig. 3.

Detailed description

The invention is described in further detail below with reference to the examples, but the embodiments of the invention are not limited to them.

Example 1

Fig. 1 shows the structure of the speech recognition device of the invention based on the DTW algorithm. The device comprises an embedded processor (Nios II) 1 and a DTW algorithm module (DTW) 2. The DTW algorithm module 2 is connected to the embedded processor 1 through a control bus. The embedded processor 1 is connected to a program memory (FLASH) 3, a data memory (SRAM) 4, and a display 5, and to a microphone 7 through an analog-to-digital conversion module 6; it is also provided with keys 8 and an RS232 interface 9.

Figs. 2 and 3 show the structure of the DTW algorithm module 2. It comprises a calculation control module (cal_ctrl) 10, a processing unit array 11, and a template buffer 12, and is provided with a system clock port (clk) 13, a control signal port (ctrl_sig) 14, a feature data port (dat_ptn) 15, a reference template valid port (ref_dat_en) 16, a reset signal port (rst) 17, a test template valid port (test_dat_en) 18, a system busy flag port (busy) 19, a matching result output port (dout) 20, and an output data valid port (dout_en) 21. The processing unit array 11 is a one-dimensional linear array whose processing units are connected in sequence. The system clock port (clk) 13 and the reset signal port (rst) 17 are connected to every processing unit, and the last processing unit 11-2 is connected to the matching result output port (dout) 20. The calculation control module 10 is connected to the first and last processing units 11-1 and 11-2 and to the template buffer 12, the system clock port (clk) 13, the control signal port (ctrl_sig) 14, the reference template valid port (ref_dat_en) 16, the reset signal port (rst) 17, the test template valid port (test_dat_en) 18, the system busy flag port (busy) 19, and the output data valid port (dout_en) 21. The template buffer 12 is connected to the first and last processing units 11-1 and 11-2 and to the system clock port (clk) 13, the feature data port (dat_ptn) 15, and the reset signal port (rst) 17.

The template buffer 12 comprises a test template buffer and a reference template buffer.

The control bus is connected to the system clock port (clk) 13, the control signal port (ctrl_sig) 14, the feature data port (dat_ptn) 15, the reference template valid port (ref_dat_en) 16, the reset signal port (rst) 17, the test template valid port (test_dat_en) 18, the system busy flag port (busy) 19, the matching result output port (dout) 20, and the output data valid port (dout_en) 21. The control signal port (ctrl_sig) 14 is connected to three control signal lines of the control bus, which can represent 8 different control states, allowing the embedded processor (Nios II) 1 to control the DTW algorithm module.

实施例2Example 2

如图1、图2、图3所示,基于动态时间归正算法的本语音识别装置的工作过程是:As shown in Figure 1, Figure 2, and Figure 3, the working process of this speech recognition device based on the dynamic time correction algorithm is:

(1)生成参考模板。通过按键8,选择训练功能,进行下述步骤:(1) Generate a reference template. Press button 8 to select the training function and proceed to the following steps:

1、发音人的声音通过麦克风7转换成电信号后经模/数转换模块6变成数字语音信号;1. The speaker's voice is converted into an electrical signal by the microphone 7 and then converted into a digital voice signal by the analog/digital conversion module 6;

2、嵌入式处理器(Nios II)1对数字语音信号进行预处理,然后提取LPC(线性预测编码)特征参数作为参考模块特征参数;2. The embedded processor (Nios II) 1 preprocesses the digital voice signal, and then extracts the LPC (Linear Predictive Coding) feature parameter as the reference module feature parameter;

3、将参考模块特征参数保存到程序存储器(Flash)3中,形成参考模板供识别时使用。3. Save the characteristic parameters of the reference module into the program memory (Flash) 3 to form a reference template for identification.

(2)语音识别。通过按键8,选择识别功能,进行下述步骤:(2) Speech recognition. Press key 8 to select the identification function and proceed to the following steps:

1、发音人的声音通过麦克风7转换成电信号后经模/数转换模块6变成数字语音信号;1. The speaker's voice is converted into an electrical signal by the microphone 7 and then converted into a digital voice signal by the analog/digital conversion module 6;

2、嵌入式处理器(Nios II)对数字语音信号做预处理,提取LPC特征参数作为测试模板特征参数,并保存到数据存储器(SRAM)4中,形成测试模板;2, the embedded processor (Nios II) preprocesses the digital voice signal, extracts the LPC feature parameter as the test template feature parameter, and saves it in the data memory (SRAM) 4 to form the test template;

3、嵌入式处理器(Nios II)1通过控制总线向复位信号端口(rst)17发出的复位模板缓冲区信号,计算控制模块(cal_ctrl)10检测到复位模板缓冲区信号后,对模板缓冲区12进行复位,等待数据的输入;3. The embedded processor (Nios II) 1 sends the reset template buffer signal to the reset signal port (rst) 17 through the control bus. After the calculation control module (cal_ctrl) 10 detects the reset template buffer signal, the template buffer is 12 reset and wait for data input;

4、嵌入式处理器(Nios II)1通过控制总线向特征数据端口(dat_ptn)15依次输入测试模板特征参数、参考模板特征参数,并依次置测试模板有效端口(test_dat_en)18、参考模板有效端口(ref_dat_en)16为高电平,计算控制模块(cal_ctrl)10检测到测试模板有效端口(test_dat_en)18、参考模板有效端口(ref_dat_en)16为高电平时,依次输出相应的地址和控制信号,并把测试模板特征参数、参考模板特征参数分别保存到模板缓冲区12中的测试模板缓冲区、参考模板缓冲区,完成数据的输入;4. Embedded processor (Nios II) 1 inputs test template feature parameters and reference template feature parameters sequentially to feature data port (dat_ptn) 15 through the control bus, and sets test template effective port (test_dat_en) 18 and reference template effective port successively (ref_dat_en) 16 is a high level, and when the calculation control module (cal_ctrl) 10 detects that the test template effective port (test_dat_en) 18 and the reference template effective port (ref_dat_en) 16 are high level, it outputs corresponding addresses and control signals in sequence, and Save the test template feature parameter and the reference template feature parameter to the test template buffer in the template buffer 12 and the reference template buffer respectively to complete the input of data;

5、嵌入式处理器(Nios II)1通过控制总线向控制信号端口(ctrl_sig)14发出匹配命令,计算控制模块(cal_ctrl)10检测到匹配命令后,给模板缓冲区12输出相应的地址和控制信号,把参考模板特征参数和测试模板特征参数读出,分别送到第一个处理单元11-1和最后一个处理单元11-2,同时所述计算控制模块10计算出下一个时刻的运动方向,并将其送到所述第一个处理单元,同时使系统忙标志端口(busy)19输出高电平,说明系统正在工作;5. The embedded processor (Nios II) 1 sends a matching command to the control signal port (ctrl_sig) 14 through the control bus. After the calculation control module (cal_ctrl) 10 detects the matching command, it outputs the corresponding address and control to the template buffer 12 signal, read out the reference template feature parameter and the test template feature parameter, and send them to the first processing unit 11-1 and the last processing unit 11-2 respectively, and the calculation control module 10 calculates the motion direction at the next moment , and send it to the first processing unit, and simultaneously make the system busy flag port (busy) 19 output a high level, indicating that the system is working;

6. Following the motion direction, the calculation control module (cal_ctrl) 10 reads the required reference template feature parameters and test template feature parameters from template buffer 12 in sequence and sends them to the processing units for computation, until the last data item has been processed;

7. When the computation is complete, the last processing unit 11-2 outputs the matching result, that is, the weighted matching distance between the test template and the reference template, through the matching result output port (Dout) 20. The calculation control module (cal_ctrl) 10 then sends a positive pulse to the embedded processor (Nios II) 1 via the output data valid port (dout_en) 21 and the control bus, raising an interrupt request to notify the embedded processor (Nios II) 1 that the computation has finished, and at the same time drives the system busy flag port (busy) 19 low;

8. After receiving the positive pulse from the output data valid port (dout_en) 21, the embedded processor (Nios II) 1 reads the matching result and buffers it in the data memory (SRAM) 4, then repeats steps 3 to 7 above for the next match, looping in this way until all reference templates have been matched;

g. The embedded processor (Nios II) 1 makes the recognition decision from the weighted matching distances between the test template and all reference templates, obtains the recognition result, and shows it on display 5, which completes the speech recognition process.
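Taken together, steps 4 to 8 and step g amount to a host-side loop: obtain one accumulated distance per reference template, then decide among them. The following is a minimal software sketch of that loop, not the patent's hardware. It assumes a plain, unweighted dynamic time warping recurrence with three predecessors, a city-block frame distance, and a decision rule that picks the smallest distance; the text above does not specify the path weights, the local metric, or the decision rule. The names `frame_distance`, `dtw_distance`, and `recognize` are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Frame = Sequence[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """City-block distance between two feature vectors (assumed local metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dtw_distance(test: Sequence[Frame], ref: Sequence[Frame]) -> float:
    """Accumulated distance from cell (0, 0) to the last cell.

    Assumes both templates are non-empty. Templates may differ in length.
    """
    inf = float("inf")
    n, m = len(test), len(ref)
    # acc[i][j]: best accumulated cost of aligning test[:i+1] with ref[:j+1].
    acc = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = frame_distance(test[i], ref[j])
            if i == 0 and j == 0:
                acc[i][j] = d
                continue
            best = min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            acc[i][j] = d + best
    return acc[n - 1][m - 1]


def recognize(
    test: Sequence[Frame], references: Dict[str, Sequence[Frame]]
) -> Tuple[str, Dict[str, float]]:
    """Match the test template against every reference and pick the closest."""
    scores = {label: dtw_distance(test, ref) for label, ref in references.items()}
    return min(scores, key=scores.get), scores


# Toy one-dimensional features, purely illustrative.
test_template = [[0.0], [1.0], [2.0]]
reference_templates = {
    "up": [[0.0], [1.0], [1.0], [2.0]],
    "down": [[2.0], [1.0], [0.0]],
}
label, scores = recognize(test_template, reference_templates)
print(label, scores)  # label is "up": distance 0.0 versus 4.0 for "down"
```

In the device described above, the inner distance computation is what the processing unit array performs; only the loop over reference templates and the final decision would remain on the embedded processor.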

As shown in Figure 3, the processing unit array of the dynamic time warping (DTW) algorithm module is a one-dimensional parallel array. The test template feature parameters are input at the last processing unit 11-2 and flow in the direction opposite to the reference template parameters, while the reference template feature parameters are passed down in sequence starting from the first processing unit 11-1; the flow is sequential. As shown in Figure 4, the search in the matching process between the test template and the reference template runs from the lower-left corner to the upper-right corner: while processing unit B is computing row i, processing unit A is already computing row i+1. The column features flow in the opposite direction to the row features, so while processing unit A is computing column j, processing unit B is already computing column j+1. Conventional parallel arrays feed the template feature parameters from one direction and store them before computing; feeding the reference template feature parameters and the test template feature parameters from two opposite directions instead greatly reduces the hardware resources and the complexity of the input interface control logic.
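The staggering described here (one processing unit on row i while its neighbour is already on row i+1) corresponds to evaluating the DTW grid along anti-diagonals: every cell with the same i + j depends only on cells from the two previous anti-diagonals, so those cells can be computed at the same time. The sketch below illustrates only that dependency structure in software. It is not a model of the patent's counter-flowing data streams or of its motion direction signal, and it assumes scalar features, an absolute-difference local distance, and an unweighted three-predecessor recurrence. The function names are hypothetical.

```python
from typing import Sequence


def dtw_sequential(test: Sequence[float], ref: Sequence[float]) -> float:
    """Row-by-row DTW, used here only as a reference result."""
    inf = float("inf")
    n, m = len(test), len(ref)
    acc = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(test[i] - ref[j])
            if i == 0 and j == 0:
                acc[i][j] = d
                continue
            acc[i][j] = d + min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
    return acc[n - 1][m - 1]


def dtw_wavefront(test: Sequence[float], ref: Sequence[float]) -> float:
    """Same recurrence, evaluated one anti-diagonal (i + j == step) at a time."""
    inf = float("inf")
    n, m = len(test), len(ref)
    acc = [[inf] * m for _ in range(n)]
    for step in range(n + m - 1):
        lo = max(0, step - (m - 1))  # smallest valid row index on this diagonal
        hi = min(n - 1, step)        # largest valid row index on this diagonal
        # Cells on one diagonal are mutually independent: compute all of them
        # from older diagonals first, then commit, as parallel units would.
        updates = []
        for i in range(lo, hi + 1):
            j = step - i
            d = abs(test[i] - ref[j])
            if step == 0:
                best = 0.0
            else:
                best = min(
                    acc[i - 1][j] if i > 0 else inf,
                    acc[i][j - 1] if j > 0 else inf,
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            updates.append((i, j, d + best))
        for i, j, value in updates:
            acc[i][j] = value
    return acc[n - 1][m - 1]


# Templates of unequal length give the same result under both schedules.
a = [1.0, 3.0, 4.0, 9.0]
b = [1.0, 3.0, 9.0]
print(dtw_sequential(a, b) == dtw_wavefront(a, b))
```

With n test frames and m reference frames the wavefront needs n + m - 1 steps instead of n × m sequential cell updates, which is the kind of saving a one-dimensional array of processing units is aiming for.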

In this device, the program memory (FLASH) 3 is an AM29LV065DU from Atmel, the data memory (SRAM) 4 is an IDT71V416 from IDT, the analog-to-digital conversion module 6 uses the 8-bit analog-to-digital converter ADC0809 from National Semiconductor, and the display 5 is an LCM-S16020 from LUMEX. The embedded processor (Nios II) 1 and the DTW algorithm module 2 are implemented in a single Altera EP1C20F400C7 FPGA (Field Programmable Gate Array) 22. The implementation process is:

1. Describe the system circuit in VHDL (Very High Speed Integrated Circuit Hardware Description Language);

2. Synthesize the VHDL hardware description with Synplify, the integrated circuit design synthesis software from Synplicity, to obtain a netlist file for the Altera EP1C20F400C7 FPGA;

3. Place and route the resulting netlist and extract the delay information with QuartusII, the FPGA design implementation software from Altera;

4. Run timing simulation with NC-SIM, the integrated circuit design simulation software from Cadence;

5. Download the hardware configuration to the FPGA (model EP1C20F400C7) with Altera's QuartusII software.

Carried out as described above, the invention can be implemented well.

Claims (3)

1. A speech recognition method implemented with an integrated circuit, characterized by comprising the following steps:
First, on detecting a template buffer reset signal on the control bus, the calculation control module resets the template buffer and waits for data input;
Second, when the calculation control module detects that the test template is valid or the reference template is valid, it outputs the corresponding address and control signals and stores the test template feature parameters and the reference template feature parameters present on the feature data port in the template buffer, completing the data input;
Third, on detecting a start-matching command on the control bus, the calculation control module outputs the corresponding address and control signals to the template buffer, reads out the reference template feature parameters and the test template feature parameters, and sends them to the first processing unit and the last processing unit, respectively; at the same time the calculation control module computes the motion direction for the next instant and sends it to the first processing unit, and drives the system busy flag port high to indicate that the system is working;
Fourth, following the motion direction, the calculation control module reads the required reference template and test template from the template buffer in sequence and sends them to the processing units for processing;
Fifth, when data processing is complete, the last processing unit outputs the weighted matching distance between the test template and the reference template through the matching result output port; the calculation control module then makes the output data valid port output a positive pulse and at the same time drives the system busy flag port low;
Sixth, after receiving the positive pulse from the output data valid port, the embedded processor reads the matching result and buffers it in the data memory, then repeats the second to fifth steps above for the next match, looping in this way until all reference templates have been matched.

2. The speech recognition method implemented with an integrated circuit according to claim 1, characterized in that the motion direction is the search direction of the first processing unit at the next matching instant, as determined by the calculation control module; exploiting the parallelism of the search paths of the multiple processing units, the motion direction signal flows through the systolic array together with the reference template features, so that templates of unequal length can be matched.

3. The speech recognition method implemented with an integrated circuit according to claim 1, characterized in that, in the third step, the reference template feature parameters are passed down in sequence starting from the first processing unit, and the test template feature parameters are input at the last processing unit and made to flow in the direction opposite to the reference template parameters, so that the reference template parameters and the test template parameters move toward each other from two opposite directions.
CNB2005100337656A 2005-03-28 2005-03-28 Voice recognition device and integrated circuit realizing method Expired - Fee Related CN1326112C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100337656A CN1326112C (en) 2005-03-28 2005-03-28 Voice recognition device and integrated circuit realizing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100337656A CN1326112C (en) 2005-03-28 2005-03-28 Voice recognition device and integrated circuit realizing method

Publications (2)

Publication Number Publication Date
CN1664926A CN1664926A (en) 2005-09-07
CN1326112C true CN1326112C (en) 2007-07-11

Family

ID=35035967

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100337656A Expired - Fee Related CN1326112C (en) 2005-03-28 2005-03-28 Voice recognition device and integrated circuit realizing method

Country Status (1)

Country Link
CN (1) CN1326112C (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111028841B (en) * 2020-03-10 2020-07-07 深圳市友杰智新科技有限公司 Method and device for awakening system to adjust parameters, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1352787A (en) * 1999-02-08 2002-06-05 高通股份有限公司 Distributed voice recognition system
CN2781513Y (en) * 2005-03-28 2006-05-17 华南理工大学 Speech recogintion device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
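Research on an ASIC implementation algorithm for DTW (DTW的ASIC实现算法研究), Li Tao (李韬), He Qianhua (贺前华), Wang Qian (王前), Microelectronics (微电子学), Vol. 34, No. 3, 2004 *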

Also Published As

Publication number Publication date
CN1664926A (en) 2005-09-07

Similar Documents

Publication Publication Date Title
CN109784489A (en) Convolutional neural networks IP kernel based on FPGA
CN110223687A (en) Instruction execution method, device, storage medium and electronic device
CN110379411B (en) Speech synthesis method and device for target speaker
CN110992941A (en) Power grid dispatching voice recognition method and device based on spectrogram
CN113763966A (en) End-to-end text-independent voiceprint recognition method and system
CN114120979A (en) Optimization method, training method, device and medium of voice recognition model
WO2021169711A1 (en) Instruction execution method and apparatus, storage medium, and electronic device
CN1326112C (en) Voice recognition device and integrated circuit realizing method
CN112991382B (en) Heterogeneous visual target tracking system and method based on PYNQ framework
CN2781513Y (en) Speech recogintion device
Stemmer et al. Speech Recognition and Understanding on Hardware-Accelerated DSP.
CN116580693A (en) Training method of timbre conversion model, timbre conversion method, device and equipment
Liu et al. Design and Implementation of Human-Computer Interaction Intelligent System Based on Speech Control.
WO2020073839A1 (en) Voice wake-up method, apparatus and system, and electronic device
CN114822494A (en) A voice data acquisition method, device, electronic device and storage medium
CN110335591A (en) A kind of parameter management method, device, machine readable media and equipment
Gong et al. QCNN inspired reconfigurable keyword spotting processor with hybrid data-weight reuse methods
CN101256769B (en) Speech recognition devices and methods thereof
CN212675912U (en) An Automatic Speech Recognition System Based on FPGA
CN101120397A (en) Speech recognition system, speech recognition method and speech recognition program
CN208368159U (en) A kind of speech recognition front-ends processing system based on ARM embedded platform
CN112767949B (en) Voiceprint recognition system based on binary weight convolutional neural network
CN110299148A (en) Voice fusion method, electronic device and storage medium based on Tensorflow
CN1523506A (en) Digital Image Matching Chip
CN115331658A (en) Voice recognition method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20070711

Termination date: 20110328