KR101079653B1

KR101079653B1 - Apparatus and method to generate keywords for speech recognition in a navigation device

Info

Publication number: KR101079653B1
Application number: KR1020080131221A
Authority: KR
Inventors: 왕지현; 정의석; 강병옥; 박전규; 강점자; 김종진; 박기영; 이성주; 전형배; 정호영; 정훈; 이윤근
Original assignee: 한국전자통신연구원
Priority date: 2008-12-22
Filing date: 2008-12-22
Publication date: 2011-11-04
Anticipated expiration: 2028-12-22
Also published as: KR20100072731A

Abstract

The present invention relates to a technique for generating speech recognition target keywords in a navigation device. To enable point-of-interest (POI) search by speech recognition in a navigation device, it automatically combines and generates, from POI names, the speech recognition target keywords that a user is likely to utter, so that the user's varied utterances can be recognized. According to the present invention, the navigation device automatically generates the variant forms of a POI that a user may utter, which makes a voice-based POI search service possible and thereby improves user convenience.

Speech recognition, navigation device, POI (Point of Interest), variant form

Description

Apparatus and method for generating voice recognition target keyword in navigation device {APPARATUS AND METHOD TO GENERATE KEYWORDS FOR SPEECH RECOGNITION IN A NAVIGATION DEVICE}

The present invention relates to a navigation device, and more particularly to an apparatus and method for generating speech recognition target keywords in a navigation device, suited to generating those keywords automatically so that a point of interest (hereinafter, POI) can be searched by voice on an in-vehicle or portable navigation device.

The present invention is derived from research conducted as part of the IT Growth Engine Technology Development Program of the Ministry of Knowledge Economy and the Institute for Information Technology Advancement [Project management number: 2006-S-036-03, Project title: Development of large-capacity interactive distributed-processing speech interface technology for new growth engine industries].

In general, a navigation device is installed in a moving body such as a car, or carried by the user, and finds and guides the user along a route to a destination. Such a system comprises a navigation unit that overlays the position signal received from Global Positioning System (hereinafter, GPS) satellites on the map information of a navigation program and displays it to the user; a GPS receiver, connected to or built into the navigation unit, that receives the GPS signal, i.e. the position signal from the GPS satellites; and the GPS satellites that transmit the position signal.

Rather than tracking a radio source, such a navigation device receives GPS information from GPS satellites, computes the current position, the route to the destination, the distance, the speed, and so on, and shows the user an accurate position in real time, thereby providing route guidance.

The user of the navigation device searches for a desired destination. The device looks up the destination query entered by the user in the region, trade name, and building information DBs stored in the navigation unit and outputs the results. When the user selects a destination from the results, the device analyzes the position of the selected destination and the current GPS position, and then automatically guides the user to the destination.

However, the search is likely to fail when the user does not know the exact name of the destination or, for a long name, searches with a shorter name that is easier to say. In most such cases the user utters a sub-word of the original POI name or a similar variant. To realize voice search over POI names, sub-words and variants must therefore be generated from the original POI name so that varied user utterances can be handled; only then does voice search become effective.

For example, if the POI name is "Osan-si Unam Jugong Complex 4 Apartment", most users are likely to utter a sub-word such as "Unam Jugong Complex 4", "Unam Jugong Apartment", or "Unam Apartment" rather than the full, long POI name.

As another example, "GS Caltex Gangbyeon Gas Station" may also be called "Gangbyeon GS Gas Station" or "Gangbyeon GS Caltex".

Voice search on a prior-art navigation device operating as described above does not consider the various utterance variants of the original POI name, so recognition of the user's utterance is likely to fail. The failure stems from the mismatch between the POI names used as speech recognition target keywords and what the user actually says, and there has been no real remedy for it.

Accordingly, the present invention provides an apparatus and method for generating speech recognition target keywords in a navigation device that, in order to realize POI search by speech recognition, automatically combines and generates from POI names the keywords a user is likely to utter, so that the user's varied utterances can be recognized.

The present invention also provides an apparatus and method for generating speech recognition target keywords in a navigation device that, in order to realize POI search by speech recognition, learns statistically probable combinations of nouns from a POI DB used as a training corpus and, on that basis, automatically generates new variant forms.

An apparatus according to one embodiment of the present invention includes a statistical model learning unit that analyzes the POI strings of a POI name DB, collects co-occurrences between nouns, and builds their probability values into statistical information; and a variant generation unit that uses the collected statistical information to generate utterance variants of an input POI name.

A method according to one embodiment of the present invention includes analyzing the POI strings of a POI name DB, collecting co-occurrences between nouns, and building their probability values into statistical information; and using the built statistical information to generate utterance variants of an input POI name.

The effects obtained by representative aspects of the disclosed invention are briefly described as follows.

By automatically generating the variant forms of a POI that a user may utter, the present invention makes a voice-based POI search service possible on a navigation device and thereby improves user convenience.

Hereinafter, the operating principle of the present invention is described in detail with reference to the accompanying drawings. In the following description, detailed descriptions of known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary with the intention or practice of users and operators; their definitions should therefore be based on the content of this specification as a whole.

The present invention automatically combines and generates, from POI names, the speech recognition target keywords that a user is likely to utter, so that a navigation device can recognize the user's varied utterances and thus support POI search by speech recognition. It learns statistically probable combinations of nouns from the POI DB used as a training corpus, automatically generates new variant forms on that basis, and thereby enables a voice-based POI search service.

A POI name is characteristically a concatenation of several nouns. In most cases an utterance variant is a substring of the POI name, or a string produced by reordering such substrings; otherwise it is a new variant formed by replacing a noun with a synonym or near-synonym.

Here, to produce speech recognition targets from the list of POI names, substrings and variants that a user is likely to utter are generated; these are referred to as the utterance "variant forms" of the original name.

FIG. 1 is a block diagram showing the structure of a navigation device according to an embodiment of the present invention.

Referring to FIG. 1, the navigation device includes a control unit 100, a GPS receiver 102, a memory 104, an operation panel input unit 106, a display unit 108, a speaker 110, a voice input unit 112, and so on.

Specifically, the GPS receiver 102 measures precise time and distance from the position information provided by, for example, three or more GPS satellites, computes the current position accurately by triangulation over the three distinct distances, and passes the computed information to the control unit 100.

The memory 104 stores an electronic map database containing regional map data and passes it to the control unit 100, which performs route guidance based on the position information and the electronic map database.

The operation panel input unit 106 passes control commands entered by the user through a keypad, touch screen, or the like on the navigation device to the control unit 100 as key input signals or touch signals; in response, the control unit 100 controls each functional block according to the received command.

The voice input unit 112 passes the voice entered by the user to the control unit 100 when a voice search is performed.

The control unit 100 controls overall operation in coordination with each functional element of the navigation device and performs position measurement, route guidance, speed detection, and similar services. It also performs destination search based on the text and voice information the user enters through the operation panel input unit 106 and the voice input unit 112.

The control unit 100 may also include a statistical model learning unit and a variant generation unit so that it can recognize the user's varied utterances.

FIG. 2 is a diagram showing the processing relationship between the statistical model learning unit and the variant generation unit according to an embodiment of the present invention.

Referring to FIG. 2, the statistical model learning unit 202 builds the resources that the variant generation unit 208 uses when generating POI variants. To build them, it collects co-occurrence information between nouns from a large corpus, namely the input POI name DB 200, and stores the resulting probability values in the statistical information DB 204.

The variant generation unit 208 then takes an input POI name 206 and, using the statistical information DB 204 built by the statistical model learning unit 202, outputs the POI utterance variants 210 that can be generated. The input POI name 206 is not supplied from outside; it is fed in for the purpose of producing utterance variants and may be any POI name in the POI name DB 200.

FIG. 3 is a diagram showing the component processing structure of the statistical model learning unit according to an embodiment of the present invention.

Referring to FIG. 3, the statistical model learning unit 202 includes a compound noun decomposition unit 300, a language model generation unit 302, a probability calculation unit 304, a language model weight calculation unit 306, and so on. It takes the POI name DB 200 as input and outputs the statistical information 204, which comprises an N-gram probability information DB 308 and a language model weight information DB 310.

Specifically, because each input POI name is a combination of several nouns, the compound noun decomposition unit 300 splits it into unit nouns and tags each with a semantic tag. For example, the POI name "Sillim 8-dong Postal Agency" is decomposed into 'Sillim/<administrative place name>' + '8-dong/<numeric expression>' + 'Postal Agency/<institution name>', with tag information attached to each unit.
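The decomposition and tagging step can be illustrated with a minimal sketch. The text does not specify how the split is performed, so the greedy longest-match strategy, the `LEXICON` table, the `decompose` function, and the `unknown` fallback tag are all assumptions made for illustration; English whitespace tokens stand in for the Korean original.

```python
# Hypothetical tagged lexicon; a real system would use a far larger dictionary.
LEXICON = {
    "Sillim": "administrative place name",
    "8-dong": "numeric expression",
    "Postal Agency": "institution name",
}


def decompose(poi_name, lexicon=LEXICON):
    """Greedy longest-match split of a POI name into (noun, tag) pairs."""
    tokens = poi_name.split()
    result = []
    i = 0
    while i < len(tokens):
        # Try the longest span of tokens first so multi-word nouns win.
        for j in range(len(tokens), i, -1):
            candidate = " ".join(tokens[i:j])
            if candidate in lexicon:
                result.append((candidate, lexicon[candidate]))
                i = j
                break
        else:
            # Unknown token: keep it with a placeholder tag.
            result.append((tokens[i], "unknown"))
            i += 1
    return result


units = decompose("Sillim 8-dong Postal Agency")
```

Here `units` holds three (noun, tag) pairs, one per unit noun of the example name.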

Each string $w_0 w_1 \ldots w_n$ making up a POI name can be assigned a probability which, as in (Table 1) below, is the product of the probabilities of its nouns (words, w).

$$P(w_0 w_1 \ldots w_n) = P(w_0) \times P(w_1) \times P(w_2) \times \cdots \times P(w_n)$$
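As a worked instance of (Table 1), the sketch below multiplies per-word probabilities for one candidate string. The probability values in `UNIGRAM` are made up for illustration and the function name `string_probability` is hypothetical.

```python
import math

# Hypothetical unigram probabilities, chosen only for illustration.
UNIGRAM = {"Unam": 0.5, "Jugong": 0.25, "Apartment": 0.5}


def string_probability(words, unigram=UNIGRAM):
    """Product of the per-word probabilities, as in (Table 1)."""
    return math.prod(unigram[w] for w in words)


p = string_probability(["Unam", "Jugong", "Apartment"])  # 0.5 * 0.25 * 0.5
```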

To obtain more reliable and accurate values, the probability of each word can be replaced with a context-aware N-gram probability. This is explained in detail under the language model weight calculation unit 306.

The language model generation unit 302 generates N-grams that capture the surrounding context of each word in a POI string. An N-gram defines the probability of the next word given the preceding n-1 words.

The embodiment uses six N-gram language models. Three are N-grams over the noun words produced by compound noun decomposition; the other three are N-grams over the tags of those nouns.

In statistical language modeling the former are called word N-grams and the latter class-based N-grams. Each family uses a unigram, a bigram, and a trigram, for six language models in total.
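The six model inputs can be sketched as follows: word and tag sequences are each cut into unigrams, bigrams, and trigrams. The helper names `ngrams` and `six_models` are hypothetical, and the sketch omits any start or end padding, which the text does not discuss.

```python
def ngrams(sequence, n):
    """All contiguous n-item windows of a sequence, as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def six_models(tagged_units):
    """Word and class N-grams (n = 1, 2, 3) for one decomposed POI name."""
    words = [word for word, _ in tagged_units]
    tags = [tag for _, tag in tagged_units]
    models = {}
    for n in (1, 2, 3):
        models[("word", n)] = ngrams(words, n)   # word N-gram
        models[("class", n)] = ngrams(tags, n)   # class-based N-gram
    return models


tagged = [
    ("Sillim", "administrative place name"),
    ("8-dong", "numeric expression"),
    ("Postal Agency", "institution name"),
]
models = six_models(tagged)
```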

The probability calculation unit 304 counts the occurrences of each N-gram generated by the language model generation unit 302 and then computes a probability for each N-gram string. The computed probabilities are stored as a resource file in the N-gram probability information DB 308, and the variant generation unit 208 consults them to look up the N-gram probability of each noun in a POI variant candidate.
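A minimal sketch of the count-then-estimate step follows. The text says only that N-grams are counted and probabilities computed; the relative-frequency estimate used here (N-gram count divided by the count of its context) is an assumption, as are the function name `ngram_probabilities` and the toy corpus.

```python
from collections import Counter


def ngram_probabilities(sequences, n):
    """Relative-frequency estimate of P(last item | preceding n-1 items)."""
    ngram_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = tuple(seq[i:i + n])
            ngram_counts[gram] += 1
            # For n = 1 the context is the empty tuple, so this counts all tokens.
            context_counts[gram[:-1]] += 1
    return {g: c / context_counts[g[:-1]] for g, c in ngram_counts.items()}


# Toy corpus of decomposed POI names (illustrative only).
corpus = [["Unam", "Jugong", "Apartment"], ["Unam", "Apartment"]]
unigram = ngram_probabilities(corpus, 1)
bigram = ngram_probabilities(corpus, 2)
```

In this toy corpus "Unam" is followed once by "Jugong" and once by "Apartment", so each bigram starting with "Unam" receives probability 0.5.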

The language model weight calculation unit 306 serves to obtain more accurate N-gram probabilities. However large it is, the POI name DB 200 used as the training corpus cannot contain every noun that may occur or every combination of nouns, so the data are inevitably sparse. The present invention therefore combines the six unit N-gram probabilities obtained from the probability calculation unit 304 by linear interpolation, as in (Table 2) below, and uses the result as the final N-gram probability. Each weight λ is learned from the corpus and takes a value between 0 and 1. Among the six language models, an N-gram model with relatively sparse data receives a lower weight, meaning that it is trusted less than the models with higher weights.

$$
\begin{aligned}
P(w_i) ={} & \lambda_1(h)\,P(w_i) + \lambda_2(h)\,P(w_i \mid w_{i-1}) + \lambda_3(h)\,P(w_i \mid w_{i-2}, w_{i-1}) \\
& + \lambda_4(h)\,P_c(w_i) + \lambda_5(h)\,P_c(w_i \mid w_{i-1}) + \lambda_6(h)\,P_c(w_i \mid w_{i-2}, w_{i-1}) \\
& (0 \le \lambda_1 + \cdots + \lambda_6 \le 1)
\end{aligned}
$$

In (Table 2), w is a word (noun) and h, short for history, denotes $w_{i-1} w_{i-2} \ldots$. P(x) is a word N-gram probability and $P_c(x)$ is a class N-gram probability. $P(w_i)$ is the unigram probability, $P(w_i \mid w_{i-1})$ the bigram probability, and $P(w_i \mid w_{i-2}, w_{i-1})$ the trigram probability.
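The interpolation of (Table 2) amounts to a weighted sum of six component probabilities. The sketch below assumes the weights for the current history h have already been looked up; the function name `interpolate` and all numeric values are illustrative, not taken from the patent.

```python
def interpolate(component_probs, weights):
    """Weighted sum of the six unit N-gram probabilities, as in (Table 2).

    component_probs: word unigram, bigram, trigram, then class unigram,
    bigram, trigram probabilities for the same word w_i.
    weights: lambda_1 .. lambda_6 for the current history h.
    """
    if len(component_probs) != 6 or len(weights) != 6:
        raise ValueError("expected six probabilities and six weights")
    return sum(lam * p for lam, p in zip(weights, component_probs))


# Illustrative values; the word trigram is unseen (0.0) and gets no weight.
probs = [0.5, 0.25, 0.0, 0.5, 0.5, 0.25]
lambdas = [0.25, 0.25, 0.0, 0.25, 0.125, 0.125]
final_p = interpolate(probs, lambdas)
```

Because the class-based terms still contribute, the word receives a nonzero probability even though its word trigram never occurred in the corpus, which is the point of interpolating sparse models.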

The weights λ of each N-gram model are computed for every generated N-gram, stored in the language model weight information DB 310, and learned by the EM (Expectation-Maximization) algorithm of (Table 3).

1) E-step
$$\lambda_j^{(t+1)}(h) = \frac{\lambda_j^{(t)}(h)\,P(w_i \mid h_i)}{\sum_{k=1}^{6} \lambda_k^{(t)}(h)\,P(w_i \mid h_i)}$$

2) M-step
$$K^{(t+1)} = \log \sum_{k=1}^{6} \lambda_k^{(t+1)}(h)\,P(w_i \mid h_i)$$

3) Convergence check
Repeat 1) and 2) until the difference between $K^{(t+1)}$ and $K^{(t)}$ is smaller than a specified small value.
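The three steps can be sketched as an iterative weight update. This is a sketch under stated assumptions, not the patent's exact procedure: it assumes that P in step 1) stands for the probability assigned by the j-th (or k-th) component model, that the update is averaged over a set of observed events, and that K is the summed log-likelihood. The function name `em_weights` and the data in `EVENTS` are hypothetical.

```python
import math


def em_weights(events, n_models=6, tol=1e-6, max_iter=200):
    """Estimate interpolation weights from per-model probabilities.

    events: one list per observed word, holding the probability that each
    of the n_models component models assigns to that word.
    """
    lambdas = [1.0 / n_models] * n_models
    prev_k = -math.inf
    for _ in range(max_iter):
        # Step 1: give each model a share proportional to lambda_j * P_j.
        totals = [0.0] * n_models
        for probs in events:
            mix = sum(lam * p for lam, p in zip(lambdas, probs))
            if mix == 0.0:
                continue  # no model explains this event; skip it
            for j in range(n_models):
                totals[j] += lambdas[j] * probs[j] / mix
        norm = sum(totals)
        if norm == 0.0:
            break
        lambdas = [t / norm for t in totals]
        # Step 2: log-likelihood K under the updated weights.
        k = 0.0
        for probs in events:
            mix = sum(lam * p for lam, p in zip(lambdas, probs))
            if mix > 0.0:
                k += math.log(mix)
        # Step 3: stop once K changes by less than the tolerance.
        if abs(k - prev_k) < tol:
            break
        prev_k = k
    return lambdas


# Illustrative events: the second model is the most reliable in every case.
EVENTS = [
    [0.1, 0.6, 0.1, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.2, 0.1, 0.1, 0.1],
]
weights = em_weights(EVENTS)
```

With these inputs the second model ends up with the largest weight, consistent with the statement that sparser, less reliable models receive lower weights.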

FIG. 4 is a flowchart showing the operating procedure of the statistical model learning unit according to an embodiment of the present invention.

Referring to FIG. 4, when the POI name DB is input to the statistical model learning unit 202 in step 400, the compound noun decomposition unit 300 first, in step 402, splits the POI names, each a combination of several nouns, into unit nouns and tags them with semantic tags.

In step 404, the language model generation unit 302 generates N-grams that capture the surrounding context of each noun, i.e. word N-grams over the noun words produced by compound noun decomposition. In step 406, it generates class N-grams, which are N-grams over the tag information of the decomposed nouns.

In step 408, the probability calculation unit 304 counts the occurrences of each N-gram generated by the language model generation unit 302, and in step 410 it computes a probability for each N-gram string. The computed probabilities are stored in the N-gram probability information DB 308 in step 412. In step 414, the language model weight calculation unit 306 computes weights for every generated N-gram, and the results are stored in the language model weight information DB 310 in step 416.

FIG. 5 is a diagram showing the component processing structure of the variant generation unit according to an embodiment of the present invention.

Referring to FIG. 5, the variant generation unit 208 includes a compound noun decomposition unit 500, a variant candidate generation unit 502, a language model generation unit 504, a variant candidate probability calculation unit 506, a variant candidate filtering unit 508, a filtering DB 510, a variant synonym generation unit 516, a synonym dictionary 518, and so on. A POI name is input to the variant generation unit 208, and the output is a list of variant forms that may be uttered.

Specifically, the compound noun decomposition unit 500 splits a POI name, which is a combination of several nouns, into unit nouns and tags them with semantic tags. The variant candidate generation unit 502 then generates, as variant candidates, every arrangement that can arise from combining the nouns obtained by decomposing the input POI name. With n nouns, the number of possible arrangements is n!.
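The n! count corresponds to enumerating every ordering of the decomposed nouns, which can be sketched with the standard library. The function name `generate_candidates` is hypothetical, and the nouns are assumed to be distinct.

```python
from itertools import permutations


def generate_candidates(nouns):
    """Every ordering of the unit nouns; n distinct nouns yield n! candidates."""
    return [list(order) for order in permutations(nouns)]


candidates = generate_candidates(["GS", "Caltex", "Gangbyeon", "Gas Station"])
```

Four nouns give 24 candidates, most of which are implausible; the later probability and filtering stages are what prune them. Shorter variants that drop some nouns would need orderings of subsets as well, which this sketch does not cover.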

The language model generation unit 504 then generates N-grams that capture the surrounding context of each word in the POI string: word N-grams over the noun words produced by decomposition, and class N-grams over the tag information of the decomposed nouns.

The variant candidate probability calculation unit 506 computes the probability of every variant candidate according to the formulas of (Table 1) and (Table 2), consulting the statistical information DB 204 generated by the statistical model learning unit 202.

The variant candidate filtering unit 508 discards a variant candidate if it meets at least one of the conditions below. It filters with reference to the filtering DB 510, which comprises a filtering pattern DB 512 and a representative trade name dictionary 514.

a) The probability of the variant candidate is below a given threshold.

b) The candidate matches a filtering pattern (a '-' pattern).

c) A representative trade name present in the input POI string does not appear in the variant candidate.

The filtering pattern in b) is written in terms of words and word tags. The BNF (Backus-Naur Form) of the pattern is given in (Table 4).

'+' means that a candidate matching the pattern must always be generated, and '-' means that a candidate matching the pattern must always be removed; only one of the two can be used in a pattern. Any number of <term> items may be written, and each may be given either as 'noun' alone or in the form 'noun/tag name'.

For example, when the POI name is "Gayang 2-dong Security Center", the variant candidate "Gayang Security Center 2-dong" can be suppressed with the pattern "- * <numeric expression>", which means: remove every variant candidate that ends with a <numeric expression>.
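A '-' pattern such as the one in this example could be matched as sketched below. The pattern grammar itself is not reproduced in this text, so the representation is an assumption: a pattern is a (sign, terms) pair, '*' matches any run of units, a term in angle brackets matches by tag alone, and other terms follow the 'noun' or 'noun/tag name' forms. All function names are hypothetical.

```python
def term_matches(term, unit):
    """Match one pattern term against one (noun, tag) unit."""
    noun, tag = unit
    if term.startswith("<") and term.endswith(">"):
        return term[1:-1] == tag  # tag-only term
    if "/" in term:
        t_noun, t_tag = term.split("/", 1)
        return t_noun == noun and t_tag.strip("<>") == tag
    return term == noun  # noun-only term


def sequence_matches(terms, units):
    """True if the whole unit sequence matches the term list ('*' = any run)."""
    if not terms:
        return not units
    head, rest = terms[0], terms[1:]
    if head == "*":
        return any(sequence_matches(rest, units[k:]) for k in range(len(units) + 1))
    return bool(units) and term_matches(head, units[0]) and sequence_matches(rest, units[1:])


def is_removed(pattern, units):
    """A candidate is removed when it matches a '-' pattern."""
    sign, terms = pattern
    return sign == "-" and sequence_matches(terms, units)


PATTERN = ("-", ["*", "<numeric expression>"])
original = [("Gayang", "administrative place name"),
            ("2-dong", "numeric expression"),
            ("Security Center", "institution name")]
candidate = [original[0], original[2], original[1]]  # "Gayang Security Center 2-dong"
```

Under these assumptions `is_removed(PATTERN, candidate)` is true because the candidate ends with a numeric expression, while the original ordering is kept.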

The representative trade name dictionary used for c) is a resource listing trade names such as "LG", "Samsung", and "Bennigan's". When such a representative trade name appears in the original POI name, every variant candidate must contain the representative trade name that appeared in the original.
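The trade name condition reduces to a set-containment check, sketched here. The set `REPRESENTATIVE_NAMES` holds only the three names mentioned above, and the function name `keeps_representative_names` and the sample POI nouns are hypothetical.

```python
# The three trade names mentioned in the text; a real dictionary would be larger.
REPRESENTATIVE_NAMES = {"LG", "Samsung", "Bennigan's"}


def keeps_representative_names(original_nouns, candidate_nouns,
                               names=REPRESENTATIVE_NAMES):
    """True if every trade name in the original also occurs in the candidate."""
    required = set(original_nouns) & names
    return required <= set(candidate_nouns)


original_nouns = ["Samsung", "Service", "Center"]
ok = keeps_representative_names(original_nouns, ["Center", "Samsung"])
dropped = keeps_representative_names(original_nouns, ["Service", "Center"])
```

Here `ok` is true and `dropped` is false, so the second candidate would be filtered out.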

The variant synonym generation unit 516 takes the variant candidates that survived filtering and substitutes synonyms noun by noun using the synonym dictionary 518. For example, if the POI variant candidate is "동국대사대부속중학교" (Dongguk University College of Education Attached Middle School), '부속중학교' (attached middle school) is replaced with its synonyms '부속중' and '부속중교', which are abbreviated forms. The variants finally output are then "동국대사대부속중", and "동국대사대부속중교" in addition to "동국대사대부속중학교".
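Noun-by-noun substitution can be sketched as a Cartesian product over each noun's alternatives. The `SYNONYMS` table contains only the entry from the example, and the segmentation of the name into three nouns and the function name `expand_synonyms` are assumptions.

```python
from itertools import product

# Synonym dictionary entry taken from the example above.
SYNONYMS = {"부속중학교": ["부속중", "부속중교"]}


def expand_synonyms(nouns, synonyms=SYNONYMS):
    """Join every combination of each noun with its listed synonyms."""
    options = [[noun] + synonyms.get(noun, []) for noun in nouns]
    return ["".join(choice) for choice in product(*options)]


variants = expand_synonyms(["동국대", "사대", "부속중학교"])
```

The original candidate comes first in `variants`, followed by one string per synonym.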

FIG. 6 is a flowchart showing the operating procedure of the variant generation unit according to an embodiment of the present invention.

Referring to FIG. 6, when a POI name is input in step 600, it is passed to the compound noun decomposition unit 500 of the variant generation unit 208, which in step 602 splits the POI name, a combination of several nouns, into unit nouns and tags them with semantic tags. In step 604, the variant candidate generation unit 502 generates, as variant candidates, every arrangement that can arise from combining the nouns obtained by decomposing the input POI name string.

In step 606, the language model generation unit 504 generates word N-grams over the noun words produced by compound noun decomposition, and in step 608 it generates class N-grams over the tag information of the decomposed nouns. In step 610, the variant candidate probability calculation unit 506 computes the probability of every variant candidate according to the formulas of (Table 1) and (Table 2).

In step 612, the variant candidate filtering unit (508) consults the filtering pattern DB (512) and the representative business name dictionary (514) of the filtering DB (510), and discards any variant candidate that meets at least one of the preset conditions. In step 614, the variant synonym generator (516) substitutes synonyms noun by noun in the candidates that were not filtered out, referring to the synonym dictionary (518).
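The filtering of step 612 can be sketched with the three conditions recited in the claims: a probability below a threshold, a match against a filtering pattern, and absence of the representative business name. The function name `keep_candidate`, the use of regular expressions for the filtering patterns, and the sample values are assumptions for illustration.

```python
import re

def keep_candidate(candidate, prob, threshold, filter_patterns, representative_name):
    """Return False if the candidate meets any one of the removal conditions."""
    text = "".join(candidate)
    # Condition 1: the candidate's probability is below the threshold.
    if prob < threshold:
        return False
    # Condition 2: the candidate matches a filtering pattern (assumed to be a regex).
    if any(re.search(pattern, text) for pattern in filter_patterns):
        return False
    # Condition 3: the representative business name is missing from the candidate.
    if representative_name not in candidate:
        return False
    return True

# Hypothetical pattern: reject forms that begin with a generic suffix noun.
patterns = [r"^부속"]

kept = keep_candidate(("동국대", "부속중학교"), 0.5, 0.1, patterns, "동국대")
dropped = keep_candidate(("부속중학교",), 0.5, 0.1, patterns, "동국대")
print(kept, dropped)
```

Checking the cheap threshold test first avoids running pattern matching on candidates that would be discarded anyway.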

Finally, in step 616, the output of the variant synonym generator (516) is emitted as the list of utterable variants.

As described above, the present invention realizes POI search by speech recognition in a navigation device by automatically combining, from POI names, speech recognition target keywords that users are likely to utter, so that a user's varied utterances can be recognized. Statistically probable noun combinations are learned from a training corpus built from the POI DB, and new variants are generated automatically on that basis.

Although the detailed description of the present invention has described a specific embodiment of an apparatus and method for generating speech recognition target keywords in a navigation device, various modifications are of course possible without departing from the scope of the present invention.

That is, the present invention is not limited to navigation devices; it can be used in any apparatus or user device that automatically generates the variants of a POI that a user may utter in order to enable a voice-based search service. Therefore, the scope of the present invention should not be limited to the described embodiment, but should be defined by the claims set forth below and their equivalents.

FIG. 1 is a block diagram showing the structure of a navigation device according to an embodiment of the present invention;

FIG. 2 is a diagram showing the processing relationship between the statistical model learning unit and the variant generation unit according to an embodiment of the present invention;

FIG. 3 is a diagram showing the structure of the component processing units of the statistical model learning unit according to an embodiment of the present invention;

FIG. 4 is a flowchart illustrating the operation of the statistical model learning unit according to an embodiment of the present invention;

FIG. 5 is a diagram showing the structure of the component processing units of the variant generation unit according to an embodiment of the present invention; and

FIG. 6 is a flowchart illustrating the operation of the variant generation unit according to an embodiment of the present invention.

<Description of Reference Numerals for Main Parts of the Drawings>

100: control unit 102: GPS receiver

104: memory 106: operation panel input unit

108: display unit 110: speaker

112: voice input unit 202: statistical model learning unit

204: statistical information 208: variant generation unit

Claims

1. An apparatus for generating speech recognition target keywords in a navigation device, comprising:

a statistical model learning unit that analyzes the POI strings of a POI name DB to collect co-occurrences between nouns and builds their probability values and weights as statistical information; and

a variant generation unit that generates utterance variants of an input POI name using the built statistical information.

2. The apparatus of claim 1,

wherein the statistical model learning unit comprises:

a compound noun decomposition unit that decomposes the nouns constituting the POI string;

a language model generator that generates N-gram language model strings from the decomposed nouns;

a probability value calculator that calculates the counts and probability values of the generated N-gram language model strings and records N-gram probability information; and

a language model weight calculator that calculates the weights of the generated N-gram language model strings and records language model weight information.

3. The apparatus of claim 2,

wherein the N-gram probability information and the language model weight information

constitute the statistical information.

4. The apparatus of claim 1,

wherein the variant generation unit comprises:

a variant candidate generator that generates variant candidates by forming every possible combination of the decomposed nouns;

a language model generator that generates N-gram language model strings from the nouns constituting each variant candidate;

a variant candidate probability calculator that calculates the counts and probability values of the generated N-gram language model strings with reference to the statistical information DB generated by the statistical model learning unit, and records N-gram probability information;

a variant candidate filtering unit that removes a generated variant candidate from the variant results when it satisfies a predetermined condition; and

a variant synonym generator that generates variants by substituting the corresponding synonym for any noun, among the nouns constituting a variant candidate, that has a synonym relationship.

5. The apparatus of claim 4,

wherein the variant candidate filtering unit removes a variant candidate from the final variant results when any one of the following holds:

the probability value of the variant candidate is lower than a predetermined threshold;

the variant candidate matches a filtering pattern; or

the representative business name in the POI string does not appear in the variant candidate.

6. A method of generating speech recognition target keywords in a navigation device, comprising:

analyzing the POI strings in a POI name DB to collect co-occurrences between nouns and building their probability values and weights as statistical information; and

generating utterance variants of an input POI name using the built statistical information.

7. The method of claim 6,

wherein building the statistical information comprises:

decomposing the nouns constituting the POI string;

generating N-gram language model strings from the decomposed nouns;

calculating the counts and probability values of the generated N-gram language model strings and recording N-gram probability information; and

calculating the weights of the generated N-gram language model strings and recording language model weight information.

8. The method of claim 7,

wherein the N-gram probability information and the language model weight information

constitute the statistical information.

9. The method of claim 6,

wherein generating the utterance variants comprises:

decomposing the nouns constituting the POI string;

generating variant candidates by forming every possible combination of the decomposed nouns;

generating N-gram language model strings from the nouns constituting each variant candidate;

calculating the counts and probability values of the generated N-gram language model strings with reference to the statistical information DB generated by the statistical model learning unit, and recording N-gram probability information;

removing a generated variant candidate from the variant results when it satisfies a predetermined condition; and

generating variants by substituting the corresponding synonym for any noun, among the nouns constituting a variant candidate, that has a synonym relationship.

10. The method of claim 9,

wherein, in removing a variant candidate from the variant results,

the variant candidate is matched by the filtering pattern,

and the variant candidate is removed from the final variant results when any one of the above holds.