KR20020026228A - Real Time Speech Translation - Google Patents
- Publication number
- KR20020026228A (application number KR1020020011222A)
- Authority
- KR
- South Korea
- Prior art keywords
- input
- voices
- specific
- voice
- sound
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
The present invention relates to a method for converting voices or sounds received through multiple input channels into a specified sound, language, or text, simultaneously and in real time.
Description
The present invention seeks to provide a method for converting voices or sounds received through multiple input channels into a specified sound, language, or text, simultaneously and in real time. The technology concerns the real-time conversion of sound.
Figs. 1 through 3 show embodiments of real-time speech conversion according to the present invention.
* Description of reference numerals for the main parts of the drawings
1: voice  2: sound  11: interpretation  12: translation
Sound can be characterized by pitch, which rises and falls with the vibration frequency; by intensity, which follows the vibration amplitude and is proportional to the energy; and by timbre, which follows the structure of the vibration spectrum. In addition, each speaker's voice has distinctive features such as rhythm, accent, intonation, formants, and weakening, which let a listener recognize who is speaking.
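As an illustrative aside (not part of the patent), the three properties above can be estimated from a single audio frame with a short NumPy routine; `frame_features` and the spectral-centroid timbre proxy are assumptions made for this sketch.

```python
import numpy as np

def frame_features(frame, sample_rate):
    """Estimate pitch (dominant frequency), intensity (RMS amplitude,
    which scales with energy), and a crude timbre descriptor (spectral
    centroid, summarizing the spectrum's structure) for one frame."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    pitch = freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin
    intensity = np.sqrt(np.mean(frame ** 2))     # RMS amplitude
    centroid = np.sum(freqs * spectrum) / np.sum(spectrum)
    return pitch, intensity, centroid

# A 440 Hz tone sampled at 16 kHz for one second.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
pitch, intensity, centroid = frame_features(tone, sr)
```

With a one-second frame the FFT resolution is 1 Hz, so the pitch estimate lands on 440 Hz and the RMS of a 0.5-amplitude sine is about 0.354.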
The present invention classifies and stores these varied acoustic characteristics as digital data, per speaker or per object, and compares sound arriving in real time against the stored data to identify the speaker or object while simultaneously converting the input into the person's voice, language, or text that the user has designated in advance, outputting the result in real time. As shown in Fig. 1, the input may be a human voice, a natural sound, or text, and the voices of several people or the sounds of several animals may be input at the same time. Such sound or text may arrive from the Internet, a telephone, a mobile phone, a voice recorder, or any device that reproduces voice or sound, as shown in Fig. 3, and is processed through the input, encoding, conversion, and output stages shown in Fig. 2.
The input speech is analyzed for isolated-word, connected-word, and continuous-speech characteristics and is classified and stored as speaker-specific features. The conversion unit interprets the language being input in real time into a specified language, renders it as text, and converts it into a specified person's voice.
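The input, encoding, conversion, and output stages described above, together with speaker-specific classification, might be skeletonized as follows. This is a minimal sketch under stated assumptions: every class and method name is illustrative, the encoding and conversion bodies are trivial stand-ins, and exact-match feature comparison substitutes for real acoustic matching.

```python
class RealTimeConverter:
    """Hypothetical skeleton of the four-stage pipeline of Fig. 2."""

    def __init__(self, target_language="Korean", target_voice="speaker A"):
        self.target_language = target_language
        self.target_voice = target_voice
        self.speaker_db = {}   # feature vector -> speaker name

    def encode(self, utterance):
        # Stand-in for sampling/encoding the raw input.
        return utterance.lower().split()

    def classify_speaker(self, features):
        # Compare incoming features with stored data; register new
        # speakers on the fly (the speaker-adaptive behavior).
        if features not in self.speaker_db:
            self.speaker_db[features] = f"speaker-{len(self.speaker_db) + 1}"
        return self.speaker_db[features]

    def convert(self, words):
        # Stand-in for interpretation into the target language and
        # rendering in the target voice.
        return {"language": self.target_language,
                "voice": self.target_voice,
                "text": " ".join(words)}

    def process(self, utterance, features):
        code = self.encode(utterance)          # input + encoding
        speaker = self.classify_speaker(features)
        out = self.convert(code)               # conversion
        out["speaker"] = speaker
        return out                             # output

rtc = RealTimeConverter()
result = rtc.process("Hello world", features=(200.0, 0.35))
```

A second utterance with the same feature vector is attributed to the same registered speaker, mirroring the classify-then-convert loop the text describes.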
In this way, words that were never recorded can be spoken in a deceased person's voice, and when a foreign president visits and speaks, the simultaneous interpretation can be delivered in that president's own voice. Even if speaker A speaks in English, the sentences are translated and spoken in Korean in A's own voice, and are recorded as text in both English and Korean. Conversion into a variety of voices, languages, and scripts is possible.
In addition, when several people's speech is input at once, one person's speech can be tracked, and one person's speech can also be mixed with the speech of various other people for output.
Recognition and synthesis can operate in a speaker-dependent mode, tracking and recording a specific person's speech from among several speakers; in a speaker-independent mode, extracting the common content of several speakers rather than of one specific person; or in a speaker-adaptive mode, tracking a new speaker's speech and estimating its characteristics.
When the system hears a new person, it memorizes that person's characteristics and stores them in a database. For conversational characteristics it estimates a provisional optimum; when actual input corresponding to that provisional value arrives, the provisional value is discarded and the actual value is stored in its place, so the database is maintained in real time.
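The provisional-value scheme above can be sketched as a small store that flags each entry as estimated or observed; the class name, the mean-of-neighbors heuristic, and the feature names are assumptions for illustration, not details from the patent.

```python
class FeatureStore:
    """Sketch: unseen characteristics get a provisional estimate,
    which is discarded as soon as a real observation arrives."""

    def __init__(self):
        self._values = {}   # feature name -> (value, is_estimate)

    def estimate(self, feature, neighbors):
        # Provisional optimum: here simply the mean of known values
        # for similar features (an assumed heuristic).
        guess = sum(neighbors) / len(neighbors)
        self._values.setdefault(feature, (guess, True))
        return guess

    def observe(self, feature, actual):
        # A real measurement always supersedes an estimate.
        self._values[feature] = (actual, False)

    def get(self, feature):
        return self._values[feature]

store = FeatureStore()
store.estimate("pitch", [180.0, 220.0])   # provisional value: 200.0
store.observe("pitch", 195.5)             # real input replaces it
```

After the real observation, the stored entry carries the actual value and is no longer marked as an estimate, matching the replace-on-arrival behavior described in the text.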
Through this invention, conversion is possible:
- from multiple voices to multiple modulated voices;
- to the modulated single voice of a selected person;
- to multiple interpreted voices;
- to the interpreted single voice of a selected person;
- to the text of several people;
- to the text of a specific person;
- to the translated text of several people;
- to the translated text of a specific person;
- from one person's voice to a modulated version of that voice;
- to an interpreted voice; and
- to the text of a specific person.
Therefore, in the present invention, when one or more voices or texts are input in real time, the input is sampled and quantized for digital coding, and the resulting code is segmented into unit-time frames.
In the step of inputting, storing, and comparing the characteristics of unit tones, the features of objects and speakers can be extracted through Fourier transform and spectrum analysis, then compared and recognized.
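A minimal sketch of that step, assuming coarse spectral-band averages as the feature and cosine similarity as the comparison metric (neither is specified by the patent; `spectral_signature` is a hypothetical helper):

```python
import numpy as np

def spectral_signature(signal, n_bands=16):
    """Reduce a unit-time frame to a coarse spectral signature via the
    Fourier transform: average magnitude in each frequency band,
    normalized to unit length for comparison."""
    spectrum = np.abs(np.fft.rfft(signal))
    bands = np.array_split(spectrum, n_bands)
    sig = np.array([b.mean() for b in bands])
    return sig / (np.linalg.norm(sig) + 1e-12)

def similarity(sig_a, sig_b):
    # Cosine similarity between a stored and an incoming signature.
    return float(np.dot(sig_a, sig_b))

sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 200 * t)     # low-pitched source
high = np.sin(2 * np.pi * 2000 * t)   # high-pitched source
noisy_low = low + 0.01 * np.random.default_rng(0).standard_normal(sr)

# The same source matches itself far better than a different one.
same = similarity(spectral_signature(low), spectral_signature(noisy_low))
diff = similarity(spectral_signature(low), spectral_signature(high))
```

Because the two tones concentrate their energy in different bands, the self-match score stays near 1 while the cross-match score stays near 0, which is the recognition-by-comparison behavior the step describes.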
inputting, storing, and comparing the characteristics of sentences composed of a series of continuous voices;
extracting, inputting, storing, and comparing the speaker's conversational characteristics from unit tones and a series of continuous voices;
interpreting the input language into a specified language;
translating the input text into a specified script;
converting the input language into specified text;
transforming the input sound into a specified sound;
transforming the input voice into a specified voice;
distinguishing a specific sound from one or more input sounds;
when the required converted output is difficult to obtain from the already-stored data and the data currently being input, inferring and outputting the best-matching step above, while recording each characteristic's inferred and actual outputs in a database so that, as soon as an actual value is input, the inferred value is immediately replaced with the actual value; and
outputting the result of the interpretation, translation, transformation, and selection-from-many operations described above.
Together these steps constitute the real-time speech processing method.
With the present invention, sound or voice that is input in real time can be output in the voice, language, or script that the user desires.
Claims (2)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020020011222A KR20020026228A (en) | 2002-03-02 | 2002-03-02 | Real Time Speech Translation |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020020011222A KR20020026228A (en) | 2002-03-02 | 2002-03-02 | Real Time Speech Translation |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| KR20020026228A true KR20020026228A (en) | 2002-04-06 |
Family
ID=19719564
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| KR1020020011222A Ceased KR20020026228A (en) | 2002-03-02 | 2002-03-02 | Real Time Speech Translation |
Country Status (1)
| Country | Link |
|---|---|
| KR (1) | KR20020026228A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20190101681A (en) * | 2018-02-23 | 2019-09-02 | (주)에어사운드 | Wireless transceiver for Real-time multi-user multi-language interpretation and the method thereof |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH05113795A (en) * | 1991-05-31 | 1993-05-07 | Oki Electric Ind Co Ltd | Voice synthesizing device |
| JPH08335096A (en) * | 1995-06-07 | 1996-12-17 | Oki Electric Ind Co Ltd | Text voice synthesizer |
| KR19980065482A (en) * | 1997-01-10 | 1998-10-15 | 김광호 | Speech synthesis method to change the speaking style |
| KR20020006172A (en) * | 2000-07-11 | 2002-01-19 | 이수성 | Interpreter |
| KR20020049061A (en) * | 2000-12-19 | 2002-06-26 | 전영권 | A method for voice conversion |
- 2002-03-02: KR application KR1020020011222A, published as KR20020026228A (not active, ceased)
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH05113795A (en) * | 1991-05-31 | 1993-05-07 | Oki Electric Ind Co Ltd | Voice synthesizing device |
| JPH08335096A (en) * | 1995-06-07 | 1996-12-17 | Oki Electric Ind Co Ltd | Text voice synthesizer |
| KR19980065482A (en) * | 1997-01-10 | 1998-10-15 | 김광호 | Speech synthesis method to change the speaking style |
| KR20020006172A (en) * | 2000-07-11 | 2002-01-19 | 이수성 | Interpreter |
| KR20020049061A (en) * | 2000-12-19 | 2002-06-26 | 전영권 | A method for voice conversion |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20190101681A (en) * | 2018-02-23 | 2019-09-02 | (주)에어사운드 | Wireless transceiver for Real-time multi-user multi-language interpretation and the method thereof |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR102769179B1 (en) | Synthetic data augmentation using voice conversion and speech recognition models | |
| CN112581963B (en) | Voice intention recognition method and system | |
| US5911129A (en) | Audio font used for capture and rendering | |
| KR100383353B1 (en) | Speech recognition apparatus and method of generating vocabulary for the same | |
| US7089184B2 (en) | Speech recognition for recognizing speaker-independent, continuous speech | |
| US8423354B2 (en) | Speech recognition dictionary creating support device, computer readable medium storing processing program, and processing method | |
| US20070088547A1 (en) | Phonetic speech-to-text-to-speech system and method | |
| Varga et al. | ASR in mobile phones-an industrial approach | |
| US20030120490A1 (en) | Method for creating a speech database for a target vocabulary in order to train a speech recorgnition system | |
| Bhatt et al. | Effects of the dynamic and energy based feature extraction on hindi speech recognition | |
| JP2004347732A (en) | Automatic language identification method and apparatus | |
| US5970454A (en) | Synthesizing speech by converting phonemes to digital waveforms | |
| US5987412A (en) | Synthesising speech by converting phonemes to digital waveforms | |
| Kurian et al. | Continuous speech recognition system for Malayalam language using PLP cepstral coefficient | |
| JPH0887297A (en) | Speech synthesis system | |
| JPH09146580A (en) | Effect sound retrieving device | |
| KR20020026228A (en) | Real Time Speech Translation | |
| KR200184200Y1 (en) | Apparatus for intelligent dialog based on voice recognition using expert system | |
| Maged et al. | Improving speaker identification system using discrete wavelet transform and AWGN | |
| JP2980382B2 (en) | Speaker adaptive speech recognition method and apparatus | |
| Khalifa et al. | Statistical modeling for speech recognition | |
| KR100393196B1 (en) | Speech recognition apparatus and method | |
| KR102116014B1 (en) | voice imitation system using recognition engine and TTS engine | |
| AU674246B2 (en) | Synthesising speech by converting phonemes to digital waveforms | |
| JP2658426B2 (en) | Voice recognition method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| A201 | Request for examination | ||
| PA0109 | Patent application |
Patent event code: PA01091R01D Comment text: Patent Application Patent event date: 20020302 |
|
| PA0201 | Request for examination | ||
| PG1501 | Laying open of application | ||
| E902 | Notification of reason for refusal | ||
| PE0902 | Notice of grounds for rejection |
Comment text: Notification of reason for refusal Patent event date: 20040519 Patent event code: PE09021S01D |
|
| E601 | Decision to refuse application | ||
| PE0601 | Decision on rejection of patent |
Patent event date: 20050222 Comment text: Decision to Refuse Application Patent event code: PE06012S01D Patent event date: 20040519 Comment text: Notification of reason for refusal Patent event code: PE06011S01I |
|
| J201 | Request for trial against refusal decision | ||
| PJ0201 | Trial against decision of rejection |
Patent event date: 20050318 Comment text: Request for Trial against Decision on Refusal Patent event code: PJ02012R01D Patent event date: 20050222 Comment text: Decision to Refuse Application Patent event code: PJ02011S01I Appeal kind category: Appeal against decision to decline refusal Decision date: 20060929 Appeal identifier: 2005101001579 Request date: 20050318 |
|
| AMND | Amendment | ||
| PB0901 | Examination by re-examination before a trial |
Comment text: Amendment to Specification, etc. Patent event date: 20050418 Patent event code: PB09011R02I Comment text: Request for Trial against Decision on Refusal Patent event date: 20050318 Patent event code: PB09011R01I |
|
| B601 | Maintenance of original decision after re-examination before a trial | ||
| PB0601 | Maintenance of original decision after re-examination before a trial | ||
| J301 | Trial decision |
Free format text: TRIAL DECISION FOR APPEAL AGAINST DECISION TO DECLINE REFUSAL REQUESTED 20050318 Effective date: 20060929 |
|
| PJ1301 | Trial decision |
Patent event code: PJ13011S01D Patent event date: 20060929 Comment text: Trial Decision on Objection to Decision on Refusal Appeal kind category: Appeal against decision to decline refusal Request date: 20050318 Decision date: 20060929 Appeal identifier: 2005101001579 |