EA004079B1 - System and method of templating specific human voices

Info

Publication number: EA004079B1
Application number: EA200200587A
Authority: EA
Inventors: Стивен Дж. Киуг; Кэтрин Аксия Киуг
Original assignee: Стивен Дж. Киуг; Кэтрин Аксия Киуг
Priority date: 1999-11-23
Filing date: 2000-11-23
Publication date: 2003-12-25
Also published as: BR0015773A; AU2048001A; CN1391690A; ZA200204036B; NO20022406L; IL149813A0; CA2392436A1; KR20020060975A; NO20022406D0; EP1252620A1; JP2003515768A; EA200200587A1; AP2002002524A0; WO2001039180A1

Abstract

1. A system for capturing an enabling portion of a specific voice sufficient for using that portion as a template in further use of the voice, comprising: a. means for capturing an enabling portion of a voice in a form useful for analysis as to voice characteristics; b. analysis means for receiving and analyzing the captured voice and for characterizing elements of the captured voice as characterization data so that the characterization data is sufficient to uniquely characterize and re-create the captured voice as previously unspoken utterances by said captured voice; c. storage means for receiving characterization data from the analysis means for a specific voice; and d. retrieval means for retrieving the analysis and characterization data for further use in generating intelligible utterances sounding like, but never previously spoken by, the captured voice.
2. The system of claim 1 in which the means for capturing the voice comprises digital recording means.
3. The system of claim 1 in which the means for capturing the voice comprises a flash memory card.
4. The system of claim 1 in which the means for capturing the voice comprises analog recording means.
5. The system of claim 1 in which the means for capturing the voice comprises input means for receiving a live voice and for transmitting that live voice to the analysis means.
6. The system of claim 1 in which the analysis means comprises digital data storage means.
7. The system of claim 1 in which the analysis means comprises means for identifying specific patterns, syntax, frequency, pitch and tones of speech in the captured voice data.
8. The system of claim 1 in which the analysis means comprises means for identifying specific vocabulary, pronunciation, or accent unique to the captured voice.
9. The system of claim 1 in which the analysis means comprises means for identifying specific features unique to the captured voice deriving principally from specific anatomic structures of the originator of the voice.
10. The system of claim 1 in which the analysis means comprises means for determining the vocabulary of the originator of the captured voice.
11. The system of claim 10 in which the analysis means comprises means for setting the vocabulary as characterization data for use in forming a future templated voice.
12. The system of claim 1 in which the analysis means comprises digital processing apparatus for digitally processing input data in the form of a voice or digital representation of a recorded voice.
13. The system of claim 1 in which the analysis means comprises second input means for receiving additional data regarding the physiology of the voice originator.
14. The system of claim 13 in which the analysis means second input means comprises digital signal processor means suitable for selectively receiving audio or other data comprising visualization information on the morphology of the voice originator.
15. The system of claim 1 in which the analysis means comprises comparison means for comparing an input voice data set with stored data comprising age data, language data, educational data, gender data, occupation data, accent data, nationality data, ethnic data, voice type data, custom data and setting data.
16. The system of claim 1 in which the analysis means comprises third input means for receiving data regarding the voice originator comprising age data, educational data, gender data, occupation data, accent data, nationality data, ethnic data, voice type data, custom data, language data and setting data.
17. A method of creating a voice-like noise which is identical in sound to an actual specific human's voice, comprising the steps of: a. capturing an enabling portion of a specific human's voice for storage and use; b. storing the enabling portion of the specific human's voice; c. analyzing the enabling portion to identify essential components or characteristics of the captured voice; and d. utilizing the identified essential components or characteristics to create a new voice which, when assigned data from one or more database means and when heard, sounds identical in all respects to the voice of the specific human's voice to a listener having normal aural discretion abilities.
18. The method of claim 17 in which the analyzing step comprises the steps of identifying the components in the captured enabling portion of the specific human's voice relating to at least one of the components including frequency, tone, pitch, volume, accent, gender, harmonic structure, acoustic power, phonetic or timing accent, power and periodicity.
19. The method of claim 18 in which the step of capturing an enabling portion of a specific human's voice for storage and use includes capturing either larynx generated noise or turbulence generated noise of the specific human's voice.
20. A method of accurately replicating a human voice comprising the steps of: a. identifying a minimum size data set comprising a combination of words, sounds or phrases which must be emitted by the originator of a voice to be replicated; b. capturing the emission of the combination of words, sounds or phrases by the originator of the voice to be replicated in a medium; c. analyzing the captured emission to identify voice characteristics of the originator of the voice sufficient to allow artificial generation of the voice, using the identified characteristics, so that the artificially generated voice is substantially identical in all respects to a listener having normal aural discretion abilities when the listener hears the generated voice utilizing some language components not contained in the captured emission of the originator's actual voice.
21. An article of manufacture comprising: a. a computer usable medium having computer readable program code means embodied therein for causing replication of a human voice, the computer readable program code means in said article of manufacture comprising: b. computer readable program code means for causing a computer to effect an analysis of a captured enabling portion of an originator's voice to identify voice characteristics data sufficient to allow artificial generation of the voice; and c. computer readable program code means for causing use of the identified voice characteristics data to artificially generate a voice, so that the artificially generated voice is substantially identical in sound and usage to a listener when the listener hears the generated voice utilizing some language components not contained in the captured emission of the originator's actual voice.
22. The article of manufacture of claim 21 further comprising computer readable program code means for storing the generated voice for later use.
23. The article of manufacture of claim 21 further comprising computer readable program code means for using the voice characteristics data to create a voice profile of the originator of the voice.
24. The article of manufacture of claim 21 further comprising computer readable program code means for accessing data base means for storing data comprising age data, educational data, gender data, occupation data, accent data, language, nationality data, ethnic data, voice type data, custom data, general data and setting data.
25. A computer program product for use with an aural output device, said computer program product comprising: a. a computer usable medium having computer readable program code means embodied therein for causing replication of a human voice via an output aural device, the computer program product comprising: b. computer readable program code means for causing a computer to effect an analysis of a captured enabling portion of an originator's voice to identify voice characteristics data sufficient to allow artificial generation of the voice; and c. computer readable program code means for causing use of the identified voice characteristics data to artificially generate and output a voice via an aural output device, so that the artificially generated voice is substantially identical in sound and usage to a listener when the listener hears the generated voice utilizing some language components not contained in the captured emission of the originator's actual voice.
26. A computer program product for use with a display device, said computer program product comprising: a. a computer usable medium having computer readable program code means embodied therein for causing replication of a human voice and verification of the accuracy of the replicated voice displayed on the display device, the computer program product comprising: d. computer readable program code means for causing a computer to effect an analysis of a captured enabling portion of an originator's voice to identify voice characteristics data sufficient to allow artificial generation of the voice; and e. computer readable program code means for causing use of the identified voice characteristics data to artificially generate a voice and to compare the characteristics of the generated voice to the originator's voice on a display device, so that the artificially generated voice is substantially identical in sound to a listener when the display device so indicates and when a listener actually hears the generated voice utilizing some language components not contained in the captured emission of the originator's actual voice.
27. A computer program product for use with an aural output device, said computer program product comprising: a. a computer usable medium having computer readable program code means embodied therein for initiating replication of a human voice via an output aural device, the computer program product comprising: b. computer readable program code means for causing a computer to receive and activate a voice characteristics data file unique to a specific voice sufficient to allow artificial generation of the voice; and c. computer readable program code means for causing use of the identified voice characteristics data to artificially generate and ou

Description

Technical Field

Systems, methods and products for preserving and adapting sound, in particular the human voice.

Background of the Invention

Since ancient times, mammals and other creatures have communicated with one another in some form using voice or similar sounds. Indeed, such sounds are usually quite distinguishable because of differences in the morphology of living beings, even within a single species. These differences include highly distinctive elements of speech structure and tone. Unfortunately, the opportunity to hear the speech of a person whose voice is of particular interest is lost when that person dies or loses contact with the listener.

At present, only rudimentary forms of capturing sound on storage media have been developed for preserving voices. For example, a tape recorder or digital recorder is used to record a person's voice and thus preserve it for future listening and playback exactly as it was originally recorded, or for listening to selected parts of the original recording. These voice recording devices and methods also include a range of artificially synthesized, computer-generated voices that can perform various functions, including, for example, automated telephone services and verification, very primitive conversation between toys or equipment and a user, synthesized voices for films, use in the entertainment industry, and the like. In some applications, such artificial voices are pre-programmed with a narrow set of responses to specific input. Although an artificial voice of this kind can in some cases give more meaningful responses than a simple recording of a real voice, it remains simple compared with the capabilities of the real-sounding voice of the present invention. Indeed, some embodiments of the present invention include elements that differ substantially from such systems, or use known technology far beyond what is considered or even contemplated in known discoveries or innovations.

Many publications worldwide describe aspects of artificial vocalization. Similarly, some reference documents describe systems and technologies directed at using and creating artificial voice sounds. However, none of these reference documents describes the concepts disclosed in the present invention.

Summary of the Invention

The present invention provides systems and methods for recording, or otherwise capturing, a sufficient portion of the sound of a particular person's voice to form a model of the structure of that voice. This model is then used as a tool to construct new speech in exactly the same voice. The new speech was probably never actually spoken by that person, or never spoken in the exact context or with exactly those phrases, yet it sounds in all respects identical to that person's real speech. The sufficient portion is calculated so as to capture the elements of the real voice needed to reconstruct it, and a confidence assessment technique is available for calculating the limitations of the reconstructed or re-created speech when no sufficiently large speech sample is available on which to base the modeling. The new voice or voices can be used together with a database covering a particular topic and historical data, and with adaptive or artificial intelligence modules, to enable a new conversation to be constructed with the user as if the real owner of the modeled voice were present. Such a system and method can be used together with other storage media, such as program files, a utility program stored on a chip, or storage media in other forms. Interactive use of such a system and method can be organized in various ways. A single module may itself contain an entire embodiment of the present invention, for example in the form of a chip or electronic circuit configured to capture a voice and enable its use as described herein.
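The text does not say how the "sufficient portion" or its confidence assessment is computed. As an illustration only, one simple proxy is the fraction of a target inventory of sound units that the captured sample actually contains. The sketch below assumes that proxy. The unit labels and the function name `coverage_report` are hypothetical and do not come from the patent.

```python
def coverage_report(required_units, captured_units):
    """Return (confidence, missing) for a captured speech sample.

    ASSUMPTION: confidence is modeled as plain coverage of a required
    inventory of sound units. The patent does not specify its technique.
    """
    required = set(required_units)
    if not required:
        return 1.0, []  # nothing is required, so nothing can be missing
    seen = required & set(captured_units)
    missing = sorted(required - seen)
    confidence = len(seen) / len(required)
    return confidence, missing


# Hypothetical inventory and captured sample, for illustration only.
REQUIRED = ["AA", "IY", "UW", "S", "T", "M"]
CAPTURED = ["S", "AA", "T", "M", "AA"]

confidence, missing = coverage_report(REQUIRED, CAPTURED)
print(f"coverage confidence: {confidence:.2f}; still needed: {missing}")
```

A low score would indicate that speech generated from the model may be unreliable for utterances that depend on the missing units, which is one possible reading of the "limitations" the summary mentions.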

The model is used, for example, as a tool for capturing and creating new dialogues with people who cannot be reached directly, who may have died, or even with people who have given permission for their voices to be modeled and used in this way. Another example is the use, in media such as films, photographs or other visual media, of the real voice (or voices) of the actual owner to create, on request, a virtual dialogue with that owner. Various other uses and applications are contemplated within the scope of the present invention.

Brief Description of the Drawings

FIG. 1 is a flow chart of one embodiment of the operating sequence of a system in accordance with the present invention.

FIG. 2 is a diagram of one embodiment of a voice capture subsystem.

FIG. 3 is a diagram of one embodiment of a voice analysis subsystem.

FIG. 4 is a diagram of one embodiment of a voice characterization subsystem.

FIG. 5 is a diagram of one embodiment of a voice modeling subsystem.

FIG. 6 is a diagram of one embodiment of a subsystem for packaging the voice model signal.

FIG. 7 is a diagram of one embodiment of a system in accordance with the present invention used with remote information download options.

FIG. 8 is a top view of an example of one embodiment of the present invention as a mobile, compact component.

FIG. 9 is a top view of an example of one embodiment of the present invention using a source of visual information on a storage medium.

Detailed Description of the Invention

The voice is a sound of extremely high importance to mammals. A child recognizes the mother's voice, which soothes the child, even before birth, and the sound of a grandfather's voice reduces fear even in adults. Other voices can influence complete strangers or evoke in lovers memories of long-past events and moments. These are just a few examples of a great gift: the ability of humans and other mammals to recognize, and to influence others (and themselves) with, that most distinctive of sounds, the voice of each living being. In humans, for example, the distinctive character of a person's voice derives from the genetic contribution of the parents, expressed in the shape, size and position of the various parts of the body that affect the sound of the voice when the person speaks or otherwise communicates vocally or through the oral and nasal passages. Other factors also have an influence. It is therefore understandable that people differ markedly from one another, often even within one family. Indeed, even the voice of one and the same person may sound somewhat different because of temporary influences such as state of health, stress level, emotional state, fatigue, ambient temperature or other factors.

However, it is generally accepted worldwide that the quality of a person's voice is a highly distinctive combination that people who have heard the voice before can recognize. The human capacity to form associations through the senses is exceptional, particularly where those senses concern identifying and associating with a person's voice. Major and minor life events are often recalled years or decades later through the nature of the remarks made or the tone remembered. Such is the enduring power and emotional energy of the voice.

Methods of capturing and reproducing the human voice on various media and machines are well known. For many decades, recorded human voices have been manipulated, both intentionally and unintentionally, on tape and on digital media. However, such manipulation has generally been limited to what the person actually expressed, not what the person could have expressed. For example, segments of a person's actual utterances have been played back, edited, mixed and replayed, sometimes even at different speeds. Other examples of using the human voice include playing segments of a deliberately distorted voice, such as might be used in cartoons or for other sound effects in animation or some kinds of music. Animation has of course also used artificial voices not necessarily based on a real voice. One example is the computer-generated operator voice used in some telephone and communication systems. One method of synthesizing voices and sounds, called the concatenative method, records samples of the waveforms of a person's real speech. The pre-recorded speech is then divided into segments, and utterances are generated by joining these segments to build syllables, words or phrases. The size of the segments varies. Another method of synthesizing human speech is known as parametric. It uses mathematical models to re-create the required speech sounds: for each required sound, a mathematical model or function generates that sound. A purely parametric method therefore does not, in general, use sounds uttered by a person as building blocks of speech. Finally, there are several well-known types of parametric speech synthesizer. One is the articulatory synthesizer, which mathematically models the physical aspects of the human lungs, larynx, and vocal and nasal tracts. Another is the formant synthesizer, which mathematically models the acoustic aspects of the human vocal tract.
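The two prior-art approaches described above can be contrasted in a short sketch. It is a toy illustration under stated assumptions, not the method of the present invention. The concatenative part uses sine tones as stand-ins for recorded speech segments and joins them with a linear crossfade. The formant part drives a cascade of second-order resonators with an impulse train. The sample rate, the formant frequencies and bandwidths, and all function names are assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed value for illustration


def tone(freq_hz, duration_s, rate=SAMPLE_RATE):
    """Stand-in for a recorded speech segment: a plain sine tone."""
    n = int(duration_s * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def concatenate(segments, overlap):
    """Concatenative sketch: join segments, crossfading `overlap` samples."""
    if any(len(seg) < overlap for seg in segments):
        raise ValueError("every segment must be at least `overlap` samples long")
    out = list(segments[0])
    for seg in segments[1:]:
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)  # weight of the incoming segment
            j = len(out) - overlap + i   # matching sample in the outgoing tail
            out[j] = (1 - w) * out[j] + w * seg[i]
        out.extend(seg[overlap:])
    return out


def resonator(signal, freq_hz, bandwidth_hz, rate=SAMPLE_RATE):
    """Second-order digital resonator modeling one formant peak."""
    t = 1.0 / rate
    c = -math.exp(-2 * math.pi * bandwidth_hz * t)
    b = 2 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2 * math.pi * freq_hz * t)
    a = 1 - b - c  # normalizes the gain at 0 Hz to 1
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def formant_vowel(pitch_hz, formants, duration_s, rate=SAMPLE_RATE):
    """Parametric sketch: impulse-train source through cascaded resonators."""
    n = int(duration_s * rate)
    period = max(1, round(rate / pitch_hz))
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, rate)
    return signal


# Rough, illustrative (frequency, bandwidth) pairs for an open "ah"-like vowel.
AH_FORMANTS = [(730, 90), (1090, 110), (2440, 170)]

joined = concatenate([tone(200, 0.05), tone(300, 0.05)], overlap=40)
vowel = formant_vowel(120, AH_FORMANTS, 0.05)
print(len(joined), len(vowel))
```

The contrast matches the text: `concatenate` can only rearrange material that was actually recorded, whereas `formant_vowel` generates sound purely from model parameters and uses no recorded speech at all.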

Other systems include means for recognizing a particular voice after the system in use has been trained on that voice. Examples include the various voice recognition systems that capture spoken language and convert it into text, such as those used in automatic dictation systems and similar systems. Other speech processing systems relate to biometrics and to the use of certain spoken words as codes or ciphers for security. None of these systems, methods, means, or other forms of disclosure uses the various inventions described herein, and these sources do not even describe the need for such technical innovations. There has long been a need for a system and method for preserving the voices of other beings in a dynamic and adaptive manner for future use and for the benefit of the voice's original source or of other people. There is also a need for systems and methods that perform and use such voice capture or profiling to present an articulate, or articulated, or fully authentic vocalization or voice matching the voice of a real person, in a way that the person may never have contemplated. Systems and methods designed to accomplish this have additional advantages that make them easy to use for all people, of virtually any level of education, any culture, or any language. In addition, there is a need for new methods, technologies, and business models, and for the implementation of devices and other means for creating and accessing particular voice models and then using those voice models for personal needs or personal wishes, in business, or for pleasure. Again, although significant advances have been made in voice technology, none of these past developments contemplates the aspects of the present invention; they merely underscore the new and hitherto unrecognized need for it.

FIG. 1 is a diagram of one embodiment of a system 10 designed to capture a portion of a particular voice sufficient for that portion to be used as a model in further use of the voice's characteristics. System 10 may be part of a portable device, such as a handheld electronic device; it may be part of a computing device the size of a laptop, notebook, or desktop computer; it may simply be part of an electronic circuit installed inside another device, or an electronic component or element designed for temporary or permanent placement in, or use with, another electronic element, circuit, or system; it may be, wholly or partly, computer-readable code or merely a logical or functional circuit in a neural system; or it may be formed as some other device or product, such as a distributed network-type system. In one embodiment, system 10 comprises input or capture means 15 for capturing or receiving a portion of a voice for processing and for constructing a voice algorithm or modeling means 19. The modeling means may be formed as a data stream, a data packet, a telecommunication signal, or software code means designed to define and re-create a particular voice, or as a set of voice characteristics organized for application to, or modeling onto, another arrangement of sound or speech so that it appears to be the voice of the original source. Other means of forming computer-readable program code, or other means for using data on identified voice characteristics to artificially generate a voice, are also contemplated within the scope of the present invention. The logic or rules of the algorithm or modeling means 19 are preferably formed from a minimum amount of voice input; however, each voice may require a different amount of voice input and other data to form an acceptable data set.

One embodiment of the present invention requires capturing and using a portion of a person's voice, for example from a short analog or digital recording or from real-time input of the live voice of the person being modeled. Indeed, a predetermined group of words can be compiled to optimize capture of the most relevant characteristics of the person's voice so that the voice can be reproduced accurately. Analysis means are provided for determining most efficiently which form of the supplied voice portion best represents the given person. Whether the voice data is captured and recorded in a single input or in a series of inputs, the speech information is stored in at least one part of storage means 22.

The voice data is analyzed in processor means 25 to identify the characteristics used in creating the voice model of a particular person. It is contemplated that the voice data may be sent directly to the processor means and need not first pass through storage means 22. An example of the interaction of the processor means, storage means, and modeling means is described below with reference to FIGS. 2-8. In one embodiment, after adequate voice data has been analyzed, the voice model is stored and is later recalled by processor means 25. For example, after a sufficient portion of voice AA has been captured, analyzed, and modeled (the model now being designated AAₜ), it is stored in storage means 22 (which may be located next to the other components, installed remotely, or used in a distributed manner at one or more locations) until a request for its use arrives. One example of such a request is one sent by a user of system 10 through representative input means 29 to use model AAₜ of voice AA in a newly created conversation in which a generated voice is used in place of the real live voice AA. This may occur in connection with, or using, one or more different databases, some of which are represented by situational database 33 and personal database 36. In turn, model AAₜ of voice AA is recalled and used as a shaping mechanism with certain other sounds to create a new conversational voice AA¹ that sounds exactly like voice AA of the original source and is built on the originally entered data after it has been shaped. Although the new voice AA¹ sounds in every respect like voice AA of the original source, it is in fact a voice created artificially by means of model AAₜ, in which the corresponding key to voice AA, analogous to a genetic code, is formed. Thus a sufficient portion of the actual voice allows system 10 to be encoded with a model that permits the captured voice to be re-created and used without limit, in virtually any manner the user requires. This is not merely a synthesis of previously spoken utterances of parts of voice AA electronically joined together by concatenation or formant techniques; rather, it is an entirely new voice that has been designed, produced, and assembled or constructed using the characteristics of the voice AA data (that is, the voice model or profile) and possibly other characteristics relating to the original source of voice AA.
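
The flow described above (capture, analysis in processor means 25, storage in storage means 22, and recall on a usage request) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class and function names are hypothetical, the two "characteristics" are toy statistics rather than real voice features, and no synthesis is performed. It only shows the order of operations, not the analysis the description has in mind.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class VoiceModel:
    """Hypothetical stand-in for a stored voice model (for example, the model of voice AA)."""
    speaker: str
    mean_level: float   # toy characteristic, not a real voice feature
    peak_level: float   # toy characteristic, not a real voice feature


def analyze(speaker: str, samples: list[float]) -> VoiceModel:
    """Plays the role of processor means 25: reduce captured samples to characteristics."""
    if not samples:
        raise ValueError("no captured voice data to analyze")
    levels = [abs(s) for s in samples]
    return VoiceModel(speaker, mean(levels), max(levels))


class ModelStorage:
    """Plays the role of storage means 22: keeps models until a usage request arrives."""

    def __init__(self) -> None:
        self._models: dict[str, VoiceModel] = {}

    def store(self, model: VoiceModel) -> None:
        self._models[model.speaker] = model

    def recall(self, speaker: str) -> VoiceModel:
        # Raises KeyError if no model was stored for this speaker.
        return self._models[speaker]


def handle_request(storage: ModelStorage, speaker: str, text: str) -> dict:
    """Recall the stored model and pair it with text that was never spoken.

    Actual voice generation is out of scope; the result only bundles what a
    generator would need.
    """
    model = storage.recall(speaker)
    return {"speaker": model.speaker, "text": text, "model": model}
```

In this sketch the captured samples could equally be passed straight to `analyze` without being stored first, which mirrors the remark that voice data need not pass through storage before analysis.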

It is recognized, of course, that such technology will have very significant consequences and that precautions will be needed to ensure the proper use of this voice modeling technology. Indeed, the technology may additionally require authorization means so that only authorized users are permitted to access and use the voice modeling technology and data. It may also be necessary to create means for verifying whether an audible voice is real or modeled, to protect against fraudulent or unauthorized use of such a synthesized voice. Legal mechanisms may need to be created to address this area of technology, in addition to the licensing, contract, and other mechanisms that currently exist in most countries.
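
As a rough illustration of the authorization means mentioned above, the following sketch gates access to a voice model behind an explicit per-user grant. The names `AccessDenied` and `AuthorizationMeans` are hypothetical and the scheme is an assumption; the description does not specify how authorization would be implemented.

```python
from __future__ import annotations


class AccessDenied(Exception):
    """Raised when a user asks for a voice model without a grant."""


class AuthorizationMeans:
    """Hypothetical registry of which users may use which voice models."""

    def __init__(self) -> None:
        # Maps a voice identifier (such as "AA") to the users allowed to use it.
        self._grants: dict[str, set[str]] = {}

    def grant(self, user: str, voice: str) -> None:
        self._grants.setdefault(voice, set()).add(user)

    def revoke(self, user: str, voice: str) -> None:
        self._grants.get(voice, set()).discard(user)

    def check(self, user: str, voice: str) -> None:
        """Return silently if access is allowed, otherwise raise AccessDenied."""
        if user not in self._grants.get(voice, set()):
            raise AccessDenied(f"{user!r} is not authorized to use voice {voice!r}")
```

A caller would invoke `check` before recalling or using a model, so that an unauthorized request fails before any voice data is touched.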

In FIG. 1, communication means 41 represents the energy or data flow paths between the elements of the system, which may be physical conductors, optical channels, or other electronic, biological, or otherwise activated channels. In one embodiment, power means 44 is shown within system 10, but it may also be located remotely if necessary.

In another embodiment of system 10, the algorithm, signal, code means, or model that is created may be returned, in whole or in part, to storage means 22, to modeling means 19, or to another component or architecture of the system for storage or refinement. This capability allows the model of a particular voice to be improved or adapted according to the instructions of the developer or another user. This may be done, for example, when multiple sets of voice data from the same person are entered over time, or when aging, development, or other changes occur in the physiology or character of the original source of the voice. Indeed, a modeled voice can be trained to recall the context of previous sessions and to incorporate that knowledge into future operations. In these cases it may be useful to select an enhancement mode for reproducing voice model AAₜ (voice AA¹) and for refining the voice or the model by comparison and updating, using analysis means 22 or input means 29. Another example involves finding a person with a voice BB that has one or more voice characteristics similar to those of voice AA, the voice of the original source for model AAₜ. In this case it may be useful to introduce one or more such similar characteristics from voice BB, as limited or general input data, to improve voice AA¹ or model AAₜ. It thus also becomes possible to store voice BB and to create voice BB¹ and voice model BBₜ, each of which can be used in the future. Another example involves creating a database of variously enhanced voices for a single original source, to be used as needed as the system or the user deems appropriate to the situation at hand. In yet another example, a service may be offered that searches for matching voices and provides appropriate enhancement tools, such as naturally or artificially generated vibrations or other acoustic or signal elements, to improve a voice model as the user wishes.
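
The refinement of a voice model from several data sets entered over time can be sketched as a running average over named characteristics. This is only one possible reading, chosen for simplicity: the `VoiceModel` class, the `refine` function, and characteristic names such as `pitch_hz` are hypothetical, and the description does not say which update rule would be used.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VoiceModel:
    """Hypothetical model: averaged characteristics plus how many sessions fed each one."""
    values: dict[str, float] = field(default_factory=dict)
    counts: dict[str, int] = field(default_factory=dict)


def refine(model: VoiceModel, session: dict[str, float]) -> VoiceModel:
    """Return a new model with one more capture session folded in.

    Each characteristic keeps its own running mean, so a characteristic that
    first appears in a later session is not diluted by earlier sessions.
    """
    values = dict(model.values)
    counts = dict(model.counts)
    for name, value in session.items():
        n = counts.get(name, 0)
        values[name] = (values.get(name, 0.0) * n + value) / (n + 1)
        counts[name] = n + 1
    return VoiceModel(values, counts)
```

The same function could fold in selected characteristics from a similar voice, such as voice BB, by passing only those characteristics as the session data; whether and how strongly to weight such borrowed data is left open here.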

Before describing other embodiments of system 10 or of similar systems and methods, it is useful to consider possible applications of this technology. In general, the uses are so numerous that it is difficult to list them all. It should be understood, however, that any use of voice-like sounds generated from data entered into the model and data produced by the model, or by a coding tool for creating voice-like sound, is considered to be within the scope of the present invention, in particular when such a coding tool is used with other sound-generating means to re-create the sound of a voice that is virtually identical to the voice of the original source. Use of the generated voice in entirely new sentences or other language structures is likewise considered to be within the scope of the present invention. The ability to create a machine, component, or other computer-readable code means as part of forming or transmitting a signal of the voice modeling process or product further enables use of the present technology. Means for linking or using such voice modeling or voice generation technology with streamed data or other forms of data allow a virtual dialogue to be formed that may be adaptive and intelligent, or merely informational or responsive, with the dialogue or conversation conducted in voices selected by the user. It is further contemplated that the technology described here may be used with visual images as well as with audible sounds.

It is further contemplated that the voice model described here may be created using data that does not include an actually available portion of the original source's voice, and that an available portion of the original source's voice may instead be used, possibly with other data, to confirm the accuracy with which the voice has been duplicated. It thus becomes possible either to use a sufficient portion of the voice to model the voice or simply to use it to confirm the accuracy of a voice modeled by other means. A modeled or copied voice may be used to interact with, or to give instructions to, users of computers or other machines and systems. The user may select such a modeled voice from his or her own library of modeled voices or from another source of modeled voices, or may simply create a new voice. For example, modeled voice AA¹ may be selected by the user to deliver voice mail prompts, to read texts, or for another communication interface, while modeled voice CC may be selected for use in an interactive entertainment program. Faults or problems arising in the user's machine, or warnings to the user of a device, may be identified or resolved by the user in interaction with modeled voice DD. These are simple examples of how the technology enables an improved user interface and user association with functions, tasks, modes, or other features through the use of voice modeling technology. Selection and use of the model, and creation and use of the generated voice, may take place in the user's machine or device, partly in the user's machine or device, or outside it. There may also be cases of only temporary use of one or more devices, for example in a hotel room, when visiting an office, or elsewhere, or temporary use of a device that nevertheless provides the features described above in their various forms. For example, a traveler may wish to carry, or to have access to, certain voices to accompany him or her during a flight or in a hotel room. The present invention may be used in hospital rooms, in shelters, or in other places. Such uses are made possible by one or more of the embodiments described herein. Interestingly, the present system may also be used by some individuals with their own voice as a legacy for others. Many other uses are within the scope of the description given here.

Other uses of the present invention described here include education, for example acquainting children and others with historical events using selectively modeled voices. For example, if a parent wants a child to learn about race relations in the United States of America in the 1960s in the voice of one of the child's deceased grandparents, the modeled voice of the chosen grandparent (if available) could be designed, produced, and made available for use. System 10 could access one or more databases to gather information and knowledge on the relevant subject and supply that information to one or more databases within system 10, such as situational database 33, for use as needed. The grandparent's modeled voice EE¹ could be used to access the required information, the request could be answered in modeled voice EE¹, and a discussion of the desired topic would begin when required. Such a discussion may be stored for later use in system 10 or at a remote location if desired, or the discussion may take place interactively between the grandparent, that is, the modeled voice, and the child. This capability is made possible by a voice recognition module that must have information identifying the child's voice before the discussion begins, together with an adequate vocabulary and a neutral level of knowledge covering the various combinations of questions the child is likely to ask. In addition, a bridge may be created from the voice input and recognition module to the modeled-voice portion of the system so that the modeled voice can respond live. Various voice recognition tools configured for the new use described herein are thus contemplated. Such a configuration also, of course, requires rapid search means for answering questions and for formulating an answer appropriate to the level of the listening child. This example clearly illustrates the particular potential of the present technology, especially when it is combined with appropriate data and with adequate system power and speed.

Alternatively, with an optional voice recognition module it is possible to offer only limited features that let the listener direct the generated voice to stop or continue, or to invoke certain other features by means of specific commands. This is a limited form of interactive mode, suitable for some but not all types of use. Even if the user chooses not to use the additional features and instead simply plays a story or discussion in the voice of an absent grandparent, the effect and usefulness would be great for this and other types of use.
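The limited command mode described above can be sketched as a small lookup from recognized words to playback states. This is only an illustration: the `PlaybackState` enum, the `COMMANDS` table and the `handle_command` function are hypothetical names, and the source mentions only stopping and continuing as example commands.

```python
from enum import Enum


class PlaybackState(Enum):
    PLAYING = "playing"
    PAUSED = "paused"


# Hypothetical command vocabulary; the source names only "stop" and "continue".
COMMANDS = {
    "stop": PlaybackState.PAUSED,
    "continue": PlaybackState.PLAYING,
}


def handle_command(state, recognized_text):
    """Return the new playback state; unrecognized words leave it unchanged."""
    return COMMANDS.get(recognized_text.strip().lower(), state)
```

Keeping the vocabulary in a table means further commands could be added without changing the dispatch logic, which fits the "some other features" the text leaves open.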

If the user wishes the templated voice to reflect only the education and life experience of the real owner of that voice, this can be achieved by applying various filters or modifiers. For example, the templated voice may again be the grandparent's voice selected above (templated voice EE¹), with a DATE DATA filter set to BEFORE DECEMBER 1963 for a discussion of race relations in the United States of America in the 1960s. The resulting discussion will not include information about events that occurred after that date. In this example the grandparent could not discuss the Voting Rights Act of 1965 or the urban riots of the late 1960s in that country. Similarly, various aspects of the data or of the templated voice itself can be adjusted, for example using the types of characteristic data shown in FIG. 4. It should be understood, however, that other adjustments are possible and are contemplated within the scope of the invention described herein, and that the above examples merely illustrate the capabilities of the technology.
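A date filter of the kind described above could look roughly like the following sketch. The `KnowledgeItem` structure and the `apply_date_filter` function are hypothetical; the source does not say how the filter is implemented, and "BEFORE DECEMBER 1963" is read here as "earlier than 1 December 1963".

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class KnowledgeItem:
    """One fact the generated voice could discuss (hypothetical structure)."""
    topic: str
    text: str
    event_date: date


def apply_date_filter(items, cutoff):
    """Keep only items whose event date falls strictly before the cutoff."""
    return [item for item in items if item.event_date < cutoff]


items = [
    KnowledgeItem("race relations", "March on Washington", date(1963, 8, 28)),
    KnowledgeItem("race relations", "Voting Rights Act signed", date(1965, 8, 6)),
]

# Assumed reading of the filter value "BEFORE DECEMBER 1963".
filtered = apply_date_filter(items, date(1963, 12, 1))
```

With this cutoff only the 1963 item remains, so the 1965 legislation is excluded from the discussion, as in the example in the text.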

In another embodiment of the systems and methods described herein, a user may have the templated voice of a loved one or another person read to the user. This makes it possible to have books read to people of any age in the voice of an absent or deceased family member or other person known to the user. Combined with a large set of properly configured media and computer-readable code means for linking data, this feature alone would be of great benefit to users. This type of use has broad applications beyond the specific example above. Indeed, a still broader use of the technology is to provide access to a database of authorized, templated voices that others can access and use for a fee or other form of compensation. The technology also has profound implications for music, particularly if a person can access templated voices from the past, including those of popular singers, many of whose voices are still available for templating. Such technology makes possible a new industry in the production, rental, purchase or other use of voice templates and in the related tools, technologies and business methods.

The present invention may also find application in the medical treatment of certain minor or major psychological conditions, where proper use of templated-voice therapy may have a substantially palliative or even therapeutic effect. A further possible use of the technology is to create a newly designed voice that is based on, or has as a precursor, one or more templated voices of existing real owners. Exclusive use of the newly created voice can be controlled by various means or legal instruments, such as licensing or royalties. Of course, such voices can also be kept as private property for limited use by the developer.

One can imagine the nature of libraries created in this way. Such voices will represent the creative sounds of the developer, but each voice will in fact carry a stress component of a real mammalian voice as the basis for the templating tool or code, much as a tissue sample is the basis for DNA analysis, but applied to a particular voice. This type of combination offers powerful new possibilities for communication and relationships based on voice and other sounds produced by mammals.

Systems according to the present invention may be portable or of other sizes. They may be embedded in other systems or operate as standalone systems. The systems and methods described herein may form part or all of the elements of a distributed, networked or other remote system. They may use downloadable or remotely accessed data, and they may be used to control various other systems, methods or processes. Embodiments of the invention include open interface procedures for requesting and carrying out the methods and operations described herein, which may be performed in whole or in part by other operating systems or system applications. Voice templating and the use of templated voices may be performed and used by mammals or by artificial machines or processes. For example, a robot or other artificially intelligent assistant may create or use one or more templated voices of this type. Such an assistant may also be used to search automatically for voices according to certain general or limited criteria and then to generate templated voices in virtual or physical voice factories. Large databases of templated voices can thus be created efficiently. With this or similar systematic use, it may be desirable to create and use data or other types of marking and identification technologies for one or more portions of the real voices used to create a templated voice.

The following are examples of applications of the technology described herein. They are not intended to be limiting, but are presented as possible uses in addition to the embodiments described or contemplated in this description.

Example 1.

A templating process using elements of the embodiments described herein produces a voice-encoding signal containing a logical structure of characteristics of a specific voice sufficient to reproduce the sound of that voice accurately.

Example 2.

A computer-based prompting and update device, status reporter or assistant using one or more selected voices based on the technology described herein.

Example 3.

A home energy monitor, alert device or assistant using one or more voices based on the technology described herein.

Example 4.

A hotel room assistant or in-car assistant that gives the user prompts as needed, for example a hotel wake-up call in a voice selected by the user. Similarly, a driver may receive information delivered in a voice or voices selected by the user.

Example 5.

Use of one or more selected voices based on the technology described herein in a personal digital assistant, a portable personal-computer-based device or another electronic device or component, at any time, for voice capture, as an assistant, as a notification handler, and so on.

Example 6.

Creation or management of one or more selected voices or voice templates by means of computer or electronic chip logic, instructions or code means, for the business methods, technology and production options described herein.

Example 7.

Use of voice templating technology in combination with visual media such as photographs, digital video or holographic images.

Example 8.

Use of the technology described herein with a flash-memory-based profile card for insertion into any device that can record, play back or re-create a voice.

Example 9.

Use of the technology described herein with a personal device that scans and updates downloadable information for the user in a desired voice or voices of the user's choice. For example, this may be used to organize activities carried out by a bot, such as an information bot that performs background searches and communicates through an interface while the user is unavailable and then reports status to the user in one or more designated voices using the technology described herein.

Example 10.

Use of the technology described herein in combination with one or more components of an automobile or other transportation system.

Example 11.

Use of the technology described herein with one or more components of an aircraft in flight.

Example 12.

Use of the technology described herein as a safety reminder device in conjunction with one or more workplace tools or items of equipment, such as a posture monitor for a personal computer operator, electrical equipment, hazardous equipment, and so on.

Example 13.

Use of the technology described herein as a complement to other automated voice-driven systems, such as dictation devices, prompting devices, companion devices or text readers.

Example 14.

Use of the technology described herein as a social mediation or control mechanism, such as a tool against road rage or other forms of irritation and frustration, activated by the driver, automatically, or by other means.

Example 15.

Use of the technology described herein as a learning tool at home, at school or in the workplace.

Example 16.

Use of the technology described herein for inspirational readings.

Example 17.

Use of the technology described herein as a tool that serves as a family history recording machine.

Example 18.

Use of the technology described herein as MusicMatch™ brand technology to match voices to singers with the best or desired voices.

Example 19.

Use of the technology described herein as VoiceSelect™ brand technology to match films or video with preferred voices for templating entertainment scenarios in which a real performer was previously used, or which are re-created later with the aid of voice templating technology.

Example 20.

Use of the technology described herein as a portable alter ego device that supports SelectVoice™ or VoiceX™ brand mode(s) of operation and contains a database of images of the persons to whom the voices belong, as well as anonymous templates, which can be selected as in Example 7.

Example 21.

Use of the technology described herein to create a profile of a profiled or templated voice.

Example 22.

Use of the technology described herein as a reading device for children or as a night-time home assistant for controlling an interactive security system.

FIG. 2 is a flowchart of one embodiment of a voice capture subsystem, which may comprise computer-readable code means or a method for capturing, analyzing and using a voice AA to be templated. FIG. 3 shows one embodiment of a voice analysis subsystem, which may comprise logic means or a method for efficiently routing voice characteristic data. In these embodiments, voice AA is captured in acquisition module or step 103 and then passes through the logical steps and data paths of the templating process, such as path 106. Capture may be performed by digital or analog methods and components. The signal representing captured voice AA then passes through analysis means or method 111 to determine whether an existing voice profile or template matches voice AA. This may be done at analysis step 111, for example, by comparing one or more characteristics of the captured voice (such as those shown in voice characteristic subsystem 113 of FIG. 4), as determined by acquisition module 103 or analysis means 111, with known voice profiles or templates that are authorized for access. Representative feedback and initial analysis loop 114 supports these steps, so that processing proceeds along path 116. The comparison may include querying a voice profile database or other storage medium, local or remote. The analysis step in analysis module 111 and the operation of voice characteristic subsystem 113 may be repeated according to algorithmic, statistical or other techniques to confirm whether or not the voice under analysis matches an existing voice profile or data file. FIG. 4 shows the detailed structure of voice characteristic subsystem 113.
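The comparison against known profiles might be sketched as follows. Everything here is an assumption made for illustration: the source does not specify a similarity measure, so the 5 % per-feature tolerance, the 0.8 acceptance threshold, the numeric feature names and the function names are all hypothetical.

```python
def feature_similarity(captured, profile):
    """Fraction of shared numeric features that agree within 5 % (assumed tolerance)."""
    shared = captured.keys() & profile.keys()
    if not shared:
        return 0.0
    close = sum(
        1
        for name in shared
        if abs(captured[name] - profile[name]) <= 0.05 * max(abs(profile[name]), 1e-9)
    )
    return close / len(shared)


def find_matching_profile(captured, known_profiles, threshold=0.8):
    """Return the id of the best-scoring profile at or above the threshold, else None."""
    best_id, best_score = None, 0.0
    for profile_id, profile in known_profiles.items():
        score = feature_similarity(captured, profile)
        if score > best_score:
            best_id, best_score = profile_id, score
    return best_id if best_score >= threshold else None


# Hypothetical characteristic values for a captured voice and two stored profiles.
captured = {"pitch_hz": 120.0, "rate_wpm": 150.0}
known = {
    "AA": {"pitch_hz": 121.0, "rate_wpm": 152.0},
    "BB": {"pitch_hz": 210.0, "rate_wpm": 150.0},
}
match = find_matching_profile(captured, known)
```

A `None` result corresponds to the case where no existing profile matches and the voice goes on to full characterization; repeating the comparison with more features would be one way to realize the repeated analysis the text mentions.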

Referring again to FIG. 2, if the signal corresponding to voice AA has no match in, or is not identical to any of, the existing set of voice profiles, the signal passes to the voice characteristic subsystem for full characterization. If, however, an existing voice profile data file matches the profile signal of voice AA, creation of a new template in module/step 127 may be unnecessary. In that case the signal may still be analyzed and/or characterized to generate an updated profile or template, which may or may not then be stored or used. This can happen, for example, when additional characteristic data (such as an enabling portion, or the presence or absence of stress or other factors) becomes available that was not previously available. Accordingly, a particular voice data file may contain a plurality of templates. FIGS. 2 and 3 show a validation process comprising logical steps and system components, shown generally as validation subsystem 133. These figures show the arrangement of components in the subsystem only schematically. Furthermore, as shown in FIG. 3, after it is determined that a voice profile data file exists (step 137), a logical validation may be performed at step 139 if needed. If the existing template requires updating, this is done at step 142. Otherwise, logical step 145 records that no update of the existing template is needed. After step 143 or 145, the updated or previous voice profile or template is stored or used at step 155.
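The branching described in this paragraph can be summarized in a short sketch. The function name and the string labels are hypothetical; the step numbers are copied from the text, which gives the update step as 142 in one sentence and 143 in the next, so the label below simply follows the first mention.

```python
def process_captured_voice(match_id, has_new_data):
    """Return the ordered steps taken for one captured voice signal."""
    if match_id is None:
        # No existing profile matched: characterize fully and build a new one.
        return ["characterize voice", "create new profile (127)", "store or use"]
    steps = ["profile exists (137)", "validate (139)"]
    if has_new_data:
        steps.append("update existing profile (142)")
    else:
        steps.append("no update needed (145)")
    steps.append("store or use (155)")
    return steps
```

For instance, `process_captured_voice("AA", True)` walks the validate-then-update path, while `process_captured_voice(None, False)` takes the creation path.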

The template creation module/step 127 shown in the drawings uses the voice characteristic subsystem to create a unique identifier, preferably a digital identifier, for the particular voice being templated or profiled. This data is analogous to a genetic code, gene sequence codes or bar codes, and to similar identifiers of unique objects, beings or phenomena. Accordingly, the applicants refer to such a voice profile or template as Voice Template Technology™, as well as Voice DNA™ or VDNA™, and Voice Sequence Codes™ or Voice Sequence Coding™. The terms profile, profiles or profiling and their derivatives may be substituted for the above trademarks or other terms used as names and references for this new technology. Once template creation is complete, the voice template may be stored (shown as storage module or step 161) or used (shown as module or step 164).
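One possible way to derive a digital identifier from characterization data is to hash a canonical serialization of that data. This is an assumption made for illustration only: the source does not say how the identifier is computed, and SHA-256 over sorted JSON, the function name `template_identifier` and the example field names are all stand-ins.

```python
import hashlib
import json


def template_identifier(characteristics):
    """Derive a stable hex identifier from a dict of characterization data."""
    # Sorting keys makes the identifier independent of insertion order.
    canonical = json.dumps(characteristics, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical characterization data for one voice.
voice_id = template_identifier({"language": "en", "pitch_hz": 120.0})
```

A hash of this kind identifies one exact data set; it changes whenever any value changes, so a profile that is later updated would receive a new identifier unless a separate stable key is kept alongside it.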

FIG. 4 is a schematic representation of the voice characterization subsystem. This description provides at least one embodiment of the characterization data and means for identifying and characterizing the data that define a voice using the voice modeling or profiling disclosed herein. As shown, various types of data are available for comparison when the characterization data are formulated. The characterization data are then used to create the voice model or profile according to the coding criteria. Although the data in FIG. 4 appear to be arranged in separate modules, it may be preferable to use an open comparator process in which any of the data can be accessed for comparison in any of various sequences or weighted priorities. In any case, as shown in the drawing, the data may include the following categories, each shown as a voice characteristics (VC) output signal in a module or at a step:
VC₀ (module or step 201): language, gender, dialect, region, or accent;
VC₁ (module or step 203): frequency, pitch, tone, duration, or amplitude;
VC₂ (module or step 205): age, health, pronunciation, vocabulary, or physiology;
VC₃ (module or step 207): structure, syntax, volume, modulation, or voice type;
VC₄ (module or step 209): education, experience, phase, repetition, or grammar;
VC₅ (module or step 211): occupation, nationality, ethnicity, habits, or environment;
VC₆ (module or step 213): context, variations, rules/models, type of portion used, size, or quantity;
VC₇ (module or step 215): speed, emotion, grouping, analogies, or acoustic model;
VC₈ (module or step 217): mathematical model, processing model, signal model, sound similarity model, or sharing model;
VC₉ (module or step 219): vector model, adaptive data, classification, phonetics, or articulation;
VC₁₀ (module or step 221): segments, syllables, combinations, self-learning, or silence;
VC₁₁ (module or step 223): packets, breathing rate, timbre, resonance, or recurrence pattern;
VC₁₂ (module or step 225): harmonics, synthesis models, resolution, accuracy, or other characteristics;
VC_X (module or step 227): various other techniques for uniquely identifying a portion (a fragment or the whole) of a voice, which may additionally include, for example, digital or analog voice identification, modulation, and synthesis of input data or of other data generated or used for this purpose.

It should be noted that one or more data types from any one or more of the modules or steps may contribute a value to the voice model. In addition, for the purposes of the present invention, VC_X covers any known technique for classification during interpretation, whether or not it is mentioned in this description, provided that it is used to determine a unique voice profile or model for a specific voice and is used in accordance with the new developments described here. It should also be noted that the data combined in the voice characteristics files and the output signals VC₀, VC₁, VC₂, VC₃, VC₄, VC₅, VC₆, VC₇, VC₈, VC₉, VC₁₀, VC₁₁, VC₁₂, and VC_X can be given different priorities and combined in different ways for accurate and efficient analysis and characterization of the voice, with VC_X representing the additional techniques incorporated here by reference.
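The idea of giving the VC output signals different priorities can be illustrated with a weighted comparison. The sketch below is an assumption, not the method of the description: it supposes that each VC category has already been reduced to a single score in the range 0 to 1, and the function name `weighted_similarity` and the weight values are hypothetical.

```python
from typing import Dict


def weighted_similarity(
    candidate: Dict[str, float],
    reference: Dict[str, float],
    weights: Dict[str, float],
) -> float:
    """Return a score from 0 to 1 over the categories present in all three mappings."""
    shared = [key for key in weights if key in candidate and key in reference]
    total = sum(weights[key] for key in shared)
    if total <= 0:
        return 0.0
    score = 0.0
    for key in shared:
        # Per-category agreement, assuming category scores are normalized to [0, 1].
        agreement = 1.0 - min(1.0, abs(candidate[key] - reference[key]))
        score += weights[key] * agreement
    return score / total
```

Because only the categories present in all three mappings are compared, any subset of the VC signals can be used, and changing the weights changes the priority each category receives.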

FIGS. 5 and 6 show examples of a signal packetizer suitable for receiving the characteristics data of various voices, such as digital or encoded data representing the information considered relevant and formative for the modeled voice. The signal packetizer 316 combines the output of the signal content module or step 332, the values or quantitative scores of one or more of the signals VC₀ to VC_X, and the signal or code formats in module or step 343, as appropriate for proper transmission and use by various potential user interfaces, devices, or transmission media, to create an output voice model code, or signal VT_X. It should be understood that various methods can be used to create a unique identifier that distinguishes the various characteristics of a voice, and that these possibilities can be used within the broad context and scope of the present invention, which is therefore, to some extent, independent of the methodology used for certain components.
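As a rough illustration of what the signal packetizer combines, the sketch below bundles signal content, VC values, and a format label into one byte string. JSON is an assumed encoding chosen for readability; the description does not specify a wire format, and the names `package_voice_model` and `unpack_voice_model` are hypothetical.

```python
import json
from typing import Dict


def package_voice_model(
    content: str,
    vc_values: Dict[str, float],
    signal_format: str,
) -> bytes:
    """Combine content, VC values, and format into one output packet."""
    packet = {
        "content": content,            # output of the signal content module
        "vc": dict(vc_values),         # values of one or more VC signals
        "format": signal_format,       # signal or code format for the target
    }
    # Sorted keys give a stable byte representation for transmission.
    return json.dumps(packet, sort_keys=True).encode("utf-8")


def unpack_voice_model(packet: bytes) -> dict:
    """Recover the packet fields on the receiving side."""
    return json.loads(packet.decode("utf-8"))
```

A receiving device would call `unpack_voice_model` and read the `format` field to decide how to interpret the rest of the packet.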

FIG. 7 shows an arrangement and method for electronic request and transfer between a voice model generation or recording device 404 and a remote user. In this arrangement, sufficient portions of a voice can be sent to the remote voice model generation or recording device 404 by any number of different users 410, 413, 416. The device 404 then generates or retrieves a voice model data file and creates or retrieves a voice model signal. The model signal is then transmitted or downloaded to the user or to a person designated by the user, as shown at step 437. At the time of download, or later at the user's request 441, the model signal is formatted for use by the destination device, including activation protocol instructions, as shown at step or module 457.
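The generate-or-retrieve behavior of device 404 and the later formatting step can be sketched as two small functions. Everything named here is hypothetical: `handle_request`, `format_for_device`, the `voice_key` lookup, and the shape of the activation field are assumptions made for illustration, and the actual model generation is passed in as a callback.

```python
from typing import Callable, Dict


def handle_request(
    cache: Dict[str, dict],
    voice_key: str,
    voice_sample: bytes,
    generate: Callable[[bytes], dict],
) -> dict:
    """Retrieve a stored voice model, or generate one from the submitted sample."""
    if voice_key not in cache:
        cache[voice_key] = generate(voice_sample)
    return cache[voice_key]


def format_for_device(model: dict, device_type: str) -> dict:
    """Wrap a model for a destination device, with an assumed activation field."""
    return {
        "device": device_type,
        "activation": f"activate:{device_type}",  # placeholder protocol instruction
        "model": model,
    }
```

Keeping the formatting step separate from retrieval mirrors the text, where the model signal can be formatted either at download time or later at the user's request.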

FIG. 8 schematically shows a portable medium, such as a card, disk, or chip, that carries the components needed to use the voice modeling technology, depending on the mode of use and the requirements. For example, with reference to FIGS. 7 and 8, a hotel door card 477 can be produced when a traveler checks in at a hotel. In addition to the normal on-site programming of the security code and the use of the electronic components 479 mounted in the card, however, additional features embodying aspects of the present invention can be provided. Schematically, these additional features include means 481 for receiving and using a voice model of a selected voice, or of the traveler's own voice, for various purposes during the traveler's stay at the hotel. As shown, such features may include a model receiving and storage element 501, a sound generator or sound generator circuit 506, a central processing unit 511, an input/output (I/O) circuit 515, digital-to-analog/analog-to-digital (D-A) conversion elements 518, and a synchronization unit 521. Again, various other elements can be used, such as the voice compression or decompression means known from cellular telephone manufacturing, or other components that give the card the required function. The user can enjoy dialogue with, or use the interface of, inanimate devices installed in the hotel that speak in the voice or voices chosen by the traveler. Indeed, the traveler's profile may even store such voice preference information where appropriate, and this can be used to increase billable charges or to improve the conditions of the stay when the present invention is used. It is contemplated that the present invention can be used in a variety of applications and products, and the example shown in FIGS. 8 and 9 should not be construed as limiting.

FIG. 9 shows a photograph 602 configured for interactive use of the voice modeling technology, in which voice L is assigned to figure P_L and voice KK is assigned to figure P_KK. Frame 610 contains means or another structure, for example computer-readable code means or simply a three-dimensional material, for interfacing the persons or objects depicted in the photograph (or on other media) with the corresponding voice models, in order to recreate a dialogue that probably took place, or that could take place at the user's wish.

It should be understood that various means and methods exist for capturing, analyzing, and synthesizing the components of real and artificial voices. For example, the following United States patents, and the references cited in them, disclose several means for capturing, synthesizing, translating, recognizing, characterizing, or otherwise analyzing voices, and are incorporated here by reference in their entirety for that disclosure: 4,493,050; 4,710,959; 5,930,755; 5,307,444; 5,890,117; 5,030,101; 4,257,304; 5,794,193;

5,774,837; 5,634,085; 5,704,007; 5,280,527; 5,465,290; 5,428,707; 5,231,670; 4,914,703; 4,803,729; 5,850,627; 5,765,132; 5,715,367; 4,829,578; 4,903,305; 4,805,218; 5,915,236; 5,920,836; 5,909,666; 5,920,837; 4,907,279; 5,859,913; 5,978,765; 5,475,796; 5,483,579; 4,122,742; 5,278,943; 4,833,718; 4,757,737; 4,754,485; 4,975,957; 4,912,768; 4,907,279; 4,888,806; 4,682,292; 4,415,767; 4,181,821;

3,982,070; and 4,884,972. None of these documents discloses the contribution of the present invention as claimed or otherwise described here. Rather, the above patents describe tools that can be used, but are not required, in implementing one or more embodiments of the present invention. It should therefore be understood that various systems, products, means, methods, processes, data formats, data storage and transmission media, data content, and other aspects are contemplated within the present invention to obtain the new and non-obvious features, advantages, products, and applications of the technology described here. Accordingly, the above descriptions should be regarded as examples rather than limitations, so that the claims are given the broad scope to which this pioneering technology is entitled, without limitation as implementing technologies develop and offer new capabilities.

Claims

1. A system for capturing a portion of a specific voice sufficient for that portion to be used as a model suitable for the subsequent generation of previously unspoken words and phrases in further use of the voice, comprising

a) means for capturing a sufficient portion of the voice in a form suitable for analysis of the voice characteristics;

b) analysis means for receiving and analyzing the captured voice and for characterizing elements of the captured voice as characteristics data, such that the characteristics data are sufficient to uniquely characterize and recreate the captured voice in the form of utterances not previously spoken by said captured voice;

c) storage means for receiving the characteristics data from the analysis means for a specific captured voice; and

d) retrieval means for retrieving the analysis and characteristics data for further use in generating intelligible utterances that sound like the captured voice but were never previously spoken by it.

2. The system according to claim 1, characterized in that the means for capturing the voice comprises digital recording means.

3. The system according to claim 1, characterized in that the means for capturing the voice comprises a flash memory card.

4. The system according to claim 1, characterized in that the means for capturing the voice comprises analog recording means.

5. The system according to claim 1, characterized in that the means for capturing the voice comprises input means for receiving a live voice and for transmitting that live voice to the analysis means.

6. The system according to claim 1, characterized in that the analysis means comprises digital data storage means.

7. The system according to claim 1, characterized in that the analysis means comprises means for identifying specific patterns, syntax, frequency, pitch, and tones of speech in the captured voice data.

8. The system according to claim 1, characterized in that the analysis means comprises means for identifying specific vocabulary, pronunciation, or accent unique to the captured voice.

9. The system according to claim 1, characterized in that the analysis means comprises means for identifying certain properties unique to the captured voice that arise from the specific anatomical structure of the actual owner of the voice.

10. The system according to claim 1, characterized in that the analysis means comprises means for determining the vocabulary of the actual owner of the captured voice.

11. The system according to claim 10, characterized in that the analysis means comprises means for establishing vocabulary data and characteristics data for use in forming future simulated voices.

12. The system according to claim 1, characterized in that the analysis means comprises a digital processing device for digitally processing data input as a voice or as a digital representation of a captured voice.

13. The system according to claim 1, characterized in that the analysis means comprises second input means for receiving additional data relating to the physiology of the actual owner of the voice.

14. The system according to claim 13, characterized in that the second input means of the analysis means comprises digital signal processor means that selectively receives audio or other data containing imaging information on the morphology of the actual owner of the voice.

15. The system according to claim 1, characterized in that the analysis means comprises comparison means for comparing a set of input voice data with recorded data comprising age data, language data, education data, gender data, occupation data, accent data, nationality data, ethnicity data, voice type data, habits data, and environment data.

16. The system according to claim 1, characterized in that the analysis means comprises third input means for receiving data relating to the actual owner of the voice, comprising age data, education data, gender data, occupation data, accent data, nationality data, ethnicity data, voice type data, habits data, language data, and environment data.

17. A method of creating voice-like sounds that are identical in sound to the voice of a particular person, comprising the following steps:

a) capturing a sufficient portion of the particular person's voice for recording and use;

b) recording the sufficient portion of the particular person's voice;

c) analyzing the sufficient portion of the voice to identify the essential components or characteristics of the recorded voice; and

d) using the identified essential components or characteristics to create a new voice which, when supplied with data from one or more database means and then heard, sounds identical in all respects to the voice of the particular person to a listener with normal auditory discrimination.

18. The method according to claim 17, characterized in that the analysis step comprises identifying components in the captured sufficient portion of the particular person's voice relating to at least one of frequency, tone, pitch, volume, accent, gender, harmonic structure, acoustic power, phonetic or temporal emphasis, energy, and frequency.

19. The method according to claim 18, characterized in that the step of capturing a sufficient portion of the particular person's voice for recording and use includes capturing the sound generated by the larynx or the sound generated by turbulence in the voice of the particular person.

20. A method for accurately copying a person's voice, comprising the following steps:

a) identifying a minimum data set comprising a combination of words, sounds, or phrases that must be spoken by the actual owner of the voice to be copied;

b) capturing on a medium the utterance of the combination of words, sounds, or phrases by the actual owner of the voice to be copied;

c) analyzing the captured utterance to identify characteristics of the voice of the actual owner sufficient to enable artificial generation of the voice using the identified characteristics, such that the artificially generated voice is substantially identical in all respects, to a listener with normal auditory discrimination, when the listener hears the generated voice using some language components not contained in the captured utterance of the real voice of its actual owner.

21. A product comprising

a) a recording medium usable in a computer and having computer-readable program code means embodied therein for creating a copy of a person's voice, the computer-readable program code means in said product comprising

b) computer-readable program code means for performing a computer analysis of a recorded sufficient portion of the voice of its actual owner to identify voice characteristics data sufficient to enable artificial generation of the voice; and

c) computer-readable program code means for using the identified voice characteristics data to artificially generate the voice, such that the artificially generated voice is substantially identical in sound and use to the listener when the listener hears the generated voice using some language components not contained in the recorded speech of the real voice of its actual owner.

22. The product according to claim 21, characterized in that it further comprises computer-readable program code means for recording the generated voice for subsequent use.

23. The product according to claim 21, characterized in that it further comprises computer-readable program code means for using the voice characteristics data to create a voice profile of its actual owner.

24. The product according to claim 21, further comprising computer-readable program code means for providing access to database means for recording data comprising age data, education data, gender data, occupation data, accent data, language data, nationality data, ethnicity data, voice type data, habits data, general data, and environment data.

25. A computer program product for use with an acoustic output device, said computer program product comprising

a) computer-readable program code means embodied therein for copying a person's voice through the acoustic output device, the computer program product comprising

c) computer-readable program code means for using the identified voice characteristics data to artificially generate the voice and output it through the acoustic output device, such that the artificially generated voice is substantially identical in sound and use to the listener when the listener hears the generated voice using some language components not contained in the recorded speech of the real voice of its actual owner.

26. A computer program product for use with a display device, said computer program product comprising

a) computer-readable program code means embodied therein for copying a person's voice and verifying the accuracy of the copied voice as displayed on the display device, the computer program product comprising

b) computer-readable program code means for performing a computer analysis of a recorded sufficient portion of the voice of its actual owner to identify voice characteristics data sufficient to enable artificial generation of the voice; and

c) computer-readable program code means for using the identified voice characteristics data to artificially generate the voice and to compare the characteristics of the generated voice with the voice of its actual owner on the display device, such that the artificially generated voice is substantially identical in sound to the listener when the display device so indicates and when the listener actually hears the generated voice using some language components not contained in the recorded speech of the real voice of its actual owner.

27. A computer program product for use with an acoustic output device, said computer program product comprising

a) computer-readable program code means embodied therein for initiating the copying of a person's voice through the acoustic output device, the computer program product comprising

b) computer-readable program code means for receiving and activating, by the computer, a voice characteristics data file unique to a specific voice and sufficient to enable artificial generation of the voice; and

c) computer-readable program code means for using the identified voice characteristics data to artificially generate the voice and output it through the acoustic output device, such that the artificially generated voice is substantially identical in sound to the listener when the listener hears the generated voice and the recorded speech of the real voice of its actual owner.

28. A computer program product for use with an electronic device, said computer program product comprising

a) computer-readable program code means embodied therein for initiating the copying of a person's voice, the computer program product comprising

b) computer-readable program code means for receiving and activating a voice characteristics data file unique to a specific voice and sufficient to enable artificial generation of the voice; and

c) computer-readable program code means for using the identified voice characteristics data file and the sound output of sound generating means to artificially generate the voice, such that the artificially generated voice is substantially identical in sound to the real voice of its owner.

29. A storage device for storing data for access by a software application running on a data processing subsystem, comprising

a) a data structure stored in said storage device, said data structure including information stored in a database used by said software application and including

b) at least one data file of a sufficient part of the voice stored in said storage device, each such data file containing information essentially different from any other data file of a sufficient part of the voice;

c) a plurality of voice characteristic data files containing various reference information for a plurality of voice characteristics; and

d) a plurality of sets of voice profiles, each of which contains at least one voice profile data file containing data unique to this data file only, in which the data structure allows access to voice characteristic data files and voice profile data files to perform a comparison operation with at least one data file of a sufficient portion of the voice.

30. A data processing system executing a software application and containing a database used by said software application, said data processing system comprising

a) CPU means for processing said software application; and

b) storage means for storing a data structure for access by said software application, said data structure consisting of information stored in a database used by said software application and including at least one data file of a sufficient part of the voice recorded in said storage means, each such data file containing information essentially different from any other data file of a sufficient part of the voice;

a plurality of voice characteristic data files containing various reference information for a plurality of voice characteristics;

a plurality of voice profile sets, each of which contains at least one voice profile data file containing data unique to this data file only; and

c) in which the data processing system allows access to voice characteristic data files and voice profile data files to perform a comparison operation with at least one data file of a sufficient portion of the voice.
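
The data structure recited in claims 29 and 30 can be pictured with a minimal sketch such as the following. It is illustrative only and is not taken from the specification: the class names, the choice of numeric features, the use of a value range as "reference information", and the distance measure are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class EnablingPortionFile:
    """Data file for a sufficient (enabling) part of one voice. Hypothetical layout."""
    voice_id: str
    features: dict[str, float]


@dataclass
class VoiceProfile:
    """One voice profile data file holding data unique to that file."""
    profile_id: str
    features: dict[str, float]


@dataclass
class VoiceStore:
    """Gives access to characteristic reference data and to sets of voice profiles."""
    # Reference information per characteristic; here an assumed (low, high) range.
    characteristics: dict[str, tuple[float, float]] = field(default_factory=dict)
    # Named sets of voice profiles.
    profile_sets: dict[str, list[VoiceProfile]] = field(default_factory=dict)

    def _normalise(self, name: str, value: float) -> float:
        low, high = self.characteristics[name]
        return (value - low) / (high - low)

    def compare(self, sample: EnablingPortionFile, profile: VoiceProfile) -> float:
        """Mean absolute difference over shared characteristics (lower is closer)."""
        names = [
            n for n in sample.features
            if n in profile.features and n in self.characteristics
        ]
        if not names:
            raise ValueError("no shared characteristics to compare")
        total = sum(
            abs(self._normalise(n, sample.features[n])
                - self._normalise(n, profile.features[n]))
            for n in names
        )
        return total / len(names)

    def closest(self, sample: EnablingPortionFile, set_name: str) -> VoiceProfile:
        """Return the profile in the named set that best matches the sample."""
        return min(self.profile_sets[set_name], key=lambda p: self.compare(sample, p))


# Usage with made-up values.
store = VoiceStore(
    characteristics={"pitch_hz": (50.0, 450.0), "rate_wpm": (80.0, 280.0)},
    profile_sets={"demo": [
        VoiceProfile("A", {"pitch_hz": 110.0, "rate_wpm": 150.0}),
        VoiceProfile("B", {"pitch_hz": 220.0, "rate_wpm": 180.0}),
    ]},
)
sample = EnablingPortionFile("sample-1", {"pitch_hz": 215.0, "rate_wpm": 175.0})
best = store.closest(sample, "demo")
```

The sketch keeps the three kinds of record separate, as the claims do, and exposes the comparison operation as a method on the structure that holds them.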

31. A computer data signal embodied in a transmission medium, comprising

a) an encoding source code for a unique voice profile model used to encode additional electronic sound to create a specific generated voice; and

b) the encoding source code being located at a specific location and configured so that it can be removed from the storage medium for use as a key to generate the generated voice.

32. A method of using the selected voice as a personal voice assistant with an electronic device, comprising the following steps:

a) activating electronic means for accessing a remote database;

b) transmitting a portion of the signal to a remote database containing a voice database containing a plurality of sets of voice profiles, each of which has at least one voice profile data file containing data unique to this data file and identified by a unique identifier;

c) transmitting a portion of the signal to a remote database to uniquely identify the desired data file and then to transfer the contents of the data file to a user-defined location of the electronic device; and

d) using the selected and transmitted data file as a voice model, in combination with the corresponding sound generated either by the electronic device or by other means designed to generate such a sound, so that the user can, as required, receive sound from the electronic device in the form of the sound of the selected voice, as determined by the identified voice.

33. The method according to p, characterized in that the data file includes data characteristics of the selected voice, arranged in the form of software code read by a computer, for using data characterizing the identified voice for artificially generating a voice model.

34. The method according to p, characterized in that the embodiment includes the use of identification means in order to allow only identified users to access and use voice modeling technology and data.

35. The method according to p, characterized in that the embodiment includes the use of means of verification with selective access to verify that the audible voice is either a real voice or a generated model.

36. A method of conducting business operations that uses a system designed to capture part of a specific voice, sufficient to use this part as a model suitable for the subsequent generation of previously unspoken words and phrases in further use of the voice, containing the following steps:

a) recording a sufficient portion of the voice in a form usable for analysis of the characteristics of the voice;

b) entering a sufficient part into the analysis module to obtain characterizing elements of the recorded voice as the characteristics data so that the characteristics data are sufficient to uniquely characterize and recreate the recorded voice in the form of previously unspoken words and phrases in the specified recorded voice;

c) receiving characteristic data from the analysis module for a particular captured voice; and

d) recording these characteristics for later use in the generation of words and phrases that sound like the captured voice but have never been spoken by it.

37. The method according to claim 36, wherein the means for recording voice contains a means of digital input.

38. The method according to claim 36, wherein a sufficient portion of the voice is received in electronic form.

39. The method according to claim 36, wherein the characteristic data is packaged to form a voice model signal that is combined with the generated sound to create a simulated voice that sounds just like a specific real voice.

40. The method according to claim 36, wherein the simulated voice is controlled in such a way as to enable reception of voice input commands to pronounce, using the simulated voice, new words that were not, however, entered by the specific voice.
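
As a rough illustration of steps a) to d) of the method of claim 36 and of the packaging mentioned in claim 39, the flow from a recorded portion to a stored, retrievable model record might be sketched as below. This is a toy under stated assumptions: the function and class names are hypothetical, the two statistics stand in for a real voice analysis, and JSON is used only as a convenient packaging format.

```python
import json
from statistics import mean


def analyze(samples: list[float], sample_rate: int) -> dict[str, float]:
    """Step b): derive characteristic data from a recorded portion.

    Placeholder statistics only; a real analysis would look at pitch,
    pronunciation, accent and similar characteristics.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Count sign changes between consecutive samples.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return {
        "mean_abs_amplitude": mean(abs(s) for s in samples),
        "zero_crossings_per_second": crossings * sample_rate / (len(samples) - 1),
    }


def package(voice_id: str, characteristics: dict[str, float]) -> str:
    """Pack characteristic data into a single model record (assumed format)."""
    return json.dumps(
        {"voice_id": voice_id, "characteristics": characteristics}, sort_keys=True
    )


class ModelStore:
    """Steps c) and d): receive the characteristic data and keep it for later use."""

    def __init__(self) -> None:
        self._records: dict[str, str] = {}

    def save(self, voice_id: str, record: str) -> None:
        self._records[voice_id] = record

    def load(self, voice_id: str) -> dict:
        return json.loads(self._records[voice_id])


# Usage: step a) is represented by an already recorded list of samples.
recorded = [1.0, -1.0, 1.0, -1.0]
store = ModelStore()
store.save("voice-001", package("voice-001", analyze(recorded, sample_rate=4)))
model = store.load("voice-001")
```

Note that the stored record holds only derived characteristics and none of the recorded samples, which is in the spirit of a model that differs from the specific words spoken.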

41. An automated device designed to capture a sufficient part of a specific voice and to use this part as a model for subsequent use of the modeled voice, containing

a) a receiving module designed to receive a sufficient portion of the voice in the form used for analysis to obtain voice characteristics;

b) an analysis module for receiving and analyzing the captured voice and for obtaining characterizing elements of the captured voice as characteristics data; and

c) a model generator module designed to automatically generate a voice model signal as a unique identifier for the acquired specific voice, but so that the specified voice model signal contains data different from the specific words expressed by the voice from which the enabling part was formed.

42. The device according to claim 41, characterized in that it further comprises a means of communication designed to communicate with the storage means for receiving characteristics data from the database.

43. The device according to claim 41, characterized in that it further comprises a means of communication designed to communicate with the storage means for storing the generated model until a request is received.

44. An interactive method for creating voice models and generating payment for such generation, comprising

a) recording a sufficient portion of a specific voice;

b) analyzing a sufficient portion of a particular voice to generate a data profile that determines the characteristics of the recorded voice so that it can be recreated for later use;

c) generating a voice model signal as a unique identifier for the acquired specific voice; and

d) providing at least one generated data profile for commercial use to others.

45. A method performed by a device designed to create a voice model and generate payment for such generation, containing

a) recording a sufficient portion of a specific voice;

b) analyzing a sufficient portion of a particular voice to generate a data profile that determines the characteristics of the recorded voice in such a way that it can be recreated for later use;

c) using the data profile, generating a voice model signal as a unique identifier for the captured specific voice; and

d) providing at least one voice model signal for commercial use.

46. A business method for creating a voice model, comprising

a) recording a sufficient portion of a specific voice or simulated voice;

b) the use of computer tools that analyze a sufficient portion of the voice to generate a data profile that determines the characteristics of the recorded voice so that it can be recreated for later use;

c) electronically generating or searching for a voice model signal as a unique identifier for the captured voice; and

d) providing at least one voice model for commercial use.

47. The method of conducting business operations according to claim 46, wherein the step of providing is performed by the exchange of electronic data.

48. A method of creating a model of multiple voices, containing

a) capturing a sufficient portion of each of a plurality of voices or simulated voices;

b) the use of computer tools that analyze sufficient parts of the voices to generate a data profile that determines the characteristics of the recorded voices in such a way that they can be arranged as a packet in the form of a single voice signal suitable for reconstitution for subsequent use; and

c) electronically generating a voice model signal as a unique identifier for a newly generated voice.