DE1219093B

DE1219093B - Method and arrangement for generating artificial speech from digital data on speech stored in an electronic data processing system

Info

Publication number: DE1219093B
Application number: DEJ25600A
Authority: DE
Inventors: Dipl-Ing Dr Kurt Bandat; Dipl-Ing Ernst Rothauser
Original assignee: IBM Deutschland GmbH
Current assignee: IBM Deutschland GmbH
Priority date: 1964-04-06
Filing date: 1964-04-06
Publication date: 1966-06-16
Also published as: GB1059802A; CH433829A; SE316510B; AT247921B; NL6504101A

Abstract

1,059,802. Vocoder. INTERNATIONAL BUSINESS MACHINES CORPORATION. March 29, 1965 [April 6, 1964], No. 13167/65. Heading H4R. Digital data representing time multiplexed spectrum channel signals and excitation function signals of a channel vocoder are supplied from a data processing system EDV to a commutator switch S which feeds the individual channel signals to their respective synthesis channels 1 to n, each of which contains a digital-to-analogue converter DAW and the appropriate frequency filter FS. Switch S operates to feed one sample per channel during each excitation function period, and if this sample is applied to the filter as an amplitude modulated pulse the usual channel modulator becomes redundant. The various channel signals are added, amplified, and reproduced by a loudspeaker. The plurality of digital-to-analogue converters, one in each spectrum channel, may be replaced by one in the common line to the switch S. A buffer store may also be included in the common line to the switch S, Fig. 4 (not shown).

Description

FEDERAL REPUBLIC OF GERMANY

GERMAN PATENT OFFICE

AUSLEGESCHRIFT (published examined application)

Int. Cl.: H04b; H04m

German class: 21 a2 - 36/22

Number: 1 219 093
File number: J 25600 VIII a/21 a2
Filing date: April 6, 1964
Date of display: June 16, 1966

A number of methods for generating artificial speech have become known, using mechanical and electromechanical as well as purely electronic means. These methods are, however, less suitable for output devices of electronic data processing systems.

It is therefore the object of the invention to specify a method for generating artificial speech that is particularly suitable for data output in spoken form from data processing systems.

The invention accordingly relates to a method for generating artificial speech from digital data stored in an electronic data processing system, these data describing the instantaneous energy in n frequency sub-bands of the speech signal to be generated (channel energy values) and the time course of the energy excitation of the speech signal, the speech signal being excited by energy pulses defined in their position in time.

The method is characterized in that, at each of the instants given by the position in time of the excitation energy pulses to be generated, the data processing system applies n digital code groups representing the channel energy values one after another to a switch, which distributes one code group to each assigned channel of channels 1 to n; that in each channel a digital-to-analog converter produces an amplitude-modulated pulse from the code group; and that a band filter following each converter passes only those frequencies of the pulse that lie in the corresponding band, so that the sum of the filter output voltages yields an electrical speech signal.
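The method can be illustrated with a minimal discrete-time sketch. Everything specific in it is an assumption for illustration and not taken from the patent: the sample rate `fs`, the two-pole resonator standing in for each band filter, the function names `bandpass_resonator` and `synthesize`, and the simplification that all n channel pulses of one excitation instant land on the same sample (the sequential sweep of the switch is ignored here).

```python
import math


def bandpass_resonator(x, center_hz, bandwidth_hz, fs):
    """Two-pole resonator, a stand-in for one band filter (assumption)."""
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius sets the bandwidth
    theta = 2.0 * math.pi * center_hz / fs       # pole angle sets the centre frequency
    a1, a2 = -2.0 * r * math.cos(theta), r * r
    y = []
    for i, xi in enumerate(x):
        y1 = y[i - 1] if i >= 1 else 0.0
        y2 = y[i - 2] if i >= 2 else 0.0
        y.append(xi - a1 * y1 - a2 * y2)
    return y


def synthesize(frames, pulse_times, centers_hz, bandwidth_hz, fs, duration_s):
    """Sum of band-filtered, amplitude-modulated pulses.

    frames[j][c] is the channel energy value of channel c at excitation
    pulse j; pulse_times[j] is the instant of that pulse in seconds.
    """
    n_samples = int(round(duration_s * fs))
    n_channels = len(centers_hz)
    # One drive signal per channel: zero except at the excitation instants.
    drive = [[0.0] * n_samples for _ in range(n_channels)]
    for frame, t in zip(frames, pulse_times):
        idx = int(round(t * fs))
        if 0 <= idx < n_samples:
            for c in range(n_channels):
                drive[c][idx] += frame[c]        # amplitude-modulated pulse
    # Each channel passes through its own band filter; the outputs are summed.
    out = [0.0] * n_samples
    for c in range(n_channels):
        band = bandpass_resonator(drive[c], centers_hz[c], bandwidth_hz, fs)
        out = [o + b for o, b in zip(out, band)]
    return out
```

Because the pulse amplitude already carries the channel energy value, no separate modulator stage appears in the sketch; the filter alone shapes each pulse into its band.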

An arrangement for carrying out this method can be realized particularly advantageously in that the data, consisting of n code groups, are transmitted from the data processing system in the form of a k-digit code as pulses over k parallel lines and through a switch with k arms to the digital-to-analog converter of the respectively assigned channel, and in that this converter contains only passive and approximately linear circuit elements for converting the code group into an amplitude-modulated pulse.
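As a rough illustration of what such a converter computes, the sketch below maps a k-digit code group to a pulse amplitude. The text only speaks of a k-digit code; binary weighting (as in a passive resistor network), the most-significant-digit-first ordering, the reference level `v_ref` and the function name `dac_pulse_amplitude` are all assumptions made for this example.

```python
def dac_pulse_amplitude(code_bits, v_ref=1.0):
    """Convert a k-digit code group to a pulse amplitude.

    Assumptions: binary digits, most significant digit first, and linear
    scaling so that all digits set gives the full-scale value v_ref.
    """
    k = len(code_bits)
    if k == 0:
        raise ValueError("code group must contain at least one digit")
    value = 0
    for bit in code_bits:
        if bit not in (0, 1):
            raise ValueError("code digits must be 0 or 1")
        value = (value << 1) | bit   # one digit per parallel line
    return v_ref * value / (2 ** k - 1)
```

With k = 4, for example, the code group 1000 yields 8/15 of full scale under these assumptions.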

The circuitry can be simplified by inserting a single digital-to-analog converter ahead of the switch, so that the converters in the individual channels are omitted.

Method and arrangement for generating artificial speech from digital speech data stored in an electronic data processing system

Applicant:
IBM Deutschland Internationale Büro-Maschinen Gesellschaft m. b. H., Sindelfingen (Württ), Tübinger Allee 49

Named as inventors:
Dipl.-Ing. Dr. Kurt Bandat,
Dipl.-Ing. Ernst Rothauser, Vienna

To better match the operating speeds of the individual systems, it is furthermore very advantageous to insert an additional buffer memory between the data processing system and the switch or the digital-to-analog converter. This buffer memory receives the data from the system and forwards them to the switch or the digital-to-analog converter at instants controlled by the data themselves, if necessary with a transformed pulse repetition frequency.

Furthermore, to make better use of the capacity of the data processing system, it is advantageous to connect several speech channels to it.

According to a further embodiment of the invention, it is proposed that the electrical output speech signal not be converted directly into acoustic speech sounds but first be transmitted onto a transmission line, preferably a telephone line, thus enabling data output from a data processing system in spoken form onto a long-distance transmission line.

The invention is described in more detail below with reference to an exemplary embodiment illustrated by drawings. The drawings show:

Fig. 1 a block diagram of an arrangement for generating artificial speech from digital data stored in a data processing system,

Fig. 2 a block diagram of a simplified arrangement for generating artificial speech,

Fig. 3 a timing diagram illustrating the scanning operations of the switch S of Figs. 1 and 2, and

Fig. 4 the use of a buffer memory for matching the speed of the switch S of Figs. 1 and 2.

Fig. 1 shows the block diagram of an arrangement for carrying out the method for generating artificial speech from digital speech data stored in a data processing system EDV. At the output of the data processing system EDV, the digitized information about the speech, the so-called data, is made available in the form of a k-digit code on k parallel lines for conversion into speech. The ends of these lines are connected to a switch S, which has k parallel switching paths and can therefore switch one complete code group at a time through to one of the channels 1 to n. To explain the scanning operations, it is convenient to picture the switch S as a rotary switch with k parallel arms, for example in the manner of a rotary selector, which sweeps over all channels from 1 to n once each, one after another, within a given time interval. The channels 1 to n correspond to the spectrum channels of the speech band split up according to the known vocoder method.

The fundamental-wave information that is also required for applying the vocoder method, the so-called excitation function with frequency f_A, is obtained in a known manner by detecting the zero crossings in one direction and deriving a rectangular pulse train from them. The time spacing of the rectangular pulses obtained in this way is 1/f_A, as shown at the top of Fig. 3. As can be seen at the bottom of Fig. 3, the rotational speed ω of the switch S is now chosen such that all n channels are scanned once, one after another at a time spacing of Si/ω, within the time 1/f_A. The respective output values from scanning a channel are fed over k parallel lines to a digital-to-analog converter DAW assigned to each channel, which produces an amplitude-modulated pulse corresponding to the value represented in the k-digit code. This pulse is fed over a single line to a filter set FS, which passes only the corresponding frequency band. The sum signal of all filter outputs then gives the electrical speech signal, which is fed via an amplifier V to a loudspeaker L.
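The timing constraint, one sweep over all n channels within one excitation period 1/f_A, can be made concrete with a small sketch. The equal spacing of the channel contacts across the whole period is an assumption made here for illustration; the source only requires that the sweep fits within 1/f_A. The function name `scan_schedule` and its parameters are hypothetical.

```python
def scan_schedule(pulse_time, f_a, n_channels):
    """Instants at which the switch reaches channels 1..n in one sweep.

    Assumption: the n contacts are visited at equal spacing, so that the
    complete sweep takes exactly one excitation period 1/f_a.
    """
    if f_a <= 0 or n_channels <= 0:
        raise ValueError("f_a and n_channels must be positive")
    period = 1.0 / f_a              # spacing of the excitation pulses
    step = period / n_channels      # assumed spacing between channel contacts
    return [(ch, pulse_time + (ch - 1) * step)
            for ch in range(1, n_channels + 1)]
```

For an excitation frequency of 100 Hz and four channels, this assumed schedule visits the channels 2.5 ms apart and completes the sweep before the next excitation pulse at 10 ms.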

Fig. 2 shows a simplification of the arrangement of Fig. 1 in that a single digital-to-analog converter DAW is inserted between the output of the electronic data processing system EDV and the input of the switch S. In this way the outlay on digital-to-analog converters is considerably reduced.

Fig. 4 indicates the principle by which inserting a buffer memory P into the k connecting lines from the output of the data processing system EDV to the switch makes it possible to transfer the data to the switch at instants determined by the data themselves, for example via the control lines Se, if necessary with a transformed pulse repetition frequency.

As a result, the scanning speed of the switch S can then also be influenced.
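The role of the buffer memory can be sketched as a first-in, first-out queue in which each stored frame carries its own release instant, so that the data themselves determine when they reach the switch. The class name `BufferMemory`, its methods, and the representation of the release instant as a number stored alongside the code groups are assumptions for illustration; the text does not specify how the control information is encoded.

```python
from collections import deque


class BufferMemory:
    """FIFO between the data processing system and the switch (sketch).

    Assumption: frames are written in order of increasing release time.
    """

    def __init__(self):
        self._frames = deque()

    def write(self, release_time, code_groups):
        """Accept one frame of n code groups at the computer's own pace."""
        self._frames.append((release_time, list(code_groups)))

    def read_due(self, now):
        """Hand over every frame whose release instant has been reached."""
        due = []
        while self._frames and self._frames[0][0] <= now:
            due.append(self._frames.popleft()[1])
        return due
```

The computer can then write frames in bursts, while the switch side polls `read_due` and receives each frame only at the instant the frame itself prescribes.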

Claims

Patent claims:

1. A method for generating artificial speech from digital data stored in an electronic data processing system about the instantaneous energy in n frequency sub-bands of the speech signal to be generated (channel energy values) and about the time course of the energy excitation of the speech signal, the speech signal being excited by energy pulses defined in their position in time, characterized in that, at each of the instants given by the position in time of the excitation energy pulses to be generated, the electronic data processing system (EDV) applies n digital code groups representing the channel energy values one after another to a switch (S), which distributes one code group to each assigned channel of channels 1 to n; that in each channel a digital-to-analog converter (DAW) produces an amplitude-modulated pulse from the code group; and that a following band filter (FS) passes only those frequencies of this pulse that lie in the corresponding band, so that the sum of the filter output signals yields an electrical speech signal.

2. Arrangement for carrying out the method according to claim 1, characterized in that the data on the channel energy values, consisting of n code groups, are transmitted from the data processing system (EDV) in the form of a k-digit code as pulses over k parallel lines and a switch (S) with k arms to the digital-to-analog converter (DAW) of the respectively assigned channel 1 to n, and that this digital-to-analog converter contains only passive and approximately linear circuit elements for converting the code group into an amplitude-modulated pulse (Fig. 1).

3. Arrangement according to claim 2, characterized in that, instead of the digital-to-analog converters in the individual channels 1 to n, a single digital-to-analog converter (DAW) is inserted ahead of the switch (S) (Fig. 2).

4. Arrangement according to claims 2 and/or 3, characterized in that an additional buffer memory (P) is inserted between the electronic data processing system (EDV) and the switch (S) or the digital-to-analog converter (DAW), which buffer memory receives the data from the data processing system and forwards them to the switch (S) or the digital-to-analog converter (DAW) at instants controlled by the data themselves, if necessary with a transformed pulse repetition frequency (Figs. 1, 2 and 4).

5. Arrangement according to one or more of claims 2 to 4, characterized in that several speech channels are connected to the electronic data processing system.

6. Arrangement according to one or more of claims 2 to 5, characterized in that the electrical output signal is not converted directly into acoustic speech sounds but is first transmitted onto a transmission line, preferably a telephone line.

1 sheet of drawings