WO2013011634A1 - Waveform processing device, waveform processing method, and waveform processing program
- Publication number
- WO2013011634A1 (PCT/JP2012/004128)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- waveform
- pitch
- segment
- power
- normalization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G10L13/07—Concatenation rules
Definitions
- the present invention relates to a waveform processing device, a waveform processing method, and a waveform processing program, and more particularly to a waveform processing device, a waveform processing method, and a waveform processing program that change the power of a waveform.
- the speech waveform is represented by a waveform with time on the horizontal axis and amplitude on the vertical axis.
- a speech waveform is prepared for each segment from the voices of speakers recorded in advance. Then, a synthesized speech is obtained by connecting the waveform of the segments corresponding to the speech to be output.
- The speech waveform of each segment is cut out at the pitch period.
- Each waveform cut out in this way is called a pitch waveform. Because pitch waveforms are cut out at the pitch period from the waveform of one segment, a plurality of pitch waveforms are generated for each segment.
- the pitch period is the reciprocal of the pitch frequency (fundamental frequency).
- FIG. 11 is a schematic diagram illustrating an example of compressor processing for an audio waveform.
- the power envelope of the waveform 91 of the sound before the compressor processing can be schematically expressed as a power envelope 92.
- the power envelope of the voice waveform becomes a power envelope 93.
- Patent Document 1 describes a speech synthesizer.
- the speech synthesizer described in Patent Document 1 performs waveform normalization processing as described below. That is, the speech synthesizer described in Patent Document 1 takes out one pitch waveform.
- the speech synthesizer described in Patent Document 1 obtains normalized waveform information S [i] by performing the following equation (2) calculation with A as a predetermined value.
- the power of the voice recorded to obtain the voice waveform for each segment varies depending on the voice recording conditions and the influence of the speaker's habit.
- When synthesized speech is generated using waveforms derived from such recorded speech, power non-uniformity occurs; for example, the power becomes particularly large at certain positions on the horizontal axis (time axis). As a result, the synthesized speech is difficult to hear.
- compressor processing can be considered as a method for eliminating the power non-uniformity of the synthesized speech.
- In compressor processing, the waveform is left unchanged where the amplitude value is below a threshold, and is changed so that the amplitude value is constant where it is equal to or greater than the threshold.
- In other words, the portions where the amplitude value is equal to or greater than the threshold are flattened. Compressor processing therefore distorts the sound waveform and degrades the sound quality.
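A minimal Python sketch of the flattening described above, assuming the simplest case in which every sample at or above the threshold is clamped to the threshold (a hard limiter). The function name `hard_limit` and the sample values are illustrative only and do not come from the patent.

```python
def hard_limit(samples, threshold):
    """Clamp samples whose magnitude reaches the threshold; leave the rest unchanged."""
    limited = []
    for x in samples:
        if abs(x) >= threshold:
            # Flatten: keep the sign, replace the magnitude by the threshold.
            limited.append(threshold if x > 0 else -threshold)
        else:
            limited.append(x)
    return limited


# Illustrative samples of one oscillation (assumed values).
wave = [0.0, 0.4, 0.9, 1.0, 0.9, 0.4, 0.0, -0.4, -0.9, -1.0]
limited = hard_limit(wave, threshold=0.5)
```

In `limited`, the rounded peak of the input becomes a flat run of samples at the threshold, which is the kind of waveform distortion the text attributes to compressor processing.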
- an object of the present invention is to provide a waveform processing device, a waveform processing method, and a waveform processing program for changing the power of each pitch waveform of a segment so that natural synthesized speech can be obtained.
- A waveform processing apparatus according to the present invention comprises: power calculating means for selecting pitch waveforms one by one from a group of pitch waveforms corresponding to a segment and calculating a scalar representing the power of the selected pitch waveform; normalization degree calculating means for calculating a normalization degree, which is an index value indicating the degree of normalization applied to the pitch waveform, as the value of an increasing function using the scalar as a variable; change coefficient calculating means for calculating, based on the scalar and the normalization degree, a change coefficient that changes the amplitude values of the pitch waveform selected by the power calculating means; and amplitude changing means for multiplying the amplitude value at each sampling point of the pitch waveform selected by the power calculating means by the change coefficient.
- A waveform processing method according to the present invention selects pitch waveforms one by one from a group of pitch waveforms corresponding to a segment; calculates a scalar representing the power of the selected pitch waveform; calculates a normalization degree, which is an index value indicating the degree of normalization applied to the selected pitch waveform, as the value of an increasing function using the scalar as a variable; calculates, based on the scalar and the normalization degree, a change coefficient that changes the amplitude values of the selected pitch waveform; and multiplies the amplitude value at each sampling point of the selected pitch waveform by the change coefficient.
- A waveform processing program according to the present invention causes a computer to execute: power calculation processing that selects pitch waveforms one by one from a group of pitch waveforms corresponding to a segment and calculates a scalar representing the power of the selected pitch waveform; normalization degree calculation processing that calculates a normalization degree, which is an index value indicating the degree of normalization applied to the pitch waveform selected in the power calculation processing, as the value of an increasing function using the scalar as a variable; change coefficient calculation processing that calculates, based on the scalar and the normalization degree, a change coefficient that changes the amplitude values of the pitch waveform selected in the power calculation processing; and amplitude change processing that multiplies the amplitude value at each sampling point of the pitch waveform selected in the power calculation processing by the change coefficient.
- the power of each pitch waveform of the segment can be changed so that natural synthesized speech can be obtained.
- FIG. 1 is a block diagram showing an example of the first embodiment of the present invention.
- the waveform processing apparatus includes a speech unit storage unit 1, a prosody correction unit 2, and a unit waveform connection unit 3.
- the speech segment storage unit 1 is a storage device that stores a plurality of pitch waveforms for each segment.
- The unit of a segment is described here.
- For a vowel, the first half and the second half are each one segment.
- For a consonant, the consonant together with the first half of the vowel that follows forms one segment, and the latter half of that vowel forms another segment.
- the waveform of the recorded voice is cut out for each segment.
- A pitch waveform is generated by further dividing the waveform of each segment at the pitch period.
- the pitch period can be obtained, for example, as the time from the peak of the waveform to the next peak.
- a waveform in which the peak exists in the center and the power at both ends of the waveform is smaller than the peak may be cut out as a pitch waveform.
- FIG. 1 schematically shows pitch waveform groups 21, 22, and 23 as examples of pitch waveform groups for each unit stored in the speech unit storage unit 1.
- the pitch waveform group 21 is a pitch waveform group corresponding to one segment.
- Each of the pitch waveform groups 22 and 23 likewise corresponds to one segment.
- The speech unit storage unit 1 also stores, for each segment, the duration of the segment when its waveform is generated without thinning or inserting pitch waveforms.
- FIG. 2 is an explanatory diagram schematically showing an example of a pitch waveform.
- The pitch waveform is sampled along the horizontal axis (time axis). Assume that the pitch waveform illustrated in FIG. 2 is sampled N times, at sampling points t = 0 to N − 1.
- The number of samples N can be regarded as the length of one pitch waveform.
- The amplitude value at sampling point t is denoted P(t).
- the prosodic correction unit 2 changes the power of the pitch waveform belonging to the pitch waveform group for each segment. Furthermore, the waveform of one unit is generated by thinning or inserting the pitch waveform according to the duration time when the unit is output, and connecting (superimposing and adding) the pitch waveforms.
- the segment waveform linking unit 3 generates synthesized speech by linking the waveforms for each segment created by the prosody correcting unit 2.
- the prosody correction unit 2 includes a power correction unit 10, a time adjustment unit 8, and a segment waveform generation unit 9.
- the power correction unit 10 reads the pitch waveform group stored in the speech unit storage unit 1 for each unit.
- the power correction unit 10 calculates a normalization degree for each pitch waveform corresponding to one segment. Further, the power of the pitch waveform is changed based on the degree of normalization obtained for the pitch waveform. In other words, the power is corrected based on the normalization degree.
- the power correction unit 10 includes a power calculation unit 4, a normalization degree calculation unit 6, a scaling coefficient calculation unit 5, and a multiplier 7.
- the power calculation unit 4 reads a pitch waveform group for each unit from the speech unit storage unit 1.
- the power calculation unit 4, the normalization degree calculation unit 6, the scaling coefficient calculation unit 5, and the multiplier 7 perform processing for each pitch waveform belonging to the pitch waveform group of one unit.
- the power calculation unit 4 reads a pitch waveform group for each segment according to the order of the segments in the synthesized speech.
- the power calculation unit 4 calculates a scalar S representing power for the pitch waveform of interest.
- a case where the power calculation unit 4 calculates the average amplitude as a scalar S representing power will be described as an example.
- The power calculation unit 4 may calculate the average amplitude S by performing the calculation of equation (3).
- the scalar S representing power is not limited to the above average amplitude, and the power calculation unit 4 may calculate another value as the scalar S representing power. Other examples of the scalar S representing power will be described later.
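As an illustration, the scalar S could be computed as in the Python sketch below. Equation (3) is not reproduced in this text, so the sketch assumes that the average amplitude is the root mean square of the N samples P(0) to P(N − 1); this is consistent with the later remark that the scalar of equation (7) is the square of the one from equation (3), but it remains an assumption. The function name `average_amplitude` is hypothetical.

```python
import math


def average_amplitude(pitch_waveform):
    """Return an assumed form of the scalar S: the RMS of the samples P(0)..P(N-1)."""
    n = len(pitch_waveform)
    if n == 0:
        raise ValueError("a pitch waveform needs at least one sample")
    mean_square = sum(p * p for p in pitch_waveform) / n
    return math.sqrt(mean_square)


# Illustrative pitch waveform with N = 4 samples (assumed values).
s = average_amplitude([3.0, -4.0, 3.0, -4.0])
```

Any other non-negative measure that grows with the energy of the pitch waveform could take the place of the RMS here, as the text itself notes.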
- The normalization degree α is a real number satisfying 0.0 ≤ α ≤ 1.0.
- The increasing function used as A(S) may be, for example, a step function, a polygonal line function, or a sigmoid function.
- Here, a case where the increasing function A(S) is a polygonal line function is described as an example.
- The normalization degree calculation unit 6 may obtain the normalization degree α by calculating, with the function A(S) of formula (4), the value corresponding to the average amplitude S calculated by the power calculation unit 4.
- α_min and α_max in Expression (4) may be determined in advance as constants that satisfy α_min < α_max.
- S_1 and S_2 may be determined in advance as constants that satisfy S_1 < S_2.
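A polygonal line function of this kind can be sketched in Python as follows. Formula (4) is not reproduced in this text, so the shape used here is an assumption: α_min up to S_1, a straight line between S_1 and S_2, and α_max from S_2 onward. The function and parameter names are hypothetical.

```python
def normalization_degree(s, s1, s2, alpha_min, alpha_max):
    """Assumed polygonal-line A(S): flat, then linear between s1 and s2, then flat."""
    if not s1 < s2:
        raise ValueError("s1 must be smaller than s2")
    if not 0.0 <= alpha_min <= alpha_max <= 1.0:
        raise ValueError("need 0.0 <= alpha_min <= alpha_max <= 1.0")
    if s <= s1:
        return alpha_min
    if s >= s2:
        return alpha_max
    # Linear interpolation between (s1, alpha_min) and (s2, alpha_max).
    return alpha_min + (alpha_max - alpha_min) * (s - s1) / (s2 - s1)


# Illustrative constants (assumed values, not taken from the patent).
alpha_quiet = normalization_degree(0.5, s1=1.0, s2=3.0, alpha_min=0.2, alpha_max=0.8)
alpha_loud = normalization_degree(4.0, s1=1.0, s2=3.0, alpha_min=0.2, alpha_max=0.8)
```

Because the function never decreases as S grows, a louder pitch waveform always receives a normalization degree at least as high as a quieter one.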
- The scaling coefficient calculation unit 5 calculates a scaling coefficient as the value of a function whose variables are the scalar S representing power (the average amplitude in this example) and the normalization degree α.
- the scaling factor is a factor by which the amplitude value P (t) at each sampling point of the pitch waveform is multiplied.
- the power of the pitch waveform can be changed (corrected) by multiplying P (t) by the scaling factor.
- The scaling coefficient calculation unit 5 calculates a scaling coefficient g that satisfies the condition (C / S) ≤ g ≤ 1.0, where C is a predetermined constant.
- The scaling coefficient calculation unit 5 may obtain the scaling coefficient g by substituting the average amplitude S and the normalization degree α into the function G(S, α) of equation (5), for example.
- C in equation (5) is the predetermined constant described above.
- One scaling coefficient is obtained for one pitch waveform by the processing of the power calculation unit 4, the normalization degree calculation unit 6, and the scaling coefficient calculation unit 5.
- The multiplier 7 changes the power of the pitch waveform by performing the calculation of formula (6), P(t)' = g · P(t), at each sampling point t from 0 to N − 1.
- P(t)' is the corrected amplitude value at each sampling point.
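The following Python sketch illustrates one way the scaling step could work. Equation (5) is not reproduced in this text, so the form of G(S, α) used here, a linear blend between no change (g = 1.0) and full normalization (g = C / S), is an assumption chosen only because it satisfies (C / S) ≤ g ≤ 1.0 whenever S ≥ C. The function names are hypothetical.

```python
def scaling_coefficient(s, alpha, c):
    """Assumed G(S, alpha): alpha = 0 leaves the power as is, alpha = 1 scales S to C."""
    if s <= 0.0:
        raise ValueError("the scalar S must be positive")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0.0, 1.0]")
    return alpha * (c / s) + (1.0 - alpha)


def apply_scaling(pitch_waveform, g):
    """Multiply the amplitude value at every sampling point by the same coefficient g."""
    return [g * p for p in pitch_waveform]


# Illustrative values (assumed): S = 2.0, target constant C = 1.0, alpha = 0.5.
g = scaling_coefficient(2.0, alpha=0.5, c=1.0)
corrected = apply_scaling([1.0, -2.0, 4.0], g)
```

Since every sample is multiplied by the same g, the shape of the pitch waveform is preserved and only its overall level changes.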
- the time adjustment unit 8 is input with the duration time for outputting the segment for each segment.
- Based on the ratio between the duration set in advance for the pitch waveform group and the input duration, the time adjustment unit 8 thins out pitch waveforms from, or inserts pitch waveforms into, the power-corrected pitch waveform group.
- the pitch waveform to be inserted may be the same as the pitch waveform already obtained.
- a pitch pattern is input to the segment waveform generator 9.
- the pitch pattern is a time series of pitch frequencies.
- the segment waveform generation unit 9 connects the pitch waveforms for each segment according to the pitch frequency indicated by the pitch pattern.
- the segment waveform generator 9 calculates the pitch period by calculating the reciprocal of the pitch frequency, and connects the pitch waveform groups for each segment in accordance with the pitch period.
- Which pitch frequency in the pitch pattern (time series of pitch frequencies) is used to calculate the pitch period when the pitch waveforms are connected may be determined, for example, as follows.
- A time series in which each pitch frequency is associated with the elapsed time from a reference time may be input as the pitch pattern.
- The segment waveform generation unit 9 may then determine the position of each pitch waveform in the synthesized speech and calculate the pitch period used for connecting the pitch waveforms from the pitch frequency associated with the elapsed time corresponding to that position.
- The power calculation unit 4, the normalization degree calculation unit 6, the scaling coefficient calculation unit 5, the multiplier 7, the time adjustment unit 8, the segment waveform generation unit 9, and the segment waveform connection unit 3 are realized, for example, by a CPU of a computer that operates according to a waveform processing program.
- In this case, a program storage device (not shown) of the computer stores the waveform processing program, and the CPU reads the program and operates as each of these units according to the program.
- Each element may be realized by a separate unit.
- FIG. 4 is a flowchart showing an example of a process for synthesizing a pitch waveform for one segment. It is assumed that a pitch waveform group is stored in advance in the speech segment storage unit 1 for each segment.
- the power calculation unit 4 reads a pitch waveform group for one unit from the speech unit storage unit 1 (step S1). Then, the power calculation unit 4 determines whether there is a pitch waveform that has not yet been selected in the pitch waveform group for one segment read in step S1 (step S2). If there is an unselected pitch waveform (Yes in step S2), the process proceeds to step S3. Note that since no pitch waveform has been selected at the time of first transition from step S1 to step S2, the process proceeds to step S3.
- In step S3, the power calculation unit 4 selects one pitch waveform that has not yet been selected from the pitch waveform group for one segment read in step S1.
- the power calculation unit 4 calculates a scalar S representing power for the selected pitch waveform (step S4).
- In this example, a case where the average amplitude is calculated as the scalar S representing power is described.
- the power calculation unit 4 may calculate the average amplitude S of the pitch waveform by performing the calculation of Expression (3) for the selected pitch waveform.
- The normalization degree calculation unit 6 calculates the normalization degree α based on the average amplitude S (step S5).
- Here, it is assumed that the function shown in Expression (4) is predetermined as the increasing function A(S) using the average amplitude S as a variable.
- The scaling coefficient calculation unit 5 calculates a scaling coefficient for the pitch waveform selected in step S3 based on the average amplitude S and the normalization degree α (step S6).
- Here, it is assumed that the function shown in Expression (5) is predetermined as the function G(S, α) representing the scaling coefficient.
- The scaling coefficient calculation unit 5 may calculate the scaling coefficient by substituting the average amplitude S calculated in step S4 and the normalization degree α calculated in step S5 into G(S, α).
- the multiplier 7 changes the power of the pitch waveform selected in step S3 using the scaling coefficient g calculated in step S6 (step S7).
- the corrected amplitude value P (t) ′ at each sampling point may be calculated by performing the calculation shown in Expression (6).
- the correction for the waveform selected in step S3 is completed.
- After step S7, the power correction unit 10 repeats the operations from step S2.
- If it is determined in step S2 that there are no unselected pitch waveforms (No in step S2), the process proceeds to step S8.
- The absence of unselected pitch waveforms means that all the pitch waveforms belonging to the pitch waveform group read in step S1 have been selected and that the power change has been completed for each of them.
- the time adjustment unit 8 is input with a duration time when the segment is output as synthesized speech.
- the time adjustment unit 8 calculates the ratio between the duration time set in advance for the pitch waveform group for one segment read in step S1 and the input duration length. Then, based on the ratio, the time adjustment unit 8 thins out the pitch waveform or inserts the pitch waveform into the corrected pitch waveform group (step S8).
- the predetermined duration is the duration of the segment when the segment waveform is generated without thinning out or inserting the pitch waveform.
- FIG. 5 is an explanatory diagram showing an example of pitch waveform thinning
- FIG. 6 is an explanatory diagram showing an example of pitch waveform insertion
- FIG. 5A shows each pitch waveform before thinning
- FIG. 6A shows each pitch waveform before insertion.
- Numbers 1 to 6 shown in FIGS. 5A and 6A indicate the order of the pitch waveforms. In FIGS. 5 and 6, all pitch waveforms are drawn with the same maximum amplitude, but the maximum amplitudes of the pitch waveforms are not necessarily the same.
- the input duration length (the duration length when the segment is output as synthesized speech) is 0.66 times the predetermined duration length.
- The time adjustment unit 8 excludes the second and fourth pitch waveforms and moves the third, fifth, and sixth pitch waveforms to the second to fourth positions (see FIG. 5(b)).
- the number of pitch waveforms is reduced from six to four, and the duration of this segment is 0.66 times that in the case where no thinning is performed.
- the input duration length is 1.33 times the predetermined duration length.
- The time adjustment unit 8 inserts a pitch waveform identical to the second pitch waveform after the second pitch waveform.
- Similarly, a pitch waveform identical to the fourth pitch waveform is inserted.
- the number of pitch waveforms increases from six to eight, and the duration of this segment is 1.33 times that in the case where no insertion is performed.
- Thinning and insertion are not limited to the examples shown in FIGS. 5 and 6.
- How many and which pitch waveforms are excluded, or which pitch waveforms are duplicated and inserted, for a given ratio of the input duration to the predetermined duration may be determined in advance as a thinning or insertion rule.
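One possible thinning or insertion rule is sketched below in Python. It is an assumption for illustration, not the rule of FIGS. 5 and 6: the output count is the input count times the duration ratio, rounded, and each output position takes the input pitch waveform at the proportionally corresponding index. The function name is hypothetical.

```python
def adjust_pitch_waveform_count(pitch_waveforms, ratio):
    """Thin out (ratio < 1) or duplicate (ratio > 1) pitch waveforms by index mapping."""
    n_in = len(pitch_waveforms)
    if n_in == 0:
        return []
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    n_out = max(1, round(n_in * ratio))
    # Output position i reuses the input waveform at floor(i * n_in / n_out).
    return [pitch_waveforms[(i * n_in) // n_out] for i in range(n_out)]


# Labels 1..6 stand in for six pitch waveforms, as in the figures.
labels = [1, 2, 3, 4, 5, 6]
thinned = adjust_pitch_waveform_count(labels, 0.66)   # four waveforms remain
extended = adjust_pitch_waveform_count(labels, 1.33)  # eight waveforms result
```

With these ratios the counts match the examples in the text (six to four and six to eight), although this rule drops or repeats different pitch waveforms than the figures do.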
- The segment waveform generation unit 9 identifies, among the input pitch frequencies, the pitch frequency corresponding to the pitch waveforms read in step S1, and calculates the pitch period as the reciprocal of that pitch frequency. The individual pitch waveforms are then connected in accordance with the pitch period (step S9).
- superposition addition may be performed using a shift amount corresponding to the pitch period.
- For example, assume that the first pitch waveform is P_1(t), the second pitch waveform is P_2(t), and the shift amount corresponding to the pitch period from the first pitch waveform to the second pitch waveform is T.
- In this case, the segment waveform generation unit 9 obtains a connected pitch waveform by calculating P_1(t) + P_2(t + T).
- the third and subsequent pitch waveforms may be superimposed and added while reflecting the shift amount.
- As a result, the interval from one peak to the next is long where the pitch period is long and short where the pitch period is short.
- the vicinity of the end point of the previous pitch waveform and the vicinity of the start point of the next pitch waveform may be overlapped on the time axis.
- the segment waveform generator 9 may add the amplitude value between the vicinity of the end point of the previous pitch waveform and the vicinity of the start point of the next pitch waveform.
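The superposition addition described above can be sketched in Python as follows, assuming that each pitch waveform is a list of samples and that each shift amount is a non-negative whole number of samples between the start of one pitch waveform and the start of the next. The function name and the sample values are hypothetical.

```python
def overlap_add(pitch_waveforms, shifts):
    """Place each pitch waveform at its start offset and sum overlapping samples."""
    if not pitch_waveforms:
        return []
    if len(shifts) != len(pitch_waveforms) - 1:
        raise ValueError("need exactly one shift between each pair of waveforms")
    # Accumulate the shifts to get the start offset of every pitch waveform.
    starts = [0]
    for shift in shifts:
        starts.append(starts[-1] + shift)
    length = max(start + len(w) for start, w in zip(starts, pitch_waveforms))
    out = [0.0] * length
    for start, waveform in zip(starts, pitch_waveforms):
        for i, p in enumerate(waveform):
            out[start + i] += p  # overlapping regions are added together
    return out


# Two three-sample pitch waveforms, the second starting 2 samples after the first.
connected = overlap_add([[1.0, 2.0, 1.0], [1.0, 2.0, 1.0]], shifts=[2])
```

Here the last sample of the first pitch waveform and the first sample of the second overlap and are added, while the two peaks end up two samples apart, matching the shift.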
- the prosodic correction unit 2 may perform the above-described steps S1 to S9 for each segment in the order of the segments used in the synthesized speech.
- the segment waveform connecting unit 3 connects the waveforms of the segments according to the order of the segments used in the synthesized speech.
- the segment waveform linking unit 3 may perform waveform superposition addition using a shift amount corresponding to the duration time. For example, assume that the waveform of the first phoneme is X 1 (t) and the waveform of the second phoneme is X 2 (t). Further, it is assumed that the shift amount corresponding to the duration of the first phoneme is R. In this case, the segment waveform connecting unit 3 obtains a connected waveform by calculating X 1 (t) + X 2 (t + R). Similarly, the third and subsequent phoneme waveforms may be superimposed and added while reflecting the shift amount.
- the segment waveform connecting unit 3 may add the amplitude value between the vicinity of the end point of the waveform of the previous phoneme and the vicinity of the start point of the waveform of the next phoneme.
- The function A(S) used for calculating the normalization degree α is an increasing function. Therefore, the greater the average amplitude (the scalar representing power), the higher the normalization degree; that is, the processing approaches full normalization. Conversely, the smaller the average amplitude, the lower the normalization degree and the smaller the change in power in step S7. A pitch waveform with a small amplitude therefore stays relatively smaller than the other pitch waveforms, and natural synthesized speech can be obtained.
- The scaling coefficient calculation unit 5 calculates a scaling coefficient g that satisfies the condition (C / S) ≤ g ≤ 1.0, and the multiplier 7 changes the power by the scaling coefficient g. Therefore, even if a pitch waveform whose power is abruptly large is obtained because of voice recording conditions or the speaker's habits, power non-uniformity in the resulting synthesized speech waveform can be prevented.
- Because the multiplier 7 changes the power of the pitch waveform by the calculation of equation (6), which multiplies every sample by the same coefficient, the changed pitch waveform is not distorted and deterioration of the sound quality can be prevented.
- the power calculation unit 4 may obtain a scalar S representing power by the calculation of Expression (7) shown below.
- the scalar obtained by equation (7) is the square of the average amplitude obtained by equation (3).
- the power calculation unit 4 may obtain a scalar S representing power by calculation of the following equation (8).
- The normalization degree calculation unit 6 may calculate the normalization degree α as the value corresponding to the scalar S (for example, the average amplitude) calculated by the power calculation unit 4, using the function A(S) of equation (9).
- The function shown in equation (9) can also be called a binary function.
- The normalization degree calculation unit 6 may also calculate the normalization degree α by substituting the scalar S calculated by the power calculation unit 4 into formula (10).
- α_min and α_max may be determined in advance as constants that satisfy α_min < α_max.
- ⁇ 1 and ⁇ 2 may be determined as constants that satisfy the following equations (11) and (12).
- S_1 and S_2 in Expression (12) may be determined in advance as constants that satisfy S_1 < S_2.
- When A(S) is a sigmoid function, the normalization degree α changes smoothly, so the change in power becomes more natural.
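A sigmoid-shaped A(S) can be sketched in Python as below. Formulas (10) to (12) are not reproduced in this text, so the logistic form and the `center` and `slope` parameters are assumptions made for illustration; only the idea of a smooth increasing function bounded by α_min and α_max comes from the description.

```python
import math


def sigmoid_normalization_degree(s, alpha_min, alpha_max, center, slope):
    """Assumed sigmoid A(S): rises smoothly from alpha_min to alpha_max around center."""
    if slope <= 0.0:
        raise ValueError("slope must be positive so that A(S) is increasing")
    z = -slope * (s - center)
    if z > 700.0:
        # exp(z) would overflow; the function value is alpha_min to machine precision.
        return alpha_min
    return alpha_min + (alpha_max - alpha_min) / (1.0 + math.exp(z))


# Illustrative constants (assumed values, not taken from the patent).
values = [
    sigmoid_normalization_degree(s, alpha_min=0.2, alpha_max=0.8, center=2.0, slope=3.0)
    for s in (1.0, 2.0, 3.0)
]
```

Unlike a polygonal line function, this curve has no corners at S_1 and S_2, which is the smoothness the text refers to.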
- C in Expression (13) is a predetermined constant.
- ⁇ 1 and ⁇ 2 in Equation (13) may be determined in advance as constants that satisfy 0.0 ⁇ ⁇ 1 ⁇ 2 ⁇ 1.0.
- a scaling factor g may be calculated.
- C in Formula (14) is a predetermined constant. Further, ⁇ 1 and ⁇ 2 in the equation (14) may be determined as constants that satisfy the following equations (15) and (16).
- The normalization degree calculation unit 6 may switch the increasing function A(S) used for calculating the normalization degree α depending on whether the segment for which the scaling coefficient is to be calculated (that is, the segment corresponding to the pitch waveform group read in step S1) is a vowel, includes a consonant other than a voiced plosive (b, d, g), or includes a voiced plosive consonant.
- the normalization degree calculation unit 6 receives the result of performing language processing on the text information to be synthesized speech output. That is, it is determined by language processing whether each segment is a segment corresponding to a vowel, a segment containing a consonant other than a voiced plosive, or a segment containing a consonant of a voiced plosive. Then, the determination results may be input to the normalization degree calculation unit 6 in the order of the segments.
- The normalization degree calculation unit 6 may calculate the normalization degree α using the function A(S) of equation (17) as the increasing function A(S).
- The normalization degree calculation unit 6 may calculate the normalization degree α using the function A(S) of equation (18) as the increasing function A(S).
- The normalization degree calculation unit 6 may calculate the normalization degree α using the function A(S) of equation (19) as the increasing function A(S).
- S 1 , S 2 , and S th may be set as constants in advance. However, S 2 and S th are determined so as to satisfy S 2 ⁇ S th .
- ⁇ min1 , ⁇ max1 , ⁇ min2 , and ⁇ max2 may be determined in advance as constants that satisfy ⁇ min1 ⁇ max1 and ⁇ min2 ⁇ max2 , respectively. However, ⁇ max1 and ⁇ max2 are determined so as to satisfy the condition of ⁇ max2 ⁇ max1 . As for ⁇ min1 and ⁇ min2 , either value may be large.
- the normalization degree of the segment including the consonant can be kept small. Further, it is possible to prevent the power of the voiced plosive from becoming larger than before scaling. Accordingly, it is possible to prevent the voice deterioration of the consonant accompanying the scaling.
- The normalization degree calculation unit 6 may switch the increasing function A(S) used for calculating the normalization degree α depending on whether the segment for which the scaling coefficient is calculated (that is, the segment corresponding to the pitch waveform group read in step S1) is within 3 morae from the beginning of the sentence. In this case, as language processing for the text to be output as synthesized speech, it is determined whether each segment is within 3 morae from the beginning of the sentence, and the determination result may be input to the normalization degree calculation unit 6.
- When the segment is within 3 morae from the beginning of the sentence, the normalization degree calculation unit 6 may calculate the normalization degree α using the function A(S) of equation (20) as the increasing function A(S).
- Otherwise, the normalization degree calculation unit 6 may calculate the normalization degree α using the function A(S) of equation (21) as the increasing function A(S).
- S 1 , S 2 , and S 3 may be determined in advance as constants that satisfy S 1 ⁇ S 3 ⁇ S 2 , respectively.
- ⁇ min1 , ⁇ max1 , ⁇ min2 , and ⁇ max2 may be determined in advance as constants satisfying ⁇ min1 ⁇ max1 and ⁇ min2 ⁇ max2 , respectively.
- ⁇ max1 and ⁇ max2 are determined so as to satisfy the condition of ⁇ max2 ⁇ max1 .
- ⁇ min1 and ⁇ min2 either value may be large.
- a (S) used for calculation of the normalization degree ⁇ is not based on whether the segment is within 3 mora from the beginning of the sentence but based on whether the segment is within 3 mora from the beginning of the exhalation paragraph. May be switched. That is, when the segment for which the scaling factor is calculated is a segment within 3 mora from the beginning of the exhalation paragraph, the normalization degree calculation unit 6 may calculate the normalization degree ⁇ using Expression (20). . If the segment for which the scaling coefficient is to be calculated is not a segment within 3 mora from the beginning of the exhalation paragraph, the normalization degree calculation unit 6 may calculate the normalization degree ⁇ using Expression (21). In this case, the normalization degree calculation unit 6 may be input with the result of determining for each segment whether or not the segment is within 3 mora from the beginning of the exhalation paragraph.
- The synthesized speech at the beginning of a sentence or an exhalation paragraph can be made more natural by reducing the normalization degree of segments within 3 mora from the beginning of the sentence (or from the beginning of the exhalation paragraph).
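The switching rule can be sketched in Python as follows. Expressions (20) and (21) are not reproduced in this text, so the clamped linear ramp `ramp`, the parameter sets `PARAMS_HEAD` and `PARAMS_OTHER`, and every numeric value are hypothetical stand-ins. S3 is left out because its role is defined only by those expressions. The properties taken from the description are that A(S) is an increasing function and that segments within 3 mora of the beginning receive a reduced normalization degree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RampParams:
    """Parameters of a hypothetical increasing function A(S)."""
    s_low: float      # at or below this scalar, alpha stays at alpha_min
    s_high: float     # at or above this scalar, alpha stays at alpha_max
    alpha_min: float
    alpha_max: float


def ramp(s: float, p: RampParams) -> float:
    """Clamped linear ramp: a non-decreasing stand-in for A(S)."""
    if s <= p.s_low:
        return p.alpha_min
    if s >= p.s_high:
        return p.alpha_max
    t = (s - p.s_low) / (p.s_high - p.s_low)
    return p.alpha_min + t * (p.alpha_max - p.alpha_min)


# Hypothetical values: the head-of-sentence set caps alpha lower.
PARAMS_HEAD = RampParams(s_low=1000.0, s_high=8000.0, alpha_min=0.0, alpha_max=0.5)
PARAMS_OTHER = RampParams(s_low=1000.0, s_high=8000.0, alpha_min=0.1, alpha_max=0.9)


def normalization_degree(s: float, within_3_mora: bool) -> float:
    """Select the function by segment position, then evaluate it at S."""
    params = PARAMS_HEAD if within_3_mora else PARAMS_OTHER
    return ramp(s, params)
```

The `within_3_mora` flag corresponds to the determination result that the language processing inputs to the normalization degree calculation unit 6. The same function serves the exhalation-paragraph variant if the flag is computed relative to the beginning of the exhalation paragraph instead.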
- FIG. 8 is a block diagram showing an example of the second embodiment of the present invention. Constituent elements similar to those in the first embodiment are denoted by the same reference numerals as those in FIG. 1, and detailed description thereof is omitted.
- In addition to the elements of the first embodiment (see FIG. 1), the waveform processing apparatus of the second embodiment further includes a recorded speech waveform storage unit 32, a time length information storage unit 31, and a segment creation unit 33.
- the recorded voice waveform storage unit 32 is a storage device that stores a recorded voice waveform.
- FIG. 8 shows an example in which a series of waveforms of syllables “u”, “ma”, and “i” is stored.
- the time length information storage unit 31 is a storage device that stores the time length of each syllable of the recorded voice. That is, the time length information storage unit 31 stores the time length of each syllable corresponding to the waveform stored in the recorded speech waveform storage unit 32. For example, the time length information storage unit 31 stores the time length for each syllable such as “u”, “ma”, and “i”.
- The segment creation unit 33 extracts a waveform for each segment from the waveform (recorded speech waveform) stored in the recorded speech waveform storage unit 32, and further extracts pitch waveforms from the waveform of each segment. It then stores the resulting pitch waveform group in the speech unit storage unit 1 for each segment.
- the segment creation unit 33 includes a segment waveform cutout unit 34 and a pitch waveform generation unit 35.
- The segment creation unit 33 cuts out the waveform of each individual segment from the waveform (recorded speech waveform) stored in the recorded speech waveform storage unit 32.
- For a syllable consisting of a vowel alone, the first half and the second half of the vowel are each one segment.
- For a syllable consisting of a consonant and a vowel, the consonant together with the first half of the following vowel is one segment, and the second half of the vowel is another segment.
- Accordingly, for a vowel-only syllable, the segment creation unit 33 may cut out the first half and the second half of the syllable from the recorded speech waveform.
- For a consonant-vowel syllable, it may cut out the consonant together with the first half of the following vowel, and separately cut out the second half of the vowel.
- The location corresponding to each syllable in the recorded speech waveform can be determined from the time length of each syllable.
- For example, suppose the recorded speech waveform illustrated in FIG. 8 corresponds to the syllables “u”, “ma”, and “i”.
- The segment creation unit 33 identifies the locations corresponding to “u”, “ma”, and “i” in the recorded waveform based on the time lengths of “u”, “ma”, and “i”, and cuts out the first half and the second half of the part corresponding to each syllable. As a result, a waveform for each segment is obtained.
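The cutting step can be sketched in Python as follows, under stated assumptions. The function name `cut_segments` and the tuple layout are hypothetical. The description only mentions a time length per syllable, so the `vowel_onset` offset (the sample at which the vowel begins inside the syllable, 0 for a vowel alone) is an extra input assumed here in order to locate the middle of the vowel.

```python
from typing import List, Sequence, Tuple

# (label, length in samples, offset of the vowel inside the syllable)
Syllable = Tuple[str, int, int]


def cut_segments(
    waveform: Sequence[float],
    syllables: Sequence[Syllable],
) -> List[Tuple[str, List[float]]]:
    """Split a recorded waveform into two segments per syllable.

    The first segment runs from the syllable start to the middle of the
    vowel (so it contains any consonant plus the first half of the vowel);
    the second segment is the second half of the vowel.
    """
    segments: List[Tuple[str, List[float]]] = []
    start = 0
    for label, length, vowel_onset in syllables:
        end = start + length
        # Midpoint of the vowel portion of this syllable.
        split = start + vowel_onset + (length - vowel_onset) // 2
        segments.append((label + ":first", list(waveform[start:split])))
        segments.append((label + ":second", list(waveform[split:end])))
        start = end  # the next syllable begins where this one ends
    return segments


# Toy recording of 20 samples covering "u", "ma", "i".
recording = [float(i) for i in range(20)]
segments = cut_segments(recording, [("u", 6, 0), ("ma", 10, 4), ("i", 4, 0)])
```

Because the syllable boundaries come from accumulated time lengths, the six segments cover the recording without gaps or overlaps.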
- The pitch waveform generation unit 35 cuts out pitch waveforms from the waveform of each segment. Even the waveform of a single segment contains a plurality of peaks.
- The pitch waveform generation unit 35 calculates the interval between the peaks as the pitch period. It then obtains a plurality of pitch waveforms (a pitch waveform group) for one segment by cutting the segment waveform in accordance with the pitch period. The pitch waveform generation unit 35 cuts out each pitch waveform so that the peak lies at the center and the power at both ends of the waveform is smaller than at the peak.
- the pitch waveform generation unit 35 stores the generated pitch waveform group in the speech unit storage unit 1 for each unit.
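A minimal Python sketch of the pitch waveform extraction described above is given below. The names `find_peaks` and `cut_pitch_waveforms`, the peak-picking rule, and the thresholds are hypothetical. Tapering each cut with a Hann window is also an assumption made here, chosen only because it places the peak at the center and drives both ends toward zero; the text does not state how the ends are made small.

```python
import math
from typing import List


def find_peaks(wave: List[float], min_distance: int, threshold: float) -> List[int]:
    """Indices of local maxima above `threshold`, at least `min_distance` apart."""
    peaks: List[int] = []
    for i in range(1, len(wave) - 1):
        is_local_max = wave[i] > wave[i - 1] and wave[i] >= wave[i + 1]
        if not (is_local_max and wave[i] >= threshold):
            continue
        if peaks and i - peaks[-1] < min_distance:
            if wave[i] > wave[peaks[-1]]:
                peaks[-1] = i  # keep the taller of two close peaks
        else:
            peaks.append(i)
    return peaks


def cut_pitch_waveforms(
    wave: List[float], min_distance: int = 20, rel_threshold: float = 0.5
) -> List[List[float]]:
    """Cut one windowed pitch waveform around each peak of a segment waveform."""
    peaks = find_peaks(wave, min_distance, rel_threshold * max(wave))
    group: List[List[float]] = []
    for prev, peak in zip(peaks, peaks[1:]):
        period = peak - prev  # pitch period = interval between peaks
        lo, hi = peak - period, peak + period
        if hi >= len(wave):
            break  # not enough samples left for a full two-period frame
        frame = wave[lo:hi + 1]
        n = len(frame)  # 2 * period + 1 samples, peak at index `period`
        window = [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]
        group.append([x * w for x, w in zip(frame, window)])
    return group


# Toy segment: a cosine with a 50-sample period stands in for voiced speech.
segment_wave = [math.cos(2.0 * math.pi * i / 50.0) for i in range(300)]
pitch_group = cut_pitch_waveforms(segment_wave)
```

Each element of `pitch_group` spans two pitch periods, which is a common choice when pitch waveforms are later overlapped and added; a real implementation would likely use a more robust pitch-mark detector than this simple local-maximum search.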
- the recorded waveform including the syllables “u”, “ma”, and “i” has been described as an example.
- the recorded speech waveform storage unit 32 stores many recorded waveforms including various syllables.
- the time length of each syllable corresponding to the recorded waveform is stored in the time length information storage unit 31.
- the segment waveform cutout unit 34 and the pitch waveform generation unit 35 are realized by, for example, a CPU of a computer that operates according to a waveform processing program.
- the same effect as that of the first embodiment can be obtained.
- In addition, the pitch waveform groups of various segments can be automatically stored in the speech segment storage unit 1.
- FIG. 9 is a block diagram showing an example of the third embodiment of the present invention. Components similar to those in the first embodiment and the second embodiment are denoted by the same reference numerals as those in FIGS. 1 and 9, and detailed description thereof is omitted.
- The waveform processing apparatus of the third embodiment includes a recorded speech waveform storage unit 32, a time length information storage unit 31, a segment creation unit 33a, a speech segment storage unit 1, a pitch pattern generation unit 41, and a segment waveform connection unit 3.
- The segment creation unit 33a scales the pitch waveform group before it is stored in the speech unit storage unit 1, and stores the scaled pitch waveform group in the speech unit storage unit 1.
- the pitch waveform generation unit 41 connects the pitch waveforms stored in the speech unit storage unit 1 for each unit.
- The segment creation unit 33a includes a segment waveform cutout unit 34, a pitch waveform generation unit 35, and a power correction unit 10.
- the segment waveform cutout unit 34 and the pitch waveform generation unit 35 are the same as those elements in the second embodiment.
- The power correction unit 10, and the power calculation unit 4, the normalization degree calculation unit 6, the scaling coefficient calculation unit 5, and the multiplier 7 included in it, are the same as those elements in the first and second embodiments.
- the multiplier 7 stores the scaled pitch waveform group in the speech unit storage unit 1.
- the pitch waveform generation unit 41 includes a time adjustment unit 8 and a segment waveform generation unit 9.
- the time adjustment unit 8, the segment waveform generation unit 9, and the segment waveform connection unit 3 are the same as those elements in the first and second embodiments.
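The difference from the earlier embodiments is where scaling happens: once, before storage, rather than at synthesis time. A minimal Python sketch of that ordering follows. The name `build_scaled_store`, the dictionary layout, and the `scaling_coefficient` callable (standing in for the power correction unit 10) are hypothetical.

```python
from typing import Callable, Dict, List

PitchWaveform = List[float]


def build_scaled_store(
    pitch_groups: Dict[str, List[PitchWaveform]],
    scaling_coefficient: Callable[[PitchWaveform], float],
) -> Dict[str, List[PitchWaveform]]:
    """Scale every pitch waveform once and return the store to be saved.

    `pitch_groups` maps a segment identifier to its pitch waveform group.
    `scaling_coefficient` returns the coefficient g for one pitch waveform.
    """
    store: Dict[str, List[PitchWaveform]] = {}
    for segment_id, group in pitch_groups.items():
        scaled_group: List[PitchWaveform] = []
        for pw in group:
            g = scaling_coefficient(pw)
            scaled_group.append([g * x for x in pw])  # role of the multiplier 7
        store[segment_id] = scaled_group
    return store


# Toy usage with a constant coefficient in place of a real power correction.
raw = {"u:first": [[2.0, -4.0], [6.0, -8.0]], "u:second": [[1.0, 1.0]]}
stored = build_scaled_store(raw, lambda pw: 0.5)
```

With this arrangement the synthesis side only reads already-scaled pitch waveforms, so the per-waveform power calculation is paid once when the segment database is built.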
- FIG. 10 is a block diagram showing an example of the minimum configuration of the waveform processing apparatus of the present invention.
- the waveform processing apparatus of the present invention includes power calculation means 71, normalization degree calculation means 72, change coefficient calculation means 73, and amplitude change means 74.
- The power calculation means 71 (for example, the power calculation unit 4) selects pitch waveforms one by one from the pitch waveform group corresponding to a segment, and calculates a scalar representing the power of the selected pitch waveform (for example, the average amplitude, or the scalar obtained by Expression (7) or Expression (8)).
- The normalization degree calculation means 72 (for example, the normalization degree calculation unit 6) calculates the normalization degree, which is an index value indicating the degree of normalization with respect to the pitch waveform selected by the power calculation means 71, as the function value of an increasing function that takes the scalar as its variable (for example, the function A(S) exemplified in Expression (4), Expression (9), or Expression (10)).
- The change coefficient calculation means 73 (for example, the scaling coefficient calculation unit 5) calculates a change coefficient (for example, the scaling coefficient g) for changing the amplitude value of the pitch waveform selected by the power calculation means 71, based on the scalar and the normalization degree.
- Amplitude changing means 74 (for example, multiplier 7) multiplies the amplitude value at each sampling point of the pitch waveform selected by power calculating means 71 by a change coefficient.
- the power of each pitch waveform of the segment can be changed so that natural synthesized speech can be obtained.
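The four means of the minimum configuration can be sketched in Python as below. This is an illustration under stated assumptions, not the patented formulas: the mean absolute amplitude used for S, the clamped ramp used for the increasing function, and the linear blend used for g are all hypothetical, as are the function names. The blend is chosen only because it gives g = 1.0 at α = 0 and g = C/S at α = 1, so that (C/S) ≦ g ≦ 1.0 holds whenever S ≧ C.

```python
from typing import List

PitchWaveform = List[float]


def power_scalar(pw: PitchWaveform) -> float:
    """Scalar S: mean absolute amplitude of one pitch waveform (assumed measure)."""
    return sum(abs(x) for x in pw) / len(pw)


def normalization_degree(s: float, s_low: float, s_high: float) -> float:
    """Alpha in [0, 1]: a clamped linear ramp, non-decreasing in S."""
    if s <= s_low:
        return 0.0
    if s >= s_high:
        return 1.0
    return (s - s_low) / (s_high - s_low)


def change_coefficient(s: float, alpha: float, c: float) -> float:
    """Hypothetical g: 1.0 at alpha = 0, C/S at alpha = 1, linear in between."""
    if s <= c:
        return 1.0  # assumption: waveforms at or below the target C are left unchanged
    return 1.0 - alpha * (1.0 - c / s)


def correct_power(
    group: List[PitchWaveform], c: float, s_low: float, s_high: float
) -> List[PitchWaveform]:
    """Apply the four steps to each pitch waveform of one segment."""
    corrected: List[PitchWaveform] = []
    for pw in group:                                    # select one by one
        s = power_scalar(pw)                            # power calculation
        alpha = normalization_degree(s, s_low, s_high)  # normalization degree
        g = change_coefficient(s, alpha, c)             # change coefficient
        corrected.append([g * x for x in pw])           # amplitude change
    return corrected


result = correct_power(
    [[100.0, -100.0], [1000.0, -1000.0]], c=200.0, s_low=200.0, s_high=1000.0
)
```

In this toy run the quiet pitch waveform is left untouched (α = 0, g = 1.0), while the loud one is fully normalized toward C (α = 1, g = C/S = 0.2). Intermediate values of S receive a partial reduction.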
- A waveform processing apparatus comprising: power calculation means for selecting pitch waveforms one by one from a group of pitch waveforms corresponding to a segment and calculating a scalar representing the power of the selected pitch waveform; normalization degree calculation means for calculating a normalization degree, which is an index value representing the degree of normalization with respect to the pitch waveform selected by the power calculation means, as a function value of an increasing function using the scalar as a variable; change coefficient calculation means for calculating a change coefficient for changing the amplitude value of the pitch waveform selected by the power calculation means, based on the scalar and the normalization degree; and amplitude changing means for multiplying the amplitude value at each sampling point of the pitch waveform selected by the power calculation means by the change coefficient.
- (Supplementary note 3) The waveform processing apparatus according to Supplementary note 1 or Supplementary note 2, comprising segment waveform generation means for generating a waveform representing a segment by connecting the pitch waveforms changed by the amplitude changing means.
- (Supplementary note 4) The waveform processing apparatus according to any one of Supplementary notes 1 to 3, comprising segment waveform connection means for connecting the waveforms representing segments generated by the segment waveform generation means.
- (Supplementary note 5) The waveform processing apparatus according to any one of Supplementary notes 1 to 4, comprising segment storage means for storing, for each segment, a pitch waveform group corresponding to the segment.
- (Supplementary note 6) The waveform processing apparatus according to any one of Supplementary notes 1 to 5, comprising: recorded speech waveform storage means for storing the waveform of recorded speech; segment waveform cutout means for cutting out the recorded speech waveform for each segment; and pitch waveform generation means for cutting the waveform cut out for each segment into individual pitch waveforms and generating, for each segment, a pitch waveform group corresponding to the segment.
- A waveform processing method characterized by: selecting pitch waveforms one by one from a group of pitch waveforms corresponding to a segment; calculating a scalar representing the power of the selected pitch waveform; calculating a normalization degree, which is an index value representing the degree of normalization for the selected pitch waveform, as a function value of an increasing function using the scalar as a variable; calculating a change coefficient for changing the amplitude value of the selected pitch waveform, based on the scalar and the normalization degree; and multiplying the amplitude value at each sampling point of the selected pitch waveform by the change coefficient.
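- A waveform processing program for causing a computer to execute: a power calculation process of selecting pitch waveforms one by one from a group of pitch waveforms corresponding to a segment and calculating a scalar representing the power of the selected pitch waveform; a normalization degree calculation process of calculating a normalization degree, which is an index value indicating the degree of normalization for the pitch waveform selected in the power calculation process, as a function value of an increasing function using the scalar as a variable; a change coefficient calculation process of calculating a change coefficient for changing the amplitude value of the pitch waveform selected in the power calculation process, based on the scalar and the normalization degree; and an amplitude change process of multiplying the amplitude value at each sampling point of the pitch waveform selected in the power calculation process by the change coefficient.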
- A waveform processing apparatus comprising: a power calculation unit that selects pitch waveforms one by one from a group of pitch waveforms corresponding to a segment and calculates a scalar representing the power of the selected pitch waveform; a normalization degree calculation unit that calculates a normalization degree, which is an index value indicating the degree of normalization with respect to the pitch waveform selected by the power calculation unit, as a function value of an increasing function using the scalar as a variable; a change coefficient calculation unit that calculates a change coefficient for changing the amplitude value of the pitch waveform selected by the power calculation unit, based on the scalar and the normalization degree; and an amplitude change unit that multiplies the amplitude value at each sampling point of the pitch waveform selected by the power calculation unit by the change coefficient.
- (Supplementary note 13) The waveform processing apparatus according to Supplementary note 1 or Supplementary note 2, comprising a segment waveform generation unit that generates a waveform representing a segment by connecting the pitch waveforms changed by the amplitude change unit.
- (Supplementary note 14) The waveform processing apparatus according to any one of Supplementary notes 1 to 3, comprising a segment waveform connection unit that connects the waveforms representing segments generated by the segment waveform generation unit.
- the waveform processing device according to any one of supplementary notes 1 to 4, further comprising a segment storage unit that stores, for each segment, a pitch waveform group corresponding to the segment.
- (Supplementary note 16) The waveform processing apparatus according to any one of Supplementary notes 1 to 5, comprising: a recorded speech waveform storage unit that stores the waveform of recorded speech; a segment waveform cutout unit that cuts out the recorded speech waveform for each segment; and a pitch waveform generation unit that cuts the waveform cut out for each segment into individual pitch waveforms and generates, for each segment, a pitch waveform group corresponding to the segment.
- the present invention is applicable to a waveform processing apparatus that changes the power of a waveform.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Electrophonic Musical Instruments (AREA)
- Telephone Function (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
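When a plurality of pitch waveforms corresponding to one segment are normalized by the method described in Patent Document 1, the maximum amplitudes of those pitch waveforms become uniform. Such normalization is called complete normalization. The present invention calculates a specified value that defines an intermediate mode between a mode in which complete normalization is applied to the plurality of pitch waveforms corresponding to one segment and a mode in which no normalization is performed at all and the pitch waveforms are left as they are. Hereinafter, this specified value is referred to as the normalization degree. The normalization degree can be regarded as an index value representing the degree of normalization. In the present invention, the power of a pitch waveform is changed according to the normalization degree.
FIG. 1 is a block diagram showing an example of the first embodiment of the present invention. As shown in FIG. 1, the waveform processing apparatus of the first embodiment includes a speech segment storage unit 1, a prosody correction unit 2, and a segment waveform connection unit 3.
FIG. 4 is a flowchart showing an example of processing for synthesizing pitch waveforms for one segment. It is assumed that a pitch waveform group is stored in advance in the speech segment storage unit 1 for each segment.
The waveform processing apparatus of the second embodiment generates, for each segment, the pitch waveform group to be stored in the speech segment storage unit 1. FIG. 8 is a block diagram showing an example of the second embodiment of the present invention. Components similar to those of the first embodiment are denoted by the same reference numerals as in FIG. 1, and detailed description thereof is omitted. In addition to the elements of the first embodiment (see FIG. 1), the waveform processing apparatus of the second embodiment further includes a recorded speech waveform storage unit 32, a time length information storage unit 31, and a segment creation unit 33.
FIG. 9 is a block diagram showing an example of the third embodiment of the present invention. Components similar to those of the first and second embodiments are denoted by the same reference numerals as in FIGS. 1 and 9, and detailed description thereof is omitted.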
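2 Prosody correction unit
3 Segment waveform connection unit
4 Power calculation unit
5 Scaling coefficient calculation unit
6 Normalization degree calculation unit
7 Multiplier
8 Time adjustment unit
9 Segment waveform generation unit
10 Power correction unit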
Claims (10)
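- A waveform processing apparatus comprising:
power calculation means for selecting pitch waveforms one by one from a pitch waveform group corresponding to a segment and calculating a scalar representing the power of the selected pitch waveform;
normalization degree calculation means for calculating a normalization degree, which is an index value representing the degree of normalization for the pitch waveform selected by the power calculation means, as the function value of an increasing function having the scalar as its variable;
change coefficient calculation means for calculating, based on the scalar and the normalization degree, a change coefficient for changing the amplitude value of the pitch waveform selected by the power calculation means; and
amplitude changing means for multiplying the amplitude value at each sampling point of the pitch waveform selected by the power calculation means by the change coefficient.
- The waveform processing apparatus according to claim 1, wherein the change coefficient calculation means calculates, as the function value of a function having S and α as variables, a change coefficient g satisfying (C/S)≦g≦1.0, where g is the change coefficient, C is a predetermined constant, S is the scalar calculated by the power calculation means, and α is the normalization degree.
- The waveform processing apparatus according to claim 1 or claim 2, comprising segment waveform generation means for generating a waveform representing a segment by connecting the pitch waveforms changed by the amplitude changing means.
- The waveform processing apparatus according to any one of claims 1 to 3, comprising segment waveform connection means for connecting the waveforms representing segments generated by the segment waveform generation means.
- The waveform processing apparatus according to any one of claims 1 to 4, comprising segment storage means for storing, for each segment, a pitch waveform group corresponding to the segment.
- The waveform processing apparatus according to any one of claims 1 to 5, comprising:
recorded speech waveform storage means for storing the waveform of recorded speech;
segment waveform cutout means for cutting out the waveform of the recorded speech for each segment; and
pitch waveform generation means for cutting the waveform cut out for each segment into individual pitch waveforms and generating, for each segment, a pitch waveform group corresponding to the segment.
- A waveform processing method comprising:
selecting pitch waveforms one by one from a pitch waveform group corresponding to a segment and calculating a scalar representing the power of the selected pitch waveform;
calculating a normalization degree, which is an index value representing the degree of normalization for the selected pitch waveform, as the function value of an increasing function having the scalar as its variable;
calculating, based on the scalar and the normalization degree, a change coefficient for changing the amplitude value of the selected pitch waveform; and
multiplying the amplitude value at each sampling point of the selected pitch waveform by the change coefficient.
- The waveform processing method according to claim 7, wherein a change coefficient g satisfying (C/S)≦g≦1.0 is calculated as the function value of a function having S and α as variables, where g is the change coefficient, C is a predetermined constant, S is the scalar representing the power of the selected pitch waveform, and α is the normalization degree.
- A waveform processing program for causing a computer to execute:
a power calculation process of selecting pitch waveforms one by one from a pitch waveform group corresponding to a segment and calculating a scalar representing the power of the selected pitch waveform;
a normalization degree calculation process of calculating a normalization degree, which is an index value representing the degree of normalization for the pitch waveform selected in the power calculation process, as the function value of an increasing function having the scalar as its variable;
a change coefficient calculation process of calculating, based on the scalar and the normalization degree, a change coefficient for changing the amplitude value of the pitch waveform selected in the power calculation process; and
an amplitude change process of multiplying the amplitude value at each sampling point of the pitch waveform selected in the power calculation process by the change coefficient.
- The waveform processing program according to claim 9, causing the computer, in the change coefficient calculation process, to calculate a change coefficient g satisfying (C/S)≦g≦1.0 as the function value of a function having S and α as variables, where g is the change coefficient, C is a predetermined constant, S is the scalar calculated in the power calculation process, and α is the normalization degree.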
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2013524586A JP5862667B2 (ja) | 2011-07-19 | 2012-06-26 | Waveform processing device, waveform processing method, and waveform processing program |
| US14/131,460 US9443538B2 (en) | 2011-07-19 | 2012-06-26 | Waveform processing device, waveform processing method, and waveform processing program |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2011158298 | 2011-07-19 | ||
| JP2011-158298 | 2011-07-19 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2013011634A1 (ja) | 2013-01-24 |
Family
ID=47557837
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2012/004128 Ceased WO2013011634A1 (ja) | Waveform processing device, waveform processing method, and waveform processing program | 2011-07-19 | 2012-06-26 |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US9443538B2 (ja) |
| JP (1) | JP5862667B2 (ja) |
| WO (1) | WO2013011634A1 (ja) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2016065899A (ja) * | 2014-09-22 | 2016-04-28 | カシオ計算機株式会社 | Speech synthesis device, method, and program |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112562635B (zh) * | 2020-12-03 | 2024-04-09 | 云知声智能科技股份有限公司 | Method, device, and system for resolving pulse signals generated at splice points in speech synthesis |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH02137889A (ja) * | 1988-11-19 | 1990-05-28 | Sony Corp | Signal recording method and apparatus |
| JPH09244693A (ja) * | 1996-03-07 | 1997-09-19 | N T T Data Tsushin Kk | Speech synthesis method and apparatus |
| WO2004049304A1 (ja) * | 2002-11-25 | 2004-06-10 | Matsushita Electric Industrial Co., Ltd. | Speech synthesis method and speech synthesis apparatus |
| JP2008015361A (ja) * | 2006-07-07 | 2008-01-24 | Sharp Corp | Speech synthesis apparatus, speech synthesis method, and program for implementing the speech synthesis method |
-
2012
- 2012-06-26 WO PCT/JP2012/004128 patent/WO2013011634A1/ja not_active Ceased
- 2012-06-26 JP JP2013524586A patent/JP5862667B2/ja active Active
- 2012-06-26 US US14/131,460 patent/US9443538B2/en active Active
Also Published As
| Publication number | Publication date |
|---|---|
| US20140136192A1 (en) | 2014-05-15 |
| JP5862667B2 (ja) | 2016-02-16 |
| JPWO2013011634A1 (ja) | 2015-02-23 |
| US9443538B2 (en) | 2016-09-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
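| JP3913770B2 (ja) | | Speech synthesis apparatus and method |
| JP4469883B2 (ja) | | Speech synthesis method and apparatus therefor |
| JP4406440B2 (ja) | | Speech synthesis apparatus, speech synthesis method, and program |
| JPWO2012063424A1 (ja) | | Feature sequence generation device, feature sequence generation method, and feature sequence generation program |
| JP2008033133A (ja) | | Speech synthesis device, speech synthesis method, and speech synthesis program |
| US20110046957A1 (en) | | System and method for speech synthesis using frequency splicing |
| US20090326951A1 (en) | | Speech synthesizing apparatus and method thereof |
| US7765103B2 (en) | | Rule based speech synthesis method and apparatus |
| JP5862667B2 (ja) | | Waveform processing device, waveform processing method, and waveform processing program |
| JP6011039B2 (ja) | | Speech synthesis device and speech synthesis method |
| US20010029454A1 (en) | | Speech synthesizing method and apparatus |
| JP2009133890A (ja) | | Speech synthesis apparatus and method therefor |
| JP5874639B2 (ja) | | Speech synthesis device, speech synthesis method, and speech synthesis program |
| US8407054B2 (en) | | Speech synthesis device, speech synthesis method, and speech synthesis program |
| JP4525162B2 (ja) | | Speech synthesis apparatus and program therefor |
| JP5106274B2 (ja) | | Speech processing device, speech processing method, and program |
| JP4476855B2 (ja) | | Speech synthesis apparatus and method therefor |
| JP5054632B2 (ja) | | Speech synthesis device and speech synthesis program |
| JP5393546B2 (ja) | | Prosody generation device and prosody generation method |
| JP2004341259A (ja) | | Speech segment expansion/contraction device and method therefor |
| JP2008299266A (ja) | | Speech synthesis device and speech synthesis method |
| JP2005300919A (ja) | | Speech synthesis device |
| JP4630038B2 (ja) | | Speech waveform database construction method, and apparatus and program for implementing the method |
| JPH1097268A (ja) | | Speech synthesis device |
| JP2004347994A (ja) | | Speech synthesis apparatus, speech synthesis method, and program for executing the speech synthesis method |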
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12815239 Country of ref document: EP Kind code of ref document: A1 |
|
| ENP | Entry into the national phase |
Ref document number: 2013524586 Country of ref document: JP Kind code of ref document: A |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 14131460 Country of ref document: US |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 12815239 Country of ref document: EP Kind code of ref document: A1 |