TWI497484B - Performance evaluation device, karaoke device, server device, performance evaluation system, performance evaluation method and program

Info

Publication number: TWI497484B
Application number: TW102113839A
Authority: TW
Inventors: Shuichi Matsumoto
Original assignee: Yamaha Corp
Priority date: 2012-04-18
Filing date: 2013-04-18
Publication date: 2015-08-21
Also published as: CN104170006A; WO2013157602A1; KR101666535B1; KR20140124843A; CN104170006B; JP5958041B2; JP2013222140A; TW201407602A

Description

Performance evaluation device, karaoke device, server device, performance evaluation system, performance evaluation method, and program

The present invention relates to a technique for evaluating the quality of a musical performance.

Various techniques have been proposed for singing karaoke apparatuses (hereinafter simply "karaoke apparatus" unless otherwise noted) that have a scoring function for rating the quality of a singer's performance. Patent Document 1 discloses one such technique. For each note of a song, the karaoke apparatus disclosed there calculates the difference between the pitch extracted from the user's singing voice and the pitch extracted from data prepared in advance as a guide melody, and calculates a basic score from that difference. In addition, when the user sings with a technique such as vibrato or an upward glide (hiccup), the apparatus calculates bonus points according to the number of times the technique is used. The apparatus presents the sum of the basic score and the bonus points to the user as the final evaluation result. With this technique, singing that uses difficult techniques such as vibrato or upward glides can be reflected in the evaluation result.

Patent Documents 2 to 6, for example, disclose techniques for detecting, from the waveform of a singing voice, that the singer has used a technique such as vibrato or an upward glide.

[Prior Art Documents] [Patent Documents]

[Patent Document 1] Japanese Patent Laid-Open Publication No. 2005-107334

[Patent Document 2] Japanese Patent Laid-Open Publication No. 2005-107330

[Patent Document 3] Japanese Patent Laid-Open Publication No. 2005-107087

[Patent Document 4] Japanese Patent Laid-Open Publication No. 2008-268370

[Patent Document 5] Japanese Patent Laid-Open Publication No. 2005-107336

[Patent Document 6] Japanese Patent Laid-Open Publication No. 2008-225115

With the technique of Patent Document 1, however, bonus points are added even when a technique such as vibrato or an upward glide is used in a part of the song where such singing is not actually desirable. As a result, the score presented as the evaluation result can diverge from how a listener would perceive the performance.

The present invention has been made in view of this problem, and its object is to make it possible, in the evaluation of a musical performance such as karaoke singing, to present an evaluation result that is closer to human perception.

To solve the above problem, the present invention provides a performance evaluation apparatus comprising: an expressive performance reference data acquisition unit that acquires expressive performance reference data indicating, with reference to the sounding start time of a note or note group contained in a piece of music, an expressive performance that should be carried out in a performance of the piece and the timing at which that expressive performance should be carried out in the piece; a pitch/volume data generation unit that generates, from the performance sound of the piece as played by a performer, pitch/volume data representing the pitch and volume of that performance sound; and a performance evaluation unit that raises the evaluation of the performer's performance of the piece when a characteristic of at least one of the pitch and the volume represented by the pitch/volume data generated by the pitch/volume data generation unit exhibits, within a specific time range of the piece indicated by the expressive performance reference data, the characteristic of the expressive performance that the reference data indicates should be carried out.

The present invention also provides a karaoke apparatus comprising: the above performance evaluation apparatus; an accompaniment data acquisition unit that acquires accompaniment data specifying the accompaniment of a piece of music; and a sound signal output unit that outputs a sound signal representing the accompaniment in accordance with the accompaniment data, wherein the pitch/volume data generation unit generates pitch/volume data representing the pitch and volume of the performance sound of the piece as performed by the performer along with the accompaniment played from a speaker in accordance with the sound signal output from the sound signal output unit.

The present invention also provides a server apparatus comprising: an expressive performance occurrence data acquisition unit that acquires, for each of an arbitrary number of performance sounds of a piece of music performed by arbitrary performers, expressive performance occurrence data indicating that one expressive performance occurred at one timing referenced to the sounding start time of a note or note group contained in the piece; an expressive performance reference data generation unit that, on the basis of the arbitrary number of expressive performance occurrence data acquired by the expressive performance occurrence data acquisition unit, determines for each note or note group contained in the piece which expressive performance occurs at which timing, referenced to the sounding start time of that note or note group, and with what frequency, and generates from the determined information expressive performance reference data indicating, with reference to the sounding start time of the note or note group contained in the piece, the expressive performance that should be carried out in a performance of the piece and the timing at which it should be carried out in the piece; and a transmission unit that transmits the expressive performance reference data generated by the expressive performance reference data generation unit to a performance evaluation apparatus.

The present invention also provides a singing evaluation system comprising: an expressive performance reference data acquisition unit that acquires first expressive performance reference data indicating, with reference to the sounding start time of a note or note group contained in a piece of music, an expressive performance that should be carried out in a performance of the piece and the timing at which it should be carried out in the piece; a pitch/volume data generation unit that generates, from the performance sound of the piece as played by a performer, pitch/volume data representing the pitch and volume of that performance sound; a performance evaluation unit that raises the evaluation of the performer's performance of the piece when a characteristic of at least one of the pitch and the volume represented by the pitch/volume data generated by the pitch/volume data generation unit exhibits, within a specific time range of the piece indicated by the first expressive performance reference data, the characteristic of the expressive performance that the first expressive performance reference data indicates should be carried out; an expressive performance occurrence data acquisition unit that acquires, for each of an arbitrary number of performance sounds of a piece of music performed by arbitrary performers, expressive performance occurrence data indicating that one expressive performance occurred at one timing referenced to the sounding start time of a note or note group contained in the piece performed by the arbitrary performers; and an expressive performance reference data generation unit that, on the basis of the arbitrary number of expressive performance occurrence data acquired by the expressive performance occurrence data acquisition unit, determines for each note or note group contained in the piece performed by the arbitrary performers which expressive performance occurs at which timing, referenced to the sounding start time of that note or note group, and with what frequency, and generates from the determined information second expressive performance reference data indicating, with reference to the sounding start time of the note or note group contained in the piece performed by the arbitrary performers, the expressive performance that should be carried out in a performance of that piece and the timing at which it should be carried out in that piece.

The present invention also provides a performance evaluation method comprising: acquiring expressive performance reference data indicating, with reference to the sounding start time of a note or note group contained in a piece of music, an expressive performance that should be carried out in a performance of the piece and the timing at which it should be carried out in the piece; generating, from the performance sound of the piece as played by a performer, pitch/volume data representing the pitch and volume of that performance sound; and raising the evaluation of the performer's performance of the piece when a characteristic of at least one of the pitch and the volume represented by the pitch/volume data exhibits, within a specific time range of the piece indicated by the expressive performance reference data, the characteristic of the expressive performance that the reference data indicates should be carried out.

The present invention also provides a program that is executable by a computer and causes the computer to execute: an expressive performance reference data acquisition process that acquires expressive performance reference data indicating, with reference to the sounding start time of a note or note group contained in a piece of music, an expressive performance that should be carried out in a performance of the piece and the timing at which it should be carried out in the piece; a pitch/volume data generation process that generates, from the performance sound of the piece as played by a performer, pitch/volume data representing the pitch and volume of that performance sound; and a performance evaluation process that raises the evaluation of the performer's performance of the piece when a characteristic of at least one of the pitch and the volume represented by the pitch/volume data generated in the pitch/volume data generation process exhibits, within a specific time range of the piece indicated by the expressive performance reference data, the characteristic of the expressive performance that the reference data indicates should be carried out.

According to the present invention, a performance evaluation apparatus is realized that gives a performer a higher evaluation when, in the performance of a given piece of music, the desired expressive performance is carried out at the desired timing. As a result, when a performer plays expressively, the evaluation diverges less from human perception.

1‧‧‧Singing evaluation system

10‧‧‧Karaoke apparatus

10-m‧‧‧Karaoke apparatus

11‧‧‧Sound source

12‧‧‧Speaker

13‧‧‧Microphone

14‧‧‧Display unit

15‧‧‧Communication interface

16‧‧‧Sound adapter

17‧‧‧CPU

18‧‧‧RAM

19‧‧‧ROM

20‧‧‧Hard disk

21‧‧‧Sequencer

30‧‧‧Server apparatus

35‧‧‧Communication interface

37‧‧‧CPU

38‧‧‧RAM

39‧‧‧ROM

40‧‧‧Hard disk

90‧‧‧Network

101‧‧‧Expressive performance reference data acquisition unit

102‧‧‧Pitch/volume data generation unit

103‧‧‧Performance evaluation unit

APG‧‧‧Singing analysis program

DBS‧‧‧Singing sample database

DBRK‧‧‧Reference database

DBRS‧‧‧Reference database

HD‧‧‧Header

MD-n‧‧‧Song data

S_A‧‧‧Sound signal

S_I‧‧‧Image signal

S_L‧‧‧Output signal

S_M‧‧‧Sound pickup signal

S_P‧‧‧Output signal

TR_AC‧‧‧Accompaniment track

TR_LY‧‧‧Lyrics track

TR_NR‧‧‧Model singing reference track

VPG‧‧‧Singing evaluation program

Fig. 1 shows the configuration of a singing evaluation system according to an embodiment of the present invention.

Fig. 2 shows the waveform of a singing voice with sustain.

Fig. 3 shows the waveform of a singing voice with vibrato.

Fig. 4 shows the waveform of a singing voice with an ornament.

Fig. 5 shows the waveform of a singing voice with an upward glide.

Fig. 6 shows the waveform of a singing voice with a downward glide.

Fig. 7 is a flowchart showing the operation of the singing evaluation system according to an embodiment of the present invention.

Fig. 8 shows an example of the statistical data generated for sustain.

Fig. 9 shows an example of the statistical data generated for vibrato.

Fig. 10 shows an example of the statistical data generated for ornaments.

Fig. 11 shows an example of the statistical data generated for upward glides.

Fig. 12 shows an example of the statistical data generated for downward glides.

Fig. 13 is a block diagram of the performance evaluation apparatus of the present invention.

Embodiments of the present invention are described below with reference to the drawings.

Fig. 1 shows the configuration of a singing evaluation system 1 according to an embodiment of the present invention. The singing evaluation system 1 has karaoke apparatuses 10-m (m = 1, 2, ..., M, where M is the total number of karaoke apparatuses) and a server apparatus 30. One or more karaoke apparatuses 10-m are installed in each karaoke shop. The server apparatus 30 is installed in a system operation center. The karaoke apparatuses 10-m and the server apparatus 30 are connected to a network 90 and can exchange various data with one another.

The karaoke apparatus 10-m is an apparatus that stages a singing performance by playing back an accompaniment that supports the user's singing and displaying the lyrics, and that evaluates the quality of the user's singing. In evaluating the singing, the karaoke apparatus 10-m carries out two evaluations: one that assesses the quality of the pitch and volume of the user's singing voice, and one that assesses how well the user performs the following five kinds of expressive singing. The score derived from the results of the two evaluations is presented to the user together with a comment message.

A1. Sustain

This is expressive singing in which the onset of a specific note in the song is deliberately delayed. As shown in Fig. 2, when the singer does this, the moment at which the pitch of the voice changes from the preceding note to the note in question lags slightly behind the transition time between the two corresponding notes in the score (the model singing).

B1. Vibrato

This is expressive singing in which a specific note in the song is made to waver finely while its apparent pitch is maintained. As shown in Fig. 3, when the singer does this, the pitch of the singing voice varies periodically above and below the pitch of the corresponding note in the score.

C1. Ornament

This is expressive singing in which the tone of a specific note in the song changes partway through the note in a growl-like manner. As shown in Fig. 4, when the singer does this, the pitch of the singing voice rises momentarily partway through the corresponding note in the score.

D1. Upward glide

This is a way of singing in which a specific note in the song is first voiced lower than its proper pitch and then brought up to the proper pitch. As shown in Fig. 5, when the singer does this, the pitch at the sounding start time of the sung note is lower than the pitch of the corresponding note in the score. The pitch then rises slowly after the onset until it reaches approximately the same pitch as the note.

E1. Downward glide (fall)

This is a way of singing in which a specific note in the song is first voiced higher than its proper pitch and then brought down to the proper pitch. As shown in Fig. 6, when the singer does this, the pitch at the sounding start time of the sung note is higher than the pitch of the corresponding note in the score. The pitch then falls slowly after the onset until it reaches approximately the same pitch as the note.
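The glides of Fig. 5 and Fig. 6 (items D1 and E1) can be illustrated with a minimal Python sketch. It is not the detection method of the invention, which this passage does not specify. The function name, the use of cents as the pitch unit, and both thresholds are assumptions chosen only for illustration.

```python
def classify_glide(pitch_cents, note_cents, start_offset=50.0, settle_tol=20.0):
    """Classify a sung note as an upward glide, a downward glide, or neither.

    pitch_cents: pitch samples of the sung note, in cents, starting at its onset.
    note_cents:  pitch of the corresponding score note, in cents.
    start_offset, settle_tol: hypothetical thresholds, not taken from the patent.
    """
    if len(pitch_cents) < 2:
        return None
    start_dev = pitch_cents[0] - note_cents   # deviation at the onset
    end_dev = pitch_cents[-1] - note_cents    # deviation at the last sample
    settled = abs(end_dev) <= settle_tol      # voice ended close to the note
    if settled and start_dev <= -start_offset:
        return "upward glide"                 # started low, came up to the note
    if settled and start_dev >= start_offset:
        return "downward glide"               # started high, came down to the note
    return None
```

The sketch looks only at the first and last pitch samples of a note. A practical detector would also need to check that the approach to the note is gradual.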

Returning to Fig. 1, the description of the singing evaluation system 1 as a whole continues. The karaoke apparatus 10-m has a sound source 11, a speaker 12, a microphone 13, a display unit 14, a communication interface 15, a sound adapter 16, a CPU (Central Processing Unit) 17, a RAM (Random Access Memory) 18, a ROM (Read Only Memory) 19, a hard disk 20, and a sequencer 21. The sound source 11 outputs a sound signal S_A in accordance with various MIDI (Musical Instrument Digital Interface) messages. The speaker 12 plays the signal supplied to it as sound. The microphone 13 picks up sound and outputs a sound pickup signal S_M. The display unit 14 displays an image corresponding to an image signal S_I. The communication interface 15 exchanges data with devices connected to the network 90.

The sound adapter 16 measures the pitch and volume of the signal S_M and generates pitch/volume data representing how they change over time. Specifically, the sound adapter 16 (corresponding to the pitch/volume data acquisition unit) detects the pitch of the signal S_M supplied from the microphone 13 every period T_S (for example, T_S = 30 milliseconds) and outputs the detection result as a signal S_P. The sound adapter 16 likewise detects the volume of the signal S_M supplied from the microphone 13 every period T_S and outputs the detection result as a signal S_L.
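The frame-by-frame measurement performed by the sound adapter 16 can be sketched as follows. The text does not say how pitch or volume is detected, so the autocorrelation peak search, the RMS volume measure, the function name, and the search limits fmin and fmax are all assumptions. Only the 30-millisecond period comes from the description.

```python
import math

def frame_pitch_and_volume(samples, sample_rate, frame_sec=0.030,
                           fmin=80.0, fmax=1000.0):
    """Return one (pitch_hz, rms) pair per frame of length frame_sec.

    Pitch is estimated with a plain autocorrelation peak search, which is
    only an illustrative choice; fmin and fmax are assumed search limits.
    pitch_hz is None for frames with no clear periodicity (e.g. silence).
    """
    frame_len = int(sample_rate * frame_sec)
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(frame_len - 1, int(sample_rate / fmin))
    results = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame)
        rms = math.sqrt(energy / frame_len)        # plays the role of S_L
        best_lag, best_corr = None, 0.0
        for lag in range(min_lag, max_lag + 1):
            corr = sum(frame[i] * frame[i + lag] for i in range(frame_len - lag))
            if corr > best_corr:
                best_lag, best_corr = lag, corr
        # Accept the peak only if it carries a reasonable share of the energy.
        voiced = best_lag is not None and best_corr > 0.3 * energy
        pitch = sample_rate / best_lag if voiced else None   # role of S_P
        results.append((pitch, rms))
    return results
```

For a 200 Hz sine sampled at 8000 Hz, each 30 ms frame holds 240 samples and the strongest autocorrelation peak falls at a lag of 40 samples, which corresponds to 200 Hz.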

The CPU 17 uses the RAM 18 as a work area and executes programs stored in the ROM 19 or on the hard disk 20. The operation of the CPU 17 is described in detail later. The ROM 19 stores an IPL (Initial Program Loader) and the like. The hard disk 20 stores song data MD-n (n = 1 to N, where N is the total number of songs) for various songs, a reference database DBRK, and a singing evaluation program VPG. The song data MD-n of each song records the accompaniment, the lyrics, and the model singing of the song in SMF (Standard MIDI File) format.

More specifically, as shown in the box in Fig. 1, the song data MD-n contains a header HD, an accompaniment track TR_AC, a lyrics track TR_LY, and a model singing reference track TR_NR. The header HD holds information such as the song number, song title, genre, playing time, and time base (the number of ticks corresponding to the duration of one quarter note).

The accompaniment track TR_AC lists, in chronological order, events EV(i)_ON that instruct the sounding of each note NT(i) of the accompaniment part of the song's score (i denotes the position counted from the first note NT(1) of that part), events EV(i)_OFF that instruct its silencing, and delta times DT that give the difference in execution time (in ticks) between consecutive events.

The lyrics track TR_LY lists, in chronological order, data D_LY representing the lyrics of the song and delta times DT representing the display time of each lyric (more specifically, the time difference, in ticks, between the display time of each lyric and that of the preceding lyric).

The model singing reference track TR_NR lists, in chronological order, events EV(i)_ON that instruct the sounding of each note NT(i) of the vocal part of the song's score, events EV(i)_OFF that instruct its silencing, and delta times DT that give the difference in execution time (in ticks) between consecutive events.
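Because each track stores delta times rather than absolute times, a reader has to accumulate DT to find when each note NT(i) starts and ends. The sketch below shows that accumulation for a note track such as TR_AC or TR_NR. The TrackEvent class, its field names, and the use of MIDI note numbers for pitch are assumptions made for illustration and are not part of the described data format.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TrackEvent:
    delta_ticks: int   # delta time DT: ticks since the previous event
    kind: str          # "ON" for EV(i)_ON, "OFF" for EV(i)_OFF
    note_index: int    # i of the note NT(i)
    pitch: int         # MIDI note number (assumed field)

def note_spans(events: List[TrackEvent]) -> List[Tuple[int, int, int, int]]:
    """Return (note_index, pitch, start_tick, end_tick) for each completed note."""
    now = 0            # absolute tick position, built up from the delta times
    open_notes = {}    # note_index -> (pitch, start_tick) for sounding notes
    spans = []
    for ev in events:
        now += ev.delta_ticks
        if ev.kind == "ON":
            open_notes[ev.note_index] = (ev.pitch, now)
        elif ev.kind == "OFF" and ev.note_index in open_notes:
            pitch, start = open_notes.pop(ev.note_index)
            spans.append((ev.note_index, pitch, start, now))
    return spans
```

With a time base of 480 ticks per quarter note, a quarter note followed immediately by an eighth note yields the spans (0, 480) and (480, 720).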

The reference database DBRK (corresponding to the expressive performance reference data generation unit) stores five kinds of expressive singing reference data: DD_a1, DD_a2, DD_a3, DD_a4, and DD_a5. The expressive singing reference data DD_a1 consists of pairs, each made up of a time t on a time axis whose reference point t_BS is the sounding start time of a note NT(i) contained in the song and the evaluation score VSR(t) awarded when sustain singing is performed at that time t. The expressive singing reference data DD_a2, DD_a3, DD_a4, and DD_a5 consist of pairs of the same form, giving the evaluation score VSR(t) awarded when vibrato singing, ornament singing, upward-glide singing, and downward-glide singing, respectively, is performed at time t. Hereinafter, the five kinds of expressive singing reference data DD_a1, DD_a2, DD_a3, DD_a4, and DD_a5 are referred to simply as expressive singing reference data DD when there is no need to distinguish between them.
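One way to picture the expressive singing reference data DD is as a sorted list of (t, VSR(t)) pairs for a given note, with t measured from the reference point t_BS. The sketch below is a hypothetical representation, not the stored format. The example values are invented, and the nearest-time lookup is an assumed policy for times that fall between stored entries.

```python
from bisect import bisect_left

# Hypothetical reference data for one note: (t, VSR(t)) pairs, with t in
# seconds relative to the note's sounding start time t_BS. Values are made up.
DD_EXAMPLE = [(-0.10, 0.0), (0.00, 1.0), (0.10, 3.0), (0.20, 1.5), (0.30, 0.0)]

def lookup_vsr(dd, t):
    """Return the VSR of the stored time closest to t (nearest-neighbour lookup).

    dd must be sorted by time. Times before the first entry or after the
    last entry map to the first or last stored score, respectively.
    """
    times = [pair[0] for pair in dd]
    pos = bisect_left(times, t)
    if pos == 0:
        return dd[0][1]
    if pos == len(dd):
        return dd[-1][1]
    before, after = dd[pos - 1], dd[pos]
    return before[1] if t - before[0] <= after[0] - t else after[1]
```

In this invented example the score peaks 0.10 seconds after the note onset, so an expression detected near that time earns more than one detected at the onset or well after it.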

The singing evaluation program VPG has the following three functions.

A2. Standard evaluation function

This function compares the pitch and volume indicated by the output signals S_P and S_L of the sound adapter 16 with the model pitch PCH_REF and model volume LV_REF of each note NT(i), as determined by the events EV(i)_ON and EV(i)_OFF in the model singing reference track TR_NR (corresponding to the model performance reference data acquisition unit), and evaluates the quality of the singing from the result of the comparison.

B2. Expression singing evaluation function

This function works as follows. Each time a waveform characteristic of an expressive singing technique appears in the pitch waveform indicated by the output signal S_P of the sound adapter 16, the function determines the time at which that characteristic waveform appears on the time axis whose reference point t_BS is the sounding start time of the note NT(i) that is the target of the expressive singing. It then selects, from the evaluation scores VSR(t) of the corresponding expressive singing reference data DD in the reference database DBRK, the evaluation score VSR(t) for that time of appearance, and evaluates the quality of the singing on the basis of that score.

C2. Evaluation result presentation function

This function calculates a score from the result of the evaluation performed by A2 and the result of the evaluation performed by B2, and displays the score on the display unit 14 together with a comment message.
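As a rough illustration of the comparison behind the standard evaluation function A2, the sketch below scores pitch only. This passage does not say how the comparison result is turned into a score, so the hit-ratio measure, the half-semitone tolerance, the function name, and the representation of notes as (start, end, model pitch) tuples are all assumptions.

```python
def standard_pitch_score(frames, notes, frame_sec=0.030, tol_semitones=0.5):
    """Fraction of judged frames whose pitch is within tol_semitones of the model.

    frames: one detected pitch per frame, as a MIDI note number (float) or
            None when no pitch was detected; frame k starts at k * frame_sec.
    notes:  (start_sec, end_sec, model_pitch) tuples; model_pitch plays the
            role of PCH_REF. Frames that fall outside every note are ignored.
    """
    judged = hits = 0
    for k, pitch in enumerate(frames):
        t = k * frame_sec
        for start, end, model_pitch in notes:
            if start <= t < end:
                judged += 1   # the frame lies inside a model note
                if pitch is not None and abs(pitch - model_pitch) <= tol_semitones:
                    hits += 1
                break
    return hits / judged if judged else 0.0
```

A volume comparison against LV_REF could be added in the same way. How the result would be combined with the expressive singing evaluation of B2 is left open here, because the text does not specify it.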

When the song data MD-n of a song is transferred from the hard disk 20 to the RAM 18 in response to a singing start operation for that song made with a remote controller (not shown), the sequencer 21 supplies the events EV(i)_ON and EV(i)_OFF and the data D_LY in the song data MD-n to the various parts of the apparatus. Specifically, once the song data MD-n is stored in the RAM 18, the sequencer 21 determines the duration of one tick from the time base given in the header HD of the song data MD-n and the tempo specified with the remote controller (not shown), counts ticks as each such duration elapses, and carries out the following three processes.

In the first process, each time the tick count matches a delta time DT in the accompaniment track TR_AC, the sequencer 21 reads the next event EV(i)_ON (or EV(i)_OFF) and supplies it to the sound source 11. When an event EV(i)_ON is supplied from the sequencer 21, the sound source 11 supplies the sound signal S_A specified by that event to the speaker 12; when an event EV(i)_OFF is supplied from the sequencer 21, the sound source 11 stops supplying the sound signal S_A to the speaker 12.

In the second process, each time the tick count matches a difference time DT in the lyrics track TR_LY, the sound storage 21 (corresponding to the lyrics data acquisition means) reads the subsequent data D_LY and supplies it to the display unit 14 (corresponding to the image signal output means). When the data D_LY is supplied from the sound storage 21, the display unit 14 converts it into an image of lyrics subtitles and shows that image on a display (not shown).

Through the first and second processes performed by the sound storage 21, the accompaniment is played from the speaker 12 and the lyrics are shown on the display. The user sings the lyrics shown on the display into the microphone 13 while listening to the accompaniment played from the speaker 12. While the user sings into the microphone 13, the microphone 13 outputs a picked-up signal S_M of the user's singing voice, and the sound adapter 16 outputs signals S_P and S_L indicating the pitch and volume, respectively, of the signal S_M.

In the third process, each time the tick count matches a difference time DT in the template singing reference track TR_NR, the sound storage 21 reads the subsequent event EV(i)_ON (or EV(i)_OFF) and supplies it to the CPU 17. The CPU 17 evaluates the quality of the user's singing using the events EV(i)_ON and EV(i)_OFF supplied from the sound storage 21 and the output signals S_P and S_L of the sound adapter 16. Details are described below.
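The tick counting that drives all three processes can be sketched as follows. This is only an illustration under stated assumptions: the time base is taken to be ticks per quarter note, the tempo to be quarter notes per minute, and each difference time DT to be measured from the previous event, as in Standard MIDI Files. The text does not state these units, and the function name `schedule_events` is hypothetical.

```python
from typing import Iterable, List, Tuple


def schedule_events(
    track: Iterable[Tuple[int, str]],
    time_base: int,
    tempo_bpm: float,
) -> List[Tuple[float, str]]:
    """Convert (difference time DT in ticks, event) pairs to absolute seconds."""
    if time_base <= 0 or tempo_bpm <= 0:
        raise ValueError("time_base and tempo_bpm must be positive")
    # Assumed units: ticks per quarter note and quarter notes per minute.
    tick_seconds = 60.0 / (tempo_bpm * time_base)
    elapsed_ticks = 0
    scheduled = []
    for delta_ticks, event in track:
        # Each DT is assumed to be measured from the previous event.
        elapsed_ticks += delta_ticks
        scheduled.append((elapsed_ticks * tick_seconds, event))
    return scheduled
```

Under these assumptions, the same routine would serve the accompaniment track, the lyrics track, and the reference track; only the consumer of the events differs.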

The server device 30 supports karaoke stores from the server side. The server device 30 has a communication interface 35, a CPU 37, a RAM 38, a ROM 39, and a hard disk 40. The communication interface 35 exchanges data with devices connected to the network 90. The CPU 37 uses the RAM 38 as a work area and executes various programs stored in the ROM 39 or the hard disk 40. The operation of the CPU 37 is described in detail below. The ROM 39 stores an IPL and the like.

The hard disk 40 stores a singing sample database DBS, a reference database DBRS, and a singing analysis program APG. The singing sample database DBS individually stores groups of singing sample data DS, each group corresponding to one song. The singing sample data DS records the pitch waveform and volume waveform of the singing voice produced when a person with at least a certain level of singing ability sang the song. The reference database DBRS stores the latest expression singing reference data DD that should be held in the reference database DBRK of each karaoke apparatus 10-m.

The singing analysis program APG has the following three functions.

A3. Storage function

This function acquires singing sample data DS for each song from the karaoke apparatuses 10-m, one song's worth at a time, and stores the acquired singing sample data DS in the singing sample database DBS.

B3. Overwrite function

This function works as follows. For each item of singing sample data DS stored in the singing sample database DBS, it searches the waveform indicated by that singing sample data DS for characteristic waveforms of expression singing. From the search results it generates statistical data representing the relationship between each time t, on a time axis whose reference point t_BS is the sounding start time of the note NT(i) targeted by the expression singing, and the number of occurrences Num of the expression singing at that time t. Based on the content of the statistical data, it overwrites the evaluation score VSR(t) corresponding to each time t in the expression singing reference data DD in the reference database DBR.

C3. Send function

This function transmits the expression singing reference data DD overwritten by the overwrite function to a karaoke apparatus 10-m in response to a request from that karaoke apparatus 10-m.

Next, the operation of this embodiment is described. Fig. 7 is a flowchart showing the operation of this embodiment. In Fig. 7, when a singing start operation for a song is performed (S100: Yes), the CPU 17 of the karaoke apparatus 10-m supplies a control signal S₀ to the sound storage 21 to make it start processing (the first to third processes described above) (S120). Once the sound storage 21 has started processing, the CPU 17 performs two processes: standard singing evaluation processing (S130) and expression singing evaluation processing (S140). The details of these two processes are as follows.

A4. Standard singing evaluation processing (S130)

In this processing, the CPU 17 treats the time from when the sound storage 21 supplies the event EV(i)_ON until it supplies the next event EV(i)_OFF as the sounding time T_NT(i) of the sound corresponding to the i-th note NT(i). The CPU 17 obtains the difference PCH_DEF between the pitch indicated by the output signal S_P of the sound adapter 16 during the sounding time T_NT(i) and the template pitch PCH_REF obtained by converting the note number of the event EV(i)_ON, and the difference LV_DEF between the volume indicated by the signal S_L during that time and the template volume LV_REF obtained by converting the velocity of the event EV(i)_ON. When the difference PCH_DEF and the difference LV_DEF both fall within specified ranges, the CPU 17 judges that the singing of the note NT(i) has passed. The CPU 17 performs this note judgment from the start to the end of the user's singing. It then divides the number of notes NT(i) judged as passing by the total number of notes NT(i) as of the end of singing, multiplies the result by 100, and takes that value as the basic score SR_BASE.
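The note judgment and basic score can be illustrated with a minimal sketch. It assumes that one pitch difference and one volume difference have already been reduced to a single value per note, and that the "specified range" is a symmetric tolerance around zero. The source does not state either point, and the function and parameter names are hypothetical.

```python
from typing import Sequence, Tuple


def basic_score(
    note_diffs: Sequence[Tuple[float, float]],
    pitch_tol: float,
    level_tol: float,
) -> float:
    """Return SR_BASE: the percentage of notes whose differences are in range.

    note_diffs holds one (PCH_DEF, LV_DEF) pair per note NT(i).
    """
    if not note_diffs:
        return 0.0
    passed = sum(
        1
        for pch_def, lv_def in note_diffs
        # Assumed rule: a note passes when both differences are within tolerance.
        if abs(pch_def) <= pitch_tol and abs(lv_def) <= level_tol
    )
    return passed / len(note_diffs) * 100
```

For example, with four notes of which two stay within both tolerances, the sketch yields a basic score of 50.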

Also in this processing, the CPU 17 judges whether the characteristic waveform of any type of expression singing (sustain, vibrato, grace note, upward glide, or downward glide) appears in the pitch waveform indicated by the output signal S_P of the sound adapter 16. For details of the method of judging the characteristic waveform, see Patent Document 2 for sustain, Patent Document 3 for vibrato, Patent Document 4 for grace notes, Patent Document 5 for upward glides, and Patent Document 6 for downward glides. The CPU 17 performs this characteristic-waveform judgment from when the user starts singing until the user finishes, and takes as the additional score SR_ADD the value obtained by multiplying the number of occurrences of expression singing as of the end of singing by a specified coefficient. In this processing, the sum of the basic score SR_BASE and the additional score SR_ADD is taken as the standard score SR_NOR.
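The arithmetic for the standard score is simple enough to state directly. The sketch below only restates the relation described above; the coefficient value is not given in the source, so it is passed in as a parameter, and the function name is hypothetical.

```python
def standard_score(sr_base: float, expression_count: int, coefficient: float) -> float:
    """Return SR_NOR = SR_BASE + SR_ADD, where SR_ADD = count * coefficient."""
    if expression_count < 0:
        raise ValueError("expression_count must not be negative")
    # SR_ADD rewards the number of detected expression singing occurrences,
    # regardless of when they occurred.
    sr_add = expression_count * coefficient
    return sr_base + sr_add
```

Note that this score depends only on how often expression singing occurs, not on its timing, which is the gap the expression singing evaluation processing is meant to address.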

B4. Expression singing evaluation processing (S140)

In this processing, the CPU 17 treats the time from the output of the event EV(i)_ON until the output of the next event EV(i)_OFF as the sounding time T_NT(i) of the sound corresponding to the i-th note NT(i). When a characteristic waveform of expression singing appears in the pitch waveform indicated by the output signal S_P of the sound adapter 16 during the sounding time T_NT(i), the CPU 17 determines the time at which the expression singing appeared within the sounding time T_NT(i) and the type of expression singing that appeared. The CPU 17 generates expression singing occurrence data indicating the type and occurrence time thus identified.

Then, from the series of evaluation scores VSR(t) indicated by the expression singing reference data DD, the CPU 17 selects the evaluation score VSR(t) corresponding to the expression singing type and occurrence time indicated by the generated expression singing occurrence data. The CPU 17 performs this selection of evaluation scores VSR(t) from when the user starts singing until the user finishes, and takes the average of the evaluation scores VSR(t) selected as of the end of singing as the expression score SR_EX.
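The selection and averaging of evaluation scores can be sketched as a table lookup. The data layout here is an assumption: the reference data is modeled as one table per expression type, keyed by a discrete time offset from the reference point t_BS, and an occurrence at a time with no entry is scored 0. None of these details, nor the names used, come from the source.

```python
from typing import Dict, List, Tuple


def expression_score(
    occurrences: List[Tuple[str, int]],
    reference: Dict[str, Dict[int, float]],
) -> float:
    """Return SR_EX: the mean VSR(t) over all detected occurrences.

    occurrences holds (expression type, time offset from t_BS) pairs.
    reference maps expression type -> {time offset: VSR(t)}.
    """
    selected = [
        # Assumed fallback: no table entry means an evaluation score of 0.
        reference.get(kind, {}).get(offset, 0.0)
        for kind, offset in occurrences
    ]
    if not selected:
        return 0.0
    return sum(selected) / len(selected)
```

With this layout, an occurrence at a well-timed offset contributes a high score and a badly timed one drags the average down, which matches the behavior described for SR_EX.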

When the user finishes singing the song, the CPU 17 performs evaluation result presentation processing (S150). In this processing, the CPU 17 selects the higher of the standard score SR_NOR produced by the standard singing evaluation processing and the expression score SR_EX produced by the expression singing evaluation processing. When the CPU 17 selects the standard score SR_NOR, it displays that score on the display unit 14 together with a comment message corresponding to the score SR_NOR, such as "What a cool, polished song." When the CPU 17 selects the expression score SR_EX, it displays that score on the display unit 14 together with a comment message corresponding to the score SR_EX, such as "So full of feeling."
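A minimal sketch of the selection in S150 follows. It assumes a tie goes to the standard score and uses one fixed message per score type; the source says the message corresponds to the score but does not say how, so both choices and the function name are illustrative only.

```python
from typing import Tuple


def select_result(sr_nor: float, sr_ex: float) -> Tuple[str, float, str]:
    """Return (score name, score value, comment) for the higher score."""
    # Assumed tie-break: prefer the standard score when both are equal.
    if sr_nor >= sr_ex:
        return ("SR_NOR", sr_nor, "What a cool, polished song.")
    return ("SR_EX", sr_ex, "So full of feeling.")
```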

Next, the CPU 17 performs sample transmission processing (S160). In this processing, the CPU 17 takes the signals S_P and S_L output by the sound adapter 16 from the start to the end of the singing of the song as the singing sample data DS for that song, and transmits to the server device 30 a message MS1 containing the singing sample data DS and the basic score SR_BASE (singing evaluation data) obtained in step S130.

When the message MS1 is received from a karaoke apparatus 10-m (S200: Yes), the CPU 37 of the server device 30 extracts the singing sample data DS and the basic score SR_BASE from the message MS1 and compares the basic score SR_BASE with a reference score SR_TH (for example, 80 points) that separates advanced singers from other singers (S220). When the basic score SR_BASE is higher than the reference score SR_TH (S220: Yes), the CPU 37 stores the singing sample data DS extracted from the message MS1 in the singing sample database DBS (S230).
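The filtering in S220 and S230 can be sketched as below. The database is modeled as a plain dictionary from a song identifier to a list of samples; that layout, the `song_id` key, and the function name are assumptions for illustration. The strict "higher than" comparison and the example threshold of 80 follow the text.

```python
from typing import Dict, List


def store_if_advanced(
    database: Dict[str, List[dict]],
    song_id: str,
    sample: dict,
    sr_base: float,
    sr_th: float = 80.0,
) -> bool:
    """Store the sample only when SR_BASE exceeds SR_TH; report whether it was stored."""
    if sr_base > sr_th:
        # Only samples from singers above the threshold feed the statistics.
        database.setdefault(song_id, []).append(sample)
        return True
    return False
```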

Next, the CPU 37 (corresponding to the expression performance occurrence data generation means) performs overwrite processing (S240). In the overwrite processing, the CPU 37 performs the following five processes. In the first process, the CPU 37 searches the pitch waveform indicated by each item of singing sample data DS stored in the singing sample database DBS for the characteristic waveform of sustain, and generates expression singing occurrence data indicating the search result (data indicating each time t on a time axis whose reference point t_BS is the sounding start time of the note NT(i) in which the sustain appears). Next, based on the expression singing occurrence data generated for sustain, the CPU 37 generates statistical data representing the relationship between each time t on the time axis whose reference point t_BS is the sounding start time of the note NT(i) and the number of occurrences Num of the expression singing "sustain" at that time t. Based on the content of the statistical data, it overwrites the evaluation score VSR(t) corresponding to each time t in the expression singing reference data DD_a1.

Fig. 8 shows an example of the statistical data for sustain. In this example, the occurrences Num of the expression singing are distributed between a time t1_a1, which precedes the reference point t_BS by a time T1_a1, and a time t4_a1, which follows the reference point t_BS by a time T4_a1. The maximum peak of Num appears at a time t2_a1 immediately after the reference point t_BS, and the second peak of Num appears at a time t3_a1 later than the time t2_a1. Consequently, in the expression singing reference data DD_a1 overwritten using the statistical data of this example, the evaluation score VSR(t2_a1) at the time t2_a1 is the highest and the evaluation score VSR(t3_a1) at the time t3_a1 is the second highest.
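One way to turn occurrence times into evaluation scores is sketched below. The source only says that the scores are overwritten based on the statistical data and that the highest peak receives the highest score; the linear scaling to a maximum of 100 used here is an assumption, as are the function names and the use of integer time bins relative to t_BS.

```python
from collections import Counter
from typing import Dict, Iterable


def build_statistics(occurrence_times: Iterable[int]) -> Dict[int, int]:
    """Count occurrences Num for each time bin t relative to t_BS."""
    return dict(Counter(occurrence_times))


def scores_from_statistics(
    stats: Dict[int, int],
    max_score: float = 100.0,
) -> Dict[int, float]:
    """Map each time bin to VSR(t), scaled so the largest peak gets max_score."""
    if not stats:
        return {}
    peak = max(stats.values())
    # Assumed mapping: VSR(t) is proportional to Num at that time.
    return {t: max_score * num / peak for t, num in stats.items()}
```

With occurrence times clustered just after the reference point, the bin holding the largest count receives the top score and the second peak receives the second-highest score, mirroring the ordering described for Fig. 8.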

In the second process, the CPU 37 searches the pitch waveform indicated by each item of singing sample data DS stored in the singing sample database DBS for the characteristic waveform of vibrato, and generates expression singing occurrence data indicating the search result (data indicating each time t on a time axis whose reference point t_BS is the sounding start time of the note NT(i) in which the vibrato appears). Next, based on the expression singing occurrence data generated for vibrato, the CPU 37 generates statistical data representing the relationship between each time t on the time axis whose reference point t_BS is the sounding start time of the note NT(i) and the number of occurrences Num of the expression singing at that time t. Based on the content of the statistical data, it overwrites the evaluation score VSR(t) corresponding to each time t in the expression singing reference data DD_a2.

Fig. 9 shows an example of the statistical data for vibrato. In this example, the occurrences Num of the expression singing are distributed between the reference point t_BS and a time t2_a2, which follows the reference point t_BS by a time T2_a2. The maximum peak of Num appears at a time t1_a2, which follows the reference point t_BS by a time T1_a2. Consequently, in the expression singing reference data DD_a2 overwritten using the statistical data of this example, the evaluation score VSR(t1_a2) at the time t1_a2 is the highest.

In the third process, the CPU 37 searches the pitch waveform indicated by each item of singing sample data DS stored in the singing sample database DBS for the characteristic waveform of a grace note, and generates expression singing occurrence data indicating the search result (data indicating each time t on a time axis whose reference point t_BS is the sounding start time of the note NT(i) in which the grace note appears). Next, based on the expression singing occurrence data generated for grace notes, the CPU 37 generates statistical data representing the relationship between each time t on the time axis whose reference point t_BS is the sounding start time of the note NT(i) and the number of occurrences Num of the expression singing at that time t. Based on the content of the statistical data, it overwrites the evaluation score VSR(t) corresponding to each time t in the expression singing reference data DD_a3.

Fig. 10 shows an example of the statistical data for grace notes. In this example, the occurrences Num of the expression singing are distributed between the reference point t_BS and a time t2_a3, which follows the reference point t_BS by a time T2_a3. The maximum peak of Num appears at a time t1_a3, which follows the reference point t_BS by a time T1_a3. Consequently, in the expression singing reference data DD_a3 overwritten using the statistical data of this example, the evaluation score VSR(t1_a3) at the time t1_a3 is the highest.

In the fourth process, the CPU 37 searches the pitch waveform indicated by each item of singing sample data DS stored in the singing sample database DBS for the characteristic waveform of an upward glide, and generates expression singing occurrence data indicating the search result (data indicating each time t on a time axis whose reference point t_BS is the sounding start time of the note NT(i) in which the upward glide appears). Next, based on the expression singing occurrence data generated for upward glides, the CPU 37 generates statistical data representing the relationship between each time t on the time axis whose reference point t_BS is the sounding start time of the note NT(i) and the number of occurrences Num of the expression singing at that time t. Based on the content of the statistical data, it overwrites the evaluation score VSR(t) corresponding to each time t in the expression singing reference data DD_a4.

Fig. 11 shows an example of the statistical data for upward glides. In this example, the occurrences Num of the expression singing are distributed between the reference point t_BS and a time t2_a4, which follows the reference point t_BS by a time T2_a4. The maximum peak of Num appears at the reference point t_BS, and the second peak of Num appears at a time t1_a4, which follows the reference point t_BS by a time T1_a4. Consequently, in the expression singing reference data DD_a4 overwritten using the statistical data of this example, the evaluation score VSR(t_BS) at the time t_BS is the highest and the evaluation score VSR(t1_a4) at the time t1_a4 is the second highest.

In the fifth process, the CPU 37 searches the pitch waveform indicated by each item of singing sample data DS stored in the singing sample database DBS for the characteristic waveform of a downward glide, and generates expression singing occurrence data indicating the search result (data indicating each time t on a time axis whose reference point t_BS is the sounding start time of the note NT(i) in which the downward glide appears). Next, based on the expression singing occurrence data generated for downward glides, the CPU 37 generates statistical data representing the relationship between each time t on the time axis whose reference point t_BS is the sounding start time of the note NT(i) and the number of occurrences Num of the expression singing at that time t. Based on the content of the statistical data, it overwrites the evaluation score VSR(t) corresponding to each time t in the expression singing reference data DD_a5.

Fig. 12 shows an example of the statistical data for downward glides. In this example, the occurrences Num of the expression singing are distributed between a time t1_a5, which follows the reference point t_BS by a time T1_a5, and a time t2_a5, which follows the reference point t_BS by a time T2_a5. The maximum peak of Num appears at the time t2_a5. Consequently, in the expression singing reference data DD_a5 overwritten using the statistical data of this example, the evaluation score VSR(t2_a5) at the time t2_a5 is the highest.

In Fig. 7, the CPU 17 of the karaoke apparatus 10-m performs inquiry processing (S170) each time a predetermined inquiry time arrives (S110: Yes). In the inquiry processing, the CPU 17 transmits to the server device 30 a message MS2 requesting transmission of the latest data (S170). When the message MS2 is received from the karaoke apparatus 10-m (S210: Yes), the CPU 37 of the server device 30 transmits the expression singing reference data DD whose content was overwritten between the reception of the previous message MS2 and the reception of the current message MS2 to the karaoke apparatus 10-m that sent the message MS2 (S250). On receiving the expression singing reference data DD from the server device 30, the CPU 17 of the karaoke apparatus 10-m overwrites the reference database DBRK with that expression singing reference data DD to update its content (S180).
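The incremental update in S210 and S250 can be sketched with a small in-memory model. Everything here is hypothetical scaffolding: the class name, the per-client bookkeeping of the last request time, and the use of numeric timestamps are assumptions chosen to illustrate "send only what was overwritten since the previous MS2", not a description of the actual server.

```python
from typing import Dict


class ReferenceServer:
    """Toy stand-in for the server side of steps S210 and S250."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}
        self._updated_at: Dict[str, float] = {}
        self._last_request: Dict[str, float] = {}

    def overwrite(self, key: str, reference_data: dict, now: float) -> None:
        """Record overwritten reference data and when it was overwritten."""
        self._data[key] = reference_data
        self._updated_at[key] = now

    def handle_ms2(self, client_id: str, now: float) -> Dict[str, dict]:
        """Return the reference data overwritten since this client's last MS2."""
        since = self._last_request.get(client_id, float("-inf"))
        self._last_request[client_id] = now
        return {
            key: self._data[key]
            for key, stamp in self._updated_at.items()
            if since < stamp <= now
        }
```

A client that polls twice without an intervening overwrite gets an empty reply the second time, while a client polling for the first time receives everything stored so far.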

The above are the details of the configuration of this embodiment. This embodiment provides the following effects.

First, in the expression singing evaluation processing of this embodiment, each time a characteristic waveform of expression singing appears in the waveform of the output signal of the sound adapter 16, the occurrence time of that characteristic waveform is determined on a time axis whose reference point is the sounding start time of the note NT(i) targeted by the expression singing. The evaluation score VSR(t) corresponding to that occurrence time is selected from among the evaluation scores VSR(t) in the singing reference data DD, and the quality of the singing is evaluated based on the selected evaluation score VSR(t). Thus, according to this embodiment, even if the user performs expression singing, a good evaluation cannot be obtained if its timing is inappropriate. This embodiment can therefore present evaluation results that are closer to human perception.

Second, in this embodiment, for each item of singing sample data DS stored in the singing sample database DBS, the waveform indicated by that data DS is searched for characteristic waveforms of expression singing. From the search results, statistical data is generated that represents the relationship between each time on a time axis whose reference point is the sounding start time of the note NT(i) targeted by the expression singing and the number of occurrences of the expression singing at that time. Based on the content of the statistical data, the evaluation score VSR(t) corresponding to each time in the singing reference data DD is overwritten. Thus, according to this embodiment, changes in the trends of how advanced singers perform songs can be reflected in the evaluation results.

An embodiment of the present invention has been described above, but the present invention may also take other embodiments, for example as follows.

(1) In the above embodiment, the CPU 17 detects five types of expression singing (sustain, vibrato, grace note, upward glide, and downward glide) from the output signal S_P of the sound adapter 16. However, expression singing other than these five types may also be detected. For example, singing with pronounced inflection may be detected.

(2) In the above embodiment, the CPU 17 performs the standard singing evaluation processing using both output signals S_P and S_L of the sound adapter 16, and performs the expression singing evaluation processing using only the signal S_P, which indicates pitch. However, the CPU 17 may perform the standard singing evaluation processing using only one of the signals S_P and S_L. The CPU 17 may also perform the expression singing evaluation processing using both signals S_P and S_L.

(3) In the expression singing evaluation processing of the above embodiment, the quality of the singing is evaluated based on the occurrence time of the characteristic waveform of expression singing. However, the evaluation may also take into account factors other than the occurrence time of the characteristic waveform (for example, the length or depth of each sustain, vibrato, grace note, upward glide, or downward glide).

(4) The expression singing evaluation processing of the above embodiment detects expression singing that appears in the singing voice corresponding to each individual note in the song. However, it may instead detect expression singing that appears in the singing voice corresponding to a series of multiple notes (a note group) in the song. For example, expression singing such as crescendo and decrescendo is performed across a series of multiple notes, so it is preferable to detect and evaluate such expression singing in units of note groups. Accordingly, the expression singing reference data DD for such expression singing is also preferably organized in units of note groups.

(5) The above embodiment adopts a configuration in which the karaoke apparatus 10 transmits to the server device 30 the singing sample data DS (pitch and volume data) containing the signals S_P and S_L output by the sound adapter 16 from the start to the end of the singing of a song, and the server device 30 detects each type of expression singing and identifies its occurrence timing from the singing sample data DS. Instead, a configuration may be adopted in which the karaoke apparatus 10 transmits to the server device 30 the sound signal S_M representing the sound picked up by the microphone 13 (sound waveform data representing the singing voice), and the server device 30 generates the signals S_P and S_L from the sound signal S_M (the processing performed by the sound adapter 16 in the above embodiment). Alternatively, a configuration may be adopted in which the karaoke apparatus 10 transmits to the server device 30 data indicating the type and occurrence timing of the expression singing identified during the expression singing evaluation processing (S140) performed according to the singing evaluation program VPG (expression singing occurrence data). In that case the server device 30 does not perform expression singing detection itself, and updates the expression singing reference data DD based on the expression singing occurrence data sent from the karaoke apparatus 10.

(6) In the above embodiment, the server device 30 generates the statistical data and overwrites the expressive-singing reference data DD on the basis of that data. However, each karaoke device 10-m may store in advance on the hard disk 20 any of the following: sound signals S_M representing singing voices that were previously generated by the device itself or obtained from other karaoke devices 10-m directly or via the server device 30; the signals S_P and S_L generated from those sound signals S_M; or data indicating the types of expressive singing identified using those signals and their appearance timings (expressive-singing appearance data). The CPU 17 may then read and use this data to perform the same processing that the server device 30 performs in S240, namely generating the statistical data and overwriting the expressive-singing reference data DD on the basis of that data.

(7) The method of evaluating singing and the manner of presenting the evaluation result to the singer in the above embodiment can be modified in various ways. For example, in the above embodiment the standard score SR_NOR is calculated by adding the additional points SR_ADD, which are calculated in the standard singing evaluation processing (S130) from the number of appearances of expressive singing, to the basic score SR_BASE. However, the standard singing evaluation processing may ignore the appearance of expressive singing and calculate only the basic score SR_BASE. Also, in the above embodiment the singer is shown the higher of the standard score SR_NOR given by the standard singing evaluation processing and the expression score SR_EX given by the expressive-singing evaluation processing. However, the evaluation result may be presented to the singer in other ways, such as displaying both scores or displaying their total.
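The scoring and presentation variants described in (7) can be sketched as follows. This is only an illustrative outline: the class name, the function names, and the mode strings are hypothetical, and the score values are assumed to have already been computed by the respective evaluation processing.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Scores:
    sr_base: float  # basic score SR_BASE
    sr_add: float   # additional points SR_ADD
    sr_ex: float    # expression score SR_EX


def standard_score(s: Scores, use_additional_points: bool = True) -> float:
    """Return SR_NOR = SR_BASE + SR_ADD, or SR_BASE alone in the variant
    that ignores expressive singing in the standard evaluation."""
    return s.sr_base + s.sr_add if use_additional_points else s.sr_base


def presented_result(
    s: Scores, mode: str = "higher", use_additional_points: bool = True
) -> Dict[str, float]:
    """Build the result shown to the singer.

    mode (hypothetical labels):
      "higher" - the higher of SR_NOR and SR_EX (as in the embodiment)
      "both"   - both scores
      "total"  - the sum of the two scores
    """
    sr_nor = standard_score(s, use_additional_points)
    if mode == "higher":
        return {"score": max(sr_nor, s.sr_ex)}
    if mode == "both":
        return {"standard": sr_nor, "expression": s.sr_ex}
    if mode == "total":
        return {"score": sr_nor + s.sr_ex}
    raise ValueError(f"unknown mode: {mode}")
```

For example, with SR_BASE = 70, SR_ADD = 5, and SR_EX = 72, the "higher" mode presents 75 when the additional points are counted and 72 when they are not.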

(8) In the above embodiment, when the expressive-singing reference data DD is updated, a singer whose basic score SR_BASE is higher than the reference score SR_TH is treated as an advanced singer, and only the singing sample data DS of advanced singers is used for the update. The method of selecting the singing sample data DS used for updating the expressive-singing reference data DD is not limited to this. For example, instead of the basic score SR_BASE, the standard score SR_NOR, obtained by adding the additional points SR_ADD to the basic score SR_BASE, may be used as the criterion for presuming that a singer is advanced. Also, to exclude advanced singers whose basic score SR_BASE is high because they perform no expressive singing at all, an upper threshold may be set in addition to the lower threshold (the reference score SR_TH); the singing sample data DS of singers whose basic score SR_BASE (or other score) exceeds the upper threshold is then not used for updating the expressive-singing reference data DD. Furthermore, instead of dividing singers into the two groups of advanced singers and others as described above, the singing sample data DS of singers with a higher basic score SR_BASE may, for example, be given a larger weight when used for updating the expressive-singing reference data DD.
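The selection variants in (8) can be sketched as follows. The function names and the data layout are hypothetical, and weighting each sample in proportion to its score is only one possible choice; the text states only that higher-scoring singers may receive larger weights.

```python
from typing import Dict, List, Optional, Tuple

# (sample_id, score) pairs; the score may be SR_BASE or SR_NOR depending on the variant.
Sample = Tuple[str, float]


def select_samples(
    samples: List[Sample], lower: float, upper: Optional[float] = None
) -> List[str]:
    """Return the ids of samples whose score is higher than the lower threshold
    (the reference score SR_TH) and, if an upper threshold is given, does not
    exceed it."""
    selected = []
    for sample_id, score in samples:
        if score <= lower:
            continue
        if upper is not None and score > upper:
            continue
        selected.append(sample_id)
    return selected


def score_weights(samples: List[Sample]) -> Dict[str, float]:
    """Return normalized weights proportional to each sample's score
    (an assumed weighting rule; falls back to equal weights if all scores are 0)."""
    if not samples:
        return {}
    total = sum(score for _, score in samples)
    if total <= 0:
        return {sample_id: 1.0 / len(samples) for sample_id, _ in samples}
    return {sample_id: score / total for sample_id, score in samples}
```

With the weighting variant, the statistical data would count each appearance of expressive singing by its sample's weight instead of counting every selected sample equally.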

(9) In the above embodiment, a performance evaluation device that is provided in a karaoke device for singing and evaluates a singing performance was described as an example of a performance evaluation device that evaluates the performance of a musical piece. However, the performance evaluation device of the present invention is not limited to evaluating singing performances and can also be applied to evaluating performances of musical pieces on various instruments. That is, the term "singing" used in the above embodiment can be replaced with the more general term "performance". A performance evaluation device that evaluates instrument performances evaluates expressive performances specific to each instrument, for example string bending (choking) on a guitar. When the musical piece is not a song but a piece for an instrument, the karaoke device for instrument performance is configured so that the track data MD contains, instead of the lyrics track TR_LY, a score track in which, for example, data representing the musical score and delta times indicating the display time of each section of the score (for example, blocks of two or four bars) are described in chronological order. The sound storage 21 (corresponding to the score data acquisition unit) and the display unit 14 are configured to output to the display, in accordance with the score track and as the musical piece progresses, an image signal representing the part of the score that corresponds to the current accompaniment position. In both the karaoke device for singing and the karaoke device for instrument performance, the output of the image signal by the sound storage 21 and the display unit 14 may be omitted when the lyrics or score need not be displayed.

(10) As understood from the above examples, the performance evaluation device of a preferred aspect of the present invention can be expressed generally as the following device; whether other elements are present, and what specific form they take, is arbitrary. As illustrated in FIG. 13, the device comprises: an expressive-performance reference data acquisition unit 101 that acquires expressive-performance reference data indicating an expressive performance to be performed in a performance of a musical piece and the timing at which that expressive performance should be performed in the musical piece, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece; a pitch/volume data generation unit 102 that generates, from the performance sound of the musical piece played by a performer, pitch/volume data indicating the pitch and volume of the performance sound; and a performance evaluation unit 103 that raises the evaluation of the performer's performance of the musical piece when a characteristic of at least one of the pitch and the volume indicated by the pitch/volume data generated by the pitch/volume data generation unit 102 exhibits, within a specific time range of the musical piece indicated by the expressive-performance reference data, the characteristic of the expressive performance that the expressive-performance reference data indicates should be performed.
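The behavior of the performance evaluation unit 103 can be sketched as follows, assuming that expressive performances have already been detected from the pitch/volume data as (type, time) pairs. The data layout, the field names, and the fixed points per matched entry are assumptions for illustration; the text does not prescribe how much the evaluation is raised.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ReferenceEntry:
    note_onset: float    # sound-generation start time of the note or note group (s)
    expression: str      # type of expressive performance, e.g. "vibrato"
    window_start: float  # start of the expected time range, relative to note_onset (s)
    window_end: float    # end of the expected time range, relative to note_onset (s)
    points: float        # hypothetical amount by which the evaluation is raised


def expression_points(
    reference: List[ReferenceEntry], detected: List[Tuple[str, float]]
) -> float:
    """Sum the points of every reference entry for which a detected expressive
    performance of the same type falls inside the entry's time range."""
    total = 0.0
    for entry in reference:
        start = entry.note_onset + entry.window_start
        end = entry.note_onset + entry.window_end
        if any(kind == entry.expression and start <= t <= end for kind, t in detected):
            total += entry.points
    return total
```

Each reference entry is counted at most once, so repeating the same expressive performance inside one time range does not raise the evaluation further in this sketch.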

(11) In the above embodiment, an example was described in which the performance evaluation device of the present invention is provided in a karaoke device, which is a so-called dedicated machine. However, the performance evaluation device of the present invention is not limited to dedicated machines. For example, the performance evaluation device of the present invention may be realized by having any of various devices, such as a personal computer, a portable information terminal (for example, a mobile phone or smartphone), or a game console, execute processing in accordance with a program. The program may be stored on and distributed via a recording medium such as a CD-ROM, or distributed over a telecommunication line such as the Internet.

This application is based on Japanese Patent Application No. 2012-094853, filed on April 18, 2012, the contents of which are incorporated herein by reference.

[Industrial applicability]

According to the present invention, when a performer gives an expressive performance, an evaluation that deviates little from human perception can be made.

1‧‧‧Singing evaluation system

10-m‧‧‧Karaoke device

11‧‧‧Sound source

12‧‧‧Speaker

13‧‧‧Microphone

14‧‧‧Display unit

15‧‧‧Communication interface

16‧‧‧Sound adapter

17‧‧‧CPU

18‧‧‧RAM

19‧‧‧ROM

20‧‧‧Hard disk

21‧‧‧Sound storage

30‧‧‧Server device

35‧‧‧Communication interface

37‧‧‧CPU

38‧‧‧RAM

39‧‧‧ROM

40‧‧‧Hard disk

90‧‧‧Network

APG‧‧‧Singing analysis program

DBRK‧‧‧Reference database

DBRS‧‧‧Reference database

DBS‧‧‧Singing sample database

HD‧‧‧Header

MD-n‧‧‧Track data

S_A‧‧‧Sound signal

S_I‧‧‧Image signal

S_o‧‧‧Control signal

S_L‧‧‧Output signal

S_M‧‧‧Picked-up sound signal

S_P‧‧‧Output signal

TR_AC‧‧‧Accompaniment track

TR_LY‧‧‧Lyrics track

VPG‧‧‧Singing evaluation program

Claims

A performance evaluation device comprising: an expressive-performance reference data acquisition unit that acquires expressive-performance reference data indicating an expressive performance to be performed in a performance of a musical piece and the timing at which the expressive performance should be performed in the musical piece, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece; a pitch/volume data generation unit that generates, from the performance sound of the musical piece played by a performer, pitch/volume data indicating the pitch and volume of the performance sound; and a performance evaluation unit that raises the evaluation of the performer's performance of the musical piece when a characteristic of at least one of the pitch and the volume indicated by the pitch/volume data generated by the pitch/volume data generation unit exhibits, within a specific time range of the musical piece indicated by the expressive-performance reference data, the characteristic of the expressive performance that the expressive-performance reference data indicates should be performed.

The performance evaluation device of claim 1, further comprising: a pitch/volume data acquisition unit that acquires, for each of an arbitrary number of performance sounds of the musical piece played by arbitrary performers, pitch/volume data indicating the pitch and volume of that performance sound; an expressive-performance appearance data generation unit that, when a characteristic of at least one of the pitch and the volume indicated by the pitch/volume data acquired by the pitch/volume data acquisition unit exhibits, at any timing in the musical piece, the characteristic of one of one or more predetermined types of expressive performance, generates expressive-performance appearance data indicating that expressive performance and the timing at which it appeared, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece; and an expressive-performance reference data generation unit that, on the basis of an arbitrary number of items of expressive-performance appearance data generated by the expressive-performance appearance data generation unit, identifies, for each note or note group included in the musical piece, which expressive performance appears at which timing relative to the sound-generation start time of that note or note group and with what frequency, and generates the expressive-performance reference data in accordance with the identified information.

The performance evaluation device of claim 2, further comprising: an expressive-performance reference data storage unit that stores the expressive-performance reference data, wherein the expressive-performance reference data stored in the expressive-performance reference data storage unit is overwritten with the expressive-performance reference data generated by the expressive-performance reference data generation unit.

The performance evaluation device of any one of claims 1 to 3, further comprising: a model performance reference data acquisition unit that acquires model performance reference data indicating a pitch serving as a model for the musical piece, wherein the performance evaluation unit evaluates the performer's performance of the musical piece on the basis of a comparison between the pitch indicated by the pitch/volume data generated by the pitch/volume data generation unit and the pitch indicated by the model performance reference data.

The performance evaluation device of claim 2, further comprising: a model performance reference data acquisition unit that acquires model performance reference data indicating a pitch serving as a model for the musical piece, wherein the performance evaluation unit evaluates the performer's performance of the musical piece on the basis of a comparison between the pitch indicated by the pitch/volume data generated by the pitch/volume data generation unit and the pitch indicated by the model performance reference data; the pitch/volume data acquired by the pitch/volume data acquisition unit is accompanied by performance evaluation data indicating the result of an evaluation made by the performance evaluation unit using the model performance reference data, or the result of an evaluation made using the model performance reference data by another device having a unit equivalent to the performance evaluation unit; and the expressive-performance reference data generation unit generates the expressive-performance reference data on the basis of the expressive-performance appearance data that the expressive-performance appearance data generation unit generated using, among the pitch/volume data acquired by the pitch/volume data acquisition unit, the pitch/volume data whose accompanying performance evaluation data satisfies a specific condition.

The performance evaluation device of claim 1 or 2, further comprising: a model performance reference data acquisition unit that acquires model performance reference data indicating a volume or pitch serving as a model for the musical piece, wherein the performance evaluation unit determines a first score (SR_NOR) for the performer's performance of the musical piece on the basis of a comparison between the volume or pitch indicated by the pitch/volume data generated by the pitch/volume data generation unit and the volume or pitch indicated by the model performance reference data, determines a second score (SR_EX) for the performer's performance of the musical piece on the basis of a comparison between the volume or pitch indicated by the pitch/volume data generated by the pitch/volume data generation unit and the volume or pitch indicated by the expressive-performance reference data, and evaluates the performance of the musical piece on the basis of the first score and the second score.

A karaoke device comprising: the performance evaluation device of any one of claims 1 to 5; an accompaniment data acquisition unit that acquires accompaniment data indicating an accompaniment of the musical piece; and a sound signal output unit that outputs, in accordance with the accompaniment data, a sound signal representing the musical sound of the accompaniment, wherein the pitch/volume data generation unit generates pitch/volume data indicating the pitch and volume of the performance sound of the musical piece that the performer plays along with the accompaniment reproduced from a speaker in response to the sound signal output from the sound signal output unit.

The karaoke device of claim 7, wherein the musical piece is a song, and the karaoke device further comprises: a lyrics data acquisition unit that acquires lyrics data indicating the lyrics of the song; and an image signal output unit that outputs an image signal representing, among the lyrics indicated by the lyrics data, the lyrics to be sung along with the accompaniment represented by the sound signal currently being output by the sound signal output unit.

The karaoke device of claim 7, wherein the musical piece is a piece to be played on an instrument, and the karaoke device further comprises: a score data acquisition unit that acquires score data indicating the musical score of the musical piece; and an image signal output unit that outputs an image signal representing, within the musical score indicated by the score data, the part of the score to be played along with the accompaniment represented by the sound signal currently being output by the sound signal output unit.

A server device comprising: an expressive-performance appearance data acquisition unit that acquires, for each of an arbitrary number of performance sounds of a musical piece played by arbitrary performers, expressive-performance appearance data indicating an expressive performance that appeared in that performance sound and the timing at which it appeared, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece; an expressive-performance reference data generation unit that, on the basis of an arbitrary number of items of expressive-performance appearance data acquired by the expressive-performance appearance data acquisition unit, identifies, for each note or note group included in the musical piece, which expressive performance appears at which timing relative to the sound-generation start time of that note or note group and with what frequency, and generates, in accordance with the identified information, expressive-performance reference data indicating an expressive performance to be performed in a performance of the musical piece and the timing at which the expressive performance should be performed in the musical piece, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece; and a transmission unit that transmits the expressive-performance reference data generated by the expressive-performance reference data generation unit to a performance evaluation device.

A performance evaluation system comprising: an expressive-performance reference data acquisition unit that acquires first expressive-performance reference data indicating an expressive performance to be performed in a performance of a musical piece and the timing at which the expressive performance should be performed in the musical piece, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece; a pitch/volume data generation unit that generates, from the performance sound of the musical piece played by a performer, pitch/volume data indicating the pitch and volume of the performance sound; a performance evaluation unit that raises the evaluation of the performer's performance of the musical piece when a characteristic of at least one of the pitch and the volume indicated by the pitch/volume data generated by the pitch/volume data generation unit exhibits, within a specific time range of the musical piece indicated by the first expressive-performance reference data, the characteristic of the expressive performance that the first expressive-performance reference data indicates should be performed; an expressive-performance appearance data acquisition unit that acquires, for each of an arbitrary number of performance sounds of a musical piece played by arbitrary performers, expressive-performance appearance data indicating an expressive performance that appeared in that performance sound and the timing at which it appeared, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece played by the arbitrary performers; and an expressive-performance reference data generation unit that, on the basis of an arbitrary number of items of expressive-performance appearance data acquired by the expressive-performance appearance data acquisition unit, identifies, for each note or note group included in the musical piece played by the arbitrary performers, which expressive performance appears at which timing relative to the sound-generation start time of that note or note group and with what frequency, and generates, in accordance with the identified information, second expressive-performance reference data indicating an expressive performance to be performed in a performance of the musical piece played by the arbitrary performers and the timing at which the expressive performance should be performed in that musical piece, the timing being expressed relative to the sound-generation start time of a note or note group included in that musical piece.

A performance evaluation method comprising: acquiring expressive-performance reference data indicating an expressive performance to be performed in a performance of a musical piece and the timing at which the expressive performance should be performed in the musical piece, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece; generating, from the performance sound of the musical piece played by a performer, pitch/volume data indicating the pitch and volume of the performance sound; and raising the evaluation of the performer's performance of the musical piece when a characteristic of at least one of the pitch and the volume indicated by the pitch/volume data exhibits, within a specific time range of the musical piece indicated by the expressive-performance reference data, the characteristic of the expressive performance that the expressive-performance reference data indicates should be performed.

A program executable by a computer, the program causing the computer to execute: expressive-performance reference data acquisition processing that acquires expressive-performance reference data indicating an expressive performance to be performed in a performance of a musical piece and the timing at which the expressive performance should be performed in the musical piece, the timing being expressed relative to the sound-generation start time of a note or note group included in the musical piece; pitch/volume data generation processing that generates, from the performance sound of the musical piece played by a performer, pitch/volume data indicating the pitch and volume of the performance sound; and performance evaluation processing that raises the evaluation of the performer's performance of the musical piece when a characteristic of at least one of the pitch and the volume indicated by the pitch/volume data generated by the pitch/volume data generation processing exhibits, within a specific time range of the musical piece indicated by the expressive-performance reference data, the characteristic of the expressive performance that the expressive-performance reference data indicates should be performed.