JP2007256618A - Search device

Info

Publication number: JP2007256618A
Application number: JP2006080811A
Authority: JP
Inventors: Tatsuya Iriyama; 達也入山
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2006-03-23
Filing date: 2006-03-23
Publication date: 2007-10-04

Abstract

PROBLEM TO BE SOLVED: To provide a technique for finding, on a karaoke device, a singer who has a desired singing feel. SOLUTION: When a CPU 31 of a server device 3 detects that singing analysis data has been received via a communication unit 35, it compares the received singing analysis data with the singing analysis data stored in a singing analysis database storage area 34a and, according to the degree of matching, selects one or more items of singing analysis data from the singing analysis database storage area 34a. The CPU 31 reads the identification information corresponding to the selected singing analysis data from the singing analysis database storage area 34a and transmits the read identification information to the karaoke device 2 via a communication network 4. A CPU 11 of the karaoke device 2 displays the singer indicated by the received identification information on a display unit 15. COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a technique for searching for a singer.

Various methods have been proposed for karaoke devices to score how skillfully a singer sings. For example, Patent Document 1 proposes a method that compares the pitch of the sung voice with a reference pitch to determine which parts were not sung well. The technique described in Patent Document 2 scores a singer's performance according to the physical characteristics of the sung piece, listeners' evaluations of it, and the like, and accumulates the scoring results in order to improve scoring.
Patent Document 1: Japanese Patent Application Laid-Open No. 2004-093601
Patent Document 2: Japanese Patent Application Laid-Open No. 2000-99014

Singing by multiple people takes many forms, such as duets, a cappella, gospel, and chorus, and in all of them it is important that the singers match one another in “voice quality”, “sense of pitch”, and the like. That uniform voice quality is effective is clear from examples such as the twin singers “The Peanuts”. Moreover, unlike instrumental ensembles, human singing shows large individual differences in the sense of pitch, and even advanced choirs have considerable difficulty aligning it in rehearsal. The sense of pitch here refers to fine pitch expression that cannot be fully written in a score, such as how the pitch is raised or lowered at the start of a phrase, or where the center pitch of a vibrato is placed.

Some karaoke users want to find people whose singing feel resembles their own, or people with a particular singing feel (for example, people whose sense of pitch matches theirs but whose voice quality differs), in order to sing duets or chorus with them.
The present invention has been made against this background, and its object is to provide a technique that allows a user to find a singer who has a desired singing feel.

To solve the above problem, the present invention provides a search device comprising: storage means storing a plurality of pairs of identification information identifying a singer and singing analysis data indicating characteristics of that singer's singing voice; acquisition means for acquiring singing analysis data indicating characteristics of a singer's singing voice; selection means for comparing the singing analysis data acquired by the acquisition means with the singing analysis data stored in the storage means and selecting, according to the degree of matching, one or more items of singing analysis data from those stored in the storage means; and output means for reading from the storage means the identification information corresponding to the singing analysis data selected by the selection means and outputting the read identification information.
In a preferred aspect of the present invention, the storage means stores singer information about the singer in association with the identification information, and the output means reads from the storage means the identification information corresponding to the singing analysis data selected by the selection means, outputs the read identification information, and also reads the singer information corresponding to that identification information from the storage means and reports it.
In another preferred aspect of the present invention, the storage means stores, in association with the identification information, a communication address indicating the singer's contact details, and the output means reads from the storage means the identification information corresponding to the singing analysis data selected by the selection means, outputs the read identification information, and also reads the communication address corresponding to that identification information from the storage means and sends to that address a message indicating that the singer has been selected.
In a further preferred aspect of the present invention, the singing analysis data is data indicating at least one of the pitch, spectrum, and power of the voice.
In another preferred aspect of the present invention, the singing analysis data is technique data indicating the types and timing of the techniques used in the singing.
In a further preferred aspect of the present invention, the device comprises input means for outputting the input voice of a singer as voice data, and generation means for generating the singing analysis data from the voice data output by the input means, and the acquisition means acquires the singing analysis data generated by the generation means.
In a further preferred aspect of the present invention, the device comprises storage control means for storing the singing analysis data generated by the generation means in the storage means.

According to the present invention, a user can find a singer who has a desired singing feel.

<A: First Embodiment>
<A-1: Configuration>
FIG. 1 is a block diagram showing an example of the overall configuration of a search system 1 according to an embodiment of the present invention. The search system 1 comprises karaoke devices 2a, 2b, and 2c and a server device 3 connected via a communication network 4. Although FIG. 1 shows three karaoke devices, the number of karaoke devices in the search system is not limited to three and may be larger or smaller. In the following, the karaoke devices 2a, 2b, and 2c are referred to simply as “karaoke device 2” when they need not be distinguished.

FIG. 2 is a block diagram illustrating the hardware configuration of the karaoke device 2. In the figure, a CPU (Central Processing Unit) 11 reads a computer program stored in a ROM (Read Only Memory) 12 or a storage unit 14, loads it into a RAM (Random Access Memory) 13, and executes it, thereby controlling each part of the karaoke device 2. The storage unit 14 is a large-capacity storage means such as a hard disk, and has an accompaniment data storage area 14a, a lyrics data storage area 14b, a voice data storage area 14c, and a singing analysis data storage area 14d. The display unit 15 is, for example, a liquid crystal display and, under the control of the CPU 11, displays various screens such as a menu screen for operating the karaoke device 2 and a karaoke screen in which lyric captions are superimposed on a background image. The operation unit 16 has various keys and outputs a signal corresponding to the pressed key to the CPU 11. The microphone 17 is sound pickup means that picks up the voice produced by the singer. The voice processing unit 18 converts the voice picked up by the microphone 17 (analog data) into digital data and supplies it to the CPU 11. The speaker 19 is connected to the voice processing unit 18 and emits sound at an intensity corresponding to the signal output from the voice processing unit 18. The communication unit 20 includes various communication devices and, under the control of the CPU 11, exchanges data with the server device 3 via the communication network 4.

The accompaniment data storage area 14a of the storage unit 14 stores accompaniment data, for example in MIDI (Musical Instrument Digital Interface) format, in which information indicating the pitch, strength (velocity), applied effects, and so on of the instruments that accompany each song is written in order of the song's progression. The accompaniment data includes melody data indicating the pitches of the song's melody. The lyrics data storage area 14b stores lyrics data indicating the lyrics corresponding to the accompaniment data. The voice data storage area 14c stores, in time series, the voice data that has been A/D converted from the microphone 17 via the voice processing unit 18, for example in WAVE format or MP3 (MPEG Audio Layer-3) format. The singing analysis data storage area 14d stores singing analysis data indicating the characteristics of the singer's voice. In the present embodiment, information indicating the spectrum, pitch, and power of the voice data is used as the singing analysis data.

FIG. 3 is a block diagram illustrating the hardware configuration of the server device 3. In the figure, the CPU 31 reads a computer program stored in the ROM 32 or the storage unit 34, loads it into the RAM 33, and executes it, thereby controlling each part of the server device 3. The storage unit 34 is a large-capacity storage means such as a hard disk, and has a singing analysis database storage area 34a and a singer information table storage area 34b. The communication unit 35 includes various communication devices and, under the control of the CPU 31, exchanges data with the karaoke device 2 via the communication network 4.

The singing analysis database storage area 34a of the storage unit 34 stores a singing analysis database, which is a collection of singing analysis data.
FIG. 4 shows an example of the contents of the singing analysis database. As illustrated, the database stores a plurality of pairs of “identification information” and “singing analysis data”. The “identification information” item stores information identifying a singer. This may be information identifying an individual singer, such as a membership number or user ID, or it may be information indicating the place (room number, store number) or time at which the singing took place. In short, any information that identifies the singer may be used.
The “singing analysis data” item stores the singing analysis data generated by the karaoke device 2. As illustrated, the database stores a plurality of items of singing analysis data.

The singer information table storage area 34b of the storage unit 34 stores “identification information” and “singer information” in association with each other. The “identification information” item stores information identifying a singer. The “singer information” item stores information about the singer, such as the singer's name and the name of the store the singer uses.
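The two storage areas described above can be pictured with a small data-structure sketch. It is an illustration only: the class and method names (`AnalysisRecord`, `ServerStorage`, `store`, `info_for`), the use of Python, and the representation of singing analysis data as a plain dictionary are assumptions, not part of the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisRecord:
    """One pair of identification information and singing analysis data."""
    identification: str
    analysis: dict


@dataclass
class ServerStorage:
    # Corresponds to the singing analysis database storage area 34a.
    analysis_db: list = field(default_factory=list)
    # Corresponds to the singer information table storage area 34b:
    # identification information -> singer information.
    singer_info: dict = field(default_factory=dict)

    def store(self, identification, analysis):
        """Append one received pair; earlier pairs for the same singer are kept."""
        self.analysis_db.append(AnalysisRecord(identification, analysis))

    def info_for(self, identification):
        """Return the singer information for an identifier, or None if unknown."""
        return self.singer_info.get(identification)
```

Keeping the analysis database as an append-only list reflects the description that a plurality of pairs is stored; a real system would presumably use a database rather than in-memory objects.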

<A-2: Operation>
Next, the operation of the search system 1 will be described.
<A-2-1: Singing analysis data accumulation operation>
First, the operation by which the search system 1 accumulates singing analysis data will be described.
The singer operates the operation unit 16 of the karaoke device 2 to instruct playback of accompaniment data. In response, the CPU 11 reads the accompaniment data from the accompaniment data storage area 14a and supplies it to the voice processing unit 18. The voice processing unit 18 converts the supplied accompaniment data into an analog signal and supplies it to the speaker 19, which emits the sound. At this time, the CPU 11 may control the display unit 15 to display a message prompting the singer to sing, such as “Please sing along with the accompaniment”. The singer sings along with the accompaniment emitted from the speaker 19. The singer's voice is picked up by the microphone 17, converted into a voice signal, and supplied to the voice processing unit 18. The voice data A/D converted by the voice processing unit 18 is stored in time series in the voice data storage area 14c of the storage unit 14.

When playback of the accompaniment data ends, the CPU 11 divides the voice data stored in the voice data storage area 14c into frames of a predetermined time length and calculates the pitch, spectrum, and power of the voice data for each frame. An FFT (Fast Fourier Transform) may be used to detect the spectrum. The CPU 11 stores information indicating the calculated pitch, spectrum, and power in the singing analysis data storage area 14d as singing analysis data. The singing analysis data may indicate the pitch, spectrum, and power of the entire song or of only a part of the song.
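The frame-by-frame calculation above can be sketched as follows. This is a minimal illustration under assumptions, not the method of the embodiment: the description does not say how pitch is estimated, so an autocorrelation peak search is assumed here, and a naive DFT stands in for the FFT (it yields the same magnitudes, only more slowly). All function names and the `fmin`/`fmax` search range are hypothetical.

```python
import cmath
import math


def split_frames(samples, frame_len):
    """Split samples into consecutive frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def power(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def magnitude_spectrum(frame):
    """Magnitude spectrum of one frame via a naive DFT (stand-in for an FFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def pitch_hz(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Assumed pitch estimator: lag of the largest autocorrelation value."""
    lo = int(sample_rate / fmax)
    hi = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(lo, hi + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0


def analyze(samples, sample_rate, frame_len):
    """Return per-frame pitch, spectrum, and power as singing analysis data."""
    return [{"pitch": pitch_hz(f, sample_rate),
             "spectrum": magnitude_spectrum(f),
             "power": power(f)}
            for f in split_frames(samples, frame_len)]
```

For a pure 200 Hz tone sampled at 8000 Hz and analyzed in 400-sample frames, the estimated pitch is 200 Hz and the spectrum peaks at bin 10, since each bin spans 8000 / 400 = 20 Hz.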

The CPU 11 then transmits the generated singing analysis data and the identification information to the server device 3 via the communication network 4. The identification information may be entered by the singer through the operation unit 16 or may be generated automatically by the CPU 11.

When the CPU 31 of the server device 3 detects that singing analysis data has been received via the communication network 4, it stores the received singing analysis data and identification information in the singing analysis database storage area 34a of the storage unit 34.

Each time a singer sings, the karaoke device 2 generates singing analysis data and transmits it together with the identification information to the server device 3. As a result, the server device 3 accumulates singing analysis data for a plurality of singing performances.

<A-2-2: Search operation>
Next, the search operation of the search system 1 will be described.
First, the user of the search system 1 operates the operation unit 16 to instruct a search. When the CPU 11 of the karaoke device 2 detects that a search has been instructed, it performs the singing analysis data generation process described above. That is, the CPU 11 reads the accompaniment data from the accompaniment data storage area 14a, causes the speaker 19 to emit the accompaniment sound, causes the microphone 17 to pick up the user's voice, and generates singing analysis data from the voice data.
Having generated the singing analysis data, the CPU 11 of the karaoke device 2 transmits it to the server device 3 via the communication network 4.

When the CPU 31 of the server device 3 detects that singing analysis data has been received (acquired) via the communication unit 35, it compares the received singing analysis data with the singing analysis data stored in the singing analysis database storage area 34a and, according to the degree of matching, selects one or more items of singing analysis data from those stored in the singing analysis database storage area 34a. Specifically, for example, the CPU 31 compares the pitch of the received singing analysis data (hereinafter, “received analysis data”) with the pitch of the singing analysis data stored in the singing analysis database storage area 34a (hereinafter, “stored analysis data”) and selects the stored analysis data with the highest degree of matching. Similarly, it compares the spectrum of the received analysis data with the spectrum of the stored analysis data and selects the stored analysis data with the highest degree of matching, and it compares the power of the received analysis data with the power of the stored analysis data and selects the stored analysis data with the highest degree of matching. In the present embodiment, one item of singing analysis data with the highest degree of matching is selected for each of pitch, spectrum, and power, but the number selected is not limited to one; two or more items of singing analysis data may be selected for each of pitch, spectrum, and power.
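The per-feature selection described above can be sketched as follows. The description does not define how the degree of matching is computed, so the score used here (the reciprocal of one plus the mean absolute difference) is an assumption, as are the function names, the record layout, and the simplification that each feature is a flat list of numbers.

```python
def match_score(a, b):
    """Hypothetical degree of matching in (0, 1]; 1.0 means identical values
    over the overlapping portion of the two sequences."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / n
    return 1.0 / (1.0 + mean_abs_diff)


def select_best(received, database, top_n=1):
    """For each of pitch, spectrum, and power, return the identification
    information of the top_n stored records that best match the received data."""
    result = {}
    for feature in ("pitch", "spectrum", "power"):
        ranked = sorted(
            database,
            key=lambda rec: match_score(received[feature],
                                        rec["analysis"][feature]),
            reverse=True,
        )
        result[feature] = [rec["id"] for rec in ranked[:top_n]]
    return result
```

Ranking each feature independently means that different singers can be returned for pitch, spectrum, and power. The `top_n` parameter corresponds to the option of selecting two or more items per feature.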

The CPU 31 of the server device 3 reads the identification information corresponding to the selected singing analysis data from the singing analysis database storage area 34a and transmits (outputs) it to the karaoke device 2 via the communication network 4. At this time, the CPU 31 also reads the singer information corresponding to that identification information from the singer information table storage area 34b and transmits it together with the identification information.

The CPU 11 of the karaoke device 2 outputs the received singer information to the display unit 15 and controls the display unit 15 to display the content indicated by the singer information.
FIG. 5 shows an example of a screen displayed on the display unit 15 of the karaoke device 2. In the example shown in FIG. 5, “Singer A” is displayed as the singer with the highest degree of pitch matching, “Singer B” as the singer with the highest degree of spectrum matching, and “Singer C” as the singer with the highest degree of power matching.

As described above, in the present embodiment, the singing voice of each singer is analyzed, the resulting singing analysis data is accumulated, and the search result is reported to the user. Using the search system 1, the user can find people whose singing feel (sense of pitch, voice quality, dynamics, and so on) resembles the user's own, and can learn whose way of singing resembles whose. This makes it possible, for example, to automate chorus auditions and member recruitment. It also enables virtual duets, virtual choruses, and the like between remote locations by exchanging audio data.

<B: Second Embodiment>
Next, a second embodiment of the present invention will be described. This embodiment differs from the first embodiment described above in the content of the singing analysis data. In the following description, components that are the same as in the first embodiment are given the same reference numerals, and their description is omitted.

In the present embodiment, technique data indicating the types and timing of the singing techniques used in the singing is used as the singing analysis data.
FIG. 6 shows an example of the content of the singing analysis data in the present embodiment. As illustrated, the singing analysis data associates the items “section information” and “type information” with each other. The “section information” item stores information indicating a section of the voice data in which a singing technique was used. The section indicated by the section information may be a section with a time width, expressed by start time information and end time information, or it may indicate a single point in time.

The “type information” item stores information identifying one of a plurality of predefined singing techniques, for example “vibrato”, “shakuri”, “kobushi”, “falsetto”, “tsukkomi”, “tame”, and “breath”. “Vibrato” is a technique of raising and lowering the pitch very slightly and continuously to produce a trembling tone. “Shakuri” is a technique of starting on a pitch lower than the target note and gliding smoothly up to it. “Kobushi” is a technique of adding a decorative, undulating melodic turn. “Falsetto” is a technique of singing in the so-called falsetto voice. “Tsukkomi” is a technique of starting to sing earlier than the nominal timing. “Tame” is a technique of starting to sing later than the nominal timing. “Breath” indicates the timing at which the singer takes a breath.
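The pairing of section information and type information can be pictured with a small sketch. The names `Technique` and `TechniqueEntry`, the use of seconds as the time unit, and the convention that `end` is `None` for a single point in time are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Technique(Enum):
    """Type information: the predefined singing techniques."""
    VIBRATO = "vibrato"
    SHAKURI = "shakuri"
    KOBUSHI = "kobushi"
    FALSETTO = "falsetto"
    TSUKKOMI = "tsukkomi"
    TAME = "tame"
    BREATH = "breath"


@dataclass(frozen=True)
class TechniqueEntry:
    """One row of technique data: section information plus type information."""
    start: float            # start time in seconds from the start of the voice data
    end: Optional[float]    # end time, or None when the entry marks a single point
    technique: Technique

    def duration(self):
        """Length of the section in seconds; 0.0 for a single point in time."""
        return 0.0 if self.end is None else self.end - self.start
```

A list of such entries for one performance would then play the role of the singing analysis data in this embodiment.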

Next, the singing analysis data generation process performed by the karaoke device 2 will be described.
The singer operates the operation unit 16 of the karaoke device 2 to instruct playback of accompaniment data. In response, the CPU 11 reads the accompaniment data from the accompaniment data storage area 14a and supplies it to the voice processing unit 18. The voice processing unit 18 converts the supplied accompaniment data into an analog signal and supplies it to the speaker 19, which emits the sound. The singer sings along with the accompaniment emitted from the speaker 19. The singer's voice is picked up by the microphone 17, converted into a voice signal, and supplied to the voice processing unit 18. The voice data A/D converted by the voice processing unit 18 is stored in time series in the voice data storage area 14c of the storage unit 14.

When playback of the accompaniment data ends, the CPU 11 performs voice analysis on the voice data stored in the voice data storage area 14c and calculates the pitch, power, and spectrum of the voice data as a function of time. The CPU 11 then analyzes, in units of a predetermined frame, the melody data included in the accompaniment data stored in the accompaniment data storage area 14a and the voice data stored in the voice data storage area 14c, and detects the temporal correspondence between the voice data and the melody data.

Next, the CPU 11 analyzes the patterns of temporal change in the pitch, power, and spectrum calculated from the voice data and determines whether the result corresponds to a predetermined pattern. If it does, the CPU 11 identifies the section corresponding to that pattern as a section in which a particular singing technique is used. The CPU 11 then stores the section information of the identified section in the singing analysis data storage area 14d of the storage unit 14 in association with the type information indicating that singing technique.

The process of identifying the sections in which each singing technique is used is described below. In the present embodiment, the CPU 11 identifies (detects) the sections in which the techniques “vibrato”, “shakuri”, “kobushi”, “falsetto”, “tsukkomi”, “tame”, and “breath” are used. Of these, “vibrato” and “shakuri” are detected based on the pitch calculated from the voice data. “Kobushi” and “falsetto” are detected based on the spectrum calculated from the voice data. “Tame” and “tsukkomi” are detected based on the pitch calculated from the voice data and the melody data. “Breath” is detected based on the power calculated from the voice data and the melody data.

Based on the correspondence between the voice data and the melody data and on the pitch calculated from the voice data, the CPU 11 identifies sections in which the start time of a note in the voice data differs from the start time of the corresponding note in the melody data. Where the pitch change in the voice data appears earlier than the pitch change in the melody data, that is, where a note in the voice data starts earlier than the corresponding note in the melody data, the CPU 11 identifies the section as one in which the “tsukkomi” technique is used. The CPU 11 stores the section information of the identified section in the singing analysis data storage area 14d of the storage unit 14 in association with the information identifying “tsukkomi”.

Conversely, based on the correspondence between the voice data and the melody data and on the pitch calculated from the voice data, the CPU 11 detects sections in which the pitch change in the voice data occurs later than the pitch change in the melody data, that is, sections in which a note in the voice data starts later than the corresponding note in the melody data, and identifies each detected section as one in which the "tame" technique is used.
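The early/late onset comparison for "tsukkomi" and "tame" could be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: it assumes the sung notes have already been matched one-to-one with the melody notes, and the names `Section`, `classify_onsets`, and the `tolerance` parameter are hypothetical (the description only says "earlier" and "later", with no tolerance).

```python
from dataclasses import dataclass


@dataclass
class Section:
    technique: str  # "tsukkomi" (early entry) or "tame" (late entry)
    start: float    # section start time in seconds
    end: float      # section end time in seconds


def classify_onsets(sung_onsets, melody_onsets, tolerance=0.05):
    """Compare each sung note onset with its reference melody onset.

    sung_onsets and melody_onsets are assumed to be aligned one-to-one.
    tolerance (seconds) is an illustrative assumption used to ignore
    negligible timing differences.
    """
    sections = []
    for sung, ref in zip(sung_onsets, melody_onsets):
        diff = sung - ref
        if diff < -tolerance:
            # The sung note starts before the melody note.
            sections.append(Section("tsukkomi", sung, ref))
        elif diff > tolerance:
            # The sung note starts after the melody note.
            sections.append(Section("tame", ref, sung))
    return sections
```

Each returned `Section` corresponds to the section information that the description stores together with the technique label.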

The CPU 11 also analyzes the pattern of temporal change in the pitch calculated from the voice data, detects sections in which the pitch fluctuates continuously within a predetermined range above and below a center frequency, and identifies each detected section as one in which the "vibrato" technique is used.
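One way to test a candidate section for vibrato is to check that the pitch stays within a bounded range around its mean and crosses that mean repeatedly. The sketch below assumes a per-frame pitch contour in cents; the function name `is_vibrato` and all threshold values are illustrative assumptions, since the description specifies only "a predetermined range".

```python
def is_vibrato(pitch, min_depth=10.0, max_depth=100.0, min_crossings=4):
    """Return True if the pitch contour oscillates around its center.

    pitch: per-frame pitch values (assumed to be in cents).
    min_depth / max_depth: bounds on the largest deviation from the center.
    min_crossings: minimum number of times the contour must cross the center.
    All thresholds are illustrative assumptions.
    """
    if len(pitch) < 2:
        return False
    center = sum(pitch) / len(pitch)
    deviation = [p - center for p in pitch]
    depth = max(abs(d) for d in deviation)
    if not (min_depth <= depth <= max_depth):
        return False
    # Count sign changes of the deviation, i.e. crossings of the center pitch.
    crossings = sum(1 for a, b in zip(deviation, deviation[1:]) if a * b < 0)
    return crossings >= min_crossings
```

A steady note fails the depth check, and a one-way glide fails the crossing check, so neither is reported as vibrato.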

The CPU 11 also analyzes the pattern of temporal change in the pitch calculated from the voice data, detects sections in which the pitch changes continuously from a lower pitch to a higher pitch, and identifies each detected section as one in which the "shakuri" technique is used. This process may instead be based on the correspondence with the melody data. That is, based on the correspondence between the voice data and the melody data, the CPU 11 may detect sections in which the pitch of the voice data continuously approaches the pitch of the melody data from below.
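The continuous low-to-high pitch change that marks "shakuri" can be illustrated by scanning the pitch contour for runs of strictly rising frames. This is a sketch under stated assumptions: the pitch is given per frame in cents, and `find_shakuri`, `min_steps`, and `min_rise` are hypothetical names and thresholds that the description does not specify.

```python
def find_shakuri(pitch, min_steps=3, min_rise=50.0):
    """Find runs in which the pitch rises continuously.

    pitch: per-frame pitch values (assumed to be in cents).
    min_steps: minimum number of consecutive rising steps (assumption).
    min_rise: minimum total rise over the run (assumption).
    Returns a list of (start_index, end_index) tuples, both inclusive.
    """
    sections = []
    start = 0
    for i in range(1, len(pitch) + 1):
        rising = i < len(pitch) and pitch[i] > pitch[i - 1]
        if not rising:
            # The run ended at frame i - 1; keep it if long and steep enough.
            steps = i - 1 - start
            if steps >= min_steps and pitch[i - 1] - pitch[start] >= min_rise:
                sections.append((start, i - 1))
            start = i
    return sections
```

The melody-based variant mentioned above would additionally require the run to end near the melody pitch; that check is omitted here.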

The CPU 11 also uses the correspondence between the voice data and the melody data, together with the power calculated from the voice data, to detect sections in which the melody data is sounding but the power of the voice data is below a predetermined threshold, and identifies each detected section as a "breath" section.
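The breath rule combines two per-frame conditions: the melody is sounding and the voice power is below a threshold. A minimal sketch, assuming frame-aligned inputs; the function name `find_breaths` and the data layout are assumptions made for illustration.

```python
def find_breaths(power, melody_voiced, threshold):
    """Find runs of frames where the melody sounds but the voice is quiet.

    power: per-frame power of the voice data.
    melody_voiced: per-frame booleans, True where the melody data has a note.
    threshold: power value below which the voice counts as silent.
    Returns a list of (start_index, end_index) tuples, both inclusive.
    """
    n = min(len(power), len(melody_voiced))
    sections = []
    start = None
    for i in range(n):
        quiet = melody_voiced[i] and power[i] < threshold
        if quiet and start is None:
            start = i                       # a breath section begins
        elif not quiet and start is not None:
            sections.append((start, i - 1))  # the breath section ended
            start = None
    if start is not None:
        sections.append((start, n - 1))      # section runs to the last frame
    return sections
```

Frames where the melody itself is silent are never reported, which matches the condition that the melody data must be sounding.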

The CPU 11 also analyzes the temporal change pattern of the spectrum calculated from the voice data, detects sections in which the spectral characteristics change abruptly to a predetermined state, and identifies each detected section as one in which the "falsetto" technique is used. Here, the predetermined state is one in which the harmonic components of the spectrum are extremely small. For example, as shown in FIG. 7, a natural (chest) voice contains many harmonic components (see FIG. 7(a)), whereas in falsetto the harmonic components become extremely small (see FIG. 7(b)). In this case, the CPU 11 may also take into account whether the pitch has risen substantially. Although falsetto is sometimes used at pitches that could also be sung in the natural voice, it is generally a technique for producing high notes that the natural voice cannot reach. The apparatus may therefore be configured to detect "falsetto" only when the pitch of the voice data is at or above a predetermined pitch. Furthermore, because the pitch range in which falsetto is used generally differs between male and female voices, the singer's gender may be estimated from the range of the voice data or from formants detected in the voice data, and the pitch range for falsetto detection may be set according to the result.
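The abrupt loss of harmonic content could be approximated by tracking, frame by frame, the ratio of upper-harmonic magnitude to fundamental magnitude. The sketch below is illustrative only: it assumes harmonic magnitudes have already been extracted per frame, it reports the frame at which the transition occurs rather than a full section, and the names and thresholds (`low_ratio`, `drop`, `min_pitch`) are assumptions.

```python
def harmonic_ratio(harmonics):
    """Ratio of summed upper-harmonic magnitudes to the fundamental."""
    fundamental, *upper = harmonics
    return sum(upper) / fundamental if fundamental > 0 else 0.0


def find_falsetto_onsets(frames, pitches, low_ratio=0.2, drop=0.5, min_pitch=None):
    """Find frames where harmonic content collapses abruptly.

    frames: per-frame lists of harmonic magnitudes, fundamental first.
    pitches: per-frame fundamental frequency in Hz.
    low_ratio: ratio at or below which harmonics count as "extremely small".
    drop: minimum decrease in ratio from the previous frame ("abrupt").
    min_pitch: optional lower pitch bound for detection, as the text allows.
    All thresholds are illustrative assumptions.
    """
    onsets = []
    previous = None
    for i, (harmonics, f0) in enumerate(zip(frames, pitches)):
        ratio = harmonic_ratio(harmonics)
        high_enough = min_pitch is None or f0 >= min_pitch
        if (previous is not None and high_enough
                and ratio <= low_ratio and previous - ratio >= drop):
            onsets.append(i)
        previous = ratio
    return onsets
```

Setting `min_pitch` per singer corresponds to the option of choosing the detection range after estimating the singer's gender.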

The CPU 11 also detects sections in which the spectral characteristics switch among varied patterns within a short time, and identifies each detected section as one in which the "kobushi" technique is used. Kobushi is a technique that adds a growl-like flavor by changing the timbre and manner of vocalization within a short span, so the spectral characteristics vary widely in sections where it is used.

In this way, the CPU 11 detects, from the voice data, the sections in which each singing technique is used, and stores section information indicating each detected section in the singing analysis data storage area 14d of the storage unit 14 in association with type information indicating the singing technique.

The CPU 31 of the server device 3 compares the acquired singing analysis data with the singing analysis data stored in the singing analysis database storage area 34a, and selects singing analysis data according to the degree of coincidence. Because this embodiment uses technique data indicating singing techniques as the singing analysis data, it can find singers who are similar in, for example, the timing at which they use singing techniques.
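The description does not define how the degree of coincidence between two sets of technique data is calculated. As one hypothetical measure, the sketch below counts the fraction of query sections that overlap in time with a stored section of the same technique, then ranks the stored singers by that fraction. The names `match_rate` and `search` and the (technique, start, end) layout are assumptions.

```python
def overlap(a, b):
    """Length of the time overlap between two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def match_rate(query, stored):
    """Fraction of query sections matched by a stored section.

    query, stored: lists of (technique, start, end) tuples.
    A query section is matched if some stored section has the same
    technique and overlaps it in time (an illustrative definition).
    """
    if not query:
        return 0.0
    hits = 0
    for technique, start, end in query:
        if any(t == technique and overlap((start, end), (s, e)) > 0
               for t, s, e in stored):
            hits += 1
    return hits / len(query)


def search(query, database, top_n=1):
    """Return the IDs of the top_n singers ranked by match rate.

    database: dict mapping singer ID to that singer's technique data.
    """
    ranked = sorted(database.items(),
                    key=lambda item: match_rate(query, item[1]),
                    reverse=True)
    return [singer_id for singer_id, _ in ranked[:top_n]]
```

Because sections are compared by time, this measure is only meaningful when both performances are of the same song.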

<C: Modifications>
Embodiments of the present invention have been described above, but the present invention is not limited to those embodiments and can be implemented in various other forms. Examples are given below.
(1) In the embodiments described above, the CPU 31 of the server device 3 selects the singing analysis data with the highest degree of coincidence with the acquired singing analysis data. In addition, combinations showing who resembles whom within a group may be generated and reported to the user. Specifically, for example, given two groups, one consisting of singers A to D and the other of singers E to H, combinations whose singing analysis data have a high degree of coincidence (for example, "singer A and singer E, singer B and singer H") may be identified.
A person's way of singing is thought to reflect their personality and musical tastes. For example, a straightforward, shy person is likely to sing in a steady, plain manner, while an open, passionate person is likely to sing expressively. Likewise, someone who listens only to classical music and someone who listens only to enka sing in completely different ways. By collecting such characteristics in advance, generating personality information and compatibility information, and storing them in a database, personality and compatibility can be judged from singing analysis data. That is, the apparatus may be configured to output personality information or compatibility information by searching the database with singing analysis data as the search key. In this case, if personality and compatibility judgments are made after several people have sung in turn and the results are stored, comparing those results makes it possible to determine who is compatible with whom. This also enables a game in which like-minded people are matched as couples.
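The cross-group pairing described in modification (1) could be sketched as a greedy match over all pairs, taking the most similar pair first. This is an illustration under assumptions: each singer's singing analysis data is reduced to a numeric feature vector, similarity is derived from Euclidean distance, and the names `similarity` and `pair_groups` are hypothetical. The description does not state how the combinations are chosen.

```python
import math
from itertools import product


def similarity(a, b):
    """Similarity in (0, 1]; 1.0 means identical feature vectors."""
    return 1.0 / (1.0 + math.dist(a, b))


def pair_groups(group1, group2):
    """Greedily pair singers across two groups by descending similarity.

    group1, group2: dicts mapping singer name to a feature vector.
    Each singer appears in at most one pair.
    """
    scored = sorted(
        ((similarity(v1, v2), n1, n2)
         for (n1, v1), (n2, v2) in product(group1.items(), group2.items())),
        reverse=True,
    )
    used1, used2, pairs = set(), set(), []
    for _, n1, n2 in scored:
        if n1 not in used1 and n2 not in used2:
            pairs.append((n1, n2))
            used1.add(n1)
            used2.add(n2)
    return pairs
```

A greedy match is simple but not guaranteed to maximize total similarity; an assignment algorithm could be substituted if that matters.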

Alternatively, singing analysis data that satisfies a predetermined condition with respect to the acquired singing analysis data may be selected. Specifically, for example, the singing analysis data of each member of a vocal group is stored in the storage unit of the server device, and the CPU of the server device compares the acquired singing analysis data with the stored singing analysis data to determine which member of which group the acquired data resembles in singing style. The CPU of the server device then selects, from the singing analysis database, singing analysis data that resembles the other members of the determined group and reports it to the user. In this way, a user whose singing style resembles that of a particular member of a vocal group can learn that singing that member's part would be effective.
It is also possible to search for people whose singing voice has certain characteristics, or whose singing style resembles that of a particular member of a group. That is, such a request is registered in the server device 3, and the server device 3 attaches a mark (such as a flag) to the pre-stored singing analysis data to indicate that a request has been made. When singing that resembles the marked data is input, the identification information (or name data or the like) of that singer is reported to the person who made the request. If the requester has registered an e-mail address with the server, the notification may be sent to that address. In this way, a person with the desired singing style can easily be found.

(2) In the first embodiment described above, information indicating the power, pitch, and spectrum of the voice is used as the singing analysis data. However, information indicating only the pitch of the voice, or only the spectrum, may be used as the singing analysis data. In short, information indicating at least one of the pitch, spectrum, and power of the voice may be used. The user may also be allowed to select, via the operation unit, which of pitch, spectrum, and power to use, or to select any of the singing techniques described in the second embodiment.

A result that combines multiple kinds of information, such as pitch, power, and spectrum, may also be reported to the user.
FIG. 8 shows an example of a screen that reports a result combining multiple kinds of information. In this example, the CPU 31 of the server device 3 calculates a numerical search result for each of pitch, power, and spectrum, applies weights, and outputs an overall result.
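The weighting step can be illustrated as a weighted average of the per-feature scores. The description does not give the weights or the formula, so the function `combined_score` and the example values are assumptions.

```python
def combined_score(scores, weights):
    """Weighted average of per-feature search scores.

    scores: dict mapping feature name ("pitch", "power", "spectrum") to a score.
    weights: dict mapping the same feature names to non-negative weights.
    The weighted-average form is an illustrative assumption.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one weight must be positive")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Example with hypothetical scores and weights.
overall = combined_score(
    {"pitch": 80, "power": 60, "spectrum": 90},
    {"pitch": 5, "power": 2, "spectrum": 3},
)
```

Normalizing by the total weight keeps the overall result on the same scale as the individual scores.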

(3) The search result need not be reported by the karaoke device on a display. For example, it may be output as a voice message, or sent as an e-mail message to the user's mail terminal. In short, any form of notification that conveys a message or information to the user by some means may be used.

(4) In the embodiments described above, the user indicated by the identification information corresponding to the selected singing analysis data is reported. In addition, the singer indicated by that identification information (hereinafter, the "selected singer") may be notified that he or she has been selected.
Specifically, for example, the server device may transmit information indicating the selection to the karaoke device that the selected singer is using, and the CPU of that karaoke device may, based on the received information, display a message such as "Mr./Ms. XX, whose voice quality resembles yours, would like a virtual duet." Alternatively, the message may be sent to the selected singer by e-mail. In this way, the selected singer can recognize that he or she has been selected. To send the message by e-mail, the singer's e-mail address is stored in association with the identification information together with the singing analysis data, and the message is sent to that address.
The communication address used as the destination of the message is not limited to an e-mail address. It may be the IP address or MAC address of the karaoke device or personal computer that the selected singer is using, or the singer's telephone number. In the case of a telephone number, the server device or the karaoke device may automatically call the telephone number stored in association with the identification information and play a pre-stored voice message once the call is connected. In short, a communication address indicating the singer's contact information is stored in association with the identification information, the communication address corresponding to the identification information is read from the storage means, and a message indicating the selection is sent to that address.

In a karaoke box or similar venue, generating singing analysis data for the users in each room makes it easy to form virtual duet pairings like those described above. If each karaoke device installed in a room has its own ID, the room from which a singing voice was input can be identified, so a message such as "Mr./Ms. XX, whose voice quality resembles yours, would like a virtual duet" can be displayed on the monitor in the room where a person with a similar singing voice is present.
If the invitation is accepted, a virtual duet can be performed in real time by feeding the voice input to each karaoke device into the audio input of the karaoke device in the other room. For the accompaniment, the accompaniment signal from one room may be transferred to the karaoke device in the other room. In the same way, a virtual chorus can be performed across three or more rooms.

(5) In the embodiments described above, the singing analysis data used as the search key is generated by the CPU 11 of the karaoke device 2. Instead, the CPU 11 may prompt for input of singing analysis data, and the user may input it. In this case, for example, the CPU 11 displays a screen on the display unit 15 prompting for singing analysis data, and the user inputs the data to the karaoke device 2 via an interface such as USB (Universal Serial Bus). The singing analysis data is generated beforehand on a device such as a personal computer. As in the embodiments described above, the personal computer picks up the singer's voice with a microphone and analyzes the picked-up voice to generate the singing analysis data.
Alternatively, the karaoke device 2 may be provided with an RFID reader that reads an RFID tag on which singing analysis data has been written. In short, the singing analysis data may be acquired either by having the CPU 11 generate it or by inputting it directly to the CPU 11.

Alternatively, the singing analysis data may be specified by inputting identification information. In this case, for example, the CPU 11 displays a screen on the display unit 15 prompting for the user's identification information, and the user enters it into the karaoke device 2 using the operation unit 16. The CPU 11 of the karaoke device 2 then transmits the identification information to the server device 3, and on receiving it, the CPU 31 of the server device 3 reads the corresponding singing analysis data from the singing analysis database storage area 34a.

(6) In addition to the identification information that identifies the singer, song identification information that identifies the song may be stored in the singing analysis database in association with the singing analysis data. With this configuration, if song identification information is also input when searching, the search can be restricted to the singing analysis data whose song identification information matches, making it possible to find people with a similar singing voice for any given song.
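Restricting the search to records with a matching song identifier could look like the sketch below. The record layout (`singer_id`, `song_id`, `analysis`) and the function name `search_by_song` are hypothetical, and the similarity function is passed in by the caller because the description leaves the comparison method open.

```python
def search_by_song(records, song_id, query, similarity, top_n=1):
    """Search only the records whose song ID matches.

    records: list of dicts with keys "singer_id", "song_id", "analysis".
    song_id: song identification information entered with the query.
    query: singing analysis data used as the search key.
    similarity: function (query, analysis) -> number; higher is more similar.
    Returns the singer IDs of the top_n most similar records.
    """
    candidates = [r for r in records if r["song_id"] == song_id]
    ranked = sorted(candidates,
                    key=lambda r: similarity(query, r["analysis"]),
                    reverse=True)
    return [r["singer_id"] for r in ranked[:top_n]]
```

Filtering before ranking means a very similar performance of a different song is never returned.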

(7) In the embodiments described above, the search system 1, in which the karaoke device 2 and the server device 3 are connected by a communication network, implements all of the functions of the embodiments. Alternatively, three or more devices connected by a communication network may share these functions, so that a system comprising those devices implements the system of the embodiments. A single device may also implement all of the functions.

(8) The programs executed by the CPU 11 of the karaoke device 2 or the CPU 31 of the server device 3 in the embodiments described above may be provided stored on a recording medium such as a magnetic tape, magnetic disk, flexible disk, optical recording medium, magneto-optical recording medium, CD (Compact Disk)-ROM, DVD (Digital Versatile Disk), or RAM. They may also be downloaded to the karaoke device 2 or the server device 3 via a network such as the Internet.

FIG. 1 is a block diagram showing an example of the configuration of a search system according to the first embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the hardware configuration of the karaoke device of the embodiment.
FIG. 3 is a block diagram showing an example of the hardware configuration of the server device of the embodiment.
FIG. 4 is a diagram showing an example of the contents of the singing analysis database of the embodiment.
FIG. 5 is a diagram showing an example of a screen displayed on the display unit of the karaoke device of the embodiment.
FIG. 6 is a diagram showing an example of the contents of singing analysis data according to the second embodiment of the present invention.
FIG. 7 is a diagram for explaining the falsetto detection process.
FIG. 8 is a diagram showing an example of a screen displayed on the display unit of the karaoke device.

Explanation of symbols

1: search system; 2, 2a, 2b, 2c: karaoke device; 3: server device; 4: communication network; 11, 31: CPU; 12, 32: ROM; 13, 33: RAM; 14, 34: storage unit; 15: display unit; 16: operation unit; 17: microphone; 18: audio processing unit; 19: speaker; 20, 35: communication unit.

Claims

A search apparatus comprising:
storage means for storing a plurality of pairs of identification information for identifying a singer and singing analysis data indicating characteristics of the singing voice of the singer;
acquisition means for acquiring singing analysis data indicating characteristics of the singing voice of a singer;
selection means for comparing the singing analysis data acquired by the acquisition means with the singing analysis data stored in the storage means, and selecting one or more pieces of singing analysis data from the singing analysis data stored in the storage means according to the degree of coincidence; and
output means for reading identification information corresponding to the singing analysis data selected by the selection means from the storage means and outputting the read identification information.

The search apparatus according to claim 1, wherein the storage means stores singer information relating to the singer in association with the identification information, and
the output means reads the identification information corresponding to the singing analysis data selected by the selection means from the storage means, outputs the read identification information, reads the singer information corresponding to the identification information from the storage means, and reports the read singer information.

The search apparatus according to claim 1, wherein the storage means stores a communication address indicating contact information of the singer in association with the identification information, and
the output means reads the identification information corresponding to the singing analysis data selected by the selection means from the storage means, outputs the read identification information, reads the communication address corresponding to the identification information from the storage means, and sends a message indicating that the singer has been selected to the read communication address.

The search apparatus according to claim 1, wherein the singing analysis data is data indicating at least one of a pitch, a spectrum, and power of the voice.

The search apparatus according to claim 1, wherein the singing analysis data is technique data indicating a type and timing of a technique used for singing.

The search apparatus according to any one of claims 1 to 5, further comprising:
input means for outputting the input voice of a singer as voice data; and
generation means for generating the singing analysis data from the voice data output by the input means,
wherein the acquisition means acquires the singing analysis data generated by the generation means.

The search apparatus according to claim 6, further comprising storage control means for storing the singing analysis data generated by the generation means in the storage means.