JP2000209500A

JP2000209500A - Method of compositing a separately shot person image onto a recorded background image for display output, and karaoke apparatus employing the method

Info

Publication number: JP2000209500A
Application number: JP11007815A
Authority: JP
Inventors: Makoto Wakamatsu; 誠若松; Hiroshi Shingai; 浩新鎧
Original assignee: Daiichikosho Co Ltd
Current assignee: Daiichikosho Co Ltd
Priority date: 1999-01-14
Filing date: 1999-01-14
Publication date: 2000-07-28

Abstract

(57) [Summary]

[Problem] When a person image is composited onto a background image, to set the position and size of the person image appropriately and automatically according to the content of the background image.

[Solution] Video data serving as background images is stored in a database, and the video data is associated with person synthesis prescription data that specifies a person compositing layout suited to the content of the background image. Video compositing means retrieves the designated video data from the database and plays the background image, and also cuts out the person portion from a separately captured video signal and composites it onto the background image for display. Based on the person synthesis prescription data associated with the video data being output as the background image, the video compositing means variably sets compositing conditions such as where on the background image screen, and at what size, the cut-out person portion is composited.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method of compositing a separately shot person image onto an existing recorded background image for display output, and in particular to a method of extracting only the person portion from video shot with a video camera and compositing it onto the background image. The invention also relates to a karaoke apparatus employing the method.

[0002]

[Prior Art] Everyone has probably seen, on a television weather forecast, footage of a forecaster giving commentary in front of a weather map. This person image synthesis technique, in which a person being shot by a video camera is composited onto a background image and output for display, is by no means limited to television broadcasting. Given a video camera that shoots a foreground such as a person, a video device that plays back a recorded background image, and an editing device with a chroma key function, a similar composite image can be shown on a display. The technique is therefore coming into wide use for entertainment and hobby purposes. One application can be seen in recent karaoke apparatuses.

[0003] A karaoke apparatus normally shows a mood video for the karaoke song on a display in synchronization with the audio output of the karaoke accompaniment. A karaoke apparatus that applies the above video compositing technique uses the device that plays the mood video (a Video CD player or the like) to play a suitable background image, and chroma-key composites onto it the image of the singer being shot by a video camera. This yields video in which the singer appears to be singing at tourist sites around the world, or in a science fiction or fairy-tale setting. Adding a person image synthesis function to the karaoke apparatus in this way is intended to make karaoke more entertaining and to increase the opportunities for using the apparatus.

[0004]

[Problem to Be Solved by the Invention] In conventional person image synthesis methods, the composited person is always displayed at the same size even when the content of the background image (scenery, structures and so on) changes. Admittedly, the background footage in television news and the like is reference material for the report, so it is more convenient for the display size of the announcer or other person to stay the same whatever the footage shows.

[0005] On the other hand, when a person is composited into a place where they are not actually present so that they appear to be there, the result is more realistic if the compositing state (the person's display position and display size on the screen) is changed to suit the content of the background. If the person is composited against a static, distant landscape, they may well look as though they are present at that place without any sense of incongruity. But when, for example, the background footage zooms in or out, the person's display size must change with the magnification of the background. Otherwise the background, which should not move, appears to approach or recede from the person, and the result looks unnatural.

[0006] Furthermore, when a person is placed on a street corner among rows of tall buildings, unless the person is composited at a display size appropriate to the size of the buildings, the person ends up looking the size of a giant monster. When a person is composited onto a background showing a sandy beach with the sea behind it, a person "floating" over the sea is likewise highly unnatural as a visual presentation. This can be avoided by adjusting the compositing state while watching the composite of the background being played and the person being shot, but doing so demands advanced editing skill and complicated editing operations. And because the operation is manual, it naturally takes time to arrive at a suitable compositing state.

[0007] When the person image synthesis function is used in an auxiliary, supplementary way, as in a karaoke apparatus, any operation beyond the apparatus's ordinary ones (entering a song selection, adjusting the volume and so on) is unwanted. Users are not inclined to carry out complicated operations just to use the function. Adding the function therefore cannot be expected to increase usage, and merely raises equipment costs.

[0008] The object of the present invention is therefore to provide a person image synthesis method that automatically determines appropriate compositing conditions, such as position and size, according to the content of the background image and composites the person image onto the background accordingly without any sense of incongruity, and to provide a karaoke apparatus employing the method.

[0009]

[Means for Solving the Problem] The first invention is specified by the following items (1) to (4).

(1) It is a method of compositing a separately shot person image onto a recorded background image for display output.

(2) Video data serving as the background image is stored in a database. In this database, person synthesis prescription data, which directly or indirectly specifies a person compositing layout suited to the content of the background image, is associated with the video data.

(3) Video compositing means is provided that retrieves the designated video data from the database and plays the background image, cuts out the person portion from a separately captured video signal and composites it onto the background image, and outputs the composite video to a display.

(4) Based on the person synthesis prescription data associated with the video data being output as the background image, the video compositing means variably sets compositing conditions such as where on the background image screen, and at what size, the cut-out person portion is composited.

[0010] The second invention is a person image synthesis method in which the video data is digital video data and the person synthesis prescription data is multiplexed into that digital video data. The third invention is a person image synthesis method in which the person synthesis prescription data is recorded separately from the video data, as data associated with recording positions of the video data in the database, and in which the video compositing means successively obtains the recording position of the video data being played and variably controls the compositing layout based on the person synthesis prescription data corresponding to that recording position.
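The lookup described for the third invention, in which prescription data is held separately and selected by the current recording position, might be sketched as follows. This is only an illustration under stated assumptions: the `Prescription` fields, the integer position unit, and the rule that an entry stays in force until the next entry's position are all hypothetical and are not specified in the patent.

```python
import bisect
from typing import List, NamedTuple


class Prescription(NamedTuple):
    position: int  # recording position from which this entry applies (assumed unit)
    size: float    # relative display size of the person image
    x: int         # display coordinate, horizontal
    y: int         # display coordinate, vertical


def prescription_for(table: List[Prescription], current_pos: int) -> Prescription:
    """Return the last entry whose position is at or before current_pos.

    `table` must be sorted by position. The "stays in force until the next
    entry" rule is an assumption made for this sketch.
    """
    keys = [entry.position for entry in table]
    index = bisect.bisect_right(keys, current_pos) - 1
    if index < 0:
        raise ValueError("no prescription entry covers this recording position")
    return table[index]


table = [
    Prescription(position=0, size=1.0, x=320, y=0),
    Prescription(position=5000, size=0.5, x=200, y=100),
    Prescription(position=9000, size=1.5, x=400, y=0),
]
```

Polling `prescription_for` with the position reported by the playback device would give the layout to apply at that moment.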

[0011] The fourth invention is specified by the following items (41) to (44).

(41) It is a karaoke apparatus employing the person image synthesis method of any of claims 1 to 3, comprising the database, the video compositing means, video playback means for retrieving designated video data from the video database and playing the background image, and a video camera.

(42) Music generation data, from which the karaoke accompaniment is produced, is stored in a karaoke database, organized by karaoke song. Karaoke performance processing means processes the music generation data of the designated song and outputs the accompaniment as sound.

(43) A background video designation sequence is defined for each karaoke song. The video playback means plays the video data designated from the video database in order, following the background video designation sequence of the designated song, and thereby outputs the background image to the display in synchronization with the audio output of the karaoke accompaniment.

(44) The video compositing means variably controls the compositing conditions of the person image during performance of the designated song in accordance with the person synthesis prescription data, and displays the composite.

The fifth invention is the karaoke apparatus of the fourth invention in which lyrics rendering data, from which the lyrics image is produced, is stored in the karaoke database for each karaoke song, and in which, when lyrics rendering means processes the lyrics rendering data of the designated song and outputs the corresponding lyrics image to the display in synchronization with the accompaniment, the video compositing means composites both the person image and the lyrics image onto the corresponding background image for display.

[0012]

[Embodiments of the Invention]

=== Basic configuration and operation of the karaoke apparatus ===

FIG. 1 shows the configuration of a karaoke apparatus according to an embodiment of the present invention. The central control unit 11 is a microcomputer containing a CPU, RAM and ROM. It exchanges data with the surrounding components over a data bus 100 and a control bus 110, and controls and supervises the karaoke apparatus 1.

[0013] <Data processed by the karaoke apparatus and where it is stored> The hard disk drive 12 stores, as karaoke data, data sets keyed by song ID. Each data set includes the music generation data from which the karaoke accompaniment is produced, the lyrics rendering data from which the lyrics image to be shown on the display in synchronization with the performance is produced, and the background video designation sequence described below.

[0014] The DVD-ROM changer 23 holds a large number of DVD-ROMs (hereinafter, DVDs). Its built-in drive loads any of the DVDs, randomly accesses the required data recording position, and reads the data recorded there. The data recorded on each DVD is multiplexed data consisting mainly of long-duration digital video data in the MPEG2-VIDEO format.

[0015] The music generation data is sound source control data written in the MIDI standard. It describes what accompaniment sound the synthesizer 13 should generate at each point on the performance timeline. The lyrics rendering data describes the formatted lyric characters, the timing at which each lyric string is shown on and erased from the display, and the color-change pattern for the lyric string being displayed. The background video designation sequence specifies how short video segments are to be joined together to form the background shown on the display 22 behind the lyrics image. It lists, in chronological order, entries that each specify which DVD in the DVD changer 23 to play, from which data recording position, and for how long.
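The background video designation sequence can be pictured as an ordered list of clip entries. The sketch below is only an illustration: the `ClipSpec` field names, their units, and the `clip_at` helper are hypothetical, since the patent does not define a concrete encoding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ClipSpec:
    disc_id: int       # which DVD in the changer (hypothetical numbering)
    start_pos: int     # data recording position on that disc (assumed unit)
    duration_s: float  # how long to play from start_pos, in seconds


def clip_at(sequence: List[ClipSpec], t: float) -> Optional[ClipSpec]:
    """Return the clip that should be playing t seconds into the song."""
    elapsed = 0.0
    for clip in sequence:
        if elapsed <= t < elapsed + clip.duration_s:
            return clip
        elapsed += clip.duration_s
    return None  # t lies outside the span covered by the sequence


# Example sequence for one song: three short segments joined end to end.
sequence = [
    ClipSpec(disc_id=3, start_pos=12000, duration_s=20.0),
    ClipSpec(disc_id=3, start_pos=58000, duration_s=15.0),
    ClipSpec(disc_id=7, start_pos=400, duration_s=30.0),
]
```

Listing the entries in chronological order, as the text describes, means the active clip can be found by accumulating durations.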

[0016] <Karaoke performance processing> When the central control unit 11 receives, via the operation control unit 19, the ID of a requested song entered at an operation input unit such as the remote control transmitter 17 or the operation panel 18, it stores the ID in RAM together with its position in the input order, thereby registering a performance reservation for the song. It then retrieves the karaoke data of each song in reservation order. It transfers the MIDI data to the synthesizer 13, which generates the karaoke accompaniment. The accompaniment audio signal output by the synthesizer 13 is mixed with the singing voice signal from the microphone 15 and output as sound from the speaker 16.

[0017] Meanwhile, in synchronization with the generation of the karaoke accompaniment, the lyrics rendering data is processed, and the color-changing lyrics image is successively generated as bitmap image data and written into the video RAM 20. The DVD changer 23 is also controlled according to the background video designation sequence so that it outputs multiplexed data containing the specified video data. The decoder 40 demultiplexes the multiplexed data that has been read out to extract the digital video data, decodes that video data, and reproduces it as an NTSC signal corresponding to the background image. The video control unit 21 superimposes the lyrics image held in the video RAM 20 onto this background image and outputs the result to the display 22.

[0018] In addition to the basic functions described above, the karaoke apparatus 1 of this embodiment has a person image synthesis function that shoots the singer singing along to the karaoke song with a video camera 25 and composites only the person portion onto the background image.

[0019] === Overview of the person image synthesis function ===

The person image synthesis method of the present invention is fundamentally based on a video compositing technique such as chroma keying. This embodiment uses chroma key compositing. The singer stands and sings at a predetermined position on a stage whose back wall and floor are painted a predetermined color (blue in this embodiment). The video camera 25 shoots the singer. The blue portion (the blue backdrop) is removed from the camera footage to extract an image of the singer alone (the person image). The video control unit 30 then composites this person image onto the background image and shows the result on the display 22.
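As a rough illustration of the blue-backdrop extraction just described, the following sketch keys out pixels in which blue clearly dominates. The threshold rule and the names `is_backdrop`, `person_mask` and `chroma_key` are assumptions made for this example; the patent only says that the blue portion is removed and does not specify how.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def is_backdrop(p: Pixel, margin: int = 60) -> bool:
    """Treat a pixel as backdrop when blue exceeds red and green by `margin`.

    The margin value is an arbitrary choice for this sketch.
    """
    r, g, b = p
    return b - max(r, g) >= margin


def person_mask(frame: List[List[Pixel]], margin: int = 60) -> List[List[bool]]:
    """True where a pixel belongs to the person, False where it is backdrop."""
    return [[not is_backdrop(p, margin) for p in row] for row in frame]


def chroma_key(frame: List[List[Pixel]],
               background: List[List[Pixel]],
               margin: int = 60) -> List[List[Pixel]]:
    """Replace backdrop pixels of `frame` with the matching background pixels."""
    return [
        [bg if is_backdrop(fg, margin) else fg for fg, bg in zip(f_row, b_row)]
        for f_row, b_row in zip(frame, background)
    ]


BLUE, SKIN, GREEN = (10, 20, 230), (220, 180, 150), (0, 128, 0)
frame = [[BLUE, SKIN], [BLUE, BLUE]]
background = [[GREEN, GREEN], [GREEN, GREEN]]
result = chroma_key(frame, background)  # only the SKIN pixel survives from the camera
```

A real keyer would also have to handle soft edges and blue spill, which this sketch ignores.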

[0020] Where the present invention differs decisively from conventional video compositing methods is that the compositing layout of the person image, such as its display size and display position, can be controlled automatically and appropriately according to the content of the background image: for example, scenery and structures, their dynamic changes (zooming, panning and the like), and the movement of individual elements (objects: people and things) within the background. To achieve this advanced person image synthesis function, in this embodiment control data expressing information about the compositing layout (the person synthesis prescription data) is recorded on the DVD multiplexed with the digital video data.

[0021] In addition to the lyrics superimposition described above, the video control unit 30 has functions for applying various video effects to input video signals, including A/D and D/A conversion of video signals, chroma key compositing, and enlargement, reduction and movement of images. The person image synthesis function is achieved by the video control unit 30 applying video effect processing to the input background and camera video signals based on the person synthesis prescription data.

[0022] === Person synthesis prescription data ===

The person synthesis prescription data is recorded on the DVD as multiplexed data conforming to the MPEG2-systems standard, on the same timeline as the digital video data. It specifies, as a display size and display coordinates, how large and where on the screen of the display 22 the person image is to be shown, to suit the content of the background image being played at the moment the prescription data is processed.

[0023] The display size is a relative size with the screen height of the display 22 taken as 1. When the value is 1, the person image extracted from the camera footage is composited without any change of scale. When it is 1/2, the person image is reduced to 1/2 both vertically and horizontally. When it is 2, it is enlarged to twice the size both vertically and horizontally. The display coordinates give the horizontal and vertical position on the screen of the display 22 as x and y coordinates, with the lower-left corner at (0, 0) and the upper-right corner at (640, 480).
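To make the coordinate convention concrete, the sketch below converts a display size and display coordinates into a rectangle in the top-left-origin raster coordinates that most image code uses. It assumes that the base point of the person image is the bottom centre of the camera frame, which is consistent with (320, 0) being the position that leaves the image unmoved; the function name `placement` and its return layout are hypothetical.

```python
SCREEN_W, SCREEN_H = 640, 480  # coordinate range given in the text


def placement(size, x, y, src_w=SCREEN_W, src_h=SCREEN_H):
    """Return (left, top, width, height) in top-left-origin raster coordinates.

    size : relative display size (1 means no scaling)
    x, y : display coordinates, origin at the lower-left corner of the screen
    The base point is assumed to be the bottom centre of the person image.
    """
    width = round(src_w * size)
    height = round(src_h * size)
    left = x - width // 2            # centre the image horizontally on x
    top = SCREEN_H - y - height      # flip the y axis: y counts upward from the bottom
    return left, top, width, height


full = placement(1, 320, 0)      # unscaled, unmoved: covers the whole screen
half = placement(0.5, 320, 0)    # half size, standing on the bottom edge
```

With a size above 1 and y = 0 the computed `top` becomes negative, meaning the head would leave the screen; this sketch does not correct for that.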

[0024] === User interface and preliminary settings ===

When specifying a karaoke song, the user can perform a suitable input operation indicating that the person image synthesis function is to be used (for example, each song may be given a separate song number for use with the function, and the performance requested by that number). The central control unit 11 then reserves the song with the person image synthesis function set to start when the song is performed.

[0025] Before the song is performed, the singer steps onto the special stage that serves as the blue backdrop. The position where the singer should stand is marked on this stage. The zoom magnification and viewing direction of the video camera 25 pointed at the stage are fixed in advance so that a person of ordinary build (in this embodiment, a person of average Japanese adult height) standing on the mark is framed in a predetermined composition. In this embodiment the camera is fixed so that this person appears at the horizontal center of the field of view and fills it vertically from the feet to the top of the head. Shooting by the video camera 25 is started and stopped by infrared signals from the video camera remote control IF 26. The control signal could of course be carried over a wired connection instead.

[0026] When starting the performance of the song, the central control unit 11 instructs the video control unit 30 to start the person image synthesis function. It also instructs the decoder 40 to send out the demultiplexed person synthesis prescription data. In addition, it has the video camera remote control IF 26 emit the infrared signal that causes the video camera 25 to start shooting.

[0027] === Operation of the person image synthesis function ===

When performance processing of the song starts, the DVD changer 23 outputs, in synchronization with it, multiplexed data containing the digital video data of the background image. The decoder 40 separates this multiplexed data into the digital video data and the person synthesis prescription data described above. The video data is decoded inside the decoder 40 and input to the video control unit 30 as an analog video signal. The person synthesis prescription data is input to the video control unit 30 as digital data over the data bus 100.

[0028] Meanwhile, the video signal shot by the video camera (an analog signal) is also input to the video control unit 30. The video control unit 30 converts this footage to digital data and carries out the process for the person image synthesis function described above. First, it extracts the person portion by removing the blue backdrop from the footage. It then processes this person image data into image data whose display size and display position have been converted as required by the person synthesis prescription data.

[0029] Display size is controlled as follows. With the person image data laid out against the pixels of the display 22 screen, the pixels making up the person image are thinned out or interpolated as appropriate to reduce or enlarge the image. This produces scaled person image data. The display position of the scaled person image is then moved according to the display coordinates. The base point of the person image is the marked position on the stage where the singer stands, and this base point is moved to the specified display coordinates (x, y). Because the upper part of the person could run off the top of the screen when the image is enlarged, in the y direction the base point is shifted downward by the number of new pixels generated by interpolation. For example, if the specified display size is 1.5, the base point is shifted downward by 480 × 0.5 = 240 pixels. This ensures that the person's face always falls within the screen. The person image data with converted scale and coordinates is converted to an analog video signal and inserted into the background image by conventional chroma key compositing. The lyrics image held in the video RAM 20 is also composited, and the result is output to the display 22.
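The scaling and the downward shift described above can be sketched as follows. Nearest-neighbour resampling is used here as one simple way to thin out or repeat pixels; the patent does not name a specific interpolation method, and the function names `resize_nearest` and `downward_shift` are hypothetical.

```python
def resize_nearest(img, scale):
    """Scale a 2-D list of pixels by `scale` using nearest-neighbour sampling.

    Reducing drops (thins out) source pixels; enlarging repeats them.
    """
    src_h, src_w = len(img), len(img[0])
    dst_h = max(1, round(src_h * scale))
    dst_w = max(1, round(src_w * scale))
    return [
        [img[min(src_h - 1, int(r / scale))][min(src_w - 1, int(c / scale))]
         for c in range(dst_w)]
        for r in range(dst_h)
    ]


def downward_shift(scale, screen_h=480):
    """Pixels by which the base point is lowered when enlarging.

    Follows the worked example in the text: 480 x (1.5 - 1) = 240.
    No shift is applied when the image is reduced or left unscaled.
    """
    return round(screen_h * (scale - 1)) if scale > 1 else 0


small = [[1, 2],
         [3, 4]]
enlarged = resize_nearest(small, 2)       # each pixel becomes a 2x2 block
reduced = resize_nearest(enlarged, 0.5)   # thinning recovers the original
```

The shift equals the extra height added by enlargement, so the top of the enlarged image lines up with the top of the screen and the head stays visible while the feet move below the frame.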

[0030] FIG. 2 outlines the person image synthesis function described above in terms of what is shown on the display screen. The example composites a singer onto a background showing a car parked in front of a building. With conventional chroma key compositing, or when the display size is specified as 1 and the display position as (320, 0), that is, when no scaling or movement is performed, a person image (A1) is extracted from footage of the singer on the blue-backdrop stage. This is chroma-key composited onto the background image (B), and the composite image (C1) is displayed.

[0031] With the person image synthesis method of the present invention, by contrast, the image (A1) is digitally processed to generate a person image (A2) whose scale and position have been changed. This image (A2) is chroma-key composited with the background image (B) to obtain a composite image (C2) that looks natural.

[0032] === Supplementary notes and other embodiments ===

In the embodiment above, the person image is scaled according to the display size in the person synthesis prescription data regardless of the singer's build. A tall singer, however, may extend beyond the video camera's field of view. In that case the zoom magnification of the video camera may be changed beforehand so that the singer just fits within the frame. Alternatively, the zoom magnification may be varied while the person image extracted from the camera footage is monitored, and fixed at the point where the blue backdrop color appears at the top edge of the frame. Conversely, when the singer is a child or otherwise appears small in the camera's field of view, the zoom magnification can be changed in the same way so that the person image matches the screen height of the display. When the singer fits within the field of view, the person's height can be obtained from the extracted person image, so the person's height may instead be matched to the screen height by image processing, without changing the zoom magnification.
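The image-processing alternative mentioned last, measuring the person's height in the extracted image and scaling it to the screen height, could look roughly like this. The boolean mask representation and the names `person_height` and `height_fit_scale` are assumptions for the sketch.

```python
def person_height(mask):
    """Number of rows spanned by the person in a boolean mask (True = person)."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def height_fit_scale(mask):
    """Scale factor that makes the person as tall as the frame."""
    h = person_height(mask)
    if h == 0:
        raise ValueError("no person found in the mask")
    return len(mask) / h


# A 6-row frame in which a short singer occupies rows 2 to 4 only.
mask = [
    [False, False, False],
    [False, False, False],
    [False, True,  False],
    [True,  True,  True],
    [False, True,  False],
    [False, False, False],
]
scale = height_fit_scale(mask)  # 6 rows / 3 rows
```

The resulting factor would be applied before the display size from the prescription data, so that every singer starts from the same normalized height.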

【００３３】Further, instead of specifying the size of the person to be composited onto the background image as a value relative to the screen height, the size (height, width, etc.) of a predetermined part, such as the person's face, may be specified. In this case, the predetermined part is identified in the cut-out person image by image recognition, and the whole person is scaled uniformly (a similarity transformation) so that the part takes the specified size before compositing.
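
The arithmetic behind this variant is a single ratio. The sketch below assumes some recognizer has already measured the part's height in pixels; the recognizer itself and both function names are hypothetical.

```python
from typing import Tuple


def scale_for_part(part_height_px: float, target_height_px: float) -> float:
    """Uniform scale factor that brings the recognized part to the target height."""
    if part_height_px <= 0:
        raise ValueError("part height must be positive")
    return target_height_px / part_height_px


def scaled_person_size(person_w: int, person_h: int, s: float) -> Tuple[int, int]:
    """Apply the same factor to both axes so that proportions are preserved."""
    return round(person_w * s), round(person_h * s)
```

For example, a face measured at 40 px that should appear at 60 px gives a factor of 1.5, which is then applied to the entire cut-out person rather than to the face alone.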

【００３４】When a person image is composited, any video effect that the video control unit 30 can execute may be specified as person synthesis prescription data, in addition to the dimensions and position described above, and the person is then composited with the corresponding layout. Examples include rotating the person, mirroring the person, and copying the person so that multiple persons are composited onto the background image. A part of the person image, such as the upper half or the lower half, can also be composited and displayed.
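
One way to picture effects being named in the prescription data is a lookup from effect names to image operations. The sketch below is an assumption-laden illustration: the effect names, the restriction of rotation to 90-degree steps, and the use of small integer grids as images are all simplifications introduced here.

```python
from typing import Callable, Dict, List, Sequence

Image = List[List[int]]


def mirror(img: Image) -> Image:
    """Flip left to right."""
    return [row[::-1] for row in img]


def rotate_90_cw(img: Image) -> Image:
    """Rotate a quarter turn clockwise (arbitrary angles are omitted here)."""
    return [list(col) for col in zip(*img[::-1])]


def upper_half(img: Image) -> Image:
    """Keep only the upper half of the rows."""
    return img[: (len(img) + 1) // 2]


def lower_half(img: Image) -> Image:
    """Keep only the lower half of the rows."""
    return img[len(img) // 2:]


EFFECTS: Dict[str, Callable[[Image], Image]] = {
    "mirror": mirror,
    "rotate_90_cw": rotate_90_cw,
    "upper_half": upper_half,
    "lower_half": lower_half,
}


def apply_effects(img: Image, names: Sequence[str]) -> Image:
    """Apply the named effects in the order given by the prescription data."""
    for name in names:
        img = EFFECTS[name](img)
    return img
```

Copying the person to show several figures would then amount to compositing the processed image at more than one position.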

【００３５】Furthermore, a background image can easily be generated by CG (computer graphics) techniques, and in that case the display position of the person image can also be specified as a three-dimensional position in the CG virtual space. This makes it possible to express the depth relationship between the composited person and objects in the background image. For example, the singer can be placed on the stage of a concert hall so that the singer appears and disappears between members of a large audience in front of the stage.
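
The occlusion effect can be illustrated with a back-to-front ("painter's algorithm") draw order. This is a swapped-in, simplified technique chosen for illustration; the original does not say how the CG renderer resolves depth. The `Layer` type and the sparse pixel dictionaries are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


@dataclass
class Layer:
    name: str
    depth: float              # distance from the virtual camera; larger is farther
    pixels: Dict[Coord, str]  # (x, y) -> colour label; absent keys are transparent


def render(layers: List[Layer]) -> Dict[Coord, str]:
    """Draw layers from farthest to nearest so nearer layers hide farther ones."""
    frame: Dict[Coord, str] = {}
    for layer in sorted(layers, key=lambda l: l.depth, reverse=True):
        frame.update(layer.pixels)
    return frame
```

Giving the singer a depth between the stage and the audience makes audience pixels cover the singer wherever the two overlap, while the singer still covers the stage.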

【００３６】In the above embodiment, a predetermined position on the stage is treated as the base point of the person image, but the singer may also be tracked so that the person can still be composited at the specified position when the singer moves about the stage. The horizontal width of the person can be obtained from the extracted person image, and the intersection of the center of that width with the stage surface (at the person's feet) may be used as the base point. Alternatively, the video camera may be mounted on a pan head that rotates freely in the horizontal direction and whose rotation angle can be controlled, and the singer may be tracked so as to always appear at the center of the screen.
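
The base-point calculation from the extracted person image might look like the following sketch. It assumes a boolean person mask and takes the lowest person row as the stage surface at the person's feet; both choices, and the function names, are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Mask = List[List[bool]]  # True where a pixel belongs to the person


def base_point(mask: Mask) -> Optional[Tuple[int, int]]:
    """Return (x, y): the centre of the person's horizontal extent at foot level."""
    cols = [x for row in mask for x, v in enumerate(row) if v]
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not cols:
        return None  # no person in the frame
    centre_x = (min(cols) + max(cols)) // 2
    return centre_x, rows[-1]


def placement_offset(base: Tuple[int, int], target: Tuple[int, int]) -> Tuple[int, int]:
    """Shift that moves the base point onto the position given in the layout."""
    return target[0] - base[0], target[1] - base[1]
```

Recomputing `base_point` for every frame keeps the composited person anchored at the prescribed position even as the singer moves within the camera's field of view.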

【００３７】The person synthesis prescription data need not be specified directly as numerical values, as it is in the above embodiment. Any information concerning the layout at the time of compositing will do. For example, the content of the background image may be expressed by keywords (sea, town, etc.), with the dimensions and position corresponding to each keyword provided in a table or the like.
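
A keyword table of this kind could be as simple as a dictionary lookup. In the sketch below, the `Layout` fields and every numeric value are invented placeholders, not values from the original; only the keywords "sea" and "town" come from the text.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Layout:
    height_ratio: float  # person height as a fraction of screen height
    x_ratio: float       # horizontal base position as a fraction of screen width
    y_ratio: float       # vertical base position as a fraction of screen height


# Illustrative values only; a real table would be authored per background content.
LAYOUT_BY_KEYWORD: Dict[str, Layout] = {
    "sea": Layout(height_ratio=0.4, x_ratio=0.5, y_ratio=0.9),
    "town": Layout(height_ratio=0.7, x_ratio=0.3, y_ratio=1.0),
}

DEFAULT_LAYOUT = Layout(height_ratio=0.5, x_ratio=0.5, y_ratio=1.0)


def layout_for(keywords: Sequence[str]) -> Layout:
    """Return the layout of the first keyword found in the table."""
    for keyword in keywords:
        if keyword in LAYOUT_BY_KEYWORD:
            return LAYOUT_BY_KEYWORD[keyword]
    return DEFAULT_LAYOUT
```

The indirection means that background images only need to carry keywords, and the layout values can be tuned in one place.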

【００３８】The person image synthesizing method described above is clearly applicable not only to karaoke apparatus but also to cases where video from multiple sources is composited on a personal computer, video editing equipment, or the like. The background image and the person image naturally need not be moving images and may be still images. As long as person synthesis prescription data corresponding to the background image is provided and compositing can be performed on the basis of that data, an equivalent video compositing function can be realized.

【００３９】When the background image is recorded on a medium such as an analog optical disc, the person synthesis prescription data cannot be provided as data multiplexed with the video data. Where the prescription data cannot be recorded in parallel with the video data in this way, it may be prepared as separate data that associates the person synthesis prescription data with the video recording position output during reproduction (frame data), such as a frame number, a track number, or the elapsed playback time from the recording start point of the video on the medium (linear time). Naturally, when the video data is recorded across multiple media, as with a disc changer, the medium ID is associated along with the frame data. If this method is applied to a karaoke apparatus, the predetermined composite layout can be identified from the frame data specified by the background video designation sequence. The person synthesis prescription data may of course also be prepared separately for each song, as data associated with the performance time series.
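
A separate table keyed by medium ID and frame number could be consulted during playback as in the sketch below. The table contents, the layout labels, and the rule that each entry applies from its start frame until the next entry are assumptions made for illustration.

```python
import bisect
from typing import Dict, List, Tuple

# medium ID -> list of (start_frame, layout label), sorted by start_frame.
# All entries are made-up sample data.
PRESCRIPTION_TABLE: Dict[str, List[Tuple[int, str]]] = {
    "disc-A": [(0, "wide"), (300, "close-up"), (900, "left-third")],
    "disc-B": [(0, "centre")],
}


def layout_at(medium_id: str, frame: int) -> str:
    """Return the layout in force at the given frame of the given medium."""
    entries = PRESCRIPTION_TABLE[medium_id]
    starts = [start for start, _ in entries]
    # Index of the last entry whose start frame is <= the current frame.
    i = bisect.bisect_right(starts, frame) - 1
    if i < 0:
        raise KeyError(f"no layout defined before frame {frame}")
    return entries[i][1]
```

The same lookup works with elapsed playback time or track numbers in place of frame numbers, provided the keys are kept sorted.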

【００４０】[Effects of the Invention] According to the person image synthesizing method of the present invention, person synthesis prescription data expressing layout information for compositing a person image onto a background image is associated with the video data to be reproduced, and the compositing and display processing is performed on the basis of this prescription data. As a result, the person is automatically composited into the background image at an appropriate position, size, and so on, without complicated operations or advanced editing skills. Moreover, the compositing can follow changes in the background image. This gives the composite image a sense of realism, as if the person had been present where the background image was shot.

【００４１】The data form of the person synthesis prescription data can be chosen to suit the recording format of the background image (digital, analog, etc.). For example, it can be multiplexed into digital video data, or held as separate data aligned with the reproduction time series of the video data. An advanced person image compositing function can therefore be realized independently of the data format of the background image.

【００４２】If this person image synthesizing method is applied to a karaoke apparatus, the singer can be composited naturally into the mood video shown on the display during the performance of a karaoke song, with that video serving as the background image. If a lyrics image is inserted as well, the singer and companions can watch the composite image together with the lyrics.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of a karaoke apparatus according to an embodiment of the present invention.

FIG. 2 is a schematic explanatory diagram of the person image synthesizing function in the above embodiment.

[Explanation of symbols]

1 Karaoke apparatus
11 Central control unit
12 Hard disk drive
23 DVD changer
30 Video control unit
40 Decoder

Continued from the front page
(51) Int. Cl.⁷ / Identification symbol / FI / Theme code (reference): H04N 9/75; G06F 15/66 450; 5D108
F-terms (reference):
5B057 AA20 CA01 CB01 CE08 CE09
5C023 AA02 AA03 AA04 AA05 AA17 AA18 BA11 CA03 CA04 CA05 DA03 DA08
5C066 AA00 AA01 AA07 AA12 BA17 ED03 ED04 ED09 ED13 GA31 GB01 HA02 KE09 KM01 KM11
5C080 AA01 AA09 BB05 CC03 DD30 EE01 EE17 EE19 EE29 FF09 GG02 GG12 JJ01 JJ02
5C082 AA05 AA21 AA27 AA31 AA37 BA02 BA12 BA41 BB15 CA32 CA52 CA59 DA51 MM05
5D108 BA04 BB06 BD02 BE06 BF20

Claims

[Claims]

1. The invention specified by the following items (1) to (4).
(1) A method of compositing a separately shot person image onto a recorded background image and displaying and outputting the result.
(2) Video data serving as the background image is stored in a database. In this database, person synthesis prescription data, which directly or indirectly specifies a person compositing layout suited to the content of the background image, is associated with the video data.
(3) Video synthesizing means is provided that retrieves specified video data from the database and reproduces the background image, and that cuts out a person image portion from a separately captured video signal, composites it onto the background image, and outputs the composite image to a display.
(4) Based on the person synthesis prescription data associated with the video data being output as the background image, the video synthesizing means variably sets the compositing conditions, such as the position on the background image screen at which, and the size at which, the cut-out person image portion is composited.

2. The person image synthesizing method according to claim 1, wherein the video data is digital video data, and the person synthesis prescription data is multiplexed with the digital video data.

3. The person image synthesizing method according to claim 1, wherein the person synthesis prescription data is recorded separately from the video data, as data associated with recording positions of the video data in the database, the recording position of the video data is obtained sequentially, and the composite layout is variably controlled as appropriate on the basis of the person synthesis prescription data corresponding to that recording position for compositing and display.

4. The invention specified by the following items (41) to (44).
(41) A karaoke apparatus employing the person image synthesizing method according to any one of claims 1 to 3, comprising video reproducing means that retrieves specified video data from the database and reproduces the background image, the video synthesizing means, a video database, and a video camera.
(42) Music generation data, from which the karaoke accompaniment music is generated, is stored in a karaoke database for each karaoke song. Karaoke performance processing means processes the music generation data of the designated song and outputs the accompaniment music as sound.
(43) A background video designation sequence is defined for each karaoke song. The video reproducing means sequentially reproduces the specified video data from the video database in accordance with the background video designation sequence of the designated song, thereby displaying the background image on the display in synchronization with the audio output of the karaoke accompaniment music.
(44) In accordance with the person synthesis prescription data, the video synthesizing means variably controls the compositing conditions of the person image during the performance of the designated song and performs composite display.

5. The karaoke apparatus according to claim 4, wherein lyrics rendering data, from which the lyrics image is generated, is stored in the karaoke database for each karaoke song; lyrics rendering means processes the lyrics rendering data of the designated song and outputs the corresponding lyrics image to the display in synchronization with the accompaniment music; and the video synthesizing means composites and displays the person image and the lyrics image on the corresponding background image.