JP2002125199A

JP2002125199A - Frame information description method, frame information generation device and method, video reproduction device and method, and recording medium

Info

Publication number: JP2002125199A
Application number: JP2001200220A
Authority: JP
Inventors: Osamu Hori; 修堀; Toshimitsu Kaneko; 敏充金子; Takeshi Mita; 雄志三田; Koji Yamamoto; 晃司山本; Koichi Masukura; 孝一増倉
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2000-06-30
Filing date: 2001-06-29
Publication date: 2002-04-26
Anticipated expiration: 2021-06-29
Also published as: JP4253139B2

Abstract

(57) [Abstract]
[Problem] To provide a special reproduction control information generation device that generates special reproduction control information enabling special reproduction that is more effective for the user.
[Solution] The special reproduction control information generation device reads the target video data from a video data storage unit 2. A video position information processing unit 11 selectively extracts, in order along the frame sequence, the subset of frames to be used for special reproduction, and creates video position information indicating the position of each extracted frame in the original video data. A display time control information processing unit 12 creates display time control information indicating the display time of each frame. The device then creates special reproduction control information in which the video position information and the display time control information of each frame are arranged as frame information, and stores it in a special reproduction control information storage unit 3.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a frame information description method, a frame information generation device and method, a video reproduction device and method, and a recording medium for special reproduction of digital content, for example video content.

[0002]

[Prior art] In recent years, compressing moving images as digital video and storing them on disk media such as DVDs and HDDs has made random-access reproduction of video practical. In this environment, reproduction can start partway through, from a predetermined location, with almost no waiting time. As with conventional tape media, fast forward and fast reverse at 2x to 4x speed are also possible.

[0003]

[Problems to be solved by the invention] However, video is often long, and even reproduction at 2x to 4x speed does not sufficiently shorten the time needed to view the entire content. If the reproduction speed is raised further, the screen changes faster than a viewer can follow, making the content hard to grasp, and fast reproduction is also applied to portions that matter little for understanding the content, which is wasteful.

[0004] This problem is not limited to video content; it also applies to content consisting of sound or text, and to multimedia content.

[0005] An object of the present invention is to provide a frame information description method, a frame information generation device and method, a video reproduction device and method, and a recording medium that enable special reproduction that is more effective for the user.

[0006]

[Means for solving the problems] To solve the above problems and achieve the object, the present invention uses the following means.

[0007] The frame information description method of the present invention comprises a step of describing first information that specifies the position, in original video data, of a frame extracted from the plurality of frames of the original video data, and a step of describing second information relating to the display time of the extracted frame.

[0008] The extracted frame may consist of a frame group, and the first information may include information specifying the position of the extracted frame group in the original video data.

[0009] The method may further comprise a step of describing third information relating to the importance of the extracted frame.

[0010] The first information may include information specifying an image data file that corresponds to the extracted frame and is created from the original video data.

[0011] The extracted frame may include a frame extracted from the plurality of frames in a certain temporal section of the original video data, and the method may further comprise a step of describing fourth information that specifies the temporal section.

[0012] The first information may include information specifying an image data file that corresponds to the extracted frame and is created from the original video data.

[0013] The second information may include information relating to display times chosen so that the amount of screen change during reproduction is substantially constant.

[0014] The method may further comprise a step of describing fifth information indicating whether the extracted frame is to be reproduced or not reproduced.

[0015] The first information may include information indicating the position of the extracted frame, or information indicating the position of the image data corresponding to the extracted frame within an image data file that is generated from the original video data and stored separately from it.

[0016] The method may further comprise a step of describing, for media data other than the original video data that includes the extracted frame, information indicating the position of that media data and information relating to the display time of that media data.

[0017] In the computer-readable recording medium of the present invention, which stores frame information relating to a frame extracted from the plurality of frames of original video data, the frame information comprises first information that specifies the position of the extracted frame in the original video data and second information relating to the display time of the extracted frame.

[0018] The extracted frame may consist of a frame group, and the first information may include information specifying the position of the extracted frame group in the original video data.

[0019] The frame information may further comprise third information relating to the importance of the extracted frame.

[0020] The first information may include information specifying an image data file that corresponds to the extracted frame and is created from the original video data.

[0021] Together with the frame information, the recording medium may further store the original video data and the image data file that corresponds to the extracted frame and is created from the original video data.

[0022] The frame information description device of the present invention comprises means for describing first information that specifies the position, in original video data, of a frame extracted from the plurality of frames of the original video data, and means for describing second information relating to the display time of the extracted frame.

[0023] The frame information generation method of the present invention comprises a step of generating first information that specifies the position, in original video data, of a frame extracted from the plurality of frames of the original video data, and a step of generating second information relating to the display time of the extracted frame.

[0024] The video reproduction device of the present invention comprises: means for referring to first information that specifies the position, in original video data, of a frame extracted from the plurality of frames of the original video data, and to second information relating to the display time of the extracted frame; means for acquiring the original video data of the extracted frame on the basis of the first information; means for determining, on the basis of the second information, the display time for which the original video data of the extracted frame is to be reproduced; and means for reproducing the acquired original video data for the determined display time.

[0025] The video reproduction method of the present invention comprises: a step of referring to first information that specifies the position, in original video data, of a frame extracted from the plurality of frames of the original video data, and to second information relating to the display time of the extracted frame; a step of acquiring the original video data of the extracted frame on the basis of the first information; a step of determining, on the basis of the second information, the display time for which the original video data of the extracted frame is to be reproduced; and a step of reproducing the acquired original video data for the determined display time.

[0026] In the computer-readable recording medium of the present invention, which stores a video reproduction program, the video reproduction program comprises: program code that causes a computer to refer to first information that specifies the position, in original video data, of a frame extracted from the plurality of frames of the original video data, and to second information relating to the display time of the extracted frame; program code that causes the computer to acquire the original video data of the extracted frame on the basis of the first information; program code that causes the computer to determine, on the basis of the second information, the display time for which the original video data of the extracted frame is to be reproduced; and program code that causes the computer to reproduce the acquired original video data for the determined display time.

[0027] The frame information description method of the present invention comprises a step of describing first information that specifies the position, in original sound data, of a sound frame extracted from the plurality of frames of the original sound data, and a step of describing second information relating to the reproduction time of the extracted frame.

[0028] In the computer-readable recording medium of the present invention, which stores frame information relating to a sound frame extracted from the plurality of frames of original sound data, the frame information comprises first information that specifies the position of the extracted frame in the original sound data and second information relating to the reproduction time of the extracted frame.

[0029] The frame information description method of the present invention comprises a step of describing first information that specifies the position of a text frame extracted from the plurality of frames of original text data, and a step of describing second information relating to the display time of the extracted frame.

[0030] In the computer-readable recording medium of the present invention, which stores frame information relating to a text frame extracted from the plurality of frames of original text data, the frame information comprises first information that specifies the position of the extracted frame in the original text data and second information relating to the display time of the extracted frame.

[0031] According to the present invention, there are provided a frame information description method, a frame information generation device and method, a video reproduction device and method, and a recording medium that enable special reproduction that is more effective for the user.

[0032]

[Embodiments of the invention] Embodiments of the invention are described below with reference to the drawings.

[0033] The present invention applies to all digital content, but the embodiment described here is the reproduction of video content containing video data. The video data is assumed to consist of a set of video frames (a video frame group) that make up a moving image.

[0034] First, the special reproduction control information, which plays an important role in this embodiment, is described.

[0035] Special reproduction control information is control information for special reproduction of target video data. It is created from the video data by a special reproduction control information generation device and is attached to, or associated with, the video data. Special reproduction is reproduction by any method other than normal reproduction, for example multiple-speed reproduction (or high-speed reproduction), skip reproduction (or continuous skip reproduction), and trick reproduction. Trick reproduction comes in many varieties, for example reordered reproduction, duplicated reproduction, and slow reproduction. The special reproduction control information is referred to, for example, when a video reproduction device that reproduces the video data performs special reproduction.

[0036] FIG. 1 shows an example of the basic data structure of the special reproduction control information created from the video data that is the target of special reproduction.

[0037] This data structure describes a plurality of pieces of frame information i (i = 1 to N) in association with the order in which the frames appear in the original video data. Each piece of frame information contains a pair consisting of video position information 101 and display time control information 102. The video position information 101 includes information indicating the location of the original video to be displayed during special reproduction: a single frame, a group of consecutive frames, or a group of neighboring frames, that is, a frame group made up of some of a run of consecutive frames. The display time control information 102 includes information indicating the display time for which that original video should be displayed during special reproduction, and/or information from which the display time is calculated.

[0038] In FIG. 1 the frame information i is arranged in the order in which the frames appear in the video data. However, if information indicating the order of each piece of frame information is described within the frame information i, the frame information i may be arranged and described in any order.

[0039] Reproduction magnification information 103, attached to the frame information group made up of the plurality of pieces of frame information i, indicates the speed multiplier for special reproduction. It is used to specify that the display times described in the frame information are not to be used as they are, but shortened so that reproduction runs at several times the speed. The reproduction magnification information 103 is not mandatory, however. Possible configurations are to always attach it, to never attach it, or to allow the choice to be made case by case. Even when the reproduction magnification information 103 is attached, special reproduction need not use it: possible configurations are to always use it, to never use it, or to allow the choice to be made case by case.

[0040] In FIG. 1, a configuration is also possible in which further control information is added to the frame information group, either together with the reproduction magnification information or in place of it. A configuration in which further control information is added to each piece of frame information i is also possible. In these cases, the video reproduction device may use all of the information contained in the special reproduction control information, or only part of it.

[0041] FIG. 2 shows a configuration example of a device that generates such special reproduction control information.

[0042] As shown in FIG. 2, this special reproduction control information generation device comprises a video data storage unit 2, a video data processing unit 1 that includes a video position information processing unit 11 and a display time control information processing unit 12, and a special reproduction control information storage unit 3. As described in detail later, in the configuration of FIG. 2 the original video data, which is encoded, is decoded into image data before being displayed, so decoding time elapses between the instruction to display and the moment the image actually appears. To shorten this time, the video data used for special reproduction can be decoded in advance and stored as image data files. When such image data files are used (either always, or as a selectable option), an image data file creation unit 13 is added to the video data processing unit 1 and an image data file storage unit 4 is connected to the video data processing unit 1, as shown in FIG. 3. When other control information derived from the video data is added to the special reproduction control information, the corresponding function is added to the video data processing unit 1 as appropriate.

[0043] When this processing involves user operation, a GUI is used that provides functions such as displaying the video data frame by frame and accepting instruction input from the user (omitted from FIGS. 2 and 3).

[0044] FIGS. 2 and 3 omit the CPU and memory, any external storage device or network communication device provided as needed, and software such as driver software and the OS used as needed.

[0045] The video data storage unit 2 stores the video data to be processed in order to generate the special reproduction control information, or the special reproduction control information and the image data files. The special reproduction control information storage unit 3 stores the generated special reproduction control information.

[0046] The image data file storage unit 4 stores the image data files created by the image data file creation unit 13.

[0047] The video data storage unit 2, the special reproduction control information storage unit 3, and the image data file storage unit 4 each consist of, for example, a hard disk, an optical disk, or a semiconductor memory. They may be separate storage devices, or all or some of them may be the same storage device.

[0048] The video data processing unit 1 generates the special reproduction control information (or the special reproduction control information and the image data files) from the video data to be processed.

[0049] The video position information processing unit 11 determines (extracts) the video frame or frame group that should be, or can be, displayed during special reproduction, and creates the information 101 to be described in each piece of frame information i.

[0050] The display time control information processing unit 12 creates the information 102 relating to the display time of the video frame or frame group associated with each piece of frame information.

[0051] The image data file creation unit 13 creates each image data file from the video data.

[0052] The special reproduction control information generation device can be realized, for example, by executing software on a computer. It may also be realized as a dedicated device for generating special reproduction control information.

[0053] FIG. 4 shows an example of the control information generation procedure for the configuration of FIG. 2. Video data is read from the storage unit 2 (step S11), video position information is created (step S12), display time control information is created (step S13), and the special reproduction control information consisting of the video position information and the display time control information is saved in the storage unit 3 (step S14). The procedure of FIG. 4 may be carried out sequentially for each piece of frame information, or each step may be carried out as a batch. Other procedures are also possible.

[0054] FIG. 5 shows an example of the control information generation procedure for the configuration of FIG. 3. A step of creating and saving the image data files is added to the procedure of FIG. 4 (step S22). Here the image data files are created and/or saved together with the creation of the video position information, but this can also be done at a timing different from that of FIG. 5. As with FIG. 4, the procedure of FIG. 5 may be carried out sequentially for each piece of frame information, or each step may be carried out as a batch. Other procedures are also possible.

[0055] Next, FIG. 6 shows a configuration example of a video reproduction device.

[0056] As shown in FIG. 6, this video reproduction device comprises a control unit 21, a normal reproduction processing unit 22, a special reproduction processing unit 23, a display unit 24, and a content storage unit 25. When handling content in which sound (audio), such as speech, is added to the video data, the device preferably has an audio output unit. When handling content in which text data is added to the video data, the text may be displayed on the display unit 24 or output from the audio output unit. When handling content to which a program is attached, an attached-program execution unit may be provided.

[0057] The content storage unit 25 stores at least the video data and the special reproduction control information. As described in detail later, when image data files are used, the image data files are also stored. Audio data, text data, and attached programs may also be stored.

[0058] The content storage unit 25 may be located in one place or distributed over several places; what matters is that the normal reproduction processing unit 22 and the special reproduction processing unit 23 can access it. The video data, special reproduction control information, image data files, audio data, text data, and attached programs may be stored on separate media or on the same medium. A DVD, for example, is used as the medium. The data may also be transmitted over a network.

[0059] The control unit 21 basically receives instructions from the user, such as normal reproduction or special reproduction of content, through a user interface such as a GUI, and performs control such as instructing the relevant processing unit to reproduce the specified content by the specified method.

[0060] The normal reproduction processing unit 22 performs normal reproduction of the specified content.

[0061] The special reproduction processing unit 23 refers to the special reproduction control information and performs special reproduction of the specified kind (for example, multiple-speed reproduction, skip reproduction, or trick reproduction) on the specified content.

[0062] The display unit 24 displays the video.

[0063] The video reproduction device can be realized, for example, by executing software on a computer; hardware such as a decoder board (MPEG-2 decoder) may of course be used for part of it. It may also be realized as a dedicated device for video reproduction.

[0064] FIG. 7 shows an example of the reproduction processing procedure of the video reproduction device of FIG. 6. Step S31 determines whether the user has requested normal reproduction or special reproduction. If normal reproduction was requested, the specified video data is read in step S32 and normal reproduction is performed in step S33. If special reproduction was requested, the special reproduction control information corresponding to the specified video data is read in step S34, the position of the video to be displayed is identified and the display time is determined in step S35, the corresponding frame or frame group is read from the video data (or from the image data files) in step S36, and special reproduction of the specified kind is performed in step S37. Identification of the position and/or determination of the display time can also be done at a timing different from that of FIG. 7. The special reproduction procedure of FIG. 7 may be carried out sequentially for each piece of frame information, or each step may be carried out as a batch. Other procedures are also possible. For example, with a reproduction method that gives every frame the same fixed display time, no display time is determined.

【００６５】通常再生と特殊再生のいずれにおいても、
ユーザが種々の指定（例えば、コンテンツにおける再生
開始点およびまたはコンテンツにおける再生終了点、倍
速再生における再生速度、倍速再生における再生時間、
その他の特殊再生の方法、等）を要求できるようにする
とより効果的である。In both normal reproduction and special reproduction, it is more effective to let the user make various designations (for example, a reproduction start point and/or a reproduction end point in the content, the reproduction speed for double-speed reproduction, the reproduction time for double-speed reproduction, and other special reproduction methods).

【００６６】次に、特殊再生制御情報のフレーム情報の
生成のアルゴリズムや、特殊再生時の表示時間の決定の
アルゴリズムなどについて、概略的に説明する。Next, an algorithm for generating the frame information of the special reproduction control information, an algorithm for determining the display time during the special reproduction, and the like will be schematically described.

【００６７】フレーム情報の生成時には、映像データの
うちから特殊再生で使用するフレームの決定、映像位置
情報の作成、表示時間制御情報の作成が行われる。At the time of generating frame information, a frame to be used in special reproduction is determined from video data, video position information is generated, and display time control information is generated.

【００６８】フレームの決定は、（１）当該映像データ
についての何らかの特徴量に基づいて行う方法（例え
ば、隣接フレーム間の特定の特徴量（例えば、フレーム
間の画面変化量）が各抽出フレーム間でその総和が一定
になるようにする方法、各抽出フレーム間での全フレー
ムの重要度の総和が一定になるようにする方法）、
（２）画一的な基準により行う方法（例えば、ランダム
に抽出する方法、等間隔に抽出する方法）、などがあ
る。Methods for determining the frames include (1) methods based on some feature amount of the video data (for example, a method in which the sum of a specific feature amount between adjacent frames, such as the screen change amount between frames, is made constant between successive extracted frames, or a method in which the sum of the importance of all frames between successive extracted frames is made constant), and (2) methods based on a uniform criterion (for example, random extraction or extraction at equal intervals).

【００６９】表示時間制御情報の作成には、（ｉ）表示
時間または表示フレーム数の絶対値または相対値を求め
る方法、（ii）表示時間または表示フレーム数の基準と
なる情報（例えば、ユーザ指定、映像中の文字、映像に
同期した音、映像中の人、あるいは映像中の特定パター
ン等に基づいて得られる重要度）を求める方法、（iii
）上記の（ｉ）と（ii）の両方を記述する方法、など
がある。Methods for creating the display time control information include (i) obtaining an absolute or relative value of the display time or of the number of display frames, (ii) obtaining information that serves as a basis for the display time or the number of display frames (for example, an importance obtained from a user designation, characters in the video, sound synchronized with the video, a person in the video, or a specific pattern in the video), and (iii) describing both (i) and (ii).

【００７０】（１）または（２）と、（ｉ）または（i
i）または（iii ）とは、適宜組み合わせることが可能
である。もちろん、それ以外の方法も可能である。それ
らのうちの特定の１つの組み合わせのみ可能としてもよ
いし、それらのうちの複数の組み合わせを可能とし、適
宜選択できるようにしてもよい。Either of (1) and (2) can be combined as appropriate with any of (i), (ii), and (iii). Other methods are of course also possible. Only one specific combination may be supported, or several combinations may be supported so that one can be selected as appropriate.

【００７１】特殊な場合として、（１）の方法でのフレ
ームの決定と同時に（ｉ）の表示時間または表示フレー
ム数の相対値が求まる方法がある。常にこの方法を用い
る場合には、表示時間制御情報処理部１０２を省くこと
も可能である。As a special case, there is a method in which the relative value of the display time or the number of display frames is obtained at the same time as the frame determination by the method (1). If this method is always used, the display time control information processing unit 102 can be omitted.

【００７２】特殊再生時には、フレーム情報に含まれる
（i）または（ii）または（iii）の表示時間制御情報を
参照して行うことを想定しているが、記述されている値
に従うようにしてもよいし、記述されている値を修正し
て使うようにしてもよいし、記述されている値またはこ
れを修正した値に加えて独自に用意した他の情報やユー
ザから入力された情報をも使うようにしてもよいし、独
自に用意した他の情報やユーザから入力された情報のみ
をも使うようにしてもよい。それらのうちの複数の方法
を可能とし、適宜選択できるようにしてもよい。Special reproduction is assumed to be performed with reference to the display time control information of (i), (ii), or (iii) contained in the frame information. The reproducing side may follow the described values as they are, may modify the described values before using them, may use independently prepared information or information input by the user in addition to the described or modified values, or may use only independently prepared information or information input by the user. Several of these methods may be supported so that one can be selected as appropriate.

【００７３】次に、特殊再生の概略について説明する。Next, an outline of special reproduction will be described.

【００７４】倍速再生（あるいは高速再生）は、映像デ
ータ・コンテンツを構成する全フレームのうちの一部の
フレームを再生することによって、もとのコンテンツを
通常再生するのに要する時間より短い時間で再生を行う
ものである。例えば、フレーム情報で示されるフレーム
を、フレーム情報で示される表示時間ずつ、その時系列
順に表示する。ユーザから、もとのコンテンツを通常再
生する速度の何倍で再生するか（もとのコンテンツを通
常再生するのに要する時間の何分の一の時間で再生する
か）を指定する倍速指定や、どのくらいの時間をかけて
再生するかを指定する時間指定などの要求を受け付け、
該要求を満たすように各フレーム（群）の表示時間を求
めて再生するようにしてもよい。そのため、この倍速再
生は要約再生とも称する。Double-speed reproduction (or high-speed reproduction) reproduces only some of the frames that make up the video content, so that reproduction finishes in a shorter time than normal reproduction of the original content would take. For example, the frames indicated by the frame information are displayed in time-series order, each for the display time indicated by its frame information. The apparatus may also accept a request from the user, such as a speed designation that specifies at how many times the normal reproduction speed the content is to be reproduced (that is, in what fraction of the normal reproduction time), or a time designation that specifies how long the reproduction should take, and then determine the display time of each frame (or group of frames) so that the request is satisfied. For this reason, double-speed reproduction is also referred to as summary reproduction.
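
One simple way to meet such a speed or time designation is to rescale the described display times by a common factor. The following is a minimal sketch under that assumption; the function name, its parameters, and the uniform scaling rule are illustrative and are not taken from the specification.

```python
def scale_display_times(display_times, original_duration,
                        speed=None, target_duration=None):
    """Rescale per-frame display times to satisfy a user request.

    display_times     -- described display time of each frame (seconds or
                         relative weights; only the ratios matter)
    original_duration -- time needed for normal reproduction, in seconds
    speed             -- requested speed multiple (4 means 1/4 of the time)
    target_duration   -- requested total reproduction time, in seconds
    Exactly one of speed / target_duration must be given.
    """
    if (speed is None) == (target_duration is None):
        raise ValueError("give exactly one of speed or target_duration")
    if speed is not None:
        if speed <= 0:
            raise ValueError("speed must be positive")
        target_duration = original_duration / speed
    total = sum(display_times)
    if total <= 0:
        raise ValueError("display times must sum to a positive value")
    factor = target_duration / total
    # Every frame keeps its share of the total; only the scale changes.
    return [t * factor for t in display_times]
```

For instance, three frames with relative display times 1, 1, and 2, taken from a 120-second original and reproduced at 4x speed, share a 30-second budget in the same 1:1:2 ratio.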

【００７５】飛び越し再生（あるいは飛び越し連続再
生）は、倍速再生において、例えば後述する再生／非再
生情報に基づいて、フレーム情報で示されるフレームの
一部を非再生とする。フレーム情報で示されるフレーム
のうち非再生とされたフレームを除いたフレームについ
て倍速再生するものである。In jump reproduction (or continuous jump reproduction), some of the frames indicated by the frame information are marked as not to be reproduced during double-speed reproduction, for example on the basis of the reproduction/non-reproduction information described later. Double-speed reproduction is then performed on the frames indicated by the frame information, excluding those marked as not to be reproduced.

【００７６】トリック再生は、通常再生以外の再生か
ら、上記の倍速再生および飛び越し再生を除いたもので
ある。例えば、フレーム情報で示されるフレームを再生
する際に、ある部分について時系列順を入れ替えて再生
する入れ替え再生、フレーム情報で示されるフレームを
再生する際に、ある部分については複数回繰り返し再生
する重複再生、フレーム情報で示されるフレームを再生
する際に、ある部分については、他の部分より低速に再
生し（通常再生時の速度にする場合と、通常再生時より
低速にする場合とを含む）、あるいは他の部分より高速
に再生し、あるいは一定時間表示して静止させ、あるい
はそれらを適宜組み合わせる変速再生、フレーム情報で
示されるフレームの一定の纏まりごとに時系列をランダ
ムにして再生するランダム再生など、様々な形態のもの
が考えられる。Trick reproduction covers every kind of reproduction other than normal reproduction, excluding the double-speed reproduction and jump reproduction described above. Various forms are conceivable when reproducing the frames indicated by the frame information: reordered reproduction, in which the time-series order of a certain portion is rearranged; repeated reproduction, in which a certain portion is reproduced several times; variable-speed reproduction, in which a certain portion is reproduced more slowly than other portions (either at the normal reproduction speed or slower than it), reproduced faster than other portions, displayed as a still for a fixed time, or handled by a suitable combination of these; and random reproduction, in which the time-series order is randomized for each group of frames indicated by the frame information.

【００７７】もちろん、複数種類の方法を適宜組み合わ
せたものも可能である。例えば、倍速再生時に、重要な
部分については、複数回再生するとともに、再生速度を
通常再生速度とする方法など、多彩なバリエーションが
考えられる。Of course, a combination of a plurality of types of methods may be used. For example, at the time of double-speed reproduction, various variations can be considered, such as a method of reproducing an important portion a plurality of times and setting a reproduction speed to a normal reproduction speed.

【００７８】以下、本実施形態についてより具体的に詳
しく説明する。Hereinafter, the present embodiment will be described in more detail.

【００７９】まず、フレームの決定のための画像データ
の特性値として隣接フレーム間の画面変化量を用いる場
合を例にとって説明する。First, a case where the amount of screen change between adjacent frames is used as a characteristic value of image data for determining a frame will be described.

【００８０】ここでは、１つのフレーム情報に、１つの
フレームを対応させる場合について説明する。Here, a case will be described in which one frame corresponds to one frame information.

【００８１】図８に、対象となる映像データをもとにし
て作成される、特殊再生制御情報のデータ構造の一例を
示す。FIG. 8 shows an example of the data structure of the special reproduction control information created based on the target video data.

【００８２】このデータ構造は、図１における表示時間
制御情報１０２として（または表示時間制御情報１０２
の代わりに）、絶対的なまたは相対的な表示時間を示す
情報である表示時間情報１２１を記述するようにしたも
のである。表示時間制御情報１０２に重要度を記述する
構成などについては後で説明する。In this data structure, display time information 121, which indicates an absolute or relative display time, is described as (or in place of) the display time control information 102 of FIG. 1. Configurations in which an importance is described in the display time control information 102 will be explained later.

【００８３】映像位置情報１０１は、当該映像の元映像
フレームにおける位置を特定可能とする情報であり、フ
レーム番号（例えば先頭フレームからのシーケンス番
号）やタイムスタンプなどのようにストリーム内の１フ
レームを特定できるものならどのようなものを用いても
構わない。元映像ストリームから抜き出したフレームに
対応する画像データを別ファイルとする場合は、そのフ
ァイル位置を特定する情報としてＵＲＬなどを用いても
よい。The video position information 101 makes it possible to identify the position of the video within the original video. Anything that can identify a single frame in the stream may be used, such as a frame number (for example, a sequence number counted from the first frame) or a time stamp. When the image data corresponding to a frame extracted from the original video stream is stored in a separate file, a URL or the like may be used as the information identifying the location of that file.

【００８４】表示時間情報１２１は、当該映像を表示す
る時間あるいはフレーム数を特定可能とする情報であ
り、実際に時間あるいはフレーム数を単位として記述す
る方法と、他のフレーム情報に記述されている表示時間
情報との相対的な時間の長さの関係がわかるような相対
値（例えば正規化された数値）を記述する方法とがあ
る。後者の場合は、全体の総再生時間から、各映像の実
際の再生時間を算出することになる。各映像について、
表示の継続時間を記述するのではなく、特定のタイミン
グを起点とした（例えば最初の映像の開始時間を０とし
た）開始時間と終了時間の組み合わせでの記述や、開始
時間と継続時間の組み合わせでの記述を用いてもよい。The display time information 121 makes it possible to identify the time or the number of frames for which the video is displayed. It can be described either directly in units of time or frames, or as a relative value (for example, a normalized value) that shows how long the display time is in relation to the display times described in the other pieces of frame information. In the latter case, the actual reproduction time of each video is calculated from the total reproduction time. Instead of the duration of each video, the description may give a start time and an end time measured from a specific origin (for example, with the start of the first video taken as 0), or a start time and a duration.
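
The relationship between relative display times, actual durations, and the start/end form can be sketched as follows. This is an illustration only: the function name is hypothetical, and times are assumed to be in seconds with the first video starting at 0.

```python
from itertools import accumulate


def to_intervals(relative_times, total_playback_time):
    """Convert relative display times into (start, end) pairs in seconds.

    relative_times      -- one relative (e.g. normalized) value per frame info
    total_playback_time -- total reproduction time to be distributed
    The origin is the start of the first video, taken as 0.
    """
    total = sum(relative_times)
    # Each video receives a share of the total proportional to its value.
    durations = [total_playback_time * r / total for r in relative_times]
    ends = list(accumulate(durations))
    starts = [0.0] + ends[:-1]
    return list(zip(starts, ends))
```

The same result could equally be stored as start time plus duration; the three forms carry the same information once the origin is fixed.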

【００８５】特殊再生では、映像位置情報１０１により
特定される位置に存在する映像を、表示時間情報１２１
により特定される表示時間だけ再生することを、配列に
含まれるフレーム情報の数だけ逐次行うことを基本とす
る。Special reproduction basically consists of reproducing the video at the position specified by the video position information 101 for the display time specified by the display time information 121, and doing so in turn for each piece of frame information in the array.
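
The array of frame information and the basic reproduction loop can be pictured with the following sketch. The class and function names are hypothetical, the display callback stands in for whatever the reproducing apparatus actually does, and display times are assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Union


@dataclass
class FrameInfo:
    # Corresponds to video position information 101: a frame number,
    # a time stamp, or a URL of an image data file.
    video_position: Union[int, float, str]
    # Corresponds to display time information 121 (assumed to be seconds).
    display_time: float


def special_playback(frame_infos: Iterable[FrameInfo],
                     show: Callable[[Union[int, float, str], float], None]) -> None:
    """Show each referenced video for its display time, in array order."""
    for info in frame_infos:
        show(info.video_position, info.display_time)
```

A caller would pass a `show` function that fetches the frame or image file and keeps it on screen for the given time.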

【００８６】開始時間と終了時間又は継続時間が指定さ
れており、かつ、この指定に従う場合には、映像位置情
報１０１により特定される位置に存在する映像を、表示
時間情報１２１により特定される開始時間から終了時間
まで再生することを、配列に含まれるフレーム情報の数
だけ逐次行うことを基本とする。When a start time and an end time (or a duration) are specified and the specification is followed, special reproduction basically consists of reproducing the video at the position specified by the video position information 101 from the start time to the end time specified by the display time information 121, and doing so in turn for each piece of frame information in the array.

【００８７】再生倍率などのパラメータや、別の付加情
報を用いることにより、記述された表示時間を加工して
再生することも可能である。By using parameters such as the reproduction magnification and other additional information, the described display time can be processed and reproduced.

【００８８】次に、図９〜図１１を用いて、映像の位置
情報の記述方法を説明する。Next, methods of describing the video position information will be explained with reference to FIGS. 9 to 11.

【００８９】図９は、元映像フレームを参照する映像位
置情報の記述方法を説明する図である。FIG. 9 is a diagram for explaining a description method of video position information referring to an original video frame.

【００９０】図９において、時間軸２００は、特殊再生
のためのフレーム情報を作成する対象となる元映像スト
リームに対応し、画像２０１は映像ストリーム中の記述
対象となる１フレームに対応する。時間軸２０２は、元
映像ストリームから抜き出した画像２０１を使って特殊
再生を行うときの映像の再生時間に対応し、表示時間２
０３はその中に含まれる１つの画像２０１に対応する区
間である。この場合には、例えば、画像２０１の位置を
示す映像位置情報１０１と、表示時間２０３の長さを示
す映像表示時間１２１との組がフレーム情報として記述
される。前述のように、画像２０１の位置の記述はフレ
ーム番号やタイムスタンプなど、元映像ストリーム内の
１フレームを特定できるものならなんでもよい。このフ
レーム情報が他の画像についても同様に記述される。In FIG. 9, a time axis 200 corresponds to the original video stream for which the frame information for special reproduction is created, and an image 201 corresponds to one frame in that stream that is to be described. A time axis 202 corresponds to the reproduction time of the video when special reproduction is performed using the images 201 extracted from the original video stream, and a display time 203 is the section on that axis that corresponds to one image 201. In this case, for example, the pair consisting of the video position information 101 indicating the position of the image 201 and the video display time 121 indicating the length of the display time 203 is described as frame information. As noted above, the position of the image 201 may be described by anything that can identify a single frame in the original video stream, such as a frame number or a time stamp. Frame information is described in the same way for the other images.

【００９１】図１０は、画像データファイルを参照する
映像位置情報の記述方法を説明する図である。FIG. 10 is a diagram for explaining a description method of video position information referring to an image data file.

【００９２】図９によって示される映像位置情報の記述
方法は、特殊再生を行おうとする元映像データ内のフレ
ームを直接参照するものであったが、図１０によって示
される映像位置情報の記述方法は、元映像ストリームか
ら抜き出した単一フレーム３０２に対応する画像データ
３００を別のファイルに用意し、その位置を記述するも
のである。ファイル位置の記述方法は、例えば、ＵＲＬ
などを用いることにより、ローカルな記憶装置上に存在
する場合でも、ネットワーク上に存在する場合でも同様
に扱うことが可能である。この画像データファイルの位
置を示す映像位置情報１０１と、対応する表示時間３０
１の長さを示す映像表示時間１２１との組をフレーム情
報として記述する。The description method shown in FIG. 9 refers directly to frames in the original video data that is to be specially reproduced. In the description method shown in FIG. 10, by contrast, image data 300 corresponding to a single frame 302 extracted from the original video stream is prepared in a separate file, and the location of that file is described. If the file location is described with a URL or the like, for example, the file can be handled in the same way whether it resides on a local storage device or on a network. The pair consisting of the video position information 101 indicating the location of the image data file and the video display time 121 indicating the length of the corresponding display time 301 is described as frame information.

【００９３】元映像フレームとの対応が必要な場合は、
記述したフレーム情報に対応する元映像の単一フレーム
３０２を示す情報（例えば図９の場合における映像位置
情報と同様のもの）をフレーム情報に含めればよい。こ
の場合、フレーム情報は、映像位置情報、表示時間情
報、元映像情報より構成されることとなる。もちろん、
元映像情報は、必要がなければ記述する必要はない。If correspondence with the original video frame is required, information indicating the single frame 302 of the original video that corresponds to the described frame information (for example, information of the same kind as the video position information in the case of FIG. 9) may be included in the frame information. In that case, the frame information consists of video position information, display time information, and original video information. The original video information need not be described if it is not needed.

【００９４】図１０の方法によって記述される画像デー
タの形態は、特に制約はないが、例えば、元映像のフレ
ームをそのまま用いたり、縮小して用いたりするように
してもよい。これは、元映像を展開する必要がないの
で、高速に再生処理を行うためにも有効である。There are no particular restrictions on the form of the image data described by the method of FIG. 10. For example, a frame of the original video may be used as it is or in reduced form. Because the original video then does not have to be decoded, this is also effective for fast reproduction processing.

【００９５】縮小画像の作成は、元映像ストリームがＭ
ＰＥＧ−１やＭＰＥＧ−２などによって圧縮されている
場合には、そのストリームを部分的に復号するだけで、
高速に作成することができる。この手法は、フレーム内
符号化されているＩピクチャフレーム（フレーム内符号
化フレーム）のＤＣＴ（離散コサイン変換）係数のみを
復号し、その直流成分を用いることによって、縮小画像
を作成する。When the original video stream is compressed with MPEG-1, MPEG-2, or the like, a reduced image can be created quickly by decoding the stream only partially. In this technique, only the DCT (discrete cosine transform) coefficients of intra-coded I-picture frames are decoded, and the reduced image is created from their DC components.
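
The reason the DC components suffice is that, with the 8x8 DCT scaling used in JPEG and MPEG, the DC term of a block equals 8 times the mean of its 64 samples. The sketch below checks this numerically and builds a reduced image with one pixel per block. It is an illustration under simplifying assumptions: the function names are hypothetical, the DC terms are assumed to be already dequantized, and bitstream parsing, level shifting, and chroma handling are ignored.

```python
import math


def dct_coefficient(block, u, v):
    """Coefficient F(u, v) of the 8x8 2-D DCT with JPEG/MPEG scaling.

    block[x][y] holds the 64 sample values of one block.
    """
    cu = 1 / math.sqrt(2) if u == 0 else 1.0
    cv = 1 / math.sqrt(2) if v == 0 else 1.0
    total = 0.0
    for x in range(8):
        for y in range(8):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / 16)
                      * math.cos((2 * y + 1) * v * math.pi / 16))
    return 0.25 * cu * cv * total


def thumbnail_from_dc(dc_terms):
    """Build a 1/8-scale image: one pixel per block, equal to the block mean.

    dc_terms is a 2-D list holding the dequantized DC term of each block.
    """
    return [[dc / 8.0 for dc in row] for row in dc_terms]
```

Because no inverse DCT and no motion compensation are needed, the cost per I-picture is far lower than that of full decoding.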

【００９６】図１０の記述方法では、画像データをそれ
ぞれ別のファイルに格納していたが、これらのファイル
はランダムアクセス可能な映像フォーマット（例えば、
ＭｏｔｉｏｎＪＰＥＧ）を持つ画像データ群格納ファイ
ルにまとめて格納してもよい。この場合、画像データの
位置は、画像データ群格納ファイルの位置を示すＵＲＬ
および画像データ群格納ファイル内での位置を示すフレ
ーム番号またはタイムスタンプの組み合わせによって記
述される。画像データ群格納ファイルの位置を示すＵＲ
Ｌ情報は、個々のフレーム情報内に記述してもよいし、
フレーム情報の配列外に付加情報として記述してもよ
い。In the description method of FIG. 10, each piece of image data is stored in its own file, but the image data may instead be stored together in an image data group storage file that has a randomly accessible video format (for example, Motion JPEG). In that case, the position of a piece of image data is described by the combination of a URL indicating the location of the image data group storage file and a frame number or time stamp indicating the position within that file. The URL of the image data group storage file may be described in each piece of frame information, or may be described once as additional information outside the array of frame information.

【００９７】元映像のどのフレームを選択して画像デー
タを作成して映像位置情報に記述するかについては、様
々な方法をとることができる。例えば、元映像から等間
隔に画像データを抽出してもよいし、画面の動きの多い
ところは狭い間隔で多くの画像データを抽出し、動きの
少ないところは広い間隔で少ない画像データを抽出して
もよい。Various methods can be used to decide which frames of the original video are selected to create the image data described in the video position information. For example, image data may be extracted from the original video at equal intervals, or more image data may be extracted at narrow intervals where there is a lot of motion on the screen and less image data at wide intervals where there is little motion.

【００９８】図１１を参照しながら、フレーム選択方法
の一例として、画面の動きに応じて、画面の動きの多い
ところは狭い間隔で多くの画像データを抽出し、動きの
少ないところは広い間隔で少ない画像データを抽出する
方法について説明する。With reference to FIG. 11, one example of a frame selection method will now be explained: a method that follows the motion on the screen, extracting more image data at narrow intervals where there is a lot of motion and less image data at wide intervals where there is little motion.

【００９９】図１１において、横軸はフレーム番号を表
し、曲線８００は（隣接フレーム間の）画面変化量の変
化を表している。各フレームの画面変化量の算出方法
は、後述する表示時間情報を求める際の手法と同様であ
る。ここでは、画面の動きに応じて抽出間隔を決定する
ために、画像データ抽出元の映像フレーム間の画像変化
量が一定となるような間隔を求める方法を示す。画像デ
ータ抽出元の映像フレーム間の画面変化量の合計を
Ｓ_ｉ、全フレームの画面変化量の総和をＳ（＝ΣＳ_ｉ）
とし、抽出する画像データ数をｎとする。画像データ抽
出元フレーム間の画像変化量を一定にするには、Ｓ_ｉ＝
Ｓ／ｎとなればよい。図１１では、画面変化量の曲線８
００が破線によって区切られた区間の面積Ｓ_ｉが一定に
なることに対応する。そこで、例えば、先頭フレームよ
り、順次画面変化量を加算し、その値がＳ／ｎを超えた
フレームを画像データ抽出元の映像フレームＦ_ｉとす
る。In FIG. 11, the horizontal axis represents the frame number, and a curve 800 represents how the screen change amount (between adjacent frames) varies. The screen change amount of each frame is calculated in the same way as in the method for obtaining the display time information described later. To determine the extraction interval according to the motion on the screen, the method shown here finds intervals such that the screen change amount between successive extraction source frames is constant. Let S_i be the sum of the screen change amounts between successive extraction source video frames, let S (= ΣS_i) be the sum of the screen change amounts over all frames, and let n be the number of pieces of image data to be extracted. The change between extraction source frames is constant when S_i = S/n. In FIG. 11, this corresponds to each area S_i, bounded by the curve 800 and the broken lines, being the same. Accordingly, for example, the screen change amounts are added up in order from the first frame, and each frame at which the accumulated value exceeds S/n is taken as an extraction source video frame F_i.
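
The accumulation procedure can be sketched as follows. The function name is hypothetical, and two choices are assumptions rather than statements of the specification: the accumulator is reset to 0 after each extraction, and "exceeds" is treated as "reaches or exceeds" so that evenly changing video yields exactly n frames.

```python
def select_frames(changes, n):
    """Pick extraction source frames so that each span carries about S/n change.

    changes -- changes[k] is the screen change amount attributed to frame k
    n       -- desired number of pieces of image data
    Returns the indices of the selected frames.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    total = sum(changes)           # S
    if total <= 0:
        return []                  # no change anywhere: nothing is selected
    threshold = total / n          # S / n
    selected = []
    acc = 0.0
    for k, change in enumerate(changes):
        acc += change
        if acc >= threshold:
            selected.append(k)
            acc = 0.0              # start accumulating the next span
    return selected
```

Because any overshoot is discarded at each reset, this sketch can return fewer than n frames when the change is concentrated in a few frames; comparing a running total against multiples of S/n would be one alternative.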

【０１００】ＭＰＥＧのＩピクチャフレームより画像デ
ータを生成する場合には、算出された画像データ作成元
フレームがＩピクチャであるとは限らないので、近傍の
Ｉピクチャフレームより、画像データを作成する。When image data is generated from an MPEG I-picture frame, the calculated image data generation source frame is not necessarily an I-picture. Therefore, image data is generated from a nearby I-picture frame.

【０１０１】ところで、図１１で説明した方法において
は、画面変化量＝０の区間に属する映像は、スキップさ
れることになる。しかし、例えば静止画像が継続する場
合には、重要な場面であることも多い。そこで、画面変
化量＝０が一定時間以上経過した場合には、そのときの
フレームを抜き出すようにしてもよい。この場合におい
ては、例えば、先頭フレームより、順次画面変化量を加
算し、その値がＳ／ｎを超えたフレーム、または画面変
化量＝０が一定時間以上経過したフレームを画像データ
抽出元の映像フレームＦ_ｉとするようにしてもよい。画
面変化量＝０が一定時間以上経過してフレームを抽出し
たときに、画面変化量の加算値を０にクリアする方法
と、クリアせずに保持する方法とがある。この方法を使
うか否かを選択可能にしてもよい。In the method described with reference to FIG. 11, video belonging to a section in which the screen change amount is 0 is skipped. However, a scene in which a still image continues, for example, is often an important one. Therefore, when the screen change amount has remained 0 for a certain time or longer, the frame at that point may be extracted. In that case, for example, the screen change amounts are added up in order from the first frame, and an extraction source video frame F_i is taken to be either a frame at which the accumulated value exceeds S/n or a frame at which the screen change amount has remained 0 for a certain time or longer. When a frame is extracted because the screen change amount has remained 0 for a certain time or longer, the accumulated screen change amount may either be cleared to 0 or be kept as it is. Whether to use this method may be made selectable.
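
A sketch of this variant follows. The names are hypothetical, the "certain time" is approximated as a count of consecutive frames with zero change, and the threshold test uses "reaches or exceeds"; all of these are assumptions made for illustration. The `clear_on_still` flag switches between the two accumulator policies mentioned above.

```python
def select_frames_with_stills(changes, n, max_still, clear_on_still=True):
    """Select frames by accumulated change, plus frames in long still sections.

    changes        -- changes[k] is the screen change amount of frame k
    n              -- desired number of pieces of image data
    max_still      -- number of consecutive zero-change frames that
                      triggers an extraction
    clear_on_still -- True: clear the accumulated change when a still frame
                      is extracted; False: keep it
    """
    if n <= 0 or max_still <= 0:
        raise ValueError("n and max_still must be positive")
    total = sum(changes)
    threshold = total / n
    selected = []
    acc = 0.0
    still_run = 0
    for k, change in enumerate(changes):
        acc += change
        still_run = still_run + 1 if change == 0 else 0
        if total > 0 and acc >= threshold:
            selected.append(k)
            acc = 0.0
            still_run = 0
        elif still_run >= max_still:
            selected.append(k)      # still scene has lasted long enough
            still_run = 0
            if clear_on_still:
                acc = 0.0
    return selected
```

With `changes = [2, 0, 0, 0, 2, 2]`, `n = 2`, and `max_still = 3`, clearing the accumulator gives frames 3 and 5, while keeping it gives frames 3 and 4, because the change accumulated before the still section still counts toward the next threshold.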

【０１０２】図１１の例の場合、いずれのフレームにつ
いても表示時間は同じとなるように表示時間情報１２１
を記述することを想定しているが（この表示時間情報１
２１に従って一定時間ずつ再生したときに、画面の変化
量が一定となる）、表示時間情報１２１は一定ではな
く、別の方法で求めて記述するようにしても構わない。In the example of FIG. 11, the display time information 121 is assumed to be described so that every frame has the same display time (when each frame is reproduced for that fixed time according to the display time information 121, the amount of screen change becomes constant). However, the display time information 121 need not be constant and may be obtained by another method and described.

【０１０３】次に、１つのフレーム情報に、１または複
数のフレームを対応させる場合について説明する。Next, a case where one or a plurality of frames correspond to one frame information will be described.

【０１０４】この場合の特殊再生制御情報のデータ構造
の一例は図８と同様である。An example of the data structure of the special reproduction control information in this case is the same as in FIG.

【０１０５】以下、図１２〜図２０図２１を用いて、映
像の位置情報の記述方法を説明する。Methods of describing the video position information will be explained below with reference to FIGS. 12 to 21.

【０１０６】図１２は、元映像の連続するフレームを参
照する映像位置情報の記述方法を説明する図である。FIG. 12 is a diagram for explaining a method of describing video position information referring to successive frames of an original video.

【０１０７】図９によって示される映像位置情報の記述
方法は、特殊再生を行おうとする元映像内の１フレーム
を参照するものであったが、図１２によって示される映
像位置情報の記述方法は、元映像内の連続する複数のフ
レームの集合５００を記述するものである。フレームの
集合５００は元映像内の連続する複数のフレームのうち
の一部分を抜き出したものであってもよい。また、フレ
ームの集合５００のうちに１つのフレームのみ含むもの
があってもよい。The description method shown in FIG. 9 refers to a single frame in the original video that is to be specially reproduced, whereas the description method shown in FIG. 12 describes a set 500 of consecutive frames in the original video. The frame set 500 may also consist of only some of the frames taken from a run of consecutive frames in the original video, and some frame sets 500 may contain only one frame.

【０１０８】フレームの集合５００が、元映像内の連続
する複数のフレームまたは１つのフレームを含むもので
ある場合には、フレーム位置の記述は、開始フレームお
よび終了フレームの位置を記述するか、開始フレームの
位置と記述区間の継続時間を記述する。１つのフレーム
を含むものである場合には、例えば、開始フレームと終
了フレームの位置を同じにすればよい。位置や時間の記
述は、フレーム番号やタイムスタンプなど、ストリーム
内のフレームを特定できるものを用いる。When the frame set 500 contains a run of consecutive frames, or a single frame, of the original video, the frame position is described either by the positions of the start frame and the end frame or by the position of the start frame and the duration of the described section. When the set contains a single frame, the start frame and the end frame can, for example, be given the same position. Positions and times are described with something that can identify frames in the stream, such as frame numbers or time stamps.

【０１０９】フレームの集合５００が、元映像内の連続
する複数のフレームのうちの一部分である場合には、そ
のフレームが特定可能になるような情報を記述する。フ
レームの抜き出し方法が決まっており、例えば開始フレ
ームおよび終了フレームの位置を記述すればフレームが
特定可能となる場合には、それらを記述すればよい。When the frame set 500 consists of only some of the frames in a run of consecutive frames of the original video, information that makes those frames identifiable is described. If the way the frames are taken out is fixed, so that the frames can be identified from, for example, the positions of the start frame and the end frame alone, it suffices to describe those positions.

【０１１０】図１２の表示時間情報５０１は、対応する
元映像フレーム集合５００に含まれるフレーム群全体に
対応する総表示時間を示すものである。元映像フレーム
集合５００に含まれる各フレームの表示時間について
は、特殊再生する装置側で適宜決定可能とすることがで
きる。簡単な方法としては、上記の総表示時間を全フレ
ーム数で均等割りして、１つのフレームの表示時間とす
る方法がある。もちろん、その他にも、種々の方法があ
る。The display time information 501 in FIG. 12 indicates the total display time corresponding to the entire frame group included in the corresponding original video frame set 500. The display time of each frame included in the original video frame set 500 can be appropriately determined by the device that performs special reproduction. As a simple method, there is a method of equally dividing the total display time by the total number of frames to obtain a display time of one frame. Of course, there are various other methods.
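
The equal-division rule amounts to a single division. The sketch below assumes that the frame set is described by inclusive start and end frame numbers; the function name is hypothetical.

```python
def per_frame_display_time(total_display_time, start_frame, end_frame):
    """Divide the total display time of a frame set equally among its frames.

    start_frame and end_frame are inclusive frame numbers, so a set with
    start_frame == end_frame contains exactly one frame.
    """
    count = end_frame - start_frame + 1
    if count <= 0:
        raise ValueError("end_frame must not precede start_frame")
    return total_display_time / count
```

A set covering frames 100 to 103 with a total display time of 2.0 seconds gives 0.5 seconds per frame.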

【０１１１】図１３は、画像データファイルを参照する
映像位置情報の記述方法を説明する図である。FIG. 13 is a view for explaining a method of describing video position information referring to an image data file.

【０１１２】図１２によって示される映像位置情報の記
述方法は、再生しようとする元映像内の連続するフレー
ムを直接参照するものであったが、図１３によって示さ
れる映像位置情報の記述方法は、元映像ストリームから
抜き出したフレーム集合６０２に対応する画像データの
フレーム集合６００を別のファイルに用意し、その位置
を記述するものである。ファイル位置の記述方法は、例
えば、ＵＲＬなどを用いることにより、ローカルな記憶
装置上に存在する場合でも、ネットワーク上に存在する
場合でも同様に扱うことが可能である。この画像データ
ファイルの位置を示す映像位置情報１０１と、対応する
表示時間６０１の長さを示す映像表示時間１２１との組
をフレーム情報として記述する。The description method shown in FIG. 12 refers directly to consecutive frames in the original video to be reproduced. In the description method shown in FIG. 13, by contrast, a frame set 600 of image data corresponding to a frame set 602 extracted from the original video stream is prepared in a separate file, and the location of that file is described. If the file location is described with a URL or the like, for example, the file can be handled in the same way whether it resides on a local storage device or on a network. The pair consisting of the video position information 101 indicating the location of the image data file and the video display time 121 indicating the length of the corresponding display time 601 is described as frame information.

【０１１３】元映像フレームとの対応が必要な場合は、
記述したフレーム情報に対応する元映像のフレーム集合
６０２を示す情報（例えば図１２の場合における映像位
置情報と同様のもの）をフレーム情報に含めればよい。
この場合、フレーム情報は、映像位置情報、表示時間情
報、元映像情報より構成されることとなる。もちろん、
元映像情報は、必要がなければ記述する必要はない。If correspondence with the original video frames is required, information indicating the frame set 602 of the original video that corresponds to the described frame information (for example, information of the same kind as the video position information in the case of FIG. 12) may be included in the frame information. In that case, the frame information consists of video position information, display time information, and original video information. The original video information need not be described if it is not needed.

【０１１４】画像データの形態や、画像データの作成、
縮小画像の作成、画像データの格納方法、ＵＲＬなどの
位置情報の記述方法等については、前述と同様である。The form of the image data, the creation of the image data, the creation of reduced images, the method of storing the image data, the method of describing location information such as a URL, and so on are the same as described above.

【０１１５】元映像のどのフレームを選択して画像デー
タを作成して映像位置情報に記述するかについても、前
述と同様、様々な方法をとることができ、例えば、元映
像から等間隔に画像データを抽出してもよいし、画面の
動きの多いところは狭い間隔で多くの画像データを抽出
し、動きの少ないところは広い間隔で少ない画像データ
を抽出してもよい。Various methods can be used to select which frame of the original video to create the image data and describe it in the video position information, as described above. Data may be extracted, or a large amount of image data may be extracted at a small interval at a place where the screen moves a lot, and a small amount of image data may be extracted at a large interval at a place where there is little movement.

【０１１６】上記した実施形態では画像データファイル
３００と元映像３０２の対応付けをフレーム単位で行っ
ているが、元映像情報として記述するフレームの位置情
報に時間的な幅を持たせることも可能である。それため
のフレーム情報のデータ構造は例えば、図１４のように
なる。図１４では図８のフレーム情報に元映像情報３７
０１が追加されている。元映像情報３７０１には特殊再
生対象である元映像の対応区間の始点位置と区間長が、
それぞれ始点情報３７０２、区間長情報３７０３として
記述される。In the embodiment described above, the image data file 300 and the original video 302 are associated frame by frame, but the frame position information described as the original video information may also be given a temporal width. The data structure of the frame information for that purpose is, for example, as shown in FIG. 14, in which original video information 3701 is added to the frame information of FIG. 8. In the original video information 3701, the start position and the length of the corresponding section of the original video that is the object of special reproduction are described as start point information 3702 and section length information 3703, respectively.
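
The extended frame information can be pictured as the following data structure. The class and field names are hypothetical, positions are assumed to be frame numbers, and the section length is assumed to be counted in frames.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class OriginalVideoInfo:
    start_frame: int   # corresponds to start point information 3702
    length: int        # corresponds to section length information 3703

    @property
    def end_frame(self) -> int:
        """Last frame of the section (inclusive), derived from start and length."""
        return self.start_frame + self.length - 1


@dataclass
class FrameInfoWithSource:
    video_position: Union[int, str]   # frame number or URL of image data
    display_time: float               # display time information
    # Optional, since the original video information may be omitted.
    original_video: Optional[OriginalVideoInfo] = None
```

Deriving the end frame from the start and length shows that a start/length pair and a start/end pair identify the same section.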

【０１１７】元映像情報として記述する情報は区間を特
定できるものならなんでもよい。ここでは始点位置と区
間長を用いたが、それらに代えて、始点位置と終点位置
を元映像情報として用いてもよい。図１５は図９に対し
て、元映像情報に時間的な幅を持たせた例である。この
場合、例えば、同一のフレーム情報に含まれる映像位置
情報、表示時間情報、元映像情報として、それぞれ、元
映像フレーム３８０１の位置、表示時間３８０２、元映
像フレーム区間３８０３（始点フレーム位置と区間長）
を記述し、お互いが対応していることを示す。つまり、
元映像フレーム区間３８０３を代表する画像として、映
像位置情報に記述された元映像フレーム３８０１を表示
することになる。Any information that can identify the section may be described as the original video information. The start position and the section length are used here, but the start position and the end position may be used as the original video information instead. FIG. 15 shows an example in which the original video information of FIG. 9 is given a temporal width. In this case, for example, the position of an original video frame 3801, a display time 3802, and an original video frame section 3803 (start frame position and section length) are described as the video position information, the display time information, and the original video information contained in the same piece of frame information, which indicates that they correspond to one another. In other words, the original video frame 3801 described in the video position information is displayed as the image representing the original video frame section 3803.

【０１１８】図１６は図１０に対して、元映像情報に時
間的な幅を持たせた例である。この場合、例えば、同一
のフレーム情報に含まれる映像位置情報、表示時間情
報、元映像情報として、それぞれ、表示用画像データフ
ァイル３９０１の格納場所、表示時間３９０２、元映像
フレーム区間３９０３（始点フレーム位置と区間長）を
記述し、お互いが対応していることを示す。つまり、元
映像フレーム区間３９０３を代表する画像として、映像
位置情報に記述された画像データファイルの画像３９０
１を表示することになる。FIG. 16 shows an example in which the original video information of FIG. 10 is given a temporal width. In this case, for example, the storage location of a display image data file 3901, a display time 3902, and an original video frame section 3903 (start frame position and section length) are described as the video position information, the display time information, and the original video information contained in the same piece of frame information, which indicates that they correspond to one another. In other words, the image 3901 of the image data file described in the video position information is displayed as the image representing the original video frame section 3903.

【０１１９】また、図１２、１３で示したようにフレー
ムの集合を表示用映像として用いる場合において、表示
用の映像に用いられている元映像フレーム区間とは異な
る区間を元映像情報として対応付けても構わない。When a set of frames is used as the display video, as shown in FIGS. 12 and 13, a section different from the original video frame section used for the display video may be associated as the original video information.

【０１２０】図１７は図１２に対して、元映像情報に時
間的な幅を持たせた例である。この場合、例えば、同一
のフレーム情報に含まれる映像位置情報、表示時間情
報、元映像情報として、それぞれ、元映像中のフレーム
の集合４００１、表示時間４００２、元映像フレーム区
間４００３（始点フレーム位置と区間長）を記述し、お
互いが対応していることを示す。このとき、映像位置情
報として記述するフレームの集合の区間４００１と元映
像情報として記述する元映像フレーム区間４００３は必
ずしも一致する必要はなく、異なる区間を表示用に用い
ても構わない。FIG. 17 shows an example in which the original video information is given a temporal width as compared with FIG. 12. In this case, for example, a set 4001 of frames in the original video, a display time 4002, and an original video frame section 4003 (the start frame position and the section length) are described as the video position information, display time information, and original video information included in the same frame information, respectively, to indicate that they correspond to each other. Here, the section of the frame set 4001 described as the video position information and the original video frame section 4003 described as the original video information need not match, and a different section may be used for display.

【０１２１】図１８は図１３に対して、元映像情報に時
間的な幅を持たせた例である。この場合、例えば、同一
のフレーム情報に含まれる映像位置情報、表示時間情
報、元映像情報として、それぞれ、表示に用いるフレー
ム集合４１０１の格納場所、表示時間４１０２、元映像
フレーム区間４１０３（始点フレーム位置と区間長）を
記述し、お互いが対応していることを示す。FIG. 18 shows an example in which the original video information is given a temporal width as compared with FIG. 13. In this case, for example, the storage location of the frame set 4101 used for display, the display time 4102, and the original video frame section 4103 (the start frame position and the section length) are described as the video position information, display time information, and original video information included in the same frame information, respectively, to indicate that they correspond to each other.

【０１２２】このとき、映像位置情報として記述するフ
レームの集合４１０１の区間と元映像情報として記述す
る元映像フレーム区間４１０３は必ずしも一致する必要
はない。つまり、表示用フレームの集合４１０１の区間
が元映像フレーム区間４１０３より、短くてもよいし、
長くてもよい。また、内容が全く異なる映像が含まれて
いてもよい。その他に、映像データファイルとして、元
映像情報に記述された区間のうち特に重要な区間のみを
抜き出して、まとめた映像データファイルを使用する方
法も考えられる。Here, the section of the frame set 4101 described as the video position information and the original video frame section 4103 described as the original video information need not match. That is, the section of the display frame set 4101 may be shorter or longer than the original video frame section 4103. It may also contain video with entirely different content. Alternatively, the video data file may be one that collects only the particularly important parts of the section described in the original video information.

【０１２３】これらのフレーム情報を用いて、例えば要
約再生（特殊再生）映像を閲覧する際に、元映像中の対
応するフレームを参照したい場合もある。When browsing a summary reproduction (special reproduction) video using such frame information, for example, there are cases where it is desired to refer to a corresponding frame in the original video.

【０１２４】図１９は要約表示された映像のフレームに
対応する元映像のフレームから再生を開始するためのフ
ローである。ステップＳ３６０１で、要約映像で再生開
始フレームを指定する。ステップＳ３６０２では後述す
る方法で、指定されたフレームに対応する元映像フレー
ムを算出する。ステップＳ３６０３では算出されたフレ
ームより元映像を再生する。FIG. 19 is a flowchart for starting the reproduction from the frame of the original video corresponding to the frame of the video displayed in summary. In step S3601, a playback start frame is specified in the summary video. In step S3602, an original video frame corresponding to the specified frame is calculated by a method described later. In step S3603, the original video is reproduced from the calculated frame.

【０１２５】もちろん、このフローは再生以外にも元映
像の対応する位置を参照するために用いることが可能で
ある。Of course, this flow can be used to refer to the corresponding position of the original video in addition to the reproduction.

【０１２６】ステップＳ３６０２において、対応する元
映像フレームを算出する方法の一例として、要約映像で
指定されたフレームの表示時間に対する比例配分を用い
る方法を示す。ｉ番目のフレーム情報の含まれる表示時
間情報をＤ_ｉ秒とし、元映像情報の区間始点位置をｔ_ｉ
秒、区間長をｄ_ｉ秒とする。ｉ番目のフレーム情報を
用いた再生が始まってから、ｔ秒経過した位置を指定し
た場合、対応する元映像のフレームの位置はＴ＝ｔ_ｉ＋
ｄ_ｉ×ｔ／Ｄ_ｉとなる。As an example of a method of calculating the corresponding original video frame in step S3602, a method using proportional distribution over the display time of the frame specified in the summary video is shown. Let the display time information included in the i-th frame information be D_i seconds, the section start position of the original video information be t_i seconds, and the section length be d_i seconds. When a position at which t seconds have elapsed since the start of reproduction using the i-th frame information is specified, the position of the corresponding frame of the original video is T = t_i + d_i × t / D_i.
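The proportional mapping of step S3602 can be sketched as follows. This is only an illustration of the formula T = t_i + d_i × t / D_i; the function name, the argument checks, and the sample numbers are assumptions and do not appear in the patent.

```python
def original_position(t_i: float, d_i: float, D_i: float, t: float) -> float:
    """Map an offset t (seconds) inside the display time of the i-th frame
    information to a position T (seconds) in the original video.

    t_i: section start position in the original video (seconds)
    d_i: section length in the original video (seconds)
    D_i: display time of the i-th frame information (seconds)
    """
    if D_i <= 0:
        raise ValueError("display time D_i must be positive")
    if not 0 <= t <= D_i:
        raise ValueError("t must lie within the display time D_i")
    # Proportional distribution: the fraction t / D_i of the display time
    # corresponds to the same fraction of the original section length d_i.
    return t_i + d_i * t / D_i


# Hypothetical values: a 30 s section starting at 120 s is summarized in 2 s,
# and the viewer selects the point 0.5 s into that summary frame.
position = original_position(t_i=120.0, d_i=30.0, D_i=2.0, t=0.5)
```

Playback of the original video (step S3603) would then start from `position`.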

【０１２７】次に、特殊再生、要約再生するフレーム選
択の方法を説明する。Next, a method of selecting a frame for special reproduction and summary reproduction will be described.

【０１２８】図２０、図２１を参照しながら、フレーム
選択方法の一例として、画面の動きに応じて、画面の動
きの多いところは狭い間隔で多くの画像データを抽出
し、動きの少ないところは広い間隔で少ない画像データ
を抽出する方法について説明する。図２０、図２１の横
軸や曲線８００やＳ_ｉやＦ_ｉは図１１と同様である。With reference to FIGS. 20 and 21, as an example of a frame selection method, a method is described in which, according to the motion of the screen, many pieces of image data are extracted at narrow intervals where the motion is large, and few pieces of image data are extracted at wide intervals where the motion is small. The horizontal axis, the curve 800, S_i, and F_i in FIGS. 20 and 21 are the same as in FIG. 11.

【０１２９】図１１の例では、画像データ抽出元フレー
ム間の画像変化量が一定となるような間隔で、１フレー
ムづつ抽出した。図２０、図２１は、フレーム番号Ｆ_ｉ
を基準として複数のフレームの集合を抽出する例を示し
ている。この場合、例えば、図２０に示すようにフレー
ム番号Ｆ_ｉから一定数の連続するフレームを抽出するよ
うにしてもよいし（フレーム長８１１とフレーム長８１
２は同一）、図２１に示すようにフレーム番号Ｆ_ｉから
画像変化量の総和が一定となるようにそれぞれ該当する
数の連続するフレームを抽出するようにしてもよい（面
積８１３と面積８１４が同一）。もちろん、その他にも
種々の方法が考えられる。In the example of FIG. 11, frames are extracted one at a time at intervals such that the amount of image change between the extraction source frames is constant. FIGS. 20 and 21 show examples in which a set of a plurality of frames is extracted with the frame number F_i as a reference. In this case, for example, a fixed number of consecutive frames may be extracted from the frame number F_i as shown in FIG. 20 (frame length 811 and frame length 812 are the same), or, as shown in FIG. 21, the corresponding number of consecutive frames may be extracted from the frame number F_i so that the sum of the image change amounts is constant (area 813 and area 814 are the same). Of course, various other methods are conceivable.
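The two extraction variants of FIGS. 20 and 21 can be sketched as below. This is a minimal illustration under stated assumptions: `change` is taken to be a list holding the screen change amount of each frame, and the function names and stopping rule (stop at the first frame where the running sum reaches the target) are hypothetical choices, not details given in the patent.

```python
def fixed_length_set(start: int, count: int, total_frames: int) -> list[int]:
    """FIG. 20 style: a fixed number of consecutive frames from F_i."""
    return list(range(start, min(start + count, total_frames)))


def constant_change_set(change: list[float], start: int, target: float) -> list[int]:
    """FIG. 21 style: consecutive frames from F_i whose accumulated
    screen change amount first reaches `target`."""
    frames = []
    accumulated = 0.0
    for frame in range(start, len(change)):
        frames.append(frame)
        accumulated += change[frame]
        if accumulated >= target:
            break
    return frames


# Hypothetical per-frame change amounts: slow motion first, then fast motion.
change = [1.0, 1.0, 1.0, 1.0, 4.0, 4.0, 1.0, 1.0]
slow_set = constant_change_set(change, start=0, target=3.0)
fast_set = constant_change_set(change, start=4, target=3.0)
```

With these sample values, the set taken from the slow part spans more frames than the set taken from the fast part, which is the behavior FIG. 21 describes.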

【０１３０】もちろん、前述した画面変化量＝０が一定
時間以上経過した場合のＦ_ｉの抽出処理も用いることが
可能である。Of course, the F_i extraction processing described above for the case where a screen change amount of 0 continues for a certain time or longer can also be used.

【０１３１】図１１の場合と同様、図２０、図２１の例
の場合、いずれのフレーム集合についても同じ表示時間
となるように表示時間情報１２１を記述するようにして
もよいし、別の方法で表示時間を求めて記述するように
しても構わない。As in the case of FIG. 11, in the examples of FIGS. 20 and 21, the display time information 121 may be described so that every frame set has the same display time, or the display time may be obtained by another method and described.

【０１３２】次に、表示時間を決定する処理の一例につ
いて説明する。Next, an example of the processing for determining the display time will be described.

【０１３３】図２２は、映像位置情報に記述された映像
を、表示時間情報に記述された時間どおりに連続的に再
生したときに、画面の変化量ができる限り一定となるよ
うな表示時間を求めるための基本処理手順の一例であ
る。FIG. 22 shows an example of a basic processing procedure for obtaining display times such that, when the video described in the video position information is reproduced continuously for the times described in the display time information, the amount of screen change is as constant as possible.

【０１３４】この処理は、フレームの抽出をどのような
方法で行った場合にも適用可能であるが、例えば図１１
のような方法でフレームを抽出した場合にはこの処理は
省くことができる。何故ならば、図１１は表示時間一定
で画面の変化量ができる限り一定となるようにフレーム
を選択したからである。This processing is applicable regardless of the method used to extract the frames, but it can be omitted when the frames are extracted by a method such as that of FIG. 11, because in FIG. 11 the frames are selected so that the amount of screen change is as constant as possible for a constant display time.

【０１３５】ステップＳ７１では、元映像の全フレーム
について隣接フレームとの間の画面変化量を求める。映
像の各フレームがビットマップにより表現されている場
合は、隣接するフレーム間の画素の差分値を画面変化量
とすることができる。映像がＭＰＥＧによって圧縮され
ている場合は、動きベクトルを用いて、画面変化量を求
めることが可能である。In step S71, the amount of screen change between adjacent frames for all frames of the original video is determined. When each frame of the video is represented by a bitmap, the difference between pixels between adjacent frames can be used as a screen change amount. When the video is compressed by MPEG, it is possible to obtain the amount of screen change using the motion vector.

【０１３６】画面変化量の求め方の一例を説明する。An example of how to determine the screen change amount will be described.

【０１３７】図２３は、ＭＰＥＧにより圧縮された映像
ストリームから、全フレームの画面変化量を求めるため
の基本処理手順の一例である。FIG. 23 shows an example of a basic processing procedure for obtaining the screen change amounts of all the frames from a video stream compressed by MPEG.

【０１３８】ステップＳ８１では、Ｐピクチャのフレー
ムから動きベクトルを抽出する。ＭＰＥＧによって圧縮
された映像のフレームは、図２４に示すように、Ｉピク
チャ（フレーム内符号化フレーム）、Ｐピクチャ（前方
予測フレーム間符号化フレーム）、Ｂピクチャ（双方向
予測フレーム間符号化フレーム）の並びによって記述さ
れる。このうち、Ｐピクチャには直前のＩピクチャまた
はＰピクチャからの動きに対応する動きベクトルが含ま
れている。In step S81, a motion vector is extracted from a P picture frame. As shown in FIG. 24, video frames compressed by MPEG include an I picture (intra-frame coded frame), a P picture (forward predicted inter-frame coded frame), and a B picture (bidirectional predicted inter-frame coded frame). ). Among them, the P picture contains a motion vector corresponding to the motion from the immediately preceding I picture or P picture.

【０１３９】ステップＳ８２では、１つのＰピクチャの
フレームに含まれる各動きベクトルの大きさ（強度）を
求め、その平均を直前のＩピクチャまたはＰピクチャか
らの画面変化量とする。In step S82, the magnitude (intensity) of each motion vector included in one P-picture frame is determined, and the average is taken as the amount of screen change from the immediately preceding I-picture or P-picture.

【０１４０】ステップＳ８３では、Ｐピクチャのフレー
ムに対して求めた画面変化量をもとに、Ｐピクチャ以外
のフレームを含めた全フレームに対応する１フレームご
との画面変化量を算出する。例えば、Ｐピクチャのフレ
ームの動きベクトルの平均値がｐで、参照元となる直前
のＩピクチャまたはＰピクチャのフレームからの間隔が
ｄである場合、間の各フレームの１フレームあたりの画
面変化量はｐ／ｄである。In step S83, the screen change amount per frame is calculated for all frames, including frames other than P pictures, based on the screen change amounts obtained for the P-picture frames. For example, if the average magnitude of the motion vectors of a P-picture frame is p and the interval from the immediately preceding I-picture or P-picture frame that it references is d, the screen change amount per frame of each frame in between is p / d.
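Steps S82 and S83 can be sketched as follows. This is a simplified illustration, not a stream parser: the picture types are assumed to be given as a string in display order, the motion vectors of each P picture are assumed to be already extracted into a dictionary, and frames not covered by any P picture (for example, B pictures just before an I picture) are simply left at 0. All names are hypothetical.

```python
import math


def per_frame_change(picture_types: str,
                     motion_vectors: dict[int, list[tuple[float, float]]]) -> list[float]:
    """Estimate the screen change amount of every frame from P-picture motion vectors.

    picture_types: e.g. "IBBPBBP", one character per frame (assumed display order)
    motion_vectors: maps the index of each P picture to its (dx, dy) vectors
    """
    change = [0.0] * len(picture_types)
    reference = None  # index of the most recent I or P picture
    for index, kind in enumerate(picture_types):
        if kind == "P" and reference is not None:
            vectors = motion_vectors.get(index, [])
            # S82: average magnitude of the motion vectors of this P picture
            p = (sum(math.hypot(dx, dy) for dx, dy in vectors) / len(vectors)
                 if vectors else 0.0)
            # S83: spread p evenly over the d frames since the reference picture
            d = index - reference
            for k in range(reference + 1, index + 1):
                change[k] = p / d
        if kind in ("I", "P"):
            reference = index
    return change


# Hypothetical example: one P picture three frames after an I picture.
change = per_frame_change("IBBP", {3: [(3.0, 4.0), (0.0, 0.0)]})
```

Here the two vectors have magnitudes 5 and 0, so p = 2.5 and each of the three frames after the I picture is assigned 2.5 / 3.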

【０１４１】続いて、図２２の手順におけるステップＳ
７２では、映像位置情報に記述する記述対象フレームか
ら、次の記述対象フレーム間での間にあるフレームの画
面変化量の総和を求める。Subsequently, in step S72 of the procedure of FIG. 22, the sum of the screen change amounts of the frames lying between a description target frame described in the video position information and the next description target frame is obtained.

【０１４２】図２５は、１フレームごとの画面変化量の
変化を記述した図である。横軸はフレーム番号に対応
し、曲線１０００が画面変化量の変化を表す。フレーム
位置Ｆ _ｉの位置情報を持つ映像の表示時間を求める場
合、次の記述対象フレーム位置であるＦ_ｉ＋１までの区
間１００１の画面変化量を累積加算する。これは、斜線
部１００２の面積Ｓ_ｉとなり、フレーム位置Ｆ_ｉの動き
の大きさと考えることができる。FIG. 25 is a diagram describing the change in the screen change amount for each frame. The horizontal axis corresponds to the frame number, and the curve 1000 represents the change in the screen change amount. To obtain the display time of the video having the position information of frame position F_i, the screen change amounts in the section 1001 up to the next description target frame position F_{i+1} are cumulatively added. The result is the area S_i of the hatched portion 1002, which can be regarded as the magnitude of the motion at frame position F_i.

【０１４３】続いて、図２２の手順におけるステップＳ
７３では、各フレームの表示時間を求める。画面の変化
量をできるだけ一定にするためには、画面の動きの大き
いフレームほど、表示時間を多く配分すればよいので、
各フレーム位置Ｆ_ｉの映像に配分する表示時間の再生時
間に対する割合を、Ｓ_ｉ／ΣＳ_ｉとすればよい。再生時
間の総和をＴとすると、各映像の表示時間は、Ｄ_ｉ＝Ｔ
・Ｓ_ｉ／ΣＳ_ｉとなる。再生時間の総和Ｔの値は、標準
の再生時間で、元映像の総再生時間と規定しておく。Subsequently, in step S73 of the procedure of FIG. 22, the display time of each frame is obtained. To keep the amount of screen change as constant as possible, more display time should be allocated to frames with larger screen motion, so the ratio of the display time allocated to the video at each frame position F_i to the reproduction time may be set to S_i / Σ_j S_j. Assuming that the total reproduction time is T, the display time of each video is D_i = T · S_i / Σ_j S_j. The value of the total reproduction time T is the standard reproduction time and is defined as the total reproduction time of the original video.
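Steps S72 and S73 can be sketched together as follows. The sketch assumes that `change` holds the per-frame screen change amounts, that `positions` lists the description target frame positions F_i in ascending order, and that the last section runs to the end of the video; these names and the end-of-video convention are assumptions made for illustration.

```python
def display_times(change: list[float], positions: list[int],
                  total_time: float) -> list[float]:
    """Allocate the total reproduction time T in proportion to S_i.

    S_i is the sum of change[F_i : F_{i+1}] (step S72), and
    D_i = T * S_i / sum_j S_j (step S73).
    """
    bounds = list(positions) + [len(change)]
    sums = [sum(change[a:b]) for a, b in zip(bounds, bounds[1:])]
    total = sum(sums)
    if total == 0:
        raise ValueError("all sections have zero change; no proportional split exists")
    return [total_time * s / total for s in sums]


# Hypothetical per-frame change amounts and three description target frames.
change = [1.0, 1.0, 2.0, 2.0, 0.0, 4.0]
times = display_times(change, positions=[0, 2, 4], total_time=10.0)
```

In this example the section sums are 2, 4 and 4, so the 10 seconds are split into 2, 4 and 4 seconds.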

【０１４４】画面変化がなくＳ_ｉ＝０となる場合は、予
め決められた下限値（例えば、１）を入れてもよいし、
そのフレーム情報を記述しなくてもよい。Ｓ_ｉ＝０とな
らないまでも、画面変化が非常に小さく、実際の再生に
おいてほとんど表示されないことが予想されるフレーム
に関しても、下限値を代入してもよいし、フレーム情報
を記述しなくてもよい。フレーム情報を記述しない場合
は、Ｓ_ｉの値はＳ_ｉ＋ _１に加算してもよいし、しなくて
もよい。When there is no screen change and S_i = 0, a predetermined lower limit (for example, 1) may be substituted, or the frame information may be left undescribed. Even when S_i is not 0, for a frame whose screen change is very small and which is expected to be hardly displayed in actual reproduction, the lower limit may be substituted or the frame information may be left undescribed. When the frame information is not described, the value of S_i may or may not be added to S_{i+1}.

【０１４５】この表示時間を求める処理は、特殊再生制
御情報生成装置にてフレーム情報作成のために行うこと
ができるが、映像再生装置側で特殊再生時に行うことも
可能である。The processing for obtaining the display time can be performed by the special reproduction control information generating apparatus for creating frame information, but can also be performed by the video reproducing apparatus at the time of special reproduction.

【０１４６】次に、特殊再生を行う場合の処理の例につ
いて説明する。Next, an example of processing for performing special reproduction will be described.

【０１４７】図２６は、記述された特殊再生制御情報に
基づき、Ｎ倍速再生を行うための処理手順の一例であ
る。FIG. 26 shows an example of a processing procedure for performing N-times speed reproduction based on the described special reproduction control information.

【０１４８】ステップＳ１１１では、再生倍率に基づい
て、再生時の表示時間Ｄ’_ｉを算出する。フレーム情報
に記述されている表示時間情報は、標準の表示時間なの
で、Ｎ倍速での再生を行う場合、各フレームの表示時間
Ｄ’_ｉ＝Ｄ_ｉ／Ｎとなる。In step S111, the display time D'_i during reproduction is calculated based on the reproduction magnification. Since the display time information described in the frame information is the standard display time, when reproduction is performed at N times speed, the display time of each frame is D'_i = D_i / N.

【０１４９】ステップＳ１１２では、表示のための初期
化を行う。すなわち、先頭のフレーム情報を表示するよ
うにｉ＝０とする。In step S112, initialization for display is performed. That is, i = 0 is set so as to display the first frame information.

【０１５０】ステップＳ１１３では、ｉ番目のフレーム
情報の表示時間Ｄ’_ｉが予め設定された表示時間の閾値
より大きいか否かを判定する。In step S113, it is determined whether or not the display time D'_i of the i-th frame information is greater than a preset display time threshold.

【０１５１】大きい場合は、ステップＳ１１４におい
て、ｉ番目のフレーム情報Ｆ_ｉに含まれる映像位置情報
の映像をＤ’_ｉ秒間表示する。If it is greater, in step S114 the video indicated by the video position information included in the i-th frame information F_i is displayed for D'_i seconds.

【０１５２】大きくない（下回る）場合は、ステップＳ
１１５に進み、表示時間の閾値を下回らないｉ番目のフ
レーム情報を順方向に探索する。この間、表示時間の閾
値を下回ったフレーム情報の表示時間は、すべて探索の
結果得られたｉ番目のフレーム情報の表示時間に加算
し、表示時間の閾値を下回ったフレーム情報の表示時間
は０とする。このような処理を行うのは、再生時の表示
時間が非常に短くなると、表示する映像を準備する時間
が表示時間よりも長くなり、表示が間に合わなくなる場
合があるためである。そこで、表示時間が非常に短い場
合は、表示をせずに先に進むようにする。その際に総再
生時間が変わらないように、表示されなかった映像の表
示時間を表示される映像の表示時間に加算する。If it is not greater (that is, it falls below the threshold), the process proceeds to step S115, and a forward search is made for frame information whose display time does not fall below the threshold. The display times of all frame information found along the way that fall below the threshold are added to the display time of the frame information obtained by the search, and the display times of the frame information that fell below the threshold are set to 0. This processing is performed because, when the display time during reproduction becomes very short, the time needed to prepare the video to be displayed may exceed the display time, so that the display cannot keep up. Therefore, when the display time is very short, the process moves on without displaying the video. In doing so, the display time of the video that was not displayed is added to the display time of the video that is displayed, so that the total reproduction time does not change.

【０１５３】ステップＳ１１６では、まだ表示されてい
ないフレーム情報が残っていないかを判断するために、
ｉがフレーム情報の総数を下回っているか判定する。下
回っている場合は、ステップＳ１１７へ進み、ｉを１増
加させて次フレーム情報の表示を行う準備をする。ｉが
フレーム情報の総数に到達した場合は、再生処理を終了
する。In step S116, to determine whether any frame information remains that has not yet been displayed, it is determined whether i is less than the total number of pieces of frame information. If it is, the process proceeds to step S117, where i is incremented by 1 in preparation for displaying the next frame information. When i has reached the total number of pieces of frame information, the reproduction processing ends.
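The procedure of FIG. 26 (steps S111 to S117) can be sketched as a function that turns the described display times into a playback schedule. This is a hedged illustration: the function name is hypothetical, frames exactly at the threshold are treated as displayable, and leftover time from short frames at the very end is added to the last displayed frame, a case the text does not specify.

```python
def schedule_playback(display_times: list[float], speed: float,
                      threshold: float) -> list[tuple[int, float]]:
    """Return (frame information index, seconds to display) pairs for N-times speed."""
    scaled = [d / speed for d in display_times]   # S111: D'_i = D_i / N
    schedule = []
    carried = 0.0                                  # time collected from skipped frames
    for i, d in enumerate(scaled):                 # S112, S116, S117: i = 0, 1, ...
        if d >= threshold:                         # S113 (equality handling is assumed)
            schedule.append((i, d + carried))      # S114, plus time carried over in S115
            carried = 0.0
        else:
            carried += d                           # S115: skip this frame, keep its time
    if carried and schedule:
        # Assumption: trailing short frames extend the last displayed frame.
        last_index, last_time = schedule[-1]
        schedule[-1] = (last_index, last_time + carried)
    return schedule


# Hypothetical display times (seconds) played at 2x with a 0.5 s threshold.
plan = schedule_playback([4.0, 0.2, 0.2, 2.0], speed=2.0, threshold=0.5)
```

The two short frames are skipped and their scaled display time is added to the fourth frame, so the total playback time stays equal to the sum of the scaled display times.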

【０１５４】図２７は、既定の表示サイクル（例えば、
１秒間に３０フレームを表示する場合、１表示サイクル
は１／３０秒）を基準にして、記述された特殊再生制御
情報に基づき、Ｎ倍速再生を行うための処理手順の一例
である。FIG. 27 shows an example of a processing procedure for performing N-times speed reproduction based on the described special reproduction control information, using a predetermined display cycle as a reference (for example, when 30 frames are displayed per second, one display cycle is 1/30 second).

【０１５５】ステップＳ１２１では、Ｎ倍速再生時に、
各フレームの表示時間Ｄ’_ｉを、Ｄ’_ｉ＝Ｄ_ｉ／Ｎとし
て求める。ここで算出される表示時間は、実際には表示
サイクルとの関係があるので、算出された表示時間で映
像を表示できるとは限らない。In step S121, the display time D'_i of each frame during N-times speed reproduction is obtained as D'_i = D_i / N. Because the display is actually tied to the display cycle, the video cannot always be displayed for exactly the calculated display time.

【０１５６】図２８は、算出された表示時間と表示サイ
クルの関係を表した図である。時間軸１３００は算出さ
れた表示時間を示し、時間軸１３０１は表示レートに基
づく表示サイクルを示す。表示レートがｆフレーム／秒
の場合、表示サイクルの間隔は１／ｆ秒となる。FIG. 28 is a diagram showing the relationship between the calculated display time and the display cycle. A time axis 1300 indicates the calculated display time, and a time axis 1301 indicates a display cycle based on the display rate. When the display rate is f frames / sec, the display cycle interval is 1 / f sec.

【０１５７】したがって、ステップＳ１２２では、表示
サイクルの開始点が含まれるフレーム情報Ｆ_ｉを探索
し、ステップＳ１２３では、フレーム情報Ｆ_ｉに含まれ
る映像を１表示サイクル（１／ｆ秒）表示する。Therefore, in step S122, the frame information F_i that contains the start point of the display cycle is searched for, and in step S123, the video included in the frame information F_i is displayed for one display cycle (1/f seconds).

【０１５８】例えば、表示サイクル１３０２は、表示開
始点１３０３が、算出された表示時間１３０４に含まれ
るので、この表示時間に対応するフレーム情報の映像を
表示する。For example, in the display cycle 1302, since the display start point 1303 is included in the calculated display time 1304, the video of the frame information corresponding to the display time is displayed.
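The cycle-based selection of steps S121 to S123 can be sketched as follows. The sketch assumes that the calculated display times are laid end to end on a time axis starting at 0 and that each display cycle shows the frame information whose interval contains the cycle start; the function name and the sample values are hypothetical.

```python
def frames_per_cycle(display_times: list[float], speed: float,
                     rate: float) -> list[int]:
    """For each display cycle of 1/rate seconds, return the index of the
    frame information whose scaled display interval contains the cycle start."""
    scaled = [d / speed for d in display_times]   # S121: D'_i = D_i / N
    if not scaled:
        return []
    total = sum(scaled)
    result = []
    i, end = 0, scaled[0]                         # `end` closes the interval of frame i
    cycle = 0
    while cycle / rate < total:
        start = cycle / rate                      # start point of this display cycle
        # S122: advance to the frame information containing the start point
        while start >= end and i < len(scaled) - 1:
            i += 1
            end += scaled[i]
        result.append(i)                          # S123: show it for one cycle
        cycle += 1
    return result


# Hypothetical values: 4 cycles per second, two frames shorter than one cycle.
shown = frames_per_cycle([0.5, 0.125, 0.125, 0.5], speed=1.0, rate=4.0)
```

With these values the third frame information never contains a cycle start, so it is not displayed, which corresponds to the case of display time 1305 being shorter than the display cycle.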

【０１５９】表示サイクルとフレーム情報との対応付け
方法は、図２９に示すように、表示サイクル開始点の最
も近傍の映像を表示するようにしてもよい。図２８の表
示時間１３０５のように、表示時間が表示サイクルより
小さくなった場合は、その映像の表示を省略してもよい
し、強制的に表示してもよい。強制的に表示した場合
は、前後の表示時間を短くして全体の総表示時間が変わ
らないように調整する。As another way of associating display cycles with frame information, the video nearest to the start point of the display cycle may be displayed, as shown in FIG. 29. When the display time is shorter than the display cycle, as with the display time 1305 in FIG. 28, display of that video may be omitted or the video may be displayed forcibly. When it is displayed forcibly, the display times of the preceding and following videos are shortened so that the overall total display time does not change.

【０１６０】ステップＳ１２４では、現在の表示が最終
表示サイクルであるかを調べ、最終表示サイクルであれ
ば処理を終了し、最終表示サイクルでなければ次の表示
サイクルを処理するために、ステップＳ１２５へ進む。In step S124, it is checked whether the current display is the final display cycle. If it is, the processing ends; if not, the process proceeds to step S125 to process the next display cycle.

【０１６１】フレーム情報記述の他の例を説明する。Another example of the frame information description will be described.

【０１６２】図８あるいは図１４のデータ構造に含まれ
るフレーム情報は単一の元映像を要約する場合について
扱ったものであるが、フレーム情報を拡張することによ
って、複数の元映像をまとめて要約することができる。
図３０はその一例で、個々のフレーム情報に含まれる元
映像情報４２０１に元映像ファイルの位置などを示す元
映像位置情報４２０２を追加した構造となっている。元
映像位置情報に記述されるファイルは必ずしもファイル
全体の区間を扱う必要はなく、一部区間のみを抜き出し
た形で用いてもよい。この場合、ファイル名などファイ
ルの情報だけでなく、ファイルのどの区間が対象となっ
ているかを示すための区間情報も合わせて記述する。映
像ファイルから選択する区間は１つの映像に対して、複
数であってもよい。The frame information included in the data structure of FIG. 8 or FIG. 14 deals with the case of summarizing a single original video, but by extending the frame information, a plurality of original videos can be summarized together. FIG. 30 shows an example, in which original video position information 4202 indicating, for example, the location of the original video file is added to the original video information 4201 included in each piece of frame information. The file described in the original video position information need not be handled over its entire length; only a partial section may be extracted and used. In this case, not only file information such as the file name but also section information indicating which section of the file is targeted is described. A plurality of sections may be selected from one video file.

【０１６３】また、元映像が何種類か存在し、個々に識
別情報が付与されている場合は、元映像位置情報の代わ
りに元映像識別情報を記述してもよい。In the case where there are several types of original video and individual identification information is given, original video identification information may be described instead of the original video position information.

【０１６４】図３１は元映像位置情報を追加したフレー
ム情報を用いて、複数の元映像をまとめて要約表示する
例について説明する図である。この例では３つの映像
（映像１、映像２、映像３）をまとめて、１つの要約映
像を表示している。映像２に関しては全区間ではなく、
４３０１と４３０２の２箇所の区間を取り出して、別々
の元映像として扱っている。フレーム情報としてはこれ
らの元映像情報と共に、それぞれを代表する画像のフレ
ーム位置（４３０１に対しては４３０３）が映像位置情
報として、また、表示時間（４３０１に対しては４３０
４）が表示時間情報として記述される。FIG. 31 is a diagram for explaining an example in which a plurality of original videos are summarized and displayed together using frame information to which the original video position information has been added. In this example, three videos (video 1, video 2, and video 3) are combined and displayed as one summary video. For video 2, not the entire section but two sections, 4301 and 4302, are extracted and treated as separate original videos. As the frame information, together with this original video information, the frame position of the image representing each section (4303 for 4301) is described as the video position information, and the display time (4304 for 4301) is described as the display time information.

【０１６５】図３２は元映像位置情報を追加したフレー
ム情報を用いて、複数の元映像をまとめて要約表示する
別の例について説明する図である。この例でも３つの映
像をまとめて、１つの要約映像を表示している。映像２
に関しては全区間ではなく、一部区間を取り出して、別
々の元映像として扱っている。もちろん、図３１のよう
に複数の区間を取り出してもよい。フレーム情報として
はこれらの元映像情報（例えば映像２に加え４４０１の
区間情報）と共に、それぞれを代表する画像ファイル
（４４０２）の格納場所が映像位置情報として、また、
表示時間（４４０３）が表示時間情報として記述され
る。FIG. 32 is a diagram for explaining another example in which a plurality of original videos are summarized and displayed together using frame information to which the original video position information has been added. In this example as well, three videos are combined and displayed as one summary video. For video 2, not the entire section but a partial section is extracted and treated as a separate original video. Of course, a plurality of sections may be extracted as in FIG. 31. As the frame information, together with this original video information (for example, the section information 4401 in addition to video 2), the storage location of the image file (4402) representing each section is described as the video position information, and the display time (4403) is described as the display time information.

【０１６６】これらの例で説明したようなフレーム情報
への元映像位置情報の追加は、フレームの集合を映像位
置情報として用いる場合においても、全く同じように適
用することができ、複数の元映像をまとめた要約表示が
可能である。The addition of the original video position information to the frame information described in these examples can be applied in exactly the same way when a set of frames is used as the video position information, so that a summary display combining a plurality of original videos is possible.

【０１６７】図３３はフレーム情報を記述するための別
のデータ構造である。このデータ構造では既に説明した
映像位置情報、表示時間情報、元映像情報に加えて、動
き情報４５０１と注目領域情報４５０２が加わってい
る。動き情報とはフレーム情報が対応する元映像の区間
（すなわち元映像情報に記述された区間）の動きの大き
さ（画面の変化量）を記述する。注目領域情報とは映像
位置情報に記述されている画像の中で特に注目すべき領
域の情報を記述したものである。FIG. 33 shows another data structure for describing frame information. In this data structure, motion information 4501 and attention area information 4502 are added in addition to the video position information, display time information, and original video information already described. The motion information describes the magnitude of the motion (the amount of change in the screen) in the section of the original video corresponding to the frame information (that is, the section described in the original video information). The attention area information describes information of an area of particular interest in the image described in the video position information.
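As a rough illustration of the data structure of FIG. 33, one piece of frame information could be modeled as below. The field names, types, and the rectangle representation of the attention area are assumptions chosen for this sketch; the patent does not prescribe a concrete encoding.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FrameInfo:
    """One piece of frame information (hypothetical field names)."""
    video_position: str                                   # frame position or storage location
    display_time: Optional[float] = None                  # seconds; may be omitted
    original_section: Optional[Tuple[int, int]] = None    # (start frame, section length)
    motion: Optional[float] = None                        # motion information 4501
    attention_area: Optional[Tuple[int, int, int, int]] = None  # area 4502 as x, y, width, height


# Hypothetical entry that carries motion information but no display time,
# so the display time would be computed at reproduction time.
entry = FrameInfo(video_position="frame:1200",
                  original_section=(1200, 300),
                  motion=0.8)
```

Making every field except the video position optional reflects that display time and motion information may each be present or absent in a given description.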

【０１６８】動き情報は図２２において、映像の動きか
ら表示時間を算出する際に用いたように映像位置情報に
記述される画像の表示時間を算出するために用いること
が可能である。この場合、表示時間情報を省略し、動き
情報のみを記述しても、表示時間を記述した場合と同様
に早送りなどの特殊再生を行うことができる（この場
合、再生時に表示時間を計算する）。The motion information can be used to calculate the display time of the image described in the video position information, in the same way as the display time was calculated from the motion of the video in FIG. 22. In this case, even if the display time information is omitted and only the motion information is described, special reproduction such as fast-forwarding can be performed just as when the display time is described (in this case, the display time is calculated at the time of reproduction).

【０１６９】表示時間情報と動き情報の両方を同時に記
述することも可能であり、その場合は表示を行うアプリ
ケーションが処理に合わせて必要な方を用いたり、組み
合わせて用いればよい。It is also possible to describe both display time information and motion information at the same time. In such a case, the application that performs the display may use the one necessary for the processing or use it in combination.

【０１７０】例えば、表示時間情報には動きと関係なく
算出された表示時間を記述しておく。元映像から、重要
な場面を切り出す表示時間の算出方法などがこれに該当
する。このように算出された要約表示の早送りを行う際
に、動き情報を用いて、動きの大きい部分は遅めに、動
きの小さい部分の速めに再生を行うことによって、見落
としの少ない早送りが可能である。For example, a display time calculated independently of motion is described in the display time information. A display time calculated by a method that cuts out important scenes from the original video is one such case. When fast-forwarding a summary display calculated in this way, the motion information can be used to reproduce portions with large motion more slowly and portions with small motion more quickly, which enables fast-forwarding in which little is overlooked.

【０１７１】注目領域情報はフレーム情報の映像位置情
報に記述された画像の中で特に注目すべき領域が存在す
るときに用いる。例えば、視聴者にとって重要と思われ
る人物の顔などがこれに該当する。このような注目領域
情報を含む画像を表示する際には領域が分かるように矩
形などを重ね合わせて表示してもよい。この表示は必須
ではなく、そのまま画像を表示するだけでも構わない。The attention area information is used when there is a particularly noticeable area in the image described in the video position information of the frame information. For example, a face of a person deemed important to a viewer corresponds to this. When an image including such attention area information is displayed, a rectangle or the like may be superimposed and displayed so that the area can be recognized. This display is not essential, and the image may be displayed as it is.

【０１７２】注目領域情報はフレーム情報などの特殊再
生情報を加工して表示したりすることも可能である。例
えば、一部のフレーム情報のみを再生表示する場合に、
注目領域情報が含まれるフレーム情報を優先的に表示す
る。また、大きな面積をもつ矩形領域が含まれるほど、
重要度が高いという解釈を用いて、選択表示することも
可能である。The attention area information can also be used to process special reproduction information such as frame information for display. For example, when only part of the frame information is reproduced and displayed, frame information that includes attention area information is displayed preferentially. Selective display is also possible using the interpretation that the larger the rectangular area included, the higher the importance.

【０１７３】以上、画面変化量に基づいて要約再生する
フレームを選択する場合を説明してきたが、以下では、
重要度情報を利用してフレームを選択する場合を説明す
る。The case where the frames to be summarized and reproduced are selected based on the screen change amount has been described above. The following describes the case where frames are selected using importance information.

【０１７４】図３４は、映像に付帯させるフレーム情報
のデータ構造の一例である。FIG. 34 shows an example of the data structure of frame information attached to a video.

【０１７５】このデータ構造は、図１のフレーム情報の
データ構造において、表示時間制御情報１０２として
（または表示時間制御情報１０２の代わりに）、表示時
間の基となる情報である重要度情報１２２を記述するよ
うにしたものである。In this data structure, importance information 122, which is the information on which the display time is based, is described as the display time control information 102 (or in place of the display time control information 102) in the data structure of the frame information of FIG. 1.

【０１７６】重要度情報１２２は、対応するフレーム
（またはフレーム集合）の重要度を表す。重要度は、例
えば、一定範囲（例えば０から１００の間）の整数とし
て表現したり、一定範囲（例えば０から１の間）の実数
として表現する。あるいは、上限を定めずに整数、実数
値として表現しても良い。重要度情報は、映像の全ての
フレームに対して付帯させても良いし、重要度の変化し
たフレームのみ付帯させても良い。The importance information 122 indicates the importance of the corresponding frame (or frame set). The importance is expressed, for example, as an integer in a certain range (for example, from 0 to 100) or as a real number in a certain range (for example, from 0 to 1). Alternatively, it may be expressed as an integer or real value with no upper limit. The importance information may be attached to every frame of the video, or only to the frames at which the importance changes.

【０１７７】この場合、映像の位置情報の記述方法は、
図９、図１０、図１２、図１３のいずれの形態をとるこ
とも可能である。図１１や図２０、図２１のフレーム抽
出方法も利用可能である（この場合には、図１１や図２
０、図２１の画面変化量を重要度に置き換えればよ
い）。In this case, the position information of the video may be described in any of the forms of FIGS. 9, 10, 12, and 13. The frame extraction methods of FIGS. 11, 20, and 21 can also be used (in this case, the screen change amount in FIGS. 11, 20, and 21 is replaced with the importance).

【０１７８】次に、先に説明した例では、画面の変化量
により表示時間の設定を行ったが、重要度情報により表
示時間を設定することも可能である。以下、このような
表示時間の設定方法について説明する。Next, in the example described above, the display time is set according to the amount of change in the screen, but the display time can be set according to the importance information. Hereinafter, a method of setting such a display time will be described.

【０１７９】先に例示した画面の変化量に基づく表示時
間設定では、映像内容を理解しやすくするため、変化量
の大きいところでは表示時間を長く設定し、変化量の小
さいところでは表示時間を短く設定した。この重要度に
基づく表示時間設定では、重要度の高いところは表示時
間を長く設定し、重要度の低いところでは表示時間を短
くすれば良い。すなわち、重要度による表示時間の設定
方法は、基本的に画面の変化量に基づく表示時間設定方
法（図２５参照）と同様であるため、ここでは簡単に説
明することにする。In the display time setting based on the amount of change of the screen described above, the display time is set to be long where the amount of change is large, and the display time is shortened where the amount of change is small in order to make it easy to understand the video contents. Set. In the display time setting based on the importance, the display time may be set longer for a portion having a higher importance, and the display time may be set shorter for a portion having a lower importance. That is, the method of setting the display time based on the importance is basically the same as the method of setting the display time based on the amount of change in the screen (see FIG. 25), and therefore, will be briefly described here.

【０１８０】図３６に、この場合の基本処理手順の一例
を示す。FIG. 36 shows an example of the basic processing procedure in this case.

【０１８１】ステップＳ１９１では、元映像の全フレー
ムの重要度を求める。その具体的な方法については後で
例示する。In step S191, the importance of all frames of the original video is obtained. The specific method will be described later.

【０１８２】ステップＳ１９２では、映像位置情報に記
述する記述対象フレームから、次の記述対象フレームま
での間にあるフレームの重要度の総和を求める。In step S192, the sum of the importance of the frames between the description target frame described in the video position information and the next description target frame is obtained.

【０１８３】図３７は、１フレームごとの重要度の変化
を記述した図である。２２００が重要度である。フレー
ム位置Ｆ_ｉの位置情報を持つ映像の表示時間を求める場
合、次の記述対象フレーム位置であるＦ_ｉ＋１までの
区間２２０１の重要度を加算する。加算結果は、斜線部
２２０２の面積Ｓ’_ｉとなる。FIG. 37 is a diagram describing the change in importance for each frame. Reference numeral 2200 denotes the importance. To obtain the display time of the video having the position information of frame position F_i, the importance in the section 2201 up to the next description target frame position F_{i+1} is summed. The result is the area S'_i of the hatched portion 2202.

【０１８４】ステップＳ１９３では、各フレームの表示
時間を求める。各フレーム位置Ｆ_ｉの映像に配分する表
示時間の再生時間に対する割合を、Ｓ’_ｉ／ΣＳ’_ｊ
とする。再生時間の総和をＴとすると、各映像の表示時
間は、Ｄ_ｉ＝Ｔ・Ｓ’_ｉ／ΣＳ’_ｊとなる。再生時間
の総和Ｔの値は、標準の再生時間で、元映像の総再生時
間と規定しておく。In step S193, the display time of each frame is obtained. The fraction of the playback time allocated to the video at each frame position F_i is S'_i / Σ_j S'_j. Letting T be the total playback time, the display time of each video is D_i = T · S'_i / Σ_j S'_j. The total playback time T is the standard playback time and is defined as the total playback time of the original video.

【０１８５】重要度の和がＳ’_ｉ＝０となる場合は、予
め決められた下限値（例えば、１）を入れてもよいし、
そのフレーム情報を記述しなくてもよい。Ｓ’_ｉ＝０と
ならないまでも、重要度が非常に小さく、実際の再生に
おいてほとんど表示されないことが予想されるフレーム
に関しても、下限値を代入してもよいし、フレーム情報
を記述しなくてもよい。フレーム情報を記述しない場合
は、Ｓ’_ｉの値はＳ’ _ｉ＋１に加算してもよいし、し
なくてもよい。When the sum of importance is S'_i = 0, a predetermined lower limit (for example, 1) may be substituted, or the frame information may be left undescribed. Even when S'_i is not 0, the lower limit may likewise be substituted, or the frame information left undescribed, for frames whose importance is so small that they are expected to be hardly displayed in actual playback. When the frame information is not described, the value of S'_i may or may not be added to S'_{i+1}.
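By way of a non-limiting illustration, the allocation of steps S192 and S193, together with the optional lower limit, might be sketched in Python as follows. The function name `allocate_display_times`, its parameters, and the choice to treat each section as running from F_i up to but not including F_{i+1} (with the last section running to the end of the video) are assumptions made for this sketch and are not taken from the specification.

```python
def allocate_display_times(importance, key_frames, total_time, floor=None):
    """Return one display time D_i per description target frame.

    importance -- per-frame importance values for the whole original video
    key_frames -- sorted frame indices F_i listed in the video position information
    total_time -- T, the total playback time to distribute
    floor      -- optional lower limit substituted when S'_i falls below it
    """
    # Section boundaries: F_0, F_1, ..., and the end of the video (assumption).
    bounds = list(key_frames) + [len(importance)]
    # S'_i: sum of the importance from F_i up to (not including) F_{i+1}.
    sums = [sum(importance[a:b]) for a, b in zip(bounds, bounds[1:])]
    if floor is not None:
        sums = [max(s, floor) for s in sums]
    total = sum(sums)
    if total == 0:
        raise ValueError("all importance sums are zero; nothing to allocate")
    # D_i = T * S'_i / sum_j S'_j
    return [total_time * s / total for s in sums]
```

With `floor=None` a section whose importance sum is zero receives a display time of zero; passing a small positive `floor` keeps such a section visible at the cost of slightly shortening the others.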

【０１８６】図３５のように、図１のフレーム情報のデ
ータ構造において、各フレーム情報ｉに、映像位置情報
１０１と、表示時間情報１２１と、重要度情報１２２を
記述するようにしてもよい。この場合において、特殊再
生時には、表示時間情報１２１を用いるが重要度情報１
２２を用いない方法と、重要度情報１２２を用いるが表
示時間情報１２１を用いない方法と、両方用いる方法
と、両方用いない方法がある。As shown in FIG. 35, in the data structure of the frame information of FIG. 1, video position information 101, display time information 121, and importance information 122 may be described in each frame information i. In this case, at the time of special reproduction there are four possibilities: using the display time information 121 but not the importance information 122, using the importance information 122 but not the display time information 121, using both, and using neither.

【０１８７】この表示時間を求める処理は、特殊再生制
御情報生成装置にてフレーム情報作成のために行うこと
ができるが、映像再生装置側で特殊再生時に行うことも
可能である。The process for obtaining the display time can be performed by the special reproduction control information generating device for creating frame information, but can also be performed by the video reproducing device at the time of special reproduction.

【０１８８】次に、各フレームまたは場面（映像区間）
の重要度の決定方法（例えば、図３６のステップＳ１９
１）について説明する。Next, a method of determining the importance of each frame or scene (video section) (for example, step S191 in FIG. 36) will be described.

【０１８９】映像のある場面が重要かどうかは、通常、
様々な要因が絡み合っているため、重要度を決定する最
も妥当な方法は、人間が決定する方法である。この方法
では、映像のそれぞれの場面、または一定の時間間隔ご
とに重要度評価者が重要度を評価し、重要度データへの
入力を行う。ここで言う重要度データとは、フレーム番
号または時刻と、そのときの重要度の値との対応表のこ
とである。重要度の評価が主観的になってしまうことを
避けるためには、複数の重要度評価者に同一の映像を評
価してもらい、各場面または各映像区間ごとに平均値
（またはメジアンなどでも良い）を算出して最終的な重
要度を決定する。このような人手による重要度データ入
力は、言葉では表現できないようなあいまいな印象や複
数の要素を重要度に加味することが可能である。Whether a given scene in a video is important usually depends on various intertwined factors, so the most appropriate way to determine importance is to have a human decide. In this method, an importance evaluator rates the importance of each scene of the video, or at regular time intervals, and enters the rating into the importance data. The importance data referred to here is a table that associates a frame number or time with the importance value at that point. To keep the evaluation from becoming subjective, a plurality of importance evaluators rate the same video, and the final importance is determined by calculating the average (or the median or the like) for each scene or each video section. Such manual input of importance data makes it possible to reflect in the importance vague impressions that cannot be put into words, as well as multiple factors.

【０１９０】人間が決定する手間を省くためには、重要
であると思われる映像場面に出現しそうな事象を考え、
このような事象を自動で評価して重要度に変換する処理
を利用するのが好ましい。以下、重要度の自動生成の例
をいくつか示す。To save the effort of human evaluation, it is preferable to consider events that are likely to appear in video scenes regarded as important and to use processing that automatically evaluates such events and converts them into importance. Several examples of automatic importance generation are given below.

【０１９１】図３８は、音声レベルの大きな場面が重要
であるとして、重要度データを自動算出する際の処理手
順の一例である（図３８は機能ブロック図としても成立
する）。FIG. 38 shows an example of a processing procedure for automatically calculating importance data on the assumption that scenes with a high audio level are important (FIG. 38 can also be read as a functional block diagram).

【０１９２】ステップＳ２１０の音声レベル算出処理で
は、映像に付随している音声データが入力されると、各
時刻における音声レベルを算出する。音声レベルは瞬時
に大きく変化するため、ステップＳ２１０の音声レベル
算出処理では平滑化等の処理を行っても良い。In the audio level calculation processing in step S210, when audio data accompanying the video is input, the audio level at each time is calculated. Since the audio level greatly changes instantaneously, processing such as smoothing may be performed in the audio level calculation processing in step S210.

【０１９３】ステップＳ２１１の重要度算出処理では、
音声レベル算出処理の結果出力される音声レベルを重要
度に変換する処理を行う。例えば、あらかじめ定められ
ている最低音声レベルを０、最高音声レベルを１００と
して入力された音声レベルを０から１００の値に線形に
変換する。最低音声レベル以下の場合は０、最高音声レ
ベル以上の場合は１００とする。重要度算出処理の結
果、各時刻における重要度が決定され、重要度データと
して出力される。In the importance calculation process of step S211, the audio level output by the audio level calculation process is converted into importance. For example, with a predetermined minimum audio level mapped to 0 and a predetermined maximum audio level mapped to 100, the input audio level is linearly converted to a value from 0 to 100. Levels at or below the minimum audio level give 0, and levels at or above the maximum audio level give 100. As a result of the importance calculation process, the importance at each time is determined and output as importance data.
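The smoothing mentioned for step S210 and the linear conversion of step S211 might, purely as an illustrative sketch, look like the following. The trailing moving average is only one possible smoothing and is an assumption of this sketch, as are the function names `moving_average` and `audio_level_to_importance`.

```python
def moving_average(levels, window):
    """Smooth a sequence of audio levels with a trailing moving average.

    The specification only says that smoothing "may be performed"; the
    trailing window used here is an assumption of this sketch.
    """
    smoothed = []
    for i in range(len(levels)):
        start = max(0, i - window + 1)
        chunk = levels[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def audio_level_to_importance(level, min_level, max_level):
    """Map an audio level linearly onto 0..100, clamping outside the range."""
    if max_level <= min_level:
        raise ValueError("max_level must be greater than min_level")
    scaled = 100.0 * (level - min_level) / (max_level - min_level)
    # At or below the minimum gives 0; at or above the maximum gives 100.
    return min(100.0, max(0.0, scaled))
```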

【０１９４】図３９は、他の重要度レベル自動決定方法
の処理手順例である（図３９は機能ブロック図としても
成立する）。FIG. 39 shows an example of the processing procedure of another method of automatically determining the importance level (FIG. 39 can also be read as a functional block diagram).

【０１９５】図３９の処理は、映像に付随する音声中
に、あらかじめ登録されている重要単語が多く出現する
場面を重要であると判断するものである。The processing shown in FIG. 39 is for judging that a scene in which many pre-registered important words appear in the audio accompanying the video is important.

【０１９６】ステップＳ２２０の音声認識処理では、映
像に付随する音声データが入力されると、音声認識処理
により人が話した言葉（単語）をテキストデータに変換
する。In the voice recognition processing of step S220, when voice data accompanying a video is input, words (words) spoken by a person are converted into text data by the voice recognition processing.

【０１９７】重要単語辞書２２１には、重要な場面に登
場しそうな単語が登録されている。登録されている単語
の重要さの度合いが異なっている場合には、登録単語ご
とに重みを付加しておく。In the important word dictionary 221, words likely to appear in important scenes are registered. If the registered words have different degrees of importance, a weight is added to each registered word.

【０１９８】ステップＳ２２２の単語照合処理では、音
声認識処理の出力であるテキストデータと重要単語辞書
２２１に登録されている単語を照合し、重要な単語が話
されたかどうかを判定する。In the word matching process in step S222, text data output from the voice recognition process is compared with words registered in the important word dictionary 221 to determine whether an important word has been spoken.

【０１９９】ステップＳ２２３の重要度算出処理では、
単語照合処理の結果から映像の各場面や各時刻における
重要度を算出する。この計算には、重要単語の出現数、
重要単語の重みが使われ、例えば重要単語の出現した時
刻の周辺（または出現した場面）の重要度を一定値もし
くは重要単語の重みに比例する値だけ上昇させるといっ
た処理を行う。重要度算出処理の結果、各時刻における
重要度が決定され、重要度データとして出力される。In the importance calculation process of step S223, the importance of each scene or each time of the video is calculated from the result of the word matching process. The calculation uses the number of occurrences of important words and their weights; for example, the importance around the time at which an important word appears (or of the scene in which it appears) is raised by a fixed value or by a value proportional to the weight of that word. As a result of the importance calculation process, the importance at each time is determined and output as importance data.
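As a hedged sketch of steps S222 and S223, the following Python function raises the importance around each recognized important word by that word's weight. The representation of the recognizer output as `(time_index, word)` pairs, the symmetric `radius` that defines "around the time", and the name `keyword_importance` are all assumptions made for illustration; the speech recognition itself is outside the sketch.

```python
def keyword_importance(recognized, dictionary, duration, radius=2):
    """Build per-time importance from recognized words.

    recognized -- list of (time_index, word) pairs from a recognizer (assumed format)
    dictionary -- mapping of important word -> weight (important word dictionary)
    duration   -- number of time slots in the video
    radius     -- how many slots on each side of an occurrence are raised
    """
    importance = [0.0] * duration
    for t, word in recognized:
        weight = dictionary.get(word)
        if weight is None:
            continue  # the word is not registered as important
        lo = max(0, t - radius)
        hi = min(duration, t + radius + 1)
        for k in range(lo, hi):
            # Raise the importance by a value proportional to the word weight.
            importance[k] += weight
    return importance
```

Using the same weight for every dictionary entry corresponds to the fixed-value variant mentioned in the text.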

【０２００】全ての単語の重みを同一とした場合には、
重要単語辞書２２１は不要となる。これは、多くの単語
が話された場面は重要な場面であると想定していること
に相当する。このとき、ステップＳ２２２の単語照合処
理では、単に音声認識処理から出力される単語の数をカ
ウントする処理を行う。単語数ではなく、文字数をカウ
ントするようにしても良い。When all words are given the same weight, the important word dictionary 221 is unnecessary. This corresponds to assuming that a scene in which many words are spoken is an important scene. In this case, the word matching process of step S222 simply counts the number of words output by the voice recognition process. The number of characters may be counted instead of the number of words.

【０２０１】図４０は、さらに他の重要度レベル自動決
定方法の処理手順例である（図４０は機能ブロック図と
しても成立する）。FIG. 40 shows an example of the processing procedure of yet another method of automatically determining the importance level (FIG. 40 can also be read as a functional block diagram).

【０２０２】図４０の処理は、映像中に登場するテロッ
プに、あらかじめ登録されている重要単語が多く出現す
る場面を重要であると判断するものである。The processing shown in FIG. 40 is for judging that a scene in which many important words registered in advance appear in a telop appearing in a video is important.

【０２０３】ステップＳ２３０のテロップ認識処理で
は、映像中の文字位置を特定し、文字位置の映像領域を
２値化して文字認識を行う。認識された結果は、テキス
トデータとして出力される。In the telop recognition processing of step S230, the character position in the video is specified, and the video area at the character position is binarized to perform the character recognition. The recognized result is output as text data.

【０２０４】重要単語辞書２３１は、図３９の重要単語
辞書２２１と同様のものである。The important word dictionary 231 is similar to the important word dictionary 221 in FIG.

【０２０５】ステップＳ２３２の単語照合処理では、図
３９の手順におけるステップＳ２２２と同様に、テロッ
プ認識処理の出力であるテキストデータと重要単語辞書
２３１に登録されている単語を照合し、重要な単語が登
場したかどうかを判定する。In the word matching process of step S232, as in step S222 of the procedure of FIG. 39, the text data output by the telop recognition process is matched against the words registered in the important word dictionary 231 to determine whether an important word has appeared.

【０２０６】ステップＳ２３３の重要度算出処理では、
図３９の手順におけるステップＳ２２３と同様に、重要
単語の出現数、重要単語の重みから各場面または各時刻
における重要度を算出する。重要度算出処理の結果、各
時刻における重要度が決定され、重要度データとして出
力される。In the importance calculation process of step S233, as in step S223 of the procedure of FIG. 39, the importance of each scene or each time is calculated from the number of occurrences of important words and their weights. As a result of the importance calculation process, the importance at each time is determined and output as importance data.

【０２０７】全ての単語の重みを同一とした場合には、
重要単語辞書２３１は不要となる。これは、テロップと
して多くの単語が出現した場面は重要な場面であると想
定していることに相当する。このとき、ステップＳ２３
２の単語照合処理では、単にテロップ認識処理から出力
される単語の数をカウントする処理を行う。単語数では
なく、文字数をカウントするようにしても良い。When all words are given the same weight, the important word dictionary 231 is unnecessary. This corresponds to assuming that a scene in which many words appear as telops is an important scene. In this case, the word matching process of step S232 simply counts the number of words output by the telop recognition process. The number of characters may be counted instead of the number of words.

【０２０８】図４１は、さらに他の重要度レベル自動決
定方法の処理手順例である（図４１は機能ブロック図と
しても成立する）。FIG. 41 shows an example of the processing procedure of yet another method of automatically determining the importance level (FIG. 41 can also be read as a functional block diagram).

【０２０９】図４１の処理は、映像中に登場するテロッ
プの文字が大きいほど重要な場面であると判断するもの
である。The processing in FIG. 41 is for judging that the larger the character of the telop appearing in the video, the more important the scene.

【０２１０】ステップＳ２４０のテロップ検出処理で
は、映像中の文字列の位置を特定する処理を行う。In the telop detection processing in step S240, processing for specifying the position of the character string in the video is performed.

【０２１１】ステップＳ２４１の文字サイズ算出処理で
は、文字列から個々の文字を切り出し、文字の大きさ
（面積）の平均値または最大値を算出する。In the character size calculation process in step S241, individual characters are cut out from the character string, and the average value or the maximum value of the character size (area) is calculated.

【０２１２】ステップＳ２４２の重要度算出処理では、
文字サイズ算出処理の出力である文字サイズに比例した
重要度を算出する。算出された重要度が大きすぎたり小
さすぎたりした場合には、しきい値処理により重要度を
あらかじめ決められた範囲内に収める処理も行う。重要
度算出処理の結果、各時刻における重要度が決定され、
重要度データとして出力される。In the importance calculation process of step S242, an importance proportional to the character size output by the character size calculation process is calculated. If the calculated importance is too large or too small, threshold processing is also applied to keep it within a predetermined range. As a result of the importance calculation process, the importance at each time is determined and output as importance data.

【０２１３】図４２は、さらに他の重要度レベル自動決
定方法の処理手順例である（図４２は機能ブロック図と
しても成立する）。FIG. 42 shows an example of the processing procedure of yet another method of automatically determining the importance level (FIG. 42 can also be read as a functional block diagram).

【０２１４】図４２の処理は、映像中に人間の顔が登場
する場面は重要であると判断するものである。The processing in FIG. 42 is for judging that a scene where a human face appears in a video is important.

【０２１５】ステップＳ２５０の顔検出処理では、映像
中にある人間の顔らしい領域を検出する処理を行う。処
理の結果として、人間の顔と判断された領域の数（顔の
数）が出力される。顔の大きさ（面積）の情報も同時に
出力するようにしても良い。In the face detecting process in step S250, a process for detecting a human face-like region in the video is performed. As a result of the process, the number of areas determined to be human faces (the number of faces) is output. Information on the size (area) of the face may also be output at the same time.

【０２１６】ステップＳ２５１の重要度算出処理では、
顔検出処理の出力である顔の数を定数倍して重要度を算
出する。顔検出処理の出力が顔の大きさ情報を含む場合
には、重要度は顔の大きさとともに増大するように計算
される。例えば、顔の面積を定数倍して重要度を算出す
る。重要度算出処理の結果、各時刻における重要度が決
定され、重要度データとして出力される。In the importance calculation process of step S251, the importance is calculated by multiplying the number of faces output by the face detection process by a constant. When the output of the face detection process includes face size information, the importance is calculated so as to increase with face size; for example, the face area is multiplied by a constant. As a result of the importance calculation process, the importance at each time is determined and output as importance data.

【０２１７】図４３は、さらに他の重要度レベル自動決
定方法の処理手順例である（図４３は機能ブロック図と
しても成立する）。FIG. 43 shows an example of the processing procedure of yet another method of automatically determining the importance level (FIG. 43 can also be read as a functional block diagram).

【０２１８】図４３の処理は、あらかじめ登録されてい
る画像と類似した映像が登場する場面は重要である判断
するものである。The process shown in FIG. 43 is for judging that a scene in which a video similar to a previously registered image appears is important.

【０２１９】重要シーン辞書２６０には、重要と判断す
べき画像が登録されている。画像は生データとして記録
されていたり、データ圧縮された形式で記録されてい
る。画像そのものではなく、画像の特徴量（色ヒストグ
ラムや周波数など）を記録しておいても良い。In the important scene dictionary 260, images to be determined as important are registered. The image is recorded as raw data or recorded in a data compressed format. Instead of the image itself, the feature amount (color histogram, frequency, etc.) of the image may be recorded.

【０２２０】ステップＳ２６１の類似度／非類似度算出
処理では、重要シーンに登録されている画像と入力され
た画像データとの類似度または非類似度を算出する。非
類似度としては、２乗誤差の総和や絶対値差分の総和な
どが用いられる。重要シーン辞書２６０に画像データが
記録されている場合には、対応する画素ごとの２乗誤差
の総和や絶対値差分の総和などが非類似度として算出さ
れる。重要シーン辞書２６０に画像の色ヒストグラムが
記録されている場合には、入力された画像データに対し
て同様の色ヒストグラムを算出し、ヒストグラム同士の
２乗誤差の総和や絶対値差分の総和を算出して非類似度
とする。In the similarity/dissimilarity calculation process of step S261, the similarity or dissimilarity between the images registered as important scenes and the input image data is calculated. As the dissimilarity, the sum of squared errors, the sum of absolute differences, or the like is used. When image data is recorded in the important scene dictionary 260, the sum of squared errors or the sum of absolute differences over corresponding pixels is calculated as the dissimilarity. When color histograms of the images are recorded in the important scene dictionary 260, the same kind of color histogram is calculated for the input image data, and the sum of squared errors or the sum of absolute differences between the histograms is taken as the dissimilarity.

【０２２１】ステップＳ２６２の重要度算出処理では、
類似度／非類似度算出処理の出力である類似度または非
類似度から重要度を算出する。類似度が入力される場合
には類似度が大きいほど大きな重要度となるように、非
類似度が入力される場合には非類似度が大きいほど小さ
な重要度なるように重要度は計算される。重要度算出処
理の結果、各時刻における重要度が決定され、重要度デ
ータとして出力される。In the importance calculation process of step S262, the importance is calculated from the similarity or dissimilarity output by the similarity/dissimilarity calculation process. When a similarity is input, the importance is calculated so that a larger similarity gives a larger importance; when a dissimilarity is input, it is calculated so that a larger dissimilarity gives a smaller importance. As a result of the importance calculation process, the importance at each time is determined and output as importance data.
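The histogram-based variant of steps S261 and S262 might be sketched as follows. The two dissimilarity measures are those named in the text; taking the smallest dissimilarity over all dictionary entries and mapping it to importance with `scale / (1 + d)` are assumptions of this sketch, chosen only because the mapping decreases as the dissimilarity grows.

```python
def sum_squared_error(a, b):
    """Sum of squared errors between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sum_absolute_difference(a, b):
    """Sum of absolute differences between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def importance_from_dissimilarity(d, scale=100.0):
    """Larger dissimilarity gives smaller importance (assumed mapping)."""
    return scale / (1.0 + d)


def scene_importance(histogram, dictionary_histograms, scale=100.0):
    """Importance of one input frame against the registered histograms."""
    # Assumption: the best-matching registered scene decides the importance.
    d = min(sum_squared_error(histogram, ref) for ref in dictionary_histograms)
    return importance_from_dissimilarity(d, scale)
```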

【０２２２】さらに他の重要度レベル自動決定方法とし
て、瞬間視聴率の高い場面を重要とする方法がある。瞬
間視聴率のデータは、視聴率調査の集計結果として得ら
れるものであり、この瞬間視聴率を定数倍することで重
要度が算出される。もちろん、その他にも種々の方法が
ある。As yet another method of automatically determining the importance level, scenes with a high instantaneous audience rating may be regarded as important. Instantaneous audience rating data is obtained as the tabulated result of an audience rating survey, and the importance is calculated by multiplying this instantaneous rating by a constant. Of course, various other methods are also possible.

【０２２３】重要度の算出処理は、単独で用いてもよい
し、複数を同時に用いて重要度を算出するようにしても
よい。後者の場合には、例えば、いくつかの異なる方法
で一つの映像の重要度を算出し、最終的な重要度は平均
値または最大値として算出するようにしてもよい。The importance calculation processes may be used individually, or a plurality of them may be used together to calculate the importance. In the latter case, for example, the importance of one video may be calculated by several different methods, and the final importance may be taken as their average or maximum.
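Combining several importance curves by average or maximum, as described above, can be illustrated with a short sketch. The name `combine_importance` and the requirement that all curves be sampled at the same times are assumptions of the sketch.

```python
def combine_importance(curves, mode="mean"):
    """Combine per-time importance curves from several methods.

    curves -- list of equal-length lists, one per importance calculation method
    mode   -- "mean" for the average, "max" for the maximum
    """
    if not curves:
        raise ValueError("at least one importance curve is required")
    length = len(curves[0])
    if any(len(c) != length for c in curves):
        raise ValueError("all curves must cover the same time slots")
    if mode == "mean":
        return [sum(values) / len(values) for values in zip(*curves)]
    if mode == "max":
        return [max(values) for values in zip(*curves)]
    raise ValueError("mode must be 'mean' or 'max'")
```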

【０２２４】以上では画面変化量や重要度を例にとって
説明を行ったが、画面変化量およびまたは重要度ととも
に、あるいは画面変化量および重要度に代えて、その他
の１または複数種類の情報を用いる（フレーム情報に記
述する）ことも可能である。The above description has used the amount of screen change and the importance as examples, but one or more other types of information may be used (that is, described in the frame information) together with, or instead of, the amount of screen change and/or the importance.

【０２２５】次に、フレーム情報（図１参照）に、再生
／非再生の制御のための情報を付加する場合について説
明する。Next, a case where information for controlling reproduction / non-reproduction is added to the frame information (see FIG. 1) will be described.

【０２２６】映像データ中における、特定の場面あるい
は部分（例えばハイライトシーン）のみを再生したり、
特定の人物が登場している場面あるいは部分のみを再生
したいなどというように、映像の一部のみを見たいとい
う要求がある。There is a demand to view only part of a video, for example to play back only a specific scene or portion of the video data (such as a highlight scene), or only the scenes or portions in which a specific person appears.

【０２２７】この要求を満たすため、フレーム情報に、
再生するか非再生にするかを制御するための再生／非再
生情報を付加するようにしてもよい。これにより、再生
側では、この再生／非再生情報に基づいて、映像の一部
のみを再生したり、逆に映像の一部のみを再生しなかっ
たりすることができる。To meet this demand, reproduction/non-reproduction information for controlling whether to reproduce or not may be added to the frame information. On the playback side, this makes it possible, based on the reproduction/non-reproduction information, to play back only part of the video or, conversely, to skip only part of it.

【０２２８】図４４、図４５、図４６に、再生／非再生
情報を付加したデータ構造例を示す。FIGS. 44, 45 and 46 show examples of data structures to which reproduction / non-reproduction information is added.

【０２２９】図４４は、図８のデータ構造例において、
再生／非再生情報１２３を付加したものである。もちろ
ん、図４５、図４６は、図３４、図３５のデータ構造に
再生／非再生情報１２３を付加したものである。図示し
ていないが、図１のデータ構造例において、再生／非再
生情報を付加してもよい。FIG. 44 shows the data structure example of FIG. 8 with reproduction/non-reproduction information 123 added. Likewise, FIGS. 45 and 46 show the data structures of FIGS. 34 and 35 with reproduction/non-reproduction information 123 added. Although not shown, reproduction/non-reproduction information may also be added to the data structure example of FIG. 1.

【０２３０】再生／非再生情報１２３は、再生するか非
再生にするかの２値情報を指定する方法と、再生レベル
のような連続値を指定する方法がある。As the reproduction / non-reproduction information 123, there are a method of designating binary information of reproduction or non-reproduction, and a method of designating continuous values such as a reproduction level.

【０２３１】後者の場合には、例えば、再生時に再生レ
ベルがある閾値以上だったら再生し、そうでなければ非
再生とする。閾値は、例えば、ユーザが直接的にまたは
間接的に指定可能としてもよい。In the latter case, for example, when the reproduction level is higher than a certain threshold value during reproduction, the reproduction is performed, and otherwise, the reproduction is not performed. The threshold may be, for example, directly or indirectly specifiable by the user.

【０２３２】再生／非再生情報１２３は、独立した情報
とし保持してもよいが、再生か非再生かを選択的に指定
する場合において、表示時間情報１２１により示される
表示時間が特定の値（例えば、０あるいは−１など）の
ときに非再生であるとすることも可能である。あるい
は、重要度情報１２２により示される重要度が特定の値
（例えば、０あるいは−１など）のときに非再生である
とすることも可能である。この場合には、再生／非再生
情報１２３は付加しなくてよい。The reproduction/non-reproduction information 123 may be held as independent information. However, when reproduction or non-reproduction is specified as a binary choice, a frame may instead be treated as non-reproduced when the display time indicated by the display time information 121 has a specific value (for example, 0 or -1). Alternatively, it may be treated as non-reproduced when the importance indicated by the importance information 122 has a specific value (for example, 0 or -1). In these cases, the reproduction/non-reproduction information 123 need not be added.
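The three ways of expressing reproduction or non-reproduction described above (a binary flag, a continuous reproduction level compared with a threshold, and a sentinel display time) might be handled on the playback side as in the following sketch. The class `FrameInfo`, its field names, the default threshold of 0.5, and the rule that any display time of 0 or less is a sentinel are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class FrameInfo:
    position: int                      # video position information
    display_time: float                # display time information
    # Reproduction/non-reproduction information: True/False for the binary
    # form, a number for a reproduction level, None when it is not stored.
    play: Optional[Union[bool, float]] = None


def is_played(frame, threshold=0.5):
    """Decide whether a frame is reproduced."""
    if frame.play is None:
        # No independent information: a display time such as 0 or -1
        # is treated as the sentinel for non-reproduction (assumption).
        return frame.display_time > 0
    if isinstance(frame.play, bool):
        return frame.play
    # Continuous reproduction level: reproduce at or above the threshold.
    return frame.play >= threshold
```

The threshold would typically be supplied by the user, directly or indirectly, as the text suggests.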

【０２３３】再生か非再生かをレベル値で指定する場合
においても、表示時間情報１２１およびまたは重要度情
報１２２（ただし、重要度をレベル値で表す場合）で代
用することも可能である。In the case where reproduction or non-reproduction is designated by a level value, display time information 121 and / or importance information 122 (when importance is represented by a level value) can be substituted.

【０２３４】再生／非再生情報１２３を独立した情報と
して保持する場合は、データ量がその分増えるが、再生
側で非再生指定部分を再生しないようにしてダイジェス
トを見ることもできるし、非再生指定部分も再生して映
像の全部を見ることも可能となる（再生／非再生情報１
２３を独立した情報として保持しない場合は、非再生指
定部分も再生して映像の全部を見るためには、例えば０
として指定されている表示時間を適宜変更する必要があ
る）。When the reproduction/non-reproduction information 123 is held as independent information, the amount of data increases accordingly, but the playback side can either view a digest by skipping the portions designated as non-reproduction, or view the entire video by also playing those portions. (When the reproduction/non-reproduction information 123 is not held as independent information, viewing the entire video including the portions designated as non-reproduction requires the display time that has been set to, for example, 0 to be changed appropriately.)

【０２３５】再生／非再生情報１２３は、人間が入力し
てもよいし、なんらかの条件より決定してもよい。例え
ば、映像の動き情報から動きが一定値以上大きいときは
再生、そうでなければ非再生とすれば、動きの激しいと
ころのみ再生できるし、色情報から肌色が一定値より大
きいか小さいかから決定すれば人物がいるところのみ再
生できる。音の大小によって決定する手法、あらかじめ
入力されている再生プログラム情報から決定する手法も
考えられる。重要度をなんらかの手法で決定しておき、
重要度情報から再生／非再生情報１２３を生成してもよ
い。再生／非再生情報を連続値としたときは、これらの
情報を適当な関数で再生／非再生情報に変換することに
よって求ればよい。The reproduction/non-reproduction information 123 may be input by a human or determined from some condition. For example, if a portion is reproduced when the motion obtained from the motion information of the video is at or above a certain value and not reproduced otherwise, only the portions with vigorous motion are played; if the decision is made according to whether the amount of skin color obtained from the color information is larger or smaller than a certain value, only the portions containing people are played. Deciding by the loudness of the sound, or from playback program information entered in advance, is also conceivable. The importance may also be determined by some method and the reproduction/non-reproduction information 123 generated from the importance information. When the reproduction/non-reproduction information is a continuous value, it may be obtained by converting these pieces of information with an appropriate function.

【０２３６】図４７は、再生／非再生情報１２３に基づ
いて、再生／非再生の制御を行って再生した例を示す。FIG. 47 shows an example in which reproduction is performed by controlling reproduction / non-reproduction based on the reproduction / non-reproduction information 123.

【０２３７】図４７において、元映像２１５１を、Ｆ_１
〜Ｆ_６で表される映像フレーム位置情報または映像フレ
ーム群位置情報２１５３と、Ｄ_１〜Ｄ_６で表される表示
時間情報２１５４に基づいて再生するとする。このと
き、再生／非再生情報２１５５は、表示時間情報２１５
４に付加されるものとする。この例において、Ｄ_１，Ｄ
_２，Ｄ_４，Ｄ_６の区間が再生となり、それ以外の区間が
非再生となった場合には、再生映像２１５２としては、
Ｄ_１，Ｄ_２，Ｄ_４，Ｄ_６の区間が連続的に再生される
（それ以外は非再生となる）。In FIG. 47, assume that the original video 2151 is played back based on video frame position information or video frame group position information 2153, denoted F_1 to F_6, and display time information 2154, denoted D_1 to D_6. Here, reproduction/non-reproduction information 2155 is assumed to be attached to the display time information 2154. In this example, when the sections D_1, D_2, D_4, and D_6 are designated as reproduction and the other sections as non-reproduction, the sections D_1, D_2, D_4, and D_6 are played back continuously as the reproduced video 2152 (the other sections are not played).

【０２３８】例えば、再生映像のフレームＦ_ｉにおい
て、再生／非再生情報１２３が再生を示すものであった
ときの表示時間をＤ⁺ _ｉ、非再生であったときの表示時
間をＤ^- _ｉとしたとき、元映像の再生部分の総時間が
Ｔ’であるとすると、Σ_ｉＤ⁺ _ｉ＝Ｔ’になる。通常
は、Ｄ⁺ _ｉは、元映像と等倍速に表示時間を設定してお
く。あらかじめ決め事として暗黙の固定された倍速とし
ても良いし、何倍速に設定するかの情報を記述しても良
い。Ｎ倍速再生したい場合は、再生部分の表示時間Ｄ⁺
_ｉを１／Ｎ倍する。例えば、決められた時間Ｄ’で再生
を行うようにするためには、各再生部分の表示時間Ｄ⁺
_ｉをＤ’／Σ_ｉＤ⁺ _ｉ倍に加工して表示すれば良い。For example, for frame F_i of the reproduced video, let D⁺_i be the display time when the reproduction/non-reproduction information 123 indicates reproduction, and D⁻_i the display time when it indicates non-reproduction. If the total time of the reproduced portions of the original video is T', then Σ_i D⁺_i = T'. Normally, D⁺_i is set so that the display time corresponds to the same speed as the original video. A fixed speed multiple may be agreed on implicitly in advance, or information stating the speed multiple may be described. For N-times speed playback, the display time D⁺_i of each reproduced portion is multiplied by 1/N. To complete playback within a given time D', for example, the display time D⁺_i of each reproduced portion is multiplied by D' / Σ_i D⁺_i before display.
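The two scalings just described, multiplying each D⁺_i by 1/N for N-times speed and by D' / Σ_i D⁺_i to fit a given time D', can be written down directly. The function names `scale_for_speed` and `fit_to_duration` are hypothetical and chosen only for this sketch.

```python
def scale_for_speed(display_times, n):
    """N-times speed playback: multiply each D+_i by 1/N."""
    if n <= 0:
        raise ValueError("speed multiple must be positive")
    return [t / n for t in display_times]


def fit_to_duration(display_times, target):
    """Fit playback into a given time D': multiply each D+_i by D' / sum(D+_i)."""
    total = sum(display_times)
    if total <= 0:
        raise ValueError("total display time must be positive")
    factor = target / total
    return [t * factor for t in display_times]
```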

【０２３９】フレーム情報に基づいて各フレーム（また
はフレーム群）の表示時間を決定する場合に、決定され
た表示時間を調整するようにしてもよい。When the display time of each frame (or frame group) is determined based on the frame information, the determined display time may be adjusted.

【０２４０】決定された表示時間を調整しない方法で
は、非再生の区間が発生したことを考慮せずに決定され
た表示時間をそのまま用いるので、非再生の区間にもと
もと０を越える表示時間が割り当てられていた場合に
は、その分だけ全体の表示時間が短くなる。In the method in which the determined display time is not adjusted, the determined display time is used as it is without considering the occurrence of the non-reproduction section. Therefore, the display time exceeding 0 is originally assigned to the non-reproduction section. If so, the entire display time is shortened accordingly.

【０２４１】決定された表示時間を調整する方法では、
例えば、非再生の区間にもともと０を越える表示時間が
割り当てられている場合には、非再生の区間を再生した
ときと全体の表示時間が同じになるように、再生する各
フレーム（またはフレーム群）の表示時間に一定数を乗
じて、調整を行う。In the method that adjusts the determined display time, for example, when a display time exceeding 0 was originally assigned to a non-reproduction section, the display time of each frame (or frame group) to be reproduced is multiplied by a constant so that the overall display time is the same as it would be if the non-reproduction sections were also played.
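A minimal sketch of this adjustment follows: the reproduced frames are stretched by one common factor so that their total equals the total that would have resulted had every section been played. The name `adjust_for_skipped` and the representation of the reproduction decision as a list of booleans are assumptions of the sketch.

```python
def adjust_for_skipped(display_times, played):
    """Rescale reproduced frames so the overall display time is preserved.

    display_times -- display time originally determined for every frame
    played        -- one boolean per frame, True when the frame is reproduced
    """
    full_total = sum(display_times)
    played_total = sum(t for t, p in zip(display_times, played) if p)
    if played_total <= 0:
        raise ValueError("no reproduced frame has a positive display time")
    factor = full_total / played_total  # the constant applied to every played frame
    return [t * factor if p else 0.0 for t, p in zip(display_times, played)]
```

Skipping this step corresponds to the unadjusted method, in which the overall display time shrinks by the time that had been assigned to the non-reproduction sections.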

【０２４２】ユーザが、調整するか否かを選択可能とし
てもよい。The user may be allowed to select whether or not to make adjustments.

【０２４３】ユーザがＮ倍速再生を指定した場合にも、
決定された表示時間を調整せずにＮ倍速再生の処理を行
ってもよいし、決定された表示時間を上記のようにして
調整した後の表示時間を基礎としてＮ倍速再生の処理を
行ってもよい（前者の方が表示時間が短くなる）。Even when the user specifies N-times speed playback, the N-times speed processing may be applied without adjusting the determined display time, or it may be applied on the basis of the display time obtained after adjusting the determined display time as described above (the former results in a shorter display time).

【０２４４】ユーザが全体の表示時間を指定可能として
もよい。この場合にも、例えば、指定された全体の表示
時間になるように、再生する各フレーム（またはフレー
ム群）の表示時間に一定数を乗じて、調整を行うように
してもよい。The user may be able to specify the entire display time. Also in this case, for example, the adjustment may be performed by multiplying the display time of each frame (or frame group) to be reproduced by a certain number so that the designated entire display time is reached.

【０２４５】図４８は、再生／非再生情報１２３に基づ
いて映像の一部のみを再生する処理手順の一例を示す。FIG. 48 shows an example of a processing procedure for reproducing only a part of the video based on the reproduction / non-reproduction information 123.

【０２４６】ステップＳ１６２で該フレームのフレーム
情報（映像位置情報及び表示時間情報）を読み出し、ス
テップＳ１６３で表示時間情報内の再生／非再生情報よ
り該フレームを再生するか、非再生とするかを判断す
る。In step S162, the frame information (video position information and display time information) of the frame is read, and in step S163 it is determined from the reproduction/non-reproduction information in the display time information whether the frame is to be reproduced or not.

【０２４７】判断結果が再生であれば、ステップＳ１６
４で表示時間分だけ該フレームを表示する。そうでなけ
れば、そのフレームは再生せず、次のフレームの処理に
移る。If the result of the determination is reproduction, the frame is displayed for its display time in step S164. Otherwise, the frame is not reproduced and processing moves to the next frame.

【０２４８】ステップＳ１６１で再生すべき映像が終了
したかどうかを判別し、映像が終了したら、再生処理も
終了する。In step S161, it is determined whether or not the video to be reproduced has ended. When the video has ended, the reproduction process ends.

【０２４９】ところで、ステップＳ１６３で該フレーム
を再生するか非再生にするかを判断するときには、単純
に再生／非再生情報が再生であれば再生し、非再生であ
れば再生しないという以外に、ユーザーの好みによって
非再生部を再生するか再生しないかを決定したいことが
ある。このときは、映像の再生前にあらかじめ非再生部
を再生するか再生しないかをユーザープロファイルなど
から決定しておき、非再生部を再生するときは必ずステ
ップＳ１６４でフレームの再生を行うようにする。When determining in step S163 whether to reproduce the frame, besides simply reproducing it if the reproduction/non-reproduction information indicates reproduction and skipping it otherwise, it is sometimes desirable to decide according to the user's preference whether the non-reproduction portions are played. In that case, whether to play the non-reproduction portions is decided in advance, before playback of the video, from a user profile or the like, and when they are to be played, the frame is always reproduced in step S164.
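The loop of FIG. 48, including the user-preference override just described, might be sketched as below. The callback `show`, the tuple layout of each frame entry, and the flag `play_skipped` (standing in for a setting read from a user profile) are hypothetical names introduced for this sketch.

```python
def play_selected(frames, show, play_skipped=False):
    """Reproduce only the frames marked for reproduction.

    frames       -- iterable of (position, display_time, play_flag) entries
    show         -- callback that displays one frame for the given time
    play_skipped -- True when the user prefers to see non-reproduction parts too
    """
    # The loop ends when the frame information is exhausted (step S161).
    for position, display_time, play_flag in frames:  # read frame info (S162)
        if play_flag or play_skipped:                 # reproduce or not (S163)
            show(position, display_time)              # display the frame (S164)
        # Otherwise the frame is skipped and the next one is processed.
```

For example, passing `shown.append` wrapped in a small lambda as `show` collects the frames that would be displayed, which is convenient for checking a digest before rendering it.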

【０２５０】その他にも、再生／非再生情報が再生レベ
ルとして連続値として保存されていたときは、ユーザー
プロファイルから再生と非再生を区別する閾値を求め、
再生／非再生情報が閾値を超えているかどうかで再生す
るか非再生にするかを判断するようにしてもよい。ユー
ザープロファイルを使う以外にも、例えば、各フレーム
ごとに設定された重要度から閾値を計算したり、ユーザ
ーからあらかじめ、またはリアルタイムに再生するか再
生しないかの情報を受け取ってもよい。In addition, when the reproduction/non-reproduction information is stored as a continuous value representing a reproduction level, a threshold separating reproduction from non-reproduction may be obtained from the user profile, and whether to reproduce may be decided according to whether the reproduction/non-reproduction information exceeds the threshold. Besides using the user profile, the threshold may, for example, be calculated from the importance set for each frame, or information on whether to reproduce may be received from the user in advance or in real time.

【０２５１】このように、フレーム情報に、再生するか
非再生にするかを制御するための再生／非再生情報１２
３を付加することによって、映像の一部のみを再生する
ことが可能となり、ハイライトシーンのみを再生した
り、興味有る人物や物体が出ているシーンのみを再生し
たりすることが可能となる。As described above, adding to the frame information the reproduction/non-reproduction information 123 for controlling whether to reproduce makes it possible to play back only part of the video, for example only the highlight scenes or only the scenes in which a person or object of interest appears.

【０２５２】次に、フレーム情報（図１参照）に、表示
される映像に関連した映像以外のメディア（例えばテキ
ストや音）の位置情報と、それらを表示もしくは再生す
る時間を付加情報とする場合の記述方法について説明す
る。Next, a description method is explained for the case where the frame information (see FIG. 1) carries, as additional information, the position information of media other than video (for example, text or sound) related to the displayed video, together with the time for which they are displayed or played.

[0253] FIG. 8 shows an example in which each piece of frame information 100 contains video position information 101 and display time information 102; FIG. 34 shows one in which it contains video position information 101 and importance information 103; FIG. 35 shows one in which it contains video position information 101, display time information 102, and importance information 103; and FIGS. 44, 45, and 46 show examples that additionally contain reproduction/non-reproduction information 123. In any of these, zero or more pairs of sound position information 2703 and sound reproduction time information 2704, and zero or more pairs of text information 2705 and text display time information 2706, may further be added (provided that at least one of the two counts is one or more).

[0254] FIG. 49 shows an example in which one pair of sound position information 2703 and sound reproduction time information 2704 and N pairs of text information 2705 and text display time information 2706 are added to the data structure example of FIG. 8.
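The layout of FIG. 49 can be pictured with the following sketch. The class and field names are hypothetical, and the units (seconds, frame numbers) are assumptions; only the reference numerals in the comments come from the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SoundEntry:
    position: float   # sound position information (2703)
    duration: float   # sound reproduction time information (2704)


@dataclass
class TextEntry:
    text: str         # text information (2705)
    duration: float   # text display time information (2706)


@dataclass
class FrameInfoWithMedia:
    video_position: int    # video position information (101)
    display_time: float    # display time information (102)
    sounds: List[SoundEntry] = field(default_factory=list)
    texts: List[TextEntry] = field(default_factory=list)

    def __post_init__(self):
        # Zero or more of each kind is allowed, but at least one entry overall.
        if not self.sounds and not self.texts:
            raise ValueError("at least one sound or text entry is required")


# One sound pair and N = 2 text pairs, as in the FIG. 49 example.
info = FrameInfoWithMedia(
    video_position=1200,
    display_time=2.0,
    sounds=[SoundEntry(40.0, 2.0)],
    texts=[TextEntry("Goal", 1.0), TextEntry("Replay", 1.0)],
)
```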

[0255] Sound is reproduced from the position stored in the sound position information 2703 for the duration stored in the sound reproduction time information 2704. The sound to be reproduced may be the sound information that originally accompanied the video, or newly created material such as background music may be added.

[0256] The text stored in the text information 2705 is displayed for the duration stored in the text display time information 2706. A plurality of pieces of text information may be added to one video frame.

[0257] Reproduction of the sound and display of the text start at the same time as the associated video frame is displayed. The sound reproduction time and the text display time also fall within the display time of the associated video frame. To reproduce continuous sound across a plurality of video frames, the sound position information and reproduction times are set so that they are contiguous.
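As an illustration of setting the sound position and reproduction time contiguously across frames, the sketch below assigns each frame a sound entry that begins where the previous one ended. The function name and the use of seconds are assumptions for the example.

```python
def contiguous_sound_entries(display_times, start_position):
    """Return one (sound_position, reproduction_time) pair per video frame.

    Each frame's sound starts where the previous frame's sound ended and lasts
    exactly as long as the frame is displayed, so playback is heard as one
    uninterrupted stretch of audio.
    """
    entries = []
    position = start_position
    for display_time in display_times:
        entries.append((position, display_time))
        position += display_time
    return entries


entries = contiguous_sound_entries([2.0, 1.0, 3.0], start_position=10.0)
# entries == [(10.0, 2.0), (12.0, 1.0), (13.0, 3.0)]
```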

[0258] Such a method also makes summary audio, summary text, and the like possible.

[0259] FIG. 50 shows an example of a method of describing sound information separately from the frame information. This is an example of a data structure for reproducing, during special reproduction, the audio related to the displayed video frame. A set consisting of position information 2801 indicating the location of the audio to be reproduced, a time 2802 at which reproduction of the audio starts, and a duration 2803 for which reproduction continues constitutes one piece of sound information 2800, and the data is described as an array of such sound information.

[0260] FIG. 51 shows an example of a data structure for describing text information. It has the same structure as the sound information of FIG. 50: a set consisting of character codes 2901 of the text to be displayed, a display start time 2902, and a display time 2903 constitutes one piece of text information 2900, and the data is described as an array of such text information. In place of the character codes, the information corresponding to 2901 may be position information indicating where the character codes are stored, or where the characters are stored as an image.

[0261] The sound information and text information described above are synchronized with the display of the video frames and are presented as information related to the displayed video frame or to a certain video section containing that frame. As shown in FIG. 52, reproduction or display of each piece of sound information or text information starts as time advances along the time axis 3001. First, the video 3002 is reproduced by displaying the video frames in the order in which they are described, each for its described display time. Reference numerals 3005, 3006, and 3007 each denote a video frame to which a predetermined display time is assigned. The sound 3003 starts to be reproduced at the reproduction start time described in each piece of sound information and stops once the described reproduction time has elapsed. As shown in FIG. 52, a plurality of sounds 3008 and 3009 may be reproduced simultaneously. Like the sound, the text 3004 is displayed from the display start time described in each piece of text information and is removed once the described display time has elapsed. A plurality of texts 3010 and 3011 may be displayed simultaneously.
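The timing behavior of FIG. 52 can be sketched as a lookup of which sound or text entries are active at a given time. The `TimedItem` class and the payload strings are hypothetical; the fields correspond to the start time and duration of the sound information (2802, 2803) or text information (2902, 2903).

```python
from dataclasses import dataclass


@dataclass
class TimedItem:
    payload: str      # audio location (2801) or text (2901)
    start: float      # reproduction/display start time (2802 / 2902)
    duration: float   # reproduction/display time (2803 / 2903)


def active_at(items, t):
    """Return the payloads being reproduced or displayed at time t.

    An item is active from its start time until its duration has elapsed;
    several items may be active at once.
    """
    return [i.payload for i in items if i.start <= t < i.start + i.duration]


sounds = [TimedItem("narration", 0.0, 4.0), TimedItem("music", 3.0, 2.0)]
overlap = active_at(sounds, 3.5)   # both sounds are playing
later = active_at(sounds, 4.0)     # the first sound has stopped
```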

[0262] The sound reproduction start time and the text display start time need not coincide with the time at which a video frame is displayed, and the sound reproduction time and the text display time need not coincide with the display time of a video frame. These can be set freely; conversely, the display time of a video frame may be changed according to the sound reproduction time or the text display time.

[0263] These values can also be set manually by a person.

[0264] FIGS. 50 and 51 are examples in which the sound and text information is described separately from the video frame information and is reproduced or displayed in synchronization with the video. Alternatively, only the sound and text information may be described, independently of the video, and summary reproduction of the sound and summary display of the text may be performed.

[0265] To spare a person the effort of making these decisions, it is preferable to consider the events likely to occur in video scenes regarded as important and to use processing that sets such events automatically. Several examples of automatic setting follow.

[0266] FIG. 53 shows an example of a processing procedure that obtains shots, a shot being a continuous video section from one scene change to the next, and uses the sum of the display times of the video frames contained in each shot as the audio reproduction time (FIG. 53 can also be read as a functional block diagram).

[0267] In step S3101, shots are detected from the video. For this, a method such as that of "Moving image cut detection method from MPEG bitstream using likelihood ratio test" (IEICE Transactions, Vol. J82-D-II, No. 3, pp. 361-370, 1999) is used.

[0268] In step S3102, the position information of the video frames is referred to in order to determine which shot each video frame belongs to. The display time of each shot is then obtained by summing the display times of its video frames.

[0269] For example, the sound position information may be set to the audio position corresponding to the start of the shot, the sound reproduction start time may be aligned with the display time of the first video frame belonging to that shot, and the sound reproduction time may be made equal to the display time of the shot. Alternatively, the display times of the video frames contained in each shot may be modified according to the sound reproduction time. Although shots are detected here, when the frame information has a data structure that describes importance information, the importance of the video frames may be used instead: sections in which the importance is at or above a threshold are obtained, and the sound contained in those sections is reproduced.

[0270] If the reproduction time thus obtained falls short of a certain criterion, the audio concerned need not be reproduced.
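The procedure of FIG. 53 can be sketched as follows, assuming that shot detection has already produced a sorted list of shot start positions. The names `Frame`, `sound_info_for_shots`, and `min_duration` are hypothetical, and positions and times are taken to be in seconds.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Frame:
    position: float       # position of the frame in the original video
    display_time: float   # display time during special reproduction


def sound_info_for_shots(frames, shot_starts, min_duration=0.0):
    """Return (sound_position, start_time, reproduction_time) for each shot.

    shot_starts must be sorted. The sound position is the start of the shot,
    the start time is when the shot's first selected frame is displayed, and
    the reproduction time is the sum of the display times of the shot's frames.
    Shots whose total falls below min_duration get no sound.
    """
    totals = {}   # shot index -> [start time on the playback clock, total display time]
    clock = 0.0
    for f in frames:
        shot = bisect_right(shot_starts, f.position) - 1
        if shot >= 0:   # ignore frames located before the first detected shot
            entry = totals.setdefault(shot, [clock, 0.0])
            entry[1] += f.display_time
        clock += f.display_time
    return [
        (shot_starts[s], start, total)
        for s, (start, total) in sorted(totals.items())
        if total >= min_duration
    ]


frames = [Frame(1.0, 0.5), Frame(5.0, 0.5), Frame(12.0, 0.25),
          Frame(21.0, 1.0), Frame(25.0, 1.0)]
sound_info = sound_info_for_shots(frames, [0.0, 10.0, 20.0], min_duration=0.5)
```

In this example the second shot contributes only 0.25 seconds of display time, so no sound is generated for it.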

[0271] FIG. 54 shows an example of a processing procedure that extracts important words by speech recognition from the audio data corresponding to a shot or a video section of high importance and reproduces the word, audio containing the word, or audio combining a plurality of words (FIG. 54 can also be read as a functional block diagram).

[0272] In step S3201, shots are detected. Instead of shots, the video sections of high importance described above may be obtained.

[0273] In step S3202, speech recognition is performed on the section of audio data corresponding to each video section obtained.

[0274] In step S3203, audio containing an important word, or the audio of the important-word portion, is obtained from the recognition result. Important words are selected by referring to the important word dictionary 3204.

[0275] In step S3205, the audio for reproduction is created. Continuous audio containing the important words may be used as it is, or only the important words may be extracted. Audio combining a plurality of important words may also be created.

[0276] In step S3206, the display times of the video frames are modified according to the reproduction time of the created audio. Alternatively, the number of selected words may be reduced to shorten the sound reproduction time so that it falls within the display time of the video frames.
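The selection of important-word audio and its shortening to fit the frame display time can be sketched as below. The recognizer output format, a list of (word, start, end) tuples, is an assumption for illustration, as are the function name and the choice to keep the earliest words first.

```python
def pick_word_segments(recognized, important_words, max_duration):
    """Select audio segments of important words that fit within max_duration.

    recognized: list of (word, start, end) tuples in seconds (hypothetical
    speech recognizer output). Words are taken in order of appearance, and
    selection stops once adding another word would exceed max_duration,
    which reduces the number of selected words.
    """
    picked = []
    total = 0.0
    for word, start, end in recognized:
        if word not in important_words:
            continue
        length = end - start
        if total + length > max_duration:
            break
        picked.append((start, end))
        total += length
    return picked, total


recognized = [("the", 0.0, 0.25), ("goal", 0.25, 0.75), ("by", 0.75, 1.0),
              ("replay", 1.0, 1.75), ("penalty", 2.0, 2.75)]
segments, total = pick_word_segments(
    recognized, {"goal", "replay", "penalty"}, max_duration=1.5)
```

Here `max_duration` stands for the display time of the associated video frames; the other option in step S3206, lengthening the frame display times to match the audio, is not shown.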

[0277] FIG. 55 shows an example of a procedure for acquiring text information from telops, that is, captions superimposed on the video (FIG. 55 can also be read as a functional block diagram).

[0278] In the processing of FIG. 55, text information is acquired from telops displayed in the video or from audio.

[0279] In step S3301, the telops displayed in the video are read. This can be done by extracting the telops in the original video automatically, for example by the method described in Osamu Hori, "Character part extraction method from video for telop area," CVIM 114-17, pp. 129-136 (1999), or by having a person read the telops and enter them manually.

[0280] In step S3302, important words are extracted from the telop character strings that were read. The important word dictionary 3303 is used to determine which words are important. The telop character strings may of course be used as text information as they are. Alternatively, the extracted words may be arranged to compose a sentence that represents the video section using only the important words, and this sentence may be used as text information.
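Dictionary-based extraction of important words from a recognized caption string can be sketched as follows. The sketch assumes whitespace-separated words and a lowercase dictionary; Japanese captions would need a morphological analyzer instead of `split()`, and the function name is hypothetical.

```python
def extract_important_words(caption, dictionary):
    """Return the words of the caption that appear in the important word dictionary.

    Matching ignores case and surrounding punctuation; the original spelling
    of each word is kept in the result.
    """
    return [
        word for word in caption.split()
        if word.lower().strip(".,!?") in dictionary
    ]


words = extract_important_words("Typhoon approaches Tokyo tonight",
                                {"typhoon", "tokyo"})
text_information = " ".join(words)   # "Typhoon Tokyo"
```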

[0281] FIG. 56 shows an example of a processing procedure for acquiring text information from audio (FIG. 56 can also be read as a functional block diagram).

[0282] The audio is recognized by the speech recognition processing of step S3401.

[0283] In step S3402, important words are extracted from the recognized speech data. The important word dictionary 3403 is used to determine which words are important. The recognized speech data may of course be used as text information as it is. Alternatively, the extracted words may be arranged to compose a sentence that represents the video section using only the important words, and this sentence may be used as text information.

[0284] FIG. 57 shows an example of a processing procedure that extracts text by telop recognition from a shot or a video section of high importance and creates text information (FIG. 57 can also be read as a functional block diagram).

[0285] In step S3501, shots are detected from the video. Instead of shots, sections of high importance may be obtained.

[0286] In step S3502, the telops displayed during each video section are recognized.

[0287] In step S3503, important words are extracted using the important word dictionary 3504.

[0288] In step S3505, the text for display is created. A telop character string containing an important word may be used, or the text information may be a single important word or a character string composed of a plurality of important words. When text information is to be obtained by speech recognition, the telop recognition processing of step S3502 is replaced with speech recognition processing, with audio data as the input. The text information is displayed in time with the video frame in which the text appeared as a telop, or with the video frame at the time when it was reproduced as audio. Alternatively, all the text information in the video section may be displayed at once.

[0289] FIG. 58 shows display examples of text information. The screen may be divided into a text information display section 3601 and a video display section 3602 as in FIG. 58(a), or the text information may be superimposed on the video display section 3603 as in FIG. 58(b).

[0290] The display times (reproduction times) of the video frames, sound information, and text information are adjusted so that all the media information is synchronized. For example, when video is reproduced at double speed, important audio is first extracted by the method described above so as to obtain audio information lasting half the normal reproduction time. Next, display times are assigned to the video frames related to each piece of audio. When the display times of the video frames are determined so that the amount of change on the screen is constant, the audio reproduction time and the text display time are each kept within the display time of the related video frame. Alternatively, sections containing a plurality of video frames, such as shots, are obtained, and the audio or text contained in each section is reproduced or displayed according to the display time of that section.
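Two small calculations from the synchronization described above can be sketched as follows: the audio time budget for a given speed factor, and the trimming of a sound or text interval so that it stays within the display interval of its video frame. Both function names are hypothetical, and times are assumed to be in seconds.

```python
def audio_budget(normal_duration, speed):
    """Total audio time available when the video is reproduced at the given speed."""
    return normal_duration / speed


def clamp_to_frame(frame_start, frame_time, media_start, media_time):
    """Trim a sound or text interval so it lies within the frame's display interval.

    Returns (start, duration); the duration is 0.0 if the two intervals
    do not overlap.
    """
    start = max(media_start, frame_start)
    end = min(media_start + media_time, frame_start + frame_time)
    return (start, max(0.0, end - start))


budget = audio_budget(60.0, 2.0)                 # 30.0 seconds at double speed
trimmed = clamp_to_frame(2.0, 1.5, 2.0, 3.0)     # sound cut to the frame's 1.5 s
```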

[0291] The description so far has centered on video data, but a system that deals mainly with audio data is of course also possible.

[0292] The summary display discussed so far has centered on video, but sound information and text information can also be used in a form that has no frame information (that is, no video). In that case, a summary consisting only of sound information and text information is created for the original video. A summary consisting only of sound information and text information can likewise be created for voice data or music data.

[0293] In that case, as with frame information, original data information describing the correspondence with the original voice or music data may be added to the sound information or text information.

[0294] FIG. 59 shows an example of a data structure in which original data information 4901 is included in the sound information having the data structure shown in FIG. 50. When the input is video, the original data information 4901 is a time indicating a section of the video (start point information 4902 and section length information 4903); when the input is voice or music, it is a time indicating a section of the voice or music.

[0295] FIG. 60 shows an example of a data structure in which original data information 4901 is included in sound information having a data structure corresponding to that of FIG. 30.

[0296] FIG. 61 illustrates an example of summarizing voice or music using sound information. In this example, the original voice or music is divided into several sections, and a part of each section is cut out as the summary voice or music for that section to create a summary. For example, the portion 5001 of section 2 is cut out as summary voice or music and reproduced as the section 5002 of the summary. Possible ways of dividing the data into sections include dividing music by movement and dividing conversation by topic.
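The summary of FIG. 61 can be sketched as a list of sound information entries that each carry their original section, laid out back to back on the summary timeline. The class `SummarySound`, the function `build_summary`, and the tuple layout of the input are assumptions for illustration; the reference numerals in the comments follow FIGS. 50 and 59.

```python
from dataclasses import dataclass


@dataclass
class SummarySound:
    position: float        # where the excerpt starts in the original audio (2801)
    start_time: float      # when the excerpt starts within the summary (2802)
    duration: float        # how long the excerpt is reproduced (2803)
    source_start: float    # start point of the original section (4902)
    source_length: float   # length of the original section (4903)


def build_summary(excerpts):
    """Lay excerpts end to end on the summary timeline.

    excerpts: list of (source_start, source_length, position, duration).
    Each excerpt must lie inside the original section it summarizes.
    """
    summary = []
    clock = 0.0
    for source_start, source_length, position, duration in excerpts:
        inside = (source_start <= position and
                  position + duration <= source_start + source_length)
        if not inside:
            raise ValueError("excerpt must lie within its original section")
        summary.append(
            SummarySound(position, clock, duration, source_start, source_length))
        clock += duration
    return summary


# Two sections of the original audio, each summarized by a short excerpt.
summary = build_summary([(0.0, 60.0, 10.0, 5.0), (60.0, 90.0, 100.0, 8.0)])
```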

[0297] As with frame information, including descriptions of the original data file and the section in the sound information or text information makes it possible to summarize a plurality of voice or music data items together. If identification information is assigned to each original data item, the original data identification information may be used instead of describing the original data file and section.

[0298] FIG. 62 illustrates another example of summarizing voice or music using sound information. In this example, a part of each of a plurality of voice or music data items is cut out as summary voice or music to create a summary. For example, the portion 5101 of voice/music 2 is cut out as summary voice or music and reproduced as the section 5102 of the summary. One conceivable use is to cut out part of each song on a music album and assemble the excerpts into summary data for trial listening.

[0299] Where it is helpful to know the song titles, as when summarizing an album, the title of the music data may be included in the sound information. This information is of course not mandatory.

[0300] Next, methods of providing video data and object region data are described.

[0301] When the special reproduction control information created by the processing of this embodiment is to be made available to users, it must be provided from the creator to the user in some way. Various forms of provision are conceivable, as exemplified below.
(1) The video data and its special reproduction control information are recorded on one (or more) recording media and provided together.
(2) The video data is recorded and provided on one (or more) recording media, and the special reproduction control information is separately recorded and provided on one (or more) recording media.
(3) The video data and its special reproduction control information are provided via a communication medium on the same occasion.
(4) The video data and its special reproduction control information are provided via a communication medium on different occasions.
As a result, by describing, as control information for special reproduction of video content, an array of a plurality of pieces of frame information, each containing the method of acquiring a frame (or frame group) selectively extracted from the original video and the display time information assigned to that frame (or frame group) or the information from which it is derived, the reproducing side can perform effective special reproduction based on the control information.

[0302] As described above, according to this embodiment, in a special reproduction control information description method for describing special reproduction control information used for special reproduction of video content, the following are described as frame information for each single frame, or each frame group consisting of a plurality of consecutive or neighboring frames, selectively extracted from the entire frame sequence of the video data constituting the video content: first information indicating the position at which the data of the frame or frame group exists; and second information on the display time assigned to the frame or frame group and/or third information indicating the importance assigned to the frame or frame group corresponding to the frame information.

[0303] This embodiment also provides a computer-readable recording medium storing special reproduction control information that includes at least frame information described for each single frame, or each frame group consisting of a plurality of consecutive or neighboring frames, selectively extracted from the entire frame sequence of the video data constituting video content. The frame information includes first information indicating the position at which the data of the frame or frame group exists, and second information on the display time assigned to the frame or frame group and/or third information indicating the importance assigned to the frame or frame group.

[0304] Further, in an embodiment of a special reproduction control information generation apparatus/method for generating special reproduction control information used for special reproduction of video content, some of the frames to be used for special reproduction are selectively extracted in order along the frame sequence, as single frames or as frame groups each consisting of a plurality of consecutive or neighboring frames, from the entire frame sequence of the video data constituting the video content. For each extracted frame or frame group, video position information indicating the position at which its data exists is generated, together with display time control information containing information on the display time to be assigned to it or information from which that display time is calculated. The special reproduction control information for the video content is then created by describing the video position information and display time control information generated for each frame or frame group as frame information.

[0305] In an embodiment of a video reproduction apparatus/method capable of special reproduction of video content, special reproduction control information accompanying the video content is referred to. This information includes at least frame information described for each single frame, or each frame group consisting of a plurality of consecutive or neighboring frames, selectively extracted from the entire frame sequence of the video data constituting the video content. The frame information includes video position information indicating the position at which the data of the frame or frame group exists, and display time control information indicating the display time to be assigned to the frame or frame group or information from which that display time is calculated. On the basis of the video position information in the frame information, the data of the frame or frame group corresponding to each piece of frame information is acquired. On the basis of at least the display time control information in each piece of frame information, the display time to be assigned to each piece of frame information is determined. Special reproduction is then performed by reproducing, in a predetermined order, the acquired data of each frame or frame group for its determined display time.

[0306] In an embodiment of the present invention, for example, the position information of the effective video frames to be used for display, or image data extracted frame by frame from the original video, is prepared in advance, and information on the display time of those video frame positions or image data is prepared separately from the original video. By displaying the video frames or image data extracted from the original video in succession on the basis of the display information, special reproduction such as double-speed reproduction, trick reproduction, and skip continuous reproduction can be performed.

[0307] For example, in double-speed reproduction for checking content quickly, the display times may be determined in advance so that portions with large motion are displayed longer and portions with little motion are displayed shorter, which keeps the change on the display screen as constant as possible. The same effect can be obtained by determining the position information so that more of the video frames or image data used for display are taken from portions with much motion and fewer from portions with little motion. Values controlled so that the whole matches a speed factor or reproduction time specified by the user may be prepared. Even a long video can then be viewed in a short time by easy-to-watch double-speed reproduction or the like, and its content can be grasped quickly.
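One simple way to realize longer display for large motion while meeting a target reproduction time is to split the total time in proportion to a per-frame motion measure. Strict proportionality is an assumption made for this sketch; the description only requires longer display where motion is larger. The function name and the motion values are hypothetical.

```python
def display_times_from_motion(motion, total_time):
    """Split total_time among frames in proportion to their motion measure.

    motion: non-negative amount of screen change associated with each frame.
    Frames with more motion are displayed longer, and the display times
    add up to total_time (for example, original length / speed factor).
    """
    if not motion:
        return []
    total_motion = sum(motion)
    if total_motion <= 0:
        # No measurable motion anywhere: fall back to equal display times.
        return [total_time / len(motion)] * len(motion)
    return [total_time * m / total_motion for m in motion]


times = display_times_from_motion([1.0, 3.0, 4.0], total_time=16.0)
# times == [2.0, 6.0, 8.0]
```

The same function could be driven by importance values instead of motion to lengthen the display of important portions.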

【０３０８】例えば、表示時間を重要度に応じて、重要
な場所は表示時間を長く、低い場所は短くすることによ
り、重要な場所を見落としにくい再生も可能である。For example, by making the display time longer in an important place and shorter in a lower place in accordance with the degree of importance of the display time, it is possible to reproduce the important place without overlooking it.

【０３０９】例えば、全映像フレームを表示せず、部分
的に映像の一部を省略することにより、重要な部分だけ
を効率良く再生するようにしてもよい。For example, by not displaying all video frames and partially omitting a part of video, only important portions may be efficiently reproduced.

【０３１０】本発明の実施形態によれば、映像コンテン
ツの特殊再生に供するための制御情報として、元映像か
ら選択的に抽出したフレーム（群）の取得方法と、その
フレーム（群）に割り当てた（絶対的若しくは相対的
な）表示時間の情報又はこれを得る基となる情報とを含
むフレーム情報を、複数配列させて記述することによ
り、再生側では該制御情報に基づいた効果的な特殊再生
が可能になる。According to the embodiment of the present invention, as control information for providing special reproduction of video content, a method of obtaining a frame (group) selectively extracted from an original video, and assigning the frame (group) to the frame (group). By arranging and describing a plurality of pieces of frame information including (absolute or relative) display time information or information on which the display time is obtained, an effective trick play based on the control information is performed on the playback side. Becomes possible.

[0311] For example, each of the functions described above can also be implemented as software. The above embodiments can also be implemented as a computer-readable recording medium on which is recorded a program for causing a computer to execute predetermined means, for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions.

[0312] The configurations illustrated in the embodiments are examples and are not intended to exclude other configurations. Other configurations are also possible, obtained by replacing part of an illustrated configuration with something else, omitting part of it, adding another function to it, or combining these changes. Also possible are configurations that are logically equivalent to an illustrated configuration, that include a portion logically equivalent to it, or that are logically equivalent to its main part, as well as configurations that achieve the same or a similar purpose, or produce the same or a similar effect, as an illustrated configuration. Within each embodiment, the variations described for the various components can be combined as appropriate, and the embodiments themselves can be combined as appropriate. Each embodiment encompasses inventions from various viewpoints, stages, concepts, and categories, such as an invention as a method of describing information, an invention as the described information, an invention as a device or a corresponding method, and an invention as the interior of a device or a corresponding method. Further, the present invention can also be implemented as a computer-readable recording medium on which is recorded a program for causing a computer to execute predetermined means, for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions.

[0313] Accordingly, inventions can be extracted from the content disclosed in the embodiments of the present invention without being limited to the illustrated configurations.

[0314]

[Effect of the Invention] As described above, the present invention can provide a frame information description method, a frame information generation device and method, and a video reproduction device and method that enable special reproduction that is more effective for the user.

[Brief description of the drawings]

FIG. 1 is a diagram showing an example of the data structure of special reproduction control information according to an embodiment of the present invention.

FIG. 2 is a diagram showing a configuration example of a special reproduction control information generation device.

FIG. 3 is a diagram showing another configuration example of the special reproduction control information generation device.

FIG. 4 is a flowchart showing an example of a processing procedure for the configuration of FIG. 2.

FIG. 5 is a flowchart showing an example of a processing procedure for the configuration of FIG. 3.

FIG. 6 is a diagram showing a configuration example of a video reproduction device.

FIG. 7 is a flowchart showing an example of a processing procedure for the configuration of FIG. 6.

FIG. 8 is a diagram showing an example of the data structure of special reproduction control information.

FIG. 9 is a diagram explaining video position information that refers to an original video frame.

FIG. 10 is a diagram explaining video position information that refers to an image data file.

FIG. 11 is a diagram explaining a method of extracting image data according to the motion of the screen.

FIG. 12 is a diagram explaining video position information that refers to an original video frame.

FIG. 13 is a diagram explaining video position information that refers to an image data file.

FIG. 14 is a diagram showing the data structure of frame information when the frame position information described as original video information is given a temporal width.

FIG. 15 is a diagram explaining an example in which video position information referring to an original video frame is given a temporal width.

FIG. 16 is a diagram explaining an example in which video position information referring to an image data file is given a temporal width.

FIG. 17 is a diagram explaining an example in which video position information referring to an original video frame is given a temporal width.

FIG. 18 is a diagram explaining an example in which an image data file referring to an original video frame is given a temporal width.

FIG. 19 is a flowchart for starting reproduction from the frame of the original video that corresponds to a frame of the summary-displayed video.

FIG. 20 is a diagram explaining a method of extracting image data according to the motion of the screen.

FIG. 21 is a diagram explaining a method of extracting image data according to the motion of the screen.

FIG. 22 is a flowchart showing an example of a processing procedure for obtaining display times such that the amount of screen change is as constant as possible.

FIG. 23 is a flowchart showing an example of a processing procedure for obtaining the screen change amount of all frames from MPEG video.

FIG. 24 is a diagram explaining a method of calculating the image change amount from an MPEG stream.

FIG. 25 is a diagram explaining a processing technique for obtaining display times such that the amount of screen change is as constant as possible.

FIG. 26 is a flowchart showing an example of a processing procedure for performing special reproduction based on special reproduction control information.

FIG. 27 is a flowchart showing an example of a processing procedure for performing special reproduction based on a display cycle.

FIG. 28 is a diagram explaining the relationship between the calculated display time and the display cycle.

FIG. 29 is a diagram explaining the relationship between the calculated display time and the display cycle.

FIG. 30 is a diagram showing an example of the data structure of special reproduction control information having original video position information.

FIG. 31 is a diagram explaining video position information that refers to original video frames when a plurality of original videos are summarized and displayed together.

FIG. 32 is a diagram explaining video position information that refers to image data files when a plurality of original videos are summarized and displayed together.

FIG. 33 is a diagram showing another data structure for describing frame information.

FIG. 34 is a diagram showing an example of the data structure of special reproduction control information.

FIG. 35 is a diagram showing an example of the data structure of special reproduction control information.

FIG. 36 is a flowchart showing an example of a processing procedure for obtaining display times from importance.

FIG. 37 is a diagram explaining a technique for obtaining display times from importance.

FIG. 38 is a flowchart showing an example of a processing procedure for calculating importance data by treating scenes with a high audio level as important.

FIG. 39 is a flowchart showing an example of a procedure for calculating importance data by treating as important either scenes in which many important words are detected by speech recognition or scenes in which many words are spoken per unit time.

FIG. 40 is a flowchart showing an example of a procedure for calculating importance data by treating as important either scenes in which many important words are detected by telop (on-screen caption) recognition or scenes in which the telops appearing per unit time contain many words.

FIG. 41 is a flowchart showing an example of a procedure for calculating importance data by treating scenes in which large characters appear as a telop as important.

FIG. 42 is a flowchart showing an example of a procedure for calculating importance data by treating as important either scenes in which many human faces appear or scenes in which a human face appears large.

FIG. 43 is a flowchart showing an example of a procedure for calculating importance data by treating scenes in which video similar to a registered important scene appears as important.

FIG. 44 is a diagram showing an example of the data structure of special reproduction control information.

FIG. 45 is a diagram showing an example of the data structure of special reproduction control information.

FIG. 46 is a diagram showing an example of the data structure of special reproduction control information.

FIG. 47 is a diagram explaining the relationship between the information indicating reproduction or non-reproduction and the reproduced video.

FIG. 48 is a flowchart showing an example of a special reproduction procedure that includes a reproduction/non-reproduction determination.

FIG. 49 is a diagram showing an example of a data structure when sound information and text information are added.

FIG. 50 is a diagram showing an example of a data structure for describing only sound information separately from frame information.

FIG. 51 is a diagram showing an example of a data structure for describing only text information separately from frame information.

FIG. 52 is a diagram explaining the synchronization of the reproduction of each medium.

FIG. 53 is a flowchart showing an example of a procedure for determining the sound reproduction start time and the sound reproduction duration within a video section.

FIG. 54 is a flowchart showing an example of a procedure for creating audio data for reproduction and correcting the video frame display times.

FIG. 55 is a flowchart showing an example of a procedure for acquiring text information by telop recognition.

FIG. 56 is a flowchart showing an example of a procedure for acquiring text information by speech recognition.

FIG. 57 is a flowchart showing an example of a procedure for creating text information.

FIG. 58 is a diagram explaining a method of displaying text information.

FIG. 59 is a diagram showing another data structure for describing frame information.

FIG. 60 is a diagram showing another data structure for describing frame information.

FIG. 61 is a diagram explaining summary reproduction of music data.

FIG. 62 is a diagram explaining summary reproduction of a plurality of pieces of music data.

[Explanation of symbols]

1 ... Video data processing unit
2 ... Video data storage unit
3 ... Special reproduction control information storage unit
4 ... Image data file storage unit
11 ... Video position information processing unit
12 ... Display time control information processing unit
13 ... Image data file creation unit
21 ... Control unit
22 ... Normal reproduction processing unit
23 ... Special reproduction processing unit
24 ... Display unit
25 ... Content storage unit

Continued from the front page
(72) Inventor: Takeshi Mita, c/o Toshiba Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Koji Yamamoto, c/o Toshiba Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Koichi Masukura, c/o Toshiba Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
F-terms (reference): 5C052 AA01 AC01 CC11 DD04; 5C053 FA23 GA11 GB37 HA21 JA24; 5D044 AB05 AB07 BC03 CC04 FG18 FG23

Claims


1. A frame information description method, comprising: a step of describing first information for specifying the position, in original video data, of a frame extracted from a plurality of frames of the original video data; and a step of describing second information relating to a display time of the extracted frame.

2. The frame information description method according to claim 1, wherein the extracted frame comprises a frame group, and the first information includes information for specifying the position of the extracted frame group in the original video data.

3. The frame information description method according to claim 1, further comprising a step of describing third information relating to the importance of the extracted frame.

4. The frame information description method according to claim 1, wherein the first information includes information for specifying an image data file that corresponds to the extracted frame and is created from the original video data.

5. The frame information description method according to claim 1, wherein the extracted frame comprises a frame extracted from a plurality of frames in a certain time section of the original video data, the method further comprising a step of describing fourth information for specifying the time section.

6. The frame information description method according to claim 5, wherein the first information includes information for specifying an image data file that corresponds to the extracted frame and is created from the original video data.

7. The frame information description method according to claim 1, wherein the second information includes information on a display time such that the amount of screen change during reproduction is substantially constant.

8. The frame information description method according to claim 1, further comprising a step of describing fifth information for instructing reproduction or non-reproduction of the extracted frame.

9. The frame information description method according to claim 1, wherein the first information includes information indicating the position of the extracted frame in the original video data, or information indicating the position of image data corresponding to the extracted frame within an image data file generated from the original video data and stored separately from the original video data.

10. The frame information description method according to claim 1, further comprising a step of describing, for media data other than the original video data that includes the extracted frame, information indicating the position of the media data and information relating to a display time of the media data.

11. A computer-readable recording medium storing frame information relating to a frame extracted from a plurality of frames of original video data, wherein the frame information comprises: first information for specifying the position of the extracted frame in the original video data; and second information relating to a display time of the extracted frame.

12. The recording medium according to claim 11, wherein the extracted frame comprises a frame group, and the first information includes information for specifying the position of the extracted frame group in the original video data.

13. The recording medium according to claim 11, wherein the frame information further includes third information relating to the importance of the extracted frame.

14. The recording medium according to claim 11, wherein the first information includes information for specifying an image data file that corresponds to the extracted frame and is created from the original video data.

15. The recording medium according to claim 11, further storing, together with the frame information, the original video data and an image data file that corresponds to the extracted frame and is created from the original video data.

16. A frame information description device, comprising: means for describing first information for specifying the position, in original video data, of a frame extracted from a plurality of frames of the original video data; and means for describing second information relating to a display time of the extracted frame.

17. A frame information generation method, comprising: generating first information for specifying the position, in original video data, of a frame extracted from a plurality of frames of the original video data; and generating second information relating to a display time of the extracted frame.

18. A video reproduction device, comprising: means for referring to first information for specifying the position, in original video data, of a frame extracted from a plurality of frames of the original video data, and to second information relating to a display time of the extracted frame; means for acquiring the original video data of the extracted frame based on the first information; means for determining, based on the second information, a display time for reproducing the original video data of the extracted frame; and means for reproducing the acquired original video data for the determined display time.

19. A video reproduction method, comprising: referring to first information for specifying the position, in original video data, of a frame extracted from a plurality of frames of the original video data, and to second information relating to a display time of the extracted frame; acquiring the original video data of the extracted frame based on the first information; determining, based on the second information, a display time for reproducing the original video data of the extracted frame; and reproducing the acquired original video data for the determined display time.

20. A computer-readable recording medium storing a video reproduction program, the video reproduction program comprising: program code for causing a computer to refer to first information for specifying the position, in original video data, of a frame extracted from a plurality of frames of the original video data, and to second information relating to a display time of the extracted frame; program code for causing the computer to acquire the original video data of the extracted frame based on the first information; program code for causing the computer to determine, based on the second information, a display time for reproducing the original video data of the extracted frame; and program code for causing the computer to reproduce the acquired original video data for the determined display time.

21. A frame information description method, comprising: a step of describing first information for specifying the position, in original sound data, of a sound frame extracted from a plurality of frames of the original sound data; and a step of describing second information relating to a reproduction time of the extracted frame.

22. A computer-readable recording medium storing frame information relating to a sound frame extracted from a plurality of frames of original sound data, wherein the frame information comprises: first information for specifying the position of the extracted frame in the original sound data; and second information relating to a reproduction time of the extracted frame.

23. A frame information description method, comprising: a step of describing first information for specifying the position, in original text data, of a text frame extracted from a plurality of frames of the original text data; and a step of describing second information relating to a display time of the extracted frame.

24. A computer-readable recording medium storing frame information relating to a text frame extracted from a plurality of frames of original text data, wherein the frame information comprises: first information for specifying the position of the extracted frame in the original text data; and second information relating to a display time of the extracted frame.