JP2009129003A

JP2009129003A - Trimming processing apparatus and trimming processing program

Info

Publication number: JP2009129003A
Application number: JP2007300567A
Authority: JP
Inventors: Makoto Numata; 誠沼田; Hiroshi Senoo; 宏妹尾; Yoshiaki Shishikui; 善明鹿喰; Kinji Matsumura; 欣司松村; Takako Ariyasu; 香子有安; Kazuhiro Otsuki; 一博大槻; Rie Sawai; 里枝澤井
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2007-11-20
Filing date: 2007-11-20
Publication date: 2009-06-11
Anticipated expiration: 2027-11-20
Also published as: JP4881282B2

Abstract

PROBLEM TO BE SOLVED: To provide a trimming processing apparatus that can trim a region of interest even from video in which the motion of the whole screen is almost constant.

SOLUTION: A trimming processing apparatus 1 includes: an image reduction/division means 3 that reduces a frame image and divides it into blocks; an image feature extraction means 4 that extracts image features of each block; a shot classification means 5 that classifies a shot as a long shot or not according to the shape complexity of the field block group, which consists of blocks whose color is similar to predetermined field color information; a region-of-interest setting means 6 that, when the shot is classified as a long shot, sets as the region of interest a region containing the blocks of interest, whose luminance variance is higher than a predetermined luminance variance; and a trimmed video generation means 7 that extracts from the frame image, as the trimming region, a region that contains the region of interest and matches the aspect ratio of the display device, and outputs it.

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a trimming processing apparatus and a trimming processing program for trimming a region of interest in video.

In recent years, with the spread of mobile devices such as mobile phones and personal digital assistants (PDAs), it has become increasingly common to view content (video content) on low-resolution displays. Such content is distributed in various forms, including terrestrial digital broadcasting, streaming, and video files.

However, little content is produced with mobile viewing in mind; most content assumes a high-resolution display such as a high-definition television. In such high-resolution content, superimposed text and the people appearing in long shots are too small to be recognized on a low-resolution display.

To address this, trimming techniques have been disclosed that improve the visibility of content by trimming the video and displaying it enlarged (see, for example, Patent Documents 1 and 2). In the trimming processes disclosed in these documents, the region the viewer should pay attention to (a person or the like) is determined from motion vectors in the video, and that region is adopted as the trimming region.
JP 2002-281506 A
JP 2007-101867 A

Because the conventional techniques described above determine the trimming region from motion vectors, they can distinguish the region of an object such as a person from other regions by comparing motion vectors, track it, and extract it as the region of interest (trimming region). However, when the motion of the entire screen is almost constant, the motion vectors show almost no movement, so it is difficult to extract a region of interest.

Furthermore, because the conventional techniques extract the region of interest from motion vectors alone, when they are used to extract a person (player) from video of a sport played on a field, they will extract a region of interest from unexpected footage such as spectator seats or an electronic scoreboard whenever such footage appears. Because unexpected shots thus cause meaningless regions to be trimmed, the conventional techniques cannot be applied to broadcast sports video.

In addition, when the video format of the content is not MPEG (Moving Picture Experts Group), from which motion vectors can be taken as motion compensation data, the conventional techniques need a separate means for computing motion vectors from the video. Computing motion vectors requires a large amount of calculation, so on terminals with low computing power, such as mobile phones and portable information terminals, it is difficult to obtain motion vectors and determine the trimming region in real time.

The present invention has been made to solve these problems. Its object is to provide a trimming processing apparatus and a trimming processing program that, for video of a sport played on a field, can trim a region of interest focused only on the people (players) on the field, even when the motion of the entire screen is almost constant and even when footage of something other than the field appears at an unexpected time.

The present invention was devised to achieve this object. The trimming processing apparatus according to claim 1 trims a region of interest from video of a sport played on a field and outputs it to a display device, and comprises an image division means, an image feature extraction means, a shot classification means, a region-of-interest setting means, and a trimmed video generation means.

In this configuration, the image division means divides each frame image of the video into blocks of a predetermined size. The image feature extraction means then extracts the color information and luminance information of each block as the image features of that block. The features of the frame image are thus extracted block by block.

The shot classification means then classifies the shot as a long shot or not, based on whether the complexity of the overall shape formed by the field blocks exceeds a predetermined reference value, where a field block is a block whose extracted color information is similar to predetermined field color information. In a long shot, for example, the subjects (players) appear small, so the overall shape of the field blocks is close to a rectangle. In a shot other than a long shot (a non-long shot), blocks containing people (players) eat into the field blocks, so the overall shape splits apart, becomes more jagged, or develops holes, and is complex compared with a rectangle. The shot classification means can therefore determine from the complexity of the shape whether the shot is a long shot.

When the shot classification means has classified the shot as a long shot, the region-of-interest setting means sets, as the region of interest, a region containing the blocks of interest, which are the blocks whose luminance variance, obtained from the block luminance information, is higher than a predetermined luminance variance. In this way a region is extracted that contains the people (players) and other parts whose luminance variance differs from the rest of the field.

The trimmed video generation means then extracts from the frame image, as the trimming region, a region that contains the region of interest set by the region-of-interest setting means and that matches the aspect ratio of the display device, and outputs it. The region of interest is thus displayed enlarged on the screen of the display device.
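
The choice of a trimming region that contains the region of interest and matches the display aspect ratio could be sketched as follows. This is only an illustration under stated assumptions: the function name, the (x, y, w, h) rectangle convention, the centring of the crop on the region of interest, and the fallback used when the crop does not fit the frame are all hypothetical and are not taken from the patent text.

```python
def trimming_region(roi, frame_w, frame_h, aspect_w, aspect_h):
    """Return (x, y, w, h) of a crop with ratio aspect_w:aspect_h around roi.

    roi is (x, y, w, h) in pixels. All names here are illustrative.
    """
    rx, ry, rw, rh = roi
    if rw * aspect_h >= rh * aspect_w:
        # ROI is relatively wider than the display: keep width, grow height.
        w = rw
        h = -(-rw * aspect_h // aspect_w)  # ceiling division
    else:
        # ROI is relatively taller than the display: keep height, grow width.
        h = rh
        w = -(-rh * aspect_w // aspect_h)
    if w > frame_w or h > frame_h:
        # Assumed fallback: shrink to the largest crop that fits the frame.
        scale = min(frame_w / w, frame_h / h)
        w, h = int(w * scale), int(h * scale)
    # Centre the crop on the ROI, then shift it back inside the frame.
    x = rx - (w - rw) // 2
    y = ry - (h - rh) // 2
    x = max(0, min(x, frame_w - w))
    y = max(0, min(y, frame_h - h))
    return x, y, w, h
```

For a 320 × 100 pixel region of interest in a 1920 × 1080 frame and a 4:3 display, this sketch yields a 320 × 240 crop that would then be scaled up to the display resolution.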

The trimming processing apparatus according to claim 2 is the trimming processing apparatus according to claim 1, further comprising an image reduction means.

In this configuration, the image reduction means reduces each frame image of the video, and the image division means divides the reduced frame image into blocks.

The trimming processing apparatus according to claim 3 is the trimming processing apparatus according to claim 1 or 2, in which the shot classification means includes a shape complexity calculation means.

In this configuration, the shape complexity calculation means of the shot classification means calculates the complexity of the overall shape of the field blocks as the ratio of the perimeter of that shape to the number of blocks forming it. As the shape becomes more complex its jaggedness increases, so a longer perimeter indicates a more complex shape. The perimeter includes not only the outer boundary of the field blocks but also the inner boundary of any holes in the shape. When the field blocks form separate pieces, the perimeter is the sum of the perimeters of the pieces.
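
The perimeter-to-block-count ratio described above could be computed as in the following sketch. It assumes the field blocks are given as a 2D grid of booleans (a hypothetical representation), that the perimeter is measured in block edges, and that edges on the image border count as boundary; the text does not state these conventions. Counting exposed edges automatically includes the inner boundary of holes and sums over separate pieces.

```python
def shape_complexity(mask):
    """Perimeter of the field-block shape divided by the number of field blocks.

    mask[r][c] is True when block (r, c) is a field block (assumed layout).
    """
    rows, cols = len(mask), len(mask[0])
    count = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            count += 1
            # Each side that faces a non-field block or the image border
            # contributes one unit of perimeter.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if not inside or not mask[nr][nc]:
                    perimeter += 1
    return perimeter / count if count else 0.0
```

With this measure, a solid rectangle of field blocks scores lower than the same rectangle with a hole punched in it, which matches the qualitative description of simple versus complex shapes.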

The trimming processing apparatus according to claim 4 is the trimming processing apparatus according to claim 3, in which the shot classification means further includes an occupancy ratio calculation means.

In this configuration, the occupancy ratio calculation means of the shot classification means calculates, for the lower region of the frame image, the ratio of the number of field blocks to the total number of blocks in that lower region as the field occupancy ratio. The shot classification means classifies the shot as a long shot when the complexity of the overall shape of the field blocks is larger than a predetermined value and the field occupancy ratio is smaller than a predetermined value.
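
The field occupancy ratio could be computed as in the sketch below. The text does not say how large the lower region is, so the `lower_fraction` parameter (defaulting to the bottom half of the block rows) is an assumption, as are the function name and the boolean grid representation of field blocks.

```python
def field_occupancy(mask, lower_fraction=0.5):
    """Share of field blocks among all blocks in the lower part of the frame.

    mask[r][c] is True for field blocks; row 0 is the top of the frame.
    lower_fraction is an assumed parameter, not a value from the source.
    """
    rows = len(mask)
    lower_rows = max(1, int(rows * lower_fraction))
    lower = mask[rows - lower_rows:]
    total = sum(len(row) for row in lower)
    field = sum(1 for row in lower for is_field in row if is_field)
    return field / total
```

The resulting ratio would then be compared with a predetermined value, together with the shape complexity, when the shot is classified.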

The trimming processing apparatus according to claim 5 is the trimming processing apparatus according to any one of claims 1 to 4, in which the region-of-interest setting means includes a block-of-interest extraction means, a clustering means, and a class selection means.

In this configuration, the block-of-interest extraction means extracts, as blocks of interest, the blocks whose luminance variance, obtained from the block luminance information, is higher than a predetermined luminance variance. The clustering means then clusters the extracted blocks of interest, grouping together blocks that lie within a predetermined distance of one another. One or more regions containing blocks of interest are thus obtained as candidate regions of interest. When the clustering means has grouped the blocks of interest into a plurality of classes, the class selection means selects one class as the region of interest according to a predetermined selection criterion.
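
The three steps above (thresholding, clustering by distance, class selection) could be sketched as follows. Several details are assumptions made only for illustration: distance is measured as Chebyshev distance in block units, clusters are grown by linking any two blocks within `max_dist`, and the "predetermined selection criterion" is taken to be "the class with the most blocks". The text does not specify any of these.

```python
def blocks_of_interest(variances, threshold):
    """Positions (row, col) of blocks whose luminance variance exceeds threshold."""
    return [(r, c)
            for r, row in enumerate(variances)
            for c, v in enumerate(row)
            if v > threshold]

def cluster_blocks(blocks, max_dist):
    """Group blocks so that linked blocks are within max_dist of each other."""
    unvisited = set(blocks)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, stack = [seed], [seed]
        while stack:
            r, c = stack.pop()
            near = [b for b in unvisited
                    if max(abs(b[0] - r), abs(b[1] - c)) <= max_dist]
            for b in near:
                unvisited.remove(b)
                cluster.append(b)
                stack.append(b)
        clusters.append(cluster)
    return clusters

def select_class(clusters):
    """Assumed criterion: pick the class containing the most blocks."""
    return max(clusters, key=len)
```

A bounding box around the selected class, converted from block units back to pixels, would then serve as the region of interest.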

The trimming processing program according to claim 6 causes a computer to function as an image division means, an image feature extraction means, a shot classification means, a region-of-interest setting means, and a trimmed video generation means, in order to trim a region of interest from video of a sport played on a field and output it to a display device.

In this configuration, the trimming processing program uses the image division means to divide each frame image of the video into blocks of a predetermined size, and uses the image feature extraction means to extract the color information and luminance information of each block as the image features of that block. It then uses the shot classification means to classify the shot as a long shot or not, based on whether the complexity of the overall shape formed by the field blocks exceeds a predetermined reference value, where a field block is a block whose extracted color information is similar to predetermined field color information.

When the shot classification means has classified the shot as a long shot, the trimming processing program uses the region-of-interest setting means to set, as the region of interest, a region containing the blocks of interest, which are the blocks whose luminance variance, obtained from the block luminance information, is higher than a predetermined luminance variance. It then uses the trimmed video generation means to extract from the frame image, as the trimming region, a region that contains the region of interest and matches the aspect ratio of the display device, and outputs it.

The present invention provides the following advantageous effects.
According to the inventions of claims 1 and 6, for video of a sport played on a field, only a region containing the region of interest can be trimmed from long-shot footage and displayed. The region of interest can thus be displayed enlarged, which improves visibility on a low-resolution display device (display).

Also according to the inventions of claims 1 and 6, trimming can be performed by setting the region of interest from luminance information, without using motion vectors. The region of interest can therefore be set with less computation than when motion vectors are used, and trimming is possible even when the motion of the entire screen is almost constant. Furthermore, because whether a shot is a long shot is determined from the shape of the field blocks extracted by color information, a region of interest is prevented from being trimmed out of an unexpected shot such as one of the spectator seats. The invention can therefore be applied to broadcast sports video.

According to the invention of claim 2, the frame image is reduced, which reduces the amount of computation in processes such as image feature extraction and shot determination. The trimming process can therefore be performed at high speed.

According to the invention of claim 3, the complexity of the overall shape of the field blocks is obtained as the ratio of the perimeter of that shape to the number of field blocks, so the complexity is obtained quantitatively. Long shots can therefore be judged against a fixed criterion, which makes the judgment stable.

According to the invention of claim 4, long shots can be judged from the proportion of field blocks in the lower part of the screen. Long shots can therefore be judged accurately even in video where spectator seats or the like appear in the upper part of the screen.

According to the invention of claim 5, clustering allows a plurality of blocks of interest to be extracted as a single region of interest, so the scene to be watched is not subdivided more than necessary. By trimming the video so that it contains the extracted region of interest, the invention can enlarge and display a region of interest of the most suitable size for the scene.

Embodiments of the present invention are described below with reference to the drawings.
[Configuration of the trimming processing apparatus]
First, the configuration of a trimming processing apparatus according to an embodiment of the present invention is described with reference to FIG. 1. FIG. 1 is a block diagram showing the overall configuration of the trimming processing apparatus according to the embodiment.

The trimming processing apparatus 1 trims a region of interest from video of a sport played on a field (soccer, rugby, or the like). The apparatus is built into a mobile phone or portable information terminal and trims the region of interest of the input video according to the resolution and aspect ratio of the display device (display) of the phone or terminal. Here, the trimming processing apparatus 1 includes a video input means 2, an image reduction/division means 3, an image feature extraction means 4, a shot classification means 5, a region-of-interest setting means 6, and a trimmed video generation means 7.

The video input means 2 receives video from outside. The video is, for example, an MPEG2-TS stream as used in digital broadcasting or an AVI (Audio Video Interleave) file playable on a personal computer; the video format and resolution are not particularly limited. Here, the video input means 2 receives the video frame image by frame image, writes each frame image to a memory (not shown), and notifies the image reduction/division means 3.

The image reduction/division means 3 reduces each frame image of the video received by the video input means 2 and divides it into blocks of a predetermined size. Its detailed configuration is described with reference to FIG. 2 (and FIG. 1 as appropriate). FIG. 2 is a block diagram showing the configuration of the image reduction/division means. Here, the image reduction/division means 3 includes an image reduction means 31 and an image division means 32.

The image reduction means 31 reduces the frame image. For example, when the video is high-definition video, the image reduction means 31 generates a reduced frame image by reducing a frame image with a resolution of 1920 × 1080 pixels (width × height) to a resolution of about 240 × 135 pixels (width × height). A general reduction method can be used, for example the average pixel method or the bicubic method.
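
As an illustration of reduction by pixel averaging, the sketch below replaces each `factor` × `factor` group of pixels with its mean; reducing 1920 × 1080 to 240 × 135 corresponds to a factor of 8 in both directions. Treating the "average pixel method" as plain box averaging, working on a single-channel image stored as nested lists, and requiring dimensions divisible by the factor are simplifying assumptions for this example.

```python
def reduce_by_averaging(image, factor):
    """Shrink a single-channel image by averaging factor x factor pixel groups.

    image is a list of rows of numbers whose height and width are
    multiples of factor (an assumption made to keep the sketch short).
    """
    height, width = len(image), len(image[0])
    reduced = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            total = sum(image[y][x]
                        for y in range(top, top + factor)
                        for x in range(left, left + factor))
            row.append(total / (factor * factor))
        reduced.append(row)
    return reduced
```

For a color frame the same averaging would be applied to each of the R, G, and B channels.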

Reducing the frame image in this way allows the trimming processing apparatus 1 to perform the subsequent processing at high speed. The reduction ratio of the image reduction means 31 is not particularly limited; for example, when the CPU of the trimming processing apparatus 1 has low processing performance, the image is reduced more strongly.

The image division means 32 divides the reduced frame image generated by the image reduction means 31 into blocks of a predetermined size. For example, the image division means 32 divides the reduced frame image into macroblocks of 16 × 16 pixels.
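
Division into 16 × 16 blocks could look like the sketch below, which records each block's grid position and pixel rectangle. Note that a height of 135 pixels is not a multiple of 16, and the text does not say how the leftover rows are handled; this example keeps them as a shorter bottom row of blocks, which is purely an assumption. The function name and tuple layout are also hypothetical.

```python
def divide_into_blocks(width, height, block=16):
    """List (block_row, block_col, x, y, w, h) for every block in the image.

    Blocks on the right or bottom edge are truncated when the image size
    is not a multiple of the block size (an assumed convention).
    """
    blocks = []
    for block_row, y in enumerate(range(0, height, block)):
        for block_col, x in enumerate(range(0, width, block)):
            w = min(block, width - x)
            h = min(block, height - y)
            blocks.append((block_row, block_col, x, y, w, h))
    return blocks
```

For a 240 × 135 reduced frame this produces 15 columns and 9 rows of blocks, the last row being 7 pixels high.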

The blocks (macroblocks) produced by reduction in the image reduction means 31 and division in the image division means 32 are output to the image feature extraction means 4. The position of each block within the reduced frame image is output to the image feature extraction means 4 together with the block data.

Although the image division means 32 here divides the reduced frame image generated by the image reduction means 31, reduction is not always necessary. When the resolution of the input video is low (for example, equal to the resolution of the display device), or when the CPU of the trimming processing apparatus 1 has high processing performance, the frame images of the input video may be divided into blocks directly. In that case the image reduction means 31 can be omitted from the configuration.

The image reduction/division processing performed by the image reduction/division means 3 is now described using a concrete image example, with reference to FIG. 7 (and FIG. 2 as appropriate). FIG. 7 is an explanatory diagram of the image reduction/division processing, in which (a) shows a frame image of the input video, (b) shows the reduced frame image obtained by reducing the frame image, and (c) shows the reduced frame image divided into macroblocks.

As shown in FIG. 7, the image reduction/division means 3 uses the image reduction means 31 to reduce a frame image H of 1920 × 1080 pixels, a frame image of video of a soccer match, to an image of 240 × 135 pixels (reduced frame image L). The image reduction/division means 3 then uses the image division means 32 to divide the reduced frame image L into macroblocks of 16 × 16 pixels (blocks B, B, ...). These blocks B, B, ... are processed by the image feature extraction means 4 (FIG. 1). Returning to FIG. 1, the description of the configuration of the trimming processing apparatus 1 continues.

The image feature extraction means 4 extracts (calculates) color information and luminance information as the image features of each block produced by the image division means 32 of the image reduction/division means 3. Here, the image feature extraction means 4 uses the average color in the block (color average) as the color information and the variance of luminance (luminance variance) as the luminance information.

For example, when a block is m pixels high and n pixels wide and the RGB values of its pixels are R (red), G (green), and B (blue), the image feature extraction means 4 calculates the color information of the block as the averages of R, G, and B, as shown in the following equation (1). Here k is the number of pixels in the block; for example, when m = 16 and n = 16, k = 256.
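
The per-channel averaging described here could be written as in the following sketch, where each channel is summed over the k pixels of the block and divided by k. The function name and the representation of a block as a flat list of (R, G, B) tuples are assumptions made for illustration.

```python
def color_mean(pixels):
    """Average R, G, and B over the k pixels of one block.

    pixels is a flat list of (R, G, B) tuples (assumed layout);
    k = m * n for a block of m x n pixels.
    """
    k = len(pixels)
    return tuple(sum(p[channel] for p in pixels) / k for channel in range(3))
```

A 16 × 16 block would pass 256 tuples to this function and receive one averaged (R, G, B) triple back.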

Further, when the luminance of a pixel in the block is L, the image feature extraction means 4 calculates the luminance information of the block as the variance of the luminance values (luminance variance S^2), as shown in the following equation (2).
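
A luminance variance of this kind could be computed as sketched below. Two points are assumptions, since the text does not settle them: the variance is taken as the population variance (dividing by the number of pixels k rather than k - 1), and luminance is derived from RGB with the common ITU-R BT.601 weights. The function names are hypothetical.

```python
def luminance(r, g, b):
    """Luminance from RGB using BT.601 weights (an assumed conversion)."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def luminance_variance(values):
    """Population variance S^2 of the luminance values of one block."""
    k = len(values)
    mean = sum(values) / k
    return sum((v - mean) ** 2 for v in values) / k
```

A flat patch of grass gives a variance near zero, while a block containing a player against the grass mixes dark and bright pixels and gives a larger value.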

Although the average of the RGB values is used here as the color information, any index that represents color may be used; the color information is not limited to RGB values. For example, HSV (H: hue, S: saturation, V: value) may be used.

The shot classification means 5 classifies the shot as a long shot or not, based on whether the complexity (shape complexity) of the overall shape formed by the field blocks exceeds a predetermined reference value, where a field block is a block whose color information extracted by the image feature extraction means 4 is similar to predetermined field color information. Its detailed configuration is described with reference to FIG. 3 (and FIG. 1 as appropriate). FIG. 3 is a block diagram showing the configuration of the shot classification means. Here, the shot classification means 5 includes a block determination means 51, a shape complexity calculation means 52, an occupancy ratio calculation means 53, and a shot determination means 54.

The block determination means 51 determines, from the color information extracted by the image feature extraction means 4, whether each block produced by the image reduction/division means 3 is a block showing part of the field (a field block). For example, when the input video shows a sport played on grass, such as soccer, a block containing much of the green of the grass can be determined to be a field block. Likewise, when the input video shows a sport played on soil, such as rugby, a block containing much of the brown of the soil can be determined to be a field block.

For example, when determining field blocks in soccer video, the block determination means 51 uses the RGB averages that serve as the color information and determines whether a block is a field block according to whether it satisfies the condition of equation (3). In equation (3), α and β are constants; for example, when RGB has 256 gradations, setting α = 5 and β = 25 allows field blocks of grass to be determined.
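The per-block test can be sketched as follows. Equation (3) is not reproduced in this text, so the rule `looks_like_grass` (green exceeding red by α and blue by β) is only an assumed stand-in and not the condition of equation (3); the function names and the list-of-rows layout are hypothetical as well.

```python
from typing import Callable, List, Tuple

RGB = Tuple[float, float, float]

def looks_like_grass(mean_rgb: RGB, alpha: float = 5, beta: float = 25) -> bool:
    """ASSUMED stand-in rule, not equation (3): green dominates red and blue."""
    r, g, b = mean_rgb
    return g > r + alpha and g > b + beta

def field_mask(block_means: List[List[RGB]],
               is_field: Callable[[RGB], bool] = looks_like_grass) -> List[List[bool]]:
    """Return a grid of booleans, True where a block is judged to be a field block."""
    return [[is_field(mean) for mean in row] for row in block_means]

# Mean RGB per block for a tiny 2 x 2 block grid (made-up values).
means = [[(60, 120, 50), (200, 200, 200)],
         [(70, 130, 60), (65, 125, 55)]]
mask = field_mask(means)
```

Passing the rule in as a parameter mirrors the description's point that the field color information can be swapped per sport; a dirt-colored predicate could be supplied in the same way.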

Here, the color information of the field is used as the criterion for determining whether a block is a field block, and a plurality of sets of field color information are assumed to be stored in advance in a memory or the like (not shown). This allows the field color information to be changed as appropriate for different sports videos. For example, a color information selection means (not shown) may let the viewer choose, from an on-screen menu, the type of sports video about to be watched and change the field color information accordingly, or the type of sports video may be determined from the program guide distributed by digital broadcasting (EPG: Electronic Program Guide) and the field color information changed accordingly.

The shape complexity calculation means 52 calculates the complexity of the overall shape (the shape complexity) of the blocks determined to be field blocks by the block determination means 51. The shape complexity calculated by the shape complexity calculation means 52 is output to the shot determination means 54.

Normally, in video captured as a long shot, the subjects (players and the like) appear small, so the overall shape of the field block group is close to a rectangle. In contrast, in video captured other than as a long shot, the subjects appear large, so the frame image is eroded by non-field blocks (blocks other than field blocks) and the overall shape of the field block group becomes more complex than a rectangle. The shape complexity is therefore used here as an index for classifying whether a frame image is a long-shot image.

Since a shape can be said to be more complex the more irregular its outline is, the shape complexity can be obtained as the ratio of the perimeter of the field block group to the number of blocks in the group. For example, letting the perimeter of the field block group be D and its area be A, the shape complexity calculation means 52 calculates the shape complexity C by equation (4). Here, the number of field blocks that adjoin a block other than a field block can be used as the perimeter D, and the number of field blocks can be used as the area A.

Even when the field block group is split into separate parts, the shape complexity can be calculated by equation (4). When the field block group contains holes, the perimeter D is the length that includes both the outer boundary of the field block group and the inner boundaries of the holes.
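A minimal sketch of this calculation follows. Equation (4) is not reproduced in this text, so the sketch takes C = D / A, following the sentence that describes the shape complexity as the ratio of perimeter to block count. The 4-neighbourhood adjacency and the decision not to count the frame border as a non-field neighbour are assumptions, and the function name is hypothetical.

```python
from typing import List

def shape_complexity(mask: List[List[bool]]) -> float:
    """C = D / A on a non-empty block grid (True = field block).

    A counts field blocks; D counts field blocks that touch at least one
    non-field block (4-neighbourhood, an assumption).
    """
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x]:
                continue
            area += 1
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Positions outside the frame are not treated as non-field blocks (assumption).
                if 0 <= ny < rows and 0 <= nx < cols and not mask[ny][nx]:
                    perimeter += 1
                    break
    # With no field blocks at all there is no shape; 0.0 is returned and the
    # occupation ratio is left to reject such frames.
    return perimeter / area if area else 0.0
```

Because every field block next to a non-field block is counted, separated parts and holes are handled without special cases, which matches the remark on separated and perforated shapes.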

The occupation ratio calculation means 53 calculates, for the reduced frame image, the ratio of the number of field blocks to the total number of blocks (the field occupation ratio). The field occupation ratio calculated by the occupation ratio calculation means 53 is output to the shot determination means 54.

Normally, in video captured as a long shot, the subjects (players and the like) appear small, so field blocks occupy a large proportion of the screen (reduced frame image). The field occupation ratio is therefore used here as an index for classifying whether a frame image is a long-shot image.

In general, video of a sport played on a field may include non-field content, such as spectator seats, in the upper part of the screen. The occupation ratio calculation means 53 therefore calculates, as the field occupation ratio, the ratio of the number of field blocks to the total number of blocks within the lower region of the screen (reduced frame image).

Specifically, letting γ be the number of blocks in the lower region of the screen (specifically, the lower half) and N_d be the number of field blocks in that region, the occupation ratio calculation means 53 calculates the field occupation ratio R_d by equation (5), that is, as the ratio of N_d to γ.

For example, when the reduced frame image is 240 × 135 pixels and is divided into 16 × 16 pixel blocks (macroblocks), the total number of blocks is 120 (15 columns × 8 rows of whole blocks), so 60 is used as the value of γ.
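The arithmetic and the ratio can be sketched as follows. Equation (5) is not reproduced in this text; the sketch uses R_d = N_d / γ as described in the surrounding paragraphs. Dropping the partial bottom row of blocks is an assumption that reproduces the stated total of 120, and the function name is hypothetical.

```python
from typing import List

def lower_half_field_ratio(mask: List[List[bool]]) -> float:
    """R_d = N_d / gamma over the lower half of the block grid (True = field block)."""
    rows = len(mask)
    lower = mask[rows - rows // 2:]                    # bottom rows // 2 rows
    gamma = sum(len(row) for row in lower)             # blocks in the lower region
    n_d = sum(cell for row in lower for cell in row)   # field blocks in that region
    return n_d / gamma if gamma else 0.0

# Worked numbers from the text: 240 x 135 pixels, 16 x 16 pixel blocks.
cols, rows = 240 // 16, 135 // 16   # 15 x 8 whole blocks (partial row dropped, assumed)
total_blocks = cols * rows          # 120
gamma = total_blocks // 2           # 60
```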

The shot determination means 54 determines whether a frame image (reduced frame image) constituting the video is a long-shot image, based on the shape complexity calculated by the shape complexity calculation means 52 and the field occupation ratio calculated by the occupation ratio calculation means 53. Here, the shot determination means 54 determines "long shot" when the shape complexity is smaller than a predetermined threshold and the field occupation ratio is larger than a predetermined threshold, and determines "non-long shot" otherwise. The type of frame image resulting from this determination ("long shot" or "non-long shot") is output to the region-of-interest setting means 6 and the trimmed video generation means 7.

Although the shot determination means 54 here uses two indices, the shape complexity and the field occupation ratio, to determine a long shot, the determination may be made using either one alone. To improve the accuracy of the determination, however, it is desirable to use both. Returning to FIG. 1, the description of the configuration of the trimming processing apparatus 1 continues.
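The two-index decision can be written as a short predicate. The source only says that both thresholds are predetermined, so the default values below are placeholders chosen for illustration, and the function name is hypothetical.

```python
def is_long_shot(shape_complexity: float, field_ratio: float,
                 max_complexity: float = 0.5, min_ratio: float = 0.7) -> bool:
    """Long shot when the field outline is simple AND the field fills the lower region.

    Both default thresholds are placeholders, not values from the source.
    """
    return shape_complexity < max_complexity and field_ratio > min_ratio
```

Using only one index, as the text permits, amounts to dropping one of the two comparisons at the cost of accuracy.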

When the shot classification means 5 classifies the frame image as a long-shot image, the region-of-interest setting means 6 sets, as the region of interest, a region containing blocks of interest, that is, blocks whose luminance variance is higher than a predetermined luminance variance, based on the luminance information of the blocks extracted by the image feature extraction means 4. The detailed configuration of the region-of-interest setting means 6 is described here with reference to FIG. 4 (and FIG. 1 as appropriate). FIG. 4 is a block diagram showing the configuration of the region-of-interest setting means. Here, the region-of-interest setting means 6 includes a block-of-interest extraction means 61, a clustering means 62, and a class selection means 63.

Based on the luminance information extracted by the image feature extraction means 4, the block-of-interest extraction means 61 extracts blocks of interest, which contain players and the like, from the blocks (macroblocks) produced by the image reduction/division means 3. Normally, on a field of uniform luminance, a region in which a player or the like appears has a high luminance variance. The block-of-interest extraction means 61 therefore extracts, as blocks of interest, the blocks whose luminance variance is higher than a predetermined luminance variance.
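As a sketch, the extraction reduces to a threshold test over the per-block variances. The grid layout (a list of rows), the threshold value used in the example, and the function name are assumptions made for illustration.

```python
from typing import List, Tuple

def blocks_of_interest(variances: List[List[float]],
                       threshold: float) -> List[Tuple[int, int]]:
    """Return (row, col) for every block whose luminance variance exceeds threshold."""
    return [(y, x)
            for y, row in enumerate(variances)
            for x, value in enumerate(row)
            if value > threshold]

# Made-up variances: two busy blocks on an otherwise flat field.
example = blocks_of_interest([[1.0, 50.0], [120.0, 3.0]], threshold=40.0)
```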

The clustering means 62 clusters the blocks of interest extracted by the block-of-interest extraction means 61. Here, the clustering means 62 clusters the blocks of interest so that blocks lying within a predetermined distance of one another belong to the same class. Clustering nearby blocks of interest into one class in this way forms one or more candidates for the region of interest. When the clustering yields only one class, that class becomes the region of interest. As a result, even when blocks of interest are separated from one another, blocks within a certain distance can be treated as one class, so the scene of interest is not subdivided more than necessary.

When the clustering means 62 clusters the blocks of interest into a plurality of classes, the class selection means 63 selects one of the classes as the region of interest according to a predetermined selection criterion. The selection criterion can be based on the position or size of the region-of-interest candidates. For example, when position is the criterion, the class selection means 63 selects, from among the classes (region-of-interest candidates), the class closest to the center position of the reduced frame image as the region of interest. When size is the criterion, the class selection means 63 selects, from among the classes (region-of-interest candidates), the class with the largest occupied area as the region of interest.

Here, the region-of-interest setting process performed by the region-of-interest setting means 6 is described using a specific image example, with reference to FIG. 8 (and FIG. 4 as appropriate). FIG. 8 is an explanatory diagram for explaining the region-of-interest setting process, in which (a) shows the state where blocks of interest have been extracted in the reduced frame image and (b) shows the state where regions of interest have been set based on the blocks of interest.
As shown in FIG. 8, the region-of-interest setting means 6 uses the block-of-interest extraction means 61 to extract, as blocks of interest, the blocks whose luminance variance is higher than a predetermined luminance variance. Extracting blocks with a high luminance variance in this way makes it possible to extract the regions in which players and the like appear. FIG. 8(a) shows an example in which six blocks of interest (B_N1 to B_N6) are extracted.

The region-of-interest setting means 6 then uses the clustering means 62 to cluster nearby blocks into the same class. Here, blocks of interest that lie within two blocks of one another are placed in the same class. As a result, blocks of interest B_N1 to B_N3 are clustered into one class and blocks of interest B_N4 to B_N6 into another, and region-of-interest candidates C_1 and C_2 are selected, as shown in FIG. 8(b). Whether the class closest to the center position or the class with the largest occupied area is chosen as the region of interest, the class selection means 63 selects candidate C_1 as the region of interest. Returning to FIG. 1, the description of the configuration of the trimming processing apparatus 1 continues.
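The clustering and the two selection criteria can be sketched as follows. The source does not say how distance between blocks is measured or how occupied area is counted, so the Chebyshev distance and the bounding-box area used here are assumptions, and all function names are hypothetical.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (row, col) in the block grid

def cluster_blocks(blocks: List[Block], max_dist: int = 2) -> List[List[Block]]:
    """Group blocks so that blocks within max_dist (Chebyshev, assumed) share a class."""
    remaining = set(blocks)
    classes = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            y, x = frontier.pop()
            near = {b for b in remaining
                    if max(abs(b[0] - y), abs(b[1] - x)) <= max_dist}
            remaining -= near
            cluster.extend(near)
            frontier.extend(near)
        classes.append(sorted(cluster))
    return sorted(classes)

def bounding_box(cluster: List[Block]) -> Tuple[int, int, int, int]:
    """Inclusive (top, left, bottom, right) of a class."""
    ys = [y for y, _ in cluster]
    xs = [x for _, x in cluster]
    return min(ys), min(xs), max(ys), max(xs)

def select_by_center(classes: List[List[Block]], grid_rows: int, grid_cols: int) -> List[Block]:
    """Pick the class whose bounding-box centre is nearest the grid centre."""
    cy, cx = (grid_rows - 1) / 2, (grid_cols - 1) / 2
    def dist2(cluster):
        top, left, bottom, right = bounding_box(cluster)
        return ((top + bottom) / 2 - cy) ** 2 + ((left + right) / 2 - cx) ** 2
    return min(classes, key=dist2)

def select_by_area(classes: List[List[Block]]) -> List[Block]:
    """Pick the class whose bounding box covers the most blocks (area measure assumed)."""
    def area(cluster):
        top, left, bottom, right = bounding_box(cluster)
        return (bottom - top + 1) * (right - left + 1)
    return max(classes, key=area)

# Six made-up blocks of interest on a 8 x 15 block grid, forming two classes.
classes = cluster_blocks([(3, 5), (3, 6), (4, 8), (1, 12), (2, 13), (2, 14)])
```

With these sample positions both criteria pick the same class, as in the FIG. 8 example where either criterion yields C_1.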

The trimmed video generation means 7 extracts from the frame image, as a trimming region, a region that matches the aspect ratio of the target display device (not shown), and outputs it. The detailed configuration of the trimmed video generation means 7 is described here with reference to FIG. 5 (and FIG. 1 as appropriate). FIG. 5 is a block diagram showing the configuration of the trimmed video generation means. Here, the trimmed video generation means 7 includes a trimming region setting means 71 and a trimming region extraction means 72.

The trimming region setting means 71 sets, in the frame image, a region that matches the aspect ratio of the display device as the trimming region. When the shot classification means 5 has classified the frame image as a long-shot image, the trimming region setting means 71 sets as the trimming region a region that contains at least the region of interest set by the region-of-interest setting means 6 and that matches the aspect ratio of the display device. Normally, the proportions of the region of interest set by the region-of-interest setting means 6 do not match the aspect ratio of the display device. The trimming region setting means 71 therefore sets the trimming region by expanding the region of interest upward, downward, leftward, and rightward. Alternatively, the trimming region setting means 71 may set the trimming region by expanding the region of interest in only one direction, either vertically or horizontally.

For example, the proportions of the region of interest are compared with the aspect ratio of the display device. When the region of interest is proportionally taller, the trimming region setting means 71 fixes the height of the region of interest and expands its width (to the left and right) until the proportions match the aspect ratio, and uses the result as the trimming region. When the region of interest is proportionally wider, the trimming region setting means 71 fixes the width of the region of interest and expands its height (upward and downward) until the proportions match the aspect ratio, and uses the result as the trimming region.
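The expansion rule can be sketched as below. The source does not state how the added width or height is distributed or what happens at the frame edge, so growing symmetrically about the centre of the region of interest and shifting the result back inside the frame are assumptions; the function name and the (left, top, width, height) layout are hypothetical.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (left, top, width, height) in pixels

def fit_to_aspect(roi: Rect, aspect_w: int, aspect_h: int,
                  frame_w: float, frame_h: float) -> Rect:
    """Expand roi in one direction until width:height equals aspect_w:aspect_h."""
    left, top, w, h = roi
    if w * aspect_h < h * aspect_w:
        # Proportionally tall: keep the height, widen to the left and right.
        new_w, new_h = h * aspect_w / aspect_h, h
    else:
        # Proportionally wide: keep the width, extend upward and downward.
        new_w, new_h = w, w * aspect_h / aspect_w
    cx, cy = left + w / 2, top + h / 2
    # Shift the rectangle back inside the frame if it spills over an edge
    # (assumed policy). If it is larger than the frame it is pinned to 0.
    new_left = min(max(cx - new_w / 2, 0), max(frame_w - new_w, 0))
    new_top = min(max(cy - new_h / 2, 0), max(frame_h - new_h, 0))
    return (new_left, new_top, new_w, new_h)

# A tall region of interest in a 1920 x 1080 frame, fitted to a 4:3 display.
trim = fit_to_aspect((800, 300, 200, 300), 4, 3, 1920, 1080)
```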

On the other hand, when the shot classification means 5 has classified the frame image as a non-long-shot image, the trimming region setting means 71 sets as the trimming region the largest region that has the aspect ratio of the display device, with the center of the frame image (the center of the screen) as the center of the trimming region.

The trimming region extraction means 72 extracts the trimming region set by the trimming region setting means 71 from the frame image and generates an image enlarged or reduced according to the resolution of the display device. The image extracted and enlarged or reduced by the trimming region extraction means 72 is output to the display device sequentially as a frame image of the video. The video of the trimmed region is thereby displayed on the display device.
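A minimal sketch of the extract-and-scale step follows. The source does not name a scaling method, so nearest-neighbour sampling is an assumption chosen for brevity; the image is modelled as a list of pixel rows, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple

def crop_and_resize(image: Sequence[Sequence], rect: Tuple[int, int, int, int],
                    out_w: int, out_h: int) -> List[list]:
    """Cut rect = (left, top, width, height) out of image and scale it to out_w x out_h.

    Nearest-neighbour sampling (assumed); works for enlargement and reduction.
    """
    left, top, w, h = rect
    return [[image[top + (y * h) // out_h][left + (x * w) // out_w]
             for x in range(out_w)]
            for y in range(out_h)]

# A 4 x 4 test image whose pixel value encodes its position (row * 4 + col).
image = [[row * 4 + col for col in range(4)] for row in range(4)]
enlarged = crop_and_resize(image, (1, 1, 2, 2), 4, 4)
```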

Here, the method by which the trimmed video generation means 7 extracts the trimming region is described using a specific image example, with reference to FIG. 9 (and FIG. 2 as appropriate). FIG. 9 is an explanatory diagram for explaining the trimming region extraction process, in which (a) shows the trimming regions for a "long shot" and (b) shows the trimming region for a "non-long shot".

As shown in FIG. 9(a), when the frame image is a long-shot image, the trimmed video generation means 7 uses the trimming region setting means 71 to expand the region of interest C_1 upward, downward, leftward, and rightward, thereby setting a trimming region T_1 that matches the aspect ratio of the display device. Trimming region T_2 shows an example in which the region of interest is proportionally tall, as with C_2: the height of the region of interest C_2 is fixed and its width is expanded until the proportions match the aspect ratio.

As shown in FIG. 9(b), when the frame image is a non-long-shot image, the trimmed video generation means 7 uses the trimming region setting means 71 to set as the trimming region T_3 the largest region that has the aspect ratio of the display device, with the center of the frame image as the center of the trimming region. For example, when the original frame image is a 16:9 image, the Hi-Vision aspect ratio, and the display device has the 4:3 aspect ratio of QVGA, a trimming region T_3 from which both sides of the image have been cut is set, as shown in FIG. 9(b).
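The non-long-shot rule is a centred maximum crop, sketched below with integer pixel arithmetic. The function name and the (left, top, width, height) return layout are hypothetical; the 1920 × 1080 frame size in the example is the resolution mentioned elsewhere in the description.

```python
from typing import Tuple

def center_crop(frame_w: int, frame_h: int,
                aspect_w: int, aspect_h: int) -> Tuple[int, int, int, int]:
    """Largest centred rectangle with the display aspect ratio: (left, top, width, height)."""
    if frame_w * aspect_h > frame_h * aspect_w:
        # Frame is wider than the display: keep the full height, cut both sides.
        crop_w, crop_h = frame_h * aspect_w // aspect_h, frame_h
    else:
        # Frame is taller than (or equal to) the display: keep the full width.
        crop_w, crop_h = frame_w, frame_w * aspect_h // aspect_w
    return ((frame_w - crop_w) // 2, (frame_h - crop_h) // 2, crop_w, crop_h)

# 16:9 frame shown on a 4:3 display: 240 pixels are cut from each side.
t3 = center_crop(1920, 1080, 4, 3)
```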

With the trimming processing apparatus 1 configured as described above, when displaying video of a sport played on a field, the apparatus determines whether the shot is a long shot and, if it is, can display the region of interest in which the players and the like are present at a larger size than when the entire frame image is displayed. The trimming processing apparatus 1 can thereby display video with improved visibility of the region of interest on a low-resolution display, even when the video was produced for a high-resolution display (Hi-Vision video).

The trimming processing apparatus 1 can be realized by operating a general-purpose computer with a trimming processing program that causes the computer to function as each of the means described above. The trimming processing program may be stored in advance in a non-volatile memory or the like (not shown) and run from there, or may be acquired via, for example, broadcast waves or a communication line and then run.

[Operation of the trimming processing apparatus]
Next, the operation of the trimming processing apparatus according to the embodiment of the present invention is described with reference to FIG. 6 (and FIGS. 1 to 5 as appropriate for the configuration). FIG. 6 is a flowchart showing the operation of the trimming processing apparatus according to the embodiment of the present invention.

(Video input)
First, the trimming processing apparatus 1 receives video from outside through the video input means 2 (step S1).

(Image reduction/division)
The trimming processing apparatus 1 then uses the image reduction/division means 3 to reduce each frame image of the video and divide it into blocks of a predetermined size (step S2). Specifically, the trimming processing apparatus 1 uses the image reduction means 31 of the image reduction/division means 3 to reduce each frame image of the video, generating, for example, a reduced frame image of 240 × 135 pixels from a frame image with a resolution of 1920 × 1080 pixels. The trimming processing apparatus 1 then uses the image division means 32 to divide the reduced frame image into blocks of a predetermined size (for example, macroblocks of 16 × 16 pixels).

(Image feature extraction)
The trimming processing apparatus 1 then uses the image feature extraction means 4 to calculate, for each block (macroblock) produced in step S2, color information (the color average) and luminance information (the luminance variance) as the image features of the block (step S3).
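Step S3 can be sketched with the standard library alone. The source does not say how luminance is derived from the pixel values, so the BT.601 luma weights used here are an assumption; the image is modelled as a list of rows of (R, G, B) tuples, only whole blocks are processed, and the function name is hypothetical.

```python
from statistics import fmean, pvariance
from typing import Iterator, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)

def block_features(image: List[List[Pixel]], block: int = 16) -> Iterator[tuple]:
    """Yield ((row, col), mean_rgb, luma_variance) for each whole block of the image."""
    height, width = len(image), len(image[0])
    for by in range(height // block):
        for bx in range(width // block):
            pixels = [image[y][x]
                      for y in range(by * block, (by + 1) * block)
                      for x in range(bx * block, (bx + 1) * block)]
            # Color information: average of each RGB channel over the block.
            mean_rgb = tuple(fmean(p[c] for p in pixels) for c in range(3))
            # Luminance information: population variance of per-pixel luma.
            # BT.601 weights are an assumption; the source names no formula.
            luma = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels]
            yield (by, bx), mean_rgb, pvariance(luma)

# Tiny 2 x 4 image with 2 x 2 blocks: a flat block and a high-contrast block.
img = [[(10, 20, 30)] * 2 + [(0, 0, 0), (255, 255, 255)] for _ in range(2)]
features = list(block_features(img, block=2))
```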

(Shot classification)
After that, the trimming processing apparatus 1 uses the shot classification means 5 to calculate the shape complexity and the field occupation ratio of the field block group of a specific color (for example, the color of grass), based on the per-block color information (color average) calculated in step S3 (step S4), and classifies the shot as either a long shot or something else (a non-long shot) (step S5).

Specifically, the trimming processing apparatus 1 uses the block determination means 51 of the shot classification means 5 to determine whether the color information of each block is similar to the color information of the specific color (for example, the color of grass), thereby determining whether each block is a block in which part of the field appears (a field block).

The trimming processing apparatus 1 then uses the shape complexity calculation means 52 of the shot classification means 5 to calculate the shape complexity, which indicates how complex the overall shape of the blocks determined to be field blocks is. The trimming processing apparatus 1 further uses the occupation ratio calculation means 53 of the shot classification means 5 to calculate the ratio of the number of field blocks to the total number of blocks (the field occupation ratio) (step S4).

After that, the trimming processing apparatus 1 uses the shot determination means 54 of the shot classification means 5 to determine that the frame image is a "long shot" image when the shape complexity is smaller than a predetermined threshold and the field occupation ratio is larger than a predetermined threshold, and to determine that it is a "non-long shot" image otherwise (step S5).

(Region-of-interest setting)
When the frame image is determined in step S5 to be a "long shot" image, the trimming processing apparatus 1 uses the region-of-interest setting means 6 to extract blocks of interest based on the per-block luminance information (luminance variance) calculated in step S3 (step S6), and sets a region containing those blocks of interest as the region of interest (step S7).

Specifically, the trimming processing apparatus 1 uses the block-of-interest extraction means 61 of the region-of-interest setting means 6 to extract, as blocks of interest, the blocks whose luminance variance is higher than a predetermined luminance variance (step S6).

The trimming processing apparatus 1 then uses the clustering means 62 of the region-of-interest setting means 6 to cluster the blocks of interest so that blocks lying within a predetermined distance of one another belong to the same class. One or more candidates for the region of interest are thereby formed. The trimming processing apparatus 1 then uses the class selection means 63 of the region-of-interest setting means 6 to select one of the classes as the region of interest according to a predetermined selection criterion (for example, selecting the class closest to the center position of the screen) (step S7).

(Trimming region setting)
The trimming processing apparatus 1 then uses the trimmed video generation means 7 to set, as the trimming region, a region that matches the aspect ratio of the target display device (not shown) (steps S8 and S9), and extracts the trimming region from the frame image and displays it (step S10).

Specifically, the trimming processing apparatus 1 uses the trimming region setting means 71 of the trimmed video generation means 7 to set, as the trimming region, a region that contains at least the region of interest set in step S7 and that matches the aspect ratio of the display device (step S8). When the frame image is determined in step S5 to be a "non-long shot" image, the trimming region setting means 71 sets as the trimming region the largest region that has the aspect ratio of the display device, with the center of the frame image (the center of the screen) as the center of the trimming region (step S9).

The trimming processing apparatus 1 then uses the trimming region extraction means 72 of the trimmed video generation means 7 to extract the trimming region set in step S8 or step S9 from the frame image, generate an image enlarged or reduced according to the resolution of the display device, and display it on the display device (step S10).

Through the above operation, when displaying video of a sport played on a field, the trimming processing apparatus 1 determines whether the shot is a long shot and, if it is, can display the region of interest in which the players and the like are present at a larger size than when the entire frame image is displayed.

FIG. 1 is a block diagram showing the overall configuration of the trimming processing apparatus according to the embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the image reduction/division means of the trimming processing apparatus according to the embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of the shot classification means of the trimming processing apparatus according to the embodiment of the present invention.
FIG. 4 is a block diagram showing the configuration of the region-of-interest setting means of the trimming processing apparatus according to the embodiment of the present invention.
FIG. 5 is a block diagram showing the configuration of the trimmed video generation means of the trimming processing apparatus according to the embodiment of the present invention.
FIG. 6 is a flowchart showing the operation of the trimming processing apparatus according to the embodiment of the present invention.
FIG. 7 is an explanatory diagram for explaining the image reduction/division process.
FIG. 8 is an explanatory diagram for explaining the region-of-interest setting process.
FIG. 9 is an explanatory diagram for explaining the trimming region extraction process.

Explanation of symbols

1 Trimming processing apparatus
2 Video input means
3 Image reduction/division means
31 Image reduction means
32 Image division means
4 Image feature extraction means
5 Shot classification means
51 Block determination means
52 Shape complexity calculation means
53 Occupancy ratio calculation means
54 Shot determination means
6 Attention area setting means
61 Attention block extraction means
62 Clustering means
63 Class selection means
7 Trimming video generation means
71 Trimming area setting means
72 Trimming area extraction means

Claims

1. A trimming processing apparatus for trimming an attention area in video of a sport played on a field and outputting the result to a display device, comprising:
image dividing means for dividing a frame image constituting the video into blocks of a predetermined size;
image feature extraction means for extracting color information and luminance information in each block divided by the image dividing means as image features of the block;
shot classification means for classifying whether a shot of the video is a long shot based on whether the complexity of the shape of the entire field block group exceeds a predetermined reference value, the field block group being formed by field blocks, which are blocks whose color information extracted by the image feature extraction means is similar to predetermined color information of the field;
attention area setting means for setting as the attention area, when the shot classification means classifies the shot as a long shot, an area including attention blocks, which are blocks whose luminance variance value, based on the luminance information of the blocks, is higher than a predetermined luminance variance value; and
trimming video generation means for extracting from the frame image, as a trimming area, an area that includes the attention area set by the attention area setting means and matches the aspect ratio of the display device, and outputting the extracted area.

2. The trimming processing apparatus according to claim 1, further comprising image reduction means for reducing the frame image,
wherein the image dividing means divides the frame image reduced by the image reduction means.

3. The trimming processing apparatus according to claim 1, wherein the shot classification means includes
shape complexity calculating means for calculating the complexity of the shape as the ratio of the perimeter of the shape of the entire field block group to the number of blocks forming the field block group.
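
For illustration, the ratio of perimeter to block count can be sketched on a boolean grid of field blocks. Counting the perimeter in block edges, and treating the frame border as part of the perimeter, are assumptions; `shape_complexity` is a hypothetical name.

```python
def shape_complexity(field):
    """field: list of rows of booleans, True where a block is a field block."""
    rows, cols = len(field), len(field[0])
    count = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not field[r][c]:
                continue
            count += 1
            # Each side facing a non-field block or the frame border
            # contributes one block edge to the perimeter (assumed convention).
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if not inside or not field[nr][nc]:
                    perimeter += 1
    # No field blocks at all: define the complexity as 0.0 (assumption).
    return perimeter / count if count else 0.0
```

Under this convention a compact field region yields a low value, while a region broken up by non-field blocks yields a higher one.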

4. The trimming processing apparatus according to claim 3, wherein the shot classification means further includes
occupation ratio calculating means for calculating, as a field occupation ratio, the ratio of the number of field blocks in a lower area of the frame image to the total number of blocks in the lower area, and
classifies the shot as a long shot when the complexity is larger than a predetermined value and the field occupation ratio is smaller than a predetermined value.
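
The occupation ratio and the two-threshold decision can be sketched as follows. The extent of the lower area (`lower_fraction`) and both threshold parameters are hypothetical; the claim only states that predetermined values are used.

```python
def field_occupancy(field, lower_fraction=0.5):
    """Ratio of field blocks to all blocks in the lower rows of the grid.

    field: list of rows of booleans. lower_fraction is an assumed parameter.
    """
    rows = len(field)
    start = rows - max(1, int(rows * lower_fraction))
    lower = field[start:]
    total = sum(len(row) for row in lower)
    field_blocks = sum(1 for row in lower for block in row if block)
    return field_blocks / total


def is_long_shot(complexity, occupancy, complexity_min, occupancy_max):
    # Decision rule as worded in the claim: complexity above its threshold
    # and field occupation ratio below its threshold.
    return complexity > complexity_min and occupancy < occupancy_max
```

Keeping the ratio and the decision in separate functions lets the thresholds be tuned per sport without touching the block counting.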

5. The trimming processing apparatus according to any one of claims 1 to 4, wherein the attention area setting means includes:
attention block extraction means for extracting, as the attention blocks, blocks whose luminance variance value, based on the luminance information of the blocks, is higher than a predetermined luminance variance value;
clustering means for clustering the attention blocks extracted by the attention block extraction means by grouping those that lie within a predetermined distance of each other; and
class selection means for selecting one class as the attention area according to a predetermined selection criterion when the attention blocks are clustered into a plurality of classes by the clustering means.
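
The three steps can be sketched in a few lines. The Chebyshev distance between block positions and the "largest class wins" criterion are assumptions chosen for illustration, since the claim leaves both the distance measure and the selection criterion open; all function names are hypothetical.

```python
from statistics import pvariance


def attention_blocks(luma_blocks, threshold):
    """luma_blocks maps (row, col) -> list of pixel luminance values."""
    return sorted(pos for pos, px in luma_blocks.items()
                  if pvariance(px) > threshold)


def cluster(blocks, max_dist):
    """Group blocks linked by chains of neighbours within max_dist of each other."""
    classes, unvisited = [], set(blocks)
    while unvisited:
        seed = unvisited.pop()
        group, stack = [seed], [seed]
        while stack:
            r, c = stack.pop()
            # Assumed metric: Chebyshev distance in block units.
            near = {b for b in unvisited
                    if max(abs(b[0] - r), abs(b[1] - c)) <= max_dist}
            unvisited -= near
            group.extend(near)
            stack.extend(near)
        classes.append(sorted(group))
    return classes


def select_class(classes):
    # Assumed selection criterion: the class containing the most blocks.
    return max(classes, key=len)
```

A bounding box around the selected class would then serve as the attention area passed to the trimming step.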

6. A trimming processing program that, in order to trim an attention area in video of a sport played on a field and output the result to a display device, causes a computer to function as:
image dividing means for dividing a frame image constituting the video into blocks of a predetermined size;
image feature extraction means for extracting color information and luminance information in each block divided by the image dividing means as image features of the block;
shot classification means for classifying whether a shot of the video is a long shot based on whether the complexity of the shape of the entire field block group exceeds a predetermined reference value, the field block group being formed by field blocks, which are blocks whose color information extracted by the image feature extraction means is similar to predetermined color information of the field;
attention area setting means for setting as the attention area, when the shot classification means classifies the shot as a long shot, an area including attention blocks, which are blocks whose luminance variance value, based on the luminance information of the blocks, is higher than a predetermined luminance variance value; and
trimming video generation means for extracting from the frame image, as a trimming area, an area that includes the attention area set by the attention area setting means and matches the aspect ratio of the display device, and outputting the extracted area.