JP2022116368A - Video processing device, display device, and video processing method

Info

Publication number: JP2022116368A
Application number: JP2019104433A
Authority: JP
Inventors: 龍昇中村; Tatsunori Nakamura
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2019-06-04
Filing date: 2019-06-04
Publication date: 2022-08-10
Also published as: CN113924785A; WO2020246146A1

Abstract

[Problem] To provide output video with better display quality than conventional techniques. [Solution] In a video processing device (1), a first receiving unit (81) receives an input video, and a second receiving unit (82), which is different from the first receiving unit (81), receives display information corresponding to the input video. The display information includes metadata synchronized with the input video. The metadata includes attention information indicating an attention area of the input video. The video processing device (1) selects one of (i) a first mode in which the input video is processed using the display information and (ii) a second mode in which the input video is processed without using the display information. [Selected drawing] FIG. 1

Description

The following disclosure relates to a video processing device.

In recent years, various techniques have been proposed to further enrich video content. For example, Hybridcast broadcasting systems are under technical development (see Non-Patent Document 1).

In a Hybridcast broadcasting system, an input video and display information corresponding to the input video (e.g., metadata of the input video) are delivered simultaneously to a video processing device over separate paths. By processing the input video on the basis of the display information, the video processing device can provide a more attractive video (output video) to the user (viewer).

Non-Patent Document 1: "Expanding Functions of Broadcasting and Telecommunications Linked Systems: Toward Advanced Hybridcast" (“放送通信連携システムの機能拡張～ハイブリッドキャストの高度化に向けて～”), 大亦寿之, NHK STRL R&D, No. 146, 2014.8

However, as described later, in a broadcasting system that delivers the input video and the display information simultaneously, the video processing device may fail to receive the display information properly. In such a case, the display quality of the output video may deteriorate. The prior art (e.g., Non-Patent Document 1) gives no particular consideration to this problem.

One aspect of the present disclosure has been made in view of the above problem, and its object is to provide output video with better display quality than conventional techniques.

To solve the above problem, a video processing device according to one aspect of the present disclosure is a video processing device that generates an output video by processing an input video, the device including: a first receiving unit that receives the input video; and a second receiving unit that receives display information corresponding to the input video, wherein the second receiving unit is a receiving unit different from the first receiving unit, the display information includes metadata synchronized with the input video, the metadata includes attention information indicating an attention area of the input video, and the video processing device selects one of (i) a first mode in which the input video is processed using the display information and (ii) a second mode in which the input video is processed without using the display information.

To solve the above problem, a video processing method according to one aspect of the present disclosure is a video processing method for generating an output video by processing an input video, the method including: a first receiving step of receiving the input video by a first receiving unit; and a second receiving step of receiving display information corresponding to the input video by a second receiving unit different from the first receiving unit, wherein the display information includes metadata synchronized with the input video, the metadata includes attention information indicating an attention area of the input video, and the video processing method further includes a step of selecting one of (i) a first mode in which the input video is processed using the display information and (ii) a second mode in which the input video is processed without using the display information.

According to one aspect of the present disclosure, output video with better display quality than conventional techniques can be provided.

FIG. 1 is a block diagram showing the configuration of the main part of the display device of Embodiment 1.
FIG. 2 is a diagram explaining the enlargement processing in the ZOOM mode.
FIG. 3 is another diagram explaining the enlargement processing in the ZOOM mode.
FIG. 4 is a diagram illustrating the relationship between the input video and the output video in the ZOOM mode.
FIG. 5 is a block diagram showing the configuration of the main part of a display device of a comparative example.
FIG. 6 is a diagram explaining an example of processing in the comparative example.
FIG. 7 is a sequence diagram illustrating the flow of processing of the video processing device of Embodiment 1.
FIG. 8 is a flowchart illustrating the overall flow of processing of the video processing device of Embodiment 1.
FIG. 9 is a flowchart illustrating the flow of video processing when the ZOOM mode has been selected in advance.
FIG. 10 is a flowchart illustrating the flow of video processing when the TO_ORIGINAL mode has been selected in advance.
FIG. 11 is a diagram explaining the enlargement processing in the TO_ORIGINAL mode.
FIG. 12 is a diagram showing an example of how the zoom-out area changes as PROCESS_CNT increases.
FIG. 13 is a flowchart illustrating the flow of video processing when the ORIGINAL mode has been selected in advance.
FIG. 14 is a diagram illustrating the change of α over time in Embodiment 1.
FIG. 15 is a diagram explaining the effect of Embodiment 1 by comparison with the comparative example.
FIG. 16 is a diagram explaining the effect of Embodiment 1 by comparison with a reference example.
FIG. 17 is a flowchart illustrating the flow of video processing in Embodiment 2 when the TO_ZOOM mode has been selected in advance.
FIG. 18 is a diagram illustrating the change of α over time in Embodiment 2.
FIG. 19 is a diagram illustrating the change of α over time in Embodiment 3.
FIG. 20 is a flowchart illustrating the flow of video processing in Embodiment 3 when the ORIGINAL mode has been selected in advance.
FIG. 21 is a diagram illustrating the change of α over time in Embodiment 4.
FIG. 22 is a diagram illustrating the change of α over time in Embodiment 5.
FIG. 23 is a diagram explaining the effect of Embodiment 5 by comparison with Embodiment 1.

[Embodiment 1]
The display device 100 of Embodiment 1 is described below. For convenience, in the subsequent embodiments, members having the same functions as members described in Embodiment 1 are given the same reference numerals, and their description is not repeated. Description of matters that are the same as in known techniques is also omitted as appropriate.

The device configurations shown in the figures are merely examples for convenience of explanation. The numerical values given below are likewise merely examples. In this specification, "A to B" for two numbers A and B means "A or more and B or less" unless otherwise specified.

(Overview of display device 100)
FIG. 1 is a block diagram showing the configuration of the main part of the display device 100. The display device 100 includes a video processing device 1 and a display unit 90. The video processing device 1 acquires various data, including an input video MOV1, from a broadcasting system 1000. Hereinafter, the input video MOV1 is abbreviated simply as MOV1; other symbols are abbreviated in the same way as appropriate. The video processing device 1 generates an output video MOV2 by processing MOV1 and causes the display unit 90 to display MOV2. MOV1 is also referred to as the original video.

The broadcasting system 1000 delivers MOV1 and display information (information corresponding to MOV1) simultaneously. The broadcasting system 1000 includes (i) a first network NT1 for delivering MOV1 and (ii) a second network NT2 for delivering the display information. The display information is assumed to include metadata synchronized with MOV1 (more specifically, video information metadata). The metadata contains attention information indicating an attention area of MOV1 (more precisely, an attention area of each frame of MOV1; e.g., ATT, described later).

The broadcasting system 1000 in the example of FIG. 1 is a Hybridcast broadcasting system. Accordingly, NT1 and NT2 in Embodiment 1 are a broadcast network and an IP (Internet Protocol) communication network, respectively.

However, the broadcasting system 1000 is not limited to a Hybridcast broadcasting system. It may be any video delivery system that delivers MOV1 and the display information synchronously over separate paths. NT1 and NT2 therefore need only be communication networks different from each other. Likewise, the first receiving unit 81 and the second receiving unit 82 described below need only be receiving units (communication interfaces) different from each other.

NT1 transmits a broadcast wave to the video processing device 1. The broadcast wave is an example of a carrier wave that carries MOV1. NT2 transmits the display information to the video processing device 1 in synchronization with the transmission of MOV1 by NT1.

The video processing device 1 includes a control unit 10, the first receiving unit 81, and the second receiving unit 82. The control unit 10 comprehensively controls each unit of the video processing device 1 (and of the display device 100). The control unit 10 includes a video processing unit 110, a display information reception determination unit 120, a scene change detection unit 130, and a selection unit 140. The operation of each unit of the control unit 10 is described later. The video processing device 1 also includes various storage devices (e.g., a frame memory), which are not shown.

The first receiving unit 81 acquires (receives) MOV1 via NT1. In the example of FIG. 1, the first receiving unit 81 includes a broadcast tuner 810 and a video decoder 820. The broadcast tuner 810 receives the broadcast wave. The video decoder 820 decodes the broadcast wave to obtain (i) MOV1 and (ii) the time stamps of MOV1. In the following description, "time stamp" refers to a time stamp of MOV1 unless otherwise specified. The first receiving unit 81 supplies the acquired MOV1 and time stamps to the control unit 10.

The second receiving unit 82 acquires the display information via NT2. The second receiving unit 82 may be a known IP communication interface. The second receiving unit 82 supplies the acquired display information to the control unit 10.

FIG. 2 is a diagram explaining the enlargement processing in a ZOOM mode (described later). MOV1 in FIG. 2 depicts a scene showing the whole bodies of two people (one scene of the main content described later). The attention area ATT of MOV1 is a partial area of MOV1 that is the target of zoom-in display (focus display). In other words, ATT is the area of MOV1 to which the user (viewer) is intended to pay attention.

In FIG. 2, for convenience of explanation, ATT is shown superimposed on MOV1. In this specification, the shape of ATT is assumed to be geometrically similar to that of MOV1. That is, the aspect ratio of ATT is equal to the (constant) aspect ratio of MOV1. In the example of FIG. 2, ATT is set with the intention of focusing on the faces of the two people in the above scene of MOV1.

In the example of FIG. 2, the video processing unit 110 processes MOV1 on the basis of the attention information to generate MOV2. Specifically, the video processing unit 110 generates, as MOV2, a video in which the ATT of MOV1 is enlarged. That is, the video processing unit 110 generates MOV2 by enlarging MOV1 so as to zoom in on ATT. The resolution of MOV2 may be equal to or different from that of MOV1.

Hereinafter, the enlargement ratio of MOV2 with respect to MOV1 is denoted α. In the example of FIG. 2, α = P, where P is the enlargement ratio in the ZOOM mode and is an arbitrary number greater than 1. The ZOOM mode is the mode, among the first modes described below, in which a video at the same scale as ATT is displayed as MOV2.

FIG. 3 is another diagram explaining the enlargement processing in the ZOOM mode. The relationship between MOV1 and ATT is described below with reference to FIG. 3. As shown in FIG. 3, the horizontal direction (X direction) and the vertical direction (Y direction) of MOV1 are set in advance. The positive X and Y directions are the rightward and downward directions on the page of FIG. 3, respectively. Unless otherwise specified, the same orientation applies to the other figures.

In the following description, the resolution (number of pixels) of MOV1 is assumed to be SIZEX × SIZEY, where SIZEX and SIZEY are the horizontal and vertical resolutions of MOV1, respectively. As an example, when MOV1 is a Full HD (High Definition) video, SIZEX = 1920 and SIZEY = 1080.

As shown in FIG. 3, the upper-left vertex of the four vertices of MOV1 is set as the origin O(0, 0). The lower-right vertex of MOV1 (that is, the vertex diagonally opposite the origin O) is referred to as vertex E, whose coordinates are E(SIZEX−1, SIZEY−1). The origin O and the vertex E can also be described as the start point and the end point of MOV1 on the XY plane.

In the example of FIG. 3, the upper-left vertex of the four vertices of ATT is vertex AS0(xs0, ys0). Vertex AS0 corresponds to the origin O. The lower-right vertex of ATT is vertex AE0(xe0, ye0). Vertex AE0 corresponds to vertex E. In Embodiment 1, the attention information includes information indicating the coordinates of vertices AS0 and AE0, that is, the values of xs0, ys0, xe0, and ye0.
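As an illustration only, the attention information described above could be held in a small record like the following. The class name `AttentionInfo`, its methods, and the 1% tolerance are assumptions made for this sketch and do not appear in the disclosure; only the four coordinate names and the aspect-ratio condition come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionInfo:
    """Corner coordinates of the attention area ATT, in MOV1 pixel coordinates."""
    xs0: int  # X of upper-left vertex AS0
    ys0: int  # Y of upper-left vertex AS0
    xe0: int  # X of lower-right vertex AE0
    ye0: int  # Y of lower-right vertex AE0

    def is_inside(self, size_x: int, size_y: int) -> bool:
        """True if ATT lies within MOV1, whose corners are O(0, 0) and E(size_x - 1, size_y - 1)."""
        return (0 <= self.xs0 < self.xe0 <= size_x - 1
                and 0 <= self.ys0 < self.ye0 <= size_y - 1)

    def is_similar_to(self, size_x: int, size_y: int, tol: float = 0.01) -> bool:
        """True if ATT has approximately the same aspect ratio as MOV1 (tolerance is an assumption)."""
        att_ratio = (self.xe0 - self.xs0) / (self.ye0 - self.ys0)
        mov_ratio = (size_x - 1) / (size_y - 1)
        return abs(att_ratio - mov_ratio) <= tol * mov_ratio


# Hypothetical attention area in the upper middle of a Full HD frame.
att = AttentionInfo(xs0=480, ys0=0, xe0=1439, ye0=539)
```

A receiver might use checks like these to reject malformed metadata before enlarging a frame, although the disclosure does not describe such validation.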

Since the enlargement ratio in the ZOOM mode is P, the following relationships hold for the X direction and the Y direction, respectively:
P × (xe0 − xs0) = SIZEX − 1 … (1A)
P × (ye0 − ys0) = SIZEY − 1 … (1B)

In the ZOOM mode, the video processing unit 110 calculates P from either equation (1A) or (1B) above, and then generates MOV2 by enlarging MOV1 by P.
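The calculation of P and the enlargement can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, a frame is modeled as a list of pixel rows, MOV2 is given the same resolution as MOV1, and nearest-neighbor sampling is used, whereas the disclosure does not specify an interpolation method.

```python
def zoom_ratio(xs0, xe0, size_x):
    """Enlargement ratio P from equation (1A): P * (xe0 - xs0) = SIZEX - 1."""
    return (size_x - 1) / (xe0 - xs0)


def enlarge(frame, xs0, ys0, p):
    """Enlarge `frame` (a list of rows) by p, anchored at ATT's upper-left vertex.

    Output pixel (u, v) samples input pixel (xs0 + u / p, ys0 + v / p),
    rounded to the nearest pixel, so AS0 maps to O and AE0 maps to E.
    """
    size_y, size_x = len(frame), len(frame[0])
    out = []
    for v in range(size_y):
        src_y = min(size_y - 1, round(ys0 + v / p))
        row = [frame[src_y][min(size_x - 1, round(xs0 + u / p))]
               for u in range(size_x)]
        out.append(row)
    return out


# Toy 5x5 frame whose pixel value encodes its position as 10 * y + x.
frame = [[10 * y + x for x in range(5)] for y in range(5)]
p = zoom_ratio(xs0=1, xe0=3, size_x=5)   # (5 - 1) / (3 - 1)
zoomed = enlarge(frame, xs0=1, ys0=1, p=p)
```

In the toy example the attention area spans (1, 1) to (3, 3), so the upper-left output pixel is taken from input pixel (1, 1) and the lower-right output pixel from input pixel (3, 3).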

As an example, ATT is preferably set so that the image of the subject to which the user should pay particular attention in MOV1 (the main subject) is located in the central part of ATT. In many instances of MOV1, the image of the main subject moves over time, so ATT is preferably set to follow that movement. ATT can therefore differ from frame to frame of MOV1.

FIG. 4 is a diagram illustrating the relationship between MOV1 and MOV2 in the ZOOM mode. FIG. 4 illustrates a case in which the metadata (display information) is acquired properly in the ZOOM mode. In this case, the same processing as in FIG. 4 is performed in the comparative example described later. FIG. 4 illustrates a case in which MOV1 switches from a sub-content video (hereinafter, sub-content) to a main content video (hereinafter, main content). In the example of FIG. 4, the sub-content is a CM (commercial) video (hereinafter, CM) and the main content is the video of a TV drama program (hereinafter, drama).

In the example of FIG. 4, the time stamp of frame i (the i-th frame) of MOV1 is denoted time stamp i. The same applies to the other time stamps. Unless otherwise specified, the description given for frame i also applies to the other frames. In the following description, "frame" refers to a frame of MOV1 unless otherwise specified.

In the example of FIG. 4, MOV1 switches from the CM to the drama at frame i+2. The second receiving unit 82 is assumed to be able to acquire the metadata of a given frame of MOV1 at the beginning of that frame (the display start timing of that frame).

In frames i to i+1 (the CM display period), the attention area is set at the center of the screen so that the user pays attention to the main part of the CM displayed in the center of MOV1. Hereinafter, an attention area set in this way is abbreviated as "attention area: center". In contrast, from frame i+2 onward (the drama display period), the attention area is set on the faces of the two people located in the upper part of the MOV1 screen so that the user pays attention to those faces. Hereinafter, an attention area set in this way is abbreviated as "attention area: face".

The video processing unit 110 enlarges frame i using the metadata corresponding to frame i (that is, corresponding to time stamp i). The video processing unit 110 then outputs the enlarged frame i as frame i−1 of MOV2. In this way, MOV2 is generated as a video delayed by one frame relative to MOV1 (see also FIG. 15, described later).
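The pairing of frames with metadata by time stamp, and the frame numbering described above, can be sketched as follows. The function name, the tuple layout, and the dictionary keyed by time stamp are assumptions made for illustration; the disclosure does not specify these data structures.

```python
def process_stream(frames, metadata_by_timestamp):
    """Pair each MOV1 frame with the metadata carrying the same time stamp.

    `frames` is a list of (timestamp, mov1_index) tuples.
    Returns (mov2_index, mov1_index, metadata) tuples, where processed
    MOV1 frame i is output as MOV2 frame i - 1, as in the text.
    Metadata is None when nothing with that time stamp was received.
    """
    out = []
    for timestamp, i in frames:
        meta = metadata_by_timestamp.get(timestamp)
        out.append((i - 1, i, meta))
    return out


# Hypothetical time stamps; only the first frame's metadata has arrived.
frames = [(1000, 5), (1033, 6)]
result = process_stream(frames, {1000: "attention area: center"})
```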

Accordingly, in the example of FIG. 4, the video processing unit 110 generates the corresponding frames of MOV2 by enlarging the attention area of frames i to i+1. It then generates the corresponding frames of MOV2 by enlarging the attention area of frame i+2 and subsequent frames. MOV2 generated in this way can draw the user's attention to different areas during the CM display period and the drama display period.

(Comparative example)
Before the display device 100 (more specifically, the video processing device 1) is described further, a display device 100r is described as a comparative example in order to explain the problems of the prior art. FIG. 5 is a block diagram showing the configuration of the main part of the display device 100r. For convenience, the video processing device of the display device 100r is referred to as a video processing device 1r, and the control unit of the video processing device 1r is referred to as a control unit 10r.

FIG. 6 is a diagram explaining an example of processing in the comparative example. FIG. 6 is the counterpart of FIG. 4. In the comparative example as well, the second receiving unit 82 receives the metadata via NT2. In general, however, the communication state of the second receiving unit 82 (or NT2) is less stable than that of the first receiving unit 81 (or NT1). A state in which the second receiving unit 82 cannot properly receive the metadata can therefore arise.

As an example, consider a case in which the metadata of frames i+1 to i+3 cannot be received, as shown in FIG. 6. In such a case, a frame cannot be enlarged using the metadata corresponding to that frame (the appropriate metadata), so the MOV2 video may be disturbed (the display quality of MOV2 may deteriorate). It is therefore preferable to take measures against metadata reception failures.

For example, as shown in Countermeasure 1, when the metadata of a certain frame (e.g., frame i+1) cannot be received, a blank image (e.g., a black screen) could be inserted into the MOV2 frame corresponding to that frame (e.g., frame i of MOV2). In Countermeasure 1, the video processing device 1r inserts blank images into frames i to i+2 of MOV2.

Countermeasure 1 prevents the enlargement of a part of MOV1 that was never intended to be the attention area. With MOV2 generated by Countermeasure 1, however, the user cannot view some of the scenes originally depicted in MOV1. This dissatisfies users who want to watch MOV1.

To avoid such dissatisfaction, it is preferable not to insert blank images into MOV2. For example, as shown in Countermeasure 2, the most recent successfully received metadata could be applied as-is to each subsequent frame. In Countermeasure 2, the video processing device 1r generates frames i to i+2 of MOV2 by enlarging frames i+1 to i+3 using the metadata of frame i. In Countermeasure 2, therefore, the same enlargement ratio (P) as for frame i is maintained for frames i+1 to i+3.

As described above, the metadata of frame i sets the attention area at the center. Frames i to i+2 of MOV2 are therefore images in which the central part of frames i+1 to i+3 is enlarged. However, as described above, the metadata of frames i+2 to i+3 sets the attention area on the faces. That is, frames i+2 to i+3 are images in which attention was originally intended to be drawn to the faces.

Consequently, when MOV2 is generated by Countermeasure 2, frames i+1 to i+2 of MOV2 focus on an area that was never intended to receive attention (a non-attention area; e.g., the torsos of the two people). Thus Countermeasure 2 can also end up lowering the display quality of MOV1, and it does not sufficiently resolve the dissatisfaction of users who want to watch MOV1.
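The behavior of the two countermeasures in the FIG. 6 scenario can be reproduced with a small simulation. The function names and the string labels are hypothetical; each list entry stands for the attention area applied to one MOV1 frame, from frame i to frame i+4, and `None` marks metadata that was not received.

```python
BLANK = "blank"


def countermeasure_1(received):
    """Insert a blank image whenever the frame's own metadata is missing."""
    return [meta if meta is not None else BLANK for meta in received]


def countermeasure_2(received):
    """Reuse the most recent successfully received metadata."""
    out, last = [], None
    for meta in received:
        if meta is not None:
            last = meta
        out.append(last)
    return out


# Metadata for frames i .. i+4; frames i+1 .. i+3 are lost.
received = ["center", None, None, None, "face"]
# Intended attention areas: center during the CM (i, i+1), face from i+2.
intended = ["center", "center", "face", "face", "face"]

applied_2 = countermeasure_2(received)
# Offsets from frame i at which the applied area differs from the intended one.
wrong_2 = [k for k, (a, b) in enumerate(zip(applied_2, intended)) if a != b]
```

Under these assumptions, Countermeasure 1 blanks the three frames with lost metadata, while Countermeasure 2 applies the wrong area to MOV1 frames i+2 and i+3, which are output as MOV2 frames i+1 and i+2, matching the text.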

(Example of processing of video processing device 1)
The video processing device 1 was newly devised by the inventor of the present application in view of the problems of the comparative example described above. An example of the processing of the video processing device 1 is described below. Unlike the video processing device 1r, the video processing device 1 can operate by switching between a first mode and a second mode.

The first mode is a mode in which MOV1 is processed using the display information. The first mode includes the ZOOM mode described above. In the ZOOM mode, α is kept at P. In contrast, the second mode is a mode in which MOV1 is processed without using the display information. The second mode includes an ORIGINAL mode.

The ORIGINAL mode is a mode that generates MOV2 having the same scale as MOV1 and depicting the same scene as MOV1. As an example, in the ORIGINAL mode, the video processing device 1 assigns frame i of MOV1 as frame i−1 of MOV2. Thus, in the ORIGINAL mode, α is kept at 1 regardless of the passage of time. That is, the original scale of MOV1 is kept constant.

In addition to the ZOOM mode, the first mode includes a TO_ORIGINAL mode. The TO_ORIGINAL mode is a mode in which α is decreased gradually (progressively) from P to 1. The TO_ORIGINAL mode can be regarded as an intermediate mode between the ZOOM mode and the ORIGINAL mode. As described below, providing the TO_ORIGINAL mode effectively prevents deterioration of display quality during the transition from the ZOOM mode to the ORIGINAL mode.
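The values of α in the three modes can be sketched as follows. The text states only that α decreases gradually from P to 1 in the TO_ORIGINAL mode; the linear schedule, the `steps` parameter, and all names below are assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    ZOOM = "ZOOM"                # first mode: alpha is kept at P
    TO_ORIGINAL = "TO_ORIGINAL"  # first mode: alpha decreases from P to 1
    ORIGINAL = "ORIGINAL"        # second mode: alpha is kept at 1


def to_original_alphas(p, steps):
    """Enlargement ratios decreasing from p to 1 in `steps` equal decrements."""
    if p <= 1 or steps < 1:
        raise ValueError("expected p > 1 and steps >= 1")
    return [p - (p - 1) * k / steps for k in range(steps + 1)]


# With P = 2 and four decrements: 2.0, 1.75, 1.5, 1.25, 1.0
alphas = to_original_alphas(2.0, 4)
```

A gradual schedule like this avoids an abrupt jump in scale between consecutive output frames, which is the kind of quality drop the TO_ORIGINAL mode is said to prevent.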

以下、第２受信部８２が表示情報を受信できる状態を、受信可能状態と称する。他方、第２受信部８２が表示情報を受信できない状態を、受信不能状態と称する。表示情報受信判定部１２０は、第２受信部８２の状態（通信状態）が、受信可能状態または受信不能状態のいずれであるかを判定する。表示情報受信判定部１２０は、公知の通信状態診断技術を用いて実現されてよい。表示情報受信判定部１２０の判定結果を示す情報（通信状態情報）を生成し、当該通信状態情報を選択部１４０に供給する。 Hereinafter, a state in which the second receiving unit 82 can receive display information is referred to as a receivable state. On the other hand, a state in which the second receiving unit 82 cannot receive display information is referred to as an unreceivable state. The display information reception determination unit 120 determines whether the state (communication state) of the second receiving unit 82 is the receivable state or the unreceivable state. The display information reception determination unit 120 may be realized using a known communication state diagnosis technique. The display information reception determination unit 120 generates information (communication state information) indicating its determination result and supplies the communication state information to the selection unit 140.

シーンチェンジ検出部１３０は、ＭＯＶ１を解析することにより、当該ＭＯＶ１のシーンチェンジの有無を検出する。シーンチェンジ検出部１３０は、シーンチェンジ検出部１３０の判定結果を示す情報（シーンチェンジ情報）を生成し、当該シーンチェンジ情報を選択部１４０に供給する。シーンチェンジの検出は、公知の手法によって行われてよい。 The scene change detection unit 130 detects whether or not there is a scene change in MOV1 by analyzing MOV1. The scene change detection unit 130 generates information (scene change information) indicating the determination result of the scene change detection unit 130 and supplies the scene change information to the selection unit 140 . Scene change detection may be performed by a known technique.

一例として、シーンチェンジ検出部１３０は、ＭＯＶ１の各フレームについて特徴量を導出する。そして、シーンチェンジ検出部１３０は、あるフレーム（例：フレームｉ）と次フレーム（例：フレームｉ＋１）との間の特徴量の変化に基づき、当該あるフレームと当該次フレームとの間にシーンチェンジが生じたか否かを判定する。一例として、特徴量は、１つのフレームの全画素の輝度の総和であってよい。また、１つのフレームの各画素の輝度の分布を示すヒストグラム（輝度ヒストグラム）に基づき、特徴量が算出されてもよい。 As an example, the scene change detection unit 130 derives a feature amount for each frame of MOV1. Then, based on the change in the feature amount between a certain frame (e.g., frame i) and the next frame (e.g., frame i+1), the scene change detection unit 130 determines whether a scene change has occurred between those two frames. As an example, the feature amount may be the sum of the luminance of all pixels in one frame. Alternatively, the feature amount may be calculated from a histogram (luminance histogram) showing the distribution of the luminance of the pixels in one frame.
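
The feature-based detection described above can be sketched as follows. This is only a minimal illustration, not the method of the scene change detection unit 130 itself: the frame layout (rows of 8-bit luminance values), the function names, the bin count, and the relative threshold `rel_threshold` are all assumptions introduced here.

```python
from typing import List, Sequence

# Assumed layout: a frame is a sequence of rows of 8-bit luminance values.
Frame = Sequence[Sequence[int]]


def luminance_sum(frame: Frame) -> int:
    """Feature amount: the sum of the luminance of all pixels in one frame."""
    return sum(sum(row) for row in frame)


def luminance_histogram(frame: Frame, bins: int = 16) -> List[int]:
    """Alternative basis for a feature: a coarse histogram of luminance values."""
    hist = [0] * bins
    for row in frame:
        for y in row:
            hist[min(y * bins // 256, bins - 1)] += 1
    return hist


def is_scene_change(prev: Frame, cur: Frame, rel_threshold: float = 0.3) -> bool:
    """Return True when the luminance sum changes by more than rel_threshold
    relative to the previous frame (hypothetical decision rule)."""
    a, b = luminance_sum(prev), luminance_sum(cur)
    return abs(b - a) > rel_threshold * max(a, 1)
```

A caller would evaluate `is_scene_change(frame_i, frame_i_plus_1)` once per frame and use the result as the scene change information. A practical detector would likely compare histograms rather than a single sum, since two different scenes can have similar overall brightness.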

選択部１４０は、映像処理装置１のモードを選択する。すなわち、選択部１４０は、第１モードおよび第２モードの一方を選択する。具体的には、選択部１４０は、ＺＯＯＭモード、ＴＯ＿ＯＲＩＧＩＮＡＬモード、およびＯＲＩＧＩＮＡＬモードのうち、１つのモードを選択する。映像処理装置１は、選択部１４０によって選択されたモードに応じて、ＭＯＶ１を処理する（すなわち、ＭＯＶ２を生成する）。 The selection unit 140 selects the mode of the video processing device 1 . That is, the selection unit 140 selects one of the first mode and the second mode. Specifically, the selection unit 140 selects one mode from the ZOOM mode, the TO_ORIGINAL mode, and the ORIGINAL mode. The video processing device 1 processes MOV1 (that is, generates MOV2) according to the mode selected by the selection unit 140 .

上述の通り、メタデータの受信可否（すなわち、第２受信部８２の状態）は、ＭＯＶ２の表示品位に大きく影響する。そこで、選択部１４０は、通信状態情報に基づき、モードを選択することが好ましい。さらに、ＭＯＶ１のシーンチェンジの有無も、ＭＯＶ２の表示品位に大きく影響する。このため、選択部１４０は、シーンチェンジ情報にさらに基づき、モードを選択することが好ましい。 As described above, whether metadata can be received (that is, the state of the second receiving unit 82) greatly affects the display quality of MOV2. Therefore, it is preferable that the selection unit 140 selects the mode based on the communication state information. Furthermore, the presence or absence of a scene change in MOV1 also greatly affects the display quality of MOV2. Therefore, it is preferable that the selection unit 140 selects the mode further based on the scene change information.

以下では、選択部１４０は、通信状態情報およびシーンチェンジ情報の両方に基づき、モードを選択する場合を主に例示する。以下では、映像処理装置１のモードを示すフラグ値を、ｓｔａｔｅとして表す。一例として、ｓｔａｔｅ＝ＺＯＯＭは、映像処理装置１がＺＯＯＭモードで動作していることを示す。その他の表記についても同様である。選択部１４０は、自身が選択したモードに応じて、ｓｔａｔｅを更新する。 Below, the case where the selection unit 140 selects the mode based on both the communication state information and the scene change information will be mainly exemplified. Below, the flag value indicating the mode of the video processing device 1 is expressed as state. As an example, state=ZOOM indicates that the video processing device 1 is operating in ZOOM mode. The same applies to other notations. The selection unit 140 updates the state according to the mode selected by itself.

図７は、映像処理装置１の処理の流れを例示するシーケンス図である。図７については、後述の図８～図１３の各処理も合わせて参照されたい。図７の例では、時間の経過に伴い、ＭＯＶ１のシーンが、「シーン１→シーン２→シーン３」の通り変更される。図７の「メタデータ受信状態」は、第２受信部８２の状態を示す。「メタデータ受信状態：可」は、受信可能状態を示す。これに対し、「メタデータ受信状態：不可」は、受信不能状態を示す。 FIG. 7 is a sequence diagram illustrating the processing flow of the video processing device 1. For FIG. 7, refer also to the processes of FIGS. 8 to 13 described later. In the example of FIG. 7, the scene of MOV1 changes in the order "scene 1 → scene 2 → scene 3" as time passes. "Metadata reception state" in FIG. 7 indicates the state of the second receiving unit 82. "Metadata reception state: possible" indicates the receivable state, whereas "Metadata reception state: not possible" indicates the unreceivable state.

まず、シーン１について説明する。図７の例では、シーン１の先頭において、第２受信部８２によって、表示情報が受信されているものとする。このため、選択部１４０によって、ＺＯＯＭモード（第１モードの一例）が予め選択されているものとする。シーン１において、第２受信部８２が表示情報を受信している期間では、ＺＯＯＭモードがそのまま維持される。ＺＯＯＭモードでは、拡大率ＰのＭＯＶ２が生成される。 First, scene 1 will be described. In the example of FIG. 7, it is assumed that the display information is received by the second receiving unit 82 at the beginning of scene 1 . Therefore, it is assumed that the ZOOM mode (an example of the first mode) has been selected in advance by the selection unit 140 . In scene 1, the ZOOM mode is maintained while the second receiving unit 82 is receiving display information. In the ZOOM mode, MOV2 with magnification P is generated.

図７の例では、シーン１の途中に、第２受信部８２の状態が、受信可能状態から受信不能状態へと遷移している。このような場合にＺＯＯＭモードを維持すると、上述の比較例の通り、表示品位の低いＭＯＶ２が生成されうる。そこで、選択部１４０は、第２受信部８２の状態が受信可能状態から受信不能状態へと遷移した場合には、ＺＯＯＭモードに替えてＴＯ＿ＯＲＩＧＩＮＡＬモードを選択する。ＴＯ＿ＯＲＩＧＩＮＡＬモードでは、αがＰから１まで徐々に減少するように設定される。このため、注目領域が徐々にＭＯＶ１の全体へとズームアウトするように、ＭＯＶ２の各フレームが生成される。 In the example of FIG. 7, the state of the second receiver 82 transitions from the receivable state to the unreceivable state in the middle of scene 1 . If the ZOOM mode is maintained in such a case, MOV2 with low display quality can be generated as in the comparative example described above. Therefore, when the state of the second receiver 82 transitions from the receivable state to the unreceivable state, the selector 140 selects the TO_ORIGINAL mode instead of the ZOOM mode. In the TO_ORIGINAL mode, α is set to gradually decrease from P to 1. Therefore, each frame of MOV2 is generated such that the region of interest gradually zooms out to the entirety of MOV1.

図７の例では、シーン２の途中まで、受信不能状態が継続している。そして、この受信不能状態の継続中に、シーン１からシーン２へのシーンチェンジが生じている。図７に示されるように、選択部１４０は、受信不能状態において、ＭＯＶ１にシーンチェンジが生じたことを契機として、ＯＲＩＧＩＮＡＬモード（第２モード）を選択する。ＯＲＩＧＩＮＡＬモードでは、拡大率１のＭＯＶ２（オリジナルサイズのＭＯＶ２とも称する）が生成される。 In the example of FIG. 7, the unreceivable state continues until the middle of scene 2. The scene change from scene 1 to scene 2 occurs while this unreceivable state continues. As shown in FIG. 7, the selection unit 140 selects the ORIGINAL mode (second mode) when a scene change occurs in MOV1 in the unreceivable state. In the ORIGINAL mode, MOV2 with an enlargement factor of 1 (also referred to as original-size MOV2) is generated.

図７の例では、シーン２の途中で、受信不能状態が解消されている。そして、この受信不能状態の解消後に、シーン２からシーン３へのシーンチェンジが生じている。図７に示されるように、選択部１４０は、受信可能状態において、ＭＯＶ１にシーンチェンジが生じたことを契機として、ＺＯＯＭモード（第１モードの一例）を選択する。 In the example of FIG. 7 , the unreceivable state is canceled in the middle of scene 2 . A scene change from scene 2 to scene 3 occurs after the unreceivable state is resolved. As shown in FIG. 7, the selector 140 selects the ZOOM mode (an example of the first mode) when a scene change occurs in MOV1 in the receivable state.

図８は、映像処理装置１（より具体的には、制御部１０）の処理の全体的な流れを例示するフローチャートである。上述の通り、制御部１０は、ＭＯＶ１のフレーム毎に処理を実行する。このため、制御部１０は、ＭＯＶ１の垂直同期信号（Ｖｓｙｎｃ）がＯＮされたことを契機として、処理を開始する。 FIG. 8 is a flowchart illustrating the overall flow of processing of the video processing device 1 (more specifically, the control unit 10). As described above, the control unit 10 executes processing for each frame of MOV1. Therefore, the control unit 10 starts processing when the vertical synchronization signal (Vsync) of MOV1 is turned ON.

まず、制御部１０は、ＭＯＶ１のタイムスタンプを取得する（Ｓ１）。そして、シーンチェンジ検出部１３０は、ＭＯＶ１を解析することにより、シーンチェンジ情報を生成する（Ｓ２）。以下では、ＭＯＶ１のシーンチェンジの有無を示すフラグ値を、ＳＣＤ＿Ｆｌａｇとして表す。ＳＣＤ＿Ｆｌａｇは、シーンチェンジ情報の一例である。 First, the control unit 10 acquires the time stamp of MOV1 (S1). Then, the scene change detection unit 130 generates scene change information by analyzing MOV1 (S2). Below, a flag value indicating whether or not there is a scene change in MOV1 is represented as SCD_Flag. SCD_Flag is an example of scene change information.

一例として、シーンチェンジ検出部１３０は、ＭＯＶ１にシーンチェンジが生じた場合（シーンチェンジ有の場合）、ＳＣＤ＿Ｆｌａｇ＝Ｔｒｕｅとして、ＳＣＤ＿Ｆｌａｇを設定する。これに対し、シーンチェンジ検出部１３０は、ＭＯＶ１にシーンチェンジが生じていない場合（シーンチェンジ無の場合）、ＳＣＤ＿Ｆｌａｇ＝Ｆａｌｓｅとして、ＳＣＤ＿Ｆｌａｇを設定する。 As an example, when a scene change occurs in MOV1 (when there is a scene change), the scene change detection unit 130 sets SCD_Flag as SCD_Flag=True. On the other hand, the scene change detection unit 130 sets SCD_Flag as SCD_Flag=False when no scene change occurs in MOV1 (when there is no scene change).

続いて、制御部１０は、映像処理装置１のモードに応じて（換言すれば、ｓｔａｔｅに応じて）、ＭＯＶ１を処理する（Ｓ３）。以下、モードに応じた映像処理の例について述べる。 Subsequently, the control unit 10 processes MOV1 according to the mode of the video processing device 1 (in other words, according to the state) (S3). Examples of video processing according to modes are described below.

（ＺＯＯＭモードの例） (Example of ZOOM mode)
図９は、ＺＯＯＭモードが予め選択されている場合の、映像処理の流れを例示するフローチャートである。まず、表示情報受信判定部１２０は、タイムスタンプに応じたメタデータ（すなわち、フレームｉのメタデータ）を取得できたか否かを判定する（Ｓ１１）。 FIG. 9 is a flowchart illustrating the flow of video processing when the ZOOM mode has been selected in advance. First, the display information reception determination unit 120 determines whether or not the metadata corresponding to the time stamp (that is, the metadata of frame i) has been acquired (S11).

フレームｉのメタデータを取得できた場合（Ｓ１１でＹＥＳ）、映像処理部１１０は、当該メタデータに含まれている、フレームｉの注目情報を取得する。つまり、映像処理部１１０は、フレームｉの注目領域を特定する。そして、映像処理部１１０は、当該注目領域をズームインするように、フレームｉを拡大する（Ｓ１２）。すなわち、上述の通り、フレームｉが拡大率Ｐによって拡大された画像が、ＭＯＶ２のフレームｉ－１として生成される。 If the metadata of the frame i can be obtained (YES in S11), the video processing unit 110 obtains attention information of the frame i included in the metadata. That is, the video processing unit 110 identifies the attention area of the frame i. Then, the video processing unit 110 enlarges the frame i so as to zoom in on the attention area (S12). That is, as described above, the image obtained by enlarging the frame i by the enlargement ratio P is generated as the frame i−1 of MOV2.

フレームｉのメタデータを取得できなかった場合（Ｓ１１でＮＯ）、シーンチェンジ検出部１３０は、ＭＯＶ１において（より具体的には、フレームｉにおいて）シーンチェンジが生じたか否か（すなわち、ＳＣＤ＿Ｆｌａｇ＝Ｔｒｕｅであるか否か）を判定する（Ｓ１３）。 If the metadata of frame i could not be acquired (NO in S11), the scene change detection unit 130 determines whether a scene change has occurred in MOV1 (more specifically, in frame i), that is, whether SCD_Flag=True (S13).

ＳＣＤ＿Ｆｌａｇ＝Ｔｒｕｅである場合（Ｓ１３でＹＥＳ）、選択部１４０は、ＯＲＩＧＩＮＡＬモードを選択する。つまり、選択部１４０は、ＺＯＯＭモードからＯＲＩＧＩＮＡＬモードへのモード切替を行う。当該モード切替を契機として、映像処理部１１０は、αを１に設定する（Ｓ１４）。選択部１４０は、ｓｔａｔｅ＝ＯＲＩＧＩＮＡＬとして、ｓｔａｔｅを更新する（Ｓ１５）。 When SCD_Flag=True (YES in S13), the selection unit 140 selects the ORIGINAL mode. That is, the selection unit 140 switches the mode from the ZOOM mode to the ORIGINAL mode. Triggered by the mode switching, the video processing unit 110 sets α to 1 (S14). The selection unit 140 sets state=ORIGINAL and updates the state (S15).

他方、ＳＣＤ＿Ｆｌａｇ＝Ｔｒｕｅでない場合（すなわち、ＳＣＤ＿Ｆｌａｇ＝Ｆａｌｓｅである場合）（Ｓ１３でＮＯ）、選択部１４０は、ＴＯ＿ＯＲＩＧＩＮＡＬモードを選択する。つまり、選択部１４０は、ＺＯＯＭモードからＴＯ＿ＯＲＩＧＩＮＡＬモードへのモード切替を行う。当該モード切替を契機として、映像処理部１１０は、ＰＲＯＣＥＳＳ＿ＣＮＴを０に設定する（つまり、ＰＲＯＣＥＳＳ＿ＣＮＴを初期化する）（Ｓ１６）。 On the other hand, when SCD_Flag is not True (that is, when SCD_Flag is False) (NO in S13), selection unit 140 selects TO_ORIGINAL mode. That is, the selection unit 140 performs mode switching from the ZOOM mode to the TO_ORIGINAL mode. Triggered by the mode switching, the video processing unit 110 sets PROCESS_CNT to 0 (that is, initializes PROCESS_CNT) (S16).

後述するように、ＰＲＯＣＥＳＳ＿ＣＮＴは、ＴＯ＿ＯＲＩＧＩＮＡＬモード（または、実施形態２において述べるＴＯ＿ＺＯＯＭモード）におけるαの設定に用いられるカウント値である。選択部１４０は、ｓｔａｔｅ＝ＴＯ＿ＯＲＩＧＩＮＡＬとして、ｓｔａｔｅを更新する（Ｓ１７）。 As will be described later, PROCESS_CNT is a count value used for setting α in TO_ORIGINAL mode (or TO_ZOOM mode described in Embodiment 2). The selection unit 140 sets state=TO_ORIGINAL and updates the state (S17).
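
The per-frame decisions of FIG. 9 (S11 to S17) can be summarized in a short sketch. The `Mode` enumeration, the function name, and the returned tuple are hypothetical names chosen for illustration. Reporting α = P for the frame on which the TO_ORIGINAL mode is entered is also an assumption, based on the statement that PROCESS_CNT = 0 corresponds to α = P.

```python
from enum import Enum
from typing import Optional, Tuple


class Mode(Enum):
    ZOOM = "ZOOM"
    TO_ORIGINAL = "TO_ORIGINAL"
    ORIGINAL = "ORIGINAL"


def zoom_mode_step(metadata_acquired: bool, scd_flag: bool, p: float
                   ) -> Tuple[Mode, float, Optional[int]]:
    """One frame in ZOOM mode. Returns (next state, alpha, PROCESS_CNT or None)."""
    if metadata_acquired:                    # S11: YES
        return Mode.ZOOM, p, None            # S12: enlarge frame i by P
    if scd_flag:                             # S13: YES
        return Mode.ORIGINAL, 1.0, None      # S14: alpha = 1, S15: state update
    # S13: NO -> S16: PROCESS_CNT = 0, S17: state update.
    # alpha = P on this frame is an assumption (PROCESS_CNT = 0 maps to P).
    return Mode.TO_ORIGINAL, p, 0
```

For example, `zoom_mode_step(False, False, 2.0)` models a frame whose metadata was lost without a scene change, and yields the TO_ORIGINAL state with the counter initialized to 0.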

（ＴＯ＿ＯＲＩＧＩＮＡＬモードの例） (Example of TO_ORIGINAL mode)
図１０は、ＴＯ＿ＯＲＩＧＩＮＡＬモードが予め選択されている場合の、映像処理の流れを例示するフローチャートである。まず、シーンチェンジ検出部１３０は、ＳＣＤ＿Ｆｌａｇ＝Ｔｒｕｅであるか否かを判定する（Ｓ２１）。 FIG. 10 is a flowchart illustrating the flow of video processing when the TO_ORIGINAL mode has been selected in advance. First, the scene change detection unit 130 determines whether SCD_Flag=True (S21).

ＳＣＤ＿Ｆｌａｇ＝Ｔｒｕｅである場合（Ｓ２１でＹＥＳ）、選択部１４０は、ＯＲＩＧＩＮＡＬモードを選択する。つまり、選択部１４０は、ＴＯ＿ＯＲＩＧＩＮＡＬモードからＯＲＩＧＩＮＡＬモードへのモード切替を行う。当該モード切替を契機として、上述のＳ１４・１５と同様の処理が実行される（Ｓ２２・Ｓ２３）。 If SCD_Flag=True (YES in S21), the selection unit 140 selects the ORIGINAL mode. That is, the selection unit 140 performs mode switching from the TO_ORIGINAL mode to the ORIGINAL mode. Triggered by the mode switching, processes similar to those of S14 and S15 described above are executed (S22 and S23).

他方、ＳＣＤ＿Ｆｌａｇ＝Ｔｒｕｅでない場合（Ｓ２１でＮＯ）、表示情報受信判定部１２０は、タイムスタンプに応じたメタデータを取得できたか否かを判定する（Ｓ２４）。Ｓ２４でＹＥＳの場合、選択部１４０は、ＴＯ＿ＺＯＯＭモードを選択する。つまり、選択部１４０は、ＴＯ＿ＯＲＩＧＩＮＡＬモードからＴＯ＿ＺＯＯＭモードへのモード切替を行う。選択部１４０は、ｓｔａｔｅ＝ＴＯ＿ＺＯＯＭとして、ｓｔａｔｅを更新する（Ｓ２５）。 On the other hand, if SCD_Flag is not True (NO in S21), the display information reception determination unit 120 determines whether the metadata corresponding to the time stamp has been acquired (S24). If YES in S24, the selection unit 140 selects the TO_ZOOM mode. That is, the selection unit 140 switches the mode from the TO_ORIGINAL mode to the TO_ZOOM mode. The selection unit 140 updates state to state=TO_ZOOM (S25).

Ｓ２４でＮＯの場合、選択部１４０は、ＰＲＯＣＥＳＳ＿ＣＮＴ＜ＭＡＸ＿ＣＮＴであるか否かを判定する（Ｓ２６）。ＭＡＸ＿ＣＮＴは、ＴＯ＿ＯＲＩＧＩＮＡＬモードにおいて予め設定されている、ＰＲＯＣＥＳＳ＿ＣＮＴの最大値である。ＭＡＸ＿ＣＮＴは、拡大率がＰから１に戻るまでのフレーム数とも表現できる。一例として、実施形態１では、ＭＡＸ＿ＣＮＴ＝６００として設定されている。 In the case of NO in S24, the selector 140 determines whether or not PROCESS_CNT<MAX_CNT (S26). MAX_CNT is the maximum value of PROCESS_CNT preset in TO_ORIGINAL mode. MAX_CNT can also be expressed as the number of frames until the expansion rate returns from P to 1. As an example, in Embodiment 1, it is set as MAX_CNT=600.

ＰＲＯＣＥＳＳ＿ＣＮＴ＜ＭＡＸ＿ＣＮＴである場合（Ｓ２６でＹＥＳ）、映像処理部１１０は、現在のＰＲＯＣＥＳＳ＿ＣＮＴに応じたαを設定する。換言すれば、映像処理部１１０は、αを徐々に減少させる（Ｓ２７）。そして、映像処理部１１０は、フレームｉを拡大率αによって拡大した画像を、ＭＯＶ２のフレームｉ－１として生成する。続いて、映像処理部１１０は、ＰＲＯＣＥＳＳ＿ＣＮＴを１だけインクリメント（カウントアップ）する（Ｓ２８）。 If PROCESS_CNT<MAX_CNT (YES in S26), the video processing unit 110 sets α according to the current PROCESS_CNT. In other words, the video processing unit 110 gradually decreases α (S27). Then, the video processing unit 110 generates an image obtained by enlarging the frame i by the enlargement factor α as the frame i−1 of MOV2. Subsequently, the video processing unit 110 increments (counts up) PROCESS_CNT by 1 (S28).

以上のように、ＴＯ＿ＯＲＩＧＩＮＡＬモードでは、ＰＲＯＣＥＳＳ＿ＣＮＴが、初期値（０）から１ずつインクリメントされる。そして、最終的には、ＰＲＯＣＥＳＳ＿ＣＮＴ＝ＭＡＸ＿ＣＮＴとなる。すなわち、ＰＲＯＣＥＳＳ＿ＣＮＴ＜ＭＡＸ＿ＣＮＴという条件が満たされなくなる（Ｓ２６でＮＯ）。 As described above, in the TO_ORIGINAL mode, PROCESS_CNT is incremented by one from the initial value (0). Finally, PROCESS_CNT=MAX_CNT. That is, the condition PROCESS_CNT<MAX_CNT is no longer satisfied (NO in S26).

Ｓ２６でＮＯの場合、Ｓ２２に進む。すなわち、選択部１４０は、ＰＲＯＣＥＳＳ＿ＣＮＴがＭＡＸ＿ＣＮＴ（最大値）まで増加したことを契機として、ＯＲＩＧＩＮＡＬモードを選択する。このように、選択部１４０は、ＰＲＯＣＥＳＳ＿ＣＮＴがＭＡＸ＿ＣＮＴまで増加した場合（すなわち、αが１まで減少した場合）にも、ＴＯ＿ＯＲＩＧＩＮＡＬモードからＯＲＩＧＩＮＡＬモードへのモード切替を行う。 If NO in S26, proceed to S22. That is, the selector 140 selects the ORIGINAL mode when PROCESS_CNT increases to MAX_CNT (maximum value). In this way, the selection unit 140 also switches from the TO_ORIGINAL mode to the ORIGINAL mode when PROCESS_CNT increases to MAX_CNT (that is, when α decreases to 1).
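
The branch order of FIG. 10 (S21 to S28) can be sketched as below. The `Mode` enumeration and function name are hypothetical, and the rendering of the frame in S27 is left as a comment because it depends on how α is derived from PROCESS_CNT.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    TO_ORIGINAL = "TO_ORIGINAL"
    TO_ZOOM = "TO_ZOOM"
    ORIGINAL = "ORIGINAL"


def to_original_step(scd_flag: bool, metadata_acquired: bool,
                     process_cnt: int, max_cnt: int = 600) -> Tuple[Mode, int]:
    """One frame in TO_ORIGINAL mode. Returns (next state, PROCESS_CNT)."""
    if scd_flag:                                  # S21: YES
        return Mode.ORIGINAL, process_cnt         # S22: alpha = 1, S23
    if metadata_acquired:                         # S24: YES
        return Mode.TO_ZOOM, process_cnt          # S25
    if process_cnt < max_cnt:                     # S26: YES
        # S27: set alpha from the current PROCESS_CNT and enlarge frame i.
        return Mode.TO_ORIGINAL, process_cnt + 1  # S28: count up
    return Mode.ORIGINAL, process_cnt             # S26: NO -> S22, S23
```

With neither a scene change nor recovered metadata, the function stays in TO_ORIGINAL for `max_cnt` frames and then switches to ORIGINAL on the next call, which matches the description of MAX_CNT as the number of frames until the enlargement factor returns from P to 1.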

図１１は、ＴＯ＿ＯＲＩＧＩＮＡＬモードにおける拡大処理について説明する図である。ＴＯ＿ＯＲＩＧＩＮＡＬモードでは、αがＰから徐々に減少する。すなわち、ＴＯ＿ＯＲＩＧＩＮＡＬモードでは、ＡＴＴが、対応ズーム領域ＺＴＴへと縮小される。換言すれば、ＴＯ＿ＯＲＩＧＩＮＡＬモードは、ＭＯＶ１が、Ｐ以下の拡大率によって、ＺＴＴへと拡大されるモードである。 FIG. 11 is a diagram illustrating enlargement processing in the TO_ORIGINAL mode. In TO_ORIGINAL mode, α is gradually decreased from P. That is, in TO_ORIGINAL mode, ATT is reduced to the corresponding zoom area ZTT. In other words, the TO_ORIGINAL mode is a mode in which MOV1 is expanded to ZTT by an expansion factor of P or less.

図１１の例において、ＺＴＴの４つの頂点のうち左上の頂点を、頂点ＺＳ（ｘｓ，ｙｓ）とする。頂点ＺＳは、原点Ｏおよび頂点ＡＳ０に対応する。また、ＺＴＴの４つの頂点のうち右下の頂点を、頂点ＺＥ（ｘｅ，ｙｅ）とする。頂点ＺＥは、頂点Ｅおよび頂点ＡＥ０に対応する。 In the example of FIG. 11, let the upper left vertex of the four vertices of ZTT be the vertex ZS (xs, ys). Vertex ZS corresponds to origin O and vertex AS0. Also, let the lower right vertex of the four vertices of ZTT be vertex ZE (xe, ye). Vertex ZE corresponds to vertex E and vertex AE0.

ＴＯ＿ＯＲＩＧＩＮＡＬモードでは、映像処理部１１０は、受信に成功した最新のメタデータに含まれる注目情報を用いて、ＡＴＴを特定する（すなわち、ｘｓ０、ｙｓ０、ｘｅ０、およびｙｅ０を特定する）。そして、映像処理部１１０は、式（２Ａ）～（２Ｄ）に基づき、ｘｓ、ｙｓ、ｘｅ、およびｙｅを算出する。 In the TO_ORIGINAL mode, the video processing unit 110 identifies ATT (that is, identifies xs0, ys0, xe0, and ye0) using the attention information included in the latest successfully received metadata. Then, the video processing unit 110 calculates xs, ys, xe, and ye based on equations (2A) to (2D). [Equations (2A) to (2D) are not reproduced in this text.]

また、ＴＯ＿ＯＲＩＧＩＮＡＬモードにおける拡大率αについて、以下の式（３Ａ）および（３Ｂ）の関係が成立する。 Also, the enlargement factor α in the TO_ORIGINAL mode satisfies the following relationships (3A) and (3B):
α×(xe−xs)=SIZEX−1 …(3A)
α×(ye−ys)=SIZEY−1 …(3B)

そこで、ＴＯ＿ＯＲＩＧＩＮＡＬモードにおいて、映像処理部１１０は、上述の式（３Ａ）または（３Ｂ）のいずれかに基づき、αを算出する。そして、映像処理部１１０は、当該αを用いてＭＯＶ１を拡大することにより、ＭＯＶ２を生成する。 Therefore, in the TO_ORIGINAL mode, the video processing unit 110 calculates α based on either formula (3A) or (3B) above. Then, the video processing unit 110 generates MOV2 by enlarging MOV1 using the α.

上述の式（２Ａ）～（２Ｄ）に示される通り、ＰＲＯＣＥＳＳ＿ＣＮＴ＝０である場合、ＺＴＴはＡＴＴと同一の領域となる。このため、映像処理部１１０は、αをＰ（ＴＯ＿ＯＲＩＧＩＮＡＬモードにおける最大値）に設定する。 As shown in equations (2A)-(2D) above, when PROCESS_CNT=0, ZTT is the same region as ATT. Therefore, the video processing unit 110 sets α to P (the maximum value in the TO_ORIGINAL mode).

また、ＰＲＯＣＥＳＳ＿ＣＮＴ＝ＭＡＸ＿ＣＮＴである場合、ＺＴＴはＭＯＶ１と同一の領域となる。このため、映像処理部１１０は、αを１（ＴＯ＿ＯＲＩＧＩＮＡＬモードにおける最小値）に設定する。 Also, when PROCESS_CNT=MAX_CNT, ZTT is the same area as MOV1. Therefore, the video processing unit 110 sets α to 1 (the minimum value in the TO_ORIGINAL mode).

さらに、上述の式（３Ａ）～（３Ｂ）からも理解されるように、ＰＲＯＣＥＳＳ＿ＣＮＴが大きくなるにつれて、αは減少する。つまり、ＡＴＴが徐々にズームアウトされるように、ＺＴＴが設定される。図１２は、ＰＲＯＣＥＳＳ＿ＣＮＴの増加に応じたＺＴＴの変化の一例を示す図である。図１２の例では、ＰＲＯＣＥＳＳ＿ＣＮＴ＝０～４の場合が例示されている。図１２からも、ＰＲＯＣＥＳＳ＿ＣＮＴの増加に伴い、αが減少することが理解できる。 Furthermore, as can be seen from equations (3A)-(3B) above, α decreases as PROCESS_CNT increases. That is, ZTT is set such that ATT is gradually zoomed out. FIG. 12 is a diagram showing an example of changes in ZTT according to an increase in PROCESS_CNT. In the example of FIG. 12, PROCESS_CNT=0 to 4 are illustrated. Also from FIG. 12, it can be understood that α decreases as PROCESS_CNT increases.

以上のように、ＴＯ＿ＯＲＩＧＩＮＡＬモードでは、αは、最大値としてＰをとり、かつ、最小値として１をとる、ＰＲＯＣＥＳＳ＿ＣＮＴの単調減少関数として設定されている。実施形態１の例では、αは、ＰＲＯＣＥＳＳ＿ＣＮＴの増加に応じて（換言すれば、時刻ｔの増加に応じて）、線形的に減少するように設定されている。 As described above, in the TO_ORIGINAL mode, α is set as a monotonically decreasing function of PROCESS_CNT with P as the maximum value and 1 as the minimum value. In the example of Embodiment 1, α is set to decrease linearly as PROCESS_CNT increases (in other words, as time t increases).
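
The relationship between PROCESS_CNT, α, and the size of ZTT can be illustrated numerically. The sketch below takes the stated linear decrease literally (α = P at PROCESS_CNT = 0 and α = 1 at PROCESS_CNT = MAX_CNT) and then applies relationships (3A) and (3B) to obtain the width and height of ZTT. The function names are hypothetical, and the device's actual computation follows equations (2A) to (2D), which this sketch does not attempt to reproduce.

```python
from typing import Tuple


def alpha_from_count(process_cnt: int, max_cnt: int, p: float) -> float:
    """Linear ramp of the enlargement factor: P at 0, 1 at MAX_CNT (assumed form)."""
    if not 0 <= process_cnt <= max_cnt:
        raise ValueError("process_cnt must lie in [0, max_cnt]")
    return p - (p - 1.0) * process_cnt / max_cnt


def zoom_region_size(alpha: float, size_x: int, size_y: int) -> Tuple[float, float]:
    """Width (xe - xs) and height (ye - ys) of ZTT from (3A) and (3B)."""
    return (size_x - 1) / alpha, (size_y - 1) / alpha
```

For P = 2 and MAX_CNT = 600, the ramp gives α = 2.0, 1.5, and 1.0 at PROCESS_CNT = 0, 300, and 600. For a 1920×1080 input, the corresponding ZTT grows from 959.5×539.5 at α = 2 to 1919×1079 at α = 1, that is, to the whole of MOV1.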

上述の説明から理解されるように、ＴＯ＿ＯＲＩＧＩＮＡＬモードでは、ＭＡＸ＿ＣＮＴが大きく設定されている場合には、低速で（緩やかに）ズームアウトが行われる。他方、ＭＡＸ＿ＣＮＴが小さく設定されている場合には、高速でズームアウトが行われる。 As can be understood from the above description, in the TO_ORIGINAL mode, zooming out is performed slowly (gradually) when MAX_CNT is set large. On the other hand, zooming out is performed quickly when MAX_CNT is set small.

一般的な映像コンテンツ（例：映画またはドラマ）では、多くの場合、シーンチェンジの間隔は、５～１０秒程度である。このため、ＴＯ＿ＯＲＩＧＩＮＡＬモードの持続期間が５～１０秒程度に設定されていれば、ＭＯＶ２を鑑賞するユーザに違和感に与える可能性を少なくできると期待される。 In general video content (eg movies or dramas), the interval between scene changes is often about 5 to 10 seconds. Therefore, if the duration of the TO_ORIGINAL mode is set to about 5 to 10 seconds, it is expected that the possibility of causing discomfort to the user watching MOV2 can be reduced.

以上の点を踏まえ、一例として、ＭＯＶ１のフレームレートが６０Ｈｚである場合には、ＭＡＸ＿ＣＮＴは、３００～６００程度に設定されることが好ましい。ＭＡＸ＿ＣＮＴは、ＭＯＶ１の仕様等を考慮し、適宜設定されてよい。 Based on the above points, as an example, when the frame rate of MOV1 is 60 Hz, MAX_CNT is preferably set to about 300-600. MAX_CNT may be appropriately set in consideration of the specifications of MOV1.
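
The suggested range follows from multiplying the frame rate by the desired duration of the TO_ORIGINAL mode. A one-line helper (a hypothetical name, shown only to make the arithmetic explicit) is:

```python
def max_cnt_for(frame_rate_hz: float, duration_s: float) -> int:
    """Number of frames spanned by a transition of duration_s seconds."""
    return round(frame_rate_hz * duration_s)
```

At 60 Hz, a duration of 5 seconds corresponds to 300 frames and a duration of 10 seconds to 600 frames, which gives the range of about 300 to 600 stated above.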

（ＯＲＩＧＩＮＡＬモードの例） (Example of ORIGINAL mode)
図１３は、ＯＲＩＧＩＮＡＬモードが予め選択されている場合の、映像処理の流れを例示するフローチャートである。 FIG. 13 is a flowchart illustrating the flow of video processing when the ORIGINAL mode has been selected in advance.

まず、表示情報受信判定部１２０は、タイムスタンプに応じたメタデータを取得できたか否かを判定する。また、シーンチェンジ検出部１３０は、ＭＯＶ１のシーンチェンジの有無を判定する。ＯＲＩＧＩＮＡＬモードでは、制御部１０は、「タイムスタンプに応じたメタデータを取得できており、かつ、ＳＣＤ＿Ｆｌａｇ＝Ｔｒｕｅであるか否か」を判定する（Ｓ３１）。 First, the display information reception determination unit 120 determines whether metadata corresponding to the time stamp has been acquired. Also, the scene change detection unit 130 determines whether or not there is a scene change in MOV1. In the ORIGINAL mode, the control unit 10 determines "whether metadata corresponding to the time stamp has been acquired and SCD_Flag=True" (S31).

Ｓ３１でＮＯの場合、すなわち、（ｉ）「タイムスタンプに応じたメタデータを取得できなかった場合」、および、（ｉｉ）「ＳＣＤ＿Ｆｌａｇ＝Ｆａｌｓｅであった場合」、の少なくとも一方の場合には、ＯＲＩＧＩＮＡＬモードがそのまま維持される。 If NO in S31, that is, in at least one of the cases where (i) the metadata corresponding to the time stamp could not be acquired and (ii) SCD_Flag=False, the ORIGINAL mode is maintained as is.

他方、Ｓ３１でＹＥＳの場合、選択部１４０は、ＺＯＯＭモードを選択する。つまり、選択部１４０は、ＯＲＩＧＩＮＡＬモードからＺＯＯＭモードへのモード切替を行う。当該モード切替を契機として、上述のＳ１２と同様の処理が実行される（Ｓ３２）。そして、選択部１４０は、ｓｔａｔｅ＝ＺＯＯＭとして、ｓｔａｔｅを更新する（Ｓ３３）。 On the other hand, if S31 is YES, the selection unit 140 selects the ZOOM mode. That is, the selection unit 140 switches the mode from the ORIGINAL mode to the ZOOM mode. Triggered by the mode switching, a process similar to that of S12 described above is executed (S32). Then, the selection unit 140 sets state=ZOOM and updates the state (S33).
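
The condition of S31 and the resulting transition of FIG. 13 can be sketched as follows. The `Mode` enumeration and function name are hypothetical names used only for illustration.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    ORIGINAL = "ORIGINAL"
    ZOOM = "ZOOM"


def original_mode_step(metadata_acquired: bool, scd_flag: bool, p: float
                       ) -> Tuple[Mode, float]:
    """One frame in ORIGINAL mode. Returns (next state, alpha)."""
    if metadata_acquired and scd_flag:   # S31: YES (both conditions hold)
        return Mode.ZOOM, p              # S32: enlarge by P, S33: state update
    return Mode.ORIGINAL, 1.0            # S31: NO, ORIGINAL mode is maintained
```

Because both conditions must hold, recovered metadata alone does not leave the ORIGINAL mode. The switch back to the ZOOM mode waits for the next scene change, as in the transition from scene 2 to scene 3 in FIG. 7.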

（効果） (Effects)
上述の通り、従来技術では、メタデータの受信に失敗した場合の対策について、何ら言及されていない。すなわち、従来技術では、「第１モードと第２モードとを切り替える」という技術的思想についても、何ら考慮されていない。これに対し、映像処理装置１によれば、第１モードと第２モードとを切り替えることで、状況に応じた（例：第２受信部の状態に応じた）、適切な映像処理を行うことができる。それゆえ、映像処理装置１によれば、従来に比べて表示品位の高いＭＯＶ２を提供できる。 As described above, the prior art does not mention any measures to be taken when metadata reception fails. That is, the prior art gives no consideration to the technical idea of "switching between the first mode and the second mode". In contrast, by switching between the first mode and the second mode, the video processing device 1 can perform video processing appropriate to the situation (e.g., to the state of the second receiving unit). Therefore, the video processing device 1 can provide MOV2 with higher display quality than the conventional art.

続いて、図１４および図１５を参照し、映像処理装置１のさらなる効果について述べる。図１４は、実施形態１におけるαの時間変化について例示する図である。図１４のグラフの横軸はｔ（時刻）を、縦軸はα（拡大率）をそれぞれ表す。図１４の例では、簡単のために、ＭＯＶ１のシーンチェンジは生じないものとする。以下、図１４の例を、事例１とも称する。 Next, further effects of the video processing device 1 will be described with reference to FIGS. 14 and 15. FIG. 14 is a diagram illustrating the change of α over time in Embodiment 1. The horizontal axis of the graph in FIG. 14 represents t (time), and the vertical axis represents α (enlargement factor). In the example of FIG. 14, for the sake of simplicity, it is assumed that no scene change occurs in MOV1. The example of FIG. 14 is also referred to as case 1 below.

事例１では、初期時刻（ｔ＝０）からｔ１１まで、受信可能状態が持続している。このため、初期時刻からｔ１１までは、ＺＯＯＭモードが選択される。ＺＯＯＭモードでは、αはＰに維持される。 In Case 1, the receivable state continues from the initial time (t=0) to t11. Therefore, the ZOOM mode is selected from the initial time to t11. In ZOOM mode, α is kept at P.

そして、事例１では、ｔ１１において、第２受信部８２が、受信可能状態から受信不能状態へと遷移している。この受信不能状態は、復旧されずにそのまま持続するものとする。ｔ１１において、第２受信部８２が受信不能状態に至った（メタデータの受信に失敗した）ことを契機として、ＴＯ＿ＯＲＩＧＩＮＡＬモードが選択される。従って、ｔ１１以降では、ｔの増加に応じて、αが線形的に減少する。但し、ＴＯ＿ＯＲＩＧＩＮＡＬモードにおけるαの減少態様は、図１４の例に限定されない。例えば、αは、ｔの増加に応じて非線形的に減少するように設定されてもよい。 In Case 1, at t11, the second receiver 82 transitions from the receivable state to the unreceivable state. It is assumed that this unreceivable state continues without recovery. At t11, the TO_ORIGINAL mode is selected triggered by the fact that the second receiving unit 82 has reached the unreceivable state (metadata reception has failed). Therefore, after t11, α linearly decreases as t increases. However, the mode of decreasing α in the TO_ORIGINAL mode is not limited to the example of FIG. For example, α may be set to decrease non-linearly as t increases.

事例１では、ＴＯ＿ＯＲＩＧＩＮＡＬモードにおけるαの減少の結果、ｔ１２において、αが１（最小値）に減少した。このため、ｔ１２において、αが１となったことを契機として、ＯＲＩＧＩＮＡＬモードが選択される。ＯＲＩＧＩＮＡＬモードでは、αは１に維持される。 In case 1, the decrease of α in TO_ORIGINAL mode resulted in α decreasing to 1 (minimum value) at t12. Therefore, the ORIGINAL mode is selected when α becomes 1 at t12. In ORIGINAL mode, α is kept at 1.

図１４から理解されるように、ＺＯＯＭモードからＯＲＩＧＩＮＡＬモードへの切替を直接的に行った場合（後述の参考例の場合）には、αがＰから１へと瞬時的に変化する。このため、ＺＯＯＭモードからＯＲＩＧＩＮＡＬモードへの切替時に、ＭＯＶ２の表示品位が低下しうる。例えば、当該切替時に、ＭＯＶ２を鑑賞するユーザに違和感を与えてしまう場合がある。 As can be understood from FIG. 14, when the ZOOM mode is directly switched to the ORIGINAL mode (in the case of the reference example described later), α changes from P to 1 instantaneously. Therefore, the display quality of MOV2 may deteriorate when switching from the ZOOM mode to the ORIGINAL mode. For example, at the time of switching, the user viewing MOV2 may feel uncomfortable.

他方、ＺＯＯＭモードとＯＲＩＧＩＮＡＬモードとの間に、ＴＯ＿ＯＲＩＧＩＮＡＬモードを介在させることによって、時間の経過に応じてαを滑らかに減少させることができる。このように、ＴＯ＿ＯＲＩＧＩＮＡＬモードを介して、ＺＯＯＭモードからＯＲＩＧＩＮＡＬモードへの切替を行うことにより、ＯＲＩＧＩＮＡＬモードへの切替時に、表示品位がより高いＭＯＶ２（上記違和感がより少ない映像）を提供できる。 On the other hand, by interposing the TO_ORIGINAL mode between the ZOOM mode and the ORIGINAL mode, α can be smoothly decreased over time. In this way, by switching from the ZOOM mode to the ORIGINAL mode via the TO_ORIGINAL mode, it is possible to provide MOV2 with higher display quality (video with less discomfort) when switching to the ORIGINAL mode.

図１５は、図６（比較例）との対比によって映像処理装置１の効果を説明する図である。図１５に示されるように、映像処理装置１では、映像処理装置１ｒとは異なり、フレームｉ＋１のメタデータの受信に失敗したことを契機として、ＺＯＯＭモードからＴＯ＿ＯＲＩＧＩＮＡＬモードへのモード切替が行われる。 FIG. 15 is a diagram for explaining the effect of the video processing device 1 in comparison with FIG. 6 (comparative example). As shown in FIG. 15, unlike the video processing device 1r, the video processing device 1 switches from the ZOOM mode to the TO_ORIGINAL mode when it fails to receive the metadata of the frame i+1.

また、映像処理装置１では、フレームｉ＋２（受信不能状態下の１つのフレーム）においてＭＯＶ１のシーンチェンジが生じたことを契機として、ＴＯ＿ＯＲＩＧＩＮＡＬモードからＯＲＩＧＩＮＡＬモードへのモード切替が行われる。このため、上述の図１４からも理解される通り、映像処理装置１ｒに比べても、ＭＯＶ２の表示品位を向上させることができる。 Also, in the video processing apparatus 1, mode switching from the TO_ORIGINAL mode to the ORIGINAL mode is performed when a scene change of MOV1 occurs in frame i+2 (one frame in the unreceivable state). Therefore, as can be understood from FIG. 14 described above, the display quality of the MOV2 can be improved even compared to the video processing device 1r.

（参考例との対比） (Comparison with reference example)
図１６は、参考例との対比によって映像処理装置１の効果を説明する図である。参考例では、実施形態１とは異なり、第１モードにＴＯ＿ＯＲＩＧＩＮＡＬモードが含まれていない。このため、参考例では、ＺＯＯＭモードとＯＲＩＧＩＮＡＬモードとの間での切り替えのみが行われる。 FIG. 16 is a diagram for explaining the effect of the video processing device 1 by comparison with the reference example. In the reference example, unlike Embodiment 1, the TO_ORIGINAL mode is not included in the first mode. Therefore, in the reference example, only switching between the ZOOM mode and the ORIGINAL mode is performed.

図１６の例においても、図１５の例と同様に、受信不能状態下においてＭＯＶ１のシーンチェンジが生じている。参考例では、当該シーンチェンジを契機として、ＺＯＯＭモードからＯＲＩＧＩＮＡＬモードへのモード切替が行われる。このため、参考例では、当該切替時にαがＰから１に瞬時的に変化したことに起因し、ＭＯＶ２のフラッシング（flashing）（ちらつき）が生じうる。 In the example of FIG. 16, similarly to the example of FIG. 15, the scene change of MOV1 occurs under the unreceivable state. In the reference example, mode switching from the ZOOM mode to the ORIGINAL mode is performed with the scene change as a trigger. Therefore, in the reference example, flashing (flickering) of MOV2 may occur due to the momentary change of α from P to 1 at the time of switching.

これに対し、実施形態１の映像処理装置１によれば、上述の通り、上記シーンチェンジに先立ち、メタデータの受信失敗を契機として、ＺＯＯＭモードからＴＯ＿ＯＲＩＧＩＮＡＬモードへのモード切替が行われる。ＴＯ＿ＯＲＩＧＩＮＡＬモードを設けることにより、参考例とは異なり、αの瞬時的な変化が生じることを防止できる。すなわち、フラッシングを有効に防止できる。 In contrast, according to the video processing device 1 of Embodiment 1, as described above, the mode is switched from the ZOOM mode to the TO_ORIGINAL mode when metadata reception fails, before the scene change occurs. By providing the TO_ORIGINAL mode, unlike the reference example, an instantaneous change in α can be prevented. That is, flashing can be effectively prevented.

一般に、第２受信部が受信不能状態に至るタイミングについては、事前に予測することが難しい。また、ＭＯＶ１にシーンチェンジが生じるタイミングも、当該ＭＯＶ１の内容によって相違する。しかしながら、実施形態１によれば、ＴＯ＿ＯＲＩＧＩＮＡＬモードを設けることにより、上述の各タイミングの予測を要することなく、フラッシングを防止することが可能となる。 In general, it is difficult to predict in advance when the second receiving unit will enter the unreceivable state. The timing at which a scene change occurs in MOV1 also differs depending on the content of MOV1. However, according to Embodiment 1, providing the TO_ORIGINAL mode makes it possible to prevent flashing without having to predict either of these timings.

〔実施形態２〕 [Embodiment 2]
実施形態２では、ＴＯ＿ＺＯＯＭモードについて説明する。ＴＯ＿ＺＯＯＭモードは、ＴＯ＿ＯＲＩＧＩＮＡＬモードと対になるモードである。ＴＯ＿ＺＯＯＭモードでは、ＴＯ＿ＯＲＩＧＩＮＡＬモードとは異なり、αが徐々に増加するように設定される。ＴＯ＿ＺＯＯＭモードも、第１モードに含まれる。 Embodiment 2 describes the TO_ZOOM mode. The TO_ZOOM mode is the counterpart of the TO_ORIGINAL mode. In the TO_ZOOM mode, unlike the TO_ORIGINAL mode, α is set to increase gradually. The TO_ZOOM mode is also included in the first mode.

（ＴＯ＿ＺＯＯＭモードの例） (Example of TO_ZOOM mode)
図１７は、実施形態２において、ＴＯ＿ＺＯＯＭモードが予め選択されている場合の、映像処理の流れを例示するフローチャートである。以下に述べるように、図１７の例では、ＰＲＯＣＥＳＳ＿ＣＮＴをデクリメント（カウントダウン）することにより、αが設定される。 FIG. 17 is a flowchart illustrating the flow of video processing when the TO_ZOOM mode has been selected in advance in Embodiment 2. As described below, in the example of FIG. 17, α is set by decrementing (counting down) PROCESS_CNT.

なお、ＴＯ＿ＺＯＯＭモードの開始時には、ＰＲＯＣＥＳＳ＿ＣＮＴは、初期値（ＴＯ＿ＺＯＯＭモードの開始時のαに対応する所定の値）に設定されている。当該初期値は、０～ＭＡＸ＿ＣＮＴの範囲の任意の整数値をとりうる。図１７についての以下の説明では、初期値が０よりも大きい場合を例示する。 At the start of the TO_ZOOM mode, PROCESS_CNT is set to an initial value (a predetermined value corresponding to α at the start of the TO_ZOOM mode). The initial value can be any integer in the range 0 to MAX_CNT. The following description of FIG. 17 illustrates the case where the initial value is greater than 0.

まず、選択部１４０は、０＜ＰＲＯＣＥＳＳ＿ＣＮＴであるか否かを判定する（Ｓ４１）。Ｓ４１でＹＥＳの場合、映像処理部１１０は、現在のＰＲＯＣＥＳＳ＿ＣＮＴに応じたαを設定する。 First, the selection unit 140 determines whether or not 0<PROCESS_CNT (S41). If YES in S41, the video processing unit 110 sets α according to the current PROCESS_CNT.

ＴＯ＿ＺＯＯＭモードにおいても、映像処理部１１０は、受信に成功した最新のメタデータに含まれる注目情報を用いて、ＡＴＴを特定する（すなわち、ｘｓ０、ｙｓ０、ｘｅ０、およびｙｅ０を特定する）。そして、映像処理部１１０は、上述の式（２Ａ）～（２Ｄ）に基づき、ｘｓ、ｙｓ、ｘｅ、およびｙｅを算出する。 Also in the TO_ZOOM mode, the video processing unit 110 identifies ATT (that is, identifies xs0, ys0, xe0, and ye0) using the attention information included in the latest successfully received metadata. Then, the video processing unit 110 calculates xs, ys, xe, and ye based on equations (2A) to (2D) above.

ＴＯ＿ＺＯＯＭモードにおいても、拡大率αについて、上述の（３Ａ）～（３Ｂ）の関係が成立する。従って、ＴＯ＿ＺＯＯＭモードにおいても、映像処理部１１０は、上述の式（３Ａ）または（３Ｂ）のいずれかに基づき、αを算出してよい。上述の通り、ＰＲＯＣＥＳＳ＿ＣＮＴが小さくなるにつれて、αが増加する。それゆえ、図１７の処理の通りＰＲＯＣＥＳＳ＿ＣＮＴをデクリメントすることで、αを徐々に増加させることができる（Ｓ４２）。例えば、αを、ＰＲＯＣＥＳＳ＿ＣＮＴの減少に応じて（換言すれば、時刻ｔの増加に応じて）、線形的に増加させることができる。 Also in the TO_ZOOM mode, relationships (3A) and (3B) above hold for the enlargement factor α. Therefore, in the TO_ZOOM mode as well, the video processing unit 110 may calculate α based on either equation (3A) or (3B) above. As noted above, α increases as PROCESS_CNT decreases. Hence, α can be gradually increased by decrementing PROCESS_CNT as in the process of FIG. 17 (S42). For example, α can be increased linearly as PROCESS_CNT decreases (in other words, as time t increases).

そして、映像処理部１１０は、フレームｉを拡大率αによって拡大した画像を、ＭＯＶ２のフレームｉ－１として生成する。続いて、映像処理部１１０は、ＰＲＯＣＥＳＳ＿ＣＮＴを１だけデクリメントする（Ｓ４３）。このように、ＴＯ＿ＺＯＯＭモードでは、ＴＯ＿ＯＲＩＧＩＮＡＬモードとは異なり、ＡＴＴが徐々にズームインされるように、ＺＴＴが設定される。 Then, the video processing unit 110 generates an image obtained by enlarging the frame i by the enlargement factor α as the frame i−1 of MOV2. Subsequently, the video processing unit 110 decrements PROCESS_CNT by 1 (S43). Thus, in the TO_ZOOM mode, unlike the TO_ORIGINAL mode, ZTT is set such that ATT is gradually zoomed in.

以上のように、ＴＯ＿ＺＯＯＭモードでは、ＰＲＯＣＥＳＳ＿ＣＮＴが、初期値（ＴＯ＿ＺＯＯＭモードにおける最大値）から１ずつデクリメントされる。そして、最終的には、ＰＲＯＣＥＳＳ＿ＣＮＴ＝０となる。すなわち、０＜ＰＲＯＣＥＳＳ＿ＣＮＴという条件が満たされなくなる（Ｓ４１でＮＯ）。 As described above, in TO_ZOOM mode, PROCESS_CNT is decremented by one from the initial value (maximum value in TO_ZOOM mode). Finally, PROCESS_CNT=0. That is, the condition 0<PROCESS_CNT is no longer satisfied (NO in S41).

Ｓ４１でＮＯの場合、ＰＲＯＣＥＳＳ＿ＣＮＴ＝０であるため、映像処理部１１０は、α＝Ｐに設定する（Ｓ４４）。そして、選択部１４０は、ＺＯＯＭモードを選択する。以上の通り、選択部１４０は、ＰＲＯＣＥＳＳ＿ＣＮＴが０（最小値）まで減少したことを契機として、ＴＯ＿ＺＯＯＭモードからＺＯＯＭモードへのモード切替を行う。選択部１４０は、ｓｔａｔｅ＝ＺＯＯＭとして、ｓｔａｔｅを更新する（Ｓ４５）。 In the case of NO in S41, since PROCESS_CNT=0, the video processing unit 110 sets α=P (S44). Then, the selection unit 140 selects the ZOOM mode. As described above, the selector 140 switches from the TO_ZOOM mode to the ZOOM mode when PROCESS_CNT decreases to 0 (minimum value). The selection unit 140 sets state=ZOOM and updates the state (S45).
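The flow of FIG. 17 (S41 to S45) can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the implementation of the disclosure: the linear mapping in `alpha_from_count` is one assumed form that is consistent with the relations described above (α rises as PROCESS_CNT falls, and α = P at PROCESS_CNT = 0), and the values of `P` and `CNT_MAX`, the `Controller` class, and all function names are hypothetical.

```python
from dataclasses import dataclass

P = 2.0        # assumed magnification used in ZOOM mode (hypothetical value)
CNT_MAX = 60   # assumed upper bound of PROCESS_CNT (hypothetical value)


@dataclass
class Controller:
    state: str          # "TO_ZOOM" or "ZOOM" in this sketch
    process_cnt: int    # PROCESS_CNT, an integer in 0..CNT_MAX
    alpha: float = 1.0  # current enlargement factor


def alpha_from_count(process_cnt: int) -> float:
    """Assumed linear mapping: PROCESS_CNT = CNT_MAX gives 1, PROCESS_CNT = 0 gives P."""
    return P - (P - 1.0) * process_cnt / CNT_MAX


def step_to_zoom(c: Controller) -> None:
    """Process one frame while TO_ZOOM is the selected mode."""
    if c.process_cnt > 0:                          # S41: YES
        c.alpha = alpha_from_count(c.process_cnt)  # S42: alpha from the current count
        c.process_cnt -= 1                         # S43: count down by one
    else:                                          # S41: NO
        c.alpha = P                                # S44
        c.state = "ZOOM"                           # S45
```

Starting from a positive count, repeated calls to `step_to_zoom` raise α frame by frame, and the call made when the count has reached 0 sets α = P and switches the state to ZOOM.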

（事例２）
図１８は、実施形態２におけるαの時間変化について例示する図である。以下、図１８の例を、事例２とも称する。事例２では、初期時刻からｔ２１まで、受信可能状態が持続している（すなわち、ＺＯＯＭモードが維持されている）。そして、ｔ２１において、第２受信部８２が、受信可能状態から受信不能状態へと遷移している。その後、ｔ２２において、この受信不能状態が解消された。 (Case 2)
FIG. 18 is a diagram illustrating temporal change of α in the second embodiment. The example of FIG. 18 is also referred to as case 2 below. In Case 2, the receivable state continues (that is, the ZOOM mode is maintained) from the initial time to t21. At t21, the second receiver 82 transitions from the receivable state to the unreceivable state. After that, at t22, this unreceivable state is resolved.

このため、事例２では、ｔ２１からｔ２２までの期間において、ＴＯ＿ＯＲＩＧＩＮＡＬモードが選択される（図９・１０も参照）。ここで、ｔ２２におけるαを、α２２と称する。事例２では、α２２＞１であるものとする。つまり、事例２では、αが１に減少するまでの所定の時点（ｔ２２）において、受信不能状態が解消されたものとする。 Therefore, in case 2, the TO_ORIGINAL mode is selected during the period from t21 to t22 (see also FIGS. 9 and 10). Here, α at t22 is referred to as α22. In Case 2, it is assumed that α22>1. That is, in Case 2, it is assumed that the unreceivable state is resolved at a predetermined time (t22) before α decreases to 1.

実施形態２では、選択部１４０は、ｔ２２において、映像処理装置１のモードをＴＯ＿ＯＲＩＧＩＮＡＬモードからＴＯ＿ＺＯＯＭモードに切り替える（図１０も参照）。すなわち、選択部１４０は、αが１に減少するまでの所定の時点において、受信不能状態が解消されたことを契機として、ＴＯ＿ＺＯＯＭモードを選択する。α２２は、ＴＯ＿ＺＯＯＭモードにおけるαの初期値の一例である。 In the second embodiment, at t22, the selection unit 140 switches the mode of the video processing device 1 from the TO_ORIGINAL mode to the TO_ZOOM mode (see also FIG. 10). That is, the selector 140 selects the TO_ZOOM mode when the unreceivable state is resolved at a predetermined point in time before α decreases to 1. α22 is an example of the initial value of α in TO_ZOOM mode.

図１８の例では、ＴＯ＿ＺＯＯＭモードにおいて、映像処理装置１（より具体的には、映像処理部１１０）は、ｔの増加に応じて、αを線形的に増加させる。但し、ＴＯ＿ＺＯＯＭモードにおけるαの増加態様は、図１８の例に限定されない。例えば、αは、ｔの増加に応じて非線形的に増加するように設定されてもよい。 In the example of FIG. 18, in the TO_ZOOM mode, the video processing device 1 (more specifically, the video processing unit 110) linearly increases α as t increases. However, the manner in which α increases in the TO_ZOOM mode is not limited to the example of FIG. 18. For example, α may be set to increase non-linearly as t increases.

事例２では、ＴＯ＿ＺＯＯＭモードにおけるαの増加の結果、ｔ２３において、αがＰに増加した。選択部１４０は、ｔ２３において、αがＰとなったことを契機として、ＺＯＯＭモードを選択する（図１７も参照）。 In case 2, α increased to P at t23 as a result of increasing α in TO_ZOOM mode. The selector 140 selects the ZOOM mode when α becomes P at t23 (see also FIG. 17).

事例１と同様に、仮に、ＯＲＩＧＩＮＡＬモードからＺＯＯＭモードへの切替を直接的に行った場合には、αが１からＰへと瞬時的に変化する。このため、ＺＯＯＭモードへの切替時に、ＭＯＶ２の表示品位が低下しうる。例えば、当該切替時に、ＭＯＶ２を鑑賞するユーザに違和感を与えてしまう場合がある。そこで、実施形態２では、ＺＯＯＭモードとＯＲＩＧＩＮＡＬモードとの間の中間的なモードとして、ＴＯ＿ＺＯＯＭモードがさらに設けられている。 As in Case 1, if the ORIGINAL mode is directly switched to the ZOOM mode, α changes from 1 to P instantaneously. Therefore, the display quality of MOV2 may be degraded when switching to the ZOOM mode. For example, at the time of switching, the user viewing MOV2 may feel uncomfortable. Therefore, in the second embodiment, a TO_ZOOM mode is further provided as an intermediate mode between the ZOOM mode and the ORIGINAL mode.

ＯＲＩＧＩＮＡＬモードとＺＯＯＭモードとの間に、ＴＯ＿ＺＯＯＭモードを介在させることによって、時間の経過に応じてαを滑らかに増加させることができる。このように、ＴＯ＿ＺＯＯＭモードを介して、ＯＲＩＧＩＮＡＬモードからＺＯＯＭモードへの切替を行うことにより、ＺＯＯＭモードへの切替時に、表示品位がより高いＭＯＶ２（上記違和感がより少ない映像）を提供できる。 By interposing the TO_ZOOM mode between the ORIGINAL mode and the ZOOM mode, α can be smoothly increased over time. In this way, by switching from the ORIGINAL mode to the ZOOM mode via the TO_ZOOM mode, it is possible to provide MOV2 with higher display quality (video with less discomfort) when switching to the ZOOM mode.
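The α trajectory of Case 2 (FIG. 18) can be reproduced with a small per-frame simulation. The sketch below rests on several assumptions that are not stated in this section: reception of the display information is represented by one boolean flag per frame, PROCESS_CNT is counted up by one per frame in the TO_ORIGINAL mode (mirroring the count-down of FIG. 17), and α follows an assumed linear mapping of PROCESS_CNT. The values of `P` and `CNT_MAX` and the function names are hypothetical.

```python
from typing import List, Tuple

P = 2.0       # assumed ZOOM magnification (hypothetical value)
CNT_MAX = 10  # assumed counter range (hypothetical, kept small for readability)


def alpha_of(cnt: int) -> float:
    """Assumed linear mapping between PROCESS_CNT and alpha."""
    return P - (P - 1.0) * cnt / CNT_MAX


def simulate(receivable: List[bool]) -> List[Tuple[str, float]]:
    """Return (mode, alpha) for each frame, given per-frame reception flags."""
    state, cnt = "ZOOM", 0
    trace = []
    for ok in receivable:
        # Mode selection driven by the state of the second receiving unit.
        if not ok and state in ("ZOOM", "TO_ZOOM"):
            state = "TO_ORIGINAL"   # reception lost (t21)
        elif ok and state in ("TO_ORIGINAL", "ORIGINAL"):
            state = "TO_ZOOM"       # reception restored (t22)
        # Counter update: up while zooming out, down while zooming in.
        if state == "TO_ORIGINAL":
            cnt = min(cnt + 1, CNT_MAX)
            if cnt == CNT_MAX:
                state = "ORIGINAL"  # alpha has reached 1
        elif state == "TO_ZOOM":
            cnt = max(cnt - 1, 0)
            if cnt == 0:
                state = "ZOOM"      # alpha has reached P (t23)
        trace.append((state, alpha_of(cnt)))
    return trace
```

With a short outage such as `[True, True, False, False, False, True, True, True, True]`, α falls only part of the way toward 1 and then climbs back to P from the value it had when reception resumed, so no frame shows a jump between 1 and P.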

〔実施形態３〕
図１９は、実施形態３におけるαの時間変化について例示する図である。以下、図１９の例を、事例３とも称する。事例３では、初期時刻からｔ３１まで、受信可能状態が持続している（すなわち、ＺＯＯＭモードが維持されている）。そして、ｔ３１において、第２受信部８２が、受信可能状態から受信不能状態へと遷移した。このため、ｔ３１において、ＴＯ＿ＯＲＩＧＩＮＡＬモードが選択される。その後、ｔ３３まで、受信可能状態と受信不能状態とが断続的に切り替わる状態が継続した。 [Embodiment 3]
FIG. 19 is a diagram illustrating temporal change of α in the third embodiment. The example of FIG. 19 is also referred to as case 3 below. In Case 3, the receivable state continues (that is, the ZOOM mode is maintained) from the initial time to t31. Then, at t31, the second receiver 82 transitioned from the receivable state to the unreceivable state. Therefore, at t31, the TO_ORIGINAL mode is selected. After that, until t33, the state in which the receivable state and the unreceivable state were intermittently switched continued.

事例３では、事例２とは異なり、選択部１４０は、受信可能状態が所定の時間（以下、ｔｓｓ）に亘り維持されない限り、映像処理装置１の現在のモードをそのまま維持する（以下の図２０も参照）。一例として、ｔｓｓは、１０～２０秒程度である。受信可能状態と受信不能状態とが断続的に切り替わる状況下において、個別の受信可能状態と受信不能状態との切替毎にモードを変更すると、かえってＭＯＶ２の視認性が低下すると懸念されるためである。このため、事例３では、ｔ３１以降でも、ＴＯ＿ＯＲＩＧＩＮＡＬモードが維持される。 In Case 3, unlike Case 2, the selection unit 140 maintains the current mode of the video processing device 1 unless the receivable state is maintained for a predetermined time (hereinafter, tss) (see also FIG. 20 below). As an example, tss is about 10 to 20 seconds. This is because, in a situation where the receivable state and the unreceivable state alternate intermittently, changing the mode at every individual switch between the two states could instead reduce the visibility of MOV2. Therefore, in Case 3, the TO_ORIGINAL mode is maintained even after t31.

事例３では、ＴＯ＿ＯＲＩＧＩＮＡＬモードにおけるαの減少の結果、ｔ３２において、αが１に減少した。ｔ３２は、ｔ３３よりも前の時点であるものとする。ｔ３２において、αが１となったことを契機として、ＯＲＩＧＩＮＡＬモードが選択される。 In Case 3, as a result of the decrease of α in the TO_ORIGINAL mode, α decreased to 1 at t32. It is assumed that t32 is earlier than t33. At t32, the ORIGINAL mode is selected, triggered by α becoming 1.

そして、ｔ３３において、受信可能状態と受信不能状態とが断続的に切り替わる状態が解消された。図１９に示すように、ｔ３３以降では、受信可能状態が維持されるものとする。事例３では、選択部１４０は、ｔ３４において、ＴＯ＿ＺＯＯＭモードを選択する（図２０も参照）。ｔ３４は、ｔ３３からｔｓｓだけ経過した時点であり、ｔ３４＝ｔ３３＋ｔｓｓとして表される。 Then, at t33, the intermittent switching between the receivable state and the unreceivable state is resolved. As shown in FIG. 19, the receivable state is maintained after t33. In case 3, the selection unit 140 selects the TO_ZOOM mode at t34 (see also FIG. 20). t34 is the time point tss after t33, and is expressed as t34=t33+tss.

このように、選択部１４０は、第２モード（例：ＯＲＩＧＩＮＡＬモード）が予め選択されていた場合、受信不能状態が解消された状態が所定の時間に亘り維持されたことを契機として、第１モード（例：ＴＯ＿ＺＯＯＭモード）を選択してよい。事例３では、映像処理装置１は、ＴＯ＿ＺＯＯＭモードにおいて、αを１から増加させる。 In this way, when the second mode (e.g., ORIGINAL mode) has been selected in advance, the selection unit 140 may select the first mode (e.g., TO_ZOOM mode) when the state in which the unreceivable state is resolved has been maintained for a predetermined time. In Case 3, the video processing device 1 increases α from 1 in the TO_ZOOM mode.

ＴＯ＿ＺＯＯＭモードにおけるαの増加の結果、ｔ３５において、αがＰに増加した。選択部１４０は、ｔ３５において、αがＰとなったことを契機として、ＺＯＯＭモードを選択する。 The increase in α in TO_ZOOM mode resulted in α increasing to P at t35. The selector 140 selects the ZOOM mode when α becomes P at t35.

事例３の処理によれば、事例２の処理とは異なり、受信可能状態と受信不能状態との一時的な切替を無視して、モードを選択できる。このため、第２受信部８２（あるいはＮＴ２）の通信状態の安定性が低いと考えられる場合には、事例２に替えて事例３の処理を採用することが好ましい。 According to the processing of Case 3, unlike the processing of Case 2, the mode can be selected ignoring the temporary switching between the receivable state and the unreceivable state. Therefore, when the stability of the communication state of the second receiving unit 82 (or NT2) is considered to be low, it is preferable to adopt the processing of Case 3 instead of Case 2.

（ＯＲＩＧＩＮＡＬモードの別の例）
図２０は、実施形態３において、ＯＲＩＧＩＮＡＬモードが予め選択されている場合の映像処理の流れを例示するフローチャートである。実施形態３では、ＯＲＩＧＩＮＡＬモードが予め選択されている場合には、図１３のフローチャートに替えて、図２０のフローチャートが適用される。以下の説明におけるｔ３４は、現時点（但し、ｔ３３よりも後の時点）を表すものとする。 (another example of ORIGINAL mode)
FIG. 20 is a flowchart illustrating the flow of video processing when the ORIGINAL mode is preselected in the third embodiment. In Embodiment 3, when the ORIGINAL mode is selected in advance, the flowchart of FIG. 20 is applied instead of the flowchart of FIG. t34 in the following description represents the current time (however, the time after t33).

まず、表示情報受信判定部１２０は、タイムスタンプに応じたメタデータを取得できたか否かを判定する。加えて、制御部１０は、「ｔ３４－ｔ３３＞ｔｓｓ」という条件が満たされているか否かを判定する。すなわち、実施形態３のＯＲＩＧＩＮＡＬモードでは、制御部１０は、「タイムスタンプに応じたメタデータを取得できており、かつ、ｔ３４－ｔ３３＞ｔｓｓであるか否か」を判定する（Ｓ５１）。 First, the display information reception determination unit 120 determines whether metadata corresponding to the time stamp has been acquired. In addition, the control unit 10 determines whether or not the condition "t34-t33>tss" is satisfied. That is, in the ORIGINAL mode of the third embodiment, the control unit 10 determines "whether metadata corresponding to the time stamp has been acquired and whether t34-t33>tss" (S51).

Ｓ５１でＮＯの場合、すなわち、（ｉ）「タイムスタンプに応じたメタデータを取得できなかった場合」、および、（ｉｉ）「ｔ３４－ｔ３３≦ｔｓｓであった場合」、の少なくとも一方の場合には、ＯＲＩＧＩＮＡＬモードがそのまま維持される。 In the case of NO in S51, that is, in at least one of the cases (i) "metadata corresponding to the time stamp could not be acquired" and (ii) "t34-t33≤tss", the ORIGINAL mode is maintained as it is.

他方、Ｓ５１でＹＥＳの場合、選択部１４０は、ＴＯ＿ＺＯＯＭモードを選択する。つまり、選択部１４０は、ＯＲＩＧＩＮＡＬモードからＴＯ＿ＺＯＯＭモードへのモード切替を行う。 On the other hand, if S51 is YES, the selection unit 140 selects the TO_ZOOM mode. That is, the selection unit 140 performs mode switching from the ORIGINAL mode to the TO_ZOOM mode.

当該モード切替を契機として、映像処理部１１０は、ＰＲＯＣＥＳＳ＿ＣＮＴ＝ＭＡＸ＿ＣＮＴに設定する（Ｓ５２）。このようにＰＲＯＣＥＳＳ＿ＣＮＴを設定することにより、ＴＯ＿ＺＯＯＭモードにおけるαの初期値を１に設定できる。そして、選択部１４０は、ｓｔａｔｅ＝ＴＯ＿ＺＯＯＭとして、ｓｔａｔｅを更新する（Ｓ５３）。 Triggered by the mode switching, the video processing unit 110 sets PROCESS_CNT=MAX_CNT (S52). By setting PROCESS_CNT in this way, the initial value of α can be set to 1 in the TO_ZOOM mode. Then, the selection unit 140 sets state=TO_ZOOM and updates the state (S53).
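The determination of FIG. 20 (S51 to S53) can be sketched as follows. This is a hedged illustration only. The text does not describe how t33 is tracked, so the helper `update_stable_since`, which remembers the start of the current uninterrupted receivable period, is an assumption, as are the concrete values of `TSS` and `CNT_MAX`. The text writes the counter bound as both CNT_MAX and MAX_CNT; the sketch uses `CNT_MAX` and treats them as the same quantity.

```python
from typing import Optional, Tuple

CNT_MAX = 60  # assumed upper bound of PROCESS_CNT (hypothetical value)
TSS = 15.0    # assumed hold time tss in seconds (the text gives about 10 to 20 s)


def update_stable_since(metadata_ok: bool, now: float,
                        stable_since: Optional[float]) -> Optional[float]:
    """Track t33, the start of the current uninterrupted receivable period."""
    if not metadata_ok:
        return None  # any reception failure restarts the measurement
    return now if stable_since is None else stable_since


def step_original(metadata_ok: bool, now: float,
                  stable_since: Optional[float]) -> Tuple[str, Optional[int]]:
    """One pass of FIG. 20 while ORIGINAL is selected; `now` plays the role of t34.

    Returns (state, process_cnt); process_cnt is None when the mode is kept.
    """
    # S51: metadata acquired AND t34 - t33 > tss
    if metadata_ok and stable_since is not None and now - stable_since > TSS:
        process_cnt = CNT_MAX           # S52: alpha starts from 1 in TO_ZOOM
        return "TO_ZOOM", process_cnt   # S53
    return "ORIGINAL", None             # S51: NO, keep ORIGINAL
```

Because the timer restarts on every failed reception, short bursts of successful reception during an unstable period never accumulate enough time to leave the ORIGINAL mode.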

〔実施形態４〕
図２１は、実施形態４におけるαの時間変化について例示する図である。以下、図２１の例を、事例４とも称する。事例４におけるメタデータ受信状態の時間変化の様子は、事例３と同様である。事例４では、事例３とは異なり、ＭＯＶ１のシーンチェンジの有無をさらに考慮して、モード切替が行われる。なお、実施形態４では、ＯＲＩＧＩＮＡＬモードが予め選択されている場合には、図１３のフローチャートが適用される。 [Embodiment 4]
FIG. 21 is a diagram illustrating temporal change of α in the fourth embodiment. The example of FIG. 21 is also referred to as case 4 below. The temporal change of the metadata reception state in Case 4 is the same as in Case 3. In case 4, unlike case 3, mode switching is performed further considering the presence or absence of a scene change in MOV1. Note that in the fourth embodiment, the flowchart of FIG. 13 is applied when the ORIGINAL mode is selected in advance.

事例４における初期時刻からｔ４３までのモードの推移の様子（αの変化の様子）は、事例３における初期時刻からｔ３３までのモードの推移の様子と同様である。ｔ４３は、事例４において、受信可能状態と受信不能状態とが断続的に切り替わる状態が解消された時点である。 The mode transition (change of α) from the initial time to t43 in Case 4 is the same as the mode transition from the initial time to t33 in Case 3. In case 4, t43 is the point in time when the intermittent switching between the receivable state and the unreceivable state is resolved.

図２１に示すように、事例４では、ｔ４３に後続する時刻ｔ４４において、ＭＯＶ１のシーンチェンジが生じている。実施形態１にても説明した通り、第２モード（例：ＯＲＩＧＩＮＡＬモード）が予め選択されていた場合、選択部１４０は、受信可能状態において、ＭＯＶ１のシーンチェンジが生じたことを契機として、第１モード（例：ＺＯＯＭモード）を選択してよい。事例４では、選択部１４０は、ｔ４４において、ＯＲＩＧＩＮＡＬモードからＺＯＯＭモードへのモード切替を行う。 As shown in FIG. 21, in Case 4, a scene change of MOV1 occurs at time t44 following t43. As described in the first embodiment, when the second mode (e.g., ORIGINAL mode) has been selected in advance, the selection unit 140 may select the first mode (e.g., ZOOM mode) when a scene change of MOV1 occurs in the receivable state. In Case 4, the selection unit 140 switches the mode from the ORIGINAL mode to the ZOOM mode at t44.

シーンチェンジが生じた場合、シーンチェンジ直前のフレームとシーンチェンジ直後のフレームとで、注目領域の設定（例：注目領域の位置および大きさの少なくともいずれか）が有意に異なることも考えられる。このような場合、シーンチェンジ直前のフレームにおける注目情報を用いて、ＴＯ＿ＺＯＯＭモードの動作を実行すると、ＭＯＶ２を鑑賞するユーザにかえって違和感を与えることも懸念される。ＴＯ＿ＺＯＯＭモードにおけるＭＯＶ２では、本来注目すべき領域（シーンチェンジ直後のフレームの注目領域）が、十分に強調されない可能性があるためである。 When a scene change occurs, it is conceivable that the setting of the attention area (for example, at least one of the position and size of the attention area) is significantly different between the frame immediately before the scene change and the frame immediately after the scene change. In such a case, if the attention information in the frame immediately before the scene change is used to execute the operation of the TO_ZOOM mode, there is a concern that the user viewing MOV2 may feel uncomfortable. This is because, in MOV2 in the TO_ZOOM mode, there is a possibility that the area that should be focused on (the focused area of the frame immediately after the scene change) will not be sufficiently emphasized.

そこで、上述の通り、第２受信部８２が受信可能状態にある場合には、シーンチェンジ時に、ＴＯ＿ＺＯＯＭモードを介在させずに、ＯＲＩＧＩＮＡＬモードからＺＯＯＭモードへの切替えを行ってもよい。シーンチェンジ時には、αを１からＰまで瞬時に増加させた方が、ユーザにとって違和感が少ない映像を提供できる場合もあるためである。 Therefore, as described above, when the second receiving section 82 is in the receivable state, switching from the ORIGINAL mode to the ZOOM mode may be performed without intervening the TO_ZOOM mode at the scene change. This is because, in some cases, increasing α from 1 to P instantaneously at the time of a scene change can provide an image that gives less discomfort to the user.

〔実施形態５〕
図２２は、実施形態５におけるαの時間変化について例示する図である。以下、図２２の例を、事例５とも称する。事例５では、事例４と同様に、ＭＯＶ１のシーンチェンジの有無を考慮して、モード切替が行われる。 [Embodiment 5]
FIG. 22 is a diagram illustrating temporal change of α in the fifth embodiment. The example of FIG. 22 is also referred to as case 5 below. In case 5, as in case 4, mode switching is performed in consideration of the presence or absence of a scene change in MOV1.

事例５では、初期時刻からｔ５１まで、受信可能状態が持続している。事例５では、ｔ５１において、第２受信部８２が受信可能状態から受信不能状態へと遷移し、この受信不能状態が復旧されずにそのまま持続する。このため、事例５では、ｔ５１において、ＴＯ＿ＯＲＩＧＩＮＡＬモードが選択される。従って、ｔ５１以降では、ｔの増加に応じて、αが減少する。 In case 5, the receivable state continues from the initial time to t51. In case 5, at t51, the second receiving unit 82 transitions from the receivable state to the unreceivable state, and this unreceivable state continues without recovery. Therefore, in case 5, the TO_ORIGINAL mode is selected at t51. Therefore, after t51, α decreases as t increases.

事例５では、ｔ５１に後続する時刻ｔ５２において、ＭＯＶ１のシーンチェンジが生じている。ここで、ｔ５２におけるαを、α５２と称する。事例５では、α５２＞１であるものとする。つまり、事例５では、αが１に減少するまでの所定の時点（ｔ５２）において、シーンチェンジが生じたものとする。 In case 5, a scene change of MOV1 occurs at time t52 following t51. Here, α at t52 is referred to as α52. In case 5, it is assumed that α52>1. That is, in Case 5, it is assumed that a scene change occurs at a predetermined time (t52) before α decreases to one.

第１モード（例：ＴＯ＿ＯＲＩＧＩＮＡＬモード）が予め選択されていた場合、選択部１４０は、受信不能状態において、ＭＯＶ１のシーンチェンジが生じたことを契機として、第２モード（ＯＲＩＧＩＮＡＬモード）を選択してよい。事例５では、選択部１４０は、ｔ５２において、ＴＯ＿ＯＲＩＧＩＮＡＬモードからＯＲＩＧＩＮＡＬモードへのモード切替を行う。 When the first mode (e.g., TO_ORIGINAL mode) has been selected in advance, the selection unit 140 may select the second mode (ORIGINAL mode) when a scene change of MOV1 occurs in the unreceivable state. In Case 5, the selection unit 140 switches the mode from the TO_ORIGINAL mode to the ORIGINAL mode at t52.

以上のように、第２受信部８２が受信不能状態にある場合には、シーンチェンジ時に、αをα５２から１に瞬時に減少させてもよい。事例４と同様の趣旨により、シーンチェンジ時には、αを瞬時に減少させた方が、ユーザにとって違和感が少ない映像を提供できることも考えられるためである。 As described above, when the second receiving section 82 is in the unreceivable state, α may be instantaneously decreased from α52 to 1 at the scene change. This is because, for the same reason as Case 4, it is conceivable that an image that is less uncomfortable for the user can be provided by instantly decreasing α at the time of a scene change.

図２３は、実施形態１との対比によって実施形態５の効果を説明する図である。図２３は、図１６と対になる図である。図２３にも示される通り、実施形態５では、実施形態１とは異なり、ＴＯ＿ＯＲＩＧＩＮＡＬモードにおいてαが１まで減少する前の時点であっても、受信不能状態下においてＭＯＶ１のシーンチェンジが生じた場合には、ＴＯ＿ＯＲＩＧＩＮＡＬモードからＯＲＩＧＩＮＡＬモードへのモード切替が行われる。 FIG. 23 is a diagram for explaining the effect of the fifth embodiment by comparison with the first embodiment. FIG. 23 is a diagram paired with FIG. 16. As shown in FIG. 23, in the fifth embodiment, unlike the first embodiment, the mode is switched from the TO_ORIGINAL mode to the ORIGINAL mode when a scene change of MOV1 occurs in the unreceivable state, even before α has decreased to 1 in the TO_ORIGINAL mode.

シーンチェンジに起因するＭＯＶ２の違和感を低減させたい場合には、実施形態５のモード切替手法を採用することが好ましい。これに対し、ＭＯＶ２のフラッシングを解消したい場合には、実施形態１のモード切替手法を採用することが好ましい。実施形態１または５のいずれのモード切替手法を採用するかについては、ＭＯＶ１の仕様等に応じ、映像処理装置１の設計者によって適宜選択されてよい。 It is preferable to adopt the mode switching method of the fifth embodiment when it is desired to reduce the discomfort of MOV2 caused by scene changes. On the other hand, when it is desired to eliminate the flashing of MOV2, it is preferable to adopt the mode switching method of the first embodiment. A designer of the video processing apparatus 1 may appropriately select which of the mode switching methods of the first embodiment and the fifth embodiment is to be adopted according to the specifications of the MOV 1 and the like.
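The scene-change rules of Cases 4 and 5 can be summarized in one small selection function. This is a sketch under assumptions rather than the method of the disclosure: the function name, its signature, and the flag `snap_when_unreceivable` (which switches between the behavior of Embodiment 5 and the gradual behavior of Embodiment 1) are hypothetical.

```python
from typing import Tuple


def on_scene_change(state: str, alpha: float, receivable: bool, p: float,
                    snap_when_unreceivable: bool = True) -> Tuple[str, float]:
    """Return the (state, alpha) to use from the first frame after a detected cut."""
    if state == "ORIGINAL" and receivable:
        # Case 4: jump from alpha = 1 to alpha = P without passing through TO_ZOOM.
        return "ZOOM", p
    if state == "TO_ORIGINAL" and not receivable and snap_when_unreceivable:
        # Case 5: skip the rest of the gradual zoom-out and go straight to alpha = 1.
        return "ORIGINAL", 1.0
    # Otherwise keep the current mode and alpha (gradual behavior, as in Embodiment 1).
    return state, alpha
```

A designer who prefers the behavior of Embodiment 1 would call the function with `snap_when_unreceivable=False`, in which case a cut during the TO_ORIGINAL mode leaves the gradual decrease of α untouched.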

〔変形例〕
（１）上述の説明では、ＭＯＶ１を解析することにより、当該ＭＯＶ１のシーンチェンジの有無を判定する場合を例示した。但し、一部の放送システム１０００では、表示情報にシーンチェンジ情報がさらに含まれている場合もある。この場合、第２受信部８２によって、ＮＴ２からシーンチェンジ情報を取得できる。 [Modification]
(1) In the above description, the case where the presence or absence of a scene change in MOV1 is determined by analyzing MOV1 has been exemplified. However, in some broadcasting systems 1000, the display information may further include scene change information. In this case, the scene change information can be acquired from NT2 by the second receiving unit 82.

但し、第２受信部８２が受信不能状態に陥った場合には、ＮＴ２からシーンチェンジ情報を取得できなくなる。このため、制御部１０にシーンチェンジ検出部１３０が設けられていることが好ましい。例えば、事例５の処理は、制御部１０にシーンチェンジ検出部１３０が設けられていることを前提としている。 However, if the second receiving unit 82 falls into the unreceivable state, the scene change information can no longer be acquired from NT2. Therefore, it is preferable that the control unit 10 is provided with the scene change detection unit 130. For example, the processing of Case 5 is based on the premise that the control unit 10 is provided with the scene change detection unit 130.
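One way to combine the two sources of scene change information is to prefer the information delivered over NT2 and fall back to local analysis whenever it is unavailable. The sketch below is an assumption about how such a fallback could be arranged; the function name, the `metadata_flag` parameter, and the `local_detector` callback (standing in for the scene change detection unit 130) are hypothetical.

```python
from typing import Callable, Optional


def scene_change_occurred(receivable: bool,
                          metadata_flag: Optional[bool],
                          local_detector: Callable[[], bool]) -> bool:
    """Decide whether the current frame starts a new scene.

    metadata_flag is the scene change information carried in the display
    information, or None when the broadcasting system does not provide it.
    """
    if receivable and metadata_flag is not None:
        return metadata_flag   # information acquired from NT2
    return local_detector()    # analysis of MOV1 by the device itself
```

With this arrangement the local detector is consulted exactly in the situations where Case 5 needs it, namely while the second receiving unit is in the unreceivable state.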

（２）上述の説明では、ＭＯＶ１の各フレームの特徴量に基づき、当該ＭＯＶ１のシーンチェンジの有無を判定する場合を例示した。但し、シーンチェンジ検出部１３０は、放送波を解析することにより、上記特徴量を導出してもよい。この場合にも、当該特徴量に基づきＭＯＶ１のシーンチェンジの有無を判定できる。 (2) In the above description, the case of determining whether or not there is a scene change in MOV1 based on the feature amount of each frame of MOV1 has been exemplified. However, the scene change detection unit 130 may derive the feature amount by analyzing broadcast waves. In this case as well, it is possible to determine whether or not there is a scene change in MOV1 based on the feature amount.

（３）上述の説明では、注目情報は、ＡＴＴ（注目領域）の２つの対頂点（原点Ｏおよび頂点Ｅ）の座標を示す情報を含んでいる場合を例示した。但し、注目情報は、ＸＹ平面上におけるＡＴＴの位置を一意的に特定できる情報であればよく、上記の例に限定されない。 (3) In the above description, attention information includes information indicating the coordinates of two paired vertices (origin O and vertex E) of ATT (attention area). However, the information of interest is not limited to the above examples as long as it is information that can uniquely specify the position of the ATT on the XY plane.

一例として、注目情報には、（ｉ）ＡＴＴの中心の座標と、（ｉｉ）当該ＡＴＴの幅（以下、Ｗ０）および高さ（以下、Ｈ０）、とを示す情報が含まれていてもよい。これらの情報によっても、ＡＴＴの位置を一意的に特定できるためである。以下、ＡＴＴの中心を、点Ｃ（ｘｃ０，ｙｃ０）と表す。 As an example, the attention information may include information indicating (i) the coordinates of the center of ATT and (ii) the width (hereinafter, W0) and height (hereinafter, H0) of ATT. This is because the position of ATT can also be uniquely specified by these pieces of information. Hereinafter, the center of ATT is represented as point C (xc0, yc0).

例えば、図３の例の場合には、
Ｗ０＝ｘｅ０－ｘｓ０…（４Ａ）
Ｈ０＝ｙｅ０－ｙｓ０…（４Ｂ）
ｘｃ０＝（ｘｓ０＋ｘｅ０）／２…（４Ｃ）
ｙｃ０＝（ｙｓ０＋ｙｅ０）／２…（４Ｄ）
の関係が成立する。このため、ｘｃ０とｙｃ０とＷ０とＨ０とが既知であれば、ＡＴＴの上記２つの対頂点の位置（すなわち、ｘｓ０とｙｓ０とｘｅ０とｙｅ０）を特定できる。 For example, in the case of the example in FIG. 3, the relationships
W0 = xe0 - xs0 (4A)
H0 = ye0 - ys0 (4B)
xc0 = (xs0 + xe0)/2 (4C)
yc0 = (ys0 + ye0)/2 (4D)
hold. Therefore, if xc0, yc0, W0, and H0 are known, the positions of the two paired vertices of ATT (that is, xs0, ys0, xe0, and ye0) can be specified.

あるいは、注目情報には、（ｉ）点Ｃの座標に加え、（ｉｉ）Ｗ０またはＨ０の一方を示す情報が含まれていてもよい。すなわち、Ｗ０およびＨ０の一方（例：Ｗ０）が既知であれば、ＭＯＶ１のアスペクト比に基づいて、他方（例：Ｈ０）を算出できるためである。 Alternatively, the attention information may include, in addition to (i) the coordinates of the point C, (ii) information indicating one of W0 and H0. This is because, if one of W0 and H0 (e.g., W0) is known, the other (e.g., H0) can be calculated based on the aspect ratio of MOV1.

例えば、ＭＯＶ１のアスペクト比が、横：縦＝１６：９である場合を考える。この場合、Ｈ０＝（９／１６）×Ｗ０である。このように、注目情報には、（ｉ）点Ｃの座標を示す情報と、（ｉｉ）Ｗ０およびＨ０の少なくとも一方と、を示す情報が含まれていればよい。 For example, consider a case where the aspect ratio of MOV1 is horizontal:vertical=16:9. In this case, H0=(9/16)*W0. In this way, the attention information may include (i) information indicating the coordinates of the point C and (ii) information indicating at least one of W0 and H0.
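The conversion from the center-based representation back to the two vertices can be written out as a short function. The sketch below inverts the midpoint relations xc0 = (xs0 + xe0)/2 and yc0 = (ys0 + ye0)/2 together with W0 = xe0 - xs0 and H0 = ye0 - ys0, and fills in a missing W0 or H0 from the aspect ratio of MOV1 as described above. The function name, its parameters, and the default 16:9 ratio are illustrative assumptions.

```python
from typing import Optional, Tuple


def att_corners(xc0: float, yc0: float,
                w0: Optional[float] = None,
                h0: Optional[float] = None,
                aspect: Tuple[int, int] = (16, 9)) -> Tuple[float, float, float, float]:
    """Recover (xs0, ys0, xe0, ye0) from the center C and W0 and/or H0."""
    ax, ay = aspect  # horizontal : vertical ratio of MOV1
    if w0 is None and h0 is None:
        raise ValueError("at least one of W0 and H0 is required")
    if h0 is None:
        h0 = w0 * ay / ax  # e.g. H0 = (9/16) * W0
    if w0 is None:
        w0 = h0 * ax / ay  # e.g. W0 = (16/9) * H0
    # The center is the midpoint of the two vertices, so each vertex lies
    # half a width and half a height away from it.
    xs0, xe0 = xc0 - w0 / 2, xc0 + w0 / 2
    ys0, ye0 = yc0 - h0 / 2, yc0 + h0 / 2
    return xs0, ys0, xe0, ye0
```

For example, `att_corners(960, 540, w0=640)` yields an attention area whose height is 360 and whose vertices are (640, 360) and (1280, 720).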

（４）選択部１４０のモード選択手法は、上述の各例に限定されない。一例として、選択部１４０は、ユーザ操作に基づき、モード選択を行ってもよい。この場合、第２受信部８２の状態（あるいは、ＭＯＶ１のシーンチェンジの有無）によらず、所望のモードをユーザに選択させることができる。 (4) The mode selection method of the selection unit 140 is not limited to the examples described above. As an example, the selection unit 140 may perform mode selection based on a user's operation. In this case, the user can select a desired mode regardless of the state of the second receiving section 82 (or whether there is a scene change in MOV1).

例えば、一部のユーザは、第２受信部８２の受信状態によらず、オリジナルスケールの映像の視聴を希望することも考えられる。そこで、選択部１４０は、第２受信部８２が受信可能状態である場合にも、ユーザ操作に応じて、第２モードを選択してもよい。このように、選択部１４０にユーザ操作に応じてモード選択を行わせることにより、ユーザの利便性をさらに向上させることができる。 For example, it is conceivable that some users wish to view the original scale video regardless of the reception state of the second receiver 82 . Therefore, the selection unit 140 may select the second mode according to the user's operation even when the second reception unit 82 is in the receivable state. In this way, by allowing the selection unit 140 to perform mode selection in accordance with the user's operation, it is possible to further improve the user's convenience.

〔ソフトウェアによる実現例〕
表示装置１００の制御ブロック（特に制御部１０）は、集積回路（ＩＣチップ）等に形成された論理回路（ハードウェア）によって実現してもよいし、ソフトウェアによって実現してもよい。 [Example of realization by software]
A control block (particularly, the control unit 10) of the display device 100 may be implemented by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be implemented by software.

後者の場合、表示装置１００は、各機能を実現するソフトウェアであるプログラムの命令を実行するコンピュータを備えている。このコンピュータは、例えば少なくとも１つのプロセッサ（制御装置）を備えていると共に、上記プログラムを記憶したコンピュータ読み取り可能な少なくとも１つの記録媒体を備えている。そして、上記コンピュータにおいて、上記プロセッサが上記プログラムを上記記録媒体から読み取って実行することにより、本発明の目的が達成される。上記プロセッサとしては、例えばＣＰＵ（Central Processing Unit）を用いることができる。上記記録媒体としては、「一時的でない有形の媒体」、例えば、ＲＯＭ（Read Only Memory）等の他、テープ、ディスク、カード、半導体メモリ、プログラマブルな論理回路などを用いることができる。また、上記プログラムを展開するＲＡＭ（Random Access Memory）などをさらに備えていてもよい。また、上記プログラムは、該プログラムを伝送可能な任意の伝送媒体（通信ネットワークや放送波等）を介して上記コンピュータに供給されてもよい。なお、本発明の一態様は、上記プログラムが電子的な伝送によって具現化された、搬送波に埋め込まれたデータ信号の形態でも実現され得る。 In the latter case, the display device 100 includes a computer that executes instructions of a program, which is software that implements each function. This computer includes, for example, at least one processor (control device) and at least one computer-readable recording medium storing the program. In the computer, the processor reads the program from the recording medium and executes it, thereby achieving the object of the present invention. As the processor, for example, a CPU (Central Processing Unit) can be used. As the recording medium, a "non-transitory tangible medium" such as a ROM (Read Only Memory), a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used. A RAM (Random Access Memory) into which the program is loaded may be further provided. The program may also be supplied to the computer via any transmission medium (a communication network, a broadcast wave, or the like) capable of transmitting the program. Note that one aspect of the present invention can also be implemented in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

〔付記事項〕
本発明は上述した各実施形態に限定されるものではなく、請求項に示した範囲で種々の変更が可能であり、異なる実施形態にそれぞれ開示された技術的手段を適宜組み合わせて得られる実施形態についても本発明の技術的範囲に含まれる。さらに、各実施形態にそれぞれ開示された技術的手段を組み合わせることにより、新しい技術的特徴を形成できる。 [Additional notes]
The present invention is not limited to the above-described embodiments and can be modified in various ways within the scope of the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, new technical features can be formed by combining the technical means disclosed in each embodiment.

１映像処理装置
１０制御部
８１第１受信部
８２第２受信部
９０表示部
１００表示装置
１１０映像処理部
１２０表示情報受信判定部
１３０シーンチェンジ検出部
１４０選択部
１０００放送システム
ＮＴ１第１ネットワーク
ＮＴ２第２ネットワーク
ＭＯＶ１入力映像
ＭＯＶ２出力映像
ＡＴＴ注目領域
ＺＴＴズームアウト領域
α 拡大率
1 video processing device
10 control unit
81 first receiving unit
82 second receiving unit
90 display unit
100 display device
110 video processing unit
120 display information reception determination unit
130 scene change detection unit
140 selection unit
1000 broadcasting system
NT1 first network
NT2 second network
MOV1 input video
MOV2 output video
ATT attention area
ZTT zoom-out area
α enlargement factor

Claims

A video processing device that generates an output video by processing an input video, the video processing device comprising:
a first receiving unit that receives the input video; and
a second receiving unit that receives display information corresponding to the input video, wherein
the second receiving unit is a receiving unit different from the first receiving unit,
the display information includes metadata synchronized with the input video,
the metadata includes attention information indicating an attention area of the input video, and
the video processing device selects one of (i) a first mode in which the input video is processed using the display information and (ii) a second mode in which the input video is processed without using the display information.

2. The video processing device according to claim 1, wherein the first mode is selected when the second receiving unit is capable of receiving the display information.

3. The video processing device according to claim 2, wherein in the first mode, the video processing device generates the output video by enlarging the attention area of the input video by a predetermined magnification.

4. The video processing device according to claim 3, wherein,
where α denotes the enlargement ratio of the output video with respect to the input video,
when the first mode has been selected in advance, the video processing device
gradually decreases α when the second receiving unit transitions to an unreceivable state in which the display information cannot be received, and
selects the second mode when α has decreased to 1.

5. The video processing device according to claim 4, wherein,
when the unreceivable state is resolved at a predetermined time point before α decreases to 1 in the first mode,
the video processing device gradually increases α from its value at the predetermined time point.

6. The video processing device according to claim 4, wherein,
when the second mode has been selected in advance, the video processing device
selects the first mode when the state in which the unreceivable state of the second receiving unit is resolved has been maintained for a predetermined period of time, and
gradually increases α from 1 in the first mode.

7. The video processing device according to any one of claims 1 to 6, wherein the video processing device
generates scene change information indicating whether or not there is a scene change in the input video by analyzing the input video, and
selects one of the first mode and the second mode based on the communication state of the second receiving unit and the scene change information.

8. The video processing device according to any one of claims 1 to 6, wherein
the display information further includes scene change information indicating whether or not there is a scene change in the input video, and
the video processing device selects one of the first mode and the second mode based on the communication state of the second receiving unit and the scene change information.

9. The video processing device according to claim 7 or 8, wherein,
when the second mode has been selected in advance, the video processing device selects the first mode when a scene change occurs in the input video while the second receiving unit is capable of receiving the display information.

10. The video processing device according to claim 7 or 8, wherein,
when the first mode has been selected in advance, the video processing device selects the second mode when a scene change occurs in the input video while the second receiving unit is in an unreceivable state in which the display information cannot be received.

11. The video processing device according to any one of claims 1 to 10, wherein
the first receiving unit receives a broadcast wave and acquires the input video by decoding the broadcast wave, and
the second receiving unit receives the display information via an IP (Internet Protocol) communication network.

A display device comprising:
the video processing device according to any one of claims 1 to 11; and
a display unit that displays the output video.

A video processing method for generating an output video by processing an input video, the video processing method comprising:
a first receiving step of receiving the input video by a first receiving unit;
a second receiving step of receiving display information corresponding to the input video by a second receiving unit different from the first receiving unit; and
a selecting step of selecting one of (i) a first mode in which the input video is processed using the display information and (ii) a second mode in which the input video is processed without using the display information, wherein
the display information includes metadata synchronized with the input video, and
the metadata includes attention information indicating an attention area of the input video.