JP2018522430A

JP2018522430A - Method and apparatus for reducing spherical video bandwidth to a user headset

Info

Publication number: JP2018522430A
Application number: JP2017550903A
Authority: JP
Inventors: ウィーバー，ジョシュア; ゲフィン，ノーム; ベンガリ，ハサイン; アダムス，ライリー
Original assignee: Google LLC
Current assignee: Google LLC
Priority date: 2015-05-27
Filing date: 2016-05-27
Publication date: 2018-08-09
Anticipated expiration: 2036-05-27
Also published as: WO2016191702A1; KR20170122791A; JP6672327B2; KR101969943B1; CN107409203A; EP3304895A1

Abstract

A method includes determining at least one preferred view perspective associated with a three-dimensional (3D) video, encoding a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality, and encoding a second portion of the 3D video at a second quality, the first quality being higher than the second quality.

Description

Cross-reference to related applications
This application claims the benefit of U.S. Application Serial No. 62/167,121, entitled "Method and Apparatus to Reduce Spherical Video Bandwidth to User Headset," filed May 27, 2015, which is incorporated herein by reference in its entirety.

Field
Embodiments relate to streaming spherical video.

Background
Streaming spherical video (or other three-dimensional video) can consume a significant amount of system resources. For example, an encoded spherical video can include a large number of bits for transmission, which can consume a significant amount of bandwidth as well as processing and memory associated with the encoder and decoder.

Summary
Example embodiments describe systems and methods that optimize streaming video, streaming 3D video, and/or streaming spherical video.

In one general aspect, a method includes determining at least one preferred view perspective associated with a three-dimensional (3D) video, encoding a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality, and encoding a second portion of the 3D video at a second quality, the first quality being higher than the second quality.

In another general aspect, a server and/or streaming server includes a controller configured to determine at least one preferred view perspective associated with a three-dimensional (3D) video, and an encoder. The encoder is configured to encode a first portion of the 3D video corresponding to the at least one preferred view perspective at a first quality and to encode a second portion of the 3D video at a second quality, the first quality being higher than the second quality.

In yet another general aspect, a method includes receiving a request for streaming video, the request including an indication of a user view perspective associated with a three-dimensional (3D) video. The method further includes determining whether the user view perspective is stored in a view perspective data store; upon determining that the user view perspective is stored in the view perspective data store, incrementing a ranking value associated with the user view perspective; and upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the ranking value associated with the user view perspective to one (1).

Implementations can include one or more of the following features. For example, the method (or implementation on the server) can further include storing the first portion of the 3D video in a data store, storing the second portion of the 3D video in the data store, receiving a request for streaming video, and streaming the first portion of the 3D video and the second portion of the 3D video from the data store as the streaming video. The method (or implementation on the server) can further include receiving a request for streaming video, the request including an indication of a user view perspective, selecting the 3D video corresponding to the user view perspective as the encoded first portion of the 3D video, and streaming the selected first portion of the 3D video and the second portion of the 3D video as the streaming video.

The method (or implementation on the server) can further include receiving a request for streaming video, the request including an indication of a user view perspective associated with the 3D video; determining whether the user view perspective is stored in a view perspective data store; upon determining that the user view perspective is stored in the view perspective data store, incrementing a counter associated with the user view perspective; and upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the counter associated with the user view perspective to one (1). In the method (or implementation on the server), encoding the second portion of the 3D video can include using at least one first Quality of Service (QoS) parameter in a first pass encoding operation, and encoding the first portion of the 3D video can include using at least one second QoS parameter in a second pass encoding operation.

For example, determining the at least one preferred view perspective associated with the 3D video is based on at least one of a historically viewed reference point and a historically viewed view perspective. The at least one preferred view perspective associated with the 3D video is based on at least one of an orientation of a viewer of the 3D video, a position of a viewer of the 3D video, a point of a viewer of the 3D video, and a focal point of a viewer of the 3D video. Determining the at least one preferred view perspective associated with the 3D video is based on a default view perspective, and the default view perspective is based on at least one of a characteristic of a user of a display device, a characteristic of a group associated with the user of the display device, a director's cut, and a characteristic of the 3D video. For example, the method (or implementation on the server) can further include iteratively encoding at least a portion of the second portion of the 3D video at the first quality, and streaming the at least a portion of the second portion of the 3D video.

Example embodiments will become more fully understood from the detailed description given below and the accompanying drawings, in which like elements are represented by like reference numerals. The drawings are given by way of illustration only and thus do not limit the example embodiments.
FIG. 1A illustrates a two-dimensional (2D) representation of a sphere according to at least one example embodiment.
FIG. 1B illustrates an unwrapped cylindrical representation of the 2D representation of the sphere as a 2D rectangular representation.
FIGS. 2 to 5 illustrate methods for encoding streaming spherical video according to at least one example embodiment.
FIG. 6A illustrates a video encoder system according to at least one example embodiment.
FIG. 6B illustrates a video decoder system according to at least one example embodiment.
FIG. 7A illustrates a flow diagram for a video encoder system according to at least one example embodiment.
FIG. 7B illustrates a flow diagram for a video decoder system according to at least one example embodiment.
FIG. 8 illustrates a system according to at least one example embodiment.
FIG. 9 is a schematic block diagram of a computer device and a mobile computer device that can be used to implement the techniques described herein.

It should be noted that these figures are intended to illustrate the general characteristics of the methods, structures, and/or materials utilized in certain example embodiments and to supplement the written description provided below. These drawings are not, however, to scale, may not precisely reflect the structural or performance characteristics of any given embodiment, and should not be interpreted as defining or limiting the range of properties encompassed by the example embodiments. For example, the positioning of structural elements may be reduced or exaggerated for clarity. The use of similar or identical reference numbers in the various drawings is intended to indicate the presence of a similar or identical element or feature.

Detailed description of the embodiments
While example embodiments may include various modifications and alternative forms, embodiments thereof are shown by way of example in the drawings and will be described in detail herein. It should be understood, however, that there is no intent to limit the example embodiments to the particular forms disclosed; on the contrary, the example embodiments cover all modifications, equivalents, and alternatives falling within the scope of the claims. Like numbers refer to like elements throughout the description of the figures.

Example embodiments describe systems and methods configured to optimize the streaming of video, the streaming of 3D video, and the streaming of spherical video (and/or other three-dimensional video) based on portions of the spherical video that are preferentially viewed by viewers of the video (e.g., a director's cut, historical viewings, and the like). For example, a director's cut can be a view perspective as selected by the director or maker of the video. The director's cut may be based on the view of the camera (or of multiple cameras) that the director or maker selected or viewed while recording the video.

A spherical video, a frame of a spherical video, and/or a spherical image can have a perspective. For example, a spherical image could be an image of a globe. An interior perspective could be a view from the center of the globe looking outward, or it could be a view from the surface of the globe looking out to space. An exterior perspective could be a view from space looking down toward the globe. As another example, a perspective can be based on the portion of the image that is viewable. In other words, a viewable perspective can be that which can be seen by a viewer. The viewable perspective can be the portion of the spherical image that is in front of the viewer. For example, when viewing from an interior perspective, a viewer could be lying on the ground (e.g., the earth) and looking out to space. The viewer may see the moon, the sun, or specific stars in the image. However, although the ground the viewer is lying on is included in the spherical image, the ground is outside the current viewable perspective. In this example, if the viewer turned their head, the ground would be included in a peripheral viewable perspective. If the viewer rolled over to lie face down, the ground would be within the viewable perspective, whereas the moon, the sun, and the stars would not.

A viewable perspective from an exterior perspective may be a portion of the spherical image that is not blocked (e.g., by another portion of the image) and/or a portion of the spherical image that has not curved out of view. Another portion of the spherical image may be brought into the viewable perspective from the exterior perspective by moving (e.g., rotating) the spherical image and/or by movement of the spherical image. Therefore, the viewable perspective is the portion of the spherical image that is within the viewable range of a viewer of the spherical image.

A spherical image is an image that does not change with respect to time. For example, a spherical image from an interior perspective, as relates to the earth, may show the moon and the stars in one position. A spherical video (or sequence of images), by contrast, may change with respect to time. For example, a spherical video from an interior perspective, as relates to the earth, may show the moon and the stars moving (e.g., because of the earth's rotation) and/or an airplane's contrail streaking across the image (e.g., the sky).

FIG. 1A is a two-dimensional (2D) representation of a sphere. As shown in FIG. 1A, the sphere 100 (e.g., as a spherical image or a frame of a spherical video) illustrates the directions of interior perspectives 105, 110, an exterior perspective 115, and viewable perspectives 120, 125, 130. The viewable perspective 120 may be a portion of a spherical image as viewed from the interior perspective 110. The viewable perspective 120 may be a portion of the sphere 100 as viewed from the interior perspective 105. The viewable perspective 125 may be a portion of the sphere 100 as viewed from the exterior perspective 115.

FIG. 1B illustrates an unwrapped cylindrical representation 150 of the 2D representation of the sphere 100 as a 2D rectangular representation. An equirectangular projection of an image, shown as the unwrapped cylindrical representation 150, may appear as a stretched image as the image progresses vertically (up and down as shown in FIG. 1B) away from the center line between points A and B. The 2D rectangular representation can be decomposed as a C×R matrix of N×N blocks. For example, as shown in FIG. 1B, the illustrated unwrapped cylindrical representation 150 is a 30×16 matrix of N×N blocks. However, other C×R dimensions are within the scope of this disclosure. The blocks may be 2×2, 2×4, 4×4, 4×8, 8×8, 8×16, 16×16, and the like blocks (or blocks of pixels).
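The decomposition into a C×R matrix of N×N blocks can be sketched as follows. This is a minimal illustration, not the encoder's actual data layout: the names `BlockGrid`, `grid_for_frame`, and `block_at` are hypothetical, and the 480×256 pixel frame size is assumed only because it yields the 30×16 matrix of FIG. 1B when N = 16.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockGrid:
    columns: int     # C: number of blocks per row
    rows: int        # R: number of blocks per column
    block_size: int  # N: side of each square block, in pixels


def grid_for_frame(width_px: int, height_px: int, block_size: int) -> BlockGrid:
    """Decompose a 2D rectangular frame into a C x R matrix of N x N blocks."""
    if width_px % block_size or height_px % block_size:
        raise ValueError("frame dimensions must be multiples of the block size")
    return BlockGrid(width_px // block_size, height_px // block_size, block_size)


def block_at(grid: BlockGrid, x_px: int, y_px: int) -> tuple[int, int]:
    """Return the (column, row) index of the block containing pixel (x_px, y_px)."""
    return x_px // grid.block_size, y_px // grid.block_size


# Assumed frame size: 480 x 256 pixels with 16 x 16 blocks gives a 30 x 16 matrix.
grid = grid_for_frame(480, 256, 16)
```

Square blocks are used here for brevity; the non-square block sizes listed above (e.g., 4×8) would need separate width and height fields.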

A spherical image is an image that is continuous in all directions. Accordingly, if the spherical image were decomposed into a plurality of blocks, the blocks would be contiguous over the entire spherical image. In other words, there are no edges or boundaries as in a 2D image. In example implementations, an adjacent end block may be adjacent to a boundary of the 2D representation. In addition, an adjacent end block may be a block contiguous with a block on a boundary of the 2D representation. For example, an adjacent end block is associated with two or more boundaries of the two-dimensional representation. In other words, because a spherical image is continuous in all directions, an adjacent end can be associated with a top boundary and a bottom boundary (e.g., of a column of blocks) in an image or frame, and/or with a left boundary and a right boundary (e.g., of a row of blocks) in an image or frame.

For example, if an equirectangular projection is used, an adjacent end block may be the block at the other end of the column or row. For example, as shown in FIG. 1B, blocks 160 and 170 may be respective adjacent end blocks (by column) of each other. Likewise, blocks 180 and 185 may be respective adjacent end blocks (by column) of each other, and blocks 165 and 175 may be respective adjacent end blocks (by row) of each other. A view perspective 192 may include (and/or overlap) at least one block. Blocks may be encoded as a region of an image, a region of a frame, a portion or subset of an image or frame, a group of blocks, and the like. Hereinafter, such a group of blocks may be referred to as a tile or a group of tiles. For example, in FIG. 1B, tiles 190 and 195 are illustrated as groups of four blocks. Tile 195 is illustrated as being within the view perspective 192.
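The adjacent end relationship and the grouping of blocks into tiles can be sketched with modular arithmetic. The sketch below follows the text's description (the block at the other end of the row or column is treated as adjacent) rather than exact spherical geometry. The function names are hypothetical, and a tile is assumed to be a 2×2 group of blocks, matching the four-block tiles of FIG. 1B.

```python
def wrapped_neighbors(col: int, row: int, columns: int, rows: int) -> dict:
    """Return the four neighbours of block (col, row).

    Indices wrap at the 2D boundary, so a block on the left edge has the
    block at the right end of its row as its "left" neighbour, and so on.
    """
    return {
        "left": ((col - 1) % columns, row),
        "right": ((col + 1) % columns, row),
        "up": (col, (row - 1) % rows),
        "down": (col, (row + 1) % rows),
    }


def tiles_in_view(first_col: int, first_row: int, width: int, height: int,
                  columns: int, rows: int, tile_side: int = 2) -> set:
    """Return the (tile_col, tile_row) indices overlapped by a view.

    The view is given as a rectangle of blocks (assumed representation),
    and may run past a boundary, in which case it wraps to the other end.
    """
    tiles = set()
    for dc in range(width):
        for dr in range(height):
            col = (first_col + dc) % columns
            row = (first_row + dr) % rows
            tiles.add((col // tile_side, row // tile_side))
    return tiles
```

With the 30×16 grid of FIG. 1B, a 2×2-block view starting at column 29 covers the last and the first tile of the top tile row, because the view wraps across the left/right boundary.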

In example embodiments, in addition to streaming frames of the encoded spherical video, a view perspective, as a tile (or group of tiles) selected based on at least one reference point frequently viewed by viewers (e.g., at least one historically viewed reference point or view perspective), can be encoded at, for example, a higher quality (e.g., higher resolution and/or less distortion) and streamed together with (or as part of) the encoded frames of the spherical video. Accordingly, during playback, the viewer can view the decoded tiles (at the higher quality) while the entire spherical video is being played back, and the entire spherical video remains available should the viewer's view perspective change to a view perspective frequently viewed by viewers. The viewer can also change a viewing position or switch to another view perspective. If that other view perspective is included in the at least one reference point frequently viewed by viewers, the played-back video can be of a higher quality (e.g., higher resolution) than for some other view perspective (e.g., one that is not among the frequently viewed reference points). One advantage of encoding and streaming only a selected portion or subset of an image or frame at the higher quality is that the selected images or frames of the spherical video can be decoded and played back at the higher quality without necessarily encoding, streaming, and decoding the entire spherical video at the higher quality. This improves efficiency in bandwidth usage and in the processing and memory resources associated with the encoder and decoder.

In a head mount display (HMD), a viewer experiences visual virtual reality through the use of a left (e.g., left eye) display and a right (e.g., right eye) display that project a perceived three-dimensional (3D) video or image. According to example embodiments, a spherical (e.g., 3D) video or image is stored on a server. The video or image can be encoded and streamed from the server to the HMD. The spherical video or image can be encoded as a left image and a right image, which are packaged (e.g., in a data packet) together with metadata about the left image and the right image. The left image and the right image are then decoded and displayed by the left (e.g., left eye) display and the right (e.g., right eye) display.

The systems and methods described herein are applicable to both the left image and the right image, which are referred to throughout this disclosure as an image, a frame, a portion of an image, a portion of a frame, a tile, and the like, depending on the use case. In other words, the encoded data that is communicated from a server (e.g., a streaming server) to a user device (e.g., an HMD) and then decoded for display can be a left image and/or a right image associated with a 3D video or image.

FIGS. 2 to 5 are flowcharts of methods according to example embodiments. The steps described with regard to FIGS. 2 to 5 may be performed by the execution of software code stored in a memory (e.g., at least one memory 610) associated with an apparatus (e.g., as shown in FIGS. 6A, 6B, 7A, 7B, and 8, described below) and executed by at least one processor (e.g., at least one processor 605) associated with the apparatus. However, alternative embodiments are contemplated, such as a system embodied as a special purpose processor. Although the steps described below are described as being executed by a processor, the steps are not necessarily executed by the same processor. In other words, at least one processor may execute the steps described below with regard to FIGS. 2 to 5.

FIG. 2 illustrates a method for storing historical view perspectives, where "historical" refers to view perspectives previously requested by users. For example, FIG. 2 can illustrate the building of a database of view perspectives that are commonly viewed in a spherical video stream. As shown in FIG. 2, in step S205 an indication of a view perspective is received. For example, a tile can be requested by a device including a decoder. The tile request can include information based on a perspective or view perspective related to an orientation, a position, a point, or a focal point of a viewer on a spherical video. The perspective or view perspective can be a user view perspective, that is, the view perspective of a user of an HMD. For example, the view perspective (e.g., the user view perspective) could be a latitude and longitude position on the spherical video (e.g., as an interior perspective or an exterior perspective). The view, perspective, or view perspective can be determined as a side of a cube based on the spherical video. The indication of the view perspective can also include spherical video information. In an example implementation, the indication of the view perspective can include information about a frame (e.g., a frame sequence) associated with the view perspective. For example, the view (e.g., the latitude and longitude position, or the side) can be communicated from (a controller associated with) a user device including an HMD to a streaming server using, for example, the Hypertext Transfer Protocol (HTTP).
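One way to carry the indication of a view perspective in an HTTP request is as query parameters on a tile request URL. The following is a sketch under stated assumptions: the text does not define a request format, so the parameter names (`video`, `lat`, `lon`, `frame`), the `ViewIndication` type, and the example host are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlencode, urlsplit


@dataclass(frozen=True)
class ViewIndication:
    video_id: str     # which spherical video is being played (assumed field)
    latitude: float   # degrees, -90..90, position on the spherical video
    longitude: float  # degrees, -180..180
    frame: int        # frame (sequence) number associated with the view


def build_request_url(base_url: str, ind: ViewIndication) -> str:
    """Client side: encode the view perspective indication as a query string."""
    query = urlencode({
        "video": ind.video_id,
        "lat": ind.latitude,
        "lon": ind.longitude,
        "frame": ind.frame,
    })
    return f"{base_url}?{query}"


def parse_request_url(url: str) -> ViewIndication:
    """Server side: recover and validate the indication (as in step S205)."""
    params = parse_qs(urlsplit(url).query)
    lat = float(params["lat"][0])
    lon = float(params["lon"][0])
    if not -90.0 <= lat <= 90.0 or not -180.0 <= lon <= 180.0:
        raise ValueError("latitude/longitude out of range")
    return ViewIndication(params["video"][0], lat, lon, int(params["frame"][0]))
```

A cube-side indication, which the text also allows, could be carried the same way with a single `side` parameter in place of the latitude and longitude.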

In step S210, it is determined whether the view perspective (e.g., the user view perspective) is stored in a view perspective data store. For example, a data store (e.g., view perspective data store 815) can be queried or filtered based on information associated with the view perspective or the user view perspective. For example, the data store may be queried or filtered based on the latitude and longitude position of the view perspective on the spherical video and on a timestamp in the spherical video at which the view perspective was viewed. The timestamp can be a time and/or a time range associated with playback of the spherical video. The query or filter can be based on spatial proximity (e.g., how close the current view perspective is to a given stored view perspective) and/or temporal proximity (e.g., how close the current timestamp is to a given stored timestamp). If the query or filter returns a result, the view perspective is stored in the data store; otherwise, it is not. In step S215, if the view perspective is stored in the view perspective data store, processing continues to step S220; otherwise, processing continues to step S225.
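The spatial and temporal proximity test of step S210 can be sketched as below. The great-circle angle between two latitude/longitude positions is computed with the haversine formula, which is a standard choice and not one named by the text. The thresholds of 10 degrees and 2 seconds, the tuple layout of the stored entries, and the function names are assumptions made for illustration.

```python
import math


def angular_distance_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle angle, in degrees, between two positions on the sphere."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def find_match(stored, lat, lon, timestamp_s,
               max_angle_deg=10.0, max_dt_s=2.0):
    """Return the index of the first stored (lat, lon, timestamp) entry that is
    close to the query both spatially and temporally, or None if there is none."""
    for index, (s_lat, s_lon, s_time) in enumerate(stored):
        close_in_space = angular_distance_deg(lat, lon, s_lat, s_lon) <= max_angle_deg
        close_in_time = abs(timestamp_s - s_time) <= max_dt_s
        if close_in_space and close_in_time:
            return index
    return None
```

Measuring the angle on the sphere, rather than differencing longitudes directly, keeps positions on either side of the ±180 degree seam close to each other.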

In step S220, a counter or ranking (or ranking value) associated with the received view perspective is incremented. For example, the data store may include a data table containing historical view perspectives (e.g., the data store may be a database including a plurality of data tables). The data table may be keyed by view perspective (e.g., with a unique entry for each view perspective). The data table may include an identification of the view perspective, information associated with the view perspective, and a counter indicating how many times the view perspective has been requested. The counter may be incremented each time the view perspective is requested. The data stored in the data table may be anonymized. In other words, the data can be stored such that there is no reference to (or identification of) a user, a device, a session, or the like. As such, the data stored in the data table is indistinguishable based on the users or viewers of the video. In an example implementation, the data stored in the data table may be categorized based on the user without identifying the user. For example, the data may include an age, age range, gender, type, or role (e.g., musician or audience member) of the user.

In step S225, the view perspective is added to the view perspective data store. For example, an identification of the view perspective, information associated with the view perspective, and a counter (or ranking value) set to one (1) may be stored in the data table of historical view perspectives.
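By way of illustration only, steps S210 to S225 could be sketched as follows. The record layout, the function names, and the proximity tolerances (5 degrees, 1 second) are assumptions made for this sketch and are not specified in the description. Consistent with the anonymization described above, a record holds no user, device, or session identifier.

```python
import math
from dataclasses import dataclass


@dataclass
class ViewRecord:
    lat: float        # latitude on the spherical video, degrees (-90..90)
    lon: float        # longitude on the spherical video, degrees (-180..180)
    timestamp: float  # playback time in seconds
    count: int = 1    # how many times this view perspective was requested


def angular_distance(lat1, lon1, lat2, lon2):
    """Great-circle angle in degrees between two points on the sphere."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def record_view(store, lat, lon, timestamp, max_angle=5.0, max_dt=1.0):
    """Increment a matching record (S220) or add a new one (S225)."""
    for rec in store:
        near_in_space = angular_distance(rec.lat, rec.lon, lat, lon) <= max_angle
        near_in_time = abs(rec.timestamp - timestamp) <= max_dt
        if near_in_space and near_in_time:   # S210/S215: already stored
            rec.count += 1                   # S220
            return rec
    rec = ViewRecord(lat, lon, timestamp)    # S225: counter starts at 1
    store.append(rec)
    return rec
```

Here `store` is a plain list standing in for the data table; a real data store would more likely use an indexed query than a linear scan.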

In an example embodiment, tiles associated with the at least one preferred view perspective can be encoded with a higher QoS. The QoS can be an implementation of the quality described above (e.g., the encoder variable inputs that define the quality). For example, an encoder (e.g., video encoder 625) can encode the tiles associated with the 3D video individually. The tiles associated with the at least one preferred view perspective can be encoded with a higher QoS than the tiles associated with the remainder of the 3D video. In an example implementation, the 3D video can be encoded using a first QoS parameter (e.g., in a first pass) or at least one first QoS parameter used in a first encoding pass. In addition, the tiles associated with the at least one preferred view perspective can be encoded using a second QoS parameter (e.g., in a second pass) or at least one second QoS parameter used in a second encoding pass. In this example implementation, the second QoS is a higher QoS than the first QoS. In another example implementation, the 3D video can be encoded as a plurality of tiles representing the 3D video. The tiles associated with the at least one preferred view perspective can be encoded using the second QoS parameter. The remaining tiles can be encoded using the first QoS parameter.
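As a minimal sketch of the tile-based variant, the assignment of QoS parameters to tiles might look like the following. The `QosParams` fields and their numeric values are hypothetical placeholders; the description does not say which encoder inputs define the QoS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosParams:
    bitrate_kbps: int  # hypothetical encoder input
    quantizer: int     # hypothetical: lower value means finer quantization


# Placeholder values chosen only so that SECOND_QOS is the higher quality.
FIRST_QOS = QosParams(bitrate_kbps=500, quantizer=40)
SECOND_QOS = QosParams(bitrate_kbps=2000, quantizer=22)


def assign_qos(tile_ids, preferred_tile_ids):
    """Map each tile to the QoS parameters it should be encoded with."""
    preferred = set(preferred_tile_ids)
    return {tile: (SECOND_QOS if tile in preferred else FIRST_QOS)
            for tile in tile_ids}
```

For example, `assign_qos(range(6), [2, 3])` selects `SECOND_QOS` for tiles 2 and 3 and `FIRST_QOS` for the remaining four tiles.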

In an alternative implementation (and/or an additional implementation), the encoder can project the tiles associated with the at least one preferred view perspective using a different projection technique or algorithm than that used to generate the 2D representation of the remainder of the 3D video frame. Some projections have distortion in certain areas of the frame. Accordingly, projecting the tile differently than the spherical frame can improve the quality of the final image and/or use pixels more efficiently (e.g., require less computation or place less strain on the user's eyes). In one example implementation, the spherical image can be rotated before projecting the tile in order to orient the tile at a position of minimal distortion for the projection algorithm. In another example implementation, the tile can use (and/or modify) a projection algorithm that is based on the position of the tile. For example, projecting the spherical video frame to the 2D representation can use an equirectangular projection, whereas projecting the spherical video frame to a representation including a portion to be selected as the tile can use a cubic projection.
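The position dependence of projection distortion can be illustrated with a short sketch. The functions below are illustrative helpers, not part of the described encoder: one maps a direction to equirectangular pixel coordinates, one gives the horizontal stretch of the equirectangular projection at a given latitude (1/cos of the latitude), and one picks the cube face that a direction falls on under a cubic projection. The axis convention (x toward longitude 0, z toward the north pole) is an assumption.

```python
import math


def equirect_pixel(lat, lon, width, height):
    """Map latitude/longitude in degrees to equirectangular pixel coordinates."""
    x = (lon + 180.0) / 360.0 * width
    y = (90.0 - lat) / 180.0 * height
    return x, y


def horizontal_stretch(lat):
    """Horizontal oversampling factor of the equirectangular projection.

    Equals 1 at the equator and grows without bound toward the poles.
    """
    return 1.0 / math.cos(math.radians(lat))


def cube_face(lat, lon):
    """Return the cube face ('+x', '-x', '+y', '-y', '+z', '-z') for a direction."""
    la, lo = math.radians(lat), math.radians(lon)
    x = math.cos(la) * math.cos(lo)
    y = math.cos(la) * math.sin(lo)
    z = math.sin(la)
    # The face is the axis with the largest absolute component.
    axis, value = max((("x", x), ("y", y), ("z", z)), key=lambda p: abs(p[1]))
    return ("+" if value >= 0 else "-") + axis
```

At a latitude of 60 degrees the equirectangular stretch is already about 2, which is one reason a tile near a pole might be rotated toward the equator before projection, or projected with a cube face instead.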

FIG. 3 illustrates a method for streaming a 3D video. FIG. 3 describes a scenario in which the streaming 3D video is encoded on demand, for example during a live streaming event. As shown in FIG. 3, in step S305, a request to stream a 3D video is received. For example, a 3D video, a portion of a 3D video, or a tile available for streaming can be requested by a device that includes a decoder (e.g., via user interaction with a media application). The request can include information based on a perspective or view perspective relating to the viewer's orientation, position, point, or focus on the spherical video. The information based on the perspective or view perspective can be based on a current orientation or on a default (e.g., initialization) orientation. The default orientation can be, for example, a director's cut of the 3D video.

In step S310, at least one preferred view perspective is determined. For example, a data store (e.g., view perspective data store 815) may be queried or filtered based on information associated with the view perspective. The data store may be queried or filtered based on the latitude and longitude position of the view perspective on the spherical video. In an example implementation, the at least one preferred view perspective can be based on historical view perspectives. As such, the data store can include a data table of historical view perspectives. A preference can be indicated by how many times a view perspective has been requested. Accordingly, the query or filter can include filtering out results below a threshold counter value. In other words, the parameters set for a query of the data table of historical view perspectives can include a value for the counter or ranking, such that the results of the query must be above the threshold value for the counter. The results of the query of the data table of historical view perspectives can be set as the at least one preferred view perspective.
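The threshold query of step S310 could be sketched as follows. The record format (a dictionary with a `count` field) is an assumption, and so is the treatment of a count exactly equal to the threshold, which the description leaves open; this sketch keeps such records.

```python
def preferred_views(records, min_count):
    """Return historical view perspectives requested at least min_count times.

    Results are ordered from most to least requested.
    """
    kept = [r for r in records if r["count"] >= min_count]
    return sorted(kept, key=lambda r: r["count"], reverse=True)
```

With records counted 3, 12, and 7 and a threshold of 5, the result is the records counted 12 and 7, in that order.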

In addition, a default preferred view perspective (or several of them) can be associated with the 3D video. The default preferred view perspective can be a director's cut, a point of interest (e.g., the horizon, a moving object, a priority object), and the like. For example, the object of a game may be to destroy an object (e.g., a building or a vehicle). This object may be labeled as a priority object. A view perspective that includes the priority object can be indicated as a preferred view perspective. The default preferred view perspective can be included in addition to, or instead of, the historical view perspectives. The default orientation can be, for example, an initial set of preferred view perspectives (e.g., when a video is first uploaded and no historical data exists) based on, for example, an automated computer vision algorithm. The vision algorithm may determine as preferred view perspective portions of the video those that contain motion or intricate detail, nearby objects in stereo (to infer what might be interesting), and/or features that were present in the preferred views of other historical videos.

Other factors can be used in determining the at least one preferred view perspective. For example, the at least one preferred view perspective can be a historical view perspective that is within a range of (e.g., in proximity to) the current view perspective. For example, the at least one preferred view perspective can be a historical view perspective that is within a range of (e.g., in proximity to) the historical view perspectives of the current user or of a group (type or category) to which the current user belongs. In other words, the at least one preferred view perspective can include view perspectives (or tiles) that are close in distance and/or close in time to stored historical view perspectives. The default preferred view perspective can be stored in the data store 815 that includes the historical view perspectives, or in a separate (e.g., additional) data store that is not shown.

In step S315, the 3D video is encoded using at least one encoding parameter based on the at least one preferred view perspective. For example, the 3D video (or a portion thereof) can be encoded such that the portions including the at least one preferred view perspective are encoded differently than the remainder of the 3D video. As such, the portions including the at least one preferred view perspective can be encoded with a higher QoS than the remainder of the 3D video. As a result, when rendered on an HMD, the portions including the at least one preferred view perspective can have a higher resolution than the remainder of the 3D video.

In step S320, the encoded 3D video is streamed. For example, the tiles may be included in a packet for transmission. The packet may include compressed video bits 10A. The packet may include the encoded 2D representation of the spherical video frame and the encoded tile (or plurality of tiles). The packet may include a header for transmission. The header may include, among other things, information indicating the mode or scheme used in intra-frame coding by the encoder. The header may include information indicating the parameters used to convert a frame of the spherical video to a 2D rectangular representation. The header may include information indicating the parameters used to achieve the QoS of the encoded 2D rectangular representation and of the encoded tile. As discussed above, the QoS of the tiles associated with the at least one preferred view perspective can be different from (e.g., higher than) that of the tiles not associated with the at least one preferred view perspective.
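The packet contents listed above could be modeled, purely for illustration, with the following structure. All field names and the placeholder values `"intra_dc"` and `"equirectangular"` are hypothetical; the description does not define a wire format.

```python
from dataclasses import dataclass, field


@dataclass
class PacketHeader:
    intra_coding_mode: str   # mode or scheme used for intra-frame coding
    projection_params: dict  # parameters used to map the sphere to a 2D rectangle
    frame_qos_params: dict   # parameters behind the QoS of the 2D representation
    tile_qos_params: dict    # parameters behind the QoS of the tile(s)


@dataclass
class Packet:
    header: PacketHeader
    encoded_frame: bytes                               # encoded 2D representation
    encoded_tiles: list = field(default_factory=list)  # one or more encoded tiles


def build_packet(encoded_frame, encoded_tiles, frame_qos_params, tile_qos_params):
    """Assemble a packet from already-encoded data (no encoding happens here)."""
    header = PacketHeader(
        intra_coding_mode="intra_dc",                        # placeholder value
        projection_params={"projection": "equirectangular"},  # placeholder value
        frame_qos_params=frame_qos_params,
        tile_qos_params=tile_qos_params,
    )
    return Packet(header, encoded_frame, list(encoded_tiles))
```

Carrying the QoS parameters of the frame and of the tiles separately in the header reflects the point that preferred tiles may be encoded with a different QoS than the rest of the frame.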

Streaming the 3D video can be implemented through the use of priority stages. For example, in a first priority stage, video data encoded with a low (or minimum standard) QoS can be streamed. This allows the user of the HMD to begin the virtual reality experience. Subsequently, higher QoS video can be streamed to the HMD and can replace the previously streamed video data encoded with the low (or minimum standard) QoS (e.g., the data stored in buffer 830). As an example, in a second stage, higher quality video or image data can be streamed based on the current view perspective. In a subsequent stage, higher QoS video or image data can be streamed based on one or more preferred view perspectives. This can continue until the HMD buffer contains substantially only high QoS video or image data. In addition, this staged streaming can loop with video or image data of progressively higher QoS. In other words, after a first iteration the HMD contains video or image data encoded with a first QoS, after a second iteration the HMD contains video or image data encoded with a second QoS, after a third iteration the HMD contains video or image data encoded with a third QoS, and so forth. In an example implementation, the second QoS is higher than the first QoS, the third QoS is higher than the second QoS, and so forth.
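One way to picture the priority stages is as an ordered streaming plan applied to the HMD buffer. The sketch below is an assumption-laden simplification: it uses only two QoS labels (`"low"` and `"high"`), treats the buffer as a dictionary keyed by tile, and does not model buffer 830 itself.

```python
def staged_stream_plan(all_tiles, current_view_tiles, preferred_tiles):
    """Return (stage, tile, qos) entries in the order they would be streamed."""
    current = set(current_view_tiles)
    # Stage 1: everything at low QoS so playback can start.
    plan = [(1, tile, "low") for tile in all_tiles]
    # Stage 2: higher QoS for the current view perspective.
    plan += [(2, tile, "high") for tile in current_view_tiles]
    # Stage 3: higher QoS for preferred view perspectives not already upgraded.
    plan += [(3, tile, "high") for tile in preferred_tiles if tile not in current]
    return plan


def fill_buffer(plan):
    """Apply a plan; later entries replace earlier data for the same tile."""
    buffer = {}
    for _stage, tile, qos in plan:
        buffer[tile] = qos
    return buffer
```

For tiles `a` to `d` with `b` in the current view and `b` and `c` preferred, the buffer ends with `b` and `c` at high QoS and `a` and `d` at low QoS. Further stages, or a loop over progressively higher QoS levels, would extend the plan in the same way.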

The encoder 625 may operate offline as part of a setup procedure for making a spherical video available for streaming. Each of the plurality of tiles may be stored in view frame storage 795. Each of the plurality of tiles may be indexed such that it can be stored with reference to a frame (e.g., a time dependence) and with reference to a view (e.g., a view dependence). Accordingly, each of the plurality of tiles is dependent on time and on view, perspective, or view perspective, and can be recalled based on the time dependence and the view dependence.

As such, in an example implementation, the encoder 625 may be configured to execute a loop in which a frame is selected and a portion of the frame is selected as a tile based on a view perspective. The tile is then encoded and stored. The loop continues to cycle through a plurality of view perspectives. When a desired number of view perspectives have been saved as tiles (for example, every 5 degrees around the vertical and every 5 degrees around the horizontal of the spherical image), a new frame is selected, and the process repeats until every frame of the spherical video has the desired number of tiles saved for it. In an example embodiment, the tiles associated with the at least one preferred view perspective can be encoded with a higher QoS than the tiles that are not associated with the at least one preferred view perspective. This is just one example implementation for encoding and saving tiles.
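The offline loop could be sketched as two nested loops over frames and view perspectives. The grid construction, the `encode_tile` callback, and the `"high"`/`"low"` labels are assumptions for illustration; the actual tile selection and encoding performed by encoder 625 are not reproduced here.

```python
def view_grid(step_deg=5):
    """View perspectives spaced step_deg apart in latitude and longitude."""
    return [(lat, lon)
            for lat in range(-90, 91, step_deg)
            for lon in range(-180, 180, step_deg)]


def encode_all(num_frames, preferred, encode_tile, step_deg=5):
    """Encode one tile per (frame, view perspective) and index the results.

    `preferred` is a set of (lat, lon) view perspectives to encode at higher
    QoS; `encode_tile(frame, view, qos)` is a caller-supplied encoder stub.
    """
    storage = {}
    for frame in range(num_frames):        # select a frame
        for view in view_grid(step_deg):   # cycle through view perspectives
            qos = "high" if view in preferred else "low"
            storage[(frame, view)] = encode_tile(frame, view, qos)
    return storage
```

With a 5-degree step the grid holds 37 latitudes by 72 longitudes, or 2,664 view perspectives per frame. A uniform latitude/longitude grid oversamples the poles, so a production system might prune or merge views there.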

FIG. 4 illustrates a method for storing an encoded 3D video. FIG. 4 describes a scenario in which the streaming 3D video is encoded and stored in advance for future streaming. As shown in FIG. 4, in step S405, at least one preferred view perspective for the 3D video is determined. For example, a data store (e.g., view perspective data store 815) may be queried or filtered based on information associated with the view perspective. The data store may be queried or filtered based on the latitude and longitude position of the view perspective on the spherical video. In an example implementation, the at least one preferred view perspective can be based on historical view perspectives. As such, the data table includes historical view perspectives. A preference can be indicated by how many times a view perspective has been requested. Accordingly, the query or filter can include filtering out results below a threshold counter value. In other words, the parameters set for a query of the data table of historical view perspectives can include a value for the counter, such that the results of the query must be above the threshold value for the counter. The results of the query of the data table of historical view perspectives can be set as the at least one preferred view perspective.

In addition, a default preferred view perspective (or several of them) can be associated with the 3D video. The default preferred view perspective can be a director's cut, a point of interest (e.g., the horizon, a moving object, a priority object), and the like. For example, the object of a game may be to destroy an object (e.g., a building or a vehicle). This object may be labeled as a priority object. A view perspective that includes the priority object can be indicated as a preferred view perspective. The default preferred view perspective can be included in addition to, or instead of, the historical view perspectives. Other factors can be used in determining the at least one preferred view perspective. For example, the at least one preferred view perspective can be a historical view perspective that is within a range of (e.g., in proximity to) the current view perspective. For example, the at least one preferred view perspective can be a historical view perspective that is within a range of (e.g., in proximity to) the historical view perspectives of the current user or of a group (type or category) to which the current user belongs. The default preferred view perspective can be stored in the data store that includes the historical view perspectives, or in a separate (e.g., additional) data table.

In step S410, the 3D video is encoded using at least one encoding parameter based on the at least one preferred view perspective. For example, a frame of the 3D video can be selected and a portion of the frame can be selected as a tile based on a view perspective. The tile is then encoded. In an example embodiment, the tiles associated with the at least one preferred view perspective can be encoded with a higher QoS than the tiles associated with the remainder of the 3D video.

In an alternative implementation (and/or an additional implementation), the encoder can project the tiles associated with the at least one preferred view perspective using a different projection technique or algorithm than that used to generate the 2D representation of the remainder of the 3D video frame. Some projections have distortion in certain areas of the frame. Accordingly, projecting the tile differently than the spherical frame can improve the quality of the final image and/or use pixels more efficiently. In one example implementation, the spherical image can be rotated before projecting the tile in order to orient the tile at a position of minimal distortion for the projection algorithm. In another example implementation, the tile can use (and/or modify) a projection algorithm that is based on the position of the tile. For example, projecting the spherical video frame to the 2D representation can use an equirectangular projection, whereas projecting the spherical video frame to a representation including a portion to be selected as the tile can use a cubic projection.

In step S415, the encoded 3D video is stored. For example, each of the plurality of tiles may be stored in view frame storage 795. Each of the plurality of tiles associated with the 3D video may be indexed such that it can be stored with reference to a frame (e.g., a time dependence) and with reference to a view (e.g., a view dependence). Accordingly, each of the plurality of tiles is dependent on time and on view, perspective, or view perspective, and can be recalled based on the time dependence and the view dependence.

In an example implementation, the 3D video (e.g., the tiles associated therewith) may be encoded and stored using variable encoding parameters. Accordingly, the 3D video may be stored in different encoded states. The states may vary based on the QoS. For example, the 3D video may be stored as a plurality of tiles each encoded with the same QoS. For example, the 3D video may be stored as a plurality of tiles each encoded with a different QoS. For example, the 3D video may be stored as a plurality of tiles, some of which are encoded with a QoS based on the at least one preferred view perspective.
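A store that indexes tiles by frame and view, and that can hold the same tile in several encoded states, might be sketched as follows. The class name `TileStore`, its methods, and the use of an integer QoS level as part of the key are assumptions; view frame storage 795 is not described at this level of detail.

```python
class TileStore:
    """In-memory stand-in for a tile store indexed by frame, view, and QoS."""

    def __init__(self):
        self._tiles = {}

    def put(self, frame, view, qos, data):
        """Store encoded tile data for a frame (time) and view at a QoS level."""
        self._tiles[(frame, view, qos)] = data

    def get(self, frame, view, qos):
        """Recall a tile by time and view dependence; None if not stored."""
        return self._tiles.get((frame, view, qos))

    def available_qos(self, frame, view):
        """List the QoS levels stored for a given frame and view, ascending."""
        return sorted(q for (f, v, q) in self._tiles if f == frame and v == view)
```

A streaming server could then call `available_qos` for the requested frame and view and pick the highest level that the current bandwidth allows.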

FIG. 5 illustrates a method for determining a preferred view perspective for a 3D video. The preferred view perspective for the 3D video may be in addition to a preferred view perspective based on historical viewing of the 3D video. As shown in FIG. 5, in step S505, at least one default view perspective is determined. For example, the default preferred view perspective can be stored in a data table included in a data store (e.g., view perspective data store 815). The data store may be queried or filtered based on a default indication for the 3D video. If the query or filter returns a result, the 3D video has an associated default view perspective. Otherwise, the 3D video does not have an associated default view perspective. The default preferred view perspective can be a director's cut, a point of interest (e.g., the horizon, a moving object, a priority object), and the like. For example, the object of a game may be to destroy an object (e.g., a building or a vehicle). This object may be labeled as a priority object. A view perspective that includes the priority object can be indicated as a preferred view perspective.

In step S510, at least one view perspective based on a user characteristic, preference, or category is determined. For example, a user of the HMD may have characteristics based on previous use of the HMD. The characteristics may be based on statistical viewing preferences (e.g., a preference for looking at nearby objects rather than distant ones). For example, a user of the HMD may have stored user preferences associated with the HMD. The preferences may be selected by the user as part of a setup process. The preferences may be general (e.g., being drawn to motion) or video specific (e.g., tending to focus on the guitarist in a musical performance). For example, a user of the HMD may belong to a group or category (e.g., males between 15 and 22 years old). For example, the user characteristics, preferences, or categories can be stored in a data table included in a data store (e.g., view perspective data store 815). The data store may be queried or filtered based on a default indication for the 3D video. If the query or filter returns a result, the 3D video has at least one associated preferred view perspective based on the characteristic, preference, or category associated with the user. Otherwise, the 3D video does not have an associated view perspective based on the user.

In step S515, at least one view perspective based on a region of interest is determined. For example, the region of interest may be the current view perspective. For example, the at least one preferred view perspective can be a historical view perspective that is within a range of (e.g., in proximity to) the current view perspective. For example, the at least one preferred view perspective can be a historical view perspective that is within a range of (e.g., in proximity to) the historical view perspectives of the current user or of a group (type or category) to which the current user belongs.

In step S520, at least one view perspective based on at least one system characteristic is determined. For example, the HMD may have features that can enhance the user experience. One such feature may be enhanced audio. Accordingly, in a virtual reality environment, a user may be drawn to particular sounds (e.g., a game user may be drawn to the sound of an explosion). The preferred view perspective may be based on view perspectives that include these audible cues. In step S525, at least one preferred view perspective for the 3D video is determined based on each of the aforementioned view perspective determinations and/or combinations or sub-combinations thereof. For example, the at least one preferred view perspective may be generated by merging or joining the results of the aforementioned queries.
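The merging in step S525 could be as simple as an order-preserving union of the result sets from steps S505 to S520. This is a sketch under the assumption that each view perspective is represented by a hashable value such as a (latitude, longitude) pair; the description does not fix a representation or a merge rule.

```python
def merge_preferred(*result_sets):
    """Union of several lists of view perspectives, keeping first-seen order."""
    merged = []
    seen = set()
    for results in result_sets:
        for view in results:
            if view not in seen:  # skip duplicates found by an earlier query
                seen.add(view)
                merged.append(view)
    return merged
```

Passing the default, user-based, region-of-interest, and system-based results in that order keeps default view perspectives first, which is one possible way to express priority between the sources.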

In the example of FIG. 6A, the video encoder system 600 may be, or may include, at least one computing device and can represent virtually any computing device configured to perform the methods described herein. As such, the video encoder system 600 can include various components that may be utilized to implement the techniques described herein, or different or future versions thereof. By way of example, the video encoder system 600 is illustrated as including at least one processor 605 and at least one memory 610 (e.g., a non-transitory computer readable storage medium).

FIG. 6A illustrates a video encoder system according to at least one example embodiment. As shown in FIG. 6A, the video encoder system 600 includes the at least one processor 605, the at least one memory 610, a controller 620, and a video encoder 625. The at least one processor 605, the at least one memory 610, the controller 620, and the video encoder 625 are communicatively coupled via a bus 615.

The at least one processor 605 may be utilized to execute instructions stored on the at least one memory 610, so as to implement the various features and functions described herein, or additional or alternative features and functions. The at least one processor 605 and the at least one memory 610 may be utilized for various other purposes. In particular, the at least one memory 610 can represent an example of various types of memory and related hardware and software that might be used to implement any one of the modules described herein.

The at least one memory 610 may be configured to store data and/or information associated with the video encoder system 600. For example, the at least one memory 610 may be configured to store codecs associated with encoding spherical video. For example, the at least one memory may be configured to store code associated with selecting a portion of a frame of the spherical video as a tile to be encoded separately from the encoding of the spherical video. The at least one memory 610 may be a shared resource. As described in more detail below, a tile may be a plurality of pixels selected based on a view perspective of a viewer during playback on a spherical video viewer (e.g., an HMD). The plurality of pixels may be a block, a plurality of blocks, or a macroblock that may include a portion of the spherical image that can be seen by the user. For example, the video encoder system 600 may be an element of a larger system (e.g., a server, a personal computer, a mobile device, and the like). Therefore, the at least one memory 610 may be configured to store data and/or information associated with other elements within the larger system (e.g., image/video serving, web browsing, or wired/wireless communication).

The controller 620 may be configured to generate various control signals and communicate the control signals to various blocks in the video encoder system 600. The controller 620 may be configured to generate the control signals to implement the techniques described below. According to exemplary embodiments, the controller 620 may be configured to control the video encoder 625 to encode an image, a sequence of images, a video frame, a video sequence, and the like. For example, the controller 620 may generate control signals corresponding to parameters for encoding spherical video. Further details regarding the functions and operation of the video encoder 625 and the controller 620 are described below in connection with at least FIGS. 7A, 4A, 5A, 5B, and 6 to 9.

The video encoder 625 may be configured to receive a video stream input 5 and output compressed (e.g., encoded) video bits 10. The video encoder 625 may convert the video stream input 5 into discrete video frames. The video stream input 5 may also be an image; accordingly, the compressed (e.g., encoded) video bits 10 may also be compressed image bits. The video encoder 625 may further convert each discrete video frame (or image) into a matrix of blocks (hereinafter referred to as blocks). For example, a video frame (or image) may be converted to a 16×16, a 16×8, an 8×8, an 8×4, a 4×4, a 4×2, a 2×2, or similar matrix of blocks, each block having a number of pixels. Although these example matrices are listed, exemplary embodiments are not limited thereto.
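
The block decomposition described above can be sketched as follows. This is a minimal illustration, not the encoder's actual implementation: the function name `split_into_blocks` is hypothetical, the frame is modeled as a plain list of pixel rows, and the listed sizes are read here as block dimensions in pixels, which is an assumption.

```python
def split_into_blocks(frame, block_h, block_w):
    """Split a 2D list of pixel values into blocks of block_h x block_w pixels.

    Returns a matrix (list of rows) of blocks. Edge blocks may be smaller
    when the frame size is not a multiple of the block size.
    """
    rows = len(frame)
    cols = len(frame[0]) if rows else 0
    blocks = []
    for top in range(0, rows, block_h):
        block_row = []
        for left in range(0, cols, block_w):
            # Slice the rows first, then the columns of each row.
            block = [row[left:left + block_w] for row in frame[top:top + block_h]]
            block_row.append(block)
        blocks.append(block_row)
    return blocks


# An 8x8 frame whose pixel value encodes its position (row * 8 + column).
frame = [[r * 8 + c for c in range(8)] for r in range(8)]
blocks = split_into_blocks(frame, 4, 4)  # 2x2 matrix of 4x4-pixel blocks
```

Each entry `blocks[i][j]` can then be passed on to prediction, transform, and quantization independently.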

The compressed video bits 10 may represent the output of the video encoder system 600. For example, the compressed video bits 10 may represent an encoded video frame (or an encoded image). For example, the compressed video bits 10 may be ready for transmission to a receiving device (not shown). For example, the video bits may be passed to a system transceiver (not shown) for transmission to the receiving device.

The at least one processor 605 may be configured to execute computer instructions associated with the controller 620 and/or the video encoder 625. The at least one processor 605 may be a shared resource. For example, the video encoder system 600 may be an element of a larger system (e.g., a mobile device). Therefore, the at least one processor 605 may be configured to execute computer instructions associated with other elements within the larger system (e.g., image/video serving, web browsing, or wired/wireless communication).

In the example of FIG. 6B, the video decoder system 650 may be at least one computing device and may represent virtually any computing device configured to perform the methods described herein. As such, the video decoder system 650 may include various components that can be used to implement the techniques described herein, or different or future versions thereof. By way of example, the video decoder system 650 is illustrated as including at least one processor 655 and at least one memory 660 (e.g., a computer-readable storage medium).

Thus, the at least one processor 655 may be used to execute instructions stored on the at least one memory 660, thereby implementing the various features and functions described herein, or additional or alternative features and functions. The at least one processor 655 and the at least one memory 660 may also be used for various other purposes. In particular, the at least one memory 660 may represent an example of the various types of memory, and related hardware and software, that may be used to implement any one of the modules described herein. According to exemplary embodiments, the video encoder system 600 and the video decoder system 650 may be included in the same larger system (e.g., a personal computer, a mobile device, and the like). According to exemplary embodiments, the video decoder system 650 may be configured to implement the reverse or opposite of the techniques described with respect to the video encoder system 600.

The at least one memory 660 may be configured to store data and/or information associated with the video decoder system 650. For example, the at least one memory 660 may be configured to store codecs associated with decoding encoded spherical video data. For example, the at least one memory may be configured to store code associated with decoding an encoded tile and a separately encoded spherical video frame, as well as code for replacing pixels in the decoded spherical video frame with the decoded tile. The at least one memory 660 may be a shared resource. For example, the video decoder system 650 may be an element of a larger system (e.g., a personal computer, a mobile device, and the like). Therefore, the at least one memory 660 may be configured to store data and/or information associated with other elements within the larger system (e.g., web browsing or wireless communication).

The controller 670 may be configured to generate various control signals and communicate the control signals to various blocks in the video decoder system 650. The controller 670 may be configured to generate the control signals to implement the video decoding techniques described below. According to exemplary embodiments, the controller 670 may be configured to control the video decoder 675 to decode a video frame. The controller 670 may be configured to generate control signals corresponding to decoding video. Further details regarding the functions and operation of the video decoder 675 and the controller 670 are described below.

The video decoder 675 may be configured to receive compressed (e.g., encoded) video bits 10 as input and output a video stream 5. The video decoder 675 may convert discrete video frames of the compressed video bits 10 into the video stream 5. The compressed (e.g., encoded) video bits 10 may also be compressed image bits; accordingly, the video stream 5 may also be an image.

The at least one processor 655 may be configured to execute computer instructions associated with the controller 670 and/or the video decoder 675. The at least one processor 655 may be a shared resource. For example, the video decoder system 650 may be an element of a larger system (e.g., a personal computer, a mobile device, and the like). Therefore, the at least one processor 655 may be configured to execute computer instructions associated with other elements within the larger system (e.g., web browsing or wireless communication).

FIGS. 7A and 7B illustrate flow diagrams for the video encoder 625 shown in FIG. 6A and the video decoder 675 shown in FIG. 6B, respectively, according to at least one exemplary embodiment. The video encoder 625 (described above) includes a spherical-to-2D representation block 705, a prediction block 710, a transform block 715, a quantization block 720, an entropy encoding block 725, an inverse quantization block 730, an inverse transform block 735, a reconstruction block 740, a loop filter block 745, a tile representation block 790, and a view frame storage 795. Other structural variations of the video encoder 625 can be used to encode the input video stream 5. As shown in FIG. 7A, dashed lines represent a reconstruction path among several of the blocks, and solid lines represent a forward path among several of the blocks.

Each of the aforementioned blocks may be executed as software code stored in a memory (e.g., the at least one memory 610) associated with a video encoder system (e.g., as shown in FIG. 6A) and executed by at least one processor (e.g., the at least one processor 605) associated with the video encoder system. However, alternative embodiments are contemplated, such as a video encoder embodied as a special-purpose processor. For example, each of the aforementioned blocks (alone and/or in combination) may be an application-specific integrated circuit, or ASIC. For example, an ASIC may be configured as the transform block 715 and/or the quantization block 720.

The spherical-to-2D representation block 705 may be configured to map a spherical frame or image to a 2D representation of the spherical frame or image. For example, the sphere may be projected onto the surface of another shape (e.g., a square, a rectangle, a cylinder, and/or a cube). The projection may be, for example, equirectangular or semi-equirectangular.
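
As an illustration of one such mapping, the following sketch converts a direction on the sphere to pixel coordinates under a standard equirectangular projection. The function name `sphere_to_equirect`, the angle conventions (yaw as longitude, pitch as latitude, both in radians), and the frame size are assumptions made for this example; the description does not fix them.

```python
import math


def sphere_to_equirect(yaw, pitch, width, height):
    """Map a direction on the sphere to (x, y) pixel coordinates.

    yaw is the longitude in [-pi, pi]; pitch is the latitude in
    [-pi/2, pi/2]. Row 0 of the 2D frame corresponds to the north pole.
    """
    u = (yaw + math.pi) / (2 * math.pi)   # 0..1 across the frame width
    v = (math.pi / 2 - pitch) / math.pi   # 0..1 down the frame height
    # Clamp so the far edge (u == 1 or v == 1) still lands on a valid pixel.
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return x, y


center = sphere_to_equirect(0.0, 0.0, 3840, 1920)  # looking straight ahead
```

In this projection, longitude maps linearly to the horizontal axis and latitude to the vertical axis, so regions near the poles are stretched across many more pixels than regions near the equator.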

The prediction block 710 may be configured to exploit video frame coherence (e.g., pixels that have not changed compared to previously encoded pixels). Prediction may include two types: intra-frame prediction and inter-frame prediction. Intra-frame prediction relates to predicting the pixel values in a block of an image relative to reference samples in neighboring, previously encoded blocks of the same image. In intra-frame prediction, a sample is predicted from reconstructed pixels within the same frame in order to reduce the residual that is coded by the transform (e.g., transform block 715) and entropy encoding (e.g., entropy encoding block 725) parts of a predictive transform codec. Inter-frame prediction relates to predicting the pixel values in a block of an image relative to data of a previously encoded image.
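
A minimal sketch of intra-frame prediction follows, using a DC-style predictor (the mean of the neighboring reconstructed pixels). DC prediction is chosen here only because it is the simplest common mode; the description does not specify a prediction mode, and the function names are hypothetical.

```python
def dc_predict(top_neighbors, left_neighbors):
    """Predict every pixel of a block as the mean of its reconstructed neighbors."""
    samples = list(top_neighbors) + list(left_neighbors)
    return round(sum(samples) / len(samples))


def residual(block, predicted_value):
    """Difference between the actual pixels and the prediction."""
    return [[pixel - predicted_value for pixel in row] for row in block]


block = [[102, 101], [100, 99]]
dc = dc_predict(top_neighbors=[100, 100], left_neighbors=[100, 100])
res = residual(block, dc)  # small values, which are cheaper to code than raw pixels
```

Only the residual, plus the choice of prediction mode, needs to be passed to the transform and entropy coding stages.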

The transform block 715 may be configured to convert the values of the pixels from the spatial domain to transform coefficients in a transform domain. The transform coefficients may correspond to a two-dimensional matrix of coefficients that is ordinarily the same size as the original block. In other words, there may be as many transform coefficients as there are pixels in the original block. However, as a result of the transform, some of the transform coefficients may have values equal to zero.

The transform block 715 may be configured to transform the residual (from the prediction block 710) into transform coefficients in, for example, the frequency domain. Typically, transforms include the Karhunen-Loève Transform (KLT), the Discrete Cosine Transform (DCT), the Singular Value Decomposition Transform (SVD), and the asymmetric discrete sine transform (ADST).
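
To make the transform step concrete, the following sketch computes an orthonormal 2D DCT-II of a small square block in plain Python. It is a direct, unoptimized evaluation of the textbook formula, not the codec's implementation; the function name `dct_2d` is hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a list of lists."""
    n = len(block)

    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


flat = [[10] * 4 for _ in range(4)]  # a perfectly uniform 4x4 block
coeffs = dct_2d(flat)
```

For the uniform block, all of the energy ends up in the single coefficient `coeffs[0][0]`, and the other fifteen coefficients are zero up to floating-point error. The output has the same size as the input block, with many coefficients at or near zero.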

The quantization block 720 may be configured to reduce the data in each transform coefficient. Quantization may involve mapping values within a relatively large range to values within a relatively small range, thus reducing the amount of data needed to represent the quantized transform coefficients. The quantization block 720 may convert the transform coefficients into discrete quantum values, which are referred to as quantized transform coefficients or quantization levels. For example, the quantization block 720 may be configured to add zeros to the data associated with a transform coefficient. For example, an encoding standard may define 128 quantization levels in a scalar quantization process.
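
A minimal sketch of uniform scalar quantization, assuming a single step size for all coefficients. Real codecs typically use per-frequency step sizes and a bounded number of levels; the function name and the step value here are illustrative only.

```python
def quantize(coeffs, step):
    """Map each transform coefficient to the nearest multiple of step (as a level)."""
    return [[round(c / step) for c in row] for row in coeffs]


coeffs = [[40.0, 3.2], [-7.9, 0.4]]
levels = quantize(coeffs, step=4)  # small coefficients collapse to zero
```

The coefficient 0.4 becomes level 0, which illustrates how quantization introduces zeros and shrinks the range of values that must be coded.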

The quantized transform coefficients are then entropy encoded by the entropy encoding block 725. The entropy-encoded coefficients, together with the information required to decode the block, such as the type of prediction used, motion vectors, and quantizer value, are then output as the compressed video bits 10. The compressed video bits 10 can be formatted using various techniques, such as run-length encoding (RLE) and zero-run coding.
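
The zero-run coding mentioned above can be sketched as follows. The pair format `(zero_run_length, value)` and the use of `None` to mark a trailing run of zeros are assumptions made for this example, not a format defined by the description.

```python
def zero_run_encode(values):
    """Encode a sequence as (zero_run_length, nonzero_value) pairs.

    A trailing run of zeros is emitted as (run_length, None).
    """
    pairs = []
    run = 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((run, None))
    return pairs


def zero_run_decode(pairs):
    """Invert zero_run_encode."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        if v is not None:
            out.append(v)
    return out


levels = [10, 0, 0, 1, -2, 0, 0, 0]
encoded = zero_run_encode(levels)
```

Because quantized coefficient sequences tend to contain long runs of zeros, the pair list is usually much shorter than the original sequence.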

The reconstruction path in FIG. 7A is present to ensure that both the video encoder 625 and the video decoder 675 (described below with regard to FIG. 7B) use the same reference frames to decode the compressed video bits 10 (or compressed image bits). The reconstruction path performs functions similar to those that take place during the decoding process, which are discussed in more detail below. These functions include inverse quantizing the quantized transform coefficients at the inverse quantization block 730 and inverse transforming the inverse-quantized transform coefficients at the inverse transform block 735 in order to produce a derivative residual block (derivative residual). At the reconstruction block 740, the prediction block that was predicted at the prediction block 710 can be added to the derivative residual to create a reconstructed block. A loop filter 745 can then be applied to the reconstructed block to reduce distortion such as blocking artifacts.
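
Why the encoder needs its own reconstruction path can be shown with a deliberately simplified one-dimensional sketch. It omits the transform and loop filter and predicts each sample from the previous reconstructed sample; these simplifications, and the function names, are assumptions of the example. The point is that the encoder tracks the same reference value the decoder will have, so quantization error does not accumulate.

```python
def encode(samples, step):
    """Quantize the residual against the reconstructed (not original) reference."""
    levels, reference = [], 0
    for s in samples:
        level = round((s - reference) / step)  # quantized residual
        levels.append(level)
        reference = reference + level * step   # reconstruction path
    return levels


def decode(levels, step):
    """Rebuild the samples using the same reference update as the encoder."""
    out, reference = [], 0
    for level in levels:
        reference = reference + level * step
        out.append(reference)
    return out


samples = [10, 13, 19, 18]
levels = encode(samples, step=3)
decoded = decode(levels, step=3)
```

Each decoded sample stays within half a quantization step of the original, because the encoder's `reference` is always identical to the decoder's.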

The tile representation block 790 may be configured to convert an image and/or a frame into a plurality of tiles. A tile can be a grouping of pixels. A tile may be a plurality of pixels selected based on a view or a view perspective. The plurality of pixels may be a block, a plurality of blocks, or a macroblock that may include a portion of the spherical image that can be seen (or is anticipated to be seen) by a user. The portion of the spherical image, as a tile, may have a length and a width. The portion of the spherical image may be two-dimensional or substantially two-dimensional. A tile can have a variable size (e.g., how much of the sphere the tile covers). For example, the size of the tile can be encoded and streamed based on, for example, how wide the viewer's field of view is, the proximity to another tile, and/or how quickly the user is rotating their head. For example, if the viewer is continually looking around, a larger, lower-quality tile may be selected. However, if the viewer is focusing on one perspective, a smaller, more detailed tile may be selected.
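
The trade-off between tile size and quality can be sketched as a simple heuristic. Everything numeric here (the 0.5 margin factor, the 90-degree margin cap, the 60 degrees-per-second threshold) and the function name `choose_tile` are hypothetical values chosen for illustration; the description only states the qualitative behavior.

```python
def choose_tile(fov_deg, head_speed_deg_s, fast_threshold_deg_s=60.0):
    """Pick a horizontal tile span and quality tier from viewer behavior.

    Faster head motion widens the tile (more margin around the field of
    view) and lowers its quality; a steady gaze gets a small, detailed tile.
    """
    margin = min(head_speed_deg_s * 0.5, 90.0)  # extra coverage per side, in degrees
    span = min(fov_deg + 2 * margin, 360.0)
    quality = "low" if head_speed_deg_s >= fast_threshold_deg_s else "high"
    return {"span_deg": span, "quality": quality}


steady = choose_tile(90.0, 0.0)
looking_around = choose_tile(90.0, 120.0)
```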

In one implementation, the tile representation block 790 initiates an instruction to the spherical-to-2D representation block 705 that causes the spherical-to-2D representation block 705 to generate tiles. In another implementation, the tile representation block 790 generates the tiles. In either implementation, each tile is then individually encoded. In still another implementation, the tile representation block 790 initiates an instruction to the view frame storage 795 that causes the view frame storage 795 to store encoded images and/or video frames as tiles. The tile representation block 790 can initiate an instruction to the view frame storage 795 that causes the view frame storage 795 to store a tile together with information or metadata about the tile. For example, the information or metadata about the tile may include an indication of the tile's position within the image or frame, information associated with encoding the tile (e.g., resolution, bandwidth, and/or a 3D-to-2D projection algorithm), an association with one or more regions of interest, and the like.

According to an exemplary implementation, the encoder 625 may encode a frame, a portion of a frame, and/or a tile at a different quality (or quality of service (QoS)). According to exemplary embodiments, the encoder 625 may encode a frame, a portion of a frame, and/or a tile multiple times, each at a different QoS. Accordingly, the view frame storage 795 can store frames, portions of frames, and/or tiles representing the same position within an image or frame at different QoS. As such, the aforementioned information or metadata about a tile may include an indication of the QoS at which the frame, portion of the frame, and/or tile was encoded.
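
Storing the same tile position at several QoS levels can be sketched with a small in-memory store. The class below is a hypothetical stand-in for the view frame storage 795: the key layout, the integer QoS scale, and the method names are assumptions for the example.

```python
class ViewFrameStorage:
    """Keeps encoded tiles keyed by (frame index, tile position, QoS level)."""

    def __init__(self):
        self._tiles = {}

    def store(self, frame_index, position, qos, data):
        self._tiles[(frame_index, position, qos)] = data

    def fetch(self, frame_index, position, max_qos):
        """Return (qos, data) for the best stored QoS not exceeding max_qos, or None."""
        candidates = [q for (f, p, q) in self._tiles
                      if f == frame_index and p == position and q <= max_qos]
        if not candidates:
            return None
        best = max(candidates)
        return best, self._tiles[(frame_index, position, best)]


storage = ViewFrameStorage()
storage.store(0, (2, 3), qos=1, data=b"low")
storage.store(0, (2, 3), qos=3, data=b"high")
picked = storage.fetch(0, (2, 3), max_qos=2)  # only the QoS 1 copy qualifies
```

A server could call `fetch` with a `max_qos` derived from the available bandwidth, so that the same request falls back to a lower-quality copy when needed.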

The QoS can be based on a compression algorithm, a resolution, a transmission rate, and/or an encoding scheme. Therefore, the encoder 625 may use a different compression algorithm and/or encoding scheme for each frame, portion of a frame, and/or tile. For example, an encoded tile may be at a higher QoS than the frame (associated with the tile) encoded by the encoder 625. As discussed above, the encoder 625 may be configured to encode a 2D representation of the spherical video frame. Accordingly, the tile (as a viewable perspective including a portion of the spherical video frame) can be encoded with a higher QoS than the 2D representation of the spherical video frame. The QoS may affect the resolution of the frame when decoded. Accordingly, the tile (as a viewable perspective including a portion of the spherical video frame) can be encoded such that the tile, when decoded, has a higher resolution than the decoded 2D representation of the spherical video frame. The tile representation block 790 may indicate the QoS at which the tile should be encoded. The tile representation block 790 may select the QoS based on whether or not the frame, portion of the frame, and/or tile is a region of interest, is within a region of interest, is associated with a seed region, and the like. Regions of interest and seed regions are described in more detail below.

The video encoder 625 described above with regard to FIG. 7A includes the blocks shown. However, exemplary embodiments are not limited thereto. Additional blocks may be added based on the different video encoding configurations and/or techniques used. Further, each of the blocks shown in the video encoder 625 described above with regard to FIG. 7A may be an optional block, depending on the different video encoding configurations and/or techniques used.

FIG. 7B is a schematic block diagram of a decoder 675 configured to decode compressed video bits 10 (or compressed image bits). Similar to the reconstruction path of the encoder 625 discussed previously, the decoder 675 includes an entropy decoding block 750, an inverse quantization block 755, an inverse transform block 760, a reconstruction block 765, a loop filter block 770, a prediction block 775, a deblocking filter block 780, and a 2D representation-to-spherical block 785.

The data elements within the compressed video bits 10 can be decoded by the entropy decoding block 750 (using, for example, Context Adaptive Binary Arithmetic Decoding) to produce a set of quantized transform coefficients. The inverse quantization block 755 dequantizes the quantized transform coefficients, and the inverse transform block 760 inverse transforms (using ADST) the dequantized transform coefficients to produce a derivative residual that can be identical to that created by the reconstruction stage in the encoder 625.

Using header information decoded from the compressed video bits 10, the decoder 675 can use the prediction block 775 to create the same prediction block as was created in the encoder 625. The prediction block can be added to the derivative residual by the reconstruction block 765 to create a reconstructed block. The loop filter block 770 can be applied to the reconstructed block to reduce blocking artifacts. The deblocking filter block 780 can be applied to the reconstructed block to reduce blocking distortion. The result is output as the video stream 5.

The 2D representation-to-spherical block 785 may be configured to map a 2D representation of a spherical frame or image to the spherical frame or image. For example, mapping the 2D representation of the spherical frame or image to the spherical frame or image can be the inverse of the 3D-to-2D mapping performed by the encoder 625.
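
For an equirectangular 2D representation, the inverse mapping can be sketched as below. The function name `equirect_to_sphere`, the choice of sampling at pixel centers, and the angle conventions (yaw as longitude, pitch as latitude, in radians, with row 0 at the north pole) are assumptions for this example.

```python
import math


def equirect_to_sphere(x, y, width, height):
    """Map the center of pixel (x, y) in an equirectangular frame to (yaw, pitch)."""
    yaw = (x + 0.5) / width * 2 * math.pi - math.pi
    pitch = math.pi / 2 - (y + 0.5) / height * math.pi
    return yaw, pitch


yaw, pitch = equirect_to_sphere(2, 1, width=4, height=2)
```

On the tiny 4×2 frame used here, pixel (2, 1) maps to a direction 45 degrees to the right of center and 45 degrees below the horizon.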

The video decoder 675 described above with regard to FIG. 7B includes the blocks shown. However, exemplary embodiments are not limited thereto. Additional blocks may be added based on the different video encoding configurations and/or techniques used. Further, each of the blocks shown in the video decoder 675 described above with regard to FIG. 7B may be an optional block, depending on the different video encoding configurations and/or techniques used.

The encoder 625 and the decoder 675 may be configured to encode spherical video and/or images and to decode spherical video and/or images, respectively. A spherical image is an image that includes a plurality of pixels spherically organized. In other words, a spherical image is an image that is continuous in all directions. Accordingly, a viewer of a spherical image can reposition or reorient (e.g., move their head or eyes) in any direction (e.g., up, down, left, right, or any combination thereof) and continuously see a portion of the image.

In an exemplary implementation, parameters used in and/or determined by the encoder 625 can be used by other elements of the encoder 405. For example, motion vectors (e.g., as used in prediction) used to encode the 2D representation could be used to encode the tile. Further, parameters used in and/or determined by the prediction block 710, the transform block 715, the quantization block 720, the entropy encoding block 725, the inverse quantization block 730, the inverse transform block 735, the reconstruction block 740, and the loop filter block 745 could be shared between the encoder 625 and the encoder 405.

A portion of a spherical video frame or image may be processed as an image. Therefore, the portion of the spherical video frame may be converted (or decomposed) into a C×R matrix of blocks (hereinafter referred to as blocks). For example, the portion of the spherical video frame may be converted to a C×R matrix such as a 16×16, a 16×8, an 8×8, an 8×4, a 4×4, a 4×2, a 2×2, or similar matrix of blocks, each block having a number of pixels.

FIG. 8 illustrates a system 800 according to at least one exemplary embodiment. As shown in FIG. 8, the system 800 includes the controller 620, the controller 670, the video encoder 625, the view frame storage 795, and an orientation sensor 835. The controller 620 further includes a view position control module 805, a tile control module 810, and a view perspective data store 815. The controller 670 further includes a view position determination module 820, a tile request module 825, and a buffer 830.

According to an exemplary implementation, the orientation sensor 835 detects the orientation (or a change in orientation) of the viewer's eyes (or head), the view position determination module 820 determines a view, perspective, or view perspective based on the detected orientation, and the tile request module 825 communicates the view, perspective, or view perspective as part of a request for a tile or a plurality of tiles (in addition to the spherical video). According to another exemplary implementation, the orientation sensor 835 detects the orientation (or a change in orientation) based on an image panning orientation as rendered on an HMD or a display. For example, a user of the HMD may change the depth of focus. In other words, the user of the HMD may change focus from an object that is far away to an object that is near (or vice versa), with or without a change in orientation. As another example, a user may use a mouse, a trackpad, or a gesture (e.g., on a touch-sensitive display) to select, move, drag, or enlarge a portion of the spherical video or image as rendered on the display.

A request for a tile may be communicated together with a request for a frame of the spherical video. Alternatively, the request for the tile may be communicated separately from a request for a frame of the spherical video. For example, the request for the tile may be in response to a changed view, perspective, or view perspective, resulting in a need to replace previously requested and/or queued tiles.
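
One way to picture the replacement of previously queued tiles is the sketch below. It is only an illustration under stated assumptions: `TileRequest`, `TileRequestQueue`, and the rule that a changed view discards queued requests for the same or later frames are hypothetical and do not come from the specification.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass(frozen=True)
class TileRequest:
    frame_index: int
    view: Tuple[float, float]  # (latitude, longitude) in degrees, an assumed encoding


class TileRequestQueue:
    """Holds pending tile requests; a changed view replaces stale ones."""

    def __init__(self) -> None:
        self._pending: Deque[TileRequest] = deque()

    def submit(self, request: TileRequest) -> None:
        # Keep requests for earlier frames, and requests that already match the
        # new view. Everything else was queued for an outdated view.
        self._pending = deque(
            r for r in self._pending
            if r.frame_index < request.frame_index or r.view == request.view
        )
        if request not in self._pending:
            self._pending.append(request)

    def pending(self) -> List[TileRequest]:
        return list(self._pending)


# Usage: the viewer turns before frame 11 is served, so its queued tile is replaced.
queue = TileRequestQueue()
queue.submit(TileRequest(10, (0.0, 0.0)))
queue.submit(TileRequest(11, (0.0, 0.0)))
queue.submit(TileRequest(11, (10.0, 20.0)))
```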

The view position control module 805 receives and processes the request for the tile. For example, the view position control module 805 may determine, based on the view, a frame and a position of the tile or plurality of tiles in the frame. The view position control module 805 may then instruct the tile control module 810 to select the tile or plurality of tiles. Selecting the tile or plurality of tiles may include passing a parameter to the video encoder 625. The parameter may be used by the video encoder 625 during encoding of the spherical video and/or the tile. Alternatively, selecting the tile or plurality of tiles may include selecting the tile or plurality of tiles from the view frame storage 795.
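
The two selection paths described above (encoding on request versus reading an already encoded tile from storage) might be combined as in the following sketch. The names `select_tile` and `fake_encode`, the `(frame_index, row, col)` key, and the choice to store a freshly encoded tile for reuse are assumptions for illustration; the specification presents the two paths only as alternatives.

```python
from typing import Callable, Dict, List, Tuple

TileKey = Tuple[int, int, int]  # (frame_index, row, col), a hypothetical tile identifier


def select_tile(key: TileKey,
                storage: Dict[TileKey, bytes],
                encode: Callable[[TileKey], bytes]) -> bytes:
    """Return an encoded tile, preferring a stored copy over re-encoding."""
    if key in storage:
        return storage[key]
    data = encode(key)   # stands in for passing parameters to a video encoder
    storage[key] = data  # assumption: keep the result for later requests
    return data


# Usage with a stand-in encoder that records how often it is called.
calls: List[TileKey] = []


def fake_encode(key: TileKey) -> bytes:
    calls.append(key)
    return repr(key).encode()


store: Dict[TileKey, bytes] = {}
first = select_tile((0, 2, 6), store, fake_encode)
second = select_tile((0, 2, 6), store, fake_encode)
```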

Accordingly, the tile control module 810 may be configured to select a tile (or a plurality of tiles) based on the view, perspective, or view perspective of a user watching the spherical video. A tile may be a plurality of pixels selected based on the view. The plurality of pixels may be a block, a plurality of blocks, or a macroblock that may include a portion of the spherical image that can be seen by the user. The portion of the spherical image may have a length and a width. The portion of the spherical image may be two-dimensional or substantially two-dimensional. A tile may have a variable size (e.g., how much of the sphere the tile covers). For example, the size of the tile may be encoded and streamed based on how wide the viewer's field of view is and/or how quickly the user is rotating his or her head. For example, if the viewer is continually looking around, larger, lower-quality tiles may be selected. If, however, the viewer is focused on one perspective, smaller, more detailed tiles may be selected.
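
The trade-off between tile size and tile quality can be sketched as a simple rule. The function `choose_tile`, the 60 degrees-per-second threshold, and the margin factors are illustrative assumptions only; the specification does not give numeric values.

```python
def choose_tile(fov_deg: float, head_speed_deg_per_s: float,
                fast_threshold: float = 60.0) -> dict:
    """Pick an angular tile size and a quality label from viewer behaviour.

    All thresholds and margins here are hypothetical values for illustration.
    """
    if head_speed_deg_per_s >= fast_threshold:
        # Viewer is looking around: cover more of the sphere at lower quality.
        margin, quality = 1.5, "low"
    else:
        # Viewer is focused on one perspective: smaller, more detailed tile.
        margin, quality = 1.25, "high"
    return {
        "width_deg": min(360.0, fov_deg * margin),
        "height_deg": min(180.0, fov_deg * margin),
        "quality": quality,
    }


# Usage: the same 90-degree field of view with fast and slow head motion.
fast = choose_tile(fov_deg=90.0, head_speed_deg_per_s=120.0)
slow = choose_tile(fov_deg=90.0, head_speed_deg_per_s=5.0)
```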

Accordingly, the orientation sensor 835 may be configured to detect the orientation (or a change in orientation) of the viewer's eyes (or head). For example, the orientation sensor 835 may include an accelerometer to detect movement and a gyroscope to detect orientation. Alternatively or additionally, the orientation sensor 835 may include a camera or an infrared sensor focused on the viewer's eyes or head to determine the orientation of the viewer's eyes or head. Alternatively or additionally, the orientation sensor 835 may determine the portion of the spherical video or image as rendered on the display in order to detect the orientation of the spherical video or image. The orientation sensor 835 may be configured to communicate orientation information and orientation change information to the view position determination module 820.

The view position determination module 820 may be configured to determine a view or perspective view in relation to the spherical video (e.g., the portion of the spherical video that the viewer is currently looking at). The view, perspective, or view perspective may be determined as a position, point, or focal point on the spherical video. For example, the view may be a latitude and longitude position on the spherical video. The view, perspective, or view perspective may also be determined as a side of a cube based on the spherical video. The view (e.g., the latitude and longitude position, or the side) may be communicated to the view position control module 805 using, for example, the Hypertext Transfer Protocol (HTTP).
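
To show how a latitude and longitude view could be related to a side of a cube, the sketch below converts the view direction to a unit vector and picks the face of the dominant axis. The axis convention (longitude 0 facing "front", positive longitude toward "right", positive latitude toward "top") and the name `cube_face` are assumptions for this example.

```python
import math


def cube_face(lat_deg: float, lon_deg: float) -> str:
    """Return the cube side that a (latitude, longitude) view direction hits."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = math.cos(lat) * math.cos(lon)  # toward the assumed "front" side
    y = math.cos(lat) * math.sin(lon)  # toward the assumed "right" side
    z = math.sin(lat)                  # toward the assumed "top" side
    ax, ay, az = abs(x), abs(y), abs(z)
    # The axis with the largest magnitude determines the face.
    if ax >= ay and ax >= az:
        return "front" if x > 0 else "back"
    if ay >= az:
        return "right" if y > 0 else "left"
    return "top" if z > 0 else "bottom"
```

For instance, under these assumptions a view at latitude 0 and longitude 90 degrees falls on the "right" side, and a view near latitude 90 degrees falls on the "top" side.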

The view position control module 805 may be configured to determine a view position (e.g., a frame and a position within the frame) of the tile or plurality of tiles in the spherical video. For example, the view position control module 805 may select a rectangle centered on the view position, point, or focal point (e.g., the latitude and longitude position, or the side). The tile control module 810 may be configured to select the rectangle as the tile or plurality of tiles. The tile control module 810 may be configured to instruct the video encoder 625 (e.g., via a parameter or configuration setting) to encode the selected tile or plurality of tiles. Additionally or alternatively, the tile control module 810 may be configured to select the tile or plurality of tiles from the view frame storage 795.
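
A rectangle centered on the view position, and the tiles it overlaps, could be computed as sketched below. The sketch assumes an equirectangular frame with longitude in [-180, 180] mapped left to right and latitude 90 at the top row; it clamps at the frame borders and ignores horizontal wrap-around. The function names and the fixed square tile grid are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom exclusive


def view_rectangle(lat_deg: float, lon_deg: float,
                   frame_w: int, frame_h: int,
                   rect_w: int, rect_h: int) -> Rect:
    """Pixel rectangle centered on the view in an assumed equirectangular frame."""
    cx = (lon_deg + 180.0) / 360.0 * frame_w
    cy = (90.0 - lat_deg) / 180.0 * frame_h
    left = max(0, int(cx - rect_w / 2))
    top = max(0, int(cy - rect_h / 2))
    right = min(frame_w, left + rect_w)
    bottom = min(frame_h, top + rect_h)
    return left, top, right, bottom


def tiles_covering(rect: Rect, tile_size: int) -> List[Tuple[int, int]]:
    """List (row, col) indices of the square tiles that overlap the rectangle."""
    left, top, right, bottom = rect
    cols = range(left // tile_size, (right - 1) // tile_size + 1)
    rows = range(top // tile_size, (bottom - 1) // tile_size + 1)
    return [(r, c) for r in rows for c in cols]


# Usage: a 512x512 rectangle at the center of a 3840x1920 frame, with 256-pixel tiles.
rect = view_rectangle(0.0, 0.0, 3840, 1920, 512, 512)
tiles = tiles_covering(rect, tile_size=256)
```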

As will be appreciated, the system 600 shown in FIG. 6A and the system 650 shown in FIG. 6B, and/or the system 800 shown in FIG. 8, may be implemented as an element of and/or an extension of the generic computer device 900 and/or the generic mobile computer device 950 described below with respect to FIG. 9. Alternatively or additionally, the system 600 shown in FIG. 6A and the system 650 shown in FIG. 6B, and/or the system 800 shown in FIG. 8, may be implemented in a system separate from the generic computer device 900 and/or the generic mobile computer device 950 that has some or all of the features described below with respect to the generic computer device 900 and/or the generic mobile computer device 950.

FIG. 9 is a schematic block diagram of a computer device and a mobile computer device that can be used to implement the techniques described herein. FIG. 9 shows an example of a generic computer device 900 and a generic mobile computer device 950 that may be used with the techniques described herein. The computing device 900 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The computing device 950 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed in this document.

The computing device 900 includes a processor 902, a memory 904, a storage device 906, a high-speed interface 908 connecting to the memory 904 and high-speed expansion ports 910, and a low-speed interface 912 connecting to a low-speed bus 914 and the storage device 906. Each of the components 902, 904, 906, 908, 910, and 912 is interconnected using various buses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 902 can process instructions for execution within the computing device 900, including instructions stored in the memory 904 or on the storage device 906 to display graphical information for a GUI on an external input/output device, such as a display 916 coupled to the high-speed interface 908. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices 900 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).

The memory 904 stores information within the computing device 900. In one implementation, the memory 904 is a volatile memory unit or units. In another implementation, the memory 904 is a non-volatile memory unit or units. The memory 904 may also be another form of computer-readable medium, such as a magnetic or optical disk.

The storage device 906 is capable of providing mass storage for the computing device 900. In one implementation, the storage device 906 may be or may contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, a tape device, a flash memory or other similar solid-state memory device, or an array of devices, including devices in a storage area network or other configurations. A computer program product can be tangibly embodied in an information carrier. The computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-readable or machine-readable medium, such as the memory 904, the storage device 906, or memory on the processor 902.

The high-speed controller 908 manages bandwidth-intensive operations for the computing device 900, while the low-speed controller 912 manages less bandwidth-intensive operations. Such allocation of functions is exemplary only. In one implementation, the high-speed controller 908 is coupled to the memory 904, to the display 916 (e.g., through a graphics processor or accelerator), and to the high-speed expansion ports 910, which may accept various expansion cards (not shown). In this implementation, the low-speed controller 912 is coupled to the storage device 906 and the low-speed expansion port 914. The low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, or a scanner, or to a networking device, such as a switch or router, e.g., through a network adapter.

The computing device 900 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 920, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 924. In addition, it may be implemented in a personal computer, such as a laptop computer 922. Alternatively, components from the computing device 900 may be combined with other components in a mobile device (not shown), such as the device 950. Each of such devices may contain one or more of the computing devices 900, 950, and an entire system may be made up of multiple computing devices 900, 950 communicating with each other.

The computing device 950 includes a processor 952, a memory 964, an input/output device such as a display 954, a communication interface 966, and a transceiver 968, among other components. The device 950 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage. Each of the components 950, 952, 964, 954, 966, and 968 is interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.

The processor 952 can execute instructions within the computing device 950, including instructions stored in the memory 964. The processor may be implemented as a chipset of chips that include separate and multiple analog and digital processors. The processor may provide, for example, for coordination of the other components of the device 950, such as control of user interfaces, applications run by the device 950, and wireless communication by the device 950.

The processor 952 may communicate with a user through a control interface 958 and a display interface 956 coupled to the display 954. The display 954 may be, for example, a TFT LCD (Thin-Film-Transistor Liquid Crystal Display), an OLED (Organic Light Emitting Diode) display, or other appropriate display technology. The display interface 956 may comprise appropriate circuitry for driving the display 954 to present graphical and other information to the user. The control interface 958 may receive commands from the user and convert them for submission to the processor 952. In addition, an external interface 962 may be provided in communication with the processor 952, so as to enable near-area communication of the device 950 with other devices. The external interface 962 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.

The memory 964 stores information within the computing device 950. The memory 964 can be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units. An expansion memory 974 may also be provided and connected to the device 950 through an expansion interface 972, which may include, for example, a SIMM (Single In Line Memory Module) card interface. Such expansion memory 974 may provide extra storage space for the device 950, or may also store applications or other information for the device 950. Specifically, the expansion memory 974 may include instructions to carry out or supplement the processes described above, and may also include secure information. Thus, for example, the expansion memory 974 may be provided as a security module for the device 950, and may be programmed with instructions that permit secure use of the device 950. In addition, secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.

The memory may include, for example, flash memory and/or NVRAM memory, as discussed below. In one implementation, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-readable or machine-readable medium, such as the memory 964, the expansion memory 974, or memory on the processor 952, and may be received, for example, over the transceiver 968 or the external interface 962.

The device 950 may communicate wirelessly through the communication interface 966, which may include digital signal processing circuitry where necessary. The communication interface 966 may provide for communications under various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA2000, or GPRS, among others. Such communication may occur, for example, through the radio-frequency transceiver 968. In addition, short-range communication may occur, such as using a Bluetooth, Wi-Fi, or other such transceiver (not shown). In addition, a GPS (Global Positioning System) receiver module 970 may provide additional navigation-related and location-related wireless data to the device 950, which may be used as appropriate by applications running on the device 950.

The device 950 may also communicate audibly using an audio codec 960, which may receive spoken information from a user and convert it to usable digital information. The audio codec 960 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of the device 950. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.), and may also include sound generated by applications operating on the device 950.

The computing device 950 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 980. It may also be implemented as part of a smartphone 982, a personal digital assistant, or other similar mobile device.

Some of the above exemplary embodiments are described as processes or methods depicted as flowcharts. Although the flowcharts describe the operations as sequential processes, many of the operations may be performed in parallel, concurrently, or simultaneously. In addition, the order of operations may be rearranged. The processes may be terminated when their operations are completed, but may also have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, subprograms, and the like.

The methods discussed above, some of which are illustrated by the flowcharts, may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine-readable or computer-readable medium, such as a storage medium. A processor or processors may perform the necessary tasks.

Specific structural and functional details disclosed herein are merely representative for purposes of describing exemplary embodiments. The exemplary embodiments may, however, be embodied in many alternate forms and should not be construed as limited to only the embodiments set forth herein.

It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the exemplary embodiments. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

It will be understood that when an element is referred to as being connected or coupled to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. In contrast, when an element is referred to as being directly connected or directly coupled to another element, there are no intervening elements present. Other words used to describe the relationship between elements should be interpreted in a like fashion (e.g., "between" versus "directly between," "adjacent" versus "directly adjacent," etc.).

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the exemplary embodiments. As used herein, the singular forms are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes," and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may in fact be executed concurrently, or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the exemplary embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Portions of the above exemplary embodiments and the corresponding detailed description are presented in terms of software, or algorithms and symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the ones by which those of ordinary skill in the art effectively convey the substance of their work to others of ordinary skill in the art. An algorithm, as the term is used here, and as it is used generally, is conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of optical, electrical, or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

In the above exemplary embodiments, reference to acts and symbolic representations of operations (e.g., in the form of flowcharts) that may be implemented as program modules or functional processes includes routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types and that may be described and/or implemented using existing hardware at existing structural elements. Such existing hardware may include one or more central processing units (CPUs), digital signal processors (DSPs), application-specific integrated circuits, field programmable gate array (FPGA) computers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, or as is apparent from the discussion, terms such as processing, computing, calculating, determining, or displaying refer to the actions and processes of a computer system, or similar electronic computing device, that manipulates data represented as physical, electronic quantities within the computer system's registers and memories and transforms that data into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission, or display devices.

In addition, the software-implemented aspects of the exemplary embodiments are typically encoded on some form of non-transitory program storage medium or implemented over some type of transmission medium. The program storage medium may be magnetic (e.g., a floppy disk or a hard drive) or optical (e.g., a compact disc read-only memory, or CD-ROM), and may be read-only or random access. Similarly, the transmission medium may be twisted wire pairs, coaxial cable, optical fiber, or some other suitable transmission medium known in the art. The exemplary embodiments are not limited by these aspects of any given implementation.

Finally, although the appended claims set out particular combinations of the features described herein, the scope of the present disclosure is not limited to the particular combinations claimed. Instead, it extends to encompass any combination of the features or embodiments disclosed herein, irrespective of whether that particular combination is specifically enumerated in the appended claims at this time.

Claims

1. A method comprising:
determining at least one preferred view perspective associated with a three-dimensional (3D) video;
encoding a first portion of the 3D video corresponding to the at least one preferred view perspective with a first quality; and
encoding a second portion of the 3D video with a second quality, wherein the first quality is higher than the second quality.
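
By way of non-limiting illustration, the two-quality encoding recited above might be planned as in the following sketch. The sketch assumes that the 3D video is divided into tiles addressed by yaw and pitch, and that a tile belongs to the first portion when its centre lies within a fixed angular window of a preferred view perspective. The names `Tile`, `angular_gap`, `assign_quality`, and `half_fov_deg`, as well as the tiling scheme itself, are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Hypothetical tile of a spherical frame, identified by its centre."""
    yaw_deg: float    # horizontal centre, in [-180, 180)
    pitch_deg: float  # vertical centre, in [-90, 90]


def angular_gap(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two yaw angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def assign_quality(tiles, preferred_views, half_fov_deg=45.0,
                   first_quality="high", second_quality="low"):
    """Map each tile to the first quality if it falls inside the window
    around any preferred view perspective, otherwise to the second quality.

    preferred_views is an iterable of (yaw_deg, pitch_deg) pairs.
    The window size half_fov_deg is an assumption made for illustration.
    """
    plan = {}
    for tile in tiles:
        in_view = any(
            angular_gap(tile.yaw_deg, yaw) <= half_fov_deg
            and abs(tile.pitch_deg - pitch) <= half_fov_deg
            for yaw, pitch in preferred_views
        )
        plan[tile] = first_quality if in_view else second_quality
    return plan
```

The resulting plan only labels tiles; an actual encoder would then be invoked per tile with parameters corresponding to each label.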

2. The method of claim 1, further comprising:
storing the first portion of the 3D video in a data store;
storing the second portion of the 3D video in the data store;
receiving a request for streaming video; and
streaming the first portion of the 3D video and the second portion of the 3D video from the data store as the streaming video.

3. The method of claim 1, further comprising:
receiving a request for streaming video, the request including an indication of a user view perspective;
selecting 3D video corresponding to the user view perspective as the encoded first portion of the 3D video; and
streaming the selected first portion of the 3D video and the second portion of the 3D video as the streaming video.

4. The method of claim 1, further comprising:
receiving a request for streaming video, the request including an indication of a user view perspective associated with the 3D video;
determining whether the user view perspective is stored in a view perspective data store;
upon determining that the user view perspective is stored in the view perspective data store, incrementing a counter associated with the user view perspective; and
upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the counter associated with the user view perspective to 1.
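
By way of non-limiting illustration, the counter handling recited above might be sketched as follows. The sketch assumes an in-memory dictionary as the view perspective data store and assumes that a user view perspective is reported as a yaw and pitch in degrees, quantised into buckets so that nearby perspectives share one counter. The class name `ViewPerspectiveStore`, the bucketing, and the `bucket_deg` parameter are hypothetical; the disclosure does not prescribe them, and yaw wraparound at ±180 degrees is ignored for brevity.

```python
class ViewPerspectiveStore:
    """Hypothetical in-memory view perspective data store with counters."""

    def __init__(self, bucket_deg=10.0):
        self.bucket_deg = bucket_deg
        self.counters = {}  # maps a quantised (yaw, pitch) key to its counter

    def _key(self, yaw_deg, pitch_deg):
        # Quantise so that nearby perspectives map to the same stored entry.
        return (round(yaw_deg / self.bucket_deg),
                round(pitch_deg / self.bucket_deg))

    def record(self, yaw_deg, pitch_deg):
        """Register one streaming request and return the updated counter."""
        key = self._key(yaw_deg, pitch_deg)
        if key in self.counters:
            # Perspective already stored: increment its counter.
            self.counters[key] += 1
        else:
            # Perspective not yet stored: add it with a counter of 1.
            self.counters[key] = 1
        return self.counters[key]
```

Calling `record` once per received request keeps a running tally from which frequently requested view perspectives can later be identified.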

5. The method, wherein:
encoding the second portion of the 3D video comprises using at least one first quality of service (QoS) parameter in a first pass encoding operation; and
encoding the first portion of the 3D video comprises using at least one second quality of service (QoS) parameter in a second pass encoding operation.

6. The method of claim 1, wherein determining the at least one preferred view perspective associated with the 3D video is based on at least one of a previously viewed reference point and a previously viewed view perspective.

7. The method of claim 1, wherein the at least one preferred view perspective associated with the 3D video is based on at least one of an orientation of a viewer of the 3D video, a location of the viewer of the 3D video, a point of the viewer of the 3D video, and the viewer of the 3D video.

8. The method of claim 1, wherein determining the at least one preferred view perspective associated with the 3D video is based on a default view perspective, and the default view perspective is based on at least one of:
a characteristic of a user of a display device;
a characteristic of a group associated with the user of the display device;
a director's cut; and
a characteristic of the 3D video.

9. The method of claim 1, further comprising:
iteratively encoding at least a portion of the second portion of the 3D video with the first quality; and
streaming the at least a portion of the second portion of the 3D video.

10. A streaming server comprising:
a controller configured to determine at least one preferred view perspective associated with a three-dimensional (3D) video; and
an encoder configured to:
encode a first portion of the 3D video corresponding to the at least one preferred view perspective with a first quality; and
encode a second portion of the 3D video with a second quality, wherein the first quality is higher than the second quality.

11. The streaming server of claim 10, wherein the controller is further configured to cause:
storing the first portion of the 3D video in a data store;
storing the second portion of the 3D video in the data store;
receiving a request for streaming video; and
streaming the first portion of the 3D video and the second portion of the 3D video from the data store as the streaming video.

12. The streaming server of claim 10, wherein the controller is further configured to cause:
receiving a request for streaming video, the request including an indication of a user view perspective;
selecting 3D video corresponding to the user view perspective as the encoded first portion of the 3D video; and
streaming the selected first portion of the 3D video and the second portion of the 3D video as the streaming video.

13. The streaming server of claim 10, wherein the controller is further configured to cause:
receiving a request for streaming video, the request including an indication of a user view perspective associated with the 3D video;
determining whether the user view perspective is stored in a view perspective data store;
upon determining that the user view perspective is stored in the view perspective data store, incrementing a counter associated with the user view perspective; and
upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the counter associated with the user view perspective to 1.

14. The streaming server, wherein:
encoding the second portion of the 3D video comprises using at least one first quality of service (QoS) parameter in a first pass encoding operation; and
encoding the first portion of the 3D video comprises using at least one second quality of service (QoS) parameter in a second pass encoding operation.

15. The streaming server of claim 10, wherein determining the at least one preferred view perspective associated with the 3D video is based on at least one of a previously viewed reference point and a previously viewed view perspective.

16. The streaming server of claim 10, wherein the at least one preferred view perspective associated with the 3D video is based on at least one of an orientation of a viewer of the 3D video, a location of the viewer of the 3D video, a point of the viewer of the 3D video, and the viewer of the 3D video.

17. The streaming server of claim 10, wherein determining the at least one preferred view perspective associated with the 3D video is based on a default view perspective, and the default view perspective is based on at least one of:
a characteristic of a user of a display device;
a characteristic of a group associated with the user of the display device;
a director's cut; and
a characteristic of the 3D video.

18. The streaming server of claim 10, wherein the controller is further configured to cause:
iteratively encoding at least a portion of the second portion of the 3D video with the first quality; and
streaming the at least a portion of the second portion of the 3D video.

19. A method comprising:
receiving a request for streaming video, the request including an indication of a user view perspective associated with a three-dimensional (3D) video;
determining whether the user view perspective is stored in a view perspective data store;
upon determining that the user view perspective is stored in the view perspective data store, incrementing a ranking value associated with the user view perspective; and
upon determining that the user view perspective is not stored in the view perspective data store, adding the user view perspective to the view perspective data store and setting the ranking value associated with the user view perspective to 1.

20. The method of claim 19, further comprising:
determining at least one preferred view perspective associated with the 3D video based on the ranking value associated with the stored user view perspective;
encoding a first portion of the 3D video corresponding to the at least one preferred view perspective with a first quality; and
encoding a second portion of the 3D video with a second quality, wherein the first quality is higher than the second quality.
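
By way of non-limiting illustration, determining a preferred view perspective from stored ranking values might be sketched as follows. The sketch assumes that the ranking values are available as a mapping from a view perspective identifier to an integer, and that the preferred view perspectives are simply those with the highest values. The function name `preferred_views` and the `top_k` parameter are hypothetical and are not taken from the disclosure.

```python
def preferred_views(ranking, top_k=1):
    """Return up to top_k view perspectives, highest ranking value first.

    ranking maps a view perspective identifier to its ranking value.
    Entries with equal values keep their insertion order, because
    Python's sort is stable even when reverse=True.
    """
    ordered = sorted(ranking.items(), key=lambda item: item[1], reverse=True)
    return [view for view, _ in ordered[:top_k]]
```

The returned view perspectives would then identify the first portion of the 3D video to be encoded with the first quality, with the remainder encoded with the second quality.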