JP2022035607A

JP2022035607A - Presentation system, server, second terminal and program

Info

Publication number: JP2022035607A
Application number: JP2020140059A
Authority: JP
Inventors: 晴久加藤; Haruhisa Kato
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2020-08-21
Filing date: 2020-08-21
Publication date: 2022-03-04
Anticipated expiration: 2040-08-21
Also published as: JP7370305B2

Abstract

To provide a presentation system capable of efficiently performing avatar drawing. SOLUTION: A presentation system includes: a first recognition part 11 for acquiring first recognition information by recognizing a first user state; a second positioning part 22 for acquiring second positioning information by positioning a position attitude of a second user; a third drawing part 33 for acquiring third drawing information obtained by drawing an avatar of the first user, by reflecting the first recognition information at a virtual camera viewpoint arranged at the second positioning information; a fourth drawing part 34 for acquiring fourth drawing information obtained by drawing the avatar in quality higher than in a drawing mode of the third drawing part; an extraction part 35 for extracting difference between the third drawing information and the fourth drawing information as second extraction information; a second drawing part 26 for acquiring second drawing information obtained by drawing the avatar in the same quality as in the drawing mode of the third drawing part; a second integration part 27 for acquiring second integration information being the same avatar obtained by simulating the fourth drawing information, by reflecting the second extraction information on the second drawing information; and a second presentation part 28 for displaying the second integration information to the second user. SELECTED DRAWING: Figure 2

Description

The present invention relates to a presentation system, a server, a terminal, and a program that draw an avatar and can be used for remote communication and the like.

Regarding video communication technology between remote locations that can be used for remote communication and the like, Non-Patent Document 1 discloses an approach to 3D (three-dimensional) video transmission in which point cloud information of a user, measured by a depth sensor, is transmitted to the communication partner and drawn on the partner's device. Patent Document 1 discloses a method in which a plurality of videos captured by a terminal are transmitted to a server, skeleton information estimated by the server is transmitted to the communication partner, and the terminal then applies the skeleton information to an avatar.

Japanese Unexamined Patent Publication No. 2020-65229

Ben Cutler and 2 others, "holoportation", [online], September 4, 2018 [retrieved July 17, 2020], Internet <URL: https://www.microsoft.com/en-us/research/project/holoportation-3>

However, the prior art has not been able to realize high-quality information presentation in situations where the computational resources of the terminal used on the user side, the communication bandwidth, and the like are constrained and cannot be used abundantly.

Non-Patent Document 1 transmits a huge amount of point cloud information, so it cannot be realized when the communication bandwidth is narrow. In addition, acquiring the user's point cloud information from all directions requires multiple depth sensors, which makes the apparatus large. Furthermore, because the resolution of the depth sensor is limited, the user is not sufficiently separated from the background, so background points are mixed into the user's point cloud and the quality is low. Moreover, depending on the user's posture, blind spots occur and the point cloud of those portions cannot be acquired.

Patent Document 1 transmits a plurality of videos from different viewpoints from the terminal to the server, so it cannot be realized when the communication bandwidth is narrow. In addition, because the avatar is drawn on a terminal with limited computational resources, real-time drawing yields lower quality than drawing on a server with abundant computational resources.

In view of the above problems of the prior art, an object of the present invention is to provide a presentation system, a server, a second terminal, and a program capable of drawing an avatar efficiently.

To achieve the above object, the present invention is a presentation system comprising: a first recognition unit that recognizes a state related to communication of a first user to obtain first recognition information; a second positioning unit that measures the position and orientation of a second user to obtain second positioning information; a third drawing unit that obtains third drawing information by drawing an avatar of the first user, reflecting the first recognition information, from a virtual camera viewpoint placed at the second positioning information; a fourth drawing unit that obtains fourth drawing information by drawing the avatar of the first user, reflecting the first recognition information, from the virtual camera viewpoint placed at the second positioning information, at higher quality than the drawing mode of the third drawing unit; an extraction unit that extracts the difference between the third drawing information and the fourth drawing information as second extraction information; a second drawing unit that obtains second drawing information by drawing the avatar of the first user, reflecting the first recognition information, from the virtual camera viewpoint placed at the second positioning information, at the same quality as the drawing mode of the third drawing unit; a second integration unit that obtains second integrated information, which is the avatar of the first user approximating the fourth drawing information, by reflecting the second extraction information in the second drawing information; and a second presentation unit that displays the second integrated information to the second user. The present invention is also a server comprising the third drawing unit, the fourth drawing unit, and the extraction unit. The present invention is also a second terminal comprising the second positioning unit, the second drawing unit, the second integration unit, and the second presentation unit. The present invention is further a program that causes a computer to function as the server or the second terminal.

According to the present invention, the difference between two renderings of the same avatar drawn at different qualities is extracted in the form of second extraction information, so the second extraction information can be transmitted to the second user side at high speed even over a line with narrow communication bandwidth. The second user side can then obtain second integrated information, a drawing result of high quality equivalent to that of the fourth drawing unit, without itself performing the high-quality drawing of the fourth drawing unit, and display it to the second user as the avatar of the first user. The avatar can therefore be drawn efficiently.

FIG. 1 is a configuration diagram of a presentation system according to an embodiment.

FIG. 2 is a functional block diagram of the presentation system according to an embodiment for the case N = 2.

FIG. 3 is a sequence diagram of the operation of the presentation system according to an embodiment.

FIG. 4 is a diagram showing a schematic example of real-time remote communication using avatars.

FIG. 5 is a diagram showing, as a schematic example of the first recognition information, the distribution of feature points for a facial expression extracted from a face image in the case of facial expression recognition.

FIG. 6 is a diagram showing a schematic example in which the quantization error is minimized with the quantization step qi limited to 1 or 4.

FIG. 7 is a diagram showing schematic examples of each piece of drawing information and of the second extraction information.

FIG. 8 is a functional block diagram of the presentation system according to an embodiment in which avatar display processing is performed bidirectionally.

FIG. 9 is a diagram showing an example of the hardware configuration of a general computer.

FIG. 1 is a configuration diagram of a presentation system 100 according to an embodiment. The presentation system 100 includes N terminals 10, 20, ..., N0 (N ≥ 2) and a server 30, which are configured to communicate with one another via a network NW such as the Internet. The users of the terminals 10, 20, ..., N0 are users U1, U2, ..., UN, respectively. By using their own terminals 10, 20, ..., N0 (for example, smartphone terminals or head-mounted display terminals), these N users U1, U2, ..., UN can, while each remaining at a remote location, carry out remote communication through the presentation system 100 using avatars (the avatars of the users on the other side of the communication).

In the following, for the sake of explanation, N = 2 is assumed, and the case in which the presentation system 100 realizes remote communication via the server 30 between a first user U1 using a first terminal 10 and a second user U2 using a second terminal 20 is taken as an example. When N ≥ 3, remote communication among the N users can be carried out in exactly the same way by realizing remote communication between any two of the N users in the same manner as for the two users U1 and U2.

FIG. 2 is a functional block diagram of the presentation system 100 according to an embodiment for the case N = 2. The presentation system 100 includes the first terminal 10 used by the first user U1, the second terminal 20 used by the second user U2, and the server 30. The first terminal 10 and the second terminal 20 acquire, from the users U1 and U2 respectively, the information needed for remote communication and transmit the acquired information toward the other party's terminal. The server 30 relays the transmitted information to the other party's terminal, and when relaying it, performs predetermined drawing processing and the like using the transmitted information before sending the result on. This makes it possible to realize high-quality remote communication between the users U1 and U2 even when the first terminal 10 and the second terminal 20 are constrained in computational resources and the like.

As shown in FIG. 2, the first terminal 10 includes a first recognition unit 11 and a first positioning unit 12; the second terminal 20 includes a second positioning unit 22, a second drawing unit 26, a second integration unit 27, and a second presentation unit 28; and the server 30 includes a third drawing unit 33, a fourth drawing unit 34, and an extraction unit 35.

Note that FIG. 2 shows the third drawing unit 33 and the fourth drawing unit 34 of the server 30 together as a functional unit 31. This grouping represents the following exchange of information: the first recognition information and the first positioning information acquired by the first recognition unit 11 and the first positioning unit 12 of the first terminal 10, respectively, and the second positioning information acquired by the second positioning unit 22 of the second terminal 20 are transmitted to the server 30 and used by both the third drawing unit 33 and the fourth drawing unit 34.

FIG. 3 is a sequence diagram of the operation of the presentation system 100 according to an embodiment. The entire operation of FIG. 3 is performed at each time t = 1, 2, 3, ... of a predetermined processing rate, which allows the presentation system 100 to provide real-time remote communication using avatars between the first user U1 using the first terminal 10 and the second user U2 using the second terminal 20.

As also shown in FIGS. 2 and 3, the real-time processing at each time t is outlined below. (Only an outline is given first, in terms of the processing of each functional unit and the flow of information exchanged between the functional units; the individual processing of each functional unit is described in detail later.)

In the first terminal 10, the first recognition unit 11 recognizes the facial expression and the like of the user U1 to obtain first recognition information R1(t) at time t, and transmits the first recognition information R1(t) to the third drawing unit 33 and the fourth drawing unit 34 of the server 30 (steps S111 and S112). Also in the first terminal 10, the first positioning unit 12 measures the position and orientation of the first user U1 at time t to obtain first positioning information P1(t), and transmits the first positioning information P1(t) to the third drawing unit 33 and the fourth drawing unit 34 of the server 30 (steps S121 and S122).

In the second terminal 20, the second positioning unit 22 measures the position and orientation of the second user U2 at time t to obtain second positioning information P2(t), transmits the second positioning information P2(t) to the third drawing unit 33 and the fourth drawing unit 34 of the server 30 (steps S221 and S222), and also outputs it to the second drawing unit 26 within the second terminal 20 (step S223).

The third drawing unit 33 of the server 30 draws the avatar of the first user U1 at time t at standard quality, in the position and orientation determined by the first positioning information P1(t) and the second positioning information P2(t), to obtain third drawing information G3(t), and outputs the third drawing information G3(t) to the extraction unit 35 (step S331). The fourth drawing unit 34 draws the avatar of the first user U1 at time t at high quality, in the position and orientation determined by the first positioning information P1(t) and the second positioning information P2(t), to obtain fourth drawing information G4(t), and outputs the fourth drawing information G4(t) to the extraction unit 35 (step S341).

Here, the fourth drawing unit 34 draws at higher quality than the third drawing unit 33. The drawing quality of the third drawing unit 33 is the same as that of the second drawing unit 26 of the second terminal 20, and the third drawing unit 33 and the second drawing unit 26 perform identical drawing. The second drawing unit 26, the third drawing unit 33, and the fourth drawing unit 34 each draw the avatar of the first user U1 at time t, at their respective drawing qualities, in the common position and orientation determined by the first positioning information P1(t) and the second positioning information P2(t).

In the server 30, the extraction unit 35 further extracts the difference between the third drawing information G3(t) and the fourth drawing information G4(t) as second extraction information E2(t) at time t, and transmits the second extraction information E2(t) to the second integration unit 27 of the second terminal 20 (step S352).
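As a minimal sketch of this extraction step, the difference can be illustrated on tiny grayscale images. The function name `extract_difference`, the list-of-rows image representation, and the sample pixel values are assumptions made for illustration; the text here does not specify a pixel format.

```python
from typing import List

Image = List[List[int]]  # hypothetical representation: rows of 0..255 gray levels


def extract_difference(g3: Image, g4: Image) -> Image:
    """Return E2 = G4 - G3, computed pixel by pixel (values may be negative)."""
    if len(g3) != len(g4) or any(len(a) != len(b) for a, b in zip(g3, g4)):
        raise ValueError("G3 and G4 must have the same dimensions")
    return [[p4 - p3 for p3, p4 in zip(row3, row4)] for row3, row4 in zip(g3, g4)]


# Illustrative values only: a standard-quality and a high-quality rendering
# of the same avatar from the same virtual camera viewpoint.
g3 = [[100, 100, 100],
      [100, 120, 100]]
g4 = [[100, 104, 100],
      [98, 120, 101]]

e2 = extract_difference(g3, g4)
print(e2)  # [[0, 4, 0], [-2, 0, 1]]
```

Pixels where the two renderings agree yield zero, so only the places where the high-quality rendering adds detail carry nonzero values.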

Here, what is transmitted from the server 30 to the second terminal 20 is not the third drawing information G3(t) or the fourth drawing information G4(t), which are direct drawing results of the avatar, but the second extraction information E2(t), their difference, which carries a reduced amount of information. This makes it possible to keep the communication bandwidth of the network NW from being strained.
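The bandwidth argument can be illustrated, under strong simplifying assumptions, by compressing a rendering and a difference image with a general-purpose codec. Everything in the sketch below is synthetic: the noisy texture standing in for G3, the small patch of added detail standing in for G4, and the use of `zlib` as the codec are assumptions for illustration, not the encoding used by the system. Actual savings depend on the image content and the codec.

```python
import random
import zlib

WIDTH, HEIGHT = 64, 64
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic stand-in for G3: a noisy 8-bit texture (not real avatar data).
g3 = [rng.randrange(256) for _ in range(WIDTH * HEIGHT)]

# Synthetic stand-in for G4: identical to G3 except for a small 8x8 patch
# where the high-quality renderer is assumed to add fine detail.
g4 = list(g3)
for y in range(8):
    for x in range(8):
        i = y * WIDTH + x
        g4[i] = min(255, g4[i] + 3)

# E2 = G4 - G3, stored modulo 256 so each value fits in one byte.
e2 = [(p4 - p3) % 256 for p3, p4 in zip(g3, g4)]

size_g4 = len(zlib.compress(bytes(g4)))
size_e2 = len(zlib.compress(bytes(e2)))
print(f"compressed G4: {size_g4} bytes, compressed E2: {size_e2} bytes")
```

Because the difference image is zero almost everywhere in this synthetic case, it compresses to far fewer bytes than the full rendering.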

The server 30 also relays information from the first terminal 10 to the second terminal 20: it transmits the first recognition information R1(t) and the first positioning information P1(t) of the first user at time t, obtained from the first recognition unit 11 and the first positioning unit 12 respectively, unchanged to the second drawing unit 26 of the second terminal 20 (step S351). (For convenience, FIGS. 2 and 3 depict the extraction unit 35 as the relay source of the first recognition information R1(t) and the first positioning information P1(t), but the extraction unit 35 does not perform any further processing on them.)

The second drawing unit 26 of the second terminal 20 draws the avatar of the first user U1 at time t at standard quality, in the position and orientation determined by the first positioning information P1(t) and the second positioning information P2(t), to obtain second drawing information G2(t), and outputs the second drawing information G2(t) to the second integration unit 27 (step S261).

As already described, the second drawing unit 26 of the second terminal 20 draws the avatar of the first user U1 at time t at the same quality as the third drawing unit 33 of the server 30, in the common position and orientation determined by the first positioning information P1(t) and the second positioning information P2(t). That is, the second drawing information G2(t) obtained by the second drawing unit 26 is identical to the third drawing information G3(t) obtained by the third drawing unit 33.

The second integration unit 27 adds the second extraction information E2(t) obtained from the extraction unit 35 of the server 30 to the second drawing information G2(t) to obtain second integrated information G2S(t) at time t, and outputs the second integrated information G2S(t) to the second presentation unit 28 (step S271). The second presentation unit 28 is configured as a display and displays the second integrated information G2S(t) to the second user U2.

Here, the second extraction information E2(t) is obtained in the server 30 as the difference between the fourth drawing information G4(t), drawn at high quality, and the third drawing information G3(t), drawn at standard quality (E2(t) = G4(t) - G3(t)), and the second drawing information G2(t), which is identical to the third drawing information G3(t), is obtained by the second drawing unit 26 of the second terminal 20. The second integrated information obtained by addition in the second integration unit 27 is therefore G2S(t) = G2(t) + E2(t) = G3(t) + (G4(t) - G3(t)) = G4(t), identical to the fourth drawing information G4(t) drawn at high quality in the server 30. The avatar of the first user U1 as drawn at high quality by the fourth drawing unit 34 of the server 30 can thus be restored by addition as the second integrated information G2S(t) and presented to the second user U2, without the second terminal 20 itself drawing it directly.
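The lossless round trip described above can be sketched as follows. The function name `integrate`, the list-of-rows image representation, and the pixel values are hypothetical; the sketch assumes, as the text states, that the terminal's rendering G2 is identical to the server's G3.

```python
from typing import List

Image = List[List[int]]  # hypothetical representation: rows of gray levels


def integrate(g2: Image, e2: Image) -> Image:
    """Return G2S = G2 + E2, computed pixel by pixel."""
    return [[p + d for p, d in zip(row, drow)] for row, drow in zip(g2, e2)]


# Server side (illustrative values): standard- and high-quality renderings.
g3 = [[100, 100], [100, 120]]
g4 = [[100, 104], [98, 120]]
e2 = [[b - a for a, b in zip(r3, r4)] for r3, r4 in zip(g3, g4)]

# Terminal side: the same standard-quality rendering, so G2 equals G3.
g2 = [row[:] for row in g3]
g2s = integrate(g2, e2)

print(g2s == g4)  # True: G2 + (G4 - G3) = G4 when G2 = G3
```

The equality relies entirely on G2 and G3 matching pixel for pixel; any divergence between the terminal and server renderers would appear directly as an error in G2S.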

As described later, lossy compression by quantization may additionally be applied to the second extraction information E2(t) after it is obtained as the difference E2(t) = G4(t) - G3(t). In that case, the second integrated information G2S(t) obtained by addition in the second integration unit 27 is not exactly identical to the fourth drawing information G4(t) drawn at high quality in the server 30, but it is obtained as the closest possible approximation of it.
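The effect of quantizing the difference can be sketched with a single uniform step. This is an assumption made for illustration: the uniform step `q`, the function names, and the pixel values are hypothetical, and the quantization scheme actually used is the one described later in the document. With a uniform step q, each reconstructed pixel differs from G4 by at most q/2.

```python
from typing import List


def quantize(diff: List[int], q: int) -> List[int]:
    """Map each difference to the index of its nearest multiple of q."""
    return [round(d / q) for d in diff]


def reconstruct(g2: List[int], indices: List[int], q: int) -> List[int]:
    """Add the dequantized difference to G2 and clamp to the 8-bit range."""
    return [max(0, min(255, p + k * q)) for p, k in zip(g2, indices)]


q = 4  # hypothetical quantization step
g3 = [100, 100, 100, 120, 250]  # standard quality (server side)
g4 = [100, 105, 97, 126, 255]   # high quality (server side)
g2 = list(g3)                   # terminal rendering, identical to G3

e2 = [b - a for a, b in zip(g3, g4)]  # exact difference
indices = quantize(e2, q)             # small integers sent to the terminal
g2s = reconstruct(g2, indices, q)     # approximation of G4

errors = [abs(a - b) for a, b in zip(g2s, g4)]
print(indices, g2s, max(errors))
```

A larger step shrinks the range of values to transmit at the cost of a larger reconstruction error, which is the trade-off the lossy variant accepts.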

FIG. 4 is a diagram showing a schematic example of real-time remote communication using avatars, as realized by the configuration of FIGS. 2 and 3 described above. As shown in example EX11, the first user U1 using the first terminal 10 is in an environment E1 (a room or the like), and the second user U2 using the second terminal 20 is in an environment E2 (a room or the like) remote from it. For remote environments E1 and E2 such as those of example EX11, example EX12 shows avatar communication through augmented reality display realized by the presentation system 100. For the first user U1 using the first terminal 10, the avatar A2 of the second user U2, the communication partner, is displayed in a virtual space V1 formed by adding an augmented reality display (a superimposed display of the avatar A2) to the environment E1; by communicating with the avatar A2, the first user U1 can communicate with the remotely located second user U2. Likewise, for the second user U2 using the second terminal 20, the avatar A1 of the first user U1, the communication partner, is displayed in a virtual space V2 formed by adding an augmented reality display (a superimposed display of the avatar A1) to the environment E2; by communicating with the avatar A1, the second user U2 can communicate with the remotely located first user U1.

The configuration of FIGS. 2 and 3 described above realizes the right-hand side of example EX12 in FIG. 4. (That is, the second integrated information G2S(t) is the avatar A1 of the first user U1, provided to the second user U2 as an augmented reality display.) By swapping the roles of the first terminal 10 and the second terminal 20, the left-hand side of example EX12 in FIG. 4 can be realized in exactly the same way.

Here, the avatar A1 is drawn in real time, in the form of the second integrated information G2S(t), reflecting the first positioning information P1(t), which is the position and orientation of the first user U1, and the first recognition information R1(t), which is the facial expression and the like. The real-time behavior of the first user U1 is therefore reflected directly in the avatar, which is displayed to the second user U2 in augmented reality in the virtual space V2. That is, when the first user U1 changes position and orientation, the avatar A1 follows and its three-dimensional position and orientation in the virtual space V2 change; when the first user U1 changes facial expression or the like, the avatar A1 follows and its expression changes. (The converse relationship, between the avatar A2 in the virtual space V1 and the first user U1, is exactly the same.)

Furthermore, the avatar A1 is drawn in real time, in the form of the second integrated information G2S(t), as seen from the second positioning information P2(t), which is the position and orientation of the viewpoint of the second user U2 (that is, the position and orientation of the second positioning information P2(t) are used as the position and orientation of the virtual camera for drawing onto the image plane). The second user U2 can therefore move and view the avatar A1 from, for example, its side or from behind.
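How a virtual camera placed at the viewer's pose changes what is drawn can be sketched with a simple pinhole model. The function names, the yaw-only rotation, the focal length, and the coordinates below are all assumptions for illustration; the document does not specify the camera model or the renderer.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def world_to_camera(point: Vec3, cam_pos: Vec3, yaw: float) -> Vec3:
    """Express a world-space point in the frame of a camera at cam_pos.

    The camera looks along its own +z axis; yaw rotates it about the
    vertical (y) axis. Pitch and roll are omitted to keep the sketch short.
    """
    dx, dy, dz = (p - o for p, o in zip(point, cam_pos))
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    # Apply the inverse (transpose) of the camera's yaw rotation.
    return (cos_y * dx - sin_y * dz, dy, sin_y * dx + cos_y * dz)


def project(point_cam: Vec3, focal: float, cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-space point onto the image plane."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point is behind the virtual camera")
    return (focal * x / z + cx, focal * y / z + cy)


avatar_point = (0.5, 0.0, 0.0)  # a point on the avatar, in world coordinates

# Viewer in front of the avatar, then after walking around to its side.
front = world_to_camera(avatar_point, cam_pos=(0.0, 0.0, -2.0), yaw=0.0)
side = world_to_camera(avatar_point, cam_pos=(-2.0, 0.0, 0.0), yaw=math.pi / 2)

print(project(front, focal=500.0, cx=320.0, cy=240.0))
print(project(side, focal=500.0, cx=320.0, cy=240.0))
```

The same avatar point lands at different image positions for the two viewer poses, which is why the rendering must be recomputed from P2(t) at every time step.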

The avatar A1 drawn as the second integrated information G2S(t) at each time t is, for a fixed time t, drawn as a two-dimensional region on the image plane. However, because it is a two-dimensional drawing result of a three-dimensional avatar model, it is displayed as a three-dimensional shape that changes according to the movement of the users U1 and U2 as time t advances.

The bidirectional avatar communication shown in example EX12 of FIG. 4 is realized by the configuration of FIG. 8, described later. The configuration of FIG. 8 is simply the configuration of FIGS. 2 and 3 rewritten to be bidirectional: it corresponds to the configuration of FIGS. 2 and 3 with the addition of a configuration in which the roles of the first terminal 10 and the second terminal 20 are swapped.

The following describes in detail each functional unit of FIG. 2, which operates in real time at each time t = 1, 2, 3, ... as shown in FIG. 3.

The first recognition unit 11 recognizes, as information on a state related to the communication of the first user U1, for example a posture (pose) and/or a facial expression, and obtains the first recognition information R1(t) at time t. FIG. 5 is a diagram showing, as a schematic example of the first recognition information, the distribution of landmark coordinates for a facial expression extracted from a face image in the case of facial expression recognition. For facial expression recognition, existing techniques that estimate the coordinates of facial landmarks, such as that of Non-Patent Document 2 below, can be used. For posture recognition, existing techniques that estimate skeleton information can be used, such as tracking using sensors worn on parts of the body or image recognition using a camera.
[Non-Patent Document 2] A. Bulat et al., "How far are we from solving the 2D & 3D Face Alignment problem?," International Conference on Computer Vision, 2017

The first positioning unit 12 and the second positioning unit 22 obtain the first positioning information P1(t) and the second positioning information P2(t) at time t as the position and orientation of the first user U1 and the second user U2, respectively. The processing of the two units is the same, and any existing method for measuring position and orientation (information corresponding to the extrinsic parameters of a camera) can be used, such as Visual SLAM (image-based simultaneous localization and mapping) or a 6DOF (six degrees of freedom) sensor. If depth information on the environment in which the user is present can be obtained during positioning, it can be included as part of the positioning information.

When the first recognition unit 11, the first positioning unit 12 and the second positioning unit 22 are each realized by any existing method as described above, and a camera, a dedicated sensor or the like is used as hardware for imaging or capturing the first user U1 and the second user U2, that hardware may be built into the first terminal 10 or the second terminal 20 (for example, the built-in camera of a mobile terminal such as a smartphone when each terminal is such a mobile terminal), or it may be installed in the environments E1 and E2 in which the first user U1 and the second user U2 are present.

In the server 30, the third drawing unit 33 and the fourth drawing unit 34 both place the avatar A1 of the first user U1 at the three-dimensional coordinates determined by the first positioning information P1(t) and the second positioning information P2(t) (the three-dimensional camera coordinates of the virtual space V2 of the second user U2) and draw this avatar A1 reflecting the first recognition information R1(t), thereby obtaining the third drawing information G3(t) and the fourth drawing information G4(t), respectively. As already described, G3(t) and G4(t) are renderings of the same three-dimensional avatar in the same state and the same placement and differ only in drawing quality. The drawing quality is distinguished by rendering settings in three-dimensional computer graphics, such as the choice of light source model and surface reflection model.

For the avatar A1 of the first user U1 to be drawn, a predetermined three-dimensional model may be prepared in advance that is configured so that the first recognition information R1(t) can be reflected in the drawing as parameters (parameters that determine the facial expression and pose). Any existing three-dimensional computer graphics technique may be used for drawing according to the facial expression and pose.

The second positioning information P2(t) represents the position and orientation of the viewpoint of the second user U2, who views the avatar A1 of the first user U1, and is acquired as a position and orientation (extrinsic parameters of a camera) in the three-dimensional world coordinates of the environment E2 in which the second user U2 is present. A predetermined transformation T (translation and rotation) is applied to the first positioning information P1(t) of the first user U1, who is the subject drawn as the avatar A1, and the resulting first positioning information T・P1(t), now expressed in the three-dimensional world coordinates of the environment E2, is used as the position and orientation of the avatar A1. The avatar A1 is then drawn from the virtual camera position determined by the second positioning information P2(t).

For example, the position (translational component) of the transformed first positioning information T・P1(t) may be used as the position of the avatar's face and its orientation (rotational component) as the direction in which the avatar's face points.
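As a non-limiting illustration, the placement described above can be sketched as follows. This is a minimal sketch under assumptions not stated in the description: each pose is represented as a 4x4 rigid camera-to-world matrix, rotation is limited to yaw about the y axis for brevity, and the helper names (`pose`, `rigid_inverse`, `avatar_in_camera`) and all numeric values are hypothetical.

```python
from math import cos, sin

def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def pose(yaw, tx, ty, tz):
    """Rigid pose: rotation by `yaw` (radians) about the y axis plus a translation."""
    c, s = cos(yaw), sin(yaw)
    return [[c, 0.0, s, tx],
            [0.0, 1.0, 0.0, ty],
            [-s, 0.0, c, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rigid_inverse(m):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    r_t = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(r_t[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [r_t[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def avatar_in_camera(T, P1, P2):
    """Pose of avatar A1 in the virtual camera frame of user U2: inv(P2) * T * P1."""
    return matmul(rigid_inverse(P2), matmul(T, P1))

# Hypothetical values: T shifts U1's pose by 1 along x into environment E2,
# and U2's viewpoint sits at (1, 0, -2) in E2 with no rotation.
T = pose(0.0, 1.0, 0.0, 0.0)
P1 = pose(0.0, 0.0, 0.0, 0.0)
P2 = pose(0.0, 1.0, 0.0, -2.0)
A1_cam = avatar_in_camera(T, P1, P2)  # translation column is (0, 0, 2)
```

In this sketch the translation column of `A1_cam` would serve as the position of the avatar's face and its rotation block as the direction of the face, both relative to the virtual camera.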

In the server 30, the extraction unit 35 obtains the second extraction information E2(t) = G4(t) - G3(t) as the difference between the third drawing information G3(t) and the fourth drawing information G4(t) (both of which are obtained with the avatar A1 drawn, as a mask image, only in the same region of the image plane).

Here, to reduce the amount of data transmitted as the second extraction information, the second extraction information E2(t) may instead be obtained as the difference between the fourth drawing information G4(t) and "a・G3(t)+b", a linear transformation of the third drawing information G3(t), as shown below. The coefficients a and b of the linear transformation may be obtained by the least squares method at each time t and transmitted to the second drawing unit 26 of the second terminal 20 as information accompanying the second extraction information E2(t). Alternatively, with a = 1, the value of b may be chosen so that the mean pixel value of "G3(t)+b", obtained by adding b uniformly to every pixel of G3(t), matches the mean pixel value of G4(t).
E2(t) = G4(t) - a・G3(t) - b
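The coefficient estimation described above can be sketched as follows. This is only an illustrative sketch: it assumes grayscale images flattened to lists of pixel values taken from the avatar mask region, uses the closed-form solution for a one-variable least squares fit, and the function names and pixel values are hypothetical.

```python
def fit_linear(g3, g4):
    """Least-squares a, b minimizing sum((g4 - (a*g3 + b))**2) over the pixels."""
    n = len(g3)
    mean3 = sum(g3) / n
    mean4 = sum(g4) / n
    var3 = sum((x - mean3) ** 2 for x in g3)
    if var3 == 0:  # flat G3: fall back to a = 1 and a pure offset
        return 1.0, mean4 - mean3
    cov = sum((x - mean3) * (y - mean4) for x, y in zip(g3, g4))
    a = cov / var3
    return a, mean4 - a * mean3

def offset_only(g3, g4):
    """Variant with a = 1: b makes the mean of G3 + b equal the mean of G4."""
    return sum(g4) / len(g4) - sum(g3) / len(g3)

def residual(g3, g4, a, b):
    """Second extraction information E2 = G4 - a*G3 - b, per pixel."""
    return [y - (a * x + b) for x, y in zip(g3, g4)]

# Hypothetical pixel values in which G4 is exactly 1.2 * G3 + 20.
g3 = [0, 50, 100, 150]
g4 = [20, 80, 140, 200]
a, b = fit_linear(g3, g4)
e2_plain = [y - x for x, y in zip(g3, g4)]  # E2 = G4 - G3
e2_fitted = residual(g3, g4, a, b)          # E2 = G4 - a*G3 - b
```

With these values the plain difference ranges from 20 to 50, whereas the fitted residual is essentially zero, which illustrates why the linear transformation can reduce the amount of difference data to be transmitted.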

When the difference between the third drawing information G3(t) and the fourth drawing information G4(t) is reduced by a linear transformation with the coefficients a and b as above, the linear transformation may be applied to the fourth drawing information G4(t), as "a・G4(t)+b", instead of to the third drawing information G3(t), and the second extraction information E2(t) may be obtained as the difference as follows.
E2(t) = a・G4(t) + b - G3(t)

The coefficients a, b and the like may be obtained as values common to the whole mask image, that is, the partial region of the image plane occupied by the third drawing information G3(t) and the fourth drawing information G4(t), or the whole mask image may be divided into a plurality of block regions and the coefficients obtained separately for each block region.

When the original images, the third drawing information G3(t) and the fourth drawing information G4(t), are composed of, for example, 8-bit pixel values in the range 0 to 255, the second extraction information E2(t) obtained as their difference image can in general take pixel values in the range -255 to +255, which is wider than the original bit depth can represent. To quantize the difference values so that they fit in the original color depth of B bpp (bits per pixel; B is, for example, 8), the extraction unit 35 may quantize the difference values so as to suppress quantization error and transmit the quantized second extraction information E2(t)_[quantized] to the second integration unit 27. Specifically, as in the following equation, the quantization may be performed by finding, for the histogram values Pi of the difference values (the frequency Pi being the number of pixels having that difference value), the quantization steps qi that minimize the quantization error (by any existing method such as a greedy algorithm).

In the above equation, int() is the integer conversion function and N is the number of bins of the histogram. To speed up the computation of the solution by reducing its degrees of freedom, a constraint limiting the quantization steps may additionally be imposed. The quantization step information may be transmitted to the second integration unit 27 as information accompanying the quantized second extraction information E2(t)_[quantized].

FIG. 6 shows a schematic example (B = 8 bits) in which the candidate values of the quantization step qi are limited to a predetermined combination, for example 1 or 4 (q1 = 1, q2 = 4), and the quantization error is minimized. In the histogram shown in the upper part, the difference values are distributed over the range min to max rather than over the whole possible range of -255 to +255. The range indicated by the horizontal double-headed arrow in the upper part is the range of the top q1*{2^B*q2-(max-min)}/(q2-q1) difference values by histogram frequency (the upper range). The lower part shows the result of quantizing with minimized quantization error as a graph of the correspondence between the 8-bit quantized values and the difference values (range -255 to +255): within the upper range the quantization step is fine, at 1, and outside the upper range it is coarse, at 4. (The significance of this count is as follows. To map pixel values that do not fit in the 8-bit range 0 to 255 (values lying between min and max) onto the 8-bit range 0 to 255, A pixel values are quantized with q1 = 1 and the remaining 255-A pixel values with q2 = 4, and minimizing the error amounts to maximizing A. Considering counts only, and supposing that the histogram is monotonically decreasing and that pixel values from 0 to max-min are mapped to 0 to 255, the intersection of y=(1/q1)*x and y=(1/q2)*x+255-(max-min)/q2 gives the maximum value of A.)
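The intersection mentioned in the parenthetical note can be worked out numerically as in the sketch below. It is an illustration only, under stated assumptions: the function names are hypothetical, `span` stands for max-min, and `top_code` is set to 255, the intercept used in the second line equation. The expression quoted for FIG. 6 writes 2^B at that position; which constant is intended cannot be determined from the text, so the sketch leaves it as a parameter.

```python
def fine_range_size(span, q1=1, q2=4, top_code=255):
    """Size A of the range quantized with the fine step q1.

    A is the x coordinate of the intersection of
        y = x / q1   and   y = x / q2 + top_code - span / q2,
    clamped to `span` because no more than the whole range can be fine.
    """
    if q2 <= q1:
        raise ValueError("q2 must be larger than q1")
    return min(span, q1 * (top_code * q2 - span) / (q2 - q1))

def codes_used(span, a, q1=1, q2=4):
    """Number of quantized codes needed: A/q1 for the fine part plus the rest at q2."""
    return a / q1 + (span - a) / q2

# Hypothetical histogram range: difference values spread over max - min = 420.
A = fine_range_size(420)      # (255*4 - 420) / 3 = 200
total = codes_used(420, A)    # 200 fine codes + 220/4 coarse codes = 255
```

In this example 200 difference values keep a step of 1 and the remaining 220 share 55 codes at a step of 4, so the whole range fits into 255 codes.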

In the second terminal 20, the processing by which the second drawing unit 26 obtains the second drawing information G2(t) is, as already described, the same as the processing by which the third drawing unit 33 of the server 30 obtains the third drawing information G3(t) (with the same drawing quality), so redundant description is omitted.

The second integration unit 27 adds the second extraction information E2(t) to the second drawing information G2(t), which is identical to the third drawing information G3(t), and thereby obtains the second integrated information G2S(t) as something identical to, or imitating, the fourth drawing information G4(t) drawn with high quality. This processing of the second integration unit 27 is the inverse of the processing of the extraction unit 35 of the server 30.

When the second extraction information E2(t) has been extracted using the linear transformation with the coefficients a and b described above, the second integration unit 27 may use the same coefficients a and b to obtain the second integrated information G2S(t). When the second extraction information E2(t) has been quantized with the quantization steps described above, the second integration unit 27 may first determine, by inverse quantization, the difference value corresponding to each quantized value, recover the second extraction information E2(t) as a distribution of difference values, and then obtain the second integrated information G2S(t).
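The integration step, as the inverse of the extraction, can be sketched as follows for the two forms of the second extraction information given earlier. This is a hedged sketch, not a definitive implementation: it assumes flattened grayscale pixel lists, omits quantization and clipping to the displayable range, and the function names and values are hypothetical.

```python
def integrate(g2, e2, a=1.0, b=0.0):
    """Invert E2 = G4 - (a*G3 + b), using G2 in place of the identical G3."""
    return [a * x + b + e for x, e in zip(g2, e2)]

def integrate_alt(g2, e2, a, b):
    """Invert E2 = (a*G4 + b) - G3; requires a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return [(x + e - b) / a for x, e in zip(g2, e2)]

# Hypothetical pixel values; G2 on the terminal equals G3 on the server.
g3 = g2 = [10, 100, 200]
g4 = [30, 120, 250]
a, b = 1.1, 5.0

e2 = [y - (a * x + b) for x, y in zip(g3, g4)]      # server side, first form
g2s = integrate(g2, e2, a, b)                        # terminal side

e2_alt = [a * y + b - x for x, y in zip(g3, g4)]    # server side, second form
g2s_alt = integrate_alt(g2, e2_alt, a, b)            # terminal side
```

With the default a = 1 and b = 0, `integrate` reduces to the plain addition G2S(t) = G2(t) + E2(t).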

The second presentation unit 28 is implemented in hardware as a display and displays to the second user the second integrated information G2S(t) obtained by the second integration unit 27, that is, the rendering of the avatar A1 of the first user U1. When the display constituting the second presentation unit 28 is of the optical see-through type, only the second integrated information G2S(t), the rendered avatar, needs to be displayed, and the optical see-through display may be arranged so that the position and orientation of the viewpoint when the user U2 wears it coincide with those of the second positioning information P2(t) measured by the second positioning unit 22. (That is, the second positioning unit 22 may measure, as the second positioning information P2(t) (the position and orientation of the virtual camera for drawing the virtual space V2 of the second user U2), the position and orientation of the optical see-through display so arranged, which coincide with those of the viewpoint of the second user U2 when worn by the second user U2.) Similarly, when the display constituting the second presentation unit 28 is of the video see-through type, the second integrated information G2S(t), the rendered avatar, may be displayed superimposed on a background video, and the background video displayed on the video see-through display may be one captured at the current time t by a camera whose position and orientation coincide with those of the second positioning information P2(t) measured by the second positioning unit 22. (That is, the second positioning unit 22 may measure the position and orientation of the camera that captures the background video as the second positioning information P2(t). When the second positioning unit 22 captures images and measures the second positioning information P2(t) from them, the video from the camera that captures those images may be used as the background video displayed on the video see-through display.)

FIG. 7 shows schematic examples of the drawing information and the second extraction information: the third drawing information G3(t) and the second drawing information G2(t), which are drawn identically at standard quality; the fourth drawing information G4(t), which is drawn at higher quality than these; and the second extraction information E2(t), the difference between G3(t) and G4(t). In these examples only the face is drawn as the avatar of the first user U1, but an avatar including the body may also be drawn. The fourth drawing information G4(t) is rendered with a directional light source and with reflection and shading on the avatar's surface taken into account, and is therefore of higher quality than G3(t) and G2(t), which do not take these into account.

As described above, with the presentation system 100 of the present embodiment, in remote communication using avatars, a three-dimensional avatar identical or nearly equivalent to one drawn at high quality using the abundant computational resources of the server 30 is displayed on the user terminal without the terminal itself drawing it at that quality. This enables remote communication with a sense of presence using a high-quality three-dimensional avatar, and, because only the difference obtained from the server 30's rendering results is transmitted, the amount of communication between the server 30 and the user terminal can also be kept down.

Various supplementary matters concerning the embodiments are described below.

(1) As explained in the overview, the presentation system 100 processes the information at each time t = 1, 2, 3, … of a predetermined processing rate synchronously and in real time. The first terminal 10, the second terminal 20 and the server 30 can perform processing at each common time t by synchronizing their clocks (timekeeping functions) in advance with an existing method such as the Network Time Protocol. The current time at which the second presentation unit 28 finally presents the second integrated information G2S(t) may, because of transmission delay or processing delay, be a time t+Δt (Δt > 0) later than the time t associated with G2S(t).

The first terminal 10, the second terminal 20 and the server 30 synchronize the time t at which each item of information (the first recognition information R1(t), the first positioning information P1(t) and the second positioning information P2(t)) is acquired and, on that basis, obtain the second, third and fourth drawing information G2(t), G3(t) and G4(t), the second extraction information E2(t) and the second integrated information G2S(t) with the time t attached as a timestamp. With the time t synchronized in this way, the processing rates of all or some of the first terminal 10, the second terminal 20 and the server 30 may differ from one another.

(2) When the second positioning unit 22 obtains the second positioning information P2(t) including depth information, the second drawing unit 26, the third drawing unit 33 and the fourth drawing unit 34, in drawing the second, third and fourth drawing information G2(t), G3(t) and G4(t) respectively as the three-dimensional avatar of the first user U1, may refrain from drawing any part of the three-dimensional avatar that lies behind the depth information (farther from the virtual camera). Such a part is one that is hidden by some real object in the environment E2 in which the second user U2 is present, so not drawing it can yield a natural result that reflects occlusion by real objects. (Depending on the positional relationship, the partially drawn avatar may appear to be embedded in a real object, for example inside a wall.)
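The depth-based omission described in (2) can be sketched per pixel as follows. This is a minimal sketch under assumptions not given in the description: the avatar rendering, the avatar depth and the measured environment depth are already available as equally sized two-dimensional arrays in the camera frame, undrawn pixels are represented by `None`, and the function name and values are hypothetical.

```python
def apply_occlusion(color, avatar_depth, env_depth):
    """Keep an avatar pixel only where the avatar is not behind the real environment.

    color        -- avatar pixel values, None where the avatar is not drawn
    avatar_depth -- distance from the virtual camera to the avatar surface
    env_depth    -- measured distance from the camera to the real environment
    """
    return [[c if (c is not None and d <= e) else None
             for c, d, e in zip(color_row, depth_row, env_row)]
            for color_row, depth_row, env_row in zip(color, avatar_depth, env_depth)]

# Hypothetical 2x2 example: a real object lies 2.0 from the camera everywhere.
color = [[5, 6], [None, 8]]
avatar_depth = [[1.0, 3.0], [1.0, 2.0]]
env_depth = [[2.0, 2.0], [2.0, 2.0]]
visible = apply_occlusion(color, avatar_depth, env_depth)
```

Here the pixel whose avatar depth is 3.0 lies behind the real object at 2.0 and is dropped, while the others are kept. For the difference E2(t) to remain valid, the same masking would need to be applied consistently in all three drawing units.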

(3) When, under the usage settings of the remote communication, the avatar A1 of the first user U1 is displayed at a fixed position and orientation in the virtual space V2 provided to the second user U2, the processing by which the first positioning unit 12 obtains the first positioning information P1(t) in real time at each time t may be omitted. In this case, the real-time first positioning information P1(t) is regarded as a constant value (a predetermined value given in advance) independent of the time t, and the processing of the third drawing unit 33, the fourth drawing unit 34 and the extraction unit 35 of the server 30 and that of the second drawing unit 26, the second integration unit 27 and the second presentation unit 28 of the second terminal 20 may be performed in the same way as before. (The constant value and the predetermined transformation T described above determine the fixed position and orientation in the virtual space V2.)

(4) As stated in the overview, the description given above with reference to FIGS. 2 and 3 concerns the processing that displays the avatar A1 of the first user U1 to the second user U2 in the virtual space V2 (referred to as the "first avatar display process"). With the roles of the first terminal 10 and the second terminal 20 exchanged, it is possible in exactly the same way to perform processing that displays the avatar A2 of the second user U2 to the first user U1 in the virtual space V1 (referred to as the "second avatar display process").

FIG. 8 is a functional block diagram of the presentation system 100 according to an embodiment in which the first avatar display process and the second avatar display process are performed bidirectionally. The configuration for the first avatar display process in FIG. 8 is the same as in FIG. 2, so redundant description is omitted. In FIG. 8, as the configuration for the second avatar display process, the first terminal 10 includes a first positioning unit 12, a first drawing unit 16, a first integration unit 17 and a first presentation unit 18, and the second terminal 20 includes a second recognition unit 21 and a second positioning unit 22. The operations of these units in the second avatar display process are the same as those, in the first avatar display process, of the second positioning unit 22, the second drawing unit 26, the second integration unit 27 and the second presentation unit 28 of the second terminal 20 and of the first recognition unit 11 and the first positioning unit 12 of the first terminal 10, respectively (identical except that the information of the first user and that of the second user are exchanged), so redundant description is omitted. The processing on the server 30 is likewise the same for the second avatar display process as for the first (again identical except that the information of the first user and that of the second user are exchanged), so redundant description is omitted.

(5) When the users U1 and U2 communicate remotely using avatars through the presentation system 100, audio may also be recorded in real time and played back on the other user's side. When the mouth movements of the user U1 are reflected in the first recognition information R1(t), the avatar A1 of the user U1 is displayed to the other user U2 speaking in step with the mouth movements of the user U1, and what is being said is also played back as audio.

(6) FIG. 9 shows an example of the hardware configuration of a general computer device 70. The first terminal 10, the second terminal 20 and the server 30 of the presentation system 100 can each be realized as one or more computer devices 70 having such a configuration. When any of the first terminal 10, the second terminal 20 and the server 30 is realized with two or more computer devices 70, the information required for processing may be exchanged via the network NW. The computer device 70 includes a CPU (central processing unit) 71 that executes predetermined instructions; a GPU (graphics processing unit) 72 as a dedicated processor that executes some or all of the instructions of the CPU 71 in its place or in cooperation with it; a RAM 73 as a main storage device that provides a work area for the CPU 71 (and the GPU 72); a ROM 74 as an auxiliary storage device; a communication interface 75; a display 76; an input interface 77 that accepts user input through a mouse, keyboard, touch panel or the like; a camera 78 that images the environment and the user; one or more types of sensors 79, such as a LiDAR sensor, that perform sensing and measurement by means other than image capture; and a bus BS for exchanging data among these.

Each functional unit of the first terminal 10, the second terminal 20 and the server 30 can be realized by the CPU 71 and/or the GPU 72 reading from the ROM 74 and executing a predetermined program corresponding to the function of that unit. Both the CPU 71 and the GPU 72 are types of arithmetic unit (processor). When display-related processing is performed, the display 76 additionally operates in conjunction, and when communication-related processing for data transmission and reception is performed, the communication interface 75 additionally operates in conjunction. The first presentation unit 18 and the second presentation unit 28 may be realized as the display 76 and output an augmented reality display.

100 … presentation system, 10 … first terminal, 20 … second terminal, 30 … server
11 … first recognition unit, 12 … first positioning unit
22 … second positioning unit, 26 … second drawing unit, 27 … second integration unit, 28 … second presentation unit
33 … third drawing unit, 34 … fourth drawing unit, 35 … extraction unit

Claims

A presentation system comprising:
a first recognition unit that recognizes a state related to communication of a first user to obtain first recognition information;
a second positioning unit that measures the position and posture of a second user to obtain second positioning information;
a third drawing unit that obtains third drawing information in which an avatar of the first user, reflecting the first recognition information, is drawn from a virtual camera viewpoint placed according to the second positioning information;
a fourth drawing unit that obtains fourth drawing information in which the avatar of the first user, reflecting the first recognition information, is drawn from the virtual camera viewpoint placed according to the second positioning information with higher quality than the drawing mode of the third drawing unit;
an extraction unit that extracts the difference between the third drawing information and the fourth drawing information as second extraction information;
a second drawing unit that obtains second drawing information in which the avatar of the first user, reflecting the first recognition information, is drawn from the virtual camera viewpoint placed according to the second positioning information with the same quality as the drawing mode of the third drawing unit;
a second integration unit that reflects the second extraction information in the second drawing information to obtain second integrated information, which is the avatar of the first user approximating the fourth drawing information; and
a second presentation unit that displays the second integrated information to the second user.
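
As an illustration only, the extraction and integration steps of this claim can be sketched on small grayscale images represented as nested lists. The function names `extract_difference` and `integrate`, the 8-bit pixel range, and the sample pixel values are assumptions made for this sketch and do not appear in the specification.

```python
# Illustrative sketch only. Function names, the 8-bit clamp, and the sample
# values are assumptions, not taken from the specification.

def extract_difference(low, high):
    """Extraction unit 35: per-pixel difference (high quality minus low quality)."""
    return [[h - l for l, h in zip(low_row, high_row)]
            for low_row, high_row in zip(low, high)]


def integrate(local_low, diff):
    """Second integration unit 27: add the difference to the locally drawn image."""
    return [[max(0, min(255, p + d)) for p, d in zip(row, diff_row)]
            for row, diff_row in zip(local_low, diff)]


third = [[100, 120], [90, 80]]    # third drawing information (low quality, server)
fourth = [[110, 118], [95, 80]]   # fourth drawing information (high quality, server)
second = [[100, 120], [90, 80]]   # second drawing information (low quality, terminal)

second_extraction = extract_difference(third, fourth)      # sent to the terminal
second_integrated = integrate(second, second_extraction)   # shown to the second user
```

In this sketch the second drawing information is identical to the third drawing information, so adding the difference reproduces the fourth drawing information exactly; if the two low-quality drawings differed, the result would only approximate it.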

The presentation system according to claim 1, wherein:
the first recognition unit is provided in a first terminal used by the first user;
the second positioning unit, the second drawing unit, the second integration unit, and the second presentation unit are provided in a second terminal used by the second user; and
the third drawing unit, the fourth drawing unit, and the extraction unit are provided in a server.

The presentation system according to claim 2, wherein the first terminal, the second terminal, and the server are configured to be capable of communicating with each other via a network.

The presentation system according to any one of claims 1 to 3, wherein the first recognition unit recognizes a facial expression and/or a pose as the state related to the communication of the first user to obtain the first recognition information.

The presentation system according to any one of claims 1 to 4, wherein the extraction unit applies a conversion process to one of the third drawing information and the fourth drawing information so as to suppress the difference between them, and extracts, as the second extraction information, the difference between the converted one and the other, together with information on the conversion process.

The presentation system according to claim 5, wherein the conversion process is a linear conversion or a constant addition.
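
As a rough sketch of the constant-addition case, the example below shifts the low-quality drawing by a single offset before taking the difference, so that the residual values become smaller. Choosing the offset as the rounded difference of the mean pixel values, the function names, and the sample values are all assumptions for illustration; the claims do not state how the conversion parameters are chosen.

```python
# Illustrative sketch only. The mean-based offset and all names are assumptions.

def mean(img):
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def extract_with_offset(low, high):
    """Apply a constant addition to the low-quality image, then take the difference."""
    offset = round(mean(high) - mean(low))
    residual = [[h - (l + offset) for l, h in zip(low_row, high_row)]
                for low_row, high_row in zip(low, high)]
    return offset, residual


def integrate_with_offset(local_low, offset, residual):
    """Apply the same conversion on the terminal side, then add the residual."""
    return [[p + offset + r for p, r in zip(row, res_row)]
            for row, res_row in zip(local_low, residual)]


third = [[100, 120], [90, 80]]      # low-quality drawing
fourth = [[130, 151], [119, 110]]   # high-quality drawing, uniformly brighter

offset, residual = extract_with_offset(third, fourth)
restored = integrate_with_offset(third, offset, residual)
```

Here the plain difference would hold values around 30, whereas the residual after the offset holds values of magnitude at most 1, which is the sense in which the conversion suppresses the difference. The offset is sent alongside the residual so that the terminal can repeat the same conversion.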

The presentation system according to any one of claims 1 to 6, wherein the extraction unit extracts the second extraction information as the result of quantizing each difference value in a pixel difference value map, calculated as the difference between the third drawing information and the fourth drawing information, in such a way that the quantization error is suppressed.

The presentation system according to claim 7, wherein the extraction unit limits the quantization step at the time of quantization.
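
One possible reading of the quantization in the two preceding claims is uniform quantization of the difference map with an upper bound on the step size, which in turn bounds the reconstruction error by half a step. The sketch below follows that reading; the step-selection rule, the parameters `levels` and `max_step`, and the function names are assumptions and are not taken from the claims.

```python
# Illustrative sketch only. The step-selection rule and all names are assumptions.
import math


def choose_step(diff, levels, max_step):
    """Pick a step from the value range of the map, capped at max_step."""
    values = [d for row in diff for d in row]
    span = max(values) - min(values)
    return max(1, min(max_step, math.ceil(span / levels)))


def quantize(diff, step):
    """Map each difference value to the index of the nearest multiple of step."""
    return [[round(d / step) for d in row] for row in diff]


def dequantize(indices, step):
    """Reconstruct approximate difference values on the terminal side."""
    return [[i * step for i in row] for row in indices]


diff = [[10, -2], [5, 0]]
step = choose_step(diff, levels=4, max_step=2)
indices = quantize(diff, step)
approx = dequantize(indices, step)

# Largest absolute reconstruction error over the whole map.
max_error = max(abs(d - a)
                for d_row, a_row in zip(diff, approx)
                for d, a in zip(d_row, a_row))
```

Because each value is rounded to the nearest multiple of the step, `max_error` cannot exceed half the step, so capping the step directly caps the error introduced into the integrated avatar image.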

The presentation system according to any one of claims 1 to 8, wherein:
when measuring the position and posture of the second user to obtain the second positioning information, the second positioning unit also acquires depth information of the environment in which the second user is present; and
when drawing the avatar of the first user as the second drawing information, the third drawing information, and the fourth drawing information, respectively, the second drawing unit, the third drawing unit, and the fourth drawing unit do not draw any portion that is occluded according to the depth information.
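
A minimal sketch of such occlusion handling is a per-pixel depth test: an avatar pixel is kept only when the avatar is nearer to the viewpoint than the measured environment at that pixel. The function name `mask_occluded`, the use of `None` for undrawn pixels, and the sample depths are assumptions for illustration.

```python
# Illustrative sketch only. Names, the None marker, and sample values are assumptions.

def mask_occluded(avatar, avatar_depth, env_depth):
    """Keep an avatar pixel only where the avatar is closer than the environment."""
    return [[pixel if a_d < e_d else None
             for pixel, a_d, e_d in zip(p_row, a_row, e_row)]
            for p_row, a_row, e_row in zip(avatar, avatar_depth, env_depth)]


avatar = [[200, 210], [220, 230]]          # drawn avatar pixel values
avatar_depth = [[1.0, 3.0], [2.0, 2.5]]    # distance from viewpoint to avatar
env_depth = [[2.0, 2.0], [2.0, 3.0]]       # measured distance to the real environment

visible = mask_occluded(avatar, avatar_depth, env_depth)
```

Applying the same test in the second, third, and fourth drawing units keeps the occluded region identical in all three drawings, so the difference map needs no values there.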

The presentation system according to any one of claims 1 to 9, further comprising a first positioning unit that measures the position and posture of the first user to obtain first positioning information,
wherein the second drawing unit, the third drawing unit, and the fourth drawing unit draw the second drawing information, the third drawing information, and the fourth drawing information, respectively, with the avatar of the first user placed at a position and posture corresponding to the first positioning information.

A server in a presentation system that comprises: a first terminal used by a first user and having a first recognition unit; a second terminal used by a second user and having a second positioning unit, a second drawing unit, a second integration unit, and a second presentation unit; and the server, which has a third drawing unit, a fourth drawing unit, and an extraction unit, wherein:
the first recognition unit recognizes a state related to communication of the first user to obtain first recognition information;
the second positioning unit measures the position and posture of the second user to obtain second positioning information;
the third drawing unit obtains third drawing information in which an avatar of the first user, reflecting the first recognition information, is drawn from a virtual camera viewpoint placed according to the second positioning information;
the fourth drawing unit obtains fourth drawing information in which the avatar of the first user, reflecting the first recognition information, is drawn from the virtual camera viewpoint placed according to the second positioning information with higher quality than the drawing mode of the third drawing unit;
the extraction unit extracts the difference between the third drawing information and the fourth drawing information as second extraction information;
the second drawing unit obtains second drawing information in which the avatar of the first user, reflecting the first recognition information, is drawn from the virtual camera viewpoint placed according to the second positioning information with the same quality as the drawing mode of the third drawing unit;
the second integration unit reflects the second extraction information in the second drawing information to obtain second integrated information, which is the avatar of the first user approximating the fourth drawing information; and
the second presentation unit displays the second integrated information to the second user.

A second terminal in a presentation system that comprises: a first terminal used by a first user and having a first recognition unit; the second terminal, which is used by a second user and has a second positioning unit, a second drawing unit, a second integration unit, and a second presentation unit; and a server having a third drawing unit, a fourth drawing unit, and an extraction unit, wherein:
the first recognition unit recognizes a state related to communication of the first user to obtain first recognition information;
the second positioning unit measures the position and posture of the second user to obtain second positioning information;
the third drawing unit obtains third drawing information in which an avatar of the first user, reflecting the first recognition information, is drawn from a virtual camera viewpoint placed according to the second positioning information;
the fourth drawing unit obtains fourth drawing information in which the avatar of the first user, reflecting the first recognition information, is drawn from the virtual camera viewpoint placed according to the second positioning information with higher quality than the drawing mode of the third drawing unit;
the extraction unit extracts the difference between the third drawing information and the fourth drawing information as second extraction information;
the second drawing unit obtains second drawing information in which the avatar of the first user, reflecting the first recognition information, is drawn from the virtual camera viewpoint placed according to the second positioning information with the same quality as the drawing mode of the third drawing unit;
the second integration unit reflects the second extraction information in the second drawing information to obtain second integrated information, which is the avatar of the first user approximating the fourth drawing information; and
the second presentation unit displays the second integrated information to the second user.

A program that causes a computer to function as the server according to claim 11 or as the second terminal according to claim 12.