JP6667785B1

JP6667785B1 - A program for learning by associating a three-dimensional model with a depth image

Info

Publication number: JP6667785B1
Application number: JP2019001565A
Authority: JP
Inventors: 裕樹有光
Original assignee: Individual
Current assignee: Individual
Priority date: 2019-01-09
Filing date: 2019-01-09
Publication date: 2020-03-18
Anticipated expiration: 2039-01-09
Also published as: JP2020112899A

Abstract

PROBLEM TO BE SOLVED: To provide a program that generates a three-dimensional model of a measurement target user from a point cloud of that user captured by a depth camera. SOLUTION: A program causes a computer mounted on an apparatus to function, the apparatus learning by associating three-dimensional models of teacher data with point clouds. The program causes the computer to function as: point cloud creation means that creates, for each three-dimensional model of the teacher data, a point cloud obtained by capturing the three-dimensional model in software from one or more predetermined viewpoints; a statistical learning engine that outputs dimensionally compressed component variables of dimension m from the plural three-dimensional models of the teacher data group and constructs a statistical learning model; and a first correlation learning engine that constructs, for the plural three-dimensional models of the teacher data group, a first correlation learning model between the point cloud of each three-dimensional model and the component variables of dimension m. SELECTED DRAWING: FIG. 1

Description

The present invention relates to machine learning engine technology based on three-dimensional models.

In recent years, three-dimensional scanners capable of capturing human body shape data have become available (see, for example, Non-Patent Documents 1 and 2). These scanners measure three-dimensional point cloud data by non-contact optical triangulation of the human body. The point cloud data is extremely dense, at about one million points, and achieves very small errors in anthropometric applications. Such human body shape data is also useful as health management data beyond body shape alone.

Conventionally, there is a technique for deforming a muscle model superimposed on a skeletal model to match the measurement results of a subject (see, for example, Patent Document 1). This technique creates a human body model specific to the subject from physical measurements taken with a body composition meter and a three-dimensional measuring device. The human body model is an anatomical model that combines skeleton, muscle and fat, each of which is deformed according to the subject's measurement results. By switching the display among the skeleton, muscle and fat models, the subject can visually grasp the condition inside his or her own body.

There is also a technique in which a three-dimensional model that represents an object by feature parameters is stored in advance, and an image of a feature region detected from a captured image is fitted to the three-dimensional model (see, for example, Patent Document 2). This technique calculates the values of the feature parameters of the three-dimensional model that represent the object captured in the feature-region image. By outputting those feature parameter values together with the image of the region outside the feature region, it reduces the data volume of the whole three-dimensional image.

In contrast to such three-dimensional models, depth cameras capable of capturing 2.5-dimensional images have become common. A depth camera is a camera with a built-in depth sensor that acquires depth information. By acquiring depth information (Depth) in addition to the two-dimensional planar image (RGB) acquired by an ordinary camera, it can acquire three-dimensional information. This information is called a "depth image", in which the distance (depth) at each pixel is rendered as a grayscale level. Because only the three-dimensional information visible from a single viewpoint can be acquired, it is also conceptually called a "2.5-dimensional image".

Patent Document 1: JP 2017-176803 A
Patent Document 2: JP 2009-268088 A

Non-Patent Document 1: "3D BODY SCANNER SCUVEG4", Space Vision Co., Ltd., [online], [retrieved December 31, 2018], Internet <URL: http://www.spacevision.tokyo/>
Non-Patent Document 2: "3D Body Station", 3D Body Lab, Inc., [online], [retrieved December 31, 2018], Internet <URL: https://www.3dbodylab.co.jp/3dbodystation/>
Non-Patent Document 3: "Auto Encoder", [online], [retrieved December 31, 2019], Internet <https://deepage.net/deep_learning/2016/10/09/deeplearning_autoencoder.html>
Non-Patent Document 4: "MAYA: Let's adjust the depth of field", [online], [retrieved December 31, 2019], Internet <http://thankstotoday.com/modeling-depth1/>

In the case of Non-Patent Documents 1 and 2 described above, generating a three-dimensional model of a human body requires a three-dimensional scanner that is large in both scale and cost. In addition, to increase accuracy, a three-dimensional model has, for example, 15,000 or more vertices, and because each vertex is expressed in three dimensions (x, y, z), the point cloud data has 45,000 or more dimensions, an enormous amount of data.

In view of this, the inventor of the present application wondered whether a three-dimensional model close to the user's body shape could be associated easily using only a depth camera that captures the user from a predetermined viewpoint, without preparing an optical-triangulation three-dimensional scanner. For example, the inventor wondered whether a user could capture his or her own body shape with a depth camera mounted on a smartphone or other healthcare device, and have a three-dimensional model associated with that depth image.

On the other hand, a depth image with, for example, height h = 960 pixels and width w = 540 pixels has w*h = 518,400 dimensions, which is also an enormous amount of data. Processing such a volume of data on an embedded processor is extremely difficult.

The inventor also wondered whether three-dimensional model and depth image data could be shared (transmitted and received) easily and in a small size. In particular, a service provider that has acquired a user's body shape data could offer a variety of user-specific services if this enormous amount of data could be transmitted and received instantly.

Furthermore, the inventor considered that a three-dimensional model or depth image representing a user's own body shape should be kept confidential as that user's personal information. That is, even if this enormous amount of data can be compressed, it must be implemented so that a third party cannot readily decode it.

Furthermore, the inventor considered that a user's body shape varies greatly not only in outward appearance but also with that user's body composition values. For example, even users with the same height, abdominal girth and chest girth may differ in appearance depending on their body composition values. The inventor therefore considered that body composition values acquired by a healthcare measuring instrument might also be associated with the three-dimensional model or depth image that represents outward appearance.

Accordingly, an object of the present invention is to provide a program capable of generating a three-dimensional model of a measurement target user from a depth image of that user captured by a depth camera.

According to the present invention, there is provided a program that causes a computer mounted on an apparatus to function, the apparatus learning by associating three-dimensional models of human bodies in teacher data with depth images, wherein the program causes the computer to function as:
depth image creation means that creates, for each three-dimensional model of the teacher data, a depth image obtained by capturing the three-dimensional model in software from one or more predetermined viewpoints;
a statistical learning engine that outputs dimensionally compressed component variables of dimension m from the plural three-dimensional models of the teacher data group and constructs a statistical learning model;
a first correlation learning engine that constructs, for the plural three-dimensional models of the teacher data group, a first correlation learning model between the depth image of each three-dimensional model and the component variables of dimension m; and
a second correlation learning engine that constructs, for the plural three-dimensional models of the teacher data group, a second correlation learning model between composition values of dimension n, including at least height and one or more body composition meter values, and the component variables of dimension m.

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function such that
the statistical learning engine is based on principal component analysis (Principal Component Analysis) or an autoencoder (AutoEncoder).

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function such that
the first correlation learning engine is based on a convolutional neural network, and
the second correlation learning engine is based on the least squares method or a multilayer perceptron.
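As a minimal sketch of the least-squares option for the second correlation learning engine, the Python below fits a linear map (with an intercept) from n composition values to m component variables using `numpy.linalg.lstsq`. The array shapes, the intercept column, the synthetic data and the function names `fit_linear_map` and `apply_linear_map` are illustrative assumptions and are not taken from the specification.

```python
import numpy as np

def fit_linear_map(C, S):
    """Fit W minimising ||[C, 1] @ W - S||^2 in the least-squares sense.

    C: (bodies, n) composition values; S: (bodies, m) component variables.
    Returns W with shape (n + 1, m); the last row is the intercept.
    """
    C1 = np.hstack([C, np.ones((C.shape[0], 1))])
    W, *_ = np.linalg.lstsq(C1, S, rcond=None)
    return W

def apply_linear_map(W, c):
    """Map one composition vector c of shape (n,) to component variables (m,)."""
    return np.append(c, 1.0) @ W

# Synthetic, noiseless teacher data (assumed sizes: 1,000 bodies, n = 6, m = 30).
rng = np.random.default_rng(0)
bodies, n, m = 1000, 6, 30
C = rng.normal(size=(bodies, n))
W_true = rng.normal(size=(n + 1, m))
S = np.hstack([C, np.ones((bodies, 1))]) @ W_true

W = fit_linear_map(C, S)
s_hat = apply_linear_map(W, C[0])   # component variables predicted for body 0
```

Because the synthetic data is exactly linear, the fitted `W` matches `W_true` up to rounding; with real measurements the fit would only approximate the relationship.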

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function such that
the number of three-dimensional models (bodies) in the teacher data group is smaller than the number of vertices of the three-dimensional model.

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function as:
a first encoder that encodes a depth image as target data into component variables of dimension m using the first correlation learning engine; and
a second encoder that encodes the composition values of dimension n of one body as target data into component variables of dimension m using the second correlation learning engine.

According to another embodiment of the program of the present invention, it is also preferable that, in order to input a depth image and composition values associated with each other as target data and re-train the first correlation learning engine and the second correlation learning engine, the computer is caused to function such that:
the first encoder encodes the depth image of the target data into first component variables of dimension m;
the second encoder encodes the composition values of the target data into second component variables of dimension m;
the first component variables of dimension m and the second component variables of dimension m are integrated into a single set of component variables of dimension m;
the first correlation learning engine re-learns from the depth image of the target data and the integrated component variables of dimension m; and
the second correlation learning engine re-learns from the composition values of the target data and the integrated component variables of dimension m.

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function such that
a weight w that becomes smaller as the number of missing composition values in the target data increases is assigned, and the weighted average of the first component variables of dimension m and the second component variables of dimension m, calculated for each dimension, is taken as the single integrated set of component variables of dimension m.
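The weighted integration can be sketched as follows. The text only requires that w shrink as more composition values are missing. Two things are assumptions made for illustration: that w is applied to the composition-derived (second) component variables, and the linear schedule from 0.5 down to 0.0. The function name `integrate_components` is hypothetical.

```python
import numpy as np

def integrate_components(s_depth, s_comp, n_missing, n_total):
    """Blend two m-dimensional component vectors dimension by dimension.

    w is the weight on the composition-derived vector. It falls linearly from
    0.5 (no values missing) to 0.0 (all values missing). This schedule is an
    assumption; the source only states that w decreases with missing values.
    """
    if not 0 <= n_missing <= n_total:
        raise ValueError("n_missing must lie between 0 and n_total")
    w = 0.5 * (1.0 - n_missing / n_total)
    return (1.0 - w) * np.asarray(s_depth, dtype=float) + w * np.asarray(s_comp, dtype=float)

s1 = np.array([1.0, 2.0])   # from the depth image (first encoder)
s2 = np.array([3.0, 6.0])   # from the composition values (second encoder)
blended_full = integrate_components(s1, s2, n_missing=0, n_total=4)
blended_none = integrate_components(s1, s2, n_missing=4, n_total=4)
```

With nothing missing the two estimates are averaged equally; with every composition value missing the result falls back to the depth-derived vector.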

According to another embodiment of the program of the present invention, it is also preferable that the computer is further caused to function as
missing value estimation means that, in order to estimate the component variables of dimension m even when only k (< n) of the composition values of dimension n of one body as target data are determined and the other n−k composition values are missing, estimates optimized values for the other n−k composition values using the k composition values as constraints.

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function such that
the missing value estimation means uses the method of Lagrange multipliers.
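One way to read constrained missing-value estimation with Lagrange multipliers is sketched below. The objective is an assumption made for illustration, since this passage does not state one: the full composition vector c is chosen to minimise its squared Mahalanobis distance from the teacher-data mean, subject to the k known entries being fixed. The function name `estimate_missing` and the example mean and covariance are hypothetical.

```python
import numpy as np

def estimate_missing(mu, cov, known_idx, known_vals):
    """Solve  min (c-mu)^T P (c-mu)  s.t.  c[known_idx] = known_vals,  P = cov^-1.

    Stationarity of the Lagrangian gives the linear KKT system
        [2P  A^T] [c     ]   [2P mu     ]
        [A   0  ] [lambda] = [known_vals]
    where each row of A selects one known entry of c.
    """
    mu = np.asarray(mu, dtype=float)
    n, k = mu.size, len(known_idx)
    P = np.linalg.inv(np.asarray(cov, dtype=float))
    A = np.zeros((k, n))
    A[np.arange(k), known_idx] = 1.0
    kkt = np.block([[2.0 * P, A.T], [A, np.zeros((k, k))]])
    rhs = np.concatenate([2.0 * P @ mu, np.asarray(known_vals, dtype=float)])
    return np.linalg.solve(kkt, rhs)[:n]   # drop the multipliers

# Assumed statistics for (height cm, weight kg, body fat %) of the teacher data.
mu = np.array([170.0, 65.0, 20.0])
cov = np.array([[50.0, 40.0, 5.0],
                [40.0, 100.0, 30.0],
                [5.0, 30.0, 25.0]])
c = estimate_missing(mu, cov, known_idx=[0], known_vals=[180.0])
```

Under this assumed objective the result coincides with the conditional mean of a Gaussian fitted to the teacher data, so the known height pulls the missing weight and body fat values along their observed correlations.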

According to another embodiment of the program of the present invention, it is also preferable that the computer is further caused to function as
a first decoder that decodes the component variables of dimension m into a three-dimensional model using the statistical learning engine.

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function as
grayscale image creation means that creates one or more grayscale images of the decoded three-dimensional model captured in software from a predetermined viewpoint.

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function as
a second decoder that decodes component variables of dimension m as target data into composition values using the second correlation learning engine.

According to another embodiment of the program of the present invention, it is also preferable that the computer is caused to function such that
the depth image is an image in which depth is rendered as grayscale levels.

According to another embodiment of the program of the present invention, it is also preferable that the computer is further caused to function as
share code output means that outputs the encoded component variables of dimension m as a share (Share) code and embeds the share code in a QR (Quick Response, registered trademark) code, an RFID (Radio Frequency IDentifier) tag or a Cookie.
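To illustrate how small such a share code can be, the sketch below packs m = 30 component variables into a URL-safe text string that would fit in a QR code or a cookie. The 16-bit float quantisation, the base64 encoding and the function names `to_share_code` and `from_share_code` are assumptions for illustration. The specification does not define the share code format.

```python
import base64
import numpy as np

def to_share_code(s):
    """Quantise component variables to little-endian float16 and base64-encode them."""
    raw = np.asarray(s, dtype="<f2").tobytes()
    return base64.urlsafe_b64encode(raw).decode("ascii")

def from_share_code(code):
    """Inverse of to_share_code; returns float64 values at float16 precision."""
    raw = base64.urlsafe_b64decode(code.encode("ascii"))
    return np.frombuffer(raw, dtype="<f2").astype(np.float64)

rng = np.random.default_rng(0)
s = rng.normal(size=30)          # stand-in for 30 component variables
code = to_share_code(s)          # 30 values * 2 bytes = 60 bytes of payload
restored = from_share_code(code)
```

The string carries only the m numbers. Turning them back into a body shape would additionally require the statistical learning model held by the decoder side.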

According to the program of the present invention, a three-dimensional model of a measurement target user can be generated from a depth image of that user captured by a depth camera.

FIG. 1 is a functional configuration diagram of the learning stage in the apparatus of the present invention.
FIG. 2 is an explanatory diagram showing the flow of processing in the learning stage in the apparatus of the present invention.
FIG. 3 is an explanatory diagram showing the vector space of three-dimensional models in the statistical learning engine.
FIG. 4 is an explanatory diagram showing the statistical shape space in the statistical learning engine.
FIG. 5 is simple code representing principal component analysis in the statistical learning engine.
FIG. 6 is simple code representing a convolutional neural network in the first correlation learning engine.
FIG. 7 is an explanatory diagram showing the composition value space derived from composition values.
FIG. 8 is an explanatory diagram showing the linear transformation between the statistical shape space and the composition value space.
FIG. 9 is simple code representing the linear transformation between the statistical shape space and the composition value space in the second correlation learning engine.
FIG. 10 is a functional configuration diagram of the encoding side in the operation stage of the present invention.
FIG. 11 is an explanatory diagram showing the composition value space in which missing values are estimated in the present invention.
FIG. 12 is simple code representing missing value estimation in the present invention.
FIG. 13 is a functional configuration diagram of the decoding side in the operation stage of the present invention.
FIG. 14 is an explanatory diagram showing the accuracy of a three-dimensional model decoded by the present invention.
FIG. 15 is a functional configuration diagram showing re-learning in the operation stage of the present invention.
FIG. 16 is a configuration diagram in which the encoding-side functions of the present invention are incorporated in a body composition meter.
FIG. 17 is a system configuration diagram showing various arrangements of depth cameras using a body composition meter and a terminal.
FIG. 18 is an external view showing the user's posture when a depth camera is mounted in the grip of a body composition meter.
FIG. 19 is an explanatory diagram showing the accuracy of a three-dimensional model decoded from a depth image captured with the user posture of FIG. 18.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a functional configuration diagram of the learning stage in the apparatus of the present invention.

According to FIG. 1, the apparatus 1 receives, as teacher data, a data group in which three-dimensional models are associated with composition values.
In the embodiments of the present invention, the three-dimensional model is described as being a human body.
The three-dimensional models themselves may be generated artificially with deliberate variation, or may be acquired in advance with an optical scanner from a large number of unspecified human bodies.
The composition values may be body composition meter values obtained from the person (user) on whom the three-dimensional model is based. Here, the composition values may include measurements that can also be obtained from the appearance of the three-dimensional model, such as height, abdominal girth and chest girth. According to the present invention, however, in the case of a human body it is preferable that at least height be included as a composition value. It is also preferable that the composition values include body composition meter values that are not tied to the appearance of the three-dimensional model, such as body weight, body fat, visceral fat and skeletal muscle percentage measured by a body composition meter.
Note that the composition values are not indispensable to the essence of the present invention.

According to the present invention, the apparatus 1 in the learning stage causes each learning engine to construct its learning model in advance. Once the learning models have been constructed, each learning engine can be incorporated in a distributed manner on the encoder side or the decoder side, or in the individual devices of the system.

According to the present invention, the apparatus 1 in the learning stage includes a depth image creation unit 10, a statistical learning engine 100, a first correlation learning engine 110 and a second correlation learning engine 120. These functional components are realized by executing a program that causes the computer mounted on the apparatus to function. The flow of processing of these functional components can also be understood as a learning method of the apparatus.

FIG. 2 is an explanatory diagram showing the flow of processing in the learning stage in the apparatus of the present invention.

[Depth image creation unit 10]
The depth image creation unit 10 creates, for each three-dimensional model of the teacher data, a depth image obtained by capturing the three-dimensional model in software from one or more predetermined viewpoints.
The depth image is an image in which depth is rendered as grayscale levels.

The depth image creation unit 10 can also shift the position of the virtual camera (the predetermined viewpoint) in software. That is, from a single three-dimensional model it can create a plurality of depth images captured by virtual cameras at different viewpoints. For example, with the video production software MAYA (registered trademark), the position of the virtual camera can be adjusted to create a grayscale (Z-depth) image in .exr format. This grayscale image has a single depth channel, and each pixel is represented by a 32-bit floating-point number between 0.0 and 1.0.
The created depth images are then output to the first correlation learning engine 110.
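The idea of capturing depth images in software from several viewpoints can be sketched without a renderer. The Python below is not MAYA. It swaps in a deliberately simple orthographic point-splat with a z-buffer over the model's vertices, and writes single-channel 32-bit float depth in the range 0.0 to 1.0. The image size, the per-axis normalisation (which ignores aspect ratio) and the names `rotate_y` and `render_depth` are assumptions for illustration.

```python
import numpy as np

def rotate_y(vertices, degrees):
    """Rotate (N, 3) vertices about the vertical y axis to move the virtual viewpoint."""
    t = np.radians(degrees)
    R = np.array([[np.cos(t), 0.0, np.sin(t)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(t), 0.0, np.cos(t)]])
    return np.asarray(vertices, dtype=np.float64) @ R.T

def render_depth(vertices, height=96, width=54):
    """Orthographic depth image: 0.0 is nearest, 1.0 is farthest or background."""
    v = np.asarray(vertices, dtype=np.float64)
    lo, hi = v.min(axis=0), v.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    unit = (v - lo) / span                              # each axis scaled to [0, 1]
    cols = np.minimum((unit[:, 0] * width).astype(int), width - 1)
    rows = np.minimum(((1.0 - unit[:, 1]) * height).astype(int), height - 1)
    depth = np.ones((height, width), dtype=np.float32)  # start with background
    # z-buffer: keep the smallest (nearest) depth that lands on each pixel
    np.minimum.at(depth, (rows, cols), unit[:, 2].astype(np.float32))
    return depth

rng = np.random.default_rng(0)
body = rng.normal(size=(5000, 3)) * [0.2, 0.8, 0.1]     # stand-in for mesh vertices
views = [render_depth(rotate_y(body, angle)) for angle in (0, 45, 90)]
```

A production pipeline would rasterise the mesh faces rather than splat vertices, but the output has the same form: one float32 depth channel per viewpoint.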

[Statistical learning engine 100]
The statistical learning engine 100 receives the plural three-dimensional models of the teacher data group, outputs dimensionally compressed component variables of dimension m, and constructs a statistical learning model.
The statistical learning engine may be based on principal component analysis (Principal Component Analysis) or an autoencoder (AutoEncoder).

FIG. 3 is an explanatory diagram showing the vector space of three-dimensional models in the statistical learning engine.

According to FIG. 3, the teacher data is assumed to be, for example, 1,000 human bodies with a variety of body shapes. A three-dimensional model is shape data of a human body or an object, and models of the same kind of object are all expressed with the same number of vertices.
According to FIG. 3(a), each three-dimensional model has N = 15,000 vertices, and each vertex is expressed in three dimensions (x, y, z). That is, one three-dimensional model is represented by a 3N (= 45,000)-dimensional vector.
According to FIG. 3(b), each three-dimensional model (one body) is represented by a single point in the 3N-dimensional space.
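The mapping from a mesh to a single point in 3N-dimensional space is just a flattening of the vertex array. A small sketch follows, with random numbers standing in for real vertex coordinates and only 10 bodies to keep memory low. The vertex ordering (x, y, z interleaved per vertex) is an assumption.

```python
import numpy as np

N = 15_000                                   # vertices per three-dimensional model
bodies = 10                                  # 1,000 in the text; reduced here
rng = np.random.default_rng(0)
meshes = rng.normal(size=(bodies, N, 3))     # placeholder (x, y, z) per vertex

# One row per body: [x0, y0, z0, x1, y1, z1, ...], a point in 3N-dimensional space.
X = meshes.reshape(bodies, 3 * N)

# Vertex i of body b can be read back from the flat vector.
b, i = 2, 123
vertex = X[b, 3 * i: 3 * i + 3]
```

This only works because every model shares the same vertex count and ordering, which is why the text requires the same number of vertices for the same kind of object.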

Whereas a typical machine learning engine requires an enormous amount of teacher data, according to the present invention the number of bodies in the teacher data group may be smaller than the number of vertices of the three-dimensional model. That is, the number of human bodies in the teacher data, 1,000, is smaller than the vector dimension of the three-dimensional model, 45,000. According to the present invention, there is no need to prepare as many bodies as the vector dimension of the three-dimensional model, and sufficient accuracy can still be maintained.

FIG. 4 is an explanatory diagram showing the statistical shape space in the statistical learning engine.
FIG. 5 is simple code representing principal component analysis in the statistical learning engine.

Specifically, the statistical learning engine 100 may be based on principal component analysis (Principal Component Analysis).
Principal component analysis derives, from the 1,000 points in the correlated 3N-dimensional space, a small number (for example, 30) of principal components (component variables) that are mutually uncorrelated and best represent the overall variation. The first principal component is chosen to maximize variance, and each subsequent principal component is chosen to maximize variance under the constraint that it is uncorrelated with the principal components already determined. Maximizing the variance of the principal components gives them as much power as possible to explain changes in the observed values. The principal axes that give the principal components form an orthogonal basis for the group of 1,000 points in the 3N-dimensional space. The orthogonality of the principal axes follows from the fact that they are eigenvectors of the covariance matrix, which is a real symmetric matrix.

統計学習エンジン１００は、3N次元空間に対して、主成分分析に基づく成分変数を次元数とする統計形状空間（例えば30次元）に射影させる統計学習モデルを構築する。
本発明によれば、3N(＝45,000)次元空間における各３次元モデルを、例えば30次元（成分変数）空間に射影する。主成分を与える変換は、観測値の集合からなる行列の特異値分解で表され、3N次元空間の1000点の群からなる矩形行列Ｘの特異値分解は、以下の式で表される。
Ｘ＝Ｕ*Σ*Ｖ^T
Ｘ：3N次元空間の1000点からなる行列（1000行×3N列）
Ｕ：n(1000)×n(1000)の正方行列（n次元単位ベクトルの直交行列）
Σ：n(1000)×p(3N)の矩形対角行列（対角成分は、Ｘの特異値）
Ｖ：p(3N)×p(3N)の正方行列（p次元単位ベクトルの直交行列）
ここで、Ｖの最初の30列からなる行列をＶと改める。そして、その行列Ｖによる線形変換はＸの主成分を与える。
Ｖ：3N次元空間->統計形状(30次元)空間への変換を表す行列
Ｖ^-1：統計形状(30次元)空間->3N次元空間への変換を表す行列
尚、行列の上付き添え字-1は逆行列を示す記号ではなく、行列が定めるベクトルの変換に対して、その逆変換を意味する抽象的な記号として用いている。ここでは、Ｖ^-1は、Ｖの転置Ｖ^Tと等しい。 The statistical learning engine 100 constructs a statistical learning model for projecting a 3N-dimensional space onto a statistical shape space (for example, 30 dimensions) having component variables based on principal component analysis as dimensions.
According to the present invention, each three-dimensional model in a 3N (= 45,000) dimensional space is projected onto, for example, a 30-dimensional (component variable) space. The transformation giving the principal component is represented by singular value decomposition of a matrix composed of a set of observation values, and the singular value decomposition of a rectangular matrix X composed of a group of 1000 points in a 3N-dimensional space is represented by the following equation.
X = U * Σ * V ^T
X: Matrix consisting of 1000 points in 3N-dimensional space (1000 rows x 3N columns)
U: n (1000) × n (1000) square matrix (orthogonal matrix whose columns are n-dimensional unit vectors)
Σ: n (1000) × p (3N) rectangular diagonal matrix (the diagonal components are the singular values of X)
V: p (3N) × p (3N) square matrix (orthogonal matrix whose columns are p-dimensional unit vectors)
Here, the matrix consisting of the first 30 columns of V is redefined as V. The linear transformation by this matrix V gives the principal components of X.
V: matrix representing the transformation from the 3N-dimensional space to the statistical shape (30-dimensional) space
V^-1: matrix representing the transformation from the statistical shape (30-dimensional) space to the 3N-dimensional space
Note that the superscript -1 here is not the symbol for an inverse matrix; it is used as an abstract symbol for the inverse of the vector transformation defined by the matrix. Here, V^-1 is equal to the transpose V^T of V.

図４からも明らかなとおり、行列Ｖ又はＶ^-1による線形変換によって、3N次元空間と統計形状空間との間で、３次元モデルの１体毎に対応付けることができる。
ｓ＝ｘ*Ｖ
ｘ＝ｓ*Ｖ^-1
ｓ：統計形状空間のベクトル
ｘ：3N次元空間のベクトル
Ｖ：統計学習モデル As is clear from FIG. 4, the linear transformation by the matrix V or V^-1 associates each three-dimensional model in the 3N-dimensional space with a point in the statistical shape space, and vice versa.
s = x * V
x = s * V ^-1
s: vector of statistical shape space
x: vector in 3N-dimensional space
V: Statistical learning model
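The row-vector convention s = x * V and x = s * V^T can be checked on a toy example. The following is a minimal Python sketch, not the patent's implementation: the 3-dimensional input space, the 2-dimensional shape space, and the hand-picked orthonormal basis `V` are assumptions standing in for the 3N-dimensional space, the 30-dimensional space, and the matrix obtained from the singular value decomposition. The helper names `vec_mat` and `transpose` are hypothetical.

```python
import math

def vec_mat(x, M):
    """Row vector x (length r) times matrix M (r x c) -> row vector (length c)."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

# Assumed toy basis: two orthonormal columns in a 3-dimensional space.
r = 1.0 / math.sqrt(2.0)
V = [[r, 0.0],
     [r, 0.0],
     [0.0, 1.0]]

x = [2.0, 4.0, 5.0]                # one "model" in the original space
s = vec_mat(x, V)                  # encode: s = x * V
x_hat = vec_mat(s, transpose(V))   # decode: x = s * V^T (an approximation of x)
s_again = vec_mat(x_hat, V)        # re-encoding the decoded model gives s back
```

Because the columns of `V` are orthonormal, V^T * V is the identity on the shape space, so decoding followed by encoding returns the same component variables. The decoded `x_hat` is only the projection of `x` onto the retained axes, which is why the text treats V^-1 as an abstract inverse transformation rather than a true inverse matrix.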

統計学習エンジン１００は、オートエンコーダ(AutoEncoder)に基づくものであってもよい。
オートエンコーダは、ニューラルネットワークの一種で、情報量を小さくした特徴表現を実現する（例えば非特許文献３参照）。具体的には、入力データの次元数よりも、隠れ層の次元を圧縮したものである。入力データを、ニューラルネットワークを通して圧縮し、出力時には元のサイズに戻す。このとき、ニューラルネットワークは、入力データの抽象的概念（特徴量）を抽出する。
オートエンコーダも、主成分分析と同様に、相関のある3N次元空間の1,000点から、互いに無相関で全体のばらつきを最もよく表す30次元の成分変数を導出する。 The statistical learning engine 100 may be based on an auto encoder (AutoEncoder).
An autoencoder is a type of neural network that realizes a feature representation with a reduced amount of information (see, for example, Non-Patent Document 3). Specifically, the hidden layer has fewer dimensions than the input data. The input data is compressed through the neural network and restored to its original size at the output. In doing so, the neural network extracts abstract concepts (features) of the input data.
Like principal component analysis, the autoencoder derives 30-dimensional component variables that are mutually uncorrelated and best represent the overall variation from the 1,000 correlated points in the 3N-dimensional space.

［第１の相関学習エンジン１１０］
第１の相関学習エンジン１１０は、教師データ群の複数体の３次元モデルについて、当該３次元モデルのデプス画像と、次元数ｍの成分変数との第１の相関学習モデルを構築する。 [First Correlation Learning Engine 110]
The first correlation learning engine 110 constructs a first correlation learning model between a depth image of the three-dimensional model and a component variable having the number of dimensions m for a plurality of three-dimensional models of the teacher data group.

第１の相関学習エンジン１１０は、畳み込みニューラルネットワーク(Convolutional neural network)に基づくものであってもよい。これは、順伝播型の深層学習の一種であり、特に説明変数から目的変数を予測するべく、回帰分析として学習させることができる。 The first correlation learning engine 110 may be based on a convolutional neural network. This is a type of forward-propagation type deep learning, and can be learned as a regression analysis in order to predict a target variable from an explanatory variable.

図６は、第１の相関学習エンジンにおける畳み込みニューラルネットワークを表す簡易なコードである。 FIG. 6 is a simple code representing a convolutional neural network in the first correlation learning engine.

畳み込みニューラルネットワークは、以下のように一方向に学習する。
説明変数：３次元モデルのデプス画像（グレースケール画像）
目的変数：統計学習エンジン１００から出力された次元数ｍの成分変数
これによって、運用段階では、説明変数としてのデプス画像を入力することによって、目的変数としての次元数ｍの成分変数を出力することができる。 The convolutional neural network learns in one direction as follows.
Explanatory variables: Depth image (grayscale image) of 3D model
Objective variable: component variables of dimension m output from the statistical learning engine 100
As a result, in the operation stage, inputting a depth image as the explanatory variable yields the component variables of dimension m as the objective variable.

［第２の相関学習エンジン１２０］
第２の相関学習エンジン１２０は、教師データ群の複数体の３次元モデルについて、少なくとも１つ以上の採寸箇所を含む次元数ｎの組成値と、次元数ｍの成分変数との第２の相関学習モデルを構築する。 [Second correlation learning engine 120]
The second correlation learning engine 120 constructs, for the plurality of three-dimensional models of the teacher data group, a second correlation learning model between composition values of dimension n, including at least one measurement point, and the component variables of dimension m.

図７は、組成値から導出した組成値空間を表す説明図である。 FIG. 7 is an explanatory diagram illustrating a composition value space derived from the composition values.

教師データにおける３次元モデルの人体毎に、次元数ｎの組成値が対応付けられている。
組成値は、３次元モデルと紐付けられているが、その３次元モデル自体から導出可能な１つ以上の採寸値（例えば身長や胸囲、腹囲など）を含むものであってもよい。但し、幾何学に基づく採寸箇所は、１カ所以上であることが好ましい。身長だけでもよいし、身長＋腹囲であってもよいし、身長＋腹囲＋胸囲であってもよい。
ここで、複数の組成値をその要素値とした組成値空間を導出することができる。例えば10個の組成値が付与されている場合、組成値空間は10次元となる。 The composition value of the dimension number n is associated with each human body of the three-dimensional model in the teacher data.
The composition value is linked to the three-dimensional model, but may include one or more measurement values (for example, height, chest measurement, abdominal measurement, and the like) derivable from the three-dimensional model itself. However, it is preferable that the number of measuring points based on the geometry is one or more. Height alone may be used, height + abdominal circumference may be used, or height + abdominal circumference + chest circumference may be used.
Here, a composition value space in which a plurality of composition values are used as element values can be derived. For example, when ten composition values are given, the composition value space has ten dimensions.

第２の相関学習エンジン１２０は、最小二乗法又は多層パーセプトロンに基づくものであってもよい。 The second correlation learning engine 120 may be based on least squares or multi-layer perceptron.

図８は、統計形状空間と組成値空間との線形変換を表す説明図である。
図９は、第２の相関学習エンジンにおける統計形状空間と組成値空間との間の線形変換を表す簡易なコードである。 FIG. 8 is an explanatory diagram illustrating a linear transformation between the statistical shape space and the composition value space.
FIG. 9 is a simple code showing a linear transformation between the statistical shape space and the composition value space in the second correlation learning engine.

＜最小二乗法＞
第２の相関学習エンジン１２０は、最小二乗法に基づくものであってもよい。
「最小二乗法(least squares method)」とは、複数の多次元ベクトル（データの組）から線形モデルで近似する際に、残差の二乗和が最小となる最も確からしい線形モデルを決定することをいう。 <Least squares method>
The second correlation learning engine 120 may be based on the least squares method.
The "least squares method" determines, when approximating multiple multidimensional vectors (data sets) with a linear model, the most probable linear model, namely the one that minimizes the sum of squared residuals.

図８からも明らかなとおり、行列Ａ又はＡ^-1による線形変換によって、統計形状空間と組成値空間との間で、３次元モデルの１体毎に対応付けることができる。
ｓ＝ｄ*Ａ
ｄ＝ｓ*Ａ^-1
Ａ＝(Ｄ^T*Ｄ)^-1*Ｄ^T*Ｓ（||Ｄ*Ａ−Ｓ||を最小化するＡを導出する）
ｓ：統計形状空間のベクトル
ｄ：組成値空間のベクトル
Ｓ：統計形状空間のベクトルの組
Ｄ：組成値空間のベクトルの組
Ａ：相関学習モデル As is clear from FIG. 8, it is possible to make correspondence between the statistical shape space and the composition value space for each one of the three-dimensional models by the linear transformation using the matrix A or A ⁻¹ .
s = d * A
d = s * A ^-1
A = (D^T * D)^-1 * D^T * S (deriving the A that minimizes ||D * A − S||)
s: vector of statistical shape space
d: Composition value space vector
S: vector set of statistical shape space
D: Set of vectors in composition value space
A: Correlation learning model
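The normal-equation formula A = (D^T * D)^-1 * D^T * S can be illustrated with a short sketch. This is a minimal pure-Python example under stated assumptions, not the code of FIG. 9: the toy data (4 bodies, n = 2, m = 2), the matrix `A_true`, and the helper names `transpose`, `matmul`, `solve`, and `fit_least_squares` are all hypothetical. It solves (D^T * D) * A = D^T * S by Gauss-Jordan elimination instead of forming the inverse explicitly.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(M, B):
    """Solve M * X = B for X by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def fit_least_squares(D, S):
    """Return the A minimizing ||D*A - S||, i.e. A = (D^T D)^-1 D^T S."""
    Dt = transpose(D)
    return solve(matmul(Dt, D), matmul(Dt, S))

# Assumed toy data: rows of D are composition vectors d, rows of S are shape vectors s.
D = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
A_true = [[2.0, -1.0], [0.5, 3.0]]
S = matmul(D, A_true)            # shape vectors generated exactly by A_true
A = fit_least_squares(D, S)      # recovered correlation learning model
s = matmul([[1.5, 2.0]], A)[0]   # encode one body: s = d * A
```

Since the toy `S` is generated without noise, the fitted `A` reproduces `A_true` up to rounding. With real teacher data the fit would only minimize the residual, and D^T * D must be non-singular for the formula to apply.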

＜多層パーセプトロン＞
第２の相関学習エンジン１２０は、多層パーセプトロンに基づくものであってもよい。
「多層パーセプトロン(Multilayer perceptron)」とは、順伝播型ニューラルネットワークであって、誤差逆伝播法と称される教師あり学習を用いており、これは、線形パーセプトロンにおける最小二乗法アルゴリズムの一般化である。 <Multilayer Perceptron>
The second correlation learning engine 120 may be based on a multilayer perceptron.
A "multilayer perceptron" is a feedforward neural network trained by a supervised learning technique called error backpropagation, which is a generalization of the least squares algorithm used in the linear perceptron.

図１０は、本発明における運用段階のエンコード側の機能構成図である。 FIG. 10 is a functional block diagram of the encoding side at the operation stage in the present invention.

図１０によれば、装置１は、第１の相関学習エンジン１１０を用いる第１のエンコーダ１１１と、第２の相関学習エンジン１２０を用いる第２のエンコーダ１２１と、欠損値推定部１２２と、シェアコード出力部１３とを更に有する。これら機能構成部は、装置に搭載されたコンピュータを機能させるプログラムを実行することによって実現できる。また、これら機能構成部の処理の流れは、装置のエンコード方法としても理解できる。
尚、第１のエンコーダ１１１と、第２のエンコーダ１２１及び欠損値推定部１２２とは、装置１の用途に応じて、いずれか一方のみを備えたものであってもよい。 According to FIG. 10, the device 1 further includes a first encoder 111 that uses the first correlation learning engine 110, a second encoder 121 that uses the second correlation learning engine 120, a missing value estimating unit 122, and a share code output unit 13. These functional components can be realized by executing a program that causes a computer mounted on the device to function. The flow of processing of these functional components can also be understood as an encoding method of the device.
Depending on the use of the device 1, the device may include only one of (a) the first encoder 111 and (b) the second encoder 121 together with the missing value estimating unit 122.

［第１のエンコーダ１１１］
第１のエンコーダ１１１は、第１の相関学習エンジン１１０を用いて、対象データとしてのデプス画像から次元数ｍの成分変数へエンコードする。エンコードされた次元数ｍの成分変数は、デプス画像を認識できない秘匿性を持つ。そのために、個人情報としてのデプス画像が守秘情報である場合に適する。
エンコードされた次元数ｍの成分変数は、シェアコード出力部１３へ出力される。 [First encoder 111]
The first encoder 111 uses the first correlation learning engine 110 to encode a depth image, as target data, into component variables of dimension m. The encoded component variables of dimension m are confidential in the sense that the depth image cannot be recognized from them. This makes the encoding suitable when the depth image, as personal information, must be kept confidential.
The encoded component variable having the number of dimensions m is output to the share code output unit 13.

第１のエンコーダ１１１は、入力された対象データのデプス画像に対して、＜大津の判別分析機能＞を適用したものであってもよい。 The first encoder 111 may apply the <Otsu's discriminant analysis function> to the input depth image of the target data.

＜大津の判別分析機能＞
Ｓ１：物体抽出段階
物体抽出のために、以下の３つのステップを要する。
（Ｓ１１）人体と背景との両方を含む対象デプス画像を入力する。
（Ｓ１２）２値化処理によって２つのクラス（物体部分と非物体部分）に分離した２値画像を生成する。この２値化処理に、例えば「大津の判別分析法(discriminant analysis method)」を適用した場合、分離度(separation metrics)という値が最大となる閾値を求め、自動的に２値化することができる。分離度は、クラス間分散(between-class variance)とクラス内分散(within-class variance)との比で求める。
（Ｓ１３）Ｓ１１で入力された対象デプス画像を、Ｓ１２で算出された２値化画像によってマスク処理して、人体のみのデプス画像を抽出する。 <Otsu's discriminant analysis function>
S1: Object extraction stage
The following three steps are required for object extraction.
(S11) A target depth image including both a human body and a background is input.
(S12) A binary image separated into two classes (object part and non-object part) is generated by binarization. When, for example, Otsu's discriminant analysis method is applied to this binarization, the threshold that maximizes a value called the degree of separation (separation metrics) is found, and the binarization is performed automatically. The degree of separation is the ratio of the between-class variance to the within-class variance.
(S13) The target depth image input in S11 is subjected to mask processing with the binary image calculated in S12 to extract a depth image of only a human body.
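Steps S12 and S13 can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the depth image is a flat list of 8-bit values, the body is nearer to the camera and therefore has the smaller values, and masked-out background pixels are set to 0. The function names `otsu_threshold` and `mask_body` are hypothetical. The sketch maximizes the between-class variance, which selects the same threshold as maximizing the ratio of between-class to within-class variance because their sum (the total variance) does not depend on the threshold.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Class 0 holds values <= t, class 1 holds values > t.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0
    sum0 = 0    # sum of values in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no separation is defined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (factor 1/total^2 omitted).
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def mask_body(depth, threshold):
    """S13: keep pixels of the near class, set all other pixels to 0."""
    return [p if p <= threshold else 0 for p in depth]

# Assumed toy depth image: four near (body) pixels and four far (background) pixels.
depth = [40, 42, 45, 41, 200, 210, 205, 198]
threshold = otsu_threshold(depth)
body = mask_body(depth, threshold)
```

In a real depth camera pipeline, invalid or zero-depth pixels would need separate handling before thresholding; that detail is outside what the text specifies.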

Ｓ２：正規化段階
正規化のために、以下の５つのステップを要する。
（Ｓ２１）物体抽出段階によって抽出したデプス画像の画像平面上のピクセル座標p=(x,y,z)から、３次元座標P=(X,Y,Z)を計算する。
各点P＝(X,Y,Z)について、X＝x*z/f, Y＝y*z/f, Z＝(Max-Min)*z+Min
＜既知のカメラ仕様＞
焦点距離(物理) F (mm)
焦点距離(ピクセル) f＝(fx,fy)=(F*sx,F*sy) (pixel)
ピクセルサイズ s＝(sx,sy) (pixel/mm)
深度測定距離 Min,Max (mm)
（Ｓ２２）物体のデプス画像からバウンディングボックスを特定し、その重心を決定する。
（Ｓ２３）Ｓ２２によって決定された重心に基づいて、デプス画像を所定の重心に平行移動する。
（Ｓ２４）Ｓ２３によって平行移動されたデプス画像に対して、画像平面上のピクセル座標を計算する。
各点p＝(x,y,z)について、x＝X*f/Z, y＝Y*f/Z, z＝(Z-Min)/(Max-Min)
（Ｓ２５）Ｓ２４によって計算されたピクセル座標は必ずしも整数値ではないので、補完を使って最終的なピクセル座標を計算する。
（Ｓ２６）Ｓ５によって補完されたデプス画像を、対象データとして、第１の相関学習エンジン１１０へ入力する。 S2: Normalization stage
The following six steps (S21 to S26) are required for normalization.
(S21) Three-dimensional coordinates P = (X, Y, Z) are calculated from pixel coordinates p = (x, y, z) on the image plane of the depth image extracted in the object extraction step.
For each point P = (X, Y, Z), X = x * z / f, Y = y * z / f, Z = (Max-Min) * z + Min
<Known camera specifications>
Focal length (physical) F (mm)
Focal length (pixel) f = (fx, fy) = (F * sx, F * sy) (pixel)
Pixel size s = (sx, sy) (pixel / mm)
Depth measurement distance Min, Max (mm)
(S22) A bounding box is specified from the depth image of the object, and its center of gravity is determined.
(S23) The depth image is translated to a predetermined center of gravity based on the center of gravity determined in S22.
(S24) The pixel coordinates on the image plane are calculated for the depth image translated in S23.
For each point p = (x, y, z), x = X * f / Z, y = Y * f / Z, z = (Z-Min) / (Max-Min)
(S25) Since the pixel coordinates calculated in S24 are not necessarily integer values, final pixel coordinates are calculated using interpolation.
(S26) The depth image interpolated in S25 is input to the first correlation learning engine 110 as target data.
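The coordinate conversions in S21, S23, and S24 can be sketched as below. This is a minimal pure-Python illustration, not the patent's implementation, and it rests on several assumptions: pixel coordinates x and y are measured from the principal point, z is a depth normalized to the range 0 to 1, and the back-projection uses the metric depth Z (X = x * Z / fx) so that S21 and S24 are exact inverses of each other, whereas the text writes X = x * z / f. The function names and the camera values (fx = fy = 500 pixels, Min = 500 mm, Max = 2500 mm) are hypothetical.

```python
def to_metric_depth(z, d_min, d_max):
    """Normalized depth z in [0, 1] -> metric depth Z in mm: Z = (Max - Min) * z + Min."""
    return (d_max - d_min) * z + d_min

def back_project(x, y, z, fx, fy, d_min, d_max):
    """S21: pixel coordinates p = (x, y, z) -> three-dimensional coordinates P = (X, Y, Z)."""
    Z = to_metric_depth(z, d_min, d_max)
    return (x * Z / fx, y * Z / fy, Z)

def translate(P, source_centroid, target_centroid):
    """S23: move a point so that the source centroid lands on the target centroid."""
    return tuple(p - s + t for p, s, t in zip(P, source_centroid, target_centroid))

def project(X, Y, Z, fx, fy, d_min, d_max):
    """S24: three-dimensional coordinates P -> pixel coordinates p = (x, y, z)."""
    return (X * fx / Z, Y * fy / Z, (Z - d_min) / (d_max - d_min))

# Assumed camera parameters.
FX, FY, D_MIN, D_MAX = 500.0, 500.0, 500.0, 2500.0

P = back_project(100.0, -50.0, 0.25, FX, FY, D_MIN, D_MAX)
round_trip = project(*P, FX, FY, D_MIN, D_MAX)
moved = translate(P, source_centroid=P, target_centroid=(0.0, 0.0, 1500.0))
p_new = project(*moved, FX, FY, D_MIN, D_MAX)
```

The projected coordinates are generally not integers, which is why S25 interpolates onto the pixel grid; the interpolation method itself is not specified in the text and is not shown here.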

［第２のエンコーダ１２１］
第２のエンコーダ１２１は、第２の相関学習エンジン１２０を用いて、対象データとしての１体の次元数ｎの組成値から次元数ｍの成分変数へエンコードする。この場合も、エンコードされた次元数ｍの成分変数は、組成値を認識できない秘匿性を持つ。
エンコードされた次元数ｍの成分変数は、シェアコード出力部１３へ出力される。 [Second encoder 121]
The second encoder 121 uses the second correlation learning engine 120 to encode the composition values of dimension n for one body, as target data, into component variables of dimension m. In this case as well, the encoded component variables of dimension m are confidential in the sense that the composition values cannot be recognized from them.
The encoded component variable having the number of dimensions m is output to the share code output unit 13.

［欠損値推定部１２２］
欠損値推定部１２２は、オプション的な他の実施形態として、入力された対象データの組成値に対して、欠損値を推定する。 [Missing value estimation unit 122]
As another optional embodiment, the missing value estimating unit 122 estimates a missing value for the composition value of the input target data.

図１１は、本発明における欠損値を推定する組成値空間を表す説明図である。
図１２は、本発明における欠損値推定を表す簡易なコードである。 FIG. 11 is an explanatory diagram illustrating a composition value space for estimating a missing value according to the present invention.
FIG. 12 is a simple code showing missing value estimation in the present invention.

組成値空間は、教師データ群の３次元モデルに対応付けられた、固定の次元数ｎ（＝10）の組成値を表すものである。
ここで、欠損値推定部１２２は、対象データとして１体の次元数ｎの組成値について、ｋ（＜ｎ）個の組成値のみが決定され、その他のｎ−ｋ個の組成値が欠損していてもよい。即ち、本発明によれば、教師データ群によって例えば10次元の組成値空間から第２の相関学習モデルを構築したとしても、例えばｋ＝3個の組成値のみを入力することによって、次元数ｍの成分変数を推定することができる。 The composition value space represents a composition value having a fixed number of dimensions n (= 10) associated with the three-dimensional model of the teacher data group.
Here, the missing value estimating unit 122 accepts, as target data, composition values of dimension n for one body in which only k (< n) composition values are determined and the other n−k composition values are missing. That is, according to the present invention, even when the second correlation learning model has been constructed from, for example, a 10-dimensional composition value space using the teacher data group, the component variables of dimension m can be estimated by inputting only, for example, k = 3 composition values.

図１１によれば、組成値（10次元）空間上に、超楕円体の等値面が表されている。超楕円体とは、楕円を次元数ｎ（＝10）次元へ拡張したような図形をいう。等値面とは、その次元（＝10）上に描かれる等高線図をいう。ここで、分散共分散行列Ｃ＝Ｄ^T*Ｄは既知であるとする。
超楕円体を表す二次形式（ｘ：列ベクトル）
ｆ(ｘ)＝ｘ^T*Ｃ^-1*ｘ
Ｃ^-1：対称行列
※実際には分散共分散行列はＣ＝Ｄ^T*Ｄ／教師データ群数
※^-1は、逆行列を示す According to FIG. 11, the isosurface of the hyperellipsoid is represented on the composition value (10-dimensional) space. The hyperellipsoid refers to a figure obtained by expanding an ellipse into n (= 10) dimensions. The isosurface refers to a contour map drawn on the dimension (= 10). Here, it is assumed that the variance-covariance matrix C = D ^T * D is known.
Quadratic form representing hyperellipsoid (x: column vector)
f (x) = x ^T * C ^-1 * x
C ^-1 : symmetric matrix
* Actually, the variance-covariance matrix is C = D ^T * D / the number of teacher data groups.
* ^-1 indicates the inverse matrix

本発明によれば、ｋ個の組成値を束縛条件として、最適化された他のｎ−ｋ個の組成値を含む次元数ｍの成分変数を算出する。これには、ラグランジュの未定乗数法(method of Lagrange multiplier)を用いる。 According to the present invention, the component variables of dimension m, including the other n−k optimized composition values, are calculated using the k composition values as constraints. The method of Lagrange multipliers is used for this purpose.

ラグランジュの未定乗数法とは、束縛条件のもとで最適化する解析方法であって、いくつかの変数に対して、いくつかの関数の値を固定するという束縛条件のもとで、別のある１つの関数の極値を求めるという問題を考える。各束縛条件に対して、定数（未定乗数、Lagrange multiplier）を用意し、これらを係数とする線形結合を新しい関数（未定乗数も新たな変数とする）として考えることで、束縛問題を普通の極値問題として解く。 The method of Lagrange multipliers is an analysis method for optimization under constraints. It considers the problem of finding an extremum of one function under the constraint that the values of several other functions of the variables are fixed. A constant (an undetermined multiplier, or Lagrange multiplier) is prepared for each constraint, and a linear combination with these constants as coefficients is treated as a new function (in which the undetermined multipliers are also new variables), so that the constrained problem is solved as an ordinary extremum problem.

制約条件ｇ_j(ｘ₁,・・・,ｘ_n)＝０（j＝1,・・・,k）の下で、関数ｆ(ｘ₁,・・・,ｘ_n)が極値をとる点について、
Ｆ(ｘ₁,・・・,ｘ_n,λ₁,・・・,λ_k)
＝ｆ(ｘ₁,・・・,ｘ_n)＋Σλ_jｇ_j(ｘ₁,・・・,ｘ_n)
とすることによって、以下の式を満たす。
dＦ／dｘ_i＝０（i＝1,・・・,n）
dＦ／dλ_j＝０（j＝1,・・・,k） For a point at which the function f(x_1, ..., x_n) takes an extreme value under the constraints g_j(x_1, ..., x_n) = 0 (j = 1, ..., k), defining
F(x_1, ..., x_n, λ_1, ..., λ_k)
= f(x_1, ..., x_n) + Σ λ_j * g_j(x_1, ..., x_n)
the following equations are satisfied.
dF/dx_i = 0 (i = 1, ..., n)
dF/dλ_j = 0 (j = 1, ..., k)

組成値の欠損値推定の場合に、k個の組成値が与えられた場合、制約条件ｇ_j(ｘ)＝０（j＝1,・・・,k）は、10次元空間上のアフィン超平面を表す一次方程式であり、以下の式で表される。
アフィン超平面を表す一次方程式
ｇ_j(ｘ)＝ｎ_j ^T*(ｘ−ｐ_j)＝０
n：超平面の法線ベクトル
p：超平面上の点
特に、それぞれの超平面は基底に直交する（nの方向が基底方向に一致する）ために、以下のようになる。
ｇ_j(ｘ)＝ｘ_i−ｙ_j＝０
ｙ_j：j番目の組成値
ｘ_i：対応するｘの要素
制約条件の下で、関数ｆ(ｘ)の最小値を求めることは、与えられた組成値の下で、平均に最も近い体形を求めることとなる。 When k composition values are given in missing-value estimation, each constraint g_j(x) = 0 (j = 1, ..., k) is a linear equation representing an affine hyperplane in the 10-dimensional space, expressed as follows.
Linear equation representing an affine hyperplane
g_j(x) = n_j^T * (x − p_j) = 0
n: normal vector of the hyperplane
p: a point on the hyperplane
In particular, since each hyperplane is orthogonal to a basis vector (the direction of n coincides with a basis direction), this becomes:
g_j(x) = x_i − y_j = 0
y_j: j-th composition value
x_i: corresponding element of x
Finding the minimum of the function f(x) under these constraints amounts to finding the body shape closest to the average for the given composition values.

10次元空間の場合、具体的には、以下のように表される。
ｘ：10次元列ベクトル
ｙ：k個の組成値を含む10次元列ベクトル (k個以外の組成値の値は任意)
λ：ラグランジュ乗数を要素とするk次元列ベクトル
Ｏ：k行10列の行列各行は与えられた組成値に応じたone-hot行ベクトル
Ｃ：分散共分散行列
ｆ(ｘ)＝1/2*ｘ^T*Ｃ^-1*ｘ
ｇ(ｘ)＝Ｏ*(ｙ−ｘ)
Ｆ(ｘ)＝ｆ(ｘ)＋λ^T*ｇ(ｘ)
dＦ／dｘ＝Ｃ^-1*ｘ−Ｏ^T*λ＝０（１）
dＦ／dλ＝Ｏ*(ｙ−ｘ)＝０（２）
（１）より、ｘ＝Ｃ*Ｏ^T*λ （３）
（３）を（２）に代入
Ｏ*ｙ−Ｏ*Ｃ*Ｏ^T*λ＝０
λ＝(Ｏ*Ｃ*Ｏ^T)^-1*Ｏ*ｙ
λを（３）に代入
ｘ＝Ｃ*Ｏ^T*(Ｏ*Ｃ*Ｏ^T)^-1*Ｏ*ｙ In the case of a 10-dimensional space, this is expressed specifically as follows.
x: 10-dimensional column vector
y: 10-dimensional column vector containing the k composition values (the values of the other elements are arbitrary)
λ: k-dimensional column vector whose elements are the Lagrange multipliers
O: matrix of k rows and 10 columns; each row is a one-hot row vector corresponding to a given composition value
C: variance-covariance matrix
f(x) = 1/2 * x^T * C^-1 * x
g(x) = O * (y − x)
F(x) = f(x) + λ^T * g(x)
dF/dx = C^-1 * x − O^T * λ = 0 (1)
dF/dλ = O * (y − x) = 0 (2)
From (1), x = C * O^T * λ (3)
Substituting (3) into (2):
O * y − O * C * O^T * λ = 0
λ = (O * C * O^T)^-1 * O * y
Substituting λ into (3):
x = C * O^T * (O * C * O^T)^-1 * O * y
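The closed-form solution x = C * O^T * (O * C * O^T)^-1 * O * y can be evaluated without building O explicitly, because O is made of one-hot rows: O * C * O^T is the sub-matrix of C at the known indices, C * O^T is the set of columns of C at those indices, and O * y is the list of known values. The following is a minimal pure-Python sketch, not the code of FIG. 12. It assumes the composition values are expressed as deviations from the teacher-data mean (so that the unconstrained minimum of f is the average body), uses a 3-dimensional toy covariance matrix instead of 10 dimensions, and the names `solve_vec` and `estimate_missing` are hypothetical.

```python
def solve_vec(M, b):
    """Solve M * v = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def estimate_missing(C, known):
    """Return the full vector x given a dict {index: value} of known elements.

    Implements x = C O^T (O C O^T)^-1 O y with one-hot selection rows in O.
    """
    idx = sorted(known)
    sub = [[C[i][j] for j in idx] for i in idx]   # O * C * O^T
    rhs = [known[i] for i in idx]                 # O * y
    lam = solve_vec(sub, rhs)                     # Lagrange multipliers
    return [sum(C[r][i] * l for i, l in zip(idx, lam)) for r in range(len(C))]

# Assumed toy variance-covariance matrix (symmetric, positive definite).
C = [[4.0, 2.0, 0.0],
     [2.0, 9.0, 3.0],
     [0.0, 3.0, 16.0]]

x_one = estimate_missing(C, {0: 2.0})           # only element 0 is known
x_two = estimate_missing(C, {0: 2.0, 2: 4.0})   # elements 0 and 2 are known
```

In both calls the known elements are reproduced exactly in the output, and the missing elements are filled in according to their covariance with the known ones.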

［シェアコード出力部１３］
シェアコード出力部１３は、エンコードされた次元数ｍの成分変数を、シェア(Share)コードとして出力する。
本発明によれば、３次元モデルを、成分変数（４バイト）で３０次元とした場合、１２０バイトで表すことができる。
このとき、当該シェアコードを、ＱＲ(Quick Response、登録商標)、ＲＦＩＤ(Radio Frequency IDentifier)又はCookieに埋め込むものであってもよい。 [Share code output unit 13]
The share code output unit 13 outputs the encoded component variable having the number of dimensions m as a share code.
According to the present invention, when a three-dimensional model is 30-dimensional with component variables (4 bytes), it can be represented by 120 bytes.
At this time, the share code may be embedded in QR (Quick Response, registered trademark), RFID (Radio Frequency IDentifier), or Cookie.
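The 120-byte figure follows directly from 30 component variables of 4 bytes each. The sketch below shows one way such a share code could be serialized; it is an assumption-laden illustration, not the patent's format. It assumes little-endian IEEE 754 single-precision floats, and the Base64 text form is just one hypothetical way of carrying the bytes in a text-only container such as a cookie. The function names are hypothetical.

```python
import base64
import struct

M = 30                      # number of component variables (dimension m)
FORMAT = "<%df" % M         # assumed layout: 30 little-endian 4-byte floats

def encode_share_code(components):
    """Pack m component variables into a fixed-size byte string."""
    if len(components) != M:
        raise ValueError("expected %d component variables" % M)
    return struct.pack(FORMAT, *components)

def decode_share_code(code):
    """Unpack a share code back into a list of m component variables."""
    return list(struct.unpack(FORMAT, code))

# Toy values chosen to be exactly representable in single precision.
components = [0.5 * i for i in range(M)]
code = encode_share_code(components)
text = base64.urlsafe_b64encode(code).decode("ascii")   # text-safe form
restored = decode_share_code(base64.urlsafe_b64decode(text))
```

Arbitrary values would be rounded to single precision when packed, so a round trip generally reproduces the component variables only to about seven significant digits.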

ＱＲコードは、マトリックス型２次元コードであり、バイナリで最大2,953バイトを記述することができる。一般的なスマートフォンでは、ＱＲコードをディスプレイに表示することもできるし、そのＱＲコードをカメラで読み取ることができる。 The QR code is a matrix type two-dimensional code, and can describe a maximum of 2,953 bytes in binary. In a general smartphone, a QR code can be displayed on a display, and the QR code can be read by a camera.

また、タグとしては、ＲＦＩＤとは、ＲＦタグに記述された情報を、電磁界や電波を用いた近距離無線通信によって通信する技術をいう。例えばFelica（登録商標）であって、電子マネーや乗車カードに用いられている。
本発明によれば、例えば３次元モデルの成分変数を、ＲＦタグに記述しておくだけで、リーダによって瞬時に読み取らせることができる。ＲＦタグから成分変数を読み取ったリーダは、その成分変数に対応した３次元モデルを瞬時にディスプレイに表示することもきる。
このユーザインタフェースによれば、組成値に対応する３次元モデルと、その３次元モデルの成分変数が記述されたＱＲコードとを、一見することができる。特に、ＱＲコードを、カメラによって読み取らせるだけで、３次元モデルを共有することできる。 As a tag, RFID refers to a technology for communicating information described in an RF tag by short-range wireless communication using an electromagnetic field or a radio wave. For example, Felica (registered trademark) is used for electronic money and boarding cards.
According to the present invention, for example, a component variable of a three-dimensional model can be instantaneously read by a reader simply by describing it in an RF tag. The reader that reads the component variable from the RF tag can instantaneously display the three-dimensional model corresponding to the component variable on the display.
According to this user interface, a three-dimensional model corresponding to the composition value and a QR code in which component variables of the three-dimensional model are described can be seen at a glance. In particular, a three-dimensional model can be shared simply by reading a QR code with a camera.

図１３は、本発明における運用段階のデコード側の機能構成図である。 FIG. 13 is a functional configuration diagram on the decoding side at the operation stage in the present invention.

図１３によれば、デコード側の装置１は、シェアコード入力部１４と、統計学習エンジン１００を用いる第１のデコーダと１１３と、第２の相関学習エンジン１２０を用いる第２のデコーダ１２３と、グレースケール画像作成部１５とを有する。これら機能構成部は、装置に搭載されたコンピュータを機能させるプログラムを実行することによって実現できる。また、これら機能構成部の処理の流れは、装置のエンコード方法としても理解できる。
尚、統計学習エンジン１００及び第１のデコーダ１１３と、第２の相関学習エンジン１２０及び第２のデコーダ１２３とは、装置１の用途に応じて、いずれか一方のみを備えたものであってもよい。 According to FIG. 13, the device 1 on the decoding side includes a share code input unit 14, a first decoder 113 that uses the statistical learning engine 100, a second decoder 123 that uses the second correlation learning engine 120, and a grayscale image creation unit 15. These functional components can be realized by executing a program that causes a computer mounted on the device to function. The flow of processing of these functional components can also be understood as an encoding method of the device.
Depending on the use of the device 1, the device may include only one of (a) the statistical learning engine 100 with the first decoder 113 and (b) the second correlation learning engine 120 with the second decoder 123.

［第１のデコーダ１１３］
第１のデコーダ１１３は、統計学習エンジン１００を用いて、当該次元数ｍの成分変数から３次元モデルにデコードする。これによって、シェアコードから３次元モデルを生成することができる。 [First decoder 113]
The first decoder 113 uses the statistical learning engine 100 to decode the component variable having the number of dimensions m into a three-dimensional model. Thus, a three-dimensional model can be generated from the share code.

［グレースケール画像作成部１５］
グレースケール画像作成部１５は、デコードされた３次元モデルを、所定視点からソフトウェア上で撮影した１枚以上のグレースケール画像を作成する。これによって、シェアコードから、生成された３次元モデルの任意の視点から見たグレースケール画像を生成することができる。 [Grayscale image creation unit 15]
The grayscale image creation unit 15 creates one or more grayscale images of the decoded three-dimensional model photographed from a predetermined viewpoint on software. Thus, a grayscale image of the generated three-dimensional model viewed from an arbitrary viewpoint can be generated from the share code.

［第２のデコーダ１２３］
第２のデコーダ１２３は、第２の相関学習エンジン１２０を用いて、対象データとしての次元数ｍの成分変数から組成値にデコードする。これによって、シェアコードから組成値を生成することができる。 [Second decoder 123]
The second decoder 123 uses the second correlation learning engine 120 to decode a component variable having dimensionality m as target data into a composition value. Thus, a composition value can be generated from the share code.

図１４は、本発明によってデコードされた３次元モデルの精度を表す説明図である。 FIG. 14 is an explanatory diagram showing the accuracy of the three-dimensional model decoded according to the present invention.

図１４によれば、左側には、４つのモデルとなる対象データの人体が表されている。
中央には、その対象データの人体を、デプスカメラで撮影したデプス画像が表されている。
右側には、そのデプス画像をエンコードしてシェアコードを作成した後、そのシェアコードをデコードして生成した３次元モデルが表されている。
ここで特徴的な点として、左側の対象データの人体と、右側の生成された３次元モデルとが、ほぼ同じ体形となっている。
尚、エンコード及びデコードには、第１の相関学習エンジン１１０を用いた第１のエンコーダ１１１と、統計学習エンジン１００を用いた第１のデコーダ１１３とから実現しており、組成値までは含めていない。 According to FIG. 14, the human body of the target data serving as the four models is shown on the left side.
In the center, a depth image obtained by photographing the human body of the target data with a depth camera is shown.
On the right side, a three-dimensional model generated by encoding the depth image , creating a share code, and then decoding the share code is shown.
Here, as a characteristic point, the human body of the target data on the left side and the generated three-dimensional model on the right side have substantially the same body shape.
Note that the encoding and decoding here are realized by the first encoder 111 using the first correlation learning engine 110 and the first decoder 113 using the statistical learning engine 100; composition values are not involved.

図１５は、本発明における運用段階の再学習を表す機能構成図である。 FIG. 15 is a functional block diagram showing the re-learning at the operation stage in the present invention.

図１５によれば、対象データとして対応付けられたデプス画像及び組成値を入力し、第１の相関学習エンジンと、第２の相関学習エンジンとを再学習する。
統合部１６は、第１のエンコーダ１１１によってエンコードされた対象データの第１の次元数ｍの成分変数と、第２のエンコーダ１２１によってエンコードされた第２の次元数ｍの成分変数とを入力する。そして、統合部１６は、第１の次元数ｍの成分変数と、第２の次元数ｍの成分変数とを、１つの次元数ｍの成分変数に統合する。
このとき、図１２のシェアコード出力部１３の統合機能と同様に、対象データの組成値について欠損した組成値の数が多いほど、小さくなる重みｗを付与して、次元毎に算出した、第１の次元数ｍの成分変数と第２の次元数ｍの成分変数との加重平均を、１つの次元数ｍの成分変数として統合する。
これによって、デプスカメラによって撮影されたデプス画像、又は、体組成計によって計測された体組成値の一方のみに強く依存したシェアコードにならないようにすることができる。
尚、組成値について欠損した組成値の数が明確であることを要する。この場合、シェアコードに、別途、欠損した組成値がデコード側で検出することができるように、組成値１０次元に対応する１０ビットによって、入力値１／欠損値０のようにフラグを立てておくことも好ましい。 According to FIG. 15, a depth image and a composition value associated as target data are input, and the first correlation learning engine and the second correlation learning engine are re-learned.
The integrating unit 16 receives the first component variables of dimension m of the target data encoded by the first encoder 111 and the second component variables of dimension m encoded by the second encoder 121. The integrating unit 16 then integrates the first and second component variables into a single set of component variables of dimension m.
At this time, as in the integration function of the share code output unit 13 in FIG. 12, a weight w that becomes smaller as the number of missing composition values in the target data increases is assigned, and the weighted average of the first and second component variables of dimension m, calculated for each dimension, is used as the single integrated set of component variables of dimension m.
This prevents the share code from depending strongly on only one of the depth image captured by the depth camera and the body composition values measured by the body composition meter.
Note that the number of missing composition values must be known. For this purpose, it is also preferable to set flags in the share code, such as 1 for an input value and 0 for a missing value, using 10 bits corresponding to the 10 dimensions of the composition values, so that the decoding side can detect which composition values are missing.
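The weighted integration and the 10-bit flag field can be sketched as follows. The text only states that the weight w becomes smaller as more composition values are missing; the specific formula used here (w = 0.5 * present / n, so that a complete set of composition values gives both encoders equal weight) is purely an assumption for illustration, as are the function names and the choice of placing the first flag in the most significant bit.

```python
def composition_weight(flags):
    """Assumed weight for the composition-derived variables.

    flags holds 1 for an input value and 0 for a missing value.
    The weight shrinks linearly as more values are missing.
    """
    return 0.5 * sum(flags) / len(flags)

def integrate(s_depth, s_comp, flags):
    """Per-dimension weighted average of the two sets of component variables."""
    w = composition_weight(flags)
    return [(1.0 - w) * a + w * b for a, b in zip(s_depth, s_comp)]

def flags_to_bits(flags):
    """Pack the input/missing flags into an integer, first flag in the top bit."""
    bits = 0
    for f in flags:
        bits = (bits << 1) | (1 if f else 0)
    return bits

# Assumed example: 3 of the 10 composition values were measured.
flags = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
w = composition_weight(flags)
merged = integrate([1.0, 2.0], [3.0, 0.0], flags)   # toy 2-dimensional variables
bits = flags_to_bits(flags)
```

With this weighting, a body for which no composition values were measured relies entirely on the depth-derived component variables, which is consistent with the stated goal of avoiding over-reliance on an incomplete source.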

第１の相関学習エンジン１１０は、対象データのデプス画像と、統合された次元数ｍの成分変数とから再学習する。
同様に、第２の相関学習エンジン１２０は、対象データの組成値と、統合された次元数ｍの成分変数とから再学習する。 The first correlation learning engine 110 re-learns from the depth image of the target data and the integrated component variables having the number of dimensions m.
Similarly, the second correlation learning engine 120 re-learns from the composition value of the target data and the integrated component variables having the number of dimensions m.

図１６は、本発明のエンコード側機能を体組成計に組み込んだ構成図である。 FIG. 16 is a configuration diagram in which the encoding side function of the present invention is incorporated in a body composition meter.

本発明の体組成計には、組成値の計測対象ユーザを、部分的又は全体的に撮影するデプスカメラが搭載されている。特に、そのデプスカメラは、人が手で把持するグリップに搭載されている。
計測対象ユーザは、体組成計の上に立つと共に、電極が装着されたグリップを両手で把持する。そして、腕を水平に上げて、肘を伸ばす。これによって、腕と背筋とが垂直となり、その両手に把持されたグリップが顔の正面に位置する。このとき、グリップに搭載されたデプスカメラが、顔の正面からその計測対象ユーザの上半身を含む部分的な体形を撮影することができる。 The body composition meter of the present invention is equipped with a depth camera that partially or wholly photographs the user whose composition value is to be measured. In particular, the depth camera is mounted on a grip that is held by a human hand.
The user to be measured stands on the body composition meter and grips, with both hands, the grip on which the electrodes are mounted. The user then raises the arms horizontally with the elbows extended. As a result, the arms are perpendicular to the back, and the grip held in both hands is positioned in front of the face. In this posture, the depth camera mounted on the grip can capture, from in front of the face, a partial body shape including the upper body of the user to be measured.

図１６によれば、人の組成値を計測する体組成計に、図１０のエンコード側機能が組み込まれている。
これによって、デプスカメラによって撮影されたデプス画像を次元数ｍの成分変数へエンコードし、当該次元数ｍの成分変数と自ら計測した組成値とを対応付けることができる。 According to FIG. 16, the encoding-side functions of FIG. 10 are incorporated in a body composition meter that measures the composition values of a person.
As a result, the depth image captured by the depth camera can be encoded into a component variable having the number of dimensions m, and the component variable having the number m of dimensions can be associated with the composition value measured by itself.

また、図１６によれば、シェアコード出力部１３には、統合機能を含むものであってもよい。
統合機能とは、第１のエンコーダ１１１から出力された第１の次元数ｍの成分変数と、第２のエンコーダ１２１から出力された第２の次元数ｍの成分変数とを統合したシェアコードを出力する。
これによって、デプスカメラによって撮影されたデプス画像、又は、体組成計によって計測された体組成値の一方のみに強く依存したシェアコードにならないようにすることができる。 According to FIG. 16, the share code output unit 13 may include an integration function.
The integration function outputs a share code that integrates the first component variables of dimension m output from the first encoder 111 and the second component variables of dimension m output from the second encoder 121.
This prevents the share code from depending strongly on only one of the depth image captured by the depth camera and the body composition values measured by the body composition meter.

ここで、対象データの組成値について欠損した組成値の数が多いほど、小さくなる重みｗを付与して、次元毎に算出した、第１の次元数ｍの成分変数と第２の次元数ｍの成分変数との加重平均を、１つの次元数ｍの成分変数として統合する。 Here, a weight w that becomes smaller as the number of missing composition values in the target data increases is assigned, and the weighted average of the first and second component variables of dimension m, calculated for each dimension, is used as the single integrated set of component variables of dimension m.

図１７は、体組成計及び端末を用いたデプスカメラの種々配置を表すシステム構成図である。 FIG. 17 is a system configuration diagram showing various arrangements of a depth camera using a body composition meter and a terminal.

According to FIG. 16 described above, a single body composition meter is equipped with both a body composition value measurement unit and a depth camera.
In contrast, according to FIG. 17(a), the body composition meter that measures a person's composition value can communicate with a terminal (for example, a smartphone) equipped with a depth camera that partially or wholly photographs the user whose composition value is measured.
The body composition meter receives from the terminal the depth image captured by the depth camera, encodes that depth image into an m-dimensional component variable, and can associate that component variable with the composition value it measured itself.

According to FIG. 17(b), the terminal can communicate with a body composition meter that measures a person's composition value, and is equipped with a depth camera that partially or wholly photographs the user whose composition value is measured.
The terminal receives the composition value measured by the body composition meter, encodes the depth image captured by the depth camera into an m-dimensional component variable, and can associate that component variable with the composition value received from the body composition meter.

According to FIG. 17(c), the terminal can communicate with a body composition meter that measures a person's composition value and is equipped with a depth camera that partially or wholly photographs the user whose composition value is measured.
The terminal receives from the body composition meter the measured composition value and the depth image captured by the depth camera, encodes the depth image into an m-dimensional component variable, and can associate that component variable with the composition value.

FIG. 18 is an external view showing the posture of a user when a depth camera is mounted on the grip portion of the body composition meter.

FIG. 18 shows the angle of view of the depth image that can be captured from the grip portion of the body composition meter. Specifically, the depth camera has an angle of view of 70 degrees and is tilted 20 degrees downward from the horizontal. In this case, the head, the portions below the knees, and the wrists do not appear in the depth image, but a three-dimensional model can still be generated by encoding and decoding that depth image.
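A rough worked example of why parts of the body fall outside such a depth image is sketched below. It assumes that the 70-degree angle of view is the vertical angle and is centered on the optical axis, so the upper edge of the view lies 15 degrees above the horizontal and the lower edge 55 degrees below it. The camera height and camera-to-body distance are hypothetical values chosen only for illustration; the text does not state them.

```python
import math
from typing import Tuple


def visible_span(camera_height_m: float, distance_m: float,
                 fov_deg: float = 70.0, tilt_down_deg: float = 20.0) -> Tuple[float, float]:
    """Return (bottom, top) heights in meters that are visible on a vertical
    plane at `distance_m` in front of the camera.

    Assumptions: fov_deg is the vertical angle of view, symmetric about the
    optical axis, and the axis is tilted tilt_down_deg below the horizontal.
    """
    half = fov_deg / 2.0
    upper = math.radians(half - tilt_down_deg)  # angle above the horizontal
    lower = math.radians(half + tilt_down_deg)  # angle below the horizontal
    top = camera_height_m + distance_m * math.tan(upper)
    bottom = camera_height_m - distance_m * math.tan(lower)
    return bottom, top


# Hypothetical setup: grip held 1.0 m above the floor, 0.5 m from the body.
bottom, top = visible_span(camera_height_m=1.0, distance_m=0.5)
```

With these assumed values the visible band runs from roughly 0.29 m to 1.13 m above the floor, which is consistent with the head and the lower legs being cut off.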

FIG. 19 is an explanatory diagram showing the accuracy of the three-dimensional models decoded from depth images captured in the user posture of FIG. 18.

According to FIG. 19, the left side shows the human bodies of the target data, which serve as four models.
The center shows depth images of those human bodies captured by the depth camera from the grip portion of the body composition meter, in the same posture as in FIG. 18.
The right side shows the three-dimensional models generated by encoding each depth image to create a share code and then decoding that share code.
Notably, the human body of the target data on the left and the generated three-dimensional model on the right have substantially the same body shape.
As in FIG. 14, encoding and decoding are implemented by the first encoder 111, which uses the first correlation learning engine 110, and the first decoder 113, which uses the statistical learning engine 100, and only height is considered as the composition value.
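The decoding step can be illustrated with a minimal sketch, assuming the statistical learning engine 100 is based on principal component analysis (one of the two options named in the claims). In that case, decoding an m-dimensional component variable amounts to adding a weighted sum of principal axes to the mean shape. The function name, the toy mean shape, and the toy basis below are made up for illustration; a real model would hold three coordinates per vertex for every vertex of the three-dimensional model.

```python
from typing import List, Sequence


def decode_shape(components: Sequence[float],
                 mean_shape: Sequence[float],
                 basis: Sequence[Sequence[float]]) -> List[float]:
    """PCA-style decoding: shape = mean_shape + sum_i components[i] * basis[i].

    `mean_shape` and each basis vector are flattened vertex coordinates of
    equal length; `components` has one entry per basis vector (dimension m).
    """
    if len(components) != len(basis):
        raise ValueError("need one component per basis vector")
    shape = list(mean_shape)
    for c, axis in zip(components, basis):
        if len(axis) != len(shape):
            raise ValueError("basis vector length must match the mean shape")
        for i, v in enumerate(axis):
            shape[i] += c * v
    return shape


# Toy model (hypothetical values): 2 vertices -> 6 coordinates, m = 2.
MEAN = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
BASIS = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],  # axis 0: moves the first vertex up
    [0.5, 0.0, 0.0, -0.5, 0.0, 0.0],  # axis 1: spreads the vertices apart
]
model = decode_shape([0.2, 1.0], MEAN, BASIS)
```

In this sketch the share code plays the role of `components`, so a receiver holding the same mean shape and basis can rebuild the vertex coordinates from the code alone.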

As described in detail above, the program of the present invention can generate a three-dimensional model of a measurement target user from a depth image of that user captured by a depth camera.

Various changes, modifications, and omissions within the scope of the technical idea and viewpoint of the present invention can readily be made to the embodiments described above by those skilled in the art. The foregoing description is merely an example and is not intended to be limiting. The present invention is limited only as defined by the claims and their equivalents.

Reference Signs List
1 apparatus
10 depth image creation unit
100 statistical learning engine
110 first correlation learning engine
111 first encoder
113 first decoder
120 second correlation learning engine
121 second encoder
122 missing value estimation unit
123 second decoder
13 share code output unit
14 share code input unit
15 grayscale image creation unit
16 integration unit

Claims

1. A program that causes a computer mounted on an apparatus that learns by associating three-dimensional models of human bodies serving as teacher data with depth images to function as:
depth image creating means for creating, for each three-dimensional model of the teacher data, a depth image of that three-dimensional model photographed in software from one or more predetermined viewpoints;
a statistical learning engine that outputs dimension-compressed m-dimensional component variables from the plurality of three-dimensional models of the teacher data group and constructs a statistical learning model;
a first correlation learning engine that constructs, for the plurality of three-dimensional models of the teacher data group, a first correlation learning model between the depth image of each three-dimensional model and the m-dimensional component variable; and
a second correlation learning engine that constructs, for the plurality of three-dimensional models of the teacher data group, a second correlation learning model between an n-dimensional composition value, including at least height and one or more body composition measurement values, and the m-dimensional component variable.

2. The program according to claim 1, wherein the computer is caused to function such that the statistical learning engine is based on Principal Component Analysis or an AutoEncoder.

3. The program according to claim 1 or 2, wherein the computer is caused to function such that the first correlation learning engine is based on a convolutional neural network, and
the second correlation learning engine is based on a least squares method or a multilayer perceptron.

4. The program according to claim 1, wherein the computer is caused to function such that the number of three-dimensional models in the teacher data group is smaller than the number of vertices of each three-dimensional model.

5. The program according to any one of claims 1 to 4, wherein the computer is further caused to function as:
a first encoder that encodes a depth image serving as target data into an m-dimensional component variable using the first correlation learning engine; and
a second encoder that encodes a single n-dimensional composition value serving as target data into an m-dimensional component variable using the second correlation learning engine.

6. The program according to claim 5, wherein, in order to input a depth image and a composition value associated with each other as target data and to re-train the first correlation learning engine and the second correlation learning engine, the computer is caused to function such that:
the first encoder encodes the depth image of the target data into a first m-dimensional component variable;
the second encoder encodes the composition value of the target data into a second m-dimensional component variable;
integrating means integrates the first m-dimensional component variable and the second m-dimensional component variable into a single m-dimensional component variable;
the first correlation learning engine re-learns from the depth image of the target data and the integrated m-dimensional component variable; and
the second correlation learning engine re-learns from the composition value of the target data and the integrated m-dimensional component variable.

7. The program according to claim 6, wherein the computer is caused to function such that a weight w that becomes smaller as the number of missing composition values among the composition values of the target data increases is assigned, and the weighted average, calculated for each dimension, of the first m-dimensional component variable and the second m-dimensional component variable is integrated as a single m-dimensional component variable.

8. The program according to claim 5, wherein the computer is further caused to function as missing value estimating means that, even when only k (< n) composition values are determined for a single n-dimensional composition value serving as target data and the other n - k composition values are missing, estimates optimized values for the other n - k composition values using the k composition values as constraints, in order to estimate the m-dimensional component variable.

9. The program according to claim 8, wherein the computer is caused to function such that the missing value estimating means uses the method of Lagrange multipliers.

10. The program according to any one of claims 1 to 9, wherein the computer is further caused to function as a first decoder that decodes the m-dimensional component variable into a three-dimensional model using the statistical learning engine.

11. The program according to claim 10, wherein the computer is further caused to function as grayscale image creating means for creating one or more grayscale images of the decoded three-dimensional model photographed in software from a predetermined viewpoint.

12. The program according to any one of claims 1 to 11, wherein the computer is further caused to function as a second decoder that decodes an m-dimensional component variable serving as target data into a composition value using the second correlation learning engine.

13. The program according to any one of claims 1 to 12, wherein the computer is caused to function such that the depth image is a photographic image whose grayscale gradation varies according to depth.

14. The program according to any one of claims 1 to 13, wherein the computer is further caused to function as share code output means for outputting the encoded m-dimensional component variable as a share code and embedding the share code in a QR code (Quick Response, registered trademark), an RFID (Radio Frequency IDentifier), or a Cookie.