
HK1205311B - Three-dimensional user-interface device, and three-dimensional operation method


Info

Publication number
HK1205311B
Authority
HK
Hong Kong
Prior art keywords
dimensional
unit
virtual
space
processing
Prior art date
Application number
HK15105864.5A
Other languages
Chinese (zh)
Other versions
HK1205311A1 (en)
Inventor
Morishita Koji
Nagai Katsuyuki
Noda Hisashi
Original Assignee
NEC Solution Innovators, Ltd. (日本电气方案创新株式会社)
Application filed by NEC Solution Innovators, Ltd. (日本电气方案创新株式会社)
Priority claimed from PCT/JP2013/001520 (WO2014016987A1)
Publication of HK1205311A1
Publication of HK1205311B

Description

Three-dimensional user interface device and three-dimensional operation method
Technical Field
The present invention relates to three-dimensional user interface technology.
Background
In recent years, technologies for realizing a three-dimensional environment on a computer, such as 3DCG (three-dimensional computer graphics) and Augmented Reality (AR), have been widely put into practical use. AR technology superimposes a virtual object and data on an object in the real world captured via a camera of a portable device such as a smartphone or via a head-mounted display (HMD). Such display techniques allow the user to visually confirm three-dimensional video. Patent document 1 below proposes the following: a user is identified and tracked within a scene using a depth detection camera, and a virtual reality animation (avatar animation) that simulates the movement of the user is displayed within the scene based on the result.
However, a User Interface (UI) for operating a three-dimensional environment represented by the above-described technology is currently implemented using a two-dimensional input device. For example, a two-dimensional mouse operation is converted into an operation in a three-dimensional space. Therefore, many of the current UIs for operating a three-dimensional environment are not easily understood intuitively.
Therefore, patent document 2 below proposes the following technique: a remote controller having a depth camera is used to detect a change in the position of the remote controller, and an input command for operating an application is triggered based on the change. Further, patent document 3 proposes a technique that provides the user with a computer interaction experience in a natural three-dimensional environment without requiring additional equipment such as arm covers or gloves. In this proposal, a depth camera is provided at a position facing the user, an image into which a virtual object is inserted is displayed on a display together with the user photographed by the depth camera, and an interaction between the user and the virtual object is detected.
Documents of the prior art
Patent document
Patent document 1: Japanese Kohyo Publication No. 2011-515736
Patent document 2: Japanese Kohyo Publication No. 2011-514232
Patent document 3: Japanese Patent No. 4271236
Disclosure of Invention
Problems to be solved by the invention
According to the method proposed in patent document 3, a virtual object placed in a visualized real space can be moved by the hand of a user present in the image. However, patent document 3 proposes neither any operation other than moving the virtual object nor a method of operating the three-dimensional space itself that is shown as the image.
The present invention has been made in view of the above circumstances, and provides a user interface technique with which a stereoscopically displayed virtual three-dimensional space can be operated intuitively and easily.
Means for solving the problems
In each aspect of the present invention, in order to solve the above problem, the following configurations are adopted.
The three-dimensional user interface device according to the first aspect includes: a three-dimensional information acquisition unit that acquires three-dimensional information from a three-dimensional sensor; a position calculation unit that calculates, using the three-dimensional information acquired by the three-dimensional information acquisition unit, three-dimensional position information on a three-dimensional coordinate space relating to a specific part of a subject person; a virtual data generation unit that generates virtual three-dimensional space data representing a virtual three-dimensional space which is arranged in the three-dimensional coordinate space and at least a part of which is set as a display area; a space processing unit that applies, to the three-dimensional coordinate space or the virtual three-dimensional space data, predetermined processing corresponding to a change in the three-dimensional position information relating to the specific part of the subject person; and a display processing unit that displays the virtual three-dimensional space within the display area on a display unit based on the virtual three-dimensional space data to which the space processing unit has applied the predetermined processing.
The three-dimensional operation method of the second aspect of the present invention is executed by at least one computer, and includes: acquiring three-dimensional information from a three-dimensional sensor; calculating, using the acquired three-dimensional information, three-dimensional position information on a three-dimensional coordinate space relating to a specific part of a subject person; generating virtual three-dimensional space data representing a virtual three-dimensional space which is arranged in the three-dimensional coordinate space and at least a part of which is set as a display area; applying, to the three-dimensional coordinate space or the virtual three-dimensional space data, predetermined processing corresponding to a change in the three-dimensional position information relating to the specific part of the subject person; and displaying the virtual three-dimensional space within the display area on a display unit based on the virtual three-dimensional space data to which the predetermined processing has been applied.
Further, other aspects of the present invention may provide a program for causing a computer to realize each configuration included in the first aspect, or a computer-readable recording medium on which such a program is recorded. The recording medium includes a non-transitory tangible medium.
Effects of the invention
According to the above aspects, it is possible to provide a user interface technique with which a stereoscopically displayed virtual three-dimensional space can be operated intuitively and easily.
Drawings
The above objects, other objects, features and advantages will become more apparent from the following description of preferred embodiments and the accompanying drawings attached hereto.
Fig. 1 is a diagram conceptually showing an example of the hardware configuration of a three-dimensional user interface device (3D-UI device) according to a first embodiment.
Fig. 2 is a diagram showing an example of a usage mode of the three-dimensional user interface device (3D-UI device) according to the first embodiment.
Fig. 3 is a diagram showing an example of the external structure of the HMD.
Fig. 4 is a diagram conceptually showing an example of a processing configuration of the sensor-side device according to the first embodiment.
Fig. 5 is a diagram conceptually showing an example of a processing configuration of the display-side device according to the first embodiment.
Fig. 6 is a diagram showing an example of a synthesized image displayed on the HMD.
Fig. 7 is a sequence diagram showing an operation example of the three-dimensional user interface device (3D-UI device) according to the first embodiment.
Fig. 8 is a diagram showing an example of a composite image displayed on the HMD according to the second embodiment.
Fig. 9 is a diagram showing an example of the operation of moving the virtual 3D space according to example 1.
Fig. 10 is a diagram showing an example of the reduction operation of the virtual 3D space according to example 1.
Fig. 11 is a diagram showing an example of the rotation operation of the virtual 3D space according to example 1.
Fig. 12 is a diagram conceptually showing an example of the hardware configuration of a three-dimensional user interface device (3D-UI device) according to a modification.
Fig. 13 is a diagram conceptually showing an example of the processing configuration of the three-dimensional user interface device (3D-UI device) according to the modification.
Detailed Description
Hereinafter, embodiments of the present invention will be described. The embodiments described below are examples, and the present invention is not limited to the configurations of the embodiments described below.
The three-dimensional user interface device of the present embodiment includes: a three-dimensional information acquisition unit that acquires three-dimensional information from a three-dimensional sensor; a position calculation unit that calculates, using the three-dimensional information acquired by the three-dimensional information acquisition unit, three-dimensional position information on a three-dimensional coordinate space relating to a specific part of a subject person; a virtual data generation unit that generates virtual three-dimensional space data representing a virtual three-dimensional space which is arranged in the three-dimensional coordinate space and at least a part of which is set as a display area; a space processing unit that applies, to the three-dimensional coordinate space or the virtual three-dimensional space data, predetermined processing corresponding to a change in the three-dimensional position information relating to the specific part of the subject person; and a display processing unit that displays the virtual three-dimensional space within the display area on a display unit based on the virtual three-dimensional space data to which the space processing unit has applied the predetermined processing.
The three-dimensional operation method of the present embodiment is executed by at least one computer, and includes: acquiring three-dimensional information from a three-dimensional sensor; calculating, using the acquired three-dimensional information, three-dimensional position information on a three-dimensional coordinate space relating to a specific part of a subject person; generating virtual three-dimensional space data representing a virtual three-dimensional space which is arranged in the three-dimensional coordinate space and at least a part of which is set as a display area; applying, to the three-dimensional coordinate space or the virtual three-dimensional space data, predetermined processing corresponding to a change in the three-dimensional position information relating to the specific part of the subject person; and displaying the virtual three-dimensional space within the display area on a display unit based on the virtual three-dimensional space data to which the predetermined processing has been applied.
In the present embodiment, three-dimensional information is acquired from a three-dimensional sensor. The three-dimensional information includes a two-dimensional image of the subject person obtained by visible light and information of the distance (depth) from the three-dimensional sensor. The three-dimensional sensor may be constituted by a plurality of devices such as a visible light camera and a depth sensor.
In the present embodiment, by using the three-dimensional information, three-dimensional position information on a three-dimensional coordinate space relating to a specific part of the subject person is calculated, and virtual three-dimensional space data representing a virtual three-dimensional space which is arranged in the three-dimensional coordinate space and at least a part of which is set as a display area is generated. Here, the specific part is a part of the body that the subject person uses to operate the virtual three-dimensional space displayed on the display unit; the present embodiment does not limit what the specific part is. The three-dimensional coordinate space is a space expressed by three-dimensional coordinates and is used for position recognition in the virtual three-dimensional space realized by so-called computer graphics.
The calculation of the three-dimensional position information includes not only a method of obtaining the three-dimensional position information directly from the three-dimensional information detected by the three-dimensional sensor, but also a method of obtaining it indirectly. Indirectly means that the three-dimensional position information is obtained from information produced by applying predetermined processing to the three-dimensional information detected by the three-dimensional sensor. Accordingly, the three-dimensional coordinate space may be determined by, for example, the camera coordinate system of the three-dimensional sensor, or by a marker coordinate system calculated from an image marker or the like of known shape detected from the three-dimensional information.
In the present embodiment, as described above, the three-dimensional position information on the specific part of the subject person is sequentially calculated using the three-dimensional information sequentially acquired from the three-dimensional sensor, and thereby a change in the three-dimensional position information on the specific part, that is, a three-dimensional motion (three-dimensional gesture) of the specific part of the subject person, is detected. In the present embodiment, predetermined processing corresponding to this change in the three-dimensional position information is applied to the three-dimensional coordinate space or the virtual three-dimensional space data, and the virtual three-dimensional space corresponding to the result of the predetermined processing is displayed on the display unit. Here, the predetermined processing is, for example, processing of moving, rotating, enlarging, or reducing the virtual three-dimensional space.
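As an illustrative, non-limiting sketch of how such predetermined processing could act on virtual three-dimensional space data, the following Python code assumes the placement of the virtual space in the three-dimensional coordinate space is held as a 4x4 homogeneous transform (the representation, function names, and variables are assumptions of this description, not part of the patent).

```python
# Minimal sketch: the virtual 3D space's placement in the shared 3D coordinate
# space is assumed to be a 4x4 homogeneous transform (space_pose); each
# predetermined process composes a further transform onto it.
import numpy as np

def move_space(space_pose, delta_xyz):
    """Move the virtual 3D space by the vector delta_xyz."""
    t = np.eye(4)
    t[:3, 3] = delta_xyz
    return t @ space_pose

def scale_space(space_pose, factor, pivot_xyz):
    """Enlarge (factor > 1) or reduce (factor < 1) the space about pivot_xyz."""
    pivot = np.asarray(pivot_xyz, dtype=float)
    s = np.eye(4)
    s[:3, :3] *= factor
    s[:3, 3] = pivot - factor * pivot          # keep the pivot point fixed
    return s @ space_pose

def rotate_space(space_pose, rot3x3, pivot_xyz):
    """Rotate the virtual 3D space by rot3x3 about pivot_xyz."""
    pivot = np.asarray(pivot_xyz, dtype=float)
    r = np.eye(4)
    r[:3, :3] = rot3x3
    r[:3, 3] = pivot - rot3x3 @ pivot          # keep the pivot point fixed
    return r @ space_pose
```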
Therefore, according to the present embodiment, the target person (user) can freely operate the virtual three-dimensional space displayed on the display unit by performing a predetermined three-dimensional gesture using a specific part of the user. In addition, in the present embodiment, since the user can operate the virtual three-dimensional space by the three-dimensional motion of the specific part of the user himself, the user can easily understand intuitively and can obtain the feeling of operating the virtual three-dimensional space.
The above embodiments will be described in more detail below.
[ first embodiment ]
[ device Structure ]
Fig. 1 is a diagram conceptually showing an example of the hardware configuration of the three-dimensional user interface device (hereinafter referred to as the 3D-UI device) 1 according to the first embodiment. The 3D-UI device 1 of the first embodiment broadly comprises a sensor-side structure and a display-side structure. The sensor-side structure is formed by a three-dimensional sensor (hereinafter referred to as the 3D sensor) 8 and a sensor-side device 10. The display-side structure is formed by a head-mounted display (hereinafter referred to as the HMD) 9 and a display-side device 20. Hereinafter, "three-dimensional" is abbreviated as 3D where appropriate.
Fig. 2 is a diagram showing an example of a usage mode of the 3D-UI device 1 according to the first embodiment. As shown in fig. 2, the 3D sensor 8 is disposed at a position capable of detecting a specific part of the subject person (user). HMD9 is attached to the head of a subject (user) and allows the subject to visually recognize a sight-line image corresponding to the sight line of the subject and the virtual 3D space synthesized with the sight-line image.
The 3D sensor 8 detects 3D information used for detecting the specific part of the subject person and the like. The 3D sensor 8 is realized, for example, by a visible light camera and a range image sensor, as in Kinect (registered trademark). The range image sensor, also called a depth sensor, irradiates the subject person with a pattern of near-infrared light from a laser and calculates the distance (depth) from the range image sensor to the subject person from information obtained by imaging the pattern with a camera that detects the near-infrared light. The implementation method of the 3D sensor 8 itself is not limited, and the 3D sensor 8 may be implemented by a three-dimensional scanning method using a plurality of visible light cameras. Although the 3D sensor 8 is illustrated as a single element in Fig. 1, it may be implemented by a plurality of devices, such as a visible light camera that captures a two-dimensional image of the subject person and a sensor that detects the distance to the subject person.
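As an illustrative sketch of how a depth image relates to 3D positions, the following assumes a pinhole camera model with hypothetical intrinsic parameters (fx, fy, cx, cy); it is not part of the patent.

```python
# Minimal sketch (assumed pinhole model, hypothetical intrinsics): back-project
# a depth image in metres into 3D points in the 3D sensor's camera coordinate system.
import numpy as np

def depth_to_points(depth_m, fx, fy, cx, cy):
    """depth_m: (H, W) array of depths; returns an (H, W, 3) array of XYZ points."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / fx                      # horizontal offset from the optical axis
    y = (v - cy) * z / fy                      # vertical offset from the optical axis
    return np.stack([x, y, z], axis=-1)
```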
Fig. 3 is a diagram showing an example of the external structure of the HMD 9. Fig. 3 shows the structure of an HMD 9 of the so-called video see-through type. In the example of Fig. 3, the HMD 9 has two line-of-sight cameras 9a and 9b and two displays 9c and 9d. The line-of-sight cameras 9a and 9b capture line-of-sight images corresponding to the respective lines of sight of the user; the HMD 9 may therefore also be referred to as an imaging unit. The displays 9c and 9d are each arranged to cover most of the user's field of view and display a composite 3D image obtained by combining the virtual 3D space with each line-of-sight image.
The sensor-side device 10 and the display-side device 20 each include a Central Processing Unit (CPU) 2, a memory 3, a communication device 4, an input/output interface (I/F)5, and the like, which are connected to each other by a bus or the like. The Memory 3 is a RAM (Random Access Memory), a ROM (Read Only Memory), a hard disk, a portable storage medium, or the like.
The 3D sensor 8 is connected to the input/output I/F 5 of the sensor-side device 10, and the HMD 9 is connected to the input/output I/F 5 of the display-side device 20. The input/output I/F 5 and the 3D sensor 8, and the input/output I/F 5 and the HMD 9, may also be connected so as to be able to communicate wirelessly. Each communication device 4 communicates with the other devices (the sensor-side device 10, the display-side device 20, and the like) by wireless or wired communication. The present embodiment does not limit such communication methods, nor does it limit the specific hardware configuration of the sensor-side device 10 and the display-side device 20.
[ treatment Structure ]
< sensor-side device >
Fig. 4 is a diagram conceptually showing an example of the processing configuration of the sensor-side device 10 according to the first embodiment. The sensor-side device 10 of the first embodiment includes a 3D information acquisition unit 11, a first object detection unit 12, a first reference setting unit 13, a position calculation unit 14, a state acquisition unit 15, a transmission unit 16, and the like. The processing units are realized by executing a program stored in the memory 3 by the CPU2, for example. The program may be installed from a portable recording medium such as a CD (Compact Disc) or a memory card, or from another computer on the network via the input/output I/F5, and stored in the memory 3.
The 3D information acquisition unit 11 sequentially acquires 3D information detected by the 3D sensor 8.
The first object detection unit 12 detects a known common real object from the 3D information acquired by the 3D information acquisition unit 11. The common real object is an image or an object placed in the real world, and is also called an AR (Augmented Reality) marker or the like. In the present embodiment, the specific form of the common real object is not limited as long as a certain reference point and three mutually orthogonal directions from that reference point can always be obtained from it regardless of the viewing direction. The first object detection unit 12 holds information such as the shape, size, and color of the common real object in advance, and detects the common real object from the 3D information using this known information.
The first reference setting unit 13 sets a 3D coordinate space based on the common real object detected by the first object detection unit 12, and calculates the position and orientation of the 3D sensor 8 in that 3D coordinate space. For example, the first reference setting unit 13 sets a 3D coordinate space having the reference point extracted from the common real object as the origin and the three mutually orthogonal directions from that reference point as the axes. The first reference setting unit 13 calculates the position and orientation of the 3D sensor 8 by comparing the known shape and size of the common real object (its original shape and size) with the shape and size of the common real object as extracted from the 3D information (how it appears from the 3D sensor 8).
The position calculation unit 14 sequentially calculates 3D position information on the specific part of the subject person in the 3D coordinate space using the 3D information sequentially acquired by the 3D information acquisition unit 11. In the first embodiment, the position calculation unit 14 specifically calculates the 3D position information as follows. The position calculating unit 14 first extracts 3D position information of a specific part of the subject person from the 3D information acquired by the 3D information acquiring unit 11. The 3D position information extracted here corresponds to the camera coordinate system of the 3D sensor 8. Therefore, the position calculation unit 14 converts the 3D position information corresponding to the camera coordinate system of the 3D sensor 8 into the 3D position information on the 3D coordinate space set by the first reference setting unit 13 based on the position, orientation, and 3D coordinate space of the 3D sensor 8 calculated by the first reference setting unit 13. This conversion is from the camera coordinate system of the 3D sensor 8 to the 3D coordinate system set based on the common real object.
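The following sketch shows one possible realization of this reference setting and conversion; using OpenCV for the marker pose is an assumption of this description (the patent does not name a library), and the function names are hypothetical.

```python
# Sketch: estimate the pose of the common real object (marker) and convert a
# camera-space point into the marker-based 3D coordinate space.
import cv2
import numpy as np

def marker_pose(corners_3d, corners_2d, camera_matrix, dist_coeffs):
    """corners_3d: known marker corner coordinates in the marker frame;
    corners_2d: the same corners detected in the image.
    Returns R, t such that X_camera = R @ X_marker + t."""
    ok, rvec, tvec = cv2.solvePnP(corners_3d, corners_2d, camera_matrix, dist_coeffs)
    if not ok:
        raise RuntimeError("marker pose estimation failed")
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec.reshape(3)

def camera_to_marker(point_camera, R, t):
    """Convert a 3D point (e.g. the specific part's position) from the 3D sensor's
    camera coordinate system into the shared marker-based 3D coordinate space."""
    return R.T @ (np.asarray(point_camera, dtype=float) - t)
```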
Here, there may be a plurality of specific parts of the subject person to be detected. For example, both hands of the subject person may be used as a plurality of specific parts. In this case, the position calculation unit 14 extracts 3D position information of each of the plurality of specific parts from the 3D information acquired by the 3D information acquisition unit 11 and converts each into 3D position information on the 3D coordinate space. Since the specific part is a part of the body that the subject person uses to operate the virtual three-dimensional space displayed on the display unit, it has a certain area or volume. Accordingly, the 3D position information calculated by the position calculation unit 14 may be position information of a single point on the specific part, or position information of a plurality of points.
The state acquisition unit 15 acquires state information of the specific part of the subject person. This specific part is the same as the specific part to be detected by the position calculation unit 14. The state information indicates one of at least two states; specifically, when the specific part is a hand, it indicates at least either a held state or an open state. The present embodiment does not limit the number of states that the state information can represent, as long as those states are detectable. When a plurality of specific parts are used, the state acquisition unit 15 acquires state information on each specific part.
The state acquisition unit 15 holds, for example, image feature information corresponding to each state to be recognized of the specific portion in advance, and acquires the state information of the specific portion by comparing feature information extracted from the 2D image included in the 3D information acquired by the 3D information acquisition unit 11 with the image feature information held in advance. The state acquisition unit 15 may acquire state information of the specific portion from information obtained by a strain sensor (not shown) attached to the specific portion. The state acquisition unit 15 may acquire the state information from an input mouse (not shown) operated by the hand of the subject person. The state acquiring unit 15 may acquire the state information by recognizing a sound obtained by a microphone (not shown).
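As an illustrative sketch of the template comparison mentioned above (the actual feature extraction and matching method is not specified by the patent, so the structure below is an assumption), a state can be chosen as the pre-registered template closest to the extracted features.

```python
# Minimal sketch: pick the hand state whose stored feature template is nearest
# to the feature vector extracted from the 2D image.
import numpy as np

def classify_hand_state(hand_features, state_templates):
    """state_templates: dict such as {"held": vec, "open": vec} of pre-registered
    image features; returns the name of the nearest state."""
    hand_features = np.asarray(hand_features, dtype=float)
    return min(state_templates,
               key=lambda s: np.linalg.norm(np.asarray(state_templates[s], float) - hand_features))
```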
The transmission unit 16 transmits the three-dimensional position information on the three-dimensional coordinate space calculated by the position calculation unit 14 and the state information acquired by the state acquisition unit 15, which are related to the specific part of the subject person, to the display-side device 20.
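The patent does not specify a transmission format; as one purely hypothetical example, each update of the specific part could be serialized as a JSON line and sent to the display-side device over a socket.

```python
# Hypothetical wire format for one specific-part update (not defined by the patent).
import json
import time

def encode_update(part_id, position_xyz, state):
    """Serialize one update: position in the shared 3D coordinate space plus
    state information ('held'/'open'), with a timestamp."""
    msg = {"part": part_id, "pos": [float(v) for v in position_xyz],
           "state": state, "ts": time.time()}
    return (json.dumps(msg) + "\n").encode("utf-8")

# Usage (assuming `sock` is a connected socket to the display-side device):
# sock.sendall(encode_update("right_hand", (0.12, -0.30, 0.85), "held"))
```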
< display-side apparatus >
Fig. 5 is a diagram conceptually showing an example of the processing configuration of the display-side device 20 according to the first embodiment. The display-side device 20 of the first embodiment includes a line-of-sight image acquisition unit 21, a second object detection unit 22, a second reference setting unit 23, a virtual data generation unit 24, an operation determination unit 25, a space processing unit 26, an image synthesis unit 27, a display processing unit 28, and the like. These processing units are realized, for example, by the CPU 2 executing a program stored in the memory 3. The program may be installed from a portable recording medium such as a CD (Compact Disc) or a memory card, or from another computer on the network via the input/output I/F 5, and stored in the memory 3.
The line-of-sight image acquisition unit 21 acquires, from the HMD 9, a line-of-sight image in which the specific part of the subject person is captured. This specific part is the same as the specific part to be detected by the sensor-side device 10. In the present embodiment, since the line-of-sight cameras 9a and 9b are provided, the line-of-sight image acquisition unit 21 acquires line-of-sight images corresponding to the left eye and the right eye, respectively. Since each processing unit processes the two line-of-sight images in the same manner, the following description targets a single line-of-sight image.
The second object detection unit 22 detects the known common real object from the line-of-sight image acquired by the line-of-sight image acquisition unit 21. This common real object is the same as the one detected by the sensor-side device 10 described above. Since the processing of the second object detection unit 22 is the same as that of the first object detection unit 12 of the sensor-side device 10, a detailed description is omitted here. Note that the imaging direction of the common real object in the line-of-sight image differs from that of the common real object in the 3D information obtained by the 3D sensor 8.
The second reference setting unit 23 sets, based on the common real object detected by the second object detection unit 22, the same 3D coordinate space as that set by the first reference setting unit 13 of the sensor-side device 10, and calculates the position and orientation of the HMD 9. Since the processing of the second reference setting unit 23 is the same as that of the first reference setting unit 13 of the sensor-side device 10, a detailed description is omitted here. Because the 3D coordinate space set by the second reference setting unit 23 is based on the same common real object as the 3D coordinate space set by the first reference setting unit 13 of the sensor-side device 10, the 3D coordinate space is, as a result, shared between the sensor-side device 10 and the display-side device 20.
The virtual data generation unit 24 generates virtual 3D space data representing a virtual 3D space which is arranged in the 3D coordinate space shared with the sensor-side device 10 by the second reference setting unit 23 and at least a part of which is set as the display area. The virtual 3D space data includes, for example, data on a virtual object arranged at a predetermined position within the virtual 3D space represented by the data.
The operation determination unit 25 receives the 3D position information and the state information on the 3D coordinate space relating to the specific part of the subject person from the sensor-side device 10, and determines 1 predetermined process to be executed by the space processing unit 26 from among a plurality of predetermined processes based on a combination of the state information and a change in the 3D position information. The change in the 3D position information is calculated from the relationship with the 3D position information obtained at the time of the previous processing. When a plurality of specific portions (for example, both hands) are used, the operation specification unit 25 calculates the positional relationship between the plurality of specific portions based on the plurality of pieces of 3D positional information acquired from the sensor-side device 10, and specifies 1 predetermined process from among the plurality of predetermined processes based on the calculated change in the positional relationship between the plurality of specific portions and the plurality of pieces of state information. The plurality of predetermined processes include a movement process, a rotation process, an enlargement process, a reduction process, an addition process of display data of a function menu, and the like.
More specifically, the operation determination unit 25 determines predetermined processes such as the following. For example, when the specific part of the subject person is one hand, the operation determination unit 25 determines a process of moving the virtual 3D space by a distance corresponding to the linear movement amount of that hand while the hand is maintained in a specific state (for example, the held state). The operation determination unit 25 also measures a period during which neither the state information nor the three-dimensional position information changes, and, when the measured period exceeds a predetermined period, determines a process of adding display data of a function menu.
When the plurality of specific parts of the subject person are both hands, the operation determination unit 25 determines the following predetermined processes. The operation determination unit 25 determines an enlargement process that uses the position of one hand of the subject person as a reference point, at an enlargement rate corresponding to the amount of change in the distance between the two hands. It likewise determines a reduction process that uses the position of one hand as a reference point, at a reduction rate corresponding to the amount of change in the distance between the two hands. It also determines a rotation process that uses the position of one hand as a reference point, by a solid angle change amount corresponding to the change in direction (solid angle) of the line segment connecting the two hands of the subject person.
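As an illustrative sketch of these two-hand determinations (the geometry and names below are assumptions of this description), the enlargement/reduction rate can be taken from the change in the inter-hand distance and the rotation from the change in direction of the segment joining the hands, with the reference hand as the pivot.

```python
# Sketch: derive (scale rate, rotation axis, rotation angle) from the reference
# (first-held) hand and the moving hand's previous and current positions.
import numpy as np

def two_hand_change(ref_hand, moving_prev, moving_now):
    """Returns (scale_rate, rotation_axis, rotation_angle_rad) relative to ref_hand."""
    v_prev = np.asarray(moving_prev, float) - np.asarray(ref_hand, float)
    v_now = np.asarray(moving_now, float) - np.asarray(ref_hand, float)
    scale_rate = np.linalg.norm(v_now) / np.linalg.norm(v_prev)
    u_prev = v_prev / np.linalg.norm(v_prev)
    u_now = v_now / np.linalg.norm(v_now)
    angle = np.arccos(np.clip(np.dot(u_prev, u_now), -1.0, 1.0))   # direction change (rad)
    axis = np.cross(u_prev, u_now)
    n = np.linalg.norm(axis)
    axis = axis / n if n > 1e-9 else np.array([0.0, 0.0, 1.0])
    return scale_rate, axis, angle
```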
The operation determination unit 25 determines whether or not the state information indicates a specific state, and determines whether or not to cause the spatial processing unit 26 to execute a predetermined process based on the determination result. For example, when the specific part of the subject person is one hand, the operation specification unit 25 determines to cause the space processing unit 26 not to execute the predetermined processing or to cause the space processing unit 26 to stop executing the predetermined processing when the state information indicates that the one hand is in the open state. When the plurality of specific portions of the subject person are both hands, the operation specification unit 25 determines to cause the space processing unit 26 to execute the predetermined processing when the state information indicates that both hands are in the held state, and determines to cause the space processing unit 26 not to execute the predetermined processing when the state information indicates that either one of the hands is in the opened state.
For example, the operation determination unit 25 holds an ID for identifying each of the predetermined processes, and selects an ID corresponding to the predetermined process to determine the predetermined process. The operation determining section 25 delivers the selected ID to the space processing section 26, thereby causing the space processing section 26 to execute the predetermined processing.
The space processing unit 26 applies the predetermined process determined by the operation determination unit 25 to the 3D coordinate space set by the second reference setting unit 23 or to the virtual 3D space data generated by the virtual data generation unit 24. The space processing unit 26 is implemented so as to be able to execute each of the plurality of predetermined processes that it supports.
The image combining unit 27 combines the virtual three-dimensional space within the display area indicated by the virtual 3D space data subjected to the predetermined processing by the space processing unit 26 with the sight line image acquired by the sight line image acquisition unit 21 based on the position, orientation, and three-dimensional coordinate space of the HMD9 calculated by the second reference setting unit 23. Note that the synthesis processing performed by the image synthesis unit 27 may be performed by a known method used in Augmented Reality (AR) or the like, and therefore, the description thereof is omitted here.
The display processing unit 28 displays the composite image obtained by the image compositing unit 27 on the HMD 9. In the present embodiment, since the 2 line-of-sight images corresponding to the respective lines of sight of the subject person are processed as described above, the display processing unit 28 displays the respective line-of-sight images and the respective combined images combined on the displays 9c and 9d of the HMD9, respectively.
Fig. 6 is a diagram showing an example of a composite image displayed on the HMD 9. The composite image shown in the example of Fig. 6 is formed from a virtual 3D space including a 3D topographic map, an airplane, an airport, and the like, and a line-of-sight image including both hands of the subject person (user). The user can freely manipulate the virtual 3D space included in the image by moving both hands while observing the image on the HMD 9.
[ action example ]
Hereinafter, the three-dimensional operation method of the first embodiment will be described with reference to fig. 7. Fig. 7 is a sequence diagram showing an example of the operation of the 3D-UI device 1 according to the first embodiment.
The sensor-side device 10 successively acquires 3D information from the 3D sensor 8 (S71). The sensor-side device 10 operates as follows for the 3D information at a predetermined frame rate.
The sensor-side device 10 detects a common real object from the 3D information (S72).
Next, the sensor-side device 10 sets a 3D coordinate space based on the detected common real object, and calculates the position and orientation of the 3D sensor 8 in the 3D coordinate space (S73).
Then, the sensor-side device 10 calculates 3D position information of the specific part of the target person using the 3D information (S74). Then, the sensor-side device 10 converts the 3D position information calculated in the step (S74) into 3D position information on the 3D coordinate space set in the step (S73) based on the position, orientation, and 3D coordinate space of the 3D sensor 8 calculated in the step (S73) (S75).
Then, the sensor-side device 10 acquires the state information on the specific part of the subject person (S76).
The sensor-side device 10 transmits the 3D position information obtained in the step (S75) and the state information obtained in the step (S76) to the display-side device 20 regarding the specific part of the subject person (S77).
In fig. 7, for convenience of explanation, an example in which the acquisition of the 3D information (S71) and the acquisition of the state information (S76) are sequentially performed is shown, but when the state information of the specific portion is obtained from other than the 3D information, the steps (S71) and (S76) are performed in parallel. In fig. 7, an example in which the processes (S72) and (S73) are performed at a predetermined frame rate of the 3D information is shown, but the processes (S72) and (S73) may be performed only at the time of calibration.
On the other hand, the display-side device 20 sequentially acquires the line-of-sight images from the HMD9 asynchronously with the acquisition of the 3D information (S71) (S81). The display-side device 20 operates as follows for the sight-line image at a predetermined frame rate.
The display-side device 20 detects a common real object from the sight-line image (S82).
Next, the display-side device 20 sets a 3D coordinate space based on the detected common real object, and calculates the position and orientation of the HMD9 in the 3D coordinate space (S83).
The display-side device 20 generates virtual 3D space data arranged in the set 3D coordinate space (S84).
When the display-side device 20 receives the 3D position information and the state information about the specific part of the subject person from the sensor-side device 10 (S85), the predetermined process corresponding to the gesture of the subject person is specified based on the combination of the change in the 3D position information and the state information of the specific part (S86). When there are a plurality of specific portions, the display-side device 20 specifies a predetermined process based on a combination of a change in the positional relationship between the plurality of specific portions and the plurality of status information.
The display-side device 20 applies the predetermined processing determined in step (S86) to the virtual 3D space data generated in step (S84) (S87). Next, the display-side device 20 combines the virtual 3D space data subjected to the predetermined processing with the line-of-sight image to generate display data (S88).
The display-side device 20 displays the image obtained by the synthesis on the HMD9 (S89).
Fig. 7 shows an example in which the process of the information on the specific part of the target person transmitted from the sensor-side device 10 (step (S85) to step (S87)) and the process of generating the virtual 3D space data (step (S82) to step (S84)) are sequentially executed for convenience of explanation. However, the process (S85) to the process (S87), and the process (S82) to the process (S84) are performed in parallel. Also, in fig. 7, an example in which the processes (S82) to (S84) are performed at a predetermined frame rate of the sight-line image is shown, but the processes (S82) to (S84) may be performed only at the time of calibration.
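The following is a compressed restatement of the S71 to S89 flow as two per-frame routines; the helper objects (sensor_side, display_side, link) and their method names are hypothetical stand-ins for the processing units described above, not the patent's implementation.

```python
def sensor_side_frame(sensor_side, link):
    info = sensor_side.acquire_3d_info()                      # S71
    marker = sensor_side.detect_common_real_object(info)      # S72
    space = sensor_side.set_coordinate_space(marker)          # S73 (or only at calibration)
    pos_camera = sensor_side.locate_specific_part(info)       # S74
    pos = space.from_camera(pos_camera)                       # S75
    state = sensor_side.acquire_state(info)                   # S76
    link.send(pos, state)                                     # S77

def display_side_frame(display_side, link):
    image = display_side.acquire_line_of_sight_image()        # S81
    marker = display_side.detect_common_real_object(image)    # S82
    space = display_side.set_coordinate_space(marker)         # S83 (or only at calibration)
    data = display_side.generate_virtual_space(space)         # S84
    pos, state = link.receive()                               # S85
    operation = display_side.determine_operation(pos, state)  # S86
    data = display_side.apply_operation(operation, data)      # S87
    composite = display_side.composite(image, data, space)    # S88
    display_side.show(composite)                              # S89
```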
[ actions and effects of the first embodiment ]
As described above, in the first embodiment, the line-of-sight image of the subject person is acquired, and an image obtained by combining this line-of-sight image with the virtual 3D space is displayed in the field of view of the subject person in a video see-through manner. This allows the subject person to visually confirm the virtual 3D space as if it existed in front of his or her own eyes. In the first embodiment, since the specific part (hand or the like) with which the subject person operates the virtual 3D space is captured in the line-of-sight image, the subject person can feel as if the virtual 3D space is being operated with his or her own specific part. That is, according to the first embodiment, the subject person can visually confirm the virtual 3D space intuitively and can be given an intuitive feeling of operating the virtual 3D space.
In the first embodiment, the HMD 9 for obtaining the line-of-sight image of the subject person and the 3D sensor 8 for obtaining the position of the specific part of the subject person are provided separately. Thus, according to the first embodiment, the 3D sensor 8 can be placed at a position where the 3D position of the specific part of the subject person can be measured accurately. This is because the 3D sensor 8 may be unable to measure the position of a measurement object accurately unless it is separated from the object by a certain distance.
In the first embodiment, a common real object is used, and a common 3D coordinate space is set between the sensors based on information obtained by the sensors (the 3D sensor 8 and the HMD9) provided separately. Then, the position of the specific part of the subject person is determined using the common 3D coordinate space, and virtual 3D space data is generated and operated. Therefore, according to the first embodiment, the target person can intuitively recognize the relationship between the virtual 3D space and the position of the specific part of the target person, and as a result, the target person can be given an intuitive sense of operation in the virtual 3D space.
In the first embodiment, the predetermined processing to be applied to the virtual 3D space data is determined based on a combination of the position change and the state of the specific part of the subject person, and the virtual 3D space subjected to the predetermined processing is combined with the line-of-sight image. Thus, the subject person can operate the virtual 3D space with a 3D gesture of his or her own specific part. Therefore, according to the first embodiment, it is possible to provide a user interface with which the virtual 3D space can be operated intuitively and easily.
[ second embodiment ]
In the first embodiment, the virtual 3D space is manipulated by the movement of the specific part of the subject person himself included in the composite image displayed in HMD 9. In the second embodiment, a virtual object directly operated by a specific part of a subject person is displayed as a substitute for a virtual 3D space to be an actual operation target, and the operation of the virtual 3D space is enabled by the operation of the specific part of the subject person with respect to the virtual object. Hereinafter, the 3D-UI device 1 according to the second embodiment will be described mainly focusing on differences from the first embodiment. In the following description, the same contents as those of the first embodiment are appropriately omitted.
[ treatment Structure ]
In the second embodiment, the following processing section of the display-side device 20 is different from that of the first embodiment.
The virtual data generation unit 24 generates virtual object data in a virtual 3D space arranged in the display area. The virtual object displayed by the virtual object data has, for example, a spherical shape.
The operation determination unit 25 determines whether the specific part of the subject person is within a predetermined 3D range set with reference to the virtual object, based on the 3D position information calculated by the position calculation unit 14 of the sensor-side device 10. When the specific part of the subject person is within the predetermined 3D range, the operation determination unit 25 determines the following rotation process as the predetermined process to be executed by the space processing unit 26. Specifically, as the predetermined process, the operation determination unit 25 determines a rotation process that uses a specific point of the virtual object as a reference point, by a solid angle change amount corresponding to the change in direction (solid angle) of the line segment connecting the specific part of the subject person and the specific point of the virtual object. As the specific point of the virtual object, for example, the center point (center of gravity) of the virtual object is used.
When the specific part of the subject person is outside the predetermined 3D range, the operation determination unit 25 causes the space processing unit 26 not to execute any processing on the virtual 3D space data; in this case, the subject person cannot operate the virtual 3D space. Alternatively, when the specific part of the subject person is outside the predetermined 3D range, the operation determination unit 25 may determine a predetermined process by the method of the first embodiment instead of the method of the second embodiment.
The operation determination unit 25 may also detect movement of the specific part of the subject person from inside the predetermined 3D range to outside it, and determine, as the predetermined process, a rotation process corresponding to the movement distance and movement direction between the position inside the predetermined 3D range before the movement and the position outside the predetermined 3D range after the movement. Specifically, this rotation process rotates the space about the rotation axis by an angle corresponding to the movement distance, in an angular direction corresponding to the movement direction. Thus, the subject person can give the virtual 3D space an inertial rotation at the moment the operation on the virtual 3D space is about to become disabled. Whether such inertial rotation is enabled or disabled can be switched by a setting.
Fig. 8 is a diagram showing an example of a composite image displayed on HMD9 according to the second embodiment. The composite image shown in the example of fig. 8 includes a virtual object VO in addition to the configuration of the first embodiment. The user can freely manipulate the virtual 3D space included in the image by moving both hands of the user within a predetermined 3D range with reference to the virtual object VO while observing the image through the HMD 9.
[ action example ]
The three-dimensional operation method according to the second embodiment differs from that of the first embodiment in steps (S84) and (S86) shown in Fig. 7. Specifically, in step (S84), virtual object data is also generated as part of the virtual 3D space data, and in step (S86), the relationship between the predetermined 3D range set with reference to the virtual object and the position of the specific part is also determined.
[ actions and effects of the second embodiment ]
As described above, in the second embodiment, the virtual object is displayed within the virtual 3D space on the 3D coordinate space; the rotation process on the virtual 3D space data is executed when the specific part of the subject person is within the predetermined 3D range set with reference to the virtual object, and is not executed when it is not. The rotation process, which uses the specific point of the virtual object as a reference point, is determined by a solid angle change amount corresponding to the change in direction (solid angle) of the line segment connecting the specific part of the subject person and the specific point of the virtual object, and is applied to the virtual 3D space.
In this way, in the second embodiment, an intangible operation object whose overall shape cannot be visually recognized, such as the virtual 3D space, is represented by a substitute image having a tangible shape, such as the virtual object, and operation of the intangible operation object by the subject person is realized by detecting the movement of the specific part of the subject person with respect to this virtual substitute. The subject person thereby feels as if the virtual object is actually being operated with his or her own specific part, and the virtual 3D space is operated accordingly, so the subject person also obtains a feeling of intuitively operating the virtual 3D space. Therefore, according to the second embodiment, the subject person can operate the virtual 3D space, which is an intangible operation object, even more intuitively.
The above embodiments will be described in more detail below with reference to examples. The present invention is not limited to the following examples. Example 1 below corresponds to a specific example of the first embodiment, and example 2 below corresponds to a specific example of the second embodiment. In each of the following examples, "one hand" or "both hands" are used as the specific part of the subject person.
Example 1
Fig. 9 is a diagram showing an example of the operation of moving the virtual 3D space according to example 1. In Fig. 9, the axes of the 3D coordinate space are represented as the X axis, Y axis, and Z axis, and the virtual 3D space corresponding to the virtual 3D space data generated by the virtual data generation unit 24 is denoted by the symbol VA. Although the size of the virtual 3D space is not limited, Fig. 9 illustrates a virtual 3D space VA of limited size for convenience of explanation. The area of the virtual 3D space VA displayed on the HMD 9 is represented as the display area DA.
In the example of Fig. 9, the subject person moves one hand in the negative direction of the X axis while keeping that hand in the held state. This one-handed motion can be said to be a gesture of grasping the space and pulling it in that direction. When recognizing this gesture, the display-side device 20 moves the virtual 3D space VA in the negative direction of the X axis by a distance corresponding to the linear movement amount of the hand. As a result, the display area DA moves in the positive direction of the X axis within the virtual 3D space VA, and a part of the virtual 3D space VA that had not been displayed until then is displayed on the HMD 9.
As described above, the subject person can move the virtual 3D space, and thereby move his or her own field of view, by performing a gesture of grasping the space with one hand and pulling it in a certain direction. This gesture and the operation of the virtual 3D space are thus intuitively linked.
When it is detected that the one hand of the subject person remains held and does not move for a certain period of time, the display-side device 20 displays a menu screen for calling up other functions in the display area of the virtual 3D space, thereby enabling user operations on the menu screen.
When it is detected that one hand of the subject person is held and the other hand also becomes held, the display-side device 20 enables the operations of enlarging, reducing, and rotating the virtual 3D space. In this state, when the display-side device 20 detects that either hand has changed to the open state, it disables the operations of enlarging, reducing, and rotating the virtual 3D space.
Fig. 10 is a diagram showing an example of the reduction operation of the virtual 3D space according to example 1. Like Fig. 9, Fig. 10 shows the X axis, Y axis, and Z axis, the virtual 3D space VA, and the display area DA. In the example of Fig. 10, one hand is moved diagonally downward to the left in the drawing while both hands of the subject person remain held. This hand motion can be said to be a gesture of grasping the space with both hands and compressing it between them. When recognizing this gesture, the display-side device 20 reduces the virtual 3D space VA at a reduction rate corresponding to the degree by which the distance between the two hands shrinks. At this time, the display-side device 20 takes the position of the first-held hand, at the time the enlargement, reduction, and rotation operations became effective, as the reference point of the reduction processing.
Although not shown in Fig. 10, when it is detected that, while both hands remain held, one hand moves in a direction that separates the two hands, the display-side device 20 enlarges the virtual 3D space VA at an enlargement rate corresponding to the degree by which the distance between the two hands extends. This motion can be said to be a gesture of grasping the space with both hands and stretching it.
As described above, the subject person can reduce or enlarge the virtual 3D space by performing a gesture of compressing or stretching the space while grasping it with both hands. This gesture, too, is intuitively linked to the operation of the virtual 3D space.
Fig. 11 is a diagram showing an example of the rotation operation of the virtual 3D space according to example 1. Like Figs. 9 and 10, Fig. 11 shows the X axis, Y axis, and Z axis, the virtual 3D space VA, and the display area DA. In the example of Fig. 11, both hands of the subject person are held, and one hand is moved in an angular direction different from the original direction of the line segment connecting the two hands. This hand motion can be said to be a gesture of grasping the space with both hands, keeping one part of it fixed, and pulling the other part in a certain direction. When recognizing this gesture, the display-side device 20 rotates the virtual 3D space VA by a solid angle change amount corresponding to the change in direction between the straight line connecting the two hands at the time each operation became effective and the straight line connecting the two hands after one hand has moved. At this time, the display-side device 20 takes the position of the first-held hand, at the time each operation became effective, as the reference point (rotation axis) of the rotation.
After enabling the enlargement, reduction, and rotation operations, the display-side device 20 determines whether to enlarge or reduce, or to rotate, based on the change in the vector (line segment) connecting the two held hands. Specifically, the display-side device 20 compares the unit vector obtained by normalizing the vector at the time the enlargement, reduction, and rotation operations became effective, that is, when both hands became held, with the unit vectors of the subsequent vectors. When the unit vectors are approximately equal, it performs the enlargement or reduction process corresponding to the change in the magnitude of the vector; otherwise, it performs the rotation process.
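The following sketch illustrates this determination; the angular tolerance is a hypothetical parameter introduced here for illustration, not a value taken from the patent.

```python
# Sketch of the enlarge/reduce-versus-rotate determination based on comparing
# the unit vector of the two-hand line segment before and after movement.
import numpy as np

def classify_two_hand_gesture(v_effective, v_now, angle_tol_deg=10.0):
    """v_effective: hand-to-hand vector when the operations became effective;
    v_now: current hand-to-hand vector. Returns ("scale", rate) or ("rotate", deg)."""
    u0 = np.asarray(v_effective, float) / np.linalg.norm(v_effective)
    u1 = np.asarray(v_now, float) / np.linalg.norm(v_now)
    angle_deg = np.degrees(np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0)))
    if angle_deg < angle_tol_deg:                             # directions approximately equal
        return "scale", np.linalg.norm(v_now) / np.linalg.norm(v_effective)
    return "rotate", angle_deg
```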
The subject person can thus rotate the virtual 3D space by performing a gesture of grasping the space with both hands, keeping one part of it fixed, and pulling the other part in a certain direction. This gesture, too, is intuitively linked to the operation of the virtual 3D space.
Example 2
In example 2, the display-side device 20 displays a spherical virtual object VO, as shown in the example of Fig. 8, in the display area of the virtual 3D space. The display-side device 20 sets a predetermined 3D range with the virtual object as a reference, and enables the rotation operation on the virtual 3D space while one hand of the subject person is within that predetermined 3D range. In example 2, the subject person can operate the virtual 3D space with one hand.
The display-side device 20 obtains the change in direction (solid angle) of the line segment connecting the one hand of the subject person and the center point of the virtual object before and after the movement of that hand, and rotates the virtual 3D space by a solid angle change amount equal to that change. At this time, the reference point of the rotation is the center point of the virtual object. By this rotation operation, the virtual object rotates together with the virtual 3D space.
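As an illustrative sketch of this rotation (reusing the assumed 4x4 space_pose representation from the earlier sketch; it is not the patent's implementation), the rotation about the virtual object's center can be built from the direction change of the center-to-hand segment using the Rodrigues formula.

```python
# Sketch: rotate the virtual 3D space about the virtual object's centre by the
# direction change of the segment from the centre to the subject person's hand.
import numpy as np

def rotate_about_object(space_pose, center, hand_before, hand_after):
    c = np.asarray(center, float)
    d0 = np.asarray(hand_before, float) - c
    d1 = np.asarray(hand_after, float) - c
    d0 /= np.linalg.norm(d0)
    d1 /= np.linalg.norm(d1)
    angle = np.arccos(np.clip(np.dot(d0, d1), -1.0, 1.0))     # rotation angle (rad)
    axis = np.cross(d0, d1)
    n = np.linalg.norm(axis)
    if n < 1e-9:
        return space_pose                                     # no direction change
    axis /= n
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)  # Rodrigues formula
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = c - R @ c                                      # rotate about the centre point
    return T @ space_pose
```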
When the specific part of the subject person moves out of the predetermined 3D range, the display-side device 20 disables the operation on the virtual 3D space. The display-side device 20 can also rotate the virtual 3D space with inertia at the timing when the rotation operation changes from enabled to disabled. In this case, when it is detected that the hand of the subject person moves from inside the predetermined 3D range to outside it, the display-side device 20 applies, to the virtual 3D space, a rotation corresponding to the movement distance and movement direction between the position inside the predetermined 3D range before the movement and the position outside the predetermined 3D range after the movement.
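One possible realization of this inertial rotation is sketched below. The per-frame damping factor and the cut-off angle are assumptions; the embodiment only states that a rotation corresponding to the exit movement is applied with inertia.

import numpy as np
from scipy.spatial.transform import Rotation

DECAY = 0.95      # assumed per-frame damping of the inertial rotation
MIN_ANGLE = 1e-3  # assumed angle (radians) below which the inertia is considered to have stopped

def start_inertia(object_center, last_inside_pos, first_outside_pos):
    # Build the initial inertial rotation from the hand movement that crossed the
    # boundary of the predetermined 3D range (its distance and direction).
    d0 = last_inside_pos - object_center
    d1 = first_outside_pos - object_center
    axis = np.cross(d0, d1)
    norm = np.linalg.norm(axis)
    if norm < 1e-9:
        return None
    cos_angle = np.dot(d0, d1) / (np.linalg.norm(d0) * np.linalg.norm(d1))
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    return axis / norm, angle

def step_inertia(space_points, object_center, inertia):
    # Apply one frame of inertial rotation; return the rotated space and the damped state.
    if inertia is None:
        return space_points, None
    axis, angle = inertia
    rot = Rotation.from_rotvec(axis * angle)
    rotated = object_center + rot.apply(space_points - object_center)
    angle = angle * DECAY
    return rotated, ((axis, angle) if angle > MIN_ANGLE else None)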
Thus, the subject person can rotate the virtual 3D space together with the virtual object VO by operating the spherical virtual object VO as if spinning a globe. Since the virtual 3D space is operated in the same manner as the virtual object VO, the subject person can operate the virtual 3D space intuitively.
[Modified Examples]
In the first and second embodiments described above, as shown in fig. 3, the HMD9 includes the line-of-sight cameras 9a and 9b and the displays 9c and 9d corresponding to both eyes of the subject person (user), but it may instead include one line-of-sight camera and one display. In that case, the single display may be arranged to cover the field of view of one eye of the subject person, or to cover the field of view of both eyes. In that configuration, the virtual data generation unit 24 of the display-side device 20 may generate the virtual 3D space data using a known 3DCG technique so that a display object included in the virtual 3D space is displayed by 3DCG.
In the first and second embodiments, the video see-through HMD9 is used to obtain the line-of-sight image, but an optical see-through HMD9 may be used instead. In this case, the HMD9 may be provided with half-mirror displays 9c and 9d, and the virtual 3D space may be displayed on the displays 9c and 9d. However, a camera that captures an image used to detect the common real object in the line-of-sight direction of the subject person is then provided at a position on the HMD9 that does not obstruct the field of view of the subject person.
In the first and second embodiments, as shown in fig. 1, the sensor-side device 10 and the display-side device 20 are provided separately, and the virtual 3D space is synthesized with the line-of-sight image of the subject person; however, an image obtained by synthesizing the virtual 3D space with the two-dimensional image included in the 3D information obtained by the sensor-side device 10 may be displayed instead.
Fig. 12 conceptually shows an example of the hardware configuration of the 3D-UI device 1 according to the modification. The 3D-UI device 1 includes a processing device 50, a 3D sensor 8, and a display device 51. The processing device 50 includes a CPU2, a memory 3, an input/output I/F5, and the like, and the input/output I/F5 is connected to the 3D sensor 8 and the display device 51. The display device 51 displays the composite image.
Fig. 13 is a diagram conceptually showing an example of the processing configuration of the 3D-UI device 1 according to the modification. The 3D-UI device 1 according to the modification includes the 3D information acquisition unit 11, the position calculation unit 14, and the state acquisition unit 15 included in the sensor-side device 10 according to each of the above embodiments, and further includes the virtual data generation unit 24, the operation determination unit 25, the spatial processing unit 26, the image synthesis unit 27, and the display processing unit 28 included in the display-side device 20 according to each of the above embodiments. These processing units are similar to those in the above embodiments, except for the following points.
The position calculation unit 14 obtains the three-dimensional position information of the specific part of the subject person directly from the three-dimensional information acquired from the 3D sensor 8 by the 3D information acquisition unit 11. The operation determination unit 25 determines the predetermined process based on the three-dimensional position information in the camera coordinate system calculated by the position calculation unit 14 and the state information acquired by the state acquisition unit 15. The image synthesis unit 27 synthesizes the two-dimensional image included in the three-dimensional information acquired by the 3D information acquisition unit 11 with the virtual 3D space data subjected to the predetermined processing by the spatial processing unit 26.
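The synthesis performed by the image synthesis unit 27 in this modification can be pictured as projecting the processed virtual 3D space data into the two-dimensional image using the camera parameters of the 3D sensor 8. The pinhole-projection model and the intrinsic parameter values below are assumptions made purely for illustration and are not defined by this modification.

import numpy as np

# Assumed intrinsic parameters of the 3D sensor's color camera (illustrative values only)
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

def composite(image, virtual_points_camera, color=(0, 255, 0)):
    # Overlay virtual 3D space points, already expressed in the camera coordinate system
    # of the 3D sensor, onto the two-dimensional image included in the 3D information.
    out = image.copy()
    height, width = out.shape[:2]
    for x, y, z in virtual_points_camera:
        if z <= 0:
            continue                          # point behind the camera
        u = int(round(FX * x / z + CX))       # pinhole projection to pixel coordinates
        v = int(round(FY * y / z + CY))
        if 0 <= u < width and 0 <= v < height:
            out[v, u] = color
    return out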
In this modification, the subject person operates the virtual 3D space while observing his or her own video captured from a direction other than his or her own line of sight. This modification may therefore be less intuitive than the above-described embodiments, which use the line-of-sight image of the subject person; nevertheless, because the virtual 3D space can be operated by a 3D gesture using the specific part, the operation remains sufficiently easy to understand.
Although the flowcharts used in the above description show a plurality of steps (processes) in a sequential order, the order in which the steps are executed in the present embodiment is not limited to the order described. In the present embodiment, the order of the illustrated steps can be changed as long as the change does not affect the substance of the processing. The embodiments and modifications described above can be combined as long as they do not contradict one another.
The above-described contents can also be expressed as in the attached notes below. However, the above embodiments, modifications, and examples are not limited to the following descriptions.
(attached note 1)
A three-dimensional user interface device is provided with:
a three-dimensional information acquisition unit that acquires three-dimensional information from a three-dimensional sensor;
a position calculation unit that calculates three-dimensional position information on a three-dimensional coordinate space related to a specific part of the subject person, using the three-dimensional information acquired by the three-dimensional information acquisition unit;
a virtual data generation unit configured to generate virtual three-dimensional space data representing a virtual three-dimensional space which is arranged in the three-dimensional coordinate space and at least partially set in a display area;
a spatial processing unit that performs, on the three-dimensional coordinate space or the virtual three-dimensional space data, predetermined processing corresponding to a change in the three-dimensional position information on the specific part of the subject person; and
a display processing unit configured to display the virtual three-dimensional space in the display area on a display unit based on the virtual three-dimensional space data obtained by the predetermined processing performed by the spatial processing unit.
(attached note 2)
The three-dimensional user interface device according to supplementary note 1, wherein,
the three-dimensional user interface device is further provided with:
a state acquisition unit configured to acquire state information of the specific part of the subject person; and
an operation determination unit that determines the predetermined process to be executed by the spatial processing unit from among a plurality of predetermined processes based on a combination of the state information acquired by the state acquisition unit and the change in the three-dimensional position information.
(attached note 3)
The three-dimensional user interface device according to supplementary note 2, wherein,
the position calculation section calculates three-dimensional position information of one hand of the subject person as a specific part of the subject person,
the state acquisition unit acquires state information of the one hand of the subject person as a specific part of the subject person,
the operation determination unit determines, as the predetermined processing, processing of moving the three-dimensional coordinate space or the virtual three-dimensional space data by a distance corresponding to the amount of linear movement of the one hand while the one hand is maintained in a specific state.
(attached note 4)
The three-dimensional user interface device according to supplementary note 2 or 3, wherein,
the position calculating section calculates three-dimensional position information on a three-dimensional coordinate space relating to a plurality of specific portions of the subject person,
the state acquisition unit acquires each piece of state information on the plurality of specific portions of the subject person,
the operation determination unit calculates a positional relationship between the plurality of specific portions based on the plurality of pieces of three-dimensional position information on the plurality of specific portions calculated by the position calculation unit, and determines the predetermined process from among a plurality of predetermined processes based on a change in the calculated positional relationship and the plurality of pieces of state information acquired by the state acquisition unit.
(attached note 5)
The three-dimensional user interface device according to supplementary note 4, wherein,
the position calculating section calculates three-dimensional position information of both hands of the subject person as the plurality of specific portions,
the state acquisition unit acquires state information of both hands of the subject person as the plurality of specific parts,
the operation determination unit determines, as the predetermined processing, enlargement processing or reduction processing that uses the position of one hand of the subject person as a reference point, at an enlargement rate or a reduction rate corresponding to the change amount of the distance between both hands of the subject person, or rotation processing that uses the position of one hand of the subject person as a reference point, by a solid-angle change amount corresponding to the change amount of the solid angle of a line segment connecting both hands of the subject person.
(attached note 6)
The three-dimensional user interface device according to any one of supplementary notes 2 to 5, wherein,
the operation determination unit determines whether or not the state information acquired by the state acquisition unit indicates a specific state, and determines whether or not to cause the spatial processing unit to execute the predetermined process based on the determination result.
(attached note 7)
The three-dimensional user interface device according to any one of supplementary notes 1 to 6, wherein,
the virtual data generation unit generates virtual object data in a virtual three-dimensional space arranged in the display area,
the operation determination unit determines whether or not the specific portion of the subject person is present within a predetermined three-dimensional range with respect to the virtual object, based on the three-dimensional position information calculated by the position calculation unit, and determines whether or not to cause the spatial processing unit to execute the predetermined process, based on the determination result.
(attached note 8)
The three-dimensional user interface device according to supplementary note 7, wherein,
the operation determination unit determines, as the predetermined processing, rotation processing that uses a specific point of the virtual object as a reference point, by a solid-angle change amount corresponding to the change amount of the solid angle of a line segment connecting the specific portion of the subject person and the specific point of the virtual object.
(attached note 9)
The three-dimensional user interface device according to supplementary note 7 or 8, wherein,
the operation determination unit detects movement of the specific portion of the subject person from within the predetermined three-dimensional range to outside the predetermined three-dimensional range, and determines, as the predetermined processing, rotation processing corresponding to the distance and direction between the position within the predetermined three-dimensional range and the position outside the predetermined three-dimensional range before and after the movement.
(attached note 10)
The three-dimensional user interface device according to any one of supplementary notes 2 to 9, wherein,
the operation determination unit measures a period during which the state information acquired by the state acquisition unit and the three-dimensional position information do not change, and determines, as the predetermined process, a process of adding display data of a function menu when the measured period exceeds a predetermined period.
(attached note 11)
The three-dimensional user interface device according to any one of supplementary notes 1 to 10, wherein,
the disclosed device is provided with: a first object detection unit that detects a known general-purpose real object from the three-dimensional information;
a first reference setting unit that sets the three-dimensional coordinate space based on the common real object detected by the first object detection unit, and calculates a position and an orientation of the three-dimensional sensor;
a line-of-sight image acquisition unit that acquires a line-of-sight image obtained by imaging the specific part of the subject person from an imaging unit arranged at a position and in a different orientation from those of the three-dimensional sensor;
a second object detection unit that detects the known common real object from the sight-line image acquired by the sight-line image acquisition unit;
a second reference setting unit that shares the three-dimensional coordinate space based on the common real object detected by the second object detection unit, and calculates a position and an orientation of the imaging unit; and
an image combining unit that combines a virtual three-dimensional space within the display area with the sight line image captured by the imaging unit, based on the position and orientation of the imaging unit calculated by the second reference setting unit and the three-dimensional coordinate space,
the position calculation unit calculates the three-dimensional position information on the three-dimensional coordinate space by converting three-dimensional position information on the specific part of the subject person acquired from the three-dimensional information acquired by the three-dimensional information acquisition unit, based on the position and orientation of the three-dimensional sensor calculated by the first reference setting unit and the three-dimensional coordinate space,
the display processing unit causes the display unit to display the image obtained by the image synthesizing unit.
(attached note 12)
A method of three-dimensional manipulation performed by at least one computer, comprising:
three-dimensional information is obtained from a three-dimensional sensor,
calculating three-dimensional position information on a three-dimensional coordinate space related to a specific part of the subject person using the acquired three-dimensional information,
generating virtual three-dimensional space data representing a virtual three-dimensional space arranged in the three-dimensional coordinate space and at least partially set in a display region,
performing predetermined processing corresponding to a change in the three-dimensional position information on the specific portion of the subject person with respect to the three-dimensional coordinate space or the virtual three-dimensional space data,
and displaying the virtual three-dimensional space in the display area on a display unit based on the virtual three-dimensional space data obtained by performing the predetermined processing.
(attached note 13)
The three-dimensional operation method according to supplementary note 12, wherein,
the three-dimensional operation method further includes:
acquiring state information of the specific part of the subject,
the predetermined process is determined from among a plurality of predetermined processes based on a combination of the acquired state information and the change in the three-dimensional position information.
(attached note 14)
The three-dimensional operation method according to supplementary note 13, wherein,
in the calculation of the three-dimensional position information, three-dimensional position information of one hand of the subject person as a specific part of the subject person is calculated,
acquiring the state information of the one hand of the subject person as a specific part of the subject person,
in the determination of the predetermined processing, a process of moving the three-dimensional coordinate space or the virtual three-dimensional space data by a distance corresponding to the amount of linear movement of the one hand while the one hand is maintained in a specific state is determined as the predetermined processing.
(attached note 15)
The three-dimensional operation method according to supplementary note 13 or 14, wherein,
in the calculation of the three-dimensional position information, three-dimensional position information on a three-dimensional coordinate space relating to a plurality of specific portions of the subject person is calculated,
the state information acquisition unit acquires each of the state information on the plurality of specific parts of the subject person,
in the determination of the predetermined processing, the positional relationship between the plurality of specific portions is calculated based on the plurality of pieces of three-dimensional position information on the plurality of specific portions obtained by the calculation, and the predetermined processing is determined from among a plurality of predetermined processes based on a change in the calculated positional relationship and the acquired plurality of pieces of state information.
(attached note 16)
The three-dimensional operation method according to supplementary note 15, wherein,
in the calculation of the three-dimensional position information, three-dimensional position information of both hands of the subject person as the plurality of specific parts is calculated,
in the acquisition of the state information, state information of both hands of the subject person as the plurality of specific parts is acquired,
in the determination of the predetermined processing, enlargement processing or reduction processing that uses the position of one hand of the subject person as a reference point, at an enlargement rate or a reduction rate corresponding to the change amount of the distance between both hands of the subject person, or rotation processing that uses the position of one hand of the subject person as a reference point, by a solid-angle change amount corresponding to the change amount of the solid angle of a line segment connecting both hands of the subject person, is determined as the predetermined processing.
(attached note 17)
The three-dimensional operation method according to any one of supplementary notes 13 to 16, wherein,
determining whether the acquired state information indicates a specific state,
and deciding whether to implement the predetermined processing according to the determination result.
(attached note 18)
The three-dimensional operation method according to any one of supplementary notes 13 to 17, wherein,
generating virtual object data in a virtual three-dimensional space configured within the display area,
determining whether the specific portion of the subject person is present within a predetermined three-dimensional range with respect to the virtual object based on the calculated three-dimensional position information,
and deciding whether to implement the predetermined processing according to the determination result.
(attached note 19)
The three-dimensional operation method according to supplementary note 18, wherein,
the predetermined processing determination determines, as the predetermined processing, rotation processing in which the specific point of the virtual object is set as a reference point, based on a solid angle change amount corresponding to a solid angle change amount of a line segment connecting the specific point of the target person and the specific point of the virtual object.
(attached note 20)
The three-dimensional operation method according to supplementary note 18 or 19, wherein,
in the determination of the predetermined processing, a movement of the specific portion of the subject person from inside the predetermined three-dimensional range to outside the predetermined three-dimensional range is detected, and rotation processing corresponding to a distance and a direction between a position in the predetermined three-dimensional range and a position outside the predetermined three-dimensional range before and after the movement is determined as the predetermined processing.
(attached note 21)
The three-dimensional operation method according to any one of supplementary notes 13 to 20, wherein,
in the specifying of the predetermined process, a period in which the acquired state information and the three-dimensional position information do not change is measured, and if the measured period exceeds a predetermined period, a process of adding display data of a function menu is specified as the predetermined process.
(attached note 22)
The three-dimensional operation method according to any one of supplementary notes 12 to 21, wherein,
the method comprises the following steps: detecting a known generic real object from the three-dimensional information,
setting the three-dimensional coordinate space based on the detected general actual object, and calculating the position and orientation of the three-dimensional sensor,
acquiring a line-of-sight image obtained by imaging the specific part of the subject person from an imaging unit arranged at a position and in a direction different from those of the three-dimensional sensor,
detecting the known generic real object from the acquired sight-line image,
sharing the three-dimensional coordinate space based on the detected common real object, and calculating the position and orientation of the image pickup unit,
synthesizing a virtual three-dimensional space within the display area with the sight line image captured by the imaging unit based on the calculated position and orientation of the imaging unit and the three-dimensional coordinate space,
displaying the image obtained by the synthesis on the display unit,
in the calculation of the three-dimensional position information, the three-dimensional position information on the three-dimensional coordinate space is calculated by converting three-dimensional position information related to the specific part of the subject person acquired from the acquired three-dimensional information, based on the calculated position and orientation of the three-dimensional sensor and the three-dimensional coordinate space.
(attached note 23)
A program for causing at least one computer to execute the three-dimensional operation method according to any one of supplementary notes 12 to 21.
(attached note 24)
A computer-readable recording medium having recorded thereon the program described in supplementary note 23.
This application claims priority based on Japanese Patent Application No. 2012-167111, filed on July 27, 2012, the disclosure of which is hereby incorporated by reference in its entirety.

Claims (10)

1. A three-dimensional user interface device is provided with:
a three-dimensional information acquisition unit that acquires three-dimensional information from a three-dimensional sensor;
a position calculation unit that calculates three-dimensional position information on a three-dimensional coordinate space related to a specific part of the subject person, using the three-dimensional information acquired by the three-dimensional information acquisition unit;
a virtual data generation unit that generates virtual three-dimensional space data representing a virtual three-dimensional space which is arranged in the three-dimensional coordinate space and at least partially set in a display area, and generates a virtual object displayed in the display area in addition to the virtual three-dimensional space;
a spatial processing unit that performs, on the virtual three-dimensional space data, predetermined processing corresponding to a change in the three-dimensional position information on the specific portion of the subject person;
a display processing unit that displays the virtual object on a display unit and also displays a virtual three-dimensional space in the display area on the display unit based on virtual three-dimensional space data obtained by performing the predetermined processing by the spatial processing unit;
a state acquisition unit configured to acquire state information of the specific part of the subject person; and
an operation determination unit that determines the predetermined process to be executed by the spatial processing unit from among a plurality of predetermined processes based on a combination of a change in the position of the specific portion with reference to the virtual object and a change in the state of the specific portion,
the operation determination unit determines whether or not the specific portion of the subject person is present within a predetermined three-dimensional range with respect to the virtual object based on the three-dimensional position information calculated by the position calculation unit, and determines whether or not to cause the spatial processing unit to execute the predetermined process based on a result of the determination,
in a case where it is determined by the operation determination unit that the predetermined processing is to be executed by the spatial processing unit, the spatial processing unit executes the determined predetermined processing,
after the predetermined processing is performed on the virtual three-dimensional space, the display processing unit changes the virtual three-dimensional space displayed on the display unit to the virtual three-dimensional space on which the predetermined processing is performed.
2. The three-dimensional user interface device of claim 1,
the position calculation section calculates three-dimensional position information of one hand of the subject person as a specific part of the subject person,
the state acquisition unit acquires state information of the one hand of the subject person as a specific part of the subject person,
the operation determination unit determines, as the predetermined processing, processing of moving the virtual three-dimensional space by a distance corresponding to the amount of linear movement of the one hand while the one hand is maintained in a specific state.
3. The three-dimensional user interface device of claim 1 or 2,
the position calculating section calculates three-dimensional position information on a three-dimensional coordinate space relating to a plurality of specific portions of the subject person,
the state acquisition unit acquires each piece of state information on the plurality of specific portions of the subject person,
the operation determination unit calculates a positional relationship between the plurality of specific portions based on the plurality of pieces of three-dimensional position information on the plurality of specific portions calculated by the position calculation unit, and determines the predetermined process from among a plurality of predetermined processes based on a change in the calculated positional relationship and the plurality of pieces of state information acquired by the state acquisition unit.
4. The three-dimensional user interface device of claim 3,
the position calculating section calculates three-dimensional position information of both hands of the subject person as the plurality of specific portions,
the state acquisition unit acquires state information of both hands of the subject person as the plurality of specific parts,
the operation determination unit determines, as the predetermined processing, enlargement processing or reduction processing that uses the position of one hand of the subject person as a reference point, at an enlargement rate or a reduction rate corresponding to the change amount of the distance between both hands of the subject person, or rotation processing that uses the position of one hand of the subject person as a reference point, by a solid-angle change amount corresponding to the change amount of the solid angle of a line segment connecting both hands of the subject person.
5. The three-dimensional user interface device of claim 1 or 2,
the operation determination unit determines whether or not the state information acquired by the state acquisition unit indicates a specific state, and determines whether or not to cause the spatial processing unit to execute the predetermined process based on the determination result.
6. The three-dimensional user interface device of claim 1,
the operation determination unit determines, as the predetermined processing, rotation processing that uses a specific point of the virtual object as a reference point, by a solid-angle change amount corresponding to the change amount of the solid angle of a line segment connecting the specific portion of the subject person and the specific point of the virtual object.
7. The three-dimensional user interface device of claim 1,
the operation determination unit detects movement of the specific portion of the subject person from within the predetermined three-dimensional range to outside the predetermined three-dimensional range, and determines, as the predetermined processing, rotation processing corresponding to the distance and direction between the position within the predetermined three-dimensional range and the position outside the predetermined three-dimensional range before and after the movement.
8. The three-dimensional user interface device of claim 1 or 2,
the operation determination unit measures a period during which the state information acquired by the state acquisition unit and the three-dimensional position information do not change, and determines, as the predetermined process, a process of adding display data of a function menu when the measured period exceeds a predetermined period.
9. The three-dimensional user interface device of claim 1 or 2,
the disclosed device is provided with: a first object detection unit that detects a known general-purpose real object from the three-dimensional information;
a first reference setting unit that sets the three-dimensional coordinate space based on the common real object detected by the first object detection unit, and calculates a position and an orientation of the three-dimensional sensor;
a line-of-sight image acquisition unit that acquires a line-of-sight image obtained by imaging the specific part of the subject person from an imaging unit arranged at a position and in a different orientation from those of the three-dimensional sensor;
a second object detection unit that detects the known common real object from the sight-line image acquired by the sight-line image acquisition unit;
a second reference setting unit that shares the three-dimensional coordinate space based on the common real object detected by the second object detection unit, and calculates a position and an orientation of the imaging unit; and
an image combining unit that combines a virtual three-dimensional space within the display area with the sight line image captured by the imaging unit, based on the position and orientation of the imaging unit calculated by the second reference setting unit and the three-dimensional coordinate space,
the position calculation unit calculates the three-dimensional position information on the three-dimensional coordinate space by converting three-dimensional position information on the specific part of the subject person acquired from the three-dimensional information acquired by the three-dimensional information acquisition unit, based on the position and orientation of the three-dimensional sensor calculated by the first reference setting unit and the three-dimensional coordinate space,
the display processing unit causes the display unit to display the image obtained by the image synthesizing unit.
10. A method of three-dimensional manipulation performed by at least one computer, comprising:
a three-dimensional information acquisition step of acquiring three-dimensional information from a three-dimensional sensor;
a position calculation step of calculating three-dimensional position information on a three-dimensional coordinate space related to a specific part of the subject person, using the acquired three-dimensional information;
a virtual data generation step of generating virtual three-dimensional space data representing a virtual three-dimensional space which is arranged in the three-dimensional coordinate space and at least partially set in a display area, and generating a virtual object displayed in the display area in addition to the virtual three-dimensional space;
a space processing step of performing predetermined processing corresponding to a change in the three-dimensional position information on the specific part of the subject person with respect to the virtual three-dimensional space data;
a display processing step of displaying the virtual object on a display unit, and displaying a virtual three-dimensional space in the display area on the display unit based on virtual three-dimensional space data obtained by performing the predetermined processing in the space processing step;
a state acquisition step of acquiring state information of the specific part of the subject person;
a space processing step of performing predetermined processing on the virtual three-dimensional space; and
an operation determination step of determining the predetermined processing executed by the spatial processing step from among a plurality of predetermined processings based on a combination of a change in the position of the specific portion with reference to the virtual object and a change in the state of the specific portion,
in the operation determination step, it is determined whether or not the specific portion of the subject person is present within a predetermined three-dimensional range with respect to the virtual object based on the three-dimensional position information calculated in the position calculation step, and it is determined whether or not the predetermined processing is executed in the spatial processing step based on a result of the determination,
in a case where it is determined by the operation determination step that the predetermined processing is performed at the spatial processing step, the determined predetermined processing is performed at the spatial processing step,
after the predetermined processing is performed on the virtual three-dimensional space, the display processing step changes the virtual three-dimensional space displayed on the display unit to the virtual three-dimensional space on which the predetermined processing is performed.
HK15105864.5A 2012-07-27 2013-03-08 Three-dimensional user-interface device, and three-dimensional operation method HK1205311B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2012167111 2012-07-27
JP2012-167111 2012-07-27
PCT/JP2013/001520 WO2014016987A1 (en) 2012-07-27 2013-03-08 Three-dimensional user-interface device, and three-dimensional operation method

Publications (2)

Publication Number Publication Date
HK1205311A1 HK1205311A1 (en) 2015-12-11
HK1205311B true HK1205311B (en) 2018-03-29


Similar Documents

Publication Publication Date Title
JP5936155B2 (en) 3D user interface device and 3D operation method
JP5871345B2 (en) 3D user interface device and 3D operation method
JP5843340B2 (en) 3D environment sharing system and 3D environment sharing method
JP6057396B2 (en) 3D user interface device and 3D operation processing method
EP2977924A1 (en) Three-dimensional unlocking device, three-dimensional unlocking method and program
EP3931676B1 (en) Head mounted display device and operating method thereof
KR101171660B1 (en) Pointing device of augmented reality
US12361660B2 (en) Information processing apparatus, information processing method, and program
JP2019008623A (en) Information processing apparatus, information processing apparatus control method, computer program, and storage medium
JP5863984B2 (en) User interface device and user interface method
US20240377918A1 (en) Information processing system
HK1205311B (en) Three-dimensional user-interface device, and three-dimensional operation method
HK1210858B (en) Three-dimensional user-interface device, and three-dimensional operation method
JP2025091784A (en) Information processing device, control method, system, and program for information processing device