WO2014017095A1 - Low cost, non-intrusive, high accuracy head tracking apparatus and method - Google Patents
- Publication number
- WO2014017095A1 (application PCT/JP2013/004525)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dimensional
- light pattern
- eyeglasses
- orientation
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/0093—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00 with means for monitoring data relating to the user, e.g. head-tracking, eye-tracking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/012—Head tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/033—Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
- G06F3/0346—Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/0304—Detection arrangements using opto-electronic means
- G06F3/0308—Detection arrangements using opto-electronic means comprising a plurality of distinctive and separately oriented light emitters or reflectors associated to the pointing device, e.g. remote cursor controller with distinct and separately oriented LEDs at the tip whose radiations are captured by a photo-detector associated to the screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N2213/00—Details of stereoscopic systems
- H04N2213/008—Aspects relating to glasses for viewing stereoscopic images
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Optics & Photonics (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Description
The present disclosure relates to user head tracking for use in systems such as three-dimensional display systems, including projectors and televisions, or systems where knowledge of a user's head position and orientation is needed for system interaction or control.
This section provides background information related to the present disclosure which is not necessarily prior art.
In a typical three-dimensional display system, the user wears special 3D glasses that separately control the image seen by each eye. By supplying the left and right eyes with images recorded or projected to appear from slightly different vantage points, a 3D stereoscopic effect is produced. Early 3D glasses were simple passive devices featuring red and green colored lenses to supply each eye with a slightly different perspective, based on what light would pass through each colored lens.
More recent 3D glasses employ actively controlled, polarized lenses that are alternately toggled on and off in synchronism with the display (such as projectors, televisions, computer monitors, mobile devices, etc.), at toggle speeds that are imperceptible to the wearer. Operating in synchronism with the display, the active lenses alternately provide the left eye and the right eye with unique information corresponding to the intended perspective for that eye.
[PTL 1] Japanese Unexamined Patent Application Publication No. 2006-084963
[PTL 2] Japanese Unexamined Patent Application Publication No. 10-232626
The active 3D lens system works fairly well, so long as the user remains seated in one physical location, preferably at the "sweet spot" in front and center of the display screen, and does not turn his or her head too much from side to side. While such behavior may be suitable for watching a movie or television program, the 3D effect becomes distorted and degraded as the user moves to a different location or changes the angle of his or her head.
To illustrate, imagine you are viewing a three-dimensional image of a statue. When viewed head on, from the center of the display screen the statue is facing directly towards you, as if speaking directly to you. As you move to the right side of the screen and look left towards the statue, you would ideally expect to see the statue in partial profile. However, with conventional 3D technology, the statue's head will continue to face directly towards you. The 3D system has no way to correct for differences in the user's head position (eye position) or orientation.
To address this deficiency, some have suggested using a head tracking system that would provide a head location feedback signal to the display which provides or renders stereo images. Using this signal, the images supplied to the left and right eyes can be adjusted, on the fly, to supply the proper perspective to the user based on where the user is located and how the user's head is oriented. However, providing sophisticated head position information with any degree of accuracy in an affordable package has heretofore been difficult to achieve.
This section provides a general summary of the disclosure, and is not a comprehensive disclosure of its full scope or all of its features.
In accordance with one aspect, the disclosed technology provides a method of detecting the position and orientation of an eyeglasses wearer's head and inferring the 3D position of the user's left eye and right eye. A light pattern is caused to emanate from the eyeglasses, the light pattern comprising at least one two-dimensional shape. In one embodiment, four generally rectangular shapes are produced. The emanated light pattern is recorded in an optical sensor to generate a two-dimensional raster image of said at least one two-dimensional shape. A processor is then used to extract at least four two-dimensional data points from said raster image. A processor is further used to extract from the at least four two-dimensional data points six degrees of freedom position information and orientation information. Then, using the processor, a signal is supplied corresponding to the detected position and orientation of the eyeglasses wearer's head computed from said image position information and said orientation information.
Also disclosed is an apparatus for detecting head position, comprising eyeglasses having a frame front defining a pair of rims with a bridge between. A plurality of reflective layers are disposed on an outer face of the frame front, the reflective layers being disposed along an upper portion and along a lower portion of each rim. The plurality of reflective layers each define a contiguous elongated two-dimensional shape. A plurality of covering layers are then disposed on the reflective layers, the covering layers being transmissive at infrared wavelengths and substantially opaque in visible light wavelengths.
Further areas of applicability will become apparent from the description provided herein. The description and specific examples in this summary are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.
The disclosed method and apparatus can provide high accuracy head tracking.
The drawings described herein are for illustrative purposes only of selected embodiments and not all possible implementations, and are not intended to limit the scope of the present disclosure.
Figure 1 is a system block diagram of the basic components used to implement the head tracking method.
Figure 2 is a flowchart diagram describing how an existing pair of glasses is retrofit with the reflective rectangular bars and covering tape.
Figure 3 is an exploded perspective view illustrating how the reflective bar and covering tape are installed.
Figure 4A is a flowchart diagram illustrating how the processor extracts information from the reflected image data.
Figure 4B is a flowchart diagram illustrating how the processor calculates the six degrees of freedom information about the position and orientation of the glasses.
Figure 5 is a data flow diagram illustrating a first appearance model embodiment.
Figure 6 is a data flow diagram illustrating a second appearance model embodiment.
Figure 7 is a flowchart diagram illustrating how to determine 3D position of the wearer's left eye and right eye using the 3D position and 3D orientation of the glasses.
Figure 8 is a flowchart diagram illustrating how to generate a new projection matrix for 3D rendering using eye position data generated using the glasses. Corresponding reference numerals indicate corresponding parts throughout the several views of the drawings.
Example embodiments will now be described more fully with reference to the accompanying drawings.
Referring to Figure 1, an overview of the head tracking system will now be presented. The system employs a pair of instrumented glasses 14. The glasses may preferably include a pair of 3D lenses 16 mounted in a pair of rims 18 that join to form a bridge 20. The instrumented glasses further include earpieces 22 and may include an embedded communication circuit 24, such as a wireless Bluetooth communication circuit that communicates wirelessly with a 3D display system and controls the lenses 16 so that the user's left and right eyes receive the respective left and right portions of a stereoscopic image.
Secured to the front face of the eyeglasses frame is a set of reflective elements 26, each reflective element being of a predefined two-dimensional shape, such as an elongated rectangular shape. These reflective elements receive and reflect light from and to a light source/sensor 28. If desired, the light source/sensor can be configured to project and respond to light at a predetermined wavelength, such as an infrared wavelength that is not within the visible light spectrum.
If desired, the reflective elements may be assembled onto existing 3D glasses as described and shown in Figures 2 and 3. Following the flowchart of Figure 2, existing 3D glasses are equipped with reflective elements by creating four narrow rectangular infrared reflective bars (step 30) and applying them to the glasses (step 32) as shown in Figure 3. The reflective elements 26 are then covered by applying a protective tape 36 (step 34) that is transmissive at infrared wavelengths but opaque at visible light wavelengths. Preferably, the color of the covering tape 36 is chosen to match the color of the glasses frames. The resulting instrumented eyeglasses have four reflective elements that are readily visible at infrared wavelengths but masked by the covering tape so the cosmetic appearance of the eyeglasses is essentially identical to standard eyeglasses (without the reflective elements).
Referring back to Figure 1, the light source/sensor 28 causes a light pattern to emanate from the eyeglasses (it projects infrared light onto the reflective elements 26, which then reflect that light back to the infrared sensor within the light source/sensor 28). The light source/sensor 28 includes an internal optical sensor 38 that is electrically read by raster scanning or the like into a memory shown diagrammatically at 40. Essentially, memory 40 contains (x,y) pixel information corresponding to regions of sensor 38 that were illuminated when the image was captured to memory. In the present case, the infrared information reflected from reflective elements 26 contains useful information about the three-dimensional position and three-dimensional orientation of the eyeglasses, and hence of the wearer's head. Thus, the captured pattern of infrared light reflected from elements 26 is shown at 42, in this case occupying the lower half of the memory 40. For illustration, some additional extraneous light patterns 44 have also been illustrated. These extraneous light patterns may be light at other wavelengths to which the sensor 38 responds, or light in the infrared spectrum that has reflected from other surfaces within the room where the user is located.
The head tracking system includes a processor 46 that manipulates the data within memory 40 to remove the extraneous light patterns and enhance the contrast of the image so the captured pattern 42 may be more strongly differentiated from the background as illustrated at 40a. Processor 46 may perform this operation by applying a digital filter to remove information at wavelengths other than the infrared wavelength of the light source/sensor 28. Processor 46 may also apply a thresholding algorithm that increases the digital values that are above a predetermined threshold while decreasing those below that threshold. This has the effect of increasing the signal-to-noise ratio of the stored image of the captured pattern 42.
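By way of non-limiting illustration, the contrast-enhancement step described above might be sketched as follows. The use of OpenCV, the wrapper function name and the threshold value of 200 are assumptions of this sketch, and a hard binarization stands in for the increase/decrease behavior described; the disclosure itself only requires that values above a predetermined threshold be boosted and the remainder suppressed.

#include <opencv2/imgproc.hpp>

// Binarize the captured infrared frame so the reflected pattern 42 stands out from the
// background, as illustrated at 40a: pixels brighter than the threshold are driven to full
// intensity and all other pixels are suppressed to zero.
cv::Mat enhanceCapturedPattern(const cv::Mat& irFrame)   // 8-bit grayscale image read from sensor 38
{
    cv::Mat binary;
    cv::threshold(irFrame, binary, 200.0 /* illustrative threshold */, 255.0, cv::THRESH_BINARY);
    return binary;
}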
Next, processor 46 performs image processing operations and stores those in memory as depicted at 40b. The image processing may include operations such as determining the centroid of each of the four rectangular patterns, determining the major and minor axis directions, and detecting edges, corners and other geometric components of the captured pattern 42.
Figures 4A and 4B show in greater detail the processing steps performed by processor 46 to generate the 3D position and 3D orientation information 48. Referring first to Figure 4A, beginning at step 50, image processing is performed on the image captured and stored within memory 40 (Fig. 1). This image processing may include noise removal, thresholding and histogram analysis. Noise removal may consist of identifying and suppressing or erasing individual pixels or small groups of pixels that are far too small to constitute data from the reflective elements 26.
Thresholding involves increasing the contrast of the overall captured image and optionally suppressing or deleting data that correspond to captured information below a certain intensity. In this regard, the head tracking system is optimized to "see" the four rectangular reflective elements. If infrared light happens to reflect from other sources within the room, particularly from surfaces that are a greater distance than the distance to the glasses 14, the intensity of those reflections will be lower. Thus, the thresholding process can detect regions of low luminosity and suppress or delete the information corresponding to those surfaces.
The histogram process involves analyzing the captured image stored in memory 40 on the basis of wavelength. The histogram analysis organizes the captured information according to wavelength and this allows the processor to identify those captured patterns that correspond to the wavelength produced by the light source/sensor 28. Using this information, the processor thus filters out wavelengths that have nothing to do with the reflective elements 26. In this regard, it will be recalled that the reflective elements are preferably masked by a covering tape 36, so that light within the visible light spectrum will not reflect from the reflective elements 26. Because the reflective elements 26 cannot reflect light within the visible spectrum, the processor 46 may use the histogram data to filter out or ignore any visible light information that happens to have been captured and stored within memory 40.
After the image processing has been performed, processor 46 moves on to step 52 where the geometric data from the four rectangular-shaped reflective elements are analyzed. In this regard, there are a variety of different processing algorithms that may be utilized. Two techniques will be discussed in detail here. Essentially, the processing performed at step 52 involves extracting feature information from the four rectangular shapes and using this feature information to compute the position and orientation of the glasses in six degrees of freedom (x, y, z, yaw, pitch, roll). This is depicted at step 54.
Regarding extraction of six degrees of freedom (6 DOF) data, the process requires a minimum of four (x,y) data points. These four data points may be extracted from the four rectangles by determining the centroid of each rectangle and then using those centroids to calculate the 6 DOF data. While four data points are mathematically sufficient, in the noisy conditions found in many real-world applications a higher number of (x,y) data points can be used to increase system reliability. In this regard, Figure 4B depicts an embodiment that extracts and uses eight (x,y) points from the image data. This eight data point embodiment is quite robust even under noisy conditions.
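A minimal sketch of the four-centroid variant appears below. The use of OpenCV image moments and the function name are assumptions of this illustration; the disclosure only requires that a centroid be determined for each rectangle.

#include <opencv2/imgproc.hpp>
#include <vector>

// Compute one (x,y) data point per reflective bar by taking the centroid of its contour.
// With four bars this yields the minimum of four two-dimensional points needed for the 6 DOF solve.
std::vector<cv::Point2f> rectangleCentroids(const std::vector<std::vector<cv::Point>>& barContours)
{
    std::vector<cv::Point2f> centroids;
    for (const auto& contour : barContours) {
        const cv::Moments m = cv::moments(contour);
        if (m.m00 > 0.0)                      // skip degenerate contours
            centroids.emplace_back(static_cast<float>(m.m10 / m.m00),
                                   static_cast<float>(m.m01 / m.m00));
    }
    return centroids;
}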
Referring now to Figure 4B, the gray-scale image from the sensor 28 (Fig. 1) is stored in memory at 60. Next, at 62 a thresholding algorithm is applied to segment the foreground and background, as depicted at 40a. Then in step 64 the foreground data generated in step 62 are analyzed to find the contours and these contours are then mapped to geometric representations of rectangles. A suitable algorithm for finding the contours is the cvFindContours operation available in the OpenCV library (Open Source Computer Vision).
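In modern OpenCV the cvFindContours operation named above corresponds to cv::findContours; a sketch of step 64 using that call follows. The wrapper function name and the retrieval/approximation flags are choices of this illustration, not requirements of the disclosure.

#include <opencv2/imgproc.hpp>
#include <vector>

// Extract the outlines of the bright foreground regions produced by thresholding step 62.
// Each contour is a closed polygon; the four contours belonging to the reflective bars are
// selected from these candidates in the steps that follow.
std::vector<std::vector<cv::Point>> findCandidateContours(const cv::Mat& binaryForeground)
{
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(binaryForeground, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    return contours;
}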
Once the contours have been found, the data are operated on in step 66 to identify bounding rectangles for each contour. A bounding rectangle is a computer-generated rectangular shape that most closely fits the contour. There may be multiple such bounding box candidates, as the contour shapes may not themselves appear as perfect rectangles. Thus, at step 68, four "best" rectangles are selected from among the plural bounding box candidates, using an assessment of each candidate's area and long-axis to short-axis ratio to pick the four candidates that represent the best fit. In this regard, analysis of the area and axis ratios relies on stored knowledge of the original area and axis ratios of the four rectangles, and also on knowledge of how the four rectangles' sizes and axis ratios change at different viewing angles.
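One possible realization of steps 66 and 68 is sketched below. The use of cv::minAreaRect, the expected aspect ratio of 6:1 and the minimum-area bound are illustrative assumptions; the disclosure requires only that area and long-axis to short-axis ratio be compared against the known geometry of the bars.

#include <opencv2/imgproc.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// Fit a rotated bounding rectangle to each contour (step 66) and keep the four candidates whose
// area and long-axis/short-axis ratio best match the known geometry of the reflective bars (step 68).
std::vector<cv::RotatedRect> pickBestFourRectangles(const std::vector<std::vector<cv::Point>>& contours)
{
    const double expectedAspect = 6.0;                    // assumed long/short axis ratio of a bar
    std::vector<std::pair<double, cv::RotatedRect>> scored;

    for (const auto& contour : contours) {
        const cv::RotatedRect box = cv::minAreaRect(contour);
        const double longAxis  = std::max(box.size.width, box.size.height);
        const double shortAxis = std::min(box.size.width, box.size.height);
        if (shortAxis <= 0.0 || box.size.area() < 30.0f)  // reject degenerate or tiny noise blobs
            continue;
        const double score = std::abs(longAxis / shortAxis - expectedAspect); // smaller = better fit
        scored.emplace_back(score, box);
    }
    std::sort(scored.begin(), scored.end(),
              [](const auto& a, const auto& b) { return a.first < b.first; });

    std::vector<cv::RotatedRect> best;
    for (std::size_t i = 0; i < scored.size() && i < 4; ++i)
        best.push_back(scored[i].second);
    return best;
}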
Steps 64-68 essentially convert the bit-mapped graphic data depicted at 40a into parametric data (such as vector graphic data) representing the rectangular shapes that most closely match the rectangle images captured by the sensor 28. The parametric representations are shown at 40b.
Once converted to geometric parameterized data, the data are operated on at step 70 to find the eight corners corresponding to the four rectangular shapes obtained in the preceding steps 64-68. This is shown at 40c. The eight corners are then compared at step 72 with pre-stored point configurations corresponding to true measurements of the actual rectangle corners (on the glasses themselves) as seen from different viewing angles through an iterative process using the solvePnP algorithm from the OpenCV library. This algorithm implements an iterative Levenberg-Marquardt optimization which finds the glasses orientation by minimizing the re-projection error. The end result of step 72 is the 3D position and 3D orientation of the glasses, which is then stored in memory at 74. Essentially, the solvePnP algorithm finds an object pose from 3D-2D point correspondences.
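The corresponding OpenCV call could look roughly as follows. The eight model corner coordinates (measured on the physical glasses), the camera intrinsics and the wrapper function name are inputs assumed by this sketch, supplied by measurement and calibration rather than by the disclosure.

#include <opencv2/calib3d.hpp>
#include <vector>

// Recover the 6 DOF pose of the glasses from the eight detected corners (step 72).
// modelCorners: the same eight corners measured on the physical glasses, in the glasses' own frame.
// imageCorners: the eight corners found at step 70, in pixel coordinates.
// cameraMatrix / distCoeffs: intrinsics of the infrared sensor 38, obtained by prior calibration.
// On success, rvec/tvec hold the 3D orientation (as a rotation vector) and 3D position stored at 74.
bool estimateGlassesPose(const std::vector<cv::Point3f>& modelCorners,
                         const std::vector<cv::Point2f>& imageCorners,
                         const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
                         cv::Mat& rvec, cv::Mat& tvec)
{
    // SOLVEPNP_ITERATIVE performs the Levenberg-Marquardt refinement described in the text,
    // minimizing the re-projection error of the model corners against the detected corners.
    return cv::solvePnP(modelCorners, imageCorners, cameraMatrix, distCoeffs,
                        rvec, tvec, false, cv::SOLVEPNP_ITERATIVE);
}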
The above-described head tracking algorithm utilizes at least four data points in (x,y) space that are then manipulated to determine a six degrees of freedom (6 DOF) state of the glasses: its position in (x,y,z) space and its rotational orientation (yaw, pitch, roll). While four data points in (x,y) space represent the minimum input requirement, the above-described embodiment uses eight (x,y) data points. Using eight points extracted from the four rectangle images, the algorithm of Figures 4A and 4B does a good job of determining the position and orientation of the glasses, even under noisy conditions. However, if desired, more sophisticated analytical techniques may be used for even higher accuracy. Figures 5 and 6 illustrate two such techniques, which are based on appearance modeling. The embodiment of Figure 5 uses sixteen (x,y) data points and the embodiment of Figure 6 uses the entire raw image (i.e., many data points).
Referring now to Figure 5, a first embodiment of an appearance model for extracting features from the image will now be described. It is assumed that the image has already been processed by step 50 (Fig. 4A); thus, the image is saved as depicted at 40a (Fig. 1). The processor 46 first finds the 16 endpoints (step 76), defined as the four corners of each of the four rectangles. This is depicted at 40b. The algorithm then establishes an initial six degrees of freedom estimate based on the geometry of the glasses and the positions of the actual reflective rectangles, both of which are known a priori. The algorithm then re-projects the estimate back into (x,y) space (step 78) and calculates an error value d which measures how far the initial estimate deviates from the actual observations. An update estimate is then calculated (step 80) and the six degrees of freedom estimate is updated using it. The procedure iteratively repeats until the error is below a predetermined threshold.
Referring now to Figure 6, a second embodiment of an appearance model for extracting features from the image will now be described. The process begins by operating on the image data as stored at 40a. In other words, the data have already been processed by step 50 (Fig. 4A) but still represent image data as opposed to parameterized, vector-graphic data. In this case, the data stored at 40a are compared to reference data previously stored at 40c, and a difference process (step 82) is performed to identify regions where the data stored at 40a differ from the reference data. The difference data are stored at 40d. The processor then computes a predicted update (step 84) which is then used to compute an alpha weight (step 86), where the alpha weight is based on the predicted update and also on the error observed and stored at 40d. The alpha weight is then applied to the appearance data stored at 40a and the process iteratively repeats until the appearance data and the reference data converge (i.e., no error).
In the embodiment of Figure 6, the six degrees of freedom position and orientation of the reference image (stored at 40c) is known a priori. The predicted update 84 is then applied to this known six degrees of freedom position and orientation information to generate a new estimate for the six degrees of freedom position and orientation. When the difference data finally converge (at 40d), the most recently updated six degrees of freedom position and orientation information is taken as the output (i.e., taken as indicative of the six degrees of freedom position and orientation of the eyeglasses).
While the input data for the algorithms of Figures 5 and 6 are different, both involve a similar computational technique whereby an initial six degrees of freedom estimate of glasses position and orientation is generated based on a priori knowledge of the glasses geometry and locations where the reflective elements are installed. This first estimate is then used to generate or re-project the six degrees of freedom data into the observed (x,y) space where it is compared with the observed (x,y) data to generate an error value. The error value is then used to modify the initial six degrees of freedom estimate and re-projection is again performed to obtain a new error value. The process repeats iteratively until the error value is below a predetermined threshold, indicating that the six degrees of freedom information is accurate within design tolerances.
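The disclosure does not reproduce the update formula applied at step 80, so the sketch below substitutes a generic Gauss-Newton correction with a numerically estimated Jacobian, purely to illustrate the estimate / re-project / correct loop summarized above. The function names, the use of OpenCV, the convergence threshold and the iteration limit are all assumptions of this illustration.

#include <opencv2/calib3d.hpp>
#include <vector>

// Refine a 6 DOF estimate (3x1 CV_64F rotation vector rvec and translation tvec) by repeatedly
// re-projecting the known model points, measuring the 2D error against the observed points, and
// applying a Gauss-Newton correction until the error falls below the chosen threshold.
void refinePoseByReprojection(const std::vector<cv::Point3f>& modelPoints,   // known bar geometry
                              const std::vector<cv::Point2f>& observed,      // observed (x,y) points
                              const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
                              cv::Mat& rvec, cv::Mat& tvec,                  // in: initial estimate, out: refined
                              double errorThreshold = 0.5, int maxIterations = 50)
{
    const int n = static_cast<int>(modelPoints.size());
    auto residual = [&](const cv::Mat& r, const cv::Mat& t) {
        std::vector<cv::Point2f> projected;
        cv::projectPoints(modelPoints, r, t, cameraMatrix, distCoeffs, projected);  // re-projection
        cv::Mat res(2 * n, 1, CV_64F);
        for (int i = 0; i < n; ++i) {
            res.at<double>(2 * i)     = projected[i].x - observed[i].x;
            res.at<double>(2 * i + 1) = projected[i].y - observed[i].y;
        }
        return res;
    };

    for (int iter = 0; iter < maxIterations; ++iter) {
        const cv::Mat r0 = residual(rvec, tvec);
        if (cv::norm(r0) < errorThreshold) break;      // error within design tolerance

        // Numerical Jacobian of the residual with respect to the six pose parameters.
        cv::Mat J(2 * n, 6, CV_64F);
        const double eps = 1e-6;
        for (int k = 0; k < 6; ++k) {
            cv::Mat rPert = rvec.clone(), tPert = tvec.clone();
            if (k < 3) rPert.at<double>(k) += eps; else tPert.at<double>(k - 3) += eps;
            const cv::Mat col = (residual(rPert, tPert) - r0) / eps;
            cv::Mat Jk = J.col(k);                     // header referring to column k of J
            col.copyTo(Jk);
        }

        // Gauss-Newton step: solve (J^T J) delta = -J^T r0 and apply the update.
        const cv::Mat A = J.t() * J;
        const cv::Mat b = -(J.t() * r0);
        cv::Mat delta;
        cv::solve(A, b, delta, cv::DECOMP_SVD);
        for (int k = 0; k < 3; ++k) {
            rvec.at<double>(k) += delta.at<double>(k);
            tvec.at<double>(k) += delta.at<double>(k + 3);
        }
    }
}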
The foregoing discussion has explained how to calculate the 3D position and 3D orientation (six degrees of freedom data) of the glasses. One purpose for doing so is to ascertain the viewing angle of the glasses wearer, so that a 3D rendering can be adapted to give the wearer a more realistic 3D experience. By using the six degrees of freedom data, an image projection or display system can dynamically modify the image presented, so that it looks correct in three dimensions, from whatever vantage point the user happens to be in at the time.
To utilize the six degrees of freedom data, the system processor is programmed to execute the steps depicted in Figures 7 and 8. Figure 7 shows how the glasses position data p(x,y,z) and glasses orientation data (expressed using Quaternion representation) are used to compute the position of the wearer's left and right eyes. Figure 8 shows how this eye position data are used to calculate how to modify an image from a projector or display system, so the user will see a proper 3D rendering from his or her current vantage point (i.e., vantage point based on the wearer's eye position).
Referring to Figure 7, the glasses position data are stored as points in (x,y,z) space as depicted at 84. The glasses orientation is preferably represented as a Quaternion mathematical representation and stored at 86. In this regard, any rotation in three dimensions can be represented as an axis vector and an angle of rotation. Quaternions give a simple way to encode this axis-angle representation in four numbers and apply the corresponding rotation to position vectors representing points relative to the origin.
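The axis-angle encoding and the rotation of position vectors described here can be written compactly as follows. The small types below are an illustration only (any quaternion library would serve) and assume a unit quaternion.

#include <cmath>

// Minimal quaternion type: encodes a rotation of 'angle' radians about a unit 'axis' and rotates
// 3D position vectors via q v q^-1 (the quaternion is assumed to be normalized).
struct Vec3 { double x, y, z; };

struct Quat {
    double w, x, y, z;

    static Quat fromAxisAngle(const Vec3& axis, double angle) {
        const double s = std::sin(angle / 2.0);
        return { std::cos(angle / 2.0), axis.x * s, axis.y * s, axis.z * s };
    }

    // v' = v + w*t + (q_vec x t), where t = 2*(q_vec x v); equivalent to q v q^-1 for unit q.
    Vec3 rotate(const Vec3& v) const {
        const Vec3 t { 2.0 * (y * v.z - z * v.y),
                       2.0 * (z * v.x - x * v.z),
                       2.0 * (x * v.y - y * v.x) };
        return { v.x + w * t.x + (y * t.z - z * t.y),
                 v.y + w * t.y + (z * t.x - x * t.z),
                 v.z + w * t.z + (x * t.y - y * t.x) };
    }
};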
The conversion algorithm begins at step 88 by computing a middle point for the two eyes, using the 3D position of the glasses determined by the solvePnP algorithm discussed above. Next, a vector-3 offset is computed at step 90, based on the assumption that each eye is 3 cm away from the middle point. Next, the rotation is determined at step 92, whereby the offset is rotated by the 3D orientation of the glasses, also given by the solvePnP algorithm discussed above. Rotations are computed using a Quaternion representation. Finally, in step 94, the right and left eye positions are calculated, using the middle point determined in step 88 and applying the offset for the respective left and right eyes. In this regard, the left eye is the middle position minus the offset, whereas the right eye is the middle position plus the offset.
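Putting steps 88 through 94 together, and reusing the Vec3 and Quat helpers sketched above, the eye positions could be derived as follows. Taking the solvePnP translation directly as the middle point of the two eyes and placing the 3 cm offset along the glasses' local x axis are assumptions of this sketch.

// Compute the left and right eye positions from the glasses pose (steps 88-94). glassesPos is the
// 3D position from solvePnP, glassesOrientation the 3D orientation as a quaternion, and each eye is
// assumed to sit 3 cm (0.03 m) to either side of the middle point, as in step 90.
void computeEyePositions(const Vec3& glassesPos, const Quat& glassesOrientation,
                         Vec3& leftEye, Vec3& rightEye)
{
    const Vec3 middle  = glassesPos;                          // step 88: middle point of the two eyes
    const Vec3 offset  = { 0.03, 0.0, 0.0 };                  // step 90: 3 cm along the glasses' x axis
    const Vec3 rotated = glassesOrientation.rotate(offset);   // step 92: rotate offset into world space

    leftEye  = { middle.x - rotated.x, middle.y - rotated.y, middle.z - rotated.z };  // step 94
    rightEye = { middle.x + rotated.x, middle.y + rotated.y, middle.z + rotated.z };
}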
With the eye positions thus calculated, the process proceeds, as shown in Figure 8, to compute how to transform points in display space (screen position space) to produce a three-dimensional image that is rendered for the appropriate viewing angle based on the user's eye position. The eye position data p(x,y,z) are stored in memory at 96 while the screen position data (upper right [UR], upper left [UL], bottom right [BR], bottom left [BL]) are stored as (x,y,z) points in memory at 98.
First, the normal direction of the screen is determined at step 100. This corresponds to the vantage point of a viewer situated directly in front of the screen, looking along a direction perpendicular to the plane of the screen. This normal direction is computed as a Vector3 normal according to the following vector computation:
Eq. 1: Vector3 normal = (BL-UL).crossProduct (UR-UL) + (UR-BR).crossProduct(BL-BR).
Then as at step 102 the "up" and "right" directions of the screen are computed as a Vector3 computation:
Eq. 2: Vector3 up = (screenUL-screenBL) + (screenUR-screenBR).
Eq. 3: Vector3 rt = up.crossProduct(normal).
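With a small vector type supplying the crossProduct operation used in the equations (the Vector3 type itself is not defined in the disclosure), Eq. 1 through Eq. 3 can be transcribed directly:

// Minimal stand-in for the Vector3 type used in Eq. 1 - Eq. 3.
struct Vector3 {
    double x, y, z;
    Vector3 operator+(const Vector3& o) const { return { x + o.x, y + o.y, z + o.z }; }
    Vector3 operator-(const Vector3& o) const { return { x - o.x, y - o.y, z - o.z }; }
    Vector3 crossProduct(const Vector3& o) const {
        return { y * o.z - z * o.y, z * o.x - x * o.z, x * o.y - y * o.x };
    }
};

// Screen corners UL, UR, BL, BR as stored at 98; returns the (unnormalized) screen normal (Eq. 1),
// up direction (Eq. 2) and right direction (Eq. 3).
void screenAxes(const Vector3& UL, const Vector3& UR, const Vector3& BL, const Vector3& BR,
                Vector3& normal, Vector3& up, Vector3& rt)
{
    normal = (BL - UL).crossProduct(UR - UL) + (UR - BR).crossProduct(BL - BR);  // Eq. 1
    up     = (UL - BL) + (UR - BR);                                              // Eq. 2
    rt     = up.crossProduct(normal);                                            // Eq. 3
}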
Next, at step 104 the screen's local coordinate system is computed by finding the corners in the local coordinate system, the screen being situated in the xy plane and centered at the origin. The computation is performed using the following calculation:
Eq. 4: coordSystem = GetCoordChangeMatrix(Vector3::ZERO, Vector3(1,0,0), Vector3(0,1,0), Vector3(0,0,1)).
The user's eye data are then transformed into the screen local coordinate system at step 106 using the following calculation:
Eq. 5: Vector3 eye-screen = coordSystem * eyePosition.
Finally, the projection frustum is calculated using local eye position and screen data to calculate a frustum matrix, as at 108. This frustum matrix is then stored in memory and supplied to a rendering algorithm at step 110 to generate the rendered image according to the user's current eye position. Suitable rendering technologies include OpenGL and DirectX.
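The disclosure does not spell out the frustum calculation performed at 108. One common construction of an off-axis (asymmetric) frustum from the eye position expressed in the screen-local coordinate system of Eq. 5 is sketched below, assuming the screen lies in the xy plane centered at the origin and the eye sits at (ex, ey, ez) with ez > 0 in front of it; the structure and parameter names are assumptions of this sketch. In OpenGL, for example, the result would be handed to glFrustum, with the view additionally translated by the negative eye position before rendering.

// Off-axis viewing frustum computed from the eye position in screen-local coordinates. The frustum
// edges are obtained by projecting the screen edges onto the near plane using similar triangles.
struct Frustum { double left, right, bottom, top, nearPlane, farPlane; };

Frustum computeFrustum(double ex, double ey, double ez,          // eye position, screen-local (Eq. 5)
                       double screenWidth, double screenHeight,  // physical screen size
                       double nearPlane, double farPlane)
{
    const double scale = nearPlane / ez;                          // screen plane -> near plane
    Frustum f;
    f.left   = (-screenWidth  / 2.0 - ex) * scale;
    f.right  = ( screenWidth  / 2.0 - ex) * scale;
    f.bottom = (-screenHeight / 2.0 - ey) * scale;
    f.top    = ( screenHeight / 2.0 - ey) * scale;
    f.nearPlane = nearPlane;
    f.farPlane  = farPlane;
    return f;   // e.g. glFrustum(f.left, f.right, f.bottom, f.top, f.nearPlane, f.farPlane)
}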
The foregoing description of the embodiments has been provided for purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure. Individual elements or features of a particular embodiment are generally not limited to that particular embodiment, but, where applicable, are interchangeable and can be used in a selected embodiment, even if not specifically shown or described. The same may also be varied in many ways. Such variations are not to be regarded as a departure from the disclosure, and all such modifications are intended to be included within the scope of the disclosure.
The disclosure is applicable to user head tracking for use in systems such as three-dimensional display systems, including projectors and televisions, or systems where knowledge of a user's head position and orientation is needed for system interaction or control.
14 glasses
16 lenses
18 rims
20 bridge
22 earpieces
24 embedded communication circuit
26 reflective elements
28 light source/sensor
36 protective tape
38 internal optical sensor
40, 40a, 40b, 40c memory
42 captured pattern
44 extraneous light patterns
46 processor
48 3D position and 3D orientation information
Claims (25)
- A method of detecting position and orientation of an eyeglasses wearer's head, comprising:
causing a light pattern to emanate from the eyeglasses, the light pattern comprising at least one two-dimensional shape;
recording the emanated light pattern in an optical sensor to generate a two-dimensional raster image of said at least one two-dimensional shape;
using a processor to extract at least four two-dimensional data points from said raster image;
using a processor to extract from the at least four two-dimensional data points six degrees of freedom position information and orientation information; and
using the processor to supply a signal corresponding to the detected position and orientation of the eyeglasses wearer's head computed from said image position information and said orientation information.
- The method of claim 1 wherein said light pattern comprises at least one contiguous two-dimensional shape.
- The method of claim 1 wherein said light pattern is caused to emanate by reflection from a surface associated with said eyeglasses.
- The method of claim 1 wherein said light pattern is caused to emanate by projection from a light source associated with said eyeglasses.
- The method of claim 1 wherein the light pattern is substantially monochromatic.
- The method of claim 1 wherein the light pattern is within the infrared spectrum.
- The method of claim 1 wherein the light pattern comprises at least two, two-dimensional shapes in spaced relation to one another.
- The method of claim 1 wherein the eyeglasses define a frame front and wherein said light pattern is caused to emanate from at least a portion of said frame front.
- The method of claim 1 wherein the eyeglasses define a frame front defining a pair of rims with a bridge between and wherein said light pattern is caused to emanate: (a) from a portion of at least one rim above the lens and (b) from a portion of at least one rim below the lens.
- The method of claim 1 wherein the eyeglasses define a frame front defining a pair of rims with a bridge between and wherein said light pattern is caused to emanate: (a) from a portion of at least one rim to the left of the bridge and (b) from a portion of at least one rim to the right of the bridge.
- The method of claim 1 wherein the eyeglasses define a frame front defining a pair of rims with a bridge between and wherein said light pattern is caused to emanate from four discrete locations:
(a) from a portion of at least one rim above the lens;
(b) from a portion of at least one rim below the lens;
(c) from a portion of at least one rim to the left of the bridge; and
(d) from a portion of at least one rim to the right of the bridge.
- The method of claim 1 further comprising processing the two-dimensional raster image by applying a threshold to segment foreground and background and using said segmented foreground as two-dimensional data representing the at least one two-dimensional shape.
- The method of claim 1 further comprising converting said two-dimensional raster image into contour data representing the outline of said at least one two-dimensional shape.
- The method of claim 1 further comprising converting said two-dimensional raster image into parametric data representing a geometric rectangle fit to conform to the outer periphery of said at least one two-dimensional shape.
- The method of claim 1 wherein said light pattern defines four two-dimensional generally rectangular shapes.
- The method of claim 1 wherein said light pattern defines four two-dimensional generally rectangular shapes and further comprising converting said four two-dimensional generally rectangular shapes into at least four two-dimensional points.
- The method of claim 1 wherein said light pattern defines four two-dimensional generally rectangular shapes and further comprising:
processing a two-dimensional raster image of said four two-dimensional generally rectangular shapes by applying a threshold to segment foreground and background, and using said segmented foreground to represent said four two-dimensional generally rectangular shapes;
converting the segmented foreground into contour data representing the outlines of said four two-dimensional generally rectangular shapes;
converting said contour data into parametric data representing four geometric rectangles fit to conform to the contour data; and
extracting at least four two-dimensional data points from said four geometric rectangles.
- The method of claim 1 wherein six degrees of freedom position information and orientation information is extracted by:
(a) generating an initial estimate of six degrees of freedom position and orientation information based on a priori information known about the geometry of the eyeglasses and positions of the at least one two-dimensional shape;
(b) re-projecting said initial estimate as re-projected two-dimensional data points and comparing said re-projected data points with said at least four two-dimensional data points to generate an error value;
(c) using said error value to revise the initial estimate of six degrees of freedom position and orientation information; and
(d) iteratively repeating steps (a) through (c) until the error value is below a predetermined threshold.
- The method of claim 1 wherein six degrees of freedom position information and orientation information is extracted by:
(a) comparing the two-dimensional raster image with a pre-stored reference image to generate an error value;
(b) using said error value to generate a predicted update;
(c) using said error value to adjust data associated with the two-dimensional raster image; and
(d) iteratively repeating steps (a) through (c) to minimize the error value and then using the predicted update to generate the six degrees of freedom position and orientation information.
- An apparatus for detecting head position, comprising:
eyeglasses having a frame front defining a pair of rims with a bridge between;
a plurality of reflective layers disposed on an outer face of the frame front;
the reflective layers being disposed along an upper portion and along a lower portion of each rim;
the plurality of reflective layers each defining a contiguous elongated two-dimensional shape; and
a plurality of covering layers disposed on said reflective layers, the covering layers being transmissive at infrared wavelengths and substantially opaque in visible light wavelengths.
- The apparatus of claim 20 wherein the eyeglasses include separate left and right stereoscopic lenses mounted within said rims.
- The apparatus of claim 20 wherein the covering layers are colored to match the color of the rims.
- The apparatus of claim 20 further comprising an infrared emitter illuminating the plurality of reflective layers and an optical sensor reading infrared light reflected from said reflective layers.
- The apparatus of claim 23 further comprising a processor coupled to said optical sensor and programmed to compute head position based on readings obtained from said optical sensor.
- A method of generating a three-dimensional image using the method of detecting position and orientation of claim 1, comprising:
using the signal corresponding to the detected position and orientation to compute the eye position of the wearer; and
generating three-dimensional rendering information using said computed eye position.
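The following is a non-limiting editorial sketch of the processing pipeline recited in claims 12 through 18 (thresholding, contour extraction, rectangle fitting, extraction of four two-dimensional points, and iterative six-degrees-of-freedom pose estimation). It assumes an OpenCV 4 implementation operating on an 8-bit grayscale infrared frame; the marker geometry in MODEL_POINTS, the threshold value, the ordering heuristic, and the camera intrinsics are hypothetical placeholders and do not limit the claims.

```python
import cv2
import numpy as np

# Hypothetical 3D centres (millimetres) of the four reflective rectangles on the
# frame front, expressed in the eyeglasses' own coordinate frame; the actual
# geometry is whatever a priori information is known about the eyeglasses.
MODEL_POINTS = np.array([
    [-55.0,  15.0, 0.0],   # left rim, above the lens
    [-55.0, -15.0, 0.0],   # left rim, below the lens
    [ 55.0,  15.0, 0.0],   # right rim, above the lens
    [ 55.0, -15.0, 0.0],   # right rim, below the lens
], dtype=np.float64)

def estimate_pose(ir_image, camera_matrix, dist_coeffs):
    """Threshold -> contours -> rectangle fit -> 2D points -> iterative 6-DOF pose."""
    # Segment the bright infrared reflections (foreground) from the background.
    _, mask = cv2.threshold(ir_image, 200, 255, cv2.THRESH_BINARY)

    # Convert the segmented foreground into contour data (blob outlines).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if len(contours) < 4:
        return None  # fewer than four markers visible in this frame

    # Keep the four largest blobs and fit a rotated rectangle to each one;
    # the rectangle centre is used as the 2D data point for that marker.
    blobs = sorted(contours, key=cv2.contourArea, reverse=True)[:4]
    centres = [cv2.minAreaRect(c)[0] for c in blobs]

    # Order the centres to match MODEL_POINTS, assuming a roughly upright head:
    # split into a left and a right pair by x, then put the upper marker first
    # (image y grows downwards, so the smaller y is the upper one).
    centres = sorted(centres, key=lambda p: p[0])
    left = sorted(centres[:2], key=lambda p: p[1])
    right = sorted(centres[2:], key=lambda p: p[1])
    image_points = np.array(left + right, dtype=np.float64)

    # Iterative pose estimation: start from an initial estimate, re-project the
    # model points, and refine until the re-projection error converges.
    ok, rvec, tvec = cv2.solvePnP(
        MODEL_POINTS, image_points, camera_matrix, dist_coeffs,
        flags=cv2.SOLVEPNP_ITERATIVE)
    return (rvec, tvec) if ok else None
```

The returned rotation and translation vectors together carry the six degrees of freedom; the eye position used in claim 25 could then be obtained, for example, by transforming a fixed eye offset defined in the eyeglasses frame by this pose.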
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201261675428P | 2012-07-25 | 2012-07-25 | |
| US61/675,428 | 2012-07-25 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2014017095A1 true WO2014017095A1 (en) | 2014-01-30 |
Family
ID=49111509
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2013/004525 Ceased WO2014017095A1 (en) | 2012-07-25 | 2013-07-25 | Low cost, non-intrusive, high accuracy head tracking apparatus and method |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2014017095A1 (en) |
- 2013-07-25 WO PCT/JP2013/004525 patent/WO2014017095A1/en not_active Ceased
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US3758193A (en) * | 1971-07-02 | 1973-09-11 | Minnesota Mining & Mfg | Infrared-transmissive, visible-light-absorptive retro-reflectors |
| US5200851A (en) * | 1992-02-13 | 1993-04-06 | Minnesota Mining And Manufacturing Company | Infrared reflecting cube-cornered sheeting |
| JPH10232626A (en) | 1997-02-20 | 1998-09-02 | Canon Inc | 3D image display device |
| JP2006084963A (en) | 2004-09-17 | 2006-03-30 | Seiko Epson Corp | Stereoscopic image display device |
| US20100103516A1 (en) * | 2008-10-27 | 2010-04-29 | Real D | Head-tracking enhanced stereo glasses |
| WO2012008966A1 (en) * | 2010-07-16 | 2012-01-19 | Hewlett-Packard Development Company, L.P. | Systems and methods for eye tracking using retroreflector-encoded information |
| WO2012047221A1 (en) * | 2010-10-07 | 2012-04-12 | Sony Computer Entertainment Inc. | 3-d glasses with camera based head tracking |
Non-Patent Citations (1)
| Title |
|---|
| H G LOCHANA PREMATUNGA ET AL: "Finding 3D Positions from 2D Images Feasibility Analysis", ICONS 2012 : THE SEVENTH INTERNATIONAL CONFERENCE ON SYSTEMS, 1 April 2012 (2012-04-01), pages 214 - 217, XP055083708, ISBN: 978-1-61-839767-6, Retrieved from the Internet <URL:http://www.thinkmind.org/download.php?articleid=icons_2012_11_10_90008> [retrieved on 20131014] * |
Similar Documents
| Publication | Title |
|---|---|
| EP3614340B1 (en) | Methods and devices for acquiring 3d face, and computer readable storage media |
| EP3018903B1 (en) | Method and system for projector calibration |
| CN109660783B (en) | Virtual reality parallax correction |
| KR102658303B1 (en) | Head-mounted display for virtual and mixed reality with inside-out positional, user body and environment tracking |
| US20200226729A1 (en) | Image Processing Method, Image Processing Apparatus and Electronic Device |
| US8451322B2 (en) | Imaging system and method |
| Alhwarin et al. | IR stereo kinect: improving depth images by combining structured light with IR stereo |
| US10380802B2 (en) | Projecting augmentation images onto moving objects |
| JP6377863B2 (en) | Enhancement of depth map representation by reflection map representation |
| Asayama et al. | Fabricating diminishable visual markers for geometric registration in projection mapping |
| CN107734267B (en) | Image processing method and device |
| CN113170136A (en) | Motion smoothing of reprojected frames |
| EP3200451B1 (en) | Projector optimization method and system |
| US20220277512A1 (en) | Generation apparatus, generation method, system, and storage medium |
| US20150077520A1 (en) | Information processor and information processing method |
| US10701247B1 (en) | Systems and methods to simulate physical objects occluding virtual objects in an interactive space |
| JP6073858B2 (en) | Face position detection |
| EP4672149A2 (en) | SYSTEM FOR CORRECTING ROLLING SHUTTER ARTIFACTS |
| CN107734264B (en) | Image processing method and device |
| US20130194254A1 (en) | Image processing apparatus, image processing method and program |
| Hernandez et al. | Near laser-scan quality 3-D face reconstruction from a low-quality depth stream |
| CN107509043A (en) | Image processing method and device |
| CN107705277A (en) | Image processing method and device |
| JP6552266B2 (en) | Image processing apparatus, image processing method, and program |
| CN107610078A (en) | Image processing method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 13756716; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 13756716; Country of ref document: EP; Kind code of ref document: A1 |