HK1173785B - Illuminator with refractive optical element - Google Patents

Illuminator with refractive optical element

Info

Publication number
HK1173785B
HK1173785B
Authority
HK
Hong Kong
Prior art keywords
light
optical axis
irradiance
angular displacement
view
Prior art date
Application number
HK13100907.7A
Other languages
Chinese (zh)
Other versions
HK1173785A1 (en)
Inventor
A.佩尔曼
D.科恩
G.叶海弗
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/042,028 (US9551914B2)
Application filed by Microsoft Technology Licensing, LLC
Publication of HK1173785A1 publication Critical patent/HK1173785A1/en
Publication of HK1173785B publication Critical patent/HK1173785B/en

Description

Illuminator with refractive optical element
Technical Field
The invention relates to an illuminator with a refractive optical element.
Background
The amount of electromagnetic radiation (e.g., light) collected and imaged by a camera photosensor may depend on the position of the object in the field of view (FOV) of the camera. In short, an object on the optical axis of a camera may appear brighter to the camera's photosensor than an off-axis object, all other factors being equal. The term "irradiance" refers to electromagnetic radiation incident on a surface, such as a photosensor. The term "exitance" (or "radiant exitance") refers to electromagnetic radiation that is emitted (e.g., reflected) from an object. Irradiance and exitance may be measured per unit area per second. For an object having an angular displacement θ with respect to the optical axis of the camera, the irradiance may generally decrease as cos⁴θ.
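As a purely illustrative aside (not part of the original disclosure), the cos⁴θ relationship can be evaluated numerically to see how quickly off-axis irradiance falls; the angles below are arbitrary and equal exitance is assumed:

```python
import math

def relative_irradiance(theta_deg):
    """Relative irradiance at the photosensor for an object at angular
    displacement theta from the optical axis, assuming equal exitance and a
    cos^4(theta) falloff with all other factors held equal."""
    return math.cos(math.radians(theta_deg)) ** 4

for angle_deg in (0, 10, 20, 30):
    print(f"{angle_deg:2d} deg -> {relative_irradiance(angle_deg):.3f}")
# 0 deg -> 1.000 ... 30 deg -> 0.563: an object 30 degrees off-axis images
# roughly 44% dimmer than an identical on-axis object.
```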
For various applications, it may be advantageous for objects in a scene imaged by the camera to have substantially the same irradiance on the camera photosensor regardless of the angular displacement θ. For example, a time-of-flight (TOF) three-dimensional (3D) camera determines the distance to an object in a scene imaged by the camera by timing how long it takes for light emitted by the camera to propagate to the object and back to the camera. For some systems, the illumination system illuminates the objects with very short light pulses. The camera images the light reflected by the objects and collected by the photosensor to determine the round-trip travel time of the light. Accurate distance measurement depends on the irradiance associated with the image of the object captured by the photosensor. Furthermore, accuracy may improve as irradiance increases.
Disclosure of Invention
An illumination system is provided having a refractive optical element that compensates for the dependence of the irradiance of an object's image, as captured by a photosensor, on angular displacement from the optical axis. The irradiance of an image of an object with a given exitance may depend on the object's angular displacement from the optical axis. The refractive optical element may be used in an imaging device such as a time-of-flight (TOF) camera. The refractive optical element may structure the light such that similar objects on the same spherical surface within the field of view of the camera have substantially the same irradiance on the camera photosensor.
One embodiment is an illumination system that includes an image sensor, a light source, and a refractive optical element. The image sensor has a photosensor that captures an image of an object in a field of view. The irradiance of an image of an object with a given exitance captured by a photosensor depends on angular displacement from the optical axis of the image sensor. A refractive optical element receives light from a light source and structures the light to illuminate a field of view to compensate for dependence of irradiance on angular displacement from the optical axis.
One embodiment is a depth camera system that includes an image sensor, a light source, a collimator, and a refractive diffuser. The photosensor captures an image of an object in the field of view. The collimator collimates light from the light source. The refractive diffuser receives light from the light source and structures the light within the field of view in accordance with 1/cos⁴θ, where θ is the angular displacement from the optical axis of the image sensor. The depth camera system also has logic to generate a depth image based on light received by the photosensor.
One embodiment is a method for generating a depth image, comprising the following steps. A refractive diffuser is used to refract light onto objects in the field of view. Light reflected from an object in the field of view is captured at a photosensor having an optical axis. The irradiance of an image of an object with a given exitance captured by the photosensor depends on angular displacement from the optical axis. Refracting the light includes: the light is structured to compensate for dependence of irradiance on angular displacement from the optical axis. A depth image is generated based on the captured light.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Drawings
In the drawings, like numbered elements correspond to each other.
FIG. 1 illustrates one embodiment of an image sensor within an image camera assembly.
FIG. 2A illustrates one embodiment of an illuminator within an image camera component.
FIG. 2B illustrates the illumination intensity created by one embodiment of an illuminator.
FIG. 3A depicts a side view of a refractive diffuser having a pair of lens arrays.
FIG. 3B depicts a front view of one of the lens arrays of FIG. 3A.
FIG. 3C depicts a side view of a refractive diffuser based on a microlens array.
FIG. 3D illustrates an embodiment in which the position of the lens has a different orientation than that of FIG. 3A.
FIG. 3E shows an embodiment in which two arrays of lenses are on opposite sides of a single lens.
FIG. 4A depicts one embodiment of a collimator.
FIG. 4B depicts another embodiment of a collimator.
FIG. 5 depicts an exemplary embodiment of a motion capture system.
FIG. 6 depicts an exemplary block diagram of the motion capture system of FIG. 5.
FIG. 7 is a flow diagram of one embodiment of a process for generating a depth image.
FIG. 8 depicts an example block diagram of a computing environment that may be used in the motion capture system of FIG. 5.
FIG. 9 depicts another example block diagram of a computing environment that may be used in the motion capture system of FIG. 5.
Detailed Description
An illuminator having a refractive optical element is disclosed. The illuminator may be used in or in conjunction with a camera or other device having a photosensor. The refractive optical element may be configured to illuminate a field of view (FOV) of the camera in a manner that compensates for angular displacement dependent dependence of irradiance of an image captured by the photosensor. In particular, the irradiance of an image of an object having the same exitance may depend on the angular displacement of the object from the optical axis. In some embodiments, the refractive optical element is a refractive diffuser that structures the light to have an increasing intensity with increasing angular displacement from the optical axis of the camera. The increased intensity may compensate for the dependence of irradiance on angular displacement.
FIG. 1 illustrates one embodiment of an image sensor 26 within the image camera assembly 22. The camera sensor 26 includes a photosensor 5 and a focusing lens 7. Note that the image camera component 22 may also include an illuminator that may illuminate the FOV. FIG. 2A illustrates one embodiment of the illuminator 24 within the image camera component 22. Note that both the illuminator 24 and the image sensor 26 may be present in the same image camera component 22. However, it is not absolutely required that the illuminator 24 and the image sensor 26 be housed within the same device. For purposes of discussion, the image sensor 26 of FIG. 1 will first be discussed. In some embodiments, the image camera component 22 is part of a depth camera system, an example of which will be described below. However, the image camera assembly 22 may use other devices employing the photoelectric sensor 5.
FIG. 1 is not necessarily drawn to the same scale as the other figures. In general, light reflected from objects 15a, 15b in the FOV is focused by the lens 7 onto the photosensor 5. For ease of illustration, the planar surface 3 in the FOV perpendicular to the optical axis 9 of the camera assembly is referred to as the "FOV imaging plane." Similarly, the spherical surface 11 in the FOV, centered at the optical center of the camera assembly 22, is referred to as the "FOV imaging sphere."
As mentioned, the irradiance of an object may depend on its position in the FOV. For example, object 15a is depicted as being on the optical axis 9, while object 15b is depicted at approximately an angle θ from the optical axis. However, both are on the spherical surface 11 and are therefore the same distance from the camera assembly 22. If, for the sake of discussion, the exitance of objects 15a and 15b is the same, the irradiance of the image of object 15b captured by photosensor 5 may be significantly lower than the irradiance of the image of object 15a. In other words, the image of object 15b will appear less intense to the photosensor 5 than the image of object 15a. The irradiance may decrease with angular displacement θ from the optical axis 9. In some cases, the drop may be about cos⁴θ.
To the left of the FOV imaging sphere 11 is shown a dashed curve 61 representing R(θ) = cos⁴θ. As noted, the reduction in irradiance may be about cos⁴θ, where θ is the angular displacement with respect to the optical axis 9. The label "reduced irradiance" is intended to mean that the image of an object captured at the photosensor 5 has reduced irradiance away from the optical axis 9. The reduced irradiance therefore refers to the image captured by the photosensor 5. Note that the falling irradiance does not refer to the exitance of the object 15 in the FOV.
It may be advantageous for various applications that objects with essentially the same exitance have substantially the same irradiance on the camera photosensor regardless of angular displacement from the optical axis 9. For example, a time-of-flight (TOF) three-dimensional (3D) camera determines the distance to a feature in a scene imaged by the camera by timing how long it takes for light emitted by the camera to propagate to the feature and back to the camera. For some techniques, the time of flight is effectively derived from the intensity of the captured light, as explained in the discussion below.
To time the light propagation, the illumination system may emit a plurality of very short light pulses into the FOV. These light pulses reflect off the objects 15 and back to the image sensor 26, which captures the light over a certain period of time. The photosensor 5 may comprise an array of pixels, each pixel generating an intensity value. Each pixel represents a different location in the FOV. The time of flight of a light pulse may then be determined based on the light intensity at each pixel. Since the time of flight depends on light intensity, being able to accurately compensate for the dependence of irradiance on angular displacement from the optical axis 9 may allow a much more accurate determination of the distance to the object 15.
Thus, accurately determining a distance measurement to the object 15 may depend on the irradiance associated with the image of the object captured by the photosensor 5.
Furthermore, accurate determination of distance measurements may improve as irradiance increases.
Furthermore, being able to illuminate the FOV with higher light intensity may improve distance determination.
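As an illustrative sketch only, one idealized shuttered-pulse scheme recovers distance from the ratio of gated charge to total charge; the rectangular-pulse model, pulse width, and gate timing below are assumptions for illustration, not values taken from this disclosure:

```python
C = 299_792_458.0  # speed of light, m/s

def gated_tof_distance_m(q_gated, q_total, pulse_width_s, gate_close_s):
    """Idealized shuttered-pulse TOF sketch: a rectangular pulse of width
    pulse_width_s is emitted at t=0 and the shutter closes at gate_close_s.
    For round-trip delays where the returning pulse straddles the shutter
    closing (delays between gate_close_s - pulse_width_s and gate_close_s),
    the gated charge fraction falls linearly with delay, so the delay can be
    recovered from the charge ratio. Normalizing by q_total cancels overall
    amplitude in this ideal model; in practice, low irradiance still degrades
    signal-to-noise, which is one reason uniform irradiance across the FOV helps."""
    fraction = q_gated / q_total                      # in [0, 1]
    round_trip_s = gate_close_s - fraction * pulse_width_s
    return 0.5 * C * round_trip_s

# Example: 20 ns pulse, shutter closes 30 ns after emission, half of the pulse
# energy captured -> 20 ns round trip -> about 3.0 m.
print(gated_tof_distance_m(0.5, 1.0, 20e-9, 30e-9))
```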
As alluded to previously, for applications of a 3D TOF camera (such as applications involving gesture recognition, and/or interfacing a computer with a person or with people simultaneously interacting in the same computer game running on the computer), it may often be advantageous for objects in the same plane 3 or on the same spherical surface 11 in the FOV of the camera to have the same irradiance on the camera photosensor.
To compensate for the dependence of irradiance on angular displacement from the optical axis 9, the illumination system may be configured to increase illumination as the angular displacement θ from the optical axis 9 increases. As a result, objects 15 having a greater angular displacement from the optical axis 9 can be illuminated with more intense light, so that they have a greater exitance. As one example, the illuminator may illuminate object 15b with a greater intensity than object 15a to increase the exitance of object 15b. In one embodiment, the illumination system illuminates the FOV of the camera with a light intensity substantially proportional to 1/cos⁴θ. Thus, the illumination system can compensate for the aforementioned cos⁴θ dependence of irradiance, where θ is the angular displacement from the optical axis 9. Note, however, that outside the FOV the light intensity may be very low, so that energy is not wasted.
FIG. 2A illustrates one embodiment of an image camera component 22 having an illuminator 24. In this embodiment, the illuminator 24 has a light source 23, a collimator 25, and a refractive optical element 40. In some embodiments, the refractive optical element 40 may also be referred to as a refractive diffuser. The refractive optical element 40 may receive light from the light source 23 and structure the light to illuminate the FOV to compensate for the dependence of irradiance on angular displacement from the optical axis. That is, the refractive optical element 40 may compensate for the decrease in irradiance of the object image captured by the photosensor 5 as the angle of the object from the optical axis (e.g., optical axis 9 of FIG. 1) increases.
In the embodiment of FIG. 2A, the collimator 25 is between the refractive optical element 40 and the light source 23. However, the refractive optical element 40 may instead be located between the light source 23 and the collimator 25. Although the collimator 25 and the refractive optical element 40 are depicted as separate physical elements in the embodiment of FIG. 2A, in other embodiments they form one integrated assembly. Although not explicitly shown in FIG. 2A, the image camera component 22 may have a photosensor (such as photosensor 5 of FIG. 1).
Curve 62 represents the intensity of the light provided by the illuminator 24. The curve 62 is intended to show that the intensity of the light provided by the illuminator 24 increases with increasing angular displacement from the illumination axis 109. For example, the lowest light intensity within the FOV may be along the illumination axis 109. The light intensity may increase with greater angular displacement from the illumination axis 109 within the FOV. Thus, object 15b may be illuminated with a greater intensity than object 15a. Consequently, object 15b may have a greater exitance than object 15a, other factors being equal. Note, however, that outside the FOV, the light intensity may be very low, so that energy is not wasted where illumination is unnecessary. As one example, the refractive optical element 40 may have an optical efficiency of about 95%.
FIG. 2B shows a graphical representation of relative illumination intensity as a function of angular displacement from the optical axis. The intensity may be for a sphere 11 at some arbitrary distance from the illuminator 24. In this example, the FOV may range from -30 to +30 degrees. Within this range, the light intensity may be approximately 1/cos⁴θ. Outside this range, however, the light intensity may drop very rapidly. Note that the drop in light intensity outside the FOV is only an example, and the relative illumination intensity may have a significantly different shape outside the FOV.
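A short numerical sketch of a profile shaped like FIG. 2B, and of why it compensates the cos⁴θ imaging falloff inside the FOV; the ±30 degree range follows this example, while the roll-off width outside the FOV is an assumption for illustration:

```python
import math

def illumination_profile(theta_deg, fov_half_deg=30.0, rolloff_deg=3.0):
    """Roughly 1/cos^4(theta) inside the FOV, dropping rapidly to zero outside.
    The linear roll-off outside the FOV is an illustrative assumption."""
    t = abs(theta_deg)
    peak = 1.0 / math.cos(math.radians(min(t, fov_half_deg))) ** 4
    if t <= fov_half_deg:
        return peak
    return max(0.0, peak * (1.0 - (t - fov_half_deg) / rolloff_deg))

def sensor_irradiance(theta_deg):
    """Illumination profile multiplied by the cos^4 imaging falloff: roughly
    constant inside the FOV, so equal-exitance objects image equally brightly."""
    return illumination_profile(theta_deg) * math.cos(math.radians(theta_deg)) ** 4

for a in (0, 15, 30, 33, 40):
    print(f"{a:2d} deg  illumination={illumination_profile(a):5.2f}  sensor={sensor_irradiance(a):4.2f}")
```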
Referring again to FIG. 2A, the illuminator 24 is configured to illuminate the FOV so as to compensate for the dependence of irradiance on angular displacement θ for the image of an object located on the FOV imaging sphere 11 in the FOV. To the left of the FOV imaging sphere 11 is shown a dashed curve 61 representing R(θ) = cos⁴θ. Curve 61 is also shown and discussed with reference to FIG. 1. As noted, the reduction in irradiance may be about cos⁴θ, where θ is the angular displacement with respect to the optical axis 9.
Certain examples of the refractive optical element 40 will be discussed next. In one embodiment, the refractive diffuser 40 includes a pair of lens arrays 43. FIG. 3A depicts a side view of a refractive diffuser 40 having a pair of lens arrays 43. FIG. 3B depicts a front view of one of the lens arrays 43 of FIG. 3A. Each lens array 43 has lenses 42. As one example, the lenses 42 may be on the order of several hundred microns in diameter; however, the lenses 42 may have a larger or smaller diameter. The lenses 42 may also be referred to as lenslets.
Referring now to FIG. 3A, the refractive optical element 40 may include two lenses 43a, 43b, each housing an array of lenslets 42. The collimated beam is received at lens 43a and passes through the lenslets 42 in lens 43a. In this embodiment, the lenslets 42 are depicted as having convex surfaces, but other options such as concave surfaces are possible. After being refracted by the lenslets 42 in lens 43a, the light rays pass through a gap before entering the lenslets 42 in lens 43b. The gap may be an air gap, but it may also be formed of a substance other than gas. The light rays then pass into the convex surfaces of the lenslets 42 on lens 43b. Thus, the light may be refracted again. The light then exits lens 43b. Two exemplary light rays are shown diverging from lens 43b. As described above, the refractive optical element 40 may structure the light in a manner that compensates for the dependence of irradiance on angular displacement. In some embodiments, the refractive optical element 40 may diffuse the light such that its intensity is higher at greater angular displacements from the illumination axis. Outside the FOV, however, the light intensity may be significantly reduced so that energy is not wasted.
Referring now to FIG. 3B, a front view of one lens 43 illustrates that, in this embodiment, the array of lenslets 42 may have a rectangular shape. However, other shapes are possible. As described above, one surface of each lens 42 may have a convex shape. The opposing surface of the lens 42 may have a variety of shapes. In one embodiment, the opposing surface is substantially flat.
Note that the lenses 42 need not all have the same curvature, size, and/or shape. In some embodiments, the lenses 42 in the array have a variety of different curvatures. Thus, different lenslets 42 may refract light at different angles. In some embodiments, the lenses 42 in the array have a variety of different shapes. For example, some lenslets 42 may be rectangular, others triangular, etc. In some embodiments, the lenses 42 in the array have a variety of different sizes. FIG. 3C depicts an embodiment of the refractive optical element 40 in which the lenses 42 have different curvatures and different sizes. The lenses 42 in FIG. 3C may be several tens of microns in diameter; however, the lenses 42 may have a larger or smaller diameter. The lenses 42 in FIG. 3C may be referred to as microlenses. In one embodiment, the desired illumination intensity profile is created by selecting an appropriate mix of lens curvatures, sizes, and/or shapes. Note that one or more of these or other attributes may be used to generate the desired illumination intensity profile.
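As a rough, hedged sketch of the idea that the mix of lenslet curvatures (i.e., focal lengths) shapes the far-field intensity profile, the thin-lens ray count below is an assumption-laden illustration rather than the design of the disclosed diffuser; the focal lengths and aperture are arbitrary:

```python
import math
from collections import Counter

def lenslet_exit_angles_deg(focal_length_mm, half_aperture_mm, n_rays=101):
    """Thin-lens sketch: a collimated ray striking a lenslet at height h from
    its center is deflected by an angle of magnitude roughly atan(h / f), so
    sampling rays across the aperture gives the spread of output angles that
    this lenslet contributes."""
    angles = []
    for i in range(n_rays):
        h = -half_aperture_mm + 2.0 * half_aperture_mm * i / (n_rays - 1)
        angles.append(math.degrees(math.atan2(h, focal_length_mm)))
    return angles

# Mixing lenslets with different focal lengths (curvatures) changes how many
# rays end up at large angles, which is one way to raise the relative intensity
# toward the edge of the FOV. Here three assumed focal lengths are combined and
# the exit angles are histogrammed in 5-degree bins.
histogram = Counter()
for f_mm in (0.3, 0.5, 0.9):
    for angle in lenslet_exit_angles_deg(f_mm, half_aperture_mm=0.15):
        histogram[5 * round(angle / 5)] += 1
print(sorted(histogram.items()))
```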
FIG. 3A shows the lenslets 42 of one array 41 facing the lenslets 42 of the other array 41. However, other configurations are possible. For example, there may be only a single array of lenses 42, as in FIG. 3C. In that case, the refractive optical element 40 may look more like either of the lenses 43a or 43b alone. Also, there may be more than two arrays 41 of lenslets. For example, there may be three arrays 41, four arrays 41, and so on.
Another possible variant is to reverse the positions of the lenses 43a, 43b. FIG. 3D shows an embodiment in which the lenses 43a, 43b have a different orientation compared to FIG. 3A. FIG. 3E shows an embodiment in which two arrays 41 of lenses 42 are on opposite sides of a single lens 43.
FIGS. 4A and 4B depict two other embodiments of the illuminator 24. Each illuminator 24 has a light source 23, a refractive optical element 40, and a collimator 25. In FIG. 4A, there may be a refractive optical element 40 at the input of the device, which refracts the non-collimated light from the light source 23. The collimator 25 collimates the light by internal reflection. The net effect is to produce a structured light output that compensates for the dependence of irradiance described herein. From the light beams drawn inside the collimator 25, it can be seen that some of the beams are refracted by the refractive optical element 40 toward the inner walls of the collimator 25 and may be reflected off those inner walls. These internally reflected beams help collimate the light. Other beams may be refracted by the refractive optical element 40 and pass through the collimator 25 without further reflection. The device may have a lens 42 at the output side if further refraction of the light beams is desired to properly structure the light.
The embodiment of FIG. 4B has a collimator 25 with a microlens array at the output. The collimator 25 may receive uncollimated light from the light source 23. Thus, by the time the light reaches the refractive optical element 40, the light may already have been collimated. In this embodiment, the microlenses in the refractive element 40 are not uniform in size. That is, some microlenses are larger than others. Also, the microlenses have different curvatures from one another. Therefore, different microlenses may refract the light differently. Note that the pattern of sizes and/or curvatures may be a regular pattern, or may be random or otherwise irregular. As can be appreciated from FIG. 4B, the pattern may be irregular.
The exemplary refractive optical elements of FIGS. 3A-4B are presented for illustrative purposes. Numerous other configurations may be used. In general, the refractive optical element 40 may be configured to distribute the light from the collimated beam such that very little light from the beam propagates outside the FOV, and such that the intensity of light within the FOV is inversely proportional to the dependence of irradiance on the angular displacement of an object in the FOV.
As mentioned above, for applications of 3D TOF cameras (such as applications involving gesture recognition, and/or interfacing a computer with a person or with people simultaneously interacting in the same computer game running on the computer), it may be advantageous for objects in the same plane or on the same sphere in the FOV of the camera to have the same irradiance on the camera photosensor.
In some embodiments, a refractive diffuser 40 is used in the motion capture system 10. A motion capture system acquires data about the position and motion of a human or other object in physical space and may use this data as input to an application in a computing system. There are many applications, such as for military, entertainment, sports and medical purposes. For example, the motion of a person may be mapped to a three-dimensional (3-D) human skeletal model and used to create an animated character or avatar. The motion capture system may include an optical system including a system using visible and non-visible (e.g., infrared) light, which uses a camera to detect the presence of a person in a field of view. In general, a motion capture system includes an illuminator that illuminates a field of view, and an image sensor that senses light from the field of view to form an image.
FIG. 5 depicts an example embodiment of the motion capture system 10 in which a person 8 interacts with an application, such as in the home of a user. The motion capture system 10 includes a display 196, a depth camera system 20, and a computing environment or device 12. The depth camera system 20 may include an image camera component 22 having an illuminator 24, such as an Infrared (IR) emitter, an image sensor 26, such as an infrared camera, and a red-green-blue (RGB) camera 28. A person 8, also referred to as a user or player, stands within the field of view 6 of the depth camera. Lines 2 and 4 represent the boundaries of the field of view 6.
The illuminator 24 may structure the light such that the intensity of the light increases with increasing angular displacement from the optical axis 9. The illuminator 24 may structure the light, as already discussed, such that the dependence of irradiance on angular displacement is compensated. Note that the light intensity may be very low outside the field of view 6. In this example, the optical axis 9 is aligned with the image sensor 26. The illuminator 24 may have an illumination axis (not shown in FIG. 5) that is not necessarily precisely aligned with the optical axis 9.
In this example, the depth camera system 20 and the computing environment 12 provide an application in which an avatar 197 on the display 196 tracks the movements of the person 8. For example, when the person raises an arm, the avatar may raise an arm. The avatar 197 stands on a road 198 in a 3-D virtual world. A Cartesian world coordinate system may be defined which includes: a z-axis extending, for example, horizontally along the focal length of the depth camera system 20; a vertically extending y-axis; and an x-axis extending laterally and horizontally. Note that the perspective of the drawing is shown simplified, because the display 196 extends vertically in the y-axis direction and the z-axis extends from the depth camera system perpendicular to the y-axis and x-axis and parallel to the ground on which the user 8 stands.
In general, the motion capture system 10 is used to identify, analyze, and/or track one or more human targets. The computing environment 12 may include a computer, a gaming system or console, or the like, as well as hardware components and/or software components that execute applications.
The depth camera system 20 may include a camera used to visually monitor one or more people, such as the person 8, so that gestures and/or movements performed by the person may be captured, analyzed, and tracked to perform one or more controls or actions within an application, such as animating an avatar or on-screen character or selecting a menu item in a User Interface (UI).
The motion capture system 10 may be connected to an audiovisual device such as the display 196, e.g., a television, a monitor, a High Definition Television (HDTV), etc., or even a projection on a wall or other surface that provides visual and audio output to the user. Audio output may also be provided via a separate device. To drive the display, the computing environment 12 may include a video adapter such as a graphics card and/or an audio adapter such as a sound card that provides audiovisual signals associated with the application. The display 196 may be connected to the computing environment 12 through, for example, an S-video cable, a coaxial cable, an HDMI cable, a DVI cable, a VGA cable, or the like.
The person 8 may be tracked using the depth camera system 20 such that gestures and/or movements of the user are captured and used to animate an avatar or on-screen character and/or interpreted as input controls to an application being executed by the computer environment 12.
Some movements of the person 8 may be interpreted as controls that may correspond to actions other than controlling an avatar. For example, in one embodiment, a player may use motion to end, pause, or save a game, select a level, view a high score, communicate with a friend, and so forth. The player may use the movements to select a game or other application from the main user interface, or to otherwise navigate a menu of options. In this manner, a full set of movements of the person 8 may be available, used, and analyzed in any suitable manner to interact with the application.
The motion capture system 10 may also be used to interpret target movements as operating system and/or application controls that are outside the realm of games or other applications intended for entertainment and leisure. For example, virtually any controllable aspect of an operating system and/or application may be controlled by movement of the person 8.
Fig. 6 depicts an exemplary block diagram of the motion capture system 10 of fig. 5. The depth camera system 20 may be configured to capture video with depth information (including depth images, which may include depth values) by any suitable technique, including, for example, time-of-flight, structured light, stereo images, and so forth. The depth camera system 20 may organize the depth information into "Z layers," or layers that may be perpendicular to a Z axis extending from the depth camera along its line of sight.
The depth camera system 20 may include an image camera component 22, the image camera component 22 capturing a depth image of a scene in a physical space. The depth image may include a two-dimensional (2-D) pixel area of the captured scene, where each pixel in the 2-D pixel area may have an associated depth value representing a linear distance to the image camera component 22, providing a 3-D depth image.
The image camera component 22 may include an illuminator 24, such as an Infrared (IR) emitter 24, an image sensor 26, such as an infrared camera, and a red-green-blue (RGB) camera 28, which may be used to capture depth images of a scene or to provide additional cameras for other applications. The illuminator 24 may be configured to compensate for the dependence of the irradiance of the object captured by the image sensor 26. Accordingly, illuminator 24 may have a refractive optical element 40 such as, but not limited to, any of the examples herein.
The 3-D depth camera is formed by a combination of the infrared emitter 24 and the infrared camera 26. For example, in time-of-flight analysis, the illuminator 24 may emit infrared light onto the physical space, and the image sensor 26 detects backscattered light from the surface of one or more targets and objects in the physical space. In some embodiments, pulsed infrared light may be used, such that the time difference between an outgoing light pulse and a corresponding incoming light pulse may be measured and used to determine a physical distance from the depth camera system 20 to a particular location on a target or object in physical space. The phase of the incident light wave can be compared to the phase of the outgoing light wave to determine the phase shift. The phase shift may then be used to determine a physical distance from the depth camera system to a particular location on the targets or objects.
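The phase-to-distance relation mentioned above can be written out explicitly; in the minimal sketch below, the modulation frequency is an assumed illustrative value, not one specified in this disclosure:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def distance_from_phase_m(phase_shift_rad, modulation_freq_hz):
    """Continuous-wave TOF sketch: a phase shift delta_phi between the outgoing
    and incoming modulated light corresponds to a round-trip time of
    delta_phi / (2 * pi * f), so distance = c * delta_phi / (4 * pi * f),
    subject to the ambiguity range c / (2 * f)."""
    return C * phase_shift_rad / (4.0 * math.pi * modulation_freq_hz)

# Example: a pi/2 phase shift at an assumed 20 MHz modulation -> about 1.87 m.
print(distance_from_phase_m(math.pi / 2, 20e6))
```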
Time-of-flight analysis may also be used to indirectly determine a physical distance from the depth camera system 20 to a particular location on the targets or objects by analyzing the intensity variation of the reflected beam of light over time via various techniques including, for example, shuttered light pulse imaging.
In another example embodiment, the depth camera system 20 may use structured light to capture depth information. In such an analysis, patterned light (i.e., light displayed as a known pattern such as a grid pattern or a stripe pattern) may be projected onto the scene by, for example, the illuminator 24. Upon striking the surface of one or more targets or objects in the scene, the pattern may become distorted in response. Such a deformation of the pattern may be captured by, for example, the image sensor 26 and/or the RGB camera 28, and may then be analyzed to determine a physical distance from the depth camera system to a particular location on the targets or objects.
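As a hedged sketch of how pattern deformation maps to distance, structured-light analysis is commonly reduced to triangulation between the illuminator 24 and the image sensor 26; the baseline, focal length, and disparity values below are assumptions for illustration:

```python
def structured_light_depth_m(disparity_px, focal_length_px, baseline_m):
    """Triangulation sketch: the apparent shift (disparity) of a projected
    pattern feature between where it was projected and where the camera sees
    it is inversely proportional to depth: Z = f * B / d."""
    return focal_length_px * baseline_m / disparity_px

# Example (assumed values): 580-pixel focal length, 7.5 cm baseline, and a
# 15-pixel disparity -> about 2.9 m.
print(structured_light_depth_m(15.0, 580.0, 0.075))
```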
The depth camera system 20 may also include a microphone 30 that includes, for example, a transducer or sensor that receives sound waves and converts them into electrical signals. Additionally, the microphone 30 may be used to receive audio signals, such as sounds, that may also be provided by a person to control applications that may be run by the computing environment 12. The audio signals may include human vocal sounds such as spoken words, whistling, shouts and other vocalizations, as well as non-vocal sounds such as applause or stomping.
The depth camera system 20 may include a processor 32 in communication with the 3-D depth camera 22. Processor 32 may include a standardized processor, a specialized processor, a microprocessor, or the like that may execute instructions that may include, for example, instructions for receiving a depth image; instructions for generating a voxel grid based on the depth image; instructions for removing background included in the voxel grid to isolate one or more voxels associated with the human target; instructions for determining a location or position of one or more extremities of an isolated human target; instructions for adjusting the model based on the location or positioning of the one or more extremities; or any other suitable instructions, which will be described in more detail below.
The depth camera system 20 may also include a memory component 34, which memory component 34 may store instructions that may be executed by the processor 32, as well as store images or frames of images captured by a 3-D camera or an RGB camera, or any other suitable information, images, or the like. According to an example embodiment, the memory component 34 may include Random Access Memory (RAM), Read Only Memory (ROM), cache, flash memory, a hard disk, or any other suitable tangible computer-readable storage component. The memory component 34 may be a separate component in communication with the image capture component 22 and the processor 32 via the bus 21. According to another embodiment, the memory component 34 may be integrated into the processor 32 and/or the image capture component 22.
The depth camera system 20 may communicate with the computing environment 12 over a communication link 36. The communication link 36 may be a wired and/or wireless connection. According to one embodiment, the computing environment 12 may provide a clock signal to the depth camera system 20 via the communication link 36 that indicates when to capture image data from a physical space that is in the field of view of the depth camera system 20.
In addition, the depth camera system 20 may provide the depth information and images captured by, for example, the image sensor 26 and/or the RGB camera 28, and/or a skeletal model that may be generated by the depth camera system 20, to the computing environment 12 via the communication link 36. The computing environment 12 may then use the model, depth information, and captured images to control the application. For example, as shown in FIG. 6, the computing environment 12 may include a gestures library 190, such as a collection of gesture filters, each having information about a gesture that may be performed by the skeletal model (as the user moves). For example, gesture filters may be provided for various gestures, such as a swipe or a throw of a hand. By comparing the detected motion to each filter, a specified gesture or motion performed by the person may be identified. The extent to which the motion is performed may also be determined.
The data captured by the depth camera system 20 in the form of a skeletal model and the movements associated with it may be compared to gesture filters in the gesture library 190 to identify when the user (as represented by the skeletal model) performed one or more particular movements. Those movements may be associated with various controls of the application.
The computing environment may also include a processor 192 for executing instructions stored in memory 194 to provide audio-video output signals to a display device 196 and to perform other functions as described herein.
FIG. 7 is a flow diagram of one embodiment of a process 700 for generating a depth image. This process may be performed in a depth camera such as the examples of FIGS. 5 and 6. In process 700, an illuminator 24 having a refractive optical element 40 may be used. Thus, the dependence of irradiance on angular displacement can be compensated. Step 702 comprises refracting light onto objects in the field of view using a refractive diffuser 40. Step 702 may include structuring the light to compensate for the dependence of irradiance on angular displacement. In one embodiment, step 702 comprises emitting a sequence of light (e.g., IR) pulses.
Step 704 comprises capturing, at the photosensor 5, light reflected from objects 15 in the field of view. The irradiance of the image of an object captured by the photosensor 5 may depend on angular displacement. As described herein, if two objects have the same exitance, the irradiance of the image of the object farther from the optical axis may be lower. As described, step 702 may structure the light to compensate for this dependence of irradiance on angular displacement. In some embodiments, step 704 includes capturing the reflections of the IR pulses for some predetermined period of time. For example, the photosensor 5 may be shuttered, such that the photosensor is allowed to collect light for some predetermined period of time and is blocked from receiving light outside that period.
Step 706 comprises generating a depth image based on the captured light. The depth image may contain depth values (e.g., distances to objects in the FOV). The depth image may include a two-dimensional (2-D) pixel area of the captured scene, where each pixel in the 2-D pixel area has an associated depth value that represents either a linear (radial) distance from the image camera component 22 or the Z-component of the 3-D location viewed by that pixel. In some embodiments, step 706 includes analyzing the intensity value of each pixel of the photosensor 5. The determination of the depth image is improved because the refractive optical element 40 compensates for the dependence of irradiance on angular displacement.
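The distinction this step draws between a radial (line-of-sight) distance and the Z-component can be made concrete with a pinhole-camera sketch; the intrinsic parameters below are assumed, not taken from this disclosure:

```python
import math

def radial_to_z_m(radial_m, px, py, cx, cy, focal_px):
    """Convert a per-pixel radial (line-of-sight) distance into the Z-component
    along the optical axis under an assumed pinhole model: the viewing ray
    through pixel (px, py) makes an angle with the axis whose cosine is
    f / sqrt(f^2 + dx^2 + dy^2)."""
    dx, dy = px - cx, py - cy
    cos_angle = focal_px / math.sqrt(focal_px ** 2 + dx ** 2 + dy ** 2)
    return radial_m * cos_angle

# Example: a pixel 200 px to the right of the principal point with an assumed
# 525 px focal length; a 3.0 m radial distance gives a Z of about 2.8 m.
print(radial_to_z_m(3.0, 520, 240, 320, 240, 525.0))
```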
FIG. 8 depicts an example block diagram of a computing environment that may be used in the motion capture system of FIG. 5. One or more gestures or other movements may be interpreted using the computing environment and, in response, the visual space on the display is updated. The computing environment such as the computing environment 12 described above may include a multimedia console 100 such as a gaming console. The multimedia console 100 includes a Central Processing Unit (CPU)101 having a level 1 cache 102, a level 2 cache 104, and a flash ROM (read only memory) 106. The level one cache 102 and the level two cache 104 temporarily store data and thus reduce the number of memory access cycles, thereby improving processing speed and throughput. The CPU 101 may be provided with more than one core and thus with additional level one caches 102 and level two caches 104. The memory 106, such as a flash ROM, may store executable code that is loaded during an initial phase of a boot process when the multimedia console 100 is powered ON.
A Graphics Processing Unit (GPU)108 and a video encoder/video codec (coder/decoder) 114 form a video processing pipeline for high speed and high resolution graphics processing. Data is carried from the graphics processing unit 108 to the video encoder/video codec 114 via a bus. The video processing pipeline outputs data to an a/V (audio/video) port 140 for transmission to a television or other display. A memory controller 110 is connected to the GPU 108 to facilitate processor access to various types of memory 112, such as RAM (random access memory).
The multimedia console 100 includes an I/O controller 120, a system management controller 122, an audio processing unit 123, a network interface 124, a first USB host controller 126, a second USB controller 128, and a front panel I/O subassembly 130 that are preferably implemented on a module 118. The USB controllers 126 and 128 serve as hosts for peripheral controllers 142(1) -142(2), a wireless adapter 148, and an external memory device 146 (e.g., flash memory, external CD/DVD ROM drive, removable media, etc.). The network interface (NW IF)124 and/or wireless adapter 148 provide access to a network (e.g., the internet, home network, etc.) and may be any of a wide variety of various wired or wireless adapter components including an ethernet card, a modem, a bluetooth module, a cable modem, and the like.
System memory 143 is provided to store application data that is loaded during the boot process. A media drive 144 is provided which may comprise a DVD/CD drive, hard drive, or other removable media drive. The media drive 144 may be internal or external to the multimedia console 100. Application data may be accessed via the media drive 144 for execution, playback, etc. by the multimedia console 100. The media drive 144 is connected to the I/O controller 120 via a bus, such as a Serial ATA bus or other high speed connection.
The system management controller 122 provides various service functions related to ensuring availability of the multimedia console 100. The audio processing unit 123 and the audio codec 132 form a corresponding audio processing pipeline with high fidelity and stereo processing. Audio data is transmitted between the audio processing unit 123 and the audio codec 132 via a communication link. The audio processing pipeline outputs data to the A/V port 140 for reproduction by an external audio player or device having audio capabilities.
The front panel I/O subassembly 130 supports the functionality of the power button 150 and the eject button 152, as well as any LEDs (light emitting diodes) or other indicators exposed on the outer surface of the multimedia console 100. The system power supply module 136 provides power to the components of the multimedia console 100. A fan 138 cools the circuitry within the multimedia console 100.
The CPU 101, GPU 108, memory controller 110, and various other components within the multimedia console 100 are interconnected via one or more buses, including serial and parallel buses, a memory bus, a peripheral bus, and a processor or local bus using any of a variety of bus architectures.
When the multimedia console 100 is powered ON, application data may be loaded from the system memory 143 into memory 112 and/or caches 102, 104 and executed on the CPU 101. The application may present a graphical user interface that provides a consistent user experience when navigating to different media types available on the multimedia console 100. In operation, applications and/or other media contained within the media drive 144 may be launched or played from the media drive 144 to provide additional functionalities to the multimedia console 100.
The multimedia console 100 may be operated as a standalone system by simply connecting the system to a television or other display. In the standalone mode, the multimedia console 100 allows one or more users to interact with the system, watch movies, or listen to music. However, with the integration of broadband connectivity made available through the network interface 124 or the wireless adapter 148, the multimedia console 100 may further be operated as a participant in a larger network community.
When the multimedia console 100 is powered ON, a specified amount of hardware resources may be reserved for system use by the multimedia console operating system. These resources may include a reserve of memory (such as 16 MB), a reserve of CPU and GPU cycles (such as 5%), a reserve of network bandwidth (such as 8 kbps), and so on. Because these resources are reserved at system boot time, the reserved resources are not present from the application's perspective.
In particular, the memory reservation is preferably large enough to contain the launch kernel, concurrent system applications and drivers. The CPU reservation is preferably constant such that if the reserved CPU usage is not used by the system applications, the idle thread will consume any unused cycles.
With regard to the GPU reservation, lightweight messages generated by system applications (e.g., popups) are displayed by using a GPU interrupt to schedule code to render popup into an overlay. The amount of memory required for the overlay depends on the overlay area size, and the overlay preferably scales with the screen resolution. Where the concurrent system application uses a full user interface, it is preferable to use a resolution that is independent of the application resolution. A scaler may be used to set this resolution so that there is no need to change the frequency and cause a TV resynch.
After the multimedia console 100 boots and system resources are reserved, concurrent system applications execute to provide system functionality. The system functions are encapsulated in a set of system applications that execute within the reserved system resources described above. The operating system kernel identifies threads that are either system application threads or game application threads. The system applications are preferably scheduled to run on the CPU 101 at predetermined times and intervals in order to provide a consistent view of system resources to the application. The scheduling is to minimize cache disruption for the gaming application running on the console.
When the concurrent system application requires audio, audio processing is scheduled asynchronously with respect to the gaming application due to its time sensitivity. A multimedia console application manager (described below) controls the audio level (e.g., mute, attenuate) of the gaming application while system applications are active.
The input devices (e.g., controllers 142(1) and 142(2)) are shared by the gaming application and the system applications. Rather than reserving resources, the input devices are switched between the system applications and the gaming application so that each has a focus of the device. The application manager preferably controls the switching of the input stream without the gaming application's knowledge, and a driver maintains state information regarding focus switches. The console 100 may receive additional input from the depth camera system 20 of FIG. 6, which includes the cameras 26 and 28.
FIG. 9 depicts another example block diagram of a computing environment that may be used in the motion capture system of FIG. 5. In a motion capture system, the computing environment may be used to determine a depth image and to interpret one or more gestures or other movements and, in response, update a visual space on a display. The computing environment 220 includes a computer 241, which typically includes a variety of tangible computer-readable storage media. This can be any available media that can be accessed by computer 241 and includes both volatile and nonvolatile media, removable and non-removable media. The system memory 222 includes computer storage media in the form of volatile and/or nonvolatile memory such as Read Only Memory (ROM) 223 and Random Access Memory (RAM) 260. A basic input/output system 224 (BIOS), containing the basic routines that help to transfer information between elements within computer 241, such as during start-up, is typically stored in ROM 223. RAM 260 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 259. Graphics interface 231 communicates with GPU 229. By way of example, and not limitation, FIG. 9 depicts operating system 225, application programs 226, other program modules 227, and program data 228.
The computer 241 may also include other removable/non-removable, volatile/nonvolatile computer storage media such as a hard disk drive 238 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 239 that reads from or writes to a removable, nonvolatile magnetic disk 254, and an optical disk drive 240 that reads from or writes to a removable, nonvolatile optical disk 253 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile tangible computer readable storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 238 is typically connected to the system bus 221 through a non-removable memory interface such as interface 234, and magnetic disk drive 239 and optical disk drive 240 are typically connected to the system bus 221 by a removable memory interface, such as interface 235.
The drives and their associated computer storage media discussed above and depicted in FIG. 9, provide storage of computer readable instructions, data structures, program modules and other data for the computer 241. For example, hard disk drive 238 is depicted as storing operating system 258, application programs 257, other program modules 256, and program data 255. Note that these components can either be the same as or different from operating system 225, application programs 226, other program modules 227, and program data 228. Operating system 258, application programs 257, other program modules 256, and program data 255 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 241 through input devices such as a keyboard 251 and pointing device 252, commonly referred to as a mouse, trackball or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 259 through a user input interface 236 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a Universal Serial Bus (USB). The depth camera system 20 of FIG. 6, including the cameras 26 and 28, may define additional input devices for the console 100. A monitor 242 or other type of display is also connected to the system bus 221 via an interface, such as a video interface 232. In addition to the monitor, computers may also include other peripheral output devices such as speakers 244 and printer 243, which may be connected through an output peripheral interface 233.
The computer 241 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 246. The remote computer 246 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 241, although only a memory storage device 247 has been depicted in FIG. 9. The logical connections include a local area network (LAN) 245 and a wide area network (WAN) 249, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.
When used in a LAN networking environment, the computer 241 is connected to the LAN 245 through a network interface or adapter 237. When used in a WAN networking environment, the computer 241 typically includes a modem 250 or other means for establishing communications over the WAN 249, such as the Internet. The modem 250, which may be internal or external, may be connected to the system bus 221 via the user input interface 236, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 241, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 9 depicts remote application programs 248 as residing on memory device 247. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
The foregoing detailed description of the technology has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the technology to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen in order to best explain the principles of the technology and its practical application to thereby enable others skilled in the art to best utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. The scope of the present technology is defined by the appended claims.

Claims (6)

1. An illumination system, comprising:
an image sensor (26), the image sensor (26) having a photosensor (5) configured to capture images of objects in a field of view outside the illumination system, the image sensor having an optical axis, wherein irradiance of an image of an object in the field of view captured by the photosensor with a given exitance is dependent on an angular displacement from the optical axis;
a light source (23); and
a refractive diffuser (40), the refractive diffuser (40) configured to receive light from the light source, the refractive diffuser having a first array of convex lenses and a second array of convex lenses, the convex lenses of the first array having convex faces facing the light source and the convex lenses of the second array having convex faces facing away from the light source, wherein the refractive diffuser (40) is configured to refract light from the light source to structure the light to illuminate the field of view outside the illumination system so as to compensate for the dependence of irradiance on angular displacement from the optical axis, wherein objects on the same spherical field-of-view imaging surface and having the same exitance have substantially the same irradiance on the photosensor independent of the angular displacement of the object from the optical axis; and
logic configured to generate a depth image having depth values to the object based on light received by the photosensor while the light from the light source is being refracted into the field of view.
2. The illumination system of claim 1, further comprising a collimator that receives light from the light source, wherein the refractive diffuser is configured to receive the collimated light and to structure the collimated light to illuminate the field of view to compensate for the dependence of irradiance on angular displacement from the optical axis.
3. The illumination system of claim 1, wherein the refractive diffuser is configured to diffuse the light in accordance with 1/cos⁴θ to illuminate objects within the field of view, where θ is the angular displacement from the optical axis.
4. A method for generating a depth image, comprising:
refracting light from a light source onto an object in a field of view of a depth camera using a refractive diffuser (702), the refractive diffuser having a first array of convex lenses and a second array of convex lenses, the convex lenses of the first array having convex faces facing the light source and the convex lenses of the second array having convex faces facing away from the light source, including projecting light from the light source through the first array of convex lenses and through the second array of convex lenses;
capturing, at a photosensor of the depth camera having an optical axis, a portion of the refracted light returning from objects in the field of view, wherein the irradiance of an image of an object having a given exitance captured by the photosensor is dependent on angular displacement from the optical axis, wherein refracting the light comprises refracting the light through the first and second arrays of lenses to structure the light to compensate for the dependence of irradiance on angular displacement from the optical axis (704), wherein objects on the same spherical field-of-view imaging surface and having the same exitance have substantially the same irradiance on the photosensor independent of the angular displacement of the objects from the optical axis; and
generating, based on the captured light, a depth image (706) including depth values to the object.
5. The method of claim 4, wherein refracting the light comprises refracting the light in accordance with 1/cos⁴θ, where θ is the angular displacement from the optical axis.
6. The method of claim 4, wherein refracting light onto an object in the field of view using a refractive diffuser comprises: structuring the light to reduce intensity outside the field of view.
HK13100907.7A 2011-03-07 2013-01-21 Illuminator with refractive optical element HK1173785B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/042,028 US9551914B2 (en) 2011-03-07 2011-03-07 Illuminator with refractive optical element
US13/042,028 2011-03-07

Publications (2)

Publication Number Publication Date
HK1173785A1 (en) 2013-05-24
HK1173785B true HK1173785B (en) 2017-08-04
