HK1181519B - User controlled real object disappearance in a mixed reality display - Google Patents
- Publication number: HK1181519B
- Authority: HK (Hong Kong)
Description
Technical Field
The invention relates to user-controlled disappearance of real objects in a mixed reality display.
Background
Mixed reality, also known as augmented reality, is a technology that allows virtual imagery to be mixed with a real-world view. See-through, mixed or augmented reality display devices differ from other display devices in that the displayed images do not monopolize the user's view. When a user looks at the computer screen of a laptop, desktop, or smartphone, software executing on the processor generates what is viewed on one hundred percent of the screen, and while viewing the screen, the user's attention is directed away from the real world. With a see-through, mixed reality display device, the user can see through the display and interact with the real world while also seeing images generated by one or more software applications. It can be said that control of the display is shared by the executing software and by the people and things the user sees, which are not computer controlled.
Disclosure of Invention
In the embodiments described below, a see-through, head-mounted, mixed reality display device system causes a real-world object to disappear from the field of view of the device because of the real-world object's relationship to a subject associated with a particular user. In some examples, the subject may be identified as a subject to be avoided. A real-world object may also disappear because of its lack of relevance to a topic of current interest to the user.
The present technology provides embodiments of one or more processor-readable storage devices having instructions encoded thereon which cause one or more processors to execute a method for causing a real object in a see-through display of a see-through, mixed reality display device system to disappear. The method includes receiving metadata identifying one or more real objects in a field of view of the see-through display. For example, the one or more real objects may be identified based on image data captured by one or more physical environment facing cameras attached to the see-through, mixed reality display device system. The method further includes determining whether any of the one or more real objects satisfies a user disappearance criterion. In response to determining that a real object satisfies a user disappearance criterion, image data is tracked to the real object in the see-through display to cause the real object to disappear from the field of view of the see-through display. The content of the image data is based on an alteration technique assigned to the real object.
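As a rough illustration (not part of the claims), the method's steps of receiving object metadata, testing it against user disappearance criteria, and associating an alteration technique can be sketched as follows; all names and data shapes here are hypothetical:

```python
# Illustrative sketch only; function and field names are hypothetical.

def process_field_of_view(object_metadata, disappearance_criteria):
    """Return the objects whose identifying metadata satisfies any user
    disappearance criterion, paired with the alteration technique
    assigned to the matching criterion."""
    to_disappear = []
    for obj in object_metadata:                    # receive metadata step
        for criterion in disappearance_criteria:   # criterion-satisfied step
            if criterion["subject"] in obj["keywords"]:
                # image data would then be tracked to this object using
                # the assigned alteration technique
                to_disappear.append((obj["id"], criterion["technique"]))
                break
    return to_disappear

objects = [
    {"id": 1, "keywords": {"person", "clown"}},
    {"id": 2, "keywords": {"tree", "oak"}},
]
criteria = [{"subject": "clown", "technique": "replacement"}]
print(process_field_of_view(objects, criteria))  # [(1, 'replacement')]
```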
The present technology provides embodiments of a see-through, head-mounted, mixed reality display device system for causing real objects in a field of view of a see-through display of the display device system to disappear. The system includes one or more location detection sensors and a memory for storing user disappearance criteria including at least one subject item. One or more processors have access to the memory and are communicatively coupled to the one or more location detection sensors to receive location identifier data for the display device system. The one or more processors identify one or more real objects in a field of view of the see-through display that relate to the at least one subject item and are within a predetermined visibility distance for a location determined from the location identifier data. At least one image generation unit is communicatively coupled to the one or more processors and optically coupled to the see-through display to track image data to the identified one or more real objects in the field of view of the see-through display to cause disappearance of the one or more real objects.
The present technology also provides an embodiment of a method for causing a real object in a field of view of a see-through display of a see-through, mixed reality display device system to disappear. The method includes receiving user input of a physical action identifying a subject for disappearance, the subject including a real object for disappearance that is currently in the field of view of the see-through display. The subject for disappearance is stored in the user disappearance criteria. Image data is tracked to the real object for disappearance in accordance with an alteration technique. In response to identifying a user-designated alteration technique to be applied to the real object currently in view for disappearance, the user-designated alteration technique is selected as the alteration technique.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Drawings
FIG. 1A is a block diagram depicting example components of one embodiment of a see-through, mixed reality display device system.
Fig. 1B is a block diagram depicting example components of another embodiment of a see-through, mixed reality display device system.
Fig. 2A is a side view of the temple of a frame in an embodiment of a see-through, mixed reality display device embodied as eyeglasses that provides support for hardware and software components.
Fig. 2B is a top view of an embodiment of a display optical system of a see-through, near-eye, mixed reality device.
FIG. 3A is a block diagram of one embodiment of hardware and software components of a see-through, near-eye, mixed reality display device that may be used with one or more embodiments.
FIG. 3B is a block diagram depicting components of the processing unit.
Fig. 4 is a block diagram of an embodiment of a system for causing a real object in a display of a see-through, mixed reality display device system to disappear from a software perspective.
Fig. 5A illustrates an example of a subject item data record for disappearance in the user disappearance criteria.
Fig. 5B illustrates an example of data fields of a real object type record of a subject item data record in a user disappearance criterion.
Fig. 5C is an example of a current topic of interest data record in user disappearance criteria.
Fig. 5D illustrates an example of metadata for identifying real objects.
Fig. 5E illustrates an example of a reference object data set for an inanimate object, which may provide data fields for a replacement object in an alteration technique.
FIG. 5F illustrates an example of a reference object data set for a person, which may provide data fields for a replacement object in an alteration technique.
FIG. 6A is a flow diagram of an embodiment of a method for determining the position of real and virtual objects in a three-dimensional field of view of a display device system.
FIG. 6B is a flow diagram of an embodiment of a method for identifying one or more real objects in a field of view of a display device.
FIG. 6C is a flow diagram of an embodiment of a method for generating a three-dimensional model of a location.
FIG. 6D is a flow diagram of an embodiment of a method for determining the position of real and virtual objects in a three-dimensional field of view of a display device system based on a three-dimensional model of a location.
Fig. 7 is a flow diagram of an embodiment of a method for causing a real object in a field of view of a see-through, mixed reality display device system to disappear based on a user disappearance criterion being met.
FIG. 8 is a flow diagram of an embodiment of another method for causing a real object in a field of view of a see-through, mixed reality display device system to disappear.
FIG. 9 is a flow diagram of an embodiment of a process for selecting an alteration technique based on the visibility of a real object for disappearance.
FIG. 10 is a flow diagram of an embodiment of a process of sharing altered image data between see-through, mixed reality display device systems that are within a predetermined distance of each other.
FIG. 11 is a flow diagram of an embodiment of another method for causing a real object in a field of view of a see-through, mixed reality display device system to disappear based on a current topic of interest.
FIG. 12 is a flow diagram of an embodiment of a process for providing a user with a collision warning for a disappearing real object.
Figs. 13A, 13B, 13C, and 13D illustrate examples of processing gesture user input that identifies a real object for disappearance.
Figs. 14A, 14B, 14C, and 14D illustrate examples of different alteration techniques applied, based on different degrees of visibility, to real objects that satisfy user disappearance criteria.
Figs. 15A and 15B illustrate examples of causing a real object that does not satisfy relevance criteria to disappear.
FIG. 16 is a block diagram of one embodiment of a computing system that may be used to implement a network accessible computing system hosting a disappearing application.
FIG. 17 is a block diagram of an exemplary mobile device that may operate in embodiments of the present technology.
Detailed Description
The present technology provides embodiments for causing a real object in a see-through display of a mixed reality display device system to disappear. As in other image processing applications, the term "object" may refer to a person or a thing. For example, as part of a pattern recognition process for identifying what is in an image, edge detection may be applied to a person or a thing in the image data. As described above, with a see-through, mixed reality display device, a user can literally see through the display and interact with the real world while also seeing images generated by one or more software applications.
A software application executing on the display device system identifies user disappearance criteria. In some examples, the user disappearance criteria may be based on user input that specifically identifies subject items whose associated objects the user does not want to see. In some examples, a subject may be a category or general topic that can be embodied in different types of objects. For example, the identified subject may be a type of person, such as a clown. In another example, a subject may be a type of tree. In other examples, a subject item may refer to a particular object or group of objects. For example, a subject may be a particular person (such as a user's nine-year-old brother) or a particular thing (such as a cell tower at a particular location that spoils the landscape). In another example, the subject may be a particular person, place, or thing, with different types of objects related to that particular person, place, or thing. An example of such a particular subject is a particular restaurant chain. Some examples of related objects are a person wearing the restaurant chain's staff uniform, a billboard advertising the chain, a road sign for the chain, or a building housing one of the chain's restaurants.
In some embodiments, the user disappearance criteria may be determined based on a topic of current interest to the user. To emphasize information about the topic of current interest, objects that do not satisfy relevance criteria for that topic are made to disappear from the display.
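The relevance filtering just described can be sketched as follows; the keyword-overlap test and all names are illustrative assumptions, not the patent's actual method:

```python
# Hypothetical sketch of relevance filtering for a current topic of
# interest: objects unrelated to the topic are marked to disappear.

def irrelevant_objects(object_metadata, topic_keywords):
    """Return ids of real objects sharing no keyword with the topic."""
    return [obj["id"] for obj in object_metadata
            if not topic_keywords & obj["keywords"]]

scene = [
    {"id": 1, "keywords": {"painting", "art"}},   # relevant in a gallery
    {"id": 2, "keywords": {"exit", "sign"}},      # unrelated clutter
]
print(irrelevant_objects(scene, {"painting", "sculpture", "art"}))  # [2]
```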
As described below, image data is placed in the see-through display over at least a portion of a real object, in the field of view of the see-through display, that is related to a subject item. As the user's head, body, or eye gaze changes, or the position of the real object changes, or both, the overlaid image data tracks the position of the disappearing real object relative to the field of view of the see-through display.
Different alteration techniques can be employed to make an object disappear. Some techniques apply a simple redaction effect that covers a real object with black image data, or obscure the real object with blurred image data tracked to it. Other techniques replace the real object in the display with a virtual object of a different type. For example, if a clown is to be blocked, an avatar that tracks the movements of the real clown may be displayed in the see-through display. In other examples, an erasure technique may be employed. One implementation of the erasure technique displays image data of what is behind the real object, overlaying the real object in the see-through display. In another implementation, the image data is generated by copying image data of objects surrounding the real object; this image data is displayed over the real object, effectively blending the real object out of the field of view of the display. In some embodiments, the position of a real object satisfying the user disappearance criteria, within a predetermined visibility distance of the see-through display, is the basis for selecting the alteration technique applied to that object.
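One way such distance-based selection might work is shown below; this is purely a sketch, with the distance thresholds and technique names as illustrative assumptions:

```python
# Illustrative selection of an alteration technique by distance from
# the display; thresholds are placeholder assumptions, not claimed values.

def select_alteration_technique(distance_m, visibility_limit_m=30.0):
    if distance_m > visibility_limit_m:
        return None            # beyond the visibility distance: no overlay
    if distance_m > 10.0:
        return "redaction"     # far away: cheap black/blur overlay suffices
    if distance_m > 3.0:
        return "replacement"   # mid range: swap in a virtual object/avatar
    return "erasure"           # close up: blend the object out entirely

assert select_alteration_technique(50.0) is None
assert select_alteration_technique(15.0) == "redaction"
assert select_alteration_technique(5.0) == "replacement"
assert select_alteration_technique(1.0) == "erasure"
```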
In particular, when erasure techniques are used, certain embodiments provide a collision-avoidance safety feature: the position or trajectory of the see-through, mixed reality display device and the relative position or trajectory of an "erased" real object are tracked, and a safety warning is output when the "erased" real object and the display device come within a collision distance of each other.
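The safety check can be sketched as a simple trajectory comparison; the look-ahead horizon, the sampling, and the straight-line motion model are all simplifying assumptions for illustration:

```python
# Minimal sketch of the collision-avoidance check: compare the wearer's
# position/velocity with those of an "erased" object and warn when the
# predicted separation falls inside a collision distance.

def collision_warning(user_pos, user_vel, obj_pos, obj_vel,
                      horizon_s=2.0, collision_dist_m=1.0):
    """Positions/velocities are (x, z) ground-plane pairs.  Return True
    if the straight-line trajectories come within the collision
    distance inside the look-ahead horizon."""
    steps = 20
    for i in range(steps + 1):
        t = horizon_s * i / steps
        dx = (user_pos[0] + user_vel[0] * t) - (obj_pos[0] + obj_vel[0] * t)
        dz = (user_pos[1] + user_vel[1] * t) - (obj_pos[1] + obj_vel[1] * t)
        if (dx * dx + dz * dz) ** 0.5 <= collision_dist_m:
            return True
    return False

# Walking straight at a stationary erased object 3 m ahead at 2 m/s
# triggers a warning; the same walk past an object 10 m to the side
# does not.
print(collision_warning((0, 0), (0, 2), (0, 3), (0, 0)))   # True
print(collision_warning((0, 0), (0, 2), (10, 3), (0, 0)))  # False
```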
FIG. 1A is a block diagram depicting example components of one embodiment of a see-through, mixed reality display device system. System 8 includes a transparent display device as a near-eye, head-mounted display device 2 in communication with processing unit 4 via line 6. In other embodiments, head mounted display device 2 communicates with processing unit 4 through wireless communication. The processing unit 4 may take various embodiments. In some embodiments, processing unit 4 is a separate unit that may be worn on the user's body (e.g., the wrist in the illustrated example) or placed in a pocket, and includes most of the computing power for operating near-eye display device 2. Processing unit 4 may communicate wirelessly (e.g., WiFi, bluetooth, infrared, RFID transmission, Wireless Universal Serial Bus (WUSB), cellular, 3G, 4G, or other wireless communication devices) with one or more computing systems 12 over a communication network 50, whether located nearby as in this example or at a remote location. In other embodiments, the functionality of the processing unit 4 may be integrated in the software and hardware components of the display device 2.
The head mounted display device 2, which in this embodiment is in the shape of eyeglasses with a frame 115, is worn on the head of a user so that the user can see through a display (embodied in this example as a display optical system 14 for each eye) and thereby have an actual direct view of the space in front of the user. The term "actual direct view" refers to the ability to see real-world objects directly with the human eye, rather than seeing created image representations of the objects. For example, looking through glasses at a room allows a user to have an actual direct view of the room, whereas viewing a video of a room on a television is not an actual direct view of the room. Based on the context of executing software (e.g., a gaming application), the system can project images of virtual objects (sometimes referred to as virtual images) on the display, viewable by the person wearing the see-through display device while that person also views real-world objects through the display. Thus, each display optical system 14 is a see-through display for its respective eye, and the two display optical systems 14 may also be referred to collectively as a see-through display.
The frame 115 provides a support for holding the elements of the system in place and a conduit for electrical connections. In this embodiment, the frame 115 provides a convenient frame for the glasses as a support for the elements of the system discussed further below. In other embodiments, other support structures may be used. Examples of such structures are a visor or goggles. The frame 115 includes a temple or side arm for resting on each ear of the user. The temple 102 represents an embodiment of a right temple and includes control circuitry 136 of the display device 2. The nose bridge 104 of the frame comprises a microphone 110 for recording sound and transmitting audio data to the processing unit 4.
The computing system 12 may be a computer, a gaming system or console, or a combination of one or more of these. The application may be executed on the computing system 12 or may be executed in the see-through, mixed reality display device 8.
In the present embodiment, computing system 12 is communicatively coupled to one or more capture devices 20A and 20B. In other embodiments, more or less than two capture devices may be used to capture a room or other physical environment of the user. The capture devices 20A and 20B may be, for example, cameras that visually monitor one or more users and the surrounding space so that gestures and/or movements performed by the one or more users and the structure of the surrounding space may be captured, analyzed, and tracked. Gestures act as one or more controls or actions within the application and/or are used to animate an avatar or on-screen character.
The capture devices 20A and 20B may be depth cameras. According to an example embodiment, each capture device 20A, 20B may be configured to capture video with depth information including a depth image, which may include depth values, by any suitable technique, which may include, for example, time-of-flight, structured light, stereo image, or the like. According to one embodiment, the capture devices 20A, 20B may organize the depth information into "Z layers," or layers that may be perpendicular to a Z axis extending from the depth camera along its line of sight. The capture devices 20A, 20B may include image camera components that may include an IR light component for capturing a depth image of a scene, a three-dimensional (3-D) camera, and an RGB camera. The depth image may include a two-dimensional (2-D) pixel area of the captured scene, where each pixel in the 2-D pixel area may represent a length (e.g., in centimeters, millimeters, etc.) of an object in the captured scene from the camera.
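For illustration, back-projecting a single depth pixel into a 3-D camera-space point can be sketched with a pinhole camera model; the intrinsic parameters below are placeholder values, not the capture devices' actual calibration:

```python
# Sketch of back-projecting one depth-camera pixel into a 3-D point.
# fx, fy, cx, cy are hypothetical pinhole intrinsics.

def depth_pixel_to_point(u, v, depth_mm, fx=525.0, fy=525.0,
                         cx=320.0, cy=240.0):
    """Return (x, y, z) in millimetres for pixel (u, v) whose depth
    value is the distance along the camera's Z axis."""
    z = depth_mm
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# A pixel at the principal point maps straight down the optical axis.
print(depth_pixel_to_point(320, 240, 1500.0))  # (0.0, 0.0, 1500.0)
```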
Each capture device (20A and 20B) may also include a microphone (not shown). The computing system 12 may be connected to an audiovisual device 16, such as a television, monitor, or high-definition television (HDTV), that may provide game or application visuals. In some cases, the audiovisual device 16 may be a three-dimensional display device. In one example, the audiovisual device 16 includes built-in speakers. In other embodiments, the audiovisual device 16, a separate stereo system, or the computing system 12 is connected to external speakers 22.
FIG. 1B is a block diagram depicting example components of another embodiment of a see-through, mixed reality display device system 8 that can communicate with other devices over a communication network 50. In this embodiment, near-eye display device 2 communicates with a mobile computing device 5, which is an example embodiment of processing unit 4. In the example shown, the mobile device 5 communicates via a line 6, but in other examples the communication may also be wireless.
Also, as with computing system 12, an application such as the disappearing application (see 456 below) may execute on a processor of mobile device 5, with user actions controlling the application, and the application may display image data through display optical system 14. The display 7 of the mobile device 5 may also display data (e.g., menus) of the executing application and may be touch sensitive for receiving user input. The mobile device 5 also provides a network interface for communicating with other computing devices, such as computing system 12, over the Internet 50 or via another communication network (e.g., WiFi, Bluetooth, infrared, RFID transmission, WUSB, cellular, 3G, 4G, or other wireless communication means) using a wired or wireless communication medium and protocol. A remote network accessible computing system, such as computing system 12, may be leveraged by a processing unit 4, such as mobile device 5, for processing power and remote data access. Examples of hardware and software components of the mobile device 5 (which may be included in a smartphone or tablet computing device, for example) are described in Fig. 17, and these components may include the hardware and software components of the processing unit 4, such as those discussed in the embodiment of Fig. 3B. Some other examples of mobile devices 5 are smartphones, laptop or notebook computers, and netbook computers.
Fig. 2A is a side view of the temple 102 of the frame 115 in an embodiment of a see-through, mixed reality display device embodied as eyeglasses that provides support for hardware and software components. At the front of frame 115 is a video camera 113 facing the physical environment, which can capture video and still images and send them to the processing unit 4, 5. In particular, in some embodiments in which the display device 2 does not operate in conjunction with depth cameras such as the capture devices 20A and 20B of the computing system 12, the camera 113 facing the physical environment may be a depth camera as well as a camera sensitive to visible light. The camera may include one or more depth sensors and corresponding infrared illuminators as well as visible light detectors. Other examples of detectors that may be included in camera 113 of head mounted display device 2 are, without limitation, SONAR, LIDAR, structured light, and/or time-of-flight distance detectors positioned to detect objects that the wearer of the device may be viewing.
Data from the camera may be sent to the processor 210 of the control circuitry 136, or the processing units 4, 5, or both, which may process the data, but the units 4, 5 may also send the data over the network 50 or to one or more computer systems 12. The process identifies and maps the user's real world field of view. In addition, the camera 113 facing the physical environment may further include an exposure meter for measuring ambient light.
Control circuitry 136 provides various electronics that support the other components of head mounted display device 2. More details of the control circuitry 136 are provided below with reference to Fig. 3A. Within or mounted to the temple 102 are an earpiece 130, inertial sensors 132, one or more location sensors 144 (e.g., a GPS transceiver), an infrared (IR) transceiver, an optional electrical impulse sensor 128 for detecting commands via eye movement, and a temperature sensor 138. In one embodiment, inertial sensors 132 include a three-axis magnetometer 132A, a three-axis gyroscope 132B, and a three-axis accelerometer 132C (see Fig. 3A). The inertial sensors are for sensing the position, orientation, and sudden accelerations of head mounted display device 2. From these movements, head position may also be determined.
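As a sketch of how head orientation might be derived from the accelerometer's gravity reading when the head is at rest (the axis convention and formulas are illustrative assumptions, not the device's actual sensor fusion):

```python
# Illustrative estimate of head pitch and roll from a three-axis
# accelerometer at rest; assumes z points out of the back of the head
# and gravity is the only acceleration.

import math

def head_orientation(ax, ay, az):
    """Return (pitch, roll) in degrees from the gravity vector."""
    pitch = math.degrees(math.atan2(-ax, math.sqrt(ay * ay + az * az)))
    roll = math.degrees(math.atan2(ay, az))
    return pitch, roll

# With the device level, gravity lies on the z axis and both angles
# are approximately zero.
print(head_orientation(0.0, 0.0, 9.81))
```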
The image source or image generation unit 120 may be mounted to the temple 102 or internal to the temple 102. In one embodiment, the image source includes a microdisplay 120 for projecting an image of one or more virtual objects and a lens system 122 for directing the image from the microdisplay 120 into the light guide optical element 112. Lens system 122 may include one or more lenses. In one embodiment, lens system 122 includes one or more collimating lenses. In the example shown, the reflective element 124 of the light guide optical element 112 receives the image directed by the lens system 122.
There are different image generation technologies that can be used to implement microdisplay 120. For example, microdisplay 120 can be implemented using a transmissive projection technology, where the light source is modulated by optically active material and backlit with white light. Microdisplay 120 can also be implemented using a reflective technology, where external light is reflected and modulated by an optically active material. Digital Light Processing (DLP), Liquid Crystal On Silicon (LCOS), and similar display technologies are all examples of reflective technologies. In addition, microdisplay 120 can be implemented using an emissive technology in which light is generated by the display, e.g., the PicoP™ display engine from Microvision, Inc.
Fig. 2B is a top view of an embodiment of display optical system 14 of a see-through, near-eye, mixed reality device. A portion of the frame 115 of the near-eye display device 2 will surround the display optics 14 for providing support and making electrical connections for the one or more lenses shown. To illustrate the various components of display optical system 14 (in this case, right eye system 14 r) in head mounted display device 2, a portion of frame 115 around the display optical system is not depicted.
In one embodiment, display optical system 14 includes light guide optical element 112, opacity filter 114, see-through lens 116, and see-through lens 118. In one embodiment, opacity filter 114 is behind and aligned with see-through lens 116, light guide optical element 112 is behind and aligned with opacity filter 114, and see-through lens 118 is behind and aligned with light guide optical element 112. See-through lenses 116 and 118 are standard lenses used in eyeglasses and can be made according to any prescription, including no prescription. In some embodiments, head mounted display device 2 will include only one transparent lens or no transparent lens. Opacity filter 114 filters out natural light (either on a per-pixel basis or uniformly) to enhance the contrast of the virtual image. The light guide optical element 112 guides the artificial light to the eye. More details of light guide optical element 112 and opacity filter 114 are provided below.
Light guide optical element 112 transmits light from microdisplay 120 to an eye 140 of a user wearing head mounted display device 2. Light guide optical element 112 also allows light to be transmitted from the front of head mounted display device 2 through light guide optical element 112 to eye 140 as indicated by arrow 142 representing the optical axis of display optical system 14r, thereby allowing the user to have an actual direct view of the space in front of head mounted display device 2 in addition to receiving the virtual image from microdisplay 120. Thus, the walls of the light guide optical element 112 are see-through. The light guide optical element 112 includes a first reflective surface 124 (e.g., a mirror or other surface). Light from microdisplay 120 passes through lens 122 and is incident on reflecting surface 124. Reflective surface 124 reflects incident light from microdisplay 120 such that light is trapped by internal reflection within the planar substrate comprising light guide optical element 112.
After a number of reflections at the surface of the substrate, the captured light waves reach the array of selectively reflective surfaces 126. Note that only one of the five surfaces is labeled 126 to prevent the drawings from being too crowded. The reflective surfaces 126 couple light waves exiting the substrate and incident on these reflective surfaces to the user's eye 140. In one embodiment, each eye will have its own light guide optical element 112. When a head mounted display device has two light guide optical elements, each eye may have its own microdisplay 120, which microdisplay 120 may display the same image in both eyes or different images in both eyes. In another embodiment, there may be one light guide optical element that reflects light into both eyes.
Opacity filter 114, which is aligned with light guide optical element 112, selectively blocks natural light from passing through light guide optical element 112, either uniformly or on a per-pixel basis. In one embodiment, the opacity filter can be a see-through LCD panel, an electrochromic film, or a similar device capable of serving as an opacity filter. Such a see-through LCD panel may be obtained by removing various layers of substrate, backlight, and diffuser from a conventional LCD. The LCD panel can include one or more light-transmissive LCD chips that allow light to pass through the liquid crystal. Such chips are used, for example, in LCD projectors.
Opacity filter 114 may include a dense grid of pixels, where the light transmissivity of each pixel is individually controllable between minimum and maximum transmissivities. Although a transmissivity range of 0-100% is ideal, a more limited range is acceptable. In one example, 100% transmissivity represents a perfectly clear lens. An "alpha" scale may be defined from 0-100%, where 0% allows no light to pass through and 100% allows all light to pass through. The value of alpha may be set for each pixel by the opacity filter control unit 224 described below. After z-buffering with proxies for real-world objects, a mask of alpha values from the rendering pipeline may be used.
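A minimal sketch of deriving such an alpha mask after z-buffering is shown below, assuming flat per-pixel depth lists and a binary opaque/clear choice purely for illustration:

```python
# Illustrative alpha-mask derivation: a pixel blocks real light only
# where the virtual object is nearer than the real-world proxy there.

def alpha_mask(virtual_depth, real_depth):
    """virtual_depth/real_depth: row-major lists of per-pixel depths,
    with float('inf') meaning nothing occupies the pixel.  Returns an
    alpha value per pixel on the 0-100 scale: 100 blocks real light
    (virtual object in front), 0 lets it all pass through."""
    return [100 if v < r else 0
            for v, r in zip(virtual_depth, real_depth)]

virtual = [1.0, 5.0, float("inf")]   # virtual object covers two pixels
real    = [2.0, 3.0, 4.0]            # real-world proxy depths (z-buffer)
print(alpha_mask(virtual, real))     # [100, 0, 0]
```

Only the first pixel is opaque: there the virtual object is in front of the real one; at the second pixel it is (virtually) behind, so the real object shows through.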
When the system renders a scene for the augmented reality display, the system takes note of which real-world objects are in front of which virtual objects. In one embodiment, the display and opacity filter are rendered simultaneously and are calibrated to the user's precise position in space to compensate for angular offset problems. Eye tracking can be used to compute the correct image offset at the extremities of the field of view. If a virtual object is in front of a real-world object, opacity is turned on for the coverage area of the virtual object. If the virtual object is (virtually) behind a real-world object, the opacity and any color for that pixel are turned off, so that for the corresponding area of real light (a pixel or more in size) the user sees only the real-world object. Coverage is on a pixel-by-pixel basis, so the system can handle cases where part of a virtual object is in front of a real-world object, part of the virtual object is behind the real-world object, and part of the virtual object coincides with the real-world object. The opacity filter helps the image of a virtual object appear more realistic and represent a full range of colors and intensities. Further details of an opacity filter are provided in U.S. Patent Application No. 12/887,426, "Opacity Filter for See-Through Mounted Display," filed September 21, 2010, the entire contents of which are incorporated herein by reference.
Head mounted display device 2 also includes a system for tracking the position of the user's eyes. As will be explained below, the system tracks the user's position and orientation so that the system can determine the user's field of view. However, a person does not perceive everything in front of him or her; rather, the user's eyes are directed at a subset of the environment. Thus, in one embodiment, the system includes technology for tracking the position of the user's eyes in order to refine the measurement of the user's field of view. For example, head mounted display device 2 includes eye tracking component 134 (see Fig. 2B), which includes an eye tracking illumination device 134A and an eye tracking camera 134B (see Fig. 3A).
In one embodiment, eye tracking illumination source 134A includes one or more infrared (IR) emitters that emit IR light toward the eye. Eye tracking camera 134B includes one or more cameras that sense reflected IR light. The location of the pupil can be identified by known imaging techniques that detect the corneal reflection. See, for example, U.S. patent 7,401,920, entitled "Head Mounted Eye Tracking and Display System," issued July 22, 2008 to Kranz et al., which is hereby incorporated by reference. Such techniques can locate the position of the center of the eye relative to the tracking camera. In general, eye tracking involves obtaining images of the eye and using computer vision techniques to determine the location of the pupil within the eye socket. In one embodiment, it is sufficient to track the position of one eye, since the eyes typically move in unison. However, it is possible to track each eye separately. Alternatively, the eye tracking camera may be an alternative form of tracking camera that detects position, with or without an illumination source, using any motion-based image of the eye.
Another embodiment for tracking eye movement is based on charge tracking. This scheme is based on the observation that the retina carries a measurable positive charge and the cornea has a negative charge. In some embodiments, sensors 128 are mounted by the user's ears (near earphones 130) to detect the electrical potential of the eyes as they rotate and effectively read out what the eyes are doing in real time. (See "Control your mobile music with eyeball-activated headphones!", February 19, 2010, retrieved from the Internet July 12, 2011: http://www.wirefresh.com/control-your-mobile-music-with-eyeball-activated-headphones.) Blinks may also be tracked as commands. Other embodiments for tracking eye movement, such as blinking, based on pattern and motion recognition of image data from the eye tracking camera 134B mounted on the inside of the glasses, may also be used.
In the above embodiments, the specific number of lenses shown is merely an example. Other numbers and configurations of lenses operating according to the same principles may be used. In addition, figs. 2A and 2B show only half of the head mounted display device 2. A complete head mounted display device may include another set of see-through lenses 116 and 118, another opacity filter 114, another light guide optical element 112, another microdisplay 120, another lens system 122, another physical environment facing camera 113 (also referred to as an outward or forward facing camera 113), another eye tracking component 134, earphones 130, a sensor 128 (if present), and a temperature sensor 138. Additional details of the head mounted display 2 are shown in U.S. patent application No. 12/905,952, entitled "Fusing Virtual Content Into Real Content," filed October 15, 2010, which is incorporated herein by reference.
FIG. 3A is a block diagram of one embodiment of the hardware and software components of a see-through, near-eye, mixed reality display device 2 that may be used with one or more embodiments. Fig. 3B is a block diagram depicting the components of the processing units 4, 5. In this embodiment, the near-eye display device 2 receives instructions about the virtual image from the processing units 4, 5 and provides sensor data back to the processing units 4, 5. Software and hardware components such as those depicted in fig. 3B that may be implemented in the processing units 4, 5 receive the sensory data from the display device 2 and may also receive sensory information from the computing system 12 over the network 50 (see figs. 1A and 1B). Based on this information, the processing units 4, 5 determine where and when to provide a virtual image to the user and send instructions accordingly to the control circuitry 136 of the display device 2.
Note that certain components of FIG. 3A (e.g., the outward or physical environment facing camera 113, eye camera 134, microdisplay 120, opacity filter 114, eye tracking illumination unit 134A, earphones 130, sensor 128 (if present), and temperature sensor 138) are shown shaded to indicate that there may be at least two of each of those devices, at least one on the left side and at least one on the right side of head mounted display device 2. Fig. 3A shows the control circuit 200 in communication with the power management circuit 202. The control circuit 200 includes a processor 210, a memory controller 212 in communication with a memory 244 (e.g., D-RAM), a camera interface 216, a camera buffer 218, a display driver 220, a display formatter 222, a timing generator 226, a display output interface 228, and a display input interface 230. In one embodiment, all components of control circuit 200 communicate with each other via dedicated lines of one or more buses. In another embodiment, each component of the control circuit 200 is in communication with the processor 210.
The camera interface 216 provides an interface to both the physical environment facing cameras 113 and each eye camera 134 and stores the respective images received from the cameras 113, 134 in the camera buffer 218. Display driver 220 drives microdisplay 120. Display formatter 222 may provide information about the virtual image displayed on microdisplay 120 to one or more processors of one or more computer systems (e.g., 4, 5, 12, 210) that perform the processing of the mixed reality system. Display formatter 222 may identify transmittance settings for pixels of display optical system 14 to opacity control unit 224. A timing generator 226 is used to provide timing data to the system. The display output interface 228 comprises a buffer for providing images from the physical environment facing cameras 113 and the eye cameras 134 to the processing units 4, 5. Display input interface 230 comprises a buffer for receiving images, such as a virtual image to be displayed on microdisplay 120. The display output 228 and the display input 230 communicate with a band interface 232, which is an interface to the processing units 4, 5.
Power management circuit 202 includes voltage regulator 234, eye tracking illumination driver 236, audio DAC and amplifier 238, microphone preamplifier and audio ADC 240, temperature sensor interface 242, electrical pulse controller 237, and clock generator 245. The voltage regulator 234 receives power from the processing units 4, 5 through the band interface 232 and provides that power to the other components of the head mounted display device 2. The illumination driver 236 controls the eye tracking illumination unit 134A to operate at approximately a predetermined wavelength or within a certain wavelength range, e.g., via a drive current or voltage. The audio DAC and amplifier 238 provides audio data to the earphones 130. The microphone preamplifier and audio ADC 240 provides an interface for the microphone 110. The temperature sensor interface 242 is an interface for the temperature sensor 138. The electrical pulse controller 237 receives data indicative of eye movement from the sensor 128 (if implemented by the display device 2). The power management unit 202 also provides power to and receives data back from the three-axis magnetometer 132A, three-axis gyroscope 132B, and three-axis accelerometer 132C. The power management unit 202 also provides power to, receives data from, and sends data to the one or more location sensors 144, which in this example include a GPS transceiver and an IR transceiver.
FIG. 3B is a block diagram of one embodiment of the hardware and software components of a processing unit 4 associated with a see-through, near-eye, mixed reality display unit. The mobile device 5 may include this embodiment of hardware and software components, or similar components that perform similar functions. Fig. 3B shows control circuitry 304 in communication with power management circuitry 306. Control circuitry 304 includes a Central Processing Unit (CPU) 320, a Graphics Processing Unit (GPU) 322, a cache 324, RAM 326, a memory controller 328 in communication with a memory 330 (e.g., D-RAM), a flash controller 332 in communication with a flash memory 334 (or other type of non-volatile storage), a display output buffer 336 in communication with the see-through, near-eye display device 2 via band interface 302 and band interface 232, a display input buffer 338 in communication with the near-eye display device 2 via band interface 302 and band interface 232, a microphone interface 340 in communication with an external microphone connector 342 for connecting to a microphone, a PCI express interface for connecting to a wireless communication device 346, and a USB port 348.
In one embodiment, the wireless communication component 346 includes a Wi-Fi enabled communication device, a Bluetooth communication device, an infrared communication device, a cellular 3G or 4G communication device, a Wireless USB (WUSB) communication device, an RFID communication device, and so forth. The wireless communication device 346 thus allows peer-to-peer data transfer with, for example, another display device system 8, as well as connection to a larger network via a wireless router or cell tower. The USB port may be used to dock the processing unit 4, 5 to another display device system 8. Additionally, the processing units 4, 5 may be docked to another computing system 12 for loading data or software onto the processing units 4, 5 and charging them. In one embodiment, CPU 320 and GPU 322 are the primary processing components for determining where, when, and how to insert virtual images into the user's field of view.
Power management circuitry 306 includes a clock generator 360, an AC-to-DC converter 362, a battery charger 364, a voltage regulator 366, a see-through, near-eye display power supply 376, and a temperature sensor interface 372 in communication with a temperature sensor 374 (located on the wrist band of processing unit 4). The AC-to-DC converter 362 is connected to a charging receptacle 370 to receive AC power and generate DC power for the system. The voltage regulator 366 communicates with a battery 368 for providing power to the system. The battery charger 364 is used to charge the battery 368 (via the voltage regulator 366) upon receiving power from the charging receptacle 370. The device power interface 376 provides power to the display device 2.
The image data may be applied to real objects identified in a location and within a user field of view of a see-through, mixed reality display device system. The locations of real objects (such as people and things) in the user's environment are tracked so that virtual objects can be registered to their intended real objects. For the purpose of image processing, both a person and a thing may be an object, and an object may be a real object (something physically present) or a virtual object in an image displayed by the display device 2. Software executing on one or more of the hardware components discussed above uses data provided by sensors (such as cameras, orientation sensors, and one or more location sensors) and network connections to track real and virtual objects in the user's environment.
Typically, virtual objects are displayed in three dimensions so that a user can interact with the virtual objects in three-dimensional space just as the user interacts with real objects in three-dimensional space. Fig. 4 through 12, discussed next, describe embodiments of methods, exemplary data structures and software components for processing image data, user input for causing real objects that satisfy user disappearance criteria to disappear from a see-through display, and user disappearance criteria at different levels of detail.
Fig. 4 is a block diagram of an embodiment of a system, from a software perspective, for causing a real object to disappear in the display of a see-through, mixed reality display device system. In this embodiment, a see-through, mixed reality display device system 8 executing a version of the disappearing application 456₁ as a client-side or device disappearing application is communicatively coupled over a network 50 to a computing system 12 executing another version of the disappearing application, the server-side disappearing application 456. In this embodiment, the local and server-side versions of each of the different applications (disappearing application 456, other applications 462, push service application 459, etc.) perform the same functions. The processes discussed may be performed entirely by the software and hardware components of display device system 8. For example, the local disappearing application 456₁ of display system 8 may identify a real object for disappearance, and track image data to that real object, based on its local resources. The server-side version has more hardware resources available to it than the local version, such as larger memory size and dedicated processors. In other examples, the server-side disappearing application 456 remotely performs, on behalf of the display device system 8, the identification of real objects for disappearance and the tracking of image data to those real objects. In still other examples, the different versions utilize their respective resources and share the processing.
In the present embodiment, each of the systems 8, 461 and 12 are communicatively coupled to databases, such as reference object data set 474, image database 470, and user profile database 460, discussed further below, over one or more networks 50.
Some examples of other processor-based systems 461 are other see-through, mixed reality display device systems, other head mounted display systems, servers, mobile devices such as mobile phones, smart phones, netbooks, notebooks, and so forth, and desktop computers. These other processor-based systems 461 communicate with the display device system 8 to provide data in various multimedia formats (such as text, audio, image, and video data) from one or more of their applications. For example, the data may be a video of a friend at the same place where the user is located, or a social networking page showing messages from people on the user's friend list. As discussed in the examples below, such data may also be altered image data for real objects the sender did not wish to be viewed. In this way, the recipient user may view a scene altered to obscure, from the recipient's field of view, real objects the sender did not intend to be seen.
Display device system 8 and the other processor-based systems 461 execute client-side versions of a push service application 459ₙ, which communicates with an information push service application 459 over the communication network 50. A user may register an account with the information push service application 459, which grants the information push service permission to monitor the applications the user is executing and the data they generate and receive, as well as user profile data 460ₙ and device data 464ₙ for tracking the user's location and device capabilities.
In one embodiment, the computing system 12 includes a user profile database 460 that may aggregate data from user profile data stored on the user's different computer systems 8, 461. Local copies of the user profile data 460₁, 460ₙ may store a portion of the same user profile data 460, and the local copies may be periodically updated over the communication network 50 with the user profile data 460 stored by the computer system 12 in an accessible database. User disappearance criteria 473 may also be stored in the user profile data 460. The subject data 410 for disappearance and the subject data 420 of current interest are both types of user disappearance criteria 473.
Fig. 5A shows an example of a disappearing subject item data record 410 in the user disappearance criteria 473. In this example, the subject identifier 412 may include an identifier of a person, place, or thing, and the identifier may indicate a category or type of person, place, or thing. The identifier may also refer to a particular person, place, or thing. In some examples, the identifier may be a reference to a storage location for an image of a person, place, or thing. For example, figs. 13A-13D illustrate examples of gestures as user input indicating that a person is to be removed from the user's view of a hill. The user may not know the name of the person, place, or thing she or he is looking at and requesting to have disappear, so an image storage location reference (e.g., a file name) identifies the image data as a basis for identifying the subject. Location information may also be a basis for subject identification.
The disappearing application 456, 456₁ associates one or more real object types 414 with the subject identifier 412 based on the disappearing subject represented by the identifier 412. Similarly, the disappearing application 456 associates one or more subject keywords 416 with the subject identifier 412 in the data record 410. For example, the disappearing application 456, 456₁ can interface with a search application 462, 462₁ to identify synonyms and related words for a subject, which the search application returns to the disappearing application 456, 456₁. The keywords assist in finding matches in the real object metadata of any real object that meets the user's disappearance criteria. The subject identifier 412 and keywords 416 may also identify the real object types 414 related to the subject, by looking up matches in the reference object data sets. In some examples, user input may also identify a real object type 414 and a subject keyword 416.
Fig. 5B shows an example of data fields of a real object type record 414 of the disappearing subject item data record 410 in the user disappearance criteria 473. For each real object type 414, an alteration technique indicator 415 indicates an alteration technique for causing the real object to disappear. The user may indicate the alteration technique, and the application may provide one or more default alteration techniques. Some examples of alteration techniques include erasing, blacking out (redaction), blurring, and replacing with another object. If replacement is the indicated alteration technique, the user may optionally identify a replacement object 417, which may be specific or refer to a type of object. The replacement object 417 may store display data (see 472 below) for the replacement object or an identifier that points to the display data. The display data may be an instantiation of a reference object data set 474, described below.
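The records of Figs. 5A and 5B might be modeled as follows. This is a hypothetical sketch: the class and field names, and the use of Python dataclasses, are assumptions made for illustration, not the patented data format.

```python
# Hypothetical sketch of the disappearing subject item record 410 (Fig. 5A)
# and its per-type fields (Fig. 5B). Field names are assumptions.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class RealObjectType:          # record 414
    type_name: str             # e.g., "billboard"
    alteration_technique: str  # indicator 415: "erase", "black_out", "blur", "replace"
    replacement_object: Optional[str] = None  # identifier 417, used with "replace"

@dataclass
class DisappearingSubjectRecord:  # record 410
    subject_identifier: str       # 412: person/place/thing, category, or image reference
    real_object_types: list = field(default_factory=list)  # 414 entries
    keywords: list = field(default_factory=list)           # 416, e.g., from a search app

# Example: make billboards disappear by replacing them with trees.
record = DisappearingSubjectRecord(
    subject_identifier="billboards",
    real_object_types=[RealObjectType("billboard", "replace", replacement_object="tree")],
    keywords=["advertisement", "signage"],
)
```

The `replacement_object` field mirrors how the identifier 417 can point at display data instantiated from a reference object data set 474.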
The user may indicate a subject that the user wishes to have disappear, but the user may also indicate a subject of current interest. For a subject of current interest, one or more real object types of interest for the subject are identified. Relevance criteria (see fig. 5C) are applied to the metadata describing the real-world objects having a type (e.g., signpost) that matches at least one of the real object types of interest. If a real object of the one or more object types of interest does not satisfy the relevance criteria, the real object is caused to disappear. In some examples, user input may directly identify the subject of current interest, and the disappearing application 456, 456₁ may prompt the user, via one or more electronically provided requests, to identify one or more real object types of interest for the subject. For example, a user may be interested in finding a restaurant and wish to reduce the clutter of building signs in her view on a busy street. However, causing all real objects (including the street in front of her) to disappear can be dangerous, and would not help in locating a Chinese restaurant.
In some cases, the user has indicated a subject of current interest via another executing application 462, and that application communicates with the disappearing application 456, 456₁ to identify the current subject of interest, subject keywords (particularly those related to the application 462), and the real object types of interest 424. For example, for a car navigation application, different types of signposts may be identified as real object types of interest. The executing application selects the alteration technique indicator 415 and any applicable replacement objects 417.
In some examples, the disappearing application 456 provides a software interface, such as an Application Programming Interface (API), that defines data formats for defining the real object types 414, the subject identifier 412, the subject keywords 416, and the relevance criteria 428 (e.g., rules for determining relevance).
Fig. 5C shows an example of a current subject of interest data record 420 in the user disappearance criteria data 473. A subject identifier 422 is assigned, and like a disappearing subject, the identifier 422 may be a reference to another storage location, identify a type of person, place, or thing, or refer to a particular person, place, or thing. As discussed above, the real object type data record 414 is stored and includes an alteration technique indicator 415 and optionally a replacement object identifier 417. Similarly, one or more subject keywords 416 can be associated with the identifier 422, such as by requesting and receiving keywords from the search application 462. Also, relevance criteria 428 are assigned to the subject of interest. In some cases, the relevance criteria 428 may be rules provided by the executing application 462 by which it identifies the subject of current interest. A set of default rules may also implement the relevance criteria 428.
One example of the logic implemented by the disappearing application for the set of default rules is: a real object has the same type as the type of real object currently of interest, and the metadata of the real object includes at least one of a topic keyword 416 or a topic identifier 422.
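The default rule stated above can be sketched as follows. This is an illustrative sketch only; the function names, the dictionary-based metadata shape, and the disappearance policy for non-matching types are assumptions.

```python
# Sketch of the default relevance rule: a real object is relevant when its type
# matches a type of interest AND its metadata includes at least one subject
# keyword 416 or the subject identifier 422. Names are assumptions.

def is_relevant(real_object_metadata, types_of_interest, keywords, subject_identifier):
    if real_object_metadata["type"] not in types_of_interest:
        return False
    words = set(real_object_metadata.get("keywords", []))
    return bool(words & set(keywords)) or subject_identifier in words

def should_disappear(real_object_metadata, types_of_interest, keywords, subject_identifier):
    # Per the embodiment above: an object of a type of interest that fails the
    # relevance criteria is made to disappear; objects of other types (e.g. the
    # street itself) are left alone.
    return (real_object_metadata["type"] in types_of_interest
            and not is_relevant(real_object_metadata, types_of_interest,
                                keywords, subject_identifier))
```

In the restaurant example, a signpost tagged "restaurant" stays visible, an unrelated signpost disappears, and the street is never a candidate for disappearance.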
As with the example of FIG. 5B, in some cases the user may select the alteration technique and any applicable replacement object identifiers. In other examples, the disappearing application 456, 456₁ provides an interface through which the executing application 462, 462₁ sets the alteration technique indicator 415 to select an alteration technique for a real object type, and sets any applicable replacement objects 417.
In some cases, the subject item data records 410, 420 may be stored only temporarily. An event trigger may prompt the user as to whether he or she wishes to store the subject in non-volatile memory, as a disappearing subject or a subject of interest, for later retrieval. Some examples of event triggers are: the object for disappearance is no longer in view, the user has moved out of a place, or data has been received identifying the subject as no longer of interest. The request is provided electronically by being displayed, played as audio data, or a combination of both. Requests may also be provided electronically for alteration technique and replacement object preferences. If the user makes no designation, a default alteration technique and any replacement objects for the default technique may be selected.
Some other examples of user profile data are: the user's expressed preferences, the user's list of friends, the user's activities of preferences, the user's list of reminders, the user's social group, the user's current location, and other user-created content, such as the user's photos, images, and recorded videos. In one embodiment, user-specific information may be obtained from one or more applications and data sources, such as: a user's social sites, address book, scheduling data from a calendar application, email data, instant messaging data, user profile or other sources on the internet, and data directly input by the user.
Each version of the push service application 459 also stores a tracking history of the user in the user profile data 460. Some examples of events, people, and things tracked in the tracking history are: places visited, transactions, content and real objects purchased, and detected people with whom the user has interacted. If electronically identified friends (e.g., social networking friends, contact lists) are also registered with the push service application 459, or if they make information available to the user publicly or through other applications 466, the push service application 459 may also use this data to track the user's social context. Tracking visited locations over time allows real objects for disappearance, and their altered image data, to be stored, prefetched, and prepared for download as the user approaches a familiar location.
Location identifier data for the mixed reality device system 8 may be obtained based on one or more detection sensors. In some cases, the computing system 12 communicates with detection sensors (such as the one or more transceivers 144) on the mixed reality device. As discussed in the examples below, data from one or more different types of location detection sensors may be used to obtain location identifier data. Cell tower triangulation, based on signals from the processing unit 4 (e.g., when the unit is embodied as a mobile device 5), may be used to identify the location of the mixed reality display device system. Global Positioning System (GPS) data may be obtained from a GPS transceiver of the display device or, as disclosed in fig. 17, from a GPS transceiver 965 in the processing unit 5 to identify the location of the mixed reality device. GPS technology can be used to identify when a user enters a geofence. The geofence identifier can be used to retrieve images of the area within the geofence and, in some cases, a three-dimensional model of the area generated from the image data.
Smaller area locations or spaces may also be delineated or enclosed by other types of wireless detection sensors, such as Wireless Universal Serial Bus (WUSB) transceivers, Bluetooth transceivers, RFID transceivers, or IR transceivers, e.g., 346, 144. Identification data may be exchanged with the computer system 12 or other computer systems 461, including other see-through, mixed reality display device systems. In other examples, the computing system 12 is in communication with an intermediate detection sensor. One example of such an intermediate detection sensor is a wireless network access point, e.g., WiFi, through which the display system 8 is communicatively coupled to the computer system 12. The location of the network access point is stored by the computing system 12. The physical environment facing or outward facing camera 113 may also be used as a location detection sensor, either alone or in combination with other sensor data (e.g., GPS coordinate data). Its image data may be compared with other images using pattern recognition software to identify a match.
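The sensor options above could be combined as sketched below. This is an illustrative sketch only: the sensor names, the dictionary interface, and the precision ordering (room-scale beacon, then access point, then GPS, then cell towers) are assumptions, not the patented method.

```python
# Illustrative sketch: derive a location identifier from whichever detection
# sensors are currently available, in a rough precision order.

def location_identifier(sensors):
    """sensors: dict of available readings, e.g. {"gps": (47.6, -122.3)}."""
    if "ir_beacon" in sensors:          # room-scale IR/WUSB/Bluetooth/RFID exchange
        return ("beacon", sensors["ir_beacon"])
    if "wifi_access_point" in sensors:  # intermediate sensor with a stored location
        return ("access_point", sensors["wifi_access_point"])
    if "gps" in sensors:                # GPS fix, also usable for geofence checks
        return ("gps", sensors["gps"])
    if "cell_towers" in sensors:        # coarse cell tower triangulation
        return ("cell", sensors["cell_towers"])
    return ("unknown", None)            # fall back, e.g., to camera image matching
```

A system like this degrades gracefully: indoors it can use a beacon or access point even when no GPS fix is available.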
The device data 464 may include: a unique identifier for the computer system 8, a network address (e.g., IP address), a model number, configuration parameters (such as installed devices), an operating system, and what applications are available in the display device system 8 and executed in the display system 8, etc. Particularly for the see-through, mixed reality display device system 8, the device data may also include data from or determined from the sensors, such as the orientation sensor 132, the temperature sensor 138, the microphone 110, the electrical pulse sensor 128 (if present), and one or more location detection sensors 346, 144 (e.g., GPS transceiver, IR transceiver). Image data 469 is data captured by outward facing camera 113 and stored to be analyzed to detect real objects, either locally or remotely by computing system 12.
Before discussing the application of image data to cause a disappearance of a real object in a field of view of the see-through, mixed reality display, a discussion is first presented regarding components for identifying a real object in a location and in the see-through display field of view. Also, locations of virtual objects (such as those associated with a disappearance) are identified and image data including the locations is generated based on the received image data of the real world.
Computing system 12 may be implemented using one or more computer systems. In this example, the computing system 12 is communicatively coupled to one or more depth cameras 20A, 20B in a location to receive three-dimensional (3D) image data for the location from which real objects and their locations may be identified.
Image data from the outward facing (outward from the user's head) camera 113 may be used to determine the field of view of the see-through display, which approximates the user's field of view. The camera 113 is placed at a predetermined offset from the optical axis 142 of each display optical system 14. The offset is applied to the image data of the camera 113 to identify real objects in the display field of view. Based on the resolution or focal length setting of the camera 113, the distance to real objects in the field of view may be determined. In some examples, outward facing camera 113 may be a depth camera as well as a video camera.
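The offset and distance steps above could look like the following sketch. It is illustrative only: the coordinate convention, the subtraction-based offset correction, and the pinhole-camera distance relation (with a known object width) are assumptions, not the patented computation.

```python
# Sketch of mapping camera 113 image data into the display field of view using
# the predetermined offset from optical axis 142, and estimating distance with
# a pinhole-camera relation. All values and names are illustrative.

def to_display_coords(camera_xy, offset_xy):
    # Apply the fixed camera-to-optical-axis offset to camera image coordinates.
    return (camera_xy[0] - offset_xy[0], camera_xy[1] - offset_xy[1])

def estimate_distance(focal_length_px, known_width_m, width_in_image_px):
    # Pinhole model: distance = f * W / w, where f is the focal length in
    # pixels, W a known real-world width, and w the width in the image.
    return focal_length_px * known_width_m / width_in_image_px
```

For example, an object of known 0.5 m width spanning 100 pixels under an 800-pixel focal length would be estimated at 4 m.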
As described further below, real-world objects are identified and their appearance characteristics 475 are stored, such as in metadata 430 accessible from the computer system 12 over the communication network 50, or locally in metadata 430₁. The reference object data sets 474 provide appearance feature categories tied to different types of objects; these reference object data sets 474 may be used to identify objects in the image data and to select appearance features of virtual objects so that they appear real. The replacement object 417 identifier may store an identifier of an instantiation of a reference object data set.
Fig. 5D illustrates an example of metadata for identifying real objects. A real object identifier 431 is assigned and stored for each detected real object. In addition, position data 432 for the object in three dimensions may also be stored. In this example, the position data 432 includes tracking data 433 for tracking at least the position, speed, and direction of movement of an object through a place. As discussed further below with respect to figs. 6A through 6E, the position of the real object relative to the field of view of the display device is identified. Further, the position data 432 may track the positions of the identified one or more real objects within a defined place, even if they are not currently within the display device field of view. For example, the user may be in a store, home, or other location where cameras in the environment (such as capture devices 20A, 20B) track objects, including the user wearing display device 2 in the store, in the three-dimensional (3D) model coordinate system of the location. Additionally, a location-based image data source may provide a 3D model of a location and its real-world objects developed from archived image data.
The position data also includes a visibility level 434 that is based on the position of the real object with respect to the display device system 8. The optical axis 142 of the display optical system 14 may be a reference point from which the position and trajectory of the real object, and thus its visibility, may be determined. In some embodiments, visibility is assigned as a refinement of a predetermined visible distance for a location. Location image tracking software 453 may provide a predetermined visible distance 479 for a location detected by the display device system 8 being worn by the user. For example, a display device worn at the top of Half Dome in Yosemite may have a view of one hundred miles. Another display device, worn by a user walking on a busy downtown Seattle street during peak hours, may have a predetermined visible distance of about 15 meters (approximately 50 feet).
A plurality of visibilities may be defined, based on averaged human vision data, for distinguishing a set of appearance features and the movements of one or more body parts. In many embodiments, each visibility represents at least a range of distances between the object and the display device 2 and a degree of recognition. In some examples, the angle of the real object from each optical axis 142 and the position of the real object relative to the display device may also be used as a basis. A trace may also be used as a basis (even if a real object is stationary, a trace 433 may be assigned to that real object because the display device system 8 is moving). Some examples of visibilities, in order of the distances they represent from far to near, are: a color recognition degree, a joint movement recognition degree, and a face movement recognition degree. In some embodiments, real objects beyond the predetermined visible distance are identified with an invisibility level.
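The distance-based assignment of a visibility level can be sketched as follows. This is a minimal illustration, not the patented implementation; the distance thresholds and level names are assumptions chosen for the example, with only the far-to-near ordering of the recognition degrees and the invisibility level taken from the description above.

```python
# Illustrative sketch of assigning a visibility level (434) to a real object
# based on its distance from the display device. The numeric thresholds are
# assumed values; an implementation would refine them from averaged human
# vision data and the place's predetermined visible distance (479).

def visibility_level(distance_m, predetermined_visible_distance_m):
    """Return a visibility level for a real object at distance_m meters."""
    if distance_m > predetermined_visible_distance_m:
        return "invisible"                    # beyond what can be seen at this place
    if distance_m > 30:
        return "color_recognition"            # far: only color/outline recognizable
    if distance_m > 6:
        return "joint_movement_recognition"   # mid: body movement recognizable
    return "face_movement_recognition"        # near: facial movement recognizable
```

For example, an object 50 meters away in a place whose predetermined visible distance is 40 meters would be assigned the invisibility level.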
The location data 435 of the real object may also be stored if available. This may be GPS data or other location data independent of the field of view of the display device 2.
Keywords 436 may be stored for the real object. In some cases, the local or server-side disappearance application may have identified this real object as satisfying the disappearance criteria. For example, the real object is stationary and in the user's workplace. In such an example, the keywords include a topic identifier. Another source of keywords may be metadata associated with the real-world object by other users via the location image tracking application 453 (e.g., when their images are loaded into the location image database 470). Also, an application that captures data for the place (such as a store application 462) can associate keywords with real objects in its place. Further, the real object may be assigned keywords based on data received by the local or server-side information push service application 459 from other applications executing for the user or for other users that are permitted to be monitored. The disappearing application 456 or 456₁ may also assign keywords based on user profile data related to the real-world object.
Appearance feature data sets 475 from the identification process discussed below that describe physical characteristics or features of the real object are also stored in the real object metadata 430.
Fig. 5E shows an example of a reference object data set for an inanimate object. The data fields include a type 481 of the object, which may be a data record that itself includes sub-fields. For the type of object 481, the other data fields provide data records identifying the types of appearance characteristics typical for that type of object. For example, the other data records identify a size range 483, a shape selection 484, a type of material 485, a surface texture 486, a representative color 487, a representative pattern 488, a surface 491, and a geometric orientation 490 of each surface. The reference object data set 474 acts as a template for the object. The identification may have been performed offline, manually or by pattern recognition software, and used as the basis for each reference data set 474 stored for each type of object defined by the system. In many embodiments, the display data 472 for a virtual object (including a virtual object used to mask or replace a real object that is to disappear) includes an instantiation of a reference object data set. In an instantiation, at least some of the data fields (e.g., size and color) of the object are assigned particular values.
In the example of a table as the object type, the sub-field of the object type may be selected as desk. The size range 483 may hold typical values for the following ranges: 4 to 6 feet wide, 2 to 4 feet long, and 2 to 4 feet high. The available colors may be brown, silver, black, white, navy blue, beige, or gray. Someone may have a desk that is red, so the reference appearance characteristics generally provide common or average parameters. The surfaces 491 may include a flat surface with a geometric orientation 490 indicated as horizontal. A vertical surface may also be noted from image data of the table. The surface texture 486 of the flat surface may be smooth, and the available patterns 488 may indicate wood grain as well as vinyl reflectance. The type of wood grain pattern may be a sub-field or sub-record recorded with the pattern 488.
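The desk example above can be written out as a simple data structure. This is an illustrative sketch only; the field names mirror the numbered data fields (type 481, size range 483, representative color 487, surface 491, orientation 490, texture 486, pattern 488), and the values are the typical ranges given in the text.

```python
# Sketch of the desk reference object data set (474) from the example above.
# Values are the common/average parameters described in the text, not data
# from any actual system.
desk_reference = {
    "type": {"object": "table", "subtype": "desk"},          # 481 with sub-field
    "size_range_feet": {                                     # 483
        "width": (4, 6), "length": (2, 4), "height": (2, 4),
    },
    "colors": ["brown", "silver", "black", "white",          # 487
               "navy blue", "beige", "gray"],
    "surfaces": [                                            # 491, with 490/486/488
        {"orientation": "horizontal", "texture": "smooth",
         "patterns": ["wood grain", "vinyl"]},
        {"orientation": "vertical"},
    ],
}
```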
FIG. 5F illustrates an example of a reference object data set for a person, which may supply the data fields used for a replacement object in an alteration technique. For the type 492 of the person object data set, the other data fields provide data records identifying the types of appearance characteristics typical for that type of person. The type of person may be a category, some examples of which are: clown, teacher, police officer, engineer, judge, and rock star. For a person, some examples of data fields and data sets are shown as including height 493, body part features 494, facial features 495, skin characteristics 496, hair features 497, apparel selection 498, and shoe selection 499. Some examples of body part feature data are: torso width, chest width, muscle texture, shape, leg length, knee position relative to the entire leg, and head shape. Some examples of facial features may include: eye color, birthmarks or other markings, lip shape, nose shape, and forehead height. Additional characteristics such as eyebrow shape and make-up may also be stored as data in the person's facial features.
As mentioned above, the reference object data sets 474 also provide input parameters for defining the appearance characteristics of a virtual object used for disappearance. In one embodiment, the disappearance display data 472 may define the type of virtual object and its appearance characteristics for display by the microdisplay 120 of display device 2. The reference objects 474 may be considered templates, and the appearance features parameters, for virtual objects. For the display data 472, specific data values (e.g., a specific color and size) are selected in an instantiation of the template to generate the actual virtual object to be displayed. For example, a class may be defined for each type of object, and the disappearing application instantiates a virtual object of the corresponding class at runtime, with parameters for each surface and for the size, type of material, color, pattern, surface texture, shape, and geometric orientation appearance characteristics of the object. The display data 472 may be implemented in a markup language. For example, Extensible Markup Language (XML) may be used. In another example, a markup language such as Virtual Reality Modeling Language (VRML) may be used.
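The template-and-instantiation idea described above can be sketched as follows, assuming a hypothetical class per object type whose defaults come from a reference object data set and are overridden with specific values at runtime. All names and values here are illustrative assumptions, not part of the described system.

```python
# Illustrative sketch of instantiating display data (472) from a reference
# template: one class per object type, with template defaults drawn from the
# reference object data set and specific values supplied at runtime.

class VirtualTable:
    # Hypothetical template defaults for the "table" object type.
    template = {"color": "brown", "width_feet": 5, "surface_texture": "smooth"}

    def __init__(self, **appearance):
        # Start from the template values, then override with specific choices.
        self.appearance = {**self.template, **appearance}

# Instantiate the actual virtual object to display: specific color and size
# chosen, remaining appearance characteristics inherited from the template.
virtual_desk = VirtualTable(color="black", width_feet=4)
```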
The appearance characteristic data set 475 of a real object may have fields and subsets defined similarly to a reference object data set 474 of the same type, but it includes actual data values detected or determined from captured data of the real object. It may not be possible to determine a data value for every data field. In some embodiments, an assigned data value is selected from the selection of available types provided by the reference object data set 474.
In addition to interpreting commands, voice recognition software 478 may also be used to identify nearby users and other real-world objects. Face and pattern recognition software 476 may also be used to detect and identify users in image data as well as objects in image data. User input software 477 may receive data identifying a physical action used to control an application, such as a gesture, a particular spoken voice command, or eye movement. The one or more physical actions may include a user response or request for a real or virtual object. For example, in fig. 13A to 13D, a thumb gesture indicates that a real object is to disappear. Applications 450, 456, and 459 of computing system 12 may also communicate requests and receive data from server-side versions of voice recognition software 478 and face and pattern recognition software 476 in identifying users and other objects in a venue.
The block diagram of FIG. 4 also represents software components for identifying physical actions in the image data, which are discussed further below. Moreover, the image data, as well as the available sensor data, is processed to determine the location of objects (including other users) within the field of view of the see-through, near-eye display device 2. This embodiment illustrates how devices can utilize a networked computer to map a three-dimensional model of a user's field of view and surrounding space, as well as real and virtual objects within the model. An image processing application 451 executing in a processing unit 4, 5 communicatively coupled to the display device 2 may transfer image data 469 from the front facing camera 113 to the depth image processing and skeletal tracking application 450 in the computing system 12 over one or more communication networks 50 to process the image data to determine and track objects in three dimensions, including both people and objects. In some embodiments, moreover, the image processing application 451 may perform some of the processing for mapping and locating objects in 3D user space locally and may interact with the remote location image tracking application 453 to receive distances between objects. By leveraging network connectivity, many combinations of sharing processing between applications are possible.
The depth image processing and skeletal tracking application 450 detects and identifies objects and their locations in the model. The application 450 may perform its processing based on depth image data from depth cameras (e.g., 20A, 20B), two-dimensional or depth image data from one or more outward facing cameras 113, and images obtained from the databases 470. The image databases 470 may include reference images of objects for use in pattern and facial recognition (e.g., as may be performed by software 476). Some of the images in the one or more databases 470 may also be accessed, via the place metadata 435 associated with objects in the images, by the location image tracking application 453. Some examples of place metadata include: GPS metadata, location data for network access points (such as WiFi hotspots), location data based on cell tower triangulation, and location data from other types of wireless transceivers.
The location image tracking application 453 identifies images of the user's location in one or more image databases 470 based on location identifier data received from the processing units 4, 5, or from other positioning units (e.g., GPS units) identified as being in the vicinity of the user, or both. In addition, the image databases 470 may provide images of a location uploaded by users who wish to share their images. The databases may be indexed or accessible with location metadata such as GPS data, WiFi SSIDs, cell tower based triangulation data, WUSB port locations, or the locations of infrared transceivers. The location image tracking application 453 provides the distances between objects in the images to the depth image processing application 450 based on the location data. In some examples, the location image tracking application 453 provides a three-dimensional model of the location, which may be dynamic based on real-time image data updates provided by cameras in the location. In addition to stationary cameras at designated locations, other users' display device systems 8 and mobile devices may provide such updates. A commercially available image-based mapping service is an example of such a location image tracking application 453.
Both the depth image processing and skeletal tracking application 450 and the image processing software 451 may generate metadata 430 for real objects identified in the image data. To identify and track living objects (or at least people) in a field of view or user location, skeletal tracking may be performed.
The outward facing cameras 113 provide RGB images (or visual images in other formats or color spaces) and, in some examples, depth images to computing system 12. If present, capture devices 20A and 20B may also send visual images and depth data to computing system 12, which uses the RGB images and the depth images to track user or object movement. For example, the system will use the depth images to track a skeleton of a person. There are many methods that can be used to track the skeleton of a person with depth images. One suitable example of using depth images to track a skeleton is provided in U.S. patent application 12/603,437, "Pose Tracking Pipeline," filed on October 21, 2009 by Craig et al. (hereinafter the '437 application), incorporated herein by reference in its entirety.
The process of the '437 application includes: obtaining a depth image; down-sampling the data; removing and/or smoothing high variance noise data; identifying and removing the background; and assigning each of the foreground pixels to a different part of the body. Based on those steps, the system fits a model to the data and creates a skeleton. The skeleton includes a set of joints and the connections between the joints. Other methods for tracking may also be used. Suitable tracking technologies are also disclosed in the following four U.S. patent applications, all of which are incorporated herein by reference in their entirety: U.S. patent application 12/475,308, "Device for Identifying and Tracking Multiple Humans Over Time," filed on May 29, 2009; U.S. patent application 12/696,282, "Visual Based Identity Tracking," filed on January 29, 2010; U.S. patent application 12/641,788, "Motion Detection Using Depth Images," filed on December 18, 2009; and U.S. patent application 12/575,388, "Human Tracking System," filed on October 7, 2009.
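The early stages of the pipeline summarized above (down-sampling a depth image and removing the background) can be sketched minimally as follows. The sampling factor and background threshold are illustrative assumptions; a real pipeline would also smooth high variance noise and assign the remaining foreground pixels to body parts.

```python
# Rough sketch of the first stages of a pose-tracking pipeline of the kind
# summarized above: down-sample the depth image, then separate background
# from foreground. Thresholds and factors are illustrative assumptions.

def preprocess_depth(depth, background_mm=3000, factor=2):
    """depth: 2D list of depth readings in millimeters (0 = no reading).

    Returns a down-sampled image in which background pixels (at or beyond
    background_mm) and missing readings are zeroed out.
    """
    # Down-sample by keeping every `factor`-th row and column.
    small = [row[::factor] for row in depth[::factor]]
    # Remove background: keep only pixels nearer than background_mm.
    return [[d if 0 < d < background_mm else 0 for d in row] for row in small]
```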
The skeletal tracking data identifies which joints move over a period of time and is sent to a gesture recognizer engine 454, which includes a plurality of filters 455, to determine whether any person or object in the image data has performed a gesture or action. A gesture is a physical action by which the user provides input to the disappearing application. A filter comprises information defining a gesture, action, or condition, along with parameters or metadata for that gesture or action. For example, a throw, which comprises motion of one hand from behind the body past the front of the body, may be implemented as a gesture comprising information representing the movement of one of the user's hands from behind the body past the front of the body, as that movement would be captured by the depth camera. Parameters may then be set for the gesture. Where the gesture is a throw, the parameters may be a threshold velocity that the hand has to reach, a distance the hand travels (either absolute, or relative to the overall size of the user), and a confidence rating by the recognizer engine 454 that the gesture occurred. These parameters for the gesture may vary between applications, between contexts of a single application, or within one context of one application over time.
Inputs to a filter may comprise things such as joint data about a user's joint positions, angles formed by the bones that meet at a joint, RGB color data from the scene, and the rate of change of an aspect of the user. Outputs from a filter may comprise things such as the confidence that a given gesture is being made, the speed at which the gesture motion is made, and the time at which the gesture motion was made. In some cases, only two-dimensional image data is available. For example, the front facing cameras 113 provide only two-dimensional image data. From the device data 464, the type of front facing camera 113 can be identified, and the recognizer engine 454 may plug in two-dimensional filters for its gestures.
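A throw-gesture filter of the kind described above, with a threshold speed, a travel distance, and a confidence output, might look like the following sketch. The parameter values and the simple confidence formula are illustrative assumptions, not values from the description.

```python
# Sketch of a gesture filter for the "throw" example: given hand positions
# over time, it outputs whether the gesture occurred, the speed of the
# motion, and a simple confidence rating.

def throw_filter(hand_x_positions, timestamps, min_speed=2.0, min_distance=0.5):
    """hand_x_positions: meters along the throw axis; timestamps: seconds."""
    distance = hand_x_positions[-1] - hand_x_positions[0]
    elapsed = timestamps[-1] - timestamps[0]
    speed = distance / elapsed if elapsed > 0 else 0.0
    occurred = distance >= min_distance and speed >= min_speed
    # Confidence grows with speed above the threshold, capped at 1.0.
    confidence = min(1.0, speed / (2 * min_speed)) if occurred else 0.0
    return {"occurred": occurred, "speed": speed, "confidence": confidence}
```

In keeping with the description, the `min_speed` and `min_distance` parameters could be tuned per application or per context within one application.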
More information about the recognizer engine 454 may be found in U.S. patent application 12/422,661, "Gesture Recognizer System Architecture," filed on April 13, 2009, incorporated herein by reference in its entirety. More information about recognizing gestures may be found in U.S. patent application 12/391,150, "Standard Gestures," filed on February 23, 2009; and U.S. patent application 12/474,655, "Gesture Tool," filed on May 29, 2009, both of which are incorporated herein by reference in their entirety.
Image processing software 451 executing in display device system 8 may also have depth image processing capabilities or the ability to perform 3D position estimation of objects from stereoscopic images from outward facing camera 113. Further, the image processing software 451 may also include logic for detecting a set of gestures indicative of user input. For example, a set of finger or hand gestures may be recognizable. Skeletal tracking may be used, but pattern recognition of fingers or hands in the image data may also recognize gestures in the set of gestures.
In the discussion below of identifying objects near the user, the reference to the outward facing image data is to the image data from the outward facing camera 113. In these embodiments, the outward facing field of view of camera 113 approximates the field of view of the user when the camera is located at a relatively small offset from optical axis 142 of each display optical system 14, and the offset is accounted for when processing the image data.
FIG. 6A is a flow diagram of an embodiment of a method for determining the positions of real and virtual objects in a three-dimensional field of view of a display device system. At step 510, the one or more processors of the control circuitry 136, the processing units 4, 5, the hub computing system 12, or a combination of these, receive image data from one or more outward facing cameras, and at step 512 identify one or more real objects in the outward facing image data. In certain embodiments, the outward facing image data is three-dimensional image data. Data from orientation sensors 132 (e.g., the three axis accelerometer 132C and the three axis magnetometer 132A) may also be used with the outward facing camera 113 image data to map the user's surroundings and the positions of the user's face and head in order to determine which objects, real or virtual, he or she may be focusing on at the time. The face and pattern recognition software 476 may identify objects as people and things by comparing them with the reference object data sets 474 and the actual images stored in the image databases 470.
In step 514, the one or more processors executing the face and pattern recognition software 476 also identify one or more appearance features of each real object, such as the type of object, its size, surfaces, geometric orientation, shape, color, and the like. In step 516, a three-dimensional (3D) position is determined for each real object in the field of view of the see-through display device. Based on the executing applications, the one or more processors in step 518 identify a 3D position in the field of view for each of one or more virtual objects. In other words, the positions establish where each object is located with respect to the display device 2, for example with respect to the optical axis 142 of each display optical system 14.
FIG. 6B is a flow diagram of an embodiment of a method for identifying one or more real objects in a field of view of a display device. This embodiment may be used to implement step 512. At step 520, the location of the user wearing display device 2 is identified. For example, the user's location may be identified via GPS data of a GPS unit 965 (see fig. 17) on the mobile device 5 or the GPS transceiver 144 on the display device 2. Additionally, the IP address of a WiFi access point or cell site with which the display device system 8 has a connection may identify a location. Cameras at known locations within a venue may identify a user through facial recognition. Furthermore, the identifier token may be exchanged between the display device systems 8 via infrared, bluetooth, RFID transmission, or WUSB. The range of infrared, RFID, WUSB or bluetooth signals may serve as a predefined distance for determining proximity to a reference point, such as a display device of another user.
In step 522, the one or more processors retrieve one or more images of the location from a database (e.g., 470), such as via a request to image tracking software 453. In step 524, the local or server-based executed version or both of the face and pattern recognition software 476 selects one or more images that match the image data from the one or more outward facing cameras 113. In some embodiments, steps 522 and 524 may be performed remotely by a more powerful computer (e.g., 12) that has access to the image database. Based on the location data (e.g., GPS data), the one or more processors determine a relative position of one or more objects in the outward-facing image data to one or more identified objects in the location in step 526, and determine a position of the user from the one or more identified real objects based on the one or more relative positions in step 528.
In certain embodiments, such as in fig. 1A where depth cameras 20A and 20B capture depth image data of a living room, a user wearing a see-through, near-eye mixed reality display may be in a location for which depth image processing software 450 of computer system 12 provides a three-dimensional mapping of objects within the location, such as a defined space, e.g., a store. FIG. 6C is a flow diagram of an embodiment of a method for generating a three-dimensional model of a location. In step 530, a computer system with access to depth cameras (such as system 12 with capture devices 20A and 20B) creates a three-dimensional model of a location based on depth images. The depth images may be taken from multiple angles and may be combined based on a common coordinate system (e.g., the store space) to create a volumetric or three-dimensional description of the location. At step 532, objects are detected within the place. For example, edge detection may be performed on the depth images to distinguish objects (including people) from one another. Computer system 12, executing the depth image processing and skeletal tracking software 450 and the face and pattern recognition software 476, identifies one or more detected objects, including their locations in the place, in step 534, and identifies one or more appearance features of each real-world object in step 536. The objects may be identified with reference images of objects and people from the user profile data 460, the image databases 470, and the reference object data sets 474.
Image processing software 451 may forward the outward facing image data and sensor data to depth image processing software 450 and receive back from computer system 12 three-dimensional positions and identifications, including appearance features. The three-dimensional positions may be relative to a 3D model coordinate system for the user location. In this way, the disappearing application 456₁ can determine which real objects are in the field of view and which real objects are not currently in the field of view but are in the 3D modeled location.
FIG. 6D is a flow diagram of an embodiment of a method for determining the location of real and virtual objects in a three-dimensional field of view of a display device system based on a three-dimensional model of the location. In step 540, one or more processors (210, 320, 322) of display device system 8 send the outward facing image data to a three-dimensional modeling computer system associated with a location. For example, image processing software 451 sends the outward facing image data to depth image processing and skeleton tracking software 450 of computer system 12. The image processing software 451 receives real object metadata including the 3D model position of one or more real objects in the location in step 542. The real objects are detected from image data from cameras in the environment, which may also include outward facing cameras of other users. In step 544, image processing software 451 receives the user's position in the 3D model. Optionally, in step 546, image processing software 451 receives virtual object metadata, including the 3D model position of the one or more virtual objects in the location. The image processing software determines the position of the one or more objects relative to the field of view of the display device based on the 3D model position in step 548. Each of the embodiments of fig. 6A-6D is typically performed repeatedly as the user and objects within the user environment move around.
For a real object indicated for disappearance in a user's see-through display, the disappearing application 456, 456₁ has the image processing application 451 track the position of the real object in the field of view of the display device to a position in each display optical system 14, and track the indicated display data 472, 472₁ (e.g., a black rectangle for redaction) over the real object in each display optical system 14 (and thus in the field of view). The image processing application 451 of the see-through, mixed reality display device system 8 formats the display data 472, 472₁ that the device-side disappearing application 456₁ will use to cause real objects to disappear into a format that can be processed by the image generation unit 120 (e.g., microdisplay 120), and, if an opacity filter 114 is used, provides instructions to the opacity controller 224 for the opacity filter 114. For some alteration techniques (e.g., erasing), the disappearance display data 472, 472₁ includes image data generated by copying the image data surrounding the real object to be erased and overlaying the erased object with it. In other examples, the disappearance display data 472, 472₁ is image data (e.g., from the databases 470) of what is behind the object to be erased.
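The erasing technique described above, in which surrounding image data is copied over the real object, can be sketched minimally as follows. A real implementation would blend patches of surrounding imagery; in this illustration each masked pixel simply copies its nearest unmasked neighbor in the same row.

```python
# Minimal sketch of the "erase" alteration technique: pixels covering a real
# object are replaced by copies of nearby surrounding pixels, so the object
# appears removed from the view.

def erase_object(image, mask):
    """image: 2D list of pixel values; mask: 2D list, True where the object is."""
    out = [row[:] for row in image]
    for r, row in enumerate(mask):
        for c, is_object in enumerate(row):
            if is_object:
                # Copy the nearest non-object pixel to the left or right.
                for offset in range(1, len(row)):
                    if c - offset >= 0 and not mask[r][c - offset]:
                        out[r][c] = image[r][c - offset]
                        break
                    if c + offset < len(row) and not mask[r][c + offset]:
                        out[r][c] = image[r][c + offset]
                        break
    return out
```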
Figs. 7-12 present embodiments of methods for the technology and example implementation processes for some of the steps of the methods. For illustrative purposes, the method embodiments below are described in the context of the system embodiments described above. However, the method embodiments are not limited to operating in the system embodiments described above and may be implemented in other system embodiments. As mentioned above for figs. 6A-6D, these method and process embodiments are also performed repeatedly as the user wearing the see-through display moves, at least, his or her eyes, and as objects in the field of view may move under their own control.
Fig. 7 is a flow diagram of an embodiment of a method for causing a real object to disappear in a field of view of a see-through, mixed reality display device system based on satisfying disappearance criteria. In step 602, the image processing software 451 receives metadata identifying one or more real objects at least in the field of view of the see-through display of the mixed reality display device, and the software 451 may also store the metadata, at least temporarily, in a memory accessible to the disappearing application 456₁. Based on the metadata, in step 604, the disappearing application 456₁ identifies any real objects that meet the user disappearance criteria.
In some examples, the local device disappearing application 456₁ receives a message from the server-side application 456 identifying which real objects meet the user disappearance criteria. In other examples, the local disappearing application 456₁ performs a keyword search of the received real object metadata and identifies real objects for disappearance locally.
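The local keyword search mentioned above might be sketched as follows, assuming each real object's metadata carries its identifier (431) and keywords (436), and the user disappearance criteria supply a set of topic keywords. The data shapes and names are assumptions for illustration.

```python
# Sketch of the local keyword search: compare each real object's metadata
# keywords (436) against the user's disappearance criteria (e.g., topic
# identifiers for subjects to be avoided).

def objects_to_disappear(real_object_metadata, disappearance_keywords):
    """Return the identifiers (431) of real objects whose keywords match."""
    matches = []
    for obj in real_object_metadata:
        if set(obj["keywords"]) & set(disappearance_keywords):
            matches.append(obj["id"])
    return matches
```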
For any real object identified for disappearance, in step 606 the disappearing application 456₁ causes the image processing software 451, via control of the image generation unit 120, to track image data to any identified real object so as to cause it to disappear in the see-through display. If no real object in the field of view is identified for disappearance, the one or more processors of display device system 8 return to other processing in step 608.
Fig. 8 is a flow diagram of another embodiment of a method for causing a real object to disappear in a field of view of a see-through, mixed reality display device system. The embodiment of fig. 8 may be used to implement steps 604, 606, and 608 of the embodiment of fig. 7. In step 612, the disappearing application 456₁ checks for identification of any real objects in the field of view of the see-through display device that meet the user disappearance criteria. In step 614, disappearance processing is performed for any real object in the field of view of the see-through display device identified in step 612 (e.g., step 606 of FIG. 7).
Network access to software and image databases that track locations and the real objects in them allows the disappearing application 456₁ to prefetch any applicable image data for real objects designated to disappear in those tracked locations, when the user's entering such a location meets prediction criteria.
In step 616, the disappearing application 456₁ checks for identification of any real objects that meet the user disappearance criteria but are outside the current field of view of the display device and within the predetermined visible distance of the location of the user's display device system 8. In step 618, the disappearing application 456₁ prefetches, or causes to be prefetched, any applicable disappearance image data for any real object identified in step 616.
In step 620, the disappearing application 456₁ applies, or causes to be applied, a location prediction method for identifying one or more subsequent locations that meet prediction criteria. In step 622, the disappearing application 456₁ determines whether any satisfying subsequent locations have been identified. If not, then in step 623, processing returns to the field of view check of step 612 at the next scheduled check. If a subsequent location satisfying the prediction criteria is identified in step 622, the disappearing application 456₁ checks for identification of any real objects in any identified subsequent location that meet the user disappearance criteria in step 624, and prefetches any applicable disappearance image data for any real objects identified in step 624 in step 626.
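The control flow of steps 612 through 626 can be sketched as a single cycle. The helper callables stand in for the disappearance processing, prefetching, and location prediction described above and are assumptions supplied by a caller, not components of the described system.

```python
# Sketch of the control flow of steps 612-626: the in-view check and
# disappearance processing run first (highest priority), then disappearance
# image data is prefetched for out-of-view objects in the current place, and
# finally location prediction drives prefetching for subsequent places.

def disappearance_cycle(in_view, out_of_view_nearby, predict_places, prefetch, process):
    for obj in in_view:                  # steps 612-614
        process(obj)
    for obj in out_of_view_nearby:       # steps 616-618
        prefetch(obj)
    for place in predict_places():       # steps 620-626
        for obj in place["matching_objects"]:
            prefetch(obj)
```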
Because the field of view of the display device includes what the user's eyes are currently focused on and what the user can see peripherally at the current point in time, the hardware and software components of the display system give priority to disappearance processing for real objects currently in the field of view over data prefetching and location prediction. Thus, in some embodiments, the check to identify any real objects for disappearance in the field of view (step 612), and the disappearance processing for any identified real objects (step 614), may occur more often, and have a higher priority on the components of the display system 8, than the prefetching and location prediction steps.
As discussed with respect to FIG. 7, the server-side disappearing application 456 may send a message, which the local application 456₁ examines, identifying real objects that meet the user disappearance criteria in the field of view, within the predetermined visible distance of the user's location, or in subsequent locations. In other words, networked resources are leveraged to assist in identifying real objects for disappearance. An implementation example process of searching metadata for matches with topic keywords and identifiers in order to identify real objects for disappearance may be performed by the server-side application 456 to offload work from the local device system processors. The local copy 456₁ may also perform prefetching by requesting that the server-side disappearing application 456 do the prefetching, and may apply the location prediction method by requesting that the server side 456 have location prediction performed and provide the results identifying any subsequent locations.
Further, the server-side application 456 may assist the disappearance process by providing, upon request from the local application 456₁, the disappearance image display data 472 for a particular alteration technique, to save memory space on the display system 8.
Also, the copy of the application requesting prefetching (e.g., the local disappearing application 456₁ or the server side 456) can prefetch image data for storage at another participating computer system in the location of the real object to be made to disappear. In the case of a subsequent location, the requesting application may schedule the other computer system to download the image data some time period prior to the user's estimated arrival, so that the image data for alteration arrives in time before the user enters the location. In one example, the local copy 456₁ may download the image data to the display device system 8 when a connection is made with the other computer system, or, in another example, when the user's display device system 8 is within distance criteria of a reference point in the location.
As mentioned above, for each location, the location-based tracking application 453 may have assigned a predetermined visible distance. For real objects that are intended to disappear, the user or an application may have selected an alteration technique, such as replacing the unwanted real object with an avatar overlay that tracks the facial expressions, or at least the body movements, of the real object (in this example, a person). A person to be covered with the avatar may be within the predetermined visible distance, but more than 40 feet away, so that the user cannot clearly see the person's facial expressions.
Because tracking the facial expression of a real person with avatar facial expressions is computationally intensive and provides little benefit to the user at that distance, another alteration technique or different image data may be applied. For example, image data of the person with a blurred appearance, or an avatar, may be displayed to the user. When the person is within, for example, 20 feet of the user, the local copy of the disappearing application 456₁, the server copy 456, or both, work with the image processing software 451 and the depth image processing and skeletal tracking application 450 to track the movement of the person, and continuously map the position of the person in the field of view to a position in the display optical system 14, so that the image generation unit 120 tracks image data of the avatar to the body movements of the person being made to disappear. When the person is within ten (10) feet of the user, avatar image data with both facial and body movement tracking of the person is displayed in the see-through display. This example illustrates the selection of different alteration techniques for different visibilities.
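The distance-dependent choice of alteration technique described above can be sketched as follows. The thresholds (40, 20, and 10 feet) follow the example in the text, while the function name, the default visible distance, and the technique labels are illustrative assumptions only.

```python
def select_alteration_technique(distance_ft, visible_distance_ft=60):
    """Pick an alteration technique based on distance to the person:
    blur or a static avatar far away, a body-tracked avatar within
    20 ft, and a face-and-body-tracked avatar within 10 ft."""
    if distance_ft > visible_distance_ft:
        return "none"                    # outside the predetermined visible distance
    if distance_ft <= 10:
        return "avatar_face_and_body"    # facial + body movement tracking
    if distance_ft <= 20:
        return "avatar_body"             # body movement tracking only
    return "blur_or_static_avatar"       # fine tracking gives little benefit here
```

The ordering of the checks matters: the most expensive technique is only chosen at close range, where its detail is actually visible to the user.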
Visibility definitions can be programmed as part of the disappearing applications 456, 456₁, or stored in accessible memory. Visibility may be based on studies of which appearance features and movements are visually recognizable to an average person. Other refinements based on factors affecting the visibility of human features (e.g., age) may also be incorporated.
FIG. 9 is a flow diagram of an embodiment of a process for selecting an alteration technique based on the visibility of a real object for disappearance. The embodiment of FIG. 9 is an example of a process that may be used in implementing step 606 or step 614. In step 642, the disappearing application 456₁ determines the location (and optionally the trajectory) of each real object within the field of view of the see-through display that is identified as satisfying the user disappearance criteria. A trajectory may be determined even for a stationary real object, based on the movement of the user relative to the stationary object. Based on the location, and optionally the trajectory, of each identified real object, and the predetermined visible distance for the user's display device location, the disappearing application 456₁ determines a visibility for each identified real object in step 644.
Appearance characteristics may also be used as a basis for determining visibility. Some examples of appearance characteristics that may be used are size and color. A person 40 feet away wearing bright orange has a visibility indicating that he or she is more likely to be noticed in the user's field of view than a person 25 feet away wearing a navy shirt.
In step 646, each real object for disappearance is prioritized based on the identified visibility of each real object in the field of view. The priority of a visibility increases the closer the object is to the display device in the field of view. In step 648, an alteration technique is selected for each real object for disappearance based on the visibility priority of the respective real object. Other bases for selecting an alteration technique may include the computation time to implement the technique, the available memory resources, and the number of real objects to be altered in the current field of view. For example, a child wearing a see-through, mixed reality display device is afraid of clowns, and a parade for a local circus is progressing down the street. Replacing each of five clowns just entering the field of view of the see-through display with the desired rabbit avatar may not occur fast enough to avoid exposing the child to at least one clown. A redaction effect may be applied first, e.g., a black box displayed over each clown. In step 650, the selected alteration technique is applied for each real object that satisfies the disappearance criteria. In a next iteration of the process of FIG. 9, the black box over one or more of the redacted clowns may be replaced with a rabbit avatar overlay.
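Steps 646-650 — ranking objects by visibility and falling back to a cheap redaction effect when there is not enough time to render every replacement — can be sketched as follows. The per-frame budget and per-technique costs are invented numbers for illustration; the specification does not give them.

```python
def plan_alterations(objects, frame_budget_ms=20.0,
                     avatar_cost_ms=8.0, redaction_cost_ms=1.0):
    """objects: list of (object_id, visibility) pairs, higher visibility
    meaning closer and more noticeable. Returns {object_id: technique}.
    The most visible objects receive the expensive avatar overlay first;
    once the per-frame budget is spent, remaining objects get a quick
    black-box redaction so nothing is left unaltered."""
    # Step 646: prioritize by visibility, most visible first.
    ranked = sorted(objects, key=lambda o: o[1], reverse=True)
    plan, spent = {}, 0.0
    for obj_id, _vis in ranked:
        # Step 648: pick a technique the remaining budget can afford.
        if spent + avatar_cost_ms <= frame_budget_ms:
            plan[obj_id] = "avatar_overlay"
            spent += avatar_cost_ms
        else:
            plan[obj_id] = "redaction_black_box"
            spent += redaction_cost_ms
    return plan
```

Run against the five-clowns example, only the budget-affordable, most visible clowns get avatars in this pass; on the next pass (step 650's next iteration) the redacted ones can be upgraded.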
The prioritization in the example of FIG. 9 may also be applied to the selection of different replacement objects as the disappearance image data for the selected technique. As discussed above, the user may have selected a replacement alteration technique and indicated that he or she will accept replacement objects generated by other users. Portions of the image data associated with these replacement objects may require a large amount of memory and have dynamic content, making them more computationally intensive to display. The disappearing application 456₁ may also base its selection from among the available replacement objects on the factors of visibility, implementation time, and the number of objects to be processed.
A user may share his altered image data with another nearby user, so that the nearby user can experience how the user experiences the real world around them. FIG. 10 is a flow diagram of an embodiment of a process of sharing altered image data between see-through, mixed reality display device systems that are within a predetermined distance of each other. In step 652, a first see-through, mixed reality display device system identifies a second see-through, mixed reality display device system that is within a predetermined distance. For example, the display device systems may exchange identity tokens via Bluetooth, WUSB, IR, or RFID connections. The type and range of the wireless transceiver may be selected to allow connections only within the predetermined distance. Location data, such as GPS or cellular triangulation, in combination with such applications may also be used to identify devices within a predetermined distance of each other.
In step 654, the disappearing application 456₁ of the first device receives an identifier of a real object that satisfies a disappearance criterion of the other user wearing the second mixed reality device, and in step 656, the first device receives, from the second device, the image data of the alteration technique to be tracked to the real object. In step 658, the disappearing application 456₁ of the first device displays the image data tracked to the real object from the perspective of the field of view of its own see-through display.
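The sharing handshake of FIG. 10 (steps 652-658) can be sketched as a small message exchange between two devices. The packet format, class names, and the 5-meter default distance are hypothetical; the specification only requires a proximity check, an object identifier, and the alteration image data.

```python
from dataclasses import dataclass

@dataclass
class SharePacket:
    object_id: str        # step 654: identifier of the real object to alter
    alteration: str       # step 656: image data / technique tracked to it

class DisplayDevice:
    def __init__(self, device_id, position):
        self.device_id = device_id
        self.position = position   # (x, y) in meters, for the sketch
        self.overlays = {}

    def within(self, other, max_dist=5.0):
        # Step 652: identify a second device within a predetermined distance.
        dx = self.position[0] - other.position[0]
        dy = self.position[1] - other.position[1]
        return (dx * dx + dy * dy) ** 0.5 <= max_dist

    def receive_share(self, packet):
        # Steps 654-658: accept the identifier and image data, then
        # render the overlay from this device's own perspective.
        self.overlays[packet.object_id] = packet.alteration
```

In practice the proximity test would come from the short-range radio link itself (Bluetooth, WUSB, IR, RFID) rather than from coordinates, as the surrounding text notes.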
FIG. 11 is a flow diagram of an embodiment of another method for causing a real object to disappear in a field of view of a see-through, mixed reality display device system based on a current topic of interest. In step 662, the disappearing application 456₁ receives a current topic of interest of a user. As mentioned above, the user may identify a subject of current interest using physical actions (e.g., gaze duration, blink commands, other eye movement based commands, audio data, and one or more gestures). The user may also indicate the current topic of interest by text input, e.g., via the mobile device 5 embodiment of the processing unit 4. In addition to direct user input, other executing applications may utilize the disappearing capabilities to enhance their service to the user. In other examples, another executing application 462 determines a topic of current interest to the user based on its data exchange with the user, and sends the current topic of interest, e.g., as part of the current topic of interest data item 420.
In step 664, the disappearing application 456₁ identifies any real object types associated with the current topic of interest. When interfacing with an executing application 462, the application 462 indicates to the disappearing application 456₁ the real object types for the current interest (e.g., via a data item such as 420). In the case of the user, the disappearing application 456₁ may output an audio or visual request for the user to identify which types of real objects relevant to the current interest the user wishes to see. The user may enter input identifying such real object types using any of the various input methods discussed above. The disappearing application 456₁ may also identify real object types based on searches of online databases and of user profile data related to the subject. Further, default real object types may be stored for common topics of interest, some examples of such topics being restaurants and directions.
In step 666, the disappearing application 456₁ identifies any real objects in the field of view of the see-through display that match any identified real object type, e.g., based on a match with an object type in the appearance characteristics stored in the metadata for each real object identified by the image processing software 451 in the current field of view. In step 668, the disappearing application 456₁ determines whether any identified real objects do not satisfy relevance criteria for the current topic of interest. For example, the disappearing application 456₁ may apply a keyword search technique to the metadata of any real object identified as matching a real object type. The search technique returns a relevance score for each real object. For example, the applied keyword search technique may return a relevance score based on a Manhattan distance weighted sum over the metadata of the real object. Based on the keyword relevance score for each real object's metadata search, in step 668 the disappearing application 456₁ identifies any real objects that do not meet the relevance criteria for the current topic of interest. In step 670, the disappearing application 456₁ causes the image generation unit 120 (e.g., via image processing software 452) to track image data to each real object that does not meet the relevance criteria, causing its disappearance in the field of view of the see-through display. In the example of a woman looking for a Chinese restaurant on a street crowded with restaurants, removing the signs of the other establishments reduces clutter in her view so that she can more quickly find the Chinese restaurant where her friends are waiting.
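One way to read the "Manhattan distance weighted sum" relevance score is sketched below: each metadata field gets a weight, and the score sums the weighted per-field keyword mismatches (an L1-style distance), with low scores meaning high relevance. This interpretation, the field weights, and the threshold are assumptions for illustration, not details taken from the specification.

```python
def relevance_distance(metadata, topic_keywords, weights):
    """L1-style (Manhattan) weighted sum over metadata fields: each
    field contributes its weight when it shares no keyword with the
    topic. 0 means every weighted field matched; larger means less
    relevant."""
    return sum(w for field_name, w in weights.items()
               if not (set(metadata.get(field_name, [])) & topic_keywords))

def fails_relevance(metadata, topic_keywords, weights, threshold=1.0):
    # Step 668: objects whose distance exceeds the threshold are deemed
    # irrelevant and become candidates for disappearance in step 670.
    return relevance_distance(metadata, topic_keywords, weights) > threshold
```

In the restaurant example, a bank's sign metadata shares no keywords with the topic {restaurant, chinese}, so it scores high and is altered away.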
Of course, although a real object may have disappeared from the see-through display's field of view, it is still in the user's environment. To avoid the user colliding with people or other objects and becoming injured, a collision avoidance mechanism may be employed. FIG. 12 is a flow diagram of an embodiment of a process for providing a user with a collision warning for a disappeared real object. In step 682, the disappearing application 456, 456₁ determines a position and trajectory of the mixed reality display device relative to a real object that has disappeared from the see-through display. In step 684, the disappearing application 456, 456₁ determines whether the mixed reality device and the real object are within a collision distance. If the device and the disappeared real object are within the collision distance, then in step 686 the disappearing application 456, 456₁ outputs a safety warning, e.g., by displaying image data or playing audio data that includes the safety warning. If the device and the real object are not within the collision distance, then in step 688 processing returns to other tasks, such as servicing a request from another application or updating the identification of real objects in the field of view, until the next scheduled check.
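The collision-avoidance check of FIG. 12 can be sketched as a simple closing-distance test over a short time horizon. The sampling of three time points, the warning distance, and the units are illustrative assumptions; the specification only calls for comparing position and trajectory against a collision distance.

```python
import math

def collision_warning(device_pos, device_vel, obj_pos, obj_vel,
                      warn_distance=2.0, horizon_s=1.5):
    """Steps 682-686: project both trajectories a short horizon ahead
    and warn if the disappeared (but still physical) object comes
    within warn_distance of the device. Positions and velocities are
    (x, y) tuples in meters and meters/second."""
    for t in (0.0, horizon_s / 2, horizon_s):
        dx = (device_pos[0] + device_vel[0] * t) - (obj_pos[0] + obj_vel[0] * t)
        dy = (device_pos[1] + device_vel[1] * t) - (obj_pos[1] + obj_vel[1] * t)
        if math.hypot(dx, dy) <= warn_distance:
            return True   # step 686: output a safety warning
    return False          # step 688: return to other tasks
```

A production system would use the depth-camera and skeletal-tracking data already maintained by the display system rather than 2-D points, but the distance test is the same.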
FIGS. 13A, 13B, 13C, and 13D illustrate examples of processing gesture user input that identifies a real object for disappearance. FIG. 13A shows a view from the perspective of a user wearing the display device 2. As indicated by dashed lines 704l and 704r, he is currently focused on the space in front of him occupied by the person 702. In this example, the user wearing the display device 2 wants to see an unobstructed view of the wildflowers and hills, but the park is crowded, and the person 702 seems to keep blocking his view.
FIG. 13B shows a first position of a thumb gesture, which is one example of a gesture for indicating disappearance. The thumb 706 is placed in front of the display device and blocks the person 702. The image processing software 451 sends a notification to the user input software 477 that a first position of a thumb gesture has been detected in the image data from the outward facing cameras 113. In this example, the disappearing application 456₁ activates an outline tool application, and the thumb fingertip position 709 is entered as the starting reference point for the outline. In this example, the outline follows the width of the thumb. FIG. 13C shows an example of the outline following the movement of the thumb across the display as the gesture is performed. The image processing software 451 displays the outline to match the movement of the thumb and identifies real objects located within the outline. A user input is received indicating that the outline tool is to be deactivated; for example, the thumb stops for a predetermined period of time, or is no longer detected in the field of view of the outward facing cameras 113. The image processing software 451 notifies the disappearing application 456₁ of the real object identifier of any object within the outline. In this example, the user input software 477 has received an "erase" command from the voice recognition software 478 (which processes audio data from the microphone 110). In another example, eye movement or text input may be used to select the erase alteration technique. Further, a default alteration technique may be selected.
Based on the thumb gesture and the audio erase command, the disappearing application 456₁ sends, over a network, the location data and image data from the cameras 113 to the location image tracking application 453, together with a request for real-time image data at the location and from the perspective of the see-through display (as represented by the image data from the cameras 113 and their predetermined offset from the display optical axis 142). If such real-time image data is available, e.g., from a display device system 8 worn by the person 702, the disappearing application 456₁ causes the image processing software 451 to display the image data over the person 702 from the user's perspective. FIG. 13D shows an example of the see-through display field of view with the occluding person 702 removed. It may be desirable to blend the edges of the image data with image data of the space surrounding the person 702, extracted from the image data from the outward facing cameras 113. In another example, image data of the surrounding space may be extracted and copied to generate the image data that obscures the object.
FIGS. 14A, 14B, 14C, and 14D illustrate examples of different alteration techniques applied to a real object meeting user disappearance criteria, based on different visibilities to the display device. In this example, the user has identified clowns as a type of real object that is to be made to disappear while she is wearing her see-through display device 2. FIG. 14A shows an example of a real object (in this example, a clown 712) meeting the user disappearance criteria in the field of view of the see-through display device, within a predetermined visible distance of the user's location. The clown 712 is only one of the many objects in the field of view of the display device (mostly persons, such as person 710). Here, the user's point of gaze, as indicated by dashed lines 704l and 704r, is directly in front of the user.
The user has also selected, as a replacement object, an avatar that looks like an ordinary person for the location, to be overlaid on and tracked to any clown. The disappearing application 456₁ can select from a plurality of alternatives representing ordinary persons at the location. Although the clown is within the predetermined visible distance 479 of the location, the distance to the clown indicates a visibility of only color detection in the current location (a busy mall street). While the clown is at this visibility, the disappearing application 456₁ causes redaction effect black image data to be applied. The disappearing application 456₁ may prefetch the avatar data while monitoring the clown's trajectory relative to the display device. FIG. 14B shows an example of the redaction alteration technique applied to the clown: the black image data is tracked to the clown in the see-through display of the device 2.
FIG. 14C shows an example of the clown at a visibility at which movement of the joints of a human subject is discernible, based on the eyesight of an average person of the user's age. FIG. 14D shows an example of image data of the avatar replacing the clown in the field of view of the see-through display device. The avatar's movements mimic those of the clown. In some examples, tracking of the clown's facial movements may be applied under the control of the disappearing application 456₁. A visibility for facial movement detection may also be stored or defined for the disappearing application 456₁.
FIGS. 15A and 15B show examples of causing real objects that do not satisfy the relevance criteria to disappear. In this example, a car navigation application 462 interfaces with the disappearing application 456₁. The car navigation application 462, based on its database of roads, the signs on them, and the businesses and services along them, may notify the disappearing application 456₁ when real objects that meet, and that fail to meet, the relevance criteria appear within a specified time period and location. Furthermore, the user may have entered one or more destinations, and the navigation system has determined a route and provided data for some of the data fields of the real object metadata for real objects to be encountered on the route. Having a route for the user facilitates pre-fetching of image data to a local computer system near the time the user enters the different locations. FIG. 15A shows an example of the field of view of the see-through display, seen over the dashboard 720 while driving, without the disappearing application executing for the display device 2. The user, traveling on road 732, is approaching the intersection 734. On the user's left, there is a STOP sign 722 and a series of route number signs 728, 730, 724, 726 with directional arrows on them. The car navigation application 462 has identified Route 5 West as the next part of the user's route. In FIG. 15A, the gaze determination application executing in device 2 indicates that the user is currently focusing on the Route 5 East sign 728.
FIG. 15B illustrates how the disappearing application 456₁ can cause irrelevant signs to disappear, rather than the user having to glance at each sign to find the correct direction for Route 5 West. In effect, irrelevant signs are altered to assist the user in finding relevant information more quickly. As shown in FIG. 15B, the signs for Route 24 North (724), Route 24 South (726), and Route 5 East (728) are all overlaid in the see-through display with a copy of the Route 5 West sign (730), all pointing left. The user spends less time trying to find the correct sign indicating where to turn. The disappearing application 456₁ may also receive, from the car navigation application 462, real object types that are never to disappear (even if the user requests it). The STOP sign 722 is an example of such a real object type, for safety reasons.
FIG. 16 is a block diagram of one embodiment of a computing system that may be used to implement a network accessible computing system hosting a disappearing application. For example, the embodiment of the computing system in FIG. 16 may be used to implement the computing systems of FIGS. 1A and 1B. In this embodiment, the computing system is a multimedia console 800, such as a gaming console. As shown in FIG. 16, the multimedia console 800 has a central processing unit (CPU) 801 and a memory controller 802 that facilitates processor access to various types of memory, including a flash read only memory (ROM) 803, a random access memory (RAM) 806, a hard disk drive 808, and a portable media drive 805. In one implementation, the CPU 801 includes a level 1 cache 810 and a level 2 cache 812 to temporarily store data and thus reduce the number of memory access cycles made to the hard disk drive 808, thereby improving processing speed and throughput.
The CPU 801, the memory controller 802, and the various memory devices are interconnected via one or more buses (not shown). The details of the bus used in this implementation are not particularly relevant to understanding the subject matter discussed herein. It should be understood, however, that such a bus may include one or more of serial and parallel buses, a memory bus, a peripheral bus, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures can include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus, also known as a mezzanine bus.
In one embodiment, the CPU 801, memory controller 802, ROM 803, and RAM 806 are integrated onto a common module 814. In this embodiment, the ROM 803 is configured as a flash ROM that is connected to the memory controller 802 via a PCI bus and a ROM bus (neither of which are shown). The RAM 806 is configured as multiple Double Data Rate Synchronous Dynamic RAM (DDR SDRAM) modules that are independently controlled by the memory controller 802 via separate buses (not shown). The hard disk drive 808 and the portable media drive 805 are shown connected to the memory controller 802 by a PCI bus and an AT Attachment (ATA) bus 816. However, in other implementations, different types of dedicated data bus structures may alternatively be applied.
A graphics processing unit (GPU) 820 and a video encoder 822 form a video processing pipeline for high speed and high resolution (e.g., high definition) graphics processing. Data is transmitted from the GPU 820 to the video encoder 822 over a digital video bus (not shown). Lightweight messages generated by system applications (e.g., pop-ups) are displayed by using a GPU 820 interrupt to schedule code to render the pop-up into an overlay. The amount of memory used for the overlay depends on the overlay area size, and the overlay preferably scales with screen resolution. Where a full user interface is used by a concurrent system application, it is preferable to use a resolution independent of the application resolution. A scaler may be used to set this resolution, eliminating the need to change the frequency and cause a TV resync.
An audio processing unit 824 and an audio codec (coder/decoder) 826 form a corresponding audio processing pipeline for multi-channel audio processing of various digital audio formats. Audio data is transmitted between the audio processing unit 824 and the audio codec 826 via a communication link (not shown). The video and audio processing pipelines output data to an A/V (audio/video) port 828 for transmission to a television or other display. In the illustrated implementation, the video and audio processing components 820-828 are mounted on the module 814.
FIG. 16 shows the module 814 including a USB host controller 830 and a network interface 832. The USB host controller 830 is shown in communication with the CPU 801 and the memory controller 802 via a bus (e.g., a PCI bus) and serves as host for peripheral controllers 804(1)-804(4). The network interface 832 provides access to a network (e.g., the Internet, a home network, etc.) and may be any of a wide variety of wired or wireless interface components, including an Ethernet card, a modem, a wireless access card, a Bluetooth module, an RFID module, an infrared module, a WUSB module, a cable modem, and the like.
In the implementation depicted in FIG. 16, the console 800 includes a controller support subassembly 840 for supporting four controllers 804(1)-804(4). The controller support subassembly 840 includes any hardware and software components needed to support wired and wireless operation with an external control device, such as, for example, a media and game controller. A front panel I/O subassembly 842 supports the multiple functionalities of a power button 812, an eject button 813, as well as any LEDs (light emitting diodes) or other indicators exposed on the outer surface of the console 800. The subassemblies 840 and 842 are in communication with the module 814 via one or more cable assemblies 844. In other implementations, the console 800 may include additional controller subassemblies. The illustrated implementation also shows an optical I/O interface 835 configured to send and receive signals that can be communicated to the module 814.
MUs 840(1) and 840(2) are shown as being connectable to MU ports "A" 830(1) and "B" 830(2), respectively. Additional MUs (e.g., MUs 840(3)-840(6)) are shown as connectable to the controllers 804(1) and 804(3), i.e., two MUs per controller. The controllers 804(2) and 804(4) may also be configured to receive MUs (not shown). Each MU 840 provides additional storage on which games, game parameters, and other data may be stored. In some implementations, the other data can include any of a digital game component, an executable gaming application, an instruction set for expanding a gaming application, and a media file. When inserted into the console 800 or a controller, the MU 840 can be accessed by the memory controller 802. A system power supply module 850 provides power to the components of the gaming system 800. A fan 852 cools the circuitry within the console 800. A microcontroller unit 854 is also provided.
An application 860 comprising machine instructions is stored on the hard disk drive 808. When the console 800 is powered on, various portions of the application 860 are loaded into the RAM 806 and/or the caches 810 and 812 for execution on the CPU 801. Various applications may be stored on the hard disk drive 808 for execution on the CPU 801.
Gaming and media system 800 may be used as a standalone system by simply connecting the system to monitor 16 (FIG. 1A), a television, a video projector, or other display device. In this standalone mode, gaming and media system 800 allows one or more players to play games or enjoy digital media, such as watching movies or listening to music. However, with the integration of broadband connectivity made possible through network interface 832, gaming and media system 800 may also be operated as a participant in a larger network gaming community.
As discussed above, the processing unit 4 may be embodied in a mobile device 5. FIG. 17 is a block diagram of an exemplary mobile device 900 that may operate in embodiments of the present technology. Exemplary electronic circuitry of a typical mobile phone is depicted. The phone 900 includes one or more microprocessors 912 and memory 910 (e.g., non-volatile memory such as ROM and volatile memory such as RAM) that stores processor-readable code which is executed by the one or more processors 912 to implement the functionality described herein.
The mobile device 900 may include, for example, the processors 912 and memory 1010 including applications and non-volatile storage. The processors 912 may implement communications, as well as any number of applications, including the interactive applications discussed herein. The memory 1010 can be any of a variety of memory storage device types, including non-volatile and volatile memory. A device operating system handles the different operations of the mobile device 900 and may contain user interfaces for operations, such as placing and receiving phone calls, text messaging, checking voicemail, and the like. The applications 930 can be any assortment of programs, such as a camera application for photos and/or videos, an address book, a calendar application, a media player, an Internet browser, games, other multimedia applications, an alarm application, and other third party applications, like the disappearing application and the image processing software discussed herein for processing image data displayed on or captured by the display device 2, and the like. The non-volatile storage component 940 in memory 910 contains data such as web caches, music, photos, contact data, scheduling data, and other files.
The processor 912 also communicates with RF transmit/receive circuitry 906, which in turn is coupled to an antenna 902; with an infrared transmitter/receiver 908; with any additional communication channels 960, like Wi-Fi, WUSB, RFID, infrared, or Bluetooth; and with a movement/orientation sensor 914, such as an accelerometer. Accelerometers have been incorporated into mobile devices to enable applications such as intelligent user interfaces that let users input commands through gestures, indoor GPS functionality which calculates the movement and direction of the device after contact is broken with GPS satellites, and detection of the device's orientation to automatically change the display from portrait to landscape when the phone is rotated. An accelerometer can be provided, e.g., by a micro-electromechanical system (MEMS), which is a tiny mechanical device (of micrometer dimensions) built onto a semiconductor chip. Acceleration direction, as well as orientation, vibration, and shock, can be sensed. The processor 912 further communicates with a ringer/vibrator 916, a user interface keypad/screen, a biometric sensor system 918, a speaker 920, a microphone 922, a camera 924, a light sensor 921, and a temperature sensor 927.
The processor 912 controls the transmission and reception of wireless signals. During a transmit mode, processor 912 provides a voice signal or other data signal from microphone 922 to RF transmit/receive circuitry 906. Transmit/receive circuitry 906 transmits the signal to a remote station (e.g., a fixed station, carrier, other cellular telephone, etc.) for communication via antenna 902. The ringer/vibrator 916 is used to signal an incoming call, text message, calendar reminder, alarm clock reminder, or other notification to the user. During a receive mode, the transmit/receive circuitry 906 receives voice or other data signals from a remote station via the antenna 902. The received voice signal is provided to the speaker 920 while other received data signals are also processed appropriately.
In addition, a physical connector 988 may be used to connect the mobile device 900 to an external power source, such as an AC adapter or powered docking station. The physical connector 988 may also be used as a data connection to a computing device. The data connection allows operations such as synchronizing mobile data with computing data on another device.
A GPS receiver 965, which uses satellite-based radio navigation to relay the position of the user, is enabled for applications with such a location service.
The example computer systems illustrated in the figures include examples of computer-readable storage devices. A computer-readable storage device is also a processor-readable storage device. Such devices may include volatile and nonvolatile, removable and non-removable memory implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage devices include, but are not limited to, RAM, ROM, EEPROM, cache, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, memory sticks or cards, magnetic cassettes, magnetic tape, a media drive, a hard disk, magnetic disk storage or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by a computer.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Claims (10)
1. A method for causing a real object in a see-through display of a see-through mixed reality display device system to disappear, the method comprising:
receiving metadata identifying one or more real objects in a field of view of the see-through display;
determining whether any of the one or more real objects satisfy user disappearance criteria; and
in response to determining that a first real object satisfies the user disappearance criteria, tracking image data to the first real object in the see-through display to cause the first real object to disappear in the field of view of the see-through display.
2. The method of claim 1, wherein:
determining whether any of the one or more real objects satisfy user disappearance criteria further comprises:
identifying any real object types associated with a subject of current interest based on a notification of the subject of current interest to a user;
identifying any real objects in the field of view of the see-through display that match any identified real object type; and
determining whether any identified real objects do not satisfy relevance criteria for the subject of current interest; and
wherein tracking image data to the first real object in the see-through display to cause the first real object to disappear in the field of view of the see-through display further comprises tracking image data to each real object that does not satisfy the relevance criteria to cause it to disappear in the field of view of the see-through display.
3. The method of claim 1, further comprising:
determining a position and a trajectory of each real object within the field of view of the see-through display identified as satisfying the user disappearance criteria;
identifying a visibility of each identified real object based on its respective determined position and trajectory and on a predetermined visible distance from the location of the display device; and
prioritizing each real object for disappearance in the field of view based on its identified visibility.
4. The method of claim 3, wherein identifying the visibility of each identified real object is further based on at least one appearance characteristic of each real object within the field of view of the see-through display identified as satisfying the user disappearance criteria.
5. The method of claim 3, wherein tracking image data to the first real object in the see-through display to cause the first real object to disappear in the field of view of the see-through display further comprises:
selecting an alteration technique for each real object to disappear based on the visibility priority of the respective real object; and
applying the selected alteration technique for each real object to disappear.
6. The method of claim 1, further comprising:
receiving user input of a gesture identifying a subject for disappearance, the subject for disappearance including at least one of the one or more real objects in the field of view;
storing the subject for disappearance in the user disappearance criteria; and
in response to identifying a user-specified alteration technique to be applied to the at least one real object, tracking image data to the at least one real object to cause the at least one real object to disappear in the see-through display using the user-specified alteration technique.
7. The method of claim 5 or 6, wherein the altering technique comprises at least one of:
replacing the first real object with a virtual object;
erasing the first real object by overlaying the first real object with image data generated by blending image data of objects surrounding the first real object;
redacting the first real object with black image data; and
occluding the first real object by tracking blurred image data to the first real object.
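Two of the alteration techniques enumerated in claim 7, black redaction and erasure by blending surrounding image data, can be sketched on a toy grayscale image (illustrative only; the function names and the nested-list pixel model are assumptions, not drawn from the claims):

```python
def redact(image, box):
    """Cover the region box = (top, left, bottom, right) with black pixels."""
    top, left, bottom, right = box
    for r in range(top, bottom):
        for c in range(left, right):
            image[r][c] = 0
    return image

def erase_by_blending(image, box):
    """Fill the region with the mean of the pixels bordering it,
    a crude stand-in for blending image data of surrounding objects."""
    top, left, bottom, right = box
    border = []
    for r in range(max(top - 1, 0), min(bottom + 1, len(image))):
        for c in range(max(left - 1, 0), min(right + 1, len(image[0]))):
            if not (top <= r < bottom and left <= c < right):
                border.append(image[r][c])
    fill = sum(border) // len(border)
    for r in range(top, bottom):
        for c in range(left, right):
            image[r][c] = fill
    return image

# A bright 2x2 "object" on a uniform background, then redacted to black.
img = [[10, 10, 10, 10],
       [10, 99, 99, 10],
       [10, 99, 99, 10],
       [10, 10, 10, 10]]
redact(img, (1, 1, 3, 3))
assert img[1][1] == 0 and img[2][2] == 0
```

In the claimed system, the filled region would be registered to the real object's tracked position each frame, so the overlay stays aligned as the user's head moves.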
8. A see-through head-mounted mixed reality display device system for causing real objects in a field of view of a see-through display of the display device system to disappear, the system comprising:
one or more location detection sensors;
a memory for storing user disappearance criteria, the user disappearance criteria including at least one subject item;
one or more processors having access to the memory and communicatively coupled to the one or more location detection sensors for receiving location identifier data for the display device system, and for identifying one or more real objects in the field of view of the see-through display that relate to the at least one subject item and are within a predetermined visible distance of a location determined from the location identifier data; and
at least one image generation unit communicatively coupled to the one or more processors and optically coupled to the see-through display to track image data to the identified one or more real objects in the field of view of the see-through display to cause the one or more real objects to disappear.
9. The system of claim 8, wherein the one or more processors are communicatively coupled to a remote computing system to access a three-dimensional model that includes metadata identifying one or more real objects in the location within a predetermined visible distance of the see-through head-mounted mixed reality display device system.
10. The system of claim 8, wherein:
the one or more processors have access to user profile data for predicting a location subsequent to a current location of the display device;
the one or more processors check for identification of any real objects in the subsequent location that satisfy the user disappearance criteria;
the one or more processors are communicatively coupled to one or more computer systems at the subsequent location; and
the one or more processors prefetch image data for any identified real objects in the subsequent location that satisfy the user disappearance criteria by scheduling a download from the one or more computer systems at the subsequent location before the user arrives.
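The predict-then-prefetch flow of claim 10 can be sketched as follows (a minimal sketch; the class and method names, the frequency-based location predictor, and the injected `download` callable are all illustrative assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class PrefetchPlanner:
    """Sketch of claim 10's prefetch logic; names are hypothetical."""
    disappearance_subjects: set   # subjects stored in the user disappearance criteria
    location_objects: dict        # location -> list of (object_id, subject) pairs
    cache: dict = field(default_factory=dict)

    def predict_next_location(self, visit_history):
        """Trivial predictor standing in for user-profile data:
        the user's most frequently visited location."""
        return max(set(visit_history), key=visit_history.count)

    def prefetch(self, visit_history, download):
        """Download replacement image data for matching objects ahead of arrival."""
        nxt = self.predict_next_location(visit_history)
        for obj_id, subject in self.location_objects.get(nxt, []):
            if subject in self.disappearance_subjects and obj_id not in self.cache:
                self.cache[obj_id] = download(obj_id)
        return nxt

planner = PrefetchPlanner(
    disappearance_subjects={"clowns"},
    location_objects={"mall": [("statue_7", "clowns"), ("kiosk_2", "food")]},
)
nxt = planner.prefetch(["office", "mall", "mall"],
                       download=lambda oid: f"imgdata:{oid}")
assert nxt == "mall"
assert "statue_7" in planner.cache and "kiosk_2" not in planner.cache
```

Scheduling the download before arrival hides network latency, so the alteration overlay is ready the moment the matching real object enters the field of view.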
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/274,136 | 2011-10-14 | ||
| US13/274,136 US9255813B2 (en) | 2011-10-14 | 2011-10-14 | User controlled real object disappearance in a mixed reality display |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1181519A1 HK1181519A1 (en) | 2013-11-08 |
| HK1181519B true HK1181519B (en) | 2016-09-23 |