CN119025689A

CN119025689A - Digital Supplement Relevance and Retrieval for Visual Search

Info

Publication number: CN119025689A
Application number: CN202410775019.7A
Authority: CN
Inventors: 艾伦·乔伊斯; 埃德加·琼; 杨哲; 伊恩·梅萨; 约瑟夫·奥尔森
Original assignee: Google LLC
Current assignee: Google LLC
Priority date: 2018-06-21
Filing date: 2019-06-21
Publication date: 2024-11-26
Also published as: JP7741026B2; CN112020712A; JP7393361B2; CN112020712B; JP2022110057A; WO2019245801A1; KR20230003388A; KR102753371B1; KR20200136030A; EP3811238A1; JP2021522614A; JP2024112912A

Abstract

The application discloses digital supplemental association and retrieval for visual searches. Systems and methods for identification and retrieval of visually searched content are provided. An example method includes receiving data specifying a digital supplement. The data may identify a digital supplement and a supplement anchor for associating the digital supplement with the visual content. The method may further include generating a data structure instance specifying the digital replenishment and replenishment anchor, and after generating the data structure instance, enabling digital replenishment to be triggered by the image based at least on storing the data structure instance in a database comprising a plurality of other data structure instances. Other data structure instances may each specify a digital supplement and one or more supplemental anchors.

Description

Digital supplemental association and retrieval for visual searches

Description of the division

The application belongs to a divisional application of Chinese application patent application 201980022269.0 with application date of 2019, 6 and 21.

Cross Reference to Related Applications

The present application is a continuation of, and claims priority to, U.S. non-provisional patent application No.16/014,520 entitled "DIGITAL SUPPLEMENT ASSOCIATION AND RETRIEVAL FOR VISUAL SEARCH (digital supplemental association and retrieval for visual search)" filed on month 6 and 21 of 2018, the disclosure of which is incorporated herein by reference in its entirety.

Technical Field

The application relates to digital supplemental association and retrieval for visual searches.

Background

Mobile computing devices such as smartphones typically include a camera. These cameras may be used to capture images of entities in the computing device's surroundings. Various types of content or experiences related to those entities may be made available to users via mobile computing devices.

Disclosure of Invention

The present disclosure describes systems and methods for digital supplemental association and retrieval for visual searches. For example, the systems and techniques described herein may be used to provide digital supplements responsive to visual searches, such as Augmented Reality (AR) content or experience. The visual search may be based on, for example, an image or an entity identified within an image. The digital supplement may, for example, include providing information or functionality associated with the image.

One aspect is a computer-implemented method comprising: data specifying a digital complement is received, the data identifying the digital complement and a complement anchor for associating the digital complement with the visual content. The method also includes generating a data structure instance specifying the digital supplemental and supplemental anchors. The method further includes, after generating the data structure instance, enabling digital replenishment by image triggering based at least on storing the data structure instance in a database comprising a plurality of other data structure instances. Each other data structure instance specifies a digital complement and one or more complement anchors.

Another aspect is a computing device that includes at least one processor and a memory storing instructions. The instructions, when executed by the at least one processor, cause the computing device to receive data specifying a digital complement, the data identifying the digital complement, a complement anchor for associating the digital complement with the visual content, and the contextual information. The instructions also cause the computing device to generate a data structure instance specifying the digital supplemental, supplemental anchor, and context information. The instructions further cause the computing device, after generating the data structure instance, to enable digital replenishment by image triggering based at least on storing the data structure instance in a database comprising a plurality of other data structure instances. Each other data structure instance specifies a digital complement and one or more complement anchors.

Yet another aspect is a computer-implemented method that includes receiving a visual content query from a computing device, and identifying a supplemental anchor based on the visual content query. The method also includes generating an ordered list of digital supplements based on the identified supplemental anchors and transmitting the ordered list to the client computing device.

The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings, and from the claims.

Drawings

Fig. 1 is a block diagram illustrating a system according to an example embodiment.

FIG. 2 is a third human perspective of an example physical space in which one embodiment of the client computing device of FIG. 1 is accessing digital supplements.

FIG. 3 is a diagram of an example method that enables triggering digital replenishment according to an embodiment described herein.

FIG. 4 is a diagram of an example method that enables triggering digital replenishment according to an implementation described herein.

FIG. 5 is a diagram of an example method of searching and presenting digital supplements according to an embodiment described herein.

FIG. 6 is a diagram of an example method of image-based recognition and presentation of digital supplements according to an embodiment described herein.

7A-7C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of FIG. 1 for visual content searching and displaying digital supplements.

8A-8C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of FIG. 1 for visual content searching and displaying digital supplements.

Fig. 9A and 9B are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of fig. 1 for visual content searching and displaying digital supplements.

10A-10C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of FIG. 1 for visual content searching and displaying digital supplements.

11A-11C are schematic illustrations of user interface screens displayed by an embodiment of the client computing device of FIG. 1 for conducting various visual content searches within a store.

12A-12C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of FIG. 11 during various visual content searches.

FIG. 13 is a schematic diagram of an example of a computer device and a mobile computer device that may be used to implement the techniques described here.

Reference will now be made in detail to non-limiting examples of the present disclosure, examples of which are illustrated in the accompanying drawings. Examples are described below with reference to the drawings, wherein like reference numerals refer to like elements. When the same reference numerals are shown, the corresponding descriptions are not repeated and the interested reader may refer to the previously discussed drawings for a description of the same elements.

Detailed Description

The present disclosure describes technical improvements to simplify the identification and presentation of digital supplements based on visual content. Some implementations of the technology described herein generate an index of digital supplements related to a particular type of visual content and provide those digital supplements in response to visual content queries received from a client computing device. The index may allow the user to access relevant digital supplements provided by network-accessible resources (e.g., web pages) disposed throughout the world. This may provide a functional data structure that allows for more efficient retrieval of information.

For example, a client computing device, such as a smart phone, may capture an image of a supplemental anchor, such as an entity. The client computing device may then transmit a visual content query to the server computing device based on the image to retrieve the digital supplement associated with the identified supplemental anchor. In some implementations, the replenishment anchor is based on a physical environment surrounding the client computing device, and the digital replenishment is virtual content that can replenish the user's experience in the physical environment.

The visual content query may include an image or data determined from an image (e.g., an indicator such as an identified supplemental anchor). An example of data determined from an image is text extracted from the image using, for example, optical character recognition. Other examples of data extracted from an image include values read from a bar code, QR code, etc. in the image, an identifier or description of an entity, product, or entity type identified in the image.

For example, a neural network system, such as a convolutional neural network system, may be used to identify an entity, product, or entity type in an image. An identifier or description of an entity, product, or entity type may include metadata or a reference to a record in a database relating to the entity, product, or entity type. Non-limiting examples of entities include buildings, artwork, products, books, posters, photographs, catalogs, logos, documents (e.g., business cards, receipts, coupons, catalogs), people, and body parts.

Various types of digital supplements associated with the supplemental anchors may be available. Digital supplements may be provided through network-accessible resources, such as web pages available on the internet. There is a need for a way to locate and provide these digital supplements in response to visual content queries. Some embodiments generate and maintain a digitally-supplemented index associated with an entity for responding to visual content queries. For example, the index may be populated by crawling the network-accessible resources to determine whether the network-accessible resources include or provide any digital supplements and to determine the supplemental anchors associated with those digital supplements.

For example, the network-accessible resources may include metadata that identifies a supplemental anchor (e.g., text, code, entity, or entity type) associated with the digital supplemental. In response to a hypertext transfer protocol (HTTP) request, the network-accessible resource may include metadata. Metadata may be provided in various formats, such as extensible markup language (XML), javaScript object notation (JSON), or another format.

Metadata for digital supplementation may include one or more of the following: a type indicator, an anchor indicator, a name, a description, a piece of content (i.e., a excerpt or preview of a portion of content), an associated image, a link such as a URL to a digital supplement, and an identifier of an application associated with the digital supplement. The metadata may also include information about the publisher of the digital supplement. For example, the metadata may include one or more of a publisher name, a publisher description, and an image or icon associated with the publisher. In some implementations, the metadata includes contextual information related to providing the digital supplement. For example, the metadata may also include conditions (e.g., geographic conditions, desired applications) associated with providing or accessing the digital supplement.

The identified digital complement may be added to an index stored in memory. In at least some implementations, the associated replenishment anchor for digital replenishment is used as a key to index (key). Digital supplements may also be associated with various scores. For example, a digital complement may be associated with a reputation score that is based on how many other links are found (e.g., when crawling accessible resources) and the reputation of the network-accessible resources that provide those links that reference the digital complement or the network-accessible resources associated with the digital complement. As another example, a digital supplement may be associated with one or more relevance scores corresponding to the relevance of the digital supplement (or associated network accessible resources) to a particular anchor. The relevance score may also be associated with a keyword or topic. The relevance score may be determined based on one or more of digitally-supplemented content, content of the network-accessible resource, content of a site linked to the network-accessible resource, and content (e.g., text) of a link to the network-accessible resource.

Fig. 1 is a block diagram illustrating a system 100 according to an example embodiment. The system 100 can associate the digital supplement with an entity or entity type and can retrieve the digital supplement in response to a visual search. The visual search is a visual content-based search. For example, a visual search may be performed based on a visual content query. The visual content query is an image or other visual content based query. For example, the visual content query may include an image. In some implementations, the visual content query can include image-based text or data. For example, text or data may be generated by recognizing one or more entities in an image. Some visual content queries do not include images (e.g., a visual content query may include only data or text generated from an image). In some implementations, the system 100 includes a client computing device 102, a search server 152, and a digital supplemental server 172. Also shown is a network 190 through which the client computing device 102, the search server 152, and the digital supplemental server 172 may communicate.

The client computing device 102 may include a processor component 104, a communication module 106, a sensor system 110, and a memory 120. The sensor system 110 may include various sensors such as a camera assembly 112, an Inertial Motion Unit (IMU) 114, and a Global Positioning System (GPS) receiver 116. Embodiments of sensor system 110 may also include other sensors including, for example, light sensors, audio sensors, image sensors, distance and/or proximity sensors, contact sensors such as capacitive sensors, timers, and/or other sensors and/or different combinations of sensors. In some implementations, the client computing device 102 is a mobile device (e.g., a smart phone).

The camera component 112 captures images or video of the physical space surrounding the client computing device 102. The camera assembly 112 may include one or more cameras. The camera assembly 112 may also include an infrared camera. The images captured with the camera component 112 can be used to identify supplemental anchors and form a visual content query.

In some implementations, the image captured with the camera component 112 can also be used to determine the position and orientation of the client computing device 102 within a physical space, such as an internal space, based on a representation of the physical space received from the memory 120 or an external computing device. In some implementations, the representation of the physical space may include visual features of the physical space (e.g., features extracted from an image of the physical space). The representation may also include location determination data associated with those features that may be used by the visual positioning system to determine a location and/or position within the physical space based on one or more images of the physical space. The representation may also include a three-dimensional model of at least some of the structures within the physical space. In some embodiments, the representation does not include a three-dimensional model of the physical space.

The IMU 114 may detect motion, movement, and/or acceleration of the client computing device. The IMU 114 may include a variety of different types of sensors, such as, for example, accelerometers, gyroscopes, magnetometers, and other such sensors. The orientation of the client computing device 102 may be detected and tracked based on data provided by the IMU 114 or the GPS receiver 116.

The GPS receiver 116 may receive signals transmitted by GPS satellites. The signals include the time and position of the satellites. Based on signals received from several satellites (e.g., at least four), the GPS receiver 116 may determine the global position of the client computing device 102.

Memory 120 may include applications 122, other applications 140, and a device location system 142. Other applications 140 include any other application installed on the client computing device 102 or otherwise available for execution on the client computing device 102. The application 122 may cause one of the other applications 140 to launch to provide digital supplements. In some implementations, some digital supplements are only available if other applications 140 include specific applications that are associated with or required to provide digital supplements.

The device location system 142 determines the location of the client computing device 102. The device location system 142 may use the sensor system 110 to determine the position and orientation of the client computing device 102 within the global or physical space. In some implementations, the device location system 142 determines the location of the client computing device 102 based on, for example, cellular triangulation.

In some implementations, the client computing device 102 may include a visual positioning system that compares images captured by the camera component 112 (or features extracted from those images) to known arrangements of features within a representation of the physical space to determine six degree-of-freedom poses (e.g., positions and orientations) of the client computing device 102 within the physical space.

The application 122 may include a supplemental anchor identification engine 124, a digital supplemental search engine 126, a digital supplemental presentation engine 128, and a user interface engine 130. Some implementations of the application 122 may include fewer, more, or other components.

The supplemental anchor identification engine 124 identifies supplemental anchors based on, for example, images captured by the camera assembly 112. In some implementations, the supplemental anchor recognition engine 124 analyzes the image to recognize text. This text can then be used to identify the anchor. For example, text may be mapped to nodes in a knowledge graph. For example, text may be recognized as the name of an entity, such as a person, place, product, building, artwork, movie, or other type of entity. In some implementations, text can be recognized as a phrase generally associated with a particular entity or as a phrase describing a particular entity. For example, the text may then be recognized as an anchor associated with a particular entity.

In some implementations, the supplemental anchor identification engine 124 identifies one or more codes within the image, such as a barcode, QR code, or another type of code. The code may then be mapped to a supplemental anchor.

The supplemental anchor identification engine 124 may include a machine learning module that may recognize at least some types of entities within an image. For example, the machine learning module may include a neural network system. Neural networks are computational models for machine learning and consist of nodes organized in layers with weighted connections. Training a neural network uses training examples, each of which is an input and a desired output, to determine weight values for connections between layers through a series of iterative rounds that increase the likelihood that the neural network will provide the desired output for a given input. In each training round, the weights will be adjusted to account for erroneous output values. After training, the neural network may be used to predict the output based on the provided input.

In some embodiments, the neural network system comprises a Convolutional Neural Network (CNN). Convolutional Neural Networks (CNNs) are neural networks in which at least one layer of the neural network is a convolutional layer. A convolutional layer is a layer in which the values of the layer are calculated based on a subset of the values for which a kernel function was applied to the previous layer. Training the neural network may involve adjusting weights of the kernel functions based on the training examples. Typically, each value in the convolutional layer is calculated using the same kernel function. Thus, the weight that must be learned while training a convolutional layer is much less than a fully connected layer in a neural network (e.g., a layer in which each value in a layer is calculated as an independently adjusted weighted combination of each value in a previous layer). Because the weights in the convolutional layer are typically less, training and using the convolutional layer may require less memory, processor cycles, and time than an equivalent fully-connected layer.

After the supplemental anchor recognition engine 124 recognizes an entity or entity type in the image, a textual description of the entity or entity type may be generated. Additionally, entities or entity types may be mapped to supplemental anchors. In some implementations, the supplemental anchor is associated with one or more digital supplements.

In some implementations, the supplemental anchor identification engine 124 determines a confidence score for the identified anchor. A higher confidence score may indicate that content from an image (e.g., image, extracted text, barcode, QR code) is more likely to be associated with the determined anchor than a lower confidence score.

Although the example of fig. 1 shows the supplemental anchor identification engine 124 as a component of the application 122 on the client computing device 102, some implementations include a supplemental anchor identification engine on the search server 152. For example, the client computing device 102 may send the image captured by the camera component 112 to the search server 152, which search server 152 may then identify the supplemental anchor within the image.

In some implementations, the supplemental anchor identification engine 124 identifies potential supplemental anchors. For example, supplemental anchor identification engine 124 may identify various entities within the image (identified). The identifier of the identified entity may then be transmitted to the search server 152, which search server 152 may determine whether any entity is associated with any supplemental anchors. In some implementations, the search server 152 can use the identified entity as context information even if the identified entity is not a supplemental anchor.

The digital supplement retrieval engine 126 retrieves digital supplements. For example, the digital replenishment retrieval engine 126 may retrieve digital replenishment associated with the replenishment anchor identified by the replenishment anchor identification engine 124. In some implementations, the digital replenishment retrieval engine 126 retrieves digital replenishment from the search server 152 or the digital replenishment server 172.

For example, after identifying the supplemental anchor, the digital supplemental search engine 126 may search for one or more digital supplements associated with the identified supplemental anchor. The digital supplemental search engine 126 may generate a visual content query including an image (or an identifier of a supplemental anchor or entity within the image) and transmit the visual content query to the search server 152. The visual content query may also include contextual information, such as the location of the client computing device 102. In some implementations, data related to the digital supplement, such as a name, image, or description, is retrieved and presented to the user (e.g., through user interface engine 130). If multiple digital supplements are presented, the user may select one of the digital supplements via a user interface generated by user interface engine 130.

The digital supplemental presentation engine 128 presents or causes a digital supplemental presentation on the client computing device 102. In some implementations, the digital supplemental presentation engine 128 causes the client computing device to initiate one of the other applications 140. In some implementations, the digital supplemental presentation engine 128 causes information or content to be displayed. For example, the digital supplemental presentation engine 128 may cause the user interface engine 130 to generate a user interface that includes information or content from the digital supplemental to be displayed by the client computing device 102. In some implementations, the digital supplemental presentation engine 128 is triggered by the digital supplemental retrieval engine 126 retrieving the digital supplemental. The digital supplemental presentation engine 128 may then trigger the display device 108 to display content associated with the digital supplemental. In some implementations, the digital replenishment presentation engine 128 causes the digital replenishment to be displayed at a different time than when the digital replenishment is retrieved by the digital replenishment retrieval engine 126. For example, the digital supplement may be retrieved at a first time in response to the visual content query, and the digital supplement may be presented at a second time. For example, at a first time (e.g., when a user is browsing a catalog or is at a store), a digital supplement may be retrieved in response to a visual content query based on images of home decoration or furniture from the catalog or store. A digital complement of AR content including home decoration or furniture may be presented at a second time (e.g., when a user is in a room in which the home decoration or furniture may be placed).

The user interface engine 130 generates a user interface. The user interface engine 130 may also cause the client computing device 102 to display the generated user interface. The generated user interface may, for example, display information or content from the digital supplement. In some implementations, the user interface engine 130 generates a user interface that includes a plurality of user-actuatable controls, each control associated with a digital complement. For example, the user may actuate one of the user-actuatable controls (e.g., by touching the control on a touch screen, clicking on the control using a mouse or another input device, or otherwise actuating the control).

Search server 152 is a computing device. Search server 152 may be responsive to search requests such as visual content queries. The response may include one or more digital supplements potentially relevant to the visual content query. In some implementations, the search server 152 includes a memory 160, a processor component 154, and a communication module 156. Memory 160 may include a content crawler 162, a digital supplemental search engine 164, and a digital supplemental data store 166.

The content crawler 162 may crawl network-accessible resources to identify digital supplements. For example, the content crawler 162 may access web pages accessible via the Internet, such as web pages provided by the digital replenishment server 172. Crawling the network-accessible resources may include requesting the resources from the web server and parsing at least a portion of the resources. Digital supplements may be identified based on metadata provided by the network-accessible resource, such as XML or JSON data that provides information about the digital supplement. In some implementations, the crawler identifies the network-accessible resources based on extracting links from previously crawled network-accessible resources. The content crawler 162 may also identify network accessible resources to crawl based on receiving user submitted input. For example, a user may submit a URL (or other information) to a network accessible resource including a digital supplement through a web form or Application Programming Interface (API). In some implementations, the content crawler 162 generates an index of the identified digital supplements. The content crawler 162 may also generate scores associated with the digital supplements, such as relevance scores or popularity (reputation) scores.

The digital complement search engine 164 receives search queries and generates responses that may include one or more potentially relevant digital supplements. For example, the digital supplemental search engine 164 may receive visual content queries from the client computing device 102. The visual content query may include an image. The digital replenishment search engine 164 can identify replenishment anchors in the image and identify relevant or potentially relevant digital replenishment based on the identified replenishment anchors. The digital supplemental search engine 164 may transmit a response to the client computing device 102 that includes the digital supplement or information that may be used to access the digital supplement. In some implementations, the digital supplemental search engine 164 can return information associated with multiple digital supplements. For example, a list of digital supplements may be included in the response to the query. The list may be ordered based on relevance to the supplemental anchor, popularity, or other properties of the digital supplement.

The visual content query may, for example, include images captured by the camera component 112 or text or other data associated with images captured by the camera component 112. The visual content query may also include other information, such as the location of the client computing device 102 or an identifier of the user of the client computing device 102. In some implementations, the search server 152 may determine a likely location of the client computing device 102 from the user identifier (e.g., if the user has enabled a location service on the client computing device 102 that associates information about the user location with the user account).

Digital supplemental data store 166 stores information regarding digital supplements. In some implementations, the digital supplemental data store 166 includes an index of digital supplements. For example, the index may be generated by the content crawler 162. The numeric supplemental search engine 164 may use the index to respond to search queries.

Digital replenishment server 172 is a computing device. The digital replenishment server 172 provides digital replenishment. In some embodiments, digital supplemental server 172 includes memory 180, processor component 174, and communication module 176. The memory 180 may include a digital supplement 182 and metadata 184. In some implementations, memory 180 may also include other network-accessible resources, such as web pages that are not necessarily digital supplements. For example, memory 180 may store web pages that include metadata to provide details regarding one or more digital supplements and how to access those digital supplements. In addition, the memory 180 may include a resource service engine, such as a web server, that responds to requests, such as HTTP requests, for example, with network accessible resources, such as web pages and digital supplements.

Digital complement 182 is any type of content that may be provided as a complement to things in the physical environment surrounding a user. Digital complement 182 may also include any type of content that may complement a stored image (e.g., an image of a previous physical environment surrounding a user). For example, the digital replenishment may be associated with replenishment anchors, such as images, objects or products or locations identified in the images. Digital complement 182 may include one or more images, audio content, text data, video, games, data files, applications, or structured text documents. Examples of structured text documents include hypertext markup language (HTML) documents, XML documents, and other types of structured text documents.

The digital complement 182 may cause an application to launch and may define parameters for the application. The digital complement 182 may also cause the request to be transmitted to a server (e.g., an HTTP request) and may define parameters of the request. In some implementations, the digital replenishment 182 is initiated as a workflow for completing an activity (such as a workflow for completing a purchase). For example, the digital supplement 182 may transmit an HTTP request to the server to add a particular product to the user's shopping cart, to add a coupon code, and to retrieve a purchase confirmation page.

Metadata 184 is data describing the digital supplement. Metadata 184 may describe one or digital supplements provided by digital replenishment server 172 or provided elsewhere. Metadata 184 for digital supplementation may include one or more of the following: a type indicator, an anchor indicator, a name, a description, a preview clip or snippet, an associated image, such as a link to a URL of the digital supplement, and an identifier of an application associated with the digital supplement. The metadata may also include information about the digitally-supplemented publisher, such as a publisher name, a publisher description, and an image or icon associated with the publisher. In some implementations, the metadata also includes context information about the digital complement or context information that must be satisfied to provide the digital complement. For example, the metadata may include conditions (e.g., geographic conditions, client computing device requirements, required applications) that the access digital supplement must satisfy. Exemplary context information includes location, entities identified within the image, or multiple entities identified within the image (e.g., some digital supplements may require identifying a combination of entities within the image). The identified entity may be a supplemental anchor. In some implementations, the identified entity is not a supplemental anchor, but rather provides context information. Metadata 184 may also include supplemental anchors (e.g., text, code, entity, or entity type) associated with the digital supplemental.

Metadata 184 may be stored in a variety of formats. In some implementations, the metadata 184 is stored in a database. Metadata 184 may also be stored as an XML file, JSON file, or another format file. In some implementations, the digital supplemental server 172 retrieves the metadata 184 from the database and formats the metadata 184 into XML, JSON, or otherwise to provide a response to a request from the client or search server 152. For example, the search server 152 may access the metadata 184 to generate data stored in the digital supplemental data store 166 and for responding to search requests from the client computing device 102.

The communication module 106 includes one or more devices for communicating with other computing devices, such as a search server 152 or a digital replenishment server 172. The communication module 106 may communicate via a wireless or wired network, such as network 190. The communication module 156 of the search server 152 and the communication module 176 of the digital supplemental server 172 may be similar to the communication module 106.

The display device 108 may for example comprise an LCD (liquid crystal display) screen, an LED (light emitting diode) screen, an OLED (organic light emitting diode) screen, a touch screen or any other screen or display for displaying images or information to a user. In some implementations, the display device 108 includes a light projector arranged to project light onto a portion of the user's eye.

Memory 120 may include one or more non-transitory computer-readable storage media. Memory 120 may store instructions and data that may be used by client computing device 102 to implement the techniques described herein, such as generating a visual content query based on captured images, transmitting the visual content query, receiving a response to the visual content query, and presenting a digital supplement identified in response to the visual content query. Memory 160 of search server 152 and memory 180 of digital supplemental server 172 may be similar to memory 120 and may store data instructions that may be used to implement the techniques of search server 152 and digital supplemental server 172, respectively.

The processor component 104 includes one or more devices capable of executing instructions, such as instructions stored by the memory 120, to perform various tasks associated with digital supplemental association and retrieval for visual searching. For example, the processor component 104 may include a Central Processing Unit (CPU) and/or a Graphics Processor Unit (GPU). For example, if a GPU is present, some image/video rendering tasks (such as generating and displaying a user interface or displaying digitally-supplemented portions) may be offloaded from the CPU to the GPU. In some implementations, some image recognition tasks may also be offloaded from the CPU to the GPU.

Although not shown in fig. 1, some implementations include a head mounted display device (HMD). The HMD may be a separate device from the client computing device 102, or the client computing device 102 may include an HMD. In some implementations, the client computing device 102 communicates with the HMD via a cable. For example, the client computing device 102 may transmit video signals and/or audio signals to the HMD for display to the user, and the HMD may transmit motion, position, and/or orientation information to the client computing device 102.

The client computing device 102 may also include various user input components (not shown), such as a controller that communicates with the client computing device 102 using a wireless communication protocol. In some implementations, the client computing device 102 may communicate with an HMD (not shown) via a wired connection (e.g., universal Serial Bus (USB) cable) or via a wireless communication protocol (e.g., any WiFi protocol, any bluetooth protocol, zigbee, etc.). In some implementations, the client computing device 102 is a component of the HMD and may be contained within a housing of the HMD.

The network 190 may be the Internet, a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and/or any other network. The client computing device 102 may receive audio/video signals, for example, via a network, which may be provided as part of a digital supplement in the illustrative example embodiment.

FIG. 2 is a third person perspective of an example physical space 200 in which an embodiment of the client computing device 102 is accessing digital supplements. In this example, physical space 200 includes object 222. Here, the object 222 is an artwork on a wall of the physical space 200. The object 222 is contained within the field of view 204 of the camera component 112 of the client computing device 102.

An example user interface screen 206 is also shown. The user interface screen 206 may be generated, for example, by the user interface engine 130 of the client computing device 102. The user interface screen 206 includes an image display panel 208 and a digital supplemental selection panel 210. The image display panel 208 shows an image. For example, the image display panel 208 may show images corresponding to real-time feeds from the camera component 112 of the client computing device 102. In some implementations, the image display panel 208 shows previously captured images or information that has been retrieved from the memory 120 of the client computing device 102.

In some implementations, the user interface screen 206 is displayed to the user on a display device of the client computing device 102. In some implementations, the user interface screen 206 can be overlaid on an image of the physical space (or a video feed captured by a camera of the computing device). Further, the user interface screen 206 may be displayed as AR content on the user's field of view using an HMD worn by the user.

The image display panel 208 may also include notes or user interface elements that may be related to the image. For example, the image display panel 208 may include an indicator that an object in the image (e.g., object 222) has been recognized as a supplemental anchor. The indicator may include a user-actuatable control to access or view information about the digital supplement associated with the identified supplemental anchor. In some cases, the image displayed in the image display panel 208 may include a plurality of objects that are recognized as supplemental anchors, and the image display panel 208 may include a plurality of annotations overlaying the image to identify those supplemental anchors.

The supplemental anchor may be recognized by a supplemental anchor recognition engine of the client computing device 102. In some implementations, the supplemental anchor is identified by transmitting the image to the search server 152. The search server 152 may then analyze the image and identify supplemental anchors in the image. In some implementations, the search server 152 can transmit one or more of the location (e.g., image coordinates) or size of any identified objects associated with the supplemental anchors to the client computing device 102. The client computing device 102 may then update the user interface screen to show annotations that identify the supplemental anchors (or associated objects) in the image. In some implementations, the client computing device 102 can track the location of supplemental anchors (or associated objects) in the video stream captured by the camera component 112 (e.g., images captured in sequential order) (e.g., the supplemental anchor identification engine 124 can track the supplemental anchors identified by the search server 152).

The digital supplement selection panel 210 allows a user to select digital supplements for presentation. For example, the digital replenishment selection panel 210 may include a menu including user-actuatable controls each associated with a digital replenishment. In this example, the digital supplemental selection panel 210 includes a user actuatable control 212 and a user actuatable control 214, each of which includes information about the associated digital supplemental. For example, the user-actuatable control may display one or more of a name (or title), a brief description, and an image associated with a digital supplement that may be received from the search server 152. Upon actuation of the user-actuatable control 212 or the user-actuatable control 214, the associated digitally-supplemental content may be presented to the user. Presenting the digital complement to the user may include causing the client computing device 102 to display a user interface screen that includes images, video, text, other content, or a combination thereof from the digital complement. In some implementations, the digital supplemental content is displayed as an overlay over an image or camera feed on the image display panel 208. The digital supplemental content may be three-dimensional augmented reality content.

In some implementations, presenting the digital complement includes activating an application (e.g., one of the other applications 140) installed on the client computing device 102. Presenting the digital complement may also include transmitting the request to a URL associated with the digital complement. The request may include parameters associated with the digital supplement, such as an identifier of the product or object identified within the image. In some implementations, images (or other content) from the visual content query are passed parameters and requests. The image may also be provided via an API associated with the digital replenishment server 172. In some implementations, the client computing device 102 transmits the image to the digital replenishment server 172. In some implementations, the search server 152 may transmit the image to the digital supplemental server 172. For example, in response to a user selecting a digital supplement, the client computing device 102 may transmit the selected indicator to the search server 152, and the search server 152 may then transmit the image to the corresponding digital supplement server. The client computing device 102 may also transmit the URL to a location on the search server 152 that the digital supplemental server 172 may use to access the image. Advantageously, these implementations may reduce the amount of data that a client computing device needs to transmit.

The digital complement associated with the user-actuatable control 212 may cause information about the object 222, such as information from a museum, to be displayed. The digital complement associated with the user-actuatable control 214 may cause information about the museum tour to be displayed. For example, the presentation of the digital supplement may cause one station of the museum tour to be marked as completed and information about the next station to be displayed.

FIG. 3 is a diagram of an example method 300 that enables triggering digital replenishment according to an embodiment described herein. The method 300 may be performed, for example, by the content crawler 162 of the search server 152 to allow a user to access digital supplements based on visual content queries.

At operation 302, data specifying a digital complement is received. The data may identify the situation in which the digital supplement should be provided. The data specifying the digital complement may be received in various ways. For example, data specifying a digital supplement may be received from a network accessible resource such as a web page that includes metadata about the digital supplement. The data specifying the digital supplement may also be received via, for example, an API or form provided by the search server 152. Data specifying digital supplements may also be received from a memory location or data store.

The data regarding the digital supplement may include access data usable by the client computing device to access the digital supplement. For example, the access data may include a digitally-supplemented URL and parameters passed to the URL. The access data may also include an application identifier and parameters of the application. The data about the digital supplement may also include descriptive data about the digital supplement. The client computing device may use the descriptive data to present information about the digital supplement to the user (e.g., on a menu that the user may select the digital supplement). Descriptive data may include, for example, names (or titles, descriptions, publisher names, and images). The data regarding the digital supplement may also include an identifier of the supplemental anchor.

At operation 304, a data structure instance based on the received data is generated. The data structure may be, for example, a record in a database. The database may be a relational database and the data structure instance may be linked (e.g., via a foreign key) to one or more records associated with the supplemental anchor.

At operation 306, after the data structure instance is generated, the digital complement is enabled to be retrieved by the visual content query. For example, a database field associated with a data structure instance may be set active such that the digital supplemental search engine 164 may access and return the associated digital supplemental. In some embodiments, triggering of the digital replenishment may include saving or logging a database record. In some implementations, enabling retrieval of the digital complement includes enabling triggering of the digital complement by the client computing device. For example, after the instance is generated, the digital complement may be returned to the client computing device in response to the search and activated or presented by the client computing device.

FIG. 4 is a diagram of an example method 400 that enables triggering digital replenishment according to an implementation described herein. The method 400 may be performed, for example, by the content crawler 162 of the search server 152 to allow a user to access digital supplements based on visual content queries.

At operation 402, network accessible resources are analyzed. In some implementations, the network-accessible resource is a web page served by, for example, digital replenishment server 172. In some implementations, a set of network accessible resources is analyzed. The set of network accessible resources may be generated based on submission via a form or API. In some implementations, the set of network-accessible resources may be generated by crawling other network-accessible resources to identify URLs. The crawling process may be performed recursively.

At operation 404, metadata associated with the digital complement within the network-accessible resource is identified. In some implementations, the network-accessible resource can include an indicator of metadata associated with the digital complement. For example, the network-accessible resource may include a tag that identifies a portion of the network-accessible resource that includes metadata. The tags may be XML tags having a particular type or attribute. The tags may be HTML tags, such as script tags, that include JSON data structures containing metadata.

At operation 406, a metadata-based digital supplemental data structure instance is generated. Operation 406 may be similar to operation 304.

At operation 408, a visual content query is received. The visual content query may be sent, for example, by a client computing device, such as client computing device 102. In some implementations, the visual content query includes an image. The visual content query may also include text data describing the image. For example, the text data may include an identifier of a supplemental anchor within an image captured by a camera component of the client computing device. In some implementations, the visual content query also includes other information, such as a location of the client computing device or an identifier of a user account associated with the client computing device.

At operation 410, a plurality of digital supplemental data structure instances are identified based on the visual content query. In some implementations, supplemental anchors are identified within images provided in the visual content query. The supplement anchors may then be used to query the index or database to obtain the relevant digital supplements. In some implementations, other data provided with the query may also be used to identify digital supplements, such as the location of the client computing device or information associated with the user account. In some implementations, a plurality of supplemental anchors are used to identify the relevant supplemental anchor.

At operation 412, an ordering of the plurality of digital supplemental data structure instances is determined. The ranking may be based on various scores associated with the digital complement or the relevance of the digital complement to the visual content query. In some implementations, a relevance score corresponding to the relevance of the digital supplement to the visual content query is used to rank the plurality of digital supplement data structure instances.

The relevance score may be determined from a number of factors such as one or more of the content of the digital supplement, the content linked to the network-accessible resource of the digital supplement (or a network-accessible resource associated with the digital supplement), the linked text or content near the link to the digital supplemental information on other network-accessible resources.

The score may also be based on a popularity metric. Reputation metrics are one example of popularity metrics. The reputation metric may be based on a combination of how many other network resources are linked to the digital supplemental content and the reputation score of those other network accessible resources. In some implementations, the popularity score may be based on how frequently a digital resource is selected or has been selected. In some implementations, the popularity score may correspond to how frequently a digital resource is selected for visual content queries.

The score may be determined or retrieved from a data store or API. In some implementations, an API is accessed to retrieve a numerically-supplemented score. For example, the score may be retrieved from a search engine that has determined relevance and/or popularity of the digital resource with respect to the supplemental anchor-based search term.

The plurality of digital supplemental data structures may also be ordered based on a frequency of use or recency of use of a particular user (e.g., a user of the client computing device). In some embodiments, the plurality of digital supplemental data structures are randomly ordered.

At operation 414, the visual content query is responded to based on the plurality of digital supplemental data structure instances. For example, information associated with the plurality of digital supplemental data structure instances may be transmitted to the client computing device in the order determined at operation 412. In some implementations, the information includes descriptive data that may be shown in a menu or another type of user interface configured to receive user selections of digital supplements. The information may also include access data that may be used by the client computing device to access or present the digital supplement.

FIG. 5 is a diagram of an example method 500 of searching and presenting digital supplements in accordance with an embodiment described herein. The method 500 may be performed, for example, by the application 122 of the client computing device 102 to identify and access digital supplements based on visual content queries.

At operation 502, an image-based visual content query is transmitted to a server computing device (e.g., search server 152). For example, an image may be captured with the camera component 112 of the client computing device 102. The image may also be a stored image, such as an image previously captured by the camera component 112. In some implementations, the visual content query contains only images. In some implementations, the visual content query includes additional information. For example, the visual content query may include information such as a location of the client computing device 102 or an identifier of an account associated with a user of the client computing device 102. The application 122 may also identify anchors in the image (e.g., with a supplemental anchor identification engine 124). The visual content query may include an identifier (e.g., text, number, or other type of identifier) of the identified anchor. In at least some implementations, the visual content query does not include an image.

In some implementations, transmitting the visual content query to the server includes calling an API. In some implementations, transmitting the visual content query to the server includes calling an API provided by the server. In some implementations, transmitting the visual content query to the server includes submitting a form (e.g., submitting a GET or POST request) using the HTTP protocol.

At operation 504, a response to the visual content query identifying the digital supplement is received. The response may be received from search server 152 via network 190. The response may include one or more digital supplements identified by the search server 152 based on the visual content query. For example, the response may include an array of data associated with the digital complement. In some implementations, the data associated with the digital complement can include descriptive data that can be used to present digital complement options for selection by a user. For example, descriptive data may include names, short descriptions, publisher names, and images. The data may also include access data such as URLs and parameters or application names and related parameters included with the request via the URL. The data may also include the location, coordinates, or size of the supplemental anchor in the image transmitted with the visual content query (e.g., if the supplemental anchor is identified by the search server 152).

At operation 506, a user interface screen is displayed that includes information associated with the digital complement. In some implementations, the user interface screen includes annotations overlaying the identified supplemental anchors (e.g., based on the provided coordinates). The annotation may provide information about the object in the image associated with the identified supplemental anchor. The annotation may include a user-actuatable control that may be actuated to present or activate the digital complement. The user interface screen may also include a digital supplement selection panel that may be used to select from among a plurality of digital supplements identified in the response received at operation 504. In some implementations, the user interface screen can be generated by a web browser that opens a digital complement-specified URL. The user interface screen may also be generated by another application that is launched to provide digital supplements.

FIG. 6 is a diagram of an example method 600 of image-based recognition and presentation of digital supplements according to an embodiment described herein. The method 600 may be performed, for example, by the application 122 of the client computing device 102 to identify and access digital supplements based on visual content queries.

At operation 602, an image is captured. For example, the image may be captured by the camera component 112 of the client computing device 102. In some implementations, a sequence of images (i.e., video) can be captured by the camera component 112.

At operation 604, an image-based visual content query is transmitted to a server computing device, such as search server 152. Operation 604 may be similar to operation 502. In an embodiment in which a sequence of images is captured, the visual content query may include a plurality of images or a sequence of images. In some implementations, the sequence of images may be streamed to the server computing device.

At operation 606, a response to the visual content query identifying a plurality of digital supplements is received. Operation 606 may be similar to operation 504 previously described.

At operation 608, a user interface screen is displayed that includes a user actuatable control to select a digital supplement from a plurality of digital supplements. For example, a digital supplemental selection panel may be displayed. The digital supplement selection panel may include a plurality of user-actuatable controls, each associated with one of the plurality of digital supplements identified in the response. The digital replenishment selection may arrange the user-actuatable controls based on an ordering or ranking of digital replenishment provided by the server computing device. The digital supplemental selection panel may arrange the user-actuatable controls vertically, horizontally, or otherwise. The user-actuatable control may be associated with or include information about the associated digital complement that the user may consider when deciding whether to select the digital complement. For example, the displayed information may include one or more of a digitally supplemented name, description, image, and publisher name.

At operation 610, user input is received selecting a digital supplement. The user input may be a click using a mouse or other device. The user input may also be a touch input from a stylus or finger. Another example of a user input is a near touch input (e.g., holding a finger or pointing device near a touch screen). In some implementations, the user input may also include gestures, head movements, eye movements, or voice inputs.

At operation 612, information is provided to a resource associated with the selected digital complement. For example, information about a user of a client computing device may be transmitted to a server that provides digital supplements (if a license to provide the information has been provided). This information may also be provided to an application that provides digital supplements. Various types of information may be provided. For example, the information may include user information such as a user name, user preferences, or location.

The information may also include information related to visual content queries, such as images or image sequences. The information may also include identifiers and/or bitholders of one or more supplemental anchors in the image. This information may be used to provide digital supplements to the user. For example, the digitally-supplemented AR content may be resized and located based on the image.

This information may be transmitted by the client computing device 102 directly to a resource associated with the digital replenishment (e.g., the digital replenishment server 172). In some implementations, the information is provided to the resources associated with the digital supplements through search server 152 (e.g., so that the client computing device does not need to transmit as much data). In at least some of these implementations, the client computing device 102 can transmit selection information identifying the selected digital complement to the search server 152. After receiving the selection and verifying that the user has authorized information sharing, search server 152 may then transmit the information to a resource that provides digital replenishment. The client computing device 102 may also prompt the user to license the sharing of information. In some implementations, the search server 152 can determine information to transmit to the resource based on the digital supplemental data structure instance (which can be based on metadata associated with the digital supplemental).

At operation 614, the user interface is updated based on the selected digital complement. Operation 614 may be similar to operation 506.

Fig. 7A-7C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for visual content searching and displaying digital supplements. In fig. 7A, a user interface screen 700a is shown. The user interface screen 700a includes an image display panel 708 and an information panel 730. In this example, the image display panel 708 is displaying an image of a rack filled with wine bottles (e.g., you can find in a store). The image display panel 708 also includes an indicator 740 and an indicator 742. Each of these indicators indicates that the wine bottle shown in the image below the indicator has been identified as a supplemental anchor (e.g., an identified product in this case). Indicators 740 and 742 are examples of user-actuatable controls. Within the information panel 730, the instruction is provided as "Tap on what you' RE INTERESTED IN (Tap on what you are interested in)".

In fig. 7B, user interface screen 700B is shown after the user has actuated indicator 740. After actuation, an annotation 744 from the digital supplement is displayed. The annotation 744 includes information about the rating of the wine, which may assist the user in selecting a bottle of wine that he wants to purchase.

In fig. 7C, another user interface screen 700C is shown after the user has actuated the indicator 740. User interface screen 700c may be shown instead of or in addition to user interface screen 700B (e.g., after actuation of comment 744, or if the user swipes up on information panel 730 in fig. 7B). In fig. 7C, an extended information panel 732 is shown. The extended information panel 732 occupies more of the user interface screen 700c than the information panel 730 in fig. 7A and 7B.

The extended information panel 732 includes a digital supplemental selection panel 710 and a digital supplemental content display panel 734. The digital supplemental selection panel 710 includes a user actuatable control 712, a user actuatable control 714, and a user actuatable control 716 (only partially visible). In some implementations, additional user-actuatable controls can be displayed as the user swipes on the digital supplemental selection panel 710. The user-actuatable controls of the digital supplemental selection panel 710 may be arranged in a ranked order. User-actuatable control 712 is associated with a digital supplement for meal pairing. Upon actuation of user-actuatable control 712, a digital supplement may be displayed displaying food and meal pairing information for the selected wine. The user-actuatable control 714 is associated with a digital supplement that saves the photograph. Upon actuation, an application that saves the photograph may be activated and provided with the image. Other information may be saved with the photograph such as the identified supplemental anchor.

Digital supplemental content display panel 734 may display content from the digital supplemental. The digital supplemental content display panel 734 may display a default digital supplemental or highest ranked digital supplemental associated with the identified supplemental anchor. In this example, the digital supplemental content display panel 734 includes product information regarding the product associated with the selected supplemental anchor. In this case, wine names, ratings, origins, images and comments are provided.

8A-8C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for visual content searching and displaying digital supplements. In this example, the visual content search is based on the image of the receipt.

In fig. 8A, a user interface screen 800a is shown. The user interface screen 800a includes an image display panel 808 and an information panel 830. In this example, the image display panel 808 is displaying an image of a receipt from a restaurant. The image display panel 808 also includes an indicator 840, an indicator 842, an annotation 844, and a highlighting overlay 846. In this case, indicator 840 is associated with the receipt as a document and indicator 842 is associated with the particular restaurant named on the receipt. Both the identified receipt document and the identified restaurant name are examples of supplemental anchors.

The annotation 844 is associated with a digital supplement that provides a small fee calculator. In this example, an exemplary small fee calculation is included on the annotation 844 and is overlaid on the image display panel 808 at the appropriate location. In some implementations, the digital supplement may be selected by default and displayed when the appropriate supplemental anchor is identified. Highlighting overlay 846 overlays a portion of the receipt document that includes information that is digitally supplemented by the fractional fee calculator.

In this example, the items displayed in the information panel 830 are related to receipts that are documents, as if the indicator 840 had been actuated. In some implementations, the identified supplemental anchors are ranked based on a likely relevance or user interest based on, for example, past actions of the user, other user actions for similar images, confidence scores for the supplemental anchors, or the position or size of portions of the supplemental anchor-related image. In at least some implementations, the information panel 830 can then display items related to the highest ranked supplemental anchor. If instead the indicator 842 is actuated, the information panel 830 may include items regarding a particular restaurant.

Here, the information panel 830 includes a digital supplementary selection panel 810. The digital supplemental panel includes a user actuatable control 812, a user actuatable control 814, and a user actuatable control 816. In this example, user-actuatable control 812 is associated with the fractional fee calculator digital supplemental, user-actuatable control 814 is associated with the apportioned checkout digital supplemental, and user-actuatable control 816 is associated with the fee report digital supplemental. For example, upon actuation of user-actuatable control 812, a user interface control for adjusting parameters of the fractional fee calculator may be displayed (e.g., to adjust the percentages).

In fig. 8B, user interface screen 800B is shown after the user has actuated user-actuatable control 814. After actuation, an extended information panel 832 is shown that includes items that help the user calculate how to apportion the checkout. For example, the number of people apportioned to pay may be entered to determine the amount each person should pay.

In fig. 8C, user interface screen 800C is shown after the user has actuated user-actuatable control 816. After actuation, an extended information panel 834 is shown that includes an item that assists the user in storing a receipt into the expense report. For example, the user may select a cost report (e.g., "SYDNEY TRIP 2018 (sydney travel 2018)") that should be associated with the receipt. Once the expense report is selected, an image of the receipt may be uploaded to an expense report submission or management system. In some implementations, the complete image shown on the image display panel 808 is uploaded. In some implementations, a portion of the image is uploaded (e.g., the image is cropped to include only a receipt).

Fig. 9A and 9B are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for visual content searching and displaying digital supplements. In this example, the visual content search is based on facial images.

In fig. 9A, a user interface screen 900a is shown. The user interface screen 900a includes an image display panel 908 and an information panel 930. In this example, the image display panel 908 is displaying an image of a face. Here, the face is an example of a supplemental anchor. The information panel 930 includes user-activatable controls 912 for digital augmentation that are identified for augmentation anchors in an image (i.e., face). The user-actuatable control 912 is associated with a digital supplement for fitting eyeglasses.

In fig. 9B, user interface screen 900B is shown after the user has actuated user-actuatable control 912. After actuation, an extended information panel 932 is shown that includes an item that helps the user visually attempt to wear glasses on the face in the image. Here, a plurality of glasses styles are displayed, and the user can select a pair to try on. Upon selection of a pair of glasses, AR content 960 is overlaid on image display panel 908. Here, the AR content 960 corresponds to the selected glasses and is sized to match the face in the image. In some implementations, when a digital supplement for try-on glasses is selected, the image shown in the image display panel 908 is transmitted to a server that provides the digital supplement so that the image can be analyzed to determine the location of the AR content 960 and to resize or recommend the position and manner of try-on glasses.

10A-10C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for visual content searching and displaying digital supplements. In this example, the visual content search is based on images of furniture in the catalog.

In fig. 10A, a user interface screen 1000A is shown. The user interface screen 1000a includes an image display panel 1008. In this example, the image display panel 1008 is displaying an image of a portion of a page of a furniture catalog. The image display panel further includes an indicator 1040, an indicator 1042, and an indicator 1044. In this example, indicator 1040 is associated with a bed, indicator 1042 is associated with a decorative item, and indicator 1044 is associated with a carpet. Images of beds, decorative items, and carpeting in the catalog are examples of supplemental anchors.

In fig. 10B, user interface screen 1000B is shown after the user has selected indicator 1040 (e.g., by touching the screen at or near the display of indicator 1040). User interface screen 1000b includes a digital supplemental selection panel 1010 and an information panel 1030. Information panel 1030 includes information (e.g., product name, description, and image) about the supplemental anchor associated with the selected indicator.

The digital supplemental selection panel 1010 includes a user-actuatable control 1012 and a user-actuatable control 1014. User-actuatable controls 1012 are associated with digital supplements that provide views at home. The user-actuatable control 1014 is associated with another digital complement (e.g., a digital complement for posting to a social media site).

In fig. 10C, the user interface screen 1000C is shown after actuation of the user-actuatable control 1012. The user interface screen 1000c includes an image display panel 1008, a digital supplemental selection panel 1010, and a reduced information panel 1032. The condensed information panel 1032 may include user-actuatable controls that, when actuated, may pop-up and display the information panel.

Here, the image display panel 1008 now displays an image of the room and includes AR content 1060. The AR content 1060 includes a 3D model of the bed associated with the indicators 1040 overlaid on the image panel. The user may be able to adjust the location of the AR content 1060 within the room to see how the bed fits into the room. In some implementations, when a digital supplement for a view in the home is selected, the image shown in the image display panel 1008 is transmitted to a server that provides the digital supplement so that the image can be analyzed to determine where and how to locate the AR content 1060 and resize the AR content 1060. In some implementations, the AR content 1060 may be provided at a later time than the visual content query.

11A-11C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for conducting various visual content searches within a store. In this example, the visual content search is based on images of products captured within the store.

In fig. 11A, a user interface screen 1100a is shown. User interface screen 1100a includes an image display panel 1108 and an information panel 1130. In this example, image display panel 1108 is displaying images captured within a store. The image display panel 1108 also includes an indicator 1140 associated with the vase. The vase displayed on the image display panel 1108 is an example of a supplemental anchor. The information panel 1130 is displaying digital supplements including product information about the vase and the functionality to purchase the vase. The digital replenishment may, for example, include a workflow to initiate a vase purchase. In this example, the digital complement is identified based on the image content and the location of the client computing device such that the digital complement issued by (or associated with) the store in which the image was captured while the client computing device was in the store may be identified and provided to the visual content query as a result of the high ranking. In some implementations, if the location of the client computing device is changed, a different digital supplement will be provided for the same image.

In fig. 11B, a user interface screen 1100B is shown. User interface screen 1100b includes an image display panel 1108 and an information panel 1130. In this example, image display panel 1108 is displaying another image captured within the store. The image display panel 1108 also includes an indicator 1142 associated with the carpet. The carpet displayed on image display panel 1108 is an example of a supplemental anchor. The information panel 1130 is displaying digital supplements that include product information about the carpet as well as the functionality to select the size and purchase the carpet. As in fig. 11A, the digital supplement is identified based on the image content and the location of the client computing device.

In fig. 11C, a user interface screen 1100C is shown. User interface screen 1100c includes an image display panel 1108 and an information panel 1130. In this example, image display panel 1108 is displaying another image captured within the store. The image display panel 1108 also includes an indicator 1144 associated with the vase. The vase displayed on the image display panel 1108 is an example of an additional anchor. The information panel 1130 is displaying a digital complement including product information about the vase. Information panel 1130 also includes coupon indicators 1132 and functionality to redeem coupons. Redeeming the coupon may include purchasing the merchandise at a discounted price from a website associated with the store. In some implementations, a coupon code is presented that can be used to ensure discounts during checkout. 11A and 11B, digital supplements are identified based on the image content and the location of the client computing device.

Fig. 12A-12C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 during various visual content searches. In this example, the visual content search is based on images of movie posters (e.g., images that may be captured at a movie theater).

In fig. 12A, a user interface screen 1200a is shown. User interface screen 1200a includes image display panel 1208. In this example, image display panel 1208 is displaying an image of a movie poster. The image display panel 1208 also includes an indicator 1240 associated with the movie poster identified in the image. Movie posters are examples of supplemental anchors. The indicator 1240 may include a user-actuatable control that, when actuated, will display a digital supplement or menu to select the digital supplement.

In fig. 12B, a user interface screen 1200B is shown. Image display panel 1208 also includes a preview digital complement 1242 associated with the movie poster identified in the image. For example, preview number supplement 1242 may be shown following actuation of indicator 1240 (of fig. 12A). The preview digital complement 1242 can overlay the image or video from the movie associated with the identified movie poster over the image of the movie poster.

In fig. 12C, a user interface screen 1200C is shown. Image display panel 1208 also includes a rating indicator 1244 and a rating indicator 1246. The rating indicators 1244 and 1246 may be generated by one or more digital supplements in response to visual content queries including movie posters. The digital complement may, for example, overlay rating information for movies associated with movie posters in the image. The rating indicators 1244 and 1246 may include user-actuatable controls that, when actuated, cause additional information about the rating and associated movies to be displayed.

Fig. 13 illustrates an example of a computer device 1300 and a mobile computer device 1350 that can be used with the techniques described here (to implement the client computing device 102, the search server 152, and the digital replenishment server 172). Computing device 1300 includes a processor 1302, memory 1304, storage device 1306, a high-speed interface 1308 connected to memory 1304 and high-speed expansion ports 1310, and a low-speed interface 1312 connected to low-speed bus 1314 and storage device 1306. Each of the components 1302, 1304, 1306, 1308, 1310, and 1312, are interconnected using various buses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 1302 can process instructions executing within the computing device 1300, including instructions stored in the memory 1304 or on the storage device 1306 to display graphical information for a GUI on an external input/output device, such as display 1316 coupled to high speed interface 1308. In other embodiments, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. In addition, multiple computing devices 1300 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a set of blade servers, or a multiprocessor system).

Memory 1304 stores information within computing device 1300. In one implementation, the memory 1304 is one or more volatile memory units. In another implementation, the memory 1304 is one or more nonvolatile memory cells. The memory 1304 may also be another form of computer-readable medium, such as a magnetic or optical disk.

The storage device 1306 is capable of providing mass storage for the computing device 1300. In one implementation, the storage device 1306 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory, or other similar array of solid state memory devices or devices, including devices in a storage area network or other configurations. The computer program product may be tangibly embodied in an information carrier. The computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-or machine-readable medium, such as the memory 1304, the storage device 1306, or memory on processor 1302.

The high speed controller 1308 manages bandwidth-intensive operations for the computing device 1300, while the low speed controller 1312 manages lower bandwidth-intensive operations. This allocation of functions is merely exemplary. In one embodiment, the high speed controller 1308 is coupled to the memory 1304, the display 1316 (e.g., by a graphics processor or accelerator), and to a high speed expansion port 1310, which high speed expansion port 1310 may accept various expansion cards (not shown). In this embodiment, low-speed controller 1312 is coupled to storage device 1306 and low-speed expansion port 1314. The low-speed expansion port, which may include various communication ports (e.g., USB, bluetooth, ethernet, wireless ethernet), is coupled to one or more input/output devices, such as a keyboard, pointing device, scanner, or networking device, such as a switch or router, for example, through a network adapter.

As shown in the figures, computing device 1300 may be implemented in many different forms. For example, it may be implemented as a standard server 1320, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 1324. Furthermore, it may be implemented in a personal computer such as a laptop computer 1322. Alternatively, components from computing device 1300 may be combined with other components in a mobile device (not shown), such as device 1350. Each of such devices may contain one or more of computing device 1300, 1350, and the entire system may be made up of multiple computing devices 1300, 1350 communicating with each other.

The computing device 1350 includes a processor 1352, memory 1364, input/output devices such as a display 1354, communication interfaces 1366, and transceivers 1368, as well as other components. The device 1350 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage. Each of the components 1350, 1352, 1364, 1354, 1366, and 1368 are interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.

Processor 1352 may execute instructions within computing device 1350, including instructions stored in memory 1364. The processor may be implemented as a chipset of chips that include separate and multiple analog and digital processors. For example, the processor may provide for coordination of the other components of the device 1350, such as control of user interfaces, applications run by the device 1350, and wireless communication by the device 1350.

The processor 1352 may communicate with a user through a control interface 1358 and a display interface 1356 coupled to the display 1354. The display 1354 may be, for example, a TFT LCD (thin film transistor liquid crystal display) and LED (light emitting diode) or OLED (organic light emitting diode) display or other suitable display technology. The display interface 1356 may include appropriate circuitry for driving the display 1354 to present graphical and other information to a user. The control interface 1358 may receive commands from a user and convert them for submission to the processor 1352. Further, an external interface 1362 may be provided in communication with the processor 1352 to enable near area communication of the device 1350 with other devices. External interface 1362 may be provided for wired communication, for example, in some embodiments, or for wireless communication in other embodiments, and multiple interfaces may also be used.

Memory 1364 stores information within computing device 1350. The memory 1364 may be implemented as one or more of one or more computer-readable media, one or more volatile memory units, or one or more non-volatile memory units. Expansion memory 1374 may also be provided by and connected to device 1350 by expansion interface 1372, which expansion interface 1372 may include, for example, a SIMM (Single in line memory Module) card interface. Such expansion memory 1374 may provide additional storage space for device 1350 or may also store applications or other information for device 1350. Specifically, expansion memory 1374 may include instructions for performing or supplementing the processes described above, and may also include security information. Thus, for example, expansion memory 1374 may be provided as a secure module for device 1350 and may be programmed with instructions that allow secure use of device 1350. Further, secure applications may be provided via the SIMM card, as well as additional information, such as placing identifying information on the SIMM card in an indestructible manner.

As discussed below, the memory may include, for example, flash memory and/or NVRAM memory. In one implementation, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-or machine-readable medium, such as the memory 1364, the expansion memory 1374, or memory on the processor 1352, that may be received, for example, over the transceiver 1368 or the external interface 1362.

The device 1350 may communicate wirelessly through a communication interface 1366, which communication interface 1366 may include digital signal processing circuitry if desired. The communication interface 1366 may provide for communication in various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA, or GPRS, among others. Such communication may occur, for example, through radio frequency transceiver 1368. Further, short-range communications may occur, such as using bluetooth, wi-Fi, or other such transceivers (not shown). Further, a GPS (global positioning system) receiver module 1370 may provide additional navigation-and location-related wireless data to the device 1350, which may be used as appropriate by applications running on the device 1350.

The device 1350 may also communicate audibly using an audio codec 1360, which audio codec 1360 may receive spoken information from a user and convert it to usable digital information. The audio codec 1360 may likewise generate audible sound for a user, such as through a speaker, e.g., in a headset of the device 1350. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.) and may also include sound generated by applications operating on device 1350.

As shown in the figures, the computing device 1350 may be implemented in many different forms. For example, it may be implemented as a cellular telephone 1380. It may also be implemented as part of a smart phone 1382, personal digital assistant, or other similar mobile device.

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include embodiments in one or more computer programs that may be executed and/or construed on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software applications or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium," "computer-readable medium" refers to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (LED (light emitting diode), or OLED (organic LED), or LCD (liquid crystal display) monitor/screen) to display information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), and the Internet.

The computing system may include clients and servers. The client and server are generally remote from each other and typically interact through a communication network. The relationship between client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

In some implementations, the computing device depicted in fig. 13 may include a sensor that interfaces with the AR headset/HMD device 1390 to generate an enhanced environment for viewing inserted content within the physical space. For example, including one or more sensors on computing device 1350 depicted in fig. 13 or other computing devices, input may be provided to AR headset 1390, or in general, input may be provided to the AR space. The sensors may include, but are not limited to, touch screens, accelerometers, gyroscopes, pressure sensors, biometric sensors, temperature sensors, humidity sensors, and ambient light sensors. The computing device 1350 may use the sensors to determine the absolute position of the computing device in the AR space and/or the detected rotation, which may then be used as input to the AR space. For example, computing device 1350 may be incorporated into the AR space as a virtual object such as a controller, laser pointer, keyboard, weapon, etc. When incorporated into AR space, the positioning of the computing device/virtual object by the user may allow the user to position the computing device to view the virtual object in AR space in some manner. For example, if the virtual object represents a laser pointer, the user may manipulate the computing device as if it were an actual laser pointer. The user may move the computing device in a side-to-side, up-and-down, circle, etc., manner and use the device in a manner similar to using a laser pointer. In some implementations, the user may aim at the target location using a virtual laser pointer.

In some implementations, one or more input devices included on or connected to computing device 1350 may be used as inputs to the AR space. The input device may include, but is not limited to, a touch screen, keyboard, one or more buttons, a touch pad, a pointing device, a mouse, a trackball, a joystick, a camera, a microphone, headphones, or an ear bud with input capabilities, a game controller, or other connectable input device. When the computing device is incorporated into the AR space, a user interacting with an input device included on computing device 1350 may cause certain actions to occur in the AR space.

In some implementations, the touch screen of the computing device 1350 may be rendered as a touch pad in AR space. A user may interact with a touch screen of the computing device 1350. For example, in AR headset 1390, interactions are rendered as movements on a touch pad rendered in AR space. The rendered movement may control virtual objects in the AR space.

In some implementations, one or more output devices included on the computing device 1350 can provide output and/or feedback to a user of the AR headset 1390 in AR space. The output and feedback may be visual, tactile or audible. The output and/or feedback may include, but is not limited to, vibration, turning on, off, or flashing and/or flashing one or more lights or strobes, alerting, playing a ring tone, playing a song, and playing an audio file. Output devices may include, but are not limited to, vibration motors, vibration coils, piezoelectric devices, electrostatic devices, light Emitting Diodes (LEDs), strobe lights, and speakers.

In some implementations, the computing device 1350 may appear as another object in a computer-generated 3D environment. User interaction with the computing device 1350 (e.g., rotating, panning, touching the touch screen, sliding a finger over the touch screen) may be interpreted as interaction with an object in AR space. In an example of a laser pointer in AR space, computing device 1350 appears as a virtual laser pointer in a computer-generated 3D environment. As the user manipulates computing device 1350, the user in AR space sees the movement of the laser pointer. The user receives feedback from interactions with the computing device 1350 in an AR environment on the computing device 1350 or AR headset 1390. User interactions with the computing device may be transformed into interactions with a user interface generated for the controllable device in the AR environment.

In some implementations, the computing device 1350 may include a touch screen. For example, a user may interact with a touch screen to interact with a user interface of a controllable device. For example, the touch screen may include user interface elements, such as sliders that may control the nature of the controllable device.

Computing device 1300 is intended to represent various forms of digital computers and devices including, but not limited to, laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Computing device 1350 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.

A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the specification.

Furthermore, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. Further, other steps or steps may be provided from the described flows, and other components may be added to or removed from the described systems. Accordingly, other embodiments are within the scope of the following claims.

While certain features of the described embodiments have been illustrated as described herein, many modifications, substitutions, changes, and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the scope of the embodiments. It is to be understood that they have been presented by way of example only, and not limitation, and various changes in form and details may be made. Any portion of the devices and/or methods described herein may be combined in any combination, except combinations that are mutually exclusive. The embodiments described herein may include various combinations and/or sub-combinations of the functions, components, and/or features of the different embodiments described.

Claims

1. A computer-implemented method comprising:

Receiving image data by a search server;

identifying at least one entity within the image data;

storing in a database of the search server an instance of a data structure including the at least one entity identified within the image data;

receiving a visual content query from a client computing device;

detecting the at least one entity within the visual content query;

detecting contextual information associated with the at least one entity identified within the visual content query;

searching the database for the at least one entity detected within the visual content query;

matching, by the search server, the at least one entity detected within the visual content query with the data structure instance comprising the at least one entity;

identifying at least one digital supplement associated with the at least one entity based on the contextual information; and

In response to the visual content query and the match, supplemental information including the at least one digital supplement associated with the at least one entity is transmitted by the search server to the client computing device.

2. The computer-implemented method of claim 1 , wherein transmitting the supplemental information associated with the at least one entity comprises transmitting at least one of:

identification information associated with the at least one entity;

location information associated with the at least one entity;

one or more applications associated with the at least one entity; or

One or more network-accessible resources associated with the at least one entity.

3. The computer-implemented method of claim 2, wherein the supplemental information includes a name, a description, an image, and a uniform resource locator.

4 . The computer-implemented method of claim 1 , wherein storing the data structure instance comprises storing the data structure instance comprising a digital supplement associated with the at least one entity identified within the image data.

5 . The computer-implemented method of claim 4 , wherein transmitting the supplemental information associated with the at least one entity to the client computing device comprises transmitting information associated with a plurality of network-accessible resources to the client computing device.

6. A computer-implemented method according to claim 4, wherein storing the data structure instance includes storing the data structure instance in the database including multiple other data structure instances, each of the multiple other data structure instances including at least one previously identified entity and at least one corresponding digital supplement associated with the corresponding at least one previously identified entity.

7. The computer-implemented method of claim 6, wherein transmitting the supplemental information comprises:

identifying a plurality of digital supplements including the at least one digital supplement associated with the at least one entity identified in the visual content query;

determining a relevance score for each digital supplement in the plurality of digital supplements; and

An ordered list of digital supplements is transmitted to the client computing device based on the relevance score of each digital supplement in the plurality of digital supplements.

8. The computer-implemented method of claim 7, wherein determining the relevance score comprises determining the relevance score for each of the plurality of digital supplements associated with the corresponding plurality of data structure instances based on the context information.

9. A computer-implemented method according to claim 8, wherein each of the multiple data structure instances specifies contextual information, and wherein the matching includes matching the contextual information associated with at least one entity identified within the visual content query with the contextual information included in the multiple data structure instances.

10. The computer-implemented method of claim 6, wherein transmitting the digital supplement comprises:

A list of digital supplements is transmitted, the list including the digital supplement from the data structure instance associated with the at least one entity and a digital supplement from another data structure instance of the plurality of data structure instances.

11. A non-transitory computer-readable medium comprising instructions that, when executed by a processor of a computing system, cause the computing system to:

Storing a plurality of data structure instances in a database of the computing system includes:

receiving image data;

identifying at least one entity within the image data; and

storing a data structure instance in the database, the data structure instance including the at least one entity identified within the image data and supplemental information associated with the at least one entity, the database including a plurality of data structure instances;

receiving a visual content query from a client computing device;

detecting at least one entity within the visual content query;

matching the at least one entity detected within the visual content query to at least one entity included in one or more data structure instances among the plurality of data structure instances;

In response to the visual content query and the matching with the one or more of the plurality of data structure instances, the supplemental information including the at least one digital supplement associated with the at least one entity detected within the visual content query is transmitted.

12. The non-transitory computer-readable medium of claim 11, wherein the instructions cause the computing system to transmit the supplemental information to the client computing device to include at least one of:

identification information associated with the at least one entity;

location information associated with the at least one entity;

application information associated with the at least one entity; or

A network-accessible resource associated with the at least one entity detected within the visual content query.

13. The non-transitory computer-readable medium of claim 11, wherein the instructions cause the computing system to transmit the supplemental information to the client computing device to include a name, description, image, and uniform resource locator associated with the at least one entity detected within the visual content query.

14. A non-transitory computer-readable medium according to claim 11, wherein each data structure instance includes supplemental information, the supplemental information including a corresponding numerical supplement associated with the at least one entity identified within the image data, and wherein the instructions cause the computing system to transmit the corresponding numerical supplement included in the data structure instance that matches the at least one entity identified within the visual content query to the client computing device.

15. The non-transitory computer readable medium of claim 14, wherein the instructions cause the computing system to:

identifying a plurality of digital supplements including the at least one digital supplement associated with the at least one entity identified within the visual content query;

The ordered list of digital supplements is transmitted to the client computing device.

16. The non-transitory computer-readable medium of claim 15, wherein the instructions cause the computing system to determine the relevance score for each of the plurality of digital supplements based on the contextual information.

17. A non-transitory computer-readable medium according to claim 16, wherein each digital structure instance in the plurality of data structure instances specifies contextual information, and wherein the instructions cause the computing system to match the contextual information associated with the at least one entity identified within the visual content query with the contextual information included in the plurality of data structure instances.

18. A non-transitory computer-readable medium according to claim 16, wherein the instructions cause the computing system to transmit a list of digital supplements, the list including the digital supplement from the data structure instance associated with the at least one entity and a digital supplement from another data structure instance among the multiple data structure instances.

19. A computer-implemented method comprising:

receiving, by the server, a visual content query from a client device, the visual content query comprising image data;

detecting an entity within the image data;

identifying a plurality of digital supplements associated with the entity detected within the image data;

assigning a relevance score to each of the plurality of digital supplements based on the entity detected within the image data and at least one of a location associated with the client device, user profile data, or specifications associated with the client device;

ranking the plurality of digital supplements based on the relevance score assigned to each of the plurality of digital supplements;

generating a numerically supplemented ordered list based on the ranking; and

In response to the visual content query, the ordered list of digital supplements is transmitted to the client device.

20. The computer-implemented method of claim 19, wherein at least one of the plurality of digital supplements comprises a three-dimensional model corresponding to the entity detected within the image data.

21. The computer-implemented method of claim 20, wherein assigning the relevance score to each of the plurality of digital supplements comprises assigning a higher relevance score to the at least one of the plurality of digital supplements that includes the three-dimensional model.

22. A computer-implemented method according to claim 21, wherein assigning the higher relevance score includes assigning the higher relevance score to the at least one digital supplement of the plurality of digital supplements that includes the three-dimensional model based on the specification associated with the client device.

23. The computer-implemented method of claim 19, wherein detecting an entity comprises:

detecting a plurality of entities within the image data;

identifying a digital supplement associated with each of the plurality of entities; and

A target entity is identified from the plurality of entities based on a cumulative relevance score of the plurality of numeric supplements associated with each entity of the plurality of entities.

24. The computer-implemented method of claim 23, wherein

Ranking the plurality of digital supplements includes ranking the plurality of digital supplements associated with the target entity based on the relevance score assigned to each of the plurality of digital supplements associated with the target entity; and

Generating the ordered list of digital supplements includes generating the ordered list of digital supplements based on the ranking of the plurality of digital supplements associated with the target entity.

25. A computer-implemented method comprising:

executing the application by a processor of the computing device;

capturing image data by an image sensor of the computing device;

transmitting a query including the image data to a search server;

receiving, by the application from the search server, a response to the query, the response identifying a digital supplement, the digital supplement comprising information related to a document identified in the image data based on at least one indicator detected in the image data; and

A user interface is output by the application, the user interface including the information included in the digital supplement associated with the image data.

26. The computer-implemented method of claim 25, wherein outputting the user interface comprises at least one of:

outputting a first digital complement of a first user control, and in response to detecting selection of the first user control, triggering output of an annotation displayed as an overlay on the document captured in the image data, the annotation representing a first numerical calculation performed on a digital element detected from the image data;

outputting a second digital complement including a second user control, and in response to detecting selection of the second user control, triggering outputting at least one request to at least one other user based on a second digital calculation performed on the digital element detected from the image data;

outputting a third digital complement including a third user control, and in response to detecting selection of the third user control, triggering saving of the document to a specified location; or

A fourth digital complement including a fourth user control is output, and in response to detecting selection of the fourth user control, submission of the at least one document to the designated recipient is triggered.