US20150331908A1 - Visual interactive search - Google Patents
Visual interactive search Download PDFInfo
- Publication number
- US20150331908A1 US20150331908A1 US14/494,364 US201414494364A US2015331908A1 US 20150331908 A1 US20150331908 A1 US 20150331908A1 US 201414494364 A US201414494364 A US 201414494364A US 2015331908 A1 US2015331908 A1 US 2015331908A1
- Authority
- US
- United States
- Prior art keywords
- documents
- candidate
- space
- user
- identifying
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
Images
Classifications
-
- G06F17/30477—
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3325—Reformulation based on results of preceding query
- G06F16/3326—Reformulation based on results of preceding query using relevance feedback from the user, e.g. relevance feedback on documents, documents sets, document terms or passages
- G06F16/3328—Reformulation based on results of preceding query using relevance feedback from the user, e.g. relevance feedback on documents, documents sets, document terms or passages using graphical result space presentation or visualisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/904—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/957—Browsing optimisation, e.g. caching or content distillation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
Definitions
- the invention relates generally to a tool for searching for digital documents in an interactive and visual way.
- digital documents include: photographs, product descriptions, or webpages.
- this tool may be used on a mobile device to search for furniture available for sale via an online retailer.
- the queries may be in the form of a structured query language, natural language text, speech, or a reference image.
- the results returned often do not satisfy the user's search goal. The user then proceeds to refine or modify the query in an attempt to better achieve desired goals.
- the system described herein addresses this challenge in at least two ways. First, it presents a set of search results that are diverse along multiple axes. Second, it allows the user to interactively “navigate” within those results by selecting results that incrementally address desired search goals and using that user input to iteratively refine the set of results presented to the user.
- a user performs an initial search using a query, which maybe textual, verbal, or visual, e.g., via a prototype image.
- the system returns a set of results in response to the query.
- the result set is selected to be diverse, that is, each of the results differs from the others along different axes relevant to the search query.
- the results are presented in a 2 dimensional arrangement (such as a grid).
- One result, the primary result may be displayed more prominently.
- the other results are displayed in a layout relative to the primary result such that results that are most similar to each other are laid out near to each other and results that are most dissimilar from each other are laid out far from each other.
- the user may select one or more results to indicate a request to refine the search to identify items more similar to those results.
- the system then returns and displays a set of results more similar to the selected results and less similar to the non-selected results. In this way the user can interactively refine a search specification in order to more precisely encode intended search goals.
- FIG. 1 illustrates an example environment in which aspects of the invention may be implemented.
- FIG. 2 is a simplified block diagram of a computer system that can be used to implement aspects of the present invention.
- FIGS. 3-7 and 10 A illustrate documents embedded in 2-dimensional embedding space.
- FIG. 8 is a block diagram of various components of an embodiment of a system according to the invention.
- FIG. 9 is a flow chart illustrating various logic phases through which a system according to the invention may proceed.
- FIG. 10B illustrates certain documents from FIG. 10A in layout space.
- a system can have several aspects, and different embodiments need not implement all of them: 1) a module for creating an initial query, 2) a module for obtaining a set of candidate results satisfying the initial query, 3) a module for determining the distance or similarity between candidate results or a module for embedding the candidate results in a vector space, 4) a module for sub-selecting a discriminating set of candidate results, 5) a module for arranging candidate results in 2 dimensions, 6) a module for obtaining user input with regard to the candidate results, 7) a module for refining the search query to incorporate information regarding the user input encoded as geometric or distance constraints with respect to the embedding or distance measures of 3, 8) a module for iteratively obtaining a set of candidate results satisfying the initial query and the geometric or distance constraints accumulated from user input.
- FIG. 8 is a block diagram of various components of an embodiment of a system according to the invention. It includes an embedding module 820 which calculates an embedding of the source documents into an embedding space, and writes the embedding information, in association with identification of the documents, into a document collection database 816 .
- a user interaction module 822 receives queries and query refinement input from a user, and provides them to a query processing module 824 .
- the user interaction module 822 includes a computer terminal, whereas in another embodiment it includes only certain network connection components through which the system communicates with an external computer terminal.
- the query processing module 824 interprets the queries as geometric constraints on the embedding space, and narrows the collection to develop a set of candidate documents which satisfy the geometric constraints.
- Candidate spaces as used herein are also embedding spaces, and for example may constitute a portion of the embedding space of collection 816 .
- query processing module may also perform a re-embedding of the candidate documents in embedding space.
- a discriminative selection module 828 selects a discriminative set of the documents from the candidate space 826 and presents them to the user via user interaction module 822 .
- User interaction module 822 may then receive further refinement queries from the user, which are handled as above, or it may receive a user commit indication, in which case the system takes some action 830 with respect to the user's selected document such as opening the document for the user or engaging in further search refinement.
- the user refinement input may not require a further geometric constraint on the candidate space, but rather may involve only selection of a different discriminative set of documents from the existing candidate space 826 for presentation to the user.
- the candidate space database may not be implemented as a separate database, but rather may be combined in various ways with the document collection embedding space 816 .
- Candidate space may also be implied rather than physical in some embodiments.
- FIG. 9 is a flow chart illustrating various logic phases through which a system according to the invention may proceed.
- a collection of digital documents which, as used herein includes images, text, web-pages, catalog entries, and sections of documents
- an initial query is optionally processed to yield an initial subset of digital documents satisfying the query results.
- the term “subset” refers only to a “proper” subset.
- This may be a conventional text query, for example.
- the initial query is processed to yield a set of geometric constraints (possibly empty).
- step 916 the geometric constraints are applied to the space of embedded digital documents or to the initial subset of digital documents to identify a set of candidate results.
- step 918 a discriminative subset of documents is selected from the candidate results, and in step 920 the discriminative subset of documents is presented toward the user.
- step 922 the user provides further input, for example by selecting or deselecting candidate results, by indicating a direction on the display with respect to one or more candidate results, or by providing a ranked order of the candidate results; or if satisfied with one of the candidate results, the user indicates to commit to that result. If the user input indicates further refinement, then the logic returns to step 914 to process the refinement query to yield further geometric constraints step 924 ). If not, then in step 926 the system takes action with respect to the user-selected document.
- step 910 occurs continuously in the background, separately from the remainder of the steps, and updates the document collection in embedding space asynchronously with the remainder of the steps.
- the logic of FIG. 9 can be implemented using processors programmed using computer programs stored in memory accessible to the computer systems and executable by the processors, by dedicated logic hardware, including field programmable integrated circuits, or by combinations of dedicated logic hardware and computer programs.
- Each block in the flowchart or phase in a logic sequence describes logic that can be implemented in hardware or in software running on one or more computing processes executing on one or more computer systems.
- each step of the flow chart or phase in a logic sequence illustrates or describes the function of a separate module of software.
- the logic of the step is performed by software code routines which are distributed throughout more than one module.
- the “embedding space”, into which digital documents are embedded by embedding module 820 and in step 910 , as used herein is a geometric space within which documents are represented.
- the embedding space is a vector space, in which the features of a document define its “position” in the vector space relative to an origin. The position is typically represented as a vector from the origin to the document's position, and the space has a number of dimensions based on the number of coordinates in the vector. Vector spaces deal with vectors and the operations that may be performed on those vectors.
- the embedding space is a metric space, which does not have a concept of position, dimensions or an origin. Distances among documents in a metric space are maintained relative to each other, rather than relative to any particular origin.
- Metric spaces deal with objects combined with a distance between those objects and the operations that may be performed on those objects. For purposes of the current discussion these objects are significant in that there exist many efficient algorithms that operate on vector spaces and metric spaces. For example metric trees may be used to rapidly identify objects that are “close” to each other.
- metric trees may be used to rapidly identify objects that are “close” to each other.
- we embed objects into vector spaces and/or metric spaces In the context of a vector space this means that we define a function that maps objects to vectors in some vector space.
- a metric space it means that we define a metric (or distance) between those objects that allows us to treat the set of all such objects as a metric space.
- vector spaces allow the use of a variety of standard measures of distance (divergence) including the Euclidean distance. Other embodiments can use other types of embedding spaces.
- the goal of embedding digital documents in a vector space is to place intuitively similar documents close to each other.
- a common way of embedding text documents is to use a bag-of-words model.
- the bag of words model maintains a dictionary.
- Each word in the dictionary is given an integer index, for example, the word aardvark may be given the index 1, and the word zebra may be given the index 60,000.
- Each document is processed by counting the number of occurrences of each dictionary word in that document.
- a vector is created where the value at the i th index is the count for the i th dictionary word. Variants of this representation normalize the counts in various ways.
- Such an embedding captures information about the content and therefor the meaning of the documents. Text documents with similar word distributions are close to each other in this embedded space.
- documents may be embedded into a vector space.
- images may be processed to identify commonly occurring features using, e.g., scale invariant feature transforms (SIFT), which are then binned and used in a representation similar to the bag-of-words embedding described above.
- SIFT scale invariant feature transforms
- embeddings created using deep neural networks, or other deep learning techniques.
- a neural network can learn an appropriate embedding by performing gradient descent against a measure of dimensionality reduction on a large set of training data.
- a kernel based on data and derive a distance based on that kernel.
- These approaches generally use large neural networks to map documents, words, or images to high dimensional vectors.
- an embedding into a vector space may also be defined implicitly via a kernel. In this case the explicit vectors may never be generated or used, rather the operations in the vectors space are carried out by performing kernel operations in the original space.
- “Distance” between two documents in embedding space “corresponds to” a predetermined measurement of dissimilarity among documents. Preferably it is a monotonic function of the measurement of dissimilarity. Typically it equals the measurement of dissimilarity.
- Example distances include the Manhattan distance, the Euclidean distance, and the Hamming distance.
- Such distances may be defined in a variety of ways.
- One typical way is via embeddings into a vector space.
- Other ways include encoding the similarity via a Kernel.
- Kernel By associating a set of documents with a distance we are effectively embedding those documents into a metric space. Documents that are intuitively similar will be close in this metric space while those that are intuitively dissimilar will be far apart.
- kernels and distance functions may be learned. In fact, it may be useful to learn new distance functions on subsets of the documents at each iteration of the search procedure.
- databases 816 and 826 may use commonly available means to store the data in, e.g., a relational database, a document store, a key value store, or other related technologies.
- a relational database e.g., a relational database
- a document store e.g., a document store
- a key value store e.g., a document store
- the original document contents e.g., pointers to them
- indexing structures are critical.
- documents are embedded in a vector space indexes may be built using, e.g., kd-trees.
- documents are associated with a distance metric and hence embedded in a metric space metric trees may be used.
- databases described herein are stored on one or more non-transitory computer readable media. As used herein, no distinction is intended between whether a database is disposed “on” or “in” a computer readable medium. Additionally, as used herein, the term “database” does not necessarily imply any unity of structure. For example, two or more separate databases, when considered together, still constitute a “database” as that term is used herein.
- the initial query presented to the system in step 912 may be created and evaluated using a variety of standard techniques.
- the query may be presented as a set of keywords entered via a keyboard or via speech; the query may be a natural language phrase, or sentence entered via a keyboard or via speech; or the query may be an audio signal, an image, a video, or a piece of text representing a prototype for which similar audio signals, images, videos, or text may be sought.
- a variety of means are known by which such an initial query may be efficiently evaluated, e.g., searching a relational database, or using an inverted index.
- the initial query may also be designed to simply return a random set of results.
- Faceted search provides a means for users to constrain a search along a set of axes.
- the faceted search might provide a slider that allows users to constrain the range of acceptable prices.
- the search constraints created from an initial query and subsequent user input are used to identify a set of candidate results.
- This may be achieved using a variety of means.
- the initial query may be performed against a relational database whereby the results are then embedded in a vector or metric space. These results may then be indexed using, e.g., a kd-tree or a metric tree and searched to identify candidates that satisfy both the initial query and the constraints.
- the initial query may also be converted to geometric constraints that are applied to the set of embedded documents. In this case the geometric representation of the constraints implied both by the initial query and the user input are combined and an appropriate index is used to identify embedded documents satisfying both sets of constraints.
- a “geometric constraint” on an embedding space is a constraint that is described formulaically in the embedding space, rather than only by cataloguing individual documents or document features to include or exclude.
- the constraint can be described in the form of a specified function which defines a hypersurface. Documents on one side of the hypersurface satisfy the constraint whereas documents on the other side do not.
- a hyperplane may be defined in terms of dot products or kernels and requires that k(x,z)>0 for a fixed vector x and a candidate z. Likewise a conic constraint may require that k(x,z)>c for some constant c.
- a geometric constraint In a metric embedding space, the constraint can be described in the form of a function of, for example, distances between documents.
- a geometric constraint might take the form of ‘all documents within a specified distance from document X’, for example, or ‘all documents whose distance to document A is less than its distance to document B’.
- Geometric constraints also may be combined using set operations, e.g., union, intersection to define more complex geometric constraints. They also may be created by taking transformations of any of the example constraints discussed.
- constraints may be “hard” or “soft”. Hard constraints are those which must be satisfied in the sense that solutions must satisfy the conditions of all hard constraints. Soft constraints are those which need not be satisfied but candidate solutions may be penalized for each soft constraint that they don't satisfy. Solutions may be rejected in a particular embodiment if the accumulation of such penalties is too large. Constraints may be relaxed in some embodiments, for example hard constraints may be converted to soft constraints by associating them with a penalty, and soft constraints may have their penalties reduced.
- Search queries may be ambiguous, or underspecified and so the documents satisfying a query may be quite diverse. For example, if the initial query is for a “red dress” the results may be quite varied in terms of their length, neckline, sleeves, etc.
- This aspect of the module sub-selects a discriminating set of results. Intuitively the objective is to provide a set of results to the user such that selection or de-selection of those results provides the most informative feedback or constraints to the search algorithm.
- One may think of this step as identifying an “informative” set of results, or a “diverse” set of results, or a “discriminating” set of results.
- Discriminative selection module 828 performing step 918 , selects a discriminative subset of results in any of a variety of ways.
- a subset of the results may be discriminative as it provides a diversity of different kinds of feedback that the user can select.
- Diverse images may be selected as in, e.g., van Leuken, et. al., “Visual Diversification of Image Search Results”, in WWW '09 Proceedings of the 18th international conference on World wide web, pp. 341-350 (2009), incorporated by reference herein.
- This diverse set is selected in order to provide the user with a variety of ways in which to refine the query at the next iteration. There are a variety of ways in which such a set may be identified. For example, farthest first traversal may be performed which incrementally identifies the “most” diverse set of results. Farthest first traversal requires only a distance measure and does not require an embedding. Farthest first traversal may also be initialized with a set of results; subsequent results are then the most different from that initial set.
- discriminative subsets of candidate results include using an algorithm like PCA (principal component analysis) or kernel PCA to identify the key axes of variation in the complete set of results.
- PCA principal component analysis
- kernel PCA kernel PCA
- Another means might consider the set of constraints that would results from the user selecting or deselecting a given document. This set of constraints may be considered in terms of the candidate results it would yield.
- a discriminative subset may be selected so that the sets of candidate results produced by selecting any of the documents in that discriminative subset are as different as possible.
- discriminativeness of a particular set of documents in a collection of documents is the least number of documents in the collection that are excluded as a result of user selection of any document in the set. That is, if user selection of different documents in the particular set results in excluding different numbers of documents in the collection, then the set's “discriminativeness” is considered herein to be the least of those numbers. Note that either the discriminative set of documents, or the formula by which user selection of a document determines which documents are to be excluded, or both, should be chosen such that the union of the set of documents excluded by selecting any of the documents in a discriminative set equals the entire collection of documents.
- the “average discriminativeness” of a set of size n documents in a collection of documents is the average, over all sets of size n documents in the collection of documents, of the discriminativeness of that set. Also as used herein, one particular set of documents can be “more discriminative” than another set of documents if the discriminativeness of the first set is greater than the discriminativeness of the second set.
- the selection module 828 performing step 918 , selects a set of N1>1 documents from the current candidate space 826 , which is more discriminative than the average discriminativeness of sets of size N1 documents in the candidate space. Even more preferably, selection module 828 performing step 918 selects a set which is at least as discriminative as all other sets of size N1 documents in the current candidate space.
- the aim of the discriminative results presentation to the user in step 920 by user interaction module 822 is to provide the user with a framework in which to refine the query constraints.
- results may be presented as a two dimensional grid.
- Results should be placed on that grid in a way that allows the user to appreciate the underlying distances between those results (as defined using a distance measure or embedding).
- One way to do this would be to ensure that results that are far from each other with respect to the distance measure are also displayed far from each other on the grid.
- Another way would be to project the embedding space onto two dimensions for example using multidimensional scaling (MDS) (for example see: Jing Yang, et. al., Semantic image browser: Bridging information visualization with automated intelligent image analysis, Proc. IEEE Symposium on Visual Analytics Science and Technology (2006), incorporated herein by reference).
- MDS multidimensional scaling
- Yet another way would be to sub-select axes in the embedding space and position results along those axes.
- layouts contemplated include 2 dimensional organizations not on a grid (possibly including overlapping results), 3 dimensional organizations analogous to the 2 dimensional organizations. Multi-dimensional organizations analogous to the 2 and 3 dimensional organizations with the ability to rotate around one or more axes.
- M-dimensional layout can be used, where M>1.
- the number of dimensions in the presentation layout need not be the same as the number of dimensions in the embedding space.
- layouts include hierarchical organizations or graph based layouts.
- the document placement in the layout space should be indicative of the relationship among the documents in embedding space.
- the distance between documents in layout space should correspond (monotonically, if not linearly) with the distance between the same documents in embedding space.
- three documents are collinear in embedding space, advantageously they are placed collinearly in layout space as well.
- collinearity in layout space with a candidate document which the system identifies as the most likely target of the user's query (referred to herein as the primary candidate document) indicates collinearity in the embedding space with the primary candidate document.
- embedding space typically has a very large number of dimensions, and in high dimensional spaces very few points are actually collinear.
- documents presented collinearly in layout space indicate only “substantial” collinearity in embedding space.
- the embedding space is such that each document has a position in the space (as for a vector space)
- three documents are considered “substantially collinear” in embedding space if the largest angle of the triangle formed by the three documents in embedding space is greater than 160 degrees.
- a group of three documents are considered collinear if the sum of the two smallest distances between pairs of the documents in the group in embedding space equals the largest distance between pairs of the documents in the group in embedding space.
- the three documents are considered “substantially collinear” if the sum of the two smallest distances exceeds the largest distance by no more than 10%.
- colllinearity and substantially collinearity do not include the trivial cases of coincidence or substantial coincidence.
- User interaction module 822 provides the user with a user interface (UI) which allows the user to provide input in a variety of ways. For example, the user may click on a single result to select it, or may swipe in the direction of a single result to de-select it. Similarly, the user may select or deselect multiple results at a time. For example, this may be done using a toggle selector on each result. The user might also implicitly select a set of results by swiping in the direction of a result indicating a desire for results that are more like that result “in that direction”. In this case “in that direction” means that the differences between the primary result and the result being swiped should be magnified.
- UI user interface
- next set of results should be more like the result being swiped and less like the “primary result”.
- This concept may be generalized by allowing the user to swipe “from” one result “to” another result. In this case new results should be more like the “to” result and less like the “from” result.
- the UI can provide the user with the ability (e.g., via a double-click, or a pinch) to specify that the next set of results should be more like a specific result than any of the other results displayed. This is analogous to zooming on a map. Conversely, the UI can provide the ability to specify that the next set of results should be like a particular selection but more diverse than the currently selected set of results.
- the UI may also provide the ability for the user to specify that a different set of similarly diverse images be provided.
- the system then incorporates the user's input to create a refined query.
- the refined query includes information regarding the initial query and information derived from the iterative sequence of refinements made by the user so far.
- This refined query may be represented as a set of geometric constraints that focus subsequent results within a region of the embedding space. Likewise, it may be represented as a set of distance constraints whose intersection defines the refined candidate set of results. It may also be represented as a path through the set of all possible results.
- the refined query may include constraints that require subsequent results to be within a specified distance of one of the selected candidate results.
- the refined query may include constraints that require subsequent results to be closer (with respect to the distance measure) to one candidate result than to another.
- a system according to the invention may have the ability to relax, tighten, remove, or modify constraints that it determines are inappropriate.
- Such data structures include metric trees, kd-trees, R-trees, universal B-trees, X-trees, ball trees, locality sensitive hashes, and inverted indexes.
- the system uses a combination of such data structures to identify the next set of candidate results based on the refined query.
- User behavior data may be collected by a system according to the invention and used to improve or to specialize the search experience.
- many ways of expressing distances or similarities may be parameterized and those parameters may be fit.
- a similarity defined using a linear combination of kernels may have the coefficients of that linear combination tuned based on user behavior data. In this way the system may adapt to individual (or community, or contextual) notions of similarity.
- FIG. 3 illustrates a set of documents embedded in 2-dimensional space. Aspects of the invention envision embedding documents in spaces of large dimensionality, hence two dimensions is for illustration purposes only.
- the space 310 contains documents, e.g., 321 , 322 . Each pair of documents has a distance 330 between them.
- FIG. 4 illustrates the set of documents from FIG. 3 in addition to a circular geometric constraint 410 .
- Those documents inside the circle, e.g., 421 , 422 are said to satisfy the constraint.
- Aspects of the invention express queries and user input in the form of such geometric constraints.
- the documents that satisfy the constraints are the current results of the query. As the user provides further input additional constraints may be added, or existing constraints may be added or removed.
- FIG. 5 illustrates the set of documents from FIG. 3 in addition to a non-circular geometric constraint 510 .
- Aspects of the invention envision geometric constraints of arbitrary shape, and unions, intersections and differences of such constraints.
- FIG. 6 illustrates a means by which the circular constraint of FIG. 4 may be updated in response to user input.
- the original circular constraint 610 may be modified by increasing its radius to produce circular constraint 620 , or by decreasing its radius to produce constraint 630 . These modifications are done in response to user input.
- the set of documents satisfying these constraints will change as the constraints are modified thus reducing or expanding the set of images considered for display to the user.
- FIG. 7 illustrates a means by which a discriminative subset of documents may be selected for presentation to the user.
- the documents highlighted, e.g., 711 , 712 are distinct from each other and from the others contained in the circular constraint region.
- FIG. 10A illustrates a set of documents in embedding space, in which query processing module 824 has narrowed the collection to those documents within the circle 1020 , and has identified primary result document 1018 .
- discriminative selection module 828 has selected documents 1010 , 1012 , 1014 and 1016 as the discriminative set to present to the user. It can be seen that in embedding space, documents 1012 , 1018 and 1016 are substantially collinear, and that documents 1010 , 1018 and 1014 are substantially collinear.
- FIG. 10B illustrates how the system may present the set of documents in layout space.
- the broken lines are implied, rather than visible.
- the specific positions of the documents do not necessarily match those in embedding space, in part because dimensionality of the space has been reduced.
- documents which were substantially collinear in embedding space are collinear in layout space.
- the broken lines in FIG. 10A represent dimensions in embedding space along which the candidate documents differ
- the placement of the documents in layout space in FIG. 10B are indicative of those same dimensions.
- the relative distances among the documents along each of the lines of collinearity in layout space also are indicative of the relative distances in embedding space.
- One implementation allows users to search a personal photograph collection.
- Users are initially shown an arbitrary photograph (the primary result), e.g., the most recent photograph taken or viewed. This is displayed in the center of a 3 ⁇ 3 grid of photographs from the collection.
- Each of the photographs is selected to be close (defined below) to the primary result but different from each other along different axes relative to the primary result.
- the primary result is a photograph taken with family last week at home
- other photographs may be a) with the family last year at home, b) with the family last week outdoors, c) without the family last week at home, etc.
- the system may place two photographs on opposite sides of the primary result which are along the same axis but differ from each other in their positions along that axis. For example, the photo placed on the left side may show family member A more prominently than in the primary result, while the photo placed on the right side may show family member A less prominently than in the primary result.
- the user selects one of the 9 photographs which then becomes the primary result. This is then laid out in an updated 3 ⁇ 3 grid of photographs again “close” to it but different from each other.
- photographs may be considered similar with respect to a number of criteria, including:
- a normalization function can be applied in order that distances along different axes are comparable to each other.
- the “scale” at which the user is searching changes.
- This scale specifies how “close” the photos in the result set are to the primary result. More precisely all photos in the result set must have a “distance” less than some threshold. As the scale increases or decreases this threshold increases or decreases.
- Another implementation looks at searching for accessories (apparel, furniture, apartments, jewelry, etc).
- the user searches using text, speech, or with a prototype image as an initial query.
- a user searches for “brown purse” using text entry.
- the search engine responds by identifying a diverse set of possible results, e.g., purses of various kinds and various shades of brown. These results are laid out in a 2 dimensional arrangement (for example a grid), whereby more similar results are positioned closer to each other and more different results are positioned relatively far from each other.
- the user selects one or more images, for example using radio buttons.
- the image selections are then used by the search engine to define a “search direction” or a vector in the embedding space along which further results may be obtained.
- Documents are encoded in an embedding space such as a vector space or metric space (via a distance). Searches proceed as a sequence of query refinements. Query refinements are encoded as geometric constraints over the vector space or metric space. Discriminative candidate results are displayed to provide the user with the ability to add discriminative constraints. User inputs, e.g., selecting or deselecting results, are encoded as geometric constraints.
- the documents may embedded after the initial query is process and only those documents satisfying the query may be embedded.
- the documents may be re-embedded using a different embedding at any point in the process. In this case, the geometric constraints would be re-interpreted in the new embedding.
- the geometric constraints may be augmented at any point with non-geometric constraints.
- the candidate results can be filtered in a straightforward way to select only those satisfying the non-geometric constraints.
- the interaction can be augmented with faceted search, text, or speech inputs.
- the geometric constraints can be managed together with a set of non-geometric constraints.
- An example implementation may proceed through these steps:
- the above method may be viewed either from the viewpoint of the user interacting with a computer system, or the viewpoint of a computer system interacting with a user, or both.
- the concept may be generalized to refer to “digital documents” rather than images, where digital documents include audio, video, text, html, multimedia documents and product listings in a digital catalog, in addition to images.
- the concept may also be generalized so that the initial collection obtained at step 1 is obtained as the result of the user performing a search (query) within another information retrieval system or search engine.
- the concept may also be generalized so that rather than reducing the threshold at step 8, the user interface provides for the ability to decrease or increase the threshold or leave it unchanged.
- the concept may also be generalized so that at steps 1, and 6 there are two collections of prototype images and at step 2 the system identifies images with distance less than threshold T 1 of the first set, and greater than T 2 of the second set.
- the concept may also be generalized so that at one iteration of step 6 the user selects image(s) along a first subset of at least one axis, and at another iteration of step 6 the user selects image(s) along a second subset of at least one axis, where the second subset of axes contains at least one axis not included in the first subset of axes.
- FIG. 1 illustrates an example environment in which aspects of the invention may be implemented.
- the system includes a user computer 110 and a server computer 112 , connected to each other via a network 114 such as the Internet.
- the server computer 112 has accessibly thereto the database 816 identifying documents in association with embedding information, such as relative distances and/or their positions in a vector space.
- the user computer 110 also in various embodiments may or may not have accessibly thereto a database 118 identifying the same information.
- embedding module 820 (which may for example be the server computer 112 or a separate computer system or a process running on such a computer) analyzes a collection of documents to extract embedding information about the documents. For example, if the documents are photographs, the embedding module 820 may include a neural network and may use deep learning to derive embedding image information from the photographs.
- embedding module 820 may derive a library of image classifications (axes on which a given photograph may be placed), each in association with an algorithm for recognizing in a given photograph whether (or with what probability) the given photograph satisfies that classification. Then the embedding module 820 may apply its pre-developed library to a smaller set of newly provided photographs, such as the photos currently on the user computer 110 , in order to determine embedding information applicable to each photograph. Either way, the embedding module 820 writes into the database 816 the identifications of the collection of documents that the user may search, each in association with its embedding information.
- the embedding information that embedding module 820 writes into database 816 may be provided from an external source, or entered manually.
- the iterative identification steps described above can be implemented in a number of different ways.
- all computation takes place on the server computer 112 , as the user iteratively searches for a desired document.
- the user operating the user computer 110 , sees all results only by way of a browser.
- the server computer 112 transmits its entire database 118 of documents in embedding space (or a subset of that database) to the user computer 110 , which writes it into its own database 118 . All computation takes place on the user computer 110 in such an embodiment, as the user iteratively searches for a desired document. Many other arrangements are possible as well.
- FIG. 2 is a simplified block diagram of a computer system 210 that can be used to implement software incorporating aspects of the present invention.
- the drawing represents both an embodiment of user computer 110 and server computer 112 . While the above-described methods indicate individual logic steps or modules for carrying out specified operations, it will be appreciated that each step or module actually causes the computer system 210 to operate in the specified manner.
- Computer system 210 typically includes a processor subsystem 214 which communicates with a number of peripheral devices via bus subsystem 212 .
- peripheral devices may include a storage subsystem 224 , comprising a memory subsystem 226 and a file storage subsystem 228 , user interface input devices 222 , user interface output devices 220 , and a network interface subsystem 216 .
- the input and output devices allow user interaction with computer system 210 .
- Network interface subsystem 216 provides an interface to outside networks, including an interface to communication network 218 , and is coupled via communication network 218 to corresponding interface devices in other computer systems.
- Communication network 218 may comprise many interconnected computer systems and communication links.
- communication links may be wireline links, optical links, wireless links, or any other mechanisms for communication of information, but typically it is an IP-based communication network. While in one embodiment, communication network 218 is the Internet, in other embodiments, communication network 218 may be any suitable computer network.
- NICs network interface cards
- ICs integrated circuits
- ICs integrated circuits
- macrocells fabricated on a single integrated circuit chip with other components of the computer system.
- User interface input devices 222 may include a keyboard, pointing devices such as a mouse, trackball, touchpad, or graphics tablet, a scanner, a touch screen incorporated into the display, audio input devices such as voice recognition systems, microphones, and other types of input devices.
- pointing devices such as a mouse, trackball, touchpad, or graphics tablet
- audio input devices such as voice recognition systems, microphones, and other types of input devices.
- use of the term “input device” is intended to include all possible types of devices and ways to input information into computer system 210 or onto computer network 218 . It is by way of input devices 222 that the user provides queries and query refinements to the system.
- User interface output devices 220 may include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices.
- the display subsystem may include a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image.
- the display subsystem may also provide non-visual display such as via audio output devices.
- output device is intended to include all possible types of devices and ways to output information from computer system 210 to the user or to another machine or computer system. It is by way of output devices 220 that the system presents query result layouts toward the user.
- Storage subsystem 224 stores the basic programming and data constructs that provide the functionality of certain embodiments of the present invention.
- the various modules implementing the functionality of certain embodiments of the invention may be stored in storage subsystem 224 .
- These software modules are generally executed by processor subsystem 214 .
- Memory subsystem 226 typically includes a number of memories including a main random access memory (RAM) 230 for storage of instructions and data during program execution and a read only memory (ROM) 232 in which fixed instructions are stored.
- File storage subsystem 228 provides persistent storage for program and data files, and may include a hard disk drive, a floppy disk drive along with associated removable media, a CD ROM drive, an optical drive, or removable media cartridges.
- the databases and modules implementing the functionality of certain embodiments of the invention may have been provided on a computer readable medium such as one or more CD-ROMs, and may be stored by file storage subsystem 228 .
- the host memory 226 contains, among other things, computer instructions which, when executed by the processor subsystem 214 , cause the computer system to operate or perform functions as described herein. As used herein, processes and software that are said to run in or on “the host” or “the computer”, execute on the processor subsystem 214 in response to computer instructions and data in the host memory subsystem 226 including any other local or remote storage for such instructions and data.
- Bus subsystem 212 provides a mechanism by which the various components and subsystems of computer system 210 communicate with each other as intended. Although bus subsystem 212 is shown schematically as a single bus, alternative embodiments of the bus subsystem may use multiple busses.
- Computer system 210 itself can be of varying types including a personal computer, a portable computer, a workstation, a computer terminal, a network computer, a television, a mainframe, a server farm, or any other data processing system or user device.
- user computer 110 may be a hand-held device such as a tablet computer or a smart-phone. Due to the ever-changing nature of computers and networks, the description of computer system 210 depicted in FIG. 2 is intended only as a specific example for purposes of illustrating the preferred embodiments of the present invention. Many other configurations of computer system 210 are possible having more or less components than the computer system depicted in FIG. 2 .
- a computer readable medium is one on which information can be stored and read by a computer system. Examples include a floppy disk, a hard disk drive, a RAM, a CD, a DVD, flash memory, a USB drive, and so on.
- the computer readable medium may store information in coded formats that are decoded for actual use in a particular data processing system.
- a single computer readable medium may also include more than one physical item, such as a plurality of CD-ROMs or a plurality of segments of RAM, or a combination of several different kinds of media.
- the term does not include mere time varying signals in which the information is encoded in the way the signal varies over time.
- a given event or value is “responsive” to a predecessor event or value if the predecessor event or value influenced the given event or value. If there is an intervening processing element, step or time period, the given event or value can still be “responsive” to the predecessor event or value. If the intervening processing element or step combines more than one event or value, the signal output of the processing element or step is considered “responsive” to each of the event or value inputs. If the given event or value is the same as the predecessor event or value, this is merely a degenerate case in which the given event or value is still considered to be “responsive” to the predecessor event or value. “Dependency” of a given event or value upon another event or value is defined similarly.
- the “identification” of an item of information does not necessarily require the direct specification of that item of information.
- Information can be “identified” in a field by simply referring to the actual information through one or more layers of indirection, or by identifying one or more items of different information which are together sufficient to determine the actual item of information.
- the term “indicate” is used herein to mean the same as “identify”.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
- Applicants hereby claim the benefit under 35 U.S.C. 119(e) of U.S. provisional application No. 61/994,048, filed 15 May 2014. The provisional application is hereby incorporated by reference herein for its teachings.
- The invention relates generally to a tool for searching for digital documents in an interactive and visual way. Examples of digital documents include: photographs, product descriptions, or webpages. For example this tool may be used on a mobile device to search for furniture available for sale via an online retailer.
- Current computer search technologies allow users to perform queries and respond to those queries with an ordered list of results. The queries may be in the form of a structured query language, natural language text, speech, or a reference image. However, the results returned often do not satisfy the user's search goal. The user then proceeds to refine or modify the query in an attempt to better achieve desired goals.
- The system described herein addresses this challenge in at least two ways. First, it presents a set of search results that are diverse along multiple axes. Second, it allows the user to interactively “navigate” within those results by selecting results that incrementally address desired search goals and using that user input to iteratively refine the set of results presented to the user.
- Roughly described, a user performs an initial search using a query, which maybe textual, verbal, or visual, e.g., via a prototype image. The system returns a set of results in response to the query. The result set is selected to be diverse, that is, each of the results differs from the others along different axes relevant to the search query. The results are presented in a 2 dimensional arrangement (such as a grid). One result, the primary result, may be displayed more prominently. The other results are displayed in a layout relative to the primary result such that results that are most similar to each other are laid out near to each other and results that are most dissimilar from each other are laid out far from each other. The user may select one or more results to indicate a request to refine the search to identify items more similar to those results. The system then returns and displays a set of results more similar to the selected results and less similar to the non-selected results. In this way the user can interactively refine a search specification in order to more precisely encode intended search goals.
- The above summary of the invention is provided in order to provide a basic understanding of some aspects of the invention. This summary is not intended to identify key or critical elements of the invention or to delineate the scope of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later. Particular aspects of the invention are described in the claims, specification and drawings.
- The invention will be described with respect to specific embodiments thereof, and reference will be made to the drawings, in which:
-
FIG. 1 illustrates an example environment in which aspects of the invention may be implemented. -
FIG. 2 is a simplified block diagram of a computer system that can be used to implement aspects of the present invention. -
FIGS. 3-7 and 10A illustrate documents embedded in 2-dimensional embedding space. -
FIG. 8 is a block diagram of various components of an embodiment of a system according to the invention. -
FIG. 9 is a flow chart illustrating various logic phases through which a system according to the invention may proceed. -
FIG. 10B illustrates certain documents fromFIG. 10A in layout space. - The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present invention. Thus, the present invention is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
- In an embodiment of the invention, a system can have several aspects, and different embodiments need not implement all of them: 1) a module for creating an initial query, 2) a module for obtaining a set of candidate results satisfying the initial query, 3) a module for determining the distance or similarity between candidate results or a module for embedding the candidate results in a vector space, 4) a module for sub-selecting a discriminating set of candidate results, 5) a module for arranging candidate results in 2 dimensions, 6) a module for obtaining user input with regard to the candidate results, 7) a module for refining the search query to incorporate information regarding the user input encoded as geometric or distance constraints with respect to the embedding or distance measures of 3, 8) a module for iteratively obtaining a set of candidate results satisfying the initial query and the geometric or distance constraints accumulated from user input.
-
FIG. 8 is a block diagram of various components of an embodiment of a system according to the invention. It includes anembedding module 820 which calculates an embedding of the source documents into an embedding space, and writes the embedding information, in association with identification of the documents, into adocument collection database 816. Auser interaction module 822 receives queries and query refinement input from a user, and provides them to aquery processing module 824. In one embodiment theuser interaction module 822 includes a computer terminal, whereas in another embodiment it includes only certain network connection components through which the system communicates with an external computer terminal. Thequery processing module 824 interprets the queries as geometric constraints on the embedding space, and narrows the collection to develop a set of candidate documents which satisfy the geometric constraints. These are written into acandidate space database 826. Candidate spaces as used herein are also embedding spaces, and for example may constitute a portion of the embedding space ofcollection 816. In some embodiments query processing module may also perform a re-embedding of the candidate documents in embedding space. Adiscriminative selection module 828 then selects a discriminative set of the documents from thecandidate space 826 and presents them to the user viauser interaction module 822.User interaction module 822 may then receive further refinement queries from the user, which are handled as above, or it may receive a user commit indication, in which case the system takes someaction 830 with respect to the user's selected document such as opening the document for the user or engaging in further search refinement. - Note that in some embodiments the user refinement input may not require a further geometric constraint on the candidate space, but rather may involve only selection of a different discriminative set of documents from the existing
candidate space 826 for presentation to the user. Also, in various embodiments, the candidate space database may not be implemented as a separate database, but rather may be combined in various ways with the documentcollection embedding space 816. Candidate space may also be implied rather than physical in some embodiments. -
FIG. 9 is a flow chart illustrating various logic phases through which a system according to the invention may proceed. Initially, instep 910, a collection of digital documents (which, as used herein includes images, text, web-pages, catalog entries, and sections of documents) is embedded in embedding space. Instep 912, an initial query is optionally processed to yield an initial subset of digital documents satisfying the query results. (As used herein, the term “subset” refers only to a “proper” subset.) This may be a conventional text query, for example. Instep 914, the initial query is processed to yield a set of geometric constraints (possibly empty). Instep 916 the geometric constraints are applied to the space of embedded digital documents or to the initial subset of digital documents to identify a set of candidate results. In step 918 a discriminative subset of documents is selected from the candidate results, and instep 920 the discriminative subset of documents is presented toward the user. Instep 922 the user provides further input, for example by selecting or deselecting candidate results, by indicating a direction on the display with respect to one or more candidate results, or by providing a ranked order of the candidate results; or if satisfied with one of the candidate results, the user indicates to commit to that result. If the user input indicates further refinement, then the logic returns to step 914 to process the refinement query to yield further geometric constraints step 924). If not, then instep 926 the system takes action with respect to the user-selected document. - Note that at
910, 912 and 914 can happen in any order. In one embodiment,least steps step 910 occurs continuously in the background, separately from the remainder of the steps, and updates the document collection in embedding space asynchronously with the remainder of the steps. In general, the logic ofFIG. 9 , as well as all sequences and flow charts herein, can be implemented using processors programmed using computer programs stored in memory accessible to the computer systems and executable by the processors, by dedicated logic hardware, including field programmable integrated circuits, or by combinations of dedicated logic hardware and computer programs. Each block in the flowchart or phase in a logic sequence describes logic that can be implemented in hardware or in software running on one or more computing processes executing on one or more computer systems. In one embodiment, each step of the flow chart or phase in a logic sequence illustrates or describes the function of a separate module of software. In another embodiment, the logic of the step is performed by software code routines which are distributed throughout more than one module. As with all flowcharts and logic sequences herein, it will be appreciated that many of the steps can be combined, performed in parallel or performed in a different sequence without affecting the functions achieved. In some cases, as the reader will appreciate, a re-arrangement of steps will achieve the same results only if certain other changes are made as well. In other cases, as the reader will appreciate, a re-arrangement of steps will achieve the same results only if certain conditions are satisfied. Furthermore, it will be appreciated that the flow charts and logic sequences herein show only aspects that are pertinent to an understanding of the invention, and it will be understood that in a specific embodiment, numerous additional steps for accomplishing other functions for that embodiment can be performed before, after and between those steps shown. - The “embedding space”, into which digital documents are embedded by embedding
module 820 and instep 910, as used herein is a geometric space within which documents are represented. In one embodiment the embedding space is a vector space, in which the features of a document define its “position” in the vector space relative to an origin. The position is typically represented as a vector from the origin to the document's position, and the space has a number of dimensions based on the number of coordinates in the vector. Vector spaces deal with vectors and the operations that may be performed on those vectors. In another embodiment the embedding space is a metric space, which does not have a concept of position, dimensions or an origin. Distances among documents in a metric space are maintained relative to each other, rather than relative to any particular origin. Metric spaces deal with objects combined with a distance between those objects and the operations that may be performed on those objects. For purposes of the current discussion these objects are significant in that there exist many efficient algorithms that operate on vector spaces and metric spaces. For example metric trees may be used to rapidly identify objects that are “close” to each other. In the discussion below we embed objects into vector spaces and/or metric spaces. In the context of a vector space this means that we define a function that maps objects to vectors in some vector space. In the context of a metric space it means that we define a metric (or distance) between those objects that allows us to treat the set of all such objects as a metric space. Note that vector spaces allow the use of a variety of standard measures of distance (divergence) including the Euclidean distance. Other embodiments can use other types of embedding spaces. - Let us first consider embedding in a vector space. To embed a document collection in a vector space we associate each document with a vector. The distance between two documents in such a space is then determined using standard measures of distance on vectors.
- The goal of embedding digital documents in a vector space is to place intuitively similar documents close to each other. There are many ways to achieve this. For example a common way of embedding text documents is to use a bag-of-words model. The bag of words model maintains a dictionary. Each word in the dictionary is given an integer index, for example, the word aardvark may be given the index 1, and the word zebra may be given the index 60,000. Each document is processed by counting the number of occurrences of each dictionary word in that document. A vector is created where the value at the ith index is the count for the ith dictionary word. Variants of this representation normalize the counts in various ways. Such an embedding captures information about the content and therefor the meaning of the documents. Text documents with similar word distributions are close to each other in this embedded space.
- There are a wide variety of means by which documents may be embedded into a vector space. For example images may be processed to identify commonly occurring features using, e.g., scale invariant feature transforms (SIFT), which are then binned and used in a representation similar to the bag-of-words embedding described above. Of particular interest are embeddings created using deep neural networks, or other deep learning techniques. For example a neural network can learn an appropriate embedding by performing gradient descent against a measure of dimensionality reduction on a large set of training data. As another example, one could learn a kernel based on data and derive a distance based on that kernel. Likewise one may learn a distance directly. These approaches generally use large neural networks to map documents, words, or images to high dimensional vectors. Alternatively, one may learn an embedding using examples with algorithms such as Multi-Dimensional Scaling, or Stochastic Neighbor Embedding. An embedding into a vector space may also be defined implicitly via a kernel. In this case the explicit vectors may never be generated or used, rather the operations in the vectors space are carried out by performing kernel operations in the original space.
- In many cases a very high dimensional space would be required to capture the intuitive relationships between documents. In some of these cases the required dimensionality may be reduced by choosing to embed the documents on a manifold (curved surface) in the space rather than to arbitrary locations.
- Finally, it is worth noting that different embeddings may be appropriate on different subsets of the document collection. For example, it may be most effective to re-embed the candidate result sets at each iteration of the search procedure. In this way the subset may be re-embedded to capture the most important axes of variation or of interest in that subset.
- To embed a digital document collection in a metric space requires associating that collection with a distance (or metric). Below we discuss a number of ways to define a distance between digital documents.
- “Distance” between two documents in embedding space “corresponds to” a predetermined measurement of dissimilarity among documents. Preferably it is a monotonic function of the measurement of dissimilarity. Typically it equals the measurement of dissimilarity. Example distances include the Manhattan distance, the Euclidean distance, and the Hamming distance.
- There are a wide variety ways to measure the distance (or dissimilarity) between digital documents, and these may be combined to produce new measures of distance. An important concept is that the intuitive relationships between digital documents may be captured via such a similarity or distance measure. For example, some useful distance measures place images containing the same person in the same place close to each other. Likewise, some useful measures place documents discussing the same topic close to each other. Of course there are many axes along which digital documents may be intuitively related, so that the set of all documents close (with respect to that distance) to a given document may be quite diverse. For example, a historical text describing the relationship between Anthony and Cleopatra may be similar to other historical texts, texts about Egypt, texts about Rome, movies about Anthony and Cleopatra, and love stories. Each of these types of differences constitutes a different axis relative to the original historical text.
- Such distances may be defined in a variety of ways. One typical way is via embeddings into a vector space. Other ways include encoding the similarity via a Kernel. By associating a set of documents with a distance we are effectively embedding those documents into a metric space. Documents that are intuitively similar will be close in this metric space while those that are intuitively dissimilar will be far apart. Note further that kernels and distance functions may be learned. In fact, it may be useful to learn new distance functions on subsets of the documents at each iteration of the search procedure.
- The databases used in an embodiment of the invention, such as
816 and 826, may use commonly available means to store the data in, e.g., a relational database, a document store, a key value store, or other related technologies. In each case the original document contents (or pointers to them) may be stored and associated with their high dimensional representation, or a set of measures of distance relative to other documents.databases - In order to achieve scalable and fast search performance indexing structures are critical. When documents are embedded in a vector space indexes may be built using, e.g., kd-trees. When documents are associated with a distance metric and hence embedded in a metric space metric trees may be used.
- The databases described herein are stored on one or more non-transitory computer readable media. As used herein, no distinction is intended between whether a database is disposed “on” or “in” a computer readable medium. Additionally, as used herein, the term “database” does not necessarily imply any unity of structure. For example, two or more separate databases, when considered together, still constitute a “database” as that term is used herein.
- The initial query presented to the system in
step 912 may be created and evaluated using a variety of standard techniques. For example, the query may be presented as a set of keywords entered via a keyboard or via speech; the query may be a natural language phrase, or sentence entered via a keyboard or via speech; or the query may be an audio signal, an image, a video, or a piece of text representing a prototype for which similar audio signals, images, videos, or text may be sought. A variety of means are known by which such an initial query may be efficiently evaluated, e.g., searching a relational database, or using an inverted index. - The initial query may also be designed to simply return a random set of results.
- Other interfaces for initial queries allow for faceted search. Faceted search provides a means for users to constrain a search along a set of axes. For example, the faceted search might provide a slider that allows users to constrain the range of acceptable prices.
- The search constraints created from an initial query and subsequent user input are used to identify a set of candidate results. This may be achieved using a variety of means. For example, the initial query may be performed against a relational database whereby the results are then embedded in a vector or metric space. These results may then be indexed using, e.g., a kd-tree or a metric tree and searched to identify candidates that satisfy both the initial query and the constraints. Alternatively, the initial query may also be converted to geometric constraints that are applied to the set of embedded documents. In this case the geometric representation of the constraints implied both by the initial query and the user input are combined and an appropriate index is used to identify embedded documents satisfying both sets of constraints.
- As used herein, a “geometric constraint” on an embedding space is a constraint that is described formulaically in the embedding space, rather than only by cataloguing individual documents or document features to include or exclude. In a vector embedding space, for example, the constraint can be described in the form of a specified function which defines a hypersurface. Documents on one side of the hypersurface satisfy the constraint whereas documents on the other side do not. A hyperplane may be defined in terms of dot products or kernels and requires that k(x,z)>0 for a fixed vector x and a candidate z. Likewise a conic constraint may require that k(x,z)>c for some constant c. In a metric embedding space, the constraint can be described in the form of a function of, for example, distances between documents. Thus in a metric embedding space, a geometric constraint might take the form of ‘all documents within a specified distance from document X’, for example, or ‘all documents whose distance to document A is less than its distance to document B’. Geometric constraints also may be combined using set operations, e.g., union, intersection to define more complex geometric constraints. They also may be created by taking transformations of any of the example constraints discussed. For example, one may take a polynomial function of distances, e.g., d(x,z)*d(x,z)+d(y,z)<d(w, z) for given documents x, y, and w, and only those documents z which satisfy the function are considered to satisfy the geometric constraint.
- In various embodiments, constraints (either as indicated by the user or as converted to geometric constraints) may be “hard” or “soft”. Hard constraints are those which must be satisfied in the sense that solutions must satisfy the conditions of all hard constraints. Soft constraints are those which need not be satisfied but candidate solutions may be penalized for each soft constraint that they don't satisfy. Solutions may be rejected in a particular embodiment if the accumulation of such penalties is too large. Constraints may be relaxed in some embodiments, for example hard constraints may be converted to soft constraints by associating them with a penalty, and soft constraints may have their penalties reduced.
- Search queries may be ambiguous, or underspecified and so the documents satisfying a query may be quite diverse. For example, if the initial query is for a “red dress” the results may be quite varied in terms of their length, neckline, sleeves, etc. This aspect of the module sub-selects a discriminating set of results. Intuitively the objective is to provide a set of results to the user such that selection or de-selection of those results provides the most informative feedback or constraints to the search algorithm. One may think of this step as identifying an “informative” set of results, or a “diverse” set of results, or a “discriminating” set of results.
Discriminative selection module 828, performingstep 918, selects a discriminative subset of results in any of a variety of ways. - In one embodiment, a subset of the results may be discriminative as it provides a diversity of different kinds of feedback that the user can select. Diverse images may be selected as in, e.g., van Leuken, et. al., “Visual Diversification of Image Search Results”, in WWW '09 Proceedings of the 18th international conference on World wide web, pp. 341-350 (2009), incorporated by reference herein. This diverse set is selected in order to provide the user with a variety of ways in which to refine the query at the next iteration. There are a variety of ways in which such a set may be identified. For example, farthest first traversal may be performed which incrementally identifies the “most” diverse set of results. Farthest first traversal requires only a distance measure and does not require an embedding. Farthest first traversal may also be initialized with a set of results; subsequent results are then the most different from that initial set.
- Other means for selecting discriminative subsets of candidate results include using an algorithm like PCA (principal component analysis) or kernel PCA to identify the key axes of variation in the complete set of results. The discriminative subset is then constructed to contain documents that lie at multiple points along those most discriminating axes.
- Another means might consider the set of constraints that would results from the user selecting or deselecting a given document. This set of constraints may be considered in terms of the candidate results it would yield. A discriminative subset may be selected so that the sets of candidate results produced by selecting any of the documents in that discriminative subset are as different as possible.
- As used herein, “discriminativeness” of a particular set of documents in a collection of documents is the least number of documents in the collection that are excluded as a result of user selection of any document in the set. That is, if user selection of different documents in the particular set results in excluding different numbers of documents in the collection, then the set's “discriminativeness” is considered herein to be the least of those numbers. Note that either the discriminative set of documents, or the formula by which user selection of a document determines which documents are to be excluded, or both, should be chosen such that the union of the set of documents excluded by selecting any of the documents in a discriminative set equals the entire collection of documents.
- Also as used herein, the “average discriminativeness” of a set of size n documents in a collection of documents, is the average, over all sets of size n documents in the collection of documents, of the discriminativeness of that set. Also as used herein, one particular set of documents can be “more discriminative” than another set of documents if the discriminativeness of the first set is greater than the discriminativeness of the second set.
- Preferably the
selection module 828, performingstep 918, selects a set of N1>1 documents from thecurrent candidate space 826, which is more discriminative than the average discriminativeness of sets of size N1 documents in the candidate space. Even more preferably,selection module 828 performingstep 918 selects a set which is at least as discriminative as all other sets of size N1 documents in the current candidate space. - The aim of the discriminative results presentation to the user in
step 920 byuser interaction module 822, is to provide the user with a framework in which to refine the query constraints. - For example the results may be presented as a two dimensional grid. Results should be placed on that grid in a way that allows the user to appreciate the underlying distances between those results (as defined using a distance measure or embedding). One way to do this would be to ensure that results that are far from each other with respect to the distance measure are also displayed far from each other on the grid. Another way would be to project the embedding space onto two dimensions for example using multidimensional scaling (MDS) (for example see: Jing Yang, et. al., Semantic image browser: Bridging information visualization with automated intelligent image analysis, Proc. IEEE Symposium on Visual Analytics Science and Technology (2006), incorporated herein by reference). Yet another way would be to sub-select axes in the embedding space and position results along those axes.
- Other layouts contemplated include 2 dimensional organizations not on a grid (possibly including overlapping results), 3 dimensional organizations analogous to the 2 dimensional organizations. Multi-dimensional organizations analogous to the 2 and 3 dimensional organizations with the ability to rotate around one or more axes. In general an M-dimensional layout can be used, where M>1. In embodiments in which the embedding space has dimensions, the number of dimensions in the presentation layout need not be the same as the number of dimensions in the embedding space. Yet other layouts include hierarchical organizations or graph based layouts.
- The document placement in the layout space should be indicative of the relationship among the documents in embedding space. For example, the distance between documents in layout space should correspond (monotonically, if not linearly) with the distance between the same documents in embedding space. Also, if three documents are collinear in embedding space, advantageously they are placed collinearly in layout space as well. In particular, collinearity in layout space with a candidate document which the system identifies as the most likely target of the user's query (referred to herein as the primary candidate document) indicates collinearity in the embedding space with the primary candidate document.
- It will be appreciated, however, that embedding space typically has a very large number of dimensions, and in high dimensional spaces very few points are actually collinear. In an embodiment, therefore, documents presented collinearly in layout space indicate only “substantial” collinearity in embedding space. As used herein, if the embedding space is such that each document has a position in the space (as for a vector space), then three documents are considered “substantially collinear” in embedding space if the largest angle of the triangle formed by the three documents in embedding space is greater than 160 degrees. If the embedding space is such that documents do not have a position in the embedding space, but they do have distances from each other (such as for a metric space), then as used herein, a group of three documents are considered collinear if the sum of the two smallest distances between pairs of the documents in the group in embedding space equals the largest distance between pairs of the documents in the group in embedding space. The three documents are considered “substantially collinear” if the sum of the two smallest distances exceeds the largest distance by no more than 10%. As used herein, “collinearity” and “substantial collinearity” do not include the trivial cases of coincidence or substantial coincidence.
-
User interaction module 822 provides the user with a user interface (UI) which allows the user to provide input in a variety of ways. For example, the user may click on a single result to select it, or may swipe in the direction of a single result to de-select it. Similarly, the user may select or deselect multiple results at a time. For example, this may be done using a toggle selector on each result. The user might also implicitly select a set of results by swiping in the direction of a result indicating a desire for results that are more like that result “in that direction”. In this case “in that direction” means that the differences between the primary result and the result being swiped should be magnified. That is, the next set of results should be more like the result being swiped and less like the “primary result”. This concept may be generalized by allowing the user to swipe “from” one result “to” another result. In this case new results should be more like the “to” result and less like the “from” result. - Additionally, the UI can provide the user with the ability (e.g., via a double-click, or a pinch) to specify that the next set of results should be more like a specific result than any of the other results displayed. This is analogous to zooming on a map. Conversely, the UI can provide the ability to specify that the next set of results should be like a particular selection but more diverse than the currently selected set of results.
- The UI may also provide the ability for the user to specify that a different set of similarly diverse images be provided.
- The system then incorporates the user's input to create a refined query. The refined query includes information regarding the initial query and information derived from the iterative sequence of refinements made by the user so far. This refined query may be represented as a set of geometric constraints that focus subsequent results within a region of the embedding space. Likewise, it may be represented as a set of distance constraints whose intersection defines the refined candidate set of results. It may also be represented as a path through the set of all possible results.
- For example, the refined query may include constraints that require subsequent results to be within a specified distance of one of the selected candidate results. Or the refined query may include constraints that require subsequent results to be closer (with respect to the distance measure) to one candidate result than to another.
- Users may specify incompatible constraints. A system according to the invention may have the ability to relax, tighten, remove, or modify constraints that it determines are inappropriate.
- Given a distance (dissimilarity measure) between documents to be searched, or an embedding of those documents into a vector space or onto a manifold there are a variety of data structures that may be used to index that document collection and hence allow for rapid search. Such data structures include metric trees, kd-trees, R-trees, universal B-trees, X-trees, ball trees, locality sensitive hashes, and inverted indexes.
- The system uses a combination of such data structures to identify the next set of candidate results based on the refined query.
- User behavior data may be collected by a system according to the invention and used to improve or to specialize the search experience. In particular, many ways of expressing distances or similarities may be parameterized and those parameters may be fit. For example, a similarity defined using a linear combination of kernels may have the coefficients of that linear combination tuned based on user behavior data. In this way the system may adapt to individual (or community, or contextual) notions of similarity.
-
FIG. 3 illustrates a set of documents embedded in 2-dimensional space. Aspects of the invention envision embedding documents in spaces of large dimensionality, hence two dimensions is for illustration purposes only. Thespace 310 contains documents, e.g., 321, 322. Each pair of documents has adistance 330 between them. -
FIG. 4 illustrates the set of documents fromFIG. 3 in addition to a circulargeometric constraint 410. Those documents inside the circle, e.g., 421, 422 are said to satisfy the constraint. Aspects of the invention express queries and user input in the form of such geometric constraints. The documents that satisfy the constraints are the current results of the query. As the user provides further input additional constraints may be added, or existing constraints may be added or removed. -
FIG. 5 illustrates the set of documents fromFIG. 3 in addition to a non-circulargeometric constraint 510. Aspects of the invention envision geometric constraints of arbitrary shape, and unions, intersections and differences of such constraints. -
FIG. 6 illustrates a means by which the circular constraint ofFIG. 4 may be updated in response to user input. The originalcircular constraint 610 may be modified by increasing its radius to producecircular constraint 620, or by decreasing its radius to produceconstraint 630. These modifications are done in response to user input. The set of documents satisfying these constraints will change as the constraints are modified thus reducing or expanding the set of images considered for display to the user. -
FIG. 7 illustrates a means by which a discriminative subset of documents may be selected for presentation to the user. The documents highlighted, e.g., 711, 712, are distinct from each other and from the others contained in the circular constraint region. -
FIG. 10A illustrates a set of documents in embedding space, in whichquery processing module 824 has narrowed the collection to those documents within thecircle 1020, and has identifiedprimary result document 1018. In addition,discriminative selection module 828 has selected 1010, 1012, 1014 and 1016 as the discriminative set to present to the user. It can be seen that in embedding space,documents 1012, 1018 and 1016 are substantially collinear, and thatdocuments 1010, 1018 and 1014 are substantially collinear.documents -
FIG. 10B illustrates how the system may present the set of documents in layout space. (The broken lines are implied, rather than visible.) The specific positions of the documents do not necessarily match those in embedding space, in part because dimensionality of the space has been reduced. However, documents which were substantially collinear in embedding space are collinear in layout space. In particular, if the broken lines inFIG. 10A represent dimensions in embedding space along which the candidate documents differ, the placement of the documents in layout space inFIG. 10B are indicative of those same dimensions. In addition, the relative distances among the documents along each of the lines of collinearity in layout space also are indicative of the relative distances in embedding space. - Various embodiments of the approaches described herein may yield one or more of the following advantages over prior systems:
-
- An embodiment need not be limited to a single fixed hierarchy of documents. More specifically, an embodiment need not require the explicit determination of a taxonomy by which the document collection is described. Nor does it require a clustering of documents into a static hierarchy. That is, the sequence of refinements that a user may perform need not be constrained to narrowing or broadening in some pre-defined taxonomy or hierarchy.
- An embodiment can be extremely flexible and may be applied to images, text, audio, video, and many other kinds of data.
- Intuitions about the relationships among documents are often easier to express using notions of similarity or distance between documents than using a taxonomy or tags.
- Selecting and deselecting candidate results in a visual way is a more facile interface for performing search on a mobile device or a tablet.
- Encoding query refinements in terms of geometric constraints allows for a more flexible user interaction. Specifically, in an embodiment, the user is not required to be familiar with a pre-defined tagging ontology, or with a query logic used to combine constraints. Furthermore, in an embodiment such geometric constraints can be more robust to errors in a feature tagging or annotation process.
- The ability to incrementally refine a search is helpful to a productive user experience.
- The use of a discriminative subset of candidate results makes more effective use of limited display space. The clutter on the display is minimized while simultaneously capturing a high proportion of the information available in the complete results set and providing a wide variety of options for the user to refine a query.
- Given that distances, embeddings, and similarities may be machine learned, a system using this approach can provide the ability to specialize the search experience to individuals, groups, cultures, and document categories.
- Compared to content based image retrieval (CBIR) techniques, an embodiment of the invention can be more amenable to incremental refinement of a search. Specifically, a user may take a photograph and use a CBIR system to identify related or highly similar photographs. However, if the user is dissatisfied with the results the CBIR system does not provide them with a way to refine search goals.
- One implementation allows users to search a personal photograph collection. Users are initially shown an arbitrary photograph (the primary result), e.g., the most recent photograph taken or viewed. This is displayed in the center of a 3×3 grid of photographs from the collection. Each of the photographs is selected to be close (defined below) to the primary result but different from each other along different axes relative to the primary result. For example, if the primary result is a photograph taken with family last week at home, then other photographs may be a) with the family last year at home, b) with the family last week outdoors, c) without the family last week at home, etc. In some situations, the system may place two photographs on opposite sides of the primary result which are along the same axis but differ from each other in their positions along that axis. For example, the photo placed on the left side may show family member A more prominently than in the primary result, while the photo placed on the right side may show family member A less prominently than in the primary result.
- The user selects one of the 9 photographs which then becomes the primary result. This is then laid out in an updated 3×3 grid of photographs again “close” to it but different from each other.
- If at any point the user double clicks on the primary result then the definition of “close” changes to a “smaller scale” (defined below). If the user uses a “pinch out” gesture then the definition of “close” changes to a “larger scale” and the result set is updated.
- In this way a user may navigate a collection of photographs to find specific ones.
- In this example photographs may be considered similar with respect to a number of criteria, including:
-
- GPS location of the photograph
- Time of the photograph
- Color content of the photograph
- Whether the photograph was taken indoors or outdoors
- Whether there are people in the photograph
- Who is in the photograph
- Whether people in the photograph are happy or sad
- The activity depicted in the photograph
- The objects contained in the photograph
- And many others. These criteria are captured into a numerical “distance”, or as a vector locating photographs in some space. In the latter case a standard notion of similarity or distance may be used, e.g., the dot product or Euclidean distance. In an embodiment, a normalization function can be applied in order that distances along different axes are comparable to each other.
- As the user navigates a photo collection the “scale” at which the user is searching changes. This scale specifies how “close” the photos in the result set are to the primary result. More precisely all photos in the result set must have a “distance” less than some threshold. As the scale increases or decreases this threshold increases or decreases.
- Considering this example with respect to the steps described above:
-
- Embedding: For each photograph in a user's collection of personal photographs a vector is produced that has indices corresponding to, e.g., the longitude, the latitude, the time of day, the day of week, the number of faces, whether a given activity is depicted, among many others.
- Initial Query: In this case the initial query is empty, that is all photos are candidate results and the one presented to the user is arbitrary.
- Initial Query as geometric constraints: The initial query produces an empty set of geometric constraints
- The geometric constraints are applied to the set of embedded photographs to identify those that satisfy the constraints, i.e., the candidate results
- A discriminative subset of 9 photographs is selected from the candidate results using farthest first traversal.
- The 9 photographs are presented to the user in a 3×3 grid
- The user selects one of the photographs to indicate a desire to see more photographs like that one.
- The user selected photograph is processed to yield a new geometric constraint which can be represented as a sphere around the selected photograph in the embedding space. This new constraint is added to the current set of constraints. The combined constraint is the intersection of spheres around all photographs selected so far.
- Another implementation looks at searching for accessories (apparel, furniture, apartments, jewelry, etc). In this implementation the user searches using text, speech, or with a prototype image as an initial query. For example, a user searches for “brown purse” using text entry. The search engine responds by identifying a diverse set of possible results, e.g., purses of various kinds and various shades of brown. These results are laid out in a 2 dimensional arrangement (for example a grid), whereby more similar results are positioned closer to each other and more different results are positioned relatively far from each other. The user then selects one or more images, for example using radio buttons. The image selections are then used by the search engine to define a “search direction” or a vector in the embedding space along which further results may be obtained.
- Considering this example with respect to the steps described above:
-
- Embedding: For each entry in an accessories catalog a vector is produced using deep learning techniques trained to differentiate accessories.
- Initial Query: In this case the initial query is a textual search that narrows further results to be within a portion of the full catalog. This restricted is the set of initial candidate results.
- Initial Query as geometric constraints: The initial query produces an empty set of geometric constraints
- The geometric constraints are applied to the set of embedded accessories in the restricted set (i.e., the initial candidate results) to identify those that satisfy the constraints, i.e., the candidate results
- A diverse subset of 9 catalog entries is selected from the candidate results using farthest first traversal.
- The 9 catalog entries are presented to the user in a 3×3 grid
- The user selects one of the catalog entries to indicate a desire to see more accessories like that one.
- The user selected accessory is processed to yield a new geometric constraint which can be represented as a sphere around the selected accessory in the embedding space. This new constraint is added to the current set of constraints. The combined constraint is the intersection of spheres around all accessories selected so far.
- Documents are encoded in an embedding space such as a vector space or metric space (via a distance). Searches proceed as a sequence of query refinements. Query refinements are encoded as geometric constraints over the vector space or metric space. Discriminative candidate results are displayed to provide the user with the ability to add discriminative constraints. User inputs, e.g., selecting or deselecting results, are encoded as geometric constraints.
- Variations of this approach will be apparent to the reader. For example the documents may embedded after the initial query is process and only those documents satisfying the query may be embedded. Similarly, the documents may be re-embedded using a different embedding at any point in the process. In this case, the geometric constraints would be re-interpreted in the new embedding.
- The geometric constraints may be augmented at any point with non-geometric constraints. In this case the candidate results can be filtered in a straightforward way to select only those satisfying the non-geometric constraints. In this way the interaction can be augmented with faceted search, text, or speech inputs. At each iteration of the process the geometric constraints can be managed together with a set of non-geometric constraints.
- An example implementation may proceed through these steps:
-
- 1. Obtaining a collection of prototype images (at least 1) from a user;
- 2. Identifying all images in the collection with a distance less than a threshold T from the prototype images;
- 3. Identifying a discriminative subset of the images collected in (2);
- 4. Presenting the discriminative subset of images to the user in a 2-dimensional layout;
- 5. If the user is satisfied with one of the presented images, receiving an indication of such satisfaction and taking desired action with respect to the selected image;
- 6. If the user is not yet satisfied, obtaining from the user a selection of the presented images more like the desired result;
- 7. Producing a revised collection of prototype images;
- 8. Reducing threshold T;
- 9. Goto 2.
- The above method may be viewed either from the viewpoint of the user interacting with a computer system, or the viewpoint of a computer system interacting with a user, or both.
- The concept may be generalized to refer to “digital documents” rather than images, where digital documents include audio, video, text, html, multimedia documents and product listings in a digital catalog, in addition to images.
- The concept may also be generalized so that the initial collection obtained at step 1 is obtained as the result of the user performing a search (query) within another information retrieval system or search engine.
- The concept may also be generalized so that rather than reducing the threshold at step 8, the user interface provides for the ability to decrease or increase the threshold or leave it unchanged.
- The concept may also be generalized so that at steps 1, and 6 there are two collections of prototype images and at step 2 the system identifies images with distance less than threshold T1 of the first set, and greater than T2 of the second set.
- The concept may also be generalized so that at one iteration of step 6 the user selects image(s) along a first subset of at least one axis, and at another iteration of step 6 the user selects image(s) along a second subset of at least one axis, where the second subset of axes contains at least one axis not included in the first subset of axes.
-
FIG. 1 illustrates an example environment in which aspects of the invention may be implemented. The system includes auser computer 110 and aserver computer 112, connected to each other via anetwork 114 such as the Internet. Theserver computer 112 has accessibly thereto thedatabase 816 identifying documents in association with embedding information, such as relative distances and/or their positions in a vector space. Theuser computer 110 also in various embodiments may or may not have accessibly thereto adatabase 118 identifying the same information. - Initially, embedding module 820 (which may for example be the
server computer 112 or a separate computer system or a process running on such a computer) analyzes a collection of documents to extract embedding information about the documents. For example, if the documents are photographs, the embeddingmodule 820 may include a neural network and may use deep learning to derive embedding image information from the photographs. - Alternatively, embedding
module 820 may derive a library of image classifications (axes on which a given photograph may be placed), each in association with an algorithm for recognizing in a given photograph whether (or with what probability) the given photograph satisfies that classification. Then the embeddingmodule 820 may apply its pre-developed library to a smaller set of newly provided photographs, such as the photos currently on theuser computer 110, in order to determine embedding information applicable to each photograph. Either way, the embeddingmodule 820 writes into thedatabase 816 the identifications of the collection of documents that the user may search, each in association with its embedding information. - In yet another embodiment, the embedding information that embedding
module 820 writes intodatabase 816 may be provided from an external source, or entered manually. - The iterative identification steps described above can be implemented in a number of different ways. In one embodiment, all computation takes place on the
server computer 112, as the user iteratively searches for a desired document. The user, operating theuser computer 110, sees all results only by way of a browser. In this embodiment, it is not necessary that theuser computer 110 have thedocument collection database 118 accessibly thereto. In another embodiment, theserver computer 112 transmits itsentire database 118 of documents in embedding space (or a subset of that database) to theuser computer 110, which writes it into itsown database 118. All computation takes place on theuser computer 110 in such an embodiment, as the user iteratively searches for a desired document. Many other arrangements are possible as well. -
FIG. 2 is a simplified block diagram of acomputer system 210 that can be used to implement software incorporating aspects of the present invention. The drawing represents both an embodiment ofuser computer 110 andserver computer 112. While the above-described methods indicate individual logic steps or modules for carrying out specified operations, it will be appreciated that each step or module actually causes thecomputer system 210 to operate in the specified manner. -
Computer system 210 typically includes aprocessor subsystem 214 which communicates with a number of peripheral devices viabus subsystem 212. These peripheral devices may include astorage subsystem 224, comprising amemory subsystem 226 and afile storage subsystem 228, userinterface input devices 222, userinterface output devices 220, and anetwork interface subsystem 216. The input and output devices allow user interaction withcomputer system 210.Network interface subsystem 216 provides an interface to outside networks, including an interface tocommunication network 218, and is coupled viacommunication network 218 to corresponding interface devices in other computer systems.Communication network 218 may comprise many interconnected computer systems and communication links. These communication links may be wireline links, optical links, wireless links, or any other mechanisms for communication of information, but typically it is an IP-based communication network. While in one embodiment,communication network 218 is the Internet, in other embodiments,communication network 218 may be any suitable computer network. - The physical hardware component of network interfaces are sometimes referred to as network interface cards (NICs), although they need not be in the form of cards: for instance they could be in the form of integrated circuits (ICs) and connectors fitted directly onto a motherboard, or in the form of macrocells fabricated on a single integrated circuit chip with other components of the computer system.
- User
interface input devices 222 may include a keyboard, pointing devices such as a mouse, trackball, touchpad, or graphics tablet, a scanner, a touch screen incorporated into the display, audio input devices such as voice recognition systems, microphones, and other types of input devices. In general, use of the term “input device” is intended to include all possible types of devices and ways to input information intocomputer system 210 or ontocomputer network 218. It is by way ofinput devices 222 that the user provides queries and query refinements to the system. - User
interface output devices 220 may include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices. The display subsystem may include a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image. The display subsystem may also provide non-visual display such as via audio output devices. In general, use of the term “output device” is intended to include all possible types of devices and ways to output information fromcomputer system 210 to the user or to another machine or computer system. It is by way ofoutput devices 220 that the system presents query result layouts toward the user. -
Storage subsystem 224 stores the basic programming and data constructs that provide the functionality of certain embodiments of the present invention. For example, the various modules implementing the functionality of certain embodiments of the invention may be stored instorage subsystem 224. These software modules are generally executed byprocessor subsystem 214. -
Memory subsystem 226 typically includes a number of memories including a main random access memory (RAM) 230 for storage of instructions and data during program execution and a read only memory (ROM) 232 in which fixed instructions are stored.File storage subsystem 228 provides persistent storage for program and data files, and may include a hard disk drive, a floppy disk drive along with associated removable media, a CD ROM drive, an optical drive, or removable media cartridges. The databases and modules implementing the functionality of certain embodiments of the invention may have been provided on a computer readable medium such as one or more CD-ROMs, and may be stored byfile storage subsystem 228. Thehost memory 226 contains, among other things, computer instructions which, when executed by theprocessor subsystem 214, cause the computer system to operate or perform functions as described herein. As used herein, processes and software that are said to run in or on “the host” or “the computer”, execute on theprocessor subsystem 214 in response to computer instructions and data in thehost memory subsystem 226 including any other local or remote storage for such instructions and data. -
Bus subsystem 212 provides a mechanism by which the various components and subsystems ofcomputer system 210 communicate with each other as intended. Althoughbus subsystem 212 is shown schematically as a single bus, alternative embodiments of the bus subsystem may use multiple busses. -
Computer system 210 itself can be of varying types including a personal computer, a portable computer, a workstation, a computer terminal, a network computer, a television, a mainframe, a server farm, or any other data processing system or user device. In particular, it is envisaged thatuser computer 110 may be a hand-held device such as a tablet computer or a smart-phone. Due to the ever-changing nature of computers and networks, the description ofcomputer system 210 depicted inFIG. 2 is intended only as a specific example for purposes of illustrating the preferred embodiments of the present invention. Many other configurations ofcomputer system 210 are possible having more or less components than the computer system depicted inFIG. 2 . - While the present invention has been described in the context of a fully functioning data processing system, those of ordinary skill in the art will appreciate that the processes described herein are capable of being distributed in the form of a computer readable medium of instructions and data and that the invention applies equally regardless of the particular type of signal bearing media actually used to carry out the distribution. As used herein, a computer readable medium is one on which information can be stored and read by a computer system. Examples include a floppy disk, a hard disk drive, a RAM, a CD, a DVD, flash memory, a USB drive, and so on. The computer readable medium may store information in coded formats that are decoded for actual use in a particular data processing system. A single computer readable medium, as the term is used herein, may also include more than one physical item, such as a plurality of CD-ROMs or a plurality of segments of RAM, or a combination of several different kinds of media. As used herein, the term does not include mere time varying signals in which the information is encoded in the way the signal varies over time.
- As used herein, a given event or value is “responsive” to a predecessor event or value if the predecessor event or value influenced the given event or value. If there is an intervening processing element, step or time period, the given event or value can still be “responsive” to the predecessor event or value. If the intervening processing element or step combines more than one event or value, the signal output of the processing element or step is considered “responsive” to each of the event or value inputs. If the given event or value is the same as the predecessor event or value, this is merely a degenerate case in which the given event or value is still considered to be “responsive” to the predecessor event or value. “Dependency” of a given event or value upon another event or value is defined similarly.
- As used herein, the “identification” of an item of information does not necessarily require the direct specification of that item of information. Information can be “identified” in a field by simply referring to the actual information through one or more layers of indirection, or by identifying one or more items of different information which are together sufficient to determine the actual item of information. In addition, the term “indicate” is used herein to mean the same as “identify”.
- The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.
- The foregoing description of preferred embodiments of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in this art. In particular, and without limitation, any and all variations described, suggested or incorporated by reference in the Background section of this patent application are specifically incorporated by reference into the description herein of embodiments of the invention. In addition, any and all variations described, suggested or incorporated by reference herein with respect to any one embodiment are also to be considered taught with respect to all other embodiments. The embodiments described herein were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to understand the invention for various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents.
Claims (28)
Priority Applications (13)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/494,364 US20150331908A1 (en) | 2014-05-15 | 2014-09-23 | Visual interactive search |
| DE112015002286.4T DE112015002286T9 (en) | 2014-05-15 | 2015-05-04 | VISUAL INTERACTIVE SEARCH |
| JP2016567798A JP2017518570A (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| CN201580038513.4A CN107209762B (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| TW104114144A TW201606537A (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| PCT/IB2015/001267 WO2015173647A1 (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| GB1621341.5A GB2544660A (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| US15/311,163 US10503765B2 (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| EP15760512.2A EP3143523B1 (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| US15/295,930 US10606883B2 (en) | 2014-05-15 | 2016-10-17 | Selection of initial document collection for visual interactive search |
| US15/295,926 US20170039198A1 (en) | 2014-05-15 | 2016-10-17 | Visual interactive search, scalable bandit-based visual interactive search and ranking for visual interactive search |
| US15/373,897 US10102277B2 (en) | 2014-05-15 | 2016-12-09 | Bayesian visual interactive search |
| US16/681,514 US11216496B2 (en) | 2014-05-15 | 2019-11-12 | Visual interactive search |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201461994048P | 2014-05-15 | 2014-05-15 | |
| US14/494,364 US20150331908A1 (en) | 2014-05-15 | 2014-09-23 | Visual interactive search |
Related Child Applications (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/311,163 Continuation US10503765B2 (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| US15/311,163 Continuation-In-Part US10503765B2 (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| PCT/IB2015/001267 Continuation-In-Part WO2015173647A1 (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20150331908A1 true US20150331908A1 (en) | 2015-11-19 |
Family
ID=54066162
Family Applications (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US14/494,364 Abandoned US20150331908A1 (en) | 2014-05-15 | 2014-09-23 | Visual interactive search |
| US15/311,163 Active 2035-06-20 US10503765B2 (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| US16/681,514 Expired - Fee Related US11216496B2 (en) | 2014-05-15 | 2019-11-12 | Visual interactive search |
Family Applications After (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/311,163 Active 2035-06-20 US10503765B2 (en) | 2014-05-15 | 2015-05-04 | Visual interactive search |
| US16/681,514 Expired - Fee Related US11216496B2 (en) | 2014-05-15 | 2019-11-12 | Visual interactive search |
Country Status (8)
| Country | Link |
|---|---|
| US (3) | US20150331908A1 (en) |
| EP (1) | EP3143523B1 (en) |
| JP (1) | JP2017518570A (en) |
| CN (1) | CN107209762B (en) |
| DE (1) | DE112015002286T9 (en) |
| GB (1) | GB2544660A (en) |
| TW (1) | TW201606537A (en) |
| WO (1) | WO2015173647A1 (en) |
Cited By (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20160104074A1 (en) * | 2014-10-10 | 2016-04-14 | Yahoo! Inc. | Recommending Bidded Terms |
| US20170213249A1 (en) * | 2014-10-15 | 2017-07-27 | Toor Inc. | Information display method and information display device |
| US20180253496A1 (en) * | 2017-02-28 | 2018-09-06 | Laserlike Inc. | Interest embedding vectors |
| US20180250554A1 (en) * | 2017-03-03 | 2018-09-06 | Sentient Technologies (Barbados) Limited | Behavior Dominated Search in Evolutionary Search Systems |
| CN108604315A (en) * | 2015-12-30 | 2018-09-28 | 脸谱公司 | Identify entities using a deep learning model |
| US20190258719A1 (en) * | 2017-02-28 | 2019-08-22 | Laserlike, Inc. | Emoji classifier |
| US10600208B2 (en) | 2017-12-21 | 2020-03-24 | Industrial Technology Research Institute | Object detecting device, object detecting method and non-transitory computer-readable medium |
| US10762439B2 (en) * | 2016-07-26 | 2020-09-01 | International Business Machines Corporation | Event clustering and classification with document embedding |
| US11100145B2 (en) * | 2019-09-11 | 2021-08-24 | International Business Machines Corporation | Dialog-based image retrieval with contextual information |
| US20210312297A1 (en) * | 2020-04-07 | 2021-10-07 | Cognizant Technology Solutions U.S. Corporation | Framework For Interactive Exploration, Evaluation, and Improvement of AI-Generated Solutions |
| US11216496B2 (en) * | 2014-05-15 | 2022-01-04 | Evolv Technology Solutions, Inc. | Visual interactive search |
| US20220092515A1 (en) * | 2020-09-24 | 2022-03-24 | The trustee of the Thomas J. Watson Foundation, DBA Watson Foundation, a Delaware charitable trust, | Personal, professional, cultural (ppc) insight system |
| US11294974B1 (en) * | 2018-10-04 | 2022-04-05 | Apple Inc. | Golden embeddings |
| US20220164381A1 (en) * | 2019-03-29 | 2022-05-26 | Semiconductor Energy Laboratory Co., Ltd. | Image retrieval system and image retrieval method |
| US11481448B2 (en) * | 2020-03-31 | 2022-10-25 | Microsoft Technology Licensing, Llc | Semantic matching and retrieval of standardized entities |
| US11775841B2 (en) | 2020-06-15 | 2023-10-03 | Cognizant Technology Solutions U.S. Corporation | Process and system including explainable prescriptions through surrogate-assisted evolution |
| US11783195B2 (en) | 2019-03-27 | 2023-10-10 | Cognizant Technology Solutions U.S. Corporation | Process and system including an optimization engine with evolutionary surrogate-assisted prescriptions |
| US11995559B2 (en) | 2018-02-06 | 2024-05-28 | Cognizant Technology Solutions U.S. Corporation | Enhancing evolutionary optimization in uncertain environments by allocating evaluations via multi-armed bandit algorithms |
| US12424335B2 (en) | 2020-07-08 | 2025-09-23 | Cognizant Technology Solutions U.S. Corporation | AI based optimized decision making for epidemiological modeling |
Families Citing this family (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA3035277C (en) | 2016-09-02 | 2024-05-14 | FutureVault Inc. | Real-time document filtering systems and methods |
| GB201703602D0 (en) * | 2017-03-07 | 2017-04-19 | Selerio Ltd | Multi-Modal image search |
| US11238083B2 (en) * | 2017-05-12 | 2022-02-01 | Evolv Technology Solutions, Inc. | Intelligently driven visual interface on mobile devices and tablets based on implicit and explicit user actions |
| CN107766414B (en) * | 2017-09-06 | 2020-06-12 | 北京三快在线科技有限公司 | Multi-document intersection acquisition method, device and equipment and readable storage medium |
| CN108401024B (en) * | 2018-02-22 | 2020-11-03 | 武汉大学 | A Context Scaling Cache Method Based on User-Centric Access Behavior |
| TWI718374B (en) * | 2018-05-10 | 2021-02-11 | 和碩聯合科技股份有限公司 | Method and system for establishing hierarchy chart of electronic text |
| KR102643444B1 (en) | 2018-09-27 | 2024-03-06 | 구글 엘엘씨 | Analyzing web pages to facilitate automatic navigation |
| EP3662417A1 (en) * | 2018-10-08 | 2020-06-10 | Google LLC. | Digital image classification and annotation |
| US11410220B2 (en) * | 2019-05-15 | 2022-08-09 | Samsung Electronics Co., Ltd. | Exploration for interactive recommendation system, method, and computer program product |
| EP4133385A1 (en) | 2020-04-11 | 2023-02-15 | IPRally Technologies Oy | System and method for performing a search in a vector space based search engine |
| US11431609B2 (en) * | 2020-04-30 | 2022-08-30 | Hewlett Packard Enterprise Development Lp | Dynamic network service data routing based on measured network performance data |
| EP3985545A1 (en) * | 2020-10-15 | 2022-04-20 | Dassault Systèmes | 3d clustering navigation |
| US11552943B2 (en) * | 2020-11-13 | 2023-01-10 | Cyberark Software Ltd. | Native remote access to target resources using secretless connections |
| CN117355826A (en) | 2021-05-28 | 2024-01-05 | 富士通株式会社 | Information processing program, information processing method, and information processing device |
| US20240028638A1 (en) * | 2022-07-22 | 2024-01-25 | Google Llc | Systems and Methods for Efficient Multimodal Search Refinement |
| US12182125B1 (en) * | 2024-02-15 | 2024-12-31 | Snark AI, Inc. | Systems and methods for trained embedding mappings for improved retrieval augmented generation |
| US12367353B1 (en) * | 2024-12-06 | 2025-07-22 | U.S. Bancorp, National Association | Control parameter feedback protocol for adapting to data stream response feedback |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5930784A (en) * | 1997-08-21 | 1999-07-27 | Sandia Corporation | Method of locating related items in a geometric space for data mining |
| US20020091678A1 (en) * | 2001-01-05 | 2002-07-11 | Miller Nancy E. | Multi-query data visualization processes, data visualization apparatus, computer-readable media and computer data signals embodied in a transmission medium |
| US20020164078A1 (en) * | 2001-03-23 | 2002-11-07 | Fujitsu Limited | Information retrieving system and method |
| US20100134415A1 (en) * | 2008-11-28 | 2010-06-03 | Sony Corporation | Image processing apparatus, image displaying method, and image displaying program |
Family Cites Families (56)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6098034A (en) | 1996-03-18 | 2000-08-01 | Expert Ease Development, Ltd. | Method for standardizing phrasing in a document |
| US5696964A (en) | 1996-04-16 | 1997-12-09 | Nec Research Institute, Inc. | Multimedia database retrieval system which maintains a posterior probability distribution that each item in the database is a target of a search |
| US6286018B1 (en) | 1998-03-18 | 2001-09-04 | Xerox Corporation | Method and apparatus for finding a set of documents relevant to a focus set using citation analysis and spreading activation techniques |
| US7152031B1 (en) | 2000-02-25 | 2006-12-19 | Novell, Inc. | Construction, manipulation, and comparison of a multi-dimensional semantic space |
| US6405188B1 (en) | 1998-07-31 | 2002-06-11 | Genuity Inc. | Information retrieval system |
| US6347313B1 (en) | 1999-03-01 | 2002-02-12 | Hewlett-Packard Company | Information embedding based on user relevance feedback for object retrieval |
| US6353825B1 (en) | 1999-07-30 | 2002-03-05 | Verizon Laboratories Inc. | Method and device for classification using iterative information retrieval techniques |
| US6574616B1 (en) | 2000-02-16 | 2003-06-03 | Index Stock Imagery, Inc. | Stochastic visually based image query and retrieval system |
| US7099860B1 (en) | 2000-10-30 | 2006-08-29 | Microsoft Corporation | Image retrieval systems and methods with semantic and feature based relevance feedback |
| US7200243B2 (en) | 2002-06-28 | 2007-04-03 | The United States Of America As Represented By The Secretary Of The Army | Spectral mixture process conditioned by spatially-smooth partitioning |
| GB2395808A (en) * | 2002-11-27 | 2004-06-02 | Sony Uk Ltd | Information retrieval |
| US7315833B2 (en) | 2003-01-16 | 2008-01-01 | Rosetta Holdings, Llc | Graphical internet search system and methods |
| US20090132347A1 (en) | 2003-08-12 | 2009-05-21 | Russell Wayne Anderson | Systems And Methods For Aggregating And Utilizing Retail Transaction Records At The Customer Level |
| US7231399B1 (en) | 2003-11-14 | 2007-06-12 | Google Inc. | Ranking documents based on large data sets |
| US7480640B1 (en) | 2003-12-16 | 2009-01-20 | Quantum Leap Research, Inc. | Automated method and system for generating models from data |
| US8868405B2 (en) | 2004-01-27 | 2014-10-21 | Hewlett-Packard Development Company, L. P. | System and method for comparative analysis of textual documents |
| US7711679B2 (en) | 2004-07-26 | 2010-05-04 | Google Inc. | Phrase-based detection of duplicate documents in an information retrieval system |
| US20060212415A1 (en) | 2005-03-01 | 2006-09-21 | Alejandro Backer | Query-less searching |
| US9400838B2 (en) | 2005-04-11 | 2016-07-26 | Textdigger, Inc. | System and method for searching for a query |
| US7813581B1 (en) | 2005-05-06 | 2010-10-12 | Fitzpatrick Ben G | Bayesian methods for noise reduction in image processing |
| US7734644B2 (en) | 2005-05-06 | 2010-06-08 | Seaton Gras | System and method for hierarchical information retrieval from a coded collection of relational data |
| GB0524572D0 (en) | 2005-12-01 | 2006-01-11 | Univ London | Information retrieval |
| US8065286B2 (en) | 2006-01-23 | 2011-11-22 | Chacha Search, Inc. | Scalable search system using human searchers |
| US7567960B2 (en) | 2006-01-31 | 2009-07-28 | Xerox Corporation | System and method for clustering, categorizing and selecting documents |
| US8725711B2 (en) | 2006-06-09 | 2014-05-13 | Advent Software, Inc. | Systems and methods for information categorization |
| US7870140B2 (en) | 2006-06-12 | 2011-01-11 | D&S Consultants, Inc. | System and method of incorporating user preferences in image searches |
| US20080126464A1 (en) | 2006-06-30 | 2008-05-29 | Shahin Movafagh Mowzoon | Least square clustering and folded dimension visualization |
| US8676802B2 (en) | 2006-11-30 | 2014-03-18 | Oracle Otc Subsidiary Llc | Method and system for information retrieval with clustering |
| US8150822B2 (en) | 2007-01-09 | 2012-04-03 | Favoweb Ltd. | On-line iterative multistage search engine with text categorization and supervised learning |
| US8027541B2 (en) | 2007-03-15 | 2011-09-27 | Microsoft Corporation | Image organization based on image content |
| WO2008118977A1 (en) | 2007-03-26 | 2008-10-02 | Desert Research Institute | Data analysis process |
| US7617195B2 (en) | 2007-03-28 | 2009-11-10 | Xerox Corporation | Optimizing the performance of duplicate identification by content |
| US7814107B1 (en) | 2007-05-25 | 2010-10-12 | Amazon Technologies, Inc. | Generating similarity scores for matching non-identical data strings |
| US8170966B1 (en) | 2008-11-04 | 2012-05-01 | Bitdefender IPR Management Ltd. | Dynamic streaming message clustering for rapid spam-wave detection |
| US8254697B2 (en) | 2009-02-02 | 2012-08-28 | Microsoft Corporation | Scalable near duplicate image search with geometric constraints |
| US20100293117A1 (en) | 2009-05-12 | 2010-11-18 | Zuobing Xu | Method and system for facilitating batch mode active learning |
| US8447760B1 (en) | 2009-07-20 | 2013-05-21 | Google Inc. | Generating a related set of documents for an initial set of documents |
| US8572084B2 (en) | 2009-07-28 | 2013-10-29 | Fti Consulting, Inc. | System and method for displaying relationships between electronically stored information to provide classification suggestions via nearest neighbor |
| US9384214B2 (en) | 2009-07-31 | 2016-07-05 | Yahoo! Inc. | Image similarity from disparate sources |
| US8352465B1 (en) | 2009-09-03 | 2013-01-08 | Google Inc. | Grouping of image search results |
| CN102640146B (en) | 2009-09-11 | 2017-07-14 | 萨姆万斯集团知识产权控股私人有限公司 | Database search method, system and controller |
| JP5546819B2 (en) | 2009-09-16 | 2014-07-09 | 株式会社東芝 | Pattern recognition method, character recognition method, pattern recognition program, character recognition program, pattern recognition device, and character recognition device |
| US20110246409A1 (en) | 2010-04-05 | 2011-10-06 | Indian Statistical Institute | Data set dimensionality reduction processes and machines |
| GB201011062D0 (en) | 2010-07-01 | 2010-08-18 | Univ Antwerpen | Method and system for using an information system |
| CN102419755B (en) | 2010-09-28 | 2013-04-24 | 阿里巴巴集团控股有限公司 | Method and device for sorting search results |
| US8891878B2 (en) | 2012-06-15 | 2014-11-18 | Mitsubishi Electric Research Laboratories, Inc. | Method for representing images using quantized embeddings of scale-invariant image features |
| CN104115149B (en) | 2012-02-07 | 2017-11-17 | 皇家飞利浦有限公司 | Interactive optimization for the scan database of statistics test |
| US9208219B2 (en) | 2012-02-09 | 2015-12-08 | Stroz Friedberg, LLC | Similar document detection and electronic discovery |
| CN103336770B (en) | 2012-02-28 | 2017-03-01 | 国际商业机器公司 | Method and system for identification of complementary data object |
| US20140019431A1 (en) | 2012-07-13 | 2014-01-16 | Deepmind Technologies Limited | Method and Apparatus for Conducting a Search |
| GB201212518D0 (en) | 2012-07-13 | 2012-08-29 | Deepmind Technologies Ltd | Method and apparatus for image searching |
| CN103530649A (en) * | 2013-10-16 | 2014-01-22 | 北京理工大学 | Visual searching method applicable mobile terminal |
| US20150331908A1 (en) * | 2014-05-15 | 2015-11-19 | Genetic Finance (Barbados) Limited | Visual interactive search |
| US10102277B2 (en) * | 2014-05-15 | 2018-10-16 | Sentient Technologies (Barbados) Limited | Bayesian visual interactive search |
| US9798780B2 (en) | 2014-09-30 | 2017-10-24 | University Of Helsinki | Low-dimensional information discovery and presentation system, apparatus and method |
| US20160350336A1 (en) | 2015-05-31 | 2016-12-01 | Allyke, Inc. | Automated image searching, exploration and discovery |
-
2014
- 2014-09-23 US US14/494,364 patent/US20150331908A1/en not_active Abandoned
-
2015
- 2015-05-04 CN CN201580038513.4A patent/CN107209762B/en not_active Expired - Fee Related
- 2015-05-04 DE DE112015002286.4T patent/DE112015002286T9/en not_active Withdrawn - After Issue
- 2015-05-04 TW TW104114144A patent/TW201606537A/en unknown
- 2015-05-04 WO PCT/IB2015/001267 patent/WO2015173647A1/en not_active Ceased
- 2015-05-04 JP JP2016567798A patent/JP2017518570A/en active Pending
- 2015-05-04 GB GB1621341.5A patent/GB2544660A/en not_active Withdrawn
- 2015-05-04 US US15/311,163 patent/US10503765B2/en active Active
- 2015-05-04 EP EP15760512.2A patent/EP3143523B1/en not_active Not-in-force
-
2019
- 2019-11-12 US US16/681,514 patent/US11216496B2/en not_active Expired - Fee Related
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5930784A (en) * | 1997-08-21 | 1999-07-27 | Sandia Corporation | Method of locating related items in a geometric space for data mining |
| US20020091678A1 (en) * | 2001-01-05 | 2002-07-11 | Miller Nancy E. | Multi-query data visualization processes, data visualization apparatus, computer-readable media and computer data signals embodied in a transmission medium |
| US20020164078A1 (en) * | 2001-03-23 | 2002-11-07 | Fujitsu Limited | Information retrieving system and method |
| US20100134415A1 (en) * | 2008-11-28 | 2010-06-03 | Sony Corporation | Image processing apparatus, image displaying method, and image displaying program |
Cited By (27)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11216496B2 (en) * | 2014-05-15 | 2022-01-04 | Evolv Technology Solutions, Inc. | Visual interactive search |
| US20160104074A1 (en) * | 2014-10-10 | 2016-04-14 | Yahoo! Inc. | Recommending Bidded Terms |
| US20170213249A1 (en) * | 2014-10-15 | 2017-07-27 | Toor Inc. | Information display method and information display device |
| US10402750B2 (en) * | 2015-12-30 | 2019-09-03 | Facebook, Inc. | Identifying entities using a deep-learning model |
| CN108604315A (en) * | 2015-12-30 | 2018-09-28 | 脸谱公司 | Identify entities using a deep learning model |
| US10762439B2 (en) * | 2016-07-26 | 2020-09-01 | International Business Machines Corporation | Event clustering and classification with document embedding |
| US11151203B2 (en) * | 2017-02-28 | 2021-10-19 | Apple Inc. | Interest embedding vectors |
| US20180253496A1 (en) * | 2017-02-28 | 2018-09-06 | Laserlike Inc. | Interest embedding vectors |
| US20190258719A1 (en) * | 2017-02-28 | 2019-08-22 | Laserlike, Inc. | Emoji classifier |
| US10744372B2 (en) * | 2017-03-03 | 2020-08-18 | Cognizant Technology Solutions U.S. Corporation | Behavior dominated search in evolutionary search systems |
| US11247100B2 (en) * | 2017-03-03 | 2022-02-15 | Cognizant Technology Solutions U.S. Corporation | Behavior dominated search in evolutionary search systems |
| US20180250554A1 (en) * | 2017-03-03 | 2018-09-06 | Sentient Technologies (Barbados) Limited | Behavior Dominated Search in Evolutionary Search Systems |
| US10600208B2 (en) | 2017-12-21 | 2020-03-24 | Industrial Technology Research Institute | Object detecting device, object detecting method and non-transitory computer-readable medium |
| US11995559B2 (en) | 2018-02-06 | 2024-05-28 | Cognizant Technology Solutions U.S. Corporation | Enhancing evolutionary optimization in uncertain environments by allocating evaluations via multi-armed bandit algorithms |
| US11294974B1 (en) * | 2018-10-04 | 2022-04-05 | Apple Inc. | Golden embeddings |
| US11783195B2 (en) | 2019-03-27 | 2023-10-10 | Cognizant Technology Solutions U.S. Corporation | Process and system including an optimization engine with evolutionary surrogate-assisted prescriptions |
| US20220164381A1 (en) * | 2019-03-29 | 2022-05-26 | Semiconductor Energy Laboratory Co., Ltd. | Image retrieval system and image retrieval method |
| US12061644B2 (en) * | 2019-03-29 | 2024-08-13 | Semiconductor Energy Laboratory Co., Ltd. | Image retrieval system and image retrieval method |
| US11860928B2 (en) | 2019-09-11 | 2024-01-02 | International Business Machines Corporation | Dialog-based image retrieval with contextual information |
| US11100145B2 (en) * | 2019-09-11 | 2021-08-24 | International Business Machines Corporation | Dialog-based image retrieval with contextual information |
| US11481448B2 (en) * | 2020-03-31 | 2022-10-25 | Microsoft Technology Licensing, Llc | Semantic matching and retrieval of standardized entities |
| US20210312297A1 (en) * | 2020-04-07 | 2021-10-07 | Cognizant Technology Solutions U.S. Corporation | Framework For Interactive Exploration, Evaluation, and Improvement of AI-Generated Solutions |
| US12099934B2 (en) * | 2020-04-07 | 2024-09-24 | Cognizant Technology Solutions U.S. Corporation | Framework for interactive exploration, evaluation, and improvement of AI-generated solutions |
| US11775841B2 (en) | 2020-06-15 | 2023-10-03 | Cognizant Technology Solutions U.S. Corporation | Process and system including explainable prescriptions through surrogate-assisted evolution |
| US12424335B2 (en) | 2020-07-08 | 2025-09-23 | Cognizant Technology Solutions U.S. Corporation | AI based optimized decision making for epidemiological modeling |
| US20220092515A1 (en) * | 2020-09-24 | 2022-03-24 | The trustee of the Thomas J. Watson Foundation, DBA Watson Foundation, a Delaware charitable trust, | Personal, professional, cultural (ppc) insight system |
| US12361348B2 (en) * | 2020-09-24 | 2025-07-15 | The trustee of the Thomas J. Watson Foundation | Personal, professional, cultural (PPC) insight system |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2015173647A1 (en) | 2015-11-19 |
| CN107209762B (en) | 2021-02-12 |
| GB201621341D0 (en) | 2017-02-01 |
| TW201606537A (en) | 2016-02-16 |
| GB2544660A (en) | 2017-05-24 |
| US20170075958A1 (en) | 2017-03-16 |
| JP2017518570A (en) | 2017-07-06 |
| DE112015002286T9 (en) | 2017-07-06 |
| CN107209762A (en) | 2017-09-26 |
| EP3143523B1 (en) | 2020-07-08 |
| US20200081906A1 (en) | 2020-03-12 |
| US10503765B2 (en) | 2019-12-10 |
| DE112015002286T5 (en) | 2017-04-06 |
| EP3143523A1 (en) | 2017-03-22 |
| US11216496B2 (en) | 2022-01-04 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11216496B2 (en) | Visual interactive search | |
| US12099542B2 (en) | Implementing a graphical user interface to collect information from a user to identify a desired document based on dissimilarity and/or collective closeness to other identified documents | |
| US10102277B2 (en) | Bayesian visual interactive search | |
| US10606883B2 (en) | Selection of initial document collection for visual interactive search | |
| US10909459B2 (en) | Content embedding using deep metric learning algorithms | |
| US20170039198A1 (en) | Visual interactive search, scalable bandit-based visual interactive search and ranking for visual interactive search | |
| US10346727B2 (en) | Utilizing a digital canvas to conduct a spatial-semantic search for digital visual media | |
| Wang et al. | Assistive tagging: A survey of multimedia tagging with human-computer joint exploration | |
| US9262445B2 (en) | Image ranking based on attribute correlation | |
| Shen et al. | Large-scale item categorization for e-commerce | |
| US10657162B2 (en) | Method and system for visualizing documents | |
| US20090150376A1 (en) | Mutual-Rank Similarity-Space for Navigating, Visualising and Clustering in Image Databases | |
| Sergieh et al. | Geo-based automatic image annotation | |
| US12282516B2 (en) | Faceted navigation | |
| Lu et al. | Browse-to-search: Interactive exploratory search with visual entities | |
| US11874868B2 (en) | Generating and presenting multi-dimensional representations for complex entities | |
| WO2017064561A2 (en) | Selection of initial document collection for visual interactive search | |
| Nath et al. | A survey on personal image retrieval systems | |
| WO2017064563A2 (en) | Visual interactive search, scalable bandit-based visual interactive search and ranking for visual interactive search | |
| WO2017098475A1 (en) | Bayesian visual interactive search | |
| Mousselly et al. | Geo-based Automatic Image Annotation | |
| Patil et al. | Implementation of Web Image Re-ranking by Preserving Relevance | |
| Kienreich et al. | Challenges in Knowledge Discovery: Structured Repositories and Multimedia Content |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AS | Assignment |
Owner name: GENETIC FINANCE (BARBADOS) LIMITED, BARBADOS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DUFFY, NIGEL;REEL/FRAME:033801/0520 Effective date: 20140923 |
|
| AS | Assignment |
Owner name: SENTIENT TECHNOLOGIES (BARBADOS) LIMITED, BARBADOS Free format text: CHANGE OF NAME;ASSIGNOR:GENETIC FINANCE (BARBADOS) LIMITED;REEL/FRAME:034781/0900 Effective date: 20141202 |
|
| STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
|
| AS | Assignment |
Owner name: EVOLV TECHNOLOGY SOLUTIONS, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SENTIENT TECHNOLOGIES HOLDINGS LIMITED;REEL/FRAME:058200/0304 Effective date: 20190701 Owner name: SENTIENT TECHNOLOGIES HOLDINGS LIMITED, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SENTIENT TECHNOLOGIES (BARBADOS) LIMITED;REEL/FRAME:058200/0206 Effective date: 20190627 |