[go: up one dir, main page]

HK1072114B - Hierarchical data-driven havigation system and method for information retrieval - Google Patents

Hierarchical data-driven havigation system and method for information retrieval Download PDF

Info

Publication number
HK1072114B
HK1072114B HK05104843.6A HK05104843A HK1072114B HK 1072114 B HK1072114 B HK 1072114B HK 05104843 A HK05104843 A HK 05104843A HK 1072114 B HK1072114 B HK 1072114B
Authority
HK
Hong Kong
Prior art keywords
attribute
navigational
navigational state
value pair
state
Prior art date
Application number
HK05104843.6A
Other languages
Chinese (zh)
Other versions
HK1072114A1 (en
Inventor
亚当.J.费拉里
戴维.郭利
凯斯.约翰森
弗莱德里克.C.纳比
丹尼尔.图恩克朗
约翰.S.沃尔特
Original Assignee
恩迪卡技术公司
Filing date
Publication date
Application filed by 恩迪卡技术公司 filed Critical 恩迪卡技术公司
Priority claimed from PCT/US2001/017046 external-priority patent/WO2002097671A2/en
Publication of HK1072114A1 publication Critical patent/HK1072114A1/en
Publication of HK1072114B publication Critical patent/HK1072114B/en

Links

Description

Hierarchical data-driven navigation system and method for information retrieval
Technical Field
The present invention relates generally to information navigation systems and search engines.
Background
Information retrieval from information databases is an increasingly challenging problem, particularly on the World Wide Web (WWW), as enhanced computing power and network architectures allow for the aggregation of large amounts of information and wide access to that information. The purpose of the information retrieval process is to allow identification of material of interest to the user.
As the amount of material that a user can retrieve increases, identifying material relevant to a search becomes increasingly important, but difficult. Challenges posed by the information retrieval process include providing an intuitive, flexible user interface and accurately identifying material relevant to a user's needs in a reasonable amount of time. The information retrieval process involves two related technical aspects, namely information organization and access.
Current information navigation systems generally follow one of three paradigms. An information navigation system employs a database query system. In a typical database query system, a user formulates a structured query by specifying for fixed data fields, and the system enumerates documents whose data fields contain those values. Com uses such an interface. Typically, database query systems provide a table-based interface for users, convert table input into a query in a formal database language, such as SQL, and then execute the query on a relational database management system. Disadvantages of typical query-based systems include that they allow users to make queries that do not return documents and that they provide query modification options that result only in additional result set (documents corresponding to the user's search specification) limitations, without expanding or augmenting the result set.
The second type of information navigation system is a free-text search engine. In a typical full-text search engine, a user typically enters an arbitrary text string in the form of a boolean expression, and the system responds by listing documents that contain matching text. Com includes a full text search engine, for example. Typically, full-text search engines provide a user with a search form, typically a single line, and process queries using a pre-computed index. Typically, such an index associates each document with most of the words contained in that document, without independently considering the content of the document. Thus, the result set is typically a massive, unorganized list of mixed relevant and irrelevant documents. While variations have been developed that attempt to determine the goals of a user query and provide a relevant ranking of results or narrow or organize the result set, these systems are limited and unreliable in achieving these goals.
A third type of information navigation system is a tree-based directory. In tree-based directories, a user typically starts with the root node of the tree and specifies a query by successively selecting refined branches leading to other nodes in the tree. For example, shoping.yahoo.com uses tree-based directories. In a typical implementation, a hard-coded tree (hard-coded tree) is stored in a data structure, and the same or another data structure maps documents to the node or nodes of the tree in which they reside. A particular document is typically accessed through the tree from only one or at least a few paths. The set of navigational state is relatively static-although documents are typically added to nodes in a directory, the structure of the directory typically remains the same. In a pure tree-based directory, directory nodes are arranged as a single root node from which all users start, and every other directory node (directorynode) that can only be reached via a unique sequence of branches that the user selects from the root node. The branches of such directory-imposed trees must be navigation-disjoint restrictions-even though the manner in which documents are assigned to disjoint branches is not intuitive to the user. This notion can be solved by adding additional links to convert the tree into an acyclic directed graph. Updating directory structures is still a difficult task and leaf nodes in particular tend to end up with a large number of corresponding documents.
In all of these types of navigation systems, it is difficult for a user to efficiently modify a query after browsing its result set. In a database query system, a user may add or remove terms from a query, but it is often difficult for the user to avoid specifying too few queries (i.e., too many results) or too many queries (i.e., no results). In a full-text search engine, the same problem arises. In tree-based directories, the only way for a user to modify a query is to narrow it by selecting a branch or summarize it by rolling back to a previous branch.
Various other systems for information retrieval are also available. For example, U.S. Pat. Nos. 5,715,444 and 5,983,219 to Danish et al, both entitled "Method and Systmeor Executing a Guided parameter Search," disclose an interface for identifying individual items from a series of items. The interface provides the user with a set list of features that exist in a list of items and identifies items that satisfy the selected characteristics.
Disclosure of Invention
The hierarchical data-driven information navigation system and method of the present invention enables navigation of a set of documents or other material using certain common attributes associated with those materials. The navigation system interface allows a user to select attribute values for use in connection with the material in the current navigation state and return the material corresponding to the user selection. The present invention initiates this navigation mode by associating items (attribute-value pairs) with a document, defining a hierarchically refined set of relationships in the items (i.e., partial order), and providing a guided navigation mechanism based on the association of items with the document and the relationships in the items.
The present invention includes several components and features related to hierarchical data driven navigation systems. Among these are user interfaces, knowledge bases, methods for generating and maintaining knowledge bases, navigable data structures and methods for generating data structures, WWW-based system applications, and methods of implementing the system. Although the invention is described herein primarily with reference to a WWW-based system for navigating a product database, it should be understood that a similar navigation system can be employed in any database context, where profiles can be related to terms and users can identify profiles of interest through those terms.
The present invention uses a knowledge base of information about a data set to design an interface and employs the interface to guide a user through a set of navigational states by providing relevant navigational options. The knowledge base includes a representation that lists attributes associated with the material, value ranges for each attribute, and partial order associated with the terms (attribute-value pairs). Attribute-value pairs for entertainment-related material may be, for example, product: Movie and director. (in this specification, this attribute represents an attribute-Value pair: Value format; and the navigation state represents a relative set of attribute-Value pairs). The knowledge base also includes a classification map that associates each item in the corpus with a set of terms that characterize that item.
The knowledge base is typically organized by domain, which is a collection of data that conforms to a natural grouping. Preferably, the plurality of attributes selected for a domain to be tractable is sufficient to effectively distinguish and navigate through the material of that domain. The knowledge base preferably includes characteristics for each domain, which may include rules or default expectations relating to the classification of documents in that domain. A particular item may be in more than one domain.
The present invention includes a user interface for navigation. The user interface preferably presents the user navigational state as a set of terms organized by attribute. For a given set of terms, the user interface displays information about those terms and displays relevant navigation options for narrowing down or summarizing the navigation state. In one aspect of the invention, a user navigates through a corpus by selecting and deselecting terms.
In one aspect of the invention, the user interface responds immediately to selecting or deselecting a term without waiting for the user to indicate and submit a complex query composed of multiple terms. Once the query has been executed, the user may narrow down the navigation state (navigation state) by selecting additional terms or by refining existing terms. In addition, the user may expand the navigational state by deselecting terms that have already been selected or by generalizing terms. In a preferred embodiment, the user can expand the navigational state by deselecting terms in an order other than their selection. For example, a user may start with { products: Movies } and narrow down to { products: Movies; again, reduce to { product: Movies; genre is Drama; director: Spike Lee }, and then generalizing to { product: Movies; director is Spike Lee }.
In another aspect of the invention, the user interface allows a user to use a full-text search to find terms of interest. In another aspect of the invention, the user interface also allows the user to use a full-text search on descriptive information related to the material.
In another aspect of the invention, the user interface displays contextually relevant navigation options to the user for zooming out of the navigational state. The user interface does not display to the user the terms whose selections do not correspond to the document in the final navigational state. The user interface displays to the user only terms associated with at least one item in the current navigational state. At the same time, the user interface displays new navigation options as they become relevant. The knowledge base may contain rules that determine when particular attributes or terms become available to the user for navigation.
In another aspect of the invention, the knowledge base includes a catalog of standard representations that have been aggregated from the knowledge base, for example, when the knowledge base corresponds to products available for purchase from various sources.
In another aspect of the invention, the knowledge base may include library definitions, making up a simultaneously searchable collection of materials. The library may include documents from one or more domains. Items may be assigned to more than one library. The knowledge base may include rules that are customized for navigation of a particular base.
In another aspect of the invention, the knowledge base is developed through a multi-stage, iterative process. Workflow management allocates resources to minimize the efficiency of generating and maintaining knowledge bases.
A knowledge base is used to generate navigation-enabled data structures from the corpus. In one aspect of the invention, the navigation system consists of a hierarchy of navigation states (i.e., partial order) that maps sets of terms to datasets related to those terms. In another aspect of the invention, navigational states are associated by a transition corresponding to a narrowing of terms from one navigational state to another. The navigational state may be calculated in whole or in part in advance, or may be calculated in whole at runtime.
According to one aspect of the present invention, there is provided a computer-implemented method for retrieving information associated with a set of materials, the method comprising: storing a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material; and
calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of calculated navigational states, there are at least the following cases,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material characterized by the third attribute;
storing the calculated navigational state in a data structure;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state;
accepting a query directed to the material; and
in response to the query, the stored navigational state is retrieved.
According to another aspect of the present invention, there is provided a computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing, in a first data structure, a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material; and
wherein the first data structure allows for dynamically calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material in the navigational system characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material in the navigational system characterized by the third attribute;
generating a set of pre-computed navigational states;
storing a set of pre-computed navigational states in a second data structure, the set of pre-computed navigational states stored in the second data structure including at least one of the first navigational state and the second navigational state;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state;
accepting a query directed to the material; and
in response to the query, either the stored, pre-computed navigational state is retrieved or one of a plurality of navigational states is dynamically computed using the first data structure.
According to another aspect of the present invention, there is provided a computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing, in a data structure, a plurality of attribute-value pairs associated with the profile, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the profile; and
dynamically computing, at runtime, a plurality of navigational states using the data structure in response to a plurality of sequential queries, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material characterized by the third attribute;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state; and
in response to each query, the computed navigational state is displayed.
According to another aspect of the present invention, there is provided a computer program product residing in a computer readable medium for retrieving information associated with a set of materials, the computer program product comprising instructions for causing a computer to:
storing a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material; and
calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of calculated navigational states, there are at least the following cases,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material characterized by the third attribute;
storing the calculated navigational state in a data structure;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state;
accepting a query directed to the material; and
in response to the query, the stored navigational state is retrieved.
According to another aspect of the present invention, there is provided a computer program product residing in a computer readable medium for retrieving information associated with a set of materials, the computer program product comprising instructions for causing a computer to:
storing in a data structure a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material, an
Dynamically computing, at runtime, a plurality of navigational states using the data structure in response to a plurality of sequential queries, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material characterized by the third attribute;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state; and
in response to each query, the computed navigational state is displayed.
According to another aspect of the present invention, there is provided a computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing, in a first data structure, a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material; and
wherein the first data structure allows for dynamically calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material in the navigational system characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material in the navigational system characterized by the third attribute;
generating a set of pre-computed navigational states;
storing a set of pre-computed navigational states in a second data structure, the set of pre-computed navigational states stored in the second data structure including at least one of the first navigational state and the second navigational state;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provides a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state, wherein a first transition provided in the second navigational state corresponds to a first navigational attribute for which no transition is provided in the first navigational state;
accepting a query directed to the material; and
in response to the query, either the stored, pre-computed navigational state is retrieved or one of a plurality of navigational states is dynamically computed using the first data structure.
According to another aspect of the present invention, there is provided a computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing a plurality of attribute-value pairs associated with the material in a first data structure, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material, an
Wherein the first data structure allows for dynamically calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material in the navigational system characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material in the navigational system characterized by the third attribute;
generating a set of pre-computed navigational states;
storing a set of pre-computed navigational states in a second data structure, the set of pre-computed navigational states stored in the second data structure including at least one of the first navigational state and the second navigational state;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state, the interface further providing a full text search tool that enables a full text search of a plurality of attributes and a plurality of values;
accepting a query directed to the material; and
in response to the query, either the stored, pre-computed navigational state is retrieved or one of a plurality of navigational states is dynamically computed using the first data structure.
Drawings
The present invention, including those and other features, will be more fully understood from the following description and drawings, in which:
fig. 1 is a view of a user interface of a navigation system according to an embodiment of the present invention.
FIG. 2 is a view of the user interface of FIG. 1 showing a drop-down selection list of navigational terms.
FIG. 3 is a view of the user interface of FIG. 1 showing a navigational state.
FIG. 4 is a view of the user interface of FIG. 1 showing a navigational state.
FIG. 5 is a view of the user interface of FIG. 1 showing a navigational state.
FIG. 6 is a view of the user interface of FIG. 1 showing a navigational state.
FIG. 7 is a view of the user interface of FIG. 1 showing a navigational state.
FIG. 8 is a view of the user interface of FIG. 1 showing a navigational state.
FIG. 9 is a view of the user interface of FIG. 1, showing results for a full text search of terms.
FIG. 10 is a view of the user interface of FIG. 1 showing information about a particular document.
11A-11C are representative examples of how attribute value ranges can be partially ordered in accordance with embodiments of the present invention.
FIG. 12 is a block diagram of a process for selecting and classifying documents according to an embodiment of the present invention.
FIG. 13 is a table illustrating how a set of documents may be classified according to an embodiment of the present invention.
FIG. 14 is a representative partial order (partialorder) of navigation states, according to an embodiment of the invention.
FIG. 15 is a block diagram of a process for pre-computing navigational states, according to an embodiment of the present invention.
Detailed Description
User interface
FIG. 1 shows a user interface 10 for a hierarchical, data-driven navigation system, according to one embodiment of the present invention. The navigation system operates on a set of documents defined in a knowledge base. As shown, it is preferable to provide the user with at least two alternative methods of using the navigation system: (1) navigate through the corpus of documents by selecting terms, or (2) enter desired keywords in a search box.
The navigation system preferably organizes the documents by domain. According to one embodiment of the invention, the user interface 10 as shown in FIGS. 1-10 operates on a set of documents that are part of a wine domain. Preferably, the domain defines a portion of the document set that reflects natural grouping. Typically, the set of attributes used to classify a document in a domain will be a tractable subset of the attributes used to classify the entire set of documents. The domain definition may be a product type, such as wine or consumer electronics. The domain may be divided into subdomains to further organize the document set. For example, the consumer electronics domain can be divided into television sub-domains, stereo equipment, and so on. The document may correspond to a product or service.
The user interface may allow the user to navigate through one domain at a time. In addition, the user interface may allow for navigation of multiple domains simultaneously, particularly when certain attributes are common to multiple domains.
The user interface allows a user to navigate through the set of navigational states. Each state consists of a set of terms and a set of documents related to those terms. The user navigates through the set of navigational states by selecting and deselecting terms to obtain a navigational state corresponding to each set of selected terms. Preferably, as in FIG. 4, the user interface 10 displays a navigation state by displaying a list 50 of terms 52 and a list 41 of some or all of the documents 42 corresponding to that state. Preferably, the user interface displays the terms 52 of the navigational state organized by attribute. Preferably, the initial navigational state is a root state corresponding to no-term selection, and thus, all documents in the collection.
As shown in FIG. 2, the user interface 10 allows the user to narrow down the navigational state by selecting a value 28 for the attribute 22, or by replacing the currently selected value with a more specific one, if appropriate. Preferably, the user interface 10 presents the user with options, preferably with related terms organized by attribute, that can be used to narrow down the current navigational state. In some embodiments of the present invention, as shown in FIG. 2, the user can select a value 28 from a drop down list 26 represented by indicators 24 organized by attributes 22 in the current navigational state. The user interface 10 may display these navigation options in various formats. For example, the values can be presented as graphics or symbols rather than text. The interface may allow for any method of selecting items, such as mouse clicks, keystrokes, or voice commands. The interface may be provided by various media and devices, such as a television or WWW, or a telephone or wireless device. Although primarily discussed herein as a visual interface, the interface may include an audio portion or be primarily audio-based.
Preferably, in the current navigation state, the user interface displays only the option to narrow the navigation state that will result in the navigation state having the smallest one document. This optimal criteria for providing navigation options ensures that there is no "impasse," or navigation state corresponding to an empty result set.
Preferably, the user interface may display options for narrowing the navigational state if they result in a navigational state with exactly fewer documents than currently. This is done to ensure that the user interface does not have to display to the user the selections that have been implied by the term in the current navigational state.
Preferably, upon the user selecting the term 28 to narrow the current navigational state, the user interface displays the new navigational state without any further triggering action by the user. Because the system responds to each user with immediate feedback, the user does not need to formulate a comprehensive query and then submit the query.
According to one embodiment of the invention, as shown in FIGS. 3 and 4, the user interface 10 may expand the current navigation state by allowing the user to remove terms 52 from the list 50 of selected terms. For example, the interface 10 may provide the list 50 with a check box 54 and button 56 for de-selection to trigger a new selection. In the illustrated embodiment, the user may remove selected terms 52 in any order and may remove less than one selection 52 at a time.
Preferably, the navigation options displayed to the user are contextually relevant. For example, a term refining a previously selected term may become a navigation option in the final navigation state. For example, referring to FIG. 5, after selecting the terms Flavors: Wood and Nut Flavors52 (the user has selected the values of the attributes Flavors Wood and Nut Flavors23), Wood and NutFlavrs 23 appear in the interface of the new navigational state in the property list 20 and allow the selection of the values associated with that particular property to further refine the query. The user interface may also display certain properties that were not initially displayed when they become newly relevant. For example, comparing FIG. 3 with FIG. 2, the attribute French Vineyards25 appears in the list of attributes 20 only after the user has selected the term Region: French Region in the pre-navigation state. In this way, attributes can be embedded into multiple levels as desired. Displaying attributes as navigation options when those attributes become relevant avoids overwhelming the user with navigation options before those options become meaningful.
Additionally, for some attributes 22, multiple rationless (non-refined) selection values 28 may be applied. For example, for the properties of _ flag, the values of _ priority and Nutty, none of which refines the other, may be selected so that the terms _ flag: priority and _ flag: Nutty narrow the navigation state. Thus, users can sometimes refine a query by selecting multiple values under a single attribute.
Preferably, certain attributes that are navigation options will be removed if they are no longer valid or useful choices. For example, if all documents in the result set share a common term (other than the term selected to reach the navigational state), then selecting that term will not further refine the result set, and thus the attribute associated with that term as a navigational option is removed. For example, comparing FIG. 6 with FIG. 4, the attribute WineTypes27 has been removed as a navigation option because all documents 42 in the result set share the same terminology WineTypes: applied Wines. In a preferred embodiment, an additional feature of the interface 10 is to display this information to the user as a common feature of the documents 42 in the result set. For example, referring to FIG. 6, the interface 10 includes a display 60 that represents common characteristics of the documents 42 in the result set. Removing terms as navigation options prevents the user from wasting time by choosing not to refine the result set when all documents in the result set share that term.
Preferably, the user interface may remove values as navigation options if their selection would result in no documents in the result set. For example, comparing FIG. 8 with FIG. 7, when the user selects the term Wine Spectrator Range: after 95-100, the user interface removes all values 28, 29 in the list 26 of values for the property applications 22 as a navigation option, except for the values AlexanderValley29 and Napa Valley 29. Alexander Valley29 and Napa Valley29 are just two values in the list 26 of values for the attribute applications used to return at least one document in the result set; all other values 28 return an empty set. Removing values as navigation options that would result in an empty result set saves the user time by preventing the user from reaching an impasse.
Preferably, the user interface allows the user to search for the desired word using a full text search. According to one embodiment of the present invention, as shown in FIG. 9, the search box 30 preferably allows the user to perform a full text search for the term of interest, rather than performing a full text search of the document itself. Preferably, the user interface responds to such a search by displaying a list 32 including terms 34 organized by attributes 36 and allowing the user to select from them. Preferably, the user interface responds to the user selection by displaying to the user the navigational status corresponding to the selected term. The user may then navigate from that state (i.e., by narrowing or enlarging it), or perform additional full text searches for terms.
Preferably, the user interface 10 displays all or a portion of the document list 41 corresponding to the current navigation state. Preferably, if a user is interested in a particular document 42, the user may select it and obtain a record 70 containing additional information about it, including a list 72 of terms 74 associated with that document, as shown in FIG. 10. Preferably, the user interface 10 allows the user to select any subset of those terms 74 in order to navigate to a navigational state corresponding to the selected set of terms.
Preferably, the user interface 10 also provides navigation options that link directly to relevant but not necessarily summarize or refine the relevant navigation states of the present invention. These links preferably infer user interests from the present navigational state and allow the user to span related topics. For example, if the user is viewing a particular navigational state in a food product domain, the link may direct the user to a navigational state of wine in a wine domain that makes those food products more comprehensive.
Although the interface of the navigation system has been described herein as the user interface 10, the interface can provide other forms of access to the navigation system. In further embodiments, the interface may be an application program interface to allow access to the navigation system for or through other applications. The interface may also enhance the functionality of a stand-alone data-oriented application. The interface may also be used in the context of WWW-based applications or XML-based applications. The navigation system may also support multiple interface modes simultaneously. The navigation system may be made available in a number of ways, for example, via wireless communication or on a handheld device.
Knowledge base
Preferably, the navigation system stores all information related to navigation in a knowledge base. The knowledge base is a library of information from two processes: taxonomically defined classification. Taxonomic definition is the process of identifying relevant attributes to characterize a document, determining acceptable values for those attributes (such as a list or range of values), and defining the partial order of refined relationships in terms (attribute-value pairs). Classification is the process of associating terms with documents. A knowledge base may also be used to maintain any information assets that support the two processes, such as domains, classification rules, and default expectations. In addition, a knowledge base may be used to maintain complementary information and profiles that affect the user's navigation experience.
The taxonomy definition process identifies a set of attributes that properly characterize the document. A typical approach to the process of histological taxonomy definition is to arrange a set of documents into domains that conform to a naturally grouped set of documents, and for that domain, a tractable number of attributes is sufficient to effectively distinguish and navigate among the documents in that domain. The knowledge base preferably includes characteristics for each domain, which may include rules or default expectations relating to the classification of documents in that domain.
The taxonomy definition process also identifies a full set of values at varying feature levels for each attribute, as appropriate. These values preferably identify specific attributes of the documents in the collection. These values may be explicitly enumerated or implicitly defined. For example, for a "color" attribute, a full set of valid color values may be specified, but for a "price" or "date" attribute, a range or generic date type may be specified in which the value falls, without defining a range. The process of identifying those values may include studying domains or analyzing document sets.
The taxonomy definition process also defines the partial order of refined relationships in terms (attribute-value pairs). For example, the term Origin: France can refine the term Origin: Europe. The refinement relationship is transitive and antisymmetric, but not necessarily all. Transitivity means that if the term a refines the term B and the term B refines the term C, then the term a refines the term C. For example, if Origin: Paris refines Origin: France and Origin: France refines Origin: Europe, then Origin: Paris refines Origin: Europe. Antisymmetric means that if the two terms are different, the terms cannot be refined to each other. For example, if Origin: Paris refines Origin: France, then Origin: France does not refine Origin: Paris.
In addition, the partial order of the refined relationship in the terminology is not necessarily all one. For example, there can be two terms, Origin: France and Origin: Spain, so that none of the terms refines the other. Two terms with this property are said to be incomparable. In general, two terms are not comparable if a set of two or more terms are not comparable to each other for any pair of different terms selected from that set. Typically, but not necessarily, two terms having different properties will be incomparable.
A set of terms is assumed, the largest term if a term does not refine any other term in the set, and the smallest term if no other term in the set refines it. For example, in the set { Origin: France, Origin: Paris, Origin: Spain, Origin: Madrid }, Origin: France and Origin: Spain are the largest, while Origin: Paris and Origin: Madrid are the smallest. In the knowledge base, a term is a root term if it is not refined to any other term, and a leaf term if it is not refined to other terms.
11A, 11B, and 11C illustrate attributes 112 and values 114 that can be used to classify wine, arranged according to a partial order relationship. Attributes 112 are Type/value, Origin, and Vintage. Each attribute 112 corresponds to the maximum term for that attribute. The attributes 112 can have a fixed set of general partial orderings that allow values to be refined into two or more sets of mutually irrelative values (e.g., Type/value) as compared to an irrelative value (e.g., Vintage), a tree of values (e.g., Origin), or a general partial order. Arrow 113 represents a refined relationship in the value 114.
Attributes and values can be identified and developed in several ways, including manually or automatically processing and analyzing documents. Furthermore, such analysis may be top-down or bottom-up; that is, starting with root terminology and proceeding to leaf terminology, or starting with leaf terminology and striving to proceed with root terminology. Attributes and terms may also be defined for retailers or others interested in having the displayed invention to disseminate messages.
The classification process locates (locate) documents in the navigational state by associating each document with a set of terms. Each document is associated with a set of mutually incomparable terms, such as { Type/value: Chianti, Origin: Italy, Vintage: 1996} and any other descriptive information as required. If the document is related to a specified term. The document will also be related to all terms specifying term refinement.
The classification process may be handled according to various workflows. Documents are sorted in series or in parallel, and the automatic and manual sorting steps may be performed one or more times and in any order. To improve accuracy and throughput, a human expert may be designated as an expert that monitors specific attributes for a particular document subset classification task, or even for a particular subset of documents. In addition, the classification and taxonomy processes can be interleaved, particularly with knowledge gained from one process allowing improvement in the other.
Fig. 12 shows a stage in a possible flow for the classification process 250. The data collection step 252, i.e., the collection of documents for the database, may be performed in a variety of different ways. For example, a retailer having a catalog of products on which the navigation system will operate may provide a set of documents describing their products as a predefined set. In addition, documents may be collected from one source, e.g., one Web site, or from multiple sources, such as multiple Web sites, and then aggregated. If the desired document is a Web page, the document may be collected by crawling the Web appropriately, selecting documents, and deleting documents in the domain that do not fit. The collected documents are formatted and parsed for further processing, at a data conversion step 254. In an automatic classification step 256, the formatted and parsed documents are processed in order to automatically associate the documents with terms. In a manual classification step 258, the human reviewer may verify and modify the automatic classification to ensure quality control. Preferably, any rules or expectations violated in the automatic classification step 256 or the manual classification step 258 will be marked and displayed to a human reviewer as part of the manual classification step 258. If the collected documents are divided into domains, there will typically be rules that specify the minimum or best set of attributes to classify the documents from each domain, as well as other domain-specific classification rules. When the classification process is complete, each document will have a set of terms associated with it that locate the documents in the navigational state set.
In fig. 13, a table 180 represents possible representations of the sorted set of wine bottles. Preferably, each record is associated with a document number 182, a name 184, and associated terms 186, which can be unique identifiers. The names are preferably descriptive information of collections that are allowed to be accessible via a full-text search engine as well as via a term-based navigation system.
In another aspect of the invention, the knowledge base may include a catalog of canonical representations of documents. Each directory record represents a conceptually different item that may be related to one or more documents. A directory allows aggregation of profile information from multiple documents related to an item, which may be from multiple sources. For example, if the same wine is sold by two vendors, and if one vendor provides year of manufacture and geographic location information and the other provides taste information, the information from the two vendors can be combined in a catalog record for that wine. Directories may also improve the efficiency of the classification process by eliminating duplicate configurations. In FIG. 12, a catalog creation step 260 associates the classified documents with catalog records and, when appropriate, creates new catalog records. For ease of reference, an entry may be uniquely identified in a directory by a unique identifier.
The knowledge base may also define a library, where a library is a subset of documents grouped so as to be searchable at one time. For example, a particular online wine merchant may not wish to display documents corresponding to products sold by that merchant's competitors, even if the knowledge base contains such documents. In this case, the knowledge base may define a document base that does not include wine sold by competitors of the merchant. In FIG. 12, the library creation step 262 may define a library based on attributes, terms, or any other attribute of the document. More than one library may be used to identify documents. The knowledge base may contain attributes or terms that have been customized for a particular base.
In FIG. 12, output process step 264 outputs information from the knowledge base to another stage of the system that performs the additional processing required to generate the navigable data structure.
Navigation state
The navigation system explicitly or implicitly represents a set of navigation states. These navigational states are associated by a refined partial order derived from a partial order related to the term.
The navigational state has two representations. First, the navigational state corresponds to a subset of the set of documents. Second, the navigational state corresponds to a set of terms that are not comparable to each other. FIG. 14 illustrates some navigational states for documents and terms based on the wine example described above. For example, one navigational state 224 is { Origin: South America } (documents #1, #4, # 5); the second navigational state 224 is { Type/Varietal: White, Origin: United States } (document #2, # 9). The subset of documents corresponding to each navigational state includes documents that are typically associated with all terms in the respective set of mutually incomparable terms. Meanwhile, the set of mutually incomparable terms corresponding to each navigational state includes all the smallest terms from the set of terms that are common to the subset of documents, i.e., the set of terms that are commonly associated with each document in the subset. Each navigational state is preferably unique and is specified in its entirety, with only one corresponding navigational state for each particular set of terms, or for a given set of documents.
One of the best ways to define the set of navigational states is to uniquely identify each navigational state by a standard set of mutually incomparable terms. The two-step mapping process that maps an arbitrary set of terms to a set of terms that are not comparable to each other in the specification creates a state that satisfies this property. In the first step of the process, any set of terms is mapped to a subset of the documents that are related to all of those terms. Recall that if a document is associated with a specified term, the document will also be associated with all terms refined by the specified term, and in a second step of the process this subset of documents is mapped to the smallest set of terms in the set of terms that is common to all documents in that set of documents. The term set derived from this second step is a mutually incomparable term set that uniquely identifies the corresponding subset of documents and thus is a canonical representation for navigational state. By way of example, referring to the wine example in FIG. 14, the set of terms { Origin: France } maps onto the subset of documents { document #8, #11} which in turn maps onto the canonical set of terms { type/varietal: Red, Origin: France }.
The navigational states 222, 224, 226 are associated by a partial order of the refined relationship 220 derived from the partial order related to the term. This partial order can be expressed in terms of a subset of documents or a set of terms that define the navigational state. The navigation state B is refined from the navigation state a represented by the subset of documents if the set of documents corresponding to state a is a subset of the set of documents corresponding to state B. Refining the navigation state B according to the navigation state A represented by the term set if all terms in the state B are in the state A or are refined by terms in the state A. Referring to FIG. 14, the navigation state 226 corresponding to the set of terms type/variety: Red, Origin: Chile } { document #4} refines the navigation state 224 corresponding to { Origin: Chile } (documents #4, # 5). Since the refined relationships in the navigational state produce partial order, they are transitive and antisymmetric. In the example, { Type/value: Red, Orgin: Chile } (document #4) refinements { Origin: Chile } (documents #4, #5) and { Origin: Chile } (documents #4, #5) refinements { Origin: South America } (documents #1, #4, # 5); thus, { Type/value: Red, Orgin: Chile } (document #4) refines { Origin: South America } (documents #1, #4, # 5). The root navigational state 222 is defined as the navigational state corresponding to the entire set of documents. The leaf navigational state 226 is defined as a navigational state that cannot be further refined and generally (although not necessarily) corresponds to individual documents. There can be any number of indirect navigational states 224 between the root 222 and the leaf 226. Assuming a pair of navigational states A and B, where B refines A, then there are multiple paths in the partial order that can connect A to the indirect navigational state 224 of B. For ease of definition, the navigational state is considered to be self-refining with reference to the implementations described herein.
The user browses the document set by viewing a sequence of one or more navigational states, typically starting at the root navigational state 22. In these states, there are three basic modes. The first mode is to refine, or move, the current navigational state to a navigational state that refines it. The user can perform the refinement by adding a term to the current navigation state or by refining a term in the current navigation state, i.e. replacing that term by a refinement of the term. After the user has added or refined terms, the new term set can be mapped onto the canonical term set according to the two-step mapping described above. The second mode is to summarize, or move, the current navigational state to a more general navigational state that is refined by the current state. The user can perform the summarization by removing terms from the current navigation state or by summarizing terms in the current navigation state, i.e. replacing the current terms with terms refined by the current terms. After the user removes or summarizes terms, the new term set can be mapped onto the canonical term set. The third mode simply creates a query in the form of a set of required terms that can be mapped again onto the canonical set of terms to obtain the navigational state.
Implementation of
The knowledge base is transformed into a navigable data structure in order to implement the present invention. The navigation states may be fully pre-computed, dynamically computed at runtime, or partially pre-computed. Caching may be used to avoid redundant computation of navigation states.
In a preferred embodiment, the set of navigational states may be represented as a graph, preferably a directed acyclic multiple graph with labeled edges. A graph is a composite structure consisting of nodes and edges, where each edge links a pair of nodes. Two nodes linked by an edge are called endpoints. According to the invention, nodes correspond to navigational states, and edges represent transitions that are refined from one navigational state to another. Since the refinement is directional, each edge points from the more general node to the node that refines it. Because there is a partial order in the navigation state, there is no directed loop in the graph, i.e., the graph is acyclic. Preferably, the graph is a multi-graph, as it allows the possibility of connecting multiple edges of a given node pair. Each edge is labeled with a term. Each edge has attributes that start with the more general set of terms for the end point, add edge terms, and use a two-step mapping to translate this set of terms into a canonical form that is refined, resulting in a navigational state for the other end point. That is, each edge represents a refined transition between nodes based on adding a single term.
To understand the structure of the graph, the following definitions are useful: descendants, ancestors, smallest common ancestor (LCA), true ancestors, true descendants, and largest lower bound (GLB). These definitions apply to the refined partial order in terms and nodes. If A and B are terms and B refines A, then B is said to be the descendant of A and A is said to be the ancestor of B. In addition, if A and B are different terms, then B is said to be the true descendant of A, and A is said to be the true ancestor of B. The same definition applies if a and B are both nodes.
If C is an ancestor of A and C is also an ancestor of B, then C is said to be a common ancestor of A and B, where A, B and C are both terms or both nodes. The smallest element of the common ancestor set of A and B is called the smallest common ancestor (LCA) of A and B. The LCAs of two terms or two nodes are unique if no term has an incomparable pair of ancestors. For example, the LCA of Origin: Argentina and Origin: ceiling is Origin: South America in partial order of term 110 of FIG. 11B. In general, however, there may be a set of LCAs that are used to specify terms or node pairs.
The node calculations in the graph are preferably performed bottom-up.
Leaf nodes in the graph-i.e., nodes corresponding to leaf navigational state-may be computed directly from the classified documents. Typically, but not necessarily, a leaf node will correspond to a set containing a single document. The remaining, non-leaf nodes are obtained by computing the LCA closures of the leaf nodes-i.e., all nodes of the LCAs of a subset of the leaf nodes.
The edges of the graph are determined according to a refinement function called R-function for labeling convenience. The R function uses two nodes, A and B, as arguments, where A is the true ancestor of B, and returns the maximum term set so that if the term C is in R (A, B), then refining node A with the term C results in the true descendants of A and the nodes of the ancestors of B (not necessarily true). For example, in FIG. 14, R ({ Type/Varietal: Red }, { Type/Varietal: Merlot, Orgin: Argentina, Vintage: 1998}) { Type/Varietal: Merlot, Origin: South America, Vintage: 1998}. If B is present1Is B2Ancestors of (1), then R (A, B)1) Is R (A, B)2) Assuming A is B1And B2The true ancestor of (1). For example, R ({ Type/Varietal: Red } { Type/Varietal: Red, Origin: southAmerica }) }, { Origin: South America },
in the figure, the edge between nodes a and B will correspond to the subset of terms in R (a, B). At the same time, no two edges from a single ancestor node A use the same terminology for refinement. If node A has a descendant node set (B)1,B2,..) so that the term C is in all R (A, B)i) Then only the edge from node a with the term C reaches LCA (B)1,B2,..) guaranteed to be a node BiThe only largest node in (c). In FIG. 14, for example, an edge from node { Type/Varietal: Red } with the term Origin: South America passes through node { Type/Varietal: Red, Origin: South America } instead of the true descendant of the node { Type/Varietal: Merlot, Orgin: South America, Vintge: 1998} and { Type/value: Red, Origin: Chile }. The LCA closed property of the graph ensures that there is only a maximum node in Bi. Thus, each edge uniquely maps a node-term pair to the true descendant of that node.
LCA closure of the graph results in a set of nodes whose term set refines S having the useful property of only the largest node for a given term set S. This node is called the maximum lower bound of S (GLB).
The graph can be computed explicitly and stored in a combined data structure; the graph is represented implicitly in structures that do not necessarily contain explicit node and edge representations, or using methods that combine these strategies. Because navigation systems will typically operate on large document sets, it is preferable to represent the graphics in a scalable way.
The graph can be obtained by computing the LCA for a subset of each possible leaf node. However, this approach grows exponentially in the number of leaf nodes and is inherently not scalable. Another strategy for obtaining LCA closure is to repeatedly consider all node pairs in the graph, retrieve whether the LCA of each pair is in the graph, and add that LCA to the graph as needed. This strategy, despite a significant improvement over the previous one, is still relatively non-scalable.
A more efficient way to pre-compute nodes is to sequentially process a set of documents, compute a node for each document, and add that node to the graph, along with any other nodes needed to maintain LCA closures. The system stores nodes and edges as directed acyclic multiple graphs. The graph is initialized to contain a single node, the root node, corresponding to the empty term set. Referring to FIG. 15, in a process 230 for inserting a new node into the graph, at step 232, the system creates a new node for each new document that will be inserted into the graph that does not correspond to an existing node. At step 234, the system recursively generates and inserts any missing LCA nodes between the root node (or ancestor nodes) and the new node before inserting the new node into the graph. To ensure that the LCA after each node insertion is closed, the system inserts the document node that was lost in steps 236 and 238 after inserting all other nodes of its true ancestor.
Inserting a new node requires adding the appropriate edges from ancestors to the node in step 236 and adding descendants outside the new node in step 238. The edge of an entering node is preferably determined by identifying ancestors that have refined terms that caused the new node and that have not been used with those refined terms on the edge of an intermediate ancestor leading to the new node. The edge outside the node is preferably determined by calculating the GLB of the new node and adding edges from the new node to the GLB and to the node to which the GLB has an edge as appropriate.
By following the above-described process for each document in the collection, the entire graph can be pre-computed. In the case of a tractable size of the graph, or if the user wants to see every navigational state with equal probability, the graph is preferably pre-computed. In practice, however, users typically view some navigational state more frequently than others. Indeed, as the graph becomes larger, some navigational state may not be visible at all. Unfortunately, it is difficult to reliably predict the frequency at which navigational state will be viewed.
An additional strategy to pre-compute navigational state is to create an index that allows navigational state to be computed dynamically. In particular, each document can be indexed by all terms that are relevant to that document or have refinements associated with that document. The final index is typically smaller in size than the data structure of the graph storing the navigational state. This dynamic approach may save space and pre-computation time, but it may do so at the expense of higher response time or higher computational demand for the operation. The dynamic implementation may use one variable form that returns all the refined term R functions from the specified navigational state, and the process of GLB for computing the term set.
A subset of navigational states may be pre-computed. It is preferable to pre-compute the state with the highest dynamic computation cost. For example, if the state corresponds to a large subset of documents, it is preferably pre-computed. In one possible partial pre-computation approach, all navigation states corresponding to a subset of documents above a threshold size may be pre-computed. If the states are viewed frequently, it is preferable to pre-compute the states. In some instances, the frequency with which navigation states will be viewed may be predicted. Even if the frequency with which navigation states are to be viewed cannot be predicted in advance, the need for continuous pre-computation can be reduced by caching the results of dynamic computations. The most recently or frequently viewed state may be cached.
As described with reference to the interface, the system supports three query operations-namely, refinement, summarization, and querying by specifying terms. These operations may be further described in terms of graphs. For query refinement, the system enumerates terms on the edges from the node corresponding to the current navigational state. When the user selects the term for refinement, the system responds by displaying the node to which that edge leads. Similarly, for the query summarization option, the system enumerates and selects edges leading to (rather than from) the node corresponding to the current navigational state. In addition, a query can be generalized to a particular query case by specifying a set of terms. For a query by specifying a set of keywords, the system creates a virtual node corresponding to the specified set of terms and determines the GLB of the virtual node in the graph. If no GLB is found, then no documents will satisfy the query. Otherwise, the GLB node would be the most common node in the graph corresponding to all documents satisfying the navigational state of the query.
The navigation system of the present invention allows the information provider to overlay the navigation system over all the document sets. The knowledge base and navigation aspects of the invention can be performed independently by different providers, and information providers can supply these functions externally to separate entities. Similarly, the generated knowledge base may be entered by a navigation expert. The information provider may also supply this navigation requirement externally to the navigation system provider. The navigation system provider can charge the consumer a license fee for the system regardless of the amount of usage. Additionally, if the product is available, the navigation system provider can charge the consumer per click, per purchase, or per transaction generated by a click through the navigation system via the system. The navigation system provider can also act as an aggregator-compiling records from multiple sources, combining them into a global dataset, and generating navigation systems to search the dataset.
The navigation system according to the present invention can also improve user configurability and sales capability. The navigation system may maintain a user profile based on user selections, including selections to develop a particular path of the set of navigation states. Using the knowledge base, the system may also infer additional information about user preferences and interests by supplementing the knowledge base with selection information about relevant documents, attributes, and terms. That information may be used on market goods and services related to documents of interest to the user.
The foregoing descriptions have been presented with respect to specific embodiments of this invention. The present invention may be embodied in other specific forms without departing from its scope. The embodiments, figures, terms and examples used herein are by way of reference and illustration only and not by way of limitation. The scope of the invention is indicated by the appended claims, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.

Claims (54)

1. A computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material; and
calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of calculated navigational states, there are at least the following cases,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material characterized by the third attribute;
storing the calculated navigational state in a data structure;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state;
accepting a query directed to the material; and
in response to the query, the stored navigational state is retrieved.
2. The method of claim 1, wherein the data structure is a graph data structure comprising nodes representing navigational state and edges between nodes representing transitions.
3. A computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing, in a first data structure, a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material; and
wherein the first data structure allows for dynamically calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material in the navigational system characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material in the navigational system characterized by the third attribute;
generating a set of pre-computed navigational states;
storing a set of pre-computed navigational states in a second data structure, the set of pre-computed navigational states stored in the second data structure including at least one of the first navigational state and the second navigational state;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state;
accepting a query directed to the material; and
in response to the query, either the stored, pre-computed navigational state is retrieved or one of a plurality of navigational states is dynamically computed using the first data structure.
4. A computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing, in a data structure, a plurality of attribute-value pairs associated with the profile, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the profile; and
dynamically computing, at runtime, a plurality of navigational states using the data structure in response to a plurality of sequential queries, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material characterized by the third attribute;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state; and
in response to each query, the computed navigational state is displayed.
5. The method of claim 4, further comprising the steps of: storing the dynamically computed navigational state in a cache.
6. The method of claim 4, wherein at least one transition from the originating navigational state to the destination navigational state represents a refinement of a value for one of the respective set of attribute-value pairs for the originating navigational state.
7. The method of claim 4, wherein at least one transition from the originating navigational state to the destination navigational state represents a generalization of a value for one of the respective set of attribute-value pairs of the originating navigational state.
8. The method of claim 4, wherein at least one transition from the originating navigational state to the destination navigational state represents a deselection of an attribute of one of the respective set of attribute-value pairs for the originating navigational state.
9. The method of claim 4, wherein a first transition from the originating navigational state to the first destination navigational state represents a selection of an attribute-value pair corresponding to a first attribute, and a second transition from the originating navigational state to the second destination navigational state represents a selection of an attribute-value pair corresponding to a second attribute.
10. The method of claim 4, wherein at least one transition from the originating navigational state to the destination navigational state represents a specification of an attribute-value pair corresponding to an attribute for which there is no corresponding attribute-value pair in the attribute-value pair set corresponding to the originating navigational state.
11. The method of claim 4, wherein an originating navigational state corresponds to only one profile of the set of profiles, at least one transition from the originating navigational state to a destination navigational state representing selection of a particular attribute-value pair associated with the originating navigational state.
12. The method of claim 4, wherein a value associated with at least one of the plurality of attributes is unambiguously defined.
13. The method of claim 4, wherein a value associated with at least one of the plurality of attributes is implicitly defined.
14. The method of claim 4, wherein for attribute-value pairs that share a common attribute, no attribute-value pair refines the plurality of mutually incomparable attribute-value pairs.
15. The method of claim 4, wherein at least one attribute-value pair refines a plurality of mutually incomparable attribute-value pairs for attribute-value pairs that share a common attribute.
16. The method of claim 4, wherein no attribute-value pairs refine the plurality of mutually incomparable attribute-value pairs.
17. The method of claim 4, wherein at least one attribute-value pair refines a plurality of mutually incomparable attribute-value pairs.
18. The method of claim 4, wherein for any two attribute-value pairs corresponding to different attributes, the two attribute-value pairs are not comparable.
19. The method of claim 4, wherein the set of material includes material associated with a single subject area.
20. The method of claim 4, wherein the set of material includes material associated with a plurality of subject areas.
21. The method of claim 4, wherein a portion of the material in the set of material is assigned to a subset of the set of material, the subset being navigable as a whole.
22. The method of claim 21, wherein the interface is employed to provide a plurality of transitions associated with the subset of material.
23. The method of claim 4, wherein the set of material includes a plurality of subsets, each of the plurality of subsets being independently navigable as a whole, a portion of the material in the set of material being assigned to each subset, at least one of the material being assigned to more than one subset.
24. The method of claim 4, further comprising storing a configuration file for each of the profiles, the configuration file comprising a set of attribute-value pairs.
25. The method of claim 24, the configuration file further comprising descriptive information.
26. The method of claim 4, the interface comprising a human user interface.
27. The method of claim 4, the interface comprising an application program interface.
28. The method of claim 4, wherein the interface is operable in a web-based environment.
29. The method of claim 4, wherein the interface is operable in an XML-based environment.
30. The method of claim 4, wherein the interface complements the functionality of a stand-alone data-oriented program.
31. The method of claim 4, the interface comprising a full text search tool for searching attributes.
32. The method of claim 4, the interface comprising a full text search tool for searching for values.
33. The method of claim 4, further comprising storing a configuration file for each material in the set of materials, the configuration file including descriptive information, the interface including a full text search tool for searching the descriptive information in the configuration file.
34. The method of claim 4, the interface comprising accessing a material in the set of materials.
35. The method of claim 4, the interface comprising displaying attribute-value pairs corresponding to a current navigational state.
36. The method of claim 35, displaying attribute-value pairs corresponding to the current navigational state includes user-selected attribute-value pairs and inferred attribute-value pairs, and the interface includes an indication of the user-selected attribute-value pairs and inferred attribute-value pairs.
37. The method of claim 35, displaying attribute-value pairs corresponding to the current navigational state includes only attribute-value pairs that are not comparable to each other.
38. The method of claim 35, wherein the display organizes attribute-value pairs corresponding to a current navigational state by attribute.
39. The method of claim 35, wherein the display organizes the attribute-value pairs corresponding to the current navigational state by more general attribute-value pairs.
40. The method of claim 4, the interface comprising a guided search tool for allowing navigation from a current navigational state based on a plurality of transitions in the plurality of navigational states.
41. The method of claim 40, the guided search tool comprising displaying a navigation option for selection from a current navigation state, the option corresponding to a transition from the current navigation state.
42. The method of claim 41, the navigation options comprising attribute-value pairs as refinements of the attribute-value pairs corresponding to a current navigation state.
43. The method of claim 41, wherein the options include displaying a set of lists of attribute-value pairs, each list corresponding to one of the attributes, some lists including attribute-value pairs for refining attribute-value pairs corresponding to a current navigational state, and some lists including attribute-value pairs not comparable to attribute-value pairs corresponding to the current navigational state.
44. The method of claim 41, wherein the display organizes navigation options by attribute.
45. The method of claim 41, wherein the display organizes navigation options by more general attribute-value pairs.
46. The method of claim 41, the navigation options comprising attribute-value pairs that are not comparable to attribute-value pairs corresponding to a current navigation state.
47. The method of claim 41, the navigation options comprising attribute-value pairs that are summaries of attribute-value pairs corresponding to a current navigation state.
48. The method of claim 41, the navigation option comprising deselecting an attribute-value pair from the set of attribute-value pairs corresponding to a current navigation state.
49. The method of claim 41, the navigation options further comprising links to related navigation states.
50. The method of claim 49, wherein the relevant navigational state is a summary of a current navigational state.
51. The method of claim 49, wherein the relevant navigational state is a refinement of the current navigational state.
52. The method of claim 49, wherein the links correspond to paths of one or more transitions.
53. A computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing, in a first data structure, a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material; and
wherein the first data structure allows for dynamically calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material in the navigational system characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material in the navigational system characterized by the third attribute;
generating a set of pre-computed navigational states;
storing a set of pre-computed navigational states in a second data structure, the set of pre-computed navigational states stored in the second data structure including at least one of the first navigational state and the second navigational state;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provides a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state, wherein a first transition provided in the second navigational state corresponds to a first navigational attribute for which no transition is provided in the first navigational state;
accepting a query directed to the material; and
in response to the query, either the stored, pre-computed navigational state is retrieved or one of a plurality of navigational states is dynamically computed using the first data structure.
54. A computer-implemented method for retrieving information associated with a set of materials, the method comprising:
storing, in a first data structure, a plurality of attribute-value pairs associated with the material, wherein each of the plurality of values has an association with at least one of a plurality of attributes characterizing the material; and
wherein the first data structure allows for dynamically calculating a plurality of navigational states, each navigational state corresponding to a particular set of attribute-value pairs and a particular subset of material, wherein for each navigational state, the particular subset of material corresponding to the navigational state comprises material in the navigational system that is each described by each attribute-value pair in the particular set of attribute-value pairs corresponding to the navigational state;
wherein, in the plurality of navigational states, there is at least a case,
the first navigational state includes a first attribute-value pair having a first attribute, wherein the first attribute-value pair does not describe all material in the navigational system characterized by the first attribute,
the second navigational state includes a second attribute-value pair having the first attribute, wherein the second attribute-value pair refines the first attribute-value pair, an
At least one of the first navigational state and the second navigational state includes a third attribute-value pair having a third attribute, the third attribute being different from the first attribute, wherein the third attribute-value pair is not comparable to the first attribute-value pair and not comparable to the second attribute-value pair, and the third attribute-value pair does not describe all material in the navigational system characterized by the third attribute;
generating a set of pre-computed navigational states;
storing a set of pre-computed navigational states in a second data structure, the set of pre-computed navigational states stored in the second data structure including at least one of the first navigational state and the second navigational state;
providing an interface providing a plurality of transitions, each transition providing a direct path between two navigational states without an intervening navigational state, wherein each transition represents a change from a set of attribute-value pairs corresponding to an originating navigational state to a set of attribute-value pairs corresponding to a destination navigational state, wherein a series of one or more transitions provide a path between any two navigational states, wherein the interface provides a direct path between a first navigational state and a second navigational state without an intervening navigational state, the interface further providing a full text search tool that enables a full text search of a plurality of attributes and a plurality of values;
accepting a query directed to the material; and
in response to the query, either the stored, pre-computed navigational state is retrieved or one of a plurality of navigational states is dynamically computed using the first data structure.
HK05104843.6A 2001-05-25 Hierarchical data-driven havigation system and method for information retrieval HK1072114B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2001/017046 WO2002097671A2 (en) 2001-05-25 2001-05-25 Hierarchical data-driven navigation system and method for information retrieval

Publications (2)

Publication Number Publication Date
HK1072114A1 HK1072114A1 (en) 2005-08-12
HK1072114B true HK1072114B (en) 2009-11-20

Family

ID=

Similar Documents

Publication Publication Date Title
US7035864B1 (en) Hierarchical data-driven navigation system and method for information retrieval
US7567957B2 (en) Hierarchical data-driven search and navigation system and method for information retrieval
US7617184B2 (en) Scalable hierarchical data-driven navigation system and method for information retrieval
EP1502205B1 (en) Hierarchical data-driven navigation system and method for information retrieval
US7325201B2 (en) System and method for manipulating content in a hierarchical data-driven search and navigation system
US7428528B1 (en) Integrated application for manipulating content in a hierarchical data-driven search and navigation system
US6631496B1 (en) System for personalizing, organizing and managing web information
AU2001268095A1 (en) Hierarchical data-driven navigation system and method for information retrieval
US8655869B2 (en) System and method for information retrieval from object collections with complex interrelationships
US8041709B2 (en) Domain collapsing of search results
US20050027694A1 (en) User-friendly search results display system, method, and computer program product
US20070055680A1 (en) Method and system for creating a taxonomy from business-oriented metadata content
US20030038836A1 (en) Web map tool
CA2622268A1 (en) Navigation of structured data
HK1072114B (en) Hierarchical data-driven havigation system and method for information retrieval
Fluit et al. Visualization facility