DATABASE RETRIEVAL SYSTEM
The present invention relates generally to a database retrieval system and more specifically to such a system which uses a natural language search request interface. The invention has particular application for use as an interface for an internet search facility, and the invention is herein described in that context. However it is to be appreciated that the invention has broader application and is not limited to that particular use.
Throughout the specification, the term "natural language" is used to refer to word statements which are represented in a sentence structure or in a structure which is close to a sentence structure and which can be readily understood by a non-instructed person to which the statements are presented.
Database retrieval systems using a natural language format have been gaining popularity in recent times. Such systems have been used as internet search facility interfaces and also as interfaces for other types of databases, such as online help systems accompanying computer software applications or as part of automated customer service systems. Typically, when using a natural language format, users are directed to enter their search requests in the form of a complete question expressed in terms of everyday conversational language. The database retrieval system then analyses the question submitted by the user and returns a list of database records which have been deemed by the system to be a possible match to the user's search request.
Natural language interfaces have been developed primarily to make the computer more "humanised" as compared to traditional search structures using a Boolean search structure or the like. In this way, the interface is designed to be easier to use and less intimidating, particularly to users who are not experienced in computer-based searching.
Whilst the existing natural language search interfaces have been successful in providing a more "humanised" approach to searching, they still have various problems. In particular, these systems are prone to generate a lot of irrelevant records, resulting in inefficiencies due to time wasted by the user evaluating those records and consulting the material to which the
records refer. Additionally, given such a wide scope of acceptable search requests, namely any question expressed in a natural language form, users are frequently unsure of how best to express the search requests in order to retrieve the material they are interested in. In addition to the above problems, because the user is required to create the entire search request, typographical errors can easily be made in formulating the request. The inclusion of a typographical error in a search request will result in either no records or irrelevant records being returned to the user. Again, time will be wasted in assessing these records and the material to which they relate.
It is an aim of the present invention to provide a database retrieval system which uses a natural language format and which ameliorates the problems detailed above.
In its broadest terms, the present invention relates to a system operative to enable a user to produce a search request for retrieving information from a database, the system including storage means including a plurality of query terms, each being operative to form part of a said search request to retrieve the information from the database, processing means operative to generate a plurality of request lists each including selected ones of the query terms, means for making the request lists available to the user, means for receiving a user selection of at least one query term in each request list, and means operative to cause each selected query term to be represented to the user in a controlled order as part of the search request, wherein the processing means is operative to select query terms to generate at least one said request list in response to the user selection of at least one query term from a previous request list, and wherein when represented to the user in the controlled order, the search request including the selected query terms is in a natural language format.
The advantage of the present invention is that the system provides a search request interface which guides the user in development of the complete search request, and which also represents that search request in a natural language format. In this way, the user not only readily understands the search request that is being constructed but also the search request has
a higher probability of achieving a successful outcome. This occurs through the guidance provided by the system in construction of that request and also by the design of the system which provides a more intuitive approach and which is able to be designed to be specifically compatible with the database in which to access the information. This guiding capability of the system is enabled by the system generating lists of query terms from which the user may select based on previously chosen query terms. Request lists generated by this process are thereby filtered so that the query terms included are appropriate for selection by the user based on the part of the search request that has already been constructed. Further, the representation to the user of the request lists may be ordered so that as individual query terms are selected, the concatenation of the selected query terms in their selected order builds up a natural language search request.
The storage means includes the full range of possible search query terms. At least some of these query terms are linked to other specific query terms so that when that specific query term is selected by the user, the linked query terms are subsequently represented to the user as a request list. In this way, different categories are formed by virtue of this interlinking of the query terms with the terms in each category having a common linked query term.
The system may be operated in various mediums including audio, visual or a combination of these mediums. In one embodiment, the system is used on a computer or interactive television, typically through an Internet Web page or similar graphical user interface. In that application, the user commences a search request which when initiated, displays an initial list of query terms. The user then chooses a query term in that list and this then prompts the system to display a new list of query terms, one or more of which is selected. This in turn prompts yet a further list of query terms, one or more of which is selected and then which prompts yet a further list of query terms and so on until the search request is completed. In this approach some of the request lists generated may be dependent on previously selected query terms while some may not. On completion of the search request, the system generates the results.
The system may be designed so that each of the query terms forms part of a predefined search path. The number of search paths is dependent on the number of possible query term combinations and would typically be in the order of hundreds of thousands. With this arrangement, each predetermined path has specific information relating to the information database linked to it. This information is displayed as the results of the search. This information may be in the form of a specific URL in an internet search facility, or in the form of recorded information or the like.
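The predefined search paths described above can be pictured as a lookup from a completed sequence of query terms to the information linked to that path. The following is a minimal sketch only: the name `SEARCH_PATHS`, the function `results_for_path`, the particular paths and the `example.com` URLs are all hypothetical placeholders and are not taken from the described system.

```python
# Hypothetical store of predetermined search paths. Each key is the full
# sequence of selected query terms; each value is the information linked to
# that path (placeholder URLs, assumed for illustration only).
SEARCH_PATHS = {
    ("listen to", "music", "rock genre"): ["http://example.com/rock"],
    ("listen to", "music", "jazz genre"): ["http://example.com/jazz"],
    ("view", "quotes", "shares"): ["http://example.com/quotes"],
}


def results_for_path(selected_terms):
    """Return the information linked to a completed search path, if any."""
    return SEARCH_PATHS.get(tuple(selected_terms), [])


print(results_for_path(["listen to", "music", "rock genre"]))
```

A sequence of terms that does not correspond to a stored path simply returns an empty list in this sketch.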
With this application, the system effectively provides a means by which a user may select a predetermined search path from a store of many such paths. The selection is not made in a single step but is built up as the user chooses the individual query terms and the search request is constructed. This approach enables the user to manage, and choose appropriately from, a vast number of predetermined search paths.
In another form, the query terms are used to interrogate a separate database containing discrete information. Typically each item of discrete information is individually tagged, and when the selected query terms match the tags on a particular item, that item, or selected parts of it, is displayed as a result.
The process of selecting individual query terms from the displayed category of terms may take any suitable form. For example, the query terms may be entered through activation of a remote device such as a keyboard or mouse connected to the computer, through menu control, by touching a screen, or by voice activation, assuming the computer interface has appropriate capabilities.
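For the alternative form in which the query terms interrogate a separate database of tagged information, the matching step might be sketched as follows. The `Record` layout, the sample records and the rule that every selected term must appear among an item's tags are assumptions made for illustration; the specification does not prescribe a particular matching rule.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One item of discrete information with its tags (hypothetical layout)."""
    title: str
    tags: set = field(default_factory=set)


def match_records(records, selected_terms):
    """Return the records whose tags include every selected query term.

    Requiring all terms to match is an assumption of this sketch.
    """
    wanted = set(selected_terms)
    return [record for record in records if wanted <= record.tags]


records = [
    Record("Rock radio archive", {"listen to", "music", "rock genre"}),
    Record("Jazz interviews", {"listen to", "interview", "jazz genre"}),
]
matches = match_records(records, ["listen to", "music"])
print([record.title for record in matches])
```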
In constructing the search request, the system is also operative to represent to the user in a controlled manner the selected query terms in a natural language format. This again may be achieved in various ways. In one embodiment, each of the query terms is in the form of a part sentence. The selected query terms are then displayed in the controlled order so that the concatenation of the selected query terms together with any embedded terms in the interface, constructs a sentence. In this application, to make a complete sentence, the individual query terms take different forms. Some
query terms are in the form of verbs, whilst others are in the form of phrases or nouns.
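The concatenation of part-sentence query terms with embedded interface terms can be illustrated with a short sketch. The embedded terms "I wish to" and "specifically" appear in the described interface; the mapping `EMBEDDED_BEFORE`, the function `build_request` and the way the example sentence is divided into query terms are assumptions made here for illustration.

```python
# Embedded terms fixed in the interface, keyed by the position of the query
# term they precede (these positions are an assumption of this sketch).
EMBEDDED_BEFORE = {0: "I wish to", 5: "specifically"}


def build_request(selected_terms):
    """Concatenate the selected query terms, in order, with embedded terms."""
    words = []
    for position, term in enumerate(selected_terms):
        if position in EMBEDDED_BEFORE:
            words.append(EMBEDDED_BEFORE[position])
        words.append(term)
    return " ".join(words)


partial = build_request(["listen to", "music", "of a rock genre"])
print(partial)  # I wish to listen to music of a rock genre
```

Because the function accepts any prefix of the selections, the same call can be repeated after each selection to show the request as it is built up.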
In one form, as soon as a query term is selected it is represented to the user as part of the search request. With this arrangement the search request that is being constructed is displayed to the user at all times. This has particular advantage as it enables the user to see exactly the logic behind the search request which is being constructed with the aid of the system. As this search request is in a natural language format it is easily understood by even an inexperienced user.
In an alternative form, the system may be used purely through an audio medium such as a telephone or the like. In that application, the user is prompted by the system voicing each of the query terms in the list selected by the system at any one time. The user selects one or more of the voiced query terms either by voice activation or through the telephone key pad or the like, which then causes the system to tell the user the query terms of the next selected list. To enable the user to fully comprehend the search request that is being constructed, once the search request has been compiled it is voiced back to the user prior to it being submitted to interrogate the information database.
The controlled order by which the search request terms are represented to the user may vary. In a system where the search request is displayed, it is preferable that the search terms are displayed sequentially so that the search request can be viewed as it is being constructed. The arrangement of the display may vary from a strict linear sentence structure to a more liberal format where the search request may be represented in a paragraph or as bullet points or the like. In each case it may still constitute a natural language format. In an audio system, where the search request is not represented until after it has been completed, the order may be changed from sequential to improve the language of the search request and so aid the user's understanding.
In yet a further aspect, the present invention provides a method of guiding a user to construct a search request for retrieving information from a database, the method including the steps of:
(a) providing a plurality of query terms;
(b) representing to a user a plurality of request lists, each list including a selection of the query terms, and wherein in use a user selects at least one said query term from a plurality of request lists to form said search request, wherein the request lists are represented to the user for selecting a query term sequentially so that there is a first request list and at least one subsequent request list and wherein the query term in at least one of the subsequent request lists is represented to the user in response to the user selection of at least one query term in a previous request list.
It will be convenient to hereinafter describe embodiments of the present invention with reference to the accompanying drawings. It is to be appreciated that the particularity of these drawings and their related description is to be understood as not superseding the generality of the preceding broad description of the invention. In the drawings:
Figure 1 illustrates a schematic view of the components of a database retrieval system according to a preferred embodiment of the invention;
Figures 2 to 4 are user interfaces for the system of Figure 1; and
Figure 5 is a flow chart illustrating the steps performed by the system of Figure 1.
Figure 1 illustrates, in schematic view, a system 10 which is operative to retrieve information from a database 50. The system 10 is operative over a distributed network of computers such as that available over the Internet through the World Wide Web. The system is accessible by a user from a remote client device 11, and includes a server 12 and a search database 13. The system 10 is operative to establish a search request, which is then used to retrieve information from the database 50. This information is typically web site characterisation data which allows a user to locate associated web pages on the World Wide Web. In the illustrated form, the databases 13 and 50 are shown as separate items, although it is appreciated that they may be incorporated into a single database as explained in more detail below.
The remote user device 11 communicates with the server 12 through a distributed computer network 14 such as the Internet. The remote device 11 is typically in the form of a PC, although it may be any other suitable device such as a personal digital assistant or mobile phone with appropriate communication protocols to enable access over the Internet.
The server 12 is operative to send to the remote device, on request, a user interface 30 such as the web pages illustrated in Figures 2 to 4, to enable a user to construct a search request which is then processed by the server 12, which in turn accesses databases 13 and 50.
The system 10 is structured such that the search database 13 includes a store of query terms. These terms are associated with different category fields. At least some of the query terms are interlinked so that when a specific query term is selected by the user under one category field, the linked query terms are subsequently represented to the user as a list of terms in a subsequent category field. In this way, the terms in many of the request lists each have a common linked query term so as to provide a filtering process for displaying only relevant query terms.
To establish a complete search request, a user is required to select a query term from a request list in each category field. Because of the interlinking of the query terms across the different category fields, the search database 13 is structured to define a plurality of predetermined search paths. Each of these search paths, which typically number in the tens to hundreds of thousands, has specific web site characterisation data associated with it which is stored in database 50. This characterisation data is typically specific URLs or the like which enable a user to locate specific sites on the World Wide Web. The system may of course be applicable to search a database containing other information and is not limited to an Internet search engine as described in the present embodiment.
Table 1 illustrates the structure of the search database 13, identifying the request lists, the category fields and the predetermined search paths. Specifically, the database 13 is structured so that there are six category fields, namely Action, Content Type, General Topic, Specific Topic, Region, and Match Phrase Description. These are represented by the columns of Table 1. The rows of the Table are the individual search paths, while the Table entries are the query terms which are grouped together in the different request lists. The order of the category fields is structured to guide the user in developing a natural language search request. In this way, the display to the user of the request lists as the user progresses through the category fields is ordered so that as individual query terms are selected, the concatenation of the selected query terms in their selected order builds up a natural language search request.
Within each category field there are multiple query terms which may be displayed together in one list or which may be displayed in different filtered lists depending on previously selected query terms. In the Action field, the terms are displayed in one list, whereas in most of the other category fields the request lists are filtered. In the Action field, the request list consists of query terms including "listen to", "view", "read", "buy", "sell" etc. In the next category field (Content Type) a filtered request list can be generated based on each query term in the request list in the Action field. For example, for the query term "listen to" there is a linked request list which includes the query terms "music", "comedy", "news", "interview", "sounds", "radio". In the next category field (General Topic) there is a linked request list for each of the query terms in each of the request lists of Content Type. In the illustrated example, a request list including "rock genre", "pop genre", "jazz genre", and "grunge genre" is generated under the General Topic category field, which in turn is linked to the query term "music" in one of the request lists under the Content Type field.
This structure of linking specific query terms to generate filtered request lists carries through the other category fields (Specific Topic, and Match Phrase Description). Because it only includes a limited number of query terms, the Region category field does not need to be filtered and therefore the request list generated is not dependent on the selected term in the previous category fields. It is noted that the system 10 is able to include a free form query term which allows keywords to be included in the search request. This free form query would appear as a query term in the selected category and is indicated in the arrangement of Table 1 under the "Match Phrase Description" category field.
Table 1 also illustrates two embedded search request terms which appear on the user interface 30. These embedded terms are "I wish to" and "specifically". The system 10 is structured so that the embedded terms, in conjunction with the query terms in the category fields, form a complete search request which is in a natural language format so that it is easy for a user to understand the nature of the specific request.
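The generation of filtered and unfiltered request lists described above might be sketched as follows. The query terms are taken from the description and Table 1, but the names `LINKED_LISTS`, `UNFILTERED_LISTS` and `request_list`, and the simplification that a filtered list depends only on the immediately preceding selection, are assumptions of this sketch and not a definitive implementation.

```python
# Filtered request lists, keyed by (category field, previously selected term).
# Keying on the immediately preceding term only is an assumption.
LINKED_LISTS = {
    ("Content Type", "listen to"): [
        "music", "comedy", "news", "interview", "sounds", "radio"],
    ("General Topic", "music"): [
        "rock genre", "pop genre", "jazz genre", "grunge genre"],
}

# Request lists that are shown regardless of earlier selections.
UNFILTERED_LISTS = {
    "Action": ["listen to", "view", "read", "buy", "sell"],
    "Region": ["Australia", "the USA", "Worldwide"],
}


def request_list(category, previous_term=None):
    """Return the request list to present for a category field."""
    if category in UNFILTERED_LISTS:
        # Action and Region do not depend on previous selections.
        return UNFILTERED_LISTS[category]
    # Other fields are filtered by the term selected in the previous field.
    return LINKED_LISTS.get((category, previous_term), [])


print(request_list("Content Type", "listen to"))
print(request_list("Region", "rock genre"))
```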
As mentioned above, each completed search request forms a predetermined search path and associated with each search path, is appropriate web site characterisation data from the database 50. This association of the web site characterisation data can be obtained by modifying and manipulating an existing dump of web site characterisation data from the Internet. Alternatively, it could be built up using appropriate automated search tools or manually. The ongoing maintenance of the categories and the links directory may be performed manually (either on line or by the web site owners or by internal staff), or by using suitable automated tools that may be available.
Figures 2 to 4 illustrate the construction on the user interface 30 of the following search request:
"I wish to listen to music of a rock genre from movie sound tracks in Australia specifically Australian sound tracks 1927-1995."
The interface 30 is structured so that the embedded terms 31 appear on the screen. The interface 30 also includes six separate entry field boxes. The first box is the Action field box 32, the second is the Content Type box 33, the third is the General Topic field box 34, the fourth is the Specific Topic field box 35, the fifth is the Region field box 36, and the sixth is the Match Phrase Description field box 37. To commence a search, a user goes first to the Action field box where clicking a cursor on the box will prompt a pop up menu 38 to appear which displays the request list showing each of the query terms in that category field. The user is then able to select one of the query terms within the menu box 38 by highlighting on the selected query term. The selected query term is then represented in the field box 32. In the present example the selected query term is "listen to".
Table 1: Structure of the search database 13
Category fields: Action, Content Type, General Topic, Specific Topic, Region, Match Phrase Description
I wish to View Quotes for Shares in Current ASX Listed Companies from Australia specifically ASX Quote WebLink
I wish to View Quotes for Shares in Current ASX Listed Companies from Australia specifically etc
I wish to View Quotes for Shares in Current ASX Listed Companies from the USA, specifically
I wish to View Quotes for Shares in Current ASX Listed Companies etc
I wish to View Quotes for Shares in Floating Companies on ASX
I wish to View Quotes for Shares etc
I wish to View Quotes for Exchange Rates
I wish to View Quotes for Futures
I wish to View Quotes for Options
I wish to View Quotes etc
I wish to View Financials
I wish to View ?
I wish to View etc
I wish to Watch Movies or Trailers of a Comedy Genre that are on Video in Australia, specifically ****KEYWORD****
I wish to Watch Movies or Trailers of a Comedy Genre that are on Video in Australia specifically Cyborg's Den: Movies!
I wish to Watch Movies or Trailers of a Comedy Genre that are on Video in Australia specifically Movies, Etc
I wish to Watch Movies or Trailers of a Comedy Genre that are on Video in Australia specifically Jurassic Punk
I wish to Watch Movies or Trailers of a Comedy Genre that are on Video in Australia specifically etc
I wish to Watch Movies or Trailers of a Comedy Genre that are on Video in the USA
I wish to Watch Movies or Trailers of a Comedy Genre that are on Video Worldwide
I wish to Watch Movies or Trailers of a Comedy Genre that are on Video etc.
I wish to Watch Movies or Trailers of a Comedy Genre that are on Television
I wish to Watch Movies or Trailers of a Comedy Genre Currently at the Box Office
I wish to Watch Movies or Trailers of a Comedy Genre that are Unknown
I wish to Watch Movies or Trailers of a Comedy Genre that are Unknown etc.
I wish to Watch Movies or Trailers of a Horror Genre
I wish to Watch Movies or Trailers of a Drama Genre
I wish to Watch Movies or Trailers of a Childrens Genre
I wish to Watch Movies or Trailers etc
I wish to Watch Music Video
I wish to Watch Television Shows
I wish to Watch Animation
I wish to Watch Documentaries
I wish to Watch Home Video
I wish to Watch etc
I wish to Buy
I wish to Sell
I wish to Chat to
I wish to Read
I wish to Enter
I wish to Write
I wish to Play
I wish to Learn
I wish to Bet
I wish to Apply for
The user then moves to the next entry field box 33. On activating that field box, a pop up menu 39 is displayed as illustrated in Figure 3, which displays a request list of query terms which are linked to the selected query term "listen to" and which are indexed under the Content Type category field. The user is then able to select a query term from that menu 39, which is then represented in the second field box 33. This process of displaying filtered request lists which are linked to previously selected query terms then continues until all fields are completed, resulting in the complete search request being constructed as illustrated in Figure 4.
Figure 5 is a flow chart illustrating the steps performed by the system as outlined above. As can be seen at step 20, the user commences the search request by initiating the system in some appropriate way, such as clicking on the arrow in the first field 32. On initiation of the search request, the system 10 at step 21 displays a first list of query terms. The user then selects one of the query terms in the displayed list as indicated at step 22. Once the query term is selected, the display changes to remove the first list of query terms and displays the selected query term adjacent to the embedded words "I wish to" to form part of the constructed search request. This occurs at step 23. At step 24, the selected query term is sent to the server 12, which determines whether the selection of that term represents the completed search request. If the server determines that the search request is completed, then at step 25 the query terms are used to interrogate the search database, which then provides the matching linked results to that complete search request and sends those results back to the remote device at step 26.
If on the other hand, the search request is not completed, then the query term is then used to interrogate the search database to select a subsequent list of query terms from which the user is able to select. This may occur automatically or by initiating action by the user such as by clicking on the next empty category field. This occurs at step 27.
At step 28, the subsequent selected list of query terms is then displayed in a similar manner as at the first step 21 in the form of a drop down box or the like. The user is then again in a position to select a query term (step 22) and the cycle then continues. In particular, the next selected query term is displayed as a continuation of the search request and that query term is then forwarded to the server, which first determines whether it completes the requested search and then adopts either steps 25 and 26 or steps 27 and 28 accordingly.
This cycle then continues until a search request is fully constructed. As noted above, the system 10 is able to include a free form query term which allows keywords to be included into the search request. If the user makes a free form entry, the system then performs a keyword search directly off the database 50, taking into account all the category selections and the keywords that the user has entered.
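The cycle of steps 21 to 28 described above can be summarised in a short sketch. The function `construct_search_request`, the callback `choose`, the sample data and the placeholder URL are hypothetical, and the completion test used at step 24 (the selected terms forming a key of the stored results) is an assumption, since the description does not say how the server makes that determination.

```python
def construct_search_request(first_list, linked_lists, results, choose):
    """Walk the select, check and next-list cycle of steps 21 to 28."""
    selected = []
    current_list = first_list                  # step 21: first list displayed
    while True:
        term = choose(current_list)            # step 22: user selects a term
        selected.append(term)                  # step 23: term joins the request
        path = tuple(selected)
        if path in results:                    # step 24: request complete?
            return selected, results[path]     # steps 25-26: return results
        # Steps 27-28: select and display the list linked to the chosen term.
        current_list = linked_lists.get(term, [])
        if not current_list:
            return selected, []                # nothing further to offer


def pick_first(options):
    """Stand-in for the user: always choose the first term offered."""
    return options[0]


linked = {"listen to": ["music", "news"],
          "music": ["rock genre", "jazz genre"]}
stored = {("listen to", "music", "rock genre"): ["http://example.com/rock"]}
terms, found = construct_search_request(
    ["listen to", "view"], linked, stored, pick_first)
print(terms, found)
```

In an interactive setting the `choose` callback would be replaced by whatever selection mechanism the interface provides, such as a drop down box or voice prompt.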
Accordingly, the present invention allows the construction of a natural language format search request. The user is guided to select each query term sequentially in order to gradually construct a very precise natural language request, with the interface continually displaying the concatenation of the already selected query terms and relevant embedded text at any point in time. The guided structure not only greatly assists the user to enter appropriate query terms for retrieving information to produce meaningful content, but it also allows the system to use a standard request structure. This is distinct from more recent natural language structures which require specific systems to analyse the content and syntax of the question and then generate appropriate query terms to interrogate the query database. Other advantages include the fact that the user merely needs to choose from a selected list of query terms at any one particular time, thereby restricting the potential for user entry error. Further, both guiding the user and displaying the search request as it is being constructed give even inexperienced users a much greater probability of finding relevant material as compared to a free form natural language search request.
It is to be appreciated that alterations and/or modifications may be made to the parts previously described without departing from the spirit or ambit of the invention as defined in the accompanying claims.