WO2011130706A2

WO2011130706A2 - Methods and systems for performing cross store joins in a multi-tenant store

Info

Publication number: WO2011130706A2
Application number: PCT/US2011/032781
Authority: WO
Inventors: Craig Weissman; James Taylor
Original assignee: Salesforce com Inc
Current assignee: Salesforce Inc
Priority date: 2010-04-16
Filing date: 2011-04-15
Publication date: 2011-10-20
Anticipated expiration: 2012-10-16
Also published as: WO2011130706A3

Abstract

Methods, systems, and computer implemented operations for optimizing queries and also for performing high volume searches in a multi-tenant store are described. In one embodiment, such operations includes retrieving data from a multi-tenant database system having a relational data store and a non-relational data store, receiving a request specifying data to be retrieved, retrieving one or more locations of the data to be retrieved, generating a database query based on the request, in which the database query specifies a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store and one or more other data elements residing within the relational data store, generating an optimized database query having an optimized query syntax that is distinct from a query syntax of the database query, and executing the optimized database query against the multi-tenant database system to retrieve the data. In an alternative embodiment, operations for performing high volume searches include receiving a query directed to a database, the database including a plurality of items, determining whether the query complies with one of a plurality of search criteria, each of the plurality of search criteria corresponding to a predefined index of the database, selecting a predefined index of the database corresponding to one of the plurality of search criteria if the query complies with said search criterion, the index containing entries that comply with the corresponding search criterion, applying the query to the selected index to find database items referenced in the index, selecting items based on applying the query to the selected index, building a report for the query, the report including only items of the selected index.

Description

METHODS AND SYSTEMS FOR PERFORMING CROSS STORE JOINS IN

A MULTI-TENANT STORE

CLAIM OF PRIORITY

[0001] This PCT application is related to, and claims priority to, the U.S. provisional utility application entitled "METHODS AND SYSTEMS FOR PERFORMING CROSS STORE JOINS IN A MULTI-TENANT STORE," filed on April 19, 2010, having an application number of 61/325,709 and attorney docket No. 008956P006Z (320PROV), the entire contents of which are incorporated herein by reference; this PCT application is further related to, and claims priority to, the U.S. utility application entitled "METHODS AND SYSTEMS FOR PERFORMING CROSS STORE JOINS IN A MULTI-TENANT STORE," filed on December 20, 2010, having an application number of 12/973,668 and attorney docket No. 8956P006 (320US), the entire contents of which are incorporated herein by reference; this PCT application is further related to, and claims priority to, the U.S. provisional utility application entitled "METHODS AND SYSTEMS FOR OPTIMIZING QUERIES IN A MULTI-TENANT STORE," filed on April 20, 2010, having an application number of 61/325,951 and attorney docket No. 008956P007Z (321PROV), the entire contents of which are incorporated herein by reference; this PCT application is further related to, and claims priority to, the U.S. utility application entitled "METHODS AND SYSTEMS FOR

OPTIMIZING QUERIES IN A MULTI-TENANT STORE," filed on December 20, 2010, having an application number of 12/973,676 and attorney docket No. 8956P007 (321US), the entire contents of which are incorporated herein by reference; this PCT application is further related to, and claims priority to, the U.S. provisional utility application entitled "METHODS AND SYSTEMS FOR APPENDING DATA TO LARGE DATA VOLUMES IN A MULTI- TENANT STORE," filed on April 16, 2010, having an application number of 61/324,946 and attorney docket No. 008956P005Z (319PROV), the entire contents of which are incorporated herein by reference; and this PCT application is further related to, and claims priority to, the U.S. utility application entitled "METHODS AND SYSTEMS FOR PERFORMING HIGH VOLUME SEARCHES IN A MULTI-TENANT STORE," filed on April 14, 2011, having an application number of 13/087,210 and attorney docket No. 8956P005 (319US), the entire contents of which are incorporated herein by reference. COPYRIGHT NOTICE

[0002] A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

TECHNICAL FIELD

[0003] The subject matter described herein relates generally to the field of computing, and more particularly, to methods, systems, apparatuses and operations for optimizing queries in a multi-tenant store and also for performing high volume searches in a multi-tenant store.

BACKGROUND

[0004] The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also correspond to embodiments of the claimed subject matter.

[0005] In conventional database systems, database users access data resources from one logical or relational database. A user of such a conventional system typically retrieves data from and stores data on the system using the user's own systems. A client device may remotely access one of a many of server systems, each of which may then, in turn, access such a conventional database system. Data retrieval from such conventional database systems may include a client or server issuing a query to the conventional database system which may, in response, process the request for information received via the query and return information to the requestor relevant to the request. Improving the speed, security, easy of use for both users and administrators, efficiency of system and network resources, and accuracy of data stored and retrieved have all been, and continue to be the focus and goal of administrators of conventional database systems.

[0006] Unfortunately, such conventional database systems exhibit many undesirable characteristics, such as processing queries and requests relatively slowly (e.g., when a large number of users simultaneously make a request of the same conventional database system), allow data access to unauthorized persons, become increasingly inefficient over time, waste network and system resources with overhead that does not contribute directly to the storage and processing of data, etc.

[0007] Various types of databases are also available, such as relational database models and non-relational database models, each having their own particular benefits and drawbacks. Unfortunately, database queries that simultaneously reference information from multiple data stores are highly inefficient and detract from benefits that may otherwise be derived from the implementation of multiple data stores of different types. Moreover, database queries that simultaneously reference distinct implementations of diverse database models may be wholly impracticable using previous database query mechanisms as conventional query engines lack the sophistication to communicate with diverse database types.

[0008] Moreover, in some query systems, an upper limit is placed on the number of results that may be provided in response to a query. A limit may also be placed on the number of results that are retrieved at intermediate levels or stages of processing a search. So for example at a first stage, a search query might be processed until there are for example, 500, 10,000, 50,000, 200,000 or some other maximum number of results. After the results reach the maximum cutoff number, then searching stops. This allows the search to be completed in less time and reduces the demands on the search system. Since a user will rarely wish to see even 200 results, 50,000 may be a safe maximum. At a second stage of a search, the results may be reduced again to, for example, the top 10,000.

[0009] A search index is one type of search data structure used for servicing queries for a given organization or database. A search index may sometimes be broken up into partitions or shards for large organizations or large databases. As more documents are added to a search index, it becomes more likely that queries against that index will bump up against a 50k, a 10k or any other reasonable collection limits. The collection limits are intended to cause the search results to be truncated. However, in some cases the truncation will lead to less relevant end-user results. Placing any upper limit on the number of results can prevent the search from retrieving the results that the user is looking for.

[0010] A problem arises if the search is limited at one stage and then further filtered or post-filtered at another stage. It can happen that the search returns no results, because all of the collected results are later filtered out. In other cases, in the final results there may be only a few hits or only low ranking or poor hits.

BRIEF SUMMARY

[0011] Methods, systems, and computer implemented operations for optimizing queries and also for performing high volume searches in a multi-tenant store are described. In one example, such operations includes retrieving data from a multi-tenant database system having a relational data store and a non-relational data store, receiving a request specifying data to be retrieved, retrieving one or more locations of the data to be retrieved, generating a database query based on the request, in which the database query specifies a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store and one or more other data elements residing within the relational data store, generating an optimized database query having an optimized query syntax that is distinct from a query syntax of the database query, and executing the optimized database query against the multi-tenant database system to retrieve the data. In another example operations include receiving a query directed to a database, the database including a plurality of items, determining whether the query complies with one of a plurality of search criteria, each of the plurality of search criteria corresponding to a predefined index of the database, selecting a predefined index of the database corresponding to one of the plurality of search criteria if the query complies with said search criterion, the index containing entries that comply with the corresponding search criterion, applying the query to the selected index to find database items referenced in the index, selecting items based on applying the query to the selected index, building a report for the query, the report including only items of the selected index.

[0012] While one or more implementations and techniques are described with reference to embodiments in which high volume searches are implemented and/or optimizing queries are implemented in a system having an application server providing a front end for an on-demand database service capable of supporting multiple tenants, the one or more implementations and techniques are not limited to multi-tenant databases nor deployment on application servers. Embodiments may be practiced using other database architectures, i.e., ORACLE®, DB2® by IBM and the like without departing from the scope of the

embodiments claimed.

[0013] Any of the above embodiments may be used alone or together with one another in any combination. The one or more implementations encompassed within this specification may also include embodiments that are only partially mentioned or alluded to or are not mentioned or alluded to at all in this brief summary or in the abstract. Although various embodiments may have been motivated by various deficiencies with the prior art, which may be discussed or alluded to in one or more places in the specification, the embodiments do not necessarily address any of these deficiencies. In other words, different embodiments may address different deficiencies that may be discussed in the specification. Some embodiments may only partially address some deficiencies or just one deficiency that may be discussed in the specification, and some embodiments may not address any of these deficiencies.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] Embodiments are illustrated by way of example, and not by way of limitation, and can be more fully understood with reference to the following detailed description when considered in connection with the figures in which:

[0015] Figure 1 is an operational flow diagram illustrating a high level overview of a technique for querying a multi-tenant database in an embodiment;

[0016] Figure 2 illustrates a representative system for querying a multi-tenant database and the database in an embodiment;

[0017] Figure 3 is an operational flow diagram illustrating a high level overview of a technique for querying a multi-tenant database in an embodiment using indexed search criteria in an embodiment;

[0018] Figure 4 is an operational flow diagram illustrating a high level overview of a technique for querying a multi-tenant database in an embodiment using an access level index in an embodiment;

[0019] Figure 5 illustrates an exemplary architecture in which embodiments may operate;

[0020] Figure 6 illustrates an alternative exemplary architecture in which

embodiments may operate;

[0021] Figure 7A illustrates an alternative exemplary architecture in which embodiments may operate;

[0022] Figure 7B illustrates an alternative exemplary architecture in which embodiments may operate;

[0023] Figure 8 shows a diagrammatic representation of a system in which embodiments may operate, be installed, integrated, or configured;

[0024] Figure 9 is a flow diagram illustrating a method for optimizing queries in a multi-tenant store in accordance with one embodiment;

[0025] Figure 10 illustrates a diagrammatic representation of a machine in the exemplary form of a computer system, in accordance with one embodiment; and

[0026] Figures 11A and 11B each set forth a table depicting exemplary database query optimizations in accordance with the embodiments described herein.

DETAILED DESCRIPTION

[0027] Described herein are systems, devices, and methods for optimizing queries in a multi-tenant store. In one embodiment, such a method includes retrieving data from a multi- tenant database system having a relational data store and a non-relational data store. For example, in such a method, a host system for the multi-tenant database system receives a request specifying data to be retrieved from the multi-tenant database system, retrieving, based on the request via the host system, one or more locations of the data to be retrieved, generating, at the host system, a database query based on the request, in which the database query specifies a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store and one or more other data elements residing within the relational data store, optimizing the database query via a query optimizer to generate an optimized database query having an optimized query syntax that is distinct from a query syntax of the database query, and executing the optimized database query against the multi-tenant database system to retrieve the data.

[0028] Database query logic is made more complex when the requested data set is spread across diverse database structures and when the referenced database system supports multiple tenants or multiple customer organizations, as is the case with a multi-tenant database system. Such a system greatly simplifies efforts on the part of a customer organization, as the a multi-tenant database system may be utilized as though it is a service that is accessible via the cloud, for example, over the Internet, without requiring the customer organization to implement any database hardware or functionality themselves. However, a hosting entity of the multi-tenant database system must cope with greatly increased complexity. As a result of this complexity, there are often opportunities to optimize database queries directed toward such a multi-tenant database system. Such optimizations may be undertaken by the hosting entity, without requiring participation from customer organizations that utilize the multi-tenant database system.

[0029] In the following description, numerous specific details are set forth such as examples of specific systems, languages, components, etc., in order to provide a thorough understanding of the various embodiments. It will be apparent, however, to one skilled in the art that these specific details need not be employed to practice the disclosed embodiments. In other instances, well known materials or methods have not been described in detail in order to avoid unnecessarily obscuring the disclosed embodiments.

[0030] In addition to various hardware components depicted in the figures and described herein, embodiments further include various operations which are described below. The operations described in accordance with such embodiments may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the operations. Alternatively, the operations may be performed by a combination of hardware and software.

[0031] Embodiments also relate to a system or apparatus for performing the operations herein. The disclosed system or apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a non-transitory computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing non-transitory electronic instructions, each coupled to a computer system bus. In one embodiment, a computer readable storage medium having instructions stored thereon, causes one or more processors within a multi-tenant database environment to perform the methods and operations which are described herein. In another embodiment, the instructions to perform such methods and operations are stored upon a non-transitory computer readable medium for later execution.

[0032] The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus nor are embodiments described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the embodiments as described herein.

[0033] Query Overview

[0034] In one example, Lucene (a text search engine library from the Apache Software Foundation written in Java, a programming language from Sun Microsystems) is used to develop document identifiers (docids) from a query up to a maximum upper limit of 50,000. The results are scored in Lucene and post-filtering is applied so that only the highest scoring 10,000 of those are selected. The corresponding records are retrieved only for those 10,000. The score can be based on standard tf-idf (term frequency-inverse document frequency), though some characteristics may be boosted in the rankings for things such as matching on owner and some filter clauses. The particular numbers of 50,000 and 10,000 are provided here as examples only. The principles described herein may be applied to systems with much higher or lower collection limits or to systems with no collection limits at all.

[0035] In one embodiment, while the data that is searched originally comes from the database, the database is not queried directly. Data is extracted from the database and stored in a search system. The query is applied to the search system. In one example, there is a search that is run against the search system and then these results are post-filtered against the database. In other words, the results of the search index query are then post-filtered in the database using different criteria than those of the search string. These filter criteria may include authorization rules or separate filters defined by a user.

[0036] If a user, for example, is only authorized to view 5 documents but the search query matches 1 million documents, the presence of limits in the search system means the search query might only return 10,000 results, or 1% of the total number of matches. The odds are good that none of the 5 documents that the user is authorized to view will be in these 10,000 results. The post-filtering causes the query to return 0 authorized results. This can be frustrating to a user who knows that the database contains at least some documents that match the query.

[0037] A variety of different filters can be applied in the database after a collection limit is reached, for example a date or time filter, a source or account filter, or an

authorization or privileges filter. An authorization filter refers to a filter that limits results to those which a user is authorized to access. If the search only returns results to which the user does not have access, then no results can be provided to the user.

[0038] The efficiency and quality of a search are improved by first analyzing a search query to select a characteristic of a search that is going to limit the number of records that it can retrieve from a database. This characteristic may be a data field, such as date, time, address, size, or a dollar amount, it may be user access privileges, or some other common criterion. This characteristic is indexed and the query is applied first against the index. The rest of the query is then applied against the results from this specific index. By putting this limiting characteristic first, the results are limited to those that pass the criterion, making the results more relevant. The search is also limited to those that pass the criterion making the search faster. [0039] Figure 1 is a flowchart illustrating a simplified method 100 for performing a database search according to an embodiment. A user may enter a regular expression in order to find values stored in particular fields of a database. Additional parameters for the search may request particular data ranges for values associated (e.g. linked) with that field. For example, when a field is stored as table data at a column and row, the additional parameters may select particular data from that column.

[0040] In step 110, the user enters a regular expression. In one embodiment, this may be done by entering symbols and characters into a window of an application (e.g.

application running on the database), or a browser or application interface running on a client machine. In another embodiment, characters may be combined with actions (e.g.

corresponding to particular symbols) chosen from lists (such as drop down lists).

[0041] In step 120, a query is formulated based on the regular expression that was entered. For example, an application server or a query host of the database system may formulate the query. The query may include other filters (e.g. additional parameters) entered by the user or imposed by the search system. For example, the system may allow access to only data to which the user is authorized.

[0042] In step 130, filters may be imposed by the query host in order to limit the number of character strings searched. The user application can apply the filters that the user provides or the application may impose its own filters or both. The filters may be applied before or after using the regular expression. For example, the number of fields to be searched can be decreased by applying the filters.

[0043] In step 140, valid fields (i.e. fields passing the initial filters) are searched for a string matching the regular expression. Various mechanisms may be used to perform the search.

[0044] In step 150, the results are returned. In one embodiment, the matching strings may be aggregated and then returned all at once. In another embodiment, results associated with each matching string may be returned when that matching string is found. Additional filtering or searches may be performed using the matching strings. For example, data linked to a particular string may then be searched using filters (e.g. parameters) input by a user.

[0045] Determining whether all or part of a character string matches a regular expression is not a simple task. The characters may be allocated among the operators in many different ways. Accordingly, separate indexes can be used to parse the database particularly for efficient search. [0046] Figure 2 illustrates an architecture 200 for querying search indexes drawn from a database using a network connection in an embodiment. As an option, the architecture 200 may be used to carry out the method 100 of Figure 1. The architecture 200 includes a system 202 including an application program server or interface 206. A search index store 212 containing multiple search indexes is provided which interfaces a query host 210. The query host 210 is coupled on one side to the application server 206 and on the other side to the search index store 212. The application server 206 may include any entity capable of offloading the search indexes of the search index store 212 or of caching query results.

[0047] Further included in the database system 202 is a multi-tenant database 214. The multi-tenant database 214 includes some number N of tables of data 218-1 to 218-N. The search index store 212 includes some number N of indexes 216-1 to 216-N drawn from the data in the multi-tenant database 214. The indexes can be divided into any type of groupings that will best facilitate rapid and efficient search. The indexes may also be broken up into partitions or shards so that each index contains partitions for a particular tenant, for example. Additional indexes can be built to serve particularly common searches or data that may otherwise be difficult to search. An indexer 220 coupled to both the search index store 212 and the multi-tenant database 214 gathers data from the multi-tenant database to build indexes and to update indexes in the search index store. The indexer is coupled to the query host 210 that controls the indexing processes and their timing.

[0048] In use, a client 204 (shown here as a single client, but there may be thousands of clients) uses the system 202 to store data in the multi-tenant database 214, and to retrieve data associated with one or more tenants of the system 202.

[0049] The client 204 may generate a query and transmit the query to the system 202 for receipt by the application server 206. The query may be based on any desired syntax and may be generated directly by the user of an application running either at the client 204 or at the application server or application host 206.

[0050] Upon receipt of the query, the application server API 206 processes it and provides it to the query host 210. The query may be forwarded to the query host 210 to translate it in a manner appropriate for querying the search index store 212.

[0051] In one example of use, the client 204 may generate and transmit a first type of query, e.g. utilizing an extended simple query language (SQL), object query language (OQL), or any other appropriate format, adapted for use over a network, while the query host 210 may translate the query into one or more queries, e.g. search index queries, etc., of a second type which are specifically tailored for querying the indexes of the search index store using various join conditions, etc. Due to specific protocol requirements, a single query from the client 204 may be translated into multiple queries to the search index store under the direction of the query host 210.

[0052] In response to one or more queries from the query host 210, data is returned to the client 204. The data may be presented to the client 204 utilizing an appropriate description language, e.g. XML, web services description language (WSDL), etc., or any other suitable format.

[0053] To reduce the amount of data transmitted over the network, to reduce the time spent in searching, and to make the results easier for the user to receive and review, the resulting data set may be divided into subsets, e.g. "chunks," etc., which are sent to the client 204 one- at- a- time. On the other hand, if the system 202 determines that the data resulting from the query is larger than a determined amount, the data may be reduced and only a part of the data is transmitted to the client 204.

[0054] Data-based relevancy

[0055] Changes can be made to improve the relevancy of a user's query results and the likelihood of the user receiving those hits after the results are filtered, such as in a post- filtering process. As mentioned above, the post-filtering may be based on access privileges, personal account data, or other filter criterion, such as date, location, names, etc. However, the relevancy of the query results will be further improved by applying a ranking to criteria applied by the search engine to the indexes.

[0056] By pushing the metadata and ranking functions down into the query engine, the results may be more relevant. This can be used to decrease the odds that the best results are never collected. If the results are truncated, then those that remain may be the most relevant, at least on some of the query criteria. Pushing down these ranking functions may also decrease the time spent checking results for authorization.

[0057] Formulas for ranking and re-ranking search result scores may be configured in any of a variety of different ways. Some possibilities are that they may be specific to a particular entity, partition, organization, or user group. Alternatively, a ranking formala may apply an overall ordering to the scores across all entities within a single partition or across multiple partitions.

[0058] Ranking can be applied to any searchable fields of the database. These fields are not required to be included in an index to be used in a separate ranking process, such as that applied using the database (after results are retrieved from search). However, any one or more of these fields or other fields may be indexed into search for use as a search criterion to affect the ordering of results or to constrain the search result set. The values in the fields may be ordered from highest to lowest, first to last, or in some other way to boost the most desired results to be selected first. Some examples of fields that may be used for ranking are listed. Each field is listed with an indication of its data type. For example: last_update (date);

last_activity (date); close_date (date); closed (boolean); escalation_state (boolean); owner (string); other fields specified by customers or developers.

[0059] To boost particular results in Boolean fields, the boolean fields can be indexed into separate fields (eg closeDate). Such an index may exclude characteristics or metadata such as term frequency, positions, or norms. When a query is applied against the index an additional clause can be added to the query to boost particular values in the search. So for example, a clause such as "[closeDate: 1]^Λ 1.5" would boost results having a closing date later than the first of May.

[0060] Similarly identifiers (IDs) for any subject can be treated so as to boost desired results. IDs are similar to Boolean fields and can be treated in the same way. So, for example, the IDs can be indexed and then an additional clause can be added to boost particular identifiers. A particular owner could be boosted by adding a clause referring to that particular owner or the owner and any related entities. As with the Boolean fields, this can be done without including the norms, term frequency or positions at indexing time.

However, this data may be included if desired.

[0061] Dates can also be indexed, ranked, and boosted in the same way as Booleans and identifiers. Because a user may be looking for a particular date or date range and not just the most recent items, more complexity can be used. In one example, if a version of Lucene is used for searching a NumericRangeQuery function is available. For example, a search for [warriors] might expand into a required term query for warriors, along with optional range queries with boosts for various date ranges. For example, an additional clause can be added to the query to boost the updates in the last week more than last month. Such a clause might take the following form: [lastUpdate:[20100101-20100108]^A1.8 lastUpdate:[20091201- 20091231]^Λ1.2].

[0062] Alternatively, there may be more than one index for a date field. Each field can be ordered to boost different dates. For example, there may be indexes with dates phrased as year-month, year- week, year-day. Boosting can be applied to boost the closest, finest grained dates. As a further alternative or as an addition, a boost can be applied after the results are scored. This additional boost can be done in any of a variety of different ways. In one example, the relevant field can be retrieved from the search index to use for the boost process.

[0063] Entity-based relevancy

[0064] Similar to how the data and name fields are boosted as described above, particular entities in the database can also be boosted over others. The boosting as in the other examples above helps to ensure that all of the relevant results are obtained before the collection limit is reached. Boosting can also be used to present the best results to the user at the top of the search results.

[0065] In one example, a parameter such as an entity or organization name parameter or class of names could be boosted at search time, to push more likely entity types higher into the results. In addition or instead, reverse boosts can be done in the same way on entity types that are never clicked on. This processing can be done at regular intervals. A small Hadoop cluster (a distributed computing utility from Apache Software Foundation) may be used to perform this processing asynchronously. The selection of entities to be boosted or reverse boosted can be made based on user behavior or other information. Different data may be used for individual organizations or users.

[0066] Authorization-based relevancy

[0067] Authorization based relevancy can also be used to prescreen results. Record visibility is an important component of relevancy for query search results in any system that grants different access rights to different users or groups. While a document may be highly relevant (eg, high tf-idf), if the user running the query doesn't have authorization to that record, then its actual relevance to that user is 0. If the search results are first checked for authorization after the query has been executed on the query hosts and returned back to the application tier, then many good results may be eliminated. In other words, for large enough datasets, when a "low-visibility" user runs a non-selective query, the intersection of authorization and the search results can lead to a poor user experience because the user may get incomplete results. By searching only in records that the user can see, the query time can be shortened and the relevance of the results are improved.

[0068] Search results can be improved by taking the user's ownership into account and boosting the scores of those records. For users that have access to far more records than just the ones they own, additional owner IDs can be boosted. A cache of the owner or organization IDs used by a particular requesting user or entity can be used to improve results for users that have access to records of other owners. Such a cache could be fetched at the beginning of a query and made available to the query engine. [0069] Authorization-based relevancy can also be used to improve results using a special index. The index can be compiled periodically for any users or organizations with a sufficiently large dataset. An index on authorization-based relevancy would be useful for many different users. One example is users that do not have permissions to view all of the data for their organization or for the whole database. Another example, is users that belong to an entity with a large data set, such as a million records. Another example is users that cannot see a large percent of the records for their organization. The particular percentage and the size of a large data set can be selected to suit different implementations and available tools.

[0070] The index can be created for authorization-based relevancy by computing a cached value of what a particular user group can see. In other words, the index can be created by determining which records are authorized for the particular user or group. These records can then be cached in a fast memory. The records can be stored as IDs of the actual records that occur in the database. If there are a small enough number of records, then they can be stored as a compressed set of IDs. For larger record sets, a variety of compression or search- oriented storage methods can be used, for example a bloom filter of IDs.

[0071] In one example a query is received and applied against one or more caches. At query time, after the IDs are retrieved from the search index, it can be determined whether the IDs exist in an entity-share cache. If the cache doesn't exist for an entity type, then no boosts are applied. If a cache does exist for the IDs, then only entities that are found in the cache are boosted. The amount of boost to apply can be adjusted to suit any particular implementation and type of user behavior. On the one hand, for very low visibility users on non-selective queries, large authorization boosts can be used. On the other hand, for more selective queries, a smaller or no boost may be applied.

[0072] By preparing caches in a separate process unrelated to the user's activity, it is possible that the caches will have stale and incorrect values. Similarly after users have made large scale authorization changes or other changes to the data, but before the entity-share cache is recomputed, search results could be slowed because the wrong results have been boosted. For example, the cache might boost results that the user can no longer see.

However, such problems can be mitigated with frequent caching or controlling the user of the caches. For example, a cache might be invalidated or its use restricted after a large operation that affects authorization rights or other data. These operations might include changing role hierarchy, changing a user's role, etc. [0073] The caches can be recalculated at periodic intervals or after major data changes or both. In addition, particular user events might be used to invoke the calculation of a cache. The particular selected frequency can be adapted to suit different conditions and usage models. In a calculation of the cache, the system can populate a new database table with org, user, key prefix, value, and last update. These values can then be pushed into the cache. The actual cache can take any of a variety of different forms including memcached (a general purpose memory caching system).

[0074] For scoring, the rankings and boosting may be relied upon instead of scoring. Alternatively, the scoring can be determined in a separate process after the search results are obtained. A more careful ranking may provide higher accuracy and provide the most important results to the user first. Alternatively, cache information may be plugged directly into the query engine and incorporated into the query's score.

[0075] Processes

[0076] Figure 3 is a simplified flowchart of performing a query using a search index. The process begins at 310 with receiving a query directed to a search. The search is directed a set of search indexes that are drawn from a database. The database can be a multi-tenant database as described in more detail below or a single-tenant database. In the case of a relational database, it includes many items organized in rows and columns. The search indexes contain identifiers for records of the items in the database. Accordingly, if the query finds a hit within a search index, the hit will relate to a record in the database. Using the identifiers in the index, the database records can be retrieved and provided as search results.

[0077] At 320, the query is received at a query host or application processor for processing. The query host determines whether the query complies with any one of a plurality of different search criteria. Each of the search criteria correspond to a predefined index drawn from the database. If there are many queries with unique criteria that are not indexed, then a new index can be constructed to handle such queries.

[0078] The search criteria can be any of the types suggested above, such as data, entity, or authorization related criteria, among others. In one example, the criterion is a time or a range of times, such as dates, or time of day. The corresponding index for such a criterion can be ordered based on a time sequence from most recent to least recent or vice versa. Such an ordering allows the query to very quickly be applied to the most recent items first based on the ordering of the index. For a query for the oldest items, the same index can be used in reverse order or another index can be generated which lists the items from oldest to newest. Similarly for a time range criterion, the corresponding index can have references to items having a field that complies with the time range.

[0079] Additional indexes can be related to access or authorization privileges. If a user has limited access to the database, then, when the query is received from the user, the access privileges of the user can be analyzed and, if the user has restricted access, then a corresponding index can be used. Such an index has references or identifiers to items or records with a common access level. This approach can be applied to other characteristics of the user as well. The query host, in determining whether the query complies with one of the predefined search criteria, can compare characteristics of the user to characteristics for the predefined indexes. These characteristics might include the user's organization, tenant identifier, subscription rights, registered applications, or other criteria including

authorization.

[0080] At 330, an index is selected for the query. The index can be any of the types mentioned above or others. For speed, the index will be a predefined index that has entries that comply with the corresponding search criterion. However, for later searches, or even a new query, a new index can be built before the search is executed.

[0081] At 340, the query is applied to the selected index. The application to the index can come before any post-filtering of the results against the database. In this way, the index reduces the number of possible hits before any post-filtering is applied. The index can also be used for pre-filtering before the query is applied to other search indexes. In other words, the results are filtered before the search is performed. The search can then focus on the results from the first index and avoid wasting time or computational resources on results that will not be provided to the user in the end. When the index corresponds to an aspect of the query, the index can be used to reduce the results to only those that comply with the most essential criteria. This also saves time in the subsequent search process.

[0082] The listed items within the index can also be ordered so that the query is applied to the index items in a particular order. This can be used to quickly find, for example, the 200 most recently modified listings, or the 200 most recent messages, or the 200 most recent or largest payments. The ordering within the index can be based on any useful criterion such as size price, time date, popularity, etc. This ordering can be used to find results very quickly by applying the query to indexed items based on the ordering of the index. The index results can then be referenced to the database using identifiers within the index. [0083] At 350, items are selected based on applying the query to the selected index. The selection relies on the way that the query is applied. Typically, matching is used, but ranging and other techniques can be applied. In one example tools within Lucene are applied to make these selections.

[0084] At 360, the query is applied to further indexes. This is done after applying the query to the first selected index. The next index may be a selected second predefined index of the database that corresponds to a second one of the search criteria in the query. In one example, the additional indexes are to satisfy the remaining criteria of the query. In another example, additional indexes may be selected based on the characteristics of the user that submitted the query. In any event, to benefit from the first index, the query is applied to the next index after the first index and the application to the second index is limited to items that already have been determined to satisfy the first search criterion. In other words, the query is applied to the second selected index after it is applied to the first selected index.

[0085] For a sequence of indexes, the application to the indexes can be staged or sequenced in different ways. The number of items in each index can be used so that the query is applied to an index with fewer items before an index with more items. The ordering can also be based on the size of each index. In this way, shorter indexes or indexes with shorter pointers that can be processed more quickly are applied first to reduce the number of results, then longer slower pointers or shorter indexes are applied next with the already restricted result set.

[0086] In another embodiment, the query is applied to all indexes simultaneously. The results from each index are matched using a sequence that is selected to provide the most relevant results. In such a case, the search indexes are ordered based on the contents of the query in the manner described above. The query is applied to each index at about the same time, depending upon resource allocation for the search system. The result set from each query is then combined to generate the result set that is returned to the user as the response to the query. The result set from each index is combined with the result set from each other query based on the ordering. In this way, the results are first limited by the first ordered criterion, etc. The final result set may then contain results that are more relevant by selecting which index to use first.

[0087] At 370, a final result set for the query is built by the query host for delivery to the user. The result set will include only items that are in the first selected index. As mentioned above, typically the index will only have references to items in the multi-tenant database. To build a reported result set, the query host first accesses the selected items in the results list from the multi-tenant database using the identifiers, references, or pointers retrieved from the index. Using a list of pointers or indexes that point to data in the multi- tenant database, rather than a list of the actual data, allows the index to be searched more quickly. It also reduces the amount of memory required to store each index. After the items are identified through the index, the pointers can then be used to retrieve the actual items. These are compiled into an appropriate format and sent to the user. In one embodiment, as items are retrieved from the index using the pointers, the results are post-filtered to remove certain results from the report.

[0088] Figure 4 is a simplified flowchart of an example of using an index to improve searching with a user that has limited authorization rights. In this example, a user belongs to an organization that has established different levels of access to a common set of information in the multi-tenant database. A high level manager might have access to all of the data, while a divisional accountant might have access only to financial data for a single division. A sales associate might have access only to that associate's accounts, etc. Such a set of authorization levels can be established by the organization or by an administrator of the account. As mentioned above, for the high level manager, there may not be a benefit to filtering data because all of the data is available. However, for the sales associate, most of the data will not be visible. By limiting searches to only the available data, queries can be processed more quickly and higher quality results will be obtained.

[0089] At 410 in Figure 4, a query is received from the user. The query can come from the client device 204 and be supplied to the application server 206. Upon determining that the incoming request is a query, the query can be passed to the query host 210. The application server may perform some processing of the query first or may simply pass the query on.

[0090] At 420, the user's access privilege level is determined. This may be done by the application server or the query host or by some other equipment. Typically, the user will have logged in using a user identification through a specialized application or an internet browser window. The user identification can be linked to a set of rights and characteristics of the user. This information can include an organization identifier, subscription status, activity status, user preferences for the application or browser and the access privileges or

authorization rights to data in the multi-tenant database. Alternatively, the access privileges can be obtained from a user identifier, a MAC (Media Access Control) address, or in any other way. [0091] At 430, the access level, or access privileges level, or authorization level is identified and then used to select an index to which the query will be applied. The indexes can be predefined and in the case of unlimited access, the process may stop and the query simply handled conventionally. For restricted access, a predefined index corresponding to that level of access is selected. The selected index will contain only those items that correspond to the determined user access privilege level. In other words, only items that the user can see will be included in the index.

[0092] In a search index store 212, there may be many different indexes,

corresponding to different user access privilege levels of the database. Having identified a privilege level for the user one of the privilege level search indexes can be selected for the query. The selected privilege level search index contains identifiers or pointers only to database items corresponding to the determined user access privilege level. For a more complex access privileges system, there may be multiple indexes used to correspond to a particular users access privileges level.

[0093] At 440, the query is applied to the index. This will result in a list of results or hits. The list of results can be further narrowed at 450 by then applying the search to more indexes of the database. These additional indexes can be based on other criteria in the query or additional characteristics of the user. After these indexes, the query is fully processed to develop the final result list.

[0094] At 460 a list of results are compiled. These results will include only items that are represented in the selected user access privilege level index or indexes. The indicated results are retrieved from the database for a result set based on the query. These results may be post-filtered against the database, in some embodiments. At 470, this result set of the selected items is presented to the user. The user can then present further queries, retrieve additional data for each result, edit or modify data, etc.

[0095] Query Optimization

[0096] Figure 5 illustrates an exemplary architecture 500 in which embodiments may operate. Architecture 500 depicts a host system 510 communicably interfaced with several customer organizations (505A, 505B, and 505C) via network 525. Within host system 510 is a multi-tenant database system 530 having a plurality of underlying hardware, software, and logic elements 520 therein that implement database functionality and a code execution environment within the host system 510 and in which the hardware, software, and logic elements 520 of the multi-tenant database system 530 are separate and distinct from a plurality of customer organizations (505A, 505B, and 505C) which utilize the services provided by the host system 510 by communicably interfacing to the host system 510 via network 525. In such an embodiment, each of the separate and distinct customer

organizations (505A-505C) may be remotely located from the host organization that provides services to the customer organizations (505A-505C) via host system 510 having the multi- tenant database system 530 executing therein. Alternatively, one or more of the customer organizations 505A-505C may be co-located with the host system 510, such as within a same host organization that provides the multi-tenant database system 530 upon which underlying data is persistently stored.

[0097] In one embodiment, the hardware, software, and logic elements 520 of the multi-tenant database system 530 include at least a non-relational data store 550 and a relational data store 555, which operate in accordance with the hardware, software, and logic elements 520 that implement the database functionality and code execution environment within the host system 510. Host system 510 may further receive requests 515 from one or more of the plurality of customer organizations 505A-505C via the network. For example, an incoming request 515 may correspond to a request for services or a request to retrieve or store data on behalf of one of the customer organizations 505A-C within the multi-tenant database system 530.

[0098] Figure 6 illustrates an alternative exemplary architecture 600 in which embodiments may operate. In one embodiment, host system 510 implements a method of optimizing queries in a multi-tenant database system 530 having a relational data store 555 and a non-relational data store 550.

[0099] In one embodiment, a request 615 is received at host system 510 on behalf of the multi-tenant database system 530. In such an embodiment, request 615 specifies data 618 to be retrieved from the multi-tenant database system 530. In some embodiments, a distinct web-server 610 operating within the host system 510 receives the incoming request 615 via network 525. For example, web- server 610 may be responsible for receiving requests 615 from various customer organizations 505 A-C via network 525. Web-server 610 may provide a web-based interface to an end-user client machine originating the request 615 (e.g., such as an end-user client device located within a customer organization 505 A-C), the request 615 constituting a request for services from the multi-tenant database system 530 operating within a host organization such as host system 510 that provides, for example, remotely

implemented cloud computing services. In some embodiments, request 615 specifies data 618 to be written to, or updated within, multi-tenant database system 530, rather than retrieved.

[00100] In one embodiment, host system 510 retrieves, based on the request 615 received, one or more locations 616 of the data 618 to be retrieved. . In one embodiment, a customer schema 640 describes the one or more locations 616 of data 618 to be retrieved, in which the customer schema 640 specifies each of the plurality of data elements of the data 618 to be retrieved as residing within either the non-relational data store 550 or residing within the relational data store 555, or as being available from both the non-relational data store 550 and the relational data store 555. In one embodiment, the host system 510 retrieves the customer schema 640 responsive to receiving the request 615. Alternatively, the host system 510 retrieves the one or more locations 616 of the data 618 to be retrieved from the customer schema 640. In one embodiment where the request 615 specifies data to be written or updated within the multi-tenant database system 530, the host system retrieves one or more locations 616 within either the non-relational data store 550 or the relational data store 555 to update data as specified or to write data as specified, for example, the one or more locations 616 being target locations.

[00101] Customer schema 640 may be accessible via, for example, a global caching layer that provides fast efficient access to various elements of a host system 510

implementing or providing the described multi-tenant storage capabilities. In alternative embodiments, the one or more locations 616 of data 618 to be retrieved, updated, or written available, provided, or retrieved via the customer schema 640 by the host system 510, by an optimizer agent 645, by a query layer 660 of the host system 510, or by other elements of the host system 510 responsible for determining the locations 616 of data 618 to be retrieved from the multi-tenant database system 530 that is spread across diverse database

implementations, such as data 618 having a plurality of data elements residing in the nonrelational data store 550 and the relational data store 555 as depicted or alternatively, data that is to be updated within, or written to, the non-relational data store 550 and/or the relational data store 555.

[00102] In one embodiment, the host system 510 generates a database query 617 based on the request 615, in which the database query 617 specifies a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store 550 and one or more other data elements residing within the relational data store 555. In a particular embodiment, the database query 617 is based further on the retrieved one or more locations 616 of the data 618 to be retrieved, updated, or written. Such a database query 617 may further be generated or delegated by the host system 510 for generation by a sub-system of the host system 510, such as query layer 660 or optimizer agent 645. [00103] In some embodiments, database query 617 includes a plurality of sub- queries. In such an embodiment, at least one of the plurality of sub-queries are directed toward retrieving the one or more data elements residing within the non-relational data store 550 from the non-relational data store 550 and at least a second one of the plurality of sub- queries are directed toward retrieving the one or more other data elements residing within the relational data store 555 from the relational data store 555. For example, depicted by Figure 6 within the expanded view of database query 617 are several sub-query strings such as "retrieve data element 'a' from the non-relational data store" (e.g., 550) and "retrieve data element 'b' from the relational data store" (e.g., 555) and another sub-query string which states "select 'x' from 'y' where 'z'" reflective of a generic Structured Query Language (SQL) type query. Such a query may or may not be appropriate for querying the underlying data stores (e.g., 550 and 555) depending upon the implementing query language or syntax chosen.

[00104] Figure 7A illustrates an alternative exemplary architecture 700 in which embodiments may operate. In particular, optimizer agent 645 is described in additional detail in accordance with certain embodiments.

[00105] In one embodiment, host system 510 optimizes the database query 617 to generate an optimized database query 750 having an optimized query syntax that is distinct from a query syntax of the database query 617, such as the syntax specified by an initial, preceding, or un-optimized database query, such as database query 617.

[00106] For example, an incoming request (e.g., 615) may specify data to be retrieved, but may additionally specify a query syntax with which to retrieve the requested data 618 or a query syntax with which to update or write specified data. Such a query syntax may be technically or syntactically capable of retrieving, writing, or updating the specified data (e.g., if executed, the query syntax may execute without error and produce the requested data or effectuate the requested update or write operation), however, the query syntax may nevertheless be capable of optimization. Database queries that have the potential to be optimized, may similarly be generated by the host system 510 itself, such as database query 617 which may be generated by query layer 660.

[00107] The mere fact that a database query 617 may technically or syntactically execute to produce results without error does not necessarily mean that such a query is as efficient as it potentially can be. For example, optimizations may be available to make the query return a result faster, return a result with consuming fewer computing resources, such as bandwidth, processor cycles, memory, etc., return a result utilizing less cost, such as through the use of lower cost commoditized hardware and systems, and so forth.

[00108] Thus, in accordance one embodiment, although a first database query 617 may be specified and available for execution, an optimized database query 750 is generated having optimized query syntax distinct from the original or previously available database query 617. Examples of distinct query syntaxes are provided at Figures 11A and 11B and discussed below.

[00109] In such an embodiment, the optimized database query 750 is then executed against the multi-tenant database system 530 to retrieve, update, or write specified data 618, rather than executing an un-optimized and/or previously available database query 617. In accordance with one embodiment, executing the optimized database query 750 against the multi-tenant database system 530 includes referencing data elements stored in both the relational data store 555 and the non-relational data store 550 so as to retrieve the requisite data 618. Alternatively, the optimized database query 750 includes referencing one or more target locations to update the specified data 618 or write the specified data 618 to the multi- tenant database system 530.

[00110] In one embodiment, query layer 660 receives the optimized database query 750 from optimizer agent 645 and then passes the optimized database query 750 to the multi- tenant database system 530. In alternative embodiments, optimizer agent 645 passes the optimized database query 750 directly to the multi-tenant database system 530 for execution via its underlying hardware, software, and logic elements 520, as depicted via the dashed arrow directing optimized database query 750 from optimizer agent 645 directly to the multi- tenant database system 530.

[00111] In one embodiment, database analytics 745 are generated for one or more customer organizations 505 based on data structures and historical database queries corresponding to each of the one or more customer organizations. In one embodiment, an analytics engine 740 of the optimizer agent 645 generates the database analytics 745. In alternative embodiments, analytics engine 740 operates externally to optimizer agent 645 and provides database analytics 745 to the optimizer agent 645.

[00112] Database analytics 745 may include specified size thresholds (e.g., number of rows, amount of data in terms of size (e.g., megabytes of data), cardinality of data requested, etc.). Such analytics may be performed in advance, before a request for data is received, or on-the-fly, responsive to receiving a request for data.

[00113] Database analytics 745 may be based upon, for example, known statistics and analysis within the host system 510, such as a query cost for a particular database query 617 or for one or more sub-queries (e.g., 706-709 from Figure 7B). Such statistics and analysis may be derived from a known pick list quantity for a particular query in which a maximum number of elements is known, and thus, a maximum or estimated query cost is determinable or is known and available from the optimizer agent 645 or analytics engine 740.

[00114] In one embodiment, the optimized database query 750 is generated based on database analytics 745 which are generated or created for the respective customer

organization 505 having originated a corresponding request 615. For example, database analytics 745 may be available on a per-customer organization basis and may be used for optimizing the original database query 617. Database analytics 745 on a per-customer organization basis may be applicable wherein a particular database table or a particular series of database objects or records are only accessed by one customer. Take for example a listing of sales-leads for a particular customer organization's sales team. Such information may be accessible to only a single customer-organization, and thus, analytics performed against such a table may be specific to that single customer-organization 505.

[00115] Similarly, there may be many customer organizations, each of whom have sales-leads data stored within the multi-tenant database system 530, however, each of the sales-lead data sets associated with the various customer organizations may have different characteristics, such as distinct overall sizes, different numbers of records, different fields, and different storage structures, such as being stored within the non-relational data store 550 portion or the relational data store portion 555 of the multi-tenant database system 530. Each of these varying characteristics may affect the manner in which a particular query may be optimized. Because each customer organization may have data sets with characteristics distinguishable from other customer organizations having data stored within the same multi- tenant database system 530, it may be advantageous to conduct at least a portion of the database analytics 745 on a per-customer organization basis.

[00116] Although some database analytics 745 which relate to data associated with multiple distinct customer organizations 505 or an entire data store as a whole (e.g., 550 or 555) may also be beneficial in some situations, in other situations, having database analytics 745 that are based on an entire data store as a whole (e.g., 550 or 555) or based on multiple distinct customers organizations 505, each having their own varied structures and

characteristics, may potentially cause potential optimizations to be averaged out, or canceled out, and thus, a potential optimization may be missed altogether. Thus, in accordance with certain embodiments, one or more database analytics 745 are based on a per-customer organization 505 basis. [00117] Figure 7B illustrates an alternative exemplary architecture 701 in which embodiments may operate. For example, in one embodiment, host system 510 passes an optimized database query 750 to the multi-tenant database system 530 for execution in which the optimized database query 750 includes one or more database join operations 705.

[00118] In one embodiment, the query optimization consists of a modification to the order of one or more specified join operations within an un-optimized query or a pre-existing query. For example, in one embodiment, an un-optimized database query 617 includes a specified order for two or more database join operations 705. In such an embodiment, optimizing the database query includes modifying the specified order for the two or more database join operations within the optimized database query 750.

[00119] In alternative embodiments, there are no join operations specified within the un-optimized database query 617, and thus, optimizing the database query includes modifying the un-optimized database query 617 to include at least one database join operation 705, or in some embodiments, two or more database join operations 705 and a specified order for the two or more database join operations within the optimized database query 750. In alternative embodiments specified or implicit join operations within an un- optimized database query 617 are negated altogether through alternative query syntax resulting in the optimized database query 750 executing without having to perform a previously specified join.

[00120] In one embodiment, at least one of the database join operations 705 within the optimized database query 750 includes joining a child table from the non-relational data store 550 with a master table from the relational data store 555.

[00121] In some embodiments, a join operation 705 includes multiple sub-queries. For example, in such an embodiment a first sub-query is to be executed against the non-relational data store 550 and identifies the one or more data elements residing within the non-relational data store 550. In such an embodiment, a second sub-query 708 is to be executed against the relational data store 555 which then retrieves data elements 715 from the relational data store 555 that are not available from the non-relational data store 550. Such elements may be joined within a new table in either data store (e.g., 550 or 555) or joined together in a caching location. In alternative embodiments, missing data from one data store is replicated from a source location to a target location. For example, sub-query 708 may determine a data delta 710 between the first sub-query that identifies the one or more data elements residing within the non-relational data store 550 and the one or more other data elements residing within the relational data store 555. In such an embodiment, a third sub- query may be executed against the relational data store 555 and the non-relational data store 550, wherein the third sub-query replicates data corresponding to the determined data delta 710 from the relational data 555 store to the non-relational data store 550, such as that which is depicted by sub-query 708 causing a replication event from relational data store 555 to non-relational data store 550 based on the determined data delta 710.

[00122] For example, sub-query 708 may retrieve the one or more other data elements residing within the relational data store 555, pulling them into, for example, a temporary table, file, temporarily caching the data, etc., and then such a third sub-query 708 may issue an insertion or write command of the retrieved data corresponding to the data delta 710 against the non-relational data store 550, causing the data delta 710 data to be written, stored, or inserted into the non-relational data store 550, thus completing the replication and thus further causing the previously unavailable data elements which resided in the relational data store 555 to now be available from the non-relational data store 550. Refer to the dashed line of Figure 7B depicting sub-query 708 being executed against both data stores (relational data store 555 and non-relational data store 550) to replicate the identified data delta 710 from the relational data store 555 to the non-relational data store 550.

[00123] Optimized database queries 750 utilizing replication or data synchronization between data stores (e.g., 550 and 555) may be derived from a policy to replicate a smaller dataset from its primary location to the location having the larger dataset. For example, replicating the one or more data elements that are part of the requested data may be more efficient from a network bandwidth perspective to conduct the replication from the relational data store 555 to the non-relational data store 550, than vise-versa. Analytics engine 740 may provide such a policy based on previously performed database analytics 745 or based on on- the-fly database analytics 745 conducted via, for example, one or more sub-queries 706-709 within an optimized database query 750 or pre-queries executed as part of the development of the optimized database query 750.

[00124] Other database optimization decisions may be based upon policies that reflect optimizations to data consistency and resiliency. For example, optimized database query 750 may implement one or more sub-queries 706-709 or database query syntax to ensure consistency guarantees. For example, replicated data and synchronized data is not necessarily guaranteed to be in sync depending on how the multi-tenant database system 530 performs its synchronization algorithms. The optimized database query 750 may reflect an optimization that is more computationally economical or more economical from a bandwidth perspective, but does not require a consistency guarantee when the potential for data deviance is an acceptable risk. Alternatively, the optimized database query 750 may reflect an optimization that requires a consistency guarantee, and thus, such an optimization may not emphasize speed or computational efficiency, but instead, emphasize data guarantees. Such a policy may be triggered by an incoming request 615 from a customer organization 505 or triggered internally based on a policy specified by, for example, analytics engine 740.

[00125] Thus, in accordance with one embodiment, optimizing the original database query 617 includes a) identifying a first sub-query within the original database query 617 directed to a table within relational data store 555 in which the first sub-query corresponds to a first portion of data to be retrieved based on an incoming request 615; b) identifying a second sub-query within the original database query 617 directed to a table in the nonrelational data store 550 in which the second sub-query corresponds to a second portion of the data to be retrieved based on the request 615; c) replicating the table within the relational data store 555 to the non-relational data store 550; and d) generating the optimized database query 750 with one or more sub-queries 706-709 to fetch both the first portion of the data to be retrieved and the second portion of the data to be retrieved from the non-relational data store 550. Stated differently, the resultant optimized database query 750 references only the non-relational data store 550 for all of its requisite data to be retrieved, made possible through the replication event conducted by, for example, optimizer agent 645. In some embodiments, the replication even is triggered before the original database query 617 is received, for example, where the original database query 617 is conducted frequently enough for the analytics engine 740 or the optimizer agent 645 to determine that it is economically viable to trigger such a replication event.

[00126] In some embodiments, the replication is performed in the opposite direction, and all data is retrieved from the relational data store 555 instead of the non-relational data store 550. In some embodiments, an entire table specified via corresponding "FROM" clause is replicated, while in others, only particular columns are replicated or particular records derived from a corresponding "WHERE" clause are replicated. For example, a relevant data set may be replicated between the underlying data stores 550 and 555 so as to negate an otherwise required join operation where data elements from each of the data stores 550 and 555 are being evaluated as part of the original database query 617. In yet other embodiments, a non-SQL based syntax is utilized to specify tables or portions of tables for replication. In some embodiments, one or more data structures or data sets are replicated, for example, data in a non-relational data store 550 for which there is no associating "table" structure. For example, a list or series of data files may be replicated from a non-relational data store 550 and represented within a table of a relational data store 555 as object elements. Such files and objects may instead require searching and parsing various elements as they do not reflect a relational "table" based format, but may nevertheless contain relevant data which can be leveraged to conduct a fulfill a request 615.

[00127] In one embodiment, an un-optimized database query 617 includes a plurality of sub-queries each having a corresponding "WHERE" clause and optimizing the database query 617 includes analyzing each of the plurality of sub-queries based on in-memory cached statistics to identify one of the plurality of sub-queries as a most efficient sub-query of the original database query 617. Based on the in-memory cached statistics, the optimized database query 750 is then generated having the one sub-query identified as the most efficient sub-query as a leading query within the optimized database query 750. For example, un- optimized database query 617 may include sub-queries (similar to sub-queries 706-709) in which sub-query 706 is specified as the leading query. In such an embodiment, the optimized database query 750 may similarly include sub-queries 706-709 corresponding to those within the original database query 617, but sub-query 709 may be identified as the most efficient sub-query, and thus, specified as the leading query as part of the query optimization. In one embodiment, the in-memory cached statistics are available via analytics engine 740. In alternative embodiments, the in-memory cached statistics are available via a global caching layer accessible from the optimizer agent 645.

[00128] In one embodiment, the most efficient query among a plurality of sub- queries in the un-optimized database query 617 is identified based on the one sub-query among the plurality of sub-queries within the un-optimized database query 617 corresponding to a best metric or a best combination of metrics in accordance with the in-memory cached statistics. In such an embodiment, each metric may be selected from the group comprising: a number of rows, a number of blocks, an average row length; Input/Output (I/O) utilization; processor utilization; and execution time. For example, a sub-query targeting a table or data structure with a small number of elements may result in subsequent sub-queries being more efficient as there may be fewer elements to evaluate stemming from the decision to lead with a sub-query directed to the smaller table or data structure. Alternatively, it may be determined that a larger table organized relationally is more efficient to lead an optimized query than having to search or parse an smaller index or a file which is not organized relationally.

Analytics engine 740 may specify such policy determinations and thresholds based on available statistics and analytics.

[00129] In one embodiment, optimizing the database query 617 includes a) identifying a sub-query within the un-optimized database query 617 that references a parent table within the relational data store as a leading sub-query within the un-optimized database query 617; b) injecting a new join operation to a foreign key index into the leading sub-query to the parent table in the relational data store, wherein the join operation joins a custom index on a foreign key for the non-relational data store; and c) leading the optimized database query 750 with the sub-query to the parent table having the join operation injected therein.

[00130] In accordance with one embodiment, optimizing the database query includes: a) identifying within an un-optimized database query, a sub-query to a named table within the relational data store having a "WHERE" clause to invoke a full table scan; b) replacing the identified sub-query to invoke the full table scan with an optimized sub-query to invoke a range scan of an index table associated with the named table; c) leading the optimized database query with the optimized sub-query; and d) injecting a join operation to the optimized database query, wherein the join operation combines indexed column data from the index table with results of the optimized sub-query when available and further joins non- indexed column data from the named table when indexed columns cover less than all required columns of the named table based on the identified sub-query.

[00131] Full table scans may provide significant opportunities for optimization, especially where an index table already exists for named data table. A full range scan may occur or be invoked by a database query when a table is specified and a "WHERE" clause specifies a particular matching condition. For example, "WHERE" an employee ID equals 12345 or "WHERE" salary is less than $50,000.00. If transacted in an un-optimized form, the query causes the database to consider each and every row within a named table to evaluate whether or not the condition exists. The query has no opportunity to pre-terminate until all rows are evaluated, a potentially wasteful exercise in large tables (e.g., a million row table thus requires a million condition evaluations). Conversely, by replacing the sub-query with an optimized sub-query directed toward an index table, the indexing scheme (e.g., traversing a b-tree, etc.) may be utilized to complete the same evaluation required by the "WHERE" clause, without having to consider all rows, and potentially negating consideration of a vast majority of rows in the named table. Where additionally required column data is present within the index table (e.g., because the columns are frequently referenced and thus indexed), the additionally required column data may be retrieved from the index table. Where column data is not covered by the index table, the originally specified named table may be referenced, now requiring only consideration of a known subset of rows or records due to the preceding optimized sub-query leading the directed to lead the optimized database query. [00132] In one embodiment, an un-optimized database query 617 includes a plurality of sub-queries within the un-optimized database query 617, each having a corresponding "WHERE" clause to retrieve a specified plurality of data elements 715 and optimizing the database query includes a) generating a pre-query for two or more of the plurality of sub- queries, in which each pre-query specifies the retrieval of a portion of the respective plurality of data elements 715 for the corresponding sub-query; b) executing each of the pre-queries against the multi-tenant database system 530; c) ranking the executed pre-queries on the basis of one or more metrics including Input/Output (I/O) utilization for the pre-query, processor utilization for the pre-query, and execution time for the pre-query; and d) generating the optimized database query 750 based on the ranking of the executed pre-queries.

[00133] For example, some un-optimized database queries 617 will solicit data retrieval or initiate database actions for which sufficient statistics or database analytics 745 are not available via the analytics engine 740 or elsewhere. When a requested un-optimized database query 617 is sufficiently large in terms of computational resources, execution time, bandwidth resources, or some other metric, it may be more efficient to have the optimizer agent 645 initiate probing pre-queries and test-queries against the multi-tenant database system soliciting a sub-set of the required information in order to determine which of various available retrieval or execution schemes are most efficient. Although the pre-queries themselves consume time and resource, the resources consumed may provide a return (e.g., an efficiency improvement or optimization) that is greater than the resources required to generate, execute, and evaluate the results of such pre-queries. For example, if a requested database query 617 requires searching millions of rows, it may very well be advantageous to trigger probing queries on a small subset of the total rows and evaluate which of multiple options most efficiently executes.

[00134] In some embodiments, a determination whether to conduct pre-queries may be made by the optimizer agent based on statistics or database analytics 745. For example, where an un-optimized database query 617 is evaluated to require resources below a particular threshold, pre-queries may be by-passed and other optimizations implemented not requiring the pre-queries. Such an evaluation may be based on, for example, a known cardinality or table depth of a specified target table or specified columns in a "WHERE" clause of the un-optimized database query 617. Other considerations may be based upon a size classification for a particular client organization, a corresponding size on disk of a specified table or data structure, etc. In some embodiments, pre-query results are

stored/cached and referenced in subsequent database requests 615 for which a common pre- query result may be referenced without re-execution. In one embodiment, a known pick list quantity (e.g., a known a maximum number of elements) for a referenced column or table or relevant sub-query is known, and thus, a maximum or estimated query cost is known and available from the analytics engine 740 or may be determinable via the optimizer agent 645 (e.g., through one or more pre-queries).

[00135] In one embodiment, a pre-query operates as a filter. For example, optimizer agent 645 may generate an execute query logic to retrieve additional detail from the database query to be optimized in search of or to "filter" which of multiple optimized query statements may be most appropriate or most efficient for the underlying database store. A similar database query may be optimized one way for the non-relational data store 550 and yet another way for the relational data store 555. For example, in one embodiment, a

getSelectiveFilterO function or SqlOptimizer.getSelectiveFilter() logic module systematically processes multiple SQL based "WHERE" clauses looking for which one of the multiple available is the best, most appropriate, or most efficient one to lead an optimized database query 750. In such an embodiment, the identified "WHERE" clause is selected and used to lead an optimized database query 750 having multiple sub-queries. Such an optimization may output extra leading joins and additional directives supported by the underlying data store (550 or 555), within the optimized database inquiry, for example, instructing the underlying data store (550 or 555) on how to process, respond to, implement, or execute the optimized database query 750.

[00136] In one embodiment, a filtering mechanism e.g., getSelectiveFilterO, may be based on available types of IndexedFilter implementations supported by an underlying data store (550 or 555). For example, basic custom indexes and name-indexes may be referenced (e.g., search name lookup), or more sophisticated query scopes (e.g., My Records) or index joins may be specified via an optimized database query 750. Such indexes and query scopes can be implemented on relational database query engines as well as non-relational database query engines. In addition to specialized filters and indexes, pivot tables may further be utilized as potential query optimizations when supported by the target data store. Whether or not such capabilities are supported by a particular data store (550 or 555) may be tracked and managed by implementing logic of an optimizer agent 645, or alternatively, may be discoverable via an analytics engine 740 on behalf of the optimizer agent.

[00137] In one embodiment, a database store dependent key function, mechanism, or method employed by a filter mechanism (e.g., IndexedFilter) implements a getSlowStats() or equivalent function to perform a database store-dependent pre-query to estimate the selectivity of queries. Alternatively, a getFastStats() or equivalent function may be used in a database store-independent fashion when the cardinality of a query can be determined from in-memory cached statistics. For example, where a picklist histogram is known and available (e.g., via analytics engine 740) without having to execute pre-queries. Same or equivalently corresponding data structures (e.g. where a custom_index_ndv = num distinct values) can be used to store statistics regardless of whether the underlying database store is relational or non-relational.

[00138] In some embodiments, various implicit indexes may be presumed to always be present, regardless of whether implementation and/or optimization query logic is data store dependent (e.g., specific to either non-relational data store 550 or relational data store 555). For example, a Name field of a custom object may be presumed to always be indexed (e.g., such as when targeting a relational Oracle™ based data store). A SearchNameLookup function or equivalent may be used along with its corresponding implementation in a relational data store 555, whereas in non-relational data store 550, a custom index on the Name field may be used to implement equivalent functionality and allow for the above presumption of always being present. In some embodiments utilizing foreign keys, an optimizer agent 645 may presume a multi-tenant CFKV (custom_foreign_key_value) table to be indexed when targeting a relational data store 555, whereas when optimizing query logic targeting a non-relational data store 550, an implicit custom index may be presumed.

[00139] In some embodiments, a policy may specify that optimizing query logic uses a selective filter mechanism, e.g., getSelectiveFilter(), by default to lead the optimized database query 750 in all cases, via query syntax appropriate for the underlying data store (550 or 555). Where optimizer agent 645 determines to lead the query with a parent table in a join, a join to the foreign key index table may then be injected to support the query lead (e.g., CFKV for relational stores or equivalent custom index on a foreign key for non-relational data store implementations).

[00140] In some embodiments, a non-relational data store 550 is made to support corresponding or equivalent query operations as a relational data store 555 via an API (Application Programming Interface) to the respective underlying data stores. For example, such an API may be implemented in the query layer 660 or via the hardware, software, and logic elements 520 of the multi-tenant database system 530. Using such an API, nonrelational database objects can be made to support the same query mechanisms as other objects on an otherwise distinct database structure. In some embodiments, the API that implements corresponding or equivalent query operations for distinct data store architectures (e.g., 550 and 555) is made available to the customer organizations 505, for example, via a public dynamic class of Salesforce.com based objects (e.g., "SObjects" or via

com.salesforce. objects, for example, inheritable from mx.utils.ObjectProxy). In a particular embodiment, an API query() verb and a Salesforce.com Object Query Language (SOQL) compatible syntax is made available as an entry point to a query targeting non-relational data or data stored in a non-relational data store 550, notwithstanding a lack of native nonrelational data store 550 support for such a query syntax (the non-compatible syntax is thus transformed via the API or query layer 660).

[00141] In a particular embodiment, an API query retrieve verb is directly mapped to a non-relational data store's 550 retrieve() function internal call, thus permitting less parsing and execution sophistication. In other embodiments, the API query retrieve verb may be parsed and executed or translated into a format supported by the underlying data store, regardless of type.

[00142] In certain embodiments, an underlying data store cannot be made to implement equivalent or corresponding functionality which is supported by a different type of data store. In such cases, the API may map to functionality at an application layer, external to the underlying data store (550 or 555) itself, in which the Application layer function retrieves data from the underlying data store utilizing supported query logic, and then manipulates the data at the application layer to reflect the functionality which is not implementable within the underlying data store (550 or 555) itself. The resulting data may then be placed into a caching layer or written to a temporary location within the data store so that an optimized database query 750 can retrieve the manipulated data, for example, as part of a sub-query that may otherwise require the functionality which is not implementable within the data store itself. In such a way, the API appears to a referencing entity as though the requested functionality is supported, regardless of whether the non-relational data store 550 or the relational data store 555 is targeted.

[00143] In accordance with some embodiments, optimizer agent 645 may seek to implement data guarantees where data inconsistency may potentially exist. For example, in one embodiment optimizing the database query includes a) identifying a first sub-query to the non-relational data store 550; b) identifying a second sub-query to the relational data store 555 in which the second sub-query is a data delta query to an append log of the relational data store 555 for new or updated data elements within the scope of the first sub-query to the nonrelational data store 550 which are not yet flushed from the append log of the relational data store 555 to the non-relational data store 550 (e.g., determines a data delta 710 between a guaranteed up to date append log and a potentially out of date non-relational data store 550); c) executing or triggering a flush against the relational data store 555 to flush the new or updated data elements within the append log of the relational data store 555 to the nonrelational data store 550; and d) eliminating the second sub-query to the relational data store 555 from the optimized database query 750. Stated differently, where data consistency guarantees are deemed important enough by the optimizer agent 645 for a particular data request 615, the database query 617 may be optimized to ensure that the non-relational data store 550 which may ordinarily exhibit some level of acceptable data inconsistency is made to be up to date immediately prior to executing the final optimized database query 750 against the non-relational data store 550. Such situations may arise where information that is permanently stored in a non-relational data store is transacted initially and temporarily to a relational data store 555. In alternative embodiments, only the missing elements reflected by the data delta 710 are synchronized or are joined in a temporary cache location.

[00144] In accordance with one embodiment, executing the optimized database query 750 against the multi-tenant database system 530 to retrieve the data requested includes a) designating the data to be retrieved as a massive data set based on a comparison of an estimated or actual size of the data to be retrieved against a massive query threshold; b) breaking up the optimized database query 750 into a plurality of optimized sub-queries (e.g., 706-709), each optimized sub-query corresponding to a portion of the massive data set; c) allocating execution of the plurality of optimized sub-queries to one work thread processor within a pool of work thread processors 795, in which the one work thread processor to establish a spooling algorithm internal to the one work thread processor to spool/cache one or more portions of the massive data set returned to the one work thread processor responsive to execution of one or more of the plurality of optimized sub-queries, and further in which the spooling algorithm retrieves a re-requested portion of the massive data set from the spool/cache, when available, in place of re-executing a corresponding optimized sub-query to retrieve the re-requested portion from the underlying data store (e.g., 550 or 555). Such a technique may be referred to as "chunking."

[00145] In such a way, a request soliciting a massive data set can be serviced in parts, without triggering multiple distinct work thread processors 795 to execute in parallel in an effort to complete execution of the un-optimized database query 617. Such an approach may be especially beneficial where a massive data set is triggered or requested by mistake (e.g., the query is broader than it need be) or where a retrieving a massive data set in parallel (e.g., as quickly as possible) could detrimentally impair performance of the multi-tenant database system 530 for other customer organizations 505 utilizing the system.

[00146] For example, a customer organization 505 may initiate a request 615 for data not realizing the size of the resultant data set may be massive. Although total execution time may be increased for the requesting party (e.g., one of the customer organizations 505), the total load on the multi-tenant database system 530 which is shared by multiple customer organizations 505 may be lessened, and thus, overall system efficiency may be improved. Moreover, if a party is reviewing portions of the data due to the large size of the resultant data set, those sub-portions may be transmitted more efficiently having been broken up into parts or chunks, and re-requests for sub-portions triggered by, for example, paging backwards and forward through a data set, are made more efficient by being serviced via the spooling algorithm implemented by the one work thread processor responsible for executing the optimized sub-queries. In such an embodiment, optimizing the database query 617 results in the optimized database query 750 being arranged for serialized execution rather than parallelized execution.

[00147] In one embodiment, executing the optimized database query 750 against the multi-tenant database system 530 to retrieve the data includes a) sending the optimized database query 750 to a query layer agent that operates on behalf of the multi-tenant database system 530 or operates within the multi-tenant database system (e.g., such as query layer 660). In such an embodiment, the query layer agent executes a plurality of sub-queries constituting the optimized database query 750 against the relational data store 555 and the non-relational data store 550 based in part on the customer schema (e.g., 640) and based further on an optimized query syntax defined by the plurality of optimized database sub- queries 750.

[00148] In one embodiment, a query layer agent (e.g., 660) executes the plurality of optimized sub-queries making up an optimized database query 750 by designating or allocating each of the plurality of optimized sub-queries to one distinct work thread processor within a pool of work thread processors 795, in which each work thread processor 795 in the pool executes zero, one, or a plurality of the plurality of sub-queries constituting the optimized database query 750. In such a way, an optimized database query 750 may be parallelized resulting in a more time-efficient execution.

[00149] Different types of data may be stored by multi-tenant database system 530. For example, in one embodiment, the one or more data elements residing within the nonrelational data store 550 correspond to plurality of compressed flat files or a plurality of binary files or a combination of compressed flat files and binary files. Such files may be more efficiently stored via a non-relational database store (e.g., 550).

[00150] In another embodiment, relational data store 555 implements a relational database in accordance with a relational database management system (RDBMS), in which a plurality of relation tables of the relational database are inter-related to each other through one or more overlapping common characteristics for each of two or more relation tables within the relational database, thus forming the "relationships" which are commonly associated with relational type data stores 555.

[00151] In one embodiment, the non-relational data store 550 implements a distributed structured database having a plurality of underlying hardware storage devices, each providing at least a portion of a total storage capacity for the non-relational data store 550. In such an embodiment, data elements within the non-relational data store 550 are referenceable on the basis of a primary key, but are not referenceable on the basis of one or more overlapping common characteristics between two or more relation tables, such as is the case with data elements within the relational data store 555.

[00152] In one embodiment, the relational data store 555 implements a relational database model selected from among the following: an Oracle compatible database implementation, an IBM DB2 Enterprise Server compatible relational database

implementation, a MySQL compatible relational database implementation, and a Microsoft SQL Server compatible relational database implementation.

[00153] In one embodiment, the non-relational data store 550 implements a NoSQL non-relational database implementation selected from among the following: a Vampire compatible non-relational database implementation, an Apache Cassandra compatible non-relational database implementation, a BigTable compatible non-relational database implementation, and an HBase compatible non-relational database implementation.

[00154] In one embodiment, the non-relational data store 550 includes a plurality of distributed computing nodes, each computing node comprising at least a memory, one or more processors, and one or more communicatively interfaced hard disk drives. In such an embodiment, each of the distributed computing nodes may further include an isolated non-relational database instance having functionality to read, write, and update non-relational database transactions without authorization or control from a centralized transaction authority.

[00155] In a particular embodiment, the relational data store 555 implements a monolithic relational database instance comprising memory and processors that coordinate computing resources with a centralized transaction authority that controls whether updates or changes to the monolithic relational database instance are committed to persistent storage upon persistent storage devices communicatively interfaced to, and controlled by, the monolithic relational database instance.

[00156] Figure 8 shows a diagrammatic representation of a system 800 in which embodiments may operate, be installed, integrated, or configured.

[00157] In one embodiment, system 800 includes a memory 895 and a processor or processors 890. For example, memory 895 may store instructions to be executed and processor(s) 890 may execute such instructions. System 800 includes bus 815 to transfer transactions and data within system 800 among a plurality of peripheral devices

communicably interfaced with bus 815. System 800 further includes web-server 825, for example, to receive requests, return responses, and otherwise interface with remote clients, such as client devices located within customer organizations 505A-C.

[00158] System 800 is further depicted as having a query layer agent 835 designed to transmit, transact, execute, designate, and/or allocate optimized data queries and optimized database sub-queries against multi-tenant database system (such as that depicted at element 530 of Figure 5). In some embodiments, query layer agent 835 passes the optimized database queries to the multi-tenant database system 530 for execution without specifying how to process optimized sub-queries. In other embodiments, the query layer agent 835 manages how optimized sub-queries are parallelized and/or serialized for execution.

[00159] System 800 further includes a global caching layer 850 to provide caching services to communicably interfaced devices and systems and in particular, provide caching of customer schema data (e.g., meta data, etc.), database analytics, database statistics, stored pre-query results, etc, such as that which is provided by customer schema 830 and/or analytics engine module 840, each of which operate in conjunction with the global caching layer 850. For example, customer schema 830 specifies whether requisite data elements are stored by a relational database or a non-relational database implementation within the multi- tenant database system or both, and may further specify locations within the underlying data stores for one or more data elements that make up a dataset for a corresponding request. Analytics engine module 840 specifies and stores various statistics and database analytics. Information associated with customer schema 830 and analytics engine module 840 may be stored upon a hard drive, a persistent data store or other storage location within system 800.

[00160] Distinct within system 800 is hardware based Optimizer Agent 801 which includes statistics processor 870, customer schema processor 875, pre-query generator 880, and sub-query processor 885. In accordance with one embodiment, statistics processor 870 retrieves database statistics, analytics, and stored pre-query/probing query results from the global caching layer 850 or from analytics engine module 840. Customer schema processor 875 may retrieve the one or more locations of the requested data that is to be retrieved from the underlying data stores based on the request, or may retrieve equivalent information from the global caching layer 850. Pre-query generator 880 develops and generates the necessary probing queries or pre-queries when necessary to either retrieve sub-sets or portions of the requested data from the appropriate underlying data stores when such pre-queries are evaluated to be beneficial. Some pre-queries may trigger or initiate join operations causing data subsets to be synchronized, flushed, or replicated from one data store to another, so that subsequent sub-queries can retrieve an entire requested data set from a lone data store. Such pre-queries may rely upon statistics and pre-query results available from the analytics engine module 840. Sub-query processor 885 may perform optimization upon individual sub-queries to generate an overall optimized database query. Such optimizations may include the introduction of a foreign key injection, inserting a join operation, re-ordering/re-sequencing underlying sub-queries, and so forth, so as to optimize the resulting database query that is to be executed.

[00161] Figure 9 is a flow diagram illustrating a method 900 for optimizing queries in a multi-tenant store in accordance with one embodiment, including optimizing database queries or a plurality of sub-queries constituting a database query. Such optimizations may include, for example, improving total execution time, reducing computational workload, reducing bandwidth usage, implementing data consistency guarantees, and so forth. Method 900 may be performed by processing logic that may include hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions run on a processing device to perform various query operations such reading, writing, updating, optimizing, initiating pre-queries, developing sub-queries, etc., or a combination thereof. In one embodiment, method 900 is performed by hardware logic, such as the hardware based optimizer agent depicted at element 801 of Figure 8. Some of the blocks and/or operations listed below are optional in accordance with certain embodiments. The numbering of the blocks presented is for the sake of clarity and is not intended to prescribe an order of operations in which the various blocks must occur.

[00162] Method 900 begins with processing logic receiving a request at a host system for the multi-tenant database system, the request specifying data to be retrieved from the multi-tenant database system (block 905). At block 910, processing logic retrieves, based on the request via the host system, one or more locations of the data to be retrieved. [00163] At block 915, processing logic retrieves, via the host system, a customer schema responsive to receiving the request. For example, a customer schema may describe the one or more locations of data to be retrieved, the customer schema specifying each of the plurality of data elements of the data as residing within either the non-relational data store or residing within the relational data store, or as being available from both the non-relational data store and the relational data store.

[00164] At block 920, processing logic generates, at the host system, a database query based on the request. For example, the database query may specify a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store and one or more other data elements residing within the relational data store. The database query may further include a plurality of sub- queries. In one embodiment, the database query specifies a join operation via one of the sub- queries. In some embodiments, a database query is specified by, or provided within, an incoming customer request.

[00165] At block 925, processing logic optimizes the database query via a query optimizer to generate an optimized database query having an optimized query syntax that is distinct from a query syntax of the database query. Such query optimizations improve one or more execution characteristics above and beyond that which is provided for via the originally specified database query or an un-optimized database query, such as execution time, consistency guarantees, serializing or parallelizing execution based on specified policy, etc.

[00166] At block 930, processing logic executes the optimized database query against the multi-tenant database system to retrieve the data. In accordance with some embodiments, the original database query specified or requested for execution is discarded and is not passed to the multi-tenant database system for execution, thus leaving only the optimized database query to be executed.

[00167] At block 935, processing logic generates database analytics for one or more customer organizations based on data structures and historical database queries corresponding to each of the one or more customer organizations. For example, statistics and database analytics may be generated on a per customer organization basis. In some embodiments, such statistics and analytics are executed prior to receiving a database query which requires them. Alternatively, statistics and analytics may be generated on-the-fly, responsive to receiving a database query to be optimized, and then used in the database query optimization process. Cross customer organization statistics and analysis may similarly be generated and utilized by a query optimizer. [00168] At block 940, processing logic generates and executes one or more pre- queries and utilizes the results of the one or more pre-queries in generating the optimized database query. For example, an optimizing agent may rank the executed pre-queries on the basis of one or more metrics including Input/Output (I/O) utilization for the pre-query, processor utilization for the pre-query, and/or execution time for the pre-query and generate the optimized database query based on the ranking of the executed pre-queries. Various policies may control the rankings. For example, one policy may emphasize reduced I/O utilization, while another may emphasize reduced processor loads, while yet another emphasizes fasted execution times, for example, by implementing sub-queries that may be executed in parallel.

[00169] Figure 10 illustrates a diagrammatic representation of a machine 1000 in the exemplary form of a computer system, in accordance with one embodiment, within which a set of instructions, for causing the machine 1000 to perform any one or more of the methodologies discussed herein, may be executed. In alternative embodiments, the machine may be connected (e.g., networked) to other machines in a Local Area Network (LAN), an intranet, an extranet, or the Internet. The machine may operate in the capacity of a server or a client machine in a client- server network environment, or as a peer machine in a peer-to-peer (or distributed) network environment or as a server or series of servers within an on-demand service environment, including an on-demand environment providing multi-tenant database storage services (e.g., a "cloud computing" environment in which a remotely located multi- tenant database system is accessed via, for example, the Internet). Certain embodiments of the machine may be in the form of a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, switch or bridge, computing system, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines (e.g., computers) that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

[00170] The exemplary computer system 1000 includes a processor 1002, a main memory 1004 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or Rambus DRAM (RDRAM), etc., static memory such as flash memory, static random access memory (SRAM), volatile but high-data rate RAM, etc.), and a secondary memory 1018 (e.g., a persistent storage device including hard disk drives and persistent multi-tenant database implementations), which communicate with each other via a bus 1030. Main memory 1004 includes customer schema 1024 (e.g., specifies one or more locations of data or data elements constituting a specified data or data set among two or more diverse data stores, such as locations of data elements spread across both a relational data store and a non-relational data store and retrievable via a query layer agent or via a hardware based optimizer agent 1034). Hardware based optimizer agent 1034 may implement database query optimizations based on statistics and analytics stored within global caching layer 1023 or available via an analytics engine. In one embodiment, global cache layer 1023 resides within memory 1004 and implements a system- wide accessible global caching layer to provide meta-data and other association or correspondence information between multiple data elements of a larger data set, such as the type of information provided via customer schema 1024 in addition to analytics and statistics. Main memory 1004 and its sub-elements (e.g. 1023 and 1024) are operable in conjunction with processing logic 1026 and processor 1002 to perform the methodologies discussed herein.

[00171] Processor 1002 represents one or more general-purpose processing devices such as a microprocessor, central processing unit, or the like. More particularly, the processor 1002 may be a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processor 1002 may also be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. Processor 1002 is configured to execute the processing logic 1026 for performing the operations and functionality which is discussed herein.

[00172] The computer system 1000 may further include a network interface card 1008. The computer system 1000 also may include a user interface 1010 (such as a video display unit, a liquid crystal display (LCD), or a cathode ray tube (CRT)), an alphanumeric input device 1012 (e.g., a keyboard), a cursor control device 1014 (e.g., a mouse), and a signal generation device 1016 (e.g., an integrated speaker). The computer system 1000 may further include peripheral device 1036 (e.g., wireless or wired communication devices, memory devices, storage devices, audio processing devices, video processing devices, etc.

[00173] The secondary memory 1018 may include a non-transitory machine-readable storage medium (or more specifically a non-transitory machine-accessible storage medium) 1031 on which is stored one or more sets of instructions (e.g., software 1022) embodying any one or more of the methodologies or functions described herein. The software 1022 may also reside, completely or at least partially, within the main memory 1004 and/or within the processor 1002 during execution thereof by the computer system 1000, the main memory 1004 and the processor 1002 also constituting machine-readable storage media. The software 1022 may further be transmitted or received over a network 1020 via the network interface card 1008.

[00174] Figures 11A and 11B each set forth a table (1100 and 1101 respectively) depicting exemplary database query optimizations in accordance with the embodiments described herein.

[00175] The first column of each table labeled "query" sets forth an "original query" (second column), a corresponding "optimized query" (third column), and "optimization notes" for the queries as depicted by query example rows 1105, 1110, 1115, 1120, 1125, 1130, and 1135 respectively.

[00176] Each query row 1105-1135 provides an example of a potential database query optimization in accordance with the described embodiments. For example, the various query example rows depict filtering optimizations (e.g., 1105), an upper bound optimization (1110), range scan termination optimizations (1115 and 1120), negation of a join operation optimization (1125), empty byte value optimization in place of an "IS NULL" function (1130), and an example of a sub-query optimization implementing evaluation logic within a generated optimized database query (e.g., 1150) thus negating an otherwise required join operation (1135).

[00177] While the subject matter disclosed herein has been described by way of example and in terms of the specific embodiments, it is to be understood that the claimed embodiments are not limited to the explicitly enumerated embodiments disclosed. To the contrary, the disclosure is intended to cover various modifications and similar arrangements as would be apparent to those skilled in the art. Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements. It is to be understood that the above description is intended to be illustrative, and not restrictive. Many other embodiments will be apparent to those of skill in the art upon reading and understanding the above description. The scope of the disclosed subject matter is therefore to be determined in reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.

Claims

CLAIMS What is claimed is:

1. A non-transitory computer readable storage medium having instructions stored thereon that, when executed by processor in a host system, the instructions perform operations comprising:

receiving a request at the host system for the multi-tenant database system, the request

specifying data to be retrieved from the multi-tenant database system;

retrieving one or more locations of the data to be retrieved based on the request;

generating a database query based on the request, wherein the database query specifies a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store and one or more other data elements residing within the relational data store;

optimizing the database query to generate an optimized database query having an optimized query syntax that is distinct from a query syntax of the database query; and executing the optimized database query against the multi-tenant database system to retrieve the data.

2. The non-transitory computer readable storage medium of claim 1, wherein the database query comprises a plurality of sub-queries; and

wherein optimizing the database query comprises:

analyzing each of the plurality of sub-queries based on in-memory cached statistics; identifying one of the plurality of sub-queries as a most efficient sub-query based on the in-memory cached statistics, and

generating the optimized database query having the one sub-query identified as the most efficient sub-query as a leading query within the optimized database query.

3. The non-transitory computer readable storage medium of claim 1, wherein the multi-tenant database system further comprises elements of hardware and software that are shared by a plurality of separate and distinct customer organizations, each of the separate and distinct customer organizations being remotely located from a host organization having the multi-tenant database system executing therein.

4. The non-transitory computer readable storage medium of claim 1, wherein the operations further comprise:

generating database analytics for one or more customer organizations based on data structures and historical database queries corresponding to each of the one or more customer organizations; and

wherein optimizing the database query comprises generating the optimized database query based on the database analytics generated for a respective one of the customer organizations having originated the request.

5. The non-transitory computer readable storage medium of claim 1:

wherein the database query comprises a specified order for two or more database join

operations; and

wherein optimizing the database query comprises modifying the specified order for the two or more database join operations.

6. The non-transitory computer readable storage medium of claim 5, wherein at least one of the database join operations comprises joining a child table from the non-relational data store with a master table from the relational data store.

7. The non-transitory computer readable storage medium of claim 1:

wherein the database query comprises a plurality of sub-queries; and

wherein optimizing the database query comprises:

8. The non-transitory computer readable storage medium of claim 7, wherein the most

efficient query comprises the one sub-query among the plurality of sub-queries corresponding to a best metric or a best combination of metrics in accordance with the in-memory cached statistics, wherein each metric is selected from the group comprising: a number of rows, a number of blocks, an average row length;

Input/Output (I/O) utilization; processor utilization; and execution time.

9. The non-transitory computer readable storage medium of claim 1:

wherein the database query comprise a plurality of sub-queries, each having a corresponding "WHERE" clause to retrieve a specified plurality of data elements; and

wherein optimizing the database query comprises:

generating a pre-query for two or more of the plurality of sub-queries, wherein each pre-query specifies the retrieval of a portion of the respective plurality of data elements for the corresponding sub-query;

executing each of the pre-queries against the multi-tenant database system;

ranking the executed pre-queries on the basis of one or more metrics including

Input/Output (I/O) utilization for the pre-query, processor utilization for the pre-query, and execution time for the pre-query; and

generating the optimized database query based on the ranking of the executed pre- queries.

10. The non-transitory computer readable storage medium of claim 1, wherein optimizing the database query comprises:

identifying a sub-query to a parent table within the relational data store as a leading sub- query within the database query;

injecting a join operation to a foreign key index into the leading sub-query to the parent table in the relational data store, wherein the join operation joins a custom index on a foreign key for the non-relational data store; and

leading the optimized database query with the sub-query to the parent table having the join operation injected therein.

11. The non-transitory computer readable storage medium of claim 1, wherein optimizing the database query comprises:

identifying within the database query, a sub-query to a named table within the relational data store having a "WHERE" clause to invoke a full table scan;

replacing the identified sub-query to invoke the full table scan with an optimized sub-query to invoke a range scan of an index table associated with the named table;

leading the optimized database query with the optimized sub-query; and

injecting a join operation to the optimized database query, wherein the join operation

combines indexed column data from the index table with results of the optimized sub- query when available and further joins non-indexed column data from the named table when indexed columns cover less than all required columns of the named table based on the identified sub-query.

12. The non-transitory computer readable storage medium of claim 1, wherein optimizing the database query comprises:

identifying a first sub-query to a table within relational data store, the first sub-query

corresponding to the first portion of the data to be retrieved;

identifying a second sub-query to the non-relational data store; the second sub-query

corresponding to the second portion of the data to be retrieved; replicating the table within the relational data store to the non-relational data store; and generating the optimized database query with one or more sub-queries to fetch both the first portion of the data to be retrieved and the second portion of the data to be retrieved from the non-relational data store.

13. The non-transitory computer readable storage medium of claim 1, wherein optimizing the database query comprises:

identifying a first sub-query to the non-relational data store;

identifying a second sub-query to the relational data store, wherein the second sub-query comprises a data delta query to an append log of the relational data store for new or updated data elements within the scope of the first sub-query to the non-relational data store which are not yet flushed from the append log of the relational data store to the non-relational data store;

executing a flush against the relational data store to flush the new or updated data elements within the append log of the relational data store to the non-relational data store; and eliminating the second sub-query to the relational data store from the optimized database query.

14. The non-transitory computer readable storage medium of claim 1, wherein executing the optimized database query against the multi-tenant database system to retrieve the data comprises:

designating the data to be retrieved as a massive data set based on a comparison of an

estimated or actual size of the data to be retrieved against a massive query threshold; breaking up the optimized database query into a plurality of optimized sub-queries, each optimized sub-query corresponding to a portion of the massive data set;

allocating execution of the plurality of optimized sub-queries to one work thread processor within a pool of work thread processors, wherein the one work thread processor to establish a spooling algorithm internal to the one work thread processor to

spool/cache one or more portions of the massive data set returned to the one work thread processor responsive to execution of one or more of the plurality of optimized sub-queries, and wherein the spooling algorithm retrieves a re-requested portion of the massive data set from the spool/cache, when available, in place of re-executing a corresponding optimized sub-query to retrieve the re-requested portion.

15. The non-transitory computer readable storage medium of claim 1, wherein executing the optimized database query against the multi-tenant database system to retrieve the data comprises: sending the optimized database query to a query layer agent of the multi-tenant database system, wherein the query layer agent executes a plurality of sub-queries constituting the optimized database query against the relational data store and the non-relational data store based in part on the customer schema and based further on an optimized query syntax defined by the plurality of optimized sub-queries.

16. The non-transitory computer readable storage medium of claim 14, wherein the query layer agent executes the plurality of sub-queries by designating each of the plurality of sub-queries to a work thread processor within a pool of work thread processors, and wherein each work thread processor in the pool executes zero, one, or a plurality of the plurality of sub-queries constituting the optimized database query.

17. The non-transitory computer readable storage medium of claim 1:

wherein receiving the request via an interface of the multi-tenant database system comprises receiving the request via a web-server of the multi-tenant database system, wherein the web- server provides a web-based interface to a remotely located end-user client machine originating the request; and

wherein the request comprises a request for services from the multi-tenant database system operating within a host organization for the multi-tenant database system.

18. The non-transitory computer readable storage medium of claim 1:

wherein the first portion of the data to be retrieved residing within the non-relational data store comprises a plurality of compressed flat files or a plurality of binary files or a combination of the compressed flat files and the binary files; and

wherein the relational data store comprises a relational database implemented in accordance with a relational database management system (RDBMS), wherein a plurality of relation tables of the relational database are inter-related to each other through one or more overlapping common characteristics for each of two or more relation tables within the relational database.

19. The non-transitory computer readable storage medium of claim 1, wherein the nonrelational data store comprises a distributed structured database having a plurality of underlying hardware storage devices, each providing at least a portion of a total storage capacity for the non-relational data store, and wherein data elements within the non-relational data store are referenceable on the basis of a primary key and not on the basis of one or more overlapping common characteristics between two or more relation tables.

20. The non-transitory computer readable storage medium of claim 1: wherein the relational data store comprises a relational database implementation selected from the group comprising: an Oracle compatible database implementation, an IBM DB2 Enterprise Server compatible relational database implementation, a MySQL compatible relational database implementation, and a Microsoft SQL Server compatible relational database implementation; and

wherein the non-relational data store comprises a NoSQL non-relational database

implementation selected from the group comprising a Vampire compatible nonrelational database implementation, an Apache Cassandra compatible non-relational database implementation, a BigTable compatible non-relational database

implementation, and an HBase compatible non-relational database implementation.

21. The non-transitory computer readable storage medium of claim 1:

wherein the non-relational data store comprises a plurality of distributed computing nodes, each computing node comprising at least a memory, one or more processors, and one or more communicatively interfaced hard disk drives, and wherein each of the distributed computing nodes comprise an isolated non-relational database instance having functionality to read, write, and update non-relational database transactions without authorization or control from a centralized transaction authority; and wherein the relational data store comprises a monolithic relational database instance

comprising memory and processors that coordinate computing resources with a centralized transaction authority that controls whether updates or changes to the monolithic relational database instance are committed to persistent storage upon persistent storage devices communicatively interfaced to, and controlled by, the monolithic relational database instance.

22. A system comprising:

a processor and a memory;

a communications interface to a multi-tenant database system having a relational data store and a non-relational data store implemented therein;

a request processor to receive a request specifying data to be retrieved from the multi-tenant database system;

a customer schema processor to retrieve, based on the request, one or more locations of the data to be retrieved;

a query layer agent to generate a database query based on the request, wherein the database query specifies a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store of the multi-tenant database system and one or more other data elements residing within the relational data store of the multi-tenant database system;

a query optimizer to optimize the database query to generate an optimized database query having an optimized query syntax that is distinct from a query syntax of the database query; and

wherein the query layer agent to execute the optimized database query against the multi- tenant database system to retrieve the data.

23. The system of claim 22, wherein the database query comprises a plurality of sub-queries; and

wherein the system further comprises a sub-query processor to:

analyze each of the plurality of sub-queries based on in-memory cached statistics; identify one of the plurality of sub-queries as a most efficient sub-query based on the in-memory cached statistics, and

generate one or more optimized sub-queries to replace a corresponding one or more of the plurality of sub-queries, wherein one of the optimized sub-queries is identified as the most efficient optimized sub-query and specified as a leading query within the optimized database query.

24. A non-transitory computer readable storage medium having instructions stored thereon that, when executed by processor in a host system, the instructions perform operations comprising:

receiving, at the host organization, a query directed to a search;

determining whether the query complies with one of a plurality of search criteria, each of the plurality of search criteria corresponding to a predefined search index, each search index being drawn from a database and containing identifiers for records of items in the database;

selecting a predefined search index corresponding to one of the plurality of search criteria when the query complies with said search criterion, the index containing entries that comply with the corresponding search criterion;

applying the query to the selected index;

selecting items based on applying the query to the selected index;

returning a result set for the query, the result set including only items of the selected index.

25. The non-transitory computer readable storage medium of claim 24, wherein one of the plurality of search criteria is time, wherein the corresponding index is ordered based on a time sequence from most recent to least recent and wherein the query is applied to the most recent items first based on the ordering of the index.

26. The non-transitory computer readable storage medium of claim 24, wherein one of the plurality of criteria comprises a time range and wherein the corresponding index comprises references to items having a field that complies with the time range.

27. The non-transitory computer readable storage medium of claim 24, wherein one of the plurality of criteria comprises access privileges and wherein the corresponding index comprises references to items with a common access level.

28. The non-transitory computer readable storage medium of claim 24:

wherein the query is received from a user;

wherein the operations further comprise determining characteristics of the user; and wherein determining whether the query complies with one of a plurality of search criteria comprises comparing characteristics of the user to characteristics for the predefined indexes.

29. The non-transitory computer readable storage medium of claim 24, wherein the search criteria are ordered so that a query is applied to each complying predefined index in a specific order.

30. The non-transitory computer readable storage medium of claim 29, wherein the ordering is based on a number of items in each index, wherein a query is applied to an index with fewer items before an index with more items.

31. The non-transitory computer readable storage medium of claim 29, wherein the ordering is based on the size of each index, wherein a query is applied to a shorter index before being applied to a longer index.

32. The non-transitory computer readable storage medium of claim 29, wherein the index is ordered based on a determined value and wherein applying the query further comprises applying the query to find database items based on the ordering of the index.

33. The non-transitory computer readable storage medium of claim 24, further comprising applying the query to further indexes after applying the query to the selected index.

34. The non-transitory computer readable storage medium of claim 24, further comprising selecting a second predefined index of the database corresponding to a second one of the plurality of search criteria and applying the query to the second selected index after applying the query to the first selected index.

35. The non-transitory computer readable storage medium of claim 24, wherein the index comprises references to items in the database and wherein building a report comprises accessing selected items from the database using the references of the index.

36. The non-transitory computer readable storage medium of claim 24, further comprising post-filtering the selected items against the database based on the query.

37. A non-transitory computer readable storage medium having instructions stored thereon that, when executed by processor in a host system, the instructions perform operations comprising:

receiving, at the host organization, a query from a user, the query being directed to a set of search indexes, the search indexes being drawn from a database and containing identifiers for records of items in the database;

determining a user access privilege level for the query, the database having a plurality of different user access privilege levels, the different levels allowing access to different sets of items of the database;

selecting one of a plurality of privilege level search indexes, each privilege level search index corresponding to a user access privilege level of the database, the selected privilege level search index containing only items corresponding to the determined user access privilege level;

applying a search to the selected privilege level search index;

applying the search to further indexes drawn from the database;

returning a result set for the query, the result set including only items that are represented in the selected privilege level search index.

38. The non-transitory computer readable storage medium of claim 37, further comprising determining whether the query complies with one of a plurality of additional search criteria, each of the plurality of search criteria corresponding to a predefined search index, and selecting additional predefined search indexes to which the query is applied.

39. The non-transitory computer readable storage medium of claim 38, wherein the search criteria are ordered so that the results from each search index are in a specific order.

40. An apparatus comprising:

a processor; and

one or more stored sequences of instructions which, when executed by the processor, cause the processor to carry out the steps of

receiving a query directed to a search;

selecting a predefined search index corresponding to one of the plurality of search criteria if the query complies with said search criterion, the index containing entries that comply with the corresponding search criterion;

applying the query to the selected index to find database items referenced in the index;

selecting items based on applying the query to the selected index;