WO2005076160A1 - Distributed data warehouse system and architecture to support distributed query execution
- Publication number
- WO2005076160A1 (application PCT/PT2004/000001)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- query
- data
- nodes
- cluster
- tables
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24532—Query optimisation of parallel queries
Definitions
- This invention relates to data warehousing and decision support systems. In particular, the invention relates to a method for distributing a data warehouse over a large number of low-cost computers, assuring near-linear speed-up and scale-up.
- The invention provides the necessary tools to load and distribute the data in the cluster, to execute queries in the system, and to manage the system.
- Data warehouses range from comprehensive enterprise-wide data warehouses to subject- or application-oriented data marts. Whatever the data size, data warehouses must have efficient Online Analytical Processing (OLAP) tools to explore the data and to give users real insight into the data in the warehouse.
- OLAP: Online Analytical Processing.
- A data warehouse is organized according to the multidimensional model.
- Each dimension of this model represents a different perspective for the analysis of the business. For instance, in the classical example of a chain of stores business, some of the dimensions are products, stores, and time.
- Each cell within the multidimensional structure (a cube in this simple three-dimension example) contains data (typically numerical facts) along each of the dimensions. For example, a single cell may contain the total sales for a given product in a given store on a single day.
- The most flexible way to store the data in a data warehouse could be a multidimensional database server.
- Most data warehouses and OLAP applications, however, store the data in a relational database. That is, the multidimensional model is implemented as a star schema formed by a large central fact table surrounded by several dimension tables related to the fact table by foreign keys.
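A minimal sketch of such a star schema, using Python's built-in sqlite3 module. The table and column names (product, store, sales, amount) are illustrative assumptions modelled on the chain-of-stores example; they do not come from the specification.

```python
import sqlite3

def build_star_schema(conn):
    """Create a tiny star schema: one fact table and two dimension tables."""
    conn.executescript("""
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE store   (store_id   INTEGER PRIMARY KEY, city TEXT);
        -- Fact table: one row per product, store and day, holding a numeric fact.
        CREATE TABLE sales (
            product_id INTEGER REFERENCES product(product_id),
            store_id   INTEGER REFERENCES store(store_id),
            day        TEXT,
            amount     REAL
        );
    """)

conn = sqlite3.connect(":memory:")
build_star_schema(conn)
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "pen"), (2, "ink")])
conn.executemany("INSERT INTO store VALUES (?, ?)", [(10, "Lisbon"), (20, "Porto")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    (1, 10, "2004-02-06", 5.0),
    (2, 10, "2004-02-06", 7.5),
    (1, 20, "2004-02-06", 2.5),
])

# Typical OLAP query: join the fact table to a dimension and aggregate.
totals = conn.execute("""
    SELECT s.city, SUM(f.amount)
    FROM sales f JOIN store s ON f.store_id = s.store_id
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()
```

The fact table carries only foreign keys and numeric facts, so it grows with every transaction, while the dimension tables stay comparatively small.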
- One well-known problem of data warehouses is that they tend to be extremely large, which causes heavy storage and performance problems. Data warehouses also tend to grow quite rapidly.
- A scalable architecture is crucial in a warehouse environment, not only to handle very large amounts of data but also to assure interactive response times for OLAP users.
- Typical warehouse queries are very complex and ad hoc in nature and generally access huge volumes of warehouse data and perform many joins and aggregations. Additionally, the decision making process using OLAP is often based on a sequence of interactive queries. That is, the answer of one query immediately sets the need for a second query, and the answer of this second query raises another query, and so on in an ad hoc manner. Thus, efficient query processing is a critical requirement to cope with the usual large amount of data involved and to assure interactive response time.
- FIG. 1: Single Server Architecture.
- Prior-art data warehousing systems comprised a single server (1d) with a storage system (1c), which receives queries sent by the clients (1a) through a connection (1b) to the server.
- The connection was established using a driver, which could be ODBC (Open Database Connectivity) or any other driver that provides access to the database.
- This architecture has evolved over time, with some variations, to increase processing capability.
- SMP: Symmetric Multiprocessing.
- This architecture is also known as Shared-Everything or Shared-Memory.
- MPP: Massively Parallel Processing.
- The MPP architectures (FIG. 2, Shared Disk Architecture) introduced a large server (2a) comprised of several nodes (2b), each one with its own memory, bus, I/O, and optionally its own disks.
- The storage (1c) is separate, and the system is called Shared Disk.
- MPP removed the bus-sharing bottleneck present in SMP, enabling the system to scale to several hundred processors. Because there is no shared memory, the major issue of these architectures is the communication between nodes. The performance and scalability of these systems are directly linked to the amount of data that must be exchanged between nodes through the network (2c).
- The prior-art architectures described up to this point are able to manage high volumes of data. However, they require very expensive hardware and software. Furthermore, they have two additional major problems: a lack of optimal data partitioning methods, and a lack of efficient query execution methods that take advantage of the processing power of the distributed system.
- Range partitioning maps data into partitions based on ranges of the partition keys. It could be used, for example, to partition a table into month partitions, each partition being assigned to one node. It is commonly used in data warehouse environments, with good performance results when the queries access only a small number of partitions, which is rarely the case.
- Hash partitioning maps data into partitions based on a hash function applied to the partition keys. It can be used to distribute the data across the partitions without any semantic meaning. This is typically used to increase I/O performance, because it is possible to read from all partitions at the same time.
- Hash partitioning does not assume any prior knowledge of the most frequent queries, and it cannot take real advantage of local partition indexing, as range partitioning can.
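The two prior-art schemes can be contrasted with a short sketch. The function names, the month key, and the use of CRC32 as the hash function are assumptions made for illustration; the text does not prescribe any of them.

```python
import bisect
import zlib
from collections import defaultdict

def range_partition(rows, key, boundaries):
    """Place each row in the partition whose key range contains row[key].

    `boundaries` is a sorted list of lower bounds for partitions 1..n;
    keys below boundaries[0] go to partition 0.
    """
    parts = defaultdict(list)
    for row in rows:
        parts[bisect.bisect_right(boundaries, row[key])].append(row)
    return dict(parts)

def hash_partition(rows, key, n_parts):
    """Place each row in a partition chosen by a hash of row[key]."""
    parts = defaultdict(list)
    for row in rows:
        # CRC32 is used only because it is deterministic across runs.
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        parts[digest % n_parts].append(row)
    return dict(parts)

rows = [{"month": m, "amount": float(m)} for m in range(1, 13)]
by_range = range_partition(rows, "month", [5, 9])   # months 1-4, 5-8, 9-12
by_hash = hash_partition(rows, "month", 3)

# A query restricted to months 1-3 touches a single range partition,
# so only one node does any work for it.
touched = {bisect.bisect_right([5, 9], r["month"]) for r in rows if r["month"] <= 3}
```

With range partitioning a selective query is served by few nodes, whereas with hash partitioning the matching rows may be on any node because placement carries no semantic meaning.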
- Inter-query parallelism means that several queries are executed at the same time; it is widely used in commercially available database management systems.
- Intra-query parallelism means that each single query is partitioned and executed in parallel by more than one node. Intra-query parallelism is not widely used, and few database management systems incorporate some degree of it.
- The query execution is divided into several steps (e.g. table scan, merge, sort, join), and when possible more than one of these operations is executed in parallel.
- This kind of parallelism was not specifically conceived for distributed systems and does not take full advantage of them, because only a few nodes, and not the entire processing power available, are used for each query.
- Objects of the present invention are, therefore, to provide new and improved data warehousing systems and architectures to support the processing of queries over large and scaling amounts of data, having one or more of the following capabilities, features, characteristics, and/or advantages:
- a scalable architecture comprised of a cluster of computers, possibly personal computers or low cost workstations, able to handle large data warehouses, and able to scale up to accommodate the large growth of data warehouses;
- a data distribution method assuring an optimal load balance:
- all nodes have approximately the same amount of information; each query requires approximately the same amount of data to be processed in each node;
- a query execution method that re-writes each query so that it is executed in parallel by all nodes, with all nodes having about the same amount of local data to process;
- Each node is a complete computer with an instance of a database management system. This way, each node takes advantage of all optimization methods provided by the database for the execution of a query;
- DW-SP: Data Warehouse Star Partitioning.
- A round-robin and probabilistic row-by-row data partitioning method is applied to the DW star schema.
- The fact tables are partitioned across all nodes according to the partitioning methods of the current invention, and the dimension tables are replicated in all nodes.
- The data to be loaded into the data warehouse must pass through a traditional process of extraction, transformation and loading. Once the transformation process has finished and the data is ready to be loaded into the DW, the current invention loads the data according to the partitioning methods, partitioning the fact tables and replicating the dimensions.
- A query re-write and execution method is also applied to the DW.
- The partitioning approaches assure an optimal data load balance across all nodes of the cluster.
- The query re-write and execution method takes advantage of the specific characteristics of star schemas, the typical profile of data warehouse queries, and the optimal data load balance achieved by the data partitioning of the current invention, to guarantee an optimal load balance of query execution and optimal intra-query parallelism.
- The data partitioning approach and the query re-write and execution method take advantage of the specific characteristics of star schemas and the typical profile of data warehouse queries to guarantee an optimal load balance and near-linear speed-up of query execution, and to assure high scalability.
- A key aspect of the method of the current invention is that all queries (or, at least, the vast majority of queries) can be converted into queries that compute partial results, and the global result can be computed very quickly from these partial results.
- The (partial) queries are executed independently in the different computers, which is the necessary condition to achieve optimal load balance and performance speed-up.
- A huge data warehouse can be distributed over an arbitrary number of relatively small computers, guaranteeing a scalable and cost-effective solution, as a large and very expensive server can be replaced by a number of inexpensive computers (i.e., computers with the best cost/performance ratio).
- The system is comprised of an arbitrary number of single servers (possibly personal computers) forming a cluster. It is assumed that all nodes of the cluster and the client computers are connected through a network.
- One or more of the nodes receive queries from the clients (data analysis tools) and assign one node to control the execution of each query. That node requests the re-writing of the query, sends the re-written queries to all cluster nodes, and collects the partial answers. Afterwards it merges the partial results into the final result and returns it to the client as if the request had been made to a single server. In some cases the query can be answered using one node only.
- FIG. 1 - provides a simple diagram of a prior-art single server, with storage and client computer arrangements for user access.
- FIG. 2 - provides a simple diagram of a prior-art shared-disk distributed architecture, with storage, nodes, and client computer arrangements for user access.
- FIG. 3 - illustrates a simplified diagram of a logical representation of an embodiment of the system architecture of the present invention.
- FIG. 4 - illustrates a simplified diagram of an embodiment of the loading and partitioning method.
- FIG. 5 - illustrates a simplified diagram of an embodiment of the software architecture responsible for receiving queries and executing them in the system.
- FIG. 6 - presents a flow chart of an embodiment of a query execution in the system.
- The term 'query', as generated or produced by an application, may be assumed to be any request for any volume of data needed for the commencement, continuation, and/or completion of data accessing and/or data processing activities. It should be understood that the query may be made in a common query language employing, for example, a structured query language (SQL) instruction, or in a custom query format/language.
- SQL: Structured Query Language.
- The term 'client computer' may be any system ranging from a personal computer (PC), which may also be termed a workstation, network computer, thin client, web terminal, etc., to computer devices such as a mini-computer, a laptop, a palmtop, a handheld computer, a super computer, or any other device able to create a query in the formats listed above.
- The term 'client computer' is to be broadly defined.
- The expression 'clients (data analysis tools)' or the like is indicative of one or more software programs that are executing, or possibly in a suspended state but ready to execute, in a client computer as defined above, and that are able to access and process data.
- Such applications may vary from a simple spreadsheet type of program, to a data processing software program, or even a mathematical program performing complex analysis.
- The term 'DW-SP Node' may be assumed to be a personal computer or any other computer with the capacity to execute a database management system.
- The term 'DW-SP Controller Node' may be assumed to be a DW-SP Node which is also executing the DW-SP software.
- The DW-SP software may be assumed to be an embodiment of the present invention responsible for processing queries and returning the results to the clients.
- The term 'cluster' may be assumed to be a set of DW-SP Nodes and DW-SP Controller Nodes connected to one another by any kind of network apparatus.
- The term 'software component' may be assumed to be a software program, or part of a program, with specific functionality that is executed in a client computer (as defined above), in a DW-SP Node (as defined above), or in a DW-SP Controller Node (as defined above).
- The term 'star schema' may be assumed to be a database design model used for data warehouses, comprised of fact tables and dimension tables forming a configuration similar to a star, with a fact table in the centre of the star and the dimension tables around the fact table.
- Several stars can exist in the same star schema.
- A star schema may be assumed to be a database schema comprised only of fact tables and dimension tables, or any variation of the star schema model (e.g. snowflakes, mini-dimensions).
- FIG. 3 shows a simplified diagram that illustrates an embodiment of the loading of data into the data warehouse and of the star schema partitioning method in accordance with the present invention.
- A data staging area 3d contains a database with a star schema model with two dimension tables, 3b and 3c, and one fact table, 3a.
- The data to be loaded into the data warehouse is stored in the tables 3a, 3b, and 3c of the data staging area 3d.
- The data staging area is only illustrative; the data to be loaded can be stored in any database, file, or any other apparatus able to store data.
- A cluster is comprised of three nodes, 3e, 3f, and 3g (at least one node must be a DW-SP Controller Node).
- The cluster nodes 3e, 3f, and 3g are connected through the network connection 3h.
- Each node contains the same star schema model as the data staging area 3d, comprised of two dimension tables and one fact table.
- The data staging area 3d is connected to the nodes 3e, 3f, and 3g through the network connection 3i.
- The data of the dimension table 3b of the data staging area 3d is loaded in full (replicated) to the dimension table 3b of all the nodes 3e, 3f and 3g.
- The data of the dimension table 3c of the data staging area 3d is loaded in full (replicated) to the dimension table 3c of all the nodes 3e, 3f and 3g.
- The data of the fact table 3a is distributed according to the round-robin partitioning method to the tables 3a1, 3a2, and 3a3 of the nodes 3e, 3f, and 3g respectively.
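The loading step described for FIG. 3 can be sketched as follows. This covers only the round-robin variant (the probabilistic variant mentioned elsewhere in the text is not shown), and the function name and the in-memory node layout are hypothetical.

```python
def load_cluster(fact_rows, dimensions, n_nodes):
    """Replicate every dimension table and deal fact rows out round-robin.

    Returns one dict per node: {"fact": [...], "dims": {name: [...]}}.
    """
    nodes = [
        # Each node receives its own full copy of every dimension table.
        {"fact": [], "dims": {name: list(rows) for name, rows in dimensions.items()}}
        for _ in range(n_nodes)
    ]
    for i, row in enumerate(fact_rows):
        # Row i goes to node i mod n, so node sizes differ by at most one row.
        nodes[i % n_nodes]["fact"].append(row)
    return nodes

dimensions = {
    "3b": [(1, "pen"), (2, "ink")],
    "3c": [(10, "Lisbon"), (20, "Porto")],
}
fact_3a = list(range(10))          # stand-in for ten fact rows
cluster = load_cluster(fact_3a, dimensions, 3)
sizes = [len(node["fact"]) for node in cluster]
```

Because rows are assigned without regard to their content, every node ends up with roughly the same share of the fact table, and any join against a dimension can be answered locally.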
- FIG. 4 provides a high-level block diagram of a logical representation of a portion of an embodiment of a distributed data warehouse system in accordance with the present invention. It should be noted that the data distribution is not included in FIG. 4, but was shown in FIG. 3, and that the DW-SP software architecture is not included in FIG. 4 but will be shown in FIG. 5. As can be seen in FIG. 4, a cluster 4d is comprised of two types of nodes: DW-SP Nodes 4a and DW-SP Controller Nodes 4b.
- All DW-SP Nodes 4a and DW-SP Controller Nodes 4b execute a database management system application, hold their portion of the data warehouse data, are connected through a network connection 4c, and share nothing among them.
- The Client Computers 1a execute data analysis tools (not shown) and are connected to the DW-SP Nodes 4a and DW-SP Controller Nodes 4b through a network connection 1b.
- The DW-SP Controller Nodes 4b execute the DW-SP software and are ready to receive queries from the data analysis tools being executed in the Client Computers 1a.
- The data analysis tool generates a query and forwards it through 1b, using a generic client connection driver (not shown), to one of the DW-SP Controller Nodes 4b.
- The forwarding of the query may be via any suitable communication channel or network including, for example, Ethernet LAN, Fast Ethernet LAN, Internet, Wireless, etc.
- The DW-SP Controller Node 4b that has received the query is responsible for the query execution in the cluster 4d. It uses the DW-SP software to analyse and re-write the query, execute the re-written queries in the nodes (DW-SP Nodes 4a and DW-SP Controller Nodes 4b) to obtain the data needed to answer the original query (the partial results), merge the partial results into the final result, and send the final result to the Client Computer 1a through 4d.
- FIG. 5 shows a simplified diagram representing an embodiment of the DW-SP Software architecture.
- The Client Computers 1a are connected to the DW-SP Controller Node 5h; more specifically, they are connected to the DW-SP Software being executed in 5h.
- The Query Receiver 5a, the Query Execution Controller 5b, the Query Re-Writer 5c, the Query Executor 5d, and the Results Merger 5e are all software components of the DW-SP software.
- The cluster is comprised of DW-SP Controller Nodes 4b, DW-SP Nodes 4a, and a DW-SP Controller Node 5h, which is responsible for the execution of queries in the current illustration, all connected through the network connection 5g.
- The Query Receiver 5a is responsible for handling the connections of the data analysis tools (not shown) being executed in the Client Computer 1a to the DW-SP System. It receives the queries generated by the data analysis tools. The communication is performed using a client driver connection and is illustrated by 5i. After receiving the query, the Query Receiver 5a forwards it internally in the DW-SP Software to the Query Execution Controller 5b. The internal communication is represented in FIG. 5 by the dashed arrows.
- The Query Execution Controller 5b is responsible for controlling the process of executing a query in the DW-SP System. It uses the Query Re-Writer 5c, the Query Executor 5d, and the Results Merger 5e to obtain the final result.
- The Query Re-Writer 5c analyses the query to determine whether it needs to be re-written and, when necessary, constructs a group of partial queries to be executed in all cluster nodes to obtain the partial results. For each partial query, the Query Re-Writer 5c constructs a merge query to merge the partial results obtained by executing the partial query in all nodes of the cluster.
- The Query Executor 5d is responsible for executing a given partial query in all system nodes.
- The connection to all nodes 5f is established using a client connection driver (not shown), through the physical network connection 5g.
- The Results Merger 5e is responsible for merging the partial results obtained by the Query Executor 5d and building the final results, according to the merge queries built by the Query Re-Writer 5c, to be sent to the data analysis tools that requested them. To merge the results, the Results Merger 5e needs a connection to all cluster nodes. The connection to all nodes 5j is established using a client connection driver (not shown), through the physical network connection 5g.
- FIG. 6 provides a flow chart of another embodiment of the present invention, concerning the execution of queries. It must be noted that the method for loading and distributing data in the data warehouse is not represented in FIG. 6, as it was presented in FIG. 3.
- A query is generated by data analysis tools being executed in the client computers and passed at 15 through a client connection driver to the DW-SP Software being executed in a DW-SP Controller Node. As discussed, the passing of the query may be realized by employing any suitable communication network.
- The query is received by the DW-SP Software.
- The query is analysed by the Query Re-Writer to verify whether it needs to be re-written to be executed in the DW-SP System. If the query needs to be re-written, in 40 it is re-written into one or more partial queries and one merge query per partial query.
- Each query is partitioned according to the following characteristics: existence of aggregation functions and access to fact tables.
- The queries or sub-queries that access fact tables need to be executed in all nodes.
- The aggregation functions in queries and sub-queries that access fact tables must be decomposed into the basic aggregation functions count, sum, min, and max.
- For each query that needs to be executed in all nodes, a partial query and a merge query must be constructed; the partial query must be executed in all nodes to obtain the partial results, and the merge query must gather the partial results and compute a final result.
- A query that does not access fact tables must be executed in one and only one node.
- The sub-queries that do not access fact tables need neither be partitioned nor executed independently in the system; they are executed together with the query that contains them.
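The re-write rules above can be illustrated with an average, which is not itself one of the four basic functions but can be recovered as sum divided by count. The sketch below is an assumption-laden illustration: three in-memory SQLite databases stand in for the cluster nodes, and the schema, the query text, and the `partial` staging table are hypothetical rather than taken from the specification.

```python
import sqlite3

# Partial query: runs on every node and uses only the basic aggregates.
PARTIAL_QUERY = """
    SELECT s.city, SUM(f.amount), COUNT(f.amount), MIN(f.amount), MAX(f.amount)
    FROM sales f JOIN store s ON f.store_id = s.store_id
    GROUP BY s.city
"""
# Merge query: combines the partial rows; AVG is rebuilt as SUM / COUNT.
MERGE_QUERY = """
    SELECT city, SUM(part_sum) / SUM(part_count), MIN(part_min), MAX(part_max)
    FROM partial GROUP BY city ORDER BY city
"""

def make_node(stores, sales):
    """One node: the replicated dimension plus its slice of the fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT)")
    conn.execute("CREATE TABLE sales (store_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO store VALUES (?, ?)", stores)
    conn.executemany("INSERT INTO sales VALUES (?, ?)", sales)
    return conn

def run_distributed(nodes):
    """Execute the partial query on all nodes, then merge the partial results."""
    merger = sqlite3.connect(":memory:")
    merger.execute(
        "CREATE TABLE partial (city TEXT, part_sum REAL, part_count INTEGER,"
        " part_min REAL, part_max REAL)"
    )
    for node in nodes:
        merger.executemany(
            "INSERT INTO partial VALUES (?, ?, ?, ?, ?)",
            node.execute(PARTIAL_QUERY).fetchall(),
        )
    return merger.execute(MERGE_QUERY).fetchall()

stores = [(10, "Lisbon"), (20, "Porto")]
sales = [(10, 4.0), (10, 8.0), (20, 2.0), (10, 6.0), (20, 10.0), (20, 3.0)]
nodes = [make_node(stores, sales[i::3]) for i in range(3)]  # round-robin slices
result = run_distributed(nodes)
```

Averaging the per-node averages would give a wrong answer whenever groups are unevenly spread across nodes, which is why the partial query ships sums and counts instead.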
- One partial query is executed by the Query Executor in all nodes of the cluster.
- The partial results obtained in 50 are merged by the Results Merger using the appropriate merge query built for this purpose by the Query Re-Writer in 40.
- In 70 it is verified whether there are more partial queries to be executed. If there are, execution goes back to 50; if not, the final result has been obtained and is sent back in 80 to the data analysis tool that requested it.
- If the Query Re-Writer verifies that the query does not require re-writing, then the query is executed in a single node of the system and the results are sent back to the data analysis tools in 80.
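The control flow of FIG. 6 can be summarised in a short sketch. The callables `rewrite`, `run_on_node` and `merge` are hypothetical stand-ins for the Query Re-Writer, Query Executor and Results Merger, and returning one merged result per partial query is an assumption, since the text does not say how several merged results are combined.

```python
def execute_query(query, nodes, rewrite, run_on_node, merge):
    """Controller loop: re-write, run partial queries everywhere, merge."""
    plan = rewrite(query)              # list of (partial_query, merge_query); step 40
    if not plan:
        # No re-write needed: a single node answers the query.
        return run_on_node(nodes[0], query)
    merged = []
    for partial_query, merge_query in plan:          # loop closed by the check in 70
        partials = [run_on_node(node, partial_query) for node in nodes]  # step 50
        merged.append(merge(merge_query, partials))  # merge the results from 50
    return merged                      # sent back to the client in 80

# Toy stand-ins: each node holds a fact slice and a replicated dimension.
toy_nodes = [
    {"fact": [1, 2], "dim": ["a", "b"]},
    {"fact": [3], "dim": ["a", "b"]},
    {"fact": [4, 5], "dim": ["a", "b"]},
]

def toy_rewrite(query):
    # Only the query that touches the fact table is re-written.
    return [("SUM_FACT", "SUM")] if query == "SUM_FACT" else []

def toy_run(node, query):
    return sum(node["fact"]) if query == "SUM_FACT" else len(node["dim"])

def toy_merge(merge_query, partials):
    return sum(partials)               # the only merge query here is "SUM"

fact_answer = execute_query("SUM_FACT", toy_nodes, toy_rewrite, toy_run, toy_merge)
dim_answer = execute_query("COUNT_DIM", toy_nodes, toy_rewrite, toy_run, toy_merge)
```

The dimension-only query never leaves the first node, which matches the rule that queries not touching fact tables run in exactly one node.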
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/PT2004/000001 WO2005076160A1 (fr) | 2004-02-06 | 2004-02-06 | Distributed data warehouse system and architecture to support distributed query execution |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2005076160A1 (fr) | 2005-08-18 |
Family
ID=34836919
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/PT2004/000001 Ceased WO2005076160A1 (fr) | Distributed data warehouse system and architecture to support distributed query execution | 2004-02-06 | 2004-02-06 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2005076160A1 (fr) |
Cited By (24)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2008018969A1 (fr) * | 2006-08-04 | 2008-02-14 | Parallel Computers Technology, Inc. | appareil et procédé d'optimisation de groupage de bases de données avec zéro perte de transaction |
| US7680765B2 (en) | 2006-12-27 | 2010-03-16 | Microsoft Corporation | Iterate-aggregate query parallelization |
| US8046750B2 (en) | 2007-06-13 | 2011-10-25 | Microsoft Corporation | Disco: a simplified distributed computing library |
| US9195710B2 (en) * | 2007-08-07 | 2015-11-24 | International Business Machines Corporation | Query optimization in a parallel computer system to reduce network traffic |
| US7917541B2 (en) | 2008-03-31 | 2011-03-29 | Microsoft Corporation | Collecting and aggregating data using distributed resources |
| US7930294B2 (en) | 2008-08-12 | 2011-04-19 | International Business Machines Corporation | Method for partitioning a query |
| US9396228B2 (en) | 2010-02-22 | 2016-07-19 | Data Accelerator Ltd. | Method of optimizing the interaction between a software application and a database server or other kind of remote data source |
| GB2478189A (en) * | 2010-02-22 | 2011-08-31 | Sean Corbett | Method of optimizing data flow between a client and a database server |
| US8543642B2 (en) | 2010-02-22 | 2013-09-24 | Data Accelerator Limited | Method of optimizing data flow between a software application and a database server |
| WO2012025884A1 (fr) * | 2010-08-23 | 2012-03-01 | Nokia Corporation | Procédé et appareil de traitement de requêtes de recherche pour un index partitionné |
| CN103069421B (zh) * | 2010-08-23 | 2017-02-08 | 诺基亚技术有限公司 | 用于处理针对分区式索引的搜索请求的方法和装置 |
| CN103069421A (zh) * | 2010-08-23 | 2013-04-24 | 诺基亚公司 | 用于处理针对分区式索引的搜索请求的方法和装置 |
| US9229946B2 (en) | 2010-08-23 | 2016-01-05 | Nokia Technologies Oy | Method and apparatus for processing search request for a partitioned index |
| US8700679B2 (en) * | 2012-04-17 | 2014-04-15 | Sap Ag | Classic to in-memory cube conversion |
| CN104903894A (zh) * | 2013-01-07 | 2015-09-09 | 脸谱公司 | 用于分布式数据库查询引擎的系统和方法 |
| US9361344B2 (en) | 2013-01-07 | 2016-06-07 | Facebook, Inc. | System and method for distributed database query engines |
| US9081826B2 (en) | 2013-01-07 | 2015-07-14 | Facebook, Inc. | System and method for distributed database query engines |
| WO2014107359A1 (fr) * | 2013-01-07 | 2014-07-10 | Facebook, Inc. | System and method for distributed database query engines |
| CN104903894B (zh) * | 2013-01-07 | 2018-12-28 | 脸谱公司 | System and method for distributed database query engines |
| US10210221B2 (en) | 2013-01-07 | 2019-02-19 | Facebook, Inc. | System and method for distributed database query engines |
| US10698913B2 (en) | 2013-01-07 | 2020-06-30 | Facebook, Inc. | System and methods for distributed database query engines |
| US11347761B1 (en) | 2013-01-07 | 2022-05-31 | Meta Platforms, Inc. | System and methods for distributed database query engines |
| CN113646767A (zh) * | 2019-03-19 | 2021-11-12 | 西格玛计算机有限公司 | Enabling editable tables on a cloud-based data warehouse |
| US12153577B1 (en) | 2019-03-19 | 2024-11-26 | Sigma Computing, Inc. | Enabling editable tables on a cloud-based data warehouse |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9946780B2 (en) | | Interpreting relational database statements using a virtual multidimensional data model |
| US6505189B1 (en) | | Aggregate join index for relational databases |
| US20060218123A1 (en) | | System and Methodology for Parallel Query Optimization Using Semantic-Based Partitioning |
| Wang et al. | | Supporting a light-weight data management layer over hdf5 |
| JP2003526159A (ja) | | Multidimensional database and integrated aggregation server |
| US20050283459A1 (en) | | Combining multidimensional expressions and data mining extensions to mine OLAP cubes |
| Dehne et al. | | The cgmCUBE project: Optimizing parallel data cube generation for ROLAP |
| Chattopadhyay et al. | | Procella |
| US10977280B2 (en) | | Systems and methods for memory optimization interest-driven business intelligence systems |
| Oussous et al. | | NoSQL databases for big data |
| Hua et al. | | ANTELOPE: A semantic-aware data cube scheme for cloud data center networks |
| WO2005076160A1 (fr) | | Distributed data warehouse system and architecture for supporting distributed query execution |
| Gueidi et al. | | Towards unified modeling for NoSQL solution based on mapping approach |
| Özsu et al. | | NoSQL, NewSQL, and polystores |
| Cuzzocrea et al. | | A rewrite/merge approach for supporting real-time data warehousing via lightweight data integration |
| Wehrle et al. | | A model for distributing and querying a data warehouse on a computing grid |
| Johnson et al. | | Hierarchically split cube forests for decision support: description and tuned design |
| Sergey et al. | | Applying map-reduce paradigm for parallel closed cube computation |
| Schreiner et al. | | A hybrid partitioning strategy for NewSQL databases: the VoltDB case |
| Khedr | | Decomposable algorithm for computing k-nearest neighbours across partitioned data |
| Ranawade et al. | | Online analytical processing on hadoop using apache kylin |
| Lawrence et al. | | The OLAP-enabled grid: Model and query processing algorithms |
| Mouhiha et al. | | NoSQL data warehouse optimizing models: A comparative study of column-oriented approaches |
| Ordonez et al. | | A survey on parallel database systems from a storage perspective: rows versus columns |
| Fiore et al. | | Towards high performance data analytics for climate change |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWW | WIPO information: withdrawn in national office | Country of ref document: DE |
| | 122 | EP: PCT application non-entry in European phase | |