CN106294380A - The querying method of data base and device - Google Patents
The querying method of data base and device Download PDFInfo
- Publication number
- CN106294380A CN106294380A CN201510254514.4A CN201510254514A CN106294380A CN 106294380 A CN106294380 A CN 106294380A CN 201510254514 A CN201510254514 A CN 201510254514A CN 106294380 A CN106294380 A CN 106294380A
- Authority
- CN
- China
- Prior art keywords
- query
- inquiry
- parameter
- threshold
- data base
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2425—Iterative querying; Query formulation based on the results of a preceding query
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
技术领域technical field
本发明涉及通信领域,具体而言,涉及一种数据库的查询方法及装置。The present invention relates to the communication field, in particular to a database query method and device.
背景技术Background technique
对于关系数据库查询,相关技术中的效率优化办法是采用适当的索引,必要时使用结构化查询语言(Structured Query Language,简称为SQL)注释(如Oraclehint)来指定数据库查询优化方式。对于一些场景,由于数据分布的不均匀、Hint不是最优方式,导致无法达到实际效率的最优状态。For relational database query, the efficiency optimization method in related technologies is to use appropriate index, and use Structured Query Language (Structured Query Language, SQL for short) annotation (such as Oraclehint) to specify the database query optimization method when necessary. For some scenarios, due to uneven data distribution and Hint is not an optimal method, the optimal state of actual efficiency cannot be achieved.
针对相关技术中对于数据库的查询方式比较单一的问题,目前尚未提出有效的解决方案。Aiming at the relatively simple query mode of the database in related technologies, no effective solution has been proposed yet.
发明内容Contents of the invention
本发明的主要目的在于提供一种数据库的查询方法及装置,以至少解决相关技术中对于数据库的查询方式比较单一的问题。The main purpose of the present invention is to provide a database query method and device to at least solve the problem in the related art that the database query method is relatively simple.
根据本发明的一个方面,提供了一种数据库的查询方法,包括:获取用于指示查询数据库的查询场景的多个参数:依据所述多个参数进行综合决策,并按照与决策结果对应的查询类型执行对所述数据库的查询。According to one aspect of the present invention, a database query method is provided, including: acquiring multiple parameters used to indicate query scenarios for querying the database; making a comprehensive decision based on the multiple parameters, and according to the query result corresponding to the decision result Type executes a query against the database.
进一步地,获取用于指示查询数据库的查询场景的多个参数包括:获取第一预设统计时间内查询条件中限制返回的数据量与从所述数据库查询的数据总量之比的第一参数;获取在第二预设统计时间内所述数据库内需要查询的数据索引范围与所述数据库中所有数据的索引范围之比的第二参数;获取在第三预设统计时间内查询条件中的数据库分区与总分区之比的第三参数。Further, obtaining multiple parameters used to indicate the query scene of querying the database includes: obtaining the first parameter of the ratio of the amount of data returned in the query condition within the first preset statistical time to the total amount of data queried from the database ; Obtain the second parameter of the ratio of the index range of the data to be queried in the database to the index range of all data in the database within the second preset statistical time; obtain the query condition in the third preset statistical time The third parameter for the ratio of database partitions to total partitions.
进一步地,依据所述多个参数进行综合决策,并按照与决策结果对应的查询类型执行对所述数据库的查询包括:获取所述第一参数分别与预设的第一阈值和第二阈值进行比较后的第一比较结果,并依据所述第一比较结果确定是否将所述数据库查询SQL拆分为多个子SQL;在确定是否拆分为多个子SQL之后,获取所述第二参数与所述第三参数之间较小的参数,并将所述较小的参数分别与预设的第三阈值和第四阈值进行比较,得到第二比较结果,并依据所述第二比较结果随机选择指定索引查询方式进行查询或全表扫描查询方式进行查询。Further, making a comprehensive decision based on the multiple parameters, and performing a query on the database according to the query type corresponding to the decision result includes: obtaining the first parameter and the preset first threshold and the second threshold respectively. The first comparison result after the comparison, and determine whether to split the database query SQL into a plurality of sub-SQLs according to the first comparison result; after determining whether to split into a plurality of sub-SQLs, obtain the second parameter and the the smaller parameter between the third parameters, and compare the smaller parameter with the preset third threshold and the fourth threshold respectively to obtain a second comparison result, and randomly select according to the second comparison result Specify the index query method to query or the full table scan query method to query.
进一步地,依据所述第一比较结果确定是否将所述数据库查询SQL拆分为多个子SQL包括:在所述第一比较结果小于等于所述第一阈值时,拆分为多个子SQL;或,在所述第一比较结果大于等于所述第二阈值时,不拆分为多个子SQL;或,在所述第一比较结果大于所述第一阈值且小于所述第二阈值时,随机选择是否拆分为多个子SQL;依据所述第二比较结果确定采用指定索引查询或全表扫描查询包括:在所述第二比较结果小于等于所述第三阈值时,采用指定索引查询;或,在所述第二比较结果大于等于第四阈值时,采用全表扫描查询;或,在所述第二比较结果大于所述第三阈值且小于所述第四阈值时,随机选择所述指定索引查询或所述全表扫描查询。Further, determining whether to split the database query SQL into multiple sub-SQLs according to the first comparison result includes: splitting into multiple sub-SQLs when the first comparison result is less than or equal to the first threshold; or , when the first comparison result is greater than or equal to the second threshold, do not split into multiple sub-SQLs; or, when the first comparison result is greater than the first threshold and less than the second threshold, randomly Selecting whether to split into multiple sub-SQLs; determining to use a specified index query or a full table scan query according to the second comparison result includes: when the second comparison result is less than or equal to the third threshold, using a specified index query; or , when the second comparison result is greater than or equal to the fourth threshold, use a full table scan query; or, when the second comparison result is greater than the third threshold and less than the fourth threshold, randomly select the specified index queries or the full table scan queries.
进一步地,在依据所述指定参数与预定阈值的比较结果执行对所述数据库的查询之后,所述方法还包括:保存所述用于指示查询所述数据库数据的所述多个参数,以及采用当前的查询类型查询所述数据库的耗时。Further, after the query of the database is performed according to the comparison result of the specified parameter and a predetermined threshold, the method further includes: saving the plurality of parameters used to indicate querying the database data, and using The current query type takes time to query the database.
进一步地,在保存所述用于指示查询所述数据库数据的所述多个参数,以及采用当前的查询类型查询所述数据库的耗时之后,所述方法还包括:获取本地保存的与查询类型对应的查询耗时,其中,所述查询类型包括:所述全表扫描查询和所述指定索引查询;比较多个所述查询耗时,并按照比较结果调整所述第一阈值、所述第二阈值、所述第三阈值以及所述第四阈值的取值。Further, after saving the multiple parameters for instructing to query the database data, and using the current query type to query the database, the method further includes: acquiring the locally saved query type Corresponding query time consumption, wherein the query type includes: the full table scan query and the specified index query; compare the time consumption of multiple queries, and adjust the first threshold, the second threshold according to the comparison result The values of the second threshold, the third threshold, and the fourth threshold.
根据本发明的另一个方面,提供了一种数据库的查询装置,包括:第一获取模块,用于获取用于指示查询数据库的查询场景的多个参数:查询模块,用于依据所述多个参数进行综合决策,并按照与决策结果对应的查询类型执行对所述数据库的查询。According to another aspect of the present invention, a database query device is provided, including: a first acquisition module, configured to acquire a plurality of parameters used to indicate query scenarios for querying the database; a query module, configured to The parameters are used for comprehensive decision-making, and the query to the database is executed according to the query type corresponding to the decision result.
进一步地,所述第一获取模块包括:第一获取单元,用于获取第一预设统计时间内查询条件中限制返回的数据量与从所述数据库查询的数据总量之比的第一参数;第二获取单元,用于获取在第二预设统计时间内所述数据库内需要查询的数据索引范围与所述数据库中所有数据的索引范围之比的第二参数;第三获取单元,用于获取在第三预设统计时间内查询条件中的数据库分区与总分区之比的第三参数。Further, the first acquisition module includes: a first acquisition unit, configured to acquire the first parameter of the ratio of the amount of data that is limited to be returned in the query condition within the first preset statistical time to the total amount of data queried from the database The second obtaining unit is used to obtain the second parameter of the ratio of the index range of the data to be queried in the database to the index range of all data in the database within the second preset statistical time; the third obtaining unit uses The third parameter is used to obtain the ratio of the database partition to the total partition in the query condition within the third preset statistical time.
进一步地,所述查询模块包括:确定单元,用于获取所述第一参数分别与预设的第一阈值和第二阈值进行比较后的第一比较结果,并依据所述第一比较结果确定是否将所述数据库查询SQL拆分为多个子SQL;查询单元,用于在确定是否拆分为多个子SQL之后,获取所述第二参数与所述第三参数之间较小的参数,并将所述较小的参数分别与预设的第三阈值和第四阈值进行比较,得到第二比较结果,并依据所述第二比较结果随机选择指定索引查询方式进行查询或全表扫描查询方式进行查询。Further, the query module includes: a determining unit, configured to obtain a first comparison result after comparing the first parameter with a preset first threshold and a second threshold, and determine according to the first comparison result Whether to split the database query SQL into multiple sub-SQLs; a query unit, after determining whether to split into multiple sub-SQLs, obtain the smaller parameter between the second parameter and the third parameter, and Comparing the smaller parameter with the preset third threshold and fourth threshold respectively to obtain a second comparison result, and randomly selecting a designated index query method for query or a full table scan query method according to the second comparison result Make an inquiry.
进一步地,所述确定单元,还用于在所述第一比较结果小于等于所述第一阈值时,拆分为多个子SQL;或,在所述第一比较结果大于等于所述第二阈值时,不拆分为多个子SQL;或,在所述第一比较结果大于所述第一阈值且小于所述第二阈值时,随机选择是否拆分为多个子SQL;所述查询单元,还用于在所述第二比较结果小于等于所述第三阈值时,采用指定索引查询;或,在所述第二比较结果大于等于第四阈值时,采用全表扫描查询;或,在所述第二比较结果大于所述第三阈值且小于所述第四阈值时,随机选择所述指定索引查询或所述全表扫描查询。Further, the determining unit is further configured to split into multiple sub-SQLs when the first comparison result is less than or equal to the first threshold; or, when the first comparison result is greater than or equal to the second threshold , not split into multiple sub-SQLs; or, when the first comparison result is greater than the first threshold and smaller than the second threshold, randomly select whether to split into multiple sub-SQLs; the query unit also When the second comparison result is less than or equal to the third threshold, use the specified index query; or, when the second comparison result is greater than or equal to the fourth threshold, use a full table scan query; or, in the When the second comparison result is greater than the third threshold and less than the fourth threshold, randomly select the specified index query or the full table scan query.
通过本发明,采用获取用于指示查询数据库的查询场景的多个参数,进而对该多个参数进行综合决策,并按照决策结果对应的查询类型指定对数据库的查询;可见,在本实施例中是通过获取查询场景的多个参数,并不仅仅是用hint来指定数据库进行查询,通过该多个参数的综合决定能够更好的适应场景采用优化的查询的方式,解决了相关技术中对于数据库的查询方式比较单一的问题,提高了数据库的查询效率。Through the present invention, a plurality of parameters used to indicate the query scene for querying the database are obtained, and then the multiple parameters are comprehensively decided, and the query to the database is specified according to the query type corresponding to the decision result; it can be seen that in this embodiment By obtaining multiple parameters of the query scene, not just using hints to specify the database for query, through the comprehensive decision of the multiple parameters, it can better adapt to the scene and adopt an optimized query method, which solves the problem of database in related technologies. The query method is relatively simple, which improves the query efficiency of the database.
附图说明Description of drawings
此处所说明的附图用来提供对本发明的进一步理解,构成本申请的一部分,本发明的示意性实施例及其说明用于解释本发明,并不构成对本发明的不当限定。在附图中:The accompanying drawings described here are used to provide a further understanding of the present invention and constitute a part of the application. The schematic embodiments of the present invention and their descriptions are used to explain the present invention and do not constitute improper limitations to the present invention. In the attached picture:
图1是根据本发明实施例的数据库的查询方法的流程图;Fig. 1 is the flow chart of the query method of the database according to the embodiment of the present invention;
图2是根据本发明实施例的数据库的查询装置结构框图;Fig. 2 is a structural block diagram of a query device of a database according to an embodiment of the present invention;
图3是根据本发明可选实施例的数据库的查询装置可选结构框图一;3 is an optional structural block diagram 1 of a query device for a database according to an optional embodiment of the present invention;
图4是根据本发明可选实施例的数据库的查询装置可选结构框图二;Fig. 4 is an optional structural block diagram 2 of a query device of a database according to an optional embodiment of the present invention;
图5是根据本发明可选实施例的适应使用场景的数据查询算法的流程图。Fig. 5 is a flowchart of a data query algorithm adapted to usage scenarios according to an optional embodiment of the present invention.
具体实施方式detailed description
需要说明的是,在不冲突的情况下,本申请中的实施例及实施例中的特征可以相互组合。下面将参考附图并结合实施例来详细说明本发明。It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present invention will be described in detail below with reference to the accompanying drawings and examples.
在附图的流程图示出的步骤可以在诸如一组计算机可执行指令的计算机系统中执行,并且,虽然在流程图中示出了逻辑顺序,但是在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤。The steps shown in the flow diagrams of the figures may be implemented in a computer system, such as a set of computer-executable instructions, and, although a logical order is shown in the flow diagrams, in some cases, may be executed differently from this The steps shown or described are performed in the order shown or described.
本实施例提供了一种数据库的查询方法,图1是根据本发明实施例的数据库的查询方法的流程图,如图1所示,该方法的步骤包括:The present embodiment provides a method for querying a database. FIG. 1 is a flowchart of a method for querying a database according to an embodiment of the present invention. As shown in FIG. 1 , the steps of the method include:
步骤S102:获取用于指示查询数据库的查询场景的多个参数:Step S102: Obtain multiple parameters used to indicate the query scenario for querying the database:
步骤S104:依据多个参数进行综合决策,并按照与决策结果对应的查询类型执行对数据库的查询。Step S104: Carry out comprehensive decision-making according to multiple parameters, and perform a query on the database according to the query type corresponding to the decision result.
通过本实施例中的上述步骤S102与步骤S104,采用获取用于指示查询数据库的查询场景的多个参数,进而对该多个参数进行综合决策,并按照决策结果对应的查询类型指定对数据库的查询;可见,在本实施例中是通过获取查询场景的多个参数,并不仅仅是用hint来指定数据库进行查询,通过该多个参数的综合决定能够更好的适应场景采用优化的查询的方式,解决了相关技术中对于数据库的查询方式比较单一的问题,提高了数据库的查询效率。Through the above-mentioned steps S102 and S104 in this embodiment, a plurality of parameters used to indicate the query scene of the query database are obtained, and then the multiple parameters are comprehensively decided, and the query type corresponding to the decision result is specified to the database. Query; it can be seen that in this embodiment, by obtaining multiple parameters of the query scene, not only using hint to specify the database for query, the comprehensive decision of the multiple parameters can better adapt to the scene and adopt optimized query The method solves the problem that the query method of the database is relatively simple in related technologies, and improves the query efficiency of the database.
需要说明的是,本实施例中涉及到的该多个参数可以预设获得相应的值,进而可以在使用过程中可以动态调整的。It should be noted that the multiple parameters involved in this embodiment can be preset to obtain corresponding values, and then can be dynamically adjusted during use.
而对于本实施例中涉及到的获取用于指示查询数据库的查询场景的多个参数的方式,在本实施例的可选实施方式可以通过如下方式来实现:As for the manner of obtaining multiple parameters used to indicate the query scene of the query database involved in this embodiment, an optional implementation manner of this embodiment may be implemented in the following manner:
步骤S11:获取第一预设统计时间内查询条件中限制返回的数据量与从数据库查询的数据总量之比的第一参数;Step S11: Obtain the first parameter that limits the ratio of the amount of returned data to the total amount of data queried from the database in the query condition within the first preset statistical time;
步骤S12:获取在第二预设统计时间内数据库内需要查询的数据索引范围与数据库中所有数据的索引范围之比的第二参数;Step S12: Obtain the second parameter of the ratio of the index range of data to be queried in the database to the index range of all data in the database within the second preset statistical time;
步骤S13:获取在第三预设统计时间内查询条件中的数据库分区与总分区之比的第三参数。Step S13: Obtain a third parameter of the ratio of the database partition to the total partition in the query condition within the third preset statistical time.
可见,通过上述步骤S11至步骤S13可以获取查询场景中的三个不同参数,但该三个参数仅仅是用来进行举例说明,并不构成对本发明的限定;也就是说,可以根据实际情况获取更多的参数,此外,上述第一预设统计时间、第二预设统计时间以及第三预设统计时间在本可选实施例中可以相等也可以不相等。It can be seen that three different parameters in the query scene can be obtained through the above steps S11 to S13, but these three parameters are only used for illustration and do not constitute a limitation of the present invention; that is, they can be obtained according to actual conditions More parameters. In addition, the first preset statistical time, the second preset statistical time, and the third preset statistical time may or may not be equal in this optional embodiment.
在本实施例的另一个可选实施方式中,对于本实施例涉及到的依据多个参数进行综合决策,并按照与决策结果对应的查询类型执行对数据库的查询的方式可以通过如下方式来实现:In another optional implementation of this embodiment, the method of performing comprehensive decision-making based on multiple parameters involved in this embodiment and executing a query to the database according to the query type corresponding to the decision result can be implemented in the following manner :
步骤S21:获取第一参数分别与预设的第一阈值和第二阈值进行比较后的第一比较结果,并依据第一比较结果确定是否将数据库查询SQL拆分为多个子SQL;Step S21: Obtain the first comparison result after comparing the first parameter with the preset first threshold and the second threshold respectively, and determine whether to split the database query SQL into multiple sub-SQLs according to the first comparison result;
步骤S22:在确定是否拆分多个子SQL之后,获取第二参数与第三参数之间较小的参数,并将较小的参数分别与预设的第三阈值和第四阈值进行比较,得到第二比较结果,并依据第二比较结果随机选择指定索引查询方式进行查询或全表扫描查询方式进行查询。Step S22: After determining whether to split multiple sub-SQLs, obtain the smaller parameter between the second parameter and the third parameter, and compare the smaller parameter with the preset third threshold and fourth threshold respectively, to obtain The second comparison result, and according to the second comparison result, randomly select a specified index query method to perform query or a full table scan query method to perform query.
通过上述步骤S21和S22,可知,通过第一参数、第二参数以及第三参数与预定阈值进行比较之后,确定是否要对数据库进行拆分,进而确定对于拆分或未拆分的数据库进行指定索引查询或全表扫描查询;通过该方式进一步细化了在对数据库进行查询时考虑的因素,在综合考虑到查询场景的多个参数之后,能够使得查询更加优化。而对于上述步骤S21和S22中具体如何执行对数据库的查询,可以通过如下可选方式来实现;Through the above steps S21 and S22, it can be seen that after comparing the first parameter, the second parameter and the third parameter with the predetermined threshold, it is determined whether to split the database, and then it is determined whether to specify the split or unsplit database Index query or full table scan query; this method further refines the factors considered when querying the database, and can make the query more optimized after comprehensively considering multiple parameters of the query scene. As for how to specifically execute the query to the database in the above steps S21 and S22, it can be realized through the following optional methods;
对于上述步骤S21中的依据第一比较结果确定是否将数据库拆分为多个子SQL可以通过如下方式来实现:在第一比较结果小于等于第一阈值时,拆分为多个子SQL;或,在第一比较结果大于等于第二阈值时,不拆分为多个子SQL;或,在第一比较结果大于第一阈值且小于第二阈值时,随机选择是否拆分为多个子SQL;Determining whether to split the database into multiple sub-SQLs based on the first comparison result in the above step S21 can be achieved in the following manner: when the first comparison result is less than or equal to the first threshold, split into multiple sub-SQLs; or, When the first comparison result is greater than or equal to the second threshold, do not split into multiple sub-SQLs; or, when the first comparison result is greater than the first threshold and less than the second threshold, randomly select whether to split into multiple sub-SQLs;
而对于上述步骤S22中的依据第二比较结果确定采用指定索引查询或全表扫描查询的方式可以通过如下方式来实现:在第二比较结果小于等于第三阈值时,采用指定索引查询;或,在第二比较结果大于等于第四阈值时,采用全表扫描查询;或,在第二比较结果大于第三阈值且小于第四阈值时,随机选择指定索引查询或全表扫描查询。The determination of using the specified index query or the full table scan query based on the second comparison result in the above step S22 can be achieved in the following manner: when the second comparison result is less than or equal to the third threshold, the specified index query is used; or, When the second comparison result is greater than or equal to the fourth threshold, a full table scan query is used; or, when the second comparison result is greater than the third threshold and less than the fourth threshold, a specified index query or a full table scan query is randomly selected.
可见,在上述步骤S21和S22中给出了一种可选的如何在综合考虑到查询场景的多个参数之后,选取查询方式。It can be seen that, in the above steps S21 and S22, an optional method of how to select a query method after comprehensively considering multiple parameters of the query scene is given.
而在上述步骤S22之后,即在依据指定参数与预定阈值的比较结果执行对数据库的查询之后,本实施例的方法还可以包括:保存用于指示查询数据库数据的多个参数,以及采用当前的查询类型查询数据库的耗时。也就是说,每次执行查询之后都会保存查询采用的参数、预定阈值以及查询的耗时。After the above step S22, that is, after executing the query to the database according to the comparison result of the specified parameter and the predetermined threshold, the method of this embodiment may further include: saving a plurality of parameters used to indicate the query database data, and using the current Query type The time spent in querying the database. That is to say, after each query is executed, the parameters used in the query, the predetermined threshold and the time consumption of the query are saved.
而对于上述保存的数据,在保存用于指示查询数据库数据的多个参数,以及采用当前的查询类型查询数据库的耗时之后,本实施例的方法还可以包括:For the above-mentioned saved data, after saving multiple parameters for instructing to query database data, and using the current query type to query the time-consuming database, the method of this embodiment may also include:
获取本地保存的与查询类型对应的查询耗时,其中,查询类型包括:全表扫描查询和指定索引查询;Obtaining the time-consuming query corresponding to the query type saved locally, where the query type includes: full table scan query and specified index query;
比较多个查询耗时,并按照比较结果调整第一阈值、第二阈值、第三阈值以及第四阈值的取值。Compare the time consumption of multiple queries, and adjust the values of the first threshold, the second threshold, the third threshold, and the fourth threshold according to the comparison results.
也就是说,通过保存的数据,进行比较得出最优或最合适的查询方式为以后的查询提供可靠的依据。That is to say, by comparing the stored data, the optimal or most suitable query method can be obtained to provide a reliable basis for future queries.
在本实施例中还提供了一种数据库的查询装置,该装置用于实现上述实施例及可选实施方式,已经进行过说明的不再赘述。如以下所使用的,术语“模块”“单元”可以实现预定功能的软件和/或硬件的组合。尽管以下实施例所描述的装置较佳地以软件来实现,但是硬件,或者软件和硬件的组合的实现也是可能并被构想的。In this embodiment, a database query device is also provided, and the device is used to implement the above embodiments and optional implementation modes, and what has been described will not be repeated. As used below, the terms "module" and "unit" may be a combination of software and/or hardware that realizes a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementations in hardware, or a combination of software and hardware are also possible and contemplated.
图2是根据本发明实施例的数据库的查询装置结构框图,如图2所示,该装置包括:第一获取模块22,用于获取用于指示查询数据库的查询场景的多个参数:查询模块24,与获取模块22耦合连接,用于依据多个参数进行综合决策,并按照与决策结果对应的查询类型执行对数据库的查询。Fig. 2 is a structural block diagram of a query device of a database according to an embodiment of the present invention. As shown in Fig. 2, the device includes: a first acquisition module 22, which is used to acquire multiple parameters for indicating a query scene of a query database: query module 24. Coupled with the acquisition module 22, it is used to make a comprehensive decision based on multiple parameters, and execute a query to the database according to a query type corresponding to the decision result.
图3是根据本发明可选实施例的数据库的查询装置可选结构框图一,如图3所示,该第一获取模块22包括:第一获取单元32,用于获取第一预设统计时间内查询条件中限制返回的数据量与从数据库查询的数据总量之比的第一参数;第二获取单元34,与第一获取单元32耦合连接,用于获取在第二预设统计时间内数据库内需要查询的数据索引范围与数据库中所有数据的索引范围之比的第二参数;第三获取单元36,与第二获取单元34耦合连接,用于获取在第三预设统计时间内查询条件中的数据库分区与总分区之比的第三参数。Fig. 3 is an optional structural block diagram 1 of the query device of the database according to an optional embodiment of the present invention. As shown in Fig. 3, the first obtaining module 22 includes: a first obtaining unit 32, which is used to obtain the first preset statistical time The first parameter of the ratio of the amount of data returned in the inner query condition to the total amount of data queried from the database; the second acquisition unit 34 is coupled with the first acquisition unit 32 for obtaining the data within the second preset statistical time The second parameter of the ratio of the index range of the data to be queried in the database to the index range of all data in the database; the third acquisition unit 36 is coupled to the second acquisition unit 34 for obtaining the query within the third preset statistical time The third parameter for the ratio of database partitions to total partitions in the condition.
图4是根据本发明可选实施例的数据库的查询装置可选结构框图二,如图4所示,查询模块24包括:确定单元42,用于获取第一参数分别与预设的第一阈值和第二阈值进行比较后的第一比较结果,并依据第一比较结果确定是否将数据库查询SQL拆分为多个子SQL;查询单元44,与确定单元42耦合连接,用于在确定是否拆分多个子SQL之后,获取第二参数与第三参数之间较小的参数,并将较小的参数分别与预设的第三阈值和第四阈值进行比较,得到第二比较结果,并依据第二比较结果随机选择指定索引查询方式进行查询或全表扫描查询方式进行查询。Fig. 4 is an optional structural block diagram 2 of the query device of the database according to an optional embodiment of the present invention. As shown in Fig. 4, the query module 24 includes: a determination unit 42, which is used to obtain the first parameter and the preset first threshold respectively The first comparison result after comparing with the second threshold, and determine whether to split the database query SQL into multiple sub-SQLs according to the first comparison result; the query unit 44 is coupled with the determination unit 42 for determining whether to split After a plurality of sub-SQLs, obtain the smaller parameter between the second parameter and the third parameter, and compare the smaller parameter with the preset third threshold and the fourth threshold respectively to obtain the second comparison result, and based on the first 2. The comparison result randomly selects the specified index query method for query or the full table scan query method for query.
可选地,该确定单元42,还用于在第一比较结果小于等于第一阈值时,拆分为多个子SQL;或,在第一比较结果大于等于第二阈值时,不拆分为多个子SQL;或,在第一比较结果大于第一阈值且小于第二阈值时,随机选择是否拆分为多个子SQL;Optionally, the determining unit 42 is also configured to split into multiple sub-SQLs when the first comparison result is less than or equal to the first threshold; or, when the first comparison result is greater than or equal to the second threshold, not to split into multiple sub-SQLs. sub-SQL; or, when the first comparison result is greater than the first threshold and less than the second threshold, randomly select whether to split into multiple sub-SQLs;
可选地,查询单元44,还用于在第二比较结果小于等于第三阈值时,采用指定索引查询;或,在第二比较结果大于等于第四阈值时,采用全表扫描查询;或,在第二比较结果大于第三阈值且小于第四阈值时,随机选择指定索引查询或全表扫描查询。Optionally, the query unit 44 is further configured to use a specified index query when the second comparison result is less than or equal to the third threshold; or, use a full table scan query when the second comparison result is greater than or equal to the fourth threshold; or, When the second comparison result is greater than the third threshold and less than the fourth threshold, a specified index query or a full table scan query is randomly selected.
可选地,本实施例涉及到的装置还可以包括:在依据指定参数与预定阈值的比较结果执行对数据库的查询之后,该装置还包括:保存模块,与查询模块22耦合连接,保存用于指示查询数据库数据的多个参数,以及采用当前的查询类型查询数据库的耗时。Optionally, the device involved in this embodiment may also include: after executing the query to the database according to the comparison result of the specified parameter and the predetermined threshold, the device further includes: a saving module, coupled with the query module 22, saving the Indicates multiple parameters for querying database data, and the time spent querying the database with the current query type.
可选地,在保存用于指示查询数据库数据的多个参数,以及采用当前的查询类型查询数据库的耗时之后,该装置还可以包括:第二获取模块,与保存模块耦合连接,用于获取本地保存的与查询类型对应的查询耗时,其中,查询类型包括:全表扫描查询和指定索引查询;Optionally, after saving a plurality of parameters used to indicate querying database data, and using the current query type to query the database, the device may further include: a second acquiring module, coupled to the saving module, for acquiring The query time corresponding to the query type saved locally, where the query type includes: full table scan query and specified index query;
比较模块,与第二获取模块耦合连接,用于比较多个查询耗时,并按照比较结果调整第一阈值、第二阈值、第三阈值以及第四阈值的取值。The comparison module is coupled with the second acquisition module, and is used for comparing the time consumption of multiple queries, and adjusting the values of the first threshold, the second threshold, the third threshold and the fourth threshold according to the comparison results.
下面通过本发明的可选实施例对本发明进行举例说明;The present invention is illustrated below by optional embodiments of the present invention;
本可选实施例提供了一种自适应使用场景的数据查询算法,通过本可选实施例能够让数据查询随使用场景的变化自动选取优化的方式。This optional embodiment provides a data query algorithm that adapts to usage scenarios. Through this optional embodiment, data query can automatically select an optimized mode as the usage scenarios change.
本可选实施例采用了根据查询的类型、查询的索引范围、查询的数据表分区大小、查询的数据量估算与查询返回条数限制的参数,综合评定后选取结构化查询语言(Structured Query Language,简称为SQL)的优化方式,SQL优化方式包括按分区索引拆分子SQL语句、hint选择等,并根据每次查询耗用的时间统计来作为下一次选取优化方式的依据。This optional embodiment adopts parameters according to the type of query, the index range of query, the partition size of the data table of query, the estimation of the data volume of query and the limit of the number of query returns, and selects Structured Query Language (Structured Query Language) after comprehensive evaluation , referred to as SQL) optimization method, the SQL optimization method includes splitting sub-SQL statements according to the partition index, hint selection, etc., and the statistics of the time consumed by each query are used as the basis for the next selection of the optimization method.
下面对于本可选实施例中涉及到的查询的类型、查询的索引范围、查询的数据表分区大小、查询的数据量估算与查询返回条数限制的参数进行说明;The type of query involved in this optional embodiment, the index range of the query, the partition size of the data table for the query, the estimation of the data volume of the query, and the parameters for limiting the number of query returns are described below;
对于本可选实施例的中查询的类型一般有实时查询与统计查询,实时查询对时间要求高、而对返回的数据量有限制;统计查询对时间要求低,但往往要求统计全部的数据。因此,对于这两种查询类型:实时查询更偏重按索引扫描,而统计查询更偏向于分区内的全表扫描。The types of query in this optional embodiment generally include real-time query and statistical query. Real-time query has high time requirements but has a limit on the amount of returned data; statistical query has low time requirements but often requires statistics of all data. Therefore, for these two query types: real-time queries are more inclined to scan by index, and statistical queries are more inclined to full table scans within partitions.
对于本可选实施例涉及到的查询的索引范围对hint的选择影响比较大,如果查询的索引范围已经达到分区内这个索引全部数据的1/2以上,而且需要返回满足条件的全部数据时,适合于全表扫描;如果查询的索引范围小于分区内这个索引全部数据的1/5时,适合于按索引扫描。对于索引范围占全部范围的1/5至1/2的场景,其效率受数据库服务器、存储设备硬软件配置的影响比较大,那么采用哪种方式,可以更进一步的采用以往查询耗用的时间统计来作为下一次选取优化方式的依据。The index range of the query involved in this optional embodiment has a relatively large impact on the selection of the hint. If the index range of the query has reached more than 1/2 of all the data in the index in the partition, and it is necessary to return all the data that meets the conditions, It is suitable for full table scanning; if the index range of the query is less than 1/5 of all the data in this index in the partition, it is suitable for scanning by index. For scenarios where the index range accounts for 1/5 to 1/2 of the entire range, its efficiency is greatly affected by the hardware and software configuration of the database server and storage device, so which method can be used to further use the time consumed by previous queries The statistics are used as the basis for selecting the optimization method next time.
此外,需要说明的是对于索引范围难以判断的场景,比如时间索引下的数据缺失、位置索引下的位置变化,适用于采用历史查询耗用的时间统计来作为下一次选取优化方式的依据。In addition, it should be noted that for scenarios where it is difficult to judge the index range, such as data loss under the time index and location changes under the location index, it is suitable to use the time statistics of historical queries as the basis for the next selection of optimization methods.
在本可选实施例中对于拆分子SQL的场景,如果有返回条数限制,那么可以依据数据量估算来拆分为子SQL,具体情况例如查询1个月的数据,查询返回条数限制为10万条,而1个月的数据估算有3000万条,那么可以拆分为30个1天来查询,这样只用查询1天的数据就能满足要求。In this optional embodiment, for the scenario of splitting sub-SQL, if there is a limit on the number of returned items, then it can be split into sub-SQLs based on the estimated data volume. In specific cases, such as querying data for one month, the number of returned items is limited to 100,000 records, and the estimated 30 million records in one month can be divided into 30 one-day queries, so that only one-day data query can meet the requirements.
而对于本实施例中的查询1周的大量数据,按位置汇总,数据表分区为1天,可以拆分为7个1天的子查询,避免1周大数据汇总带来的响应延迟。For the large amount of data that is queried for one week in this embodiment, it is summarized by location, and the data table is partitioned into one day, which can be split into seven subqueries of one day, so as to avoid the response delay caused by the one-week big data summary.
图5是根据本发明可选实施例的适应使用场景的数据查询算法的流程图,如图5所示,基于上述说明,本可选实施例的自适应使用场景的数据查询算法可以通过如下步骤来实现:Fig. 5 is a flow chart of a data query algorithm adapted to a usage scenario according to an optional embodiment of the present invention. As shown in Fig. 5, based on the above description, the data query algorithm of an adaptive usage scenario in this optional embodiment can go through the following steps to realise:
步骤S502:获取查询类型,确定参数A,其中,参数A=1为实时查询,A=0为报表统计查询;Step S502: Obtain the query type and determine the parameter A, wherein, the parameter A=1 is a real-time query, and A=0 is a report statistics query;
步骤S504:判断查询的索引范围,确定参数B,其中,参数B=查询索引范围/所有数据范围,如数据表保存1个月数据,按时间索引,查询1天数据,则B=1/30;Step S504: Determine the index range of the query, and determine the parameter B, where parameter B=query index range/all data range, if the data table saves 1 month of data, indexes by time, and queries 1 day of data, then B=1/30 ;
步骤S506:判断是否分区,及分区大小,确定参数C;其中,如该数据表没有分区,则C=1,如果有分区且按时间分区、分区大小为1周,查询条件中时间范围是1天,则C=1/7;Step S506: Determine whether to partition and the size of the partition, and determine the parameter C; wherein, if the data table has no partition, then C=1, if there is a partition and partitioned by time, the partition size is 1 week, and the time range in the query condition is 1 day, then C=1/7;
步骤S508:判断查询是否有返回条数限制,以及查询条件中时间范围内的数据量估算,确定参数D。比如所查询的数据表1天数据量估算为100万,而查询返回条数限制为10万,则D=1/10;Step S508: Determine whether the query has a limit on the number of returned items, and estimate the amount of data within the time range in the query condition, and determine the parameter D. For example, the 1-day data volume of the queried data table is estimated to be 1 million, and the number of query returns is limited to 100,000, then D=1/10;
步骤S510:按照参数D,当D<=t1时拆分子SQL,当D>=t2时不拆分子SQL,t1<D<t2时随机选择是否拆分子SQL;进而按照value=min(B,C)的值来确定是否采用指定索引的方式:value<=k1时指定索引查询;value>=k2时全表扫描查询;k1<value<k2时,随机采用是否指定索引;Step S510: According to the parameter D, split the sub-SQL when D<=t1, do not split the sub-SQL when D>=t2, and randomly select whether to split the sub-SQL when t1<D<t2; then according to value=min(B, C ) value to determine whether to use the specified index: value<=k1, specify the index query; value>=k2, the full table scan query; k1<value<k2, randomly use whether to specify the index;
需要说明的是value=min(B,C)该公式表示取B和C中较小的值作为value的值;It should be noted that the formula of value=min(B, C) indicates that the smaller value of B and C is taken as the value of value;
步骤S512:保存每次查询使用的参数、t1、t2、k1、k2以及查询耗时;Step S512: Save the parameters used in each query, t1, t2, k1, k2 and query time;
步骤S514:根据保存的耗时数据、根据参数A的不同分开统计,判断t1、t2、k1、k2的取值是否需要调整。Step S514: According to the saved time-consuming data and separate statistics according to the difference of parameter A, determine whether the values of t1, t2, k1, and k2 need to be adjusted.
通过本可选实施例的方式,针对大数据表的数据统计,不用依赖固定的SQL来满足效率优化目的,而是灵活根据实际使用场景采取适当的优化方式,满足业务统计的同时大大提升了查询效率。Through the method of this optional embodiment, for the data statistics of large data tables, it is not necessary to rely on fixed SQL to meet the purpose of efficiency optimization, but to flexibly adopt appropriate optimization methods according to actual usage scenarios, which greatly improves the query while meeting business statistics. efficiency.
以上仅为本发明的可选实施例而已,并不用于限制本发明,对于本领域的技术人员来说,本发明可以有各种更改和变化。凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。The above are only optional embodiments of the present invention, and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (10)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510254514.4A CN106294380B (en) | 2015-05-18 | 2015-05-18 | Database query method and device |
| PCT/CN2015/089194 WO2016183964A1 (en) | 2015-05-18 | 2015-09-08 | Database query method and apparatus |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510254514.4A CN106294380B (en) | 2015-05-18 | 2015-05-18 | Database query method and device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN106294380A true CN106294380A (en) | 2017-01-04 |
| CN106294380B CN106294380B (en) | 2021-02-12 |
Family
ID=57319249
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201510254514.4A Active CN106294380B (en) | 2015-05-18 | 2015-05-18 | Database query method and device |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN106294380B (en) |
| WO (1) | WO2016183964A1 (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113360520A (en) * | 2021-06-30 | 2021-09-07 | 中国农业银行股份有限公司 | Database-based query method, device and equipment |
| CN115905193A (en) * | 2022-10-31 | 2023-04-04 | 东软集团股份有限公司 | Data cleaning method, device, electronic equipment and storage medium |
Citations (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6732085B1 (en) * | 2001-05-31 | 2004-05-04 | Oracle International Corporation | Method and system for sample size determination for database optimizers |
| US7058622B1 (en) * | 2001-12-26 | 2006-06-06 | Tedesco Michael A | Method, apparatus and system for screening database queries prior to submission to a database |
| CN101088082A (en) * | 2004-10-25 | 2007-12-12 | 英孚威尔公司 | Full text query and search systems and methods of use |
| CN102118264A (en) * | 2010-12-20 | 2011-07-06 | 大唐移动通信设备有限公司 | Method and device for generating performance report |
| CN102279849A (en) * | 2010-06-09 | 2011-12-14 | 中兴通讯股份有限公司 | Method and system for big data query |
| US20120005189A1 (en) * | 2010-06-30 | 2012-01-05 | Oracle International Corporation | Techniques for recommending alternative sql execution plans |
| CN103942289A (en) * | 2014-04-12 | 2014-07-23 | 广西师范大学 | Memory caching method oriented to range querying on Hadoop |
| CN103984695A (en) * | 2014-03-21 | 2014-08-13 | 华为技术有限公司 | Method and device for inquiring data in database |
| CN104021145A (en) * | 2014-05-16 | 2014-09-03 | 华为技术有限公司 | Mixed service concurrent access method and device |
| CN104077530A (en) * | 2013-03-27 | 2014-10-01 | 国际商业机器公司 | Method and device used for evaluating safety of data access sentence |
| CN104462434A (en) * | 2014-12-15 | 2015-03-25 | 北京国双科技有限公司 | Data inquiring method and device |
| CN104516981A (en) * | 2014-12-31 | 2015-04-15 | 上海格尔软件股份有限公司 | LDAP anonymous inquiry method not limited by quantity |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8352459B2 (en) * | 2009-07-23 | 2013-01-08 | International Business Machines Corporation | SQL queries for simple network management protocol management information base tables |
-
2015
- 2015-05-18 CN CN201510254514.4A patent/CN106294380B/en active Active
- 2015-09-08 WO PCT/CN2015/089194 patent/WO2016183964A1/en not_active Ceased
Patent Citations (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6732085B1 (en) * | 2001-05-31 | 2004-05-04 | Oracle International Corporation | Method and system for sample size determination for database optimizers |
| US7058622B1 (en) * | 2001-12-26 | 2006-06-06 | Tedesco Michael A | Method, apparatus and system for screening database queries prior to submission to a database |
| CN101088082A (en) * | 2004-10-25 | 2007-12-12 | 英孚威尔公司 | Full text query and search systems and methods of use |
| CN102279849A (en) * | 2010-06-09 | 2011-12-14 | 中兴通讯股份有限公司 | Method and system for big data query |
| US20120005189A1 (en) * | 2010-06-30 | 2012-01-05 | Oracle International Corporation | Techniques for recommending alternative sql execution plans |
| CN102118264A (en) * | 2010-12-20 | 2011-07-06 | 大唐移动通信设备有限公司 | Method and device for generating performance report |
| CN104077530A (en) * | 2013-03-27 | 2014-10-01 | 国际商业机器公司 | Method and device used for evaluating safety of data access sentence |
| CN103984695A (en) * | 2014-03-21 | 2014-08-13 | 华为技术有限公司 | Method and device for inquiring data in database |
| CN103942289A (en) * | 2014-04-12 | 2014-07-23 | 广西师范大学 | Memory caching method oriented to range querying on Hadoop |
| CN104021145A (en) * | 2014-05-16 | 2014-09-03 | 华为技术有限公司 | Mixed service concurrent access method and device |
| CN104462434A (en) * | 2014-12-15 | 2015-03-25 | 北京国双科技有限公司 | Data inquiring method and device |
| CN104516981A (en) * | 2014-12-31 | 2015-04-15 | 上海格尔软件股份有限公司 | LDAP anonymous inquiry method not limited by quantity |
Non-Patent Citations (2)
| Title |
|---|
| CHENG ZENG 等: "Efficient Web Service Composition and Intelligent Search Based on Relational Database", 《2010 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND APPLICATIONS》 * |
| 史英杰 等: "云数据管理系统中查询技术研究综述", 《计算机学报》 * |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113360520A (en) * | 2021-06-30 | 2021-09-07 | 中国农业银行股份有限公司 | Database-based query method, device and equipment |
| CN115905193A (en) * | 2022-10-31 | 2023-04-04 | 东软集团股份有限公司 | Data cleaning method, device, electronic equipment and storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN106294380B (en) | 2021-02-12 |
| WO2016183964A1 (en) | 2016-11-24 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9870390B2 (en) | Selecting from OR-expansion states of a query | |
| US10817514B2 (en) | Splitting of a join operation to allow parallelization | |
| PH12022551525A1 (en) | Method and apparatus for storing and querying time series data, and server and storage medium thereof | |
| CN102163195B (en) | Query optimization method based on unified view of distributed heterogeneous database | |
| US8732163B2 (en) | Query optimization with memory I/O awareness | |
| KR20130101349A (en) | Data management system and method using database middleware | |
| US20140214799A1 (en) | Enhancing Parallelism in Evaluation Ranking/Cumulative Window Functions | |
| US10303654B2 (en) | Hybrid data distribution in a massively parallel processing architecture | |
| CN108874950B (en) | Data distribution storage method and device based on ER relationship | |
| JP2013516008A5 (en) | ||
| US20130159347A1 (en) | Automatic and dynamic design of cache groups | |
| EP3331218B1 (en) | Method and device for leveling load of distributed database | |
| CN106815258A (en) | A kind of date storage method and coordinator node | |
| US9836505B2 (en) | Star and snowflake join query performance | |
| CN104462430A (en) | Relational database data processing method and device | |
| US20150227584A1 (en) | Access plan for a database query | |
| US20130054567A1 (en) | Dynamic reordering of operations in a query plan | |
| US20240168953A1 (en) | Pruning techniques for processing top k queries | |
| US20160224579A1 (en) | Workload aware data placement for join-based query processing in a cluster | |
| CN106294380A (en) | The querying method of data base and device | |
| US8868545B2 (en) | Techniques for optimizing outer joins | |
| CN117708169A (en) | Database query optimization method and device, electronic equipment and storage medium | |
| CN107169138B (en) | A data distribution method for distributed in-memory database query engine | |
| CN104935953A (en) | Network on-demand service providing method and system based on real-time transcoding | |
| CN116610756B (en) | A distributed database adaptive copy selection method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |