Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the invention. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
In order to solve the technical problems that in the prior art, a database optimization tool provides index and execution plan suggestions through static analysis, the performance of a query under different execution environments cannot be considered when a complex query scene is processed, misjudgment is easy to occur due to the fact that the query mode cannot be dynamically adapted to the change of the query mode, and the query performance of a database is reduced.
Fig. 1 is a database performance optimization method provided by the embodiment of the present application, and referring to fig. 1, the database performance optimization method provided by the present application includes:
s1, acquiring query data, and analyzing the query data to obtain a query mode corresponding to the query data.
Specifically, the invention captures newly added slow query data in the slow query log by monitoring the slow query log (slow. Log) of MySQL in real time. The slow query log is used for recording a slow query, wherein the slow query is a database query with the execution time exceeding a certain preset threshold value, and the threshold value is usually set by a database manager and can be adjusted according to actual requirements. The slow query is recorded through the slow query log, so that the SQL query with poor performance can be identified, and targeted optimization is further carried out.
In one possible embodiment of the application, the slow log file of MySQL may be monitored in real time by using eBPF (extended Berkeley PACKET FILTER) program in combination with the inotify technique to capture the newly added slow query data in the file. The eBPF program can monitor the change of the slow.log file in the kernel space, and can transmit the newly added query data (including but not limited to SQL sentences, execution time, time consumption, index use condition and the like) in the slow.log file to the user space in real time for analysis and storage. inotify is a very efficient file system event monitoring mechanism that can provide real-time file change notification for applications such as databases. In the database, inotify may be used to monitor changes in data files, log files, configuration files.
In a feasible embodiment of the application, query data is acquired, and the query data is analyzed to obtain a query mode corresponding to the query data, which comprises the following specific steps:
S101, acquiring newly added query data in the slow query log based on a preset acquisition rule, and storing the query data into a preset slow query database.
It should be appreciated that the preset acquisition rules may be set according to actual conditions, for example, the change of the slow log file may be periodically monitored at fixed time intervals, the new query data can be obtained, and the change of the slow log file can be monitored in real time, so that the new query data can be obtained. Preferably, the change of the slow.log file can be monitored in a real-time monitoring mode.
As shown in fig. 2, in a possible embodiment of the present application, after the newly added query data in the slow-query database is read, before the query data is stored in the preset slow-query database, the query data may be parsed first to extract the SQL statement and the corresponding performance data thereof, where the performance data includes, but is not limited to, at least one of execution time, time consumption, and whether to use the index, and then the parsed SQL statement and the performance data thereof are stored in the slow-query database, so that the data can be read from the slow-query database at any time to perform the comparison analysis before and after the optimization, without reading the slow-query database during each comparison test.
S102, extracting query data from the slow query database, and analyzing the query data to obtain a query mode corresponding to the query data.
Specifically, as shown in fig. 2, after query data is extracted from the slow query database, the query data is analyzed by SQL mode analysis through an SQL analysis tool. In one possible embodiment of the present application, query data may be parsed by using a parsing tool including, but not limited to sqlparse, antlr, etc., into at least one of features including, but not limited to, JOIN connections, WHERE conditions, etc.
And S2, acquiring historical query modes matched with the query modes from a preset mode library based on the query modes, wherein each historical query mode in the mode library has at least one group of corresponding prompt statement combinations, and each group of prompt statement combinations corresponds to at least one optimization suggestion.
Specifically, the pattern library is used for storing historical query patterns and prompt statement combinations (henceforth referred to as Hint combinations) corresponding to the historical query patterns, so as to support matching and multiplexing of the query patterns, and each group of prompt statement combinations corresponds to at least one optimization suggestion and serves as a reference for optimizing query performance. After new query data arrives and its corresponding query pattern is parsed, the same or similar historical query patterns are matched in a pattern library.
And S3, under the condition that the historical query pattern matched with the query pattern is acquired, taking all prompt statement combinations corresponding to the historical query pattern as effective prompt statement combinations, and under the condition that the historical query pattern matched with the query pattern is not acquired, analyzing the query pattern and generating at least one group of effective prompt statement combinations based on analysis results.
Specifically, as shown in fig. 2, because the history query patterns in the pattern library all have corresponding verified Hint combinations, when the history query patterns matched with the current query pattern exist in the pattern library, the Hint combinations corresponding to the history query patterns are directly multiplexed, and the Hint combinations corresponding to the history query patterns are used as the effective Hint combinations of the current query pattern. Under the condition that a history query mode which is completely matched with the current query mode does not exist in the mode library, analyzing the current query mode to obtain the query characteristic of the current query mode, and generating at least one group of new Hint combinations based on the query characteristic to serve as effective Hint combinations of the current query mode for subsequent testing. It should be appreciated that the present application is not limited to Hint combinations, for example, hint combinations may include, but are not limited to, at least one of indexing, concatenation order, and parallel processing.
In a possible embodiment of the present application, the query pattern is parsed and at least one set of valid hint statement combinations is generated based on the parsing result, as shown in fig. 2, and the specific steps are as follows:
And S301, analyzing the query pattern to obtain an execution plan of the query pattern.
In one possible embodiment of the application, the execution plan of the query pattern may be predicted using EXPLAIN commands of MySQL, or the actual execution path and performance data of the query pattern may be obtained by EXPLAIN ANALYZE commands. The execution plan of the query pattern includes, but is not limited to, at least one of a scanning manner of the query (e.g., full disk scan, index scan), a connection order, and a temporary table creation case.
S302, determining the performance bottleneck of the query mode according to the execution plan.
Specifically, based on the execution plan parsed in step S301, a corresponding performance bottleneck is identified, where the performance bottleneck includes, but is not limited to, at least one of index miss condition, connection order error condition, temporary table ordering, and file ordering.
And S303, generating optimization suggestions corresponding to the query modes based on the performance bottlenecks.
Specifically, based on the performance bottleneck determined in step S303, preliminary optimization suggestions are generated, and the optimization suggestions include, but are not limited to, at least one of suggestions of adding an index, adjusting a connection order, and rewriting a query structure.
S304, generating at least one group of valid prompt statement combination corresponding to the query mode based on the optimization suggestion.
Specifically, based on the optimization suggestion generated in step S303, at least one new set of Hint combinations is generated as valid Hint combinations, such as INDEX () or JOIN_ORDER (), and the generated valid Hint combinations are used as candidates for subsequent testing.
And S4, testing the query modes based on the optimization suggestions corresponding to the effective prompt statement combinations of each group and recording performance data, and selecting at least one group of effective prompt statement combinations as the optimal prompt statement combination of the query modes according to a preset selection rule and the performance data of the query modes corresponding to the effective prompt statement combinations of each group.
Specifically, after each group of effective Hint combinations corresponding to the current query mode are obtained, the current query mode needs to be executed again based on the optimization suggestions of each group of Hint combinations in sequence, the data of the performance of the query mode in the execution process of each group of Hint combinations are recorded respectively, and the test is repeated for a plurality of times until the optimization suggestions corresponding to all the effective Hint combinations are executed.
Because the query operation is executed based on each group of Hint combinations, certain system resources are occupied, if the test is directly executed in the production system, adverse effects on the normal operation of the production system are necessarily generated, and if the consistency of the system environment before the test cannot be ensured, the result deviation caused by the system performance difference cannot be eliminated. Therefore, in a feasible embodiment of the present application, in order to avoid the influence of testing each set of valid Hint combinations on the production system, and simultaneously ensure the consistency of the test environment before the execution of each set of valid Hint combinations, the steps of testing the query mode and recording the performance data based on the optimization suggestions corresponding to each set of valid Hint statement combinations are performed in independent test environments independent of the production system, and through multiple rounds of testing, performance data such as execution time, resource occupation (such as CPU, memory, I/O, etc.), response speed, etc. of executing the query based on the optimization suggestions corresponding to each set of valid Hint combinations can be obtained, and the corresponding Hint combination when the query performance is optimal is selected as the optimal Hint combination based on the preset selection rules. It should be understood that the present application is not limited to the selection rule of the optimal Hint combination, and for example, any one of the following schemes may be adopted to select the optimal Hint combination:
(1) Selecting an effective Hint combination corresponding to the query mode with the shortest execution time as an optimal Hint combination;
(2) Selecting an effective Hint combination corresponding to a query mode with the least occupied resources as an optimal Hint combination;
(3) And respectively giving corresponding weights for the execution time and the resource occupation, calculating the performance scores of the query modes corresponding to each group of effective Hint combinations under the comprehensive consideration of the execution time and the resource occupation, and selecting the effective Hint combination corresponding to the query mode with the highest score as the optimal Hint combination.
Database optimization tools (e.g., SQLAdvisor and Oracle Tuning Advisor) employed in the prior art generally lack closed loop feedback capabilities and cannot continuously optimize query patterns based on historical query results. For example, after a static analysis is performed on a query request and an execution plan suggestion is given, a database optimization tool in the prior art still needs to perform a complete static analysis to give an optimization suggestion even if the query of the same type is encountered again later, and the optimization suggestion result obtained before cannot be directly utilized, or cannot be improved. In one possible embodiment of the present application, after at least one set of valid alert sentence combinations is selected as the optimal alert sentence combinations of the query pattern in step S4, as shown in fig. 2, the method further includes the step of updating the query pattern and the optimal alert sentence combinations corresponding to the query pattern into the pattern library. By storing the optimization results of each time in the pattern library, the existing optimization results in the pattern library can be improved to form a closed loop feedback system, and verified Hint combinations in the pattern library can be preferentially used when similar queries appear in the future, so that unnecessary repeated tests are reduced. Compared with the scheme that the SQL sentence structure is changed to realize optimization in the prior art, and further the complex consistency verification is needed, the method and the device provided by the application have the advantages that the Hint combination is used for guiding the database optimizer to select indexes, connection sequences, connection modes and the like, so that the query is optimized on the premise that the SQL sentence structure is not changed, the complex work of consistency verification on a result set is avoided, and the optimization is simpler and more efficient. In addition, the application guides the database optimizer to optimize by using the Hint combination, can realize that corresponding multi-level Hint strategies such as INDEX control (INDEX/NO_INDEX) is generated according to different query scenes, connection ORDER (JOIN_ORDER) is manually adjusted to the execution ORDER of multi-table connection, batch Key Access (BKA) and multi-range reading (MRR) are used for optimizing I/O performance, ICP is forbidden, wrong filtering condition pushing is prevented, query block naming (QB_NAME) is used for improving debugging capability of complex queries, and the like, so that the optimizing effect is purposefully improved.
Because the database optimization tool adopted in the prior art adopts static analysis to provide index and execution plan suggestions and does not execute actual query operations, the performance before and after query optimization cannot be compared, and thus the performance improvement effect after optimization cannot be intuitively displayed. In one possible embodiment of the present application, after selecting at least one set of valid hint statement combinations as the optimal hint statement combinations of the query pattern in step S4, as shown in fig. 2, there is further included the steps of summarizing and comparing performance data of the query pattern before and after optimization based on the optimal hint statement combinations, and generating a report for showing performance changes before and after optimization based on the comparison result.
Specifically, by comparing the query performance after optimization based on the optimal Hint combination with the query performance before non-optimization and arranging the comparison result into a detailed optimization report, the performance improvement effect can be intuitively displayed, and decision basis can be provided for database administrators and development teams. The optimization report not only comprises the comparison analysis of the execution time and the resource consumption before and after the optimization, but also can be added with the detail suggestion corresponding to the Hint combination, such as recommending to create a new index, so that the new index covers the Hint combination used in the optimization process and the specific effect thereof, and the transparency, easy understanding and multiplexing of the optimization process are ensured. It should be understood that the report in the present application may include not only the query performance after optimization based on the optimal Hint combination and the query performance before non-optimization, but also the query performance after optimization based on each effective Hint combination, so as to intuitively show the influence of multiple Hint combinations on the query performance, and provide more diverse decision bases for database administrators and development teams.
According to the technical scheme, query data in the slow query log are acquired and analyzed, the historical query patterns are matched in the pattern library based on the query patterns obtained through analysis, the Hint combination corresponding to the historical query patterns is used as an effective Hint combination when the matching is successful, and the current query patterns are analyzed and a new effective Hint combination is generated when the matching is unsuccessful. And testing the optimization suggestions corresponding to the effective Hint combinations of each group in an independent testing environment, selecting the effective Hint combination corresponding to the query mode with optimal query performance as the optimal Hint combination of the current query mode, and updating the current query mode and the optimal Hint combination corresponding to the query mode into a mode library to form closed loop feedback. Compared with the problems that the static analysis method adopted for optimizing the query performance of the database in the prior art cannot consider the performance of the query under different execution environments, cannot dynamically adapt to the change of the query mode and the like, the scheme adopted by the application can dynamically provide the most suitable Hint combination according to the query modes of different queries, can cover complex query modes and execution environments, and has wide application range. According to the scheme provided by the application, by constructing the closed-loop feedback system, when a new query is encountered, a historical query mode can be matched in the mode library, verified effective Hint combinations are recommended, unnecessary repeated tests are reduced, an intuitive optimization report can be provided, and intelligent management of database query optimization is realized.
Corresponding to the method embodiment, the application also provides a database performance optimization device, and referring to fig. 3, the database performance optimization device provided by the application comprises a slow log acquisition module 301, a history analysis module 302, a query analysis module 303 and a test module 304.
The slow log collection module 301 is configured to obtain newly added query data in a slow query log, and parse the query data to obtain a query mode corresponding to the query data.
The history analysis module 302 is configured to obtain, based on the query patterns, a history query pattern matching the query pattern from a preset pattern library, where each history query pattern in the pattern library has at least one set of corresponding alert sentence combinations, and each set of alert sentence combinations corresponds to at least one optimization suggestion.
The query analysis module 303 is configured to, when a historical query pattern matching the query pattern is obtained, take all prompt sentence combinations corresponding to the historical query pattern as valid prompt sentence combinations, and when a historical query pattern matching the query pattern is not obtained, parse the query pattern and generate at least one set of valid prompt sentence combinations based on the parsing result.
The test module 304 is configured to test the query patterns based on the optimization suggestions corresponding to the valid alert sentence combinations of each group, record performance data, and select at least one set of valid alert sentence combinations as the optimal alert sentence combinations of the query patterns according to a preset selection rule and the performance data of the query patterns corresponding to the valid alert sentence combinations of each group.
In a possible embodiment of the present application, the test module 304 further includes an update feedback unit, configured to update the query pattern and the optimal hint statement combination corresponding to the query pattern to the pattern library.
In a possible embodiment of the present application, the database performance optimization device further includes a report generating module, configured to summarize and compare performance data of the query pattern before and after optimization based on the optimization suggestions corresponding to the optimal hint statement combination, and generate a report for displaying performance changes before and after optimization based on the comparison result.
As shown in fig. 4, an embodiment of the present application provides an electronic device, which includes a processor 411, a communication interface 412, a memory 413, and a communication bus 414, where the processor 411, the communication interface 412, and the memory 413 implement communication between each other through the communication bus 414, and the memory 413 is used to store a computer program.
In one embodiment of the present application, the processor 411 is configured to implement the database performance optimization method provided in any one of the foregoing method embodiments when executing the program stored in the memory 413, where the method includes:
acquiring query data, and analyzing the query data to obtain a query mode corresponding to the query data;
Based on the query mode, acquiring a history query mode matched with the query mode from a preset mode library, wherein each history query mode in the mode library has at least one group of corresponding prompt statement combinations, and each group of prompt statement combinations corresponds to at least one optimization suggestion;
under the condition that the history query pattern matched with the query pattern is not acquired, analyzing the query pattern and generating at least one group of effective prompt statement combinations based on analysis results;
And respectively testing the query modes based on the optimization suggestions corresponding to the effective prompt statement combinations of each group, recording performance data, and selecting at least one group of effective prompt statement combinations as the optimal prompt statement combination of the query modes according to a preset selection rule and the performance data of the query modes corresponding to the effective prompt statement combinations of each group.
The present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a database performance optimization method as provided in any of the method embodiments described above.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a general purpose hardware platform, or may be implemented by hardware. Based on such understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the related art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the method described in the respective embodiments or some parts of the embodiments.
It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "includes," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless an order of performance is explicitly stated. It should also be appreciated that additional or alternative steps may be used.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.