CN116303500A - Data consistency verification method, device, equipment and storage medium - Google Patents
- Publication number
- CN116303500A (application CN202310201564.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- consistency
- verification
- result
- core
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical Field
The invention relates to the technical field of databases, and in particular to a data consistency verification method, device, equipment and storage medium.
Background
As business grows, enterprise systems become increasingly complex. A complex distributed architecture is prone to faults such as failed remote calls and failed message delivery, and these faults can leave data inconsistent between systems. In addition, as the number of users grows over time, the volume of data in the system keeps increasing. To preserve the performance of core functions while retaining historical data, the storage layer has gradually evolved into a hot/cold architecture: the hot store holds the core, frequently accessed data and the cold store holds historical data. This likewise makes it challenging to keep the hot and cold stores consistent.
Redundant-data consistency verification schemes in the prior art generally use an online mode: a dedicated field in the database table stores a checksum of the record, the checksum is updated whenever the record is updated, and consistency is verified through the checksum field. Depending on the characteristics of the business, an offline mode may be used instead, which mainly verifies the quantity of the original data to ensure that the counts agree. Although the offline scheme is decoupled from the business, it is of limited effectiveness and places additional load on the storage-layer database.
Summary of the Invention
The main purpose of the invention is to solve the technical problem that existing offline-mode redundant-data consistency verification schemes are of limited effectiveness and place additional load on the storage-layer database.
A first aspect of the invention provides a data consistency verification method, the method comprising:
obtaining preset data consistency verification parameters and a core query script, wherein the data consistency verification parameters include a core script execution count, a verification period and a number of sampling time periods, and the core query script is used to query data of a source database and a target database;
dividing the verification period according to the number of sampling time periods to obtain sampling time periods, and executing the core query script in each sampling time period according to the core script execution count;
according to a preset stratified sampling algorithm, taking N time points in each sampling time period to sample the data of the source database and the target database queried by the core query script, and performing a data volume consistency verification to obtain a data volume consistency verification result, wherein N is a natural number not less than 1;
if the data volume consistency verification result indicates that the data volumes are consistent, performing a data content consistency verification on the data of the source database and the target database queried by the core query script to obtain a data content consistency verification result;
obtaining a data consistency verification result according to the data content consistency verification result.
A second aspect of the invention provides a data consistency verification device, comprising:
an acquisition module, configured to obtain preset data consistency verification parameters and a core query script, wherein the data consistency verification parameters include a core script execution count, a verification period and a number of sampling time periods, and the core query script is used to query data of a source database and a target database;
a query module, configured to divide the verification period according to the number of sampling time periods to obtain sampling time periods, and to execute the core query script in each sampling time period according to the core script execution count;
a data volume verification module, configured to take, according to a preset stratified sampling algorithm, N time points in each sampling time period to sample the data of the source database and the target database queried by the core query script, and to perform a data volume consistency verification to obtain a data volume consistency verification result, wherein N is a natural number not less than 1;
a data content verification module, configured to perform, when the data volume consistency verification result indicates that the data volumes are consistent, a data content consistency verification on the data of the source database and the target database queried by the core query script to obtain a data content consistency verification result;
a verification result generation module, configured to obtain a data consistency verification result according to the data content consistency verification result.
A third aspect of the invention provides a data consistency verification device, comprising a memory and at least one processor, wherein instructions are stored in the memory and the memory and the at least one processor are interconnected by a line; the at least one processor invokes the instructions in the memory so that the data consistency verification equipment performs the steps of the data consistency verification method described above.
A fourth aspect of the invention provides a computer-readable storage medium in which instructions are stored; when run on a computer, the instructions cause the computer to perform the steps of the data consistency verification method described above.
With the above data consistency verification method, device, equipment and storage medium, the verification period is divided according to the number of sampling time periods to obtain sampling time periods, and the core query script is executed in each sampling time period as many times as the core script execution count. According to the stratified sampling algorithm, N time points are taken in each sampling time period to sample the data of the source database and the target database queried by the core query script, and a data volume consistency verification is performed to obtain a data volume consistency verification result. If that result indicates that the data volumes are consistent, a data content consistency verification is performed on the data of the source database and the target database queried by the core query script to obtain a data content consistency verification result. By using stratified sampling to verify only a specific number of time points in each time period, this approach reduces the load on the database; by verifying periodically, it detects data problems promptly and improves the effectiveness of verification.
Other features and advantages of the invention will be set forth in the description that follows, and in part will be apparent from the description or may be learned by practice of the invention. The objectives and other advantages of the invention are realized and attained by the structures particularly pointed out in the description, the claims and the accompanying drawings.
To make the above objectives, features and advantages of the invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of a first embodiment of the data consistency verification method in the embodiments of the invention;
Fig. 2 is a schematic diagram of one embodiment of the data consistency verification device in the embodiments of the invention;
Fig. 3 is a schematic diagram of another embodiment of the data consistency verification device in the embodiments of the invention;
Fig. 4 is a schematic diagram of one embodiment of the data consistency verification equipment in the embodiments of the invention.
Detailed Description
Embodiments of the invention provide a data consistency verification method, device, equipment and storage medium. The verification period is divided according to the number of sampling time periods to obtain sampling time periods, and the core query script is executed in each sampling time period as many times as the core script execution count. According to the stratified sampling algorithm, N time points are taken in each sampling time period to sample the data of the source database and the target database queried by the core query script, and a data volume consistency verification is performed to obtain a data volume consistency verification result. If that result indicates that the data volumes are consistent, a data content consistency verification is performed on the data of the source database and the target database queried by the core query script to obtain a data content consistency verification result. By using stratified sampling to verify only a specific number of time points in each time period, this approach reduces the load on the database; by verifying periodically, it detects data problems promptly and improves the effectiveness of verification.
The terms "first", "second", "third", "fourth" and the like (if any) in the description, the claims and the above drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that terms so used are interchangeable where appropriate, so that the embodiments described here can be practiced in orders other than those illustrated or described. Furthermore, the terms "comprising" and "having", and any variations of them, are intended to cover a non-exclusive inclusion: a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
For ease of understanding, the specific flow of an embodiment of the invention is described below. Referring to Fig. 1, a first embodiment of the data consistency verification method in the embodiments of the invention includes:
101. Obtain preset data consistency verification parameters and a core query script, wherein the data consistency verification parameters include a core script execution count, a verification period and a number of sampling time periods, and the core query script is used to query data of a source database and a target database;
In this embodiment, an operator configures a data consistency verification task in advance. The task includes the data consistency verification parameters and the core query script set by the operator. When a server, system or data consistency verification device executes the flow of this solution, it obtains the data consistency verification parameters and the core query script from the configured task. The data consistency verification parameters include the addresses of the source database and the target database, the core script execution count, the verification period, the number of sampling time periods, the retry count and so on; they may also include other parameters related to data consistency verification, which the invention does not limit. The core query script is used to query data of the source database and the target database.
In practical applications, the source database and the target database are two databases whose data must be kept consistent. For example, when the storage layer has evolved into a hot/cold architecture in order to preserve the performance of core functions while retaining historical data, with the hot store holding core, frequently accessed data and the cold store holding historical data, the hot store can serve as the source database and the cold store as the target database.
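As a rough illustration of the configured task in step 101, the parameters listed above could be held in a single immutable record. This is only a sketch: the class name, field names, database addresses and query text are all hypothetical, and the patent does not prescribe any particular representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckTaskConfig:
    """Hypothetical container for the data consistency verification task."""
    source_db_url: str      # address of the source (hot) database
    target_db_url: str      # address of the target (cold) database
    core_query_script: str  # query executed against both databases
    script_runs: int        # core script execution count per sampling time period
    check_period_s: int     # verification period, in seconds
    segment_count: int      # number of sampling time periods
    retry_limit: int        # maximum number of delayed retries

    def __post_init__(self):
        # Reject configurations the later steps could not work with.
        if min(self.script_runs, self.check_period_s, self.segment_count) < 1:
            raise ValueError("execution count, period and segment count must be >= 1")
        if self.retry_limit < 0:
            raise ValueError("retry_limit must not be negative")


# Example values only; none of them come from the patent.
config = CheckTaskConfig(
    source_db_url="db://hot.example/orders",
    target_db_url="db://cold.example/orders",
    core_query_script="SELECT id, ts, payload FROM orders WHERE ts >= ? AND ts < ?",
    script_runs=4,
    check_period_s=3600,
    segment_count=5,
    retry_limit=3,
)
```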
102. Divide the verification period according to the number of sampling time periods to obtain sampling time periods, and execute the core query script in each sampling time period according to the core script execution count;
In this embodiment, the verification period is the period of time over which data consistency verification is to be performed. For example, with a verification period of 1 hour, the data consistency between the source database and the target database is verified multiple times within the hour after the server, system or data consistency verification device starts executing the flow. The verification period is divided according to the number of sampling time periods: if the number of sampling time periods is 5 and the verification period is 1 hour, every 12 minutes of the verification period forms one sampling time period. In each sampling time period the core query script is executed to query the data of the source database and the target database, and the number of executions equals the core script execution count preset in the data consistency verification parameters. Periodic verification allows data consistency problems between the source database and the target database to be detected promptly.
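The division in step 102 can be sketched in a few lines, using the 1-hour period and 5 sampling time periods from the example above. The function name `split_period` and the start time are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def split_period(start, period, segment_count):
    """Divide [start, start + period) into equal, contiguous sampling time periods."""
    if segment_count < 1:
        raise ValueError("segment_count must be at least 1")
    step = period / segment_count  # timedelta divided by int gives a timedelta
    return [(start + i * step, start + (i + 1) * step) for i in range(segment_count)]


# A 1-hour verification period split into 5 sampling time periods of 12 minutes.
segments = split_period(datetime(2023, 3, 1, 9, 0), timedelta(hours=1), 5)
for seg_start, seg_end in segments:
    print(seg_start.time(), "to", seg_end.time())
```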
103. According to a preset stratified sampling algorithm, take N time points in each sampling time period to sample the data of the source database and the target database queried by the core query script, and perform a data volume consistency verification to obtain a data volume consistency verification result;
In this embodiment, the data volume consistency verification is performed first. Data content consistency verification requires encoding and an item-by-item comparison of the data content, which consumes considerable resources; when the data volumes differ, the data of the source database and the target database can be judged inconsistent directly. To further reduce the load on the database, the data volume counted for each sampling time period is not compared directly. Instead, stratified sampling takes a preset number of time points from each sampling time period, and the data at those time points in the source database and target database data queried by the core query script are compared. If the data volumes at all sampled time points agree, the next verification step is performed; otherwise a delayed retry is performed.
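The text does not spell out the stratified sampling algorithm, so the following is only one plausible reading: each sampling time period is cut into N equal strata and one random time point is drawn from each. Time is modelled as seconds from the start of the period, rows are represented by their timestamps, and all names are hypothetical.

```python
import random
from collections import Counter


def sample_time_points(seg_start, seg_end, n, rng):
    """Draw one random time point from each of n equal-width strata."""
    width = (seg_end - seg_start) / n
    return [seg_start + (i + rng.random()) * width for i in range(n)]


def volumes_match(source_ts, target_ts, points):
    """Compare row counts per whole second, but only at the sampled time points."""
    src = Counter(int(t) for t in source_ts)
    tgt = Counter(int(t) for t in target_ts)
    return all(src[int(p)] == tgt[int(p)] for p in points)


rng = random.Random(0)                       # fixed seed so the sketch is repeatable
points = sample_time_points(0, 720, 3, rng)  # N = 3 points in a 12-minute period
source_rows = list(range(720))               # one row per second in the source
print(volumes_match(source_rows, source_rows, points))
```

Only the sampled seconds are counted, which is where the saving in database load comes from; a mismatch outside the sampled points goes unnoticed by this step.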
In this embodiment, after the data volume consistency verification result is obtained by taking N time points in each sampling time period according to the preset stratified sampling algorithm, the method further includes: selecting, from the sampling time periods of the verification period, the t sampling time periods with the largest query data volume, based on the volume of source database and target database data queried by executing the core query script the configured number of times in each sampling time period, where t is a natural number not less than 1; according to the stratified sampling algorithm, taking M time points in each of those t sampling time periods to sample the data of the source database and the target database queried by the core query script, where M is a natural number not less than N; and performing a data volume consistency verification on the source database and target database data sampled at the M time points, and updating the data volume consistency verification result a first time.
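Selecting the t sampling time periods with the largest query data volume amounts to a top-t ranking. A minimal sketch, with made-up volumes and a hypothetical function name:

```python
def busiest_segments(volumes, t):
    """Return the indices of the t sampling time periods with the largest volume."""
    if t < 1:
        raise ValueError("t must be at least 1")
    ranked = sorted(range(len(volumes)), key=lambda i: volumes[i], reverse=True)
    return sorted(ranked[:t])  # report the chosen periods in chronological order


# Rows returned by the core query script in each of 5 sampling time periods.
volumes = [1200, 5400, 800, 4300, 950]
N, M = 3, 6  # M must not be smaller than N
recheck = busiest_segments(volumes, t=2)
print(recheck)
```

The periods in `recheck` would then be sampled again at M time points each before the first update of the data volume consistency verification result.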
In this embodiment, if the data volume consistency verification result after the first update indicates that the data volumes of the data sampled at the M time points are consistent, then after the first update the method further includes: obtaining, for each sampling time period, the total query time spent executing the core query script against the source database and the target database; and calculating the average time and the proportional time of each sampling time period according to the total query time of that sampling time period and the core script execution count.
Further, after the average time and proportional time of each sampling time period are calculated, the method further includes: selecting, from the sampling time periods of the verification period, the t sampling time periods with the highest average time and/or proportional time; according to the stratified sampling algorithm, taking W time points in each of those t sampling time periods to sample the data of the source database and the target database queried by the core query script, where W is a natural number not less than N; and performing a data volume consistency verification on the source database and target database data sampled at the W time points, and updating the data volume consistency verification result a second time.
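The text does not define the two timing measures precisely. The sketch below assumes that the average time is the total query time of a sampling time period divided by the core script execution count, and that the proportional time is that period's share of the total query time over the whole verification period. Both definitions, and all names and figures, are assumptions.

```python
def timing_stats(total_times, script_runs):
    """Return (average time per execution, share of overall query time) per period."""
    grand_total = sum(total_times)
    average = [t / script_runs for t in total_times]
    share = [t / grand_total if grand_total else 0.0 for t in total_times]
    return average, share


def slowest_segments(scores, t):
    """Return the indices of the t sampling time periods with the highest score."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:t])


# Total query time in seconds for each of 5 sampling time periods, 4 runs each.
total_times = [2.0, 6.0, 1.0, 8.0, 3.0]
average, share = timing_stats(total_times, script_runs=4)
slowest = slowest_segments(average, t=2)
print(average, slowest)
```

With a fixed execution count per period the two measures rank the periods identically; they would diverge only if the count varied between periods.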
Specifically, in practical applications, verifying the data of all x sampling time periods directly may place a heavy load on the database. For example, with a verification period of 1 hour and 5 sampling time periods, verifying the data volume of the 5 time periods directly means pulling out 12 minutes of data at a time for comparison. That volume can be very large; it increases the load on the database, slows the whole verification flow and may even affect online functions. Therefore, statistics are first collected at a specific number of time points within each time period. When the data volumes at those time points agree, denser and more detailed time points are then selected for verification in the time periods with high data density and high database load, because the higher the data density or the load, the lower the stability and the greater the likelihood of data inconsistency.
In this embodiment, the data consistency verification parameters further include a retry count. After the data volume consistency verification result is obtained by taking N time points in each sampling time period according to the preset stratified sampling algorithm, the method further includes: if the data volume consistency verification result, the result after the first update or the result after the second update indicates that the data volumes are inconsistent, waiting for a preset delay, then re-querying the data of the source database and the target database according to the data consistency verification parameters and the core query script and performing the data volume consistency verification again, until the result indicates that the data volumes are consistent or the number of re-queries reaches the retry count.
Specifically, this embodiment uses a delayed retry mechanism to accommodate scenarios such as sharded collections. Whenever the data volume consistency verification result, the result after the first update or the result after the second update indicates inconsistent data volumes, the delayed retry mechanism is triggered, until the number of delayed retries reaches the configured retry count.
After the data volume consistency verification is repeated until the result indicates that the data volumes are consistent or the number of re-queries reaches the retry count, the method further includes:
if the data volume consistency verification result is still inconsistent after the number of re-queries reaches the retry count, obtaining the data consistency verification result according to the data volume consistency verification result.
Specifically, if the data volume verification result is still inconsistent after repeated queries, the data consistency verification result is generated directly from the data volume verification result. The subsequent data content consistency verification is skipped, which saves computing resources.
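The delayed retry mechanism can be sketched as a small loop around a single check. Here `check_once` stands in for one full re-query plus data volume comparison; it and the other names are hypothetical, and the `sleep` parameter exists only so the delay can be replaced in a demonstration.

```python
import time


def check_with_retry(check_once, retry_limit, delay_s, sleep=time.sleep):
    """Run the check; on a mismatch wait delay_s and re-run, up to retry_limit times.

    Returns (consistent, retries_used).
    """
    consistent = check_once()
    retries = 0
    while not consistent and retries < retry_limit:
        sleep(delay_s)  # give in-flight writes time to reach the target
        retries += 1
        consistent = check_once()
    return consistent, retries


# Simulated outcomes: two mismatches, then the volumes agree on the second retry.
outcomes = iter([False, False, True])
result = check_with_retry(lambda: next(outcomes), retry_limit=3,
                          delay_s=0.0, sleep=lambda s: None)
print(result)
```

If the loop ends with `consistent` still false, the overall result is taken from the data volume verification and the content verification is skipped, as described above.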
104. If the data volume consistency verification result indicates that the data volumes are consistent, perform a data content consistency verification on the data of the source database and the target database queried by the core query script to obtain a data content consistency verification result;
In this embodiment, performing the data content consistency verification on the data of the source database and the target database queried by the core query script to obtain the data content consistency verification result includes: according to the stratified sampling algorithm, sampling from the target database and the source database respectively y items of data queried by the core query script in each sampling time period, where y is a natural number not less than 1; and performing a data content consistency verification on the y items of data sampled from the target database and the source database to obtain the data content consistency verification result.
Further, performing the data content consistency verification on the y items of data sampled from the target database and the source database to obtain the data content consistency verification result includes: performing CRC encoding on the y items of data sampled from the target database and from the source database respectively to obtain the corresponding encoded information; and performing a data content consistency verification on the encoded information corresponding to the source database and the target database to obtain the data content consistency verification result.
Specifically, in practical applications, CRC is the cyclic redundancy check code, a check code commonly used in computer network communication, and CRC encoding is performed with it. The cyclic redundancy check code consists of a series of data encoding rules such as shifting and division, and its principle and program design can be realized in software. CRC encoding computes a check code using modulo-2 division and appends it to the data to be sent to form a new frame. After the frame is sent, the receiver recomputes the check code by the same rule; if the two check codes are identical the data content is consistent, and if they differ the data content is inconsistent.
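A minimal sketch of the CRC comparison, using the CRC-32 function from Python's standard `zlib` module. The patent does not say which CRC variant or which serialization is used, so CRC-32, the field separator and the assumption that both samples arrive in the same order are illustrative choices only.

```python
import zlib


def row_crc(row):
    """CRC-32 of a row, with fields joined by a separator unlikely to occur in data."""
    encoded = "\x1f".join(str(field) for field in row).encode("utf-8")
    return zlib.crc32(encoded)


def contents_match(source_rows, target_rows):
    """Compare two samples row by row through their CRC values."""
    if len(source_rows) != len(target_rows):
        return False
    return all(row_crc(s) == row_crc(t) for s, t in zip(source_rows, target_rows))


source_sample = [(1, "alice", 30), (2, "bob", 25)]
target_sample = [(1, "alice", 30), (2, "bob", 26)]  # last field differs
print(contents_match(source_sample, source_sample))
print(contents_match(source_sample, target_sample))
```

Differing CRC values prove that two rows differ; equal values make equal content very likely but do not strictly guarantee it.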
In this embodiment, if the data content consistency check result is that the code information is consistent, then after performing the data content consistency check on the y records sampled from the target database and the source database and obtaining the data content consistency check result, the method further includes: performing the data content consistency check on the source-database and target-database data sampled at the M time points, and performing a first update of the data content consistency check result; and, if the first-updated result is that the data content sampled at the M time points is consistent, performing the data content consistency check on the source-database and target-database data sampled at the W time points, and performing a second update of the data content consistency check result.
105. Obtain the data consistency verification result from the data content consistency check result.
In this embodiment, if the data consistency verification result is inconsistent, then after obtaining the data consistency verification result from the data content consistency check result, the method further includes: recording the data consistency verification result in a preset log and executing a preset auto-execution script, through which the corresponding fallback compensation script is obtained; obtaining, by means of the fallback compensation script, the sampling periods in which the data consistency verification failed, and determining from those periods the data segments of the target database to be compensated; and obtaining the data corresponding to those segments from the source database and copying it to the target database.
Specifically, when the data consistency verification result shows that the data volume or the data content differs between the source database and the target database, the fallback compensation script compensates the data in the target database that is inconsistent with the source database. The fallback compensation script is an offline data compensation mechanism. Using the fallback compensation script address cs provided by the business side, the time range that needs compensation is passed in through a remote call, and the script copies the data within that range to the target database. The time range is the sampling period in which the compared data were found inconsistent during the data volume consistency check or the data content consistency check; that sampling period determines which data are queried from the source database and copied to the target database, which completes the fallback compensation.
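As a rough sketch of what such a fallback compensation script could do once it has received a time range, the example below copies every record of that range from a source database to a target database. It is an assumption-laden illustration: both databases are in-memory SQLite databases, the table `orders` and its columns `id`, `ts`, and `payload` are hypothetical, and replacing the target rows of the range by delete-then-insert is one possible reading of "copying", not a procedure stated in the specification. The remote call through the address cs is not modeled.

```python
import sqlite3


def make_db(rows):
    """Create a hypothetical in-memory database holding an `orders` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, ts INTEGER, payload TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def compensate(source, target, start_ts, end_ts):
    """Copy the rows with start_ts <= ts < end_ts from source to target.

    Assumption: rows of the target in that range are replaced wholesale,
    so stale, missing, and surplus rows are all repaired in one step.
    """
    rows = source.execute(
        "SELECT id, ts, payload FROM orders WHERE ts >= ? AND ts < ?",
        (start_ts, end_ts),
    ).fetchall()
    with target:  # commits on success, rolls back on error
        target.execute(
            "DELETE FROM orders WHERE ts >= ? AND ts < ?", (start_ts, end_ts)
        )
        target.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return len(rows)


source = make_db([(1, 100, "a"), (2, 150, "b"), (3, 250, "c")])
target = make_db([(1, 100, "a"), (2, 150, "stale"), (3, 250, "c")])

# The sampling period [100, 200) was found inconsistent and is compensated.
copied = compensate(source, target, 100, 200)
```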
Specifically, if the data consistency verification result obtained after the data volume consistency check and the data content consistency check is consistent, the data in the source database and the target database are consistent. In that case, the resources occupied by the two checks are released and the program exits.
In this embodiment, the verification cycle is divided according to the number of sampling periods to obtain the sampling periods, and the core query script is run in each sampling period as many times as the core script execution count specifies. According to the stratified sampling algorithm, the source-database and target-database data queried by the core query script are sampled at N time points in each sampling period and checked for data volume consistency, yielding a data volume consistency check result. If that result is consistent, a data content consistency check is performed on the source-database and target-database data queried by the core query script, yielding a data content consistency check result. Because stratified sampling verifies only a specific number of time points in each period, the load on the databases is reduced, and periodic verification finds data problems promptly, which improves the effectiveness of verification.
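The division of the verification cycle and the per-period sampling summarized above can be sketched as follows. The sketch makes several assumptions that are not in the specification: time is a plain number, the periods have equal width, the N time points are drawn uniformly at random inside each period, and `count_source` and `count_target` are hypothetical callables that return the number of records the core query script yields at a given time point.

```python
import random


def split_cycle(start, end, periods):
    """Divide the verification cycle [start, end) into equal sampling periods."""
    width = (end - start) / periods
    return [(start + i * width, start + (i + 1) * width) for i in range(periods)]


def inconsistent_periods(count_source, count_target, periods, n, rng):
    """Stratified sampling: check N time points in every sampling period.

    Returns the periods in which at least one sampled time point shows
    different data volumes; an empty list means the volumes are consistent.
    """
    bad = []
    for lo, hi in periods:
        points = [rng.uniform(lo, hi) for _ in range(n)]
        if any(count_source(p) != count_target(p) for p in points):
            bad.append((lo, hi))
    return bad


def src_count(t):
    return int(t) * 10


def tgt_count_ok(t):
    return int(t) * 10


def tgt_count_bad(t):
    # Hypothetical fault: the target misses one record from hour 12 onward.
    return int(t) * 10 - 1 if t >= 12 else int(t) * 10


periods = split_cycle(0, 24, 4)  # a 24-hour cycle split into 4 periods
ok = inconsistent_periods(src_count, tgt_count_ok, periods, 3, random.Random(0))
bad = inconsistent_periods(src_count, tgt_count_bad, periods, 3, random.Random(0))
```

Returning the failing periods rather than a single flag keeps the information that a later compensation step would need.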
The data consistency verification method of the embodiments of the present invention has been described above; the data consistency verification apparatus of the embodiments is described below. Referring to FIG. 2, one embodiment of the data consistency verification apparatus includes:
an obtaining module 201, configured to obtain preset data consistency verification parameters and a core query script, where the data consistency verification parameters include a core script execution count, a verification cycle, and a number of sampling periods, and the core query script is used to query data in the source database and the target database;
a query module 202, configured to divide the verification cycle according to the number of sampling periods to obtain sampling periods, and to run the core query script in each sampling period as many times as specified by the core script execution count;
a data volume check module 203, configured to sample, according to a preset stratified sampling algorithm, the source-database and target-database data queried by the core query script at N time points in each sampling period, and to perform a data volume consistency check to obtain a data volume consistency check result, where N is a natural number not less than 1;
a data content check module 204, configured to perform, when the data volume consistency check result is consistent, a data content consistency check on the source-database and target-database data queried by the core query script, to obtain a data content consistency check result;
a verification result generation module 205, configured to obtain a data consistency verification result from the data content consistency check result.
In this embodiment of the present invention, the data consistency verification apparatus runs the data consistency verification method described above. The apparatus divides the verification cycle according to the number of sampling periods to obtain the sampling periods, and runs the core query script in each sampling period as many times as the core script execution count specifies. According to the stratified sampling algorithm, it samples the source-database and target-database data queried by the core query script at N time points in each sampling period and performs a data volume consistency check to obtain a data volume consistency check result. If that result is consistent, it performs a data content consistency check on the source-database and target-database data queried by the core query script to obtain a data content consistency check result. Because stratified sampling verifies only a specific number of time points in each period, the load on the databases is reduced, and periodic verification finds data problems promptly, which improves the effectiveness of verification.
Referring to FIG. 3, a second embodiment of the data consistency verification apparatus of the embodiments of the present invention includes:
an obtaining module 201, configured to obtain preset data consistency verification parameters and a core query script, where the data consistency verification parameters include a core script execution count, a verification cycle, and a number of sampling periods, and the core query script is used to query data in the source database and the target database;
a query module 202, configured to divide the verification cycle according to the number of sampling periods to obtain sampling periods, and to run the core query script in each sampling period as many times as specified by the core script execution count;
a data volume check module 203, configured to sample, according to a preset stratified sampling algorithm, the source-database and target-database data queried by the core query script at N time points in each sampling period, and to perform a data volume consistency check to obtain a data volume consistency check result, where N is a natural number not less than 1;
a data content check module 204, configured to perform, when the data volume consistency check result is consistent, a data content consistency check on the source-database and target-database data queried by the core query script, to obtain a data content consistency check result;
a verification result generation module 205, configured to obtain a data consistency verification result from the data content consistency check result.
In this embodiment, the data consistency verification apparatus further includes a first update module 206, which is specifically configured to:
select, from the sampling periods of the verification cycle, the t sampling periods with the largest query data volume, based on the volume of source-database and target-database data queried in each sampling period by running the core query script the number of times given by the core script execution count, where t is a natural number not less than 1;
sample, according to the stratified sampling algorithm, the source-database and target-database data queried by the core query script at M time points in each of the t sampling periods with the largest query data volume, where M is a natural number not less than N;
perform a data volume consistency check on the source-database and target-database data sampled at the M time points, and perform a first update of the data volume consistency check result.
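The selection performed by the first update module can be illustrated with a short sketch. It assumes that the query data volume of each sampling period has already been collected into a dictionary; the period labels, the volumes, and the function name `busiest_periods` are hypothetical.

```python
import heapq


def busiest_periods(volume_by_period, t):
    """Return the t sampling periods with the largest query data volume."""
    return heapq.nlargest(t, volume_by_period, key=volume_by_period.get)


# Hypothetical query data volumes for four sampling periods of one cycle.
volumes = {"00-06": 120, "06-12": 900, "12-18": 450, "18-24": 700}

N = 3                     # time points already sampled in every period
M = 5                     # denser sampling for busy periods, M >= N
selected = busiest_periods(volumes, t=2)

# Each selected period would then be re-checked at M time points.
resample_plan = {period: M for period in selected}
```

Sampling the busiest periods more densely concentrates the extra verification effort where the most data flows, while the remaining periods keep the lighter N-point check.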
In this embodiment, the data consistency verification apparatus further includes a time calculation module 207, which is specifically configured to:
obtain the total query time spent in each sampling period running the core query script against the source database and the target database;
calculate the average time and the proportional time of each sampling period from the total query time of that period and the core script execution count.
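The two quantities computed by the time calculation module might be derived as in the sketch below. The average time is taken as the total query time of a period divided by the core script execution count. The passage above does not define the proportional time, so the sketch assumes it is the share of a period's total query time in the total query time of the whole verification cycle; that definition, the function name, and the sample figures are assumptions.

```python
def time_stats(total_time_by_period, executions):
    """Compute average and proportional query time per sampling period.

    average      = total query time of the period / core script execution count
    proportional = total query time of the period / total of all periods
                   (assumed definition, not taken from the specification)
    """
    cycle_total = sum(total_time_by_period.values())
    stats = {}
    for period, total in total_time_by_period.items():
        stats[period] = {
            "average": total / executions,
            "proportional": total / cycle_total if cycle_total else 0.0,
        }
    return stats


# Hypothetical total query times in seconds, with 4 executions per period.
totals = {"p1": 2.0, "p2": 6.0}
stats = time_stats(totals, executions=4)
```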
In this embodiment, the data consistency verification apparatus further includes a second update module 208, which is specifically configured to:
select, from the sampling periods of the verification cycle, the t sampling periods with the highest average time and/or proportional time;
sample, according to the stratified sampling algorithm, the source-database and target-database data queried by the core query script at W time points in each of the t sampling periods with the highest average time and/or proportional time, where W is a natural number not less than N;
perform a data volume consistency check on the source-database and target-database data sampled at the W time points, and perform a second update of the data volume consistency check result.
In this embodiment, the data consistency verification parameters further include a retry count; the data consistency verification apparatus further includes a retry module 209, which is specifically configured to:
if the data volume consistency check result, the first-updated data volume consistency check result, or the second-updated data volume consistency check result is inconsistent, wait for a preset delay, re-query the source-database and target-database data according to the data consistency verification parameters and the core query script, and repeat the data volume consistency check until the result is consistent or the number of re-queries reaches the retry count.
In this embodiment, the retry module 209 is further specifically configured to:
if the data volume consistency check result is still inconsistent after the number of re-queries reaches the retry count, obtain the data consistency verification result from the data volume consistency check result.
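The retry behavior of module 209 amounts to a bounded loop with a fixed delay, which could be sketched as follows. The callable `check` stands in for re-querying both databases and repeating the data volume consistency check; it, the function name, and the injectable `sleep` parameter are hypothetical conveniences of this sketch.

```python
import time


def check_with_retry(check, retries, delay_seconds, sleep=time.sleep):
    """Repeat a consistency check until it passes or retries are exhausted.

    Returns True as soon as check() reports consistency, and False if the
    result is still inconsistent after `retries` re-queries.
    """
    if check():
        return True
    for _ in range(retries):
        sleep(delay_seconds)  # wait for the preset delay before re-querying
        if check():
            return True
    return False


def scripted(results):
    """Build a hypothetical check that replays a fixed list of outcomes."""
    it = iter(results)
    return lambda: next(it)


def no_wait(seconds):
    """Stand-in for time.sleep so the example runs instantly."""


# Replication catches up on the second re-query: the check finally passes.
recovered = check_with_retry(scripted([False, False, True]), 3, 1, sleep=no_wait)

# Only one retry allowed: the inconsistency is reported as the final result.
gave_up = check_with_retry(scripted([False, False, True]), 1, 1, sleep=no_wait)
```

Delaying before each re-query gives asynchronous replication time to catch up, so a transient lag is not reported as an inconsistency.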
In this embodiment, the data content check module 204 specifically includes:
a data sampling unit 2041, configured to sample, according to the stratified sampling algorithm, y records queried by the core query script in each sampling period from the target database and from the source database, where y is a natural number not less than 1;
a content consistency check unit 2042, configured to perform the data content consistency check on the y records sampled from the target database and the source database, to obtain the data content consistency check result.
In this embodiment, the content consistency check unit 2042 is specifically configured to:
CRC-encode the y records sampled from the target database and from the source database, respectively, to obtain the corresponding code information;
perform the data content consistency check on the code information corresponding to the source database and the target database, to obtain the data content consistency check result.
In this embodiment, the data consistency verification apparatus further includes a third update module 210, which is specifically configured to:
perform the data content consistency check on the source-database and target-database data sampled at the M time points, and perform a first update of the data content consistency check result;
if the first-updated data content consistency check result is that the data content sampled at the M time points is consistent, perform the data content consistency check on the source-database and target-database data sampled at the W time points, and perform a second update of the data content consistency check result.
In this embodiment, the data consistency verification apparatus further includes a fallback compensation module 211, which is specifically configured to:
record the data consistency verification result in a preset log and execute a preset auto-execution script, through which the corresponding fallback compensation script is obtained;
obtain, by means of the fallback compensation script, the sampling periods in which the data consistency verification failed, and determine from those periods the data segments of the target database to be compensated;
obtain the data corresponding to the data segments to be compensated from the source database and copy it to the target database.
Building on the previous embodiment, this embodiment describes in detail the specific functions of each module and the unit composition of some modules. Through these modules and units, the verification cycle is divided according to the number of sampling periods to obtain the sampling periods, and the core query script is run in each sampling period as many times as the core script execution count specifies. According to the stratified sampling algorithm, the source-database and target-database data queried by the core query script are sampled at N time points in each sampling period and checked for data volume consistency, yielding a data volume consistency check result. If that result is consistent, a data content consistency check is performed on the source-database and target-database data queried by the core query script, yielding a data content consistency check result. Because stratified sampling verifies only a specific number of time points in each period, the load on the databases is reduced, and periodic verification finds data problems promptly, which improves the effectiveness of verification.
FIGS. 2 and 3 above describe the data consistency verification apparatus of the embodiments of the present invention in detail from the perspective of modular functional entities; the data consistency verification device of the embodiments is described in detail below from the perspective of hardware processing.
FIG. 4 is a schematic structural diagram of a data consistency verification device provided by an embodiment of the present invention. The data consistency verification device 400 may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPUs) 410, a memory 420, and one or more storage media 430 (for example, one or more mass storage devices) storing application programs 433 or data 432. The memory 420 and the storage medium 430 may provide transient or persistent storage. A program stored on the storage medium 430 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations for the data consistency verification device 400. Further, the processor 410 may be configured to communicate with the storage medium 430 and to execute, on the data consistency verification device 400, the series of instruction operations in the storage medium 430, so as to implement the following steps:
obtaining preset data consistency verification parameters and a core query script, where the data consistency verification parameters include a core script execution count, a verification cycle, and a number of sampling periods, and the core query script is used to query data in the source database and the target database;
dividing the verification cycle according to the number of sampling periods to obtain sampling periods, and running the core query script in each sampling period as many times as specified by the core script execution count;
sampling, according to a preset stratified sampling algorithm, the source-database and target-database data queried by the core query script at N time points in each sampling period, and performing a data volume consistency check to obtain a data volume consistency check result, where N is a natural number not less than 1;
if the data volume consistency check result is consistent, performing a data content consistency check on the source-database and target-database data queried by the core query script, to obtain a data content consistency check result;
obtaining a data consistency verification result from the data content consistency check result.
Optionally, after sampling, according to the preset stratified sampling algorithm, the source-database and target-database data queried by the core query script at N time points in each sampling period and performing the data volume consistency check to obtain the data volume consistency check result, the steps further include:
selecting, from the sampling periods of the verification cycle, the t sampling periods with the largest query data volume, based on the volume of source-database and target-database data queried in each sampling period by running the core query script the number of times given by the core script execution count, where t is a natural number not less than 1;
sampling, according to the stratified sampling algorithm, the source-database and target-database data queried by the core query script at M time points in each of the t sampling periods with the largest query data volume, where M is a natural number not less than N;
performing a data volume consistency check on the source-database and target-database data sampled at the M time points, and performing a first update of the data volume consistency check result.
Optionally, if the first-updated data volume consistency check result is that the data volumes sampled at the M time points are consistent, then after the first update of the data volume consistency check result, the steps further include:
obtaining the total query time spent in each sampling period running the core query script against the source database and the target database;
calculating the average time and the proportional time of each sampling period from the total query time of that period and the core script execution count.
Optionally, after calculating the average time and the proportional time of each sampling period from the total query time of that period and the core script execution count, the steps further include:
selecting, from the sampling periods of the verification cycle, the t sampling periods with the highest average time and/or proportional time;
sampling, according to the stratified sampling algorithm, the source-database and target-database data queried by the core query script at W time points in each of the t sampling periods with the highest average time and/or proportional time, where W is a natural number not less than N;
performing a data volume consistency check on the source-database and target-database data sampled at the W time points, and performing a second update of the data volume consistency check result.
Optionally, the data consistency verification parameters further include a retry count;
after sampling, according to the preset stratified sampling algorithm, the source-database and target-database data queried by the core query script at N time points in each sampling period and performing the data volume consistency check to obtain the data volume consistency check result, the steps further include:
if the data volume consistency check result, the first-updated data volume consistency check result, or the second-updated data volume consistency check result is inconsistent, waiting for a preset delay, re-querying the source-database and target-database data according to the data consistency verification parameters and the core query script, and repeating the data volume consistency check until the result is consistent or the number of re-queries reaches the retry count.
Optionally, after repeating the data volume consistency check until the result is consistent or the number of re-queries reaches the retry count, the steps further include:
if the data volume consistency check result is still inconsistent after the number of re-queries reaches the retry count, obtaining the data consistency verification result from the data volume consistency check result.
Optionally, performing the data content consistency check on the source-database and target-database data queried by the core query script, to obtain the data content consistency check result, includes:
sampling, according to the stratified sampling algorithm, y records queried by the core query script in each sampling period from the target database and from the source database, where y is a natural number not less than 1;
performing the data content consistency check on the y records sampled from the target database and the source database, to obtain the data content consistency check result.
Optionally, performing the data content consistency check on the y records sampled from the target database and the source database, to obtain the data content consistency check result, includes:
CRC-encoding the y records sampled from the target database and from the source database, respectively, to obtain the corresponding code information;
performing the data content consistency check on the code information corresponding to the source database and the target database, to obtain the data content consistency check result.
Optionally, if the data content consistency check result is that the code information is consistent, then after performing the data content consistency check on the y records sampled from the target database and the source database and obtaining the data content consistency check result, the steps further include:
performing the data content consistency check on the source-database and target-database data sampled at the M time points, and performing a first update of the data content consistency check result;
if the first-updated data content consistency check result is that the data content sampled at the M time points is consistent, performing the data content consistency check on the source-database and target-database data sampled at the W time points, and performing a second update of the data content consistency check result.
Optionally, if the data consistency verification result is inconsistent, then after obtaining the data consistency verification result from the data content consistency check result, the steps further include:
recording the data consistency verification result in a preset log and executing a preset auto-execution script, through which the corresponding fallback compensation script is obtained;
obtaining, by means of the fallback compensation script, the sampling periods in which the data consistency verification failed, and determining from those periods the data segments of the target database to be compensated;
obtaining the data corresponding to the data segments to be compensated from the source database and copying it to the target database.
The data consistency verification device 400 may further include one or more power supplies 440, one or more wired or wireless network interfaces 450, one or more input/output interfaces 460, and/or one or more operating systems 431, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD. Those skilled in the art will understand that the device structure shown in FIG. 4 does not limit the data consistency verification device provided by the present invention, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The present invention further provides a computer-readable storage medium, which may be a non-volatile or a volatile computer-readable storage medium. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the following steps:
获取预设的数据一致性校验参数和核心查询脚本,其中,所述数据一致性检验参数包括核心脚本执行次数、校验周期和采样时间段个数,所述核心查询脚本用于查询源数据库和目标数据库的数据;Obtain preset data consistency verification parameters and core query scripts, wherein the data consistency verification parameters include the number of core script execution times, verification cycles, and sampling time periods, and the core query scripts are used to query the source database and target database data;
根据所述采样时间段个数划分所述校验周期,得到采样时间段,并在所述采样时间段中根据所述核心脚本执行次数查询所述核心查询脚本;Dividing the verification period according to the number of sampling time periods to obtain a sampling time period, and querying the core query script according to the execution times of the core script in the sampling time period;
根据预设的分层采样算法,在每个采样时间段中取N个时间点采样所述核心查询脚本查询的源数据库和目标数据库的数据,并进行数据量一致性校验,得到数据量一致性校验结果,其中,所述N为不小于1的自然数;According to the preset hierarchical sampling algorithm, take N time points in each sampling time period to sample the data of the source database and the target database queried by the core query script, and perform data volume consistency verification to obtain consistent data volume A sex check result, wherein said N is a natural number not less than 1;
若所述数据量一致性校验结果为数据量一致,则根据所述核心查询脚本查询的所述源数据库和目标数据库的数据进行数据内容一致性校验,得到数据内容一致性校验结果;If the data volume consistency verification result is that the data volume is consistent, then perform data content consistency verification according to the data of the source database and the target database queried by the core query script, and obtain a data content consistency verification result;
根据所述数据内容一致性校验结果得到数据一致性校验结果。A data consistency verification result is obtained according to the data content consistency verification result.
Optionally, after taking, according to the preset stratified sampling algorithm, N time points in each sampling time period to sample the data of the source database and the target database queried by the core query script and performing the data volume consistency check to obtain the data volume consistency check result, the method further includes:
Selecting, from the sampling time periods of the verification period, the t sampling time periods with the largest query data volume, based on the volume of data that the core query script, run the core script execution count of times in each sampling time period, queries from the source database and the target database, where t is a natural number not less than 1;
Taking, according to the stratified sampling algorithm, M time points in each of the t sampling time periods with the largest query data volume to sample the data of the source database and the target database queried by the core query script, where M is a natural number not less than N;
Performing a data volume consistency check on the data of the source database and the target database sampled at the M time points, and performing a first update of the data volume consistency check result.
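Selecting the t busiest sampling time periods and assigning them the denser sample count M can be sketched as below. The names `top_t_segments` and `resample_plan` and the dictionary keyed by a period label are hypothetical; only the constraint that M is not less than N comes from the text.

```python
import heapq

def top_t_segments(volume_by_segment, t):
    """Return the t sampling time periods with the largest query data volume."""
    return heapq.nlargest(t, volume_by_segment, key=volume_by_segment.get)

def resample_plan(volume_by_segment, t, n, m):
    """Map each of the t busiest periods to M, the number of time points to resample."""
    if m < n:
        raise ValueError("M must not be smaller than N")
    return {seg: m for seg in top_t_segments(volume_by_segment, t)}
```

For example, with query volumes of 120, 900, 450 and 300 rows in four periods and t = 2, the second and third periods are resampled at M time points each.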
Optionally, if the data volume consistency check result after the first update indicates that the data volumes of the data sampled at the M time points are consistent, then after the first update of the data volume consistency check result, the method further includes:
Obtaining, for each sampling time period, the total query time spent running the core query script against the data of the source database and the target database;
Calculating the average time cost and the proportional time cost of each sampling time period according to the total query time and the core script execution count of that sampling time period.
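A minimal sketch of the two metrics follows. The text does not define them in this passage, so the formulas are assumptions: average time cost is taken as total query time divided by the execution count of the period, and proportional time cost as the period's share of the total query time over the whole verification period. The function name `time_costs` is hypothetical.

```python
def time_costs(total_time, exec_count):
    """Return {period: (average_cost, proportional_cost)}.

    average_cost      = total query time / core script execution count (assumed)
    proportional_cost = total query time / sum of total query times   (assumed)
    """
    grand_total = sum(total_time.values())
    costs = {}
    for seg, spent in total_time.items():
        runs = exec_count.get(seg, 0)
        average = spent / runs if runs else 0.0
        share = spent / grand_total if grand_total else 0.0
        costs[seg] = (average, share)
    return costs
```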
Optionally, after calculating the average time cost and the proportional time cost of each sampling time period according to the total query time and the core script execution count of that sampling time period, the method further includes:
Selecting, from the sampling time periods of the verification period, the t sampling time periods with the highest average time cost and/or proportional time cost;
Taking, according to the stratified sampling algorithm, W time points in each of the t sampling time periods with the highest average time cost and/or proportional time cost to sample the data of the source database and the target database queried by the core query script, where W is a natural number not less than N;
Performing a data volume consistency check on the data of the source database and the target database sampled at the W time points, and performing a second update of the data volume consistency check result.
Optionally, the data consistency check parameters further include a retry count;
After taking, according to the preset stratified sampling algorithm, N time points in each sampling time period to sample the data of the source database and the target database queried by the core query script and performing the data volume consistency check to obtain the data volume consistency check result, the method further includes:
If the data volume consistency check result, the result after the first update, or the result after the second update indicates that the data volumes are inconsistent, then, after a preset delay, querying the data of the source database and the target database again according to the data consistency check parameters and the core query script and repeating the data volume consistency check, until the result indicates that the data volumes are consistent or the number of re-queries reaches the retry count.
Optionally, after repeating the data volume consistency check until the result indicates that the data volumes are consistent or the number of re-queries reaches the retry count, the method further includes:
If the data volume consistency check result is still inconsistent once the number of re-queries has reached the retry count, obtaining the data consistency check result according to the data volume check result.
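The delayed retry described above can be sketched as a simple loop. The helper name `check_with_retry`, the callables that return row counts, and the injectable `sleep` parameter are hypothetical conveniences for illustration.

```python
import time

def check_with_retry(count_source, count_target, max_retries, delay_seconds,
                     sleep=time.sleep):
    """Return (consistent, retries_used).

    The counts are compared once up front; on a mismatch the check waits
    delay_seconds and queries again, at most max_retries times.
    """
    retries = 0
    while True:
        if count_source() == count_target():
            return True, retries
        if retries >= max_retries:
            # Retry budget exhausted: the mismatch becomes the final result.
            return False, retries
        sleep(delay_seconds)
        retries += 1
```

The delay gives in-flight replication a chance to catch up, so that a transient lag is not reported as an inconsistency.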
Optionally, performing the data content consistency check on the data of the source database and the target database queried by the core query script to obtain the data content consistency check result includes:
Sampling, according to the stratified sampling algorithm, y pieces of data queried by the core query script in each sampling time period from the target database and from the source database respectively, where y is a natural number not less than 1;
Performing a data content consistency check on the y pieces of data sampled from the target database and the source database to obtain the data content consistency check result.
Optionally, performing the data content consistency check on the y pieces of data sampled from the target database and the source database to obtain the data content consistency check result includes:
Performing CRC encoding on the y pieces of data sampled from the target database and from the source database respectively to obtain the corresponding encoded information;
Performing a data content consistency check on the encoded information of the source database and of the target database to obtain the data content consistency check result.
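A minimal sketch of the CRC comparison, using CRC-32 from Python's standard `zlib` module. The text only says "CRC encoding", so the CRC width, the row serialisation (the `repr` of each field joined by a unit separator) and the assumption that both samples arrive in the same order, for example sorted by primary key, are all illustrative choices.

```python
import zlib

def row_crc(row):
    """CRC-32 of one sampled row; the serialisation format is an assumption."""
    payload = "\x1f".join(repr(value) for value in row).encode("utf-8")
    return zlib.crc32(payload)

def content_consistent(source_rows, target_rows):
    """Compare per-row CRCs of two samples taken in the same order."""
    return [row_crc(r) for r in source_rows] == [row_crc(r) for r in target_rows]
```

Comparing checksums avoids shipping full rows between systems. Matching CRCs make equality very likely but cannot prove it, since distinct rows can collide.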
Optionally, if the data content consistency check result indicates that the encoded information is consistent, then after performing the data content consistency check on the y pieces of data sampled from the target database and the source database to obtain the data content consistency check result, the method further includes:
Performing a data content consistency check on the data of the source database and the target database sampled at the M time points, and performing a first update of the data content consistency check result;
If the data content consistency check result after the first update indicates that the data content sampled at the M time points is consistent, performing a data content consistency check on the data of the source database and the target database sampled at the W time points, and performing a second update of the data content consistency check result.
Optionally, if the data consistency check result is inconsistent, then after obtaining the data consistency check result according to the data content consistency check result, the method further includes:
Recording the data consistency check result in a preset log, executing a preset automatic execution script, and obtaining the corresponding fallback compensation script by means of the automatic execution script;
Obtaining, by means of the fallback compensation script, the sampling time periods whose data consistency check failed, and determining the data segments of the target database to be compensated according to those sampling time periods;
Obtaining the data corresponding to the data segments to be compensated from the source database and copying it to the target database.
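The compensation step can be sketched with two SQLite connections standing in for the source and target databases. The function name `compensate`, the timestamp-column addressing of a data segment, and deleting the target's rows for the segment before re-copying them are assumptions; the text only states that the segment's data is copied from the source to the target. Table and column names are interpolated into the SQL and must therefore come from trusted configuration.

```python
import sqlite3

def compensate(source, target, table, ts_column, bad_segments):
    """Re-copy the rows of each inconsistent sampling time period to the target.

    bad_segments is a list of (start, end) timestamp pairs; each pair is
    treated as the half-open range [start, end).
    """
    copied = 0
    where = f"{ts_column} >= ? AND {ts_column} < ?"
    for seg_start, seg_end in bad_segments:
        rows = source.execute(
            f"SELECT * FROM {table} WHERE {where}", (seg_start, seg_end)
        ).fetchall()
        # Assumption: clear the target's copy of the segment so that stale
        # or partially written rows do not survive the re-copy.
        target.execute(f"DELETE FROM {table} WHERE {where}", (seg_start, seg_end))
        if rows:
            placeholders = ", ".join("?" * len(rows[0]))
            target.executemany(
                f"INSERT INTO {table} VALUES ({placeholders})", rows
            )
        copied += len(rows)
    target.commit()
    return copied
```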
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, apparatus and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
As stated above, the foregoing embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (13)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310201564.0A CN116303500A (en) | 2023-02-28 | 2023-02-28 | Data consistency verification method, device, equipment and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116303500A (en) | 2023-06-23 |
Family
ID=86831847
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310201564.0A Pending CN116303500A (en) | 2023-02-28 | 2023-02-28 | Data consistency verification method, device, equipment and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116303500A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116932567A (en) * | 2023-07-27 | 2023-10-24 | 联想(北京)有限公司 | Data detection method and device and electronic equipment |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20160171070A1 (en) * | 2014-12-10 | 2016-06-16 | International Business Machines Corporation | Query dispatching system and method |
| CN110083615A (en) * | 2019-04-12 | 2019-08-02 | 平安普惠企业管理有限公司 | A kind of data verification method, device, electronic equipment and storage medium |
| CN110287102A (en) * | 2019-05-22 | 2019-09-27 | 深圳壹账通智能科技有限公司 | Core data detection and processing method, device, computer equipment and storage medium |
| CN111104392A (en) * | 2019-12-12 | 2020-05-05 | 京东数字科技控股有限公司 | Database migration method and device, electronic equipment and storage medium |
| CN112181945A (en) * | 2020-09-28 | 2021-01-05 | 中国平安人寿保险股份有限公司 | Data archiving processing method and device, computer equipment and storage medium |
| CN114581251A (en) * | 2022-03-03 | 2022-06-03 | 深圳壹账通科技服务有限公司 | Data verification method and device, computer equipment and computer readable storage medium |
| CN115455078A (en) * | 2022-09-08 | 2022-12-09 | 千寻位置网络有限公司 | Consistency checking method and system for data center |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN104067216B (en) | Systems and methods for implementing scalable data storage services | |
| Jensen et al. | Modelardb: Modular model-based time series management with spark and cassandra | |
| CN106233259B (en) | Method and system for retrieving multi-generational stored data in a decentralized storage network | |
| US11232071B2 (en) | Regressable differential data structures | |
| JP6254606B2 (en) | Database streaming restore from backup system | |
| US10031935B1 (en) | Customer-requested partitioning of journal-based storage systems | |
| US7447839B2 (en) | System for a distributed column chunk data store | |
| CN114746843A (en) | Memory health tracking for differentiated data recovery configurations | |
| CN103581331B (en) | The online moving method of virtual machine and system | |
| CN114564457B (en) | A storage space optimization method and system for database files | |
| US8402119B2 (en) | Real-load tuning of database applications | |
| CN110609797A (en) | Page cache logging for block-based storage | |
| US11250019B1 (en) | Eventually consistent replication in a time-series database | |
| US11455305B1 (en) | Selecting alternate portions of a query plan for processing partial results generated separate from a query engine | |
| US11853284B2 (en) | In-place updates with concurrent reads in a decomposed state | |
| WO2019197918A1 (en) | Fault-tolerant federated distributed database | |
| CN114968111A (en) | Data deleting method, device, equipment and computer readable storage medium | |
| CN108228432A (en) | A kind of distributed link tracking, analysis method and server, global scheduler | |
| US10749772B1 (en) | Data reconciliation in a distributed data storage network | |
| CN116303500A (en) | Data consistency verification method, device, equipment and storage medium | |
| CN115878052B (en) | RAID array inspection method, inspection device and electronic equipment | |
| CN112395265B (en) | Database access method, device, electronic device and computer storage medium | |
| CN118550935A (en) | Tuple processing method, system and device | |
| CN111880964A (en) | Method and system for provenance-based data backup | |
| CN120226000A (en) | In-band file system access |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||