
CN116150179A - Method and device for comparing data consistency between databases - Google Patents


Info

Publication number
CN116150179A
Authority
CN
China
Prior art keywords
data
boundary
data block
calculating
comparison
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310394989.8A
Other languages
Chinese (zh)
Inventor
卜洪涛
刘金鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Nankai University General Data Technologies Co ltd
Original Assignee
Tianjin Nankai University General Data Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Nankai University General Data Technologies Co ltd
Priority to CN202310394989.8A
Priority to PCT/CN2023/095538 (WO2024212312A1)
Publication of CN116150179A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/273 Asynchronous replication or reconciliation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a method and a device for comparing data consistency between databases, and relates to the field of data consistency comparison between databases. The method comprises the following steps: selecting a field of the table as the condition column for calculating data block boundaries, and calculating the maximum value and the minimum value of the condition column; calculating a data block boundary according to the minimum value, marking the maximum value as the minimum value of the next boundary query, and repeating until the data block boundaries of the whole table have been calculated; configuring and starting 2n threads, allocating n threads to process source table data and n threads to process target table data, and obtaining data block boundary values from a condition queue; and querying all primary key values within the boundary value range of the source table, and calculating, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary. The method decomposes the data into a plurality of data block boundaries by means of an algorithm. Each data block boundary can be queried and compared independently, and multiple data block boundaries can be compared in parallel, which increases the comparison speed and reduces the comparison difficulty.

Description

Method and device for comparing data consistency between databases
Technical Field
The present application relates to the field of data consistency comparison between databases, and in particular to a method for comparing data consistency between databases. The application also relates to a device for comparing data consistency between databases.
Background
With the development of big data, many business scenarios involve data synchronization operations.
In the prior art, it is generally necessary to synchronize data from a primary node to a backup node, or to synchronize the data of a table in one type of database to a table in another type of database. If data inconsistency occurs during synchronization, the difference data is usually identified by manual comparison.
The defect of the prior art is that difference data is difficult to compare manually. In particular, heterogeneous databases cannot communicate with each other directly, which makes the operation even more difficult.
Disclosure of Invention
The method for comparing data consistency between databases provided by the application aims to overcome the defect of the prior art that difference data is difficult to compare manually. The application also relates to a device for comparing data consistency between databases.
The data consistency comparison method between databases provided by the application comprises the following steps:
selecting a field of the table as the condition column for calculating data block boundaries, and calculating the maximum value and the minimum value of the condition column of the table;
calculating a data block boundary according to the minimum value, marking the maximum value as the minimum value of the next boundary query, and repeating until the data block boundaries of the whole table have been calculated;
configuring and starting 2n threads, allocating n threads to process source table data and n threads to process target table data, and obtaining data block boundary values from a condition queue;
and querying all primary key values within the boundary value range of the source table, and calculating, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary.
Optionally, the condition column is indexed.
Optionally, said calculating the maximum value and the minimum value of the condition column of the table includes:
calculating the maximum value and the minimum value of the condition column of the table by [ select min(c1), max(c1) from t ];
wherein c1 represents the condition column.
Optionally, the calculating the data block boundary includes:
calculating the data block boundary by [ select max(c1) from t where c1 >= <boundary query minimum> order by c1 limit 1000 ], the data block boundary being the range from the condition column value used in the SQL query to the returned max(c1) value;
wherein c1 represents the condition column.
Optionally, the query for all primary key values within the source table boundary value range has the following form:
select <primary key column 1>, ..., <primary key column n> from t where <comparison column> >= <boundary minimum> and <comparison column> <= <boundary maximum> order by <comparison column> desc
Optionally, the method further comprises: all primary key values of the boundary of the source table are queried and recorded into a source table block data container.
Optionally, the recording into the source table block data container includes:
the usage size of the data container is controlled by configuration.
Optionally, the calculating of difference data between the source table and the target table in the data corresponding to the same data block boundary comprises the following steps:
marking the source table data block and the target table data block of the same data block boundary as the same group;
and reading the data marked as the same group, performing a bidirectional comparison, calculating the differing primary key data, and then writing the data to a file.
Optionally, the writing of the data to a file includes:
for a primary key that exists in the source table but not in the target table, recording the primary key data in file 1;
and for a primary key that exists in the target table but not in the source table, recording the primary key data in file 2.
The application also provides a data consistency comparison device between databases, which comprises:
the first calculation module, which is used for selecting a field of the table as the condition column for calculating data block boundaries and calculating the maximum value and the minimum value of the condition column of the table;
the second calculation module, which is used for calculating a data block boundary according to the minimum value, marking the maximum value as the minimum value of the next boundary query, and repeating until the data block boundaries of the whole table have been calculated;
the configuration query module, which is used for configuring and starting 2n threads, allocating n threads to process source table data and n threads to process target table data, and obtaining data block boundary values from a condition queue;
and the comparison module, which is used for querying all primary key values within the boundary value range of the source table and calculating, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary.
The application has the advantages and beneficial effects that:
The method for comparing data consistency between databases provided by the application comprises the following steps: selecting a field of the table as the condition column for calculating data block boundaries, and calculating the maximum value and the minimum value of the condition column of the table; calculating a data block boundary according to the minimum value, marking the maximum value as the minimum value of the next boundary query, and repeating until the data block boundaries of the whole table have been calculated; configuring and starting 2n threads, allocating n threads to process source table data and n threads to process target table data, and obtaining data block boundary values from a condition queue; and querying all primary key values within the boundary value range of the source table, and calculating, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary. The method and device quickly decompose the data into a plurality of data block boundaries by means of an algorithm. Each data block boundary can be queried and compared independently, and multiple data block boundaries can be compared in parallel, which increases the comparison speed and reduces the comparison difficulty.
Drawings
FIG. 1 is a flow diagram of the data consistency comparison between databases in the present application.
FIG. 2 is a schematic diagram of the data consistency comparison logic between databases in the present application.
FIG. 3 is a schematic diagram of the data consistency comparison device between databases in the present application.
Detailed Description
The present application is further described in conjunction with the drawings and detailed embodiments so that those skilled in the art may better understand the present application and practice it.
The following specific embodiments are provided to illustrate in detail the technical solutions to be protected in this application. The application may also be implemented in ways other than those described herein, and those skilled in the art may implement it by different technical means under the guidance of its conception, so the application is not limited by the following specific embodiments.
The method for comparing data consistency between databases provided by the application comprises the following steps: selecting a field of the table as the condition column for calculating data block boundaries, and calculating the maximum value and the minimum value of the condition column of the table; calculating a data block boundary according to the minimum value, marking the maximum value as the minimum value of the next boundary query, and repeating until the data block boundaries of the whole table have been calculated; configuring and starting 2n threads, allocating n threads to process source table data and n threads to process target table data, and obtaining data block boundary values from a condition queue; and querying all primary key values within the boundary value range of the source table, and calculating, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary. The method and device quickly decompose the data into a plurality of data block boundaries by means of an algorithm. Each data block boundary can be queried and compared independently, and multiple data block boundaries can be compared in parallel, which increases the comparison speed and reduces the comparison difficulty.
Referring to fig. 1, the present application aims to solve the problem that data comparison is slow with conventional methods. The data is quickly decomposed into a plurality of data block boundaries by an algorithm. Each data block (chunk) boundary can be queried and compared independently, and the data block boundaries can be compared in parallel, which improves performance. During the comparison, only the primary keys are compared, and the comparison is bidirectional.
For a primary key that exists in the source table but not in the target table, the primary key data is recorded in file 1.
For a primary key that exists in the target table but not in the source table, the primary key data is recorded in file 2.
In this technical scheme, the condition used to cut data block boundaries is not required to be a primary key, so usability and efficiency are not affected even if the table has a joint primary key. At the same time, with memory and CPU resource usage taken into account together, the best performance of the comparison task is achieved through reasonable configuration.
As shown in fig. 1, S101 selects a field of the table as the condition column for calculating data block boundaries, and calculates the maximum value and the minimum value of the condition column of the table.
Data block boundaries are calculated for a table; the tables involved are a source table and a target table.
In the application, calculating the data block boundaries of the table is the most important step. Once the boundaries have been calculated quickly, the data within each boundary can be queried in parallel by multiple threads and compared, which improves performance.
Specifically, a field is first selected as the condition column for calculating data block boundaries. The column typically needs an index, and its values should be repeated as little as possible. In this application, the condition column is denoted c1.
The maximum and minimum values of the condition column of the table are calculated by [ select min(c1), max(c1) from t ], where min(c1) is recorded as the initial boundary query minimum.
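The min/max query can be tried out with a minimal sketch such as the following. It assumes an in-memory SQLite database, a sample table t with 1000 rows, and a hypothetical helper named condition_column_range; none of these come from the application, which does not name a database product or an API.

```python
import sqlite3

def condition_column_range(conn, table, column):
    """Return (min, max) of the condition column; (None, None) if the table is empty."""
    # Identifiers cannot be bound as SQL parameters, so they are interpolated here.
    # This sketch assumes `table` and `column` come from trusted configuration.
    row = conn.execute(f"SELECT MIN({column}), MAX({column}) FROM {table}").fetchone()
    return row[0], row[1]

# Sample data (assumed): table t with columns c1, c2, c3 and 1000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 INTEGER, c2 TEXT, c3 TEXT)")
conn.execute("CREATE INDEX idx_t_c1 ON t (c1)")  # the condition column is indexed
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(i, f"a{i}", f"b{i}") for i in range(1, 1001)],
)

# lo becomes the initial boundary query minimum.
lo, hi = condition_column_range(conn, "t", "c1")
```

With an index on c1, both aggregates can usually be answered from the index rather than by scanning the table, which is one reason an indexed condition column is preferred.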
As shown in fig. 1, S102 calculates a data block boundary according to the minimum value, marks the maximum value as the minimum value of the next boundary query, and repeats until the data block boundaries of the whole table have been calculated.
The data block boundary is calculated by [ select max(c1) from t where c1 >= <boundary query minimum> order by c1 limit 1000 ]. The boundary is the range from the condition column value used in the SQL query to the returned max(c1) value, and the returned max(c1) value is marked as the minimum value of the next boundary query.
Finally, this calculation is repeated until the data block boundaries of the entire table have been obtained.
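The boundary loop can be sketched as follows, under several assumptions that are not part of the application: SQLite as the database, a hypothetical function name chunk_boundaries, and a guard for the case where at least `limit` rows share one value. The query is written with a subquery because, in SQLite, a LIMIT placed after an aggregate is applied to the single aggregated row rather than to the rows being aggregated.

```python
import sqlite3

def chunk_boundaries(conn, limit=1000):
    """Split table t into inclusive (low, high) ranges over condition column c1."""
    lo, table_max = conn.execute("SELECT MIN(c1), MAX(c1) FROM t").fetchone()
    chunks = []
    while lo is not None:
        # Highest c1 among the next `limit` rows at or above the current minimum.
        (hi,) = conn.execute(
            "SELECT MAX(c1) FROM (SELECT c1 FROM t WHERE c1 >= ? ORDER BY c1 LIMIT ?)",
            (lo, limit),
        ).fetchone()
        chunks.append((lo, hi))
        if hi >= table_max:
            break  # the whole table is covered
        if hi > lo:
            lo = hi  # the chunk maximum becomes the next boundary query minimum
        else:
            # Assumed guard: `limit` or more rows share one value, so skip past it.
            (lo,) = conn.execute("SELECT MIN(c1) FROM t WHERE c1 > ?", (lo,)).fetchone()
    return chunks

# Sample data (assumed): c1 holds the values 1..1000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 INTEGER)")
conn.execute("CREATE INDEX idx_t_c1 ON t (c1)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 1001)])

chunks = chunk_boundaries(conn, limit=100)
```

Because both ends of a range are inclusive, neighbouring chunks share one endpoint value, which mirrors the ">=" and "<=" conditions used when the primary keys of a chunk are read.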
A specific example illustrates the results of the above steps as follows:
Assume that table t has the columns c1, c2 and c3 and holds 1000 rows. For ease of demonstration, assume that column c1 holds the 1000 values data1 to data1000 and that the limit is set to 100. The data blocks split off according to the above rule are recorded as follows:
Data block range      Data block ID
data1-data100         1
data100-data200       2
...                   ...
data800-data900       9
data900-data1000      10
The data block boundaries calculated in this way, with the above structure, are put into a condition queue for processing in the subsequent steps.
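One way to picture the condition queue is the sketch below. It assumes one queue per side (source and target) so that every boundary is read once from each table; the application only speaks of "a condition queue", so this layout and the name build_condition_queues are illustrative assumptions.

```python
import queue

def build_condition_queues(chunks):
    """Put (chunk_id, low, high) items into one queue per side (assumed layout)."""
    source_q, target_q = queue.Queue(), queue.Queue()
    for chunk_id, (low, high) in enumerate(chunks, start=1):
        item = (chunk_id, low, high)
        source_q.put(item)  # consumed by the n source-table threads
        target_q.put(item)  # consumed by the n target-table threads
    return source_q, target_q

source_q, target_q = build_condition_queues([(1, 100), (100, 199), (199, 298)])
first = source_q.get_nowait()
```

The chunk ID travels with the boundary so that the source block and the target block read for the same boundary can later be matched up as one group.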
As shown in fig. 1, S103 configures and starts 2n threads, allocates n threads to process source table data and n threads to process target table data, and obtains data block boundary values from the condition queue.
Multithreading is used: after acquiring a data block boundary from the condition queue, each thread reads the primary key data to be compared from its table and stores it in memory for the subsequent comparison.
The condition column used for data block boundaries is not required to be the primary key, because the primary key may in theory be a joint primary key, and using multiple columns as condition columns would make the boundaries harder to calculate and would affect performance. The basic algorithm process is as follows:
As shown in FIG. 2, S201: the boundary query thread queries the data block boundaries.
S202: the determined source table condition column and target table condition column are prepared.
S203: 2n threads are started through configuration, of which n threads are responsible for processing source table data and n threads are responsible for processing target table data.
S204: each thread of the source table is responsible for acquiring a data block boundary value from the condition queue, then querying all primary key values of the source table within that boundary and recording them into a source table block data container. Its SQL has the following form:
select <primary key column 1>, ..., <primary key column n> from t where <comparison column> >= <boundary minimum> and <comparison column> <= <boundary maximum> order by <comparison column> desc
Each thread of the target table is responsible for acquiring a data block boundary value from the condition queue, then querying all primary key values of the target table within that boundary and recording them into a target table block data container. Its SQL has the following form:
select <primary key column 1>, ..., <primary key column n> from t where <comparison column> >= <boundary minimum> and <comparison column> <= <boundary maximum> order by <comparison column> desc
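The 2n reader threads can be sketched as below. The sketch assumes two SQLite files standing in for the source and target databases, a single-column primary key named id, one condition queue per side, and the hypothetical helpers make_db, reader and read_all; the real system would use its own database drivers and a composite key where needed.

```python
import os
import queue
import sqlite3
import tempfile
import threading

def make_db(path, keys):
    """Create a sample table t(id primary key, c1 condition column)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, c1 INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(k, k) for k in keys])
    conn.commit()
    conn.close()

def reader(path, cond_q, out, lock):
    """Take boundaries from the condition queue and read the keys inside each one."""
    conn = sqlite3.connect(path)  # one connection per thread
    while True:
        try:
            chunk_id, low, high = cond_q.get_nowait()
        except queue.Empty:
            break
        rows = conn.execute(
            "SELECT id FROM t WHERE c1 >= ? AND c1 <= ? ORDER BY c1 DESC",
            (low, high),
        ).fetchall()
        with lock:
            out[chunk_id] = {row[0] for row in rows}
    conn.close()

def read_all(source_path, target_path, chunks, n=2):
    """Start 2n threads: n for the source table and n for the target table."""
    results = {"source": {}, "target": {}}
    lock = threading.Lock()
    threads = []
    for side, path in (("source", source_path), ("target", target_path)):
        cond_q = queue.Queue()
        for item in chunks:
            cond_q.put(item)
        threads += [
            threading.Thread(target=reader, args=(path, cond_q, results[side], lock))
            for _ in range(n)
        ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

with tempfile.TemporaryDirectory() as tmp:
    src, dst = os.path.join(tmp, "src.db"), os.path.join(tmp, "dst.db")
    make_db(src, range(1, 11))
    make_db(dst, [k for k in range(1, 11) if k != 3])  # key 3 is missing in the target
    keys = read_all(src, dst, [(1, 1, 5), (2, 5, 10)])
```

Here the results are kept in plain dictionaries for brevity; a memory-bounded container is a separate concern.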
When the block data container exceeds the specified size, a thread that tries to put data into it is blocked. The blocked data can be put in only after a subsequent comparison thread has processed and destroyed data blocks already in the container. To control memory use, the usable size of the data container can be controlled by configuration.
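The blocking behaviour described above resembles a bounded queue. The following sketch uses Python's queue.Queue with a maxsize as a stand-in for the block data container; measuring capacity in number of blocks, the None sentinel, and the name run_bounded are assumptions made for illustration.

```python
import queue
import threading

def run_bounded(blocks, capacity):
    """Pass blocks through a size-limited container; return the sizes compared."""
    container = queue.Queue(maxsize=capacity)  # configured container size (in blocks)
    compared = []

    def reader():
        for block in blocks:
            container.put(block)  # blocks here while the container is full
        container.put(None)       # sentinel: no more blocks

    def comparer():
        while True:
            block = container.get()  # removing a block frees a slot for the reader
            if block is None:
                break
            compared.append(len(block))  # placeholder for the real comparison

    threads = [threading.Thread(target=reader), threading.Thread(target=comparer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return compared

sizes = run_bounded([[1, 2], [3], [4, 5, 6]], capacity=1)
```

A smaller capacity lowers peak memory at the cost of readers waiting more often, which is the trade-off the configuration is meant to tune.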
As shown in fig. 1, S104 queries all primary key values within the boundary value range of the source table, and calculates, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary.
With continued reference to fig. 2, S205: the source table data block and the target table data block of the same data block boundary are marked as the same group, and a thread is responsible for fetching the data that has been read and marked as the same group and comparing it in both directions.
S206: the differing primary key data is calculated and then written to a file. After a data block has been compared, it is destroyed to release space.
Finally, the difference data is written out to generate the files.
The comparison results are written to files according to the following rules:
For a primary key that exists in the source table but not in the target table, the primary key data is recorded in file 1.
For a primary key that exists in the target table but not in the source table, the primary key data is recorded in file 2.
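The bidirectional comparison and the two output files can be sketched with set differences. The sketch assumes that primary keys are represented as tuples (so that joint primary keys fit), that each key is written as one comma-separated line, and that the helper names compare_chunk and write_differences are hypothetical.

```python
import os
import tempfile

def compare_chunk(source_keys, target_keys):
    """Return (keys only in source, keys only in target) for one group."""
    source_keys, target_keys = set(source_keys), set(target_keys)
    return sorted(source_keys - target_keys), sorted(target_keys - source_keys)

def write_differences(groups, file1_path, file2_path):
    """Compare each group in both directions and write the differing keys."""
    with open(file1_path, "w", encoding="utf-8") as f1, \
         open(file2_path, "w", encoding="utf-8") as f2:
        for chunk_id in sorted(groups):
            # pop() drops the block once it has been compared, releasing memory.
            source_keys, target_keys = groups.pop(chunk_id)
            only_source, only_target = compare_chunk(source_keys, target_keys)
            for key in only_source:  # in source, missing from target: file 1
                f1.write(",".join(map(str, key)) + "\n")
            for key in only_target:  # in target, missing from source: file 2
                f2.write(",".join(map(str, key)) + "\n")

# Sample groups (assumed): chunk ID mapped to (source keys, target keys).
groups = {
    1: ([(1, "a"), (2, "b")], [(2, "b"), (3, "c")]),
    2: ([(4, "d")], [(4, "d")]),
}
with tempfile.TemporaryDirectory() as tmp:
    p1, p2 = os.path.join(tmp, "file1.txt"), os.path.join(tmp, "file2.txt")
    write_differences(groups, p1, p2)
    with open(p1, encoding="utf-8") as f:
        file1_text = f.read()
    with open(p2, encoding="utf-8") as f:
        file2_text = f.read()
```

Only keys are compared here, in line with the description; detecting rows whose keys match but whose other columns differ would need an extra step that the text does not cover.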
As shown in fig. 3, the present application further provides a device for comparing data consistency between databases, where the device is configured to execute the above method.
The first calculation module 301 selects a field of the table as the condition column for calculating data block boundaries, and calculates the maximum value and the minimum value of the condition column of the table.
Data block boundaries are calculated for a table; the tables involved are a source table and a target table.
In the application, calculating the data block boundaries of the table is the most important step. Once the boundaries have been calculated quickly, the data within each boundary can be queried in parallel by multiple threads and compared, which improves performance.
Specifically, a field is selected as the condition column for calculating data block boundaries. The column generally needs an index, and its values should be repeated as little as possible. In this application, the condition column is denoted c1.
The maximum and minimum values of the condition column of the table are calculated by [ select min(c1), max(c1) from t ], where min(c1) is recorded as the initial boundary query minimum.
The second calculation module 302 calculates a data block boundary according to the minimum value, marks the maximum value as the minimum value of the next boundary query, and repeats until the data block boundaries of the whole table have been calculated.
The data block boundary is calculated by [ select max(c1) from t where c1 >= <boundary query minimum> order by c1 limit 1000 ]. The boundary is the range from the condition column value used in the SQL query to the returned max(c1) value, and the returned max(c1) value is marked as the minimum value of the next boundary query.
Finally, this calculation is repeated until the data block boundaries of the entire table have been obtained.
The configuration query module 303 configures and starts 2n threads, allocates n threads to process source table data and n threads to process target table data, and obtains data block boundary values from the condition queue.
Multithreading is used: after acquiring a data block boundary from the condition queue, each thread reads the primary key data to be compared from its table and stores it in memory for the subsequent comparison.
The condition column is not required to be the primary key, because the primary key may in theory be a joint primary key, and using multiple columns as condition columns would make the boundaries harder to calculate and would affect performance. The basic algorithm is as follows:
2n threads are started through configuration, of which n threads are responsible for processing source table data and n threads are responsible for processing target table data.
Each thread of the source table is responsible for acquiring a data block boundary value from the condition queue, then querying all primary key values of the source table within that boundary and recording them into a source table block data container. Its SQL has the following form:
select <primary key column 1>, ..., <primary key column n> from t where <comparison column> >= <boundary minimum> and <comparison column> <= <boundary maximum> order by <comparison column> desc
Each thread of the target table is responsible for acquiring a data block boundary value from the condition queue, then querying all primary key values of the target table within that boundary and recording them into a target table block data container. Its SQL has the following form:
select <primary key column 1>, ..., <primary key column n> from t where <comparison column> >= <boundary minimum> and <comparison column> <= <boundary maximum> order by <comparison column> desc
When the block data container exceeds the specified size, a thread that tries to put data into it is blocked. The blocked data can be put in only after a subsequent comparison thread has processed and destroyed data blocks already in the container. To control memory use, the usable size of the data container can be controlled by configuration.
The comparison module 304 queries all primary key values within the boundary value range of the source table, and calculates, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary.
Specifically, the source table data block and the target table data block of the same data block boundary are marked as the same group. A thread is responsible for fetching the data that has been read and marked as the same group, comparing it in both directions, calculating the differing primary key data, and then writing the data to a file. After a data block has been compared, it is destroyed to release space.
Finally, the difference data is written out to generate the files.
The comparison results are written to files according to the following rules:
For a primary key that exists in the source table but not in the target table, the primary key data is recorded in file 1.
For a primary key that exists in the target table but not in the source table, the primary key data is recorded in file 2.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (10)

1. The data consistency comparison method between databases is characterized by comprising the following steps:
selecting a field of the table as the condition column for calculating data block boundaries, and calculating the maximum value and the minimum value of the condition column of the table;
calculating a data block boundary according to the minimum value, marking the maximum value as the minimum value of the next boundary query, and repeating until the data block boundaries of the whole table have been calculated;
configuring and starting 2n threads, allocating n threads to process source table data and n threads to process target table data, and obtaining data block boundary values from a condition queue;
and querying all primary key values within the boundary value range of the source table, and calculating, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary.
2. The method of claim 1, wherein the condition columns are indexed.
3. The method for comparing data consistency between databases according to claim 1, wherein said calculating the maximum value and the minimum value of the condition column of the table comprises:
calculating the maximum value and the minimum value of the condition column of the table by [ select min(c1), max(c1) from t ];
wherein c1 represents the condition column, min(c1) is recorded as the initial boundary query minimum, and max(c1) is the minimum value of the next boundary query.
4. The method for comparing data consistency between databases according to claim 3, wherein said calculating data block boundaries comprises:
calculating the data block boundary by [ select max(c1) from t where c1 >= <boundary query minimum> order by c1 limit 1000 ], the data block boundary being the range from the condition column value used in the SQL query to the returned max(c1) value, wherein max(c1) is the minimum value of the next boundary query;
wherein c1 represents the condition column.
5. The method for comparing data consistency between databases according to claim 1, wherein the query for all primary key values within the source table boundary value range has the following form:
select <primary key column 1>, ..., <primary key column n> from t where <comparison column> >= <boundary minimum> and <comparison column> <= <boundary maximum> order by <comparison column> desc
6. The method for comparing data consistency among databases according to claim 1, further comprising: all primary key values of the boundary of the source table are queried and recorded into a source table block data container.
7. The method for comparing data consistency between databases according to claim 6, wherein said recording into a source table block data container comprises:
the usage size of the data container is controlled by configuration.
8. The method for comparing data consistency between databases according to any one of claims 1 to 7, wherein the calculating of difference data between the source table and the target table in the data corresponding to the same data block boundary comprises the following steps:
marking the source table data block and the target table data block of the same data block boundary as the same group;
and reading the data marked as the same group, performing a bidirectional comparison, calculating the differing primary key data, and then writing the data to a file.
9. The method for comparing data consistency between databases according to claim 8, wherein the writing of the data to a file comprises:
for a primary key that exists in the source table but not in the target table, recording the primary key data in file 1;
and for a primary key that exists in the target table but not in the source table, recording the primary key data in file 2.
10. A data consistency comparison apparatus between databases, comprising:
a first calculation module, configured to select a field from the table data as a condition column for calculating data block boundaries and to calculate the maximum value and the minimum value of the condition column of the table;
a second calculation module, configured to calculate a data block boundary starting from the minimum value, mark the resulting maximum value as the minimum value of the next boundary query, and repeat until the data block boundaries of the whole table have been calculated;
a configuration query module, configured to configure and start 2n threads, allocate n threads each to processing source table data and target table data, and acquire data block boundary values from a condition queue;
and a comparison module, configured to query all primary key values within the source table boundary value range and to calculate, according to the primary key, the difference data between the source table and the target table in the data corresponding to the same data block boundary.
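How the modules might cooperate is sketched below under several assumptions: `compare_tables` is a hypothetical name, the source and target reads are abstracted as callables taking `(low, high)`, and each side gets its own queue of boundaries, which is one possible reading of the "condition queue". It starts 2n threads, n per side, and then compares the blocks that share a boundary.

```python
import queue
import threading


def compare_tables(fetch_source, fetch_target, boundaries, n=2):
    """Return {boundary: (only_in_source, only_in_target)} for each block.

    fetch_source / fetch_target: callables (low, high) -> iterable of
    primary-key tuples (assumed interface, for illustration only).
    """
    results = {"source": {}, "target": {}}
    lock = threading.Lock()

    def worker(side, fetch, tasks):
        while True:
            try:
                boundary = tasks.get_nowait()
            except queue.Empty:
                return  # every boundary for this side has been claimed
            keys = set(fetch(*boundary))
            with lock:
                results[side][boundary] = keys

    threads = []
    for side, fetch in (("source", fetch_source), ("target", fetch_target)):
        tasks = queue.Queue()
        for b in boundaries:  # fill the queue before any thread starts
            tasks.put(b)
        threads += [
            threading.Thread(target=worker, args=(side, fetch, tasks))
            for _ in range(n)
        ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Blocks with the same boundary form one group and are compared both ways.
    return {
        b: (results["source"][b] - results["target"][b],
            results["target"][b] - results["source"][b])
        for b in boundaries
    }
```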
CN202310394989.8A 2023-04-14 2023-04-14 Method and device for comparing data consistency between databases Pending CN116150179A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310394989.8A CN116150179A (en) 2023-04-14 2023-04-14 Method and device for comparing data consistency between databases
PCT/CN2023/095538 WO2024212312A1 (en) 2023-04-14 2023-05-22 Method and apparatus for data consistency comparison between databases

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310394989.8A CN116150179A (en) 2023-04-14 2023-04-14 Method and device for comparing data consistency between databases

Publications (1)

Publication Number Publication Date
CN116150179A (en) 2023-05-23

Family

ID=86373868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310394989.8A Pending CN116150179A (en) 2023-04-14 2023-04-14 Method and device for comparing data consistency between databases

Country Status (2)

Country Link
CN (1) CN116150179A (en)
WO (1) WO2024212312A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118394775A (en) * 2024-07-01 2024-07-26 天津南大通用数据技术股份有限公司 A fast data consistency detection method based on Oracle pseudo column

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105989194A (en) * 2015-03-20 2016-10-05 国际商业机器公司 Method and system of table data comparison
CN107679104A (en) * 2017-09-12 2018-02-09 杭州美创科技有限公司 Big surface low formula parallel high-speed data comparison method
CN108153619A (en) * 2017-12-25 2018-06-12 杭州恩牛网络技术有限公司 A kind of data proofreading method and device
CN114138739A (en) * 2021-11-05 2022-03-04 浪潮软件集团有限公司 Database table content rapid comparison system
CN114328470A (en) * 2022-03-14 2022-04-12 北京奥星贝斯科技有限公司 Data migration method and device for single source table
CN114996288A (en) * 2022-06-23 2022-09-02 网易(杭州)网络有限公司 Data comparison method and device, computer storage medium and electronic equipment
CN115952168A (en) * 2022-12-23 2023-04-11 成都康赛信息技术有限公司 Education industry-oriented multi-scale progressive difference data positioning method

Also Published As

Publication number Publication date
WO2024212312A1 (en) 2024-10-17

Similar Documents

Publication Publication Date Title
US10452676B2 (en) Managing database with counting bloom filters
US9317517B2 (en) Hashing scheme using compact array tables
CN106547644B (en) Incremental backup method and equipment
CN107679104B (en) Large-flow parallel high-speed data comparison method
CN109189783B (en) Time sequence database table structure change processing method
US20130013648A1 (en) Method for database storage of a table with plural schemas
CN105550225A (en) Index construction method and query method and apparatus
US8843499B2 (en) Accelerating database queries comprising positional text conditions plus bitmap-based conditions
CN116150179A (en) Method and device for comparing data consistency between databases
CN110928665B (en) Data processing method, device, storage medium and terminal
US9779121B2 (en) Transparent access to multi-temperature data
WO2025026170A1 (en) Data query method and related device
CN109213751B (en) Spark platform based Oracle database parallel migration method
WO2017020735A1 (en) Data processing method, backup server and storage system
US9262472B2 (en) Concatenation for relations
CN110928863A (en) Method for task breakpoint resume applied to data cleaning tool
CN109977113A (en) A kind of HBase Index Design method based on Bloom filter for medical imaging data
CN114356923A (en) MVCC multi-version folding tree implementation system and method based on ClickHouse database
US11816245B2 (en) Method for analysis on interim result data of de-identification procedure, apparatus for the same, computer program for the same, and recording medium storing computer program thereof
CN114238258A (en) Database data processing method and device, computer equipment and storage medium
CN114064653A (en) Data insertion method, apparatus, computer equipment and storage medium
CN110399375B (en) Data table index creation method and device
CN117609181A (en) Method and system for migrating TCHouse database
CN117349293A (en) Database parallel aggregation method based on column storage format, computer equipment and storage medium
US20210248142A1 (en) Dual filter histogram optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230523