CN116501539A - Data processing method and related device - Google Patents

Data processing method and related device

Publication number
CN116501539A
Authority
CN
China
Prior art keywords
transaction
log chain
chain
incremental
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210061367.9A
Other languages
Chinese (zh)
Inventor
柴云鹏
任波
骆远辉
黄人煌
王元桢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renmin University of China
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Renmin University of China
Huawei Cloud Computing Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China and Huawei Cloud Computing Technologies Co Ltd
Priority to CN202210061367.9A
Publication of CN116501539A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1474 Saving, restoring, recovering or retrying in transactions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2315 Optimistic concurrency control
    • G06F16/2322 Optimistic concurrency control using timestamps
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a data processing method and related device. A first device obtains a transaction table that indicates the state of each transaction and the storage address of its log chain. Each transaction corresponds to one log chain, a log chain includes at least one chain node, and each chain node indicates the storage address of one tuple in the transaction. The first device obtains the log chain of a transaction to be recovered according to the transaction table, where a transaction in the undo state in the transaction table is a transaction to be recovered. The first device determines the storage addresses of the tuples of the transaction to be recovered from its log chain in order to obtain those tuples, and then deletes the log chain and the tuples of the transaction to be recovered. In this application, the transaction to be recovered can be found through the transaction table, and its tuples can be obtained through its log chain, so no full table scan of the first device is needed, which improves the efficiency of data recovery and data processing.

Description

A data processing method and related device

Technical field

The embodiments of this application relate to the field of computer technologies, and in particular to a data processing method and related device.

Background

Write-behind logging (WBL) is a new logging and recovery protocol designed for non-volatile memory (NVM). Its key idea is to record which parts of the database have changed, rather than how the data has changed. With WBL, a transaction must be persisted before its log is persisted, and the log tracks the data modified by the transaction so that uncommitted transactions can be undone.

Zen's log-free strategy is one implementation of WBL. In the Zen mechanism, log information is written into each tuple of a transaction, which reduces write operations on the log. However, when data must be recovered, Zen has to perform two full table scans of the database to identify the transactions that did not finish committing, and then roll the data back.

Because a database stores a large amount of data, scanning the full table data twice incurs high system overhead and makes data recovery take too long.

Summary of the invention

The embodiments of this application provide a data processing method and related device for improving the efficiency of data processing.

In a first aspect, an embodiment of this application provides a data processing method. A first device obtains a transaction table that indicates the state of each transaction and the storage address of its log chain. Each transaction corresponds to one log chain, a log chain includes at least one chain node, and each chain node indicates the storage address of one tuple in the transaction. The first device obtains the log chain of a transaction to be recovered according to the transaction table, where a transaction in the undo state in the transaction table is a transaction to be recovered. The first device determines the storage addresses of the tuples of the transaction to be recovered from its log chain in order to obtain those tuples. The first device then deletes the log chain and the tuples of the transaction to be recovered.

In this application, the transaction to be recovered can be found through the transaction table, and its tuples can be obtained through its log chain, so no full table scan of the first device is needed, which improves the efficiency of data recovery and data processing. In addition, because the log chain in this application does not itself carry the tuple information of the transaction, using the log chain avoids the overhead of additional write operations. The approach therefore also has a significant performance advantage over the write-ahead logging (WAL) mechanism, in which the log file carries the transaction's tuples.

Based on the first aspect, in an optional implementation, the log chain includes multiple serially linked chain nodes, and an association relationship exists among these chain nodes.

Based on the first aspect, in an optional implementation, the storage address of the log chain is the storage address of the head node of the log chain.
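
The relationship between the transaction table, the log chain, and the tuples can be illustrated with a minimal Python sketch. The names `ChainNode`, `TxnEntry`, and `walk_chain`, the field names, the `next_addr` pointer used to model the association between nodes, and the integer addresses are all assumptions made for illustration; the application does not prescribe a concrete layout.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass
class ChainNode:
    tuple_addr: int            # storage address of one tuple touched by the transaction
    next_addr: Optional[int]   # address of the next chain node (assumed link), None at the tail


@dataclass
class TxnEntry:
    state: str                 # e.g. "undo" or "committed" (state names are assumptions)
    chain_head_addr: int       # storage address of the head node of the log chain


def walk_chain(nodes: Dict[int, ChainNode], head_addr: int) -> Iterator[int]:
    """Yield tuple addresses by following the chain from its head node."""
    addr: Optional[int] = head_addr
    while addr is not None:
        node = nodes[addr]
        yield node.tuple_addr
        addr = node.next_addr


# Simulated storage: node address -> chain node (hypothetical addresses).
nodes = {
    100: ChainNode(tuple_addr=9000, next_addr=101),
    101: ChainNode(tuple_addr=9008, next_addr=102),
    102: ChainNode(tuple_addr=9016, next_addr=None),
}
entry = TxnEntry(state="undo", chain_head_addr=100)
tuple_addrs = list(walk_chain(nodes, entry.chain_head_addr))
```

In this sketch the chain nodes hold only addresses, never tuple contents, which mirrors the statement that the log chain itself does not carry tuple information.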

Based on the first aspect, in an optional implementation, the first device obtaining the log chain of the transaction to be recovered according to the transaction table includes:

The first device determines, according to the transaction table, that a transaction in the undo state is a transaction to be recovered;

The first device determines the storage address of the log chain of the transaction to be recovered according to the transaction table;

The first device obtains the log chain of the transaction to be recovered according to the storage address of that log chain.
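
The recovery steps above can be sketched as follows. This is a simplified illustration rather than the claimed implementation: storage is modeled with in-memory dictionaries, the function name `recover` and the state string `"undo"` are hypothetical, and removing the transaction table entry after rollback is an assumption that the text does not state.

```python
from typing import Dict, List, Optional, Tuple

# node address -> (tuple address, address of next node or None)
ChainNodes = Dict[int, Tuple[int, Optional[int]]]


def recover(txn_table: Dict[int, Tuple[str, int]],
            chain_nodes: ChainNodes,
            tuples: Dict[int, bytes]) -> List[int]:
    """Roll back every transaction in the undo state and return their IDs."""
    rolled_back = []
    for txn_id, (state, head_addr) in list(txn_table.items()):
        if state != "undo":
            continue                      # only undo-state transactions are recovered
        addr: Optional[int] = head_addr   # the log chain address is its head node address
        while addr is not None:
            tuple_addr, next_addr = chain_nodes.pop(addr)  # delete the chain node
            tuples.pop(tuple_addr, None)                   # delete the tuple it points to
            addr = next_addr
        del txn_table[txn_id]             # assumption: the table entry is dropped as well
        rolled_back.append(txn_id)
    return rolled_back


txn_table = {1: ("committed", 100), 2: ("undo", 200)}
chain_nodes = {100: (9000, None), 200: (9100, 201), 201: (9108, None)}
tuples = {9000: b"a", 9100: b"b", 9108: b"c"}

rolled_back = recover(txn_table, chain_nodes, tuples)
```

Only the entries of the transaction table and the nodes of the affected chains are visited, so the cost depends on the number of undo-state transactions and their tuples, not on the size of the whole table.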

Based on the first aspect, in an optional implementation, the method is applied to a primary/standby database system, the first device is the primary database, a second device is the standby database, and the transaction table further indicates the commit time of each transaction. The method further includes:

The first device determines incremental transactions according to the transaction table, where the commit time of an incremental transaction is later than the latest backup time of the primary/standby database system;

The first device obtains the log chain of the incremental transaction according to the transaction table;

The first device determines the storage addresses of the tuples of the incremental transaction from its log chain in order to obtain those tuples;

The first device synchronizes the log chain and the tuples of the incremental transaction to the second device.

Based on the first aspect, in an optional implementation, the first device obtaining the log chain of the incremental transaction according to the transaction table includes:

The first device determines the storage address of the log chain of the incremental transaction according to the transaction table;

The first device obtains the log chain of the incremental transaction according to the storage address of that log chain.
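
The incremental synchronization flow can be sketched in the same style. The function names `collect_incremental` and `apply_on_standby`, the dictionary keys, the integer timestamps, and the choice to reuse the primary's addresses on the standby are all simplifying assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

# node address -> (tuple address, address of next node or None)
ChainNodes = Dict[int, Tuple[int, Optional[int]]]


def collect_incremental(txn_table: Dict[int, dict],
                        chain_nodes: ChainNodes,
                        tuples: Dict[int, bytes],
                        last_sync_time: int) -> List[dict]:
    """On the primary: gather log chains and tuples committed after last_sync_time."""
    batch = []
    for txn_id, entry in txn_table.items():
        commit_time = entry["commit_time"]
        if commit_time is None or commit_time <= last_sync_time:
            continue                       # not committed, or already synchronized
        chain, payload = [], {}
        addr: Optional[int] = entry["chain_head"]
        while addr is not None:
            tuple_addr, next_addr = chain_nodes[addr]
            chain.append((addr, tuple_addr, next_addr))
            payload[tuple_addr] = tuples[tuple_addr]
            addr = next_addr
        batch.append({"txn_id": txn_id, "chain": chain, "tuples": payload})
    return batch


def apply_on_standby(batch: List[dict],
                     standby_nodes: ChainNodes,
                     standby_tuples: Dict[int, bytes]) -> None:
    """On the standby: install the received chains and tuples (same addresses assumed)."""
    for item in batch:
        for addr, tuple_addr, next_addr in item["chain"]:
            standby_nodes[addr] = (tuple_addr, next_addr)
        standby_tuples.update(item["tuples"])


txn_table = {
    1: {"commit_time": 5, "chain_head": 100},
    2: {"commit_time": 12, "chain_head": 200},
    3: {"commit_time": None, "chain_head": 300},   # still in flight
}
chain_nodes = {100: (9000, None), 200: (9100, 201), 201: (9108, None), 300: (9200, None)}
tuples = {9000: b"a", 9100: b"b", 9108: b"c", 9200: b"d"}

batch = collect_incremental(txn_table, chain_nodes, tuples, last_sync_time=10)
standby_nodes: ChainNodes = {}
standby_tuples: Dict[int, bytes] = {}
apply_on_standby(batch, standby_nodes, standby_tuples)
```

As in the recovery case, the primary locates the changed tuples through the transaction table and log chains instead of scanning all data.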

In a second aspect, an embodiment of this application provides a data processing method. The method is applied to a primary/standby database system in which a first device is the primary database and a second device is the standby database, and the method includes:

The first device obtains a transaction table that indicates the commit time of each transaction and the storage address of its log chain, where each transaction corresponds to one log chain, a log chain includes at least one chain node, and each chain node indicates the storage address of one tuple in the transaction;

The first device determines incremental transactions according to the transaction table, where the commit time of an incremental transaction is later than the time of the latest data synchronization of the primary/standby database system;

The first device obtains the log chain of the incremental transaction according to the transaction table;

The first device determines the storage addresses of the tuples of the incremental transaction from its log chain in order to obtain those tuples;

The first device synchronizes the log chain and the tuples of the incremental transaction to the second device.

Based on the second aspect, in an optional implementation, the log chain includes multiple serially linked chain nodes, and an association relationship exists among these chain nodes.

Based on the second aspect, in an optional implementation, the storage address of the log chain is the storage address of the head node of the log chain.

Based on the second aspect, in an optional implementation, the first device obtaining the log chain of the incremental transaction according to the transaction table includes:

The first device determines the storage address of the log chain of the incremental transaction according to the transaction table;

The first device obtains the log chain of the incremental transaction according to the storage address of that log chain.

In a third aspect, an embodiment of this application provides a data processing device, including:

an obtaining unit, configured to obtain a transaction table that indicates the state of each transaction and the storage address of its log chain, where each transaction corresponds to one log chain, a log chain includes at least one chain node, and each chain node indicates the storage address of one tuple in the transaction;

the obtaining unit, further configured to obtain the log chain of a transaction to be recovered according to the transaction table, where a transaction in the undo state in the transaction table is a transaction to be recovered;

the obtaining unit, further configured to determine the storage addresses of the tuples of the transaction to be recovered from its log chain in order to obtain those tuples;

a deletion unit, configured to delete the log chain and the tuples of the transaction to be recovered.

Based on the third aspect, in an optional implementation, the log chain includes multiple serially linked chain nodes, and an association relationship exists among these chain nodes.

Based on the third aspect, in an optional implementation, the storage address of the log chain is the storage address of the head node of the log chain.

Based on the third aspect, in an optional implementation, the obtaining unit is specifically configured to:

determine, according to the transaction table, that a transaction in the undo state is a transaction to be recovered;

determine the storage address of the log chain of the transaction to be recovered according to the transaction table;

obtain the log chain of the transaction to be recovered according to the storage address of that log chain.

Based on the third aspect, in an optional implementation, the data processing device is applied to a primary/standby database system, the data processing device is the primary database, a second device is the standby database, and the transaction table further indicates the commit time of each transaction. The data processing device further includes a determining unit and a synchronization unit:

the determining unit, configured to determine incremental transactions according to the transaction table, where the commit time of an incremental transaction is later than the latest backup time of the primary/standby database system;

the obtaining unit, further configured to obtain the log chain of the incremental transaction according to the transaction table;

the obtaining unit, further configured to determine the storage addresses of the tuples of the incremental transaction from its log chain in order to obtain those tuples;

the synchronization unit, configured to synchronize the log chain and the tuples of the incremental transaction to the second device.

Based on the third aspect, in an optional implementation, the obtaining unit is specifically configured to:

determine the storage address of the log chain of the incremental transaction according to the transaction table;

obtain the log chain of the incremental transaction according to the storage address of that log chain.

In a fourth aspect, an embodiment of this application provides a data processing device. The data processing device is applied to a primary/standby database system, the data processing device is the primary database, and a second device is the standby database. The data processing device includes:

an obtaining unit, configured to obtain a transaction table that indicates the commit time of each transaction and the storage address of its log chain, where each transaction corresponds to one log chain, a log chain includes at least one chain node, and each chain node indicates the storage address of one tuple in the transaction;

a determining unit, configured to determine incremental transactions according to the transaction table, where the commit time of an incremental transaction is later than the time of the latest data synchronization of the primary/standby database system;

the obtaining unit, further configured to obtain the log chain of the incremental transaction according to the transaction table;

the determining unit, further configured to determine the storage addresses of the tuples of the incremental transaction from its log chain in order to obtain those tuples;

a synchronization unit, configured to synchronize the log chain and the tuples of the incremental transaction to the second device.

Based on the fourth aspect, in an optional implementation, the log chain includes multiple serially linked chain nodes, and an association relationship exists among these chain nodes.

Based on the fourth aspect, in an optional implementation, the storage address of the log chain is the storage address of the head node of the log chain.

Based on the fourth aspect, in an optional implementation, the obtaining unit is specifically configured to:

determine the storage address of the log chain of the incremental transaction according to the transaction table;

obtain the log chain of the incremental transaction according to the storage address of that log chain.

In a fifth aspect, an embodiment of the present invention provides a computer device, including a memory, a communication interface, and a processor coupled to the memory and the communication interface. The memory is configured to store instructions, the processor is configured to execute the instructions, and the communication interface is configured to communicate with other devices under the control of the processor. When the processor executes the instructions, it performs the data processing method described in any one of the above aspects.

In a sixth aspect, an embodiment of this application provides a computer-readable storage medium that stores a computer program. When the computer program runs on a computer, it causes the computer to perform the data processing method described in any one of the above aspects.

In a seventh aspect, an embodiment of this application provides a computer program product or computer program that includes computer instructions. When the computer instructions run on a computer, they cause the computer to perform the data processing method described in any one of the above aspects.

It can be seen from the above technical solutions that the embodiments of this application have the following advantages:

This application discloses a data processing method and related device. A first device obtains a transaction table that indicates the state of each transaction and the storage address of its log chain. Each transaction corresponds to one log chain, a log chain includes at least one chain node, and each chain node indicates the storage address of one tuple in the transaction. The first device obtains the log chain of a transaction to be recovered according to the transaction table, where a transaction in the undo state in the transaction table is a transaction to be recovered. The first device determines the storage addresses of the tuples of the transaction to be recovered from its log chain in order to obtain those tuples, and then deletes the log chain and the tuples of the transaction to be recovered. In this application, the transaction to be recovered can be found through the transaction table, and its tuples can be obtained through its log chain, so no full table scan of the first device is needed, which improves the efficiency of data recovery and data processing. In addition, because the log chain in this application does not itself carry the tuple information of the transaction, using the log chain avoids the overhead of additional write operations. The approach therefore also has a significant performance advantage over the write-ahead logging (WAL) mechanism, in which the log file carries the transaction's tuples.

Description of drawings

To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the following briefly introduces the drawings needed for describing the embodiments or the prior art. The drawings described below are merely embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of write-ahead logging and write-behind logging in a database system;

FIG. 2 is a schematic diagram of the architecture of a Zen-based system;

FIG. 3 is a schematic diagram of the logging mechanism provided by this application;

FIG. 4 is a schematic flowchart of the data processing method in this application;

FIG. 5 is a schematic diagram of the architecture of the primary/standby database system in this application;

FIG. 6 is a schematic structural diagram of the log chain in this application;

FIG. 7 is a schematic diagram of a data recovery scenario in an embodiment of this application;

FIG. 8 is a schematic diagram of the data synchronization flow in this application;

FIG. 9 is a schematic diagram of one data synchronization scenario of the primary/standby database system in an embodiment of this application;

FIG. 10 is a schematic diagram of another data synchronization scenario of the primary/standby database system in an embodiment of this application;

FIG. 11 is a schematic diagram of the log chain mechanism applied to the data recovery scenario and the data synchronization scenario in an embodiment of this application;

FIG. 12 is a schematic structural diagram of a data processing device provided in an embodiment of this application;

FIG. 13 is a schematic structural diagram of another data processing device provided in an embodiment of this application;

FIG. 14 is a schematic structural diagram of a computer device provided in an embodiment of this application.

Detailed description

The embodiments of this application provide a data processing method and related device for improving the efficiency of data processing.

The embodiments of the present invention are described below with reference to the drawings in the embodiments of the present invention. The terms used in the description of the embodiments are only intended to explain specific embodiments of the present invention and are not intended to limit the present invention. A person of ordinary skill in the art will understand that, as technology develops and new scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.

In this application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean that A exists alone, that both A and B exist, or that B exists alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" or a similar expression refers to any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.

The terms "first", "second", "third", "fourth", and so on (if any) in the description, the claims, and the above drawings of the present invention are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the invention described here can, for example, be practiced in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.

Database security involves many aspects. Loss of or tampering with the data in a database can cause immeasurable damage, so database security is particularly important. At present, the data in a database is mainly analyzed through the database log, and corresponding measures such as data recovery or data synchronization are then taken based on the analysis results.

Several common logging mechanisms are introduced first.

A: Write-ahead logging (WAL) is a common technique in database systems for guaranteeing the atomicity and durability of data operations. In computer science, WAL is a family of techniques used in relational database systems to provide atomicity and durability. In a database system that uses WAL, every modification is written to the log before it is committed.

The log file usually includes redo information and undo information. The purpose can be illustrated by an example. Suppose the machine loses power while a program is performing some operation. On restart, the program may need to know whether that operation succeeded, partially succeeded, or failed. If WAL is used, the program can examine the log file and compare the operations that were planned at the moment of power loss with the operations that were actually performed. Based on this comparison, the program can decide whether to undo what was done, to complete what was started, or to leave things as they are.

However, one problem with WAL is that the log file contains both log information and data information, so the same data is stored once in the database file and once in the log file. The data is written twice, which causes write amplification. In particular, in a primary/standby database system, the WAL log incurs a large write overhead when the primary database device performs data synchronization.

B: Write-behind logging (WBL) is a new logging and recovery protocol designed for non-volatile memory (NVM) databases. NVM is byte-addressable, offers high performance close to that of main memory, shows little gap between sequential and random access, and retains its stored data after power is turned off.

Refer to FIG. 1, which is a schematic diagram of write-ahead logging and write-behind logging in a database system. As shown in FIG. 1, in the WAL mechanism, each transaction's modifications to the database are first written sequentially to the WAL log. Before a transaction's data is persisted, the corresponding log must be persisted, and both the new value and the old value are recorded, so that only the log needs to be persisted, by sequential writes, on the critical path of the commit. The WBL mechanism differs from WAL: its key idea is to record which parts of the database have changed rather than how the data changed. With WBL, the transaction's data is persisted before the log is persisted, and the log is used to track the data modified by transactions so that uncommitted transactions can be undone.

C: Zen is a high-throughput, log-free online transaction processing (OLTP) engine for NVM. Zen's log-free strategy can be regarded as one implementation of WBL. In the Zen mechanism, log information is written into every tuple of a transaction. Refer to FIG. 2, which is a schematic diagram of the Zen-based system architecture. As shown in FIG. 2, when a transaction commits, its tuples in memory are written to NVM, and the (dirty-bit, LP) value of the last tuple written by the transaction is updated to indicate that the transaction's data has been fully persisted. During database recovery, the entire table is scanned and all tuples are checked. If none of the tuples of a transaction has its LP bit set to 1, the transaction was not successfully persisted. While scanning each region, Zen uses a ts-commit variable to hold the largest timestamp encountered so far, and updates it whenever it encounters a tuple whose LP value is 1. During the scan, if a tuple's timestamp is ≤ ts-commit, Zen considers the associated transaction committed. If a tuple's timestamp is > ts-commit, the tuple is placed in a queue awaiting a second scan. After the whole region has been scanned and the maximum ts-commit has been obtained, the queue awaiting the second scan is processed.
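The scan described above can be illustrated with a simplified sketch. It is not Zen's actual code: the `Row` type, the `scan_region` function, and the choice of 0 as the initial ts-commit are assumptions made only to show why tuples seen before the maximum ts-commit is known need a second pass.

```python
from dataclasses import dataclass

@dataclass
class Row:
    ts: int    # timestamp of the transaction that wrote this tuple
    lp: bool   # True when the LP value of this tuple is 1

def scan_region(rows):
    """Hypothetical two-pass scan over one region, following the text above."""
    ts_commit = 0                      # assumed initial value
    committed, waiting = [], []
    for row in rows:                   # first pass
        if row.lp:
            ts_commit = max(ts_commit, row.ts)
        if row.ts <= ts_commit:
            committed.append(row)
        else:
            waiting.append(row)        # decide later, once ts_commit is final
    uncommitted = []
    for row in waiting:                # second pass over the waiting queue
        (committed if row.ts <= ts_commit else uncommitted).append(row)
    return committed, uncommitted

region = [Row(5, False), Row(3, True), Row(5, True), Row(9, False)]
committed, uncommitted = scan_region(region)
```

In this example the first tuple (timestamp 5, LP not set) cannot be classified when it is first seen, because ts-commit only reaches 5 later in the scan. That is the case the second pass exists for.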

In summary, each of the three logging mechanisms above has drawbacks. With WAL, the log file contains both log information and data information, and the same data is stored once in the database file and once in the log file, so write operations incur additional, duplicated overhead. With WBL, after data recovery, every access must check whether the accessed data is dirty, which also adds overhead. With Zen, data recovery requires scanning the data of all tables in the database twice to identify the transactions that did not finish committing, so that their data can be rolled back. Because a database stores a large amount of data, scanning all tables twice incurs a large system overhead, and data recovery takes too long.

In view of this, an embodiment of this application provides a logging mechanism, applied to the data processing method and related device of this application, to improve the efficiency of data recovery. Refer to FIG. 3, which is a schematic diagram of the logging mechanism provided by this application. As shown in FIG. 3, in the Zen mechanism, log information is written into every tuple of a transaction. In the logging mechanism provided by this application, log information is expressed as a "log chain": each transaction corresponds to one log chain, and the tuples of a transaction are not merged with its log chain. In other words, the tuples and the log chain of a transaction are stored at different storage addresses. Specifically, each log chain includes at least one chain node, and each tuple of a transaction has one corresponding chain node. That is, in a transaction's log chain, each chain node indicates the storage address of its corresponding tuple. For example, in FIG. 3, transaction 1 has three tuples (data 1, data 2, and data 3), and the log chain of transaction 1 includes three chain nodes (chain node 1, chain node 2, and chain node 3). Chain node 1 indicates that data 1 is stored at position 1 in table 1, chain node 2 indicates that data 2 is stored at position 2 in table 1, and chain node 3 indicates that data 3 is stored at position 1 in table 2.
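The FIG. 3 example can be sketched as a singly linked list. This is only an illustration: the `ChainNode` type, its field names, and the helper functions are hypothetical, and the application does not prescribe a concrete node layout.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChainNode:
    table: str                          # table that holds the tuple
    position: int                       # position of the tuple in that table
    next: Optional["ChainNode"] = None  # next node of the same log chain

def build_chain(locations):
    """Build a log chain from (table, position) pairs, first pair at the head."""
    head = None
    for table, position in reversed(locations):
        head = ChainNode(table, position, head)
    return head

def tuple_addresses(head):
    """Follow the chain from its head and list every tuple location it indicates."""
    addresses = []
    node = head
    while node is not None:
        addresses.append((node.table, node.position))
        node = node.next
    return addresses

# Transaction 1 of FIG. 3: data 1 and data 2 in table 1, data 3 in table 2.
tx1_chain = build_chain([("table1", 1), ("table1", 2), ("table2", 1)])
```

The nodes hold only locations, not tuple contents, which mirrors the statement that tuples and log chains are stored at different addresses.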

Note that this application does not limit where log chains are stored. On the one hand, a separate table may be configured to store all log chains in the database centrally, which makes it easier to manage the log chains of all transactions in one place. On the other hand, a log chain may be stored in the tables in which the tuples reside. This is not specifically limited here.

The data processing method of this application can be applied to the data recovery process of a database, and also to the data synchronization process in a primary/standby database system. The data recovery process is introduced first. Refer to FIG. 4, which is a schematic flowchart of the data processing method in this application. As shown in FIG. 4, the data processing method in this embodiment of the application includes:

101. The first device obtains a transaction table.

NVM has properties such as a 256 B read/write granularity, asymmetric read and write performance, and a specific optimal degree of concurrency. To implement the data processing method of this application more effectively, the method is preferably applied to an NVM database.

Note that the first device in this application may be an independent physical server database; a server cluster database composed of multiple physical servers, a distributed system database, or the primary database in a primary/standby database system; or a cloud server database that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) services, big data, or artificial intelligence platforms. This is not specifically limited here.

Take the application of the data processing method to a primary/standby database system as an example. Refer to FIG. 5, which is a schematic diagram of the architecture of the primary/standby database system in this application. As shown in FIG. 5, the primary database (equivalent to an independent physical server database) includes a free space management module, a transaction table, log chains, and table data. In the logging mechanism provided by this application, log information is expressed as a "log chain": each transaction corresponds to one log chain, and the tuples and the log chain of a transaction are stored at different storage addresses rather than merged together. Each log chain includes at least one chain node, each tuple of a transaction has one corresponding chain node, and each chain node indicates the storage address of its corresponding tuple. When a transaction has multiple tuples, its log chain contains the same number of chain nodes as the transaction has tuples, and the chain nodes of one log chain are associated serially. Therefore, once any chain node of a log chain is obtained, the other chain nodes of the same log chain that are associated with it can also be traced, which improves the efficiency of log chain management.

In the first device, when a transaction is created, a tuple is created to record the transaction's basic information (data). When the transaction commits, the chain nodes of the transaction's log chain, as well as the tuples, must first be written. Once all chain nodes of the log chain and all tuples of the transaction have been persisted, the commit of the transaction is complete.

Because the database (the first device) stores a very large number of transactions, this application configures a transaction table for the log chains of these transactions so that they can be managed and queried centrally. The transaction table stores the state of every transaction in the first device and the storage address of the log chain of each transaction. By reading a transaction's state from the transaction table, the first device can tell whether the transaction has been committed, whether it needs to be rolled back, or whether it needs to be synchronized. By reading the storage address of a transaction's log chain, the first device can tell where the log chain is stored. The transaction table may further indicate the commit time of each transaction, among other things. Because a log chain includes several chain nodes, in this application the storage address of only the head node of each log chain may be used as the storage address of that log chain. The chain nodes of one log chain are associated serially, so once the storage address of the head node is obtained, all associated chain nodes of the whole log chain can be obtained. The transaction table therefore does not need to record the storage address of every chain node, which saves storage resources and improves the efficiency of log chain management.

As described above, the number of transactions stored in the database (the first device) is very large, and each transaction corresponds to one log chain. Therefore, in this application, the log chains of these transactions may be further linked together, with the head node of each log chain connected to the tail of the preceding log chain, so that the log chains stored in the first device are arranged clearly and are easy to manage. For ease of understanding, refer to FIG. 6, which is a schematic structural diagram of the log chains in this application. As shown in FIG. 6, the transaction table of the first device lists two transactions, transaction 0x001 and transaction 0x002, both of which have been committed (are in the committed state). The storage address of the head node of transaction 0x001 is 0x001, and the storage address of the head node of transaction 0x002 is 0x002. Using the head node storage addresses indicated by the transaction table, the head node of the log chain of transaction 0x001 and the head node of the log chain of transaction 0x002 can be obtained. The other chain nodes serially associated with each head node can then be obtained through that head node, which yields the complete log chain, and the chain nodes of the log chain identify the storage location of every tuple of the transaction. In FIG. 6, the log chain of transaction 0x001 indicates that the tuples of that transaction are stored at positions 1a and 1b in table A, position 2a in table B, and position 3a in table C, and the log chain of transaction 0x002 indicates that the tuples of that transaction are stored at position 1c in table A and position 2b in table B.
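The lookup path in FIG. 6 (transaction table, then head node, then the remaining chain nodes) can be sketched as follows. The dictionaries stand in for persistent storage, and the addresses of the non-head nodes (0x101 and so on) are hypothetical values chosen for illustration, since the figure description gives only the head addresses. The link between the tail of one chain and the head of the next is left out for brevity.

```python
# Simulated node storage: address -> (table, position, address of next node or None).
NODE_STORE = {
    0x001: ("A", "1a", 0x101),   # head node of transaction 0x001
    0x101: ("A", "1b", 0x102),   # hypothetical address
    0x102: ("B", "2a", 0x103),   # hypothetical address
    0x103: ("C", "3a", None),    # hypothetical address
    0x002: ("A", "1c", 0x201),   # head node of transaction 0x002
    0x201: ("B", "2b", None),    # hypothetical address
}

# The transaction table records only the state and the head node address.
TRANSACTION_TABLE = {
    0x001: {"state": "committed", "head": 0x001},
    0x002: {"state": "committed", "head": 0x002},
}

def locate_tuples(tx_id):
    """Return the (table, position) of every tuple of one transaction."""
    addr = TRANSACTION_TABLE[tx_id]["head"]
    locations = []
    while addr is not None:
        table, position, addr = NODE_STORE[addr]
        locations.append((table, position))
    return locations
```

Because only the head address is kept per transaction, the transaction table stays the same size regardless of how many tuples a transaction wrote.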

The free space management module manages these log chains. After a transaction is rolled back, its log chain is deleted: the head node of that log chain is moved to the tail of the free chain in which the log chain resides, which completes the reclamation of the free space. The table data stores the tuples of transactions.

102. The first device obtains the log chain of a transaction to be recovered according to the transaction table.

When the first device crashes, the tuples of a transaction that was being persisted often cannot all be persisted. To guarantee the atomicity and durability of transactions, such interrupted transactions, which failed to persist all of their tuples, must be rolled back. These are undo transactions. Because the transaction table records the state of every transaction, the first device, after obtaining the transaction table, can select the transactions in the undo state as the transactions to be recovered. Because the transaction table also records the storage address of each transaction's log chain, the first device can look up the storage addresses of the log chains of the transactions to be recovered and then, using those addresses, obtain the log chains of the transactions to be recovered.

103. The first device determines the storage addresses of the tuples of the transaction to be recovered according to the log chain of that transaction, so as to obtain the tuples of the transaction to be recovered.

In this application, each chain node in a log chain indicates the storage address of one tuple of the transaction. Therefore, after obtaining the log chain of the transaction to be recovered, the first device obtains all tuples of that transaction by following the indications of all chain nodes in the log chain.

104. The first device deletes the log chain and the tuples of the transaction to be recovered.

Once the first device has deleted the log chain and the tuples of the transaction to be recovered, this data recovery process is complete.

For ease of understanding, refer to FIG. 7, which is a schematic diagram of a data recovery scenario in an embodiment of this application. As shown in FIG. 7, in the post-crash recovery scenario, the transaction table indicates that the state of transaction 0x004 is "active", which marks the transaction as being in the undo state. According to the log chain of transaction 0x004, the tuple at position 1d in table 1 was not persisted. Therefore, the log chain and all tuples of the transaction must be deleted to complete the rollback.
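Steps 101 to 104 can be sketched over in-memory dictionaries as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function name `recover`, the dictionary layout, the sample addresses, and the choice to remove the rolled-back entry from the transaction table are all hypothetical.

```python
def recover(transaction_table, node_store, tables):
    """Roll back every transaction whose state marks it as undo ("active")."""
    rolled_back = []
    for tx_id, entry in list(transaction_table.items()):   # step 101: read the table
        if entry["state"] != "active":
            continue
        addr = entry["head"]                                # step 102: find the log chain
        while addr is not None:
            table, position, next_addr = node_store.pop(addr)  # step 104: drop the node
            tables[table].pop(position, None)               # steps 103/104: drop the tuple if present
            addr = next_addr
        del transaction_table[tx_id]   # assumption: the entry is removed after rollback
        rolled_back.append(tx_id)
    return rolled_back

# Sample state loosely modeled on FIG. 7; all addresses are made up.
transaction_table = {
    0x003: {"state": "committed", "head": 0x300},
    0x004: {"state": "active", "head": 0x400},
}
node_store = {
    0x300: ("table1", "1c", None),
    0x400: ("table1", "1d", 0x401),
    0x401: ("table2", "2d", None),
}
# The tuple at position 1d of table1 was never persisted before the crash.
tables = {"table1": {"1c": "kept"}, "table2": {"2d": "partial"}}

result = recover(transaction_table, node_store, tables)
```

Only the transaction table and the chains of undo transactions are visited, so the cost of recovery depends on the number of interrupted transactions rather than on the size of the tables.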

In this application, the transaction to be recovered can be found through the transaction table, and its tuples can be obtained according to its log chain. No full table scan of the first device is needed, which improves the efficiency of data recovery and data processing. In addition, because the log chain in this application does not itself carry the tuple information of the transaction, using the log chain does not add extra write overhead. Compared with the WAL mechanism, in which the log file carries the transaction's tuples, this also gives a significant performance advantage.

The data synchronization process in this application is introduced next. In a primary/standby database system, the primary database device handles both read and write operations, whereas the standby database device handles only read operations. Therefore, after the primary database device completes the write operations of a transaction, the data must be synchronized to the standby database device. The log chain mechanism in this application is equally applicable to the data synchronization process of a primary/standby database system. The following description takes the first device as the primary database device and the second device as the standby database device. Refer to FIG. 8, which is a schematic diagram of the data synchronization process in the data processing method of this application. As shown in FIG. 8, the data synchronization method in this embodiment of the application includes:

201. The first device obtains a transaction table.

Step 201 is similar to step 101 shown in FIG. 4 and is not described again here. Note that, to make it easier to determine incremental transactions later, the transaction table should record the commit time of each transaction in the data synchronization process.

202. The first device determines incremental transactions according to the transaction table.

In practice, a primary/standby database system generally needs to perform data synchronization periodically, that is, to periodically synchronize the tuples of newly added transactions in the first device to the standby database. Transactions added after the most recent data synchronization are the incremental transactions of this application, and they have not yet been synchronized. Because the transaction table records the commit time of each transaction, the first device can, according to the transaction table, determine the transactions whose commit time is later than the most recent data synchronization time of the primary/standby database system to be incremental transactions.
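Selecting incremental transactions reduces to a filter on the commit time recorded in the transaction table. The sketch below is illustrative only: the function name, the `commit_time` field, and the use of integer timestamps are assumptions, and transactions without a commit time are treated as not yet committed and skipped.

```python
def incremental_transactions(transaction_table, last_sync_time):
    """Return the ids of transactions committed after the most recent sync."""
    return sorted(
        tx_id
        for tx_id, entry in transaction_table.items()
        # Assumption: commit_time is None while a transaction is uncommitted.
        if entry.get("commit_time") is not None
        and entry["commit_time"] > last_sync_time
    )

sample_table = {
    0x001: {"commit_time": 100},
    0x002: {"commit_time": 250},
    0x003: {"commit_time": None},   # still in progress
}
pending_sync = incremental_transactions(sample_table, last_sync_time=200)
```

Here only transaction 0x002 committed after the last synchronization at time 200, so it is the only incremental transaction.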

203. The first device obtains the log chains of the incremental transactions according to the transaction table.

Step 203 is similar to step 102 shown in FIG. 4 and is not described again here.

204. The first device determines the storage addresses of the tuples of the incremental transactions according to the log chains of the incremental transactions, so as to obtain the tuples of the incremental transactions.

Step 204 is similar to step 103 shown in FIG. 4 and is not described again here.

205. The first device synchronizes the log chains and tuples of the incremental transactions to the second device.

Refer to FIG. 9, which is a schematic diagram of one data synchronization scenario of the primary/standby database system in an embodiment of this application. As shown in FIG. 9, the first device sends the log chains and tuples of the incremental transactions to the second device, and the second device persists them locally, which completes the data synchronization process.

Further, refer to FIG. 10, which is a schematic diagram of another data synchronization scenario of the primary/standby database system in an embodiment of this application. As shown in FIG. 10, this application does not limit the number of database devices that take part in data synchronization. The primary database device may synchronize the tuples and log chains of incremental transactions to multiple standby database devices, that is, the second device may refer to multiple standby database devices.
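Step 205 and the multi-standby case can be sketched as a collect step on the primary and an apply step on each standby. Everything here is hypothetical: the function names, the payload layout, and in particular the assumption that a standby stores chain nodes and tuples at the same addresses and positions as the primary.

```python
def collect_increment(tx_ids, transaction_table, node_store, tables):
    """On the primary: gather the log chain and tuples of each incremental transaction."""
    payload = []
    for tx_id in tx_ids:
        addr = transaction_table[tx_id]["head"]
        chain, rows = [], []
        while addr is not None:
            table, position, next_addr = node_store[addr]
            chain.append((addr, table, position, next_addr))
            rows.append((table, position, tables[table][position]))
            addr = next_addr
        payload.append({"tx": tx_id, "chain": chain, "rows": rows})
    return payload

def apply_increment(payload, standby):
    """On a standby: store the received chain nodes and tuples locally."""
    for item in payload:
        for addr, table, position, next_addr in item["chain"]:
            standby["node_store"][addr] = (table, position, next_addr)
        for table, position, value in item["rows"]:
            standby["tables"].setdefault(table, {})[position] = value

primary_table = {0x002: {"head": 0x002}}
primary_nodes = {0x002: ("A", "1c", 0x201), 0x201: ("B", "2b", None)}
primary_data = {"A": {"1c": "row-1c"}, "B": {"2b": "row-2b"}}

standbys = [{"node_store": {}, "tables": {}} for _ in range(2)]
increment = collect_increment([0x002], primary_table, primary_nodes, primary_data)
for standby in standbys:
    apply_increment(increment, standby)
```

Each tuple appears once in the payload, next to chain nodes that carry only locations, which reflects the point that the log chain does not duplicate tuple contents.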

In the WAL mechanism, the log file includes the tuples of the transaction, so persisting both the log file and the tuples during data synchronization amounts to writing the tuples twice, and the overhead of data synchronization is very high. In the WBL mechanism, a WAL log is regenerated for synchronization by scanning the entire table. Because the locations at which transactions write data are random, building the WAL log requires a full table scan, which is costly and performs poorly. In the Zen mechanism, a log is additionally organized in memory. This has almost no impact on performance, but after the database crashes the in-memory log is lost, and resynchronizing the primary and standby machines is very expensive. In this application, incremental transactions can be found through the transaction table, and their tuples can be obtained according to their log chains. No full table scan of the first device is needed, which improves the efficiency of data synchronization and data processing. In addition, because the log chain in this application does not itself carry the tuple information of the transaction, persisting the log chain on the standby database does not add extra write overhead. Compared with the WAL mechanism, in which the log file carries the transaction's tuples, this also gives a significant performance advantage.

In practice, the data recovery method and the data synchronization method above may be used independently of each other or together. That is, a database device may use only steps 101 to 104 shown in FIG. 4 for data recovery, may use only steps 201 to 205 shown in FIG. 8 for data synchronization, or may integrate both methods in the database system. For ease of understanding, refer to FIG. 11, which is a schematic diagram of the log chain mechanism applied to the data recovery scenario and the data synchronization scenario in an embodiment of this application. As shown in FIG. 11, the two scenarios can share the same transaction table and log chains, that is, the transaction table and the log chains of transactions can be used both for data recovery and for data synchronization.

To better implement the above solutions of the embodiments of this application, related devices for implementing them are provided below. Refer to FIG. 12, which is a schematic structural diagram of a data processing apparatus provided by an embodiment of this application. The data processing apparatus includes:

an obtaining unit 301, configured to obtain a transaction table, where the transaction table indicates the states of transactions and the storage addresses of log chains, each transaction corresponds to one log chain, the log chain includes at least one chain node, and each chain node indicates the storage address of one tuple of the transaction;

the obtaining unit 301 is further configured to obtain the log chain of a transaction to be recovered according to the transaction table, where a transaction in the undo state in the transaction table is a transaction to be recovered;

the obtaining unit 301 is further configured to determine the storage addresses of the tuples of the transaction to be recovered according to the log chain of the transaction to be recovered, so as to obtain the tuples of the transaction to be recovered; and

a deleting unit 302, configured to delete the log chain and the tuples of the transaction to be recovered.

In a possible design, the log chain includes multiple serial chain nodes, and an association exists among the multiple serial chain nodes.

In a possible design, the storage address of the log chain is the storage address of the head node of the log chain.

In a possible design, the obtaining unit 301 is specifically configured to:

determine, according to the transaction table, that a transaction in the undo state is a transaction to be recovered;

determine the storage address of the log chain of the transaction to be recovered according to the transaction table; and

obtain the log chain of the transaction to be recovered according to the storage address of that log chain.
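The recovery flow performed by the obtaining unit and the deletion unit can be sketched as follows. This is an illustration under stated assumptions rather than the implementation in the application: plain dictionaries stand in for storage, a chain node is assumed to be a `(tuple_addr, next_addr)` pair, and the function name `recover` and the `"undo"`/`"committed"` labels are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def recover(txn_table: Dict[int, dict],
            chain_nodes: Dict[int, Tuple[int, Optional[int]]],
            tuple_store: Dict[int, str]) -> List[int]:
    """Roll back every transaction in the undo state; return their ids."""
    rolled_back = []
    for txn_id, entry in list(txn_table.items()):
        # Step 1: a transaction in the undo state is a transaction to be recovered.
        if entry["state"] != "undo":
            continue
        # Step 2: the transaction table gives the address of the chain head.
        addr = entry["chain_head"]
        # Step 3: walk the chain; each node gives one tuple address.
        while addr is not None:
            tuple_addr, next_addr = chain_nodes.pop(addr)  # delete the chain node
            tuple_store.pop(tuple_addr, None)              # delete the tuple
            addr = next_addr
        entry["chain_head"] = None
        rolled_back.append(txn_id)
    return rolled_back


# Hypothetical state: transaction 1 is committed, transaction 2 must be undone.
txn_table = {
    1: {"state": "committed", "chain_head": 10},
    2: {"state": "undo", "chain_head": 20},
}
chain_nodes = {10: (500, None), 20: (600, 21), 21: (608, None)}
tuple_store = {500: "row-a", 600: "row-b", 608: "row-c"}

rolled = recover(txn_table, chain_nodes, tuple_store)
```

What happens to the transaction table entry itself after the rollback is not stated in this passage; the sketch simply clears its chain head address.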

In a possible design, the data processing device is applied to a primary/standby database system, the data processing device is the primary database, a second device is the standby database, the transaction table further indicates the commit time of each transaction, and the data processing device further includes a determining unit 303 and a synchronization unit 304, where:

the determining unit 303 is configured to determine an incremental transaction according to the transaction table, where the commit time of the incremental transaction is later than the latest backup time of the primary/standby database system;

the obtaining unit 301 is further configured to obtain the log chain of the incremental transaction according to the transaction table;

the obtaining unit 301 is further configured to determine the storage address of a tuple of the incremental transaction according to the log chain of the incremental transaction, so as to obtain the tuple of the incremental transaction; and

the synchronization unit 304 is configured to synchronize the log chain and the tuple of the incremental transaction to the second device.

In a possible design, the obtaining unit 301 is specifically configured to:

determine the storage address of the log chain of the incremental transaction according to the transaction table; and

obtain the log chain of the incremental transaction according to the storage address of that log chain.
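The synchronization path can be sketched in the same style. The code below selects transactions whose commit time is later than the latest backup time, walks each log chain, and gathers the chain nodes and tuples that would be sent to the standby. The payload layout, the function name `collect_incremental`, the numeric timestamps, and the use of `None` for an uncommitted transaction are assumptions made for illustration; the actual transfer to the second device is left out.

```python
from typing import Dict, List, Optional, Tuple


def collect_incremental(txn_table: Dict[int, dict],
                        chain_nodes: Dict[int, Tuple[int, Optional[int]]],
                        tuple_store: Dict[int, str],
                        last_backup_time: int) -> List[dict]:
    """Gather log chains and tuples of transactions committed after the last backup."""
    payload = []
    for txn_id, entry in txn_table.items():
        commit_time = entry["commit_time"]
        # An incremental transaction has a commit time after the latest backup.
        if commit_time is None or commit_time <= last_backup_time:
            continue
        chain, tuples = [], {}
        addr = entry["chain_head"]          # chain head address from the table
        while addr is not None:
            tuple_addr, next_addr = chain_nodes[addr]
            chain.append((addr, tuple_addr, next_addr))
            tuples[tuple_addr] = tuple_store[tuple_addr]
            addr = next_addr
        payload.append({"txn_id": txn_id, "chain": chain, "tuples": tuples})
    return payload


# Hypothetical state on the primary; the latest backup happened at time 200.
txn_table = {
    1: {"commit_time": 100, "chain_head": 10},
    2: {"commit_time": 250, "chain_head": 20},
    3: {"commit_time": None, "chain_head": 30},   # not yet committed
}
chain_nodes = {10: (500, None), 20: (600, 21), 21: (608, None), 30: (700, None)}
tuple_store = {500: "row-a", 600: "row-b", 608: "row-c", 700: "row-d"}

payload = collect_incremental(txn_table, chain_nodes, tuple_store,
                              last_backup_time=200)
```

Because the selection uses only the transaction table and the log chains, the primary does not need to scan the tuple store to find what changed since the last backup.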

Refer to FIG. 13, which is a schematic structural diagram of another data processing device provided by an embodiment of this application. The data processing device is applied to a primary/standby database system, the data processing device is the primary database, and a second device is the standby database. The data processing device includes:

an obtaining unit 401, configured to obtain a transaction table, where the transaction table indicates the commit time of each transaction and the storage address of its log chain, each transaction corresponds to one log chain, the log chain includes at least one chain node, and each chain node indicates the storage address of one tuple in the transaction;

a determining unit 402, configured to determine an incremental transaction according to the transaction table, where the commit time of the incremental transaction is later than the time of the latest data synchronization of the primary/standby database system;

the obtaining unit 401, further configured to obtain the log chain of the incremental transaction according to the transaction table;

the determining unit 402, further configured to determine the storage address of a tuple of the incremental transaction according to the log chain of the incremental transaction, so as to obtain the tuple of the incremental transaction; and

a synchronization unit 403, configured to synchronize the log chain and the tuple of the incremental transaction to the second device.

Based on the fourth aspect, in an optional implementation, the log chain includes multiple serial chain nodes, and an association relationship exists among the multiple serial chain nodes.

Based on the fourth aspect, in an optional implementation, the storage address of the log chain is the storage address of the head node of the log chain.

Based on the fourth aspect, in an optional implementation, the obtaining unit 401 is specifically configured to:

determine the storage address of the log chain of the incremental transaction according to the transaction table; and

obtain the log chain of the incremental transaction according to the storage address of that log chain.

An embodiment of this application further provides a computer device. Refer to FIG. 14, which is a schematic structural diagram of the computer device provided by an embodiment of this application. The data processing device described in the embodiment corresponding to FIG. 12 or FIG. 13 may be deployed on the computer device to implement the method of the embodiment corresponding to FIG. 4 or FIG. 8. Specifically, the computer device is implemented by one or more servers. Computer devices may differ considerably depending on configuration or performance, and may include one or more central processing units (CPU) 522 (for example, one or more processors), a memory 532, and one or more storage media 530 (for example, one or more mass storage devices) that store application programs 542 or data 544. The memory 532 and the storage medium 530 may provide transient or persistent storage. The program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the computer device. Further, the central processing unit 522 may be configured to communicate with the storage medium 530 and to execute, on the computer device 500, the series of instruction operations in the storage medium 530.

The computer device may further include one or more power supplies 526, one or more wired or wireless network interfaces 550, one or more input/output interfaces 558, and/or one or more operating systems 541, such as Windows Server™, Mac OS X™, Unix™, Linux™, or FreeBSD™.

An embodiment of this application further provides a computer program product which, when run on a computer, causes the computer to execute the method described in the embodiment shown in FIG. 4 or FIG. 8.

An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium stores a program for signal processing which, when run on a computer, causes the computer to execute the method described in the embodiment shown in FIG. 4 or FIG. 8.

The image processing device provided by the embodiments of this application may specifically be a chip. The chip includes a processing unit and a communication unit; the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit can execute computer-executable instructions stored in a storage unit, so that the chip executes the method described in the embodiment shown in FIG. 4 or FIG. 8. Optionally, the storage unit is a storage unit inside the chip, such as a register or a cache. The storage unit may also be a storage unit located outside the chip within the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).

It should also be noted that the device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiments provided by this application, a connection between modules indicates that they have a communication connection, which may be implemented as one or more communication buses or signal lines.

From the description of the above implementations, those skilled in the art can clearly understand that this application may be implemented by software plus the necessary general-purpose hardware, or by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, and dedicated components. In general, any function performed by a computer program can readily be implemented by corresponding hardware, and the specific hardware structure used to implement a given function may take many forms, such as analog circuits, digital circuits, or dedicated circuits. For this application, however, a software implementation is the better choice in most cases. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a training device, a network device, or the like) to execute the methods described in the embodiments of this application.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, they may be implemented in whole or in part in the form of a computer program product.

The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, wireless, or microwave). The computer-readable storage medium may be any usable medium on which a computer can store data, or a data storage device, such as a training device or data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Claims (23)

1. A data processing method, comprising:

obtaining, by a first device, a transaction table, wherein the transaction table indicates the state of each transaction and the storage address of its log chain, each transaction corresponds to one log chain, the log chain comprises at least one chain node, and each chain node indicates the storage address of one tuple in the transaction;

obtaining, by the first device, the log chain of a transaction to be recovered according to the transaction table, wherein a transaction in the undo state in the transaction table is the transaction to be recovered;

determining, by the first device, the storage address of a tuple of the transaction to be recovered according to the log chain of the transaction to be recovered, so as to obtain the tuple of the transaction to be recovered; and

deleting, by the first device, the log chain and the tuple of the transaction to be recovered.

2. The method according to claim 1, wherein the log chain comprises a plurality of serial chain nodes, and an association relationship exists among the plurality of serial chain nodes.

3. The method according to claim 2, wherein the storage address of the log chain is the storage address of the head node of the log chain.

4. The method according to claim 1, 2 or 3, wherein the obtaining, by the first device, the log chain of the transaction to be recovered according to the transaction table comprises:

determining, by the first device according to the transaction table, that a transaction in the undo state is the transaction to be recovered;

determining, by the first device, the storage address of the log chain of the transaction to be recovered according to the transaction table; and

obtaining, by the first device, the log chain of the transaction to be recovered according to the storage address of the log chain of the transaction to be recovered.

5. The method according to any one of claims 1 to 4, wherein the method is applied to a primary/standby database system, the first device is the primary database, a second device is the standby database, the transaction table further indicates the commit time of each transaction, and the method further comprises:

determining, by the first device, an incremental transaction according to the transaction table, wherein the commit time of the incremental transaction is later than the latest backup time of the primary/standby database system;

obtaining, by the first device, the log chain of the incremental transaction according to the transaction table;

determining, by the first device, the storage address of a tuple of the incremental transaction according to the log chain of the incremental transaction, so as to obtain the tuple of the incremental transaction; and

synchronizing, by the first device, the log chain and the tuple of the incremental transaction to the second device.

6. The method according to claim 5, wherein the obtaining, by the first device, the log chain of the incremental transaction according to the transaction table comprises:

determining, by the first device, the storage address of the log chain of the incremental transaction according to the transaction table; and

obtaining, by the first device, the log chain of the incremental transaction according to the storage address of the log chain of the incremental transaction.

7. A data processing method, wherein the method is applied to a primary/standby database system, a first device is the primary database, a second device is the standby database, and the method comprises:

obtaining, by the first device, a transaction table, wherein the transaction table indicates the commit time of each transaction and the storage address of its log chain, each transaction corresponds to one log chain, the log chain comprises at least one chain node, and each chain node indicates the storage address of one tuple in the transaction;

determining, by the first device, an incremental transaction according to the transaction table, wherein the commit time of the incremental transaction is later than the time of the latest data synchronization of the primary/standby database system;

obtaining, by the first device, the log chain of the incremental transaction according to the transaction table;

determining, by the first device, the storage address of a tuple of the incremental transaction according to the log chain of the incremental transaction, so as to obtain the tuple of the incremental transaction; and

synchronizing, by the first device, the log chain and the tuple of the incremental transaction to the second device.

8. The method according to claim 7, wherein the log chain comprises a plurality of serial chain nodes, and an association relationship exists among the plurality of serial chain nodes.

9. The method according to claim 8, wherein the storage address of the log chain is the storage address of the head node of the log chain.

10. The method according to claim 7, 8 or 9, wherein the obtaining, by the first device, the log chain of the incremental transaction according to the transaction table comprises:

determining, by the first device, the storage address of the log chain of the incremental transaction according to the transaction table; and

obtaining, by the first device, the log chain of the incremental transaction according to the storage address of the log chain of the incremental transaction.

11. A data processing device, comprising:

an obtaining unit, configured to obtain a transaction table, wherein the transaction table indicates the state of each transaction and the storage address of its log chain, each transaction corresponds to one log chain, the log chain comprises at least one chain node, and each chain node indicates the storage address of one tuple in the transaction;

the obtaining unit, further configured to obtain the log chain of a transaction to be recovered according to the transaction table, wherein a transaction in the undo state in the transaction table is the transaction to be recovered;

the obtaining unit, further configured to determine the storage address of a tuple of the transaction to be recovered according to the log chain of the transaction to be recovered, so as to obtain the tuple of the transaction to be recovered; and

a deletion unit, configured to delete the log chain and the tuple of the transaction to be recovered.

12. The data processing device according to claim 11, wherein the log chain comprises a plurality of serial chain nodes, and an association relationship exists among the plurality of serial chain nodes.

13. The data processing device according to claim 12, wherein the storage address of the log chain is the storage address of the head node of the log chain.

14. The data processing device according to claim 11, 12 or 13, wherein the obtaining unit is specifically configured to:

determine, according to the transaction table, that a transaction in the undo state is the transaction to be recovered;

determine the storage address of the log chain of the transaction to be recovered according to the transaction table; and

obtain the log chain of the transaction to be recovered according to the storage address of the log chain of the transaction to be recovered.

15. The data processing device according to any one of claims 11 to 14, wherein the data processing device is applied to a primary/standby database system, the data processing device is the primary database, a second device is the standby database, the transaction table further indicates the commit time of each transaction, and the data processing device further comprises a determining unit and a synchronization unit, wherein:

the determining unit is configured to determine an incremental transaction according to the transaction table, wherein the commit time of the incremental transaction is later than the latest backup time of the primary/standby database system;

the obtaining unit is further configured to obtain the log chain of the incremental transaction according to the transaction table;

the obtaining unit is further configured to determine the storage address of a tuple of the incremental transaction according to the log chain of the incremental transaction, so as to obtain the tuple of the incremental transaction; and

the synchronization unit is configured to synchronize the log chain and the tuple of the incremental transaction to the second device.

16. The data processing device according to claim 15, wherein the obtaining unit is specifically configured to:

determine the storage address of the log chain of the incremental transaction according to the transaction table; and

obtain the log chain of the incremental transaction according to the storage address of the log chain of the incremental transaction.

17. A data processing device, wherein the data processing device is applied to a primary/standby database system, the data processing device is the primary database, a second device is the standby database, and the data processing device comprises:

an obtaining unit, configured to obtain a transaction table, wherein the transaction table indicates the commit time of each transaction and the storage address of its log chain, each transaction corresponds to one log chain, the log chain comprises at least one chain node, and each chain node indicates the storage address of one tuple in the transaction;

a determining unit, configured to determine an incremental transaction according to the transaction table, wherein the commit time of the incremental transaction is later than the time of the latest data synchronization of the primary/standby database system;

the obtaining unit, further configured to obtain the log chain of the incremental transaction according to the transaction table;

the determining unit, further configured to determine the storage address of a tuple of the incremental transaction according to the log chain of the incremental transaction, so as to obtain the tuple of the incremental transaction; and

a synchronization unit, configured to synchronize the log chain and the tuple of the incremental transaction to the second device.

18. The data processing device according to claim 17, wherein the log chain comprises a plurality of serial chain nodes, and an association relationship exists among the plurality of serial chain nodes.

19. The data processing device according to claim 18, wherein the storage address of the log chain is the storage address of the head node of the log chain.

20. The data processing device according to claim 17, 18 or 19, wherein the obtaining unit is specifically configured to:

determine the storage address of the log chain of the incremental transaction according to the transaction table; and

obtain the log chain of the incremental transaction according to the storage address of the log chain of the incremental transaction.

21. A computer device, comprising a processor and a memory, wherein the processor is coupled to the memory;

the memory is configured to store a program; and

the processor is configured to execute the program in the memory, so that the computer device executes the method according to any one of claims 1 to 6, or so that the computer device executes the method according to any one of claims 7 to 10.

22. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the method according to any one of claims 1 to 6 is implemented, or the method according to any one of claims 7 to 10 is implemented.

23. A computer program product, wherein the computer program product stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the method according to any one of claims 1 to 6 is implemented, or the method according to any one of claims 7 to 10 is implemented.
CN202210061367.9A 2022-01-19 2022-01-19 Data processing method and related device Pending CN116501539A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210061367.9A CN116501539A (en) 2022-01-19 2022-01-19 Data processing method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210061367.9A CN116501539A (en) 2022-01-19 2022-01-19 Data processing method and related device

Publications (1)

Publication Number Publication Date
CN116501539A true CN116501539A (en) 2023-07-28

Family

ID=87325430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210061367.9A Pending CN116501539A (en) 2022-01-19 2022-01-19 Data processing method and related device

Country Status (1)

Country Link
CN (1) CN116501539A (en)

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination