
CN111290881B - Data recovery method, device, equipment and storage medium

Info

Publication number: CN111290881B
Application number: CN202010071604.0A
Authority: CN
Inventors: 王海龙; 王蒙蒙; 韩朱忠
Original assignee: Shanghai Dameng Database Co Ltd
Current assignee: Shanghai Dameng Database Co Ltd
Priority date: 2020-01-21
Filing date: 2020-01-21
Publication date: 2023-09-19
Anticipated expiration: 2040-01-21
Also published as: CN111290881A

Abstract

The application discloses a data recovery method, device, equipment and storage medium. The method comprises the following steps: when the database is restarted after a fault is removed, reading a parallel log packet from the online log file; obtaining self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the sequence number of the parallel log packet, and the length of the parallel log packet; and sorting the parallel logs according to the self-description information and performing data recovery in the sorted order. With this technical scheme, parallel log sorting during normal operation of the system can be avoided entirely: the parallel logs in a parallel log packet are sorted by means of the self-description information only when the database is restarted, which can effectively improve system performance in high-concurrency, high-pressure scenarios.

Description

Data recovery method, device, equipment and storage medium

Technical Field

The embodiment of the application relates to the field of databases, in particular to a data recovery method, a device, equipment and a storage medium.

Background

While a database system is running, it may encounter various faults, such as operating system faults or hardware faults. After the fault is removed and the database system is restarted, the REDO log can be used to restore the system to its state at the moment before the fault.

The REDO log records the modification operations that the database performs on data. Each REDO log generated by a data modification is identified by a new LSN (Log Sequence Number, log sequence value), and the LSN is automatically incremented by 1 each time a REDO log is written. One LSN value represents one database modification operation, so the relative magnitude of LSNs reflects the order of the modifications, and REDO logs must be replayed in ascending LSN order.
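As a minimal illustration of this ordering requirement, the sketch below replays a few REDO records in ascending LSN order. The record layout (`lsn`, `key`, `value`) and the `replay` helper are hypothetical and are not taken from the application.

```python
from typing import NamedTuple


class RedoRecord(NamedTuple):
    lsn: int    # log sequence value, globally unique
    key: str    # hypothetical: the item that was modified
    value: int  # hypothetical: the value written by the modification


def replay(records):
    """Apply REDO records in ascending LSN order and return the resulting state."""
    state = {}
    for rec in sorted(records, key=lambda r: r.lsn):
        state[rec.key] = rec.value
    return state


# Records arrive out of order; replay must still apply LSN 1, then 2, then 3.
records = [RedoRecord(3, "a", 30), RedoRecord(1, "a", 10), RedoRecord(2, "b", 20)]
final_state = replay(records)
```

Replaying in any other order could leave `a` at the older value 10, which is why the LSN order matters.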

The REDO logs generated by the database system are first placed in a log buffer and are written to the online log file when the log buffer is full or a transaction is committed. Under high concurrency, having all sessions write logs into the same log buffer produces heavy contention. Because LSNs are globally unique, the REDO logs in each log buffer are not guaranteed to have continuously increasing LSNs, so before the online log is written, the logs in all log buffers must be sorted by LSN. In scenarios where system pressure is high and a large number of REDO logs are generated, this sorting can become a performance bottleneck.

Disclosure of Invention

The embodiment of the application provides a data recovery method, a device, equipment and a storage medium, which can completely avoid parallel log ordering actions when a system normally operates, order each path of parallel logs in a parallel log packet through self-description information only when a database is restarted, and effectively improve the system performance under a high-concurrency high-pressure operation scene.

In a first aspect, an embodiment of the present application provides a data recovery method, including:

when the database is restarted after the fault is removed, reading a parallel log packet from the online log file;

obtaining self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the sequence number of the parallel log packet, and the length of the parallel log packet;

and sequencing the parallel logs according to the self-description information, and sequentially recovering data according to the sequenced parallel logs.

Further, obtaining the self-description information of the parallel log packet includes:

and acquiring the self-description information stored in the packet header of the parallel log packet.

Further, when the database is restarted after the fault is removed, before the parallel log packet is read from the online log file, the method further comprises:

when any one log packet buffer area is full, creating a parallel log packet, and distributing a parallel log packet sequence number for the parallel log packet;

sequentially copying database logs in a log packet buffer area into the parallel log packet;

counting and recording the self-description information of the parallel log packet;

and writing the self-description information into the packet head of the parallel log packet, and writing the parallel log packet into an online log file.

Further, when any one log packet buffer area is full, before creating a parallel log packet, the method further includes:

and writing database logs generated by at least two working threads into a log packet buffer area, wherein each working thread is allocated with one log packet buffer area.

Further, the minimum log sequence value in the currently created parallel log packet is one plus the maximum log sequence value in the previous parallel log packet.

Further, sorting the parallel logs according to the self-description information, and sequentially performing data recovery according to the sorted parallel logs includes:

sequentially reading out each path of parallel logs according to the self-description information;

sequencing the parallel logs according to the order of log sequence values from small to large;

and sequentially recovering the data according to the ordered parallel logs.

Further, the serial numbers of the parallel log packets are sequentially increased according to the creating sequence of the parallel log packets.

In a second aspect, an embodiment of the present application further provides a data recovery apparatus, where the apparatus includes:

the reading module is used for reading the parallel log packet from the online log file when the database is restarted after the fault is removed;

the acquisition module is used for acquiring the self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the sequence number of the parallel log packet, and the length of the parallel log packet;

and the recovery module is used for sequencing the parallel logs according to the self-description information and sequentially recovering the data according to the sequenced parallel logs.

Further, the obtaining module includes:

and the information acquisition sub-module is used for acquiring the self-description information stored in the packet header of the parallel log packet.

Further, the apparatus further comprises:

the creation sub-module is used for creating a parallel log packet when any log packet buffer area is full, and distributing a parallel log packet sequence number for the parallel log packet;

the copying module is used for copying the database logs in the log packet buffer area into the parallel log packet in sequence;

the statistics module is used for counting and recording the self-description information of the parallel log packet;

and the writing module is used for writing the self-description information into the packet heads of the parallel log packets and writing the parallel log packets into an online log file.

In a third aspect, an embodiment of the present application further provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the data recovery method according to any one of the embodiments of the present application when the processor executes the program.

In a fourth aspect, embodiments of the present application further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a data recovery method according to any of the embodiments of the present application.

According to the embodiment of the application, when the database is restarted after the fault is removed, the parallel log packet is read from the online log file;

self-description information of the parallel log packet is obtained, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the sequence number of the parallel log packet, and the length of the parallel log packet; the parallel logs are sorted according to the self-description information, and data recovery is performed in the sorted order. Parallel log sorting during normal operation of the system can thus be avoided entirely: the parallel logs in a parallel log packet are sorted by means of the self-description information only when the database is restarted, which can effectively improve system performance in high-concurrency, high-pressure scenarios.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a data recovery method according to a first embodiment of the application;

FIG. 2A is a flow chart of a data recovery method according to a second embodiment of the present application;

FIG. 2B is a diagram of a parallel log packet format in a second embodiment of the application;

FIG. 2C is a flow chart of parallel log packet generation in a second embodiment of the application;

FIG. 2D is a flow chart of parallel log packet replay in a second embodiment of the present application;

FIG. 3 is a schematic diagram of a data recovery device according to a third embodiment of the present application;

FIG. 4 is a schematic structural diagram of a computer device in a fourth embodiment of the present application.

Detailed Description

The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present application are shown in the drawings.

It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.

Example 1

Fig. 1 is a flowchart of a data recovery method provided in a first embodiment of the present application. This embodiment is applicable to data recovery scenarios. The method may be performed by the data recovery device of the embodiments of the present application, and the device may be implemented in software and/or hardware. As shown in fig. 1, the method specifically includes the following steps:

S110, when the database is restarted after the fault is removed, the parallel log packet is read from the online log file.

The online log file is generally created when the database is initialized and is then reused cyclically; it stores the log packets generated while the system operates normally.

A parallel log packet stores at least two paths of parallel logs, and only complete parallel log packets are read. Note that, because of the fault, the last parallel log packet may not have been written completely into the online log file; when parallel log packets are read from the online log file, an incomplete parallel log packet is discarded and only the complete parallel log packets are read.
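One way to picture the discarding of an incomplete trailing packet is the sketch below. It assumes, purely for illustration, that every packet begins with a 4-byte little-endian field holding the total packet length; the application does not specify the on-disk layout, and a real system would typically also verify a checksum.

```python
import struct

# Hypothetical layout: each packet starts with its own total length (4 bytes).
LEN_FIELD = struct.Struct("<I")


def make_packet(payload: bytes) -> bytes:
    """Build a length-prefixed packet (illustrative format only)."""
    return LEN_FIELD.pack(LEN_FIELD.size + len(payload)) + payload


def read_complete_packets(data: bytes):
    """Return all complete packets; stop at the first truncated one."""
    packets = []
    pos = 0
    while pos + LEN_FIELD.size <= len(data):
        (length,) = LEN_FIELD.unpack_from(data, pos)
        if length < LEN_FIELD.size or pos + length > len(data):
            break  # packet was not fully written before the fault: discard it
        packets.append(data[pos:pos + length])
        pos += length
    return packets


# Simulate a fault that cut off the last two bytes of the third packet.
log_file = make_packet(b"first") + make_packet(b"second") + make_packet(b"third")[:-2]
complete = read_complete_packets(log_file)
```

Here only the first two packets are returned, because the declared length of the third exceeds the bytes actually present in the file.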

The fault may be an operating system fault, a hardware fault, or the like, which is not limited in the embodiment of the present application.

Specifically, after the fault is removed and the database is restarted, the parallel log packets in the online log file are read.

S120, obtaining self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the parallel log packet sequence number, and the length of the parallel log packet.

The self-description information is obtained in two ways: by counting and recording statistics over the parallel logs stored in the parallel log packet, and by allocation when the parallel log packet is created. The write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value and the length of the parallel log packet are obtained from the parallel logs stored in the packet, whereas the parallel log packet sequence number is allocated when the packet is created.

The N working threads correspond to N paths of parallel logs.

Wherein the log sequence value is globally unique.

Each parallel log packet is uniquely identified by a parallel log packet sequence number. The sequence numbers increase or decrease in sequence, so that the order in which the log packets were generated can be identified.

Optionally, the serial numbers of the parallel log packets are sequentially increased according to the creation sequence of the parallel log packets.

The parallel log packet sequence numbers are assigned in sequence, following the order in which the parallel log packets are created. For example, four successively created parallel log packets are allocated the sequence numbers 001, 002, 003 and 004; alternatively, three successively created parallel log packets are allocated the sequence numbers 100000, 99999 and 99998.
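The two numbering schemes in this example can be sketched with a simple counter. The `sequence_allocator` helper is hypothetical; a real database would also need to persist the counter and make allocation thread-safe.

```python
import itertools


def sequence_allocator(start: int, step: int = 1):
    """Return a function that hands out packet sequence numbers in creation order."""
    counter = itertools.count(start, step)
    return lambda: next(counter)


next_increasing = sequence_allocator(1)           # 1, 2, 3, ...
next_decreasing = sequence_allocator(100000, -1)  # 100000, 99999, ...

increasing = [next_increasing() for _ in range(4)]
decreasing = [next_decreasing() for _ in range(3)]
```

Either direction works, provided it is fixed in advance, because the reader only needs to recover the creation order from the numbers.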

Specifically, the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the parallel log packet sequence number, and the length of the parallel log packet are obtained.

S130, ordering the parallel logs according to the self-description information, and sequentially recovering data according to the ordered parallel logs.

Specifically, each path of parallel logs in the parallel log packet is first read out in sequence according to the self-description information. The LSNs within each path of parallel logs are necessarily increasing, but not necessarily continuous. The paths that have been read out are merge-sorted in ascending LSN order (after the merge, the LSNs are necessarily continuously increasing) and then replayed in sequence, thereby recovering the data.
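Because each path is already in ascending LSN order, the merge step can be sketched as a k-way merge. The example below uses Python's standard `heapq.merge`; representing a log record as an `(lsn, payload)` tuple is an assumption made for illustration.

```python
import heapq


def merge_parallel_logs(streams):
    """Merge per-thread logs (each ascending by LSN) into one ascending list."""
    return list(heapq.merge(*streams, key=lambda rec: rec[0]))


# Three paths of parallel logs: LSNs increase within a path but have gaps.
streams = [
    [(1, "t1-a"), (4, "t1-b"), (6, "t1-c")],
    [(2, "t2-a"), (3, "t2-b")],
    [(5, "t3-a")],
]
merged = merge_parallel_logs(streams)
merged_lsns = [lsn for lsn, _ in merged]
```

A k-way merge exploits the per-path ordering and avoids a full re-sort of every record, although a general sort on LSN would produce the same result.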

Optionally, sorting the parallel logs according to the self-description information, and sequentially performing data recovery according to the sorted parallel logs includes:

and reading out each path of parallel logs in sequence according to the self-description information.

Wherein, each path of parallel log is stored in a parallel log packet.

Specifically, each parallel log is sequentially read out from the parallel log packet based on the self-description information, for example, if M parallel logs exist in the parallel log packet, the M parallel logs in the parallel log packet may be sequentially read out.

sorting the parallel logs in ascending order of log sequence value;

and sequentially performing data recovery according to the sorted parallel logs.

Specifically, parallel logs must be replayed in ascending LSN order. REDO logs are stored in the form of parallel log packets: the parallel logs generated by each working thread are stored directly in the parallel log packet, and the paths of parallel logs within a packet are not sorted against one another. The LSNs within each path of parallel logs are increasing but not necessarily continuous, whereas LSNs are guaranteed to increase continuously from one parallel log packet to the next. Consequently, when REDO logs are replayed on restart after a fault, only the paths of parallel logs within each parallel log packet need to be sorted and replayed; no log sorting is required during normal operation, which can effectively improve system performance in high-concurrency, high-pressure scenarios.

In the prior art, even when the system operates normally, the logs in all log cache areas are required to be ordered according to the LSN, and under the scene that the system pressure is higher and a large number of REDO logs are generated, the ordering action becomes a performance bottleneck.

According to the technical scheme of this embodiment, when the database is restarted after the fault is removed, the parallel log packet is read from the online log file; self-description information of the parallel log packet is obtained, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the sequence number of the parallel log packet, and the length of the parallel log packet; the parallel logs are sorted according to the self-description information, and data recovery is performed in the sorted order. Parallel log sorting during normal operation of the system can thus be avoided entirely: the parallel logs in a parallel log packet are sorted by means of the self-description information only when the database is restarted, which can effectively improve system performance in high-concurrency, high-pressure scenarios.

Example two

Fig. 2A is a flowchart of a data recovery method according to a second embodiment of the present application, where the optimization is performed based on the foregoing embodiment, and in this embodiment, obtaining the self-description information of the parallel log packet includes: and acquiring self-description information of the packet heads stored in the parallel log packet. When the database is restarted after the fault is removed, before the parallel log packet is read from the online log file, the method further comprises the following steps: when any one log packet buffer area is full, creating a parallel log packet, and distributing a parallel log packet sequence number for the parallel log packet; sequentially copying database logs in a log packet buffer area into the parallel log packet; counting and recording the self-description information of the parallel log packet; and writing the self-description information into the packet head of the parallel log packet, and writing the parallel log packet into an online log file.

As shown in fig. 2A, the method of this embodiment specifically includes the following steps:

S210, when any log packet buffer area is full, creating a parallel log packet and allocating a parallel log packet sequence number to the parallel log packet.

Each working thread is allocated with a log packet buffer zone, wherein the number of the working threads is configured by a user, N working threads correspond to N paths of parallel logs, N log packet buffer zones are correspondingly allocated, REDO logs generated by each working thread are written into the corresponding log packet buffer zone, and LSNs in the log packet buffer zones are necessarily increased, but are not necessarily continuous.
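The property stated here, that LSNs within one buffer increase but need not be continuous, can be demonstrated with a small threaded sketch. The `LsnAllocator` class and `run_workers` function are hypothetical names, and a plain Python list stands in for each log packet buffer.

```python
import threading


class LsnAllocator:
    """Hands out globally unique, continuously increasing LSNs."""

    def __init__(self, first: int = 1):
        self._next = first
        self._lock = threading.Lock()

    def allocate(self) -> int:
        with self._lock:
            lsn = self._next
            self._next += 1
            return lsn


def run_workers(n_workers: int, logs_per_worker: int):
    """Each worker appends the LSNs it obtains to its own private buffer."""
    allocator = LsnAllocator()
    buffers = [[] for _ in range(n_workers)]

    def work(index: int):
        for _ in range(logs_per_worker):
            # Only the LSN counter is shared; the buffer belongs to this thread.
            buffers[index].append(allocator.allocate())

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffers


buffers = run_workers(n_workers=3, logs_per_worker=50)
```

Whatever the thread interleaving, each buffer is ascending on its own, and together the buffers cover every LSN from 1 to 150 exactly once.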

Specifically, when a certain path of log packet buffer area is full, a new parallel log packet is created, and a unique parallel log packet serial number is allocated to the parallel log packet.

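Optionally, before creating a parallel log packet when any log packet buffer is full, the method further includes:

writing the database logs generated by at least two working threads into log packet buffers, wherein each working thread is allocated one log packet buffer.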

Optionally, the minimum log sequence value in the currently created parallel log packet is one plus the maximum log sequence value in the previous parallel log packet.

Specifically, the minimum LSN of the next parallel log packet is the maximum LSN of the last parallel log packet plus 1, so as to ensure that LSNs among the parallel log packets are continuously increased, wherein each parallel log packet is provided with a parallel log packet serial number for unique identification and sequential increase, so as to identify the sequence of the generation of the parallel log packets.
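This continuity rule between consecutive packets can be expressed as a simple check. The sketch below assumes each packet is summarized as a hypothetical `(seq_no, min_lsn, max_lsn)` tuple listed in creation order.

```python
def packets_are_contiguous(headers):
    """Check that each packet's minimum LSN is the previous packet's maximum LSN plus 1.

    headers: list of (seq_no, min_lsn, max_lsn) tuples in creation order.
    """
    for prev, cur in zip(headers, headers[1:]):
        if cur[1] != prev[2] + 1:
            return False
    return True


contiguous = [(1, 1, 10), (2, 11, 25), (3, 26, 40)]
with_gap = [(1, 1, 10), (2, 12, 25)]  # LSN 11 is missing between the packets
```

A check of this kind could serve as a consistency test during recovery, although the application does not state that such validation is performed.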

S220, the database logs in the log packet buffer area are copied to the parallel log packet in sequence.

Wherein the database log is a REDO log.

Specifically, the database logs in the log packet buffers are copied in sequence into the newly created parallel log packet. For example, a log copying action may be triggered in which the REDO logs in the N log packet buffers are copied in sequence into the newly created parallel log packet, which then stores M segments of REDO logs, where M <= N (a log packet buffer may be empty, in which case there is no log to copy).
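The copy step, including the skipping of empty buffers, might look like the following sketch. The `build_packet_body` function is hypothetical; `bytearray` objects stand in for the log packet buffers, and the returned offsets are measured from the start of the packet body.

```python
def build_packet_body(buffers):
    """Copy non-empty buffers into one packet body, recording where each path starts."""
    body = bytearray()
    offsets = []
    for buf in buffers:
        if not buf:
            continue  # empty buffer: nothing to copy, so M may be smaller than N
        offsets.append(len(body))
        body += buf
        buf.clear()  # the buffer is emptied once its logs have been copied
    return bytes(body), offsets


# N = 3 buffers, one of them empty, so M = 2 paths end up in the packet.
buffers = [bytearray(b"AAAA"), bytearray(), bytearray(b"CC")]
body, offsets = build_packet_body(buffers)
```

No sorting takes place here: the buffers are concatenated as they are, which is the point of deferring the ordering work to restart time.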

S230, self-description information of the parallel log packet is counted and recorded.

Specifically, based on the REDO logs copied into the newly created parallel log packet, the following are counted and recorded: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, and the length of the parallel log packet.
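A minimal sketch of gathering these statistics is shown below. The `SelfDescription` dataclass and `describe` function are hypothetical, each log record is modeled as an `(lsn, payload)` tuple, and the length is taken to be the size of the log body only, which is an assumption rather than something the application specifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SelfDescription:
    seq_no: int                # allocated when the packet is created
    length: int                # assumed here to count log body bytes only
    min_lsn: int
    max_lsn: int
    stream_count: int          # M, the number of paths actually stored
    stream_offsets: List[int]  # write start position of each path


def describe(seq_no, streams):
    """Compute the self-description for non-empty per-thread lists of (lsn, payload)."""
    offsets, lsns, position = [], [], 0
    for stream in streams:
        offsets.append(position)
        for lsn, payload in stream:
            position += len(payload)
            lsns.append(lsn)
    return SelfDescription(seq_no, position, min(lsns), max(lsns), len(streams), offsets)


desc = describe(7, [[(11, b"abc"), (14, b"de")], [(12, b"f"), (13, b"ghij")]])
```

All fields except the sequence number fall out of a single pass over the copied logs, so the statistics can be gathered while copying.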

S240, writing the self-description information into the packet header of the parallel log packet, and writing the parallel log packet into the online log file.

Specifically, as shown in fig. 2B, the packet header of the parallel log packet records the parallel log packet sequence number, the length of the parallel log packet, the minimum log sequence value, the maximum log sequence value, the number of parallel logs stored in the parallel log packet, and the write start position of each path of parallel log in the packet. The parallel log packet is then written into the online log file.
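To make the idea of a self-describing header concrete, the sketch below serializes the listed fields with Python's `struct` module. The field widths, byte order and field order are assumptions chosen for illustration; fig. 2B is not reproduced here and may define a different layout.

```python
import struct

# Hypothetical fixed part: seq_no (8 bytes), length (4), min_lsn (8), max_lsn (8), count (4).
FIXED = struct.Struct("<QIQQI")


def pack_header(seq_no, length, min_lsn, max_lsn, offsets):
    """Serialize the self-description: fixed fields followed by one offset per path."""
    fixed = FIXED.pack(seq_no, length, min_lsn, max_lsn, len(offsets))
    return fixed + struct.pack(f"<{len(offsets)}I", *offsets)


def unpack_header(data):
    """Recover the self-description from the start of a packet."""
    seq_no, length, min_lsn, max_lsn, count = FIXED.unpack_from(data, 0)
    offsets = list(struct.unpack_from(f"<{count}I", data, FIXED.size))
    return seq_no, length, min_lsn, max_lsn, offsets


header = pack_header(7, 10, 11, 14, [0, 5])
decoded = unpack_header(header)
```

Storing the path count ahead of the variable-length offset list lets a reader determine how many offsets follow without any external metadata.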

S250, acquiring the self-description information stored in the packet header of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the parallel log packet sequence number, and the length of the parallel log packet.

S260, ordering the parallel logs according to the self-description information, and sequentially recovering data according to the ordered parallel logs.

In a specific example, as shown in fig. 2C, with N paths of parallel logs, a log copying action is triggered when the log packet buffer of any path is full. The generation steps are as follows:

A new parallel log packet is created and allocated a unique parallel log packet sequence number.

The REDO logs in the N log packet buffers are copied into the new parallel log packet in sequence, and the log packet buffers are emptied; a log packet buffer that contains no REDO log is skipped.

During copying, the self-description information of the parallel log packet is counted and recorded, including the write start position of each path of parallel log in the parallel log packet, the number M (M <= N) of parallel logs actually stored in the parallel log packet, the minimum LSN and the maximum LSN in the parallel log packet, and the length of the parallel log packet.

After copying, the self-description information (the parallel log packet sequence number, the packet length, the minimum LSN, the maximum LSN, the number of parallel logs in the parallel log packet, and the start position of each path of parallel log in the packet) is written into the parallel log packet header, and the parallel log packet is written into the online log file.

While the system operates normally, the REDO log is not used. Only when the system is restarted does the REDO log need to be replayed to restore the system to the moment before the fault; at that point the REDO logs in the parallel log packet must be sorted and replayed in ascending LSN order. As shown in fig. 2D, the parallel log packet replay steps are as follows:

A complete parallel log packet is read from the online log file.

The number of parallel logs (M) in the parallel log packet and the start offset of each path of parallel log are taken from the parallel log packet header, and the M paths of parallel logs in the parallel log packet are read out in sequence.

The M paths of parallel logs are sorted in ascending LSN order and replayed in sequence.
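The replay procedure across several packets can be sketched as follows. The sketch assumes increasing packet sequence numbers, models a packet as a dictionary with hypothetical `seq_no` and `streams` keys, and reduces each log record to its LSN.

```python
import heapq


def replay_order(packets):
    """Return the LSN order in which records would be replayed.

    packets: list of dicts with 'seq_no' and 'streams', where each stream
    is a list of LSNs that is already ascending.
    """
    ordered = []
    # Packets are handled in creation order; sorting is needed only inside a packet.
    for packet in sorted(packets, key=lambda p: p["seq_no"]):
        ordered.extend(heapq.merge(*packet["streams"]))
    return ordered


packets = [
    {"seq_no": 2, "streams": [[5, 7], [6]]},
    {"seq_no": 1, "streams": [[1, 4], [2, 3]]},
]
lsn_order = replay_order(packets)
```

Because LSNs increase continuously from one packet to the next, concatenating the per-packet merges yields a globally ascending sequence without any sort that spans packets.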

According to the technical scheme of this embodiment, REDO logs are stored in the form of self-describing log packets. The parallel logs generated by each working thread are stored directly in the log packet, the paths of parallel logs are not sorted within the log packet, and LSNs are guaranteed to increase continuously from one log packet to the next. Parallel log sorting during normal operation of the system can thus be avoided entirely: the parallel logs in a parallel log packet are sorted by means of the self-description information only when the database is restarted, which can effectively improve system performance in high-concurrency, high-pressure scenarios.

Example III

Fig. 3 is a schematic structural diagram of a data recovery device according to a third embodiment of the present application. The embodiment may be applied to the case of data recovery, where the apparatus may be implemented in software and/or hardware, and the apparatus may be integrated in any device that provides a function of data recovery, as shown in fig. 3, where the data recovery apparatus specifically includes: a reading module 310, an acquisition module 320, and a recovery module 330.

The reading module 310 is configured to read the parallel log packet from the online log file when the database is restarted after the fault is removed;

an obtaining module 320, configured to obtain self-description information of the parallel log packet, where the self-description information includes: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the sequence number of the parallel log packet, and the length of the parallel log packet;

and the recovery module 330 is configured to sort the parallel logs according to the self-description information, and sequentially perform data recovery according to the sorted parallel logs.

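Optionally, the obtaining module includes:

an information acquisition sub-module, configured to acquire the self-description information stored in the packet header of the parallel log packet.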

Optionally, the apparatus further comprises:

and the writing sub-module is used for writing the database logs generated by at least two paths of working threads into the log packet buffer area, wherein each working thread is allocated with one log packet buffer area.

Optionally, the recovery module is specifically configured to:

reading out each path of parallel logs in sequence according to the self-description information;

sorting the parallel logs in ascending order of log sequence value;

and sequentially performing data recovery according to the sorted parallel logs.

The product can execute the method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method.

Example IV

Fig. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present application. Fig. 4 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present application. The computer device 12 shown in fig. 4 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present application.

As shown in FIG. 4, the computer device 12 is in the form of a general purpose computing device. Components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard disk drive"). Although not shown in fig. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the application.

A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.

The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. In addition, in the computer device 12 of the present embodiment, the display 24 is not present as a separate body but is embedded in the mirror surface, and the display surface of the display 24 and the mirror surface are visually integrated when the display surface of the display 24 is not displayed. Moreover, computer device 12 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with computer device 12, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing the data recovery method provided by an embodiment of the present application: when the database is restarted after the fault is removed, reading a parallel log packet from the online log file; obtaining self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the serial number of the parallel log packet, and the length of the parallel log packet; and sorting the parallel logs according to the self-description information, and sequentially recovering data according to the sorted parallel logs.
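The self-description fields listed above can be modelled as a small record, as in the following hedged sketch. It orders packets by their serial number and, as one possible use of the property stated in this document that the minimum log sequence value of a packet is one plus the maximum log sequence value of the previous packet, checks that consecutive packets are contiguous. The class name, field names, and the decision to raise an error on a gap are assumptions made for illustration, not the applicant's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SelfDescription:
    """Hypothetical in-memory form of a packet's self-description information."""
    start_offsets: List[int]  # write start position of each path of parallel log
    log_count: int            # number of parallel logs stored in the packet
    min_lsn: int              # minimum log sequence value in the packet
    max_lsn: int              # maximum log sequence value in the packet
    packet_seq: int           # serial number of the parallel log packet
    packet_len: int           # length of the parallel log packet


def order_packets(descs: List[SelfDescription]) -> List[SelfDescription]:
    """Return packets in creation order, verifying log sequence continuity."""
    ordered = sorted(descs, key=lambda d: d.packet_seq)
    for prev, cur in zip(ordered, ordered[1:]):
        # Assumed check: each packet starts one past the previous packet's maximum.
        if cur.min_lsn != prev.max_lsn + 1:
            raise ValueError(
                f"gap between packet {prev.packet_seq} and packet {cur.packet_seq}"
            )
    return ordered
```

A recovery routine could call `order_packets` first and then replay the logs of each packet in the returned order.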

Example V

A fifth embodiment of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data recovery method provided by any of the embodiments of the present application: when the database is restarted after the fault is removed, reading a parallel log packet from the online log file; obtaining self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value and the maximum log sequence value in the parallel log packet, the serial number of the parallel log packet, and the length of the parallel log packet; and sorting the parallel logs according to the self-description information, and sequentially recovering data according to the sorted parallel logs.

Any combination of one or more computer readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Note that the above describes only preferred embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made by those skilled in the art without departing from the scope of the application. Therefore, while the application has been described in connection with the above embodiments, the application is not limited to these embodiments, but may be embodied in many other equivalent forms without departing from the spirit of the application, the scope of which is set forth in the following claims.

Claims

1. A method of data recovery, comprising:

sequencing the parallel logs according to the self-description information, and sequentially recovering data according to the sequenced parallel logs;

when the database is restarted after the fault is removed, before the parallel log packet is read from the online log file, the method further comprises the following steps:

sequentially copying database logs in all log packet buffer areas into the parallel log packet;

2. The method of claim 1, wherein obtaining self-descriptive information for the parallel log package comprises:

3. The method of claim 1, wherein when any one of the log packet buffers is full, prior to creating a parallel log packet, further comprising:

4. The method of claim 1, wherein the smallest log sequence value in the currently created parallel log packet is one plus the largest log sequence value in the previous parallel log packet.

5. The method of claim 1, wherein sorting the parallel logs according to the self-description information, and sequentially performing data recovery according to the sorted parallel logs comprises:

and sequentially recovering the data according to the ordered parallel logs.

6. The method of claim 1, wherein the parallel log packet sequence number is sequentially incremented according to a sequential order of creation of the parallel log packets.

7. A data recovery apparatus, comprising:

the recovery module is used for sequencing the parallel logs according to the self-description information and sequentially recovering data according to the sequenced parallel logs;

the copying module is used for copying the database logs in all log packet buffer areas to the parallel log packets in sequence;

8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1-6 when the program is executed by the processor.

9. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-6.