Disclosure of Invention
The embodiment of the application provides a data backup method and device, a storage medium and electronic equipment, which at least solve the problem of resource waste caused by data backup in the related art.
According to one embodiment of the application, a data backup method is provided. The method is applied to a storage server, where the storage server is connected to a cloud server and a host server, respectively, the storage server includes a storage pool, and the storage pool includes a plurality of storage volumes. The method includes: receiving a first data backup request sent by the host server; creating a first snapshot when it is determined that the first data backup request is used to request a data backup operation on data in a target storage volume, where the target storage volume is any one of the plurality of storage volumes, and the first snapshot is used to instruct the data blocks in the target storage volume to set IO operation identifiers when input/output (IO) operations are performed; in response to the first data backup request, reading first target data from a first target data block based on a first snapshot identifier of the first snapshot, where the first target data block is at least one data block in the target storage volume, and a target IO operation identifier of the first target data block is the same as the first snapshot identifier; and performing a data backup operation on the first target data to back up the first target data to the cloud server.
In an exemplary embodiment, before the first snapshot is created, the method further includes: performing a mapping operation on the target storage volume to map the target storage volume to the host server, where the mapping operation includes sending a logical unit number of the target storage volume to the host server and instructing the host server to establish a connection with the target storage volume based on the logical unit number, so as to allow the host server to access the target storage volume; receiving a first IO operation request sent by the host server, where the information included in the first IO operation request includes identification information of the target storage volume, address information of a first target data block, data information of the IO operation to be performed, and the IO operation identifier; and, in response to the first IO operation request, performing the IO operation on the first target data block using the information included in the first IO operation request and recording the target IO operation identifier of the first target data block.
In an exemplary embodiment, after performing the IO operation on the first target data block by using the information included in the first IO operation request and recording the target IO operation identifier of the first target data block in response to the first IO operation request, the method further includes updating, in a case where it is determined that the IO operation performed on the first target data block is an initial IO operation, a first usage field of the first target data block to obtain a first target usage field, where the first usage field is set before receiving the IO operation request sent by the host server, and a field value of the first usage field is used to identify whether deletion of data in the first target data block is allowed.
In one exemplary embodiment, in the case that the first data backup request is determined to be for requesting to perform a data backup operation on data in a target storage volume, creating a first snapshot includes sending a snapshot instruction to the host server to instruct the host server to set the first snapshot identification to a second IO operation request and send the second IO operation request to the target storage volume in response to the snapshot instruction, determining the IO operation identification based on the first snapshot identification after performing the second IO operation in the first target data block in response to the second IO operation request, and setting the IO operation identification to the first target data block to create the first snapshot.
In an exemplary embodiment, reading first target data from a first target data block based on a first snapshot identifier of the first snapshot in response to the first data backup request includes receiving data acquisition indication information sent by the target storage volume, and in the case that the data acquisition indication information includes first indication information, searching, in response to the first data backup request, for an IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, where the first indication information is used to indicate that only incremental data is allowed to be read from the first target data block, and reading the incremental data from the first target data block to obtain the first target data.
In an exemplary embodiment, after receiving the data acquisition indication information sent by the target storage volume, the method further includes: in the case that the data acquisition indication information includes second indication information, searching, in response to the first data backup request, the target storage volume for an IO operation identifier matching the first snapshot identifier to determine the first target data block, where the second indication information is used to indicate that reading of all data included in the first target data block is allowed; and reading all data from the first target data block to obtain the first target data.
In an exemplary embodiment, reading first target data from a first target data block based on a first snapshot identification of the first snapshot includes traversing the target storage volume based on an initial snapshot identification of the first snapshot, determining data blocks included in the target storage volume that perform the IO operation as the first target data block, where the initial snapshot identification is a snapshot identification sent to the host server for the first time, sending a first read command to the first target data block, and receiving the first target data sent by the first target data block in response to the first read command.
In an exemplary embodiment, in response to the first data backup request, reading first target data from a first target data block based on a first snapshot identifier of the first snapshot includes traversing the target storage volume based on the first snapshot identifier to determine the first target data block, searching the IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, sending a second read instruction to the first target data block, and receiving the first target data sent by the first target data block in response to the second read instruction.
In one exemplary embodiment, performing the data backup operation on the first target data to backup the first target data to the cloud server includes invoking a data backup interface in the cloud server, wherein the data backup interface only allows transmission of the first target data read from the first target data block, and uploading the first target data to the cloud server through the data backup interface to backup the first target data in the cloud server.
In an exemplary embodiment, after performing the data backup operation on the first target data to backup the first target data to the cloud server, the method further includes updating a second usage field of the first target data block to obtain a second target usage field, where the second usage field is set when the first target data block performs the IO operation, a field value of the second usage field is used to identify whether to allow deletion of data in the first target data block, and deleting the first target data to release a storage space of the first target data block if the field value of the second target usage field is a target field value.
In an exemplary embodiment, the storage server further includes a data deletion service, and after the data backup operation is performed on the first target data to backup the first target data to the cloud server, the method further includes traversing the target storage volume, searching for second target data from the target storage volume, where the second target data is data that is not backed up to the cloud server in the target storage volume, and invoking the data deletion service to perform a deletion operation on the second target data through the data deletion service.
According to another embodiment of the present application, a data backup device is provided. The device is applied to a storage server, where the storage server is connected to a cloud server and a host server, respectively, the storage server includes a storage pool, and the storage pool includes a plurality of storage volumes. The device includes: a first receiving module configured to receive a first data backup request sent by the host server; a first creating module configured to create a first snapshot when it is determined that the first data backup request is used to request a data backup operation on data in a target storage volume, where the target storage volume is any one of the plurality of storage volumes, and the first snapshot is used to instruct the data blocks in the target storage volume to set IO operation identifiers when input/output (IO) operations are performed; a first response module configured to read, in response to the first data backup request, first target data from a first target data block based on a first snapshot identifier of the first snapshot, where the first target data block is at least one data block in the target storage volume, and a target IO operation identifier of the first target data block is the same as the first snapshot identifier; and a first backup module configured to perform a data backup operation on the first target data to back up the first target data to the cloud server.
In an exemplary embodiment, the device further includes: a first mapping module configured to perform, before the first snapshot is created, a mapping operation on the target storage volume to map the target storage volume to the host server, where the mapping operation includes sending a logical unit number of the target storage volume to the host server and instructing the host server to establish a connection with the target storage volume based on the logical unit number, so as to allow the host server to access the target storage volume; a second receiving module configured to receive a first IO operation request sent by the host server, where the first IO operation request includes identification information of the target storage volume, address information of a first target data block, data information of the IO operation to be performed, and the IO operation identifier; and a second response module configured to, in response to the first IO operation request, perform the IO operation on the first target data block using the information included in the first IO operation request and record the target IO operation identifier of the first target data block.
In an exemplary embodiment, the apparatus further includes a first update module, configured to respond to the first IO operation request, perform the IO operation on the first target data block using the information included in the first IO operation request, record the target IO operation identifier of the first target data block, and update, after determining that the IO operation performed on the first target data block is an initial IO operation, a first usage field of the first target data block to obtain a first target usage field, where a field value of the first usage field is set before receiving the IO operation request sent by the host server, where the field value of the first usage field is used to identify whether to allow deletion of data in the first target data block.
In an exemplary embodiment, the first creation module includes a first sending submodule, configured to send a snapshot instruction to the host server to instruct the host server to set the first snapshot identifier to a second IO operation request and send the second IO operation request to the target storage volume, and a first response submodule, configured to determine the IO operation identifier based on the first snapshot identifier after the second IO operation is performed in the first target data block and set the IO operation identifier to the first target data block to create the first snapshot in response to the second IO operation request.
In an exemplary embodiment, the first response module includes a first receiving sub-module configured to receive data acquisition indication information sent by the target storage volume, a second response sub-module configured to, in response to the first data backup request when the data acquisition indication information includes first indication information, search for an IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, where the first indication information is configured to indicate that only incremental data is allowed to be read from the first target data block, and a first reading sub-module configured to read the incremental data from the first target data block to obtain the first target data.
In an exemplary embodiment, the first response module further includes: a third response sub-module configured to, in the case that the data acquisition indication information sent by the target storage volume includes second indication information, search, in response to the first data backup request, the target storage volume for an IO operation identifier matching the first snapshot identifier to determine the first target data block, where the second indication information is used to indicate that reading of all data included in the first target data block is allowed; and a second reading sub-module configured to read all data from the first target data block to obtain the first target data.
In an exemplary embodiment, the first response module includes a first traversing sub-module configured to traverse the target storage volume based on an initial snapshot identifier of the first snapshot, and determine data blocks included in the target storage volume that perform the IO operation as the first target data blocks, where the initial snapshot identifier is a snapshot identifier sent to the host server for the first time, send a first read command to the first target data blocks, and a second receiving sub-module configured to receive the first target data sent by the first target data blocks in response to the first read command.
In an exemplary embodiment, the first response module includes a second traversing sub-module configured to traverse the target storage volume based on the first snapshot identifier to find an IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, send a second read instruction to the first target data block, and a third receiving sub-module configured to receive the first target data sent by the first target data block in response to the second read instruction, where the first data backup request is another backup request.
In an exemplary embodiment, the first backup module includes a first calling sub-module configured to call a data backup interface in the cloud server, where the data backup interface only allows transmission of the first target data read from the first target data block, and a first uploading sub-module configured to upload the first target data to the cloud server through the data backup interface to backup the first target data in the cloud server.
In an exemplary embodiment, the device further includes: a second updating module configured to, after the data backup operation is performed on the first target data to back up the first target data to the cloud server, update a second usage field of the first target data block to obtain a second target usage field, where the second usage field is set when the IO operation is performed on the first target data block, and a field value of the second usage field is used to identify whether deletion of data in the first target data block is allowed; and a first deleting module configured to delete the first target data when the field value of the second target usage field is a target field value, so as to release the storage space of the first target data block.
In an exemplary embodiment, the storage server further includes a data deletion service, and the device further includes: a second traversing module configured to, after the data backup operation is performed on the first target data to back up the first target data to the cloud server, traverse the target storage volume and find second target data in the target storage volume, where the second target data is data in the target storage volume that is not backed up to the cloud server; and a first calling module configured to call the data deletion service, so as to perform a deletion operation on the second target data through the data deletion service.
According to yet another embodiment of the present application, there is further provided a storage server, where the storage server is connected to a cloud server and a host server, respectively, and where the storage server includes a storage pool, where the storage pool includes a plurality of storage volumes, where the storage server is configured to perform the steps in any of the method embodiments described above at runtime.
According to a further embodiment of the present application, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
According to a further embodiment of the application, there is also provided an electronic device comprising a memory having stored therein a computer program, and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to a further embodiment of the application, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of any of the method embodiments described above.
According to the present application, when it is determined that the first data backup request is used to request a data backup operation on data in the target storage volume, a first snapshot is created, and the first snapshot instructs the data blocks in the target storage volume to set IO operation identifiers when input/output (IO) operations are performed. In response to the first data backup request, the first target data block, whose target IO operation identifier is the same as the first snapshot identifier of the first snapshot, is found in the target storage volume, and the first target data in it is read. Finally, the data backup operation is performed on the first target data to back it up to the cloud server. When cloud backup is performed on the target storage volume, the IO operation identifier records which snapshot generation each IO operation belongs to, so no additional target volume needs to be created for the snapshot and all data remains in the target storage volume. This addresses the resource waste that data backup causes in the related art and allows data backup with low resource consumption.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be executed in a server apparatus or similar computing device. Taking the example of running on a server device, fig. 1 is a block diagram of a hardware structure of a server device of a data backup method according to an embodiment of the present application. As shown in fig. 1, the server device may include one or more (only one is shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a microprocessor MCU, a programmable logic device FPGA, or the like processing means) and a memory 104 for storing data, wherein the server device may further include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those of ordinary skill in the art that the architecture shown in fig. 1 is merely illustrative and is not intended to limit the architecture of the server apparatus described above. For example, the server device may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the data backup method in an embodiment of the present application. The processor 102 executes the computer program stored in the memory 104, thereby performing various functional applications and data processing, that is, implementing the method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located with respect to the processor 102, which may be connected to the server device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a server device. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as a NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
In this embodiment, a data backup method is provided, which is applied to a storage server, where the storage server is connected to a cloud server and a host server, the storage server includes a storage pool, and the storage pool includes a plurality of storage volumes, and fig. 2 is a flowchart of a data backup method according to an embodiment of the present application, as shown in fig. 2, and the flowchart includes the following steps:
Step S202, receiving a first data backup request sent by a host server;
Step S204, in the case that the first data backup request is determined to be used for requesting to execute the data backup operation on the data in the target storage volume, creating a first snapshot, wherein the target storage volume is any one of a plurality of storage volumes, and the first snapshot is used for indicating that the data block in the target storage volume is provided with an IO operation identifier when the input/output IO operation is executed;
Optionally, the storage server is configured to store data in a host server, and there is an IO operation between the host server and the storage server. The storage volume includes one or more data blocks therein, the data blocks for storing data.
Optionally, the IO operation identifier records which snapshot generation an IO operation belongs to. For example, for storage volume A, before any snapshot is created, every IO operation received by storage volume A carries the IO operation identifier 0. When a snapshot is created for the first time, it is the generation-0 snapshot, and its data is the data in all data blocks of storage volume A whose IO operation identifier is 0; after this snapshot is created, the IO operation identifier of every IO operation received by storage volume A is incremented by 1 and becomes 1. When a snapshot is created for the second time, it is the generation-1 snapshot, and its data is the data in all data blocks of storage volume A whose IO operation identifier is 1; after this snapshot is created, the IO operation identifier of every IO operation received by storage volume A is incremented by 1 and becomes 2.
Step S206, responding to a first data backup request, and reading first target data from a first target data block based on a first snapshot identification of the first snapshot, wherein the first target data block is at least one data block in a target storage volume, and a target IO operation identification of the first target data block is the same as the first snapshot identification;
Step S208, performing a data backup operation on the first target data to back up the first target data to the cloud server.
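The generation numbering and steps S202 to S208 can be illustrated with a minimal Python sketch. It is only an in-memory model under stated assumptions, not the implementation of the application: the names `StorageVolume`, `create_snapshot`, `read_snapshot_blocks`, and `backup` are hypothetical, the cloud server is stood in for by a plain dictionary, and each block keeps only its most recent write.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class StorageVolume:
    """Hypothetical in-memory model of one storage volume."""
    # Identifier stamped on every incoming IO operation; 0 before any snapshot.
    current_io_id: int = 0
    # lba -> (IO operation identifier, data); only the latest write is kept.
    blocks: Dict[int, Tuple[int, bytes]] = field(default_factory=dict)

    def write(self, lba: int, data: bytes) -> None:
        # Tag the block with the generation the write belongs to.
        self.blocks[lba] = (self.current_io_id, data)

    def create_snapshot(self) -> int:
        # The snapshot takes the current generation as its identifier;
        # later writes are tagged with the next generation.
        snapshot_id = self.current_io_id
        self.current_io_id += 1
        return snapshot_id

    def read_snapshot_blocks(self, snapshot_id: int) -> Dict[int, bytes]:
        # Select the blocks whose IO operation identifier equals the snapshot identifier.
        return {lba: data
                for lba, (io_id, data) in self.blocks.items()
                if io_id == snapshot_id}


def backup(volume: StorageVolume, cloud: Dict[int, Dict[int, bytes]]) -> int:
    """Create a snapshot, read the matching blocks, and store them in `cloud`
    (an assumed stand-in for the cloud server)."""
    snapshot_id = volume.create_snapshot()
    cloud[snapshot_id] = volume.read_snapshot_blocks(snapshot_id)
    return snapshot_id
```

In this sketch no separate snapshot volume is allocated: a snapshot is just an integer, and the data it refers to stays in the volume's own blocks, which mirrors the resource-saving idea described in this embodiment.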
Optionally, a typical application scenario of the above data backup method is an enterprise-level IT infrastructure that includes a storage server, a cloud server, and a host server. In this scenario, the host server is an internal server of the enterprise that runs critical business applications. The storage server manages a large number of storage volumes that store business data. The cloud server is a remote storage facility that the enterprise uses for backup and disaster recovery. The enterprise needs to back up key business data regularly to prevent data loss and to enable quick recovery when a system failure occurs. The host server sends a first data backup request to the storage server at a predetermined time or under a specific trigger condition. After receiving the first data backup request, the storage server determines the purpose of the request, creates a first snapshot for the specific target storage volume, and, according to the first snapshot identifier of the first snapshot, reads from the target storage volume the first target data blocks that match the snapshot identifier, where the first target data in the first target data blocks may include files, database records, or other key information. Finally, the storage server sends the read first target data to the cloud server for backup.
According to the present application, when it is determined that the first data backup request is used to request a data backup operation on data in the target storage volume, a first snapshot is created, and the first snapshot instructs the data blocks in the target storage volume to set IO operation identifiers when input/output (IO) operations are performed. In response to the first data backup request, the first target data block, whose target IO operation identifier is the same as the first snapshot identifier of the first snapshot, is found in the target storage volume, and the first target data in it is read. Finally, the data backup operation is performed on the first target data to back it up to the cloud server. When cloud backup is performed on the target storage volume, the IO operation identifier records which snapshot generation each IO operation belongs to, so no additional target volume needs to be created for the snapshot and all data remains in the target storage volume. This addresses the resource waste that data backup causes in the related art and allows data backup with low resource consumption.
In an exemplary embodiment, before the first snapshot is created, the method further includes: performing a mapping operation on the target storage volume to map the target storage volume to the host server, where the mapping operation includes sending a logical unit number of the target storage volume to the host server and instructing the host server to establish a connection with the target storage volume based on the logical unit number, so as to allow the host server to access the target storage volume; receiving a first IO operation request sent by the host server, where the information included in the first IO operation request includes identification information of the target storage volume, address information of a first target data block, data information of the IO operation to be performed, and the IO operation identifier; and, in response to the first IO operation request, performing the IO operation on the first target data block using the information included in the first IO operation request and recording the target IO operation identifier of the first target data block.
Alternatively, the identification information of the storage volume is used to inform the storage server of in which storage volume the data is stored, for example, the identification information of the storage volume is lun_id, which indicates that the data of the storage server is stored in the storage volume whose identification information is lun_id. The address information of the data block is used to specify where the data is stored on the storage volume, e.g., the address information of the data block is lba, which indicates where the data is stored on the storage volume at the address lba. The data information of the IO operation is used to specify the number of data bytes of the IO operation, for example, length, which is used to represent the length of the number of data bytes of the IO operation. As shown in FIG. 3, assuming a host server (e.g., a physical server or virtual machine running a Windows or Linux operating system) that needs to be linked to a storage volume for data read and write operations, including assuming that the Logical Unit Number (LUN) of the target storage volume is 1234, first, the volume management module in the storage server sends the logical unit number to the host server, after receiving the logical unit number, the host server establishes a link with the target storage volume according to the logical unit number, and configures a storage access path on the host server, e.g., in Linux, it may be necessary to create a device file (e.g., '/dev/sdX') to represent the target storage volume. 
After establishing a link with a target storage volume, the host server sends a first IO operation request to a volume management module in the storage server, informs the storage server that 8 bytes of data are to be written into a first target data block A with an address of '0 x 1000' on the target storage volume A in a storage pool by means of identification information of the target storage volume in the first IO operation request, address information of a first target data block-0 x1000, data information of IO operation to be executed and writing 8 bytes of data into the first target data block A, and then records a target IO operation identification 0 of the first target data block A of the target storage volume A by a snapshot module in the storage server.
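As a rough illustration of the request fields described above, the following Python sketch models a first IO operation request and how a storage server might apply it. The field names `lun_id`, `lba`, and `length` come from the description; the classes `IORequest` and `StoragePool` and the methods `map_volume` and `handle_write` are hypothetical names introduced only for this example, and the length check is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class IORequest:
    lun_id: int   # identification information of the target storage volume
    lba: int      # address information of the target data block
    length: int   # number of data bytes of the IO operation
    io_id: int    # IO operation identifier (snapshot generation)
    data: bytes   # payload to write


class StoragePool:
    """Hypothetical model: lun_id -> {lba: (io_id, data)}."""

    def __init__(self) -> None:
        self.volumes: Dict[int, Dict[int, Tuple[int, bytes]]] = {}

    def map_volume(self, lun_id: int) -> int:
        # Mapping: make the volume known and hand its LUN to the host.
        self.volumes.setdefault(lun_id, {})
        return lun_id

    def handle_write(self, req: IORequest) -> int:
        # Reject requests for unmapped volumes or with inconsistent length.
        if req.lun_id not in self.volumes:
            raise KeyError(f"volume {req.lun_id} is not mapped")
        if len(req.data) != req.length:
            raise ValueError("length does not match payload size")
        # Perform the write and record the block's IO operation identifier.
        self.volumes[req.lun_id][req.lba] = (req.io_id, req.data)
        return req.io_id
```

With the values from the example above (LUN 1234, address 0x1000, 8 bytes, identifier 0), a call would look like `pool.handle_write(IORequest(1234, 0x1000, 8, 0, b"ABCDEFGH"))` after `pool.map_volume(1234)`.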
The mapping operation and the IO operation request processing are performed before the data backup, so that the purpose of ensuring the smooth performance of the data backup operation and improving the performance of the storage host server and the safety of the data is achieved.
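As a minimal sketch of the request handling described above, the following Python fragment models a first IO operation request carrying lun_id, lba, length, and an IO operation identifier, and records the identifier alongside the written data. The class names `IORequest` and `StorageVolume`, the field name `io_op_id`, and the per-block version list are assumptions made for illustration; the embodiment does not prescribe a particular data layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IORequest:
    lun_id: int    # identification information of the target storage volume
    lba: int       # address information of the target data block
    length: int    # number of data bytes of the IO operation
    io_op_id: int  # IO operation identifier carried by the request


@dataclass
class StorageVolume:
    lun_id: int
    # Hypothetical layout: lba -> list of (io_op_id, payload), oldest first.
    blocks: dict = field(default_factory=dict)

    def handle_write(self, req: IORequest, payload: bytes) -> None:
        """Perform the write and record the IO operation identifier."""
        if req.lun_id != self.lun_id:
            raise ValueError("request addressed to a different storage volume")
        if len(payload) != req.length:
            raise ValueError("payload size does not match the length field")
        self.blocks.setdefault(req.lba, []).append((req.io_op_id, payload))


# Example from the text: LUN 1234, 8 bytes written at address 0x1000, identifier 0.
volume_a = StorageVolume(lun_id=1234)
volume_a.handle_write(
    IORequest(lun_id=1234, lba=0x1000, length=8, io_op_id=0), b"ABCDEFGH"
)
```

Keeping the identifier next to each written version is one possible way to let a later snapshot select data by generation.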
In an exemplary embodiment, after performing the IO operation on the first target data block by using the information included in the first IO operation request and recording the target IO operation identifier of the first target data block in response to the first IO operation request, the method further includes updating, in a case where it is determined that the IO operation performed on the first target data block is an initial IO operation, a first usage field of the first target data block to obtain a first target usage field, where the first usage field is set before receiving the IO operation request sent by the host server, and a field value of the first usage field is used to identify whether deletion of data in the first target data block is allowed.
Optionally, as shown in fig. 3, the cloud backup module in the storage server maintains a first usage field use_status for each data block in each storage volume, to indicate whether the data in the data block still needs to be kept in the storage pool of the storage server, and deletes redundant data in the storage pool according to the first usage field. For example, the initial value of the first usage field may be set to 2. The value is reduced by 1 when the host server writes data to the first target data block for the first time, and reduced by 1 again when the data in the first target data block is backed up to the cloud server. When the value reaches 0, neither the cloud backup task nor the IO operation of the host server uses the data of the first target data block any longer, and the cloud backup module notifies the data deletion service to delete the data in the first target data block.
According to this embodiment, the first usage field marks the usage status of each data block, so that useless data is deleted promptly during the cloud backup process, no extra storage space is occupied for a long time, and storage space is released.
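The counting scheme behind use_status can be sketched as follows. This is an illustrative model only: the class `UsageTracker`, its method names, and the `deleted` list standing in for the data deletion service are hypothetical; the initial value 2 and the two decrements come from the example above.

```python
class UsageTracker:
    """Tracks use_status per data block (keyed by block address)."""

    INITIAL_USE_STATUS = 2  # initial value given in the example

    def __init__(self):
        self.use_status = {}
        self.deleted = []  # stand-in for calls to the data deletion service

    def register(self, lba):
        self.use_status[lba] = self.INITIAL_USE_STATUS

    def _decrement(self, lba):
        if self.use_status[lba] == 0:
            return  # already released; nothing left to do
        self.use_status[lba] -= 1
        if self.use_status[lba] == 0:
            # Neither the host IO nor the cloud backup task needs the data.
            self.deleted.append(lba)

    def on_first_host_write(self, lba):
        self._decrement(lba)

    def on_cloud_backup_done(self, lba):
        self._decrement(lba)


tracker = UsageTracker()
tracker.register(0x1000)
tracker.on_first_host_write(0x1000)   # 2 -> 1
tracker.on_cloud_backup_done(0x1000)  # 1 -> 0, block becomes deletable
```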
In one exemplary embodiment, in the case that the first data backup request is determined to be for requesting to perform a data backup operation on data in a target storage volume, creating a first snapshot includes sending a snapshot instruction to the host server to instruct the host server to set the first snapshot identification to a second IO operation request and send the second IO operation request to the target storage volume in response to the snapshot instruction, determining the IO operation identification based on the first snapshot identification after performing the second IO operation in the first target data block in response to the second IO operation request, and setting the IO operation identification to the first target data block to create the first snapshot.
Optionally, assume that cloud backup is to be performed on the target storage volume A. Before the first cloud backup is performed, the storage server sends a snapshot instruction to the host server, and the host server sets the first snapshot identifier 0 in its IO operation requests, so that the IO operation identifiers of all IO operations received by the target storage volume A are 0. When the cloud backup is performed, the first snapshot is created for the first time; this snapshot is the 0th generation snapshot, and its corresponding data is the data of all data blocks in the target storage volume A whose IO operation identifier is 0. After the snapshot creation is completed, the IO operation identifier of all IO operations received by the target storage volume A is incremented by 1 and becomes 1. When the first snapshot is created for the second time, the snapshot is the 1st generation snapshot, and its corresponding data is the data of all data blocks in the target storage volume A whose IO operation identifier is 1. After that snapshot is created, the IO operation identifier of all IO operations received by the target storage volume A is incremented by 1 and becomes 2.
By creating the snapshot, the embodiment provides a flexible data backup strategy, and compared with the traditional full-volume backup, the snapshot technology reduces the requirement on storage resources and achieves the purpose of carrying out data backup with low resource consumption.
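The generation numbering described above can be illustrated with a small counter. The class name `SnapshotModule` and the attribute `current_gen` are assumptions for this sketch; only the behaviour (snapshot n covers identifier n, after which new IO carries n + 1) is taken from the example.

```python
class SnapshotModule:
    def __init__(self):
        # Identifier stamped on every IO operation received by the volume.
        self.current_gen = 0

    def create_snapshot(self) -> int:
        """Freeze the current generation and start stamping IO with the next one."""
        snapshot_id = self.current_gen
        self.current_gen += 1
        return snapshot_id


snapshots = SnapshotModule()
gen0 = snapshots.create_snapshot()  # 0th generation snapshot
gen1 = snapshots.create_snapshot()  # 1st generation snapshot
```

After the two calls, incoming IO operations would be stamped with identifier 2, matching the example.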
In an exemplary embodiment, reading first target data from a first target data block based on a first snapshot identifier of the first snapshot in response to the first data backup request includes receiving data acquisition indication information sent by the target storage volume, and in the case that the data acquisition indication information includes first indication information, searching, in response to the first data backup request, for an IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, where the first indication information is used to indicate that only incremental data is allowed to be read from the first target data block, and reading the incremental data from the first target data block to obtain the first target data.
Optionally, the backup may be a full backup or an incremental backup, and accordingly the obtained first target data may be incremental data or full data. Whether incremental data or full data is obtained is determined by the data acquisition indication information force_this_gen: if its value is true, the obtained first target data is incremental data. For example, assume that a third cloud backup is to be performed on the target storage volume A, which includes a data block A on which three IO operations have been performed (the first before the first cloud backup, the second and third after the first cloud backup) and a data block B on which one IO operation was performed after the first cloud backup. For the data block A, the IO operation identifier of the first IO operation is 0, that of the second IO operation is 1, and that of the third IO operation is 2; the IO operation identifier of the data block B is 1. Because this is the third cloud backup, the first snapshot identifier of the target storage volume A is 2. As shown in fig. 3, before acquiring the data in the target storage volume A from the storage pool, the snapshot module receives the data acquisition indication information force_this_gen, which includes the first indication information true. The snapshot module then traverses the IO operation identifier of each data block in the target storage volume, finds the first target data block A whose IO operation identifier is 2, and reads from the storage pool only the first target data (incremental data) written by the third IO operation on the first target data block A.
According to the embodiment, the data acquisition indication information is set to indicate whether the incremental data or the full data is acquired, so that the purpose of only backing up the data which changes since the last backup (namely the incremental data) instead of backing up the full data again and greatly reducing the required storage space is achieved.
In an exemplary embodiment, after receiving the data acquisition indication information sent by the target storage volume, the method further includes, in the case that the data acquisition indication information includes second indication information, searching, in response to the first data backup request, for an IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, where the second indication information is used to indicate that reading of all data included in the first target data block is allowed, and reading all data from the first target data block to obtain the first target data.
Optionally, the backup may be a full backup or an incremental backup, and accordingly the obtained first target data may be incremental data or full data. Whether incremental data or full data is obtained is determined by the data acquisition indication information force_this_gen: if its value is false, the obtained first target data is full data. For example, assume that a third cloud backup is to be performed on the target storage volume A, which includes a data block A on which three IO operations have been performed (the first before the first cloud backup, the second and third after the first cloud backup) and a data block B on which one IO operation was performed after the first cloud backup. For the data block A, the IO operation identifier of the first IO operation is 0, that of the second IO operation is 1, and that of the third IO operation is 2; the IO operation identifier of the data block B is 1. Because this is the third cloud backup, the first snapshot identifier of the target storage volume A is 2. As shown in fig. 3, before acquiring the data in the target storage volume A from the storage pool, the snapshot module receives the data acquisition indication information force_this_gen, which includes the second indication information false. The snapshot module traverses the IO operation identifier of each data block in the target storage volume and finds the first target data block A whose IO operation identifier is 2. The snapshot module then sequentially reads from the storage pool the first target data of the third IO operation, the first target data of the second IO operation, and the first target data of the first IO operation on the first target data block A (full data).
According to the embodiment, the data acquisition indication information is set to indicate whether incremental data or full data is acquired, so that the purpose of backing up the full data is achieved.
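The two read modes selected by force_this_gen can be sketched as one function. The function name `read_for_backup` and the in-memory layout (block name mapped to a list of (IO operation identifier, data) pairs) are assumptions for illustration; the example values reproduce the third cloud backup described above, with placeholder payloads.

```python
def read_for_backup(blocks, snapshot_id, force_this_gen):
    """Return the data to back up for each block matching snapshot_id.

    blocks: dict mapping a block name to a list of (io_op_id, data) pairs.
    force_this_gen=True  -> only data written under snapshot_id (incremental).
    force_this_gen=False -> all data of the matching block, newest first (full).
    """
    result = {}
    for name, versions in blocks.items():
        if not any(gen == snapshot_id for gen, _ in versions):
            continue  # block has no IO operation identifier equal to snapshot_id
        if force_this_gen:
            result[name] = [data for gen, data in versions if gen == snapshot_id]
        else:
            newest_first = sorted(versions, key=lambda v: v[0], reverse=True)
            result[name] = [data for _, data in newest_first]
    return result


# Third cloud backup: block A was written under identifiers 0, 1, 2; block B under 1.
blocks = {
    "A": [(0, b"a0"), (1, b"a1"), (2, b"a2")],
    "B": [(1, b"b1")],
}
incremental = read_for_backup(blocks, snapshot_id=2, force_this_gen=True)
full = read_for_backup(blocks, snapshot_id=2, force_this_gen=False)
```

In both modes only block A is selected, because block B carries no identifier equal to 2; the modes differ only in how much of block A is read.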
In an exemplary embodiment, reading first target data from a first target data block based on a first snapshot identification of the first snapshot includes traversing the target storage volume based on an initial snapshot identification of the first snapshot, determining data blocks included in the target storage volume that perform the IO operation as the first target data block, where the initial snapshot identification is a snapshot identification sent to the host server for the first time, sending a first read command to the first target data block, and receiving the first target data sent by the first target data block in response to the first read command.
Optionally, assume that the first cloud backup is to be performed on the target storage volume A, which includes a data block A on which an IO operation has been performed and a data block B on which no IO operation has been performed. Because an IO operation has been performed on the data block A, its IO operation identifier is 0; because no IO operation has been performed on the data block B, it has no IO operation identifier. Because the initial snapshot identifier of the target storage volume A is 0, the cloud backup module in the storage server sends a traversal instruction to the snapshot module in the storage server based on the initial snapshot identifier 0. The snapshot module traverses the IO operation identifier of each data block in the target storage volume and finds the first target data block A whose IO operation identifier is 0. The snapshot module then sends a first read instruction to the first target data block A, acquires the first target data of the first target data block A from the storage pool, and sends the data to the cloud backup module in the storage server.
According to the embodiment, the backup data is acquired based on the snapshot identification, so that a flexible data backup strategy is provided, and compared with the traditional full-scale backup, the purpose of carrying out data backup with low resource consumption is achieved.
In an exemplary embodiment, in response to the first data backup request, reading first target data from a first target data block based on a first snapshot identifier of the first snapshot includes traversing the target storage volume based on the first snapshot identifier to determine the first target data block, searching the IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, sending a second read instruction to the first target data block, and receiving the first target data sent by the first target data block in response to the second read instruction.
Optionally, assume that the second cloud backup is to be performed on the target storage volume A, which includes a data block A on which two IO operations have been performed (the first before the first cloud backup and the second after the first cloud backup) and a data block B on which one IO operation was performed after the first cloud backup. For the data block A, the IO operation identifier of the first IO operation is 0 and that of the second IO operation is 1; the IO operation identifier of the data block B is 1. Because this is the second cloud backup, the first snapshot identifier of the target storage volume A is 1, and the cloud backup module in the storage server sends a traversal instruction to the snapshot module in the storage server based on the first snapshot identifier 1. The snapshot module traverses the IO operation identifier of each data block in the target storage volume and finds the first target data block A and the first target data block B, whose IO operation identifier is 1. The snapshot module then sends a second read instruction to the first target data block A and the first target data block B, obtains from the storage pool the first target data of the second IO operation on the first target data block A and the first target data of the first IO operation on the first target data block B, and sends the data to the cloud backup module in the storage server.
By creating the snapshot, the embodiment provides a flexible data backup strategy, and achieves the purpose of carrying out data backup with low resource consumption.
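The block selection used in the first and second cloud backup examples can be sketched as a single traversal. The function name `select_target_blocks` and the addresses 0x1000 and 0x2000 assigned to data blocks A and B are hypothetical; each block is modelled simply as the set of IO operation identifiers recorded for it.

```python
def select_target_blocks(io_op_ids, snapshot_id):
    """Return the addresses of blocks that carry an identifier equal to snapshot_id."""
    return sorted(lba for lba, ids in io_op_ids.items() if snapshot_id in ids)


BLOCK_A, BLOCK_B = 0x1000, 0x2000  # illustrative addresses only

# First cloud backup: A was written once (identifier 0), B was never written.
first_backup_targets = select_target_blocks(
    {BLOCK_A: {0}, BLOCK_B: set()}, snapshot_id=0
)

# Second cloud backup: A was written under 0 and 1, B under 1.
second_backup_targets = select_target_blocks(
    {BLOCK_A: {0, 1}, BLOCK_B: {1}}, snapshot_id=1
)
```

The read instruction would then be issued only to the returned addresses, so unwritten blocks are never read.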
In one exemplary embodiment, performing the data backup operation on the first target data to backup the first target data to the cloud server includes invoking a data backup interface in the cloud server, wherein the data backup interface only allows transmission of the first target data read from the first target data block, and uploading the first target data to the cloud server through the data backup interface to backup the first target data in the cloud server.
Optionally, as shown in fig. 3, after the cloud backup module obtains the first target data, the cloud backup module backs up the first target data to the cloud server by calling a data backup interface provided by the cloud server. The embodiment achieves the aim of ensuring the safety of data in the transmission process by calling the data backup interface in the cloud server.
In an exemplary embodiment, after performing the data backup operation on the first target data to backup the first target data to the cloud server, the method further includes updating a second usage field of the first target data block to obtain a second target usage field, where the second usage field is set when the first target data block performs the IO operation, a field value of the second usage field is used to identify whether to allow deletion of data in the first target data block, and deleting the first target data to release a storage space of the first target data block if the field value of the second target usage field is a target field value.
Optionally, as shown in fig. 3, the cloud backup module in the storage server maintains a second usage field use_status for each data block in each storage volume, to indicate whether the data in the data block still needs to be kept in the storage pool of the storage server, and deletes redundant data in the storage pool according to the second usage field. For example, the initial value of the first usage field of the first target data block A is 2. When the host server writes data to the first target data block A for the first time, the value of the first usage field is reduced by 1, so that the value of the second usage field is 1. When the data in the first target data block A is backed up to the cloud server, the value of the second usage field is reduced by 1 and becomes 0, which indicates that neither the cloud backup task nor the IO operation of the host server uses the data of the first target data block A any longer; at this time, the cloud backup module notifies the data deletion service to delete the data in the first target data block A. According to this embodiment, the second usage field marks the usage status of each data block, so that useless data is deleted promptly during the cloud backup process, no extra storage space is occupied for a long time, and storage space is released.
In an exemplary embodiment, the storage server further includes a data deletion service, and after the data backup operation is performed on the first target data to backup the first target data to the cloud server, the method further includes traversing the target storage volume, searching for second target data from the target storage volume, where the second target data is data that is not backed up to the cloud server in the target storage volume, and invoking the data deletion service to perform a deletion operation on the second target data through the data deletion service.
Optionally, the storage server may hold data that was backed up to other devices without the first snapshot being created. To prevent such data from lingering, after the cloud backup the storage server promptly searches the target storage volume for data that was backed up to other devices and deletes this expired data. For example, assume that the target storage volume A includes first target data A backed up to the cloud server and second target data B backed up to a backup server. After the first target data A is backed up to the cloud server, the storage server traverses the target storage volume, finds the second target data B, which is not backed up to the cloud server but is backed up to the backup server, and invokes the data deletion service to delete the second target data B. According to this embodiment, data that is not backed up to the cloud server but is backed up to other servers is deleted, so that storage space is released, the use of storage resources is optimized, and storage cost is reduced.
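A minimal sketch of this cleanup step follows. The function name `purge_non_cloud_data`, the labels "cloud" and "backup_server", and the representation of the data deletion service as a plain callable are all assumptions for illustration.

```python
def purge_non_cloud_data(volume, delete_service):
    """Delete every entry that has not been backed up to the cloud server.

    volume: dict mapping a data name to the set of destinations it was backed up to.
    delete_service: callable invoked once per deleted entry.
    Returns the names of the deleted entries.
    """
    stale = [name for name, targets in volume.items() if "cloud" not in targets]
    for name in stale:
        delete_service(name)  # stand-in for the data deletion service
        del volume[name]
    return stale


# Example from the text: data A is in the cloud, data B only on a backup server.
volume_a = {"A": {"cloud"}, "B": {"backup_server"}}
deletion_log = []
removed = purge_non_cloud_data(volume_a, deletion_log.append)
```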
The application is illustrated below with reference to specific examples:
In this embodiment, fig. 4 is a flowchart of a data backup method according to an embodiment of the present application. As shown in fig. 4, the method specifically includes the following steps:
S402, assuming that the Logical Unit Number (LUN) of the target storage volume A is 1234, as shown in fig. 3, the volume management module in the storage server first sends the logical unit number to the host server; after receiving the logical unit number, the host server establishes a link with the target storage volume according to the logical unit number and configures a storage access path on the host server. After establishing the link with the target storage volume A, the host server sends a first IO operation request to the volume management module in the storage server. Through the identification information of the target storage volume, the address information of the first target data blocks (0x1000 and 0x2000), and the data information of the IO operations to be executed carried in the first IO operation request, the host server informs the storage server that 8 bytes of data are to be written into the first target data block A at address 0x1000 and 4 bytes of data are to be written into the first target data block B at address 0x2000 on the target storage volume A in the storage pool. The snapshot module in the storage server then records the target IO operation identifier 0 of the first target data block A and the target IO operation identifier 0 of the first target data block B of the target storage volume A, and the cloud backup module in the storage server updates the first usage field of the first target data block A from 2 to 1 and the first usage field of the first target data block B from 2 to 1 to obtain the first target usage fields. The cloud backup then starts;
S404, as shown in fig. 3, the cloud backup module responds to the first data backup request triggered directly by the host server or the user, and creates the first snapshot identifier of the target storage volume A through the snapshot module; because this is the first cloud backup, the first snapshot identifier is 0.
During this process, the storage server again receives a first IO operation request sent by the host server. Through the identification information of the target storage volume, the address information of the first target data block (0x1000), and the data information of the IO operation to be executed (writing 1 byte of data) carried in the first IO operation request, the host server informs the storage server that 1 byte of data is to be written into the first target data block A at address 0x1000 on the target storage volume A in the storage pool. The snapshot module in the storage server then records the target IO operation identifier 1 and the first usage field 1 of the first target data block A of the target storage volume A;
S406, before acquiring the data in the target storage volume A from the storage pool, the snapshot module receives the data acquisition indication information force_this_gen, which includes the first indication information true. The snapshot module traverses the IO operation identifier of each data block in the target storage volume, finds the first target data block A and the first target data block B whose IO operation identifier is 0, and sequentially reads the first target data in the first target data blocks A and B from the storage pool. The first target data is then passed to the cloud backup module.
During this process, the cloud backup module again responds to a first data backup request triggered directly by the host server or the user, and creates the first snapshot identifier of the target storage volume A through the snapshot module; because this is the second cloud backup, the first snapshot identifier is 1. The snapshot module traverses the IO operation identifier of each data block in the target storage volume, finds the first target data block A whose IO operation identifier is 1, and reads from the storage pool the first target data of the second IO operation on the first target data block A. The first target data is then passed to the cloud backup module;
S408, the cloud backup module backs up the first target data to the cloud server, updates the second usage fields of the first target data block A and the first target data block B from 1 to 0 to obtain the second target usage fields, and notifies the data deletion service to delete the first target data in the first target data block A and the first target data block B in the storage pool, thereby reclaiming the storage space;
S410, the cloud backup ends.
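Steps S402 to S410 can be walked through with a small in-memory model. This is a sketch under stated assumptions rather than the claimed implementation: the class `StorageServerSketch`, its method names, the payload contents, and the list standing in for the cloud server are all hypothetical; the addresses, byte counts, identifiers, and usage-field values follow the steps above.

```python
from collections import defaultdict

A, B = 0x1000, 0x2000  # block addresses from S402


class StorageServerSketch:
    def __init__(self):
        self.current_gen = 0               # identifier stamped on incoming IO
        self.versions = defaultdict(list)  # lba -> [(io_op_id, payload)]
        self.use_status = {}               # lba -> usage field
        self.cloud = []                    # stand-in for the cloud server
        self.deleted = []                  # stand-in for the data deletion service

    def write(self, lba, payload):
        if lba not in self.use_status:
            self.use_status[lba] = 2 - 1   # initial value 2, reduced on first write
        self.versions[lba].append((self.current_gen, payload))

    def create_snapshot(self):
        snapshot_id = self.current_gen
        self.current_gen += 1
        return snapshot_id

    def backup(self, snapshot_id):
        """Read the data written under snapshot_id and upload it."""
        data = {
            lba: payload
            for lba, versions in self.versions.items()
            for gen, payload in versions
            if gen == snapshot_id
        }
        self.cloud.append((snapshot_id, data))
        return data

    def release(self, lbas):
        for lba in lbas:
            self.use_status[lba] -= 1
            if self.use_status[lba] == 0:
                self.deleted.append(lba)
                self.versions.pop(lba, None)  # reclaim the storage space


server = StorageServerSketch()
server.write(A, b"8bytes!!")     # S402: 8 bytes to block A, identifier 0
server.write(B, b"4byt")         # S402: 4 bytes to block B, identifier 0
snap0 = server.create_snapshot() # S404: first snapshot identifier is 0
server.write(A, b"x")            # 1 byte to block A, identifier 1
first = server.backup(snap0)     # S406: blocks A and B, identifier 0
snap1 = server.create_snapshot() # second backup request, identifier 1
second = server.backup(snap1)    # block A, second IO operation
server.release([A, B])           # S408: usage fields 1 -> 0, data deleted
```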
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present application.
The embodiment also provides a data backup device, which is applied to a system on chip and applied to a storage server, wherein the storage server is respectively connected with a cloud server and a host server, the storage server comprises a storage pool, and the storage pool comprises a plurality of storage volumes. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 5 is a block diagram of a data backup apparatus according to an embodiment of the present application, as shown in fig. 5, the apparatus includes:
a first receiving module 502, configured to receive a first data backup request sent by the host server;
a first creating module 504, configured to create a first snapshot in the case that the first data backup request is determined to be for requesting to perform a data backup operation on data in a target storage volume, where the target storage volume is any one of the storage volumes, and the first snapshot is used to instruct a data block in the target storage volume to set an IO operation identifier when performing an input/output IO operation;
a first response module 506, configured to respond to the first data backup request, and read first target data from a first target data block based on a first snapshot identifier of the first snapshot, where the first target data block is at least one data block in the target storage volume, and a target IO operation identifier of the first target data block is the same as the first snapshot identifier;
a first backup module 508, configured to perform the data backup operation on the first target data, so as to back up the first target data to the cloud server.
In an exemplary embodiment, the device further comprises a first mapping module, configured to perform a mapping operation on the target storage volume to map the target storage volume to the host server before the first snapshot is created, where the mapping operation includes sending a logical unit number of the target storage volume to the host server and instructing the host server to establish a connection with the target storage volume based on the logical unit number to allow the host server to access the target storage volume; a second receiving module, configured to receive a first IO operation request sent by the host server, where the first IO operation request includes identification information of the target storage volume, address information of a first target data block, data information of the IO operation to be performed, and the IO operation identifier; and a second response module, configured to, in response to the first IO operation request, perform the IO operation on the first target data block using the information included in the first IO operation request and record the target IO operation identifier of the first target data block.
In an exemplary embodiment, the apparatus further includes a first update module, configured to, after the IO operation is performed on the first target data block using the information included in the first IO operation request and the target IO operation identifier of the first target data block is recorded in response to the first IO operation request, update a first usage field of the first target data block to obtain a first target usage field in the case that the IO operation performed on the first target data block is determined to be an initial IO operation, where the first usage field is set before the IO operation request sent by the host server is received, and a field value of the first usage field is used to identify whether deletion of data in the first target data block is allowed.
In an exemplary embodiment, the first creation module includes a first sending submodule, configured to send a snapshot instruction to the host server to instruct the host server to set the first snapshot identifier to a second IO operation request and send the second IO operation request to the target storage volume, and a first response submodule, configured to determine the IO operation identifier based on the first snapshot identifier after the second IO operation is performed in the first target data block and set the IO operation identifier to the first target data block to create the first snapshot in response to the second IO operation request.
In an exemplary embodiment, the first response module includes a first receiving sub-module configured to receive data acquisition indication information sent by the target storage volume, a second response sub-module configured to, in response to the first data backup request when the data acquisition indication information includes first indication information, search for an IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, where the first indication information is configured to indicate that only incremental data is allowed to be read from the first target data block, and a first reading sub-module configured to read the incremental data from the first target data block to obtain the first target data.
In an exemplary embodiment, the first response module further includes a third response sub-module, configured to, after the data acquisition indication information sent by the target storage volume is received and in the case that the data acquisition indication information includes second indication information, search, in response to the first data backup request, for an IO operation identifier that matches the first snapshot identifier from the target storage volume to determine the first target data block, where the second indication information is used to indicate that reading of all data included in the first target data block is allowed, and a second reading sub-module, configured to read all data from the first target data block to obtain the first target data.
In an exemplary embodiment, the first response module includes a first traversing sub-module configured to traverse the target storage volume based on an initial snapshot identifier of the first snapshot, and determine data blocks included in the target storage volume that perform the IO operation as the first target data blocks, where the initial snapshot identifier is a snapshot identifier sent to the host server for the first time, send a first read command to the first target data blocks, and a second receiving sub-module configured to receive the first target data sent by the first target data blocks in response to the first read command.
In an exemplary embodiment, the first response module includes a second traversing sub-module configured to traverse the target storage volume based on the first snapshot identifier to find an IO operation identifier matching the first snapshot identifier from the target storage volume to determine the first target data block, send a second read instruction to the first target data block, and a third receiving sub-module configured to receive the first target data sent by the first target data block in response to the second read instruction, where the first data backup request is another backup request.
In an exemplary embodiment, the first backup module includes a first calling sub-module configured to call a data backup interface in the cloud server, where the data backup interface only allows transmission of the first target data read from the first target data block, and a first uploading sub-module configured to upload the first target data to the cloud server through the data backup interface to backup the first target data in the cloud server.
In an exemplary embodiment, the device further includes a second updating module configured to, after the data backup operation is performed on the first target data so that the first target data is backed up to the cloud server, update a second usage field of the first target data block to obtain a second target usage field, where the second usage field is set when the first target data block performs the IO operation, and a field value of the second usage field identifies whether deletion of the data in the first target data block is allowed, and a first deleting module configured to delete the first target data when the field value of the second target usage field is a target field value, so as to release the storage space of the first target data block.
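The interplay between the usage field and the deletion step can be sketched as below. The constants `DELETABLE` and `NOT_DELETABLE`, the class `Block` and both function names are hypothetical; the source does not specify concrete field values, so the value 1 merely stands in for the "target field value".

```python
from dataclasses import dataclass
from typing import Optional

NOT_DELETABLE = 0   # assumed value set when the block performs an IO operation
DELETABLE = 1       # assumed "target field value" that permits deletion


@dataclass
class Block:
    data: Optional[bytes]
    usage_field: int = NOT_DELETABLE


def mark_backed_up(block: Block) -> None:
    """Update the usage field once the block's data is in the cloud server."""
    block.usage_field = DELETABLE


def release_if_allowed(block: Block) -> bool:
    """Delete the block's data only when the usage field permits it."""
    if block.usage_field == DELETABLE:
        block.data = None   # releases the storage space in this sketch
        return True
    return False


block = Block(data=b"payload")
released_before_backup = release_if_allowed(block)   # backup not yet done
mark_backed_up(block)
released_after_backup = release_if_allowed(block)
```

Gating deletion on the field value means that data which has not yet reached the cloud server cannot be released by this path.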
In an exemplary embodiment, the storage server further includes a data deletion service, and the device further includes a second traversing module configured to, after the data backup operation is performed on the first target data so that the first target data is backed up to the cloud server, traverse the target storage volume and find second target data in the target storage volume, where the second target data is data in the target storage volume that is not backed up to the cloud server, and a first calling module configured to call the data deletion service so as to perform a deletion operation on the second target data through the data deletion service.
According to yet another embodiment of the present application, there is further provided a storage server, where the storage server is connected to a cloud server and a host server, respectively, and where the storage server includes a storage pool, where the storage pool includes a plurality of storage volumes, where the storage server is configured to perform the steps in any of the method embodiments described above at runtime.
According to a further embodiment of the present application, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
It should be noted that each of the above modules may be implemented by software or hardware. In the latter case, this may be achieved by, but is not limited to, the following: the above modules are all located in the same processor, or the above modules are located in different processors in any combination.
Embodiments of the present application also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
In an exemplary embodiment, the computer readable storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium in which a computer program may be stored.
An embodiment of the application also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
In an exemplary embodiment, the electronic device may further include a transmission device connected to the processor, and an input/output device connected to the processor.
Embodiments of the application also provide a computer program product comprising a computer program which, when executed by a processor, implements the steps of any of the method embodiments described above.
Embodiments of the present application also provide another computer program product comprising a non-volatile computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the method embodiments described above.
Embodiments of the present application also provide a computer program comprising computer instructions stored on a computer-readable storage medium, where a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions to cause the computer device to perform the steps of any of the method embodiments described above.
For specific examples of this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations, which are not repeated here.
It will be appreciated by those skilled in the art that the modules or steps of the application described above may be implemented on a general-purpose computing device. They may be concentrated on a single computing device or distributed across a network of computing devices, and they may be implemented in program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that shown or described herein. Alternatively, the modules or steps may each be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The above description covers only preferred embodiments of the present application and is not intended to limit the present application; those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, or improvement made within the principle of the present application shall fall within the protection scope of the present application.