[go: up one dir, main page]

CN114647533B - Data repair method and device - Google Patents

Data repair method and device Download PDF

Info

Publication number
CN114647533B
CN114647533B CN202210238315.4A CN202210238315A CN114647533B CN 114647533 B CN114647533 B CN 114647533B CN 202210238315 A CN202210238315 A CN 202210238315A CN 114647533 B CN114647533 B CN 114647533B
Authority
CN
China
Prior art keywords
repaired
blocks
data
batch
repair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210238315.4A
Other languages
Chinese (zh)
Other versions
CN114647533A (en
Inventor
陈林芳
景磊
侯琳
丁贵军
郝敬龙
田帅
潘青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong Technology Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Information Technology Co Ltd filed Critical Jingdong Technology Information Technology Co Ltd
Priority to CN202210238315.4A priority Critical patent/CN114647533B/en
Publication of CN114647533A publication Critical patent/CN114647533A/en
Application granted granted Critical
Publication of CN114647533B publication Critical patent/CN114647533B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1044Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices with specific ECC/EDC distribution
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device
    • G06F3/0676Magnetic disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

本公开提出了一种数据修复方法和装置,涉及计算机技术领域。其中的数据修复方法包括:确定当前节点中的待修复编码块;根据所述待修复编码块在磁盘中的分布信息,对所述待修复编码块进行排序;按照所述排序,依次获取修复所述待修复编码块所需的可用块;根据修复所述待修复编码块所需的可用块,对所述待修复编码块进行修复。本公开能够顺序拉取并修复数据,减少读写磁盘的输入输出(IO)次数,降低磁盘占用率,提升数据修复效率。

The present disclosure proposes a data repair method and device, which relate to the field of computer technology. The data repair method includes: determining the coding blocks to be repaired in the current node; sorting the coding blocks to be repaired according to the distribution information of the coding blocks to be repaired in the disk; obtaining the available blocks required to repair the coding blocks to be repaired in sequence according to the sorting; repairing the coding blocks to be repaired according to the available blocks required to repair the coding blocks to be repaired. The present disclosure can sequentially pull and repair data, reduce the input and output (IO) times of reading and writing disks, reduce disk occupancy, and improve data repair efficiency.

Description

Data restoration method and device
Technical Field
The present disclosure relates to the field of computers, and in particular, to a data repair method and apparatus.
Background
The distributed storage system usually adopts an Erasure Code (EC) technology, and original data is encoded to obtain redundancy through an erasure code algorithm, and the data and the redundancy are stored together to achieve the fault-tolerant purpose. The data segments are stored on each node, when an abnormality occurs in a certain node or the data is behind, the data can be pulled from other nodes, and the data of the current node is calculated according to a corresponding algorithm.
In the related art, a common implementation scheme for restoring data of a certain node is that a key (key) corresponding to a data block to be restored is queried from a metadata (meta) file, and data corresponding to the key is pulled from other nodes for restoration for one data block to be restored each time until all the data blocks to be restored are successfully restored. In this way, the disks are directly read and written randomly according to the sequence of keys in the meta file, which results in high disk occupancy rate and low data repair efficiency. In addition, after one data block is repaired, the content of the next data block is obtained for repairing, and the repairing speed is limited.
Disclosure of Invention
The technical problem to be solved by the present disclosure is to provide a solution, which can sequentially pull and repair data, reduce the number of input/output (IO) times of reading and writing a disk, reduce the disk occupancy rate, and improve the data repair efficiency.
According to one aspect of the disclosure, a data restoration method is provided, which comprises the steps of determining a coding block to be restored in a current node, sequencing the coding blocks to be restored according to distribution information of the coding blocks to be restored in a disk, sequentially obtaining available blocks required by restoring the coding blocks to be restored according to the sequencing, and restoring the coding blocks to be restored according to the available blocks required by restoring the coding blocks to be restored.
In some embodiments, the ordered blocks to be repaired are divided into at least one batch, available blocks required for repairing the blocks to be repaired of each batch are obtained in batches, and the blocks to be repaired of each batch are repaired in batches according to the available blocks required for repairing the blocks to be repaired of each batch.
In some embodiments, bulk fetching of available blocks is performed asynchronously to bulk repair.
In some embodiments, the batch acquisition of available blocks required to repair the coded blocks to be repaired of each batch comprises selecting a plurality of nodes from other nodes except the current node in the distributed storage system, concurrently initiating batch data acquisition requests to the plurality of nodes, and receiving available blocks returned by the plurality of nodes to repair the coded blocks to be repaired of the batch.
In some embodiments, the method further comprises the step of inquiring a storage information table according to keys of the coding blocks to be repaired after the coding blocks to be repaired in the current node are determined, so as to obtain distribution information of the coding blocks to be repaired in a disk.
In some embodiments, the method further comprises the steps of reading the repaired coding blocks through an asynchronous reader-writer, writing the repaired coding blocks into a pre-written log file, and writing the repaired coding blocks stored in the pre-written log file into a disk in batches when preset disk writing conditions are met.
In some embodiments, the preset write disc condition includes the number of data blocks written to the pre-written log file being greater than a block number threshold, or the total number of bytes written to the pre-written log file being greater than a byte number threshold.
According to another aspect of the disclosure, a data repairing device is further provided, which comprises a determining module configured to determine a to-be-repaired coding block in a current node, a sorting module configured to sort the to-be-repaired coding blocks according to distribution information of the to-be-repaired coding blocks in a disk, an obtaining module configured to sequentially obtain available blocks required by repairing the to-be-repaired coding blocks according to the sorting, and a repairing module configured to repair the to-be-repaired coding blocks according to the available blocks required by repairing the to-be-repaired coding blocks.
In some embodiments, the acquisition module is configured to divide the ordered blocks to be repaired into at least one batch, obtain available blocks required for repairing each batch of blocks to be repaired in batches, and repair the blocks to be repaired in batches according to the available blocks required for repairing each batch of blocks to be repaired.
In some embodiments, the acquisition module acquires available blocks in bulk and the repair module performs repair asynchronously.
In some embodiments, the acquisition module is configured to select a plurality of nodes from other nodes in the distributed storage system except the current node, concurrently initiate a batch data acquisition request to the plurality of nodes, and receive available blocks returned by the plurality of nodes for repairing the batch of coded blocks to be repaired.
In some embodiments, the system further comprises a query module configured to query a stored information table according to keys of the to-be-repaired coded block to obtain distribution information of the to-be-repaired coded block in a disk.
In some embodiments, the system further comprises a writing module, wherein the writing module is configured to read the repaired coding blocks through an asynchronous reader-writer, write the repaired coding blocks into a pre-written log file, and write the repaired coding blocks stored in the pre-written log file into a disk in batches when preset disk writing conditions are met.
In some embodiments, the preset write disc condition includes the number of data blocks written to the pre-written log file being greater than a block number threshold, or the total number of bytes written to the pre-written log file being greater than a byte number threshold.
According to another aspect of the present disclosure, there is also presented a data repair apparatus comprising a memory, and a processor coupled to the memory, the processor configured to perform a data repair method as described above based on instructions stored in the memory.
According to another aspect of the disclosure, a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described data restoration method is also presented.
Compared with the related art, in the embodiment of the disclosure, by determining the coding blocks to be repaired in the current node, sorting the coding blocks to be repaired according to the distribution information of the coding blocks to be repaired in the disk, sequentially acquiring the available blocks required for repairing the coding blocks to be repaired according to the sorting, repairing the coding blocks to be repaired based on the acquired available blocks, sequentially pulling and repairing data, reducing the input/output (IO) times of reading and writing the disk, reducing the disk occupancy rate, and improving the data repairing efficiency.
Other features of the present disclosure and its advantages will become apparent from the following detailed description of exemplary embodiments of the disclosure, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The disclosure will be understood more clearly from the following detailed description, with reference to the accompanying drawings,
Wherein:
fig. 1 is a flow diagram of a data repair method according to some embodiments of the present disclosure.
Fig. 2 is a flow diagram of obtaining available blocks needed to repair a coded block to be repaired according to some embodiments of the present disclosure.
Fig. 3 is a flow diagram of repairing a coded block to be repaired according to an available block according to some embodiments of the present disclosure.
Fig. 4 is a flow chart of a data repair method according to other embodiments of the present disclosure.
Fig. 5 is a schematic structural diagram of a data repair device according to some embodiments of the present disclosure.
Fig. 6 is a schematic structural view of a data repair device according to other embodiments of the present disclosure.
Fig. 7 is a schematic structural view of a data repair device according to further embodiments of the present disclosure.
Fig. 8 is a schematic diagram of a computer system according to some embodiments of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but should be considered part of the specification where appropriate.
In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same.
Fig. 1 is a flow diagram of a data repair method according to some embodiments of the present disclosure. As shown in fig. 1, the data repair method includes:
and step S100, determining a coding block to be repaired in the current node.
The coding blocks refer to all blocks obtained by performing coding algorithm operation such as erasure coding algorithm on the data blocks. The encoded blocks include a data block and a check block. A data block refers to a block generated by dividing an original data object (such as a file), and a check block is a redundant block obtained by encoding the original data block. The code blocks corresponding to one file are stored in a plurality of nodes in the distributed storage system. The node may be a computer device in a cluster of computer devices, or may be a disk under the computer device, for example. When a certain node is abnormal or data lag occurs, the code block stored by the node needs to be repaired.
In some embodiments, each node stores a plurality of directory files, such as metadata directory files for storing information such as keys for coded blocks. In these embodiments, the key (key) of the coded block to be repaired in the current node is obtained from a local metadata directory file, such as a user metadata (userMeta) file. For example, for node 1, the userMeta file in node 1 is accessed to obtain the keys of all the code blocks to be repaired in node 1, thereby determining the code blocks to be repaired in the current node.
In other embodiments, a data repair request is received, the data repair request including a key of a coded block to be repaired in a current node, and the data repair request is parsed to obtain the key of the coded block to be repaired in the current node.
And step S120, sorting the coded blocks to be repaired according to the distribution information of the coded blocks to be repaired in the magnetic disk.
In an actual application scenario, the keys of the to-be-repaired coding blocks in the current node obtained in step S100 are often ordered according to service requirements, rather than ordered according to the distribution of the coding blocks on the disk. If the key of the code block to be repaired obtained in step S100 is directly repaired, the problem that the disk IO occupies a relatively high area due to random reading and writing and the data repairing efficiency is low will occur. In view of this, the inventors of the present disclosure think that, after determining a to-be-repaired encoded block in a current node, distribution information of the to-be-repaired encoded block in a disk is obtained, and the to-be-repaired encoded block is ordered according to the distribution information of the to-be-repaired encoded block in the disk, so as to solve the problems that in a data repair process, random read-write causes high disk IO occupation and low data repair efficiency.
In some embodiments, the distribution information of the coded blocks to be repaired in the disk comprises the storage address of the coded blocks to be repaired in the disk. In these embodiments, the keys of the code blocks to be repaired are ordered according to the order of the memory addresses of the code blocks to be repaired in the disk. For example, assume that the keys of the to-be-repaired encoded block obtained in step S100 are key1, key2, key3, and key4, and the keys of the to-be-repaired encoded block after being sequenced are key2, key4, key1, and key3 according to the sequence of the storage addresses of the to-be-repaired encoded block in the disk.
And step 140, sequentially acquiring available blocks required for repairing the coding blocks to be repaired according to the sequence.
According to the principle of erasure coding algorithm, n+m coding blocks can be obtained by coding n data blocks, wherein the n+m coding blocks comprise n data blocks and m check blocks, and when m coding blocks are lost in the n+m coding blocks, the lost coding blocks can be restored by any n available coding blocks (i.e. available blocks).
In a distributed storage system, the same file is encoded into a plurality of encoded blocks that are stored in a distributed manner at a plurality of nodes, each node in turn storing encoded blocks of a plurality of files. In each node, the storage order of the code blocks of the plurality of files in the disk is consistent. For example, the code block a1 of the file a, the code block b1 of the file b, and the code block c1 of the file c are sequentially stored in the node 1, the code block a2 of the file a, the code block b2 of the file b, and the code block c2 of the file c are sequentially stored in the node 2, and the code block a3 of the file a, the code block b3 of the file b, and the code block c3 of the file c are sequentially stored in the node 3.
In view of this, after the encoded blocks to be repaired are ordered according to step S120, available blocks required for repairing the encoded blocks to be repaired are sequentially acquired from other nodes according to the ordering. In this way, compared with the method for reading available blocks out of order in the related art, the method reduces disk IO by reading available blocks sequentially, improves the efficiency of obtaining available blocks, and further improves the efficiency of data repair.
And step S160, repairing the coding block to be repaired according to the available block required by repairing the coding block to be repaired.
In the embodiment of the disclosure, the data restoration of the current node is realized through the steps. Compared with the related art, the method and the device can sequentially pull and repair the data, reduce the IO times of reading and writing the disk in the data repair process, reduce the disk occupancy rate and improve the data repair efficiency.
Fig. 2 is a flow diagram of obtaining available blocks needed to repair a coded block to be repaired according to some embodiments of the present disclosure. Fig. 2 is a detailed description of step S140. As shown in fig. 2, step S140 includes step S141 and step S142.
In step S141, the sorted blocks to be repaired are divided into at least one lot.
In some embodiments, the number of batches into which the ordered encoded blocks to be repaired are divided is determined according to factors such as the data volume of the encoded blocks to be repaired, the distribution condition on the disk, the network performance, the memory occupation and the like.
In step S142, available blocks required for repairing the coded blocks to be repaired of each batch are acquired in batches.
In some embodiments, step S142 specifically includes selecting a plurality of nodes from other nodes in the distributed storage system than the current node, concurrently initiating a batch data acquisition request to the selected plurality of nodes for each batch of coded blocks to be repaired, and receiving available blocks returned by the selected plurality of nodes for repairing the batch of coded blocks to be repaired.
In specific implementation, the plurality of nodes can be arbitrarily selected from other nodes except the current node in the distributed storage system, or can be selected according to a set selection strategy. After a plurality of nodes are selected, judging whether to establish connection with each node, if so, sending a batch data acquisition request to the node through the connection, and if not, establishing connection with the node according to the IP address and port (port) of the node and sending the batch data acquisition request to the node through the connection. Wherein a list of data to be pulled is declared in the bulk data acquisition request. The selected plurality of nodes respond to the batch data acquisition request and sequentially return available blocks required for repairing the batch of coded blocks to be repaired. Illustratively, the data repair device may communicate with other nodes via a Remote Procedure Call (RPC) protocol, such as gRPC.
In some embodiments, considering retry and fusing problems when abnormal conditions such as node network jitter are considered, when the data repairing device obtains available blocks required for repairing the coded blocks to be repaired of each batch in batches, the method further comprises the steps of judging whether available blocks required for repairing the coded blocks to be repaired of the current batch returned by the selected node are received within preset time, if yes, continuing to send batch data obtaining requests for the next batch to the selected node, if no, judging whether a retry number threshold is reached, if not, restarting the batch data obtaining requests for the current batch to the selected node, adding 1 to the retry number, and if the retry number threshold is reached, outputting error prompt information, or reselecting one node from other nodes except the current node and the selected nodes, and sending batch data obtaining requests to the newly selected node. Through the processing steps, the applicability and flexibility of the data modification method are improved.
In the embodiment of the disclosure, the available blocks required for repairing the data to be repaired are sequentially and batched obtained through the steps, the number of times of sending requests to other nodes is reduced, the number of IO (input/output) times of a disk is reduced, and the efficiency of obtaining the available blocks from the selected multiple nodes is improved. Further, by concurrently initiating batch data acquisition requests to the selected plurality of nodes, the efficiency of acquiring available blocks can be further improved, and further, the efficiency of data repair can be improved.
Fig. 3 is a flow diagram of repairing a coded block to be repaired according to an available block according to some embodiments of the present disclosure. Fig. 3 is a detailed description of step S160. As shown in fig. 3, step S160 includes step S161 and step S162.
In step S161, available blocks required for repairing the coded blocks to be repaired of each batch are asynchronously read.
In some embodiments, to further improve the data repair efficiency, step S140 is performed asynchronously with step S160. For example, the batch data repair work shown in step S160 is performed by the main thread, the batch data acquisition work shown in step S140 is performed by the created coroutine, and after the available block is acquired from the selected node, the acquired available block is written into the channel.
In some embodiments, step S161 specifically includes determining whether there is data in the channel, if there is data in the channel, obtaining data from the channel, and determining whether the data obtained from the channel is erroneous, if there is no error in the data obtained from the channel, recording the data, that is, an available block required for repairing the encoded block to be repaired, so as to repair the encoded block to be repaired based on the available block, if there is an error in the data obtained from the channel, adding 1 to the error count value, and determining whether the error count value is greater than the redundancy number, if the error count value is greater than the redundancy number, outputting error prompt information, if the error count value is less than or equal to the redundancy number, ending the process of reading the data from the channel, and if there is no data in the channel, ending the process of reading the data from the channel.
In the above embodiment, the method for judging whether the data acquired from the channel is wrong may be an alternative implementation manner, in which the key of the available block acquired from the channel and the data cyclic redundancy check code (crc) are checked, if the key of the available block or the data crc are not checked, the error of the data acquired from the channel is judged, and if the key of the available block and the data crc are checked, the error of the data acquired from the channel is judged.
In step S162, batch repair is performed on each batch of blocks to be repaired according to the available blocks required for repairing the blocks to be repaired.
In some embodiments, step S161 specifically includes determining whether all available blocks required for repairing the block of the batch to be repaired have been successfully read or whether all nodes have been accessed, if so, repairing the block of the batch to be repaired, and if not, continuing to wait until all available blocks required for repairing the block of the batch to be repaired have been successfully read or all nodes have been accessed.
In the embodiment of the disclosure, the coding blocks to be repaired are repaired asynchronously and in batches through the steps, so that the efficiency of data repair can be further improved.
Fig. 4 is a flow chart of a data repair method according to other embodiments of the present disclosure. As shown in fig. 4, the data repair method includes:
and step S100, determining a coding block to be repaired in the current node.
In some embodiments, each node stores a plurality of directory files, such as metadata directory files for storing information such as keys for coded blocks. In these embodiments, the key (key) of the coded block to be repaired in the current node is obtained from a local metadata directory file, such as a user metadata (userMeta) file. For example, for node 1, the userMeta file in node 1 is accessed to obtain the keys of all the code blocks to be repaired in node 1, thereby determining the code blocks to be repaired in the current node.
In other embodiments, a data repair request is received, the data repair request including a key of a coded block to be repaired in a current node, and the data repair request is parsed to obtain the key of the coded block to be repaired in the current node.
And S110, inquiring a storage information table according to keys of the coding blocks to be repaired so as to obtain distribution information of the coding blocks to be repaired in the magnetic disk.
In some embodiments, a storage information table is preset for storing distribution information of the encoded blocks in the disk. In these embodiments, the stored information table may be queried according to the key of the to-be-repaired coding block in the current node, so as to obtain the distribution information of the to-be-repaired coding block in the disk conveniently and quickly.
And step S120, sorting the coded blocks to be repaired according to the distribution information of the coded blocks to be repaired in the magnetic disk.
In some embodiments, the distribution information of the coded blocks to be repaired in the disk comprises the storage address of the coded blocks to be repaired in the disk. In these embodiments, the keys of the code blocks to be repaired are ordered according to the order of the memory addresses of the code blocks to be repaired in the disk. For example, assume that the keys of the to-be-repaired encoded block obtained in step S100 are key1, key2, key3, and key4, and the keys of the to-be-repaired encoded block after being sequenced are key2, key4, key1, and key3 according to the sequence of the storage addresses of the to-be-repaired encoded block in the disk.
And step 140, sequentially acquiring available blocks required for repairing the coding blocks to be repaired according to the sequence.
In some embodiments, step S140 specifically includes dividing the ordered coded blocks to be repaired into at least one batch, and obtaining available blocks for repairing the coded blocks to be repaired of each batch in batches.
In the implementation, the number of batches into which the ordered encoding blocks to be repaired are divided can be determined according to factors such as the data volume of the encoding blocks to be repaired, the distribution condition on a disk, the network performance, the memory occupation and the like.
In an alternative implementation mode, the available blocks needed for repairing the coded blocks to be repaired of each batch are obtained in batches, and the method specifically comprises the steps of selecting a plurality of nodes from other nodes except the current node in the distributed storage system, concurrently initiating batch data obtaining requests to the selected plurality of nodes for the coded blocks to be repaired of each batch, and receiving the available blocks returned by the selected plurality of nodes and needed for repairing the coded blocks to be repaired of the batch.
In the embodiment of the disclosure, the available blocks required for repairing the data to be repaired are sequentially and batched acquired through the steps S100 to S140, so that the number of times of sending requests to other nodes is reduced, the number of times of IO of a disk is reduced, and the efficiency of acquiring the available blocks from the selected multiple nodes is improved. Further, by concurrently initiating batch data acquisition requests to the selected plurality of nodes, the efficiency of acquiring available blocks can be further improved, and further, the efficiency of data repair can be improved.
And step S160, repairing the coding block to be repaired according to the available block required by repairing the coding block to be repaired.
In some embodiments, step S160 specifically includes batch repairing each batch of blocks to be repaired based on the available blocks required to repair the block.
In some embodiments, to further improve the data repair efficiency, step S140 is performed asynchronously with step S160.
And S170, reading the repaired coding block by an asynchronous reader-writer, and writing the repaired coding block into a pre-written log file.
Considering serial reading and writing, the reading and writing speeds can affect each other to limit. Accordingly, the inventors of the present disclosure contemplate creating an asynchronous reader/writer that asynchronously reads and writes the repaired encoded block.
In some embodiments, step S170 specifically includes reading, by an asynchronous reader/writer, the repaired encoded block from the channel, determining whether the read repaired encoded block is in error, writing the repaired encoded block into a pre-written log file (wal file) if the read repaired encoded block is not in error, determining whether an end identifier (EOF) of the data block is read if the read repaired encoded block is in error, ending if the end identifier of the data block is read, and outputting error prompt information if the end identifier of the data block is not read.
In some embodiments, after writing wal the repaired encoded block to the file, determining whether writing wal the file fails, if so, ending, and if not, updating the number of blocks of data written to the wal file and updating the total number of bytes written to the pre-written log file.
And step S180, when the preset disc writing condition is met, the repaired coding blocks stored in the pre-written log file are written into the disc in batches.
In some embodiments, the preset write disc condition includes the number of data blocks written to the pre-written log file being greater than a block number threshold, or the total number of bytes written to the pre-written log file being greater than a byte number threshold.
When the method is implemented, when preset disc writing conditions are met, the sync interface can be called to write the repaired coding blocks stored in the pre-written log file into the disc in batches, so that the aim of reducing IO of the disc is achieved.
In the embodiment of the disclosure, the data of the current node is efficiently repaired through the steps. Compared with the related art, the method and the device have the advantages that through the steps, data can be sequentially and batched pulled, the data can be batched repaired, the data can be read and written concurrently, the repaired data can be written in batched, IO times of reading and writing the disk in the data repairing process are reduced, the disk occupancy rate is reduced, and the data repairing efficiency is improved.
Fig. 5 is a schematic structural diagram of a data repair device according to some embodiments of the present disclosure. As shown in fig. 5, the data repairing apparatus includes a determining module 510, a sorting module 520, an obtaining module 530, and a repairing module 540.
A determining module 510 is configured to determine a block to be repaired in the current node.
In some embodiments, each node stores a plurality of directory files, such as metadata directory files for storing information such as keys for coded blocks. In these embodiments, the determination module 510 obtains the key (key) of the coded block to be repaired in the current node from a local metadata directory file, such as a user metadata (userMeta) file. For example, for node 1, the userMeta file in node 1 is accessed to obtain the keys of all the code blocks to be repaired in node 1, thereby determining the code blocks to be repaired in the current node.
In other embodiments, the determination module 510 receives a data repair request that includes a key of a code block to be repaired in the current node, parses the data repair request to obtain the key of the code block to be repaired in the current node.
The sorting module 520 is configured to sort the to-be-repaired encoded blocks according to the distribution information of the to-be-repaired encoded blocks in the disk.
In some embodiments, the distribution information of the coded blocks to be repaired in the disk comprises the storage address of the coded blocks to be repaired in the disk. In these embodiments, the ordering module 520 orders the keys of the code blocks to be repaired in the order of the memory addresses of the code blocks to be repaired in the disk. For example, assume that the keys of the to-be-repaired coded block obtained by the determining module 510 are key1, key2, key3, and key4, and the keys of the to-be-repaired coded block after being sequenced are key2, key4, key1, and key3 according to the sequence of the storage addresses of the to-be-repaired coded block in the disk.
An obtaining module 530 is configured to sequentially obtain available blocks required for repairing the coded blocks to be repaired according to the ordering.
In some embodiments, the acquisition module 530 is configured to divide the ordered coded blocks to be repaired into at least one batch, and to acquire the available blocks needed to repair the coded blocks to be repaired for each batch in batches.
In specific implementation, the obtaining module 530 may determine the number of batches into which the ordered encoding blocks to be repaired are divided according to factors such as the data size of the encoding blocks to be repaired, the distribution situation on the disk, the network performance, the memory occupation, and the like.
In an alternative embodiment, the obtaining module 530 obtains available blocks required for repairing the to-be-repaired coded blocks of each batch in batches specifically includes the obtaining module 530 selecting a plurality of nodes from other nodes except the current node in the distributed storage system, for the to-be-repaired coded blocks of each batch, the obtaining module 530 concurrently initiates batch data obtaining requests to the selected plurality of nodes, and the obtaining module 530 receives available blocks returned by the selected plurality of nodes for repairing the to-be-repaired coded blocks of the batch.
In the embodiment of the disclosure, the available blocks required for repairing the data to be repaired are sequentially and batched acquired through the sequencing module and the acquisition module, the number of times of sending requests to other nodes is reduced, the number of IO (input/output) times of a disk is reduced, and the efficiency of acquiring the available blocks from the selected multiple nodes is improved. Furthermore, the acquisition module initiates batch data acquisition requests to the selected multiple nodes concurrently, so that the efficiency of acquiring available blocks can be further improved, and the efficiency of data restoration is further improved.
And the repair module 540 is configured to repair the coding block to be repaired according to the available blocks required for repairing the coding block to be repaired.
In some embodiments, the repair module 540 is configured to batch repair each batch of coded blocks to be repaired based on the available blocks required to repair the coded blocks to be repaired for that batch.
In some embodiments, to further improve data repair efficiency, the acquisition module 530 bulk acquires available blocks and the repair module 540 bulk repairs are performed asynchronously.
In the embodiment of the disclosure, the data restoration of the current node is realized through the device. Compared with the related art, the device can sequentially pull and repair data, reduce the IO times of reading and writing the disk in the data repair process, reduce the disk occupancy rate and improve the data repair efficiency.
Fig. 6 is a schematic structural view of a data repair device according to other embodiments of the present disclosure. As shown in fig. 6, the data repairing apparatus includes a determining module 510, a querying module 511, a sorting module 520, an obtaining module 530, a repairing module 540, and a writing module 550.
A determining module 510 is configured to determine a block to be repaired in the current node.
In some embodiments, each node stores a plurality of directory files, such as metadata directory files for storing information such as keys for coded blocks. In these embodiments, the determination module 510 obtains the key (key) of the coded block to be repaired in the current node from a local metadata directory file, such as a user metadata (userMeta) file. For example, for node 1, the userMeta file in node 1 is accessed to obtain the keys of all the code blocks to be repaired in node 1, thereby determining the code blocks to be repaired in the current node.
The query module 511 is configured to query the stored information table according to the key of the to-be-repaired coded block to obtain the distribution information of the to-be-repaired coded block in the disk.
In some embodiments, a storage information table is preset for storing distribution information of the encoded blocks in the disk. In these embodiments, the query module 511 may query the stored information table according to the key of the to-be-repaired coding block in the current node, so as to conveniently and quickly obtain the distribution information of the to-be-repaired coding block in the disk.
The sorting module 520 is configured to sort the to-be-repaired encoded blocks according to the distribution information of the to-be-repaired encoded blocks in the disk.
In some embodiments, the distribution information of the coded blocks to be repaired in the disk comprises the storage address of the coded blocks to be repaired in the disk. In these embodiments, the ordering module 520 orders the keys of the code blocks to be repaired in the order of the memory addresses of the code blocks to be repaired in the disk. For example, assume that the keys of the to-be-repaired coded block obtained by the determining module 510 are key1, key2, key3, and key4, and the keys of the to-be-repaired coded block after being sequenced are key2, key4, key1, and key3 according to the sequence of the storage addresses of the to-be-repaired coded block in the disk.
An obtaining module 530 is configured to sequentially obtain available blocks required for repairing the coded blocks to be repaired according to the ordering.
In some embodiments, the acquisition module 530 is configured to divide the ordered coded blocks to be repaired into at least one batch, and to acquire the available blocks needed to repair the coded blocks to be repaired for each batch in batches.
And the repair module 540 is configured to repair the coding block to be repaired according to the available blocks required for repairing the coding block to be repaired.
In some embodiments, the repair module 540 is configured to batch repair each batch of coded blocks to be repaired based on the available blocks required to repair the coded blocks to be repaired for that batch.
In some embodiments, to further improve data repair efficiency, the acquisition module 530 bulk acquires available blocks and the repair module 540 bulk repairs are performed asynchronously.
And the writing module 550 is configured to read the repaired coding blocks through an asynchronous reader-writer and write the repaired coding blocks into a pre-written log file, and when the preset disc writing condition is met, batch writing the repaired coding blocks stored in the pre-written log file into a disc.
In some embodiments, the preset write disc condition includes the number of data blocks written to the pre-written log file being greater than a block number threshold, or the total number of bytes written to the pre-written log file being greater than a byte number threshold.
In specific implementation, when the preset disk writing condition is met, the writing module 550 may call the sync interface to write the repaired encoded blocks stored in the pre-written log file into the disk in batches, so as to achieve the purpose of reducing the disk IO.
In some embodiments, the write module 550 is further configured to determine whether writing wal the file fails after writing wal the repaired encoded blocks, if so, end, if not, update the number of data blocks written to wal the file and update the total number of bytes written to the pre-written log file.
In the embodiment of the disclosure, the data of the current node is efficiently repaired by the device. Compared with the related art, the method and the device can sequentially and batchly pull data, batchly repair the data, concurrently read and write the data and batchly write the repaired data, reduce IO times of reading and writing the disk in the data repairing process, reduce the disk occupancy rate and improve the data repairing efficiency.
Fig. 7 is a block diagram illustrating a data repair apparatus according to further embodiments of the present disclosure.
As shown in fig. 7, the data repair device 700 includes a memory 710, and a processor 720 coupled to the memory 710. The memory 710 is used to store instructions for performing corresponding embodiments of the data repair method. Processor 720 is configured to perform the data repair method in any of the embodiments of the present disclosure based on instructions stored in memory 710.
FIG. 8 is a block diagram illustrating a computer system for implementing some embodiments of the present disclosure.
As shown in FIG. 8, computer system 800 may be in the form of a general purpose computing device. Computer system 800 includes a memory 810, a processor 820, and a bus 830 that connects the various system components.
Memory 810 may include, for example, system memory, non-volatile storage media, and the like. The system memory stores, for example, an operating system, application programs, boot Loader (Boot Loader), and other programs. The system memory may include volatile storage media, such as Random Access Memory (RAM) and/or cache memory. The non-volatile storage medium stores, for example, instructions for performing a corresponding embodiment of at least one of the data repair methods. Non-volatile storage media include, but are not limited to, disk storage, optical storage, flash memory, and the like.
Processor 820 may be implemented as discrete hardware components such as a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gates or transistors, and the like. Accordingly, each of the modules, such as the judgment module and the determination module, may be implemented by a Central Processing Unit (CPU) executing instructions of the corresponding steps in the memory, or may be implemented by a dedicated circuit that performs the corresponding steps.
Bus 830 may employ any of a variety of bus architectures. For example, bus structures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, and a Peripheral Component Interconnect (PCI) bus.
Computer system 800 may also include an input-output interface 840, a network interface 850, a storage interface 860, and the like. These interfaces 840, 850, 860 may be connected between the memory 810 and the processor 820 via a bus 830. The input output interface 840 may provide a connection interface for input output devices such as a display, mouse, keyboard, etc. Network interface 850 provides a connection interface for various networking devices. The storage interface 860 provides a connection interface for external storage devices such as a floppy disk, a USB flash disk, an SD card, and the like.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable apparatus to produce a machine, such that the instructions, which execute via the processor, create means for implementing the functions specified in the flowchart and/or block diagram block or blocks.
These computer readable program instructions may also be stored in a computer readable memory that can direct a computer to function in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture including instructions which implement the function specified in the flowchart and/or block diagram block or blocks.
The present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.
By the data restoration method and the data restoration device in the embodiment, IO times of reading and writing the disk can be reduced, the disk occupancy rate is reduced, and the data restoration efficiency is improved.
Thus far, the data repair method and apparatus according to the present disclosure have been described in detail. In order to avoid obscuring the concepts of the present disclosure, some details known in the art are not described. How to implement the solutions disclosed herein will be fully apparent to those skilled in the art from the above description.

Claims (12)

1. A method of data repair, comprising:
Determining a coding block to be repaired in the current node;
ordering the coding blocks to be repaired according to the distribution information of the coding blocks to be repaired in the magnetic disk, wherein the ordering comprises ordering the coding blocks to be repaired according to the sequence of the storage addresses of the coding blocks to be repaired in the magnetic disk;
Dividing the ordered coded blocks to be repaired into at least one batch, and obtaining available blocks required by repairing the coded blocks to be repaired of each batch in batches;
And carrying out batch repair on the blocks to be repaired of each batch according to the available blocks required by repairing the blocks to be repaired of each batch, wherein the batch acquisition of the available blocks and the batch repair are carried out asynchronously.
2. The data repair method of claim 1, wherein the batch acquisition of available blocks required to repair each batch of coded blocks to be repaired comprises:
Selecting a plurality of nodes from other nodes except the current node in the distributed storage system;
concurrent initiation of a batch data acquisition request to the plurality of nodes;
and receiving available blocks returned by the plurality of nodes, wherein the available blocks are required for repairing the coded blocks to be repaired of the batch.
3. The data repair method of claim 1, further comprising:
And after the coding block to be repaired in the current node is determined, inquiring a storage information table according to the key of the coding block to be repaired so as to obtain the distribution information of the coding block to be repaired in a disk.
4. The data repair method of claim 1, further comprising:
reading the repaired coding block by an asynchronous reader-writer, and writing the repaired coding block into a pre-written log file;
and when the preset disc writing condition is met, writing the repaired coding blocks stored in the pre-written log file into the disc in batches.
5. The data repair method of claim 4, the preset write disc condition comprising:
the number of data blocks written to the pre-written log file is greater than a block number threshold, or
The total number of bytes written to the pre-written log file is greater than a byte count threshold.
6. A data repair device, comprising:
a determining module configured to determine a block to be repaired in the current node;
The sorting module is configured to sort the to-be-repaired coded blocks according to the distribution information of the to-be-repaired coded blocks in the magnetic disk, and comprises the steps of sorting the to-be-repaired coded blocks according to the sequence of storage addresses of the to-be-repaired coded blocks in the magnetic disk;
The acquisition module is configured to divide the ordered coded blocks to be repaired into at least one batch, and acquire available blocks required for repairing the coded blocks to be repaired of each batch in batches;
And the repair module is configured to repair the blocks to be repaired in batches according to the available blocks required by repairing the blocks to be repaired in each batch, wherein the acquisition module acquires the available blocks in batches and performs batch repair asynchronously with the repair module.
7. The data repair device of claim 6, wherein the acquisition module is configured to:
Selecting a plurality of nodes from other nodes except the current node in the distributed storage system;
concurrent initiation of a batch data acquisition request to the plurality of nodes;
and receiving available blocks returned by the plurality of nodes, wherein the available blocks are required for repairing the coded blocks to be repaired of the batch.
8. The data repair device of claim 6, further comprising:
And the query module is configured to query a storage information table according to the key of the coding block to be repaired so as to obtain the distribution information of the coding block to be repaired in the disk.
9. The data repair device of claim 6, further comprising a write module configured to:
reading the repaired coding block by an asynchronous reader-writer, and writing the repaired coding block into a pre-written log file;
and when the preset disc writing condition is met, writing the repaired coding blocks stored in the pre-written log file into the disc in batches.
10. The data repair device of claim 9, wherein the preset write disc condition comprises:
the number of data blocks written to the pre-written log file is greater than a block number threshold, or
The total number of bytes written to the pre-written log file is greater than a byte count threshold.
11. A data repair device, comprising:
Memory, and
A processor coupled to the memory, the processor configured to perform the data repair method of any of claims 1 to 5 based on instructions stored in the memory.
12. A computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the data repair method of any of claims 1 to 5.
CN202210238315.4A 2022-03-10 2022-03-10 Data repair method and device Active CN114647533B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210238315.4A CN114647533B (en) 2022-03-10 2022-03-10 Data repair method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210238315.4A CN114647533B (en) 2022-03-10 2022-03-10 Data repair method and device

Publications (2)

Publication Number Publication Date
CN114647533A CN114647533A (en) 2022-06-21
CN114647533B true CN114647533B (en) 2025-03-18

Family

ID=81994243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210238315.4A Active CN114647533B (en) 2022-03-10 2022-03-10 Data repair method and device

Country Status (1)

Country Link
CN (1) CN114647533B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049354A (en) * 2012-12-21 2013-04-17 华为技术有限公司 Data restoration method, data restoration device and storage system
CN110019063A (en) * 2017-08-15 2019-07-16 厦门雅迅网络股份有限公司 Method, terminal device and the storage medium of calculate node data disaster tolerance playback

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8949395B2 (en) * 2004-06-01 2015-02-03 Inmage Systems, Inc. Systems and methods of event driven recovery management
CN103336785B (en) * 2013-06-04 2016-12-28 华中科技大学 A kind of distributed storage method based on network code and device thereof
CN113407376B (en) * 2021-06-18 2024-08-02 北京金山云网络技术有限公司 Data recovery method and device and electronic equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049354A (en) * 2012-12-21 2013-04-17 华为技术有限公司 Data restoration method, data restoration device and storage system
CN110019063A (en) * 2017-08-15 2019-07-16 厦门雅迅网络股份有限公司 Method, terminal device and the storage medium of calculate node data disaster tolerance playback

Also Published As

Publication number Publication date
CN114647533A (en) 2022-06-21

Similar Documents

Publication Publication Date Title
CN109284073B (en) Data storage method, device, system, server, control node and medium
WO2017049764A1 (en) Method for reading and writing data and distributed storage system
CN111125040B (en) Method, device and storage medium for managing redo log
CN112835885B (en) A processing method, device and system for distributed table storage
CN113064859B (en) Metadata processing method, device, electronic device and storage medium
CN111324668A (en) Database data synchronous processing method and device and storage medium
EP3051700A1 (en) Hardware efficient fingerprinting
CN106776795A (en) Method for writing data and device based on Hbase databases
CN115801765A (en) File transmission method, device, system, electronic equipment and storage medium
CN114647533B (en) Data repair method and device
CN113326146B (en) Message processing method and device, electronic equipment and storage medium
CN117112004B (en) Differential data determination method, differential restoration method, device, equipment and medium
CN112445761B (en) File checking method and device and storage medium
CN114036110A (en) Data duplicate checking method and device
CN117666931A (en) A data processing method and related equipment
CN113535714A (en) Data storage method, reading method and computer equipment
CN115437836A (en) Metadata processing method and related equipment
US20240231622A9 (en) Resolving deduplication matches when using non-cryptographic hash
EP3051699B1 (en) Hardware efficient rabin fingerprints
CN113282545A (en) Method and device for merging data, computing equipment and storage medium
CN110945792B (en) Method for compressing and decompressing data and related devices
CN118210659B (en) Management method, equipment, system, product and storage medium of disaster recovery storage system
CN113934682B (en) Sharding method, device, server and medium of distributed table system
CN104424238B (en) A kind of method, apparatus that mass file generates
CN109542900B (en) A data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant