Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
At present, if a silent error occurs on a disk while management metadata is being written to it, incorrect data may be written or the written sectors may be misaligned, yet the disk still returns a message indicating that the write IO succeeded. As a result, the service cannot run normally when the management metadata affected by the silent error is subsequently read.
To solve the above problem, the present application provides a management metadata repair method. When a notification is received that the currently read management metadata cannot enable a service to operate normally, a new read policy is determined according to the currently used read policy, and the management metadata is read again through the new read policy and sent to the service. When a notification is received that the re-read management metadata can enable the service to operate normally, a repair direction is determined according to the read policy, and a repair range is determined. Finally, a main address field and a backup address field are determined according to the repair range, and the management metadata in them is repaired according to the repair direction.
Based on the above description, when a silent error on the disk corrupts the management metadata so that a service cannot run, the data is re-read by switching to a new read policy. When the re-read data enables the service to run normally, a repair direction can be determined from the new read policy and the repair performed, so that the management metadata affected by the silent error is repaired and data security is improved.
The technical solution of the present application is explained in detail by the following specific examples.
Fig. 1 is a flowchart illustrating an embodiment of a management metadata repair method according to an exemplary embodiment of the present application, where the management metadata repair method may be applied to a storage device. As shown in fig. 1, the management metadata repair method includes the following steps:
Step 101: when a notification is received that the currently read management metadata cannot enable the service to run normally, determine a new read policy according to the currently used read policy, re-read the management metadata through the new read policy, and send the re-read management metadata to the service.
In an embodiment, the notification that the currently read management metadata cannot enable the service to operate normally may be a notification sent by a front-end server running the service, or may be a notification generated by triggering a certain button by an administrator when the service running in the storage device itself is abnormal.
In one embodiment, the new read policy is determined according to the currently used read policy as follows: if the currently used read policy is the priority main data area, the priority backup data area is determined as the new read policy; if the currently used read policy is the priority backup data area, the priority main data area is determined as the new read policy.
The read policies preset on the storage device include: priority main data area (hereinafter abbreviated as MF) and priority backup data area (hereinafter abbreviated as BF). MF indicates that when the data state of the main address field and the data state of the backup address field are both valid, the main address field is read preferentially, and the backup address field is read only when the data state of the main address field is invalid and the data state of the backup address field is valid, or when reading the main address field fails and the data state of the backup address field is valid. BF indicates that when the data state of the main address field and the data state of the backup address field are both valid, the backup address field is read preferentially, and the main address field is read only when the data state of the backup address field is invalid and the data state of the main address field is valid, or when reading the backup address field fails and the data state of the main address field is valid.
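The policy switch described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `ReadPolicy` and `next_read_policy` are hypothetical and do not come from the application, which specifies only that MF is replaced by BF and BF by MF.

```python
from enum import Enum


class ReadPolicy(Enum):
    """Hypothetical identifiers for the two preset read policies."""
    MF = "priority main data area"
    BF = "priority backup data area"


def next_read_policy(current: ReadPolicy) -> ReadPolicy:
    """Return the policy to use for the re-read: the opposite of the current one."""
    return ReadPolicy.BF if current is ReadPolicy.MF else ReadPolicy.MF
```

With only two policies, the switch is its own inverse, so applying it twice returns the original policy.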
In an embodiment, to re-read the management metadata through the new read policy, the storage device may notify the service of the identifier of the read policy; the service carries the received read policy identifier and the read address field in a read command for the re-read and then sends the read command to the RAID. The RAID searches a preset backup table for a main address field (MAM-Sx) that intersects the read address field and for the backup address field (MAB-Sy) corresponding to that main address field, and then reads the management metadata from the found main address field or backup address field according to the read policy carried by the read command.
The backup table records the main address field and its data state, and the corresponding backup address field and its data state. The main address field belongs to the main data area and is visible to the service; the backup address field belongs to the backup data area and is invisible to the service. A main address field and its corresponding backup address field store the same management metadata. To prevent the management metadata stored in the main address field from becoming inaccessible due to a RAID failure, a main address field and its corresponding backup address field are located on different RAIDs and have the same size. The backup table may therefore further include the RAID identifier of the RAID where the main address field is located and the RAID identifier of the RAID where the backup address field is located.
It will be understood by those skilled in the art that the main address field or the backup address field may be expressed as a start address and an end address, or as a start address and a segment size, which is not limited in the present application.
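One possible in-memory shape for a backup table record is sketched below in Python. The class and attribute names (`Segment`, `BackupRecord`, `raid_id`, and so on) are assumptions made for illustration; the application fixes only what a record contains, not how it is represented. The sketch uses the start address plus segment size form and checks the two constraints stated above (same size, different RAIDs).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    raid_id: str   # identifier of the RAID on which the segment resides
    start: int     # start address
    size: int      # segment size
    valid: bool    # data state: True = valid, False = invalid

    @property
    def end(self) -> int:
        """Exclusive end address derived from start and size."""
        return self.start + self.size


@dataclass
class BackupRecord:
    main: Segment    # main address segment, visible to the service
    backup: Segment  # backup address segment, invisible to the service

    def __post_init__(self) -> None:
        # A main segment and its backup hold the same metadata, so sizes must match.
        if self.main.size != self.backup.size:
            raise ValueError("main and backup segments must have the same size")
        # They must sit on different RAIDs so one RAID failure cannot take out both.
        if self.main.raid_id == self.backup.raid_id:
            raise ValueError("main and backup segments must be on different RAIDs")


# Usage: one record pairing a main segment on one RAID with its backup on another.
record = BackupRecord(
    main=Segment(raid_id="raid-0", start=0, size=4096, valid=True),
    backup=Segment(raid_id="raid-1", start=8192, size=4096, valid=True),
)
```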
First, assume that the read policy identifier carried by the read command received by the RAID is MF:
1. The data states of MAM-Sx and MAB-Sy are both valid: read MAM-Sx first; if that read fails, read MAB-Sy; if either read succeeds, return a read success notification.
2. The data state of MAM-Sx is valid and the data state of MAB-Sy is invalid: read MAM-Sx directly; if the read succeeds, return a read success notification, otherwise return a read failure notification.
3. The data state of MAM-Sx is invalid and the data state of MAB-Sy is valid: read MAB-Sy directly; if the read succeeds, return a read success notification, otherwise return a read failure notification.
4. The data states of both MAM-Sx and MAB-Sy are invalid: return a read failure notification directly.
Second, assume that the read policy identifier carried by the read command is BF:
1. The data states of MAM-Sx and MAB-Sy are both valid: read MAB-Sy first; if that read fails, read MAM-Sx; if either read succeeds, return a read success notification.
2. The data state of MAM-Sx is valid and the data state of MAB-Sy is invalid: read MAM-Sx directly; if the read succeeds, return a read success notification, otherwise return a read failure notification.
3. The data state of MAM-Sx is invalid and the data state of MAB-Sy is valid: read MAB-Sy directly; if the read succeeds, return a read success notification, otherwise return a read failure notification.
4. The data states of both MAM-Sx and MAB-Sy are invalid: return a read failure notification directly.
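The eight cases above reduce to one rule: try the valid segments in the order preferred by the policy, and fail if none is valid or every attempt fails. A minimal Python sketch of that rule follows. The function names, the string labels "main" and "backup", and the `read_segment` callback are hypothetical stand-ins for the RAID's real read path, which the application does not specify.

```python
from typing import Callable, List, Optional


def read_order(policy: str, main_valid: bool, backup_valid: bool) -> List[str]:
    """Return the segments to try, in order. An empty list means immediate failure."""
    if policy == "MF":
        preferred, fallback = "main", "backup"
    elif policy == "BF":
        preferred, fallback = "backup", "main"
    else:
        raise ValueError(f"unknown read policy: {policy}")
    valid = {"main": main_valid, "backup": backup_valid}
    # Segments whose data state is invalid are never read.
    return [name for name in (preferred, fallback) if valid[name]]


def read_with_policy(
    policy: str,
    main_valid: bool,
    backup_valid: bool,
    read_segment: Callable[[str], Optional[bytes]],
) -> Optional[bytes]:
    """Return the metadata from the first successful read, or None on read failure.

    read_segment(name) is assumed to return bytes on success and None on failure.
    """
    for name in read_order(policy, main_valid, backup_valid):
        data = read_segment(name)
        if data is not None:
            return data
    return None
```

For example, under MF with both states valid, a failed read of the main segment falls through to the backup segment, which matches case 1 of the MF list.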
Step 102: when a notification is received that the re-read management metadata can enable the service to run normally, determine a repair direction according to the read policy and determine a repair range.
In an embodiment, the notification that the newly read management metadata enables the service to normally operate may be a notification sent by a front-end server running the service, or may be a notification generated by triggering a certain button by an administrator when the service running in the storage device itself returns to normal.
In an embodiment, the repair direction is determined according to the read policy as follows: if the read policy is the priority main data area, the repair direction is to repair the management metadata of the backup data area using the management metadata of the main data area (hereinafter abbreviated as MtB); if the read policy is the priority backup data area, the repair direction is to repair the management metadata of the main data area using the management metadata of the backup data area (hereinafter abbreviated as BtM).
If the management metadata re-read under MF enables the service to operate normally, this indicates that the disk where the main data area is located has no silent error, so the main data area can be used to repair the backup data area; if the management metadata re-read under BF enables the service to operate normally, this indicates that the disk where the backup data area is located has no silent error, so the backup data area can be used to repair the main data area.
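The mapping from the read policy that produced usable metadata to the repair direction can be written as a small lookup. The sketch below is illustrative only; the function name `repair_direction` and the use of plain strings for the identifiers are assumptions.

```python
def repair_direction(successful_policy: str) -> str:
    """Map the read policy whose re-read restored the service to a repair direction.

    "MF" -> "MtB": the main data area is trusted and repairs the backup data area.
    "BF" -> "BtM": the backup data area is trusted and repairs the main data area.
    """
    directions = {"MF": "MtB", "BF": "BtM"}
    try:
        return directions[successful_policy]
    except KeyError:
        raise ValueError(f"unknown read policy: {successful_policy}") from None
```

The data area that was read preferentially is the source of the repair, because it is the copy that has just been shown to let the service run.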
In an embodiment, for the process of determining the repair range, all the main address fields in the main data area, in which the management metadata is stored, may be determined as the repair range, a section including the read address field may be determined as the repair range, or a space allocated to the service and used for storing the management metadata may be determined as the repair range.
If the re-read management metadata can enable the service to normally operate, it indicates that a silent error exists on a disk sector corresponding to a main data area or a backup data area which is allocated for the service and used for storing the management metadata, and data repair is required.
Step 103: determine a main address field and a backup address field according to the repair range, and repair the management metadata in the main address field and the backup address field according to the repair direction.
In one embodiment, for a process of determining a primary address segment and a backup address segment according to a repair range, the primary address segment contained in the repair range and the backup address segment corresponding to the primary address segment are acquired from a backup table. The description of the backup table may refer to the related description of step 101, which is not described herein again.
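Selecting the records whose main address segment is contained in the repair range can be sketched as a simple filter over the backup table. The `Record` tuple and its field names are hypothetical, and the sketch assumes half-open address ranges of the form [start, start + size), which the application does not prescribe.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """Hypothetical backup table row: a main segment and its backup segment."""
    main_start: int
    backup_start: int
    size: int  # shared by both segments, since they have the same size


def records_in_range(table: List[Record], repair_start: int, repair_end: int) -> List[Record]:
    """Return records whose main segment lies entirely inside [repair_start, repair_end)."""
    return [
        rec
        for rec in table
        if repair_start <= rec.main_start and rec.main_start + rec.size <= repair_end
    ]


# Usage: three records, of which only the first two fall inside the repair range.
table = [
    Record(main_start=0, backup_start=100_000, size=4096),
    Record(main_start=4096, backup_start=104_096, size=4096),
    Record(main_start=8192, backup_start=108_192, size=4096),
]
selected = records_in_range(table, repair_start=0, repair_end=8192)
```

Each selected record already carries its backup segment, so no second lookup is needed before the repair.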
In an embodiment, the management metadata in the main address field and the backup address field is repaired according to the repair direction as follows: if the repair direction is to repair the management metadata of the backup data area using the management metadata of the main data area (MtB), then when the data state of the main address field is valid, the management metadata in the backup address field corresponding to the main address field is repaired using the management metadata in the main address field; if the repair direction is to repair the management metadata of the main data area using the management metadata of the backup data area (BtM), then when the data state of the backup address field is valid, the management metadata in the main address field corresponding to the backup address field is repaired using the management metadata in the backup address field.
Because each record in the backup table is a correspondence between a main address field and a backup address field, the repair range may yield multiple records, each of whose main address field is included in the repair range. The repair is therefore performed record by record, in sequence.
It is worth noting that, after the management metadata in the backup address field corresponding to the main address field is repaired using the management metadata in the main address field, if the data state of the backup address field is invalid, that data state is modified to valid; after the management metadata in the main address field corresponding to the backup address field is repaired using the management metadata in the backup address field, if the data state of the main address field is invalid, that data state is modified to valid.
Further, if the repair direction is MtB but the data state of the main address segment is invalid, the repair is not performed, and the data state of the backup address segment corresponding to the main address segment is directly modified to be invalid; if the repair direction is BtM but the data state of the backup address segment is invalid, the data state of the main address segment corresponding to the backup address segment is directly modified to be invalid without repairing.
First, assume that the repair direction is MtB, the main address segment in one record obtained from the backup table is MAM-Sx, and the backup address segment is MAB-Sy:
if the data state of MAM-Sx is valid and the data state of MAB-Sy is valid: the management metadata in MAB-Sy is repaired directly using the management metadata in MAM-Sx;
if the data state of MAM-Sx is valid and the data state of MAB-Sy is invalid: the management metadata in MAB-Sy is repaired directly using the management metadata in MAM-Sx, and the data state of MAB-Sy is modified from invalid to valid;
if the data state of MAM-Sx is invalid and the data state of MAB-Sy is valid: no repair is performed, and the data state of MAB-Sy is modified from valid to invalid;
if the data state of MAM-Sx is invalid and the data state of MAB-Sy is invalid: the record is skipped without repair.
Second, assume that the repair direction is BtM, the main address segment in one record obtained from the backup table is MAM-Sx, and the corresponding backup address segment is MAB-Sy:
if the data state of MAM-Sx is valid and the data state of MAB-Sy is valid: the management metadata in MAM-Sx is repaired directly using the management metadata in MAB-Sy;
if the data state of MAM-Sx is valid and the data state of MAB-Sy is invalid: no repair is performed, and the data state of MAM-Sx is modified from valid to invalid;
if the data state of MAM-Sx is invalid and the data state of MAB-Sy is valid: the management metadata in MAM-Sx is repaired directly using the management metadata in MAB-Sy, and the data state of MAM-Sx is modified from invalid to valid;
if the data states of both MAM-Sx and MAB-Sy are invalid: the record is skipped without repair.
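The per-record handling enumerated above can be sketched as one function that treats the two directions symmetrically. This is an illustrative sketch under stated assumptions: the `PairedRecord` class, its in-memory `bytes` payloads, and the returned outcome strings are hypothetical, and copying bytes in memory stands in for rewriting the target segment on disk.

```python
from dataclasses import dataclass


@dataclass
class PairedRecord:
    """Hypothetical in-memory stand-in for one backup table record (MAM-Sx / MAB-Sy)."""
    main_data: bytes
    main_valid: bool
    backup_data: bytes
    backup_valid: bool


def repair_record(rec: PairedRecord, direction: str) -> str:
    """Apply one repair step to a record and return what was done.

    Returns "repaired", "invalidated", or "skipped".
    """
    if direction == "MtB":
        source, target = "main", "backup"
    elif direction == "BtM":
        source, target = "backup", "main"
    else:
        raise ValueError(f"unknown repair direction: {direction}")

    source_valid = getattr(rec, f"{source}_valid")
    target_valid = getattr(rec, f"{target}_valid")

    if source_valid:
        # Overwrite the target with the trusted copy and mark the target valid.
        setattr(rec, f"{target}_data", getattr(rec, f"{source}_data"))
        setattr(rec, f"{target}_valid", True)
        return "repaired"
    if target_valid:
        # The source cannot be trusted, so the target can no longer be vouched for.
        setattr(rec, f"{target}_valid", False)
        return "invalidated"
    # Both data states are invalid: nothing to repair.
    return "skipped"
```

Running the function over every record selected for the repair range, one record at a time, corresponds to the sequential repair described in the text.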
In the embodiment of the application, when receiving a notification that the currently read management metadata cannot enable a service to operate normally, the storage device determines a new read policy according to the currently used read policy, re-reads the management metadata through the new read policy, and sends the re-read management metadata to the service. When receiving a notification that the re-read management metadata can enable the service to operate normally, the storage device determines a repair direction according to the read policy and determines a repair range. Finally, it determines a main address field and a backup address field according to the repair range, and repairs the management metadata in the main address field and the backup address field according to the repair direction.
Based on the above description, when a silent error on the disk corrupts the management metadata so that a service cannot run, the data is re-read by switching to a new read policy. When the re-read data enables the service to run normally, a repair direction can be determined from the new read policy and the repair performed, so that the management metadata affected by the silent error is repaired and data security is improved.
Fig. 2 is a hardware block diagram of a storage device according to an exemplary embodiment of the present application, where the storage device includes: a communication interface 201, a processor 202, a machine-readable storage medium 203, and a bus 204; wherein the communication interface 201, the processor 202 and the machine-readable storage medium 203 communicate with each other via the bus 204. The processor 202 may execute the above-described management metadata repair method by reading and executing machine-executable instructions in the machine-readable storage medium 203 corresponding to the control logic of the management metadata repair method. The details of the method are described in the above embodiments and will not be described herein again.
The machine-readable storage medium 203 referred to herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: volatile memory, non-volatile memory, or similar storage media. In particular, the machine-readable storage medium 203 may be a RAM (Random Access Memory), a flash memory, a storage drive (e.g., a hard drive), any type of storage disk (e.g., an optical disk, a DVD, etc.), or similar storage medium, or a combination thereof.
Fig. 3 is a block diagram of an embodiment of a management metadata recovery apparatus according to an exemplary embodiment of the present application, where the management metadata recovery apparatus may be applied to a storage device, as shown in fig. 3, and the management metadata recovery apparatus includes:
a retrieve data unit 310, configured to determine a new read policy according to a currently used read policy when receiving a notification that the currently read management metadata cannot enable the service to operate normally, re-read the management metadata through the new read policy, and send the re-read management metadata to the service;
a determining unit 320, configured to determine, when receiving a notification that the re-read management metadata enables normal operation of the service, a repair direction according to the read policy, and determine a repair range;
a repair unit 330, configured to determine a main address segment and a backup address segment according to the repair range, and repair the management metadata in the main address segment and the backup address segment according to the repair direction.
In an optional implementation manner, the determining unit 320 is specifically configured to, in the process of determining the repair direction according to the read policy: if the read policy is the priority main data area, determine that the repair direction is to repair the management metadata of the backup data area using the management metadata of the main data area; and if the read policy is the priority backup data area, determine that the repair direction is to repair the management metadata of the main data area using the management metadata of the backup data area.
In an optional implementation manner, the repair unit 330 is specifically configured to, in a process of determining a main address segment and a backup address segment according to the repair range, obtain the main address segment included in the repair range and the backup address segment corresponding to the main address segment from a preset backup table, where the backup table includes the main address segment and its data state and the backup address segment and its data state, and the main address segment and the backup address segment corresponding to the main address segment have the same size and are located on different RAIDs.
In an optional implementation manner, the repair unit 330 is specifically configured to, in a process of repairing the management metadata in the main address segment and the backup address segment according to the repair direction: if the repair direction is to repair the management metadata of the backup data area using the management metadata of the main data area, repair the management metadata in the backup address segment corresponding to the main address segment using the management metadata in the main address segment when the data state of the main address segment is valid; and if the repair direction is to repair the management metadata of the main data area using the management metadata of the backup data area, repair the management metadata in the main address segment corresponding to the backup address segment using the management metadata in the backup address segment when the data state of the backup address segment is valid.
In an alternative implementation, the apparatus further comprises (not shown in fig. 3):
a data state modifying unit, configured to modify the data state of the backup address segment to valid if that data state is invalid after the management metadata in the backup address segment corresponding to the main address segment is repaired using the management metadata in the main address segment; and to modify the data state of the main address segment to valid if that data state is invalid after the management metadata in the main address segment corresponding to the backup address segment is repaired using the management metadata in the backup address segment.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.