
CN119396628A - A RAID5 verification data power failure processing method and information processing device - Google Patents


Info

Publication number
CN119396628A
CN119396628A (application CN202411494217.2A)
Authority
CN
China
Prior art keywords
page
data
stripe
check data
intermediate state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411494217.2A
Other languages
Chinese (zh)
Inventor
秦龙华
孙宝勇
居颖轶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Memblaze Technology Co Ltd
Original Assignee
Beijing Memblaze Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Memblaze Technology Co Ltd filed Critical Beijing Memblaze Technology Co Ltd
Priority to CN202411494217.2A priority Critical patent/CN119396628A/en
Publication of CN119396628A publication Critical patent/CN119396628A/en


Classifications

    • G06F 11/1441: Error detection; error correction; monitoring; responding to the occurrence of a fault, e.g. fault tolerance; saving, restoring, recovering or retrying at system level; resetting or repowering
    • G06F 11/10: Error detection or correction by redundancy in data representation, e.g. by using checking codes; adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F 11/1469: Saving, restoring, recovering or retrying; point-in-time backing up or restoration of persistent data; management of the backup or restore process; backup restoration techniques
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract


The present application provides a RAID5 check data power failure processing method and an information processing device, relating to the field of data processing technology. The method includes the following steps: in response to a power failure, obtaining the intermediate state check data of an incompletely written page stripe; determining whether the unwritten pages of the page stripe are sufficient to hold the intermediate state check data; if so, writing the intermediate state check data directly into the unwritten pages of the page stripe; otherwise, after filling the page stripe with intermediate state check data, writing the remaining unwritten intermediate state check data into the page of the page stripe that stores check data. The present application serves to increase the speed of recovering a page stripe's intermediate state check data after a power failure, to protect the data in incomplete stripes, and to perform read recovery when a read error occurs in an incomplete stripe; when handling a read error, the data reconstruction process need not be changed, so no additional R&D cost or technical complexity is introduced.

Description

RAID5 check data power-down processing method and information processing equipment
Technical Field
The application relates to the technical field of storage, in particular to a RAID5 check data power-down processing method and information processing equipment.
Background
Solid-state storage devices typically use RAID techniques (e.g., RAID 5) to organize and protect the data on NAND flash memory, in order to improve the reliability of the stored data. Even if part of the data fails to be read, it can then be recovered through the check data generated by the RAID technique.
A NAND flash memory chip of the storage device includes one or more LUNs (logical units). Each LUN includes one or more planes, and each plane includes a plurality of physical blocks (Blocks). One physical block per plane is used to construct a RAID stripe (RAID STRIPE). Physical pages (Pages) with the same page number within the physical blocks belonging to the same RAID stripe form a page stripe (PAGE STRIPE). When any data in the storage device is erroneous, an XOR operation over the other data pages of the page stripe to which the erroneous data belongs (including the check page) can be used for data recovery, thereby protecting user data through RAID check data. Taking RAID5 as an example, each page stripe (PAGE STRIPE) includes one page serving as the PARITY PAGE (a page storing check data, abbreviated as the check page).
In order to utilize the data throughput bandwidth of the back-end NAND flash memory as much as possible, when the storage device writes data to the NAND flash memory, write commands are issued to the NAND flash memory in multi-plane program mode. In the case where each LUN has 4 planes, a single write command issued to the NAND flash memory carries the data of 4 physical pages, i.e. the granularity of the write command is 4 pages.
By way of example, the stripe size is 31+1 (comprising 31 physical pages storing user data and 1 physical page storing check data). As shown in FIG. 1, 32 LUNs are used to construct a stripe. The pages of Block m of each of the 32 LUNs numbered nx (denoted page nx) make up a page stripe; the page stripe thus includes 31+1 = 32 physical pages in total. Page nx of Block m of LUN 31 is used to store the check data. Here x varies with the plane in which the physical page is located, and n represents the physical page number within the physical block within the plane. In the example of FIG. 1, page n0 of Plane 0 of Block m of each of the 32 LUNs constitutes page stripe n0, page n1 of Plane 1 of Block m of each of the 32 LUNs constitutes page stripe n1, page n2 of Plane 2 of Block m of each of the 32 LUNs constitutes page stripe n2, and page n3 of Plane 3 of Block m of each of the 32 LUNs constitutes page stripe n3. By way of example, in addition to physical pages n0-n3, the other physical pages of LUN 31 are also used to store check data.
As shown in FIG. 1, the 31 LUNs from LUN 0 to LUN 30 are used to store user data, and LUN 31 is used to store check data. The write granularity is 4 pages: each multi-plane program write command writes data to pages i0 through i3, 4 physical pages in total (i is an integer greater than or equal to 0).
When reading data, if the read data is erroneous, the other data of the page stripe containing the physical page where that data resides is read, and the XOR (exclusive-or) of all the other data read out recovers the erroneous data.
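The read-recovery path just described can be sketched as follows. This is a minimal illustration with hypothetical helper names, not the device firmware; page contents are modeled as `bytes` and the stripe is a list of pages, check page included.

```python
def xor_pages(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized pages."""
    return bytes(x ^ y for x, y in zip(a, b))

def recover_page(stripe_pages, bad_index):
    """Reconstruct the unreadable page of a RAID5 page stripe by
    XOR-ing every other page of the stripe (check page included)."""
    page_size = len(stripe_pages[0])
    acc = bytes(page_size)
    for i, page in enumerate(stripe_pages):
        if i != bad_index:
            acc = xor_pages(acc, page)
    return acc
```

Because the check page is the XOR of all user pages, XOR-ing the remaining pages cancels everything except the missing one.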
When writing data, the user data is written to the physical pages of the page stripe. Each piece of user data is also XOR-ed with the contents of the check data buffer, and the result is kept in the check data buffer, generating the check data for the page stripe. After all user data of the page stripe has been written, the contents of the check data buffer are the check data of the page stripe; this check data is also written to the page stripe.
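The running check data buffer described above can be modeled with a small sketch (a simplified, assumed model; real firmware keeps this buffer in RAM alongside the in-flight stripe):

```python
class CheckDataBuffer:
    """Running XOR accumulator for a page stripe's check data."""

    def __init__(self, page_size: int):
        # The buffer starts as all zeros; XOR with zeros is the identity.
        self.buf = bytes(page_size)

    def accumulate(self, user_page: bytes) -> None:
        # XOR each written user page into the buffer; after the last
        # user page of the stripe, buf holds the stripe's check data.
        self.buf = bytes(a ^ b for a, b in zip(self.buf, user_page))
```

Writing each user page thus both programs the flash and updates the buffer, so no extra read of the stripe is needed to produce the final check data.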
When the host powers down, a page stripe may not be completely written; suppose the stripe has been written with three pages of user data at that moment. After the host powers up again, because the check data temporarily held in memory is lost, user data cannot be written directly to that page stripe, and the following steps must be performed first:
Step 1: sequentially read the 3 pages of user data already written to the page stripe, namely Data_1_0, Data_1_1, and Data_1_2;
Step 2: recalculate the check data of the three pages of data: Data_1_0 XOR Data_1_1 XOR Data_1_2;
Step 3: write the generated check data into the check data buffer in memory;
Step 4: in response to a host write command, write the user data carried by the write command into the other blank physical pages of the stripe, and XOR the user data carried by the write command with the check data recovered in the check data buffer to calculate the check data of the page stripe.
In the above process, because steps 1 through 3 must restore the check data that was in memory before the power-down, the latency is large when write commands first begin to be processed after a power-down restart.
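Steps 1 through 3 above amount to re-deriving the intermediate state check data from the pages already on flash. A sketch, with an assumed helper name (the pages would come from NAND reads in a real device):

```python
from functools import reduce

def rebuild_intermediate_check(written_pages):
    """Steps 1-3 after power-up: XOR together the user pages already
    written to the stripe to restore the check data buffer that was
    lost from RAM when power failed."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                  written_pages)
```

The cost of this rebuild grows with the number of pages already written, which is exactly the power-up latency the application aims to avoid.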
Several approaches have been proposed to solve the above problems. For example, the page stripe is filled up (by actively writing padding data) at the time of power failure, so that the check data is recorded in the page stripe. A disadvantage of this approach is that the padding increases the amount of data written to the NAND flash memory. If power is lost frequently while only a small amount of data has been written to a page stripe, the amount of data written to the NAND flash memory grows, shortening the service life of the solid-state storage device.
Another way is to power down directly without filling the page stripe when power is lost. After the next power-up, the data already written to the page stripe is read back and the check data is recalculated. In this way, if the page stripe is relatively large, the power-up initialization time increases.
Still another approach is to not fill up the page stripe when power is lost, but store the intermediate state check data to a dedicated Block of NAND flash memory when power is lost. And reading out the intermediate state check data after the next power-on. This approach requires additional management of the Block storing intermediate parity data, increasing complexity.
Still another approach does not fill up the page stripe when power is lost, but stores the intermediate state check data as user data into the current page stripe. At the next power up, the check data buffer in the memory is initialized with all 0 s. And after power-on, writing the data carried by the write command into the page stripe for the newly received host write command, and calculating check data according to the existing flow (the data carried by the write command is exclusive-or with the data in the check data buffer area). This approach has the advantage that at power up, the data in the page stripe need not be read out to calculate intermediate state check data.
In the case where data is written to the NAND flash memory at 4-page granularity using a Multi-Plane Program command, the 4 physical pages written belong to 4 page stripes, and these 4 page stripes receive data at the same time. If there are page stripes that are not full when an abnormal power-down occurs, their number is likewise 4. However, some bad blocks inevitably exist in the NAND flash memory, so one or more of the 4 physical pages accessed by a single Multi-Plane Program command may belong to bad blocks and must not be written. Under normal circumstances, when constructing the page stripe, the location of the check page can be chosen so that all 4 corresponding physical pages come from good blocks when the check data is written. In the case of an abnormal power failure, however, it cannot be guaranteed that all 4 physical pages at the location currently being written come from good blocks; there may not even exist a single location whose 4 physical pages all come from good blocks, so the intermediate state check data of the current 4 page stripes cannot be written to the NAND flash memory with a single Multi-Plane Program command. Moreover, the physical-address association between the intermediate state check data and its user data is broken: the physical pages holding the intermediate state check data and the corresponding user data may not share the same physical page number. This raises the problems of how to recover the intermediate state check data after the next power-up, and how to reconstruct erroneous data with the intermediate state check data when read data is in error.
There is a need to address one or more of the above-mentioned problems.
Disclosure of Invention
The application aims to provide a RAID5 check data power-down processing method which is used for improving the speed of recovering page stripe intermediate state check data after power-down and providing protection for data in incomplete stripe data. The method provided by the application can be used for easily reconstructing read error data when the incomplete stripe has a read error, and the data reconstruction process can be not required to be changed when the read error processing is encountered, so that additional research and development cost and technical complexity are not introduced.
In order to achieve the above object, as a first aspect of the present application, the present application provides a method for processing RAID5 check data power down, the method comprising the steps of:
in response to a power failure, acquiring the intermediate state check data of an incompletely written page stripe;
and determining whether the unwritten pages of the page stripe are sufficient to hold the intermediate state check data; if so, writing the intermediate state check data directly into the unwritten pages of the page stripe; otherwise, after filling the page stripe with intermediate state check data, writing the remaining intermediate state check data into the page of the page stripe that stores check data.
The RAID5 verification data power-down processing method is described above, wherein the intermediate state verification data is restored in response to power-up.
The RAID5 check data power-down processing method described above, wherein, when the pages provided by the plurality of LUNs for the intermediate state check data number more than 4, 0s are written into the remaining pages.
The RAID5 check data power-down processing method described above, wherein, according to the addresses of the pages written with intermediate state check data, those pages are read into a check data buffer as the intermediate state check data; and in response to receiving a write command writing data to the storage device, the data to be written is XOR-ed with the intermediate state check data in the check data buffer to calculate new intermediate state check data for the page stripe.
The RAID5 check data power-down processing method described above, wherein, in response to a RAID read error, when recovering the page with the read error, the data of all pages written with intermediate state check data is excluded.
The RAID5 check data power-down processing method described above, wherein, when recovering a page with a read error whose page stripe has been completely written, the data of all pages written with intermediate state check data is excluded, and the data of the remaining physical pages is XOR-ed to recover the page with the read error.
The RAID5 check data power-down processing method described above, comprising: obtaining the positions on the LUNs of the pages written with intermediate state check data; and executing a corresponding check data recovery method according to those positions.
As a second aspect of the present application, there is provided a RAID5 check data power-down processing method in which, in response to power-up, the intermediate state check data is recovered;
if the pages written with intermediate state check data reside on a single LUN, the intermediate state check data of the page stripe is initialized to all 0s and writing of data into the page stripe continues; if the pages written with intermediate state check data reside on a plurality of LUNs, the data of the different pages written with intermediate state check data is read out into the check data buffers of the corresponding page stripes.
The RAID5 check data power-down processing method described above, wherein, after power-up, when a read command encounters a read data error, the erroneous read data is reconstructed.
The RAID5 check data power-down processing method described above, wherein the page in which the intermediate state check data is written is regarded as normal data.
The RAID5 check data power-down processing method described above, wherein reading out data of different pages written with intermediate state check data into the check data buffer of the corresponding page stripe includes:
Read out [part n3] into the check data buffer of page stripe PS n1:
Intermediate state check data of page stripe PS n1 = [lun0 page n1] xor [lun1 page n1] xor [lun2 page n1] xor [lun3 page n1] xor [lun4 page n1] xor [lun5 page n1] xor [part n1] xor [part n3] = [part n3].
The RAID5 check data power-down processing method described above, wherein reading out data of different pages written with intermediate state check data into the check data buffer of the corresponding page stripe includes:
Read [part n2] into the check data buffer of page stripe PS n2:
Intermediate state check data of page stripe PS n2 = [lun0 page n2] xor [lun1 page n2] xor [lun2 page n2] xor [lun3 page n2] xor [lun4 page n2] xor [lun5 page n2] xor [dummy] = [part n2].
The RAID5 check data power-down processing method described above, wherein reading out data of different pages written with intermediate state check data into the check data buffer of the corresponding page stripe includes:
Read [part n3] into the check data buffer of page stripe PS n3:
Intermediate state check data of page stripe PS n3 = [lun0 page n3] xor [lun1 page n3] xor [lun2 page n3] xor [lun3 page n3] xor [lun4 page n3] xor [lun5 page n3] = [part n3].
According to the RAID5 check data power-down processing method described above, if the 4 parts nx in which the intermediate state check data is stored when the storage device powers down are distributed over 2 consecutive LUNs, the intermediate state check data of each page stripe at power-up is obtained as follows:
intermediate state check data of page stripe PS n0 = [part n1];
intermediate state check data of page stripe PS n1 = [part n1] xor [part n2];
intermediate state check data of page stripe PS n2 = [part n2] xor [part n3];
intermediate state check data of page stripe PS n3 = [part n3].
According to the RAID5 check data power-down processing method described above, if the 4 parts nx in which the intermediate state check data is stored when the storage device powers down are distributed over 3 consecutive LUNs, the intermediate state check data of each page stripe at power-up is obtained as follows:
intermediate state check data of page stripe PS n0 = [part n0] xor [part n0] xor [part n1] xor [part n2] = [part n1] xor [part n2];
intermediate state check data of page stripe PS n1 = [part n1] xor [part n3];
intermediate state check data of page stripe PS n2 = [part n2];
intermediate state check data of page stripe PS n3 = [part n3].
According to the RAID5 check data power-down processing method described above, if the 4 parts nx in which the intermediate state check data is stored when the storage device powers down are distributed over 4 consecutive LUNs, the intermediate state check data of each page stripe at power-up is obtained as follows:
intermediate state check data of page stripe PS n0 = [part n1] xor [part n2] xor [part n3];
intermediate state check data of page stripe PS n1 = [part n1];
intermediate state check data of page stripe PS n2 = [part n2];
intermediate state check data of page stripe PS n3 = [part n3].
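For the 4-consecutive-LUN case, the recovery table above can be written out directly. A sketch under stated assumptions: `parts` maps x to the saved [part nx] page, and the function names are illustrative only, not from the patent:

```python
def recover_intermediate_4luns(parts):
    """Given the four saved pages parts[0..3], return the power-up
    intermediate state check data of page stripes PS n0..PS n3 for
    the case where the parts lie on 4 consecutive LUNs."""
    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))
    return {
        0: xor(xor(parts[1], parts[2]), parts[3]),  # PS n0
        1: parts[1],                                # PS n1
        2: parts[2],                                # PS n2
        3: parts[3],                                # PS n3
    }
```

The other two layouts (2 and 3 consecutive LUNs) would use the corresponding combinations from their tables in the same way.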
The RAID5 check data power-down processing method described above, wherein, when the pages written with intermediate state check data are distributed over a plurality of consecutive LUNs, if the page stripe is complete, the other pages of the page stripe are read and XOR-ed to recover the erroneous read data.
According to the RAID5 check data power-down processing method described above, when the pages written with intermediate state check data are distributed over a plurality of consecutive LUNs, if the page stripe is incomplete and a read error occurs on any page of any logical unit, the recovered current check data of the page stripe is copied into a read-XOR memory, and the pages of the other logical units are then read and XOR-ed to recover that logical unit's page data.
As a third aspect of the present application, there is provided an information processing apparatus comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the method according to any one of claims 1 to 18 when executing the program.
The beneficial effects achieved by the application are as follows:
(1) When the storage device is powered down, the storage device can rapidly and completely store intermediate state check data even if the current writing position of the page stripe encounters a bad block;
(2) The method and the device are used for improving the speed of recovering the page stripe intermediate state check data after power failure.
(3) The power-up check data recovery method can protect the data in incomplete stripes, and the data reconstruction process need not be changed when handling read errors, so no additional R&D cost or technical complexity is introduced.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for the embodiments or the description of the prior art are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments described in the present application; those skilled in the art may obtain other drawings from these drawings.
FIG. 1 is an exemplary diagram of a 31+1 RAID stripe distribution across 32 LUNs in accordance with embodiments of the present application.
FIG. 2 is a flowchart of a method for processing RAID5 check data in a power down mode according to an embodiment of the present application.
FIG. 3A is a diagram illustrating intermediate state check data written to the same LUN according to an embodiment of the present application.
FIG. 3B is a diagram illustrating intermediate state check data written to a plurality of LUNs according to an embodiment of the present application.
FIG. 4 is a flowchart of a method for recovering check data during power-up according to an embodiment of the present application.
FIG. 5 is a flowchart of another method for recovering check data during power-up according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described in detail below with reference to the drawings in the embodiments of the present application. It is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort fall within the scope of the present application.
(I) Stripe/page stripe structure of the present application
Bad blocks are unavoidable in NAND flash memories. In the example where a LUN includes 4 planes, data is written to the NAND flash memory with a Multi-Plane Program command at a granularity of 4 pages, denoted Page i0, Page i1, Page i2, and Page i3 (i represents the physical page number within the plane). These 4 pages belong to 4 page stripes.
Referring to FIG. 1, taking stripes constructed across 32 LUNs as an example, in the present application the 32 Blocks m, one from the plane with the same plane number (denoted x) of each of the 32 LUNs, contribute their respective Page ix to constitute a page stripe (denoted PS ix) comprising 32 physical pages.
In the case where one of the aforementioned 32 Blocks m is a bad block (in FIG. 1, Block m of Plane 0 of LUN 1 is a bad block), the physical page (Page i0) that the bad block should provide to page stripe PS i0 cannot be used. In the page stripe construction method of the present application, the bad block is replaced by a Block m of another plane of the same LUN as the bad block (Block m of Plane 0 of LUN 1), taking the planes in plane-number order. For example, Block m of Plane 0 of LUN 1 is a bad block, so it is replaced by Block m of Plane 1 of LUN 1 when constructing the stripe. It will be appreciated that, upon replacement, the replaced bad block has the same LUN number and physical block number as the corresponding good block but a different plane number, and the plane number of the good block is greater than that of the bad block; preferably, the plane number of the good block is the plane number of the corresponding bad block + 1. If the block at plane number + 1 is still a bad block, the plane number keeps incrementing (+1) until a good block is found or the upper limit of the plane number (3 in the example of FIG. 1) is exceeded. If the upper limit is exceeded without finding a replacement good block, the LUN where the bad block is located no longer provides a physical page for the page stripe. When a good block is found for replacement, its physical page i replaces the physical page that could not be used because of the bad block.
Further, in the case where one of the aforementioned 32 Blocks m has already been used to replace a bad block when constructing a stripe (in FIG. 1, Block m of Plane 1 of LUN 1 is used to replace the bad Block m of Plane 0), the physical page (Page ix) that this block should itself provide to page stripe PS ix is already occupied, and must in turn be replaced with another good block in the same manner.
In the example of FIG. 1, the physical page (Page i0) of Block m of Plane 0 of LUN 1 is unavailable, and Page i0 is replaced with Page i1 in page stripe PS i0. Referring to FIG. 1, each physical page of Block m of Plane 0 of LUN 1 is drawn with a dark background, representing that it belongs to a bad block, while each physical page of Block m of Plane 1 of LUN 1 is labeled Page i0 (0 <= i < N, N being the number of physical pages within a physical block within a plane), expressing that it is used to replace the unusable page to its left in page stripe PS i0.
Further, since the physical pages of Block m of Plane 1 of LUN 1 are used in page stripe PS i0 to replace those of Block m of Plane 0 of LUN 1, the physical pages that Block m of Plane 1 of LUN 1 would have provided to page stripe PS i1 are in turn replaced by those of Block m of Plane 2 of LUN 1, and the physical pages that Block m of Plane 2 of LUN 1 would have provided to page stripe PS i2 are replaced by those of Block m of Plane 3 of LUN 1. As for page stripe PS i3, no physical page from LUN 1 is included, because the plane number has reached its upper limit and no further physical page is available.
Since LUN 1 has only 3 good Blocks m, the 4th stripe m3 of the 4 stripes (denoted stripes m0, m1, m2, and m3) corresponding to the 4 Blocks m of the 4 planes does not contain a Block m from LUN 1, and page stripe PS i3 accordingly contains no physical page from LUN 1.
According to the stripe/page stripe construction method of the embodiments of the present application, it is possible, depending only on the locations of the bad blocks in each LUN, to identify which blocks of the LUNs belong to the same stripe and which physical pages belong to the same page stripe, and to obtain all physical pages included in a given page stripe, without recording the address of each physical block of each stripe or the address of each physical page of each page stripe. When a page stripe is needed to reconstruct data, even if bad blocks exist, the number of each physical page of the page stripe is calculated from the physical page whose read data is erroneous and the bad block positions of each LUN. For example, if the physical page whose read data is erroneous is Page d of Block c of Plane b of LUN a, and the number of bad blocks (denoted ka) among the 4 Blocks c of Plane 0 through Plane 3 is obtained from the bad block information of LUN a, then the page stripe to which the erroneous Page d belongs is PS d(b-ka). Accordingly, the physical pages that LUNs other than LUN a provide to page stripe PS d(b-ka) are determined. For example, for LUN 0, if the number of bad blocks among its 4 Blocks c of Plane 0 through Plane 3 is k0, then the physical page LUN 0 provides to page stripe PS d(b-ka) is the physical page d provided by the (b-ka)-th Block c, counting the Blocks c of Plane 0 through Plane 3 in plane-number order. And if k0 > ka, then LUN 0 does not provide a physical page for page stripe PS d(b-ka).
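The good-plane-to-stripe mapping underlying this arithmetic can be sketched as follows. This is an illustrative model (hypothetical helper name; 4 planes per LUN as in FIG. 1), not the patent's firmware: a LUN's good blocks, taken in plane-number order, serve page stripes x = 0, 1, ..., and a LUN with fewer good planes than x + 1 provides no page for stripe x.

```python
def pages_provided(bad_planes, planes=4):
    """Map page-stripe index x to the plane of this LUN that serves it.
    Good planes, in plane-number order, serve stripes 0, 1, ...; a
    stripe index beyond the good-plane count gets no page from this
    LUN (cf. LUN 1 providing nothing to PS i3 in FIG. 1)."""
    good = [p for p in range(planes) if p not in bad_planes]
    return {x: good[x] for x in range(len(good))}
```

With Plane 0 bad (as for LUN 1 in FIG. 1), stripes 0, 1, 2 are served by planes 1, 2, 3 and stripe 3 by nothing, matching the replacement scheme described above.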
The parity data for a stripe/page stripe is recorded in the LUN that provides the stripe/page stripe with the most good blocks or physical pages corresponding to the most good blocks, and specifically in the physical blocks or physical pages that the LUN provides for the stripe/page stripe. For example, in fig. 1, block m of each of 4 planes of LUN 31 is a good Block, and LUN 31 is selected to record check data. It is to be understood that the parity data recorded by the LUN 31 is not intermediate parity data of the page stripe, but rather parity data generated from all 31 pages of user data of the page stripe (also referred to as full parity data).
(II) Power-down processing of intermediate state check data of the application
When data is written to the storage device, the storage device allocates a page stripe to carry the written data. When the write granularity is 4 physical pages, the 4 page stripes corresponding to the 4 physical pages carry the written data simultaneously. If the storage device is powered down while the 4 page stripes are not yet fully written (so that the corresponding complete check data has not yet been generated), they must be processed in the manner provided by the embodiments of the present application.
As shown in fig. 2, the present application provides a method for processing power failure of RAID5 check data, which includes the following steps:
Step S1: in response to the power failure, acquire the intermediate state check data of the page stripes that are not fully written.
Here, a page stripe that is not fully written is a page stripe for which complete check data has not yet been obtained.
Step S2: judge whether the unwritten pages of the page stripe are enough to hold the intermediate state check data; if so, directly write the intermediate state check data into the pages of the page stripe not yet written with data; otherwise, after writing as much intermediate state check data as fits into those pages, write the remaining intermediate state check data into the page(s) of the page stripe reserved for storing check data.
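The branch in step S2 can be sketched as follows (a simplified model; the slot names are hypothetical placeholders for physical page addresses):

```python
def plan_power_down_writes(parity_pages, free_data_slots, check_slots):
    """Step S2 decision: if the stripe's unwritten data pages can hold all
    intermediate state check data, write it there; otherwise fill them and
    spill the remainder into the stripe's reserved check page(s)."""
    n = len(parity_pages)
    if len(free_data_slots) >= n:
        return list(zip(free_data_slots, parity_pages))
    fit = len(free_data_slots)
    plan = list(zip(free_data_slots, parity_pages[:fit]))
    plan += list(zip(check_slots, parity_pages[fit:]))  # spill to check pages
    return plan
```

For instance, one page of intermediate check data with one free data page lands in the data page; with no free data pages it lands in the check page, which corresponds to the examples that follow.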
(1) As one set of examples, suppose a page stripe includes 31+1 physical pages. When the power failure occurs, 10 pages of user data have been written, and the intermediate state check data occupies 1 physical page. The intermediate state check data is written to the page stripe in step S2.
In yet another example, 30 pages of user data have been written when the power failure occurs. After the intermediate state check data is written in step S2, all 31 data pages of the page stripe are written, so the complete check data can then be generated and written to the page stripe in step S2.
In yet another example, 30 pages of user data have been written when the power failure occurs, and the next physical page that would carry user data is from a bad block and cannot be written. The intermediate state check data at this point is equivalent to the complete check data; the physical page from the bad block is skipped, and the complete check data is written to the page stripe in step S2.
In yet another example, 31 pages of user data have been written when the power failure occurs. At this point, the complete check data is generated in step S2 and written to the page stripe.
(2) As yet another set of examples, 4 page stripes associated with the same Multi-Plane Program command carry the written data, each page stripe comprising, for example, 31+1 physical pages. When the power failure occurs, 10 pages of user data have been written to each page stripe, and the intermediate state check data of each page stripe occupies 1 physical page. In step S2 the intermediate state check data of the 4 page stripes is written to the page stripes through 1 Multi-Plane Program command. The storage device may then be powered down, with the 4 page stripes in an unfilled state.
In yet another example, each page stripe has been written with 30 pages of user data when the power failure occurs, and because of bad blocks the next 4 physical pages cannot accommodate the 4 pages of intermediate state check data. In this case, in step S2, part of the intermediate state check data is written to those of the 4 physical pages that are from good blocks, and the remaining intermediate state check data is written to one or more of the check pages of the 4 page stripes. The check pages not occupied by intermediate state check data are filled with complete check data.
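The placement of the 4 pages of intermediate check data across LUNs with bad blocks, as in the figures discussed next, can be sketched as follows (a simplified model with hypothetical names; `avail_pages[lun]` lists, in Plane-number order, the page slots of that LUN backed by good blocks):

```python
def place_intermediate_parity(parts, start_lun, avail_pages):
    """Place the pages of intermediate check data (part n0..n3) into the
    available physical pages of successive LUNs; leftover good pages of
    the last LUN touched are padded with all-0 dummy data (the XOR
    identity), as in the Fig. 3B example."""
    placement = {}
    i = 0
    lun = start_lun
    while i < len(parts):
        for slot in avail_pages.get(lun, []):  # good-block pages, Plane order
            if i < len(parts):
                placement[(lun, slot)] = parts[i]
                i += 1
            else:
                placement[(lun, slot)] = "dummy"  # all-0 pad
        lun += 1  # this LUN's pages exhausted (or all bad): move on
    return placement
```

Run on the Fig. 3B configuration (LUN 6 with 2 good pages, LUN 7 with 3), this reproduces Part n0/n1 on LUN 6, Part n2/n3 on LUN 7, and a dummy pad in LUN 7's remaining page.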
Fig. 3A shows yet another example. When the power failure occurs, each of the 4 page stripes has been written with 6 pages of user data (up to LUN 5). It is then checked that the 4 physical pages provided by LUN 6 to the 4 page stripes are enough to hold the 4 pages of intermediate state check data; since the 4 page stripes are still not full after this write, the 4 pages of intermediate state check data in memory are written directly to the 4 physical pages provided by LUN 6 (Part n0 to Part n3 of LUN 6 shown in fig. 3A, where Part n0 to Part n3 denote the physical pages storing intermediate state check data).
The addresses of these 4 pages of LUN 6 (the Part nx pages, i.e. the pages written with intermediate state check data) may be saved, so that at the next power-up the intermediate state check data of the 4 page stripes can readily be obtained from the saved addresses. Alternatively, the addresses of the 4 physical pages of LUN 6 are not saved; a method of recovering the intermediate state check data of the stripes after power-up without them is described later.
As another embodiment of storing intermediate state check data, before the intermediate state check data is stored, it is judged whether any of the 4 physical pages of the page stripes currently used to hold it are from bad blocks. If bad blocks exist, the pages that would otherwise remain unwritten are padded with all-0 data and written down together with the intermediate state check data.
As shown in fig. 3B, 4 page stripes (denoted PS n0, PS n1, PS n2 and PS n3) constructed on 32 LUNs are illustrated: PS n0 includes the Page n0 provided by each of the 32 LUNs, PS n1 includes the Page n1 provided by each of the 32 LUNs, and so on, with PS nx including the Page nx provided by each of the 32 LUNs. Here n represents a physical page number.
When the power failure occurs, each of the 4 page stripes has been written with 6 pages of user data (up to LUN 5). It is then checked whether the 4 physical pages provided by LUN 6 to the 4 page stripes are enough to hold the 4 pages of intermediate state check data. LUN 6 has only 2 good blocks (the distribution of good and bad blocks is known when the stripe is created and is recorded in the NAND flash). In LUN 6, the dark Page n represents a physical page from a bad block that cannot be written with data. Thus, of the 4 physical pages LUN 6 provides to the 4 page stripes, only 2 can hold data (denoted Part n0 and Part n1). At this point, the physical pages holding the intermediate state check data are determined using the method provided in the page stripe construction embodiment above. Specifically, of the 2 available physical pages provided by LUN 6, intermediate state check data is provided to page stripes PS n0 and PS n1 in Plane-number order (in fig. 3B, the stored intermediate state check data are denoted Part n0 and Part n1).
Since the 4 pages of intermediate state check data have not all been written to the page stripes, the search for available physical pages continues with the next LUN, LUN 7. One of the physical pages LUN 7 provides to the 4 page stripes (the dark Page n) is from a bad block, and the physical pages from good blocks (Page n0 and Page n1 of LUN 7 in fig. 3B) are used, in Plane-number order, to store the remaining 2 pages of intermediate state check data (Part n2 and Part n3). Because the 4 physical pages provided by LUN 7 to the 4 page stripes are still not fully written (Page n2 of LUN 7 has no data in fig. 3B), Page n2 is also padded with all-0 data (the padding is shown as dummy in fig. 3B). All-0 data is used for padding because the XOR of any data with all 0 is the data itself, so reconstructing a physical page with a read error using RAID techniques is not affected later. At this point step S2 is completed and the storage device may be powered down.
In the example of fig. 3B, 2 of the 4 physical pages of LUN 6 that were to store intermediate state check data at power-down are from bad blocks, and 1 of the physical pages of LUN 7 is from a bad block, so the intermediate state check data is written onto 2 LUNs. In the worst case, the intermediate state check data must be written onto 4 LUNs, each of which provides only 1 available physical page to the 4 page stripes.
As an optional embodiment of the present invention, the addresses of the physical pages written with intermediate state check data are also saved. For the example of fig. 3B, the addresses of the 4 physical pages recording Part nx (x = 0, 1, 2, 3) are saved to facilitate recovering the check data after power-up and continuing to write data to the page stripes. It will be appreciated that after power-up and further writes, the storage device may be powered down again before the page stripes are full; the intermediate state check data is then processed in the same way as above, and the page stripes come to contain multiple pages of intermediate state check data. Because the locations and number of the Part nx pages (pages written with intermediate state check data) are not fixed, it is not necessary to record the locations of all Part nx pages of a page stripe: only the Part nx addresses of the current power-down need to be saved, and the Part nx addresses saved at the previous power-down may be overwritten or discarded.
There are several ways to recover intermediate parity data at power up of the storage device.
It should be appreciated that if the page stripe was already full when the storage device powered down, the check data buffer is simply initialized to 0 when the storage device is powered up.
Depending on whether the addresses of the physical pages storing intermediate state check data recorded at power-down are used, two groups of technical schemes, (III) and (IV), are provided. Scheme (III) uses the previously recorded physical page addresses to recover the intermediate state check data at power-up. Scheme (IV) does not need the previously recorded physical page addresses, and is therefore also applicable when the physical addresses used were not recorded while the intermediate state check data was being stored at power-down.
(III) Power-up recovery of intermediate state check data and reconstruction of read-error data - 1
As shown in fig. 4, the present application provides a first method for recovering check data at power-up, the method comprising:
In step S410, according to the saved addresses of the pages written with intermediate state check data, the intermediate state check data is read from those pages into the check data buffer, to serve as the intermediate state check data.
In step S420, in response to receiving a write command that writes data to the storage device, the data to be written is XORed with the intermediate state check data in the check data buffer to calculate new intermediate state check data for the page stripe.
The result of the XOR calculation is kept in the check data buffer as the new intermediate state check data, replacing the previous intermediate state check data.
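Steps S410/S420 can be sketched minimally as follows (the buffer class and page size are illustrative assumptions, not the patent's implementation):

```python
def xor_pages(a, b):
    """Bytewise XOR of two equal-sized pages."""
    return bytes(x ^ y for x, y in zip(a, b))

class CheckDataBuffer:
    """Per-page-stripe check data buffer: starts from the intermediate
    state check data read back in step S410 (or all 0 for a fresh
    stripe) and is updated by XOR for every newly written page (S420)."""
    def __init__(self, page_size, restored=None):
        self.data = bytes(restored) if restored is not None else bytes(page_size)
    def on_write(self, page):
        # The XOR result replaces the previous intermediate state check data
        self.data = xor_pages(self.data, page)
```

Because XOR is associative and self-inverse, restoring the buffer from the saved Part nx page and continuing to XOR in new pages yields exactly the parity that an uninterrupted write sequence would have produced.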
As a specific embodiment of the invention, the Part nx pages (pages written with intermediate state check data) are read into memory according to the Part nx addresses saved before the power failure, after which data can continue to be written to the page stripe; the written data is XORed with the intermediate state check data in the check data buffer.
For example, when reconstructing a read-error page, if the page stripe to which it belongs has been fully written (and the complete check data is stored on it), the pages written with intermediate state check data are excluded, and the XOR calculation is performed over the remaining physical pages to recover the read-error page.
For another example, if the page stripe to which the read-error page belongs is not fully written, as shown in fig. 3B: to recover Page n2 of LUN 5 after a read error, Page n2 of LUN 0 to LUN 4 and Part n2 of LUN 7 are read out and XORed, recovering the LUN 5 Page n2 data; and if intermediate state check data exists among LUN 0 to LUN 4, those pages must likewise be excluded.
In some cases, when the storage device powers down, the addresses of all Part nx pages (pages written with intermediate state check data) within a page stripe are not saved; only the latest Part nx addresses are saved. For this reason, it must also be marked which physical pages of the page stripe store intermediate state check data. For example, when intermediate state check data is written at power-down, a flag is set in the written physical page to distinguish whether the page stores intermediate state check data or user data. For example, the flag is stored in the page meta (the extra storage space of a physical page), marking the page as intermediate state check data. For example, 4 bytes of the page meta are dedicated to the flag, one bit per LUN: if the page of that LUN is a Part nx, the bit is set to 1, otherwise to 0. In fig. 3B, bit 0 of the page meta flag of the data page of LUN 0 is 0; the page of LUN 6 is intermediate state check data, so bit 6 of its page meta flag is set to 1, making the 4-byte flag 00000040h (hexadecimal); and the page meta flag of LUN 7 is 00000080h (hexadecimal).
Still referring to fig. 3B, when a read error is encountered (e.g., a read error of Page n0 of LUN 0 shown in fig. 3B), the page meta flags of Page n0 of LUN 1 to LUN 31 of the page stripe to which the read-error page belongs are read and XORed, giving PAGE_META_FLAG = 000000C0h (hexadecimal): bit 6 and bit 7 are 1 and the other bits are 0, indicating that the pages of LUN 6 and LUN 7 are Part nx. The intermediate state check data of the read-error page (Page n0 of LUN 0) is recorded in the Part nx of LUN 6, so it suffices to read Page n0 of LUN 1 to LUN 5, then read out the data of Part n0 of LUN 6 and XOR them all to obtain the correct LUN 0 Page n0 data.
Alternatively, the flag in the page meta need not occupy 4 bytes; a single bit suffices to mark whether the page is a Part nx. When a read error occurs, the page meta of every page of the page stripe is read and checked to identify whether that physical page is a Part nx.
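The 4-byte, one-bit-per-LUN marking scheme above can be sketched as follows (function names are hypothetical):

```python
def page_meta_flag(lun, is_part):
    """4-byte page-meta tag: bit <lun> is 1 iff this LUN's page of the
    stripe holds intermediate state check data (a Part nx page)."""
    return (1 << lun) if is_part else 0

def find_part_luns(flags):
    """XOR the page-meta tags of the readable pages of a stripe; each
    set bit of the result names a LUN whose page is a Part nx."""
    acc = 0
    for f in flags:
        acc ^= f
    return [lun for lun in range(32) if (acc >> lun) & 1]
```

The XOR works because every user-data page contributes 0 and each Part nx page contributes its own distinct bit exactly once; for the fig. 3B case the result is 0x000000C0, i.e. LUN 6 and LUN 7.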
Scheme (III) provides a method for recovering intermediate state check data after power-up, but its handling when reconstructing read-error data is cumbersome; therefore another method for recovering intermediate state check data after power-up is provided.
(IV) Power-up recovery of intermediate state check data and reconstruction of read-error data - 2
Scheme (IV) is also used to recover intermediate state check data at power-up of the storage device, but differs from scheme (III) in that the previously recorded physical page addresses of the Part nx are not used.
In scheme (IV), the intermediate state check data stored before the power failure is treated as ordinary user data on the page stripe.
As shown in fig. 5, the method specifically includes:
In step S610, the locations, on the LUNs, of the pages written with intermediate state check data are acquired.
In step S620, according to the different locations on the LUNs of the pages written with intermediate state check data, the corresponding check data recovery method is executed.
Specifically, executing the corresponding check data recovery method according to the different locations on the LUNs of the pages written with intermediate state check data includes:
If the pages written with intermediate state check data are all on a single LUN, those pages are treated as ordinary data, the intermediate state check data of each page stripe is initialized to all 0, and writing to the stripe continues.
For the scenario in fig. 3A, all of the intermediate state check data of the 4 page stripes is located on a single LUN. When the storage device is powered up, the check data buffer in memory is set to all 0, because the XOR of Page nx of LUN 0 to LUN 5 with Part nx of LUN 6 in fig. 3A is all 0.
If the pages written with intermediate state check data are on multiple LUNs (for example, in the case of fig. 3B), then after power-up those pages are treated as ordinary data, and the data of the different pages written with intermediate state check data is read out into the check data buffers of the corresponding page stripes.
Taking the case of fig. 3B as an example, treating Part nx as ordinary data gives [part nx] = [lun0 page nx] xor [lun1 page nx] xor [lun2 page nx] xor [lun3 page nx] xor [lun4 page nx] xor [lun5 page nx], where a symbol within [ ] denotes a physical page and [ ] denotes the data stored on that physical page. In fig. 3B, page stripe PS n0 includes Page n0 of LUN 0 to LUN 5, Page n0 of LUN 6 and Page n0 of LUN 7; page stripe PS n1 includes Page n1 of LUN 0 to LUN 5, Page n1 of LUN 6 and Page n1 of LUN 7; page stripe PS n2 includes Page n2 of LUN 0 to LUN 5 and Page n2 of LUN 7; and page stripe PS n3 includes Page n3 of LUN 0 to LUN 5. It should be understood that the value recorded in Page n0 of LUN 7 is [part n2], the intermediate state check data of page stripe PS n2 at the time of the power failure; in this embodiment it is used as ordinary user data, and according to the page stripe construction method provided in "(I) stripe/page stripe construction" it is also one of the physical pages of page stripe PS n0 storing ordinary user data. Similarly, Page n1 of LUN 6 records the intermediate state check data of PS n1 at power-down of the storage device, and according to the same page stripe construction method it also belongs to the physical pages of page stripe PS n1 storing ordinary user data. Page n1 of LUN 7 belongs to the physical pages of page stripe PS n1 storing ordinary user data. Page n2 of LUN 7 belongs to the physical pages of page stripe PS n2 storing ordinary user data. LUN 6 provides no physical pages to PS n2 and PS n3, and LUN 7 provides no physical page to PS n3.
Therefore, at power-up of the storage device, the intermediate state check data that needs to be restored in the check data buffer of page stripe PS n0 is: [lun0 page n0] xor [lun1 page n0] xor [lun2 page n0] xor [lun3 page n0] xor [lun4 page n0] xor [lun5 page n0] xor [part n0] xor [part n2] = [part n0] xor [part n0] xor [part n2] = [part n2]. The check data buffer of page stripe PS n0 is thereby initialized with [part n2]. When data is next written to page stripe PS n0, new intermediate state check data is calculated from the intermediate state check data [part n2] in the check data buffer and the written data.
Similarly, at power-up of the storage device, the intermediate state check data in the check data buffer of page stripe PS n1 is: [lun0 page n1] xor [lun1 page n1] xor [lun2 page n1] xor [lun3 page n1] xor [lun4 page n1] xor [lun5 page n1] xor [part n1] xor [part n3] = [part n3], so [part n3] is read out into the PS n1 check data buffer. When data is next written to page stripe PS n1, new intermediate state check data is calculated from the intermediate state check data [part n3] in the check data buffer and the written data.
Since the dummy data stored in Page n2 of LUN 7 is all 0, at power-up of the storage device the intermediate state check data in the check data buffer of page stripe PS n2 is: [lun0 page n2] xor [lun1 page n2] xor [lun2 page n2] xor [lun3 page n2] xor [lun4 page n2] xor [lun5 page n2] xor [dummy] = [part n2], so [part n2] is read out into the PS n2 check data buffer. When data is next written to page stripe PS n2, new intermediate state check data is calculated from the intermediate state check data [part n2] in the check data buffer and the written data.
At power-up of the storage device, the intermediate state check data in the check data buffer of page stripe PS n3 is: [lun0 page n3] xor [lun1 page n3] xor [lun2 page n3] xor [lun3 page n3] xor [lun4 page n3] xor [lun5 page n3] = [part n3], so [part n3] is read out into the PS n3 check data buffer. When data is next written to page stripe PS n3, new intermediate state check data is calculated from the intermediate state check data [part n3] in the check data buffer and the written data.
In summary, the intermediate state check data in the check data buffer of each page stripe PS nx at power-up of the storage device can be recovered. When data is subsequently written to page stripe PS nx, new intermediate state check data is calculated from the intermediate state check data [part nx] in the check data buffer and the written data.
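The recovery for page stripe PS n0 in fig. 3B can be checked numerically: treating the part pages as ordinary data, the XOR of every written page of the stripe yields [part n2]. The sketch below uses random byte strings as stand-ins for page contents (an illustrative assumption, not real NAND data):

```python
import os

def xor_all(pages, size):
    """XOR a list of equal-sized pages together."""
    acc = bytes(size)
    for page in pages:
        acc = bytes(a ^ b for a, b in zip(acc, page))
    return acc

SIZE = 16
user_n0 = [os.urandom(SIZE) for _ in range(6)]  # Page n0 of LUN 0..LUN 5
part_n0 = xor_all(user_n0, SIZE)   # intermediate parity of PS n0 at power-down
part_n2 = os.urandom(SIZE)         # parity of PS n2, stored in Page n0 of LUN 7

# Power-up: XOR every written page of PS n0, part pages treated as user data
buffer_ps_n0 = xor_all(user_n0 + [part_n0, part_n2], SIZE)
assert buffer_ps_n0 == part_n2     # [part n0] xor [part n0] cancels out
```

The cancellation [part n0] xor [part n0] = 0 is exactly why the patent's formula collapses to [part n2], independent of the actual user data.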
By recovering the intermediate state check data at power-up in this way, when read data is later erroneous, the page stripe is used to reconstruct the erroneous data without excluding the Part nx pages; therefore no additional reconstruction flow for read-error data needs to be designed, and the existing RAID5 error-data reconstruction flow can be used.
Fig. 3B illustrates the case where the 4 Part nx of the intermediate state check data stored at power-down of the storage device are distributed over 2 consecutive LUNs, each LUN storing 2 Part nx.
If the 4 Part nx of the intermediate state check data stored at power-down are distributed over 2 consecutive LUNs, with the first and second LUNs storing 1 Part nx and 3 Part nx respectively, and the second LUN can provide 4 available physical pages, then its last physical page must be padded with all 0. In this case, the intermediate state check data of each page stripe at power-up of the storage device is obtained as follows:
Intermediate state check data in the check data buffer of page stripe PS n0 = [part n1];
Intermediate state check data in the check data buffer of page stripe PS n1 = [part n1] xor [part n2];
Intermediate state check data in the check data buffer of page stripe PS n2 = [part n2] xor [part n3];
Intermediate state check data in the check data buffer of page stripe PS n3 = [part n3].
If the 4 Part nx of the intermediate state check data stored at power-down are distributed over 3 consecutive LUNs, storing 1 Part nx, 1 Part nx and 2 Part nx respectively, and the last LUN can provide 4 available physical pages, then its last 2 physical pages must be padded with all 0. In this case, the intermediate state check data of each page stripe at power-up of the storage device is obtained as follows:
Intermediate state check data in the check data buffer of page stripe PS n0 = [part n0] xor [part n0] xor [part n1] xor [part n2] = [part n1] xor [part n2];
Intermediate state check data in the check data buffer of page stripe PS n1 = [part n1] xor [part n3];
Intermediate state check data in the check data buffer of page stripe PS n2 = [part n2];
Intermediate state check data in the check data buffer of page stripe PS n3 = [part n3].
The most complex case is that the 4 Part nx of the intermediate state check data stored at power-down are distributed over 4 consecutive LUNs. In this case, the intermediate state check data of each page stripe at power-up of the storage device is obtained as follows:
Intermediate state check data in the check data buffer of page stripe PS n0 = [part n1] xor [part n2] xor [part n3];
Intermediate state check data in the check data buffer of page stripe PS n1 = [part n1];
Intermediate state check data in the check data buffer of page stripe PS n2 = [part n2];
Intermediate state check data in the check data buffer of page stripe PS n3 = [part n3].
No matter how the 4 Part nx of the intermediate state check data stored at power-down are distributed over the LUNs, the intermediate state check data required by each page stripe at power-up is recovered by treating [part nx] as ordinary user data: the physical pages belonging to each page stripe are obtained according to the page stripe construction method provided in "(I) stripe/page stripe construction", and the data stored in the physical pages written with ordinary data are XORed to obtain the intermediate state check data. By the way the Part nx stored at power-down were generated, the XOR of a Part nx with the data used to generate it is 0; therefore the ordinary data used to generate a stored Part nx need not even be read, and the intermediate state check data required by each page stripe can be obtained with only a few further XOR calculations.
In this case, one page (Part n0) fewer is read than in the preceding embodiment (III) of recovering intermediate state check data, but several more XOR operations are required. Obviously, for a storage device the overhead of an XOR calculation is significantly lower than that of reading data from a physical page.
In the embodiments of scheme (IV), the error-data reconstruction flow when read data is erroneous is relatively simple, and is essentially the same as the RAID5 data reconstruction method.
The reconstruction of read-error data divides into two cases. (1) For a complete page stripe, the read-error data can be reconstructed simply by reading and XORing the pages of the page stripe other than the erroneous page. (2) Taking fig. 3B as an example, suppose Page n0 of LUN 0 has a read error. Using the previously recovered intermediate state check data of page stripe PS n0, the data of LUN 0 Page n0 can be recovered by XORing it with Page n0 of LUN 1 to LUN 7, that is: [lun0 page n0] = [current intermediate state check data of PS n0] xor [lun1 page n0] xor [lun2 page n0] xor [lun3 page n0] xor [lun4 page n0] xor [lun5 page n0] xor [part n0] xor [part n2]. Optionally, if data was written to the page stripe after power-up, the intermediate state check data will have changed; the current intermediate state check data is used when reconstructing the read-error data. Another way to reconstruct LUN 0 Page n0 is to read Page n0 of LUN 1 to LUN 5 and Part n0 to recover the data, that is: [lun0 page n0] = [lun1 page n0] xor [lun2 page n0] xor [lun3 page n0] xor [lun4 page n0] xor [lun5 page n0] xor [part n0].
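The second reconstruction variant is plain RAID5 XOR, and can be sketched as follows (random byte strings stand in for page contents; names are illustrative):

```python
import os

def xor_all(pages, size):
    """XOR a list of equal-sized pages together."""
    acc = bytes(size)
    for page in pages:
        acc = bytes(a ^ b for a, b in zip(acc, page))
    return acc

SIZE = 16
page_n0 = [os.urandom(SIZE) for _ in range(6)]  # Page n0 of LUN 0..LUN 5
part_n0 = xor_all(page_n0, SIZE)                # recovered intermediate parity

# Rebuild the read-error page of LUN 0 from Part n0 and Page n0 of LUN 1..5
rebuilt = xor_all(page_n0[1:] + [part_n0], SIZE)
assert rebuilt == page_n0[0]
```

Because [part n0] is the XOR of all six user pages, XORing it with the five surviving pages leaves exactly the missing page, which is why no special-case flow beyond standard RAID5 reconstruction is needed.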
Both the power-up recovery method of scheme (III) and that of scheme (IV) can protect data in incomplete stripes. The advantage of the power-up recovery method of scheme (IV) is that the reconstruction flow for read-error data is simpler when a read error is encountered; the power-up recovery process of the scheme (III) embodiments is simpler when there are more bad blocks.
The application provides an information processing device comprising a memory, a processor, and a program stored on the memory and runnable on the processor, wherein the processor, when executing the program, implements any of the above RAID5 check data power-down processing methods.
The application also provides a computer storage medium storing computer instructions which, when invoked, execute the above RAID5 check data power-down processing method. The computer storage medium contains one or more program instructions for the processor to execute the RAID5 check data power-down processing method.
The disclosed embodiments provide a computer readable storage medium having stored therein computer program instructions that, when executed on a computer, cause the computer to perform a RAID5 parity data power-down processing method as described above.
The embodiment of the invention provides a processor for executing the above RAID5 check data power-down processing method.
In the embodiment of the invention, the processor may be an integrated circuit chip with signal processing capability. The processor may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
The methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, a register, or another storage medium well known in the art. The processor reads the information in the storage medium and performs the steps of the above method in combination with its hardware.
The storage medium may be memory, for example, may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory.
The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable Programmable ROM (EPROM), an Electrically Erasable Programmable ROM (EEPROM) or a flash memory. The volatile memory may be a Random Access Memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM) and Direct Rambus RAM (DRRAM).
The beneficial effects achieved by the application are as follows:
(1) When the storage device is powered down, the intermediate state check data can be stored quickly and completely even if the current write position of a page stripe encounters a bad block;
(2) The application improves the speed of recovering page stripe intermediate state check data after a power failure;
(3) The power-up check data recovery method can protect data in incomplete stripes, and avoids changing the data reconstruction flow when a read error is encountered, thereby introducing no extra development cost or technical complexity.
In the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more of the described features. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In the description of the present application, the word "for example" is used to mean "serving as an example, instance, or illustration." Any embodiment described as "for example" in this disclosure is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for purposes of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known structures and processes have not been described in detail so as not to obscure the description of the application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
The foregoing description is only illustrative of the invention and is not to be construed as limiting the invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present invention are intended to be included within the scope of the claims of the present invention.

Claims (10)

1. A RAID5 check data power-down processing method, characterized by comprising the following steps:
in response to a power failure, acquiring the intermediate state check data of an incompletely written page stripe;
and judging whether the incompletely written page stripe has enough unwritten pages to hold the intermediate state check data; if so, directly writing the intermediate state check data into the unwritten pages of the page stripe; otherwise, writing the intermediate state check data into the unwritten pages of the page stripe designated for user data, and writing the remaining intermediate state check data into the page of the page stripe designated for check data.
2. The RAID5 check data power-down processing method of claim 1, wherein the intermediate state check data is restored in response to power-up.
3. The RAID5 check data power-down processing method of claim 1, wherein, when the number of available pages provided by the plurality of LUNs storing the intermediate state check data in the page stripe is greater than the number of planes in a LUN, zeros are written to the pages of the plurality of LUNs that remain after the intermediate state check data has been written.
4. The RAID5 check data power-down processing method of claim 2, wherein the page written with the intermediate state check data is read into the check data buffer as intermediate state check data based on the address of that page; and
in response to receiving a write command to write data to the storage device, an exclusive-OR calculation is performed on the data to be written and the intermediate state check data in the check data buffer to calculate the new intermediate state check data of the page stripe.
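The exclusive-OR update of claim 4 can be illustrated with a short sketch. This is a hypothetical illustration, not the patented implementation; the function name, page size, and page contents are invented for the example.

```python
def xor_pages(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length pages."""
    return bytes(x ^ y for x, y in zip(a, b))

# The check data buffer starts as all zeros; each data page written to the
# page stripe is folded into it by XOR, yielding the intermediate state
# check data (hypothetical 4-byte pages for illustration).
check_buffer = bytes(4)
d0 = bytes([0xAA, 0x55, 0x0F, 0xF0])
d1 = bytes([0xFF, 0x00, 0xFF, 0x00])
check_buffer = xor_pages(check_buffer, d0)   # first data page written
check_buffer = xor_pages(check_buffer, d1)   # second data page written
print(check_buffer.hex())  # → 5555f0f0
```

Because XOR is its own inverse, any single lost page of the stripe can later be rebuilt by XOR-ing the buffer with the remaining pages.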
5. A method for a storage device, wherein, in response to the storage device powering up, an incompletely written page stripe is acquired;
if the pages of the incompletely written page stripe that are written with intermediate state check data reside on a single LUN, the intermediate state check data in the check data buffer of the incompletely written page stripe is initialized to all 0s;
if the pages of the incompletely written page stripe that are written with intermediate state check data reside on a plurality of LUNs, the data of the different pages written with intermediate state check data are read out into the check data buffers of the corresponding page stripes;
and data continues to be written to the incompletely written page stripe.
6. The method for a storage device of claim 5, wherein reconstructing read-error data by RAID5 with the page stripe to which the read-error data belongs comprises:
treating the page of the page stripe written with intermediate state check data as normal data, so that the page written with intermediate state check data need not be identified when reconstructing the read-error data.
7. The method for a storage device of claim 6, wherein, if the 4 physical pages part n0, part n1, part n2, and part n3 that hold the intermediate state check data of the incompletely written page stripes at power-down of the storage device are distributed over 2 consecutive LUNs, reading out the data of the different pages written with intermediate state check data into the check data buffers of the corresponding page stripes comprises:
the check data buffer of page stripe PSn0 = [part n2];
the check data buffer of page stripe PSn1 = [part n3];
the check data buffer of page stripe PSn2 = [part n2];
the check data buffer of page stripe PSn3 = [part n3], wherein
n in the page stripes PSn0, PSn1, PSn2, and PSn3 represents a physical page number; physical pages part n0, part n1, part n2, and part n3 are the physical pages recording the intermediate state check data of page stripes PSn0, PSn1, PSn2, and PSn3, respectively; [part n2] represents the data stored in physical page part n2; and [part n3] represents the data stored in physical page part n3.
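The 2-LUN recovery mapping of claim 7 amounts to a simple assignment of the surviving pages to the stripe buffers. A minimal sketch follows; the page contents are invented, and `saved` stands for the pages read back from flash at power-up.

```python
# Pages read back from flash at power-up (hypothetical 4-byte contents).
saved = {
    "part_n2": bytes([0x01, 0x02, 0x03, 0x04]),
    "part_n3": bytes([0x05, 0x06, 0x07, 0x08]),
}

# With the four pages on 2 consecutive LUNs, stripes PSn0 and PSn2 recover
# their buffers from part n2, and stripes PSn1 and PSn3 from part n3.
check_buffers = {
    "PSn0": saved["part_n2"],
    "PSn1": saved["part_n3"],
    "PSn2": saved["part_n2"],
    "PSn3": saved["part_n3"],
}
```

No XOR is needed in this layout because each surviving page already holds one stripe's intermediate state check data directly.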
8. The method for a storage device according to claim 5 or 7, wherein, if the 4 physical pages part n0, part n1, part n2, and part n3 that hold the intermediate state check data of the incompletely written page stripes at power-down of the storage device are distributed over 3 consecutive LUNs, reading out the data of the different pages written with intermediate state check data into the check data buffers of the corresponding page stripes comprises:
the check data buffer of page stripe PSn0 = [part n0] xor [part n0] xor [part n1] xor [part n2] = [part n1] xor [part n2];
the check data buffer of page stripe PSn1 = [part n1] xor [part n3];
the check data buffer of page stripe PSn2 = [part n2];
the check data buffer of page stripe PSn3 = [part n3], wherein
n in the page stripes PSn0, PSn1, PSn2, and PSn3 represents a physical page number; physical pages part n0, part n1, part n2, and part n3 are the physical pages recording the intermediate state check data of page stripes PSn0, PSn1, PSn2, and PSn3, respectively; [part n1] represents the data stored in physical page part n1; [part n2] represents the data stored in physical page part n2; and [part n3] represents the data stored in physical page part n3.
9. The method for a storage device according to claim 5 or 7, wherein, if the 4 physical pages part n0, part n1, part n2, and part n3 that hold the intermediate state check data of the incompletely written page stripes at power-down of the storage device are distributed over 4 LUNs, reading out the data of the different pages written with intermediate state check data into the check data buffers of the corresponding page stripes comprises:
the check data buffer of page stripe PSn0 = [part n1] xor [part n2] xor [part n3];
the check data buffer of page stripe PSn1 = [part n1];
the check data buffer of page stripe PSn2 = [part n2];
the check data buffer of page stripe PSn3 = [part n3], wherein
n in the page stripes PSn0, PSn1, PSn2, and PSn3 represents a physical page number; physical pages part n0, part n1, part n2, and part n3 are the physical pages recording the intermediate state check data of page stripes PSn0, PSn1, PSn2, and PSn3, respectively; [part n1] represents the data stored in physical page part n1; [part n2] represents the data stored in physical page part n2; and [part n3] represents the data stored in physical page part n3.
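The 4-LUN case of claim 9 can be sketched as follows. The page contents and names are hypothetical; the point is that PSn0's buffer is rebuilt by XOR of the other three surviving pages, while PSn1 to PSn3 recover their own pages directly.

```python
from functools import reduce

def xor_pages(*pages: bytes) -> bytes:
    """Byte-wise XOR of any number of equal-length pages."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), pages)

# Hypothetical 2-byte contents of the three surviving intermediate-state pages.
part_n1 = bytes([0x10, 0x20])
part_n2 = bytes([0x01, 0x02])
part_n3 = bytes([0xF0, 0x0F])

# Recovery mapping for the 4-LUN layout.
check_buffers = {
    "PSn0": xor_pages(part_n1, part_n2, part_n3),
    "PSn1": part_n1,
    "PSn2": part_n2,
    "PSn3": part_n3,
}
print(check_buffers["PSn0"].hex())  # → e12d
```

The same `xor_pages` helper also covers the 3-LUN reductions of claim 8, since terms appearing twice cancel under XOR.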
10. An information processing apparatus comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1-9 when executing the program.
CN202411494217.2A 2024-10-24 2024-10-24 A RAID5 verification data power failure processing method and information processing device Pending CN119396628A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411494217.2A CN119396628A (en) 2024-10-24 2024-10-24 A RAID5 verification data power failure processing method and information processing device

Publications (1)

Publication Number Publication Date
CN119396628A true CN119396628A (en) 2025-02-07

Family

ID=94427474

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411494217.2A Pending CN119396628A (en) 2024-10-24 2024-10-24 A RAID5 verification data power failure processing method and information processing device

Country Status (1)

Country Link
CN (1) CN119396628A (en)

Similar Documents

Publication Publication Date Title
US8347138B2 (en) Redundant data distribution in a flash storage device
KR101298827B1 (en) Improved error correction in a solid state disk
JP6422600B2 (en) Stripe mapping in memory
US7546515B2 (en) Method of storing downloadable firmware on bulk media
KR101405741B1 (en) Stripe-based non-volatile multilevel memory operation
US9377960B2 (en) System and method of using stripes for recovering data in a flash storage system
US7536627B2 (en) Storing downloadable firmware on bulk media
US9130597B2 (en) Non-volatile memory error correction
US8055983B2 (en) Data writing method for flash memory and error correction encoding/decoding method thereof
US20140068208A1 (en) Separately stored redundancy
US20180157428A1 (en) Data protection of flash storage devices during power loss
WO2007136447A2 (en) Non-volatile memory error correction system and method
US7380198B2 (en) System and method for detecting write errors in a storage device
CN107885620B (en) Method and system for improving performance and reliability of solid-state disk array
JP2019168897A (en) Memory system
US8775902B2 (en) Memory controller and storage device
US10922025B2 (en) Nonvolatile memory bad row management
US10574270B1 (en) Sector management in drives having multiple modulation coding
US10713160B1 (en) Data writing method, memory control circuit unit and memory storage device
CN119396628A (en) A RAID5 verification data power failure processing method and information processing device
US11604586B2 (en) Data protection method, with disk array tags, memory storage device and memory control circuit unit
CN114080596A (en) Data protection method for memory and memory device thereof
CN114625563B (en) Data protection method and device of SSD, readable storage medium and electronic equipment
CN118656030B (en) Data protection method and storage device
CN115421964A (en) Non-aligned data error processing method, control component and storage device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination