
US20210240584A1 - Data recovery method, system, and apparatus in storage system

Info

Publication number
US20210240584A1
Authority
US
United States
Prior art keywords
ssd
fault domain
fault
chunk
storage system
Legal status
Abandoned
Application number
US17/233,893
Inventor
Guiyou Pu
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority claimed from PCT/CN2019/103085 (WO2020082888A1)
Application filed by Huawei Technologies Co Ltd
Publication of US20210240584A1
Priority to US17/883,708 (US12111728B2)


Classifications

    • G06F11/1662: Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit, the resynchronized component or unit being a persistent storage device
    • G06F11/1469: Backup restoration techniques
    • G11C29/12: Built-in arrangements for testing, e.g. built-in self testing [BIST] or interconnection details
    • G06F11/108: Parity data distribution in semiconductor storages, e.g. in SSD
    • G06F11/0727: Error or fault processing not based on redundancy, the processing taking place in a storage system, e.g. in a DASD or network based storage system
    • G06F11/079: Root cause analysis, i.e. error or fault diagnosis
    • G06F11/1448: Management of the data involved in backup or backup restore
    • G06F11/1458: Management of the backup or restore process
    • G06F11/2056: Error detection or correction of the data by redundancy in hardware, where persistent mass storage functionality or control functionality is redundant by mirroring
    • G06F11/2094: Redundant storage or storage space
    • G06F3/0619: Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G06F3/0688: Non-volatile semiconductor memory arrays
    • H03M13/154: Error and erasure correction, e.g. by using the error and erasure locator or Forney polynomial
    • G06F2201/82: Solving problems relating to consistency
    • G06F3/0644: Management of space entities, e.g. partitions, extents, pools

Definitions

  • the present disclosure relates to the field of information technologies, and in particular, to a data recovery method, system, and apparatus in a storage system.
  • a redundant array of independent disks (RAID) technology is a technology widely used in storage systems to ensure data reliability.
  • for example, if a reconstruction speed is 1 terabyte (TB) every five hours, reconstructing a partially faulty SSD with a capacity of 1 TB consumes five hours, and if the capacity of the SSD is 100 TB, the reconstruction time is 500 hours.
  • a data recovery method in a storage system is provided, where the storage system includes a controller, a first solid state disk (SSD), and a second SSD.
  • the first SSD and the second SSD each include a plurality of fault domains
  • the storage system includes a chunk group that is formed based on an erasure code algorithm, and the chunk group includes a first chunk and a second chunk.
  • An address of the first chunk is mapped to a physical address provided by a first fault domain of the first SSD
  • an address of the second chunk is mapped to a physical address provided by a second fault domain of the second SSD.
  • the method includes: receiving, by the controller, fault information of the first SSD; and in response to the fault information, recovering, by the controller based on the erasure code algorithm, data stored at a logical address of the first chunk in the chunk group.
  • the first SSD and the second SSD each include a plurality of fault domains, but a quantity of fault domains in the first SSD and a quantity of fault domains in the second SSD may be different. Therefore, compared with the prior art, in the storage system in this embodiment of the present disclosure, it is unnecessary to reconstruct data at all logical addresses of a faulty SSD, and only data at some logical addresses of the SSD needs to be reconstructed, where the some logical addresses are logical addresses mapped to physical addresses in faulty fault domains. In this way, a data reconstruction speed is increased.
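The benefit described above can be illustrated with a small back-of-the-envelope sketch. The capacity, domain count, and reconstruction speed are assumptions taken from the examples in this disclosure, not fixed parameters:

```python
# Illustrative sketch (not the patented implementation): with per-fault-domain
# reconstruction, only chunks whose addresses map into the faulty fault domain
# are rebuilt, instead of every chunk on the SSD.

SSD_CAPACITY_TB = 32          # assumed capacity
NUM_FAULT_DOMAINS = 32        # assumed number of fault domains per SSD
SPEED_TB_PER_HOUR = 1 / 5     # 1 TB every five hours, as in the example above

def rebuild_hours(capacity_tb: float) -> float:
    """Time to reconstruct the given amount of data."""
    return capacity_tb / SPEED_TB_PER_HOUR

# Whole-SSD reconstruction (prior art): all 32 TB are rebuilt.
full = rebuild_hours(SSD_CAPACITY_TB)

# Per-fault-domain reconstruction: only the faulty domain's share is rebuilt.
partial = rebuild_hours(SSD_CAPACITY_TB / NUM_FAULT_DOMAINS)

print(full, partial)   # 160.0 hours vs 5.0 hours
```

Under these assumed numbers, confining reconstruction to one of 32 fault domains cuts the rebuild time by the same factor of 32.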
  • that the address of the first chunk is mapped to the physical address provided by the first fault domain of the first SSD includes: the address of the first chunk is a first logical address of the first SSD, and the first logical address is mapped to the physical address provided by the first fault domain of the first SSD; that the address of the second chunk is mapped to the physical address provided by the second fault domain of the second SSD includes: the address of the second chunk is a second logical address of the second SSD, and the second logical address is mapped to the physical address provided by the second fault domain of the second SSD.
  • the address of the first chunk is the physical address provided by the first fault domain of the first SSD, and that the address of the first chunk is mapped to the physical address provided by the first fault domain of the first SSD is that the address of the first chunk is directly mapped to the physical address provided by the first fault domain of the first SSD; and the address of the second chunk is the physical address provided by the second fault domain of the second SSD, and the address of the second chunk is directly mapped to the physical address provided by the second fault domain of the second SSD.
  • this embodiment of the present disclosure also supports indirect mapping of a chunk address to a physical address provided by a fault domain.
  • the storage system stores a correspondence between the address of the first chunk and the first fault domain and a correspondence between the address of the second chunk and the second fault domain.
  • the address of the first chunk is the first logical address of the first SSD
  • the address of the second chunk is the second logical address of the second SSD.
  • the storage system stores a correspondence between a chunk included in the chunk group and a fault domain, for example, the first chunk belongs to the first fault domain, and the second chunk belongs to the second fault domain.
  • the storage system also stores a fault domain index table.
  • the fault domain index table includes a correspondence between a fault domain and a chunk group.
  • the controller can quickly find, based on the fault domain index table, a chunk group affected by the fault domain, to quickly reconstruct data in a chunk that is in the chunk group and that is affected by the fault domain.
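As a rough illustration of such a fault domain index table, the following sketch keeps a mapping from (SSD, fault domain) pairs to chunk group identifiers; the dictionary layout and all identifiers are hypothetical, not the disclosed data structure:

```python
# Hypothetical sketch of a fault domain index table: a mapping from
# (SSD id, fault domain id) to the chunk groups whose chunks use that
# fault domain.
from collections import defaultdict

fault_domain_index = defaultdict(set)   # (ssd_id, fault_domain_id) -> {ckg_id}

def register_chunk(ckg_id, ssd_id, fault_domain_id):
    """Record that a chunk of chunk group ckg_id lives in this fault domain."""
    fault_domain_index[(ssd_id, fault_domain_id)].add(ckg_id)

# Chunk group 7 has chunks in fault domain 1 of SSD 1 and fault domain 2 of SSD 2;
# chunk group 9 has a chunk in fault domain 3 of SSD 1.
register_chunk(7, ssd_id=1, fault_domain_id=1)
register_chunk(7, ssd_id=2, fault_domain_id=2)
register_chunk(9, ssd_id=1, fault_domain_id=3)

# On fault information "fault domain 1 of SSD 1 is faulty", the controller
# looks up only the affected chunk groups instead of scanning every group.
affected = fault_domain_index[(1, 1)]
print(affected)   # {7}
```

The lookup is a single dictionary access, which is what makes the "quickly find" step above cheap regardless of how many chunk groups the storage system holds.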
  • one fault domain in the first SSD and the second SSD is a plurality of die packages connected on one channel, or is one or more die packages, or is one or more dies, or is one or more flash memory planes.
  • the responding to the fault information includes: querying, by the controller, a correspondence between the first fault domain and the chunk group to determine the chunk group.
  • the storage system stores a correspondence between the address of the first chunk and the first fault domain and a correspondence between the address of the second chunk and the second fault domain.
  • a method for managing a solid state disk (SSD) is provided, where the SSD includes a first fault domain and a second fault domain, and the method includes: assigning a first range of logical addresses of the SSD to the first fault domain, and assigning a second range of logical addresses of the SSD to the second fault domain.
  • the method further includes: separately recording a correspondence between the first fault domain and the first range of logical addresses and a correspondence between the second fault domain and the second range of logical addresses.
  • both the first range of logical addresses and the second range of logical addresses are contiguous logical addresses, or the first range of logical addresses and the second range of logical addresses are non-contiguous logical addresses.
  • the method further includes: sending, by the SSD, a correspondence between the first fault domain and the first range of logical addresses and a correspondence between the second fault domain and the second range of logical addresses to a controller in a storage system, where the storage system includes the SSD.
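A minimal sketch of this assignment, assuming contiguous ranges and an illustrative size of 1 TiB of logical addresses per fault domain (the sizes, byte addressing, and table shape are assumptions):

```python
# Sketch of the SSD-side assignment: each fault domain is assigned one range
# of logical addresses, and the correspondence is recorded so that it can be
# reported to the storage controller.

GIB = 1024 ** 3
DOMAIN_SIZE = 1024 * GIB   # assumed: 1 TiB of logical addresses per fault domain

def assign_ranges(num_domains):
    """Return {fault_domain_id: (first_byte_address, last_byte_address)}."""
    table = {}
    for d in range(num_domains):
        start = d * DOMAIN_SIZE
        table[d] = (start, start + DOMAIN_SIZE - 1)
    return table

correspondence = assign_ranges(32)
# The SSD would then send `correspondence` to the controller; the transport
# (e.g. an administrative command) is not specified in this sketch.
print(correspondence[0], correspondence[31])
```

The recorded table is exactly the correspondence that the method above both stores on the SSD and reports to the controller in the storage system.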
  • an embodiment of the present disclosure provides a controller used in a storage system, and the controller includes units for implementing various solutions in the first aspect.
  • an embodiment of the present disclosure provides an apparatus for managing an SSD, and the apparatus includes units for implementing various solutions in the second aspect.
  • an embodiment of the present disclosure provides a computer readable storage medium, the computer readable storage medium stores a computer instruction, and the computer instruction is used to perform various methods in the first aspect.
  • an embodiment of the present disclosure provides a computer program product including a computer instruction, and the computer instruction is used to perform various methods in the first aspect.
  • an embodiment of the present disclosure provides a computer readable storage medium, the computer readable storage medium stores a computer instruction, and the computer instruction is used to perform various methods in the second aspect.
  • an embodiment of the present disclosure provides a computer program product including a computer instruction, and the computer instruction is used to perform various methods in the second aspect.
  • an embodiment of the present disclosure provides a solid state disk SSD, the SSD includes an SSD controller, a first fault domain, and a second fault domain, and the SSD controller is configured to execute various solutions in the second aspect.
  • an embodiment of the present disclosure provides a controller used in a storage system, and the controller includes an interface and a processor and is configured to implement various solutions in the first aspect.
  • an embodiment of the present disclosure provides a data recovery method in a storage system, and the storage system includes a controller, a first solid state disk SSD, and a second SSD.
  • the first SSD and the second SSD each include a plurality of namespaces, one namespace corresponds to one fault domain, the storage system includes a chunk group that is formed based on an erasure code algorithm, and the chunk group includes a first chunk and a second chunk.
  • An address of the first chunk is a first logical address of a first namespace of the first SSD
  • an address of the second chunk is a second logical address of a second namespace of the second SSD.
  • the first logical address is mapped to a physical address provided by a first fault domain of the first SSD
  • the second logical address is mapped to a physical address provided by a second fault domain of the second SSD.
  • the method includes: receiving, by the controller, fault information of the first SSD, where the fault information is used to indicate that the first fault domain is faulty or the first namespace is faulty; and in response to the fault information, recovering, by the controller based on the erasure code algorithm, data stored at a logical address of the first chunk in the chunk group.
  • a method for managing a solid state disk (SSD) is provided, where the SSD includes a first fault domain and a second fault domain, and the method includes: assigning a first namespace of the SSD to the first fault domain, and assigning a second namespace of the SSD to the second fault domain.
  • the method further includes: separately recording a correspondence between the first fault domain and the first namespace and a correspondence between the second fault domain and the second namespace.
  • the method further includes: sending, by the SSD, the correspondence between the first fault domain and the first namespace and the correspondence between the second fault domain and the second namespace to a controller in a storage system, where the storage system includes the SSD.
  • the method further includes: sending, by the SSD, a correspondence between the first fault domain and a logical address of the first namespace and a correspondence between the second fault domain and a logical address of the second namespace to a controller in a storage system, where the storage system includes the SSD.
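The namespace-based assignment can be sketched in the same spirit; the `nsid` values and the dictionary tables below are illustrative assumptions, not the disclosed data structures:

```python
# Illustrative sketch of the namespace variant: each namespace of the SSD is
# assigned to one fault domain, and the correspondence is recorded in both
# directions so that it can be sent to the storage controller.

namespace_to_domain = {}          # nsid -> fault domain id
domain_to_namespaces = {}         # fault domain id -> [nsid, ...]

def assign_namespace(nsid, fault_domain_id):
    """Record the namespace/fault-domain correspondence in both directions."""
    namespace_to_domain[nsid] = fault_domain_id
    domain_to_namespaces.setdefault(fault_domain_id, []).append(nsid)

assign_namespace(nsid=1, fault_domain_id=0)   # first namespace -> first fault domain
assign_namespace(nsid=2, fault_domain_id=1)   # second namespace -> second fault domain

# Fault information may name either the namespace or the fault domain; the
# recorded correspondence resolves one from the other.
print(namespace_to_domain[1], domain_to_namespaces[1])   # 0 [2]
```

Because fault information may indicate either a faulty fault domain or a faulty namespace, keeping the correspondence in both directions lets the controller resolve the affected chunks in either case.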
  • FIG. 1 is a schematic diagram of a storage system according to an embodiment of the present disclosure
  • FIG. 2 is a schematic structural diagram of a storage array controller according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a distributed storage system according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a server in a distributed storage system according to an embodiment of the present disclosure
  • FIG. 5 is a schematic structural diagram of an SSD according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of a relationship between a fault domain in an SSD and a logical address according to an embodiment of the present disclosure
  • FIG. 7 is a schematic diagram of a chunk group relationship in a storage system
  • FIG. 8 is a schematic diagram of a fault domain index table
  • FIG. 9 is a schematic diagram of a namespace index table
  • FIG. 10 is a schematic structural diagram of a controller
  • FIG. 11 is a schematic structural diagram of an apparatus for managing an SSD.
  • a fault range is limited in the fault domain, so that an influence range on a storage system side is reduced, reconstruction overheads are reduced, and less time is consumed to reconstruct smaller storage space, thereby improving reliability.
  • a storage system in an embodiment of the present disclosure may be a storage array (such as the OceanStor® 18000 series or the OceanStor® Dorado® V3 series of Huawei®).
  • the storage array includes a controller 101 and a plurality of SSDs.
  • the controller 101 includes a central processing unit (CPU) 201, a memory 202, and an interface 203.
  • the memory 202 stores a computer instruction.
  • the CPU 201 executes the computer instruction in the memory 202 to manage the storage system, perform a data access operation, a data recovery operation, and the like.
  • a field programmable gate array (FPGA) or other hardware may also be used to perform all operations of the CPU 201 in this embodiment of the present disclosure, or an FPGA or other hardware and the CPU 201 may each perform some of these operations.
  • in this embodiment, a processor refers to the CPU 201 and the memory 202 in combination, or to any of the foregoing alternative implementations, and the processor communicates with the interface 203.
  • the interface 203 may be a network interface card (NIC) or a host bus adapter (HBA).
  • a distributed block storage system includes a plurality of servers, such as a server 1, a server 2, a server 3, a server 4, a server 5, and a server 6.
  • the servers communicate with each other by using InfiniBand, an Ethernet network, or the like.
  • a quantity of servers in the distributed block storage system may be increased based on actual requirements. This is not limited in this embodiment of the present disclosure.
  • the server in the distributed block storage system includes a structure shown in FIG. 4 .
  • each server in the distributed block storage system includes a central processing unit (CPU) 401, a memory 402, an interface 403, an SSD 1, an SSD 2, and an SSD 3.
  • the memory 402 stores a computer instruction, and the CPU 401 executes the computer instruction in the memory 402 to perform a corresponding operation.
  • the interface 403 may be a hardware interface such as a network interface card (NIC) or a host bus adapter (HBA), or may be a program interface module or the like.
  • a field programmable gate array (FPGA) or other hardware may alternatively perform the foregoing corresponding operation in place of the CPU 401 , or an FPGA or other hardware may perform the foregoing corresponding operation together with the CPU 401 .
  • the combination of the CPU 401 and the memory 402, an FPGA or other hardware used in place of the CPU 401, or an FPGA or other hardware working together with the CPU 401 are collectively referred to as a processor.
  • a server responsible for storage management in the distributed storage system is referred to as a controller.
  • the controller is configured to manage storage space, access data, and the like.
  • In the SSD, a page is used as a read/write unit, and a block is used as an erase unit.
  • the SSD can implement parallelism of data access of a plurality of levels such as a channel, a die package, a flash memory chip, a die, and a flash memory plane.
  • die packages in a flash memory are organized in a multi-channel manner, a plurality of die packages may be connected on each channel, and the plurality of die packages share a transmission channel, but can independently execute instructions.
  • As shown in FIG. 5, the SSD includes an interface 501, an SSD controller 502, a channel 503, and a package 504.
  • One package 504 includes a plurality of flash memory chips, each flash memory chip includes one or more dies, each die includes a plurality of flash memory planes, each flash memory plane includes a plurality of blocks, and each block includes a plurality of pages.
  • the interface 501 may be an interface that supports a serial attached small computer system interface (SAS) protocol, a non-volatile memory express (NVMe) protocol, a peripheral component interconnect express (PCIe) protocol, or the like.
  • When the SSD is faulty, usually only some elements of the SSD are faulty, such as a physical block, rather than the entire SSD.
  • a range potentially affected by the fault is not the entire SSD, but a part of the SSD. This part potentially affected by the fault is referred to as a fault domain in this embodiment of the present disclosure.
  • the SSD is divided into a plurality of fault domains, for example, a plurality of die packages connected on a channel are used as a fault domain, or one or more dies are used as a fault domain, or one or more flash memory planes are used as a fault domain.
  • the fault domain is considered as the range potentially affected by the fault, and data in the faulty fault domain needs to be recovered.
  • that the fault domain of the SSD is faulty may be that the entire fault domain is faulty, or may be that the fault domain is partially faulty.
  • Other components of the SSD may also be used as a fault domain in the embodiments of the present disclosure. This is not limited in this embodiment of the present disclosure.
  • the SSD monitors a status of each fault domain. In specific implementation, the controller of the SSD monitors the status of the fault domain through background inspection or the like. The SSD may further determine a health status of the fault domain based on a quantity of erasure times of a physical block in each fault domain, in other words, determine the status of the fault domain based on a wear degree.
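As one hedged illustration of wear-based monitoring, a fault domain's status could be derived from the erase counts of its physical blocks; the threshold, the 90% margin, and the three states below are assumptions for illustration, not values from this disclosure:

```python
# Sketch of wear-based health monitoring inside the SSD: the fault domain's
# status is derived from the erase counts of its physical blocks.

ERASE_LIMIT = 3000   # assumed rated program/erase cycles per block

def domain_status(erase_counts):
    """Classify a fault domain from the erase counts of its blocks."""
    worst = max(erase_counts)
    if worst >= ERASE_LIMIT:
        return "faulty"          # report fault information to the controller
    if worst >= 0.9 * ERASE_LIMIT:
        return "near-end-of-life"
    return "healthy"

print(domain_status([120, 450, 300]))     # healthy
print(domain_status([2995, 100, 2700]))   # near-end-of-life
print(domain_status([3001, 10, 5]))       # faulty
```

In practice the SSD controller would combine such wear statistics with background inspection results before reporting fault information for a domain.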
  • the SSD externally provides storage space in a form of a logical address.
  • a logical address is a logical block address (LBA)
  • the SSD maps the LBA to a page on a physical block of the SSD by using a flash translation layer (FTL), and establishes a mapping relationship from the LBA to a page address.
  • mapping from the LBA to the page is configured in the SSD based on the fault domain.
  • For example, one SSD includes 128 dies and has an available capacity of 32 TB; in other words, the SSD can provide 32 TB of logical address space.
  • the SSD includes 32 fault domains, and fault domain identifiers are respectively 0 to 31.
  • the SSD may identify the fault domain by using a number or in another manner. This is not limited in this embodiment of the present disclosure.
  • each fault domain corresponds to a specific range of LBAs of the SSD.
  • an LBA range corresponding to the fault domain 0 is 0 to (1 TB − 1)
  • an LBA range corresponding to the fault domain 1 is 1 TB to (2 TB − 1)
  • an LBA range corresponding to the fault domain 31 is 31 TB to (32 TB − 1); in other words, logical addresses corresponding to one fault domain are contiguous.
  • a specific range of logical addresses are assigned to the fault domain, and this is also referred to as mapping, by the SSD, a specific range of logical addresses to a physical address in a particular fault domain based on the FTL.
  • assigning a specific range of logical addresses to a fault domain does not require establishment of all mapping from the specific range of logical addresses to a physical address in the fault domain.
  • the SSD selects the physical address from the fault domain to establish the mapping.
  • LBAs in each fault domain may be non-contiguous, in other words, a specific range of logical addresses may be non-contiguous logical addresses.
  • the LBA space of 32 TB is divided at a granularity of 1 gigabyte (GB), and the 1 GB parts are assigned to the 32 fault domains in turn, so that each fault domain still provides physical addresses for 1 TB of LBAs.
  • an LBA range corresponding to the fault domain 0 is 0 to (1 GB − 1)
  • an LBA range corresponding to the fault domain 1 is 1 GB to (2 GB − 1)
  • an LBA range corresponding to the fault domain 31 is 31 GB to (32 GB − 1)
  • the next LBA range corresponding to the fault domain 0 is 32 GB to (33 GB − 1)
  • . . . and the LBA range corresponding to the fault domain 31 is 63 GB to (64 GB − 1), and so on.
  • a correspondence between a fault domain and an LBA range is established through cyclic interleaving.
  • the LBA ranges corresponding to one fault domain are non-contiguous.
  • the SSD stores the foregoing mapping relationship between an LBA range and a fault domain.
  • the SSD reports the foregoing mapping relationship between an LBA range and a fault domain to the controller 101 .
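The two assignment schemes described above, contiguous LBA ranges and cyclic interleaving, can be sketched as follows. This is an illustrative sketch rather than code from the disclosure; the capacity, fault domain count, and granularity are taken from the example above, and the function names are invented.

```python
GB = 1 << 30
TB = 1 << 40

NUM_FAULT_DOMAINS = 32
SSD_CAPACITY = 32 * TB

def fault_domain_contiguous(lba):
    """Contiguous scheme: fault domain 0 owns LBAs 0 to (1 TB - 1),
    fault domain 1 owns 1 TB to (2 TB - 1), and so on."""
    return lba // (SSD_CAPACITY // NUM_FAULT_DOMAINS)

def fault_domain_interleaved(lba, granularity=GB):
    """Interleaved scheme: 1 GB stripes are assigned to fault domains
    0 to 31 in a cycle, so the LBAs owned by one domain are non-contiguous."""
    return (lba // granularity) % NUM_FAULT_DOMAINS
```

Either function gives a constant-time way to tell which fault domain a given LBA maps to, which is all the controller needs in order to bound the scope of a fault.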
  • the storage array shown in FIG. 1 is used as an example of the storage system. The SSD provides a chunk (CK) of a fixed length, and the controller 101 forms a chunk group (CKG) from CKs of different SSDs by using a redundancy algorithm such as an erasure code (EC) algorithm.
  • the EC algorithm may be a RAID algorithm.
  • the CKG includes a CK 1 , a CK 2 , and a CK 3 .
  • the CK 1 is provided by an SSD 1
  • the CK 2 is provided by an SSD 2
  • the CK 3 is provided by an SSD 3 .
  • An address of the CK 1 is an LBA 1 of the SSD 1
  • an address of the CK 2 is an LBA 2 of the SSD 2
  • an address of the CK 3 is an LBA 3 of the SSD 3 .
  • the LBA 1 is mapped to a physical address provided by a fault domain 1 of the SSD 1
  • the address of the CK 1 is referred to as being mapped to the physical address provided by the fault domain 1 of the SSD 1
  • the LBA 2 is mapped to a physical address provided by a fault domain 2 of the SSD 2
  • the LBA 3 is mapped to a physical address provided by a fault domain 3 of the SSD 3 .
  • a fault domain of an SSD that provides the CK may be determined based on load.
  • the load may be of an input/output (IO) type, IO temperature, or the like.
  • the SSD sends the correspondence between a fault domain and an LBA to the controller 101 .
  • the controller 101 may determine, based on a correspondence between a fault domain of the SSD and a logical address, a fault domain corresponding to a logical address of each CK in the CKG.
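The CKG formation described above can be sketched as follows: one chunk is taken from each SSD, and a caller-supplied policy chooses the fault domain (for example, based on load) and the start LBA inside it. The `Chunk` type and function names are illustrative assumptions, not from the disclosure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    ssd_id: int
    fault_domain: int
    lba: int  # start LBA of the chunk inside its SSD

def build_chunk_group(ssd_ids, pick_domain):
    """Form a CKG by taking one chunk from each SSD; pick_domain chooses a
    fault domain for the SSD (e.g. based on load) and returns that domain's
    identifier together with the start LBA its allocator hands out."""
    return [Chunk(ssd_id, *pick_domain(ssd_id)) for ssd_id in ssd_ids]
```

Because every chunk records its fault domain, the controller can later map a reported fault domain failure back to exactly the chunk groups it touches.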
  • the controller 101 obtains status information of the SSD.
  • when the fault domain 1 of the SSD 1 is faulty, the SSD 1 sends fault information to the controller 101 to indicate that the fault domain 1 is faulty. The controller 101 can determine, based on the correspondence between a fault domain of an SSD and a logical address, the LBAs affected when the fault domain 1 of the SSD 1 is faulty. Because the storage array includes a plurality of CKGs, the controller 101 learns, through searching, which CKG includes a CK whose address is an LBA mapped to the fault domain 1 of the SSD 1. For example, it is determined that an address of a CK 1 included in a CKG 1 is an LBA mapped to the fault domain 1 of the SSD 1.
  • the controller 101 recovers data of the CK 1 in the CKG 1 based on a redundancy algorithm such as the EC algorithm. Therefore, compared with the prior art, in this embodiment of the present disclosure, it is unnecessary to reconstruct CKs corresponding to all logical addresses provided by the SSD 1 , thereby increasing a data reconstruction speed.
  • the data in the CK 1 may be recovered to another fault domain of the SSD 1 or another SSD. This is not limited in this embodiment of the present disclosure.
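The per-chunk recovery step can be illustrated with single-parity XOR standing in for the EC algorithm (a real system would typically use a more general erasure code). The point of the sketch is that only the one affected chunk is rebuilt from its surviving peers, not all chunks of the faulty SSD.

```python
def xor_bytes(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def recover_chunk(chunk_group, lost_index):
    """Rebuild the single lost chunk of a CKG from the surviving chunks.
    Single-parity XOR stands in for the erasure code algorithm here."""
    survivors = [c for i, c in enumerate(chunk_group) if i != lost_index]
    return xor_bytes(survivors)
```

The recovered bytes can then be written to another fault domain of the same SSD or to another SSD, as the description notes.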
  • the SSD reports the mapping relationship between an LBA and a fault domain to the controller 101. Therefore, the storage array stores a correspondence between an address of a CK included in a CKG and a fault domain. For example, a first CK belongs to a first fault domain, and a second CK belongs to a second fault domain. Further, to quickly learn, through searching, that the address of the CK included in the CKG is an LBA mapped to the fault domain of the SSD 1, the storage array further stores a fault domain index table based on the mapping relationship between an LBA and a fault domain. For example, the fault domain index table includes a correspondence between a fault domain and a CKG, for example, a correspondence between a fault domain identifier and a CKG identifier.
  • a same CKG includes CKs from fault domains of different SSDs
  • different fault domains may correspond to a same CKG.
  • the controller 101 can quickly find, based on the fault domain index table, a CKG affected by the fault domain, to quickly reconstruct data in a CK that is in the CKG and that is affected by the fault domain.
  • the controller 101 may record a corresponding entry in the fault domain index table based on the mapping relationship between an LBA and a fault domain, and the entry includes the correspondence between a fault domain and a CKG.
  • a multi-level fault domain index table may be created, for example, a first level is an SSD and fault domain index table, and a second level is a fault domain and CKG index table.
  • the fault domain index table may be partitioned based on the SSD, thereby facilitating quick query.
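A minimal sketch of the multi-level fault domain index table described above: the first level is keyed by SSD, the second by fault domain, and each entry holds the identifiers of the CKGs that use that domain. The class and method names are invented for illustration.

```python
from collections import defaultdict

class FaultDomainIndex:
    """Two-level index: SSD id -> fault domain id -> set of affected CKG ids.
    Partitioning by SSD keeps each per-fault lookup small."""

    def __init__(self):
        self._index = defaultdict(lambda: defaultdict(set))

    def record(self, ssd_id, fault_domain, ckg_id):
        """Record that a CKG includes a chunk from this fault domain."""
        self._index[ssd_id][fault_domain].add(ckg_id)

    def affected_ckgs(self, ssd_id, fault_domain):
        """Return the CKGs to reconstruct when this fault domain fails."""
        return self._index[ssd_id][fault_domain]
```

Because a CKG draws chunks from several SSDs, the same CKG identifier naturally appears under several fault domains, matching the correspondence described above.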
  • a corresponding namespace may be allocated to the SSD based on a quantity of fault domains, to be specific, one fault domain corresponds to one namespace. Therefore, logical addresses of different namespaces of an SSD can be independently addressed. The example in which the available capacity of the SSD is 32 TB is used again.
  • the SSD is divided into 32 fault domains and one namespace is allocated to one fault domain.
  • An LBA range of each namespace is 0 to (1 TB − 1).
  • An LBA of one namespace is mapped to a physical address in a fault domain corresponding to the namespace.
  • the SSD reports a mapping relationship between a namespace and a fault domain to the controller 101 .
  • the SSD stores the mapping relationship between a namespace and a fault domain.
  • the SSD reports the mapping relationship between a namespace and a fault domain to the controller 101 .
  • a mapping relationship between an LBA in a namespace and a fault domain may be reported.
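The namespace variant can be sketched as follows: one namespace per fault domain, each addressed independently from LBA 0, so the namespace identifier alone identifies the fault domain. The sizes follow the 32 TB example above; the function name is an assumption.

```python
TB = 1 << 40
NUM_FAULT_DOMAINS = 32

def allocate_namespaces(num_domains=NUM_FAULT_DOMAINS, ns_size=TB):
    """One namespace per fault domain; every namespace is independently
    addressed from LBA 0 to (ns_size - 1), so a namespace id by itself
    identifies the fault domain backing it."""
    return {
        ns_id: {"fault_domain": ns_id, "lba_range": (0, ns_size - 1)}
        for ns_id in range(num_domains)
    }
```

With this layout, fault information only needs to carry a namespace identifier for the controller to know which fault domain, and hence which chunks, are affected.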
  • CKs are selected from a plurality of SSDs to form a CKG
  • a namespace of an SSD that provides the CK may be determined based on load.
  • the load may be of an input/output (IO) type, IO temperature, or the like.
  • the storage array stores a fault domain index table.
  • the storage array stores a namespace index table
  • the namespace index table includes a correspondence between a namespace and a CKG, for example, a correspondence between a namespace identifier and a CKG identifier. Because a same CKG includes CKs from namespaces of different SSDs, in the namespace index table, different namespaces may correspond to a same CKG.
  • the SSD reports fault information to the controller 101 , and the fault information is used to indicate a namespace in which a fault occurs.
  • the fault information includes a namespace identifier.
  • the controller 101 can quickly find, based on the namespace index table, a CKG affected by the fault domain, to quickly reconstruct data in a CK that is in the CKG and that is affected by the fault domain.
  • the controller 101 may record a corresponding entry in the namespace index table based on the mapping relationship between a namespace and a fault domain, and the entry includes the correspondence between a namespace and a CKG.
  • a multi-level namespace index table may be created, for example, a first level is an SSD and namespace index table, and a second level is a namespace and CKG index table.
  • the namespace index table may be partitioned based on the SSD, thereby facilitating quick query.
  • the controller of the SSD collects wear information of each fault domain inside the SSD, and reports the wear information of the fault domain to the controller 101 .
  • the controller 101 selects, based on a wear degree of each fault domain of the SSD and a data modification frequency, a CK that is mapped to a physical address of a corresponding fault domain.
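One plausible placement policy matching this description, offered as a sketch rather than the disclosed algorithm: frequently modified ("hot") data goes to the least-worn fault domain and rarely modified data to the most-worn one, which tends to even out wear across domains.

```python
def pick_fault_domain(wear_by_domain, data_is_hot):
    """Choose a fault domain based on wear degree and modification frequency.
    wear_by_domain maps fault domain id -> wear degree (higher = more worn).
    Hot data lands in the least-worn domain, cold data in the most-worn one."""
    ordered = sorted(wear_by_domain, key=wear_by_domain.get)
    return ordered[0] if data_is_hot else ordered[-1]
```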
  • This embodiment of the present disclosure may also be applied to an SSD that supports an open channel.
  • In the SSD that supports the open channel, in one implementation, the SSD is divided into a plurality of fault domains, and the controller 101 in the storage system can directly access a physical address of the SSD.
  • an address of a CK that constitutes the CKG in the storage system may be the physical address of the SSD, in other words, the address of the CK is a physical address provided by the fault domain of the SSD, and the address of the CK is mapped to the physical address provided by the fault domain of the SSD.
  • an embodiment of the present disclosure also provides a controller applied to a storage system.
  • the storage system includes the controller, a first solid state disk SSD, and a second SSD.
  • the first SSD and the second SSD each include a plurality of fault domains
  • the storage system includes a chunk group that is formed based on an erasure code algorithm, and the chunk group includes a first chunk and a second chunk.
  • An address of the first chunk is mapped to a physical address provided by a first fault domain of the first SSD, and an address of the second chunk is mapped to a physical address provided by a second fault domain of the second SSD.
  • the controller includes a receiving unit 1001 and a recovery unit 1002 .
  • the receiving unit 1001 is configured to receive fault information of the first SSD, where the fault information is used to indicate that the first fault domain is faulty.
  • the recovery unit 1002 is configured to: in response to the fault information, recover, based on the erasure code algorithm, data stored at the address of the first chunk in the chunk group.
  • the storage system stores a correspondence between the address of the first chunk and the first fault domain and a correspondence between the address of the second chunk and the second fault domain
  • the controller further includes a query unit, configured to query a correspondence between the first fault domain and the chunk group to determine the chunk group.
  • the controller provided in FIG. 10 of the embodiments of the present disclosure may be implemented by software.
  • an embodiment of the present disclosure further provides an apparatus for managing an SSD, where the SSD includes a first fault domain and a second fault domain.
  • the apparatus for managing the SSD includes: a first assignment unit 1101 , configured to assign a first range of logical addresses of the SSD to the first fault domain; and a second assignment unit 1102 , configured to assign a second range of logical addresses of the SSD to the second fault domain.
  • the apparatus for managing the SSD further includes a sending unit, configured to send a correspondence between the first fault domain and the first range of logical addresses and a correspondence between the second fault domain and the second range of logical addresses to a controller in a storage system, where the storage system includes the SSD.
  • the apparatus for managing the SSD further includes a recording unit, configured to separately record the correspondence between the first fault domain and the first range of logical addresses and the correspondence between the second fault domain and the second range of logical addresses.
  • An embodiment of the present disclosure provides a computer readable storage medium, and the computer readable storage medium stores a computer instruction.
  • the computer instruction runs on the controller 101 shown in FIG. 1 or the server shown in FIG. 4 , the method in the embodiments of the present disclosure is performed.
  • An embodiment of the present disclosure provides a computer program product including a computer instruction.
  • the computer instruction runs on the controller 101 shown in FIG. 1 or the server shown in FIG. 4 , the method in the embodiments of the present disclosure is performed.
  • Each unit of a data recovery apparatus may be implemented by a processor, or may be jointly implemented by a processor and a memory, or may be implemented by software.
  • An embodiment of the present disclosure provides a computer program product including a computer instruction.
  • the computer instruction runs on a controller of an SSD, the method for managing the SSD in the embodiments of the present disclosure is performed.
  • the logical address in the embodiments of the present disclosure may be alternatively a key value (KV) in a KV disk, a log in a log disk, or the like.
  • the correspondence has a same meaning as the mapping relationship.
  • the expression of a correspondence between an address of a chunk and a fault domain has a same meaning as a correspondence between a fault domain and an address of a chunk.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the described apparatus embodiments are merely examples.
  • the unit division is merely logical function division and may be other division in actual implementation.
  • a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces.
  • the indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
  • the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected based on actual requirements to achieve the objectives of the solutions of the embodiments.
  • functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
  • the computer software product is stored in a storage medium and includes several computer instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure.
  • the storage medium includes various media that can store computer instructions, such as, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Abstract

Embodiments of the present disclosure provide a data recovery method in a storage system. A solid state disk is divided into a plurality of fault domains, and each fault domain is used to provide a physical address for a specific range of logical addresses of an SSD, so that when a fault domain of the solid state disk is faulty, it is unnecessary to reconstruct data in the entire SSD.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This application is a continuation of International Application No. PCT/CN2019/103085, filed on Aug. 28, 2019, which claims priority to Chinese Patent Application No. 201811560345.7, filed on Dec. 20, 2018, which claims priority to Chinese Patent Application No. 201811248415.5, filed on Oct. 25, 2018. All of the aforementioned patent applications are hereby incorporated by reference in their entireties.
  • TECHNICAL FIELD
  • The present disclosure relates to the field of information technologies, and in particular, to a data recovery method, system, and apparatus in a storage system.
  • BACKGROUND
  • A redundant array of independent disks (RAID) technology is a technology widely used in storage systems to ensure data reliability. When a hard disk in the storage system is damaged, data on the damaged hard disk can be re-calculated by using data and parity data on an undamaged hard disk. Such a process is referred to as reconstruction of a RAID.
  • In a RAID-based storage system including a solid state disk (SSD), if a reconstruction speed is 1 terabyte (TB) every five hours, five hours are consumed to reconstruct the entire disk even if an SSD with a capacity of 1 TB is only partially faulty, and if the capacity of the SSD is 100 TB, the reconstruction time is 500 hours.
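The arithmetic behind these figures can be made concrete; the per-fault-domain comparison below assumes the 32-domain split used later in the disclosure and is an illustration, not a claim from the background.

```python
def reconstruction_hours(capacity_tb, hours_per_tb=5):
    """Reconstruction time at a speed of 1 TB every five hours."""
    return capacity_tb * hours_per_tb
```

Rebuilding a whole 100 TB SSD thus takes 500 hours, while rebuilding only one of 32 fault domains (about 3.125 TB) takes roughly 15.6 hours.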
  • SUMMARY
  • According to a first aspect, a data recovery method in a storage system is provided. The storage system includes a controller, a first solid state disk SSD, and a second SSD. The first SSD and the second SSD each include a plurality of fault domains, the storage system includes a chunk group that is formed based on an erasure code algorithm, and the chunk group includes a first chunk and a second chunk. An address of the first chunk is mapped to a physical address provided by a first fault domain of the first SSD, and an address of the second chunk is mapped to a physical address provided by a second fault domain of the second SSD. The method includes: receiving, by the controller, fault information of the first SSD; and in response to the fault information, recovering, by the controller based on the erasure code algorithm, data stored at a logical address of the first chunk in the chunk group. The first SSD and the second SSD each include a plurality of fault domains, but a quantity of fault domains in the first SSD and a quantity of fault domains in the second SSD may be different. Therefore, compared with the prior art, in the storage system in this embodiment of the present disclosure, it is unnecessary to reconstruct data at all logical addresses of a faulty SSD, and only data at some logical addresses of the SSD needs to be reconstructed, where these logical addresses are logical addresses mapped to physical addresses in faulty fault domains. In this way, a data reconstruction speed is increased.
  • In specific implementation, that the address of the first chunk is mapped to the physical address provided by the first fault domain of the first SSD includes: the address of the first chunk is a first logical address of the first SSD, and the first logical address is mapped to the physical address provided by the first fault domain of the first SSD; that the address of the second chunk is mapped to the physical address provided by the second fault domain of the second SSD includes: the address of the second chunk is a second logical address of the second SSD, and the second logical address is mapped to the physical address provided by the second fault domain of the second SSD. In another implementation, in an SSD scenario supporting an open channel, the address of the first chunk is the physical address provided by the first fault domain of the first SSD, and that the address of the first chunk is mapped to the physical address provided by the first fault domain of the first SSD is that the address of the first chunk is directly mapped to the physical address provided by the first fault domain of the first SSD; and the address of the second chunk is the physical address provided by the second fault domain of the second SSD, and the address of the second chunk is directly mapped to the physical address provided by the second fault domain of the second SSD. In another implementation, in an SSD scenario supporting an open channel, this embodiment of the present disclosure also supports indirect mapping of a chunk address to a physical address provided by a fault domain.
  • With reference to the first aspect, in some implementations of the first aspect, the storage system stores a correspondence between the address of the first chunk and the first fault domain and a correspondence between the address of the second chunk and the second fault domain. The address of the first chunk is the first logical address of the first SSD, and the address of the second chunk is the second logical address of the second SSD. Further, the storage system stores a correspondence between a chunk included in the chunk group and a fault domain, for example, the first chunk belongs to the first fault domain, and the second chunk belongs to the second fault domain. Further, the storage system also stores a fault domain index table. For example, the fault domain index table includes a correspondence between a fault domain and a chunk group. Because a same chunk group includes chunks from fault domains of different SSDs, in the fault domain index table, different fault domains may correspond to a same chunk group. When a fault domain of an SSD is faulty, the controller can quickly find, based on the fault domain index table, a chunk group affected by the fault domain, to quickly reconstruct data in a chunk that is in the chunk group and that is affected by the fault domain.
  • Optionally, one fault domain in the first SSD and the second SSD is a plurality of die packages connected on one channel, or is one or more die packages, or is one or more dies, or is one or more flash memory planes.
  • With reference to the first aspect, in some implementations of the first aspect, the responding to the fault information includes: querying, by the controller, a correspondence between the first fault domain and the chunk group to determine the chunk group.
  • With reference to the first aspect, in some implementations of the first aspect, the storage system stores a correspondence between the address of the first chunk and the first fault domain and a correspondence between the address of the second chunk and the second fault domain.
  • According to a second aspect, a method for managing a solid state disk SSD is provided, where the SSD includes a first fault domain and a second fault domain, and the method includes: assigning a first range of logical addresses of the SSD to the first fault domain, and assigning a second range of logical addresses of the SSD to the second fault domain.
  • With reference to the second aspect, in some implementations of the second aspect, the method further includes: separately recording a correspondence between the first fault domain and the first range of logical addresses and a correspondence between the second fault domain and the second range of logical addresses.
  • With reference to the second aspect, in some implementations of the second aspect, both the first range of logical addresses and the second range of logical addresses are contiguous logical addresses, or the first range of logical addresses and the second range of logical addresses are non-contiguous logical addresses.
  • With reference to the second aspect, in some implementations of the second aspect, the method further includes: sending, by the SSD, a correspondence between the first fault domain and the first range of logical addresses and a correspondence between the second fault domain and the second range of logical addresses to a controller in a storage system, where the storage system includes the SSD.
  • According to a third aspect, an embodiment of the present disclosure provides a controller used in a storage system, and the controller includes units for implementing various solutions in the first aspect.
  • According to a fourth aspect, an embodiment of the present disclosure provides an apparatus for managing an SSD, and the apparatus includes units for implementing various solutions in the second aspect.
  • According to a fifth aspect, an embodiment of the present disclosure provides a computer readable storage medium, the computer readable storage medium stores a computer instruction, and the computer instruction is used to perform various methods in the first aspect.
  • According to a sixth aspect, an embodiment of the present disclosure provides a computer program product including a computer instruction, and the computer instruction is used to perform various methods in the first aspect.
  • According to a seventh aspect, an embodiment of the present disclosure provides a computer readable storage medium, the computer readable storage medium stores a computer instruction, and the computer instruction is used to perform various methods in the second aspect.
  • According to an eighth aspect, an embodiment of the present disclosure provides a computer program product including a computer instruction, and the computer instruction is used to perform various methods in the second aspect.
  • According to a ninth aspect, an embodiment of the present disclosure provides a solid state disk SSD, the SSD includes an SSD controller, a first fault domain, and a second fault domain, and the SSD controller is configured to execute various solutions in the second aspect.
  • According to a tenth aspect, an embodiment of the present disclosure provides a controller used in a storage system, and the controller includes an interface and a processor and is configured to implement various solutions in the first aspect.
  • According to an eleventh aspect, an embodiment of the present disclosure provides a data recovery method in a storage system, and the storage system includes a controller, a first solid state disk SSD, and a second SSD. The first SSD and the second SSD each include a plurality of namespaces, one namespace corresponds to one fault domain, the storage system includes a chunk group that is formed based on an erasure code algorithm, and the chunk group includes a first chunk and a second chunk. An address of the first chunk is a first logical address of a first namespace of the first SSD, and an address of the second chunk is a second logical address of a second namespace of the second SSD. The first logical address is mapped to a physical address provided by a first fault domain of the first SSD, and the second logical address is mapped to a physical address provided by a second fault domain of the second SSD. The method includes: receiving, by the controller, fault information of the first SSD, where the fault information is used to indicate that the first fault domain is faulty or the first namespace is faulty; and in response to the fault information, recovering, by the controller based on the erasure code algorithm, data stored at a logical address of the first chunk in the chunk group.
  • According to a twelfth aspect, a method for managing a solid state disk SSD is provided. The SSD includes a first fault domain and a second fault domain, and the method includes: assigning a first namespace of the SSD to the first fault domain, and assigning a second namespace of the SSD to the second fault domain.
  • With reference to the twelfth aspect, in some implementations of the twelfth aspect, the method further includes: separately recording a correspondence between the first fault domain and the first namespace and a correspondence between the second fault domain and the second namespace.
  • With reference to the twelfth aspect, in some implementations of the twelfth aspect, the method further includes: sending, by the SSD, the correspondence between the first fault domain and the first namespace and the correspondence between the second fault domain and the second namespace to a controller in a storage system, where the storage system includes the SSD.
  • With reference to the twelfth aspect, in some implementations of the twelfth aspect, the method further includes: sending, by the SSD, a correspondence between the first fault domain and a logical address of the first namespace and a correspondence between the second fault domain and a logical address of the second namespace to a controller in a storage system, where the storage system includes the SSD.
  • BRIEF DESCRIPTION OF DRAWINGS
  • To describe the technical solutions in the embodiments of this application more clearly, the following briefly describes the accompanying drawings required for describing the embodiments.
  • FIG. 1 is a schematic diagram of a storage system according to an embodiment of the present disclosure;
  • FIG. 2 is a schematic structural diagram of a storage array controller according to an embodiment of the present disclosure;
  • FIG. 3 is a schematic diagram of a distributed storage system according to an embodiment of the present disclosure;
  • FIG. 4 is a schematic structural diagram of a server in a distributed storage system according to an embodiment of the present disclosure;
  • FIG. 5 is a schematic structural diagram of an SSD according to an embodiment of the present disclosure;
  • FIG. 6 is a schematic diagram of a relationship between a fault domain in an SSD and a logical address according to an embodiment of the present disclosure;
  • FIG. 7 is a schematic diagram of a chunk group relationship in a storage system;
  • FIG. 8 is a schematic diagram of a fault domain index table;
  • FIG. 9 is a schematic diagram of a namespace index table;
  • FIG. 10 is a schematic structural diagram of a controller; and
  • FIG. 11 is a schematic structural diagram of an apparatus for managing an SSD.
  • DESCRIPTION OF EMBODIMENTS
  • The technical solutions in the embodiments of this application are described in more detail below.
  • In the embodiments of the present disclosure, when some components of an SSD in a storage system are partially faulty, based on a manner in which a fault domain of the SSD corresponds to physical space of the SSD, a fault range is limited in the fault domain, so that an influence range on a storage system side is reduced, reconstruction overheads are reduced, and less time is consumed to reconstruct smaller storage space, thereby improving reliability.
  • As shown in FIG. 1, a storage system in an embodiment of the present disclosure may be a storage array (such as the OceanStor® 18000 series or the OceanStor® Dorado® V3 series of Huawei®). The storage array includes a controller 101 and a plurality of SSDs. As shown in FIG. 2, the controller 101 includes a central processing unit (CPU) 201, a memory 202, and an interface 203. The memory 202 stores a computer instruction. The CPU 201 executes the computer instruction in the memory 202 to manage the storage system, perform a data access operation, a data recovery operation, and the like. In addition, to save computing resources of the CPU 201, a field programmable gate array (FPGA) or other hardware may also be used to perform all operations of the CPU 201 in this embodiment of the present disclosure, or an FPGA or other hardware and the CPU 201 are separately configured to perform some operations of the CPU 201 in this embodiment of the present disclosure. For ease of description, in this embodiment of the present disclosure, a processor is used to indicate a combination of the CPU 201 and the memory 202 and the foregoing various implementations, and the processor communicates with the interface 203. The interface 203 may be a network interface card (NIC) or a host bus adapter (HBA).
  • Further, the storage system in this embodiment of the present disclosure may be alternatively a distributed storage system (such as the FusionStorage® series of Huawei®) or the like. The FusionStorage® series of Huawei® is used as an example. For example, as shown in FIG. 3, a distributed block storage system includes a plurality of servers, such as a server 1, a server 2, a server 3, a server 4, a server 5, and a server 6. The servers communicate with each other by using InfiniBand, an Ethernet network, or the like. In actual application, a quantity of servers in the distributed block storage system may be increased based on actual requirements. This is not limited in this embodiment of the present disclosure.
  • The server in the distributed block storage system has the structure shown in FIG. 4. As shown in FIG. 4, each server in the distributed block storage system includes a central processing unit (CPU) 401, a memory 402, an interface 403, an SSD 1, an SSD 2, and an SSD 3. The memory 402 stores a computer instruction, and the CPU 401 executes the computer instruction in the memory 402 to perform a corresponding operation. The interface 403 may be a hardware interface such as a network interface card (NIC) or a host bus adapter (HBA), or may be a program interface module or the like. In addition, to save computing resources of the CPU 401, a field programmable gate array (FPGA) or other hardware may alternatively perform the foregoing corresponding operation in place of the CPU 401, or an FPGA or other hardware may perform the foregoing corresponding operation together with the CPU 401. For ease of description, in this embodiment of the present disclosure, the combination of the CPU 401 and the memory 402, the FPGA or other hardware replacing the CPU 401, and the combination of the FPGA or other hardware and the CPU 401 are collectively referred to as a processor. In the distributed storage system, a server responsible for storage management is referred to as a controller. Specifically, the controller is configured to manage storage space, access data, and the like.
  • In the SSD, a page is used as a read/write unit and a block is used as an erase unit. The SSD can implement parallelism of data access of a plurality of levels such as a channel, a die package, a flash memory chip, a die, and a flash memory plane. In the SSD, die packages in a flash memory are organized in a multi-channel manner, a plurality of die packages may be connected on each channel, and the plurality of die packages share a transmission channel, but can independently execute instructions. For a specific structure of the SSD, refer to FIG. 5. The SSD includes an interface 501, an SSD controller 502, a channel 503, and a package 504. One package 504 includes a plurality of flash memory chips, each flash memory chip includes one or more dies, each die includes a plurality of flash memory planes, each flash memory plane includes a plurality of blocks, and each block includes a plurality of pages. The interface 501 may be an interface that supports a serial attached small computer system interface (SAS) protocol, a non-volatile memory express (NVMe) protocol, a peripheral component interconnect express (PCIe) protocol, or the like.
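  • The parallel hierarchy described above (channel, die package, flash memory chip, die, plane, block, page) can be sketched as nested counts. All numbers below are illustrative assumptions, not values fixed by this embodiment; they are chosen only so that the die total matches the 128-die example used later for fault domain division:

```python
# Illustrative model of the SSD hierarchy in FIG. 5. All counts are
# example assumptions, not values mandated by the embodiment.
SSD_LAYOUT = {
    "channels": 8,
    "packages_per_channel": 2,
    "chips_per_package": 2,
    "dies_per_chip": 4,
    "planes_per_die": 2,
    "blocks_per_plane": 1024,
    "pages_per_block": 256,
}

LEVELS = list(SSD_LAYOUT)

def total(level: str) -> int:
    """Number of units at `level` in the whole SSD (product of counts down to it)."""
    n = 1
    for key in LEVELS[: LEVELS.index(level) + 1]:
        n *= SSD_LAYOUT[key]
    return n
```

With these assumed counts, the SSD contains 8 × 2 × 2 × 4 = 128 dies in total.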
  • If the SSD is faulty, usually only some elements of the SSD are faulty, such as a physical block, but not the entire SSD is faulty. To be specific, when a fault occurs inside the SSD, a range potentially affected by the fault is not the entire SSD, but a part of the SSD. This part potentially affected by the fault is referred to as a fault domain in this embodiment of the present disclosure. Based on a structure of the SSD, the SSD is divided into a plurality of fault domains, for example, a plurality of die packages connected on a channel are used as a fault domain, or one or more dies are used as a fault domain, or one or more flash memory planes are used as a fault domain. In this embodiment of the present disclosure, if the SSD is faulty, the fault domain is considered as the range potentially affected by the fault, and data in the faulty fault domain needs to be recovered. In an actual application scenario, that the fault domain of the SSD is faulty may be that the entire fault domain is faulty, or may be that the fault domain is partially faulty. Other components of the SSD may also be used as a fault domain in the embodiments of the present disclosure. This is not limited in this embodiment of the present disclosure. The SSD monitors a status of each fault domain. In specific implementation, the controller of the SSD monitors the status of the fault domain through background inspection or the like. The SSD may further determine a health status of the fault domain based on a quantity of erasure times of a physical block in each fault domain, in other words, determine the status of the fault domain based on a wear degree.
  • The SSD externally provides storage space in a form of a logical address. In the SSD, a logical address is a logical block address (LBA), and the SSD maps the LBA to a page on a physical block of the SSD by using a flash translation layer (FTL), and establishes a mapping relationship from the LBA to a page address. In this embodiment of the present disclosure, to resolve a problem that in the storage system, if an SSD is faulty, data on the entire SSD needs to be recovered, mapping from the LBA to the page is configured in the SSD based on the fault domain. For example, one SSD includes 128 dies, and an available capacity of the SSD is 32 TB, in other words, the SSD can provide a logical address of 32 TB, or the SSD can provide address space of 32 TB. If an LBA range affected when the SSD is faulty is to be limited to a size of 1 TB, a quantity of fault domains is 32, in other words, 32 TB/1 TB=32. In this embodiment of the present disclosure, the SSD includes 128 dies, and therefore, a quantity of dies in each fault domain is 4, in other words, 128/32=4. As shown in FIG. 6, the SSD includes 32 fault domains, and fault domain identifiers are respectively 0 to 31. In specific implementation, the SSD may identify the fault domain by using a number or in another manner. This is not limited in this embodiment of the present disclosure. In one implementation, each fault domain corresponds to a specific range of LBAs of the SSD. For example, an LBA range corresponding to the fault domain 0 is 0 to (1 TB−1), an LBA range corresponding to the fault domain 1 is 1 TB to (2 TB−1), . . . , and a logical block address range corresponding to the fault domain 31 is 31 TB to (32 TB−1), in other words, logical addresses corresponding to one fault domain are contiguous. The foregoing is also referred to as assignment of a specific range of logical addresses, in other words, a specific range of LBAs, to the fault domain. 
In this embodiment of the present disclosure, a specific range of logical addresses is assigned to the fault domain; this is also referred to as the SSD mapping, based on the FTL, a specific range of logical addresses to physical addresses in a particular fault domain. In this embodiment of the present disclosure, assigning a specific range of logical addresses to a fault domain does not require establishment of all mappings from the specific range of logical addresses to physical addresses in the fault domain. In one implementation, when a mapping from a particular logical block address in the specific range of logical addresses to a physical address needs to be established, the SSD selects the physical address from the fault domain to establish the mapping. In another implementation of this embodiment of the present disclosure, the foregoing SSD is still used as an example. LBAs in each fault domain may be non-contiguous, in other words, a specific range of logical addresses may be non-contiguous logical addresses. For example, the LBA space of 32 TB is divided into slices at a granularity of 1 gigabyte (GB), and the slices are assigned to the 32 fault domains in a round-robin manner, so that each fault domain still provides physical addresses for 1 TB of LBAs in total. To be specific, in the first round, the LBA range corresponding to the fault domain 0 is 0 to (1 GB−1), the LBA range corresponding to the fault domain 1 is 1 GB to (2 GB−1), . . . , and the LBA range corresponding to the fault domain 31 is 31 GB to (32 GB−1); in the second round, the LBA range corresponding to the fault domain 0 is 32 GB to (33 GB−1), the LBA range corresponding to the fault domain 1 is 33 GB to (34 GB−1), . . . , and the LBA range corresponding to the fault domain 31 is 63 GB to (64 GB−1); and so on. A correspondence between a fault domain and an LBA range is thus established through cyclic interleaving. In this implementation, the LBA ranges corresponding to one fault domain are non-contiguous. The SSD stores the foregoing mapping relationship between an LBA range and a fault domain. 
The SSD reports the foregoing mapping relationship between an LBA range and a fault domain to the controller 101.
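The two assignment schemes above, contiguous 1 TB ranges per fault domain and 1 GB slices interleaved round-robin, can be sketched as follows. The sizes follow the 32 TB / 32-fault-domain example in the text; the function names are illustrative, not part of the embodiment:

```python
# Illustrative sketch of the two LBA-to-fault-domain assignment schemes.
# Sizes follow the example in the text: 32 TB capacity, 32 fault domains.
TB = 1 << 40
GB = 1 << 30
NUM_FAULT_DOMAINS = 32

def fault_domain_contiguous(lba: int) -> int:
    """Contiguous scheme: fault domain 0 owns 0..(1 TB - 1),
    fault domain 1 owns 1 TB..(2 TB - 1), and so on."""
    return lba // TB

def fault_domain_interleaved(lba: int) -> int:
    """Interleaved scheme: 1 GB slices are assigned round-robin, so each
    fault domain owns non-contiguous slices totalling 1 TB."""
    return (lba // GB) % NUM_FAULT_DOMAINS
```

Under the interleaved scheme, fault domain 0 owns the slices beginning at 0 GB, 32 GB, 64 GB, and so on, which is the cyclic interleaving described above.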
  • In this embodiment of the present disclosure, the storage array shown in FIG. 1 is used as an example of the storage system in which the SSD provides a chunk (CK) of a fixed length, and the controller 101 forms, by using a redundancy algorithm such as an erasure code (EC) algorithm, a chunk group (CKG) by using chunks from different SSDs. In specific implementation, the EC algorithm may be a RAID algorithm. As shown in FIG. 7, the CKG includes a CK 1, a CK 2, and a CK 3. The CK 1 is provided by an SSD 1, the CK 2 is provided by an SSD 2, and the CK 3 is provided by an SSD 3. An address of the CK 1 is an LBA 1 of the SSD 1, an address of the CK 2 is an LBA 2 of the SSD 2, and an address of the CK 3 is an LBA 3 of the SSD 3. The LBA 1 is mapped to a physical address provided by a fault domain 1 of the SSD 1, and herein, the address of the CK 1 is referred to as being mapped to the physical address provided by the fault domain 1 of the SSD 1. The LBA 2 is mapped to a physical address provided by a fault domain 2 of the SSD 2. The LBA 3 is mapped to a physical address provided by a fault domain 3 of the SSD 3. In this embodiment of the present disclosure, when CKs are selected from a plurality of SSDs to form a CKG, a fault domain of an SSD that provides the CK may be determined based on load. The load may be of an input/output (IO) type, IO temperature, or the like. In one implementation, the SSD sends the correspondence between a fault domain and an LBA to the controller 101. The controller 101 may determine, based on a correspondence between a fault domain of the SSD and a logical address, a fault domain corresponding to a logical address of each CK in the CKG. The controller 101 obtains status information of the SSD. For example, when the fault domain 1 of the SSD 1 is faulty, the SSD 1 sends fault information to the controller 101, to indicate that the fault domain 1 is faulty. 
Because the controller 101 can determine, based on the correspondence between a fault domain of an SSD and a logical address, the LBAs affected when the fault domain 1 of the SSD 1 is faulty, and the storage array includes a plurality of CKGs, the controller 101 learns, through searching, which CKG includes a CK whose address is an LBA mapped to the fault domain 1 of the SSD 1. For example, it is determined that the address of the CK 1 included in a CKG 1 is an LBA mapped to the fault domain 1 of the SSD 1. The controller 101 recovers the data of the CK 1 in the CKG 1 based on a redundancy algorithm such as the EC algorithm. Therefore, compared with the prior art, in this embodiment of the present disclosure, it is unnecessary to reconstruct the CKs corresponding to all logical addresses provided by the SSD 1, thereby increasing the data reconstruction speed. In a specific implementation process, the data of the CK 1 may be recovered to another fault domain of the SSD 1 or to another SSD. This is not limited in this embodiment of the present disclosure.
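A rough sketch of this recovery flow follows, using single-parity XOR as a simple stand-in for the EC algorithm; the chunk and CKG identifiers and the bookkeeping structure are hypothetical, and a real implementation would use the storage system's actual redundancy scheme:

```python
from functools import reduce

# Hypothetical bookkeeping: for each CKG, the (SSD, fault domain) that each
# chunk's address is mapped to. Single XOR parity stands in for the EC algorithm.
ckgs = {
    "ckg1": {"ck1": ("ssd1", 1), "ck2": ("ssd2", 2), "ck3": ("ssd3", 3)},
}
chunk_data = {"ck2": b"\x10\x20", "ck3": b"\x31\x41"}
# ck1 holds the XOR parity of ck2 and ck3.
chunk_data["ck1"] = bytes(a ^ b for a, b in zip(chunk_data["ck2"], chunk_data["ck3"]))

def recover_after_fault(faulty_ssd: str, faulty_domain: int) -> dict:
    """Rebuild only the chunks whose addresses map to the faulty fault domain,
    rather than every chunk on the faulty SSD."""
    rebuilt = {}
    for chunks in ckgs.values():
        for ck, location in chunks.items():
            if location == (faulty_ssd, faulty_domain):
                survivors = [chunk_data[other] for other in chunks if other != ck]
                rebuilt[ck] = bytes(
                    reduce(lambda x, y: x ^ y, column) for column in zip(*survivors)
                )
    return rebuilt
```

Only the chunk mapped to the faulty fault domain is rebuilt; chunks on the same SSD but in healthy fault domains are left untouched, which is the source of the reconstruction-speed gain described above.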
  • Further, the SSD reports the mapping relationship between an LBA and a fault domain to the controller 101. Therefore, the storage array stores a correspondence between an address of a CK included in a CKG and a fault domain; for example, a first CK belongs to a first fault domain, and a second CK belongs to a second fault domain. Further, to quickly determine, through searching, which CKG includes a CK whose address is an LBA mapped to a faulty fault domain of the SSD 1, the storage array further stores a fault domain index table based on the mapping relationship between an LBA and a fault domain. For example, the fault domain index table includes a correspondence between a fault domain and a CKG, for example, a correspondence between a fault domain identifier and a CKG identifier. Because a same CKG includes CKs from fault domains of different SSDs, in the fault domain index table, different fault domains may correspond to a same CKG. When a fault domain of an SSD is faulty, the controller 101 can quickly find, based on the fault domain index table, the CKGs affected by the fault domain, to quickly reconstruct data in the CKs that are in those CKGs and that are affected by the fault domain. In specific implementation, when creating a CKG, the controller 101 may record a corresponding entry in the fault domain index table based on the mapping relationship between an LBA and a fault domain, and the entry includes the correspondence between a fault domain and a CKG. To facilitate query and management of the fault domain index table, in one implementation, a multi-level fault domain index table may be created; for example, a first level is an SSD-to-fault-domain index table, and a second level is a fault-domain-to-CKG index table. In another implementation, as shown in FIG. 8, the fault domain index table may be partitioned based on the SSD, thereby facilitating quick query.
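  • The multi-level index described above, with a first level keyed by SSD and a second level keyed by fault domain, might be organized as follows; all identifiers and function names are illustrative:

```python
from collections import defaultdict

# Two-level fault domain index table: SSD -> fault domain -> CKG identifiers.
fault_domain_index = defaultdict(lambda: defaultdict(set))

def record_ckg(ckg_id: str, chunk_locations) -> None:
    """On CKG creation, record one index entry per chunk location."""
    for ssd_id, domain_id in chunk_locations:
        fault_domain_index[ssd_id][domain_id].add(ckg_id)

def affected_ckgs(ssd_id: str, domain_id: int) -> set:
    """On a fault-domain fault, find the affected CKGs by direct lookup
    instead of scanning every CKG in the storage array."""
    return fault_domain_index[ssd_id][domain_id]

record_ckg("ckg1", [("ssd1", 1), ("ssd2", 2), ("ssd3", 3)])
record_ckg("ckg2", [("ssd1", 1), ("ssd2", 5), ("ssd4", 0)])
```

Because a CKG draws chunks from fault domains of several SSDs, the same CKG identifier appears under multiple fault domains, matching the many-to-one correspondence described above.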
  • In this embodiment of the present disclosure, in another implementation, in an SSD supporting the NVMe interface specification, a corresponding namespace may be allocated to the SSD based on a quantity of fault domains; to be specific, one fault domain corresponds to one namespace. Therefore, logical addresses of different namespaces of an SSD can be independently addressed. For example, the SSD with an available capacity of 32 TB is still used as an example: the SSD is divided into 32 fault domains, one namespace is allocated to each fault domain, and the LBA range of each namespace is 0 to (1 TB−1). An LBA of one namespace is mapped to a physical address in the fault domain corresponding to the namespace. The SSD stores the mapping relationship between a namespace and a fault domain, and reports the mapping relationship to the controller 101. In another implementation, a mapping relationship between an LBA in a namespace and a fault domain may be reported. In this embodiment of the present disclosure, when CKs are selected from a plurality of SSDs to form a CKG, a namespace of an SSD that provides the CK may be determined based on load. The load may be an input/output (IO) type, IO temperature, or the like.
  • Accordingly, as described above, the storage array stores a fault domain index table. In another implementation, the storage array stores a namespace index table, and the namespace index table includes a correspondence between a namespace and a CKG, for example, a correspondence between a namespace identifier and a CKG identifier. Because a same CKG includes CKs from namespaces of different SSDs, in the namespace index table, different namespaces may correspond to a same CKG. When a fault domain of an SSD is faulty, the SSD reports fault information to the controller 101, and the fault information indicates the namespace in which the fault occurs; for example, the fault information includes a namespace identifier. The controller 101 can quickly find, based on the namespace index table, the CKGs affected by the fault domain, to quickly reconstruct data in the CKs that are in those CKGs and that are affected by the fault domain. In specific implementation, when creating a CKG, the controller 101 may record a corresponding entry in the namespace index table based on the mapping relationship between a namespace and a fault domain, and the entry includes the correspondence between a namespace and a CKG. To facilitate query and management of the namespace index table, in one implementation, a multi-level namespace index table may be created; for example, a first level is an SSD-to-namespace index table, and a second level is a namespace-to-CKG index table. In another implementation, as shown in FIG. 9, the namespace index table may be partitioned based on the SSD, thereby facilitating quick query.
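  • Under the assumption that one namespace is allocated per fault domain and that namespace identifiers start at 1 (NVMe namespace IDs begin at 1, while the fault domains here are numbered from 0), the namespace-based lookup can be sketched as follows; the names and identifiers are illustrative:

```python
# One namespace per fault domain: namespace IDs start at 1 (per NVMe),
# fault domains are numbered from 0. All identifiers are illustrative.
def namespace_for_fault_domain(domain_id: int) -> int:
    return domain_id + 1

# Controller-side namespace index table: namespace ID -> CKG identifiers.
namespace_index: dict = {}

def record_ckg(ckg_id: str, nsids) -> None:
    """On CKG creation, record which namespaces provided its chunks."""
    for nsid in nsids:
        namespace_index.setdefault(nsid, set()).add(ckg_id)

def on_fault_info(nsid: int) -> set:
    """Fault information from the SSD carries a namespace identifier; the
    affected CKGs are found by direct lookup instead of scanning all CKGs."""
    return namespace_index.get(nsid, set())

record_ckg("ckg1", [namespace_for_fault_domain(1), namespace_for_fault_domain(2)])
```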
  • In this embodiment of the present disclosure, when the SSD performs garbage collection, valid data is still written to different physical addresses within the same fault domain, so that the correspondence between logical addresses and fault domains is preserved.
  • In this embodiment of the present disclosure, the controller of the SSD collects wear information of each fault domain inside the SSD, and reports the wear information of the fault domain to the controller 101. When creating the CKG, the controller 101 selects, based on a wear degree of each fault domain of the SSD and a data modification frequency, a CK that is mapped to a physical address of a corresponding fault domain.
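  • One possible way to act on the reported wear information when selecting CKs is sketched below; the wear values and the hot/cold placement policy are assumptions for illustration, not a policy fixed by this embodiment:

```python
# Hypothetical per-fault-domain wear (fraction of rated erase cycles consumed),
# as reported by each SSD's controller to the storage system controller.
wear = {
    ("ssd1", 0): 0.80, ("ssd1", 1): 0.20,
    ("ssd2", 0): 0.50, ("ssd2", 1): 0.10,
}

def pick_fault_domain(ssd_id: str, hot_data: bool) -> int:
    """Assumed policy: frequently modified (hot) data goes to the least-worn
    fault domain of the SSD; rarely modified (cold) data goes to a more worn
    one, evening out wear across fault domains."""
    domains = [(d, w) for (s, d), w in wear.items() if s == ssd_id]
    chosen = (min if hot_data else max)(domains, key=lambda t: t[1])
    return chosen[0]
```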
  • This embodiment of the present disclosure may also be applied to an SSD that supports an open channel. In the SSD that supports the open channel, in one implementation, the SSD is divided into a plurality of fault domains, and the controller 101 in the storage system can directly access a physical address of the SSD. When the SSD establishes a mapping relationship between a fault domain and a physical address of the SSD, an address of a CK that constitutes the CKG in the storage system may be the physical address of the SSD, in other words, the address of the CK is a physical address provided by the fault domain of the SSD, and the address of the CK is mapped to the physical address provided by the fault domain of the SSD. In this embodiment of the present disclosure, for another operation required for implementation that is based on the SSD supporting the open channel, refer to descriptions of other embodiments of the present disclosure. Details are not described herein.
  • Various operations performed by the SSD in this embodiment of the present disclosure may be performed by the controller of the SSD.
  • Accordingly, an embodiment of the present disclosure also provides a controller applied to a storage system. The storage system includes the controller, a first solid state disk SSD, and a second SSD. The first SSD and the second SSD each include a plurality of fault domains, the storage system includes a chunk group that is formed based on an erasure code algorithm, and the chunk group includes a first chunk and a second chunk. An address of the first chunk is mapped to a physical address provided by a first fault domain of the first SSD, and an address of the second chunk is mapped to a physical address provided by a second fault domain of the second SSD. As shown in FIG. 10, the controller includes a receiving unit 1001 and a recovery unit 1002. The receiving unit 1001 is configured to receive fault information of the first SSD, where the fault information is used to indicate that the first fault domain is faulty. The recovery unit 1002 is configured to: in response to the fault information, recover, based on the erasure code algorithm, data stored at the address of the first chunk in the chunk group. Further, the storage system stores a correspondence between the address of the first chunk and the first fault domain and a correspondence between the address of the second chunk and the second fault domain, and the controller further includes a query unit, configured to query a correspondence between the first fault domain and the chunk group to determine the chunk group. For specific implementation of the controller shown in FIG. 10, refer to the previous implementations of the embodiments of the present disclosure, for example, a structure of the controller 101 shown in FIG. 2. Details are not described herein again. In another implementation, the controller provided in FIG. 10 of the embodiments of the present disclosure may be implemented by software.
  • As shown in FIG. 11, an embodiment of the present disclosure further provides an apparatus for managing an SSD, where the SSD includes a first fault domain and a second fault domain. The apparatus for managing the SSD includes: a first assignment unit 1101, configured to assign a first range of logical addresses of the SSD to the first fault domain; and a second assignment unit 1102, configured to assign a second range of logical addresses of the SSD to the second fault domain. Further, the apparatus for managing the SSD further includes a sending unit, configured to send a correspondence between the first fault domain and the first range of logical addresses and a correspondence between the second fault domain and the second range of logical addresses to a controller in a storage system, where the storage system includes the SSD. Further, the apparatus for managing the SSD further includes a recording unit, configured to separately record the correspondence between the first fault domain and the first range of logical addresses and the correspondence between the second fault domain and the second range of logical addresses. For a hardware implementation of the apparatus for managing the SSD provided in this embodiment of the present disclosure, refer to a structure of the controller of the SSD. Details are not described herein in this embodiment of the present disclosure. In another implementation, the apparatus for managing the SSD provided in this embodiment of the present disclosure may be implemented by software, or may be jointly implemented by a controller of the SSD and software.
  • An embodiment of the present disclosure provides a computer readable storage medium, and the computer readable storage medium stores a computer instruction. When the computer instruction runs on the controller 101 shown in FIG. 1 or the server shown in FIG. 4, the method in the embodiments of the present disclosure is performed.
  • An embodiment of the present disclosure provides a computer program product including a computer instruction. When the computer instruction runs on the controller 101 shown in FIG. 1 or the server shown in FIG. 4, the method in the embodiments of the present disclosure is performed.
  • Each unit of a data recovery apparatus provided in the embodiment of the present disclosure may be implemented by a processor, or may be jointly implemented by a processor and a memory, or may be implemented by software.
  • An embodiment of the present disclosure provides a computer program product including a computer instruction. When the computer instruction runs on a controller of an SSD, the method for managing the SSD in the embodiments of the present disclosure is performed.
  • The logical address in the embodiments of the present disclosure may be alternatively a key value (KV) in a KV disk, a log in a log disk, or the like.
  • In the embodiments of the present disclosure, the correspondence has a same meaning as the mapping relationship. The expression of a correspondence between an address of a chunk and a fault domain has a same meaning as a correspondence between a fault domain and an address of a chunk.
  • It should be noted that the memory described in this specification is intended to include, but is not limited to, these and any other suitable types of memory.
  • A person of ordinary skill in the art may be aware that, in combination with the examples described in the embodiments disclosed in this specification, units and algorithm steps may be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether the functions are performed by hardware or software depends on particular applications and design constraint conditions of the technical solutions. A person skilled in the art may use different methods to implement the described functions for each particular application, but it should not be considered that the implementation goes beyond the scope of the present disclosure.
  • It may be clearly understood by a person skilled in the art that, for the purpose of convenient and brief description, for a detailed working process of the foregoing system, apparatus, and unit, refer to a corresponding process in the foregoing method embodiments. Details are not described herein again.
  • In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described apparatus embodiments are merely examples. For example, the unit division is merely logical function division and may be other division in actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
  • The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected based on actual requirements to achieve the objectives of the solutions of the embodiments.
  • In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
  • When the functions are implemented in the form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present disclosure essentially, or the part contributing to the prior art, or some of the technical solutions may be implemented in the form of a software product. The computer software product is stored in a storage medium and includes several computer instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The storage medium includes various media that can store computer instructions, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (14)

What is claimed is:
1. A data recovery method in a storage system, wherein the storage system comprises a controller, a first solid state disk (SSD), and a second SSD; the first SSD and the second SSD each comprise a plurality of fault domains; the storage system comprises a chunk group that is formed based on an erasure code (EC) algorithm, and the chunk group comprises a first chunk in the first SSD and a second chunk in the second SSD; an address of the first chunk is mapped to a physical address provided by a first fault domain of the first SSD, and an address of the second chunk is mapped to a physical address provided by a second fault domain of the second SSD; and
the method comprises:
receiving, by the controller, fault information of the first SSD, wherein the fault information indicates that the first fault domain is faulty; and
in response to receiving the fault information, recovering, by the controller based on the erasure code algorithm, data stored at the address of the first chunk in the chunk group.
2. The method according to claim 1, wherein one fault domain in each of the first SSD and the second SSD is a plurality of die packages connected on one channel.
3. The method according to claim 1, wherein one fault domain in each of the first SSD and the second SSD is one or more die packages.
4. The method according to claim 1, wherein one fault domain in each of the first SSD and the second SSD is one or more dies.
5. The method according to claim 1, wherein one fault domain in each of the first SSD and the second SSD is one or more flash memory planes.
6. The method according to claim 1, wherein the storage system stores a correspondence between the first fault domain and the chunk group and a correspondence between the second fault domain and the chunk group.
7. The method according to claim 6, further comprising, in response to receiving the fault information, querying, by the controller, the correspondence between the first fault domain and the chunk group to determine the chunk group.
8. The method according to claim 1, wherein the storage system stores a correspondence between the address of the first chunk and the first fault domain and a correspondence between the address of the second chunk and the second fault domain.
9. A storage system, wherein the storage system comprises a controller, a first solid state disk (SSD), and a second SSD; the first SSD and the second SSD each comprise a plurality of fault domains, the storage system comprises a chunk group that is formed based on an erasure code (EC) algorithm, and the chunk group comprises a first chunk in the first SSD and a second chunk in the second SSD; an address of the first chunk is mapped to a physical address provided by a first fault domain of the first SSD, and an address of the second chunk is mapped to a physical address provided by a second fault domain of the second SSD; and
the controller is configured to:
receive fault information of the first SSD, wherein the fault information is used to indicate that the first fault domain is faulty; and
in response to receiving the fault information, recover, based on the erasure code algorithm, data stored at the address of the first chunk in the chunk group.
10. The storage system according to claim 9, wherein the storage system stores a correspondence between the first fault domain and the chunk group and a correspondence between the second fault domain and the chunk group; and
the controller is further configured to query the correspondence between the first fault domain and the chunk group to determine the chunk group in response to receiving the fault information.
11. The storage system according to claim 9, wherein one fault domain in each of the first SSD and the second SSD is a plurality of die packages connected on one channel.
12. The storage system according to claim 9, wherein one fault domain in each of the first SSD and the second SSD is one or more die packages.
13. The storage system according to claim 9, wherein one fault domain in each of the first SSD and the second SSD is one or more dies.
14. The storage system according to claim 9, wherein one fault domain in each of the first SSD and the second SSD is one or more flash memory planes.
US17/233,893 2018-10-25 2021-04-19 Data recovery method, system, and apparatus in storage system Abandoned US20210240584A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/883,708 US12111728B2 (en) 2018-10-25 2022-08-09 Data recovery method, system, and apparatus in storage system

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
CN201811248415.5 2018-10-25
CN201811248415 2018-10-25
CN201811560345.7 2018-12-20
CN201811560345.7A CN111104056B (en) 2018-10-25 2018-12-20 Data recovery method, system and device in storage system
PCT/CN2019/103085 WO2020082888A1 (en) 2018-10-25 2019-08-28 Method, system and apparatus for restoring data in storage system

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103085 Continuation WO2020082888A1 (en) 2018-10-25 2019-08-28 Method, system and apparatus for restoring data in storage system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/883,708 Continuation US12111728B2 (en) 2018-10-25 2022-08-09 Data recovery method, system, and apparatus in storage system

Publications (1)

Publication Number Publication Date
US20210240584A1 true US20210240584A1 (en) 2021-08-05

Family

ID=70420188

Family Applications (2)

Application Number Title Priority Date Filing Date
US17/233,893 Abandoned US20210240584A1 (en) 2018-10-25 2021-04-19 Data recovery method, system, and apparatus in storage system
US17/883,708 Active US12111728B2 (en) 2018-10-25 2022-08-09 Data recovery method, system, and apparatus in storage system

Family Applications After (1)

Application Number Title Priority Date Filing Date
US17/883,708 Active US12111728B2 (en) 2018-10-25 2022-08-09 Data recovery method, system, and apparatus in storage system

Country Status (4)

Country Link
US (2) US20210240584A1 (en)
EP (1) EP3851949A4 (en)
KR (1) KR102648688B1 (en)
CN (4) CN111552436B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767182B * 2020-06-30 2024-09-24 深圳忆联信息系统有限公司 SSD failure analysis method and apparatus, computer device, and storage medium
CN112000286B (en) * 2020-08-13 2023-02-28 北京浪潮数据技术有限公司 Four-control full-flash-memory storage system and fault processing method and device thereof
CN112130762B (en) * 2020-09-07 2024-01-26 上海威固信息技术股份有限公司 Solid state disk data storage and operation method
CN114327962A (en) * 2020-09-27 2022-04-12 中国移动通信集团浙江有限公司 Method and device for classifying and early warning of hard disk failure
CN116339636B (en) * 2023-03-30 2025-06-24 苏州浪潮智能科技有限公司 Data storage method, system, computer equipment and storage medium
US12340111B1 (en) * 2023-12-21 2025-06-24 Dell Products L.P. Processing of input/output operations within a fault domain of a data storage system
US20250328293A1 (en) * 2024-04-17 2025-10-23 Micron Technology, Inc. Endurance group for tiered storage applications
CN120335730B (en) * 2025-06-19 2025-10-14 成都华瑞数鑫科技有限公司 Method for quickly migrating data RAID level

Family Cites Families (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6282045B1 (en) * 1997-09-15 2001-08-28 Texas Instruments Incorporated Server hard disk drive integrated circuit and method of operation
CN101105970A * 2006-07-12 2008-01-16 北京赫芯斯信息技术有限公司 DDR solid state disk with ATA interface
US7934120B2 (en) * 2006-09-11 2011-04-26 International Business Machines Corporation Storing data redundantly
US8495471B2 (en) * 2009-11-30 2013-07-23 International Business Machines Corporation Solid-state storage system with parallel access of multiple flash/PCM devices
CN102708019B (en) * 2012-04-28 2014-12-03 华为技术有限公司 Method, device and system for hard disk data recovery
CN102915212B * 2012-09-19 2015-06-10 记忆科技(深圳)有限公司 RAID (redundant arrays of inexpensive disks) implementation method for solid state disks, solid state disk, and electronic device
US10642505B1 (en) * 2013-01-28 2020-05-05 Radian Memory Systems, Inc. Techniques for data migration based on per-data metrics and memory degradation
WO2015017963A1 (en) 2013-08-05 2015-02-12 Intel Corporation Storage systems with adaptive erasure code generation
US9442670B2 (en) * 2013-09-03 2016-09-13 Sandisk Technologies Llc Method and system for rebalancing data stored in flash memory devices
CN103488583B * 2013-09-09 2016-08-10 华中科技大学 High-performance and highly reliable solid-state disk implementation method
CN103559138B (en) * 2013-10-09 2016-03-30 华为技术有限公司 Solid state hard disc and space management thereof
CN103713969A (en) * 2013-12-30 2014-04-09 华为技术有限公司 Method and device for improving reliability of solid state disk
US9582363B2 (en) * 2014-06-09 2017-02-28 International Business Machines Corporation Failure domain based storage system data stripe layout
WO2016023230A1 (en) * 2014-08-15 2016-02-18 华为技术有限公司 Data migration method, controller and data migration device
US9524108B2 (en) * 2014-08-29 2016-12-20 Dell Products, Lp System and method for providing personality switching in a solid state drive device
US9542118B1 (en) * 2014-09-09 2017-01-10 Radian Memory Systems, Inc. Expositive flash memory control
US9575853B2 (en) * 2014-12-12 2017-02-21 Intel Corporation Accelerated data recovery in a storage system
US9448887B1 (en) 2015-08-22 2016-09-20 Weka.IO Ltd. Distributed erasure coded virtual file system
CN105260267B * 2015-09-28 2019-05-17 北京联想核芯科技有限公司 Data refreshing method and solid state disk
CN107085546B (en) * 2016-02-16 2020-05-01 深信服科技股份有限公司 Data management method and device based on fault domain technology
JP6448571B2 (en) * 2016-03-08 2019-01-09 東芝メモリ株式会社 Storage system, information processing system, and control method
CN107203328A (en) * 2016-03-17 2017-09-26 伊姆西公司 Memory management method and storage device
US10180875B2 (en) * 2016-07-08 2019-01-15 Toshiba Memory Corporation Pool-level solid state drive error correction
KR102573301B1 (en) * 2016-07-15 2023-08-31 삼성전자 주식회사 Memory System performing RAID recovery and Operating Method thereof
CN107885457B * 2016-09-30 2020-08-07 华为技术有限公司 Solid state drive (SSD), storage device, and data storage method
CN106844098B 2016-12-29 2020-04-03 中国科学院计算技术研究所 Fast data recovery method and system based on cross erasure coding
CN107193758A * 2017-05-19 2017-09-22 记忆科技(深圳)有限公司 Mapping table management method for a solid state disk, and solid state disk
CN107315546A * 2017-07-10 2017-11-03 郑州云海信息技术有限公司 Method and system for low-level formatting of a solid state disk
CN107273061A * 2017-07-12 2017-10-20 郑州云海信息技术有限公司 Method and system for creating multiple namespaces on a solid state disk
US10929226B1 (en) 2017-11-21 2021-02-23 Pure Storage, Inc. Providing for increased flexibility for large scale parity
KR102410671B1 (en) * 2017-11-24 2022-06-17 삼성전자주식회사 Storage device, host device controlling storage device, and operation mehtod of storage device
CN108540315B (en) * 2018-03-28 2021-12-07 新华三技术有限公司成都分公司 Distributed storage system, method and device
US10942807B2 (en) 2018-06-12 2021-03-09 Weka.IO Ltd. Storage system spanning multiple failure domains
US11074668B2 (en) 2018-06-19 2021-07-27 Weka.IO Ltd. GPU based server in a distributed file system
CN111552436B (en) * 2018-10-25 2022-02-25 华为技术有限公司 Data recovery method, system and device in storage system

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230076381A1 (en) * 2018-10-25 2023-03-09 Huawei Technologies Co., Ltd. Data recovery method, system, and apparatus in storage system
US12111728B2 (en) * 2018-10-25 2024-10-08 Huawei Technologies Co., Ltd. Data recovery method, system, and apparatus in storage system
US20230054801A1 (en) * 2021-08-20 2023-02-23 Macronix International Co., Ltd. Memory device and method for accessing memory device with namespace management
US11614876B2 (en) * 2021-08-20 2023-03-28 Macronix International Co., Ltd. Memory device and method for accessing memory device with namespace management
US12182449B2 (en) 2022-05-25 2024-12-31 Samsung Electronics Co., Ltd. Systems and methods for managing a storage system
WO2024054700A1 (en) * 2022-09-06 2024-03-14 Western Digital Technologies, Inc. Data recovery for zoned namespace devices
US12481455B2 (en) 2022-09-06 2025-11-25 SanDisk Technologies, Inc. Data recovery for zoned namespace devices
US20250208954A1 (en) * 2023-12-22 2025-06-26 Dell Products L.P. Namespace recovery for key-value store (kvs)-persisted metadata
US12475003B2 (en) * 2023-12-22 2025-11-18 Dell Products L.P. Namespace recovery for key-value store (KVS)-persisted metadata

Also Published As

Publication number Publication date
CN111104056B (en) 2021-12-31
CN111552436A (en) 2020-08-18
EP3851949A4 (en) 2021-12-01
KR102648688B1 (en) 2024-03-19
EP3851949A1 (en) 2021-07-21
US12111728B2 (en) 2024-10-08
CN111552435A (en) 2020-08-18
KR20210064359A (en) 2021-06-02
US20230076381A1 (en) 2023-03-09
CN111104056A (en) 2020-05-05
CN114496051A (en) 2022-05-13
CN111552436B (en) 2022-02-25

Similar Documents

Publication Publication Date Title
US12111728B2 (en) Data recovery method, system, and apparatus in storage system
US11797387B2 (en) RAID stripe allocation based on memory device health
AU2010234648B2 (en) Method and apparatus for storing data in a flash memory data storage device
US10037152B2 (en) Method and system of high-throughput high-capacity storage appliance with flash translation layer escalation and global optimization on raw NAND flash
US9026845B2 (en) System and method for failure protection in a storage array
WO2012016209A2 (en) Apparatus, system, and method for redundant write caching
US10365845B1 (en) Mapped raid restripe for improved drive utilization
CN110413454B (en) Data reconstruction method and device based on storage array and storage medium
US20210318826A1 (en) Data Storage Method and Apparatus in Distributed Storage System, and Computer Program Product
US20240086092A1 (en) Method for managing namespaces in a storage device and storage device employing the same
CN106873903B (en) Data storage method and device
US20190087111A1 (en) Common logical block addressing translation layer for a storage array
US20230297260A1 (en) Initial cache segmentation recommendation engine using customer-specific historical workload analysis
US8688908B1 (en) Managing utilization of physical storage that stores data portions with mixed zero and non-zero data
WO2022143741A1 (en) Storage device management method, device, and storage system
US12248709B2 (en) Storage server and operation method of storage server
CN110865945B (en) Extended address space for memory devices
US20250068560A1 (en) Method and device for configuring zones of zns ssd
WO2020082888A1 (en) Method, system and apparatus for restoring data in storage system
WO2014045329A1 (en) Storage system and storage control method
US10705905B2 (en) Software-assisted fine-grained data protection for non-volatile memory storage devices
EP4332773A1 (en) Storage server and operation method of storage server
WO2025113322A1 (en) Storage system, data access method, and storage subsystem

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION