
US12182445B2 - NVMe command completion management for host system memory - Google Patents


Info

Publication number
US12182445B2
Authority
US
United States
Prior art keywords
completion
memory
host
data chunk
indication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US17/886,369
Other versions
US20230214157A1 (en)
Inventor
Sahil Soi
Dhananjayan Athiyappan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Micron Technology Inc
Original Assignee
Micron Technology Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Micron Technology Inc filed Critical Micron Technology Inc
Assigned to MICRON TECHNOLOGY, INC. Assignment of assignors interest (see document for details). Assignors: ATHIYAPPAN, DHANANJAYAN; SOI, SAHIL
Publication of US20230214157A1
Priority to US18/951,446, published as US20250077126A1
Application granted
Publication of US12182445B2

Classifications

    • G06F 3/06: Digital input from, or digital output to, record carriers; interfaces specially adapted for storage systems
    • G06F 3/0611: Improving I/O performance in relation to response time
    • G06F 3/0619: Improving the reliability of storage systems in relation to data integrity, e.g., data losses, bit errors
    • G06F 3/0659: Command handling arrangements, e.g., command buffers, queues, command scheduling
    • G06F 3/0679: Non-volatile semiconductor memory device, e.g., flash memory, one time programmable memory [OTP]

Definitions

  • the computing system 100 can include a host system 120 that is coupled to one or more memory sub-systems 110 .
  • in some embodiments, the host system 120 is coupled to multiple memory sub-systems 110 of different types.
  • FIG. 1 illustrates one example of a host system 120 coupled to one memory sub-system 110 .
  • “coupled to” or “coupled with” generally refers to a connection between components, which can be an indirect communicative connection or direct communicative connection (e.g., without intervening components), whether wired or wireless, including connections such as electrical, optical, magnetic, etc.
  • the host system 120 can include a processor chipset and a software stack executed by the processor chipset.
  • the processor chipset can include one or more cores, one or more caches, a memory controller (e.g., NVDIMM controller), and a storage protocol controller (e.g., PCIe controller, SATA controller).
  • the host system 120 uses the memory sub-system 110 , for example, to write data to the memory sub-system 110 and read data from the memory sub-system 110 .
  • the host system 120 can be coupled to the memory sub-system 110 via a physical host interface.
  • a physical host interface include, but are not limited to, a serial advanced technology attachment (SATA) interface, a peripheral component interconnect express (PCIe) interface, universal serial bus (USB) interface, Fibre Channel, Serial Attached SCSI (SAS), a double data rate (DDR) memory bus, Small Computer System Interface (SCSI), a dual in-line memory module (DIMM) interface (e.g., DIMM socket interface that supports Double Data Rate (DDR)), etc.
  • the host system 120 can further utilize an NVM Express (NVMe) interface to access components (e.g., memory devices 130 ) when the memory sub-system 110 is coupled with the host system 120 by the physical host interface (e.g., PCIe bus).
  • the physical host interface can provide an interface for passing control, address, data, and other signals between the memory sub-system 110 and the host system 120 .
  • FIG. 1 illustrates a memory sub-system 110 as an example.
  • the host system 120 can access multiple memory sub-systems via a same communication connection, multiple separate communication connections, and/or a combination of communication connections.
  • the memory devices 130 , 140 can include any combination of the different types of non-volatile memory devices and/or volatile memory devices.
  • the volatile memory devices (e.g., memory device 140) can be, but are not limited to, random access memory (RAM), such as dynamic random access memory (DRAM) and synchronous dynamic random access memory (SDRAM).
  • non-volatile memory devices include negative-and (NAND) type flash memory and write-in-place memory, such as a three-dimensional cross-point (“3D cross-point”) memory device, which is a cross-point array of non-volatile memory cells.
  • a cross-point array of non-volatile memory can perform bit storage based on a change of bulk resistance, in conjunction with a stackable cross-gridded data access array.
  • cross-point non-volatile memory can perform a write in-place operation, where a non-volatile memory cell can be programmed without the non-volatile memory cell being previously erased.
  • NAND type flash memory includes, for example, two-dimensional NAND (2D NAND) and three-dimensional NAND (3D NAND).
  • Each of the memory devices 130 can include one or more arrays of memory cells, such as memory array 137 .
  • One type of memory cell, for example, single level cells (SLCs), can store one bit per cell.
  • Other types of memory cells, such as multi-level cells (MLCs), triple level cells (TLCs), quad-level cells (QLCs), and penta-level cells (PLCs), can store multiple bits per cell.
  • each of the memory devices 130 can include one or more arrays of memory cells such as SLCs, MLCs, TLCs, QLCs, or any combination of such.
  • a particular memory device can include an SLC portion, and an MLC portion, a TLC portion, a QLC portion, or a PLC portion of memory cells.
  • the memory cells of the memory devices 130 can be grouped as pages that can refer to a logical unit of the memory device used to store data. With some types of memory (e.g., NAND), pages can be grouped to form blocks.
  • Although non-volatile memory components such as a 3D cross-point array of non-volatile memory cells and NAND type flash memory (e.g., 2D NAND, 3D NAND) are described, the memory device 130 can be based on any other type of non-volatile memory, such as read-only memory (ROM), phase change memory (PCM), self-selecting memory, other chalcogenide based memories, ferroelectric transistor random-access memory (FeTRAM), ferroelectric random access memory (FeRAM), magneto random access memory (MRAM), Spin Transfer Torque (STT)-MRAM, conductive bridging RAM (CBRAM), resistive random access memory (RRAM), oxide based RRAM (OxRAM), negative-or (NOR) flash memory, and electrically erasable programmable read-only memory (EEPROM).
  • a memory sub-system controller 115 (or controller 115 for simplicity) can communicate with the memory devices 130 to perform operations such as reading data, writing data, or erasing data at the memory devices 130 and other such operations.
  • the memory sub-system controller 115 can include hardware such as one or more integrated circuits and/or discrete components, a buffer memory, or a combination thereof.
  • the hardware can include digital circuitry with dedicated (i.e., hard-coded) logic to perform the operations described herein.
  • the memory sub-system controller 115 can be a microcontroller, special purpose logic circuitry (e.g., a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), etc.), or other suitable processor.
  • the memory sub-system controller 115 can be a processing device, which includes one or more processors (e.g., processor 117 ), configured to execute instructions stored in a local memory 119 .
  • the local memory 119 of the memory sub-system controller 115 includes an embedded memory configured to store instructions for performing various processes, operations, logic flows, and routines that control operation of the memory sub-system 110 , including handling communications between the memory sub-system 110 and the host system 120 .
  • the local memory 119 can include memory registers storing memory pointers, fetched data, etc.
  • the local memory 119 can also include read-only memory (ROM) for storing micro-code. While the example memory sub-system 110 in FIG. 1 has been illustrated as including the memory sub-system controller 115 , in another embodiment of the present disclosure, a memory sub-system 110 does not include a memory sub-system controller 115 , and can instead rely upon external control (e.g., provided by an external host, or by a processor or controller separate from the memory sub-system).
  • the memory sub-system controller 115 can receive commands or operations from the host system 120 and can convert the commands or operations into instructions or appropriate commands to achieve the desired access to the memory devices 130 .
  • the memory sub-system controller 115 can be responsible for other operations such as wear leveling operations, garbage collection operations, error detection and error-correcting code (ECC) operations, encryption operations, caching operations, and address translations between a logical address (e.g., logical block address (LBA), namespace) and a physical address (e.g., physical block address) that are associated with the memory devices 130 .
  • the memory sub-system controller 115 can further include host interface circuitry to communicate with the host system 120 via the physical host interface. The host interface circuitry can convert the commands received from the host system into command instructions to access the memory devices 130 as well as convert responses associated with the memory devices 130 into information for the host system 120 .
  • the memory sub-system 110 can also include additional circuitry or components that are not illustrated.
  • the memory sub-system 110 can include a cache or buffer (e.g., DRAM) and address circuitry (e.g., a row decoder and a column decoder) that can receive an address from the memory sub-system controller 115 and decode the address to access the memory devices 130 .
  • the memory devices 130 include a local media controller 132 that operates in conjunction with memory sub-system controller 115 to execute operations on one or more memory cells of the memory devices 130.
  • An external controller (e.g., memory sub-system controller 115) can externally manage the memory device 130 (e.g., perform media management operations on the memory device 130).
  • a memory device 130 is a managed memory device, which is a raw memory device combined with a local controller (e.g., local controller 132 ) for media management within the same memory device package.
  • An example of a managed memory device is a managed NAND (MNAND) device.
  • the memory sub-system 110 includes input/output (IO) completion manager 113 .
  • the memory sub-system controller 115 includes at least a portion of the IO completion manager 113 .
  • the memory sub-system controller 115 can include a processor 117 (processing device) configured to execute instructions stored in local memory 119 for performing the operations described herein.
  • IO completion manager 113 performs NVMe command completion management for efficient host system memory operation.
  • IO completion manager can generate completion data to be sent back to the requestor to indicate that execution of the one or more memory access commands is complete.
  • this completion data can include a completion queue entry having a certain size (e.g., 16 bytes).
  • the host system 120 can utilize a set of queues to track the memory access commands issued to the memory sub-system 110 .
  • the host system 120 can include a submission queue 124 , storing submission queue entries representing the memory access commands issued to the memory sub-system 110 , and a completion queue 126 , storing completion queue entries received from the memory sub-system 110 to indicate that the corresponding memory access commands have been executed.
  • the host system 120 can maintain these queues in a host memory 122 , such as a dynamic random access memory (DRAM) device or other volatile memory device.
  • submission queue 124 and completion queue 126 can include circular buffers with a fixed slot size.
  • host memory 122 has an optimal write size granularity (e.g., 64 byte chunks) at which the host memory 122 can be most efficiently written. In other embodiments, there can be some other number of queues or queue pairs in host memory 122 , the write size granularity of host memory 122 can be different, and/or the size of a completion queue entry can be different. In general, however, the size of the completion queue entry is smaller than the write size granularity of host memory 122 .
  • IO completion manager 113 can take any of a number of actions when sending memory access operation completion data (e.g., completion queue entries) to host system 120 in order to optimize the process of writing the completion data to the host memory 122 .
  • IO completion manager 113 can append some amount of dummy data to the completion queue entry to form a packet that aligns with the write size granularity of the host memory 122 .
  • IO completion manager 113 can include 48 bytes of dummy data, such that a 64 byte chunk can be written to the completion queue 126 of the host memory 122 .
  • IO completion manager 113 can coalesce multiple completion queue entries together such that they can be written to the completion queue 126 of the host memory 122 as a single chunk having the optimal write size granularity. For example, if multiple 16 byte completion queue entries are available within a threshold period of time, IO completion manager 113 can coalesce up to four completion queue entries before writing them all together as a single chunk that is up to 64 bytes in size.
  • IO completion manager 113 can append dummy data to form a packet that aligns with the write size granularity. Further details with regards to the operations of IO completion manager 113 are described below.
  • FIG. 2 is a flow diagram of an example method of NVMe command completion management for host system memory using appended dummy data in accordance with some embodiments of the present disclosure.
  • the method 200 can be performed by processing logic that can include hardware (e.g., processing device, circuitry, dedicated logic, programmable logic, microcode, hardware of a device, integrated circuit, etc.), software (e.g., instructions run or executed on a processing device), or a combination thereof.
  • the method 200 is performed by IO completion manager 113 of FIG. 1 . Although shown in a particular sequence or order, unless otherwise specified, the order of the processes can be modified.
  • the processing logic identifies an indication of a completion of a memory access command directed to a memory device, such as memory device 130 .
  • a controller such as memory sub-system controller 115 can receive one or more memory access commands from a requestor, such as host system 120 .
  • the host system 120 can be connected externally to the memory sub-system 110 , such as via an NVMe interface.
  • the memory sub-system controller 115 can execute the one or more memory access commands to perform one or more corresponding memory access operations and can store the results of the memory access operations for retrieval by the host system 120 after IO completion manager 113 reports completion of the execution of the memory access operations.
  • IO completion manager 113 can generate or identify an otherwise generated indication of the completion.
  • the processing logic can determine whether a size of the indication of the completion is smaller than a host memory write size granularity.
  • the host system 120 can maintain a completion queue 126 , for example, in a volatile host memory 122 , such as a DRAM device, having an optimal write size granularity (e.g., 64 byte chunks) at which the host memory can be most efficiently written.
  • the indication of the completion, which can ultimately be stored in completion queue 126 as a completion queue entry, however, may have a different size (e.g., 16 bytes), often smaller than the write size granularity of the host memory.
  • IO completion manager 113 can compare the size of the indication to the known host memory write size granularity to determine whether the size of the indication of the completion is smaller than the host memory write size granularity.
  • the processing logic can send the indication of the completion to the host system 120 as a full completion data chunk equal to the host memory write size granularity.
  • the host system 120 can store the full completion data chunk in completion queue 126 .
  • the processing logic can append dummy data to the indication of the completion to form a full completion data chunk (i.e., a data chunk having a size equal to the host memory write size granularity).
  • the dummy data can include a random data pattern, a pseudo-random data pattern, all zeroes, all ones, etc.
  • the command completion sequence 300 includes a number of completion data chunks 302 , 304 , 306 , and 308 .
  • Completion data chunk 302 includes the indication of a completion C 1 which has a size (e.g., 16 bytes) smaller than that of completion data chunk 302 . Accordingly, IO completion manager 113 can append a number of dummy data elements DD to the indication of completion C 1 to fill the remaining portion of completion data chunk 302 . When a subsequent indication of a completion C 2 is available, IO completion manager 113 can similarly append a number of dummy data elements DD to the indication of completion C 2 to fill the remaining portion of completion data chunk 304 .
  • the processing logic can send the full completion data chunk, such as chunk 302 , comprising the indication of the completion C 1 and the dummy data DD, to the host system 120 .
  • the host system 120 can store the full completion data chunk in completion queue 126 of host memory 122 using a single host memory write operation.
  • FIG. 4 is a flow diagram of an example method of NVMe command completion management for host system memory using completion coalescing in accordance with some embodiments of the present disclosure.
  • the method 400 can be performed by processing logic that can include hardware (e.g., processing device, circuitry, dedicated logic, programmable logic, microcode, hardware of a device, integrated circuit, etc.), software (e.g., instructions run or executed on a processing device), or a combination thereof.
  • the method 400 is performed by IO completion manager 113 of FIG. 1 . Although shown in a particular sequence or order, unless otherwise specified, the order of the processes can be modified.
  • the processing logic identifies an indication of a completion of a memory access command directed to a memory device, such as memory device 130 .
  • a controller such as memory sub-system controller 115 can receive one or more memory access commands from a requestor, such as host system 120 .
  • the host system 120 can be connected externally to the memory sub-system 110 , such as via an NVMe interface.
  • the memory sub-system controller 115 can execute the one or more memory access commands to perform one or more corresponding memory access operations and can store the results of the memory access operations for retrieval by the host system 120 after IO completion manager 113 reports completion of the execution of the memory access operations.
  • IO completion manager 113 can generate or identify an otherwise generated indication of the completion.
  • the processing logic can determine whether there are other memory access commands directed to the memory device 130 that are pending.
  • IO completion manager 113 tracks all memory access commands received at memory sub-system 110 (e.g., by adding an indication of a memory access command to a command queue) and tracks which memory access commands are completed (e.g., by removing the indication of the memory access command from the command queue and generating an indication of the completion).
  • IO completion manager 113 can determine whether there are other commands that are pending (i.e., commands that have been received but have not yet been completed), as well as when those commands are likely to be completed.
  • the processing logic can send the indication of the completion to the host system 120 as a partial completion data chunk. Since the size of the indication (e.g., 16 bytes) is likely less than the host memory write size granularity (e.g., 64 bytes), a full completion data chunk is not available. Since there are no other pending memory access commands, waiting for additional indications of completions of other memory access commands is impractical, and thus, in one embodiment, the indication of the completion can be sent alone to host system 120 . In another embodiment, however, IO completion manager 113 can append dummy data to the indication of the completion to form a full completion data chunk equal to a host memory write size granularity, as illustrated by chunk 302 in FIG. 3 A , for example.
  • the processing logic can coalesce additional indications of completions of the other memory access commands that are available within a threshold period of time with the indication of the completion into a completion data chunk.
  • IO completion manager 113 can delay the sending and wait to see if any additional indications of completions of the other memory access commands become available within the threshold period of time (e.g., before the expiration of a timer set to a threshold value), such that the indications of multiple completions can be sent to the host system 120 together.
  • the processing logic can determine whether the indication of the completion of the memory access command or any of the additional indications of the completions of the other memory access commands indicate an error of a corresponding memory access operation.
  • the indication of the completion is generated upon completion of a corresponding memory access operation and will indicate whether the memory access operation was successful or whether an error occurred. If an error has not occurred, IO completion manager 113 can safely coalesce the indication of the completion, as the indication of a successful completion is not as time sensitive. If an error has occurred, however, IO completion manager 113 may not coalesce the indication and can instead send the indication of the completion to the host system 120 as a partial completion data chunk at operation 415 . A code sketch of this coalescing flow is provided after this Definitions list.
  • the processing logic determines whether a threshold period of time has expired.
  • IO completion manager 113 maintains a counter (or set of counters) which is initialized to a configurable initial value representing the threshold period of time.
  • the counter begins a countdown to zero, and thus will expire after the threshold period of time has passed.
  • the processing logic can send a completion data chunk to the host system 120 including any indications of completions having been coalesced up to that point.
  • the completion data chunk comprises a partial completion data chunk having a smaller size than a host memory write size granularity.
  • the command completion sequence 350 includes a number of completion data chunks 352 , 354 , 356 , and 358 .
  • Each of completion data chunks 352 , 354 , 356 , and 358 are equal to the host memory write size granularity (e.g., 64 bytes or some other size).
  • Completion data chunk 356 includes the indications of multiple completions C 17 -C 20 , each of which has a size (e.g., 16 bytes) smaller than that of completion data chunk 356 .
  • completions C 17 , C 18 , and C 19 can be available when the threshold period of time has expired, for example.
  • although completions C 17 , C 18 , and C 19 together are still smaller than the host memory write size granularity, in one embodiment, these completions can be sent to host system 120 together.
  • Host system 120 can write the completions C 17 , C 18 , and C 19 to completion queue 126 .
  • IO completion manager 113 can append dummy data to the indications of the completions to form a full completion data chunk equal to a host memory write size granularity, as illustrated by chunk 306 in FIG. 3 A , for example.
  • IO completion manager 113 can send the indication of completion C 20 to host system 120 immediately (i.e., without coalescing) since the indication of completion C 20 is the only remaining completion in completion data chunk 356 . If, however, completions C 21 and C 22 are available when the threshold period of time ends, completions C 21 and C 22 can be sent to host system 120 . Once the indication of completion C 23 is subsequently available, IO completion manager 113 can coalesce the indication of completion C 23 until the indication of completion C 24 is available (assuming C 24 is available within a threshold period of time of C 23 ) since completions C 23 and C 24 together will complete the completion data chunk 358 .
  • the processing logic determines whether a size of the coalesced indications has reached the host memory write size granularity.
  • IO completion manager 113 compares the size of the coalesced indications to the host memory write size granularity (or a number of coalesced indications to a threshold number). Responsive to determining that the size of the coalesced indications has not reached the host memory write size granularity, the processing logic can continue to coalesce additional indications of completions of the other memory access commands (e.g., return to operation 410 ).
  • the processing logic sends the completion data chunk to the host system 120 .
  • the completion data chunk comprises a full completion data chunk equal to the host memory write size granularity.
  • completion data chunk 352 includes indications of completions C 9 , C 10 , C 11 , and C 12 , all of which can be sent to host system 120 together.
  • the host system 120 can store the full completion data chunk as one or more completion queue entries in completion queue 126 in host memory 122 via a single host memory write operation.
  • FIG. 5 illustrates an example machine of a computer system 500 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, can be executed.
  • the computer system 500 can correspond to a host system (e.g., the host system 120 of FIG. 1 ) that includes, is coupled to, or utilizes a memory sub-system (e.g., the memory sub-system 110 of FIG. 1 ) or can be used to perform the operations of a controller (e.g., to execute an operating system to perform operations corresponding to the IO completion manager 113 of FIG. 1 ).
  • the machine can be connected (e.g., networked) to other machines in a LAN, an intranet, an extranet, and/or the Internet.
  • the machine can operate in the capacity of a server or a client machine in client-server network environment, as a peer machine in a peer-to-peer (or distributed) network environment, or as a server or a client machine in a cloud computing infrastructure or environment.
  • the machine can be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, a switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
  • the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
  • the example computer system 500 includes a processing device 502 , a main memory 504 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or Rambus DRAM (RDRAM), etc.), a static memory 506 (e.g., flash memory, static random access memory (SRAM), etc.), and a data storage system 518 , which communicate with each other via a bus 530 .
  • Processing device 502 represents one or more general-purpose processing devices such as a microprocessor, a central processing unit, or the like. More particularly, the processing device can be a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, or a processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processing device 502 can also be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. The processing device 502 is configured to execute instructions 526 for performing the operations and steps discussed herein.
  • the computer system 500 can further include a network interface device 508 to communicate over the network 520 .
  • the data storage system 518 can include a machine-readable storage medium 524 (also known as a computer-readable medium) on which is stored one or more sets of instructions 526 or software embodying any one or more of the methodologies or functions described herein.
  • the instructions 526 can also reside, completely or at least partially, within the main memory 504 and/or within the processing device 502 during execution thereof by the computer system 500 , the main memory 504 and the processing device 502 also constituting machine-readable storage media.
  • the machine-readable storage medium 524 , data storage system 518 , and/or main memory 504 can correspond to the memory sub-system 110 of FIG. 1 .
  • the instructions 526 include instructions to implement functionality corresponding to the IO completion manager 113 of FIG. 1 . While the machine-readable storage medium 524 is shown in an example embodiment to be a single medium, the term “machine-readable storage medium” should be taken to include a single medium or multiple media that store the one or more sets of instructions. The term “machine-readable storage medium” shall also be taken to include any medium that is capable of storing or encoding a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure. The term “machine-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media.
  • the present disclosure also relates to an apparatus for performing the operations herein.
  • This apparatus can be specially constructed for the intended purposes, or it can include a general purpose computer selectively activated or reconfigured by a computer program stored in the computer.
  • a computer program can be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
  • the present disclosure can be provided as a computer program product, or software, that can include a machine-readable medium having stored thereon instructions, which can be used to program a computer system (or other electronic devices) to perform a process according to the present disclosure.
  • a machine-readable medium includes any mechanism for storing information in a form readable by a machine (e.g., a computer).
  • a machine-readable (e.g., computer-readable) medium includes a machine (e.g., a computer) readable storage medium such as a read only memory (“ROM”), random access memory (“RAM”), magnetic disk storage media, optical storage media, flash memory components, etc.
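
The following C sketch, referenced from the coalescing bullets above, outlines one possible shape for the flow described in connection with FIG. 4: coalesce completions while other commands are pending, bypass coalescing for errors or when nothing else is outstanding, and flush when a threshold timer expires, optionally padding with dummy data to reach a full chunk. The sizes (64-byte granularity, 16-byte entries), names, and stubbed controller services are assumptions for illustration only, not the claimed implementation.

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>
    #include <stdio.h>

    #define HOST_WRITE_CHUNK 64
    #define CQE_SIZE         16
    #define CQES_PER_CHUNK   (HOST_WRITE_CHUNK / CQE_SIZE)

    struct cqe {
        uint8_t bytes[CQE_SIZE];
        bool    error;                  /* did the memory access operation fail? */
    };

    static struct cqe pending[CQES_PER_CHUNK];
    static unsigned   npending;

    /* Stand-ins for controller services assumed by this sketch. */
    static bool other_commands_pending(void)  { return true;  }   /* placeholder */
    static bool threshold_timer_expired(void) { return false; }   /* placeholder */
    static void host_cq_write(const void *buf, size_t len)
    {
        (void)buf;
        printf("host CQ write: %zu bytes\n", len);
    }

    /* Flush whatever has been coalesced; optionally pad with dummy data so the
     * host can still perform a single full-granularity write. */
    static void flush(bool pad_to_full_chunk)
    {
        if (npending == 0)
            return;
        if (pad_to_full_chunk && npending < CQES_PER_CHUNK) {
            uint8_t chunk[HOST_WRITE_CHUNK] = {0};      /* dummy data fills the tail */
            memcpy(chunk, pending, npending * CQE_SIZE);
            host_cq_write(chunk, sizeof(chunk));
        } else {
            host_cq_write(pending, npending * CQE_SIZE);
        }
        npending = 0;
    }

    /* Called for each new indication of a completion of a memory access command. */
    static void on_completion(const struct cqe *c)
    {
        pending[npending++] = *c;

        if (c->error || !other_commands_pending()) {
            /* Errors are time sensitive, and with nothing else pending there is
             * nothing worth waiting for: send now rather than coalescing. */
            flush(true);
            return;
        }
        if (npending == CQES_PER_CHUNK)     /* coalesced size reached the write  */
            flush(false);                   /* granularity: send a full chunk    */
    }

    /* Called periodically; flushes coalesced completions once the threshold
     * period of time expires. */
    static void on_timer_tick(void)
    {
        if (threshold_timer_expired())
            flush(true);
    }

    int main(void)
    {
        struct cqe ok = { .error = false }, bad = { .error = true };
        on_completion(&ok);                 /* coalesced, waiting for more        */
        on_completion(&bad);                /* error: flushed immediately, padded */
        for (int i = 0; i < CQES_PER_CHUNK; i++)
            on_completion(&ok);             /* four completions: one full write   */
        on_timer_tick();                    /* timer not expired in this stub     */
        return 0;
    }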


Abstract

A processing device in a memory sub-system identifies an indication of a completion of a memory access command directed to a memory device and determines whether there are other memory access commands directed to the memory device that are pending. Responsive to determining that there are other memory access commands pending, the processing device coalesces additional indications of completions of the other memory access commands that are available within a threshold period of time with the indication of the completion into a completion data chunk and sends the completion data chunk to a host system. The host system is to store the completion data chunk as one or more completion queue entries in a completion queue in a host memory of the host system via a single host memory write operation.

Description

RELATED APPLICATIONS
This application claims the benefit of India Provisional Patent Application No. 202141061856, filed Dec. 30, 2021, which is hereby incorporated by reference herein.
TECHNICAL FIELD
Embodiments of the disclosure relate generally to memory sub-systems, and more specifically, relate to NVMe command completion management for host system memory.
BACKGROUND
A memory sub-system can include one or more memory devices that store data. The memory devices can be, for example, non-volatile memory devices and volatile memory devices. In general, a host system can utilize a memory sub-system to store data at the memory devices and to retrieve data from the memory devices.
BRIEF DESCRIPTION OF THE DRAWINGS
The present disclosure will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the disclosure.
FIG. 1 illustrates an example computing system that includes a memory sub-system in accordance with some embodiments of the present disclosure.
FIG. 2 is a flow diagram of an example method of NVMe command completion management for host system memory using appended dummy data in accordance with some embodiments of the present disclosure.
FIGS. 3A and 3B are block diagrams illustrating example NVMe command completion sequences in accordance with some embodiments of the present disclosure.
FIG. 4 is a flow diagram of an example method of NVMe command completion management for host system memory using completion coalescing in accordance with some embodiments of the present disclosure.
FIG. 5 is a block diagram of an example computer system in which embodiments of the present disclosure may operate.
DETAILED DESCRIPTION
Aspects of the present disclosure are directed to NVMe command completion management for host system memory. A memory sub-system can be a storage device, a memory module, or a hybrid of a storage device and memory module. Examples of storage devices and memory modules are described below in conjunction with FIG. 1 . In general, a host system can utilize a memory sub-system that includes one or more components, such as memory devices that store data. The host system can provide data to be stored at the memory sub-system and can request data to be retrieved from the memory sub-system.
A memory sub-system can include high density non-volatile memory devices where retention of data is desired when no power is supplied to the memory device. One example of a non-volatile memory device is a NAND memory device, such as 3D flash NAND memory, which offers storage in the form of compact, high density configurations. Other examples of non-volatile memory devices are described below in conjunction with FIG. 1 . A non-volatile memory device is a package of one or more die. Each die can consist of one or more planes. For some types of non-volatile memory devices (e.g., NAND memory devices), each plane consists of a set of physical blocks. Each block consists of a set of pages. Each page consists of a set of memory cells (“cells”). A cell is an electronic circuit that stores information. Depending on the cell type, a cell can store one or more bits of binary information, and has various logic states that correlate to the number of bits being stored. The logic states can be represented by binary values, such as “0” and “1”, or combinations of such values.
A memory device can be made up of bits arranged in a two-dimensional or a three-dimensional grid. Memory cells are etched onto a silicon wafer in an array of columns (also hereinafter referred to as bitlines) and rows (also hereinafter referred to as wordlines). A wordline can refer to one or more rows of memory cells of a memory device that are used with one or more bitlines to generate the address of each of the memory cells. The intersection of a bitline and wordline constitutes the address of the memory cell. A block hereinafter refers to a unit of the memory device used to store data and can include a group of memory cells, a wordline group, a wordline, or individual memory cells. One or more blocks can be grouped together to form a plane of the memory device in order to allow concurrent operations to take place on each plane. The memory device can include circuitry that performs concurrent memory page accesses of two or more memory planes. For example, the memory device can include multiple access line driver circuits and power circuits that can be shared by the planes of the memory device to facilitate concurrent access of pages of two or more memory planes, including different page types.
Memory access commands, such as those sent by the host system, request the memory sub-system to perform memory access operations on the memory devices contained therein. Memory access commands can generally be classified into respective categories, such as read commands, write commands, erase commands, move commands, etc. A memory sub-system controller can receive the memory access commands from the host system connected externally to the memory sub-system, such as via a Non-Volatile Memory Express (NVMe) interface on a Peripheral Component Interconnect Express (PCIe) communication bus. The memory sub-system can execute the memory access commands to perform the memory access operations and can store the results of the memory access commands for retrieval by the host system after the memory sub-system reports completion of the execution of the memory access commands.
In certain implementations, the host system can utilize a set of queues to track the memory access commands issued to the memory sub-system. For example, the host system can include a submission queue, storing submission queue entries representing the memory access commands issued to the memory sub-system, and a completion queue, storing completion queue entries received from the memory sub-system to indicate that the corresponding memory access commands have been executed. Typically, the host system can maintain these queues in a volatile host memory, such as a dynamic random access memory (DRAM) device, having an optimal write size granularity (e.g., 64 byte chunks) at which the host memory can be most efficiently written. A completion queue entry, however, may have a different size (e.g., 16 bytes), often smaller than the write size granularity of the host memory. Accordingly, conventional systems often resort to performing a masked write, when supported, or a read-modify-write operation if a masked write is not possible, in order to add newly received completion queue entries to the completion queue in the host memory. With a masked write of an individual completion queue entry, only a portion (e.g., one quarter) of the host memory write chunk size is written. Thus, multiple masked write operations are performed in order to fill the entire host memory write chunk. With a read-modify-write operation, the host memory chunk is read from the host memory, modified to include the newly received completion queue entry, and written back to the host memory. Thus, both masked write and read-modify-write operations have significant time penalties compared to performing a write of an entire host memory write chunk and can negatively impact host system performance. In addition, writing to the host system memory in data sizes smaller than the host memory write chunk size can hurt cache coherency.
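To make the penalty concrete, the following C sketch contrasts the read-modify-write path with a single full-chunk write, using a simulated host memory buffer in place of PCIe memory transactions. The 64-byte granularity and 16-byte entry size come from the example above; all function names and the simulation itself are assumptions for illustration.

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    #define HOST_WRITE_CHUNK 64   /* assumed optimal host-memory write granularity */
    #define CQE_SIZE         16   /* assumed NVMe completion queue entry size      */

    /* Simulated host completion-queue memory; a real controller would issue
     * PCIe memory reads/writes to host DRAM instead of touching a local array. */
    static uint8_t host_cq[4 * HOST_WRITE_CHUNK];

    /* Read-modify-write: fetch the whole 64-byte chunk, patch in one 16-byte
     * entry, and write the chunk back -- an extra read per completion. */
    static void cq_post_rmw(unsigned chunk_idx, unsigned slot, const uint8_t cqe[CQE_SIZE])
    {
        uint8_t chunk[HOST_WRITE_CHUNK];
        memcpy(chunk, &host_cq[chunk_idx * HOST_WRITE_CHUNK], sizeof(chunk)); /* read   */
        memcpy(&chunk[slot * CQE_SIZE], cqe, CQE_SIZE);                       /* modify */
        memcpy(&host_cq[chunk_idx * HOST_WRITE_CHUNK], chunk, sizeof(chunk)); /* write  */
    }

    /* Full-chunk write: the controller already holds a complete 64-byte chunk
     * (four entries, or one entry plus dummy padding) and writes it once. */
    static void cq_post_full_chunk(unsigned chunk_idx, const uint8_t chunk[HOST_WRITE_CHUNK])
    {
        memcpy(&host_cq[chunk_idx * HOST_WRITE_CHUNK], chunk, HOST_WRITE_CHUNK);
    }

    int main(void)
    {
        uint8_t cqe[CQE_SIZE] = {0x01};
        uint8_t full[HOST_WRITE_CHUNK] = {0};
        cq_post_rmw(0, 1, cqe);          /* three memory operations for 16 bytes */
        cq_post_full_chunk(1, full);     /* one memory operation for 64 bytes    */
        printf("chunk 0, slot 1, first byte: 0x%02x\n", host_cq[1 * CQE_SIZE]);
        return 0;
    }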
Aspects of the present disclosure address the above and other deficiencies by implementing NVMe command completion management for efficient host system memory operation. In one embodiment, a completion manager component in the memory sub-system can take one of a number of actions when sending memory access operation completion data (e.g., completion queue entries) to a host system in order to optimize the process of writing the completion data to the host system memory. In one embodiment, when a completion queue entry is available and ready to send to the host system, the completion manager can append some amount of dummy data to the completion queue entry to form a packet that aligns with the write size granularity of the host memory. For example, if a 16 byte completion queue entry is available, and if the write size granularity of the host memory is 64 bytes, the completion manager can include 48 bytes of dummy data, such that a 64 byte chunk can be written to the completion queue of the host memory. In another embodiment, the completion manager can coalesce multiple completion queue entries together such that they can be written to the completion queue of the host memory as a single chunk having the optimal write size granularity. For example, if multiple 16 byte completion queue entries are available within a threshold period of time, the completion manager can coalesce up to four completion queue entries before writing them all together as a single chunk that is up to 64 bytes in size. In yet another embodiment, if less than a full chunk equal to the write size granularity is written to the host memory due to expiration of the threshold period of time, the completion manager can append dummy data to form a packet that aligns with the write size granularity.
Advantages of this approach include, but are not limited to, improved performance in the host system. Optimizing the writing of completion queue entries at the host system memory, by using either coalescing or dummy data, offers power savings, decreased latency, and performance improvements compared to masked write and read-modify-write operations, which can now be avoided. In addition, the bandwidth of the PCIe link between the memory sub-system and host system can be utilized more efficiently when transmitting completion data, as multiple completions are sent in a single PCIe transaction, rather than having a separate PCIe transaction for every completion.
FIG. 1 illustrates an example computing system 100 that includes a memory sub-system 110 in accordance with some embodiments of the present disclosure. The memory sub-system 110 can include media, such as one or more volatile memory devices (e.g., memory device 140), one or more non-volatile memory devices (e.g., memory device 130), or a combination of such.
A memory sub-system 110 can be a storage device, a memory module, or a hybrid of a storage device and memory module. Examples of a storage device include a solid-state drive (SSD), a flash drive, a universal serial bus (USB) flash drive, an embedded Multi-Media Controller (eMMC) drive, a Universal Flash Storage (UFS) drive, a secure digital (SD) card, and a hard disk drive (HDD). Examples of memory modules include a dual in-line memory module (DIMM), a small outline DIMM (SO-DIMM), and various types of non-volatile dual in-line memory module (NVDIMM).
The computing system 100 can be a computing device such as a desktop computer, laptop computer, network server, mobile device, a vehicle (e.g., airplane, drone, train, automobile, or other conveyance), Internet of Things (IoT) enabled device, embedded computer (e.g., one included in a vehicle, industrial equipment, or a networked commercial device), or any such computing device that includes memory and a processing device.
The computing system 100 can include a host system 120 that is coupled to one or more memory sub-systems 110. In some embodiments, the host system 120 is coupled to multiple memory sub-systems 110 of different types. FIG. 1 illustrates one example of a host system 120 coupled to one memory sub-system 110. As used herein, “coupled to” or “coupled with” generally refers to a connection between components, which can be an indirect communicative connection or direct communicative connection (e.g., without intervening components), whether wired or wireless, including connections such as electrical, optical, magnetic, etc.
The host system 120 can include a processor chipset and a software stack executed by the processor chipset. The processor chipset can include one or more cores, one or more caches, a memory controller (e.g., NVDIMM controller), and a storage protocol controller (e.g., PCIe controller, SATA controller). The host system 120 uses the memory sub-system 110, for example, to write data to the memory sub-system 110 and read data from the memory sub-system 110.
The host system 120 can be coupled to the memory sub-system 110 via a physical host interface. Examples of a physical host interface include, but are not limited to, a serial advanced technology attachment (SATA) interface, a peripheral component interconnect express (PCIe) interface, universal serial bus (USB) interface, Fibre Channel, Serial Attached SCSI (SAS), a double data rate (DDR) memory bus, Small Computer System Interface (SCSI), a dual in-line memory module (DIMM) interface (e.g., DIMM socket interface that supports Double Data Rate (DDR)), etc. The physical host interface can be used to transmit data between the host system 120 and the memory sub-system 110. The host system 120 can further utilize an NVM Express (NVMe) interface to access components (e.g., memory devices 130) when the memory sub-system 110 is coupled with the host system 120 by the physical host interface (e.g., PCIe bus). The physical host interface can provide an interface for passing control, address, data, and other signals between the memory sub-system 110 and the host system 120. FIG. 1 illustrates a memory sub-system 110 as an example. In general, the host system 120 can access multiple memory sub-systems via a same communication connection, multiple separate communication connections, and/or a combination of communication connections.
The memory devices 130, 140 can include any combination of the different types of non-volatile memory devices and/or volatile memory devices. The volatile memory devices (e.g., memory device 140) can be, but are not limited to, random access memory (RAM), such as dynamic random access memory (DRAM) and synchronous dynamic random access memory (SDRAM).
Some examples of non-volatile memory devices (e.g., memory device 130) include negative-and (NAND) type flash memory and write-in-place memory, such as a three-dimensional cross-point (“3D cross-point”) memory device, which is a cross-point array of non-volatile memory cells. A cross-point array of non-volatile memory can perform bit storage based on a change of bulk resistance, in conjunction with a stackable cross-gridded data access array. Additionally, in contrast to many flash-based memories, cross-point non-volatile memory can perform a write in-place operation, where a non-volatile memory cell can be programmed without the non-volatile memory cell being previously erased. NAND type flash memory includes, for example, two-dimensional NAND (2D NAND) and three-dimensional NAND (3D NAND).
Each of the memory devices 130 can include one or more arrays of memory cells, such as memory array 137. One type of memory cell, for example, a single level cell (SLC), can store one bit per cell. Other types of memory cells, such as multi-level cells (MLCs), triple level cells (TLCs), quad-level cells (QLCs), and penta-level cells (PLCs), can store multiple bits per cell. In some embodiments, each of the memory devices 130 can include one or more arrays of memory cells such as SLCs, MLCs, TLCs, QLCs, or any combination of such. In some embodiments, a particular memory device can include an SLC portion, an MLC portion, a TLC portion, a QLC portion, or a PLC portion of memory cells. The memory cells of the memory devices 130 can be grouped as pages that can refer to a logical unit of the memory device used to store data. With some types of memory (e.g., NAND), pages can be grouped to form blocks.
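The effect of bits per cell on capacity can be shown with a short calculation; the cell count per block below is a made-up figure used only to illustrate the scaling across cell types.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical cells per block; real geometries vary by device. */
    const uint64_t cells_per_block = 4ull * 1024 * 1024;
    const struct { const char *name; unsigned bits_per_cell; } modes[] = {
        { "SLC", 1 }, { "MLC", 2 }, { "TLC", 3 }, { "QLC", 4 }, { "PLC", 5 },
    };
    for (unsigned i = 0; i < sizeof(modes) / sizeof(modes[0]); i++)
        printf("%s: %llu KiB per block\n", modes[i].name,
               (unsigned long long)(cells_per_block * modes[i].bits_per_cell / 8 / 1024));
    return 0;
}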
Although non-volatile memory components such as 3D cross-point array of non-volatile memory cells and NAND type flash memory (e.g., 2D NAND, 3D NAND) are described, the memory device 130 can be based on any other type of non-volatile memory, such as read-only memory (ROM), phase change memory (PCM), self-selecting memory, other chalcogenide based memories, ferroelectric transistor random-access memory (FeTRAM), ferroelectric random access memory (FeRAM), magneto random access memory (MRAM), Spin Transfer Torque (STT)-MRAM, conductive bridging RAM (CBRAM), resistive random access memory (RRAM), oxide based RRAM (OxRAM), negative-or (NOR) flash memory, and electrically erasable programmable read-only memory (EEPROM).
A memory sub-system controller 115 (or controller 115 for simplicity) can communicate with the memory devices 130 to perform operations such as reading data, writing data, or erasing data at the memory devices 130 and other such operations. The memory sub-system controller 115 can include hardware such as one or more integrated circuits and/or discrete components, a buffer memory, or a combination thereof. The hardware can include digital circuitry with dedicated (i.e., hard-coded) logic to perform the operations described herein. The memory sub-system controller 115 can be a microcontroller, special purpose logic circuitry (e.g., a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), etc.), or other suitable processor.
The memory sub-system controller 115 can be a processing device, which includes one or more processors (e.g., processor 117), configured to execute instructions stored in a local memory 119. In the illustrated example, the local memory 119 of the memory sub-system controller 115 includes an embedded memory configured to store instructions for performing various processes, operations, logic flows, and routines that control operation of the memory sub-system 110, including handling communications between the memory sub-system 110 and the host system 120.
In some embodiments, the local memory 119 can include memory registers storing memory pointers, fetched data, etc. The local memory 119 can also include read-only memory (ROM) for storing micro-code. While the example memory sub-system 110 in FIG. 1 has been illustrated as including the memory sub-system controller 115, in another embodiment of the present disclosure, a memory sub-system 110 does not include a memory sub-system controller 115, and can instead rely upon external control (e.g., provided by an external host, or by a processor or controller separate from the memory sub-system).
In general, the memory sub-system controller 115 can receive commands or operations from the host system 120 and can convert the commands or operations into instructions or appropriate commands to achieve the desired access to the memory devices 130. The memory sub-system controller 115 can be responsible for other operations such as wear leveling operations, garbage collection operations, error detection and error-correcting code (ECC) operations, encryption operations, caching operations, and address translations between a logical address (e.g., logical block address (LBA), namespace) and a physical address (e.g., physical block address) that are associated with the memory devices 130. The memory sub-system controller 115 can further include host interface circuitry to communicate with the host system 120 via the physical host interface. The host interface circuitry can convert the commands received from the host system into command instructions to access the memory devices 130 as well as convert responses associated with the memory devices 130 into information for the host system 120.
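One of the controller responsibilities noted above, translation between logical and physical addresses, can be sketched as a simple lookup; the flat table below is an assumption for illustration, since real controllers typically use multi-level, cached mapping structures.

#include <stdint.h>
#include <stdio.h>

/* Minimal logical-to-physical translation sketch: index is a logical block
 * address (LBA), the stored value is a hypothetical physical block address. */
#define NUM_LBAS 1024u

static uint32_t l2p_table[NUM_LBAS];

int main(void)
{
    l2p_table[42] = 7001;   /* hypothetical mapping for LBA 42 */
    printf("LBA 42 -> physical block address %u\n", l2p_table[42]);
    return 0;
}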
The memory sub-system 110 can also include additional circuitry or components that are not illustrated. In some embodiments, the memory sub-system 110 can include a cache or buffer (e.g., DRAM) and address circuitry (e.g., a row decoder and a column decoder) that can receive an address from the memory sub-system controller 115 and decode the address to access the memory devices 130.
In some embodiments, the memory devices 130 include a local media controller 132 that operates in conjunction with memory sub-system controller 115 to execute operations on one or more memory cells of the memory devices 130. An external controller (e.g., memory sub-system controller 115) can externally manage the memory device 130 (e.g., perform media management operations on the memory device 130). In some embodiments, a memory device 130 is a managed memory device, which is a raw memory device combined with a local controller (e.g., local controller 132) for media management within the same memory device package. An example of a managed memory device is a managed NAND (MNAND) device.
In one embodiment, the memory sub-system 110 includes input/output (IO) completion manager 113. In some embodiments, the memory sub-system controller 115 includes at least a portion of the IO completion manager 113. For example, the memory sub-system controller 115 can include a processor 117 (processing device) configured to execute instructions stored in local memory 119 for performing the operations described herein. In one embodiment, IO completion manager 113 performs NVMe command completion management for efficient host system memory operation. For example, responsive to memory sub-system controller 115 performing one or more memory access commands (e.g., read commands, write commands, erase commands, move commands, etc.) based on memory access requests received from a requestor, such as host system 120, IO completion manager 113 can generate completion data to be sent back to the requestor to indicate that execution of the one or more memory access commands is complete. In one embodiment, this completion data can include a completion queue entry having a certain size (e.g., 16 bytes). Once generated, IO completion manager 113 can transmit the completion data back to host system 120.
In one embodiment, the host system 120 can utilize a set of queues to track the memory access commands issued to the memory sub-system 110. For example, the host system 120 can include a submission queue 124, storing submission queue entries representing the memory access commands issued to the memory sub-system 110, and a completion queue 126, storing completion queue entries received from the memory sub-system 110 to indicate that the corresponding memory access commands have been executed. In one embodiment, the host system 120 can maintain these queues in a host memory 122, such as a dynamic random access memory (DRAM) device or other volatile memory device. Submission queue 124 and completion queue 126 can include circular buffers with a fixed slot size. In one embodiment, host memory 122 has an optimal write size granularity (e.g., 64 byte chunks) at which the host memory 122 can be most efficiently written. In other embodiments, there can be some other number of queues or queue pairs in host memory 122, the write size granularity of host memory 122 can be different, and/or the size of a completion queue entry can be different. In general, however, the size of the completion queue entry is smaller than the write size granularity of host memory 122.
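A minimal sketch of the host-side completion queue as a circular buffer with fixed 16 byte slots follows; the queue depth is an assumption, and the head/tail handshaking and phase-tag handling of a real driver are omitted.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CQ_DEPTH 256u            /* assumed queue depth (fixed slot count)     */
#define CQE_SIZE 16u             /* completion queue entry size, as described  */

struct completion_queue {
    uint8_t  slots[CQ_DEPTH][CQE_SIZE];
    uint32_t head;               /* consumer index, advanced by host software  */
    uint32_t tail;               /* producer index, advanced as entries arrive */
};

static void cq_post(struct completion_queue *cq, const uint8_t entry[CQE_SIZE])
{
    memcpy(cq->slots[cq->tail], entry, CQE_SIZE);
    cq->tail = (cq->tail + 1) % CQ_DEPTH;        /* circular wrap-around */
}

int main(void)
{
    static struct completion_queue cq;           /* zero-initialized */
    uint8_t entry[CQE_SIZE] = { 0 };
    cq_post(&cq, entry);
    printf("tail after one completion: %u\n", cq.tail);
    return 0;
}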
Depending on the embodiment, IO completion manager 113 can take any of a number of actions when sending memory access operation completion data (e.g., completion queue entries) to host system 120 in order to optimize the process of writing the completion data to the host memory 122. In one embodiment, when a completion queue entry is available and ready to send to host system 120, IO completion manager 113 can append some amount of dummy data to the completion queue entry to form a packet that aligns with the write size granularity of the host memory 122. For example, if a 16 byte completion queue entry is available, and if the write size granularity of the host memory is 64 bytes, IO completion manager 113 can include 48 bytes of dummy data, such that a 64 byte chunk can be written to the completion queue 126 of the host memory 122. In another embodiment, IO completion manager 113 can coalesce multiple completion queue entries together such that they can be written to the completion queue 126 of the host memory 122 as a single chunk having the optimal write size granularity. For example, if multiple 16 byte completion queue entries are available within a threshold period of time, IO completion manager 113 can coalesce up to four completion queue entries before writing them all together as a single chunk that is up to 64 bytes in size. In yet another embodiment, if less than a full chunk equal to the write size granularity is written to the host memory 122 due to expiration of the threshold period of time, IO completion manager 113 can append dummy data to form a packet that aligns with the write size granularity. Further details with regards to the operations of IO completion manager 113 are described below.
FIG. 2 is a flow diagram of an example method of NVMe command completion management for host system memory using appended dummy data in accordance with some embodiments of the present disclosure. The method 200 can be performed by processing logic that can include hardware (e.g., processing device, circuitry, dedicated logic, programmable logic, microcode, hardware of a device, integrated circuit, etc.), software (e.g., instructions run or executed on a processing device), or a combination thereof. In some embodiments, the method 200 is performed by IO completion manager 113 of FIG. 1 . Although shown in a particular sequence or order, unless otherwise specified, the order of the processes can be modified. Thus, the illustrated embodiments should be understood only as examples, and the illustrated processes can be performed in a different order, and some processes can be performed in parallel. Additionally, one or more processes can be omitted in various embodiments. Thus, not all processes are required in every embodiment. Other process flows are possible.
At operation 205, the processing logic identifies an indication of a completion of a memory access command directed to a memory device, such as memory device 130. In one embodiment, a controller, such as memory sub-system controller 115 can receive one or more memory access commands from a requestor, such as host system 120. The host system 120 can be connected externally to the memory sub-system 110, such as via an NVMe interface. The memory sub-system controller 115 can execute the one or more memory access commands to perform one or more corresponding memory access operations and can store the results of the memory access operations for retrieval by the host system 120 after IO completion manager 113 reports completion of the execution of the memory access operations. In response to completion of the execution of each memory access command, IO completion manager 113 can generate or identify an otherwise generated indication of the completion.
At operation 210, the processing logic can determine whether a size of the indication of the completion is smaller than a host memory write size granularity. In one embodiment, the host system 120 can maintain a completion queue 126, for example, in a volatile host memory 122, such as a DRAM device, having an optimal write size granularity (e.g., 64 byte chunks) at which the host memory can be most efficiently written. The indication of the completion, which can ultimately be stored in completion queue 126 as a completion queue entry, however, may have a different size (e.g., 16 bytes), often smaller than the write size granularity of the host memory. In one embodiment, IO completion manager 113 can compare the size of the indication to the known host memory write size granularity to determine whether the size of the indication of the completion is smaller than the host memory write size granularity.
Responsive to determining that the size of the indication of the completion is not smaller than the host memory write size granularity (i.e., that the size of the indication is at least equal to the host memory write size granularity), at operation 215, the processing logic can send the indication of the completion to the host system 120 as a full completion data chunk equal to the host memory write size granularity. Upon receiving the full completion data chunk, the host system 120 can store the full completion data chunk in completion queue 126.
Responsive to determining that the size of the indication of the completion is smaller than the host memory write size granularity, however, at operation 220, the processing logic can append dummy data to the indication of the completion to form a full completion data chunk (i.e., a data chunk having a size equal to the host memory write size granularity). In one embodiment, the dummy data can include a random data pattern, a pseudo-random data pattern, all zeroes, all ones, etc. For example, as illustrated in FIG. 3A, the command completion sequence 300 includes a number of completion data chunks 302, 304, 306, and 308. Each of completion data chunks 302, 304, 306, and 308 is equal to the host memory write size granularity (e.g., 64 bytes or some other size). Completion data chunk 302 includes the indication of a completion C1, which has a size (e.g., 16 bytes) smaller than that of completion data chunk 302. Accordingly, IO completion manager 113 can append a number of dummy data elements DD to the indication of completion C1 to fill the remaining portion of completion data chunk 302. When a subsequent indication of a completion C2 is available, IO completion manager 113 can similarly append a number of dummy data elements DD to the indication of completion C2 to fill the remaining portion of completion data chunk 304.
At operation 225, the processing logic can send the full completion data chunk, such as chunk 302, comprising the indication of the completion C1 and the dummy data DD to the host system 120. Upon receiving the full completion data chunk, the host system 120 can store the full completion data chunk in completion queue 126 of host memory 122 using a single host memory write operation.
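A sketch of the padding step from operations 220 and 225, assuming a 16 byte entry, a 64 byte host write granularity, and an all-zero dummy pattern (the disclosure notes the pattern could equally be random, pseudo-random, or all ones).

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CQE_SIZE         16u
#define HOST_WRITE_CHUNK 64u
#define DUMMY_BYTE       0x00   /* pattern choice is arbitrary */

/* Append dummy data to a single completion entry so the result is a full
 * chunk the host can write with one host memory write operation. */
static void build_padded_chunk(const uint8_t cqe[CQE_SIZE],
                               uint8_t chunk[HOST_WRITE_CHUNK])
{
    memcpy(chunk, cqe, CQE_SIZE);
    memset(chunk + CQE_SIZE, DUMMY_BYTE, HOST_WRITE_CHUNK - CQE_SIZE);
}

int main(void)
{
    uint8_t cqe[CQE_SIZE] = { [0] = 0xC1 };      /* placeholder entry payload */
    uint8_t chunk[HOST_WRITE_CHUNK];
    build_padded_chunk(cqe, chunk);
    printf("chunk[0]=0x%02X, chunk[16]=0x%02X (dummy)\n", chunk[0], chunk[16]);
    return 0;
}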
FIG. 4 is a flow diagram of an example method of NVMe command completion management for host system memory using completion coalescing in accordance with some embodiments of the present disclosure. The method 400 can be performed by processing logic that can include hardware (e.g., processing device, circuitry, dedicated logic, programmable logic, microcode, hardware of a device, integrated circuit, etc.), software (e.g., instructions run or executed on a processing device), or a combination thereof. In some embodiments, the method 400 is performed by IO completion manager 113 of FIG. 1 . Although shown in a particular sequence or order, unless otherwise specified, the order of the processes can be modified. Thus, the illustrated embodiments should be understood only as examples, and the illustrated processes can be performed in a different order, and some processes can be performed in parallel. Additionally, one or more processes can be omitted in various embodiments. Thus, not all processes are required in every embodiment. Other process flows are possible.
At operation 405, the processing logic identifies an indication of a completion of a memory access command directed to a memory device, such as memory device 130. In one embodiment, a controller, such as memory sub-system controller 115 can receive one or more memory access commands from a requestor, such as host system 120. The host system 120 can be connected externally to the memory sub-system 110, such as via an NVMe interface. The memory sub-system controller 115 can execute the one or more memory access commands to perform one or more corresponding memory access operations and can store the results of the memory access operations for retrieval by the host system 120 after IO completion manager 113 reports completion of the execution of the memory access operations. In response to completion of the execution of each memory access command, IO completion manager 113 can generate or identify an otherwise generated indication of the completion.
At operation 410, the processing logic can determine whether there are other memory access commands directed to the memory device 130 that are pending. In one embodiment, IO completion manager 113 tracks all memory access commands received at memory sub-system 110 (e.g., by adding an indication of a memory access command to a command queue) and tracks which memory access commands are completed (e.g., by removing the indication of the memory access command from the command queue and generating an indication of the completion). Thus, at any point in time, IO completion manager 113 can determine whether there are other commands that are pending (i.e., commands that have been received but have not yet been completed), as well as when those commands are likely to be completed.
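A minimal sketch of this pending-command bookkeeping; counting received versus completed commands is an assumption about one simple way to track this, and a real controller would likely keep richer per-command state.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* A command is pending once received and until its completion indication
 * is generated. */
struct cmd_tracker {
    uint32_t received;
    uint32_t completed;
};

static void on_command_received(struct cmd_tracker *t)  { t->received++;  }
static void on_command_completed(struct cmd_tracker *t) { t->completed++; }

static bool other_commands_pending(const struct cmd_tracker *t)
{
    return t->received > t->completed;
}

int main(void)
{
    struct cmd_tracker t = { 0, 0 };
    on_command_received(&t);
    on_command_received(&t);
    on_command_completed(&t);
    printf("other commands pending: %s\n",
           other_commands_pending(&t) ? "yes" : "no");
    return 0;
}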
Responsive to determining that there are no other memory access commands pending, at operation 415, the processing logic can send the indication of the completion to the host system 120 as a partial completion data chunk. Since the size of the indication (e.g., 16 bytes) is likely less than the host memory write size granularity (e.g., 64 bytes), a full completion data chunk is not available. Since there are no other pending memory access commands, waiting for additional indications of completions of other memory access commands is impractical, and thus, in one embodiment, the indication of the completion can be sent alone to host system 120. In another embodiment, however, IO completion manager 113 can append dummy data to the indication of the completion to form a full completion data chunk equal to a host memory write size granularity, as illustrated by chunk 302 in FIG. 3A, for example.
Responsive to determining that there are other memory access commands pending, however, at operation 420, the processing logic can coalesce additional indications of completions of the other memory access commands that are available within a threshold period of time with the indication of the completion into a completion data chunk. In one embodiment, rather than sending the indication of the completion to host system 120 as soon as it is available, IO completion manager 113 can delay the sending and wait to see if any additional indications of completions of the other memory access commands become available within the threshold period of time (e.g., before the expiration of a timer set to a threshold value), such that the indications of multiple completions can be sent to the host system 120 together.
At operation 425, the processing logic can determine whether the indication of the completion of the memory access command or any of the additional indications of the completions of the other memory access commands indicate an error of a corresponding memory access operation. Generally, the indication of the completion is generated upon completion of a corresponding memory access operation and will indicate whether the memory access operation was successful or whether an error occurred. If an error has not occurred, IO completion manager 113 can safely coalesce the indication of the completion, as the indication of a successful completion is not as time sensitive. If an error has occurred, however, IO completion manager 113 may not coalesce the indication and can instead send the indication of the completion to the host system 120 as a partial completion data chunk at operation 415.
At operation 430, the processing logic determines whether a threshold period of time has expired. In one embodiment, IO completion manager 113 maintains a counter (or set of counters) which is initialized to a configurable initial value representing the threshold period of time. When the command completion is identified at operation 405, the counter begins a countdown to zero, and thus will expire after the threshold period of time has passed. Responsive to the threshold period of time having expired (i.e., the timer having reached zero), the processing logic can send a completion data chunk to the host system 120 including any indications of completions having been coalesced up to that point. In one embodiment, the completion data chunk comprises a partial completion data chunk having a smaller size than a host memory write size granularity. For example, as illustrated in FIG. 3B, the command completion sequence 350 includes a number of completion data chunks 352, 354, 356, and 358. Each of completion data chunks 352, 354, 356, and 358 is equal to the host memory write size granularity (e.g., 64 bytes or some other size). Completion data chunk 356 includes the indications of multiple completions C17-C20, each of which has a size (e.g., 16 bytes) smaller than that of completion data chunk 356. In one embodiment, completions C17, C18, and C19 can be available when the threshold period of time has expired, for example. Although completions C17, C18, and C19 together are still smaller than the host memory write size granularity, in one embodiment, these completions can be sent to host system 120 together. Host system 120 can write the completions C17, C18, and C19 to completion queue 126. In another embodiment, however, IO completion manager 113 can append dummy data to the indications of the completions to form a full completion data chunk equal to a host memory write size granularity, as illustrated by chunk 306 in FIG. 3A, for example.
Subsequently, once the indication of completion C20 is available in memory sub-system 110, IO completion manager 113 can send the indication of completion C20 to host system 120 immediately (i.e., without coalescing) since the indication of completion C20 is the only remaining completion in completion data chunk 356. If, however, completions C21 and C22 are available when the threshold period of time ends, completions C21 and C22 can be sent to host system 120. Once the indication of completion C23 is subsequently available, IO completion manager 113 can coalesce the indication of completion C23 until the indication of completion C24 is available (assuming C24 is available within a threshold period of time of C23), since completions C23 and C24 together will complete the completion data chunk 358.
Responsive to the threshold period of time not having expired, at operation 435, the processing logic determines whether a size of the coalesced indications has reached the host memory write size granularity. In one embodiment, IO completion manager 113 compares the size of the coalesced indications to the host memory write size granularity (or a number of coalesced indications to a threshold number). Responsive to determining that the size of the coalesced indications has not reached the host memory write size granularity, the processing logic can continue to coalesce additional indications of completions of the other memory access commands (e.g., return to operation 410).
Responsive to determining that the size of the coalesced indications has reached the host memory write size granularity, however, at operation 440, the processing logic sends the completion data chunk to the host system 120. In one embodiment, the completion data chunk comprises a full completion data chunk equal to the host memory write size granularity. For example, as illustrated in FIG. 3B, completion data chunk 352 includes indications of completions C9, C10, C11, and C12, all of which can be sent to host system 120 together. The host system 120 can store the full completion data chunk as one or more completion queue entries in completion queue 126 in host memory 122 via a single host memory write operation.
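Pulling operations 410 through 440 together, the following sketch coalesces entries into a 64 byte chunk and flushes either when the chunk fills or when a countdown standing in for the threshold period expires. The tick-based timer, the zero-fill padding on expiry, and the printf standing in for the transfer to the host completion queue are all assumptions made for illustration.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CQE_SIZE          16u
#define HOST_WRITE_CHUNK  64u
#define ENTRIES_PER_CHUNK (HOST_WRITE_CHUNK / CQE_SIZE)

struct coalescer {
    uint8_t  chunk[HOST_WRITE_CHUNK];
    uint32_t count;       /* completion entries coalesced so far */
    int32_t  countdown;   /* ticks left in the threshold period  */
};

/* Stand-in for sending the chunk to the host completion queue. */
static void flush(struct coalescer *c, bool pad_with_dummy)
{
    uint32_t used = c->count * CQE_SIZE;
    if (pad_with_dummy && used < HOST_WRITE_CHUNK)
        memset(c->chunk + used, 0, HOST_WRITE_CHUNK - used);
    printf("send %u-byte chunk holding %u completion(s)\n",
           pad_with_dummy ? HOST_WRITE_CHUNK : used, c->count);
    c->count = 0;
}

static void add_completion(struct coalescer *c, const uint8_t cqe[CQE_SIZE],
                           int32_t threshold_ticks)
{
    if (c->count == 0)
        c->countdown = threshold_ticks;          /* start the threshold timer */
    memcpy(c->chunk + c->count * CQE_SIZE, cqe, CQE_SIZE);
    if (++c->count == ENTRIES_PER_CHUNK)         /* full chunk: send at once  */
        flush(c, false);
}

static void tick(struct coalescer *c)
{
    if (c->count > 0 && --c->countdown <= 0)     /* threshold period expired   */
        flush(c, true);                          /* partial chunk, padded      */
}

int main(void)
{
    struct coalescer c = { .count = 0 };
    uint8_t cqe[CQE_SIZE] = { 0 };
    add_completion(&c, cqe, 3);                  /* first entry arms the timer   */
    add_completion(&c, cqe, 3);
    tick(&c); tick(&c); tick(&c);                /* expiry forces a padded flush */
    return 0;
}

An error status would bypass this path entirely: as noted at operation 425, an entry indicating a failed operation is sent immediately rather than coalesced.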
FIG. 5 illustrates an example machine of a computer system 500 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, can be executed. In some embodiments, the computer system 500 can correspond to a host system (e.g., the host system 120 of FIG. 1) that includes, is coupled to, or utilizes a memory sub-system (e.g., the memory sub-system 110 of FIG. 1) or can be used to perform the operations of a controller (e.g., to execute an operating system to perform operations corresponding to the IO completion manager 113 of FIG. 1). In alternative embodiments, the machine can be connected (e.g., networked) to other machines in a LAN, an intranet, an extranet, and/or the Internet. The machine can operate in the capacity of a server or a client machine in client-server network environment, as a peer machine in a peer-to-peer (or distributed) network environment, or as a server or a client machine in a cloud computing infrastructure or environment.
The machine can be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, a switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
The example computer system 500 includes a processing device 502, a main memory 504 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or Rambus DRAM (RDRAM), etc.), a static memory 506 (e.g., flash memory, static random access memory (SRAM), etc.), and a data storage system 518, which communicate with each other via a bus 530.
Processing device 502 represents one or more general-purpose processing devices such as a microprocessor, a central processing unit, or the like. More particularly, the processing device can be a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, or a processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processing device 502 can also be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. The processing device 502 is configured to execute instructions 526 for performing the operations and steps discussed herein. The computer system 500 can further include a network interface device 508 to communicate over the network 520.
The data storage system 518 can include a machine-readable storage medium 524 (also known as a computer-readable medium) on which is stored one or more sets of instructions 526 or software embodying any one or more of the methodologies or functions described herein. The instructions 526 can also reside, completely or at least partially, within the main memory 504 and/or within the processing device 502 during execution thereof by the computer system 500, the main memory 504 and the processing device 502 also constituting machine-readable storage media. The machine-readable storage medium 524, data storage system 518, and/or main memory 504 can correspond to the memory sub-system 110 of FIG. 1 .
In one embodiment, the instructions 526 include instructions to implement functionality corresponding to the IO completion manager 113 of FIG. 1. While the machine-readable storage medium 524 is shown in an example embodiment to be a single medium, the term “machine-readable storage medium” should be taken to include a single medium or multiple media that store the one or more sets of instructions. The term “machine-readable storage medium” shall also be taken to include any medium that is capable of storing or encoding a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure. The term “machine-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media.
Some portions of the preceding detailed descriptions have been presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the ways used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. The present disclosure can refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage systems.
The present disclosure also relates to an apparatus for performing the operations herein. This apparatus can be specially constructed for the intended purposes, or it can include a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program can be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems can be used with programs in accordance with the teachings herein, or it can prove convenient to construct a more specialized apparatus to perform the method. The structure for a variety of these systems will appear as set forth in the description below. In addition, the present disclosure is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages can be used to implement the teachings of the disclosure as described herein.
The present disclosure can be provided as a computer program product, or software, that can include a machine-readable medium having stored thereon instructions, which can be used to program a computer system (or other electronic devices) to perform a process according to the present disclosure. A machine-readable medium includes any mechanism for storing information in a form readable by a machine (e.g., a computer). In some embodiments, a machine-readable (e.g., computer-readable) medium includes a machine (e.g., a computer) readable storage medium such as a read only memory (“ROM”), random access memory (“RAM”), magnetic disk storage media, optical storage media, flash memory components, etc.
In the foregoing specification, embodiments of the disclosure have been described with reference to specific example embodiments thereof. It will be evident that various modifications can be made thereto without departing from the broader spirit and scope of embodiments of the disclosure as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

Claims (20)

What is claimed is:
1. A system comprising:
a memory device; and
a processing device, operatively coupled with the memory device, to perform operations comprising:
identifying an indication of a completion of a memory access command directed to the memory device;
determining whether there are other memory access commands directed to the memory device that are pending;
responsive to determining that there are other memory access commands that are pending and have not yet been completed, waiting for a threshold period of time to expire before sending the indication of the completion to a host system;
responsive to determining that the threshold period of time has expired, coalescing additional indications of completions of the other memory access commands that have since completed and are available with the indication of the completion into a completion data chunk; and
sending the completion data chunk to the host system, the host system to store the completion data chunk as one or more completion queue entries in a completion queue in a host memory of the host system via a single host memory write operation.
2. The system of claim 1, wherein the processing device is to perform operations further comprising:
responsive to determining that there are no other memory access commands pending, sending the indication of the completion to the host system as a partial completion data chunk, wherein the partial completion data chunk has a smaller size than a host memory write size granularity.
3. The system of claim 1, wherein the processing device is to perform operations further comprising:
responsive to determining that there are no other memory access commands pending, appending dummy data to the indication of the completion to form a full completion data chunk equal to a host memory write size granularity; and
sending the full completion data chunk comprising the indication of the completion and the dummy data to the host system.
4. The system of claim 1, wherein the processing device is to perform operations further comprising:
determining whether the indication of the completion of the memory access command or any of the additional indications of the completions of the other memory access commands indicate an error of a corresponding memory access operation; and
responsive to there being an indication of an error, sending the indication of the error to the host system without waiting for the threshold period of time to expire.
5. The system of claim 1, wherein coalescing the additional indications of completions of the other memory access commands that are available within the threshold period of time into the completion data chunk comprises:
determining whether the threshold period of time has expired; and
responsive to the threshold period of time having expired, sending the completion data chunk to the host system, wherein the completion data chunk comprises a partial completion data chunk having a smaller size than a host memory write size granularity.
6. The system of claim 1, wherein coalescing the additional indications of completions of the other memory access commands that are available within the threshold period of time into the completion data chunk comprises:
determining whether the threshold period of time has expired;
responsive to the threshold period of time having expired, appending dummy data to the additional indications of completions to form a full completion data chunk equal to a host memory write size granularity; and
sending the full completion data chunk comprising the indications of the completions and the dummy data to the host system.
7. The system of claim 5, wherein coalescing the additional indications of completions of the other memory access commands that are available within the threshold period of time into the completion data chunk comprises:
responsive to the threshold period of time not having expired, determining whether a size of the coalesced indications has reached the host memory write size granularity; and
responsive to determining that the size of the coalesced indications has reached the host memory write size granularity, sending the completion data chunk to the host system, wherein the completion data chunk comprises a full completion data chunk equal to the host memory write size granularity.
8. The system of claim 7, wherein coalescing the additional indications of completions of the other memory access commands that are available within the threshold period of time into the completion data chunk comprises:
responsive to determining that the size of the coalesced indications has not reached the host memory write size granularity, continuing to coalesce additional indications of completions of the other memory access commands.
9. A method comprising:
identifying an indication of a completion of a memory access command directed to a memory device;
determining whether there are other memory access commands directed to the memory device that are pending;
responsive to determining that there are other memory access commands that are pending and have not yet been completed, waiting for a threshold period of time to expire before sending the indication of the completion to a host system;
responsive to determining that the threshold period of time has expired, coalescing additional indications of completions of the other memory access commands that have since completed and are available with the indication of the completion into a completion data chunk; and
sending the completion data chunk to the host system, the host system to store the completion data chunk as one or more completion queue entries in a completion queue in a host memory of the host system via a single host memory write operation.
10. The method of claim 9, further comprising:
responsive to determining that there are no other memory access commands pending, sending the indication of the completion to the host system as a partial completion data chunk, wherein the partial completion data chunk has a smaller size than a host memory write size granularity.
11. The method of claim 9, further comprising:
responsive to determining that there are no other memory access commands pending, appending dummy data to the indication of the completion to form a full completion data chunk equal to a host memory write size granularity; and
sending the full completion data chunk comprising the indication of the completion and the dummy data to the host system.
12. The method of claim 9, further comprising:
determining whether the indication of the completion of the memory access command or any of the additional indications of the completions of the other memory access commands indicate an error of a corresponding memory access operation; and
responsive to there being an indication of an error, sending the indication of the error to the host system without waiting for the threshold period of time to expire.
13. The method of claim 9, wherein coalescing the additional indications of completions of the other memory access commands that are available within the threshold period of time into the completion data chunk comprises:
determining whether the threshold period of time has expired; and
responsive to the threshold period of time having expired, sending the completion data chunk to the host system, wherein the completion data chunk comprises a partial completion data chunk having a smaller size than a host memory write size granularity.
14. The method of claim 9, wherein coalescing the additional indications of completions of the other memory access commands that are available within the threshold period of time into the completion data chunk comprises:
determining whether the threshold period of time has expired;
responsive to the threshold period of time having expired, appending dummy data to the additional indications of completions to form a full completion data chunk equal to a host memory write size granularity; and
sending the full completion data chunk comprising the indications of the completions and the dummy data to the host system.
15. The method of claim 13, wherein coalescing the additional indications of completions of the other memory access commands that are available within the threshold period of time into the completion data chunk comprises:
responsive to the threshold period of time not having expired, determining whether a size of the coalesced indications has reached the host memory write size granularity; and
responsive to determining that the size of the coalesced indications has reached the host memory write size granularity, sending the completion data chunk to the host system, wherein the completion data chunk comprises a full completion data chunk equal to the host memory write size granularity.
16. The method of claim 15, wherein coalescing the additional indications of completions of the other memory access commands that are available within the threshold period of time into the completion data chunk comprises:
responsive to determining that the size of the coalesced indications has not reached the host memory write size granularity, continuing to coalesce additional indications of completions of the other memory access commands.
17. A system comprising:
a memory device; and
a processing device, operatively coupled with the memory device, to perform operations comprising:
identifying an indication of a completion of a memory access command directed to the memory device;
determining whether there are other memory access commands directed to the memory device that are pending;
responsive to determining that there are other memory access commands pending, determining whether a size of the indication of the completion is smaller than a host memory write size granularity;
responsive to determining that the size of the indication of the completion is smaller than the host memory write size granularity, appending dummy data to the indication of the completion to form a full completion data chunk; and
sending the full completion data chunk comprising the indication of the completion and the dummy data to a host system.
18. The system of claim 17, wherein the host system is to store the full completion data chunk as one or more completion queue entries in a completion queue in a host memory of the host system via a single host memory write operation.
19. The system of claim 18, wherein the full completion data chunk is equal to a host memory write size granularity of the host memory.
20. The system of claim 19, wherein the processing device is to perform operations further comprising:
responsive to determining that the size of the indication of the completion is not smaller than the host memory write size granularity, sending the indication of the completion to the host system as a full completion data chunk equal to the host memory write size granularity.
US17/886,369 2021-12-30 2022-08-11 NVMe command completion management for host system memory Active 2042-11-11 US12182445B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/951,446 US20250077126A1 (en) 2021-12-30 2024-11-18 NVMe COMMAND COMPLETION MANAGEMENT FOR HOST SYSTEM MEMORY

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202141061856 2021-12-30
IN202141061856 2021-12-30

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/951,446 Continuation US20250077126A1 (en) 2021-12-30 2024-11-18 NVMe COMMAND COMPLETION MANAGEMENT FOR HOST SYSTEM MEMORY

Publications (2)

Publication Number Publication Date
US20230214157A1 US20230214157A1 (en) 2023-07-06
US12182445B2 true US12182445B2 (en) 2024-12-31

Family

ID=86991581

Family Applications (2)

Application Number Title Priority Date Filing Date
US17/886,369 Active 2042-11-11 US12182445B2 (en) 2021-12-30 2022-08-11 NVMe command completion management for host system memory
US18/951,446 Pending US20250077126A1 (en) 2021-12-30 2024-11-18 NVMe COMMAND COMPLETION MANAGEMENT FOR HOST SYSTEM MEMORY

Family Applications After (1)

Application Number Title Priority Date Filing Date
US18/951,446 Pending US20250077126A1 (en) 2021-12-30 2024-11-18 NVMe COMMAND COMPLETION MANAGEMENT FOR HOST SYSTEM MEMORY

Country Status (1)

Country Link
US (2) US12182445B2 (en)

Also Published As

Publication number Publication date
US20230214157A1 (en) 2023-07-06
US20250077126A1 (en) 2025-03-06

Similar Documents

Publication Title
US11699491B2 (en) Double interleaved programming of a memory device in a memory sub-system
US12461680B2 (en) Operation based on consolidated memory region description data
US11698864B2 (en) Memory access collision management on a shared wordline
US11709605B2 (en) Storing zones in a zone namespace on separate planes of a multi-plane memory device
US11687285B2 (en) Converting a multi-plane write operation into multiple single plane write operations performed in parallel on a multi-plane memory device
US11720490B2 (en) Managing host input/output in a memory system executing a table flush
US12360901B2 (en) Memory performance during program suspend protocol
US12147712B2 (en) Memory performance using memory access command queues in memory devices
US20240143232A1 (en) Reduce read command latency in partition command scheduling at a memory device
US11941290B2 (en) Managing distribution of page addresses and partition numbers in a memory sub-system
US12430258B2 (en) Padding cached data with valid data for memory flush commands
US20220137856A1 (en) Program operation execution during program operation suspend
US11726716B2 (en) Internal commands for access operations
US11756612B2 (en) All levels dynamic start voltage programming of a memory device in a memory sub-system
US12182445B2 (en) NVMe command completion management for host system memory
US11693597B2 (en) Managing package switching based on switching parameters
US12061806B2 (en) Second read initialization on latch-limited memory device
US11669456B2 (en) Cache release command for cache reads in a memory sub-system
US11636904B2 (en) Almost ready memory management
US20260010293A1 (en) Optimized out-of-order data fetching in a memory sub-system

Legal Events

Code Title Description
AS Assignment

Owner name: MICRON TECHNOLOGY, INC., IDAHO

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SOI, SAHIL;ATHIYAPPAN, DHANANJAYAN;SIGNING DATES FROM 20220729 TO 20220810;REEL/FRAME:060789/0575

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE