US20180113810A1 - Method and system for efficient hashing optimized for hardware accelerated caching - Google Patents
- Publication number: US20180113810A1
- Authority: US (United States)
- Prior art keywords
- hash
- slot
- slots
- controller
- bucket
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- All classifications fall under G—PHYSICS, G06—COMPUTING OR CALCULATING; COUNTING, G06F—ELECTRIC DIGITAL DATA PROCESSING:
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
- G06F3/065—Replication mechanisms
- G06F3/0659—Command handling arrangements, e.g. command buffers, queues, command scheduling
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
- G06F12/0864—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches, using pseudo-associative means, e.g. set-associative or hashing
- G06F12/0871—Allocation or management of cache space
- G06F12/0875—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches, with dedicated cache, e.g. instruction or stack
- G06F12/0895—Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
- G06F2212/1016—Performance improvement
- G06F2212/302—In image processor or graphics adapter
- G06F2212/452—Instruction code
- G06F2212/604—Details relating to cache allocation
- G06F2212/621—Coherency control relating to peripheral accessing, e.g. from DMA or I/O device
Definitions
- the present disclosure is generally directed toward computer memory.
- Hashing is one of the basic frameworks utilized in a caching environment.
- the advantage of hashing for cache lookup is that in the best case the lookup will be O(1) when there are no collisions. But when there are collisions, the average or worst case lookup can be very time consuming because of the memory accesses involved.
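This cost asymmetry can be illustrated with a minimal chained hash table (an illustrative Python sketch, not from the disclosure): a lookup touches a single entry in the collision-free best case, but walks the entire chain when many keys collide into one bucket.

```python
# Minimal chained hash table, illustrating lookup cost under collisions.
class ChainedHash:
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def insert(self, key, value):
        self.buckets[hash(key) % len(self.buckets)].append((key, value))

    def lookup(self, key):
        """Return (value, probes); probes == 1 in the collision-free best case."""
        probes = 0
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            probes += 1
            if k == key:
                return v, probes
        return None, probes

h = ChainedHash(num_buckets=1)   # pathological sizing: every key collides
for i in range(16):
    h.insert(i, i * 10)
value, probes = h.lookup(15)     # worst case: walks all 16 chain entries
```

With an adequately sized table the same lookup would take one probe; the memory accesses spent walking the chain are exactly what the disclosure aims to minimize.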
- LBA: Logical Block Address
- the load store operations of memory locations determine the performance that can be achieved by the solution.
- An effective hash structure organization would be useful so that the load store operations are minimized.
- the division operation is very expensive, and it is not preferred to add division logic to the hardware if such logic can be avoided.
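A common way to avoid the division is to make the table size a power of two, so the modulo needed to select a bucket reduces to a bitwise AND — trivial to implement in hardware. This is an assumed technique for illustration; the disclosure does not spell out this exact reduction:

```python
NUM_BUCKETS = 1 << 10          # power-of-two table size (1024 buckets)
BUCKET_MASK = NUM_BUCKETS - 1  # 0x3FF

def bucket_index(lba: int) -> int:
    # Equivalent to lba % NUM_BUCKETS, but needs no divider circuit:
    # masking keeps only the low 10 bits of the address.
    return lba & BUCKET_MASK
```

For any power-of-two table size, `x & (size - 1)` and `x % size` agree for non-negative `x`, which is why hardware hash tables are routinely sized this way.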
- FIG. 1 is a block diagram depicting a computing system in accordance with at least some embodiments of the present disclosure
- FIG. 2 is a block diagram depicting details of an illustrative controller in accordance with at least some embodiments of the present disclosure
- FIG. 3A is a block diagram depicting additional details of hash slot buckets and the slots contained therein in accordance with at least some embodiments of the present disclosure
- FIG. 3B is a block diagram depicting additional details of a hash extension in accordance with at least some embodiments of the present disclosure
- FIG. 4 is a flow diagram depicting a method of adding to a local hash value in accordance with at least some embodiments of the present disclosure
- FIG. 5A is a first portion of a flow diagram depicting a hash searching method in accordance with at least some embodiments of the present disclosure
- FIG. 5B is a second portion of the flow diagram from FIG. 5A;
- FIG. 6 is a flow diagram depicting a method of approximating a number of rows needed to accommodate an Input/Output (I/O) operation in accordance with at least some embodiments of the present disclosure.
- embodiments of the present disclosure present a hashing method for efficient cache lookup that minimizes memory accesses.
- a new hash with a chaining scheme is introduced with a hash table (e.g., a deep DRAM hash table) that is used for the cache lookup.
- multi-level hash functions are used—one level to get to an appropriate hash slot bucket and another level to get the appropriate hash slot within the hash slot bucket in case all slots are occupied.
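The two levels can be sketched as follows (the specific mixing functions here are hypothetical — the disclosure only states that one hash selects the bucket and a second selects the slot within it):

```python
NUM_BUCKETS = 1 << 10   # assumed table geometry for illustration
SLOTS_PER_BUCKET = 4

def level1_bucket(key: int) -> int:
    # First-level hash: selects the hash slot bucket.
    return key % NUM_BUCKETS

def level2_slot(key: int) -> int:
    # Second-level hash: selects a slot within the bucket when all
    # slots are occupied; here it mixes the bits the first level ignored.
    return (key // NUM_BUCKETS) % SLOTS_PER_BUCKET
```

Because the second level uses bits the first level discards, two keys that collide into the same bucket still tend to prefer different slots within it.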
- a unique hash structure is also provided that can aid in a set associative hashing scheme.
- the proposed hash structure also enables a method to add elements to a hash, while ensuring that all slots in a hash slot bucket are equally loaded.
- Another aspect of the present disclosure is to provide a method for approximating the number of slots to load for an R5/R6 Input/Output (I/O) operation.
- embodiments of the present disclosure will be described. While many of the examples depicted and described herein will relate to a RAID architecture, it should be appreciated that embodiments of the present disclosure are not so limited. Indeed, aspects of the present disclosure can be used in any type of computing system and/or memory environment. In particular, embodiments of the present disclosure can be used in any type of caching scheme (whether employed by a RAID controller or some other type of device used in a communication system). For instance, hard drives and hard drive controllers (e.g., SCSI controllers, SAS controllers, or RAID controllers) may be configured to implement embodiments of the present disclosure. As another example, network cards or the like having cache memory may also be configured to implement embodiments of the present disclosure.
- the computing system 100 is shown to include a host system 104 , a controller 108 (e.g., a SCSI controller, a SAS controller, a RAID controller, etc.), and a storage array 112 having a plurality of storage devices 136 a -N therein.
- the system 100 may utilize any type of data storage architecture.
- the particular architecture depicted and described herein (e.g., a RAID architecture) should not be construed as limiting.
- in a RAID-0 (also referred to as a RAID level 0) scheme, data blocks are stored in order across one or more of the storage devices 136 a -N without redundancy. This effectively means that none of the data blocks are copies of another data block and there is no parity block to recover from a failure of a storage device 136 .
- a RAID-1 (also referred to as a RAID level 1) scheme uses one or more of the storage devices 136 a -N to store a data block and an equal number of additional mirror devices for storing copies of a stored data block.
- Higher level RAID schemes can further segment the data into bits, bytes, or blocks for storage across multiple storage devices 136 a -N.
- One or more of the storage devices 136 a -N may also be used to store error correction or parity information.
- a single unit of storage can be spread across multiple devices 136 a -N and such a unit of storage may be referred to as a stripe.
- a stripe may include the related data written to multiple devices 136 a -N as well as the parity information written to a parity storage device 136 a -N.
- in a RAID-5 (also referred to as a RAID level 5) scheme, the data being stored is segmented into blocks for storage across multiple devices 136 a -N, with a single parity block for each stripe distributed in a particular configuration across the multiple devices 136 a -N.
- This scheme can be compared to a RAID-6 (also referred to as a RAID level 6) scheme in which dual parity blocks are determined for a stripe and are distributed across each of the multiple devices 136 a -N in the array 112 .
- One of the functions of the controller 108 is to make the multiple storage devices 136 a -N in the array 112 appear to the host system 104 as a single high capacity disk drive.
- the controller 108 may be configured to automatically distribute data supplied from the host system 104 across the multiple storage devices 136 a -N (potentially with parity information) without ever exposing the manner in which the data is actually distributed to the host system 104 .
- the host system 104 is shown to include a processor 116 , an interface 120 , and memory 124 . It should be appreciated that the host system 104 may include additional components without departing from the scope of the present disclosure.
- the host system 104 in some embodiments, corresponds to a user computer, laptop, workstation, server, collection of servers, or the like. Thus, the host system 104 may or may not be designed to receive input directly from a human user.
- the processor 116 of the host system 104 may include a microprocessor, central processing unit (CPU), collection of microprocessors, or the like.
- the memory 124 may be designed to store instructions that enable functionality of the host system 104 when executed by the processor 116 .
- the memory 124 may also store data that is eventually written by the host system 104 to the storage array 112 . Further still, the memory 124 may be used to store data that is retrieved from the storage array 112 .
- Illustrative memory 124 devices may include, without limitation, volatile or non-volatile computer memory (e.g., flash memory, RAM, DRAM, ROM, EEPROM, etc.).
- the interface 120 of the host system 104 enables the host system 104 to communicate with the controller 108 via a host interface 128 of the controller 108 .
- the interface 120 and host interface(s) 128 may be of a same or similar type (e.g., utilize a common protocol, a common communication medium, etc.) such that commands issued by the host system 104 are receivable at the controller 108 and data retrieved by the controller 108 is transmittable back to the host system 104 .
- the interfaces 120 , 128 may correspond to parallel or serial computer interfaces that utilize wired or wireless communication channels.
- the interfaces 120 , 128 may include hardware that enables such wired or wireless communications.
- the communication protocol used between the host system 104 and the controller 108 may correspond to any type of known host/memory control protocol.
- Non-limiting examples of protocols that may be used between interfaces 120 , 128 include SAS, SATA, SCSI, FibreChannel (FC), iSCSI, ATA over Ethernet, InfiniBand, or the like.
- the controller 108 may provide the ability to represent the entire storage array 112 to the host system 104 as a single high volume data storage device. Any known mechanism can be used to accomplish this task.
- the controller 108 may help to manage the storage devices 136 a -N (which can be hard disk drives, solid-state drives, or combinations thereof) so as to operate as a logical unit.
- the controller 108 may be physically incorporated into the host device 104 as a Peripheral Component Interconnect (PCI) expansion (e.g., PCI Express (PCIe)) card or the like. In such situations, the controller 108 may be referred to as a RAID adapter.
- the storage devices 136 a -N in the storage array 112 may be of similar types or may be of different types without departing from the scope of the present disclosure.
- the storage devices 136 a -N may be co-located with one another or may be physically located in different geographical locations.
- the nature of the storage interface 132 may depend upon the types of storage devices 136 a -N used in the storage array 112 and the desired capabilities of the array 112 .
- the storage interface 132 may correspond to a virtual interface or an actual interface. As with the other interfaces described herein, the storage interface 132 may include serial or parallel interface technologies. Examples of the storage interface 132 include, without limitation, SAS, SATA, SCSI, FC, iSCSI, ATA over Ethernet, InfiniBand, or the like.
- the controller 108 is shown to have communication capabilities with a controller cache 140 . While depicted as being separate from the controller 108 , it should be appreciated that the controller cache 140 may be integral to the controller 108 , meaning that components of the controller 108 and the controller cache 140 may be contained within a single physical housing or computing unit (e.g., server blade).
- the controller cache 140 is provided to enable the controller 108 to perform caching operations.
- the controller 108 may employ caching operations during execution of I/O commands received from the host system 104 . Depending upon the nature of the I/O command and the amount of information being processed during the command, the controller 108 may require a large number of cache memory modules 148 or a smaller number of cache memory modules 148 .
- the memory modules 148 may correspond to flash memory, RAM, DDR memory, or some other type of computer memory that is quickly accessible and can be rewritten multiple times.
- the number of separate memory modules 148 in the controller cache 140 is typically larger than one, although a controller cache 140 may be configured to operate with a single memory module 148 if desired.
- the cache interface 144 may correspond to any interconnect that enables the controller 108 to access the memory modules 148 , temporarily store data thereon, and/or retrieve data stored thereon in connection with performing an I/O command or some other executable command.
- the controller cache 140 may be integrated with the controller 108 and may be executed on a CPU chip or placed on a separate chip within the controller 108 .
- the interface 144 may correspond to a separate bus interconnect within the CPU or traces connecting a chip of the controller cache 140 with a chip executing the processor of the controller 108 .
- the controller cache 140 may be external to the controller 108 in which case the interface 144 may correspond to a serial or parallel data port.
- Direct mapping is one type of cache configuration in which each block is mapped to exactly one cache location.
- this is like rows in a table with three columns: the data block or cache line that contains the actual data fetched and stored, a tag that contains all or part of the address of the fetched data, and a flag bit that connotes the presence of a valid bit of data in the row entry.
- Fully associative mapping is similar to direct mapping in structure, but allows a block to be mapped to any cache location rather than to a pre-specified cache location (as is the case with direct mapping).
- Set associative mapping can be viewed as a compromise between direct mapping and fully associative mapping in which each block is mapped to a subset of cache locations. It is sometimes called N-way set associative mapping, which provides for a location in main memory to be cached to any of “N” locations in the L1 cache.
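An N-way set-associative lookup can be sketched as follows (illustrative Python; the set count, way count, and the naive FIFO eviction are assumptions for the example, not from the disclosure): the set index is derived from the block address, and only the N ways of that one set are compared against the tag.

```python
NUM_SETS = 256
WAYS = 4   # "N" in N-way set associative

# cache[set_index] holds up to WAYS (tag, data) entries.
cache = [[] for _ in range(NUM_SETS)]

def cache_lookup(block_addr: int):
    set_index = block_addr % NUM_SETS
    tag = block_addr // NUM_SETS
    for way_tag, data in cache[set_index]:   # at most WAYS comparisons
        if way_tag == tag:
            return data                      # hit
    return None                              # miss

def cache_fill(block_addr: int, data):
    set_index = block_addr % NUM_SETS
    tag = block_addr // NUM_SETS
    ways = cache[set_index]
    if len(ways) >= WAYS:
        ways.pop(0)                          # naive FIFO eviction (assumed policy)
    ways.append((tag, data))

cache_fill(0x1234, "blockA")
```

Direct mapping is the degenerate case `WAYS = 1`; fully associative mapping is `NUM_SETS = 1` — set associativity sits between the two, bounding the search to N comparisons.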
- the controller 108 and controller cache 140 may be enabled to implement an N-way set associative mapping in a more efficient manner than previous caching systems.
- the controller 108 and controller cache 140 may be configured to use hashing techniques that are N-way associative. This will effectively control the number of hash collisions, which can help increase processing speed even though different inputs may result in the same hash output.
- the controller 108 is shown to include the host interface(s) 128 and storage interface(s) 132 .
- the controller 108 is also shown to include a processor 204 , memory 208 , one or more drivers 212 , and a power source 216 .
- the processor 204 may include an Integrated Circuit (IC) chip or multiple IC chips, a CPU, a microprocessor, or the like.
- the processor 204 may be configured to execute instructions in memory 208 that are shown to include a cache management module 224 , hashing instructions 228 , hash search instructions 232 , and a slot collision list 236 .
- the processor 204 may utilize a set associative hash 220 containing a plurality of hash buckets 240 a -M.
- the set associative hash 220 is shown to be maintained internally to the controller 108 .
- set associative hash 220 may be stored and/or maintained external to the controller 108 .
- the set associative hash 220 may be stored or contained within memory 208 of the controller 108 .
- the memory 208 may be volatile and/or non-volatile in nature. As indicated above, the memory 208 may include any hardware component or collection of hardware components that are capable of storing instructions and communicating those instructions to the processor 204 for execution. Non-limiting examples of memory 208 include RAM, ROM, flash memory, EEPROM, variants thereof, combinations thereof, and the like.
- the instructions stored in memory 208 are shown to be different instruction sets, but it should be appreciated that the instructions can be combined into a smaller number of instruction sets without departing from the scope of the present disclosure.
- the cache management module 224 , when executed, enables the processor 204 to receive I/O commands from the host system 104 , determine one or more caching operations that need to be performed in connection with executing the I/O command, and identify which memory module(s) 148 should be used in connection with performing the one or more caching operations.
- the cache management module 224 may also be configured to invoke the other instruction sets in memory 208 to further perform such caching operations.
- the cache management module 224 may also be configured to manage the set associative hash 220 and the values assigned to various data fields within the set associative hash 220 .
- the cache management module 224 may be enabled to update the set associative hash 220 to reflect such utilizations.
- the cache management module 224 may also be responsible for facilitating communications between the controller 108 and the controller cache 140 .
- the hashing instructions 228 , when executed, enable the processor 204 to calculate hash output values based on hash inputs.
- hash inputs used by the hashing instructions 228 to calculate hash output values may include numbers assigned to particular hash buckets 240 a -M, numbers assigned to hash slots within the hash buckets 240 a -M, memory addresses of memory modules 148 , LBAs, storage time, data retrieval time, strip/row numbers, a hash index, a maximum number of bits needed to complete the I/O command, and other numbers derived therefrom.
- the hashing instructions 228 may implement a hashing function that corresponds to a linear hashing function.
- Use of a linear hash function can help ensure that hash slots for successive strips/rows of data are stored consecutively (thereby increasing the efficiency with which such data can be later retrieved by the controller 108 ).
- Use of a linear hash function is advantageous especially in hardware accelerated caching solutions where the hardware thread does not have direct access to the memory module 148 (e.g., SRAM or DRAM) where the hash slots are stored.
- These hash slots can be loaded when required and stored back when the objective is achieved (e.g., the I/O command is completed). Since an I/O command almost always results in consecutive strips/rows, many slots can be loaded and stored together, thus ensuring that there are not too many unnecessary load and store operations.
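The benefit of the linear hash can be sketched as follows (illustrative; the table size and the folding of a virtual-disk ID into the upper bits are assumptions, not the disclosure's exact function): consecutive strip/row numbers land in consecutive buckets, so one contiguous load covers every slot a multi-row I/O touches.

```python
NUM_BUCKETS = 1 << 16   # assumed power-of-two table size

def linear_bucket(vd_id: int, row: int) -> int:
    """Linear hash: consecutive rows of a virtual disk map to consecutive
    buckets, so a multi-row I/O loads one contiguous bucket range.
    The virtual-disk ID is folded into the upper bits (hypothetical placement)."""
    return ((vd_id << 20) + row) & (NUM_BUCKETS - 1)

first = linear_bucket(vd_id=2, row=100)
buckets = [linear_bucket(2, 100 + i) for i in range(8)]   # an 8-row I/O
```

Because the eight buckets are contiguous, a hardware thread without direct DRAM access can fetch them with a single burst load and write them back together, rather than issuing eight separate load/store pairs.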
- the transpose operation of the hash function executed by the hashing instructions 228 ensures that the same region in each virtual disk does not map to the same slot in a hash bucket 240 a -M; otherwise that slot would have a tendency to be added to the slot collision list 236 .
- the hash search instructions 232 , when executed, enable the processor 204 to search the set associative hash 220 for hash collisions. As will be described in further detail herein, as the controller 108 executes I/O commands, the controller 108 may utilize the hash search instructions 232 to see if a particular piece or set of data is already stored in the memory module 148 .
- the hash search instructions 232 may be enabled to utilize the hashing instructions 228 to generate a hash for the data being searched and may compare that hash to hashes stored in the set associative hash 220 . If collisions (e.g., matches) are found, the hash search instructions 232 and/or cache management module 224 may be enabled to update the slot collision list 236 as will be described in further detail herein.
- the set associative hash 220 is shown to include a plurality of hash buckets 240 a -M linearly arranged.
- the set associative hash 220 may include up to M hash buckets, where M is an integer number greater than or equal to one.
- Each hash slot bucket 240 is shown to include a plurality of hash slots 304 a - d . Although the hash slot buckets 240 are shown to include four hash slots 304 a - d , it should be appreciated that a hash slot bucket 240 may be configured to include a greater or lesser number of hash slots. For instance, the hash slot bucket 240 may have two, three, four, five, . . . , ten, twenty, or more hash slots without departing from the scope of the present disclosure.
- Each hash slot 304 is shown to include a number of fields for storing information in connection with a hash generated for the slot 304 .
- the fields include, without limitation, a flag field 308 , a Cache Segment ID (CSID) field 312 , an LSB field 316 , and an MSB field 320 .
- the flag field 308 may contain a number of bits that indicate various details or pieces of information related to the hash slot 304 .
- the flags 308 may be used to indicate whether data stored in the hash slot 304 continues in another hash slot (e.g., by an appropriate indication in the bit link field 324 ).
- the flags 308 may indicate the presence of a bad block in one of the LBA ranges that map into the corresponding slot 304 .
- the CSID field 312 provides an identification of a cache segment.
- the identification may be provided as an alphanumerical string.
- adjacent slots may be assigned consecutively-numbered CSIDs.
- CSIDs may be assigned randomly or pseudo-randomly.
- the CSID provides a substantially unique (e.g., unique to the data structure 220 ) identifier for a slot 304 . This means that any particular slot 304 will have a different CSID from all other slots 304 in the set associative hash 220 .
- the combination of the LSB field 316 and MSB field 320 may provide an address range in memory where a hash slot is stored.
- the identifiers stored in the fields 316 , 320 may identify an LBA of a memory module 148 or an LBA range of the memory module 148 .
- the fields 316 , 320 may be referred to as a tag field.
- a 1 byte value in the tag field may indicate that the cache segment frame is used as a hash slot extension, which is shown in FIG. 3B .
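The slot layout described above (fields 308–320) can be modeled as a small record. Field widths and the INVALID sentinel value are assumptions for illustration; the disclosure names the fields but not their exact encodings:

```python
from dataclasses import dataclass

@dataclass
class HashSlot:
    flags: int = 0       # flag field 308: chain/extension and bad-block indications
    csid: int = 0xFFFF   # CSID field 312: cache segment ID (0xFFFF = INVALID, assumed sentinel)
    lsb: int = 0         # LSB field 316: low part of the tag / address range
    msb: int = 0         # MSB field 320: high part of the tag / address range

    def is_empty(self) -> bool:
        # A slot with an INVALID CSID holds no entry.
        return self.csid == 0xFFFF

slot = HashSlot()
```

Keeping the tag split into LSB/MSB halves mirrors the description above, where the two fields together address the range in memory associated with the slot.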
- each hash slot bucket 240 is provisioned with four potential collision lists.
- This set associative hashing ensures that the length of the collision list is reduced. For instance, assume there are 4 elements that need to be added to the same hash slot bucket 240 ; then all 4 are added into the slots 304 a - d and there would not be any addition to the collision list 236 .
- when a fifth element must be added, an extension (as shown in FIG. 3B ) that can hold up to 16 slots is created, and the last index (e.g., the fourth index) in the hash slot 304 d is updated with the extension ID. The information previously in the last index is copied into the first index slot in the extension, thus establishing the chain. Now up to 16 more elements can be added to this extension frame without adding a further collision list 236 .
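The promotion of the fourth slot into a 16-slot extension frame can be sketched as follows (illustrative Python; the in-memory representation and the capacity check are assumptions, while the 4-slot bucket, 16-slot extension, and copy-then-link steps follow the description above):

```python
SLOTS_PER_BUCKET = 4
EXTENSION_SLOTS = 16

def add_element(bucket, element, extensions):
    """bucket: list of up to SLOTS_PER_BUCKET entries, whose last entry may be
    an ('EXT', ext_id) link.  extensions: dict of ext_id -> list of entries."""
    if len(bucket) < SLOTS_PER_BUCKET:
        bucket.append(element)              # a direct slot is still free
        return
    last = bucket[-1]
    if isinstance(last, tuple) and last[0] == 'EXT':
        ext = extensions[last[1]]           # chain already established
    else:
        ext_id = len(extensions)            # allocate a 16-slot extension frame
        ext = [last]                        # copy the displaced last slot into it
        extensions[ext_id] = ext
        bucket[-1] = ('EXT', ext_id)        # last index now holds the extension ID
    if len(ext) >= EXTENSION_SLOTS:
        raise OverflowError("extension full; would spill to the collision list 236")
    ext.append(element)

bucket, extensions = [], {}
for e in range(6):                          # six elements colliding into one bucket
    add_element(bucket, e, extensions)
```

After the sixth insertion the bucket holds three direct entries plus the extension link, and the extension holds the remaining three — still a linear structure that can be scanned without per-element pointer chasing.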
- linear traversals do not involve loading new memory, which makes the traversal faster, unlike a non-linear traversal (e.g., a linked list), which involves loading the next link and its corresponding memory before proceeding to the next element.
- Hardware implementations can optimize comparisons in a linear list.
- the hash bucket 240 preserves the expandability of the chaining, but this feature kicks in only when unusual workloads make the chaining necessary (e.g., there are more than 4 collisions).
- the hash slot 304 also indicates if the bad block bit is set (e.g., via flag 308 ). If the flag bit is set, it indicates that some/all of the cache segments present in this hash slot 304 have bad blocks and hence the caching algorithms can take appropriate actions based on the specific use case—for example divert the command to a firmware module to take specific actions.
- each consecutive strip/row can be found in consecutive hash slot buckets. This effectively allows a group of memory accesses to be performed instead of multiple individual memory accesses.
- a hash slot bucket corresponds to a group of positions in a hash table and a hash slot corresponds to a particular position in the hash table.
- a particular slot may be empty or contain an entry.
- the hash search instructions 232 will search the hashes assigned to each slot bucket 240 for matches. These hashes need to be updated as additional slots in the hash slot bucket 240 are used and data is stored in corresponding addresses of cache memory 148 .
- the method begins when an I/O command is received at the controller 108 from the host system 104 , after which the controller 108 invokes the cache management module 224 to search for an empty slot in the set associative hash 220 .
- the cache management module 224 will set a last slot index equal to 3 (e.g., if there are 4 slots per hash bucket) (step 404 ).
- the cache management module 224 will set an empty slot index (step 408 ). In this step, the cache management module 224 will find an empty slot by searching hash slot buckets 240 starting at hash slot bucket 0 240 a until the last slot index is reached.
- the cache manager 224 will check to see if the CSID field of the hash slot bucket is INVALID. This may indicate that the slot bucket is empty.
- the first hash index is obtained from the I/O command although it can also be received from other sources. This first hash index gives the hash slot bucket which contains four slots. The next check is to see if any of the four slots 304 within the hash slot bucket 240 are empty (step 412 )(e.g., from 0 to 3). If empty, the slot will be used to store the data in connection with the received I/O command.
- the cache management module 224 will also add the CSID that is assigned to the strip/row in connection with the I/O command to the hash slot assigned to the hash slot bucket 240 (step 440 ).
- the cache management module 224 will also update the appropriate fields of the slot 304 and bucket 240 to indicate that the assigned slot is not part of a slot chain. Thereafter, the method ends at step 444 .
- the cache management module 224 will check to see if the last slot index in the hash bucket has an extension (step 416 ). If an extension is found (step 420 ), then the cache management module 224 will get the CSID from the last slot and load the CSID into local memory. The cache management module 224 will then start the linear search within the frame/slot extension (step 424 ). The method then reverts back to step 412 to see if any empty slots are found within the extension.
- the cache management module 224 allocates a new cache frame/extension (step 428 ).
- the allocated cache frame can be of any size and can be indexable by an identifier.
- the cache frame may be a 128 byte cache frame.
- the method further continues with the cache management module 224 copying the last slot into the first slot in the newly-allocated frame (step 432 ). Then the cache management module 224 sets the tag field 316 , 320 to indicate that the newly-allocated frame corresponds to an extension (step 436 ). The cache management module 224 may further set the empty slot index to an appropriate value to indicate that the newly-allocated frame does not have any slots currently allocated therein. The method then proceeds to step 440 .
- the hash searching method is performed to see if there is a hash hit or miss for a particular I/O command.
- the method begins by setting the slot index (step 504 ).
- the slot index is set using a hash function.
- the hash function used to set the slot index is not a “strong” hash function. Rather, a relatively weak hash function is used for simple manipulation of data (e.g., shift and add functions). This means that faster responses can be achieved, but at the expense of higher collision rates.
- the hash slots are grouped into buckets and because adjacent LBAs are assigned to consecutive buckets, the benefits of the fast I/O processing can be realized without having to worry about the increased risk of collisions.
- the I/O command can be executed relatively quickly because the data for any I/O command is stored in consecutive memory locations. Moreover, all of the M buckets can be accessed in a single read access rather than having to use multiple accesses. This will be true as long as there are less than four collisions for each hash bucket (if each bucket has four slots).
- the get hash index algorithm computes a hash index (e.g., “HashIndex”) according to the following:
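The formula itself does not appear in this excerpt. As a purely illustrative stand-in consistent with the weak shift-and-add hashing described elsewhere in the disclosure (the bucket count, the shift amount, and the mixing of a virtual-disk ID are all assumptions, not the patent's values):

```python
# Illustrative weak hash using only shifts, adds, and masks -- no division.
# HASH_BITS and the virtual-disk shift are assumed values for illustration.

HASH_BITS = 20  # e.g., 2**20 hash slot buckets (assumption)

def get_hash_index(strip_row, vd_id):
    """Hypothetical hash index: consecutive strips/rows land in consecutive
    buckets, while the virtual-disk ID offsets each disk's region so that,
    e.g., LBA 0 of every virtual disk does not map to the same bucket."""
    h = strip_row + (vd_id << (HASH_BITS - 4))  # shift-and-add only
    return h & ((1 << HASH_BITS) - 1)           # cheap mask instead of a modulo
```

The key property this sketch preserves is that adjacent strips/rows of one virtual disk map to adjacent buckets, which is what allows a group of slots to be loaded in one access.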
- the method proceeds with the cache management module 224 and/or hash search instructions 232 determining the number of slots that need to be loaded to accommodate the I/O command (step 508 ). In some embodiments, this number may be approximated by the cache management module 224 and/or hash search instructions 232 (as will be described in connection with FIG. 6 ).
- the slots then begin to be loaded (step 512 ). In some embodiments, the slots are loaded in alignment with a burst boundary. The burst boundary can depend upon the slot size, the burst size, and the number of slots included in a slot bucket.
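For instance, with an assumed 32-byte slot and 128-byte burst (so four slots per burst), the aligned load range might be computed as follows; the constants and function name are illustrative assumptions, not values from the patent.

```python
# Sketch of aligning a slot load to a burst boundary (assumed parameters).
# Because the sizes are powers of two, the divisions below reduce to shifts
# in hardware, consistent with avoiding division logic.

SLOT_SIZE = 32        # bytes per hash slot (assumption)
BURST_SIZE = 128      # bytes per memory burst (assumption)
SLOTS_PER_BURST = BURST_SIZE // SLOT_SIZE

def burst_aligned_range(first_slot, num_slots):
    """Return (start, count) covering the requested slots, aligned to bursts."""
    start = (first_slot // SLOTS_PER_BURST) * SLOTS_PER_BURST
    end = first_slot + num_slots
    # Round the end up to the next burst boundary.
    end = ((end + SLOTS_PER_BURST - 1) // SLOTS_PER_BURST) * SLOTS_PER_BURST
    return start, end - start
```

Requesting three slots starting at slot 5 would, under these assumptions, load one full burst of four slots beginning at slot 4.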
- the slot loading begins with the first slot (step 516 ).
- the cache management module 224 and/or hash search instructions 232 will determine whether the number of slots needed (e.g., as estimated or determined in step 508 ) is less than the total number of slots in a bucket (step 520 ). If this query is answered negatively, then the method proceeds by checking if the last slot has an extension (step 536 ). If the last slot has an extension, then the cache management module 224 and/or hash search instructions 232 get the ID of the frame from the last slot index and load that particular frame into local memory for additional searching (step 544 ). The cache management module 224 and/or hash search instructions 232 then update the last slot number and the last slot index (step 548 ). After the slot number and last slot index have been updated, the method proceeds to step 528 .
- at step 540 , if there is no extension present, then the method proceeds to step 568 ( FIG. 5B ), as will be discussed in further detail herein.
- the method proceeds by comparing the strip/row from the slot to the strip/row for the requested I/O command (step 524 ). Thereafter, the method proceeds with the cache management module 224 and/or hash search instructions 232 determining if a hash match is found (step 528 ). If not, the method increments to the next slot (step 532 ) and returns to step 520 .
- the method proceeds until a CSID match is found (step 552 ).
- Finding a matching CSID may be referred to as finding a hash hit.
- the required operation is performed in connection with executing the I/O command (step 556 ).
- the local hash value may be stored in memory of the controller 108 or in the controller cache 140 (step 560 ). Thereafter, the method ends (step 564 ).
- at step 540 ( FIG. 5A ), if there is no extension present, then the method proceeds with the cache management module 224 and/or hash search instructions 232 performing the required operation associated with a hash miss (step 568 ).
- the local hash can then be stored in memory if needed (step 560 ) and then the method ends (step 564 ).
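Taken together, the search of FIGS. 5A-5B amounts to a linear scan of a bucket's slots that follows an extension link when one is present. A compact sketch, using a hypothetical slot layout in which a dict tagged `EXTENSION_TAG` stands in for the extension link in the last slot:

```python
# Compact sketch of the bucket search with extension following.
# The slot representation and names are hypothetical.

EXTENSION_TAG = 0xFF  # assumed 1-byte tag marking an extension link

def hash_search(bucket, extensions, target):
    """Linearly search a bucket's slots, following extension chains.

    Returns True on a hash hit, False on a hash miss.
    """
    slots = bucket
    while True:
        for slot in slots:
            if isinstance(slot, dict) and slot.get("tag") == EXTENSION_TAG:
                # Load the extension frame and continue the linear search there.
                slots = extensions[slot["ext_id"]]
                break
            if slot == target:
                return True  # hash hit
        else:
            return False  # hash miss: no match and no further extension
```

The linear scan within a loaded bucket or extension frame is what a hardware implementation can optimize, since no new memory load is required between comparisons.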
- Embodiments of the present disclosure contemplate such an approximation method.
- the method begins by determining the number of strips of the first row (NSFR) (step 604 ).
- This value of NSFR may be set equal to the row data size minus log Arm.
- the Log Arm is the Logical Arm in a RAID volume. For a RAID-5 volume of 3 drives, there will be two data Logical Arms (Log Arm 0 and Log Arm 1) and one parity arm. Data is striped across these Logical Arms.
- An I/O command contains a Row Number, a Logical Arm Number, and an Offset in Arm to represent the start LBA for a RAID-5 volume.
- RowNumber = startLba / number of Data Arms.
- Log Arm = (startLba / stripSize) % Number of Data Arms.
- OffsetInArm = startLba % stripSize.
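Under these definitions the three quantities can be computed directly. The following sketch uses the common RAID-5 formulation in which the start LBA is first converted to a strip number; the RowNumber expression in the disclosure may be an abbreviated form of this same computation, so the helper below is an interpretation, not the patent's exact arithmetic.

```python
def raid5_split(start_lba, strip_size, num_data_arms):
    """Hypothetical helper: split a start LBA into
    (RowNumber, Log Arm, OffsetInArm) for a RAID-5 volume."""
    strip_number = start_lba // strip_size          # which strip the LBA falls in
    row_number = strip_number // num_data_arms      # strips are spread round-robin
    log_arm = strip_number % num_data_arms          # which data arm holds the strip
    offset_in_arm = start_lba % strip_size          # offset within that strip
    return row_number, log_arm, offset_in_arm
```

For example, with two data arms and a strip size of 4 LBAs, start LBA 10 falls in row 1, Log Arm 0, at offset 2 within the arm.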
- the method proceeds by determining whether the NSFR is greater than the number of strips spanned by the I/O command (step 608 ). If the query is answered positively, then the NSFR is adjusted to equal the number of strips spanned by the I/O operation (step 612 ).
- if the query of step 608 is answered negatively, the method continues by determining the row data size, the number of rows, the number of full rows, and the number of strips spanned in the last row (step 616 ). The method further continues by determining whether the number of strips spanned in the last row is equal to zero (step 620 ). If this query is answered affirmatively, then the method proceeds to setting the number of rows equal to the determined number of rows plus one (step 624 ). However, if the query of step 620 is answered negatively, then the method proceeds by approximating the number of rows spanned by the I/O operation (step 628 ). In some embodiments, the nearest power of two value (NP2) of the row data size of the RAID volume is used to approximate the number of rows spanned by the I/O operation.
- NP2 = nearest power of two value.
Abstract
Description
- This Non-Provisional patent Application claims the benefit of U.S. Provisional Patent Application No. 62/410,752, filed Oct. 20, 2016, the entire disclosure of which is hereby incorporated herein by reference.
- The present disclosure is generally directed toward computer memory.
- Hashing is one of the basic frameworks utilized in a caching environment. The advantage of hashing for cache lookup is that in the best case the lookup will be O(1) when there are no collisions. But when there are collisions, the average or worst case lookup can be very time consuming because of the memory accesses involved.
- There is a need for an effective hashing function which can make sure that the hash slots are used optimally. A region in each virtual disk should not map to the same slot; otherwise it has a tendency to get into the collision list. For example, Logical Block Address (LBA) 0, which contains the master boot record, is frequently accessed. This means that if LBA 0 of every virtual disk maps to the same hash slot, it is more likely that most of these LBAs are frequently accessed and hence lead to the collision list.
- When caching is implemented in hardware, the load store operations of memory locations determine the performance that can be achieved by the solution. An effective hash structure organization would be useful so that the load store operations are minimized. Typically, in a hardware accelerated solution, the division operation is very expensive, and it is not preferred to add division logic to the hardware if it can be avoided.
- The present disclosure is described in conjunction with the appended figures, which are not necessarily drawn to scale:
-
FIG. 1 is a block diagram depicting a computing system in accordance with at least some embodiments of the present disclosure; -
FIG. 2 is a block diagram depicting details of an illustrative controller in accordance with at least some embodiments of the present disclosure; -
FIG. 3A is a block diagram depicting additional details of hash slot buckets and the slots contained therein in accordance with at least some embodiments of the present disclosure; -
FIG. 3B is a block diagram depicting additional details of a hash extension in accordance with at least some embodiments of the present disclosure; -
FIG. 4 is a flow diagram depicting a method of adding to a local hash value in accordance with at least some embodiments of the present disclosure; -
FIG. 5A is a first portion of a flow diagram depicting a hash searching method in accordance with at least some embodiments of the present disclosure; -
FIG. 5B is a second portion of the flow diagram from FIG. 5A ; and -
FIG. 6 is a flow diagram depicting a method of approximating a number of rows needed to accommodate an Input/Output (I/O) operation in accordance with at least some embodiments of the present disclosure. - The ensuing description provides embodiments only, and is not intended to limit the scope, applicability, or configuration of the claims. Rather, the ensuing description will provide those skilled in the art with an enabling description for implementing the described embodiments, it being understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the appended claims.
- Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and this disclosure.
- As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprise,” “comprises,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The term “and/or” includes any and all combinations of one or more of the associated listed items.
- As will be discussed in further detail herein, embodiments of the present disclosure present a hashing method for efficient cache lookup that minimizes memory accesses. A new hash with a chaining scheme is introduced with a hash table (e.g., a deep DRAM hash table) that is used for the cache lookup. In some embodiments, multi-level hash functions are used: one level to get to an appropriate hash slot bucket and another level to get to the appropriate hash slot within the hash slot bucket in case all slots are occupied.
- A unique hash structure is also provided that can aid in a set associative hashing scheme. The proposed hash structure also enables a method to add elements to a hash while ensuring that all slots in a hash slot bucket are equally loaded.
- Another aspect of the present disclosure is to provide a method for approximating the number of slots to load for an R5/R6 Input/Output (I/O) operation.
- With reference to
FIGS. 1-6 , various embodiments of the present disclosure will be described. While many of the examples depicted and described herein will relate to a RAID architecture, it should be appreciated that embodiments of the present disclosure are not so limited. Indeed, aspects of the present disclosure can be used in any type of computing system and/or memory environment. In particular, embodiments of the present disclosure can be used in any type of caching scheme (whether employed by a RAID controller or some other type of device used in a communication system). In particular, hard drives, hard drive controllers (e.g., SCSI controllers, SAS controllers, or RAID controllers) may be configured to implement embodiments of the present disclosure. As another example, network cards or the like having cache memory may also be configured to implement embodiments of the present disclosure. - With reference now to
FIG. 1 , additional details of acomputing system 100 capable of implementing hashing methods and various cache lookup techniques will be described in accordance with at least some embodiments of the present disclosure. Thecomputing system 100 is shown to include ahost system 104, a controller 108 (e.g., a SCSI controller, a SAS controller, a RAID controller, etc.), and astorage array 112 having a plurality of storage devices 136 a-N therein. Thesystem 100 may utilize any type of data storage architecture. The particular architecture depicted and described herein (e.g., a RAID architecture) should not be construed as limiting embodiments of the present disclosure. If implemented as a RAID architecture, however, it should be appreciated that any type of RAID scheme may be employed (e.g., RAID-0, RAID-1, RAID-2, . . . , RAID-5, RAID-6, etc.). - In a RAID-0 (also referred to as a RAID level 0) scheme, data blocks are stored in order across one or more of the storage devices 136 a-N without redundancy. This effectively means that none of the data blocks are copies of another data block and there is no parity block to recover from failure of a storage device 136. A RAID-1 (also referred to as a RAID level 1) scheme, on the other hand, uses one or more of the storage devices 136 a-N to store a data block and an equal number of additional mirror devices for storing copies of a stored data block. Higher level RAID schemes can further segment the data into bits, bytes, or blocks for storage across multiple storage devices 136 a-N. One or more of the storage devices 136 a-N may also be used to store error correction or parity information.
- A single unit of storage can be spread across multiple devices 136 a-N and such a unit of storage may be referred to as a stripe. A stripe, as used herein and as is well known in the data storage arts, may include the related data written to multiple devices 136 a-N as well as the parity information written to a parity storage device 136 a-N. In a RAID-5 (also referred to as a RAID level 5) scheme, the data being stored is segmented into blocks for storage across multiple devices 136 a-N with a single parity block for each stripe distributed in a particular configuration across the multiple devices 136 a-N. This scheme can be compared to a RAID-6 (also referred to as a RAID level 6) scheme in which dual parity blocks are determined for a stripe and are distributed across each of the multiple devices 136 a-N in the
array 112. - One of the functions of the
controller 108 is to make the multiple storage devices 136 a-N in thearray 112 appear to ahost system 104 as a single high capacity disk drive. Thus, thecontroller 108 may be configured to automatically distribute data supplied from thehost system 104 across the multiple storage devices 136 a-N(potentially with parity information) without ever exposing the manner in which the data is actually distributed to thehost system 104. - In the depicted embodiment, the
host system 104 is shown to include aprocessor 116, aninterface 120, andmemory 124. It should be appreciated that thehost system 104 may include additional components without departing from the scope of the present disclosure. Thehost system 104, in some embodiments, corresponds to a user computer, laptop, workstation, server, collection of servers, or the like. Thus, thehost system 104 may or may not be designed to receive input directly from a human user. - The
processor 116 of thehost system 104 may include a microprocessor, central processing unit (CPU), collection of microprocessors, or the like. Thememory 124 may be designed to store instructions that enable functionality of thehost system 104 when executed by theprocessor 116. Thememory 124 may also store data that is eventually written by thehost system 104 to thestorage array 112. Further still, thememory 124 may be used to store data that is retrieved from thestorage array 112.Illustrative memory 124 devices may include, without limitation, volatile or non-volatile computer memory (e.g., flash memory, RAM, DRAM, ROM, EEPROM, etc.). - The
interface 120 of the host system 104 enables the host system 104 to communicate with the controller 108 via a host interface 128 of the controller 108 . In some embodiments, the interface 120 and host interface(s) 128 may be of a same or similar type (e.g., utilize a common protocol, a common communication medium, etc.) such that commands issued by the host system 104 are receivable at the controller 108 and data retrieved by the controller 108 is transmittable back to the host system 104 . The interfaces 120 , 128 may correspond to parallel or serial computer interfaces that utilize wired or wireless communication channels. The interfaces 120 , 128 may include hardware that enables such wired or wireless communications. The communication protocol used between the host system 104 and the controller 108 may correspond to any type of known host/memory control protocol. Non-limiting examples of protocols that may be used between interfaces 120 , 128 include SAS, SATA, SCSI, FibreChannel (FC), iSCSI, ATA over Ethernet, InfiniBand, or the like. - The
controller 108 may provide the ability to represent the entire storage array 112 to the host system 104 as a single high volume data storage device. Any known mechanism can be used to accomplish this task. The controller 108 may help to manage the storage devices 136 a-N (which can be hard disk drives, solid-state drives, or combinations thereof) so as to operate as a logical unit. In some embodiments, the controller 108 may be physically incorporated into the host device 104 as a Peripheral Component Interconnect (PCI) expansion (e.g., PCI Express (PCIe)) card or the like. In such situations, the controller 108 may be referred to as a RAID adapter. - The storage devices 136 a-N in the
storage array 112 may be of similar types or may be of different types without departing from the scope of the present disclosure. The storage devices 136 a-N may be co-located with one another or may be physically located in different geographical locations. The nature of thestorage interface 132 may depend upon the types of storage devices 136 a-N used in thestorage array 112 and the desired capabilities of thearray 112. Thestorage interface 132 may correspond to a virtual interface or an actual interface. As with the other interfaces described herein, thestorage interface 132 may include serial or parallel interface technologies. Examples of thestorage interface 132 include, without limitation, SAS, SATA, SCSI, FC, iSCSI, ATA over Ethernet, InfiniBand, or the like. - The
controller 108 is shown to have communication capabilities with acontroller cache 140. While depicted as being separate from thecontroller 108, it should be appreciated that thecontroller cache 140 may be integral to thecontroller 108, meaning that components of thecontroller 108 and thecontroller cache 140 may be contained within a single physical housing or computing unit (e.g., server blade). Thecontroller cache 140 is provided to enable thecontroller 108 to perform caching operations. Thecontroller 108 may employ caching operations during execution of I/O commands received from thehost system 104. Depending upon the nature of the I/O command and the amount of information being processed during the command, thecontroller 108 may require a large number ofcache memory modules 148 or a smaller number ofcache memory modules 148. Thememory modules 148 may correspond to flash memory, RAM, DDR memory, or some other type of computer memory that is quickly accessible and can be rewritten multiple times. The number ofseparate memory modules 148 in thecontroller cache 140 is typically larger than one, although acontroller cache 140 may be configured to operate with asingle memory module 148 if desired. - The
cache interface 144 may correspond to any interconnect that enables thecontroller 108 to access thememory modules 148, temporarily store data thereon, and/or retrieve data stored thereon in connection with performing an I/O command or some other executable command. In some embodiments, thecontroller cache 140 may be integrated with thecontroller 108 and may be executed on a CPU chip or placed on a separate chip within thecontroller 108. In such a scenario, theinterface 144 may correspond to a separate bus interconnect within the CPU or traces connecting a chip of thecontroller cache 140 with a chip executing the processor of thecontroller 108. In other embodiments, thecontroller cache 140 may be external to thecontroller 108 in which case theinterface 144 may correspond to a serial or parallel data port. - Direct mapping is one type of cache configuration in which each block is mapped to exactly one cache location. Conceptually, this is like rows in a table with three columns: the data block or cache line that contains the actual data fetched and stored, a tag that contains all or part of the address of the fetched data, and a flag bit that connotes the presence of a valid bit of data in the row entry.
- Fully associative mapping is similar to direct mapping in structure, but allows a block to be mapped to any cache location rather than to a pre-specified cache location (as is the case with direct mapping).
- Set associative mapping can be viewed as a compromise between direct mapping and fully associative mapping in which each block is mapped to a subset of cache locations. It is sometimes called N-way set associative mapping, which provides for a location in main memory to be cached to any of “N” locations in the L1 cache.
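The three mapping policies above differ only in how many candidate cache locations a block address maps to. A small comparative sketch, using the textbook modulo indexing with illustrative parameters (the function names are not from the patent):

```python
def direct_mapped_location(block, num_lines):
    """Direct mapping: each block has exactly one candidate line."""
    return [block % num_lines]

def n_way_set_locations(block, num_sets, n_ways):
    """N-way set associative: the block may occupy any of N ways of one set."""
    set_index = block % num_sets
    return [set_index * n_ways + way for way in range(n_ways)]

def fully_associative_locations(block, num_lines):
    """Fully associative: every line is a candidate for the block."""
    return list(range(num_lines))
```

The four-slot hash slot buckets described earlier behave like the middle case: each strip/row hashes to one bucket, and any of that bucket's slots may hold it.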
- As will be discussed in further detail herein, the
controller 108 and controller cache 140 may be enabled to implement an N-way set associative mapping in a more efficient manner than previous caching systems. In particular, the controller 108 and controller cache 140 may be configured to use hashing techniques that are N-way associative. This will effectively control the number of hash collisions, which can help increase processing speed even though different inputs may result in the same hash output. Although this particular type of hash strategy will be described herein, it should be appreciated that any hash type can be used, including relatively simple hash types that potentially result in the same hash outputs using different hash inputs. - With reference now to
FIG. 2 , additional details of a controller 108 will be described in accordance with at least some embodiments of the present disclosure. The controller 108 is shown to include the host interface(s) 128 and storage interface(s) 132 . The controller 108 is also shown to include a processor 204 , memory 208 , one or more drivers 212 , and a power source 216 . - The
processor 204 may include an Integrated Circuit (IC) chip or multiple IC chips, a CPU, a microprocessor, or the like. The processor 204 may be configured to execute instructions in memory 208 that are shown to include a cache management module 224 , hashing instructions 228 , hash search instructions 232 , and a slot collision list 236 . Furthermore, in connection with executing the cache management module 224 , the processor 204 may utilize a set associative hash 220 containing a plurality of hash buckets 240 a-M. The set associative hash 220 is shown to be maintained internally to the controller 108 . It should be appreciated, however, that some or all of the set associative hash 220 may be stored and/or maintained external to the controller 108 . Alternatively or additionally, the set associative hash 220 may be stored or contained within memory 208 of the controller 108 . - The
memory 208 may be volatile and/or non-volatile in nature. As indicated above, thememory 208 may include any hardware component or collection of hardware components that are capable of storing instructions and communicating those instructions to theprocessor 204 for execution. Non-limiting examples ofmemory 208 include RAM, ROM, flash memory, EEPROM, variants thereof, combinations thereof, and the like. - The instructions stored in
memory 208 are shown to be different instruction sets, but it should be appreciated that the instructions can be combined into a smaller number of instruction sets without departing from the scope of the present disclosure. The cache management module 224 , when executed, enables the processor 204 to receive I/O commands from the host system 104 , determine one or more caching operations that need to be performed in connection with executing the I/O command, and identify which memory module(s) 148 should be used in connection with performing the one or more caching operations. The cache management module 224 may also be configured to invoke the other instruction sets in memory 208 to further perform such caching operations. The cache management module 224 may also be configured to manage the set associative hash 220 and the values assigned to various data fields within the set associative hash 220 . In particular, as hash buckets 240 a-M are utilized, the cache management module 224 may be enabled to update the set associative hash 220 to reflect such utilizations. The cache management module 224 may also be responsible for facilitating communications between the controller 108 and the controller cache 140 . - The hashing
instructions 228, when executed, enable theprocessor 208 to calculate hash output values based hash inputs. In some embodiments, hash inputs used by the hashinginstructions 228 to calculate hash output values may include numbers assigned to particular hash buckets 240 a-M, numbers assigned to hash slots within the hash buckets 240 a-M, memory addresses ofmemory modules 148, LBAs, storage time, data retrieval time, strip/row numbers, a hash index, a maximum number of bits needed to complete the I/O command, and other numbers derived therefrom. In some embodiments, the hashinginstructions 228 may implement a hashing function that correspond to a linear hashing function. Use of a linear hash function can help ensure that hash slots for successive strips/rows of data are stored consecutively (thereby increasing the efficiency with which such data can be later retrieved by the controller 108). Use of a linear hash function is advantageous especially in hardware accelerated caching solutions where the hardware thread does not have direct access to the memory module 148 (e.g., SRAM or DRAM) where the hash slots are stored. These hash slots can be loaded when required and stored back when the objective is achieved (e.g., the I/O command is completed). Since an I/O command almost always results in consecutive strips/rows, many slots can be loaded and stored together, thus ensuring that there are not too many unnecessary load and store operations. In some embodiments, the transpose operation of the hash function executed by the hashinginstructions 228 ensures that a region in each virtual disk should not map to the same slot in a hash bucket 240 a-M, otherwise the slot will have a tendency to be added to theslot collision list 236. - The
hash search instructions 232, when executed, enable theprocessor 208 to search the setassociative hash 220 for hash collisions. As will be described in further detail herein, as thecontroller 108 executes I/O commands, thecontroller 108 may utilize thehash search instructions 232 to see if a particular piece or set of data is already stored in thememory module 148. Thehash search instructions 232 may be enabled to utilize the hashinginstructions 228 to generate a hash for the data being searched and may compare that hash to hashes stored in the setassociative hash 220. If collisions (e.g., matches) are found, thehash search instructions 232 and/orcache management module 224 may be enabled to update theslot collision list 236 as will be described in further detail herein. - With reference now to
FIGS. 3A and 3B , details of a setassociative hash 220 used in accordance with embodiments of the present disclosure will be described. The setassociative hash 220 is shown to include a plurality of hash buckets 240 a-M linearly arranged. The setassociative hash 220 may include up to M hash buckets, where M is an integer number greater than or equal to one. - Each hash slot bucket 240 is shown to include a plurality of hash slots 304 a-d. Although the hash slot buckets 240 are shown to include four hash slots 304 a-d, it should be appreciated that a hash slot bucket 240 may be configured to include a greater or lesser number of hash slots. For instance, the hash slot bucket 240 may have two, three, four, five, . . . , ten, twenty, or more hash slots without departing from the scope of the present disclosure.
- Each hash slot 304 is shown to include a number of fields for storing information in connection with a hash generated for the slot 304. The fields include, without limitation, a
flag field 308, a Cache Segment ID (CSID) field 312, an LSB field 316, and an MSB field 320. The flag field 308 may contain a number of bits that indicate various details or pieces of information related to the hash slot 304. For instance, the flags 308 may be used to indicate whether data stored in the hash slot 304 continues in another hash slot (e.g., by an appropriate indication in the bit link field 324). As another non-limiting example, the flags 308 may indicate the presence of a bad block in one of the LBA ranges that map into the corresponding slot 304. - The
CSID field 312 provides an identification of a cache segment. The identification may be provided as an alphanumeric string. In some embodiments, adjacent slots may be assigned consecutively-numbered CSIDs. In other embodiments, CSIDs may be assigned randomly or pseudo-randomly. In any event, the CSID provides a substantially unique (e.g., unique to the data structure 220) identifier for a slot 304. This means that any particular slot 304 will have a different CSID from all other slots 304 in the set associative hash 220. - The combination of the
LSB field 316 and MSB field 320 may provide an address range in memory where a hash slot is stored. In particular, the identifiers stored in the fields 316, 320 may identify an LBA of a memory module 148 or an LBA range of the memory module 148. Collectively, the fields 316, 320 may be referred to as a tag field. In some embodiments, a 1 byte value in the tag field may indicate that the cache segment frame is used as a hash slot extension, which is shown in FIG. 3B. In some embodiments, each hash slot 304 is provisioned with four potential collision lists. This effectively ensures that when there is a hash hit and a new element is added to the hash index, the hash does not go into the collision list but instead into the next available slot. The hash will only be added to the collision list 236 when all four slots 304 of the hash bucket 240 are filled. - This set associative hashing ensures that the length of the collision list is reduced. For instance, assume there are 4 elements that need to be added to the same hash slot bucket 240; then all 4 are added into the slots 304a-d and there would not be any addition to the
collision list 236. When a fifth element is to be added, an extension (as shown in FIG. 3B) that can hold up to 16 slots is created and the last index (e.g., the fourth index) in the hash slot 304d is updated with the extension ID; the information in the last index (e.g., the fourth index) is copied into the first index slot in the extension, thus establishing the chain. Now up to 16 more elements can be added to this extension frame without adding a further collision list 236. After all 16 slots of the hash extension are filled, if an additional element needs to be added, then one more hash extension frame that can hold up to 16 slots is created and it is chained to the last slot in the current extension frame (thus creating a chain of hash slot extensions). Accordingly, if 100 elements need to be added and set associative hashing is not employed, then the worst-case non-linear traversal is going to be 100 to determine whether there is a hash hit or not. But with this approach, only the four slots 304 in the hash bucket 240 require a linear search; then one chain is loaded and the 16 slots in the first extension involve a linear search. In the same way there would be a maximum of 6 extensions, each one having a linear list of 16 elements. Hence with this approach, in the worst case, there are 6 non-linear traversals and 100 linear traversals. It should be noted that linear traversals do not involve loading new memory, which makes the traversal faster, unlike a non-linear traversal (e.g., a linked list) which involves loading the next link and its corresponding memory before proceeding to the next element. - Hardware implementations can optimize comparisons in a linear list. In some embodiments, the hash bucket 240 preserves the expandability of the chaining, but this feature kicks in only when unusual workloads make the chaining necessary (e.g., there are more than 4 collisions). The hash slot 304 also indicates if the bad block bit is set (e.g., via flag 308). 
If the flag bit is set, it indicates that some or all of the cache segments present in this hash slot 304 have bad blocks, and hence the caching algorithms can take appropriate actions based on the specific use case (for example, diverting the command to a firmware module to take specific actions).
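The worst-case arithmetic above (4 bucket slots, 16-slot extensions, 6 extension loads for 100 colliding elements) can be checked with a short sketch. The function below is purely illustrative; the names are assumptions rather than anything from the patent:

```python
BUCKET_SLOTS = 4       # four-way set associative hash bucket
EXTENSION_SLOTS = 16   # slots per chained hash slot extension frame

def extension_frames(num_elements: int) -> int:
    """Number of extension frames (non-linear loads) needed so that
    `num_elements` entries hashing to one bucket can all be scanned;
    elements beyond the 4 bucket slots spill into 16-slot extensions."""
    if num_elements <= BUCKET_SLOTS:
        return 0
    overflow = num_elements - BUCKET_SLOTS
    return -(-overflow // EXTENSION_SLOTS)  # ceiling division
```

For 100 elements this yields 6, matching the 6 non-linear traversals cited above, whereas a plain linked-list collision chain would cost up to 100 pointer chases.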
- As mentioned above, each consecutive strip/row can be found in consecutive hash slot buckets. This effectively allows a group of memory accesses to be performed instead of multiple individual memory accesses. As used herein, a hash slot bucket corresponds to a group of positions in a hash table and a hash slot corresponds to a particular position in the hash table. A particular slot may be empty or contain an entry.
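Combining the bucket/slot definitions above with the extension chaining described earlier, element addition can be modeled with a short sketch. Everything here is an assumption for illustration (a real controller manipulates cache segment frames in SRAM/DRAM, and slots carry flags and tag fields rather than bare strings):

```python
BUCKET_SLOTS = 4
EXTENSION_SLOTS = 16

class Frame:
    """A hash slot bucket (4 slots) or a hash slot extension (16 slots)."""
    def __init__(self, capacity: int):
        self.slots = [None] * capacity  # None models an INVALID CSID
        self.extension = None           # chained extension frame, if any

def add_element(bucket: Frame, csid: str) -> None:
    """Place `csid` in the first empty slot, chaining a new 16-slot
    extension (and copying the last slot into it) when a frame fills."""
    frame = bucket
    while True:
        for i, slot in enumerate(frame.slots):
            if slot is None:                    # empty slot found
                frame.slots[i] = csid
                return
        if frame.extension is None:             # frame full and unchained
            ext = Frame(EXTENSION_SLOTS)
            ext.slots[0] = frame.slots[-1]      # copy last slot into extension
            frame.slots[-1] = ext               # last index now holds the chain
            frame.extension = ext
        frame = frame.extension                 # continue in the extension
```

Adding five elements to an empty bucket in this model leaves three CSIDs in the bucket, the fourth copied into the extension's first slot, and the fifth placed beside it, mirroring the chaining described above.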
- With reference now to
FIG. 4, a method of adding elements to a hash assigned to a slot bucket 240 will be described in accordance with at least some embodiments of the present disclosure. As can be appreciated, the hash search instructions 232 will search the hashes assigned to each slot bucket 240 for matches. These hashes need to be updated as additional slots in the hash slot bucket 240 are used and data is stored in corresponding addresses of cache memory 148. - The method begins when an I/O command is received at the
controller 108 from the host system 104 and then the controller 108 invokes the cache management module 224 to search for an empty slot in the set associative hash 220. Once initiated, the cache management module 224 will set a last slot index equal to 3 (e.g., if there are 4 slots per hash bucket) (step 404). Then the cache management module 224 will set an empty slot index (step 408). In this step, the cache management module 224 will find an empty slot by searching hash slot buckets 240 starting at hash slot bucket 0 240a until the last slot index is reached. At each hash bucket, the cache manager 224 will check to see if the CSID field of the hash slot bucket is INVALID. This may indicate that the slot bucket is empty. In some embodiments, the first hash index is obtained from the I/O command, although it can also be received from other sources. This first hash index gives the hash slot bucket, which contains four slots. The next check is to see if any of the four slots 304 within the hash slot bucket 240 are empty (step 412) (e.g., from 0 to 3). If empty, the slot will be used to store the data in connection with the received I/O command. - If the first slot bucket is empty, then the first available slot within the slot bucket is assigned to the I/O command and the strip/row associated with that slot will be used to store data in connection with the I/O command. The
cache management module 224 will also add the CSID that is assigned to the strip/row in connection with the I/O command to the hash slot assigned to the hash slot bucket 240 (step 440). The cache management module 224 will also update the appropriate fields of the slot 304 and bucket 240 to indicate that the assigned slot is not part of a slot chain. Thereafter, the method ends at step 444. - Referring back to step 408, if an empty slot is not found, the
cache management module 224 will check to see if the last slot index in the hash bucket has an extension (step 416). If an extension is found (step 420), then the cache management module 224 will get the CSID from the last slot and load the CSID into local memory. The cache management module 224 will then start the linear search within the frame/slot extension (step 424). The method then reverts back to step 412 to see if any empty slots are found within the extension. - If no extension is found in
step 420, then the cache management module 224 allocates a new cache frame/extension (step 428). In some embodiments, the allocated cache frame can be of any size and can be indexable by an identifier. As a non-limiting example, the cache frame may be a 128 byte cache frame. - The method further continues with the
cache management module 224 copying the last slot into the first slot in the newly-allocated frame (step 432). Then the cache management module 224 sets the tag fields 316, 320 to indicate that the newly-allocated frame corresponds to an extension (step 436). The cache management module 224 may further set the empty slot index to an appropriate value to indicate that the newly-allocated frame does not have any slots currently allocated therein. The method then proceeds to step 440. - With reference now to
FIGS. 5A and 5B, additional details of a hash searching method will be described in accordance with at least some embodiments of the present disclosure. The hash searching method is performed to see if there is a hash hit or miss for a particular I/O command. - The method begins by setting the slot index (step 504). In some embodiments, the slot index is set using a hash function. In more specific embodiments, the hash function used to set the slot index is not a “strong” hash function. Rather, a relatively weak hash function is used for simple manipulation of data (e.g., shift and add functions). This means that faster responses can be achieved, but at the expense of higher collision rates. However, because the hash slots are grouped into buckets and because adjacent LBAs are assigned to consecutive buckets, the benefits of fast I/O processing can be realized without having to worry about the increased risk of collisions. Furthermore, because embodiments of the present disclosure contemplate using DRAM, the I/O command can be executed relatively quickly because the data for any I/O command is stored in consecutive memory locations. Moreover, all of the M buckets can be accessed in a single read access rather than having to use multiple accesses. This will be true as long as there are fewer than four collisions for each hash bucket (if each bucket has four slots).
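A weak, shift-and-add style slot index of the sort described above can be sketched as follows. The parameter values and names are assumptions for illustration (a 1024-entry hash table, up to 64 virtual disks): consecutive strips/rows land in consecutive buckets, while the bit-reversed (transposed) virtual disk number, mentioned earlier in connection with the hashing instructions 228, spreads different disks apart so their regions do not pile onto the same slots.

```python
def transpose_bits(value: int, width: int) -> int:
    """Reverse the bit order of `value` across `width` bits."""
    out = 0
    for i in range(width):
        if value & (1 << i):
            out |= 1 << (width - 1 - i)
    return out

def slot_index(strip_row: int, vd_number: int,
               hash_bits: int = 10,   # 2**10 = 1024 hash slots (assumed)
               vd_bits: int = 6):     # up to 64 virtual disks (assumed)
    """Weak hash: bit-reversed VD number, shifted up, plus the strip/row
    number, reduced modulo the table size (shifts and adds only)."""
    num_slots = 1 << hash_bits
    flipped_vd = transpose_bits(vd_number, vd_bits) << (hash_bits - vd_bits)
    return (strip_row + flipped_vd) % num_slots
```

Note that slot_index(n + 1, vd) equals slot_index(n, vd) + 1 until the table wraps, which is what lets a multi-strip I/O load its slots in one consecutive burst.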
- In some embodiments, the get hash index algorithm computes a hash index (e.g., “HashIndex”) according to the following:
-
Transp(VD = a0a1a2a3a4a5a6a7a8a9aaab) = abaaa9a8a7a6a5a4a3a2a1a0
shift = maxBitsForHS - maxBitsInLdNumber
flippedVD = Transp(VD) << shift
HashIndex = (Strip_RowNumber + flippedVD) % numberOfHashSlots
Where a0a1a2a3a4a5a6a7a8a9aaab denotes the bits of the virtual disk ID or number (e.g., TargetID), where maxBitsForHS is the number of bits required to represent the hash table (e.g., if the hash table size is 1024, the maxBitsForHS value is 10 (2^10=1024)), where maxBitsInLdNumber is the number of bits required to represent all of the Virtual Disks (e.g., if the total number of virtual disks supported is 64, then this value is 6, whereas for 2048 the value is 11), where Transp(VD) = abaaa9a8a7a6a5a4a3a2a1a0 is the transposed (bit-reversed) value of the Virtual Disk Number, where shift is the number of bits by which Transp(VD) needs to be shifted, where flippedVD is the shifted Transp(VD), which makes the hash sparse, and where Strip_RowNumber is the strip number present in the I/O command for a RAID 0 or RAID 1 volume or the row number for RAID 5 or RAID 6 volumes. - After the slot index is found, the method proceeds with the
cache management module 224 and/or hash search instructions 232 determining the number of slots that need to be loaded to accommodate the I/O command (step 508). In some embodiments, this number may be approximated by the cache management module 224 and/or hash search instructions 232 (as will be described in connection with FIG. 6). The slots then begin to be loaded (step 512). In some embodiments, the slots are loaded in alignment with a burst boundary. The burst boundary can depend upon the slot size, the burst size, and the number of slots included in a slot bucket. The slot loading begins with the first slot (step 516). - The
cache management module 224 and/or hash search instructions 232 will determine whether the number of slots needed (e.g., as estimated or determined in step 508) is less than the total number of slots in a bucket (step 520). If this query is answered negatively, then the method proceeds by checking if the last slot has an extension (step 536). If the last slot has an extension, then the cache management module 224 and/or hash search instructions 232 get the ID of the frame from the last slot index and load that particular frame into local memory for additional searching (step 544). The cache management module 224 and/or hash search instructions 232 then update the last slot number and the last slot index (step 548). After the slot number and last slot index have been updated, the method proceeds to step 528. - Referring back to step 540, if there is no extension present, then the method proceeds to step 568 (
FIG. 5B) as will be discussed in further detail herein. - Referring back to step 520, if the slot number is not less than or equal to the last slot index, then the method proceeds by comparing the strip/row from the slot to the strip/row for the requested I/O command (step 524). Thereafter, the method proceeds with the
cache management module 224 and/or hash search instructions 232 determining if a hash match is found (step 528). If not, the method increments to the next slot (step 532) and returns to step 520. - At
step 552 and with reference to FIG. 5B, the method proceeds until a CSID match is found (step 552). Finding a matching CSID may be referred to as finding a hash hit. At this point, the required operation is performed in connection with executing the I/O command (step 556). In some embodiments, the local hash value may be stored in memory of the controller 108 or in the controller cache 140 (step 560). Thereafter, the method ends (step 564). - Referring back to the inquiry of step 540 (
FIG. 5A) if there is no extension present, then the method proceeds with the cache management module 224 and/or hash search instructions 232 performing the required operation associated with a hash miss (step 568). The local hash can then be stored in memory if needed (step 560) and then the method ends (step 564). - With reference now to
FIG. 6, a method of approximating a number of rows needed to accommodate an I/O operation will be described in accordance with at least some embodiments of the present disclosure. While for R0/R1 RAID architectures the number of hash slots to be loaded is equal to the number of strips spanned by the I/O command, for R5 the number of hash slots needed for an I/O command corresponds to the number of rows spanned by the I/O command. Finding the number of rows often requires a divider function. Typically, in hardware accelerated solutions, the division operation is very expensive, and it is preferable not to add division logic to the hardware if it can be avoided. Accordingly, for an R5 architecture it may be desirable to provide a method for approximating the number of rows needed to accommodate an I/O operation without requiring a division operation. - Embodiments of the present disclosure contemplate such an approximation method. The method begins by determining the number of strips of the first row (NSFR) (step 604). This value of NSFR may be set equal to the row data size minus the Log Arm. The Log Arm is the Logical Arm in a RAID volume. For a
RAID 5 volume of 3 drives, there will be two data Logical Arms, Log Arm 0 and Log Arm 1, and one parity arm. Data is striped across these logical arms. - An I/O command contains a Row Number, Logical Arm Number, and Offset in Arm to represent the start LBA for a
RAID 5 volume: RowNumber = (startLba/stripSize)/numberOfDataArms, Log Arm = (startLba/stripSize) % numberOfDataArms, and OffsetInArm = startLba % stripSize. The method proceeds by determining whether the NSFR is greater than the number of strips spanned by the I/O command (step 608). If the query is answered positively, then the NSFR is adjusted to equal the number of strips spanned by the I/O operation (step 612). If, however, the query of step 608 is answered negatively, then the method proceeds to determine the row data size, the number of rows, the number of full rows, and the number of strips spanned in the last row (step 616). The method further continues by determining whether the number of strips spanned in the last row is equal to zero (step 620). If this query is answered affirmatively, then the method proceeds to set the number of rows equal to the determined number of rows plus one (step 624). However, if the query of step 620 is answered negatively, then the method proceeds by approximating the number of rows spanned by the I/O operation (step 628). In some embodiments, the nearest power of two value (NP2) of the row data size of the RAID volume is used to approximate the number of rows spanned by the I/O operation. - Specific details were given in the description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.
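For concreteness, the RAID-5 address decomposition and an exact row count discussed in connection with FIG. 6 can be sketched with ordinary division. This is a hedged illustration (names are assumptions): the patent's point is that hardware avoids the divides, approximating with the nearest power of two of the row data size, whereas Python's // and % are used here for clarity.

```python
def raid5_locate(start_lba: int, strip_size: int, num_data_arms: int):
    """Decompose a start LBA into (row number, logical arm, offset in arm)."""
    strip_number = start_lba // strip_size
    row_number = strip_number // num_data_arms
    log_arm = strip_number % num_data_arms
    offset_in_arm = start_lba % strip_size
    return row_number, log_arm, offset_in_arm

def rows_spanned(log_arm: int, num_strips: int, num_data_arms: int) -> int:
    """Exact number of rows touched by an I/O that starts on `log_arm`
    and spans `num_strips` strips (NSFR = strips left in the first row)."""
    nsfr = num_data_arms - log_arm
    if nsfr >= num_strips:
        return 1
    full_rows, leftover = divmod(num_strips - nsfr, num_data_arms)
    return 1 + full_rows + (1 if leftover else 0)
```

When strip_size and num_data_arms are powers of two, every divide and modulo above reduces to a shift or mask, which is the shape a hardware implementation would prefer.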
- While illustrative embodiments of the disclosure have been described in detail herein, it is to be understood that the inventive concepts may be otherwise variously embodied and employed, and that the appended claims are intended to be construed to include such variations, except as limited by the prior art.
Claims (20)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/335,025 US20180113810A1 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient hashing optimized for hardware accelerated caching |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201662410752P | 2016-10-20 | 2016-10-20 | |
| US15/335,025 US20180113810A1 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient hashing optimized for hardware accelerated caching |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20180113810A1 true US20180113810A1 (en) | 2018-04-26 |
Family
ID=61969606
Family Applications (5)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/335,037 Active US10223009B2 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient cache buffering supporting variable stripe sizes to enable hardware acceleration |
| US15/335,014 Abandoned US20180113639A1 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient variable length memory frame allocation |
| US15/335,039 Active 2036-11-25 US10078460B2 (en) | 2016-10-20 | 2016-10-26 | Memory controller utilizing scatter gather list techniques |
| US15/335,025 Abandoned US20180113810A1 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient hashing optimized for hardware accelerated caching |
| US15/335,030 Active 2036-11-29 US10108359B2 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient cache buffering in a system having parity arms to enable hardware acceleration |
Family Applications Before (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/335,037 Active US10223009B2 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient cache buffering supporting variable stripe sizes to enable hardware acceleration |
| US15/335,014 Abandoned US20180113639A1 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient variable length memory frame allocation |
| US15/335,039 Active 2036-11-25 US10078460B2 (en) | 2016-10-20 | 2016-10-26 | Memory controller utilizing scatter gather list techniques |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/335,030 Active 2036-11-29 US10108359B2 (en) | 2016-10-20 | 2016-10-26 | Method and system for efficient cache buffering in a system having parity arms to enable hardware acceleration |
Country Status (1)
| Country | Link |
|---|---|
| US (5) | US10223009B2 (en) |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10250473B2 (en) * | 2016-11-29 | 2019-04-02 | Red Hat Israel, Ltd. | Recovery from a networking backend disconnect |
| US10642821B2 (en) * | 2017-03-17 | 2020-05-05 | Apple Inc. | Elastic data storage system |
| US10282116B2 (en) * | 2017-07-19 | 2019-05-07 | Avago Technologies International Sales Pte. Limited | Method and system for hardware accelerated cache flush |
| US20190087111A1 (en) * | 2017-09-15 | 2019-03-21 | Seagate Technology Llc | Common logical block addressing translation layer for a storage array |
| US10496297B2 (en) * | 2017-11-21 | 2019-12-03 | Micron Technology, Inc. | Data categorization based on invalidation velocities |
| KR102795964B1 (en) * | 2018-11-08 | 2025-04-16 | 삼성전자주식회사 | Storage device, operating method of storage device and operating method of host controlling storage device |
| US11061676B2 (en) | 2019-04-24 | 2021-07-13 | International Business Machines Corporation | Scatter gather using key-value store |
| US12366961B2 (en) * | 2023-10-30 | 2025-07-22 | Avago Technologies International Sales Pte. Limited | Watermarked IO diverts for virtual disks |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130032615A1 (en) * | 2011-08-04 | 2013-02-07 | Chung Julian Jaeyoon | T-Shirts Hanger |
| US20150213805A1 (en) * | 2014-01-30 | 2015-07-30 | Qualcomm Incorporated | Indicating frame parameter reusability for coding vectors |
Family Cites Families (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5931920A (en) | 1997-08-05 | 1999-08-03 | Adaptec, Inc. | Command interpreter system in an I/O controller |
| US6175900B1 (en) | 1998-02-09 | 2001-01-16 | Microsoft Corporation | Hierarchical bitmap-based memory manager |
| KR100528967B1 (en) | 2002-12-18 | 2005-11-15 | 한국전자통신연구원 | Apparatus and method for controlling memory allocation for variable sized packets |
| EP1619584A1 (en) | 2004-02-13 | 2006-01-25 | Jaluna SA | Memory allocation |
| US7730239B2 (en) | 2006-06-23 | 2010-06-01 | Intel Corporation | Data buffer management in a resource limited environment |
| US8266116B2 (en) | 2007-03-12 | 2012-09-11 | Broadcom Corporation | Method and apparatus for dual-hashing tables |
| US9280609B2 (en) | 2009-09-08 | 2016-03-08 | Brocade Communications Systems, Inc. | Exact match lookup scheme |
| US9134909B2 (en) | 2011-08-30 | 2015-09-15 | International Business Machines Corporation | Multiple I/O request processing in a storage system |
| US8938603B2 (en) | 2012-05-31 | 2015-01-20 | Samsung Electronics Co., Ltd. | Cache system optimized for cache miss detection |
-
2016
- 2016-10-26 US US15/335,037 patent/US10223009B2/en active Active
- 2016-10-26 US US15/335,014 patent/US20180113639A1/en not_active Abandoned
- 2016-10-26 US US15/335,039 patent/US10078460B2/en active Active
- 2016-10-26 US US15/335,025 patent/US20180113810A1/en not_active Abandoned
- 2016-10-26 US US15/335,030 patent/US10108359B2/en active Active
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10970205B2 (en) * | 2018-05-31 | 2021-04-06 | Micron Technology, Inc. | Logical-to-physical data structures for tracking logical block addresses indicative of a collision |
| US11556466B2 (en) | 2018-05-31 | 2023-01-17 | Micron Technology, Inc. | Logical-to-physical data structures |
| US10789176B2 (en) * | 2018-08-09 | 2020-09-29 | Intel Corporation | Technologies for a least recently used cache replacement policy using vector instructions |
| US11797539B2 (en) * | 2019-09-12 | 2023-10-24 | Oracle International Corporation | Accelerated building and probing of hash tables using symmetric vector processing |
| US11106585B2 (en) * | 2019-10-31 | 2021-08-31 | EMC IP Holding Company, LLC | System and method for deduplication aware read cache in a log structured storage array |
| US12174788B2 (en) * | 2021-11-25 | 2024-12-24 | Gluesys Co., Ltd. | Data input/output method using storage node-based key-value store |
| US20250328473A1 (en) * | 2024-04-19 | 2025-10-23 | Hitachi Vantara, Ltd. | Storage system and storage control method |
| US20250335362A1 (en) * | 2024-04-24 | 2025-10-30 | Sk Hynix Nand Product Solutions Corp. | Systems, methods, and media for providing append-only caches |
Also Published As
| Publication number | Publication date |
|---|---|
| US20180113635A1 (en) | 2018-04-26 |
| US10078460B2 (en) | 2018-09-18 |
| US20180113634A1 (en) | 2018-04-26 |
| US20180113639A1 (en) | 2018-04-26 |
| US10223009B2 (en) | 2019-03-05 |
| US10108359B2 (en) | 2018-10-23 |
| US20180113633A1 (en) | 2018-04-26 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20180113810A1 (en) | Method and system for efficient hashing optimized for hardware accelerated caching | |
| US9690493B2 (en) | Two-level system main memory | |
| US20180095662A1 (en) | Parallel segment writer | |
| US11237743B2 (en) | Sub-block deduplication using sector hashing | |
| US20150349806A1 (en) | Technique for adaptively strengthing ecc for flash cache | |
| US11803527B2 (en) | Techniques for efficient data deduplication | |
| WO2017020668A1 (en) | Physical disk sharing method and apparatus | |
| US10282116B2 (en) | Method and system for hardware accelerated cache flush | |
| US20190243779A1 (en) | Method and system for operating nand flash physical space to extend memory capacity | |
| US11288204B2 (en) | Logical and physical address field size reduction by alignment-constrained writing technique | |
| US10282301B2 (en) | Method and system for hardware accelerated read-ahead caching | |
| US20200057576A1 (en) | Method and system for input/output processing for write through to enable hardware acceleration | |
| US10528438B2 (en) | Method and system for handling bad blocks in a hardware accelerated caching solution | |
| WO2022119595A1 (en) | Techniques for estimating deduplication between storage volumes | |
| US10649906B2 (en) | Method and system for hardware accelerated row lock for a write back volume | |
| US12038852B2 (en) | Partial logical-to-physical (L2P) address translation table for multiple namespaces | |
| US20250245149A1 (en) | Generating a logical to physical data structure for a solid state drive using sectors of different sizes | |
| EP4155934A1 (en) | Systems and methods for workload distribution across processing units | |
| WO2025160316A1 (en) | Generating a logical to physical data structure for a solid state drive using sectors of different sizes | |
| CN120832091A (en) | System and method for memory block allocation |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AS | Assignment |
Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SIMIONESCU, HORIA;HOGLUND, TIMOTHY;VEERLA, SRIDHAR RAO;AND OTHERS;SIGNING DATES FROM 20161017 TO 20161025;REEL/FRAME:040148/0355 |
|
| AS | Assignment |
Owner name: AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE. LIMITED, SINGAPORE Free format text: MERGER;ASSIGNOR:AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD.;REEL/FRAME:047231/0369 Effective date: 20180509 Owner name: AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE. LIMITE Free format text: MERGER;ASSIGNOR:AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD.;REEL/FRAME:047231/0369 Effective date: 20180509 |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
| AS | Assignment |
Owner name: AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE. LIMITE Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE EXECUTION DATE OF THE MERGER AND APPLICATION NOS. 13/237,550 AND 16/103,107 FROM THE MERGER PREVIOUSLY RECORDED ON REEL 047231 FRAME 0369. ASSIGNOR(S) HEREBY CONFIRMS THE MERGER;ASSIGNOR:AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD.;REEL/FRAME:048549/0113 Effective date: 20180905 Owner name: AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE. LIMITED, SINGAPORE Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE EXECUTION DATE OF THE MERGER AND APPLICATION NOS. 13/237,550 AND 16/103,107 FROM THE MERGER PREVIOUSLY RECORDED ON REEL 047231 FRAME 0369. ASSIGNOR(S) HEREBY CONFIRMS THE MERGER;ASSIGNOR:AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD.;REEL/FRAME:048549/0113 Effective date: 20180905 |
|
| STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |