WO2024182288A1 - Cache-coherent memory extension with CPU stall avoidance - Google Patents
Cache-coherent memory extension with CPU stall avoidance
- Publication number
- WO2024182288A1 (PCT/US2024/017276)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- memory
- logic
- data
- request
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0866—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
- G06F12/0868—Data transfer between cache memory and other subsystems, e.g. storage devices or host systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/12—Replacement control
- G06F12/121—Replacement control using replacement algorithms
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/10—Program control for peripheral devices
- G06F13/12—Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor
- G06F13/124—Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor where hardware is a sequential transfer control unit, e.g. microprocessor, peripheral processor or state-machine
- G06F13/128—Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor where hardware is a sequential transfer control unit, e.g. microprocessor, peripheral processor or state-machine for dedicated transfers to a network
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1668—Details of memory controller
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/70—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
- G06F21/78—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0604—Improving or facilitating administration, e.g. storage management
- G06F3/0605—Improving or facilitating administration, e.g. storage management by facilitating the interaction with a user or administrator
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0647—Migration mechanisms
- G06F3/0649—Lifecycle management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1024—Latency reduction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/15—Use in a specific computing environment
- G06F2212/154—Networked environment
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/31—Providing disk cache in a specific location of a storage system
- G06F2212/313—In storage device
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/31—Providing disk cache in a specific location of a storage system
- G06F2212/314—In storage network, e.g. network attached cache
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/50—Control mechanisms for virtual memory, cache or TLB
- G06F2212/502—Control mechanisms for virtual memory, cache or TLB using adaptive policy
Definitions
- This application relates to memory and, in particular, to memory extensions.
- computing systems may encounter a scenario where memory usage nearly equals, equals, or exceeds available local primary memory.
- Such computing systems suffer from a variety of drawbacks, limitations, and disadvantages. Accordingly, there is a need for inventive systems, methods, components, and apparatuses described herein.
- FIG. 1 illustrates a hardware diagram of an example memory extension system
- FIG. 2 illustrates a hardware diagram of an example memory extension system using an external memory extension module
- FIG. 3A illustrates an example flowchart for a request for reclaim candidates
- FIG. 3B illustrates an example flowchart for starting to load data into memory
- FIG. 4A illustrates an example flowchart for allocating memory
- FIG. 4B illustrates an example flowchart for freeing memory
- FIG. 5 illustrates an example flowchart for a memory access
- FIG. 6 illustrates an example flowchart for page migration
- FIG. 7 illustrates a memory architecture diagram of an example system providing multiple tiers of memory
- FIG. 8 illustrates an example categorization of latency sources for computing systems.
DETAILED DESCRIPTION
- the present disclosure provides a technical solution to solve a technical problem of providing scalable cache-coherent primary memory to a computing system using a variety of memory media while minimizing CPU stalling costs.
- FIG. 1 illustrates a hardware diagram of an example memory extension system.
- the memory extension system may include a client 100.
- the memory extension system may include more, fewer, or different elements.
- the memory extension system may include multiple clients.
- the memory extension system may include one or more memory appliances and/or management servers, such as described in U.S. Non-provisional patent application Ser. No. 14/530,908, filed November 3, 2014, and U.S. Non-provisional patent application Ser. No. 15/424,395, filed February 3, 2017, each of which is hereby incorporated by reference.
- the memory extension system may include just the client.
- the client 100 may be any machine or a device that uses memory as described herein.
- the client 100 may be a server, a device, an embedded system, a circuit, a chipset, an integrated circuit, a field programmable gate array (FPGA), an application-specific integrated circuit, a virtual machine, a virtualization instance, a container, a jail, a zone, an operating system, a kernel, a device driver, a device firmware, a hypervisor service, a cloud computing interface, and/or any other hardware, software, and/or firmware entity which may perform the same functions as described.
- the client 100 may include a memory 110, a memory controller 120, a processor 130, and/or a memory extension module 140.
- the client 100 may include more, fewer, or different components.
- the client 100 may include a storage controller 150, a backing store 160, multiple storage controllers, multiple backing stores, multiple memories, multiple memory controllers, multiple processors, multiple memory extension modules and/or any combination thereof.
- the client 100 may just include a process executed by the processor 130.
- the storage controller 150 of the client 100 may include a component that facilitates storage operations to be performed on the backing store 160.
- a storage operation may include reading from or writing to locations within the backing store 160.
- the storage controller 150 may include a hardware component. Alternatively or in addition, the storage controller 150 may include a software component.
- the backing store 160 may include an area of storage comprising one or more persistent media, including but not limited to flash memory, phase change memory, 3D XPoint memory, Memristors, EEPROM, magnetic disk, tape, or other media.
- the media in the backing store 160 may potentially be slower than the memory 110.
- the storage controller 150 and/or backing store 160 of the client 100 may be internal to the client 100, a physically discrete device external to the client 100 that is coupled to the client 100, included in a second client or in a device different from the client 100, part of a server, part of a backup device, part of a storage device on a Storage Area Network, and/or part of some other externally attached persistent storage.
- the memory 110 may be any memory or combination of memories, such as a solid state memory, a random access memory (RAM), a dynamic random access memory (DRAM), a static random access memory (SRAM), a flash memory, a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a phase change memory, 3D XPoint memory, a memristor memory, any type of memory configured in an address space addressable by the processor, or any combination thereof.
- the memory 110 may be volatile or non-volatile, or a combination of both.
- the memory 110 may be a solid state memory.
- Solid state memory may include a device, or a combination of devices, that stores data, is constructed primarily from electrical conductors, semiconductors and insulators, and is considered not to have any moving mechanical parts.
- Solid state memory may be byte-addressable, word-addressable or block-addressable. For example, most dynamic RAM and some flash RAM may be byte-addressable or word-addressable. Flash RAM and other persistent types of RAM may be block-addressable.
- Solid state memory may be designed to connect to a memory controller, such as the memory controller 120 in the client 100, via an interconnect bus, such as the interconnect 170 in the client 100.
- Solid state memory may include random access memory that permits stored data to be read and/or written in any order (for example, at random).
- random refers to the fact that any piece of data may be returned and/or written within a constant time period, regardless of the physical location of the data and regardless of whether the data is related to a previously read or written piece of data.
- storage devices such as magnetic or optical discs rely on the physical movement of the recording medium or a read/write head so that retrieval time varies based on the physical location of the next item read and write time varies based on the physical location of the next item written.
- solid state memory examples include, but are not limited to: DRAM, SRAM, NAND flash RAM, NOR flash RAM, V-NAND, Z-NAND, phase change memory (PRAM), 3D XPoint memory, EEPROM, FeRAM, MRAM, CBRAM, PRAM, SONOS, RRAM, Racetrack memory, NRAM, Millipede, T-RAM, Z-Ram, TTRAM, and/or any other randomly-accessible data storage medium known now or later discovered.
- solid state storage devices are systems or devices that package solid state memory with a specialized storage controller through which the packaged solid state memory may be accessed using a hardware interconnect that conforms to a standardized storage hardware interface.
- solid state storage devices include, but are not limited to: flash memory drives that include Serial Advanced Technology Attachment (SATA) or Small Computer System Interface (SCSI) interfaces; Flash or DRAM drives that include SCSI over Fibre Channel interfaces; DRAM, Flash, and/or 3D XPoint memory drives that include NVMe interfaces; DRAM drives that include SATA or SCSI interfaces, USB (universal serial bus) flash drives with USB interfaces, and/or any other combination of solid state memory and storage controller known now or later discovered.
- the memory 110 of the client 100 may include a client logic 112.
- the memory 110 of the client 100 may include more, fewer, or different components.
- the memory 110 of the client 100 may include an application logic 114.
- the processor 130 may be a general processor, a central processing unit (CPU), a Graphics Processing Unit (GPU), a server, a microcontroller, an application specific integrated circuit (ASIC), a digital signal processor, a field programmable gate array (FPGA), a digital circuit, an analog circuit, a logic, and/or any combination of these and/or other components.
- the processor 130 may include one or more devices operable to execute computer executable instructions or computer code embodied in the memory 110 or in other memory to perform features of the memory extension system. For example, the processor 130 may execute computer executable instructions that are included in the client logic 112 and/or the application logic 114.
- the processor 130, the memory controller 120, and the one or more memory extension modules 140 may each be in communication with each other. Each one of the processor 130, the memory controller 120, and the one or more memory extension modules 140 may also be in communication with additional components, such as the storage controller 150, and the backing store 160.
- the communication between the components of the client 100 may be over an interconnect, a bus, a point-to-point connection, a switched fabric, a network, any other type of interconnect, or any combination of interconnects 170.
- the communication may use any type of topology, including but not limited to a star, a mesh, a hypercube, a ring, a torus, a fat tree, a dragonfly, or any other type of topology known now or later discovered.
- any of the processor 130, the memory 110, the memory controller 120, and/or the memory extension module(s) 140 may be logically or physically combined with each other or with other components, such as with the storage controller 150, and/or the backing store 160.
- the memory controller 120 may include a hardware component that translates memory addresses specified by the processor 130 into the appropriate signaling to access corresponding locations in the memory 110.
- the processor 130 may specify the address on the interconnect 170.
- the processor 130, the interconnect 170, and the memory 110 may be directly or indirectly coupled to a common circuit board, such as a motherboard.
- the interconnect 170 may include an address bus that is used to specify a physical address, where the address bus includes a series of lines connecting two or more components.
- the memory controller 120 may, for example, also perform background processing tasks, such as periodically refreshing the contents of the memory 110.
- the memory controller 120 may be included in the processor 130.
- the application logic 114 and/or the client logic 112 may include a user application, an operating system, a kernel, a device driver, a device firmware, a virtual machine, a hypervisor, a container, a jail, a zone, a cloud computing interface, a circuit, a logical operating system partition, or any other logic that uses the services provided by the client logic 112.
- a container, a jail, and a zone may be technologies that provide userspace isolation or compartmentalization. Any process in the container, the jail, or the zone may communicate only with processes that are in the same container, the same jail, or the same zone.
- the application logic 114 and/or the client logic 112 may be embedded in a chipset, an FPGA, an ASIC, a processor, and/or any other hardware device.
- the memory extension module 140 may be a logical and/or physical collection of components that perform the functions as described herein.
- the memory extension module 140 may be a physical hardware device and/or component, distinct from other components of the client 100.
- the memory extension module 140 may be a CPU-socket module, an HTX (HyperTransport expansion) device, a Quick Path Interconnect (QPI) device, an Ultra Path Interconnect (UPI) device, an Infinity Fabric device, a PCI device, a PCI-X device, a PCI Express device, a CXL module, a Gen-Z module, a memory module, a Single In-line Pin Package (SIPP), a Single In-line Memory Module (SIMM), a Dual In-line Memory Module (DIMM), a Rambus In-line Memory Module (RIMM), a Small Outline DIMM (SO-DIMM), a Small Outline RIMM (SO-RIMM), a Compression Attached Memory Module (CAMM), and/or any other similar module or device.
- the memory extension module 140 may be and/or may include a server, a device, an embedded system, a circuit, a chipset, an integrated circuit, a field programmable gate array (FPGA), an application-specific integrated circuit, a virtual machine, a virtualization instance, a container, a jail, a zone, an operating system, a kernel, a device driver, a device firmware, a hypervisor service, a cloud computing interface, and/or any other hardware, software, and/or firmware entity which may perform the same functions as described.
- the memory extension module 140 may include a memory extension logic 142 and/or a memory 144.
- the memory extension module 140 may include more, fewer, or different components.
- the memory extension module 140 may include multiple memory extension logics, multiple memories, one or more backing stores 146, one or more communication interfaces 148, and/or any combination thereof.
- the memory extension module 140 may just include the memory extension logic 142.
- the memory 144 of the memory extension module 140 may be any memory or combination of memories, such as a solid state memory, a random access memory (RAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), an asynchronous dynamic random access memory (ADRAM), a double data rate (DDR) RAM, a fast page mode (FPM) memory, an extended data output (EDO) DRAM, a Rambus DRAM (RDRAM), a Cached DRAM (CDRAM), a static random access memory (SRAM), a flash memory, a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a phase change memory, 3D XPoint memory, a memristor memory, any type of memory configured in an address space addressable by the processor 130 and/or the memory extension logic 142, or any combination of memories known now or later discovered.
- the memory 144 may be volatile or non-volatile, or a combination of both. In some examples, the memory 144 may be a solid state memory.
- the memory 144 of the memory extension module 140 may be of the same or different types of memory/memories as the memory 110 of the client 100. In some examples, the memory 144 of the memory extension module 140 may be the same as and/or may be combined with the memory 110 of the client 100.
- the backing store 146 of the memory extension module 140 may include an area of storage comprising one or more persistent media, including but not limited to flash memory, phase change memory, 3D XPoint memory, Memristors, EEPROM, magnetic disk, tape, or other media.
- the media in the backing store 146 may potentially be slower than the memory 144 of the memory extension module 140 and/or the memory 110 of the client 100.
- the backing store 146 and/or the memory extension module may include a storage controller, similar to the storage controller 150 of the client 100.
- the storage controller and/or backing store 146 of the memory extension module 140 may be internal to the memory extension module 140, internal to the client 100, a physically discrete device external to the memory extension module 140 and/or client 100 that is coupled to the memory extension module 140 and/or client 100, included in a second memory extension module and/or client or in a device different from the memory extension module 140 and/or client 100, part of a server, part of a backup device, part of a storage device on a Storage Area Network, and/or part of some other externally attached persistent storage.
- the backing store may be accessed using the communication interface(s) 148.
- the communication interface(s) 148 may provide client-side memory access to the memory of a memory appliance, to regions, and/or to portions of the regions in the memory appliance.
- One or more interconnects or networks may transport data between the communication interface(s) 148 of the memory extension module 140 and one or more communication interfaces of other devices, such as a memory appliance.
- the communication interface(s) may be any one or more network interface controller(s), host controller adaptor(s), host fabric interface(s), memory fabric interface(s), processor interface(s), and/or any other interface(s) known now and/or later discovered that may be capable of operating as described herein.
- the communication interface(s) and/or client-side memory access may be as described in U.S. Non-provisional patent application Ser. No. 14/530,908, filed November 3, 2014, and U.S. Non-provisional patent application Ser. No. 15/424,395, filed February 3, 2017, each of which is hereby incorporated by reference.
- a client-side memory access may bypass a processor, such as a CPU (Central Processing Unit), at the client 100 and/or may otherwise facilitate the client 100, the memory extension module 140, and/or the memory extension logic 142 accessing the memory on the memory appliance without waiting for an action by the processor included in the client 100, in the memory appliance, or both.
- the client-side memory access may be based on the Remote Direct Memory Access (RDMA) protocol.
- the RDMA protocol may be carried over an InfiniBand interconnect, an iWARP interconnect, an RDMA over Converged Ethernet (RoCE) interconnect, an Aries interconnect, a Slingshot interconnect, and/or any other interconnect and/or combination of interconnects known now or later discovered.
- the client-side memory access may be based on any other protocol and/or interconnect that may be used for accessing memory.
- a protocol that may be used for accessing memory may be a CPU protocol/interconnect, such as HyperTransport, Quick Path Interconnect (QPI), Ultra Path Interconnect (UPI), and/or Infinity Fabric.
- a protocol that may be used for accessing memory may be a peripheral protocol/interconnect, such as Peripheral Component Interconnect (PCI), PCI Express, PCI-X, ISA, Gen-Z, CXL, and/or any other protocol/interconnect used to interface with peripherals and/or access memory.
- the communication interfaces may provide reliable delivery of messages and/or reliable execution of memory access operations, such as any memory access operation carried out when performing the client-side memory access. Alternatively, or in addition, delivery of messages and/or execution of memory access operations may be unreliable, such as when data is transported between the communication interfaces using the User Datagram Protocol (UDP).
- the client 100, the memory extension module 140, and/or the memory extension logic 142 may read, write, and/or perform other operations on the memory of the memory appliance, to the regions, and/or to portions of the regions using client-side memory access. In providing client-side memory access, the client 100, the memory extension module 140, and/or the memory extension logic 142 may transmit requests to perform memory access operations to the memory appliance. In response, the memory appliance may perform the memory access operations.
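- By way of illustration, the following minimal sketch shows how a client-side memory access such as the RDMA read described above might be issued with the libibverbs API. The queue pair, registered memory region, remote address, and rkey are assumed to have been established during an earlier connection phase that is omitted; the helper and its names are illustrative, not part of the disclosure.
```c
#include <stdint.h>
#include <infiniband/verbs.h>

/* Sketch: issue a one-sided RDMA READ so that no CPU on the memory appliance
 * has to act on the request. qp, mr, remote_addr, and rkey are assumed to have
 * been set up separately (connection management is omitted). */
static int rdma_read_region(struct ibv_qp *qp, struct ibv_mr *mr,
                            void *local_buf, uint32_t len,
                            uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)local_buf,   /* local destination buffer */
        .length = len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr = {
        .wr_id      = 1,
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_RDMA_READ,   /* one-sided read: remote CPU is bypassed */
        .send_flags = IBV_SEND_SIGNALED,
    };
    wr.wr.rdma.remote_addr = remote_addr; /* address within the region/allocation */
    wr.wr.rdma.rkey        = rkey;

    struct ibv_send_wr *bad_wr = NULL;
    if (ibv_post_send(qp, &wr, &bad_wr))
        return -1;

    /* Busy-poll the send completion queue until the read completes. */
    struct ibv_wc wc;
    int n;
    while ((n = ibv_poll_cq(qp->send_cq, 1, &wc)) == 0)
        ;
    return (n == 1 && wc.status == IBV_WC_SUCCESS) ? 0 : -1;
}
```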
- the memory appliance may observe or otherwise identify the memory access operations.
- the memory appliance may, for example, copy the data of the region to one or more backing stores independently of performing the memory access operations on the memory.
- a backing store may include one or more persistent non-volatile storage media, such as flash memory, phase change memory, 3D XPoint memory, memristors, EEPROM, magnetic disk, tape, or some other media.
- the memory of the memory appliance and/or the backing store (if included) may be subdivided into regions.
- the memory extension logic 142 may be a logical and/or physical collection of components that perform the functions as described herein.
- the memory extension logic 142 may include one or more hardware, software, and/or logic entities that facilitate performing the functions. The entities may be combined and/or distributed in a number of ways to suit a particular embodiment.
- the memory extension logic 142 may include a processor, a memory, such as the memory 144 of the memory extension module 140, another memory, a chipset, an FPGA, an ASIC, and/or any other hardware device.
- the memory extension module 140, the memory extension logic 142, and/or the memory 144 of the memory extension module 140 may be in communication with the processor 130 and/or with other components of the client 100 via an interconnect 180.
- the communication between the components of the client 100 and/or of the memory extension module 140 may be over an interconnect, a bus, a point-to-point connection, a switched fabric, a network, any other type of interconnect, and/or any combination of interconnects 180.
- the communication may use any type of topology, including but not limited to a star, a mesh, a hypercube, a ring, a torus, a fat tree, a dragonfly, and/or any other type of topology known now or later discovered.
- any of the processor 130, the memory extension module 140, the memory extension logic 142, and/or other components of the client 100 and/or memory extension module(s) 140 may be logically or physically combined with each other or with other components, such as with the memory controller 120.
- the interconnect 180 may be logically or physically combined with the interconnect 170 of the client 100 and/or with any other interconnect(s) of the memory extension module 140.
- the interconnect 180 may have some of the same properties as the interconnect 170 of the client 100.
- the interconnect 180 may be the same interconnect as the interconnect 170 of the client 100.
- the interconnect 180 may be, may implement, may provide, and/or may include a physically-addressable interface.
- a physically-addressable interface may be an interface which provides access to the underlying data using physical addresses, such as the physical addresses used on an address bus, a CPU interconnect, a memory interconnect, and/or on a peripheral interconnect.
- the interconnect 180 may be and/or may include a virtually-addressable interface.
- the interconnect 180 may be addressed using IO-virtual addresses, virtual machine physical addresses, and/or any other type(s) of virtual addresses known now and/or later discovered.
- the virtual addresses, such as the IO-virtual addresses, may be translated to physical addresses by one or more address translation logics.
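- As an illustration of the translation performed by an address translation logic, the following sketch maps an IO-virtual address to a physical address through a hypothetical single-level translation table; the table layout and all names are assumptions made for this example.
```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12                    /* assume a 4 KiB translation granule */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Hypothetical single-level translation table entry. */
struct xlate_entry {
    uint64_t phys_base;                  /* physical page frame base address */
    bool     valid;
};

/* Translate an IO-virtual address into a physical address.
 * Returns false if the entry is not valid (e.g., the address is not mapped). */
static bool iova_to_phys(const struct xlate_entry *table, size_t table_len,
                         uint64_t iova, uint64_t *phys_out)
{
    uint64_t page = iova >> PAGE_SHIFT;
    uint64_t off  = iova & (PAGE_SIZE - 1);

    if (page >= table_len || !table[page].valid)
        return false;                    /* would raise a translation fault */

    *phys_out = table[page].phys_base | off;
    return true;
}
```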
- interconnect 180 may be and/or may include a peripheral interconnect such as Peripheral Component Interconnect (PCI), PCI Express, PCI-X, ISA, Gen-Z, CXL, and/or any other protocol/interconnect used to interface with peripherals and/or to access memory.
- the interconnect 180 may be and/or may include a processor interconnect, such as HyperTransport, Quick Path Interconnect (QPI), Ultra Path Interconnect (UPI), Infinity Fabric, and/or any other interconnect used to communicate between processors, to maintain cache coherency, and/or to access memory.
- a portion of the memory extension logic 142 may be included in the memory 144 of the memory extension module 140, in another memory, such as the memory 110 of the client 100, in the backing store 146 of the memory extension module 140, in the backing store 160 of the client 100, and/or in any other one or more computer-readable storage media.
- the memory extension logic 142 includes an FPGA
- configuration for the FPGA that may define and/or control the logic to be implemented with the FPGA may be included in one or more of these locations.
- the memory extension logic 142 includes a processor, computer executable instructions and/or computer code, executable by the processor, may be included in one or more of these locations.
- portions of the memory extension logic 142, the client logic 112, and/or the application logic 114 may be included in different memories, backing stores, and/or media at different times.
- all of or a portion of the memory extension logic 142 may be initially included in the backing store 160 of the client 100 and/or may be transferred to the memory 144 of the memory extension module 140, such as by a firmware loading logic, which may be included in the client logic 112 and/or in another logic.
- a portion of the memory extension logic 142 may be included in one or more passed functions provided to the memory extension module 140.
- Passed functions may be any logic which may be provided by a first logic as a parameter when interacting with a second logic.
- the passed function may be computer executable instructions or computer code.
- the passed function may be embodied in a computer readable storage medium, such as the memory 110 and/or the backing store 160, and/or may be transmitted via an interconnect, such as the interconnects 170, 180, for operation with another entity.
- the passed function may be transmitted via the interconnects 170, 180 from the client logic 112 or any other logic to the memory extension module, for execution with the memory extension logic 142 and/or the processor (if present) of the memory extension module 140.
- Passed functions may be used to define custom, adaptive, and/or flexible caching strategies that may be cumbersome to express in other ways.
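- The notion of a passed function can be sketched with an ordinary C function pointer: the client logic supplies a small scoring routine that the memory extension logic may call when choosing reclaim candidates. The portion_stats fields, the cache_policy_fn type, and the mext_set_policy() hook are hypothetical names used only for this sketch.
```c
#include <stdint.h>

/* Hypothetical per-portion access statistics kept by the memory extension logic. */
struct portion_stats {
    uint64_t last_access_time;   /* e.g., ticks at the most recent access */
    uint64_t access_count;
    int      dirty;              /* nonzero if the portion must be written back first */
};

/* A "passed function": given the stats for a portion, return a reclaim score.
 * Higher score means a better candidate for eviction/reclaim. */
typedef uint64_t (*cache_policy_fn)(const struct portion_stats *st, uint64_t now);

/* Example policy supplied by the client logic: prefer old, clean, rarely-used portions. */
static uint64_t lru_clean_first(const struct portion_stats *st, uint64_t now)
{
    uint64_t score = (now - st->last_access_time) / (st->access_count + 1);
    if (!st->dirty)
        score *= 2;              /* clean portions are cheaper to reclaim */
    return score;
}

/* Hypothetical registration hook exposed by the memory extension logic. */
static cache_policy_fn g_policy;
static void mext_set_policy(cache_policy_fn fn) { g_policy = fn; }

/* The memory extension logic would later call the passed function like this. */
static uint64_t score_portion(const struct portion_stats *st, uint64_t now)
{
    return g_policy ? g_policy(st, now) : 0;
}

/* Usage: the client logic passes its policy to the memory extension module. */
static void install_policy(void)
{
    mext_set_policy(lru_clean_first);
}
```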
- the interconnect 180, the memory extension logic 142, the client logic, the application logic, and/or any other logic may implement one or more data interfaces, such as the data interfaces described in U.S. Non-provisional patent application Ser. No. 14/530,908, filed November 3, 2014, which is hereby incorporated by reference.
- the data interface(s) may include an API, a block-level interface, a character-level interface, a memory-mapped interface, a memory allocation interface, a memory swapping interface, a memory caching interface, a hardware-accessible interface, a graphics processing unit (GPU) accessible interface, and/or any other interface used to access data included in the memory 144, included in the backing store 146, and/or accessible via the communication interface(s) 148.
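- As one concrete example of a memory-mapped data interface, a user-space logic might map a character device that exposes the extension memory and then use ordinary loads and stores. The device path /dev/mext0 and the mapping size are purely hypothetical.
```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical character device exposing extension memory. */
    int fd = open("/dev/mext0", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    size_t len = 1u << 20;                 /* map 1 MiB of extension memory */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* Ordinary loads and stores; cache fills and write-backs are handled
     * below this interface by the memory extension logic. */
    memset(p, 0, 4096);
    ((volatile uint64_t *)p)[0] = 0x1234;
    printf("read back: 0x%llx\n", (unsigned long long)((volatile uint64_t *)p)[0]);

    munmap(p, len);
    close(fd);
    return 0;
}
```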
- the data interface may include a hardware-accessible interface.
- the memory extension module 140, the memory extension logic 142, and/or other components may be included in the hardware client component as described therein.
- the processor 130, the memory controller 120, the memory 110, the client logic 112, and/or the application logic may be included in the hardware application component as described therein.
- the hardware-accessible interface may be and/or may include a physically-addressable interface.
- a physically-addressable interface may be an interface which provides access to the underlying data using physical addresses, such as the physical addresses used on an address bus, a CPU interconnect, a memory interconnect, and/or on a peripheral interconnect.
- the hardware-accessible interface may be and/or may include a virtually-addressable interface.
- the hardware-accessible interface may be addressed using IO-virtual addresses.
- IO-virtual addresses may be translated to physical addresses by one or more address translation logics. Examples of address translation logics include memory management units (MMUs), input-output memory management units (IO-MMUs), translation lookaside buffers, and/or any other logic capable of translating virtual addresses to physical addresses known now or later discovered.
- the hardware-accessible interface may enable a hardware application component, the client logic 112, the application logic 114, and/or another logic to access data included in the memory 144, included in the backing store 146, and/or accessible via the communication interface(s) 148.
- the hardware-accessible interface may enable the hardware application component to access data of a region.
- the hardware-accessible interface may enable the hardware application component to access data of one or more of the regions referenced by an external memory allocation.
- the hardware-accessible interface may enable the hardware application component to access data of the external memory allocation.
- the hardware application component may be a processor, a GPU, a communication interface, a direct memory access controller, an FPGA, an ASIC, a chipset, a compute module, a hardware accelerator module, a hardware logic, and/or any other physical component that accesses memory.
- the hardware application component may be included in the application logic 114 and/or in the client logic 112.
- the hardware-accessible interface may include a hardware client component.
- a hardware client component may be and/or may include a processor, a GPU, an MMU, an IO-MMU, a communication interface, such as the one or more communication interfaces 148, a direct memory access controller, an FPGA, an ASIC, a chipset, a compute module, a hardware accelerator module, a hardware logic, a memory access transaction translation logic, any other hardware component, and/or a combination of multiple hardware components.
- the hardware client component may be included in the client logic 112 and/or in the memory extension logic 142.
- the hardware client component, the hardware application component, and/or the one or more communication interfaces may be embedded in one or more chipsets.
- the hardware client component may include a memory and/or cache, such as the memory 144 of the memory extension module 140.
- the memory and/or cache of the hardware client component may be used to hold portions of the data of the backing store 146 and/or of external memory allocations and/or regions.
- the hardware client component may utilize a portion of the memory 110 of the client 100 to hold portions of the data of the backing store 146 and/or of external memory allocations and/or regions.
- the hardware client component may respond to and/or translate attempts to access virtual addresses, physical addresses, logical addresses, IO addresses, and/or any other address used to identify the location of data.
- the hardware client component may participate in a cache coherency protocol with the hardware application component.
- the hardware client component may respond to attempts of the hardware application component to access physical addresses by accessing data included in the memory 144 and/or cache of the hardware client component.
- the hardware client component may interface with a CPU interconnect and handle cache fill requests by reading data from the memory 144 and/or cache included in the hardware client component.
- the hardware client component may redirect and/or forward attempts of the hardware application component to access physical addresses to alternate physical addresses, such as the physical addresses of the portion of the memory 110 of the client 100 utilized by the hardware client component.
- the hardware client component may translate attempts of the hardware application component to access physical addresses into client-side memory access.
- the hardware client component may interface with the interconnect(s) 170, 180 and/or handle cache fill requests by reading the requested data from backing store 146 and/or from the region and/or external memory allocation, such as via the communication interface(s) 148.
- the hardware client component may handle cache flush requests by performing client-side memory access to write the requested data to the backing store 146 and/or to the region and/or external memory allocation.
- the hardware client component may handle cache invalidate requests by updating the memory 144 and/or cache of the hardware client component to indicate the non-presence of the data indicated by the cache invalidate requests.
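- A minimal sketch of how a hardware client component (or its logic) might service these coherence requests is shown below: a presence bitmap records which lines are held locally, cache fills fetch missing lines via client-side memory access, flushes write data back, and invalidates clear presence. The structure and the csma_read()/csma_write() helpers are hypothetical and assumed to be provided elsewhere.
```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define LINE_SIZE 64u

/* Hypothetical state kept by a hardware client component (or its logic). */
struct hcc_state {
    uint8_t  *lines;        /* local copies of cached lines (e.g., in the memory 144) */
    uint64_t *present;      /* presence bitmap, one bit per line */
    uint64_t  nr_lines;
};

/* Hypothetical client-side memory access helpers (e.g., RDMA read/write to the
 * region or external memory allocation); assumed to be provided elsewhere. */
int csma_read(uint64_t remote_off, void *dst, uint32_t len);
int csma_write(uint64_t remote_off, const void *src, uint32_t len);

static bool line_present(const struct hcc_state *s, uint64_t idx)
{
    return s->present[idx / 64] & (1ull << (idx % 64));
}

/* Cache fill: serve from local memory if present, otherwise fetch and remember it. */
static int handle_fill(struct hcc_state *s, uint64_t idx, void *out)
{
    if (!line_present(s, idx)) {
        if (csma_read(idx * LINE_SIZE, s->lines + idx * LINE_SIZE, LINE_SIZE))
            return -1;
        s->present[idx / 64] |= 1ull << (idx % 64);
    }
    memcpy(out, s->lines + idx * LINE_SIZE, LINE_SIZE);
    return 0;
}

/* Cache flush: write the (possibly modified) line back via client-side memory access. */
static int handle_flush(struct hcc_state *s, uint64_t idx, const void *data)
{
    memcpy(s->lines + idx * LINE_SIZE, data, LINE_SIZE);
    return csma_write(idx * LINE_SIZE, data, LINE_SIZE);
}

/* Cache invalidate: record the non-presence of the line. */
static void handle_invalidate(struct hcc_state *s, uint64_t idx)
{
    s->present[idx / 64] &= ~(1ull << (idx % 64));
}
```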
- the hardware client component may translate attempts of the hardware application component to access IO addresses into client-side memory access.
- the hardware client component may interface with a peripheral interconnect, such as PCI Express, and may respond to requests to read a portion of the IO address space by reading data from the memory 144 included in the hardware client component, by reading the portion of the memory 110 and/or cache of the client 100 utilized by the hardware client component, and/or by reading the requested data from backing store 146 and/or from the region and/or external memory allocation.
- the hardware client component may interface with a memory-fabric interconnect, such as Gen-Z and/or CXL, and may respond to memory read operations by reading data from the memory 144 included in the hardware client component, by reading the portion of the memory 110 and/or cache of the client 100 utilized by the hardware client component, and/or by reading the requested data from the backing store 146 and/or from the region and/or external memory allocation.
- the memory extension system, the memory extension module 140, the memory 144, the memory extension logic 142, and/or any other logic may expose and/or may provide one or more memory description data structures and/or performance indication(s) to the client logic 112, the application logic 114, and/or other logic(s).
- the memory description data structure(s) and/or performance indication(s) may be and/or may include ACPI System Resource Affinity Table (SRAT) information, ACPI Static Resource Affinity Table (SRAT) information, ACPI System Locality Distance Information Table (SLIT) information, ACPI Heterogeneous Memory Attribute Table (HMAT) information, ACPI Heterogeneous Memory Attributes (HMA), and/or any other data structures that may indicate one or more relative and/or absolute performance attributes of the memory 144 of the memory extension module 140, of the memory 110 of the client 100, of the backing store 146 of the memory extension module 140, of the communication interface(s) 148, of the memory extension logic 142, of the processor 130, of the interconnect(s) 170, 180, the memory appliances (if present), and/or any of one or more caches that may hold data of the memory 110, 144, of one or more architectural elements, and/or of any other one or more physical and/or logical components that may affect the performance of memory-access operations and/or that may be useful in determining which memory to use for different purposes.
- the performance attribute(s) may be any one or more attributes, characteristics, properties, and/or aspects of the corresponding memory/memories, mapping(s), memory controller(s), interconnect(s), processor(s), logic(s), communication interface(s), and/or any other component, subsystem, system, and/or architectural element related to memory access performance.
- Examples of performance attributes may include latency, bandwidth, operations per second, transfers per second, determinacy, jitter, and/or any other measurable and/or unmeasurable quantity/quality related to performance.
- the performance attributes may be absolute (for example, “100 ns”, “30 GiB/s”, and/or 50 gigatransfers/second), and/or may be relative (for example, “low jitter” relative to a reference measurement of jitter).
- the performance indication(s) may indicate one or more relative and/or absolute performance attributes and/or any other attributes of the memory extension module 140, the client 100, the interconnects 170, 180, and/or any other one or more component related to the performance of memory accesses.
- Examples of the performance indication(s) may include an indicator of any of the following (or any combination thereof): price, capacity, latency, bandwidth, operations per second, physical locality, network locality, logical locality, power draw, and/or any other one or more attributes of the memory 144 of the memory extension module 140, of the memory 110 of the client 100, of the backing store 146 of the memory extension module 140, of the communication interface(s) 148, of the memory extension logic 142, of the processor 130, of the interconnect(s) 170, 180, the memory appliances (if present), and/or any of one or more caches that may hold data of the memory 110, 144, of one or more architectural elements, and/or of any other one or more physical and/or logical components that may affect the performance of memory-access operations and/or that may be useful in determining which memory to use for different purposes.
- the performance indication(s) may include or be associated with a memory identifier.
- the memory identifier may be any identifier that identifies the memory, the architectural element containing the memory, the memory appliance containing the memory, the communication interface(s) 148 handling communication with the memory, the interconnect(s) 170, 180 connecting to the memory, the memory controller(s) 120 for the memory, and/or any other logical element or physical component having the property or properties included in the performance indication.
- the performance indication(s) may indicate one or more relative distances and/or latencies between architectural elements, such as with an ACPI System Locality Distance Information Table (SLIT).
- the performance indication(s) may include one or more indicator(s) of any of the following, or any combination thereof: read/write latency and/or bandwidth metrics between architectural elements and/or memories, such as with an ACPI System Locality Latency and Bandwidth Information Structure.
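- The kind of information such performance indications might carry can be pictured with a simple descriptor; the structure below is a hypothetical consolidation of the attributes mentioned in this description and is not an ACPI table layout.
```c
#include <stdint.h>

/* Hypothetical descriptor summarizing performance indications for one memory. */
struct memory_perf_indication {
    uint32_t memory_id;          /* identifies the memory / architectural element */
    uint32_t read_latency_ns;    /* absolute attributes ... */
    uint32_t write_latency_ns;
    uint32_t bandwidth_mib_s;
    uint32_t relative_distance;  /* ... or relative ones, e.g., a SLIT-style distance */
    uint64_t capacity_bytes;
    uint32_t cache_size_kib;     /* attributes of a memory-side cache, if any */
    uint32_t cache_line_bytes;
    uint32_t power_draw_mw;
};

/* Example use: pick the lower-latency of two memories for frequently-accessed data. */
static uint32_t pick_faster(const struct memory_perf_indication *a,
                            const struct memory_perf_indication *b)
{
    return (a->read_latency_ns <= b->read_latency_ns) ? a->memory_id : b->memory_id;
}
```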
- the memory 144 of the memory extension module 140 may include a first memory and a second memory, where the second memory of the memory extension module 140 differs from the first memory in price, capacity, latency, bandwidth, operations per second, physical locality, network locality, logical locality, interconnect, power draw, and/or any other one or more attributes, such as by being memory of a different type, media, style, package, clock rate, architecture, and/or any other relevant difference.
- the first memory and the second memory may be said to be different classes of memory.
- the performance indication(s) may indicate one or more attributes of one or more caches associated with one or more memories, such as with an ACPI Memory Side Cache Information Structure.
- the attribute(s) of the cache(s) may include the size of the cache(s), number of cache levels, cache associativity, write policy, cache-line size, latency, bandwidth, performance, and/or any other one or more attributes of the cache(s) that may affect the performance of memory-access operations and/or that may be useful in determining which memory to use for different purposes.
- the cache(s) may be and/or may include cache memories (such as with a CPU cache); the cache(s) may be and/or may include memories 110, 144 used to hold portions of one or more backing stores 146, 160; and/or the cache(s) may be and/or may include local primary memory used to hold cached portions of external primary memory.
- the performance indication(s) may be created, destroyed, modified, and/or updated during operation of the memory extension system.
- the performance indication(s) may be provided, conveyed, delivered, indicated, and/or identified via a transitory means, such as with a message, a function call, a programmatic method invocation, a programmatic signal, an electrical signal, an optical signal, a wireless signal, and/or any other mechanism used to provide, convey, deliver, indicate, and/or identify information between logics.
- the performance indication(s) may be indicated via one or more ACPI Device Configuration Objects, such as ACPI System Locality Information (_SLI) object(s), ACPI Proximity (_PXM) object(s), ACPI Heterogeneous Memory Attributes (_HMA) object(s), and/or any other data structure and/or mechanism that may describe and/or indicate the performance indication(s) and/or one or more portions thereof.
- the client logic 112, the application logic 114, and/or any other logic may operate differently based upon the performance indication(s).
- the logic may include an operating system and/or the logic may be configured to use memory with faster performance for frequently-accessed data and/or recently-accessed data and/or may be configured to use memory with slower performance for infrequently-accessed data and/or not-recently-accessed data.
- the logic may be configured to use memory with a shorter distance for frequently-accessed data and/or recently-accessed data and/or may be configured to use memory with a longer distance for infrequently-accessed data and/or not-recently-accessed data.
- the logic may be configured to migrate data from one portion of memory to another based upon whether the data is frequently accessed, infrequently accessed, recently accessed, not recently accessed, and/or matches or does not match any other one/or more access patterns.
- the logic may be configured to migrate data from one portion of memory (a source portion) to another (a destination portion) based upon whether the performance indication(s) indicate that the source portion has faster performance, slower performance, higher bandwidth, lower bandwidth, lower latency, higher latency, a larger cache, a smaller cache, larger capacity, smaller capacity, a shorter distance, a longer distance, more-durable media, less-durable media, higher power draw, lower power draw, and/or any other one or more characteristics relative to the destination portion.
- the data may be migrated from the memory 144 of the memory extension module 140 to the memory 110 of the client 100 in response to identifying that the data is accessed frequently.
- the data may be migrated by copying the data from one portion of memory to another, such as with NUMA page migration, and/or by performing operations as described herein, such as described for FIG. 6.
- the term "frequently-accessed” means accessed more frequently than a threshold frequency.
- the term “infrequently-accessed” means accessed less than a threshold frequency.
- the threshold for "frequently-accessed” may not necessarily be the same as the threshold for "infrequently-accessed”.
- the term “recently-accessed” means accessed within a predetermined timeframe.
- Memory with slower performance means memory that may be read and/or written at a rate that is slower than another memory available to be allocated, such as memory with higher latency, lower bandwidth, lower operations per second, and/or less performant metrics for any other performance attributes.
- Memory with faster performance means memory that may be read and/or written at a rate that is faster than other memory available to be allocated, such as memory with lower latency, higher bandwidth, higher operations per second, and/or more performant metrics for any other performance attributes.
- the phrase "shorter distance” in this context means closer to the client device in the context of physical locality, network locality, and/or logical locality than another memory available to be allocated.
- the phrase "longer distance” in this context means further from the client device in the context of physical locality, network locality, and/or logical locality than another memory available to be allocated.
- more-durable in this context means media which is more likely to retain its contents over time, which is more likely to retain its contents after repeated reading and/or writing over time, which is more likely than a threshold probability to retain its data after more repeated read and/or write operations than for other media, and/or which has any other properties and/or attributes associated with improved media durability.
- less-durable in this context means media which is less likely to retain its contents over time, which is less likely to retain its contents after repeated reading and/or writing over time, which is more likely than a threshold probability to retain its data after fewer repeated read and/or write operations than for other media, and/or which has any other properties and/or attributes associated with reduced media durability.
- higher power draw in this context means memory which requires more electric power to operate, more thermal energy to maintain the temperature of the memory within a target operating range, and/or which has any other properties and/or attributes associated with higher power usage at the current time and/or over time.
- lower power draw in this context means memory which requires less electric power to operate, less thermal energy to maintain the temperature of the memory within the target operating range, and/or which has any other properties and/or attributes associated with lower power usage at the current time and/or over time.
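- Putting the preceding definitions together, a logic might classify each page against the two thresholds and migrate it between a faster or closer memory (for example, the memory 110) and a slower or more distant one (for example, the memory 144). The thresholds, structure, and migrate_page() hook below are hypothetical and serve only as a sketch.
```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical per-page accounting. */
struct page_info {
    uint64_t accesses_in_window;   /* accesses observed in the last sampling window */
    bool     in_fast_memory;       /* true if currently placed in, e.g., the memory 110 */
};

/* Hypothetical thresholds; "frequently" and "infrequently" need not share one value. */
#define FREQ_THRESHOLD   64        /* at or above this: frequently-accessed */
#define INFREQ_THRESHOLD  4        /* at or below this: infrequently-accessed */

/* Hypothetical migration hook (e.g., NUMA page migration or FIG. 6-style operations);
 * assumed to be provided elsewhere. */
void migrate_page(struct page_info *pg, bool to_fast_memory);

/* Hot pages move toward faster/closer memory; cold pages move toward slower/farther memory. */
static void rebalance_page(struct page_info *pg)
{
    if (!pg->in_fast_memory && pg->accesses_in_window >= FREQ_THRESHOLD) {
        migrate_page(pg, true);    /* frequently-accessed: promote */
        pg->in_fast_memory = true;
    } else if (pg->in_fast_memory && pg->accesses_in_window <= INFREQ_THRESHOLD) {
        migrate_page(pg, false);   /* infrequently-accessed: demote */
        pg->in_fast_memory = false;
    }
}
```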
- any of the components of the memory extension system may be internal or external to the client 100, to the memory extension module(s) 140, and/or both. In examples where components are external, they may be in communication via an interconnect and/or network. In some examples, one or more memory extension modules 140 may be external to the client 100.
- FIG. 2 illustrates a hardware diagram of an example memory extension system using an external memory extension module.
- the memory extension system may include one or more memory extension modules 140 that are external to the client 100.
- the memory extension system may include one or more clients 100.
- the interconnect 180 may be external to the client(s) 100, such as with a switched fabric and/or a network.
- the interconnect 180 may include one or more PCIe switches, one or more CXL switches, one or more network/interconnect management systems, and/or any other components that contribute to the communication between the client(s) 100, the memory extension module(s) 140, components of the client(s) 100, and/or components of the memory extension module(s) 140.
- the memory extension system may include one or more management components responsible for forming and/or maintaining associations between one or more clients 100 and/or one or more memory extension modules 140.
- the management components may be and/or may operate similarly to the management servers, as described in U.S. Non-provisional patent application Ser. No. 14/530,908, filed November 3, 2014, and U.S. Non-provisional patent application Ser. No. 15/424,395, filed February 3, 2017, each of which is hereby incorporated by reference.
- the memory extension logic 142 may retrieve data from the memory 144 of the memory extension module 140 in response to one or more accesses by the processor 130 and/or by another logic. For example, the memory extension logic may respond to a cache fill request by reading the corresponding data from the memory 144. In examples where the data was not previously stored in the memory 144, the memory extension logic 142 may retrieve the data from the backing store 146 and/or via the communication interface 148. In addition to responding to the cache fill request with the retrieved data, the memory extension logic may store the retrieved data in the memory 144. When data is stored in the memory 144, it may be faster to retrieve than when stored in the backing store 146 and/or when retrieved via the communication interface 148.
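- That retrieval order can be sketched as follows: check the memory 144 first, fall back to the backing store 146 or the communication interface 148, answer the cache fill, and then populate the memory 144 so later accesses are faster; the choice of when to populate is discussed in the following paragraphs. All helper names are hypothetical.
```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical lookup/IO helpers for the places the data may live;
 * assumed to be provided elsewhere. */
bool mem144_lookup(uint64_t addr, void *out, uint32_t len);       /* module memory 144 */
int  backing_store_read(uint64_t addr, void *out, uint32_t len);  /* backing store 146 */
int  remote_read(uint64_t addr, void *out, uint32_t len);         /* via interface 148 */
void mem144_store(uint64_t addr, const void *data, uint32_t len); /* populate memory 144 */
void complete_cache_fill(uint64_t addr, const void *data, uint32_t len);

static int handle_cache_fill(uint64_t addr, uint32_t len)
{
    uint8_t buf[256];
    if (len > sizeof(buf))
        return -1;

    if (!mem144_lookup(addr, buf, len)) {
        /* Not cached locally: fall back to slower media. */
        if (backing_store_read(addr, buf, len) != 0 &&
            remote_read(addr, buf, len) != 0)
            return -1;
    }

    /* Answer the processor first so it can stop waiting ... */
    complete_cache_fill(addr, buf, len);

    /* ... then populate the module memory so the next access is faster. */
    mem144_store(addr, buf, len);
    return 0;
}
```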
- the processor 130 and/or other logic may wait for the data. While waiting, one or more hardware resources of the processor 130 and/or other logic may be unable to perform meaningful work. For example, one or more hardware pipelines of the processor 130 and/or other logic may be stalled. In examples such as these, it may be advantageous to reduce the duration when the processor 130 and/or other logic is waiting and/or to enable the processor 130 and/or other logic to perform other meaningful work while waiting.
- the operations of the memory extension logic 142 may be optimized to reduce the duration when the processor 130 and/or other logic is waiting. For example, when handling a cache fill request, if there are no available locations in the memory to hold the data, operations may be structured such that the cache fill request is not completed until a suitable location can be identified and the data is stored in memory. Alternatively or in addition, the cache fill request may be completed upon retrieving the data for the request, and the data may be stored in the memory 144 after completing the cache fill request. In other examples, operations may be performed in advance of receiving cache fill requests to identify and/or prepare one or more suitable memory locations, lessening the work needed while handling cache fill requests.
- FIG. 3A illustrates an example flowchart for a request for reclaim candidates.
- the memory extension logic 142 and/or another logic may perform the operations shown in FIG. 3A and/or other figures herein.
- the memory extension logic 142 and/or another logic may receive a request for reclaim candidates (302).
- the request for reclaim candidates (302) may be any one or more mechanisms that may trigger the operations described for FIG. 3A. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc.
- the request (302) may be received from another logic, such as the client logic 112 and/or application logic 114, and/or the request (302) may be received from the memory extension logic 142, such as by the memory extension logic 142 making a determination to reclaim memory.
- the memory extension logic 142, and/or any other one or more logics may make the determination independently and/or in coordination with each other and/or with any other one or more logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason.
- the memory extension logic 142 may determine that additional reclaim candidates are needed to service an expected data access rate from the processor 130.
- the memory extension logic 142 and/or another logic may make the determination and/or send the request (302) in response to a low memory condition.
- the memory extension logic 142 may determine that newer reclaim candidates are needed than had been previously selected, such as after a certain amount of time has passed and/or after a certain number of events (such as data accesses) have occurred. In some examples, there may be no request (302), such as if the determination is made by the memory extension logic 142.
- the memory extension logic 142 and/or another logic may process access information (304).
- Processing access information (304) may include evaluating one or more stored data related to portions of the memory 144 and/or of the backing store 146 and/or selecting one or more portions as reclaim candidates.
- the reclaim candidate(s) may be one or more portions deemed to be unlikely to be used in the near future.
- the reclaim candidate(s) may be selected using any one or more selection strategies, portion replacement strategies, and/or page replacement strategies known now or later discovered.
- one or more portion(s) of memory may be tracked using one or more sorted and/or unsorted collections, such as one or more least-recently used lists.
- the collection(s) may be updated, modified, and/or replaced to reflect the action that occurred.
- the position of the portion of memory may be moved to the back of a sorted list of least-recently used portions, and/or may be moved to a different collection, such as to a collection of active portions.
- portion replacement strategies may include not-recently-used, first-in-first-out, second-chance, clock, random replacement, not-frequently-used, aging, longest-distance-first, any other portion replacement strategy known now or later discovered, and/or any combination of two or more of these and/or any other strategies.
- the one or more selection strategies, portion replacement strategies, and/or page replacement strategies used may be selected by a user and/or an administrator, may be configured, and/or may be selected based on any one or more policies, passed functions, steps, and/or rules that the memory extension logic 142, the client logic 112, and/or another logic follows to determine which one or more selection strategies, portion replacement strategies, and/or page replacement strategies to use and/or which parameter(s) and/or configuration(s) to use with the one or more selection strategies, portion replacement strategies, and/or page replacement strategies.
- the one or more policies, passed functions, steps, and/or rules may use any available information, such as any one or more of the characteristics and/or configurations of the client(s) 100, the memory extension module(s) 140, the memory appliance(s), and/or the management server(s), to select the one or more selection strategies, portion replacement strategies, and/or page replacement strategies.
- a policy and/or passed function may specify to use a random replacement strategy unless the number of accesses that result in reading from the backing store 146 and/or via the communication interface 148 reaches a threshold, and/or to use a least-recently-used strategy after reaching the threshold.
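- as a non-authoritative illustration only, the following C sketch shows one way such a policy could be expressed; the structure, field names, and threshold value are hypothetical and are not taken from this disclosure.

```c
/* Sketch of a replacement-strategy policy: use random replacement until the
 * number of accesses served from the backing store reaches a threshold, then
 * switch to least-recently-used.  Names and values are illustrative only. */
#include <stdio.h>

enum strategy { STRATEGY_RANDOM, STRATEGY_LRU };

struct reclaim_policy {
    unsigned long backing_store_reads;  /* reads that fell through to the backing store */
    unsigned long threshold;            /* switch point chosen by configuration */
};

static enum strategy select_strategy(const struct reclaim_policy *p)
{
    return (p->backing_store_reads >= p->threshold) ? STRATEGY_LRU
                                                    : STRATEGY_RANDOM;
}

int main(void)
{
    struct reclaim_policy p = { .backing_store_reads = 0, .threshold = 1000 };

    /* Each read that misses memory and falls back to the backing store (or
     * the communication interface) bumps the counter. */
    p.backing_store_reads = 250;
    printf("strategy=%s\n",
           select_strategy(&p) == STRATEGY_LRU ? "LRU" : "random");

    p.backing_store_reads = 1500;
    printf("strategy=%s\n",
           select_strategy(&p) == STRATEGY_LRU ? "LRU" : "random");
    return 0;
}
```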
- characteristics and/or configurations of the client 100, memory extension module 140, memory appliance, management server, and/or any other device/component may include: name, system time, time zone, time synchronization settings, time server(s), network configuration, hostname, power configuration, battery policy, Uninterruptible Power Supply (UPS) configuration, disk policy, backing store configuration, persistence configuration, persistence mode, customer-support configuration(s), service configuration(s), user interface configuration, user configuration, user-group configuration, health monitoring configuration(s), network monitoring configuration(s), SNMP configuration, logic version, software version, firmware version, microcode version, any other fixed aspect of the device/component, and/or any other aspect of the device/component which may be configured and/or changed.
- characteristics and/or configurations may include associations between components/devices.
- the one or more selection strategies, portion replacement strategies, and/or page replacement strategies used may be affected by one or more policies, passed functions, steps, and/or rules that the memory extension logic 142, the client logic 112, and/or another logic follows to determine how the one or more selection strategies, portion replacement strategies, and/or page replacement strategies should operate.
- the one or more policies, passed functions, steps, and/or rules may use any available information, such as any one or more of the characteristics and/or configurations of the client(s) 100, the memory extension module(s) 140, the memory appliance(s), and/or the management server(s), to affect the one or more selection strategies, portion replacement strategies, and/or page replacement strategies.
- a policy and/or passed function may specify an amount of memory to unmap from page table entries and/or to reclaim when activated.
- a policy and/or passed function may specify an amount of data to retain in local primary memory as frequently-used and/or “hot” data.
- a sorted collection may be approximated by storing one or more values with portion-tracking data structures.
- the one or more values stored may include one or more generation counters and/or timestamp values.
- a generation counter may be maintained and/or may be incremented and/or decremented as portion(s) are accessed and/or as data accesses and/or page fault(s) occur.
- the one or more values may be stored with one or more portion tracking data structures that correspond to the portion(s) being accessed and/or the portion(s) being page-faulted.
- the one or more values stored may include a count of the number of times corresponding portion(s) were accessed and/or page-faulted.
- all of or a subset of all portion tracking data structures may be inspected and/or corresponding stored values may be collected in a data structure.
- all portion tracking data structures for a file and/or allocation domain may be inspected and/or corresponding stored values may be collected.
- An allocation domain may be a logical partitioning of computing resources and/or may be controlled via one or more file data limits, such as described in U.S. Nonprovisional patent application 15/424,395, filed February 3, 2017, which is hereby incorporated by reference.
- one or more virtualization instances, virtual machines, containers, jails, and/or zones may be and/or may be included in an allocation domain.
- the collected values may be stored in the data structure with one or more identifiers for the corresponding portions, such as an address, offset, and/or index of the corresponding portions and/or an address and/or reference to corresponding portion tracking data structures.
- the collected values may be sorted while being collected and/or after being collected.
- all of or a subset of all portion tracking data structures may be updated, modified, and/or replaced periodically and/or in response to the event and/or condition and/or in response to another event and/or condition.
- the value(s) may be reset, cleared, zeroed, removed, updated, modified, replaced, and/or invalidated, such as by setting the value(s) to zero and/or setting and/or clearing one or more indicators indicating that the value(s) are invalid, not present, not available, not accessible, and/or not writable.
- the sorted collected values may be used to identify portion(s) of memory that have not been used recently and/or are unlikely to be used in the near future by inspecting the portion tracking data structures for the portions corresponding to the first and/or last value in the sorted collected values.
- the lowest value may correspond to the portion which was least recently used.
- if the sorted collected values are sorted based upon the count of the number of times corresponding portions were accessed and/or page-faulted, then the lowest value may correspond to a portion which is relatively infrequently used and/or which has not been used frequently since the value(s) were last reset.
- storing, collecting, and/or sorting the stored values may be used to approximate a least-recently-used strategy, a not-frequently-used strategy, and/or any other strategy that traditionally includes maintaining a sorted collection to identify one or more candidates to reclaim.
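- purely as an illustrative sketch (all structure and function names below are hypothetical), the following C code approximates a least-recently-used ordering by stamping each portion tracking structure with a global generation counter on access and later collecting and sorting the stamps, rather than maintaining a sorted collection.

```c
/* Approximate LRU: stamp each portion with a global generation counter on
 * access, then collect and sort the stamps to find least-recently-used
 * portions.  Structure and function names are illustrative assumptions. */
#include <stdio.h>
#include <stdlib.h>

#define NPORTIONS 8

struct portion_tracking {
    unsigned long generation;   /* value of the counter at last access */
    unsigned long access_count; /* how often the portion was accessed */
};

static unsigned long global_generation;
static struct portion_tracking portions[NPORTIONS];

static void record_access(size_t idx)
{
    portions[idx].generation = ++global_generation;
    portions[idx].access_count++;
}

struct collected { size_t idx; unsigned long generation; };

static int by_generation(const void *a, const void *b)
{
    const struct collected *ca = a, *cb = b;
    return (ca->generation > cb->generation) - (ca->generation < cb->generation);
}

int main(void)
{
    /* Simulated accesses: portion 3 is touched last, several portions never. */
    record_access(3); record_access(1); record_access(1);
    record_access(7); record_access(3);

    struct collected c[NPORTIONS];
    for (size_t i = 0; i < NPORTIONS; i++)
        c[i] = (struct collected){ .idx = i, .generation = portions[i].generation };

    qsort(c, NPORTIONS, sizeof c[0], by_generation);

    /* The lowest generation values correspond to the least recently used
     * portions and are therefore the first reclaim candidates. */
    for (size_t i = 0; i < 3; i++)
        printf("candidate: portion %zu (generation %lu)\n",
               c[i].idx, c[i].generation);
    return 0;
}
```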
- a timestamp value may be any value that corresponds to the current time.
- the timestamp value may be specified in any one or more units of time, such as seconds, milliseconds, microseconds, jiffies, etc., and/or may be relative to some other defined time.
- a timestamp may be the number of seconds since midnight on January 1st of 1970.
- a timestamp may be the number of seconds and microseconds since midnight on January 1st of 2000.
- the timestamp value may be specified in a defined time zone, such as UTC, GMT, and/or any other time zone.
- the timestamp value may be specified in the time zone where one or more clients 130, one or more memory appliances 110, one or more management servers 120, and/or any other one or more entities are physically located.
- the actual values for one or more portions may change between when the values in the sorted collected values are collected and when portion(s) of memory are to be identified for reclaim and/or are to be reclaimed.
- the memory extension logic 142, the client logic 112, and/or another logic may inspect the corresponding portion tracking data structure and/or obtain the current value for the portion. If the current value for the portion is equal to the value in the sorted collected values data structure, the portion may be considered a good candidate to reclaim. Alternatively or in addition, the current value for the portion may be compared with a threshold value to determine if the portion is a good candidate to reclaim.
- the threshold value may be the mean and/or median value from the sorted collected values, the 80th percentile value from the sorted collected values, any other value from the sorted collected values, a calculated value based upon one or more values from the collected values, a value based on the current timestamp and/or generation counter, and/or any other value which may be useful for determining suitability for reclaiming a portion.
- the current value for the portion may be checked during other and/or additional steps of reclaiming portion(s) of memory. For example, the current value may be compared to the stored and/or threshold value while selecting destination memory (324), prior to starting to read data (328), and/or prior to invalidating one or more page table entries and/or updating other data structure(s).
- the collected values may be analyzed and/or used to determine the threshold value, but not used directly to identify candidate portions to reclaim.
- the collected values may be sorted, analyzed, and/or partially sorted in order to determine the value that would be at the 80th percentile and/or any other position of the sorted collected values and choose this value as the threshold value.
- the threshold value may be calculated based on one or more of the collected values and/or sorted collected values.
- candidate portions may be chosen using any selection strategy, portion replacement strategy, and/or page replacement strategy known now or later discovered.
- candidate portions to reclaim may be selected randomly and/or portion tracking data structures for candidate portions may be inspected to compare one or more stored values to one or more threshold values that were determined from the collected values.
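- the following C sketch, offered only as a hedged example with hypothetical names and values, derives a threshold (here the 80th percentile) from the collected values and then checks randomly chosen portions against their current values instead of walking a maintained sorted collection.

```c
/* Sketch: derive a threshold (the 80th percentile) from collected values,
 * then test randomly chosen portions against it.  Names are illustrative. */
#include <stdio.h>
#include <stdlib.h>

static int cmp_ul(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a, y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

/* 80th-percentile of the collected values; values[] is modified by sorting. */
static unsigned long percentile80(unsigned long *values, size_t n)
{
    qsort(values, n, sizeof values[0], cmp_ul);
    return values[(n * 80) / 100];
}

int main(void)
{
    unsigned long collected[] = { 12, 90, 3, 55, 40, 7, 63, 21, 88, 15 };
    size_t n = sizeof collected / sizeof collected[0];
    unsigned long current[]   = { 12, 95, 3, 55, 42, 7, 63, 30, 88, 15 };

    unsigned long threshold = percentile80(collected, n);

    /* Pick a few random portions and keep the ones whose *current* value is
     * at or below the threshold; portions above the threshold are skipped. */
    srand(1);
    for (int i = 0; i < 5; i++) {
        size_t idx = (size_t)(rand() % (int)n);
        if (current[idx] <= threshold)
            printf("portion %zu is a reclaim candidate (value %lu <= %lu)\n",
                   idx, current[idx], threshold);
    }
    return 0;
}
```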
- An advantage of using the sorted collected values and/or threshold values as described herein over maintaining one or more sorted collections may be that lock contention may be reduced and/or eliminated. For example, if a lock primitive would be used to access, update, modify, and/or replace the sorted collection(s), the lock primitive could be avoided when not attempting to maintain the sorted collection(s), and/or the lock primitive may not need to be acquired when handling a page fault and/or other type of access operations which may otherwise cause the sorted collection(s) to be updated/modified/replaced and/or one or more portions to be moved within the sorted collection(s) and/or between multiple collections.
- the sorted collected values may be collected and/or sorted without acquiring any highly-contended lock primitives.
- portions of a file and/or address space may be iterated by speculatively referencing entries in a radix tree and/or other data structure that uses a read-copy-update and/or other lock-free mechanism to coordinate access.
- Speculatively-referenced entries may be individually locked and/or verified to exist when collecting the stored values for corresponding portion tracking data structures.
- if a speculatively-referenced entry is locked and/or fails verification, the portion may be skipped. Skipping these portions may be advantageous and/or may improve efficiency of collecting values in example systems where locked and/or invalid portions are unlikely to be good candidates to reclaim.
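- a minimal C sketch of this idea, assuming a per-portion lock and using a trylock so that collection never blocks; the structures and the use of POSIX mutexes are illustrative assumptions, not a description of any particular implementation.

```c
/* Sketch: collect stored values without blocking by using a per-portion
 * trylock; portions that are locked elsewhere or no longer valid are simply
 * skipped, since they are unlikely to be good reclaim candidates anyway. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NPORTIONS 4

struct portion_tracking {
    pthread_mutex_t lock;
    bool valid;
    unsigned long generation;
};

static struct portion_tracking portions[NPORTIONS];

static void collect_values(void)
{
    for (int i = 0; i < NPORTIONS; i++) {
        if (pthread_mutex_trylock(&portions[i].lock) != 0)
            continue;                       /* locked elsewhere: skip it */
        if (portions[i].valid)
            printf("portion %d: generation %lu\n", i, portions[i].generation);
        pthread_mutex_unlock(&portions[i].lock);
    }
}

int main(void)
{
    for (int i = 0; i < NPORTIONS; i++) {
        pthread_mutex_init(&portions[i].lock, NULL);
        portions[i].valid = true;
        portions[i].generation = (unsigned long)(i * 10);
    }
    portions[2].valid = false;              /* simulate an invalid portion  */
    pthread_mutex_lock(&portions[1].lock);  /* simulate a contended portion */

    collect_values();                       /* only portions 0 and 3 print  */

    pthread_mutex_unlock(&portions[1].lock);
    return 0;
}
```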
- the memory extension logic 142 and/or another logic may optionally start writeback (306) for one or more portions of the memory 144 of the memory extension module 140.
- Starting writeback (306) may include writing data from the one or more portions of the memory 144 to corresponding portion(s) of the backing store 146, and/or to other locations, such as by writing to one or more memory appliances and/or other locations via the communication interface(s) 148.
- the data may be copied using client-side memory access, via one or more RDMA operations, and/or via any other one or more suitable data transfer operations.
- the data may be optionally encrypted, decrypted, compressed, decompressed, and/or may undergo data translation before, during, and/or after being written (306).
- Other examples of data translation may be described elsewhere in this disclosure.
- the encryption, decryption, compression, decompression, and/or data translation may be performed in whole or in part by the memory extension logic 142, one or more communication interfaces 148, one or more processors 130, other devices, and/or other logic.
- the compression may be performed prior to selecting the corresponding portion(s) of the backing store 146 and/or of other locations. Compressing the data before selecting a destination may be advantageous when using compression algorithms for which it may be difficult to predict the resulting size of compressed data.
- Data translation may include manipulating the data being read and/or written.
- data translation may include compressing the data being written and/or decompressing the data being read.
- Compression and/or decompression may be performed using any one or more compression schemes, such as Lempel-Ziv (LZ), DEFLATE, Lempel-Ziv-Welch (LZW), Lempel-Ziv-Renau (LZR), Lempel-Ziv-Oberhumer (LZO), Huffman encoding, LZX, LZ77, Prediction by Partial Matching (PPM), Burrows-Wheeler transform (BWT), Sequitur, Re-Pair, arithmetic code, and/or other method and/or scheme known now or later discovered which may be used to recoverably reduce the size of data.
- data translation may include encrypting the data being written and/or decrypting the data being read. Encryption and/or decryption may be performed using any one or more encryption schemes and/or ciphers, such as symmetric encryption, public-key encryption, block ciphers, stream ciphers, substitution ciphers, transposition ciphers, and/or any other scheme which may be used to encode information such that only authorized parties may decode it.
- data translation may include performing error detection and/or error correction upon the data being written and/or the data being read.
- Error detection and/or error correction may be performed using any one or more error detection and/or error correction schemes, such as repetition codes, parity bits, checksums, cyclic redundancy checks, cryptographic hash functions, error correcting codes, forward error correction, convolutional codes, block codes, Hamming codes, Reed-Solomon codes, Erasure Coding-X (EC-X) codes, Turbo codes, low-density parity-check codes (LDPC), and/or any other scheme which may be used to detect and/or correct data errors.
- Error detection and/or error correction may include performing additional calculations to confirm the integrity of the data written to and/or read. For example, one or more digests may be written for one or more corresponding portions of the memory and/or backing store. When reading the corresponding portion, if the stored digest does not match the digest which can be computed from the read data for the portion, then the read may be considered failed and/or the portion may be considered corrupted. Alternatively or in addition, the data may be corrected based upon the one or more digests and/or error correcting codes.
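- as a hedged example, the following C sketch stores a simple Fletcher-style digest with each written portion and verifies it on read; the digest algorithm, sizes, and names are illustrative stand-ins for whatever checksum, CRC, or cryptographic hash a real system might use.

```c
/* Sketch: store a digest alongside each written portion and verify it when
 * the portion is read back; a mismatch marks the read as failed/corrupted. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PORTION_SIZE 16

struct stored_portion {
    uint8_t  data[PORTION_SIZE];
    uint16_t digest;
};

/* Fletcher-16 style checksum: an illustrative stand-in for a real digest. */
static uint16_t digest16(const uint8_t *buf, size_t len)
{
    uint16_t a = 0, b = 0;
    for (size_t i = 0; i < len; i++) {
        a = (uint16_t)((a + buf[i]) % 255);
        b = (uint16_t)((b + a) % 255);
    }
    return (uint16_t)((b << 8) | a);
}

static void write_portion(struct stored_portion *p, const uint8_t *src)
{
    memcpy(p->data, src, PORTION_SIZE);
    p->digest = digest16(p->data, PORTION_SIZE);   /* digest written with data */
}

static int read_portion(const struct stored_portion *p, uint8_t *dst)
{
    if (digest16(p->data, PORTION_SIZE) != p->digest)
        return -1;                                 /* read failed / corrupted */
    memcpy(dst, p->data, PORTION_SIZE);
    return 0;
}

int main(void)
{
    struct stored_portion p;
    uint8_t src[PORTION_SIZE] = "example payload", dst[PORTION_SIZE];

    write_portion(&p, src);
    printf("clean read: %s\n", read_portion(&p, dst) == 0 ? "ok" : "corrupt");

    p.data[0] ^= 0xff;                             /* simulate media corruption */
    printf("after flip: %s\n", read_portion(&p, dst) == 0 ? "ok" : "corrupt");
    return 0;
}
```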
- Further examples may include performing multiple types of data translation.
- the client logic 112, the memory extension logic 142, and/or another entity may encrypt the data being written and/or compute one or more error detecting and/or error correcting codes for the data and/or for the encrypted data.
- the client logic 112, the memory extension logic 142, and/or another entity may decrypt the data being read and/or may perform error detection and/or error correction upon the data and/or encrypted data being read.
- the memory extension logic 142 and/or another logic may optionally wait for the writeback to complete. Alternatively, or in addition, the memory extension logic 142 and/or another logic may proceed without waiting, such as by marking the portions as being under writeback and/or by updating one or more data structures to indicate writeback is in progress.
- the memory extension logic and/or another logic may provide one or more reclaim candidates (308).
- Providing the one or more reclaim candidates (308) may indicate that one or more portions are appropriate candidates to consider when reclaiming one or more portions of memory.
- Providing the reclaim candidates (308) may include sending/invoking a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, and/or may utilize any other mechanism capable of conveying information about the reclaim candidates to a reclaim logic.
- the memory extension logic and/or another logic may be done (310) handling the request for reclaim candidates.
- the reclaim logic may be any logic responsible for and/or that performs reclaiming portions of memory 144, facilitating the portions being reused for other purposes.
- the reclaim logic may be included in the memory extension logic 142, in the client logic 112, and/or in another logic.
- the reclaim logic may, for example, reclaim portions of memory 144 in response to a request to start loading data into memory (322).
- FIG. 3B illustrates an example flowchart for starting to load data into memory.
- the memory extension logic 142 and/or another logic may perform the operations shown in FIG. 3B and/or other figures herein.
- the memory extension logic 142 and/or another logic may receive a request to start loading data into memory (322).
- the request to start loading data into memory (322) may be any one or more mechanisms that may trigger the operations described for FIG. 3B. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc.
- the request (322) may be received from another logic, such as the client logic 112 and/or application logic 114, and/or the request (322) may be received from the memory extension logic 142, such as by the memory extension logic 142 making a determination to start loading data.
- the memory extension logic 142, and/or any other one or more logics may make the determination independently and/or in coordination with each other and/or with any other one or more logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason.
- the memory extension logic 142, the client logic 112, the application logic 114, and/or another logic may determine that some data is needed to service an expected data access from the processor 130. In some examples, there may be no request (322), such as if the determination is made by the memory extension logic 142.
- the memory extension logic 142 and/or another logic may select one or more portions of destination memory (324). Selecting one or more portions of destination memory (324) may include selecting one or more portions from the reclaim candidates. Alternatively or in addition, selecting the destination memory (324) may include selecting portions using any one or more selection strategies, portion replacement strategies, and/or page replacement strategies known now or later discovered, such as described for selecting reclaim candidate(s). In examples where the reclaim candidate(s) are provided as a sorted collection, the destination memory may be selected (324) by selecting the next one or more portions from the sorted collection. In other examples, one or more random portions may be selected.
- one or more of the unused portions may be selected (324).
- An advantage of using the reclaim candidate(s), selecting randomly, and/or selecting from among unused portions may be that selecting the one or more portions of destination memory (324) may be performed by a hardware logic and/or without invoking a processor (for example, the processor 130 of the client 100 and/or the processor of the memory extension logic 142, if present).
- a hardware state machine may select the one or more portions of destination memory (324) and/or may perform one, more, and/or all of the other operations of FIG. 3B. Performing one or more operations of FIG. 3B with a hardware logic and/or without invoking a processor may reduce the duration when the processor 130 and/or other logic is waiting.
- selecting the one or more portions of destination memory (324) may be performed by any other logic(s), such as a software logic.
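- the following C sketch, with hypothetical names and a deliberately simplified non-wrapping queue, illustrates selecting destination memory (324) by popping the next provided reclaim candidate and falling back to a random portion when no candidates remain; logic this simple could plausibly be realized in a hardware state machine or in software.

```c
/* Sketch: destination memory selection from a queue of provided reclaim
 * candidates, with a random fallback.  Layout and names are illustrative. */
#include <stdio.h>
#include <stdlib.h>

#define NPORTIONS      16
#define MAX_CANDIDATES  8

/* Simplified, non-wrapping queue of provided reclaim candidates (308). */
struct candidate_queue {
    int portions[MAX_CANDIDATES];
    int head, tail;                 /* head == tail means the queue is empty */
};

static int pop_candidate(struct candidate_queue *q)
{
    if (q->head == q->tail)
        return -1;                  /* no reclaim candidates available */
    return q->portions[q->head++];
}

/* Select destination memory (324): next candidate, else a random portion. */
static int select_destination(struct candidate_queue *q)
{
    int portion = pop_candidate(q);
    if (portion < 0)
        portion = rand() % NPORTIONS;
    return portion;
}

int main(void)
{
    struct candidate_queue q = { .portions = { 5, 9 }, .head = 0, .tail = 2 };
    srand(1);

    for (int i = 0; i < 4; i++)
        printf("cache fill %d -> destination portion %d\n",
               i, select_destination(&q));
    return 0;
}
```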
- selecting the destination memory may include notifying one or more logics, such as the client logic 112 and/or application logic 114, that one or more corresponding portions of address space are no longer associated with the selected portion(s) of destination memory.
- the memory extension logic 142 and/or another logic may send a notification to the client logic 112, to the application logic 114, to the client 100, and/or to any other one or more logics indicating that the address space portion(s) are no longer associated with the selected portion(s).
- the client logic 112, the application logic 114, the client, and/or other logic(s) may update one or more page table entries, page table(s), memory presence indicator(s), portion tracking data structure(s), and/or other data structure(s) to indicate that one or more address space portion(s) are no longer present, valid, available, accessible, writable, and/or associated with the selected portion(s).
- the memory extension logic 142 and/or another logic may update one or more page table entries, page table(s), memory presence indicator(s), portion tracking data structure(s), and/or other data structure(s) to indicate that one or more address space portion(s) are no longer present, valid, available, writable, accessible, and/or associated with the selected portion(s).
- the memory extension logic and/or another logic may optionally start writeback (326) for one or more of the selected portions. For example, if one or more of the selected portions are from the provided reclaim candidates and/or if one or more of the selected portions contain data that has not been written to the backing store 146 and/or via the communication interface(s) 148, the memory extension logic and/or another logic may start writing data from the one or more portions of the memory 144 to corresponding portion(s) of the backing store 146, and/or to other locations, such as described elsewhere in this disclosure for starting writeback (306). Alternatively or in addition, such as if one or more of the selected portions are unused and/or do not contain such data, writeback may not be started for these portions.
- the memory extension logic and/or another logic may start reading (328) the data indicated by the request and/or determination to start loading data into memory (322).
- the data may be read into memory 110, 144 by reading from corresponding portion(s) of the backing store 146, by reading corresponding portions of a memory appliance, such as via the communication interface(s) 148, and/or by initializing one or more portions of memory.
- operations may be complete (330).
- FIG. 3A and FIG. 3B may occur independently, concurrently, repeatedly, and/or may be interleaved.
- the memory extension logic 142 and/or another logic may handle one or more requests for reclaim candidates (302) while also reading data (328) with previously-provided reclaim candidates.
- two or more of the steps of FIG. 3A and/or FIG. 3B may be combined.
- writeback may be started (306) for one or more portions while processing access information (304), such as by starting writeback upon processing information for each of the one or more portions.
- one or more reclaim candidates may be provided (308) prior to starting writeback (306) for the corresponding portion(s) and/or for other portion(s).
- both writeback (306, 326) and/or reading data (328) may be started and/or may proceed concurrently.
- the data being written may be copied to a temporary location, such as another portion of the memory 144, 110 and/or a temporary buffer of the backing store 146 and/or of the communication interface(s) 148, prior to starting and/or as part of starting writeback (306, 326).
- the data being read may be copied from the temporary location and/or a second temporary location after and/or as part of reading data (328).
- starting writeback (306, 326) may be performed independently of the operations depicted in FIG. 3A and/or FIG. 3B, such as periodically and/or as part of a background operation.
- the operations of the client logic 112, the application logic 114, the memory extension logic 142, and/or other logic(s) may be optimized and/or coordinated to further reduce the duration when the processor 130 and/or other logic is waiting.
- the client logic 112, the application logic 114, and/or another logic may indicate to the memory extension logic 142 that one or more portions of memory 110, 144 may no longer contain useful data and/or that the portion(s) may be re-initialized instead of retrieving data from the backing store 146 and/or via the communication interface(s) 148.
- the indication(s) may be performed in response to other operations of the client 100, such as allocating and/or freeing memory and/or in response to a request, such as an initialization, zeroing, and/or discard request, from one or more logic(s), such as from the application logic 114.
- FIG. 4A illustrates an example flowchart for allocating memory.
- the client logic 112 and/or another logic may perform the operation shown in FIG. 4A and/or other figures herein.
- the client logic 112 and/or another logic may receive a request for allocating memory (402).
- the request for allocating memory (402) may be any one or more mechanisms that may trigger the operations described for FIG. 4A. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc.
- the request (402) may be received from another logic, such as the application logic 114, and/or the request (402) may be received from the client logic 112, such as by the client logic 112 making a determination to allocate memory.
- the client logic 112, and/or any other one or more logics may make the determination independently and/or in coordination with each other and/or with any other one or more logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason.
- the client logic 112 may determine that additional memory may improve operation of the application logic 114 and/or may determine to allocate additional memory for the application logic 114, such as in examples where the application logic 114 includes a virtualization instance.
- the request for allocating memory (402) may include a memory allocation function call, such as malloc.
- the request for allocating memory (402) may be implied by the occurrence of a page fault.
- the client logic 112 and/or another logic may select one or more portion(s) of memory (404).
- the selected portion(s) may be addressable by the processor 130 of the client 100.
- the selected portion(s) may include one or more portions of the memory 110 of the client 100.
- the selected portion(s) may include one or more portions of the memory 144 of the memory extension module 140.
- the selected portion(s) may include one or more portions of physical address space associated with the memory extension module 140, such as with the physically-addressable interface and/or hardware-accessible interface.
- portion(s) of the memory 110 of the client 100 may be determined based upon an expected likelihood that the allocated portion(s) will be accessed soon and/or based upon one or more performance characteristics of the memory 110 of the client 100, the memory 144 of the memory extension module 140, and/or of any other one or more components, such as the interconnect(s) 170, 180, the memory controller 120, the backing store 146, the communication interface(s) 148, and/or the memory extension logic 142.
- one or more portions may be selected from the memory with faster performance, such as the memory 110 of the client.
- one or more portions may be selected from the memory with slower performance, such as the memory 144 of the memory extension module 140, and/or may be selected from the physical address space associated with the memory extension module 140.
- the portion(s) may be selected using any one or more selection strategies described herein and/or one or more selection strategies known now or later discovered, such as those described in U.S. Non-provisional patent application Ser. No.
- the client logic 112 and/or another logic may configure a portion of an address space of the application logic 114 as anonymous memory and/or may configure the address space, such that the operations described for FIG. 4A may be performed at a later time, such as when accessing and/or writing to the portion of the address space for the first time.
- the client logic 112 and/or another logic may optionally request one or more portions of memory and/or address space to be initialized (406).
- the memory extension logic 142 and/or another logic may initialize the one or more indicated and/or associated portions of memory, such as one or more portions of the memory 144 of the memory extension module 140 and/or one or more portions of the memory 110 of the client 100.
- Initializing one or more portions may include writing to the portions with some predetermined and/or specified data. For example, the portion(s) of memory 144, 110 may be overwritten with all zeros.
- the portion(s) of memory 144, 110 initialized may be specified by the request and/or may be implied by the specified portion(s) of address space, such as by identifying the portion(s) of memory 144, 110 via one or more mapping data structures.
- the mapping data structure(s) may be any data structure(s) known now or later discovered that may be capable of associating one or more portion(s) of memory 144, 110 with one or more portion(s) of address space and/or of identifying the one or more portion(s) of memory 144, 110 given one or more portion(s) of address space to query.
- Examples of data structures suitable for this purpose may include an array, an extensible array, a sparse array, a radix tree, an interval tree, a mapping, a hash table, a page table, and/or any other data structure(s) that facilitates associating memory portions, memory offsets, and/or memory identifiers with address space references and/or identifiers.
- the portions of memory 144, 110 initialized may be initialized at the time of the request (406) and/or may be initialized at a later time.
- the memory extension logic 142 and/or another logic may update one or more data structures, such as the mapping data structure(s), to indicate that the portions should be initialized prior to and/or upon being accessed, such as when starting to read data (328).
- the memory extension logic 142 and/or another logic may update one or more data structures, such as the mapping data structure(s) to disassociate one or more portions of memory 110, 144 and/or backing store 146 from one or more address space portions. Disassociating the portions may serve to indicate that the corresponding portions of address space should be initialized prior to and/or upon being accessed.
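- a minimal sketch, assuming a flat array-based mapping data structure with a per-entry initialize-on-access flag; a radix tree, hash table, or page table could serve the same role, and all names below are illustrative.

```c
/* Sketch of a mapping data structure: an array indexed by address-space
 * portion that records which memory portion (if any) backs it and whether
 * the portion must be initialized (zeroed) before it is handed out. */
#include <stdio.h>
#include <string.h>

#define NPORTIONS    8
#define PORTION_SIZE 64

struct mapping_entry {
    int memory_portion;    /* -1 means no memory portion is associated */
    int needs_init;        /* initialize before first access            */
};

static struct mapping_entry mapping[NPORTIONS];
static unsigned char memory[NPORTIONS][PORTION_SIZE];

/* Request that an address-space portion be initialized lazily (406/426). */
static void request_init(int as_portion)
{
    mapping[as_portion].needs_init = 1;
}

/* Resolve an address-space portion on access, zeroing it if required. */
static unsigned char *resolve(int as_portion)
{
    struct mapping_entry *e = &mapping[as_portion];
    if (e->memory_portion < 0)
        return NULL;                        /* would trigger a load (322) */
    if (e->needs_init) {
        memset(memory[e->memory_portion], 0, PORTION_SIZE);
        e->needs_init = 0;
    }
    return memory[e->memory_portion];
}

int main(void)
{
    for (int i = 0; i < NPORTIONS; i++)
        mapping[i] = (struct mapping_entry){ .memory_portion = -1 };

    mapping[2].memory_portion = 0;          /* portion 2 backed by memory 0 */
    memory[0][0] = 0xAB;                    /* stale data from a prior use  */
    request_init(2);

    unsigned char *p = resolve(2);
    printf("first byte after access: %u\n", p ? (unsigned)p[0] : 255u);  /* 0 */
    return 0;
}
```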
- the client logic 112 and/or another logic may map the portion(s) of memory 110, 144 and/or address space to a second address space (408).
- the second address space may be a virtual address space, such as a virtual address space associated with the application logic 114.
- Mapping the portion(s) to the second address space (408) may include updating one or more mapping data structures to associate the portion(s) with one or more corresponding portions of the second address space.
- the client logic 112 and/or another logic may update one or more page tables and/or page table entries to associate the portion(s) of memory 110, 144 and/or address space with one or more portions of the virtual address space associated with the application logic 114.
- An effect of mapping the portion(s) of memory 110, 144 and/or address space may include causing future accesses by the processor 130, a logic, such as the application logic 114, and/or any other component of the client 100 of the portion(s) of the second address space to cause the processor to access the mapped portion(s) of memory 110, 144 and/or the address space.
- the processor 130 may determine that the mapped portion(s) of memory 110, 144 and/or the address space are mapped to the accessed portion(s) of the second address space and may translate virtual addresses of the second address space to physical addresses of the memory 110, 144 and/or of the address space. Upon determining and/or translating, the processor 130 may then access the mapped portion(s) of memory 110, 144 and/or the address space.
- the client logic 112 and/or another logic may optionally initialize and/or activate one or more page table entries (410).
- initializing and/or activating the one or more page table entries (410) may be performed as part of mapping the portion(s) of memory 110, 144 and/or address space (408).
- Initializing and/or activating one or more page table entries (410) may include marking the one or more page table entries as valid, present, available, accessible, and/or writable.
- the one or more page table entries may be associated with and/or included in the mapping data structure(s), such as if the mapping data structure(s) are and/or include a page table.
- An effect of initializing and/or activating the one or more page table entries (410) may be that when the processor 130, a logic, such as the application logic 114, and/or any other component of the client 100 attempts to access a portion of the second address space, the logic, processor 130, and/or component may proceed to access the mapped portion(s) of the memory 110, 144 and/or address space without causing one or more page faults.
- Accessing the mapped portion(s) without causing page fault(s) may be advantageous in examples such as these where the accessed portions have been configured with the request for initialization (406), as the logic, processor 130, and/or component may not need to wait an extended period of time for data of the access to be made available.
- in other examples, the logic, processor 130, and/or component may need to wait and/or stall for an extended period of time while the memory extension logic 142 and/or another logic prepares the data for access, such as by retrieving (328) the data from the backing store 146 and/or via the communication interface(s) 148.
- in other examples, it may be advantageous not to initialize and/or activate the one or more page table entries (410), as not initializing and/or activating the one or more page table entries may cause one or more page faults to occur when the portion(s) of the second address space is/are accessed.
- Having page fault(s) occur in these examples may be advantageous as the page fault(s) may enable the logic, another logic, the processor 130, the component, and/or another component to perform meaningful work while the data is prepared for access.
- FIG. 4B illustrates an example flowchart for freeing memory.
- the client logic 112 and/or another logic may receive a request for freeing memory (422).
- the request for freeing memory (422) may be any one or more mechanisms that may trigger the operations described for FIG. 4B.
- Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc.
- the request (422) may be received from another logic, such as the application logic 114, and/or the request (422) may be received from the client logic 112, such as by the client logic 112 making a determination to free memory.
- the client logic 112, and/or any other one or more logics may make the determination independently and/or in coordination with each other and/or with any other one or more logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason.
- the client logic 112 may determine that memory assigned to the application logic 114 may be unused and/or underutilized and/or may determine to free some memory for the application logic 114, such as in examples where the application logic 114 includes a virtualization instance.
- the request for freeing memory (422) may include a memory freeing function call, such as free.
- the request for freeing memory (422) may be implied by a determination to page data out, such as to the backing store 160 of the client 100, and/or to reclaim memory.
- the client logic 112 and/or another logic may unmap one or more portions of memory 110, 144 and/or address space from the second address space (424).
- Unmapping the portion(s) may include updating one or more mapping data structures to disassociate the portion(s) from one or more corresponding portions of the second address space.
- unmapping the portion(s) (424) may include deactivating one or more page table entries, such as by marking the one or more page table entries as invalid, not present, not available, not accessible, and/or not writable.
- the client logic 112 and/or another logic may update one or more page tables and/or page table entries to disassociate the portion(s) of memory 110, 144 and/or address space from one or more portions of the virtual address space associated with the application logic 114.
- An effect of unmapping the portion(s) of memory 110, 144 and/or address space may include causing future accesses by the processor 130, a logic, such as the application logic 114, and/or any other component of the client 100 of the portion(s) of the second address space to cause a page fault, a segmentation fault, an error, and/or otherwise not access the unmapped portions.
- the client logic 112 and/or another logic may optionally request the portions of memory and/or address space to be discarded (426).
- the memory extension logic 142 and/or another logic may initialize and/or write to the one or more indicated and/or associated portions of memory, such as one or more portions of the memory 144 of the memory extension module 140 and/or one or more portions of the memory 110 of the client 100, with some predetermined and/or specified data.
- the portion(s) of memory 144, 110 may be overwritten with all zeros.
- the portion(s) of memory 144, 110 discarded may be specified by the request and/or may be implied by the specified portion(s) of address space, such as by identifying the portion(s) of memory 144, 110 via one or more mapping data structures.
- the portions of memory 144, 110 discarded may be discarded and/or initialized at the time of the request (426) and/or may be discarded and/or initialized at a later time.
- the memory extension logic 142 and/or another logic may update one or more data structures, such as the mapping data structure(s), to indicate that the portions should be initialized prior to and/or upon being accessed.
- the memory extension logic 142 and/or another logic may update one or more data structures, such as the mapping data structure(s) to disassociate one or more portions of memory 110, 144 and/or backing store 146 from one or more address space portions.
- Disassociating the portions may serve to indicate that the corresponding portions of address space should be initialized prior to and/or upon being accessed.
- the memory extension logic 142 and/or another logic may update one or more data structures, such as one or more page table entries, page table(s), portion tracking data structure(s), memory presence indicator(s), and/or any data structure(s), such as to indicate that the portions of address space are no longer valid, present, available, accessible, and/or writable.
- Indicating that the portions of address space are no longer valid, present, available, and/or writable may cause a logic, such as the client logic 112, to request the portions be made available prior to accessing them, which may provide the memory extension logic 142 and/or another logic an opportunity to initialize the portions.
- the one or more data structures may be updated to indicate that the portions of address space are valid, present, available, accessible, and/or writable. Indicating that the portions of address space are valid, present, available, accessible, and/or writable may be advantageous in examples where the memory extension logic 142 and/or another logic will initialize the portions upon being accessed.
- the effect of requesting the portions of memory and/or address space to be discarded (426) may be the same as the effect of requesting the portion(s) to be initialized (406).
- the two operations may be different.
- the memory extension logic 142 and/or another logic may treat the request to initialize (406) as higher or lower priority than the request to discard (426) and/or may handle the requests with different latencies and/or qualities of service.
- Treating the request to initialize (406) as higher priority, lower latency, and/or higher quality of service than the request to discard (426) may be advantageous in examples where the portion(s) may be accessed shortly after allocation and/or in examples where more portions need to be initialized and/or discarded than the memory extension logic 142 and/or another logic can service in a timely manner.
- the request to discard (426) may be not sent, ignored, deferred, delayed, refused, failed, and/or discarded.
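- as an illustrative sketch only, the following C code services initialization requests (406) ahead of discard requests (426) by keeping two simple queues and draining the higher-priority one first; the queue layout and names are hypothetical.

```c
/* Sketch: initialization requests are drained before any discard request,
 * giving them lower latency / higher priority.  Layout is illustrative. */
#include <stdio.h>

#define QSIZE 8

struct request_queue { int portions[QSIZE]; int head, tail; };

static void push(struct request_queue *q, int portion)
{
    if (q->tail < QSIZE)
        q->portions[q->tail++] = portion;
}

static int pop(struct request_queue *q)
{
    return (q->head < q->tail) ? q->portions[q->head++] : -1;
}

int main(void)
{
    struct request_queue init_q = { .head = 0, .tail = 0 };
    struct request_queue discard_q = { .head = 0, .tail = 0 };

    push(&discard_q, 4);   /* discard request arrives first       */
    push(&init_q, 7);      /* initialization request arrives later */
    push(&discard_q, 5);

    /* All initialization requests are serviced before any discard request. */
    int portion;
    while ((portion = pop(&init_q)) >= 0)
        printf("initialize portion %d\n", portion);
    while ((portion = pop(&discard_q)) >= 0)
        printf("discard portion %d\n", portion);
    return 0;
}
```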
- the client logic 112 and/or another logic may complete (428) handling the request and/or determination to free memory, such as by responding to the request and/or determination with or without a status indication.
- FIG. 4A and FIG. 4B may occur independently, concurrently, repeatedly, and/or may be interleaved. For example, multiple memory allocation and/or memory freeing operations may be executed concurrently, serially, repeatedly, and/or interleaved. Alternatively or in addition, two or more of the steps of FIG. 4A and/or FIG. 4B may be combined. For example, mapping the portion(s) of memory 110, 144 and/or address space to the second address space (408) and initializing and/or activating the one or more page table entries (410) may be combined.
- FIG. 5 illustrates an example flowchart for a memory access.
- the processor 130, the memory extension logic 142, the client logic 112, and/or other logic(s) may perform the operations shown in FIG. 5 and/or other figures herein.
- the processor 130 may perform the operations depicted in the left portion of FIG. 5 (502, 504, 518, 526).
- the memory extension logic 142 may perform the operations depicted in the center portion of FIG. 5 (512, 514, 520, 522, 524).
- the client logic 112 may perform the operations depicted in the right portion of FIG. 5 (506, 508, 510, 516).
- one or more other logics and/or components may perform any or all of the depicted operations.
- another logic, device, and/or component may perform the operations depicted in the left portion of FIG. 5, such as a GPU, a storage controller, a communication interface, and/or any other logic, device, and/or component capable of accessing memory.
- the processor 130 and/or another logic may receive a request to access memory (502).
- the request to access memory (502) may be any one or more mechanisms that may trigger the operations described for FIG. 5. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc.
- the request (502) may be received from another component and/or logic, such as the application logic 114, and/or the request (502) may be received from the processor 130, such as by the processor 130 making a determination to access and/or pre-fetch memory.
- the processor 130 and/or any other one or more components and/or logics may make the determination independently and/or in coordination with each other and/or with any other one or more components and/or logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason.
- the processor, the client logic 112, the application logic 114, and/or another logic and/or component may determine, such as with a pre-fetch mechanism, that a portion of memory and/or address space is likely to be accessed soon by another logic (such as the application logic 114) and/or may access the memory in anticipation.
- the request (502) may be one or more computer executable instructions, such as a load instruction and/or a store instruction.
- the request (502) may be an electrical, optical, and/or wireless signal, such as a memory access over an interconnect (such as the interconnects described herein 170, 180) and/or a cache manipulation request (such as a cache fill request).
- the processor 130 and/or another logic may evaluate (504) a memory presence indicator, such as a page table entry.
- the memory presence indicator may be stored in memory and/or in another location, such as the backing store 160 of the client 100 and/or the backing store 146 of the memory extension module 140.
- the memory presence indicator may be cached, such as with a translation lookaside buffer (TLB).
- the memory presence indicator may indicate whether the data associated with the request and/or determination (502) is included in memory 110, 144.
- the memory presence indicator may indicate whether the data is accessible via the physically addressable interface and/or the hardware accessible interface.
- in examples where the memory extension logic 142 is capable of retrieving the data from the backing store 146 and/or via the communication interface(s) 148, even though the data is not yet included in the memory 144, the memory presence indicator associated with the data may indicate that the data is present, valid, available, accessible, and/or writable.
- Embodiments where the memory presence indicator may indicate whether the data is accessible, such as via the physically addressable interface and/or the hardware accessible interface, may be advantageous in examples where the client logic 112, the application logic 114, and/or any other logic is incapable of coordinating with the memory extension logic and/or updating the memory presence indicators as data is transferred (such as described for FIG. 3A and/or FIG. 3B) between memory 110, 144 and backing stores 160, 146, and/or via the communication interface(s) 148.
- Embodiments where the memory presence indicator may indicate whether the data is included in memory 110, 144 may be advantageous in examples where attempting to access the data, such as via the physically addressable interface and/or the hardware accessible interface, causes the processor 130 and/or other logic to be unable to perform meaningful work during the access. For example, one or more hardware pipelines of the processor 130 and/or other logic may be stalled. In examples such as these, it may be advantageous to reduce the duration when the processor 130 and/or other logic is waiting and/or to enable the processor 130 and/or other logic to perform other meaningful work while waiting, such as by causing a page fault.
- a page fault and/or other request, indication, and/or signal may occur (506).
- the client logic 112 and/or another logic may perform operations in response such as depicted.
- the client logic 112 and/or another logic may determine if the data associated with the request and/or determination (502) is included in memory 110, 144 (508). Determining if the data is included in memory (508) may include checking one or more data structures, such as one or more mapping data structures, one or more memory presence indicators, and/or any data structure(s) capable of indicating presence or non-presence of data for one or more portions of memory and/or address space.
- the checked data structure(s) (508) may be the same as that/those evaluated (504) by the processor 130 and/or other logic.
- one or more different data structures may be checked (508), such as a bitmask of presence indicators, and/or a different page table.
- if the data is determined (508) to be included in memory 110, 144, the client logic 112 and/or other logic may proceed to complete handling the page fault and/or other request, indication, and/or signal (516).
- if the data is not included in memory 110, 144, the client logic 112 and/or other logic may request the data to be loaded into memory (510), such as by invoking the request to start loading data into memory (322) and/or by sending a data load request.
- the memory extension logic 142 and/or another logic may start loading data into memory (512). Starting to load data into memory (512) may be and/or may include one, more, and/or all of the operations as described for FIG. 3B.
- the memory extension logic 142 and/or another logic may acknowledge the request to load the data into memory. Acknowledging the request (514) may indicate that the request to start loading data into memory was received. Alternatively or in addition, acknowledging the request (514) may indicate that the memory extension logic 142 and/or another logic has started to load the data. Alternatively or in addition, acknowledging the request (514) may indicate that the data has been loaded and/or was already loaded at the time the request was received.
- Acknowledging the request (514) may include additional status and/or data, such as a success/failure indication, one or more error codes, and/or one or more operation identifiers.
- the operation identifier(s) may be used to check the status of loading the data, such as by checking a completion queue and/or otherwise checking for an indication that the operation associated with the operation identifier is complete. Checking the completion queue and/or checking for an indication that the operation is complete may be performed periodically, such as with polling, and/or may be performed in response to an event, such as an interrupt indication from the memory extension logic 142 and/or other logic.
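- a hedged C sketch of this pattern, with a hypothetical completion queue and operation identifier scheme: the requester polls the queue until an entry matching the identifier returned with the acknowledgment (514) appears.

```c
/* Sketch: acknowledge a load request with an operation identifier, then poll
 * a completion queue until an entry with that identifier appears. */
#include <stdbool.h>
#include <stdio.h>

#define CQ_SIZE 16

struct completion_queue {
    unsigned long ids[CQ_SIZE];
    int count;
};

/* Would normally be filled by the memory extension logic as loads finish. */
static void post_completion(struct completion_queue *cq, unsigned long id)
{
    if (cq->count < CQ_SIZE)
        cq->ids[cq->count++] = id;
}

/* Poll: return true if the operation identified by id has completed. */
static bool poll_completion(const struct completion_queue *cq, unsigned long id)
{
    for (int i = 0; i < cq->count; i++)
        if (cq->ids[i] == id)
            return true;
    return false;
}

int main(void)
{
    struct completion_queue cq = { .count = 0 };
    unsigned long op_id = 42;   /* identifier returned with the acknowledgment */

    printf("complete? %s\n", poll_completion(&cq, op_id) ? "yes" : "no");
    post_completion(&cq, op_id);          /* data finished loading */
    printf("complete? %s\n", poll_completion(&cq, op_id) ? "yes" : "no");
    return 0;
}
```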
- the client logic 112 and/or another logic may perform other operations.
- the client logic 112 and/or another logic may perform a context switch and/or may cause the processor 130 to execute instructions for a different logic, such as a second application logic. Performing the context switch may enable the processor 130 and/or other logic to perform meaningful work while the data is loaded.
- the client logic 112 and/or another logic may complete handling the page fault and/or other request, indication, and/or signal (516).
- Completing handling the page fault and/or other request, indication, and/or signal (516) may include updating one or more of the memory presence indicator(s) to indicate that the data is present, valid, available, accessible, and/or writable.
- completing handling the page fault and/or other request, indication, and/or signal (516) may include performing a context switch back to the logic that caused the initial request to access memory (502), may include marking the logic as runnable, and/or may include indicating that the logic is no longer waiting for data to be loaded into memory, such as by releasing a lock primitive associated with the portion being accessed.
- Lock primitives may be logic and/or data structures which enforce a concurrency control policy.
- Examples of lock primitives include binary semaphores, counting semaphores, spinlocks, readers-writers spinlocks, mutexes, recursive mutexes, and/or readers-writers mutexes.
- Lock primitives may include distributed locks and/or may be provided by a distributed lock manager. Examples of distributed locks and/or distributed lock managers include VMScluster, Chubby, ZooKeeper, Etcd, Redis, Consul, Taooka, OpenSSI, and/or Corosync and/or any other distributed lock(s) and/or distributed lock manager(s) known now or later discovered.
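- One possible arrangement of operations (508), (510), and (516) in a fault handler is sketched below; the helper functions and the single lock are hypothetical placeholders, not details specified by the disclosure.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers standing in for operations described for FIG. 5. */
extern bool     portion_is_present(uintptr_t addr);           /* check (508) */
extern uint64_t send_data_load_request(uintptr_t addr);       /* request (510) */
extern void     wait_or_do_other_work(uint64_t op_id);        /* context switch / poll */
extern void     mark_portion_present(uintptr_t addr);         /* update indicator */

static pthread_mutex_t portion_lock = PTHREAD_MUTEX_INITIALIZER;

/* Handle a page fault (506) raised for 'addr'. */
static void handle_page_fault(uintptr_t addr)
{
    pthread_mutex_lock(&portion_lock);              /* serialize faults on the portion */
    if (!portion_is_present(addr)) {                /* (508) */
        uint64_t op = send_data_load_request(addr); /* (510) */
        wait_or_do_other_work(op);                  /* other work proceeds meanwhile */
        mark_portion_present(addr);                 /* part of completing (516) */
    }
    pthread_mutex_unlock(&portion_lock);            /* release the lock primitive (516) */
    /* On return, the faulting access (502) may be retried. */
}
```

- a single lock is used above only to keep the sketch short; a per-portion lock primitive, as described above, could reduce contention when many faults are handled concurrently.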
- the processor 130 and/or other logic may proceed to attempt to access memory (502). For example, if as part of completing handling the page fault and/or other request, indication, and/or signal (516), the client logic 112 and/or other logic performed a context switch back to the logic that caused the initial request to access memory (502), then the processor 130 may attempt to re-execute the same instruction that caused the initial page fault and/or other request, indication, and/or signal (506).
- when the processor 130 and/or other logic checks the memory presence indicator(s) (504), it may find the data to be present, valid, available, accessible, and/or writable.
- the processor 130 and/or another logic may request the data (518) via the interconnect(s) 170, 180.
- Requesting the data (518) may include requesting one or more cache line fills via the interconnect(s) 170, 180.
- alternatively, if the data is already present in one or more caches, the data may be read from the cache instead.
- the cache line fill request may be directed to the memory extension logic 142, such as via the physically-addressable interface and/or hardware-accessible interface.
- the cache line fill request may be directed to another logic and/or component, such as the memory controller 120, the memory 110 of the client 100, and/or the memory 144 of the memory extension module 140.
- the memory extension logic 142 and/or another logic may determine if the data associated with the request and/or determination (502) is included in memory 110, 144 (520). Determining if the data is included in memory (520) may include checking one or more data structures, such as one or more mapping data structures, one or more memory presence indicators, and/or any data structure(s) capable of indicating presence or non-presence of data for one or more portions of memory and/or address space.
- the checked data structure(s) (520) may be the same as those evaluated (504) by the processor 130 and/or other logic and/or may be the same as those evaluated (508) by the client logic 112 and/or other logic.
- one or more different data structures may be checked (520), such as a bitmask of presence indicators, and/or a different page table.
- if the memory extension logic 142 and/or other logic determines (520) that the data is included in memory, the memory extension logic 142 and/or other logic may proceed to respond (524) to the data request (518). Alternatively or in addition, if the memory extension logic 142 and/or other logic determines (520) that the data is not included in memory, the memory extension logic 142 and/or other logic may load the data into memory (522). Loading the data into memory (522) may be and/or may include one, more, and/or all of the operations as described for FIG. 3B.
- the memory extension logic 142 and/or other logic may respond (524) to the data request (518).
- the response (524) may include the requested data, and/or the response may indicate that the data is available, such as in a cache and/or memory 110, 144.
- the memory access (502) may be complete (526), and/or the processor 130 and/or other logic may continue one or more previous operations. For example, if the processor 130 was previously executing computer executable instructions included in the application logic 114 that instructed it to access memory (502), the processor 130 may resume executing the computer executable instructions.
- the operations depicted in FIG. 5 may occur independently, concurrently, repeatedly, and/or may be interleaved.
- multiple memory allocation and/or memory freeing operations may be executed concurrently, serially, repeatedly, and/or interleaved.
- multiple flows of the operations depicted may be in progress concurrently, such as if multiple execution cores and/or multiple hardware threads are accessing memory.
- the multiple flows may execute the same operations as each other in coordination, and/or the flows may operate independently.
- the logic(s) may include multiple instances of portions of the logic to enable concurrent execution, such as the processor 130 may include multiple execution cores and/or multiple hardware threads.
- the memory extension logic 142 may include multiple state machine logics for handling multiple requests (510), (518) for data concurrently.
- two or more of the steps of FIG. 5 may be combined. For example, starting to load data into memory (512) may be combined with acknowledging the request to load data (514).
- other sequences of operations may be equally effective in response to the memory access (502).
- such sequences may include additional operations, some operations may be skipped, and/or some operations may be performed in a different order than described.
- additional operations may be performed to update one or more memory presence indicators, to update one or more data structures representing memory access statistics, and/or to measure performance of the operations of FIG. 5 and/or any other operations of the system.
- the memory extension logic 142 and/or other logic may acknowledge the request (514) after waiting for the data to be loaded, and/or after some amount of time has passed.
- the amount of time may be fixed and/or may be variable.
- the memory extension logic 142 and/or another logic may measure the amount of time between two or more operations, such as acknowledging the data load request (514) and the processor requesting the data (518), and/or the memory extension logic 142 and/or other logic may calculate statistics related to the expected time it would take for the processor 130 and/or other logic(s) to react to completing handling of the page fault and/or other request, indication, and/or signal (516).
- the memory extension logic 142 and/or other logic may calculate and/or measure statistics related to the average amount of time waiting for the data request (518). In other examples, the memory extension logic 142 and/or other logic may measure the amount of time between requesting the data (518) and responding to the data request (524).
- the memory extension logic 142 and/or other logic may calculate, for one or more processors 130 and/or other logics that access memory, one or more of an average, mode, median, mean, variance, standard deviation, any other values related to the time measurements and/or to any other quantity, and/or a combination of these and/or other values, and/or the memory extension logic 142 and/or other logic may adjust the amount of time for acknowledging the request.
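- A sketch of how such timing statistics might be maintained incrementally is shown below, using Welford's online algorithm so that no history of samples needs to be stored; the structure and function names are assumptions for illustration.

```c
#include <math.h>
#include <stdint.h>

/* Running statistics over observed delays, e.g. between acknowledging a data
 * load (514) and the processor requesting the data (518). */
struct delay_stats {
    uint64_t count;
    double   mean;   /* nanoseconds */
    double   m2;     /* sum of squared differences from the running mean */
};

static void delay_stats_add(struct delay_stats *s, double sample_ns)
{
    s->count++;
    double delta = sample_ns - s->mean;
    s->mean += delta / (double)s->count;
    s->m2   += delta * (sample_ns - s->mean);
}

static double delay_stats_variance(const struct delay_stats *s)
{
    return s->count > 1 ? s->m2 / (double)(s->count - 1) : 0.0;
}

static double delay_stats_stddev(const struct delay_stats *s)
{
    return sqrt(delay_stats_variance(s));
}
```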
- the amount of time for acknowledging the request may be the same amount of time for some and/or all of the processor(s) 130 and/or other logic(s), and/or the amount of time for acknowledging the request may be different for one or more of the processor(s) 130 and/or other logic(s). Adjusting the amount of time for acknowledging the request may enable the memory extension logic 142 and/or other logic to minimize the time when the processor 130 and/or other logic is waiting for the request for data (518) and/or may maximize the amount of useful work that may be performed while waiting for handling of the page fault and/or other request, indication, and/or signal (506) to be completed (516).
- adjusting the amount of time for acknowledging the request may enable the memory extension logic 142 and/or other logic to minimize unnecessary waiting for handling of the page fault and/or other request.
- the memory extension logic 142 and/or other logic may increase the amount of time for acknowledging the request when the processor and/or other logic needs to wait longer than a threshold duration between requesting the data (518) and receiving a response (524).
- the memory extension logic 142 and/or other logic may decrease the amount of time for acknowledging the request when the time between completion of the data load (512) and the request for the data (518) is longer than a threshold duration.
- the memory extension logic 142 and/or other logic may calculate and/or determine the amount of time for acknowledging the request using any one or more approaches known now or later discovered for determining a value in response to feedback.
- the memory extension logic 142 and/or other logic may utilize one or more control systems and/or process control approaches, such as a Proportional Integral Derivative (PID) logic and/or circuit, to determine the amount of time for acknowledging the request.
- the memory extension logic 142 and/or other logic may calculate and/or determine the amount of time for acknowledging the request using one or more machine learning logics and/or approaches.
- Other examples may include one or more of and/or combination(s) of the same and/or different measurements, calculations, and/or determinations, as these are merely exemplary of presently-preferred embodiments for adjusting the amount of time for acknowledging the request.
- the one or more measurements, calculations, determinations, logics, circuits, and/or approaches used to determine the amount of time for acknowledging the request may be selected by a user and/or an administrator, may be configured, and/or may be selected based on any one or more policies, passed functions, steps, and/or rules that the memory extension logic 142 and/or other logic follows to determine which one or more measurements, calculations, determinations, logics, circuits, and/or approaches to use and/or which parameter(s) and/or configuration(s) to use with the one or more measurements, calculations, determinations, logics, circuits, and/or approaches.
- the one or more policies, passed functions, steps, and/or rules may use any available information, such as any one or more of the characteristics and/or configurations of the client(s) 100, the memory extension module(s) 140, the memory appliance(s), and/or the management server(s), to select the one or more selection strategies, portion replacement strategies, and/or page replacement strategies.
- a policy and/or passed function may specify to use a PID control strategy unless the determined amount of time exceeds a threshold, and/or to use a fixed value when exceeding the threshold.
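- A minimal sketch of a PID-style adjustment with the threshold fallback described above is shown below; the gains, threshold, fixed fallback value, and the definition of the error term are illustrative assumptions only, and a fixed sampling interval is assumed to be absorbed into the gains.

```c
/* Adjust the acknowledgement delay (514) from a measured error, e.g. the gap
 * between completion of the data load (512) and the data request (518) minus
 * a target gap. */
struct pid_state {
    double kp, ki, kd;      /* proportional, integral, derivative gains */
    double integral;
    double prev_error;
};

#define ACK_DELAY_MAX_NS   200000.0   /* assumed policy threshold */
#define ACK_DELAY_FIXED_NS  50000.0   /* assumed fixed fallback value */

static double next_ack_delay_ns(struct pid_state *pid,
                                double current_delay_ns, double error_ns)
{
    pid->integral += error_ns;
    double derivative = error_ns - pid->prev_error;
    pid->prev_error = error_ns;

    double adjustment = pid->kp * error_ns
                      + pid->ki * pid->integral
                      + pid->kd * derivative;
    double delay = current_delay_ns + adjustment;

    /* Policy: use the PID output unless it exceeds a threshold, in which case
     * fall back to a fixed value. */
    if (delay > ACK_DELAY_MAX_NS)
        delay = ACK_DELAY_FIXED_NS;
    if (delay < 0.0)
        delay = 0.0;
    return delay;
}
```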
- the client logic 112, the application logic 114, and/or another logic may indicate to the memory extension logic 142 portions of memory 110, 144 that may be likely and/or unlikely to be accessed in the future.
- the memory extension logic 142 may not have information usable to determine that one or more portions of memory 110, 144 are unlikely to be accessed soon relative to other portion(s).
- the memory extension logic 142 may observe the data being written to the memory 144 of the memory extension module and may treat the data as being recently accessed.
- the client logic 112, the application logic 114, and/or other logic may indicate that the data written to the memory 144 of the memory extension module 140 is unlikely to be accessed soon.
- indicating in this way that the data written to the memory 144 of the memory extension module 140 is unlikely to be accessed soon may be advantageous in that the memory extension logic 142 and/or another logic may make more accurate predictions of which portions are likely/unlikely to be accessed in the near future and/or may be less likely to reclaim portions of active memory 110, 144 and/or to need to load data into memory (512), (522) that had recently been removed from memory by a reclaim operation.
- FIG. 6 illustrates an example flowchart for portion migration with cold portion notification.
- the client logic 112 and/or another logic, such as an operating system, may perform the operations shown in FIG. 6 and/or other figures herein.
- the client logic 112 and/or another logic may receive a portion migration request (602).
- the portion migration request (602) may be any one or more mechanisms that may trigger the operations described for FIG. 6. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc.
- the request (602) may be received from another component and/or logic, such as the application logic 114, and/or the request (602) may be received from the client logic 112, such as by the client logic 112 making a determination to perform portion migration.
- the client logic 112 and/or any other one or more components and/or logics may make the determination independently and/or in coordination with each other and/or with any other one or more components and/or logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason.
- the client logic 112, the application logic 114, and/or another logic may determine that one or more portions of the memory 110 of the client 100 are unlikely to be accessed in the near future and/or may determine that data of the portion(s) should be migrated to one or more portion(s) of the memory 144 and/or backing store 146 of the memory extension module 140.
- the client logic 112 and/or another logic may migrate one or more portions of memory (604).
- Migrating one or more portions of memory (604) may include copying data contained in the portions from one part of memory 110, 144 to one or more second portions in another part of memory 110, 144 and/or copying/transferring/re-associating associated metadata and/or portion-tracking data structures from the portion(s) to the second portion(s).
- Associated metadata may include one or more flags, lock primitives, references, indices, stored values, counters, usage counters, lists, and/or collections related to the portion(s) of memory.
- the associated metadata may be and/or may include the portion tracking data structures.
- the portion tracking data structures may be and/or may include the associated metadata.
- migrating the one or more portions of memory may include updating one or more data structures, collections, and/or mapping data structures to associate the one or more second portions and/or other portions with one or more portions of address space previously associated with the one or more portions of memory.
- a radix tree, an extensible array, and/or a page table may be updated to reference the second portion(s) instead of the one or more portions being migrated (604).
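- For illustration, migrating a portion (604) and updating a flat mapping data structure might look like the sketch below; a real system could use a radix tree or page table instead, and all names and sizes here are assumptions.

```c
#include <stdint.h>
#include <string.h>

#define PORTION_SIZE 4096   /* assumed portion granularity */

/* Hypothetical flat map from address-space portion index to its backing memory. */
struct portion_map {
    void   **backing;   /* backing[i] = memory currently holding portion i */
    uint8_t *flags;     /* per-portion metadata, e.g. dirty or locked bits */
};

/* Migrate portion 'idx' from its current location to 'dest' (604). */
static void migrate_portion(struct portion_map *map, size_t idx, void *dest)
{
    void *src = map->backing[idx];
    memcpy(dest, src, PORTION_SIZE);   /* copy the data to the second portion */
    /* Metadata indexed by 'idx' stays associated with the portion; flags could
     * be updated here if they encode the portion's location. */
    map->backing[idx] = dest;          /* update the mapping data structure */
    /* 'src' may now be freed, reused, or offered as a reclaim candidate. */
}
```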
- the client logic 112 and/or another logic may perform one or more cold portion notifications (606).
- Performing the cold portion notification(s) may include notifying a receiving logic, such as the memory extension logic 142, the application logic 114, the client logic 112, and/or another logic of one or more portions of memory that may be unlikely to be accessed in the near future.
- a portion may be considered unlikely to be accessed when its likelihood of access is below a threshold probability, such as 50% and/or any other probability value.
- the threshold probability may be fixed, configurable, and/or calculated.
- the calculation may be performed using any information available to the logic performing the calculation, such as portion access information, the portion tracking data structures, the reclaim candidates, an available memory determination, and/or any other information, such as any information described herein.
- the receiving logic may update one or more data structures, flags, and/or stored values in such a way as to increase and/or decrease the probability that the one or more portions of memory indicated by the cold portion notification(s) would be selected for reclaim and/or would be included as reclaim candidates.
- the receiving logic may update one or more stored timestamps, generation counters, flags, and/or other data and/or metadata in such a way as to increase the probability that the portion(s) will be selected for reclaim, such as by subtracting an offset from the timestamp(s) and/or generation counter(s) and/or by setting and/or clearing one or more flags.
- the receiving logic may update one or more collections, such as by removing the portion(s) from the collection of active portions and adding the portion(s) to the collection of inactive portions, as described elsewhere in this disclosure.
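- The sketch below illustrates one way a receiving logic might react to a cold portion notification by aging the portion's tracking metadata and moving it from an active collection to an inactive collection; the list helpers and the aging offset are assumptions for this example.

```c
#include <stdint.h>

struct portion_entry {
    struct portion_entry *prev, *next;
    uint64_t last_access;    /* timestamp or generation counter */
    uint32_t flags;
};

struct portion_list {
    struct portion_entry *head;
};

/* Hypothetical intrusive-list helpers. */
extern void list_remove(struct portion_list *l, struct portion_entry *e);
extern void list_push_tail(struct portion_list *l, struct portion_entry *e);

#define COLD_AGE_OFFSET 1000   /* assumed bias applied to cold portions */

static void on_cold_portion_notification(struct portion_entry *e,
                                         struct portion_list *active,
                                         struct portion_list *inactive)
{
    /* Age the portion so it becomes a more likely reclaim candidate. */
    e->last_access = (e->last_access > COLD_AGE_OFFSET)
                   ? e->last_access - COLD_AGE_OFFSET : 0;
    /* Move it from the active collection to the inactive collection. */
    list_remove(active, e);
    list_push_tail(inactive, e);
}
```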
- FIG. 7 illustrates a memory architecture diagram of an example system providing multiple tiers of memory.
- the system may provide one or more tiers of memory.
- a tier may be a collection of memory with a common set of attributes, such as price, capacity, latency, bandwidth, operations per second, physical locality, network locality, logical locality, and/or any other attributes of the memory and/or of the device containing the memory.
- the attributes of a tier involving memory of a memory appliance may include any of the characteristics and/or configurations of the memory appliance.
- the attributes of a tier involving one or more memory extension modules 140 may include any characteristics and/or configurations of the memory extension module(s) 140, any of the components (e.g. 142, 144, 146, and/or 148) included in and/or associated with the memory extension module(s), and/or any interconnect(s) (e.g. 170 and/or 180) used to access the memory extension module(s) 140.
- the attributes of one tier may differ from those of another tier.
- price and performance may decrease for lower tiers while capacity increases. This may enable the system to naturally demote data from higher levels to lower levels as other data proves to be used more often and/or more recently.
- any one or more of price, performance, distance, latency, determinacy, size, hotness of data, predictive probability, latency sources, latency categories, latency modalities, power usage, and/or any other characteristics relevant to performance, cost, and/or suitability for one or more purposes known now and/or later discovered may be the same, similar, and/or different between different tiers.
- the highest-level tiers may be provided by the hardware of the client 100.
- level 1 may be provided by the L1 cache of the processor of the client 100
- level 2 may be provided by the L2 cache of the processor of the client 100
- level 3 may be provided by the L3 cache of the processor 130 of the client 100
- level 4 may be provided by the memory 110 of the client 100
- another level may be provided by the backing store 160 of the client 100.
- one or more tiers may be provided by one or more memory appliances.
- level 5 may be provided by one or more memory appliances with very low latency and/or high bandwidth
- level 6 may be provided by one or more memory appliances with higher latency, lower bandwidth, and/or higher capacity
- level 7 may be provided by the backing store of one or more memory appliances and/or of the client.
- one or more tiers may be provided by one or more memory extension modules 140.
- level 5 may be provided by one or more memory extension modules with high-bandwidth and/or low-latency memory, a large amount of memory relative to backing store, and/or in communication via a high performance interconnect
- level 6 may be provided by one or more memory extension modules with lower-bandwidth and/or higher-latency memory, a smaller amount of memory relative to backing store, and/or in communication via a lower performance interconnect
- level 7 may be provided by the backing store(s) of one or more memory extension modules and/or of the client 100.
- one or more memory extension modules may utilize the communication interface(s) 148 instead of and/or in addition to the backing store 146.
- the memory extension modules utilizing the communication interface(s) may be associated with a different tier than the memory extension modules utilizing the backing store. For example, memory extension modules utilizing the communication interface(s) to access one or more memory appliances may be in a lower numbered tier than memory extension modules utilizing flash memory as a backing store if the memory appliance(s) can be accessed with better performance than the flash memory.
- a logic such as the client logic 112 and/or the memory extension logic 142 may cause data for one or more portions of memory to be migrated to lower-numbered tiers by causing the data of the portions to be faulted-in at the desired level.
- the client logic 112 may attempt to read the data, causing the data to be loaded into the memory 110 of the client 100, into the memory 144 of the memory extension module(s) 140 and/or into one or more levels of processor cache of the client.
- the client logic 112 may pre-fetch the data, such as by issuing a pre-fetch request to an operating system of the client 100.
- the prefetch request may be a memory advisory request, indicating that the client logic 112 and/or another logic will need the data.
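- On systems that expose a POSIX-style memory advisory interface, such a pre-fetch advisory might be issued as sketched below; whether a particular client 100 uses this interface is an assumption of the example.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Advise the operating system that the range will be needed soon so that it
 * can be faulted in ahead of the access; 'addr' must be page-aligned. */
static int prefetch_range(void *addr, size_t length)
{
    if (madvise(addr, length, MADV_WILLNEED) != 0) {
        perror("madvise(MADV_WILLNEED)");
        return -1;
    }
    return 0;
}
```

- a corresponding hint that data is unlikely to be accessed soon could, on operating systems that support it, be expressed with advice values such as MADV_COLD, though the disclosure does not prescribe any particular advisory interface.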
- the client logic 112 may send a pre-fetch request to the memory extension logic 142 and/or another logic.
- the pre-fetch request may cause the data to be loaded into the memory 144 of the memory extension module(s) 140.
- the pre-fetch request may cause the data to be loaded into the memory of the memory appliance.
- the client logic 112 may send a pin request to the memory extension logic 142 and/or another logic.
- the pin request may cause the data to be loaded into the memory 144 of the memory extension module 140.
- the pin request may cause the data to be loaded into the memory of the memory appliance.
- a logic such as the client logic 112 and/or the memory extension logic 142 may cause the data for one or more portions of memory to be migrated away from lower-numbered tiers by unpinning the corresponding portions of memory and/or by causing the portions to be invalidated and/or reclaimed at the desired level. Causing the portions to be invalidated and/or reclaimed at the desired level may be as described elsewhere in this document.
- the client logic 112 may send an unpin request to the memory extension logic 142, and/or another logic. The unpin request may cause the data to be unpinned from the memory 144 of the memory extension module 140.
- the unpin request may cause the data to be unpinned from the memory of the memory appliance.
- the client logic 112 may send a reclaim request to the memory extension logic 142 and/or another logic.
- the reclaim request may cause the portions to be invalidated and/or reclaimed from the memory 144 of the memory extension module 140.
- the reclaim request may cause the portions to be invalidated and/or reclaimed from the memory of the memory appliance.
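- The pre-fetch, pin, unpin, and reclaim requests described above could be carried by a small request message such as the hypothetical one sketched below; the message layout, field names, and transport are not specified by the disclosure and are assumed here purely for illustration.

```c
#include <stdint.h>

/* Hypothetical request types a client logic might send to the memory
 * extension logic 142. */
enum tier_request_type {
    REQ_PREFETCH = 1,   /* load the data into the memory 144 */
    REQ_PIN      = 2,   /* load the data and keep it resident */
    REQ_UNPIN    = 3,   /* allow the data to be migrated and/or reclaimed again */
    REQ_RECLAIM  = 4    /* invalidate and/or reclaim the portions */
};

struct tier_request {
    uint32_t type;      /* one of enum tier_request_type */
    uint32_t reserved;
    uint64_t addr;      /* start of the affected address range */
    uint64_t length;    /* length of the range in bytes */
    uint64_t op_id;     /* identifier for tracking completion */
};

/* Placeholder transport; a real system might use a submission queue, a
 * doorbell register, and/or the communication interface(s) 148. */
extern int send_tier_request(const struct tier_request *req);
```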
- a logic such as the client logic 112 and/or the memory extension logic 142 may cause the data for one or more portions of memory to be migrated to lower-numbered tiers by performing a portion migration, such as described elsewhere in this disclosure, via NUMA page migration, and/or as described for FIG. 6.
- a logic such as the client logic 112 and/or the memory extension logic 142 may cause the data for one or more portions of memory to be migrated away from lower-numbered tiers by performing a portion migration.
- the operating system of the client, of the memory appliance, of the memory extension module 140, and/or of any other component may cause the data for one or more portions of the region 214 to be migrated away from lower-numbered tiers by causing the portions to be invalidated and/or reclaimed.
- latency may come from several sources and/or may take several forms. Sources of latency may be categorized in a number of ways to aid in explanation and/or understanding and/or to help visualize the dominant latency cost in a given system.
- FIG. 8 illustrates an example categorization of latency sources for computing systems.
- the categorization may include one or more categories, such as physics, digitization, computation, multitasking, virtual memory, networking, and/or storage effects.
- Other categorizations are possible and/or may include more, fewer, and/or different categories, the categories may be defined based upon the same and/or different aspects of the latency sources, and/or the categories may include more, fewer and/or different sources of latency than illustrated and/or described in this example.
- the latency values illustrated in FIG. 8 are only examples to aid in understanding relative differences in magnitude that may or may not exist between latency categories. Other latency values are possible for the same and/or different latency source categories. Alternatively or in addition, other arrangements of the categories and/or of other categories are possible.
- Latency sources of the physics category may be sources of latency related to properties, laws, theorems, axioms, and/or other aspects of physics, physical processes, physical properties, physical behaviors, and/or physical phenomena including, but not limited to: electrical effects, magnetic effects, optical effects, quantum effects, effects caused by certain materials, the speed of light, the speed of electromagnetic wave propagation, signal propagation, signal detection, electrical/optical/other interference, resistance, capacitance, inductance, impedance, permittivity, and/or any other sources of latency related to physics known now or later discovered. Some latency sources of this and/or other categories may vary based on distance, scale, and/or other properties.
- signal propagation latency may be proportional to the distance the signal needs to propagate.
- signal propagation latency may vary based on materials used/involved, such as conductor material, fiber optic material, atmospheric composition/density, vacuum, etc. Latency sources of the physics category may be more prevalent at very small scales (e.g. quantum effects, capacitance, etc.) and/or at very large scales (e.g. signal propagation in very long wires/cables, interference, resistance, etc.).
- Typical latency values due to sources of the physics category may be approximately 10⁻¹ ns or less, particularly for very small scales, but latency values may vary widely and/or may be outside of this range. For example, at very large scales, signal propagation delays can take several seconds or more, such as when sending electromagnetic signals to/from spacecraft, other planets, etc.
- Latency sources of the digitization category may be sources of latency related to digital logic, digital signal transmission, analog-to-digital conversion, digital-to-analog conversion, modulation, demodulation, digital amplification/retransmissions, clock signals, clock domains, synchronization, serialization, deserialization, and/or any other sources related to digital signals, digital values, digital logic, and/or digital communication known now or later discovered.
- digitization latency may vary based on the complexity and/or number of logic devices that a signal and/or information may traverse, such as the type of and/or number of logic gates in a logic path and/or a number of pipeline stages involved.
- Typical latency values due to sources of the digitization category may range between 10⁻¹ and 10⁰ ns, though other latency values are also possible.
- latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s).
- Latency sources of the computation category may be sources of latency related to computation and/or other digital logic that may perform meaningful calculations, determinations, comparisons, and/or decisions, including, but not limited to instruction execution, number of instructions, instruction complexity, data dependencies, pipeline stalls, branch prediction/misprediction, pipeline flushing, cache access stalls, memory access stalls, inter-processor communication, cache coherency, and/or any other sources related to calculations, determinations, comparisons, decisions, and/or any other higher-level logic known now or later discovered.
- Typical latency values due to sources of the computation category may range between 10⁰ and 10¹ ns, though other latency values are also possible.
- Latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s).
- Latency sources of the multitasking category may be sources of latency related to multiple logics operating with a processor and/or other computing device, including, but not limited to context switches, interrupt latency, synchronization, locking, data dependencies, lock contention, and/or any other sources related to multitasking known now or later discovered.
- Typical latency values due to sources of the multitasking category may range between 10¹ and 10² ns, though other latency values are also possible.
- latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s).
- latency values of this category may vary widely.
- Latency sources of the virtual memory category may be sources of latency related to the use, mapping, translation, and/or maintenance of virtual addresses, virtual address space(s), page tables, page table entries, translation lookaside buffer(s) (TLB), and/or any other aspects of virtual memory known now or later discovered, including, but not limited to address translation, TLB misses, page faults, page table manipulation, page table creation, minor page faults, major page faults, copy-on-write page faults, zeroing page faults, page reclaim activity, background reclaim, direct reclaim, and/or any other sources related to virtual memory known now or later discovered.
- Examples of page faults and/or major page faults may include page faults that decompress and/or decrypt data (such as compressed and/or encrypted data held in memory), page faults that read from remote memory, page faults that read from memory that is slower than main memory (such as flash memory), page faults that read from disk, page faults that read from remote disk, page faults that generate content dynamically and/or computationally, any other type of page fault known now or later discovered, and/or combinations of these and/or other types of page faults.
- Typical latency values due to sources of the virtual memory category may range between 10² and 10³ ns, though other latency values are also possible.
- latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s).
- latency values of this category may vary widely, particularly in examples where data is being transferred over one or more networks and/or between one or more mechanical storage devices.
- Latency sources of the networking category may be sources of latency related to communication, networks, interconnects, communication interfaces, cables, ports, connectors, antennas, signal splitters, signal combiners, hubs, switches, routers, and/or any other device, component, and/or aspect of communication, networking, and/or aggregation of digital systems and/or components, including, but not limited to, packetizing, reassembly, buffering, queuing, switching, routing, flow control, congestion, error detection, error correction, data corruption, packet loss, and/or any other sources related to networking and/or communication known now or later discovered.
- Typical latency values due to sources of the networking category may range between 10³ and 10⁵ ns, though other latency values are also possible.
- latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s).
- latency values of this category may vary widely.
- Latency sources of the storage category may be sources of latency related to data storage and/or retrieval, including, but not limited to, memory reading, memory writing, memory erasing, disk head positioning, disk rotation, storage allocation, fragmentation, defragmentation, tape feeding, media retrieval, and/or any other electrical, mechanical, and/or other effects related to storage and/or retrieval of data known now or later discovered.
- Typical latency values due to sources of the storage category may be greater than 10⁴ ns, though other latency values are also possible.
- latency values for these and/or other sources of latency may vary based on the storage medium and/or storage device being used. Alternatively or in addition, such as in cases of heavy storage use, thrashing, and/or other effects, latency values of this category may vary widely.
- the systems, methods, devices, techniques, and/or approaches described herein may improve the effectiveness of computing systems, such as in cases where experienced latency values are outside typical ranges of a given latency category. For example, in systems where memory access latency is expected to be in and/or near a range typical of the computation latency category, such as ~10⁰ ns to ~10¹ ns, if the memory access latency actually experienced is much higher, such as above ~10³ ns due to network latency related to retrieving data over the communication interface 148 and/or above ~10⁵ ns due to storage latency related to retrieving data from the backing store 146, then the computing system may not perform as expected.
- the processor 130 may stall one or more pipelines that may be experiencing data dependencies related to the memory access. While the computing system may be designed to accommodate pipeline stalls typical of memory accesses related to computation latency, such as by incorporating deep pipelines and/or aggressive branch prediction, it may be impractical to accommodate memory access latencies that may be orders of magnitude longer in duration. As a result, all or a portion of the processor may stay idle for long periods of time, reducing its effectiveness. However, if the computing system incorporates the systems, methods, techniques, and/or approaches described herein, it may avoid pipeline stalls that may exceed the system’s design parameters and/or that may reduce its effectiveness.
- the client 100, and/or the memory extension module(s) 140 may be configured in any number of ways.
- the memory extension module 140 may be included in a computer.
- the processor may be the CPU (and/or any other computing device and/or logic, such as a GPU, an APU, an FPGA, an ASIC, a system on chip (SoC), etc.) of the computer
- the memory may be the memory of the computer
- the computer may include the interconnects 170, 180.
- the memory extension module 140 may be a peripheral of a computer, including but not limited to a PCI device, a PCI-X device, a PCIe device, an HTX (HyperTransport expansion) device, a CXL device, or any other type of peripheral, internally or externally connected to a computer.
- the memory extension module 140 may be added to a computer or another type of computing device that accesses data in the memory extension module 140.
- the memory extension module 140 may be a device installed in a computer, where the client 100 is a process executed by a CPU of the computer.
- the memory in the memory extension module 140 may be different than the memory accessed by the CPU of the computer.
- the processor in the memory extension module 140, if present, may be different than the CPU of the computer.
- the client 100 and/or the memory extension module 140 may be implemented using a Non-Uniform Memory Architecture (NUMA).
- the processor may comprise multiple processor cores connected together via a switched fabric of point-to-point links.
- the memory controller may include multiple memory controllers. Each one of the memory controllers may be electrically coupled to a corresponding one or more of the processor cores. Alternatively, multiple memory controllers may be electrically coupled to each of the processor cores. Each one of the multiple memory controllers may service a different portion of the memory than the other memory controllers.
- the processor 130 of the client 100 and/or the memory extension module 140 may include multiple processors that are electrically coupled to the interconnect(s) 170, 180, such as with a bus.
- Other components of the client 100 and/or the memory extension module 140 such as multiple memories included in the memory, the communication interface, the memory controller, and/or the storage controller may also be electrically coupled to the interconnect.
- the memory extension system may include multiple clients 100, multiple memory extension modules 140, multiple memories 110, 144, multiple backing stores 160, 146, multiple memory extension logics 142, multiple client logics 112, and/or multiple application logics 114.
- the client 100 may provide additional services to other systems and/or devices.
- the client 100 may include a Network Attached Storage (NAS) appliance.
- the client 100 may include a Redundant Array of Independent Disks (RAID) head.
- the client 100 and/or the memory extension module 140 may include a resiliency logic, such as described in US provisional application 62/540,259, filed August 2, 2017, and U.S. Non-provisional patent application Ser. No.
- the client 100 may include and/or may be in communication with multiple memory extension modules 140.
- the memory extension logic 142, the resiliency logic, the client logic 112, and/or another logic may perform data resiliency operations, such as described therein.
- the resiliency logic may be included in and/or combined with one or more other logics, such as the client logic 112, the application logic 114, and/or the memory extension logic 142.
- the client 100 may provide file-level access to data stored in the memory extension module 140.
- the client 100 may include a database, such as an in-memory database.
- the client 100 may operate as a memory appliance for one or more other clients.
- multiple clients 100 may utilize one or more memory extension modules 140 as shared memory.
- the clients 100 may include or interoperate with an application logic 114 that relies on massive parallelization and/or sharing of large data sets.
- examples of application logic that may use massive parallelization include logic that performs protein folding, genetic algorithms, seismic analysis, or any other computationally intensive algorithm and/or iterative calculations where each result is based on a prior result.
- the application logic 114 may store application data, application state, and/or checkpoint data in the regions of the one or more memory extension modules 140 and/or in one or more memory appliances.
- the additional capabilities of the one or more memory extension modules and/or memory appliances may be exploited by the clients in order to protect against application crashes, a loss of power to the clients 100, or any other erroneous or unexpected event on any of clients 100.
- the clients 100 may access the one or more memory extension modules 140 and/or memory appliance(s) in a way that provides for atomic access.
- the memory access operations requested by the clients may include atomic operations, including but not limited to a fetch and add operation, a compare and swap operation, or any other atomic operation now known or later discovered.
- An atomic operation may be a combination of operations that execute as a group or that do not execute at all.
- the result of performing the combination of operations may be as if no operations other than the combination of operations executed between the first and last operations of the combination of operations.
- the clients may safely access the one or more memory extension modules 140 and/or memory appliances without causing data corruption.
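- For illustration, the host-side semantics of the fetch-and-add and compare-and-swap operations mentioned above can be expressed with C11 atomics as sketched below; how a memory extension module and/or memory appliance implements equivalent atomicity is not specified by this example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Fetch-and-add: atomically add 'delta' to the shared counter and return the
 * previous value; no partial update is visible to other clients. */
static uint64_t shared_fetch_add(_Atomic uint64_t *counter, uint64_t delta)
{
    return atomic_fetch_add(counter, delta);
}

/* Compare-and-swap: install 'desired' only if the value still equals
 * 'expected'; returns true on success, false if another client changed it. */
static bool shared_compare_swap(_Atomic uint64_t *value,
                                uint64_t expected, uint64_t desired)
{
    return atomic_compare_exchange_strong(value, &expected, desired);
}
```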
- the application logic 114, the client logic 112, the memory extension logic 142, and/or any other one or more logics described herein may be co-located, separated, or combined.
- the actions performed by combined logic may perform the same or similar feature as the aggregate of the features performed by the logics that are combined.
- all logics may be co-located in a single device.
- the memory extension logic 142 may be split into multiple logics.
- the client logic 112 and the application logic 114 may be combined into a single logic.
- the client logic 112 and the application logic 114 may be partially combined. For example, some of the functionality of each logic may be separated from the logic described herein as implementing the functionality and/or may be combined into a new logic. Other combinations of the various components are possible, just a few of which are described here.
- the application logic 114, the client logic 112, the memory extension logic 142, and/or any other logic described herein may include computer code.
- the computer code may include instructions executable with the processor 130 and/or any other processor, such as the processor of the memory extension logic 142, if included in the memory extension logic.
- the computer code may be written in any computer language now known or later discovered, such as assembly language, C, C++, C#, Java, JavaScript, Python, Go, R, Swift, PHP, Dart, Kotlin, MATLAB, Perl, Ruby, Rust, and/or Scala, or any combination thereof.
- the computer code may be firmware.
- all or a portion of the application logic 114, the client logic 112, the memory extension logic 142, any other logic(s) described herein, and/or the processor may be implemented as a circuit.
- the circuit may include an FPGA (Field Programmable Gate Array) configured to perform the features of the application logic 114, the client logic 112, the memory extension logic 142, and/or the other logic(s).
- the circuit may include an ASIC (Application Specific Integrated Circuit) configured to perform the features of the application logic 114, the client logic 112, the memory extension logic 142, any other logic(s) described herein, and/or the processor 130.
- the circuit may be embedded in a chipset, a processor, and/or any other hardware device.
- a portion of the application logic 114, the client logic 112, the memory extension logic 142, and/or the processor 130 may be implemented as part of the one or more communication interfaces or other hardware components.
- the memory extension logic 142 may be partially or wholly incorporated in and/or executed by a communication interface that is in communication with the processor 130 via the interconnect(s) 170, 180.
- in examples where the memory extension logic 142 includes hardware logic, the hardware logic may be physically combined with the communication interface and/or other hardware component(s).
- the computer executable code may be executed by a processor included in the communication interface and/or other hardware component(s).
- suitable communication interfaces may include PCIe interfaces, InfiniBand interfaces, Gen-Z interfaces, Hypertransport interfaces, QPI interfaces, UPI interfaces, and/or CXL interfaces.
- the client logic 112 may be partially or wholly incorporated in and/or executed by a memory controller, such as the memory controller 120 of the client 100, and/or another memory access device, such as a PCIe controller/device and/or a CXL controller/device.
- each module or unit such as the client logic unit, the application logic unit, the memory extension logic unit, the configuration unit, and/or any other logic described herein may be hardware or a combination of hardware and software.
- each module may include an application specific integrated circuit (ASIC), a Field Programmable Gate Array (FPGA), a circuit, a digital logic circuit, an analog circuit, a combination of discrete circuits, gates, or any other type of hardware or combination thereof.
- each module may include memory hardware, such as a portion of the memory 110, for example, that comprises instructions executable with the processor 130 or other processor to implement one or more of the features of the module.
- each module may or may not include the processor.
- each module may just be the portion of the memory 110, 144 or other physical memory that comprises instructions executable with the processor 130 or other processor to implement the features of the corresponding module without the module including any other hardware. Because each module includes at least some hardware even when the included hardware comprises software, each module may be interchangeably referred to as a hardware module.
- a processor may be implemented as a microprocessor, microcontroller, application specific integrated circuit (ASIC), discrete logic, or a combination of other type of circuits or logic.
- memories may be DRAM, SRAM, SDRAM, ADRAM, DDR RAM, FPM, EDO DRAM, RDRAM, CDRAM, Flash, and/or any other type of memory, such as those described and/or listed herein. Flags, data, databases, tables, entities, and other data structures may be separately stored and managed, may be incorporated into a single memory or database, may be distributed, or may be logically and physically organized in many different ways.
- the components may operate independently or be part of a same program.
- the components may be resident on separate hardware, such as separate removable circuit boards, or share common hardware, such as a same memory and processor for implementing instructions from the memory.
- Programs may be parts of a single program, separate programs, or distributed across several memories and processors.
- when it is said that an interface is performing some action, the action may be performed by the interface directly, and/or the action may be performed by a logic which provides and/or implements the interface, such as by an interface-implementing logic.
- the interface-implementing logic may be any logic which provides the interface and/or implements the logic that performs actions described as being performed by the interface.
- the interface-implementing logic may be and/or may include any of the logics described herein.
- the interface-implementing logic for the corresponding data interface(s) may be and/or may include the client logic 112.
- the actions performed may be performed in response to an invocation of the interface.
- the actions performed may be performed in response to an invocation of the corresponding interface.
- the respective logic, software or instructions for implementing the processes, methods and/or techniques discussed throughout this disclosure may be provided on computer-readable media or memories or other tangible media, such as a cache, buffer, RAM, removable media, hard drive, other computer readable storage media, or any other tangible media or any combination thereof.
- the tangible media include various types of volatile and nonvolatile storage media.
- the functions, acts or tasks illustrated in the figures or described herein may be executed in response to one or more sets of logic or instructions stored in or on computer readable media.
- the functions, acts or tasks are independent of the particular type of instruction set, storage media, processor or processing strategy and may be performed by software, hardware, integrated circuits, firmware, micro code, or any type of other processor, operating alone or in combination.
- processing strategies may include multiprocessing, multitasking, parallel processing and/or any other processing strategy known now or later discovered.
- the instructions are stored on a removable media device for reading by local or remote systems.
- the logic or instructions are stored in a remote location for transfer through a computer network or over telephone lines.
- the logic or instructions are stored within a given computer, CPU, GPU, or system.
- the logic illustrated in the flow diagrams may include additional, different, or fewer operations than illustrated.
- the operations illustrated may be performed in an order different than illustrated.
- a second action may be said to be "in response to" a first action independent of whether the second action results directly or indirectly from the first action.
- the second action may occur at a substantially later time than the first action and still be in response to the first action.
- the second action may be said to be in response to the first action even if intervening actions take place between the first action and the second action, and even if one or more of the intervening actions directly cause the second action to be performed.
- a second action may be in response to a first action if the first action sets a flag and a third action later initiates the second action whenever the flag is set.
- the phrases "at least one of ⁇ A>, ⁇ B>, ... and ⁇ N>” or “at least one of ⁇ A>, ⁇ B>, ... ⁇ N>, or combinations thereof" or " ⁇ A>, ⁇ B>, ... and/or ⁇ N>” are defined by the Applicant in the broadest sense, superseding any other implied definitions hereinbefore or hereinafter unless expressly asserted by the Applicant to the contrary, to mean one or more elements selected from the group comprising A, B, ... and N. In other words, the phrases mean any combination of one or more of the elements A, B, ...
Abstract
The present disclosure describes example implementations of memory extension modules that provide scalable cache-coherent primary memory to a computing system using a variety of memory media while minimizing CPU stalling costs. In some example implementations, the memory extension module may include a memory, a backing store, and a memory extension logic that receives memory access requests and controls data migration between the backing store and the memory and/or provides data access according to the request, assisted by a memory tracking data structure maintained in the memory of the memory extension module.
Description
CACHE-COHERENT MEMORY EXTENSION WITH CPU STALL AVOIDANCE
CROSS REFERENCE
[0001] This application is based on and claims the benefit of priority to U.S. Provisional Patent Application No. 63/448,442, filed on February 27, 2023, the entirety of which is herein incorporated by reference.
BACKGROUND
1. Technical Field.
[0002] This application relates to memory and, in particular, to memory extensions.
2. Related Art.
[0003] Under some circumstances, computing systems may encounter a scenario where memory usage nearly equals, equals, or exceeds available local primary memory. Such computing systems suffer from a variety of drawbacks, limitations, and disadvantages. Accordingly, there is a need for inventive systems, methods, components, and apparatuses described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] The embodiments may be better understood with reference to the following drawings and description. The components in the figures are not necessarily to scale. Moreover, in the figures, like-referenced numerals designate corresponding parts throughout the different views.
[0005] FIG. 1 illustrates a hardware diagram of an example memory extension system;
[0006] FIG. 2 illustrates a hardware diagram of an example memory extension system using an external memory extension module;
[0007] FIG. 3A illustrates an example flowchart for a request for reclaim candidates;
[0008] FIG. 3B illustrates an example flowchart for starting to load data into memory;
[0009] FIG. 4A illustrates an example flowchart for allocating memory;
[0010] FIG. 4B illustrates an example flowchart for freeing memory;
[0011] FIG. 5 illustrates an example flowchart for a memory access;
[0012] FIG. 6 illustrates an example flowchart for page migration;
[0013] FIG. 7 illustrates a memory architecture diagram of an example system providing multiple tiers of memory; and
[0014] FIG. 8 illustrates an example categorization of latency sources for computing systems.
DETAILED DESCRIPTION
[0015] The present disclosure provides a technical solution to solve a technical problem of providing scalable cache-coherent primary memory to a computing system using a variety of memory media while minimizing CPU stalling costs.
[0016] FIG. 1 illustrates a hardware diagram of an example memory extension system. The memory extension system may include a client 100. The memory extension system may include more, fewer, or different elements. For example, the memory extension system may include multiple clients. Alternatively, or in addition, the memory extension system may include one or more memory appliances and/or management servers, such as described in U.S. Non-provisional patent application Ser. No. 14/530,908, filed November 3, 2014, and U.S. Non-provisional patent application Ser. No. 15/424,395, filed February 3, 2017, each of which is hereby incorporated by reference. Alternatively, the memory extension system may include just the client.
[0017] The client 100 may be any machine or a device that uses memory as described herein. For example, the client 100 may be a server, a device, an embedded system, a circuit, a chipset, an integrated circuit, a field programmable gate array (FPGA), an application-specific integrated circuit, a virtual machine, a virtualization instance, a container, a jail, a zone, an operating system, a kernel, a device driver, a device firmware, a hypervisor service, a cloud computing interface, and/or any other hardware, software, and/or firmware entity which may perform the same functions as described. The client 100 may include a memory 110, a memory controller 120, a processor 130, and/or a memory extension module 140. The client 100 may include more, fewer, or different components. For example, the client 100 may include a storage controller 150, a backing store 160, multiple storage controllers, multiple backing stores, multiple memories, multiple memory controllers, multiple processors, multiple memory extension modules and/or any combination thereof. Alternatively, the client 100 may just include a process executed by the processor 130.
[0018] The storage controller 150 of the client 100 may include a component that facilitates storage operations to be performed on the backing store 160. A storage operation may include reading from or writing to locations within the backing store 160. The storage controller 150 may include a hardware component. Alternatively or in addition, the storage controller 150 may include a software component.
[0019] The backing store 160 may include an area of storage comprising one or more persistent media, including but not limited to flash memory, phase change memory, 3D XPoint memory, Memristors, EEPROM, magnetic disk, tape, or other media. The media in the backing store 160 may potentially be slower than the memory 110.
[0020] The storage controller 150 and/or backing store 160 of the client 100 may be internal to the client 100, a physically discrete device external to the client 100 that is coupled to the client 100, included in a second client or in a device different from the client 100, part of a server, part of a backup device, part of a storage device on a Storage Area Network, and/or part of some other externally attached persistent storage.
[0021] The memory 110 may be any memory or combination of memories, such as a solid state memory, a random access memory (RAM), a dynamic random access memory (DRAM), a static random access memory (SRAM), a flash memory, a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a phase change memory, 3D XPoint memory, a memristor memory, any type of memory configured in an address space addressable by the processor, or any combination thereof. The memory 110 may be volatile or non-volatile, or a combination of both.
[0022] The memory 110 may be a solid state memory. Solid state memory may include a device, or a combination of devices, that stores data, is constructed primarily from electrical conductors, semiconductors and insulators, and is considered not to have any moving mechanical parts. Solid state memory may be byte-addressable, word-addressable or block-addressable. For example, most dynamic RAM and some flash RAM may be byte-addressable or word-addressable. Flash RAM and other persistent types of RAM may be block-addressable. Solid state memory may be designed to connect to a memory controller, such as the memory controller 120 in the client 100, via an interconnect bus, such as the interconnect 170 in the client 100.
[0023] Solid state memory may include random access memory that permits stored data to be read and/or written in any order (for example, at random). The term "random" refers to the fact that any piece of data may be returned and/or written within a constant time period, regardless of the physical location of the data and regardless of whether the data is related to a previously read or written piece of data. In contrast, storage devices such as magnetic or optical discs rely on the physical movement of the recording medium or a read/write head so that retrieval time varies based on the physical location of the next item read, and write time varies based on the physical location of the next item written. Examples of solid state memory include, but are not limited to: DRAM, SRAM, NAND flash RAM, NOR flash RAM, V-NAND, Z-NAND, phase change memory (PRAM), 3D XPoint memory, EEPROM, FeRAM, MRAM, CBRAM, SONOS, RRAM, Racetrack memory, NRAM, Millipede, T-RAM, Z-Ram, TTRAM, and/or any other randomly-accessible data storage medium known now or later discovered.
[0024] In contrast to solid state memory, solid state storage devices are systems or devices that package solid state memory with a specialized storage controller through which the packaged solid state memory may be accessed using a hardware interconnect that conforms to a standardized storage hardware interface. For example, solid state storage devices include, but are not limited to: flash memory drives that include Serial Advanced Technology Attachment (SATA) or Small Computer System Interface (SCSI) interfaces; Flash or DRAM drives that include SCSI over Fibre Channel interfaces; DRAM, Flash, and/or 3D XPoint memory drives that include NVMe interfaces; DRAM drives that include SATA or SCSI interfaces; USB (universal serial bus) flash drives with USB interfaces; and/or any other combination of solid state memory and storage controller known now or later discovered.
[0025] The memory 110 of the client 100 may include a client logic 112. The memory 110 of the client 100 may include more, fewer, or different components. For example, the memory 110 of the client 100 may include an application logic 114.
[0026] The processor 130 may be a general processor, a central processing unit (CPU), a Graphics Processing Unit (GPU), a server, a microcontroller, an application specific integrated circuit (ASIC), a digital signal processor, a field programmable gate array (FPGA), a digital circuit, an analog circuit, a logic, and/or any combination of these and/or other components. The processor 130 may include one or more devices operable to execute computer executable instructions or computer code embodied in the memory 110 or in other memory to perform features of the memory extension system. For example, the processor 130 may execute computer executable instructions that are included in the client logic 112 and/or the application logic 114.
[0027] The processor 130, the memory controller 120, and the one or more memory extension modules 140 may each be in communication with each other. Each one of the processor 130, the memory controller 120, and the one or more memory extension modules 140 may also be in communication with additional components, such as the storage controller 150, and the backing store 160. The communication between the components of the client 100 may be over an interconnect, a bus, a point-to-point connection, a switched fabric, a network, any other type of interconnect, or any combination of interconnects 170. The communication may use any type of topology, including but not limited to a star, a mesh, a hypercube, a ring, a torus, a fat tree, a dragonfly, or any other type of topology known now or later discovered. Alternatively or in addition, any of the processor 130, the memory 110, the memory controller 120, and/or the memory extension module(s) 140 may be logically or physically combined with each other or with other components, such as with the storage controller 150, and/or the backing store 160.
[0028] The memory controller 120 may include a hardware component that translates memory addresses specified by the processor 130 into the appropriate signaling to access corresponding locations in the memory 110. The processor 130 may specify the address on the interconnect 170. The processor 130, the interconnect 170, and the memory 110 may be directly or indirectly coupled to a common circuit board, such as a motherboard. In one example, the interconnect 170 may include an address bus that is used to specify a physical address, where the address bus includes a series of lines connecting two or more components. The memory controller 120 may, for example, also perform background processing tasks, such as periodically refreshing the contents of the memory 110. In one example implementation, the memory controller 120 may be included in the processor 130.
[0029] The application logic 114 and/or the client logic 112 may include a user application, an operating system, a kernel, a device driver, a device firmware, a virtual machine, a hypervisor, a
container, a jail, a zone, a cloud computing interface, a circuit, a logical operating system partition, or any other logic that uses the services provided by the client logic 112. A container, a jail, and a zone may be technologies that provide userspace isolation or compartmentalization. Any process in the container, the jail, or the zone may communicate only with processes that are in the same container, the same jail, or the same zone. The application logic 114 and/or the client logic 112 may be embedded in a chipset, an FPGA, an ASIC, a processor, and/or any other hardware device.
[0030] The memory extension module 140 may be a logical and/or physical collection of components that perform the functions as described herein. In some examples, the memory extension module 140 may be a physical hardware device and/or component, distinct from other components of the client 100. For example, the memory extension module 140 may be a CPU-socket module, an HTX (HyperTransport expansion) device, a Quick Path Interconnect (QPI) device, an Ultra Path Interconnect (UPI) device, an Infinity Fabric device, a PCI device, a PCI-X device, a PCI Express device, a CXL module, a Gen-Z module, a memory module, a Single In-line Pin Package (SIPP), a Single In-line Memory Module (SIMM), a Dual In-line Memory Module (DIMM), a Rambus In-line Memory Module (RIMM), a Small Outline DIMM (SO-DIMM), a Small Outline RIMM (SO-RIMM), a Compression Attached Memory Module (CAMM), a stacked memory module, any other memory module known now or later discovered, a peripheral, and/or any other physical module internally or externally connected to the client 100. Alternatively, or in addition, the memory extension module 140 may be and/or may include a server, a device, an embedded system, a circuit, a chipset, an integrated circuit, a field programmable gate array (FPGA), an application-specific integrated circuit, a virtual machine, a virtualization instance, a container, a jail, a zone, an operating system, a kernel, a device driver, a device firmware, a hypervisor service, a cloud computing interface, and/or any other hardware, software, and/or firmware entity which may perform the same functions as described.
[0031] The memory extension module 140 may include a memory extension logic 142 and/or a memory 144. The memory extension module 140 may include more, fewer, or different components. For example, the memory extension module 140 may include multiple memory extension logics, multiple memories, one or more backing stores 146, one or more communication interfaces 148, and/or any combination thereof. Alternatively, the memory extension module 140 may just include the memory extension logic 142.
[0032] The memory 144 of the memory extension module 140 may be any memory or combination of memories, such as a solid state memory, a random access memory (RAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), an asynchronous dynamic random access memory (ADRAM), a double data rate (DDR) RAM, a fast page mode (FPM) memory, an extended data output (EDO) DRAM, a Rambus DRAM (RDRAM), a Cached DRAM (CDRAM), a static random access memory (SRAM), a flash memory,
a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a phase change memory, 3D XPoint memory, a memristor memory, any type of memory configured in an address space addressable by the processor 130 and/or the memory extension logic 142, or any combination of memories known now or later discovered. The memory 144 may be volatile or non-volatile, or a combination of both. In some examples, the memory 144 may be a solid state memory. The memory 144 of the memory extension module 140 may be of the same or different types of memory/memories as the memory 110 of the client 100. In some examples, the memory 144 of the memory extension module 140 may be the same as and/or may be combined with the memory 110 of the client 100.
[0033] The backing store 146 of the memory extension module 140 may include an area of storage comprising one or more persistent media, including but not limited to flash memory, phase change memory, 3D XPoint memory, Memristors, EEPROM, magnetic disk, tape, or other media. The media in the backing store 146 may potentially be slower than the memory 144 of the memory extension module 140 and/or the memory 110 of the client 100. In some examples, the backing store 146 and/or the memory extension module may include a storage controller, similar to the storage controller 150 of the client 100. The storage controller and/or backing store 146 of the memory extension module 140 may be internal to the memory extension module 140, internal to the client 100, a physically discrete device external to the memory extension module 140 and/or client 100 that is coupled to the memory extension module 140 and/or client 100, included in a second memory extension module and/or client or in a device different from the memory extension module 140 and/or client 100, part of a server, part of a backup device, part of a storage device on a Storage Area Network, and/or part of some other externally attached persistent storage. In some examples, the backing store may be accessed using the communication interface(s) 148.
[0034] The communication interface(s) 148 may provide client-side memory access to the memory of a memory appliance, to regions, and/or to portions of the regions in the memory appliance. One or more interconnects or networks may transport data between the communication interface(s) 148 of the memory extension module 140 and one or more communication interfaces of other devices, such as a memory appliance. For example, the communication interface(s) may be any one or more network interface controller(s), host controller adaptor(s), host fabric interface(s), memory fabric interface(s), processor interface(s), and/or any other interface(s) known now and/or later discovered that may be capable of operating as described herein. In some examples, the communication interface(s) and/or client-side memory access may be as described in U.S. Non-provisional patent application Ser. No. 14/530,908, filed November 3, 2014, and U.S. Non-provisional patent application Ser. No. 15/424,395, filed February 3, 2017, each of which is hereby incorporated by reference.
[0035] A client-side memory access may bypass a processor, such as a CPU (Central Processing Unit), at the client 100 and/or may otherwise facilitate the client 100, the memory
extension module 140, and/or the memory extension logic 142 accessing the memory on the memory appliance without waiting for an action by the processor included in the client 100, in the memory appliance, or both. For example, the client-side memory access may be based on the Remote Direct Memory Access (RDMA) protocol. The RDMA protocol may be carried over an InfiniBand interconnect, an iWARP interconnect, an RDMA over Converged Ethernet (RoCE) interconnect, an Aries interconnect, a Slingshot interconnect, and/or any other interconnect and/or combination of interconnects known now or later discovered. Alternatively, or in addition, the client-side memory access may be based on any other protocol and/or interconnect that may be used for accessing memory. A protocol that may be used for accessing memory may be a CPU protocol/interconnect, such as HyperTransport, Quick Path Interconnect (QPI), Ultra Path Interconnect (UPI), and/or Infinity Fabric. Alternatively, or in addition, a protocol that may be used for accessing memory may be a peripheral protocol/interconnect, such as Peripheral Component Interconnect (PCI), PCI Express, PCI-X, ISA, Gen-Z, CXL, and/or any other protocol/interconnect used to interface with peripherals and/or access memory. The communication interfaces may provide reliable delivery of messages and/or reliable execution of memory access operations, such as any memory access operation carried out when performing the client-side memory access. Alternatively, or in addition, delivery of messages and/or execution of memory access operations may be unreliable, such as when data is transported between the communication interfaces using the User Datagram Protocol (UDP). The client 100, the memory extension module 140, and/or the memory extension logic 142 may read, write, and/or perform other operations on the memory of the memory appliance, to the regions, and/or to portions of the regions using client-side memory access. In providing client-side memory access, the client 100, the memory extension module 140, and/or the memory extension logic 142 may transmit requests to perform memory access operations to the memory appliance. In response, the memory appliance may perform the memory access operations. Similar to as done by the storage device of US Patent Application 13/036,544, filed February 28, 2011 , entitled “High performance data storage using observable client-side memory access” by Stabrawa, et al., which published as US Patent Application Publication US2012/0221803 A1 , and which is hereby incorporated by reference, the memory appliance may observe or otherwise identify the memory access operations. In response to identifying the memory access operations, the memory appliance may, for example, copy the data of the region to one or more backing stores independently of performing the memory access operations on the memory. A backing store may include one or more persistent non-volatile storage media, such as flash memory, phase change memory, 3D XPoint memory, memristors, EEPROM, magnetic disk, tape, or some other media. The memory of the memory appliance and/or the backing store (if included) may be subdivided into regions.
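By way of illustration only, the following sketch models in C the cache-fill path described above: a request for a given offset is served from local module memory when the data is present and otherwise triggers a client-side (for example, RDMA-style) read that does not involve the client processor. The structure and function names (local_memory, remote_read, cache_fill) are illustrative placeholders for the sketch, not any particular driver or fabric API.

```c
#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Hypothetical model of a client-side memory access path: a cache fill
 * request is satisfied either from the memory extension module's local
 * memory (e.g., memory 144) or, on a miss, by a remote read that does
 * not involve the client CPU (e.g., an RDMA-style transfer).  All names
 * here are illustrative placeholders, not a real driver API. */

#define CACHE_LINE 64
#define LOCAL_LINES 1024

struct local_memory {              /* stands in for memory 144        */
    uint64_t tag[LOCAL_LINES];     /* which remote offset is cached   */
    int      valid[LOCAL_LINES];
    uint8_t  data[LOCAL_LINES][CACHE_LINE];
};

/* Placeholder for a client-side memory access (e.g., an RDMA read)
 * that fetches one cache line from a region on a memory appliance. */
static void remote_read(uint64_t remote_off, uint8_t *dst)
{
    memset(dst, (int)(remote_off & 0xff), CACHE_LINE);  /* fake payload */
}

/* Handle a cache fill request for a given remote offset. */
static const uint8_t *cache_fill(struct local_memory *lm, uint64_t off)
{
    size_t idx = (off / CACHE_LINE) % LOCAL_LINES;
    if (!lm->valid[idx] || lm->tag[idx] != off) {
        remote_read(off, lm->data[idx]);   /* CPU is bypassed here */
        lm->tag[idx] = off;
        lm->valid[idx] = 1;
    }
    return lm->data[idx];
}

int main(void)
{
    static struct local_memory lm;
    const uint8_t *line = cache_fill(&lm, 4096);
    printf("first byte of filled line: 0x%02x\n", (unsigned)line[0]);
    return 0;
}
```

In a real system, the remote_read placeholder would correspond to a posted fabric read or an equivalent client-side memory access, with completion signaled back to the requesting interconnect rather than returned synchronously.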
[0036] The memory extension logic 142 may be a logical and/or physical collection of components that perform the functions as described herein. The memory extension logic 142 may include one or more hardware, software, and/or logic entities that facilitate performing the
functions. The entities may be combined and/or distributed in a number of ways to suit a particular embodiment. For example, the memory extension logic 142 may include a processor, a memory, such as the memory 144 of the memory extension module 140, another memory, a chipset, an FPGA, an ASIC, and/or any other hardware device.
[0037] The memory extension module 140, the memory extension logic 142, and/or the memory 144 of the memory extension module 140 may be in communication with the processor 130 and/or with other components of the client 100 via an interconnect 180. The communication between the components of the client 100 and/or of the memory extension module 140 may be over an interconnect, a bus, a point-to-point connection, a switched fabric, a network, any other type of interconnect, and/or any combination of interconnects 180. The communication may use any type of topology, including but not limited to a star, a mesh, a hypercube, a ring, a torus, a fat tree, a dragonfly, and/or any other type of topology known now or later discovered. Alternatively or in addition, any of the processor 130, the memory extension module 140, the memory extension logic 142, and/or other components of the client 100 and/or memory extension module(s) 140 may be logically or physically combined with each other or with other components, such as with the memory controller 120. In some examples, the interconnect 180 may be logically or physically combined with the interconnect 170 of the client 100 and/or with any other interconnect(s) of the memory extension module 140. The interconnect 180 may have some of the same properties as the interconnect 170 of the client 100. Alternatively or in addition, the interconnect 180 may be the same interconnect as the interconnect 170 of the client 100.
[0038] The interconnect 180 may be, may implement, may provide, and/or may include a physically-addressable interface. A physically-addressable interface may be an interface which provides access to the underlying data using physical addresses, such as the physical addresses used on an address bus, a CPU interconnect, a memory interconnect, and/or on a peripheral interconnect. Alternatively or in addition, the interconnect 180 may be and/or may include a virtually-addressable interface. For example, the interconnect 180 may be addressed using IO-virtual addresses, virtual machine physical addresses, and/or any other type(s) of virtual addresses known now and/or later discovered. The virtual addresses, such as the IO-virtual addresses, may be translated to physical addresses by one or more address translation logics. Examples of address translation logics include memory management units (MMUs), input-output memory management units (IO-MMUs), translation lookaside buffers, and/or any other logic capable of translating virtual addresses to physical addresses known now or later discovered. In some examples, the interconnect 180 may be and/or may include a peripheral interconnect such as Peripheral Component Interconnect (PCI), PCI Express, PCI-X, ISA, Gen-Z, CXL, and/or any other protocol/interconnect used to interface with peripherals and/or to access memory. In other examples, the interconnect 180 may be and/or may include a processor interconnect, such as HyperTransport, Quick Path Interconnect (QPI), Ultra Path Interconnect (UPI), Infinity Fabric, and/or any other interconnect used to communicate between processors, to maintain cache coherency, and/or to access memory.
[0039] In some examples, a portion of the memory extension logic 142 may be included in the memory 144 of the memory extension module 140, in another memory, such as the memory 110 of the client 100, in the backing store 146 of the memory extension module 140, in the backing store 160 of the client 100, and/or in any other one or more computer-readable storage media. For example, if the memory extension logic 142 includes an FPGA, configuration for the FPGA that may define and/or control the logic to be implemented with the FPGA may be included in one or more of these locations. In examples where the memory extension logic 142 includes a processor, computer executable instructions and/or computer code, executable by the processor, may be included in one or more of these locations.
[0040] In some examples, portions of the memory extension logic 142, the client logic 112, and/or the application logic 114 may be included in different memories, backing stores, and/or media at different times. For example, all of or a portion of the memory extension logic 142 may be initially included in the backing store 160 of the client 100 and/or may be transferred to the memory 144 of the memory extension module 140, such as by a firmware loading logic, which may be included in the client logic 112 and/or in another logic. In another example, a portion of the memory extension logic 142 may be included in one or more passed functions provided to the memory extension module 140.
[0041] Passed functions may be any logic which may be provided by a first logic as a parameter when interacting with a second logic. For example, the passed function may be computer executable instructions or computer code. The passed function may be embodied in a computer readable storage medium, such as the memory 110 and/or the backing store 160, and/or may be transmitted via an interconnect, such as the interconnects 170, 180, for operation with another entity. For example, the passed function may be transmitted via the interconnects 170, 180 from the client logic 112 or any other logic to the memory extension module, for execution with the memory extension logic 142 and/or the processor (if present) of the memory extension module 140. Passed functions may be used to define custom, adaptive, and/or flexible caching strategies that may be cumbersome to express in other ways.
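In software, a passed function can be as simple as a callback supplied by one logic and invoked by another. The following C sketch illustrates the idea with a hypothetical reclaim-policy callback handed to the memory extension logic; the callback type, the example policy, and the structure fields are assumptions made for illustration, and in hardware a passed function might instead take the form of FPGA configuration or downloadable code.

```c
#include <stdio.h>
#include <stdint.h>

/* Illustrative "passed function": the client logic hands the memory
 * extension logic a policy callback that decides whether a portion of
 * memory is a good reclaim candidate.  All names are assumptions used
 * to show the idea in C, not part of any specific interface. */

struct portion {
    uint64_t offset;
    uint32_t access_count;
    uint64_t last_access;    /* generation counter or timestamp */
};

/* Signature of a passed function: returns nonzero to reclaim. */
typedef int (*reclaim_policy_fn)(const struct portion *p, uint64_t now);

/* Example policy provided by the client logic. */
static int idle_policy(const struct portion *p, uint64_t now)
{
    return (now - p->last_access) > 1000;   /* idle for 1000 "ticks" */
}

/* The memory extension logic applies whatever policy it was given. */
static void scan(const struct portion *ps, int n, uint64_t now,
                 reclaim_policy_fn policy)
{
    for (int i = 0; i < n; i++)
        if (policy(&ps[i], now))
            printf("reclaim candidate at offset 0x%llx\n",
                   (unsigned long long)ps[i].offset);
}

int main(void)
{
    struct portion ps[] = { { 0x0000, 5, 10 }, { 0x1000, 90, 4900 } };
    scan(ps, 2, 5000, idle_policy);
    return 0;
}
```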
[0042] The interconnect 180, the memory extension logic 142, the client logic, the application logic, and/or any other logic may implement one or more data interfaces, such as the data interfaces described in U.S. Non-provisional patent application Ser. No. 14/530,908, filed November 3, 2014, which is hereby incorporated by reference. As described therein, the data interface(s) may include an API, a block-level interface, a character-level interface, a memory-mapped interface, a memory allocation interface, a memory swapping interface, a memory caching interface, a hardware-accessible interface, a graphics processing unit (GPU) accessible interface, and/or any other interface used to access data included in the memory 144, included in the backing
store 146, and/or accessible via the communication interface(s) 148. In some examples, the data interface may include a hardware-accessible interface. In examples such as these, the memory extension module 140, the memory extension logic 142, and/or other components may be included in the hardware client component as described therein. Alternatively or in addition, the processor 130, the memory controller 120, the memory 110, the client logic 112, and/or the application logic may be included in the hardware application component as described therein.
[0043] The hardware-accessible interface may be and/or may include a physically-addressable interface. A physically-addressable interface may be an interface which provides access to the underlying data using physical addresses, such as the physical addresses used on an address bus, a CPU interconnect, a memory interconnect, and/or on a peripheral interconnect. Alternatively or in addition, the hardware-accessible interface may be and/or may include a virtually-addressable interface. For example, the hardware-accessible interface may be addressed using IO-virtual addresses. IO-virtual addresses may be translated to physical addresses by one or more address translation logics. Examples of address translation logics include memory management units (MMUs), input-output memory management units (IO-MMUs), translation lookaside buffers, and/or any other logic capable of translating virtual addresses to physical addresses known now or later discovered.
[0044] The hardware-accessible interface may enable a hardware application component, the client logic 112, the application logic 114, and/or another logic to access data included in the memory 144, included in the backing store 146, and/or accessible via the communication interface(s) 148. Alternatively or in addition, the hardware-accessible interface may enable the hardware application component to access data of a region. Alternatively or in addition, the hardware-accessible interface may enable the hardware application component to access data of one or more of the regions referenced by an external memory allocation. Alternatively or in addition, the hardware-accessible interface may enable the hardware application component to access data of the external memory allocation. The hardware application component may be a processor, a GPU, a communication interface, a direct memory access controller, an FPGA, an ASIC, a chipset, a compute module, a hardware accelerator module, a hardware logic, and/or any other physical component that accesses memory. The hardware application component may be included in the application logic 114 and/or in the client logic 112. The hardware-accessible interface may include a hardware client component. A hardware client component may be and/or may include a processor, a GPU, an MMU, an IO-MMU, a communication interface, such as the one or more communication interfaces 148, a direct memory access controller, an FPGA, an ASIC, a chipset, a compute module, a hardware accelerator module, a hardware logic, a memory access transaction translation logic, any other hardware component, and/or a combination of multiple hardware components. The hardware client component may be included in the client logic 112 and/or in the memory extension logic 142. The hardware client component, the hardware
application component, and/or the one or more communication interfaces may be embedded in one or more chipsets. The hardware client component may include a memory and/or cache, such as the memory 144 of the memory extension module 140. The memory and/or cache of the hardware client component may be used to hold portions of the data of the backing store 146 and/or of external memory allocations and/or regions. Alternatively, or in addition, the hardware client component may utilize a portion of the memory 110 of the client 100 to hold portions of the data of the backing store 146 and/or of external memory allocations and/or regions.
[0045] The hardware client component may respond to and/or translate attempts to access virtual addresses, physical addresses, logical addresses, IO addresses, and/or any other address used to identify the location of data. Alternatively, or in addition, the hardware client component may participate in a cache coherency protocol with the hardware application component. In a first example, the hardware client component may respond to attempts of the hardware application component to access physical addresses by accessing data included in the memory 144 and/or cache of the hardware client component. In a second example, the hardware component may interface with a CPU interconnect and handle cache fill requests by reading data from the memory 144 and/or cache included in the hardware client component. In a third example, the hardware client component may redirect and/or forward attempts of the hardware application component to access physical addresses to alternate physical addresses, such as the physical addresses of the portion of the memory 110 of the client 100 utilized by the hardware component. In a fourth example, the hardware client component may translate attempts of the hardware application component to access physical addresses into client-side memory access. For example, the hardware client component may interface with the interconnect(s) 170 180 and/or handle cache fill requests by reading the requested data from backing store 146 and/or from the region and/or external memory allocation, such as via the communication interface(s) 148. Alternatively, or in addition, the hardware client component may handle cache flush requests by performing clientside memory access to write the requested data to the backing store 146 and/or to the region and/or external memory allocation. Alternatively, or in addition, the hardware client component may handle cache invalidate requests by updating the memory 144 and/or cache of the hardware client component to indicate the non-presence of the data indicated by the cache invalidate requests. In a fifth example, the hardware client component may translate attempts of the hardware application component to access IO addresses into client-side memory access. For example, the hardware client component may interface with a peripheral interconnect, such as PCI Express, and may respond to requests to read a portion of the IO address space by reading data from the memory 144 included in the hardware client component, by reading the portion of the memory 110 and/or cache of the client 100 utilized by the hardware component, and/or by reading the requested data from backing store 146 and/or from the region and/or external memory allocation. In another example, the hardware client component may interface with a memory-fabric interconnect, such as Gen-Z and/or CXL, and may respond to memory read operations by reading
data from the memory 144 included in the hardware client component, by reading the portion of the memory 110 and/or cache of the client 100 utilized by the hardware component, and/or by reading the requested data from the backing store 146 and/or from the region and/or external memory allocation.
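The dispatch behavior implied by the examples above can be pictured with the following illustrative C sketch, in which cache fill, cache flush, and cache invalidate requests are routed to stubbed accessors. The request structure and helper names are assumptions made for the sketch and do not correspond to any specific hardware interface; a real hardware client component would issue client-side memory access or backing-store I/O in place of the stubs.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative dispatcher for the three request types discussed above:
 * cache fill, cache flush, and cache invalidate.  The backing-store and
 * region accessors below are stand-in stubs. */

enum req_type { REQ_FILL, REQ_FLUSH, REQ_INVALIDATE };

struct cache_req {
    enum req_type type;
    uint64_t      addr;      /* physical or IO address being accessed */
    uint8_t      *line;      /* 64-byte cache line buffer             */
};

static void read_backing(uint64_t addr, uint8_t *dst)        { (void)addr; dst[0] = 0xAB; }
static void write_backing(uint64_t addr, const uint8_t *src) { (void)addr; (void)src; }
static void mark_not_present(uint64_t addr)
{
    printf("invalidated 0x%llx\n", (unsigned long long)addr);
}

static void handle_request(struct cache_req *r)
{
    switch (r->type) {
    case REQ_FILL:        /* read the requested data into the cache line       */
        read_backing(r->addr, r->line);
        break;
    case REQ_FLUSH:       /* write dirty data back to the backing store/region */
        write_backing(r->addr, r->line);
        break;
    case REQ_INVALIDATE:  /* record that the data is no longer present         */
        mark_not_present(r->addr);
        break;
    }
}

int main(void)
{
    uint8_t line[64] = {0};
    struct cache_req fill = { REQ_FILL, 0x1000, line };
    handle_request(&fill);
    printf("filled byte: 0x%02x\n", (unsigned)line[0]);
    return 0;
}
```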
[0046] Such as described in U.S. Non-provisional patent application Ser. No. 17/859,389, filed July 7, 2022, which is hereby incorporated by reference, the memory extension system, the memory extension module 140, the memory 144, the memory extension logic 142, and/or any other logic may expose and/or may provide one or more memory description data structures and/or performance indication(s) to the client logic 112, the application logic 114, and/or other logic(s). For example, the memory description data structure(s) and/or performance indication(s) may be and/or may include ACPI System Resource Affinity Table (SRAT) information, ACPI Static Resource Affinity Table (SRAT) information, ACPI System Locality Distance Information Table (SLIT) information, ACPI Heterogeneous Memory Attribute Table (HMAT) information, ACPI Heterogeneous Memory Attributes (HMA), and/or any other data structures that may indicate one or more relative and/or absolute performance attributes of the memory 144 of the memory extension module 140, of the memory 110 of the client 100, of the backing store 146 of the memory extension module 140, of the communication interface(s) 148, of the memory extension logic 142, of the processor 130, of the interconnect(s) 170, 180, the memory appliances (if present), and/or any of one or more caches that may hold data of the memory 110, 144, of one or more architectural elements, and/or of any other one or more physical and/or logical components that may affect the performance of memory-access operations and/or that may be useful in determining which memory to use for different purposes.
[0047] The performance attribute(s) may be any one or more attributes, characteristics, properties, and/or aspects of the corresponding memory/memories, mapping(s), memory controller(s), interconnect(s), processor(s), logic(s), communication interface(s), and/or any other component, subsystem, system, and/or architectural element related to memory access performance. Examples of performance attributes may include latency, bandwidth, operations per second, transfers per second, determinacy, jitter, and/or any other measurable and/or unmeasurable quantity/quality related to performance. The performance attributes may be absolute (for example, “100 ns”, “30 GiB/s”, and/or “50 gigatransfers/second”), and/or may be relative (for example, “low jitter” relative to a reference measurement of jitter).
[0048] The performance indication(s) may indicate one or more relative and/or absolute performance attributes and/or any other attributes of the memory extension module 140, the client 100, the interconnects 170, 180, and/or any other one or more component related to the performance of memory accesses. Examples of the performance indication(s) may include an indicator of any of the following (or any combination thereof): price, capacity, latency, bandwidth, operations per second, physical locality, network locality, logical locality, power draw, and/or any
other one or more attributes of the memory 144 of the memory extension module 140, of the memory 110 of the client 100, of the backing store 146 of the memory extension module 140, of the communication interface(s) 148, of the memory extension logic 142, of the processor 130, of the interconnect(s) 170, 180, the memory appliances (if present), and/or any of one or more caches that may hold data of the memory 110, 144, of one or more architectural elements, and/or of any other one or more physical and/or logical components that may affect the performance of memoryaccess operations and/or that may be useful in determining which memory to use for different purposes. The performance indication(s) may include or be associated with a memory identifier. The memory identifier may be any identifier that identifies the memory, the architectural element containing the memory, the memory appliance containing the memory, the communication interface(s) 148 handling communication with the memory, the interconnect(s) 170, 180 connecting to the memory, the memory controller(s) 120 for the memory, and/or any other logical element or physical component having the property or properties included in the performance indication. Alternatively or in addition, the performance indication(s) may indicate one or more relative distances and/or latencies between architectural elements, such as with an ACPI System Locality Distance Information Table (SLIT). Alternatively or in addition, the performance indication(s) may include one or more indicator(s) of any of the following, or any combination thereof: read/write latency and/or bandwidth metrics between architectural elements and/or memories, such as with an ACPI System Locality Latency and Bandwidth Information Structure. In one such example, the memory 144 of the memory extension module 140 may include a first memory and a second memory, where the second memory of the memory extension module 140 differs from the first memory in price, capacity, latency, bandwidth, operations per second, physical locality, network locality, logical locality, interconnect, power draw, and/or any other one or more attributes, such as by being memory of a different type, media, style, package, clock rate, architecture, and/or any other relevant difference. In such an example, the first memory and the second memory be said to be different classes of memory. Alternatively or in addition, the performance indication(s) may indicate one or more attributes of one or more caches associated with one or more memories, such as with an ACPI Memory Side Cache Information Structure. The attribute(s) of the cache(s) may include the size of the cache(s), number of cache levels, cache associativity, write policy, cache-line size, latency, bandwidth, performance, and/or any other one or more attributes of the cache(s) that may affect the performance of memory-access operations and/or that may be useful in determining which memory to use for different purposes. In examples such as this, the cache(s) may be and/or may include cache memories (such as with a CPU cache); the cache(s) may be and/or may include memories 110, 144 used to hold portions of one or more backing stores 146, 160; and/or the cache(s) may be and/or may include local primary memory used to hold cached portions of external primary memory.
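As a simplified illustration of how such indications might be represented and consumed, the following C sketch defines a small record carrying a memory identifier and a few absolute attributes and selects the lower-latency memory for frequently-accessed data. The structure and field names are assumptions made for the sketch; in practice this information would typically be conveyed through ACPI structures such as the HMAT or SLIT rather than a hand-rolled record.

```c
#include <stdio.h>
#include <stdint.h>

/* Illustrative record of the kind of performance indication described
 * above: one entry per memory (or memory class), with a memory
 * identifier and a few absolute attributes.  All names are assumptions. */

struct perf_indication {
    uint32_t memory_id;        /* identifies memory 110, memory 144, etc. */
    uint32_t read_latency_ns;  /* absolute latency attribute              */
    uint32_t bandwidth_mib_s;  /* absolute bandwidth attribute            */
    uint32_t distance;         /* relative distance, SLIT-style           */
};

/* Pick the lower-latency of two candidate memories for hot data. */
static uint32_t prefer_for_hot_data(const struct perf_indication *a,
                                    const struct perf_indication *b)
{
    return (a->read_latency_ns <= b->read_latency_ns) ? a->memory_id
                                                      : b->memory_id;
}

int main(void)
{
    struct perf_indication local  = { 1, 100, 30000, 10 };  /* e.g., memory 110 */
    struct perf_indication module = { 2, 350,  8000, 20 };  /* e.g., memory 144 */
    printf("hot data placed on memory %u\n",
           prefer_for_hot_data(&local, &module));
    return 0;
}
```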
[0049] In some examples, the performance indication(s) may be created, destroyed, modified, and/or updated during operation of the memory extension system. For example, the performance
indication(s) may be provided, conveyed, delivered, indicated, and/or identified via a transitory means, such as with a message, a function call, a programmatic method invocation, a programmatic signal, an electrical signal, an optical signal, a wireless signal, and/or any other mechanism used to provide, convey, deliver, indicate, and/or identify information between logics. In some examples, the performance indication(s) may be indicated via one or more ACPI Device Configuration Objects, such as ACPI System Locality Information (_SLI) object(s), ACPI Proximity (_PXM) object(s), ACPI Heterogeneous Memory Attributes (_HMA) object(s), and/or any other data structure and/or mechanism that may describe and/or indicate the performance indication(s) and/or one or more portions thereof.
[0050] The client logic 112, the application logic 114, and/or any other logic may operate differently based upon the performance indication(s). For example, the logic may include an operating system and/or the logic may be configured to use memory with faster performance for frequently-accessed data and/or recently-accessed data and/or may be configured to use memory with slower performance for infrequently-accessed data and/or not-recently-accessed data. Alternatively or in addition, the logic may be configured to use memory with a shorter distance for frequently-accessed data and/or recently-accessed data and/or may be configured to use memory with a longer distance for infrequently-accessed data and/or not-recently-accessed data. Alternatively or in addition, the logic may be configured to migrate data from one portion of memory to another based upon whether the data is frequently accessed, infrequently accessed, recently accessed, not recently accessed, and/or matches or does not match any other one or more access patterns. Alternatively or in addition, the logic may be configured to migrate data from one portion of memory (a source portion) to another (a destination portion) based upon whether the performance indication(s) indicate that the source portion has faster performance, slower performance, higher bandwidth, lower bandwidth, lower latency, higher latency, a larger cache, a smaller cache, larger capacity, smaller capacity, a shorter distance, a longer distance, more-durable media, less-durable media, higher power draw, lower power draw, and/or any other one or more characteristics relative to the destination portion. For example, the data may be migrated from the memory 144 of the memory extension module 140 to the memory 110 of the client 100 in response to identifying that the data is accessed frequently. The data may be migrated by copying the data from one portion of memory to another, such as with NUMA page migration, and/or by performing operations as described herein, such as described for FIG. 6.
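A minimal sketch of such a migration decision is shown below, assuming hypothetical thresholds for "frequently-accessed" and "infrequently-accessed" data and a stubbed migrate() helper; it is intended only to illustrate the decision between tiers, not the data movement or remapping itself.

```c
#include <stdio.h>
#include <stdint.h>

/* Illustrative migration decision: pages whose access count exceeds a
 * "frequently-accessed" threshold are candidates to move from slower
 * module memory (e.g., memory 144) to faster client memory (e.g.,
 * memory 110); rarely-touched pages may move the other way.  The
 * thresholds and the migrate() stub are assumptions for illustration. */

enum tier { TIER_CLIENT_110, TIER_MODULE_144 };

struct page_info {
    uint64_t addr;
    uint32_t access_count;   /* accesses observed in the current window */
    enum tier location;
};

#define HOT_THRESHOLD   64   /* "frequently-accessed"   */
#define COLD_THRESHOLD   2   /* "infrequently-accessed" */

static void migrate(struct page_info *p, enum tier dst)
{
    printf("migrating page 0x%llx to tier %d\n",
           (unsigned long long)p->addr, (int)dst);
    p->location = dst;       /* real code would copy data and remap */
}

static void rebalance(struct page_info *p)
{
    if (p->location == TIER_MODULE_144 && p->access_count > HOT_THRESHOLD)
        migrate(p, TIER_CLIENT_110);
    else if (p->location == TIER_CLIENT_110 && p->access_count < COLD_THRESHOLD)
        migrate(p, TIER_MODULE_144);
    p->access_count = 0;     /* start a new observation window */
}

int main(void)
{
    struct page_info hot  = { 0x1000, 200, TIER_MODULE_144 };
    struct page_info cold = { 0x2000,   0, TIER_CLIENT_110 };
    rebalance(&hot);
    rebalance(&cold);
    return 0;
}
```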
[0051] The term "frequently-accessed" means accessed more frequently than a threshold frequency. The term "infrequently-accessed" means accessed less frequently than a threshold frequency. The threshold for "frequently-accessed" may not necessarily be the same as the threshold for "infrequently-accessed". The term "recently-accessed" means accessed within a predetermined timeframe. Memory with slower performance means memory that may be read and/or written at a rate that is slower than another memory available to be allocated, such as memory with higher
latency, lower bandwidth, lower operations per second, and/or less performant metrics for any other performance attributes. Memory with faster performance means memory that may be read and/or written at a rate that is faster than other memory available to be allocated, such as memory with lower latency, higher bandwidth, higher operations per second, and/or more performant metrics for any other performance attributes. The phrase "shorter distance" in this context means closer to the client device in the context of physical locality, network locality, and/or logical locality than another memory available to be allocated. The phrase "longer distance" in this context means further from the client device in the context of physical locality, network locality, and/or logical locality than another memory available to be allocated. The term “more-durable” in this context means media which is more likely to retain its contents over time, which is more likely to retain its contents after repeated reading and/or writing over time, which is more likely than a threshold probability to retain its data after more repeated read and/or write operations than for other media, and/or which has any other properties and/or attributes associated with improved media durability. In contrast, the term “less-durable” in this context means media which is less likely to retain its contents over time, which is less likely to retain its contents after repeated reading and/or writing over time, which is more likely than a threshold probability to retain its data after fewer repeated read and/or write operations than for other media, and/or which has any other properties and/or attributes associated with reduced media durability. The term “higher power draw” in this context means memory which requires more electric power to operate, more thermal energy to maintain the temperature of the memory within a target operating range, and/or which has any other properties and/or attributes associated with higher power usage at the current time and/or over time. In contrast, the term "lower power draw" in this context means memory which requires less electric power to operate, less thermal energy to maintain the temperature of the memory within the target operating range, and/or which has any other properties and/or attributes associated with lower power usage at the current time and/or over time.
[0052] It should be clear to one skilled in the art that many arrangements of the components of the memory extension system are possible, and that those described herein are only example implementations. For example, any of the components of the memory extension system may be internal or external to the client 100, to the memory extension module(s) 140, and/or both. In examples where components are external, they may be in communication via an interconnect and/or network. In some examples, one or more memory extension modules 140 may be external to the client 100.
[0053] FIG. 2 illustrates a hardware diagram of an example memory extension system using an external memory extension module. As shown in FIG. 2, the memory extension system may include one or more memory extension modules 140 that are external to the client 100. Alternatively or in addition, the memory extension system may include one or more clients 100. In example systems such as this, the interconnect 180 may be external to the client(s) 100, such as
with a switched fabric and/or a network. For example, the interconnect 180 may include one or more PCIe switches, one or more CXL switches, one or more network/interconnect management systems, and/or any other components that contribute to the communication between the client(s) 100, the memory extension module(s) 140, components of the client(s) 100, and/or components of the memory extension module(s) 140. In some examples, the memory extension system may include one or more management components responsible for forming and/or maintaining associations between one or more clients 100 and/or one or more memory extension modules 140. The management components may be and/or may operate similarly to the management servers, as described in U.S. Non-provisional patent application Ser. No. 14/530,908, filed November 3, 2014, and U.S. Non-provisional patent application Ser. No. 15/424,395, filed February 3, 2017, each of which is hereby incorporated by reference.
[0054] During operation, the memory extension logic 142 may retrieve data from the memory 144 of the memory extension module 140 in response to one or more accesses by the processor 130 and/or by another logic. For example, the memory extension logic may respond to a cache fill request by reading the corresponding data from the memory 144. In examples where the data was not previously stored in the memory 144, the memory extension logic 142 may retrieve the data from the backing store 146 and/or via the communication interface 148. In addition to responding to the cache fill request with the retrieved data, the memory extension logic may store the retrieved data in the memory 144. When data is stored in the memory 144, it may be faster to retrieve than when stored in the backing store 146 and/or when retrieved via the communication interface 148. While waiting for cache fill requests to be satisfied, the processor 130 and/or other logic may wait for the data. While waiting, one or more hardware resources of the processor 130 and/or other logic may be unable to perform meaningful work. For example, one or more hardware pipelines of the processor 130 and/or other logic may be stalled. In examples such as these, it may be advantageous to reduce the duration when the processor 130 and/or other logic is waiting and/or to enable the processor 130 and/or other logic to perform other meaningful work while waiting.
[0055] The operations of the memory extension logic 142 may be optimized to reduce the duration when the processor 130 and/or other logic is waiting. For example, when handling a cache fill request, if there are no available locations in the memory to hold the data, operations may be structured such that the cache fill request is not completed until a suitable location can be identified and the data is stored in memory. Alternatively or in addition, the cache fill request may be completed upon retrieving the data for the request, and the data may be stored in the memory 144 after completing the cache fill request. In other examples, operations may be performed in advance of receiving cache fill requests to identify and/or prepare one or more suitable memory locations, lessening the work needed while handling cache fill requests.
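The two orderings described above can be contrasted with the following illustrative C sketch, in which the first handler stores the retrieved data before completing the fill, while the second completes the fill first and defers the store, optionally into a location prepared in advance. The helper functions are stand-in stubs used only to show the ordering.

```c
#include <stdio.h>
#include <stdint.h>

/* Illustrative contrast of the two orderings: store-then-respond keeps
 * the processor waiting while a slot in memory 144 is found; respond-
 * then-store returns the data first and defers the insertion, ideally
 * into a slot prepared in advance.  All helpers are assumptions. */

static uint8_t fetch_remote(uint64_t off) { return (uint8_t)off; }
static int     find_free_slot(void)       { return 7; }   /* may be slow */
static void    store_in_144(int slot, uint8_t v) { (void)slot; (void)v; }
static void    respond(uint8_t v)         { printf("fill completed: %d\n", v); }

/* Ordering 1: the processor stalls until a slot is found and written. */
static void fill_store_then_respond(uint64_t off)
{
    uint8_t v = fetch_remote(off);
    int slot = find_free_slot();
    store_in_144(slot, v);
    respond(v);
}

/* Ordering 2: respond first, then store; a slot prepared in advance
 * (e.g., by a background reclaim pass) keeps the deferred work short. */
static int prepared_slot = 3;
static void fill_respond_then_store(uint64_t off)
{
    uint8_t v = fetch_remote(off);
    respond(v);                      /* processor can resume here        */
    store_in_144(prepared_slot, v);  /* deferred, off the critical path  */
}

int main(void)
{
    fill_store_then_respond(0x10);
    fill_respond_then_store(0x20);
    return 0;
}
```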
[0056] FIG. 3A illustrates an example flowchart for a request for reclaim candidates. The memory extension logic 142 and/or another logic may perform the operations shown in FIG. 3A and/or other figures herein. The memory extension logic 142 and/or another logic may receive a request for reclaim candidates (302). The request for reclaim candidates (302) may be any one or more mechanisms that may trigger the operations described for FIG. 3A. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc. The request (302) may be received from another logic, such as the client logic 112 and/or application logic 114, and/or the request (302) may be received from the memory extension logic 142, such as by the memory extension logic 142 making a determination to reclaim memory. The memory extension logic 142, and/or any other one or more logics may make the determination independently and/or in coordination with each other and/or with any other one or more logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason. For example, the memory extension logic 142 may determine that additional reclaim candidates are needed to service an expected data access rate from the processor 130. In another example, the memory extension logic 142 and/or another logic may make the determination and/or send the request (302) in response to a low memory condition. In another example, the memory extension logic 142 may determine that newer reclaim candidates are needed than had been previously selected, such as after a certain amount of time has passed and/or after a certain number of events (such as data accesses) have occurred. In some examples, there may be no request (302), such as if the determination is made by the memory extension logic 142.
[0057] In response to the request and/or determination, the memory extension logic 142 and/or another logic may process access information (304). Processing access information (304) may include evaluating one or more stored data related to portions of the memory 144 and/or of the backing store 146 and/or selecting one or more portions as reclaim candidates. The reclaim candidate(s) may be one or more portions deemed to be unlikely to be used in the near future.
[0058] The reclaim candidate(s) may be selected using any one or more selection strategies, portion replacement strategies, and/or page replacement strategies known now or later discovered. For example, one or more portion(s) of memory may be tracked using one or more sorted and/or unsorted collections, such as one or more least-recently used lists. As portion(s) of memory are accessed, read, written, referenced, etc., the collection(s) may be updated, modified, and/or replaced to reflect the action that occurred. For example, the position of the portion of memory may be moved to the back of a sorted list of least-recently used portions, and/or may be moved to a different collection, such as to a collection of active portions. Alternatively or in addition, the position of other portion(s) of memory may be updated, modified, and/or replaced in response to the action. For example, one or more other portions may be moved from the collection of active
portions to a collection of inactive portions. Other examples of portion replacement strategies may include not-recently-used, first-in-first-out, second-chance, clock, random replacement, not-frequently-used, aging, longest-distance-first, any other portion replacement strategy known now or later discovered, and/or any combination of two or more of these and/or any other strategies.
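A minimal sketch of such a sorted collection is shown below: each access moves a portion to the tail of a least-recently-used list, so the head of the list is always the oldest candidate. The structure is illustrative only; a fuller implementation would also maintain separate active and inactive collections as described above.

```c
#include <stdio.h>
#include <stdint.h>

/* Minimal least-recently-used list: on every access a portion moves to
 * the tail (most recently used), so the head is the best reclaim
 * candidate.  Illustrative sketch, not a complete implementation. */

struct portion {
    uint64_t offset;
    int linked;                      /* nonzero when on the list */
    struct portion *prev, *next;
};

static struct portion *lru_head, *lru_tail;

static void lru_unlink(struct portion *p)
{
    if (p->prev) p->prev->next = p->next; else lru_head = p->next;
    if (p->next) p->next->prev = p->prev; else lru_tail = p->prev;
    p->prev = p->next = NULL;
    p->linked = 0;
}

static void lru_touch(struct portion *p)     /* called on each access */
{
    if (p->linked)
        lru_unlink(p);
    p->prev = lru_tail;
    p->next = NULL;
    if (lru_tail) lru_tail->next = p; else lru_head = p;
    lru_tail = p;
    p->linked = 1;
}

int main(void)
{
    static struct portion ps[4] = {
        { .offset = 0x0000 }, { .offset = 0x1000 },
        { .offset = 0x2000 }, { .offset = 0x3000 }
    };
    for (int i = 0; i < 4; i++) lru_touch(&ps[i]);
    lru_touch(&ps[0]);               /* portion 0 is now most recently used */
    printf("reclaim candidate: offset 0x%llx\n",
           (unsigned long long)lru_head->offset);   /* prints 0x1000 */
    return 0;
}
```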
[0059] The one or more selection strategies, portion replacement strategies, and/or page replacement strategies used may be selected by a user and/or an administrator, may be configured, and/or may be selected based on any one or more policies, passed functions, steps, and/or rules that the memory extension logic 142, the client logic 112, and/or another logic follows to determine which one or more selection strategies, portion replacement strategies, and/or page replacement strategies to use and/or which parameter(s) and/or configuration(s) to use with the one or more selection strategies, portion replacement strategies, and/or page replacement strategies. The one or more policies, passed functions, steps, and/or rules may use any available information, such as any one or more of the characteristics and/or configurations of the client(s) 100, the memory extension module(s) 140, the memory appliance(s), and/or the management server(s), to select the one or more selection strategies, portion replacement strategies, and/or page replacement strategies. For example, a policy and/or passed function may specify to use a random replacement strategy unless the number of accesses that result in reading from the backing store 146 and/or via the communication interface 148 reaches a threshold, and/or to use a least-recently-used strategy after reaching the threshold.
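The policy example above might look like the following sketch, assuming a hypothetical threshold on the number of reads that reach the backing store and simple stand-ins for the random and least-recently-used pickers.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* Sketch of the policy example: use a cheap random replacement strategy
 * while misses that reach the backing store are rare, and switch to a
 * least-recently-used strategy once a configured threshold of such
 * reads is crossed.  The threshold and pickers are assumptions. */

#define NPORTIONS 8
#define BACKING_READ_THRESHOLD 100

static uint64_t last_use[NPORTIONS];        /* per-portion access stamps */
static uint64_t backing_store_reads;        /* misses served from 146    */

static int pick_random(void)
{
    return rand() % NPORTIONS;
}

static int pick_lru(void)
{
    int best = 0;
    for (int i = 1; i < NPORTIONS; i++)
        if (last_use[i] < last_use[best])
            best = i;
    return best;
}

/* Select a reclaim candidate according to the active policy. */
static int select_reclaim_candidate(void)
{
    if (backing_store_reads < BACKING_READ_THRESHOLD)
        return pick_random();
    return pick_lru();
}

int main(void)
{
    for (int i = 0; i < NPORTIONS; i++) last_use[i] = 100 - i;
    backing_store_reads = 250;              /* past the threshold */
    printf("reclaim candidate: portion %d\n", select_reclaim_candidate());
    return 0;
}
```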
[0060] In addition to those listed elsewhere in this disclosure, characteristics and/or configurations of the client 100, memory extension module 140, memory appliance, management server, and/or any other device/component may include: name, system time, time zone, time synchronization settings, time server(s), network configuration, hostname, power configuration, battery policy, Uninterruptible Power Supply (UPS) configuration, disk policy, backing store configuration, persistence configuration, persistence mode, customer-support configuration(s), service configuration(s), user interface configuration, user configuration, user-group configuration, health monitoring configuration(s), network monitoring configuration(s), SNMP configuration, logic version, software version, firmware version, microcode version, any other fixed aspect of the device/component, and/or any other aspect of the device/component which may be configured and/or changed. Alternatively or in addition, characteristics and/or configurations may include associations between components/devices.
[0061] Alternatively or in addition, the one or more selection strategies, portion replacement strategies, and/or page replacement strategies used may be affected by one or more policies, passed functions, steps, and/or rules that the memory extension logic 142, the client logic 112, and/or another logic follows to determine how the one or more selection strategies, portion replacement strategies, and/or page replacement strategies should operate. The one or more policies, passed functions, steps, and/or rules may use any available information, such as any one
or more of the characteristics and/or configurations of the client(s) 100, the memory extension module(s) 140, the memory appliance(s), and/or the management server(s), to affect the one or more selection strategies, portion replacement strategies, and/or page replacement strategies. For example, a policy and/or passed function may specify an amount of memory to unmap from page table entries and/or to reclaim when activated. In another example, a policy and/or passed function may specify an amount of data to retain in local primary memory as frequently-used and/or “hot” data.
[0062] In some examples, a sorted collection may be approximated by storing one or more values with portion-tracking data structures. The one or more values stored may include one or more generation counters and/or timestamp values. For example, a generation counter may be maintained and/or may be incremented and/or decremented as portion(s) are accessed and/or as data accesses and/or page fault(s) occur. As the portion(s) are accessed and/or as page fault(s) occur, the one or more values may be stored with one or more portion tracking data structures that correspond to the portion(s) being accessed and/or the portion(s) being page-faulted. Alternatively or in addition, the one or more values stored may include a count of the number of times corresponding portion(s) were accessed and/or page-faulted.
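A minimal sketch of this approximation is shown below: a global generation counter advances on each access, its value is stamped into the tracking structure of the touched portion, and a per-portion access count can be reset to start a new observation window. All structure and function names are illustrative.

```c
#include <stdio.h>
#include <stdint.h>

/* Sketch of the approximation described above: instead of maintaining a
 * fully sorted list, a global generation counter is incremented on each
 * access (or page fault), and its value is recorded in the portion
 * tracking structure of the portion that was touched, together with a
 * per-portion access count.  Names are illustrative assumptions. */

struct portion_tracking {
    uint64_t offset;
    uint64_t last_generation;   /* generation at the most recent access */
    uint32_t access_count;      /* accesses since the last reset        */
};

static uint64_t generation;     /* global generation counter */

static void record_access(struct portion_tracking *pt)
{
    generation++;                       /* advance on every access   */
    pt->last_generation = generation;   /* stamp the touched portion */
    pt->access_count++;
}

static void reset_counts(struct portion_tracking *pts, int n)
{
    for (int i = 0; i < n; i++)
        pts[i].access_count = 0;        /* start a new observation window */
}

int main(void)
{
    struct portion_tracking pts[2] = { { 0x0000 }, { 0x1000 } };
    record_access(&pts[1]);
    record_access(&pts[1]);
    record_access(&pts[0]);
    printf("portion 0x1000: gen=%llu count=%u\n",
           (unsigned long long)pts[1].last_generation, pts[1].access_count);
    reset_counts(pts, 2);
    return 0;
}
```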
[0063] Periodically and/or in response to an event and/or condition, such as when processing access information (304), all of or a subset of all portion tracking data structures may be inspected and/or corresponding stored values may be collected in a data structure. For example, all portion tracking data structures for a file and/or allocation domain may be inspected and/or corresponding stored values may be collected. An allocation domain may be a logical partitioning of computing resources and/or may be controlled via one or more file data limits, such as described in U.S. Nonprovisional patent application 15/424,395, filed February 3, 2017, which is hereby incorporated by reference. In some examples, one or more virtualization instances, virtual machines, containers, jails, and/or zones may be and/or may be included in an allocation domain. The collected values may be stored in the data structure with one or more identifiers for the corresponding portions, such as an address, offset, and/or index of the corresponding portions and/or an address and/or reference to corresponding portion tracking data structures. The collected values may be sorted while being collected and/or after being collected. In some examples, all of or a subset of all portion tracking data structures may be updated, modified, and/or replaced periodically and/or in response to the event and/or condition and/or in response to another event and/or condition. For example, in examples where the one or more values stored include a count of the number of times corresponding portion(s) were accessed and/or page-faulted, the value(s) may be reset, cleared, zeroed, removed, updated, modified, replaced, and/or invalidated, such as by setting the value(s) to zero and/or setting and/or clearing one or more indicators indicating that the value(s) are invalid, not present, not available, not accessible, and/or not writable.
[0064] The sorted collected values may be used to identify portion(s) of memory that have not been used recently and/or are unlikely to be used in the near future by inspecting the portion-tracking data structures for the portions corresponding to the first and/or last value in the sorted collected values. For example, if the sorted collected values are sorted based upon a timestamp and/or generation counter for the most recent access and/or page fault of corresponding portions, then the lowest value may correspond to the portion which was least recently used. Alternatively or in addition, if the sorted collected values are sorted based upon the count of the number of times corresponding portions were accessed and/or page-faulted, then the lowest value may correspond to a portion which is relatively infrequently used and/or which has not been used frequently since the value(s) were last reset. As such, storing, collecting, and/or sorting the stored values may be used to approximate a least-recently-used strategy, a not-frequently-used strategy, and/or any other strategy that traditionally includes maintaining a sorted collection to identify one or more candidates to reclaim.
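A minimal sketch in C, continuing the hypothetical portion_tracking layout above, of collecting stored generation values and sorting them to approximate a least-recently-used ordering; the data values are stand-ins for values read out of portion-tracking data structures.

```c
/* Minimal illustrative sketch: periodically collect stored generation
 * values together with portion indices, sort the collection, and treat
 * the lowest values as least-recently-used reclaim candidates. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct collected { uint64_t value; size_t portion_index; };

static int cmp_collected(const void *a, const void *b)
{
    const struct collected *ca = a, *cb = b;
    if (ca->value < cb->value) return -1;
    if (ca->value > cb->value) return 1;
    return 0;
}

int main(void)
{
    /* Stand-in for values read out of portion-tracking structures. */
    uint64_t last_generation[] = { 40, 7, 55, 13, 2 };
    size_t n = sizeof(last_generation) / sizeof(last_generation[0]);

    struct collected *c = malloc(n * sizeof(*c));
    if (!c) return 1;
    for (size_t i = 0; i < n; i++) {
        c[i].value = last_generation[i];
        c[i].portion_index = i;
    }

    qsort(c, n, sizeof(*c), cmp_collected);

    /* The first entries approximate the least-recently-used portions. */
    printf("best reclaim candidate: portion %zu (generation %llu)\n",
           c[0].portion_index, (unsigned long long)c[0].value);
    free(c);
    return 0;
}
```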
[0065] A timestamp value may be any value that corresponds to the current time. The timestamp value may be specified in any one or more units of time, such as seconds, milliseconds, microseconds, jiffies, etc., and/or may be relative to some other defined time. For example, a timestamp may be the number of seconds since midnight on January 1st of 1970. In another example, a timestamp may be the number of seconds and microseconds since midnight on January 1st of 2000. The timestamp value may be specified in a defined time zone, such as UTC, GMT, and/or any other time zone. Alternatively or in addition, the timestamp value may be specified in the time zone where one or more clients 130, one or more memory appliances 110, one or more management servers 120, and/or any other one or more entities are physically located.
[0066] In some examples, the actual values for one or more portions may change between when the values in the sorted collected values are collected and when portion(s) of memory are to be identified for reclaim and/or are to be reclaimed. Prior to selecting a portion of memory for reclaim and/or prior to reclaiming the selected portion, the memory extension logic 142, the client logic 112, and/or another logic may inspect the corresponding portion tracking data structure and/or obtain the current value for the portion. If the current value for the portion is equal to the value in the sorted collected values data structure, the portion may be considered a good candidate to reclaim. Alternatively or in addition, the current value for the portion may be compared with a threshold value to determine if the portion is a good candidate to reclaim. For example, the threshold value may be the mean and/or median value from the sorted collected values, the 80th percentile value from the sorted collected values, any other value from the sorted collected values, a calculated value based upon one or more values from the collected values, a value based on the current timestamp and/or generation counter, and/or any other value which may be useful for determining suitability for reclaiming a portion. In some examples, the current value for the portion
may be checked during other and/or additional steps of reclaiming portion(s) of memory. For example, the current value may be compared to the stored and/or threshold value while selecting destination memory (324), prior to starting to read data (328), and/or prior to invalidating one or more page table entries and/or updating other data structure(s).
[0067] In some examples, the collected values may be analyzed and/or used to determine the threshold value, but not used directly to identify candidate portions to reclaim. For example, the collected values may be sorted, analyzed, and/or partially sorted in order to determine the value that would be at the 80th percentile and/or any other position of the sorted collected values and choose this value as the threshold value. In another example, the threshold value may be calculated based on one or more of the collected values and/or sorted collected values. In lieu of using the collected values directly to identify candidate portions to reclaim, candidate portions may be chosen using any selection strategy, portion replacement strategy, and/or page replacement strategy known now or later discovered. For example, candidate portions to reclaim may be selected randomly and/or portion tracking data structures for candidate portions may be inspected to compare one or more stored values to one or more threshold values that were determined from the collected values.
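The threshold-based variant might be sketched as follows; the 80th-percentile choice, the helper names, and the candidate values are assumptions for illustration only, not a definitive implementation of the disclosed logic.

```c
/* Minimal illustrative sketch: derive a threshold (here roughly the
 * 80th-percentile generation value) from the collected values and use
 * it to test candidate portions, rather than walking a fully sorted
 * collection to pick candidates directly. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* A portion whose current value is at or below the threshold is
 * considered a reasonable candidate to reclaim. */
static int is_reclaim_candidate(uint64_t current_value, uint64_t threshold)
{
    return current_value <= threshold;
}

int main(void)
{
    uint64_t collected[] = { 40, 7, 55, 13, 2, 90, 21, 34, 68, 5 };
    size_t n = sizeof(collected) / sizeof(collected[0]);

    qsort(collected, n, sizeof(collected[0]), cmp_u64);
    uint64_t threshold = collected[(n * 80) / 100];  /* ~80th percentile */

    printf("threshold=%llu\n", (unsigned long long)threshold);
    printf("portion with value 13 is candidate: %d\n",
           is_reclaim_candidate(13, threshold));
    printf("portion with value 90 is candidate: %d\n",
           is_reclaim_candidate(90, threshold));
    return 0;
}
```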
[0068] An advantage of using the sorted collected values and/or threshold values as described herein over maintaining one or more sorted collections may be that lock contention may be reduced and/or eliminated. For example, if a lock primitive would be used to access, update, modify, and/or replace the sorted collection(s), the lock primitive could be avoided when not attempting to maintain the sorted collection(s), and/or the lock primitive may not need to be acquired when handling a page fault and/or other type of access operations which may otherwise cause the sorted collection(s) to be updated/modified/replaced and/or one or more portions to be moved within the sorted collection(s) and/or between multiple collections.
[0069] In some example systems, the sorted collected values may be collected and/or sorted without acquiring any highly-contended lock primitives. For example, when using a lock-free page cache and/or other data structure, portions of a file and/or address space may be iterated by speculatively referencing entries in a radix tree and/or other data structure that uses a read-copy-update and/or other lock-free mechanism to coordinate access. Speculatively-referenced entries may be individually locked and/or verified to exist when collecting the stored values for corresponding portion tracking data structures. In some examples, if a portion is already locked, invalid, and/or unable to be verified, the portion may be skipped. Skipping these portions may be advantageous and/or may improve efficiency of collecting values in example systems where locked and/or invalid portions are unlikely to be good candidates to reclaim.
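One simplified way to skip locked portions during collection is sketched below with POSIX trylock rather than a read-copy-update mechanism; the use of a per-portion mutex is an assumption made to keep the example self-contained.

```c
/* Minimal illustrative sketch: when collecting values, take each
 * per-portion lock with a trylock and skip portions that are already
 * locked, since a locked portion is being actively used and is unlikely
 * to be a good reclaim candidate anyway. */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

struct portion {
    pthread_mutex_t lock;
    uint64_t last_generation;
};

/* Collect values without blocking on contended portions. */
static size_t collect_values(struct portion *p, size_t n, uint64_t *out)
{
    size_t collected = 0;
    for (size_t i = 0; i < n; i++) {
        if (pthread_mutex_trylock(&p[i].lock) != 0)
            continue;                      /* busy: skip this portion */
        out[collected++] = p[i].last_generation;
        pthread_mutex_unlock(&p[i].lock);
    }
    return collected;
}

int main(void)
{
    struct portion portions[3];
    uint64_t values[3];

    for (int i = 0; i < 3; i++) {
        pthread_mutex_init(&portions[i].lock, NULL);
        portions[i].last_generation = 10 * (i + 1);
    }
    pthread_mutex_lock(&portions[1].lock);   /* simulate a busy portion */

    size_t got = collect_values(portions, 3, values);
    printf("collected %zu of 3 values\n", got);

    pthread_mutex_unlock(&portions[1].lock);
    return 0;
}
```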
[0070] Upon completing processing access information (304), the memory extension logic 142 and/or another logic may optionally start writeback (306) for one or more portions of the memory 144 of the memory extension module 140. Starting writeback (306) may include writing data from the one
or more portions of the memory 144 to corresponding portion(s) of the backing store 146, and/or to other locations, such as by writing to one or more memory appliances and/or other locations via the communication interface(s) 148. In some examples, the data may be copied using client-side memory access, via one or more RDMA operations, and/or via any other one or more suitable data transfer operations. In these and/or other examples, the data may be optionally encrypted, decrypted, compressed, decompressed, and/or may undergo data translation before, during, and/or after being written (306). Other examples of data translation may be described elsewhere in this disclosure. In some examples, the encryption, decryption, compression, decompression, and/or data translation may be performed in whole or in part by the memory extension logic 142, one or more communication interfaces 148, one or more processors 130, other devices, and/or other logic. In examples where the data is compressed, the compression may be performed prior to selecting the corresponding portion(s) of the backing store 146 and/or the other locations. Compressing the data before selecting a destination may be advantageous when using compression algorithms for which it may be difficult to predict the resulting size of compressed data.
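A minimal sketch of writing one portion back to a hypothetical file-backed backing store is shown below; a real system might instead issue client-side memory access and/or RDMA operations via the communication interface(s), and the file name and portion size are assumptions for illustration.

```c
/* Minimal illustrative sketch: write one in-memory portion to its
 * corresponding offset in a file used as the backing store. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PORTION_SIZE 4096

/* Write one portion of memory to the backing store at its offset. */
static int writeback_portion(int backing_fd, const void *portion,
                             off_t portion_index)
{
    ssize_t written = pwrite(backing_fd, portion, PORTION_SIZE,
                             portion_index * PORTION_SIZE);
    return written == PORTION_SIZE ? 0 : -1;
}

int main(void)
{
    char portion[PORTION_SIZE];
    memset(portion, 0xAB, sizeof(portion));

    int fd = open("backing_store.img", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (writeback_portion(fd, portion, 3) != 0) {
        perror("pwrite");
        close(fd);
        return 1;
    }
    printf("portion 3 written back\n");
    close(fd);
    return 0;
}
```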
[0071] Data translation may include manipulating the data being read and/or written. In a first example, data translation may include compressing the data being written and/or decompressing the data being read. Compression and/or decompression may be performed using any one or more compression schemes, such as Lempel-Ziv (LZ), DEFLATE, Lempel-Ziv-Welch (LZW), Lempel-Ziv-Renau (LZR), Lempel-Ziv-Oberhumer (LZO), Huffman encoding, LZX, LZ77, Prediction by Partial Matching (PPM), Burrows-Wheeler transform (BWT), Sequitur, Re-Pair, arithmetic code, and/or other method and/or scheme known now or later discovered which may be used to recoverably reduce the size of data.
[0072] In a second example, data translation may include encrypting the data being written and/or decrypting the data being read. Encryption and/or decryption may be performed using any one or more encryption schemes and/or ciphers, such as symmetric encryption, public-key encryption, block ciphers, stream ciphers, substitution ciphers, transposition ciphers, and/or any other scheme which may be used to encode information such that only authorized parties may decode it.
[0073] In a third example, data translation may include performing error detection and/or error correction upon the data being written and/or the data being read. Error detection and/or error correction may be performed using any one or more error detection and/or error correction schemes, such as repetition codes, parity bits, checksums, cyclic redundancy checks, cryptographic hash functions, error correcting codes, forward error correction, convolutional codes, block codes, Hamming codes, Reed-Solomon codes, Erasure Coding-X (EC-X) codes, Turbo codes, low-density parity-check codes (LDPC), and/or any other scheme which may be used to detect and/or correct data errors.
[0074] Error detection and/or error correction may include performing additional calculations to confirm the integrity of the data written to and/or read. For example, one or more digests may be written for one or more corresponding portions of the memory and/or backing store. When reading the corresponding portion, if the stored digest does not match the digest which can be computed from the read data for the portion, then the read may be considered failed and/or the portion may be considered corrupted. Alternatively or in addition, the data may be corrected based upon the one or more digests and/or error correcting codes.
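A minimal sketch of digest-based error detection is shown below; the FNV-1a hash and the portion layout are assumptions chosen for illustration, and a production system might use a CRC, cryptographic hash, and/or error-correcting code instead.

```c
/* Minimal illustrative sketch: store a digest alongside each written
 * portion and recompute it on read; a mismatch marks the read as failed
 * and/or the portion as corrupted. */
#include <stdint.h>
#include <stdio.h>

/* 64-bit FNV-1a, used here only as an illustrative digest. */
static uint64_t digest(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

int main(void)
{
    char portion[64] = "example portion contents";
    uint64_t stored = digest(portion, sizeof(portion)); /* written with data */

    /* Later, on read: recompute and compare. */
    if (digest(portion, sizeof(portion)) != stored) {
        fprintf(stderr, "portion corrupted\n");
        return 1;
    }
    printf("digest verified\n");

    portion[0] ^= 0x01;                     /* simulate corruption */
    printf("after corruption, digest matches: %d\n",
           digest(portion, sizeof(portion)) == stored);
    return 0;
}
```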
[0075] Further examples may include performing multiple types of data translation. For example, the client logic 112, the memory extension logic 142, and/or another entity may encrypt the data being written and/or compute one or more error detecting and/or error correcting codes for the data and/or for the encrypted data. Alternatively or in addition, the client logic 112, the memory extension logic 142, and/or another entity may decrypt the data being read and/or may perform error detection and/or error correction upon the data and/or encrypted data being read.
[0076] The memory extension logic 142 and/or another logic may optionally wait for the writeback to complete. Alternatively, or in addition, the memory extension logic 142 and/or another logic may proceed without waiting, such as by marking the portions as being under writeback and/or by updating one or more data structures to indicate writeback is in progress.
[0077] Upon completing starting writeback (306) and/or upon skipping starting writeback, the memory extension logic and/or another logic may provide one or more reclaim candidates (308). Providing the one or more reclaim candidates (308) may indicate that one or more portions are appropriate candidates to consider when reclaiming one or more portions of memory. Providing the reclaim candidates (308) may include sending/invoking a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, and/or may utilize any other mechanism capable of conveying information about the reclaim candidates to a reclaim logic. Upon completing providing the reclaim candidates (308), the memory extension logic and/or another logic may be done (310) handling the request for reclaim candidates.
[0078] The reclaim logic may be any logic responsible for and/or that performs reclaiming portions of memory 144, facilitating the portions being reused for other purposes. In some examples, the reclaim logic may be included in the memory extension logic 142, in the client logic 112, and/or in another logic. The reclaim logic may, for example, reclaim portions of memory 144 in response to a request to start loading data into memory (322).
[0079] FIG. 3B illustrates an example flowchart for starting to load data into memory. The memory extension logic 142 and/or another logic may perform the operations shown in FIG. 3B and/or other figures herein. The memory extension logic 142 and/or another logic may receive a request to start loading data into memory (322). The request to start loading data into memory (322) may be any one or more mechanisms that may trigger the operations described for FIG. 3B.
Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc. The request (322) may be received from another logic, such as the client logic 112 and/or application logic 114, and/or the request (322) may be received from the memory extension logic 142, such as by the memory extension logic 142 making a determination to start loading data. The memory extension logic 142, and/or any other one or more logics may make the determination independently and/or in coordination with each other and/or with any other one or more logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason. For example, the memory extension logic 142, the client logic 112, the application logic 114, and/or another logic may determine that some data is needed to service an expected data access from the processor 130. In some examples, there may be no request (322), such as if the determination is made by the memory extension logic 142.
[0080] In response to the request and/or determination, the memory extension logic 142 and/or another logic may select one or more portions of destination memory (324). Selecting one or more portions of destination memory (324) may include selecting one or more portions from the reclaim candidates. Alternatively or in addition, selecting the destination memory (324) may include selecting portions using any one or more selection strategies, portion replacement strategies, and/or page replacement strategies known now or later discovered, such as described for selecting reclaim candidate(s). In examples where the reclaim candidate(s) are provided as a sorted collection, the destination memory may be selected (324) by selecting the next one or more portions from the sorted collection. In other examples, one or more random portions may be selected. In examples where some portions of the memory 144 are currently unused, one or more of the unused portions may be selected (324). An advantage of using the reclaim candidate(s), selecting randomly, and/or selecting from among unused portions may be that selecting the one or more portions of destination memory (324) may be performed by a hardware logic and/or without invoking a processor (for example, the processor 130 of the client 100 and/or the processor of the memory extension logic 142, if present). For example, a hardware state machine may select the one or more portions of destination memory (324) and/or may perform one, more, and/or all of the other operations of FIG. 3B. Performing one or more operations of FIG. 3B by a hardware logic may be advantageous in examples where doing so reduces the time and/or power consumption for performing the operations, where doing so enables a larger number of concurrent data loading operations to be performed, and/or where doing so enables higher throughput of data loading operations. In other examples, selecting the one or more portions of destination memory (324) may be performed by any other logic(s), such as a software logic.
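One possible selection routine is sketched below with hypothetical structures; it prefers unused portions and otherwise consumes the provided reclaim candidates in order, a decision simple enough to realize in a hardware state machine.

```c
/* Minimal illustrative sketch: prefer an unused portion if one exists,
 * otherwise take the next provided reclaim candidate. */
#include <stdio.h>

#define NO_PORTION (-1)

struct destination_selector {
    int *free_list;           /* indices of currently unused portions */
    int free_count;
    int *reclaim_candidates;  /* sorted candidates provided earlier */
    int candidate_count;
    int next_candidate;
};

static int select_destination(struct destination_selector *s)
{
    if (s->free_count > 0)
        return s->free_list[--s->free_count];        /* unused portion */
    if (s->next_candidate < s->candidate_count)
        return s->reclaim_candidates[s->next_candidate++];
    return NO_PORTION;                               /* nothing available */
}

int main(void)
{
    int free_list[] = { 7 };
    int candidates[] = { 3, 12, 5 };
    struct destination_selector s = {
        .free_list = free_list, .free_count = 1,
        .reclaim_candidates = candidates, .candidate_count = 3,
        .next_candidate = 0,
    };

    for (int i = 0; i < 3; i++)
        printf("selected destination portion: %d\n", select_destination(&s));
    return 0;
}
```

Either path avoids sorting or scanning at selection time, which is part of why it may be performed without invoking a processor.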
[0081] Alternatively or in addition, selecting the destination memory (324) may include notifying one or more logics, such as the client logic 112 and/or application logic 114, that one or
more corresponding portions of address space are no longer associated with the selected portion(s) of destination memory. For example, the memory extension logic 142 and/or another logic may send a notification to the client logic 112, to the application logic 114, to the client 100, and/or to any other one or more logics indicating that the address space portion(s) are no longer associated with the selected portion(s). In response to the notification, the client logic 112, the application logic 114, the client, and/or other logic(s) may update one or more page table entries, page table(s), memory presence indicator(s), portion tracking data structure(s), and/or other data structure(s) to indicate that one or more address space portion(s) are no longer present, valid, available, accessible, writable, and/or associated with the selected portion(s). Alternatively or in addition, the memory extension logic 142 and/or another logic may update one or more page table entries, page table(s), memory presence indicator(s), portion tracking data structure(s), and/or other data structure(s) to indicate that one or more address space portion(s) are no longer present, valid, available, writable, accessible, and/or associated with the selected portion(s).
[0082] Upon completing selecting one or more portions of destination memory (324), the memory extension logic and/or another logic may optionally start writeback (326) for one or more of the selected portions. For example, if one or more of the selected portions are from the provided reclaim candidates and/or if one or more of the selected portions contain data that has not been written to the backing store 146 and/or via the communication interface(s) 148, the memory extension logic and/or another logic may start writing data from the one or more portions of the memory 144 to corresponding portion(s) of the backing store 146, and/or to other locations, such as described elsewhere in this disclosure for starting writeback (306). Alternatively or in addition, such as if one or more of the selected portions are unused and/or do not contain such data, writeback may not be started for these portions.
[0083] Upon completing starting writeback (326) and/or upon skipping starting writeback, the memory extension logic and/or another logic may start reading (328) the data indicated by the request and/or determination to start loading data into memory (322). The data may be read into memory 110, 144 by reading from corresponding portion(s) of the backing store 146, by reading corresponding portions of a memory appliance, such as via the communication interface(s) 148, and/or by initializing one or more portions of memory. Upon starting to read the data, operations may be complete (330).
[0084] The operations depicted in FIG. 3A and FIG. 3B may occur independently, concurrently, repeatedly, and/or may be interleaved. For example, the memory extension logic 142 and/or another logic may handle one or more requests for reclaim candidates (302) while also reading data (328) with previously-provided reclaim candidates. Alternatively or in addition, two or more of the steps of FIG. 3A and/or FIG. 3B may be combined. For example, writeback may be started (306) for one or more portions while processing access information (304), such as by starting writeback upon processing information for each of the one or more portions. In other examples,
one or more reclaim candidates may be provided (308) prior to starting writeback (306) for the corresponding portion(s) and/or for other portion(s). In other examples, both writeback (306, 326) and reading data (328) may be started and/or may proceed concurrently. For example, the data being written may be copied to a temporary location, such as another portion of the memory 144, 110 and/or a temporary buffer of the backing store 146 and/or of the communication interface(s) 148, prior to starting and/or as part of starting writeback (306, 326). Alternatively or in addition, the data being read may be copied from the temporary location and/or a second temporary location after and/or as part of reading data (328). In other examples, starting writeback (306, 326) may be performed independently of the operations depicted in FIG. 3A and/or FIG. 3B, such as periodically and/or as part of a background operation.
[0085] Furthermore, while some of the operations may have been described as an ordered set of operations, it would be clear to one skilled in the art that other sequences may be equally effective in response to the request(s) and/or determination(s) for reclaim candidates (302) and/or to start loading data (322). For example, other sequences may have additional operations, some operations may be skipped, and/or some operations may be performed in a different order than described. For example, writeback may not be started (326) in response to the request and/or determination to start loading data (322) if writeback had already been started (306) in response to the request and/or determination for reclaim candidates (302).
[0086] The operations of the client logic 112, the application logic 114, the memory extension logic 142, and/or other logic(s) may be optimized and/or coordinated to further reduce the duration when the processor 130 and/or other logic is waiting. For example, the client logic 112, the application logic 114, and/or another logic may indicate to the memory extension logic 142 that one or more portions of memory 110, 144 may no longer contain useful data and/or that the portion(s) may be re-initialized instead of retrieving data from the backing store 146 and/or via the communication interface(s) 148. The indication(s) may be performed in response to other operations of the client 100, such as allocating and/or freeing memory and/or in response to a request, such as an initialization, zeroing, and/or discard request, from one or more logic(s), such as from the application logic 114.
[0087] FIG. 4A illustrates an example flowchart for allocating memory. The client logic 112 and/or another logic may perform the operation shown in FIG. 4A and/or other figures herein. The client logic 112 and/or another logic may receive a request for allocating memory (402). The request for allocating memory (402) may be any one or more mechanisms that may trigger the operations described for FIG. 4A. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc. The request (402) may be received from another logic, such as the application logic 114, and/or the request (402) may be received from the client logic 112, such as by the client logic 112 making a determination to allocate
memory. The client logic 112, and/or any other one or more logics may make the determination independently and/or in coordination with each other and/or with any other one or more logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason. For example, the client logic 112 may determine that additional memory may improve operation of the application logic 114 and/or may determine to allocate additional memory for the application logic 114, such as in examples where the application logic 114 includes a virtualization instance. In some examples, there may be no request (402), such as if the determination is made by the client logic 112. In other examples, the request for allocating memory (402) may include a memory allocation function call, such as malloc. Alternatively or in addition, the request for allocating memory (402) may be implied by the occurrence of a page fault.
[0088] In response to the request and/or determination, the client logic 112 and/or another logic may select one or more portion(s) of memory (404). The selected portion(s) may be addressable by the processor 130 of the client 100. For example, the selected portion(s) may include one or more portions of the memory 110 of the client 100. In another example, the selected portion(s) may include one or more portions of the memory 144 of the memory extension module 140. Alternatively or in addition, the selected portion(s) may include one or more portions of physical address space associated with the memory extension module 140, such as with the physically-addressable interface and/or hardware-accessible interface.
[0089] Whether to select portion(s) of the memory 110 of the client 100, portion(s) of the memory 144 of the memory extension module 140, and/or portions of the physical address space associated with the memory extension module 140 may be determined based upon an expected likelihood that the allocated portion(s) will be accessed soon and/or based upon one or more performance characteristics of the memory 110 of the client 100, the memory 144 of the memory extension module 140, and/or of any other one or more components, such as the interconnect(s) 170, 180, the memory controller 120, the backing store 146, the communication interface(s) 148, and/or the memory extension logic 142. For example, if the client logic 112 and/or another logic makes a determination that the allocated portion(s) are likely to be accessed soon, one or more portions may be selected from the memory with faster performance, such as the memory 110 of the client. In another example, if the client logic 112 and/or another logic makes a determination that the allocated portion(s) are unlikely to be accessed soon, one or more portions may be selected from the memory with slower performance, such as the memory 144 of the memory extension module 140, and/or may be selected from the physical address space associated with the memory extension module 140. Alternatively or in addition, the portion(s) may be selected using any one or more selection strategies described herein and/or one or more selection strategies known now or later discovered, such as those described in U.S. Non-provisional patent application Ser. No. 17/859,389, filed July 7, 2022, which is hereby incorporated by reference, including strategies
described for provisioning, allocation, selection, slab allocation, allocation strategies, policies, passed functions, steps, and/or rules, etc. In examples where more than one portion of memory is allocated and/or selected, one or more of the portion(s) may be selected from the same memory type and/or others may be selected from a different memory type. The same selection strategy may be used for all memory types, and/or one or more different selection strategies may be used, depending upon from which memory type the portion(s) is/are being selected.
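A minimal sketch of a tier-selection decision of this kind is shown below; the tier names, the expected-access hint, and the fallback order are assumptions made for illustration, not a definitive implementation of the selection strategies described herein.

```c
/* Minimal illustrative sketch: choose between the client's local memory
 * and the memory extension module's memory based on an expected-access
 * hint, falling back when the preferred tier has no free portions. */
#include <stdbool.h>
#include <stdio.h>

enum memory_tier { TIER_CLIENT_LOCAL, TIER_EXTENSION, TIER_NONE };

struct tier_state {
    int client_free_portions;
    int extension_free_portions;
};

static enum memory_tier select_tier(struct tier_state *t,
                                    bool likely_accessed_soon)
{
    if (likely_accessed_soon && t->client_free_portions > 0) {
        t->client_free_portions--;
        return TIER_CLIENT_LOCAL;        /* faster local memory */
    }
    if (t->extension_free_portions > 0) {
        t->extension_free_portions--;
        return TIER_EXTENSION;           /* slower extension memory */
    }
    if (t->client_free_portions > 0) {
        t->client_free_portions--;
        return TIER_CLIENT_LOCAL;        /* fall back to local memory */
    }
    return TIER_NONE;
}

int main(void)
{
    struct tier_state t = { .client_free_portions = 1,
                            .extension_free_portions = 2 };

    printf("hot allocation -> tier %d\n", select_tier(&t, true));
    printf("cold allocation -> tier %d\n", select_tier(&t, false));
    printf("hot allocation (local exhausted) -> tier %d\n",
           select_tier(&t, true));
    return 0;
}
```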
[0090] Alternatively or in addition, in response to the request and/or determination, the client logic 112 and/or another logic may configure a portion of an address space of the application logic 114 as anonymous memory and/or may configure the address space, such that the operations described for FIG. 4A may be performed at a later time, such as when accessing and/or writing to the portion of the address space for the first time.
[0091] Upon completing selecting the portion(s) (404), the client logic 112 and/or another logic may optionally request one or more portions of memory and/or address space to be initialized (406). In response to the request to initialize the one or more portions of memory and/or address space, the memory extension logic 142 and/or another logic may initialize the one or more indicated and/or associated portions of memory, such as one or more portions of the memory 144 of the memory extension module 140 and/or one or more portions of the memory 110 of the client 100. Initializing one or more portions may include writing to the portions with some predetermined and/or specified data. For example, the portion(s) of memory 144, 110 may be overwritten with all zeros.
[0092] The portion(s) of memory 144, 110 initialized may be specified by the request and/or may be implied by the specified portion(s) of address space, such as by identifying the portion(s) of memory 144, 110 via one or more mapping data structures. The mapping data structure(s) may be any data structure(s) known now or later discovered that may be capable of associating one or more portion(s) of memory 144, 110 with one or more portion(s) of address space and/or of identifying the one or more portion(s) of memory 144, 110 given one or more portion(s) of address space to query. Examples of data structures suitable for this purpose may include an array, an extensible array, a sparse array, a radix tree, an interval tree, a mapping, a hash table, a page table, and/or any other data structure(s) that facilitates associating memory portions, memory offsets, and/or memory identifiers with address space references and/or identifiers.
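A minimal sketch of a flat-array mapping data structure is shown below; the array form and sentinel value are assumptions for illustration, and a radix tree, page table, and/or hash table could serve the same purpose.

```c
/* Minimal illustrative sketch: a flat array that associates
 * address-space portions with memory portions, with a sentinel value
 * meaning "not associated". */
#include <stdio.h>

#define ADDRESS_SPACE_PORTIONS 16
#define NOT_MAPPED (-1)

/* mapping[address_space_portion] = memory portion, or NOT_MAPPED. */
static int mapping[ADDRESS_SPACE_PORTIONS];

static void map_portion(int as_portion, int mem_portion)
{
    mapping[as_portion] = mem_portion;
}

static int lookup_portion(int as_portion)
{
    return mapping[as_portion];
}

int main(void)
{
    for (int i = 0; i < ADDRESS_SPACE_PORTIONS; i++)
        mapping[i] = NOT_MAPPED;

    map_portion(4, 9);   /* address-space portion 4 -> memory portion 9 */

    printf("portion 4 maps to %d\n", lookup_portion(4));
    printf("portion 5 maps to %d (not mapped)\n", lookup_portion(5));
    return 0;
}
```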
[0093] The portions of memory 144, 110 initialized may be initialized at the time of the request (406) and/or may be initialized at a later time. For example, the memory extension logic 142 and/or another logic may update one or more data structures, such as the mapping data structure(s), to indicate that the portions should be initialized prior to and/or upon being accessed, such as when starting to read data (328). Alternatively or in addition, the memory extension logic 142 and/or another logic may update one or more data structures, such as the mapping data structure(s) to disassociate one or more portions of memory 110, 144 and/or backing store 146 from one or more
address space portions. Disassociating the portions may serve to indicate that the corresponding portions of address space should be initialized prior to and/or upon being accessed.
[0094] Upon completing and/or skipping requesting the portion(s) to be initialized (406), the client logic 112 and/or another logic may map the portion(s) of memory 110, 144 and/or address space to a second address space (408). The second address space may be a virtual address space, such as a virtual address space associated with the application logic 114. Mapping the portion(s) to the second address space (408) may include updating one or more mapping data structures to associate the portion(s) with one or more corresponding portions of the second address space. For example, the client logic 112 and/or another logic may update one or more page tables and/or page table entries to associate the portion(s) of memory 110, 144 and/or address space with one or more portions of the virtual address space associated with the application logic 114. An effect of mapping the portion(s) of memory 110, 144 and/or address space may include causing future accesses by the processor 130, a logic, such as the application logic 114, and/or any other component of the client 100 to the portion(s) of the second address space to cause the processor to access the mapped portion(s) of memory 110, 144 and/or the address space. For example, upon accessing the portion(s) of the second address space, the processor 130 may determine that the mapped portion(s) of memory 110, 144 and/or the address space are mapped to the accessed portion(s) of the second address space and may translate virtual addresses of the second address space to physical addresses of the memory 110, 144 and/or of the address space. Upon determining and/or translating, the processor 130 may then access the mapped portion(s) of memory 110, 144 and/or the address space.
[0095] Upon completing mapping the portion(s) of memory 110, 144 and/or address space to the second address space (408), the client logic 112 and/or another logic may optionally initialize and/or activate one or more page table entries (410). In some examples, initializing and/or activating the one or more page table entries (410) may be performed as part of mapping the portion(s) of memory 110, 144 and/or address space (408). Initializing and/or activating one or more page table entries (410) may include marking the one or more page table entries as valid, present, available, accessible, and/or writable. The one or more page table entries may be associated with and/or included in the mapping data structure(s), such as if the mapping data structure(s) are and/or include a page table. An effect of initializing and/or activating the one or more page table entries (410) may be that when the processor 130, a logic, such as the application logic 114, and/or any other component of the client 100 attempts to access a portion of the second address space, the logic, processor 130, and/or component may proceed to access the mapped portion(s) of the memory 110, 144 and/or address space without causing one or more page faults. Accessing the mapped portion(s) without causing page fault(s) may be advantageous in examples such as these where the accessed portions have been configured with the request for initialization (406), as the logic, processor 130, and/or component may not need to wait an extended period of
time for data of the access to be made available. In contrast, in examples where the accessed portions have not been configured with the request for initialization (406), the logic, processor 130, and/or component may need to wait and/or stall for an extended period of time while the memory extension logic 142 and/or another logic prepares the data for access, such as by retrieving (328) the data from the backing store 146 and/or via the communication interface(s) 148. In these examples, it may be advantageous not to initialize and/or activate the one or more page table entries (410), as not initializing and/or activating the one or more page table entries may cause one or more page faults to occur when the portion(s) of the second address space is/are accessed. Having page fault(s) occur in these examples may be advantageous as the page fault(s) may enable the logic, another logic, the processor 130, the component, and/or another component to perform meaningful work while the data is prepared for access.
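A minimal sketch contrasting the two configurations is shown below; the present flag stands in for the page table entry state, and the fault path is only modeled, not implemented.

```c
/* Minimal illustrative sketch: a page-table-entry-like structure with a
 * present bit. When the entry is activated, an access proceeds
 * directly; when it is not, the access takes a fault path, which gives
 * the system a chance to do other work while the data is prepared. */
#include <stdbool.h>
#include <stdio.h>

struct pte {
    bool present;        /* valid/present/accessible */
    int memory_portion;  /* mapped portion when present */
};

/* Returns true if the access completed without a fault. */
static bool access_portion(struct pte *e)
{
    if (e->present) {
        printf("direct access to memory portion %d\n", e->memory_portion);
        return true;
    }
    printf("page fault: request data load, run other work meanwhile\n");
    return false;
}

int main(void)
{
    struct pte initialized = { .present = true, .memory_portion = 9 };
    struct pte lazy = { .present = false, .memory_portion = 9 };

    access_portion(&initialized);   /* data already prepared: no stall */
    if (!access_portion(&lazy)) {
        /* fault handler would start the load, then activate the entry */
        lazy.present = true;
        access_portion(&lazy);
    }
    return 0;
}
```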
[0096] Upon completing and/or skipping initializing and/or activating the one or more page table entries (410), the client logic 112 and/or another logic may complete (412) handling the request and/or determination to allocate memory, such as by responding to the request and/or determination with a status indication and/or with one or more identifiers for the allocated memory. [0097] FIG. 4B illustrates an example flowchart for freeing memory. The client logic 112 and/or another logic may receive a request for freeing memory (422). The request for freeing memory (422) may be any one or more mechanisms that may trigger the operations described for FIG. 4B. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc. The request (422) may be received from another logic, such as the application logic 114, and/or the request (422) may be received from the client logic 112, such as by the client logic 112 making a determination to free memory. The client logic 112, and/or any other one or more logics may make the determination independently and/or in coordination with each other and/or with any other one or more logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason. For example, the client logic 112 may determine that memory assigned to the application logic 114 may be unused and/or underutilized and/or may determine to free some memory for the application logic 114, such as in examples where the application logic 114 includes a virtualization instance. In some examples, there may be no request (422), such as if the determination is made by the client logic 112. In other examples, the request for freeing memory (422) may include a memory freeing function call, such as free. Alternatively or in addition, the request for freeing memory (422) may be implied by a determination to page data out, such as to the backing store 160 of the client 100, and/or to reclaim memory.
[0098] In response to the request and/or determination (422), the client logic 112 and/or another logic may unmap one or more portions of memory 110, 144 and/or address space from the second address space (424). Unmapping the portion(s) may include updating one or more
mapping data structures to disassociate the portion(s) from one or more corresponding portions of the second address space. Alternatively or in addition, unmapping the portion(s) (424) may include deactivating one or more page table entries, such as by marking the one or more page table entries as invalid, not present, not available, not accessible, and/or not writable. For example, the client logic 112 and/or another logic may update one or more page tables and/or page table entries to disassociate the portion(s) of memory 110, 144 and/or address space from one or more portions of the virtual address space associated with the application logic 114. An effect of unmapping the portion(s) of memory 110, 144 and/or address space may include causing future accesses by the processor 130, a logic, such as the application logic 114, and/or any other component of the client 100 to the portion(s) of the second address space to cause a page fault, a segmentation fault, an error, and/or otherwise not access the unmapped portions.
[0099] Upon completing unmapping the portion(s) of memory (424), the client logic 112 and/or another logic may optionally request the portions of memory and/or address space to be discarded (426). In response to the request to discard the one or more portions of memory and/or address space, the memory extension logic 142 and/or another logic may initialize and/or write to the one or more indicated and/or associated portions of memory, such as one or more portions of the memory 144 of the memory extension module 140 and/or one or more portions of the memory 110 of the client 100, with some predetermined and/or specified data. For example, the portion(s) of memory 144, 110 may be overwritten with all zeros. The portion(s) of memory 144, 110 discarded may be specified by the request and/or may be implied by the specified portion(s) of address space, such as by identifying the portion(s) of memory 144, 110 via one or more mapping data structures.
[00100] The portions of memory 144, 110 discarded may be discarded and/or initialized at the time of the request (426) and/or may be discarded and/or initialized at a later time. For example, the memory extension logic 142 and/or another logic may update one or more data structures, such as the mapping data structure(s), to indicate that the portions should be initialized prior to and/or upon being accessed. Alternatively or in addition, the memory extension logic 142 and/or another logic may update one or more data structures, such as the mapping data structure(s) to disassociate one or more portions of memory 110, 144 and/or backing store 146 from one or more address space portions. Disassociating the portions may serve to indicate that the corresponding portions of address space should be initialized prior to and/or upon being accessed. Alternatively or in addition, the memory extension logic 142 and/or another logic may update one or more data structures, such as one or more page table entries, page table(s), portion tracking data structure(s), memory presence indicator(s), and/or any data structure(s), such as to indicate that the portions of address space are no longer valid, present, available, accessible, and/or writable. Indicating that the portions of address space are no longer valid, present, available, and/or writable may cause a logic, such as the client logic 112, to request the portions
be made available prior to accessing them, which may provide the memory extension logic 142 and/or another logic an opportunity to initialize the portions. Alternatively or in addition, the one or more data structures may be updated to indicate that the portions of address space are valid, present, available, accessible, and/or writable. Indicating that the portions of address space are valid, present, available, accessible, and/or writable may be advantageous in examples where the memory extension logic 142 and/or another logic will initialize the portions upon being accessed. [00101] In some examples, the effect of requesting the portions of memory and/or address space to be discarded (426) may be the same as the effect of requesting the portion(s) to be initialized (406). In other examples the two operations may be different. For example, the memory extension logic 142 and/or another logic may treat the request to initialize (406) as higher or lower priority than the request to discard (426) and/or may handle the requests with different latencies and/or qualities of service. Treating the request to initialize (406) as higher priority, lower latency, and/or higher quality of service than the request to discard (426) may be advantageous in examples where the portion(s) may be accessed shortly after allocation and/or in examples where more portions need to be initialized and/or discarded than the memory extension logic 142 and/or another logic can service in a timely manner. In some examples, the request to discard (426) may be not sent, ignored, deferred, delayed, refused, failed, and/or discarded.
[00102] Upon completing and/or skipping requesting the portions of memory and/or address space to be discarded (426), the client logic 112 and/or another logic may complete (428) handling the request and/or determination to free memory, such as by responding to the request and/or determination with or without a status indication.
[00103] The operations depicted in FIG. 4A and FIG. 4B may occur independently, concurrently, repeatedly, and/or may be interleaved. For example, multiple memory allocation and/or memory freeing operations may be executed concurrently, serially, repeatedly, and/or interleaved. Alternatively or in addition, two or more of the steps of FIG. 4A and/or FIG. 4B may be combined. For example, mapping the portion(s) of memory 110, 144 and/or address space to the second address space (408) and initializing and/or activating the one or more page table entries (410) may be combined. Furthermore, while some of the operations may have been described as an ordered set of operations, it would be clear to one skilled in the art that other sequences may be equally effective in response to the request(s) and/or determination(s) to allocate memory (402) and/or free memory (422). For example, other sequences may have additional operations, some operations may be skipped, and/or some operations may be performed in a different order than described. For example, additional operations may be performed to update one or more data structures tracking free and/or allocated memory.
[00104] FIG. 5 illustrates an example flowchart for a memory access. The processor 130, the memory extension logic 142, the client logic 112, and/or other logic(s) may perform the operations shown in FIG. 5 and/or other figures herein. In the example illustrated in FIG. 5, the
processor 130 may perform operations depicted in the left portion of FIG. 5 (502) (504) (518) (526), the memory extension logic 142 may perform operations depicted in the center portion of FIG. 5 (512) (514) (520) (522) (524), and/or the client logic 112 may perform operations depicted in the right portion of FIG. 5 (506) (508) (510) (516). Alternatively or in addition, one or more other logics and/or components may perform any or all of the depicted operations. For example, instead of or in addition to the processor performing operations depicted in the left portion of FIG. 5, another logic, device, and/or component may perform the operations depicted in the left portion of FIG. 5, such as a GPU, a storage controller, a communication interface, and/or any other logic, device, and/or component capable of accessing memory.
[00105] The processor 130 and/or another logic may receive a request to access memory (502). The request to access memory (502) may be any one or more mechanisms that may trigger the operations described for FIG. 5. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc. The request (502) may be received from another component and/or logic, such as the application logic 114, and/or the request (502) may be received from the processor 130, such as by the processor 130 making a determination to access and/or pre-fetch memory. The processor 130 and/or any other one or more components and/or logics may make the determination independently and/or in coordination with each other and/or with any other one or more components and/or logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason. For example, the processor, the client logic 112, the application logic 114, and/or another logic and/or component may determine, such as with a pre-fetch mechanism, that a portion of memory and/or address space is likely to be accessed soon by another logic (such as the application logic 114) and/or may access the memory in anticipation. In some examples, the request (502) may be one or more computer executable instructions, such as a load instruction and/or a store instruction. In other examples, the request (502) may be an electrical, optical, and/or wireless signal, such as a memory access over an interconnect (such as the interconnects described herein 170, 180) and/or a cache manipulation request (such as a cache fill request).
[00106] In response to the request and/or determination (502), the processor 130 and/or another logic may evaluate (504) a memory presence indicator, such as a page table entry. The memory presence indicator may be stored in memory and/or in another location, such as the backing store 160 of the client 100 and/or the backing store 146 of the memory extension module 140. In some examples, the memory presence indicator may be cached, such as with a translation lookaside buffer (TLB). The memory presence indicator may indicate whether the data associated with the request and/or determination (502) is included in memory 110, 144. Alternatively or in addition, the memory presence indicator may indicate whether the data is accessible via the
physically addressable interface and/or the hardware accessible interface. For example, if the memory extension logic 142 is capable of retrieving the data from the backing store 146 and/or via the communication interface(s) 148, even though it is not yet included in the memory 144, the memory presence indicator associated with the data may indicate that the data is present, valid, available, accessible, and/or writable.
[00107] Embodiments where the memory presence indicator may indicate whether the data is accessible, such as via the physically addressable interface and/or the hardware accessible interface, may be advantageous in examples where the client logic 112, the application logic 114, and/or any other logic is incapable of coordinating with the memory extension logic and/or updating the memory presence indicators as data is transferred (such as described for FIG. 3A and/or FIG. 3B) between memory 110, 144 and backing stores 160, 146, and/or via the communication interface(s) 148.
[00108] Embodiments where the memory presence indicator may indicate whether the data is included in memory 110, 144 may be advantageous in examples where attempting to access the data, such as via the physically addressable interface and/or the hardware accessible interface, causes the processor 130 and/or other logic to be unable to perform meaningful work during the access. For example, one or more hardware pipelines of the processor 130 and/or other logic may be stalled. In examples such as these, it may be advantageous to reduce the duration when the processor 130 and/or other logic is waiting and/or to enable the processor 130 and/or other logic to perform other meaningful work while waiting, such as by causing a page fault. [00109] Upon evaluating the memory presence indicator (504), if the memory presence indicator indicates non-presence, a page fault and/or other request, indication, and/or signal may occur (506). The client logic 112 and/or another logic may perform operations in response, such as those depicted in FIG. 5.
[00110] In response to the page fault and/or other request, indication, and/or signal (506), the client logic 112 and/or another logic may determine if the data associated with the request and/or determination (502) is included in memory 110, 144 (508). Determining if the data is included in memory (508) may include checking one or more data structures, such as one or more mapping data structures, one or more memory presence indicators, and/or any data structure(s) capable of indicating presence or non-presence of data for one or more portions of memory and/or address space. In one example, the checked data structure(s) (508) may be the same as that/those evaluated (504) by the processor 130 and/or other logic. In another example, one or more different data structures may be checked (508), such as a bitmask of presence indicators, and/or a different page table.
[00110] If the client logic 112 and/or other logic determines (508) that the data is included in memory, the client logic 112 and/or other logic may proceed to complete handling the page fault and/or other request, indication, and/or signal (516). Alternatively or in addition, if the client logic
112 and/or other logic determines (508) that the data is not included in memory, the client logic 112 and/or other logic may request the data to be loaded into memory (510), such as by invoking the request to start loading data into memory (322) and/or by sending a data load request. In response to the client logic 112 and/or other logic requesting data to be loaded into memory (510), the memory extension logic 142 and/or another logic may start loading data into memory (512). Starting to load data into memory (512) may be and/or may include one, more, and/or all of the operations as described for FIG. 3B.
[00112] After starting to load data into memory (512) and/or after performing one, more, and/or all of the operations as described for FIG. 3B, the memory extension logic 142 and/or another logic may acknowledge the request to load the data into memory (514). Acknowledging the request (514) may indicate that the request to start loading data into memory was received. Alternatively or in addition, acknowledging the request (514) may indicate that the memory extension logic 142 and/or another logic has started to load the data. Alternatively or in addition, acknowledging the request (514) may indicate that the data has been loaded and/or was already loaded at the time the request was received. Acknowledging the request (514) may include additional status and/or data, such as a success/failure indication, one or more error codes, and/or one or more operation identifiers. The operation identifier(s) may be used to check the status of loading the data, such as by checking a completion queue and/or otherwise checking for an indication that the operation associated with the operation identifier is complete. Checking the completion queue and/or checking for an indication that the operation is complete may be performed periodically, such as with polling, and/or may be performed in response to an event, such as an interrupt indication from the memory extension logic 142 and/or other logic.
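A minimal sketch of operation-identifier polling against a completion queue is shown below; the queue layout and function names are assumptions for illustration, and a real system might instead use interrupt-driven completion notification.

```c
/* Minimal illustrative sketch: the acknowledgment carries an operation
 * identifier, and the requester later polls a completion queue for an
 * entry with that identifier to learn that the data load has finished. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CQ_DEPTH 8

struct completion { uint64_t op_id; bool valid; };

static struct completion completion_queue[CQ_DEPTH];

/* Producer side (e.g. memory extension logic): post a completion. */
static void post_completion(uint64_t op_id)
{
    for (int i = 0; i < CQ_DEPTH; i++) {
        if (!completion_queue[i].valid) {
            completion_queue[i].op_id = op_id;
            completion_queue[i].valid = true;
            return;
        }
    }
}

/* Consumer side (e.g. client logic): poll for a specific operation. */
static bool poll_completion(uint64_t op_id)
{
    for (int i = 0; i < CQ_DEPTH; i++) {
        if (completion_queue[i].valid && completion_queue[i].op_id == op_id) {
            completion_queue[i].valid = false;   /* consume the entry */
            return true;
        }
    }
    return false;
}

int main(void)
{
    uint64_t op_id = 42;   /* identifier returned in the acknowledgment */

    printf("complete yet? %d\n", poll_completion(op_id));
    post_completion(op_id);               /* load finished */
    printf("complete yet? %d\n", poll_completion(op_id));
    return 0;
}
```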
[00113] During the time between when the client logic and/or other logic requests the data to be loaded into memory (510) and when the loading is complete, the client logic 112 and/or another logic may perform other operations. For example, the client logic 112 and/or another logic may perform a context switch and/or may cause the processor 130 to execute instructions for a different logic, such as a second application logic. Performing the context switch may enable the processor 130 and/or other logic to perform meaningful work while the data is loaded.
[00114] Upon the memory extension logic 142 and/or other logic acknowledging the request to load the data into memory (514), the client logic 112 and/or another logic may complete handling the page fault and/or other request, indication, and/or signal (516). Completing handling the page fault and/or other request, indication, and/or signal (516) may include updating one or more of the memory presence indicator(s) to indicate that the data is present, valid, available, accessible, and/or writable. Alternatively or in addition, completing handling the page fault and/or other request, indication, and/or signal (516) may include performing a context switch back to the logic that caused the initial request to access memory (502), may include marking the logic as runnable,
and/or may include indicating that the logic is no longer waiting for data to be loaded into memory, such as by releasing a lock primitive associated with the portion being accessed.
[00115] Lock primitives may be logic and/or data structures which enforce a concurrency control policy. Examples of lock primitives include binary semaphores, counting semaphores, spinlocks, readers-writers spinlocks, mutexes, recursive mutexes, and/or readers-writers mutexes. Lock primitives may include distributed locks and/or may be provided by a distributed lock manager. Examples of distributed locks and/or distributed lock managers include VMScluster, Chubby, ZooKeeper, Etcd, Redis, Consul, Taooka, OpenSSI, and/or Corosync and/or any other distributed lock(s) and/or distributed lock manager(s) known now or later discovered.
[00116] Upon completing handling the page fault and/or other request, indication, and/or signal (516), the processor 130 and/or other logic may proceed to attempt to access memory (502). For example, if as part of completing handling the page fault and/or other request, indication, and/or signal (516), the client logic 112 and/or other logic performed a context switch back to the logic that caused the initial request to access memory (502), then the processor 130 may attempt to re-execute the same instruction that caused the initial page fault and/or other request, indication, and/or signal (506). In this example, however, if the memory presence indicator(s) have been updated, such as part of completing handling the page fault and/or other request, indication, and/or signal (516), when the processor 130 and/or other logic checks the memory presence indicator(s) (504), it may find the data to be present, valid, available, accessible, and/or writable.
[00117] Upon evaluating the memory presence indicator (504), if the memory presence indicator indicates presence, the processor 130 and/or another logic may request the data (518) via the interconnect(s) 170, 180. Requesting the data (518) may include requesting one or more cache line fills via the interconnect(s) 170, 180. Alternatively or in addition, if the data is included in a cache, the data may be read from the cache instead. In examples where the memory and/or address space being accessed (502) is associated with the memory extension module 140, the cache line fill request may be directed to the memory extension logic 142, such as via the physically-addressable interface and/or hardware-accessible interface. In other examples, the cache line fill request may be directed to another logic and/or component, such as the memory controller 120, the memory 110 of the client 100, and/or the memory 144 of the memory extension module 140.
[00118] In response to the data request (518), the memory extension logic 142 and/or another logic may determine if the data associated with the request and/or determination (502) is included in memory 110, 144 (520). Determining if the data is included in memory (520) may include checking one or more data structures, such as one or more mapping data structures, one or more memory presence indicators, and/or any data structure(s) capable of indicating presence or non-presence of data for one or more portions of memory and/or address space. In one example, the checked data structure(s) (520) may be the same as those evaluated (504) by the
processor 130 and/or other logic and/or may be the same as those evaluated (508) by the client logic 112 and/or other logic. In another example, one or more different data structures may be checked (520), such as a bitmask of presence indicators, and/or a different page table.
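As one possible realization of a presence-indicator data structure such as the bitmask mentioned above, the following sketch keeps one bit per fixed-size portion; the portion size, the class name PresenceBitmap, and its method names are assumptions for illustration only.

```python
# A minimal presence-indicator bitmask sketch, assuming 4 KiB portions in a flat
# address range; one bit per portion indicates whether its data is present (520).
PORTION_SIZE = 4096

class PresenceBitmap:
    def __init__(self, num_portions):
        self.bits = bytearray((num_portions + 7) // 8)

    def _index(self, address):
        return address // PORTION_SIZE

    def is_present(self, address):
        i = self._index(address)
        return bool(self.bits[i // 8] & (1 << (i % 8)))

    def mark_present(self, address, present=True):
        i = self._index(address)
        if present:
            self.bits[i // 8] |= (1 << (i % 8))
        else:
            self.bits[i // 8] &= (~(1 << (i % 8))) & 0xFF

if __name__ == "__main__":
    bitmap = PresenceBitmap(num_portions=1024)
    addr = 5 * PORTION_SIZE + 12
    assert not bitmap.is_present(addr)   # not yet loaded
    bitmap.mark_present(addr)            # e.g. after loading the data (522)
    assert bitmap.is_present(addr)
```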
[00119] If the memory extension logic 142 and/or other logic determines (520) that the data is included in memory, the memory extension logic 142 and/or other logic may proceed to respond (524) to the data request (518). Alternatively or in addition, if the memory extension logic 142 and/or other logic determines (520) that the data is not included in memory, the memory extension logic 142 and/or other logic may load the data into memory (522). Loading the data into memory (522) may be and/or may include one, more, and/or all of the operations as described for FIG. 3B.
[00120] After loading the data into memory (522) and/or after at least a portion of the data that includes and/or is the requested (518) data is loaded into memory, the memory extension logic 142 and/or other logic may respond (524) to the data request (518). The response (524) may include the requested data, and/or the response may indicate that the data is available, such as in a cache and/or memory 110, 144.
[00121] Upon the memory extension logic 142 and/or other logic responding to the data request (524), the memory access (502) may be complete (526), and/or the processor 130 and/or other logic may continue one or more previous operations. For example, if the processor 130 was previously executing computer executable instructions included in the application logic 114 that instructed it to access memory (502), the processor 130 may resume executing the computer executable instructions.
[00122] The operations depicted in FIG. 5 may occur independently, concurrently, repeatedly, and/or may be interleaved. For example, multiple memory allocation and/or memory freeing operations may be executed concurrently, serially, repeatedly, and/or interleaved. For example, multiple flows of the operations depicted may be in progress concurrently, such as if multiple execution cores and/or multiple hardware threads are accessing memory. The multiple flows may execute the same operations as each other in coordination, and/or the flows may operate independently. In examples where part and/or all of one or more logics, such as the memory extension logic 142, are hardware logics, the logic(s) may include multiple instances of portions of the logic to enable concurrent execution, such as the processor 130 may include multiple execution cores and/or multiple hardware threads. For example, the memory extension logic 142 may include multiple state machine logics for handling multiple requests (510), (518) for data concurrently. Alternatively or in addition, two or more of the steps of FIG. 5 may be combined. For example, starting to load data into memory (512) may be combined with acknowledging the request to load data (514). Furthermore, while some of the operations may have been described as an ordered set of operations, it would be clear to one skilled in the art that other sequences may be equally effective in response to the memory access (502). For example, other sequences may have additional operations, some operations may be skipped, and/or some operations may be
performed in a different order than described. For example, additional operations may be performed to update one or more memory presence indicators, to update one or more data structures representing memory access statistics, and/or to measure performance of the operations of FIG. 5 and/or any other operations of the system.
[00123] In some examples, such as if acknowledging the request (514) indicates that the data has been loaded, the memory extension logic 142 and/or other logic may acknowledge the request (514) after waiting for the data to be loaded, and/or after some amount of time has passed. The amount of time may be fixed and/or may be variable. For example, the memory extension logic 142 and/or another logic may measure the amount of time between two or more operations, such as acknowledging the data load request (514) and the processor requesting the data (518), and/or the memory extension logic 142 and/or other logic may calculate statistics related to the expected time it would take for the processor 130 and/or other logic(s) to react to completing handling of the page fault and/or other request, indication, and/or signal (516). Alternatively or in addition, the memory extension logic 142 and/or other logic may calculate and/or measure statistics related to the average amount of time waiting for the data request (518). In other examples, the memory extension logic 142 and/or other logic may measure the amount of time between requesting the data (518) and responding to the data request (524).
[00124] The memory extension logic 142 and/or other logic may calculate, for one or more processors 130 and/or other logics that access memory, one or more average, mode, median, mean, variance, standard deviation, any other values related to the time measurements and/or to any other quantity, and/or a combination of these and/or other values, and/or the memory extension logic 142 and/or other logic may adjust the amount of time for acknowledging the request. In examples with more than one processor 130 and/or other logic(s) that access memory, the amount of time for acknowledging the request may be the same amount of time for some and/or all of the processor(s) 130 and/or other logic(s), and/or the amount of time for acknowledging the request may be different for one or more of the processor(s) 130 and/or other logic(s). Adjusting the amount of time for acknowledging the request may enable the memory extension logic 142 and/or other logic to minimize the time during which the processor 130 and/or other logic is waiting on the request for data (518) and/or may maximize the amount of useful work that may be performed while waiting for handling of the page fault and/or other request, indication, and/or signal (506) to be completed (516). Alternatively or in addition, adjusting the amount of time for acknowledging the request may enable the memory extension logic 142 and/or other logic to minimize unnecessary waiting for handling of the page fault and/or other request. For example, the memory extension logic 142 and/or other logic may increase the amount of time for acknowledging the request when the processor and/or other logic needs to wait longer than a threshold duration between requesting the data (518) and receiving a response (524). Alternatively or in addition, the memory extension logic 142 and/or other logic may decrease the amount of time for acknowledging
the request when the time between when the data load (512) is completed and when the data is requested (518) is longer than a threshold duration. Alternatively or in addition, the memory extension logic 142 and/or other logic may calculate and/or determine the amount of time for acknowledging the request using any one or more approaches known now or later discovered for determining a value in response to feedback. For example, the memory extension logic 142 and/or other logic may utilize one or more control systems and/or process control approaches, such as a Proportional Integral Derivative (PID) logic and/or circuit, to determine the amount of time for acknowledging the request. Alternatively, or in addition, the memory extension logic 142 and/or other logic may calculate and/or determine the amount of time for acknowledging the request using one or more machine learning logics and/or approaches. Other examples may include one or more of and/or combination(s) of the same and/or different measurements, calculations, and/or determinations, as these are merely exemplary of presently-preferred embodiments for adjusting the amount of time for acknowledging the request.
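The following sketch illustrates one way a PID loop, as mentioned above, could adjust the acknowledgment delay from measured timings. The gains, the use of "slack" (the time between the data load completing (512) and the data being requested (518)) as the error signal, and the class name AckDelayPID are illustrative assumptions rather than parameters from this disclosure.

```python
# A sketch of adjusting the acknowledgment delay (514) with a PID loop driven by
# measured slack; positive slack means the data sat ready before being requested,
# negative slack means the processor stalled waiting for the response (524).
class AckDelayPID:
    def __init__(self, kp=0.5, ki=0.05, kd=0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0
        self.delay_ns = 0.0   # current delay before acknowledging the load request

    def update(self, slack_ns):
        # slack_ns = t(data requested (518)) - t(data load completed (512)).
        # Positive slack -> acknowledge sooner; negative slack -> acknowledge later.
        error = -slack_ns
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        self.delay_ns += self.kp * error + self.ki * self.integral + self.kd * derivative
        self.delay_ns = max(0.0, self.delay_ns)
        return self.delay_ns

if __name__ == "__main__":
    pid = AckDelayPID()
    for slack in (-800.0, -300.0, 100.0, 20.0):   # per-fault measurements, in ns
        print(round(pid.update(slack), 1))
```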
[00125] The one or more measurements, calculations, determinations, logics, circuits, and/or approaches used to determine the amount of time for acknowledging the request may be selected by a user and/or an administrator, may be configured, and/or may be selected based on any one or more policies, passed functions, steps, and/or rules that the memory extension logic 142 and/or other logic follows to determine which one or more measurements, calculations, determinations, logics, circuits, and/or approaches to use and/or which parameter(s) and/or configuration(s) to use with them. The one or more policies, passed functions, steps, and/or rules may use any available information, such as any one or more of the characteristics and/or configurations of the client(s) 100, the memory extension module(s) 140, the memory appliance(s), and/or the management server(s), to select the one or more measurements, calculations, determinations, logics, circuits, and/or approaches. For example, a policy and/or passed function may specify to use a PID control strategy unless the determined amount of time exceeds a threshold, and/or to use a fixed value when exceeding the threshold.
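A minimal sketch of the example policy just described, assuming an illustrative threshold and fixed fallback value; the function name and numbers are hypothetical.

```python
# Policy sketch: keep the PID-derived delay while it stays below a threshold,
# otherwise fall back to a fixed delay value.
def select_ack_delay(pid_delay_ns, threshold_ns=2000.0, fixed_ns=500.0):
    """Return the acknowledgment delay chosen by the policy."""
    return pid_delay_ns if pid_delay_ns <= threshold_ns else fixed_ns

if __name__ == "__main__":
    print(select_ack_delay(1200.0))   # below threshold -> keep the PID value
    print(select_ack_delay(5000.0))   # above threshold -> fixed fallback
```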
[00126] In some examples, the client logic 112, the application logic 114, and/or another logic may indicate to the memory extension logic 142 portions of memory 110, 144 that may be likely and/or unlikely to be accessed in the future. For example, in example systems where the client logic 112 and/or one or more other logics are not operating with the client 100 and/or are not in communication with the memory extension logic 142, the memory extension logic 142 may not have information usable to determine that one or more portions of memory 110, 144 are unlikely to be accessed soon relative to other portion(s). For example, if the client logic 112, the application logic, and/or another logic determines that data included in the memory 110 is unlikely to be accessed soon relative to other data and/or migrates that data to the memory 144 of the memory extension module 140, such as via NUMA page migration, the memory extension logic 142 may
observe the data being written to the memory 144 of the memory extension module and may treat the data as being recently accessed. Alternatively or in addition, the client logic 112, the application logic 114, and/or other logic may indicate that the data written to the memory 144 of the memory extension module 140 is unlikely to be accessed soon. Indicating in this way that the data written to the memory 144 of the memory extension module 140 is unlikely to be accessed soon may be advantageous in that the memory extension logic 142 and/or another logic may make more accurate predictions of which portions are likely/unlikely to be accessed in the near future and/or may be less likely to reclaim portions of active memory 110, 144 and/or to need to load data into memory (512), (522) that had recently been removed from memory by a reclaim operation.
[00127] FIG. 6 illustrates an example flowchart for portion migration with cold portion notification. The client logic 112 and/or another logic, such as an operating system, may perform the operations shown in FIG. 6 and/or other figures herein. The client logic 112 and/or another logic may receive a portion migration request (602). The portion migration request (602) may be any one or more mechanisms that may trigger the operations described for FIG. 6. Examples of possible requests include: a message, a programmatic signal, an electrical signal, an optical signal, a wireless signal, a method invocation, a function call, a TCP and/or UDP message, an HTTP request, etc. The request (602) may be received from another component and/or logic, such as the application logic 114, and/or the request (602) may be received from the client logic 112, such as by the client logic 112 making a determination to perform portion migration. The client logic 112 and/or any other one or more components and/or logics may make the determination independently and/or in coordination with each other and/or with any other one or more components and/or logics based on any one or more conditions, parameters, configurations, and/or other properties of the client 100 and/or of the memory extension module 140 and/or for any other reason. For example, the client logic 112, the application logic 114, and/or another logic may determine that one or more portions of the memory 110 of the client 100 are unlikely to be accessed in the near future and/or may determine that data of the portion(s) should be migrated to one or more portion(s) of the memory 144 and/or backing store 146 of the memory extension module 140.
[00128] In response to the request and/or determination (602), the client logic 112 and/or another logic may migrate one or more portions of memory (604). Migrating one or more portions of memory (604) may include copying data contained in the portions from one part of memory 110, 144 to one or more second portions in another part of memory 110, 144 and/or copying/transferring/re-associating associated metadata and/or portion-tracking data structures from the portion(s) to the second portion(s). Associated metadata may include one or more flags, lock primitives, references, indices, stored values, counters, usage counters, lists, and/or collections related to the portion(s) of memory. In some examples, the associated metadata may
be and/or may include the portion tracking data structures. Alternatively or in addition, the portion tracking data structures may be and/or may include the associated metadata.
[00129] Alternatively or in addition, migrating the one or more portions of memory (604) may include updating one or more data structures, collections, and/or mapping data structures to associate the one or more second portions and/or other portions with one or more portions of address space previously associated with the one or more portions of memory. For example, a radix tree, an extensible array, and/or a page table may be updated to reference the second portion(s) instead of the one or more portions being migrated (604).
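For illustration, the following sketch performs a portion migration (604) against a dict standing in for the mapping data structure (e.g., a page table or radix tree) and other dicts standing in for memory and metadata; the function name migrate_portion and the key formats are hypothetical.

```python
# A compact portion-migration sketch: move the data and its metadata to the
# destination portion and remap the address-space portion to reference it (604).
def migrate_portion(mapping, memory, metadata, addr_portion, dest_portion):
    src_portion = mapping[addr_portion]
    memory[dest_portion] = memory.pop(src_portion)        # copy/move the data
    metadata[dest_portion] = metadata.pop(src_portion)    # re-associate metadata
    mapping[addr_portion] = dest_portion                  # update the mapping
    return src_portion                                    # the now-freed source portion

if __name__ == "__main__":
    mapping = {"va:0x1000": "mem110:42"}
    memory = {"mem110:42": b"payload"}
    metadata = {"mem110:42": {"dirty": False, "usage": 3}}
    freed = migrate_portion(mapping, memory, metadata,
                            addr_portion="va:0x1000", dest_portion="mem144:7")
    print(mapping, freed)   # {'va:0x1000': 'mem144:7'} mem110:42
```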
[00130] Upon migrating the one or more portions of memory (604), the client logic 112 and/or another logic may perform one or more cold portion notifications (606). Performing the cold portion notification(s) may include notifying a receiving logic, such as the memory extension logic 142, the application logic 114, the client logic 112, and/or another logic of one or more portions of memory that may be unlikely to be accessed in the near future. For a portion to be unlikely to be accessed in the near future may mean that the portion may be less likely to be accessed than one or more other portions. Alternatively or in addition, for a portion to be unlikely to be accessed in the near future may mean that the portion may be less likely to be accessed than a threshold probability, such as 50% and/or any other probability value. The threshold probability may be fixed, configurable, and/or calculated. The calculation may be performed using any information available to the logic performing the calculation, such as portion access information, the portion tracking data structures, the reclaim candidates, an available memory determination, and/or any other information, such as any information described herein.
[00131] Upon receipt of the cold portion notification(s), the receiving logic may update one or more data structures, flags, and/or stored values in such a way as to increase and/or decrease the probability that the one or more portions of memory indicated by the cold portion notification(s) would be selected for reclaim and/or would be included as reclaim candidates. For example, the receiving logic may update one or more stored timestamps, generation counters, flags, and/or other data and/or metadata in such a way as to increase the probability that the portion(s) will be selected for reclaim, such as by subtracting an offset from the timestamp(s) and/or generation counter(s) and/or by setting and/or clearing one or more flags. In another example, the receiving logic may update one or more collections, such as by removing the portion(s) from the collection of active portions and adding the portion(s) to the collection of inactive portions, as described elsewhere in this disclosure.
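A small sketch of one way a receiving logic could act on such a notification, assuming a timestamp-aging offset and active/inactive sets; the offset value and the names are illustrative only.

```python
# Cold-portion notification handler sketch: age the portion's timestamp and demote
# it to the inactive collection so it is more likely to become a reclaim candidate.
import time

COLD_AGING_OFFSET_S = 300.0   # assumed: treat notified portions as 5 minutes older

def handle_cold_notification(portion, last_access, active, inactive):
    # Subtract an offset from the stored timestamp so the portion looks less
    # recently used to the reclaim logic.
    last_access[portion] = last_access.get(portion, time.monotonic()) - COLD_AGING_OFFSET_S
    # Move the portion from the active collection to the inactive collection.
    if portion in active:
        active.discard(portion)
        inactive.add(portion)

if __name__ == "__main__":
    last_access = {"p9": time.monotonic()}
    active, inactive = {"p9"}, set()
    handle_cold_notification("p9", last_access, active, inactive)
    print(sorted(inactive), "p9" not in active)   # ['p9'] True
```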
[00132] Upon completing the one or more cold portion notifications (606), the client logic 112 and/or another logic may complete (608) handling the portion migration request and/or determination, such as by responding to the request and/or determination with or without a status indication.
[00133] FIG. 7 illustrates a memory architecture diagram of an example system providing multiple tiers of memory. The system may provide one or more tiers of memory. A tier may be a collection of memory with a common set of attributes, such as price, capacity, latency, bandwidth, operations per second, physical locality, network locality, logical locality, and/or any other attributes of the memory and/or of the device containing the memory. The attributes of a tier involving memory of a memory appliance may include any of the characteristics and/or configurations of the memory appliance. Alternatively or in addition, the attributes of a tier involving one or more memory extension modules 140 may include any characteristics and/or configurations of the memory extension module(s) 140, any of the components (e.g. 142, 144, 146, and/or 148) included in and/or associated with the memory extension module(s), and/or any interconnect(s) (e.g. 170 and/or 180) used to access the memory extension module(s) 140.
[00134] The attributes of one tier may differ from those of another tier. In one example, price and performance may decrease for lower tiers while capacity increases. This may enable the system to naturally demote data from higher levels to lower levels as other data proves to be used more often and/or more recently. In other examples, any one or more of price, performance, distance, latency, determinacy, size, hotness of data, predictive probability, latency sources, latency categories, latency modalities, power usage, and/or any other characteristics relevant to performance, cost, and/or suitability for one or more purposes known now and/or later discovered may be the same, similar, and/or different between different tiers.
[00135] In at least one example, the highest-level tiers may be provided by the hardware of the client 100. For example, level 1 may be provided by the L1 cache of the processor of the client 100, level 2 may be provided by the L2 cache of the processor of the client 100, level 3 may be provided by the L3 cache of the processor 130 of the client 100, level 4 may be provided by the memory 110 of the client 100, and/or another level may be provided by the backing store 160 of the client 100.
[00136] In at least one example, one or more tiers may be provided by one or more memory appliances. For example: level 5 may be provided by one or more memory appliances with very low latency and/or high bandwidth; level 6 may be provided by one or more memory appliances with higher latency, lower bandwidth, and/or higher capacity; level 7 may be provided by the backing store of one or more memory appliances and/or of the client.
[00137] Alternatively or in addition, one or more tiers may be provided by one or more memory extension modules 140. For example: level 5 may be provided by one or more memory extension modules with high-bandwidth and/or low-latency memory, a large amount of memory relative to backing store, and/or in communication via a high performance interconnect; level 6 may be provided by one or more memory extension modules with lower-bandwidth and/or higher- latency memory, a smaller amount of memory relative to backing store, and/or in communication via a lower performance interconnect; level 7 may be provided by the backing store(s) of one or
more memory extension modules and/or of the client 100. In some examples, one or more memory extension modules may utilize the communication interface(s) 148 instead of and/or in addition to the backing store 146. In examples where utilizing the communication interface(s) 148 provides different performance characteristics than using the backing store, the memory extension modules utilizing the communication interface(s) may be associated with a different tier than the memory extension modules utilizing the backing store. For example, memory extension modules utilizing the communication interface(s) to access one or more memory appliances may be in a lower numbered tier than memory extension modules utilizing flash memory as a backing store if the memory appliance(s) can be accessed with better performance than the flash memory.
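The following sketch models tiers as records of illustrative attributes and selects a tier by latency budget; the specific levels, latency, and capacity figures are assumed examples, not values from this disclosure.

```python
# Illustrative tier model: lower-numbered tiers have lower latency and smaller
# capacity; higher-numbered tiers trade latency for capacity.
from dataclasses import dataclass

@dataclass
class MemoryTier:
    level: int
    name: str
    latency_ns: float
    capacity_gib: float

TIERS = [
    MemoryTier(1, "L1 cache", 1, 0.000064),
    MemoryTier(4, "client memory 110", 100, 256),
    MemoryTier(5, "extension module memory 144 (fast interconnect)", 400, 1024),
    MemoryTier(6, "extension module memory 144 (slower interconnect)", 2000, 4096),
    MemoryTier(7, "backing store 146 / memory appliance", 50000, 16384),
]

def tier_for_latency(latency_ns):
    """Pick the lowest-numbered tier whose latency budget covers the value."""
    for tier in TIERS:
        if latency_ns <= tier.latency_ns:
            return tier
    return TIERS[-1]

if __name__ == "__main__":
    print(tier_for_latency(250).name)   # extension module memory 144 (fast interconnect)
```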
[00138] A logic, such as the client logic 112 and/or the memory extension logic 142, may cause data for one or more portions of memory to be migrated to lower-numbered tiers by causing the data of the portions to be faulted-in at the desired level. In one example, the client logic 112 may attempt to read the data, causing the data to be loaded into the memory 110 of the client 100, into the memory 144 of the memory extension module(s) 140, and/or into one or more levels of processor cache of the client. Alternatively, or in addition, the client logic 112 may pre-fetch the data, such as by issuing a pre-fetch request with an operating system of the client 100. The pre-fetch request may be a memory advisory request, indicating that the client logic 112 and/or another logic will need the data. In another example, the client logic 112 may send a pre-fetch request to the memory extension logic 142 and/or another logic. The pre-fetch request may cause the data to be loaded into the memory 144 of the memory extension module(s) 140. Alternatively or in addition, such as described in FIG. 10A of US Patent App Publication 2016/0077761, entitled “PAGING OF EXTERNAL MEMORY”, which is hereby incorporated by reference, the pre-fetch request may cause the data to be loaded into the memory of the memory appliance. In another example, the client logic 112 may send a pin request to the memory extension logic 142 and/or another logic. The pin request may cause the data to be loaded into the memory 144 of the memory extension module 140. Alternatively or in addition, such as described for FIG. 10B of “PAGING OF EXTERNAL MEMORY”, the pin request may cause the data to be loaded into the memory of the memory appliance.
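As a hedged example of issuing a memory advisory pre-fetch request through the operating system, the following sketch uses Python's mmap.madvise with MADV_WILLNEED (available on Linux with Python 3.8 or later); whether such a hint reaches the memory extension module depends on how the region is exposed, which is assumed here rather than taken from this disclosure.

```python
# Memory advisory ("will need") pre-fetch sketch for a mapped region.
import mmap

def prefetch(region: mmap.mmap, offset: int, length: int) -> None:
    """Advise the OS that the given page-aligned range will be needed soon."""
    if hasattr(mmap, "MADV_WILLNEED"):
        region.madvise(mmap.MADV_WILLNEED, offset, length)

if __name__ == "__main__":
    region = mmap.mmap(-1, 16 * 4096)   # anonymous mapping standing in for the region
    prefetch(region, 0, 4 * 4096)       # hint: the first four pages will be accessed
    region.close()
```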
[00139] Alternatively or in addition, a logic, such as the client logic 112 and/or the memory extension logic 142 may cause the data for one or more portions of memory to be migrated away from lower-numbered tiers by unpinning the corresponding portions of memory and/or by causing the portions to be invalidated and/or reclaimed at the desired level. Causing the portions to be invalidated and/or reclaimed at the desired level may be as described elsewhere in this document. Alternatively, or in addition, the client logic 112 may send an unpin request to the memory extension logic 142, and/or another logic. The unpin request may cause the data to be unpinned from the memory 144 of the memory extension module 140. Alternatively or in addition, such as described for FIG. 10C of “PAGING OF EXTERNAL MEMORY”, the unpin request may cause the
data to be unpinned from the memory of the memory appliance. Alternatively, or in addition, the client logic 112 may send a reclaim request to the memory extension logic 142 and/or another logic. The reclaim request may cause the portions to be invalidated and/or reclaimed from the memory 144 of the memory extension module 140. Alternatively or in addition, such as described for FIG. 10D of “PAGING OF EXTERNAL MEMORY”, the reclaim request may cause the portions to be invalidated and/or reclaimed from the memory of the memory appliance.
[00140] Alternatively or in addition, a logic, such as the client logic 112 and/or the memory extension logic 142 may cause the data for one or more portions of memory to be migrated to lower-numbered tiers by performing a portion migration, such as described elsewhere in this disclosure, via NUMA page migration, and/or as described for FIG. 6. In other examples, a logic, such as the client logic 112 and/or the memory extension logic 142 may cause the data for one or more portions of memory to be migrated away from lower-numbered tiers by performing a portion migration.
[00141] Alternatively, or in addition, the operating system of the client, of the memory appliance, of the memory extension module 140, and/or of any other component may cause the data for one or more portions of the region 214 to be migrated away from lower-numbered tiers by causing the portions to be invalidated and/or reclaimed.
[00142] In computing systems, such as the example memory extension system(s) and/or external memory system(s) as described herein, latency and/or temporal latency may come from several sources and/or may take several forms. Sources of latency may be categorized in a number of ways to aid in explanation and/or understanding and/or to help visualize the dominant latency cost in a given system.
[00143] FIG. 8 illustrates an example categorization of latency sources for computing systems. The categorization may include one or more categories, such as physics, digitization, computation, multitasking, virtual memory, networking, and/or storage effects. Other categorizations are possible and/or may include more, fewer, and/or different categories, the categories may be defined based upon the same and/or different aspects of the latency sources, and/or the categories may include more, fewer and/or different sources of latency than illustrated and/or described in this example. Furthermore, the latency values illustrated in FIG. 8 are only examples to aid in understanding relative differences in magnitude that may or may not exist between latency categories. Other latency values are possible for the same and/or different latency source categories. Alternatively or in addition, other arrangements of the categories and/or of other categories are possible.
[00144] Latency sources of the physics category may be sources of latency related to properties, laws, theorems, axioms, and/or other aspects of physics, physical processes, physical properties, physical behaviors, and/or physical phenomena including, but not limited to: electrical effects, magnetic effects, optical effects, quantum effects, effects caused by certain materials, the
speed of light, the speed of electromagnetic wave propagation, signal propagation, signal detection, electrical/optical/other interference, resistance, capacitance, inductance, impedance, permittivity, and/or any other sources of latency related to physics known now or later discovered. Some latency sources of this and/or other categories may vary based on distance, scale, and/or other properties. In one example, which may be called distal latency, signal propagation latency may be proportional to the distance the signal needs to propagate. In another example, which may be called material latency, signal propagation latency may vary based on materials used/involved, such as conductor material, fiber optic material, atmospheric composition/density, vacuum, etc. Latency sources of the physics category may be more prevalent at very small scales (e.g. quantum effects, capacitance, etc.) and/or at very large scales (e.g. signal propagation in very long wires/cables, interference, resistance, etc.). Typical latency values due to sources of the physics category may be approximately 10⁻¹ ns or less, particularly for very small scales, but latency values may vary widely and/or may be outside of this range. For example, at very large scales, signal propagation delays can take several seconds or more, such as when sending electromagnetic signals to/from spacecraft, other planets, etc.
[00145] Latency sources of the digitization category may be sources of latency related to digital logic, digital signal transmission, analog-to-digital conversion, digital-to-analog conversion, modulation, demodulation, digital amplification/retransmissions, clock signals, clock domains, synchronization, serialization, deserialization, and/or any other sources related to digital signals, digital values, digital logic, and/or digital communication known now or later discovered. In one example, which may be called logical latency, digitization latency may vary based on the complexity and/or number of logic devices that a signal and/or information may traverse, such as the type of and/or number of logic gates in a logic path and/or a number of pipeline stages involved. Typical latency values due to sources of the digitization category may range between 10⁻¹ and 10⁰ ns, though other latency values are also possible. For example, latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s).
[00146] Latency sources of the computation category may be sources of latency related to computation and/or other digital logic that may perform meaningful calculations, determinations, comparisons, and/or decisions, including, but not limited to instruction execution, number of instructions, instruction complexity, data dependencies, pipeline stalls, branch prediction/misprediction, pipeline flushing, cache access stalls, memory access stalls, inter-processor communication, cache coherency, and/or any other sources related to calculations, determinations, comparisons, decisions, and/or any other higher-level logic known now or later discovered. Typical latency values due to sources of the computation category may range between 10⁰ and 10¹ ns, though other latency values are also possible. For example, latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s).
[00147] Latency sources of the multitasking category may be sources of latency related to multiple logics operating with a processor and/or other computing device, including, but not limited to context switches, interrupt latency, synchronization, locking, data dependencies, lock contention, and/or any other sources related to multitasking known now or later discovered. Typical latency values due to sources of the multitasking category may range between 10¹ and 10² ns, though other latency values are also possible. For example, latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s). Alternatively or in addition, such as in cases of extreme lock contention and/or other effects, latency values of this category may vary widely.
[00148] Latency sources of the virtual memory category may be sources of latency related to the use, mapping, translation, and/or maintenance of virtual addresses, virtual address space(s), page tables, page table entries, translation lookaside buffer(s) (TLB), and/or any other aspects of virtual memory known now or later discovered, including, but not limited to address translation, TLB misses, page faults, page table manipulation, page table creation, minor page faults, major page faults, copy-on-write page faults, zeroing page faults, page reclaim activity, background reclaim, direct reclaim, and/or any other sources related to virtual memory known now or later discovered. Examples of page faults and/or major page faults may include page faults that decompress and/or decrypt data (such as compressed and/or encrypted data held in memory), page faults that read from remote memory, page faults that read from memory that is slower than main memory (such as flash memory), page faults that read from disk, page faults that read from remote disk, page faults that generate content dynamically and/or computationally, any other type of page fault known now or later discovered, and/or combinations of these and/or other types of page faults. Typical latency values due to sources of the virtual memory category may range between 10² and 10³ ns, though other latency values are also possible. For example, latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s). Alternatively or in addition, such as in cases of extreme memory contention, heavy reclaim activity, thrashing, and/or other effects, latency values of this category may vary widely, particularly in examples where data is being transferred over one or more networks and/or between one or more mechanical storage devices.
[00149] Latency sources of the networking category may be sources of latency related to communication, networks, interconnects, communication interfaces, cables, ports, connectors, antennas, signal splitters, signal combiners, hubs, switches, routers, and/or any other device, component, and/or aspect of communication, networking, and/or aggregation of digital systems and/or components, including, but not limited to, packetizing, reassembly, buffering, queuing, switching, routing, flow control, congestion, error detection, error correction, data corruption, packet loss, and/or any other sources related to networking and/or communication known now or later discovered. Typical latency values due to sources of the networking category may range
between 10³ and 10⁵ ns, though other latency values are also possible. For example, latency values for these and/or other sources of latency may be proportional to the clock period(s) of the computing system(s). Alternatively or in addition, such as in cases of extreme congestion, packet loss, and/or other effects, latency values of this category may vary widely.
[00150] Latency sources of the storage category may be sources of latency related to data storage and/or retrieval, including, but not limited to, memory reading, memory writing, memory erasing, disk head positioning, disk rotation, storage allocation, fragmentation, defragmentation, tape feeding, media retrieval, and/or any other electrical, mechanical, and/or other effects related to storage and/or retrieval of data known now or later discovered. Typical latency values due to sources of the storage category may be greater than 10⁴ ns, though other latency values are also possible. For example, latency values for these and/or other sources of latency may vary based on the storage medium and/or storage device being used. Alternatively or in addition, such as in cases of heavy storage use, thrashing, and/or other effects, latency values of this category may vary widely.
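A toy classifier reflecting the order-of-magnitude ranges described above; the numeric boundaries simply restate the illustrative values from this description and are not prescriptive.

```python
# Map a latency value (in nanoseconds) to the category or categories whose
# typical range contains it; ranges intentionally overlap, as described above.
LATENCY_CATEGORIES = [
    ("physics",        0.0,      0.1),            # ~10^-1 ns or less
    ("digitization",   0.1,      1.0),            # ~10^-1 to 10^0 ns
    ("computation",    1.0,      10.0),           # ~10^0 to 10^1 ns
    ("multitasking",   10.0,     100.0),          # ~10^1 to 10^2 ns
    ("virtual memory", 100.0,    1_000.0),        # ~10^2 to 10^3 ns
    ("networking",     1_000.0,  100_000.0),      # ~10^3 to 10^5 ns
    ("storage",        10_000.0, float("inf")),   # ~10^4 ns and up
]

def dominant_categories(latency_ns):
    return [name for name, lo, hi in LATENCY_CATEGORIES if lo <= latency_ns < hi]

if __name__ == "__main__":
    print(dominant_categories(50_000.0))   # ['networking', 'storage']
```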
[00151] The systems, methods, devices, techniques, and/or approaches described herein may improve the effectiveness of computing systems, such as in cases where experienced latency values are outside typical ranges of a given latency category. For example, in systems where memory access latency is expected to be in and/or near a range typical of the computation latency category, such as ~10⁰ ns to ~10¹ ns, if the memory access latency actually experienced is much higher, such as above ~10³ ns due to network latency related to retrieving data over the communication interface 148 and/or above ~10⁵ ns due to storage latency related to retrieving data from the backing store 146, then the computing system may not perform as expected. For example, while waiting for memory access, the processor 130 may stall one or more pipelines that may be experiencing data dependencies related to the memory access. While the computing system may be designed to accommodate pipeline stalls typical of memory accesses related to computation latency, such as by incorporating deep pipelines and/or aggressive branch prediction, it may be impractical to accommodate memory access latencies that may be orders of magnitude longer in duration. As a result, all or a portion of the processor may stay idle for long periods of time, reducing its effectiveness. However, if the computing system incorporates the systems, methods, techniques, and/or approaches described herein, it may avoid pipeline stalls that may exceed the system’s design parameters and/or that may reduce its effectiveness.
[00152] The client 100, and/or the memory extension module(s) 140 may be configured in any number of ways. In one example, the memory extension module 140 may be included in a computer. For example, the processor may be the CPU (and/or any other computing device and/or logic, such as a GPU, an APU, an FPGA, an ASIC, a system on chip (SoC), etc.) of the computer, the memory may be the memory of the computer, and the computer may include the interconnects
170, 180. Alternatively or in addition, the memory extension module 140 may be a peripheral of a computer, including but not limited to a PCI device, a PCI-X device, a PCIe device, an HTX (HyperTransport expansion) device, a CXL device, or any other type of peripheral, internally or externally connected to a computer.
[00153] In a second example, the memory extension module 140 may be added to a computer or another type of computing device that accesses data in the memory extension module 140. For example, the extension module 140 may be a device installed in a computer, where the client 100 is a process executed by a CPU of the computer. The memory in the memory extension module 140 may be different than the memory accessed by the CPU of the computer. The processor in the memory extension module 140, if present, may be different than the CPU of the computer.
[00154] In a third example, the client 100 and/or the memory extension module 140, may be implemented using a Non-Uniform Memory Architecture (NUMA). In NUMA, the processor may comprise multiple processor cores connected together via a switched fabric of point-to-point links. The memory controller may include multiple memory controllers. Each one of the memory controllers may be electrically coupled to a corresponding one or more of the processor cores. Alternatively, multiple memory controllers may be electrically coupled to each of the processor cores. Each one of the multiple memory controllers may service a different portion of the memory than the other memory controllers.
[00155] In a fourth example, the processor 130 of the client 100 and/or the memory extension module 140, if a processor is present in the memory extension module 140, may include multiple processors that are electrically coupled to the interconnect(s) 170, 180, such as with a bus. Other components of the client 100 and/or the memory extension module 140, such as multiple memories included in the memory, the communication interface, the memory controller, and/or the storage controller may also be electrically coupled to the interconnect.
[00156] In a fifth example, the memory extension system may include multiple clients 100, multiple memory extension modules 140, multiple memories 110, 144, multiple backing stores 160, 146, multiple memory extension logics 142, multiple client logics 112, and/or multiple application logics 114.
[00157] In a sixth example, the client 100 may provide additional services to other systems and/or devices. For example, the client 100 may include a Network Attached Storage (NAS) appliance. Alternatively or in addition, the client 100 may include a Redundant Array of Independent Disks (RAID) head. Alternatively or in addition, the client 100 and/or the memory extension module 140 may include a resiliency logic, such as described in US provisional application 62/540,259, filed August 2, 2017, and U.S. Non-provisional patent application Ser. No.
16/050,974, filed July 31, 2018, each of which is hereby incorporated by reference. For example, the client 100 may include and/or may be in communication with multiple memory
extension modules 140. In such examples, the memory extension logic 142, the resiliency logic, the client logic 112, and/or another logic may perform data resiliency operations, such as described therein. In some examples, the resiliency logic may be included in and/or combined with one or more other logics, such as the client logic 112, the application logic 114, and/or the memory extension logic 142. Alternatively or in addition, the client 100 may provide file-level access to data stored in the memory extension module 140. Alternatively, or in addition, the client 100 may include a database, such as an in-memory database. Alternatively or in addition, the client 100 may operate as a memory appliance for one or more other clients.
[00158] In a seventh example, multiple clients 100 may utilize one or more memory extension modules 140 as shared memory. For example, the clients 100 may include or interoperate with an application logic 114 that relies on massive parallelization and/or sharing of large data sets. Examples of application logic that may use massive parallelization include logic that performs protein folding, genetic algorithms, seismic analysis, or any other computationally intensive algorithm and/or iterative calculations where each result is based on a prior result. The application logic 114 may store application data, application state, and/or checkpoint data in the regions of the one or more memory extension modules 140 and/or in one or more memory appliances. The additional capabilities of the one or more memory extension modules and/or memory appliances, such as low latency access and persistence to the backing store, may be exploited by the clients in order to protect against application crashes, a loss of power to the clients 100, or any other erroneous or unexpected event on any of clients 100. The clients 100 may access the one or more memory extension modules 140 and/or memory appliance(s) in a way that provides for atomic access. For example, the memory access operations requested by the clients may include atomic operations, including but not limited to a fetch and add operation, a compare and swap operation, or any other atomic operation now known or later discovered. An atomic operation may be a combination of operations that execute as a group or that do not execute at all. The result of performing the combination of operations may be as if no operations other than the combination of operations executed between the first and last operations of the combination of operations. Thus, the clients may safely access the one or more memory extension modules 140 and/or memory appliances without causing data corruption.
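To illustrate the semantics of the atomic operations mentioned above (fetch-and-add and compare-and-swap), the following sketch emulates them with a lock within a single process; on an actual memory extension module or memory appliance, the interconnect and/or device logic would provide the atomicity. The class name AtomicCell is hypothetical.

```python
# Emulated atomic operations: each method executes as a group or not at all with
# respect to other threads using the same cell, mirroring the atomicity described.
import threading

class AtomicCell:
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_add(self, delta):
        """Atomically add delta and return the previous value."""
        with self._lock:
            old = self._value
            self._value = old + delta
            return old

    def compare_and_swap(self, expected, new):
        """Atomically set the value to new only if it currently equals expected."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

if __name__ == "__main__":
    cell = AtomicCell(10)
    print(cell.fetch_and_add(5))            # 10
    print(cell.compare_and_swap(15, 42))    # True; the value is now 42
```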
[00159] The application logic 114, the client logic 112, the memory extension logic 142, and/or any other one or more logics described herein may be co-located, separated, or combined. The combined logic may perform the same or similar features as the aggregate of the features performed by the logics that are combined. In a first example, all logics may be co-located in a single device. In a second example, the memory extension logic 142 may be split into multiple logics. In a third example, the client logic 112 and the application logic 114 may be combined into a single logic. In a fourth example, the client logic 112 and the application logic 114 may be partially combined. For example, some of the functionality of each logic may be
separated from the logic described herein as implementing the functionality and/or may be combined into a new logic. Other combinations of the various components are possible, just a few of which are described here.
[00160] The application logic 114, the client logic 112, the memory extension logic 142, and/or any other logic described herein may include computer code. The computer code may include instructions executable with the processor 130 and/or any other processor, such as the processor of the memory extension logic 142, if included in the memory extension logic. The computer code may be written in any computer language now known or later discovered, such as assembly language, C, C++, C#, Java, JavaScript, Python, Go, R, Swift, PHP, Dart, Kotlin, MATLAB, Perl, Ruby, Rust, and/or Scala, or any combination thereof. In one example, the computer code may be firmware. Alternatively or in addition, all or a portion of the application logic 114, the client logic 112, the memory extension logic 142, any other logic(s) described herein, and/or the processor may be implemented as a circuit. For example, the circuit may include an FPGA (Field Programmable Gate Array) configured to perform the features of the application logic 114, the client logic 112, the memory extension logic 142, and/or the other logic(s). Alternatively, or in addition, the circuit may include an ASIC (Application Specific Integrated Circuit) configured to perform the features of the application logic 114, the client logic 112, the memory extension logic 142, any other logic(s) described herein, and/or the processor 130. The circuit may be embedded in a chipset, a processor, and/or any other hardware device.
[00161] Alternatively, or in addition, a portion of the application logic 114, the client logic 112, the memory extension logic 142, and/or the processor 130 may be implemented as part of the one or more communication interfaces or other hardware components. For example, the memory extension logic 142 may be partially or wholly incorporated in and/or executed by a communication interface that is in communication with the processor 130 via the interconnect(s) 170, 180. In examples where the memory extension logic 142 includes hardware logic, the hardware logic may be physically combined with the communication interface and/or other hardware component(s). In examples where the memory extension logic includes computer executable code, the computer executable code may be executed by a processor included in the communication interface and/or other hardware component(s). Examples of suitable communication interfaces may include PCIe interfaces, InfiniBand interfaces, Gen-Z interfaces, Hypertransport interfaces, QPI interfaces, UPI interfaces, and/or CXL interfaces. In another example, the client logic 112 may be partially or wholly incorporated in and/or executed by a memory controller, such as the memory controller 120 of the client 100, and/or another memory access device, such as a PCIe controller/device and/or a CXL controller/device.
[00162] The system may be implemented in many different ways. Each module or unit, such as the client logic unit, the application logic unit, the memory extension logic unit, the configuration unit, and/or any other logic described herein may be hardware or a combination of
hardware and software. For example, each module may include an application specific integrated circuit (ASIC), a Field Programmable Gate Array (FPGA), a circuit, a digital logic circuit, an analog circuit, a combination of discrete circuits, gates, or any other type of hardware or combination thereof. Alternatively or in addition, each module may include memory hardware, such as a portion of the memory 110, for example, that comprises instructions executable with the processor 130 or other processor to implement one or more of the features of the module. When any one of the modules includes the portion of the memory that comprises instructions executable with the processor, the module may or may not include the processor. In some examples, each module may just be the portion of the memory 110, 144 or other physical memory that comprises instructions executable with the processor 130 or other processor to implement the features of the corresponding module without the module including any other hardware. Because each module includes at least some hardware even when the included hardware comprises software, each module may be interchangeably referred to as a hardware module.
[00163] All of the discussion, regardless of the particular implementation described, is exemplary in nature, rather than limiting. For example, although selected aspects, features, or components of the implementations are depicted as being stored in memories, all or part of systems and methods consistent with the innovations may be stored on, distributed across, or read from other computer-readable storage media, for example, secondary storage devices such as hard disks, floppy disks, and CD-ROMs; or other forms of ROM or RAM either currently known or later developed. The computer-readable storage media may be non-transitory computer-readable media, which includes CD-ROMs, volatile or non-volatile memory such as ROM and RAM, or any other suitable storage device.
[00164] Furthermore, although specific components of innovations were described, methods, systems, and articles of manufacture consistent with the innovation may include additional or different components. For example, a processor may be implemented as a microprocessor, microcontroller, application specific integrated circuit (ASIC), discrete logic, or a combination of other type of circuits or logic. Similarly, memories may be DRAM, SRAM, SDRAM, ADRAM, DDR RAM, FPM, EDO DRAM, RDRAM, CDRAM, Flash, and/or any other type of memory, such as those described and/or listed herein. Flags, data, databases, tables, entities, and other data structures may be separately stored and managed, may be incorporated into a single memory or database, may be distributed, or may be logically and physically organized in many different ways. The components may operate independently or be part of a same program. The components may be resident on separate hardware, such as separate removable circuit boards, or share common hardware, such as a same memory and processor for implementing instructions from the memory. Programs may be parts of a single program, separate programs, or distributed across several memories and processors.
[00165] In some examples, when it is said that an interface is performing some action, the action may be performed by the interface directly, and/or the action may be performed by a logic which provides and/or implements the interface, such as by an interface-implementing logic. The interface-implementing logic may be any logic which provides the interface and/or implements the logic that performs actions described as being performed by the interface. The interface-implementing logic may be and/or may include any of the logics described herein. For example, if the actions described as being performed by one or more data interfaces are performed by the client logic 112, then the interface-implementing logic for the corresponding data interface(s) may be and/or may include the client logic 112. In examples where the action is performed by the interface, the actions performed may be performed in response to an invocation of the interface. In examples where the action is performed by the interface-implementing logic, the actions performed may be performed in response to an invocation of the corresponding interface.
[00166] The respective logic, software or instructions for implementing the processes, methods and/or techniques discussed throughout this disclosure may be provided on computer- readable media or memories or other tangible media, such as a cache, buffer, RAM, removable media, hard drive, other computer readable storage media, or any other tangible media or any combination thereof. The tangible media include various types of volatile and nonvolatile storage media. The functions, acts or tasks illustrated in the figures or described herein may be executed in response to one or more sets of logic or instructions stored in or on computer readable media. The functions, acts or tasks are independent of the particular type of instructions set, storage media, processor or processing strategy and may be performed by software, hardware, integrated circuits, firmware, micro code, or any type of other processor, operating alone or in combination. Likewise, processing strategies may include multiprocessing, multitasking, parallel processing and/or any other processing strategy known now or later discovered. In one embodiment, the instructions are stored on a removable media device for reading by local or remote systems. In other embodiments, the logic or instructions are stored in a remote location for transfer through a computer network or over telephone lines. In yet other embodiments, the logic or instructions are stored within a given computer, CPU, GPU, or system.
[00167] The logic illustrated in the flow diagrams may include additional, different, or fewer operations than illustrated. The operations illustrated may be performed in an order different than illustrated.
[00168] A second action may be said to be "in response to" a first action independent of whether the second action results directly or indirectly from the first action. The second action may occur at a substantially later time than the first action and still be in response to the first action. Similarly, the second action may be said to be in response to the first action even if intervening actions take place between the first action and the second action, and even if one or more of the intervening actions directly cause the second action to be performed. For example, a second
action may be in response to a first action if the first action sets a flag and a third action later initiates the second action whenever the flag is set.
[00169] To clarify the use of and to hereby provide notice to the public, the phrases "at least one of <A>, <B>, ... and <N>" or "at least one of <A>, <B>, ... <N>, or combinations thereof" or "<A>, <B>, ... and/or <N>" are defined by the Applicant in the broadest sense, superseding any other implied definitions hereinbefore or hereinafter unless expressly asserted by the Applicant to the contrary, to mean one or more elements selected from the group comprising A, B, ... and N. In other words, the phrases mean any combination of one or more of the elements A, B, ... or N including any one element alone or the one element in combination with one or more of the other elements which may also include, in combination, additional elements not listed. Unless otherwise indicated or the context suggests otherwise, as used herein, "a" or "an" means "at least one" or "one or more."
[00170] While various embodiments have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible. Accordingly, the embodiments described herein are examples, not the only possible embodiments and implementations.
Claims
1. A memory extension device comprising:
a backing store including at least a first datum;
a memory including at least a second datum and one or more data structures, wherein the one or more data structures indicate that the first datum is not accessible and that the second datum is accessible;
an interconnect; and
a memory extension logic, wherein the memory extension logic is configured to respond via the interconnect to a memory access request by at least one of:
reading from the memory, if the memory access request is for the one or more data structures;
reading from the memory, if the memory access request is for data included in the memory; or
reading from the backing store, if the memory access request is for data included in the backing store.
2. The memory extension device of claim 1, wherein the interconnect comprises at least one of: a processor interconnect; a peripheral interconnect; an interface providing cache-coherent memory access; a CXL interconnect; or a network including at least a second memory extension device or a second client device.
3. The memory extension device of claim 1, wherein the second datum is removed from the memory in response to a memory initialization request and/or a memory discard request.
4. The memory extension device of claim 3, wherein the one or more data structures are modified to indicate that the second datum is not accessible in response to the memory discard request.
5. The memory extension device of claim 1, wherein the memory extension logic is further configured to respond to the memory access request with initialized data in response to a memory identifier specified by the memory access request being previously specified by a memory initialization request and/or a memory discard request.
6. The memory extension device of claim 5, wherein the initialized data are not included in the memory prior to receipt of the memory access request.
7. The memory extension device of claim 5, wherein the one or more data structures indicate that
the initialized data are accessible prior to receipt of the memory access request.
8. The memory extension device of claim 1, wherein the memory extension logic is further configured to store the first datum in the memory in response to the memory access request if the memory access request is for the first datum.
9. The memory extension device of claim 8, wherein the memory extension logic is further configured to select a destination in the memory for the first datum without invoking a processor of the memory extension device.
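One conceivable way to realize the destination selection of claims 8-9, shown only as a sketch with invented names, is a small free-slot list that the memory extension logic pops in the data path, so a datum fetched from the backing store is cached without waking any processor on the device.

```c
/* Hypothetical sketch of claims 8-9; not the claimed implementation.
 * A small free list lets the logic pick a destination slot for a datum
 * fetched from the backing store without invoking a device processor. */
#include <stdint.h>
#include <stdio.h>

#define LOCAL_SLOTS 8u

static uint32_t free_slots[LOCAL_SLOTS];
static uint32_t free_count;

static void free_list_init(void)
{
    for (uint32_t i = 0; i < LOCAL_SLOTS; i++)
        free_slots[i] = i;
    free_count = LOCAL_SLOTS;
}

/* Called from the data path when a backing-store page must be cached.
 * Returns a destination slot, or -1 if reclaim must run first. */
static int pick_destination(void)
{
    if (free_count == 0)
        return -1;                      /* no free slot: trigger reclaim */
    return (int)free_slots[--free_count];
}

int main(void)
{
    free_list_init();
    printf("first datum cached into slot %d\n", pick_destination());
    printf("second datum cached into slot %d\n", pick_destination());
    return 0;
}
```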
10. The memory extension device of claim 1, wherein the backing store includes a third datum and wherein the memory extension logic is further configured to copy the third datum to the memory in response to a data load request received via the interconnect.
11. The memory extension device of claim 10, wherein the data load request is received from a client device and wherein the client device is configured to send the data load request in response to a page fault for the third datum.
12. The memory extension device of claim 1, wherein a data load request for the first datum is received from a client device via the interconnect, wherein the client device is configured to send the data load request and wait for a response before issuing the memory access request, and wherein the memory access request is for the first datum.
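Claims 10-12 describe a client that, after a page fault, first asks the device to stage the datum and only issues the cache-coherent access once the device has responded, which is how a stall of the CPU on a slow backing-store read is avoided. The sketch below models that handshake; the three device-facing functions are invented stand-ins, not a real driver API.

```c
/* Hypothetical sketch of the client-side flow in claims 10-12; the
 * device-facing calls are invented stand-ins, not a real driver API. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in: ask the device to copy the page from backing store to memory. */
static void send_data_load_request(uint32_t page)
{
    printf("load request sent for page %u\n", page);
}

/* Stand-in: poll the device until the copy has completed. */
static bool load_completed(uint32_t page)
{
    (void)page;
    return true;                        /* pretend it finished immediately */
}

/* Stand-in: the ordinary cache-coherent access, now guaranteed to hit the
 * device's local memory rather than stalling on the backing store. */
static uint64_t coherent_read(uint32_t page)
{
    return (uint64_t)page * 0x1000u;
}

static uint64_t handle_page_fault(uint32_t page)
{
    send_data_load_request(page);       /* 1. stage the datum              */
    while (!load_completed(page))       /* 2. wait for the response        */
        ;                               /*    (or block/yield in practice) */
    return coherent_read(page);         /* 3. issue the memory access      */
}

int main(void)
{
    printf("read value 0x%llx\n",
           (unsigned long long)handle_page_fault(7));
    return 0;
}
```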
13. The memory extension device of claim 1, wherein the memory extension logic is further configured to identify one or more portions of the memory as reclaim candidates periodically or in response to a low memory condition.
14. The memory extension device of claim 13, wherein the memory extension logic is further configured to select a portion of the memory to reclaim from among the reclaim candidates without invoking a processor of the memory extension device.
15. The memory extension device of claim 13, wherein the one or more portions are selected according to a selectable portion selection strategy.
16. The memory extension device of claim 1, wherein the memory extension logic is configured to increase, in response to a notification received via the interconnect, a probability that a portion of the memory will be selected for reclaim.
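For the reclaim behavior of claims 13-16, one hypothetical strategy is a per-page score: a periodic scan, or a low memory condition, nominates the highest-scoring pages as candidates; a pluggable strategy picks among them; and a notification received over the interconnect simply raises a page's score so that it becomes more likely to be reclaimed. The sketch below uses invented names and is not the claimed mechanism.

```c
/* Hypothetical sketch of claims 13-16; not the claimed implementation.
 * Pages with the highest reclaim score are nominated as candidates; a
 * notification received via the interconnect bumps a page's score. */
#include <stdint.h>
#include <stdio.h>

#define PAGES 8u

static uint32_t reclaim_score[PAGES];   /* higher = better reclaim candidate */

/* Notification received via the interconnect (e.g., "client no longer
 * needs this page"): make the page more likely to be selected. */
static void on_reclaim_hint(uint32_t page)
{
    reclaim_score[page] += 100;
}

/* Periodic scan, or run on a low-memory condition: pick the page with the
 * highest score. A different "portion selection strategy" could be swapped
 * in here (LRU, random, clock, ...). */
static uint32_t select_reclaim_victim(void)
{
    uint32_t victim = 0;
    for (uint32_t p = 1; p < PAGES; p++)
        if (reclaim_score[p] > reclaim_score[victim])
            victim = p;
    return victim;
}

int main(void)
{
    reclaim_score[2] = 5;
    on_reclaim_hint(6);                 /* hint makes page 6 the favourite */
    printf("reclaiming page %u\n", select_reclaim_victim());
    return 0;
}
```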
17. A memory extension method, comprising:
storing at least a first datum in a backing store of a memory extension device;
storing at least a second datum and one or more data structures in a memory of the memory extension device, wherein the one or more data structures indicate that the first datum is not accessible and that the second datum is accessible;
receiving a memory access request via an interconnect; and
performing, by a memory extension logic of the memory extension device, at least one of:
reading the one or more data structures from the memory, if the memory access request is for the one or more data structures;
reading from the memory, if the memory access request is for data included in the memory; or
reading from the backing store, if the memory access request is for data included in the backing store.
18. The memory extension method of claim 17, wherein the memory extension logic is further configured to respond to the memory access request with initialized data in response to a memory identifier specified by the memory access request being previously specified by a memory initialization request and/or a memory discard request.
19. The memory extension method of claim 18, wherein the initialized data are not included in the memory prior to receipt of the memory access request.
20. The memory extension method of claim 18, wherein the one or more data structures indicate that the initialized data are accessible prior to receipt of the memory access request.
21. The memory extension method of claim 17, wherein the memory extension logic is further configured to store the first datum in the memory in response to the memory access request if the memory access request is for the first datum.
22. The memory extension method of claim 21, wherein the memory extension logic is further configured to select a destination in the memory for the first datum without invoking a processor of the memory extension device.
23. The memory extension method of claim 17, wherein the second datum is removed from the memory in response to a memory initialization request and/or a memory discard request.
24. The memory extension method of claim 23, wherein the one or more data structures are modified to indicate that the second datum is not accessible in response to the memory discard request.
25. The memory extension method of claim 17, wherein the backing store includes a third datum and wherein the memory extension logic is further configured to copy the third datum to the memory in response to a data load request received via the interconnect.
26. The memory extension method of claim 25, wherein the data load request is received from a client device and wherein the client device is configured to send the data load request in response to a page fault for the third datum.
27. The memory extension method of claim 17, wherein a data load request for the first datum is received from a client device via the interconnect, wherein the client device is configured to send the data load request and wait for a response before issuing the memory access request, and wherein the memory access request is for the first datum.
28. The memory extension method of claim 17, wherein the memory extension logic is further configured to identify one or more portions of the memory as reclaim candidates periodically or in response to a low memory condition.
29. The memory extension method of claim 28, wherein the memory extension logic is further configured to select a portion of the memory to reclaim from among the reclaim candidates without invoking a processor of the memory extension device.
30. The memory extension method of claim 28, wherein the one or more portions are selected according to a selectable portion selection strategy.
31. The memory extension method of claim 17, wherein the memory extension logic is configured to increase, in response to a notification received via the interconnect, a probability that a portion of the memory will be selected for reclaim.
32. The memory extension method of claim 17, wherein the interconnect comprises at least one of: a processor interconnect; a peripheral interconnect; an interface providing cache-coherent memory access; a CXL interconnect; or a network including at least a second memory extension device or a second client device.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363448442P | 2023-02-27 | 2023-02-27 | |
| US63/448,442 | 2023-02-27 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024182288A1 (en) | 2024-09-06 |
Family
ID=92590912
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2024/017276 (WO2024182288A1, pending) | 2023-02-27 | 2024-02-26 | Cache-coherent memory extension with cpu stall avoidance |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024182288A1 (en) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20060225135A1 (en) * | 2005-03-31 | 2006-10-05 | Cheng Antonio S | Providing extended memory protection |
| US20080056014A1 (en) * | 2006-07-31 | 2008-03-06 | Suresh Natarajan Rajan | Memory device with emulated characteristics |
| US20140195764A1 (en) * | 2013-01-08 | 2014-07-10 | Qualcomm Incorporated | Memory device having an adaptable number of open rows |
| US20200150872A1 (en) * | 2015-04-23 | 2020-05-14 | Huawei Technologies Co., Ltd. | Method for Accessing Extended Memory, Device, and System |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12007892B2 (en) | | External memory as an extension to local primary memory |
| US11360679B2 (en) | | Paging of external memory |
| US20200371700A1 (en) | | Coordinated allocation of external memory |
| US20240020003A1 (en) | | Hardware accessible memory fabric |
| US11487675B1 (en) | | Collecting statistics for persistent memory |
| US11163699B2 (en) | | Managing least recently used cache using reduced memory footprint sequence container |
| US10372335B2 (en) | | External memory for virtualization |
| US10049055B2 (en) | | Managing asymmetric memory system as a cache device |
| US9606870B1 (en) | | Data reduction techniques in a flash-based key/value cluster storage |
| US20230008874A1 (en) | | External memory as an extension to virtualization instance memory |
| US9830092B2 (en) | | Solid state device parity caching in a hybrid storage array |
| US20120210043A1 (en) | | Systems and Methods for Managing Data Input/Output Operations |
| US20160313920A1 (en) | | System and method for an accelerator cache and physical storage tier |
| US20220382478A1 (en) | | Systems, methods, and apparatus for page migration in memory systems |
| US20250138883A1 (en) | | Distributed Memory Pooling |
| WO2024182288A1 (en) | | Cache-coherent memory extension with cpu stall avoidance |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24764408; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024764408; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |