Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an NVM-based hierarchical memory, a hierarchical storage method and a hierarchical storage system.
The method provided by the invention for promoting intelligent cooperative work of the hierarchical memory and the hierarchical storage system comprises the following steps:
Step S1, identifying the access characteristics of memory pages, classifying the memory pages into persistent cold pages, memory-access cold pages, persistent hot pages and memory-access hot pages, and performing initial partitioning of the storage space of the NVM based on the classification;
Step S2, dynamically adjusting the distribution of the memory pages between the DRAM and the NVM by tracking the memory-access heat and the persistence heat of the memory pages;
Step S3, persisting the persistent hot pages to the NVM in the background.
Preferably, step S1 comprises the following sub-steps:
Step S1.1, checking the validity of the super block; if the content of the super block is illegal, allocating and initializing the log area, and completing the initialization of the NVM (non-volatile memory) by setting the super block, at which point the NVM is not yet mounted as system memory;
Step S1.2, mounting the space in the NVM other than the direct persistent file pages, the super block and the log area as system memory through the memory hot-plug mechanism of the operating system;
Step S1.3, recording the memory-access heat of each memory page using an LRU linked list, and counting the number of persistence operations on each memory page within a preset time using a hash table;
Step S1.4, if a memory page is located in the active list of the LRU linked list in two consecutive samplings, judging the memory page to be a memory-access hot page; if the persistence count of a memory page exceeds a preset threshold, judging it to be a persistent hot page; and judging any memory page that satisfies neither hot-page condition to be a cold page.
Preferably, step S1.4 includes providing an operating system interface that allows the user to manually adjust the hot/cold attributes and the placement of memory pages.
Preferably, step S2 comprises the following sub-steps:
Step S2.1, after the NVM has been mounted as memory, enabling the memory-access heat tracking module, the persistence heat tracking module and the background migration thread in the kernel through an operating system interface, and creating a stacked file system;
Step S2.2, when a thread accesses a memory page, updating the position of the memory page in the LRU linked list to count the memory-access heat, and when an fsync persistence operation is executed, recording the range information of the persisted memory pages to count the persistence heat;
Step S2.3, if it is determined that a memory page needs to be migrated and the memory page is located in the DRAM, adding the memory page to a dedicated linked list in the kernel, marking its target storage location, and migrating the listed memory pages to the target medium one by one in the background through a kernel thread;
Step S2.4, if a memory page is a memory-access hot page but not a persistent hot page and is currently located in the NVM, migrating the memory page to the DRAM.
Preferably, in step S2.1, the memory-access heat tracking module is implemented by modifying the LRU linked-list update function, and the persistence heat tracking module is located in the fsync handling logic and the background flush-thread logic of the file system;
the background migration thread writes the persistent file pages and the direct persistent file pages on the NVM back to the SSD or HDD according to the content of the log area of the NVM, and performs migration between the DRAM and the NVM.
Preferably, in step S2.4, when the remaining capacity of the DRAM falls below a lower threshold, cold pages are randomly selected from the inactive list of the LRU linked list and migrated to the NVM until the remaining capacity of the DRAM is restored above an upper threshold, after which the migration of the memory page is performed.
Preferably, step S3 comprises the following sub-steps:
Step S3.1, if a memory page is a memory-access cold page and a persistent hot page, migrating the memory page to the NVM; if the memory page is both a memory-access hot page and a persistent hot page, determining the processing mode according to the comparison of the memory-access strength and the persistence strength:
if the memory-access strength is higher than the persistence strength, the memory page is retained in the DRAM; if it is a dirty page, it is persisted to the NVM, the page table is updated to mark it as a clean page, and the correspondence between the file and the memory page is recorded in the index area of the NVM;
if the memory-access strength is lower than the persistence strength, the memory page is migrated to the NVM and the correspondence is recorded;
Step S3.2, when a file is read through the stacked file system, first acquiring the memory pages to be read from the underlying file system, then searching the log area of the NVM for matching memory pages; if a matching memory page exists, the memory page in the NVM is mapped into the page table, otherwise the memory page is read and mapped from the SSD.
Preferably, in step S3.1, the memory-access strength is obtained by collecting page-address statistics of L3 cache misses through the Precise Event-Based Sampling (PEBS) function of the Intel CPU;
the persistence strength is obtained by counting the number of pages persisted through fsync system calls;
When comparing the memory-access strength and the persistence strength, the latency and throughput data of the DRAM and the NVM are combined to calculate the latency cost ratio of memory access to persistence, and the latency increments are then calculated from the memory-access count and the persistence count within the preset time; if the latency increment of the DRAM is larger, the persistence strength is judged to be higher, and if the latency increment of the NVM is larger, the memory-access strength is judged to be higher.
Preferably, step S3 further comprises:
Receiving the persistence-heat or memory-access-heat parameters of memory pages set by the user through an operating system interface, so as to adjust the placement and maintenance strategies of the memory pages.
The invention also provides a system for promoting intelligent cooperative work of the hierarchical memory and the hierarchical storage system, comprising:
a module M1, which identifies the access characteristics of memory pages, classifies the memory pages into persistent cold pages, memory-access cold pages, persistent hot pages and memory-access hot pages, and performs initial partitioning of the storage space of the NVM based on the classification;
a module M2, which dynamically adjusts the distribution of the memory pages between the DRAM and the NVM by tracking the memory-access heat and the persistence heat of the memory pages;
a module M3, which persists the persistent hot pages to the NVM in the background.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention combines an NVM-based hierarchical memory with a hierarchical storage system, using a single NVM simultaneously as memory and as storage, which improves the utilization of the NVM.
2. The invention intelligently realizes zero-overhead conversion between memory and storage, avoiding the copy from the memory portion of the NVM to the storage portion (or in the opposite direction) when a memory page is persisted, thereby remarkably improving system performance.
3. The invention can be used on top of any existing file system without modifying file-system code, and therefore has good compatibility and usability.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications could be made by those skilled in the art without departing from the inventive concept. These are all within the scope of the present invention.
The invention provides an NVM (non-volatile memory) based hierarchical memory and a hierarchical storage method and system. The method comprises: Step S1, identifying the access characteristics of memory pages, classifying the memory pages into persistent cold pages, memory-access cold pages, persistent hot pages and memory-access hot pages, and performing initial partitioning of the storage space of the NVM based on the classification; Step S2, dynamically adjusting the distribution of the memory pages between the dynamic random access memory (DRAM) and the NVM by tracking the memory-access heat and the persistence heat of the memory pages; and Step S3, persisting the persistent hot pages to the NVM in the background. The invention combines the hierarchical memory system with the hierarchical storage system and introduces a new migration criterion: by predicting whether a page will be persisted frequently, together with its memory-access heat, the corresponding page is intelligently migrated to the NVM or the DRAM, which reduces the overhead of both persistence operations and memory accesses.
According to the invention, a log structure and an index mechanism are adopted on the NVM, and a persistent area and a non-persistent area are explicitly divided, so that the high-bandwidth characteristic of the NVM is fully exploited for memory expansion while its persistence characteristic is utilized for fast flushing of data to stable storage. When the system crashes and restarts, this design can quickly recover the persistent data in the NVM through the index information and guarantee data consistency. In addition, a memory-page migration strategy based on two-dimensional heat evaluation is designed: by jointly analyzing the two indicators of memory-access heat and persistence heat, the optimal placement of each memory page between the DRAM and the NVM is intelligently determined, thereby optimizing the overall performance of the storage system.
Example 1:
The embodiment provides a method for promoting intelligent cooperative work of the hierarchical memory and the hierarchical storage system, comprising the following steps:
Step S1, when the non-volatile memory (NVM) is mounted for the first time, identifying the access characteristics of memory pages, classifying the memory pages into persistent cold pages, memory-access cold pages, persistent hot pages and memory-access hot pages, and performing initial partitioning of the storage space of the NVM based on the classification.
In this embodiment, a persistent cold page is a memory page whose persistence heat is below a preset threshold, a memory-access cold page is a memory page whose memory-access heat is below a preset threshold, a persistent hot page is a memory page whose persistence heat is above a preset threshold, and a memory-access hot page is a memory page whose memory-access heat is above a preset threshold.
FIG. 1 is a schematic diagram of a memory layout in an NVM in an embodiment of the present invention, and FIG. 2 is a diagram illustrating all possible page types on different media in an embodiment of the present invention.
As shown in FIGS. 1 and 2, the storage space of the NVM is divided into a plurality of functional areas, including the super block, the log area, persistent file pages and direct persistent file pages.
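As an illustration of this partitioning, the sketch below models the functional areas as (start page, page count) ranges. All names, sizes and offsets here are hypothetical examples chosen for the sketch, not the actual on-NVM format of the invention.

```python
# Illustrative model of the NVM storage-space partitioning described above:
# super block, log area, direct persistent file pages, and the remainder
# that is later hot-plugged as system memory. Sizes are placeholder values.

def partition_nvm(total_pages, log_pages=1024, direct_pages=4096):
    """Return a dict mapping each functional area to (start_page, page_count)."""
    layout = {}
    layout["superblock"] = (0, 1)                     # validity magic, area offsets
    layout["log_area"] = (1, log_pages)               # persistence log + index
    layout["direct_persistent"] = (1 + log_pages, direct_pages)
    start = 1 + log_pages + direct_pages
    layout["mountable_memory"] = (start, total_pages - start)  # mounted as RAM
    return layout

layout = partition_nvm(total_pages=262144)  # e.g. 1 GiB of 4 KiB pages
```

The areas are laid out back to back, so together they cover the whole device and never overlap.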
FIG. 3 is a schematic diagram of a process of starting a hierarchical memory and a hierarchical storage system after power-on in an embodiment of the present invention.
As shown in fig. 3, step S1 includes the following sub-steps:
Step S1.1, checking the validity of the super block; if the content of the super block is illegal, allocating and initializing the log area, and completing the initialization of the NVM by setting the super block, at which point the NVM is not yet mounted as system memory.
Step S1.2, mounting the space in the NVM other than the direct persistent file pages, the super block and the log area as system memory through the memory hot-plug mechanism of the operating system, so that the system can place pages on the NVM.
Step S1.3, recording the memory-access heat of each memory page using an LRU linked list, and counting the number of persistence operations on each memory page within a preset time using a hash table.
In this embodiment, the persistence count is the number of fsync calls.
Step S1.4, if a memory page is located in the active list of the LRU linked list in two consecutive samplings, it is judged to be a memory-access hot page; if the persistence count of a memory page exceeds a preset threshold, it is judged to be a persistent hot page; and any memory page that satisfies neither hot-page condition is judged to be a cold page.
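The classification rule of step S1.4 can be sketched in user space as follows; the function name, the threshold value and the data shapes are illustrative assumptions (the real logic resides in the kernel):

```python
# Sketch of the hot/cold classification in step S1.4: a page is access-hot
# if it sat in the LRU active list in the last two samplings, and
# persist-hot if its fsync count in the window exceeds a threshold.

PERSIST_THRESHOLD = 8  # assumed fsync count per sampling window

def classify(page, active_samples, fsync_counts):
    """active_samples: booleans, one per LRU sampling (True = in active list).
    fsync_counts: hash table (dict) mapping page -> persistence count."""
    access_hot = len(active_samples) >= 2 and all(active_samples[-2:])
    persist_hot = fsync_counts.get(page, 0) > PERSIST_THRESHOLD
    return access_hot, persist_hot  # (False, False) means a cold page
```

A page can be hot on both axes at once; the S3.1 step later resolves that case by comparing strengths.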
Further, step S1.4 includes providing an operating system interface that allows the user to manually adjust the hot/cold attributes and the placement of memory pages.
Step S2, dynamically adjusting the distribution of the memory pages between the DRAM and the NVM by tracking the access heat and the persistence heat of the memory pages.
Specifically, step S2 includes the following sub-steps:
Step S2.1, after the NVM has been mounted as memory, enabling the memory-access heat tracking module, the persistence heat tracking module and the background migration thread in the kernel through an operating system interface, and creating a stacked file system.
In this embodiment, the memory-access heat tracking module is implemented by modifying the LRU linked-list update function, and the persistence heat tracking module is located in the fsync handling logic and the background flush-thread logic of the file system.
The background migration thread writes the persistent file pages and the direct persistent file pages on the NVM back to the SSD or HDD according to the content of the log area of the NVM, and performs migration between the DRAM and the NVM.
Because the persistent file pages residing in the NVM and the file content on the underlying block device (SSD/HDD) may be inconsistent, a stacked file system is employed.
Step S2.2, when a thread accesses a memory page, the position of the memory page in the LRU linked list is updated to count the memory-access heat; when an fsync persistence operation is executed, the range information of the persisted memory pages is recorded to count the persistence heat; and whether the memory page needs to be migrated is judged by combining its current location, memory-access heat and persistence heat.
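The two trackers of step S2.2 can be sketched as follows, with a Python `OrderedDict` standing in for the kernel LRU linked list and a plain counter fed by fsync'd page ranges standing in for the hash table; all names are illustrative:

```python
from collections import OrderedDict, defaultdict

# Sketch of the step-S2.2 bookkeeping: accesses move a page to the MRU end
# of the LRU order (memory-access heat), and each fsync'd page range bumps
# a per-page persistence counter (persistence heat).

class HeatTracker:
    def __init__(self):
        self.lru = OrderedDict()              # most-recently-used last
        self.persist_count = defaultdict(int)

    def on_access(self, page):
        self.lru.pop(page, None)
        self.lru[page] = True                 # move page to the MRU end

    def on_fsync(self, first_page, npages):
        for p in range(first_page, first_page + npages):
            self.persist_count[p] += 1        # record the persisted range

t = HeatTracker()
t.on_access(1); t.on_access(2); t.on_access(1)
t.on_fsync(10, 3)
```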
Step S2.3, if it is determined that a memory page needs to be migrated and the memory page is located in the DRAM, the memory page is added to a dedicated linked list in the kernel and its target storage location is marked, and a kernel thread migrates the listed memory pages to the target medium one by one in the background.
Step S2.4, if a memory page is a memory-access hot page but not a persistent hot page and is currently located in the NVM, the memory page is migrated to the DRAM.
In this embodiment, when the remaining capacity of the DRAM falls below a lower threshold, cold pages are randomly selected from the inactive list of the LRU linked list and migrated to the NVM until the remaining capacity of the DRAM is restored above an upper threshold, after which the migration of the memory page is performed; this avoids a system crash due to insufficient memory and ensures that frequently accessed memory pages are preferentially retained in the DRAM.
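The low-watermark eviction described in this embodiment can be sketched as follows; `reclaim_dram`, the thresholds and the callback are assumed names, and a real kernel path would migrate pages asynchronously rather than inline:

```python
import random

# Sketch of the watermark-based reclaim: when free DRAM drops below the
# lower threshold, random cold pages from the inactive list are migrated
# to the NVM until free DRAM exceeds the upper threshold.

def reclaim_dram(free_pages, lower, upper, inactive_list, migrate_to_nvm):
    if free_pages >= lower:
        return free_pages                  # enough headroom, nothing to do
    while free_pages < upper and inactive_list:
        victim = random.choice(inactive_list)   # random cold-page selection
        inactive_list.remove(victim)
        migrate_to_nvm(victim)             # hand off to background migration
        free_pages += 1                    # the victim's DRAM frame is freed
    return free_pages

moved = []
free = reclaim_dram(2, lower=4, upper=8, inactive_list=list(range(20)),
                    migrate_to_nvm=moved.append)
```

The two thresholds give hysteresis: reclaim starts only below the lower mark but runs up to the upper mark, so the system does not oscillate around a single boundary.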
Step S3, persisting the persistent hot pages to the NVM in the background.
Specifically, step S3 includes the following sub-steps:
Step S3.1, if a memory page is a memory-access cold page and a persistent hot page, the memory page is migrated to the NVM; if the memory page is both a memory-access hot page and a persistent hot page, the processing mode is determined according to the comparison of the memory-access strength and the persistence strength:
If the memory-access strength is higher than the persistence strength, the memory page is retained in the DRAM; if it is a dirty page, it is persisted to the NVM, the page table is updated to mark it as a clean page, and the correspondence between the file and the memory page is recorded in the index area of the NVM. This exploits the access-speed advantage of the DRAM and reduces the persistence overhead of the memory page through a speculative strategy.
If the memory-access strength is lower than the persistence strength, the memory page is migrated to the NVM and the correspondence is recorded. Further, subsequent persistence operations are skipped entirely: the memory page is then accessed and modified directly through load/store instructions, and the modifications are persisted directly on the NVM, reducing the persistence overhead.
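The placement decision of step S3.1 can be summarized in the following sketch, where `stronger_access` stands for the strength comparison described later in the text and all return labels are illustrative names, not kernel interfaces:

```python
# Sketch of the step-S3.1 decision for a page's placement and persistence
# handling, given its classification and the strength comparison result.

def place(access_hot, persist_hot, dirty, stronger_access):
    if not access_hot and persist_hot:
        return ("NVM", "migrate")                    # access-cold, persist-hot
    if access_hot and persist_hot:
        if stronger_access:
            # keep in DRAM; speculatively persist a dirty copy to NVM
            return ("DRAM", "persist_and_mark_clean" if dirty else "keep")
        return ("NVM", "migrate_and_skip_fsync")     # later stores persist in place
    return (None, "no_action")                       # other cases handled in S2
```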
Further, in step S3.1, the memory-access strength is obtained by collecting page-address statistics of L3 cache misses through the Precise Event-Based Sampling (PEBS) function of the Intel CPU, yielding the memory-access count of the current memory page within the preset time.
The persistence strength is obtained by counting the number of pages persisted through fsync system calls.
When comparing the memory-access strength and the persistence strength, the latency and throughput data of the DRAM and the NVM are combined to calculate the latency cost ratio of memory access to persistence, and the latency increments are then calculated from the memory-access count and the persistence count within the preset time; if the latency increment of the DRAM is larger, the persistence strength is judged to be higher, and if the latency increment of the NVM is larger, the memory-access strength is judged to be higher.
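One way to read this latency-increment comparison is sketched below. The device latencies are placeholder constants, not measurements, and the cost model is an assumption for illustration; a real implementation would derive the numbers from the latency and throughput data of the actual DRAM and NVM parts:

```python
# Illustrative cost model for the strength comparison: estimate the extra
# latency incurred per placement choice and compare the two increments.

DRAM_ACCESS_NS = 80       # placeholder DRAM load latency
NVM_ACCESS_NS = 300       # placeholder NVM load latency
NVM_PERSIST_NS = 500      # placeholder cost to persist a page already on NVM
DRAM_PERSIST_NS = 2000    # placeholder cost to copy from DRAM and flush to NVM

def stronger_access(accesses, persists):
    """True if the memory-access strength dominates for this page."""
    # extra latency if the page stays in DRAM: every persist pays the copy
    dram_delta = persists * (DRAM_PERSIST_NS - NVM_PERSIST_NS)
    # extra latency if the page stays in NVM: every access is slower
    nvm_delta = accesses * (NVM_ACCESS_NS - DRAM_ACCESS_NS)
    # larger NVM increment => accesses dominate => keep the page in DRAM
    return nvm_delta > dram_delta
```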
Step S3.2, when a file is read through the stacked file system, the memory pages to be read are first acquired from the underlying file system, and the log area of the NVM is then searched for matching memory pages; if a matching memory page exists, the memory page in the NVM is mapped into the page table, otherwise the memory page is read and mapped from the SSD.
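The read path of step S3.2 can be sketched as follows, with plain dictionaries standing in for the NVM log-area index, the NVM pages and the SSD-backed file content; all names are illustrative:

```python
# Sketch of the stacked-file-system read path: prefer a page recorded in
# the NVM log-area index (the newer persisted copy), else fall back to
# the copy read from the underlying file system on the SSD.

def read_page(file, page_no, nvm_index, nvm_pages, ssd_pages, page_table):
    ssd_data = ssd_pages[(file, page_no)]    # underlying file-system read
    key = (file, page_no)
    if key in nvm_index:                     # matching page found in log area
        page_table[key] = ("NVM", nvm_index[key])
        return nvm_pages[nvm_index[key]]     # map the NVM copy instead
    page_table[key] = ("SSD", page_no)
    return ssd_data

nvm_index = {("f", 0): 7}
nvm_pages = {7: b"new"}
ssd_pages = {("f", 0): b"old", ("f", 1): b"x"}
pt = {}
r0 = read_page("f", 0, nvm_index, nvm_pages, ssd_pages, pt)
r1 = read_page("f", 1, nvm_index, nvm_pages, ssd_pages, pt)
```

Because the NVM copy wins whenever the index has an entry, a reader never sees stale SSD content for a page that was persisted to the NVM but not yet written back.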
Further, step S3 further includes:
Receiving the persistence-heat or memory-access-heat parameters of memory pages set by the user through an operating system interface, so as to adjust the placement and maintenance strategies of the memory pages.
Example 2:
The invention also provides a system for promoting intelligent cooperative work of the hierarchical memory and the hierarchical storage system. The system may be implemented by executing the flow steps of the method described above; that is, those skilled in the art may understand the method as a preferred embodiment of the system.
Specifically, the system for promoting intelligent cooperative work of the hierarchical memory and the hierarchical storage system comprises:
a module M1, which identifies the access characteristics of memory pages, classifies the memory pages into persistent cold pages, memory-access cold pages, persistent hot pages and memory-access hot pages, and performs initial partitioning of the storage space of the NVM based on the classification;
a module M2, which dynamically adjusts the distribution of the memory pages between the DRAM and the NVM by tracking the memory-access heat and the persistence heat of the memory pages;
a module M3, which persists the persistent hot pages to the NVM in the background.
Specifically, the module M1 comprises the following sub-modules:
Module M1.1 checks the validity of the super block; if the content of the super block is illegal, it allocates and initializes the log area and completes the initialization of the NVM by setting the super block, at which point the NVM is not yet mounted as system memory.
Module M1.2 mounts the space in the NVM other than the direct persistent file pages, the super block and the log area as system memory through the memory hot-plug mechanism of the operating system, so that the system can place pages on the NVM.
Module M1.3 records the memory-access heat of each memory page using an LRU linked list, and counts the number of persistence operations on each memory page within a preset time using a hash table.
In this embodiment, the persistence count is the number of fsync calls.
Module M1.4 judges a memory page to be a memory-access hot page if it is located in the active list of the LRU linked list in two consecutive samplings, judges it to be a persistent hot page if its persistence count exceeds a preset threshold, and judges any memory page that satisfies neither hot-page condition to be a cold page.
Further, the module M1.4 includes an operating system interface provided by the system that allows the user to manually adjust the hot/cold attributes and the placement of memory pages.
Specifically, module M2 includes the following sub-modules:
Module M2.1, after the NVM has been mounted as memory, enables the memory-access heat tracking module, the persistence heat tracking module and the background migration thread in the kernel through an operating system interface, and creates a stacked file system.
In this embodiment, the memory-access heat tracking module is implemented by modifying the LRU linked-list update function, and the persistence heat tracking module is located in the fsync handling logic and the background flush-thread logic of the file system.
The background migration thread writes the persistent file pages and the direct persistent file pages on the NVM back to the SSD or HDD according to the content of the log area of the NVM, and performs migration between the DRAM and the NVM.
Because the persistent file pages residing in the NVM and the file content on the underlying block device (SSD/HDD) may be inconsistent, a stacked file system is employed.
Module M2.2 updates the position of a memory page in the LRU linked list to count the memory-access heat when a thread accesses the page, records the range information of the persisted memory pages to count the persistence heat when an fsync persistence operation is executed, and judges whether the memory page needs to be migrated by combining its current location, memory-access heat and persistence heat.
Module M2.3: if a memory page is located in the NVM and its persistence count within the preset time is below the preset threshold, the memory page is regarded as a cold page and is persisted to the SSD.
Module M2.4 migrates a memory page to the DRAM if it is a memory-access hot page but not a persistent hot page and is currently located in the NVM.
In this embodiment, when the remaining capacity of the DRAM falls below a lower threshold, cold pages are randomly selected from the inactive list of the LRU linked list and migrated to the NVM until the remaining capacity of the DRAM is restored above an upper threshold, after which the migration of the memory page is performed; this avoids a system crash due to insufficient memory and ensures that frequently accessed memory pages are preferentially retained in the DRAM.
Specifically, module M3 includes the following sub-modules:
Module M3.1 migrates a memory page to the NVM if it is a memory-access cold page and a persistent hot page; if the memory page is both a memory-access hot page and a persistent hot page, the processing mode is determined according to the comparison of the memory-access strength and the persistence strength:
If the memory-access strength is higher than the persistence strength, the memory page is retained in the DRAM; if it is a dirty page, it is persisted to the NVM, the page table is updated to mark it as a clean page, and the correspondence between the file and the memory page is recorded in the index area of the NVM. This exploits the access-speed advantage of the DRAM and reduces the persistence overhead of the memory page through a speculative strategy.
If the memory-access strength is lower than the persistence strength, the memory page is migrated to the NVM and the correspondence is recorded. Further, subsequent persistence operations are skipped entirely: the memory page is then accessed and modified directly through load/store instructions, and the modifications are persisted directly on the NVM, reducing the persistence overhead.
Further, in the module M3.1, the memory-access strength is obtained by collecting page-address statistics of L3 cache misses through the Precise Event-Based Sampling (PEBS) function of the Intel CPU, yielding the memory-access count of the current memory page within the preset time.
The persistence strength is obtained by counting the number of pages persisted through fsync system calls.
When comparing the memory-access strength and the persistence strength, the latency and throughput data of the DRAM and the NVM are combined to calculate the latency cost ratio of memory access to persistence, and the latency increments are then calculated from the memory-access count and the persistence count within the preset time; if the latency increment of the DRAM is larger, the persistence strength is judged to be higher, and if the latency increment of the NVM is larger, the memory-access strength is judged to be higher.
Module M3.2, when a file is read through the stacked file system, first acquires the memory pages to be read from the underlying file system and then searches the log area of the NVM for matching memory pages; if a matching memory page exists, the memory page in the NVM is mapped into the page table, otherwise the memory page is read and mapped from the SSD.
In summary, the invention provides a method and a system for promoting intelligent cooperative work of a hierarchical memory and a hierarchical storage system, which can more intelligently identify the semantics of pages in the two systems, support zero-overhead conversion of pages between the two systems, and, compared with common hierarchical memory and hierarchical storage methods and systems, improve the memory-access and persistence performance of systems that use the NVM to a certain extent.
Those skilled in the art will appreciate that, in addition to being implemented as pure computer-readable program code, the system provided by the invention and its individual devices, modules and units can be implemented entirely by logically programming the method steps, in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, the system and its devices, modules and units can be regarded as a hardware component; the devices, modules and units realizing the various functions included therein can be regarded as structures within that hardware component, and they can equally be regarded as software modules implementing the method.
The foregoing describes specific embodiments of the present application. It is to be understood that the application is not limited to the particular embodiments described above, and that various changes or modifications may be made by those skilled in the art within the scope of the appended claims without affecting the spirit of the application. The embodiments of the application and the features of the embodiments may be combined with each other arbitrarily without conflict.