WO2016037499A1 - A memory migration method and device - Google Patents
A memory migration method and device
- Publication number
- WO2016037499A1 (PCT/CN2015/080491)
- Authority
- WIPO (PCT)
- Prior art keywords
- memory
- memory page
- node
- block
- page
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0813—Multiuser, multiprocessor or multiprocessing cache systems with a network or matrix configuration
- G06F3/0647—Migration mechanisms
- G06F12/06—Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
- G06F3/064—Management of blocks
- G06F3/0644—Management of space entities, e.g. partitions, extents, pools
- G06F3/0683—Plurality of storage devices
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5016—Allocation of resources to service a request, the resource being the memory
- G06F9/5022—Mechanisms to release resources
- G06F9/5033—Allocation of resources to service a request, the resource being a machine, considering data affinity
- G06F2212/2542—Non-uniform memory access [NUMA] architecture
Description
- the present invention relates to the field of computer application technologies, and in particular, to a memory migration method and device.
- The non-uniform memory access (NUMA) system architecture is one of the common server architectures.
- FIG. 1 is a schematic diagram of a NUMA system architecture in the prior art.
- The NUMA architecture includes multiple nodes, and each node includes a central processing unit (CPU) and a local memory area corresponding to that CPU.
- Each CPU can access data in its local memory area, or access data in the memory areas of other nodes (that is, remote memory areas) across nodes.
- Because of differences in the length of the memory access path in the bus design, the time for a CPU to access a remote memory area across nodes is much longer than the time to access its local memory area.
- To reduce access latency, data in a remote memory area therefore needs to be migrated from the remote memory area to the local memory area.
- In the prior art, data in a remote memory area is usually migrated to the local memory area in units of memory pages, so the number of migration operations equals the number of memory pages to be migrated. In this case, if there are many memory pages to be migrated, the number of migration operations is large, resulting in high CPU usage and low system performance.
- The embodiments of the present invention provide a memory migration method and device, to solve the problem that, when there are a large number of memory pages to be migrated, performing memory migration in units of memory pages requires a large number of migration operations and therefore causes high CPU usage.
- an embodiment of the present invention provides a memory migration method, including:
- The first node receives a migration instruction sent by the second node, where the migration instruction is used to indicate that all memory pages of the first node that are accessed by a target process are to be migrated from the memory area of the first node to the memory area of the second node; the target process is a process running on the second node;
- The first node sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, where each scanned memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process;
- the first node respectively determines whether each memory page satisfies a block merging condition, and merges a memory page that satisfies the block merging condition into a corresponding memory block;
- the first node migrates the corresponding memory block to a memory area of the second node.
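The steps above can be sketched as a single scan-and-merge pass. This is an illustrative reconstruction, not the patent's reference implementation: the names `build_blocks` and `satisfies_merge_condition`, the owner labels, and the threshold of 10 pages per block (taken from the worked example later in the text) are assumptions.

```python
# Illustrative reconstruction of the claimed scan-and-merge pass.
MAX_BLOCK_PAGES = 10  # preset upper bound on pages per memory block (assumed)

def build_blocks(pages, satisfies_merge_condition):
    """Group a sorted list of (page_index, owner) tuples into memory blocks.

    A page is appended to the current block when it satisfies the merge
    condition, is physically contiguous with the block, and the block is
    not full; a target-process page may start a new block; any other page
    closes the current block.
    """
    blocks = []
    current = []
    for index, owner in pages:
        contiguous = bool(current) and index == current[-1][0] + 1
        if satisfies_merge_condition(owner) and contiguous and len(current) < MAX_BLOCK_PAGES:
            current.append((index, owner))
        elif owner == "target":
            if current:
                blocks.append(current)
            current = [(index, owner)]  # a new block starts on a target-process page
        else:
            if current:
                blocks.append(current)
            current = []  # non-mergeable page: wait for the next target-process page
    if current:
        blocks.append(current)
    return blocks
```

Each resulting block is a run of contiguous pages and would then be migrated in one operation instead of one operation per page.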
- the memory page that meets a block merging condition includes:
- the determining whether the memory page satisfies a block merge condition includes:
- where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
- if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, it is determined that the memory page satisfies the block merging condition.
- With reference to the first aspect or any one of the foregoing possible implementation manners of the first aspect, for a first memory page, when it is determined that the first memory page satisfies the block merging condition, the merging the first memory page into a corresponding memory block includes:
- the first memory page is merged into the first memory block
- or the first memory page is used as the starting memory page of a second memory block and is merged into the second memory block;
- where the first memory page is any memory page of the first node between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process;
- and the first memory block is the memory block that contains the memory page immediately preceding and contiguous with the first memory page.
- the method further includes:
- if the second memory page is a memory page accessed by the target process, the second memory page is merged into a third memory block as the starting memory page of the third memory block.
- an embodiment of the present invention provides a memory migration device, including:
- a receiving unit configured to receive a migration instruction sent by the second node, where the migration instruction is used to indicate that all memory pages at the first node accessed by the target process are migrated from the memory area of the first node to a memory area of the second node; the target process is a process running on the second node;
- a scanning unit, configured to sequentially scan, according to the migration instruction received by the receiving unit, each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, where each scanned memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process;
- a determining unit, configured to determine, for each memory page, whether the memory page satisfies a block merging condition;
- a block merging unit configured to merge a memory page determined by the determining unit that satisfies the block merging condition into a corresponding memory block;
- a sending unit configured to migrate the corresponding memory block to a memory area of the second node.
- the memory page accessed by the non-target process that satisfies the block merging condition includes:
- the determining unit is specifically configured to:
- where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
- if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, it is determined that the memory page satisfies the block merging condition.
- the block merging unit is specifically configured to:
- the first memory page is merged into the first memory block
- or the first memory page is used as the starting memory page of a second memory block and is merged into the second memory block;
- where the first memory page is any memory page of the first node between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process;
- and the first memory block is the memory block that contains the memory page immediately preceding and contiguous with the first memory page.
- If the determining unit determines that the first memory page does not satisfy the block merging condition, or the block merging unit determines that the number of memory pages included in the first memory block is equal to a preset threshold, and the first memory page is a memory page accessed by a non-target process:
- the determining unit is further configured to determine whether a second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
- the block merging unit is further configured to: if the second memory page is a memory page accessed by the target process, merge the second memory page into a third memory block as the starting memory page of the third memory block.
- The embodiments of the present invention provide a memory migration method and device. The first node receives the migration instruction sent by the second node and sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process.
- The memory pages accessed by the target process, and the memory pages accessed by non-target processes that satisfy the block merging condition, are merged into corresponding memory blocks, so that discrete memory pages are merged together as far as possible and are migrated in units of memory blocks, thereby greatly reducing the number of memory migration operations and improving CPU utilization.
- FIG. 1 is a schematic diagram of a NUMA system architecture in the prior art
- FIG. 2 is a flowchart of a memory migration method according to an embodiment of the present invention.
- FIG. 3 is a schematic diagram of a memory page merge in a memory area with consecutive physical addresses
- FIG. 4 is a schematic diagram of a memory migration device according to an embodiment of the present invention;
- FIG. 5 is a schematic diagram of a memory migration device according to an embodiment of the present invention.
- The memory migration method provided by the embodiments of the present invention is applicable to the non-uniform memory access (NUMA) system architecture shown in FIG. 1, and is also applicable to memory migration in other scenarios (such as a non-NUMA system or a virtualized scenario), which is not limited in the present invention.
- The following describes the present invention by taking memory migration in the NUMA system architecture shown in FIG. 1 as an example.
- The memory area of each node is divided into multiple memory pages, where a memory page is the minimum storage unit used to store data; each memory page occupies 4 KB of physical memory, that is, it is 4 KB in size.
- A memory page is identified by a contiguous range of physical addresses; for example, one memory page occupies the contiguous memory area with physical addresses 00001000 to 00001FFF.
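Under the 4 KB page size described above, the page containing a given physical address can be found by masking off the low 12 bits. A minimal sketch (the helper name `page_bounds` is ours, not from the patent):

```python
# Minimal sketch of the 4 KB paging arithmetic described above.
PAGE_SIZE = 0x1000  # 4 KB

def page_bounds(phys_addr):
    """Return (first, last) physical address of the page holding phys_addr."""
    start = phys_addr & ~(PAGE_SIZE - 1)  # clear the low 12 bits
    return start, start + PAGE_SIZE - 1

# The page quoted in the text spans 00001000..00001FFF:
assert page_bounds(0x00001234) == (0x00001000, 0x00001FFF)
```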
- FIG. 3 is a schematic diagram of a memory page merge in a memory area with consecutive physical addresses.
- As shown in FIG. 3, the memory area with physical addresses 00000000 to 00017FFF includes 24 memory pages, which are accessed by the target process, a first process, and a second process, as well as free memory pages, where a free memory page is a memory page in the memory area that stores no data, that is, an unused memory page.
- The target process accesses 11 memory pages, and these 11 memory pages are not completely contiguous: they are separated by the memory pages accessed by the first process, the memory pages accessed by the second process, and the free memory pages.
- the embodiment of the present invention provides a memory migration method, which is applied to the NUMA system architecture shown in FIG. 1 .
- the method may include:
- the first node receives a migration instruction sent by the second node.
- The migration instruction is used to indicate that all memory pages of the first node accessed by the target process are to be migrated from the memory area of the first node to the memory area of the second node; the target process is any process running on the second node.
- the first node and the second node are any two different nodes in the NUMA architecture.
- the first node sequentially scans each memory page between a physical address of a starting memory page accessed by the target process and a physical address of a last memory page accessed by the target process.
- the memory page is a memory page accessed by the target process, or a memory page accessed by the non-target process.
- Specifically, the first node may learn the physical addresses of the memory pages accessed by the target process according to the mapping relationship, stored in the system, between the virtual addresses and the physical addresses of the memory pages accessed by a process. Starting from the physical address of the starting memory page accessed by the target process, the first node sequentially scans each memory page between the physical address of the starting memory page and the physical address of the last memory page accessed by the target process.
- For example, if the starting memory page accessed by the target process is the memory page identified by the contiguous physical addresses 00000000 to 00000FFF, and the last memory page accessed by the target process is the memory page identified by the contiguous physical addresses 00017000 to 00017FFF, the first node sequentially scans each memory page in the memory area shown in FIG. 3, starting from the first memory page, until the memory page identified by the physical addresses 00017000 to 00017FFF is reached.
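The scan described above can be sketched as a generator over page indices. Here `page_table` (a mapping from page index to the process accessing it, with `None` for a free page) is an assumed representation, not a structure from the patent:

```python
# Sketch of the sequential scan over the range of pages touched by the
# target process; page_table maps page index -> accessing process.
def scan_range(page_table, start_page, last_page):
    """Yield (page_index, owner) for every page from start to last, inclusive."""
    for index in range(start_page, last_page + 1):
        # Pages absent from the table are free pages (owner None).
        yield index, page_table.get(index)
```

For the FIG. 3 example, the scan would cover pages 0 through 23 (addresses 00000000 to 00017FFF).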
- the first node determines whether each memory page satisfies a block merging condition, and merges the memory page that meets the block merging condition into a corresponding memory block.
- When scanning a memory page that is accessed by the target process, the first node directly determines that the memory page satisfies the block merging condition;
- when scanning a memory page that is accessed by a non-target process, the first node may determine whether the memory page satisfies the block merging condition according to the following three cases (1), (2), and (3):
- (1) If the memory page is a memory page accessed by the first process, where the first process is a process running on the second node other than the target process, it is determined that the memory page satisfies the block merging condition.
- (2) If the memory page is a memory page accessed by the second process, where the second process is a process running on a third node, and the third node is any node in the NUMA system architecture other than the first node and the second node, the first node queries the distance relationship table stored in the operating system, where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
- if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, it is determined that the memory page satisfies the block merging condition.
- Conversely, if the distance at which the third node accesses the memory area of the second node is greater than the distance at which the third node accesses the memory area of the first node, this indicates that, after the memory page accessed by the second process is migrated from the first node to the second node, the access time of the second process would become too long and system performance would be degraded; therefore, it is determined that the memory page does not satisfy the block merging condition.
- It should be noted that the distance relationship table in the NUMA system architecture is stored in the operating system; the distance relationship table records the distances at which every node accesses the memory area of every other node, these distances are known to all nodes, and the distances in the distance relationship table are fixed.
- Table 1 is a distance relationship table stored in the NUMA system architecture shown in FIG. 1.
- In Table 1, the distance at which the first node accesses the memory area of the third node is 21, and the distance at which the second node accesses the memory area of the third node is also 21, which means that the time taken by any process on the third node to access the memory area of the first node is the same as the time taken to access the memory area of the second node.
- Therefore, after the memory pages accessed by the first process running on the third node are migrated to the second node, the processing performance of the first process is not affected.
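The distance check in case (2) amounts to a table lookup and a comparison. In the sketch below, the `DISTANCE` values are assumptions modelled on the single value (21) quoted from Table 1, and the node names are illustrative:

```python
# Sketch of the distance-table check in case (2); values and node names
# are illustrative assumptions, not the patent's actual Table 1.
DISTANCE = {
    ("node3", "node1"): 21,  # third node -> first node's memory area
    ("node3", "node2"): 21,  # third node -> second node's memory area
}

def page_may_merge(third_node, first_node, second_node):
    """True when migrating to the second node does not lengthen the
    third-node process's access distance (the block merging condition)."""
    return DISTANCE[(third_node, second_node)] <= DISTANCE[(third_node, first_node)]
```

With equal distances (21 and 21, as in the example), the condition holds and the page may be merged.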
- the first node migrates the corresponding memory block to a memory area of the second node.
- Otherwise, if a memory block contained too many memory pages, the burden of migrating that memory block would be excessive.
- For a first memory page, where the first memory page is any memory page of the first node between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process: if it is determined that the first memory page satisfies the block merging condition, the first memory page can be merged into the corresponding memory block by the following specific steps (I)-(V):
- Here, the first memory block is the memory block that contains the memory page immediately preceding and contiguous with the first memory page.
- step (III) or (IV) is performed according to the specific situation of the first memory page.
- step (IV) is performed.
- If the second memory page is a memory page accessed by the target process, the second memory page is merged into a third memory block as the starting memory page of the third memory block;
- otherwise, scanning continues, and when a scanned memory page is a memory page accessed by the target process, that memory page is merged into the third memory block as the starting memory page of the third memory block.
- The preset threshold is set in advance according to requirements and is not limited in this embodiment of the present invention; preferably, it may be any value from 5 to 10.
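The per-page decision driven by the block merging condition and the preset threshold might be sketched as follows. The function name `placement` and the returned labels are illustrative, and the threshold of 10 follows the worked example below:

```python
# Illustrative per-page decision combining the block merging condition
# with the preset threshold on pages per block.
MAX_BLOCK_PAGES = 10  # preset threshold, per the worked example

def placement(satisfies_condition, current_block_len, is_target_page):
    """Decide where a scanned page goes, in the spirit of steps (I)-(V)."""
    if satisfies_condition and current_block_len < MAX_BLOCK_PAGES:
        return "merge into current block"
    if is_target_page:
        return "start new block"  # the page becomes a block's starting page
    return "skip"  # keep scanning until the next target-process page
```

A target-process page that arrives when the current block is already full therefore starts a new block, while a non-mergeable non-target page simply closes the current block.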
- the method may further include:
- If the second memory page is a memory page accessed by the target process, the second memory page is merged into a third memory block as the starting memory page of the third memory block;
- otherwise, scanning continues, and when a scanned memory page is a memory page accessed by the target process, that memory page is merged into the third memory block as the starting memory page of the third memory block.
- the starting memory page accessed by the target process is directly merged into the first memory block as the starting page of the first memory block.
- The above method is now described in detail by merging the memory pages between the starting memory page and the last memory page accessed by the target process in the memory area of the first node shown in FIG. 3, where the maximum number of memory pages allowed in each memory block is 10, the first process runs on the third node, and the second process runs on the fourth node:
- The first memory page with physical addresses 00000000 to 00000FFF is scanned; as the starting memory page accessed by the target process, it is directly merged into memory block 1 as the starting memory page of memory block 1;
- The second memory page with physical addresses 00001000 to 00001FFF is scanned next; it is determined that the second memory page is a memory page accessed by the target process and that memory block 1 contains fewer than 10 memory pages, so the second memory page is merged into memory block 1;
- The third memory page with physical addresses 00002000 to 00002FFF (that is, a memory page accessed by the first process) is scanned next; according to the stored node distance relationship table (such as Table 1), it is determined that the third memory page satisfies the block merging condition and that memory block 1 contains fewer than 10 memory pages, so the third memory page is merged into memory block 1;
- The fourth memory page with physical addresses 00003000 to 00003FFF (that is, a memory page accessed by the first process) is scanned next; according to the stored node distance relationship table (such as Table 1), it is determined that the fourth memory page satisfies the block merging condition and that memory block 1 contains fewer than 10 memory pages, so the fourth memory page is merged into memory block 1;
- The fifth memory page with physical addresses 00004000 to 00004FFF is scanned next; it is a free page, it is determined that the fifth memory page satisfies the block merging condition and that the number of memory pages in memory block 1 is less than 10, so the fifth memory page is merged into memory block 1;
- The seventh memory page with physical addresses 00006000 to 00006FFF and the eighth memory page with physical addresses 00007000 to 00007FFF are scanned next and are merged into memory block 1 in turn;
- The eleventh memory page with physical addresses 0000A000 to 0000AFFF, the twelfth memory page with physical addresses 0000B000 to 0000BFFF, the thirteenth memory page with physical addresses 0000C000 to 0000CFFF, the fourteenth memory page with physical addresses 0000D000 to 0000DFFF, the fifteenth memory page with physical addresses 0000E000 to 0000EFFF, and the sixteenth memory page with physical addresses 0000F000 to 0000FFFF are scanned in turn; it is determined that none of these memory pages is a memory page accessed by the target process, so none of them can be the starting memory page of memory block 2;
- The seventeenth memory page with physical addresses 00010000 to 00010FFF is scanned next; it is determined that the seventeenth memory page is a memory page accessed by the target process, so the seventeenth memory page is used as the starting memory page of memory block 2, and memory block 2 is started;
- The eighteenth memory page with physical addresses 00011000 to 00011FFF is scanned next; it is determined that the eighteenth memory page is a memory page accessed by the target process and that the number of memory pages in memory block 2 is less than 10, so the eighteenth memory page is merged into memory block 2;
- The nineteenth memory page with physical addresses 00012000 to 00012FFF and the twentieth memory page with physical addresses 00013000 to 00013FFF are merged into memory block 2;
- The 23rd memory page with physical addresses 00016000 to 00016FFF is scanned; it is a memory page accessed by the target process, and the number of memory pages in memory block 2 is less than 10, so the 23rd memory page is merged into memory block 2;
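Counting migrations in this walkthrough illustrates the claimed saving. The figure of 11 target-process pages comes from the description of FIG. 3; treating the walkthrough's two merged blocks as the complete result is an assumption, since the handling of some pages is elided from the text:

```python
# Back-of-the-envelope migration count for the FIG. 3 walkthrough.
target_pages = 11            # discrete pages the target process accesses (from the text)
per_page_migrations = target_pages   # prior art: one migration per page
block_migrations = 2         # memory block 1 and memory block 2 (assumed complete)

savings = per_page_migrations - block_migrations
print(f"{per_page_migrations} page migrations vs {block_migrations} block migrations")
```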
- The embodiment of the present invention provides a memory migration method. The first node receives the migration instruction sent by the second node and sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, merging the memory pages that satisfy the block merging condition into memory blocks.
- In this way, discrete memory pages can be merged together as far as possible without increasing the system performance cost, and migration is performed in units of memory blocks, greatly reducing the number of memory migration operations and increasing CPU utilization.
- An embodiment of the present invention further provides a memory migration device 40, where the memory migration device 40 may be any node in the NUMA system architecture shown in FIG. 1. As shown in FIG. 4, the device may include:
- The receiving unit 401 is configured to receive a migration instruction sent by the second node, where the migration instruction is used to indicate that all memory pages of the first node accessed by the target process are migrated from the memory area of the first node to the memory area of the second node; the target process is a process running on the second node.
- the second node may be any node other than the first node in the NUMA system architecture as shown in FIG. 1.
- the scanning unit 402 is configured to, when the receiving unit 401 receives the migration instruction, sequentially scan each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process.
- each memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process.
- the determining unit 403 is configured to respectively determine whether each memory page satisfies a block merging condition.
- the block merging unit 404 is configured to merge the memory pages determined by the determining unit 403 that satisfy the block merging condition into the corresponding memory block.
- the sending unit 405 is configured to migrate the corresponding memory block to a memory area of the second node.
- the scanning unit 402 is specifically configured to: obtain the physical addresses of the memory pages accessed by the target process according to the mapping relationship, stored in the system, between the virtual addresses and physical addresses of memory pages accessed by processes; and, starting from the physical address of the starting memory page accessed by the target process, sequentially scan each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, until the last memory page accessed by the target process is reached.
- when a scanned memory page is a memory page accessed by the target process, the determining unit 403 directly determines that the memory page satisfies the block merging condition;
- when a scanned memory page is not a memory page accessed by the target process, the determining unit 403 is specifically configured to determine whether the memory page satisfies the block merging condition according to the following three cases (1), (2), and (3):
- (1) if the memory page is a free memory page, it is determined that the memory page satisfies the block merging condition;
- (2) if the memory page is a memory page accessed by the first process, where the first process is a process running on the second node other than the target process, it is determined that the memory page satisfies the block merging condition;
- (3) if the memory page is a memory page accessed by the second process, where the second process is a process running on the third node, and the third node is any node in the NUMA system architecture other than the first node and the second node, then the determining unit 403 queries the distance relationship table stored in the operating system, where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
- if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, it is determined that the memory page satisfies the block merging condition;
- if the distance at which the third node accesses the memory area of the second node is greater than the distance at which the third node accesses the memory area of the first node, migrating the memory page accessed by the second process on the third node from the first node to the second node would make that process's access time too long and degrade system performance, so it is determined that the memory page does not satisfy the block merging condition.
- the distance relationship table in the NUMA system architecture is stored in the operating system; it contains the distances at which each node accesses the memory areas of the other nodes, is known to all nodes, and the distances in the table are fixed.
- Table 1 is a distance relationship table stored in the NUMA system architecture shown in FIG. 1.
- as shown in Table 1, the distance at which the first node accesses the memory area of the third node is 21, and the distance at which the second node accesses the memory area of the third node is also 21; this means that the time for any process on the third node to access the memory area of the first node is the same as the time to access the memory area of the second node, so migrating the memory pages on the first node that are accessed by a process on the third node to the second node does not affect that process's processing performance.
- further, to avoid each memory block containing so many memory pages that migrating the block becomes too burdensome, for a first memory page, where the first memory page is any memory page in the memory area of the first node between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, when the determining unit 403 determines that the first memory page satisfies the block merging condition, the block merging unit 404 is specifically configured to:
- determine whether the number of memory pages contained in the first memory block is less than a preset threshold, where the first memory block is the memory block containing the previous memory page contiguous with the first memory page;
- if the number of memory pages contained in the first memory block is less than the preset threshold, merge the first memory page into the first memory block;
- if the number of memory pages contained in the first memory block equals the preset threshold and the first memory page is a memory page accessed by the target process, stop merging pages into the first memory block, and merge the first memory page into the second memory block as the starting memory page of the second memory block;
- if the first memory page is a memory page accessed by a non-target process that satisfies the block merging condition, determine whether the second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
- if the second memory page is a memory page accessed by the target process, merge the second memory page into the third memory block as the starting memory page of the third memory block;
- if the second memory page is not a memory page accessed by the target process, continue scanning memory pages in sequence, and when a scanned memory page is a memory page accessed by the target process, merge that memory page into the third memory block as its starting memory page.
- the preset threshold is set in advance according to requirements and is not limited in the embodiment of the present invention; preferably, it may be any value from 5 to 10.
- further, if, after the first memory page is scanned, the determining unit 403 determines that the first memory page does not satisfy the block merging condition, the block merging unit 404 ends the merging of pages into the first memory block; at the same time, the determining unit 403 is further configured to determine whether the second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
- the block merging unit 404 is further configured to, if the second memory page is a memory page accessed by the target process, merge the second memory page into the third memory block as the starting memory page of the third memory block;
- if the second memory page is not a memory page accessed by the target process, memory pages continue to be scanned in sequence, and when a scanned memory page is a memory page accessed by the target process, that memory page is merged into the third memory block as its starting memory page.
- it should be noted that, in the embodiment of the present invention, the starting memory page accessed by the target process is directly merged into the first memory block as the starting page of the first memory block.
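The threshold handling described above can be sketched as follows; the function name, the list representation of a block, and the constant are illustrative assumptions, not an interface defined by the patent:

```python
# Hypothetical sketch of the decision applied to a first memory page that
# already satisfies the block merging condition. A block is modeled as a
# list of pages; True marks a page accessed by the target process.
PRESET_THRESHOLD = 10  # preferably any value from 5 to 10

def place_page(first_block, is_target_page):
    """Return the block the page was merged into, or None when the full
    block was closed and the (non-target) page could not start a new one."""
    if len(first_block) < PRESET_THRESHOLD:
        first_block.append(is_target_page)  # room left: merge into first block
        return first_block
    if is_target_page:
        return [is_target_page]             # block full: target page starts a new block
    return None  # block full, non-target page: keep scanning for a target page
```

A fresh list returned here models the second (or third) memory block described above, which may only be opened by a page accessed by the target process.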
- an embodiment of the present invention provides a memory migration device 40, which receives a migration instruction sent by a second node, sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, merges the memory pages that satisfy the block merging condition into corresponding memory blocks, and migrates the corresponding memory blocks to the memory area of the second node, so that discrete memory pages are merged together as much as possible without increasing the system performance cost, and migration proceeds in units of memory blocks, greatly reducing the number of memory migrations and improving CPU utilization.
- the embodiment of the present invention further provides a memory migration device 50, where the memory migration device 50 may be any node in the NUMA system architecture shown in FIG. 1.
- as shown in FIG. 5, the device may include: a communication unit 501, a processor 502, a memory 503, and at least one communication bus 504;
- the communication unit 501 can be configured to perform data transmission with an external device or other nodes in the NUMA system through the communication bus 504.
- the processor 502 can be a central processing unit (English: central processing unit, abbreviated as CPU).
- the memory 503 may be a volatile memory, such as a random-access memory (RAM); or a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the above types of memory, and provides instructions and data to the processor 502;
- the memory 503 includes a memory area 5031 for providing required instructions and data to processes running on the processor 502.
- the communication unit 501 is configured to receive a migration instruction sent by the second node, where the migration instruction is used to indicate that all memory pages of the first node accessed by the target process are migrated from the memory area of the first node a memory area to the second node; the target process is a process running on the second node.
- the second node may be any node other than the first node in the NUMA system architecture as shown in FIG. 1.
- the processor 502 is configured to, when the communication unit 501 receives the migration instruction, sequentially scan each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, where each memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process; and to determine, for each memory page, whether it satisfies the block merging condition, merging the memory pages that satisfy the block merging condition into corresponding memory blocks.
- the communication unit 501 is further configured to migrate the corresponding memory block obtained by the processor 502 to a memory area of the second node.
- the processor 502 is specifically configured to: obtain the physical addresses of the memory pages accessed by the target process according to the mapping relationship, stored in the system, between the virtual addresses and physical addresses of memory pages accessed by processes; and, starting from the physical address of the starting memory page accessed by the target process, sequentially scan each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, until the last memory page accessed by the target process is reached.
- when a scanned memory page is a memory page accessed by the target process, the processor 502 directly determines that the memory page satisfies the block merging condition;
- when a scanned memory page is not a memory page accessed by the target process, the processor 502 is specifically configured to determine whether the memory page satisfies the block merging condition according to the following three cases (1), (2), and (3):
- (1) if the memory page is a free memory page, it is determined that the memory page satisfies the block merging condition;
- (2) if the memory page is a memory page accessed by the first process, where the first process is a process running on the second node other than the target process, it is determined that the memory page satisfies the block merging condition;
- (3) if the memory page is a memory page accessed by the second process, where the second process is a process running on the third node, and the third node is any node in the NUMA system architecture other than the first node and the second node, then the processor 502 queries the distance relationship table stored in the operating system, where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
- if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, it is determined that the memory page satisfies the block merging condition;
- if the distance at which the third node accesses the memory area of the second node is greater than the distance at which the third node accesses the memory area of the first node, migrating the memory page accessed by the second process on the third node from the first node to the second node would make that process's access time too long and degrade system performance, so it is determined that the memory page does not satisfy the block merging condition.
- the distance relationship table in the NUMA system architecture is stored in the operating system; it contains the distances at which each node accesses the memory areas of the other nodes, is known to all nodes, and the distances in the table are fixed.
- Table 1 is a distance relationship table stored in the NUMA system architecture shown in FIG. 1.
- as shown in Table 1, the distance at which the first node accesses the memory area of the third node is 21, and the distance at which the second node accesses the memory area of the third node is also 21; this means that the time for any process on the third node to access the memory area of the first node is the same as the time to access the memory area of the second node, so migrating the memory pages on the first node that are accessed by a process on the third node to the second node does not affect that process's processing performance.
- further, to avoid each memory block containing so many memory pages that migrating the block becomes too burdensome, for a first memory page, where the first memory page is any memory page in the memory area of the first node between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, when the processor 502 determines that the first memory page satisfies the block merging condition, the processor 502 is specifically configured to:
- determine whether the number of memory pages contained in the first memory block is less than a preset threshold, where the first memory block is the memory block containing the previous memory page contiguous with the first memory page;
- if the number of memory pages contained in the first memory block is less than the preset threshold, merge the first memory page into the first memory block;
- if the number of memory pages contained in the first memory block equals the preset threshold and the first memory page is a memory page accessed by the target process, stop merging pages into the first memory block, and merge the first memory page into the second memory block as the starting memory page of the second memory block;
- if the first memory page is a memory page accessed by a non-target process that satisfies the block merging condition, determine whether the second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
- if the second memory page is a memory page accessed by the target process, merge the second memory page into the third memory block as the starting memory page of the third memory block;
- if the second memory page is not a memory page accessed by the target process, continue scanning memory pages in sequence, and when a scanned memory page is a memory page accessed by the target process, merge that memory page into the third memory block as its starting memory page.
- the preset threshold is set in advance according to requirements and is not limited in the embodiment of the present invention; preferably, it may be any value from 5 to 10.
- further, if, after the first memory page is scanned, the processor 502 determines that the first memory page does not satisfy the block merging condition, the processor 502 ends the merging of pages into the first memory block; at the same time, the processor 502 is further configured to determine whether the second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
- if the second memory page is a memory page accessed by the target process, the second memory page is merged into the third memory block as the starting memory page of the third memory block;
- if the second memory page is not a memory page accessed by the target process, memory pages continue to be scanned in sequence, and when a scanned memory page is a memory page accessed by the target process, that memory page is merged into the third memory block as its starting memory page.
- it should be noted that, in the embodiment of the present invention, the starting memory page accessed by the target process is directly merged into the first memory block as the starting page of the first memory block.
- the embodiment of the present invention provides a memory migration device 50, which receives a migration instruction sent by a second node and sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, where the memory pages include memory pages accessed by the target process and memory pages accessed by non-target processes; the memory pages satisfying the block merging condition are merged into corresponding memory blocks, and the corresponding memory blocks are migrated to the memory area of the second node;
- in this way, discrete memory pages are merged together as much as possible without increasing the system performance cost, and migration proceeds in units of memory blocks, greatly reducing the number of memory migrations and improving CPU utilization.
- in the several embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other ways.
- the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and in actual implementation there may be other divisions; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical or otherwise.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may be physically included separately, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
- the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
- the software functional units described above are stored in a storage medium and include several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform some of the steps of the methods described in the embodiments of the present invention.
- the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Abstract
A memory migration method and device, relating to the field of computer application technology. By merging memory pages into memory blocks, the number of migrations is reduced and CPU utilization is improved. The method includes: a first node receives a migration instruction sent by a second node (201), and sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process (202), where each memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process; it is determined for each memory page whether it satisfies a block merging condition, memory pages satisfying the block merging condition are merged into corresponding memory blocks (203), and the corresponding memory blocks are migrated to the memory area of the second node (204).
Description
The present invention relates to the field of computer application technology, and in particular to a memory migration method and device.
The Non-Uniform Memory Access (NUMA) system architecture is one type of server architecture. FIG. 1 is a schematic diagram of a prior-art NUMA system architecture. As shown in FIG. 1, the NUMA architecture contains multiple nodes, each containing a central processing unit (CPU) and a memory area corresponding to that CPU, where data in the memory area is stored with the memory page as the smallest unit. In the NUMA architecture, each CPU can access data in its local memory area, and can also access data in the memory areas of other nodes (i.e., non-local memory areas) across nodes. As the number of nodes grows, differences in bus design produce differences in the lengths of memory access links, so that the time for a CPU to access a non-local memory area across nodes is far greater than the time for the CPU to access its local memory area. To solve the problem that, under the NUMA architecture, it takes a node's CPU a long time to access a remote memory area, the data in the remote memory area needs to be migrated from the remote memory area to the local memory area.
In the prior art, the data in a remote memory area is usually migrated to the local memory area in units of memory pages. Because memory migration is performed in units of memory pages, the number of migrations equals the number of memory pages to be migrated; when there are many memory pages to migrate, this causes a large number of migrations, resulting in high CPU occupancy and low system performance.
Summary of the Invention
The embodiments of the present invention provide a memory migration method and device, which solve the problem that, when there are a large number of memory pages to be migrated, performing memory migration in units of memory pages requires many migrations and causes high CPU occupancy.
To achieve the above objective, the technical solutions adopted by the present invention are as follows:
In a first aspect, an embodiment of the present invention provides a memory migration method, including:
receiving, by a first node, a migration instruction sent by a second node, where the migration instruction is used to instruct that all memory pages on the first node accessed by a target process be migrated from the memory area of the first node to the memory area of the second node; the target process is a process running on the second node;
sequentially scanning, by the first node, each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, where each memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process;
determining, by the first node, for each memory page whether it satisfies a block merging condition, and merging the memory pages that satisfy the block merging condition into corresponding memory blocks;
migrating, by the first node, the corresponding memory blocks to the memory area of the second node.
In a first possible implementation of the first aspect, with reference to the first aspect, the memory pages satisfying the block merging condition include:
memory pages accessed by the target process;
or free memory pages;
or memory pages accessed by a first process, where the first process is a process other than the target process running on the second node.
In a second possible implementation of the first aspect, with reference to the first aspect or the first possible implementation of the first aspect, when a memory page is a memory page accessed by a second process, where the second process is a process running on a third node, the determining whether the memory page satisfies the block merging condition includes:
querying a distance relationship table stored in the operating system, where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
determining whether the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node;
if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, determining that the memory page satisfies the block merging condition.
In a third possible implementation of the first aspect, with reference to any one of the first aspect to the second possible implementation of the first aspect, for a first memory page, when it is determined that the first memory page satisfies the block merging condition, merging the first memory page into the corresponding memory block includes:
determining whether the number of memory pages contained in a first memory block is less than a preset threshold;
if the number of memory pages contained in the first memory block is less than the preset threshold, merging the first memory page into the first memory block;
if the number of memory pages contained in the first memory block equals the preset threshold and the first memory page is a memory page accessed by the target process, merging the first memory page into a second memory block as the starting memory page of the second memory block;
where the first memory page is any memory page in the memory area of the first node between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process;
and the first memory block is the memory block containing the previous memory page contiguous with the first memory page.
In a fourth possible implementation of the first aspect, with reference to the third possible implementation of the first aspect, if it is determined that the first memory page does not satisfy the block merging condition, or if the number of memory pages contained in the first memory block equals the preset threshold and the first memory page is a memory page accessed by a non-target process, the method further includes:
determining whether a second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
if the second memory page is a memory page accessed by the target process, merging the second memory page into a third memory block as the starting memory page of the third memory block.
In a second aspect, an embodiment of the present invention provides a memory migration device, including:
a receiving unit, configured to receive a migration instruction sent by a second node, where the migration instruction is used to instruct that all memory pages on the first node accessed by a target process be migrated from the memory area of the first node to the memory area of the second node; the target process is a process running on the second node;
a scanning unit, configured to sequentially scan, according to the migration instruction received by the receiving unit, each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, where each memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process;
a determining unit, configured to determine, for each memory page, whether it satisfies a block merging condition;
a block merging unit, configured to merge the memory pages determined by the determining unit to satisfy the block merging condition into corresponding memory blocks;
a sending unit, configured to migrate the corresponding memory blocks to the memory area of the second node.
In a first possible implementation of the second aspect, with reference to the second aspect, the memory pages satisfying the block merging condition include:
memory pages accessed by the target process;
or free memory pages;
or memory pages accessed by a first process, where the first process is a process other than the target process running on the second node.
In a second possible implementation of the second aspect, with reference to the second aspect or the first possible implementation of the second aspect, when a memory page is a memory page accessed by a second process, where the second process is a process running on a third node, the determining unit is specifically configured to:
query a distance relationship table stored in the operating system, where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
determine whether the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node;
if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, determine that the memory page satisfies the block merging condition.
In a third possible implementation of the second aspect, with reference to any one of the second aspect to the second possible implementation of the second aspect, for a first memory page, when the determining unit determines that the first memory page satisfies the block merging condition, the block merging unit is specifically configured to:
determine whether the number of memory pages contained in a first memory block is less than a preset threshold;
if the number of memory pages contained in the first memory block is less than the preset threshold, merge the first memory page into the first memory block;
if the number of memory pages contained in the first memory block equals the preset threshold and the first memory page is a memory page accessed by the target process, merge the first memory page into a second memory block as the starting memory page of the second memory block;
where the first memory page is any memory page in the memory area of the first node between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process;
and the first memory block is the memory block containing the previous memory page contiguous with the first memory page.
In a fourth possible implementation of the second aspect, with reference to the third possible implementation of the second aspect, if the determining unit determines that the first memory page does not satisfy the block merging condition, or if the block merging unit determines that the number of memory pages contained in the first memory block equals the preset threshold and the first memory page is a memory page accessed by a non-target process:
correspondingly, the determining unit is further configured to determine whether a second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
and the block merging unit is further configured to, if the second memory page is a memory page accessed by the target process, merge the second memory page into a third memory block as the starting memory page of the third memory block.
As can be seen from the above, the embodiments of the present invention provide a memory migration method and device: a first node receives a migration instruction sent by a second node, and sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, where each memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process; the memory pages satisfying the block merging condition are merged into corresponding memory blocks, and the corresponding memory blocks are migrated to the memory area of the second node, where the memory pages satisfying the block merging condition include memory pages accessed by the target process and memory pages accessed by non-target processes. In this way, by merging the memory pages accessed by the target process, together with some of the memory pages accessed by non-target processes that satisfy the block merging condition, into corresponding memory blocks, discrete memory pages are merged together as much as possible without increasing the system performance cost, and migration proceeds in units of memory blocks, greatly reducing the number of memory migrations and improving CPU utilization.
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below.
FIG. 1 is a schematic diagram of a NUMA system architecture in the prior art;
FIG. 2 is a flowchart of a memory migration method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of merging memory pages in a memory area with contiguous physical addresses;
FIG. 4 is a schematic diagram of a memory migration device according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a memory migration device according to an embodiment of the present invention.
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The memory migration method provided by the embodiments of the present invention is applicable to the Non-Uniform Memory Access (NUMA) system architecture shown in FIG. 1, and is also applicable to memory migration in other communication scenarios (for example, in non-NUMA systems or in virtualization scenarios), which is not limited by the present invention; the present invention is described only using memory migration under the NUMA system architecture shown in FIG. 1 as an example.
In the NUMA system architecture, the entire memory area within each node is divided into multiple memory pages, and data is stored with the memory page as the smallest storage unit, where each memory page occupies about 4 KB of physical memory; that is, one memory page is identified by 4 KB of contiguous physical addresses. For example, a memory page may occupy the contiguous memory region at physical addresses 00001000 to 00001FFF.
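As a minimal illustration of this 4 KB page layout, the page containing a given physical address can be found by clearing the low 12 bits; the helper name is an assumption for illustration only:

```python
# Each memory page spans 4 KB (0x1000 bytes) of contiguous physical
# addresses, so the page boundary is found by clearing the low 12 bits.
PAGE_SIZE = 0x1000  # 4 KB

def page_range(addr):
    """Return the (start, end) physical addresses of the page holding addr."""
    start = addr & ~(PAGE_SIZE - 1)
    return start, start + PAGE_SIZE - 1

# The page at physical addresses 00001000-00001FFF from the example above:
print(tuple(hex(a) for a in page_range(0x00001A2C)))  # ('0x1000', '0x1fff')
```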
The CPU within each node can execute multiple processes, and each process can access memory pages in the memory area of the local node (i.e., the node on which the process runs) or of other remote nodes in the NUMA system architecture; moreover, the memory pages accessed by the same process within the same memory area may be non-contiguous. For example, FIG. 3 is a schematic diagram of merging memory pages in a memory area with contiguous physical addresses. As shown in FIG. 3, the memory region at addresses 00000000 to 00017FFF contains 24 memory pages accessed by the target process, the first process, and the second process, as well as free memory pages, where a free memory page is a memory page in the memory area that stores no data, i.e., an unused memory page. As can be seen from FIG. 3, in this memory region, 11 memory pages are accessed by the target process, and these 11 memory pages are not fully contiguous: they are separated by memory pages accessed by the first process, memory pages accessed by the second process, and free memory pages.
It is not hard to see that, as shown in FIG. 3, if the prior-art memory migration technique were used to migrate the memory pages accessed by the target process in units of memory pages, 11 migrations would be needed; that is, the number of migrations equals the number of memory pages to be migrated. When the number of memory pages to be migrated is large, this causes many memory migrations, which reduces the processing efficiency of the node's CPU and lowers system performance. To this end, an embodiment of the present invention provides a memory migration method, applied to the NUMA system architecture shown in FIG. 1. Referring to FIG. 2, the method may include:
201. A first node receives a migration instruction sent by a second node.
The migration instruction is used to instruct that all memory pages on the first node accessed by a target process be migrated from the memory area of the first node to the memory area of the second node; the target process is any process running on the second node.
The first node and the second node are any two different nodes in the NUMA architecture.
202. The first node sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process.
Each memory page is either a memory page accessed by the target process or a memory page accessed by a non-target process.
Preferably, the first node may obtain the physical addresses of the memory pages accessed by the target process according to the mapping relationship, stored in the system, between the virtual addresses and physical addresses of memory pages accessed by processes, and, starting from the physical address of the starting memory page accessed by the target process, sequentially scan each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, until the last memory page accessed by the target process is reached.
For example, as shown in FIG. 3, the starting memory page accessed by the target process is the memory page identified by the contiguous physical addresses 00000000 to 00000FFF, and the last memory page accessed by the target process is the memory page identified by the contiguous physical addresses 00017000 to 00017FFF. Starting from the memory page identified by the contiguous physical addresses 00000000 to 00000FFF, the first node sequentially scans each memory page in the memory region shown in FIG. 3, up to the memory page identified by the contiguous physical addresses 00017000 to 00017FFF.
203. The first node determines, for each memory page, whether it satisfies a block merging condition, and merges the memory pages that satisfy the block merging condition into corresponding memory blocks.
Preferably, when a scanned memory page is a memory page accessed by the target process, the first node directly determines that the memory page satisfies the block merging condition.
When a scanned memory page is a memory page accessed by a non-target process, the first node may determine whether the memory page satisfies the block merging condition according to the following three cases (1), (2), and (3):
(1) If the memory page is a free memory page, it is determined that the memory page satisfies the block merging condition.
(2) If the memory page is a memory page accessed by a first process, where the first process is a process other than the target process running on the second node, it is determined that the memory page satisfies the block merging condition.
(3) If the memory page is a memory page accessed by a second process, where the second process is a process running on a third node, and the third node is any node in the NUMA system architecture other than the first node and the second node,
then the first node queries a distance relationship table stored in the operating system, where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
determines whether the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node;
and, if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, determines that the memory page satisfies the block merging condition.
If the distance at which the third node accesses the memory area of the second node is greater than the distance at which the third node accesses the memory area of the first node, then migrating the memory page accessed by the second process from the first node to the second node would make that process's access time too long and degrade system performance, so it is determined that the memory page does not satisfy the block merging condition.
The distance relationship table in the NUMA system architecture is stored in the operating system; it contains the distances at which each node accesses the memory areas of the other nodes, is known to all nodes, and the distances in the table are fixed. For example, Table 1 is the distance relationship table stored in the NUMA system architecture shown in FIG. 1. As shown in Table 1, the distance at which the first node accesses the memory area of the third node is 21, and the distance at which the second node accesses the memory area of the third node is also 21. This means that the time for any process on the third node to access the memory area of the first node is the same as the time to access the memory area of the second node; in this case, migrating the memory pages on the first node that are accessed by a process on the third node to the second node does not affect that process's processing performance.
Table 1
204. The first node migrates the corresponding memory blocks to the memory area of the second node.
Further, to avoid each memory block containing so many memory pages that migrating the block becomes too burdensome, for a first memory page, where the first memory page is any memory page in the memory area of the first node between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, if it is determined that the first memory page satisfies the block merging condition, the first memory page may be merged into the corresponding memory block through the following specific steps (I) to (IV):
(I) Determine whether the number of memory pages contained in a first memory block is less than a preset threshold;
where the first memory block is the memory block containing the previous memory page contiguous with the first memory page.
(II) If the number of memory pages contained in the first memory block is less than the preset threshold, merge the first memory page into the first memory block;
if the number of memory pages contained in the first memory block equals the preset threshold, perform step (III) or (IV) according to the specific situation of the first memory page.
(III) If the first memory page is a memory page accessed by the target process, stop merging pages into the first memory block, and merge the first memory page into a second memory block as the starting memory page of the second memory block;
if the first memory page is a memory page accessed by a non-target process that satisfies the block merging condition, perform step (IV).
(IV) Determine whether a second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
if the second memory page is a memory page accessed by the target process, merge the second memory page into a third memory block as the starting memory page of the third memory block;
if the second memory page is not a memory page accessed by the target process, continue scanning memory pages in sequence, and when a scanned memory page is a memory page accessed by the target process, merge that memory page into the third memory block as its starting memory page.
The preset threshold is set in advance according to requirements and is not limited in the embodiment of the present invention; preferably, it may be any value from 5 to 10.
Further, if, after the first node scans the first memory page, it is determined that the first memory page does not satisfy the block merging condition, the merging of pages into the first memory block ends; at the same time, the method may further include:
determining whether the second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
if the second memory page is a memory page accessed by the target process, merging the second memory page into the third memory block as the starting memory page of the third memory block;
if the second memory page is not a memory page accessed by the target process, continuing to scan memory pages in sequence, and when a scanned memory page is a memory page accessed by the target process, merging that memory page into the third memory block as its starting memory page.
It should be noted that, in the embodiment of the present invention, the starting memory page accessed by the target process is directly merged into the first memory block as the starting page of the first memory block.
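The scan-and-merge procedure described above can be sketched as a single pass over the scanned page range. The representation of each page as a pair of flags, and all names, are illustrative assumptions; the patent does not prescribe an implementation:

```python
# Hypothetical sketch of the scan-and-merge procedure. Each page is given as
# (is_target, satisfies_condition): whether the target process accesses it,
# and whether it satisfies the block merging condition.
THRESHOLD = 10  # preset threshold (preferably any value from 5 to 10)

def build_blocks(pages):
    """pages: ordered list of (is_target, ok) tuples for the pages between
    the target process's starting and last page. Returns the memory blocks,
    each a list of page indices."""
    blocks, current = [], None
    for i, (is_target, ok) in enumerate(pages):
        if current is None:
            # A new block may only start with a page accessed by the target process.
            if is_target:
                current = [i]
            continue
        if not ok:
            blocks.append(current)   # condition failed: close the current block
            current = None
        elif len(current) < THRESHOLD:
            current.append(i)        # merge the page into the current block
        elif is_target:
            blocks.append(current)   # block full: target page starts the next block
            current = [i]
        else:
            blocks.append(current)   # block full + non-target page: close and wait
            current = None
    if current:
        blocks.append(current)
    return blocks
```

Applied to a 24-page layout like the FIG. 3 example, this sketch yields two blocks (pages 1 to 9 and pages 17 to 24), matching the two-migration outcome described in the embodiment.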
The above method is specifically described below by merging the memory pages in the memory area of the first node between the starting memory page and the last memory page accessed by the target process, as shown in FIG. 3, where the number of memory pages each memory block is allowed to contain is 10, the first process runs on the third node, and the second process runs on a fourth node:
Scanning starts at physical addresses 00000000 to 00000FFF with the starting memory page accessed by the target process, and this page is merged into memory block 1;
the second memory page, at physical addresses 00001000 to 00001FFF, is scanned in sequence; it is determined to be a memory page accessed by the target process, and memory block 1 contains fewer than 10 memory pages, so the second memory page is merged into memory block 1;
the third memory page, at physical addresses 00002000 to 00002FFF (a memory page accessed by the first process), is scanned in sequence; according to the stored node distance relationship table (see Table 1), it is determined that the third memory page satisfies the block merging condition, and memory block 1 contains fewer than 10 memory pages, so the third memory page is merged into memory block 1;
the fourth memory page, at physical addresses 00003000 to 00003FFF (a memory page accessed by the first process), is scanned in sequence; according to the stored node distance relationship table (see Table 1), it is determined that the fourth memory page satisfies the block merging condition, and memory block 1 contains fewer than 10 memory pages, so the fourth memory page is merged into memory block 1;
the fifth memory page, at physical addresses 00004000 to 00004FFF, is scanned in sequence; it is a free page, so it is determined that the fifth memory page satisfies the block merging condition, and since memory block 1 contains fewer than 10 memory pages, the fifth memory page is merged into memory block 1;
the sixth memory page, at physical addresses 00005000 to 00005FFF, is scanned in sequence; it is determined to be a memory page accessed by the target process, and memory block 1 contains fewer than 10 memory pages, so the sixth memory page is merged into memory block 1;
likewise, the seventh memory page at physical addresses 00006000 to 00006FFF and the eighth memory page at physical addresses 00007000 to 00007FFF are scanned in sequence and merged into memory block 1;
the ninth memory page, at physical addresses 00008000 to 00008FFF, is scanned in sequence; it is a free memory page, so it is determined that the ninth memory page satisfies the block merging condition, and since memory block 1 contains fewer than 10 memory pages, the ninth memory page is merged into memory block 1;
the tenth memory page, at physical addresses 00009000 to 00009FFF (a memory page accessed by the second process), is scanned in sequence; according to the stored node distance relationship table (see Table 1), it is determined that the tenth memory page does not satisfy the block merging condition, so the tenth memory page is not merged into memory block 1, and the merging of memory block 1 ends;
the eleventh memory page at physical addresses 0000A000 to 0000AFFF, the twelfth at 0000B000 to 0000BFFF, the thirteenth at 0000C000 to 0000CFFF, the fourteenth at 0000D000 to 0000DFFF, the fifteenth at 0000E000 to 0000EFFF, and the sixteenth at 0000F000 to 0000FFFF are scanned in sequence; each is determined not to be a memory page accessed by the target process, and therefore cannot serve as the starting memory page of memory block 2;
the seventeenth memory page, at physical addresses 00010000 to 00010FFF, is scanned in sequence; it is determined to be a memory page accessed by the target process, so the seventeenth memory page is taken as the starting memory page of memory block 2, and the merging of pages into memory block 2 begins;
the eighteenth memory page, at physical addresses 00011000 to 00011FFF, is scanned in sequence; it is determined to be a memory page accessed by the target process, and memory block 2 contains fewer than 10 memory pages, so the eighteenth memory page is merged into memory block 2;
likewise, the nineteenth memory page at physical addresses 00012000 to 00012FFF and the twentieth memory page at physical addresses 00013000 to 00013FFF are merged into memory block 2;
the twenty-first memory page, at physical addresses 00014000 to 00014FFF, is a free memory page, and memory block 2 contains fewer than 10 memory pages, so the twenty-first memory page is merged into memory block 2;
the twenty-second memory page, at physical addresses 00015000 to 00015FFF, is a free memory page, and memory block 2 contains fewer than 10 memory pages, so the twenty-second memory page is merged into memory block 2;
the twenty-third memory page, at physical addresses 00016000 to 00016FFF, is a memory page accessed by the target process, and memory block 2 contains fewer than 10 memory pages, so the twenty-third memory page is merged into memory block 2;
the twenty-fourth memory page, at physical addresses 00017000 to 00017FFF, is the last memory page accessed by the target process, and memory block 2 contains fewer than 10 memory pages, so the twenty-fourth memory page is merged into memory block 2, and the page-merging process ends.
As can be seen from FIG. 3, after the above page-merging process, the discrete memory pages accessed by the target process are merged into two memory blocks, memory block 1 and memory block 2, so that only two migrations are needed during memory migration. Compared with the 11 page-by-page migrations of the prior art, this greatly reduces the number of migrations and thus improves CPU utilization.
As can be seen from the above, the embodiment of the present invention provides a memory migration method: a first node receives a migration instruction sent by a second node, and sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the last memory page accessed by the target process, where the memory pages include memory pages accessed by the target process and memory pages accessed by non-target processes; the memory pages satisfying the block merging condition are merged into corresponding memory blocks, and the corresponding memory blocks are migrated to the memory area of the second node. In this way, by merging the memory pages accessed by the target process, together with the memory pages accessed by non-target processes that satisfy the block merging condition, into memory blocks, discrete memory pages are merged together as much as possible without increasing the system performance cost, and migration proceeds in units of memory blocks, greatly reducing the number of memory migrations and improving CPU utilization.
此外,本发明实施例还提供一种内存迁移设备40,其中,所述内存迁移设备40可以为图1所示的NUMA系统架构中的任一节点,如图4所示,该设备可以包括:
接收单元401,用于接收第二节点发送的迁移指令,其中,所述迁移指令用于指示将目标进程访问的处于所述第一节点的所有内存页,从所述第一节点的内存区域迁移至所述第二节点的内存区域;所述目标进程为在第二节点上运行的进程。
其中,所述第二节点可以为如图1所示的NUMA系统架构中除第一节点之外的任一节点。
扫描单元402,用于在所述接收单元401接收到迁移指令时,依次扫描所述目标进程访问的起始内存页的物理地址与所述目标进程访问的末尾内存页的物理地址之间的每个内存页。
其中,所述内存页为目标进程访问的内存页,或者,非目标进程访问的内存页。
判断单元403,用于分别判断每个内存页是否满足块合并条件。
块合并单元404,用于将判断单元403确定的满足所述块合并条件的内存页合并到相应的内存块。
发送单元405,用于将所述相应的内存块迁移至所述第二节点的内存区域。
进一步的,扫描单元402,具体用于:根据系统内部存储的进程访问的内存页的虚拟地址与物理地址之间的映射关系,获知目标进程访问的内存页的物理地址,从目标进程访问的起始内存页的物理地址开始,顺序扫描目标进程访问的起始内存页的物理地址与所述目标进程访问的末尾内存页的物理地址之间的每个内存页,直至目标进程访问的末尾内存页。
进一步的,当扫描单元402扫描到的内存页为目标进程访问的内存页时,判断单元403直接确定该内存页满足块合并条件;
当扫描单元402扫描到的内存页不为目标进程访问的内存页时,判断单元403,具体用于根据下述(1)(2)(3)三种情况,判断内存页是否满足块合并条件:
(1)若内存页是空闲内存页,则确定内存页满足块合并条件。
(2)若内存页为第一进程访问的内存页,其中,所述第一进程为第二节点上运行的除所述目标进程之外的进程,则确定内存页满足块合并条件。
(3)若内存页为第二进程访问的内存页,其中,所述第二进程为第三节点上运行的进程,第三节点为NUMA系统架构中,除第一节点和第二节点之外的其他任一节点,
则查询操作系统内部存储的距离关系表,其中,所述距离关系表记录了所述第三节点访问所述第一节点的内存区域的距离,以及,所述第三节点访问所述第二节点的内存区域的距离;
判断所述第三节点访问所述第二节点的内存区域的距离是否小于等于所述第三节点访问所述第一节点的内存区域的距离;
若所述第三节点访问所述第二节点的内存区域的距离小于等于
所述第三节点访问所述第一节点的内存区域的距离,则确定所述内存页满足块合并条件。
若所述第三节点访问所述第二节点的内存区域的距离大于所述第三节点访问所述第一节点的内存区域的距离,则表示将第三节点上第一进程访问的内存页从第一节点迁移到第二节点后,会造成第一进程访问的时间过长,系统性能降低,所以,确定所述内存页不满足块合并条件。
其中,NUMA系统架构中的距离关系表存储在操作系统中,其中,距离关系表包含全局各节点访问节点的内存区域的距离,被全局节点可知,且该距离关系表中的距离是固定不变的。例如,表1为图1所示NUMA系统架构中存储的距离关系表,如表1所示,第一节点访问第三节点的内存区域的距离为21,第二节点访问第三节点的内存区域的距离也为21,这表示处于第三节点上的任一进程访问第一节点的内存区域的时间与访问第二节点中的内存区域的时间是相同的,此时,若将第三节点上的第一进程访问的,处于第一节点上的内存页迁移至第二节点内,则不会影响到第一进程的处理性能。
进一步的,为了避免每个内存块中包含的内存页的数量过多,导致迁移内存块时负担过重,对于第一内存页,其中,所述第一内存页为:所述第一节点的内存区域中,所述目标进程访问的起始内存页的物理地址与所述目标进程访问的末尾内存页的物理地址之间的任一内存页,当判断单元403确定第一内存页满足块合并条件时,块合并单元404,具体用于:
确定第一内存块包含的内存页的个数是否小于预设阈值;其中,所述第一内存块为:与所述第一内存页相连续的上一内存页所在的内存块;
若所述第一内存块包含的内存页的个数小于预设阈值,则将所述第一内存页合并到所述第一内存块;
若所述第一内存块包含的内存页的个数等于预设阈值,且第一内存页为所述目标进程访问的内存页,则结束将内存页合并到第一内存块,将所述第一内存页作为第二内存块的起始内存页,合并到所述第二内存页;
若所述第一内存页为满足块合并条件的非目标进程访问的内存页,则判断第二内存页是否是所述目标进程访问的内存页;其中,所述第二内存页为与所述第一内存页相连续的下一内存页;
若所述第二内存页为所述目标进程访问的内存页,则将所述第二内存页作为第三内存块的起始内存页,合并到所述第三内存页;
若所述第二内存页不为所述目标进程访问的内存页,则继续依次扫描内存页,当扫描到的内存页为所述目标进程访问的内存页时,将该内存页作为第三内存块的起始内存页,合并到所述第三内存页。
其中,预设阈值是根据需要进行预先设置的,本发明实施例对比不进行限定,优选的,可以为5~10中的任一数值。
进一步的,若在扫描第一内存页后,判断单元403确定第一内存页不满足块合并条件,则块合并单元404结束将第一内存页合并到第一内存块,同时,所述判断单元403,还用于判断第二内存页是否是所述目标进程访问的内存页;其中,所述第二内存页为与所述第一内存页相连续的下一内存页;
所述块合并单元404,还用于若所述第二内存页为所述目标进程访问的内存页,则将所述第二内存页作为第三内存块的起始内存页,合并到所述第三内存页;
若所述第二内存页不为所述目标进程访问的内存页,则继续依次扫描内存页,当扫描到的内存页为所述目标进程访问的内存页时,将该内存页作为第三内存块的起始内存页,合并到所述第三内存页。
It should be noted that, in this embodiment of the present invention, the starting memory page accessed by the target process is used as the starting page of the first memory block and is merged into the first memory block directly.
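Putting the threshold rule and the skip-to-next-target-page rule together, the block-building walk can be sketched as below. The flag-tuple input, the `build_blocks` helper, and the threshold value are illustrative assumptions used to make the control flow concrete, not the actual implementation.

```python
def build_blocks(pages, threshold):
    """pages: list of (is_target, mergeable) flags per scanned page index.
    Returns lists of page indices, one list per merged block."""
    blocks, current = [], []
    skipping = False          # waiting for the next target-process page
    for i, (is_target, mergeable) in enumerate(pages):
        if skipping:
            if not is_target:
                continue               # discard pages until a target page
            skipping, current = False, []  # target page opens a new block
        if not mergeable:              # condition failed: close the block
            if current:
                blocks.append(current)
            current, skipping = [], True
            continue
        if len(current) == threshold:  # block is full
            blocks.append(current)
            if is_target:
                current = [i]          # target page starts the next block
            else:
                current, skipping = [], True  # skip ahead to a target page
            continue
        current.append(i)
    if current:
        blocks.append(current)
    return blocks

# Hypothetical scan of five pages with threshold 2: block [0, 1] fills up,
# target page 2 opens the next block, page 3 fails the merge condition,
# and target page 4 starts a third block, yielding [[0, 1], [2], [4]].
pages = [(True, True), (False, True), (True, True), (False, False), (True, True)]
blocks = build_blocks(pages, 2)
```

The key design point mirrored here is that a block never ends on a non-target page: after a full block or a failed condition, merging resumes only at the next page the target process actually accesses.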
As can be seen from the above, this embodiment of the present invention provides a memory migration device 40, which receives a migration instruction sent by the second node; sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the ending memory page accessed by the target process, where the memory pages include memory pages accessed by the target process and memory pages not accessed by the target process; merges the memory pages that satisfy the block merge condition into corresponding memory blocks; and migrates the corresponding memory blocks to the memory area of the second node. In this way, by merging the memory pages accessed by the target process together with the merge-condition-satisfying memory pages not accessed by the target process into memory blocks, as many discrete memory pages as possible are merged together without additional system performance cost, and migration is performed in units of memory blocks, which greatly reduces the number of memory migrations and improves CPU utilization.
In addition, an embodiment of the present invention further provides a memory migration device 50, where the memory migration device 50 may be any node in the NUMA system architecture shown in FIG. 1. As shown in FIG. 5, the device may include: a communication unit 501, a processor 502, a memory 503, and at least one communication bus 504.
The communication unit 501 may be configured to transmit data with external devices or with other nodes inside the NUMA system through the communication bus 504.
The processor 502 may be a central processing unit (CPU) inside the node.
The memory 503 may be a volatile memory, for example a random-access memory (RAM); or a non-volatile memory, for example a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or a combination of the above kinds of memory, and provides instructions and data to the processor 502.
The memory 503 includes a memory area 5031 configured to provide the instructions and data needed by the processes running on the processor 502.
The communication unit 501 is configured to receive a migration instruction sent by the second node, where the migration instruction instructs that all memory pages, located on the first node, accessed by the target process be migrated from the memory area of the first node to the memory area of the second node; the target process is a process running on the second node.
The second node may be any node, other than the first node, in the NUMA system architecture shown in FIG. 1.
The processor 502 is configured to: when the communication unit 501 receives the migration instruction, sequentially scan each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the ending memory page accessed by the target process, where the memory pages are memory pages accessed by the target process or memory pages not accessed by the target process;
and judge whether each memory page satisfies a block merge condition, and merge the memory pages that satisfy the block merge condition into corresponding memory blocks.
The communication unit 501 is further configured to migrate the corresponding memory blocks obtained by the processor 502 to the memory area of the second node.
Further, the processor 502 is specifically configured to: learn the physical addresses of the memory pages accessed by the target process according to the mapping, stored inside the system, between the virtual addresses and the physical addresses of the memory pages accessed by a process; and, starting from the physical address of the starting memory page accessed by the target process, sequentially scan each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the ending memory page accessed by the target process, up to the ending memory page accessed by the target process.
Further, when a memory page scanned by the processor 502 is a memory page accessed by the target process, the processor 502 directly determines that the memory page satisfies the block merge condition.
When a memory page scanned by the processor 502 is not a memory page accessed by the target process, the processor 502 is specifically configured to determine whether the memory page satisfies the block merge condition according to the following three cases (1), (2) and (3):
(1) If the memory page is a free memory page, determine that the memory page satisfies the block merge condition.
(2) If the memory page is a memory page accessed by a first process, where the first process is a process, other than the target process, running on the second node, determine that the memory page satisfies the block merge condition.
(3) If the memory page is a memory page accessed by a second process, where the second process is a process running on a third node, and the third node is any node in the NUMA system architecture other than the first node and the second node,
query the distance relationship table stored inside the operating system, where the distance relationship table records the distance at which the third node accesses the memory area of the first node and the distance at which the third node accesses the memory area of the second node;
judge whether the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node;
if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, determine that the memory page satisfies the block merge condition.
If the distance at which the third node accesses the memory area of the second node is greater than the distance at which the third node accesses the memory area of the first node, this indicates that migrating the memory page accessed by the second process on the third node from the first node to the second node would make the second process's memory accesses take too long and degrade system performance; therefore, determine that the memory page does not satisfy the block merge condition.
The distance relationship table of the NUMA system architecture is stored in the operating system; it contains the distances at which every node in the system accesses the memory areas of the other nodes, is known to all nodes, and the distances in it are fixed. For example, Table 1 is the distance relationship table stored for the NUMA system architecture shown in FIG. 1. As shown in Table 1, the distance at which the first node accesses the memory area of the third node is 21, and the distance at which the second node accesses the memory area of the third node is also 21. This means that any process on the third node takes the same time to access the memory area of the first node as to access the memory area of the second node; in this case, migrating a memory page, located on the first node, that is accessed by the second process on the third node to the second node does not affect the processing performance of the second process.
Further, to prevent a memory block from containing so many memory pages that migrating it becomes too burdensome, for a first memory page, where the first memory page is any memory page, in the memory area of the first node, between the physical address of the starting memory page accessed by the target process and the physical address of the ending memory page accessed by the target process, after determining that the first memory page satisfies the block merge condition, the processor 502 is specifically configured to:
determine whether the number of memory pages contained in a first memory block is less than a preset threshold, where the first memory block is the memory block that contains the previous memory page contiguous with the first memory page;
if the number of memory pages contained in the first memory block is less than the preset threshold, merge the first memory page into the first memory block;
if the number of memory pages contained in the first memory block is equal to the preset threshold and the first memory page is a memory page accessed by the target process, stop merging memory pages into the first memory block, and merge the first memory page into a second memory block as the starting memory page of the second memory block;
if the first memory page is a memory page that satisfies the block merge condition but is accessed by a non-target process, judge whether a second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page;
if the second memory page is a memory page accessed by the target process, merge the second memory page into a third memory block as the starting memory page of the third memory block;
if the second memory page is not a memory page accessed by the target process, continue scanning memory pages in sequence, and when a scanned memory page is a memory page accessed by the target process, merge that memory page into the third memory block as its starting memory page.
The preset threshold is set in advance as required and is not limited in this embodiment of the present invention; preferably, it may be any value from 5 to 10.
Further, if, after the first node scans the first memory page, the processor 502 determines that the first memory page does not satisfy the block merge condition, the processor 502 does not merge the first memory page into the first memory block; meanwhile, the processor 502 is further configured to judge whether the second memory page is a memory page accessed by the target process, where the second memory page is the next memory page contiguous with the first memory page:
if the second memory page is a memory page accessed by the target process, merge the second memory page into a third memory block as the starting memory page of the third memory block;
if the second memory page is not a memory page accessed by the target process, continue scanning memory pages in sequence, and when a scanned memory page is a memory page accessed by the target process, merge that memory page into the third memory block as its starting memory page.
It should be noted that, in this embodiment of the present invention, the starting memory page accessed by the target process is used as the starting page of the first memory block and is merged into the first memory block directly.
As can be seen from the above, this embodiment of the present invention provides a memory migration device 50, which receives a migration instruction sent by the second node; sequentially scans each memory page between the physical address of the starting memory page accessed by the target process and the physical address of the ending memory page accessed by the target process, where the memory pages include memory pages accessed by the target process and memory pages not accessed by the target process; merges the memory pages that satisfy the block merge condition into corresponding memory blocks; and migrates the corresponding memory blocks to the memory area of the second node. In this way, by merging the memory pages accessed by the target process together with the merge-condition-satisfying memory pages not accessed by the target process into memory blocks, as many discrete memory pages as possible are merged together without additional system performance cost, and migration is performed in units of memory blocks, which greatly reduces the number of memory migrations and improves CPU utilization.
In the several embodiments provided in this application, it should be understood that the disclosed system, device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is merely a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may physically exist separately, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
Finally, it should be noted that the above embodiments are merely intended to describe the technical solutions of the present invention rather than to limit them. Although the present invention is described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements to some of the technical features thereof; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
- A memory migration method, comprising: receiving, by a first node, a migration instruction sent by a second node, wherein the migration instruction instructs that all memory pages, located on the first node, accessed by a target process be migrated from a memory area of the first node to a memory area of the second node, and the target process is a process running on the second node; sequentially scanning, by the first node, each memory page between a physical address of a starting memory page accessed by the target process and a physical address of an ending memory page accessed by the target process, wherein the memory pages are memory pages accessed by the target process or memory pages not accessed by the target process; judging, by the first node, whether each memory page satisfies a block merge condition, and merging the memory pages that satisfy the block merge condition into corresponding memory blocks; and migrating, by the first node, the corresponding memory blocks to the memory area of the second node.
- The method according to claim 1, wherein the memory pages that satisfy the block merge condition comprise: a memory page accessed by the target process; or a free memory page; or a memory page accessed by a first process, wherein the first process is a process, other than the target process, running on the second node.
- The method according to claim 1 or 2, wherein, when a memory page is a memory page accessed by a second process, the second process being a process running on a third node, the judging whether the memory page satisfies the block merge condition comprises: querying a distance relationship table stored inside the operating system, wherein the distance relationship table records a distance at which the third node accesses the memory area of the first node and a distance at which the third node accesses the memory area of the second node; judging whether the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node; and if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, determining that the memory page satisfies the block merge condition.
- The method according to any one of claims 1 to 3, wherein, for a first memory page, when it is determined that the first memory page satisfies the block merge condition, the merging the first memory page into a corresponding memory block comprises: determining whether the number of memory pages contained in a first memory block is less than a preset threshold; if the number of memory pages contained in the first memory block is less than the preset threshold, merging the first memory page into the first memory block; and if the number of memory pages contained in the first memory block is equal to the preset threshold and the first memory page is a memory page accessed by the target process, merging the first memory page into a second memory block as a starting memory page of the second memory block; wherein the first memory page is any memory page, in the memory area of the first node, between the physical address of the starting memory page accessed by the target process and the physical address of the ending memory page accessed by the target process, and the first memory block is the memory block that contains the previous memory page contiguous with the first memory page.
- The method according to claim 4, wherein, if it is determined that the first memory page does not satisfy the block merge condition, or if the number of memory pages contained in the first memory block is equal to the preset threshold and the first memory page is a memory page not accessed by the target process, the method further comprises: judging whether a second memory page is a memory page accessed by the target process, wherein the second memory page is the next memory page contiguous with the first memory page; and if the second memory page is a memory page accessed by the target process, merging the second memory page into a third memory block as a starting memory page of the third memory block.
- A memory migration device, comprising: a receiving unit, configured to receive a migration instruction sent by a second node, wherein the migration instruction instructs that all memory pages, located on the first node, accessed by a target process be migrated from a memory area of the first node to a memory area of the second node, and the target process is a process running on the second node; a scanning unit, configured to sequentially scan, according to the migration instruction received by the receiving unit, each memory page between a physical address of a starting memory page accessed by the target process and a physical address of an ending memory page accessed by the target process, wherein the memory pages are memory pages accessed by the target process or memory pages not accessed by the target process; a judging unit, configured to judge whether each memory page satisfies a block merge condition; a block merging unit, configured to merge the memory pages determined by the judging unit to satisfy the block merge condition into corresponding memory blocks; and a sending unit, configured to migrate the corresponding memory blocks to the memory area of the second node.
- The memory migration device according to claim 6, wherein the memory pages that satisfy the block merge condition comprise: a memory page accessed by the target process; or a free memory page; or a memory page accessed by a first process, wherein the first process is a process, other than the target process, running on the second node.
- The memory migration device according to claim 6 or 7, wherein, when a memory page is a memory page accessed by a second process, the second process being a process running on a third node, the judging unit is specifically configured to: query a distance relationship table stored inside the operating system, wherein the distance relationship table records a distance at which the third node accesses the memory area of the first node and a distance at which the third node accesses the memory area of the second node; judge whether the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node; and if the distance at which the third node accesses the memory area of the second node is less than or equal to the distance at which the third node accesses the memory area of the first node, determine that the memory page satisfies the block merge condition.
- The memory migration device according to any one of claims 6 to 8, wherein, for a first memory page, when the judging unit determines that the first memory page satisfies the block merge condition, the block merging unit is specifically configured to: determine whether the number of memory pages contained in a first memory block is less than a preset threshold; if the number of memory pages contained in the first memory block is less than the preset threshold, merge the first memory page into the first memory block; and if the number of memory pages contained in the first memory block is equal to the preset threshold and the first memory page is a memory page accessed by the target process, merge the first memory page into a second memory block as a starting memory page of the second memory block; wherein the first memory page is any memory page, in the memory area of the first node, between the physical address of the starting memory page accessed by the target process and the physical address of the ending memory page accessed by the target process, and the first memory block is the memory block that contains the previous memory page contiguous with the first memory page.
- The memory migration device according to claim 9, wherein, if the judging unit determines that the first memory page does not satisfy the block merge condition, or if the block merging unit determines that the number of memory pages contained in the first memory block is equal to the preset threshold and the first memory page is a memory page not accessed by the target process: the judging unit is further configured to judge whether a second memory page is a memory page accessed by the target process, wherein the second memory page is the next memory page contiguous with the first memory page; and the block merging unit is further configured to, if the second memory page is a memory page accessed by the target process, merge the second memory page into a third memory block as a starting memory page of the third memory block.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP15840667.8A EP3131015B1 (en) | 2014-09-12 | 2015-06-01 | Memory migration method and device |
| US15/357,240 US10013205B2 (en) | 2014-09-12 | 2016-11-21 | Memory migration method and device |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410464534.X | 2014-09-12 | ||
| CN201410464534.XA CN105468538B (zh) | 2014-09-12 | 2014-09-12 | 一种内存迁移方法及设备 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/357,240 Continuation US10013205B2 (en) | 2014-09-12 | 2016-11-21 | Memory migration method and device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2016037499A1 true WO2016037499A1 (zh) | 2016-03-17 |
Family
ID=55458331
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2015/080491 Ceased WO2016037499A1 (zh) | 2014-09-12 | 2015-06-01 | 一种内存迁移方法及设备 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US10013205B2 (zh) |
| EP (1) | EP3131015B1 (zh) |
| CN (1) | CN105468538B (zh) |
| WO (1) | WO2016037499A1 (zh) |
- 2014-09-12: CN CN201410464534.XA → CN105468538B (Active)
- 2015-06-01: EP EP15840667.8A → EP3131015B1 (Active)
- 2015-06-01: WO PCT/CN2015/080491 → WO2016037499A1 (Ceased)
- 2016-11-21: US US15/357,240 → US10013205B2 (Active)
Also Published As
| Publication number | Publication date |
|---|---|
| CN105468538B (zh) | 2018-11-06 |
| EP3131015B1 (en) | 2018-08-15 |
| EP3131015A1 (en) | 2017-02-15 |
| US10013205B2 (en) | 2018-07-03 |
| US20170068486A1 (en) | 2017-03-09 |
| CN105468538A (zh) | 2016-04-06 |
| EP3131015A4 (en) | 2017-06-21 |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15840667; Country of ref document: EP; Kind code of ref document: A1 |
| | REEP | Request for entry into the european phase | Ref document number: 2015840667; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 2015840667; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |