
WO2019000358A1 - Techniques for live migration support for graphics processing unit virtualization - Google Patents


Info

Publication number
WO2019000358A1
WO2019000358A1 (PCT/CN2017/090990)
Authority
WO
WIPO (PCT)
Prior art keywords
memory
vgpu
memory addresses
ggtt
ppgtt
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/090990
Other languages
French (fr)
Inventor
Xiao Huang
Kun TIAN
Yaozu Dong
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp filed Critical Intel Corp
Priority to PCT/CN2017/090990 priority Critical patent/WO2019000358A1/en
Publication of WO2019000358A1 publication Critical patent/WO2019000358A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/485 Task life-cycle, e.g. stopping, restarting, resuming execution
    • G06F 9/4856 Task life-cycle, resumption being on a different machine, e.g. task migration, virtual machine migration
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G06F 2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G06F 2009/45579 I/O management, e.g. providing access to device drivers or storage
    • G06F 2009/45583 Memory management, e.g. access or allocation

Definitions

  • Examples described herein are generally related to live migration support of a virtual graphics processing unit (vGPU) between nodes, servers or computing platforms in a network.
  • graphics hardware such as a graphics processing unit (GPU) may be virtualized in a virtual machine (VM) as a virtual GPU (vGPU) .
  • a vGPU driver in coordination with a VM manager (VMM) managing the VM may trap and emulate access to a GPU by a guest operating system (OS) being executed by the VM for purposes of security and multiplexing.
  • the VMM may allow for pass through central processing unit (CPU) access of memory allocated to the vGPU of the VM for graphics processing.
  • GPU commands, once submitted, may be directly executed in graphics hardware (e.g., in a GPU) without VMM intervention.
  • Live migration of VMs hosted by compute nodes, sleds, servers or computing platforms may be an important feature for a system such as a datacenter to enable fault-tolerant capabilities, flexible resource management or dynamic workload rebalancing.
  • Live migration may include migrating the VM and an instance of a vGPU from a source server to a destination server.
  • the migration of the VM and the instance of the vGPU may be over a network connection between the source and destination servers.
  • the migration may be considered a “live” migration if one or more applications being executed by the migrated VM or vGPU continue to be executed by the VM or vGPU during a large portion of a migration between source and destination servers. Execution of the one or more applications may only be briefly halted just prior to copying remaining state information from the source server to the destination server to enable the VM to resume execution of the application at the destination server.
  • FIG. 1 illustrates an example system
  • FIG. 2 illustrates an example portion of the example system.
  • FIG. 3 illustrates an example scheme
  • FIG. 4 illustrates an example dirty page log.
  • FIG. 5 illustrates an example dirty page table
  • FIG. 6 illustrates an example block diagram for a first apparatus.
  • FIG. 7 illustrates an example of a logic flow.
  • FIG. 8 illustrates an example of a storage medium
  • FIG. 9 illustrates an example computing platform.
  • live migration of a VM and an instance of a vGPU from a source server to a destination server may be considered as live if an application being executed by the VM continues to be executed during most of the live migration.
  • VM hypervisors or VMMs separately supported by respective source/destination servers may be arranged to coordinate live or offline migration of VMs between servers.
  • a large portion of a live migration of a VM and an instance of a vGPU may be state information that includes memory used by the VM and a supporting vGPU while executing one or more applications or assisting in graphics processing for the one or more applications.
  • Live migration of state information for a VM involves a two-phase process.
  • the first phase may be a pre-memory copy phase that includes copying initial memory (e.g., for a 1 st iteration) and changing memory (e.g., dirty pages) for remaining iterations from the source node to the destination node while the VM is still executing one or more applications or the VM is still running on the source server.
  • the first or pre-memory phase may continue until remaining dirty pages at the source server fall below a threshold.
  • the second phase may then be a stop-and-copy phase that stops or halts the VM at the source server and copies remaining state information (e.g., remaining dirty pages and/or processor/GPU states, input/output state, etc. ) to the destination server.
  • a third or final phase includes resuming the VM at the destination server.
  • the copying of VM state information for the first two phases may be through a network connection maintained between the source and destination node.
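The phases above can be sketched as a short loop. This is a toy illustration only: `ToyVM`, `THRESHOLD`, and the `workload` callback are hypothetical stand-ins for the VM state, dirty-page tracking, and guest activity described in the text, not structures from the patent.

```python
# Toy sketch of the two-phase live-migration flow described above.
# ToyVM, THRESHOLD and the workload callback are illustrative assumptions.

THRESHOLD = 2  # stop pre-copy once remaining dirty pages fall to this level

class ToyVM:
    def __init__(self, pages):
        self.pages = dict(pages)   # page address -> page content
        self.dirty = set(pages)    # 1st iteration copies every page

    def touch(self, addr, content):
        self.pages[addr] = content
        self.dirty.add(addr)       # page modified while the VM keeps running

def live_migrate(vm, dest, workload):
    # Phase 1: pre-memory copy -- the VM keeps running (workload mutates
    # pages) while dirty pages are copied iteratively to the destination.
    while len(vm.dirty) > THRESHOLD:
        batch, vm.dirty = vm.dirty, set()
        for addr in batch:
            dest[addr] = vm.pages[addr]
        workload(vm)               # the VM dirties more pages during the copy
    # Phase 2: stop-and-copy -- the VM is briefly halted and the remaining
    # dirty pages (plus device state, omitted here) are copied over.
    for addr in vm.dirty:
        dest[addr] = vm.pages[addr]
    vm.dirty.clear()
    return dest                    # the VM would now resume at the destination
```

With a workload that rewrites one page, the loop converges after a single pre-copy iteration, and destination memory ends up identical to source memory.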
  • Live migration for an instance of a vGPU may follow the same two-phase process as mentioned above for a VM.
  • memory pages for memory allocated to the vGPU for use by a GPU may be difficult to track during a live migration.
  • While CPU memory page dirty tracking may be possible through techniques such as dirty bit scanning or page faults, these techniques may not be available for GPU-related memory pages.
  • Without a system or technique to track dirty page generation for memory allocated to the vGPU, a large number of these memory pages may be copied during a live migration even when not dirty. The large number of unnecessarily copied memory pages may prolong the stop-and-copy phase of the live migration.
  • the amount of time spent in the second stop-and-copy phase is important as the one or more applications being executed by the VM may be briefly halted for this period of time. Thus, any services being provided while executing the application may be temporarily unresponsive. It is with respect to these challenges that the examples described herein are needed.
  • FIG. 1 illustrates an example system 100.
  • system 100 includes a source server 110 coupled through a network 150 with a destination server 120.
  • system 100 may be part of a datacenter and network 150 may represent elements of an internal network that communicatively couples a plurality of servers included in the datacenter such as communicatively coupling source server 110 with various other servers or nodes that includes destination server 120.
  • These interconnected servers or nodes may provide network services to one or more clients or customers as part of a software-defined network (SDN) or part of a European Telecommunications Standards Institute (ETSI) network function virtualization (NFV) infrastructure.
  • source server 110 and destination server 120 may represent at least a portion of composed computing resources arranged to support VMs separately executing one or more applications as part of providing network services to clients or customers.
  • VM 130-1 to 130-n (where “n” represents any whole, positive integer greater than 2) and VM (s) 180-1 may be supported by composed computing resources associated with respective source server 110 and destination server 120.
  • VMs 130-1 to 130-n at source server 110 may be managed or controlled by a VMM or hypervisor such as VMM 112.
  • VM (s) 180-1 at destination server 120 may be managed or controlled by VMM 122.
  • At least some of the composed computing resources for source server 110 may include processing elements such as graphics hardware (HW) 141-1 to 141-n that may represent one or more GPUs or allocated portions of a single GPU. Processing elements may also include CPU/cores 142-1 to 142-n that may be integrated with or separate from graphics HW 141-1 to 141-n.
  • graphics HW 141-1 to 141-n and CPU/cores 142-1 to 142-n may separately have allocated portions of memory 144 for use in supporting VMs 130-1 to 130-n.
  • an allocated portion of memory 144 such as allocation 145-1 may be accessible to CPU/cores 142-1 to support VM 130-1 and thus may include CPU memory pages.
  • another allocated portion of memory 144 such as allocation 145-2 may be accessible to graphics HW 141-1 to support virtualized graphics HW such as a vGPU for VM 130-1 and thus may include vGPU memory pages.
  • CPU memory pages maintained in allocation 145-1 may include state information for one or more applications (App (s) ) 132-1 as well as state information for respective guest OS 134-1 for VM 130-1.
  • the state information maintained in CPU memory pages may reflect a current state of the VM while executing App (s) 132-1 to fulfill a workload as part of providing a network service.
  • guest OS 134-1 of VM 130-1 may have a vGPU driver 135-1 to enable App (s) 132-1 to implement virtualized graphics HW such as a vGPU via use of graphics HW 141-1.
  • App (s) 132-1 may thus have graphics processing capabilities to assist in fulfilling the workload.
  • vGPU memory pages maintained in allocation 145-2 may include vGPU state information that may reflect a current state of the vGPU.
  • the network service provided by App (s) 132-1 may include, but is not limited to, a video streaming service, a multi-player video game service, a database network service, a website hosting network service, a routing network service, an e-mail network service, a firewalling service, a domain name service (DNS) , a caching service, a network address translation (NAT) service or virus scanning network service.
  • DNS domain name service
  • NAT network address translation
  • At least some composed computing resources for destination server 120 may include CPU/cores 172-1 to 172-n having an allocated portion of memory 174 for use in supporting one or more VM (s) 180-1 and for use to support a migrated VM 130-1.
  • Allocated portions of memory 174 such as allocations 175-1 to 175-n may be accessible to graphics HW 171-1 to 171-n and CPU/cores 172-1 to 172-n for use in supporting VM 130-1 and VM (s) 180-1.
  • an allocated portion of memory 174 such as allocation 175-1 may be accessible to CPU/cores 172-1 to support VM 130-1 following a live migration of VM 130-1 and thus may include CPU memory pages.
  • allocation 175-2 may be accessible to graphics HW 171-1 to support virtualized graphics HW such as a vGPU for the migrated VM 130-1 and thus may include vGPU memory pages.
  • graphics HW 171-1 to 171-n at destination server 120 may represent one or more GPUs or allocated portions of a single GPU.
  • CPU memory pages maintained in allocation 175-1 may include state information for one or more applications (App (s) ) 132-1 as well as state information for respective guest OS 134-1 for migrated VM 130-1.
  • the state information maintained in CPU memory pages may reflect a state of the VM while executing App (s) 132-1 to fulfill the previously mentioned workload as part of providing the network service following live migration to destination server 120.
  • vGPU memory pages maintained in allocation 175-2 may include vGPU state information that may reflect a state of the vGPU following the live migration to destination server 120.
  • CPUs/cores 142-1 to 142-n or CPUs/cores 172-1 to 172-n may represent, either individually or collectively, various commercially available processors, including without limitation AMD® and ARM® application, embedded and secure processors; IBM® PowerPC® and Cell processors; Intel® Celeron®, Core (2) Duo®, Core i3, Core i5, Core i7, or Xeon® processors; and similar processors.
  • graphics HW 141-1 to 141-n and graphics HW 171-1 to 171-n may include various types of graphics processing HW resources such as a rendering engine, a media (e.g., audio and video) engine, or general purpose GPU (GPGPU) engines. These types of graphics processing HW resources may be embodied in one or more discrete GPU packages or may be integrated with other types of computing resources such as chipsets or CPU/cores.
  • a resource manager 190 may include logic and/or features to facilitate composition of disaggregated computing resources as shown in FIG. 1 for system 100.
  • the logic and/or features of resource manager 190 may initially allocate compute resources at source server 110 to support VMs 130-1 to 130-n and may decide to cause a live migration of VM 130-1 to be supported by compute resources at destination server 120 in order to reallocate or load balance compute resources of system 100.
  • resource manager 190 may also cause an allocation of other compute resources such as, but not limited to, networking resources (e.g., physical ports, network switching capabilities, network interface cards, accelerators, etc. ) .
  • memory 144 and memory 174 may include volatile memory, non-volatile memory or a combination of volatile and non-volatile types of memory.
  • Volatile types of memory may include, but are not limited to, dynamic random access memory (DRAM) or static random access memory (SRAM) , thyristor RAM (TRAM) or zero-capacitor RAM (ZRAM) .
  • Non-volatile types of memory may include byte or block addressable types of non-volatile memory having a 3-dimensional (3-D) cross-point memory structure that includes chalcogenide phase change material (e.g., chalcogenide glass) hereinafter referred to as “3-D cross-point memory” .
  • Non-volatile types of memory may also include other types of byte or block addressable non-volatile memory such as, but not limited to, multi-threshold level NAND flash memory, NOR flash memory, single or multi-level phase change memory (PCM) , resistive memory, nanowire memory, ferroelectric transistor random access memory (FeTRAM) , magnetoresistive random access memory (MRAM) that incorporates memristor technology, spin transfer torque MRAM (STT-MRAM) , or a combination of any of the above.
  • allocations 145-1 to 145-n may be allocated to VMs 130-1 to 130-n as a type of system memory for use by CPUs/cores 142-1 to 142-n or graphics HW 141-1 to 141-n to support these VMs at source server 110. Also, separate portions of memory 174 shown in FIG. 1 as allocations 175-2 to 175-n may be allocated to VM (s) 180-1 as a type of system memory for use by CPUs/cores 172-1 to 172-n or graphics HW 171-1 to 171-n to support these VMs at destination server 120.
  • migration logic 113 of VMM 112 at source server 110 and migration logic 123 of VMM 122 may separately include logic and/or features capable of facilitating a live migration of a VM from source server 110 to destination server 120.
  • migration logic 113 and migration logic 123 may facilitate the live migration of VM 130-1 from source server 110 to destination server 120.
  • one or more schemes may be implemented to facilitate live migration of vGPU state information that may reduce an amount of copying that needs to occur to complete the live migration of VM 130-1.
  • the one or more schemes may accelerate the live migration of VM 130-1.
  • the live migration may be initiated (e.g., by resource manager 190) responsive to such reasons as load balancing, fog/edge computing, or system maintenance.
  • the dash-lined boxes for CPU and vGPU memory pages at memory 174 represent transient movement of these memory pages during live migration. Also, the dash-lined boxes for VM 130-1, App (s) 132-1 and guest OS 134-1 represent a live migration of VM 130-1 to destination server 120 just before live migration is complete.
  • FIG. 2 illustrates an example portion 200.
  • portion 200 may be a portion of system 100 that includes elements of VM 130-1, VMM 112 and memory 144.
  • portion 200 may be part of a full GPU virtualization that may include vGPU driver 135-1 being able to trap and emulate guest access of privileged graphics HW resources (e.g., graphics HW 141-1) for security and multiplexing.
  • pass through CPU/core access to memory 144 may be allowed for CPU/core access to performance critical resources such as CPU/core access of graphics HW allocated memory included in allocation 145-2.
  • submitted commands from App (s) 132-1 for graphics processing by the graphics HW may be directly executed without VMM 112 intervention or involvement.
  • a graphics processing virtualization technology introduced by Intel Corporation is known as “GVT-g” .
  • GVT-g may use system memory such as memory 144 for graphics HW to access.
  • Access to memory 144 for graphics hardware may include vGPU driver 135-1’s use of global graphics translation table (GGTT) /per-process graphics translation table (PPGTT) 235 to translate from a graphics HW memory address to a system or physical memory address corresponding to vGPU memory pages maintained in allocation 145-2.
  • VMM 112 may use a shadow of GGTT/PPGTT 235 that is synchronized to GGTT/PPGTT 235.
  • GGTT/PPGTT 235 may be write-protected so that the shadow can be continually synchronized to GGTT/PPGTT 235 by trapping and emulating modifications to GGTT/PPGTT 235.
  • Via this shadowing mechanism, changes or modifications to GGTT/PPGTT 235 may be monitored by logic and/or features of VMM 112 such as migration logic 113.
  • migration logic 113 may implement a scheme or mechanism that includes monitoring vGPU memory pages included in allocation 145-2 during a live migration of VM 130-1 to log or identify dirty vGPU memory pages and remove duplicated memory pages from being copied during the live migration.
  • the logging of identified dirty vGPU memory pages may significantly reduce the number of vGPU memory pages to be copied during a live migration compared to copying all vGPU memory pages included in allocation 145-2.
  • migration logic 113 may scan or read GGTT/PPGTT 235 to identify a range of memory addresses allocated for use by graphics HW 141-1 and include that identified range in memory range 215. By migration logic 113 reading or scanning GGTT/PPGTT 235, the range of memory addresses being actively used by graphics HW 141-1 may be identified. A scan of GGTT/PPGTT 235 to identify a full range of memory addresses may only need to be done once.
  • Thereafter, GGTT/PPGTT 235 is monitored for changes and memory range 215 is updated accordingly.
  • changes to memory range 215 may be caused due to runtime changes in workload requirements from App (s) 132-1.
  • Monitoring just for changes to GGTT/PPGTT 235 may avoid a need to continually scan all of GGTT/PPGTT 235 to determine memory range 215.
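The scan-once-then-monitor approach above can be sketched with a toy translation table. The dict-based table and the handler names are assumptions for illustration; real GGTT/PPGTT entries are hardware-defined page-table structures.

```python
# Sketch of range tracking for GPU-mapped memory as described above.
# A toy GGTT/PPGTT is modeled as a dict of graphics address -> system page
# address; the function names are hypothetical.

def scan_translation_table(ggtt):
    # One-time full scan: identify every system page currently mapped
    # for use by the graphics hardware.
    return set(ggtt.values())

def on_table_write(memory_range, old_page, new_page):
    # Write-protection trap handler: because table writes are trapped and
    # emulated, the tracked range can be updated incrementally rather than
    # rescanning the whole table.
    if old_page is not None:
        memory_range.discard(old_page)
    if new_page is not None:
        memory_range.add(new_page)
```

The incremental handler is what makes write-protecting the table pay off: each trapped write costs one set update instead of a full rescan.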
  • migration logic 113 may generate a vGPU dirty page log 216.
  • vGPU dirty page log 216 may include a calculated hash value for content included in each vGPU memory page of memory range 215. Individual vGPU dirty memory pages may then be identified by changes in a respective vGPU memory page’s hash value.
  • Various hash algorithms may be used to generate hash values for vGPU memory pages.
  • Example hash algorithms may include, but are not limited to, XOR, SHA1, MD5 or CRC32 hash algorithms.
  • an XOR hash algorithm is used to calculate a hash value for vGPU memory pages of memory range 215.
  • the XOR hash algorithm may be used due to an ability to calculate a hash value at a relatively faster rate than other types of hash algorithms.
  • a change in an XOR hash value for a vGPU memory page indicates that content in the vGPU memory page has changed.
  • vGPU dirty page log 216 may maintain the hash values for the vGPU memory pages of memory range 215 and may indicate which hash values have changed and thus identified as a dirty vGPU memory page.
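The hash-based logging above can be sketched as follows. Folding each page into a 64-bit XOR of its 8-byte words is an assumed scheme; the patent does not fix a word size or page size.

```python
# Minimal sketch of XOR-hash dirty-page detection as described above.
import struct

PAGE_SIZE = 4096  # assumed page size

def xor_hash(page: bytes) -> int:
    # Fold the page into a single 64-bit value by XOR-ing 8-byte words.
    h = 0
    for (word,) in struct.iter_unpack("<Q", page):
        h ^= word
    return h

def find_dirty(logged_hashes, pages):
    # A vGPU memory page is marked dirty when its current hash no longer
    # matches the hash recorded in the dirty page log.
    return {addr for addr, page in pages.items()
            if xor_hash(page) != logged_hashes[addr]}
```

A caveat of XOR folding is that two offsetting changes within a page can cancel out, so stronger hashes such as CRC32 or SHA1 trade hashing speed for fewer missed changes.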
  • the first type referred to as “dirty type 1” may include currently mapped vGPU memory pages that are still mapped to GGTT/PPGTT 235 at a stop-and-copy phase of live migration.
  • the second type referred to as “dirty type 2” may include memory pages that were remapped to CPU/core 142-1 just prior to the stop-and-copy phase and hence a vGPU memory page mapping no longer exists in GGTT/PPGTT 235 for these dirty type 2 memory pages.
  • dirty type 2 memory pages are already tracked as part of a migration of state information for CPU/core 142-1.
  • vGPU dirty page table 217 may include only dirty type 1 dirty vGPU memory pages or migration logic 113 may select only dirty type 1 memory pages for copying.
  • migration logic 113 may use vGPU dirty page table 217 to determine which vGPU memory pages to copy during the stop-and-copy phase of the live migration.
  • FIG. 3 illustrates an example scheme 300.
  • scheme 300 may depict an example of a live migration of a VM having a vGPU.
  • elements of system 100 and portion 200 such as source server 110, VM 130-1, VMM 112, migration logic 113, memory 144 and destination server 120 as depicted in FIGS. 1 and 2 may implement portions of scheme 300.
  • vGPU dirty page logging is started.
  • a live migration of VM 130-1 from source server 110 to destination server 120 may have been initiated (e.g., by resource manager 190) .
  • Logic and/or features of VMM 112 such as migration logic 113 may identify memory ranges for vGPU memory pages included in allocation 145-2 of memory 144, calculate hash values for vGPU memory pages and then log any changes in calculated hash values in order to log dirty vGPU memory pages in vGPU dirty page log 216 during pre-memory phase 310.
  • vGPU dirty page table 217 may be used by migration logic 113 to determine what vGPU memory pages included in allocation 145-2 are to be copied to memory 174. Minimizing the amount of memory pages copied reduces service downtime as a portion of the total migration time.
  • VMM 112 may reallocate graphics HW 141-1 to support a different VM or may allow graphics HW 141-1 to be unallocated. In either case the vGPU instance of VM 130-1 at source server 110 is removed from a scheduler at source server 110.
  • the vGPU supporting VM 130-1 is resumed at destination server 120.
  • dirty vGPU memory pages identified in vGPU dirty page table 217 as dirty type 1 memory pages are copied to allocation 175-2 of memory 174 at destination server 120.
  • the state of the vGPU supporting VM 130-1 at destination server 120 may then be restored based, at least in part, on using content included in these dirty vGPU memory pages.
  • VMM 122 may then add this vGPU to a scheduler at destination server 120. Live migration of VM 130-1 may then be completed.
  • FIG. 4 illustrates an example dirty page log 400.
  • dirty page log 400 may include a range of memory addresses 410-1 to 410-m, where “m” represents any whole integer greater than 10.
  • Memory addresses 410-1 to 410-m included in dirty page log 216 may represent vGPU memory pages of allocation 145-2 included in memory range 215 scanned from GGTT/PPGTT 235 as mentioned above for FIG. 2.
  • respective initial hash values 420-1 to 420-m for addresses 410-1 to 410-m may be calculated via use of a hash algorithm (e.g., XOR algorithm) that has contents of memory pages corresponding to addresses 410-1 to 410-m as inputs to determine these initial hash values.
  • logic and/or features of VMM 112 may determine current hash values 430-1 to 430-m at each iteration of a pre-memory phase such as pre-memory phase 310 shown in FIG. 3.
  • the current hash values may be calculated via use of the same hash algorithm used to calculate the initial hash values, using contents of memory pages corresponding to addresses 410-1 to 410-m at a given iteration. For these examples, if current and initial hash values do not match for the given iteration for a given address, then the vGPU memory page corresponding to the given address is determined as dirty and added to dirty page table 217. For example, as shown in FIG. 4, migration logic 113 may have compared the initial and current hash values for addresses 410-1 to 410-m and determined that the hash values did not match for addresses 410-2, 410-5, 410-6, 410-7, 410-9, 410-10 and 410-m.
  • FIG. 5 illustrates an example dirty page table 217.
  • dirty page table 217 may include those addresses identified in dirty page log 216 as including dirty vGPU memory pages.
  • dirty page table 217 may include an indication of whether the identified addresses have dirty type 1 or dirty type 2 memory pages.
  • dirty type 1 may include vGPU memory pages that are still mapped to GGTT/PPGTT 235 at a stop-and-copy phase such as stop-and-copy phase 320 shown in FIG. 3.
  • dirty type 2 may include vGPU memory pages that were remapped to CPU/core 142-1 just prior to the stop-and-copy phase (e.g., became CPU memory pages) .
  • contents of memory pages corresponding to addresses 410-2, 410-5, 410-7, 410-10 and 410-m may be copied as part of the stop-and-copy phase for migrating vGPU state information from source server 110 to destination server 120. Meanwhile, contents of memory pages corresponding to addresses 410-6 and 410-9 are not copied as part of this migration of vGPU state information. Rather, the contents of memory pages corresponding to addresses 410-6 and 410-9 may be copied as part of the stop-and-copy phase for migrating CPU/core state information from source server 110 to destination server 120.
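The split described above amounts to a set partition of the logged dirty addresses against the addresses still mapped for the GPU. The data shapes below are illustrative assumptions, with integer addresses standing in for the 410-x addresses of FIGS. 4 and 5.

```python
# Sketch of partitioning logged dirty pages into the two types above.
# "Dirty type 1" pages are still mapped in the GPU translation table at
# stop-and-copy time and are copied with vGPU state; "dirty type 2" pages
# were remapped to the CPU and are left to existing CPU dirty-page tracking.

def partition_dirty_pages(dirty_addrs, gpu_mapped_addrs):
    type1 = set(dirty_addrs) & set(gpu_mapped_addrs)  # copy with vGPU state
    type2 = set(dirty_addrs) - type1                  # handled by CPU tracking
    return type1, type2
```

Excluding type 2 pages avoids copying the same page twice, since those pages are already covered by the CPU/core state migration.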
  • FIG. 6 illustrates an example block diagram for an apparatus 600.
  • Although apparatus 600 shown in FIG. 6 has a limited number of elements in a certain topology, it may be appreciated that apparatus 600 may include more or fewer elements in alternate topologies as desired for a given implementation.
  • apparatus 600 may be associated with logic and/or features of a VMM (e.g., migration logic 113 of VMM 112 as shown in FIG. 1) and may be supported by circuitry 620.
  • circuitry 620 may be incorporated within circuitry, processor circuitry, processing element, CPU or core maintained at a source server.
  • Circuitry 620 may be arranged to execute one or more software, firmware or hardware implemented modules, components or logic 622-a (module, component or logic may be used interchangeably in this context) . It is worthy to note that “a” and “b” and “c” and similar designators as used herein are intended to be variables representing any positive integer.
  • a complete set of software, firmware and/or hardware for logic 622-a may include logic 622-1, 622-2, 622-3, 622-4 or 622-5.
  • logic 622-1, 622-2, 622-3, 622-4 or 622-5 may represent the same or different integer values.
  • “Logic” , “module” or “component” may also include software/firmware stored in computer-readable media, and although the types of logic are shown in FIG. 6 as discrete boxes, this does not limit these components to storage in distinct computer-readable media components (e.g., a separate memory, etc. ) .
  • circuitry 620 may include a processor, processor circuit, processor circuitry, processor element, core or CPU. Circuitry 620 may be generally arranged to execute or implement one or more modules, components or logic 622-a. Circuitry 620 may be all or at least a portion of any of various commercially available processors, including without limitation AMD® and ARM® application, embedded and secure processors; IBM® PowerPC® and Cell processors; Intel® Celeron®, Core (2) Duo®, Core i3, Core i5, Core i7, or Xeon® processors; and similar processors. According to some examples circuitry 620 may also include an application specific integrated circuit (ASIC) and at least some logic 622-a may be implemented as hardware elements of the ASIC. According to some examples, circuitry 620 may also include a field programmable gate array (FPGA) and at least some logic 622-a may be implemented as hardware elements of the FPGA.
  • apparatus 600 may include initiate logic 622-1. Initiate logic 622-1 may be executed or implemented by circuitry 620 to initiate a pre-memory copy phase to live migrate a VM to a destination server. For these examples, live migration indication 605 may cause initiate logic 622-1 to initiate the pre-memory copy phase. In one example, live migration indication 605 may be sent by a resource manager of a datacenter that includes the source and destination servers for load balancing purposes.
  • apparatus 600 may include identify logic 622-2. Identify logic 622-2 may be executed or implemented by circuitry 620 to identify a plurality of memory addresses for memory allocated to a vGPU supporting the VM.
  • the plurality of memory addresses may include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing.
  • the vGPU may be allocated graphics hardware resources at the source server.
  • identify logic 622-2 may scan translation tables (e.g., GGTT/PPGTT tables) maintained by a vGPU driver at the VM to receive vGPU memory address information 610 in order to identify the plurality of memory addresses.
  • Identify logic 622-2 may at least temporarily maintain the plurality of memory addresses with memory addresses 624-a, e.g., maintained in a data structure such as a look up table (LUT) .
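As a concrete illustration of this scan, the sketch below merges the valid entries of a GGTT-style global table and PPGTT-style per-process tables into one address LUT. The table layout and function names are invented for illustration; real GGTT/PPGTT entries are hardware page-table entries, not Python dicts.

```python
def collect_vgpu_addresses(ggtt, ppgtts):
    """Gather the physical addresses of all pages currently mapped for the
    vGPU by walking a global translation table and any per-process tables,
    keeping only valid (mapped) entries."""
    lut = set()
    for entry in ggtt:
        if entry["valid"]:
            lut.add(entry["phys_addr"])
    for table in ppgtts:
        for entry in table:
            if entry["valid"]:
                lut.add(entry["phys_addr"])
    return sorted(lut)

ggtt = [{"phys_addr": 0x1000, "valid": True},
        {"phys_addr": 0x2000, "valid": False}]  # unmapped entry is skipped
ppgtt = [{"phys_addr": 0x3000, "valid": True}]
addresses = collect_vgpu_addresses(ggtt, [ppgtt])
```

The resulting sorted address list plays the role of memory addresses 624-a maintained in a LUT.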
  • apparatus 600 may include log logic 622-3.
  • Log logic 622-3 may be executed or implemented by circuitry 620 to generate a log that includes the plurality of memory addresses.
  • the log logic 622-3 may also determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log.
  • Log logic 622-3 may also, responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration.
  • Log logic 622-3 may thus track changes in vGPU memory pages and the log maintained by log logic 622-3 may be included in dirty page log 624-b (e.g., maintained in a LUT) .
  • apparatus 600 may include table logic 622-4.
  • Table logic 622-4 may be executed or implemented by circuitry 620 to add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table.
  • table logic 622-4 may at least temporarily include the added memory addresses in dirty page table 624-c (e.g., maintained in a LUT) .
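The log-then-compare behavior of log logic 622-3 and table logic 622-4 can be sketched as follows. All names are illustrative, and SHA-1 merely stands in for whatever page hash the implementation uses (Example 3 below names an XOR hash):

```python
import hashlib

def build_log(pages):
    """Record an initial hash per vGPU page, as log logic 622-3 might.
    pages maps a memory address to that page's content (bytes)."""
    return {addr: hashlib.sha1(data).digest() for addr, data in pages.items()}

def dirty_addresses(log, pages):
    """Rehash each logged page after a copy iteration and return the
    addresses whose current hash no longer matches the initial one --
    the entries table logic 622-4 would add to the dirty page table."""
    return sorted(addr for addr, data in pages.items()
                  if hashlib.sha1(data).digest() != log[addr])

pages = {0x1000: b"frame-0", 0x2000: b"texture"}
log = build_log(pages)         # before the first copy iteration
pages[0x1000] = b"frame-1"     # page written to during the iteration
dirty = dirty_addresses(log, pages)
```

Only the mutated page at 0x1000 lands in the dirty page table; the untouched page at 0x2000 does not need to be re-copied.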
  • apparatus 600 may include copy logic 622-5.
  • Copy logic 622-5 may be executed or implemented by circuitry 620 to copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  • copy logic 622-5 may also determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing and then cause the first memory address to not be copied during the stop-and-copy phase.
  • the first memory address for example, may be identified as a dirty type 2 vGPU memory page and therefore not copied.
  • Copied vGPU memory page content 630 may include the content from the one or more vGPU memory pages copied during the stop-and-copy phase.
  • Various components of apparatus 600 may be communicatively coupled to each other by various types of communications media to coordinate operations.
  • the coordination may involve the uni-directional or bi-directional exchange of information.
  • the components may communicate information in the form of signals communicated over the communications media.
  • the information can be implemented as signals allocated to various signal lines. In such allocations, each message is a signal.
  • Further embodiments, however, may alternatively employ data messages. Such data messages may be sent across various connections.
  • Example connections include parallel interfaces, serial interfaces, and bus interfaces.
  • FIG. 7 illustrates an example of a logic flow 700.
  • Logic flow 700 may be representative of some or all of the operations executed by one or more logic, features, or devices described herein, such as apparatus 600. More particularly, logic flow 700 may be implemented by at least initiate logic 622-1, identify logic 622-2, log logic 622-3, table logic 622-4 or copy logic 622-5.
  • logic flow 700 at block 702 may initiate, at a source server, a pre-memory copy phase to live migrate a VM to a destination server.
  • initiate logic 622-1 may initiate the pre-memory copy phase.
  • logic flow 700 at block 704 may identify a plurality of memory addresses for memory allocated to a vGPU supporting the VM, the plurality of memory addresses to include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing, the vGPU allocated graphics hardware resources at the source server.
  • identify logic 622-2 may identify the plurality of memory addresses.
  • logic flow 700 at block 706 may generate a log that includes the plurality of memory addresses.
  • log logic 622-3 may generate the log (e.g., a dirty page log) .
  • logic flow 700 at block 708 may determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log.
  • log logic 622-3 may determine the initial hash values and add these values to the log.
  • logic flow 700 at block 710 may determine, following a first of at least two copy iterations, current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration.
  • log logic 622-3 may determine the current hash values.
  • logic flow 700 at block 712 may add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table.
  • table logic 622-4 may add the memory addresses to the dirty page table.
  • logic flow 700 at block 714 may copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  • copy logic 622-5 may copy the content.
  • FIG. 8 illustrates an example of a storage medium 800.
  • Storage medium 800 may comprise an article of manufacture.
  • storage medium 800 may include any non-transitory computer readable medium or machine readable medium, such as an optical, magnetic or semiconductor storage.
  • Storage medium 800 may store various types of computer executable instructions, such as instructions to implement logic flow 700.
  • Examples of a computer readable or machine readable storage medium may include any tangible media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth.
  • Examples of computer executable instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, object-oriented code, visual code, and the like. The examples are not limited in this context.
  • FIG. 9 illustrates an example computing platform 900.
  • computing platform 900 may include a processing component 940, other platform components 950 or a communications interface 960.
  • computing platform 900 may be implemented in a server.
  • the server may be capable of coupling through a network to other servers and may be part of a datacenter including a plurality of network connected servers arranged to support VMs executing applications using vGPU allocated graphics HW resources at the servers.
  • processing component 940 may execute processing operations or logic for apparatus 600 and/or storage medium 800.
  • Processing component 940 may include various hardware elements, software elements, or a combination of both.
  • hardware elements may include devices, logic devices, components, processors, microprocessors, circuits, processor circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth) , integrated circuits, application specific integrated circuits (ASIC) , programmable logic devices (PLD) , digital signal processors (DSP) , field programmable gate array (FPGA) , memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth.
  • Examples of software elements may include software components, programs, applications, computer programs, application programs, device drivers, system programs, software development programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API) , instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an example is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given example.
  • platform components 950 may include common computing elements, such as one or more processors, multi-core processors, co-processors, memory units, chipsets, controllers, peripherals, interfaces, oscillators, timing devices, video cards, audio cards, multimedia input/output (I/O) components (e.g., digital displays) , power supplies, and so forth.
  • Examples of memory units may include without limitation various types of computer readable and machine readable storage media in the form of one or more higher speed memory units, such as read-only memory (ROM) , random-access memory (RAM) , dynamic RAM (DRAM) , Double-Data-Rate DRAM (DDRAM) , synchronous DRAM (SDRAM) , static RAM (SRAM) , programmable ROM (PROM) , erasable programmable ROM (EPROM) , electrically erasable programmable ROM (EEPROM) , types of non-volatile memory such as 3-D cross-point memory that may be byte or block addressable.
  • Non-volatile types of memory may also include other types of byte or block addressable non-volatile memory such as, but not limited to, multi-threshold level NAND flash memory, NOR flash memory, single or multi-level PCM, resistive memory, nanowire memory, FeTRAM, MRAM that incorporates memristor technology, STT-MRAM, or a combination of any of the above.
  • Other types of computer readable and machine readable storage media may also include magnetic or optical cards, an array of devices such as Redundant Array of Independent Disks (RAID) drives, solid state memory devices (e.g., USB memory) , solid state drives (SSD) and any other type of storage media suitable for storing information.
  • communications interface 960 may include logic and/or features to support a communication interface.
  • communications interface 960 may include one or more communication interfaces that operate according to various communication protocols or standards to communicate over direct or network communication links or channels.
  • Direct communications may occur via use of communication protocols or standards described in one or more industry standards (including progenies and variants) such as those associated with the PCIe specification.
  • Network communications may occur via use of communication protocols or standards such as those described in one or more Ethernet standards promulgated by IEEE.
  • one such Ethernet standard may include IEEE 802.3.
  • Network communication may also occur according to one or more OpenFlow specifications such as the OpenFlow Hardware Abstraction API Specification.
  • computing platform 900 may be implemented in a server of a datacenter. Accordingly, functions and/or specific configurations of computing platform 900 described herein, may be included or omitted in various embodiments of computing platform 900, as suitably desired for a server deployed in a datacenter.
  • computing platform 900 may be implemented using any combination of discrete circuitry, ASICs, logic gates and/or single chip architectures. Further, the features of computing platform 900 may be implemented using microcontrollers, programmable logic arrays and/or microprocessors or any combination of the foregoing where suitably appropriate. It is noted that hardware, firmware and/or software elements may be collectively or individually referred to herein as “logic” or “circuit. ”
  • exemplary computing platform 900 shown in the block diagram of FIG. 9 may represent one functionally descriptive example of many potential implementations. Accordingly, division, omission or inclusion of block functions depicted in the accompanying figures does not infer that the hardware components, circuits, software and/or elements for implementing these functions would necessarily be divided, omitted, or included in embodiments.
  • IP cores may be stored on a tangible, machine readable medium and supplied to various customers or manufacturing facilities to load into the fabrication machines that actually make the logic or processor.
  • hardware elements may include devices, components, processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth) , integrated circuits, ASIC, programmable logic devices (PLD) , digital signal processors (DSP) , FPGA, memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth.
  • software elements may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API) , instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an example is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given implementation.
  • a computer-readable medium may include a non-transitory storage medium to store logic.
  • the non-transitory storage medium may include one or more types of computer-readable storage media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth.
  • the logic may include various software elements, such as software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, API, instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof.
  • a computer-readable medium may include a non-transitory storage medium to store or maintain instructions that when executed by a machine, computing device or system, cause the machine, computing device or system to perform methods and/or operations in accordance with the described examples.
  • the instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, and the like.
  • the instructions may be implemented according to a predefined computer language, manner or syntax, for instructing a machine, computing device or system to perform a certain function.
  • the instructions may be implemented using any suitable high-level, low-level, object-oriented, visual, compiled and/or interpreted programming language.
  • a logic flow or scheme may be implemented in software, firmware, and/or hardware.
  • a logic flow or scheme may be implemented by computer executable instructions stored on at least one non-transitory computer readable medium or machine readable medium, such as an optical, magnetic or semiconductor storage. The embodiments are not limited in this context.
  • Some examples may be described using the expressions “coupled” and “connected” along with their derivatives. These terms are not necessarily intended as synonyms for each other. For example, descriptions using the terms “connected” and/or “coupled” may indicate that two or more elements are in direct physical or electrical contact with each other. The term “coupled, ” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.
  • An example apparatus may include circuitry at a source server and logic for execution by the circuitry.
  • the logic may initiate, at the source server, a pre-memory copy phase to live migrate a VM to a destination server.
  • the logic may also identify a plurality of memory addresses for memory allocated to a vGPU supporting the VM.
  • the plurality of memory addresses may include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing.
  • the vGPU may be allocated graphics hardware resources at the source server.
  • the logic may also generate a log that includes the plurality of memory addresses.
  • the logic may also determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log.
  • the logic may also, responsive to a first of at least two copy iterations of the pre- memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration.
  • the logic may also add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table.
  • the logic may also copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  • Example 2 The apparatus of example 1 may also include the logic to determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing. The logic may also cause the first memory address to not be copied during the stop-and-copy phase.
  • Example 3 The apparatus of example 1, the logic may determine the initial and current hash values via use of an XOR hash algorithm.
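The examples name an XOR hash but do not fix its construction. One minimal interpretation, sketched below, folds each page into a single 64-bit value by XOR-ing its 8-byte words: a one-pass, arithmetic-free computation well suited to rehashing many pages per copy iteration, at the cost of weak collision resistance (e.g., two swapped words hash identically):

```python
import struct

def xor_hash(page: bytes) -> int:
    """Fold a page into one 64-bit value by XOR-ing its 8-byte words.
    Assumes the page length is a multiple of 8 bytes."""
    h = 0
    for (word,) in struct.iter_unpack("<Q", page):
        h ^= word
    return h

unchanged = bytes(16)            # all-zero 16-byte "page"
dirtied = bytes(15) + b"\x01"    # same page with its last byte flipped
```

Flipping a single byte changes the hash, which is all the dirty-page comparison needs; a production implementation might prefer a stronger hash to reduce the chance of a changed page going undetected.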
  • the logic to identify the plurality of memory addresses may include the logic to scan a GGTT and a PPGTT maintained in a guest operating system for the VM.
  • the GGTT and the PPGTT may translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
  • Example 5 The apparatus of example 4, the logic may also monitor the GGTT or the PPGTT and update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
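The monitoring in Example 5 amounts to reconciling the tracked address set against the current table contents. A hedged sketch, with invented names:

```python
def refresh_tracked(tracked, table_addresses):
    """Reconcile the tracked address set against the addresses currently
    mapped by the GGTT/PPGTT: stop tracking addresses the tables no
    longer include and start tracking newly added ones.
    Returns (new tracked set, dropped addresses, added addresses)."""
    dropped = tracked - table_addresses
    added = table_addresses - tracked
    return set(table_addresses), dropped, added

tracked = {0x1000, 0x2000}
tracked, dropped, added = refresh_tracked(tracked, {0x2000, 0x3000})
```

Dropped addresses need no further hashing; added addresses get initial hash values on the next pass of the log logic.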
  • Example 6 The apparatus of example 1 may also include a digital display coupled to the circuitry to present a user interface view.
  • An example method may include initiating, at a source server, a pre-memory copy phase to live migrate a VM to a destination server.
  • the method may also include identifying a plurality of memory addresses for memory allocated to a vGPU supporting the VM.
  • the plurality of memory addresses including vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing.
  • the vGPU may be allocated graphics hardware resources at the source server.
  • the method may also include generating a log that includes the plurality of memory addresses.
  • the method may also include determining initial hash values for respective vGPU memory pages corresponding to the plurality of memory addresses based on content stored in the respective vGPU memory pages and adding the initial hash values to the log.
  • the method may also include determining, following a first of at least two copy iterations of the pre-memory copy phase, current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration.
  • the method may also include adding memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table.
  • the method may also include copying content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  • Example 8 The method of example 7 may also include determining whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing. The method may also include causing the first memory address to not be copied during the stop-and-copy phase.
  • Example 9 The method of example 7 may also include determining the initial and current hash values using an XOR hash algorithm.
  • Example 10 The method of example 7, identifying the plurality of memory addresses may include scanning a GGTT and a PPGTT maintained in a guest operating system for the VM.
  • the GGTT and the PPGTT may translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
  • Example 11 The method of example 10 may also include monitoring the GGTT or the PPGTT and updating the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
  • Example 12 An example at least one machine readable medium may include a plurality of instructions that in response to being executed by a system may cause the system to carry out a method according to any one of examples 7 to 11.
  • Example 13 An example apparatus may include means for performing the methods of any one of examples 7 to 11.
  • An example at least one machine readable medium may include a plurality of instructions that in response to being executed by a system may cause the system to initiate, at the source server, a pre-memory copy phase to live migrate a VM to a destination server.
  • the instructions may also cause the system to identify a plurality of memory addresses for memory allocated to a vGPU supporting the VM, the plurality of memory addresses to include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing.
  • the vGPU may be allocated graphics hardware resources at the source server.
  • the instructions may also cause the system to generate a log that includes the plurality of memory addresses and determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log.
  • the instructions may also cause the system to, responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration.
  • the instructions may also cause the system to add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table.
  • the instructions may also cause the system to copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  • Example 15 The at least one machine readable medium of example 14, the instructions may also cause the system to determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing. The instructions may also cause the system to cause the first memory address to not be copied during the stop-and-copy phase.
  • Example 16 The at least one machine readable medium of example 14, the instructions may further cause the system to determine the initial and current hash values via use of an XOR hash algorithm.
  • Example 17 The at least one machine readable medium of example 14, the instructions may further cause the system to identify the plurality of memory addresses by scanning a GGTT and a PPGTT maintained in a guest operating system for the VM.
  • the GGTT and the PPGTT may translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
  • Example 18 The at least one machine readable medium of example 14, the instructions may further cause the system to monitor the GGTT or the PPGTT and update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
  • An example system may include circuitry at a source server.
  • the system may also include memory at the source server coupled with the circuitry.
  • the system may also include graphics hardware at the source server coupled with the memory.
  • the system may also include logic for execution by the circuitry.
  • the logic may initiate, at the source server, a pre-memory copy phase to live migrate a VM to a destination server.
  • the logic may also identify a plurality of memory addresses for a portion of the memory that is allocated to a vGPU supporting the VM.
  • the plurality of memory addresses may include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing.
  • the vGPU may be allocated at least a portion of graphics processing capabilities of the graphics hardware.
  • the logic may generate a log that includes the plurality of memory addresses and determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log.
  • the logic may also, responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration.
  • the logic may also add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table.
  • the logic may also copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  • Example 20 The system of example 19, the logic may also determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing. The logic may also cause the first memory address to not be copied during the stop-and-copy phase.
  • Example 21 The system of example 19, the logic may determine the initial and current hash values via use of an XOR hash algorithm.
  • the logic to identify the plurality of memory addresses may include the logic to scan a GGTT and a PPGTT maintained in a guest operating system for the VM.
  • the GGTT and the PPGTT may translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
  • the logic may also monitor the GGTT or the PPGTT and update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.


Abstract

Examples include techniques to live migrate a virtual machine (VM) between servers. Examples include identifying memory addresses for memory allocated to store content associated with the VM executing an application that uses a virtual graphics processing unit (vGPU) for graphics processing and then monitoring the identified memory addresses for changes in content during the live migration to determine dirty vGPU memory pages to copy at a stop-and-copy phase of the live migration.

Description

Techniques for Live Migration Support for Graphics Processing Unit Virtualization
TECHNICAL FIELD
Examples described herein are generally related to live migration support of a virtual graphics processing unit (vGPU) between nodes, servers or computing platforms in a network.
BACKGROUND
In some examples, graphics hardware such as a graphics processing unit (GPU) may be virtualized in a virtual machine (VM) as a virtual GPU (vGPU) . In a full GPU virtualization, a vGPU driver in coordination with a VM manager (VMM) managing the VM may trap and emulate access to a GPU by a guest operating system (OS) being executed by the VM for purposes of security and multiplexing. The VMM may allow for pass through central processing unit (CPU) access of memory allocated to the vGPU of the VM for graphics processing. GPU commands, once submitted, may be directly executed in graphics hardware (e.g., in a GPU) without VMM intervention.
Live migration of VMs hosted by compute nodes, sleds, servers or computing platforms may be an important feature for a system such as a datacenter to enable fault-tolerant capabilities, flexible resource management or dynamic workload rebalancing. Live migration may include migrating the VM and an instance of a vGPU from a source server to a destination server. The migration of the VM and the instance of the vGPU may be over a network connection between the source and destination servers. The migration may be considered a “live” migration if one or more applications being executed by the migrated VM or vGPU continue to be executed by the VM or vGPU during a large portion of a migration between source and destination servers. Execution of the one or more applications may only be briefly halted just prior to copying remaining state information from the source server to the destination server to enable the VM to resume execution of the application at the destination server.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates an example system.
FIG. 2 illustrates an example portion of the example system.
FIG. 3 illustrates an example scheme.
FIG. 4 illustrates an example dirty page log.
FIG. 5 illustrates an example dirty page table.
FIG. 6 illustrates an example block diagram for a first apparatus.
FIG. 7 illustrates an example of a logic flow.
FIG. 8 illustrates an example of a storage medium.
FIG. 9 illustrates an example computing platform.
DETAILED DESCRIPTION
As contemplated in the present disclosure, live migration of a VM and an instance of a vGPU from a source server to a destination server may be considered live if an application being executed by the VM continues to be executed during most of the live migration. Typically, VM hypervisors or VMMs separately supported by respective source/destination servers may be arranged to coordinate live or offline migration of VMs between servers.
A large portion of a live migration of a VM and an instance of a vGPU may be state information that includes memory used by the VM and a supporting vGPU while executing one or more applications or assisting in graphics processing for the one or more applications. Live migration of state information for a VM involves a multi-phase process. The first phase may be a pre-memory copy phase that includes copying initial memory (e.g., for a 1st iteration) and changed memory (e.g., dirty pages) for remaining iterations from the source node to the destination node while the VM is still executing one or more applications and is still running on the source server. The first or pre-memory phase may continue until remaining dirty pages at the source server fall below a threshold. The second phase may then be a stop-and-copy phase that stops or halts the VM at the source server and copies remaining state information (e.g., remaining dirty pages and/or processor/GPU states, input/output state, etc. ) to the destination server. A third or final phase includes resuming the VM at the destination server. The copying of VM state information for the first two phases may be through a network connection maintained between the source and destination node.
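The phased flow described above can be sketched with a small toy model. Everything here is an illustrative assumption, not a real VMM interface: the `ToyVM` class, its fields, `live_migrate` and the `dirty_threshold` value exist only for this sketch.

```python
# Toy model of iterative pre-copy followed by stop-and-copy. The ToyVM class
# and live_migrate() are illustrative assumptions, not a real VMM interface.

class ToyVM:
    def __init__(self, pages):
        self.pages = dict(pages)       # page id -> content
        self.dirty = set(self.pages)   # 1st iteration copies all memory
        self.running = True

    def write(self, page_id, content):
        # Guest writes landing between copy iterations re-dirty pages.
        self.pages[page_id] = content
        self.dirty.add(page_id)

def live_migrate(src, dirty_threshold=1):
    dest_pages = {}
    # Pre-memory copy phase: copy while the VM keeps running.
    while len(src.dirty) > dirty_threshold:
        batch, src.dirty = src.dirty, set()
        for pid in batch:
            dest_pages[pid] = src.pages[pid]
    # Stop-and-copy phase: halt the VM and copy the remaining dirty pages.
    src.running = False
    for pid in src.dirty:
        dest_pages[pid] = src.pages[pid]
    src.dirty = set()
    return dest_pages  # the VM would then resume at the destination

vm = ToyVM({0: "a", 1: "b", 2: "c"})
migrated = live_migrate(vm)
```

The shorter the remaining dirty set when the threshold is reached, the shorter the stop-and-copy window, which is the interval during which the application is unresponsive.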
Live migration for an instance of a vGPU may follow the same multi-phase process as mentioned above for a VM. However, memory pages for memory allocated to the vGPU for use by a GPU may be difficult to track during a live migration. While CPU memory page dirty tracking may be possible through techniques such as dirty bit scanning or page faults, these techniques may not be available for GPU-related memory pages. Without a system or technique to track dirty page generation for memory pages for memory allocated to the vGPU, a large number of these memory pages may be copied during a live migration even when not dirty. The large number of unnecessarily copied memory pages may delay the stop-and-copy phase of the live migration. The amount of time spent in the second stop-and-copy phase is important as the one or more applications being executed by the VM may be briefly halted for this period of time. Thus, any services being provided while executing the application may be temporarily unresponsive. It is with respect to these challenges that the examples described herein are needed.
FIG. 1 illustrates an example system 100. In some examples, as shown in FIG. 1, system 100 includes a source server 110 coupled through a network 150 with a destination server 120. For these examples, system 100 may be part of a datacenter and network 150 may represent elements of an internal network that communicatively couples a plurality of servers included in the datacenter such as communicatively coupling source server 110 with various other servers or nodes that include destination server 120. These interconnected servers or nodes may provide network services to one or more clients or customers as part of a software-defined network (SDN) or part of a European Telecommunications Standards Institute (ETSI) network function virtualization (NFV) infrastructure.
According to some examples, source server 110 and destination server 120 may represent at least a portion of composed computing resources arranged to support VMs separately executing one or more applications as part of providing network services to clients or customers. For example, VM 130-1 to 130-n (where “n” represents any whole, positive integer greater than 2) and VM (s) 180-1 may be supported by composed computing resources associated with respective source server 110 and destination server 120. VMs 130-1 to 130-n at source server 110 may be managed or controlled by a VMM or hypervisor such as VMM 112. VM (s) 180-1 at destination server 120 may be managed or controlled by VMM 122.
In some examples, as shown in FIG. 1, at least some of the composed computing resources for source server 110 may include processing elements such as graphics hardware (HW) 141-1 to 141-n that may represent one or more GPUs or allocated portions of a single GPU. Processing elements may also include CPU/cores 142-1 to 142-n that may be integrated with or separate from graphics HW 141-1 to 141-n. For these examples, graphics HW 141-1 to 141-n and CPU/cores 142-1 to 142-n may separately have allocated portions of memory 144 for use in supporting VMs 130-1 to 130-n. For example, an allocated portion of memory 144 such as allocation 145-1 may be accessible to CPU/cores 142-1 to support VM 130-1 and thus may include CPU memory pages. Also, another allocated portion of memory 144 such as allocation 145-2 may be accessible to graphics HW 141-1 to support virtualized graphics HW such as a vGPU for VM 130-1 and thus may include vGPU memory pages.
According to some examples, CPU memory pages maintained in allocation 145-1 may include state information for one or more applications (App (s) ) 132-1 as well as state information for respective guest OS 134-1 for VM 130-1. The state information maintained in CPU memory pages may reflect a current state of the VM while executing App (s) 132-1 to fulfill a workload as part of providing a network service. Also, in some examples, guest OS 134-1 of VM 130-1 may have a vGPU driver 135-1 to enable App (s) 132-1 to implement virtualized graphics HW such as a vGPU via use of graphics HW 141-1. App (s) 132-1 may thus have graphics processing capabilities to assist in fulfilling the workload. For these examples, vGPU memory pages maintained in allocation 145-2 may include vGPU state information that may reflect a current state of the vGPU. The network service provided by App (s) 132-1 may include, but is not limited to, a video streaming service, a multi-player video game service, a database network service, a website hosting network service, a routing network service, an e-mail network service, a firewalling service, a domain name service (DNS) , a caching service, a network address translation (NAT) service or a virus scanning network service.
In some examples, as shown in FIG. 1, at least some composed computing resources for destination server 120 may include CPU/cores 172-1 to 172-n having an allocated portion of memory 174 for use in supporting one or more VM (s) 180-1 and a migrated VM 130-1. Allocated portions of memory 174 such as allocations 175-1 to 175-n may be accessible to graphics HW 171-1 to 171-n and CPU/cores 172-1 to 172-n for use in supporting VM 130-1 and VM (s) 180-1. For example, an allocated portion of memory 174 such as allocation 175-1 may be accessible to CPU/cores 172-1 to support VM 130-1 following a live migration of VM 130-1 and thus may include CPU memory pages. Also, another allocated portion of memory 174 such as allocation 175-2 may be accessible to graphics HW 171-1 to support virtualized graphics HW such as a vGPU for the migrated VM 130-1 and thus may include vGPU memory pages. Also, similar to what was mentioned above for source server 110, graphics HW 171-1 to 171-n at destination server 120 may represent one or more GPUs or allocated portions of a single GPU.
According to some examples, CPU memory pages maintained in allocation 175-1 may include state information for one or more applications (App (s) ) 132-1 as well as state information for respective guest OS 134-1 for migrated VM 130-1. The state information maintained in CPU memory pages may reflect a state of the VM while executing App (s) 132-1 to fulfill the previously mentioned workload as part of providing the network service following live migration to destination server 120. Also, for these examples, vGPU memory pages maintained in allocation 175-2 may include vGPU state information that may reflect a state of the vGPU following the live migration to destination server 120.
In some examples, CPUs/cores 142-1 to 142-n or CPUs/cores 172-1 to 172-n may represent, either individually or collectively, various commercially available processors, including without limitation application, embedded and secure processors; IBM Cell processors; Core (2) , Core i3, Core i5, Core i7 or Xeon processors; and similar processors.
According to some examples, graphics HW 141-1 to 141-n and graphics HW 171-1 to 171-n may include various types of graphics processing HW resources such as rendering engines, media (e.g., audio and video) engines and general purpose GPU (GPGPU) engines. These types of graphics processing HW resources may be embodied in one or more discrete GPU packages or may be integrated with other types of computing resources such as chipsets or CPU/cores.
In some examples, a resource manager 190 may include logic and/or features to facilitate composition of disaggregated computing resources as shown in FIG. 1 for system 100. The logic and/or features of resource manager 190 may initially allocate compute resources at source server 110 to support VMs 130-1 to 130-n and may decide to cause a live migration of VM 130-1 to be supported by compute resources at destination server 120 in order to reallocate or load balance compute resources of system 100. Although not shown in FIG. 1, resource manager 190 may also cause an allocation of other compute resources such as, but not limited to, networking resources (e.g., physical ports, network switching capabilities, network interface cards, accelerators, etc. ) .
According to some examples, memory 144 and memory 174 may include volatile memory, non-volatile memory or a combination of volatile and non-volatile types of memory. Volatile types of memory may include, but are not limited to, dynamic random access memory (DRAM) or static random access memory (SRAM) , thyristor RAM (TRAM) or zero-capacitor RAM (ZRAM) . Non-volatile types of memory may include byte or block addressable types of non-volatile memory having a 3-dimensional (3-D) cross-point memory structure that includes chalcogenide phase change material (e.g., chalcogenide glass) hereinafter referred to as “3-D cross-point memory” . Non-volatile types of memory may also include other types of byte or block addressable non-volatile memory such as, but not limited to, multi-threshold level NAND flash memory, NOR flash memory, single or multi-level phase change memory (PCM) , resistive memory, nanowire memory, ferroelectric transistor random access memory (FeTRAM) , magnetoresistive random access memory (MRAM) that incorporates memristor technology, spin transfer torque MRAM (STT-MRAM) , or a combination of any of the above. For these examples, separate portions of memory 144 shown in FIG. 1 as allocations 145-1 to 145-n may be allocated to VMs 130-1 to 130-n as a type of system memory for use by CPUs/cores 142-1 to 142-n or graphics HW 141-1 to 141-n to support these VMs at source server 110. Also, separate portions of memory 174 shown in FIG. 1 as allocations 175-2 to 175-n may be allocated to VM (s) 180-1 as a type of system memory for use by CPUs/cores 172-1 to 172-n or graphics HW 171-1 to 171-n to support these VMs at destination server 120.
In some examples, migration logic 113 of VMM 112 at source server 110 and migration logic 123 of VMM 122 may separately include logic and/or features capable of facilitating a live migration of a VM from source server 110 to destination server 120. For example, migration logic 113 and migration logic 123 may facilitate the live migration of VM 130-1 from source server 110 to destination server 120. As described more below, one or more schemes may be implemented to facilitate live migration of vGPU state information that may reduce an amount of copying that needs to occur to complete the live migration of VM 130-1. Thus, the one or more schemes may accelerate the live migration of VM 130-1. The live migration may be initiated (e.g., by resource manager 190) responsive to such reasons as load balancing, fog/edge computing, or system maintenance. The dash-lined boxes for CPU and vGPU memory pages at memory 174 represent transient movement of these memory pages during live migration. Also, the dash-lined boxes for VM 130-1, App (s) 132-1 and guest OS 134-1 represent a live migration of VM 130-1 to destination server 120 just before live migration is complete.
FIG. 2 illustrates an example portion 200. As shown in FIG. 2, portion 200 may be a portion of system 100 that includes elements of VM 130-1, VMM 112 and memory 144. In  some examples, portion 200 may be part of a full GPU virtualization that may include vGPU driver 135-1 being able to trap and emulate guest access of privileged graphics HW resources (e.g., graphics HW 141-1) for security and multiplexing. Meanwhile, pass through CPU/core access to memory 144 may be allowed for CPU/core access to performance critical resources such as CPU/core access of graphics HW allocated memory included in allocation 145-2. In some examples, submitted commands from App (s) 132-1 for graphics processing by the graphics HW may be directly executed without VMM 112 intervention or involvement.
According to some examples, a graphics processing virtualization technology introduced by Intel Corporation is known as “GVT-g” . GVT-g may use system memory such as memory 144 for graphics HW to access. Access to memory 144 for graphics hardware may include vGPU driver 135-1’s use of global graphics translation table (GGTT) /per-process graphics translation table (PPGTT) 235 to translate from a graphics HW memory address to a system or physical memory address corresponding to vGPU memory pages maintained in allocation 145-2. Meanwhile, a shadowing mechanism may be used for GGTT/PPGTT 235. VMM 112 may use a shadow of GGTT/PPGTT 235 that is synchronized to GGTT/PPGTT 235. GGTT/PPGTT 235 may be write-protected so that the shadow can be continually synchronized to GGTT/PPGTT 235 by trapping and emulating modifications to GGTT/PPGTT 235. Thus, by this shadowing mechanism, changes or modifications to GGTT/PPGTT 235 may be monitored by logic and/or features of VMM 112 such as migration logic 113.
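The shadowing mechanism can be illustrated with a small sketch. The trap-and-emulate step is simulated here by an ordinary method call; `ShadowedTable` and its fields are assumptions made for illustration and are not part of a real GVT-g implementation.

```python
# Toy sketch of shadow-table synchronization: the guest GGTT/PPGTT is
# write-protected, so each guest write traps to the VMM, which emulates the
# write and applies it to the shadow copy as well. The trap is simulated by
# a plain method call; this is not a real GVT-g interface.

class ShadowedTable:
    def __init__(self, entries):
        self.guest = dict(entries)    # guest-visible translation entries
        self.shadow = dict(entries)   # VMM's synchronized shadow copy
        self.trapped_writes = 0       # every guest write is observed

    def guest_write(self, gfx_addr, phys_addr):
        # In hardware this write would fault; the VMM traps and emulates it.
        self.trapped_writes += 1
        self.guest[gfx_addr] = phys_addr
        self.shadow[gfx_addr] = phys_addr

table = ShadowedTable({0x0: 0x100000})
table.guest_write(0x1, 0x101000)
```

Because every modification passes through the trap, the VMM observes all mapping changes without polling the guest table.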
In some examples, migration logic 113 may implement a scheme or mechanism that includes monitoring vGPU memory pages included in allocation 145-2 during a live migration of VM 130-1 to log or identify dirty vGPU memory pages and remove duplicated memory pages from being copied during the live migration. The logging of identified dirty vGPU memory  pages may significantly reduce the number of vGPU memory pages to be copied during a live migration compared to copying all vGPU memory pages included in allocation 145-2.
According to some examples, migration logic 113 may scan or read GGTT/PPGTT 235 to identify a range of memory addresses allocated for use by graphics HW 141-1 and include that identified range in memory range 215. By migration logic 113 reading or scanning GGTT/PPGTT 235, the range of memory addresses being actively used by graphics HW 141-1 may be identified. A scan of GGTT/PPGTT 235 to identify a full range of memory addresses may only need to be done once. Once GGTT/PPGTT 235 is scanned, dirty memory pages from among vGPU memory pages included in allocation 145-2 are included in memory range 215, and memory pages outside of this range do not need to be tracked as they are likely not being used by graphics HW 141-1 to support VM 130-1 and/or App (s) 132-1.
In some examples, GGTT/PPGTT 235 is monitored for changes and memory range 215 is updated accordingly. For example, changes to memory range 215 may be caused by runtime changes in workload requirements from App (s) 132-1. Monitoring just for changes to GGTT/PPGTT 235 may avoid a need to continually scan all of GGTT/PPGTT 235 to determine memory range 215.
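The scan-once-then-update approach can be sketched as follows. The dict-based table format (graphics address to physical address) is a simplifying assumption; real GGTT/PPGTT entries carry flags and multi-level structure.

```python
# Sketch of deriving the tracked memory range from translation tables: scan
# once for all mapped physical addresses, then apply incremental updates as
# entries are added or removed, instead of rescanning the whole table.

def scan_tables(ggtt, ppgtt):
    # One full scan identifies every physical address the graphics HW can use.
    return set(ggtt.values()) | set(ppgtt.values())

def apply_update(tracked, added=(), removed=()):
    # Later changes, observed via the shadowing mechanism, are applied
    # incrementally to the tracked range.
    return (set(tracked) | set(added)) - set(removed)

ggtt = {0x0: 0x100000, 0x1: 0x101000}
ppgtt = {0x0: 0x200000}
tracked = scan_tables(ggtt, ppgtt)
tracked = apply_update(tracked, added={0x201000}, removed={0x101000})
```

Only pages inside the tracked range need hash-based dirty monitoring; everything outside it is not reachable by the graphics hardware.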
According to some examples, migration logic 113 may generate a vGPU dirty page log 216. As described more below, vGPU dirty page log 216 may include a calculated hash value for content included in each vGPU memory page of memory range 215. Individual vGPU dirty memory pages may then be identified by changes in a respective vGPU memory page’s hash value. Various hash algorithms may be used to generate hash values for vGPU memory pages. Example hash algorithms may include, but are not limited to, XOR, SHA1, MD5 or CRC32 hash algorithms.
In one example, an XOR hash algorithm is used to calculate a hash value for vGPU memory pages of memory range 215. The XOR hash algorithm may be used due to an ability to calculate a hash value at a relatively faster rate than other types of hash algorithms. A change in an XOR hash value for a vGPU memory page indicates that content in the vGPU memory page has changed. vGPU dirty page log 216 may maintain the hash values for the vGPU memory pages of memory range 215 and may indicate which hash values have changed, thus identifying dirty vGPU memory pages.
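One possible XOR-folding page hash is sketched below. The 8-byte word size and little-endian folding are assumptions made for the sketch; the document does not specify the exact XOR scheme.

```python
# XOR-folding hash over a page: fold the page into one word by XOR-ing
# fixed-size chunks. Fast, because it is a single pass of cheap operations.

def xor_page_hash(page: bytes, word_size: int = 8) -> int:
    h = 0
    for i in range(0, len(page), word_size):
        h ^= int.from_bytes(page[i:i + word_size], "little")
    return h

clean = bytes(4096)                       # all-zero 4 KiB page
dirty = bytes(8) + b"\x01" + bytes(4087)  # same page with one byte changed
changed = xor_page_hash(clean) != xor_page_hash(dirty)
```

Note that XOR folding is fast but weak: two offsetting changes in different words can cancel out, so stronger digests such as SHA1 or CRC32 trade speed for fewer missed changes.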
According to some examples, two different types of dirty vGPU memory pages may be identified based on changes to hash values. A first type, referred to as “dirty type 1” , may include currently mapped vGPU memory pages that are still mapped to GGTT/PPGTT 255 at a stop-and-copy phase of live migration. The second type, referred to as “dirty type 2” , may include memory pages that were remapped to CPU/core 142-1 just prior to the stop-and-copy phase and hence a vGPU memory page mapping no longer exists in GGTT/PPGTT 255 for these dirty type 2 memory pages. In some examples, dirty type 2 memory pages are already tracked as part of a migration of state information for CPU/core 142-1. In order to avoid copying the same memory page twice during a live migration, dirty type 2 memory pages identified as dirty vGPU memory pages in vGPU dirty page log 216 are not copied during the stop-and-copy phase. Thus, in some examples, vGPU dirty page table 217 may include only dirty type 1 vGPU memory pages or migration logic 113 may select only dirty type 1 memory pages for copying. Thus, migration logic 113 may use vGPU dirty page table 217 to determine which vGPU memory pages to copy during the stop-and-copy phase of the live migration.
FIG. 3 illustrates an example scheme 300. In some examples, scheme 300 may depict an example of a live migration of a VM having a vGPU. For these examples, elements of system 100 and portion 200 shown in FIGS. 1 and 2, such as source server 110, VM 130-1, VMM 112, migration logic 113, memory 144 and destination server 120, may implement portions of scheme 300.
According to some examples, at pre-memory phase 310, vGPU dirty page logging is started. For these examples, a live migration of VM 130-1 from source server 110 to destination server 120 may have been initiated (e.g., by resource manager 190) . Logic and/or features of VMM 112 such as migration logic 113 may identify memory ranges for vGPU memory pages included in allocation 145-2 of memory 144, calculate hash values for vGPU memory pages and then log any changes in calculated hash values in order to log dirty vGPU memory pages in vGPU dirty page log 216 during pre-memory phase 310.
In some examples, at stop-and-copy phase 320, all current state information related to the vGPU supporting VM 130-1 is to be saved and copied to memory 174 at destination server 120. For these examples, vGPU dirty page table 217 may be used by migration logic 113 to determine what vGPU memory pages included in allocation 145-2 are to be copied to memory 174. Minimizing the amount of memory pages copied reduces service downtime as a portion of the total migration time.
According to some examples, VMM 112 may reallocate graphics HW 141-1 to support a different VM or may allow graphics HW 141-1 to be unallocated. In either case, the vGPU instance of VM 130-1 at source server 110 is removed from a scheduler at source server 110.
According to some examples, at resume/post-copy phase 330, the vGPU supporting VM 130-1 is resumed at destination server 120. For these examples, dirty vGPU memory pages identified in vGPU dirty page table 217 as dirty type 1 memory pages are copied to allocation 175-2 of memory 174 at destination server 120. The state of the vGPU supporting VM 130-1 at destination server 120 may then be restored based, at least in part, on using content included in these dirty vGPU memory pages. VMM 122 may then add this vGPU to a scheduler at destination server 120. Live migration of VM 130-1 may then be completed.
FIG. 4 illustrates an example dirty page log 400. In some examples, as shown in FIG. 4, dirty page log 400 may include a range of memory addresses 410-1 to 410-m, where “m” represents any whole integer greater than 10. Memory addresses 410-1 to 410-m included in dirty page log 216 may represent vGPU memory pages of allocation 145-2 included in memory range 215 scanned from GGTT/PPGTT 235 as mentioned above for FIG. 2. For these examples, respective initial hash values 420-1 to 420-m for addresses 410-1 to 410-m may be calculated via use of a hash algorithm (e.g., an XOR algorithm) that has contents of memory pages corresponding to addresses 410-1 to 410-m as inputs to determine these initial hash values.
In some examples, logic and/or features of VMM 112 such as migration logic 113 may determine current hash values 430-1 to 430-m at each iteration of a pre-memory phase such as pre-memory phase 310 shown in FIG. 3. The current hash values may be calculated via use of the same hash algorithm used to calculate the initial hash values, using contents of memory pages corresponding to addresses 410-1 to 410-m at a given iteration to determine these current hash values. For these examples, if current and initial hash values do not match for the given iteration for a given address, then the vGPU memory page corresponding to the given address is determined to be dirty and added to dirty page table 217. For example, as shown in FIG. 4, migration logic 113 may have compared the initial and current hash values for addresses 410-1 to 410-m and determined that the hash values did not match for addresses 410-2, 410-5, 410-6, 410-7, 410-9, 410-10 and 410-m.
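The log comparison described above can be sketched as a simple dict comparison. The toy addresses and hash values below are illustrative and not the 410-x entries of FIG. 4.

```python
# Sketch of the dirty-page-log comparison: any address whose current hash
# differs from its initial (or previous-iteration) hash is flagged dirty
# and would be added to the dirty page table.

def find_dirty(initial_hashes, current_hashes):
    return sorted(addr for addr, h in current_hashes.items()
                  if initial_hashes.get(addr) != h)

initial = {0x1000: 0xAA, 0x2000: 0xBB, 0x3000: 0xCC}
current = {0x1000: 0xAA, 0x2000: 0xBD, 0x3000: 0xCC}  # only 0x2000 changed
dirty_table = find_dirty(initial, current)
```

On each pre-copy iteration the current hashes would replace the baseline, so only pages dirtied since the last iteration are re-copied.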
FIG. 5 illustrates an example dirty page table 217. In some examples, as shown in FIG. 5, dirty page table 217 may include those addresses identified in dirty page log 216 as including dirty vGPU memory pages. For these examples, dirty page table 217 may include an indication of whether the identified addresses have dirty type 1 or dirty type 2 memory pages. As mentioned previously, dirty type 1 may include vGPU memory pages that are still mapped to GGTT/PPGTT 255 at a stop-and-copy phase such as stop-and-copy phase 320 shown in FIG. 3. Meanwhile, dirty type 2 may include vGPU memory pages that were remapped to CPU/core 142-1 just prior to the stop-and-copy phase (e.g., became CPU memory pages) .
In some examples, according to dirty page table 217, contents of memory pages corresponding to addresses 410-2, 410-5, 410-7, 410-10 and 410-m may be copied as part of the stop-and-copy phase for migrating vGPU state information from source server 110 to destination server 120. Meanwhile contents of memory pages corresponding to addresses 410-6 and 410-9 are not copied as part of this migration of vGPU state information. Rather, the contents of memory pages corresponding to addresses 410-6 and 410-9 may be copied as part of the stop-and-copy phase for migrating CPU/core state information from source server 110 to destination server 120.
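The copy decision described above reduces to a membership check against the translation-table mappings at stop-and-copy time. The set-based mapping check below is an assumption made for illustration.

```python
# Sketch of the type-1 / type-2 split: only dirty pages still mapped in the
# GGTT/PPGTT at stop-and-copy (dirty type 1) are copied with vGPU state;
# remapped pages (dirty type 2) are left to CPU page tracking so the same
# page is never copied twice.

def vgpu_pages_to_copy(dirty_table, still_mapped):
    return sorted(addr for addr in dirty_table if addr in still_mapped)

dirty_table = [2, 5, 6, 7, 9, 10]   # addresses flagged dirty in the log
still_mapped = {2, 5, 7, 10}        # still in GGTT/PPGTT: dirty type 1
to_copy = vgpu_pages_to_copy(dirty_table, still_mapped)  # 6 and 9 skipped
```

Here the skipped addresses mirror the FIG. 5 example, where the type 2 entries are migrated as CPU/core state instead.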
FIG. 6 illustrates an example block diagram for an apparatus 600. Although apparatus 600 shown in FIG. 6 has a limited number of elements in a certain topology, it may be appreciated that the apparatus 600 may include more or less elements in alternate topologies as desired for a given implementation.
According to some examples, apparatus 600 may be associated with logic and/or features of a VMM (e.g., migration logic 113 of VMM 112 as shown in FIG. 1) and may be supported by circuitry 620. For these examples, circuitry 620 may be incorporated within circuitry, processor circuitry, processing element, CPU or core maintained at a source server. Circuitry 620 may be  arranged to execute one or more software, firmware or hardware implemented modules, components or logic 622-a (module, component or logic may be used interchangeably in this context) . It is worthy to note that “a” and “b” and “c” and similar designators as used herein are intended to be variables representing any positive integer. Thus, for example, if an implementation sets a value for a = 5, then a complete set software, firmware and/or hardware for logic 622-amay include logic 622-1, 622-2, 622-3, 622-4 or 622-5. The examples presented are not limited in this context and the different variables used throughout may represent the same or different integer values. Also, “logic” , “module” or “component” may also include software/firmware stored in computer-readable media, and although the types of logic are shown in FIG. 6 as discrete boxes, this does not limit these components to storage in distinct computer-readable media components (e.g., a separate memory, etc. ) .
According to some examples, circuitry 620 may include a processor, processor circuit, processor circuitry, processor element, core or CPU. Circuitry 620 may be generally arranged to execute or implement one or more modules, components or logic 622-a. Circuitry 620 may be all or at least a portion of any of various commercially available processors, including without limitation application, embedded and secure processors; IBM Cell processors; Core (2) , Core i3, Core i5, Core i7 or Xeon processors; and similar processors. According to some examples, circuitry 620 may also include an application specific integrated circuit (ASIC) and at least some logic 622-a may be implemented as hardware elements of the ASIC. According to some examples, circuitry 620 may also include a field programmable gate array (FPGA) and at least some logic 622-a may be implemented as hardware elements of the FPGA.
According to some examples, apparatus 600 may include initiate logic 622-1. Initiate logic 622-1 may be executed or implemented by circuitry 620 to initiate a pre-memory copy phase to live migrate a VM to a destination server. For these examples, live migration indication 605 may cause initiate logic 622-1 to initiate the pre-memory copy phase. In one example, live migration indication 605 may be sent by a resource manager of a datacenter that includes the source and destination servers for load balancing purposes.
In some examples, apparatus 600 may include identify logic 622-2. Identify logic 622-2 may be executed or implemented by circuitry 620 to identify a plurality of memory addresses for memory allocated to a vGPU supporting the VM. For these examples, the plurality of memory addresses may include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing. The vGPU may be allocated graphics hardware resources at the source server. According to some examples, identify logic 622-2 may scan translation tables (e.g., GGTT/PPGTT tables) maintained with a vGPU driver at the VM to receive vGPU memory address information 610 in order to identify the plurality of memory addresses. Identify logic 622-2 may at least temporarily maintain the plurality of memory addresses with memory addresses 624-a, e.g., maintained in a data structure such as a look-up table (LUT) .
According to some examples, apparatus 600 may include log logic 622-3. Log logic 622-3 may be executed or implemented by circuitry 620 to generate a log that includes the plurality of memory addresses. For these examples, log logic 622-3 may also determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log. Log logic 622-3 may also, responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration. Log logic 622-3 may thus track changes in vGPU memory pages and the log maintained by log logic 622-3 may be included in dirty page log 624-b (e.g., maintained in a LUT) .
In some examples, apparatus 600 may include table logic 622-4. Table logic 622-4 may be executed or implemented by circuitry 620 to add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table. For these examples, table logic 622-4 may at least temporarily include the added memory addresses in dirty page table 624-c (e.g., maintained in a LUT).
According to some examples, apparatus 600 may include copy logic 622-5. Copy logic 622-5 may be executed or implemented by circuitry 620 to copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table. For these examples, copy logic 622-5 may also determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing and then cause the first memory address to not be copied during the stop-and-copy phase. The first memory address, for example, may be identified as a dirty type 2 vGPU memory page and therefore not copied. Copied vGPU memory page content 630 may include the content from the one or more vGPU memory pages copied during the stop-and-copy phase.
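The skip of no-longer-used dirty pages described above might be sketched as follows. The "type 2" naming comes from the description; the function interface and the "still in use" set are assumptions for the sketch.

```python
def select_pages_to_copy(dirty_addresses, still_in_use):
    """Choose which dirty pages to move during the stop-and-copy phase.

    dirty_addresses: addresses flagged during the pre-memory copy phase.
    still_in_use: addresses still mapped for the VM's application.
    """
    to_copy = []
    for addr in dirty_addresses:
        if addr in still_in_use:
            to_copy.append(addr)
        # else: a dirty "type 2" page no longer arranged to store VM
        # content; skipping it shrinks the stop-and-copy payload.
    return to_copy

dirty = [0x1000, 0x2000, 0x3000]
in_use = {0x1000, 0x3000}  # 0x2000 was released by the guest driver
print([hex(a) for a in select_pages_to_copy(dirty, in_use)])
# → ['0x1000', '0x3000']
```

Because stop-and-copy runs while the VM is paused, every page excluded here directly reduces the VM's downtime.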
Various components of apparatus 600 may be communicatively coupled to each other by various types of communications media to coordinate operations. The coordination may involve the uni-directional or bi-directional exchange of information. For instance, the components may communicate information in the form of signals communicated over the communications media. The information can be implemented as signals allocated to various signal lines. In such allocations, each message is a signal. Further embodiments, however, may alternatively employ data messages. Such data messages may be sent across various connections. Example connections include parallel interfaces, serial interfaces, and bus interfaces.
FIG. 7 illustrates an example of a logic flow 700. Logic flow 700 may be representative of some or all of the operations executed by one or more logic, features, or devices described herein, such as apparatus 600. More particularly, logic flow 700 may be implemented by at least initiate logic 622-1, identify logic 622-2, log logic 622-3, table logic 622-4 or copy logic 622-5. 
According to some examples, logic flow 700 at block 702 may initiate, at a source server, a pre-memory copy phase to live migrate a VM to a destination server. For these examples, initiate logic 622-1 may initiate the pre-memory copy phase.
In some examples, logic flow 700 at block 704 may identify a plurality of memory addresses for memory allocated to a vGPU supporting the VM, the plurality of memory addresses to include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing, the vGPU allocated graphics hardware resources at the source server. For these examples, identify logic 622-2 may identify the plurality of memory addresses.
According to some examples, logic flow 700 at block 706 may generate a log that includes the plurality of memory addresses. For these examples, log logic 622-3 may generate the log (e.g., a dirty page log) .
In some examples, logic flow 700 at block 708 may determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log. For these examples, log logic 622-3 may determine the initial hash values and add these values to the log.
According to some examples, logic flow 700 at block 710 may determine, following a first of at least two copy iterations, current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration. For these examples, log logic 622-3 may determine the current hash values.
In some examples, logic flow 700 at block 712 may add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table. For these examples, table logic 622-4 may add the memory addresses to the dirty page table.
According to some examples, logic flow 700 at block 714 may copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table. For these examples, copy logic 622-5 may copy the content.
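Blocks 702 through 714 can be condensed into one illustrative routine. The two-snapshot model (pages before and after a copy iteration) and the XOR-fold hash are simplifying assumptions for the sketch, not the disclosed implementation.

```python
def page_hash(content: bytes) -> int:
    """XOR-fold a page's 8-byte words into one hash word."""
    h = 0
    for i in range(0, len(content), 8):
        h ^= int.from_bytes(content[i:i + 8], "little")
    return h

def migrate_dirty_pages(initial_pages, current_pages):
    # Blocks 704-708: identify addresses and log their initial hashes.
    log = {addr: page_hash(c) for addr, c in initial_pages.items()}
    # Blocks 710-712: after a copy iteration, flag mismatching hashes.
    dirty = {addr for addr, c in current_pages.items()
             if addr in log and page_hash(c) != log[addr]}
    # Block 714: stop-and-copy moves only the dirty pages.
    return {addr: current_pages[addr] for addr in dirty}

initial = {1: b"AAAAAAAA", 2: b"BBBBBBBB", 3: b"CCCCCCCC"}
current = {1: b"AAAAAAAA", 2: b"XXXXXXXX", 3: b"CCCCCCCC"}  # page 2 rewritten
print(sorted(migrate_dirty_pages(initial, current)))  # → [2]
```

In the flow as described, the compare-and-copy step repeats for at least two iterations before the final stop-and-copy; this sketch shows a single compare for brevity.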
FIG. 8 illustrates an example of a storage medium 800. Storage medium 800 may comprise an article of manufacture. In some examples, storage medium 800 may include any  non-transitory computer readable medium or machine readable medium, such as an optical, magnetic or semiconductor storage. Storage medium 800 may store various types of computer executable instructions, such as instructions to implement logic flow 700. Examples of a computer readable or machine readable storage medium may include any tangible media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. Examples of computer executable instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, object-oriented code, visual code, and the like. The examples are not limited in this context.
FIG. 9 illustrates an example computing platform 900. In some examples, as shown in FIG. 9, computing platform 900 may include a processing component 940, other platform components 950 or a communications interface 960. According to some examples, computing platform 900 may be implemented in a server. The server may be capable of coupling through a network to other servers and may be part of a datacenter including a plurality of network connected servers arranged to support VMs executing applications using vGPU allocated graphics HW resources at the servers.
According to some examples, processing component 940 may execute processing operations or logic for apparatus 600 and/or storage medium 800. Processing component 940 may include various hardware elements, software elements, or a combination of both. Examples of hardware elements may include devices, logic devices, components, processors, microprocessors, circuits, processor circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth) , integrated circuits, application specific integrated circuits (ASIC) , programmable logic devices (PLD) , digital signal processors (DSP) , field programmable  gate array (FPGA) , memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. Examples of software elements may include software components, programs, applications, computer programs, application programs, device drivers, system programs, software development programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API) , instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an example is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given example.
In some examples, other platform components 950 may include common computing elements, such as one or more processors, multi-core processors, co-processors, memory units, chipsets, controllers, peripherals, interfaces, oscillators, timing devices, video cards, audio cards, multimedia input/output (I/O) components (e.g., digital displays) , power supplies, and so forth. Examples of memory units may include without limitation various types of computer readable and machine readable storage media in the form of one or more higher speed memory units, such as read-only memory (ROM) , random-access memory (RAM) , dynamic RAM (DRAM) , Double-Data-Rate DRAM (DDRAM) , synchronous DRAM (SDRAM) , static RAM (SRAM) , programmable ROM (PROM) , erasable programmable ROM (EPROM) , electrically erasable programmable ROM (EEPROM) , types of non-volatile memory such as 3-D cross-point memory that may be byte or block addressable. Non-volatile types of memory may also include other  types of byte or block addressable non-volatile memory such as, but not limited to, multi-threshold level NAND flash memory, NOR flash memory, single or multi-level PCM, resistive memory, nanowire memory, FeTRAM, MRAM that incorporates memristor technology, STT-MRAM, or a combination of any of the above. Other types of computer readable and machine readable storage media may also include magnetic or optical cards, an array of devices such as Redundant Array of Independent Disks (RAID) drives, solid state memory devices (e.g., USB memory) , solid state drives (SSD) and any other type of storage media suitable for storing information.
In some examples, communications interface 960 may include logic and/or features to support a communication interface. For these examples, communications interface 960 may include one or more communication interfaces that operate according to various communication protocols or standards to communicate over direct or network communication links or channels. Direct communications may occur via use of communication protocols or standards described in one or more industry standards (including progenies and variants) such as those associated with the PCIe specification. Network communications may occur via use of communication protocols or standards such as those described in one or more Ethernet standards promulgated by IEEE. For example, one such Ethernet standard may include IEEE 802.3. Network communication may also occur according to one or more OpenFlow specifications such as the OpenFlow Hardware Abstraction API Specification.
As mentioned above computing platform 900 may be implemented in a server of a datacenter. Accordingly, functions and/or specific configurations of computing platform 900 described herein, may be included or omitted in various embodiments of computing platform 900, as suitably desired for a server deployed in a datacenter.
The components and features of computing platform 900 may be implemented using any combination of discrete circuitry, ASICs, logic gates and/or single chip architectures. Further, the features of computing platform 900 may be implemented using microcontrollers, programmable logic arrays and/or microprocessors or any combination of the foregoing where suitably appropriate. It is noted that hardware, firmware and/or software elements may be collectively or individually referred to herein as “logic” or “circuit. ”
It should be appreciated that the exemplary computing platform 900 shown in the block diagram of FIG. 9 may represent one functionally descriptive example of many potential implementations. Accordingly, division, omission or inclusion of block functions depicted in the accompanying figures does not imply that the hardware components, circuits, software and/or elements for implementing these functions would necessarily be divided, omitted, or included in embodiments.
One or more aspects of at least one example may be implemented by representative instructions stored on at least one machine-readable medium which represents various logic within the processor, which when read by a machine, computing device or system causes the machine, computing device or system to fabricate logic to perform the techniques described herein. Such representations, known as “IP cores” may be stored on a tangible, machine readable medium and supplied to various customers or manufacturing facilities to load into the fabrication machines that actually make the logic or processor.
Various examples may be implemented using hardware elements, software elements, or a combination of both. In some examples, hardware elements may include devices, components, processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth) , integrated circuits, ASIC, programmable logic devices (PLD) , digital  signal processors (DSP) , FPGA, memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. In some examples, software elements may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API) , instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an example is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given implementation.
Some examples may include an article of manufacture or at least one computer-readable medium. A computer-readable medium may include a non-transitory storage medium to store logic. In some examples, the non-transitory storage medium may include one or more types of computer-readable storage media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. In some examples, the logic may include various software elements, such as software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, API, instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof.
According to some examples, a computer-readable medium may include a non-transitory storage medium to store or maintain instructions that when executed by a machine, computing device or system, cause the machine, computing device or system to perform methods and/or operations in accordance with the described examples. The instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, and the like. The instructions may be implemented according to a predefined computer language, manner or syntax, for instructing a machine, computing device or system to perform a certain function. The instructions may be implemented using any suitable high-level, low-level, object-oriented, visual, compiled and/or interpreted programming language.
Some examples may be described using the expression “in one example” or “an example” along with their derivatives. These terms mean that a particular feature, structure, or characteristic described in connection with the example is included in at least one example. The appearances of the phrase “in one example” in various places in the specification are not necessarily all referring to the same example.
Included herein are logic flows or schemes representative of example methodologies for performing novel aspects of the disclosed architecture. While, for purposes of simplicity of explanation, the one or more methodologies shown herein are shown and described as a series of acts, those skilled in the art will understand and appreciate that the methodologies are not limited by the order of acts. Some acts may, in accordance therewith, occur in a different order and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram. Moreover, not all acts illustrated in a methodology may be required for a novel implementation.
A logic flow or scheme may be implemented in software, firmware, and/or hardware. In software and firmware embodiments, a logic flow or scheme may be implemented by computer executable instructions stored on at least one non-transitory computer readable medium or machine readable medium, such as an optical, magnetic or semiconductor storage. The embodiments are not limited in this context.
Some examples may be described using the expression "coupled" and "connected" along with their derivatives. These terms are not necessarily intended as synonyms for each other. For example, descriptions using the terms “connected” and/or “coupled” may indicate that two or more elements are in direct physical or electrical contact with each other. The term "coupled, ” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.
The following examples pertain to additional examples of technologies disclosed herein.
Example 1. An example apparatus may include circuitry at a source server and logic for execution by the circuitry. The logic may initiate, at the source server, a pre-memory copy phase to live migrate a VM to a destination server. The logic may also identify a plurality of memory addresses for memory allocated to a vGPU supporting the VM. The plurality of memory addresses may include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing. The vGPU may be allocated graphics hardware resources at the source server. The logic may also generate a log that includes the plurality of memory addresses. The logic may also determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log. The logic may also, responsive to a first of at least two copy iterations of the pre- memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration. The logic may also add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table. The logic may also copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
Example 2. The apparatus of example 1, may also include the logic to determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing. The logic may also cause the first memory address to not be copied during the stop-and-copy phase.
Example 3. The apparatus of example 1, the logic may determine the initial and current hash values via use of an XOR hash algorithm.
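One possible XOR hash consistent with example 3 is sketched below; the word size and byte order are illustrative choices that the examples do not specify.

```python
def xor_hash(page: bytes, word: int = 8) -> int:
    """XOR-fold a page into one word; cheap enough to rerun per copy pass."""
    h = 0
    for i in range(0, len(page), word):
        h ^= int.from_bytes(page[i:i + word], "little")
    return h

page = bytes(range(64))
h0 = xor_hash(page)
mutated = bytes([page[0] ^ 0xFF]) + page[1:]  # flip one byte
assert xor_hash(mutated) != h0  # the change is detected
assert xor_hash(page) == h0     # and the hash is deterministic
```

An XOR fold trades collision resistance for speed: writes that cancel bit-wise yield the same hash, which is the trade-off against stronger digests such as CRCs or cryptographic hashes.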
Example 4. The apparatus of example 1, the logic to identify the plurality of memory addresses may include the logic to scan a GGTT and a PPGTT maintained in a guest operating system for the VM. The GGTT and the PPGTT may translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
Example 5. The apparatus of example 4, the logic may also monitor the GGTT or the PPGTT and update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in  the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
Example 6. The apparatus of example 1 may also include a digital display coupled to the circuitry to present a user interface view.
Example 7. An example method may include initiating, at a source server, a pre-memory copy phase to live migrate a VM to a destination server. The method may also include identifying a plurality of memory addresses for memory allocated to a vGPU supporting the VM. The plurality of memory addresses may include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing. The vGPU may be allocated graphics hardware resources at the source server. The method may also include generating a log that includes the plurality of memory addresses. The method may also include determining initial hash values for respective vGPU memory pages corresponding to the plurality of memory addresses based on content stored in the respective vGPU memory pages and adding the initial hash values to the log. The method may also include determining, following a first of at least two copy iterations of the pre-memory copy phase, current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration. The method may also include adding memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table. The method may also include copying content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
Example 8. The method of example 7 may also include determining whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM  executing the application that uses the vGPU for graphics processing. The method may also include causing the first memory address to not be copied during the stop-and-copy phase.
Example 9. The method of example 7 may also include determining the initial and current hash values using an XOR hash algorithm.
Example 10. The method of example 7, identifying the plurality of memory addresses may include scanning a GGTT and a PPGTT maintained in a guest operating system for the VM. The GGTT and the PPGTT may translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
Example 11. The method of example 10 may also include monitoring the GGTT or the PPGTT and updating the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
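The monitoring of example 11 might reconcile the tracked address set against the current table contents, as in this sketch (the dictionary model of GGTT/PPGTT entries is an assumption made for illustration):

```python
def update_tracked_addresses(tracked, ggtt, ppgtt):
    """Drop unmapped pages from, and add newly mapped pages to, the set."""
    current = set(ggtt.values()) | set(ppgtt.values())
    removed = tracked - current  # no longer included in either table
    added = current - tracked    # mapped since the previous scan
    return (tracked | added) - removed, removed, added

tracked = {0x1000, 0x2000}
ggtt = {0x0000: 0x1000}
ppgtt = {0x0000: 0x4000}  # 0x2000 was unmapped; 0x4000 was added
tracked, removed, added = update_tracked_addresses(tracked, ggtt, ppgtt)
print(sorted(hex(a) for a in tracked))  # → ['0x1000', '0x4000']
```

Keeping the tracked set in step with the tables prevents hashing pages the vGPU no longer uses and ensures newly mapped pages enter the dirty-page log.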
Example 12. An example at least one machine readable medium may include a plurality of instructions that in response to being executed by a system may cause the system to carry out a method according to any one of examples 7 to 11.
Example 13. An example apparatus may include means for performing the methods of any one of examples 7 to 11.
Example 14. An example at least one machine readable medium may include a plurality of instructions that in response to being executed by a system may cause the system to initiate, at a source server, a pre-memory copy phase to live migrate a VM to a destination server. The instructions may also cause the system to identify a plurality of memory addresses for memory  allocated to a vGPU supporting the VM, the plurality of memory addresses to include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing. The vGPU may be allocated graphics hardware resources at the source server. The instructions may also cause the system to generate a log that includes the plurality of memory addresses and determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log. The instructions may also cause the system to, responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration. The instructions may also cause the system to add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table. The instructions may also cause the system to copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
Example 15. The at least one machine readable medium of example 14, the instructions may also cause the system to determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing. The instructions may also cause the system to cause the first memory address to not be copied during the stop-and-copy phase.
Example 16. The at least one machine readable medium of example 14, the instructions may further cause the system to determine the initial and current hash values via use of an XOR hash algorithm.
Example 17. The at least one machine readable medium of example 14, the instructions to identify the plurality of memory addresses may further cause the system to scan a GGTT and a PPGTT maintained in a guest operating system for the VM. The GGTT and the PPGTT may translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
Example 18. The at least one machine readable medium of example 14, the instructions may further cause the system to monitor the GGTT or the PPGTT and update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
Example 19. An example system may include circuitry at a source server. The system may also include memory at the source server coupled with the circuitry. The system may also include graphics hardware at the source server coupled with the memory. The system may also include logic for execution by the circuitry. The logic may initiate, at the source server, a pre-memory copy phase to live migrate a VM to a destination server. The logic may also identify a plurality of memory addresses for a portion of the memory that is allocated to a vGPU supporting the VM. The plurality of memory addresses may include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing. The vGPU may be allocated at least a portion of graphics processing capabilities of the graphics hardware. The logic may generate a log that includes the plurality of  memory addresses and determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log. The logic may also, responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration. The logic may also add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table. The logic may also copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
Example 20. The system of example 19, the logic may also determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing. The logic may also cause the first memory address to not be copied during the stop-and-copy phase.
Example 21. The system of example 19, the logic may determine the initial and current hash values via use of an XOR hash algorithm.
Example 22. The system of example 19, the logic to identify the plurality of memory addresses may include the logic to scan a GGTT and a PPGTT maintained in a guest operating system for the VM. The GGTT and the PPGTT may translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
Example 23. The system of example 22, the logic may also monitor the GGTT or the PPGTT and update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
It is emphasized that the Abstract of the Disclosure is provided to comply with 37 C. F. R. Section 1.72 (b) , requiring an abstract that will allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single example for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed examples require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed example. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate example. In the appended claims, the terms "including" and "in which" are used as the plain-English equivalents of the respective terms "comprising" and "wherein, " respectively. Moreover, the terms "first, " "second, " "third, " and so forth, are used merely as labels, and are not intended to impose numerical requirements on their objects.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (23)

  1. An apparatus comprising:
    circuitry at a source server; and
    logic for execution by the circuitry to:
    initiate, at the source server, a pre-memory copy phase to live migrate a virtual machine (VM) to a destination server;
    identify a plurality of memory addresses for memory allocated to a virtual graphics processing unit (vGPU) supporting the VM, the plurality of memory addresses to include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing, the vGPU allocated graphics hardware resources at the source server;
    generate a log that includes the plurality of memory addresses;
    determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log;
    responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration;
    add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table; and
    copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  2. The apparatus of claim 1, further comprising the logic to:
    determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing; and
    cause the first memory address to not be copied during the stop-and-copy phase.
  3. The apparatus of claim 1, the logic to determine the initial and current hash values via use of an XOR hash algorithm.
  4. The apparatus of claim 1, the logic to identify the plurality of memory addresses comprises the logic to scan a global graphics translation table (GGTT) and a per-process graphics translation table (PPGTT) maintained in a guest operating system for the VM, the GGTT and the PPGTT to translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
  5. The apparatus of claim 4, comprising the logic to:
    monitor the GGTT or the PPGTT; and
    update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
  6. The apparatus of claim 1, comprising a digital display coupled to the circuitry to present a user interface view.
  7. A method comprising:
    initiating, at a source server, a pre-memory copy phase to live migrate a virtual machine (VM) to a destination server;
    identifying a plurality of memory addresses for memory allocated to a virtual graphics processing unit (vGPU) supporting the VM, the plurality of memory addresses including vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing, the vGPU allocated graphics hardware resources at the source server;
    generating a log that includes the plurality of memory addresses;
    determining initial hash values for respective vGPU memory pages corresponding to the plurality of memory addresses based on content stored in the respective vGPU memory pages and adding the initial hash values to the log;
    determining, following a first of at least two copy iterations of the pre-memory copy phase, current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration;
    adding memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table; and
    copying content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  8. The method of claim 7, further comprising:
    determining whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing; and
    causing the first memory address to not be copied during the stop-and-copy phase.
  9. The method of claim 7, determining the initial and current hash values using an XOR hash algorithm.
  10. The method of claim 7, identifying the plurality of memory addresses comprises scanning a global graphics translation table (GGTT) and a per-process graphics translation table (PPGTT) maintained in a guest operating system for the VM, the GGTT and the PPGTT to translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
  11. The method of claim 10, comprising:
    monitoring the GGTT or the PPGTT; and
    updating the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
  12. At least one machine readable medium comprising a plurality of instructions that in response to being executed by a system cause the system to carry out a method according to any one of claims 7 to 11.
  13. An apparatus comprising means for performing the methods of any one of claims 7 to 11.
  14. At least one machine readable medium comprising a plurality of instructions that in response to being executed by a system cause the system to:
    initiate, at a source server, a pre-memory copy phase to live migrate a virtual machine (VM) to a destination server;
    identify a plurality of memory addresses for memory allocated to a virtual graphics processing unit (vGPU) supporting the VM, the plurality of memory addresses to include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing, the vGPU allocated graphics hardware resources at the source server;
    generate a log that includes the plurality of memory addresses;
    determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log;
    responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration;
    add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table; and
    copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  15. The at least one machine readable medium of claim 14, further comprising the instructions to cause the system to:
    determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing; and
    cause the first memory address to not be copied during the stop-and-copy phase.
  16. The at least one machine readable medium of claim 14, the instructions to further cause the system to determine the initial and current hash values via use of an XOR hash algorithm.
  17. The at least one machine readable medium of claim 14, the instructions to cause the system to identify the plurality of memory addresses comprising the instructions to cause the system to scan a global graphics translation table (GGTT) and a per-process graphics translation table (PPGTT) maintained in a guest operating system for the VM, the GGTT and the PPGTT to translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
  18. The at least one machine readable medium of claim 17, comprising the instructions to further cause the system to:
    monitor the GGTT or the PPGTT; and
    update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
  19. A system comprising:
    circuitry at a source server;
    memory at the source server coupled with the circuitry;
    graphics hardware at the source server coupled with the memory; and
    logic for execution by the circuitry to:
    initiate, at the source server, a pre-memory copy phase to live migrate a virtual machine (VM) to a destination server;
    identify a plurality of memory addresses for a portion of the memory that is allocated to a virtual graphics processing unit (vGPU) supporting the VM, the plurality of memory addresses to include vGPU memory pages arranged to store content associated with the VM executing an application that uses the vGPU for graphics processing, the vGPU allocated at least a portion of graphics processing capabilities of the graphics hardware;
    generate a log that includes the plurality of memory addresses;
    determine initial hash values for respective vGPU memory pages that correspond to the plurality of memory addresses based on content stored in the respective vGPU memory pages and add the initial hash values to the log;
    responsive to a first of at least two copy iterations of the pre-memory copy phase, determine current hash values for the respective vGPU memory pages based on content stored in the respective vGPU memory pages following the first iteration;
    add memory addresses from among the plurality of memory addresses to a dirty page table based on current hash values not matching initial hash values for the memory addresses added to the dirty page table; and
    copy content included in one or more vGPU memory pages during a stop-and-copy phase to live migrate the VM to the destination server based, at least in part, on whether the one or more vGPU memory pages correspond to the memory addresses added to the dirty page table.
  20. The system of claim 19, further comprising the logic to:
    determine whether a first memory address from among the one or more memory addresses added to the dirty page table corresponds to a memory page that is no longer arranged to store content associated with the VM executing the application that uses the vGPU for graphics processing; and
    cause the first memory address to not be copied during the stop-and-copy phase.
  21. The system of claim 19, the logic to determine the initial and current hash values via use of an XOR hash algorithm.
  22. The system of claim 19, the logic to identify the plurality of memory addresses comprises the logic to scan a global graphics translation table (GGTT) and a per-process graphics translation table (PPGTT) maintained in a guest operating system for the VM, the GGTT and the PPGTT to translate graphics hardware memory addresses to physical memory addresses for the memory allocated to the vGPU.
  23. The system of claim 22, comprising the logic to:
    monitor the GGTT or the PPGTT; and
    update the plurality of memory addresses based on changes to the GGTT or the PPGTT that indicate one or more of the plurality of memory addresses are no longer included in the GGTT or the PPGTT or additional memory addresses have been added to the GGTT or the PPGTT.
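The GGTT/PPGTT handling in claims 4-5 (scan the tables, then drop addresses that disappear and pick up newly added ones) and the claim 2 exclusion of pages no longer backing vGPU content can be sketched together as follows. Representing each translation table as a flat dict from graphics address to physical address is an assumption for illustration only; real GGTT/PPGTT entries are hardware page-table structures, and the function names are hypothetical.

```python
def mapped_addresses(ggtt: dict, ppgtt: dict) -> set:
    """Scan both translation tables for the physical addresses they map."""
    return set(ggtt.values()) | set(ppgtt.values())


def refresh_tracked(tracked: set, ggtt: dict, ppgtt: dict):
    """Update the tracked address set after the tables change (claim 5)."""
    current = mapped_addresses(ggtt, ppgtt)
    removed = tracked - current   # no longer in the GGTT or PPGTT
    added = current - tracked     # newly added to the GGTT or PPGTT
    return current, removed, added


def stop_and_copy_set(dirty: set, ggtt: dict, ppgtt: dict) -> set:
    """Copy only dirty pages that still back vGPU content (claim 2)."""
    return dirty & mapped_addresses(ggtt, ppgtt)
```

Under this sketch, a dirty page whose mapping has been dropped from both tables is excluded from the stop-and-copy transfer, which is the behavior claims 2, 8, 15, and 20 recite.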

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/090990 WO2019000358A1 (en) 2017-06-30 2017-06-30 Techniques for live migration support for graphics processing unit virtualization

Publications (1)

Publication Number Publication Date
WO2019000358A1 true WO2019000358A1 (en) 2019-01-03

Family

ID=64740287


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102307208A (en) * 2010-09-25 2012-01-04 广东电子工业研究院有限公司 Virtual machine operation control device and operation control method based on cloud computing
US20140157258A1 (en) * 2012-11-30 2014-06-05 International Business Machines Corporation Hardware contiguous memory region tracking
US20150324236A1 (en) * 2014-05-12 2015-11-12 The Research Foundation For The State University Of New York Gang migration of virtual machines using cluster-wide deduplication
US20160299773A1 (en) * 2014-11-12 2016-10-13 Intel Corporation Live migration of virtual machines from/to host computers with graphics virtualization

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111736943A (en) * 2019-03-25 2020-10-02 阿里巴巴集团控股有限公司 Virtual machine migration method and system
EP3951590A4 (en) * 2019-03-25 2022-12-21 Alibaba Group Holding Limited VIRTUAL MACHINE MIGRATION METHOD AND SYSTEM
US12417112B2 (en) 2019-03-25 2025-09-16 Alibaba Group Holding Limited Virtual machine migration method and system
WO2025238440A1 (en) * 2024-05-16 2025-11-20 云智能资产控股(新加坡)私人股份有限公司 Virtual machine migration method, and device, computer program product and storage medium


Legal Events

121: Ep — the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17916217; Country of ref document: EP; Kind code of ref document: A1)
NENP: Non-entry into the national phase (Ref country code: DE)
122: Ep — PCT application non-entry in European phase (Ref document number: 17916217; Country of ref document: EP; Kind code of ref document: A1)