CN114281529A - Distributed virtualization guest operating system scheduling optimization method, system and terminal

Info

Publication number: CN114281529A
Application number: CN202111508295.XA
Authority: CN
Inventors: 管海兵; 李嘉森; 余博识; 贾兴国; 项羽心; 戚正伟
Original assignee: Shanghai Jiao Tong University
Current assignee: Shanghai Jiao Tong University
Priority date: 2021-12-10
Filing date: 2021-12-10
Publication date: 2022-04-05
Anticipated expiration: 2041-12-10
Also published as: CN114281529B

Abstract

The present invention provides a scheduling optimization method and system for a distributed virtualized guest operating system. Physical machine CPU cores are allocated to virtual machine vCPUs, each virtual machine vCPU is bound to the node where its physical machine CPU is located, and the virtual machine vCPU information is transmitted to the guest operating system. The scheduling policy in the guest operating system is modified according to the virtual machine vCPU information, and the allocation of computing tasks to guest operating system vCPU cores is completed directly by the guest operating system. A group of computing tasks with frequent information interaction is allocated to the same virtual machine node, thereby optimizing the scheduling policy of the distributed virtualized guest operating system. A corresponding terminal and medium are also provided. By modifying the scheduling policy of the guest operating system, the invention allocates computing tasks with frequent information interaction to the same node, reducing both the amount and the cost of information interaction between nodes.

Description

Distributed virtualized guest operating system scheduling optimization method, system and terminal

Technical Field

The invention relates to the technical field of computer virtualization and distributed systems, in particular to a scheduling optimization method, system and terminal for a distributed virtualized guest operating system, and also provides a corresponding computer readable storage medium.

Background

Distributed virtualization, as shown in fig. 1, refers to performing remote memory access between machines over a communication link, so as to implement distributed sharing of memory, CPU, and I/O resources. Such distributed virtualization may also provide access support for other hardware devices such as GPUs. For example, a giant virtual machine (GiantVM) abstracts the hardware resources of multiple machines to provide massive computing and I/O resources to one or more virtual machines, thereby serving application scenarios with extremely high resource and performance requirements.

In existing distributed virtualization, memory sharing is achieved using network I/O. For example, GiantVM adds a number of functional modules on top of QEMU-KVM, including IPI forwarding, interrupt forwarding, I/O forwarding, clock synchronization and distributed shared memory modules, and the machines are connected through an RDMA network.

However, network I/O has a much larger overhead than ordinary local memory access. To reduce the per-byte share of this overhead, the high latency is amortized over more bytes per transfer, so the minimum granularity of cross-node memory sharing, which is the most important form of cross-node information interaction or data transmission, cannot be too small. In addition, some existing distributed virtualization work, such as GiantVM, implements cross-node memory sharing by rewriting the page fault handling mechanism, so the minimum granularity of cross-node memory sharing is no smaller than one page.

The minimum granularity of existing distributed virtualized cross-node memory sharing can cause performance loss. Because this granularity is far larger than a cache line, the probability of false sharing in an unmodified program is much higher, which causes a large performance drop. False sharing refers to the contention that arises when two threads access two memory addresses that are not actually shared but lie within the same minimum sharing unit. In addition, even if there is little false sharing, a program with a great deal of true sharing may run slower on the virtual machine than on a single machine because of the large overhead of network I/O.
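As a minimal illustration of why a coarse sharing granularity raises the chance of false sharing, the following sketch checks whether two addresses fall into the same sharing unit. The 64-byte cache line and 4096-byte page are typical values assumed here rather than figures taken from this document, and the function name `same_unit` is hypothetical.

```python
CACHE_LINE = 64   # assumed typical cache-line size in bytes
PAGE = 4096       # assumed typical page size in bytes


def same_unit(addr_a: int, addr_b: int, granularity: int) -> bool:
    """Return True if both addresses lie in the same sharing unit."""
    return addr_a // granularity == addr_b // granularity


# Two unrelated variables 1 KiB apart, written by two different threads.
a, b = 0x10000, 0x10400

# On a single machine the sharing unit is a cache line: no false sharing.
falsely_shared_on_one_machine = same_unit(a, b, CACHE_LINE)

# Across nodes the sharing unit is at least one page: false sharing occurs.
falsely_shared_across_nodes = same_unit(a, b, PAGE)
```

Under these assumptions the two variables never contend on one machine, yet every write to either of them would invalidate the same page when the two threads run on different nodes.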

As can be seen from the above, existing distributed virtualization technology has a serious problem: when computing tasks are distributed across multiple physical devices, some tasks on different nodes frequently share memory or otherwise exchange information, which results in high overhead.

Disclosure of Invention

In view of the defects in the prior art, the invention provides a scheduling optimization method, system and terminal for a distributed virtualized guest operating system, and also provides a corresponding computer readable storage medium.

According to one aspect of the invention, a scheduling optimization method for a distributed virtualized guest operating system is provided, which comprises the following steps:

allocating a physical machine CPU core to a virtual machine vCPU, binding the virtual machine vCPU to the node where the physical machine CPU is located, and transmitting information of the virtual machine vCPU to a guest operating system;

modifying a scheduling policy in the guest operating system according to the information of the virtual machine vCPU, and completing the allocation of computing tasks to guest operating system vCPU cores directly through the guest operating system;

and allocating a group of computing tasks with frequent information interaction to the same virtual machine node through the modified scheduling policy, thereby optimizing the scheduling policy of the distributed virtualized guest operating system.

Preferably, the binding the virtual machine vCPU and the node where the physical machine CPU is located includes:

statically binding the virtual machine vCPU to the node where the physical machine CPU is located;

or

temporarily binding the virtual machine vCPU to the node where the physical machine CPU is located.

Preferably, the information of the virtual machine vCPU includes:

information about the node where each vCPU is located and information about the vCPUs contained in each node.
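The two pieces of vCPU information named above can be pictured as a pair of mappings. The sketch below is only an illustration: the document does not specify a data format, so the dictionary layout, the function name `build_topology` and the four-vCPU binding are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def build_topology(vcpu_to_node: Dict[int, int]) -> Dict[int, List[int]]:
    """Invert a vCPU -> node map into a node -> sorted vCPU list map."""
    node_to_vcpus: Dict[int, List[int]] = defaultdict(list)
    for vcpu, node in vcpu_to_node.items():
        node_to_vcpus[node].append(vcpu)
    return {node: sorted(vcpus) for node, vcpus in node_to_vcpus.items()}


# Hypothetical binding: vCPUs 0-1 run on node 0, vCPUs 2-3 run on node 1.
vcpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1}   # node where each vCPU is located
node_to_vcpus = build_topology(vcpu_to_node)  # vCPUs contained in each node
```

A guest scheduler that holds both mappings can answer "which node is this vCPU on" and "which vCPUs may I use to stay on this node" without consulting the hypervisor again.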

Preferably, the modifying the scheduling policy in the guest operating system according to the information of the virtual machine vCPU includes:

according to the information of the vCPU of the virtual machine, computing tasks with frequent information interaction are placed on the same node, and computing tasks with infrequent information interaction are placed on different nodes to form a new scheduling strategy; wherein:

frequent information interaction means that the communication cost caused by the information interaction is greater than a set threshold; wherein the communication cost comprises: the number of inter-node communications and the inter-node communication traffic.
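The threshold test above can be sketched as follows. The document names the two components of the communication cost but not how they are combined, so the weighted sum, the weights and the threshold value below are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class CommStats:
    count: int     # number of inter-node communications observed
    traffic: int   # inter-node communication traffic in bytes


def communication_cost(stats: CommStats,
                       w_count: float = 1.0,
                       w_traffic: float = 0.001) -> float:
    """Combine both components into one cost (assumed weighted sum)."""
    return w_count * stats.count + w_traffic * stats.traffic


def is_frequent(stats: CommStats, threshold: float) -> bool:
    """Interaction is 'frequent' when its cost exceeds the set threshold."""
    return communication_cost(stats) > threshold


THRESHOLD = 120.0  # hypothetical set threshold
chatty = is_frequent(CommStats(count=100, traffic=50_000), THRESHOLD)
quiet = is_frequent(CommStats(count=10, traffic=1_000), THRESHOLD)
```

Tasks for which `is_frequent` holds are candidates for placement on the same node; the others may be spread across nodes.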

Preferably, the communication cost caused by the information interaction is obtained from the thread-process relationship, or by using the Perf analysis tool to collect information on cross-node memory accesses.

Preferably, completing the allocation of computing tasks to guest operating system vCPU cores directly through the guest operating system comprises:

completing the allocation by using auxiliary tools, or by modifying the kernel of the guest operating system.
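One way an auxiliary tool might perform the allocation without touching the kernel is through the Linux CPU affinity interface. The sketch below is an assumption-laden illustration, not the method of the invention: the helper names and the node-to-vCPU map are hypothetical, and `os.sched_setaffinity` is available on Linux only.

```python
import os
from typing import Dict, List, Set


def node_mask(node_to_vcpus: Dict[int, List[int]], node: int) -> Set[int]:
    """Return the set of guest vCPU ids that belong to one node."""
    return set(node_to_vcpus[node])


def pin_to_node(pid: int, node_to_vcpus: Dict[int, List[int]], node: int) -> Set[int]:
    """Restrict a process to the vCPUs of one node (Linux only).

    pid 0 means the calling process. Raises OSError if none of the
    requested CPUs exists in the guest.
    """
    mask = node_mask(node_to_vcpus, node)
    os.sched_setaffinity(pid, mask)
    return mask


# Hypothetical topology passed in from the virtualization layer.
topology = {0: [0, 1], 1: [2, 3]}
mask_for_node_1 = node_mask(topology, 1)
```

Calling `pin_to_node(0, topology, 1)` would confine the calling process to vCPUs 2 and 3 in this hypothetical topology. The `taskset` command achieves a comparable effect from a shell, for example `taskset -cp 2,3 <pid>`.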

Preferably, the modified scheduling policy includes any one or more of the following:

- confining all threads of the same process to the same virtual machine node;

- sampling the frequency of information interaction between threads and assigning the threads so as to minimize the amount of cross-node sharing;

the modified scheduling policy reduces the information interaction cost of the distributed virtual machine.
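The sampling-based policy can be sketched as a two-stage procedure: first merge threads whose sampled interaction exceeds a threshold into groups, then place each group on a single node. The greedy placement, the union-find grouping and every name below are assumptions made for illustration; the document does not prescribe a particular algorithm.

```python
from typing import Dict, List, Tuple


def group_threads(threads: List[int],
                  samples: Dict[Tuple[int, int], int],
                  threshold: int) -> List[List[int]]:
    """Merge threads whose sampled interaction count exceeds the threshold."""
    parent = {t: t for t in threads}

    def find(t: int) -> int:
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for (a, b), count in samples.items():
        if count > threshold:
            parent[find(a)] = find(b)

    groups: Dict[int, List[int]] = {}
    for t in threads:
        groups.setdefault(find(t), []).append(t)
    # Largest groups first so they get the emptiest nodes.
    return sorted(groups.values(), key=len, reverse=True)


def place_groups(groups: List[List[int]],
                 node_capacity: Dict[int, int]) -> Dict[int, int]:
    """Greedily put each whole group on the node with the most free vCPUs.

    A group larger than the free capacity overcommits the node rather
    than being split, since splitting would reintroduce cross-node sharing.
    """
    free = dict(node_capacity)
    placement: Dict[int, int] = {}
    for group in groups:
        node = max(free, key=lambda n: free[n])
        for t in group:
            placement[t] = node
        free[node] -= len(group)
    return placement


# Hypothetical samples: threads 1-2 and 3-4 interact heavily, 2-3 rarely.
samples = {(1, 2): 90, (3, 4): 80, (2, 3): 5}
groups = group_threads([1, 2, 3, 4], samples, threshold=50)
placement = place_groups(groups, {0: 2, 1: 2})
```

With these sample counts, threads 1 and 2 land on one node and threads 3 and 4 on the other, so only the rare 2-3 interaction crosses nodes.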

According to another aspect of the present invention, there is provided a guest operating system scheduling optimization system for distributed virtualization, comprising:

an information processing module, which allocates a physical machine CPU core to a virtual machine vCPU, binds the virtual machine vCPU to the node where the physical machine CPU is located, and transmits the information of the virtual machine vCPU to a guest operating system;

a scheduling policy modification module, which modifies the scheduling policy in the guest operating system according to the information of the virtual machine vCPU and completes the allocation of computing tasks to guest operating system vCPU cores directly through the guest operating system;

and a scheduling policy optimization module, which allocates a group of computing tasks with frequent information interaction to the same virtual machine node through the modified scheduling policy, thereby optimizing the scheduling policy of the distributed virtualized guest operating system.

According to a third aspect of the present invention, there is provided a terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor being operable to perform the method of any one of the above, or to operate the system as described above, when executing the program.

According to a fourth aspect of the invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the method of any one of the above or to operate the system described above.

Due to the adoption of the technical scheme, compared with the prior art, the invention has the following beneficial effects:

The scheduling optimization method, system, terminal and medium for a distributed virtualized guest operating system provided by the invention modify the scheduling policy of the guest operating system and allocate computing tasks with frequent information interaction to the same node, thereby reducing both the amount and the cost of information interaction between nodes.

The scheduling optimization method, system, terminal and medium for a distributed virtualized guest operating system provided by the invention realize, on the basis of virtualization, a novel technique for scheduling virtualized physical resources.

The scheduling optimization method, system, terminal and medium for a distributed virtualized guest operating system provided by the invention can reduce the frequency of memory sharing thrashing in distributed virtualization, and bind computing tasks with frequent information interaction to the same node according to different application scenarios and application constraints, greatly improving the performance of distributed virtualization.

The scheduling optimization method, system, terminal and medium for a distributed virtualized guest operating system provided by the invention modify the scheduling policy in the guest operating system, namely the vCPU scheduling policy, which is used to schedule the CPUs of different nodes, so that computing tasks likely to share frequently are placed on the same node as far as possible.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a diagram illustrating a distributed virtualization technique in the prior art.

FIG. 2 is a flowchart illustrating a method for scheduling optimization of a guest operating system for distributed virtualization according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating a method for scheduling optimization of a guest operating system for distributed virtualization in accordance with a preferred embodiment of the present invention.

FIG. 4 is a block diagram of a guest operating system scheduling system for distributed virtualization according to an embodiment of the present invention.

Detailed Description

The following examples illustrate the invention in detail. The embodiments are implemented on the premise of the technical solution of the invention, and detailed implementations and specific operating procedures are given. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention.

Fig. 2 is a flowchart of a method for scheduling and optimizing a guest operating system in distributed virtualization according to an embodiment of the present invention.

As shown in fig. 2, the method for scheduling and optimizing a guest operating system in distributed virtualization according to this embodiment may include the following steps:

S100, allocating a physical machine CPU core to a virtual machine vCPU, binding the virtual machine vCPU to the node where the physical machine CPU is located, and transmitting information of the virtual machine vCPU to a guest operating system;

S200, modifying a scheduling policy in the guest operating system according to the information of the virtual machine vCPU, and completing the allocation of computing tasks to guest operating system vCPU cores directly through the guest operating system;

S300, allocating a group of computing tasks with frequent information interaction to the same virtual machine node through the modified scheduling policy, thereby optimizing the scheduling policy of the distributed virtualized guest operating system.

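In S100 of this embodiment, as a preferred embodiment, binding the virtual machine vCPU to the node where the physical machine CPU is located includes:

statically binding the virtual machine vCPU to the node where the physical machine CPU is located;

or

temporarily binding the virtual machine vCPU to the node where the physical machine CPU is located.

In S100 of this embodiment, as a preferred embodiment, the information of the virtual machine vCPU includes:

information about the node where each vCPU is located and information about the vCPUs contained in each node.

In S200 of this embodiment, as a preferred embodiment, modifying the scheduling policy in the guest operating system according to the information of the virtual machine vCPU includes:

according to the information of the virtual machine vCPU, placing computing tasks with frequent information interaction on the same node and computing tasks with infrequent information interaction on different nodes to form a new scheduling policy; wherein:

frequent information interaction means that the communication cost caused by the information interaction is greater than a set threshold; wherein the communication cost comprises: the number of inter-node communications and the inter-node communication traffic.

Further, as a preferred embodiment, the communication cost caused by information interaction is obtained from the thread-process relationship, or by using the Perf analysis tool to collect information on cross-node memory accesses.

In S200 of this embodiment, as a preferred embodiment, completing the allocation of computing tasks to guest operating system vCPU cores directly through the guest operating system includes:

completing the allocation by using auxiliary tools, or by modifying the kernel of the guest operating system.

In S300 of this embodiment, as a preferred embodiment, the modified scheduling policy includes any one or more of the following:

- confining all threads of the same process to the same virtual machine node;

- sampling the frequency of information interaction between threads and assigning the threads so as to minimize the amount of cross-node sharing;

the modified scheduling policy reduces the information interaction cost of the distributed virtual machine.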

Fig. 3 is a schematic diagram of a scheduling optimization method for a guest operating system of distributed virtualization according to a preferred embodiment of the present invention.

As shown in fig. 3, the method for scheduling and optimizing a guest operating system in distributed virtualization according to this embodiment may include two parts: allocating physical machine CPU cores to virtual machine vCPUs, and allocating guest operating system vCPU cores.

Wherein:

Allocating the physical machine CPU cores to the virtual machine vCPUs comprises the following:

The physical machine CPU cores are allocated to the vCPUs, each virtual machine vCPU is statically or temporarily bound to the node where its physical machine CPU is located, and the information of the virtual machine vCPUs is transmitted to the guest operating system. Further allocation of vCPU cores is performed by the guest operating system.

Allocating the guest operating system vCPU cores comprises the following:

The scheduling policy in the guest operating system is modified. The allocation of computing tasks to guest operating system vCPU cores is completed directly by the guest operating system. Through the modified scheduling policy, a group of computing tasks with frequent information interaction is allocated to the same virtual machine node.

In a specific application example, the modified scheduling policy is: all threads of the same process are limited to the same virtual machine node. The scheduling strategy of the present invention is not limited to the simple implementation strategies listed above.

The scheduling optimization method for the distributed virtualized guest operating system provided in the preferred embodiment is designed based on the architecture shown in fig. 3, and classifies computing tasks according to information interaction frequency, specifically:

In the original distributed virtual machine, each VM virtual node has a plurality of vCPUs, and each Host physical node has a plurality of CPUs. The default operating system of the distributed virtual machine uses a load balancing scheduling policy: the scheduler in the operating system distributes computing tasks evenly across the vCPUs in the VMs, and these vCPUs correspond to CPUs in the Host nodes. As a result, computing tasks that communicate heavily with each other end up distributed across different Host nodes, which causes a large communication time cost.

The technical solution provided by the embodiment of the invention is mainly an optimization targeting this defect. Having identified the defect that arises when an unmodified operating system runs on a distributed virtual machine, the embodiment of the invention provides a method for accelerating the distributed virtual machine by modifying the scheduling policy in the operating system. The scheduling policy allocates computing tasks that communicate closely to vCPUs corresponding to the same Host node, and allocates computing tasks that do not communicate closely to vCPUs of different Host nodes, so that the number of cross-node (Host) communications and the total communication time cost are reduced and the performance of the distributed virtual machine is improved.
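The difference between the default load balancing placement and the communication-aware placement can be made concrete with a small worked example. The task names, the communication counts and the two-node layout below are hypothetical values chosen only to show the effect.

```python
from typing import Dict, Tuple


def cross_node_cost(placement: Dict[str, int],
                    comm: Dict[Tuple[str, str], int]) -> int:
    """Sum the communication counts of task pairs placed on different nodes."""
    return sum(count for (a, b), count in comm.items()
               if placement[a] != placement[b])


# Hypothetical workload: t0-t1 and t2-t3 communicate closely, t1-t2 rarely.
comm = {("t0", "t1"): 100, ("t2", "t3"): 100, ("t1", "t2"): 1}

# Load balancing spreads tasks evenly with no regard to communication.
load_balanced = {"t0": 0, "t1": 1, "t2": 0, "t3": 1}

# The modified policy keeps closely communicating tasks on one Host node.
communication_aware = {"t0": 0, "t1": 0, "t2": 1, "t3": 1}

cost_before = cross_node_cost(load_balanced, comm)
cost_after = cross_node_cost(communication_aware, comm)
```

Both placements put two tasks on each node, so the load is equally balanced, but in this example the cross-node communication count drops from 201 to 1.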

A standard operating system reserves an interface that lets programmers conveniently modify the scheduling policy. Some embodiments of the invention use this interface to control scheduling in the operating system so that scheduling is performed according to the information of the virtual machine vCPU, thereby achieving acceleration. Besides this interface, the invention may also use other interfaces for the same purpose, including cpuset, tasksched, taskset, and so on. These interfaces are all free software, and whichever form is adopted falls within the protection scope of the invention.

The technical solutions provided by the above embodiments of the present invention are further described in detail below with reference to a specific application example. However, it should be noted that the platform for using the technical solution provided by the above embodiment of the present invention is not limited to the following example.

In this specific application example, the deployment is a cluster consisting of three general-purpose servers, each equipped with a network card supporting InfiniBand. The servers are connected to a central InfiniBand switch by optical fiber. The technical solution provided by the above embodiment of the invention is not limited by the type, configuration or number of hosts, and can be extended to a cluster of any number of hosts greater than one. Nor is it limited by the network card or network device, and any type of network card and network device may be used.

Each server runs Ubuntu Server 16.04.1 LTS 64-bit and GiantVM, and has two CPUs providing 56 cores in total, along with 64 GB of memory. The development described here is based on the source code of GiantVM, QEMU 2.9.0 and Linux kernel 4.8.10 as an illustration; the method is also applicable to other virtual machines, virtual machine managers and other versions of the Linux kernel.

The giant virtual machine has 168 vCPUs, each corresponding to one physical core; 56 vCPUs run on local physical cores, and the remaining 112 vCPUs run on the other two remote servers. The giant virtual machine has 192 GB of distributed shared memory, of which 64 GB is local memory and the remaining 128 GB is remote memory accessed at high speed over RDMA. The virtual machine also owns and can use I/O devices located on different computers, such as GPUs and FPGAs. The technical solution provided by the above embodiment of the invention is not limited by the number of vCPUs, the memory size, the I/O devices or the communication protocol; protocols other than RDMA may also be optimized using this technical solution.
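The arithmetic of this deployment (3 servers with 56 cores each, giving 168 vCPUs) suggests a simple vCPU-to-node lookup. The sketch below assumes that vCPUs are numbered contiguously per node, which the document does not state; the function names are hypothetical.

```python
VCPUS_PER_NODE = 56   # one vCPU per physical core, 56 cores per server
NODE_COUNT = 3        # three servers in the example cluster
TOTAL_VCPUS = VCPUS_PER_NODE * NODE_COUNT


def node_of(vcpu: int) -> int:
    """Map a vCPU id to its node, assuming contiguous numbering per node."""
    if not 0 <= vcpu < TOTAL_VCPUS:
        raise ValueError(f"vCPU id {vcpu} out of range")
    return vcpu // VCPUS_PER_NODE


def vcpus_of(node: int) -> range:
    """Return the vCPU ids hosted by one node under the same assumption."""
    if not 0 <= node < NODE_COUNT:
        raise ValueError(f"node id {node} out of range")
    return range(node * VCPUS_PER_NODE, (node + 1) * VCPUS_PER_NODE)
```

Under this numbering, vCPUs 0 to 55 would be the local ones and vCPUs 56 to 167 would run on the two remote servers.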

The guest operating system is a modified Ubuntu Server 16.04.1 LTS 64-bit, in which the scheduling policy allocates all threads of the same process to the same virtual node. The technical solution provided by the above embodiment of the invention is not limited by the guest operating system, and any other guest operating system can be modified by similar means. Nor is it limited by the scheduling policy; the invention can optimize information interaction using different specific policies.

The technical solution provided by the above embodiment of the invention includes but is not limited to the scheduling policy "the same process is bound to the same virtual node". The rationale for why this policy accelerates distributed virtualization is as follows. If vCPUs are freely allocated by the guest operating system, the underlying virtual machine cannot group the mixed computing tasks onto physical machine CPUs according to whether they share heavily. If the guest operating system instead places each process on a single virtual node, the same process never spans nodes on the physical machines, so all inter-thread communication stays within one node. Because different processes have different address spaces, they cannot share memory directly in the first place. Inter-process shared memory, such as that set up with mmap, has a minimum granularity of one page, and processes exchange information only as needed, so false sharing almost never occurs unless it is intentional.
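The "same process on the same virtual node" policy can be sketched as a small bookkeeping class. The choice of the least-loaded node for a new process is an assumption added for illustration, as are the class and method names; the document only requires that all threads of one process share a node.

```python
from typing import Dict, Iterable


class ProcessNodePolicy:
    """Assign every thread of a process to the node chosen for that process."""

    def __init__(self, nodes: Iterable[int]) -> None:
        self.load: Dict[int, int] = {n: 0 for n in nodes}
        self.node_of_process: Dict[int, int] = {}

    def node_for_thread(self, pid: int) -> int:
        # The first thread of a process picks the least-loaded node (assumed
        # tie-break: lowest insertion order); later threads reuse that node.
        if pid not in self.node_of_process:
            self.node_of_process[pid] = min(self.load, key=self.load.get)
        node = self.node_of_process[pid]
        self.load[node] += 1
        return node


policy = ProcessNodePolicy([0, 1, 2])
nodes_for_a = [policy.node_for_thread(100) for _ in range(4)]  # process 100
node_for_b = policy.node_for_thread(200)                        # process 200
```

All four threads of the hypothetical process 100 receive the same node, while process 200 is steered to a different, less loaded node.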

Fig. 4 is a schematic diagram illustrating constituent modules of a distributed virtualized guest operating system scheduling optimization system according to an embodiment of the present invention.

As shown in fig. 4, the distributed virtualized guest operating system scheduling optimization system provided in this embodiment may include the following modules:

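an information processing module, which allocates a physical machine CPU core to a virtual machine vCPU, binds the virtual machine vCPU to the node where the physical machine CPU is located, and transmits the information of the virtual machine vCPU to the guest operating system;

a scheduling policy modification module, which modifies the scheduling policy in the guest operating system according to the information of the virtual machine vCPU and completes the allocation of computing tasks to guest operating system vCPU cores directly through the guest operating system;

and a scheduling policy optimization module, which allocates a group of computing tasks with frequent information interaction to the same virtual machine node through the modified scheduling policy, thereby optimizing the scheduling policy of the distributed virtualized guest operating system.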

An embodiment of the present invention provides a terminal, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor is configured to execute the method according to any one of the above embodiments of the present invention when executing the computer program.

An embodiment of the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the method of any of the above embodiments.

The scheduling optimization method, system, terminal and medium for a distributed virtualized guest operating system provided by the above embodiments of the invention allocate computing tasks with frequent information interaction to the same node by modifying the scheduling policy of the guest operating system, thereby reducing both the amount and the cost of information interaction between nodes. On the basis of virtualization, they realize a novel technique for scheduling virtualized physical resources that is based on the guest operating system scheduling policy, so that the distributed shared memory achieves better performance and cross-node communication overhead is reduced. They can reduce the frequency of information interaction thrashing in distributed virtualization and bind computing tasks with frequent information interaction to the same node according to different application scenarios and application constraints, greatly improving the performance of distributed virtualization. They modify the scheduling policy in the guest operating system, namely the vCPU scheduling policy, to schedule the CPUs of different nodes, so that computing tasks likely to share frequently are placed on the same node as far as possible.

Those skilled in the art will appreciate that, besides implementing the system and its devices provided by the invention purely as computer readable program code, the method steps can be logically programmed so that the system and its devices achieve the same functions in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and its devices provided by the invention can be regarded as a hardware component, and the devices included in it for realizing various functions can also be regarded as structures within the hardware component; the devices for realizing the functions can also be regarded both as software modules for performing the method and as structures within the hardware component.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims

1. A distributed virtualized guest operating system scheduling optimization method, characterized by comprising:

Allocate the physical machine CPU core to the virtual machine vCPU, bind the virtual machine vCPU to the node where the physical machine CPU is located, and transmit the information of the virtual machine vCPU to the guest operating system;

Modify the scheduling policy in the guest operating system according to the information of the virtual machine vCPU, and directly complete the guest operating system vCPU core allocation of the computing task through the guest operating system;

Through the modified scheduling strategy, a group of computing tasks with frequent information interaction are allocated to the same virtual machine node, so as to realize the optimization of the scheduling strategy of the distributed virtualized guest operating system.

2. The distributed virtualized guest operating system scheduling optimization method according to claim 1, wherein the binding of the virtual machine vCPU to the node where the physical machine CPU is located comprises:

Statically bind the virtual machine vCPU and the node where the physical machine CPU is located;

or

Temporarily bind the virtual machine vCPU to the node where the physical machine CPU resides.

3. The distributed virtualization guest operating system scheduling optimization method according to claim 1, wherein the information of the virtual machine vCPU comprises:

Information about the node where each vCPU is located and the information about the vCPU contained in each node.

4. The distributed virtualization guest operating system scheduling optimization method according to claim 1, wherein the modifying the scheduling policy in the guest operating system according to the information of the virtual machine vCPU comprises:

According to the information of the virtual machine vCPU, the computing tasks with frequent information interaction are placed on the same node, and the computing tasks with infrequent information interaction are placed on different nodes to form a new scheduling strategy; wherein:

The frequent information interaction means that the communication cost caused by the information interaction is greater than a set threshold; wherein the communication cost includes: the number of inter-node communications and the inter-node communication traffic.

5. The distributed virtualized guest operating system scheduling optimization method according to claim 1, wherein the communication cost caused by the information interaction is obtained from the thread-process relationship, or by using the Perf analysis tool to collect information on cross-node memory accesses.

6. The distributed virtualized guest operating system scheduling optimization method according to claim 1, wherein the guest operating system vCPU core allocation of the computing task is directly completed by the guest operating system, comprising:

Allocation is done by using auxiliary tools, or by modifying the kernel of the guest operating system.

7. The distributed virtualized guest operating system scheduling optimization method according to claim 1, wherein the modified scheduling policy comprises any one or more of the following:

- Limit all threads of the same process to the same virtual machine node;

- Sampling the frequency of information exchange between threads and assigning them to minimize the number of sharing across nodes;

The modified scheduling strategy reduces the information interaction cost of distributed virtual machines.

8. A distributed virtualized guest operating system scheduling optimization system, comprising:

an information processing module, which allocates the physical machine CPU core to the virtual machine vCPU, binds the virtual machine vCPU to the node where the physical machine CPU is located, and transmits the information of the virtual machine vCPU to the guest operating system;

A scheduling policy modification module, which modifies the scheduling policy in the guest operating system according to the information of the virtual machine vCPU, and directly completes the guest operating system vCPU core allocation of the computing task through the guest operating system;

A scheduling strategy optimization module, which allocates a group of computing tasks with frequent information interaction to the same virtual machine node through the modified scheduling strategy, so as to realize the optimization of the scheduling strategy of the distributed virtualized guest operating system.

9. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that, when the processor executes the program, it can be used to execute the method of any one of claims 1-7, or to run the system of claim 8.

10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the program can be used to execute the method according to any one of claims 1-7, or to run the system of claim 8.