CN114816718B

CN114816718B - A virtual heterogeneous global task scheduling and tuning method based on homogeneous processors

Info

Publication number: CN114816718B
Application number: CN202210611427.XA
Authority: CN
Inventors: 杨昊
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Metabrain Intelligent Technology Co Ltd
Priority date: 2022-05-31
Filing date: 2022-05-31
Publication date: 2024-10-22
Anticipated expiration: 2042-05-31
Also published as: CN114816718A

Abstract

The invention relates to the technical field of servers, and in particular to a virtual heterogeneous global task scheduling and tuning method based on homogeneous processors. An operating system decides whether to enable a virtual heterogeneous processing function according to whether the actual workload is greater than a preset load threshold. If the actual workload is less than the preset load threshold, the cores do not all need to run at their rated operating point, so the virtual heterogeneous processing function is enabled and the non-heterogeneous CPU of the physical layer is virtualized at core granularity, in a homogeneous manner, to obtain virtual cores. The operating system divides the virtualized CPU into a plurality of clusters for regulation according to the SNC4 policy. Building on the homogeneous processor architecture, the method realizes virtualization at core granularity, forms a corresponding heterogeneous multiprocessing architecture, allocates tasks according to core capability, and provides core-granularity power regulation based on the PCU and FIVR, which can effectively solve the problem that management granularity is too coarse under the homogeneous CPU architecture of existing servers.

Description

Virtual heterogeneous global task scheduling and tuning method based on homogeneous processors

Technical Field

The invention relates to the technical field of servers, and in particular to a virtual heterogeneous global task scheduling and tuning method based on homogeneous processors.

Background

With the continuing rise of cloud computing and its derivative technologies and products, internet traffic has grown explosively. As the physical carrier of this virtual data, the server fundamentally determines the upper limit of cloud computing through its capacity to store, process, and exchange data. As internet user traffic increases, real-time network data throughput also grows geometrically. With interactions at this scale, balancing the working efficiency and the energy consumption of server nodes has become a technical challenge.

How can a server node be both efficient and energy-saving? In the current server market, products for non-embedded application scenarios are mostly designed around a homogeneous multiprocessor architecture, namely the common two-way, four-way, or eight-way servers. The CPUs in these server architectures are all identical homogeneous CPUs; that is, every core looks the same to the system.

Under this architecture, processor power consumption is mostly regulated through frequency scaling. For example, Intel's EIST (Enhanced Intel SpeedStep Technology) automatically adjusts the processor's voltage and frequency according to the system workload in order to reduce power draw and heat output.

But this conflicts with the goal of being both efficient and energy-saving, because on a homogeneous architecture an overall reduction in frequency means an overall reduction in efficiency. ARM therefore introduced the big.LITTLE architecture, which achieves the goal by pairing a high-performance big core with several highly energy-saving little cores while the OS distributes and migrates tasks between them. Intel likewise introduced the Lakefield and Alder Lake generations after a big/little-core scheduling algorithm was added to the Win10 scheduler.

Both kinds of CPU, ARM's and Intel's, are based on MCP (Multi-Chip Package) or MCM (Multi-Chip Module) technology and implement HMP (Heterogeneous Multiprocessing) on a single chip. In terms of process, they rely on 3D packaging, using an interposer or EMIB (Embedded Multi-die Interconnect Bridge) to stack chiplets. Stacked chiplets necessarily increase the TDP (Thermal Design Power) per unit area, so the technology is currently used in embedded products.

Server products have more threads and more resources, their overall performance allocation is more complex, and their base CPU die is larger than that of embedded products. Implementing chiplets with packaging similar to HBM (High Bandwidth Memory) technology would raise stacking costs, and current packaging processes do not suit the server operating environment, so developing such products for compute-intensive machines like servers is not realistic.

For the existing general-purpose server architecture, take an Intel Eagle Stream platform 4-way server as an example. The essence of the big.LITTLE architecture is a physical combination of big and little cores, which cannot be applied to current server products. Current server products use a homogeneous processor architecture, and when the overall workload (WL) exceeds the service demand and the service requirements are complex, they cannot achieve the combination of efficiency and energy saving that a heterogeneous processor architecture offers. Where both are sought, EIST is generally used, and its granularity is coarse.

Existing heterogeneous regulation works at the chiplet level, a granularity that does not suit existing server scenarios. Therefore, existing schemes cannot deliver both high efficiency and energy saving under the existing x86 general-purpose server architecture.

Disclosure of Invention

To solve the above problems, the invention provides a virtual heterogeneous global task scheduling and tuning method based on homogeneous processors, which realizes virtual heterogeneity under the existing server architecture and is both efficient and energy-saving. Building on the homogeneous processor architecture of existing x86 server platforms, the method reproduces the "distribution according to work" workload allocation of a heterogeneous processor architecture: heterogeneous virtualization is performed on homogeneous hardware, and global task scheduling saves power while still meeting performance requirements, thereby refining the granularity of management.

In order to achieve the above object, the embodiment of the present invention provides the following technical solutions:

In a first aspect, an embodiment of the present invention provides a virtual heterogeneous global task scheduling and tuning method based on homogeneous processors, including the following steps:

an operating system (OS) decides whether to enable a virtual heterogeneous multiprocessing (Virtually Heterogeneous MultiProcessing) function according to whether the actual workload (WorkLoad) is greater than a preset load threshold; if the actual workload is less than the preset load threshold, the cores do not all need to run at their rated operating point, so the virtual heterogeneous multiprocessing function is enabled and the non-heterogeneous CPU (Anisomeric CPU) of the physical layer (Physical Layer) is virtualized at core granularity, in a homogeneous manner, to obtain virtual cores;

the operating system divides the virtualized CPU into a plurality of clusters (Cluster) for regulation according to the SNC4 policy;

when the operating system receives a command (Command), it determines task priority according to actual parameters of the operating system and outputs a priority list (Priority List);

a Linux scheduler (Linux Scheduler) dispatches each task to the corresponding virtual core according to the priority list; when a task's priority changes, the scheduler (Scheduler) migrates the task between cores (Core);

and when the operating system determines that the current actual workload has reached the preset load threshold, it exits the virtualization mode and the CPU operates normally at the physical layer.

As a further aspect of the invention, the operating system decides whether to enable the virtual heterogeneous processing function according to whether the actual workload is greater than the preset load threshold; if the actual workload is greater than the preset load threshold, all cores need to run at their rated operating point, so the virtual heterogeneous processing function is not enabled and the system runs under its original regulation policy.

As a further aspect of the invention, virtualizing the non-heterogeneous CPU of the physical layer at core granularity in a homogeneous manner to obtain virtual cores includes:

two cores (Core) of the homogeneous CPU are virtualized into one efficient core (VCore_Efficient) by homogeneous virtualization; the efficient core runs at the rated frequency and carries tasks with high time sensitivity and heavy workload;

the other cores of the homogeneous CPU are correspondingly virtualized as energy-saving cores (VCore_Energy Saving), which carry tasks with low time sensitivity and light workload.

As a further aspect of the invention, virtualizing the non-heterogeneous CPU of the physical layer at core granularity in a homogeneous manner to obtain virtual cores further includes: arranging and packaging the efficient cores and the energy-saving cores according to the actual fan-out packaging of the chip.

As a further aspect of the invention, virtualizing the non-heterogeneous CPU of the physical layer at core granularity in a homogeneous manner to obtain virtual cores yields a virtual heterogeneous multiprocessing unit (Virtually Heterogeneous MultiProcessing Unit).

As a further aspect of the invention, the operating system dividing the virtualized CPU into a plurality of clusters for regulation according to the SNC4 policy includes:

performing task scheduling in units of virtual cores and allocating tasks according to task priority.

As a further aspect of the invention, the operating system dividing the virtualized CPU into a plurality of clusters for regulation according to the SNC4 policy further includes:

pairing the efficient cores and the energy-saving cores one by one to form clusters, and managing them in units of clusters.

As a further aspect of the invention, the actual parameters of the operating system include time sensitivity and workload.

As a further aspect of the invention, the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors further includes: providing core-granularity power regulation (Power Regulating) based on the PCU (Package Control Unit) and the FIVR (Fully Integrated Voltage Regulator), in which the PCU controls the frequency of the CPU cores, so that the frequency of each core can be managed, and the FIVR controls the supply voltage (Vcc) of the CPU cores, so that the power supply of each core can be switched off individually.

As a further scheme of the invention, a temperature sensor is also connected in the CPU core, the temperature sensor is used for detecting the temperature of the CPU core, and the FIVR is used for judging whether to turn off the power supply of the corresponding core according to the temperature detected by the temperature sensor.

As a further aspect of the invention, when a task's priority changes, the scheduler (Scheduler) migrates the task between cores (Core); if the operating system determines that the task's priority has dropped, the process is switched to the corresponding efficient core or energy-saving core within the same cluster.

In a second aspect, a further embodiment of the present invention provides a computer device including a memory and a processor, where the memory stores a computer program and the processor, when loading and executing the computer program, implements the steps of the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors.

In a third aspect, a further embodiment of the present invention provides a storage medium storing a computer program which, when loaded and executed by a processor, implements the steps of the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors.

The technical scheme provided by the invention has the following beneficial effects:

The virtual heterogeneous global task scheduling and tuning method based on homogeneous processors realizes virtualization at core granularity on top of a homogeneous processor architecture and forms a corresponding heterogeneous multiprocessing (Heterogeneous MultiProcessing) architecture. It introduces the big.LITTLE concept under the existing server architecture, allocates tasks according to core capability, and makes the system as a whole efficient and energy-saving. In addition, it provides a core-granularity power regulation (Power Regulating) scheme based on the PCU and FIVR, which can effectively solve the problem that management granularity is too coarse under the homogeneous CPU architecture of current servers, achieving both energy saving and high efficiency.

The method is a core-granularity power management strategy built on core virtualization. Through global task scheduling over virtual heterogeneous processing, it reproduces the "distribution according to work" workload allocation of a heterogeneous processor architecture, performs heterogeneous virtualization on homogeneous hardware, and saves power while meeting performance requirements, with finer-grained management.

These and other aspects of the invention will be more readily apparent from the following description of the embodiments. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the following description will briefly introduce the drawings that are needed in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the present invention. In the drawings:

Fig. 1 is a schematic diagram of Linux scheduler operation under the ARM big.LITTLE architecture.

Fig. 2 is a schematic diagram of the 4-CPU interconnect topology of an Intel Eagle Stream platform.

Fig. 3 is a flowchart of a virtual heterogeneous global task scheduling and tuning method based on homogeneous processors according to an embodiment of the present invention.

Fig. 4 is a topological schematic diagram of the overall architecture of the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors according to an embodiment of the present invention.

Fig. 5 is a schematic diagram of a homogeneous CPU in the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors according to an embodiment of the present invention.

Fig. 6 is a schematic diagram of the internal core interconnection of a homogeneous CPU in the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors according to an embodiment of the present invention.

Fig. 7 is a diagram of the CPU core P-V mapping relationship in the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors according to an embodiment of the present invention.

Fig. 8 is a mapping relationship diagram of CPU core P-V in the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors according to an embodiment of the present invention.

Fig. 9 is a simplified schematic diagram of CPU virtual core (Virtual Core) interconnection in the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors according to an embodiment of the present invention.

Fig. 10 is a schematic diagram of balanced management under the SNC4 policy in the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

Some of the flows described in the specification, the claims, and the above figures contain a number of operations that appear in a particular order. It should be understood that these operations may be performed out of the order in which they appear, or in parallel; sequence numbers such as 101 and 102 merely distinguish the operations and do not themselves represent any order of execution. In addition, the flows may include more or fewer operations, which may be performed sequentially or in parallel. Note that the terms "first" and "second" in the present application distinguish different messages, devices, modules, and so on; they do not represent a sequence, nor do they require that the "first" and "second" items be of different types.

Technical solutions in exemplary embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in exemplary embodiments of the present invention, and it is apparent that the described exemplary embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.

Because current packaging processes do not suit the server operating environment, developing such products for compute-intensive machines like servers is not realistic. The application therefore provides a virtual heterogeneous global task scheduling and tuning method based on homogeneous processors, which realizes virtual heterogeneity under the existing server architecture and is both efficient and energy-saving.

ARM's big.LITTLE big/little-core architecture is currently applied to embedded products, while general-purpose servers keep their existing CPU architecture. Take ARM's A15+A7 combination as an example. It combines the performance of the Cortex-A15 with the energy efficiency of the Cortex-A7: the A15 serves as the big core, runs at higher voltage and frequency, consumes more energy, and handles heavy computing tasks; the A7 is the little core, with half the performance of the A15 but only 1/7 of the power consumption.
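As a rough worked example of why the small core saves energy, the sketch below uses only the two ratios quoted above (half the performance, 1/7 of the power). It assumes, purely for illustration, that a task is a fixed amount of work and that each core draws constant power while running it; the function name is hypothetical.

```python
from fractions import Fraction

# Ratios quoted in the text, relative to the big core (A15 = 1).
LITTLE_PERF = Fraction(1, 2)   # A7 delivers half the performance
LITTLE_POWER = Fraction(1, 7)  # A7 draws one seventh of the power


def relative_energy(perf: Fraction, power: Fraction) -> Fraction:
    """Energy for a fixed amount of work, relative to the big core.

    energy = power * time, and time = work / performance.
    """
    relative_time = 1 / perf
    return power * relative_time


energy = relative_energy(LITTLE_PERF, LITTLE_POWER)
print(f"time on little core:   {1 / LITTLE_PERF}x the big core")
print(f"energy on little core: {energy} of the big core")
```

Under these assumptions the task takes twice as long on the little core but uses 2/7 of the energy, which is why work that can tolerate latency is steered there.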

CS (Cluster Switching), IKS (In-Kernel Switching), and GTS (Global Task Scheduling) are then employed in the Linux scheduler to allocate specific task processes intelligently at the OS level, as shown in Fig. 1.

For the existing general-purpose server architecture, taking an Intel Eagle Stream platform 4-way server as an example, an applicable interconnection scheme is shown in Fig. 2: the 4 CPUs must be of the same model and are interconnected through UPI links.

However, when existing general-purpose server architectures manage a heterogeneous processor architecture, they generally either map the heterogeneous processors virtually onto homogeneous processors before handing them to software for task processing and execution, or synchronize the heterogeneous processors into a homogeneous model running at the same frequency. The purpose is to compute the dependency between peak temperature and task allocation and adjust power consumption by allocating according to temperature, or to allocate workload optimally according to the differing performance of different processing components at the same temperature, so as to optimize QoS (Quality of Service).

However, the essence of the big.LITTLE architecture is a physical combination of big and little cores, which cannot be applied to current server products. Moreover, current server products use a homogeneous processor architecture, and when the overall workload (WL) exceeds the service demand and the service requirements are complex, they cannot achieve the combination of efficiency and energy saving that a heterogeneous processor architecture offers. Where both are sought, EIST is generally used, and its granularity is coarse.

In addition, the prior art regulates heterogeneity at the chiplet level, a granularity that does not suit existing server scenarios.

In summary, existing schemes cannot deliver both high efficiency and energy saving under the existing x86 general-purpose server architecture.

For the homogeneous processor architecture of existing x86 server platforms, the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors reproduces the "distribution according to work" workload allocation of a heterogeneous processor architecture, performs heterogeneous virtualization on homogeneous hardware, and through global task scheduling saves power while meeting performance requirements, with finer-grained management.

In particular, embodiments of the present application are further described below with reference to the accompanying drawings.

Referring to Fig. 3, an embodiment of the present invention provides a virtual heterogeneous global task scheduling and tuning method based on homogeneous processors, which specifically includes the following steps:

S1, the operating system decides whether to enable the virtual heterogeneous processing function according to whether the actual workload is greater than a preset load threshold; if the actual workload is less than the preset load threshold, the cores do not all need to run at their rated operating point, so the virtual heterogeneous processing function is enabled and the non-heterogeneous CPU of the physical layer is virtualized at core granularity in a homogeneous manner to obtain virtual cores;

S2, the operating system divides the virtualized CPU into a plurality of clusters for regulation according to the SNC4 policy;

S3, when the operating system receives a command, it determines task priority according to actual parameters of the operating system and outputs a priority list;

S4, the Linux scheduler dispatches each task to the corresponding virtual core according to the priority list; when a task's priority changes, the scheduler migrates the task between cores;

S5, when the operating system determines that the current actual workload has reached the preset load threshold, it exits the virtualization mode and the CPU operates normally at the physical layer.

In this embodiment, the operating system decides whether to enable the virtual heterogeneous processing function according to whether the actual workload is greater than the preset load threshold; if the actual workload is greater than the preset load threshold, all cores need to run at their rated operating point, so the virtual heterogeneous processing function is not enabled and the system runs under its original regulation policy.
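The enable/exit decision in S1 and S5 can be sketched as a single comparison. This is a minimal illustration, not the patented implementation: the names `Mode` and `select_mode`, the use of a utilization fraction as the workload metric, and the 0.80 threshold are all assumptions. The text does not say what happens when the workload exactly equals the threshold; the sketch reads "reaches the threshold" as `>=`.

```python
from enum import Enum


class Mode(Enum):
    PHYSICAL = "physical"  # all cores run under the original policy
    VIRTUAL_HETEROGENEOUS = "virtual_heterogeneous"


def select_mode(actual_workload: float, load_threshold: float) -> Mode:
    """S1/S5: pick the operating mode from the current workload.

    Assumption: a workload equal to the threshold counts as having
    reached it, so only a strictly lower workload enables virtualization.
    """
    if actual_workload < load_threshold:
        return Mode.VIRTUAL_HETEROGENEOUS
    return Mode.PHYSICAL


# Hypothetical workload samples, expressed as a utilization fraction.
for sample in (0.30, 0.55, 0.80, 0.95, 0.40):
    print(f"{sample:.2f} -> {select_mode(sample, load_threshold=0.80).value}")
```

A production policy would probably add hysteresis or a dwell time so that a workload hovering near the threshold does not toggle the mode repeatedly; the source does not describe such a mechanism.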

The virtual heterogeneous global task scheduling and tuning method based on homogeneous processors disclosed in the application builds virtual heterogeneous processing units on the physical basis of a homogeneous CPU architecture, allocating tasks according to core capability and making the system as a whole efficient and energy-saving.

Referring to Fig. 4, a specific scheduling and tuning operation includes the following steps:

S10: the OS (Operating System) decides, according to the actual overall workload, whether to enable the virtual heterogeneous multiprocessing (Virtually Heterogeneous MultiProcessing) function; if not, the system runs under its original regulation policy.

S20: if the OS determines that the current workload (WorkLoad) is not heavy, the cores do not all need to run at their rated operating point. The virtual heterogeneous (Virtually Heterogeneous) function is enabled, and the non-heterogeneous CPU (Anisomeric CPU) of the physical layer (Physical Layer) is virtualized.

S30: the OS divides the virtualized CPU into 4 clusters (Cluster) for regulation according to the SNC4 policy.

S40: when the OS receives a command (Command), it determines task priority according to the actual situation (actual parameters such as time sensitivity and workload) and outputs a priority list (Priority List).

S51: the Linux scheduler (Linux Scheduler) dispatches each task to the appropriate virtual core according to the priority list (Priority List), so that the system is both efficient and energy-saving.

S52: when a change in task priority is detected, the scheduler (Scheduler) is permitted to migrate the task to another core (Core).

S60: when the OS determines that the current workload (WorkLoad) has reached the threshold, the virtualization mode is exited and the CPU operates normally at the physical layer.
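Steps S51 and S52 can be illustrated with a small model of dispatch and migration. This is a sketch under stated assumptions rather than the actual Linux scheduler interface: `Task`, `dispatch`, and `on_priority_change` are hypothetical names, and priority is reduced to a single high/low flag for brevity.

```python
from dataclasses import dataclass

EFFICIENT, ENERGY_SAVING = "efficient", "energy_saving"


@dataclass
class Task:
    name: str
    high_priority: bool
    core_kind: str = ""  # filled in by the scheduler


def dispatch(priority_list: list[Task]) -> None:
    """S51: place each task on the kind of virtual core that suits it."""
    for task in priority_list:
        task.core_kind = EFFICIENT if task.high_priority else ENERGY_SAVING


def on_priority_change(task: Task, high_priority: bool) -> bool:
    """S52: migrate the task if its new priority needs another core kind.

    Returns True when a migration took place.
    """
    task.high_priority = high_priority
    wanted = EFFICIENT if high_priority else ENERGY_SAVING
    if task.core_kind == wanted:
        return False
    task.core_kind = wanted
    return True


tasks = [Task("db-query", True), Task("log-rotation", False)]
dispatch(tasks)
migrated = on_priority_change(tasks[0], high_priority=False)
print(tasks[0].core_kind, "migrated" if migrated else "unchanged")
```

Keeping migration as a separate step that returns whether anything moved makes it easy to count migrations, which matters because each one has a cache and context cost.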

In an embodiment of the present application, the actual parameters of the operating system include time sensitivity and workload.
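One way to turn these two parameters into the priority list of S40 is a weighted score. The source does not specify a formula, so the weighted sum, its weights, the 0.0 to 1.0 scales, and the names `TaskRequest` and `build_priority_list` below are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRequest:
    name: str
    urgency: float   # time sensitivity, 0.0 (can wait) to 1.0 (immediate)
    workload: float  # expected load, 0.0 (light) to 1.0 (heavy)


def build_priority_list(requests, urgency_weight=0.6, workload_weight=0.4):
    """Return requests sorted from highest to lowest priority score.

    The weighted sum and its weights are illustrative assumptions.
    """
    def score(req):
        return urgency_weight * req.urgency + workload_weight * req.workload

    return sorted(requests, key=score, reverse=True)


requests = [
    TaskRequest("log-rotation", urgency=0.1, workload=0.2),
    TaskRequest("db-query", urgency=0.9, workload=0.7),
    TaskRequest("batch-report", urgency=0.2, workload=0.8),
]
for req in build_priority_list(requests):
    print(req.name)
```

With these sample values the urgent, heavy query ranks first and the light, deferrable log rotation ranks last, matching the intent that urgent and heavy work goes to the efficient cores.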

In some embodiments of the present application, virtualizing the non-heterogeneous CPU of the physical layer at core granularity in a homogeneous manner to obtain virtual cores proceeds as follows.

In one embodiment of the present application, an architecture consisting of 4 homogeneous 72-core CPUs, each with 4 MCs (Memory Controller), is taken as an example. Of course, practical application scenarios include but are not limited to this configuration.

Referring to Fig. 5, an embodiment of the present application uses a general-purpose CPU with 4 MCs and 72 cores, in which every two cores share an L2 cache. The 4 CPUs are interconnected in a topology similar to that shown in Fig. 2. Each core (Core) is connected to a cache (Cache), the cache is connected to an MC, and the MC is attached to the memory (Memory) bus; UPI modules (UPI Module) are placed at the four corners of the CPU die (CPU Die) to provide the overall interconnection. Under the original homogeneous architecture, when a command is issued, every core is equal from the OS's point of view because the cores are homogeneous. Suppose the instruction lands on Core 17 of CPU 0: the core first accesses the adjacent cache and, on a cache miss, then accesses main memory (Main Memory), thereby completing a basic instruction.

Fig. 6 is a schematic diagram of the internal core interconnection of a 72-core CPU according to an embodiment of the present application. In the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors, two cores are virtualized into one efficient core (VCore_Efficient) by homogeneous virtualization; running at the rated frequency, it carries tasks with higher time sensitivity and heavier workload (WorkLoad). The other cores are correspondingly virtualized as energy-saving cores (VCore_Energy Saving), which carry tasks with lower time sensitivity and lighter workload (WorkLoad). The specific P-V mapping relationship is shown in Fig. 7.
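The physical-to-virtual mapping can be sketched as a lookup table. Fig. 7 is not reproduced here, so the layout below is a guess made for illustration only: each efficient virtual core is assumed to take one adjacent pair of physical cores (the two that share an L2 cache), and every remaining physical core is assumed to back exactly one energy-saving virtual core. The function name, the virtual core names, and the choice of 24 efficient cores are hypothetical.

```python
def build_pv_mapping(num_physical_cores: int, num_efficient: int) -> dict[str, list[int]]:
    """Map virtual core names to the physical core IDs that back them.

    Assumptions: each efficient vcore takes one adjacent pair (2k, 2k+1),
    i.e. the two cores sharing an L2 cache; every remaining physical core
    backs exactly one energy-saving vcore.
    """
    if 2 * num_efficient > num_physical_cores:
        raise ValueError("not enough physical cores for the efficient vcores")
    mapping: dict[str, list[int]] = {}
    for k in range(num_efficient):
        mapping[f"VCore_Efficient{k}"] = [2 * k, 2 * k + 1]
    remaining = range(2 * num_efficient, num_physical_cores)
    for k, core_id in enumerate(remaining):
        mapping[f"VCore_EnergySaving{k}"] = [core_id]
    return mapping


mapping = build_pv_mapping(num_physical_cores=72, num_efficient=24)
print(len(mapping), "virtual cores")
print(mapping["VCore_Efficient0"], mapping["VCore_EnergySaving0"])
```

Choosing 24 efficient cores on a 72-core CPU leaves 24 physical cores for energy-saving virtual cores, so the two kinds are equal in number and can later be paired one by one into clusters.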

In some embodiments of the present application, virtualizing the non-heterogeneous CPU of the physical layer at core granularity in a homogeneous manner to obtain virtual cores further includes: arranging and packaging the efficient cores and the energy-saving cores according to the actual fan-out packaging of the chip.

In an embodiment of the application, the efficient core (VCore_Efficient) achieves its efficiency function by running at the rated voltage and carrying tasks with higher time sensitivity and heavier workload (WorkLoad). As for how the energy-saving core (VCore_Energy Saving) achieves energy saving: in ARM's big.LITTLE architecture, ARM chose the Cortex-A7, which is designed specifically for energy saving, and integrated it directly at the chiplet level. In the present application, by contrast, the cores of the Anisomeric CPU are physically identical. However, because of the server application scenario, the package substrate (Package Substrate) is relatively large, and the various dies (Die) are integrated on the substrate (Substrate) either directly or through an interposer (Interposer) or EMIB technology. This makes it feasible to adjust the virtualized "little cores" individually, and the arrangement of the two kinds of virtual core can be designed according to the actual fan-out packaging of the chip.

In this way, task processes that are less urgent and lighter in workload can run on the little cores. With the virtual core as the granularity, some functions of a virtual core can be selectively disabled in virtualization, according to the specific business scenario. For example, for tasks of low importance where leakage of the data does not matter, there is no need for system-opaque encryption such as SGX.

As for how to adjust power consumption at virtual-core granularity: Intel's PCU (Package Control Unit) controls the PLLs of the CPU cores, so the frequency of each core can be managed; by controlling Vcc, the power supply of each core can be switched off; and a temperature sensor inside each core is also connected and can serve as one of the decision conditions. Combined with the FIVR (Fully Integrated Voltage Regulator), this is entirely feasible at the hardware level.

In some embodiments of the present application, the method for dispatching and optimizing task in virtual heterogeneous domain based on isomorphic processor further includes: based on PCU (Packaged Control Unit, grouping control unit) and FIVR (Fully Integrated Voltage Ragulator, fully integrated voltage regulation module), the power regulation (Power Regulating) with granularity of the cores is proposed, the PCU is adopted to control the frequency of the CPU cores, the frequency of each core is used for grasping, and the FIVR is adopted to control the power supply voltage (Vcc) of the CPU cores, and the power supply of each core is used for controlling whether the power supply of each core is turned off or not.

The CPU core is also connected with a temperature sensor, the temperature sensor is used for detecting the temperature of the CPU core, and the FIVR is used for judging whether to turn off the power supply of the corresponding core according to the temperature detected by the temperature sensor.
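
The per-core regulation described above can be pictured with a small software model. The following is only a sketch under stated assumptions: real PCU and FIVR control happens in hardware and firmware, and the names `CoreState`, `pcu_set_frequency`, `fivr_update_power`, and the 95 °C limit are hypothetical illustrations that do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class CoreState:
    """Software model of one CPU core's power-related state (hypothetical)."""
    core_id: int
    freq_mhz: int            # frequency, set through the PCU-controlled PLL
    vcc_on: bool = True      # supply state, set through FIVR
    temp_c: float = 40.0     # reading from the per-core temperature sensor


# Assumed limit for illustration; the application gives no threshold value.
TEMP_LIMIT_C = 95.0


def pcu_set_frequency(core: CoreState, freq_mhz: int) -> None:
    """Model of the PCU adjusting the frequency of a single core."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    core.freq_mhz = freq_mhz


def fivr_update_power(core: CoreState, limit_c: float = TEMP_LIMIT_C) -> bool:
    """Model of FIVR deciding, from the sensed temperature, whether to cut
    the supply of a single core. Returns the resulting supply state."""
    if core.temp_c >= limit_c:
        core.vcc_on = False
    return core.vcc_on


# Usage: slow down one core, then power off another that runs too hot.
cores = [CoreState(0, 3000), CoreState(1, 3000, temp_c=97.5)]
pcu_set_frequency(cores[0], 1800)
for c in cores:
    fivr_update_power(c)
```

Each core carries its own frequency, supply state, and temperature, which is the point of the scheme: regulation decisions are made per core rather than per socket.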

With the above problem solved, virtualizing the non-heterogeneous CPU of the physical layer in a homogeneous manner at core granularity to obtain virtual cores yields a virtual heterogeneous multiprocessing unit (Virtually Heterogeneous MultiProcessing Unit), as shown in fig. 8.

In some embodiments of the present application, the operating system divides the virtualized CPU into a plurality of clusters according to the SNC4 policy for regulation, and the regulation method includes:

The arrangement shown in fig. 8 is only an idealized one; the actual arrangement must be determined from the information discussed above, such as the package and the PCU. Task scheduling and management by the OS is carried out on the basis of this arrangement, and practical applications include but are not limited to the examples listed in this application.

As shown in fig. 9, task scheduling is performed with the virtual core as the unit, and tasks are allocated directly according to task priority in a GTS-like manner. Note that fig. 9 illustrates only one of the four partitions under SNC4.

In some embodiments of the present application, dividing the virtualized CPU into a plurality of clusters according to the SNC4 policy for regulation further includes: pairing the efficient cores and the energy-saving cores one to one to form clusters, and managing them in units of clusters. As shown in fig. 10, each efficient core (VCore_Efficient) is paired with one energy-saving core (VCore_Energy Saving) to form a cluster (Cluster), and management is then performed per cluster. The aim is to improve flexibility.
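
The pairing and priority-based allocation described above can be sketched as follows. This is a minimal illustration, not the scheduler of the application: the names `VCore`, `Cluster`, `build_clusters`, `dispatch`, the priority cutoff of 5, and the round-robin choice of cluster are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VCore:
    name: str
    kind: str                       # "efficient" or "energy_saving"
    tasks: List[str] = field(default_factory=list)


@dataclass
class Cluster:
    efficient: VCore                # one efficient core ...
    energy_saving: VCore            # ... paired with one energy-saving core


def build_clusters(n: int) -> List[Cluster]:
    """Pair efficient and energy-saving virtual cores one to one."""
    return [
        Cluster(VCore(f"E{i}", "efficient"), VCore(f"S{i}", "energy_saving"))
        for i in range(n)
    ]


# Assumed cutoff: priorities at or above it go to an efficient core.
PRIORITY_CUTOFF = 5


def dispatch(clusters: List[Cluster],
             tasks: List[Tuple[str, int]]) -> Dict[str, str]:
    """Assign each (task, priority) to a virtual core by priority.

    Tasks are visited from highest to lowest priority and spread over the
    clusters round-robin (an assumption made for this sketch).
    """
    placement: Dict[str, str] = {}
    ordered = sorted(tasks, key=lambda t: -t[1])
    for i, (name, prio) in enumerate(ordered):
        cluster = clusters[i % len(clusters)]
        core = (cluster.efficient if prio >= PRIORITY_CUTOFF
                else cluster.energy_saving)
        core.tasks.append(name)
        placement[name] = core.name
    return placement


clusters = build_clusters(2)
placement = dispatch(clusters, [("db", 9), ("log", 1), ("web", 7)])
```

Here the two high-priority tasks land on efficient cores of different clusters, while the low-priority task lands on an energy-saving core.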

In some embodiments of the present application, when a task's priority changes, the scheduler (Scheduler) migrates the task between cores (Core); if the operating system determines that the task's priority has decreased, the process is switched to the corresponding efficient core or energy-saving core in the same cluster. For example, if a process runs on the efficient core (VCore_Efficient) and, after a period of time, the OS determines that its priority has decreased, the process is switched to the energy-saving core (VCore_Energy Saving) of the same cluster (Cluster). One limitation is that the energy-saving core is modified relative to a normal core, so some features (Feature) may be unavailable; the OS needs to recognize this risk and decide accordingly.
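
The migration rule and its limitation can be sketched as below. This is an illustration under assumptions: the application does not say how the OS decides when a feature is missing, so keeping the task on the efficient core in that case is just one possible policy, and the names `VCore`, `Cluster`, `migrate_on_priority_drop` and the feature labels are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class VCore:
    name: str
    features: Set[str]              # features still available on this core
    tasks: Set[str] = field(default_factory=set)


@dataclass
class Cluster:
    efficient: VCore
    energy_saving: VCore


def migrate_on_priority_drop(cluster: Cluster, task: str,
                             required: Set[str]) -> str:
    """Move `task` to the energy-saving core of the same cluster after its
    priority has dropped, unless that core lacks a required feature.

    Returns the name of the core that hosts the task afterwards.
    """
    if task not in cluster.efficient.tasks:
        raise KeyError(f"{task} is not running on {cluster.efficient.name}")
    missing = required - cluster.energy_saving.features
    if missing:
        # Assumed policy: stay put when a needed feature was disabled.
        return cluster.efficient.name
    cluster.efficient.tasks.remove(task)
    cluster.energy_saving.tasks.add(task)
    return cluster.energy_saving.name


cluster = Cluster(
    efficient=VCore("E0", {"sgx", "avx"}, {"batch", "vault"}),
    energy_saving=VCore("S0", {"avx"}),   # "sgx" disabled on this core
)
moved_to = migrate_on_priority_drop(cluster, "batch", {"avx"})
kept_on = migrate_on_priority_drop(cluster, "vault", {"sgx"})
```

The feature check before the move is where the OS "recognizes the risk": a task that depends on a disabled feature is not moved blindly.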

In an embodiment of the present application, which voltage and which frequency each VCore should operate at needs to be tuned for the specific product according to the actual situation, in combination with the power consumption regulation strategy of the overall system.

Of course, when the overall workload is so high that the virtualized architecture cannot adequately meet the service requirement, the OS can also choose to lift the restrictions and let the cores run at full rated capacity or even overclocked.
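
The load-dependent choice between the virtualized mode and normal or overclocked operation can be sketched as a simple decision function. This is only an illustration: the application defines a preset load threshold but gives no values, and the separate `overclock_threshold` as well as the names `Mode` and `select_mode` are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    VIRTUAL_HETEROGENEOUS = "virtual_heterogeneous"
    FULL_RATED = "full_rated"
    OVERCLOCK = "overclock"


def select_mode(load: float, load_threshold: float,
                overclock_threshold: Optional[float] = None) -> Mode:
    """Pick an operating mode from the current workload.

    Below the preset load threshold the virtual heterogeneous function is
    enabled; once the threshold is reached the CPU runs normally at full
    rated capacity. The optional overclock threshold is a hypothetical
    extension for the overclocking case.
    """
    if load < load_threshold:
        return Mode.VIRTUAL_HETEROGENEOUS
    if overclock_threshold is not None and load >= overclock_threshold:
        return Mode.OVERCLOCK
    return Mode.FULL_RATED


# Usage with assumed utilization values between 0.0 and 1.0.
light = select_mode(0.30, load_threshold=0.60)
heavy = select_mode(0.75, load_threshold=0.60, overclock_threshold=0.95)
peak = select_mode(0.98, load_threshold=0.60, overclock_threshold=0.95)
```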

Therefore, the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors realizes virtualization at core granularity on top of a homogeneous processor architecture and forms a corresponding heterogeneous multiprocessing (Heterogeneous MultiProcessing) architecture. It introduces the big.LITTLE concept into the existing server architecture, so that tasks are allocated to cores according to capability and the whole system is both efficient and energy-saving. In addition, it provides a power regulation (Power Regulating) scheme at core granularity based on the PCU and FIVR, which effectively addresses the problem that management granularity is not fine enough under the homogeneous CPU architecture of current servers, achieving both energy saving and high efficiency.

In an embodiment of the invention, a computer device is provided, comprising a memory in which a computer program is stored and a processor configured for executing the computer program stored in the memory. The memory is used to store one or more computer instructions that are executed by the processor to perform the steps in the method embodiments described above.

In one embodiment of the invention there is also provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method embodiments described above.

Those skilled in the art will appreciate that implementing all or part of the above described embodiment methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the above described embodiment methods. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims

1. A virtual heterogeneous global task scheduling and tuning method based on homogeneous processors, characterized by comprising the following steps:

The operating system determines whether to enable the virtual heterogeneous processing function based on whether the actual workload is greater than the preset load threshold; if the actual workload is less than the preset load threshold and the core does not need to work at full rated capacity, the virtual heterogeneous processing function is enabled to virtualize the non-heterogeneous CPUs at the physical layer in a homogeneous manner with the core as the granularity to obtain virtual cores;

The operating system divides the virtualized CPU into multiple clusters for regulation according to the SNC4 policy;

When the operating system receives the command, it determines the task priority according to the actual parameters of the operating system and outputs a priority list;

The Linux scheduler dispatches each task to the corresponding virtual core according to the priority list; when the priority of the task changes, the scheduler performs core migration on the task;

When the operating system determines that the current actual workload reaches a preset load threshold, the virtualization mode is exited and the CPU works normally at the physical layer;

The non-heterogeneous CPUs of the physical layer are virtualized in a homogeneous manner with the core as the granularity to obtain the virtual core, including:

Homogeneous virtualization is used to virtualize two cores of a homogeneous CPU into one high-efficiency core, which is used to undertake tasks with high time-related requirements and heavy workloads when working at rated frequency;

The remaining cores of the homogeneous CPU are virtualized as energy-saving cores to undertake tasks with low time-related requirements and light workloads;

The non-heterogeneous CPU of the physical layer is virtualized in a homogeneous manner with the core as the granularity to obtain a virtual core, thereby obtaining a virtual heterogeneous multi-processing unit after virtualization;

The operating system divides the virtualized CPU into multiple clusters according to the SNC4 policy for regulation, and the regulation method includes:

Task scheduling is performed in units of virtual cores, and tasks are assigned according to their priorities;

The operating system divides the virtualized CPU into multiple clusters for regulation according to the SNC4 strategy, and further includes:

The high-efficiency cores and energy-saving cores are paired one by one to form clusters, which are managed in units of clusters.

2. The virtual heterogeneous global task scheduling and tuning method based on homogeneous processors as described in claim 1 is characterized in that the operating system determines whether to enable the virtual heterogeneous processing function based on whether the actual workload is greater than a preset load threshold. If the actual workload is greater than the preset load threshold and the core needs to perform all rated work, there is no need to enable the virtual heterogeneous processing function and the core can operate according to the original control strategy.

3. The virtual heterogeneous global task scheduling and tuning method based on homogeneous processors as described in claim 1 is characterized in that the non-heterogeneous CPU of the physical layer is virtualized in a homogeneous manner with the core as the granularity to obtain the virtual core, and also includes: arranging the high-efficiency core and the energy-saving core according to the actual Fan-out packaging of the chip.

4. The virtual heterogeneous global task scheduling and tuning method based on homogeneous processors as described in claim 1 is characterized in that the virtual heterogeneous global task scheduling and tuning method based on homogeneous processors also includes: proposing power regulation with a granularity of core based on PCU and FIVR, using the PCU to control the frequency of the CPU core to master the frequency of each core, and using FIVR to control the power supply voltage of the CPU core to control whether the power supply of each core is turned off.

5. The virtual heterogeneous global task scheduling and tuning method based on homogeneous processors as described in claim 4 is characterized in that a temperature sensor is also connected to the CPU core, and the temperature sensor is used to detect the temperature of the CPU core, and the FIVR is used to determine whether to turn off the power of the corresponding core according to the temperature detected by the temperature sensor.

6. The virtual heterogeneous global task scheduling and tuning method based on homogeneous processors as described in claim 1 is characterized in that when the task priority changes, the scheduler performs core migration on the task. If the operating system determines that the priority of the task has decreased, the process switches to the corresponding high-efficiency core or energy-saving core in the same cluster.