CN116166434A - Processor allocation method and system, device, storage medium and electronic equipment
- Publication number: CN116166434A (application CN202310180080.2A)
- Authority: CN (China)
- Legal status: Pending
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
Embodiments of the present application provide a processor allocation method, system, device, storage medium, and electronic device. The method includes: determining N hosts included in a target server; receiving the computing-power resources requested by the N hosts, where the computing-power resources represent the resources needed to process the data of the N hosts; and allocating, based on the numbers of the N hosts, M processors matching the requested computing-power resources, where the M processors process the data of the N hosts, the M processors and the N hosts are connected through a switch chip, and the switch chip expands the bus connecting the processors. The present application solves the problem of low processor utilization in the related art and achieves the effect of improving processor utilization.
Description
Technical Field
The embodiments of the present application relate to the field of computers, and in particular to a processor allocation method, system, device, storage medium, and electronic device.
Background
At present, artificial intelligence has penetrated every industry through the integration of data, computing power, algorithms, and application scenarios, promoting and enabling digital-intelligent transformation. Powerful computing improves the processing of complex data such as images and speech, changing traditional human-to-human and human-computer interaction and allowing new interaction modes to be adopted quickly.
At this stage, heterogeneous computing that pairs a central processing unit (CPU) with a graphics processing unit (GPU) remains the first choice for artificial-intelligence computing power. In practice, many enterprise artificial intelligence (AI) systems call GPUs directly in their physical form; GPUs are not pooled into resources the way compute, storage, and networking are virtualized in cloud scenarios. GPU utilization is therefore extremely low, elastic scaling is limited, and the return does not match the investment.
In addition, the CPU and the GPU play very different roles: one is a necessity, the other an accelerator. The CPU runs at all times, while the GPU, as a device attached to the computer, is invoked only when needed. The key to using GPU resources efficiently is therefore to invoke them on demand and release them when done, without worrying about whether enough GPU resources exist or where they come from.
At present, most server GPU cards are mounted inside the server chassis and can serve only a single server. Most approaches to GPU resource pooling split a single physical GPU into multiple virtual GPUs at a fixed ratio, such as 1/2 or 1/4, with each virtual GPU given equal video memory and compute time allocated by round-robin polling. For example, in 2021 NVIDIA provided MIG technology on some Ampere-series GPUs, which can partition an A100 GPU into up to seven instances.
GPU cards in traditional servers are integrated inside the server. When the server powers on, every GPU in the chassis is powered on as well; even when computing demand is low, the idle GPUs in the chassis are powered up. GPU resources cannot be adjusted dynamically to follow actual demand, which wastes resources and adds unnecessary power consumption.
Summary of the Invention
Embodiments of the present application provide a processor allocation method, system, device, storage medium, and electronic device, so as to at least solve the problem of low processor utilization in the related art.
According to one embodiment of the present application, a processor allocation method is provided, including: determining N hosts included in a target server, where each host corresponds to a host number and N is a natural number greater than or equal to 1; receiving the computing-power resources requested by the N hosts, where the computing-power resources represent the resources needed to process the data of the N hosts; and allocating, based on the numbers of the N hosts, M processors matching the requested computing-power resources, where the M processors process the data of the N hosts, the M processors and the N hosts are connected through a switch chip, the switch chip expands the bus connecting the processors, and M is a natural number greater than or equal to 1.
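The claimed flow can be illustrated with a small sketch. This is a hypothetical model, not the patent's implementation: the `Host` type, the name `allocate_processors`, and the per-processor capacity figure are all invented for illustration.

```python
# Hypothetical sketch of the claimed method: each numbered host requests
# an amount of compute, and just enough processors are allocated to
# cover each request. All names and units here are illustrative.

from dataclasses import dataclass


@dataclass
class Host:
    number: int               # host number assigned on detection
    requested_tflops: float   # requested computing-power resource


def allocate_processors(hosts, processor_tflops_each):
    """Return {host_number: processor_count} covering each host's request."""
    allocation = {}
    for host in sorted(hosts, key=lambda h: h.number):
        count = 0
        remaining = host.requested_tflops
        while remaining > 0:
            count += 1                       # power on one more processor
            remaining -= processor_tflops_each
        allocation[host.number] = count
    return allocation
```

Under this sketch, a host requesting 30 units against 20-unit processors is matched with two processors, while a host requesting 10 units gets one.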
According to another embodiment of the present application, a processor allocation system is provided, including: a target server in which N hosts are arranged, where each host corresponds to a host number and N is a natural number greater than or equal to 1; and a switch chip connected to the N hosts and to M processors, used to expand the bus connecting the processors, where M is a natural number greater than or equal to 1 and the M processors process the data of the N hosts.
In an exemplary embodiment, the switch chip includes a complex programmable logic device (CPLD) connected through a transmission bus to the baseboard management controller (BMC) in each host, used to receive, via the BMC, the computing-power resources requested by each host and to allocate, based on the numbers of the N hosts, the M processors matching those resources, where the computing-power resources represent the resources needed to process the data of the N hosts.
In an exemplary embodiment, each host includes a signal transceiver connected to the CPLD, used to transmit a low-level signal from the host to the CPLD, where the low-level signal indicates that the host is connected to the target server.
In an exemplary embodiment, the processor allocation system further includes a power-supply chip connected to the CPLD in the switch chip and to a processor, used to control power delivery to that processor.
According to yet another embodiment of the present application, a processor allocation apparatus is provided, including: a first determining module for determining the N hosts included in the target server, where each host corresponds to a host number and N is a natural number greater than or equal to 1; a first receiving module for receiving the computing-power resources requested by the N hosts, where the computing-power resources represent the resources needed to process the data of the N hosts; and a first allocation module for allocating, based on the numbers of the N hosts, M processors matching the requested computing-power resources, where the M processors process the data of the N hosts, the M processors and the N hosts are connected through a switch chip, the switch chip expands the bus connecting the processors, and M is a natural number greater than or equal to 1.
In an exemplary embodiment, the first determining module includes: a first determining unit for determining, when a low-level signal is detected by the CPLD, that a host is connected to the target server, where the CPLD is arranged in the switch chip and connected to the signal transceiver in the host, and the signal transceiver transmits the low-level signal to the CPLD; and a second determining unit for determining the number of hosts from the number of detected low-level signals, yielding the N hosts, where each host corresponds to one low-level signal.
In an exemplary embodiment, the apparatus further includes: a first processing module for numbering each host through the CPLD after the N hosts are determined from the number of detected low-level signals, obtaining N host numbers; a first storage module for storing the N host numbers in registers arranged in the CPLD; and a first sending module for sending each host number through a transmission bus to the BMC of the corresponding host, where the transmission bus connects the host and the CPLD.
In an exemplary embodiment, the first receiving module includes a first receiving unit for receiving, from the BMC in each host, the computing-power resources that host requires, obtaining the computing-power resources of the N hosts.
In an exemplary embodiment, the first allocation module includes: a first sending unit for sending, when the target computing-power resource requested by a target host exceeds a first preset threshold, a power-on instruction to a target processor so as to power it on, where the target host is any one of the N hosts and the target processor is the processor among the M processors that matches the target computing-power resource; and a first establishing unit for establishing a connection between the target processor and the target host based on the target host's number, so that the target processor can be invoked to process data sent by the target host.
In an exemplary embodiment, the apparatus further includes: a second sending module for sending, after the target processor and the target host are connected based on the target host's number, a power-off instruction to the target processor when the target computing-power resource requested by the target host falls below the first preset threshold, so as to power the target processor off; and a first disconnection module for disconnecting the target processor from the target host based on the target host's number.
In an exemplary embodiment, the apparatus further includes a third sending module for sending, after the connection between the target processor and the target host is broken based on the target host's number, a reset signal to the target processor so as to reset the expansion bus between the target processor and the switch.
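The release path described in the last two embodiments (power-off instruction, disconnection, then a bus reset) can be sketched as an ordered sequence of states. This is an illustrative model; the state names and the helper functions are invented, and the real hardware sequencing is not specified at this level of detail in the text.

```python
# Illustrative state sequence for releasing a processor: the GPU is
# unbound from its host, powered off, and its expansion link is reset.
# State names are invented for the sketch.

RELEASE_SEQUENCE = ["bound", "unbound", "powered_off", "link_reset"]


def release_step(state):
    """Advance one step of the power-down sequence; returns the new state."""
    i = RELEASE_SEQUENCE.index(state)
    return RELEASE_SEQUENCE[min(i + 1, len(RELEASE_SEQUENCE) - 1)]


def full_release():
    """Run the whole release path starting from a bound processor."""
    state = "bound"
    steps = [state]
    while state != "link_reset":
        state = release_step(state)
        steps.append(state)
    return steps
```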
In an exemplary embodiment, each processor is connected to a corresponding power-supply chip, where the power-supply chip controls power delivery to that processor.
According to yet another embodiment of the present application, a computer-readable storage medium is provided, in which a computer program is stored, where the computer program is configured to perform the steps of any of the above method embodiments when run.
According to yet another embodiment of the present application, an electronic device is provided, including a memory and a processor, where a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps of any of the above method embodiments.
Through the present application, M processors are allocated according to the computing-power resources requested by the N hosts in the target server, and processor computing power is managed dynamically, so that connected hosts can schedule processor computing power according to actual demand. This effectively improves processor utilization, improves the utilization of processor computing-power resources, and reduces the overall power consumption of the system. The problem of low processor utilization in the related art is therefore solved, achieving the effect of improved processor utilization.
Brief Description of the Drawings
FIG. 1 is a hardware block diagram of a mobile terminal running a processor allocation method according to an embodiment of the present application;
FIG. 2 is a flowchart of a processor allocation method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of GPU PCIe links according to an embodiment of the present application;
FIG. 4 is a connection topology of the PCIe switch chips according to an embodiment of the present application;
FIG. 5 is a schematic diagram of identifying multiple hosts according to an embodiment of the present application;
FIG. 6 is a schematic diagram of GPU power supply and reset according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a processor allocation system according to an embodiment of the present application;
FIG. 8 is a structural block diagram of a processor allocation apparatus according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below with reference to the accompanying drawings and in combination with the embodiments.
It should be noted that the terms "first", "second", and the like in the specification, claims, and drawings of the present application are used to distinguish similar objects and do not necessarily describe a particular order or sequence.
First, the related technologies involved in the present invention are described:
BMC: Baseboard Management Controller, used for server motherboard management.
CPU: Central Processing Unit.
I2C: Inter-Integrated Circuit, a bus structure.
CPLD: Complex Programmable Logic Device.
GPU: Graphics Processing Unit.
CDFP: 400 Gbps Form-factor Pluggable, a pluggable transceiver module supporting 400 Gbps.
PCIe: Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard.
The method embodiments provided in this application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking a mobile terminal as an example, FIG. 1 is a hardware block diagram of a mobile terminal running a processor allocation method according to an embodiment of the present application. As shown in FIG. 1, the mobile terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data. The mobile terminal may further include a transmission device 106 for communication and an input/output device 108. Those of ordinary skill in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the mobile terminal; for example, the mobile terminal may include more or fewer components than shown in FIG. 1, or have a different configuration.
The memory 104 may store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the processor allocation method in the embodiments of the present application. By running the computer programs stored in the memory 104, the processor 102 executes various functional applications and data processing, thereby implementing the method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory arranged remotely from the processor 102 and connected to the mobile terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 receives or sends data via a network. Specific examples of the network include a wireless network provided by the communication provider of the mobile terminal. In one example, the transmission device 106 includes a network interface controller (NIC) that can connect to other network devices through a base station to communicate with the Internet. In another example, the transmission device 106 may be a radio frequency (RF) module used to communicate with the Internet wirelessly.
This embodiment provides a processor allocation method. FIG. 2 is a flowchart of the processor allocation method according to an embodiment of the present application. As shown in FIG. 2, the flow includes the following steps:
Step S202: determine the N hosts included in the target server, where each host corresponds to a host number and N is a natural number greater than or equal to 1.
Step S204: receive the computing-power resources requested by the N hosts, where the computing-power resources represent the resources needed to process the data of the N hosts.
Step S206: allocate, based on the numbers of the N hosts, M processors matching the requested computing-power resources, where the M processors process the data of the N hosts, the M processors and the N hosts are connected through a switch chip, the switch chip expands the bus connecting the processors, and M is a natural number greater than or equal to 1.
In this embodiment, the values of N and M can be set flexibly according to the actual scenario; for example, the target server may include two hosts that need two or three processors to process their data. The processor may be a GPU or another kind of processor, which is not limited here.
In this embodiment, the computing-power resource is the processing resource required by the data on a host; for example, transmitting a recorded video from a host may require two GPUs for image processing.
In this embodiment, there may be multiple switch chips; for example, a processor box (GPU BOX) may contain two PCIe switch chips and multiple GPU card slots to accept multiple GPU cards. The hosts and the GPU BOX are connected through CDFP connectors and cables. The CDFP connector is a 400 Gbps pluggable I/O component suitable for high-speed applications such as data centers, high-performance computing, and storage and networking equipment.
Optionally, as shown in FIG. 3, multiple PCIe switch chips can be arranged in the GPU BOX. Hosts Host1 and Host2 each provide one PCIe x16 high-speed link to each PCIe switch chip on the GPU BOX, and the downstream ports of each PCIe switch chip each connect two GPU cards. Note that the number of GPU cards that can be attached is determined by the number of ports the PCIe switch chip supports. When the CPU does not provide enough PCIe lanes, the PCIe switch chip can expand the number of PCIe lanes in the system. Although PCIe Gen5 has raised the signaling rate to 32 GT/s, GPU cards, AI accelerator cards, and new-generation network cards still need large data-transfer bandwidth. Expansion through PCIe switch chips lets a server accommodate more function-expansion cards.
Optionally, the connection topology of the PCIe switch chips is shown in FIG. 4, where solid lines represent Host1's PCIe connections and dashed lines represent Host2's. As FIG. 4 shows, when the target server includes two hosts, the PCIe x16 link from either host can reach any GPU card through a PCIe switch chip, so every GPU can be accessed by any connected host. On each PCIe switch chip, the H port is the upstream connection to a host, F denotes the PCIe cascade ports between the two PCIe switch chips, and the D ports are downstream ports connecting GPU cards.
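The any-host-to-any-GPU property of the FIG. 4 topology can be checked with a small graph sketch. The node names, the two-switch/four-GPU layout, and the adjacency table below are an illustrative reading of the figure, not taken from the patent's drawings themselves.

```python
# Illustrative model of the FIG. 4 topology: two PCIe switch chips,
# each with host-facing H ports, an F cascade link between them, and
# D ports to two GPUs apiece. Node names are invented for the sketch.

from collections import deque


def build_topology():
    # adjacency: node -> set of directly connected nodes
    return {
        "Host1": {"SW1", "SW2"},   # each host uplinks to both switches (H ports)
        "Host2": {"SW1", "SW2"},
        "SW1": {"Host1", "Host2", "SW2", "GPU1", "GPU2"},  # F port cascades to SW2
        "SW2": {"Host1", "Host2", "SW1", "GPU3", "GPU4"},
        "GPU1": {"SW1"}, "GPU2": {"SW1"},                  # D ports
        "GPU3": {"SW2"}, "GPU4": {"SW2"},
    }


def reachable(edges, start):
    """Breadth-first search: all nodes reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

In this model every GPU is reachable from either host, matching the figure's claim that each GPU can be accessed by any connected host.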
The steps above may be executed by a terminal, a server, a specific processor arranged in a terminal or server, or a processor or processing device arranged relatively independently of the terminal or server, but are not limited thereto.
Through the above steps, M processors are allocated according to the computing-power resources requested by the N hosts in the target server, and processor computing power is managed dynamically, so that connected hosts can schedule processor computing power according to actual demand. This effectively improves processor utilization, improves the utilization of processor computing-power resources, and reduces the overall power consumption of the system, thereby solving the problem of low processor utilization in the related art and achieving the effect of improved processor utilization.
In an exemplary embodiment, determining the N hosts included in the target server includes: determining that a host is connected to the target server when a low-level signal is detected by the CPLD, where the CPLD is arranged in the switch chip and connected to the signal transceiver in the host, and the signal transceiver transmits the low-level signal to the CPLD; and determining the number of hosts from the number of detected low-level signals, yielding the N hosts, where each host corresponds to one low-level signal. The signal transceiver in this embodiment may be a CDFP. For example, as shown in FIG. 5, the target server contains n hosts numbered Host1 to Hostn, and signals between the server and the GPU BOX travel over CDFP1 to CDFPn. The number of hosts is identified mainly through the HOST_PRESENT signal: on each host, HOST_PRESENT is pulled down to ground and routed through the CDFP connector to the CPLD of the GPU BOX, active low. When a host is connected, the pulled-down HOST_PRESENT causes the GPU BOX's CPLD to detect a low-level signal and conclude that the host is present; when the CPLD detects n HOST_PRESENT low-level signals, the current target server is considered to include n hosts. By identifying the number of currently connected hosts, this embodiment can share GPU computing resources accurately.
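The active-low presence check described above reduces to counting pins that read 0. A minimal sketch, assuming the CPLD can sample one HOST_PRESENT level per CDFP slot (the function name and sampling model are invented):

```python
# Hedged sketch of HOST_PRESENT detection: each connected host pulls
# its presence pin low, so the CPLD counts the pins reading 0 to get N.
# The pin-sampling model is hypothetical.

def count_present_hosts(pin_levels):
    """pin_levels: sampled HOST_PRESENT levels, one per CDFP slot.

    Active-low: a 0 means a host is plugged into that slot.
    """
    return sum(1 for level in pin_levels if level == 0)
```

For example, sampling `[0, 0, 1, 1, 0]` would mean three hosts are present on five slots.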
In an exemplary embodiment, after the number of hosts is determined from the number of detected low-level signals and the N hosts are determined, the method further includes: numbering each host through the CPLD to obtain N host numbers; storing the N host numbers in registers arranged in the CPLD; and sending each host number to the management controller (BMC) of the corresponding host over a transmission bus that connects the host and the CPLD. In this embodiment, each Host in the target server Server needs to know its own number among the currently connected Hosts, for example a relative number from 1 to n. The host numbers are transmitted mainly over I2C. As shown in Figure 5, after identifying the number of Hosts in the current system, the CPLD of the GPU BOX numbers the Hosts and stores the numbering information in its registers. For example, if five Hosts are present, they are numbered 1 to 5, and the numbering information is delivered over the I2C signal that connects each Host to the CPLD through its CDFP connector. Each Host can read the CPLD register data over I2C to obtain its current number. In addition, one I2C channel is reserved between the CPLD and the BMC of the GPU BOX to pass the host-number information to the BMC, allowing the GPU BOX to monitor the number of Hosts and optimize the power-on strategy of the GPU cards when multiple Hosts are present. By assigning host numbers, this embodiment can accurately allocate the computing resources each host needs.
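The numbering scheme can be sketched in a few lines. The register model and method names below are hypothetical stand-ins for the CPLD registers and the I2C reads described above:

```python
# Hypothetical model of the CPLD numbering registers: hosts detected in
# slot order are numbered 1..N, and each host reads back its own number.

class CpldRegisters:
    """Illustrative model only; not an actual register map from the patent."""

    def __init__(self) -> None:
        self.host_numbers: dict[int, int] = {}  # slot -> assigned host number

    def assign_numbers(self, present_slots: list[int]) -> None:
        # Number hosts 1..N in slot order, as in the text's 5-host -> 1-5 example.
        for number, slot in enumerate(sorted(present_slots), start=1):
            self.host_numbers[slot] = number

    def i2c_read_number(self, slot: int) -> int:
        # Stands in for the host's I2C read of its number from the CPLD register.
        return self.host_numbers[slot]

regs = CpldRegisters()
regs.assign_numbers([0, 1, 4])          # three hosts detected in slots 0, 1, 4
assert regs.i2c_read_number(4) == 3     # the third detected host gets number 3
```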
In an exemplary embodiment, receiving the computing power resources requested by the N hosts includes: receiving, from the BMC in each host, the computing power resources that host requires, thereby obtaining the computing power resources of the N hosts. In this embodiment, determining the computing resources each host needs allows processors to be allocated to the hosts precisely.
In an exemplary embodiment, allocating to the N hosts, based on their numbers, the M processors matching the computing power resources includes: when the target computing power resource requested by a target host is greater than a first preset threshold, sending a power-on instruction to a target processor to power it on, where the target host is any one of the N hosts and the target processor is the processor among the M processors that matches the target computing power resource; and establishing a connection between the target processor and the target host based on the target host's number, so that the target processor can be called to process data sent by the target host. In this embodiment, the first preset threshold can be set according to the actual usage scenario. For example, if portraits must be recognized in a month's worth of stored video, two GPUs are needed for the processing, so two target processors are powered on and connected to the host according to its number to process the video data. By allocating processors according to computing power resources, this embodiment can schedule resources effectively and use them more rationally.
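The threshold-gated power-on step can be sketched as below. The unit granularity (one GPU per requested unit), the threshold value, and the helper names are illustrative assumptions; `power_on` and `connect` stand in for the CPLD's power-chip enable and the switch-chip routing:

```python
# Sketch of the allocation step: if the request exceeds the first preset
# threshold, power on the matching idle GPUs and route each one to the host
# by its number. All names and values here are assumed for the example.

FIRST_THRESHOLD = 0  # "first preset threshold", value assumed

powered: list[str] = []
routes: list[tuple[int, str]] = []

def power_on(gpu: str) -> None:
    powered.append(gpu)  # stands in for the power-on instruction to the power chip

def connect(host_number: int, gpu: str) -> None:
    routes.append((host_number, gpu))  # stands in for the switch-chip route

def allocate(host_number: int, requested_units: int,
             idle_gpus: list[str]) -> list[str]:
    """Power on and connect the GPUs granted to the host; return their ids."""
    if requested_units <= FIRST_THRESHOLD:
        return []
    granted = idle_gpus[:requested_units]
    for gpu in granted:
        power_on(gpu)
        connect(host_number, gpu)
    return granted

# The video-recognition example from the text: two GPUs for host number 2.
assert allocate(2, 2, ["gpu0", "gpu1", "gpu2"]) == ["gpu0", "gpu1"]
```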
In an exemplary embodiment, after the target processor and the target host are connected based on the target host's number, the method further includes: when the target computing power resource requested by the target host is less than the first preset threshold, sending a power-off instruction to the target processor to power it off; and disconnecting the target processor from the target host based on the target host's number. In this embodiment, once the host's data has been processed, the processor is released promptly and powered off to save energy.
In an exemplary embodiment, after the target processor is disconnected from the target host based on the target host's number, the method further includes: sending a reset signal to the target processor to reset the bus that provides the expanded connection between the target processor and the switch chip. In this embodiment, pooling the GPUs requires controlling the power supply of each GPU card individually, so each processor is connected to its own power chip; each GPU card can then be powered off independently when idle and powered on under control as soon as a connection is needed. As shown in Figure 6, one I2C channel runs between each Host in the Server and the CPLD of the GPU BOX. It mainly carries the current Host's requests to acquire or release GPU resources; the CPLD parses this information, identifies which GPU card should be powered on or off, and thereby releases and invokes GPU cards dynamically, maximizing the use of GPU computing resources. Meanwhile, the PCIe reset signal PERST of each GPU card is also driven by the CPLD; its source is the same as the power-chip enable signal, likewise obtained from the Host over I2C, and it is used to reset the GPU card's PCIe link. By giving each processor its own power chip, this embodiment allows processors to be invoked and released flexibly.
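The release path described in the last two embodiments (power off, disconnect, then PERST on the PCIe link) can be sketched as an ordered sequence. All function and signal names here are illustrative, and the log list merely records the order of the assumed CPLD actions:

```python
# Hypothetical release sequence for one GPU card, in the order the text
# gives: power off, tear down the host route, then reset the PCIe link.

def release_gpu(host_number: int, gpu: str, log: list[str]) -> None:
    log.append(f"power_off {gpu}")                     # power-chip enable deasserted
    log.append(f"disconnect host{host_number} {gpu}")  # switch-chip route removed
    log.append(f"perst {gpu}")                         # CPLD drives PERST to reset PCIe

steps: list[str] = []
release_gpu(1, "gpu0", steps)
assert steps == ["power_off gpu0", "disconnect host1 gpu0", "perst gpu0"]
```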
A processor allocation system is also provided in this embodiment. FIG. 7 is a schematic diagram of the processor allocation system according to an embodiment of the present application. As shown in FIG. 7, the system includes:
a target server in which N hosts are arranged, where each host corresponds to one host number and N is a natural number greater than or equal to 1; and
a switch chip connected to the N hosts and to M processors, used to expand the bus connecting the processors, where M is a natural number greater than or equal to 1 and the M processors are used to process the data of the N hosts.
In this embodiment, a computing power resource is the processing resource needed by the data on a host; for example, transmitting a recorded video from a host may require two GPUs for image processing.
In this embodiment, there may be multiple switch chips; for example, a processor box (GPU BOX) may contain two PCIe Switch chips and several GPU card slots for plugging in GPU cards. The hosts connect to the GPU BOX through CDFP connectors and the associated cables; the CDFP connector is a 400 Gbps pluggable I/O component suitable for high-speed applications such as data centers, high-performance computing, and storage and networking equipment.
Optionally, as shown in FIG. 3, multiple PCIe Switch chips can be arranged in the GPU BOX. Hosts Host1 and Host2 each provide one PCIe x16 high-speed link to every PCIe Switch chip in the GPU BOX, and the downlink ports of each PCIe Switch chip connect to two GPU cards. It should be noted that the number of attached GPU cards is determined by the number of ports the PCIe Switch chip supports. When the CPU does not provide enough PCIe lanes, the PCIe Switch chip can expand the number of PCIe lanes in the system. Although PCIe Gen5 raises the signaling rate to 32 GT/s, GPU cards, AI accelerator cards, and new-generation network cards still need large data transmission bandwidth; expanding through PCIe Switch chips allows the server to accommodate more function expansion cards.
Optionally, the connection topology of the PCIe Switch chips is shown in FIG. 4, where the solid lines represent Host1's PCIe connections and the dotted lines represent Host2's. As FIG. 4 shows, when the target server includes two Hosts, the PCIe x16 link from either Host can reach any GPU card through the PCIe Switch chips, so every GPU can be accessed by any connected Host. In the PCIe Switch chips, the H ports denote the uplinks to the Hosts, F denotes the PCIe cascade ports between the two PCIe Switch chips, and the D ports denote the downlinks that connect the GPU cards.
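A rough model of this FIG. 4 topology can check the any-host-to-any-GPU property: two switch chips, each with H uplinks to both hosts, D downlinks to two GPUs, and an F cascade link between them. The switch and GPU names are assumptions for the sketch:

```python
# Hypothetical model of the FIG. 4 topology: each host uplinks (H ports) to
# both PCIe Switch chips; each chip downlinks (D ports) to two GPUs; the two
# chips are cascaded over their F ports.

switches = {
    "sw1": {"hosts": {"Host1", "Host2"}, "gpus": {"gpu0", "gpu1"}},
    "sw2": {"hosts": {"Host1", "Host2"}, "gpus": {"gpu2", "gpu3"}},
}
cascade = ("sw1", "sw2")  # the F-port link between the two switch chips

def reachable(host: str, gpu: str) -> bool:
    """Can this host reach this GPU, directly or over the cascade link?"""
    for name, sw in switches.items():
        if host in sw["hosts"]:
            if gpu in sw["gpus"]:
                return True  # same chip: H port straight to D port
            other = cascade[1] if name == cascade[0] else cascade[0]
            if gpu in switches[other]["gpus"]:
                return True  # one hop over the F cascade link
    return False

# Every GPU is reachable from either host, matching the text's claim.
assert all(reachable(h, g) for h in ("Host1", "Host2")
           for g in ("gpu0", "gpu1", "gpu2", "gpu3"))
```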
In an exemplary embodiment, as shown in FIG. 4, the switch chip includes a complex programmable logic device (CPLD) connected over a transmission bus to the management controller (BMC) in each host. The CPLD receives, via the BMC, the computing power resources each host requests and, based on the numbers of the N hosts, allocates to them the M processors matching those resources, where a computing power resource represents the resource for processing a host's data.
In an exemplary embodiment, each host includes a signal transceiver connected to the CPLD and used to transmit the host's low-level signal to the CPLD, where the low-level signal indicates that the host has accessed the target server.
In an exemplary embodiment, the processor allocation system further includes a power chip connected to the CPLD in the switch chip and to a processor, used to control the power supply of that processor.
The present invention is described below with reference to a specific embodiment.
This embodiment takes control of GPUs as an example and mainly covers the following:
1. Configure the PCIe links of the GPUs. Taking dual-Host GPU pooling as an example, as shown in FIG. 3, Host1 and Host2 are the hosts of two servers, and the hosts connect to the GPU BOX through CDFP connectors 1 and 2 and the associated cables. Host1 and Host2 each provide one PCIe x16 high-speed link to the PCIe Switch chips on the GPU BOX, and each PCIe Switch downlink port connects to two GPU cards; in practice, the number of attached GPU cards depends on the number of ports the PCIe Switch supports. When the CPU does not provide enough PCIe lanes, the PCIe Switch chip can expand the number of PCIe lanes in the system, allowing the server to accommodate more function expansion cards.
2. Set the PCIe connection topology. As shown in FIG. 4, which marks the PCIe connections of Host1 and of Host2, the PCIe x16 link from either Host can reach any GPU card in the system through the PCIe Switch, so every GPU can be accessed by the connected Host.
3. Give each GPU its own power chip. As shown in FIG. 6, equipping each GPU with a separate power chip lets every GPU card be powered off independently when idle and powered on under control as soon as a connection is needed.
In summary, in this embodiment the connection from each Host to the GPUs provides a high-speed path for GPU pooling, and identifying the number of Hosts in the system together with the power-supply and reset designs of the GPUs enables each GPU to be powered on and off individually. The pooling design effectively raises GPU utilization in the system, improves the use of GPU computing power resources, releases part of the GPU resources when computing demand is low, lowers the system's overall power consumption, and reduces unnecessary waste of electricity, thereby cutting costs.
Through the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied as a software product stored on a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and including several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
This embodiment also provides a processor allocation device used to implement the above embodiments and preferred implementations; what has already been explained is not repeated. As used below, the term "module" may be a combination of software and/or hardware that realizes a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
FIG. 8 is a structural block diagram of the processor allocation device according to an embodiment of the present application. As shown in FIG. 8, the device includes:
a first determining module 82, configured to determine the N hosts included in the target server, where each host corresponds to one host number and N is a natural number greater than or equal to 1;
a first receiving module 84, configured to receive the computing power resources requested by the N hosts, where the computing power resources represent the resources for processing the data of the N hosts; and
a first allocation module 86, configured to allocate to the N hosts, based on their numbers, M processors matching the computing power resources, where the M processors are used to process the data of the N hosts, the M processors and the N hosts are connected through a switch chip used to expand the bus connecting the processors, and M is a natural number greater than or equal to 1.
In an exemplary embodiment, the first determining module 82 includes:
a first determining unit, configured to determine, when a low-level signal is detected by the complex programmable logic device (CPLD), that a host has accessed the target server, where the CPLD is arranged in the switch chip and connected to the signal transceiver in the host, and the signal transceiver is used to transmit the low-level signal to the CPLD; and
a second determining unit, configured to determine the number of hosts, and thus the N hosts, based on the number of detected low-level signals, where one host corresponds to one low-level signal.
In an exemplary embodiment, the device further includes:
a first processing module, configured to number each host through the CPLD after the number of hosts is determined from the number of detected low-level signals and the N hosts are determined, obtaining N host numbers;
a first storage module, configured to store the N host numbers in registers arranged in the CPLD; and
a first sending module, configured to send each host number to the management controller (BMC) of the corresponding host over a transmission bus that connects the host and the CPLD.
In an exemplary embodiment, the first receiving module includes:
a first receiving unit, configured to receive, from the BMC in each host, the computing power resources that host requires, obtaining the computing power resources of the N hosts.
In an exemplary embodiment, the first allocation module includes:
a first sending unit, configured to send, when the target computing power resource requested by a target host is greater than a first preset threshold, a power-on instruction to a target processor to power it on, where the target host is any one of the N hosts and the target processor is the processor among the M processors that matches the target computing power resource; and
a first establishing unit, configured to establish a connection between the target processor and the target host based on the target host's number, so as to call the target processor to process the data sent by the target host.
In an exemplary embodiment, the device further includes:
a second sending module, configured to send, after the target processor and the target host are connected based on the target host's number and when the target computing power resource requested by the target host is less than the first preset threshold, a power-off instruction to the target processor to power it off; and
a first disconnection module, configured to disconnect the target processor from the target host based on the target host's number.
In an exemplary embodiment, the device further includes:
a third sending module, configured to send, after the target processor is disconnected from the target host based on the target host's number, a reset signal to the target processor to reset the bus that provides the expanded connection between the target processor and the switch chip.
In an exemplary embodiment, each processor is connected to a corresponding power chip, where the power chip is used to control the power supply of that processor.
It should be noted that the above modules can be implemented in software or hardware; for the latter, implementations include, but are not limited to, placing all the modules in the same processor, or distributing the modules in any combination across different processors.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program, where the computer program is configured to perform the steps of any of the above method embodiments when run.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store a computer program.
Embodiments of the present application also provide an electronic device including a memory storing a computer program and a processor configured to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic device may further include a transmission device and an input/output device, both connected to the processor.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary implementations; they are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present application can be implemented on general-purpose computing devices; they may be concentrated on a single computing device or distributed across a network of computing devices; they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device, and in some cases the steps shown or described may be executed in a different order; or they may be fabricated as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. The present application is thus not limited to any particular combination of hardware and software.
The above are only preferred embodiments of the present application and are not intended to limit it; for those skilled in the art, the present application may have various modifications and changes. Any modification, equivalent replacement, improvement, or the like made within the principles of the present application shall fall within its scope of protection.
Claims (15)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310180080.2A CN116166434A (en) | 2023-02-28 | 2023-02-28 | Processor allocation method and system, device, storage medium and electronic equipment |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310180080.2A CN116166434A (en) | 2023-02-28 | 2023-02-28 | Processor allocation method and system, device, storage medium and electronic equipment |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116166434A true CN116166434A (en) | 2023-05-26 |
Family
ID=86414529
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310180080.2A Pending CN116166434A (en) | 2023-02-28 | 2023-02-28 | Processor allocation method and system, device, storage medium and electronic equipment |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116166434A (en) |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116501681A (en) * | 2023-06-28 | 2023-07-28 | 苏州浪潮智能科技有限公司 | CXL data transmission board and method for controlling data transmission |
| CN116501681B (en) * | 2023-06-28 | 2023-09-29 | 苏州浪潮智能科技有限公司 | CXL data transmission board card and method for controlling data transmission |
| WO2025001344A1 (en) * | 2023-06-28 | 2025-01-02 | 苏州元脑智能科技有限公司 | Cxl data transmission board and data transmission control method |
| US12547579B2 (en) | 2023-06-28 | 2026-02-10 | Suzhou Metabrain Intelligent Technology Co., Ltd. | Board for CXL data transmission, method for data transmission control and device |
| CN117472596A (en) * | 2023-12-27 | 2024-01-30 | 苏州元脑智能科技有限公司 | Distributed resource management method, device, system, equipment and storage medium |
| CN117472596B (en) * | 2023-12-27 | 2024-03-22 | 苏州元脑智能科技有限公司 | Distributed resource management method, device, system, equipment and storage medium |
| WO2025179936A1 (en) * | 2024-02-28 | 2025-09-04 | 超聚变数字技术有限公司 | Gpu box, and data processing method and system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| CB02 | Change of applicant information | ||
| CB02 | Change of applicant information |
Country or region after: China Address after: 215000 Building 9, No.1 guanpu Road, Guoxiang street, Wuzhong Economic Development Zone, Suzhou City, Jiangsu Province Applicant after: Suzhou Yuannao Intelligent Technology Co.,Ltd. Address before: 215000 Building 9, No.1 guanpu Road, Guoxiang street, Wuzhong Economic Development Zone, Suzhou City, Jiangsu Province Applicant before: SUZHOU LANGCHAO INTELLIGENT TECHNOLOGY Co.,Ltd. Country or region before: China |