CN114546644A

CN114546644A - Cluster resource scheduling method, device, software program, electronic device and storage medium

Info

Publication number: CN114546644A
Application number: CN202210144907.XA
Authority: CN
Inventors: 方睿
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2022-02-17
Filing date: 2022-02-17
Publication date: 2022-05-27
Anticipated expiration: 2042-02-17
Also published as: CN114546644B

Abstract

The invention provides a cluster resource scheduling method, apparatus, software program, electronic device and storage medium. The method comprises: configuring a workload in a cluster resource scheduling environment and determining a timeout queue that carries the workload; when the workload in the timeout queue reaches a timeout state, adjusting, by a controller component, the state of the workload to a secondary scheduling state; sending, based on information about the workload, a failed-replica-count detection request to a corresponding estimator component; determining, by the estimator component in response to the failed-replica-count detection request, the number of replicas that failed scheduling; and executing, by a scheduler component, a cluster resource scheduling program based on the number of replicas that failed scheduling. The availability of the workload can thereby be guaranteed, the accuracy and reliability of cluster resource scheduling are improved, the utilization efficiency of cluster resources is improved, and the data processing speed of cloud server users is ensured.

Description

Cluster resource scheduling method, device, software program, electronic device and storage medium

Technical Field

The present invention relates to cluster resource scheduling technology for cloud networks, and in particular to a cluster resource scheduling method, apparatus, software program, electronic device and storage medium.

Background

With the continuous development of computer technology, cloud servers (Cloud Virtual Machine, CVM) can provide secure and reliable elastic computing services and can offer different instance types to meet users' specific usage scenarios. These instance types are composed of different combinations of CPU, memory, storage and network. When cluster resources are scheduled while a cloud server is running, the speed of cluster resource scheduling directly affects the resource utilization of the cloud data center and the user experience. Scheduling first ensures that users can be allocated resources, and then addresses how to allocate resources optimally, that is, how to improve resource utilization. In the related art, however, a node may still exit abnormally while the cluster is running, leaving replicas with no resources on which to be scheduled. Moreover, in a multi-cluster environment, sub-clusters may compete for resources during task processing, which impairs the accuracy and reliability of cluster resource scheduling.

Summary of the Invention

In view of this, embodiments of the present invention provide a cluster resource scheduling method, apparatus, software program, electronic device and storage medium that can execute a cluster resource scheduling program based on the number of replicas that failed scheduling, so as to make use of the maximum number of available replicas in the cluster resource scheduling environment. This guarantees the availability of workloads, improves the accuracy and reliability of cluster resource scheduling, improves the utilization efficiency of cluster resources, ensures the data processing speed of cloud server users, and improves the user experience.

The technical solutions of the embodiments of the present invention are implemented as follows:

An embodiment of the present invention provides a cluster resource scheduling method, the method comprising:

configuring a workload in a cluster resource scheduling environment, and determining a timeout queue that carries the workload;

when the workload in the timeout queue reaches a timeout state, adjusting, by a controller component, the state of the workload to a secondary scheduling state;

when a scheduler component determines that the state of the workload is the secondary scheduling state, sending, based on information about the workload, a failed-replica-count detection request to a corresponding estimator component;

determining, by the estimator component in response to the failed-replica-count detection request, the number of replicas that failed scheduling, and sending the number of replicas that failed scheduling to the scheduler component; and

executing, by the scheduler component, a cluster resource scheduling program based on the number of replicas that failed scheduling, so as to make use of the maximum number of available replicas in the cluster resource scheduling environment.
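The interaction among the controller, scheduler and estimator components in the method steps above can be sketched roughly as follows. This is a minimal illustration rather than the claimed implementation: the names `Workload`, `State`, `controller_tick`, `scheduler_tick` and `estimate_failed_replicas` are hypothetical, the timeout is modelled as a fixed number of seconds since enqueueing, and the estimator is reduced to a stand-in that subtracts running replicas from expected replicas.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PENDING = "pending"
    SECONDARY_SCHEDULING = "secondary_scheduling"


@dataclass
class Workload:
    name: str
    expected_replicas: int
    running_replicas: int
    enqueued_at: float            # time the workload entered the timeout queue
    state: State = State.PENDING


def controller_tick(queue, now, timeout_s):
    """Controller: move workloads that have waited past the timeout
    into the secondary scheduling state."""
    for w in queue:
        if now - w.enqueued_at >= timeout_s:
            w.state = State.SECONDARY_SCHEDULING


def estimate_failed_replicas(w):
    """Estimator stand-in (assumption): replicas that are not running."""
    return max(w.expected_replicas - w.running_replicas, 0)


def scheduler_tick(queue, estimator=estimate_failed_replicas):
    """Scheduler: for each workload in the secondary scheduling state,
    ask the estimator how many replicas failed scheduling and collect
    the counts that need to be rescheduled."""
    plan = {}
    for w in queue:
        if w.state is State.SECONDARY_SCHEDULING:
            plan[w.name] = estimator(w)
    return plan
```

In this sketch the scheduler only returns a plan of replica counts per workload; where those replicas are placed would depend on the maximum number of available replicas reported for each sub-cluster.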

An embodiment of the present invention further provides a cluster resource scheduling apparatus, comprising:

an information transmission device, configured to configure a workload in a cluster resource scheduling environment and to determine a timeout queue that carries the workload;

an information processing device, configured for a controller component to adjust the state of the workload to a secondary scheduling state when the workload in the timeout queue reaches a timeout state;

the information processing device being configured to send, when a scheduler component determines that the state of the workload is the secondary scheduling state, a failed-replica-count detection request to a corresponding estimator component based on information about the workload;

the information processing device being configured for the estimator component to determine, in response to the failed-replica-count detection request, the number of replicas that failed scheduling, and to send the number of replicas that failed scheduling to the scheduler component; and

the information processing device being configured for the scheduler component to execute a cluster resource scheduling program based on the number of replicas that failed scheduling, so as to make use of the maximum number of available replicas in the cluster resource scheduling environment.

In the above solution, the information processing device is configured for the controller component to determine the expected number of replicas in the cluster resource scheduling environment;

the information processing device is configured for the controller component to monitor the workload in real time based on the expected number of replicas; and

the information processing device is configured to move the workload into the timeout queue when the number of replicas in the workload is less than the expected number of replicas.

In the above solution, the information processing device is configured for the controller component to check the workload in the timeout queue against the expected number of replicas when the number of replicas of the workload in the timeout queue changes;

the information processing device is configured to keep the workload in the timeout queue when the number of replicas in the workload is less than the expected number of replicas; and

the information processing device is configured to delete the workload from the timeout queue when the number of replicas in the workload is greater than or equal to the expected number of replicas.
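The queue maintenance rule described above (enqueue or keep a workload while it has fewer replicas than expected, remove it once it has caught up) can be illustrated with a short sketch. The function name `reconcile_timeout_queue` and the use of a dictionary keyed by workload name are assumptions made for illustration; keeping the original enqueue time when a workload is already queued is also an assumption, chosen so that repeated checks do not reset the timeout.

```python
def reconcile_timeout_queue(queue, name, current, expected, now):
    """Apply the timeout-queue rule to one workload.

    queue maps workload name -> time it entered the queue.
    Returns True if the workload is in the queue after the call.
    """
    if current < expected:
        # Under-replicated: enqueue, or keep the existing entry
        # (and its original enqueue time) if already queued.
        queue.setdefault(name, now)
        return True
    # Replica count has reached the expected number: remove from the queue.
    queue.pop(name, None)
    return False
```

The controller would call such a function whenever it observes a change in the replica count of a workload.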

In the above solution, the information processing device is configured for the estimator component to obtain node information and container group information for all nodes of a sub-cluster in the cluster resource scheduling environment;

the information processing device is configured for the estimator component, in response to the failed-replica-count detection request, to query the sub-cluster for the container groups associated with a working replica and to determine the container group list corresponding to those container groups; and

the information processing device is configured for the estimator component to query the container group list for container groups that failed scheduling and to calculate the number of replicas that failed scheduling from those container groups.

In the above solution, the information processing device is configured to determine, when the type of the working replica is a resource type, the list of replica controller objects corresponding to the working replica;

the information processing device is configured to look up the container group list associated with the working replica through the cache of the replica controller object list; and

the information processing device is configured to look up, when the type of the working replica is a state replica set type, the container group list associated with the working replica in the cache of the working replica.
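The counting step performed by the estimator can be sketched as a filter over a cached container group list. The `ContainerGroup` record and its `owner` and `scheduled` fields are hypothetical simplifications introduced here; in a Kubernetes cluster the equivalent signal would typically be a Pod whose `PodScheduled` condition is false, but the sketch does not call any Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerGroup:
    name: str
    owner: str        # workload that owns this container group (assumed field)
    scheduled: bool   # False when no node could be found for it (assumed field)


def count_failed_replicas(container_groups, workload):
    """Return how many container groups of `workload` failed scheduling."""
    # Step 1: container group list associated with the workload.
    owned = [cg for cg in container_groups if cg.owner == workload]
    # Step 2: count the container groups in that list that were not scheduled.
    return sum(1 for cg in owned if not cg.scheduled)
```

The result is the value the estimator would return to the scheduler in response to a failed-replica-count detection request.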

In the above solution, the information processing device is configured for the estimator component to obtain, when the estimator component starts, node information and container group information for all nodes of a sub-cluster in the cluster resource scheduling environment;

the information processing device is configured for the estimator component to select, in response to a request to estimate the maximum number of available replicas, the nodes that match the workload from all nodes of the cluster resources;

the information processing device is configured to determine the container group information corresponding to each node that matches the workload and, based on the container group information, to determine the maximum number of available replicas of each node that matches the workload; and

the information processing device is configured to determine the maximum number of available replicas in the cluster resource scheduling environment based on the maximum number of available replicas of each node that matches the workload.
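One way to read the estimation steps above is sketched below. The text does not specify how the per-node figure is computed or how per-node figures are combined, so the sketch makes two assumptions: a node's maximum is the largest number of replicas whose resource requests fit into its free capacity (allocatable minus what existing container groups already request), and the cluster-wide maximum is the sum over matching nodes. The `Node` record, the resource keys and the label-based `selector` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    allocatable: dict                             # e.g. {"cpu_m": 4000, "mem_mi": 8192}
    used: dict = field(default_factory=dict)      # requested by existing container groups
    labels: dict = field(default_factory=dict)


def node_max_replicas(node, request):
    """Largest replica count whose per-replica `request` fits in the node's free capacity."""
    fits = []
    for resource, need in request.items():
        if need <= 0:
            continue  # a zero request does not constrain the node
        free = node.allocatable.get(resource, 0) - node.used.get(resource, 0)
        fits.append(max(free, 0) // need)
    # The scarcest resource decides how many replicas fit.
    return min(fits) if fits else 0


def cluster_max_replicas(nodes, request, selector):
    """Sum the per-node maxima over nodes whose labels match `selector`."""
    matching = [
        n for n in nodes
        if all(n.labels.get(k) == v for k, v in selector.items())
    ]
    return sum(node_max_replicas(n, request) for n in matching)
```

For example, a node with 3000 millicores and 6144 MiB free can host six replicas that each request 500 millicores and 1024 MiB.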

An embodiment of the present invention further provides an electronic device, the electronic device comprising:

a memory, configured to store executable instructions; and

a processor, configured to implement the foregoing cluster resource scheduling method when running the executable instructions stored in the memory.

An embodiment of the present invention further provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the foregoing cluster resource scheduling method.

An embodiment of the present application provides a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the cluster resource scheduling method provided by the embodiments of the present application.

The embodiments of the present invention have the following beneficial effects:

In the embodiments of the present invention, a workload is configured in a cluster resource scheduling environment and a timeout queue that carries the workload is determined; when the workload in the timeout queue reaches a timeout state, the controller component adjusts the state of the workload to a secondary scheduling state; when the scheduler component determines that the state of the workload is the secondary scheduling state, it sends, based on information about the workload, a failed-replica-count detection request to the corresponding estimator component; the estimator component, in response to the failed-replica-count detection request, determines the number of replicas that failed scheduling and sends that number to the scheduler component; and the scheduler component executes the cluster resource scheduling program based on the number of replicas that failed scheduling, so as to make use of the maximum number of available replicas in the cluster resource scheduling environment. This guarantees the availability of workloads, improves the accuracy and reliability of cluster resource scheduling, improves the utilization efficiency of cluster resources, ensures the data processing speed of cloud server users, and improves the user experience.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of a usage scenario of the cluster resource scheduling method provided by an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention;

FIG. 3 is an optional schematic flowchart of the cluster resource scheduling method provided by an embodiment of the present invention;

FIG. 4 is a schematic architecture diagram of the cluster resource scheduling apparatus in an embodiment of the present invention;

FIG. 5 is an optional schematic flowchart of the cluster resource scheduling method provided by an embodiment of the present invention;

FIG. 6 is an optional schematic flowchart of the cluster resource scheduling method provided by an embodiment of the present invention;

FIG. 7 is an optional schematic flowchart of the cluster resource scheduling method provided by an embodiment of the present invention.

Detailed Description

To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments should not be regarded as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

The following description refers to "some embodiments", which describe a subset of all possible embodiments. It can be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with one another where no conflict arises.

Before the embodiments of the present invention are described in further detail, the nouns and terms used in the embodiments are explained. The following explanations apply to the nouns and terms used in the embodiments of the present invention.

1) "In response to" indicates the condition or state on which a performed operation depends. When the condition or state is satisfied, the one or more operations performed may be executed in real time or with a set delay; unless otherwise specified, there is no restriction on the order in which multiple operations are performed.

2) Terminals include, but are not limited to, ordinary terminals and dedicated terminals, where an ordinary terminal maintains a long connection and/or a short connection with the transmission channel, and a dedicated terminal maintains a long connection with the transmission channel.

3) Client: the carrier that implements a specific function in a terminal. For example, a mobile client (APP) is the carrier of a specific function in a mobile terminal, such as producing reports or displaying reports.

4) Component: a functional module of a mini-program view, also called a front-end component, such as the buttons, titles, tables, sidebars, content and footers in a page. A component contains modular code so that it can be reused across different pages of the mini-program.

5) Server cluster: many servers gathered together to provide the same service, so that to the client they appear to be a single server. A server cluster can use multiple computers for parallel computing to achieve high computing speed, and can also use multiple computers as backups so that the whole system keeps running normally when any one machine fails. The method for handling hard disk failures in a server cluster provided in this application can be applied to cloud server scenarios and distributed server scenarios to detect the state of, and repair faults in, server hard disks in those scenarios. Specifically, a cloud server (Cloud Virtual Machine, CVM) is a simple, efficient, secure and reliable computing service with elastically scalable processing capacity. It is simpler and more efficient to manage than a traditional single physical server. Without purchasing hardware in advance, a user can quickly create or release any number of cloud servers for the user's business processes and store the data of the cloud server user. In a distributed server environment, a user's data and programs need not reside on one server but can be spread across multiple servers. Such an environment likewise requires a large number of hard disks, and likewise relies on the method for handling hard disk failures in a server cluster provided in this application to detect the state of, and repair faults in, the server hard disks.

6) The container cluster management system Kubernetes, also called K8S, is an open-source container operation platform that can combine several containers into one service and dynamically assign the hosts on which containers run, which greatly facilitates the use of containers. With Kubernetes, applications can be deployed and scaled quickly, new application features can be integrated seamlessly, and the use of hardware resources can be optimized.

A node is the basic element of a container cluster. Depending on the business, a node may be a virtual machine or a physical machine. Each node contains the basic components required to run container groups (Pods), including kubelet (the container management component) and kube-proxy (the network proxy component).

The Master node is the cluster control node, which manages and controls the entire cluster. All Kubernetes control commands are sent to it, and it is responsible for carrying them out. The kube-apiserver (resource access component), kube-controller-manager (controller management component) and kube-scheduler (scheduling component) running on the Master node maintain the healthy working state of the whole cluster by communicating continuously with the kubelet and kube-proxy on the worker nodes (Nodes). If the services on the Master node cannot reach a Node, that Node is marked as unavailable and newly created Pods (container groups) are no longer scheduled to it. The Master itself requires additional monitoring so that it does not become a single point of failure for the cluster, so the Master services also need to be deployed for high availability.

Nodes other than the Master are called Nodes or Worker nodes. The node listing command (kubectl get nodes) can be run on the Master to view the Nodes in the cluster. Each Node is assigned some workloads (Docker containers) by the Master node, and when a Node goes down, the workloads on it are automatically moved to other nodes by the Master node.

Pod (container group): the smallest and simplest basic unit that Kubernetes creates or deploys. A Pod represents one microservice process running on the cluster, and a microservice process encapsulates an edge container that provides the microservice application (there may also be multiple edge containers), storage resources, an independent network IP, and policy options that govern how the containers run.

7) Workload: a workload is a type of application that can contain multiple replica instances.

8) Replica: an instance unit of a workload; each replica instance is an independent container.

9) Secondary scheduling: scheduling that reallocates cluster resources during task processing to adapt to the requirements of the task.

Before the cluster resource scheduling method provided by this application is introduced, the defects of the related art are briefly described. In the related art, resource scheduling in a cloud network usually takes one of the following forms:

1) An overloaded-node detection step treats nodes whose resource utilization exceeds 90% as overloaded replicas, and these replicas are then scheduled and migrated, ultimately balancing the load across the nodes of the whole storage cluster. The defect of this approach is that it applies only to a single cluster and cannot be extended to multiple clusters.

2) An application container deployment instruction sent by the user and cluster resource usage information uploaded by the federated cluster are first received; the number of replicas to deploy in each sub-cluster of the federated cluster is then determined from the total number of replicas of the application template in the deployment instruction and from the cluster resource usage information, ultimately achieving replica scheduling that takes the resource state of the sub-clusters into account. The defect of this approach is that a node may still exit abnormally while the cluster is running, leaving replicas with no resources on which to be scheduled, and sub-clusters may still compete for resources, so there remains a risk that replicas cannot run normally.

3) At fixed intervals, or when a new node joins, the Pods that need to be scheduled are selected according to the resource utilization of each node and the average resource utilization of all nodes in the cluster, and these Pods are migrated to nodes whose resource utilization is below the average. The defect of this approach is that it cannot handle the case where cluster resources are insufficient; it can only balance utilization within a single cluster that has enough remaining resources, so the risk of system downtime remains.

To overcome the above defects, this application provides a cluster resource scheduling method, apparatus, software program, electronic device and storage medium. FIG. 1 is a schematic diagram of a usage scenario of the cluster resource scheduling method provided by an embodiment of the present invention. Referring to FIG. 1, with the continuous development of computer technology, cloud servers (Cloud Virtual Machine, CVM) can provide secure and reliable elastic computing services and can offer different instance types to meet users' specific usage scenarios. The terminals (including terminal 10-1 and terminal 10-2) are provided with clients capable of performing different functions. Through these clients, the terminals obtain different information from the corresponding cloud server 200 over the network 300 and can deploy different services on the cloud server. The terminals are connected to the server 200 through the network 300, which may be a wide area network, a local area network, or a combination of the two, and which uses wireless links for data transmission. The instance types provided by the cloud server are composed of different combinations of CPU, memory, storage and network, and the user's business data is stored on the hard disks of the cloud server. While the cloud server is running, however, task processing produces a large amount of resource fragmentation, which causes resource redundancy, reduces the processing speed of the cloud server network, slows task processing, and degrades the performance of the cloud server network. In the embodiments provided by the present invention, the cloud server applications running on the cloud server 200 may be written in software code environments of different programming languages, and the code objects may be different types of code entities. For example, in C software code a code object may be a function; in Java software code a code object may be a class; in Objective-C (OC) on iOS a code object may be a piece of object code; and in C++ software code a code object may be a class or a function, which executes processing instructions from different terminals. This application does not distinguish the source of the compilation environment of the cloud server.

The structure of the cluster resource scheduling apparatus of the embodiments of the present invention is described in detail below. The apparatus may be implemented in various forms, for example as a dedicated terminal with the processing functions of the cluster resource scheduling apparatus, or as a server provided with those functions, such as the server 200 in FIG. 1 above. FIG. 2 is a schematic structural diagram of the cluster resource scheduling apparatus provided by an embodiment of the present invention. It can be understood that FIG. 2 shows only an exemplary structure of the apparatus rather than its entire structure, and that part or all of the structure shown in FIG. 2 may be implemented as required.

The cluster resource scheduling apparatus provided by the embodiments of the present invention includes at least one processor 201, a memory 202, a user interface 203 and at least one network interface 204. The components of the cluster resource scheduling apparatus are coupled together through a bus system 205. It can be understood that the bus system 205 implements the connections and communication among these components. In addition to a data bus, the bus system 205 includes a power bus, a control bus and a status signal bus. For clarity, however, all of these buses are labeled as the bus system 205 in FIG. 2.

其中，用户接口203可以包括显示器、键盘、鼠标、轨迹球、点击轮、按键、按钮、触感板或者触摸屏等。The user interface 203 may include a display, a keyboard, a mouse, a trackball, a click wheel, keys, buttons, a touch pad or a touch screen, and the like.

It can be understood that the memory 202 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory. The memory 202 in the embodiment of the present invention can store data to support the operation of a terminal (e.g., 10-1). Examples of such data include any computer program that operates on the terminal (e.g., 10-1), such as an operating system and application programs. The operating system contains various system programs, such as a framework layer, a core library layer and a driver layer, which are used to implement various basic services and to process hardware-based tasks; the application programs may include various applications. The terminals involved in the embodiments of the present invention include, but are not limited to, mobile phones, computers, intelligent voice interaction devices, smart home appliances and vehicle-mounted terminals. The embodiments of the present invention can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, smart transportation and assisted driving, so that the cluster resource scheduling method provided by the present invention can be executed in these scenarios.

In some embodiments, the cluster resource scheduling apparatus provided by the embodiments of the present invention may be implemented by a combination of software and hardware. As an example, the apparatus may be a processor in the form of a hardware decoding processor that is programmed to execute the cluster resource scheduling method provided by the embodiments of the present invention. For example, the processor in the form of a hardware decoding processor may use one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.

As an example in which the cluster resource scheduling apparatus is implemented by combining software and hardware, the apparatus may be directly embodied as a combination of software modules executed by the processor 201. The software modules may be located in a storage medium, and the storage medium is located in the memory 202. The processor 201 reads the executable instructions included in the software modules in the memory 202 and, in combination with the necessary hardware (for example, the processor 201 and other components connected to the bus 205), completes the cluster resource scheduling method provided by the embodiments of the present invention.

As an example, the processor 201 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, where the general-purpose processor may be a microprocessor or any conventional processor.

As an example in which the cluster resource scheduling apparatus is implemented in hardware, the apparatus may be executed directly by the processor 201 in the form of a hardware decoding processor. For example, the cluster resource scheduling method provided by the embodiments of the present invention may be implemented by one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.

The memory 202 in the embodiment of the present invention is used to store various types of data to support the operation of the cluster resource scheduling apparatus. Examples of such data include any executable instructions that operate on the cluster resource scheduling apparatus; the program implementing the cluster resource scheduling method of the embodiments of the present invention may be contained in these executable instructions.

In other embodiments, the cluster resource scheduling apparatus provided by the embodiments of the present invention may be implemented in software. FIG. 2 shows the cluster resource scheduling apparatus stored in the memory 202, which may be software in the form of a program, a plug-in or the like, and which includes a series of modules. As an example of the program stored in the memory 202, the cluster resource scheduling apparatus includes the following software modules: an information transmission module 2081 and an information processing module 2082. When the software modules in the cluster resource scheduling apparatus are read into RAM by the processor 201 and executed, the cluster resource scheduling method provided by the embodiments of the present invention is implemented. The functions of the software modules in the cluster resource scheduling apparatus include:

The information transmission module 2081 is configured to configure a workload in a cluster resource scheduling environment and to determine a timeout queue that carries the workload.

The information processing module 2082 is configured so that, when the workload in the timeout queue reaches a timeout state, the controller component adjusts the state of the workload to a secondary scheduling state.

The information processing module 2082 is configured so that, when the scheduler component determines that the state of the workload is the secondary scheduling state, a failed-replica-count probe request is sent to the corresponding estimator component based on the information of the workload.

The information processing module 2082 is configured so that the estimator component, in response to the failed-replica-count probe request, determines the number of replicas that failed to schedule and sends that number to the scheduler component.

The information processing module 2082 is configured so that the scheduler component, based on the number of replicas that failed to schedule, executes the cluster resource scheduler so as to utilize the maximum number of available replicas in the cluster resource scheduling environment.

The information processing module 2082 is configured to, in response to a cluster resource scheduling mode matching the cluster resources, allocate corresponding cluster resources to the to-be-processed tasks according to the priority of the to-be-processed tasks.

Based on the electronic device shown in FIG. 2, in one aspect of the present application, a computer program product or computer program is further provided. The computer program product or computer program includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the methods provided in the various optional implementations of the above cluster resource scheduling method.

Referring to FIG. 3, FIG. 3 is an optional schematic flowchart of the cluster resource scheduling method provided by an embodiment of the present invention. It can be understood that the steps shown in FIG. 3 may be performed by various electronic devices running the cluster resource scheduling apparatus, for example a dedicated terminal with a cluster resource scheduling function, a server, a server cluster controller, or a control terminal of a cloud network server. The dedicated terminal with the cluster resource scheduling apparatus may be packaged in the server 200 shown in FIG. 1, so as to execute the corresponding software modules of the cluster resource scheduling apparatus shown in FIG. 2 above. The steps shown in FIG. 3 are described below.

Step 301: The cluster resource scheduling apparatus configures a workload in the cluster resource scheduling environment and determines a timeout queue that carries the workload.

In some embodiments of the present invention, for a cloud server cluster environment, the cluster resource scheduling apparatus may include different types of components, for example a controller component, a scheduler component and an estimator component. Specifically, referring to FIG. 4, FIG. 4 is a schematic diagram of the architecture of the cluster resource scheduling apparatus in an embodiment of the present invention. The controller component watches all workloads and places only those workloads that have not reached the expected number of replicas into the timeout queue for timing. When a timeout event occurs, it updates the state of the workload to indicate that secondary scheduling is required. The scheduler component continuously watches all workloads. When a workload enters the state that requires secondary scheduling, the scheduler sends a request to the estimator to obtain the number of replicas of that workload that failed to schedule in each cluster, and performs secondary scheduling based on the result. The estimator component watches the replicas and nodes of one sub-cluster to collect statistics on cluster resource usage. When it receives a failed-replica statistics request from the scheduler, it computes and returns, in real time, the number of replicas that failed to schedule in that cluster. The controller component, the scheduler component, the estimator component and the user-created workloads together form the control plane.

The embodiments of the present invention can be implemented in combination with cloud technology. Cloud technology refers to a hosting technology that unifies hardware, software, network and other resources within a wide area network or a local area network to realize the computation, storage, processing and sharing of data. It can also be understood as a general term for the network technology, information technology, integration technology, management platform technology and application technology applied under the cloud computing business model. The background services of technical network systems, such as video websites, picture websites and portal websites, require a large amount of computing and storage resources, so cloud technology needs to be supported by cloud computing.

It should be noted that cloud computing is a computing model that distributes computing tasks over a resource pool composed of a large number of computers, enabling various application systems to obtain computing power, storage space and information services as needed. The network that provides the resources is called the "cloud". From the user's point of view, the resources in the "cloud" are infinitely expandable and can be obtained at any time, used on demand, expanded at any time, and paid for according to usage. A basic capability provider of cloud computing establishes a cloud computing resource pool platform (a cloud platform for short), generally referred to as Infrastructure as a Service (IaaS), and deploys multiple types of virtual resources in the resource pool for external customers to select and use. The cloud computing resource pool mainly includes computing devices (which may be virtualized machines, including operating systems), storage devices and network devices. When a user stores data on a cloud server or deploys different application processes, monitoring the operating parameters of the server cluster's hard disks makes it possible to detect potential hard disk failures in time and to avoid the loss of user data caused by failures of hard disks that have issued failure warnings.

Cloud storage is a new concept extended and developed from the concept of cloud computing. A distributed cloud storage system (hereinafter referred to as a storage system) is a storage system that, through functions such as cluster applications, grid technology and distributed storage file systems, brings together a large number of storage devices of different types in the network (storage devices are also called storage nodes) by means of application software or application interfaces, so that they work together to provide data storage and service access functions externally. At present, the storage method of such a storage system is as follows. Logical volumes are created, and when a logical volume is created, physical storage space is allocated to it; this physical storage space may consist of the disks of one storage device or of several storage devices. A client stores data on a logical volume, that is, it stores the data on a file system. The file system divides the data into many parts, each of which is an object. An object contains not only data but also additional information such as a data identifier (ID). The file system writes each object into the physical storage space of the logical volume and records the storage location information of each object, so that when the client requests access to the data, the file system can let the client access the data according to the storage location information of each object. The process by which the storage system allocates physical storage space to a logical volume is as follows: the physical storage space is divided into stripes in advance according to an estimate of the capacity of the objects to be stored in the logical volume (this estimate often leaves a large margin relative to the capacity of the objects actually stored) and the group of the redundant array of independent disks (RAID, Redundant Array of Independent Disks). One logical volume can be understood as one stripe, and physical storage space is thereby allocated to the logical volume.

Taking the implementation of the cluster resource scheduling method provided by this application through a cloud server network as an example: a workload is configured in the cluster resource scheduling environment, and a timeout queue carrying the workload is determined; when the workload in the timeout queue reaches the timeout state, the controller component adjusts the state of the workload to the secondary scheduling state; when the scheduler component determines that the state of the workload is the secondary scheduling state, it sends a failed-replica-count probe request to the corresponding estimator component based on the information of the workload; the estimator component, in response to the failed-replica-count probe request, determines the number of replicas that failed to schedule and sends that number to the scheduler component; and the scheduler component, based on the number of replicas that failed to schedule, executes the cluster resource scheduler so as to utilize the maximum number of available replicas in the cluster resource scheduling environment. This guarantees the availability of the workload, improves the accuracy and reliability of cluster resource scheduling, and improves the utilization efficiency of cluster resources.

When the method is applied to a cloud product, the front end of the cloud product may be a Web UI component that receives Spark-related parameters filled in by the user and generates job data according to those parameters. The cluster manager (ClusterManager) may be an open source cluster resource scheduling platform such as YARN, Mesos or Kubernetes. Spark itself already supports these open source platforms, that is, the protocols between the Spark components and the ClusterManager component are compatible. The Driver is the job driver, a Worker Node is a worker node, an Executor is a task execution component, and a task is the smallest unit of execution. Further, Spark SQL is the package Spark uses to operate on structured data. With Spark SQL, data can be queried using the SQL language, and Spark SQL supports multiple data sources, such as tables of the data warehouse tool Hive. The stream computing component is a component provided by Spark for performing stream computation on real-time data, and it provides an application programming interface (API, Application Programming Interface) for operating on data streams.

Step 302: When the workload in the timeout queue reaches the timeout state, the controller component adjusts the state of the workload to the secondary scheduling state.

In some embodiments of the present invention, configuring a workload in the cluster resource scheduling environment and determining the timeout queue that carries the workload includes:

The controller component determines the expected number of replicas for the cluster resource scheduling environment; the controller component checks the workload in real time against the expected number of replicas; and when the number of replicas in the workload is less than the expected number of replicas, the workload is moved into the timeout queue. Because cloud server clusters are used in a wide variety of environments, the expected number of replicas can be set flexibly according to the environment in which the cloud server cluster is used. For example, when the cloud server cluster processes Tenpay payments made through an instant messaging client, or information about borrowing funds to purchase goods within an instant messaging client, the number of tasks is large, so the expected number of replicas can be set to 10,000. For a video processing task that can be completed by a single server cluster, the expected number of replicas can be set to 100, so as to make full use of the resources of the server cluster and reduce resource waste.

In some embodiments of the present invention, when the number of replicas of a workload in the timeout queue changes, the controller component checks the workload in the timeout queue against the expected number of replicas. When the number of replicas in the workload is less than the expected number of replicas, the workload is kept in the timeout queue; when the number of replicas in the workload is greater than or equal to the expected number of replicas, the workload is deleted from the timeout queue. Because the number of replicas of a workload in the timeout queue changes dynamically, checking the workloads in the timeout queue in real time allows the queue to be adjusted promptly and reduces the number of workloads that undergo secondary resource scheduling.
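The timeout queue behaviour described above can be sketched as follows. This is a minimal illustration and not the implementation of the embodiment: the names `Workload`, `TimeoutQueue`, `observe` and `expire`, the `needs_reschedule` flag, and the use of a caller-supplied clock value `now` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Workload:
    name: str
    desired_replicas: int           # expected number of replicas
    ready_replicas: int             # replicas currently running
    needs_reschedule: bool = False  # stands in for the "secondary scheduling state"


class TimeoutQueue:
    """Tracks under-replicated workloads and the time at which each one times out."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._deadlines: Dict[str, float] = {}

    def __contains__(self, name: str) -> bool:
        return name in self._deadlines

    def observe(self, workload: Workload, now: float) -> None:
        """Called whenever a workload's replica count is observed or changes."""
        if workload.ready_replicas < workload.desired_replicas:
            # Start timing on the first shortfall; keep the existing deadline afterwards.
            self._deadlines.setdefault(workload.name, now + self.timeout_seconds)
        else:
            # Expected replica count reached: remove the workload from the queue.
            self._deadlines.pop(workload.name, None)

    def expire(self, workloads: Dict[str, Workload], now: float) -> List[str]:
        """Mark timed-out workloads for secondary scheduling and return their names."""
        expired = [name for name, deadline in self._deadlines.items() if deadline <= now]
        for name in expired:
            del self._deadlines[name]
            workloads[name].needs_reschedule = True
        return expired
```

In this sketch a workload that recovers before its deadline simply leaves the queue, so only workloads that stay under-replicated for the whole timeout window are flagged.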

Step 303: When the scheduler component of the cluster resource scheduling apparatus determines that the state of the workload is the secondary scheduling state, it sends a failed-replica-count probe request to the corresponding estimator component based on the information of the workload.

The information of the workload (Workload) may include resources such as StatefulSet, Deployment, ReplicaSet and DaemonSet. This resource information contains the number of application instances, the affinity rules of the application instances, and so on. Only application instances that match the affinity rules of the Workload can be deployed on a given computing node. A resource object in a Kubernetes cluster may be an application (APP) in the Kubernetes cluster, for example one or more of a Deployment, a StatefulSet, an Ingress (routing), a pod (container group), a container, a Service, a ReplicationController (RC), and other resources.
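As a rough illustration of Step 303, the scheduler can be thought of as fanning one probe request out to the estimator of each cluster that holds the workload. The sketch below is an assumption-laden simplification: `ProbeRequest`, `probe_failed_replicas`, the state string `"NeedsReschedule"`, and the modelling of an estimator as a plain callable are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ProbeRequest:
    """Identifies the workload whose failed replicas should be counted (hypothetical shape)."""
    kind: str        # e.g. "Deployment" or "StatefulSet"
    namespace: str
    name: str


# An estimator is modelled as a callable: probe request in, failed replica count out.
Estimator = Callable[[ProbeRequest], int]


def probe_failed_replicas(
    workload_state: str,
    request: ProbeRequest,
    assigned_clusters: List[str],
    estimators: Dict[str, Estimator],
) -> Dict[str, int]:
    """Ask each assigned cluster's estimator how many replicas failed to schedule."""
    if workload_state != "NeedsReschedule":
        # Only workloads in the secondary scheduling state are probed.
        return {}
    return {cluster: estimators[cluster](request) for cluster in assigned_clusters}
```

In a real deployment the estimator call would be a remote request; a callable keeps the example self-contained.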

Step 304: The estimator component of the cluster resource scheduling apparatus, in response to the failed-replica-count probe request, determines the number of replicas that failed to schedule and sends that number to the scheduler component.

The process of determining the number of replicas that failed to schedule is further described below with reference to FIG. 5.

Referring to FIG. 5, FIG. 5 is an optional schematic flowchart of the cluster resource scheduling method provided by an embodiment of the present invention. It can be understood that the steps shown in FIG. 5 may be performed by various electronic devices running the cluster resource scheduling apparatus, for example a dedicated terminal with a cluster resource scheduling function, a server, a server cluster controller, or a control terminal of a cloud network server. The dedicated terminal with the cluster resource scheduling apparatus may be packaged in the server 200 shown in FIG. 1, so as to execute the corresponding software modules of the cluster resource scheduling apparatus shown in FIG. 2 above. The steps shown in FIG. 5 are described below.

Step 501: The estimator component obtains the node information and container group (pod) information of all nodes of the sub-cluster in the cluster resource scheduling environment.

Step 502: The estimator component, in response to the failed-replica-count probe request, queries the sub-cluster for the container groups associated with the working copy and determines the container group list corresponding to those container groups.

In some embodiments of the present invention, when the type of the working copy is the resource type, the list of replica controller objects corresponding to the working copy is determined, and the container group list associated with the working copy is looked up through the cache of that replica controller object list. Taking K8S as an example, a Kubernetes cluster generally includes a master node (Master) and multiple computing nodes (Node) that are each communicatively connected to the master node. The master node manages and controls the computing nodes. A computing node acts as a workload node and contains the original applications deployed directly on the node as well as multiple container groups (Pods), each of which encapsulates one or more containers (Container) that carry applications. The Pod is the basic operating unit of Kubernetes and the smallest deployment unit that can be created, debugged and managed. When the working copy is of the resource type (Deployment type), it handles deployment-type tasks. A Deployment integrates functions such as rollout, rolling upgrade, replica creation, pausing a rollout, resuming a rollout, and rolling back to a previous (successful/stable) version of the Deployment. To a certain extent, a Deployment enables unattended rollout, which greatly reduces the communication complexity and operational risk of the rollout process. For a working copy of the Deployment type, the list of ReplicaSet objects associated with the Deployment can be determined first, and the associated Pod list can then be found in the cache through the ReplicaSet replica controller. A ReplicaSet is a replica controller in Kubernetes whose main role is to control the pods it manages so that the number of pod replicas is always maintained at the preset number.

In some embodiments of the present invention, when the type of the working copy is the stateful replica set type, the container group list associated with the working copy is looked up in the cache of the working copy. For a working copy of the StatefulSet type, the list of Pod objects associated with it can be found directly in the cache, which saves lookup time and improves the speed of resource scheduling.
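The two lookups described above can be sketched over cached objects. The example assumes that the cache holds plain dictionaries shaped like Kubernetes API objects and that ownership is expressed through `metadata.ownerReferences`, which is how Kubernetes links Pods to ReplicaSets and ReplicaSets to Deployments. The function names are hypothetical, and the source does not state that the embodiment resolves ownership in exactly this way.

```python
from typing import Dict, List


def owned_by(obj: Dict, owner_uid: str) -> bool:
    """True if obj lists owner_uid among its metadata.ownerReferences."""
    refs = obj.get("metadata", {}).get("ownerReferences", [])
    return any(ref.get("uid") == owner_uid for ref in refs)


def pods_for_deployment(deployment: Dict, replicasets: List[Dict], pods: List[Dict]) -> List[Dict]:
    """Deployment -> its ReplicaSets -> the Pods those ReplicaSets own."""
    dep_uid = deployment["metadata"]["uid"]
    rs_uids = {rs["metadata"]["uid"] for rs in replicasets if owned_by(rs, dep_uid)}
    return [pod for pod in pods if any(owned_by(pod, uid) for uid in rs_uids)]


def pods_for_statefulset(statefulset: Dict, pods: List[Dict]) -> List[Dict]:
    """A StatefulSet owns its Pods directly, so a single lookup is enough."""
    return [pod for pod in pods if owned_by(pod, statefulset["metadata"]["uid"])]
```

The Deployment path needs two hops through the cache while the StatefulSet path needs one, which matches the shorter lookup the text attributes to StatefulSet working copies.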

Step 503: The estimator component looks up the container groups that failed to schedule in the container group list and, based on those container groups, calculates the number of replicas that failed to schedule.
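Step 503 can be illustrated with a short sketch. The text does not say how a container group is recognised as having failed to schedule; the example assumes the usual Kubernetes signal, namely a Pod whose `PodScheduled` condition has status `"False"` with reason `"Unschedulable"`. Pods are represented as plain dictionaries shaped like Kubernetes Pod objects, and the function names are hypothetical.

```python
from typing import Dict, List


def is_unschedulable(pod: Dict) -> bool:
    """True if the Pod's PodScheduled condition reports that scheduling failed."""
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") == "PodScheduled":
            return cond.get("status") == "False" and cond.get("reason") == "Unschedulable"
    # No PodScheduled condition yet: not counted as a scheduling failure.
    return False


def count_failed_replicas(pods: List[Dict]) -> int:
    """Number of Pods in the list that failed to schedule."""
    return sum(1 for pod in pods if is_unschedulable(pod))
```

The returned count is the value the estimator would send back to the scheduler for this cluster.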

Step 305: The scheduler component of the cluster resource scheduling apparatus, based on the number of replicas that failed to schedule, executes the cluster resource scheduler so as to utilize the maximum number of available replicas in the cluster resource scheduling environment.
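One way to picture the secondary scheduling in Step 305 is as moving the failed replicas away from the clusters that could not place them and onto clusters that still report spare capacity. The sketch below is only one possible policy, chosen for illustration: the function name `reschedule`, the greedy most-spare-capacity-first order, and the rule of skipping clusters that reported failures are assumptions, not details given in the source.

```python
from typing import Dict, Tuple


def reschedule(
    assigned: Dict[str, int],   # cluster -> replicas currently assigned
    failed: Dict[str, int],     # cluster -> replicas that failed to schedule there
    available: Dict[str, int],  # cluster -> additional replicas the cluster could host
) -> Tuple[Dict[str, int], int]:
    """Return the new per-cluster assignment and the number of replicas still unplaced."""
    # Take the failed replicas back from the clusters that could not schedule them.
    new_assignment = {c: n - failed.get(c, 0) for c, n in assigned.items()}
    to_place = sum(failed.values())

    # Greedy policy: fill the clusters with the most spare capacity first,
    # skipping clusters that have just reported scheduling failures.
    candidates = sorted(
        (c for c in available if failed.get(c, 0) == 0),
        key=lambda c: (-available[c], c),
    )
    for cluster in candidates:
        if to_place == 0:
            break
        take = min(to_place, available[cluster])
        if take > 0:
            new_assignment[cluster] = new_assignment.get(cluster, 0) + take
            to_place -= take
    return new_assignment, to_place
```

A non-zero second return value means the clusters together cannot host every replica, in which case the workload would remain under-replicated after this round.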

In some embodiments of the present invention, the maximum number of available replicas also needs to be determined before the cluster resource scheduler is executed. Specifically, referring to FIG. 6, FIG. 6 is an optional schematic flowchart of the cluster resource scheduling method provided by an embodiment of the present invention. It can be understood that the steps shown in FIG. 6 may be performed by various electronic devices running the cluster resource scheduling apparatus, for example a dedicated terminal with a cluster resource scheduling function, a server, a server cluster controller, or a control terminal of a cloud network server. The dedicated terminal with the cluster resource scheduling apparatus may be packaged in the server 200 shown in FIG. 1, so as to execute the corresponding software modules of the cluster resource scheduling apparatus shown in FIG. 2 above. The steps shown in FIG. 6 are described below.

Step 601: When the estimator component starts, it acquires the node information and container group information of all nodes of the sub-cluster in the cluster resource scheduling environment.

Step 602: In response to a maximum available replica number estimation request, the estimator component filters, from all nodes of the cluster resources, the nodes that match the workload.

Step 603: Determine the container group information corresponding to each node that matches the workload and, based on that container group information, determine the maximum number of available replicas of each such node.

Step 604: Based on the maximum number of available replicas of each node that matches the workload, determine the maximum number of available replicas in the cluster resource scheduling environment.
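Steps 602 to 604 can be illustrated with the sketch below. It rests on assumptions that the text does not spell out: nodes are matched to the workload by a label selector, a node's capacity for new replicas is the minimum, over all requested resources, of the free amount divided by the per-replica request, and the cluster total is the sum over matching nodes. The `Node` class and all field and resource names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    # Hypothetical, simplified node record built from node and container group information.
    name: str
    labels: Dict[str, str]
    allocatable: Dict[str, int]  # e.g. {"cpu_milli": 4000, "memory_mib": 8192}
    requested: Dict[str, int]    # total already requested by container groups on the node


def matches(node: Node, selector: Dict[str, str]) -> bool:
    """Step 602 (assumed rule): a node matches if it carries every selector label."""
    return all(node.labels.get(key) == value for key, value in selector.items())


def node_max_replicas(node: Node, per_replica: Dict[str, int]) -> int:
    """Step 603 (assumed rule): the scarcest resource limits the replica count."""
    counts = []
    for resource, need in per_replica.items():
        if need <= 0:
            continue  # a resource that is not requested imposes no limit
        free = node.allocatable.get(resource, 0) - node.requested.get(resource, 0)
        counts.append(max(free, 0) // need)
    return min(counts) if counts else 0


def cluster_max_replicas(
    nodes: List[Node], selector: Dict[str, str], per_replica: Dict[str, int]
) -> int:
    """Step 604 (assumed rule): sum the per-node maxima over matching nodes."""
    return sum(node_max_replicas(n, per_replica) for n in nodes if matches(n, selector))
```

For example, a matching node with 3000 free CPU millicores and 6144 MiB of free memory can host three replicas that each request 500 millicores and 2048 MiB, because memory runs out first.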

The cluster resource scheduling method of the present invention is described below, taking as an example a cluster resource manager that is the resource manager of a WeChat server. Refer to the schematic diagram of the usage environment of the cluster resource scheduling method shown in FIG. 1. Terminals (including terminal 11-1 and terminal 11-2) are provided with clients capable of performing different functions; through these clients, the terminals obtain different information from the corresponding server 200 over the network 300 via the WeChat application for browsing. The terminals connect to the server 200 through the network 300, which may be a wide area network, a local area network, or a combination of the two, and which uses wireless links for data transmission. The server 200 runs a cluster resource manager matched to the WeChat application to schedule resources. The terminals (for example, terminal 10-1 and terminal 10-2 in FIG. 1) may also be provided with a client that displays software for financial lending, for example a client or plug-in for conducting financial activities with virtual or physical resources or for borrowing through virtual resources. Through the corresponding client, a user can borrow from a financial institution or platform (for example, Tenpay payment in an instant messaging client, or a process in an instant messaging client for borrowing funds to purchase goods). The server (for example, server 300 in FIG. 1) is a server of an enterprise, such as a bank, a securities firm, or an internet finance company, that provides financial services such as payment, lending, and wealth management. When a user who needs to handle financial business uses a client device to access a service provided by the enterprise's client server, the client server can issue payment tasks by triggering a mini program in the instant messaging client of the user terminal. Because the number of tasks is large, the expected number of replicas can be set to 10000. To prevent sub-cluster resource contention from degrading the accuracy and reliability of cluster resource scheduling while the server cluster processes these payment tasks, refer to FIG. 7, which is an optional schematic flowchart of the cluster resource scheduling method provided by an embodiment of the present invention. The architecture of the cluster resource scheduling is shown in FIG. 4. The steps shown in FIG. 7 are described below.

Step 701: A workload is created, and the controller component continuously checks whether the workload has reached the expected number of replicas.

Step 702: When a workload has not reached the expected number of replicas, the controller component places the workload in the timeout queue.

Step 703: When a replica count update event occurs for a workload in the timeout queue, the controller component checks whether the expected number of replicas has been reached; if so, the workload is deleted from the timeout queue; otherwise, it is re-added to the timeout queue.

Step 704: When a workload in the timeout queue times out, the controller component sets the state of the workload to the secondary scheduling state and writes that state into the workload.

Step 705: The scheduler component detects that the state of the workload is the secondary scheduling state.

Step 706: The scheduler sends a failed replica number detection request to the sub-cluster estimator based on the workload information.

Step 707: The estimator component continuously monitors the replicas and nodes of the cluster to track cluster resource usage.

Step 708: The estimator component calculates the total number of replicas that failed scheduling according to the request and returns it to the scheduler component.

Step 709: Based on the total number of replicas that failed scheduling, the scheduler re-executes the scheduling program and generates the secondary scheduling result.
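The controller-side behavior of Steps 701 to 704 can be sketched as a small timeout queue. Everything named here is hypothetical: the `Workload` fields, the `SECONDARY_SCHEDULING` state string, and the choice to restart the timer whenever a workload is re-added (the text says only that the workload is "re-added" to the queue). Time is passed in explicitly so the logic stays deterministic.

```python
from dataclasses import dataclass
from typing import Dict, List

SECONDARY_SCHEDULING = "SecondaryScheduling"  # hypothetical state name


@dataclass
class Workload:
    name: str
    expected_replicas: int
    ready_replicas: int = 0
    state: str = "Scheduled"


class TimeoutQueue:
    """Tracks a deadline for each workload that is short of replicas."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._deadlines: Dict[str, float] = {}

    def add(self, workload: Workload, now: float) -> None:
        # Assumption: re-adding a workload restarts its timer.
        self._deadlines[workload.name] = now + self.timeout_seconds

    def remove(self, workload: Workload) -> None:
        self._deadlines.pop(workload.name, None)

    def __contains__(self, workload: Workload) -> bool:
        return workload.name in self._deadlines

    def pop_expired(self, now: float) -> List[str]:
        expired = [name for name, deadline in self._deadlines.items() if deadline <= now]
        for name in expired:
            del self._deadlines[name]
        return expired


def on_replica_update(queue: TimeoutQueue, workload: Workload, now: float) -> None:
    """Steps 702 and 703: dequeue when satisfied, otherwise (re-)enqueue."""
    if workload.ready_replicas >= workload.expected_replicas:
        queue.remove(workload)
    else:
        queue.add(workload, now)


def on_tick(queue: TimeoutQueue, workloads: Dict[str, Workload], now: float) -> List[str]:
    """Step 704: mark timed-out workloads for secondary scheduling."""
    expired = queue.pop_expired(now)
    for name in expired:
        workloads[name].state = SECONDARY_SCHEDULING
    return expired
```

A scheduler loop in the sense of Step 705 would then watch for workloads whose state equals `SECONDARY_SCHEDULING` and send the failed replica number detection request for each of them.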

In some embodiments of the present invention, with reference to FIG. 4, suppose that workload A has triggered secondary scheduling and that some of its replicas failed scheduling in sub-cluster 1. The expected numbers of replicas of workload A in sub-cluster 1, sub-cluster 2, and sub-cluster 3 are r1, r2, and r3, respectively; the number of replicas that failed scheduling in sub-cluster 1 is f1; and the maximum numbers of available replicas in sub-cluster 2 and sub-cluster 3 are m2 and m3, respectively.

While generating the secondary scheduling result, when m2+m3>=f1, the scheduler component fixes the expected number of replicas of workload A in sub-cluster 1 at r1-f1, takes sub-cluster 2 and sub-cluster 3 as candidate clusters, and distributes the f1 failed replicas between them in the ratio m2:m3 of their maximum available replica counts. This distribution is the secondary scheduling result for the failed replicas.

While generating the secondary scheduling result, when m2+m3<f1, the current secondary scheduling attempt is determined to have failed, and the workload waits for the next scheduling round until secondary scheduling succeeds.
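The two cases above can be sketched as a single function. The text specifies the m2:m3 ratio but not how fractional shares are rounded, so the largest-remainder rounding used here is an assumption, as are the function and parameter names. Returning `None` models the case in which the candidate sub-clusters cannot absorb all failed replicas and the attempt must be retried later.

```python
from typing import Dict, Optional


def reschedule_failed(failed: int, capacity: Dict[str, int]) -> Optional[Dict[str, int]]:
    """Split `failed` replicas across candidate sub-clusters in proportion to capacity.

    `capacity` maps a candidate sub-cluster name to its maximum number of
    available replicas (m2, m3, ...). Returns None when the total capacity
    is smaller than `failed`.
    """
    total = sum(capacity.values())
    if failed > total:
        return None  # m2 + m3 < f1: wait for the next scheduling round
    if failed == 0:
        return {name: 0 for name in capacity}

    # Integer part of each proportional share: floor(failed * m_i / total).
    shares = {name: failed * cap // total for name, cap in capacity.items()}
    leftover = failed - sum(shares.values())

    # Assumed rounding rule: hand the remaining replicas to the sub-clusters
    # with the largest fractional remainders. Because failed <= total, no
    # sub-cluster is ever assigned more than its capacity.
    by_remainder = sorted(
        capacity, key=lambda name: (failed * capacity[name]) % total, reverse=True
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares
```

With f1 = 5, m2 = 6, and m3 = 4, the call `reschedule_failed(5, {"sub-cluster-2": 6, "sub-cluster-3": 4})` assigns 3 replicas to sub-cluster 2 and 2 to sub-cluster 3, while the expected replica count in sub-cluster 1 is separately fixed at r1 - 5.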

The present invention has the following beneficial technical effects:

The present invention configures a workload in a cluster resource scheduling environment and determines a timeout queue carrying the workload. When the workload in the timeout queue reaches the timeout state, the controller component adjusts the state of the workload to the secondary scheduling state. When the scheduler component determines that the state of the workload is the secondary scheduling state, it sends a failed replica number detection request to the corresponding estimator component based on the information of the workload. In response to the request, the estimator component determines the number of replicas that failed scheduling and sends that number to the scheduler component. The scheduler component then executes the cluster resource scheduling program based on the number of replicas that failed scheduling, so as to utilize the maximum number of available replicas in the cluster resource scheduling environment. This guarantees the availability of the workload, improves the accuracy and reliability of cluster resource scheduling, improves the utilization efficiency of cluster resources, ensures the data processing speed for cloud server users, and improves the user experience.

The above are merely embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for scheduling cluster resources, the method comprising:

configuring a workload in a cluster resource scheduling environment, and determining a timeout queue carrying the workload;

when the workload in the timeout queue reaches a timeout state, a controller component adjusts the state of the workload to a secondary scheduling state;

when the scheduler component determines that the state of the workload is a secondary scheduling state, sending a detection request of the number of failed copies to a corresponding estimator component based on the information of the workload;

the estimator component responds to the detection request of the number of the failed copies, determines the number of copies failed in scheduling, and sends the number of copies failed in scheduling to the scheduler component;

the scheduler component executes a cluster resource scheduler based on the number of copies that failed in scheduling, so as to utilize the maximum number of available copies in the cluster resource scheduling environment.

2. The method of claim 1, wherein configuring a workload in a cluster resource scheduling environment and determining a timeout queue carrying the workload comprises:

the controller component determining an expected number of copies of the cluster resource scheduling environment;

the controller component detects the workload in real time based on the expected copy number;

when the number of copies in the workload is less than the expected number of copies, adjusting the workload to the timeout queue.

3. The method of claim 2, further comprising:

when the number of copies of the workload in the timeout queue changes, the controller component detects the workload in the timeout queue based on the expected number of copies;

maintaining the workload in the timeout queue when the number of copies in the workload is less than the expected number of copies;

and deleting the workload in the timeout queue when the number of copies in the workload is greater than or equal to the expected number of copies.

4. The method of claim 1, wherein the estimator component determining the number of copies that failed in scheduling in response to the detection request of the number of failed copies comprises:

the estimator component acquires node information and container group information of all nodes of a sub-cluster in a cluster resource scheduling environment;

the estimator component responds to the detection request of the number of the failed copies, inquires a container group associated with the working copy in the sub-cluster, and determines a container group list corresponding to the container group;

and the estimator component inquires the container group with failed scheduling from the container group list and calculates the number of copies with failed scheduling according to the container group with failed scheduling.

5. The method of claim 4, wherein the estimator component, in response to the failed copy number detection request, queries a container group associated with a working copy in the sub-cluster and determines a container group list corresponding to the container group, comprising:

when the type of the working copy is a resource type, determining a copy controller object list corresponding to the working copy;

searching a container group list associated with the working copy through the cache of the copy controller object list;

and when the type of the working copy is a stateful copy set (StatefulSet) type, searching a container group list associated with the working copy in a cache of the working copy.

6. The method of claim 1, further comprising:

when the estimator component is started, the estimator component acquires node information and container group information of all nodes of a sub-cluster in a cluster resource scheduling environment;

the estimator component filters nodes matching the workload from all nodes of the cluster resource in response to a maximum available copy number pre-estimation request;

determining container group information corresponding to each node matched with the workload, and determining the maximum available copy number of each node matched with the workload based on the container group information;

and determining the maximum available copy number in the cluster resource scheduling environment based on the maximum available copy number of each node matched with the workload.

7. An apparatus for cluster resource scheduling, the apparatus comprising:

the information transmission device is used for configuring the workload in a cluster resource scheduling environment and determining a timeout queue for bearing the workload;

the information processing device is used for adjusting the state of the workload to be in a secondary scheduling state by the controller component when the workload in the timeout queue reaches the timeout state;

the information processing device is used for sending a detection request of the number of failed copies to the corresponding estimator component based on the information of the workload when the scheduler component determines that the state of the workload is a secondary scheduling state;

the information processing device is used for the estimator component to respond to the detection request of the number of the failed copies, determine the number of copies which fail in scheduling and send the number of copies which fail in scheduling to the scheduler component;

the information processing device is used for the scheduler component to execute a cluster resource scheduler based on the number of copies that failed in scheduling, so as to utilize the maximum number of available copies in the cluster resource scheduling environment.

8. A software program, characterized in that the software program comprises:

a memory for storing executable instructions;

a processor, configured to execute the executable instructions stored in the memory, and implement the cluster resource scheduling method according to any one of claims 1 to 6.

9. An electronic device, characterized in that the electronic device comprises:

a memory for storing executable instructions;

10. A computer readable storage medium storing executable instructions, wherein the executable instructions when executed by a processor implement the cluster resource scheduling method of any one of claims 1 to 6.