CN111930493A - NodeManager state management method, device and computing device in cluster

Info

Publication number: CN111930493A
Application number: CN201910394996.1A
Authority: CN
Inventors: 李瑶; 许佳
Original assignee: China Mobile Communications Group Co Ltd; China Mobile Group Hubei Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; China Mobile Group Hubei Co Ltd
Priority date: 2019-05-13
Filing date: 2019-05-13
Publication date: 2020-11-13
Anticipated expiration: 2039-05-13
Also published as: CN111930493B

Abstract

Embodiments of the present invention relate to the technical field of distributed resource management and scheduling systems, and disclose a method, an apparatus, and a computing device for managing the state of a NodeManager in a cluster. The method includes: collecting network load information of the cluster, and evaluating the hardware state of the cluster according to the network load information; determining the health state of the nodes in the cluster according to the evaluation result; and taking the NodeManager offline when the state of a node is unhealthy. In this way, the embodiments of the present invention predict NodeManager failures in advance and take the NodeManager offline automatically, which keeps the system running stably and avoids the Container allocation failures, and the resulting task failures, that occur when multiple applications compete for the resources of a node host.

Description

NodeManager state management method, device and computing device in cluster

Technical Field

Embodiments of the present invention relate to the technical field of distributed resource management and scheduling systems, and in particular to a method, an apparatus, and a computing device for managing the state of a NodeManager in a cluster.

Background

With the development of computer technology, data-intensive application computing frameworks such as MapReduce, Spark, S4, and Storm keep emerging. When adopting a computing framework, factors such as resource utilization, operation and maintenance costs, and data sharing are generally considered. Users generally want to deploy all of these computing frameworks in one common cluster so that they share the cluster's resources and use them in a unified way. This gave rise to unified resource management and scheduling platforms, of which YARN (Yet Another Resource Negotiator) is a typical representative.

YARN has two roles: the ResourceManager (global resource manager, RM) and the NodeManager (node manager, NM). The ResourceManager is mainly responsible for global allocation and management, while the NodeManager is responsible for resource allocation and management on a single node. After accepting a task, the NodeManager can allocate the Application Master and Containers. When the host's resources are not exclusive to YARN, resource requests to the ResourceManager may fail.

In the prior art, YARN resource allocation treats only CPU and memory as computing resources, and these are partitioned in advance through the yarn-site.xml configuration when the cluster starts. The ResourceManager and the NodeManager maintain their connection through heartbeats, so the state of the network cannot be assessed and taken into account when allocating resources. In addition, Impala, which uses an MPP architecture, is also deployed on the hosts of the Hadoop cluster, but its resource allocation is not managed by YARN. When an MPP aggregation query is executed, a large amount of data accumulates in memory. If resources are still requested according to the configured memory and CPU at that point, Container allocation fails and the task fails as a result. Such ad hoc queries occupy a large amount of memory, but only for a short time, so reserving all of that memory for them would waste YARN resources. This approach therefore cannot cope with a node host whose resources are contended for by multiple applications.

Summary of the Invention

In view of the above problems, embodiments of the present invention provide a method, an apparatus, and a computing device for managing the state of a NodeManager in a cluster, which overcome the above problems or at least partially solve them.

According to one aspect of the embodiments of the present invention, a method for managing the state of a NodeManager in a cluster is provided, the method including:

collecting network load information of the cluster, and evaluating the hardware state of the cluster according to the network load information;

determining the health state of the nodes in the cluster according to the evaluation result; and

taking the NodeManager offline when the state of a node is unhealthy.

In an optional manner, collecting the network load information of the cluster and evaluating the hardware state of the cluster according to the network load information further includes:

collecting the network load information of the cluster; and

evaluating the network latency of the cluster according to the network load information, and evaluating the disk state of the cluster.

In an optional manner, when the host resources are not exclusive to YARN, the method further includes:

evaluating CPU usage and memory usage;

and determining the health state of the nodes in the cluster according to the evaluation result further includes:

determining the health state of the nodes in the cluster according to the results of evaluating the network latency, disk state, CPU usage, and memory usage.

In an optional manner, when the host resources are exclusive to YARN, the method further includes:

when the network latency exceeds a preset value, evaluating the network latency of the cluster in combination with historical network latencies and the corresponding node health records.

In an optional manner, the method further includes:

reconfiguring CPU resources and memory resources;

when the evaluation of the hardware state of the cluster determines that the state of the nodes in the cluster is healthy, setting the parameters in the NodeManager configuration file to the reconfigured values; and

bringing the NodeManager online.

In an optional manner, evaluating the network latency of the cluster according to the network load information further includes:

collecting the request queuing time and processing time of the RPC queue through the JMX interface of Hadoop's JMX monitoring;

averaging the request queuing times of all nodes to obtain a reference queue time, and taking the processing time of a first host as a reference processing time;

determining whether the network latency of the first host is greater than the reference queue time, or whether the network latency of a second host is greater than the reference processing time; and

when the network latency of the first host is greater than the reference queue time, or the network latency of the second host is greater than the reference processing time, determining that the state of the node is unhealthy.

In an optional manner, evaluating the disk state of the cluster further includes:

checking the disk health through a script;

determining whether a disk reports an error; and

when any disk in the cluster reports an error, determining that the state of the node is unhealthy.

In an optional manner, evaluating the CPU usage further includes:

calculating, through a script, the total number of CPU cores N, and determining the current CPU usage p of non-YARN processes and the number of CPU cores M allocated by the NodeManager;

subtracting the product of N and (1-p) from M to obtain the CPU usage evaluation score; and

when the CPU usage evaluation score exceeds a preset CPU usage threshold, determining that the state of the node is unhealthy.
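As a non-limiting illustration, the CPU usage score described above can be sketched in Python as follows. The function and parameter names, and the default threshold of 0, are assumptions made for this example; the embodiments specify only the score M - N(1-p) and its comparison with a preset threshold.

```python
def cpu_usage_score(total_cores: int, non_yarn_usage: float, nm_cores: int) -> float:
    """Return M - N * (1 - p), the CPU usage evaluation score.

    total_cores    -- N, total number of CPU cores on the host
    non_yarn_usage -- p, fraction of CPU used by non-YARN processes (0.0 to 1.0)
    nm_cores       -- M, number of CPU cores allocated by the NodeManager
    """
    if not 0.0 <= non_yarn_usage <= 1.0:
        raise ValueError("non_yarn_usage must be a fraction between 0 and 1")
    return nm_cores - total_cores * (1.0 - non_yarn_usage)


def cpu_is_unhealthy(score: float, threshold: float = 0.0) -> bool:
    """Unhealthy when the score exceeds the threshold (0.0 is an assumed default)."""
    return score > threshold
```

For example, with N = 16, p = 0.5, and M = 12, the score is 12 - 16 × 0.5 = 4, meaning the NodeManager has been configured with four more cores than non-YARN processes currently leave free.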

In an optional manner, evaluating the memory usage further includes:

obtaining, through a script, the total amount of memory, the total amount of memory allocated in the NodeManager, and the amount of memory used by system processes;

determining whether the difference between the total amount of memory and the amount used by system processes is greater than the total amount of memory allocated in the NodeManager; and

when that difference is not greater than the total amount of memory allocated in the NodeManager, determining that the state of the node is unhealthy.
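By way of a hedged example, the memory check above reduces to a single comparison. The function name and the use of megabytes as the unit are assumptions for illustration only.

```python
def memory_is_unhealthy(total_mb: int, system_used_mb: int, nm_allocated_mb: int) -> bool:
    """Return True when the memory left over by system processes does not
    exceed the memory allocated in the NodeManager."""
    available_mb = total_mb - system_used_mb
    # "Not greater than" in the text means <=, so equality also counts as unhealthy.
    return available_mb <= nm_allocated_mb
```

With 65536 MB in total, 20000 MB used by system processes, and 49152 MB allocated in the NodeManager, only 45536 MB remain, so the node would be treated as unhealthy.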

According to another aspect of the embodiments of the present invention, an apparatus for managing the state of a NodeManager in a cluster is provided, the apparatus including:

an evaluation module, configured to collect network load information of the cluster and evaluate the hardware state of the cluster according to the network load information;

a determining module, configured to determine the health state of the nodes in the cluster according to the evaluation result; and

a management module, configured to take the NodeManager offline when the state of a node is unhealthy.

According to another aspect of the embodiments of the present invention, a computing device is provided, including a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with one another through the communication bus;

the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the operations of the method for managing the state of a NodeManager in a cluster described above.

According to another aspect of the embodiments of the present invention, a computer storage medium is provided, the storage medium storing at least one executable instruction that causes a processor to perform the method for managing the state of a NodeManager in a cluster described above.

The embodiments of the present invention automatically collect and evaluate the hardware state of the cluster, determine the health state of the nodes in the cluster according to the evaluation result, and take the NodeManager offline when the state of a node is unhealthy. This predicts NodeManager failures in advance and takes the NodeManager offline automatically, which keeps the system running stably. In addition, the embodiments do not evaluate node health solely from the configured memory and CPU, which avoids the Container allocation failures, and the resulting task failures, that occur when multiple applications compete for the resources of a node host.

The above is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the embodiments are more apparent, specific implementations of the present invention are set out below.

Brief Description of the Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same components. In the drawings:

FIG. 1 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by an embodiment of the present invention;

FIG. 2 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by another embodiment of the present invention;

FIG. 3 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by a further embodiment of the present invention;

FIG. 4 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by yet another embodiment of the present invention;

FIG. 5 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by a specific application example of an embodiment of the present invention;

FIG. 6 shows a schematic structural diagram of an apparatus for managing the state of a NodeManager in a cluster provided by an embodiment of the present invention;

FIG. 7 shows a schematic structural diagram of a computing device provided by an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present invention, it should be understood that the present invention may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present invention can be understood more thoroughly and its scope conveyed fully to those skilled in the art.

FIG. 1 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by an embodiment of the present invention. The method is applied in a computing device, such as a server in a communication network or a management computer in a cluster's unified resource management and scheduling platform. As shown in FIG. 1, the method includes the following steps:

Step 110: Collect network load information of the cluster, and evaluate the hardware state of the cluster according to the network load information.

In this step, the hardware state includes network latency, disk state, and so on. In general, when the host resources are exclusive to YARN, the network state is judged from the automatically collected network load information, and the disk state can be judged as well, in order to evaluate the hardware state of the cluster. This step further includes:

Step A1: Collect network load information of the cluster;

Step A2: Evaluate the network latency of the cluster according to the network load information, and evaluate the disk state of the cluster.

Step 120: Determine the health state of the nodes in the cluster according to the evaluation result.

Here, the evaluation result is used to judge whether there is a risk of downtime. If there is, the health state of the nodes in the cluster is determined to be unhealthy, and further handling is required. The evaluation result may be whether the hardware state of the cluster meets a preset condition; when it does, the state of the nodes in the cluster is determined to be unhealthy. Alternatively, the evaluation result may be a score; when the score is greater than or less than a preset threshold, the state of the nodes in the cluster is determined to be unhealthy. It can be understood that the hardware state in step 110 covers the state of one or more pieces of hardware. If the evaluation result for any one of them meets the preset condition, or its score is greater than or less than the preset threshold, the node state of the cluster is determined to be unhealthy, without needing the evaluation results for all of the hardware.
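The rule that any single failing item is enough to mark a node unhealthy can be illustrated with the following minimal Python sketch. The data layout (a mapping from an item name to a score and a predicate) and all names are assumptions for this example and are not prescribed by the embodiments.

```python
from typing import Callable, List, Mapping, Tuple

# Each check pairs an evaluation score with a predicate that returns True
# when that score indicates an unhealthy state.
Check = Tuple[float, Callable[[float], bool]]


def unhealthy_items(checks: Mapping[str, Check]) -> List[str]:
    """Return the names of all evaluated items whose score is unhealthy."""
    return [name for name, (score, is_bad) in checks.items() if is_bad(score)]


def node_is_healthy(checks: Mapping[str, Check]) -> bool:
    """A node is healthy only if no single item is unhealthy."""
    return not unhealthy_items(checks)


# Hypothetical evaluation results: the latency score is fine, the disk score is not.
example = {
    "network_latency": (0.0, lambda s: s > 0.0),  # unhealthy when the score is not 0
    "disk_state": (0.0, lambda s: s == 0.0),      # unhealthy when the score is 0
}
```

Keeping a predicate per item reflects the text's wording that a score may be unhealthy when it is either greater than or less than its threshold, depending on the item.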

Step 130: Take the NodeManager offline when the state of a node is unhealthy.

In this step, the NodeManager is taken offline according to the current condition of the node without affecting services, which keeps the system running stably. It can be understood that, once conditions recover, the NodeManager can be reconfigured with suitable parameters and brought back online, as described in detail below.

FIG. 2 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by another embodiment of the present invention. This embodiment covers the case where the host resources are not exclusive to YARN. As shown in FIG. 2, the method includes the following steps:

Step 210: When the host resources are not exclusive to YARN, evaluate the CPU usage and memory usage.

Determining whether the host resources are exclusive to YARN is a judgment made at the level of software processes.

Step 120: Determine the health state of the nodes in the cluster according to the results of evaluating the network latency, disk state, CPU usage, and memory usage.

In this case several items are evaluated. If the evaluation result for any one item meets the preset condition, or its score is greater than or less than the preset threshold, the node state of the cluster can be determined to be unhealthy, without needing the evaluation results for all items. For example, if only the network latency exceeds the preset threshold, the state of the nodes in the cluster can already be determined to be unhealthy.

Steps 110, 120, and 130 are the same as in the foregoing embodiment; refer to the detailed description there, which is not repeated here.

In this embodiment, when the host resources are not exclusive to YARN, the current usage of CPU and memory resources is analyzed and the priorities of other applications are fully taken into account, so that the current condition of the node is evaluated reasonably, and the node is taken offline and restored according to that condition without affecting services.

FIG. 3 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by a further embodiment of the present invention. This embodiment covers the case where the host resources are exclusive to YARN and the network latency is too high. As shown in FIG. 3, the method includes the following steps:

Here, the hardware state includes the network latency.

Step 310: When the host resources are exclusive to YARN and the network latency exceeds a preset value, evaluate the network latency of the cluster in combination with historical network latencies and the corresponding node health records.

In this step, when the host resources are exclusive to YARN and the current network traffic is too high, other services may be occupying the bandwidth rather than the node being unhealthy. Determining on this basis alone that the node is unhealthy and taking the NodeManager offline could cause unnecessary offlining and reduce the operating efficiency of the system. Historical records can therefore be consulted; they contain various network latencies together with whether the node was healthy at the time. If the network latency exceeds a preset value, the historical network latencies and their corresponding node health records are used to assist in evaluating the network latency of the cluster. If, in the historical records, more than a certain proportion (for example, 80%) of the nodes were healthy under similar network latency, the current network latency can be determined to be normal.
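One possible reading of this history-assisted check is sketched below in Python. The embodiments do not define what counts as "similar" latency, so the relative tolerance used here, the handling of an empty history, and all names are assumptions for illustration; only the 80% example proportion comes from the text.

```python
from typing import Iterable, Tuple


def latency_is_normal(
    current: float,
    history: Iterable[Tuple[float, bool]],
    tolerance: float = 0.1,
    healthy_ratio: float = 0.8,
) -> bool:
    """Decide whether a high latency can be treated as normal.

    history holds (latency, node_was_healthy) records. A record is considered
    similar when its latency lies within `tolerance` (relative) of `current`.
    """
    similar = [
        healthy
        for latency, healthy in history
        if abs(latency - current) <= tolerance * current
    ]
    if not similar:
        # Assumption: without comparable history there is no basis for
        # excusing the latency, so it is not treated as normal.
        return False
    return sum(similar) / len(similar) > healthy_ratio
```

If nine out of ten comparable historical records show a healthy node, the ratio 0.9 exceeds 0.8 and the current latency is accepted as normal, so the NodeManager would not be taken offline on this basis.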

Step 120: Determine the health state of the nodes in the cluster according to the result of evaluating the network latency.

In this embodiment, when the host resources are exclusive to YARN and the current network traffic is too high, other services may be occupying the bandwidth. In that case, historical traffic peaks are used to decide whether to send the command that takes the NodeManager offline, which avoids taking it offline by mistake.

FIG. 4 shows a flowchart of a method for managing the state of a NodeManager in a cluster provided by yet another embodiment of the present invention. This embodiment covers the case where, after the NodeManager has been taken offline and conditions have recovered, the NodeManager is reconfigured with suitable parameters and brought back online. As shown in FIG. 4, the method includes the following steps:

Step 440: Reconfigure CPU resources and memory resources.

In this step, a program can modify the values of yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores in the yarn-site.xml configuration file to allocate and use resources dynamically.
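As a hedged illustration of such a program, the following Python sketch updates the two properties using the standard library's xml.etree.ElementTree. It assumes the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children under a `<configuration>` root; the helper names are hypothetical and the embodiments do not prescribe any particular implementation.

```python
import xml.etree.ElementTree as ET

MEMORY_KEY = "yarn.nodemanager.resource.memory-mb"
VCORES_KEY = "yarn.nodemanager.resource.cpu-vcores"


def set_property(root: ET.Element, name: str, value: str) -> None:
    """Set the <value> of the named <property>, creating it if absent."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            return
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value


def update_node_resources(path: str, memory_mb: int, vcores: int) -> None:
    """Rewrite yarn-site.xml at `path` with the reconfigured values."""
    tree = ET.parse(path)
    root = tree.getroot()
    set_property(root, MEMORY_KEY, str(memory_mb))
    set_property(root, VCORES_KEY, str(vcores))
    tree.write(path, encoding="utf-8", xml_declaration=True)
```

Note that ElementTree's default parser discards XML comments, so a deployment that relies on comments in yarn-site.xml would need a different approach.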

The CPU resources can be reconfigured through the following steps:

1. Use the system stat command to obtain the idle time of each CPU, and evaluate the overall CPU usage;

2. Calculate the CPU time excluding the time used by the NodeManager;

3. Derive the idle CPU ratio proportionally, and combine it with the number of physical CPU cores N to obtain the number of CPU cores that should be allocated. Here, the idle CPU ratio is the share of idle time in the sum of the four CPU times user, nice, system, and idle, and the number of cores that should be allocated is calculated as $Pf_1 + Pf_2 + Pf_3 + \dots + Pf_n$, where $Pf_1$ is the idle ratio of CPU core 1, ..., and $Pf_n$ is the idle ratio of CPU core n.
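The formula in step 3 can be sketched in Python as follows. This example covers only the summation of per-core idle ratios; it does not model step 2 (excluding the NodeManager's own CPU time), and rounding the sum down to a whole number of cores is an assumption added here because a core count must be an integer. All names are hypothetical.

```python
import math
from typing import Sequence, Tuple

# (user, nice, system, idle) CPU times for one core, in any consistent unit.
CpuTimes = Tuple[float, float, float, float]


def idle_ratio(times: CpuTimes) -> float:
    """Pf_i: idle time as a share of user + nice + system + idle."""
    user, nice, system, idle = times
    total = user + nice + system + idle
    return idle / total if total > 0 else 0.0


def cores_to_allocate(per_core: Sequence[CpuTimes]) -> int:
    """Sum Pf_1 + ... + Pf_n and round down (rounding is an assumption)."""
    return math.floor(sum(idle_ratio(t) for t in per_core))
```

For four cores with idle ratios 0.5, 1.0, 0.5, and 0.0, the sum is 2.0, so two cores would be configured for the NodeManager.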

The memory resources can be reconfigured through the following steps:

1. Obtain the -Xmx value $M_x$ preset when the NodeManager's Java process was started, where -Xmx is the maximum heap memory of the Java process;

2. Obtain the total memory of the current system $M_t$ and the total memory currently in use by the system $M_u$;

3. Calculate the total amount of memory to allocate $M_s$ using the formula $M_s = M_t - (M_x + M_u)$.
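A worked instance of this formula, written as a short Python sketch: the function name, the megabyte unit, and the clamping of negative results to zero are assumptions for illustration and do not appear in the embodiments.

```python
def memory_to_allocate_mb(total_mb: int, nm_heap_mb: int, used_mb: int) -> int:
    """Return M_s = M_t - (M_x + M_u).

    total_mb   -- M_t, total system memory
    nm_heap_mb -- M_x, the NodeManager JVM's -Xmx value
    used_mb    -- M_u, memory currently in use by the system
    """
    # Clamping at 0 is an assumption: a negative allocation has no meaning.
    return max(total_mb - (nm_heap_mb + used_mb), 0)
```

With M_t = 65536 MB, M_x = 1024 MB, and M_u = 16384 MB, the result is 65536 - 17408 = 48128 MB.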

Step 450: When the evaluation of the hardware state of the cluster determines that the state of the nodes in the cluster is healthy, set the parameters in the NodeManager configuration file to the reconfigured values.

Step 460: Bring the NodeManager online.

In this embodiment, the node's yarn-site.xml configuration can be modified according to the current operating condition of the node after it has gone offline, so that memory and CPU are allocated more flexibly once the node is back online.

The following describes in further detail how, in the above embodiments, the network latency of the cluster is evaluated according to the network load information, and how the disk state of the cluster, the CPU usage, and the memory usage are evaluated.

In some embodiments, in step A2 above, evaluating the network latency of the cluster according to the network load information includes the following steps:

Step A21: Collect the request queuing time and processing time of the RPC queue through the JMX interface of Hadoop's JMX monitoring;

Step A22: Average the request queuing times of all nodes to obtain the reference queue time, and take the processing time of the first host as the reference processing time;

Step A23: Determine whether the network latency of the first host is greater than the reference queue time, or whether the network latency of the second host is greater than the reference processing time;

In this case, determining the health state of the nodes in the cluster according to the evaluation result further includes:

Specifically, a network latency score is calculated for the first host and for the second host. The first host's score is the difference between its network latency and the reference queue time when its network latency is greater than the reference queue time, and 0 otherwise. The second host's score is the difference between its network latency and the reference processing time when its network latency is greater than the reference processing time, and 0 otherwise. The two scores are added to obtain the network latency score of the cluster. When the network latency score of the cluster is not 0, the state of the node is determined to be unhealthy.

This embodiment collects RpcQueueTimeAvgTime and RpcProcessingTimeAvgTime through the JMX interface of Hadoop's JMX monitoring, thereby obtaining the request queuing time and processing time of the RPC queue and the average times for normal operation of the cluster. The calculation can follow these formulas:

Reference queue time: $T_q = (T_{q,1} + T_{q,2} + T_{q,3} + \dots + T_{q,N})/N$

Reference processing time: $T_p = (T_{p,1} + T_{p,1} + T_{p,1} + \dots + T_{p,1})/N$

Current network latency: $\max(t_1 - T_q, 0) + \max(t_2 - T_p, 0)$

where N is the number of nodes, $t_1$ is the network latency of host one, and $t_2$ is the network latency of host two.

When the network latency exceeds a preset value (for example, 0.8 s), the node state is determined to be unhealthy.
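The latency score can be illustrated with the following Python sketch. It follows step A22 in taking the first host's processing time as the reference processing time; the function names and the sample values are assumptions, and collecting the metrics over JMX is outside the scope of the example.

```python
from typing import Sequence, Tuple


def reference_times(
    queue_times: Sequence[float], processing_times: Sequence[float]
) -> Tuple[float, float]:
    """Return (T_q, T_p) from per-node RPC queue and processing times."""
    t_q = sum(queue_times) / len(queue_times)  # average over all N nodes
    t_p = processing_times[0]                  # first host, as in step A22
    return t_q, t_p


def latency_score(t1: float, t2: float, t_q: float, t_p: float) -> float:
    """Return max(t1 - T_q, 0) + max(t2 - T_p, 0)."""
    return max(t1 - t_q, 0.0) + max(t2 - t_p, 0.0)
```

For instance, with queue times of 0.25 s, 0.5 s, and 0.75 s the reference queue time is 0.5 s. A first-host latency of 0.75 s then contributes 0.25 s to the score, while a second-host latency equal to the reference processing time contributes nothing.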

In some embodiments, evaluating the disk status of the cluster in step A2 above includes the following steps:

Step A21': check the disk operating status through a script;

Step A22': determine whether the disk reports an error.

Specifically, a disk that reports no error has a disk status score of 100, and a disk that reports an error has a disk status score of 0. When any disk has a status score of 0, the disk status evaluation score of the cluster is 0; when all disks have a status score of 100, the disk status evaluation score of the cluster is 100. When the disk status evaluation score of the cluster is 0, the state of the node is determined to be unhealthy.

In this embodiment, a script may run smartctl -H sdaN (a disk health check utility available on Linux) to check the disk operating status. A disk that reports no error is scored 100, and a disk that reports an error is scored 0. The final score for n disks is: 0 ∈ (D₁, D₂, D₃, …, D_n) ? 0 : 100, where D₁ is the score of disk 1, …, and D_n is the score of disk n. When the disk score is 0, the node state is determined to be unhealthy.
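The aggregation of per-disk scores can be sketched as below. The sketch assumes the per-disk error flags have already been obtained (for example from a health check script); the function name and the input format are hypothetical.

```python
from typing import Sequence


def disk_score(disk_has_error: Sequence[bool]) -> int:
    """Return 0 if any disk reports an error, otherwise 100."""
    # Per-disk score: 100 when no error is reported, 0 when an error is reported.
    per_disk = [0 if has_error else 100 for has_error in disk_has_error]
    # Final score: 0 in (D1, ..., Dn) ? 0 : 100
    return 0 if 0 in per_disk else 100
```

A single failing disk is enough to mark the node unhealthy, which is why the aggregate is a membership test rather than an average.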

In some embodiments, in step A2 above, when the host resources are not exclusive to YARN, evaluating the CPU usage includes the following steps:

Step B1: calculate, through a script, the total number of CPU cores N, and determine the current CPU usage p by non-YARN processes and the number of CPU cores M allocated by the NodeManager;

Step B2: subtract the product of N and (1 − p) from M to obtain the CPU usage evaluation score.

In this case, determining the health status of the nodes in the cluster according to the result of the CPU usage evaluation further includes:

In this embodiment, the total number of CPU cores N may be calculated by reading /proc/stat (a file in which the Linux system reports current system usage), and the current CPU usage p by non-YARN processes and the number of cores M allocated by the NodeManager are determined; the CPU score is then M − N × (1 − p). When the CPU usage score exceeds a preset value (e.g., 75% or 80%), the node state is determined to be unhealthy.
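A minimal sketch of the CPU score follows. The function and parameter names are hypothetical, and the inputs are assumed to have been gathered already; how the resulting score is compared with the preset value is left to the caller.

```python
def cpu_score(total_cores: int, non_yarn_usage: float, nm_cores: int) -> float:
    """Compute M - N * (1 - p).

    total_cores:    N, total number of CPU cores on the host
    non_yarn_usage: p, fraction of CPU used by non-YARN processes (0.0 to 1.0)
    nm_cores:       M, number of cores allocated by the NodeManager
    """
    available_to_yarn = total_cores * (1.0 - non_yarn_usage)
    return nm_cores - available_to_yarn
```

On one reading, a positive score means the NodeManager has been allocated more cores than non-YARN processes leave free. For example, with N = 16, p = 0.25 and M = 14, the free capacity is 12 cores and the score is 2.0.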

In some embodiments, in step A2 above, when the host resources are not exclusive to YARN, evaluating the memory usage includes the following steps:

Step C1: obtain, through a script, the total amount of memory, the total amount of memory allocated in the NodeManager, and the amount used by system processes;

Step C2: determine whether the difference between the total amount of memory and the amount used by system processes is greater than the total amount of memory allocated in the NodeManager.

In this case, determining the health status of the nodes in the cluster according to the result of the memory usage evaluation further includes:

Specifically, when the difference between the total amount of memory and the amount used by system processes is greater than the total amount of memory allocated in the NodeManager, the memory usage evaluation score is 100; otherwise, the score is 0. When the memory usage evaluation score is 0, the state of the node is determined to be unhealthy.

In this embodiment, the total amount of memory amem, the total amount of memory allocated in the NodeManager nmem, and the amount used by system processes smem are obtained from the /proc/meminfo file (a file in which the Linux system reports current system usage). The memory score is then amem − smem > nmem ? 100 : 0. When the memory usage score is 0, the node state is determined to be unhealthy.
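The memory check can be sketched as below. The parsing helper is an assumption for illustration: it reads only the MemTotal line from text in the /proc/meminfo format, and the way smem and nmem are obtained is not specified here. All function names are hypothetical.

```python
def parse_mem_total_kb(meminfo_text: str) -> int:
    """Extract the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:       16000000 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def memory_score(amem: int, smem: int, nmem: int) -> int:
    """Return 100 when amem - smem > nmem, otherwise 0.

    amem: total memory, smem: memory used by system processes,
    nmem: total memory allocated in the NodeManager (same unit for all three).
    """
    return 100 if amem - smem > nmem else 0
```

Because the comparison is strict, a host whose remaining memory exactly equals the NodeManager allocation scores 0.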

The embodiment of the present invention is described in further detail below through a specific application example. FIG. 5 shows a flowchart of the method for managing NodeManager state in a cluster provided by this application example. As shown in FIG. 5, the method includes the following steps:

Step 510: evaluate the network delay and disk bad blocks of the cluster to obtain a network score and a disk score.

Step 520: determine whether the host resources are exclusive to YARN; if so, go to step 540; otherwise, go to step 530.

Step 530: evaluate the CPU usage and memory usage of the cluster to obtain a CPU score and a memory score.

Step 540: determine whether any of the above scores satisfies its respective preset condition; if so, go to step 550; otherwise, go to step 560.

Step 550: take the NodeManager offline and modify its configuration.

Before this step is performed, the node state is set to unhealthy.

Step 560: continue running.

Before this step is performed, the node state is set to healthy.

In this embodiment, a host is scored by jointly assessing the real-time status of its network, disk, CPU, and memory. When any one of the scores meets its threshold condition, the NodeManager role of the host is taken offline. When conditions recover, for example the disk is repaired, memory occupancy drops, or network delay meets the online condition, that is, when the state of the nodes in the cluster is determined to be healthy according to the evaluation of the hardware state of the cluster, the NodeManager is reconfigured with suitable parameters and brought back online.
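The decision flow of steps 510 to 560 can be sketched as follows. This is an illustrative reading, not the patented implementation: the Scores container, the decide function, and the threshold parameters are hypothetical, and the scores are assumed to have been computed beforehand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scores:
    network: float                 # network delay score (step 510)
    disk: int                      # disk score, 0 or 100 (step 510)
    cpu: Optional[float] = None    # CPU score (step 530, non-exclusive hosts)
    memory: Optional[int] = None   # memory score, 0 or 100 (step 530)


def decide(scores: Scores, yarn_exclusive: bool,
           network_threshold: float, cpu_threshold: float) -> str:
    """Return "offline" (step 550) or "continue" (step 560)."""
    unhealthy = scores.network > network_threshold or scores.disk == 0
    if not yarn_exclusive:
        # Step 530 applies only when the host is shared with non-YARN workloads.
        if scores.cpu is not None and scores.cpu > cpu_threshold:
            unhealthy = True
        if scores.memory == 0:
            unhealthy = True
    return "offline" if unhealthy else "continue"
```

On a YARN-exclusive host the CPU and memory scores are ignored, mirroring the branch from step 520 directly to step 540.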

FIG. 6 shows a schematic structural diagram of an apparatus for managing NodeManager state in a cluster provided by an embodiment of the present invention. As shown in FIG. 6, the apparatus 600 includes an evaluation module 610, a determination module 620, and a management module 630.

The evaluation module 610 is configured to collect network load information of the cluster and to evaluate the hardware state of the cluster according to the network load information; the determination module 620 is configured to determine the health status of the nodes in the cluster according to the result of the evaluation; and the management module 630 is configured to take the NodeManager offline when the state of the node is unhealthy.

In an optional manner, the evaluation module 610 is further configured to:

In an optional manner, when the host resources are not exclusive to YARN, the evaluation module 610 is further configured to:

The determination module 620 is further configured to:

In an optional manner, when the host resources are exclusive to YARN, the evaluation module 610 is further configured to:

In an optional manner, the apparatus further includes:

a configuration module 640, configured to reconfigure CPU resources and memory resources;

a modification module 650, configured to modify the parameters of the NodeManager configuration file to the reconfigured values when the state of the nodes in the cluster is determined to be healthy according to the evaluation of the hardware state of the cluster.

The management module 630 is further configured to bring the NodeManager online.
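The recovery path handled by modules 640, 650, and 630 can be sketched as below. The text does not name the configuration parameters; the two property names used here are standard YARN NodeManager resource settings and are chosen as an assumption, and the function name and return format are hypothetical.

```python
def recovery_plan(node_healthy: bool, new_vcores: int, new_memory_mb: int) -> dict:
    """Describe what to do with an offline NodeManager.

    Returns the action to take and the configuration values to write.
    """
    if not node_healthy:
        # Node is still unhealthy: keep the NodeManager offline, change nothing.
        return {"action": "stay_offline", "config": {}}
    # Assumed property names; the source only says "parameters of the
    # NodeManager configuration file".
    config = {
        "yarn.nodemanager.resource.cpu-vcores": str(new_vcores),
        "yarn.nodemanager.resource.memory-mb": str(new_memory_mb),
    }
    return {"action": "bring_online", "config": config}
```

Separating the plan from its execution keeps the sketch free of side effects; an actual deployment would still need to write the configuration file and restart the NodeManager.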

check the disk operating status through a script;

determine whether the disk reports an error;

An embodiment of the present invention provides a computer storage medium storing at least one executable instruction, where the executable instruction causes a processor to execute the method for managing NodeManager state in a cluster in any of the foregoing method embodiments.

An embodiment of the present invention provides a computer program product, the computer program product including a computer program stored on a computer storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to execute the method for managing NodeManager state in a cluster in any of the foregoing method embodiments.

FIG. 7 shows a schematic structural diagram of a computing device provided by an embodiment of the present invention. The specific embodiments of the present invention do not limit the specific implementation of the computing device.

As shown in FIG. 7, the computing device may include a processor 702, a communications interface 704, a memory 706, and a communication bus 708.

The processor 702, the communications interface 704, and the memory 706 communicate with one another through the communication bus 708. The communications interface 704 is used to communicate with network elements of other devices, such as clients or other servers. The processor 702 is configured to execute a program 710, and may specifically execute the method for managing NodeManager state in a cluster in any of the foregoing method embodiments.

Specifically, the program 710 may include program code, and the program code includes computer operation instructions.

The processor 702 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the computing device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.

The memory 706 is used to store the program 710. The memory 706 may include high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.

The algorithms or displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. The structure required to construct such systems is apparent from the description above. Furthermore, embodiments of the present invention are not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in various programming languages, and that the description of specific languages above is given to disclose the best mode of the invention.

Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.

Similarly, it should be understood that, in the above description of exemplary embodiments of the invention, various features of the embodiments are sometimes grouped together into a single embodiment, figure, or description thereof in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of the invention.

Those skilled in the art will understand that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.

Furthermore, those skilled in the art will understand that although some embodiments herein include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not denote any order; these words may be interpreted as names. Unless otherwise specified, the steps in the above embodiments should not be construed as limiting the order of execution.

Claims

1. A method for managing NodeManager state in a cluster, wherein the method comprises:

collecting network load information of the cluster, and evaluating the hardware state of the cluster according to the network load information;

determining the health status of the nodes in the cluster according to the result of the evaluation;

when the state of a node is unhealthy, taking the NodeManager offline.

2. The method according to claim 1, wherein the collecting network load information of the cluster, and evaluating the hardware state of the cluster according to the network load information, further comprises:

collecting network load information of the cluster;

evaluating the network delay of the cluster according to the network load information, and evaluating the disk status of the cluster.

3. The method according to claim 2, wherein when the host resource is not exclusive to YARN, the method further comprises:

evaluating CPU usage and memory usage;

the determining the health status of the nodes in the cluster according to the result of the evaluation further comprises:

determining the health status of the nodes in the cluster according to the results of the evaluation of the network delay, disk status, CPU usage, and memory usage.

4. The method according to claim 2, wherein when the host resource is exclusive to YARN, the method further comprises:

when the network delay exceeds a preset value, evaluating the network delay of the cluster in combination with the historical network delay and the health status records of the corresponding nodes.

5. The method according to any one of claims 1-4, wherein the method further comprises:

reconfiguring CPU resources and memory resources;

when the state of the nodes in the cluster is determined to be healthy according to the evaluation of the hardware state of the cluster, modifying the parameters of the NodeManager configuration file to the reconfigured values;

bringing the NodeManager online.

6. The method according to claim 2, wherein the evaluating the network delay of the cluster according to the network load information further comprises:

collecting the request queuing time and processing time of the RPC queue through the JMX interface of the JMX monitoring in Hadoop;

summing the request queuing times of all nodes and taking the average to obtain the reference queue time, and taking the processing time of the first host as the reference processing time;

determining whether the network delay of the first host is greater than the reference queue time, or whether the network delay of the second host is greater than the reference processing time;

when the network delay of the first host is greater than the reference queue time, or the network delay of the second host is greater than the reference processing time, determining that the state of the node is unhealthy.

7. The method according to claim 2, wherein the evaluating the disk status of the cluster further comprises:

checking the disk operating status through a script;

determining whether the disk reports an error;

when any disk of the cluster reports an error, determining that the state of the node is unhealthy.

8. The method according to claim 3, wherein the evaluating the CPU usage further comprises:

calculating, through a script, the total number of CPU cores N, and determining the current CPU usage p by non-YARN processes and the number of CPU cores M allocated by the NodeManager;

subtracting the product of N and (1 − p) from M to obtain the CPU usage evaluation score;

when the CPU usage evaluation score exceeds a preset CPU usage threshold, determining that the state of the node is unhealthy.

9. The method according to claim 3, wherein the evaluating the memory usage further comprises:

obtaining, through a script, the total amount of memory, the total amount of memory allocated in the NodeManager, and the amount used by system processes;

determining whether the difference between the total amount of memory and the amount used by system processes is greater than the total amount of memory allocated in the NodeManager;

when the difference between the total amount of memory and the amount used by system processes is not greater than the total amount of memory allocated in the NodeManager, determining that the state of the node is unhealthy.

10. A NodeManager state management device in a cluster, wherein the device comprises:

an evaluation module, configured to collect network load information of the cluster, and evaluate the hardware state of the cluster according to the network load information;

a determining module, configured to determine the health status of the nodes in the cluster according to the evaluation result;

a management module, configured to take the NodeManager offline when the state of the node is unhealthy.

11. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with each other through the communication bus;

the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the operations of the method for managing NodeManager state in a cluster according to any one of claims 1-9.

12. A computer storage medium, wherein the storage medium stores at least one executable instruction, and the executable instruction causes a processor to execute the method for managing NodeManager state in a cluster according to any one of claims 1-9.