
CN104243579A - Computational node control method and system applied to water conservancy construction site - Google Patents


Info

Publication number
CN104243579A
Authority
CN
China
Prior art keywords
computing
node
nodes
computing nodes
water conservancy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410465692.7A
Other languages
Chinese (zh)
Inventor
林鹏
李庆斌
高向友
胡森映
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201410465692.7A
Publication of CN104243579A
Legal status: Pending

Links

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention proposes a method for controlling computing nodes at a water conservancy construction site, comprising the following steps: using periodic polling to discover multiple computing nodes available for a computing task; obtaining the current computing capability of each computing node, decomposing the computing task, and processing the decomposed task cooperatively on the computing nodes; each computing node sending its processing result to a central control node; and the central control node analyzing the processing result of each computing node in order to control the computing nodes. The method makes full use of the spare computing power of the computing nodes on the construction site (such as sensors and data processing units) and can effectively raise the level of informatization of the site. The invention also provides a control system for computing nodes at a water conservancy construction site.

Description

Control method and system for computing nodes applied to water conservancy construction sites

Technical field

The invention relates to the technical field of distributed computing, and in particular to a control method and system for computing nodes applied to water conservancy construction sites.

Background

With the rapid spread of the Internet of Things and sensor networks, sensor networks are used more and more on construction sites. They are widely used to collect temperature, humidity, pressure, personnel location, and other business-related data, and as management becomes digital and information-based they also lay a solid foundation for introducing and developing other services. For a long time, however, each network and its nodes have worked in isolation from one another, so pervasive computing and information fusion cannot be achieved. For example, some nodes use modern 32-bit CPUs whose utilization has long stayed below 1%, leaving their computing potential largely unused, while the central server is overloaded when it meets large or compute-intensive tasks, and the resulting long computation cycles hinder real-time decision-making.

At present there are very few solutions to the above problem. Some mention only superficial designs and others offer only rough ideas; none provides a complete scheme from which a distributed computing system can actually be developed.

Summary of the invention

The invention aims to solve, at least to some extent, one of the technical problems in the related art described above.

To this end, one object of the invention is to propose a control method for computing nodes applied to water conservancy construction sites. The method makes full use of the spare computing power of the computing nodes on the construction site (such as sensors and data processing units) and can effectively raise the level of informatization of the site.

Another object of the invention is to provide a control system for computing nodes applied to water conservancy construction sites.

To achieve the above objects, an embodiment of the first aspect of the invention proposes a control method for computing nodes applied to water conservancy construction sites, comprising the following steps: using periodic polling to discover multiple computing nodes available for a computing task; obtaining the current computing capability of each of the computing nodes, decomposing the computing task, and processing the decomposed task cooperatively on the computing nodes; each computing node sending its processing result to a central control node; and the central control node analyzing the processing result of each computing node in order to control the computing nodes.

In addition, the control method according to the above embodiment of the invention may have the following additional technical features:

In some examples, using periodic polling to discover multiple computing nodes available for a computing task specifically includes: sending polling requests according to a computing node list and starting a waiting timer; each computing node receiving the polling request, estimating its current computing capability, and sending the estimate to the central control node, where the estimate is:

M = N + P1 + P2,

where M is the current computing capability of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the CPU occupancy expected over a future period; before the waiting timer expires, the central control node judging, from the current computing capability of each computing node, whether the computing nodes can complete the computing task; if so, using the computing nodes to complete the computing task, and otherwise continuing to send polling requests; and, when the waiting timer expires, no longer waiting for responses from computing nodes and discarding response messages that arrive after the timeout.

In some examples, obtaining the current computing capability of each of the computing nodes, decomposing the computing task, and processing the decomposed task cooperatively on the computing nodes specifically includes: letting the number of computing nodes be N and decomposing the computing task into m subtasks, where N > m; sending each subtask to its corresponding computing node and starting a timeout timer; periodically judging whether each computing node has failed; and receiving the computation result of each computing node before the timeout timer expires.

In some examples, the method further includes adopting a redundancy strategy, in which the same decomposed subtask may be assigned to multiple computing nodes.

In some examples, the computing nodes communicate with one another using a communication protocol in XML format.

In the control method according to the embodiments of the invention, the central control node initiates periodic polling, the potential participating nodes report their spare computing power, the task is decomposed according to the data the nodes report and assigned to designated nodes for computation, the nodes report their results, and the reported information is finally aggregated into the final result. The method therefore makes full use of the spare computing power of the computing nodes on site (such as sensors and data processing units) and can effectively raise the level of informatization of the water conservancy construction site.

An embodiment of the second aspect of the invention provides a control system for computing nodes applied to water conservancy construction sites, comprising: a discovery module for discovering, by periodic polling, multiple computing nodes available for a computing task; an allocation module for obtaining the current computing capability of each of the computing nodes, decomposing the computing task, and having the computing nodes process the decomposed task cooperatively; a reporting module for transmitting the processing result of each computing node; and a control module that analyzes the processing result of each computing node in order to control the computing nodes.

In addition, the control system according to the above embodiment of the invention may have the following additional technical features:

In some examples, the discovery module discovering multiple computing nodes available for a computing task by periodic polling specifically includes: sending polling requests according to a computing node list and starting a waiting timer; each computing node receiving the polling request, estimating its current computing capability, and sending the estimate to the control module, where the estimate is:

M = N + P1 + P2,

where M is the current computing capability of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the CPU occupancy expected over a future period; before the waiting timer expires, the control module judging, from the current computing capability of each computing node, whether the computing nodes can complete the computing task; if so, using the computing nodes to complete the computing task, and otherwise continuing to send polling requests; and, when the waiting timer expires, the discovery module no longer waiting for responses from computing nodes and discarding response messages that arrive after the timeout.

In some examples, the allocation module obtaining the current computing capability of each of the computing nodes, decomposing the computing task, and having the computing nodes process the decomposed task cooperatively specifically includes: letting the number of computing nodes be N and decomposing the computing task into m subtasks, where N > m; sending each subtask to its corresponding computing node and starting a timeout timer; periodically judging whether each computing node has failed; and receiving the computation result of each computing node before the timeout timer expires.

In some examples, the allocation module is further configured to adopt a redundancy strategy, in which the same decomposed subtask may be assigned to multiple computing nodes.

In some examples, the computing nodes communicate with one another using a communication protocol in XML format.

In the control system according to the embodiments of the invention, the central control node initiates periodic polling, the potential participating nodes report their spare computing power, the task is decomposed according to the data the nodes report and assigned to designated nodes for computation, the nodes report their results, and the reported information is finally aggregated into the final result. The system therefore makes full use of the spare computing power of the computing nodes on site (such as sensors and data processing units) and can effectively raise the level of informatization of the water conservancy construction site.

Additional aspects and advantages of the invention will be given in part in the following description; in part they will become apparent from the description or be learned through practice of the invention.

Description of drawings

The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments in conjunction with the drawings, in which:

FIG. 1 is a flowchart of a control method for computing nodes applied to a water conservancy construction site according to an embodiment of the invention;

FIG. 2 is a schematic diagram of the four stages in which the control method is carried out according to an embodiment of the invention;

FIG. 3 is a schematic diagram of the discovery stage according to an embodiment of the invention;

FIG. 4 is a schematic diagram of the information model maintained by the central control node according to an embodiment of the invention; and

FIG. 5 is a structural block diagram of a control system for computing nodes applied to a water conservancy construction site according to an embodiment of the invention.

Detailed description

Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be understood as limiting it.

The control method and system for computing nodes applied to water conservancy construction sites according to embodiments of the invention are described below with reference to the drawings.

FIG. 1 is a flowchart of a control method for computing nodes applied to a water conservancy construction site according to an embodiment of the invention. As shown in FIG. 1, the method includes the following steps:

Step S101: use periodic polling to discover multiple computing nodes available for the computing task.

Specifically, in some examples, and with reference to FIG. 3, this step includes:

Step 1: send polling requests according to the computing node list and start a waiting timer. In other words, the central control node sends polling requests one by one, following the most recently used (MRU) node list in the computing node table maintained from the previous computation, waits for each computing node (denoted Nn) to respond, and at the same time starts a waiting timer, denoted Tn.

In some examples, the node table preferably takes the form of a linked list with the following format:

List head, total count → node ID, node address → next node … → list tail.
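The linked-list node table above can be sketched as follows. This is only an illustration under stated assumptions: the names `NodeEntry`, `NodeTable`, `touch`, and `remove` are hypothetical, and the move-to-front rule is one common way to keep a list in most-recently-used order; the patent text does not specify how the order is maintained.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class NodeEntry:
    """One list element: node ID, node address, link to the next entry."""
    node_id: int
    address: str
    next: Optional["NodeEntry"] = None


class NodeTable:
    """Singly linked node table; the head stores the total count (hypothetical design)."""

    def __init__(self) -> None:
        self.head: Optional[NodeEntry] = None
        self.count = 0

    def touch(self, node_id: int, address: str) -> None:
        """Insert the node, or move it to the front if already present (MRU order)."""
        prev, cur = None, self.head
        while cur is not None and cur.node_id != node_id:
            prev, cur = cur, cur.next
        if cur is None:
            cur = NodeEntry(node_id, address)
            self.count += 1
        else:
            cur.address = address
            if prev is None:
                return  # already the most recently used entry
            prev.next = cur.next  # unlink from its old position
        cur.next = self.head
        self.head = cur

    def remove(self, node_id: int) -> bool:
        """Delete a node, for example one whose physical number is no longer valid."""
        prev, cur = None, self.head
        while cur is not None and cur.node_id != node_id:
            prev, cur = cur, cur.next
        if cur is None:
            return False
        if prev is None:
            self.head = cur.next
        else:
            prev.next = cur.next
        self.count -= 1
        return True

    def __iter__(self) -> Iterator[int]:
        cur = self.head
        while cur is not None:
            yield cur.node_id
            cur = cur.next
```

Iterating over the table yields node IDs in the order in which polling requests would be sent under this assumption.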

In addition, in some examples, the central control node maintains an information table that records the state of each computing node, including the correspondence between its logical number and its physical number. The logical number is the ID of the computing node; the physical number is the node's actual identifier, for example the MAC address in the case of a network card. A computing node whose physical number is no longer valid is deleted from the information table. In a specific example, the general message format is shown in Table 1 below:

No.  Field                    Description
1    Message number           Message type
2    Source node number       Node number
3    Destination node number  Destination node number
4    Version type             Version number of this message type
5    Timestamp                Number of seconds elapsed since 12:00:00 on January 1, 2000
6    Sequence number          An incrementing serial number
7    Message body             Message content corresponding to the message type
8    Check code               Computed with a checksum algorithm

Table 1

The content of a message in an actual implementation is as follows:

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <version>1.0</version>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>no</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>
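A message with this head/body/tail layout could be assembled with Python's standard `xml.etree.ElementTree` module, as in the sketch below. The function names `build_message` and `simple_checksum` are hypothetical, and the checksum rule (sum of the UTF-8 bytes of the head and body, modulo 65536) is purely an assumption for illustration, since the text does not name the check algorithm. The element name `time_tamp` is kept exactly as it appears in the sample message.

```python
import xml.etree.ElementTree as ET


def simple_checksum(payload: str) -> int:
    """Assumed checksum: byte sum modulo 65536 (the source names no algorithm)."""
    return sum(payload.encode("utf-8")) % 65536


def build_message(message_id, src_node, dest_node, seq_no, time_stamp, body, version="1.0"):
    """Return a head/body/tail XML message as a string."""
    root = ET.Element("message")
    head = ET.SubElement(root, "head")
    fields = (
        ("message_id", message_id),
        ("version", version),
        ("src_node", src_node),
        ("dest_node", dest_node),
        ("time_tamp", time_stamp),  # tag spelled as in the sample message
        ("seq_no", seq_no),
    )
    for tag, value in fields:
        ET.SubElement(head, tag).text = str(value)
    body_el = ET.SubElement(root, "body")
    for tag, value in body.items():
        ET.SubElement(body_el, tag).text = str(value)
    # The checksum covers the serialized head and body only.
    payload = ET.tostring(head, encoding="unicode") + ET.tostring(body_el, encoding="unicode")
    tail = ET.SubElement(root, "tail")
    ET.SubElement(tail, "checksum").text = str(simple_checksum(payload))
    declaration = '<?xml version="1.0" encoding="utf-8"?>\n'
    return declaration + ET.tostring(root, encoding="unicode")


request = build_message(
    "DISCOVERY_REQ", 1, 2, 11, "2013-12-13:00:03:45:234", {"broadcast": "no"}
)
```

Building the tree element by element, rather than concatenating strings, lets the library handle escaping of any special characters in field values.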

Further, the central control node maintains additional information, as in the information model shown in FIG. 4, which includes the content shown in Table 2 below:

Table 2

In some examples, the polling request that is sent is preferably:

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>no</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>

In this step, the waiting timer may be set to 10 s, for example.

Step 2: each computing node receives the polling request, estimates its current computing capability, and sends the estimate to the central control node, where the estimate is:

M = N + P1 + P2,

where M denotes the current computing capability of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the CPU occupancy expected over a future period.
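As a worked illustration of the estimate M = N + P1 + P2, the sketch below computes M from sampled CPU occupancy values. The function name `estimate_capability` is hypothetical, and taking P1 as the mean of past samples is an assumption: the text says only that P1 is the occupancy over a past period and does not say how it, or the expected value P2, is obtained.

```python
from typing import Sequence


def estimate_capability(current: float, past_samples: Sequence[float], expected_future: float) -> float:
    """Return M = N + P1 + P2 for one computing node.

    current          N, the current CPU occupancy
    past_samples     samples from a past period; P1 is assumed to be their mean
    expected_future  P2, the occupancy expected over a future period
    """
    p1 = sum(past_samples) / len(past_samples) if past_samples else 0.0
    return current + p1 + expected_future


# N = 0.01, P1 = mean(0.02, 0.04) = 0.03, P2 = 0.05, so M is about 0.09
m = estimate_capability(0.01, [0.02, 0.04], 0.05)
```

The formula itself is taken from the text; how the central control node interprets or ranks the resulting M values is not specified there and is left open here.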

Further, in some examples, the response message that is sent has the following format:

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_ACK</message_id>
    <src_node>2</src_node>
    <dest_node>1</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>1</seq_no>
  </head>
  <body>
    <power>234</power>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>
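On the receiving side, a response of this shape could be read as in the following sketch, which uses the standard `xml.etree.ElementTree` parser. The function name `parse_discovery_ack` is hypothetical; returning `None` for malformed messages or messages addressed to another node is an assumption, modelled on the rule stated elsewhere in the description that messages not matching the receiver's ID are simply discarded. Checksum verification is omitted because the algorithm is not specified.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def parse_discovery_ack(xml_text: str, my_node_id: int) -> Optional[Tuple[int, int]]:
    """Return (source node ID, reported power), or None if the message is dropped."""
    try:
        root = ET.fromstring(xml_text)
        if root.findtext("head/message_id") != "DISCOVERY_ACK":
            return None
        # Messages addressed to another node are discarded.
        if int(root.findtext("head/dest_node")) != my_node_id:
            return None
        return int(root.findtext("head/src_node")), int(root.findtext("body/power"))
    except (ET.ParseError, TypeError, ValueError):
        # Malformed XML, a missing field, or a non-numeric value.
        return None


ack = (
    "<message><head><message_id>DISCOVERY_ACK</message_id>"
    "<src_node>2</src_node><dest_node>1</dest_node></head>"
    "<body><power>234</power></body></message>"
)
result = parse_discovery_ack(ack, my_node_id=1)
```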

Step 3: before the waiting timer expires, the central control node judges, from the current computing capability of each computing node, whether the computing nodes can complete the computing task.

Step 4: if they can, the nodes are used to complete the computing task; otherwise polling requests continue to be sent. That is, based on what each computing node reports, the central control node judges whether the computing nodes taking part in this computing task can complete it (allowing for redundancy). If they can, the method enters the assignment stage, that is, step S102 is executed. If they cannot, polling continues in a wider discovery stage, and the point-to-point message of step 1 is changed to a broadcast polling request. In some examples, the message format is as follows:

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>yes</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>

Step 5: when the waiting timer expires, the central control node no longer waits for responses from computing nodes and discards response messages that arrive after the timeout.
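Steps 1 to 5 of the discovery stage can be sketched as a small simulation. Everything named here is hypothetical: responses are modelled as `(arrival_seconds, node_id, power)` tuples rather than real network traffic, and the `enough` predicate stands in for the central control node's judgment of whether the nodes can complete the task, which the text does not define.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Response = Tuple[float, int, int]  # (arrival time in seconds, node ID, reported power)


def collect_within_timer(responses: Iterable[Response], timer_seconds: float = 10.0):
    """Keep responses that arrive before the waiting timer expires; discard the rest."""
    accepted: Dict[int, int] = {}
    discarded: List[int] = []
    for arrival, node_id, power in sorted(responses):
        if arrival <= timer_seconds:
            accepted[node_id] = power
        else:
            discarded.append(node_id)  # timed-out response is dropped
    return accepted, discarded


def discovery_round(
    unicast: Iterable[Response],
    broadcast: Iterable[Response],
    enough: Callable[[Dict[int, int]], bool],
    timer_seconds: float = 10.0,
):
    """Poll listed nodes first; fall back to a broadcast round if they are not enough."""
    accepted, _ = collect_within_timer(unicast, timer_seconds)
    if enough(accepted):
        return accepted, False  # proceed to assignment, no broadcast needed
    more, _ = collect_within_timer(broadcast, timer_seconds)
    accepted.update(more)
    return accepted, True


nodes, used_broadcast = discovery_round(
    unicast=[(1.0, 2, 234), (12.0, 3, 100)],  # node 3 answers after the 10 s timer
    broadcast=[(2.0, 4, 50)],
    enough=lambda found: len(found) >= 2,
)
```

In this run node 3 is discarded for answering late, so the unicast round is insufficient and the broadcast round adds node 4.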

Step S102: obtain the current computing capability of each of the computing nodes, decompose the computing task, and process the decomposed task cooperatively on the computing nodes.

Specifically, this step includes:

Step A: let the number of computing nodes be N and decompose the computing task into m subtasks, where N > m. For example, if the computing nodes are N1, N2, …, Nn and the computing task T is decomposed into m subtasks T1, T2, …, Tm with N > m, then computing nodes N1, N2, …, Nm process subtasks T1, T2, …, Tm respectively.

In some examples, the subtasks in this step are expressed semantically, so that in a heterogeneous environment they are independent of the operating system on which a computing node runs; the decomposed subtasks are uniformly described in MathML.

Step B: send each subtask to its corresponding computing node and start a timeout timer.

Step C: maintain a keep-alive (heartbeat) timer H for each computing node and periodically judge whether each computing node has failed.

Step D: receive the computation result of each computing node before the timeout timer expires.

Note that in the above process, for fault tolerance, a target computing node simply discards any message it receives whose ID does not match its own.
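Steps A to C can be sketched as two small helpers. The names `assign_subtasks` and `failed_nodes` are hypothetical, the one-to-one pairing in list order is an assumption about how subtasks map to nodes, and the heartbeat rule (a node counts as failed once its last heartbeat is older than a threshold) is likewise an assumption, since the text says only that failure is judged periodically using timer H.

```python
from typing import Dict, List, Sequence


def assign_subtasks(node_ids: Sequence[int], subtasks: Sequence[str]) -> Dict[int, str]:
    """Pair subtasks T1..Tm with the first m of N nodes; requires N > m as in Step A."""
    if len(node_ids) <= len(subtasks):
        raise ValueError("need more computing nodes than subtasks (N > m)")
    # zip stops at the shorter sequence, so the spare nodes stay unassigned.
    return dict(zip(node_ids, subtasks))


def failed_nodes(last_heartbeat: Dict[int, float], now: float, heartbeat_timeout: float) -> List[int]:
    """Return nodes whose most recent heartbeat is older than the timeout."""
    return sorted(
        node for node, seen in last_heartbeat.items() if now - seen > heartbeat_timeout
    )


assignment = assign_subtasks([1, 2, 3], ["T1", "T2"])
dead = failed_nodes({1: 0.0, 2: 9.0}, now=10.0, heartbeat_timeout=5.0)
```

The spare node left over by the N > m condition is what makes later reassignment of a failed subtask possible.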

Step S103: each computing node sends its processing result to the central control node.

Step S104: the central control node analyzes the processing result of each computing node in order to control the computing nodes.

In summary, the method of the invention makes full use of the spare computing power of the large number of sensors and data processing units (that is, computing nodes) deployed on site and networks them through a suitable protocol. The protocol can be summarized as four stages: discovery, assignment, reporting, and aggregation, as shown in FIG. 2.

Specifically, in the discovery stage the central control node initiates periodic polling to discover the computing nodes available for the current computing task. In the assignment (allocation) stage, once the capabilities of the available computing nodes are known, the computing task is decomposed and computed cooperatively. In the reporting stage, each computing node that was assigned a subtask reports its computation result to the central control node over the network. There are two kinds of reported result: success and failure. Because reporting takes place over the network, the central control node must detect failed computing nodes; a computing node that cannot complete its assigned subtask within the specified time does not report. In the aggregation stage, the reported results are analyzed and aggregated. A subtask whose result is a failure is sent to another valid computing node and its result is awaited; the steps are the same as in the allocation stage and constitute a secondary allocation. In more extreme cases, where the secondary allocation also fails to produce a good result, further attempts may be made until the set time or the maximum number of attempts is reached, after which the computing task is abandoned.
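The secondary allocation described for the aggregation stage can be sketched as a bounded retry loop. The names `run_with_reassignment` and `execute`, and the convention that `execute` returns `None` for a failed or missing report, are hypothetical; the time limit mentioned in the text is left out so that only the attempt limit is modelled.

```python
from typing import Callable, Optional, Sequence, Tuple


def run_with_reassignment(
    subtask: str,
    candidate_nodes: Sequence[int],
    execute: Callable[[int, str], Optional[str]],
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Try the subtask on successive nodes until one succeeds or attempts run out.

    Returns (result, attempts made); result is None if the task is abandoned.
    """
    attempts = 0
    for node in candidate_nodes[:max_attempts]:
        attempts += 1
        result = execute(node, subtask)  # None means failure or no report in time
        if result is not None:
            return result, attempts
    return None, attempts


def fake_execute(node: int, task: str) -> Optional[str]:
    """Stand-in for a real node: node 1 always fails, others succeed."""
    return None if node == 1 else f"{task}@{node}"


outcome = run_with_reassignment("T1", [1, 2, 3], fake_execute)
```

Here the first attempt on node 1 fails, so the subtask is reassigned to node 2, which succeeds on the second attempt.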

In some examples, to allow for node failure, the method of the invention adopts a redundancy strategy in which the same decomposed subtask may be assigned to multiple computing nodes. This also makes it possible to compare the computation results of computing nodes that performed the same task.
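One way to compare the results of nodes that ran the same subtask is a majority vote, sketched below. This is an assumption rather than the method stated in the text, which says only that the results can be compared; the function name `reconcile` and the use of `None` for a missing report are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Optional


def reconcile(results: Dict[int, Optional[Hashable]]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of responding nodes, else None.

    results maps node ID to its reported result; None marks a node that failed
    or did not report.
    """
    reported = [value for value in results.values() if value is not None]
    if not reported:
        return None
    value, votes = Counter(reported).most_common(1)[0]
    return value if votes > len(reported) / 2 else None


agreed = reconcile({1: 42, 2: 42, 3: 41})
```

Requiring a strict majority means a two-way disagreement yields no result, which would be a natural trigger for reassigning the subtask.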

In one embodiment of the invention, the communication protocol between computing nodes uses the XML (Extensible Markup Language) format and is either standardized within the large enterprise or follows the corresponding international or national standards. Specifically, in existing large multi-unit distributed enterprise systems, application units usually exchange data over network sockets using custom message formats, whether fixed-length or delimiter-separated. Because these custom formats lack a unified standard, they are arbitrary and insufficiently general and flexible, and they cannot meet the needs of enterprises whose IT build-out cycles are long and in which new technologies keep emerging. The invention therefore adopts a standard communication protocol in XML format as the data exchange standard for applications.

In the control method according to the embodiments of the invention, the central control node initiates periodic polling, the potential participating nodes report their spare computing power, the task is decomposed according to the data the nodes report and assigned to designated nodes for computation, the nodes report their results, and the reported information is finally aggregated into the final result. The method therefore makes full use of the spare computing power of the computing nodes on site (such as sensors and data processing units) and can effectively raise the level of informatization of the water conservancy construction site.

本发明的进一步实施例还提供了一种应用于水利施工现场的计算节点的控制系统。如图5所示,根据本发明一个实施例的应用于水利施工现场的计算节点的控制系统500,包括:发现模块510、分配模块520、上报模块530和控制模块540。A further embodiment of the present invention also provides a control system applied to computing nodes in water conservancy construction sites. As shown in FIG. 5 , a control system 500 applied to computing nodes at a water conservancy construction site according to an embodiment of the present invention includes: a discovery module 510 , an allocation module 520 , a reporting module 530 and a control module 540 .

其中,发现模块510用于通过定期轮询以发现可用于计算任务的多个计算节点。在一些示例中,结合图3所示,具体概括为以下步骤:Wherein, the discovery module 510 is configured to discover multiple computing nodes that can be used for computing tasks through periodic polling. In some examples, combined with what is shown in Figure 3, it is specifically summarized as the following steps:

步骤1:根据计算节点列表发送轮询请求并启用等待定时器。换言之,即中心控制节点(包含于控制模块540)按照上次计算维护的计算节点表最近最多用过的(Most RecentlyUsed,MRU)节点列表,逐次发送轮询请求,并等待计算节点(如记为Nn)回应,并同时开启等待定时器,记为Tn。Step 1: Send a polling request based on the compute node list and enable the waiting timer. In other words, the central control node (included in the control module 540) sends polling requests one by one according to the most recently used (Most Recently Used, MRU) node list in the computing node table maintained by the last calculation, and waits for the computing node (as denoted as Nn) response, and open the waiting timer at the same time, denoted as Tn.

在一些示例中,优选地,节点表采用了链表的形式,其格式如下:In some examples, preferably, the node table is in the form of a linked list, and its format is as follows:

链表头,总数-→节点ID,节点寻址→下一个节点…→链表尾。Linked list head, total number-→node ID, node addressing→next node...→linked list tail.
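The linked-list node table can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class names `NodeEntry` and `NodeTable`, the string form of the address, and the move-to-front rule used to keep the list in most-recently-used order are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class NodeEntry:
    node_id: int                        # logical node ID
    address: str                        # node addressing info (hypothetical format)
    next: Optional["NodeEntry"] = None  # link to the next entry, None at the tail


class NodeTable:
    """Singly linked node table: head (with total count) -> entries -> tail."""

    def __init__(self) -> None:
        self.head: Optional[NodeEntry] = None
        self.total = 0

    def touch(self, node_id: int, address: str) -> None:
        """Insert a node, or move an existing one to the front (assumed MRU rule)."""
        self.remove(node_id)
        self.head = NodeEntry(node_id, address, self.head)
        self.total += 1

    def remove(self, node_id: int) -> bool:
        """Unlink a node, e.g. when its physical number is no longer valid."""
        prev, cur = None, self.head
        while cur is not None:
            if cur.node_id == node_id:
                if prev is None:
                    self.head = cur.next
                else:
                    prev.next = cur.next
                self.total -= 1
                return True
            prev, cur = cur, cur.next
        return False

    def __iter__(self) -> Iterator[NodeEntry]:
        cur = self.head
        while cur is not None:
            yield cur
            cur = cur.next
```

Iterating over the table yields the nodes in the order in which polling requests would be sent under this assumption, most recently used first.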

In addition, in some examples, the central control node maintains an information table that records the state of each computing node, including the correspondence between its logical number and its physical number. The logical number is the ID of the computing node; the physical number is the node's actual identifier, for example the MAC address of a network card. A computing node whose physical number is no longer valid is deleted from the information table. In a specific example, the general message format is shown in Table 1 below:

No.   Field                     Description
9     Message number            Message type
10    Source node number        Node number
11    Destination node number   Destination node number
12    Version type              Version number of this message type
13    Timestamp                 Number of seconds elapsed since 12:00:00 on January 1, 2000
14    Sequence number           An incrementing serial number
15    Message body              Message content corresponding to the message type
16    Checksum                  Computed with the checksum algorithm in use

Table 1
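The header fields of Table 1 can be modeled as a small record. The sketch below is illustrative only: the names `MessageHeader` and `new_header` are hypothetical, the reference time is treated as a naive local time because the text does not state a time zone, and the process-wide counter is one assumed way to produce an incrementing sequence number. Note that the sample messages in this description carry a formatted date string rather than a second count.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import count

# Reference instant from Table 1: 12:00:00 on January 1, 2000 (time zone assumed naive).
EPOCH = datetime(2000, 1, 1, 12, 0, 0)

# Incrementing serial number shared by all messages of this sender (assumption).
_sequence = count(1)


def seconds_since_epoch(moment: datetime) -> int:
    """Seconds elapsed between the Table 1 reference instant and `moment`."""
    return int((moment - EPOCH).total_seconds())


@dataclass
class MessageHeader:
    message_id: str   # message type, e.g. "DISCOVERY_REQ"
    src_node: int     # source node number
    dest_node: int    # destination node number
    version: str      # version of this message type
    timestamp: int    # seconds since EPOCH
    seq_no: int       # incrementing serial number


def new_header(message_id: str, src_node: int, dest_node: int,
               moment: datetime, version: str = "1.0") -> MessageHeader:
    """Build a header with the timestamp and next sequence number filled in."""
    return MessageHeader(message_id, src_node, dest_node, version,
                         seconds_since_epoch(moment), next(_sequence))
```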

The content of a message in an actual implementation is as follows:

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <version>1.0</version>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>no</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>
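A message with this layout could be produced with Python's standard `xml.etree.ElementTree` module, as sketched below. The function name `build_discovery_req` is hypothetical, and the checksum is an assumption: the description does not name the checksum algorithm, so CRC-32 over the serialized head and body is used here purely as a stand-in.

```python
import xml.etree.ElementTree as ET
import zlib


def build_discovery_req(src: int, dest: int, seq_no: int,
                        timestamp: str, broadcast: bool = False) -> bytes:
    """Serialize a DISCOVERY_REQ message with head, body, and tail sections."""
    message = ET.Element("message")

    head = ET.SubElement(message, "head")
    for tag, value in (
        ("message_id", "DISCOVERY_REQ"),
        ("version", "1.0"),
        ("src_node", str(src)),
        ("dest_node", str(dest)),
        ("time_tamp", timestamp),      # element name as spelled in the sample message
        ("seq_no", str(seq_no)),
    ):
        ET.SubElement(head, tag).text = value

    body = ET.SubElement(message, "body")
    ET.SubElement(body, "broadcast").text = "yes" if broadcast else "no"

    # Assumed checksum: CRC-32 of the serialized head and body.
    payload = ET.tostring(head) + ET.tostring(body)
    tail = ET.SubElement(message, "tail")
    ET.SubElement(tail, "checksum").text = str(zlib.crc32(payload))

    return ET.tostring(message, encoding="utf-8", xml_declaration=True)
```

Calling `build_discovery_req(1, 2, 11, "2013-12-13:00:03:45:234")` yields a point-to-point request; passing `broadcast=True` changes only the `<broadcast>` element.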

Further, the central control node maintains additional information, as in the information model shown in FIG. 4, which contains the information listed in Table 2 below:

Table 2

In some examples, the polling request sent is preferably:

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>no</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>

In this step, the waiting timer may be set to, for example, 10 s.

Step 2: Each computing node receives the polling request, estimates its current computing capability, and sends it to the control module 540 (which contains the central control node). Specifically:

M = N + P1 + P2,

where M is the current computing capability of the computing node, N is the current CPU occupancy rate, P1 is the CPU occupancy rate over a past period, and P2 is the expected CPU occupancy rate over a future period.
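The estimate M = N + P1 + P2 can be written out as a short function. This is a sketch under stated assumptions: the description does not say how P1 is derived from history, so the mean of recent samples is used here, and the function name `estimate_capability` and its parameters are hypothetical.

```python
from statistics import mean
from typing import Sequence


def estimate_capability(current: float,
                        past_samples: Sequence[float],
                        expected: float) -> float:
    """Return M = N + P1 + P2.

    current      -- N, the current CPU occupancy rate
    past_samples -- recent occupancy samples; their mean is taken as P1 (assumption)
    expected     -- P2, the occupancy rate expected over a future period
    """
    p1 = mean(past_samples) if past_samples else 0.0
    return current + p1 + expected
```

For example, with N = 20, past samples of 10 and 30 (so P1 = 20), and P2 = 15, the reported value is M = 55.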

Further, in some examples, the response message sent has the following format:

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_ACK</message_id>
    <src_node>2</src_node>
    <dest_node>1</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>1</seq_no>
  </head>
  <body>
    <power>234</power>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>
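On the receiving side, the central control node needs the sender's node number and the reported `<power>` value from such a response. The following sketch shows one way to extract them with `xml.etree.ElementTree`; the function name `parse_discovery_ack` is hypothetical, and checksum verification is omitted because the algorithm is not specified in this description.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

SAMPLE_ACK = """<message>
  <head>
    <message_id>DISCOVERY_ACK</message_id>
    <src_node>2</src_node>
    <dest_node>1</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>1</seq_no>
  </head>
  <body><power>234</power></body>
  <tail><checksum>123</checksum></tail>
</message>"""


def parse_discovery_ack(xml_text: str, my_node_id: int) -> Optional[Tuple[int, int]]:
    """Return (source node, reported power), or None if the message is not for us."""
    root = ET.fromstring(xml_text)
    if root.findtext("head/message_id") != "DISCOVERY_ACK":
        return None
    # Messages addressed to another node are simply dropped.
    if root.findtext("head/dest_node") != str(my_node_id):
        return None
    src = root.findtext("head/src_node")
    power = root.findtext("body/power")
    if src is None or power is None:
        return None
    return int(src), int(power)
```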

Step 3: Before the waiting timer expires, the control module 540 determines, from the current computing capability of each computing node, whether the computing nodes can complete the computing task.

Step 4: If they can, the nodes are used to complete the computing task; otherwise polling requests continue to be sent. That is, the central control node determines from the reports whether the computing nodes participating in this task can complete it, taking redundancy into account. If so, it enters the assignment phase and allocates the computing task. If not, it continues polling in a wider discovery phase, changing the point-to-point message of Step 1 into a broadcast polling request. In some examples, the message format is as follows:

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>yes</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>

Step 5: When the waiting timer expires, the discovery module 510 stops waiting for responses from computing nodes and discards response messages that arrive after the timeout.
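Steps 1 to 5 can be sketched as a single discovery routine. This is only an illustration: the transport is abstracted behind a hypothetical `poll` callable, and the sufficiency test (the sum of reported values reaching a required threshold) is an assumption, since the description does not state how the central control node decides that the nodes can complete the task.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# A reply is (node_id, reported power, seconds elapsed before the reply arrived).
Reply = Tuple[int, int, float]


def discover(mru_nodes: List[int],
             poll: Callable[[Optional[int], bool], Iterable[Reply]],
             required: int,
             timeout_s: float = 10.0) -> Optional[Dict[int, int]]:
    """Return {node_id: power} for usable nodes, or None if discovery fails."""
    found: Dict[int, int] = {}

    # Step 1 and Step 2: point-to-point requests in MRU order, collect replies.
    for node in mru_nodes:
        for node_id, power, elapsed in poll(node, False):
            if elapsed <= timeout_s:      # Step 5: replies after the timer are discarded
                found[node_id] = power

    # Step 3 and Step 4: enough capability (assumed criterion)? Then stop here.
    if sum(found.values()) >= required:
        return found

    # Step 4, otherwise: widen discovery with a broadcast polling request.
    for node_id, power, elapsed in poll(None, True):
        if elapsed <= timeout_s:
            found[node_id] = power

    return found if sum(found.values()) >= required else None
```

In a deployment, `poll` would send the XML request and wait on the network; here it is any function that returns the replies observed for one request.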

The allocation module 520 obtains the current computing capability of each of the computing nodes, decomposes the computing task, and has the computing nodes process the decomposed task cooperatively. In some examples, this is summarized in the following steps:

Step A: Let there be n computing nodes and decompose the computing task into m subtasks, where n > m. For example, with computing nodes N1, N2, ..., Nn and a computing task T decomposed into m subtasks T1, T2, ..., Tm, where n > m, the computing nodes N1, N2, ..., Nn process the corresponding subtasks T1, T2, ..., Tm.

In some examples, the subtasks in this step are expressed semantically, so that in a heterogeneous environment they are independent of the operating system of each computing node; the decomposed subtasks are uniformly described in MathML.
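As an illustration of an operating-system-independent subtask description, the sketch below emits Content MathML for a simple sum. The choice of a sum, the function name `sum_subtask`, and the variable names are assumptions; the description states only that subtasks are described in MathML.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def sum_subtask(*variables: str) -> str:
    """Describe the subtask 'add these variables' as Content MathML."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    apply_el = ET.SubElement(math, "apply")
    ET.SubElement(apply_el, "plus")          # operator comes first inside <apply>
    for name in variables:
        ET.SubElement(apply_el, "ci").text = name   # one identifier per operand
    return ET.tostring(math, encoding="unicode")
```

A node on any platform can parse the resulting `<apply><plus/>...</apply>` expression and evaluate it with its own numeric routines.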

Step B: Send each subtask to its corresponding computing node and start the timeout timer.

Step C: Maintain a keep-alive (heartbeat) timer H for each computing node and periodically determine whether each computing node has failed.

Step D: Before the timeout timer expires, receive the computation results of the computing nodes.

Note that in the above process, for fault tolerance, a target computing node simply discards any message whose destination does not match its own ID.
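Steps A to D and the discard rule can be sketched with three small helpers. Several details are assumptions not found in the description: subtasks are given to the nodes with the highest reported values first, heartbeats are tracked as last-seen times, and all function names are hypothetical.

```python
from typing import Dict, List, Set


def assign(subtasks: List[str], capabilities: Dict[int, int]) -> Dict[str, int]:
    """Steps A and B: map each subtask to one node (highest reported value first)."""
    ranked = sorted(capabilities, key=capabilities.get, reverse=True)
    if len(ranked) < len(subtasks):
        raise ValueError("fewer computing nodes than subtasks")
    return dict(zip(subtasks, ranked))


def failed_nodes(last_heartbeat: Dict[int, float], now: float,
                 heartbeat_timeout: float) -> Set[int]:
    """Step C: nodes whose last heartbeat is older than the keep-alive period."""
    return {node for node, seen in last_heartbeat.items()
            if now - seen > heartbeat_timeout}


def accept(packet_dest: int, my_id: int) -> bool:
    """Fault tolerance: a node processes only messages addressed to its own ID."""
    return packet_dest == my_id
```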

The reporting module 530 reports the processing result of each computing node. Specifically, the reporting module 530 reports each computing node's processing result to the central control node, that is, to the control module 540.

The control module 540 analyzes the processing result of each computing node in order to control the computing nodes.

In summary, the system 500 of the present invention makes full use of the remaining computing capability of the large number of sensors and data processing units (that is, computing nodes) deployed on site, networking them through a suitable protocol. The protocol has four phases: discovery, assignment, reporting, and aggregation, as shown in FIG. 2.

Specifically, in the discovery phase the central control node initiates periodic polling to discover the computing nodes available for the current computing task. In the assignment (allocation) phase, knowing the capabilities of the available computing nodes, it decomposes the computing task for cooperative computation. In the reporting phase, each computing node that was assigned a subtask reports its result to the central control node over the network. A reported result is either a success or a failure. Because reporting takes place over the network, the central control node must also detect failed computing nodes: a node that cannot complete its assigned subtask within the specified time does not report. In the aggregation phase, the reported results are analyzed and aggregated. A subtask whose result is a failure is sent to another working computing node and its result is awaited; the steps are similar to those of the allocation phase, and this constitutes a secondary allocation. In more extreme cases, where the secondary allocation also fails, further attempts can be made until the set time or the maximum number of attempts is reached, at which point the computing task is abandoned.
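The secondary allocation described above can be sketched as a bounded retry loop. This is an assumption-laden illustration: the rotation rule used to pick a different node on each attempt, the `execute` callable that returns `None` for a failure or a missing report, and the attempt limit are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


def run_with_retries(subtasks: List[str],
                     nodes: List[int],
                     execute: Callable[[int, str], Optional[str]],
                     max_attempts: int = 3) -> Tuple[Dict[str, str], List[str]]:
    """Return (results by subtask, subtasks still unfinished after all attempts)."""
    results: Dict[str, str] = {}
    pending = list(subtasks)
    for attempt in range(max_attempts):
        if not pending:
            break
        still_pending = []
        for index, task in enumerate(pending):
            # Shift the node choice on every attempt so a retry goes elsewhere.
            node = nodes[(index + attempt) % len(nodes)]
            outcome = execute(node, task)     # None means failure or no report
            if outcome is None:
                still_pending.append(task)
            else:
                results[task] = outcome
        pending = still_pending
    return results, pending
```

If the second element of the returned pair is non-empty, the attempt limit was reached and the computing task would be abandoned.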

In some examples, to account for node failure, the allocation module 520 adopts a redundancy strategy: the same decomposed subtask can be assigned to multiple computing nodes. This also makes it possible to compare the results of computing nodes that executed the same subtask.
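One way to compare the results of nodes that ran the same subtask is a strict majority vote, sketched below. The voting rule is an assumption; the description says only that the results can be compared, and the function name `reconcile` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def reconcile(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of redundant nodes, else None."""
    if not results:
        return None
    value, votes = Counter(results).most_common(1)[0]
    return value if votes > len(results) / 2 else None
```

A `None` return signals that the redundant nodes disagree and the subtask would need to be assigned again.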

In one embodiment of the present invention, the communication protocol between the computing nodes uses the XML (Extensible Markup Language) format, standardized within a large enterprise or following the corresponding international or national standards. In the prior art, the application units of a large multi-unit distributed enterprise system generally exchange data over network sockets using custom message formats, either fixed-length or delimiter-separated. Because these custom formats lack a unified standard, they are arbitrary and insufficiently general and flexible, and they cannot keep up with long enterprise IT construction cycles and the constant emergence of new technologies. The present invention therefore adopts a standard XML-format communication protocol as the data exchange standard for applications.

In the control system for computing nodes applied to a water conservancy construction site according to the embodiments of the present invention, the central control node initiates periodic polling, the potential participating nodes report their remaining computing capabilities, the task is decomposed according to the reported data and assigned to designated nodes for computation, the nodes report their results, and the reported information is finally aggregated into the final result. The system thus makes full use of the remaining computing capability of the on-site computing nodes (such as sensors and data processing units) and can effectively raise the level of informatization at a water conservancy construction site.

In the description of the present invention, it should be understood that orientations or positional relationships indicated by terms such as "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", and "circumferential" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they therefore cannot be construed as limiting the present invention.

In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the number of the technical features indicated. A feature qualified by "first" or "second" may thus explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise.

In the present invention, unless expressly specified and limited otherwise, terms such as "mounted", "coupled", "connected", and "fixed" are to be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.

In the present invention, unless expressly specified and limited otherwise, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediate medium. Moreover, a first feature being "over", "above", or "on top of" a second feature may mean that the first feature is directly or obliquely above the second feature, or simply that the first feature is at a higher level than the second feature. A first feature being "under", "below", or "beneath" a second feature may mean that the first feature is directly or obliquely below the second feature, or simply that the first feature is at a lower level than the second feature.

In this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Such expressions do not necessarily refer to the same embodiment or example. The specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.

Although embodiments of the present invention have been shown and described above, it is to be understood that these embodiments are exemplary and cannot be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (10)

1. A method for controlling computing nodes applied to a water conservancy construction site, characterized by comprising the following steps:
using periodic polling to discover multiple computing nodes that can be used for a computing task;
obtaining the current computing capability of each of the computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the computing nodes;
each computing node sending its processing result to a central control node; and
the central control node analyzing the processing result of each computing node to control the computing nodes.

2. The method for controlling computing nodes applied to a water conservancy construction site according to claim 1, characterized in that using periodic polling to discover multiple computing nodes that can be used for a computing task specifically comprises:
sending polling requests according to a computing node list and starting a waiting timer;
each computing node receiving the polling request, estimating its current computing capability, and sending it to the central control node, specifically:
M = N + P1 + P2,
where M is the current computing capability of the computing node, N is the current CPU occupancy rate, P1 is the CPU occupancy rate over a past period, and P2 is the expected CPU occupancy rate over a future period;
before the waiting timer expires, the central control node determining, from the current computing capability of each computing node, whether the computing nodes can complete the computing task;
if so, using the nodes to complete the computing task, and otherwise continuing to send polling requests; and
when the waiting timer expires, no longer waiting for responses from computing nodes and discarding response messages that arrive after the timeout.

3. The method for controlling computing nodes applied to a water conservancy construction site according to claim 1, characterized in that obtaining the current computing capability of each of the computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the computing nodes specifically comprises:
letting there be n computing nodes and decomposing the computing task into m subtasks, where n > m;
sending each subtask to its corresponding computing node and starting a timeout timer;
periodically determining whether each computing node has failed; and
before the timeout timer expires, receiving the computation results of the computing nodes.

4. The method for controlling computing nodes applied to a water conservancy construction site according to claim 3, characterized by further comprising:
adopting a redundancy strategy in which the same decomposed subtask can be assigned to multiple computing nodes.

5. The method for controlling computing nodes applied to a water conservancy construction site according to any one of claims 1 to 4, characterized in that a communication protocol in XML format is used between the computing nodes.

6. A control system for computing nodes applied to a water conservancy construction site, characterized by comprising:
a discovery module for discovering, through periodic polling, multiple computing nodes that can be used for a computing task;
an allocation module for obtaining the current computing capability of each of the computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the computing nodes;
a reporting module for reporting the processing result of each computing node; and
a control module that analyzes the processing result of each computing node to control the computing nodes.

7. The control system for computing nodes applied to a water conservancy construction site according to claim 6, characterized in that the discovery module discovering, through periodic polling, multiple computing nodes that can be used for a computing task specifically comprises:
sending polling requests according to a computing node list and starting a waiting timer;
each computing node receiving the polling request, estimating its current computing capability, and sending it to the control module, specifically:
M = N + P1 + P2,
where M is the current computing capability of the computing node, N is the current CPU occupancy rate, P1 is the CPU occupancy rate over a past period, and P2 is the expected CPU occupancy rate over a future period;
before the waiting timer expires, the control module determining, from the current computing capability of each computing node, whether the computing nodes can complete the computing task;
if so, using the nodes to complete the computing task, and otherwise continuing to send polling requests; and
when the waiting timer expires, the discovery module no longer waiting for responses from computing nodes and discarding response messages that arrive after the timeout.

8. The control system for computing nodes applied to a water conservancy construction site according to claim 6, characterized in that the allocation module obtaining the current computing capability of each of the computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the computing nodes specifically comprises:
letting there be n computing nodes and decomposing the computing task into m subtasks, where n > m;
sending each subtask to its corresponding computing node and starting a timeout timer;
periodically determining whether each computing node has failed; and
before the timeout timer expires, receiving the computation results of the computing nodes.

9. The control system for computing nodes applied to a water conservancy construction site according to claim 8, characterized in that the allocation module is further configured to adopt a redundancy strategy in which the same decomposed subtask can be assigned to multiple computing nodes.

10. The control system for computing nodes applied to a water conservancy construction site according to any one of claims 6 to 9, characterized in that a communication protocol in XML format is used between the computing nodes.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410465692.7A CN104243579A (en) 2014-09-12 2014-09-12 Computational node control method and system applied to water conservancy construction site

Publications (1)

Publication Number Publication Date
CN104243579A (en) 2014-12-24


Similar Documents

Publication Publication Date Title
CN103236949B (en) Monitoring method, device and system for a server cluster
CN101183984B (en) Network management system, management method and equipment
AU2024227768A1 (en) Hierarchical update and configuration of software for networked communication devices using multicast
CN110191148B (en) Statistical function distributed execution method and system for edge calculation
CN102420699B (en) Equipment number distribution method of digital radio frequency remote system and system thereof
CN103684933B (en) Internet of things system, Internet of Things agent apparatus and method
CN1874267A (en) Method for ensuring accordant configuration information in cluster system
CN104243579A (en) Computational node control method and system applied to water conservancy construction site
CN101951369A (en) Batch terminal upgrading method and system based on automatic discovery
CN109040184B (en) A method for electing a master node and a server
CN105471995A (en) High-availability implementation method for large-scale Web server cluster based on SOA
CN104750544B (en) Process management system and process management method applied to distributed systems
CN104038570B (en) Data processing method and device
CN110661637A (en) Distributed system member change method and distributed system
CN103974140A (en) Management method and management system of TR069 protocol based large-scale interactive TV terminal
WO2014000698A1 (en) Ip layer-based network topology identification method and device
CN104320347B (en) Method and apparatus for actively updating LLDP
CN102769867B (en) Method for network access
CN104270452B (en) Telemedicine data management system and wireless network communication method thereof
WO2013097363A1 (en) Method and system for scheduling data sharing device
CN108810042A (en) Task processing method, related device and system
CN201387555Y (en) A Comprehensive Remote Monitoring System
WO2021254466A1 (en) Method, apparatus and system for configuring edge side device
CN117082106B (en) Multi-level data networking methods, systems, devices and equipment for government cloud environments
CN104349338A (en) Method and system used for monitoring sensor access gateway

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2014-12-24
