CN111400026B

CN111400026B - A distributed load balancing method based on master-slave backup technology

Info

Publication number: CN111400026B
Application number: CN201911119106.2A
Authority: CN
Inventors: 谢在鹏; 李博文; 张基; 朱晓瑞; 徐媛媛; 叶保留; 毛莺池
Original assignee: Hohai University HHU
Current assignee: Hohai University HHU
Priority date: 2019-11-15
Filing date: 2019-11-15
Publication date: 2023-02-28
Anticipated expiration: 2039-11-15
Also published as: CN111400026A

Abstract

本发明公开了一种主从备份技术的分布式负载均衡方法，包括步骤如下：S1划分节点集合；S2划分任务集合；S3分发子任务；S4记录执行时间；S5计算执行效率；S6求解分配方案；通过将任务分配问题建模为线性规划问题，通过对抽象模型的分析，实现了对任务分配可行域的求解。同时结合实际问题，提出一种带备份的任务分配和动态调整方法，该方法旨在提供一种能在有备份的分布式集群上实现按效能的任务分配，同时尽可能降低节点之间的显在性能关联性。能够实现在提高主节点的任务量时，尽量不提高备份节点的额外负担，进而可以缓解节点间存在的性能差异对整体任务运行时间的影响。The invention discloses a distributed load balancing method of a master-slave backup technology, comprising the following steps: S1 divides node sets; S2 divides task sets; S3 distributes subtasks; S4 records execution time; S5 calculates execution efficiency; S6 solves distribution schemes ; By modeling the task allocation problem as a linear programming problem, and through the analysis of the abstract model, the solution to the task allocation feasible domain is realized. At the same time, combined with practical problems, a task allocation and dynamic adjustment method with backup is proposed. in performance relevance. It can be realized that when increasing the task load of the master node, the additional burden of the backup node should not be increased as far as possible, thereby alleviating the impact of performance differences between nodes on the overall task running time.

Description

A distributed load balancing method based on master-slave backup technology

技术领域technical field

本发明属于计算机通信领域，尤其涉及一种基于主从备份技术的分布式负载均衡方法。The invention belongs to the field of computer communication, in particular to a distributed load balancing method based on master-slave backup technology.

背景技术Background technique

在分布式系统中，容忍进程失败的关键方法是把多个同样的进程放到一个组中，当一个信息发送到组本身进行处理时，所有的成员都接受并处理它。通过这种方式，如果组中的一个进程失败，其他的一些进程可以接管它。在需要容错的情况下，通常使用进程复制的方法，当主进程崩溃时，备份进程替代当前主进程的任务；在需要提高效能的情况下，通常使用复制和缓存扩展以及冗余编码来复制主进程形成进程组，由于冗余的存在可以使得任务更快回收(基于MDS码的任务冗余部分快速回收)，可以使得任务所需通讯量减少(基于CDC编码的通信负载降低)。In a distributed system, the key way to tolerate process failure is to put multiple identical processes into a group. When a message is sent to the group itself for processing, all members accept and process it. This way, if one process in the group fails, some other process can take over for it. In the case of fault tolerance, the method of process replication is usually used. When the main process crashes, the backup process replaces the task of the current main process; in the case of needing to improve performance, replication and cache expansion and redundant coding are usually used to copy the main process. Forming a process group, due to the existence of redundancy, tasks can be recycled faster (fast recovery of redundant parts of tasks based on MDS codes), which can reduce the amount of communication required by tasks (reduction of communication load based on CDC codes).

通常的进程备份方案将主节点上运行的主进程完全复制形成备份进程，并放置到备份节点上。由于节点之间存在全备份关系，主节点和备份节点之间需要保证强一致性，增加主节点的任务量同时也会相应的增加备份节点的任务量。当组中主节点与其备份节点存在性能差异时，提高主节点的任务量可能会导致主节点和备份节点之间存在不可忽视的运行时间差异，影响备份的实时性。In a common process backup scheme, the master process running on the master node is completely copied to form a backup process, and placed on the backup node. Since there is a full backup relationship between nodes, strong consistency needs to be ensured between the master node and the backup node. Increasing the task load of the master node will also increase the task load of the backup node accordingly. When there is a performance difference between the primary node and its backup node in the group, increasing the task load of the primary node may cause a non-negligible difference in running time between the primary node and the backup node, affecting the real-time performance of the backup.

发明内容Contents of the invention

发明目的：为了将上述任务分配问题建模为线性规划问题，通过对抽象模型的分析，实现对任务分配可行域的求解，同时结合实际问题，提出一种带备份的任务分配和动态调整方法。Purpose of the invention: In order to model the above task allocation problem as a linear programming problem, through the analysis of the abstract model, realize the solution to the task allocation feasible domain, and combine with practical problems, propose a task allocation and dynamic adjustment method with backup.

技术方案：本发明提供一种基于主从备份技术的分布式负载均衡方法，包括如下具体步骤：Technical solution: the present invention provides a distributed load balancing method based on master-slave backup technology, including the following specific steps:

(1)根据构建的有节点分布式集群，划分节点集合；(1) According to the constructed distributed cluster with nodes, divide the node set;

(2)在集群上运行设置备份等级的任务，将同一批次所有任务划分任务集合；(2) Run the task of setting the backup level on the cluster, and divide all tasks of the same batch into task sets;

(3)依据步骤(1)和(2)得到的节点集合和任务集合，将任务集合包含的任务分发至对应节点集合；(3) According to the node set and task set obtained in steps (1) and (2), distribute the tasks contained in the task set to the corresponding node set;

(4)待所有节点完成其收到的任务，收集并记录执行时间；(4) After all nodes complete the tasks they receive, collect and record the execution time;

(5)获取每个节点的计算效率，归一化计算执行效率；(5) Obtain the computing efficiency of each node, and normalize the computing execution efficiency;

(6)通过已知常量建立方程组，变形转换为线性规划问题，进而求解分配方案。(6) Establish a system of equations through known constants, transform it into a linear programming problem, and then solve the allocation scheme.

进一步地，步骤(1)中所述划分节点集合方法包括：Further, the method for dividing node sets described in step (1) includes:

(1.1)构建一个n个节点分布式集群为

(1.1) Construct a distributed cluster of n nodes as

(1.2)对n个节点进行排列组合，取任意r个节点组成一组子集合

将全部的可能的组合

组成集合

集合

有

个元素，

初始化

(1.2) Arrange and combine n nodes, and take any r nodes to form a set of subsets

combine all possible combinations

Form a set

gather

have

elements,

initialization

进一步地，步骤(2)中所述划分任务集合方法包括：Further, the method for dividing task sets described in step (2) includes:

(2.1)在步骤(1)构建的集群上运行F个均等任务量的任务，同时设置备份等级为r；(2.1) Run F tasks with an equal amount of tasks on the cluster built in step (1), and set the backup level to r at the same time;

(2.2)整个任务集合F划分为小批次，每一批次执行一次任务分配与运行统计，系统运行时段记作t，批次记作F_j，每个批次F_j的计算时间记作Δt；每个批次F_j划分为更小批量的小任务集合，记作

的大小记作

(2.2) The entire task set F is divided into small batches, each batch executes task allocation and operation statistics once, the system running period is denoted as t, the batch is denoted as F _j , and the calculation time of each batch F _j is denoted as Δt; each batch F _j is divided into smaller batches of small task sets, denoted as

the size of

(2.3)将同一个批次的所有任务F_j划分为

个任务集，在t时刻，以比例

划分批次任务F_j为

个任务集合

集合

与

类似，集合

也有

个元素。(2.3) Divide all tasks F _j of the same batch into

task set, at time t, in proportion

Divide the batch task F _j as

set of tasks

gather

and

similar, set

also have

elements.

进一步地，步骤(3)中所述分发子任务方法包括：Further, the method for distributing subtasks described in step (3) includes:

(3.1)依据步骤(1)和(2)，获取两个有

个元素的集合

和

每次从两个集合中取元素

和

其中

是一个任务集合，

是一组节点组成的集合；(3.1) According to steps (1) and (2), two valid

collection of elements

and

Take elements from two collections at a time

and

in

is a set of tasks,

is a set of nodes;

(3.2)依次将

包含的任务发送到

表示的节点上，重复上述过程直到每一个任务集

中的任务都被发送给了对应的

中所有的节点上；同时每一个节点上存在的待处理任务个数一共为

每个任务集被拷贝并发送到r个不同的节点上。(3.2) in turn

Contains tasks sent to

On the node represented by , repeat the above process until each task set

The tasks in are sent to the corresponding

On all the nodes in ; at the same time, the total number of tasks to be processed on each node is

Each task set is copied and sent to r different nodes.

进一步地，步骤(4)中记录执行时间过程还包括：Further, the recording execution time process in step (4) also includes:

(4.1)等待所有节点完成其收到的任务，执行冗余校验或其它任务。收集每个节点N_q的任务执行时间，记作Δt_q；(4.1) Wait for all nodes to complete the tasks they received, perform redundancy check or other tasks. Collect the task execution time of each node N _q , denoted as Δt _q ;

(4.2)将当前时刻记作t，节点N_q的当前计算效率记作λ_q(t)，节点N_q上拥有的任务集

的大小记作

(4.2) Denote the current moment as t, the current computing efficiency of node N _q as λ _q (t), and the task set owned by node N _q

the size of

(4.3)采用公式(1)进行计算。(4.3) Use formula (1) to calculate.

每个节点独立计算自己的计算效率λ_q(t)，并将其发送给其他所有节点。Each node independently calculates its own computational efficiency λ _q (t) and sends it to all other nodes.

进一步地，步骤(5)中计算执行效率方法包括：Further, the method for calculating execution efficiency in step (5) includes:

(5.1)获取到每个节点所有的λ_q(t)，使用公式(2)对λ_q(t)进行归一化；(5.1) Obtain all λ _q (t) of each node, and use formula (2) to normalize λ _q (t);

定义

和e(t)：

definition

and e(t):

e(t)＝(e₀(t),e₁(t),…,e_n-1(t))；e(t)=(e ₀ (t), e ₁ (t),..., e _n-1 (t));

(5.2)将分发矩阵记作

令A为系数矩阵，

为变量，e(t)为常数项，可得非齐次线性方程组(3)。(5.2) Write the distribution matrix as

Let A be the coefficient matrix,

is a variable, and e(t) is a constant item, the non-homogeneous linear equations (3) can be obtained.

其中，分发矩阵的行标代表节点下标，列标代表任务集下标，元素a_q,i＝1代表节点q拥有任务集i，元素a_q,i＝0代表节点q不拥有任务集iAmong them, the row index of the distribution matrix represents the node index, and the column index represents the task set index. The element a _{q, i} = 1 means that the node q owns the task set i, and the element a _{q, i} = 0 means that the node q does not own the task set i

进一步地，步骤(6)中求解分配方案具体包括：Further, solving the allocation scheme in step (6) specifically includes:

(7.1)当行数等于列数时，

方程组有唯一解

(7.1) When the number of rows is equal to the number of columns,

system of equations has a unique solution

(7.2)当方程组的行数大于列数时，

方程组无唯一解，可以变形为一个线性规划问题，过程如下：(7.2) When the number of rows of the equation system is greater than the number of columns,

The system of equations has no unique solution, so it can be transformed into a linear programming problem. The process is as follows:

(7.2.1)首先定义矩阵A的形式，其左半部分为

右半部分为

即：(7.2.1) First define the form of matrix A, the left half of which is

the right half is

Right now:

同理定义

的上半部分为

下半部分为

Same definition

The upper half of

The second half is

上述方程组改写为(公式4)：The above equations are rewritten as (Formula 4):

求解得：Solved:

若

的每个维度都大于0，取

为自变量，将其转换为线性规划问题，如公式(6)，其中

为向量

的第i个元素：like

Each dimension of is greater than 0, take

As an independent variable, convert it into a linear programming problem, such as formula (6), where

as a vector

The i-th element of :

求解得

Solved

(7.2.2)向量

中得每一个元素

对应一个任务集占总任务数目的比例，向量

对应的一组值即可作为一组分配方案。使用该分配方案依照比例划分每一批次的任务F_j，令t＝t+1，进入下一个时间步，并执行步骤(2)，重复任务分配与计算性能估计，直到所有任务执行完毕。(7.2.2) vector

get every element

Corresponding to the ratio of a task set to the total number of tasks, the vector

The corresponding set of values can be used as a set of allocation schemes. Use this allocation scheme to divide each batch of tasks F _j in proportion, set t=t+1, enter the next time step, and perform step (2), repeat task allocation and calculation performance estimation until all tasks are executed.

有益效果：本发明与现有技术相比，其显著优点是：(1)提供一种能在有备份的分布式集群上实现按效能的任务分配，同时尽可能降低节点之间的显在性能关联性；(2)能够实现在提高主节点的任务量时，尽量不提高备份节点的额外负担，进而可以缓解节点间存在的性能差异对整体任务运行时间的影响。Beneficial effects: Compared with the prior art, the present invention has the remarkable advantages of: (1) providing a distributed cluster that can realize performance-based task allocation while reducing the apparent performance between nodes as much as possible; Relevance; (2) It can realize that when increasing the task load of the master node, the extra burden of the backup node should not be increased as far as possible, and then the impact of the performance difference between nodes on the overall task running time can be alleviated.

附图说明Description of drawings

图1一种基于主从备份技术的分布式负载均衡方法流程图；Fig. 1 is a flow chart of a distributed load balancing method based on master-slave backup technology;

图2可行域样例。Figure 2 Feasible region example.

具体实施方式：Detailed ways:

下面结合说明书附图对本发明的具体实施方式进行说明。The specific implementation manners of the present invention will be described below in conjunction with the accompanying drawings.

如图1具体描述实施例1：设一个拥有3个节点的分布式集群

如果要在

上执行4次循环计算，每次执行120个均等任务量的任务(任务之间无关联性，可以乱序执行且不相互影响)。设置每个任务在本集群上的冗余备份数目r＝2(即每个任务要同时执行两次)。下文将依照上述步骤描述本方法的具体实施方式。Example 1 is described in detail in Figure 1: Set up a distributed cluster with 3 nodes

if you want to be in

Execute 4 loop calculations on the above, and each time execute 120 tasks with an equal amount of tasks (there is no correlation between tasks, they can be executed out of order and do not affect each other). The number of redundant backups of each task on the cluster is set to r=2 (that is, each task must be executed twice at the same time). The specific implementation of the method will be described below according to the above steps.

步骤1：对3个节点进行排列组合，得组合结果

初始化每批次任务集大小

Step 1: Arrange and combine the 3 nodes to get the combination result

Initialize the task set size for each batch

步骤2：按照每批次任务集大小

将每批次循环的任务进行划分，则一批次120个子任务将划分为三个集合

且

Step 2: According to the size of each batch of task sets

Divide the tasks of each batch cycle, then a batch of 120 subtasks will be divided into three sets

and

步骤3：依照步骤1和2，得到了两个有3个元素的集合

和

每次从两个集合中取元素

和

其中

是一个任务集合，

是一组节点组成的集合。依次将

包含的任务发送到

表示的节点上，即每一个子集合

中的所有节点均接收到同样的任务集

即将

发送给节点{N₀,N₁}，将

发送给节点{N₀,N₂}，将

发送给节点{N₁,N₂}。此时，每一个节点上存在的待处理任务个数一共为μ＝2，每个任务集被拷贝并发送到2个不同的节点上，然后进入步骤4。Step 3: According to steps 1 and 2, two sets with 3 elements are obtained

and

Take elements from two collections at a time

and

in

is a set of tasks,

It is a set of nodes. in turn

Contains tasks sent to

On the node represented, that is, each sub-set

All nodes in receive the same set of tasks

about to

Send to node {N ₀ ,N ₁ }, will

Send to node {N ₀ ,N ₂ }, will

Send to node {N ₁ , N ₂ }. At this time, the total number of tasks to be processed on each node is μ=2, and each task set is copied and sent to two different nodes, and then enter step 4.

步骤4：等待所有节点完成其收到的任务。每个节点N_q独立记录其所有子任务的计算效率并发送给集群中的其他所有节点。以节点N₀为例，将其总的运行时间记作Δt₀，当前时刻为t，将节点N₀上拥有的任务集{P₁,P₂}的大小记作

计算λ₀(t)。Step 4: Wait for all nodes to complete the tasks they received. Each node N _q independently records the computational efficiency of all its subtasks and sends them to all other nodes in the cluster. Taking node N ₀ as an example, its total running time is recorded as Δt ₀ , the current moment is t, and the size of the task set {P ₁ , P ₂ } owned by node N ₀ is recorded as

Calculate λ ₀ (t).

并发送λ₀(t)给节点N₁,N₂。进入步骤5。And send λ ₀ (t) to nodes N ₁ , N ₂ . Go to step 5.

步骤5：在每个节点N_q上执行如下操作：从其他节点获取所有的λ_q(t)，对λ_q(t)进行归一化，得Step 5: Perform the following operations on each node N _q : Get all λ _q (t) from other nodes, normalize λ _q (t), and get

定义

和e(t)为如下形式：

e(t)＝(e₀(t),e₁(t),e₂(t))。可得非齐次线性方程组：definition

and e(t) are of the form:

e(t)=(e ₀ (t), e ₁ (t), e ₂ (t)). Inhomogeneous linear equations can be obtained:

其中：in:

步骤6：求解非齐次线性方程组，解

作为下一次迭代的任务分配方案，继续执行步骤2，直到完成所有批次的迭代。Step 6: Solve the inhomogeneous system of linear equations, solving

As the task allocation scheme for the next iteration, continue to perform step 2 until all batches of iterations are completed.

假设某一次迭代中，节点N₀,N₁,N₂上的任务执行时间分别为Δt₀＝16.7secs,Δt₁＝11.1secs,Δt₂＝14.3secs。可得其计算效率分别为λ₀(t)＝0.0399,λ₁(t)＝0.0600,λ₂(t)＝0.0466，归一化之后可得e₀(t)＝0.545,e₁(t)＝0.818,e₂(t)＝0.636那么上述的非齐次线性方程组的方程表示形式就可以写作：Assume that in a certain iteration, the execution time of tasks on nodes N ₀ , N ₁ , and N ₂ are Δt ₀ =16.7 secs, Δt ₁ =11.1 secs, and Δt ₂ =14.3 secs, respectively. The calculation efficiency can be obtained as λ ₀ (t) = 0.0399, λ ₁ (t) = 0.0600, λ ₂ (t) = 0.0466, after normalization, e ₀ (t) = 0.545, e ₁ (t) ＝0.818, e ₂ (t)＝0.636 Then the equation expression form of the above-mentioned non-homogeneous linear equation system can be written as:

可以解出：can be solved:

那么下一次进行任务划分时，其每个子任务集合的大小

因此每个节点上的任务总量变为66,98,76，与其计算性能估值0.0399,0.0600,0.0466相对应，可以认为其达到了近似最大资源利用率。Then the next time the task is divided, the size of each subtask set

Therefore, the total number of tasks on each node becomes 66, 98, and 76, corresponding to its computing performance estimates of 0.0399, 0.0600, and 0.0466, which can be considered to have reached the approximate maximum resource utilization.

实施例2：设一个拥有4个节点的分布式集群

如果要在

上执行4次循环计算，每次执行120个均等任务量的任务(任务之间无关联性，可以乱序执行且不相互影响)。设置每个任务在本集群上的冗余备份数目r＝2(即每个任务要同时执行两次)。下文将依照上述步骤描述本方法的具体实施方式。Example 2: Set up a distributed cluster with 4 nodes

if you want to be in

步骤1：对4个节点进行排列组合，总共有

种组合方案，记作D₁,D₂,…,D₆。D₁＝{N₁,N₂},D₂＝{N₁,N₃},…,D₆＝{N₃,N₄}。组合结果记作

初始化

Step 1: Arrange and combine the 4 nodes, there are a total of

A combination scheme, denoted as D ₁ , D ₂ ,..., D ₆ . D ₁ ={N ₁ ,N ₂ }, D ₂ ={N ₁ ,N ₃ },...,D ₆ ={N ₃ ,N ₄ }. The combined result is recorded as

initialization

步骤2：按照每批次任务集大小

将每批次循环的任务进行划分，则一批次120个子任务将划分为6个集合

且

Step 2: According to the size of each batch of task sets

Divide the tasks of each batch cycle, then a batch of 120 subtasks will be divided into 6 sets

and

步骤3：依照步骤1和2，得到了两个有6个元素的集合

和

每次从两个集合中取元素

和

其中

是一个任务集合，

是一组节点组成的集合。依次将

包含的任务发送到

表示的节点上，即每一个子集合

中的所有节点均接收到同样的任务集

即将

发送给节点{N₀,N₁}，将

发送给节点{N₀,N₂}，将

发送给节点{N₀,N₃}，以此类推。此时，每一个节点上存在的待处理任务个数一共为μ＝3，每个任务集被拷贝并发送到2个不同的节点上，然后进入步骤4。Step 3: According to steps 1 and 2, two sets with 6 elements are obtained

and

Take elements from two collections at a time

and

in

is a set of tasks,

It is a set of nodes. in turn

Contains tasks sent to

On the node represented, that is, each sub-set

All nodes in receive the same set of tasks

about to

Send to node {N ₀ ,N ₁ }, will

Send to node {N ₀ ,N ₂ }, will

Send to node {N ₀ ,N ₃ }, and so on. At this time, the total number of tasks to be processed on each node is μ=3, and each task set is copied and sent to two different nodes, and then step 4 is entered.

步骤4：等待所有节点完成其收到的任务。每个节点N_q独立记录其所有子任务的计算效率并发送给集群中的其他所有节点。以节点N₀为例，将其总的运行时间记作Δt₀，当前时刻为t，将节点N₀上拥有的任务集{P₁,P₂,P₃}的大小记作

计算λ₀(t)。Step 4: Wait for all nodes to complete the tasks they received. Each node N _q independently records the computational efficiency of all its subtasks and sends them to all other nodes in the cluster. Taking node N ₀ as an example, its total running time is recorded as Δt ₀ , and the current moment is t, and the size of the task set {P ₁ , P ₂ , P ₃ } owned by node N ₀ is recorded as

Calculate λ ₀ (t).

并发送λ₀(t)给节点N₁,N₂,N₃。进入步骤5。And send λ ₀ (t) to nodes N ₁ , N ₂ , N ₃ . Go to step 5.

定义

和e(t)为如下形式：

e(t)＝(e₀(t),e₁(t),...,e₅(t))。definition

and e(t) are of the form:

e(t)=(e ₀ (t), e ₁ (t), . . . , e ₅ (t)).

可得非齐次线性方程组：Inhomogeneous linear equations can be obtained:

其中：in:

步骤6：上述非齐次线性方程组稀疏矩阵不为非奇异矩阵，无法求解非齐次线性方程组，可以变形为一个线性规划问题，过程如下：Step 6: The sparse matrix of the above-mentioned non-homogeneous linear equations is not a non-singular matrix and cannot solve the non-homogeneous linear equations. It can be transformed into a linear programming problem. The process is as follows:

首先矩阵A可以写作一个方阵和一个普通矩阵拼接的形式，称其左半部分为A_l，右半部分为A_r,即：First, the matrix A can be written in the form of a square matrix and an ordinary matrix spliced, and its left half is called A _l , and the right half is called A _r , that is:

同样的称

的上半部分为

下半部分为

the same name

The upper half of

The second half is

那么上述方程组可以改写为(4)：Then the above equations can be rewritten as (4):

可以解出：can be solved:

令

的每个维度都大于0，取

为自变量，可以将其转换为线性规划问题，如公式(6)。make

Each dimension of is greater than 0, take

As an independent variable, it can be transformed into a linear programming problem, such as formula (6).

求得

后，带入公式(6)进而求解

求得

向量

中得每一个元素

对应一个任务集占总任务数目的比例，那么向量

对应的一组值即可作为一组分配方案。使用该分配方案依照比例划分每一批次的任务F_j，令t＝t+1，进入下一个时间步，并执行步骤2，重复任务分配与计算性能估计，直到所有任务执行完毕。obtain

After that, it is brought into the formula (6) to solve

obtain

vector

get every element

Corresponding to the proportion of a task set to the total number of tasks, then the vector

The corresponding set of values can be used as a set of allocation schemes. Use this allocation scheme to divide each batch of tasks F _j according to the proportion, set t=t+1, enter the next time step, and perform step 2, repeat task allocation and calculation performance estimation until all tasks are executed.

假设某一次迭代中，节点N₀,N₁,N₂,N₃上的任务执行时间分别为Δt₀＝16.7secs,Δt₁＝11.1secs,Δt₂＝14.3secs,Δt₃＝11.1secs。可得其计算效率分别为：Assume that in a certain iteration, the task execution times on nodes N ₀ , N ₁ , N ₂ , and N ₃ are Δt ₀ =16.7 secs, Δt ₁ =11.1 secs, Δt ₂ =14.3 secs, and Δt ₃ =11.1 secs, respectively. The computational efficiencies can be obtained as follows:

λ₀(t)＝0.0299,λ₁(t)＝0.0450,λ₂(t)＝0.0349,λ₃(t)＝0.0450λ ₀ (t) = 0.0299, λ ₁ (t) = 0.0450, λ ₂ (t) = 0.0349, λ ₃ (t) = 0.0450

归一化之后可得：After normalization, we can get:

e₀(t)＝0.3862,e₁(t)＝0.5814,e₂(t)＝0.4510,e₃(t)＝0.5814e ₀ (t)=0.3862, e ₁ (t)=0.5814, e ₂ (t)=0.4510, e ₃ (t)=0.5814

那么上述的非齐次线性方程组的方程表示形式就可以写作：Then the equation representation of the above non-homogeneous linear equations can be written as:

即：Right now:

系数矩阵：Coefficient matrix:

对其求逆得：Inverting it yields:

求

得：beg

have to:

求

得：beg

have to:

令：make:

可得：Available:

上述约束在第一象限内有可行域，如图2所示。The above constraints have a feasible region in the first quadrant, as shown in Figure 2.

在上述可行域中求：Find in the above feasible domain:

取一组可行解：Take a set of feasible solutions:

继续解出

Continue to solve

那么下一次进行任务划分时，其每个子任务集合的大小

因此每个节点上的任务总量变为46,70,54,70，与其计算性能估值0.0299,0.0450,0.039,0.0450相对应，可以认为其达到了近似最大资源利用率。Then the next time the task is divided, the size of each subtask set

Therefore, the total number of tasks on each node becomes 46, 70, 54, and 70, corresponding to its computing performance estimates of 0.0299, 0.0450, 0.039, and 0.0450, which can be considered to have reached the approximate maximum resource utilization.

Claims

1. A distributed load balancing method based on master-slave backup technology, characterized in that, comprising the steps:

(1) According to the constructed distributed cluster with nodes, divide the node set;

(2) Run the task of setting the backup level on the cluster, and divide all tasks of the same batch into task sets;

(3) According to the node set and task set obtained in steps (1) and (2), distribute the tasks contained in the task set to the corresponding node set;

(4) After all nodes complete the tasks they receive, collect and record the execution time;

(5) Obtain the computing efficiency of each node, and normalize the computing execution efficiency;

(6) Establish a system of equations through known constants, transform it into a linear programming problem, and then solve the allocation scheme;

Wherein, the method for dividing the set of nodes described in step (1) includes:

(1.1) Construct a distributed cluster of n nodes as

(1.2) Arrange and combine n nodes, and take r nodes to form a set of subsets

combine all possible combinations

Form a set

gather

have

elements,

initialization

In step (6), solving the allocation scheme specifically includes:

(6.1) When the number of rows is equal to the number of columns,

system of equations has a unique solution

(6.2) When the number of rows of the equation system is greater than the number of columns,

(6.2.1) Define the form of matrix A, whose left half is

the right half is

Right now:

Define the vector in the same way

The upper half of

The second half is

The above equations are rewritten as (Formula 4):

Solved:

like

Each dimension of is greater than 0, take

as a vector

The i-th element of :

Solved

(6.2.2) vector

get every element

The corresponding set of values can be used as a set of allocation schemes, use this allocation scheme to divide each batch of tasks F _j according to the proportion, set t=t+1, enter the next time step, and perform step (2), repeat Task allocation and computing performance estimation until all tasks are executed.

2. a kind of distributed load balancing method based on master-slave backup technology according to claim 1, is characterized in that, described in the step (2) task collection method comprises:

(2.1) Run F tasks with an equal amount of tasks on the cluster constructed in step (1), and set the backup level to r at the same time;

(2.2) The entire task set F is divided into small batches, each batch executes task allocation and operation statistics, the system running period is recorded as t, and the calculation time of each batch of task F _j is recorded as Δt; each batch The second task F _i is divided into smaller batches of small task sets, denoted as

the size of

(2.3) Divide all tasks F _j of the same batch into

task set, at time t, in proportion

Divide the batch task F _j as

set of tasks

gather

and

similar, set

also have

elements.

3. a kind of distributed load balancing method based on master-slave backup technology according to claim 2, is characterized in that, in the step (3), the method for distributing subtasks comprises:

(3.1) Get two

collection of elements

and

Take elements from two collections at a time

and

in

is a set of tasks,

is a set of nodes;

(3.2) in turn

Contains tasks sent to

on the nodes represented, until each task set

The tasks in are sent to the corresponding

Each task set is copied and sent to r different nodes.

4. a kind of distributed load balancing method based on master-slave backup technology according to claim 3, is characterized in that, in step (4), record execution time process also comprises:

(4.1) Wait for all nodes to complete the tasks they received, perform redundancy checks or other tasks, collect the task execution time of each node N _q , and record it as Δt _q ;

the size of

(4.3) Use formula (1) to calculate:

Each node independently calculates its own computational efficiency λ _q (t) and sends it to all other nodes.

5. a kind of distributed load balancing method based on master-slave backup technology according to claim 4, is characterized in that, calculates execution efficiency method in step (5) and comprises:

(5.1) Obtain all λ _q (t) of each node, and use formula (2) to normalize λ _q (t);

definition

and e(t):

e(t)=(e ₀ (t), e ₁ (t),..., e _n-1 (t));

(5.2) Denote the distribution matrix as A,

Let A be the coefficient matrix,

is a variable, and e(t) is a constant item, the non-homogeneous linear equation system (3) can be obtained:

Among them, the row index of the distribution matrix represents the node index, and the column index represents the task set index. The element a _{q, i} = 1 means that the node q owns the task set i, and the element a _{q, i} = 0 means that the node q does not own the task set i .