CN117808127B - Image processing method, federated learning method and device under data heterogeneity conditions - Google Patents
- Publication number: CN117808127B (application CN202410230103.0A)
- Authority: CN (China)
- Prior art keywords: cluster, edge computing, data, computing device
- Legal status: Active (status assumed by Google; not a legal conclusion)
Classifications
- G06N20/00 — Machine learning
- G06F9/5033 — Allocation of resources to service a request, the resource being a machine (e.g. CPUs, servers, terminals), considering data affinity
- G06F9/5061 — Partitioning or combining of resources
- G06N5/04 — Inference or reasoning models
- G06V10/95 — Hardware or software architectures specially adapted for image or video understanding, structured as a network, e.g. client-server architectures
Abstract
The present invention discloses an image processing method, a federated learning method, and an apparatus under data heterogeneity conditions, relating to the field of image processing. Edge computing devices are clustered according to the similarity of their data distributions. The edge computing devices within a cluster have similar data distributions, which allows the model to better capture the characteristics of the data and effectively addresses the problem of data heterogeneity. The edge computing devices within a cluster aggregate model parameters along an intra-cluster tree aggregation network: a lower-layer device sends its model parameters only to the corresponding device in the layer above, and to no other device, which greatly reduces communication overhead. The edge computing devices and the edge cloud server perform two-layer model parameter aggregation during federated learning to obtain an accurate and reliable image processing model, which the edge computing devices finally use for image processing, thereby improving the accuracy and reliability of image processing.
Description
Technical Field
The present invention relates to the field of image processing, and in particular to an image processing method, a federated learning method, an apparatus, a system, a device, and a medium under data heterogeneity conditions.
Background Art
Federated learning is a distributed machine learning method. The participants in federated learning do not exchange local data; instead, they exchange local model parameters, i.e., the model parameters obtained by each edge computing device training a model on its local data. The edge cloud server aggregates the local model parameters of the edge computing devices to obtain global model parameters, and sends the global model, namely the model whose parameters are the global model parameters, to each edge computing device. However, traditional federated learning schemes often ignore differences in data distribution between edge computing devices, which leads to poor adaptability of the model to the data distributions of some devices; in federated learning environments with pronounced data heterogeneity, this impairs model performance. In addition, traditional federated learning schemes usually aggregate model parameters in a centralized manner: every edge computing device sends its model parameters to the edge cloud server for aggregation, which incurs considerable communication overhead. When the trained global model is an image processing model, the accuracy and reliability of image processing with that model are therefore limited.
In view of this, how to address data heterogeneity in federated learning, reduce communication overhead, and improve the accuracy and reliability of image processing has become an urgent technical problem for those skilled in the art.
Summary of the Invention
The purpose of the present invention is to provide an image processing method, a federated learning method, an apparatus, a system, a device, and a medium under data heterogeneity conditions that can address data heterogeneity in federated learning and reduce communication overhead.
To solve the above technical problem, the present invention provides an image processing method under data heterogeneity conditions, comprising:
dividing edge computing devices into a number of data homogeneity clusters according to the similarity of their data distributions;
selecting an edge computing device from each data homogeneity cluster as the cluster head of that cluster;
receiving, from the cluster head, the intra-cluster model parameter aggregation result of its data homogeneity cluster, where the intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by the edge computing devices in the cluster aggregating their local model parameters along an intra-cluster tree aggregation network; the edge computing devices in the cluster construct the intra-cluster tree aggregation network according to a communication-optimal strategy, the cluster head is the root node of the network, and each child node in the network sends its model parameters to the corresponding parent node, where they are aggregated with the parent node's local model parameters;
aggregating the intra-cluster model parameter aggregation results of the data homogeneity clusters to obtain a second-layer model parameter aggregation result, and, when the image processing model whose parameters are the second-layer model parameter aggregation result converges, delivering the image processing model to the edge computing devices so that they use it to process images.
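The second-layer (server-side) aggregation step above can be sketched as follows. This is a minimal illustration under assumptions the text leaves open at this point: parameters are flat lists of floats and clusters are weighted uniformly (a later embodiment replaces uniform weighting with accuracy-based weights). Function and variable names are hypothetical.

```python
# Minimal sketch of the server-side second-layer aggregation.
# Assumptions: parameters are flat lists of floats; uniform cluster weights.

def second_layer_aggregate(intra_cluster_results):
    """Aggregate the first-layer (intra-cluster) results uploaded by the
    cluster heads into the second-layer (global) parameters."""
    n_clusters = len(intra_cluster_results)
    dim = len(intra_cluster_results[0])
    return [
        sum(params[i] for params in intra_cluster_results) / n_clusters
        for i in range(dim)
    ]

# Two cluster heads upload their intra-cluster aggregation results.
uploads = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
global_params = second_layer_aggregate(uploads)  # [2.0, 3.0, 4.0]
```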
In some embodiments, selecting an edge computing device from the data homogeneity cluster as the cluster head of that cluster comprises:
selecting, as the cluster head, the edge computing device in the cluster that is closest to the other edge computing devices in the cluster, or that has the highest communication rate with them.
In some embodiments, the cluster head selects a preset number of edge computing devices from the other devices in the cluster as the layer-1 child nodes of the intra-cluster tree aggregation network, the layer-1 child nodes being the devices with the highest data transmission rate when communicating with the cluster head. The layer-i child nodes then select a preset number of devices from the remaining devices in the cluster as the layer-(i+1) child nodes, the layer-(i+1) child nodes being the devices with the highest data transmission rate when communicating with the layer-i child nodes. Starting from i = 1, this repeats until the intra-cluster tree aggregation network is constructed.
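This greedy, rate-driven construction can be sketched as follows; the link-rate table, its `(parent, candidate)` keying, and the fan-out of two are illustrative assumptions, not details fixed by the text.

```python
# Hedged sketch of building the intra-cluster tree aggregation network:
# each parent in the current layer adopts, from the devices not yet placed,
# the `fan_out` children with the highest transmission rate to it.
# `rate[(parent, candidate)]` is a hypothetical link-rate table.

def build_tree(head, devices, rate, fan_out=2):
    """Return {parent: [children]} for the tree rooted at the cluster head."""
    remaining = [d for d in devices if d != head]
    tree = {}
    frontier = [head]                      # parents of the layer being built
    while remaining:
        next_frontier = []
        for parent in frontier:
            if not remaining:
                break
            # fastest links from this parent to the unplaced devices first
            remaining.sort(key=lambda d: rate[(parent, d)], reverse=True)
            children, remaining = remaining[:fan_out], remaining[fan_out:]
            tree[parent] = children
            next_frontier.extend(children)
        frontier = next_frontier
    return tree

devices = ["head", "a", "b", "c"]
rate = {("head", "a"): 10, ("head", "b"): 8, ("head", "c"): 3,
        ("a", "c"): 9, ("b", "c"): 1}
tree = build_tree("head", devices, rate)
# The head adopts its two fastest links (a, b); c then attaches under a.
```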
In some embodiments, aggregating the intra-cluster model parameter aggregation results of the data homogeneity clusters to obtain the second-layer model parameter aggregation result comprises:
determining the weight coefficient of each data homogeneity cluster for the current training round;
performing, according to the weight coefficients of the data homogeneity clusters, a weighted aggregation of their intra-cluster model parameters to obtain the second-layer model parameter aggregation result of the current training round.
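A hedged sketch of this weighted aggregation, assuming the weight coefficients K_c are given and the parameters are flat lists of floats (names are illustrative; the text does not fix the parameter layout):

```python
# Second-layer weighted aggregation: each cluster's intra-cluster result
# is scaled by its normalized weight coefficient K_c.

def weighted_aggregate(cluster_params, coefficients):
    """Weighted average of per-cluster parameter lists."""
    total = sum(coefficients)
    weights = [k / total for k in coefficients]   # normalize the K_c
    dim = len(cluster_params[0])
    return [sum(w * p[i] for w, p in zip(weights, cluster_params))
            for i in range(dim)]

params = [[1.0, 1.0], [4.0, 2.0]]
coeffs = [3.0, 1.0]                        # K_1 = 3, K_2 = 1
agg = weighted_aggregate(params, coeffs)   # [1.75, 1.25]
```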
In some embodiments, determining the weight coefficient of each data homogeneity cluster for the current training round comprises:
collecting the local data test accuracy of each edge computing device in the data homogeneity cluster to obtain the local data test accuracy of the cluster, where the local data test accuracy of an edge computing device is the accuracy obtained by that device testing, on its local data, the global model of the previous training round;
determining the weight coefficient of the data homogeneity cluster according to its local data test accuracy, wherein the weight coefficient of the cluster is negatively correlated with its local data test accuracy.
In some embodiments, determining the weight coefficient of the data homogeneity cluster according to its local data test accuracy comprises:
obtaining the weight coefficient of the data homogeneity cluster according to a formula [not reproduced in this text], where K_c denotes the weight coefficient of the c-th data homogeneity cluster and μ denotes the local data test accuracy of the c-th data homogeneity cluster.
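Since the concrete formula is not reproduced in this text, the sketch below uses K_c = 1 − μ_c, one simple choice consistent with the stated negative correlation between weight and local test accuracy. It is an assumption for illustration, not the patent's formula.

```python
# Assumed form K_c = 1 - mu_c (NOT the patent's formula, which is
# omitted in the source): lower test accuracy -> larger weight.

def weight_coefficient(mu_c):
    """Hypothetical weight for cluster c, given test accuracy mu_c in [0, 1]."""
    return 1.0 - mu_c

# A cluster that the previous global model serves poorly gets a larger
# weight, pulling the next aggregation toward its data.
low_acc_weight = weight_coefficient(0.6)
high_acc_weight = weight_coefficient(0.9)
```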
In some embodiments, a training round comprises three stages: local model training, first-layer model parameter aggregation, and second-layer model parameter aggregation. After each edge computing device in a data homogeneity cluster completes a first preset number of model parameter updates, one first-layer model parameter aggregation is performed; after each data homogeneity cluster completes a second preset number of first-layer model parameter aggregations, one second-layer model parameter aggregation is performed.
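The schedule of one round can be illustrated with assumed counts (the text only calls them "preset"; E1 and E2 are hypothetical names):

```python
# One training round with E1 local updates per first-layer aggregation
# and E2 first-layer aggregations per second-layer aggregation.
E1, E2 = 5, 3   # illustrative values only

events = []
for _first_layer_round in range(E2):
    for _local_update in range(E1):
        events.append("local_update")
    events.append("first_layer_aggregation")
events.append("second_layer_aggregation")

# One round: 15 local updates, 3 first-layer aggregations, 1 second-layer.
```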
In some embodiments, the method further comprises:
whenever a preset number of training rounds of model training is completed, re-clustering the edge computing devices according to the similarity of their data distributions and performing another preset number of training rounds, until the global model converges.
In some embodiments, dividing the edge computing devices into a number of data homogeneity clusters according to the similarity of their data distributions comprises:
constructing a weighted undirected graph according to the data distribution similarity of the edge computing devices, where the value of the edge connecting two edge computing devices in the graph is the value of the data distribution similarity between the two devices;
dividing the edge computing devices into a number of data homogeneity clusters according to the weighted undirected graph.
In some embodiments, constructing the weighted undirected graph according to the data distribution similarity of the edge computing devices comprises:
retrieving public data from the public network and building a test data set based on the public data;
delivering the test data set to the edge computing devices so that each edge computing device runs inference on the test data set with its local model;
receiving the inference results uploaded by the edge computing devices and computing the similarity between the inference results;
constructing the weighted undirected graph according to the similarity of the inference results.
In some embodiments, constructing the weighted undirected graph according to the similarity of the inference results comprises:
comparing the similarity of the inference results of each pair of edge computing devices with a preset threshold;
if the similarity of the inference results of two edge computing devices is greater than the preset threshold, establishing a connection between the two devices, with the similarity value serving as the value of the connecting edge between them.
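A minimal sketch of this graph construction, assuming cosine similarity over each device's inference output vector (the text does not fix the similarity measure) and an illustrative threshold:

```python
# Build the weighted undirected graph: an edge is kept only when the
# pairwise similarity of inference results exceeds the preset threshold.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def build_graph(results, threshold=0.8):
    """results: {device: inference vector}. Returns {(i, j): similarity}."""
    names = sorted(results)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            s = cosine(results[a], results[b])
            if s > threshold:          # keep only sufficiently similar pairs
                edges[(a, b)] = s
    return edges

results = {"d1": [1.0, 0.0], "d2": [0.9, 0.1], "d3": [0.0, 1.0]}
graph = build_graph(results)
# d1 and d2 are similar enough to be connected; d3 stays unconnected.
```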
In some embodiments, dividing the edge computing devices into a number of data homogeneity clusters according to the weighted undirected graph comprises:
initializing the label of each edge computing device in the weighted undirected graph;
iteratively updating the labels of the edge computing devices, which comprises: setting the labels of edge computing devices connected by an edge whose value is greater than a set threshold to the same label; and, for each target edge computing device, counting the occurrences of the labels of its neighboring edge computing devices and selecting the most frequent label among the neighbors as the label of the target device, where a target edge computing device is one for which the value of every connected edge is not greater than the set threshold;
judging whether the stop condition of the iterative update is satisfied;
if the stop condition is satisfied, assigning the edge computing devices with the same label to the same data homogeneity cluster.
In some embodiments, judging whether the stop condition of the iterative update is satisfied comprises:
after each iteration, computing the amount of change between the labels of the edge computing devices after the current iteration and their labels after the previous iteration;
comparing the amount of change with a preset value;
if the amount of change is less than the preset value, the stop condition of the iterative update is satisfied.
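The label-propagation clustering of the preceding embodiments can be sketched as follows. The strong-edge threshold, the tie-breaking rule (take the smallest label), and the stop condition (zero label changes, a special case of "change below a preset value") are illustrative assumptions.

```python
# Label propagation over the weighted graph: strong edges force a shared
# label; nodes without strong edges take the majority label of neighbours.
from collections import Counter

def label_propagation(nodes, edges, strong_threshold=0.9, max_iters=20):
    labels = {n: n for n in nodes}              # unique initial labels
    neighbours = {n: [] for n in nodes}
    for (a, b), w in edges.items():             # undirected graph
        neighbours[a].append((b, w))
        neighbours[b].append((a, w))
    for _ in range(max_iters):
        changed = 0
        for n in nodes:
            strong = [m for m, w in neighbours[n] if w > strong_threshold]
            if strong:                          # strong edge: share a label
                new = min(labels[m] for m in strong + [n])
            elif neighbours[n]:                 # otherwise: majority vote
                new = Counter(labels[m] for m, _ in neighbours[n]).most_common(1)[0][0]
            else:
                new = labels[n]
            changed += new != labels[n]
            labels[n] = new
        if changed == 0:                        # stop condition
            break
    return labels

nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 0.95, ("c", "d"): 0.93, ("b", "c"): 0.2}
labels = label_propagation(nodes, edges)
# {a, b} and {c, d} end up as two separate data homogeneity clusters.
```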
To solve the above technical problem, the present invention further provides a federated learning method under data heterogeneity conditions, comprising:
dividing edge computing devices into a number of data homogeneity clusters according to the similarity of their data distributions;
selecting an edge computing device from each data homogeneity cluster as the cluster head of that cluster;
receiving, from the cluster head, the intra-cluster model parameter aggregation result of its data homogeneity cluster, where the intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by the edge computing devices in the cluster aggregating their local model parameters along an intra-cluster tree aggregation network; the edge computing devices in the cluster construct the intra-cluster tree aggregation network according to a communication-optimal strategy, the cluster head is the root node of the network, and each child node in the network sends its model parameters to the corresponding parent node, where they are aggregated with the parent node's local model parameters;
aggregating the intra-cluster model parameter aggregation results of the data homogeneity clusters to obtain a second-layer model parameter aggregation result, and, when the global model whose parameters are the second-layer model parameter aggregation result converges, delivering the global model to the edge computing devices.
To solve the above technical problem, the present invention further provides an image processing apparatus under data heterogeneity conditions, comprising:
a partitioning module, configured to divide edge computing devices into a number of data homogeneity clusters according to the similarity of their data distributions;
a selection module, configured to select an edge computing device from each data homogeneity cluster as the cluster head of that cluster;
a receiving module, configured to receive, from the cluster head, the intra-cluster model parameter aggregation result of its data homogeneity cluster, where the intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by the edge computing devices in the cluster aggregating their local model parameters along an intra-cluster tree aggregation network; the edge computing devices in the cluster construct the intra-cluster tree aggregation network according to a communication-optimal strategy, the cluster head is the root node of the network, and each child node in the network sends its model parameters to the corresponding parent node, where they are aggregated with the parent node's local model parameters;
an aggregation module, configured to aggregate the intra-cluster model parameter aggregation results of the data homogeneity clusters to obtain a second-layer model parameter aggregation result, and, when the image processing model whose parameters are the second-layer model parameter aggregation result converges, to deliver the image processing model to the edge computing devices so that they use it to process images.
To solve the above technical problem, the present invention further provides a federated learning apparatus under data heterogeneity conditions, comprising:
a partitioning unit, configured to divide edge computing devices into a number of data homogeneity clusters according to the similarity of their data distributions;
a selection unit, configured to select an edge computing device from each data homogeneity cluster as the cluster head of that cluster;
a receiving unit, configured to receive, from the cluster head, the intra-cluster model parameter aggregation result of its data homogeneity cluster, where the intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by the edge computing devices in the cluster aggregating their local model parameters along an intra-cluster tree aggregation network; the edge computing devices in the cluster construct the intra-cluster tree aggregation network according to a communication-optimal strategy, the cluster head is the root node of the network, and each child node in the network sends its model parameters to the corresponding parent node, where they are aggregated with the parent node's local model parameters;
an aggregation unit, configured to aggregate the intra-cluster model parameter aggregation results of the data homogeneity clusters to obtain a second-layer model parameter aggregation result, and, when the global model whose parameters are the second-layer model parameter aggregation result converges, to deliver the global model to the edge computing devices.
To solve the above technical problem, the present invention further provides a device, comprising:
a memory for storing a computer program;
a processor for implementing, when executing the computer program, the steps of the image processing method under data heterogeneity conditions described above, or the steps of the federated learning method under data heterogeneity conditions described above.
To solve the above technical problem, the present invention further provides an image processing system under data heterogeneity conditions, comprising:
edge computing devices, configured to train the image processing model using local data to obtain local model parameters, and to process images using the image processing model delivered by the edge cloud server;
an edge cloud server, configured to divide the edge computing devices into a number of data homogeneity clusters according to the similarity of their data distributions; to select an edge computing device from each data homogeneity cluster as the cluster head of that cluster; to receive, from the cluster head, the intra-cluster model parameter aggregation result of its data homogeneity cluster, where the intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by the edge computing devices in the cluster aggregating their local model parameters along an intra-cluster tree aggregation network, the edge computing devices in the cluster construct the intra-cluster tree aggregation network according to a communication-optimal strategy, the cluster head is the root node of the network, and each child node in the network sends its model parameters to the corresponding parent node, where they are aggregated with the parent node's local model parameters; and to aggregate the intra-cluster model parameter aggregation results of the data homogeneity clusters to obtain a second-layer model parameter aggregation result and, when the image processing model whose parameters are the second-layer model parameter aggregation result converges, to deliver the image processing model to the edge computing devices so that they use it to process images.
To solve the above technical problem, the present invention further provides a federated learning system under data heterogeneity conditions, comprising:
edge computing devices, configured to train a model using local data to obtain local model parameters;
an edge cloud server, configured to divide the edge computing devices into a number of data homogeneity clusters according to the similarity of their data distributions; to select an edge computing device from each data homogeneity cluster as the cluster head of that cluster; to receive, from the cluster head, the intra-cluster model parameter aggregation result of its data homogeneity cluster, where the intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by the edge computing devices in the cluster aggregating their local model parameters along an intra-cluster tree aggregation network, the edge computing devices in the cluster construct the intra-cluster tree aggregation network according to a communication-optimal strategy, the cluster head is the root node of the network, and each child node in the network sends its model parameters to the corresponding parent node, where they are aggregated with the parent node's local model parameters; and to aggregate the intra-cluster model parameter aggregation results of the data homogeneity clusters to obtain a second-layer model parameter aggregation result and, when the global model whose parameters are the second-layer model parameter aggregation result converges, to deliver the global model to the edge computing devices.
To solve the above technical problem, the present invention further provides a medium storing a computer program which, when executed by a processor, implements the steps of the image processing method under data heterogeneity conditions described above, or of the federated learning method under data heterogeneity conditions described above.
The image processing method under data heterogeneity conditions provided by the present invention clusters edge computing devices according to the similarity of their data distributions, so that the devices within a cluster have similar data distributions. Because similar data distributions allow the model to better capture the characteristics of the data, intra-cluster model parameter aggregation improves the model training accuracy of each cluster and effectively addresses data heterogeneity. Moreover, the edge computing devices within a cluster aggregate model parameters along the intra-cluster tree aggregation network: a child node sends its model parameters only to the corresponding parent node, i.e., a lower-layer device sends its model parameters only to the corresponding device in the layer above and to no other device, which greatly reduces communication overhead. Furthermore, once the edge computing devices are clustered, if a device in one cluster fails or goes offline, the other clusters can still carry out local model training and model parameter aggregation normally, which improves the fault tolerance and scalability of the federated training system as a whole. The edge computing devices and the edge cloud server perform two-layer model parameter aggregation during federated learning to obtain an accurate and reliable image processing model, which the edge computing devices finally use for image processing, thereby improving the accuracy and reliability of image processing.
The federated learning method under data heterogeneity conditions provided by the present invention likewise clusters edge computing devices according to data distribution similarity, so that the edge computing devices within a cluster have similar data distributions. Because similar data distributions allow a model to better capture the characteristics of the data, intra-cluster model parameter aggregation improves the model training accuracy of each cluster and effectively addresses the data heterogeneity problem. In addition, the edge computing devices within a cluster aggregate model parameters along an intra-cluster tree aggregation network: a child node sends model parameters only to its parent node, that is, an edge computing device in a lower layer sends model parameters only to the corresponding edge computing device in the layer above, and to no other device, which greatly reduces communication overhead. Furthermore, once the edge computing devices are clustered, if an edge computing device in one cluster fails or goes offline, the other clusters can still perform local model training and model parameter aggregation normally, improving the overall fault tolerance and scalability of the federated training system.
The image processing apparatus, federated learning apparatus, device, system, and medium under data heterogeneity conditions provided by the present invention all have the above technical effects.
BRIEF DESCRIPTION OF THE DRAWINGS
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings required by the prior art and the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a federated learning method under data heterogeneity conditions provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a weighted undirected graph provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a clustering result provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of an intra-cluster tree aggregation network provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a federated learning apparatus under data heterogeneity conditions provided by an embodiment of the present invention;
FIG. 6 is a schematic flow chart of an image processing method under data heterogeneity conditions provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of an image processing apparatus under data heterogeneity conditions provided by an embodiment of the present invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
The core of the present invention is to provide an image processing method, a federated learning method, an apparatus, a system, a device, and a medium under data heterogeneity conditions, which can solve the data heterogeneity problem in federated learning, reduce communication overhead, and improve the accuracy and reliability of image processing.
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Please refer to FIG. 1, which is a schematic flow chart of a federated learning method under data heterogeneity conditions provided by an embodiment of the present invention. As shown in FIG. 1, the method includes:
S101: Divide the edge computing devices into several data homogeneity clusters according to the data distribution similarity of the edge computing devices;
Edge computing devices within the same data homogeneity cluster have similar data distributions.
In some embodiments, dividing the edge computing devices into several data homogeneity clusters according to their data distribution similarity includes:
constructing a weighted undirected graph according to the data distribution similarity of the edge computing devices, where the value of the edge connecting two edge computing devices in the weighted undirected graph is the value of the data distribution similarity of those two devices; and
dividing the edge computing devices into several data homogeneity clusters according to the weighted undirected graph.
This embodiment constructs a weighted undirected graph based on the data distribution similarity of the edge computing devices and then divides the devices into data homogeneity clusters based on that graph.
In some embodiments, constructing the weighted undirected graph according to the data distribution similarity of the edge computing devices includes:
searching for public data on the public network and building a test data set based on the public data;
sending the test data set to each edge computing device so that each edge computing device uses its local model to run inference on the test data set;
receiving the inference results uploaded by the edge computing devices and calculating the similarity between the inference results; and
constructing the weighted undirected graph according to the similarity of the inference results.
Each edge computing device trains a model on its local data to obtain a local model. The edge cloud server searches for public data on the public network, builds a test data set from it, and sends the test data set to each edge computing device. Each device stores the test data set, runs inference on it with the local model trained on its local data, and uploads the inference result to the edge cloud server. The edge cloud server calculates the similarity between the inference results and constructs the weighted undirected graph of the edge computing devices from those similarities.
The edge cloud server can use a vector similarity method to calculate the similarity between inference results. For example, it can use the Jaccard similarity coefficient, which measures the similarity between sets and can also be applied to binary vectors. For two binary vectors A and B, the Jaccard similarity is: similarity = |A ∩ B| / |A ∪ B|, where A ∩ B denotes the intersection of A and B and A ∪ B denotes their union.
Suppose the inference result of edge computing device C is the binary vector [1,0,0,0,…,1,1,1,0] and the inference result of edge computing device D is the binary vector [0,1,1,0,…,1,1,1,0]. The Jaccard similarity coefficient is then computed between these two binary vectors.
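The Jaccard computation above can be sketched as follows. The eight-element vectors here are illustrative stand-ins for the truncated vectors in the example, since the full vectors are not given in the text:

```python
def jaccard_similarity(a, b):
    """Jaccard similarity of two equal-length binary vectors:
    |A ∩ B| / |A ∪ B|, where the intersection counts positions that are 1
    in both vectors and the union counts positions that are 1 in either."""
    intersection = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    union = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return intersection / union if union else 1.0

# Hypothetical full versions of the truncated vectors from the example.
c = [1, 0, 0, 0, 1, 1, 1, 0]
d = [0, 1, 1, 0, 1, 1, 1, 0]
print(jaccard_similarity(c, d))  # 3 shared 1-positions out of 6 total → 0.5
```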
This embodiment characterizes the data distribution similarity between edge computing devices by the similarity of their inference results on the same test data set: the more similar the inference results the devices produce with their local models, the more similar those locally trained models are; and the more similar the locally trained models, the more similar the devices' data distributions. With this clustering approach, the edge computing devices only need to upload inference results rather than local data, which preserves the privacy of the local data.
In some embodiments, constructing the weighted undirected graph according to the similarity of the inference results includes:
comparing the similarity of the inference results of each pair of edge computing devices with a preset threshold; and
if the similarity of the inference results of two edge computing devices is greater than the preset threshold, establishing a connection between the two devices, with the similarity value serving as the value of the connecting edge between them.
If the similarity of the inference results of two edge computing devices is greater than the preset threshold, a connection is established between them, and the value of the connecting edge equals that similarity value; otherwise, no connection is established. As an example, with the preset threshold set to 0.7, the weighted undirected graph shown in FIG. 2 is obtained, in which devices 1 to 6 are six edge computing devices and the number on the edge connecting two devices is the similarity value of their inference results.
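A minimal sketch of this thresholding step. The pairwise similarity values below are hypothetical (the figure's exact values are not all given in the text), and `build_weighted_graph` is an illustrative helper name:

```python
def build_weighted_graph(similarities, threshold=0.7):
    """Given a dict {(i, j): similarity} over device pairs, connect two
    devices only when their similarity exceeds the threshold; the edge
    value is the similarity value itself (adjacency-dict representation)."""
    graph = {}
    for (i, j), sim in similarities.items():
        if sim > threshold:
            graph.setdefault(i, {})[j] = sim
            graph.setdefault(j, {})[i] = sim
    return graph

# Hypothetical pairwise similarities for six devices.
sims = {(1, 2): 0.94, (3, 4): 0.91, (3, 5): 0.75, (4, 5): 0.72,
        (3, 6): 0.74, (4, 6): 0.71, (1, 3): 0.40}
g = build_weighted_graph(sims)
print(sorted(g[3]))  # device 3 is connected to devices 4, 5 and 6
```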
In some embodiments, dividing the edge computing devices into several data homogeneity clusters according to the weighted undirected graph includes:
initializing the label of each edge computing device in the weighted undirected graph;
iteratively updating the labels of the edge computing devices, where one iteration includes: setting the labels of edge computing devices connected by an edge whose value is greater than a set threshold to the same label; and, for each target edge computing device (a device none of whose connecting edges has a value greater than the set threshold), counting the occurrences of the labels of its neighboring edge computing devices and selecting the most frequent neighbor label as its own;
judging whether the iterative update stop condition is met; and
if the stop condition is met, dividing edge computing devices with the same label into the same data homogeneity cluster.
The labels of the edge computing devices in the weighted undirected graph are initialized to distinct labels, and the edge cloud server then updates them iteratively. One iteration proceeds as follows: each edge value is compared with the set threshold, and devices connected by an edge whose value exceeds the threshold are given the same label. For a device none of whose edges exceeds the threshold, the occurrences of its neighbors' labels are counted and the most frequent label among the neighbors becomes its label. Devices connected in the weighted undirected graph are neighbors of each other. When the stop condition is met, devices sharing a label are placed in the same data homogeneity cluster; each data homogeneity cluster is the set of devices with a given label.
As an example, consider the weighted undirected graph in FIG. 2 with the threshold set to 0.9. The edge between device 1 and device 2 has value 0.94 > 0.9, so devices 1 and 2 receive the same label, say label A. The edge between device 3 and device 4 has value 0.91 > 0.9, so devices 3 and 4 receive the same label, say label B. Neither of device 5's edges (to devices 3 and 4) exceeds 0.9, so the labels of its neighbors (devices 3 and 4) are counted; both carry label B and device 5 has no other neighbors, so label B appears most often (twice) and becomes device 5's label. Likewise, neither of device 6's edges (to devices 3 and 4) exceeds 0.9; its neighbors both carry label B, so label B (appearing twice) becomes device 6's label.
Once the stop condition is met, devices 1 and 2 carry label A and devices 3 to 6 carry label B, so devices 1 and 2 form one data homogeneity cluster and devices 3 to 6 form another, yielding the clustering result shown in FIG. 3.
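The label-update procedure above can be sketched as follows. This is an illustrative implementation rather than the patent's exact algorithm: the rule for which shared label a strongly connected group adopts (here, the smallest) is an assumption, as is using zero label change as the stopping condition:

```python
def propagate_labels(graph, threshold=0.9, max_iter=100):
    """Label-propagation clustering on an adjacency dict {node: {nbr: weight}}.
    Each device starts with a unique label. In each round, devices joined by
    an edge whose weight exceeds the threshold adopt a shared label; a device
    with no such strong edge takes the most frequent label among its
    neighbours. Iteration stops once labels no longer change."""
    labels = {node: node for node in graph}
    for _ in range(max_iter):
        previous = dict(labels)
        for node, edges in graph.items():
            strong = [nbr for nbr, w in edges.items() if w > threshold]
            if strong:
                group = [node] + strong
                target = min(labels[n] for n in group)  # assumed merge rule
                for n in group:
                    labels[n] = target
            else:
                counts = {}
                for nbr in edges:
                    counts[labels[nbr]] = counts.get(labels[nbr], 0) + 1
                if counts:
                    labels[node] = max(counts, key=counts.get)
        if labels == previous:  # change amount is zero: converged
            break
    return labels

# Graph mirroring the example above: edges (1,2)=0.94 and (3,4)=0.91 are
# strong; devices 5 and 6 are only weakly linked to devices 3 and 4.
graph = {1: {2: 0.94}, 2: {1: 0.94},
         3: {4: 0.91, 5: 0.75, 6: 0.74}, 4: {3: 0.91, 5: 0.72, 6: 0.71},
         5: {3: 0.75, 4: 0.72}, 6: {3: 0.74, 4: 0.71}}
labels = propagate_labels(graph)
print(labels)  # devices 1-2 share one label; devices 3-6 share another
```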
In some embodiments, judging whether the iterative update stop condition is met includes:
after each iteration, calculating the amount of change between the labels of the edge computing devices after the current iteration and their labels after the previous iteration;
comparing the amount of change with a preset value; and
if the amount of change is smaller than the preset value, determining that the stop condition is met.
After each iteration, the edge cloud server calculates how much the devices' labels changed relative to the previous iteration. If the change is smaller than the preset value, the labels are essentially stable and no longer change significantly; the algorithm is considered to have converged, and iteration stops. Otherwise, the labels are still changing significantly, and iteration continues.
After clustering is completed, the edge cloud server sends the device number of every edge computing device in each data homogeneity cluster to the devices in that cluster, so that each edge computing device can communicate with the other edge computing devices in the same cluster.
S102: Select an edge computing device from each data homogeneity cluster as the cluster head of that cluster;
The cluster head is responsible for uploading the intra-cluster model parameter aggregation result of its cluster to the edge cloud server.
In some embodiments, selecting an edge computing device from the data homogeneity cluster as its cluster head includes:
selecting as cluster head the edge computing device in the cluster that is closest to the other edge computing devices in the cluster or has the highest communication rate with them.
This embodiment selects the cluster head on the principle of maximum communication rate or shortest communication distance: the edge computing device in the cluster that is closest to the other devices, or that has the highest communication rate with them, becomes the cluster head, which reduces communication distance and delay and improves communication efficiency. After the cluster head is selected, the edge cloud server sends the cluster head's device number to the edge computing devices in the cluster.
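A minimal sketch of the maximum-communication-rate selection principle. The pairwise rates and the use of the total rate to the other cluster members as the ranking score are illustrative assumptions:

```python
def select_cluster_head(rates):
    """Pick as cluster head the device with the largest total communication
    rate to the other devices in its cluster.  `rates` maps each device to
    a dict of per-peer rates, e.g. {dev: {peer: rate, ...}, ...}."""
    totals = {dev: sum(peers.values()) for dev, peers in rates.items()}
    return max(totals, key=totals.get)

# Hypothetical pairwise rates (e.g. MB/s) within one cluster.
rates = {3: {4: 10, 5: 8, 6: 7}, 4: {3: 10, 5: 6, 6: 5},
         5: {3: 8, 4: 6, 6: 4}, 6: {3: 7, 4: 5, 5: 4}}
print(select_cluster_head(rates))  # device 3 has the largest total rate
```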
S103: Receive the intra-cluster model parameter aggregation result uploaded by each cluster head for its data homogeneity cluster. The intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by the edge computing devices in the cluster aggregating their local model parameters along an intra-cluster tree aggregation network. The devices in the cluster construct the tree aggregation network according to a communication-optimal strategy; the cluster head is the root node of the network, and each child node in the network sends its model parameters to its parent node, where they are aggregated with the parent node's local model parameters.
As shown in FIG. 4, the intra-cluster tree aggregation network includes several layers, and each edge computing device is a node in the network. The cluster head is in the first layer and is the root node. The cluster head's child nodes (the layer-1 cluster heads shown in FIG. 4) are in the second layer; their child nodes (the layer-2 cluster heads shown in FIG. 4) are in the third layer; and so on. A layer-1 cluster head is the parent of a layer-2 cluster head and a child of the cluster head; likewise, a layer-2 cluster head is the parent of a layer-3 cluster head and a child of a layer-1 cluster head, and so on. A layer-1 cluster head sends model parameters only to the cluster head, a layer-2 cluster head sends model parameters only to a layer-1 cluster head, and so on.
Intra-cluster model parameters are aggregated as follows: an edge computing device in layer N (the last layer), i.e., a leaf node, uploads the local model parameters it trained on its local data to its parent node in layer N-1; the layer N-1 device aggregates the parameters uploaded by its child nodes with its own local model parameters and uploads the aggregation result to its parent node in layer N-2; and so on, until the cluster head aggregates the results uploaded by its child nodes with its own local model parameters to obtain the intra-cluster model parameter aggregation result.
A non-leaf node can aggregate its local model parameters with the model parameters uploaded by its child nodes by averaging them. For example, the cluster head averages the aggregation results uploaded by its child nodes with its own local model parameters to obtain the intra-cluster model parameter aggregation result.
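The averaging rule above can be sketched as a recursive pass over the tree. The node numbering, parameter values, and unweighted averaging are illustrative assumptions:

```python
def aggregate_subtree(node, children, local_params):
    """Recursively aggregate parameters up the intra-cluster tree: each
    non-leaf node averages its own local parameter vector with the results
    uploaded by its child nodes; a leaf node simply returns its own
    local parameters."""
    uploaded = [aggregate_subtree(child, children, local_params)
                for child in children.get(node, [])]
    vectors = [local_params[node]] + uploaded
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Hypothetical 3-level tree: device 0 is the cluster head (root).
children = {0: [1, 2], 1: [3], 2: []}
local_params = {0: [1.0, 1.0], 1: [3.0, 3.0], 2: [2.0, 2.0], 3: [5.0, 5.0]}
result = aggregate_subtree(0, children, local_params)
print(result)  # device 3 → device 1 gives [4, 4]; root averages [1], [4], [2]
```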
The intra-cluster tree aggregation network is constructed by the edge computing devices in a data homogeneity cluster according to a communication-optimal strategy.
In some embodiments, the cluster head selects a preset number of edge computing devices from the other devices in the cluster as the layer-1 child nodes of the intra-cluster tree aggregation network; the layer-1 child nodes are the devices with the highest data transmission rate when communicating with the cluster head. The layer-i child nodes then select a preset number of devices from the remaining devices in the cluster as the layer-(i+1) child nodes, namely the devices with the highest data transmission rate when communicating with the layer-i child nodes. This proceeds from i = 1 until the intra-cluster tree aggregation network is fully constructed.
The edge cloud server selects one edge computing device from each data homogeneity cluster as that cluster's head. The cluster head selects, from the remaining devices in the cluster, the preset number of devices with the highest data transmission rate when communicating with the cluster head as its layer-1 child nodes, the layer-1 cluster heads. Each layer-1 cluster head then selects, from the remaining devices, the preset number of devices with the highest transmission rate when communicating with it as layer-2 cluster heads, and so on, until every device in the cluster has obtained its position in the tree aggregation network, i.e., until every device knows which device to upload its model parameters to. The cluster head receives model parameters only from the layer-1 cluster heads and from no other device in the cluster; a layer-1 cluster head receives model parameters only from layer-2 cluster heads, and so on, so that a layer-(N-1) cluster head receives model parameters only from layer-N cluster heads.
While the intra-cluster tree aggregation network is being constructed, the cluster head communicates with all edge computing devices in the cluster over wireless links; the communicated data can be model parameters. The cluster head collects the model parameters uploaded by all devices in the cluster and calculates each device's send rate, which can be computed as: send rate = total communication data volume / (total send time + total receive time).
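The send-rate formula above as a one-line helper; the figures and units in the example are illustrative:

```python
def send_rate(total_bytes, total_send_time, total_recv_time):
    """Send rate as defined above:
    rate = total communication data volume / (total send time + total receive time)."""
    return total_bytes / (total_send_time + total_recv_time)

# e.g. 120 MB exchanged over 4 s of sending plus 2 s of receiving → 20 MB/s
print(send_rate(120, 4, 2))
```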
Based on the send rates of the devices in the cluster, the cluster head selects the preset number of devices with the highest send rates as layer-1 cluster heads. Upon receiving the cluster head's appointment instruction, each layer-1 cluster head communicates with the remaining devices in the cluster (excluding the cluster head and the layer-1 cluster heads), calculates their send rates, and selects the preset number of devices with the highest send rates as layer-2 cluster heads. Upon receiving a layer-1 cluster head's appointment instruction, each layer-2 cluster head communicates with the remaining devices (excluding the cluster head and the layer-1 and layer-2 cluster heads), calculates their send rates, and selects the preset number of devices with the highest send rates as layer-3 cluster heads; and so on.
As an example with the preset number set to 2: the cluster head selects the two devices with the highest send rates as layer-1 cluster heads; each layer-1 cluster head then selects the two remaining devices with the highest send rates as layer-2 cluster heads; each layer-2 cluster head in turn selects the two remaining devices with the highest send rates as layer-3 cluster heads; and so on.
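The layer-by-layer construction above can be sketched as follows. The rate function and the fanout of 2 are illustrative assumptions, and tie-breaking and wireless-link details are ignored:

```python
def build_tree(cluster_head, devices, rate, fanout=2):
    """Layer-by-layer tree construction: each node in the current layer
    picks the `fanout` unassigned devices with the highest send rate to
    itself as its children, until every device has a place in the tree.
    Returns a {parent: [children]} mapping."""
    remaining = set(devices) - {cluster_head}
    children = {}
    frontier = [cluster_head]
    while remaining:
        next_frontier = []
        for parent in frontier:
            picked = sorted(remaining, key=lambda d: rate(parent, d),
                            reverse=True)[:fanout]
            for child in picked:
                remaining.discard(child)
            children[parent] = picked
            next_frontier.extend(picked)
        if not next_frontier:  # nothing was assignable; avoid looping forever
            break
        frontier = next_frontier
    return children

# Hypothetical rate model: links are faster between devices with closer ids.
rate = lambda a, b: 100 - abs(a - b)
tree = build_tree(0, range(7), rate)
print(tree)  # {0: [1, 2], 1: [3, 4], 2: [5, 6]}
```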
S104:聚合各所述数据同性簇的簇内模型参数聚合结果,得到第二层模型参数聚合结果,并当以所述第二层模型参数聚合结果作为模型参数的全局模型收敛时,将所述全局模型下发给各所述边缘计算设备。S104: Aggregate the intra-cluster model parameter aggregation results of each of the data homogeneity clusters to obtain a second-layer model parameter aggregation result, and when a global model using the second-layer model parameter aggregation result as a model parameter converges, send the global model to each of the edge computing devices.
边缘云服务器聚合各个数据同性簇的簇内模型参数聚合结果,得到全局模型参数即第二层模型参数聚合结果。当以第二层模型参数聚合结果作为模型参数的全局模型收敛时,将收敛的全局模型下发给各边缘计算设备。边缘计算设备使用全局模型进行相应的检测、分析等。示例性的,如果联邦学习得到的全局模型为图像处理模型,那么边缘计算设备使用全局模型进行图像处理。如果联邦学习得到的全局模型为网络攻击检测模型,那么边缘计算设备使用全局模型进行网络攻击检测。The edge cloud server aggregates the cluster model parameter aggregation results of each data homogeneity cluster to obtain the global model parameters, that is, the second-layer model parameter aggregation results. When the global model with the second-layer model parameter aggregation results as the model parameters converges, the converged global model is sent to each edge computing device. The edge computing device uses the global model to perform corresponding detection, analysis, etc. Exemplarily, if the global model obtained by federated learning is an image processing model, then the edge computing device uses the global model for image processing. If the global model obtained by federated learning is a network attack detection model, then the edge computing device uses the global model for network attack detection.
在一些实施例中,聚合各所述数据同性簇的簇内模型参数聚合结果,得到第二层模型参数聚合结果包括:In some embodiments, aggregating the intra-cluster model parameter aggregation results of each of the data homogeneity clusters to obtain the second-layer model parameter aggregation results includes:
确定当前训练轮次各所述数据同性簇的权重系数;Determine the weight coefficient of each data homogeneity cluster in the current training round;
根据各所述数据同性簇的权重系数,对各所述数据同性簇的簇内模型参数进行加权聚合,得到当前训练轮次的所述第二层模型参数聚合结果。According to the weight coefficients of the data homogeneity clusters, the in-cluster model parameters of the data homogeneity clusters are weightedly aggregated to obtain the aggregation result of the second-layer model parameters of the current training round.
本实施例对每个数据同性簇设置其对应的权重系数,根据每个数据同性簇的权重系数对每个数据同性簇的簇头上传的簇内模型参数聚合结果进行加权聚合,这样可以使模型更好的反映真实的全局数据分布,提升模型的泛化能力。This embodiment sets a corresponding weight coefficient for each data homogeneity cluster, and performs weighted aggregation on the cluster model parameter aggregation results uploaded by the cluster head of each data homogeneity cluster according to the weight coefficient of each data homogeneity cluster, so that the model can better reflect the real global data distribution and improve the generalization ability of the model.
在一些实施例中,确定当前训练轮次各所述数据同性簇的权重系数包括:In some embodiments, determining the weight coefficient of each data homogeneity cluster in the current training round includes:
统计所述数据同性簇的各边缘计算设备的本地数据测试精度,得到所述数据同性簇的本地数据测试精度;边缘计算设备的本地数据测试精度为边缘计算设备使用本地数据对上一训练轮次得到的全局模型进行测试的数据测试精度;The local data test accuracy of each edge computing device of the data homogeneity cluster is counted to obtain the local data test accuracy of the data homogeneity cluster; the local data test accuracy of the edge computing device is the data test accuracy of the edge computing device using local data to test the global model obtained in the previous training round;
根据所述数据同性簇的本地数据测试精度确定所述数据同性簇的权重系数;其中,所述数据同性簇的权重系数与所述数据同性簇的本地数据测试精度呈负相关。The weight coefficient of the data homogeneity cluster is determined according to the local data test accuracy of the data homogeneity cluster; the weight coefficient of the data homogeneity cluster is negatively correlated with its local data test accuracy.
边缘云服务器将上一训练轮次得到的全局模型广播到所有边缘计算设备后,各边缘计算设备利用本地数据进行测试,测试方法可以为:边缘计算设备随机选取本地数据的1/10作为测试数据对全局模型进行测试。测试完成后,边缘计算设备将本设备的测试结果发送给边缘云服务器。边缘云服务器遍历每个数据同性簇,对每个数据同性簇的本地数据测试精度进行统计,统计方法为求簇内各个边缘计算设备的本地数据测试精度的平均值。若某个数据同性簇的本地数据测试精度低,则说明该数据同性簇数据训练不充分或者全局模型对该数据同性簇的数据不适应,此时增加该数据同性簇的权重系数。After the edge cloud server broadcasts the global model obtained in the previous training round to all edge computing devices, each edge computing device tests it with local data; for example, the device may randomly select 1/10 of its local data as test data for the global model. After the test is completed, each edge computing device sends its test results to the edge cloud server. The edge cloud server traverses each data homogeneity cluster and computes the cluster's local data test accuracy as the average of the local data test accuracies of the edge computing devices in the cluster. If a cluster's local data test accuracy is low, the cluster's data has been insufficiently trained on, or the global model does not fit that cluster's data; in that case the cluster's weight coefficient is increased.
本实施例根据数据同性簇的本地数据测试精度确定数据同性簇的权重系数,数据同性簇的本地数据测试精度不同,则数据同性簇的权重系数不同,通过赋予不同数据同性簇不同的权重系数,可以避免数据量大或质量好的边缘计算设备被忽视。同时根据数据同性簇的本地数据测试精度确定数据同性簇的权重系数,数据同性簇的权重系数是动态调整的,这样可以避免模型过度适应特定的边缘计算设备,提升模型的泛化能力。This embodiment determines each data homogeneity cluster's weight coefficient from its local data test accuracy, so clusters with different test accuracies receive different weight coefficients. Assigning different weight coefficients to different clusters prevents edge computing devices with large or high-quality data from being overlooked. Moreover, because the weight coefficients are derived from the test accuracies, they are adjusted dynamically, which keeps the model from over-fitting to specific edge computing devices and improves its generalization ability.
其中,在一些实施例中,根据所述数据同性簇的本地数据测试精度确定所述数据同性簇的权重系数包括:Among them, in some embodiments, determining the weight coefficient of the data homogeneity cluster according to the local data test accuracy of the data homogeneity cluster includes:
根据$K_c = e^{-\mu}$得到所述数据同性簇的权重系数;$K_c$表示第c个数据同性簇的权重系数,$\mu$表示第c个数据同性簇的本地数据测试精度;$e^{-\mu}$以e为底数,以$-\mu$为指数。The weight coefficient of the data homogeneity cluster is obtained from $K_c = e^{-\mu}$, where $K_c$ denotes the weight coefficient of the c-th data homogeneity cluster and $\mu$ denotes the local data test accuracy of the c-th data homogeneity cluster; $e^{-\mu}$ takes e as the base and $-\mu$ as the exponent.
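As a minimal sketch, the weight rule $K_c = e^{-\mu}$ can be written directly in Python (the accuracy values used below are illustrative assumptions):

```python
import math

def cluster_weight(mu):
    """Weight coefficient K_c = e^(-mu): the lower a cluster's local data
    test accuracy mu, the larger its aggregation weight."""
    return math.exp(-mu)
```

A cluster whose test accuracy is 0.5 thus receives a larger weight than one at 0.9, matching the stated negative correlation between weight coefficient and local test accuracy.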
在一些实施例中,一个训练轮次包括本地模型训练、第一层模型参数聚合与第二层模型参数聚合三个阶段;数据同性簇内的每个边缘计算设备完成第一预设次数的模型参数更新后,进行一次第一层模型参数聚合,每个数据同性簇完成第二预设次数的第一层模型参数聚合后,进行一次第二层模型参数聚合。In some embodiments, a training round includes three stages: local model training, first-layer model parameter aggregation, and second-layer model parameter aggregation; after each edge computing device in the data homogeneity cluster completes a first preset number of model parameter updates, a first-layer model parameter aggregation is performed; after each data homogeneity cluster completes a second preset number of first-layer model parameter aggregations, a second-layer model parameter aggregation is performed.
假设将所有边缘计算设备划分为C个数据同性簇,由集合$\{S_1, S_2, \dots, S_C\}$表示,第k个数据同性簇$S_k$包含$|S_k|$个边缘计算设备。在联邦学习系统中,边缘计算设备$i$基于自身的本地数据集$D_i$训练得到一个本地模型。边缘计算设备$i$处的数据分布的局部经验损失函数为:$F_i(w_i) = \frac{1}{|D_i|}\sum_{\xi \in D_i} f(w_i, \xi)$;其中,$w_i$为本地模型参数,$\xi$为参与迭代训练的数据样本,$f(w_i, \xi)$为样本损失函数,用于量化数据样本$\xi$上的预测误差。Assume that all edge computing devices are divided into C data homogeneity clusters, denoted by the set $\{S_1, S_2, \dots, S_C\}$; the k-th data homogeneity cluster $S_k$ contains $|S_k|$ edge computing devices. In the federated learning system, edge computing device $i$ trains a local model on its own local data set $D_i$. The local empirical loss function of the data distribution at edge computing device $i$ is $F_i(w_i) = \frac{1}{|D_i|}\sum_{\xi \in D_i} f(w_i, \xi)$, where $w_i$ denotes the local model parameters, $\xi$ a data sample participating in the iterative training, and $f(w_i, \xi)$ the sample loss function quantifying the prediction error on sample $\xi$.
联邦学习的主要目标是优化全局模型参数,以最小化与所有边缘计算设备关联的全局损失函数,同时保证聚合效率最高。全局损失函数为:The main goal of federated learning is to optimize the global model parameters to minimize the global loss function associated with all edge computing devices while ensuring the highest aggregation efficiency. The global loss function is:
$F(w) = \sum_{i} \frac{|D_i|}{\sum_{j}|D_j|}\, F_i(w)$,即各边缘计算设备局部经验损失按数据量加权的平均。That is, the data-size-weighted average of the local empirical losses $F_i$ over all edge computing devices.
联邦学习中模型训练过程分为本地模型训练、第一层模型参数聚合即簇内模型参数聚合与第二层模型参数聚合即全局聚合三个阶段,这三个阶段组合为一个训练轮次。The model training process in federated learning is divided into three stages: local model training, first-layer model parameter aggregation, i.e., cluster-wide model parameter aggregation, and second-layer model parameter aggregation, i.e., global aggregation. These three stages are combined into a training round.
本地模型训练:每个边缘计算设备更新本地模型。在一些实施例中,簇内的边缘计算设备使用SGD(Stochastic gradient descent,随机梯度下降)算法更新本地模型。在训练的第t轮过程中,第l次迭代更新过程表示为:Local model training: Each edge computing device updates the local model. In some embodiments, the edge computing devices in the cluster use the SGD (Stochastic gradient descent) algorithm to update the local model. During the tth round of training, the lth iteration update process is expressed as:
$w_i^{t,l} = w_i^{t,l-1} - \eta^{t,l}\,\nabla F_i\big(w_i^{t,l-1}\big)$;其中,$w_i^{t,l}$为第t轮第l次迭代更新后的本地模型参数,$\eta^{t,l}$为第t轮第l次迭代的学习率。where $w_i^{t,l}$ is the local model parameter after the l-th iteration update of round t, and $\eta^{t,l}$ is the learning rate of the l-th iteration of round t.
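A minimal sketch of this local update in Python, treating the parameter vector as a plain list; `grad_fn` stands in for the local gradient $\nabla F_i$ and is an assumption of the example:

```python
def sgd_step(w, grad_fn, eta):
    """One local SGD update: w <- w - eta * grad(F_i)(w)."""
    g = grad_fn(w)
    return [wi - eta * gi for wi, gi in zip(w, g)]
```

For the quadratic loss $F(w) = \tfrac{1}{2}\|w\|^2$, whose gradient is $w$ itself, one step with learning rate 0.5 halves each parameter.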
簇内模型参数聚合:簇内的边缘计算设备完成预设次数的迭代更新后,进行一次簇内模型参数聚合。聚合方式可以为:层k簇头接收其子节点上传的模型参数,并与自身的本地模型参数取平均值;簇头接收其子节点上传的模型参数,并与自身的本地模型参数取平均值,得到簇内模型参数聚合结果。其中,簇内的边缘计算设备每完成该预设次数的训练,将模型进行存储;在接收到上层簇头发送的聚合指令时,将存储的模型发送给与之连接的上层簇头。In-cluster model parameter aggregation: after the edge computing devices in a cluster complete a preset number of iteration updates, one round of in-cluster model parameter aggregation is performed. The aggregation may proceed as follows: each layer-k cluster head receives the model parameters uploaded by its child nodes and averages them with its own local model parameters, and the cluster head in turn receives the model parameters uploaded by its child nodes and averages them with its own local model parameters, yielding the in-cluster model parameter aggregation result. Each edge computing device in the cluster stores its model after every such preset number of training iterations and, upon receiving an aggregation instruction from its upper-layer cluster head, sends the stored model to the upper-layer cluster head to which it is connected.
全局聚合:当所有数据同性簇进行τ次簇内模型参数聚合后,边缘云服务器以同步的方式执行全局聚合。边缘云服务器接收C个簇头上传的模型参数,通过参数平均的方式更新全局模型参数为:Global aggregation: After all data homogeneous clusters perform intra-cluster model parameter aggregation τ times, the edge cloud server performs global aggregation in a synchronous manner. The edge cloud server receives the model parameters uploaded by C cluster heads and updates the global model parameters by parameter averaging:
$w^{t} = \frac{1}{C}\sum_{c=1}^{C} w_c^{t}$;其中,$w_c^{t}$为第c个簇第t训练轮次的模型参数,$w^{t}$为第t训练轮次更新后的全局模型参数。where $w_c^{t}$ is the model parameter of the c-th cluster in the t-th training round, and $w^{t}$ is the global model parameter updated in the t-th training round.
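A sketch of this synchronous averaging step in Python (element-wise mean over the C uploaded parameter vectors; the toy vectors below are illustrative only):

```python
def global_average(cluster_params):
    """Average the parameter vectors uploaded by the C cluster heads:
    w_global[j] = (1/C) * sum over c of w_c[j]."""
    c = len(cluster_params)
    return [sum(col) / c for col in zip(*cluster_params)]
```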
全局模型参数更新后,边缘云服务器将全局模型广播到所有的边缘计算设备,各边缘计算设备利用本地数据进行测试,测试方法可以为:边缘计算设备随机选取本地数据的1/10作为测试数据对全局模型进行测试。测试完成后,边缘计算设备将本设备的测试结果发送给边缘云服务器。边缘云服务器遍历每个数据同性簇,对每个数据同性簇的本地数据测试精度进行统计,统计方法为求簇内各个边缘计算设备的本地数据测试精度的平均值。After the global model parameters are updated, the edge cloud server broadcasts the global model to all edge computing devices, and each edge computing device tests it with local data; for example, the device may randomly select 1/10 of its local data as test data for the global model. After the test is completed, each edge computing device sends its test results to the edge cloud server. The edge cloud server traverses each data homogeneity cluster and computes the cluster's local data test accuracy as the average of the local data test accuracies of the edge computing devices in the cluster.
下一训练轮次的全局聚合表示为:The global aggregation for the next training round is expressed as:
$w^{t+1} = \sum_{c=1}^{C} K_c\, w_c^{t+1}$;其中,$w_c^{t+1}$为第c个簇第t+1训练轮次的模型参数,$w^{t+1}$为第t+1训练轮次更新后的全局模型参数。where $w_c^{t+1}$ is the model parameter of the c-th cluster in the (t+1)-th training round, and $w^{t+1}$ is the global model parameter updated in the (t+1)-th training round.
$K_c = e^{-\mu}$;其中,$K_c$表示第c个数据同性簇的权重系数,$\mu$表示第c个数据同性簇的本地数据测试精度。$K_c$ denotes the weight coefficient of the c-th data homogeneity cluster, and $\mu$ denotes the local data test accuracy of the c-th data homogeneity cluster.
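The accuracy-weighted aggregation can be sketched as follows. Note one assumption: the weights $K_c = e^{-\mu_c}$ are normalized to sum to 1 here so that the parameter scale is preserved across rounds; the text itself only fixes the form of $K_c$.

```python
import math

def weighted_global_aggregate(cluster_params, accuracies):
    """Aggregate the cluster parameter vectors with weights K_c = e^(-mu_c),
    normalized (an assumption) so the coefficients sum to 1."""
    ks = [math.exp(-mu) for mu in accuracies]
    total = sum(ks)
    ks = [k / total for k in ks]
    dim = len(cluster_params[0])
    return [sum(k * p[j] for k, p in zip(ks, cluster_params)) for j in range(dim)]
```

With equal test accuracies this reduces to the plain average of the previous round; a cluster with lower accuracy pulls the global parameters toward its own, as intended.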
在一些实施例中,还包括:In some embodiments, it also includes:
每当完成预设训练轮次的模型训练后,重新根据边缘计算设备的数据分布相似性对边缘计算设备进行分簇,并再次进行预设训练轮次的模型训练,直到全局模型收敛。Whenever the preset training rounds of model training are completed, the edge computing devices are clustered again according to the data distribution similarity of the edge computing devices, and the preset training rounds of model training are performed again until the global model converges.
在完成h个训练轮次的模型训练后,重新根据数据分布相似性对边缘计算设备划分数据同性簇,并在重新划分后再次进行模型训练(包括本地模型训练、簇内模型参数聚合以及全局聚合)。重复上述划分数据同性簇以及划分后进行模型训练的步骤,直到全局模型收敛。After completing h training rounds of model training, the edge computing devices are re-partitioned into data homogeneity clusters according to data distribution similarity, and model training (including local model training, in-cluster model parameter aggregation, and global aggregation) is performed again after the re-partitioning. These steps of partitioning the data homogeneity clusters and then training are repeated until the global model converges.
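This schedule can be sketched as a driver loop in Python; `recluster`, `train_round`, and `converged` are hypothetical callbacks standing in for the clustering, training, and convergence-check steps described above:

```python
def federated_training(devices, recluster, train_round, converged, h):
    """Train in rounds; every h rounds, re-cluster the devices by data
    distribution similarity; stop once the global model converges."""
    clusters = recluster(devices)
    model = None
    rounds = 0
    while not converged(model):
        model = train_round(clusters, model)  # local training + both aggregation layers
        rounds += 1
        if rounds % h == 0:
            clusters = recluster(devices)     # periodic re-clustering
    return model
```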
综上所述,本发明所提供的数据异构条件下的联邦学习方法,根据数据分布相似性对边缘计算设备分簇,簇内边缘计算设备具有相似的数据分布,由于相似的数据分布可以让模型更好的捕捉到数据的特征,因此簇内模型参数聚合可以提高每个簇的模型训练准确性,有效解决数据异构问题。并且,簇内的边缘计算设备根据簇内树形聚合网络进行模型参数聚合,子节点只向相应的父节点发送模型参数,即下层的边缘计算设备只向上一层中相应的边缘计算设备发送模型参数,而不向其他边缘计算设备发送模型参数,这样可以极大降低通信开销。同时,将边缘计算设备分簇后,若某个簇中的边缘计算设备发生故障或离线,其他簇仍可以正常进行本地模型训练并进行模型参数聚合,从而可以提升联邦训练系统整体的容错性与可扩展性。In summary, the federated learning method under data heterogeneity provided by the present invention clusters edge computing devices according to the similarity of data distribution. The edge computing devices in the cluster have similar data distribution. Since similar data distribution allows the model to better capture the characteristics of the data, the aggregation of model parameters in the cluster can improve the accuracy of model training in each cluster and effectively solve the problem of data heterogeneity. In addition, the edge computing devices in the cluster aggregate model parameters according to the tree-shaped aggregation network in the cluster. The child nodes only send model parameters to the corresponding parent nodes, that is, the edge computing devices in the lower layer only send model parameters to the corresponding edge computing devices in the upper layer, and do not send model parameters to other edge computing devices. This can greatly reduce communication overhead. At the same time, after the edge computing devices are clustered, if the edge computing device in a cluster fails or goes offline, other clusters can still perform local model training and model parameter aggregation normally, thereby improving the overall fault tolerance and scalability of the federated training system.
本发明还提供了一种数据异构条件下的联邦学习装置,下文描述的该装置可以与上文描述的方法相互对应参照。请参考图5,图5为本发明实施例所提供的一种数据异构条件下的联邦学习装置的示意图,结合图5所示,该装置包括:The present invention also provides a federated learning device under data heterogeneity conditions, and the device described below can correspond to the method described above. Please refer to Figure 5, which is a schematic diagram of a federated learning device under data heterogeneity conditions provided by an embodiment of the present invention. As shown in Figure 5, the device includes:
划分单元11,用于根据边缘计算设备的数据分布相似性,将所述边缘计算设备划分为若干个数据同性簇;A division unit 11 is used to divide the edge computing devices into a plurality of data homogeneity clusters according to the data distribution similarity of the edge computing devices;
选择单元12,用于从所述数据同性簇中选择边缘计算设备作为所述数据同性簇的簇头;A selection unit 12, configured to select an edge computing device from the data homogeneity cluster as a cluster head of the data homogeneity cluster;
接收单元13,用于接收所述簇头上传的所在数据同性簇的簇内模型参数聚合结果;所述簇内模型参数聚合结果为所述数据同性簇内的各边缘计算设备根据簇内树形聚合网络聚合本地模型参数得到的第一层模型参数聚合结果,所述数据同性簇内的各边缘计算设备根据通信最优策略构建得到所述簇内树形聚合网络,所述簇头为所述簇内树形聚合网络的根节点,所述簇内树形聚合网络中子节点向相应的父节点发送模型参数,与父节点的本地模型参数进行模型参数聚合;The receiving unit 13 is used to receive the intra-cluster model parameter aggregation result of the data homogeneity cluster uploaded by the cluster head; the intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by each edge computing device in the data homogeneity cluster aggregating local model parameters according to the intra-cluster tree aggregation network, each edge computing device in the data homogeneity cluster constructs the intra-cluster tree aggregation network according to the optimal communication strategy, the cluster head is the root node of the intra-cluster tree aggregation network, and the child nodes in the intra-cluster tree aggregation network send model parameters to the corresponding parent nodes, and perform model parameter aggregation with the local model parameters of the parent nodes;
聚合单元14,用于聚合各所述数据同性簇的簇内模型参数聚合结果,得到第二层模型参数聚合结果,并当以所述第二层模型参数聚合结果作为模型参数的全局模型收敛时,将所述全局模型下发给各所述边缘计算设备。The aggregation unit 14 is used to aggregate the intra-cluster model parameter aggregation results of each of the data homogeneity clusters to obtain the second-layer model parameter aggregation results, and when the global model using the second-layer model parameter aggregation results as the model parameters converges, the global model is sent to each of the edge computing devices.
在一些实施例中,选择单元12具体用于:In some embodiments, the selection unit 12 is specifically configured to:
选择所述数据同性簇中与簇内其他边缘计算设备的距离最近或与簇内其他边缘计算设备的通信速率最大的边缘计算设备作为所述数据同性簇的簇头。An edge computing device in the data homogeneity cluster that is closest to other edge computing devices in the cluster or has the highest communication rate with other edge computing devices in the cluster is selected as the cluster head of the data homogeneity cluster.
在一些实施例中,所述簇头从簇内其他边缘计算设备中选择预设个数的边缘计算设备,作为所述簇内树形聚合网络的第1层子节点,第1层子节点为与所述簇头通信时数据发送速率最大的边缘计算设备;第i层的子节点从簇内剩余边缘计算设备中选择预设个数的边缘计算设备,作为所述簇内树形聚合网络的第i+1层子节点;第i+1层子节点为与第i层子节点通信时数据发送速率最大的边缘计算设备;i从1取值,直到构建得到所述簇内树形聚合网络。In some embodiments, the cluster head selects a preset number of edge computing devices from other edge computing devices in the cluster as the first-layer child nodes of the tree-shaped aggregation network within the cluster, and the first-layer child nodes are edge computing devices with the highest data transmission rate when communicating with the cluster head; the i-th layer child nodes select a preset number of edge computing devices from the remaining edge computing devices in the cluster as the i+1-th layer child nodes of the tree-shaped aggregation network within the cluster; the i+1-th layer child nodes are edge computing devices with the highest data transmission rate when communicating with the i-th layer child nodes; i takes values from 1 until the tree-shaped aggregation network within the cluster is constructed.
在一些实施例中,聚合单元14包括:In some embodiments, the polymerization unit 14 includes:
确定子单元,用于确定当前训练轮次各所述数据同性簇的权重系数;A determination subunit, used to determine the weight coefficient of each data homogeneity cluster in the current training round;
聚合子单元,用于根据各所述数据同性簇的权重系数,对各所述数据同性簇的簇内模型参数进行加权聚合,得到当前训练轮次的所述第二层模型参数聚合结果。The aggregation subunit is used to perform weighted aggregation on the intra-cluster model parameters of each data homogeneity cluster according to the weight coefficient of each data homogeneity cluster to obtain the second-layer model parameter aggregation result of the current training round.
在一些实施例中,确定子单元具体用于:In some embodiments, the determining subunit is specifically configured to:
统计所述数据同性簇的各边缘计算设备的本地数据测试精度,得到所述数据同性簇的本地数据测试精度;边缘计算设备的本地数据测试精度为边缘计算设备使用本地数据对上一训练轮次得到的全局模型进行测试的数据测试精度;The local data test accuracy of each edge computing device of the data homogeneity cluster is counted to obtain the local data test accuracy of the data homogeneity cluster; the local data test accuracy of the edge computing device is the data test accuracy of the edge computing device using local data to test the global model obtained in the previous training round;
根据所述数据同性簇的本地数据测试精度确定所述数据同性簇的权重系数;其中,所述数据同性簇的权重系数与所述数据同性簇的本地数据测试精度呈负相关。The weight coefficient of the data homogeneity cluster is determined according to the local data test accuracy of the data homogeneity cluster; the weight coefficient of the data homogeneity cluster is negatively correlated with its local data test accuracy.
在一些实施例中,确定子单元具体用于:In some embodiments, the determining subunit is specifically configured to:
根据$K_c = e^{-\mu}$得到所述数据同性簇的权重系数;$K_c$表示第c个数据同性簇的权重系数,$\mu$表示第c个数据同性簇的本地数据测试精度。The weight coefficient of the data homogeneity cluster is obtained from $K_c = e^{-\mu}$; $K_c$ denotes the weight coefficient of the c-th data homogeneity cluster, and $\mu$ denotes the local data test accuracy of the c-th data homogeneity cluster.
在一些实施例中,一个训练轮次包括本地模型训练、第一层模型参数聚合与第二层模型参数聚合三个阶段;数据同性簇内的每个边缘计算设备完成第一预设次数的模型参数更新后,进行一次第一层模型参数聚合,每个数据同性簇完成第二预设次数的第一层模型参数聚合后,进行一次第二层模型参数聚合。In some embodiments, a training round includes three stages: local model training, first-layer model parameter aggregation, and second-layer model parameter aggregation; after each edge computing device in the data homogeneity cluster completes a first preset number of model parameter updates, a first-layer model parameter aggregation is performed; after each data homogeneity cluster completes a second preset number of first-layer model parameter aggregations, a second-layer model parameter aggregation is performed.
在一些实施例中,还包括:In some embodiments, it also includes:
重复单元,用于每当完成预设训练轮次的模型训练后,重新根据边缘计算设备的数据分布相似性对边缘计算设备进行分簇,并再次进行预设训练轮次的模型训练,直到全局模型收敛。The repetition unit is used to re-cluster the edge computing devices according to the data distribution similarity of the edge computing devices after completing the model training of the preset training rounds, and perform the model training of the preset training rounds again until the global model converges.
在一些实施例中,划分单元11包括:In some embodiments, the dividing unit 11 includes:
构建子单元,用于根据各所述边缘计算设备的数据分布相似性,构建带权无向图;所述带权无向图中两个边缘计算设备之间的连接边的值为两个边缘计算设备的数据分布相似性的值;A construction subunit is used to construct a weighted undirected graph according to the data distribution similarity of each edge computing device; the value of the connecting edge between two edge computing devices in the weighted undirected graph is the value of the data distribution similarity of the two edge computing devices;
划分子单元,用于根据所述带权无向图,将所述边缘计算设备划分为若干个数据同性簇。The partitioning subunit is used to partition the edge computing device into a plurality of data homogeneous clusters according to the weighted undirected graph.
在一些实施例中,构建子单元包括:In some embodiments, the building block includes:
搜索子单元,用于从公网搜索公共数据,并基于所述公共数据构建测试数据集;A search subunit, used to search for public data from the public network and construct a test data set based on the public data;
发送子单元,用于将所述测试数据集下发给各所述边缘计算设备,以使各所述边缘计算设备使用本地模型对所述测试数据集进行推理;A sending subunit, configured to send the test data set to each edge computing device, so that each edge computing device uses a local model to infer the test data set;
计算子单元,用于接收各所述边缘计算设备上传的推理结果,并计算各所述推理结果的相似度;A computing subunit, configured to receive the inference results uploaded by each edge computing device and calculate the similarity of each inference result;
带权无向图构建子单元,用于根据所述推理结果的相似度构建所述带权无向图。The weighted undirected graph construction subunit is used to construct the weighted undirected graph according to the similarity of the reasoning results.
在一些实施例中,带权无向图构建子单元包括:In some embodiments, the weighted undirected graph construction subunit includes:
比较子单元,用于比较各边缘计算设备的推理结果的相似度与预设阈值的大小;A comparison subunit, used to compare the similarity of the inference results of each edge computing device with a preset threshold;
建立子单元,用于若两个边缘计算设备的推理结果的相似度大于预设阈值,则建立两个边缘计算设备之间的连接关系,两个边缘计算设备的推理结果的相似度的值作为两个所述边缘计算设备之间的连接边的值。A subunit is established to establish a connection relationship between the two edge computing devices if the similarity of the inference results of the two edge computing devices is greater than a preset threshold, and the value of the similarity of the inference results of the two edge computing devices is used as the value of the connection edge between the two edge computing devices.
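The thresholded graph construction performed by these subunits can be sketched as follows (Python; the pairwise similarity values below are illustrative assumptions):

```python
def build_similarity_graph(similarities, threshold):
    """similarities: {frozenset({a, b}): similarity of the two devices'
    inference results}.  An undirected edge is kept only when the similarity
    exceeds the preset threshold, and the similarity value becomes the
    weight of the connecting edge."""
    return {pair: s for pair, s in similarities.items() if s > threshold}
```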
在一些实施例中,划分子单元包括:In some embodiments, dividing the subunits includes:
初始化子单元,用于初始化所述带权无向图中各边缘计算设备的标签;An initialization subunit, used to initialize the label of each edge computing device in the weighted undirected graph;
迭代更新子单元,用于迭代更新各边缘计算设备的标签;其中,迭代更新各边缘计算设备的标签包括:将值大于设定阈值的连接边所连接的边缘计算设备的标签设置为相同的标签;统计目标边缘计算设备的邻居边缘计算设备的标签出现的次数,并选择邻居边缘计算设备中出现次数最多的标签作为所述目标边缘计算设备的标签;所述目标边缘计算设备为所连接的各连接边的值均不大于设定阈值的边缘计算设备;An iterative update subunit, configured to iteratively update the labels of each edge computing device; wherein iteratively updating the labels of each edge computing device includes: setting the labels of edge computing devices connected to connection edges whose values are greater than a set threshold to the same label; counting the number of times the labels of neighboring edge computing devices of the target edge computing device appear, and selecting the label with the largest number of appearances among the neighboring edge computing devices as the label of the target edge computing device; the target edge computing device is an edge computing device to which the values of each connection edge connected are not greater than the set threshold;
判断子单元,用于判断是否满足迭代更新停止条件;A judgment subunit, used to judge whether the iterative update stop condition is met;
分簇子单元,用于若满足迭代更新停止条件,则将具有相同标签的所述边缘计算设备划分到同一个数据同性簇。The clustering subunit is used to divide the edge computing devices with the same label into the same data homogeneity cluster if the iterative update stop condition is met.
在一些实施例中,判断子单元具体用于:In some embodiments, the determination subunit is specifically used for:
在每次完成迭代更新后,计算本次迭代更新后各边缘计算设备的标签与上一次迭代更新后各边缘计算设备的标签的变化量;After each iteration, the change between the label of each edge computing device after the current iteration and the label of each edge computing device after the previous iteration is calculated.
比较所述变化量与预设值的大小;Comparing the change amount with a preset value;
若所述变化量小于所述预设值,则满足迭代更新停止条件。If the change amount is less than the preset value, the iterative update stop condition is met.
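Taken together, the initialization, label-update, and stop-condition steps above amount to a threshold-guided label propagation. A hedged sketch in Python: the edge list, the threshold, and the stop rule "fewer than `min_changes` labels changed" model the description directly, while everything else (node IDs, iteration cap) is illustrative.

```python
from collections import Counter

def label_propagation(nodes, edges, threshold, min_changes=1, max_iters=100):
    """edges: (a, b, weight) triples of the weighted undirected graph."""
    labels = {n: n for n in nodes}                       # unique initial labels
    strong = {n for a, b, w in edges if w > threshold for n in (a, b)}
    for _ in range(max_iters):
        old = dict(labels)
        for a, b, w in edges:
            if w > threshold:                            # strong edge: endpoints share one label
                labels[b] = labels[a]
        for n in nodes:
            if n in strong:
                continue                                 # already forced by a strong edge
            neigh = [labels[b if a == n else a]
                     for a, b, w in edges if n in (a, b)]
            if neigh:                                    # adopt the most frequent neighbor label
                labels[n] = Counter(neigh).most_common(1)[0][0]
        if sum(labels[n] != old[n] for n in nodes) < min_changes:
            break                                        # label changes below the preset value
    clusters = {}
    for n, lab in labels.items():                        # same final label -> same cluster
        clusters.setdefault(lab, set()).add(n)
    return list(clusters.values())
```

Devices joined by strong edges collapse onto one label immediately, while the remaining devices follow the majority of their neighbors, so the clusters stabilize in a few iterations on small graphs.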
本发明还提供了一种数据异构条件下的联邦学习系统,包括:边缘计算设备与边缘云服务器。The present invention also provides a federated learning system under data heterogeneity conditions, including: an edge computing device and an edge cloud server.
边缘计算设备,用于使用本地数据训练模型,得到本地模型参数;Edge computing devices are used to train models using local data and obtain local model parameters;
边缘云服务器,用于根据边缘计算设备的数据分布相似性,将所述边缘计算设备划分为若干个数据同性簇;从所述数据同性簇中选择边缘计算设备作为所述数据同性簇的簇头;接收所述簇头上传的所在数据同性簇的簇内模型参数聚合结果;所述簇内模型参数聚合结果为所述数据同性簇内的各边缘计算设备根据簇内树形聚合网络聚合本地模型参数得到的第一层模型参数聚合结果,所述数据同性簇内的各边缘计算设备根据通信最优策略构建得到所述簇内树形聚合网络,所述簇头为所述簇内树形聚合网络的根节点,所述簇内树形聚合网络中子节点向相应的父节点发送模型参数,与父节点的本地模型参数进行模型参数聚合;聚合各所述数据同性簇的簇内模型参数聚合结果,得到第二层模型参数聚合结果,并当以所述第二层模型参数聚合结果作为模型参数的全局模型收敛时,将所述全局模型下发给各所述边缘计算设备。The edge cloud server is used to divide the edge computing devices into several data homogeneity clusters according to the data distribution similarity of the edge computing devices; select an edge computing device from the data homogeneity cluster as the cluster head of the data homogeneity cluster; receive the cluster model parameter aggregation result of the data homogeneity cluster uploaded by the cluster head; the cluster model parameter aggregation result is the first layer model parameter aggregation result obtained by each edge computing device in the data homogeneity cluster aggregating local model parameters according to the cluster tree aggregation network, each edge computing device in the data homogeneity cluster constructs the cluster tree aggregation network according to the optimal communication strategy, the cluster head is the root node of the cluster tree aggregation network, the child nodes in the cluster tree aggregation network send model parameters to the corresponding parent nodes, and perform model parameter aggregation with the local model parameters of the parent nodes; aggregate the cluster model parameter aggregation results of each data homogeneity cluster to obtain the second layer model parameter aggregation result, and when the global model with the second layer model parameter aggregation result as the model parameter converges, the global model is sent to each edge computing device.
对于本发明所提供的系统的介绍请参照上述方法实施例,本发明在此不做赘述。For an introduction to the system provided by the present invention, please refer to the above method embodiment, and the present invention will not be elaborated here.
请参考图6,图6为本发明实施例所提供的一种数据异构条件下的图像处理方法的流程示意图,参考图6所示,该方法包括:Please refer to FIG. 6 , which is a flow chart of an image processing method under data heterogeneity conditions provided by an embodiment of the present invention. Referring to FIG. 6 , the method includes:
S201:根据边缘计算设备的数据分布相似性,将所述边缘计算设备划分为若干个数据同性簇;S201: Divide the edge computing devices into a plurality of data homogeneity clusters according to data distribution similarity of the edge computing devices;
S202:从所述数据同性簇中选择边缘计算设备作为所述数据同性簇的簇头;S202: Selecting an edge computing device from the data homogeneity cluster as a cluster head of the data homogeneity cluster;
S203:接收所述簇头上传的所在数据同性簇的簇内模型参数聚合结果;所述簇内模型参数聚合结果为所述数据同性簇内的各边缘计算设备根据簇内树形聚合网络聚合本地模型参数得到的第一层模型参数聚合结果,所述数据同性簇内的各边缘计算设备根据通信最优策略构建得到所述簇内树形聚合网络,所述簇头为所述簇内树形聚合网络的根节点,所述簇内树形聚合网络中子节点向相应的父节点发送模型参数,与父节点的本地模型参数进行模型参数聚合;S203: Receive the cluster model parameter aggregation result of the data homogeneity cluster uploaded by the cluster head; the cluster model parameter aggregation result is the first layer model parameter aggregation result obtained by each edge computing device in the data homogeneity cluster aggregating local model parameters according to the cluster tree aggregation network, each edge computing device in the data homogeneity cluster constructs the cluster tree aggregation network according to the optimal communication strategy, the cluster head is the root node of the cluster tree aggregation network, and the child nodes in the cluster tree aggregation network send model parameters to the corresponding parent nodes, and perform model parameter aggregation with the local model parameters of the parent nodes;
S204:聚合各所述数据同性簇的簇内模型参数聚合结果,得到第二层模型参数聚合结果,并当以所述第二层模型参数聚合结果作为模型参数的图像处理模型收敛时,将所述图像处理模型下发给各所述边缘计算设备,以使所述边缘计算设备使用所述图像处理模型处理图像。S204: Aggregate the intra-cluster model parameter aggregation results of each of the data homogeneity clusters to obtain a second-layer model parameter aggregation result, and when the image processing model using the second-layer model parameter aggregation result as the model parameter converges, send the image processing model to each of the edge computing devices, so that the edge computing device uses the image processing model to process the image.
在一些实施例中,从所述数据同性簇中选择边缘计算设备作为所述数据同性簇的簇头包括:In some embodiments, selecting an edge computing device from the data homogeneity cluster as a cluster head of the data homogeneity cluster includes:
选择所述数据同性簇中与簇内其他边缘计算设备的距离最近或与簇内其他边缘计算设备的通信速率最大的边缘计算设备作为所述数据同性簇的簇头。An edge computing device in the data homogeneity cluster that is closest to other edge computing devices in the cluster or has the highest communication rate with other edge computing devices in the cluster is selected as the cluster head of the data homogeneity cluster.
在一些实施例中,所述簇头从簇内其他边缘计算设备中选择预设个数的边缘计算设备,作为所述簇内树形聚合网络的第1层子节点,第1层子节点为与所述簇头通信时数据发送速率最大的边缘计算设备;第i层的子节点从簇内剩余边缘计算设备中选择预设个数的边缘计算设备,作为所述簇内树形聚合网络的第i+1层子节点;第i+1层子节点为与第i层子节点通信时数据发送速率最大的边缘计算设备;i从1取值,直到构建得到所述簇内树形聚合网络。In some embodiments, the cluster head selects a preset number of edge computing devices from other edge computing devices in the cluster as the first-layer child nodes of the tree-shaped aggregation network within the cluster, and the first-layer child nodes are edge computing devices with the highest data transmission rate when communicating with the cluster head; the i-th layer child nodes select a preset number of edge computing devices from the remaining edge computing devices in the cluster as the i+1-th layer child nodes of the tree-shaped aggregation network within the cluster; the i+1-th layer child nodes are edge computing devices with the highest data transmission rate when communicating with the i-th layer child nodes; i takes values from 1 until the tree-shaped aggregation network within the cluster is constructed.
在一些实施例中,聚合各所述数据同性簇的簇内模型参数聚合结果,得到第二层模型参数聚合结果包括:In some embodiments, aggregating the intra-cluster model parameter aggregation results of each of the data homogeneity clusters to obtain the second-layer model parameter aggregation results includes:
确定当前训练轮次各所述数据同性簇的权重系数;Determine the weight coefficient of each data homogeneity cluster in the current training round;
根据各所述数据同性簇的权重系数,对各所述数据同性簇的簇内模型参数进行加权聚合,得到当前训练轮次的所述第二层模型参数聚合结果。According to the weight coefficients of the data homogeneity clusters, the in-cluster model parameters of the data homogeneity clusters are weightedly aggregated to obtain the aggregation result of the second-layer model parameters of the current training round.
在一些实施例中,确定当前训练轮次各所述数据同性簇的权重系数包括:In some embodiments, determining the weight coefficient of each data homogeneity cluster in the current training round includes:
统计所述数据同性簇的各边缘计算设备的本地数据测试精度,得到所述数据同性簇的本地数据测试精度;边缘计算设备的本地数据测试精度为边缘计算设备使用本地数据对上一训练轮次得到的全局模型进行测试的数据测试精度;The local data test accuracy of each edge computing device of the data homogeneity cluster is counted to obtain the local data test accuracy of the data homogeneity cluster; the local data test accuracy of the edge computing device is the data test accuracy of the edge computing device using the local data to test the global model obtained in the previous training round;
根据所述数据同性簇的本地数据测试精度确定所述数据同性簇的权重系数；其中，所述数据同性簇的权重系数与所述数据同性簇的本地数据测试精度呈负相关。The weight coefficient of the data homogeneity cluster is determined according to the local data test accuracy of the data homogeneity cluster, where the weight coefficient of the data homogeneity cluster is negatively correlated with its local data test accuracy.
在一些实施例中,根据所述数据同性簇的本地数据测试精度确定所述数据同性簇的权重系数包括:In some embodiments, determining the weight coefficient of the data homogeneity cluster according to the local data test accuracy of the data homogeneity cluster includes:
根据预设公式得到所述数据同性簇的权重系数；K c 表示第c个数据同性簇的权重系数，μ表示第c个数据同性簇的本地数据测试精度。The weight coefficient of the data homogeneity cluster is obtained according to a preset formula; K c represents the weight coefficient of the c-th data homogeneity cluster, and μ represents the local data test accuracy of the c-th data homogeneity cluster.
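The specific formula for K_c is not reproduced in this text; the sketch below therefore uses an assumed inverse-accuracy weighting (K_c proportional to 1/μ_c, normalized to sum to 1), chosen only because it satisfies the stated negative correlation between a cluster's weight and its local test accuracy.

```python
def cluster_weights(accuracies):
    """Illustrative weight coefficients, one per data-homogeneity cluster.

    Assumption: K_c ∝ 1/mu_c, normalized. This is NOT the patent's
    formula (which is not reproduced in the text); it merely satisfies
    the stated negative correlation with local test accuracy mu_c.
    """
    inv = [1.0 / mu for mu in accuracies]
    total = sum(inv)
    return [w / total for w in inv]

def aggregate_second_layer(cluster_params, accuracies):
    """Weighted aggregation of per-cluster parameter vectors (lists)."""
    weights = cluster_weights(accuracies)
    dim = len(cluster_params[0])
    return [sum(w * p[i] for w, p in zip(weights, cluster_params))
            for i in range(dim)]
```

Under this assumption, a cluster whose previous-round global model already tests well on local data contributes less to the next aggregation, pulling the global model toward clusters it currently serves poorly.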
在一些实施例中,一个训练轮次包括本地模型训练、第一层模型参数聚合与第二层模型参数聚合三个阶段;数据同性簇内的每个边缘计算设备完成第一预设次数的模型参数更新后,进行一次第一层模型参数聚合,每个数据同性簇完成第二预设次数的第一层模型参数聚合后,进行一次第二层模型参数聚合。In some embodiments, a training round includes three stages: local model training, first-layer model parameter aggregation, and second-layer model parameter aggregation; after each edge computing device in the data homogeneity cluster completes a first preset number of model parameter updates, a first-layer model parameter aggregation is performed; after each data homogeneity cluster completes a second preset number of first-layer model parameter aggregations, a second-layer model parameter aggregation is performed.
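The cadence of one training round (local updates → first-layer aggregation → second-layer aggregation) can be sketched as nested loops. The event names and the flat list representation here are illustrative only.

```python
def training_round(clusters, local_updates, first_layer_aggs):
    """Sketch of one training round's schedule.

    clusters: list of clusters, each a list of device ids.
    local_updates: first preset number of per-device parameter updates
        before each first-layer aggregation.
    first_layer_aggs: second preset number of first-layer aggregations
        before the single second-layer aggregation.
    Returns a log of (event, payload) tuples, for illustration.
    """
    log = []
    for _ in range(first_layer_aggs):
        for cluster in clusters:
            for device in cluster:
                for _ in range(local_updates):
                    log.append(("local_update", device))
            log.append(("first_layer_agg", tuple(cluster)))
    log.append(("second_layer_agg", None))
    return log
```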
在一些实施例中，还包括：In some embodiments, the method further includes:
每当完成预设训练轮次的模型训练后,重新根据边缘计算设备的数据分布相似性对边缘计算设备进行分簇,并再次进行预设训练轮次的模型训练,直到全局模型收敛。Whenever the preset training rounds of model training are completed, the edge computing devices are clustered again according to the data distribution similarity of the edge computing devices, and the preset training rounds of model training are performed again until the global model converges.
在一些实施例中,根据边缘计算设备的数据分布相似性,将所述边缘计算设备划分为若干个数据同性簇包括:In some embodiments, dividing the edge computing devices into a plurality of data homogeneity clusters according to data distribution similarity of the edge computing devices includes:
根据各所述边缘计算设备的数据分布相似性,构建带权无向图;所述带权无向图中两个边缘计算设备之间的连接边的值为两个边缘计算设备的数据分布相似性的值;According to the data distribution similarity of each edge computing device, a weighted undirected graph is constructed; the value of the connecting edge between two edge computing devices in the weighted undirected graph is the value of the data distribution similarity of the two edge computing devices;
根据所述带权无向图,将所述边缘计算设备划分为若干个数据同性簇。According to the weighted undirected graph, the edge computing device is divided into a plurality of data homogeneity clusters.
在一些实施例中,根据各所述边缘计算设备的数据分布相似性,构建带权无向图包括:In some embodiments, constructing a weighted undirected graph according to the data distribution similarity of each edge computing device includes:
从公网搜索公共数据,并基于所述公共数据构建测试数据集;Searching for public data from the public network, and building a test data set based on the public data;
将所述测试数据集下发给各所述边缘计算设备,以使各所述边缘计算设备使用本地模型对所述测试数据集进行推理;Sending the test data set to each edge computing device so that each edge computing device uses a local model to infer the test data set;
接收各所述边缘计算设备上传的推理结果,并计算各所述推理结果的相似度;Receiving the inference results uploaded by each edge computing device, and calculating the similarity of each inference result;
根据所述推理结果的相似度构建所述带权无向图。The weighted undirected graph is constructed according to the similarity of the inference results.
在一些实施例中,所述根据所述推理结果的相似度构建所述带权无向图包括:In some embodiments, constructing the weighted undirected graph according to the similarity of the inference results includes:
比较各边缘计算设备的推理结果的相似度与预设阈值的大小;Compare the similarity of the inference results of each edge computing device with the preset threshold;
若两个边缘计算设备的推理结果的相似度大于预设阈值,则建立两个边缘计算设备之间的连接关系,两个边缘计算设备的推理结果的相似度的值作为两个所述边缘计算设备之间的连接边的值。If the similarity of the inference results of the two edge computing devices is greater than a preset threshold, a connection relationship between the two edge computing devices is established, and the value of the similarity of the inference results of the two edge computing devices is used as the value of the connection edge between the two edge computing devices.
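The graph construction in the preceding steps can be sketched as below, assuming a precomputed symmetric similarity matrix over the devices' inference results (how similarity is computed is left open in the text).

```python
def build_weighted_graph(similarity, threshold):
    """Sketch of building the weighted undirected graph.

    similarity[i][j]: assumed precomputed similarity of devices i and
    j's inference results on the shared test set (symmetric). An edge
    (i, j) is created only when the similarity exceeds `threshold`;
    its weight is the similarity value itself.
    """
    n = len(similarity)
    edges = {}
    for i in range(n):
        for j in range(i + 1, n):      # undirected: store i < j once
            if similarity[i][j] > threshold:
                edges[(i, j)] = similarity[i][j]
    return edges
```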
在一些实施例中,根据所述带权无向图,将所述边缘计算设备划分为若干个数据同性簇包括:In some embodiments, according to the weighted undirected graph, dividing the edge computing device into a plurality of data homogeneity clusters includes:
初始化所述带权无向图中各边缘计算设备的标签;Initialize the label of each edge computing device in the weighted undirected graph;
迭代更新各边缘计算设备的标签;其中,迭代更新各边缘计算设备的标签包括:将值大于设定阈值的连接边所连接的边缘计算设备的标签设置为相同的标签;统计目标边缘计算设备的邻居边缘计算设备的标签出现的次数,并选择邻居边缘计算设备中出现次数最多的标签作为所述目标边缘计算设备的标签;所述目标边缘计算设备为所连接的各连接边的值均不大于设定阈值的边缘计算设备;Iteratively updating the label of each edge computing device; wherein iteratively updating the label of each edge computing device includes: setting the label of the edge computing device connected to the connection edge whose value is greater than the set threshold to the same label; counting the number of times the label of the neighbor edge computing device of the target edge computing device appears, and selecting the label with the largest number of appearances among the neighbor edge computing devices as the label of the target edge computing device; the target edge computing device is an edge computing device to which the values of each connection edge connected are not greater than the set threshold;
判断是否满足迭代更新停止条件;Determine whether the iterative update stop condition is met;
若满足迭代更新停止条件,则将具有相同标签的所述边缘计算设备划分到同一个数据同性簇。If the iterative update stop condition is met, the edge computing devices with the same label are divided into the same data homogeneity cluster.
在一些实施例中,判断是否满足迭代更新停止条件包括:In some embodiments, determining whether the iterative update stop condition is satisfied includes:
在每次完成迭代更新后,计算本次迭代更新后各边缘计算设备的标签与上一次迭代更新后各边缘计算设备的标签的变化量;After each iteration, the change between the label of each edge computing device after the current iteration and the label of each edge computing device after the previous iteration is calculated.
比较所述变化量与预设值的大小;Comparing the change amount with a preset value;
若所述变化量小于所述预设值,则满足迭代更新停止条件。If the change amount is less than the preset value, the iterative update stop condition is met.
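The label-propagation clustering described in the steps above can be sketched as follows. The iteration cap `max_iters` and the tie-breaking rule (taking the smaller label on strong edges, and `max` over vote counts) are assumptions added to make the sketch deterministic and runnable; they are not specified in the text.

```python
def label_propagation(edges, n, strong_threshold, max_change=0, max_iters=50):
    """Sketch of label-propagation clustering on the weighted graph.

    edges[(i, j)] = w with i < j is the weighted undirected graph;
    labels are initialized to distinct values. Devices joined by an
    edge heavier than `strong_threshold` are set to the same label;
    every other connected device (a "target" device) adopts the most
    frequent label among its neighbours. Iteration stops when the
    number of label changes is at most `max_change` (the preset value).
    """
    labels = list(range(n))                       # initialize labels
    neighbours = {i: [] for i in range(n)}
    for (i, j) in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    for _ in range(max_iters):
        new = labels[:]
        for (i, j), w in edges.items():
            if w > strong_threshold:              # strong edge: same label
                m = min(new[i], new[j])
                new[i] = new[j] = m
        for i in range(n):
            strong = [j for j in neighbours[i]
                      if edges.get((min(i, j), max(i, j)), 0) > strong_threshold]
            if not strong and neighbours[i]:      # target device: vote
                counts = {}
                for j in neighbours[i]:
                    counts[new[j]] = counts.get(new[j], 0) + 1
                new[i] = max(counts, key=counts.get)
        changed = sum(a != b for a, b in zip(labels, new))
        labels = new
        if changed <= max_change:                 # stop condition met
            break
    return labels                                 # equal label = same cluster
```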
综上所述，本发明所提供的数据异构条件下的图像处理方法，根据数据分布相似性对边缘计算设备分簇，簇内边缘计算设备具有相似的数据分布，由于相似的数据分布可以让模型更好的捕捉到数据的特征，因此簇内模型参数聚合可以提高每个簇的模型训练准确性，有效解决数据异构问题。并且，簇内的边缘计算设备根据簇内树形聚合网络进行模型参数聚合，子节点只向相应的父节点发送模型参数，即下层的边缘计算设备只向上一层中相应的边缘计算设备发送模型参数，而不向其他边缘计算设备发送模型参数，这样可以极大降低通信开销。同时，将边缘计算设备分簇后，若某个簇中的边缘计算设备发生故障或离线，其他簇仍可以正常进行本地模型训练并进行模型参数聚合，从而可以提升联邦训练系统整体的容错性与可扩展性。边缘计算设备与边缘云服务器在联邦学习过程中进行两层模型参数聚合，得到准确可靠的图像处理模型，最终边缘计算设备使用该准确可靠的图像处理模型进行图像处理，可以提升图像处理的准确性与可靠性。In summary, the image processing method under data heterogeneity conditions provided by the present invention clusters edge computing devices according to data distribution similarity, so that devices within a cluster have similar data distributions. Since similar data distributions allow a model to better capture the characteristics of the data, intra-cluster model parameter aggregation improves the training accuracy of each cluster's model and effectively addresses the data heterogeneity problem. Moreover, the edge computing devices within a cluster aggregate model parameters according to the intra-cluster tree-shaped aggregation network: child nodes send model parameters only to their corresponding parent nodes, that is, devices in a lower layer send model parameters only to the corresponding devices in the layer above and to no other devices, which greatly reduces communication overhead. In addition, after the edge computing devices are clustered, if a device in one cluster fails or goes offline, the other clusters can still perform local model training and model parameter aggregation normally, improving the overall fault tolerance and scalability of the federated training system. The edge computing devices and the edge cloud server perform two-layer model parameter aggregation during federated learning to obtain an accurate and reliable image processing model, which the edge computing devices finally use for image processing, improving the accuracy and reliability of image processing.
本发明还提供了一种数据异构条件下的图像处理装置,下文描述的该装置可以与上文描述的方法相互对应参照。请参考图7,图7为本发明实施例所提供的一种数据异构条件下的图像处理装置的示意图,结合图7所示,该装置包括:The present invention also provides an image processing device under data heterogeneity conditions. The device described below can correspond to the method described above. Please refer to Figure 7, which is a schematic diagram of an image processing device under data heterogeneity conditions provided by an embodiment of the present invention. As shown in Figure 7, the device includes:
划分模块21,用于根据边缘计算设备的数据分布相似性,将所述边缘计算设备划分为若干个数据同性簇;A division module 21, configured to divide the edge computing devices into a plurality of data homogeneity clusters according to the data distribution similarity of the edge computing devices;
选择模块22,用于从所述数据同性簇中选择边缘计算设备作为所述数据同性簇的簇头;A selection module 22, configured to select an edge computing device from the data homogeneity cluster as a cluster head of the data homogeneity cluster;
接收模块23,用于接收所述簇头上传的所在数据同性簇的簇内模型参数聚合结果;所述簇内模型参数聚合结果为所述数据同性簇内的各边缘计算设备根据簇内树形聚合网络聚合本地模型参数得到的第一层模型参数聚合结果,所述数据同性簇内的各边缘计算设备根据通信最优策略构建得到所述簇内树形聚合网络,所述簇头为所述簇内树形聚合网络的根节点,所述簇内树形聚合网络中子节点向相应的父节点发送模型参数,与父节点的本地模型参数进行模型参数聚合;The receiving module 23 is used to receive the intra-cluster model parameter aggregation result of the data homogeneity cluster uploaded by the cluster head; the intra-cluster model parameter aggregation result is the first-layer model parameter aggregation result obtained by each edge computing device in the data homogeneity cluster aggregating local model parameters according to the intra-cluster tree aggregation network, each edge computing device in the data homogeneity cluster constructs the intra-cluster tree aggregation network according to the optimal communication strategy, the cluster head is the root node of the intra-cluster tree aggregation network, and the child nodes in the intra-cluster tree aggregation network send model parameters to the corresponding parent nodes, and perform model parameter aggregation with the local model parameters of the parent nodes;
聚合模块24,用于聚合各所述数据同性簇的簇内模型参数聚合结果,得到第二层模型参数聚合结果,并当以所述第二层模型参数聚合结果作为模型参数的图像处理模型收敛时,将所述图像处理模型下发给各所述边缘计算设备,以使所述边缘计算设备使用所述图像处理模型处理图像。The aggregation module 24 is used to aggregate the intra-cluster model parameter aggregation results of each of the data homogeneity clusters to obtain the second-layer model parameter aggregation results, and when the image processing model using the second-layer model parameter aggregation results as model parameters converges, the image processing model is sent to each of the edge computing devices, so that the edge computing device uses the image processing model to process the image.
在一些实施例中,选择模块22具体用于:In some embodiments, the selection module 22 is specifically used to:
选择所述数据同性簇中与簇内其他边缘计算设备的距离最近或与簇内其他边缘计算设备的通信速率最大的边缘计算设备作为所述数据同性簇的簇头。An edge computing device in the data homogeneity cluster that is closest to other edge computing devices in the cluster or has the highest communication rate with other edge computing devices in the cluster is selected as the cluster head of the data homogeneity cluster.
在一些实施例中,所述簇头从簇内其他边缘计算设备中选择预设个数的边缘计算设备,作为所述簇内树形聚合网络的第1层子节点,第1层子节点为与所述簇头通信时数据发送速率最大的边缘计算设备;第i层的子节点从簇内剩余边缘计算设备中选择预设个数的边缘计算设备,作为所述簇内树形聚合网络的第i+1层子节点;第i+1层子节点为与第i层子节点通信时数据发送速率最大的边缘计算设备;i从1取值,直到构建得到所述簇内树形聚合网络。In some embodiments, the cluster head selects a preset number of edge computing devices from other edge computing devices in the cluster as the first-layer child nodes of the tree-shaped aggregation network within the cluster, and the first-layer child nodes are edge computing devices with the highest data transmission rate when communicating with the cluster head; the i-th layer child nodes select a preset number of edge computing devices from the remaining edge computing devices in the cluster as the i+1-th layer child nodes of the tree-shaped aggregation network within the cluster; the i+1-th layer child nodes are edge computing devices with the highest data transmission rate when communicating with the i-th layer child nodes; i takes values from 1 until the tree-shaped aggregation network within the cluster is constructed.
在一些实施例中,聚合模块24包括:In some embodiments, the aggregation module 24 includes:
确定子模块,用于确定当前训练轮次各所述数据同性簇的权重系数;A determination submodule is used to determine the weight coefficients of the data homogeneity clusters in the current training round;
聚合子模块,用于根据各所述数据同性簇的权重系数,对各所述数据同性簇的簇内模型参数进行加权聚合,得到当前训练轮次的所述第二层模型参数聚合结果。The aggregation submodule is used to perform weighted aggregation on the intra-cluster model parameters of each data homogeneity cluster according to the weight coefficient of each data homogeneity cluster to obtain the second-layer model parameter aggregation result of the current training round.
在一些实施例中,确定子模块具体用于:In some embodiments, the determination submodule is specifically configured to:
统计所述数据同性簇的各边缘计算设备的本地数据测试精度,得到所述数据同性簇的本地数据测试精度;边缘计算设备的本地数据测试精度为边缘计算设备使用本地数据对上一训练轮次得到的全局模型进行测试的数据测试精度;The local data test accuracy of each edge computing device of the data homogeneity cluster is counted to obtain the local data test accuracy of the data homogeneity cluster; the local data test accuracy of the edge computing device is the data test accuracy of the edge computing device using local data to test the global model obtained in the previous training round;
根据所述数据同性簇的本地数据测试精度确定所述数据同性簇的权重系数；其中，所述数据同性簇的权重系数与所述数据同性簇的本地数据测试精度呈负相关。The weight coefficient of the data homogeneity cluster is determined according to the local data test accuracy of the data homogeneity cluster, where the weight coefficient of the data homogeneity cluster is negatively correlated with its local data test accuracy.
在一些实施例中,确定子模块具体用于:In some embodiments, the determination submodule is specifically configured to:
根据预设公式得到所述数据同性簇的权重系数；K c 表示第c个数据同性簇的权重系数，μ表示第c个数据同性簇的本地数据测试精度。The weight coefficient of the data homogeneity cluster is obtained according to a preset formula; K c represents the weight coefficient of the c-th data homogeneity cluster, and μ represents the local data test accuracy of the c-th data homogeneity cluster.
在一些实施例中,一个训练轮次包括本地模型训练、第一层模型参数聚合与第二层模型参数聚合三个阶段;数据同性簇内的每个边缘计算设备完成第一预设次数的模型参数更新后,进行一次第一层模型参数聚合,每个数据同性簇完成第二预设次数的第一层模型参数聚合后,进行一次第二层模型参数聚合。In some embodiments, a training round includes three stages: local model training, first-layer model parameter aggregation, and second-layer model parameter aggregation; after each edge computing device in the data homogeneity cluster completes a first preset number of model parameter updates, a first-layer model parameter aggregation is performed; after each data homogeneity cluster completes a second preset number of first-layer model parameter aggregations, a second-layer model parameter aggregation is performed.
在一些实施例中，还包括：In some embodiments, the apparatus further includes:
重复模块,用于每当完成预设训练轮次的模型训练后,重新根据边缘计算设备的数据分布相似性对边缘计算设备进行分簇,并再次进行预设训练轮次的模型训练,直到全局模型收敛。The repetition module is used to re-cluster the edge computing devices according to the data distribution similarity of the edge computing devices after completing the model training of the preset training rounds, and perform the model training of the preset training rounds again until the global model converges.
在一些实施例中,划分模块21包括:In some embodiments, the partitioning module 21 includes:
构建子模块,用于根据各所述边缘计算设备的数据分布相似性,构建带权无向图;所述带权无向图中两个边缘计算设备之间的连接边的值为两个边缘计算设备的数据分布相似性的值;A construction submodule is used to construct a weighted undirected graph according to the data distribution similarity of each edge computing device; the value of the connecting edge between two edge computing devices in the weighted undirected graph is the value of the data distribution similarity of the two edge computing devices;
划分子模块,用于根据所述带权无向图,将所述边缘计算设备划分为若干个数据同性簇。The partitioning submodule is used to partition the edge computing device into a plurality of data homogeneous clusters according to the weighted undirected graph.
在一些实施例中,构建子模块包括:In some embodiments, the building blocks include:
搜索子模块,用于从公网搜索公共数据,并基于所述公共数据构建测试数据集;A search submodule, used to search for public data from the public network and construct a test data set based on the public data;
发送子模块,用于将所述测试数据集下发给各所述边缘计算设备,以使各所述边缘计算设备使用本地模型对所述测试数据集进行推理;A sending submodule, used for sending the test data set to each edge computing device, so that each edge computing device uses a local model to infer the test data set;
计算子模块,用于接收各所述边缘计算设备上传的推理结果,并计算各所述推理结果的相似度;A calculation submodule, used to receive the inference results uploaded by each edge computing device, and calculate the similarity of each inference result;
带权无向图构建子模块,用于根据所述推理结果的相似度构建所述带权无向图。The weighted undirected graph construction submodule is used to construct the weighted undirected graph according to the similarity of the reasoning results.
在一些实施例中,带权无向图构建子模块包括:In some embodiments, the weighted undirected graph construction submodule includes:
比较子模块,用于比较各边缘计算设备的推理结果的相似度与预设阈值的大小;A comparison submodule is used to compare the similarity of the inference results of each edge computing device with a preset threshold;
建立子模块,用于若两个边缘计算设备的推理结果的相似度大于预设阈值,则建立两个边缘计算设备之间的连接关系,两个边缘计算设备的推理结果的相似度的值作为两个所述边缘计算设备之间的连接边的值。A submodule is established to establish a connection relationship between the two edge computing devices if the similarity of the inference results of the two edge computing devices is greater than a preset threshold, and the value of the similarity of the inference results of the two edge computing devices is used as the value of the connection edge between the two edge computing devices.
在一些实施例中,划分子模块包括:In some embodiments, the partitioning submodule includes:
初始化子模块,用于初始化所述带权无向图中各边缘计算设备的标签;An initialization submodule, used to initialize the labels of each edge computing device in the weighted undirected graph;
迭代更新子模块,用于迭代更新各边缘计算设备的标签;其中,迭代更新各边缘计算设备的标签包括:将值大于设定阈值的连接边所连接的边缘计算设备的标签设置为相同的标签;统计目标边缘计算设备的邻居边缘计算设备的标签出现的次数,并选择邻居边缘计算设备中出现次数最多的标签作为所述目标边缘计算设备的标签;所述目标边缘计算设备为所连接的各连接边的值均不大于设定阈值的边缘计算设备;An iterative update submodule, for iteratively updating the labels of each edge computing device; wherein iteratively updating the labels of each edge computing device includes: setting the labels of edge computing devices connected to the connection edges whose values are greater than a set threshold to the same label; counting the number of times the labels of neighboring edge computing devices of the target edge computing device appear, and selecting the label with the largest number of appearances among the neighboring edge computing devices as the label of the target edge computing device; the target edge computing device is an edge computing device to which the values of each connected connection edge are not greater than the set threshold;
判断子模块,用于判断是否满足迭代更新停止条件;A judgment submodule is used to judge whether the iterative update stop condition is met;
分簇子模块,用于若满足迭代更新停止条件,则将具有相同标签的所述边缘计算设备划分到同一个数据同性簇。The clustering submodule is used to divide the edge computing devices with the same label into the same data homogeneity cluster if the iterative update stop condition is met.
在一些实施例中,判断子模块具体用于:In some embodiments, the determination submodule is specifically used for:
在每次完成迭代更新后,计算本次迭代更新后各边缘计算设备的标签与上一次迭代更新后各边缘计算设备的标签的变化量;After each iteration, the change between the label of each edge computing device after the current iteration and the label of each edge computing device after the previous iteration is calculated.
比较所述变化量与预设值的大小;Comparing the change amount with a preset value;
若所述变化量小于所述预设值,则满足迭代更新停止条件。If the change amount is less than the preset value, the iterative update stop condition is met.
本发明还提供了一种设备,该设备包括存储器和处理器。The present invention also provides a device, which includes a memory and a processor.
存储器,用于存储计算机程序;Memory for storing computer programs;
处理器,用于执行计算机程序实现如联邦学习方法的任一实施例的步骤或实现如图像处理方法的任一实施例的步骤。A processor is used to execute a computer program to implement the steps of any embodiment of the federated learning method or the steps of any embodiment of the image processing method.
对于本发明所提供的设备的介绍请参照上述方法实施例,本发明在此不做赘述。For an introduction to the device provided by the present invention, please refer to the above method embodiment, and the present invention will not be elaborated here.
本发明还提供了一种介质,该介质上存储有计算机程序,计算机程序被处理器执行时可实现如联邦学习方法的任一实施例的步骤或实现如图像处理方法的任一实施例的步骤。The present invention also provides a medium having a computer program stored thereon. When the computer program is executed by a processor, the steps of any embodiment of the federated learning method or the steps of any embodiment of the image processing method can be implemented.
该介质可以包括:U盘、移动硬盘、只读存储器(Read-Only Memory ,ROM)、随机存取存储器(Random Access Memory ,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。The medium may include: a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and other media that can store program codes.
对于本发明所提供的介质的介绍请参照上述方法实施例,本发明在此不做赘述。For an introduction to the medium provided by the present invention, please refer to the above method embodiment, and the present invention will not be elaborated here.
说明书中各个实施例采用递进的方式描述,每个实施例重点说明的都是与其他实施例的不同之处,各个实施例之间相同相似部分互相参见即可。对于实施例公开的装置、设备以及介质而言,由于其与实施例公开的方法相对应,所以描述的比较简单,相关之处参见方法部分说明即可。The various embodiments in the specification are described in a progressive manner, and each embodiment focuses on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other. For the devices, equipment and media disclosed in the embodiments, since they correspond to the methods disclosed in the embodiments, the description is relatively simple, and the relevant parts can be referred to the method part description.
专业人员还可以进一步意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各示例的组成及步骤。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本发明的范围。Professionals may further appreciate that the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. In order to clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been generally described in the above description according to function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Professionals and technicians may use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the present invention.
结合本文中所公开的实施例描述的方法或算法的步骤可以直接用硬件、处理器执行的软件模块,或者二者的结合来实施。软件模块可以置于随机存储器(RAM)、内存、只读存储器(ROM)、电可编程ROM、电可擦除可编程ROM、寄存器、硬盘、可移动磁盘、CD-ROM、或技术领域内所公知的任意其它形式的介质中。The steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be implemented directly using hardware, a software module executed by a processor, or a combination of the two. The software module may be placed in a random access memory (RAM), a memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of medium known in the art.
以上对本发明所提供的数据异构条件下的联邦学习方法、图像处理方法、及装置进行了详细介绍。本文中应用了具体个例对本发明的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本发明的方法及其核心思想。应当指出,对于本技术领域的普通技术人员来说,在不脱离本发明原理的前提下,还可以对本发明进行若干改进和修饰,这些改进和修饰也落入本发明的保护范围。The above is a detailed introduction to the federated learning method, image processing method, and device under data heterogeneity conditions provided by the present invention. This article uses specific examples to illustrate the principles and implementation methods of the present invention. The description of the above embodiments is only used to help understand the method of the present invention and its core idea. It should be pointed out that for ordinary technicians in this technical field, without departing from the principles of the present invention, the present invention can also be improved and modified in a number of ways, and these improvements and modifications also fall within the scope of protection of the present invention.
Claims (19)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410230103.0A CN117808127B (en) | 2024-02-29 | 2024-02-29 | Image processing method, federated learning method and device under data heterogeneity conditions |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117808127A CN117808127A (en) | 2024-04-02 |
CN117808127B true CN117808127B (en) | 2024-05-28 |
Family
ID=90430333
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410230103.0A Active CN117808127B (en) | 2024-02-29 | 2024-02-29 | Image processing method, federated learning method and device under data heterogeneity conditions |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117808127B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118898272B (en) * | 2024-09-27 | 2025-01-24 | 浙江讯盟科技有限公司 | AI model training and calling method and system based on edge computing |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103414786A (en) * | 2013-08-28 | 2013-11-27 | 电子科技大学 | Data aggregation method based on minimum spanning tree |
CN103747537A (en) * | 2014-01-15 | 2014-04-23 | 广东交通职业技术学院 | Wireless sensor network outlier data self-adaption detecting method based on entropy measurement |
CN112181971A (en) * | 2020-10-27 | 2021-01-05 | 华侨大学 | Edge-based federated learning model cleaning and equipment clustering method, system, equipment and readable storage medium |
CN113469373A (en) * | 2021-08-17 | 2021-10-01 | 北京神州新桥科技有限公司 | Model training method, system, equipment and storage medium based on federal learning |
CN115293358A (en) * | 2022-06-29 | 2022-11-04 | 中国电子技术标准化研究院 | A clustered federated multi-task learning method and device for the Internet of Things |
CN115392481A (en) * | 2022-08-15 | 2022-11-25 | 重庆邮电大学 | An efficient communication method for federated learning based on real-time equalization of response time |
CN115510936A (en) * | 2021-06-23 | 2022-12-23 | 华为技术有限公司 | Model training method based on federal learning and cluster analyzer |
CN116233954A (en) * | 2022-12-08 | 2023-06-06 | 北京邮电大学 | Clustering data sharing method, device and storage medium based on federated learning system |
CN116579417A (en) * | 2023-05-10 | 2023-08-11 | 之江实验室 | Hierarchical personalized federated learning method, device and medium in edge computing network |
CN117150255A (en) * | 2023-10-26 | 2023-12-01 | 合肥工业大学 | Clustering effect verification method, terminal and storage medium in cluster federation learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8145677B2 (en) * | 2007-03-27 | 2012-03-27 | Faleh Jassem Al-Shameri | Automated generation of metadata for mining image and text data |
Non-Patent Citations (2)
Title |
---|
Clustered Graph Federated Personalized Learning;Francois Gauthier 等;《Asilomar 2022》;20221231;744-748 * |
基于无线D2D网络的分层联邦学习;刘翀赫 等;《浙江大学学报(工学版)》;20230531;第57卷(第5期);892-899 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||