[go: up one dir, main page]

CN117708877B - Personalized federal learning method and system for hybrid multi-stage private model - Google Patents

Personalized federal learning method and system for hybrid multi-stage private model Download PDF

Info

Publication number
CN117708877B
CN117708877B CN202311678245.5A CN202311678245A CN117708877B CN 117708877 B CN117708877 B CN 117708877B CN 202311678245 A CN202311678245 A CN 202311678245A CN 117708877 B CN117708877 B CN 117708877B
Authority
CN
China
Prior art keywords
model
local
prototype
global
personalized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311678245.5A
Other languages
Chinese (zh)
Other versions
CN117708877A (en
Inventor
熊黎丽
韩鹏
甘宇琦
包维杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Academy of Science and Technology
Original Assignee
Chongqing Academy of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Academy of Science and Technology filed Critical Chongqing Academy of Science and Technology
Priority to CN202311678245.5A priority Critical patent/CN117708877B/en
Publication of CN117708877A publication Critical patent/CN117708877A/en
Application granted granted Critical
Publication of CN117708877B publication Critical patent/CN117708877B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/098Distributed learning, e.g. federated learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a personalized federation learning method and a personalized federation learning system for a hybrid multi-stage private model, comprising the following steps: downloading a global model and a global prototype through a client, and initializing a local model, a personalized model and the local prototype; training a local model by using local data through a global model and a prototype regular term, carrying out weighted mixing on the local model and a historical local model, and training a personalized model by using the local data; the historical local model and the personalized model are weighted and mixed to generate a new personalized model, and then a local prototype is generated through the personalized model; the client uploads the local model and the local prototype to the server; the server aggregates the individual models and prototypes to generate global models and global prototypes, which are then distributed to the individual clients. The method introduces the concept of prototype learning, aggregates the local prototypes of different clients to obtain the global prototype, enhances the performance of the historical local model through regularization terms, and ensures the stability and accuracy of the model.

Description

混合多阶段私有模型的个性化联邦学习方法和系统Personalized federated learning method and system for hybrid multi-stage private models

技术领域Technical Field

本发明涉及联邦学习和分布式计算领域,具体地,涉及一种混合多阶段私有模型的个性化联邦学习方法和系统。The present invention relates to the field of federated learning and distributed computing, and in particular to a personalized federated learning method and system of a hybrid multi-stage private model.

背景技术Background technique

在有监督学习中,模型的泛化能力依赖于标注数据的规模。虽然我们的世界每天有海量的数据被创造出来,但是他们有这样的特点:In supervised learning, the generalization ability of the model depends on the scale of labeled data. Although our world has a huge amount of data created every day, they have the following characteristics:

(1)分布的:联邦学习应用于各种异构数据源,这些数据可以来自不同的设备、地理位置、网络条件等。这些异构数据可能具有不同的分布特性和数据类型。不同设备或用户的数据通常不是独立同分布(Non-IID)的,即它们的数据分布可能有很大的差异。这种不一致性增加了在分布式环境下进行模型训练的挑战。(1) Distributed: Federated learning is applied to various heterogeneous data sources, which can come from different devices, geographical locations, network conditions, etc. These heterogeneous data may have different distribution characteristics and data types. Data from different devices or users are usually not independent and identically distributed (Non-IID), that is, their data distributions may be very different. This inconsistency increases the challenges of model training in a distributed environment.

(2)隐私的:因为隐私性,端和端(或者数据中心)之间的数据共享变得困难。同时,法律法规对于数据隐私性的限制也在日益完善,诸如欧盟出台的《General DataProtection Regulation》等。特别的数据隐私性在医疗场景下收到了诸多关注。联邦学习通常在本地设备上进行,涉及用户的隐私数据。因此,确保在本地设备上的数据处理和传输过程中保护用户隐私是至关重要的。(2) Privacy: Because of privacy, data sharing between ends (or data centers) becomes difficult. At the same time, laws and regulations on data privacy are becoming more and more stringent, such as the General Data Protection Regulation issued by the European Union. In particular, data privacy has received a lot of attention in medical scenarios. Federated learning is usually performed on local devices and involves user privacy data. Therefore, it is crucial to ensure that user privacy is protected during data processing and transmission on local devices.

(3)高昂的数据传输成本:海量数据的传输成本是高昂的。传输大量数据涉及网络通信开销,尤其是在大规模数据集上。在一些情况下,数据传输可能导致昂贵的成本。(3) High data transmission costs: The transmission cost of massive data is high. Transmitting large amounts of data involves network communication overhead, especially on large-scale data sets. In some cases, data transmission may result in expensive costs.

在处理这些特点时,联邦学习研究领域一直在努力开发新的算法和技术,以确保在分布式环境下有效地训练模型,同时保护用户隐私,降低数据传输成本,并处理异构和非独立同分布的数据。随着研究的深入,联邦学习有望成为处理大规模、异构和敏感数据的有效方法。在Non-IID设定下,每个数据中心的优化方向不一致,造成全局模型的泛化能力差。考虑每个参与方的出发点只是需要一个模型对本地数据有更强的性能,所以个性化联邦学习就是为了解决联邦学习中的在异质性强(non-IID)的数据上收敛慢,性能差和模型对于本地任务或者数据集缺乏个性化的问题。In dealing with these characteristics, the federated learning research field has been working hard to develop new algorithms and technologies to ensure effective model training in a distributed environment while protecting user privacy, reducing data transmission costs, and processing heterogeneous and non-independent and identically distributed data. With the deepening of research, federated learning is expected to become an effective method for processing large-scale, heterogeneous and sensitive data. Under the Non-IID setting, the optimization direction of each data center is inconsistent, resulting in poor generalization ability of the global model. Considering that the starting point of each participant is just to have a model with stronger performance for local data, personalized federated learning is to solve the problems of slow convergence on highly heterogeneous (non-IID) data, poor performance and lack of personalization of the model for local tasks or data sets in federated learning.

专利文献CN117035057A公开了一种基于模型和数据蒸馏的个性化联邦学习方法,步骤如下:在客户端构建本地模型,包含共享编码器和私有解码器。客户端根据私有数据集训练本地模型,向服务器上传共享编码器的模型参数,客户端根据本地模型计算公有数据集的输出logits并向服务器上传logits,服务器对logits和多个客户端的共享编码器模型参数分别做加权聚合,得到全局logits和多个全局编码器模型参数,各客户端下载多个全局编码器模型,更新客户端中的多个本地共享编码器模型参数,下载全局logits,以知识蒸馏的方式参与解码器的训练。然而该专利无法完全解决目前存在的技术问题,也无法满足本发明的需求。Patent document CN117035057A discloses a personalized federated learning method based on model and data distillation, with the following steps: construct a local model on the client, including a shared encoder and a private decoder. The client trains the local model based on the private data set, uploads the model parameters of the shared encoder to the server, calculates the output logits of the public data set based on the local model and uploads the logits to the server, the server performs weighted aggregation on the logits and the shared encoder model parameters of multiple clients, obtains global logits and multiple global encoder model parameters, each client downloads multiple global encoder models, updates multiple local shared encoder model parameters in the client, downloads global logits, and participates in the training of the decoder in the form of knowledge distillation. However, this patent cannot completely solve the current technical problems, nor can it meet the needs of the present invention.

发明内容Summary of the invention

针对现有技术中的缺陷,本发明的目的是提供一种混合多阶段私有模型的个性化联邦学习方法和系统。In view of the defects in the prior art, the purpose of the present invention is to provide a personalized federated learning method and system of a hybrid multi-stage private model.

根据本发明提供的混合多阶段私有模型的个性化联邦学习方法,包括:The personalized federated learning method of the hybrid multi-stage private model provided by the present invention includes:

步骤S1:通过客户端下载全局模型和全局原型,并初始化本地模型、个性化模型以及本地原型;Step S1: download the global model and global prototype through the client, and initialize the local model, personalized model and local prototype;

步骤S2:利用本地数据通过全局模型和原型正则项训练本地模型,将本地模型与历史本地模型加权混和,同时利用本地数据训练个性化模型;Step S2: Using local data to train a local model through a global model and a prototype regularization term, weighting and mixing the local model with the historical local model, and using local data to train a personalized model;

步骤S3:将历史本地模型与个性化模型加权混合生成新的个性化模型,再通过个性化模型生成本地原型;Step S3: weighted mixing of the historical local model and the personalized model to generate a new personalized model, and then generating a local prototype through the personalized model;

步骤S4:客户端上传本地模型与本地原型至服务器,等待新一轮的模型参数;Step S4: The client uploads the local model and the local prototype to the server and waits for a new round of model parameters;

步骤S5:服务器聚合各个模型和原型以生成全局模型和全局原型,再分发给各个客户端。Step S5: The server aggregates the various models and prototypes to generate a global model and a global prototype, and then distributes them to various clients.

优选地,所述步骤S2包括:Preferably, step S2 comprises:

通过计算得到代表全局数据特征分布的全局类原型,在局部模型训练时引入一个正则项,纠正本地训练的偏移,表达式为:The global class prototype representing the global data feature distribution is obtained by calculation. A regularization term is introduced during local model training to correct the deviation of local training. The expression is:

其中,L2表示欧氏距离;li表示交叉熵损失函数;表示第j类数据的全局类原型;表示客户端i的本地类原型;w为本地模型参数;x为输入数据;y为输入数据x的真实标签;C是图像类别原型表示;λ为超参数,控制正则项的惩罚权重;Among them, L 2 represents the Euclidean distance; l i represents the cross entropy loss function; Represents the global class prototype of the j-th type of data; represents the local class prototype of client i; w is the local model parameter; x is the input data; y is the true label of the input data x; C is the image category prototype representation; λ is a hyperparameter that controls the penalty weight of the regularization term;

然后将本地模型与历史本地模型加权混合,以保留全局信息,表达式为:The local model is then weightedly mixed with the historical local model to preserve the global information, expressed as:

其中,history表示将每轮本地模型wi,t累加后的历史本地模型;t表示迭代的轮次;β为超参数,控制模型混合比例;wi,t表示第i个客户端第t轮的本地模型参数。Among them, history represents the historical local model after accumulating each round of local models wi ,t ; t represents the iteration round; β is a hyperparameter that controls the model mixing ratio; wi,t represents the local model parameters of the tth round of the i-th client.

优选地,所述步骤S3包括:Preferably, step S3 comprises:

为将本地信息和全局信息混合实现个性化,将历史本地模型和个性化模型赋权混合得到新一轮的个性化模型,表达式为:In order to achieve personalization by mixing local information with global information, the historical local model and the personalized model are weighted and mixed to obtain a new round of personalized model, which is expressed as:

其中,为个性化模型,用于本地训练生成下一轮的pi,t+1和本地类原型;t为迭代的轮次;μ为超参数,控制模型混合比例;个性化模型用本地数据训练一轮得到pi,t,此模型和历史本地模型混合得到新一轮的个性化模型 in, is a personalized model, which is used for local training to generate the next round of p i,t+1 and local class prototypes; t is the number of iterations; μ is a hyperparameter that controls the model mixing ratio; the personalized model is trained with local data for one round to obtain p i,t , and this model is mixed with the historical local model to obtain a new round of personalized model

优选地,将模型p分为表示层以及预测层,表达式为:Preferably, the model p is divided into a representation layer and a prediction layer, expressed as:

p∶=[pr+pd]p∶=[ pr + pd ]

z=g(x;pr)z=g(x; pr )

y=h(x;pd)y=h(x;p d )

其中,pr和pd分别表示模型的表示层参数和预测层参数;z表示对于输入x经过表示层后的输出;y表示对于输入x经过预测层后的输出;函数g表示输入数据经过模型表示层pr后的结果;函数h表示输入数据经过预测层pd后的结果;Among them, p r and p d represent the representation layer parameters and prediction layer parameters of the model respectively; z represents the output after the input x passes through the representation layer; y represents the output after the input x passes through the prediction layer; function g represents the result after the input data passes through the model representation layer p r ; function h represents the result after the input data passes through the prediction layer p d ;

定义Cj来表示图像输入中第j类的原型,则对于客户端i,其第j类的原型表示为即第j类数据经过表示层之后的向量平均,表达式为:Define Cj to represent the prototype of the jth class in the image input. For client i, the prototype of the jth class is expressed as That is, the vector average of the j-th type of data after passing through the representation layer is expressed as:

其中,Di,j表示客户端i的数据集中的第j类数据;Where D i,j represents the j-th category of data in the data set of client i;

在模型训练阶段用L2距离来衡量模型与原型对于输入x的预测差异,表达式为:In the model training phase, the L2 distance is used to measure the difference between the model and the prototype in predicting the input x. The expression is:

y=argminj||g(x;pr)-Cj||2y= argminj ||g(x; pr ) -Cj || 2 .

优选地,所述步骤S5包括:Preferably, step S5 comprises:

对于类j的全局原型,从各个客户端收集类j的本地原型,通过聚合不同客户端的本地原型得到类j的全局原型表达式为:For the global prototype of class j, collect the local prototypes of class j from each client, and obtain the global prototype of class j by aggregating the local prototypes of different clients. The expression is:

其中,表示有关于j类数据的客户端集合,Sj为j的数据量;in, represents the set of clients with data of type j, Sj is the amount of data of type j;

同时收集不同客户端的本地模型,聚合得到全局模型,表达式为:At the same time, local models of different clients are collected and aggregated to obtain the global model, which is expressed as:

其中,N为客户端总个数。Where N is the total number of clients.

根据本发明提供的混合多阶段私有模型的个性化联邦学习系统,包括:The personalized federated learning system of the hybrid multi-stage private model provided by the present invention includes:

模块M1:通过客户端下载全局模型和全局原型,并初始化本地模型、个性化模型以及本地原型;Module M1: Download the global model and global prototype through the client, and initialize the local model, personalized model and local prototype;

模块M2:利用本地数据通过全局模型和原型正则项训练本地模型,将本地模型与历史本地模型加权混和,同时利用本地数据训练个性化模型;Module M2: Use local data to train a local model through a global model and prototype regularization term, weightedly blend the local model with the historical local model, and use local data to train a personalized model.

模块M3:将历史本地模型与个性化模型加权混合生成新的个性化模型,再通过个性化模型生成本地原型;Module M3: weighted mixing of the historical local model and the personalized model to generate a new personalized model, and then generate a local prototype through the personalized model;

模块M4:客户端上传本地模型与本地原型至服务器,等待新一轮的模型参数;Module M4: The client uploads the local model and local prototype to the server and waits for a new round of model parameters;

模块M5:服务器聚合各个模型和原型以生成全局模型和全局原型,再分发给各个客户端。Module M5: The server aggregates the various models and prototypes to generate a global model and a global prototype, which are then distributed to various clients.

优选地,所述模块M2包括:Preferably, the module M2 comprises:

通过计算得到代表全局数据特征分布的全局类原型,在局部模型训练时引入一个正则项,纠正本地训练的偏移,表达式为:The global class prototype representing the global data feature distribution is obtained by calculation. A regularization term is introduced during local model training to correct the deviation of local training. The expression is:

其中,L2表示欧氏距离;li表示交叉熵损失函数;表示第j类数据的全局类原型;表示客户端i的本地类原型;w为本地模型参数;x为输入数据;y为输入数据x的真实标签;C是图像类别原型表示;λ为超参数,控制正则项的惩罚权重;Among them, L 2 represents the Euclidean distance; l i represents the cross entropy loss function; Represents the global class prototype of the j-th type of data; represents the local class prototype of client i; w is the local model parameter; x is the input data; y is the true label of the input data x; C is the image category prototype representation; λ is a hyperparameter that controls the penalty weight of the regularization term;

然后将本地模型与历史本地模型加权混合,以保留全局信息,表达式为:The local model is then weightedly mixed with the historical local model to preserve the global information, expressed as:

其中,history表示将每轮本地模型wi,t累加后的历史本地模型;t表示迭代的轮次;β为超参数,控制模型混合比例;wi,t表示第i个客户端第t轮的本地模型参数。Among them, history represents the historical local model after accumulating each round of local models wi ,t ; t represents the iteration round; β is a hyperparameter that controls the model mixing ratio; wi,t represents the local model parameters of the tth round of the i-th client.

优选地,所述模块M3包括:Preferably, the module M3 comprises:

为将本地信息和全局信息混合实现个性化,将历史本地模型和个性化模型赋权混合得到新一轮的个性化模型,表达式为:In order to achieve personalization by mixing local information with global information, the historical local model and the personalized model are weighted and mixed to obtain a new round of personalized model, which is expressed as:

其中,为个性化模型,用于本地训练生成下一轮的pi,t+1和本地类原型;t为迭代的轮次;μ为超参数,控制模型混合比例;个性化模型用本地数据训练一轮得到pi,t,此模型和历史本地模型混合得到新一轮的个性化模型 in, is a personalized model, which is used for local training to generate the next round of p i,t+1 and local class prototypes; t is the number of iterations; μ is a hyperparameter that controls the model mixing ratio; the personalized model is trained with local data for one round to obtain p i,t , and this model is mixed with the historical local model to obtain a new round of personalized model

优选地,将模型p分为表示层以及预测层,表达式为:Preferably, the model p is divided into a representation layer and a prediction layer, and the expression is:

p∶=[pr+pd]p∶=[ pr + pd ]

z=g(x;pr)z=g(x; pr )

y=h(x;pd)y=h(x;p d )

其中,pr和pd分别表示模型的表示层参数和预测层参数;z表示对于输入x经过表示层后的输出;y表示对于输入x经过预测层后的输出;函数g表示输入数据经过模型表示层pr后的结果;函数h表示输入数据经过预测层pd后的结果;Among them, p r and p d represent the representation layer parameters and prediction layer parameters of the model respectively; z represents the output after the input x passes through the representation layer; y represents the output after the input x passes through the prediction layer; function g represents the result after the input data passes through the model representation layer p r ; function h represents the result after the input data passes through the prediction layer p d ;

定义Cj来表示图像输入中第j类的原型,则对于客户端i,其第j类的原型表示为即第j类数据经过表示层之后的向量平均,表达式为:Define Cj to represent the prototype of the jth class in the image input. For client i, the prototype of the jth class is expressed as That is, the vector average of the j-th type of data after passing through the representation layer is expressed as:

其中,Di,j表示客户端i的数据集中的第j类数据;Where D i,j represents the j-th category of data in the data set of client i;

在模型训练阶段用L2距离来衡量模型与原型对于输入x的预测差异,表达式为:In the model training phase, the L2 distance is used to measure the difference between the model and the prototype in predicting the input x. The expression is:

y=argminj||g(x;pr)-Cj||2y= argminj ||g(x; pr ) -Cj || 2 .

优选地,所述模块M5包括:Preferably, the module M5 comprises:

对于类j的全局原型,从各个客户端收集类j的本地原型,通过聚合不同客户端的本地原型得到类j的全局原型表达式为:For the global prototype of class j, collect the local prototypes of class j from each client, and obtain the global prototype of class j by aggregating the local prototypes of different clients. The expression is:

其中,表示有关于j类数据的客户端集合,Sj为j的数据量;in, represents the set of clients with data of type j, Sj is the amount of data of type j;

同时收集不同客户端的本地模型,聚合得到全局模型,表达式为:At the same time, local models of different clients are collected and aggregated to obtain the global model, which is expressed as:

其中,N为客户端总个数。Where N is the total number of clients.

与现有技术相比,本发明具有如下的有益效果:Compared with the prior art, the present invention has the following beneficial effects:

(1)本发明提出了一种基于原型学习的混合多阶段私有模型的个性化联邦学习框架,该框架不仅大大提升了在数据非独立同分布的环境下模型的性能,还能够为每个参与的客户端提供一个定制的个性化模型,此模型会对本地数据具有很强的适应性;(1) The present invention proposes a personalized federated learning framework based on a hybrid multi-stage private model of prototype learning. This framework not only greatly improves the performance of the model in an environment where data is not independent and identically distributed, but also provides a customized personalized model for each participating client. This model has strong adaptability to local data.

(2)与现有个性化联邦学习算法相比,本发明引入一种混合多阶段私有模型的方法实现个性化,并且加入原型学习作为正则项纠正各个客户端的偏移量,该机制能在不交换原本数据的前提下,有效的将本地信息和全局信息相互转移;(2) Compared with the existing personalized federated learning algorithm, the present invention introduces a hybrid multi-stage private model method to achieve personalization, and adds prototype learning as a regular term to correct the offset of each client. This mechanism can effectively transfer local information and global information to each other without exchanging the original data;

(3)与现有个性化联邦学习算法相比,此方法可以在更少的迭代次数中实现更高的性能,另外此方法作为一个联邦学习框架,还能在多个领域展开使用,实现了在保护数据隐私的前提下实现分布式机器学习。(3) Compared with existing personalized federated learning algorithms, this method can achieve higher performance with fewer iterations. In addition, as a federated learning framework, this method can also be used in multiple fields, realizing distributed machine learning while protecting data privacy.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

通过阅读参照以下附图对非限制性实施例所作的详细描述,本发明的其它特征、目的和优点将会变得更明显:Other features, objects and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments made with reference to the following drawings:

图1为本发明流程图;Fig. 1 is a flow chart of the present invention;

图2为本发明运行步骤图。FIG. 2 is a diagram showing the operation steps of the present invention.

具体实施方式Detailed ways

下面结合具体实施例对本发明进行详细说明。以下实施例将有助于本领域的技术人员进一步理解本发明,但不以任何形式限制本发明。应当指出的是,对本领域的普通技术人员来说,在不脱离本发明构思的前提下,还可以做出若干变化和改进。这些都属于本发明的保护范围。The present invention is described in detail below in conjunction with specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but are not intended to limit the present invention in any form. It should be noted that, for those of ordinary skill in the art, several changes and improvements can also be made without departing from the concept of the present invention. These all belong to the protection scope of the present invention.

实施例1Example 1

如图1和图2,本发明实施例提供了一种基于原型学习的混合多阶段私有模型的个性化联邦学习方法,引入原型学习的作为正则项,通过混合模型参数实现个性化联邦学习,具体流程为:As shown in Figures 1 and 2, the embodiment of the present invention provides a personalized federated learning method of a hybrid multi-stage private model based on prototype learning, introduces prototype learning as a regular term, and implements personalized federated learning through hybrid model parameters. The specific process is as follows:

步骤S1:通过客户端下载全局模型和全局原型,并初始化本地模型、个性化模型以及本地原型;Step S1: download the global model and global prototype through the client, and initialize the local model, personalized model and local prototype;

步骤S2:利用本地数据通过全局模型和原型正则项训练本地模型,将本地模型与历史本地模型加权混和,同时利用本地数据训练个性化模型;Step S2: Using local data to train a local model through a global model and a prototype regularization term, weighting and mixing the local model with the historical local model, and using local data to train a personalized model;

具体地,在所述步骤S2中:Specifically, in step S2:

通过计算得到代表全局数据特征分布的全局类原型,在局部模型训练时引入一个正则项,纠正本地训练的偏移:The global class prototype representing the global data feature distribution is obtained by calculation, and a regularization term is introduced during local model training to correct the deviation of local training:

其中,L2表示欧氏距离,li表示交叉熵损失函数,表示第j类数据的全局类原型,表示客户端i的本地类原型。Among them, L 2 represents the Euclidean distance, li represents the cross entropy loss function, Represents the global class prototype of the j-th type of data, Represents the local class prototype of client i.

然后将本地模型与历史本地模型加权混合,以保留全局信息:The local model is then weightedly blended with the historical local model to preserve the global information:

步骤S3:将历史本地模型与个性化模型加权混合生成新的个性化模型,再通过个性化模型生成本地原型;Step S3: weighted mixing of the historical local model and the personalized model to generate a new personalized model, and then generating a local prototype through the personalized model;

具体地,在所述步骤S3中:Specifically, in step S3:

为了将本地信息和全局信息混合实现个性化,考虑将历史本地模型和个性化模型赋权混合得到新一轮的个性化模型,表达式为:In order to achieve personalization by mixing local information with global information, we consider weighting and mixing the historical local model and the personalized model to obtain a new round of personalized model, which is expressed as:

其中,为个性化模型参数,会被用于本地训练生成下一轮的pi,t+1和本地类原型,t为迭代的轮次。in, are personalized model parameters, which will be used for local training to generate the next round of p i,t+1 and local class prototypes, where t is the number of iterations.

对于深度学习模型可分为表示层以及预测层,表达式为:The deep learning model can be divided into a representation layer and a prediction layer, and the expression is:

p∶=[pr+pd]p∶=[ pr + pd ]

z=g(x;pr)z=g(x; pr )

y=h(x;pd)y=h(x;p d )

其中,pr和pd分别表示模型的表示层参数和预测层参数,z表示对于输入x经过表示层后的输出,y表示对于输入x经过预测层后的输出。定义Cj来表示图像输入中第j类的原型,那么对于特定的客户端i,其第j类的原型可表示为也就是第j类数据经过表示层之后的向量平均,即:Where p r and p d represent the representation layer parameters and prediction layer parameters of the model respectively, z represents the output after the input x passes through the representation layer, and y represents the output after the input x passes through the prediction layer. Define C j to represent the prototype of the jth class in the image input. Then for a specific client i, its prototype of the jth class can be expressed as That is, the vector average of the j-th type of data after passing through the representation layer, that is:

其中,Di,j表示客户端i的数据集中的第j类数据,有了如上定义,在模型训练阶段可以用L2距离来衡量模型与原型对于输入x的预测差异,表达式为:Among them, Di ,j represents the j-th category of data in the dataset of client i. With the above definition, the L2 distance can be used to measure the difference between the model and the prototype for the prediction of input x during the model training phase. The expression is:

y=argminj||g(x;pr)-Cj||2 y=argmin j || g(x; pr ) -Cj || 2

步骤S4:客户端上传本地模型与本地原型至服务器,等待新一轮的模型参数;Step S4: The client uploads the local model and the local prototype to the server and waits for a new round of model parameters;

步骤S5:服务器聚合各个模型和原型以生成全局模型和全局原型,再分发给各个客户端;Step S5: the server aggregates the various models and prototypes to generate a global model and a global prototype, and then distributes them to various clients;

具体地,在所述步骤S5中:Specifically, in step S5:

对于类j的全局原型,从各个客户端收集类j的本地原型,通过聚合不同客户端的本地原型得到类j的全局原型表达式为:For the global prototype of class j, collect the local prototypes of class j from each client, and obtain the global prototype of class j by aggregating the local prototypes of different clients. The expression is:

其中,表示有关于j类数据的客户端集合,Sj为j的数据量。in, It represents the set of clients with data of type j, and Sj is the amount of data of j.

同时收集不同客户端的本地模型聚合得到全局模型,表达式为:At the same time, the local models of different clients are collected and aggregated to obtain the global model, which is expressed as:

实施例2Example 2

本发明还提供一种混合多阶段私有模型的个性化联邦学习系统,所述混合多阶段私有模型的个性化联邦学习系统可以通过执行所述混合多阶段私有模型的个性化联邦学习方法的流程步骤予以实现,即本领域技术人员可以将所述混合多阶段私有模型的个性化联邦学习方法理解为所述混合多阶段私有模型的个性化联邦学习系统的优选实施方式。The present invention also provides a personalized federated learning system for a hybrid multi-stage private model. The personalized federated learning system for the hybrid multi-stage private model can be implemented by executing the process steps of the personalized federated learning method for the hybrid multi-stage private model, that is, those skilled in the art can understand the personalized federated learning method for the hybrid multi-stage private model as a preferred implementation of the personalized federated learning system for the hybrid multi-stage private model.

根据本发明提供的混合多阶段私有模型的个性化联邦学习系统,包括:模块M1:通过客户端下载全局模型和全局原型,并初始化本地模型、个性化模型以及本地原型;模块M2:利用本地数据通过全局模型和原型正则项训练本地模型,将本地模型与历史本地模型加权混和,同时利用本地数据训练个性化模型;模块M3:将历史本地模型与个性化模型加权混合生成新的个性化模型,再通过个性化模型生成本地原型;模块M4:客户端上传本地模型与本地原型至服务器,等待新一轮的模型参数;模块M5:服务器聚合各个模型和原型以生成全局模型和全局原型,再分发给各个客户端。The personalized federated learning system of the hybrid multi-stage private model provided by the present invention includes: module M1: downloading the global model and the global prototype through the client, and initializing the local model, the personalized model and the local prototype; module M2: using local data to train the local model through the global model and the prototype regularization term, weighted mixing the local model with the historical local model, and training the personalized model with the local data; module M3: weighted mixing the historical local model and the personalized model to generate a new personalized model, and then generating the local prototype through the personalized model; module M4: the client uploads the local model and the local prototype to the server, and waits for a new round of model parameters; module M5: the server aggregates the various models and prototypes to generate the global model and the global prototype, and then distributes them to each client.

所述模块M2包括:The module M2 comprises:

通过计算得到代表全局数据特征分布的全局类原型,在局部模型训练时引入一个正则项,纠正本地训练的偏移,表达式为:The global class prototype representing the global data feature distribution is obtained by calculation. A regularization term is introduced during local model training to correct the deviation of local training. The expression is:

其中,L2表示欧氏距离;li表示交叉熵损失函数;表示第j类数据的全局类原型;表示客户端i的本地类原型;w为本地模型参数;x为输入数据;y为输入数据x的真实标签;C是图像类别原型表示;λ为超参数,控制正则项的惩罚权重;Among them, L 2 represents the Euclidean distance; l i represents the cross entropy loss function; Represents the global class prototype of the j-th type of data; represents the local class prototype of client i; w is the local model parameter; x is the input data; y is the true label of the input data x; C is the image category prototype representation; λ is a hyperparameter that controls the penalty weight of the regularization term;

然后将本地模型与历史本地模型加权混合,以保留全局信息,表达式为:The local model is then weightedly mixed with the historical local model to preserve the global information, expressed as:

其中,history表示将每轮本地模型wi,t累加后的历史本地模型;t表示迭代的轮次;β为超参数,控制模型混合比例;wi,t表示第i个客户端第t轮的本地模型参数。Among them, history represents the historical local model after accumulating each round of local models wi ,t ; t represents the iteration round; β is a hyperparameter that controls the model mixing ratio; wi,t represents the local model parameters of the tth round of the i-th client.

所述模块M3包括:The module M3 comprises:

为将本地信息和全局信息混合实现个性化,将历史本地模型和个性化模型赋权混合得到新一轮的个性化模型,表达式为:In order to achieve personalization by mixing local information with global information, the historical local model and the personalized model are weighted and mixed to obtain a new round of personalized model, which is expressed as:

其中,为个性化模型,用于本地训练生成下一轮的pi,t+1和本地类原型;t为迭代的轮次;μ为超参数,控制模型混合比例;个性化模型用本地数据训练一轮得到pi,t,此模型和历史本地模型混合得到新一轮的个性化模型 in, is a personalized model, which is used for local training to generate the next round of p i,t+1 and local class prototypes; t is the number of iterations; μ is a hyperparameter that controls the model mixing ratio; the personalized model is trained with local data for one round to obtain p i,t , and this model is mixed with the historical local model to obtain a new round of personalized model

将模型p分为表示层以及预测层,表达式为:The model p is divided into a representation layer and a prediction layer, and the expression is:

p∶=[pr+pd]p∶=[ pr + pd ]

z=g(x;pr)z=g(x; pr )

y=h(x;pd)y=h(x;p d )

其中,pr和pd分别表示模型的表示层参数和预测层参数;z表示对于输入x经过表示层后的输出;y表示对于输入x经过预测层后的输出;函数g表示输入数据经过模型表示层pr后的结果;函数h表示输入数据经过预测层pd后的结果;Among them, p r and p d represent the representation layer parameters and prediction layer parameters of the model respectively; z represents the output after the input x passes through the representation layer; y represents the output after the input x passes through the prediction layer; function g represents the result after the input data passes through the model representation layer p r ; function h represents the result after the input data passes through the prediction layer p d ;

定义Cj来表示图像输入中第j类的原型,则对于客户端i,其第j类的原型表示为即第j类数据经过表示层之后的向量平均,表达式为:Define Cj to represent the prototype of the jth class in the image input. For client i, the prototype of the jth class is expressed as That is, the vector average of the j-th type of data after passing through the representation layer is expressed as:

其中,Di,j表示客户端i的数据集中的第j类数据;Where D i,j represents the j-th category of data in the data set of client i;

在模型训练阶段用L2距离来衡量模型与原型对于输入x的预测差异,表达式为:In the model training phase, the L2 distance is used to measure the difference between the model and the prototype in predicting the input x. The expression is:

y=argminj||g(x;pr)-Cj||2y= argminj ||g(x; pr ) -Cj || 2 .

所述模块M5包括:The module M5 comprises:

对于类j的全局原型,从各个客户端收集类j的本地原型,通过聚合不同客户端的本地原型得到类j的全局原型表达式为:For the global prototype of class j, collect the local prototypes of class j from each client, and obtain the global prototype of class j by aggregating the local prototypes of different clients. The expression is:

其中,表示有关于j类数据的客户端集合,Sj为j的数据量;in, represents the set of clients with data of type j, Sj is the amount of data of type j;

同时收集不同客户端的本地模型,聚合得到全局模型,表达式为:At the same time, local models of different clients are collected and aggregated to obtain the global model, which is expressed as:

其中,N为客户端总个数。Where N is the total number of clients.

本领域技术人员知道,除了以纯计算机可读程序代码方式实现本发明提供的系统、装置及其各个模块以外,完全可以通过将方法步骤进行逻辑编程来使得本发明提供的系统、装置及其各个模块以逻辑门、开关、专用集成电路、可编程逻辑控制器以及嵌入式微控制器等的形式来实现相同程序。所以,本发明提供的系统、装置及其各个模块可以被认为是一种硬件部件,而对其内包括的用于实现各种程序的模块也可以视为硬件部件内的结构;也可以将用于实现各种功能的模块视为既可以是实现方法的软件程序又可以是硬件部件内的结构。Those skilled in the art know that, in addition to implementing the system, device and its various modules provided by the present invention in a purely computer-readable program code, it is entirely possible to implement the same program in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers and embedded microcontrollers by logically programming the method steps. Therefore, the system, device and its various modules provided by the present invention can be considered as a hardware component, and the modules included therein for implementing various programs can also be considered as structures within the hardware component; the modules for implementing various functions can also be considered as both software programs for implementing the method and structures within the hardware component.

以上对本发明的具体实施例进行了描述。需要理解的是,本发明并不局限于上述特定实施方式,本领域技术人员可以在权利要求的范围内做出各种变化或修改,这并不影响本发明的实质内容。在不冲突的情况下,本申请的实施例和实施例中的特征可以任意相互组合。The above describes the specific embodiments of the present invention. It should be understood that the present invention is not limited to the above specific embodiments, and those skilled in the art can make various changes or modifications within the scope of the claims, which does not affect the essence of the present invention. In the absence of conflict, the embodiments of the present application and the features in the embodiments can be combined with each other arbitrarily.

Claims (6)

1. A personalized federal learning method of a hybrid multi-stage private model, comprising:
Step S1: downloading a global model and a global prototype through a client, and initializing a local model, a personalized model and the local prototype;
Step S2: training a local model by using local data through a global model and a prototype regular term, carrying out weighted mixing on the local model and a historical local model, and training a personalized model by using the local data;
Step S3: the historical local model and the personalized model are weighted and mixed to generate a new personalized model, and then a local prototype is generated through the personalized model;
step S4: uploading the local model and the local prototype to a server by the client, and waiting for a new round of model parameters;
step S5: the server aggregates each model and prototype to generate a global model and a global prototype, and then distributes the global model and the global prototype to each client;
The step S2 includes:
The global model representing the global data feature distribution is obtained through calculation, a regular term is introduced during local model training, the offset of local training is corrected, and the expression is as follows:
Wherein L 2 represents the euclidean distance; l i denotes the cross entropy loss function; A global class prototype representing class j data; Representing a local class prototype of the client i; w is a local model parameter; x is input data; y is the real label of the input data x; c is the image class prototype representation; lambda is a super parameter, and the punishment weight of the regular term is controlled;
the local model is then weighted mixed with the historical local model to preserve global information expressed as:
Wherein, history represents the history local model after accumulating each round of local model w i,t; t represents the iteration round; beta is a super parameter, and the mixing proportion of the model is controlled; w i,t denotes the local model parameters of the ith client's nth round;
the step S3 includes:
In order to realize individuation by mixing local information and global information, a historical local model and individuation model are weighted and mixed to obtain a new round of individuation model, and the expression is as follows:
Wherein, Generating a p i,t+1 and a local prototype of the next round for the personalized model for local training; t is the iteration round; mu is a super parameter, and the mixing proportion of the model is controlled; training the personalized model by using local data for one round to obtain p i,t, and mixing the model with the historical local model to obtain a new round of personalized model
2. The personalized federal learning method of hybrid multi-phase private models according to claim 1, wherein the model p is divided into a representation layer and a prediction layer, expressed as:
p∶=[pr+pd]
z=g(x;pr)
y=h(x;pd)
Wherein p r and p d represent the representation layer parameters and the prediction layer parameters of the model, respectively; z represents the output after the presentation layer for input x; y represents the output after passing through the prediction layer for input x; the function g represents the result of the input data after passing through the model representation layer p r; the function h represents the result of the input data after passing through the prediction layer p d;
Defining C j to represent a prototype of the j-th class in the image input, then for client i, its prototype of the j-th class is represented as I.e., vector average of j-th class data after passing through the presentation layer, the expression is:
wherein D i,j represents the j-th class of data in the dataset of client i;
In the model training stage, the prediction difference of the model and the prototype for the input x is measured by using the L 2 distance, and the expression is:
y=argminj||g(x;pr)-Cj||2
3. the personalized federal learning method of hybrid multi-phase private models according to claim 2, wherein step S5 comprises:
for the global prototype of the class j, collecting local prototypes of the class j from each client, and obtaining the global prototype of the class j by aggregating the local prototypes of different clients The expression is:
Wherein, Representing a client set related to j-class data, S j being the data quantity of j;
meanwhile, collecting local models of different clients, and obtaining a global model by aggregation, wherein the expression is as follows:
Wherein N is the total number of clients.
4. A personalized federal learning system for a hybrid multi-stage private model, comprising:
Module M1: downloading a global model and a global prototype through a client, and initializing a local model, a personalized model and the local prototype;
module M2: training a local model by using local data through a global model and a prototype regular term, carrying out weighted mixing on the local model and a historical local model, and training a personalized model by using the local data;
Module M3: the historical local model and the personalized model are weighted and mixed to generate a new personalized model, and then a local prototype is generated through the personalized model;
module M4: uploading the local model and the local prototype to a server by the client, and waiting for a new round of model parameters;
Module M5: the server aggregates each model and prototype to generate a global model and a global prototype, and then distributes the global model and the global prototype to each client;
The module M2 includes:
The global model representing the global data feature distribution is obtained through calculation, a regular term is introduced during local model training, the offset of local training is corrected, and the expression is as follows:
Wherein L 2 represents the euclidean distance; l i denotes the cross entropy loss function; A global class prototype representing class j data; Representing a local class prototype of the client i; w is a local model parameter; x is input data; y is the real label of the input data x; c is the image class prototype representation; lambda is a super parameter, and the punishment weight of the regular term is controlled;
the local model is then weighted mixed with the historical local model to preserve global information expressed as:
Wherein, history represents the history local model after accumulating each round of local model w i,t; t represents the iteration round; beta is a super parameter, and the mixing proportion of the model is controlled; w i,t denotes the local model parameters of the ith client's nth round;
the module M3 includes:
In order to realize individuation by mixing local information and global information, a historical local model and individuation model are weighted and mixed to obtain a new round of individuation model, and the expression is as follows:
Wherein, Generating a p i,t+1 and a local prototype of the next round for the personalized model for local training; t is the iteration round; mu is a super parameter, and the mixing proportion of the model is controlled; training the personalized model by using local data for one round to obtain p i,t, and mixing the model with the historical local model to obtain a new round of personalized model
5. The personalized federal learning system for hybrid multi-phase private models according to claim 4, wherein the model p is divided into a representation layer and a prediction layer, expressed as:
p∶=[pr+pd]
z=g(x;pr)
y=h(x;pd)
Wherein p r and p d represent the representation layer parameters and the prediction layer parameters of the model, respectively; z represents the output after the presentation layer for input x; y represents the output after passing through the prediction layer for input x; the function g represents the result of the input data after passing through the model representation layer p r; the function h represents the result of the input data after passing through the prediction layer p d;
Defining C j to represent a prototype of the j-th class in the image input, then for client i, its prototype of the j-th class is represented as I.e., vector average of j-th class data after passing through the presentation layer, the expression is:
wherein D i,j represents the j-th class of data in the dataset of client i;
In the model training stage, the prediction difference of the model and the prototype for the input x is measured by using the L 2 distance, and the expression is:
y=argminj||g(x;pr)-Cj||2
6. the personalized federal learning system for hybrid multi-phase private models according to claim 5, wherein the module M5 comprises:
for the global prototype of the class j, collecting local prototypes of the class j from each client, and obtaining the global prototype of the class j by aggregating the local prototypes of different clients The expression is:
Wherein, Representing a client set related to j-class data, S j being the data quantity of j;
meanwhile, collecting local models of different clients, and obtaining a global model by aggregation, wherein the expression is as follows:
Wherein N is the total number of clients.
CN202311678245.5A 2023-12-07 2023-12-07 Personalized federal learning method and system for hybrid multi-stage private model Active CN117708877B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311678245.5A CN117708877B (en) 2023-12-07 2023-12-07 Personalized federal learning method and system for hybrid multi-stage private model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311678245.5A CN117708877B (en) 2023-12-07 2023-12-07 Personalized federal learning method and system for hybrid multi-stage private model

Publications (2)

Publication Number Publication Date
CN117708877A CN117708877A (en) 2024-03-15
CN117708877B true CN117708877B (en) 2024-07-12

Family

ID=90147152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311678245.5A Active CN117708877B (en) 2023-12-07 2023-12-07 Personalized federal learning method and system for hybrid multi-stage private model

Country Status (1)

Country Link
CN (1) CN117708877B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119378650B (en) * 2024-10-30 2025-08-26 北京电子科技学院 Heterogeneous federated learning method of data and models based on adaptive aggregation prototype

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115731424A (en) * 2022-12-03 2023-03-03 北京邮电大学 Image classification model training method and system based on enhanced federal domain generalization
CN116029369A (en) * 2023-02-10 2023-04-28 中国海洋大学 Back door attack defense method and system based on federal learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6636848B1 (en) * 2000-05-31 2003-10-21 International Business Machines Corporation Information search using knowledge agents

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115731424A (en) * 2022-12-03 2023-03-03 北京邮电大学 Image classification model training method and system based on enhanced federal domain generalization
CN116029369A (en) * 2023-02-10 2023-04-28 中国海洋大学 Back door attack defense method and system based on federal learning

Also Published As

Publication number Publication date
CN117708877A (en) 2024-03-15

Similar Documents

Publication Publication Date Title
Liu et al. Decentralized federated learning: Balancing communication and computing costs
Chung et al. Electric vehicle charge scheduling mechanism to maximize cost efficiency and user convenience
Pelikan et al. The bivariate marginal distribution algorithm
Agliamzanov et al. Hydrology@ Home: a distributed volunteer computing framework for hydrological research and applications
Fang et al. Automated federated pipeline for parameter-efficient fine-tuning of large language models
CN109165808B (en) A method for dispatching work orders for on-site operation and maintenance of power communication network
Kumar et al. A hybrid multi-agent based particle swarm optimization algorithm for economic power dispatch
Wan et al. Privacy-preservation for gradient descent methods
Chamnanlor et al. Re-entrant flow shop scheduling problem with time windows using hybrid genetic algorithm based on auto-tuning strategy
CN110378434A (en) Training method, recommended method, device and the electronic equipment of clicking rate prediction model
Fan et al. Serial-batching group scheduling with release times and the combined effects of deterioration and truncated job-dependent learning
CN117236421B (en) Large model training method based on federal knowledge distillation
CN112330048A (en) Scoring card model training method and device, storage medium and electronic device
CN110008023B (en) Budget Constrained Random Task Scheduling Method for Cloud Computing System Based on Genetic Algorithm
Zhang et al. Graph-based traffic forecasting via communication-efficient federated learning
CN117708877B (en) Personalized federal learning method and system for hybrid multi-stage private model
Gholami et al. Solving parallel machines job-shop scheduling problems by an adaptive algorithm
Huang et al. A Simulation‐Based Approach of QoS‐Aware Service Selection in Mobile Edge Computing
CN111027709A (en) Information recommendation method and device, server and storage medium
Jiang et al. Neural combinatorial optimization for energy-efficient offloading in mobile edge computing
CN103281374A (en) Method for rapid data scheduling in cloud storage
CN117436627B (en) Task allocation method, device, terminal equipment and medium
CN112948123A (en) Spark-based grid hydrological model distributed computing method
Yue et al. Hybrid Pareto artificial bee colony algorithm for multi-objective single machine group scheduling problem with sequence-dependent setup times and learning effects
Yu et al. Pi-fed: Continual federated learning with parameter-level importance aggregation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant