CN117834165A - A network intrusion detection method based on Gaussian differential privacy federated learning - Google Patents
- Publication number: CN117834165A
- Application number: CN202310376798.9A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H04L63/1416 — Network security: detecting or protecting against malicious traffic by monitoring network traffic; event detection, e.g. attack signature detection
- G06F18/241 — Pattern recognition: classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415 — Classification based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06N3/0464 — Neural networks: convolutional networks [CNN, ConvNet]
- G06N3/048 — Neural networks: activation functions
- G06N3/084 — Learning methods: backpropagation, e.g. using gradient descent
- G06N3/098 — Learning methods: distributed learning, e.g. federated learning
- G06N3/0985 — Hyperparameter optimisation; meta-learning; learning-to-learn
- Y02D30/50 — Reducing energy consumption in wire-line communication networks, e.g. low power modes or reduced link rate
Abstract
The invention discloses a network intrusion detection method based on Gaussian differential privacy federated learning, comprising the following steps: a client downloads model parameters from the server and trains a local model; after local training, the client uploads its model parameters to the server; the server takes a weighted average of the received model parameters, adds it to the model parameters from the previous round of aggregation, and broadcasts the updated parameters to the clients; each client then repeats local model training until the federated learning training rounds are finished. The invention secures the federated learning computation with Gaussian-mechanism differential privacy, reduces communication overhead with an improved FedAvg algorithm on the server, and uses an improved 1D CNN as the client-side local model in the collaborative training. It achieves a high detection rate and a low false-alarm rate, improving intrusion detection performance while protecting the privacy of network traffic, and provides an effective reference for future large-scale, multi-scenario secure data analysis of network traffic.
Description
Technical Field
The invention relates to the technical field of network intrusion detection, and in particular to a network intrusion detection method based on Gaussian differential privacy federated learning.
Background
As attention to network security grows, privacy protection has become an unavoidable concern of the internet era. Network intrusion detection is a widely applied network security defense technology: it protects against internal and external attacks and misoperation in real time, and intercepts and responds to intrusions before the network system is threatened. Deep learning methods are now widely applied to network intrusion detection and achieve high performance, but they bring a series of network security problems of their own. Federated learning has therefore been introduced: as a distributed machine learning framework built on privacy protection and data security, it addresses the data-island problem in machine learning. The key idea of federated learning is to build a machine learning model over datasets distributed across multiple devices while preventing data leakage.
The basic framework of federated learning consists of a server and multiple clients. Each client trains a model on its local data starting from initialized model parameters, then uploads the trained parameters to the server. The server coordinates the clients participating in federated training, averages or weight-averages the collected local model parameters, and broadcasts the updated parameters back to the clients in preparation for the next training round. Federated learning thus allows multiple institutions to train an intrusion-detection deep learning model without sharing network traffic. It is currently applied in many fields, such as ISPs, government affairs, medicine, finance, advertising, and logistics. To obtain a better-performing network intrusion detection deep learning model, multiple institutions in the same field cooperatively train the model as federated learning clients, with the server deployed in the cloud.
At present, a large number of research papers aim to apply federated learning to network intrusion detection, but most methods lack applicability verification across multiple datasets: they experiment on a single dataset only, or cannot provide a high detection rate for all traffic types and achieve high classification performance only for certain traffic. The detection performance of existing network intrusion detection methods is therefore low, and privacy leakage cannot be effectively prevented when network traffic undergoes data analysis and processing. The invention provides a network intrusion detection method based on Gaussian differential privacy federated learning to solve these problems in the prior art.
Disclosure of Invention
In view of these problems, the invention aims to provide a network intrusion detection method based on Gaussian differential privacy federated learning, solving the problems that existing methods applying federated learning to network intrusion detection lack multi-dataset applicability verification, experiment on a single dataset only, or cannot provide a high detection rate for all traffic types and achieve high classification performance only for certain traffic.
In order to achieve the purpose of the invention, the invention is realized by the following technical scheme: a network intrusion detection method based on Gaussian differential privacy federated learning comprises the following steps:
step one: construct a client-server architecture based on federated learning; each client then downloads model parameters from the server as local parameters to build its local model and update the local model parameters, and trains the local model on the local database within the client-server architecture;
step two: while the local model of step one is trained, the gradient information in the local model is clipped and noised at the client using Gaussian-mechanism differential privacy; when the number of local training passes reaches a preset value, the trained local model parameters are uploaded to the server;
step three: after receiving the trained local model parameters, the server aggregates the parameters received from the clients with the improved FedAvg algorithm, obtains new global model parameters, and updates the global model with them;
step four: the server obtains the updated model parameters from the updated global model and broadcasts them to the clients;
step five: after receiving the updated model parameters, the clients repeat local model training until the preset number of federated learning training rounds is finished; the global model is then updated once more, and network intrusion is detected with the updated global model.
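The five steps above can be sketched end-to-end. The sketch below is a minimal illustration under simplifying assumptions (a linear model with a toy least-squares gradient, one local pass per round, uniform averaging); `local_train` and `aggregate` are hypothetical names standing in for the client update of step two and the server aggregation of step three, not the patent's exact implementation.

```python
import numpy as np

def local_train(global_w, data, lr=0.01, sigma=0.2):
    """Client step (steps one and two): gradient descent on local batches,
    perturbed with Gaussian noise N(0, sigma^2)."""
    w = global_w.copy()
    for x, y in data:                       # one local pass over the batches
        g = 2 * x.T @ (x @ w - y) / len(x)  # toy least-squares gradient
        w = w - lr * g + np.random.normal(0.0, sigma, size=w.shape)
    return w

def aggregate(prev_global, client_ws):
    """Server step (step three): average of client parameters, expressed as
    the previous global parameters plus the mean update."""
    delta = np.mean(client_ws, axis=0) - prev_global
    return prev_global + delta

rng = np.random.default_rng(0)
clients = [[(rng.normal(size=(8, 3)), rng.normal(size=8))] for _ in range(4)]
w_global = np.zeros(3)
for _ in range(5):                          # federated training rounds (step five)
    uploads = [local_train(w_global, d) for d in clients]   # steps one and two
    w_global = aggregate(w_global, uploads)                 # steps three and four
```

After the preset number of rounds, `w_global` plays the role of the updated global model used for detection in step five.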
The further improvement is that: in step one, the local parameters are the global model parameters w_t that client i receives from the server; client i performs gradient descent on w_t, realizing local model training: w_{t+1}^i = w_t − η·g_i, where η is the local learning rate and g_i is the gradient computed on client i's local data.
The further improvement is that: in step one, the local model is trained on the KDD CUP99, NSL_KDD, and UNSW_NB15 datasets, and an improved 1D CNN participates in the collaborative training.
The further improvement is that: in step two, when the clipping and noise-adding operations are performed, Gaussian noise N(0, σ²) is added at each client to perturb the local model during training.
The further improvement is that: in step three, the specific steps of aggregation with the improved FedAvg algorithm are as follows: the received local model parameters are weighted-averaged and added to the model parameters from the previous round of aggregation to obtain the updated global model, and the global model is updated according to the new global model parameters.
The further improvement is that: in step three, during aggregation: the client receives initial model parameters w_0 from the server and computes the gradient g using the current parameters w and its local data. The local learning rate is η and the loss function is l; the local data are divided into |X| batches X, and for each batch x ∈ X gradient descent is performed: g ← ∇l(w; x). With Gaussian noise of standard deviation σ, noise drawn from the Gaussian distribution N(0, σ²) is added and the local model is updated: w ← w − η·g + N(0, σ²). The client then sends the updated model parameters to the server.
The further improvement is that: in step three, during aggregation, the server selects m clients for the current training round, broadcasts the current model parameters to them, and receives local model parameters w_1, w_2, …, w_m from the m clients. Noise is added in the global model aggregation operation, the global model is updated and broadcast to the clients, and the server waits for the next round of computation.
The further improvement is that: in step five, after each round of repeated local model training, every client uploads its updated model parameters to the server, and the server aggregates the updated parameters of the multiple clients to update the global model.
The beneficial effects of the invention are as follows: the invention secures the federated learning computation with Gaussian-mechanism differential privacy, reduces communication overhead with an improved FedAvg algorithm on the server, and uses an improved 1D CNN in the client-side local model for collaborative training. It achieves a high detection rate and a low false-alarm rate, effectively prevents privacy leakage when network traffic undergoes data analysis and processing, improves intrusion detection performance while protecting the privacy of network traffic, and provides an effective reference for future large-scale, multi-scenario secure data analysis of network traffic.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained according to these drawings without inventive faculty for a person skilled in the art.
FIG. 1 is a flow chart of a network intrusion detection method of the present invention;
FIG. 2 is a schematic diagram of the effect of σ and C on model accuracy in an embodiment of the invention;
FIG. 3 is a schematic diagram of the effect of σ and C on ε in an embodiment of the invention;
FIG. 4 is a schematic diagram showing the influence degree of the number of clients on the model accuracy in the embodiment of the present invention;
FIG. 5 is a schematic diagram showing the influence degree of the number of clients on model loss in the embodiment of the present invention;
FIG. 6 is a statistical graph of model performance of NIDS-FLGDP in various scenarios in an embodiment of the present invention;
FIG. 7 is a statistical chart of the classification accuracy of NIDS-FLGDP in an embodiment of the present invention;
FIG. 8 is a multi-class accuracy statistical plot of NIDS-FLGDP in an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, 2, 3, 4, 5, 6, 7, and 8, the present embodiment provides a network intrusion detection method (NIDS-FLGDP) based on gaussian differential privacy federal learning, including the following steps:
step one: construct a client-server architecture based on federated learning; each client then downloads model parameters from the server as local parameters to build its local model and update the local model parameters, and trains the local model on the local database within the client-server architecture;
the local parameters are the global model parameters w_t received by client i from the server; client i performs gradient descent on them, realizing local model training: w_{t+1}^i = w_t − η·g_i. The local model is trained on the KDD CUP99, NSL_KDD, and UNSW_NB15 datasets, and an improved 1D CNN participates in the collaborative training;
step two: while the local model of step one is trained, the gradient information in the local model is clipped and noised at the client using Gaussian-mechanism differential privacy; when the number of local training passes reaches a preset value, the trained local model parameters are uploaded to the server. During the clipping and noise-adding operations, Gaussian noise N(0, σ²) is added at each client to perturb the local model during training;
step three: after the server receives the trained local model parameters, it aggregates the parameters received from the clients with the improved FedAvg algorithm and obtains new global model parameters. The specific steps are as follows: first, the received local model parameters are weighted-averaged and added to the model parameters from the previous round of aggregation to obtain the updated global model, which is updated according to its new global parameters. In the improved FedAvg aggregation, the client:
Receives initial model parameters w_0 from the server.
Computes the gradient g using the current parameters w and its local data.
The local learning rate is η and the loss function is l.
The local data are divided into |X| batches X.
For each batch x ∈ X,
gradient descent is performed: g ← ∇l(w; x).
The standard deviation of the Gaussian noise is σ.
Noise drawn from the Gaussian distribution N(0, σ²) is added.
Local model update: w ← w − η·g + N(0, σ²).
The client sends the updated model parameters to the server.
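The client-side procedure above can be sketched as a single update function. The per-example clipping to the gradient boundary C and the exact noise placement are our assumptions filling in details the description leaves implicit (it states only w ← w − η·g + N(0, σ²)); `dp_local_update` is an illustrative name.

```python
import numpy as np

def dp_local_update(w, per_example_grads, lr=0.01, C=0.4, sigma=0.2, rng=None):
    """Clip each per-example gradient to L2 norm at most C, average,
    then perturb the parameter update with Gaussian noise N(0, sigma^2)."""
    rng = rng or np.random.default_rng(0)
    clipped = [g * min(1.0, C / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    g_bar = np.mean(clipped, axis=0)          # averaged clipped gradient
    return w - lr * g_bar + rng.normal(0.0, sigma, size=w.shape)
```

With σ set to the noise scale from the parameter experiments (0.2) and C to the clipping boundary (0.4), this is the step each client repeats per batch before uploading its parameters.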
The server:
Selects m clients to perform the current training round.
Broadcasts the current model parameters to the m clients participating in this round.
Receives local model parameters w_1, w_2, …, w_m from the m clients.
Adds noise in the global model aggregation operation.
Global model update: w_n ← w_{n−1} + Δ_n + N(0, σ²), where Δ_n is the weighted average of the client updates in round n.
Broadcasts the global model to the clients.
Waits for the next round of computation.
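The server-side update w_n ← w_{n−1} + Δ_n + N(0, σ²) can be sketched as follows, reading Δ_n as the data-size-weighted average of the clients' parameter changes relative to the previous global model. This is one plausible reading of the improved FedAvg step, not the patent's exact algorithm; `fedavg_update` is an illustrative name.

```python
import numpy as np

def fedavg_update(w_prev, client_params, sizes, sigma=0.2, rng=None):
    """w_n <- w_{n-1} + Delta_n + N(0, sigma^2), with Delta_n the
    data-size-weighted average of client updates relative to w_{n-1}."""
    rng = rng or np.random.default_rng(0)
    weights = np.asarray(sizes, dtype=float) / np.sum(sizes)
    delta = sum(wk * (p - w_prev) for wk, p in zip(weights, client_params))
    return w_prev + delta + rng.normal(0.0, sigma, size=w_prev.shape)
```

Because each round adds the averaged update on top of the previous global parameters, clients whose data have drifted little contribute small Δ terms, which is consistent with the stated goal of reducing communication and keeping aggregation stable.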
Step four: the server updates the global model to obtain updated model parameters, and sends the updated model parameters to the client in a broadcasting mode;
step five: after receiving the updated model parameters, the clients repeat local model training until the preset number of federated learning training rounds is finished. After each round of repeated local training, the clients upload their updated model parameters to the server, the server aggregates the updated parameters of the multiple clients, the global model is updated, and network intrusion is detected with the updated global model.
The test environment of this embodiment is shown in Table 1 below:
Table 1 Experimental environment
| Module | Parameters |
| --- | --- |
| CPU | Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz |
| Dominant frequency | 4.2 GHz |
| Memory | 16 GB |
| Operating system | Windows 10 |
| Experimental software | Python 3.7, TensorFlow 2.2.0 |
The clients in the NIDS-FLGDP of this embodiment use a 1D CNN deep learning model for local training. Hyperparameters were explored extensively in the early stage of the experiments to obtain reliable model performance, and adopting the optimized hyperparameters further improves the performance of the improved 1D CNN and of NIDS-FLGDP as a whole. In the three experiments designed, the 1D CNN uses the same hyperparameters, allowing a fair comparison of NIDS-FLGDP performance; the 1D CNN model parameters are defined in Table 2 below.
Table 2 1D CNN model parameters
| Layer | Nodes | Activation function |
| --- | --- | --- |
| Input layer | 4 | N/A |
| Hidden layer 1 | 32 | ReLU |
| Hidden layer 2 | 64 | ReLU |
| Hidden layer 3 | 128 | ReLU |
| Hidden layer 4 | 128 | ReLU |
| Hidden layer 5 | 128 | ReLU |
| Hidden layer 6 | 100 | ReLU |
| Output layer | 1 | Sigmoid |
In the improved 1D CNN, data are fed forward from the input layer through six hidden layers, with a dropout of 40% of the units between hidden layers to prevent overfitting to the local client data. Each hidden layer consists of a number of nodes: hidden layer 1 contains 32 nodes, hidden layer 2 contains 64 nodes, hidden layers 3 to 5 contain 128 nodes each, and hidden layer 6 contains 100 nodes. Using the ReLU activation function in each hidden layer effectively enhances the nonlinear representation capability of the network, and the prediction is finally computed in the output layer. The Adam algorithm is used as the optimizer during model training, mapping high-level features to the output via backpropagation.
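As a sketch of the layer stack in Table 2, the forward pass below implements six ReLU hidden layers with 40% inverted dropout between them and a sigmoid output, in plain NumPy. Reading the table as a stack of fully connected layers, and the 0.1-scale weight initialization, are our assumptions; the patent itself calls the model a 1D CNN.

```python
import numpy as np

LAYER_SIZES = [4, 32, 64, 128, 128, 128, 100, 1]  # Table 2, input to output

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params, train=False, drop=0.4, rng=None):
    """Six ReLU hidden layers with 40% inverted dropout between them,
    sigmoid output for the binary normal/abnormal decision."""
    rng = rng or np.random.default_rng(0)
    h = x
    for i, (W, b) in enumerate(params):
        z = h @ W + b
        if i < len(params) - 1:
            h = relu(z)
            if train:                       # dropout only during training
                mask = rng.random(h.shape) >= drop
                h = h * mask / (1.0 - drop)
        else:
            h = sigmoid(z)                  # output layer
    return h

rng = np.random.default_rng(0)
params = [(rng.normal(0.0, 0.1, size=(m, n)), np.zeros(n))
          for m, n in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])]
probs = forward(rng.normal(size=(5, 4)), params)   # 5 samples, 4 input features
```

Training these parameters with Adam and binary cross entropy, as stated above, is omitted here for brevity.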
In the federal learning framework, there are five participating clients and one global server. Each client holds a unique set of network traffic data representing a real scene. Wherein, client 1 represents the Train201 data set, client 2 represents the Train202 data set, client 3 represents the Train203 data set, client 4 represents the Train204 data set, and client 5 represents the Train205 data set. The federal learning model parameters were set as shown in table 3 below.
TABLE 3 Federated learning model parameters
| Parameter | Value |
| --- | --- |
| Number of local training rounds | 3 |
| Batch size | 1024 |
| Client optimizer | Adam |
| Client learning rate | 0.01 |
| Loss function | Binary cross entropy |
| Number of federated learning training rounds | 50 |
| Server optimizer | Adam |
| Server learning rate | 0.01 |
The parameters are the same in every round of federated learning: the local and server learning rates are 0.01, the number of federated learning training rounds is 50, the number of local training rounds is 3, the batch size is 1024, the optimizer is Adam, and the loss function is binary cross entropy.
The KDD CUP99, NSL_KDD, and UNSW_NB15 datasets represent classical and recent network attacks; training NIDS-FLGDP on all three better verifies its model performance and applicability. Each network connection in the KDD CUP99 dataset is labeled normal or abnormal, with the abnormal types being DoS, R2L, U2R, and Probing. The anomaly categories in the NSL_KDD dataset are the same as in KDD CUP99, with the redundant data of KDD CUP99 reduced. The UNSW_NB15 dataset contains normal network connections and nine attack types: Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms. Binary and multi-class experiments are set up on each of the three datasets with unified labeling: the labels normal/abnormal are used for the binary experiments, and the individual traffic types for the multi-class experiments. The specific partitioning is shown in Table 4 below.
Table 4 data set partitioning
The original KDD CUP99, NSL_KDD, and UNSW_NB15 datasets have two shortcomings: (1) they contain a large amount of redundant data; (2) the datasets are unbalanced: normal data are plentiful while abnormal data are scarce, which may mislead the classifier. The original datasets are therefore preprocessed. First they are balanced by deleting part of the normal data and duplicating part of the abnormal data. The character-type fields in the datasets are then converted to numerical data using one-hot and label encoding. The numerical values are normalized so that the features of each dimension fall within the same range, avoiding large differences in scale between dimensions; after normalization, the differences between the features are eliminated. The invention uses the Z-Score method, normalizing with the mean and standard deviation of the original data: z = (x − μ) / σ.
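A minimal sketch of the Z-Score step above, normalizing each feature column with its mean μ and standard deviation σ (the guard for constant columns is our addition, not part of the description):

```python
import numpy as np

def z_score(X):
    """Z-Score normalization per feature column: z = (x - mu) / sigma."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd = np.where(sd == 0.0, 1.0, sd)  # guard constant features (our addition)
    return (X - mu) / sd
```

Applied after one-hot and label encoding, this brings every numeric feature to zero mean and unit standard deviation before training.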
NIDS-FLGDP is essentially a classification model for network traffic whose output is the probability of each traffic type. A confusion matrix is built from the classification results, yielding TP, TN, FP, and FN for each type. Taking the abnormal (attack) class as positive: TP is the number of abnormal samples classified as abnormal; TN is the number of normal samples classified as normal; FP is the number of normal samples classified as abnormal; FN is the number of abnormal samples classified as normal. The confusion matrix is shown in Table 5 below.
TABLE 5 Confusion matrix
| | Predicted abnormal | Predicted normal |
| --- | --- | --- |
| Actually abnormal | TP | FN |
| Actually normal | FP | TN |
In network intrusion detection, models are typically evaluated using Accuracy (ACC), Precision (PR), Detection Rate (DR), F1-measure (F1), and False Positive Rate (FPR).
ACC is the percentage of correctly classified samples among all samples: ACC = (TP + TN) / (TP + TN + FP + FN).
PR is the percentage of true positive samples among the samples predicted positive: PR = TP / (TP + FP).
DR is the percentage of positive samples correctly predicted as positive: DR = TP / (TP + FN).
F1 is the harmonic mean of precision and recall: F1 = 2 · PR · DR / (PR + DR).
FPR is the percentage of negative samples mispredicted as positive: FPR = FP / (FP + TN).
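The five metrics above can be computed directly from the confusion-matrix counts; the function name below is illustrative:

```python
def detection_metrics(tp, tn, fp, fn):
    """ACC, PR, DR, F1 and FPR from confusion-matrix counts
    (abnormal/attack taken as the positive class)."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pr = tp / (tp + fp)           # precision
    dr = tp / (tp + fn)           # detection rate (recall)
    f1 = 2 * pr * dr / (pr + dr)  # harmonic mean of PR and DR
    fpr = fp / (fp + tn)          # false positive rate
    return acc, pr, dr, f1, fpr
```

For example, with TP = 50, TN = 40, FP = 10, FN = 0 this gives ACC = 0.9, DR = 1.0, and FPR = 0.2.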
four parts of experiments were designed in this example: parameter impact, client impact, model performance validation and comparison experiments. In the parameter influence experiment, the optimal parameters of the NIDS-FLGDP under the optimal accuracy are obtained by adjusting learning parameters in federal learning. In the client influence experiment, the influence degree of the number of the clients on the model accuracy can be obtained by setting different numbers of clients. In the scene verification experiment, classification accuracy of various traffic types is respectively obtained by setting two classification scenes and multiple classification scenes. In the comparative experiments, the performance of NIDS-FLGDP was better than that of other proposed models under the same reference by comparison with the methods proposed by the previous studies.
In the parameter influence experiment, the validity of NIDS-FLGDP is verified by adjusting the learning parameters in federated learning. Gaussian noise is added to NIDS-FLGDP, the related parameters of the Gaussian noise are set precisely, and the privacy budget required to obtain the optimal accuracy is calculated, balancing privacy protection against data availability. Here the Gaussian standard deviation σ represents the noise scale; the relaxation factor δ is set to 10⁻⁵, meaning that only a 10⁻⁵ probability of violating strict differential privacy is tolerated; and C is the gradient clipping boundary in differential privacy. The experimental ranges of σ and C are σ ∈ {0.1, 0.2, 0.3, 0.4, 0.5} and C ∈ {0.1, 0.3, 0.5, 0.7, 0.9}. The influence of σ and C on test accuracy is shown in FIG. 2;
as shown in fig. 2, the corresponding σ and C values at the optimum accuracy 95.04 and minimum loss of 0.398 are 0.2 and 0.4, respectively. Within the experimental range of parameters sigma and C: the larger the sigma is, the more noise is added, the accuracy of training and testing is damaged, the smaller the sigma is, the smaller the disturbance of model parameters is, the higher the accuracy of the model is, the lower the loss function value is, and the better the model performance is; c limits the gradient norms, the larger C reduces the accuracy of training and testing. From the optimal values of sigma and C by the formulaThe optimal solution of the privacy budget e is calculated, and the influence degree of parameters sigma and C on the e is shown in fig. 3.
From FIG. 3, the influence of σ and C on ε can be read, and the optimal value of the privacy budget ε so calculated is 0.825, where the privacy budget is proportional to availability and inversely proportional to privacy protection. NIDS-FLGDP therefore sets the optimal values of δ, σ, C, and ε to 10⁻⁵, 0.2, 0.4, and 0.825 under Gaussian-mechanism differential privacy; training on the KDD CUP99, NSL_KDD, and UNSW_NB15 datasets then achieves optimal accuracy with privacy preservation.
In the client influence experiment, runs with 2, 3, 4, and 5 participating clients verify the stability of NIDS-FLGDP; the accuracy and loss of the model under each client count yield the optimal number of participating clients, as shown in FIG. 4 and FIG. 5.
The experimental results in FIG. 4 and FIG. 5 show that as the number of clients increases, the accuracy of NIDS-FLGDP first increases and then decreases. As more clients participate in federated training, the data participating in model training grow, which benefits the learning of NIDS-FLGDP and yields higher accuracy. However, as the number of clients continues to grow, the data NIDS-FLGDP must learn increase, the communication and computation overhead of training becomes large, convergence slows, and because the participating clients share model parameters, too many clients can affect the overall stability of the accuracy, so the final accuracy is not high. For NIDS-FLGDP, the optimal number of clients is therefore 4, at which the accuracy stabilizes at the highest level.
In the model performance verification experiment, with the optimal related parameters set, NIDS-FLGDP is trained on the KDD CUP99, NSL_KDD, and UNSW_NB15 datasets under binary and multi-class scenarios to verify its model performance and applicability. In both scenarios, the ACC, PR, DR, F1, and FPR of NIDS-FLGDP are obtained by training on the KDD CUP99, NSL_KDD, and UNSW_NB15 datasets respectively, as shown in FIG. 6.
The experimental results shown in fig. 6 demonstrate that NIDS-FLGDP trained on the three datasets achieves high ACC, PR, DR, F1 and FPR in both the two-class and multi-class scenarios. The classification accuracy of NIDS-FLGDP for each traffic type in the KDD CUP99, NSL_KDD and UNSW_NB15 datasets in the two scenarios is shown in fig. 7 and 8.
As shown in the experimental results of fig. 7 and 8, the classification accuracy of NIDS-FLGDP for each traffic type across the three datasets is high in both the two-class and multi-class scenarios.
In the comparative experiment, NIDS-FLGDP used the optimal settings δ, σ, C and ε of 10⁻⁵, 0.2, 0.4 and 0.825 respectively, with 50 iteration rounds and 4 federated learning clients; the feasibility of NIDS-FLGDP was verified by comparison with the methods proposed in previous studies. The ACC, PR, DR, F1 and FPR obtained by training each model on the KDD CUP99, NSL_KDD and UNSW_NB15 datasets are compared, and the experimental results are shown in table 6 below.
Table 6 Performance comparison of multiple models
As can be seen from the comparative results in table 6, NIDS-FLGDP trained on the KDD CUP99 dataset obtains ACC and DR superior to the other models, with F1 and FPR approaching the optimal values of 0.961 and 0.033, respectively. The ACC, DR, F1 and FPR obtained on NSL_KDD are superior to the other models, with PR near the optimal value of 0.973. On the UNSW_NB15 dataset, ACC, PR, DR, F1 and FPR are all superior to the other models. NIDS-FLGDP therefore exhibits high model performance, and its feasibility is well verified.
The experiments fully demonstrate that NIDS-FLGDP finds a balance between convergence performance and privacy protection level. Under Gaussian-mechanism differential privacy, the optimal values of δ, σ, C and ε are set to 10⁻⁵, 0.2, 0.4 and 0.825. For this fixed optimal privacy protection level, with 50 iteration rounds, model performance can be improved by increasing the total number K of clients participating in federated learning, and K = 4 gives NIDS-FLGDP the best convergence performance. The improved FedAvg algorithm reduces communication overhead, improves communication efficiency and strengthens privacy protection. In both two-class and multi-class scenarios, training NIDS-FLGDP on the KDD CUP99, NSL_KDD and UNSW_NB15 datasets yields high model performance, further verifying the performance and applicability of NIDS-FLGDP.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (8)
1. A network intrusion detection method based on Gaussian differential privacy federated learning, characterized by comprising the following steps:
step one: first constructing a client-server architecture based on federated learning; each client then downloads model parameters from the server as local parameters, constructs a local model, updates the local model parameters, and trains the local model on its local database within the client-server architecture;
step two: while the local model is trained in step one, the gradient information in the local model is clipped and noised at the client using Gaussian-mechanism differential privacy; when the number of local training iterations reaches a preset value, the trained local model parameters are uploaded to the server;
step three: after receiving the trained local model parameters, the server aggregates the local model parameters received from the clients using an improved FedAvg algorithm, obtains new global model parameters, and updates the global model with them;
step four: the server updates the global model to obtain updated model parameters and sends them to the clients by broadcasting;
step five: on receiving the updated model parameters, the clients repeat the local model training until the preset number of federated learning training rounds is completed; the global model is updated accordingly and used to detect network intrusion.
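The five steps of claim 1 can be sketched as a single federated round. This is an illustrative sketch, not the patented implementation: the function names, the noise scale σ·C, and the plain (unweighted) averaging are assumptions for demonstration.

```python
import numpy as np

def clip_and_noise(grad, C, sigma):
    # Step two: clip the gradient to L2 norm C, then add Gaussian noise
    # (Gaussian-mechanism differential privacy; sigma*C scale is an assumption).
    norm = np.linalg.norm(grad)
    clipped = grad / max(1.0, norm / C)
    return clipped + np.random.normal(0.0, sigma * C, size=grad.shape)

def local_train(w, data, epochs, lr, C, sigma, grad_fn):
    # Steps one-two: a client trains a local copy of the downloaded global
    # parameters on its local database with clipped, noised gradients.
    w = w.copy()
    for _ in range(epochs):
        for x, y in data:
            g = clip_and_noise(grad_fn(w, x, y), C, sigma)
            w -= lr * g
    return w

def federated_round(w_global, clients, **kw):
    # Steps three-four: the server aggregates the uploaded local parameters
    # and the result is broadcast back as the new global model.
    local_params = [local_train(w_global, c, **kw) for c in clients]
    return np.mean(local_params, axis=0)
```

Repeating `federated_round` for the preset number of rounds (step five) yields the final global model used for intrusion detection.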
2. The network intrusion detection method based on Gaussian differential privacy federated learning according to claim 1, wherein: in step one, the local parameters are the global model parameters received by client i from the server; gradient descent is performed on them, thereby realizing local model training.
3. The network intrusion detection method based on Gaussian differential privacy federated learning according to claim 1, wherein: in step one, the local model is trained on the KDD CUP99, NSL_KDD and UNSW_NB15 datasets and participates in joint training with an improved 1D CNN.
4. The network intrusion detection method based on Gaussian differential privacy federated learning according to claim 2, wherein: in step two, when the clipping and noise-adding operations are performed, Gaussian noise N is added at each client, which is used to perturb the local model during training.
5. The network intrusion detection method based on Gaussian differential privacy federated learning according to claim 1, wherein: in step three, aggregation with the improved FedAvg algorithm proceeds as follows: the received local model parameters are weighted-averaged and added to the model parameters from the previous round of aggregation to obtain the updated global model, and the global model is updated according to the global model parameters so obtained.
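The aggregation of claim 5 can be sketched as follows. The weighting by client dataset size and the blending coefficient `alpha` are assumptions; the claim specifies only that a weighted average of the local parameters is combined with the previous round's aggregated parameters.

```python
import numpy as np

def improved_fedavg(w_prev, client_params, client_sizes, alpha=0.5):
    # Weighted average of the uploaded local parameters (weights assumed to be
    # each client's dataset size), blended with the previous round's global
    # parameters; alpha is an assumed blending coefficient.
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    avg = sum(wgt * w for wgt, w in zip(weights, client_params))
    return alpha * w_prev + (1 - alpha) * avg
```

With equal client sizes this reduces to a plain average of the local parameters blended with the previous global model.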
6. The network intrusion detection method based on Gaussian differential privacy federated learning according to claim 1, wherein: in step three, during aggregation: the client receives initial model parameters w0 from the server and computes a gradient g using the parameters w and its local data; the local model learning rate is η and the loss function is l; the local data is divided into batches X, and for each batch x ∈ X gradient descent is performed: g = ∇l(w; x); Gaussian noise generated from the distribution N(0, σ²) with standard deviation σ is added, and the local model is updated as w = w − η·g + N(0, σ²); the client then sends the updated model parameters to the server.
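The local update of claim 6 can be sketched directly. The gradient function `loss_grad` is a placeholder for ∇l(w; x); its concrete form depends on the model and is an assumption here.

```python
import numpy as np

def dp_local_update(w, batches, loss_grad, eta, sigma):
    # Claim-6 style update: for each batch x in X, compute the gradient g of
    # the loss l at w, take a gradient-descent step with learning rate eta,
    # and perturb the parameters with noise drawn from N(0, sigma^2).
    w = w.copy()
    for x in batches:
        g = loss_grad(w, x)
        noise = np.random.normal(0.0, sigma, size=w.shape)
        w = w - eta * g + noise
    return w
```

Setting `sigma=0` recovers plain mini-batch gradient descent, which makes the role of the Gaussian perturbation explicit.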
7. The network intrusion detection method based on Gaussian differential privacy federated learning according to claim 1, wherein: in step three, during aggregation, the server selects m clients for the current training round, broadcasts the current model parameters to the m clients participating in that round, and receives the local model parameters w1, w2, …, wm from them; noise data is added in the global model aggregation operation, the global model is updated, and it is broadcast to the clients awaiting the next round of computation.
8. The network intrusion detection method based on Gaussian differential privacy federated learning according to claim 1, wherein: in step five, after each client repeatedly trains its local model, the updated model parameters are uploaded to the server, and the server aggregates the updated parameters of the plurality of clients to update the global model.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310376798.9A CN117834165A (en) | 2023-04-11 | 2023-04-11 | A network intrusion detection method based on Gaussian differential privacy federated learning |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN117834165A true CN117834165A (en) | 2024-04-05 |
Family
ID=90504552
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310376798.9A Pending CN117834165A (en) | 2023-04-11 | 2023-04-11 | A network intrusion detection method based on Gaussian differential privacy federated learning |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117834165A (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118341841A (en) * | 2024-04-17 | 2024-07-16 | 石家庄洋旺机电技术有限公司 | Privacy-preserving cold rolling force prediction method based on deep belief network and federated learning |
| CN119250545A (en) * | 2024-12-06 | 2025-01-03 | 西华大学 | Urban multi-category safety risk early warning method and system based on federated hypernetwork |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |