CN113987539B

CN113987539B - Federal learning model safety protection method and system based on safety shuffling and differential privacy

Info

Publication number: CN113987539B
Application number: CN202111270844.4A
Authority: CN
Inventors: 粟勇; 刘圣龙; 江伊雯; 刘文龙
Original assignee: CHINA REALTIME DATABASE CO LTD; Nanjing Nanrui Ruizhong Data Co ltd; State Grid Jiangsu Electric Power Co Ltd; State Grid Electric Power Research Institute; Big Data Center of State Grid Corp of China
Current assignee: CHINA REALTIME DATABASE CO LTD; Nanjing Nanrui Ruizhong Data Co ltd; State Grid Jiangsu Electric Power Co Ltd; State Grid Electric Power Research Institute; Big Data Center of State Grid Corp of China
Priority date: 2021-10-29
Filing date: 2021-10-29
Publication date: 2025-07-22
Anticipated expiration: 2041-10-29
Also published as: CN113987539A

Abstract

The invention discloses a federal learning model safety protection method and system based on safe shuffling and differential privacy.A federal model owner utilizes a differential privacy technology to noise federal learning model parameters to generate noisy model parameters, then utilizes a user authorization key and a safe shuffling algorithm to encrypt the model parameters, and sends the encrypted federal learning model parameters to a user; when a user locally uses the federal learning model, the user authorization key and the secure shuffling algorithm are firstly utilized to decrypt the model parameter ciphertext to obtain the federal learning model with noise, and the user can obtain a desired output result by taking own data as the input of the model. The invention not only protects the privacy of the original model, but also can effectively protect the safety of the original model and ensure that the user can obtain the available model using result.

Description

Federal learning model safety protection method and system based on safety shuffling and differential privacy

Technical Field

The invention belongs to the field of artificial intelligence, and particularly relates to a federal learning model safety protection method and system based on safe shuffling and differential privacy.

Background

Federal learning is an artificial intelligence technology which is being widely studied and used, and aims to develop high-efficiency machine learning among multiple participants or multiple computing nodes on the premise of guaranteeing information security during large data exchange, protecting terminal data and personal data privacy and guaranteeing legal compliance. Therefore, federal learning can solve the problem that data cannot go out of local machine learning tasks, so that training sample data privacy of cooperative participants is protected, and the problem of data island is solved. However, at present, although federal learning solves the data privacy problems of training samples of all participants, the use of the existing related differential privacy protection technology focuses on sample privacy and parameter privacy in the model training process, and can not solve the problem that federal learning models release locally used model safety privacy. As is well known, the bang learning model is a result of cooperation of multiple parties, the original federal learning model is a data asset of a model owner, and the problem that the privacy of release and use of the federal learning model is still to be solved is guaranteed. Therefore, how to safely release the original federal learning model and ensure the usability of the model by the user is an important technical difficulty.

Disclosure of Invention

Aiming at the defects of the prior art, the federal learning model safety protection method and system based on safe shuffling and differential privacy can protect the privacy of federal learning model owners and ensure the availability of federal models obtained by users.

The federal learning model safety protection method based on safe shuffling and differential privacy comprises the following steps:

(1) Noise is added to the model parameters learned by the Union based on a differential privacy Gaussian mechanism, and model parameters with noise are generated;

(2) Encrypting the model parameters subjected to differential privacy noise adding by using a user authorization key and a secure shuffling algorithm, and transmitting the encrypted federal learning model parameters to a user;

(3) Decrypting the model parameter ciphertext by using the user authorization key and the secure shuffling algorithm to obtain a noisy federal learning model;

(4) And taking the data of the user as the input of the federal learning model with noise to obtain a desired output result.

Further, the implementation process of the step (1) is as follows:

Parameters of the federal learning model pi form an m×n matrix a _Π, and each element in the matrix a _Π is subjected to noise processing by using a gaussian mechanism in differential privacy to obtain a parameter matrix a '_Π of the federal learning model pi' with noise:

A′_Π(i,j)＝A_Π(i,j)+α (1)

wherein the Gaussian mechanism provides relaxed (epsilon, delta) -differential privacy, noise ratio sigma is greater than or equal to cDeltas/epsilon, constant Epsilon (0, 1); sensitivityThe gaussian noise distribution α -N (0, σ ²) satisfies (epsilon, delta) -differential privacy, α being the noise value added by each data in the matrix, sensitivity representing the largest difference in the output of the query function s for adjacent data sets.

Further, the step (2) includes the steps of:

(21) Reading an m multiplied by n noisy federal learning model parameter matrix A _Π;

(22) Initializing logic mapping control parameters s, d, f and g, wherein s is a chaotic control parameter, d and f are logic mapping control parameters, x _n、y_n is mapped respectively, g is a coupling item, and an initial iterative mapping item=200 is given to key= { x, y }, wherein x and y are two initial values of the chaotic mapping;

(23) Taking the key as an initial value, obtaining m multiplied by n pairs of chaotic sequence values by iteratively mapping m multiplied by n+iter pairs of chaotic sequence values and discarding the iter pairs of chaotic sequence values, and respectively storing the m multiplied by n pairs of chaotic sequence values in one-dimensional arrays P and Q with the size of m multiplied by n:

(24) Operating the following secure shuffling algorithm 1 on elements in P and Q to obtain two integer value one-dimensional matrixes P 'and Q';

(25) Ordering by P 'and Q' to generate two one-dimensional pseudorandom sequence matrixes P 'and Q' with length of m multiplied by n, wherein the element values of the matrixes P 'and Q' are different integers in [0, m multiplied by n-1 ];

(26) The following transformation is performed on each element P "(k) and Q" (k) in the one-dimensional random sequences P "and Q", and mapped into a two-dimensional scrambling matrix X, Y of size mxn;

wherein x (i, j) and y (i, j) are elements of the two-dimensional scrambling matrix X, Y, respectively;

(27) And then, carrying out position scrambling on the temporary model parameter scrambling intermediate result by using Y to obtain a final federal learning model parameter scrambling ciphertext with noise.

Further, the implementation process of the step (3) is as follows:

(31) Given the same key= { x, y } as the encryption process, generating a scrambling matrix X, Y from the key x, y;

(32) And firstly scrambling the noisy federal learning model parameter scrambling ciphertext by using a scrambling matrix Y to obtain a temporary model parameter scrambling intermediate result, and then carrying out position scrambling on the temporary model parameter scrambling intermediate result by using X to obtain a noisy parameter model.

Based on the same inventive concept, the invention further provides a federal learning model safety protection system based on safe shuffling and differential privacy, which comprises a parameter processing module, an encryption module and a decryption module, wherein the parameter processing module is used for adding noise to the model parameters learned by the Union based on a differential privacy Gaussian mechanism to generate model parameters with noise, the encryption module is used for encrypting the model parameters subjected to differential privacy noise adding by using a user authorization key and a safe shuffling algorithm and sending the encrypted federal learning model parameters to a user, and the decryption module is used for decrypting the model parameter ciphertext by using the user authorization key and the safe shuffling algorithm to obtain the federal learning model with noise.

Further, the working process of the parameter processing module is as follows:

A′_Π(i,j)＝A_Π(i,j)+α (1)

Further, the working process of the encryption module is as follows:

(S1) reading an m multiplied by n noisy federal learning model parameter matrix A _Π;

(S2) initializing logic mapping control parameters S, d, f and g, wherein S is a chaotic control parameter, d and f are logic mapping control parameters, x _n、y_n is mapped respectively, g is a coupling term, and an initial iterative mapping item=200, and a key= { x, y }, wherein x and y are two initial values of the chaotic mapping;

(S3) taking the key as an initial value, obtaining m multiplied by n pairs of chaotic sequence values by iteratively mapping m multiplied by n+iter pairs of chaotic sequence values and discarding the iter pairs of chaotic sequence values, and storing the m multiplied by n pairs of chaotic sequence values in one-dimensional arrays P and Q respectively:

(S4) operating the following secure shuffling algorithm 1 on the elements in P and Q to obtain two integer value one-dimensional matrixes P 'and Q';

(S5) ordering by P 'and Q' to generate two one-dimensional pseudo-random sequence matrixes P 'and Q' with the length of m multiplied by n, wherein the element values of the matrixes P 'and Q' are different integers in [0, m multiplied by n-1 ];

(S6) performing the following transformation on each element P "(k) and Q" (k) in the one-dimensional random sequences P "and Q", and mapping it into a two-dimensional scrambling matrix X, Y of size mxn;

And (S7) firstly scrambling the matrix A _Π by using a scrambling matrix X to obtain a temporary model parameter scrambling intermediate result, and then carrying out position scrambling on the temporary model parameter scrambling intermediate result by using a scrambling matrix Y to obtain a final federal learning model parameter scrambling ciphertext with noise.

Further, the working process of the decryption module is as follows:

(H1) Given the same key= { x, y } as the encryption process, generating a scrambling matrix X, Y from the key x, y;

(H2) And firstly scrambling the noisy federal learning model parameter scrambling ciphertext by using a scrambling matrix Y to obtain a temporary model parameter scrambling intermediate result, and then carrying out position scrambling on the temporary model parameter scrambling intermediate result by using X to obtain a noisy parameter model.

Compared with the prior art, the method has the beneficial effects that the model owner utilizes differential privacy to protect the privacy of a real federal learning model, noise is added to the trained model parameters in the aspect of noise addition, the interference to a neural network training model is avoided, the usability of the federal learning model with noise is ensured, the combination of safe shuffling and an authorized key ensures that only an authorized user can obtain the federal learning model with noise, and therefore the safe release of the federal learning model for local use is achieved.

Drawings

FIG. 1 is a flow chart of the present invention;

FIG. 2 is a flow chart of forward shuffling of the federal learning model;

Fig. 3 is a flow chart of reverse shuffling of the federal learning model.

Detailed Description

The invention is described in further detail below with reference to the accompanying drawings.

The invention provides a federal learning model release local use safety protection method based on safety shuffling and differential privacy, which protects model assets of model owners and ensures availability of federal learning models obtained by users. As shown in fig. 1, the method specifically comprises the following steps:

and step 1, adding noise to federally learned model parameters by using differential privacy by a federal model owner to generate noisy model parameters. The parameters are described in Table 1:

Table 1 parameter description

Assuming that parameters of the federal learning model pi form an mxn matrix a _Π, performing noise processing on each element in the matrix a _Π by using a gaussian mechanism in differential privacy to obtain a parameter matrix a _Π of the federal learning model pi' with noise, and the mathematical formula is as follows:

A′_Π(i,j)＝A_Π(i,j)+α (1)

The Gaussian mechanism can provide relaxed (epsilon, delta) -differential privacy, and in order to ensure that the increased Gaussian noise distribution alpha-N (0, sigma ²) meets the (epsilon, delta) -differential privacy, alpha is the noise value increased by each data in the matrix, the noise proportion sigma is set to be equal to or larger than cDeltas/[ epsilon ], wherein the constant E (0, 1), sensitivity Sensitivity represents the maximum difference in the output of the query function s for adjacent datasets.

And 2, encrypting the model parameters by using the user authorization key and the secure shuffling algorithm, and transmitting the encrypted federal learning model parameters to the user, as shown in fig. 2.

1) The m×n noisy federal learning model parameter matrix a _Π is read in.

2) Initializing logical mapping control parameters s=4, d=0.9, f=0.9, g=0.1, and initial iterative mapping iter=200, given key= { x, y }, where x and y are two initial values of the chaotic map.

3) Taking the key as an initial value, obtaining m multiplied by n pairs of chaotic sequence values by iteratively mapping m multiplied by n+iter pairs of chaotic sequence values and discarding the iter pairs of chaotic sequence values, and respectively storing the m multiplied by n pairs of chaotic sequence values in one-dimensional arrays P and Q with the size of m multiplied by n, wherein the following formula is as follows:

4) The following secure shuffling algorithm operation is performed on the elements in P and Q to obtain two integer valued one-dimensional matrices P 'and Q', as shown in table 2:

Table 2 secure shuffling algorithm

5) And ordering by P 'and Q' to generate two one-dimensional pseudorandom sequence matrixes P 'and Q' with the length of m multiplied by n, wherein the element values of the two one-dimensional pseudorandom sequence matrixes P 'and Q' are different integers in [0, m multiplied by n-1 ].

6) The following transforms are performed for each element P "(k) and Q" (k) in the one-dimensional random sequences P "and Q", and mapped into a two-dimensional scrambling matrix X, Y of size mxn:

where x (i, j) and y (i, j) are elements of the two-dimensional scrambling matrix X, Y, respectively.

7) And firstly scrambling the matrix A _Π by using a scrambling matrix X to obtain a temporary model parameter scrambling intermediate result, and then scrambling the position of the temporary model parameter scrambling intermediate result by using a scrambling matrix Y to obtain a final federal learning model parameter scrambling ciphertext with noise.

And 3, when the user locally uses the federal learning model, decrypting the model parameter ciphertext by using the user authorization key and the secure shuffling algorithm to obtain a noisy federal learning model, wherein the user takes the data as the input of the noisy federal learning model to obtain a desired output result as shown in fig. 3.

After receiving the noisy federal learning model parameter scrambling ciphertext and the key = { x, y }, the user needs to execute the inverse operation of the federal learning model owner to operate the noisy parameter model pi, and uses the model to process own data. The specific process is as follows:

1) Given that the key= { x, y } which is the same as the encryption process of the owner of the federal learning model is the same as the encryption process of the owner, generating a scrambling matrix X, Y by the key x, y;

2) And firstly scrambling the noisy federal learning model parameter scrambling ciphertext by using a scrambling matrix Y to obtain a temporary model parameter scrambling intermediate result, and then scrambling the temporary model parameter scrambling intermediate result by using X to obtain a usable noisy parameter model.

The working process of the parameter processing module is as follows:

A′_Π(i,j)＝A_Π(i,j)+α (1)

The working process of the encryption module is as follows:

The working process of the decryption module is as follows:

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the specific embodiments of the present invention without departing from the spirit and scope of the present invention, and any modifications and equivalents are intended to be included in the scope of the claims of the present invention.

Claims

1. The federal learning model safety protection method based on safe shuffling and differential privacy is characterized by comprising the following steps of:

(4) Taking the data of the user as the input of the federal learning model with noise to obtain a desired output result;

The step (2) comprises the following steps:

(22) Initializing logic mapping control parameters s, d, f and g, wherein s is a chaos control parameter, d and f are logic mapping control parameters, mapping is carried out on a chaos sequence value x _n、y_n respectively, g is a coupling item, and initial iterative mapping iter=200, and a key= { x, y } is given, wherein x and y are two initial values of chaos mapping;

(23) Taking the key as an initial value, obtaining m multiplied by n chaotic sequence values x _n、y_n by iteratively mapping m multiplied by n+iter pairs of chaotic sequence values and discarding the iter pairs of chaotic sequence values, and respectively storing the m multiplied by n chaotic sequence values in one-dimensional arrays P and Q with the size of m multiplied by n:

(24) Running a safe shuffling algorithm on elements in P and Q, and calculating to obtain two integer value one-dimensional matrixes P 'and Q';

2. The federal learning model security protection method based on secure shuffling and differential privacy according to claim 1, wherein the implementation procedure of step (1) is as follows:

A'_Π(i,j)＝A_Π(i,j)+α (1)

wherein the Gaussian mechanism provides relaxed (epsilon, delta) -differential privacy, noise ratio sigma is greater than or equal to cDeltas/epsilon, constant Sensitivity toThe gaussian noise distribution α -N (0, σ ²) satisfies (epsilon, delta) -differential privacy, α is the noise value added by each data in the matrix, D is the dataset, D' represents the neighbor dataset that differs from D by only one record, and sensitivity represents the maximum difference in the output of the query function s for the neighbor dataset.

3. The federal learning model security protection method based on secure shuffling and differential privacy according to claim 1, wherein the implementation procedure of step (3) is as follows:

(31) Given a key = { x, y }, which is the same as the encryption process of the owner of the federal learning model, a scrambling matrix X, Y is generated from the key x, y, which is the same as the encryption process;

4. A federal learning model safety protection system based on safe shuffling and differential privacy, which adopts the method as defined in any one of claims 1 to 3, comprises a parameter processing module, an encryption module and a decryption module, wherein the parameter processing module is used for carrying out noise adding on model parameters learned by the Union based on a differential privacy Gaussian mechanism to generate model parameters with noise, the encryption module is used for encrypting the model parameters subjected to the noise adding of the differential privacy by using a user authorization key and a safe shuffling algorithm and sending the encrypted federal learning model parameters to a user, and the decryption module is used for decrypting model parameter ciphertext by using the user authorization key and the safe shuffling algorithm to obtain the federal learning model with noise.

5. The federal learning model security protection system based on secure shuffling and differential privacy of claim 4, wherein the parameter processing module operates as follows:

A'_Π(i,j)＝A_Π(i,j)+α (1)

6. The federal learning model security protection system based on secure shuffling and differential privacy of claim 4, wherein the cryptographic module operates as follows:

(S3) taking a key as an initial value, obtaining m multiplied by n chaotic sequence values x _n、y_n by iteratively mapping m multiplied by n+iter pairs of chaotic sequence values and discarding the iter pairs of chaotic sequence values, and storing the m multiplied by n chaotic sequence values in one-dimensional arrays P and Q with the size of m multiplied by n respectively:

(S4) running a secure shuffling algorithm on elements in the P and the Q, and calculating to obtain two integer value one-dimensional matrixes P 'and Q';

7. The federal learning model security protection system based on secure shuffling and differential privacy of claim 4, wherein the decryption module operates as follows: