Background
With the development of networking and the explosive growth of data, the big data age has arrived. Big data appears in many aspects of daily life, including information recording, trend analysis, and digital online services. Usually, this data is stored at a cloud service end, and data management is performed by a cloud service provider. However, the cloud server is not fully trusted in a cloud computing environment: for example, it may delete rarely accessed files to save storage overhead. Moreover, when an enterprise or individual outsources data to the cloud, it loses full control over that data, which raises the problem of verifying the security and integrity of the data. How to establish an effective and safe mechanism for verifying the integrity of outsourced big data has therefore become an urgent problem to be solved.
In 2007, Ateniese et al. put forward the concept of data security auditing in the document "Provable data possession at untrusted stores" and constructed two data auditing schemes using homomorphic verifiable tags, which can be safely applied to data integrity verification in a cloud environment. Later, Wang et al., in "Enabling public auditability and data dynamics for storage security in cloud computing," devised a remote data integrity verification scheme based on bilinear aggregate signatures and Merkle hash trees. In 2015, Yu et al. put forward a data auditing scheme based on random sampling in the document "Remote data possession checking with enhanced security for cloud storage," which resolved some security defects of earlier research and realized data integrity verification. However, since these schemes are based on public key cryptography or employ bilinear pairings, their verification processes are computationally expensive.
In order to improve the efficiency of data integrity verification, some documents use algebraic signatures to construct data auditing schemes. For example, Chen et al. proposed a data auditing scheme based on algebraic signatures in "Using algebraic signatures to check data possession in cloud storage," but the scheme cannot resist replay attacks and malicious server deletion attacks, and it does not introduce a third-party auditor, which imposes a large computational burden on the data owner. Recently, Sookhak et al. designed an outsourced data auditing scheme based on algebraic signatures in the document "Dynamic remote data auditing for securing big data storage in cloud computing," which effectively realizes data integrity verification. However, this scheme is insecure: it is vulnerable to replay attacks, and because the challenge information contains no random value, the proof information generated by the cloud server is not completely random.
So far, most data auditing schemes either incur large computation costs or have security problems; how to design an efficient and safe outsourced big data auditing scheme has therefore become a key problem.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing an efficient and safe outsourced big data auditing method in a cloud environment that flexibly realizes the security auditing of outsourced big data. By introducing a trusted third-party auditor (TTPA), the invention effectively reduces the computation expense of the data owner. The security proof shows that the invention is provably secure under a security model and can resist various attacks. Performance analysis shows that, compared with existing schemes, the method has lower computation, storage, and communication costs, and effectively realizes integrity verification of outsourced big data in the cloud environment.
The purpose of the invention is realized by the following technical scheme: an efficient and safe outsourcing big data auditing method in a cloud environment comprises the following steps:
S1, when the data owner DO wants to outsource the storage of multiple files to the cloud server, he first runs the system setup algorithm Setup(1^λ) to obtain the system parameters params and a private key sk, where sk is kept secret by DO;
S2, the data owner DO executes the tag generation algorithm TagBlock(params, sk, F) to generate a file tag T_i for each file block; DO uploads the file blocks and the corresponding tags to the cloud server CS;
S3, once the data owner DO wants to check whether the file is completely saved in the cloud server CS, he authorizes the TTPA to perform the audit; the TTPA calls the challenge algorithm Challenge, generates the challenge information chal, and transmits it to the cloud server CS;
S4, according to the received chal, the cloud server CS calls the proof algorithm Proof(params, F, T_i, chal) to obtain the proof information prf and returns it to the TTPA;
S5, finally, the TTPA verifies the integrity of the file by calling the verification algorithm Verification(params, sk, chal, prf); if the output of the verification algorithm is 1, prf is valid and the file is completely stored in the cloud server; otherwise the output is 0, indicating that the file is corrupted.
Preferably, the method comprises a preprocessing stage, in which a data segmentation technique is used to divide the file F into n blocks, each represented as F[i] (1 ≤ i ≤ n).
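As a minimal sketch of this preprocessing stage, the following splits a byte string into n fixed-size blocks; the block size and zero-padding of the final block are illustrative assumptions, since the patent does not fix concrete values:

```python
def split_into_blocks(data: bytes, block_size: int) -> list:
    """Split `data` into equal-sized blocks F[i], zero-padding the last one."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\x00")  # pad final block
    return blocks

F = b"outsourced big data example payload"
blocks = split_into_blocks(F, 8)
n = len(blocks)  # number of blocks F[i], 1 <= i <= n
```

Each block F[i] is then tagged and uploaded individually, which is what allows the auditor to challenge a random subset of blocks later.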
Preferably, in step S1, the system setup algorithm Setup(1^λ) proceeds as follows: the data owner DO first randomly chooses the keys k_1, k_2, k_3 and defines an XOR-homomorphic hash function f; second, DO randomly selects an element γ from the Galois field to define an algebraic signature S_γ; finally, DO outputs the system parameters params = {q, f, S_γ} and the private key sk = {k_1, k_2, k_3}.
Preferably, in step S2, the tag generation algorithm TagBlock(params, sk, F) proceeds as follows: DO first selects n random numbers; DO then attaches file information Info_i = {Ind_i || V_i || TS_i} to each file block, where Ind_i represents the index, V_i represents the version number with initial value 1, and TS_i represents the timestamp; DO then computes C_i for each block, and the tag of each file block is T_i = S_γ(C_i || Info_i); DO sends the file blocks and their tags to the cloud server CS and deletes the local file F; at the same time, DO sends k_2 to the trusted third-party auditor TTPA through a secure channel and sends k_3 to the CS.
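The tag-generation step above can be sketched as follows. The Info_i encoding and the keyed-hash stand-in for the algebraic signature S_γ are illustrative assumptions (the real scheme computes T_i = S_γ(C_i || Info_i) with the signature defined in Setup):

```python
import hashlib

def make_info(ind: int, version: int, ts: int) -> bytes:
    # Info_i = Ind_i || V_i || TS_i, with "||" as a literal separator here
    return f"{ind}||{version}||{ts}".encode()

def tag_block(key: bytes, c_i: bytes, info: bytes) -> bytes:
    # Stand-in for T_i = S_gamma(C_i || Info_i); a keyed hash is NOT the
    # scheme's algebraic signature, it only illustrates the data flow.
    return hashlib.sha256(key + c_i + b"||" + info).digest()

info_1 = make_info(ind=1, version=1, ts=1700000000)
t_1 = tag_block(b"k", b"C_1", info_1)
```

After tagging, DO can safely delete the local file: any later dispute is settled against the tags held by the CS.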
Preferably, in step S3, the challenge algorithm Challenge proceeds as follows: the trusted third-party auditor TTPA selects c random numbers and sends the challenge information chal to the CS, where each element of chal identifies a challenged file block.
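A minimal sketch of such a challenge generator is shown below. Pairing each challenged index with a fresh random nonce is an illustrative design choice, motivated by the Background's observation that challenges without random values allow non-random (replayable) proofs:

```python
import secrets

def challenge(n: int, c: int):
    """Sample c distinct block indices in [1, n], each with a fresh nonce."""
    idx = set()
    while len(idx) < c:
        idx.add(1 + secrets.randbelow(n))  # uniform index in [1, n]
    return [(i, secrets.token_bytes(8)) for i in sorted(idx)]

chal = challenge(n=1000, c=10)
```

Because the nonces differ on every audit, the CS cannot reuse a previously computed proof.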
Preferably, in step S4, the proof algorithm Proof(params, F, T_i, chal) proceeds as follows: after receiving the challenge information chal, the CS computes the proof over the challenged file blocks and their tags, and sends the result to the TTPA as the proof information prf.
Preferably, in step S5, the verification algorithm Verification(params, sk, chal, prf) proceeds as follows: after receiving the proof information prf, the TTPA first computes the required intermediate values from chal and the private key, and then verifies the integrity of the file F by checking whether the verification equation holds; if the equation holds, the verification algorithm outputs 1, indicating that the cloud server completely stores the file F; otherwise the output is 0, indicating that the file is corrupted.
Compared with the prior art, the invention has the following advantages and beneficial effects:
In a big data outsourcing scenario, the cloud server may not completely preserve a user's data or files, for example in order to save storage overhead, and many existing integrity verification schemes suffer from high computation cost and security problems. Therefore, to realize safe and effective data integrity verification in a cloud environment, the invention provides an outsourced data auditing scheme based on algebraic signatures and XOR-homomorphic functions. By adopting the algebraic signature technique, the invention achieves lower computation overhead and communication cost than existing schemes. In addition, the invention introduces a trusted third-party auditor TTPA that performs the big data integrity verification, which effectively reduces the computation expense of the data owner. The security proof shows that the proposed scheme is provably secure under a security model. Performance analysis shows that, compared with existing schemes, the method imposes less computation overhead on the server and effectively reduces the verifier's computation, storage, and communication costs during verification.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Example 1
Formalized definition of outsourcing big data auditing scheme:
An efficient and safe outsourced big data auditing scheme in a cloud environment is composed of the following five polynomial-time algorithms:
(1) System setup algorithm Setup(1^λ): input a security parameter λ; output the system public parameters params and a private key sk;
(2) Tag generation algorithm TagBlock(params, sk, F): input the system public parameters params, the private key sk, and a file F; suppose the file F is divided into n blocks, each represented as F[i] (1 ≤ i ≤ n); output a file tag T_i for each block;
(3) Challenge algorithm Challenge: output the challenge information chal;
(4) Proof algorithm Proof(params, F, T_i, chal): input the system public parameters params, the file F, the file tags T_i, and the challenge information chal; output the proof information prf;
(5) Verification algorithm Verification(params, sk, chal, prf): input the system public parameters params, the private key sk, the challenge information chal, and the proof information prf; output 0 or 1.
Scheme design:
In a cloud environment, a specific example of the outsourced big data auditing scheme is shown in fig. 1. The scheme involves three entities: a Cloud Server (CS), a Data Owner (DO), and a Trusted Third-Party Auditor (TTPA).
When the data owner DO wants to outsource the storage of multiple files to the cloud server, he first runs the system setup algorithm Setup(1^λ) to obtain the system parameters params and a private key sk, where sk is kept secret by DO; he then executes TagBlock(params, sk, F) to generate a file tag T_i for each file block. DO uploads these file blocks and the corresponding tags to the cloud server CS. Once DO wants to check whether the file is completely saved in the CS, he authorizes the TTPA to perform the audit. The TTPA calls the challenge algorithm Challenge, generates the challenge information chal, and passes it to the CS. According to the received chal, the cloud server CS calls the proof algorithm Proof(params, F, T_i, chal) to obtain the proof information prf and returns it to the TTPA. Finally, the TTPA verifies the integrity of the file by calling the verification algorithm Verification(params, sk, chal, prf). If the verification algorithm outputs 1, prf is valid and the file is completely saved in the cloud server.
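The interaction among the three entities can be sketched as the toy protocol below. The algorithm names follow the scheme, but the internals are stand-ins: the tag is a plain keyed hash rather than the algebraic signature, and the verifier holds the full key for simplicity, whereas the actual scheme distributes k_2 to the TTPA and k_3 to the CS:

```python
import hashlib
import random
import secrets

def Setup():
    params = {"q": 257}               # placeholder public parameter
    sk = secrets.token_bytes(16)      # private key kept by DO
    return params, sk

def TagBlock(params, sk, blocks):
    # Toy keyed tag per block; stand-in for T_i = S_gamma(C_i || Info_i).
    return [hashlib.sha256(sk + b).digest() for b in blocks]

def Challenge(n, c):
    return random.sample(range(n), c)  # c challenged block indices

def Proof(blocks, tags, chal):
    # CS returns the challenged blocks with their tags as the proof prf.
    return [(blocks[i], tags[i]) for i in chal]

def Verification(params, sk, chal, prf):
    ok = all(hashlib.sha256(sk + b).digest() == t for b, t in prf)
    return int(ok)                     # 1: intact, 0: corrupted

params, sk = Setup()
blocks = [b"block-%d" % i for i in range(8)]
tags = TagBlock(params, sk, blocks)    # DO uploads blocks + tags to CS
chal = Challenge(len(blocks), 3)       # TTPA -> CS
prf = Proof(blocks, tags, chal)        # CS -> TTPA
result = Verification(params, sk, chal, prf)
```

Here `result` is 1 when the CS still holds all challenged blocks; any tampering with a challenged block makes it 0.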
In the preprocessing stage, a data segmentation technique is adopted to divide the file F into n blocks, each represented as F[i] (1 ≤ i ≤ n). The outsourced big data auditing scheme provided by this patent comprises the following algorithms:
(1) System setup algorithm Setup(1^λ): the data owner DO first randomly chooses the keys k_1, k_2, k_3 and defines an XOR-homomorphic hash function f; second, DO randomly selects an element γ from the Galois field to define an algebraic signature S_γ; finally, DO outputs the system parameters params = {q, f, S_γ} and the private key sk = {k_1, k_2, k_3}, where q represents a large prime number and f represents the XOR-homomorphic hash function.
(2) Tag generation algorithm TagBlock(params, sk, F): DO first selects n random numbers; DO then attaches file information Info_i = {Ind_i || V_i || TS_i} to each file block, where Ind_i represents the index, V_i represents the version number with initial value 1, and TS_i represents the timestamp; DO then computes C_i for each block, and the tag of each file block is T_i = S_γ(C_i || Info_i). DO sends the file blocks and their tags to the cloud server CS and deletes the local file F. At the same time, DO sends k_2 to the trusted third-party auditor TTPA through a secure channel and sends k_3 to the CS.
(3) Challenge algorithm Challenge: the trusted third-party auditor TTPA selects c random numbers and sends the challenge information chal to the CS, where each element of chal identifies a challenged file block.
(4) Proof algorithm Proof(params, F, T_i, chal): after receiving the challenge information chal, the CS computes the proof over the challenged file blocks and their tags, and sends the result to the TTPA as the proof information prf.
(5) Verification algorithm Verification(params, sk, chal, prf): after receiving the proof information prf, the TTPA first computes the required intermediate values from chal and the private key, and then verifies the integrity of the file F by checking whether the verification equation holds. If the equation holds, the verification algorithm outputs 1, indicating that the cloud server has completely saved the file F; otherwise the output is 0, indicating that the file is corrupted. Fig. 2 illustrates the interaction between the TTPA and the CS in the challenge, proof, and verification algorithms of the proposed outsourced big data auditing scheme.
The correctness of the scheme is analyzed as follows:
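The correctness of algebraic-signature auditing schemes generally rests on the XOR-homomorphism of the signature: for S_γ(d) = ⊕_j d[j]·γ^j over a Galois field, S_γ(a ⊕ b) = S_γ(a) ⊕ S_γ(b), so the signature of an XOR-aggregate of blocks equals the XOR of their signatures. A toy illustration in GF(2^8) is given below; the field polynomial (the AES polynomial 0x11B) and the choice of γ are illustrative assumptions:

```python
def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce modulo the field polynomial
        b >>= 1
    return r

def alg_sig(block: bytes, gamma: int) -> int:
    """Algebraic signature S_gamma(d) = XOR_j d[j] * gamma^j in GF(2^8)."""
    sig, g = 0, 1
    for byte in block:
        sig ^= gf_mul(byte, g)  # accumulate d[j] * gamma^j
        g = gf_mul(g, gamma)    # next power of gamma
    return sig

a, b = b"\x12\x34\x56\x78", b"\xab\xcd\xef\x01"
xor_ab = bytes(x ^ y for x, y in zip(a, b))
gamma = 0x53
# XOR-homomorphism: S(a XOR b) == S(a) XOR S(b)
homomorphic = alg_sig(xor_ab, gamma) == alg_sig(a, gamma) ^ alg_sig(b, gamma)
```

This property is what lets the verifier check one aggregated value against the XOR of the returned tags instead of verifying every challenged block individually.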
In terms of computation, storage, and communication overhead, this embodiment compares the proposed scheme with documents [1]-[4], specifically: whether the encryption mechanism adopts public key encryption or symmetric encryption, the computation overhead of the server and the verifier, the communication overhead of the verification process, the storage overhead of the verifier, and whether verification is completed by a third-party auditor, as shown in Table 1. Here n represents the maximum number of encrypted files in the system; the server computation overhead primarily measures the cost of generating the proof information, and the verifier computation overhead primarily measures the cost of the verification process.
Table 1 Comparison of the proposed scheme with related schemes
As can be seen from Table 1, both the scheme proposed in this embodiment and documents [1]-[4] keep the verifier's storage overhead at O(1); however, the server computation overhead, verifier computation overhead, and verification communication overhead of this embodiment's scheme are also O(1), which is superior to documents [2]-[4] with O(log n) complexity and equal to document [1]. On the other hand, documents [1]-[3] are based on public key cryptography, while the scheme of this embodiment and document [4] are based on symmetric cryptography; in general, symmetric encryption mechanisms are less computationally expensive than public key encryption mechanisms. In addition, Table 1 shows that documents [1], [2], and [4] do not support third-party verification, i.e., the integrity verification must be performed by the data owner, whereas the scheme of this embodiment and document [3] introduce a third-party verifier, allowing a trusted third-party auditor to perform the data auditing operation and thereby greatly reducing the data owner's computation overhead. Taken together, the scheme proposed in this embodiment is superior to documents [1]-[4], with less computation, storage, and communication overhead.
Reference documents:
[1]Ateniese G,Burns R,Curtmola R,et al.Provable data possession at untrusted stores.In:Proceedings of the 14th ACM Conference on Computer and Communications Security,Alexandria,2007.598-609.
[2]Erway C C,Papamanthou C,Tamassia R.Dynamic provable data possession.ACM Trans Inf Syst Secur,2009,17:213-222.
[3]Wang Q,Wang C,Ren K,et al.Enabling public auditability and data dynamics for storage security in cloud computing.IEEE Trans Parall Distrib Syst,2011,22:847-859.
[4]Yu Y,Zhang Y,Ni J,et al.Remote data possession checking with enhanced security for cloud storage.Future Gener Comput Syst,2015,52:77-85.
the above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.