Disclosure of Invention
The invention discloses a tamper-proof database method based on a trusted computing environment, which comprises the following steps:
S1, receiving incoming data over the TLS secure communication protocol, performing preliminary verification and cleaning on the data, generating a verification key by a trusted platform module TPM, and storing the verification key in a secure storage area of the TPM;
S2, loading the verified data into a trusted execution environment TEE, generating an encryption key, and storing the encryption key in an internal storage area of the TEE;
S3, performing an integrity check on the encrypted data in the TEE;
S4, transmitting the checked data out of the TEE over the TLS secure communication protocol to a tamper-proof database for storage.
Preferably, the TPM in S1 generates the verification key and verifies the integrity of the data at the hardware level by comparing the hash value of the actual data with the expected hash value, thereby checking whether the data of critical startup and runtime software has been tampered with.
Preferably, the TEE uses Intel Software Guard Extensions (SGX) to generate the encryption key, which specifically includes the following steps:
S2.1, creating a secure Enclave in SGX, and loading the data into the Enclave;
S2.2, generating a random initialization vector in the Enclave through a hardware random number generator RNG;
S2.3, generating an encryption key in the Enclave, encrypting the data with the AES-256 algorithm in GCM mode, and storing the encryption key in the Enclave.
Preferably, the integrity check in S3 includes the following steps:
S3.1, generating a hash value for the data through a hash algorithm, and storing the hash value or attaching it to the data;
S3.2, before the data is accessed or transmitted, recalculating the hash value of the current data and comparing it with the original hash value to ensure that the data has not been modified;
S3.3, verifying the source and the integrity of the data by means of a digital signature, wherein the data client signs the data or its hash value with its private key, and the signature is verified with the public key of the sender, which shows that the data has not been tampered with since it was signed.
Preferably, storing the data in the tamper-resistant database in S4 includes the following steps:
S4.1, generating a storage key through a hardware security module HSM, and encrypting the data in the tamper-resistant database with a symmetric encryption algorithm using the generated storage key;
S4.2, calculating a hash value of the data, and organizing each data item as a unique key and a value according to a key-value (K-V) storage model;
S4.3, linking the hash value of each data item to the hash value of the previous data item, and storing each data item together with its calculated hash value to form a hash chain;
S4.4, storing the encrypted data in the tamper-proof database by means of the hash chain.
The invention further discloses a tamper-resistant database system based on a trusted computing environment, which comprises a trusted computing layer, a data processing layer, a data storage layer and a system interaction layer;
the trusted computing layer comprises a TPM and an SGX, wherein the TPM generates a verification key from the data, stores the verification key in a secure storage area, manages access control over the data, identifies authorized application programs and users by means of the verification key so that only they can decrypt and access sensitive data, and records an operation log;
the SGX creates a secure Enclave to isolate the execution environment, generates an encryption key for the data in the Enclave, and processes and stores the sensitive data and the encryption key;
the data processing layer comprises a preliminary verification module, a data cleaning module and a data integrity verification module;
The preliminary verification module performs data type and structure verification, data size and value range verification, and malicious content and replay attack detection;
the data cleaning module identifies duplicate records and unifies data formats, splits and merges fields, corrects logic errors, detects and handles outliers, and checks validity and consistency;
The data integrity verification module verifies the data integrity by generating a unique hash value for the data through a hash algorithm;
The data storage layer comprises a data storage module and a data query module;
the data storage module stores data into the tamper-proof database through a K-V pair storage model;
The data query module acquires data stored in the tamper-resistant database through a query API;
The system interaction layer comprises an internal API, an external API and a security calling module;
The internal API is used for data exchange between different structural layers, and allows each internal module to call a specific function through the API;
the external API provides an interface for external users to access and operate on the data in the database, and authenticates each caller so that only authorized access is admitted;
And the security calling module calls an Enclave interface of the SGX from the outside, starts a specific security task in the Enclave, and encrypts and decrypts data.
Preferably, the TPM further comprises a TPM remote attestation module and an operation protection module, wherein the TPM remote attestation module enables the server to prove data integrity to a remote client or a third-party service, so that the client, by verifying the security state of the server, can confirm that the data processing and storage environment meets the expected security standard, and the operation protection module uses the TPM to protect the software and hardware environment from tampering throughout the process from startup to execution.
Preferably, the SGX further includes an SGX remote attestation function, which allows the database to prove its identity and integrity to a remote user or application; the secure Enclave is cryptographically signed, so that a verifier can confirm that the data of the secure Enclave is secure and has not been tampered with;
data is automatically encrypted when stored outside the secure Enclave and is decrypted only inside the secure Enclave.
Preferably, the data query module further comprises a data verification mechanism, wherein, when a user submits a query, the tamper-proof database verifies the hash value of the query parameter to ensure that the query is valid: if the hash value of the query parameter matches the hash value stored in the tamper-proof database, the query operation is allowed and the stored data is returned; if it does not match, the query is refused, and the time of the query attempt and the operator information are recorded.
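The query check described above can be pictured with the following minimal Python sketch. It is an illustration under stated assumptions rather than the claimed implementation: the in-memory `STORE` and `AUDIT_LOG`, the function names `put` and `guarded_query`, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical in-memory stand-ins for the tamper-proof store and its audit log.
STORE = {}       # key -> (value_bytes, stored_hash_hex)
AUDIT_LOG = []   # refused attempts: (timestamp, operator, key)


def put(key: str, value: bytes) -> str:
    """Store a value together with its SHA-256 hash and return that hash."""
    digest = hashlib.sha256(value).hexdigest()
    STORE[key] = (value, digest)
    return digest


def guarded_query(key: str, claimed_hash: str, operator: str):
    """Return the value only if the hash supplied with the query matches the stored hash."""
    entry = STORE.get(key)
    if entry is not None and hmac.compare_digest(entry[1].encode(), claimed_hash.encode()):
        return entry[0]
    # Mismatch or unknown key: refuse, and record the time and the operator.
    AUDIT_LOG.append((datetime.now(timezone.utc).isoformat(), operator, key))
    return None
```

A caller that holds the hash returned by `put` receives the value back; any other hash yields `None` and leaves one entry in `AUDIT_LOG`.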
Preferably, the system interaction layer adopts a mutual (bidirectional) TLS protection mechanism, under which all transmitted data is encrypted through TLS over an established encrypted communication channel and only authenticated clients can submit data for processing or retrieve data; in addition, information on all API calls is recorded in a log, and a cache is provided for frequently requested data.
Compared with the prior art, the technical scheme of the application has the following technical effects:
According to the invention, incoming data is received over TLS and subjected to preliminary verification and cleaning, which addresses the problem that data may be tampered with or injected with malicious content during conventional data transmission. TLS ensures the encryption and integrity of the data in transit, and the preliminary verification and cleaning ensure the validity and security of the data by checking the data type, structure, size and value range and by preventing replay attacks. This improves the security of data transmission and reduces the burden of invalid or malicious data on the TEE, thereby improving the overall performance and reliability of the system.
By generating the verification key in the trusted platform module TPM and storing it in the secure storage area, the invention addresses the weaknesses of key management and data integrity verification in conventional systems. The TPM compares hash values of the data at the hardware level to ensure data integrity during startup and operation; because this hardware-level verification mechanism is difficult to defeat with software-level attacks, the security and trustworthiness of the system are greatly enhanced. In addition, the TPM supports a remote attestation function, so that a remote client or a third-party service can verify the system state and the data integrity, which further improves the transparency and trustworthiness of the system.
The invention creates a secure Enclave in Intel Software Guard Extensions (SGX) and generates and manages the encryption key inside the Enclave, which addresses the security risks of key management and data processing in conventional encryption technology. The secure Enclave provided by SGX ensures the isolation and confidentiality of data and code during execution, so that an attacker cannot access the sensitive data and execution logic in the Enclave even if the operating system is compromised, which improves the security of data processing; periodic key updates and a secure key discarding mechanism further strengthen the resistance of the system to attack. In addition, the SGX supports a remote attestation function, which allows a remote user or application to verify the identity and integrity of the secure Enclave, thereby improving the trustworthiness and security of the system.
By using a symmetric encryption algorithm and hash chain technology in the tamper-proof database, the invention addresses the security and integrity problems of data storage and transmission in conventional database technology. The tamper-proof database generates a storage key through the hardware security module HSM and encrypts the data with the generated key, which ensures the security of the data in storage. At the same time, calculating the hash value of the data and forming a hash chain ensures the tamper resistance and traceability of the data. The technical scheme not only improves the security of data storage but also, through detailed log records and the data verification mechanism, enhances the auditing capability and data recovery capability of the system, thereby providing a higher level of protection for data processing and storage.
The foregoing description is only an overview of the technical scheme of the present application. So that the technical means of the present application can be understood more clearly and practiced according to the teachings of this specification, and so that the above and other objects, features and advantages of the present application are easier to understand, preferred embodiments of the present application are described in detail below in conjunction with the accompanying drawings.
The above and other objects, advantages and features of the present application will become more apparent to those skilled in the art from the following detailed description of the specific embodiments of the present application when taken in conjunction with the accompanying drawings.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. In the following description, specific details such as specific configurations and components are provided merely to facilitate a thorough understanding of embodiments of the application. It will therefore be apparent to those skilled in the art that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the application. In addition, descriptions of well-known functions and constructions are omitted in the embodiments for clarity and conciseness.
It should be appreciated that reference throughout this specification to "one embodiment" or "this embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the "one embodiment" or "this embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the present application may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that A exists alone, that B exists alone, or that A and B exist simultaneously. The term "/and" herein describes another association between associated objects and indicates that two relationships may exist; for example, "A/and B" may indicate that A exists alone, or that A and B exist simultaneously. In addition, the character "/" herein generally indicates that the associated objects are in an "or" relationship.
The term "at least one" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "at least one of A and B" may mean that A exists alone, that A and B exist together, or that B exists alone.
It is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprise," "include," or any other variation thereof, are intended to cover a non-exclusive inclusion.
Example 1
This embodiment describes a tamper-resistant database system based on a trusted computing environment. As shown in FIG. 2, the system comprises a trusted computing layer, a data processing layer, a data storage layer and a system interaction layer;
The trusted computing layer comprises a TPM and an SGX;
The TPM is used as a secure cryptoprocessor and is used for securely storing encryption keys, digital certificates and other sensitive data and providing security functions at a hardware level, including encryption and signature operations;
The TPM can safely store and manage the key used for encrypting the database, ensure that the key cannot be easily acquired by external software or an attacker, and can manage access control, so that only authorized application programs and users can decrypt and access sensitive data;
the TPM is used for ensuring that the database file is not tampered, verifying the integrity of the database file when the database is started, and detecting whether the file is illegally modified by comparing the hash value of the actual file with the expected hash value, wherein the verification process is carried out at a hardware level, so that the verification process is difficult to be destroyed by software level attacks;
The TPM can record the database operation log for future security audit and post analysis, and can ensure the integrity of log data and prevent the log from being tampered, which is particularly important for tracking potential security events and data leakage;
The TPM further comprises a TPM remote attestation module and an operation protection module. The TPM remote attestation module enables the server to prove data integrity to a remote client or a third-party service, so that the client, by verifying the security state of the server, can confirm that the data processing and storage environment meets the expected security standard. The operation protection module uses the TPM to protect the software and hardware environment from tampering throughout the process from startup to execution.
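One way to picture the startup-to-execution protection is the extend operation that a TPM applies to its platform configuration registers (PCRs): each new measurement is folded into the previous register value, so the final value depends on every measured component and on the order of measurement. The sketch below simulates this in software only and does not communicate with a real TPM; the SHA-256 register size, the function names and the component byte strings are assumptions chosen for illustration.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR bank (assumed for this sketch)


def extend(pcr: bytes, component: bytes) -> bytes:
    """Simulate a PCR extend: new = SHA-256(old || SHA-256(component))."""
    measurement = hashlib.sha256(component).digest()
    return hashlib.sha256(pcr + measurement).digest()


def measure_chain(components) -> bytes:
    """Fold a startup sequence into one register value, starting from all zeros."""
    pcr = bytes(PCR_SIZE)
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


# Hypothetical startup chain: firmware, loader, database binary.
expected = measure_chain([b"firmware-v1", b"loader-v1", b"db-engine-v1"])
tampered = measure_chain([b"firmware-v1", b"loader-v1", b"db-engine-v1-patched"])
```

Comparing the value reported by the platform with `expected` is the kind of check a remote verifier could make; a modified component such as the one in `tampered` yields a different register value.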
The SGX provides an isolated execution environment for the database by creating a secure Enclave, wherein the Enclave is a protected memory area, and data and codes in the protected memory area are invisible even at an operating system level;
In SGX, data is automatically encrypted when stored outside the Enclave and can be decrypted only inside the Enclave, which ensures the security of the data during storage and transmission;
the operations of the database can be performed in the Enclave, which ensures the security and privacy of the data processing;
The SGX can record the operation log of the database; the log is generated and encrypted in the Enclave, which ensures the integrity and confidentiality of the log. Secure startup of the database application can also be achieved through the SGX even if other parts of the system have been compromised.
The SGX can verify before execution whether the database application code has been tampered with, ensuring that only verified code can run in the Enclave.
The SGX further comprises an SGX remote attestation function, which allows the database to prove its identity and integrity to a remote user or application; the secure Enclave is cryptographically signed, so that a verifier can confirm that the data of the secure Enclave is secure and has not been tampered with;
the data processing layer comprises a preliminary verification module, a data cleaning module and a data integrity verification module;
The preliminary verification module performs data type and structure verification, data size and value range verification, and malicious content and replay attack detection;
Data type and structure verification: ensuring that the type of each data field is as expected, for example that a date field conforms to a date format and that a numeric field contains only digits, and checking whether the data conforms to a predefined structure, such as the structural integrity of JSON or XML, and whether any required field is missing;
Data size and value range verification: checking whether the size of the data packet is within the allowed range, so that the system is not burdened with processing oversized data, and checking the data values, for example that an age is not negative and that a date does not fall outside its logical range;
Malicious content and replay attack detection: scanning the data for possible SQL injection code, script injection or other malicious input, and checking whether the received data is a duplicate, so that previously sent data cannot be resent to deceive the system;
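The checks listed above might be combined as in the following sketch. It assumes JSON input, a hypothetical three-field schema, a 64 KiB size limit and a per-packet `nonce` field used for replay detection, none of which are specified in this embodiment; the scan for injected SQL or script content is left out because it depends on how the data is later used.

```python
import json

MAX_PACKET_BYTES = 64 * 1024                               # assumed size limit
REQUIRED_FIELDS = {"id": str, "age": int, "nonce": str}    # assumed schema
_seen_nonces = set()                                       # replay cache, in memory for the sketch


def preliminary_check(raw: bytes):
    """Return (ok, reason) for one incoming packet."""
    if len(raw) > MAX_PACKET_BYTES:
        return False, "packet too large"
    try:
        record = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False, "malformed JSON"
    if not isinstance(record, dict):
        return False, "not an object"
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            return False, f"missing field: {name}"
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected_type) or isinstance(value, bool):
            return False, f"wrong type: {name}"
    if not 0 <= record["age"] <= 150:
        return False, "age out of range"
    if record["nonce"] in _seen_nonces:
        return False, "replayed packet"
    _seen_nonces.add(record["nonce"])
    return True, "ok"
```

A packet is remembered in the replay cache only after it has passed every other check, so rejected packets do not consume cache entries.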
the data cleaning module identifies duplicate records and unifies data formats, splits and merges fields, corrects logic errors, detects and handles outliers, and checks validity and consistency;
The aim of data cleaning is to improve the quality of the data and the security of sensitive processing by removing or correcting erroneous and incomplete records, ensuring the accuracy and consistency of the data, and improving its usability, reliability and validity. Data cleaning mainly comprises the following parts:
Identifying duplicate records and unifying data formats: checking whether the data contains duplicate entries, which may have been generated before the data entered the TEE, ensuring that all data fields conform to the same format standard, and converting inconsistent data types to a consistent format, for example converting numbers stored as strings to a numeric type;
Splitting and merging fields and correcting logic errors: adjusting the data structure as required by splitting one field into several fields or merging several fields into one, and correcting logic errors in the data;
Outlier detection and handling: identifying outliers in the data by statistical methods, where the outliers may be caused by incorrect input or measurement errors, and correcting them, deleting them, or retaining them with a flag;
Validity and consistency checks: ensuring that the data complies with the validity rules and remains logically consistent across fields, for example that a person's date of birth is earlier than his or her date of employment.
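A few of these cleaning steps can be sketched as follows, assuming hypothetical employee records with `id`, `salary`, `born` and `hired` fields and ISO-formatted dates. The sketch covers duplicate removal, format unification and the birth-before-employment rule; field splitting and statistical outlier detection are omitted.

```python
from datetime import date


def clean_records(records):
    """Return (cleaned, flagged): deduplicated, normalised rows and rows failing the consistency rule."""
    seen = set()
    cleaned, flagged = [], []
    for row in records:
        # Unify formats: numeric strings become ints, ids lose stray whitespace,
        # ISO date strings become date objects.
        normalised = {
            "id": str(row["id"]).strip(),
            "salary": int(row["salary"]),
            "born": date.fromisoformat(row["born"]),
            "hired": date.fromisoformat(row["hired"]),
        }
        if normalised["id"] in seen:   # duplicate record: keep the first occurrence
            continue
        seen.add(normalised["id"])
        # Consistency rule from the text: date of birth must precede date of employment.
        if normalised["born"] >= normalised["hired"]:
            flagged.append(normalised)
        else:
            cleaned.append(normalised)
    return cleaned, flagged
```

Rows that break the rule are returned separately rather than dropped, which corresponds to the option of retaining them with a flag.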
The data integrity verification module verifies the data integrity by generating a unique hash value for the data through a hash algorithm;
The data storage layer comprises a data storage module and a data query module;
the data storage module stores data into the tamper-proof database through a K-V pair storage model;
The data query module acquires data stored in the tamper-resistant database through a query API;
The system interaction layer comprises an internal API, an external API and a security calling module, forming a bridge for interaction among the layers of the system and with the outside while ensuring the security and integrity of the data;
The external API provides interfaces for external users or systems to access and operate on the data in the database, such as acquiring, updating and deleting data records, and authenticates each caller to ensure that only authorized users or systems can access the API;
The security calling module calls the Enclave interface of the SGX from the outside in order to start specific security tasks in the Enclave and thus carry out operations inside the TEE. The security calling module can also start data processing tasks in the Enclave to encrypt and decrypt data; the externally called Enclave interface specifically handles the encryption operation before the data is persisted to the database, ensuring that all sensitive data is securely encrypted before it leaves the trusted execution environment;
The system interaction layer also comprises an API endpoint that supports the decryption flow for data read from the database, ensuring that the data is decrypted inside the trusted environment and that its confidentiality and integrity are maintained: the received data is decrypted inside the Enclave and, after processing, is encrypted again before being sent back to the client or to other services, so that plaintext never leaves the security boundary.
The system interaction layer adopts a mutual (bidirectional) TLS protection mechanism, which protects the data both when it is submitted and when it is queried. The mechanism applies to interactions across multiple layers of the system, including user access, data processing and secure communication with the trusted execution environment. From the user terminal to the back-end service, and from the trusted computing layer to the data storage layer, all transmitted data is encrypted through TLS over an established encrypted communication channel, which effectively prevents the data from being stolen or tampered with in transit; once the channel is established, all data passing through it is encrypted, so that even intercepted data cannot be read by an unauthorized party;
When communicating with the Enclave of the trusted computing layer, the mutual TLS protection mechanism ensures that only authenticated clients can submit data for processing or retrieve data. All keys used for TLS encryption must be managed securely; they are stored and restored using the sealing and unsealing functions of SGX to prevent access by malware. In addition, all TLS connection attempts, successful handshakes and completions of critical steps are recorded in an audit log for later security analysis and investigation.
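On the server side, the requirement that only authenticated clients may connect could be configured roughly as below with the Python standard `ssl` module. This is a sketch under assumptions: the function name, the file path parameters and the TLS 1.2 floor are illustrative, and the SGX sealing of the private key mentioned above is outside what this snippet shows.

```python
import ssl


def build_mtls_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Build a server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # assumed policy floor
    ctx.verify_mode = ssl.CERT_REQUIRED            # the client must present a certificate
    if cert_file and key_file:
        # Server identity; the paths are supplied by the deployment (hypothetical here).
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # Certificate authority that issued the accepted client certificates.
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx
```

With `verify_mode` set to `ssl.CERT_REQUIRED`, the handshake fails for any client that does not present a certificate chaining to the configured authority, which gives the bidirectional property described above.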
This embodiment creates a secure execution environment through the trusted platform module TPM and Intel Software Guard Extensions (SGX), ensures security and transparency during data processing and storage, and addresses the shortcomings of conventional technology in data encryption and access control. The dual protection of hardware and software improves the overall security of the data, and the remote attestation function enhances the trustworthiness of the system, providing a higher level of protection for data processing and storage.
Example 2
This embodiment describes in detail a tamper-resistant database method based on a trusted computing environment which, as shown in FIG. 1, comprises the following steps:
S1, receiving incoming data over the TLS secure communication protocol, performing preliminary verification and cleaning on the data, generating a verification key by a trusted platform module TPM, and storing the verification key in a secure storage area of the TPM;
S2, loading the verified data into a trusted execution environment TEE, generating an encryption key, and storing the encryption key in an internal storage area of the TEE;
S3, performing an integrity check on the encrypted data in the TEE;
S4, transmitting the checked data out of the TEE over the TLS secure communication protocol to a tamper-proof database for storage.
Further, the TPM in S1 generates the verification key and verifies the integrity of the data at the hardware level by comparing the hash value of the actual data with the expected hash value, thereby checking whether the data of critical startup and runtime software has been tampered with.
Further, as shown in FIG. 4, the TEE uses Intel Software Guard Extensions (SGX) to generate the encryption key, which specifically includes the following steps:
S2.1, creating a secure Enclave in SGX, and loading the data into the Enclave;
S2.2, generating a random initialization vector in the Enclave through a hardware random number generator RNG;
S2.3, generating an encryption key in the Enclave, encrypting the data with the AES-256 algorithm in GCM mode, and storing the encryption key in the Enclave.
The initialization vector in S2.2 is a non-secret binary sequence that introduces randomness and uniqueness into the encryption process. In GCM mode, the initial counter block is derived from the initialization vector and the keystream for each data block is generated from successive counter values, so the initialization vector determines the encryption of every block, beginning with the first data block;
As shown in FIG. 3, because a randomly generated initialization vector is used in encrypting the plaintext data blocks, the same data produces different ciphertext in different encryption instances, which prevents an attacker from deducing the original data by comparing patterns across encrypted messages and protects information that contains repeated or structured data;
in GCM mode, the initialization vector must never be reused with the same encryption key, and it is stored securely to prevent tampering during the encryption process.
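The uniqueness requirement on the initialization vector can be illustrated with a small allocator that records the vectors already issued for each key. The sketch makes several assumptions: `secrets.token_bytes` (the operating system's cryptographic random source) stands in for the hardware RNG of S2.2, the 96-bit length is the size commonly recommended for GCM, and the class name and the `key_id` labels are hypothetical. No encryption is performed here.

```python
import secrets

IV_BYTES = 12  # 96-bit initialization vector


class IvAllocator:
    """Hands out random IVs and never returns the same IV twice for one key."""

    def __init__(self):
        self._used = {}  # key_id -> set of IVs already issued

    def next_iv(self, key_id: str) -> bytes:
        issued = self._used.setdefault(key_id, set())
        while True:
            iv = secrets.token_bytes(IV_BYTES)
            if iv not in issued:   # a collision is unlikely, but reuse must be ruled out
                issued.add(iv)
                return iv
```

Tracking is per key, which matches the rule that an initialization vector only has to be unique among the encryptions made under the same key.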
The encryption key is generated and stored in the Enclave, which ensures the security of the key. The key is stored securely using the sealing function provided by the Enclave, so that it cannot be accessed from outside the Enclave. Each time the data is updated, the data is digitally signed with a private key generated by the Enclave, and the signature is stored in the tamper-resistant database; for each item of data retrieved from the database, the application in the Enclave verifies the validity of the signature to ensure that the data has not been tampered with.
The encryption key is generated inside the trusted execution environment TEE: a strong random number generator ensures the randomness and unpredictability of the key, the key is generated as required by the encryption algorithm, and it is stored in the internal storage area of the TEE to prevent external access and leakage;
Access to and use of the encryption key in the TEE are strictly controlled, so that only authorized processes or personnel can access it and, in accordance with the principle of least privilege, only when it is actually needed. The encryption key can be updated periodically according to a security policy to resist potential attacks, and when a key is no longer used it is securely discarded, for example by overwriting the key data.
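The rotation and discarding policy can be sketched as a small key holder. The class name, the age-based rotation rule and the injectable clock are assumptions made for illustration; in the described system the key would live inside the Enclave rather than in ordinary process memory.

```python
import secrets
import time


class ManagedKey:
    """A 256-bit key held in a mutable buffer so that it can be overwritten on discard."""

    def __init__(self, max_age_seconds: float, clock=time.monotonic):
        self._clock = clock
        self._max_age = max_age_seconds
        self._buf = bytearray(secrets.token_bytes(32))
        self._created = clock()

    def needs_rotation(self) -> bool:
        """True once the key has reached the age allowed by the (assumed) policy."""
        return self._clock() - self._created >= self._max_age

    def material(self) -> bytes:
        if not self._buf:
            raise ValueError("key has been discarded")
        return bytes(self._buf)

    def discard(self) -> None:
        """Overwrite the key bytes in place, then release the buffer."""
        for i in range(len(self._buf)):
            self._buf[i] = 0
        self._buf = bytearray()

    def rotate(self) -> "ManagedKey":
        """Create a successor key and discard this one."""
        successor = ManagedKey(self._max_age, self._clock)
        self.discard()
        return successor
```

Note that `material()` returns an immutable copy that this class cannot wipe, and a general-purpose runtime may keep other copies; the in-place overwrite only illustrates the intent of the secure discarding step.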
Before encryption, the data is converted into an encoding that the encryption algorithm can process efficiently, such as UTF-8. The data is then encrypted in the TEE with the preselected encryption algorithm and the generated encryption key; during encryption, all operations are performed in the secure memory of the TEE so that no sensitive data leaks into ordinary memory, and an integrity check is performed on the encrypted data to ensure that it was not tampered with or damaged during encryption;
A decryption test can be performed on a small part of the data to confirm that the encryption process is correct. An encryption algorithm identifier and timestamp information can be attached to the encrypted data, which then leaves the TEE over the TLS secure communication protocol and is transmitted to the tamper-proof database module. During decryption and encryption, the system can record the details of the operation in a security log to provide data support for subsequent audits.
Further, as shown in FIG. 5, the integrity check in S3 includes the following steps:
S3.1, generating a hash value for the data through a hash algorithm, and storing the hash value or attaching it to the data;
S3.2, before the data is accessed or transmitted, recalculating the hash value of the current data and comparing it with the original hash value to ensure that the data has not been modified;
S3.3, verifying the source and the integrity of the data by means of a digital signature, wherein the data client signs the data or its hash value with its private key, and the signature is verified with the public key of the sender, which shows that the data has not been tampered with since it was signed.
The system uses a hash algorithm to verify data integrity by generating a unique hash value for the data: when the data is first created, the system generates a hash value of the data content and stores it or attaches it to the data itself;
The source and the integrity of the data are verified by means of a digital signature: the data client signs the data or its hash value with its own private key, and the system verifies the signature with the public key of the sender, which proves that the data was sent by the signer and shows that the data has not been tampered with since it was signed.
Implemented in this way, the integrity verification mechanism serves the invention in several roles, namely protection, monitoring and support: it gives the system a necessary layer of security protection and, while protecting the data and the system from unauthorized or malicious tampering, also provides traceability and a clear attribution of responsibility.
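Steps S3.1 and S3.2 can be sketched with the Python standard library as follows. SHA-256 and the record layout are assumptions; the digital signature of S3.3 is not shown because it requires an asymmetric cryptography library outside the standard library.

```python
import hashlib
import hmac


def attach_hash(data: bytes) -> dict:
    """S3.1: compute a SHA-256 digest and attach it to the data."""
    return {"data": data, "sha256": hashlib.sha256(data).hexdigest()}


def is_intact(record: dict) -> bool:
    """S3.2: recompute the digest of the current data and compare it with the stored one."""
    current = hashlib.sha256(record["data"]).hexdigest()
    return hmac.compare_digest(current, record["sha256"])
```

A record passes `is_intact` immediately after `attach_hash`, and fails it as soon as the `data` field is changed without updating the stored digest.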
Further, as shown in FIG. 6, storing the data in the tamper-resistant database in S4 includes the following steps:
S4.1, generating a storage key through a hardware security module HSM, and encrypting the data in the tamper-resistant database with a symmetric encryption algorithm using the generated storage key;
S4.2, calculating a hash value of the data, and organizing each data item as a unique key and a value according to a key-value (K-V) storage model;
S4.3, linking the hash value of each data item to the hash value of the previous data item, and storing each data item together with its calculated hash value to form a hash chain;
S4.4, storing the encrypted data in the tamper-proof database by means of the hash chain.
The tamper-resistant database in S4.2 uses a K-V storage model in which each data item comprises a unique key and a value. To ensure the tamper resistance of the data, the value of each data item is encrypted and stored together with a hash value, which is calculated from the value of the data with a hash algorithm;
When data is to be updated, the tamper-resistant database recalculates the hash value of the stored data and compares it with the previously stored hash value to ensure that the data has not been tampered with. If the hash values match, the data is unchanged; if they do not match, the data has been tampered with, and the tamper-resistant database rejects the update operation and records the time of the tampering attempt and the operator information.
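The hash chain of S4.3 and the tamper check described above can be sketched as a small append-only store. The names `HashChainStore` and `_link`, the all-zero genesis value and the use of SHA-256 are assumptions made for illustration; the `value` bytes stand for data already encrypted with the storage key, and the HSM is not modelled.

```python
import hashlib

GENESIS = "0" * 64  # assumed link value for the first entry


def _link(prev_hash: str, key: str, value: bytes) -> str:
    """Hash of one entry, bound to the hash of the entry before it."""
    h = hashlib.sha256()
    h.update(prev_hash.encode())
    h.update(len(key.encode()).to_bytes(4, "big"))  # length prefix avoids ambiguous joins
    h.update(key.encode())
    h.update(value)
    return h.hexdigest()


class HashChainStore:
    def __init__(self):
        self.entries = []  # each entry: {"key", "value", "prev", "hash"}

    def append(self, key: str, value: bytes) -> str:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        entry = {"key": key, "value": value, "prev": prev, "hash": _link(prev, key, value)}
        self.entries.append(entry)
        return entry["hash"]

    def verify(self) -> int:
        """Return the index of the first broken entry, or -1 if the chain is intact."""
        prev = GENESIS
        for i, e in enumerate(self.entries):
            if e["prev"] != prev or e["hash"] != _link(prev, e["key"], e["value"]):
                return i
            prev = e["hash"]
        return -1
```

Because every entry hash covers the previous entry hash, changing a stored value without recomputing all later hashes is detected by `verify`, which reports the position of the first inconsistent entry.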
By combining trusted computing technology with tamper-resistant database technology, this embodiment provides a comprehensive data protection solution. The integrated application of the TLS secure communication protocol, the trusted platform module TPM, Intel Software Guard Extensions (SGX) and the tamper-resistant database addresses the shortcomings of conventional technology in data transmission, key management, data processing and storage, and the dual protection of hardware and software significantly improves the overall security and trustworthiness of the data.
The above is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention. Those skilled in the art may make various modifications and variations to the present invention, and any modification, equivalent substitution, combination or improvement that achieves the same function without departing from the principle and spirit of the present invention falls within the scope of the invention.