HK1175861B - Verifiable trust for data through wrapper composition - Google Patents

Publication number
HK1175861B
Authority
HK
Hong Kong
Prior art keywords
data
wrapper
metadata
encrypted
access
Prior art date
Application number
HK13102313.1A
Other languages
Chinese (zh)
Other versions
HK1175861A1 (en
Inventor
R. V. Auradkar
R. P. D'Souza
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US12/832,400 (US9537650B2)
Application filed by Microsoft Technology Licensing, LLC
Publication of HK1175861A1
Publication of HK1175861B

Description

Verifiable trust for data through wrapper composition
Technical Field
This document relates to providing trusted computing and data services, such as web services or cloud services, to devices, and more particularly to data or web services that apply a composite wrapper to transform data, metadata, or both.
Background
By way of background regarding some conventional systems, computing devices have traditionally executed applications and data services locally on the device. In that case, as data is accessed, processed, stored, cached, etc., it may travel on the device over local buses, interfaces, and other data paths; however, the user of the device need not worry about interference with or exposure of user data unless the device itself is lost, stolen, or otherwise compromised.
The evolution of network storage farms capable of storing terabytes of data (with potential for petabytes, exabytes, etc. of data in the future) has created an opportunity to mimic applications that have historically operated on local data but that instead operate on data stored in the cloud, with the primary device and the external storage being separate. Cloud storage of application or system (or any) data allows many devices to store their data without requiring separate dedicated storage for each device.
Yet, with the evolution of online and cloud services, applications and services are increasingly being moved to third-party network providers that perform some or all of a given service on behalf of one or more devices. In this case, the user of the device may be concerned about who can access the data, or worse, who can interfere with the data, when the user's data is uploaded to the service, while the data is stored or processed by the service, or when the data is retrieved from the service. In short, when the data of a user's device leaves the domain of physical possession and enters a network environment that is physically remote from the user, concerns arise regarding inadvertent or malicious handling of, or interference with, the data by third parties. Thus, it is desirable to increase trust, security, and privacy for cloud services and for the processing of data in conjunction with cloud services. Similar concerns may arise with data storage even inside an enterprise, for example, when data leaves one region of control (e.g., a first department) in which the data was generated and enters another region (e.g., a second department) for storage.
As mentioned above, however, the problem remains that neither cloud services nor network storage providers have been able to effectively alleviate the problems of, and the demands for, security, privacy, and integrity of data while it is stored in the cloud. In short, users require heightened trust that their data remains secure and private when they relinquish physical control over the storage medium, and this hurdle has significantly prevented businesses and consumers from adopting backup of important data via third-party network services and solutions.
The above-described shortcomings of today's devices and data services provided to the devices are merely intended to provide an overview of some of the problems of conventional systems and are not intended to be exhaustive. Other problems of the prior art and the corresponding benefits of the various non-limiting embodiments may become apparent upon a careful reading of the following detailed description.
Summary
A simplified summary is provided herein to facilitate a basic or general understanding of aspects of one or more of the exemplary, non-limiting embodiments described below and in the accompanying drawings. This summary is not intended to be extensive or exhaustive, however. Rather, the sole purpose of this summary is to present some concepts related to some exemplary non-limiting embodiments in a simplified form as a prelude to the more detailed description of the various embodiments that follow.
Network or cloud data services, including mathematical transformation techniques for data such as searchable encryption, disassembly/reassembly, or distribution techniques, are provided in a way that distributes trust across multiple entities to avoid a single point of data compromise, and that decouples data protection requirements from the container(s) in which the data may be stored, processed, accessed, or retrieved. In one embodiment, a mathematical transformation predicate generator (e.g., a key generator), a mathematical transformation provider (e.g., a cryptographic technology provider), and a cloud service provider are each provided as separate entities, enabling a trusted platform on which publishers of data can provide data confidentially (obscured, e.g., encrypted) to the cloud service provider, and enabling selective access to the obscured (e.g., encrypted) data by authorized subscribers based on subscriber capabilities.
Using a trusted platform, a method for hosting data includes receiving data or metadata associated with the data, wherein the data, the metadata, or both are protected by a composite wrapper formed from at least a first mathematical transformation and a second mathematical transformation of the data, the metadata, or both, wherein the first mathematical transformation defines a first wrapper for the data, the metadata, or both based on a first set of criteria, and the second mathematical transformation defines a second wrapper for the data, the metadata, or both based on a second set of criteria. The method also includes receiving a request to access the data, metadata, or both protected by the composite wrapper based on a set of capabilities included in the request. The capabilities may be any kind of access information, such as a reconstruction map, a key, a decoding tool, etc. Based on the set of capabilities, access privileges to the data, the metadata, or both are determined by evaluating visibility through the first wrapper and independently evaluating visibility through the second wrapper.
In a non-limiting embodiment, a system can include a mathematical transformation component distributed at least in part by a mathematical transformation technology provider, the mathematical transformation component implemented independently of an access information generator that generates capability information for publishing data, metadata, or both, or for subscribing to published data, published metadata, or both, the mathematical transformation component including at least one processor configured to authorize access based on the access information. A network service provider, implemented independently of the access information generator and the mathematical transformation technology provider, includes at least one processor configured to implement a network service with respect to computer data, computer metadata, or both transformed by the mathematical transformation component, the network service provider configured to communicate with the mathematical transformation component to perform the generation, regeneration, or deletion of a cryptographic wrapper applied to the computer data, computer metadata, or both.
Using the techniques of a trusted platform, data (and associated metadata) is decoupled from the containers (e.g., file systems, databases, etc.) that hold the data, enabling the data to act as its own custodian by imposing a protective shield of mathematical complexity that can be pierced only with presented capabilities, such as keys granted by a key generator of the trusted platform, as a non-limiting example. Sharing of, or access to, the data or a subset of the data is facilitated in a manner that preserves and extends trust without requiring a particular container for enforcement. The mathematical complexity applied to the data, such as searchable encryption techniques, protects the data regardless of the container or hardware in which the particular bits are recorded; that is, the data is protected containerlessly (without regard to the container) and thus is not subject to attacks based on compromising the security of the container. In the event that a particular "safe" is compromised, its contents remain protected.
In one non-limiting embodiment, extensible markup language (XML) data is data that acts as its own custodian. Using XML data, tags may be augmented or added with descriptive information that selectively allows or blocks access to the underlying data, thereby allowing the XML data or XML data fragments (as encapsulated by tag information in a trusted envelope applied to the XML data or fragments) to act as their own custodian. For example, the XML data or tags can represent searchable metadata that encodes any one or more of authentication information, authorization information, schema information, history information, tracking information, compliance information, and the like. Note that any of the XML-based embodiments may also be applicable to some alternative formats, such as, but not limited to, JavaScript object notation (JSON), S-expressions, Electronic Data Interchange (EDI), etc., and thus in these embodiments, XML is used for illustrative purposes only.
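By way of a non-limiting illustration only, the idea of XML tags carrying descriptive information that selectively allows or blocks access to the underlying fragment might be sketched as follows. The `access-role` attribute, the sample record, and the `visible_view` helper are hypothetical names introduced for this sketch and do not appear in the embodiments. The sketch filters plaintext fragments by policy only; in a trusted envelope the blocked fragments would additionally be obscured (e.g., encrypted).

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the "access-role" attribute is an assumed tag
# decoration naming the role required to view a fragment.
DOC = """<record>
  <name>Alice</name>
  <diagnosis access-role="doctor">flu</diagnosis>
  <billing access-role="accounting">120.00</billing>
</record>"""


def _prune(element: ET.Element, roles: set) -> None:
    """Remove every descendant whose required role is not held."""
    for child in list(element):
        required = child.get("access-role")
        if required is not None and required not in roles:
            element.remove(child)
        else:
            _prune(child, roles)


def visible_view(xml_text: str, roles: set) -> str:
    """Return the XML that a requester holding `roles` may see."""
    root = ET.fromstring(xml_text)
    _prune(root, roles)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(visible_view(DOC, {"doctor"}))
```

Fragments without the attribute remain visible to every requester, so the policy travels with each fragment rather than with the container that stores the document.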
A "trusted envelope" for any type of payload, such as, but not limited to, a database field, an XML fragment, or a complete record, thus provides curtained (secured) access through various decorations or seals placed on the envelope, allowing a full range of trust with guarantees such as, but not limited to, confidentiality, privacy, anonymity, tamper detection, integrity, etc. For example, XML tags may be applied or augmented to create trusted envelopes for structured XML data, a common format for data exchange in networked environments, enabling containerless XML data in a trusted cloud service environment.
Some other examples of cryptographic techniques or "decorations" that may be applied to help establish a high level of trust in the security and privacy of data include, but are not limited to, size-preserving encryption, searchable encryption, proof of application, blind fingerprints, proof of retrievability, and the like.
Other embodiments and various non-limiting examples, scenarios, and implementations are described in more detail below.
Brief Description of Drawings
Various non-limiting embodiments are further described with reference to the accompanying drawings in which:
FIG. 1 is a block diagram that illustrates a system for publishing or subscribing to data, metadata, or both in a store with a composite wrapper, in an embodiment;
FIG. 2 is a block diagram that illustrates a system for publishing or subscribing to data, metadata, or both in a store with a composite encryption wrapper, in an embodiment;
FIG. 3 is an illustrative example of a concentric composite wrapper;
FIG. 4 is an illustrative example of a composite wrapper with a lateral wrapper;
FIG. 5 is an illustrative example of a hybrid composite wrapper with both concentric wrappers and lateral wrappers in an embodiment;
FIG. 6 is a block diagram of the use of a lateral wrapper in conjunction with access log metadata associated with data in one embodiment;
FIG. 7 is a block diagram that illustrates efficient deletion of data by discarding or shredding access information in an embodiment in which the manner in which data is deleted is encoded in metadata;
FIG. 8 is a block diagram additionally illustrating an example in which data is scrambled and information on the scrambling is recorded in metadata inside or outside a wrapper of a composite wrapper;
FIG. 9 is a block diagram of a change in policy that results in surrendering the ability to view data, metadata, or both that are obscured by a wrapper as an alternative to removing the wrapper;
FIG. 10 is a block diagram of automatically unwrapping, generating, changing, regenerating, or augmenting a wrapper based on an instruction or state change;
FIG. 11 is an illustrative example of performing a data revision task using one or more lateral wrapper transforms;
FIG. 12 is a flow diagram of an exemplary, non-limiting process for hosting data, metadata, or both wrapped by a composite wrapper in an embodiment;
FIG. 13 is a block diagram of an overall environment for providing one or more embodiments of secure, private, and selectively accessible network data services;
FIG. 14 is a block diagram illustrating one or more aspects of "data as its own custodian";
FIG. 15 is a block diagram of an overall environment for providing one or more embodiments of secure, private, and selectively accessible network data services;
FIG. 16 is a flow diagram of a process for managing containers in which data acts as its own custodian;
FIG. 17 is another block diagram illustrating one or more aspects of data acting as its own custodian;
FIG. 18 is another block diagram showing aspects of data acting as its own custodian, illustrating that the data is capable of overriding the traditional container security model;
FIG. 19 illustrates a storage management layer that performs functions such as automatic shredding, caching, copying, restructuring, etc., of data from multiple data containers of different types;
FIG. 20 is a block diagram showing a secure overlay network adding a cryptographic access wrapper to data where the data is stored across various data containers;
FIG. 21 is a block diagram illustrating aspects related to legacy applications;
FIG. 22 is an example architectural model that can be used in connection with legacy applications as well as FTO aware applications;
FIG. 23 is a block diagram illustrating the general use of a cryptographic wrapper or envelope on data and/or metadata describing the data or characteristics of the data;
FIG. 24 is a specific example that further highlights the concept generally presented in FIG. 23;
FIG. 25 is another example illustrating federated trust overlays around protected data;
FIG. 26 is a block diagram illustrating an embodiment in which records and indexes are encrypted and uploaded to the cloud using trust overlays;
FIG. 27 illustrates how a client can generate, upload, and/or search an encrypted index for a richer cloud storage experience utilizing the federated trust overlay architecture on encrypted data;
FIGS. 28-30 are block diagrams illustrating some additional non-limiting trust guarantees for the system;
FIG. 31 is a diagram illustrating an embodiment of a trusted overlay in an XML context;
FIGS. 32-35 are flowcharts illustrating exemplary processes for trusted XML in various embodiments;
FIG. 36 is a flow diagram that illustrates an exemplary non-limiting method for processing data to form trusted XML, in one embodiment;
FIG. 37 is a block diagram of a trusted cloud service framework or ecosystem, according to an embodiment;
FIG. 38 is a flow diagram illustrating an exemplary non-limiting method for publishing data in accordance with a trusted cloud service ecosystem;
FIG. 39 is a flow diagram illustrating an exemplary non-limiting method for subscribing to data in accordance with a trusted cloud service ecosystem;
FIG. 40 illustrates an exemplary ecosystem showing the separation of a Center for Key Generation (CKG), a Cryptographic Technology Provider (CTP), and a Cloud Service Provider (CSP) in a trusted ecosystem;
FIG. 41 is another architectural diagram illustrating further benefits of a trusted ecosystem for performing cloud services for an enterprise;
FIG. 42 is another block diagram illustrating adaptation to different storage providers through a storage abstraction layer;
FIG. 43 illustrates further aspects of storage in conjunction with a storage abstraction service;
FIG. 44 is another block diagram illustrating various participants in a trusted ecosystem;
FIG. 45 is a representative view of some layers of an exemplary non-limiting implementation of a trusted cloud computing system in which different components may be provided by different or the same entity;
FIG. 46 is a flow diagram of an exemplary non-limiting process for publishing documents to a digital safe application in a manner that provides controlled selective access to data to publishers with late binding;
FIG. 47 is a flow diagram of an exemplary non-limiting process for subscribing to material placed in a digital safe;
FIG. 48 illustrates an exemplary, non-limiting implementation of a trusted cloud service using a digital escrow pattern to implement secure extranet for an enterprise through one or more data centers;
FIG. 49 is a flow diagram illustrating another exemplary non-limiting scenario based on a trusted cloud services ecosystem in which subscribers are given selective access to encrypted data stored by CSPs;
FIG. 50 is another flow diagram illustrating that an application response may be customized to a subscriber based on login information;
FIG. 51 is another flow diagram illustrating a secure record upload scenario, which may be implemented for one or more parties;
FIG. 52 is yet another flow diagram illustrating an exemplary, non-limiting implementation of a role-based query on a searchable encrypted data store enabled by a trusted cloud service ecosystem;
FIG. 53 is a flow diagram illustrating a multi-party collaboration scenario in which an enterprise provides access to some of its encrypted data to external enterprises;
FIG. 54 is a flow diagram illustrating a multi-party automatic search scenario among multiple enterprises;
FIG. 55 illustrates an exemplary non-limiting Edge Computing Network (ECN) technique that may be implemented for a trusted cloud service;
FIG. 56 is a block diagram illustrating one or more optional aspects of a key generation center according to a trusted cloud services ecosystem;
FIG. 57 is a block diagram of an exemplary, non-limiting embodiment of a trusted store that includes searchable encrypted data;
FIG. 58 is a flow chart illustrating an exemplary non-limiting process for subscribing, the process including a confirmation step;
FIG. 59 illustrates an exemplary, non-limiting confirmation challenge/response protocol by which an authenticator issues a cryptographic challenge to a prover;
FIG. 60 is a block diagram of another exemplary, non-limiting embodiment of a trusted store that includes searchable encrypted data;
FIG. 61 is a flow chart illustrating an exemplary non-limiting process for subscribing, the process including a confirmation step;
FIG. 62 illustrates another exemplary, non-limiting authentication challenge/response protocol by which an authenticator issues a cryptographic challenge to a prover;
FIG. 63 is a block diagram of a general environment for providing one or more embodiments of a service (including blind fingerprinting);
FIG. 64 is a block diagram illustrating a non-limiting scenario in which multiple independent federated trust overlays or digital escrows may exist side by side, or on top of one another in a layered fashion;
FIG. 65 is a block diagram of another exemplary, non-limiting embodiment of a trusted storage that includes data distribution techniques for obfuscating data from unauthorized access;
FIG. 66 is a block diagram representing exemplary non-limiting networked environments in which various embodiments described herein can be implemented; and
FIG. 67 is a block diagram representing an exemplary non-limiting computing system or operating environment in which one or more aspects of various embodiments described herein can be implemented.
Detailed Description
Overview
As discussed in the background, data sent to a network service can create discomfort with respect to privacy, the potential for tampering, etc.; for example, when data is transmitted from a user's device to a network application, service, or data store, the user desires sufficient assurance that no malicious third party can cause harm. By definition, the user has lost control over the data. It is thus desirable to increase trust so that publishers and/or owners of data are willing to relinquish physical control over their data, trusting that within the network their data will remain private and inviolate, except when accessed by the publisher, the owner, or anyone to whom privileges have been granted as verified based on the requester's identity.
In this regard, the problem remains that neither cloud services nor network storage providers have been able to effectively alleviate the problems of, and the demands for, security, privacy, and integrity of data while it is stored in the cloud. In short, users are interested in heightened trust that their data remains secure and private when they relinquish physical control over the storage medium, and this hurdle has significantly prevented businesses and consumers from adopting backup of important data via third-party network services and solutions.
The term "network storage provider" as used herein includes, but is not limited to: a content delivery (or delivery) network (CDN), a hybrid scenario (e.g., across enterprise storage, cloud storage, and/or a CDN), and/or a broader federated scenario (e.g., across multiple enterprises, multiple clouds, or multiple CDNs), or any combination of the preceding.
Traditionally, to keep data safe, the data has been locked away or kept secret, e.g., on a physical medium. In this regard, the data owner knows that the custodian of the safe must be a fully trusted party, or else must have no access to the contents of the safe. While the premise of cloud services has been that customers do not necessarily need to know exactly where their data is physically located, the question cannot be ignored entirely. This is because it has been a challenge to take full responsibility for who (what devices) can access the data, who sees the data, who maintains the data, and how the data is stored. Accordingly, in reality, due to inherent mistrust and various other concerns, customers care very much about who the third parties are that control the various computing and storage devices in the cloud chain.
By eliminating active custodianship controlled by a human or external entity (which has inherent biases that may not be consistent with those of the owner or publisher of the data), embodiments herein provide a system in which data is mathematically transformed (e.g., selectively encrypted or searchably encrypted) such that the data acts as its own custodian regardless of the third-party machine, mechanism, device, or container that holds the data. In this regard, various implementations of a federated trust overlay enable containerless data along with guarantees of security, confidentiality, tamper resistance, and the like, where these guarantees are made transparent to the user.
Thus, in implementations, a trusted cloud platform is used to host data, including receiving data or metadata associated with the data, wherein the data, the metadata, or both are protected by a composite wrapper formed from mathematical transformations of the data, the metadata, or both, wherein the mathematical transformations include at least a first mathematical transformation defining a first wrapper for the data, the metadata, or both based on a first set of criteria and a second mathematical transformation defining a second wrapper for the data, the metadata, or both based on a second set of criteria. An entity may make a request to access data, metadata, or both protected by a composite wrapper based on a set of capabilities included in the request. Based on the set of capabilities, access privileges to the data, the metadata, or both are determined based on evaluating visibility through the first wrapper and independently evaluating visibility through the second wrapper.
Receiving may include receiving data or metadata protected by a composite wrapper formed from a mathematical transformation including a first mathematical transformation defining a first wrapper that wraps less than all of the data, the metadata, or both based on a first set of criteria.
Receiving may include receiving data or metadata protected by a composite wrapper formed from mathematical transformations including a first mathematical transformation defining a first wrapper that wraps the data, the metadata, or both based on a first set of criteria and a second mathematical transformation defining a second wrapper that wraps the data, the metadata, or both wrapped by the first wrapper.
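By way of a non-limiting illustration only, the arrangement just described, in which a second wrapper wraps data already wrapped by a first wrapper and visibility is evaluated through each wrapper independently, might be sketched as follows. The functions `wrap`, `unwrap`, and `access`, and the capability names `inner` and `outer`, are hypothetical and introduced for this sketch. The cipher is a toy keystream built from SHA-256 and is an assumption made to keep the sketch self-contained; it is not a vetted construction and is not the transformation of any embodiment.

```python
import hashlib
import hmac
import os
from typing import Optional


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); illustrative only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def wrap(key: bytes, payload: bytes) -> bytes:
    """Apply one wrapper: nonce || obscured payload || integrity tag."""
    enc_key = hashlib.sha256(b"enc" + key).digest()
    mac_key = hashlib.sha256(b"mac" + key).digest()
    nonce = os.urandom(16)
    stream = _keystream(enc_key, nonce, len(payload))
    body = bytes(a ^ b for a, b in zip(payload, stream))
    tag = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def unwrap(key: bytes, blob: bytes) -> Optional[bytes]:
    """Return the payload if `key` pierces this wrapper, else None."""
    if len(blob) < 48:
        return None
    enc_key = hashlib.sha256(b"enc" + key).digest()
    mac_key = hashlib.sha256(b"mac" + key).digest()
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    stream = _keystream(enc_key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def access(blob: bytes, capabilities: dict) -> Optional[bytes]:
    """Evaluate visibility through the outer wrapper, then the inner one."""
    outer_key = capabilities.get("outer")
    inner_key = capabilities.get("inner")
    if outer_key is None:
        return None
    inner_blob = unwrap(outer_key, blob)
    if inner_blob is None or inner_key is None:
        return None
    return unwrap(inner_key, inner_blob)


if __name__ == "__main__":
    k1, k2 = os.urandom(32), os.urandom(32)
    composite = wrap(k2, wrap(k1, b"patient record"))
    print(access(composite, {"inner": k1, "outer": k2}))
    print(access(composite, {"outer": k2}))
```

A requester holding only the outer capability learns that the outer wrapper can be pierced but still sees nothing of the payload, which is the sense in which each wrapper is evaluated on its own criteria.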
Receiving may include receiving data, metadata, or both protected by a composite wrapper formed at least in part from a mathematical algorithm that enables the first or second wrapper to at least partially decompose after an implicitly or explicitly defined condition is satisfied. Receiving may include receiving data, metadata, or both, protected by a composite wrapper formed at least in part from a mathematical algorithm that enables at least one of the first and second wrappers to allow full access to the data, metadata, or both after an implicitly or explicitly defined condition is satisfied.
Receiving may include receiving data, metadata, or both protected by a composite wrapper formed at least in part from a mathematical algorithm that enables selective opacity to the data, the metadata, or both.
Receiving may include receiving data, metadata, or both protected by a composite wrapper formed at least in part from a mathematical algorithm that includes first and second mathematical transformations that form first and second wrappers based on first and second sets of criteria, respectively, the first or second set of criteria including at least one of: a representation of key information; information asserting evidence of a role; a type of the data, metadata, or both; a type of association of the data, metadata, or both; or information asserting evidence of possession of at least one claim.
Receiving may include receiving data or metadata protected by a composite wrapper formed from a searchable encryption algorithm. Receiving may include receiving, by a device in the first control area, data, metadata, or both, from a device in the second control area.
Receiving may include receiving data, metadata, or both, wherein the data, metadata, or both are formed by analyzing the data, metadata, or both and encrypting an output of the analysis based on key information. Receiving a request to access data, metadata, or both may include receiving trapdoor data that enables visible access to the data, metadata, or both, such as visible access defined by a cryptographic trapdoor of the trapdoor data.
Receiving may include receiving data, metadata, or both protected by a composite wrapper formed from mathematical transformations of the data, metadata, or both, including a first mathematical transformation forming a first wrapper for the data and a second mathematical transformation forming a second wrapper for the metadata.
In other embodiments, receiving may include receiving data or metadata protected by a composite wrapper formed from mathematical transformations including a first mathematical transformation defining, based on a first set of criteria, a first wrapper that wraps less than all of the data, the metadata, or both, and a second mathematical transformation defining a second wrapper that wraps all of the data, the metadata, or both.
The second wrapper may wrap all of the data, metadata, or both that were partially wrapped by the first wrapper. Receiving may include receiving data, metadata, or both protected by a composite wrapper comprising complementary wrappers, including at least the first and second wrappers, for satisfying complementary trust or security criteria.
In one embodiment, if the state of the data, metadata, or both changes to a new state, an additional wrapper appropriate to a new set of criteria associated with the new state is automatically added. Alternatively, at least one wrapper may be automatically removed, as appropriate to the new set of criteria associated with the new state. Alternatively, if the state of the data, the metadata, or both changes to a new state, determining the access privileges may include determining the access privileges based on an unrestricted capability granted by the entity that generated the capability.
In other embodiments, if the confidentiality category of the data, metadata, or both changes to a more sensitive category, an additional wrapper appropriate to the more sensitive category may be automatically added to the data, metadata, or both. If the state of the data, metadata, or both changes to a new state, the first wrapper or the second wrapper may also be changed as appropriate to a new set of criteria associated with the new state. Such a change may include modifying the first wrapper or the second wrapper as appropriate to the new set of criteria associated with the new state.
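By way of a non-limiting illustration only, the automatic addition or removal of wrappers when the state of the data changes might be sketched as a reconciliation against a per-category policy. The category names, the wrapper names, the `REQUIRED_WRAPPERS` table, and the `WrappedItem` class are all hypothetical and introduced for this sketch; wrappers are represented by name only, and the transformations themselves are omitted.

```python
from dataclasses import dataclass, field

# Assumed policy: which wrappers each confidentiality category requires.
REQUIRED_WRAPPERS = {
    "public": ["integrity"],
    "internal": ["integrity", "confidentiality"],
    "secret": ["integrity", "confidentiality", "audit"],
}


@dataclass
class WrappedItem:
    category: str
    wrappers: list = field(default_factory=list)

    def __post_init__(self) -> None:
        self._reconcile()

    def set_category(self, new_category: str) -> None:
        """Change state, then add or remove wrappers to fit the new criteria."""
        if new_category not in REQUIRED_WRAPPERS:
            raise ValueError(f"unknown category: {new_category}")
        self.category = new_category
        self._reconcile()

    def _reconcile(self) -> None:
        required = REQUIRED_WRAPPERS[self.category]
        # Add any wrapper that the new state requires but that is missing.
        for name in required:
            if name not in self.wrappers:
                self.wrappers.append(name)
        # Remove any wrapper that the new state no longer requires.
        self.wrappers = [w for w in self.wrappers if w in required]


if __name__ == "__main__":
    item = WrappedItem("public")
    item.set_category("secret")
    print(item.wrappers)
```

Moving the item back to a less sensitive category removes the wrappers that no longer apply, matching the alternative in which wrappers are automatically removed on a state change.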
In another embodiment, if the state of the data, metadata, or both changes to a new state, at least some of the data, metadata, or both may be revised by a mathematical transformation based on at least one of the first wrapper or the second wrapper, as appropriate to a new set of criteria associated with the new state. Likewise, if the state of the data, metadata, or both changes to a new state, the first wrapper or the second wrapper may be deleted. Alternatively, if the data, metadata, or both change, the metadata may be augmented with change metadata that describes at least one change to the data, metadata, or both. As another alternative embodiment, if the data, metadata, or both change, change metadata describing at least one change to the data, metadata, or both may be encoded in the first wrapper.
The determination of access privileges may include determining an order in which to evaluate visibility based on a defined hierarchy of at least a first wrapper relative to at least a second wrapper. For example, the determination of access privileges may include determining that the order in which visibility is evaluated is based on a hierarchy defined by a tree data structure.
As alternative embodiments, the determination of access privileges may include determining a concentric order in which visibility is evaluated. As alternative embodiments, the determination of access privileges may include determining a lateral order in which visibility is evaluated. The determination of access privileges may include determining the order based on a concentric and lateral order in which visibility is evaluated.
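By way of a non-limiting illustration only, a hierarchy defined by a tree data structure can express both orders at once: nesting depth gives the concentric order, and sibling wrappers at the same depth give the lateral order. The `Wrapper` class, the `visible` function, and the representation of a capability as a wrapper name are hypothetical simplifications introduced for this sketch; an actual wrapper would be pierced by a key or other access information rather than by name.

```python
from dataclasses import dataclass, field


@dataclass
class Wrapper:
    name: str
    payload: str = ""                              # revealed once this wrapper is pierced
    children: list = field(default_factory=list)   # wrappers nested inside this one


def visible(wrapper: Wrapper, capabilities: set) -> list:
    """Collect payloads visible to a requester, outermost wrapper first."""
    if wrapper.name not in capabilities:
        return []  # opaque: nothing nested inside can be seen either
    seen = [wrapper.payload] if wrapper.payload else []
    # Lateral siblings are evaluated independently of one another.
    for child in wrapper.children:
        seen.extend(visible(child, capabilities))
    return seen


if __name__ == "__main__":
    tree = Wrapper("outer", children=[
        Wrapper("metadata", "author=alice"),
        Wrapper("body", children=[
            Wrapper("salary", "90000"),
            Wrapper("address", "1 Main St"),
        ]),
    ])
    print(visible(tree, {"outer", "body", "address"}))
```

In this arrangement a capability for an inner wrapper is useless without the capabilities for every wrapper enclosing it, while failing to pierce one lateral wrapper has no effect on its siblings.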
The determination of access privileges may include first evaluating visibility through the first wrapper and evaluating visibility through the second wrapper if the set of capabilities allows access privileges to the data, metadata, or both. The determination of access privileges may include first evaluating visibility through the second wrapper and evaluating visibility through the first wrapper if the set of capabilities allows access privileges to the data, metadata, or both.
The determination of access privileges may include first evaluating visibility through a second wrapper applicable to the originating metadata of the metadata and augmenting the originating metadata based on the entity requesting the access privileges. The determining may include determining the access privileges based on evaluating visibility through a first wrapper adapted for an external data set including the metadata and independently evaluating visibility through a second wrapper adapted for an internal data set including the data. The determining may include determining the access privileges based on evaluating visibility through a first wrapper applicable to an encrypted index corresponding to the data.
The process may also include blindly searching the encrypted index via selective access to the encrypted index through the first wrapper.
Defining the first wrapper or defining the second wrapper may include defining access speed requirements for the data, the metadata, or both. Defining the first wrapper or defining the second wrapper may include defining tamper-resistant requirements for the data, the metadata, or both. Defining the first wrapper or defining the second wrapper may include defining recovery reliability requirements specified for the data, the metadata, or both.
A system may include a mathematical transformation component, distributed at least in part by a mathematical transformation technology provider, implemented independently of an access information generator, wherein the access information generator generates capability information for at least one of: publish the data, the metadata, or both, or subscribe to the published data, the published metadata, or both, wherein the mathematical transformation component comprises at least one processor configured to execute at least one encoding algorithm or decoding algorithm based on the capability information generated by the access information generator. The system may also include a network service provider implemented independently of the access information generator and the mathematical transformation component, including at least one processor configured to implement a network service related to the computer data, the computer metadata, or both encrypted by the mathematical transformation component, the network service provider configured to communicate with the mathematical transformation component to perform the generation, regeneration, modification, augmentation, or deletion of at least two mathematical transformation wrappers applicable to the computer data, the computer metadata, or both.
The network service provider may be configured to generate, regenerate, change, augment, or delete a wrapper based on at least one temporal event that modifies a trust requirement of a set of trust requirements for the wrapper. The network service provider may be configured to regenerate, change, augment, or delete a wrapper based on determining that a mathematical transformation technique used to generate the wrapper no longer satisfies a trust requirement of the set of trust requirements.
The network service provider may be configured to generate, regenerate, change, augment, or delete a wrapper based on at least one spatial event that modifies a trust requirement of a set of trust requirements for the wrapper. The network service provider may be configured to regenerate, change, augment, or delete a wrapper based on determining that the mathematical transformation technique used to generate the wrapper is no longer applicable to the party that generated the wrapper.
In other embodiments, the trusted platform is used as a convertible framework for publishers to mathematically obfuscate data to enable subscribers to selectively access segments for which the subscribers are authorized. In this regard, the platform enables data to act as its own custodian by simultaneously protecting the data and allowing access by authorized subscribers while preserving integrity and security.
Data that acts as its own custodian can be implemented with a federated trust overlay with pluggable services, as described in embodiments and in specific sections below. By implementing more than just mathematical obfuscation (e.g., encryption), embodiments provide the following guarantee to users and escrow agents of data: regardless of where and how the data is stored, the data retains the confidentiality and integrity requirements appropriately defined by the publisher or holder of the data. In this regard, the focus shifts or extends from protecting the boundaries, pipes, and containers of data to protecting the data and associated metadata by providing a cryptographically secure trust envelope that, when presented with appropriate capabilities (e.g., keys), allows access to that data/metadata or a particular subset.
In one embodiment, a method for hosting data is provided that includes receiving, by a computing device in a first control region, obscured data from a computing device in a second control region, the obscured data formed from a mathematical transformation of data of a defined data set of the computing device in the second control region. The method also includes receiving, by a computing device in the first control region, obscured metadata formed from an analysis of the data and at least one other mathematical transformation of an output of the analysis. Next, it is determined in which one or more containers, of a set of containers having at least two different container types, to store the obscured data and/or the obscured metadata.
In one non-limiting implementation of the system, the one or more mathematical transformation components are distributed, at least in part, by a mathematical transformation algorithm provider implemented independently of a generator that generates mathematical transformation predicate information (e.g., key information) for at least one of: publish data and metadata, or subscribe data and metadata. The one or more mathematical transformation components execute at least one searchable data obfuscation algorithm (e.g., searchable encryption) or searchable data revealing (e.g., searchable decryption) algorithm based on mathematical transformation predicate information generated by the generator. A network service provider implemented independently of the generator and the one or more mathematical transformation components implements a network service related to the data or metadata obscured by the one or more mathematical transformation components, and the network service provider includes a data container management component that manages where the data or metadata obscured by the at least one mathematical transformation component is stored based on at least one of data latency requirements, data reliability requirements, distance from data consumption requirements, or data size requirements of the network service.
Data that acts as its own custodian provides access to the data at a fine or specified level of granularity, when needed, or when expected to be needed, without requiring full rights to a given data set. An operator at the cloud storage provider also cannot view, modify, tamper with, or delete data without being detected unless such viewing, modification, tampering, or deletion is explicitly authorized according to the capabilities granted to the operator, such as maintenance of server logs, or some other limited operation on metadata to plan storage capabilities, and so forth. Furthermore, containerless data enables proactive replication that facilitates tamper resistance, a requirement that conventional systems cannot adequately address.
In one embodiment, a federated trust overlay (FTO) is implemented with one or more of the following components: a Cloud Data Service (CDS) or cloud storage provider, a Cryptographic Technology Provider (CTP), and a Center for Key Generation (CKG). The CDS can be provided by any storage provider, i.e., containerless data does not require a specific container. The CTP may also be provided by any party, provided that the party operates in a control area separate from the CDS, whether based on an open specification for implementing a CTP or a proprietary implementation of the CTP. Separating the key generation function and subjecting the mathematical principles (such as encryption principles) to public inspection inspires confidence that the CTP's approach is unbiased, and the CTP may be implemented by a business or individual user, or outsourced to a third party with CTP expertise. Also, proprietary versions, open versions for companies, open or closed versions for governments or sovereign entities, benchmark open source versions, or other categories may be created for pre-packaged use or implementation by a given entity.
The CKG entity generates key information according to the technique specified by the CTP and is also provided as a separate component of the federated trust overlay (however, the CKG may also be combined with other components depending on the level of trust desired for a given implementation of the FTO). In various embodiments, although a CKG may be a centralized entity, "Center" as used herein is a logical reference rather than an indication of a centralized entity, and thus the CKG may also be distributed and federated. A CKG may serve a single entity or multiple collaborators, e.g., a multi-party collaboration between pharmaceutical companies, to share and access information according to a key exchange from an agreed-upon CKG. Thus, with an FTO, by separating capabilities, trust and confidentiality are maintained, preventing insight into stored information, logs, or access patterns without explicit authorization, and also allowing tamper detection and integrity checks, e.g., verification. For example, a service provider cannot modify or delete data without being detected. The ability to audit with non-repudiation enables customers to comfortably let go of their data while ensuring that no one has accidentally or intentionally interfered with it. Logs also have the same guarantees as data and metadata.
Result "validation" is another feature that may be included in an FTO implementation and is described in more detail below. Validation ensures that the cloud cannot withhold data that is requested from it, e.g., it cannot deliver two documents when three documents are requested. The concept of separation can be taken even further by considering separate implementations of the CKG and any service that performs validation of data, and by separating the data from an application service provider that receives, modifies, retrieves, augments, or deletes the data or metadata based on the capabilities granted to the application service provider. This also has the additional benefit of maintaining application capabilities based on the then-current access characteristics, updated security model, updated roles, time of day, etc.
Combining all or even some of the above features (such as described in more detail below in various embodiments) increases the likelihood of mitigating concerns over cloud storage of data. At the enterprise level, the enterprise can own policies and control enforcement in a granular manner, even if data and applications are hosted in the cloud. The system can be integrated with enterprise security infrastructure, such as identity metasystems (e.g., Claims, identity lifecycle management, Active Directory, etc.). Enterprises may be exposed to as much or as little of the FTO implementation as needed.
The provisioning of data services as described herein involves various combinations and permutations of storage and cryptographic techniques that allow for cost-effective as well as secure and private solutions. For example, optional embodiments described in more detail below implement data protection techniques including size-preserving encryption, searchable encryption, and/or a cryptographic technique known as proof of application (referring to the general technique). Such embodiments enable new business scenarios for outsourced cloud data protection, disaster recovery, or analytics. As discussed in the background, none of the conventional systems implement cloud or network data services in a manner that addresses customers' concerns about loss of privacy or security.
In this regard, to eliminate the trust barrier surrounding conventional provisioning of network services, a trusted cloud computing and data services ecosystem or framework is provided that achieves the above-identified goals and other advantages highlighted in the various embodiments described below. The term "cloud" service generally refers to the concept that a service is not executed locally on a user device, but is delivered from one or more remote devices that may be accessed over one or more networks. Since the user's device does not need to understand the details of what is happening at the remote device or devices, the service appears, from the point of view of the user's device, to be delivered from the "cloud".
In one embodiment, a system includes a key generator that generates key information for publishing or subscribing to data. A cryptographic technology provider implemented independently of a key generator implements a searchable encryption/decryption algorithm based on key information generated by the key generator. In addition, a network service provider, implemented independently of the key generator and the cryptographic technology provider, provides network services related to data encrypted by the cryptographic technology provider.
In one embodiment, a data store is provided that exposes selectively accessible (e.g., searchable) encrypted data, to which at least one publisher publishes data representing a resource. To divide up the potential for abuse of trust, a first independent entity performs the generation of cryptographic key information. A second independent entity in turn encrypts the published data, prior to storing the published data, based on the cryptographic key information generated by the first independent entity. A set of network or cloud services then selectively accesses the encrypted data for requests to the network service based on the late-binding selected privileges granted by the publisher or owner of the resource.
In other embodiments, the data store stores encrypted data that is selectively accessible, wherein subscribers subscribe to a specified subset of the encrypted data. The first independent entity generates cryptographic key information based on identity information associated with the subscriber, and the second independent entity performs decryption of the specified subset based on the cryptographic key information generated by the first independent entity. The network service responds to the subscriber's request and provides selective access to the encrypted data based on the late-binding selected privileges granted by the publisher or owner of the specified subset.
In this regard, the terms "publisher" and "subscriber" generally refer to anyone who publishes or subscribes to the data of the trusted cloud service, respectively. In practice, however, publishers and subscribers will assume more specific roles depending on the industry, domain, or application of the trusted cloud services ecosystem and the digital hosting mode. For example, in the context of the data of an entire system, typically only a small group of subscribers will have privileges to access the data. As another example, an auditor of an encrypted data store has certain capabilities based on the auditor's role with respect to the data, to ensure that certain requirements, such as frequency of backups, are met without being authorized to access the content itself.
In one non-limiting embodiment, a method for hosting data comprises: receiving, by a first computing device in a first control area, encrypted data from a second computing device in a second control area, the encrypted data formed by encrypting data of a defined data set of the second computing device according to a searchable encryption algorithm based on cryptographic key information; receiving, by the first computing device, encrypted metadata formed from analyzing the data and encrypting an output of the analysis based on the cryptographic key information; and automatically determining a container from among containers of at least two different container types used to store the encrypted data or the encrypted metadata. Trapdoor data is received that enables visible access to the encrypted data or metadata as defined by at least one cryptographic trapdoor of the trapdoor data.
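The trapdoor-based selective access described above can be illustrated with a minimal sketch. It does not implement a real searchable encryption scheme. As a stand-in, it assumes a simple keyed-hash (HMAC-SHA256) keyword index, in which the storage side matches an opaque token without learning the keyword. The names `keyword_token`, `build_index`, `blind_search`, and the key value are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Set


def keyword_token(index_key: bytes, keyword: str) -> str:
    """Derive a deterministic keyed token for a keyword (stand-in for a trapdoor)."""
    return hmac.new(index_key, keyword.lower().encode("utf-8"),
                    hashlib.sha256).hexdigest()


def build_index(index_key: bytes,
                documents: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Publisher side: map keyword tokens to document identifiers."""
    index: Dict[str, Set[str]] = {}
    for doc_id, keywords in documents.items():
        for keyword in keywords:
            index.setdefault(keyword_token(index_key, keyword), set()).add(doc_id)
    return index


def blind_search(index: Dict[str, Set[str]], trapdoor: str) -> Set[str]:
    """Storage side: match the token against the index without seeing the keyword."""
    return set(index.get(trapdoor, set()))


# Assumption: the key is issued by a key generator separate from the storage provider.
key = b"hypothetical-index-key"
index = build_index(key, {
    "doc1": ["invoice", "2010"],
    "doc2": ["invoice"],
    "doc3": ["memo"],
})

trapdoor = keyword_token(key, "invoice")
print(sorted(blind_search(index, trapdoor)))  # → ['doc1', 'doc2']
```

A token derived under a different key matches nothing in the index, which loosely mirrors the requirement that access be granted only when an appropriate capability is presented.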
The container in which the encrypted data or metadata is stored may be automatically switched or changed if predefined conditions for the plurality of containers are met. For example, if some data or metadata becomes high priority for a client, it can be moved from slower, longer term storage to an agile container with low access latency. Alternatively, data or metadata may be moved, copied, or deleted for other efficiency reasons, e.g., based on a storage size associated with the encrypted data or metadata, based on access speed requirements specified for the encrypted data or metadata, based on recovery reliability requirements specified for the encrypted data or metadata, based on proximity to one or more devices having access to the encrypted data or metadata, etc.
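The container selection described above can be sketched as a simple policy function. This is an illustration under stated assumptions, not the method of the embodiments: the `Container` and `Requirements` fields, the numeric values, and the tie-breaking rule (prefer the slowest container that still satisfies the requirements) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Container:
    name: str
    latency_ms: float    # typical access latency
    reliability: float   # hypothetical recovery-reliability score, 0.0 to 1.0
    free_bytes: int


@dataclass(frozen=True)
class Requirements:
    max_latency_ms: float
    min_reliability: float
    size_bytes: int


def choose_container(containers: List[Container],
                     req: Requirements) -> Optional[Container]:
    """Return a container that satisfies every requirement, or None.

    Assumed tie-break: among qualifying containers, pick the slowest one,
    leaving low-latency storage free for high-priority data.
    """
    candidates = [
        c for c in containers
        if c.latency_ms <= req.max_latency_ms
        and c.reliability >= req.min_reliability
        and c.free_bytes >= req.size_bytes
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.latency_ms)


containers = [
    Container("archive", latency_ms=5000.0, reliability=0.9999, free_bytes=10**12),
    Container("fast", latency_ms=5.0, reliability=0.999, free_bytes=10**9),
]

low_priority = Requirements(max_latency_ms=10000.0, min_reliability=0.99, size_bytes=10**6)
high_priority = Requirements(max_latency_ms=50.0, min_reliability=0.99, size_bytes=10**6)

print(choose_container(containers, low_priority).name)   # → archive
print(choose_container(containers, high_priority).name)  # → fast
```

Re-running the function when a data item's requirements change models the automatic switch from longer-term storage to a low-latency container.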
In another non-limiting embodiment, a system includes a cryptographic component, distributed at least in part by a cryptographic technology provider, implemented independently of a key generator that generates key information for publishing data and metadata or subscribing to data and metadata, that searchably encrypts or decrypts data and metadata based on the key information generated by the key generator.
The system may also include a network service provider implemented independently of the key generator and the cryptographic component, the network service provider providing a network service related to the data and metadata encrypted by the cryptographic component, the network service provider including a data container management component that manages where the data or metadata encrypted by the cryptographic component is stored based on data latency requirements, data reliability requirements, distance from data consumption requirements, or data size requirements of the network service. The key information may include capability information defining access privileges with respect to data or metadata encrypted by the cryptographic component. This capability information may be late-bound so that up-to-date data access privileges are granted to a given subscriber.
In another non-limiting embodiment, a computing system includes a data store storing selectively accessible encrypted data or metadata, wherein a publisher publishes data or metadata to the data store representing a resource, a first independent entity generates cryptographic key information, and a second independent entity encrypts the published data or metadata prior to storage in the data store based on the cryptographic key information generated by the first independent entity. The system provides a network service that enables selective access to the encrypted data or metadata for requests to the network service based on late-binding selected privileges granted by a publisher or owner of the resource. In this regard, the system is agnostic to container type, and thus the data store includes containers of different container types and the data store automatically distributes storage of selectively accessible encrypted data or metadata to individual containers based on analysis of current storage resources represented by the containers.
In one embodiment, the "data" is XML data that includes XML payload data (e.g., the text string "Michael Jackson") and XML tag information (e.g., <Name>) applied to the payload. To provide various trust guarantees using a combination of cryptographic key generating entities (CKGs) and cryptographic technology providing entities (CTPs), as noted above, any of the embodiments herein with respect to XML data or metadata may also be applied to other formats, such as, but not limited to, JSON, S-expressions, EDI, etc., and thus XML is used for illustrative purposes only in the presently described embodiments.
If the XML data is a piece of a larger document, it may also encode manifest information for locating other related fragments. Because of the way the data is spread across different containers, i.e., one or more intermediate layers handle the storage details of a particular container, implementations are technology independent (any CKG/CTP may be used). Further, beyond trust wrappers, implementations are also open-ended in that any number of wrappers can be applied in addition to searchable encryption and validation or verification, as new wrapper techniques become applicable. Tags that help modulate consistency, trajectory, etc. may also be added on top of (or by augmenting) pre-existing data and metadata.
If the data/information is in XML format, any of these techniques or wrappers may be applied to the structured XML data so that the data may be selectively queried to gain access to the XML fragments. Currently, XML has a standard format, namely <tag "value"> or <tag "value" | XML end-tag>. Advantageously, using a structured XML document, there is a way to represent the structure hierarchically such that there is an external wrapper that will point to CKG/CTP 'frames' unique to the digital pipe model. Thus, when there is a need or desire to access an embedded fragment, existing trust with the <CKG> and <CTP> wrappers can be utilized or a new trust set can be established with a new CKG/CTP frame.
This may be provided through a standard public key infrastructure (PKI); however, the particular schema selected is not to be considered limiting of the techniques described herein. In this regard, regardless of the particular set of encryption techniques selected, the embodiments described herein enable a user to search, extract, and decrypt various pieces, subsets, or portions of encrypted data or metadata. In addition, a public proof of data possession (by a trusted third party that operates on behalf of a device) may be performed to verify that the particular XML fragment accessed has not been tampered with since it was originally authored.
Essentially, a "trusted envelope" of XML fragments or complete records (e.g., "payload") is provided through various "decorations" that allow the trust to span a full range of trust guarantees such as, but not limited to, confidentiality, privacy, anonymity, and integrity.
As one example of the type of information that may be represented in the XML tag information as part of a trusted envelope, pieces of an XML document may be designated with respective sensitivity levels. For example, there may be documents with Public, Secret, and Top Secret paragraphs. A person performing a search and requesting access using a "Secret" permission will only gain access to the Public and Secret paragraphs. The classification of paragraphs may also be used to determine encryption mechanisms, keys, and access policies. For example, a policy may be implemented that Secret content cannot be accessed from a wireless or remote device.
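The sensitivity-level example above can be sketched as follows. This is a plain filter for illustration only; in the embodiments the restriction would be enforced by the encryption mechanisms and keys rather than by an in-memory check. The names `Sensitivity` and `visible_paragraphs` and the sample paragraphs are hypothetical.

```python
from enum import IntEnum
from typing import List, Tuple


class Sensitivity(IntEnum):
    """Ordered sensitivity levels; a higher value is more restricted."""
    PUBLIC = 0
    SECRET = 1
    TOP_SECRET = 2


def visible_paragraphs(paragraphs: List[Tuple[Sensitivity, str]],
                       clearance: Sensitivity) -> List[str]:
    """Return the paragraphs whose level does not exceed the requester's clearance."""
    return [text for level, text in paragraphs if level <= clearance]


document = [
    (Sensitivity.PUBLIC, "Press summary."),
    (Sensitivity.SECRET, "Internal figures."),
    (Sensitivity.TOP_SECRET, "Acquisition plan."),
]

print(visible_paragraphs(document, Sensitivity.SECRET))
# → ['Press summary.', 'Internal figures.']
```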
Similarly, such classification can be used to create policies on how data can be stored, where data can be stored, how long data can be stored, and so forth. For example, a policy may be created that requires (sensitive) medical data to be backed up once a day, using AES-256 encryption, to a secure server in a trusted data center.
In one embodiment, a method for hosting extensible markup language (XML) data includes: the first computing device in the first control region receives encrypted XML data from the second computing device in the second control region, the encrypted XML data including encrypted XML payload data and an encrypted XML tag. The encrypted XML data is formed based on encryption of the defined XML data set of the second computing device by the searchable encryption algorithm based on the cryptographic key information. The request for data includes a capability based on cryptographic key information defining privileges for accessing at least some of the encrypted XML payload data or the encrypted XML tag and enabling selective access to the encrypted XML data as defined by the capability.
Although some embodiments are described in the context of encryption of XML data, any mathematical transformation or obfuscation of XML data may be used. For example, in one embodiment, XML data is distributed according to a substantially non-guessable data distribution algorithm that distributes the XML data across different storage locations. A mapping is maintained that allows the relevant portions of data for which the requesting entity has privileges to be reconstructed if access to the mapping is granted. In this regard, the embodiments described herein in the context of encryption may thus be generalized to any algorithm or mathematical transformation that obfuscates or otherwise encodes data in a manner that hides the data without access privileges.
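The distribution-with-mapping idea above can be sketched briefly. The sketch assumes that storage locations are in-memory dictionaries, that chunks are placed under random slot names drawn from Python's `secrets` module, and that the mapping is simply the ordered list of (location, slot) pairs. The names `distribute` and `reconstruct` are hypothetical, and the chunks themselves are left unencrypted for brevity.

```python
import secrets
from typing import Dict, List, Tuple

Store = Dict[str, Dict[str, bytes]]


def distribute(data: bytes, locations: Store,
               chunk_size: int = 4) -> List[Tuple[str, str]]:
    """Scatter chunks of data across locations under unpredictable slot names.

    Returns the mapping (ordered list of (location, slot) pairs) needed to
    reassemble the data. Without the mapping, chunk order and placement are
    not readily guessable.
    """
    names = list(locations)
    mapping: List[Tuple[str, str]] = []
    for start in range(0, len(data), chunk_size):
        location = secrets.choice(names)
        slot = secrets.token_hex(8)  # random 64-bit slot identifier
        locations[location][slot] = data[start:start + chunk_size]
        mapping.append((location, slot))
    return mapping


def reconstruct(mapping: List[Tuple[str, str]], locations: Store) -> bytes:
    """Reassemble the data for an entity that was granted access to the mapping."""
    return b"".join(locations[location][slot] for location, slot in mapping)


stores: Store = {"site_a": {}, "site_b": {}, "site_c": {}}
original = b"sensitive XML fragment"
mapping = distribute(original, stores)

print(reconstruct(mapping, stores) == original)  # → True
```

Granting a requester only part of the mapping would allow reconstruction of only the corresponding portion of the data.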
The capabilities may include trapdoor data including cryptographic trapdoors for selectively accessing the encrypted XML payload data or encrypted XML tags. The encrypted data includes auxiliary encrypted metadata formed from analysis of the encrypted XML payload data or encrypted XML tags. For example, a public, secret, or top secret level tag may be applied to each payload element of the XML document on a segment-by-segment basis, and may be included in the auxiliary encrypted metadata to implement a highly granular policy regarding access to portions of the XML document.
In another embodiment, a method for subscribing to searchable encrypted XML data comprises: receiving cryptographic key information from a key generation component that generates the cryptographic key information based on identity information associated with the subscriber device; requesting, by the subscriber device, a subset of the searchably-encrypted XML data and corresponding XML tag data, including transmitting the cryptographic key information to a storage provider of the searchably-encrypted XML data and corresponding tag data; and decrypting the encrypted XML data and the corresponding subset of XML tag data as allowed by the capabilities defined in the cryptographic key information.
For each XML fragment of the encrypted XML data, XML tag data representing a confidentiality rating of the corresponding encrypted XML data may be decrypted and it may be determined whether the capability allows access to the data having the confidentiality rating. This includes a public confidentiality level with open access privileges, or a less open secret confidentiality level as defined according to policy.
The method may include confirming that the correct subset of encrypted XML data and corresponding XML tag data, consistent with the request, was received by the subscriber device. Examples of validation include performing a proof of data possession to prove that the subscriber device received the correct subset. The method may also include verifying that the contents of the subset of encrypted XML data and corresponding XML tag data were not deleted or modified prior to receipt of the subset. Examples of verification include performing a recoverability proof to prove that there was no interference with the content. Among other optional features, anonymization credentials associated with a subscriber device may be applied when requesting access to encrypted XML data or key information.
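The two checks above, that the correct subset was returned and that its contents were not altered, can be illustrated with a simplified stand-in. This is not a proof of data possession or a recoverability proof in the cryptographic sense. It assumes the publisher computed an HMAC-SHA256 tag per item at publication time and that the subscriber holds the tag key. The names `make_tag` and `check_response` are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Iterable


def make_tag(key: bytes, item_id: str, ciphertext: bytes) -> bytes:
    """Bind an item identifier to its ciphertext with a keyed tag."""
    message = item_id.encode("utf-8") + b"\x00" + ciphertext
    return hmac.new(key, message, hashlib.sha256).digest()


def check_response(key: bytes, requested_ids: Iterable[str],
                   expected_tags: Dict[str, bytes],
                   response: Dict[str, bytes]) -> bool:
    """Validate and verify a storage provider's response.

    Validation: exactly the requested items were returned (none withheld).
    Verification: each returned item still matches its publication-time tag.
    """
    requested = set(requested_ids)
    if set(response) != requested:
        return False
    return all(
        hmac.compare_digest(make_tag(key, item_id, response[item_id]),
                            expected_tags[item_id])
        for item_id in requested
    )


tag_key = b"hypothetical-tag-key"
stored = {"doc1": b"\x01\x02", "doc2": b"\x03\x04", "doc3": b"\x05\x06"}
tags = {item_id: make_tag(tag_key, item_id, blob) for item_id, blob in stored.items()}

complete = dict(stored)
withheld = {k: v for k, v in stored.items() if k != "doc3"}
tampered = {**stored, "doc2": b"\xff\xff"}

print(check_response(tag_key, stored.keys(), tags, complete))  # → True
print(check_response(tag_key, stored.keys(), tags, withheld))  # → False
print(check_response(tag_key, stored.keys(), tags, tampered))  # → False
```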
In another embodiment, a method for publishing extensible markup language (XML) data may comprise: encrypting XML data according to a searchable encryption algorithm based on cryptographic key information received from a separate key generator to form encrypted XML data including encrypted XML tag information, wherein the key generator generates the cryptographic key information; and transmitting the encrypted XML data to a network service provider for storage of the encrypted data, wherein the encrypted data is selectively accessible according to a late binding of the selected privileges granted to the requesting device based on the identity information of the requesting device. The encrypting may include receiving cryptographic key information from a key generator executing in a separate control area, the key generator generating the cryptographic key information based on an identity of a publishing device that performs the encrypting of the XML data.
In another embodiment, a method for subscribing to extensible markup language (XML) data includes: in response to a request by a subscriber device for a subset of the searchably encrypted XML data that includes an encrypted XML tag, receive cryptographic key information from a key generation component and decrypt the subset of the encrypted XML data according to privileges defined in the cryptographic key information that are granted to the subscriber device, wherein the key generation component generates the cryptographic key information based on identity information associated with the subscriber device.
Techniques may include a subscriber device requesting proof that a correct data item was received for data items of a subset of encrypted XML data, which may include receiving the following information: this information proves to the subscriber device that the data items in the subset of encrypted XML data requested by the subscriber device are correct. Techniques may include a subscriber device requesting proof that a subset of the encrypted XML data has not been tampered with prior to the request, which may include the subscriber device receiving the following information: this information proves to the subscriber device that the subset of the encrypted XML data was not tampered with prior to the request.
In yet another embodiment, a system includes a data store storing selectively accessible encrypted XML payload data and corresponding encrypted XML tag data corresponding to the encrypted XML payload data, wherein a subscriber requests a subscription to the encrypted XML payload data or a subset of the encrypted XML tag data, a first independent entity generates cryptographic key data based on identity information associated with the subscriber, and a second independent entity performs decryption of the subset based on the cryptographic key information generated by the first independent entity. The system also includes a web service for processing a subscriber's request that provides selective access to the subset of encrypted XML payload data or encrypted XML tag data. The system may be configured to confirm that the subset of encrypted XML payload data or encrypted XML tag data is the correct subset consistent with the subscription and/or verify that the subset of encrypted XML payload or encrypted XML tag data was not altered or deleted without authorization prior to selective access to the subset of encrypted XML payload or encrypted XML tag data.
In another embodiment, a system includes a cryptographic component distributed at least in part by a cryptographic technology provider implemented independently of a key generator that generates key information for publishing or subscribing to XML data or corresponding tag data, and a network service provider implemented independently of the key generator and the cryptographic component, the cryptographic component including a processor configured to execute a searchable encryption/decryption algorithm based on the key information generated by the key generator, the network service provider including a processor configured to implement a network service related to XML data or corresponding tag data encrypted by the cryptographic component. The key information includes "late binding" capability information by which up-to-date access privileges are granted to a given subscriber of the XML data or corresponding tag data.
Further details of these and other various exemplary, non-limiting embodiments and scenarios are provided below.
Verifiable trust through wrapper composition
As mentioned in the background, maintaining sensitive enterprise data at a remote site owned by a service organization may expose the data to risks ranging from privacy violations to data loss. As described for various embodiments herein, a network or cloud data service (including mathematical transformation techniques for data) is provided in a manner that distributes trust across multiple entities to avoid a single point of data compromise, and that decouples the data protection requirements from the container in which the data may be stored, processed, accessed, or retrieved. In one embodiment, the access information generator, the mathematical transformation technique provider, and the cloud service provider are each provided as separate entities, enabling a trusted platform for data publishers to publish data confidentially (encrypted) to the cloud service provider, and enabling selective access to the encrypted data by authorized subscribers based on subscriber capabilities. Multiple transformation wrappers or layers may, in whole or in part, apply mathematical transformations (e.g., encryption, distribution across storage, obfuscation) to the data, metadata, or both, or otherwise introduce a lack of visibility into some or all of the data, metadata, or both.
Embodiments of the present disclosure provide verifiable trust through a family of techniques called wrapper composition, with various applications including storage of data on servers, services, or clouds that are not fully trusted, or exchange of data through regions of control that cannot be fully trusted. Some other examples of cryptographic techniques or 'decorations' that may be applied to wrappers or envelopes over data in order to establish a high level of trust in the security and privacy of the data include, but are not limited to, size-preserving encryption, searchable encryption, proofs of application, blind fingerprints, proofs of retrievability, and the like. In some embodiments herein, the candidate data is referred to as 'content'.
Verifiable trust is the ability to provide guarantees such as anonymity, privacy, or integrity through various techniques, which may include, but are not limited to, cryptographic techniques for encryption, signatures, cryptographic hashes, and interactive proofs. Other types of guarantees exist, with associated classes of applicable cryptographic and other techniques. For ease of explanation, all of these techniques will be referred to hereinafter as 'cryptographic'.
Many conventional cryptographic and other schemes for protecting content (i.e., data) and for providing verifiable trust rely on "all or nothing" methods that require the recipient or reader to have an appropriate key or other mechanism for gaining access.
Emerging methods and systems can provide 'selective opacity' in which a party with access to such content can be provided with a delegation capability for performing a limited action on the content. These actions may include having selective access based on criteria including a key or role attestation or a claim of possession.
The actions may also include the ability to perform a range of actions on the content while continuing to have limited viewing of the content, which may include searching, routing, and workflow. An example of such a technique is known in the research literature as 'searchable encryption'.
There is a set of reasons, generally described below, including the diversity of scenarios, limitations of cryptographic techniques or systems, and complications arising from the different policies and verifiable trust requirements of participants, for which existing techniques and implementations are unable to meet these various needs in a manner that provides system and operational flexibility and dynamic composition. Loosely coupled wrappers (which may be transient) are described herein, with support for dynamically adding and removing wrappers based on the needs of any scenario.
FIG. 1 is a block diagram that illustrates a system for publishing or subscribing to data, metadata, or both in storage employing a composite wrapper, in an embodiment. According to the distributed trust platform, the mathematical transformation component 100 includes an access information generator 102. The access information may be a memory map, cryptographic key information, or other capabilities for reconstructing segments of the data that are hidden or otherwise distributed. A mathematical transformation technique provider 104 is also included. The component 100 is used by publishers to publish 110 data or by subscribers to subscribe 112 to data. Network service provider 120 facilitates interaction with data 150 and/or metadata 152 stored in storage 130. In this regard, wrappers 140 may be applied to the data 150, the metadata 152, or both, and these wrappers 140 may be generated, regenerated, deleted, augmented, altered, or otherwise changed to correspond to changes in the system or to instructions.
FIG. 2 is a block diagram that illustrates a system for publishing or subscribing to data, metadata, or both in storage employing a composite cryptographic wrapper, in an embodiment. According to the distributed trust platform, the mathematical transformation component 200 includes a cryptographic key generator 202. A cryptographic technology provider 204 is also included. The component 200 is used by publishers to publish 210 data or by subscribers to subscribe 212 to data. Network service provider 220 facilitates interaction with data 250 and/or metadata 252 stored in store 230. In this regard, cryptographic wrappers 240 may be applied to the data 250, the metadata 252, or both, and these wrappers 240 may be generated, regenerated, deleted, augmented, altered, or otherwise changed to correspond to changes in the system or to instructions.
FIG. 3 is an illustrative example of a concentric composite wrapper, where an outer wrapper 2 330 overlays an inner wrapper 1 320, both wrappers mathematically transforming (e.g., reducing the visibility of) data 300, metadata 310, or both. Wrappers 320 and 330 protect all of data 300, metadata 310, or both, and wrapper 1 320, as the inner wrapper, is not visible until the correct capability for seeing through wrapper 2 330 (the outer wrapper) is presented.
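By way of non-limiting illustration, the concentric arrangement of FIG. 3 can be sketched as two nested applications of a reversible transformation. The sketch below is an assumption made for illustration only: the names `keystream_xor`, `wrap`, and `unwrap`, the capability values, and the toy keystream transformation are hypothetical and are not part of the disclosure, and the transformation is not production-grade cryptography.

```python
import hashlib
import hmac


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy reversible transformation (NOT production cryptography):
    XOR the data with a SHA-256 counter-mode keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def wrap(capability: bytes, payload: bytes) -> bytes:
    """Apply one wrapper: transform the payload and prepend a 32-byte tag."""
    body = keystream_xor(capability, payload)
    tag = hmac.new(capability, body, hashlib.sha256).digest()
    return tag + body


def unwrap(capability: bytes, wrapped: bytes) -> bytes:
    """Remove one wrapper; fail if the capability does not match it."""
    tag, body = wrapped[:32], wrapped[32:]
    expected = hmac.new(capability, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise PermissionError("capability does not open this wrapper")
    return keystream_xor(capability, body)


# Hypothetical capabilities for wrapper 1 (inner) and wrapper 2 (outer).
inner_capability = b"capability-for-wrapper-1"
outer_capability = b"capability-for-wrapper-2"

data = b"payload protected by both wrappers"
composite = wrap(outer_capability, wrap(inner_capability, data))

# The outer wrapper must be opened before the inner wrapper is reachable.
recovered = unwrap(inner_capability, unwrap(outer_capability, composite))
```

In this sketch, presenting only the inner capability against the composite fails the tag check, which mirrors the inner wrapper being invisible until the outer wrapper is seen through.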
FIG. 4 is an illustrative example of a composite wrapper with lateral wrappers. For the lateral wrappers 420 and 430, the associated transformations do not make all of the data 400, the metadata 410, or both invisible. Rather, desired portions or items in the data 400, the metadata 410, or both are obscured while other portions may remain visible.
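As a non-limiting sketch of lateral wrappers, the following assumes that the data is a flat record of named string items and that each lateral wrapper obscures only the items it names. The helper names, the sample record, and the toy `keystream_xor` transformation are assumptions made for illustration and are not part of the disclosure.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy reversible transformation (NOT production cryptography)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def wrap_items(record: dict, items: set, capability: bytes) -> dict:
    """Obscure only the named items; every other item stays visible."""
    return {
        name: (keystream_xor(capability + name.encode(), value.encode()).hex()
               if name in items else value)
        for name, value in record.items()
    }


def unwrap_items(record: dict, items: set, capability: bytes) -> dict:
    """Reveal only the named items for a holder of the matching capability."""
    return {
        name: (keystream_xor(capability + name.encode(),
                             bytes.fromhex(value)).decode()
               if name in items else value)
        for name, value in record.items()
    }


# Hypothetical record and capabilities for two independent lateral wrappers.
record = {"title": "Design review", "salary": "120000", "patient_id": "P-0042"}
capability_a = b"capability-for-salary-wrapper"
capability_b = b"capability-for-patient-wrapper"

wrapped = wrap_items(wrap_items(record, {"salary"}, capability_a),
                     {"patient_id"}, capability_b)

# A holder of capability_a alone sees the salary but not the patient id.
partial = unwrap_items(wrapped, {"salary"}, capability_a)
```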
FIG. 5 is an illustrative example of a hybrid composite wrapper with both concentric wrappers and lateral wrappers in an embodiment. In this regard, FIG. 5 is an example of the wrappers of FIGS. 3 and 4 combined in an embodiment where some wrappers 320 and 330 wrap all of the data 500, the metadata 510, or both, and some wrappers 420 and 430 wrap portions of the data 500, the metadata 510, or both (these portions need not be contiguous and may satisfy any subset definition). In this regard, which wrappers are unwrapped first, or which wrappers are independent of each other, may be represented by any data structure that maintains a hierarchy.
FIG. 6 is a block diagram of using lateral wrappers in conjunction with access log metadata associated with data, in an embodiment. As an example use of lateral wrappers, where data 600 may include access logs 602 (e.g., provenance data), lateral wrapper 620 protects some of data 600 and lateral wrapper 622 protects access logs 602. Thus, not all data is protected by a given wrapper, and different criteria, or none at all, may be applied to different portions of a data item. FIG. 6 additionally illustrates that an additional full wrapper 630 may be applied over the partial wrappers 620, 622 as an additional layer of protection.
FIG. 7 is a block diagram that illustrates efficiently deleting data by discarding or shredding access information in an embodiment in which the manner in which the data is deleted is encoded in the metadata. In this regard, in such a system, one way to delete data, rather than unwrapping the wrapper 720 on the data 700 or the metadata 710 or both, is to simply discard or shred the access information generated by the access information generator 702, because then the data 700 or metadata 710 cannot be accessed without this information. However, if the system chooses to record some information about when or how the deletion occurred, information about the data 700, the metadata 710, or both may still be retained.
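The deletion approach of FIG. 7 can be sketched, under stated assumptions, as follows. The class `AccessInfoGenerator`, the in-memory `store` and `metadata` dictionaries, and the toy `keystream_xor` transformation are hypothetical stand-ins chosen for illustration only.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy reversible transformation (NOT production cryptography)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


class AccessInfoGenerator:
    """Hypothetical stand-in for the access information generator."""

    def __init__(self) -> None:
        self._access_info = {}

    def issue(self, item_id: str) -> bytes:
        self._access_info[item_id] = secrets.token_bytes(32)
        return self._access_info[item_id]

    def lookup(self, item_id: str):
        return self._access_info.get(item_id)

    def shred(self, item_id: str) -> None:
        # Discarding the access information is the deletion step.
        self._access_info.pop(item_id, None)


generator = AccessInfoGenerator()
store = {"doc-1": keystream_xor(generator.issue("doc-1"), b"contract text")}
metadata = {"doc-1": {}}

# While the access information exists, the wrapped data can be read.
readable_before = keystream_xor(generator.lookup("doc-1"), store["doc-1"])

# Delete by shredding the access information; the wrapped bytes stay in place.
generator.shred("doc-1")
metadata["doc-1"]["deletion"] = {"method": "access information shredded"}
readable_after = generator.lookup("doc-1") is not None
```

In this sketch the stored bytes are never touched, and only a note about how the deletion occurred is retained in the metadata. A real deployment would also need to ensure that no other copy of the access information survives.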
FIG. 8 is a block diagram additionally illustrating an example in which data is scrambled and information about the scrambling is recorded in metadata inside or outside a wrapper of a composite wrapper. In this example, the data 800, the metadata 810, or both, protected by the wrappers 820, may be scrambled, and information about the algorithm used may be added to the metadata 810 or to the external metadata 812, or even encoded in one of the wrappers 820, so that there is institutional knowledge of what happened to the data. Thus, either discarding the access information generated by the access information generator 802 or scrambling may be used to effectively delete data in the system.
FIG. 9 is a block diagram of a change in policy that results in handing out capabilities to view data, metadata, or both that are obscured by a wrapper, as an alternative to removing the wrapper. In this example, a way to "unprotect" protected data 900, metadata 910, or both is provided even though the wrapper protection remains in place. For example, based on a policy change 904, access information generator 902 simply generates capabilities for viewing all or part of data 900, metadata 910, or both through wrapper 920, at no cost to the requesting entity. This leverages the existing architecture so that expensive unwrapping operations do not have to be performed.
FIG. 10 is a block diagram of automatically unwrapping, generating, changing, regenerating, or augmenting a wrapper based on an instruction or state change. In this illustrative example, a change in state 1040 or an explicit instruction from the network service provider 1030 may automatically trigger a change to the wrapper 1020 protecting the data 1000, the metadata 1010, or both. For example, the wrapper may be automatically unwrapped when the data reaches a particular age (e.g., once the relevance of the data expires, it becomes clear text again). As another example, a wrapper 1020 may be generated when data becomes relevant (e.g., confidential). The wrapper 1020 may also be regenerated using a different encoding where there is reason to believe that a compromise has occurred. As another example, the wrapper may be altered, e.g., different transformations may be used for the wrapper, or different parameters of a mathematical transformation may be used.
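A minimal sketch of a state-driven wrapper change of the kind shown in FIG. 10 follows. The `Item` structure, the `apply_state_change` function, the age threshold, and the toy `keystream_xor` transformation are assumptions made for illustration, and only two of the many possible triggers are modeled.

```python
import hashlib
from dataclasses import dataclass


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy reversible transformation (NOT production cryptography)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


@dataclass
class Item:
    body: bytes
    wrapped: bool
    created_at: int  # hypothetical timestamp, in seconds


def apply_state_change(item: Item, capability: bytes, now: int,
                       max_age: int, confidential: bool) -> Item:
    """React to a state change by unwrapping or generating a wrapper."""
    expired = now - item.created_at >= max_age
    if item.wrapped and expired:
        # Relevance has lapsed: the data becomes clear text again.
        item.body = keystream_xor(capability, item.body)
        item.wrapped = False
    elif not item.wrapped and confidential and not expired:
        # The data has become confidential: generate a wrapper.
        item.body = keystream_xor(capability, item.body)
        item.wrapped = True
    return item


capability = b"hypothetical-capability"
item = Item(body=b"launch plan", wrapped=False, created_at=0)

apply_state_change(item, capability, now=10, max_age=100, confidential=True)
wrapped_body = item.body

apply_state_change(item, capability, now=100, max_age=100, confidential=True)
```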
FIG. 11 is an illustrative example of performing a data redaction task using one or more lateral wrapper transformations. For example, a redaction lateral wrapper 1120 may be used to redact company names or dates out of the data 1100, or certain keywords may be redacted out of the metadata 1110. As another example, while data is redacted with the redaction wrapper 1120, it is possible to record certain information about what was redacted, or when, in the metadata 1110 as obscured by the compliance metadata wrapper 1130, for those who need to know something about what was redacted in the data 1100.
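The arrangement of FIG. 11 can be sketched as below, where selected terms are removed from the data and a record of what was removed is kept in separately obscured metadata. The function `redact`, the sample company names, and the toy `keystream_xor` transformation are hypothetical. Note that this sketch removes the terms outright rather than applying an invertible transformation, which is a simplification.

```python
import hashlib
import json


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy reversible transformation (NOT production cryptography)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def redact(text: str, terms: list) -> tuple:
    """Remove each term from the text and log how often it occurred."""
    log = []
    for term in terms:
        log.append({"term": term, "occurrences": text.count(term)})
        text = text.replace(term, "[REDACTED]")
    return text, log


# Hypothetical capability held only by those with a need to know.
compliance_capability = b"hypothetical-compliance-capability"

data = "Contoso signed with Fabrikam; Contoso pays."
redacted, log = redact(data, ["Contoso", "Fabrikam"])

# The redaction log is itself obscured by a separate metadata wrapper.
compliance_metadata = keystream_xor(compliance_capability,
                                    json.dumps(log).encode())

# A holder of the compliance capability can recover the log.
audit_view = json.loads(
    keystream_xor(compliance_capability, compliance_metadata).decode())
```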
FIG. 12 is a flow diagram of an exemplary non-limiting process for hosting data, metadata, or both wrapped by a composite wrapper in an embodiment.
At 1200, data, metadata, or both protected by a composite wrapper are received, wherein the composite wrapper is formed from a first mathematical transformation that defines a first wrapper for the data, the metadata, or both based on a first set of criteria, and a second mathematical transformation that defines a second wrapper for the data, the metadata, or both based on a second set of criteria.
At 1210, a request is received to access the data, metadata, or both protected by the composite wrapper, based on a set of capabilities included in the request. At 1220, based on the set of capabilities, access privileges to the data, the metadata, or both are determined by evaluating visibility through the first wrapper and independently evaluating visibility through the second wrapper. At 1230, the state of the system may change, or an instruction explicitly requesting a change is received. At 1240, based on the state change or instruction, the first or second wrapper is changed, deleted, or augmented based on a new set of criteria (QoS requirements, type of content or metadata, compliance requirements, etc.), another wrapper is generated, a wrapper is deleted or undone, a wrapper is regenerated with the same or another mathematical transformation, or a policy governing the first or second wrapper is changed, etc.
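The flow of FIG. 12 can be sketched, in a simplified and non-limiting form, as an access decision in which visibility through each wrapper is evaluated independently. The `Wrapper` structure, the capability names, and the item names are assumptions introduced for illustration, and the mathematical transformations themselves are abstracted away.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wrapper:
    name: str
    covers: frozenset          # items obscured by this wrapper
    required_capability: str   # hypothetical capability that sees through it


def visible_through(wrapper: Wrapper, capabilities: set) -> bool:
    """Step 1220: evaluate visibility through a single wrapper."""
    return wrapper.required_capability in capabilities


def access_privileges(items: set, wrappers: list, capabilities: set) -> set:
    """An item is accessible only if every wrapper covering it is seen through."""
    return {
        item for item in items
        if all(visible_through(w, capabilities)
               for w in wrappers if item in w.covers)
    }


# Step 1200: data protected by a composite of two wrappers is received.
items = {"title", "body", "access_log"}
first = Wrapper("first", frozenset({"body", "access_log"}), "reader")
second = Wrapper("second", frozenset({"access_log"}), "auditor")

# Steps 1210 and 1220: requests arrive with different sets of capabilities.
reader_view = access_privileges(items, [first, second], {"reader"})
auditor_view = access_privileges(items, [first, second], {"reader", "auditor"})

# Steps 1230 and 1240: a state change replaces the second wrapper's criteria.
second = Wrapper("second", frozenset({"access_log"}), "compliance-officer")
after_change = access_privileges(items, [first, second], {"reader", "auditor"})
```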
Many of the techniques described herein rely on a protection hierarchy that may involve an "inner" payload containing the actual content to be protected and an "outer" payload containing an encrypted index describing inner data that may be subject to blind search techniques, such as those referred to in the research literature as "public key encryption with keyword search".
In practice, this reference to "inner" and "outer" may imply literal containment, where access to the "inner" payload would require unlocking the "outer" payload (i.e., wrapper) through techniques such as decryption. However, this may also be a virtual concept where there is no actual containment, but where it is difficult to locate or reassemble the inner payload without decoding the outer wrapper.
This difficulty mentioned in the previous sentence is due to the computational complexity introduced by cryptographic techniques or other mathematical transformations, or to the systematic techniques that provide isolation of internal and external content by ownership and control zones, or to the use of a combination of cryptographic dispersion techniques coupled with systematic techniques that optimize dispersion and reassembly.
There are several reasons for creating a hierarchical structure of wrappers to provide verifiable trust, some of which have been described above. These reasons include providing selective opacity based on natural hierarchies, such as organizational structures observed in the real world. Other reasons include the ability to combine different cryptographic techniques with complementary cryptographic or system properties in a manner that provides a composite that is more amenable to real-world implementations.
There are also many other reasons for adding or removing wrappers that resolve temporal or spatial events that modify the various requirements of verifiable trust in a way that does not require the inner wrapper or payload to remain intact.
Temporal events may include estimates or assumptions that the ability of the cryptographic techniques used to generate these wrappers to provide guarantees has been weakened by advances in hardware, software, or science. By generating new wrappers on a schedule, on demand, or on some other trigger, a system can be built to accommodate such weakening and thereby facilitate long-term provision of verifiable trust.
Spatial events may include the transfer of possession or ownership across parties handling the content, who may have reason (perhaps including compliance with jurisdictional requirements, laws, or regulations) to choose different cryptographic techniques; such events are particularly notable when the boundaries of sovereign entities or jurisdictions are crossed. The system may be built to embed capabilities in a gateway, directly in the wrapper, or in some hybrid manner, such that wrappers are materialized or vaporized based on policies that reflect the rules of the new possessor or owner.
Techniques exist in the literature that attempt to provide verifiable trust that can model natural hierarchies, including hierarchical identity-based encryption, attribute-based encryption, and predicate (or functional) encryption. There are also techniques that provide the inherent ability to materialize wrappers in a virtual manner through cryptographic properties such as full homomorphism. Even with these techniques available, there may still be a need for composition through wrappers as described above, for reasons ranging from verifiable trust to system considerations.
Cryptographic reasons for composing wrappers include the need to match techniques that can be composed as an inner wrapper, which provides stronger assurance or finer-grained control of access, and an outer wrapper, which leaks less information to adversaries but provides coarser-grained control and weaker assurance. The composition thus provides a more optimal mix of assurance and control.
System reasons may include the need to match techniques with varying properties, which may include performance and scale. In such a scenario, the outer wrapper may be constructed from a technology that provides a higher level of performance and scale, but with some limitation in functionality, and the inner wrapper from a technology with extended or complementary functionality, possibly with lower performance and scale.
In practice, it is typically the case that emerging cryptographic techniques offer an increased level of capability, but at the expense of performance and scale. There is also a lack of confidence in these emerging technologies, which typically dissipates over time as the technologies are analyzed, refined, and standardized. Also, over time, implementers learn to realize these techniques in steadily more efficient software or hardware, or combinations thereof.
Wrappers facilitate optimistic use of these new technologies in a manner that provides a 'safety net', where inner or outer wrappers can compensate for known or unknown vulnerabilities or deficiencies, or for an inherent lack of features, in the technology that implements any single wrapper. For example, an outer wrapper may provide keyword privacy and high performance, but with coarser-grained access control or none at all. This can be complemented by an inner wrapper that does not provide keyword privacy but does provide finer-grained access control, possibly at the cost of higher performance and scale penalties.
It is typically the case that the outer wrapper will be optimized toward 'false positives', since it is the wrapper more likely to admit a selection for search, routing, or workflow. This may reflect a higher, or containing, level of the permission hierarchy in a real-world organization. The corresponding inner wrapper, however, will specialize the result for an inner level of the hierarchy. For example, an outer wrapper may return all results that are appropriate for a math department in a university, while an inner wrapper may narrow these down to the results that are of interest to a particular mathematician.
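The coarse outer selection followed by a finer inner selection can be sketched as a two-stage search over keyed tokens. This is a deliberately simplified stand-in for searchable encryption: the deterministic HMAC tokens, the `trapdoor` helper, the capability values, and the sample index are all assumptions made for illustration.

```python
import hashlib
import hmac


def trapdoor(capability: bytes, keyword: str) -> str:
    """Derive a keyed token so the host can match without seeing the keyword."""
    return hmac.new(capability, keyword.lower().encode(),
                    hashlib.sha256).hexdigest()


# Hypothetical capabilities for the outer (department) and inner (topic) level.
outer_capability = b"hypothetical-department-capability"
inner_capability = b"hypothetical-researcher-capability"


def index_entry(doc_id: int, department: str, topics: list) -> dict:
    return {
        "id": doc_id,
        "outer": trapdoor(outer_capability, department),
        "inner": {trapdoor(inner_capability, t) for t in topics},
    }


index = [
    index_entry(1, "mathematics", ["number theory"]),
    index_entry(2, "mathematics", ["topology"]),
    index_entry(3, "physics", ["number theory"]),
]


def outer_search(entries: list, token: str) -> list:
    """Coarse selection: everything appropriate for the department."""
    return [e for e in entries if e["outer"] == token]


def inner_search(entries: list, token: str) -> list:
    """Fine selection: narrow the coarse result to one topic of interest."""
    return [e for e in entries if token in e["inner"]]


coarse = outer_search(index, trapdoor(outer_capability, "mathematics"))
fine = inner_search(coarse, trapdoor(inner_capability, "number theory"))
```

Here the outer stage admits both mathematics documents, and the inner stage narrows the result to the single document on the topic of interest.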
There are other reasons for wrappers. These reasons include the need to provide scenario-specific verifiable trust guarantees. For example, there may be the following families of scenarios: where protection from knowledge of the existence of particular keywords is not important, this may be because the domain of such keywords is large enough that dictionary or keyword guessing attacks are computationally impractical, or because leakage of such keywords is considered a low risk by the scenario.
Conversely, there may be other families of scenarios where these keywords are highly sensitive, and the data owner (or the owner's corporate, social, support, or consumer network) desires that the current possessor of the data (e.g., a server, service, cloud, or other party) perform actions, including searching and then retrieving, routing, or performing other operations, without learning anything about the keywords on which it operates.
Furthermore, in many families of real-world scenarios, the content is composite, containing components on a continuum ranging from 'highly sensitive' to 'disposable'. In general, any real-world organization reflects these differing requirements in the composite content that is stored, exchanged, or collaboratively operated upon.
Furthermore, in many of these collaboration scenarios, including many extranets, where companies collaborate across organizations, regulatory compliance, ownership entities, or other boundaries, the categorization of content accessed by a party will be considered by the owner to map to points in the continuum based on the current corporate or contractual relationship (or history and trust) between the owner and the accessing party. Further, this relationship and mapping may be ephemeral, perhaps to the granularity of a particular corporate transaction or social interaction.
In any foreseeable state of cryptographic and system technology, it would be impractical to design a single solution that meets, across a range of scenarios, all verifiable trust requirements and system requirements including performance and scale, as well as other requirements that may include those described previously. The complexity, and consequent cost, of delivering such solutions at scale typically results in higher costs (which are often passed on to the end user) and places a high development, maintenance, and operational burden on servers, services, clouds, and other delivery mechanisms.
Techniques, including those generally described in this disclosure, are candidates for composing solutions that are considered optimal from a cryptographic and/or system perspective. Such composition may include automatic composition of hybrid services from appropriate candidate building-block services, but this does not exclude semi-automatic composition that may include operational or engineering intervention. In these cases, a solution may leverage the ability to build a composition out of wrappers that provide complementary system, cryptographic, and other features to deliver an optimal solution for a scenario or family of scenarios, ideally without imposing an implementation or operational burden for features or capabilities that the scenario or family of scenarios does not need.
Additional reasons for implementing wrappers include the need to mediate between the requirements and policies of different individuals or organizations that have legal requirements for controlling access to content. This is a common business scenario, where a specified set of parties must come together to agree to unlock access to some content. An example may include a family of clinical trial scenarios, where content may be regulated by the FDA and HIPAA, and ownership may be shared between parties such as a drug sponsor (e.g., Merck) and a data owner (e.g., Microsoft HealthVault). The drug sponsor may exercise rights due to the need to protect intellectual property generated as a result of the study, while HealthVault may exercise rights due to the need to protect the anonymity and privacy of its customers participating in the clinical trial.
There are cryptographic, system, and human-assisted techniques for intervening in, mediating, and arbitrating conflicts, as such conflicts typically arise from inconsistent, conflicting, or ambiguously defined policies. The cryptographic techniques include those that utilize secret sharing, and implementations include those that deploy multi-authority key or capability generators. However, these techniques do not always provide the right mix of verifiable trust, flexibility, and expressiveness, or of system performance and scale. Even if there were a single technique and corresponding implementation that satisfied all the potentially different and conflicting requirements of the parties, it may be the case that a single implementation or deployment would not be trusted, because other participants may lack trust in the technology, the implementer, or the host.
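As one non-limiting illustration of secret sharing for the case where a specified set of parties must all agree to unlock content, the following sketches a simple all-parties XOR scheme. The function names and the party names are assumptions; the disclosure does not prescribe this particular scheme, and threshold schemes are equally possible.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split(secret: bytes, parties: int) -> list:
    """Split a secret so that all shares are required to rebuild it."""
    if parties < 2:
        raise ValueError("at least two parties are required")
    shares = [secrets.token_bytes(len(secret)) for _ in range(parties - 1)]
    # The final share is the secret XORed with every random share.
    shares.append(reduce(xor_bytes, shares, secret))
    return shares


def combine(shares: list) -> bytes:
    """Rebuild the secret by XORing all shares together."""
    return reduce(xor_bytes, shares)


# Hypothetical key protecting the content, shared among three parties.
content_key = secrets.token_bytes(32)
sponsor_share, custodian_share, regulator_share = split(content_key, 3)

unlocked = combine([sponsor_share, custodian_share, regulator_share])
incomplete = combine([sponsor_share, custodian_share])
```

Any proper subset of the shares is statistically independent of the key, so no party or partial coalition can unlock the content alone.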
In this case, wrappers are candidate solutions that will be supplemented with other automated techniques, or by manual intervention, possibly through electronic or manual workflows. Such a composite solution may utilize a wrapper that will provide a layer of control to a particular party or group of parties in the business network. The wrapper may optionally be a technology, implementation, or deployment that is trusted by the party or group of parties in question. Such a wrapper may engage in an interactive protocol with the remote cloud to possibly implement late binding or support expiration or revocation or provide an audit log with verifiable trust. Or such wrapper may implement an offline protocol that will grant, block, and/or record access in an appropriate manner based on the access credentials and some access policy.
Other reasons for implementing verifiable trust with wrappers may include the need to manage assets or artifacts such as keys, passwords, pass phrases, certificates, or other proofs of knowledge or ownership. In general, cryptographic systems may place additional system burdens on generating, managing, backing up, archiving, retaining, deploying, securely shredding, and providing forensics for these artifacts, and on supporting servers, services, and clouds that access these artifacts in a manner that meets the desired service level agreements for availability, scalability, latency, and upper bounds on data loss and recovery time in the event of failure.
This is further exacerbated by the need to provide some level of trust for these servers, services, or clouds. This is typically done through a trusted hierarchy (e.g., PKI). These systems need to be further protected so that servers, services, or clouds are not permitted to launch attacks through possible impersonation and man-in-the-middle attacks if compromised in any way.
In such cases, the wrapper provides a candidate solution that may be suitably combined with a remote server, service, or cloud that provides supplemental capabilities in a manner that mitigates the overhead, complexity, latency, and WAN fragility of interactive protocols for reasons including access, authentication, and retrieval of artifacts. In embodiments of such wrappers, the outer wrapper may hold the necessary artifacts for accessing the inner wrapper. One of the results may include eliminating the need to store and archive these artifacts, as they effectively become part of the payload, either by inclusion or through efficient system engineering.
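An outer wrapper that carries the artifacts needed to open the inner wrapper can be sketched as follows. The envelope layout, the function names `seal` and `open_sealed`, and the toy `keystream_xor` transformation are assumptions made for illustration only.

```python
import hashlib
import json
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy reversible transformation (NOT production cryptography)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def seal(outer_capability: bytes, content: bytes) -> bytes:
    """Wrap content with a fresh inner key that travels inside the outer wrapper."""
    inner_key = secrets.token_bytes(32)
    envelope = {
        "inner_key": inner_key.hex(),
        "inner_payload": keystream_xor(inner_key, content).hex(),
    }
    return keystream_xor(outer_capability, json.dumps(envelope).encode())


def open_sealed(outer_capability: bytes, sealed: bytes) -> bytes:
    """Open the outer wrapper, then use the carried artifact on the inner one."""
    envelope = json.loads(keystream_xor(outer_capability, sealed).decode())
    inner_key = bytes.fromhex(envelope["inner_key"])
    return keystream_xor(inner_key, bytes.fromhex(envelope["inner_payload"]))


outer_capability = b"hypothetical-outer-capability"
content = b"archived record"
sealed = seal(outer_capability, content)
opened = open_sealed(outer_capability, sealed)
```

Because the inner key is part of the payload in this sketch, it does not need to be stored or archived separately.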
There is an opportunity to provide a hybrid solution where such wrappers can initiate an interactive protocol with a remote server, service, or cloud to effect expiration or revocation, or to provide an audit log with verifiable trust for purposes such as auditing or forensics. The way this hybrid solution is implemented may optimize the burden placed on the server, service, or cloud, or it may be optimized to operate in the presence of network problems.
In these and other cases, the wrapper may select a technology, implementation, and remote server, service, or cloud for the optional interactive protocol in a manner that facilitates verifiable trust in the business network. Wrappers may be designed to implement linear hierarchies or more complex structures such as trees or graphs. As described above, these structures may be actual inclusions, or they may be virtual. Such complex structures may facilitate flexible and expressive decision trees that may enable combinations of threshold participation, overrides and escrow, and possibly exceptions and manual overrides.
There are cryptographic techniques that can provide equivalent capabilities through systems such as multi-authority key or capability generators. These techniques and systems can be viewed as candidates for different wrappers in a composite system that meets the diverse needs of participants in a business network, given all the business, social, political, ownership, and other complex needs that must be accommodated in any complex real-world network.
Such complex real-world networks, ecosystems, and markets typically grow organically from an initial set of participants, and then grow further through the network effect of an increasing number of participants. Other networks are established top-down, perhaps by standard or by fiat. Other existing networks, which may have grown organically, will split into silos, perhaps due to conflicts or due to external effects. It is therefore often desirable to be able to support a "forest" of networks with the ability to merge or separate, perhaps by design, since the transient nature of trust is an inherent requirement in such networks.
This disclosure and various embodiments generally describe some of the many reasons a wrapper is used. Real-world applications address a range of verifiable trust and system requirements and support a variety of scenarios through suitable composition and expansion based on modular concepts and building blocks (including those described above).
For ease of illustration, and for pedagogical reasons, the previous text sometimes used certain abbreviations or made implicit or explicit assumptions including those described below.
Techniques that provide verifiable assurances of any form, possible system solutions that implement boundaries, control regions, and gaps, and cryptographic or similar techniques that rely on computationally hard problems have all been referred to as 'cryptography', 'cryptographic solutions', or other similar or equivalent phrases. However, for the avoidance of doubt, cryptographic techniques such as searchable encryption are by no means required. Any mathematical transformation, encoding, obfuscation, etc., may be used, as well as other techniques for protecting the data, such as hiding the data or splitting the data in such a way that it cannot be pieced back together without a means of recomposition. Operations such as signing or encryption, or techniques such as hashing, tend to be less ambiguous in common usage than other terms such as anonymity, confidentiality, integrity, privacy, and non-repudiation.
Other parties in such an interchange may include participants such as an extranet or business network in a business scenario. Alternatively, these may be the social or support networks of consumers, individuals, friends, and families. Alternatively, these may be networks of individuals or organizations across different sovereign entities or jurisdictions. These sovereign entities or their representatives may be participants that facilitate or participate, or both, and may form networks of their own, such as NATO.
The method and apparatus for implementing a composite with a wrapper for optimizing verifiable trust in a configurable manner is referred to simply as a 'wrapper'. Implementations may utilize software, hardware, or other means to implement an offline protocol that creates a space or virtual container. Other implementations may include interactive protocols with remote services, or some combination of offline operations and interactive protocols.
The parties involved in facilitating the transaction or interactive protocol (which may include servers, services, clouds, workflow endpoints that may be human or automated), or other parties, are referred to as the 'cloud'. Implementations of the cloud may include public, private, outsourced, or multi-tenant versions.
Container-less data for trusted computing and data services
Using the techniques of a trusted platform, data (and associated metadata) is separated from the container (e.g., file system, database, etc.) holding the data, enabling the data to act as its own custodian by imposing a protective shield of mathematical complexity that is pierced with presented capabilities, such as keys granted by a cryptographic key generator of a trusted platform as described in various embodiments. Sharing or access to data or a subset of the data is facilitated in a manner that preserves and extends trust without requiring a specific container to be enforced. The mathematical complexity applied to the data, such as searchable encryption techniques, protects the data regardless of the container or hardware in which the particular bit is recorded, i.e., the data is container-less (i.e., regardless of the container) protected and thus not subject to attacks based on compromising the security of the container. In the event that this particular "safe" is compromised, the content remains protected.
FIG. 13 is a block diagram of an overall environment for providing one or more embodiments of a secure, private, and selectively accessible network data service as described herein. Multiple enterprises 1300, 1302 are shown for illustrative purposes, but the techniques are applicable to a single enterprise or many cooperating enterprises. In embodiments, using federated trust overlay (FTO) 1330 as described in detail below, policies 1310 of enterprise 1300 and policies 1312 of enterprise 1302, together with enforcement 1320, may be shared for collaborative work based on FTO infrastructure 1330. Enforcement 1320 may also be applied separately by each enterprise 1300, 1302. In this regard, because the policies and enforcement are entirely within the realm of the enterprises 1300, 1302 (based on the trust overlay 1330), from a customer standpoint, the location of the actual data in the cloud 1340 and which particular container 1342 is used become irrelevant, except for matters that the customer actually cares about: latency, reliability, quality of service guarantees, backup, retrieval time, size guarantees, and the like.
Thus, recognizing that data is freed from the container holding the data by the trust overlay 1330, in embodiments, the data store management layer 1350 automatically processes issues of customer interest based on analysis of real-time availability of storage resources and their respective characteristics to optimize data storage in the container as appropriate to customer needs and desires. The storage management layer 1350 is dashed to indicate that its location is not critical. The storage management layer 1350 typically does not have cryptographic privileges to access, view, or change data stored in the one or more data stores 1342, however it may be desirable to expose some of the metadata (such as file size or file type) to facilitate an understanding of how the customer would like to use the data in the future so that the storage management layer 1350 can make intelligent storage choices. For example, if the storage management layer 1350 is given enough views of the data to understand that the data is a video, the storage management layer 1350 can save the video in a media store that meets the requirements of streaming media.
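The placement decision described above can be sketched as follows. This is a minimal illustration only: the `ExposedMetadata` fields and the store names are hypothetical and are not part of the described system. It shows a storage layer choosing a store from exposed metadata alone, without any access to the encrypted payload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExposedMetadata:
    """The only fields the storage layer may see; the payload stays encrypted."""
    size_bytes: int
    file_type: str  # e.g. "video" or "document" (assumed labels)


# Hypothetical store names, keyed by the kind of content they suit.
STORES = {
    "video": "streaming-media-store",
    "document": "general-object-store",
}


def choose_store(meta: ExposedMetadata) -> str:
    """Pick a store from exposed metadata alone, never from plaintext."""
    return STORES.get(meta.file_type, "general-object-store")


print(choose_store(ExposedMetadata(size_bytes=700_000_000, file_type="video")))
```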
Fig. 14 is a block diagram showing a general concept of "data as its own custodian". Using policies and enforcement under the control of the user or enterprise, the data and corresponding logs are encrypted and can only be accessed with specific capabilities granted to the user, as described in more detail below. For example, someone who has no capability, such as the operations staff of a cloud service provider, typically cannot view, modify, tamper with, or delete the data without being detected, because they do not have data privileges. With data as its own custodian, policies are set by the owner/publisher of the data and access is enforced/guaranteed by the data itself (wherever the data is stored), making container selection redundant for security. Trust guarantees are enforced by the data, but are controlled by the owner/publisher by describing what the subscriber/customer can do with the data.
As shown, in one non-limiting embodiment, enterprise 1420 "owns" policies 1424 and enforcement 1422 of policies 1424 that are related to users 1426 and their use of system resources of enterprise 1420 and to external users 1430 (e.g., mobile workers). Using data as its own custodian, the actual data and/or log 1405 can be separated from policies 1424 and enforcement 1422 by storing the data in the cloud 1400, however, the operational staff 1410 of the cloud 1400 cannot view, modify, tamper with, or delete the data and/or log 1405 without being detected.
FIG. 15 is a block diagram of an overall environment for providing one or more embodiments of a secure, private, and selectively accessible network data service as described herein. In general, a non-limiting example of distributing trust using federated trust overlays is shown, with a computing device 1500 (e.g., a customer) in a first control region 1510, a computing device 1520 (e.g., a cloud service provider) in a second control region 1530, a computing device 1560 in a third control region 1590, a cryptography provider 1580 provided in a fourth control region 1595, and a key generator 1582 provided in a fifth control region 1597. Each of computing devices 1500, 1520, 1560 may include a processor P1, P2, P3, respectively, and a memory M1, M2, M3, respectively. In this regard, as described in accordance with various non-limiting embodiments, techniques are provided for enabling encrypted data 1540 in a cloud such that an item 1550 or a portion of an item may be selectively retrieved from the cloud based on access privileges. In this regard, a set of analytics services 1570 may be provided as a layer on top of the encrypted data 1545, 1547 to be stored that automatically determines where to optimally store the encrypted data 1540 and the encrypted data 1542 maintained in the cloud based on the local data set 1505 from the device 1500. In this regard, the service 1570 ensures that when the computing device 1500 retrieves this data based on the CTP1580/CKG1582 federated trust overlay, the retrieved data 1552 or retrieved data 1550 is retrieved from the optimal container for a given request, or if suboptimal, automatically switches the container. 
For example, if the current container from the computing device 1560 is not performing well for the customer's needs, or if the customer's needs change, the analytics storage service 1570 may move or copy the data to another storage container in real-time and seamlessly switch the service to a more suitable container, e.g., to meet quality of service requirements.
FIG. 16 is a flow diagram of a process for managing containers in which data acts as its own custodian, as described herein. At 1600, a first computing device in a first control area receives encrypted data from a second computing device in a second control area. The encrypted data is formed from an encryption of data of the defined data set of the second computing device according to a searchable encryption algorithm based on the cryptographic key information. At 1610, encrypted metadata is also received, the encrypted metadata formed from an analysis of the data and an encryption of an output of the analysis based on the cryptographic key information. At 1620, it is determined which container(s) are to store at least some of the encrypted data or encrypted metadata. At 1630, if a predefined condition is satisfied, the container in which the encrypted data is stored may be automatically changed.
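The flow of FIG. 16 can be sketched roughly as below. This is an illustrative sketch under stated assumptions: the `Container` and `ContainerManager` names are hypothetical, and observed latency is used as a stand-in for the "predefined condition", which the text does not specify. The manager only handles opaque ciphertext.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    latency_ms: float  # assumed metric standing in for container characteristics
    blobs: dict = field(default_factory=dict)


class ContainerManager:
    def __init__(self, containers, max_latency_ms):
        self.containers = {c.name: c for c in containers}
        self.max_latency_ms = max_latency_ms  # the assumed "predefined condition"
        self.location = {}  # item id -> container name

    def _best(self):
        return min(self.containers.values(), key=lambda c: c.latency_ms)

    def receive(self, item_id, encrypted_data, encrypted_metadata):
        """Steps 1600-1620: accept opaque ciphertext and decide where it is stored."""
        target = self._best()
        target.blobs[item_id] = (encrypted_data, encrypted_metadata)
        self.location[item_id] = target.name
        return target.name

    def rebalance(self, item_id):
        """Step 1630: move the item if its current container breaches the condition."""
        current = self.containers[self.location[item_id]]
        if current.latency_ms <= self.max_latency_ms:
            return current.name
        target = self._best()
        if target is not current:
            target.blobs[item_id] = current.blobs.pop(item_id)
            self.location[item_id] = target.name
        return self.location[item_id]


store_a, store_b = Container("store-a", 20.0), Container("store-b", 35.0)
manager = ContainerManager([store_a, store_b], max_latency_ms=50.0)
manager.receive("doc-1", b"\x01\x02", b"\x03")  # placed in the faster container
store_a.latency_ms = 120.0                       # that container degrades
manager.rebalance("doc-1")                       # item is switched automatically
```

Because the manager never decrypts anything, switching containers does not change who can read the data; only the placement changes.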
FIG. 17 is another block diagram illustrating one or more aspects of data acting as its own custodian. In this regard, the container is redundant for security, access is enforced by the cryptographic wrapper, and policies are set by the owner/publisher and guaranteed by the cryptographic wrapper. As described below in various embodiments, a wrapper may include various cryptographic techniques depending on the particular security needs of a situation. For example, as shown, policies are set at the enterprise level, and then users seek access to data that is wrapped by cryptographic access controls that allow or deny access. Other users, such as business auditors, security personnel, operations personnel, etc., may or may not have access privileges defined by the wrapper, depending on the policies set at the enterprise.
As shown in the example of fig. 17, an enterprise 1720 has enterprise employees 1722 who are subject to an enterprise access policy 1730, and some of these enterprise employees 1722 can set the enterprise access policy 1730. Enterprise access policy 1730 may affect how data 1712 stored in data container 1710 of cloud container 1700 can be accessed, manipulated, retrieved, searched, and the like. Thus, when a user 1708 of data 1712 attempts to access such data 1712, cryptographic access controls 1714, which are directed by enterprise access policy 1730 but separate from it, protect data 1712 from improper access by user 1708. The cryptographic access controls 1714 of the data container 1710 may reflect different enterprise access policies 1730 to apply to different accessing entities or tasks, such as enterprise audits 1702 performed by security employees 1704 or cloud operations employees 1706, to ensure that visibility is limited to those who should be allowed access. Data container 1710 may be located anywhere and is redundant for security purposes, since access is enforced by cryptographic access controls 1714. In this regard, enterprise access policy 1730 may be set by the enterprise owner and guaranteed by a cryptographic wrapper as implemented by cryptographic access controls 1714.
FIG. 18 is another block diagram showing data acting as its own custodian, illustrating that the data is capable of transcending the traditional container security model. In this regard, as recognized herein, data may not only be located anywhere, but may also be spliced or partitioned across multiple containers in a manner that is optimal for a given situation. Placement may optimize access, recoverability, etc., and the storage management layer may handle consistency, versioning, garbage collection, etc.
As shown in fig. 18, enterprise 1820 defines its enterprise access policies 1830 that apply to enterprise employees 1822, while data 1812 is stored remotely and protected by cryptographic access controls 1814 that apply to users 1810 who wish to access data 1812. The system and user 1810 are agnostic as to whether the container storing the data 1812 resides in the cloud 1800, somewhere at the enterprise 1802, in the overlay network 1804, or a combination thereof, and the data may span multiple containers.
FIG. 19 illustrates a storage management layer that performs functions such as automatic shredding, caching, copying, reconstruction, etc., of data from multiple data containers of different types. These processes may be performed based on criteria including explicit policies and access patterns. As shown, from a user's perspective, data container 1900, including data 1902 and cryptographic access controls 1904, is stored at an abstract storage tier 1910 for storing all data; in reality, however, data 1902 protected by cryptographic access controls 1904 may be shredded, cached, copied, and reconstructed across any one or more of cloud data services 1920, file systems 1922, enterprise databases 1924, overlay networks 1926, and so forth, based on criteria, which may include policies and access patterns.
FIG. 20 more generally illustrates that the pivot point for the security, privacy, reliability, etc. that enable data to serve as its own custodian is a secure overlay network that adds a cryptographic access wrapper to data regardless of where it is stored across data containers. In particular, the overlay network 2010 may be an intermediate storage medium for further storing the container 2000 of data 2002 protected by the cryptographic access control 2004 in any one or more of the cloud data service 2020, the file system 2022, or the enterprise database 2024. Thus, storage may be hierarchical in terms of its final destination.
FIG. 21 is a block diagram illustrating that a legacy application and its container-based view of the world (e.g., database files) do not need to change. Instead, for use in a federated trust overlay storage scenario, an adapter may be provided that performs cryptographic negotiation, cryptographic transformation and caching, versioning, leasing, etc. based on the requirements of the application and the legacy container. More specifically, legacy application 2100 can interact with cloud data services 2110, file systems 2112, and enterprise databases 2114 in the same manner as before; however, abstraction storage layer 2120 can now make container-less data happen behind the scenes. The abstraction storage layer 2120 may expose adapters that implement cryptographic negotiation, cryptographic transformation, and caching, versioning, leasing, etc., based on application and legacy container characteristics, and then direct the containerized data 2140 into container-less data, such as via the secure overlay network 2130 described in connection with fig. 20.
FIG. 22 is an example architectural model that can be used in connection with legacy applications as well as FTO-aware applications. In this regard, FTO-enabled application 2205 may be plugged directly into FTO2200 and advantageously utilize secure and private storage, processing, etc. of data. For SDS-aware applications 2215, a layer 2210 may be provided, the layer 2210 adding cryptographic shredding and scattering of data. For consistency-aware application 2225, an existing, unmodified overlay network may be used and bridged to the system as shown by layer 2220. For example, LiveMesh, Fabric/CAS can be bridged to DaaS and XStore via layer 2220. Finally, as described in connection with fig. 21, an adaptor 2230 may be provided, said adaptor 2230 performing cryptographic negotiation, cryptographic transformation and caching, versioning, leasing, etc., based on the characteristics of legacy applications 2240 and legacy container 2235. These layers and applications can take advantage of the benefits provided by federated trust overlay based cloud storage.
FIG. 23 is a block diagram illustrating the general use of a cryptographic wrapper or envelope on data and/or metadata describing the data or characteristics of the data. As an example, the record 2302 (e.g., data payload) and associated metadata and/or tags 2300 may be collectively or separately encrypted in a mathematically selectively accessible manner to produce encrypted metadata and tags 2310 and encrypted records 2312. Using such encrypted data/metadata, various operations 2320 may be performed based on mathematically selective accessibility, such as searches for data or metadata, logical operations on data or metadata, queries, backup operations, audits of data, and the like. In addition to encrypting the metadata 2300 and record 2302, optional additional data 2314 may be added to the encrypted package depending on any desired objective, or optional additional tags 2316 may be added to the content as part of the encryption process, such as a public or secret tag that allows or prohibits access by, for example, a certain class of users. Using such additional data 2314 or tags 2316, additional operations 2330 can be performed, such as integrity checks, tamper checks, availability checks, and the like.
Fig. 24 is a specific example illustrating a payload 2402 and tags 2400, the payload 2402 and the tags 2400 being encrypted to form encrypted tags 2410 and encrypted data 2412 for operations 2420. Further, as described above, the data can be augmented with data 2414 and the tags can be augmented with tags 2416, which can facilitate an additional set of operations 2430.
FIG. 25, building on the example of FIG. 24, is an example illustrating the surrounding federated trust overlay. In this regard, a CTP2500 without a back door may be implemented based on an open method that is subject to public scrutiny for robustness. Based on CTP2500, CKG2550 may be generated to process requests for capabilities (e.g., key 2540) for performing operations 2530 (e.g., searches, logical operations or queries, backups, audits, tamper checks, integrity checks, availability checks, etc.). Thus, cloud data service provider 2520 provides services, e.g., storage of encrypted metadata 2510 and encrypted data 2512. In an optional embodiment, the cloud hosts the data in a manner that is agnostic to the data or access patterns.
FIG. 26 is a block diagram illustrating an embodiment in which records and indexes are encrypted and uploaded to the cloud using a trust overlay. In this regard, the records and indexes are searchably encrypted such that the index is selectively accessible as a first visible layer over the associated data. Then, based on a search of the index, content or records that match one or more given indexes can be identified, and the user can or cannot access the matching content and records based on privileges. This acts as a second layer of protection for the data: the first layer is access to the index for search or other operations, and the second layer is access to the data. In this regard, any number of layered cryptographic wrappers may be applied to different portions of the data and associated metadata. As shown, client 2600 can have various records 2602, and at 2630 an encrypted index 2604 can be generated from records 2602. Records 2602 and encrypted index 2604 are uploaded to cloud 2610 at 2640 and stored as records 2612 and encrypted index 2614 in cloud 2610. To retrieve the records 2612, for example based on the encrypted index 2614, at 2650 the client 2600 receives records 2620 signed with at least one signature 2622 from the cloud 2610, and may check the at least one signature 2622 at 2660.
Fig. 27 illustrates how a client can generate and upload encrypted indexes on top of encrypted data to obtain a richer cloud storage experience using the federated trust overlay architecture. The federated trust overlay architecture involves a separation of authority to generate a trusted cryptographic ecosystem and is described in more detail below.
FTO2785 is an ecosystem that benefits customers 2775 by separating pieces of the mathematical transformations made against containerless data in the cloud or other storage, and as described elsewhere herein, includes a Cloud Data Service (CDS) 2780, a Cryptographic Technology Provider (CTP) 2770, and a center for key generation (CKG) 2790. By way of example, customer 2775 may have a document 2700 with which keywords 2710 are associated. Public parameters 2765 for encryption are retrieved from CKG2790, while techniques for performing the mathematical transformations are retrieved from CTP 2770. To perform the upload, the document 2700 is encrypted 2720 and uploaded 2730 into encrypted document storage 2750 in the cloud. The location 2735 and key 2725 for the upload are input along with the keywords 2710 to generate an encrypted index 2740 associated with the encrypted upload of the document 2700, and the encrypted index generated at 2740 is uploaded to an encrypted index store 2755 at 2745.
FIG. 27 illustrates the uploading of encrypted index data, and FIG. 28 illustrates the decrypting of the index to search for particular content, which is authorized based on the capabilities provided by the federated trust overlay, and then using the visibility of the search results, the user may be granted the ability or privilege to decrypt the actual documents relevant to the search. In this regard, access to the index and access to documents may be separately controlled by the FTO based on policy and enforcement.
As noted above, FTO2885 is an ecosystem that benefits clients 2875 by separating pieces of mathematical transformations performed against containerless data in the cloud or other storage, and as described elsewhere herein, includes Cloud Data Service (CDS) 2880, Cryptographic Technology Provider (CTP) 2870, and key generation center 2890.
In this example, customer 2875 forms query 2800 and then at 2805 retrieves trapdoor 2810 from CKG2890, which is presented to the cloud along with query 2800. In the cloud, at 2820, the encrypted index in the encrypted index store 2825 is searched based on the technology 2815 retrieved from CTP 2870. Subsequently, the result 2835 is returned, still encrypted, and decrypted at 2840, extracting the location 2842 and key 2844 from the result. This gives the system the information to retrieve encrypted document 2850 from encrypted document store 2830 at 2845, which can be decrypted based on key 2844 at 2855 to return one or more documents 2860, e.g., document 2700 from fig. 27.
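The upload and search flows of FIGS. 27 and 28 can be sketched as below. This is a simplified illustration, not the scheme of the described embodiments: keyword trapdoors are modeled as HMAC tokens, and `toy_encrypt` is a toy stream cipher built from SHA-256 that stands in for a real cipher such as AES and must not be used for real data. All function and class names are hypothetical. The host sees only ciphertext and opaque tokens, and each index entry decrypts to the location and key of a document, as in the text.

```python
import hashlib
import hmac
import json
import os


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy stream cipher standing in for AES; illustration only."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def trapdoor(index_key: bytes, keyword: str) -> str:
    """Capability for one keyword (issued by the key generator in the text)."""
    return hmac.new(index_key, keyword.lower().encode(), hashlib.sha256).hexdigest()


class Cloud:
    """Untrusted host: stores ciphertext and matches opaque tokens."""

    def __init__(self):
        self.documents = {}  # location -> encrypted document
        self.index = {}      # token -> encrypted (location, key) entries

    def search(self, token: str):
        return self.index.get(token, [])


def upload(cloud, index_key, result_key, location, text, keywords):
    doc_key = os.urandom(32)
    cloud.documents[location] = toy_encrypt(doc_key, text.encode())
    entry = json.dumps({"location": location, "key": doc_key.hex()}).encode()
    for keyword in keywords:
        token = trapdoor(index_key, keyword)
        cloud.index.setdefault(token, []).append(toy_encrypt(result_key, entry))


def query(cloud, index_key, result_key, keyword):
    documents = []
    for encrypted_entry in cloud.search(trapdoor(index_key, keyword)):
        # The result arrives still encrypted; decrypt it to get location and key.
        entry = json.loads(toy_decrypt(result_key, encrypted_entry))
        blob = cloud.documents[entry["location"]]
        documents.append(toy_decrypt(bytes.fromhex(entry["key"]), blob).decode())
    return documents


cloud = Cloud()
index_key, result_key = os.urandom(32), os.urandom(32)
upload(cloud, index_key, result_key, "loc-1", "Q3 merger memo", ["merger", "Q3"])
print(query(cloud, index_key, result_key, "merger"))  # ['Q3 merger memo']
```

A party holding only a trapdoor for one keyword can have the host match that keyword but cannot read index entries or documents without `result_key`, which mirrors the separate control of index access and document access described above.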
FIGS. 29-30 are block diagrams illustrating some additional non-limiting trust guarantees for the system. In this regard, any algorithm that proves that what the user receives is correct may be used as an additional layer to mathematically prove to the user that the cloud is not returning garbage data. For example, one technique is known as proof of data possession (PDP), in which tags are applied to the encrypted data and can later be used to confirm the correctness of the data. Similar information may be applied (and encrypted) to prove that the data was not inappropriately altered or deleted while stored in the cloud. Using cryptographic techniques, such proofs typically take the form of a cryptographic challenge and response. In fig. 29, the PDP tag is encoded and encrypted in the cloud along with the encrypted record, index, metadata, etc., while in fig. 30, a verification operation is performed based on a cryptographic consultation with the FTO to confirm that the integrity of the data is intact.
Referring to fig. 29, FTO2985 is, as described above, an ecosystem that benefits customers 2975 by separating pieces of the mathematical transformations performed against containerless data in a cloud or other storage, and as described elsewhere herein, includes a Cloud Data Service (CDS) 2980, a Cryptographic Technology Provider (CTP) 2970, and a key generation center 2990. In this example, the publisher 2900 encrypts the record and index 2910 by encoding them at 2920 based on the secret 2930 retrieved from the CKG2990 and the technique 2940 retrieved from the CTP2970. The encrypted or encoded record and index 2950 may be stored in the cloud. A proof of data possession (PDP) tag 2960 may be used in conjunction with the encoding at 2920; the PDP tag 2960 later helps to ensure certain aspects of the data while it is stored in the cloud, as described in more detail elsewhere herein.
As described above, in fig. 30, a verification operation is performed based on a cryptographic consultation with the FTO to confirm that the integrity of the data is intact. In this regard, FTO3085 is an ecosystem that benefits customers 3075 by separating pieces of the mathematical transformations performed with respect to containerless data in a cloud or other storage, and as described elsewhere herein, includes a Cloud Data Service (CDS) 3080, a Cryptographic Technology Provider (CTP) 3070, and a key generation center 3090. PDP tag 3040 is available to an auditor 3000 of the system to check the integrity of data stored in the cloud. Based on the random number 3005, and based on the secret 3025 retrieved from CKG3090 and the technique retrieved from CTP3070, the auditor 3000 issues a challenge 3010 to the prover 3020 in the cloud. Prover 3020 also uses technique 3045 in connection with implementing the proof algorithm. In this regard, prover 3020 receives the encrypted record and index 3030 and the PDP tag as input and returns information to auditor 3000, which is verified at 3050. Based on whether the verification operation succeeds or fails at 3060, the auditor 3000 is notified whether the integrity of the encrypted record and index 3030 has been maintained.
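The challenge and response exchange above can be illustrated with a much simplified spot check. This sketch is an assumption-laden stand-in, not an actual PDP scheme: it uses per-block HMAC tags and returns the challenged blocks themselves, whereas real proof of data possession constructions return a compact proof. All names are hypothetical. The auditor derives the challenged block indexes from a fresh random number and checks the prover's response using the secret.

```python
import hashlib
import hmac
import os
import random

BLOCK = 4  # tiny block size, chosen only to keep the example readable


def split(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def tag(secret: bytes, index: int, block: bytes) -> bytes:
    """Tag binds a block to its position under the secret."""
    return hmac.new(secret, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def make_tags(secret: bytes, data: bytes):
    return [tag(secret, i, b) for i, b in enumerate(split(data))]


def challenge(nonce: bytes, n_blocks: int, k: int):
    """Auditor: derive k distinct block indexes from a fresh random number."""
    return random.Random(nonce).sample(range(n_blocks), k)


def prove(stored_blocks, stored_tags, indexes):
    """Prover (cloud): return the challenged blocks together with their tags."""
    return [(i, stored_blocks[i], stored_tags[i]) for i in indexes]


def verify(secret: bytes, indexes, response) -> bool:
    """Auditor: check that every challenged block still matches its tag."""
    if [i for i, _, _ in response] != list(indexes):
        return False
    return all(hmac.compare_digest(tag(secret, i, b), t) for i, b, t in response)


secret = os.urandom(32)
data = b"encrypted record and index bytes"
blocks, tags = split(data), make_tags(secret, data)
indexes = challenge(os.urandom(16), len(blocks), k=3)
print(verify(secret, indexes, prove(blocks, tags, indexes)))  # True
blocks[indexes[0]] = b"XXXX"  # simulate tampering in the cloud
print(verify(secret, indexes, prove(blocks, tags, indexes)))  # False
```

Because the challenged indexes change with every random number, the prover cannot precompute answers for a fixed subset of blocks.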
As described in more detail below, various cryptographic techniques that can provide strong guarantees of privacy and non-repudiation for service users can be incorporated into the provision of services. By integrating these cryptographic techniques with data protection techniques, remote services and layered applications can be implemented on top of the data in a manner that gives the owners of the data and enterprise customers ("customers") precise control over the types of operations that can be performed by the entity hosting the data, i.e., the cloud service provider or operator ("CSP"). Further, many of these operations may be performed by the CSP on behalf of the customer without the CSP learning or otherwise seeing the actual content of the data on which the operations are performed. In addition, the customer can detect whether the CSP is improperly deleting or modifying the data, or moving the data to lower-performance secondary or tertiary storage. In this regard, various cryptographic techniques may be integrated with data services to give customers the trust needed to relinquish control of data, e.g., to increase security and privacy.
For example, searchable encryption is an encryption method in which essential metadata is copied out of the data before the data is encrypted. By way of non-limiting example, in the case of Exchange email, the data is a message with its attachments, and the essential metadata may include selected Messaging Application Programming Interface (MAPI) properties and a full-text index. The data is encrypted using, for example, the Advanced Encryption Standard (AES), while the metadata is encrypted in a manner that generates an encrypted index. As a result, the encrypted data and index can be handed over to another entity that is not fully trusted, such as a CSP. Subsequent selective access to the aggregated encrypted data and index may be achieved by the owner of the data (the customer) sending an encrypted query to the CSP (or other authorized subscribers). The CSP can then apply the encrypted query to the encrypted index and return the matching encrypted data; however, the CSP learns nothing about the content of the data, the metadata, the query, or the results (unless authorized by the customer).
Proof of possession and proof of retrievability are cryptographic techniques in which a "prover" (in this case, the CSP providing storage) and a "verifier" (the customer) can engage in a protocol in which the verifier can efficiently determine whether the data it owns remains intact and is available for easy retrieval from the holder of the data (the CSP). These techniques are efficient in network bandwidth and in the operations the CSP performs, so the CSP's cost of goods sold (COGS) remains relatively unchanged and the time to complete the protocol is reasonably short.
Another cryptographic technique that may be integrated into the provision of data services is proof of application. Similar to proof of possession, proof of application enables the verifier to determine that the data is being properly maintained by the prover (the CSP).
Blind fingerprinting represents another class of cryptographic techniques that extend network deduplication techniques, such as Rabin fingerprinting, which are commonly used to minimize redundant data exchange on a network. In embodiments herein, fingerprinting (fingerprinting) is applied so that participants in the protocol (e.g., CSPs in the case of data stores) do not know the actual content of the data they are hosting.
Based on the above framework and the corresponding cryptographic techniques, various scenarios of CSP-based service offerings emerge, ranging from storage and computing services to communication and collaboration services. Larger enterprise customers have significant computing and storage assets in their current enterprise data centers, and the inertia against adopting cloud services can be high. Furthermore, such customers are experienced in and familiar with data center operations and want to capture the operating expense (OPEX) and capital expense (CAPEX) advantages, yet they are concerned about moving their sensitive business data from their own premises to the cloud.
For such customers, in embodiments, a set of applications is provided that involve the customers owning and operating their existing servers (such as Exchange servers). A second copy of the data may then be delegated to a cloud service provider for data protection, archiving, compliance, regulatory, legal, or other reasons. The CSP thus has the skills and economies of scale to protect this data from loss or leakage, and can facilitate running applications on top of this second copy. A small sample of example products and services that may be provided to customers based on the maintained data includes litigation support, monitoring and supervision, service dial tone, data navigation, and so forth.
With regard to litigation support, when a company is sued, the litigation process requires various entities to perform searches of historical email records. These entities include internal legal staff, HR (human resources), managers, outside counsel, their outside litigation support partners, and opposing counsel. There are specific scope rules about who can perform what search. In current litigation support scenarios, it is difficult to bound the scope. Thus, any individual participating in litigation support may see emails that are out of scope. In the case of email, the results of a search are often exchanged in the form of Personal Storage Table (PST) files, which pose an additional risk because these files may be inadvertently or maliciously handed over to unauthorized individuals.
In contrast, when the second copy is hosted remotely (e.g., stored in the cloud by a CSP) and maintained there, it is possible for a single trusted entity in the enterprise (e.g., a chief legal officer) to provide each individual in the operation with a specific trapdoor that limits that individual's query capability to what they need. Data hosted in the cloud and protected by searchable encryption and tamper-resistant audit logs provides a higher level of protection, preventing inappropriate email access. The need to exchange PST files is eliminated because all individuals in the operation directly access the cloud for queries, and the litigation support partner is the only entity that exports the targeted content for conversion to Tagged Image File Format (TIFF) for case management.
With regard to monitoring and supervising the remote data copy, any reasonably sized organization should proactively monitor its email for a variety of purposes. These may range from legal/compliance to regulatory purposes such as monitoring for IP leakage, piracy, inappropriate language, etc. Typically, monitoring and supervision software monitors either the primary server or a second copy that is backed up or archived. The problem with monitoring the primary server is that it may place an excessive burden on a busy production server. Furthermore, because it is possible for an administrator to inadvertently or maliciously modify or delete data on the primary server, one solution is to capture the data in a compliant manner and transmit it to a second copy, where the monitoring and supervision software continually scans incoming email, looking for or searching for particular patterns. However, in many enterprise settings there is local administrative access to these second copies, and as a result, a resourceful administrator can modify or delete information despite tamper detection and prevention mechanisms.
In contrast, having the data maintained by the CSP advantageously places the second copy in a different region of control. Suitable cryptographic techniques, such as public-key encryption with keyword search (PEKS) and proof of possession (POP), can ensure that even if an enterprise administrator and employees of the CSP collude, they are still prevented from positively identifying exactly which items they wish to modify. The monitoring and supervision software runs at a remote site or in the cloud and looks for items containing specific predetermined keywords by means of previously provided trapdoors.
As described herein according to various embodiments, independent data protection and cryptographic techniques are combined in the following manner: each enhanced and modified to support the other to provide solutions that are not currently available to consumers, businesses, ecosystems, and social networks, and to enable containerless, secure, private, and selectively accessible data in a cloud environment.
Trusted XML
XML has evolved into a popular network switching format for a variety of reasons, including but not limited to the efficient descriptive capabilities brought about by tags and their hierarchical arrangement. In this regard, XML data may be protected in accordance with the above FTO infrastructure that allows different permissions to be applied to different portions of an XML document (including payload and tags, as well as any metadata added on top of existing tags or metadata). As also described above, so that trusted XML can be stored in a container-less manner.
As shown in FIG. 31, the XML payload 3102 and its tag 3100 may be encrypted to form an encrypted tag 3110 and payload 3112. In this regard, breaking the XML document into XML fragments with potentially different levels of protection enables a permission system with far finer granularity, one that does not rely on the original organization of the document at the publisher side. Further, additional data 3114 may be added to the payload data based on any function, and additional XML tags 3116 may be applied to facilitate additional functionality to be applied to the trusted XML fragments. Operations on the payload 3112/tag 3110 include operations 3120 such as searching, querying, backing up, auditing, and so forth. Based on the optional addition of the data 3114 or tag 3116, other operations 3130 may be implemented on the data. For example, whenever the data matches the pattern of a social security number, a tag 3116 that marks the XML fragment as private may be automatically added to keep such information from being compromised.
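As a non-limiting illustration of the automatic tagging just described, the following Python sketch marks any element whose text matches a social security number pattern. The regular expression, the `privacy` attribute name, and the sample record are assumptions made for illustration; the description only states that a tag marking the fragment as private is added.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern for a US social security number such as 123-45-6789.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def tag_private_fragments(root):
    """Mark every element whose text looks like an SSN as private."""
    tagged = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        if SSN_PATTERN.match(text):
            # Hypothetical attribute name standing in for tag 3116.
            elem.set("privacy", "private")
            tagged.append(elem.tag)
    return tagged


# Hypothetical sample record.
doc = ET.fromstring(
    "<Patient><Name>John McKenzie</Name><SSN>123-45-6789</SSN></Patient>"
)
tagged = tag_private_fragments(doc)
print(tagged)
print(ET.tostring(doc, encoding="unicode"))
```

A later stage could then select stronger protection for any fragment that carries the marker.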
In this regard, if the data/information is in XML format, any of the above-described techniques with respect to data/metadata may be applied to the structured XML data to selectively query and gain access to the XML fragments. XML has a standard format, namely <tag "value"> or <tag "value" | XML end-tag>. In this regard, with structured XML, there is a way to represent the structure hierarchically such that there is an outer wrapper that points to the CKG/CTP 'frame' unique to the digital pipe model. Thus, when there is a need to access an embedded fragment, existing (or materialized, new) trust is used with the <CKG> and <CTP> wrappers. This allows the user to search, extract, and decrypt the fragments when permitted. In addition, PDP may be used to verify that the particular XML fragment requested has not been tampered with since it was originally authored.
Thus, in embodiments, a "trusted envelope" for an XML fragment or a complete record (the "payload") is created through various "decorations" that allow the trust to span a full range of trust guarantees, such as confidentiality, privacy, anonymity, and integrity.
This is consistent with the containerless data embodiments described above. The opportunity to separate data from its containers (e.g., file systems, databases) facilitates sharing in a way that preserves and extends the original guarantees without requiring the containers to enforce them. Based on business needs, and as different technologies emerge, any other wrapper may also be added beyond cryptographic search, cryptography-based tamper detection, etc. With XML data, tags may be added to the data to help modulate the consistency of the data, which may depend on the domain and application.
Advantageously, the XML may include searchable metadata that encodes authentication, authorization, schema, history, traces, consistency, and the like. If the XML is a fragment of a larger document, it may also encode manifest information for locating other related fragments. The ability to use any agreed-upon CKG/CTP, together with the ability to add other wrappers that complement searchable encryption and PDP as new technologies become applicable, enables a flexible architecture that handles any type of cloud scenario. XML tags may also be extended or added to modulate consistency, trails, etc.
When combined with data dissemination techniques, strong guarantees are achieved regarding confidentiality, privacy, anonymity, and integrity. This "trusted envelope" may be used to decorate any payload with additional metadata, which may include schema information, consistency hints, versions and trails, confidentiality levels (e.g., when using swarm computation), locators for reconstructing the payload from slices held by other peers, etc.
In one non-limiting application, trusted XML provides "loose format binding" to grow the ecosystem and thereby catalyze network effects. The combination of FTO (which parameterizes these technologies and key managers) and the universal XML exchange format facilitates greater flexibility in adapting to dissemination technologies, applications, domains, locales, principals, formats, and other requirements.
In another application, current settlement and reconciliation for aggregation involve point-to-point exchanges that are prone to errors, omissions, and fraud. Inserting secure and private data services would therefore directly benefit accounting, auditing, etc., in a manner that facilitates selective disclosure, such that trusted entities remain reliable and appropriate regulators (compliance, legal) or intermediaries (conflict resolution, etc.) may be allowed to selectively peek at XML tags to establish trust in transactions. An advantage of trusted XML is that the payload can encode a proprietary format between participants, which the storing party does not need to know or even attempt to understand. The layers of the trusted wrapper thus add significant technical and business value, as well as legal and compliance value and value for the owning entities.
In another application, health care system integration is burdensome due to (a) disparate, incompatible legacy systems and (b) more importantly, the loss of patient stickiness to existing solution providers. By introducing cloud data services as a clearinghouse and trusted XML as the clearinghouse format, these existing solution providers can treat this as a way to maintain that stickiness while also utilizing the common format facilitated by XML.
Using FTO-enabled "routers" ("gateways/guardians") together with trusted XML is advantageous because (a) routers can do their job without learning more than is needed for routing, (b) routers have fewer degrees of freedom for mistakes and misbehavior, and (c) complex key management is eliminated due to late binding.
In addition, tags may be added or augmented or additional metadata may be applied to the XML document to indicate that the content has various levels of sensitivity. For example, there may be documents with public, secret, and confidential paragraphs. For example, a person performing a search and requesting access with a secret permission will only have access to public and secret paragraphs. The classification of paragraphs may also be used to determine encryption mechanisms, keys, and access policies. For example, confidential content cannot be accessed from a wireless or remote device.
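By way of a non-limiting sketch, the classification-based access just described can be modeled as a comparison between a paragraph's level and the requester's permission. The `level` attribute, the `p` element name, the ordering of levels, and the fail-closed default for unlabeled paragraphs are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumed ordering of sensitivity levels, lowest first.
LEVELS = {"public": 0, "secret": 1, "confidential": 2}


def visible_paragraphs(root, clearance):
    """Return the text of paragraphs at or below the requester's clearance."""
    limit = LEVELS[clearance]
    return [
        p.text
        for p in root.iter("p")
        # Unlabeled paragraphs are treated as the most sensitive level.
        if LEVELS[p.get("level", "confidential")] <= limit
    ]


# Hypothetical document with one paragraph per level.
doc = ET.fromstring(
    "<doc>"
    '<p level="public">Visiting hours are 9 to 5.</p>'
    '<p level="secret">Ward B is being audited.</p>'
    '<p level="confidential">Merger talks start Monday.</p>'
    "</doc>"
)
print(visible_paragraphs(doc, "secret"))
```

In a trusted XML deployment the check would be enforced cryptographically by the capabilities issued to the requester rather than by a plaintext filter; the sketch only shows the policy logic.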
Similarly, the classification can be used to create policies regarding how data can be stored, where data can be stored, and for how long data can be stored. For example, a policy may require that medical data be backed up once a day to a secure server in a trusted data center using AES-256 encryption.
FIG. 32 is a flow diagram illustrating an exemplary process for hosting trusted XML in one embodiment. At 3200, a computing device in a first control region receives encrypted XML data from a computing device in a second control region, the encrypted XML data including encrypted XML payload data and encrypted XML tags. The encrypted XML data is formed from an encryption of a defined XML data set of the computing device in the second control region according to a searchable encryption algorithm based on cryptographic key information. At 3210, auxiliary metadata encrypted based on the cryptographic key information is received, wherein the auxiliary metadata is formed from an analysis of the encrypted XML payload data or the encrypted XML tags. At 3220, a request for data is received, the request including a capability based on the cryptographic key information that defines privileges for accessing some of the encrypted XML payload data or the encrypted XML tags, thereby allowing selective access to the encrypted XML data as defined by the capability. At 3230, optionally, it is confirmed that the correct subset of encrypted XML data and corresponding XML tag data, conforming to the request, was received by the subscriber device.
FIG. 33 is a flow diagram illustrating an exemplary process for hosting trusted XML in one embodiment. At 3300, cryptographic key information is received from a key generation component that generates the cryptographic key information based on identity information associated with a subscriber device. At 3310, the subscriber device requests a subset of the searchable encrypted XML data and corresponding XML tag data. The cryptographic key information is transmitted to a storage provider of the searchable encrypted XML data and corresponding tag data. At 3320, the subset of encrypted XML data and corresponding XML tag data is decrypted as allowed by the capabilities defined in the cryptographic key information. At 3330, it is confirmed that the correct subset of encrypted XML data and corresponding XML tag data, conforming to the request, was received by the subscriber device. At 3340, it is verified that the contents of the subset of encrypted XML data and corresponding XML tag data were not deleted or modified before the subset was received.
FIG. 34 is a flow diagram illustrating an exemplary process for hosting trusted XML in one embodiment. At 3400, the XML data is encrypted according to a searchable encryption algorithm based on cryptographic key information received from a separate key generator to form encrypted XML data, the encrypted XML data including encrypted XML tag information, wherein the key generator generates the cryptographic key information. At 3410, the encrypted XML data is transmitted to a network service provider for storage of the encrypted data. At 3420, the encrypted data is selectively accessible according to late binding of the selected privileges granted to the requesting device based on the identity information of the requesting device.
FIG. 35 is a flow diagram illustrating an exemplary process for hosting trusted XML in one embodiment. At 3500, the subscriber device makes a request for a subset of the searchable encrypted XML data that includes the encrypted XML tags. At 3510, cryptographic key information is received from a key generation component that generates the cryptographic key information based on identity information associated with the subscriber device. At 3520, the subset of encrypted XML data is decrypted according to the privileges assigned to the subscriber device as defined in the key information.
As provided for various embodiments, trusted XML protects data from the level of the whole document down to the level of individual nodes, and can protect access privileges at the node level in a versatile and efficient manner.
Examples illustrating one or more concepts are set forth below in which anonymous IBEs are used and/or in which AES is used as a cryptographic technique for obfuscation. However, it is to be understood that any suitable mathematical transformation may be used in these examples, and thus a given technique for obfuscating or hiding XML data should not be taken as limiting any of the more general concepts.
In one embodiment, encoding of the XML may be accomplished by passing the XML through an encoding transformer that outputs trusted XML, and decoding may be accomplished by passing the trusted XML through a corresponding decoding transformer. In this regard, the same node may be protected multiple times (wrapped at multiple levels) to provide higher protection across the boundaries of the node.
In the following illustrative, but non-limiting, example, medical records are used to explain an implementation of trusted XML that uses the concept of selective encoding to treat different portions of data differently.
In one non-limiting aspect, trusted XML enables protection of selected portions of an XML document, rather than the entire document. For example, one implementation may protect the inner block of any element marked with 'action="encode"'.
For example, to protect the name of a patient, the Name element may be marked as follows:
<Name action="encode">John McKenzie</Name>
As a result, the data payload (here, 'John McKenzie') will not be visible to anyone.
This selective encoding can be done at any element level. For example, it can be done for a flat element (as described above), or it can be set for some hierarchical element (e.g., 'BloodTest') as follows:
In the above example, the entire 'TestData' inside the 'BloodTest' element will not be visible to anyone.
Pre-processor information may be generated in order to set up the mathematical transformations to be performed, such as but not limited to encryption. For example, anonymous IBE (AIBE) and AES may be used as cryptographic techniques, but again it is noted that these are non-limiting.
In one embodiment, the system may have a separate Capability Generation Center (CGC) for managing and generating secrets for the AIBE, as generally described for the federated architecture described elsewhere herein.
To perform the AIBE setup, in one embodiment, the CGC may generate public parameters (PUB) and a master secret key (MSK). The MSK may be kept secret within the CGC. The PUB may be provided to end users of the application for use with AIBE.
FIG. 36 is a flow diagram of an exemplary, non-limiting algorithm for use in the context of AIBE to selectively encode trusted XML documents. At 3600, the inner block (R) of an element marked for 'encode' is received. At 3610, a new encryption key (K) is generated, and at 3620, the inner block (R) is encrypted with the key (K), e.g., (R) → (R'). At 3630, an 'identity keyword' (w) is generated, which is the keyword that is later used to request a capability. At 3640, (w) is used to protect (K) with any asymmetric algorithm, and the result is denoted (K'). At 3650, (R) and (K) may be discarded. At 3660, (R') and (K') may be added, along with algorithm information and service information, for later obtaining the corresponding capability.
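The control flow of steps 3600 through 3660 can be sketched in Python as follows. This is a minimal sketch under loudly stated assumptions: `toy_cipher` is a hash-based stand-in for AES, `toy_protect` is a symmetric stand-in for AIBE that uses a shared secret in place of the public parameters, the output layout under 'Data'/'Value' and 'KeyInfo'/'Key' is assumed, the algorithm and service information of step 3660 are omitted, and the document Id `DOC42` and the sample record are hypothetical. None of it is suitable for protecting real data.

```python
import base64
import hashlib
import hmac
import secrets
import xml.etree.ElementTree as ET


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Toy stand-in, NOT AES."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def toy_protect(secret: bytes, w: str, k: bytes) -> bytes:
    """Wrap K under a key derived from keyword w. Toy stand-in, NOT AIBE."""
    wrap_key = hmac.new(secret, w.encode(), hashlib.sha256).digest()
    return toy_cipher(wrap_key, k)


def encode_marked(root: ET.Element, doc_id: str, secret: bytes) -> None:
    """Replace the inner block of every element marked action="encode"."""
    for elem in list(root.iter()):
        if elem.get("action") != "encode":
            continue
        # 3600: the inner block R is the element's text plus its children.
        r = (elem.text or "") + "".join(
            ET.tostring(child, encoding="unicode") for child in elem
        )
        k = secrets.token_bytes(32)                 # 3610: new key K
        r_prime = toy_cipher(k, r.encode("utf-8"))  # 3620: R -> R'
        w = doc_id + elem.tag                       # 3630: identity keyword w
        k_prime = toy_protect(secret, w, k)         # 3640: K -> K'
        del r, k                                    # 3650: discard R and K
        tail = elem.tail
        elem.clear()  # drops the text, the children, and the attributes
        elem.tail = tail
        # 3660: store R' and K' (assumed layout).
        data = ET.SubElement(elem, "Data")
        ET.SubElement(data, "Value").text = base64.b64encode(r_prime).decode("ascii")
        key_info = ET.SubElement(elem, "KeyInfo")
        ET.SubElement(key_info, "Key").text = base64.b64encode(k_prime).decode("ascii")


# Hypothetical input record.
record = ET.fromstring(
    '<Record><Name action="encode">John McKenzie</Name>'
    "<Type>Outpatient</Type>"
    '<BloodTest action="encode"><TestData labname="Quest">'
    "<data>42</data></TestData></BloodTest></Record>"
)
encode_marked(record, "DOC42", secrets.token_bytes(32))
encoded = ET.tostring(record, encoding="unicode")
print(encoded)
```

Elements that are not marked, such as `Type` here, pass through unchanged, which is the selective aspect of the encoding.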
The following are example implementations using AES and AIBE.
Starting with the following elements:
In this example implementation, (R) is as follows:
<TestData labname="Quest">
<data>…</data>
</TestData>
For the data processing performed as part of the encoding, a 256-bit AES encryption key (K) is first generated. The record (R) is then encrypted with AES under (K), for example:
AES_K(R) → R'
As an example, the Base64-encoded value of R' may be expressed as follows:
pDB9AaoGgBMbkUAox/+thz6IlIWpE21Qj0ZiW8I9vQ91OA3WrRaIUTWg9iDqvgu7svclH1SjENgBWDzlo5gaWYX1D+Ib3j6VpGX13mwd5Dq5FctLQFbSLWZCBzsCC/ORbe6A1iwk+6fGam/GrVcyuXeocIxUsmSBc0hhhwwdbz2IKpvY+rqW63uglgcbn4pyMbnOdiofbPOroqVXyCbFCDGbS46cmac8YKeDGrCURayt/yZW3Z7AwCzLvN3py6LBZvj8W4lJbzND5fa/S3bdfg==
The 'Id' of the document may then be taken, and the element name may be appended to it. The result is used as the 'identity keyword' (w). For example, here (w) may be "JK392E8DBloodTest". Next, AIBE is used to protect (K), for example, as follows:
AIBE(PUB,w,K)→K’
As a result of protection via AIBE, K' may look as follows:
32,BuLI8ihhSAV3oxa9hm7Dx70BuLI8i,9uzEeIG89oAasixlbDLae9uzEeI,zn9xpp89kZtTio0zn9x,fmmxLd3Ehg16Efmmx
At this point, R and K may be discarded, and the output XML may be generated, as described above.
To compile the output XML, (R') is held inside the 'Value' element, for example, as follows:
<Value>
pDB9AaoGgBMbkUAox/+thz6IlIWpE21Qj0ZiW8I9vQ91OA3WrRaIUTWg9iDqvgu7svclH1SjENgBWDzlo5gaWYX1D+Ib3j6VpGX13mwd5Dq5FctLQFbSLWZCBzsCC/ORbe6A1iwk+6fGam/GrVcyuXeocIxUsmSBc0hhhwwdbz2IKpvY+rqW63uglgcbn4pyMbnOdiofbPOroqVXyCbFCDGbS46cmac8YKeDGrCURayt/yZW3Z7AwCzLvN3py6LBZvj8W4lJbzND5fa/S3bdfg==
</Value>
Next, the transformation algorithm that was used is added. Here, it may be Encrypt and AES, for example.
In addition, a namespace is defined and encapsulated inside the 'Data' element, e.g., as follows:
As for the key, (K') is held inside the 'Key' element, for example, as follows:
Again, the transformation information that was used is added. Here, it is Encrypt and AIBE, for example, as follows:
Service information from which the decoder can retrieve the key is also added, e.g., as follows:
For example, a namespace may be defined and encapsulated inside 'KeyInfo', as follows:
An example output element for BloodTest is as follows:
The following input record and transformed output record illustrate an example transformation.
Example input record:
Example transformed output record:
The above examples thus highlight that any data obfuscation or other mathematical transformation may be applied to different portions of XML data, encoding the different portions differently and enabling selective access to the data.
As for decoding, the requesting entity initially retrieves or receives a capability. To obtain the capability, the decoder provides the 'identity keyword' (w) to the CGC to request the 'capability' (C). In response to the request, the CGC provides the capability (C) for the given 'identity keyword'.
At this point, the provided capability opens the (K') of the matching 'identity keyword' (w), but not any other K'. In the given example, if the user wants to get the 'Name' in the document, the user provides the 'identity keyword' (w) of the element 'Name'.
Here, (w) will be "JK392E8DName".
Once the user has acquired this capability, it can be applied to K' to obtain K according to the following:
AIBE(K’,PUB,C)→K
Now, with K, the user can decrypt R', for example, as follows:
AES_K(R') → R
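The property that a capability opens only the K' of the matching identity keyword can be illustrated with the following Python sketch. It is a toy model under stated assumptions: `ToyCGC` is a hypothetical class that derives capabilities from a master secret with HMAC, and its `protect` method is a symmetric stand-in for the AIBE protection step (in real AIBE the encoder would use only the public parameters). The keyword strings with the `DOC42` prefix are hypothetical.

```python
import hashlib
import hmac
import secrets


def _pad(c: bytes, nonce: bytes) -> bytes:
    """Derive a 32-byte one-time pad from capability C and a nonce."""
    return hashlib.sha256(c + nonce).digest()


class ToyCGC:
    """Toy capability generation center. Symmetric stand-in, NOT AIBE."""

    def __init__(self) -> None:
        self._msk = secrets.token_bytes(32)  # master secret stays inside the CGC

    def capability(self, w: str) -> bytes:
        """Return the capability C for identity keyword w."""
        return hmac.new(self._msk, w.encode(), hashlib.sha256).digest()

    def protect(self, w: str, k: bytes) -> bytes:
        """Produce K' from a 32-byte K for keyword w (stand-in for the encoder)."""
        c = self.capability(w)
        nonce = secrets.token_bytes(16)
        body = bytes(a ^ b for a, b in zip(k, _pad(c, nonce)))
        tag = hmac.new(c, nonce + body, hashlib.sha256).digest()
        return nonce + body + tag


def open_key(k_prime: bytes, c: bytes):
    """Recover K from K' with capability C, or return None if C does not match."""
    nonce, body, tag = k_prime[:16], k_prime[16:-32], k_prime[-32:]
    expected = hmac.new(c, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(a ^ b for a, b in zip(body, _pad(c, nonce)))


cgc = ToyCGC()
k = secrets.token_bytes(32)
k_prime = cgc.protect("DOC42Name", k)

c_name = cgc.capability("DOC42Name")
c_blood = cgc.capability("DOC42BloodTest")
print(open_key(k_prime, c_name) == k)   # the matching capability recovers K
print(open_key(k_prime, c_blood))       # any other capability opens nothing
```

The recovered K would then be used to decrypt R' as in the final step above.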
Additional embodiments of containerless data and details regarding the federated trust overlay are provided below as supplemental context.
Supplemental context for trusted cloud services ecosystem
As described above, separate data protection and cryptographic techniques are combined in various ways to enhance privacy, trust, and security with respect to data (e.g., as data stored at a remote site such as maintained by a CSP). Although the general ecosystem is described below in the context of a general data or web service, such general data or web service can be used in any one or more of the above scenarios of storing data at a remote site.
A digital escrow pattern is provided to network data services, including searchable encryption techniques for data stored in the cloud, to distribute trust across multiple entities to avoid compromise by a single entity. In one embodiment, the key generator, the cryptographic technology provider, and the cloud service provider are each provided as separate entities, enabling a publisher of data to publish data confidentially (encrypted) to the service provider, and then selectively expose the encrypted data to subscribers requesting the data, the selective exposure being based on subscriber identity information encoded into key information generated in response to subscriber requests.
With respect to searchable encryption/decryption algorithms, a searchable public key encryption (PEKS) scheme implemented by one or more cryptographic technology providers generates, for any given message W, a trapdoor TW such that TW allows checking whether a given ciphertext is an encryption of W, where TW does not reveal any additional information about the plaintext. According to various embodiments described below, the PEKS scheme may be used to prioritize or filter encrypted data, such as encrypted messages, based on keywords contained in the encrypted data (e.g., message text). Thus, a data recipient may be given selective access to the portions of the encrypted data that relate to a keyword by releasing the capability (sometimes referred to by cryptographers as a "trapdoor") for the corresponding keyword. In this way, the encrypted data can be checked for these keywords, but it is guaranteed that the subscriber learns nothing more than the subscriber's capabilities allow.
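The trapdoor idea can be illustrated with the following Python sketch, in which a server matches blinded keyword tags without seeing the keywords themselves. This is a simplified symmetric stand-in based on HMAC, not a PEKS scheme: a real PEKS scheme encrypts under a public key and derives trapdoors from the corresponding private key, and deterministic tags such as these leak keyword repetition. The function names and the sample messages are hypothetical.

```python
import hashlib
import hmac
import secrets


def trapdoor(search_key: bytes, keyword: str) -> bytes:
    """Compute a trapdoor T_W for keyword W (toy stand-in, NOT PEKS)."""
    return hmac.new(search_key, keyword.lower().encode(), hashlib.sha256).digest()


def index_message(search_key: bytes, text: str) -> set:
    """Publisher side: blinded tags for each distinct word of a message."""
    return {trapdoor(search_key, word) for word in text.split()}


def search(store: list, t_w: bytes) -> list:
    """Server side: ids of the messages whose tags match the trapdoor."""
    return [
        msg_id
        for msg_id, tags in store
        if any(hmac.compare_digest(tag, t_w) for tag in tags)
    ]


search_key = secrets.token_bytes(32)
# The server stores only message ids and blinded tags (ciphertexts omitted).
store = [
    (1, index_message(search_key, "quarterly merger update")),
    (2, index_message(search_key, "lunch menu")),
]
hits = search(store, trapdoor(search_key, "merger"))
misses = search(store, trapdoor(search_key, "salary"))
print(hits, misses)
```

A subscriber holding only the trapdoor for "merger" can locate message 1 but learns nothing about keywords for which no trapdoor was released.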
For the avoidance of doubt, although in one or more embodiments herein PEKS is disclosed as the algorithm for implementing searchable encryption, it will be appreciated that there are a number of alternative algorithms for implementing searchable encryption. Some exemplary, non-limiting alternatives to PEKS include, for example, Oblivious RAM. Thus, the term "searchable encryption" as used herein should not be limited to any one technique, and instead refers to a wide range of encryption mechanisms, or combinations of encryption mechanisms, that allow selective access to a subset of encrypted data based on a search or query function over the encrypted data.
Optionally, the subscribers and publishers of data in the ecosystem can be provided with confirmation and/or verification of results as an additional benefit. Confirmation provides a way to confirm that the data items received as a result of a subscription request for a subset of data are the correct set of items, i.e., that the correct subset of data that should have been received has actually been received. One such technique in the field of cryptography is provable data possession (PDP); however, for the avoidance of doubt, PDP is merely one exemplary algorithm that may be implemented, and other algorithms that achieve the same or similar goals may be used. Provable data possession, or proof of data possession, concerns how to frequently, efficiently, and securely verify that a storage server faithfully stores its clients' potentially large outsourced data. The storage server is assumed to be untrusted in terms of both security and reliability.
Verification of results provides an additional mechanism for checking the content of the items themselves, i.e., for ensuring that the items received in connection with the subscription request have not been tampered with by any unauthorized entity. One example of verification in the field of cryptography is provable data possession (PDP); however, for the avoidance of doubt, PDP is merely one exemplary algorithm that may be implemented, and other algorithms that achieve the same or similar goals may be used. Another technique known in the cryptographic arts is proof of retrievability (POR); however, for the avoidance of doubt, POR is merely one exemplary algorithm that may be implemented, and other algorithms that achieve the same or similar goals may be used. A POR is a compact proof given by a service provider or data host (prover) to a client (verifier) that a target file F is intact, in the sense that the client can fully recover the file F and no tampering has occurred.
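To make the challenge and response character of such checks concrete, the following Python sketch shows a naive spot-check audit in which a client keeps only a secret key and challenges an untrusted server on randomly chosen blocks. This is an assumed, MAC-based simplification written in the spirit of possession and retrievability proofs; it is not an actual PDP or POR construction (which use compact, aggregatable proofs), and the class and function names are hypothetical.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 64  # bytes per audited block (arbitrary choice)


def tag_block(key: bytes, index: int, block: bytes) -> bytes:
    """Bind a block to its position with a keyed MAC."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def prepare(key: bytes, data: bytes):
    """Client side: split data into blocks and compute one tag per block."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    tags = [tag_block(key, i, b) for i, b in enumerate(blocks)]
    return blocks, tags


class StorageServer:
    """Untrusted server that stores the blocks together with their tags."""

    def __init__(self, blocks, tags):
        self.blocks, self.tags = list(blocks), list(tags)

    def respond(self, index: int):
        return self.blocks[index], self.tags[index]


def audit(key: bytes, server: StorageServer, n_blocks: int, samples: int = 3) -> bool:
    """Challenge random block indices and verify each response."""
    for _ in range(samples):
        index = secrets.randbelow(n_blocks)
        block, tag = server.respond(index)
        if not hmac.compare_digest(tag, tag_block(key, index, block)):
            return False
    return True


key = secrets.token_bytes(32)
blocks, tags = prepare(key, b"outsourced record " * 20)
server = StorageServer(blocks, tags)
honest = audit(key, server, len(blocks))

server.blocks = [b"\x00" * len(b) for b in server.blocks]  # corrupt every block
tampered = audit(key, server, len(blocks))
print(honest, tampered)
```

Sampling only a few blocks per audit keeps each check cheap, at the cost of detecting partial corruption only with some probability per audit.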
As an additional option, the ecosystem may implement the concept of anonymous credentials, whereby publishers can upload information about themselves in an anonymous fashion without exposing critical details, and subscribers may be limited by their capabilities such that they cannot be exposed to, or given access to, the critical details uploaded by publishers. In this way, a publisher or subscriber may interact with the system while exposing only as much information as they wish to third parties.
Conventional web services are limited to static client-server arrangements and statically defined user policies for accessing the web services. However, such conventional web service models are neither flexible nor secure enough when many publishers and subscribers are contemplated according to complex business and other relationships that change and evolve often. Thus, in various embodiments, late binding is enabled such that publishers and/or owners of data and content can change access privileges to encrypted content based on who the subscriber is, based on the subscriber's capabilities, and based on what the subscriber is looking for (e.g., based on keywords used in requests for data). Thus, what a subscriber can selectively access changes dynamically, consistent with changes to access privileges made by publishers and/or owners, because subscriber capabilities are encoded in the key information provided on the fly by the key generator. A subscriber's privileges are thus defined for a given request at the time the key is generated for that request, and therefore always reflect the current policy with respect to requests from the subscriber.
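Late binding can be illustrated with the following Python sketch, in which privileges are evaluated when key information is issued for a request rather than when the data was published. The `KeyGenerator` class, the dictionary-based policy, and the role and keyword names are hypothetical, and the actual derivation of cryptographic key material is omitted.

```python
class KeyGenerator:
    """Issues per-request key information that reflects the current policy."""

    def __init__(self, policy: dict):
        # role -> set of keywords the role may search; owned by the publisher
        self.policy = policy

    def issue(self, role: str, keyword: str) -> dict:
        """Evaluate the policy at request time (late binding)."""
        allowed = keyword in self.policy.get(role, set())
        # Real key information would carry cryptographic material; omitted here.
        return {"role": role, "keyword": keyword, "granted": allowed}


policy = {"auditor": {"invoice"}}
ckg = KeyGenerator(policy)

first = ckg.issue("auditor", "salary")    # not yet permitted
policy["auditor"].add("salary")           # the publisher updates the policy
second = ckg.issue("auditor", "salary")   # the same request now succeeds
print(first["granted"], second["granted"])
```

The stored encrypted data is never touched by the policy change; only the key information generated for later requests differs.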
Similarly, an administrator of a server of a trusted cloud service may be permitted to observe logs of activity and data transactions processed by the server, but may also be restricted from seeing any customer name or credit card information. Thus, the identity of the subscriber may be the basis for limiting the types of data that the subscriber can access.
Various non-limiting embodiments of trusted ecosystems are presented herein in the context of building trust for cloud services; however, the trust establishment for ecosystems provided herein is more general and is not limited to application to cloud services. Rather, the embodiments described herein are similarly applicable to different servers or participants within an enterprise data center. Thus, while data may never leave a given entity, the techniques described herein for building trust are equally applicable where different processes within an enterprise operate within separate control regions. Without visibility across all enterprise processes, similar distrust may arise as if the participants were outside the enterprise. For example, a server cloud may be breached within an enterprise even though it is under the control of the enterprise, or an administrator may be careless or malicious.
In addition to being applicable to encrypted data in the cloud, the various techniques of the invention may also be applicable to data stored on laptop computers or other portable devices, as laptop computers may be lost or stolen. In this case, the device may end up in the possession of an overly curious or malicious entity; however, the same techniques described herein that are suitable for protecting data in the cloud may also be used to protect data on a server or laptop. If the local data is encrypted, then without proper subscriber credentials a thief will not be able to understand the encrypted local data, because the thief cannot present the proper role or capabilities to access the data.
FIG. 37 is a block diagram of a trusted cloud service framework or ecosystem, according to an embodiment. The system includes a trusted data store 3700 for storing searchable encrypted data 3710 and the results of subscriber requests that have undergone validation and/or verification. In this regard, web services 3720 may be built on top of the secure data 3710 such that the publisher of the data retains control of the capabilities given to, for example, subscribers 3740 requesting the data through web services 3720. A publisher 3730 can also be a subscriber 3740, and vice versa, and the owner 3750 of the data can also be a publisher 3730 and/or a subscriber 3740. As an example of some common roles and corresponding capability sets that may be defined, administrators 3760 and auditors 3770 are special types of publishers 3730 and subscribers 3740.
For example, administrator 3760 may have a special set of permissions to data 3710 to help maintain operation of the trusted data store 3700, and auditor entity 3770 may help maintain the integrity of certain data within the scope of an audit. For example, auditor 3770 may subscribe to messages of data 3710 that contain offensive keywords, in which case auditor 3770, if permitted according to the assigned capabilities, is alerted when a message of data 3710 contains such offensive keywords, but is unable to read other messages. In this regard, countless scenarios can be built on the ability to place publisher data under digital escrow so that keys can be distributed that enable selective access to the data.
For example, a publisher authenticates to an ecosystem and indicates a set of documents to upload to the ecosystem. The document is encrypted according to a searchable encryption algorithm based on cryptographic key information received from a separate key generator that generates the key information. The encrypted data is then transmitted to a network service provider for storage of the encrypted data such that the encrypted data is selectively accessible in accordance with a late binding based on selected privileges assigned to the requesting device based on the identity information of the requesting device. Separating the cryptographic technology provider from the storage of the encrypted data would additionally isolate the encrypted data from further compromise.
In this regard, FIG. 38 is a flow diagram illustrating an exemplary non-limiting method for publishing data in accordance with a trusted cloud service ecosystem. At 3800, the publisher authenticates with the system (e.g., the publisher logs in with a username and password, a Live ID credential, etc.). At 3810, key information is generated by a key generator, such as a center for key generation, which is described in one or more embodiments below. At 3820, a separate cryptographic technology provider encrypts the set of publisher documents based on the key information. At 3830, the encrypted documents are uploaded, along with the capabilities, to a network service provider, such as a storage service provider, such that the encrypted documents can be selectively accessed with late binding of selected privileges granted based on the identity information of the requesting device (subscriber).
For example, on the subscriber side, the subscriber authenticates to the ecosystem and indicates a request for a subset of data (e.g., a query for a subset of documents containing a given keyword or set of keywords). In response to a request for the searchable encrypted data from at least one subscriber device, a key generation component generates cryptographic key information based on identity information associated with the subscriber device. The subset of encrypted data is then decrypted according to the privileges assigned to the subscriber device defined in the cryptographic key information.
FIG. 39 is a flow diagram illustrating an exemplary non-limiting method for subscribing to data in accordance with a trusted cloud service ecosystem. At 3900, the subscriber is authenticated (e.g., the subscriber logs in with a username and password, a Live ID credential, etc.). At 3910, the subscriber makes a request for data. At 3920, key information is generated by a separate key generation entity based on the subscriber request, wherein the capabilities of the subscriber may be defined in the key information. At 3930, a subset of the publisher data is decrypted based on the capabilities defined in the key information. For example, the CSP may decrypt the data. At 3940, the subset of publisher data is made accessible to the subscriber, e.g., the subscriber can download, view, process, or change the data based on the dynamically definable capabilities granted by the owner/publisher. Optionally, the techniques for encryption, decryption, and key generation may be provided by a separate cryptographic technology provider, but hosted by any participant.
In one embodiment, the identity information of the subscriber device includes the role of the subscriber. For example, an auditor role, an administrator role, or other pre-specified role can be used by the publisher/owner as a basis to limit or grant access to portions of the searchable encrypted data store.
Fig. 40 illustrates an exemplary ecosystem showing the separation of a key generation Center (CKG) 4000, a Cryptographic Technology Provider (CTP) 4010, and a Cloud Service Provider (CSP) 4020, thereby eliminating the possibility of a single entity causing harm in a trusted ecosystem. In this regard, clients 4030 include publishers and/or subscribers to data. Optionally, CKG4000 may be built based on reference software, open source software, and/or a Software Development Kit (SDK), for example, provided by CTP4010, so that building blocks of multiple parties can create such components themselves or be satisfied by third parties implementing such ecosystem components. In one embodiment, the SDK is provided by CTP4010 and can be used by one or more participant users to host or implement CKG4000, a Compute and Store Abstraction (CSA) and/or a cryptographic client library as described in more detail below. Optionally, the SDK may be an entity distributed from CTP4010 to hosting CKG 4000.
In general, each of CKG 4000, CTP 4010, or CSP 4020 may be subdivided into subcomponents depending on a given implementation; however, the overall separation is preserved to maintain trust. For example, CKG entities 4001, such as Master Public Key (MPK) delivery 4002, client library downloader 4004, secret key extractor 4006, trust verifier 4008, or other subcomponents, may be provided separately in subsets, or together as an integrated component. CTP entities 4011, such as client applications 4012 for encoding and decoding, alternative encryption techniques 4014, applications 4016 for interfacing with the CKG, other cryptographic building blocks 4018, and the like, may also be provided separately in subsets or together. Further, the CSP 4020 may be considered to be a number of separate service providers, such as the CSPs 4022, 4026 that host the storage service 4024 and the service host 4028, respectively, or such services may be provided together.
It can be appreciated that the CKGs or CKG instances hosted by one or more participants in the trusted ecosystem need not be a single monolithic entity. Rather, CKGs may be divided into multiple (redundant) entities that cooperate to generate keys so that operations may continue even if a small subset of participants are offline. In one embodiment, optionally, the set of participants may be trusted in their entirety even when a small subset of these participants have been compromised by an adversary or otherwise become unavailable or untrusted.
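As an illustration of how several cooperating entities could jointly hold key material so that operation continues when a few are offline, the following minimal sketch uses Shamir threshold secret sharing. The text does not name a particular scheme, so Shamir sharing is a swapped-in example; the field modulus and the names `split_secret` and `recover_secret` are assumptions. A real distributed CKG would typically avoid ever reassembling the secret in one place.

```python
import secrets

# Illustrative field modulus (the Mersenne prime 2**127 - 1); an assumption.
PRIME = 2**127 - 1

def split_secret(secret, n, k):
    """Split `secret` into n shares such that any k shares recover it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# Five hypothetical CKG participants, any three of which suffice.
master = secrets.randbelow(PRIME)
shares = split_secret(master, n=5, k=3)
```

With these parameters, two of the five participants can be offline or excluded and the remaining three still recover the same value.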
FIG. 41 is another architectural diagram illustrating additional benefits of a trusted ecosystem for performing cloud services for enterprise 4100. For example, enterprise 4100 may include different organizations 4102, 4104, 4106, 4108. The different organizations 4102, 4104, 4106, 4108 in this figure show that organizations may take on as much or as little ownership as they wish with respect to implementing policy for use of the system, or key generation. For example, the organization 4102 implements its own policy 4112, but uses the centralized key generator 4122, while the organization 4104 chooses to implement its own key generator 4124 and implement its own policy 4114. The organization 4106 also implements its own policy but relies on a third-party CKG 4126, while the organization 4108 chooses to rely on a third-party policy provider 4118 and an independent CKG 4128.
In this regard, to publish data, the publisher 4140 obtains the public parameters 4135 used to encrypt the data based on the output from the CKG 4122. Based on the public parameters, the data is encrypted by the publisher device 4140 at 4145 using an independent cryptographic technology provider. The encrypted data is uploaded to the storage abstraction service 4150, which hides the storage semantics associated with storing the encrypted data by one or more CSPs 4170, such as CSPs 4172, 4174, 4176, or 4178. On the subscriber device 4160, a request for data results in the generation of a private secret key 4165 from the CKG 4122. The private secret key 4165 includes information that enables the subscriber device 4160 to selectively access the searchably encrypted data by decrypting the data at 4155. Again, the semantics of retrieving data from the CSPs 4170 are hidden by the storage abstraction service 4150. Also, the privileges granted to the subscriber device 4160 are the current set of privileges that result from late binding of the capabilities granted by the publisher/owner.
As can be appreciated from FIG. 41, multiple data owners (enterprises or consumers) can participate in a trusted ecosystem as described herein to establish a trusted relationship. In this case, each owner may host or control its own CKG (e.g., CKG4124 of organization 4104) such that requests or queries for data are forwarded to the respective CKG to collect the required keys from all sharers of the requested data.
FIG. 42 is another block diagram illustrating the adaptation to different storage providers by the storage abstraction layer 4210. For this trusted ecosystem, desktops 4230, 4232, having client applications 4240, 4242, respectively, may publish or subscribe to data as described above, initiating a request to key generation center 4220 for key information used to encrypt and decrypt the data. Similarly, services 4244, 4246, 4248 may also be publishers and/or subscribers in the ecosystem. In this regard, the storage abstraction service 4210 (as the name implies) abstracts details about one or more particular repositories that are remote from the client for storage or retrieval by any of the private cloud storage 4200, the SQL data service storage 4202, or the simple storage web service 4204, among others.
In this regard, for the avoidance of doubt, FIG. 42 addresses a number of situations. In one case, FIG. 42 covers the disintermediation of storage providers (abstracting them out as individual providers) by the storage abstraction service, sometimes also referred to as the Compute and Storage Abstraction (CSA). In addition, FIG. 42 covers scenarios in which data is split and/or fanned out (e.g., for redundancy) to multiple back-end storage providers, which may be of the same or different types, such that the original data can be reconstructed even when one (or a few) of the back-end storage providers accidentally or intentionally delete or change their copies of the data.
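The fan-out-for-redundancy idea can be sketched as follows, assuming a simple XOR parity layout across hypothetical back-end providers. The text does not specify a redundancy code, so the layout and the names `fan_out` and `reconstruct` are illustrative only. This sketch tolerates the loss of one stripe; detecting a silently changed stripe would additionally require integrity checks, which are not shown.

```python
def fan_out(data: bytes, n_data: int = 2):
    """Split data into n_data equal stripes plus one XOR parity stripe.

    Each returned stripe would be handed to a different back-end provider.
    Returns (stripes, original_length).
    """
    stripe_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(stripe_len * n_data, b"\0")
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len] for i in range(n_data)]
    parity = bytearray(stripe_len)
    for stripe in stripes:
        for i, b in enumerate(stripe):
            parity[i] ^= b
    return stripes + [bytes(parity)], len(data)

def reconstruct(stripes, original_len: int) -> bytes:
    """Rebuild the data when at most one stripe is missing (None)."""
    missing = [i for i, s in enumerate(stripes) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can repair only one lost stripe")
    stripes = list(stripes)
    if missing:
        stripe_len = len(next(s for s in stripes if s is not None))
        rebuilt = bytearray(stripe_len)
        for s in stripes:
            if s is not None:
                for i, b in enumerate(s):
                    rebuilt[i] ^= b
        stripes[missing[0]] = bytes(rebuilt)
    # The last stripe is parity; the others concatenate to the padded data.
    return b"".join(stripes[:-1])[:original_len]

payload = b"encrypted payload bytes"
stored, length = fan_out(payload, n_data=3)
```

For example, setting any single entry of `stored` to `None` (a provider that deleted its copy) still allows `reconstruct(stored, length)` to return the original payload.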
FIG. 43 illustrates other aspects of storage in conjunction with a storage abstraction service 4310 comprising a server operating system (OS) 4314 and a storage service 4312, the storage abstraction service 4310 abstracting storage details of private cloud storage 4300, SQL data store 4302, simple storage web service storage 4304, and so on. The clients may be desktops 4350 or 4352 with client applications 4340 and 4342, respectively. The key generation center 4320 may include a key generator application 4322 executing on a server OS 4324. In this regard, organization 4330, with active directory 4336, server OS 4334, and security token service (STS) 4332, may be a publisher or subscriber in the ecosystem. In this regard, the Storage Transport Format (STF) is a standard interchange format that may be used to exchange encrypted data and metadata across multiple repositories. For example, organization 4330 may wish to transfer email data among storage service providers 4300, 4302, or 4304, in which case STF may be used.
FIG. 44 is another block diagram illustrating various participants in the trusted ecosystem 4420. As described above, the enterprise 4400 can advantageously offload the storage and maintenance of data volumes from on-site facilities to a cloud storage service provider that is better suited to handle such volumes, while retaining the assurance that the data will not be decrypted for the wrong subscriber, since the enterprise maintains control over the capabilities defined for the encrypted data. For example, organization 4402 may operate a collaboration application 4412 such as SharePoint. In this regard, the organization 4402 may establish a digital escrow, or trusted domain, for the SharePoint data. The policies 4432 and CKG 4434 may be implemented by a first data center 4430, which is configured to establish the secure space by defining cryptographic key information 4445 for the trusted domain.
Then, another organization 4404, e.g., acting as a publisher 4414, may encrypt data based on key information obtained from the CKG 4434, at which point the compute and storage abstraction component 4442 of the second data center 4440 handles the details of storing the searchably encrypted data at the third data center 4450 (e.g., in the CSP 4452). Conversely, when a subscriber 4416 of the organization 4404 requests data, private or secret key information is delivered to the subscriber 4416 as part of the extraction 4465. Next, based on the private key information, which includes the capabilities defined for the subscriber, the data requested by the subscriber is decrypted at 4475, assuming the subscriber has privileges, and the abstraction layer 4442 again handles the details of the underlying storage 4452.
FIG. 45 is a representative view of some layers of an exemplary non-limiting implementation of a trusted cloud computing system in which different components may be provided by different entities or by the same entity. At the bottom of the layer stack is a math and cryptographic library 4586, which is used to implement the encryption/decryption algorithms. An abstraction of the definitions of various cryptographic schemes may be provided as an intermediate layer 4584 between the detailed libraries 4586 and the actual implementation of the searchable cryptographic schemes 4582. Layers 4582, 4584, and 4586 together form a larger cryptographic services layer 4580 which, when combined with the abstraction layer 4560 for a software as a service (SaaS) application ecosystem, forms the basis for implementing the trusted digital escrow 4570 and its storage. The abstraction layer 4560 contains the basic language used to implement the digital escrow pattern, i.e., commands such as SetUp(), Encrypt(), Extract(), Decrypt(), and the like.
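The four commands named for the abstraction layer can be pictured as an abstract interface that concrete cryptographic schemes plug into. The sketch below is only one possible shape: the class name `DigitalEscrow`, the method signatures, and the parameter names are all assumptions, since the text gives the command names but not their arguments.

```python
from __future__ import annotations
from abc import ABC, abstractmethod

class DigitalEscrow(ABC):
    """Hypothetical interface for the SetUp/Encrypt/Extract/Decrypt commands."""

    @abstractmethod
    def setup(self) -> tuple[bytes, bytes]:
        """Run by the key generation center; returns (public_params, master_secret)."""

    @abstractmethod
    def encrypt(self, public_params: bytes, plaintext: bytes,
                keywords: list[str]) -> bytes:
        """Run by a publisher; needs only the public parameters."""

    @abstractmethod
    def extract(self, master_secret: bytes, capabilities: list[str]) -> bytes:
        """Run by the key generation center; returns a key encoding capabilities."""

    @abstractmethod
    def decrypt(self, capability_key: bytes, ciphertext: bytes) -> bytes:
        """Run for a subscriber; succeeds only within the granted capabilities."""
```

Keeping the interface abstract mirrors the layering described above: the cryptographic services layer can swap one scheme for another without changing the applications written against the four commands.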
Above the abstraction layer 4560 is a layer 4550 that ties into various more specific platform technologies (e.g., SDS, Azure, Backup/Archive, RMS, STS, etc.). On top of the layer 4550 are the various SaaS applications that use the trusted digital escrow 4500. The exemplary, non-limiting illustration shows that the digital escrow applications 4500 may be implemented by a single company 4510, by partners 4530, or by both. For example, company 4510 may implement services such as high performance computing (HPC), eDiscovery and legal discovery 4514, Live services 4516 (e.g., DBox), backup/archive as a service 4518, audit log: business process and monitoring 4520, or other cloud services 4522. Partners 4530 may implement services such as eLetterOfCredit 4532, HPC and vertical services 4534, eHealth services, secure extranet 4538, compliance 4540, litigation support 4542, and so forth.
Scenarios Based on a Trusted Cloud Service Ecosystem
Any type of application may be implemented in the cloud due to the increased trust inherent in the separation of the key generator, the cryptographic technology provider, and the cloud service provider, as well as other techniques described herein. In this regard, where such a trusted cloud services ecosystem has been implemented, a rich set of services and scenarios can be implemented that take advantage of one or more of the benefits of the trusted ecosystem described herein.
For example, FIG. 46 is a flow diagram of an exemplary non-limiting process for publishing documents to a digital safe application in a manner that gives the publisher controlled, selective access to the data using the late binding described above. At 4600, the device is authenticated (e.g., the device logs in with a username and password, password credentials, biometric credentials, LiveID credentials, etc.). At 4610, the document is uploaded and tags are entered. At 4620, the tags are sent to the escrow agent and, in response, hashed tags are received from the escrow agent. In this regard, the tags may be supplied as mentioned, or alternatively may be automatically extracted from the payload (record, document) by full-text indexing. At 4630, the client encrypts the documents with the publisher's key information, and the documents are sent to the secure digital cloud storage provider along with the capabilities of subscribers with respect to those documents. At 4640, the secure digital cloud storage provider sends, for example, the encrypted blob to the storage service (e.g., via the storage abstraction layer).
FIG. 47 is a flow diagram of an exemplary non-limiting process for subscribing to material placed in a digital safe. At 4700, the subscriber is authenticated and the client device sends tags to the escrow agent, which in response sends back hashed tags at 4710. The client then sends the hashed tags to the digital safe service at 4720, and the hashed tags are interpreted to determine whether, at 4730, the client is entitled to have its search request performed in whole or in part by the storage service.
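The exchange with the escrow agent in FIGS. 46 and 47 can be illustrated with a keyed hash, so that the digital safe service indexes and matches values it cannot read. This is a minimal sketch under stated assumptions: HMAC-SHA256 stands in for whatever hashing the escrow agent applies, the class names `EscrowAgent` and `DigitalSafe` are hypothetical, and the entitlement check at 4730 is not modeled.

```python
import hashlib
import hmac

class EscrowAgent:
    """Hypothetical escrow agent that hashes values with a secret it alone holds."""

    def __init__(self, secret: bytes):
        self._secret = secret

    def hash_tags(self, tags):
        return [
            hmac.new(self._secret, t.lower().encode("utf-8"), hashlib.sha256).hexdigest()
            for t in tags
        ]

class DigitalSafe:
    """Hypothetical storage-side index keyed only by hashed values."""

    def __init__(self):
        self._index = {}

    def publish(self, blob_id, hashed_tags):
        for h in hashed_tags:
            self._index.setdefault(h, set()).add(blob_id)

    def search(self, hashed_tags):
        hits = set()
        for h in hashed_tags:
            hits |= self._index.get(h, set())
        return hits

agent = EscrowAgent(secret=b"escrow-agent-secret")
safe = DigitalSafe()

# Publishing: the values are hashed before the safe ever sees them.
safe.publish("blob-1", agent.hash_tags(["invoice", "2010"]))
safe.publish("blob-2", agent.hash_tags(["contract"]))

# Subscribing: the same hashing lets the safe match without seeing plaintext.
hits = safe.search(agent.hash_tags(["Invoice"]))
```

Because the safe stores only hashed values, an operator who inspects the index learns which blobs share a value but not what that value is.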
FIG. 48 illustrates an exemplary, non-limiting implementation of a trusted cloud service that uses the digital escrow pattern to implement a secure extranet for an enterprise through one or more data centers. As mentioned, the trusted computing ecosystem can include a key generation center 4800 implemented separately from a cryptographic technology provider (CTP) 4810, which provides reference implementations for use in implementing cryptographic techniques consistent with the ecosystem, and which is in turn implemented separately from one or more cloud service providers (CSPs) 4820. In an exemplary, non-limiting implementation of the secure extranet, 4880 shows that the enterprise maintains a shared repository 4870 (e.g., SharePoint) and a repository 4860 of design or analysis applications for use in connection with the documents in shared repository 4870. Business software 4840 (e.g., Sentinel) may monitor application or server performance and the like for a computer having desktop 4850.
In this regard, in a trusted cloud services ecosystem, when a subscriber using desktop 4850 seeks information from storage that is selectively accessible and encrypted, security token service 4830 can deliver certain information to identify subscriber 4882, and CKG4800 can be consulted through an interface of CKG layer 4802 of the first data center, as shown at 4884. CKG4800 returns key information that can then be used to selectively access data held by data services 4824 through storage abstraction service 4822 as shown at 4886. Thus, any type of data can be selectively shared across an enterprise according to the role of subscribers in the enterprise.
FIG. 49 is a flow diagram illustrating another exemplary, non-limiting scenario based on a trusted cloud services ecosystem in which a subscriber is given selective access to encrypted data stored by, for example, a CSP within an enterprise. Initially, the subscriber device has not acquired privileges to access the encrypted data. However, when a request for some or all of the encrypted data is made at 4900, for example, by interacting with an application, the application automatically communicates with the corresponding STS at 4910 to obtain claims (in the parlance of cryptography). At 4920, the application communicates with the CKG to obtain key information that encodes information about the subscriber's capabilities (capabilities are sometimes referred to in cryptographic parlance as trapdoors, but the term "capabilities" is not limited to the context in which the term "trapdoors" commonly appears). Finally, at 4930, the application provides the key information to the CSP, which allows the encrypted data to be searched or queried to the extent permitted by the subscriber's capabilities.
FIG. 50 is another flow diagram illustrating that an application response can be customized for a subscriber based on login information. For example, at 5000, user ID information is received by the application. At 5010, the application obtains the relevant claims from the STS. At 5020, based on the one or more roles the user plays in association with the user ID information, the experience can be customized commensurate with the privileges/constraints of those roles. For example, the view of the company's encrypted data presented to the chief financial officer of a company may, and should, be a different user experience than the view of the company's encrypted data provided to a mail room employee. FIG. 50 may be applicable to single or multiple sign-on scenarios.
FIG. 51 is another flow diagram illustrating a secure record upload scenario, which may be implemented for one or more parties. At 5100, records and keywords are received by an application, e.g., provided or specified by a user of a device having the application. At 5110, the application obtains a Master Public Key (MPK) and applies a public key encryption with keyword search (PEKS) algorithm. The application may optionally cache the MPK. At 5120, the application enters the encrypted record into the CSP repository, e.g., through a storage abstraction layer.
FIG. 52 is another flow diagram illustrating an exemplary, non-limiting role-based query on a searchable encrypted data store implemented by a trusted cloud service ecosystem, e.g., for automated search by a single party. At 5200, the application receives or initiates the federated query. At 5210, the application obtains the relevant claims from the STS. For example, the STS maps the user's role to the appropriate set of queries and returns the set of legitimate queries for the given role. At 5220, the application submits the filtered claims and query, such that only the claims corresponding to the query are submitted, rather than all of the claims. Optionally, the CKG returns the trapdoor claims to the application (or rejects the claims). At 5230, the application executes the trapdoor claims against the remote index. Based on the processing of the remote index, the results may be received by the application and presented to the user by the application using a customized presentation, for example, based on the user role.
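The role-to-query mapping performed by the STS in FIG. 52 can be sketched as a simple lookup and filter. The role names, the query terms, and the functions `legitimate_queries` and `submit` are hypothetical; the sketch models only the filtering at 5210 and 5220, with the CKG's reject path reduced to returning `None`, and performs no cryptography.

```python
# Hypothetical mapping held by the STS: role -> set of legitimate query terms.
ROLE_QUERIES = {
    "auditor": frozenset({"invoice", "ledger", "payroll"}),
    "mailroom": frozenset({"shipping"}),
}

def legitimate_queries(role, requested_terms):
    """Return only the requested terms that the role is permitted to query."""
    allowed = ROLE_QUERIES.get(role, frozenset())
    return [term for term in requested_terms if term in allowed]

def submit(role, requested_terms):
    """Submit the filtered terms; None stands in for a rejected request."""
    permitted = legitimate_queries(role, requested_terms)
    if not permitted:
        return None  # nothing the role may query, so the request is rejected
    return permitted

auditor_result = submit("auditor", ["invoice", "shipping", "ledger"])
mailroom_result = submit("mailroom", ["invoice", "ledger"])
```

Filtering before submission means only the claims relevant to the query leave the application, which matches the efficiency point made in the paragraph above.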
FIG. 53 is a flow diagram illustrating a multi-party collaboration scenario in which an enterprise provides access to some of its encrypted data to an external enterprise. For example, a manufacturer may give a vendor access to some of its data stored in a trusted cloud, and vice versa. In this regard, at 5300, the STS of Enterprise 2 is designated as a resource provider, and the application of Enterprise 1 proceeds to obtain claims for access to the resources provided by the resource provider in the cloud. At 5310, the STS of Enterprise 1 is designated as an identity provider. In this regard, the application obtains claims for a role or set of roles defined by a subscriber at Enterprise 1, facilitated by the identity provider. At 5320, the application retrieves the claims based on the permissible resources controlled by Enterprise 2 and based on the permissions/capabilities defined by the subscribing entity. Although only one STS is depicted in FIG. 53, it should be noted that multiple identity provider STSs and/or multiple resource provider STSs may exist in a digital escrow or federated trust overlay.
FIG. 54 is a flow diagram illustrating a multi-party automated search scenario among multiple enterprises, such as Enterprise 1 and Enterprise 2. At 5400, the Enterprise 1 application receives or initiates a federated query for execution. At 5410, the application obtains the relevant claims from the STS of the resource provider (Enterprise 2). Optionally, the resource provider may be specified in an organization tag. The STS may optionally perform a mapping of user roles to query groups such that the set of legitimate queries for the user role is returned. At 5420, the application submits the filtered claims and query based on the user role, such that only the claims corresponding to the query are submitted, rather than all of the claims. Optionally, the CKG returns capabilities to the application (e.g., trapdoor claims), or the CKG rejects the claims. At 5440, the application executes the trapdoor claims against the remote index. Based on the processing of the remote index, the results may be received by the application and presented to the user by the application using a customized presentation, for example, based on the user role.
The method may include the step of receiving a federated query or otherwise initiating a federated query. In this regard, optionally, the federated query may also be cryptographically protected, such that no recipient of a trapdoor (or capability), whether a client or a service provider, can decompose the federated query and determine its components.
FIG. 55 illustrates an exemplary non-limiting edge compute network (ECN) technique that can be implemented for trusted cloud services. In this regard, a plurality of dynamic compute nodes 5570, 5572, 5574, 5576 are dynamically allocated computing bandwidth in conjunction with a set of trusted cloud components operating independently of one another. For example, key generation center 5520, storage abstraction service 5510, organization 5530, and organization 5540 may be implemented as shown to cover multi-organization business or other scenarios such as those described above. The key generation center 5520 includes a key generator 5522 and a server OS 5524. The storage abstraction service 5510 includes a storage services component 5512 and a server OS 5514. Organization 5530 includes STS 5532, AD 5536, and server OS 5534. Organization 5540 includes STS 5542, AD 5546, and server OS 5544. The server OSs 5514, 5524, 5534, 5544 cooperate to implement the ECN across servers. Any storage provider or abstraction 5502 may be used to store data; for example, SQL data services may be used. In this manner, one or more desktops 5550, 5552 may publish or subscribe to data through client applications 5560, 5562, respectively.
FIG. 56 is a block diagram illustrating one or more optional aspects of a key generation center 5610 according to a trusted cloud service ecosystem. Initially, a collection of computing devices, such as desktops 5660, 5662 and respective client applications 5670, 5672, or services or servers 5674, 5676, 5678, etc., are potential publishers and/or subscribers of the cloud content delivery network 5650. However, before a request from any of the set of computing devices is satisfied, the key generation center initially acts as a custodian of trust for publishers, who encrypt data based on a public key, and issues private keys to data subscribers based on their capabilities.
In an exemplary, non-limiting interaction, a request from a computing device is initially provisioned 5600, and the host of the CKG 5610 requests an instance of the CKG 5610 from the CKG factory 5602 at 5680. Next, user authentication 5604 is performed at 5682. Any usage-based billing 5684 may then be applied by the billing system 5606 for use of the CKG factory 5602. The tenant CKG is then materialized at 5686 by the CKG factory 5602, and may include an MPK delivery component 5612, a client library downloader 5614, a secret key extractor 5616, and a trust validator/verifier 5618.
The MPK delivery component 5612 delivers the MPK to the CDN5650 at 5688. The client library downloader 5614 downloads to the requesting client the cryptographic libraries that may be used in conjunction with encryption of the data to be published or decryption of the data subscribed to by the device. The client then makes a request to extract a given set of documents based on the key information received from the secret key extractor 5616 in cooperation with the trust verifier 5618, which trust verifier 5618 can confirm that the subscriber has certain capabilities based on verifying the STS thumbprint of the subscriber at 5694, e.g., based on communicating with a different STS5620, 5622, 5624, 5626 of the organization to which the request relates. As in other embodiments, a storage abstraction service 5640 may be provided to abstract storage details of database services 5630 (e.g., SQL).
FIG. 57 is an exemplary non-limiting block diagram of trusted storage 5700 in connection with delivery of web services 5700, the trusted storage 5700 including searchable encrypted data 5720 with validation and/or verification. In this embodiment, the subscriber 5740, or an application used by the subscriber 5740, may request, as part of a request to access portions of the encrypted storage 5700, a proof of validation over the items returned for the request, to confirm that the items actually received are also the items that should have been received. In this regard, FIG. 57 shows a combination of searchable encryption techniques and validation techniques. Optionally, the system may also be integrated with claim-based identity and access management, as described in other embodiments herein. In this regard, as described in various embodiments herein, the digital escrow pattern, also known as a federated trust overlay, can be seamlessly integrated with more traditional claim-based authentication systems.
In FIG. 57, the trusted data store 5700, or the service provider or host of the data store, performs the proving step, while the owner of the data (e.g., the subscriber device) performs the validation. The data store 5700 is trusted because users can have confidence that it provides strong guarantees, although it is understood that physical entities actually host the data and that some participants are not fully trusted.
FIG. 58 is a flow diagram of an exemplary non-limiting process for a subscription that includes a validation step. At 5800, a subset of the searchably encrypted data is received from the subscriber device. At 5810, cryptographic key information is generated from a key generation instance that generates the cryptographic key information based on identity information of the subscriber device. At 5820, the subset of encrypted data is decrypted according to capabilities assigned to the subscriber device as defined in the cryptographic key information. At 5830, the items represented in the subset may be validated (e.g., by proof of data possession), and the data is accessed at 5840.
In many cases, it is desirable to be able to perform a proof of data possession (PDP) or proof of retrievability (POR) on encrypted data without needing to decrypt the encrypted data. Optionally, the key information required by the PDP may be encoded within the metadata that was protected with searchable encryption. Although this is an efficient way of managing keys for PDP/POR, it should be noted that there are many high-value scenarios in which PDP/POR can be performed on encrypted data without requiring access to the plaintext content.
Fig. 59 illustrates an exemplary, non-limiting confirmation challenge/response protocol in which a verifier 5900 (e.g., a data owner) issues a cryptographic challenge 5920 to a prover 5910 (e.g., a data service provider). Upon receiving challenge 5920, prover 5910 computes response 5912 from the data and challenge. Challenge response 5930 is then returned to verifier 5900, who then performs calculations to verify or prove that the data has not been modified 5902.
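A heavily simplified version of the challenge/response exchange in FIG. 59 can be sketched with a keyed hash. This is not a full PDP scheme: here the verifier precomputes a fixed number of challenge/response pairs over the (already encrypted) data before handing it to the prover, and HMAC-SHA256 stands in for the proof computation. The names `respond` and `Verifier` and the number of precomputed challenges are assumptions.

```python
import hashlib
import hmac
import secrets

def respond(data: bytes, challenge: bytes) -> bytes:
    """Prover side: compute the response from the stored data and the challenge."""
    return hmac.new(challenge, data, hashlib.sha256).digest()

class Verifier:
    """Verifier side: keeps only precomputed (challenge, expected) pairs, not the data."""

    def __init__(self, data: bytes, n_challenges: int = 4):
        self._pending = []
        for _ in range(n_challenges):
            challenge = secrets.token_bytes(16)
            self._pending.append((challenge, respond(data, challenge)))
        self._expected = None

    def next_challenge(self) -> bytes:
        # Each challenge is used once so that old responses cannot be replayed.
        challenge, self._expected = self._pending.pop()
        return challenge

    def verify(self, response: bytes) -> bool:
        return hmac.compare_digest(response, self._expected)

stored = b"ciphertext held by the data service provider"
verifier = Verifier(stored)

challenge = verifier.next_challenge()
honest_ok = verifier.verify(respond(stored, challenge))

challenge = verifier.next_challenge()
tampered_ok = verifier.verify(respond(stored + b"!", challenge))
```

The prover cannot answer a fresh challenge without the unmodified data, and the verifier never needs the plaintext, which is the property the paragraph above describes.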
The validation shown generally in FIG. 59 is referred to as private PDP, but it should be noted that there is also a "public" version in which a third party is provided with a key (a "public" key) so that the third party can act as the verifier according to a similar protocol without learning anything about the actual data. POR, an example of verification, differs from PDP in that it provides proof that the data is retrievable (despite any corruption or modification); as shown below in FIG. 30, the basic protocol is the same, but the structure of the documents and the actual algorithms differ. Various embodiments of the trusted ecosystem herein combine searchable encryption with POR/PDP to benefit the system and strengthen trust. In this regard, the data is searchably encrypted before being submitted to the service provider, and post-processing of the data may include POR and/or PDP.
Additionally, if stronger guarantees are desired, a "data dispersion" technique may optionally be overlaid on any one or more of the embodiments described above. With data dispersion, data is distributed to several service providers to gain resilience against "massively bad behavior" or catastrophic loss at any single service provider. Using the trust mechanisms described herein, the dispersion is performed in a way that makes it difficult for independent service providers to collude and corrupt the data. This is similar to the concept of the distributed CKG embodiment described above.
FIG. 60 is another exemplary, non-limiting block diagram of a trusted store 2500 in connection with delivery of a network service 2520 for data from a publisher 2530, the trusted store 2500 including searchable encrypted data 2510 with validation and/or verification. In particular, FIG. 60 illustrates a verification component 6050 that is used to verify that the items returned to the subscriber 2540 have not been tampered with or otherwise inadvertently changed. The above-described PDP is a non-limiting example of verification.
FIG. 61 is a flow diagram of an exemplary non-limiting process for a subscription that includes a verification step. At 6100, a subset of the searchably encrypted data is received from the subscriber device. At 6110, cryptographic key information is generated from a key generation instance that generates the cryptographic key information based on identity information of the subscriber device. At 6120, the subset of encrypted data is decrypted according to capabilities assigned to the subscriber device as defined in the cryptographic key information. At 6130, the content of the items represented in the subset can be verified (e.g., by proof of retrievability), and the data is accessed at 6140.
Fig. 62 illustrates an exemplary, non-limiting verification challenge/response protocol in which a verifier 6200 (e.g., a data owner) issues a cryptographic challenge 6220 to a prover 6210 (e.g., a data service provider). Upon receiving the challenge 6220, the prover 6210 computes a response 6212 from the data and the challenge. The challenge response 6230 is then returned to the verifier 6200, who then performs a calculation to verify or prove that the data is recoverable 6202.
Blind fingerprinting represents another class of cryptographic techniques that extends network deduplication techniques, such as Rabin fingerprinting, which are commonly used to minimize redundant data exchange on a network. In embodiments herein, fingerprinting is applied so that participants in the protocol (e.g., CSPs in the case of data stores) are unaware of the actual content of the data they are hosting.
For some additional context regarding blind fingerprints, any large exchange of data across a wide area network (WAN), including maintenance of the data, calls for techniques for "deduplication" over the wire, that is, for ensuring that unnecessary data is not sent over the wire. This is accomplished by fingerprinting segments of the data and then exchanging fingerprints, so that the sender knows what it has that the receiver does not have. Likewise, the receiver knows what data it needs to request from the sender. Distributed File Service Replication (DFS-R) can be used to optimize data exchange in scenarios such as branch office backup and distributed file systems over a WAN.
In the Exchange case, there is a large amount of data duplication, and as much as 50% or more of the data on a line at any given time may be duplicated. Fingerprints may be obtained at the block level or at the object level (e.g., email, calendar items, tasks, contacts, etc.). Fingerprints may be cached at the primary and secondary data centers. Thus, if there is a failure at the primary data center, the secondary data along with the fingerprint may be restored to the primary data center. Encryption of the data at the primary data center should still allow the fingerprint to be visible, albeit obscured, to the secondary data center operator. This may be accomplished, for example, by storing the fingerprint as a keyword/metadata with searchable encryption such that no other entity can detect the pattern other than an authorized entity/agent in the data center.
In the context of data services, when sending full text or deltas, the primary data center can examine each item/segment/chunk in the log or EDB and consult a local copy of the fingerprints. If there is a match, the primary data center replaces the item/segment/chunk with the fingerprint. The term "blind fingerprint" is used herein because of the manner in which the fingerprinting is applied. In one embodiment, the cryptographic techniques selected to implement blind fingerprinting include a size-preserving cryptographic technique.
FIG. 63 is a block diagram of a general environment for providing one or more embodiments of services that include blind fingerprinting. With blind fingerprints, a data subscriber 6300 and a data service provider 6310 undergo a fingerprint exchange, which serves as a proxy for understanding which data segments each party already holds in its respective local and backup copies of the data set being backed up. As a result of the fingerprint exchange 6320, a reduced set of modification data to transmit is determined at 6302 and sent as deduplicated modification data 6330 to the data service provider 6310, which then applies the modification data based on selectively accessing the deduplicated modification data and any blind fingerprints 6340.
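By way of a non-limiting illustration, the fingerprint exchange described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any embodiment: a keyed HMAC-SHA256 stands in for the "blind" fingerprint (an assumption; the text mentions size-preserving and searchable encryption techniques instead), the fixed chunk size and the function names `blind_fingerprint`, `plan_transfer`, and `apply_transfer` are hypothetical, and chunks are stored in the clear only to keep the deduplication logic visible.

```python
import hashlib
import hmac

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def blind_fingerprint(key: bytes, chunk: bytes) -> bytes:
    """Keyed fingerprint: without the key, an observer cannot recompute it
    from guessed plaintext (assumed stand-in for a blind fingerprint)."""
    return hmac.new(key, chunk, hashlib.sha256).digest()


def plan_transfer(key: bytes, data: bytes, remote_fingerprints: set) -> list:
    """Sender side: replace every chunk the receiver already holds with its
    fingerprint; ship the chunk itself only when the fingerprint is unknown."""
    plan = []
    for chunk in chunks(data):
        fp = blind_fingerprint(key, chunk)
        if fp in remote_fingerprints:
            plan.append(("ref", fp, None))      # deduplicated: fingerprint only
        else:
            plan.append(("data", fp, chunk))    # new data: fingerprint + chunk
    return plan


def apply_transfer(plan: list, store: dict) -> bytes:
    """Receiver side: record new chunks, then rebuild the data set by
    resolving every entry through the fingerprint-indexed store."""
    rebuilt = []
    for kind, fp, chunk in plan:
        if kind == "data":
            store[fp] = chunk
        rebuilt.append(store[fp])
    return b"".join(rebuilt)
```

In this sketch, a second backup that changes one segment out of three would transmit one chunk and two fingerprints, which is the reduction that the fingerprint exchange is intended to achieve.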
FIG. 64 is a block diagram illustrating a non-limiting scenario in which multiple independent federated trust overlays or digital escrows may exist side by side, or on top of one another in a layered fashion. In this scenario, there is a trusted data store 6400 with searchable encrypted data 6410, and various network services 6420 may be based on this data 6410. For example, the network services 6420 may include delivering word processing software as a cloud service. As part of geographic distribution, etc., multiple overlays/hosts 6432, 6434, 6436 may optionally be provided, each tuned to a different application/profile/compliance requirement/ownership entity requirement, such that the publisher 2530 or subscriber 6450 implicitly or explicitly selects the correct overlay/host to participate in based on a set of requirements or a jurisdiction/area of residence. Thus, the overlay may change, but the backend services from the cloud may remain unchanged without complicating the delivery of the core services themselves.
FIG. 65 is a block diagram of another exemplary, non-limiting embodiment of a trusted storage that includes data distribution techniques for obfuscating data from unauthorized access. This example shows that all of the above techniques or systems that provide encryption techniques as a means for hiding or obfuscating data may also be implemented by any other mathematical transformation or algorithm that prevents visibility of the data (or metadata). In this regard, for example, data may be automatically integrated or distributed across a set of data stores, which may be containers of the same type or containers of different types 6512, 6514, ..., 6516, as shown in FIG. 65.
The system thus includes a data store 6500 that includes (as an abstraction) data stores 6512, 6514, … …, 6516 for storing selectively accessed data or metadata 6510. A publisher may publish data or metadata 6510 representing at least one resource to a data store 6500, and a first independent entity 6550 performs generation of access information applicable to the data and metadata as published, and a second independent entity 6560 distributes the data and metadata as published across a set of data stores of the data store 6500 while maintaining knowledge of the set of data stores storing the published data or metadata.
Thus, this knowledge is a secret that cannot be revealed without access information. The data or metadata 6510 may be published via a network service 6520, the network service 6520 providing selective access to the published data or metadata for a given request to the network service based on selected late-binding privileges granted by a publisher or owner of the at least one resource and represented by the access information. The data store 6500 includes a plurality of containers of the same or different container types, and the published data or metadata is automatically distributed across at least one container of the plurality of containers. The distribution may be based on any algorithm known to the data distributor 6560, for example, based on real-time analysis of the storage resources represented by the multiple containers, based on characteristics of the data or metadata, or any other parameter suitable for a given application.
In this regard, when the subscriber 6540 makes a request for data or metadata 6510, the network service consults the independent entities 6550 and/or 6560 to determine whether the subscriber 6540 is permitted to have access information that allows the data to be reassembled. For example, the data map may be a secret that permits reorganization of the data. This embodiment may be combined with other mathematical transformations, such as encryption, to provide additional protection for the data. Such additional mathematical transformations may be supervised by further independent entities for additional distribution of trust to further satisfy the following: this data remains invisible except to authorized parties.
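As a hedged illustration of the data distribution embodiment, the following minimal sketch scatters fragments of published data across a set of containers and returns the data map that permits reassembly. All names (`distribute`, `reassemble`, the fragment size, and the use of in-memory dictionaries as containers) are assumptions made for illustration; the text leaves the distribution algorithm open, and random placement is only one possible choice.

```python
import secrets


def distribute(data: bytes, containers: list, fragment_size: int = 4) -> list:
    """Scatter fragments of `data` across `containers` (dict-like stores).

    Returns the data map: an ordered list of (container index, handle)
    pairs. The map plays the role of the secret held by the data
    distributor; without it, the fragments cannot be put back in order.
    """
    data_map = []
    for offset in range(0, len(data), fragment_size):
        fragment = data[offset:offset + fragment_size]
        index = secrets.randbelow(len(containers))  # random placement (assumed policy)
        handle = secrets.token_hex(8)               # opaque name, reveals no ordering
        containers[index][handle] = fragment
        data_map.append((index, handle))
    return data_map


def reassemble(data_map: list, containers: list) -> bytes:
    """Rebuild the original data; only possible for a holder of the data map."""
    return b"".join(containers[index][handle] for index, handle in data_map)
```

A network service following this sketch would release the data map (or the reassembled data) to a subscriber only after the independent entities confirm the subscriber's access information. Combining the scattering with encryption of each fragment, as the text suggests, would add a further layer of protection.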
Various exemplary, non-limiting embodiments are described herein that illustrate the delivery of trusted data services. These embodiments are not independent, but may be combined with each other as appropriate. Additionally, any of the above embodiments may be extended to a number of alternatives. For example, in one embodiment, the trusted data service provides expiration and revocation of trapdoors or capabilities to obtain a greater degree of security for data access. In another optional embodiment, a rights management layer is built into the provisioning of trusted data services, for example to preserve rights attached to content as part of encryption/decryption or to prevent actions on copyrighted data in data hosting that are more easily recognized or detected in the clear. Thus, any combination or permutation of the embodiments described herein may be envisaged within the scope of the invention.
Exemplary, non-limiting implementations
An exemplary implementation of the digital escrow pattern is referred to as a Federated Trust Overlay (FTO). Some additional non-limiting details regarding FTO implementations are attached in Appendix A.
In this regard, the digital escrow pattern is merely one example of many possible patterns and variations. In addition, this schema (which includes publishers, subscribers, administrators, and auditors — and possibly other specialized roles as described above) forms a layer on top of another underlying FTO schema that performs the "church & state" separation of CTPs, CSPs, CKGs, etc. to maintain trust. There may also be multiple independent FTOs and DEPs that may coexist without interfering with each other and even without knowing each other's presence. Moreover, DEP and FTO can be overlaid on cloud storage without the cloud storage service providers collaborating or even knowing about the existence of these patterns/overlays.
In more detail, FTO is a service set independent of data services in the cloud. These services are run by multiple parties other than the operator of the data service and can provide strong guarantees regarding confidentiality, tamper detection, and non-repudiation of data hosted by the cloud service.
Any partner may construct and host these overlay services, such as an intermediary program service, a validation service, a storage abstraction service, and so forth. These partners may choose to host a reference implementation or to construct their own implementations based on publicly available formats and protocols.
Due to the open nature of the format, protocol, and reference implementations, maintaining separation of control among multiple parties, such as the operator of the FTO and the data owner, can be straightforward.
Although encryption is an element of the solution, the organization of services federated in the realm of different parties is also part of the solution. Although conventional encryption techniques are mandatory for many scenarios, they preclude implementing many of these scenarios, such as tamper detection, non-repudiation, building trust by organizing multiple (untrusted) services, searching databases, and so forth.
Supplemental context
As described above, for some additional non-limiting context, a trusted set of cloud offerings enables an application ecosystem for the cloud that builds on trust. Various terms used herein include: CKG (center for key generation), an entity that hosts a multi-tenant key generation center; for example, any of Microsoft, VeriSign, Fidelity, a sovereign entity, an enterprise, a corporate entity, etc. may host a CKG. In this regard, multi-tenancy is optional (e.g., desirable but not mandatory). Other terms include: CTP (cryptographic technology provider), an entity that provides cryptographic technology for use with the trusted ecosystem; for example, Symantec, Certicom, Voltage, PGP Corporation, BitArmor, an enterprise, Guardian, a sovereign entity, etc. are example entities that may be CTPs.
In addition, the term CSP (cloud service provider) refers to an entity that provides cloud services, including storage. Various companies may provide such data services. A CIV (cloud index validator) is a second repository for validating returned indices. A CSA (compute and storage abstraction) abstracts the storage back end. STF (storage transport format) is a common format for transporting data/metadata across multiple repositories.
In this regard, as noted above, some enterprise scenarios include: extranet engineering using data services technologies or applications; designing and engineering analysis; defining data relationships between manufacturers and suppliers, etc. Thus, a unique ecosystem is achieved for all of the multiple scenarios by distributing trust across multiple entities so that there are no 'over' trusted entities or a single point of compromise.
For some supplemental contexts regarding searchable encryption, a user typically has or obtains a 'capability' or 'trapdoor' for a keyword, and then uses that 'capability' (presents it to the server) to send a request. The server 'combines' the capabilities and index to find relevant documents or data. The user is then provided access only to the documents resulting from the search (although the user may access more than these documents).
As mentioned, no single algorithm should be considered limiting on the provision of a searchable encrypted data store as described herein; however, some of the theory behind an exemplary, non-limiting algorithm is outlined below, which provides a primer on the searchable symmetric encryption (SSE) pattern:
Message: $m$
Keywords: $w_1, \ldots, w_n$
PRF: $H$
Generating the escrow key
· Choose a random $S$ for $H$
Encrypting
· Choose a random key $K$
· Choose a random fixed-length $r$
· For $1 \le i \le n$:
  · Compute $a_i = H_S(w_i)$
  · Compute $b_i = H_{a_i}(r)$
  · Compute $c_i = b_i \oplus \mathrm{flag}$
· Output $(E_K(m), r, c_1, \ldots, c_n)$
Generating a trapdoor or capability for $w$
· $d = H_S(w)$
Testing for $w$
· Compute $p = H_d(r)$
· Compute $z = p \oplus c_i$
· Output "true" if $z = \mathrm{flag}$
Decrypt $E_K(m)$ to obtain $m$
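The SSE steps above can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not a definitive implementation: HMAC-SHA256 stands in for the PRF $H$, the combining step is assumed to be an XOR of $b_i$ with a fixed public flag (so that the test step recovers the flag exactly when the trapdoor matches), and the function names `keygen`, `build_index`, `trapdoor`, and `matches` are hypothetical. Encryption of the message itself, $E_K(m)$, is omitted; any symmetric cipher could fill that role.

```python
import hashlib
import hmac
import secrets

FLAG = b"\x00" * 32  # fixed public flag, same length as the PRF output (assumption)


def prf(key: bytes, message: bytes) -> bytes:
    """PRF H, instantiated here with HMAC-SHA256 (an assumed choice)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def xor(left: bytes, right: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(left, right))


def keygen() -> bytes:
    """Escrow key generation: choose a random S for H."""
    return secrets.token_bytes(32)


def build_index(S: bytes, keywords: list) -> tuple:
    """Encryption-side index: returns (r, [c_1, ..., c_n])."""
    r = secrets.token_bytes(16)                 # random fixed-length r
    index = []
    for w in keywords:
        a = prf(S, w.encode())                  # a_i = H_S(w_i)
        b = prf(a, r)                           # b_i = H_{a_i}(r)
        index.append(xor(b, FLAG))              # c_i = b_i XOR flag
    return r, index


def trapdoor(S: bytes, w: str) -> bytes:
    """Capability for keyword w: d = H_S(w)."""
    return prf(S, w.encode())


def matches(d: bytes, r: bytes, index: list) -> bool:
    """Server-side test: p = H_d(r); true if p XOR c_i equals the flag for some i."""
    p = prf(d, r)
    return any(hmac.compare_digest(xor(p, c), FLAG) for c in index)
```

Note the design property the sketch is meant to show: the server running `matches` sees only $r$, the $c_i$ values, and a trapdoor $d$; it never sees $S$ or the keywords themselves, and a trapdoor for one keyword reveals nothing about index entries for other keywords.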
Again, although it should not be considered limiting on any embodiment described herein, the following is a primer on the public-key encryption with keyword search (PEKS) pattern.
Public-key encryption
a. $\mathrm{PKE} = (\mathrm{Gen}, \mathrm{Enc}, \mathrm{Dec})$
Identity-based encryption
b. $\mathrm{IBE} = (\mathrm{Gen}, \mathrm{Enc}, \mathrm{Extract}, \mathrm{Dec})$
c. Generating master keys
   i. $(msk, mpk) = \mathrm{IBE.Gen}()$
d. Encrypting $m$ for $ID$
   i. $c = \mathrm{IBE.Enc}(mpk, ID, m)$
e. Generating a secret key for $ID$
   i. $sk = \mathrm{IBE.Extract}(msk, ID)$
f. Decrypting
   i. $m = \mathrm{IBE.Dec}(sk, c)$
g. Message: $m$
h. Keywords: $w_1, \ldots, w_n$
i. Generating escrow keys
   i. $(msk, mpk) = \mathrm{IBE.Gen}()$
   ii. $(pk, sk) = \mathrm{PKE.Gen}()$
j. Encrypting
k. For $1 \le i \le n$
   i. $c_i = \mathrm{IBE.Enc}(mpk, w_i, \mathrm{flag})$
l. Return $(\mathrm{PKE.Enc}(pk, m), c_1, \ldots, c_n)$
m. Generating a capability or trapdoor for $w$
   i. $d = \mathrm{IBE.Extract}(msk, w)$
n. Testing for $w$
o. For $1 \le i \le n$
   i. $z = \mathrm{IBE.Dec}(d, c_i)$
   ii. Output "true" if $z = \mathrm{flag}$
Decrypt $E_K(m)$ to obtain $m$
Exemplary networked and distributed environments
One of ordinary skill in the art will appreciate that the various embodiments and related embodiments of the methods and apparatus for a trusted cloud services framework described herein may be implemented in connection with any computer or other client or server device that may be deployed as part of a computer network or in a distributed computing environment and that may be connected to any kind of data store. In this regard, the embodiments described herein may be implemented in any computer system or environment having any number of memory or storage units, and any number of applications and processes occurring across any number of storage units. This includes, but is not limited to, environments with server computers and client computers deployed in network environments or distributed computing environments with remote or local storage.
FIG. 66 provides a non-limiting schematic diagram of an exemplary networked or distributed computing environment. The distributed computing environment comprises computing objects or devices 6610, 6612, etc. and 6620, 6622, 6624, 6626, 6628, etc. which may comprise programs, methods, data stores, programmable logic, etc. as represented by applications 6630, 6632, 6634, 6636, 6638. It is to be appreciated that computing objects or devices 6610, 6612, etc. and computing objects or devices 6620, 6622, 6624, 6626, 6628, etc. may comprise different devices, such as PDAs, audio/video devices, mobile phones, MP3 players, laptop computers, etc.
Computing objects or devices 6610, 6612, etc. and computing objects or devices 6620, 6622, 6624, 6626, 6628, etc. may communicate with one or more other computing objects or devices 6610, 6612, etc. and computing objects or devices 6620, 6622, 6624, 6626, 6628, etc. by way of the communications network 6640, either directly or indirectly. Even though illustrated as a single element in FIG. 66, network 6640 may comprise other computing objects or computing devices that provide services to the system of FIG. 66, and/or may represent multiple interconnected networks, which are not shown. The computing objects or devices 6610, 6612, etc., or 6620, 6622, 6624, 6626, 6628, etc., may also contain applications, such as applications 6630, 6632, 6634, 6636, 6638, which may utilize APIs or other objects, software, firmware, and/or hardware suitable for communicating with or implementing trusted cloud computing services or applications provided in accordance with embodiments of the present invention.
There are a variety of systems, components, and network configurations that support distributed computing environments. For example, computing systems may be connected together by wired or wireless systems, by local networks, or by widely distributed networks. Currently, many networks are coupled to the Internet, which provides an infrastructure for widely distributed computing and encompasses many different networks, but any network infrastructure may be used for exemplary communications that become associated with the techniques as described in various embodiments.
Thus, a host of network topologies and network infrastructures, such as client/server, peer-to-peer, or hybrid architectures, can be utilized. In a client/server architecture, particularly a networked system, a client is typically a computer that accesses shared network resources provided by another computer (e.g., a server). In the illustration of FIG. 66, as a non-limiting example, computing objects or devices 6620, 6622, 6624, 6626, 6628, etc. can be thought of as clients and computing objects or devices 6610, 6612, etc. can be thought of as servers, where computing objects or devices 6610, 6612, etc. provide data services, such as receiving data from computing objects or devices 6620, 6622, 6624, 6626, 6628, etc.; storing data; processing data; and transmitting data to clients, such as computing objects or devices 6620, 6622, 6624, 6626, 6628, etc., although any computer may be considered a client, a server, or both, depending on the circumstances. Any of these computing devices may process data or request services or tasks that may include the improved user profiling and related techniques described in one or more embodiments herein.
A server is typically a remote computer system accessible over a remote or local network, such as the Internet or wireless network infrastructure. A client process may be active in a first computer system and a server process may be active in a second computer system, communicating with each other over a communications medium, thereby providing distributed functionality and allowing multiple clients to take advantage of the information-gathering capabilities of the server. Any software objects utilized pursuant to user profiling can be provided independently or distributed across multiple computing devices or objects.
For example, in a network environment in which the communications network/bus 6640 is the internet, the computing objects or devices 6610, 6612, etc. can be web servers with which clients, such as computing objects or devices 6620, 6622, 6624, 6626, 6628, etc., communicate via any of a number of known protocols, such as the hypertext transfer protocol (HTTP). A server, such as computing object or device 6610, 6612, may also act as a client, such as computing object or device 6620, 6622, 6624, 6626, 6628, etc., as is characteristic of a distributed computing environment.
Exemplary computing device
As noted above, the various embodiments described herein are applicable to any device in which it is desirable to implement one or more portions of a trusted cloud services framework. It should be understood, therefore, that handheld, portable, and other computing devices and computing objects of all kinds are contemplated for use in connection with the various embodiments described herein, i.e., anywhere that a device may provide some functionality in connection with a trusted cloud services framework. Accordingly, the general purpose remote computer described below in FIG. 67 is but one example, and embodiments of the disclosed subject matter can be implemented with any client having network/bus interoperability and interaction.
Although not required, any of the various embodiments can be implemented in part via an operating system, for use by a developer of services for a device or object, and/or included within application software that operates in conjunction with operational components. Software may be described in the general context of computer-executable instructions, such as program modules, being executed by one or more computers, such as client workstations, servers or other devices. Those skilled in the art will appreciate that network interactions may be practiced with a variety of computer system configurations and protocols.
Thus, fig. 67 illustrates an example of a suitable computing system environment 6700 in which one or more embodiments may be implemented, but as made clear above, the computing system environment 6700 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of any of the embodiments. Neither should the computing environment 6700 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment 6700.
With reference to FIG. 67, an exemplary remote device for implementing one or more embodiments herein may include a general purpose computing device in the form of a handheld computer 6710. Components of the handheld computer 6710 may include, but are not limited to: a processing unit 6720, a system memory 6730, and a system bus 6721 that couples various system components including the system memory to the processing unit 6720.
The computer 6710 typically includes a variety of computer readable media such as, but not limited to, Digital Versatile Disks (DVD), flash memory, internal or external hard drives, Compact Disks (CD), and the like, and can be any available media that can be accessed by the computer 6710 including remote drives, cloud storage disks, and the like. The system memory 6730 can include computer storage media in the form of volatile and/or nonvolatile memory such as Read Only Memory (ROM) and/or Random Access Memory (RAM). By way of example, and not limitation, memory 6730 may also include an operating system, application programs, other program modules, and program data.
A user may enter commands and information into the computer 6710 through input devices 6740. A monitor or other type of display device is also connected to the system bus 6721 via an interface, such as output interface 6750. In addition to a monitor, computers can also include other peripheral output devices such as speakers and a printer, which may be connected through output interface 6750.
The computer 6710 can operate in a networked or distributed environment using logical connections to one or more other remote computers, such as a remote computer 6770. The remote computer 6770 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, or any other remote media consumption or transmission device, and may include any or all of the elements described above relative to the computer 6710. The logical connections depicted in FIG. 67 include a network 6771, such as a local area network (LAN) or a wide area network (WAN), but may also include other networks/buses. Such networking environments are commonplace in homes, offices, enterprise-wide computer networks, intranets, and the Internet.
As noted above, while exemplary embodiments are described in connection with various computing devices, networks, and advertising architectures, the underlying concepts may also be applied to any network system and any computing device or system in which it is desirable to provide trust in connection with interaction with cloud services.
There are numerous ways of implementing one or more of the embodiments described herein, e.g., appropriate APIs, toolkits, driver code, operating systems, controls, standalone or downloadable software objects, etc., that enable applications and services to use the trusted cloud services framework. Embodiments may be conceived from the standpoint of an API (or other software object), as well as from a software or hardware object that provides a fixed-point platform service in accordance with one or more of the described embodiments. Various implementations and embodiments described herein may have aspects that are wholly in hardware, partly in hardware and partly in software, as well as in software.
The word "exemplary" is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. Additionally, any aspect or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms "includes," "has," "contains," and other similar words are used in either the detailed description or the claims, for the avoidance of doubt, such terms are intended to be inclusive in a manner similar to the term "comprising" as an open transition word without precluding any additional or other elements.
As noted, the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both. As used herein, the terms "component," "system," and the like are likewise intended to refer to a computer-related entity, either hardware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computer and the computer can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers.
The aforementioned systems have been described with respect to interaction between several components. It can be appreciated that such systems and components can include those components or specified sub-components, some of the specified components or sub-components, and/or additional components, according to various permutations and combinations of the foregoing. Sub-components can also be implemented as components communicatively coupled to other components rather than included within parent components (hierarchical). Additionally, it should be noted that one or more components may be combined into a single component providing aggregate functionality or divided into several separate sub-components, and any one or more middle layers, such as a management layer, may be provided to communicatively couple to such sub-components in order to provide integrated functionality. Any components described herein may also interact with one or more other components not specifically described herein but generally known by those of skill in the art.
In view of the exemplary systems described above, methodologies that may be implemented in accordance with the disclosed subject matter will be better appreciated with reference to the flowcharts of the various figures. While, for purposes of simplicity of explanation, the methodologies are shown and described as a series of blocks, it is to be understood and appreciated that the claimed subject matter is not limited by the order of the blocks, as some blocks may occur in different orders and/or concurrently with other blocks from what is depicted and described herein. Where non-sequential, or branched, flow is illustrated via flowchart, it can be appreciated that various other branches, flow paths, and orders of the blocks may be implemented which achieve the same or a similar result. Moreover, not all illustrated blocks may be required to implement the methodologies described hereinafter.
While a client-side perspective is shown in some embodiments, it is to be understood for the avoidance of doubt that a corresponding server perspective exists, and vice versa. Similarly, in implementing a method, a corresponding device may be provided having storage and at least one processor configured to implement the method via one or more components.
While the embodiments have been described in connection with the preferred embodiments of the various figures, it is to be understood that other similar embodiments may be used or modifications and additions may be made to the described embodiment for performing the same function without deviating therefrom. Moreover, one or more aspects of the various embodiments described herein may be implemented in or across a plurality of processing chips or devices, and storage may similarly be effected across a plurality of devices. Accordingly, the present invention should not be limited to any single embodiment, but rather should be construed in breadth and scope in accordance with the appended claims.

Claims (15)

1. A method for hosting data, comprising:
receiving at least one of data or metadata associated with the data on a trusted platform hosting the data, wherein the data, the metadata, or both are protected by a composite wrapper formed from at least one mathematical transformation of the data, the metadata, or both by a mathematical transformation component separate from the trusted platform hosting the data, wherein the mathematical transformation includes at least a first mathematical transformation defining a first wrapper for the data, the metadata, or both based on a first set of criteria and a second mathematical transformation defining a second wrapper for the data, the metadata, or both based on a second set of criteria, wherein if the state of the data, the metadata, or both changes to a new state, the first wrapper or the second wrapper is modified based on a new set of criteria associated with the new state; and
receiving a request to access data, metadata, or both protected by the composite wrapper based on a set of capabilities included in the request, the set of capabilities generated by an access information generator separate from the trusted platform hosting the data and the mathematical transformation component; and
based on the set of capabilities, determining at least one access privilege to the data, metadata, or both based on evaluating visibility through the first wrapper and independently evaluating visibility through the second wrapper.
2. The method of claim 1, wherein the receiving comprises receiving the at least one of data or metadata protected by a composite wrapper formed from the at least one mathematical transformation, the at least one mathematical transformation comprising at least a first mathematical transformation defining a first wrapper that wraps less than all of the data, the metadata, or both based on the first set of criteria.
3. The method of claim 1, wherein the receiving comprises receiving the at least one of data or metadata protected by a composite wrapper formed from the at least one mathematical transformation, the at least one mathematical transformation comprising at least a first mathematical transformation defining a first wrapper that wraps the data, the metadata, or both based on the first set of criteria, and at least a second mathematical transformation defining a second wrapper that wraps the data, the metadata, or both wrapped by the first wrapper.
4. The method of claim 1, wherein the receiving comprises receiving the at least one of data or metadata protected by a composite wrapper formed from the at least one mathematical transformation, the at least one mathematical transformation comprising at least a first mathematical transformation defining a first wrapper that wraps less than all of the data, the metadata, or both, and at least a second mathematical transformation defining a second wrapper that wraps all of the data, the metadata, or both, based on the first set of criteria.
5. The method of claim 1, wherein the receiving comprises receiving data, metadata, or both protected by a composite wrapper comprising a supplemental wrapper, the composite wrapper comprising at least first and second wrappers for satisfying supplemental trust or security criteria.
6. The method of claim 1, further comprising:
if the state of the data, the metadata, or both becomes a new state, at least one additional wrapper that is appropriate for the new set of criteria associated with the new state is automatically added or removed.
7. The method of claim 1, wherein if the confidentiality classification of the data, the metadata, or both changes to a more sensitive classification, then automatically adding at least one additional wrapper to the data, the metadata, or both that is appropriate for the more sensitive classification.
8. The method of claim 1, wherein the determining comprises determining a concentric order in which to evaluate visibility.
9. The method of claim 1, wherein the determining comprises determining a lateral order in which to evaluate visibility.
10. The method of claim 1, wherein defining the first wrapper or defining the second wrapper comprises defining access speed requirements for the data, the metadata, or both.
11. The method of claim 1, wherein defining the first wrapper or defining the second wrapper comprises defining tamper-resistant requirements for the data, the metadata, or both.
12. The method of claim 1, wherein defining the first wrapper or defining the second wrapper comprises defining a recovery reliability requirement specified for the data, the metadata, or both.
13. A system for hosting data, comprising:
an access information generator, operative with the at least one processor, configured to generate capability information for publishing the data, the metadata, or both, or for subscribing to at least one of the published data, the published metadata, or both;
at least one mathematical transformation component, distributed at least in part by a mathematical transformation technology provider, implemented independently of an access information generator, the at least one mathematical transformation component comprising at least one processor configured to execute at least one encoding algorithm or decoding algorithm based on capability information generated by the access information generator; and
a network service provider implemented independently of the access information generator and the at least one mathematical transformation component, comprising at least one processor configured to implement a network service related to data, metadata, or both encrypted by the at least one mathematical transformation component, the network service provider configured to communicate with the at least one mathematical transformation component to perform the generation, regeneration, modification, augmentation, or deletion of a mathematical transformation wrapper applicable to at least two of the data, metadata, or both,
wherein the system is configured to perform the method of any one of claims 1 to 12.
14. The system of claim 13, wherein the network service provider is configured to generate, regenerate, change, augment, or delete a wrapper based on at least one temporal event that modifies a trust requirement in a set of trust requirements for the wrapper.
15. The system of claim 13, wherein the network service provider is configured to generate, regenerate, change, augment, or delete a wrapper based on at least one spatial event that modifies a trust requirement in a set of trust requirements for the wrapper.
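By way of illustration only, and without limiting the claims, the automatic addition of a wrapper when a classification becomes more sensitive (claim 7) can be sketched as follows. This is a minimal sketch under stated assumptions: the names `LEVELS`, `demo_key`, `xor_wrap`, and `WrappedData` are hypothetical and do not appear in the specification, and the XOR keystream is a toy stand-in for a real mathematical transformation, not a secure cipher.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical ordering of confidentiality classifications (an assumption).
LEVELS = {"public": 0, "internal": 1, "secret": 2}


def demo_key(level: str) -> bytes:
    """Hypothetical key lookup; a real system would use a key generator service."""
    return hashlib.sha256(b"demo-" + level.encode()).digest()


def _keystream(key: bytes, length: int) -> bytes:
    """Toy SHA-256 counter keystream. Illustrative only, NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_wrap(key: bytes, payload: bytes) -> bytes:
    """Toy transform: applying it twice with the same key restores the input."""
    return bytes(a ^ b for a, b in zip(payload, _keystream(key, len(payload))))


@dataclass
class WrappedData:
    payload: bytes
    classification: str = "public"
    wrappers: list = field(default_factory=list)  # innermost wrapper first

    def reclassify(self, new_level: str, key_for_level) -> None:
        """Add one outer wrapper if the new classification is more sensitive."""
        if LEVELS[new_level] > LEVELS[self.classification]:
            self.payload = xor_wrap(key_for_level(new_level), self.payload)
            self.wrappers.append(new_level)
        self.classification = new_level

    def unwrap_all(self, key_for_level) -> bytes:
        """Peel the wrappers from the outermost inward and return the data."""
        data = self.payload
        for level in reversed(self.wrappers):
            data = xor_wrap(key_for_level(level), data)
        return data
```

In this sketch, reclassifying a record from "public" to "internal" and then to "secret" leaves two wrappers in place, and a holder of both keys can recover the original bytes with `unwrap_all`. Removal of wrappers on a change to a less sensitive state, as contemplated elsewhere in the claims, is deliberately left out to keep the example short.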
HK13102313.1A 2009-12-15 2010-11-18 Verifiable trust for data through wrapper composition HK1175861B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US28665409P 2009-12-15 2009-12-15
US61/286,654 2009-12-15
US12/832,400 2010-07-08
US12/832,400 US9537650B2 (en) 2009-12-15 2010-07-08 Verifiable trust for data through wrapper composition
PCT/US2010/057258 WO2011081738A2 (en) 2009-12-15 2010-11-18 Verifiable trust for data through wrapper composition

Publications (2)

Publication Number Publication Date
HK1175861A1 (en) 2013-07-12
HK1175861B (en) 2016-12-02
