WO2008118168A1 - Method and system for backup storage of electronic data
- Publication number
- WO2008118168A1 (PCT/US2007/064810)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- data file
- stored
- file
- storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1464—Management of the backup or restore process for networked environments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0823—Network architectures or network communication protocols for network security for authentication of entities using certificates
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/875—Monitoring of systems including the internet
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/18—Error detection or correction; Testing, e.g. of drop-outs
- G11B20/1833—Error detection or correction; Testing, e.g. of drop-outs by adding special lists or symbols to the coded information
Definitions
- the present invention relates generally to storage of electronic data, and more particularly to guaranteeing backup storage of a data file and/or backup storing the data file such that it may be recreated even where part of the data file is lost.
- insurance policies that provide data "loss protection" typically cover only the cost of data recovery and business interruption. That is, when data is lost, an insurance policy may cover the time and expense of retrieving the data from electronic backup facilities, creating electronic data from paper archives, restoring damaged electronic storage media, and the cost of lost computing resources during recovery. However, insurance policies typically will not cover the value of lost data that cannot be recovered. Those policies that apparently cover the value of unrecoverable data have such high premiums and/or restrictive claim payments that these policies do not offer meaningful insurance for unrecoverable data.
- a data owner may lose on site data and request retrieval of such data from the off site storage. Where the off site facility is unable to provide the requested data, the data owner has no recourse but to simply accept the data as lost. While the data owner may switch to a different data backup provider, this does not compensate the data owner for costs associated with the lost data.
- one object of the present invention is to address the above and/or other problems relating to backup storage of electronic data.
- Another object of the present invention is to guarantee backup storage of electronic data.
- Still another object of the present invention is to provide a method and system for insuring against data loss.
- Yet another object of the present invention is to increase reliability of backup data storage facilities.
- a further object of the present invention is to provide a method and system for backup storing a data file such that the data file can be reconstructed in substantially (or bit for bit) identical form despite loss of part of the data file.
- One embodiment of the invention includes a method for guaranteeing retrieval of stored data.
- the method includes receiving, from a sender, a data file to be stored at a data custodian and creating a storage certificate associated with the data file, the storage certificate including a partial representation of the data file and an electronic signature of the data custodian.
- the data file is stored in a storage unit at the data custodian, and the storage certificate is returned to the sender to verify storage of the data file at the data custodian.
- Another embodiment includes a method for backup storing data files.
- the method of this embodiment includes receiving, from a sender, a data file to be stored at a data custodian, and creating N data chunks from the data file such that the data file can be recreated using fewer than N of the data chunks.
- the N data chunks are stored in a plurality of storage locations.
- Figure 1 is a system for guaranteeing backup storage of a data file in accordance with an embodiment of the present invention
- Figure 2 is a flow chart representative of a method for guaranteeing backup storage of a data file in accordance with an embodiment of the present invention
- Figure 3 is a system for backup storing a data file such that the data file can be reconstructed in bit for bit identical form despite loss of part of the data file in accordance with an embodiment of the present invention
- Figure 4 is a flow chart representative of a method for backup storing a data file such that the data file can be reconstructed in bit for bit identical form despite loss of part of the data file in accordance with an embodiment of the present invention
- Figure 5 is a detailed flow chart representative of a method for receiving a data file from a client, guaranteeing storage of the data, and backup storing a data file such that the data file can be reconstructed in bit for bit identical form despite loss of part of the data file in accordance with an embodiment of the present invention
- Figure 6 is a detailed flow chart representative of a method for retrieving a data file guaranteed and stored according to the process shown in Figure 5, in accordance with an embodiment of the invention
- Figure 7 is a detailed flow chart representative of the packer step of Figure 5 in accordance with an embodiment of the present invention.
- Figure 8 is a detailed flow chart representative of the receiver step of Figure 6 in accordance with an embodiment of the present invention
- Figure 9 illustrates a computer that may be used to implement the present invention.
DETAILED DESCRIPTION OF THE INVENTION
- FIG. 1 shows a system for guaranteeing backup storage of a data file in accordance with an embodiment of the present invention.
- the system of Figure 1 includes workstations 101, laptops 103, servers 105, local area network (LAN) server 107, firewall server 109, Internet 111, firewall server 113, offsite data vault 115 and mirrored site (or additional data vault) 117.
- in addition to what is shown in FIG. 1, there could be additional backup storage facilities where additional copies of the data may be stored, or alternately more sophisticated parceling of data, as illustrated in Figure 3, may be employed.
- workstations 101, laptops 103, servers 105, and local area network (LAN) server 107 make up the resources of a business (or household) that uses a backup data storage system in accordance with an embodiment of the present invention.
- Firewall server 113, offsite data vault 115 and mirrored site 117 make up the backup storage facility in the embodiment of Figure 1.
- Workstations 101, and laptops 103 are well known and are operated by users that generate and store data files that are essential to successful operation and survival of a business, for example.
- Servers 105 may be dedicated computer systems for providing various functions used during operation of the business.
- the workstations 101, laptops 103 and servers 105 are preferably programmed with any suitable Web browser software that permits these resources to retrieve Web pages via the Internet 111 from remote computers or servers such as the firewall server 113, offsite data vault 115 and mirrored site 117 in order to provide guaranteed data backup to users of these systems.
- the Web browser software may also be used to transmit information provided by the workstations 101, laptops 103 and servers 105 to remote computers such as the server 113, and storage devices 115 and 117.
- the workstations 101, laptops 103 and servers 105 could be equipped with freestanding software which emulates a browser to transmit archives, or could make use of dedicated file servers for this purpose.
- File servers exist which can create the illusion that storage resources that in fact reside somewhere else on the network appear to reside locally.
- files which customers save from their local systems in the customary way are actually saved on such a file server and are automatically guaranteed in the amount predetermined for all files saved to that volume, folder, or directory of this file server.
- these systems could also use erasure codes, error correcting codes, clustering, and high availability technologies to make them as reliable as desired.
- a plurality of machines each having locally attached disks can present those disks as block devices via iSCSI or the network block device mechanism for example.
- Such a system can support the illusion to a client system that the disk resources on it appear to be directly attached to the client.
- a collection of client systems can then utilize those blocks, preferably so that each client has exclusive access to any given block device and those block devices accessed by a given client can be combined to form a RAID 5 volume.
- the failure of any given server system causes no failure of the client system.
- These client systems may then in turn export their file systems built on these block devices so that they become file servers over the network to clients of their own.
- upon failure of a file server, another file server can obtain access to those blocks, recover that file system, and resume service, so that client access to the file system also fails over.
- Software tools that enable this include Heartbeat, for example.
- for a paragraph, the delimiter pair becomes '<p>' and '</p>'.
- for a table, the delimiter pair consists of '<table>' and '</table>'.
- Such delimiters enclose text which may in turn include other delimited text.
- the decorating or annotating with functions is done by including "attributes" so that we can include that information in the opening delimiter.
- for example, to indicate that our table has a width of 5 and a height of 3, the opening delimiter becomes '<table width="5" height="3">'.
- the closing delimiter '</table>' is unchanged.
- This mechanism allows for extremely rich data structures to be transported.
- the data types that occur in programming languages can be expressed by XML.
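- As a minimal sketch of the delimiter-and-attribute mechanism described above (the `row`/`cell` element names and the use of Python's standard library are illustrative assumptions, not taken from the patent), the 5-by-3 table can be written as XML and read back as follows:

```python
# Illustrative only: a table carrying width/height attributes in its opening
# delimiter, parsed with Python's standard xml.etree.ElementTree module.
import xml.etree.ElementTree as ET

document = """
<table width="5" height="3">
  <row><cell>a</cell><cell>b</cell><cell>c</cell><cell>d</cell><cell>e</cell></row>
  <row><cell>f</cell><cell>g</cell><cell>h</cell><cell>i</cell><cell>j</cell></row>
  <row><cell>k</cell><cell>l</cell><cell>m</cell><cell>n</cell><cell>o</cell></row>
</table>
"""

table = ET.fromstring(document)
print(table.get("width"), table.get("height"))      # attributes carried in the opening delimiter
print([cell.text for cell in table.iter("cell")])   # nested, delimited content
```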
- a web services description language file contains the information necessary to make the translation.
- software tools exist that can take an existing function or method and generate this WSDL file.
- Software tools also exist which can take this WSDL file and create a proxy function or method which supports the illusion that functionality that is in fact remote, is just an ordinary function or method running locally.
- the implementation language for the actual service need not be the same as the implementation language for the proxy object.
- These XML enabled software interfaces are called web services. In this way a programmer using their preferred programming language can make use of program code implemented in an entirely different language without knowing anything about that other language.
- workstations 101, laptops 103 and servers 105 are connected in a LAN by way of LAN server 107, which coordinates communication among business resources and provides other network functionality such as compression and/or encryption of data.
- the business resources are also connected to Internet 111 by way of firewall computer 109 to enable communication with external resources such as the data storage facility.
- firewall server 109 filters data entering and exiting the LAN to protect against unauthorized access to the business resources.
- Workstations 101, laptops 103, servers 105, local area network (LAN) server 107, and firewall server 109 communicate with each other and with external resources using any suitable protocol for communicating directly or via the Internet 111. More specifically, these devices are configured to interface with the firewall server 113, offsite data vault 115 and/or mirrored site 117 to store backup data by user command or automatically, to obtain storage certificates, to retrieve backup data and to make claims to the storage facility in accordance with embodiments of the present invention.
- Workstations 101, laptops 103, servers 105, local area network (LAN) server 107, and firewall server 109 may be implemented as the general purpose computer system 9001 of Figure 9, for example.
- Offsite storage vault 115 and mirrored site 117 are also connected to the Internet 111, and thus the business resources, via firewall 113.
- Firewall server 113 filters data entering and exiting the backup data storage system to protect against unauthorized access to backup data storage resources.
- the Internet 111 includes various networks and gateways for linking together various computer networks and computers such as those shown in Figure 1.
- Offsite data vault 115 and mirrored site 117 are storage devices that store electronic information including data files from business resources such as the workstations 101, laptops 103 and servers 105, for example.
- Firewall server 113, offsite data vault 115 and/or mirrored site 117 may be configured as a Web server programmed to receive, store, and/or transmit various types of information, including, requested backup data files from the business resources. These devices can also be configured to provide storage certificates and satisfy claims in accordance with embodiments of the present invention. [0039] It is to be understood that the system in Figure 1 is for exemplary purposes only, as many variations of the specific hardware and software used to implement the present invention will be readily apparent to one having ordinary skill in the art. For example, the functionality of the firewall server 109 and LAN server 107 may be combined in a single device.
- a single computer (e.g., the computer system 9001 of Figure 9), or two or more programmed computers, may be substituted for any one of the devices shown in Figure 1.
- Principles and advantages of distributed processing such as redundancy and replication, may also be implemented as desired to increase the robustness and performance of the system, for example.
- other means of tendering information to the custodian are possible such as satellite links, wireless communication, electromagnetic communication including microwave, maser, laser, spread spectrum etc.
- Figure 2 is a flow chart representative of a method for guaranteeing backup storage of a data file in accordance with an embodiment of the present invention.
- the method of Figure 2 may be performed on the system of Figure 1, for example.
- the method begins with receiving a data file at a data custodian in step 201.
- the data file may include digital information or any information that may be represented without loss in a digital form, that is to be stored in a data backup storage of the data custodian.
- the data custodian may be a storage device for providing backup storage services.
- the data file may be accounting information generated by a server 105 and sent to the data vault 115 via the Internet 111, for example.
- the data file (being stored) received in step 201 is an encrypted data file.
- asymmetric cryptographic systems exist which use a pair of keys, called the public key and the private key, with the property that what is encrypted using the private key may be decrypted using the public key, permitting the public key holder to verify that the sender of an encrypted message was in possession of the private key.
- Knowledge of the public key provides no useful information for obtaining the matching private key.
- Symmetric cryptographic systems also exist which use a single key.
- Dutch auctions exist, under conditions where there is the ability to fulfill multiple bids, for example with multiple identical items; they are conducted by collecting bids and choosing one bid as the accepted bid value.
- the bidder who bid that accepted bid value does not receive an item, but rather all bidders whose bids were higher than the accepted bid price each receive an item or items at the accepted bid value.
- This mechanism has the useful property that it is provably fair, in the sense that everyone pays the exact same price, everyone who gets an item gets it for less than they committed themselves to pay, everybody who does get an item committed themselves for more than everyone who does not, and no collusion on the part of the winning bidders could possibly lower the price that they pay.
- an embodiment of the present invention may conduct a real-time Dutch auction to prioritize the servicing of file restoration requests, ensuring that those customers whose needs are most pressing, as measured by the value they place on priority recovery, are given that priority.
- the same Dutch auction mechanism may be used for setting other pricing, such as the fees charged for data storage and the fees paid for levels of guarantee.
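- A minimal sketch of such a uniform-price (Dutch) auction, assuming a fixed number of priority-restoration slots (the function and variable names are illustrative, not the patent's):

```python
def run_dutch_auction(bids, slots):
    """bids: dict of bidder -> offered price; slots: number of items available.

    Returns (accepted_bid_value, winners): the highest rejected bid sets the
    price, and every bidder who bid strictly more wins and pays that same price.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= slots:                  # everyone can be served
        return 0, [bidder for bidder, _ in ranked]
    accepted = ranked[slots][1]               # highest losing bid
    winners = [bidder for bidder, price in ranked[:slots] if price > accepted]
    return accepted, winners

price, winners = run_dutch_auction({"A": 90, "B": 75, "C": 60, "D": 40}, slots=2)
print(price, winners)                         # 60 ['A', 'B'] -- both pay 60
```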
- the data file (being stored) received in step 201 is accompanied by a digitally signed bid specifying the value of the file and the offered fee.
- the data file being stored can be received at the custodian via the Internet or other digital communications means.
- the data file may be received as an attachment in an e-mail, or the data file being stored can be uploaded to the data custodian.
- the data file may be manually sent to the data custodian by a user, or automatically sent to the data custodian according to a predetermined backup storage plan or policy or as triggered by saving the file to a file server volume or directory having a predetermined value associated with it.
- the customer may request that a one-time or recurring schedule be set up, so that the data custodian polls the client software running on the customer premises, which responds by sending an archive whose value is predetermined, or by making the archive available for retrieval from some pre-arranged place.
- data files can be transferred at a convenient time when Internet traffic is relatively low.
- alternatively, client software running on the customer premises can generate data chunks (as will be discussed below) and submit the chunks directly to the data storage facilities, rather than uploading the file.
- the data custodian computes each chunk, but rather than sending it to the data storage facility, computes the hash of the chunk, and compares it with the hash of the chunk computed by the data storage facility. If all (or sufficiently many) of these hashes agree, the file has been properly stored and the certificate is issued. [0049] Once the data file being stored is received at the data custodian, the data custodian creates a storage certificate in step 203.
- the storage certificate includes a partial representation of the data file received by the data custodian, the declared value of the data file, as well as an electronic signature of the data custodian.
- the partial representation is a unique representation of the data file being stored that is created from the data file being stored and can be used to authenticate the data file being stored if presented in the future.
- the partial representation of the data file is typically a small fraction of the size of the original data file being stored and cannot be used to recreate the data file being stored.
- the certificate may contain other information as well such as the size of the file, a time stamp recording when the file coverage took effect, an expiration for coverage, a confirmation of the fees being charged, labels for the tracking convenience of the customer, etc.
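- The fields enumerated above suggest a certificate record along the following lines (a sketch only; the field names are assumptions, not taken from the patent):

```python
from dataclasses import dataclass

@dataclass
class StorageCertificate:
    file_hash: str              # partial representation (hash) of the stored file
    declared_value: float       # value for which the file is guaranteed
    file_size: int              # size of the file
    issued_at: str              # time stamp recording when coverage took effect
    expires_at: str             # expiration of coverage
    fee: float                  # confirmation of the fees being charged
    label: str                  # customer-supplied label for tracking convenience
    custodian_signature: bytes  # electronic signature of the data custodian
```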
- the partial representation is a hash value of the data file being stored computed according to a predetermined algorithm.
- hash functions exist, which take as input a file and produce as output a small piece of data, called its hash value, in such a way that it is not feasible to analyze that small piece of data so as to recover the original file.
- These functions have the additional property that any change in the file, however small, results in a completely different hash value.
- the hash value cannot be used to create the original data file that created the hash value
- the hash value can be used to authenticate a file presented as the original data file.
- MD5 and SHA1 are widely used hash functions. Source code implementing these functions is available as open source in any standard Linux distribution.
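- A minimal sketch of computing such a partial representation with Python's standard library (SHA-256 is shown; MD5 or SHA-1, as mentioned above, would be used the same way):

```python
import hashlib

def file_hash(path, algorithm="sha256", block_size=1 << 20):
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)        # stream the file so large archives fit in memory
    return h.hexdigest()           # small digest; cannot be used to recreate the file

# print(file_hash("backup_archive.bin"))   # "backup_archive.bin" is a hypothetical file name
```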
- the electronic signature created in step 203 is associated with the data custodian and functions as a written signature to identify the custodian as the sender, and to possibly legally bind the data custodian as having received the data file being stored.
- the electronic signature is a cryptographic digital signature.
- Digital signatures exist, whereby a document may be reliably attributed to the owner of a public key in possession of the private key, by encrypting the hash value of the document using the private key, for example.
- the general public in possession of the public key may decrypt the encrypted hash value using that public key. When that original hash value is recovered it proves that the encryption truly was done by the private key and therefore could only have been done by the unique possessor of the private key.
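- A sketch of that sign-and-verify flow using the third-party Python `cryptography` package (RSA with PKCS#1 v1.5 padding and SHA-256 is an assumption for illustration; the patent only requires some public/private key signature scheme):

```python
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

certificate_body = b"file_hash=...; declared_value=...; issued_at=..."   # hypothetical contents

# The custodian signs (a hash of) the certificate with its private key ...
signature = private_key.sign(certificate_body, padding.PKCS1v15(), hashes.SHA256())

# ... and anyone holding the public key can verify the signature.
public_key.verify(signature, certificate_body, padding.PKCS1v15(), hashes.SHA256())  # raises if invalid
```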
- in step 205, the data file being stored is stored in a storage unit of the data custodian.
- the data file being stored can be stored in the data vault 115 or the mirrored site 117 or both.
- storage can be spread among a number of sites as will be discussed more fully in connection with Figures 3, 4, 5, 6, 7, and 8.
- the storage certificate is returned to the sender of the data file.
- the storage certificate can be returned to the sender via e-mail or by downloading the storage certificate to the sender.
- the certificate may be placed on a website for inspection by the sender or communicated in any other mode suitable for transmitting digital information.
- the data custodian provides a storage certificate to the sender of the data file as a receipt that the data file being stored has been received and stored at the data custodian.
- This storage certificate can be used to obtain the stored data file in the future.
- the data custodian can receive a request for a requested data file from a requestor.
- the request can include a data storage certificate and other optional information such as a file label.
- the storage certificates may be maintained by the data custodian, in which case the request might simply be some piece of information enabling the data custodian to retrieve that certificate.
- the data custodian processes the request in order to determine whether the requested data file can be provided by the data custodian.
- the data custodian first obtains an electronic signature from the storage certificate and determines whether the electronic signature of the storage certificate corresponds to a signature of the custodian. Specifically, the data custodian determines whether the electronic signature of the storage certificate matches its electronic signature previously issued in a data storage certificate. This verification can be performed by anyone in an algorithmic fashion by transforming the certificate using the public key of the data custodian as previously discussed.
- the data custodian retrieves a file from storage having the requested file label and computes a hash value for the retrieved file based on the predetermined algorithm used by the data custodian to issue storage certificates. The data custodian then compares the hash value of the retrieved file with the hash value contained in the storage certificate to determine if a match exists. Where the hash values match, the data custodian returns the retrieved file to the requestor as the requested data file, via any communications means suitable for digital communication.
- a check is performed through the accounting system to be sure that payments are current before honoring a request. Or alternately, some different charge may apply for honoring the request under those circumstances.
- otherwise, the data custodian takes corrective action. Specifically, where the storage certificate includes the data custodian's electronic signature, but the data custodian cannot retrieve a stored file that can generate the hash value included in the storage certificate, this indicates that the data custodian has lost the data file now requested. Thus, the data custodian can pay the requestor a predetermined dollar amount for the lost data.
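- The retrieval-or-pay logic above can be sketched as follows (all names and the dictionary-based certificate are invented for illustration; a real system would use the custodian's own storage, hashing and accounting services):

```python
def handle_retrieval_request(certificate, storage, verify_signature, compute_hash, pay_claim):
    """Illustrative flow; the collaborators are supplied by the caller."""
    if not verify_signature(certificate):
        return ("rejected", "certificate was not signed by this custodian")
    stored_file = storage.get(certificate["label"])          # None if the file cannot be located
    if stored_file is not None and compute_hash(stored_file) == certificate["file_hash"]:
        return ("ok", stored_file)                            # hash matches the certificate: return the file
    # Certificate is genuine, but the guaranteed file cannot be reproduced: pay the claim.
    return ("claim_paid", pay_claim(certificate["declared_value"]))

# Toy usage with in-memory stand-ins:
cert = {"label": "ledger-2007", "file_hash": "abc123", "declared_value": 10_000}
print(handle_retrieval_request(cert, {"ledger-2007": b"..."},
                               verify_signature=lambda c: True,
                               compute_hash=lambda data: "abc123",
                               pay_claim=lambda value: value))   # ('ok', b'...')
```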
- the data owner submits their data, probably encrypted, for storage.
- the data custodian generates a hash value for the data, and creates an electronic storage certificate, which is then digitally signed. If the data owner claims a loss, the data owner can exhibit the certificate to all the world without in any way compromising the confidentiality of their data.
- the data custodian may prove to all the world that the original data has been restored, by exhibiting the (encrypted) data, which all the world can verify has the hash value referenced in the storage certificate. Cancellation due to failure to pay, claim payment, or failure of claim payment can all be proved in the usual manners for financial transactions.
- a request to cancel coverage can be acknowledged with another digitally signed instrument.
- the information submitted is not encrypted but is a common file which may have been downloaded from the internet or purchased from a vendor. In this case no additional copy of the information need be stored.
- the certificate can be granted immediately upon verification that the hash value of the transmitted file is the hash value of a file already stored.
- Figure 3 is a system for backup storing a data file such that the data file can be reconstructed in substantially (or even bit for bit) identical form despite loss of part of the data file in accordance with an embodiment of the present invention.
- the system includes a data file being stored 301 and a plurality of storage units 303-317. As seen in Figure 3, the data file being stored 301 is separated into "chunks," and each chunk is sent to a respective data storage unit 303-317.
- where, for example, the chunks stored in storage units 305 and 311 are lost, the storage file 301 can be reconstructed as file 350 from the remaining chunks stored in storage units 303, 307, 309, 313, 315 and 317.
- the reconstructed file 350 is bit for bit identical to the original storage file 301.
- files can be stored with greater reliability. A method for accomplishing this reconstruction is discussed in connection with Figure 4.
- Figure 4 is a flow chart representative of a method for backup storing a data file such that the data file can be reconstructed in bit for bit identical (for example) form despite loss of part of the data file in accordance with an embodiment of the present invention.
- the method begins with receiving a data file at a data custodian in step 401.
- the data file may include digital information that is to be stored in a data backup storage of the data custodian, and the data custodian may be a storage device for providing backup storage services.
- the data file being stored received in step 401 may be an encrypted data file such as that discussed in relation to asymmetric cryptographic systems above, or alternately a file may be received from which the original file may be algorithmically derived, as for example a compressed version of the file.
- the storage file is an unencrypted data file.
- the data file being stored can be received at the custodian via the Internet, as an attachment in an e-mail, or can be uploaded to the data custodian or transmitted in any other manner appropriate for digital communication.
- step 401 may be performed by a data custodian software application on the sender's computer system. That is, while step 401 includes receiving a data file at the data custodian, such data file may be processed by client software at the user and divided into data chunks (as discussed below) before ever being remotely transmitted from the user.
- the client software could be running on systems owned by the user, or on systems owned by the user but administered by the data custodian, or on systems owned by the data custodian but located on the user's premises.
- Step 403 may be performed by use of error correcting codes or erasure codes.
- Error correcting codes exist, which enable messages that have been altered in transit to be reconstructed.
- a simple example is a repetition code, which simply sends the same message multiple times.
- An error is detected by comparing the multiple copies, and errors may be corrected by a comparison process to reach a consensus. This is the technique people use when trying to talk in a noisy environment; they repeat themselves as necessary to be understood over the noise. While the message is eventually communicated, it may take a very long time.
- step 403 of Figure 4 can be performed by use of error correcting codes such as Berger codes, check digits, Chipkill (an application of ECC techniques to volatile system memory), constant-weight codes, convolutional codes, differential space-time codes (related to space-time block codes), dual modular redundancy (a subset of N-modular redundancy, related to triple modular redundancy), erasure codes (a superset of fountain codes), forward error correction, group codes, Golay codes, Goppa codes, Hadamard codes, Hagelbarger codes, Hamming codes, lexicographic codes, longitudinal redundancy checks, low-density parity-check codes, LT codes (which are near-optimal rateless erasure correcting codes), m-of-n codes, online codes (which are an example of rateless erasure codes), parity bits, Raptor codes (which are high-speed, near-real-time fountain codes), Reed-Solomon error correction, Reed-Muller codes, repeat-accumulate codes, sparse graph codes, and other error correcting codes.
- step 403 includes use of versions of the sparse matrix method of Gallager which comprises a family of codes.
- a particular code is constructed by choosing a number M of message pieces, each of identical size, and constructing a number C of check pieces, each of the same size as each of the message pieces.
- Each check piece is generated from some collection of message pieces where the number of message pieces may vary from check piece to check piece.
- the manner in which the check piece is generated is by forming an exclusive "or" of the message pieces. Decoding proceeds via "belief propagation", described below for the case of erasure codes.
- step 403 utilizes the fact that a polynomial having degree N has at most N roots. It is an easily proved standard fact from algebra that this implies that if you know the value of a polynomial of degree less than N at any N distinct points, you know the N coefficients of the polynomial.
- Algorithms for generating the values and recovering the coefficients include Neville's Algorithm and Newton's Interpolation Formula, which may be found in Stoer and Bulirsch, Introduction to Numerical Analysis.
- a more efficient, but more complicated, approach is the Berlekamp-Massey Algorithm, which is a standard topic in books on algebraic codes.
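- The polynomial idea can be illustrated with a small sketch (Lagrange interpolation over the rationals is used here for brevity; Neville's Algorithm or Newton's formula mentioned above would serve equally, and practical codes such as Reed-Solomon work over finite fields rather than the rationals):

```python
from fractions import Fraction

def interpolate(points, x):
    """Evaluate at x the unique lowest-degree polynomial passing through `points`."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

# Message = 3 coefficients [5, 2, 7], i.e. p(x) = 5 + 2x + 7x^2.
p = lambda x: 5 + 2 * x + 7 * x * x
shares = [(x, p(x)) for x in range(1, 6)]       # 5 evaluation points; any 3 suffice
surviving = [shares[0], shares[2], shares[4]]   # pretend two shares were lost
print(interpolate(surviving, 0))                # 5 -> recovers p(0), the constant coefficient
```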
- Step 403 may also be performed by use of erasure codes, which are closely related to error correcting codes and are also able to reconstruct a message where portions of the message have been lost in transmission.
- Preferred erasure codes are the Gallager family, whose encoding process is described above. Decoding proceeds by finding first those check pieces constructed from just one message piece, then using that knowledge to extricate those message pieces from the remaining check pieces. That is, a check piece constructed from just one message piece coincides with that message piece. The extrication is readily done because applying an exclusive "or" a second time undoes the effect of applying an exclusive "or" the first time. Thus applying the exclusive "or" operation to a known message piece and any check piece that includes it yields a new check piece which excludes that message piece.
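- A runnable sketch of this exclusive-or encoding and "peeling" decode (the particular check-piece structure below is invented for illustration and is far smaller than anything a real Gallager-style code would use):

```python
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

message = [b"AAAA", b"BBBB", b"CCCC"]                 # M = 3 message pieces
structure = [(0,), (0, 1), (1, 2), (0, 2)]            # which message pieces feed each check piece

def encode(message, structure):
    checks = []
    for indices in structure:
        piece = message[indices[0]]
        for i in indices[1:]:
            piece = xor(piece, message[i])            # exclusive-or of the covered pieces
        checks.append(piece)
    return checks

def decode(checks, structure, num_pieces):
    known = {}
    pending = [(set(idx), chk) for idx, chk in zip(structure, checks) if chk is not None]
    while len(known) < num_pieces:
        progress = False
        for indices, chk in pending:
            unknown = indices - known.keys()
            if len(unknown) == 1:                     # peel: exactly one covered piece is unknown
                piece = chk
                for i in indices - unknown:
                    piece = xor(piece, known[i])      # XOR again to strip the already-known pieces
                known[unknown.pop()] = piece
                progress = True
        if not progress:
            raise ValueError("not enough check pieces survived to recover the message")
    return [known[i] for i in range(num_pieces)]

checks = encode(message, structure)
checks[1] = None                                      # pretend one check piece was lost
print(decode(checks, structure, len(message)))        # [b'AAAA', b'BBBB', b'CCCC']
```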
- once the data chunks are created, they are sent to a plurality of custodian locations in step 405.
- the data chunks are preferably each sent to different storage locations to increase reliability of storage, but at least two different storage locations are all that is required according to the present invention. As noted above in connection with Figure 3, where one or more of the storage locations loses a data chunk for some reason, the original data file can be recreated from the remaining data chunks that have not been lost.
- Figure 5 is a detailed flow chart representative of a method for receiving a data file from a client, guaranteeing storage of the data, and backup storing a data file such that the data file can be reconstructed in bit for bit identical form despite loss of part of the data file in accordance with an embodiment of the present invention.
- Figure 5 shows stage 1 of receiving data from a Client (sender).
- the process begins with uploading a data file in step 501.
- the data file includes a value of the data file and may also include a name, a label or other information for the convenience of the customer.
- the value may be a specific monetary value, or a rating of the importance or other valuation of the data file.
- in step 503, the data file is saved at the data custodian, and the data custodian processes the file to record the value, path and any label or other information associated with the file.
- step 503 is performed at a packer manager system of the data custodian, which may be a general purpose computer such as that described in Figure 9.
- the value used in step 505 may be the value received in step 501, or a value that is calculated based on a rating of the importance of the data as received in step 501, or a pre-assigned value where the file has been stored to a location bearing an agreed-upon per-file valuation.
- in step 507, the data custodian sends a message to the client thanking the client for uploading the file and informing the client to expect an e-mail with label information.
- the client could receive synchronous notification and/or synchronous receipt of the storage certificate.
- Stage 2 begins with step 511 where the value, path and label associated with the data file are received.
- in step 513, a code for encoding the data file is chosen based on the value and size of the data file. As noted with respect to Fig. 4, the code may be an error correcting code or an erasure code, the reliability and efficiency of which can be chosen based on the value of the data file.
- Data chunks are then created in step 515 using the code selected in step 513.
- Step 511 also branches to step 517 where a hash value is computed for the original data file.
- in step 519, a subset (the principal chunks) of the chunks created in step 515 is combined to recreate the data file, and a hash value is recomputed for the recreated data file. This recomputed hash value is then compared with the hash value computed in step 517 to check the accuracy of the recreation process using chunks.
- the recreated file will normally match bit for bit the original file. The only way this can fail to happen is if some transient hardware failure has corrupted the computation. That transient error could have occurred in the course of creating the chunks in step 515 or in the computation of the hash in step 517.
- if the files and hashes both differ in step 521, then the correct execution of step 515 is to be doubted. Thus, the chunks are erased in step 523 and the data chunks are recreated in step 515. Where the files are the same but the hashes differ, there must have been a computational error in computing the hash either at step 517 or at step 519. In this case the process returns to step 517 to recompute those hashes. It is virtually impossible for the files to differ and yet the hashes to be the same as a result of a transient computational error.
- where step 519 determines that the recomputed hash value is the same as the originally computed hash value, which should nearly always be the case, the process continues to step 525 where a check syndrome is performed.
- the check syndrome is a piece of information computed from the encoded information which reveals the presence of errors. In the case of linear codes, in the absence of error the syndrome will normally be zero. For example, a parity or check sum code adds an extra bit of information to a message in such a way that the total number of '1' bits is even. The syndrome for this code is then computed by counting the number of '1's.
- if that number is even, the syndrome is '0'; if that number is odd, the syndrome is '1', which reveals that there has been an error in transmission because the total number of '1's was arranged to be even, not odd. Where the result of the check syndrome is bad, an error must have occurred in the execution of step 515. Thus, the process returns to step 523 where the chunks are erased and new chunks are created in step 515. Where the result of the check syndrome is good, the process flows to step 527 where packer messages are created and each given a data chunk for delivery to a distinct data center. In step 529, the packer messages are sent to distinct data centers P1-Pn, as will be further discussed in Fig. 7 below. This may be done in order that the processes of sending the chunks to the various data centers may be performed concurrently.
- alternately, a single process could sequentially send the various chunks to the various data centers, or a single process employing threads could use multiple threads to achieve concurrency. Not all choices of erasure or error correcting codes possess syndromes, in which case there would be nothing to be done at step 525.
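- A tiny illustration of the parity-bit syndrome described above (illustrative code, not the patent's):

```python
def add_parity(bits):
    return bits + [sum(bits) % 2]          # appended bit makes the count of 1s even

def syndrome(bits):
    return sum(bits) % 2                   # 0 = no error detected, 1 = error detected

codeword = add_parity([1, 0, 1, 1])        # -> [1, 0, 1, 1, 1]
print(syndrome(codeword))                  # 0: consistent
codeword[2] ^= 1                           # flip one bit "in storage"
print(syndrome(codeword))                  # 1: corruption detected
```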
- a collation process is performed on the success or failure of the packer functions in storing the data chunks to the data centers based on messages received back from the data centers.
- the data centers send success or failure messages to their respective packer which in turn reports back to the packer manager which collates the success or failure of the message storage process in step 531.
- This collation process may conclude success based on the individual success of all of the packers, or alternately on a sufficient number or a sufficient subset of successful packer operations. Under normal conditions all of those operations will succeed. If, however, a particular remote data center is down, or under conditions of network breakdown or congestion or transient hardware failure in a packer process, failure may result. As seen in step 533, where the collation process determines a failure, the customer is notified of such failure in step 535. This may prompt the customer to resend the data file for storage. In an alternate embodiment we might return to step 527 to retry some subset of the failed packer operations.
- in step 539, a certificate that guarantees the data for the declared, computed, or implied value is issued and sent to the customer as part of a notification in step 541.
- the certificate can include a hash value for the data file and a digital signature associated with the data custodian.
- the certificate might also include other information of value to the customer or the custodian, such as the size of the file, the time of issue, the time of expiration, etc.
- the customer notification could be a synchronous response coinciding with step 507.
- Fig. 6 is a detailed flow chart representative of a method for retrieving a data file guaranteed and stored according to the process shown in Fig. 5.
- the claim process begins in step 1 when the customer submits a claim. More specifically, in step 601, the customer submits a certificate with a hash value and signature to the data custodian. The claim could also include a label and other information discussed previously.
- the signature is verified to determine whether the signature is good or bad, as shown in step 605. Where the signature is determined to be bad, the process flows to step 611 where a message is sent to the customer requesting a valid certificate.
- where the signature is good, a message is sent in step 607 to the claims department of the data center, and a confirmation is sent to the customer in step 609 to inform the customer that the data storage certificate was good and to expect an e-mail relating to the processing of the claim.
- the message could be sent synchronously in step 625 rather than asynchronously via e-mail.
- the message to the claims department would normally be an automated message which could be a client-server interaction, a function call, a remote method invocation, a web service or any other form of digital communication.
- in step 613, the message generated in step 607 is passed to the retriever activator of step 617.
- in step 619, a retriever function is performed to retrieve the data chunk from a respective data center. Details of the retriever process flow 619 will be discussed with respect to Fig. 8 below.
- in step 621, at least a subset of the retrieved data chunks is recombined and decoded to recover the original data file, as discussed above. In one embodiment only those chunks required to reconstruct the file are retrieved. In another embodiment an attempt is made to retrieve all chunks and then check to see if there has been success at retrieving sufficiently many chunks to recover the file. Under normal conditions, all of the original chunks should be available.
- a data center may be congested, overloaded, or suffering from some transient or permanent hardware failure.
- a given chunk may be received correctly from a given data center; a chunk may instead be received but corrupted, which will be apparent if a hash of the chunk is included as a part of the file name of the chunk or is stored in some other way. If the computed hash of the chunk fails to match its original hash, then that chunk is corrupted. Yet another possibility is that the file was not found, indicating some other process failure. Another possibility is that no response was received from the data center because of network congestion or because the server is down. This would be a soft failure, referenced in step 631.
- in step 623, a hash value of the retrieved data file is verified by comparing this hash value with the hash value received from the customer in step 601.
- where the verification succeeds, in step 625 a link is e-mailed to the customer informing the customer where they can download the original data file.
- the file could be synchronously returned to the customer or communicated in any way suitable for digital communication.
- where the verification fails, an attempt to recompute the hash value is made in step 627, to guard against the unlikely event that some transient processing error is responsible.
- where the recomputation verifies the hash, the process continues to step 625 where an e-mail is sent to the customer as before.
- where step 627 fails, a message is generated to initiate dollar payment on the customer's claim in step 629. That is, where the hash value cannot be verified for the retrieved data file, the data custodian has lost the data stored for the customer.
- a further attempt at retrieving and decoding could be initiated to guard against the possibility that some transient error corrupted the process in steps 617, 619, and 621.
- where, in step 621, the data custodian is unable to decode the retrieved data chunks, the process continues to step 631 where it is determined whether a soft failure has occurred.
- a soft failure results, for example, where one or more data chunks cannot be received, but the message can still be decoded using the remaining data chunks, or by using chunks from systems that were temporarily out of service but which still retain the missing information.
- where a soft failure has occurred, the process returns to the retriever function 617 and decoder function 621.
- where there is no soft failure, this indicates that the data file has been lost by the data custodian, and the process flows to step 629 where a procedure is initiated for dollar payment on the claim, or payment in some other currency or store of value.
- FIG. 7 is a detailed flow chart representative of the packer step of Fig. 5 in accordance with an embodiment of the invention.
- each packer process may perform its function in parallel with all of the others.
- in step 701, a Hash 1 value is computed for each of the incoming data chunks.
- the Hash 1 value for the respective data chunks is saved at the data custodian for later processing, and the data chunks are sent in step 703 to the data center 705 which this packer is responsible for.
- in step 707, the data chunks are each stored at their respective data centers, and a chunk Hash 2 is computed for the stored data chunk.
- the chunk Hash 2 is received at the data custodian in step 709, and a comparison of the Hash 1 and Hash 2 values for a particular data chunk is performed. Under normal circumstances, these two hashes will agree. If they do not agree, either there has been a transient computational error in computing one or both of those hashes or there has been a corruption in transmission.
- where the Hash 1 value equals the Hash 2 value, the process continues to step 711 where that data chunk is deleted and successful storage of the data chunk is reported to the packer manager.
- the extra precautions associated with recomputing and comparing these hash values might be dispensed with.
- where the Hash 1 value is not equal to the Hash 2 value, an error condition exists. There are three possibilities: there was a transient computational error in the computing of one of the two hash values, there was a transmission error resulting in a corruption of the chunk at the data storage center, or there was an error in the transmission of Hash 2 from the data center to the packer.
- in this case, the process continues to step 713 where a Hash 3 value is recomputed and this Hash 3 value is compared to the Hash 2 value for the data chunk. Hash 3 should be the same as Hash 1; if it is not, then an error occurred either at this step 713 or previously at step 701.
- where the Hash 3 value is equal to the Hash 2 value, it is overwhelmingly probable that the error occurred at step 701 and that the chunk was in fact correctly transmitted, in which case the process proceeds to step 711 where the data chunk, having been safely stored, is deleted and success is reported to the packer manager.
- where the Hash 3 value also does not equal the Hash 2 value, it is highly likely that some transmission error did occur. Therefore, an erase message is sent to the data center in step 715 and the data chunk is re-sent to the packer's data center in step 703. The recomputed Hash 3 replaces the originally computed Hash 1.
- where the data center is unable to save the data chunk in step 717, for example because its disks are full, the process continues to step 719 where the data center reports failure back to the packer manager. In addition, if the packer process times out in its attempt to contact the data center, it may report a soft failure. It should be noted that an embodiment of the invention might place greater trust in the reliability of the computing hardware and the network infrastructure and dispense with some of the safeguard steps.
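- The Hash 1 / Hash 2 / Hash 3 safeguard can be sketched as follows (the "data center" is mocked as an in-memory dict and SHA-256 stands in for whatever hash the system actually uses; all names are illustrative only):

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def store_chunk(chunk: bytes, data_center: dict, key: str) -> str:
    hash1 = sha256(chunk)                  # step 701: packer computes Hash 1
    data_center[key] = chunk               # step 703: chunk sent to the data center
    hash2 = sha256(data_center[key])       # step 707: data center computes Hash 2
    if hash1 == hash2:
        return "success"                   # step 711: delete local copy, report success
    hash3 = sha256(chunk)                  # step 713: recompute to rule out a local transient error
    if hash3 == hash2:
        return "success"                   # the chunk was in fact stored correctly
    del data_center[key]                   # step 715: erase and (in a real system) re-send
    return "retry"

print(store_chunk(b"chunk-0 payload", {}, "file-hash/chunk-0"))   # success
```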
- FIG. 8 is a detailed flow chart representative of a receiver process flow 619 of Fig. 6 in accordance with an embodiment of the invention. This process may be executed concurrently for each of the data storage centers possessing chunks of the desired file. As seen in Fig. 8, the process begins with the packer manager, at step 801, requesting data chunks derived from a file having a particular hash value from the data centers (step 815). Where the data center can find a chunk labeled with that hash value, the data center returns the requested data chunk and the packer manager receives the data chunk in step 803. Each chunk is labeled with the hash value of the file it derives from as well as its own hash value. In step 805 the hash value of the chunk is verified.
- where the verification fails, recovery proceeds by returning to step 801 where a request can again be made to a data center (step 815) for a data chunk associated with a particular hash value.
- where the verification step 805 is successful, the chunk is saved by the packer manager and a report is sent to the decoder indicating successful retrieval of a data chunk.
- where the data center cannot find the requested chunk, the process continues to step 811 where a hard failure is reported to the decoder.
- where the request to the data center times out, the process continues to step 813 where a soft failure is reported to the decoder. Such a timeout most likely indicates a network congestion situation or a temporary server outage.
- By storing a chunk hash with the chunk, for example by incorporating it into the name of the file containing the chunk, the data storage facility is able to continually verify the integrity of its data storage and detect any file corruption that may occur. In addition, this process will uncover as early as possible any indication of incipient failure of a disk, allowing it to be replaced before any data loss occurs.
- the cost to the data custodian of guaranteeing a file is not simply the value of the file, but incorporates two other factors: the probability of losing the file, and the time value of money. Paying out money in the future is less costly than paying out money now. Thus, the cost is obtained by integrating together, for all possible future times T, the product of a time-value-of-money factor exp(-IT) times a probability factor representing the probability at that future time T of receiving a claim and not being able to reproduce the file. That probability of having lost the file depends on the particular encoding chosen. It will be readily recognized that the interest rate I may itself be varying with time or may be a stochastic process.
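- Written out (with illustrative symbols that are assumptions, not the patent's notation: V for the guaranteed value, I for the interest rate, and p(T) dT for the probability of an unrecoverable claim arriving at time T), the cost described above is roughly:

```latex
\text{expected cost} \;=\; V \int_{0}^{\infty} e^{-IT}\, p(T)\, dT
```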
- without risk pooling, any one owner of information could ill afford to incur only the rational economic cost described above, because the consequences of data loss are so catastrophic.
- where owners of information pool their risks, that barrier to economic efficiency is removed. For example, if a homeowner could not purchase home insurance, the homeowner would be compelled to go to ludicrous extremes to prevent his or her house from ever burning down. Because homeowners can purchase fire insurance and thus pool their risk with other homeowners, they take only the most practical of precautions.
- any message can be regarded as a sequence of characters where the characters are chosen from a collection of at least two possible characters. A conventional choice for two such characters is 0 and 1.
- any message may be encoded as a sequence of 0's and 1's. Any given code requires breaking the message up into pieces of a fixed size, which means that the message must have a size that is a multiple of that fixed size. If the message has a size which is not a multiple of that fixed size, it may be necessary to pad it to make it the right size to enable it to be split up into the number of pieces needed by that code.
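- A minimal sketch of such padding (the length-suffix convention used here is one common choice, not the patent's specification):

```python
def pad(message: bytes, piece_size: int) -> bytes:
    overhead = 8                                          # room to record the true length
    total = len(message) + overhead
    padded_len = -(-total // piece_size) * piece_size     # round up to a multiple of piece_size
    filler = b"\x00" * (padded_len - total)
    return message + filler + len(message).to_bytes(8, "big")

def unpad(padded: bytes) -> bytes:
    true_len = int.from_bytes(padded[-8:], "big")
    return padded[:true_len]

padded = pad(b"hello world", piece_size=16)
pieces = [padded[i:i + 16] for i in range(0, len(padded), 16)]
print(len(pieces), unpad(padded))                         # 2 b'hello world'
```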
- Accelerometers exist, which measure the accelerations experienced by them, and hence by the objects to which they are attached.
- Nonvolatile memory devices exist, ranging from bubble memory, on one technical extreme, to litmus paper, which remembers by permanently changing color. By combining nonvolatile memory with accelerometers we may permanently record the greatest shock that some object has experienced over its lifespan.
- An important source of unreliability in data storage is the damage that disks may suffer, after being fabricated, before being installed in a computer. Static discharge during installation is a potential source of serious damage, as is mechanical shock in transit. By attaching the device described in the previous paragraph to a disk drive during the manufacturing process, disks which have experienced unacceptable abuse levels in transit can be identified to better control the probabilities of disk failure.
- probability models can be developed which allow for estimating, according to the level of shock a disk has experienced, what the impact of that event will be on its mean time before failure.
- Reduction of probability is readily achieved through independence. If an event has a 10% chance of occurring, then there is only a 1% chance of two independent occurrences of that event. Thus, achieving maximal levels of reliability is facilitated greatly by achieving independence of failure events.
- a disk that fails may cause other disks to fail because they are electrically connected.
- To achieve independence we can isolate the data connections by converting the voltage signal into an optical signal, which is then converted back to a voltage signal. Circuitry which limits current and/or electrical potential may also be employed. We can isolate the power connections bringing independent DC lines to each disk.
- Embodiments of the present invention may be implemented using a conventional general purpose computer or micro-processor programmed according to the teachings of the present invention, as will be apparent to those skilled in the computer art. Appropriate software can readily be prepared by programmers of ordinary skill based on the teachings of the present disclosure, as will be apparent to those skilled in the software art.
- a non-limiting example of a computer 100 as shown in Figure 9 may implement the method of the present invention, wherein the computer housing 102 houses a motherboard 104 which contains a CPU 106, memory 108 (e.g., DRAM, ROM, EPROM, EEPROM, SRAM, SDRAM, and Flash RAM), and other optional special purpose logic devices (e.g., ASICs) or configurable logic devices (e.g., GAL and reprogrammable FPGA).
- the computer 100 also includes plural input devices (e.g., keyboard 122 and mouse 124), and a display card 110 for controlling a monitor 120. Additionally, the computer 100 may include a floppy disk drive 114; other removable media devices (e.g., the compact disc 119, tape, and removable magneto-optical media (not shown)); and a hard disk 112 or other fixed high density media drives, connected using an appropriate device bus (e.g., a SCSI bus, an Enhanced IDE bus, an Ultra DMA bus, a SATA bus, a fibre channel storage network, or an IP network using network block devices or iSCSI).
- the computer may also include a compact disc reader 118, a compact disc reader/writer unit (not shown), or a compact disc jukebox (not shown), which may be connected to the same device bus or to another device bus.
- the system includes at least one computer readable medium.
- Examples of computer readable media are compact discs 119, hard disks 112, floppy disks, tape, magneto-optical disks, PROMs (e.g., EPROM, EEPROM, Flash EPROM), DRAM, SRAM, SDRAM, etc.
- The present invention includes software for controlling the hardware of the computer 100 and for enabling the computer to interact with a human user.
- Such software may include, but is not limited to, device drivers, operating systems, and user applications, such as development tools.
- Such computer readable media further include the computer program product of the present invention for performing the inventive method herein disclosed.
- The computer code devices of the present invention can be any interpreted or executable code mechanism, including, but not limited to, scripts, interpreters, dynamically linked libraries, Java classes, and complete executable programs. Moreover, parts of the processing of the present invention may be distributed for better performance, reliability, and/or cost.
- The invention may also be implemented by the preparation of application specific integrated circuits or by interconnecting an appropriate network of conventional component circuits, as will be readily apparent to those skilled in the art.
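As a hedged illustration of the peak-shock recorder and the shock-based reliability estimate discussed above, the following Python sketch tracks the largest shock seen so far and applies an assumed exponential degradation model to a drive's mean time to failure. The sensor read, the nonvolatile store, and the model parameters are all assumptions made for illustration, not details taken from the disclosure.

```python
import math

def update_peak_shock(read_acceleration_g, nonvolatile_store):
    """Persist the largest shock (in g) the attached object has ever experienced."""
    current = abs(read_acceleration_g())                 # instantaneous reading from the sensor
    if current > nonvolatile_store.get("peak_g", 0.0):
        nonvolatile_store["peak_g"] = current            # a real device would write this to NVRAM
    return nonvolatile_store["peak_g"]

def estimated_mttf_hours(base_mttf_hours, peak_shock_g, rated_shock_g, sensitivity=0.1):
    """Assumed model: MTTF decays exponentially with shock beyond the drive's rating."""
    excess_g = max(0.0, peak_shock_g - rated_shock_g)
    return base_mttf_hours * math.exp(-sensitivity * excess_g)

# Example: a drive rated for 60 g that recorded an 80 g shock in transit.
store = {}
update_peak_shock(lambda: 80.0, store)
print(estimated_mttf_hours(1_000_000, store["peak_g"], 60.0))   # about 135,000 hours under this assumed model
```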
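The independence arithmetic above (a 10% event occurring twice independently has a 1% chance) can be made concrete with a few lines of Python. This is a minimal sketch assuming each copy of the data fails independently with the same probability.

```python
def prob_all_copies_fail(p_single_failure: float, copies: int) -> float:
    """Probability that every one of `copies` independent replicas fails."""
    return p_single_failure ** copies

print(prob_all_copies_fail(0.10, 1))   # 0.1, the 10% single-event chance
print(prob_all_copies_fail(0.10, 2))   # about 0.01, the 1% chance cited in the text
print(prob_all_copies_fail(0.10, 3))   # about 0.001, showing why independent copies help so much
```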
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A method for assuring recovery of stored data includes receiving, from a sender, a stored data file at a data retention device and creating a storage certificate associated with the data file. The storage certificate includes a partial representation of the stored data file and an electronic signature of the data retention device. The stored data file is stored in a storage unit at the data retention device, and the storage certificate is returned to the sender to verify storage of the stored data file at the data retention device. The method of backup storage of data files includes receiving a stored data file at a data retention device and creating N data segments from the data file, such that the data file can be recreated using fewer than N data segments. The N data segments are located in a plurality of storage locations.
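As an informal illustration of the storage-certificate idea summarized in the abstract, the sketch below uses a SHA-256 digest as the partial representation of the file and an HMAC as a stand-in for the data retention device's electronic signature. Both primitives, and all names in the code, are assumptions made for illustration; the application does not prescribe them.

```python
import hashlib, hmac, json

def make_storage_certificate(file_bytes: bytes, device_key: bytes, file_name: str) -> dict:
    digest = hashlib.sha256(file_bytes).hexdigest()                 # partial representation of the file
    payload = json.dumps({"file": file_name, "sha256": digest}, sort_keys=True)
    signature = hmac.new(device_key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}             # certificate returned to the sender

def verify_storage_certificate(cert: dict, device_key: bytes) -> bool:
    expected = hmac.new(device_key, cert["payload"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, cert["signature"])

cert = make_storage_certificate(b"example backup contents", b"device-secret", "report.doc")
print(verify_storage_certificate(cert, b"device-secret"))           # True: the sender can confirm storage
```

In practice an asymmetric signature would let the sender verify the certificate without holding the device's secret key; the HMAC here simply keeps the sketch self-contained.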
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2007/064810 WO2008118168A1 (fr) | 2007-03-23 | 2007-03-23 | Procédé et système de stockage de sauvegarde de données électroniques |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2007/064810 WO2008118168A1 (fr) | 2007-03-23 | 2007-03-23 | Procédé et système de stockage de sauvegarde de données électroniques |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2008118168A1 true WO2008118168A1 (fr) | 2008-10-02 |
Family
ID=39788778
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2007/064810 Ceased WO2008118168A1 (fr) | 2007-03-23 | 2007-03-23 | Procédé et système de stockage de sauvegarde de données électroniques |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2008118168A1 (fr) |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2478438A4 (fr) * | 2009-09-16 | 2013-03-13 | Varaani Works Oy | Procédé et serveur de stockage pour redondance de données |
| US8453257B2 (en) | 2009-08-14 | 2013-05-28 | International Business Machines Corporation | Approach for securing distributed deduplication software |
| US9832769B2 (en) | 2009-09-25 | 2017-11-28 | Northwestern University | Virtual full duplex network communications |
| CN111277651A (zh) * | 2020-01-20 | 2020-06-12 | 国网江苏招标有限公司 | 一种远程投标方法及系统 |
| US10922173B2 (en) | 2018-03-13 | 2021-02-16 | Queen's University At Kingston | Fault-tolerant distributed digital storage |
| CN113196702A (zh) * | 2018-11-16 | 2021-07-30 | 先进信息技术公司 | 使用区块链进行分布式数据存储和传送的系统和方法 |
| CN115459882A (zh) * | 2022-09-05 | 2022-12-09 | 福建天晴在线互动科技有限公司 | 一种远程文件修复方法及系统 |
| CN118331516A (zh) * | 2024-06-17 | 2024-07-12 | 北京城建智控科技股份有限公司 | 数据处理方法及装置 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020138572A1 (en) * | 2000-12-22 | 2002-09-26 | Delany Shawn P. | Determining a user's groups |
| US20050114653A1 (en) * | 1999-07-15 | 2005-05-26 | Sudia Frank W. | Certificate revocation notification systems |
| US20050138110A1 (en) * | 2000-11-13 | 2005-06-23 | Redlich Ron M. | Data security system and method with multiple independent levels of security |
| US20050204273A1 (en) * | 2004-02-06 | 2005-09-15 | Samsung Electronics Co., Ltd. | Apparatus and method for encoding and decoding a space-time low density parity check code with full diversity gain |
- 2007
  - 2007-03-23 WO PCT/US2007/064810 patent/WO2008118168A1/fr not_active Ceased
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20050114653A1 (en) * | 1999-07-15 | 2005-05-26 | Sudia Frank W. | Certificate revocation notification systems |
| US20050138110A1 (en) * | 2000-11-13 | 2005-06-23 | Redlich Ron M. | Data security system and method with multiple independent levels of security |
| US20020138572A1 (en) * | 2000-12-22 | 2002-09-26 | Delany Shawn P. | Determining a user's groups |
| US20050204273A1 (en) * | 2004-02-06 | 2005-09-15 | Samsung Electronics Co., Ltd. | Apparatus and method for encoding and decoding a space-time low density parity check code with full diversity gain |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8453257B2 (en) | 2009-08-14 | 2013-05-28 | International Business Machines Corporation | Approach for securing distributed deduplication software |
| EP2478438A4 (fr) * | 2009-09-16 | 2013-03-13 | Varaani Works Oy | Procédé et serveur de stockage pour redondance de données |
| US9832769B2 (en) | 2009-09-25 | 2017-11-28 | Northwestern University | Virtual full duplex network communications |
| US10922173B2 (en) | 2018-03-13 | 2021-02-16 | Queen's University At Kingston | Fault-tolerant distributed digital storage |
| CN113196702A (zh) * | 2018-11-16 | 2021-07-30 | 先进信息技术公司 | 使用区块链进行分布式数据存储和传送的系统和方法 |
| CN111277651A (zh) * | 2020-01-20 | 2020-06-12 | 国网江苏招标有限公司 | 一种远程投标方法及系统 |
| CN111277651B (zh) * | 2020-01-20 | 2024-04-09 | 国网江苏招标有限公司 | 一种远程投标方法及系统 |
| CN115459882A (zh) * | 2022-09-05 | 2022-12-09 | 福建天晴在线互动科技有限公司 | 一种远程文件修复方法及系统 |
| CN118331516A (zh) * | 2024-06-17 | 2024-07-12 | 北京城建智控科技股份有限公司 | 数据处理方法及装置 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US8171102B2 (en) | Smart access to a dispersed data storage network | |
| US8868969B2 (en) | Method and apparatus for rebuilding data in a dispersed data storage network | |
| US9996413B2 (en) | Ensuring data integrity on a dispersed storage grid | |
| JP6483746B2 (ja) | データ記憶アプリケーションプログラミングインターフェース | |
| WO2008118168A1 (fr) | Procédé et système de stockage de sauvegarde de données électroniques | |
| Chen et al. | NCCloud: A network-coding-based storage system in a cloud-of-clouds | |
| US8352782B2 (en) | Range based rebuilder for use with a dispersed data storage network | |
| US8880940B2 (en) | Pessimistic data reading in a dispersed storage network | |
| US7240236B2 (en) | Fixed content distributed data storage using permutation ring encoding | |
| US10693640B2 (en) | Use of key metadata during write and read operations in a dispersed storage network memory | |
| US20140344228A1 (en) | Multiple Revision Mailbox | |
| CN104603740A (zh) | 归档数据识别 | |
| US11169973B2 (en) | Atomically tracking transactions for auditability and security | |
| US8239706B1 (en) | Data retrieval system and method that provides retrieval of data to any point in time | |
| US12411736B2 (en) | Data reconstruction in a storage network and methods for use therewith | |
| US7729926B1 (en) | Methods and apparatus for backing up and restoring data | |
| US9971802B2 (en) | Audit record transformation in a dispersed storage network | |
| Goodson et al. | Efficient consistency for erasure-coded data via versioning servers | |
| Jaikar et al. | Verifying Data Integrity in Cloud | |
| More et al. | Efficient and Secure Data Dynamic Operations in Cloud Computing | |
| Shrividhya et al. | Erasure correcting to enhance data security in cloud data storage | |
| Ghaeb et al. | An oblique-matrix technique for data integrity assurance | |
| Curtmola et al. | ◾ Availability, Recovery, and Auditing across Data Centers | |
| Patil et al. | Data Integrity Check In Cloud Using Dispersal Code | |
| Dhanalakshmi et al. | Remote Data integrity in cloud security services |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 07759269; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 07759269; Country of ref document: EP; Kind code of ref document: A1 |