HK1159285B - System and method for high performance enterprise data protection
Description
Cross Reference to Related Applications
This application claims priority from U.S. provisional patent application No. 60/693,715, filed June 24, 2005, which is incorporated herein by reference. This application also incorporates by reference the full text of our co-invented and co-assigned application entitled "System and Method for Virtualizing Backup Images", filed on even date herewith.
Technical Field
The invention relates to the field of information technology, and more particularly to high-performance, enterprise-level backup and disaster recovery systems.
Background
Recent events have shown that the need for rapid recovery from disasters, both human-caused and natural, is critical. This need is addressed by enterprise-level backup and disaster recovery systems. In the prior art in this field, the typical end product of a backup operation is a backup volume that must undergo a relatively long "restore" process before it can be used in production.
Some "short downtime" backup and restore solutions do exist, but they typically require expensive server clustering and/or replication capabilities.
The prior art has been described in publications of the Storage Networking Industry Association ("SNIA"), which may be obtained online at www.snia.org. See, inter alia, "Examination of Disk-Based Data Protection Technologies" by Michael Rowan of Revivio, Inc.; "Identifying and Eliminating Backup System Bottlenecks" by Jacob Farmer of Cambridge Computer Corporation; "Technologies to Address Continuous Data Protection" by Michael Fishman of EMC Corporation; and "Next Generation Business Continuity" by Andrea Chiaffitelli of AT&T Corporation (each of which is incorporated herein by reference).
As will be appreciated upon review of the aforementioned cited references, the prior art in this field does not provide a way to make the most recent point-in-time snapshots of the system immediately available in the event of a system failure or disaster without the need for large-scale server clustering and/or replication.
It is therefore desirable to have a system, implemented with simple hardware, that gives an organization a recent, consistent set of images of its production servers at any given time, images that can be brought online and used to resume effective production more or less instantaneously in the event of a system failure or disaster.
Disclosure of Invention
One embodiment of the present invention is made commercially available as the software product Backup Express (BEX) by the assignee of the present application (Syncsort Incorporated). As implemented in Backup Express, the present invention provides, among other capabilities, a feature called "fast application recovery" (FAR) that makes near-instantaneous recovery from failures possible with simple hardware well within the IT budget of most enterprises.
It is an object of the present invention to provide a high-performance, enterprise-level data protection system and method that provides efficient block-level incremental snapshots of primary storage devices, and the instant availability of such snapshots in an immediately mountable form that can be used directly in place of the primary storage devices.
Other objects of the invention are as follows:
To provide an enterprise repository for such snapshots, so as to facilitate implementation of the methods described herein on various types of storage platforms.
To provide the ability to create a replacement physical primary storage facility in real time while work continues using another storage unit.
To provide the ability to eliminate redundancy across multiple backups and/or within a single file system by means of block-level comparisons.
In one embodiment, the instant availability aspect of the present invention is provided by the steps of:
a) providing a base-level snapshot of a source ("primary") file system, stored on a secondary system;
b) providing a block-level incremental snapshot of the primary system, stored on the secondary system, representing only those blocks that have changed since the previous snapshot; and
c) constructing, from at least one of the incremental snapshot images, a logical disk image that can be used directly as a mountable unit of storage (the incremental snapshot in step b having been structured so as to facilitate immediate performance of this step). A simplified sketch of this synthesis appears below.
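By way of illustration only (and not as part of the claimed subject matter), the following Python sketch shows how a point-in-time disk image might be synthesized from a base image plus a chain of block-level deltas. The 4 KiB block size and the delta representation (a mapping from block number to block contents) are assumptions made for the example and are not prescribed by the invention.

    BLOCK_SIZE = 4096  # illustrative block size; not prescribed by the invention

    def apply_deltas(base_image_path, delta_chain, output_image_path):
        """Copy the base-level image, then overlay each incremental delta in order
        (oldest first), producing a full point-in-time image that can be mounted."""
        with open(base_image_path, "rb") as base, open(output_image_path, "wb") as out:
            while True:
                chunk = base.read(BLOCK_SIZE)
                if not chunk:
                    break
                out.write(chunk)
        with open(output_image_path, "r+b") as out:
            for delta in delta_chain:                 # each delta: {block_no: bytes}
                for block_no, data in delta.items():
                    out.seek(block_no * BLOCK_SIZE)
                    out.write(data)

    # Example (hypothetical file names):
    # deltas = [{7: b"\x00" * BLOCK_SIZE}, {7: b"\x11" * BLOCK_SIZE}]
    # apply_deltas("base.img", deltas, "pit.img")   # pit.img may then be mounted

In practice the system avoids a full copy of this kind by exposing the images through sparse, copy-on-write LUNs, as described in the detailed description below.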
The snapshot and instant availability features of the present invention are used in conjunction with storage hardware components to provide an "enterprise image destination" (EID) for backup images created in accordance with the present invention. The EID software is designed to work with storage hardware from a wide variety of vendors, including inexpensive ATA storage hardware. A "double protection" feature is also provided, whereby point-in-time images in the EID may themselves be backed up to selected media or copied to other EIDs.
The present invention also provides a feature, referred to as "lazy mirroring", whereby an alternate physical primary facility can be created for the source file system while work continues on a secondary storage unit. The secondary storage unit in this feature may be a secondary logical volume previously brought online under the "instant availability" feature described above, while a replacement physical primary volume is created; or (as another example) it may be the live unit of a mirrored storage system while another mirrored unit is "resilvered" or replaced. Other applications of the lazy mirroring technique are possible. In any such application, lazy mirroring according to the present invention is characterized by the ability to proceed without interrupting processing.
Finally, the present invention provides a block-comparison-based technique for greatly accelerating distributed backup operations by eliminating redundant data when backing up multiple systems with partially common content (e.g., operating system files and common databases). When it is determined that a block to be backed up already exists in the backup set, a reference to the existing block is recorded in the catalog of the backup rather than storing the block twice. A similar technique is employed to eliminate redundant blocks within a single file system.
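As an illustration of the block-comparison approach just described (a sketch only, not the system's actual on-disk format), blocks can be identified by a content hash; a block whose hash already exists in the backup set is merely referenced in the catalog rather than stored a second time.

    import hashlib

    BLOCK_SIZE = 4096  # illustrative

    def backup_with_dedup(volume_path, block_store, catalog):
        """Store each distinct block once, keyed by content hash; the catalog maps
        (volume, block number) -> hash, so duplicate blocks are only referenced."""
        with open(volume_path, "rb") as vol:
            block_no = 0
            while True:
                block = vol.read(BLOCK_SIZE)
                if not block:
                    break
                digest = hashlib.sha256(block).hexdigest()
                if digest not in block_store:   # first occurrence of this content
                    block_store[digest] = block
                catalog[(volume_path, block_no)] = digest
                block_no += 1

    # block_store, catalog = {}, {}
    # backup_with_dedup("/dev/sdb1", block_store, catalog)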
Other objects and advantages of the present invention will become apparent from the accompanying drawings and the following detailed description.
Drawings
FIG. 1 is a high-level system block diagram illustrating a typical enterprise deployment of one embodiment of the present invention.
FIG. 2 is a block diagram illustrating block-level backup data transfer and file-level restore.
FIG. 3 is a block diagram illustrating the timeline of operations in a block-level incremental backup followed by an exemplary file-level restore.
FIG. 4(A & B) is a block diagram illustrating timelines of exemplary disaster recovery scenarios including incremental block-level backup, instant availability restore, and "lazy mirror" replication.
Detailed Description
The following is a description of various preferred embodiments of aspects of the invention, showing details of how a system may be constructed to implement the invention, and the steps that may be taken to utilize such a system and practice such a method. These embodiments are for illustrative purposes only, and the invention is not intended to be limited to the specific examples shown. For example, certain preferred embodiments have been described with respect to implementation with specific storage hardware and operating systems, but it will be understood that the following disclosure is intended to enable those skilled in the art to readily apply the teachings set forth to other storage hardware and operating systems. The specific features of any particular embodiment should not be construed as limiting the scope as may be claimed.
Definitions
As used in this application, the following terms have the defined meanings:
APM (advanced protection manager): the name of a set of products that implement an embodiment of the present invention.
APM2D (Advanced Protection Manager to Disk): An umbrella term covering the currently available secondary-storage solutions, and future solutions, for providing permanent block-level incrementals and instant availability.
Application: A commercially licensed, backup-protected, back-end business application (typically a database). This is to be distinguished from (and should not be confused with) end-user applications.
Application Instance: A logically independent instance of an application that coexists with other instances on the same physical machine. An application instance is the target of FAR.
Backup Client (Backup Client): client software that provides block-level incremental backup for high-speed backup with virtually no impact on other operations. Direct access to disk bypasses the file system for very fast, efficient image-based backup. The backup client software also provides block-level incremental backups of Exchange 2000/2003 and SQL Server 2000 databases.
BAR (backup after restore): The first backup after a restore, which is also incremental and dependent on the original backup repository.
EID (enterprise image destination): a near-line destination and repository for application-aware persistent image deltas.
EOFM: OEM version of snapshot driver for Windows from st.
ERF: It may be desirable to fail an application back from the target node of a FAR to the application's original or most recently designated home node. This failback is performed quickly and seamlessly with minimal application downtime.
ExpressDR: Provides simple and robust one-step bare-metal recovery for client nodes based on routine daily backups. It may also be used to deploy a complete system image to multiple machines.
Express Image: A block-level technique for high-performance backup of systems to tape or to storage-independent disk, providing an exceptional performance gain for high-capacity backups containing many small files.
FAR (fast application recovery): The ability to quickly bring applications online on a standby or original server by connecting to virtual storage created from the backup image on the NAS device.
Filer (file manager): a NAS device.
Forever Incremental (also referred to as "permanent incremental" and "permanent block-level incremental"): The ability to seed a base-level backup once and thereafter schedule incremental, block-level backups indefinitely.
Instant Availability: The ability to quickly mount backup data sets as read/write volumes, providing near-instantaneous recovery of critical applications and data without transferring the data.
iSCSI: A TCP/IP-based storage protocol; a low-cost alternative to Fibre Channel for making remote storage on an IP network accessible to any authenticated initiator node.
iSCSI Mapping and Unmapping: iSCSI login is the process of registering with the file manager so that a LUN on the file manager becomes visible as if it were local storage on the restore target node. iSCSI logout reverses the process and removes the disks.
LAR (lifecycle after restore): If the protected FAR volume has business value, LAR is the combination of ERF and a backup of the FAR volume.
LUN Cloning: A feature of the NAS file manager that frees a snapshot-backed LUN from its supporting snapshot by converting it to a normal LUN. The LUN remains usable by the application while this process completes; the snapshot can then be deleted and the LUN exists independently.
LUN Creation: A feature of the NAS file manager that creates virtual storage from backup images stored in snapshots. These LUNs can then be mounted read-write on the restore target. Reads may come from the snapshot while writes are directed to a separate persistent area; the original backup image does not change.
Online/Background Restore: Automatic background copying of image data from the iSCSI drive to local disk after a FAR, while keeping the application online. This is done unobtrusively in the background while the application is active and running. A brief synchronization is required at the end, when the application is quiesced or restarted and the iSCSI drive is unmapped. At the end of the process all data is local, and the data transfer incurs no penalty in terms of application interruption or downtime.
PIT Images: Point-in-time images of application volumes, frozen at the time of backup.
Protocol Director: Controls and manages the execution of jobs employing block-level, application-consistent protocols.
Secondary Storage: Unlike primary storage (on which production data resides), secondary storage is the destination of backups and forms the basis of the virtual disk LUNs. Only changes require additional storage, so only a little more secondary storage is needed than the backups themselves require. The storage may be Write Once Read Many (WORM) to support immutable content retained to meet legal requirements.
Specialized Backup Software: Software that creates backup images, captures incremental changes, and preserves past points in time on secondary storage. The backup software creates application-consistent images and additionally captures the machine configuration, including persistent and volatile state.
Application Manager: Manages all block-level, application-consistent backup operations through an easy-to-use, browser-based GUI. Backup of NAS devices as well as Windows, UNIX, and Linux nodes is supported. SQL and Exchange volumes and databases are also displayed in the GUI for selective backup and restore. All backups and other operations are tracked in a single catalog.
Standby Node/Alternate Node/Proactive Setup: A machine with minimal hardware and a default application installation that can be the target of a FAR for high-availability or validation reasons. Depending on business needs, this may instead be a powerful machine capable of running applications on a permanent basis.
Volume: A unit of backup; a single file system consisting of many files and directories, which is backed up at the block level.
End-to-end protection with enterprise image destinations
The enterprise image destination is part of the APM (Advanced Protection Manager) product suite. This feature is implemented entirely in software and, once installed on a node, allows that node to function as a near-line destination for application-aware permanent image deltas. An EID node can be configured in various ways (local disk, iSCSI storage, etc.) to provide varying degrees of protection and reliability. Image backups from different nodes are merged, near-lined, and versioned on the device. These versioned images support instant availability of file systems and applications.
FIG. 1 illustrates a typical enterprise deployment of an embodiment of the present invention showing a secondary storage server 107 utilizing inexpensive SATA disk drives, the secondary storage server 107 in turn being connected to additional server arrays 103, NAS devices 104, remote secondary storage devices 105 and tape storage 106. This backup scheme is used to remotely manage backup and restore of networks including small (101) and large (102) remote sites. The block-level backup client software is used to perform the block-level backup operations indicated at (111, 112, 114, 115). The figure also shows replication to tertiary storage 113 (where secondary storage server 107 also serves as tertiary storage) and tape 116 (to tape drive 106). The various components and backup steps shown in fig. 1 are further discussed in various sections of the present disclosure below.
System architecture:
Basic configuration:
The EID node will have locally attached SATA drives configured as hot-swappable RAID 5. This storage will serve as the repository for images. Versioning is achieved via the snapshot facility available on the system (VSS for Windows 2003 or LVM/EVMS for Linux). Images will be exported as read-write LUNs via the accompanying iSCSI target software.
Thin configuration:
The EID node will have only a small local drive (ideally mirrored) to accommodate the OS and the EID software. A backend iSCSI storage array (or similar intelligent network storage device) will be used as the actual destination for backup images. To be a candidate for participation in the accompanying EID solution, the storage array must expose LUN creation, snapshot creation, LUN cloning, and LUN masking/unmasking features. The VSS/VDS or SMI-S APIs may be used to standardize the interface between the EID software and external storage.
Thin shared configuration:
This is a variation of the thin configuration described above, in which a networked storage array is shared between the source machine and the EID node. In such a configuration, backup may be optimized by sharing snapshots between the source and the destination. In this configuration the EID node acts as a backup head.
EID with double protection:
Backups themselves need to be protected by further backup to tape or disk; this is termed double protection. The first backup residing on disk on the EID node may be transferred (this transfer being the double protection) to a tape device on the SAN, or to other disk storage distinct from the storage on which the first backup resides. That storage may be secondary or tertiary storage residing on a SAN or attached to some remote device (possibly another EID node). The EID is thus a key enabler of end-to-end data protection solutions based on multilevel storage.
Configuration:
APM client node:
These nodes are configured with the APM client software and a snapshot support program supporting multiple snapshot providers (where available). The APM client software is able to back up to an EID target in addition to secondary storage, which may be vendor-supplied storage hardware or ordinary ATA storage hardware. The snapshot support program may be basic (the accompanying EOFM) or sophisticated, with a separate snapshot provider for each volume. (When there are multiple snapshot providers, their use must be pre-configured or dictated by the EID node.) When implemented, application support is available for both secondary and EID targets.
APM server-EID node:
Such a node has the EID software installed, with storage-specific plug-ins depending on the backend iSCSI storage (if any). The plug-in configuration is fixed at installation time along with the license information. This basic configuration can be supported on two different commercial OS sets: Windows 2003 with NTFS, and Linux 2.6 with ext3fs/xfs on LVM/EVMS. What is needed is essentially a 64-bit journaling file system with sparse-file support and multiple persistent snapshots. Any system that meets these criteria can be a candidate for an EID node. (Additional file system attributes such as compression and/or encryption, although not required, may be employed to provide additional features at additional complexity and/or overhead.)
Backup process:
FIG. 2 schematically illustrates point-in-time snapshots, block-level incremental backups, and creation of point-in-time full volume images, as well as file-level restore operations.
Snapshot phase:
The protocol director contacts APPH (the application helper, which coordinates application-specific interactions (SQL Server, Exchange, etc.) at the start and end of the backup) using BACKUP_PREPARE. APPH interfaces with the snapshot handler, which encapsulates the snapshot code and the incremental block-tracking interface, generates a snapshot of the set of volumes, and flushes the change log. The snapshot handler executes DISCOVER_LUNS as part of file system discovery. If some LUNs are detected to be backend LUNs from a supported iSCSI (or FCP (Fibre Channel Protocol)) vendor, the snapshot handler invokes vendor-specific methods to obtain a snapshot of the set of volumes residing on the same iSCSI storage entity (e.g., a volume containing a set of LUNs on a storage device). There will be a dedicated provider program for each storage vendor to supply this functionality, or a VSS or SMI-S provider may be used if available from the storage vendor. Each back-end storage node requires additional configuration information to provide this functionality, which must be obtained from the database. (This information may be cached or saved as part of a local configuration file.) Since most external providers will not provide change-log support, it may be necessary to take both an external (or VSS-coordinated) snapshot and an accompanying EOFM snapshot. The EOFM snapshot is used solely to flush the change log and track blocks that change; the external snapshot represents either the real backup instance or a consistent source for remote copying. The EOFM snapshot is taken first, followed by the external snapshot, to produce a consistent image. There is a small window between the two snapshots during which a block may change. Since applications have been quiesced (application state has been coordinated through APPH, so applications know the backup has started and have flushed their transactions to disk), they should generate no I/O; nor should any file system metadata change (in any case the file system can be restored to a crash-consistent state). There may be block changes in individual files that are not captured until the next incremental. Note that the window is small, and there is little chance that unsupported applications will be left in an inconsistent state.
At the end of this process, APPH creates a content file describing the backup. Vendor-specific information is added to this file, possibly a logical name and a persistent snapshot id, along with the local snapshot volume created by EOFM, VSS, or a third-party provider.
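The ordering described above — EOFM snapshot first (to flush the change log), then the external snapshot, with the application quiesced by APPH around both — can be summarized by the following sketch. Every function here is a hypothetical stand-in for the APPH, EOFM, and vendor-provider calls named in the text, not an actual API.

    def quiesce_application():
        print("APPH: application has flushed its transactions and is quiesced")

    def take_eofm_snapshot():
        print("EOFM snapshot taken; change log flushed, changed-block tracking restarted")
        return "eofm-snap-1"

    def take_external_snapshot():
        print("external (vendor/VSS) snapshot taken; this is the real backup instance")
        return "vendor-snap-1"

    def resume_application():
        print("APPH: application resumed")

    def snapshot_phase():
        quiesce_application()
        eofm_id = take_eofm_snapshot()        # taken first, so the change log covers
        vendor_id = take_external_snapshot()  # everything up to the external snapshot
        resume_application()
        return eofm_id, vendor_id

    # snapshot_phase()  # the small window between the two snapshots is discussed above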
Data transfer:
The SVH contacts the EID software using a CREATE_RELATIONSHIP message (for the first backup) and passes the content file as the source path.
The EID software on the EID node then establishes a connection with corresponding software on the source node ("node software") and passes the content file path. The node software on the source side then reads the content of the content file and passes it back to the EID software on the EID node.
Variation I: shared snapshot backup
The EID software examines the vendor-specific snapshot information and determines whether the vendor is supported and licensed. If so, the EID software determines via a local query whether the snapshot exists on the shared storage device; if it determines that the shared snapshot can be used as the backup, the process is complete. An allocation bitmap is also obtained at this point. The EID software stores the relationship in its local database, i.e., the combination of source node + source drive (or unique id) + destination node + LUN name. The allocation bitmap, indexed by snapshot id, is also saved.
Snapshot on EID node:
The CREATE_SNAPSHOT call from the SVH returns with the shared snapshot obtained in the previous step.
Error recovery:
this is not required for this case.
Restart after cancellation:
not needed because the backup should be very fast.
File history recording:
a file history is (optionally) generated on the EID node using the backup LUN. The file history will be transferred to the BEM (Backup express master) server in some implementation specific manner.
Incremental backup:
Incremental backups proceed in the same manner as the base backup, except that the change journal is passed to the EID node, which then stores it in its local database indexed by snapshot id.
Checksums:
A checksum may be calculated for every allocated block on the LUN image and additionally saved in the EID database, indexed by snapshot id. The checksums are important for three reasons (a brief sketch follows the list below):
1. ability to verify after write.
2. Facilitating reliable checkpoint restart.
3. The ability to perform incremental backups even without block-level change tracking (although at increased cost).
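The sketch below illustrates the first of these uses (verification after write) with per-block checksums indexed by block number; the hash algorithm and the file-based access are assumptions of the example only.

    import hashlib

    BLOCK_SIZE = 4096  # illustrative

    def block_checksums(image_path):
        """Compute a checksum for every block of a LUN image (stored per snapshot id)."""
        sums = {}
        with open(image_path, "rb") as img:
            block_no = 0
            while True:
                block = img.read(BLOCK_SIZE)
                if not block:
                    break
                sums[block_no] = hashlib.sha256(block).hexdigest()
                block_no += 1
        return sums

    def verify_after_write(image_path, expected_sums):
        """Reason 1 above: re-read the written image and compare against the saved checksums."""
        return block_checksums(image_path) == expected_sums

    # checksum_db["snap-0001"] = block_checksums("/eid/luns/srv01_C.img")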
APPS:
The APPS volume comprises files that are generated on the active file system after the snapshot is taken, as part of a POST_BACKUP event. These files are not present in the shared snapshot and need to be backed up independently; in this case Variation II, "local copy to backup LUN", must be used. Although APPS appears as a virtual volume, the backup of APPS is accomplished by copying all of its files (a file-by-file backup) rather than by a volume-oriented block copy.
Variation II: local copy to backup LUN
If the EID software determines that a shared snapshot cannot be used, it creates a backup LUN on the iSCSI storage or locally, uniquely naming the LUN with the combination of source node + drive id. The host name + port id + target name + LUN id are returned to the source-side node software as part of the initial handshake.
The source-side node software then calls MAP_LUN (indirectly, via iSCSI login) with the information passed from the EID node. MAP_LUN exposes a device mapped into the local namespace. The node software then begins copying the allocated blocks from the device's local snapshot to the iSCSI-mapped device. During this process it passes status/checksum/progress information to the EID software via the already established channel.
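The local copy of Variation II might look like the following sketch: only the allocated blocks are read from the local snapshot device and written at identical offsets on the iSCSI-mapped backup LUN, with periodic progress reporting. The device paths, bitmap representation, and reporting channel are assumptions of the example. An incremental pass (discussed below) is identical except that the change-log bitmap is used in place of the allocation bitmap.

    BLOCK_SIZE = 4096  # illustrative

    def copy_allocated_blocks(snapshot_dev, mapped_lun_dev, allocation_bitmap, report=print):
        """Copy only the allocated blocks from the local snapshot to the mapped backup LUN."""
        with open(snapshot_dev, "rb") as src, open(mapped_lun_dev, "r+b") as dst:
            copied = 0
            for block_no, allocated in enumerate(allocation_bitmap):
                if not allocated:
                    continue
                src.seek(block_no * BLOCK_SIZE)
                dst.seek(block_no * BLOCK_SIZE)
                dst.write(src.read(BLOCK_SIZE))
                copied += 1
                if copied % 1024 == 0:
                    report(f"copied {copied} blocks")   # status/progress to the EID software

    # copy_allocated_blocks("/dev/mapper/snap_C", "/dev/sdx", [1, 1, 0, 1])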
Snapshot on EID node:
The EID software takes a snapshot (or some equivalent overlay entity) of the backup LUN and returns a snapshot id.
Error recovery:
Not needed, because the iSCSI connection used for data transfer is reliable and has built-in recovery and error correction. The EID software should be able to recover from errors on the control connection transparently to the DMA.
Restart after cancellation:
This needs to be implemented. The EID software needs to remember the last successful block write and pass this information during the initial handshake, indicating that this is the restart of an aborted backup.
File history recording:
a file history is (optionally) generated on the EID node using the backup LUN. The file history is communicated to the Backup Express Master server in some implementation specific manner.
Incremental backup:
Incremental backups are performed in the same manner as the base backup, except that the change log is used locally to copy only the changed blocks onto the backup LUN.
Checksums:
the checksum may be calculated for all allocated blocks on the LUN image and additionally stored in the EID database indexed by snapshot id.
APPS:
The APPS volume comprises files that are generated on the active file system after the snapshot is taken, as part of a POST_BACKUP event. These files are not present in the backup snapshot. After the APPS LUN has been mapped locally, it must be formatted with a locally recognized file system. The APPS directories/files are then copied in their entirety (file by file) onto the APPS backup LUN from the location pointed to by APPH (rather than from the snapshot). During an incremental backup, the APPS LUN must be cleared and a new set of APPS files copied. (The old snapshot retains the previous version of the APPS files.)
Variation III: network copy
As in Variation II, if the EID software determines that a shared snapshot cannot be used, it creates a backup LUN on the iSCSI storage or locally, uniquely naming the LUN with the combination of source node + drive id. LUN creation may fail if it is not supported on this node (as in the basic configuration). If this happens, the host name + port id + target name + LUN id are not returned to the source node software as part of the initial handshake, and Variation III is indicated.
If Variation III is indicated, or if there is no iSCSI (or other) LUN mapping support on the source node, the source-side node software reads the allocated blocks from the device's local snapshot and sends them over the network to the destination EID software. The destination EID software reads from the channel and writes a sparse file on a predefined volume at the destination. Either end of this process may generate checksums.
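On the destination side of Variation III, writing the received blocks into a sparse image file might look like the following sketch; unwritten ranges remain holes in the file, so only transferred blocks consume space. The block-list format and the paths are assumptions of the example.

    BLOCK_SIZE = 4096  # illustrative

    def write_blocks_to_sparse_image(image_path, blocks, volume_size_blocks):
        """Write each received (block number, data) pair at its native offset;
        untouched ranges remain holes in the sparse backup image."""
        with open(image_path, "wb") as img:
            img.truncate(volume_size_blocks * BLOCK_SIZE)   # full logical size, but sparse
            for block_no, data in blocks:                    # e.g. as read from the network
                img.seek(block_no * BLOCK_SIZE)
                img.write(data)

    # write_blocks_to_sparse_image("/eid/images/srv01_C.img",
    #                              [(0, b"\x00" * BLOCK_SIZE), (9999, b"\x11" * BLOCK_SIZE)],
    #                              volume_size_blocks=262144)   # a 1 GiB volume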
Snapshot on EID node:
the EID software takes a snapshot of the volume containing the backup image file and then returns the snapshot id to the DMA.
Error recovery:
it is necessary to recover from network outages by means of checkpoints maintained at the destination.
Restart after restart/cancel:
This needs to be implemented. The EID software needs to remember the last successful block write and pass this information during the initial handshake, indicating that this is the restart of an aborted backup.
File history recording: a file history is (optionally) generated on the EID node using the backup image.
Incremental backup:
Incremental backups are performed in the same manner as the base backup, except that the change log is used locally to read only the changed blocks, which are then transferred across the network to update the backup image at the destination.
Checksums:
the checksum may be calculated for all allocated/changed blocks on the backup image and additionally stored in the EID database indexed by snapshot id.
APPS:
The APPS directories/files are read in their entirety (file by file) from the location pointed to by APPH (rather than from a snapshot) and copied over the network to the destination EID software, where a directory structure is created (below the predetermined backup directory location) to mirror the files at the source. During an incremental backup, the APPS directory must be emptied and a new set of APPS files transferred and recreated from the source. (The old snapshot retains the previous version of the APPS files.)
Plug-in structure for external LUN/Snapshot management:
EID backup relies on snapshot creation, LUN cloning, and similar services, and both the source side and the EID side of the backup process are consumers of these services. To allow a clean structural separation and the ability to plug in products from different vendors, an interface (in the form of a DLL or shared library) is implemented for each relevant vendor provider. The default implementation uses the iSCSI provider accompanying the EID node, but it can be replaced by a vendor-specific implementation if licensed. The interface provides general LUN create/delete, LUN clone, and snapshot create/delete functions. An extended version of the interface may add functionality for block-level mirroring and other advanced features (e.g., secondary-to-tertiary replication) that can be used to support an efficient and elegant double protection scheme.
EID database:
A small database on the EID node is required to maintain configuration (e.g., backend iSCSI storage), permissions, snapshot ids, checksum information, etc. This is especially necessary where the EID node operates iSCSI/shared SAN storage in the backend. The Backup Express infrastructure handles the unique snapshot id, but the EID software must translate it into an exact network entity by dereferencing the snapshot id via the local database.
A simple implementation could be a set of directories, named by snapshot id, each containing a block allocation bitmap, an incremental bitmap, checksums, a file history, and so on.
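Such a directory-per-snapshot layout could be as simple as the following sketch; the file names and formats shown are illustrative assumptions only.

    from pathlib import Path

    def create_snapshot_record(eid_root, snapshot_id, alloc_bitmap, change_bitmap,
                               checksums, file_history=b""):
        """One directory per snapshot id, holding the metadata items named above."""
        snap_dir = Path(eid_root) / snapshot_id
        snap_dir.mkdir(parents=True, exist_ok=True)
        (snap_dir / "alloc.bitmap").write_bytes(alloc_bitmap)
        (snap_dir / "change.bitmap").write_bytes(change_bitmap)
        (snap_dir / "checksums").write_text("\n".join(checksums))
        (snap_dir / "file_history").write_bytes(file_history)
        return snap_dir

    # create_snapshot_record("/eid/db", "snap-0001", b"\xff", b"\x0f", ["ab12...", "cd34..."])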
Double protection to tape:
This is achieved via a conventional NDMP (Network Data Management Protocol) backup redirected from the SSSVH to the job handler. (Refer to the separate discussion of double protection.) With respect to DP to tape, it is important to note that a full image of the first backup is created on tape. Subsequent tape backups are full copies of other first-backup instances. The concept of incrementals, or of relating one tape backup image to another in any other way, is not part of this design.
Double protection to disk:
Double protection to disk (DP2D) further extends the life of a backup image on disk by creating another backup of the original/first backup on disk. In this case, where feasible, subsequent backups are created by transferring only incremental data to update the tertiary backup. The possible schemes are:
multilevel storage visible to EID nodes:
In this scheme, the tertiary disk storage is accessible from the EID node (secondary and tertiary storage are part of a large multilevel storage deployment accessed via a unified single-vendor interface, e.g., Hitachi TagmaStore). In this case the DP backup is a local block-level incremental backup performed by the EID software after the appropriate tertiary location has been selected and the LUN has been unmasked and mounted on the local EID node.
Block mirroring between individual vendor nodes:
if the vendor has an efficient and device-implemented block mirroring method for transferring data between the secondary and tertiary nodes, the EID software will trigger and perform image transfers/updates via the vendor specific API set to create a doubly protected backup.
EID node to EID node:
when the tertiary storage is physically isolated from the EID node, the remote EID node will initiate a backup via a "network copy" to pull (pull) data from the local EID node.
EID node to secondary store:
when data must be transferred between the EID node and the secondary node, the applicable backup client software transfer method will be used, i.e., the secondary storage will be contacted and a request will be made to pull data from the EID node. The EID software will recognize a DP2D backup and update the secondary image with the appropriate (usually most recent) snapshot using the saved bitmap.
The backup mechanism is as follows:
once a double protection job is created to protect the first backup, the protocol director initiates an EID backup that is very similar to a conventional EID backup except that the snapshot phase is skipped.
A CREATE_RELATIONSHIP message is sent to the destination EID software (which could be on an EID node co-located with the destination, a remote EID node, or another type of secondary storage). If the EID software detects that it is on the source node of the backup, it uses the appropriate mechanism either to copy the image locally to the tertiary destination (using the allocation or incremental bitmap saved with the backup) or to invoke a vendor-specific method to accomplish the transfer. If the EID software detects that the source is remote, it initiates a conventional EID backup using the mechanisms described previously. The backup is saved on the destination EID node just like a conventional EID backup, meaning that the process can be cascaded indefinitely.
The snapshot id returned from NOTIFY in response to creation of the secondary snapshot is cataloged as part of the DP backup and is tied to the original first backup. (for a detailed explanation, see the separate discussion of double protection)
Restore from dual protection backup:
reference is made to the description of the double protection.
Restore browsing:
When a file history is generated at the end of a backup on the EID node and incorporated into the Backup Express database, browsing proceeds normally via the catalog browsing function of the database. When no file history is generated (because generating one would be computationally intensive or would require excessive storage), the NDMP directory browsing functionality may be used by contacting the EID software. Browsing may be provided by mounting the backup LUN on the EID node and browsing the file system on demand, either using the existing "snapshot directory list" mechanism or by generating a "rawtoc" from the image file. If the option of mounting the LUN as a recognizable file system is not available, then double protection to tape requires that a file history be generated during the double protection operation, either as part of the image or while constructing a file-by-file archive format.
Restore flows:
Directory/file restore:
Once the restore selection has been generated (either by the user, or by the protocol director after APPH has translated the application objects of a backup instance into files) and the content file has been created, the SSSVH contacts the node software on the restore target and passes it the content file, from which it obtains the EID node, the path, and the snapshot id. The node software on the restore target then contacts the EID software, passing it the restore path and snapshot id. Once the EID node has examined this information, it determines whether the id and volume combination can be exposed as a LUN on the restore target. If it can (much as in backup), the LUN is created by the EID node, either locally or on shared SAN storage, and the host name + port id + target name + LUN id are passed to the restore target. (Note: the host name may differ from that of the EID node.) Once the node software on the restore target has mapped the LUN, the handshake is complete. For instant availability, this essentially completes the restore process. Otherwise, the node software performs a local copy of the files/directories from the mapped LUN to the restore target location. (Note: this is exactly how the APPS files are logically backed up.)
Fallback:
The EID node may determine that the LUN cannot be exposed to the requesting node (e.g., for security reasons), or the requesting node may be unable to map the LUN after the initial handshake is complete. In this (low-priority) case, a conventional restore is performed: the EID software reads the requested files from the backup image and sends them over the network, and the node software on the restore target then recreates the files locally from the received data. In this case a "rawtoc" is required, which either pre-exists from the backup process or is created on the fly for the restore (and then cached if needed).
Error recovery/restart capability:
this is not necessary for LUN-mapped/IA type restore, but may be useful for traditional restore (if fully implemented).
Instant availability restoration:
As in other block-level restores, MAP_LUNS is called (as implemented in the snapshot handler) to map a set of volumes on the restore target, via iSCSI or FCP, according to the selected snapshot. The snapshot handler calls CREATE_LUN_FROM_LUN on the EID node to create and expose a LUN within the snapshot. The APPS volume is then similarly mapped into the local namespace, either via a local iSCSI mount or a network mount. Once this step is complete, the SSSVH directs APPH to complete the restore. If necessary, APPH copies log files from the APPS volume to the IA volume to recover the application or database. Note that IA restore is not tied to the EID software at all.
The backup data transferred over the network as part of the incremental block-level image has a disk marker prepended, with the appropriate information to virtualize the backed-up volume as an entire SCSI disk with a single active partition.
During restore, the read-only image is converted into an iSCSI-addressable read-write LUN by creating a sparse file within the snapshot that backs the image. The LUN file is persistent and may serve as primary storage that aggregates the changes together with the original unchanged data from the backup image. The LUN may be mounted as a separate disk or as part of a RAID set.
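The conversion just described behaves like a copy-on-write overlay: reads fall through to the unmodified backup image unless the block has been rewritten, in which case they come from the persistent overlay file. The following sketch is illustrative only; the real LUN is exposed over iSCSI rather than through a Python object, and the in-memory index of written blocks is an assumption of the example.

    BLOCK_SIZE = 4096  # illustrative

    class CopyOnWriteLUN:
        """Writes land in a sparse overlay; reads fall back to the read-only backup image."""
        def __init__(self, backup_image_path, overlay_path):
            self.image = open(backup_image_path, "rb")   # original image is never modified
            open(overlay_path, "ab").close()             # ensure the overlay file exists
            self.overlay = open(overlay_path, "r+b")
            self.written = set()                         # block numbers present in the overlay

        def write_block(self, block_no, data):
            self.overlay.seek(block_no * BLOCK_SIZE)
            self.overlay.write(data)
            self.written.add(block_no)

        def read_block(self, block_no):
            src = self.overlay if block_no in self.written else self.image
            src.seek(block_no * BLOCK_SIZE)
            return src.read(BLOCK_SIZE)

    # lun = CopyOnWriteLUN("/eid/images/srv01_C.img", "/eid/overlays/srv01_C.cow")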
Error recovery/restart capability:
N/A.
restore via local volume rollback:
Volume rollback is possible only if the restore targets the original location and all change logs since the backup time are available. If these conditions are not met, a full volume restore may be triggered (which is the degenerate case of volume rollback), or the restore job may fail. (Given the capability of IA restore, this may not be required at all.)
An option indicates that volume rollback is desired, in which case the protocol director sends a VOLUME_ROLLBACK message (much like MAP_LUN) to the snapshot handler. This message contains the backup jobid (which uniquely identifies the point in time of the backup) and the volume in question. If a volume rollback is possible, the snapshot handler locks and unmounts the volume (the application residing on the volume having been shut down or taken offline by APPH) and then flushes the change log with a snapshot. All change logs since the time of the snapshot being restored are logically combined to create a bitmap file, whose name is returned to the protocol director. The protocol director adds the bitmap file to the content file and passes it on to the EID software, which uses the bitmap to restore only that set of blocks, either from the mapped LUN or over the network.
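The combination of change logs might be performed as in the following sketch, which assumes the change logs are fixed-length bitmaps and that the set of blocks to roll back is the union of the blocks changed in each interval since the selected snapshot; both are assumptions of the example.

    def rollback_bitmap(change_logs):
        """Combine the per-interval change-log bitmaps (given as bytes) into a single
        bitmap of every block changed since the selected snapshot (bitwise OR)."""
        combined = bytearray(len(change_logs[0]))
        for log in change_logs:
            for i, byte in enumerate(log):
                combined[i] |= byte
        return bytes(combined)

    # rollback_bitmap([b"\x01\x00", b"\x00\x80", b"\x01\x02"])  ->  b"\x01\x82"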
If a traditional full volume restore is implemented, the allocation bitmap must be passed from the EID node to the node software on the restore target so that only the allocated blocks are copied. If a network copy is used, the EID node already knows which blocks to send.
After the restore is complete, the volume is unlocked and remapped into the local namespace, and the application/database is then restarted and brought online.
Restore via volume rollback in thin sharing configuration:
this restore mode requires back-end storage support for single file or LUN rollback.
The volume locking and application shutdown that occur on the restore target node, coordinated by the snapshot handler and APPH, are exactly as described above.
During the initial handshake for volume rollback, the restore target passes the mapping information for the target volume (e.g., D: -> fileA, vol3/lun2) to the EID software. After determining that the backend storage supports this feature and that the snapshot is logically related to the restore target LUN, the EID software calls a backend API (part of the plug-in interface) with two parameters: the snapshot being restored, and the target logical entity or LUN of the volume on the restore target node.
The volume rollback on the backend storage occurs asynchronously and may take some time, depending on the divergence between the active file system and the snapshot (though it should be fast, because only local copying is involved). Once it is done, the restore is complete and the application can be restarted. (An example of this scenario is restoring a single-file LUN snapshot on a NAS device.)
Error recovery/restart capability:
Full volume restore: Error recovery is important only for large full-volume restores. It may be accomplished by a restart mechanism similar to that used for backup, with checkpoints tracked by the restore target node software and communicated upon reconnection. Whether a restore needs to be restartable after being cancelled by the DMA is beyond the scope of this document.
Local volume rollback:
since the restore involves a local copy, error recovery should be unnecessary. A restart capability after cancellation/suspension may be desirable.
Application-supported rollback: Error recovery should be unnecessary, but a restart capability should be implemented if the backend storage supports restarts.
ExpressDR restore:
This is a special case of full volume restore, in which Linux is running as the restore target. The Linux node software may be driven by a modified version of jndmpc to work exactly as described above, using an iSCSI boot if available on a custom Linux kernel. Error recovery/restart capability should be essential in this case. Furthermore, there needs to be a standard mechanism for browsing the snapshots of the ExpressDR backups of a given node. This will build in part on the interface exposed by the EID software or the snapshot handler on the EID node. With a predefined naming convention for snapshots, a listing of snapshot directories may be sufficient, or an appropriate interface may need to be defined for enumerating matching snapshots.
Error recovery/restart capability:
This is highly desirable for large-scale restores and should be implemented similarly to full volume restore.
Security/virtualization/compliance/self-provisioning restore:
near-line data needs to be more secure than data on off-line media (e.g., tape) since the data is active and can be accessed over a network if appropriate rights are given or if a small set of accounts are compromised. One option would be to encrypt the data residing on the near-line storage (which may be encrypted using the local file system if available). This will slow down the immediate availability recovery, but the increased security may make it worthwhile. Double protection of disks and/or tapes, especially if they are for long-term archival purposes, is also a prime candidate for encryption.
Protecting the backups of a large number of machines consolidated on a single EID node with only a small number of user accounts (a Backup Express administrator, plus root or Administrator on the EID node) may not be secure enough for most enterprises. Multiple administrators, each with responsibility for and permissions over a set of backup images, may be more acceptable (in which case a supervisor does not necessarily have permissions over all backup images). Some form of RBAC (role-based access control) can be implemented using the existing security mechanisms of Windows 2003 or Linux 2.6.
Since full images of application servers are stored as backup images on the EID node, these image sets (at various discrete points in time in the past) are prime candidates for virtualization. With existing or OS-supplied virtualization software, each client node or application server may be virtualized as it appeared at some point in time in the past. The ability to securely virtualize machine state, where only authorized personnel have access to the machine's data, enables an enterprise to implement just-in-time virtualization for restore, compliance, analysis, or other important business purposes without administrator involvement.
Regulatory compliance and litigation discovery are important applications of the EID paradigm, in which the data on the EID node can be virtualized to some point in the past for compliance review at very little additional cost. Double protection to disk or tape targeting specialized compliance devices, such as secondary WORM storage or WORM tape, allows an end-to-end solution, from backup through recent restore to long-term archiving, that meets compliance requirements.
Self-provisioned restore refers to data recovery without an administrator, in which end users restore files on their own without help-desk assistance or administrator coordination. This is possible because the data is stored on the EID node with the original file system security preserved. Once instant availability or other techniques have been used to map the volume back to some well-known location, the user can find and restore the data using existing and familiar tools. (The Backup Express GUI can also be used to find and restore data without logging in as an administrator.) This inherent property of the EID architecture enables self-provisioned end-user restore and thus significantly reduces TCO (total cost of ownership).
Example:
FIG. 3 illustrates block-level incremental backup and file-level incremental restore operations in greater detail than FIG. 2, in a manner that illustrates many of the foregoing principles. The illustrated example includes the following events and operations:
2:00 a.m.: A base backup of the primary system 300 is performed during the early-morning backup window. Note that only the allocated blocks (301) are backed up. Unallocated blocks (320) are not transferred to secondary storage unit 330, reducing run time and secondary storage requirements. A snapshot (341) on secondary storage represents all data (volumes/directories/files) on primary storage as of 2:00 a.m.
10:00 a.m.: This is an incremental backup, because all backups after the base backup are automatically incremental. Note that only the blocks (302) that have changed since the base backup are transferred. The snapshot (342) on secondary storage is a synthesized full backup image representing all data (volumes, directories, files) on primary storage as of 10:00 a.m.
11:00 a.m.: Only the blocks (303) that have changed since the 10:00 a.m. backup are transferred. The snapshot (343) on secondary storage represents all data on primary storage as of 11:00 a.m.
The 11:00 a.m. snapshot (343) is then selected from the backup instances (snapshots) displayed on the Backup Express restore screen, and three files (351) are selected from this backup instance for restore.
Dual protection
Double protection protects primary image backups on intelligent disk storage by backing them up to tape or disk, managing their lifecycle, and providing direct restore from tape when the primary backup has expired or the disk storage is not available.
APM to disk (APM 2D):
first backup:
1. An image of the file system is backed up to disk along with application-specific metadata (APPS). This data resides in a form that allows instant availability and/or instant virtualization.
2. File systems/OSes that do not support image backup are backed up to disk as files and reside under a destination directory as a point-in-time copy of the source file system.
Double protection explanation:
Double protection creates at least one (and as many as desired) faithful copy of the first backup on disk or tape. What is of great importance here is that the subsequent backup is a consistent, untransformed copy. Since the first backup is a frozen point-in-time image, the copy may be performed at any time in the future and will still capture the original state of the file system. Pairing is no longer necessary, since any number of copies of the original backup can be made as soon as, or whenever, policy requires it. For supported applications, application-consistent snapshots are saved to tape as if a tape backup had been performed at the time of the original first backup.
Presentation/scheduling:
The GUI will display, in the double protection screen, a list of first backup jobs that are candidates for double protection. This may look like a traditional image or NDMP backup screen, except that the left pane contains backup jobs. (Device selection is initially avoided by implicitly selecting the default cluster and the media pool for the containing node group.) A DP job will be stored as an NDMP job with the first backup job name or first backup jobid as part of the definition. Scheduling may be simpler: it is a backup schedule like APM2D's, but with no base, incremental, or differential settings. A DP job that selects a particular instance (i.e., jobid) of a first backup job may not have an associated schedule, and such a job may be deleted after it is run. When the job handler receives JOB_START and determines that this is a DP job, it issues a CREATE_DP_JOB to the database, specifying the job name or jobid as an argument. Given the jobid, the database can obtain (by looking up the snapid) the backup document for the job. Given the job name, the most recent backup jobid is used to find the backup document for the job. The backup document contains the entire state of the first backup that is needed to build an NDMP job to tape consistent with the original APM2D job. It is possible to create a one-to-one mapping of the tasks in the original job within the DP job to obtain an equivalent set of source statements.
For example, an APM2D job having tasks C:, D:, and APPS: would be converted into three tasks:
/vol/vol1/.snapshot/snapname/qtree1,
/vol/vol1/.snapshot/snapname/qtree2, and
/vol/vol1/.snapshot/snapname/APPS-qtree.
the CREATE _ DP _ JOB will return a temporary JOB name and the JOB handler will allow the NDMP JOB to proceed once it has obtained its definition. Once the job creates a copy to tape, it will appear as if the backup to tape was running at the original time of the disk backup.
The jobid and taskid of the first backup need to be recorded so that the DP job tasks are related to the first backup. As part of CREATE_DP_JOB, the database may pre-catalog the DP job and create catalog entries, which take effect when an actual TASK_CATALOG arrives.
The SVH will also call CREATE_DP_JOB when a triggering condition occurs (snapshots are exhausted, etc.). The SVH can then run the job via JOB_START or the like, following, or even before, the backup.
The integrated scheduling combining disk and tape and life cycle management is beyond the scope of this project and will be considered at a later stage.
Running DP operation:
A double protection job is an APM backup (including proprietary NAS backup methods) coordinated by means of the EID software or an external NDMP data server. The first backup may be an image file or a replicated directory. When the EID software backs these up, it recognizes that a DP backup is in progress and preserves the original format if they are images, or performs a logical backup if they are replicated directories. Foreign agents will back up the image or replicated directory in their native format (dump or tar).
If the DP backup is directed to tape, the legacy job handler path will be used. DP backups directed to tertiary disk (secondary-to-tertiary replication) will be handled by the SSSVH or some external agent (possibly involving a simple script followed by a cataloging utility step).
In all cases, no file history will be generated or captured, since the file history from the first backup makes this unnecessary.
All restores will be done via the node software, regardless of the originating format. (This means that the foreign dump or tar formats must be understood where required.)
Archival format/compliance
For long-term archival or regulatory needs, a DP backup might convert an image backup into a logical backup in some portable format, such as tar, cpio, or pax. These backups can be written to WORM tape or WORM disk to meet compliance requirements. Data may be restored from the archive file using the file history saved during the first backup. Direct access restore (DAR) may require that the file history with the associated fh_info be re-saved, which in turn requires that the file history be generated during the double protection process.
Commonly available utilities such as tar may be used to restore files from an archive format that is independent of Backup Express. The current design leaves the freedom to produce and/or publish different archive formats.
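As a sketch of such a conversion (illustrative only, with hypothetical paths), a mounted backup image can be packaged into a standard tar archive with Python's tarfile module; ordinary tar can then restore files without any Backup Express software.

    import tarfile

    def archive_backup(mounted_image_root, archive_path):
        """Package the contents of a mounted backup image into a portable tar archive
        suitable for writing to WORM tape or disk."""
        with tarfile.open(archive_path, "w") as tar:
            tar.add(mounted_image_root, arcname=".")

    # archive_backup("/mnt/srv01_C_snap-0001", "/worm/srv01_C_snap-0001.tar")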
Cataloguing:
Each DP job may catalog as many tasks as the original backup in 'sscat'. New fields in sscat for the original jobid and taskid can be added to index back to the original job. (Since this is an important construct for first backup jobs, a snapid field can also be added to sscat as part of this work.) DP jobs will have their own equivalent disk entries in sscat, with path names that reflect the secondary disk locations.
Example sscat (partial columns):
| JOBID | TASKID | original JOBID* | JOBNAME | DISK |
| 1000055 | 1 | 0 | First backup | C: |
| 1000055 | 2 | 0 | First backup | D: |
| 1000055 | 3 | 0 | First backup | APPS: |
| 1000100 | 1 | 1000055 | Dual protection | /vol/vol1/qtreeC |
| 1000100 | 2 | 1000055 | Dual protection | /vol/vol1/qtreeD |
| 1000100 | 3 | 1000055 | Dual protection | /vol/vol1/qtreeAPPS |
Catalog compression and job expiration:
Since the first backup and subsequent DP backups are treated as separate jobs, each has its own retention period. When the first backup expires, a check is made, depending on policy, to ensure that a DP backup exists. If it is determined that an unprotected first backup exists, an alert may be issued at that point or a DP job may be triggered.
During catalog compression of the main job, the first backup's catalog items will be retained rather than deleted, in order to preserve the file history. The backup document will be retained as well, since it is necessary for application restore. The original jobid is always retained as part of the initiating job, since it needs to be reflected during restore browsing. If there are multiple DP jobs for a given first backup, they all contain the original jobid, which points to the original ssfile.
This process should be relatively simple, as the single-pass directory table may be all that is required during compression.
And (3) reduction definition generation:
the restore view will return from the original job instance from $ NDMPDATA for the restore representation. The RJI process would also be enhanced to include the file history from the original ssfile in order to create an appropriate restore specification. This process may include generating a tape window involved in the DP backup along with the restore pathname from the original ssfile. The root directory (the only one cataloged) in the ssfile for the DP backup will be ignored.
Restore: fault/location independent
To a conventional NDMP restore, a DP tape backup appears to be a regular NDMP backup, and it can be used to restore directly to any compatible file system. If the original secondary disk location is destroyed or corrupted, these backups can be restored to the original location to recreate the APM2D location, to initiate restores, or to enable virtualization. Such restores can be handled by the job handler as normal NDMP restores, and may be part of the complete solution when no application is involved.
A disaster recovery or full-node backup of a secondary disk node is treated as a single backup and can be used on its own to restore the secondary disk node in case of a disaster.
The APM2D restore view is unchanged, except that where DP backups exist for a first backup they are not displayed. For expired first backups, any DP backups are displayed and presented as nearline backups. The restore browsing process needs to be augmented to return the NDMP backup instance as an APM2D backup. The restore selection is passed to the SSSVH as it is now. (If the job handler implemented the restore side of the APPH process, it could create an NDMP restore for the application restore, but that approach may be more limited in how well it handles fault tolerance.)
After APPH has been contacted for the application restore and the list of files to restore has been determined, the protocol director cycles through the available disk destinations in order to satisfy the restore selection. If this fails (the first backup has expired or the disk destination is unreachable), an NDMP restore job is constructed from tape and run via JOB_START (presumably by the job handler). Once it completes successfully, APPH is contacted again and the restore is completed.
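A minimal sketch of this fallback logic follows, with stub functions standing in for the protocol director, job handler, and APPH interactions (all of the helpers and names here are hypothetical, introduced only for illustration):

```python
class RestoreError(Exception):
    """A disk destination could not satisfy the restore selection."""

# Hypothetical stand-ins for the real components.
def restore_from_disk(files, destination):
    raise RestoreError(f"{destination} unreachable or expired")

def build_ndmp_restore_job(tape_backup_id, files):
    return {"type": "NDMP_RESTORE", "source": tape_backup_id, "files": files}

def run_job(job):
    print("JOB_START:", job)          # run by the job handler

def contact_apph(files):
    print("APPH: completing application restore of", len(files), "files")

def restore_with_fallback(files, disk_destinations, tape_backup_id):
    """Cycle through disk destinations; fall back to an NDMP restore from tape."""
    for destination in disk_destinations:
        try:
            restore_from_disk(files, destination)
            break
        except RestoreError:
            continue                  # try the next available destination
    else:
        # No disk destination worked: restore from tape first, then finish.
        run_job(build_ndmp_restore_job(tape_backup_id, files))
    contact_apph(files)

restore_with_fallback(["C:/data/app.db"], ["/vol/vol1/qtreeC"], "1000100")
```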
"Lazy mirroring"
A primary volume may be mirrored onto a secondary volume according to the following procedure:
Mount the primary volume
Mount the secondary volume
Create a list of the blocks to be copied from the primary volume to the secondary volume
Write new blocks to both the primary and the secondary volume as they arrive
Delete blocks from the block list as those blocks are written
Traverse the list and, whenever bandwidth is available and convenient, copy the blocks encountered in the traversal from the primary volume to the secondary volume
Continue until all the blocks on the list have been copied
The end result of this process is that the secondary volume is synchronized with the primary volume. The technique does not require stopping processing on the primary volume, nor does it impose any limit on the time the copy process may take.
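The sketch below is a simplified, single-threaded illustration of this procedure, with both volumes represented as in-memory byte arrays (an assumption for illustration; a real implementation would hook the write path of the volume driver and copy over the storage network):

```python
BLOCK_SIZE = 4096

class LazyMirror:
    """'Lazy mirroring': new writes go to both volumes immediately, while the
    pre-existing blocks are copied in the background until the secondary
    volume is fully synchronized with the primary."""

    def __init__(self, primary: bytearray, secondary: bytearray, total_blocks: int):
        self.primary = primary
        self.secondary = secondary
        # The list of blocks still to be copied (initially, all of them).
        self.pending = set(range(total_blocks))

    def write(self, block_no: int, data: bytes) -> None:
        """Write an arriving block to both volumes and drop it from the list."""
        offset = block_no * BLOCK_SIZE
        self.primary[offset:offset + BLOCK_SIZE] = data
        self.secondary[offset:offset + BLOCK_SIZE] = data
        self.pending.discard(block_no)

    def copy_some(self, max_blocks: int) -> None:
        """When bandwidth is available, copy a few more pending blocks."""
        for block_no in list(self.pending)[:max_blocks]:
            offset = block_no * BLOCK_SIZE
            self.secondary[offset:offset + BLOCK_SIZE] = \
                self.primary[offset:offset + BLOCK_SIZE]
            self.pending.discard(block_no)

    def synchronized(self) -> bool:
        """Done once every block on the list has been copied."""
        return not self.pending

# Usage: writes are never delayed; the background copy proceeds at its own pace.
total = 1024
mirror = LazyMirror(bytearray(total * BLOCK_SIZE), bytearray(total * BLOCK_SIZE), total)
mirror.write(7, b"\x01" * BLOCK_SIZE)
while not mirror.synchronized():
    mirror.copy_some(max_blocks=64)
```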
The "lazy mirroring" technique may be used, for example, to restore a physical primary device after an "instantly available" virtual device has been put into service, such as after a primary device failure. The virtual device can be used temporarily, because the data on it is as complete as the data at the point in time of the snapshot. The virtual device may, however, be only a temporary solution, and the business needs to be moved back to a replacement primary device once one is available. "Lazy mirroring" provides this capability by allowing processing to continue uninterrupted while the actual copy proceeds at its own pace, minimizing the load on other system components.
The "lazy mirroring" technique can also be used to advantage to "re-silver" a mirror that has crashed or lost synchronization while the primary remains in production.
More generally, the "lazy mirroring" technique can be used wherever it is desirable to copy a volume without stopping it, and to do so without taking special measures to save time.
Eliminating redundancy in backup and file systems
When multiple systems are backed up in one backup operation, it is common for a machine to have a large number of blocks that are identical to blocks on the other machines involved in the backup. When several machines have the same operating system files, or the same application or data files, installed on them, the number of identical blocks grows. Storing blocks with identical content multiple times is redundant. The redundancy involves not only wasted storage, but also wasted bandwidth when the duplicate blocks are transferred and stored.
Furthermore, even within a single file system it is common to have duplicate blocks as a result of file copying. This too represents redundancy.
Such redundancy in the backup scenario can be eliminated by taking a digest of each block written to the backup data set and placing the digest data in a list or database.
The comparison of block digests is preferably performed on the server side.
If the node being backed up has a large number of changed blocks that need to be backed up, it sends the backup server a list of those blocks together with their digests. (In some situations the node has already created a list of block digests for some other purpose, such as determining which of its own blocks have changed, so creating the digests does not require a separate step.)
The server then compares the block digests and requests, for backup, only those blocks it determines it does not already have (the server's list or database of block digests is stored in a way that allows quick lookup using the digest as the key). The full list of blocks sent by the remote node (those actually transferred plus those the server determined it already had) is stored as part of the backup catalog.
Preferably, if the node being backed up has only a few changed blocks, they are simply sent and the redundancy check is skipped.
A similar technique can be employed within a single file system to eliminate redundancy. Each block to be written to the file system is digested, and the digest is compared with the digests of the blocks already stored (again, the list or database of block digests is kept in a form that allows quick lookup using the digest as the key). If a block with the same content already exists on the file system, the existing block reference is used and the duplicate block is not written. When a file is deleted, its blocks are deallocated from that file; if other files use the same blocks, those allocations remain valid (a block becomes free only when no file references it).
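A minimal sketch of the digest-based redundancy elimination described above follows. SHA-256 is used as the block digest purely as an illustrative choice; the text does not prescribe a particular digest function, and the in-memory dictionary stands in for the server's digest database.

```python
import hashlib
from typing import Dict, List

class DedupStore:
    """Server side: each unique block is stored once, keyed by its digest."""

    def __init__(self) -> None:
        self.blocks: Dict[bytes, bytes] = {}   # digest -> block contents

    def missing_digests(self, digests: List[bytes]) -> List[bytes]:
        """Return only the digests whose blocks the server does not already have."""
        return [d for d in digests if d not in self.blocks]

    def store(self, blocks: List[bytes]) -> None:
        for block in blocks:
            self.blocks[hashlib.sha256(block).digest()] = block

def backup_changed_blocks(store: DedupStore, changed_blocks: List[bytes]) -> List[bytes]:
    """Node side: send digests first, then transfer only the blocks the server lacks.

    The full digest list is returned so it can be recorded in the backup catalog,
    covering both the transferred blocks and those the server already had."""
    digests = [hashlib.sha256(b).digest() for b in changed_blocks]
    wanted = set(store.missing_digests(digests))
    store.store([b for b, d in zip(changed_blocks, digests) if d in wanted])
    return digests

# Two nodes with mostly identical OS blocks: the second backup transfers almost nothing.
node_a = [b"os-block-1", b"os-block-2", b"data-a"]
node_b = [b"os-block-1", b"os-block-2", b"data-b"]
store = DedupStore()
backup_changed_blocks(store, node_a)
backup_changed_blocks(store, node_b)
print(len(store.blocks))   # 4 unique blocks stored instead of 6
```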
Examples: fast application recovery
The following is a series of examples illustrating the fast application recovery (FAR) provided by the present invention.
Introduction to the examples:
The examples illustrate the ability, provided by the present invention, to bring applications online quickly on a standby or original server by connecting to virtual storage created on a file manager (such as a NAS file manager) out of the backup image.
A consistent volume image, together with the associated application-consistent state from the source node, is kept nearline by the backup, typically on a NAS file manager. The user deals with the application's logical objects, while the Backup Express agent creates a hot, base backup of the physical objects that make up the application. Incremental-forever imaging ensures that only blocks that have changed since the last backup are copied to the file manager, while keeping every database backup intact. Because the application data and state are nearline, a restore is accomplished very quickly by restoring a point-in-time copy of the application files and then bringing the application online with a small amount of redo log replay. FAR rebuilds the storage as it was at the time of backup, re-establishes the application's logical-to-physical relationships, and then recovers the application to a fully functional instance.
Description of the mechanism:
Application restore can be summarized as a two-step process: restoring the data files, followed by application recovery (sometimes referred to as roll-forward recovery). The user selects a backup instance or PIT image (usually the most recent) based on the nature of the disaster, the type of user error, or other business requirements. The first step is accomplished by creating addressable virtual storage (LUNs) on the fly on the file manager from the PIT volume images selected by the user. These LUNs are then made visible to the target node in question and are attached as local disks on the restore target via an iSCSI login to the file manager. This step is nearly instantaneous, since no actual data movement is involved. Once the application data files are visible in the target node's local namespace, the application is recovered programmatically using the appropriate application-specific APIs. This may require additional log files, which are retrieved as needed from the backup location on the file manager. This brings the application instance back to, at most, the point in time of the backup; if the current logs are available, a roll-forward to the point of failure is possible. Because the backup is a snapshot backup, the application was in hot backup mode only for a very short time, so very little processing is needed to bring the database to a consistent state. These relatively simple and rapid steps bring the application back within minutes of starting the FAR process. FAR is orders of magnitude faster than a traditional restore, reducing application downtime from days or hours to minutes, and it scales independently of the size of the data set.
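The following sketch outlines the two-step flow just described as orchestration code. Every helper here (create_lun_from_snapshot, map_lun_to_initiator, iscsi_attach, recover_application) is a hypothetical stand-in for file-manager and application-specific APIs; it is not the actual Backup Express or filer interface.

```python
# Hypothetical stand-ins for the file manager and application APIs.
def create_lun_from_snapshot(filer: str, snapshot: str, volume: str) -> str:
    """Expose a point-in-time volume image as an addressable LUN (no data copied)."""
    return f"{filer}:/vol/{snapshot}/{volume}.lun"

def map_lun_to_initiator(lun: str, initiator_iqn: str) -> None:
    print(f"mapping {lun} to {initiator_iqn}")

def iscsi_attach(target_node: str, lun: str) -> str:
    """iSCSI login from the restore target; the LUN appears as a local disk."""
    return f"/dev/disk/by-path/{lun.replace('/', '_')}"

def recover_application(target_node: str, data_disks: list, extra_logs=None) -> None:
    """Step 2: application-specific recovery, rolling forward with any available logs."""
    print(f"recovering application on {target_node} from {data_disks}, logs={extra_logs}")

def fast_application_restore(filer, snapshot, volumes, target_node, initiator_iqn,
                             current_logs=None):
    # Step 1: make the selected PIT image addressable and visible.
    # Near-instantaneous, since no actual data movement is involved.
    disks = []
    for volume in volumes:
        lun = create_lun_from_snapshot(filer, snapshot, volume)
        map_lun_to_initiator(lun, initiator_iqn)
        disks.append(iscsi_attach(target_node, lun))
    # Step 2: bring the application back, rolling forward if current logs exist.
    recover_application(target_node, disks, extra_logs=current_logs)

fast_application_restore("nas1", "snap_1100am", ["APPS", "LOGS"],
                         "standby-node", "iqn.2005-06.com.example:standby")
```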
Post-restore:
FAR is not the end of the story. Once the FAR is complete, block change tracking can be activated and, if needed, a restore to local disk can proceed in the background while the application is active and running. Because changed blocks are tracked, incremental backups can resume from the point in time of the restore. The application can eventually fail back to the original node, or to another node, with minimal downtime and with all changes made since the restore preserved.
Prerequisites:
The source and target nodes must be running and APM-licensed. (Applications may require separate licenses, where applicable.)
The NAS device or secondary storage unit requires an iSCSI license.
The target node needs the iSCSI initiator software installed (iSCSI HBAs are also supported).
The standby node needs to be pre-configured with a minimal application installation.
Platform/application support includes Windows XP/Windows 2000/Windows 2003 with SQL Server 2000 (SP2+), Exchange 2000 (SP2+)/Exchange 2003, SQL Server 2005, Oracle, and Linux.
The following sections examine various use cases for fast application restore, and the life cycle of the data after the restore:
Example 1. Application IV (instant verification)
The problem: because backups are never truly verified, restores are always speculative. Tape is unreliable. Verification is usually equivalent to checking the internal consistency of the backup image; application consistency and recoverability are incidental.
The solution: Application IV verifies an application backup almost instantly by restoring it (FAR) to an alternate verification node, or to the original node where possible. The application is then recovered to complete the process. This can be scheduled so that every backup is checked for integrity, with no additional fire drills needed to recreate a disaster scenario.
The PIT image used: usually the most recent, but possibly older images if verifications are batch processed.
Where to perform: typically on an alternate node on which a minimal application installation has previously been created. The same node can be the source of the backup if the application supports it (e.g., Exchange 2003 configured with a Recovery Storage Group, or SQL Server with the option to rename the database being verified). Verification on the original node is generally not recommended, because it puts additional strain on the application server.
Modes:
Lightweight verification: the application (typically a database) restarts/recovers correctly, thereby verifying the integrity of the backup.
Comprehensive verification: if needed, further (more resource-intensive) verification can be performed with application-specific techniques to confirm that all database pages are clean and/or that the logical objects are sound. (Imagine a database query spanning multiple tables whose result is a definitive proof of database health.)
Application-specific considerations:
Exchange: mounting the store is itself a significant verification step. Further verification may be performed with 'eseutil' on the alternate node.
SQL Server: attaching the database is itself a significant verification step. Further verification may be performed via 'DBCC' or by running SQL queries.
Subsequent operation:
None. Verification is a transient operation, and an iSCSI logoff or reboot clears the machine state. Application IV can be configured so that the next verification run clears the previous verification mappings. The machine state of the drives holding the mappings need not be preserved, so further backups of the alternate node are unnecessary.
Example 2. Application service continuity: IA (instant availability)
The problem: downtime must be held to minutes, with the most recent application backup state restored. (Depending on the frequency of backups, a very small amount of data may be lost.)
The solution: FAR restores the application instance on a standby node, or on the original node, in near-instantaneous time, minimizing downtime. The application state at the time of backup is restored. Unless the application logs are available (salvaged from the original node or from a replicated location), changes made after the last backup are lost. If the current application logs are available and applied, the application can be rolled forward to the time of failure with no data loss.
The PIT image used: typically the most recent image, but possibly an image predating the event, depending on the cause of the disaster (e.g., a virus attack).
Application-specific considerations:
Exchange 2003: complex scenarios like 'dial tone recovery' (creating an empty database, then performing a database swap once the restore to the RSG, the Recovery Storage Group, completes) no longer need to be orchestrated, because FAR is fast and painless enough to reduce application downtime to a minimum.
SQL Server: standby databases, replication, and/or log shipping are expensive and management-intensive options for SQL Server availability. FAR is an easy-to-configure option with lower administrative cost that combines fast backup with fast on-demand availability.
Example 2a. With background restore
The problem: the application data eventually needs to be restored back to local or SAN-attached storage; serving it from secondary storage may be only a temporary option.
Where to perform: depending on the nature of the disaster and on what has been provisioned in advance, usually to the original application node or to the nearest suitable node.
Subsequent operation (LAR): the application is online and users can work with it within minutes. The restore to local disk continues in the background while the application is active and running. After all data has been restored to the local disk, the application is briefly stopped or paused and the iSCSI mapping is deleted. The local disk is then promoted to be the only application disk, and the application resumes or is restarted. The application is unavailable only for this final brief period, if at all.
Regular backups (BAR) then kick in to protect the application on the newly restored volumes. (The cycle repeats if the application ever needs to be restored again.)
Example 2b. Without background restore
The problem: no background restore is needed, either because the standby node is temporary and degraded performance is acceptable (failback may be imminent once the original site has been rebuilt), or because the file manager holding the backup image is powerful enough to host the application.
Replicated destination: a high-end file manager (possibly at a remote site) may mirror the backup image stored at the original backup destination (e.g., onto tertiary storage). That configuration lends itself to a restore redirected to the high-end file manager rather than to the original file manager. In this case no background restore to local disk is required, since the file manager storage is high-end and persistent.
Quality of the restore storage:
A. Low: iSCSI mounted to secondary storage. The application may well survive gracefully while performing its I/O over iSCSI, particularly if this is a temporary situation; it may be desirable to fail back as soon as higher-quality storage and nodes have been repaired or rebuilt.
B. High: iSCSI mounted to high-performance storage established by secondary-to-tertiary replication, or copied from the original file manager to high-performance storage after backup. The application will perform adequately, and this may be a permanent solution; it does not, however, preclude failback.
Subsequent operation: if desired, backups (BAR) continue from the target machine, or a NAS block-level backup may be initiated, since the storage has effectively been migrated to the NAS device. The LUNs on the file manager may be cloned to break their dependency on the original snapshot, since the persistent storage on the file manager now has its own storage lifecycle.
Application ERF (eventual rapid failback):
The application may eventually fail back to the original node, or to a separately rebuilt node, as follows:
1. The application is temporarily shut down on the currently running node.
2. If a replication relationship exists between the secondary and the alternate storage, and the original secondary is near the final destination, the replication source and destination are reversed and the secondary is resynchronized and updated from the current storage. Otherwise, go to step 3. (Assuming failback is initiated reasonably soon after the point of failure, this should complete quickly.)
3. A FAR to the desired node is performed.
4. The application instance comes back in the state it had on the standby node (including the most recent changes), and normal operation can continue.
Example 3. Fine-grained restore from a whole-application backup
The problem: for most applications, performing a fine-grained restore from a backup of the entire application is impractical, while granular backups of individual application objects are unreliable and extremely resource-intensive. Given the current state of the art in backup/restore of fine-grained application objects, a FAR (which completes very quickly) to an alternate running instance, followed by restoring the fine-grained objects with application-specific tools, is a very attractive option.
The solution: FAR, followed by application-specific tools to drill down into and inspect the application objects. These can then be merged back into the original instance or extracted for external use.
The PIT image used: depends on when the fine-grained object was deleted, or which image holds it in an uncorrupted state.
Where to perform: usually to an alternate instance on a different node, or to the original node (depending on setup and requirements).
Subsequent operation: typically none; because the need is temporary, the instance is torn down and the iSCSI mapping is removed.
Application-specific considerations:
Exchange 2000: using FAR followed by ExMerge.exe or other tools, it is possible to restore a single mailbox without any backup-time performance penalty.
Exchange 2003: the powerful combination of the Recovery Storage Group and FAR makes single-mailbox, or even sub-mailbox, restore from any point in the past a very quick and painless option.
SQL Server: table-level restore. Tables can be restored from an alternate FAR instance using 'bcp' or other tools.
Example 4. Application instant replication for analytics, reporting, and data warehousing
The problem: obtaining a second copy of the data for analysis or reporting has usually been a luxury of large businesses, achieved with expensive split-mirror techniques and large amounts of disk space. With FAR it is not only feasible at much lower cost, but the data can be replicated to multiple destinations nearly instantaneously. The business can explore more analytical possibilities and gain a competitive advantage.
The solution: use FAR to replicate to one or more nodes as frequently as desired.
The PIT image used: usually the most recent, but a point in time in the past (perhaps last year's Christmas sales data) may be chosen for analytical or business reasons.
Where to perform: to an alternate node; the original node continues to run the production application.
Subsequent operation (LAR): if a copy needs its own timeline or a long lifetime, it needs to be backed up. Backups continue with incremental changes from the restored copy.
Example 5. Alternate-node restore for long-term-retention tape backup
The problem: additional protection and/or long-term retention may require tape backups. Nearline images expire relatively quickly, so tape backups are almost always necessary for long-term retention.
The solution: an image backup of the iSCSI-mapped volumes to tape. The tape image can then be restored to a node at any time in the future, at any granularity.
The PIT image used: typically varies according to the backup schedule (staggering) and how many instances need to remain nearline.
Where to perform: to tape attached to a standby node. This can also serve as instant verification (IV) of the application.
Subsequent operation: an image backup to tape is performed for one or more FAR volumes (requires a license). The iSCSI mappings are cleared after a successful backup, and the stage is set for the next cycle.
Example 6. FAR for storage migration
The problem: directly attached or legacy SAN storage needs to be migrated to block-oriented NAS file manager storage for reasons of cost, consolidation, performance, or manageability.
The solution: once the block-level backup to the file manager has completed, the seed of the migration has been sown. The backup image or snapshot image can be copied to a high-end file manager to simplify the process further. FAR effectively completes the migration.
The PIT image used: usually the most recent.
Where to perform: to a new application node, which connects to the LUNs created on the file manager.
Subsequent operation: the LUNs are then cloned (in the background) while the application is active and running, to free them from the snapshot that contains them. The snapshot can then be recycled to reclaim space. Backups (BAR) may continue against the volumes backed by the LUNs, or against the file manager volume or qtree containing the LUNs.
Example 7. FAR4C (FAR for compliance)
The problem: compliance, often required for legal reasons, typically involves expensive solutions built on proprietary hardware. Directing Backup Express image backups to secondary WORM storage provides an affordable solution that can instantly and accurately recreate the machine state at some time in the past.
The solution: FAR to a standby node to recreate the application state or the entire machine state.
The PIT image used: depends on whether the review is periodic (e.g., an annual report) or on demand; depending on the reason for the scrutiny, it may be any point in time in the past.
Where to perform: any standby node.
Subsequent operation: typically transient; the instance is dismantled once the inquiry has been satisfied. If desired, the entire machine state can be archived to WORM tape (as in Example 5) for offline review or portable compliance.
Further examples
Figs. 4A and 4B illustrate an instant availability and recovery scenario that uses instant availability to virtually eliminate service interruption during the recovery process:
11:00 a.m.: the last routine backup to the NAS 107 before the disk failure on the originating node 300.
12:00 p.m.: volume D 406 fails.
12:05 p.m.: within minutes, the 11:00 a.m. backup instance, accessed through logical unit number (LUN) 411 on the secondary storage unit, is mapped to drive letter D via iSCSI (412). Service continues; the iSCSI connection to the secondary storage unit 107 is transparent to users. Note that data changes are stored in the "valid data area" 414 on the secondary storage unit (the white background block); the 11:00 a.m. backup instance 413 itself is read-only and does not change.
12:05-1:00 p.m.: the failed disk 406 is replaced with a new disk 421. Normal business use continues via the active iSCSI connection to the secondary storage unit 107.
1:00-1:45 p.m.: the 11:00 a.m. backup instance is transferred (451) to the originating node 300 and its new disk 421. Business continues uninterrupted, via the active iSCSI connection, until the system is brought down at 2:45 a.m.
2:45-3:00 a.m.: the administrator performs the data resynchronization ("lazy mirroring") (452). During this time the system is unavailable to users. Instant availability gives the administrator the flexibility to perform the resynchronization (452) during an overnight maintenance window.
3:00 a.m.: recovery is complete. The instant availability connection is ended by remapping volume D to the new disk 421.
It will be apparent that the embodiments described herein fulfill the objects of the invention stated above. While the preferred embodiments have been described in detail, it will be apparent to those skilled in the art that the principles of the invention may be embodied in other devices, systems and methods without departing from the scope and spirit of the invention as defined by the appended claims.
Claims (1)
1. A method of rolling back to an earlier state, comprising:
addressing the volume with a unique job identifier, and locking and unmounting the addressed volume;
obtaining a snapshot of the volume to refresh a change log;
mapping the snapshot as a mountable volume;
logically ANDing all change logs since the time of the desired state to be restored to, to create a bitmap file;
adding the contents of the bitmap file to a content file associated with the snapshot;
utilizing the bitmap file to restore only the required set of blocks from the mapped volume.
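As an illustration only, and not the claimed implementation, the sketch below shows the block-selection and selective-restore idea behind the final steps: change logs since the chosen point in time are combined into a single bitmap (shown here as a bitwise OR, so that any block modified in any interval is flagged, which is an assumption of this sketch), and only the flagged blocks are copied back from the mapped, read-only snapshot volume.

```python
BLOCK_SIZE = 4096

def blocks_to_restore(change_logs_since_state) -> int:
    """Combine the per-interval change logs into one bitmap.

    Each change log is an integer bitmap in which bit N set means block N was
    modified during that interval; combining them flags every block modified
    since the chosen state."""
    combined = 0
    for log in change_logs_since_state:
        combined |= log
    return combined

def roll_back(volume: bytearray, mapped_snapshot: bytes, bitmap: int) -> None:
    """Copy back only the flagged blocks from the mapped snapshot volume."""
    block_no = 0
    while bitmap >> block_no:
        if (bitmap >> block_no) & 1:
            start = block_no * BLOCK_SIZE
            volume[start:start + BLOCK_SIZE] = mapped_snapshot[start:start + BLOCK_SIZE]
        block_no += 1

# Example: two change logs since the chosen point in time flag blocks 1 and 3,
# so only those two blocks are copied back from the snapshot.
snapshot = bytes(4 * BLOCK_SIZE)                 # the mapped, read-only image
volume = bytearray(b"\xff" * (4 * BLOCK_SIZE))   # the current (changed) volume
bitmap = blocks_to_restore([0b0010, 0b1000])
roll_back(volume, snapshot, bitmap)
print(bin(bitmap))   # 0b1010
```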
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US69371505P | 2005-06-24 | 2005-06-24 | |
| US60/693,715 | 2005-06-24 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1159285A1 HK1159285A1 (en) | 2012-07-27 |
| HK1159285B true HK1159285B (en) | 2014-04-11 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN101495970B (en) | System and method for high performance enterprise data protection | |
| EP1540510B1 (en) | Method and apparatus for managing data integrity of backup and disaster recovery data | |
| US6269381B1 (en) | Method and apparatus for backing up data before updating the data and for restoring from the backups | |
| US6366986B1 (en) | Method and apparatus for differential backup in a computer storage system | |
| US7096250B2 (en) | Information replication system having enhanced error detection and recovery | |
| US7584339B1 (en) | Remote backup and restore operations for ISB protocol systems | |
| US7725669B1 (en) | Backup and restore operations using coherency groups for ISB protocol systems | |
| CN112380050A (en) | Method for using snapshot in system backup | |
| US7487310B1 (en) | Rotation policy for SAN copy sessions of ISB protocol systems | |
| US7587565B1 (en) | Generating automated and scheduled SAN copy sessions for ISB protocol systems | |
| AU2012200446B2 (en) | System and method for high performance enterprise data protection | |
| HK1159285B (en) | System and method for high performance enterprise data protection | |
| HK1136048B (en) | System and method for high performance enterprise data protection | |
| HK1173805B (en) | System and method for high performance enterprise data protection | |
| WO2003003209A1 (en) | Information replication system having enhanced error detection and recovery |