CN1294514C - Efficient computer file backup system and method - Google Patents
- Publication number
- CN1294514C (application number CNB028161971A)
- Authority
- CN
- China
- Prior art keywords
- file
- hash key
- backup
- specific
- described specific
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
- G06F11/1453—Management of the data involved in backup or backup restore using de-duplication of the data
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/953—Organization of data
- Y10S707/959—Network
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99951—File or database maintenance
- Y10S707/99952—Coherency, e.g. same view to multiple users
- Y10S707/99953—Recoverability
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99951—File or database maintenance
- Y10S707/99952—Coherency, e.g. same view to multiple users
- Y10S707/99955—Archiving or backup
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Techniques For Improving Reliability Of Storages (AREA)
Abstract
Description
The present invention relates generally to a method for backing up and restoring data files and programs on a computer system. More particularly, it relates to an efficient method for determining whether a file or program has previously been backed up, or whether a backup copy of that file already exists, and then backing up only those programs that have not previously been backed up and for which no backup copy exists. The system and method thereby make efficient use of bandwidth when files of computers and/or computer systems are backed up locally or remotely.
Conventional methods for backing up computer programs and data files often consume large amounts of expensive network bandwidth and excessive processor (CPU) time. Currently, many backup processes back up the entire program and data repository of a computer or computer system, which duplicates backed-up files and programs and requires large amounts of network bandwidth and excessive storage media (such as magnetic tape or compact discs (CDs)).
The networks of many organizations often include data centers ("server farms") for storing and managing large amounts of Internet-accessible data. Data centers often include several computer systems, such as Internet servers, employee workstations, file servers, and so on. Such data centers often have scalability problems with conventional backup systems: the bandwidth and storage available are insufficient for the large-scale backups that data center environments require. A system that is scalable and can grow along with the organization would be beneficial.
Some savings in bandwidth and storage media can be achieved through incremental backup methods, which back up only files that have been changed or updated. These methods, however, do not solve the problem that duplicate files residing on different computers on one network, or even on different networks, are often still backed up in duplicate, consuming large amounts of storage media.
For example, data files are often shared among many people, and duplicate copies reside on many different computers, resulting in multiple copies of a file across one or more computer networks. Furthermore, computers often use duplicate program and data files to run operating systems and applications. In a network running Microsoft Windows, for example, each computer may hold the same operating system files and programs. Backing up the entire network with conventional methods can produce multiple backups of those files and programs, wasting storage media. A means of eliminating duplication of backed-up files and programs would be desirable, with the potential benefit of more efficient use of storage media, processing time, and network bandwidth.
Furthermore, conventional backup methods implemented by organizations often use many computer servers to perform the backup, often to tape media. This results in distributed storage of data backups and, again, in duplication and waste of both media and processor time.
Still further, a distributed backup process typically creates the need to store many backup tapes or other similar backup media, and requires a method of tracking the multiple media. Such a system is often difficult to restore from, particularly if incremental backup processes are used. The correct storage media must be located and loaded in the correct order. Tape restoration is a lengthy, time-consuming process. The restore process is often so inefficient and error-prone that it is ineffective, resulting in lost data and even lost productivity, because programs must be reinstalled and data must be rebuilt. A more efficient, easier-to-use backup system that leads to a more effective and more easily implemented restore process would benefit organizations that use computer systems.
The present invention relates to improvements in backup technology. More specifically, it creates a solution for large-scale server backup in Internet data center and enterprise data center environments, and thereby yields a solution for disaster recovery and data protection.
The present invention is an improved system and method that uses hash keys of file content to back up computer files and computer programs more efficiently and more effectively.
The first step in the process is to scan the file system on the target machine (the computer system to be backed up) and to create hash keys, that is, a unique digital code for each file to be backed up. In a preferred embodiment, to reduce processing time, hash keys are created only for files whose modification date attribute is more recent than the last backup.
The resulting hash keys are stored in a local database, that is, a database on the target computer, for further comparison during, for example, the current and future backup sessions. The local database also contains the full path of each backed-up file.
The stored hash keys are checked against the previous hash key entries in the local database. In this way, the hash keys are used to check each local file to determine whether it was previously backed up on the target system. Hash keys not found in the key list of the local database are used in the next step of the process.
The hash keys not found in the local hash key database are checked against the hash keys of the files stored on the central storage server. This check determines whether a particular file already exists on the central storage server. The file may exist there as a backup from another server or system, or as the result of a previous backup operation.
For example, the decision whether to back up is made file by file rather than block by block. This greatly reduces the number of comparisons and the size of the local database, and is very well suited to server farms, in which not just data blocks but often complete files are duplicated across several servers.
Brief description of the drawings
Figure 1 is a block diagram showing the main steps of the backup process according to one aspect of the invention;
Figure 2 is a block diagram showing the backup decision process according to one aspect of the invention;
Figure 3 is a block diagram showing one implementation of a system for carrying out the method of the invention;
Figure 4 is a block diagram showing a more detailed implementation of the backup subsystem of the invention.
Traditionally, whether an incremental or a full backup of a computer, server, or system is performed, backup solutions greatly increase network traffic and can consume enormous storage capacity. The present invention uses content hash keys to make intelligent decisions about whether to back up particular data, and uses central storage capacity to provide more efficient and more effective backup storage and restore operations.
The present invention is a system and method that uses hash keys of file content to back up computer files and computer programs more efficiently and more effectively. In this description, the terms "file", "program", "computer file", "computer program", "data file", and "data" are used interchangeably, and depending on the context, the use of any one of them may imply the others.
The invention uses a process based on a hashing mechanism to check whether a file is unique within the backup system. Only files that are unique and not yet backed up are stored on the central storage system, which makes efficient use of network bandwidth and storage media. The process matches each newly created content key against all previously generated hash keys (using local and/or centralized lists) to reach a backup decision, which yields a holistic approach to performing backups and makes the restore function more effective and less cumbersome. By reducing duplication in both network traffic and backup file storage, the resulting method minimizes bandwidth consumption and storage capacity use. This is particularly useful for backing up operating system files and commonly used applications.
Figure 1 provides an overview of the method for one implementation of the backup process according to the invention. The first step in the process, shown by block 10, is to scan the file system on the target computer/system (the individual computer or computer system to be backed up) and to create a content hash key, for example in 32- or 64-byte mode, as shown by block 12. The hash key is a unique digital code for each file to be backed up: it is unique for each unique file, and it is the same for identical copies of a file. In this way, the hash key becomes a unique identifier for the file and any identical copies of it. Accordingly, if two files have the same hash code, they are identical and can, and will, be treated as such. An industry-standard hashing process such as MD5 can be used.
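The scanning and hashing step of blocks 10 and 12 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `content_hash_key` and the chunk size are assumptions, and MD5 is used only because the description names it as one possible hashing process.

```python
import hashlib


def content_hash_key(path, chunk_size=1 << 20):
    """Return a hex digest computed from the file's content only.

    Hypothetical helper: the name and chunk size are illustrative.
    The file name and path do not influence the key, so identical
    copies in different locations yield the same key.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the key depends on content alone, two copies of the same operating system file on different machines map to one key, which is what allows the later steps to skip the duplicate.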
The resulting hash keys are stored in the local database 404 (Figure 3) for further comparison during the current and future backup sessions. This is represented by block 14 in Figure 1. The path and/or file name of the file corresponding to each hash key is stored together with the hash key.
A refinement of this process could be to append the hash key to the computer file itself. Files that have already been hashed could then bypass the hashing process, providing further savings in computer processing. However, not all files can be appended to in this way, so this refinement may not be feasible for all computer file types.
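The stored hash keys are checked against the previous hash key entries in the local database 404, as shown by block 16 in Figure 1. In this way, the hash keys are used to check whether each local file has previously been backed up on the target system. Hash keys not found in the local database are used in the next step of the process. This makes efficient use of computer resources, because only files that are not recorded as recently backed up, or at least recently processed, undergo further processing.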
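The hash keys not found in the local hash key database are now checked against the files stored in the central database 408, as shown by block 18 in Figure 1. The path and/or file name of the file corresponding to each hash key is stored together with each hash key stored in the local database. The hash key is used to determine whether the corresponding file already exists on the central storage server 400 and therefore does not need to be backed up. The file may exist there as a backup from a different target computer 300 or even from a different target network. The principle is to store a single copy of each unique file in the central storage system, no matter how many different target computers may contain that same, identical file.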
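If there is no match for a given hash key in the central database, the hash key is added to the central database 408 and the corresponding file is uploaded (block 20 in Figure 1) to the central storage system 400 (block 22), which manages the files and the hash key lists. A record of the process can be kept by the server (see log archive block 22a). If desired, files to be archived are encrypted for security reasons (block 24) and compressed to reduce storage media requirements (block 28). For example, an encryption key can be generated by taking the hash key and transforming it with a known but secure algorithm.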
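The central check, upload, logging, and compression steps might look like the sketch below. The class `CentralStore` and its method names are hypothetical, the store is an in-memory dictionary standing in for database 408 and the storage devices, and encryption is deliberately omitted because the Python standard library provides no cipher; only `zlib` compression is shown.

```python
import zlib


class CentralStore:
    """Hypothetical single-instance store keyed by content hash key."""

    def __init__(self):
        self.blobs = {}  # hash key -> compressed file content
        self.log = []    # simple record of what happened (cf. block 22a)

    def missing(self, keys):
        """Return the keys for which no file is stored centrally yet."""
        return [key for key in keys if key not in self.blobs]

    def upload(self, key, content):
        """Store content under key unless an identical file already exists."""
        if key in self.blobs:
            self.log.append(("skipped", key))
            return False
        self.blobs[key] = zlib.compress(content)
        self.log.append(("stored", key))
        return True

    def fetch(self, key):
        """Return the original, decompressed content for key."""
        return zlib.decompress(self.blobs[key])
```

A second upload of the same key returns `False` and stores nothing, so the store holds one copy per unique file regardless of how many clients submit it.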
Finally, a dispatching process is performed (block 30 in Figure 1). Based on the hash key, the dispatching process decides to which location the file needs to be dispatched and on which storage device (32a, 32b, 32c, 32d ... 32n) it should be stored. The storage devices may be centrally located to increase efficiency, but the invention can also use distributed or even remotely located devices. The hash keys can be used to dispatch files to different locations in the storage network.
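The description does not say how the dispatching process maps a hash key to a device. One simple possibility, shown here purely as an assumption, is to interpret the hexadecimal key as an integer and take it modulo the number of devices; the function name `choose_storage_device` is illustrative.

```python
def choose_storage_device(hash_key, devices):
    """Pick a storage device deterministically from a hex hash key.

    Assumed scheme (not stated in the source): key value modulo the
    number of devices. The same key always maps to the same device.
    """
    if not devices:
        raise ValueError("at least one storage device is required")
    index = int(hash_key, 16) % len(devices)
    return devices[index]
```

With this scheme a restore can locate a file from its key alone, without a separate placement table, though adding or removing a device would change the mapping for existing keys.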
In a preferred embodiment, the stored files are renamed using the hash key as the file name. This makes retrieval of a file simpler and faster. On restore, the original file name is recovered by cross-referencing the hash key with the file name and/or file path on the machine being restored.
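Storing a file under its hash key and recovering the original name on restore can be sketched as below. The helpers `archive_file` and `restore_file` are hypothetical, a plain dictionary stands in for the local database, and a local directory stands in for the central storage.

```python
import hashlib
from pathlib import Path


def archive_file(source, archive_dir, local_db):
    """Copy source into archive_dir, named by its content hash key."""
    data = Path(source).read_bytes()
    key = hashlib.md5(data).hexdigest()
    Path(archive_dir, key).write_bytes(data)
    # Cross-reference kept on the client: original path -> hash key.
    local_db[str(source)] = key
    return key


def restore_file(original_path, archive_dir, local_db):
    """Recreate the file at its original path using the stored key."""
    key = local_db[str(original_path)]
    data = Path(archive_dir, key).read_bytes()
    Path(original_path).write_bytes(data)
```

The archive itself never needs to know the original names; they live only in the cross-reference, which is why the description recommends backing that database up as well.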
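The flowchart in Figure 2 shows the file backup decision process in more detail. The local scan is shown by the steps in block 100. Files are scanned in step 102, and hash keys are formed in step 104. In a preferred embodiment, hash keys are computed only for files whose modification or creation date attribute is more recent than the date of the last backup. Each hash key is compared with the locally stored list of hash keys in the local database 404. The local database 404 contains one record for each file that has previously been backed up, consisting of the hash key and the full path and name of the file (step 106). Files with a match are not backed up (step 110), whereas files whose hash keys do not match the local list (step 106) require further processing (the steps in block 200). At least for each non-matching file, a new record consisting of the hash key and the full path and name of the file is stored in the local database. The hash keys of the non-matching files are collected for forwarding (step 108) and forwarded for comparison with the centrally stored key list in the central database 408 (step 202). If a key matches a previously centrally stored hash key (step 204), the file is not backed up (step 210). Only if there is no match (step 204) is the file backed up. The hash key is then stored in the central database 408, and the file may undergo the processing described above (that is, encryption and compression) before it is backed up or archived to storage.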
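The two-stage decision of Figure 2 can be condensed into one function. This is a sketch under stated assumptions: files are passed in as `(path, mtime, content)` tuples rather than read from disk, the local database is a dictionary from hash key to a list of paths, the central key list is a set, and the name `decide_backups` is illustrative.

```python
import hashlib


def decide_backups(files, last_backup, local_db, central_keys):
    """Return (path, key) pairs for files that must be uploaded.

    files        -- iterable of (path, mtime, content) tuples
    last_backup  -- timestamp of the previous backup
    local_db     -- dict: hash key -> list of paths (stands in for 404)
    central_keys -- set of hash keys (stands in for 408)
    """
    to_upload = []
    for path, mtime, content in files:
        if mtime <= last_backup:
            continue  # unchanged since the last backup: no key is computed
        key = hashlib.md5(content).hexdigest()      # step 104
        known_locally = key in local_db             # step 106
        local_db.setdefault(key, []).append(path)   # record path with key
        if known_locally:
            continue                                # step 110
        if key in central_keys:                     # steps 202 and 204
            continue                                # step 210
        central_keys.add(key)
        to_upload.append((path, key))
    return to_upload
```

Note that most files never reach the central comparison: the date filter and the local lookup remove them first, which is where the bandwidth saving described in the text comes from.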
The above process can be further refined by keeping historical copies of the files and historical copies of the hash lists 404, 408, so that any individual machine can be restored to the state of its file system at a given moment in the past. Implementing this refinement obviously requires additional storage media in the central storage system 400 to keep these "snapshots" at the appropriate times. The only limit on how far back one can go in the archived file system is the amount of storage dedicated to the task. Accordingly, if historical snapshots of the computer file system are not wanted for a particular implementation, capital expense can be saved by not implementing this feature of the invention.
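Restoring files with the system is essentially carried out by reversing the process. Because each target computer 300 or system has a local database 404 containing records of the hash keys of the processed files, those hash keys can be used to identify the files that need to be restored on the target computer 300 to the paths indicated in the records. Backup copies of the local database should also be stored on a different machine, or even backed up centrally, so that the list of hash keys and corresponding paths is available for rebuilding the file system of a destroyed machine.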
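The system restores the file system of the destroyed machine by restoring each file listed in the local computer's database 404 from the file stored in the central storage system 400 under the corresponding hash key. Furthermore, the local database 404 itself can be stored in the central storage system 400 in order to preserve a record of the state of the computer's file system, or this local database can be backed up in the central storage system 400.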
Similarly, if this feature is implemented, restoring a computer system to a previous historical file system state merely requires obtaining the local database for that point in time and then restoring the file system files according to that historical local database. The historical local database can be stored locally, centrally, or preferably in both locations.
The hash code itself can be used to ensure file integrity during the backup and restore processes. Running the hashing process on the backed-up and/or restored file produces a hash code that can be compared with the original hash code. If the keys are not identical, a file error has occurred and the integrity of the file cannot be guaranteed. If they are identical, the integrity of the file is assured.
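The integrity check described above amounts to rehashing the transferred content and comparing the result with the key recorded at backup time. A minimal sketch follows, assuming MD5 hex digests as keys; the function name `verify_integrity` is illustrative.

```python
import hashlib


def verify_integrity(content, expected_key):
    """Return True if content still hashes to the key recorded at backup."""
    return hashlib.md5(content).hexdigest() == expected_key
```

A caller would typically run this once after upload and once after restore, and treat a `False` result as a file error that requires retransmission.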
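Figure 3 shows a possible high-level overview of an implementation of a system for practicing the method according to the invention. The target computer or target system 300 is the system to be backed up. A backup agent 402 can run on the target system, or on a server of which the target system is a client. The backup agent can also run remotely. The backup agent 402 implements the file scanning and hashing functions discussed above. The backup agent 402 also uses the local database 404, which contains a record for each file that has previously been backed up, and carries out the local comparison operations (block 100 in Figure 2) to determine whether the files on the target 300 have previously been backed up.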
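For greater efficiency, or to avoid load on the target computer, the backup agent 402 can run on a dedicated server optimized for this function. The backup agent 402 may also include a restore function, or a separate module can implement the restore function. The backup agent 402 and/or the restore agent can use a World Wide Web (web) interface to allow file backups of the target system to be managed remotely over a wide area network (WAN) such as the Internet, or locally over a local area network (LAN) or other network. Alternatively or in addition, the backup server 406 discussed below can also be managed through the same or a similar web interface. This allows the backup and/or restore operations to be controlled remotely from wherever access to the agent 402 and/or the server 406 is available.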
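A central storage system 400 is used to implement the centralized backup functions, including the centralized comparison operations in block 200 of Figure 2. Although described as a centralized system, it will be understood that the functions and/or components described for this centralized system could be distributed or located remotely, depending on the desired implementation of the invention.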
A backup and restore server 406 is used to direct the centralized backup operations. The server 406 receives from the agent 402 a list of hash keys representing files not listed in the local key list. The server 406 then compares this list of unmatched keys with the key list (of previously backed-up files) stored in the central hash key database 408. It will be understood that, if desired, this database can be stored on one or more of the storage devices 414 discussed below. If the file is not currently backed up on the central devices 414, there will be no match with the hash keys contained in the central key database 408, which means that the corresponding file needs to be backed up. In that case, the server 406 obtains the corresponding file from the agent 402 (or, alternatively, the server may obtain the file itself), renames it to its hash key, and forwards the renamed file to the encryption and compression module 410 (if encryption and/or compression is required), which implements the encryption and compression steps described above. It will be understood that the encryption and/or compression module can run on the server 406 or, if desired, on a separate computer/server.
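The encrypted and compressed file is then forwarded to a file dispatcher 412, which directs the file to the appropriate storage device 414a, 414b ... 414n based on the hash key or on another indicator of where the file should be stored. These databases 414n may be centrally located or distributed, as desired.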
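To restore a unique file, the target server 300 requests the hash key for that file from the local database (on the target server) and uses that name to retrieve the file from the central storage server 406.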
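The centralized backup system 400 may be located remotely from the target system 300 or locally to it. The backup system 400 may be provided remotely by a service provider using an ASP or XSP business model, in which the central system is offered to paying clients that operate the target systems 300. Such a system can use a public WAN, such as the Internet, to provide network connectivity between the central system and the target clients. Alternatively, a private network (WAN, LAN, and so on) can connect the two systems. A virtual private network (VPN) over a public network can also be used. A client may also wish to implement such a system locally to ensure local control and autonomy, particularly where the information to be stored may be especially sensitive, valuable, and/or proprietary. Where such concerns are not a priority, however, a more cost-effective service can be marketed in which the central system is provided by a service provider. In that case Internet connectivity may be economical, and a web-based management system, as described above, would also be useful and is easily accommodated according to the invention.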
A system using the invention may be implemented with a self-service model that allows the client's network administrators to back up and restore the client systems. In that case, the network administrator would access the service through an interface such as the web-based implementation described above. Alternatively, centralized administration could be implemented to relieve the client of backup duties. Such a system would be useful for IDC server farms and for integration with DataCenter Technologies' operating system. In addition, the system can make use of numerous other open standards, such as XML/SOAP, HTTP, and FTP.
Figure 4 shows a more detailed potential implementation of the backup subsystem from the system overview given in Figure 3, showing the various components of the client and the system servers. This figure corresponds to the more detailed description, given below, of one potential implementation of the method of the invention.
In this more detailed potential implementation of the system, a user would access a GUI to configure a backup job with an attached schedule. This backup job would contain the selection of files/directories to back up, OS-specific backup options, and schedule options. When the backup is run manually or triggered by the schedule:
(I) A file system scan yields the files currently present on the target server 300, which are stored in the local database 404 as the "current_backup" table. For each file in this table, the location, attributes, and last modification time of the file are stored.
(II) Next, the table "current_backup" is compared with the table "previous_backup" in the database 404, which holds the previous backup history. The result of the comparison is the set of files whose last modification time has changed.
(III) Content checksums of the changed files are generated and stored in the "current_backup" table in the local database 404.
(IV) These checksums are then checked against the global library of checksums that physically resides in the central database 408 on the central storage server 400. The result set of this check is a list of missing checksums.
(V) These missing checksums represent the files that need to be transferred to the central storage server 400. Each file with a missing checksum goes through a backup procedure that includes data synchronization with the storage server, physical transfer of its content, compression, encryption, and integrity checks during the various stages, to guarantee successful receipt of the file.
(VI) Once a file has been backed up successfully, it is marked as successfully backed up in the local database 404.
(VII) After the backup process, data synchronization between the client and the storage server 400 produces a central backup history for all target servers (clients).
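Steps (II) and (IV) above are essentially two set comparisons, which the following sketch expresses with an in-memory SQLite database. The table names follow the description; the column names (`path`, `mtime`, `checksum`), the helper names, and the decision to treat files absent from "previous_backup" as changed are all assumptions.

```python
import sqlite3


def create_tables(conn):
    """Create the two local tables (column layout is assumed)."""
    for table in ("previous_backup", "current_backup"):
        conn.execute(
            f"CREATE TABLE {table} "
            "(path TEXT PRIMARY KEY, mtime REAL, checksum TEXT)"
        )


def changed_files(conn):
    """Step (II): paths that are new or whose modification time differs."""
    rows = conn.execute(
        """
        SELECT c.path FROM current_backup AS c
        LEFT JOIN previous_backup AS p ON p.path = c.path
        WHERE p.path IS NULL OR p.mtime <> c.mtime
        ORDER BY c.path
        """
    )
    return [row[0] for row in rows]


def missing_checksums(candidate_checksums, global_checksums):
    """Step (IV): checksums not yet present in the central library."""
    return sorted(set(candidate_checksums) - set(global_checksums))
```

Only the files returned by `changed_files` need checksums computed (step III), and only the checksums returned by `missing_checksums` correspond to files that must be transferred (step V).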
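The restore process can be carried out in several ways, depending on the location where the backup history is stored. By default, the restore is performed from the history stored in the local database 404. The operator selects a subset of a previous backup set of files. For each file, this list contains the original location, the content key, and the file attributes. Based on this information, the agent can obtain the file from the library, decompress and decrypt its content, restore the file to its original location, and then restore the attributes of the restored file.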
A second way of restoring files is to obtain the backup history from a snapshot file. This is a plain text file that is created during the backup process and contains a list of files. During backup, the content key and the file attributes are stored next to the original location of each file. When such a file is provided to the agent on a client computer, the agent can restore these files as described above.
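The description says only that the snapshot is a plain text file listing each file's original location with its content key and attributes; it does not define a layout. The sketch below assumes one tab-separated line per file, and the function names are hypothetical.

```python
def write_snapshot(records):
    """Serialize (path, content_key, attributes) tuples, one per line.

    Assumed layout: three tab-separated fields. Paths or attribute
    strings containing tabs or newlines are not supported here.
    """
    lines = [f"{path}\t{key}\t{attrs}" for path, key, attrs in records]
    return "\n".join(lines) + "\n"


def read_snapshot(text):
    """Parse snapshot text back into (path, content_key, attributes)."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        path, key, attrs = line.split("\t")
        records.append((path, key, attrs))
    return records
```

An agent given such a file would look up each content key in central storage and write the content back to the listed path, then reapply the attributes.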
A snapshot file can also be created from the backup history stored in the central database 408, which resides on the central storage server 400.
Claims (15)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP01120041.7 | 2001-08-20 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1543617A (en) | 2004-11-03 |
| CN1294514C (en) | 2007-01-10 |
Family
ID=8178374
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CNB028161971A Expired - Lifetime CN1294514C (en) | 2001-08-20 | 2002-03-08 | Efficient computer file backup system and method |
Country Status (6)
| Country | Link |
|---|---|
| US (2) | US7254596B2 (en) |
| EP (1) | EP1419457B1 (en) |
| JP (1) | JP4446738B2 (en) |
| CN (1) | CN1294514C (en) |
| AU (1) | AU2002304842A1 (en) |
| WO (1) | WO2003019412A2 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103548003A (en) * | 2011-02-11 | 2014-01-29 | 赛门铁克公司 | Processes and methods for client-side fingerprint caching to improve deduplication system backup performance |
Families Citing this family (89)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP3800527B2 (en) * | 2002-05-30 | 2006-07-26 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Data backup technology using network |
| US8402001B1 (en) * | 2002-10-08 | 2013-03-19 | Symantec Operating Corporation | System and method for archiving data |
| US8943024B1 (en) | 2003-01-17 | 2015-01-27 | Daniel John Gardner | System and method for data de-duplication |
| US8375008B1 (en) | 2003-01-17 | 2013-02-12 | Robert Gomes | Method and system for enterprise-wide retention of digital or electronic data |
| CN101221567B (en) * | 2003-10-22 | 2012-07-04 | 奥林巴斯株式会社 | File creation method and file search method |
| GB2413654B (en) * | 2004-04-29 | 2008-02-13 | Symbian Software Ltd | A method of backing up and restoring data in a computing device |
| US7809898B1 (en) * | 2004-05-18 | 2010-10-05 | Symantec Operating Corporation | Detecting and repairing inconsistencies in storage mirrors |
| US7330997B1 (en) * | 2004-06-03 | 2008-02-12 | Gary Odom | Selective reciprocal backup |
| US20060026218A1 (en) * | 2004-07-23 | 2006-02-02 | Emc Corporation | Tracking objects modified between backup operations |
| US20060212439A1 (en) * | 2005-03-21 | 2006-09-21 | Microsoft Corporation | System and method of efficient data backup in a networking environment |
| US7802134B1 (en) * | 2005-08-18 | 2010-09-21 | Symantec Corporation | Restoration of backed up data by restoring incremental backup(s) in reverse chronological order |
| US8930402B1 (en) * | 2005-10-31 | 2015-01-06 | Verizon Patent And Licensing Inc. | Systems and methods for automatic collection of data over a network |
| JP2007140887A (en) * | 2005-11-18 | 2007-06-07 | Hitachi Ltd | Storage system, disk array device, volume presentation method, and data consistency confirmation method |
| CN100357901C (en) * | 2005-12-21 | 2007-12-26 | 华为技术有限公司 | Method for verifying data between main device and back-up device |
| US8478755B2 (en) * | 2006-04-20 | 2013-07-02 | Microsoft Corporation | Sorting large data sets |
| US7441092B2 (en) * | 2006-04-20 | 2008-10-21 | Microsoft Corporation | Multi-client cluster-based backup and restore |
| US8051043B2 (en) | 2006-05-05 | 2011-11-01 | Hybir Inc. | Group based complete and incremental computer file backup system, process and apparatus |
| US7844581B2 (en) * | 2006-12-01 | 2010-11-30 | Nec Laboratories America, Inc. | Methods and systems for data management using multiple selection criteria |
| US8041641B1 (en) * | 2006-12-19 | 2011-10-18 | Symantec Operating Corporation | Backup service and appliance with single-instance storage of encrypted data |
| US8850140B2 (en) * | 2007-01-07 | 2014-09-30 | Apple Inc. | Data backup for mobile device |
| US20080294453A1 (en) * | 2007-05-24 | 2008-11-27 | La La Media, Inc. | Network Based Digital Rights Management System |
| US8209540B2 (en) * | 2007-06-28 | 2012-06-26 | Apple Inc. | Incremental secure backup and restore of user settings and data |
| US8615490B1 (en) | 2008-01-31 | 2013-12-24 | Renew Data Corp. | Method and system for restoring information from backup storage media |
| US8452736B2 (en) | 2008-03-05 | 2013-05-28 | Ca, Inc. | File change detection |
| US8751561B2 (en) * | 2008-04-08 | 2014-06-10 | Roderick B. Wideman | Methods and systems for improved throughput performance in a distributed data de-duplication environment |
| US9098495B2 (en) * | 2008-06-24 | 2015-08-04 | Commvault Systems, Inc. | Application-aware and remote single instance data management |
| US8135930B1 (en) | 2008-07-14 | 2012-03-13 | Vizioncore, Inc. | Replication systems and methods for a virtual computing environment |
| US8060476B1 (en) * | 2008-07-14 | 2011-11-15 | Quest Software, Inc. | Backup systems and methods for a virtual computing environment |
| US8046550B2 (en) * | 2008-07-14 | 2011-10-25 | Quest Software, Inc. | Systems and methods for performing backup operations of virtual machine files |
| US8392361B2 (en) * | 2008-08-11 | 2013-03-05 | Vmware, Inc. | Centralized management of virtual machines |
| US8209343B2 (en) * | 2008-10-06 | 2012-06-26 | Vmware, Inc. | Namespace mapping to central storage |
| US8171278B2 (en) * | 2008-08-11 | 2012-05-01 | Vmware, Inc. | Booting a computer system from central storage |
| US8429649B1 (en) | 2008-09-25 | 2013-04-23 | Quest Software, Inc. | Systems and methods for data management in a virtual computing environment |
| US8495032B2 (en) * | 2008-10-01 | 2013-07-23 | International Business Machines Corporation | Policy based sharing of redundant data across storage pools in a deduplicating system |
| CN101414277B (en) * | 2008-11-06 | 2010-06-09 | 清华大学 | A disaster recovery system and method for on-demand incremental recovery based on virtual machine |
| US8055614B1 (en) * | 2008-12-23 | 2011-11-08 | Symantec Corporation | Method and apparatus for providing single instance restoration of data files |
| JP5294014B2 (en) * | 2008-12-26 | 2013-09-18 | 株式会社日立製作所 | File sharing method, computer system, and job scheduler |
| US8161255B2 (en) * | 2009-01-06 | 2012-04-17 | International Business Machines Corporation | Optimized simultaneous storing of data into deduplicated and non-deduplicated storage pools |
| US8090683B2 (en) * | 2009-02-23 | 2012-01-03 | Iron Mountain Incorporated | Managing workflow communication in a distributed storage system |
| US20100215175A1 (en) * | 2009-02-23 | 2010-08-26 | Iron Mountain Incorporated | Methods and systems for stripe blind encryption |
| US8145598B2 (en) * | 2009-02-23 | 2012-03-27 | Iron Mountain Incorporated | Methods and systems for single instance storage of asset parts |
| US8397051B2 (en) * | 2009-02-23 | 2013-03-12 | Autonomy, Inc. | Hybrid hash tables |
| US9792384B2 (en) * | 2009-02-26 | 2017-10-17 | Red Hat, Inc. | Remote retrieval of data files |
| US8806062B1 (en) * | 2009-03-27 | 2014-08-12 | Symantec Corporation | Adaptive compression using a sampling based heuristic |
| EP2237170A1 (en) * | 2009-03-31 | 2010-10-06 | BRITISH TELECOMMUNICATIONS public limited company | Data storage system |
| EP2237144A1 (en) * | 2009-03-31 | 2010-10-06 | BRITISH TELECOMMUNICATIONS public limited company | Method of remotely storing data and related data storage system |
| US8996468B1 (en) | 2009-04-17 | 2015-03-31 | Dell Software Inc. | Block status mapping system for reducing virtual machine backup storage |
| US8171202B2 (en) * | 2009-04-21 | 2012-05-01 | Google Inc. | Asynchronous distributed object uploading for replicated content addressable storage clusters |
| US8255365B2 (en) | 2009-06-08 | 2012-08-28 | Symantec Corporation | Source classification for performing deduplication in a backup operation |
| US9058298B2 (en) * | 2009-07-16 | 2015-06-16 | International Business Machines Corporation | Integrated approach for deduplicating data in a distributed environment that involves a source and a target |
| US9778946B2 (en) | 2009-08-07 | 2017-10-03 | Dell Software Inc. | Optimized copy of virtual machine storage files |
| US8738668B2 (en) | 2009-12-16 | 2014-05-27 | Renew Data Corp. | System and method for creating a de-duplicated data set |
| US9032243B2 (en) * | 2010-01-27 | 2015-05-12 | International Business Machines Corporation | Target operating system and file system agnostic bare-metal restore |
| CN102236588A (en) * | 2010-04-23 | 2011-11-09 | 阿里巴巴集团控股有限公司 | Remote data backup method, equipment and system |
| US9569446B1 (en) | 2010-06-08 | 2017-02-14 | Dell Software Inc. | Cataloging system for image-based backup |
| US8898114B1 (en) | 2010-08-27 | 2014-11-25 | Dell Software Inc. | Multitier deduplication systems and methods |
| CN101945156B (en) * | 2010-09-01 | 2014-04-16 | 惠州Tcl移动通信有限公司 | Method and device for backing up data information of a mobile terminal |
| EP2455922B1 (en) * | 2010-11-17 | 2018-12-05 | Inside Secure | NFC transaction method and system |
| US8683026B2 (en) | 2010-12-08 | 2014-03-25 | International Business Machines Corporation | Framework providing unified infrastructure management for polymorphic information technology (IT) functions across disparate groups in a cloud computing environment |
| US8661259B2 (en) | 2010-12-20 | 2014-02-25 | Conformal Systems Llc | Deduplicated and encrypted backups |
| US10049116B1 (en) * | 2010-12-31 | 2018-08-14 | Veritas Technologies Llc | Precalculation of signatures for use in client-side deduplication |
| JP2014507841A (en) * | 2011-01-07 | 2014-03-27 | トムソン ライセンシング | Apparatus and method for online storage, transmitting apparatus and method, and receiving apparatus and method |
| CN102841897B (en) * | 2011-06-23 | 2016-03-02 | 阿里巴巴集团控股有限公司 | Method, apparatus and system for incremental data extraction |
| CN102495772B (en) * | 2011-09-30 | 2013-10-30 | 奇智软件(北京)有限公司 | A feature-based terminal program cloud backup and recovery method |
| CN103500127B (en) * | 2011-09-30 | 2016-11-02 | 北京奇虎科技有限公司 | Terminal program cloud backup and recovery method |
| CN102360320A (en) * | 2011-09-30 | 2012-02-22 | 奇智软件(北京)有限公司 | Terminal backup object sharing and recovery method based on cloud architecture |
| CN102622394A (en) * | 2011-11-28 | 2012-08-01 | 江苏奇异点网络有限公司 | Method for backing up editable documents in local area network |
| US8959605B2 (en) | 2011-12-14 | 2015-02-17 | Apple Inc. | System and method for asset lease management |
| US9311375B1 (en) | 2012-02-07 | 2016-04-12 | Dell Software Inc. | Systems and methods for compacting a virtual machine file |
| US9262423B2 (en) * | 2012-09-27 | 2016-02-16 | Microsoft Technology Licensing, Llc | Large scale file storage in cloud computing |
| US9495379B2 (en) | 2012-10-08 | 2016-11-15 | Veritas Technologies Llc | Locality aware, two-level fingerprint caching |
| CN103365996B (en) * | 2013-07-12 | 2017-11-03 | 北京奇虎科技有限公司 | file management and processing method, device and system |
| US20150082054A1 (en) * | 2013-08-21 | 2015-03-19 | Venux LLC | System and Method for Establishing a Secure Digital Environment |
| WO2015094193A1 (en) * | 2013-12-17 | 2015-06-25 | Hitachi Data Systems Corporation | Distributed disaster recovery file sync server system |
| CN103645905B (en) * | 2013-12-20 | 2017-08-08 | 北京中电普华信息技术有限公司 | Incremental data acquisition method and device |
| JP6269174B2 (en) | 2014-03-05 | 2018-01-31 | 富士通株式会社 | Data processing program, data processing apparatus, and data processing method |
| US10762074B2 (en) * | 2015-10-20 | 2020-09-01 | Sanjay JAYARAM | System for managing data |
| CN106302641B (en) * | 2016-07-27 | 2019-10-01 | 北京小米移动软件有限公司 | Method, device and system for uploading files |
| CN107797889B (en) * | 2017-11-14 | 2021-05-04 | 北京思特奇信息技术股份有限公司 | Method and device for checking system file backup integrity |
| CN108038028B (en) * | 2017-12-13 | 2021-03-23 | 北信源系统集成有限公司 | File backup method and device and file restoration method and device |
| CN108255640B (en) * | 2017-12-15 | 2021-11-02 | 云南省科学技术情报研究院 | Method and device for rapidly recovering redundant data in distributed storage |
| US10630602B1 (en) | 2018-10-08 | 2020-04-21 | EMC IP Holding Company LLC | Resource allocation using restore credits |
| US11201828B2 (en) | 2018-10-08 | 2021-12-14 | EMC IP Holding Company LLC | Stream allocation using stream credits |
| US11005775B2 (en) | 2018-10-08 | 2021-05-11 | EMC IP Holding Company LLC | Resource allocation using distributed segment processing credits |
| US11184423B2 (en) * | 2018-10-24 | 2021-11-23 | Microsoft Technology Licensing, Llc | Offloading upload processing of a file in a distributed system using a key that includes a hash created using attribute(s) of a requestor and/or the file |
| CN110515765B (en) * | 2019-07-31 | 2022-04-22 | 苏州浪潮智能科技有限公司 | License key acquisition method and device and storage system |
| US12277031B2 (en) * | 2023-04-19 | 2025-04-15 | Dell Products L.P. | Efficient table-based archiving of data items from source storage system to target storage system |
| CN116775374A (en) * | 2023-06-01 | 2023-09-19 | 中国人民财产保险股份有限公司 | File backup method and device |
| US12373306B2 (en) * | 2023-07-14 | 2025-07-29 | Dell Products L.P. | Efficient table-based remote backup of data items between source and target storage servers |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO1999009480A1 (en) * | 1997-07-29 | 1999-02-25 | Telebackup Systems, Inc. | Method and system for nonredundant backup of identical files stored on remote computers |
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5202982A (en) * | 1990-03-27 | 1993-04-13 | Sun Microsystems, Inc. | Method and apparatus for the naming of database component files to avoid duplication of files |
| US5301286A (en) * | 1991-01-02 | 1994-04-05 | At&T Bell Laboratories | Memory archiving indexing arrangement |
| EP0706686B1 (en) * | 1993-07-01 | 1998-10-14 | Legent Corporation | System and method for distributed storage management on networked computer systems |
| WO1996025801A1 (en) | 1995-02-17 | 1996-08-22 | Trustus Pty. Ltd. | Method for partitioning a block of data into subblocks and for storing and communicating such subblocks |
| US5778395A (en) * | 1995-10-23 | 1998-07-07 | Stac, Inc. | System for backing up files from disk volumes on multiple nodes of a computer network |
| US5754844A (en) * | 1995-12-14 | 1998-05-19 | Sun Microsystems, Inc. | Method and system for accessing chunks of data using matching of an access tab and hashing code to generate a suggested storage location |
| EP0899662A1 (en) * | 1997-08-29 | 1999-03-03 | Hewlett-Packard Company | Backup and restore system for a computer network |
| US6374266B1 (en) * | 1998-07-28 | 2002-04-16 | Ralph Shnelvar | Method and apparatus for storing information in a data processing system |
| JP2000200208A (en) | 1999-01-06 | 2000-07-18 | Fujitsu Ltd | File backup method, apparatus and program recording medium thereof |
| US6513051B1 (en) * | 1999-07-16 | 2003-01-28 | Microsoft Corporation | Method and system for backing up and restoring files stored in a single instance store |
| US6526418B1 (en) * | 1999-12-16 | 2003-02-25 | Livevault Corporation | Systems and methods for backing up data files |
| US6971018B1 (en) * | 2000-04-28 | 2005-11-29 | Microsoft Corporation | File protection service for a computer system |
| Date | Country | Application number | Publication | Status |
|---|---|---|---|---|
| 2002-03-08 | EP | EP02732469A | EP1419457B1 | Not active (Expired - Lifetime) |
| 2002-03-08 | AU | AU2002304842A | AU2002304842A1 | Not active (Abandoned) |
| 2002-03-08 | WO | PCT/EP2002/002588 | WO2003019412A2 | Not active (Ceased) |
| 2002-03-08 | JP | JP2003523401A | JP4446738B2 | Not active (Expired - Fee Related) |
| 2002-03-08 | CN | CNB028161971A | CN1294514C | Not active (Expired - Lifetime) |
| 2004-02-19 | US | US10/780,683 | US7254596B2 | Not active (Expired - Lifetime) |
| 2007-08-06 | US | US11/834,344 | US7752171B2 | Not active (Expired - Fee Related) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103548003A (en) * | 2011-02-11 | 2014-01-29 | 赛门铁克公司 | Processes and methods for client-side fingerprint caching to improve deduplication system backup performance |
| CN103548003B (en) * | 2011-02-11 | 2017-06-23 | 赛门铁克公司 | Method and system for improving the client-side fingerprint cache of deduplication system backup performance |
Also Published As
| Publication number | Publication date |
|---|---|
| AU2002304842A1 (en) | 2003-03-10 |
| HK1069651A1 (en) | 2005-05-27 |
| US7752171B2 (en) | 2010-07-06 |
| WO2003019412A3 (en) | 2003-10-30 |
| WO2003019412A2 (en) | 2003-03-06 |
| CN1543617A (en) | 2004-11-03 |
| EP1419457A2 (en) | 2004-05-19 |
| EP1419457B1 (en) | 2012-07-25 |
| US20040236803A1 (en) | 2004-11-25 |
| JP4446738B2 (en) | 2010-04-07 |
| US7254596B2 (en) | 2007-08-07 |
| JP2005501342A (en) | 2005-01-13 |
| US20080034021A1 (en) | 2008-02-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN1294514C (en) | | Efficient computer file backup system and method |
| US8341117B2 (en) | | Method, system, and program for personal data management using content-based replication |
| US11561931B2 (en) | | Information source agent systems and methods for distributed data storage and management using content signatures |
| US8041677B2 (en) | | Method and system for data backup |
| US7680998B1 (en) | | Journaled data backup during server quiescence or unavailability |
| US7441092B2 (en) | | Multi-client cluster-based backup and restore |
| EP1975800B1 (en) | | Replication and restoration of single-instance storage pools |
| US5765173A (en) | | High performance backup via selective file saving which can perform incremental backups and exclude files and uses a changed block signature list |
| US8738668B2 (en) | | System and method for creating a de-duplicated data set |
| US9002800B1 (en) | | Archive and backup virtualization |
| CN1242089A (en) | | Regenerating an agent for backup software |
| US8943024B1 (en) | | System and method for data de-duplication |
| US8065277B1 (en) | | System and method for a data extraction and backup database |
| HK1069651B (en) | | Efficient computer file backup system and method |
| Sadasivan | | An Investigation on Image and Data Storage in Cloud Environment with an Enhanced Approach of Data Compression using Compressed Sensing |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1069651; Country of ref document: HK |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1069651; Country of ref document: HK |
| | ASS | Succession or assignment of patent right | Owner name: SYMANTEC CORP.; Free format text: FORMER OWNER: DATACT TECHNOLOGIES N. V.; Effective date: 20110923 |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TR01 | Transfer of patent right | Effective date of registration: 20110923; Address after: California, USA; Patentee after: Symantec Corp.; Address before: Belgium Los Christie; Patentee before: Datact Technologies N. V. |
| | CX01 | Expiry of patent term | Granted publication date: 20070110 |
| CX01 | Expiry of patent term |