CN1256672C - Remote data flexible replication system and method - Google Patents
- Publication number
- CN1256672C (also listed as CNB018131956A, CN01813195A)
- Authority
- CN
- China
- Prior art keywords
- data
- replication
- remote
- unit
- local
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2056—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
- G06F11/2069—Management of state, configuration or failover
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2056—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
- G06F11/2058—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring using more than 2 mirrored copies
- G06F11/2064—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring while ensuring consistency
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Description
Field of the Invention
The present invention relates to remote replication of digital data from a server or other computer in order to provide better fault tolerance and/or disaster recovery capabilities, and relates more particularly to tools and techniques for increasing the flexibility of remote data replication.
Technical Background of the Invention
US Patent No. 5,537,533 describes tools and techniques for remote replication of digital data from a primary network server to a remote network server. A system according to that patent includes a primary data transfer unit having a primary server interface and a primary link interface, and a remote data transfer unit having a remote link interface and a remote server interface. The primary link interface includes a spoof packet generator capable of generating a "pre-acknowledgement" for the primary network server. In other words, the system has a "smart buffer" that gives the primary server a pre-acknowledgement, or "spoof", after the replicated data has been stored in a nonvolatile buffer in the primary link interface and before an acknowledgement arrives indicating that the replicated data has been stored by the remote server.
MiraLink Corporation of Salt Lake City, Utah is the owner of US Patent No. 5,537,533. The company brought its Off-SiteServer product to market one year before the present application (OFF-SITESERVER is a mark of MiraLink Corporation). The Off-SiteServer product includes technology for remotely replicating the disks of a Novell NetWare server (NETWARE is a mark of Novell, Inc.) to another server at a geographically remote location over a low-bandwidth communications link.
Remote data replication from a primary network server to a remote replacement network server is a powerful means of data backup. Remote replication creates a copy of the original data at a safe distance from it, and does so substantially concurrently with the storage of the original data. After a serious incident, if the remotely stored data resides on a "warm" remote network server, it is available almost immediately; in other words, the remote server can be up and running as the new primary network server within minutes of an actual or simulated disaster.
In a typical installation, use of the Off-SiteServer product involves a pair of Off-SiteServer boxes: a local box and a remote box. The Off-SiteServer boxes are configured with specialized hardware, firmware, and/or other software, generally as described in US Patent No. 5,537,533. A dedicated serial line connects the local NetWare server to one of the boxes. The NetWare server itself uses a Vinca adapter card (VINCA is a mark of Vinca Corporation). The card is driven by a NetWare Loadable Module (NLM) that intercepts disk drive requests and sends the data over the serial line to the local Off-SiteServer box.
The local Off-SiteServer box has a 4-gigabyte nonvolatile buffer, such as an IDE disk drive. Data is pre-acknowledged into this Off-SiteServer buffer. As far as the operating system of the local server is concerned, a second "mirrored" write has occurred locally. In reality, the Off-SiteServer product has received the data from the NLM and stored it in the local buffer. The local Off-SiteServer box stores the sector and track change data until it can safely send that data to the remote Off-SiteServer box at the remote location. The buffer of the local Off-SiteServer box is also "smart" in that it holds locally whatever data the communications link cannot carry at the moment. The data remains in the local Off-SiteServer box until the remote Off-SiteServer box has successfully written it to the remote secondary server and returned an acknowledgement to the local Off-SiteServer box. On receiving the acknowledgement, the local Off-SiteServer box frees the nonvolatile buffer space occupied by the sector/track/block data that was successfully transmitted.
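The pre-acknowledgement behavior of the local box can be illustrated with a minimal Python sketch. It is not the product's implementation; the class name `LocalBoxBuffer`, its methods, and the use of sequence numbers are hypothetical and chosen only for illustration. The sketch assumes that a write is pre-acknowledged as soon as it fits in the buffer, and that space is released only when the remote side confirms the write.

```python
from collections import OrderedDict


class LocalBoxBuffer:
    """Toy model of the local box's nonvolatile smart buffer (hypothetical names)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.pending = OrderedDict()  # sequence number -> block awaiting remote ack
        self.next_seq = 0

    def write(self, block):
        """Store a changed block; the returned number models the pre-acknowledgement."""
        if self.used_bytes + len(block) > self.capacity_bytes:
            return None  # buffer full: no pre-acknowledgement can be given
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = block
        self.used_bytes += len(block)
        return seq

    def unsent(self):
        """Blocks still held locally, oldest first."""
        return list(self.pending.items())

    def remote_ack(self, seq):
        """Free buffer space once the remote side confirms the write."""
        block = self.pending.pop(seq)
        self.used_bytes -= len(block)


buf = LocalBoxBuffer(capacity_bytes=1024)
first = buf.write(b"sector-17")
second = buf.write(b"sector-42")
buf.remote_ack(first)          # remote confirmed the first block only
print(buf.used_bytes, [seq for seq, _ in buf.unsent()])
```

The host sees `write` return immediately, while the block stays in `pending` until `remote_ack` is called, which mirrors the separation between the early "spoof" acknowledgement and the later real one.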
The Off-SiteServer product uses a V.35 interface for data output at the local site. V.35 is a serial communications standard that connects to a Channel Service Unit/Data Service Unit (CSU/DSU), which in turn interfaces with the telecommunications network. A second CSU/DSU at the remote (secondary) site relays the sector/track/block data to the V.35 input interface of the remote secondary Off-SiteServer box. The remote box outputs the data over a serial cable to the dedicated serial line of another Vinca adapter card in the remote secondary server. Replication and system software on the remote server then writes the sector/track/block data to the remote server's disk drives, and an acknowledgement of the write is returned to the local Off-SiteServer box. The system can handle about 300 megabytes of data changes per hour.
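To put the figure of about 300 megabytes per hour in terms of link speed, the following short Python calculation converts it to an average bit rate. It assumes that a megabyte means 10**6 bytes and ignores protocol overhead, so the result is only a rough lower bound on the link rate involved.

```python
def sustained_rate_bits_per_second(bytes_per_hour):
    """Average link rate needed to carry a given hourly volume of changes."""
    return bytes_per_hour * 8 / 3600


# Assumption: "300 megabytes" is read as 300 * 10**6 bytes, with no overhead.
rate = sustained_rate_bits_per_second(300 * 10**6)
print(f"{rate / 1000:.0f} kbit/s")  # prints "667 kbit/s"
```

A rate of this order is consistent with the low-bandwidth links the product is described as using.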
The Off-SiteServer product is smart enough to sense whether bandwidth has increased or decreased and/or whether the communications link has gone down. While the link is down, the Off-SiteServer box stores changed data in the local nonvolatile smart buffer. When the link comes back up, the box automatically resumes sending data. The box can also change its output bandwidth at any time as more or less bandwidth becomes available. All of these transfers incorporate standard software checksum error detection and correction and/or hardware error correction code (ECC) error handling.
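The combination of buffering during an outage and checksum protection can be sketched as follows. This is an illustrative Python model, not the product's protocol: the `Sender` class, the frame layout, and the choice of CRC-32 are assumptions, and CRC-32 only detects corruption (it does not correct it, unlike the ECC mentioned above).

```python
import zlib
from collections import deque


def frame(payload):
    """Append a CRC-32 so the receiver can detect corruption."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(data):
    """Return the payload, or None if the checksum does not match."""
    payload, received = data[:-4], int.from_bytes(data[-4:], "big")
    return payload if zlib.crc32(payload) == received else None


class Sender:
    """Hypothetical local box: queues changed data while the link is down."""

    def __init__(self):
        self.link_up = True
        self.backlog = deque()   # framed data held during an outage
        self.delivered = []      # stands in for the remote box

    def submit(self, payload):
        self.backlog.append(frame(payload))
        self.drain()

    def drain(self):
        while self.link_up and self.backlog:
            self.delivered.append(unframe(self.backlog.popleft()))

    def set_link(self, up):
        self.link_up = up
        self.drain()             # transmission resumes automatically


sender = Sender()
sender.submit(b"block-1")
sender.set_link(False)
sender.submit(b"block-2")
sender.submit(b"block-3")
held_during_outage = len(sender.backlog)
sender.set_link(True)
```

After the link is restored, the queued blocks are delivered in their original order without any action by the host.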
If a disk or server failure occurs on the local (primary) NetWare server, the secondary (remote) server, attached to the remote (secondary) Off-SiteServer box as described above, holds a complete mirrored copy of all data on the local (primary) server. This remote backup copy can be restored to the local (primary) server. The secondary remote server can also be used in place of the local primary server in the event of a disaster. Such restoration and/or replacement can be carried out very quickly with simple commands.
In short, the Off-SiteServer product and other data replication technologies provide valuable fault tolerance and disaster recovery capabilities for mission-critical data and files alike. However, the flexibility of these existing approaches is unnecessarily limited.
For example, the Off-SiteServer product requires a specific version of Vinca hardware and software. That Vinca version supports no operating system/file system platform other than Novell NetWare. In addition, the hardware components of the required Vinca package do not work with newer, faster servers and larger disk capacities.
The original Off-SiteServer product was also designed to connect one local server to one remote server. Only a single local server can be replicated to a remote server at a given time. Multiple servers at different locations cannot be replicated to a single remote site at the same time. Likewise, if a company has several local servers running a variety of operating systems and/or file systems, each server running its own platform must be replicated to a matching remote server.
In addition, the original Off-SiteServer product requires an NLM to be installed on the local server, and it is designed to use a private, dedicated communications link. Conventional replication also requires a remote server to keep the replicated information in a bootable format at the remote location.
These limitations and others are noted in patent application Ser. No. 09/438,184. The present application provides additional tools and techniques for remote data replication, to take fuller advantage of the technology described in the parent application and to make other advances.
Object and Summary of the Invention
The present invention provides data replication tools and techniques that can be used in conjunction with the invention of the present patent application or in other embodiments. This summary focuses on tools and techniques not previously described in detail. For example, the invention provides tools and techniques such as local-remote role reversal; implementation of a hot standby server state by means of a "media not ready" signal; several alternative buffer contents and buffering schemes; transactions; many-to-one replication through use of a "virtual" remote replication unit; identification of frequently accessed data without application-specific knowledge, based instead on logging and analysis of application behavior; and use of a secondary server in a non-authoritative manner. Other features and advantages of the invention will become more fully apparent through the following description.
Brief Description of the Drawings
To illustrate the manner in which the advantages and features of the invention are obtained, a more particular description of the invention is given with reference to the attached drawings. The drawings illustrate only selected aspects of the invention and therefore do not limit its scope.
FIG. 1 is a diagram illustrating prior art replication in a computer network, which could be adapted for use with the present invention.
FIG. 2 is a diagram illustrating a computer system according to the invention, with no remote server but including a remote replication unit having a relatively large buffer.
FIG. 3 is a diagram illustrating a computer system according to the invention, including a remote server with a hot-swappable RAID unit and a remote replication unit having a relatively small buffer.
FIG. 4 is a diagram illustrating a computer system according to the invention, with no remote server but including a remote replication unit having a relatively small buffer and a hot-swappable RAID unit.
FIG. 5 is a diagram illustrating a many-to-one replication computer system according to the invention, with no remote server but including multiple local servers running the same platform, each with its own local replication unit, and a single remote replication unit having a relatively small buffer and multiple hot-swappable RAID units.
FIG. 6 is a diagram illustrating another many-to-one replication computer system according to the invention, with no remote server but including multiple local servers running the same platform, each with its own local replication unit, and a single remote replication unit having a relatively small buffer and several individual external storage volumes.
FIG. 7 is a diagram illustrating another many-to-one replication computer system according to the invention, with no remote server but including multiple local servers running the same platform, each with its own local replication unit, and a single remote replication unit having a relatively small buffer, a single external storage volume with multiple partitions, and a hot-swappable RAID unit likewise having multiple partitions.
FIG. 8 is a diagram illustrating another many-to-one replication computer system according to the invention, with no remote server but including multiple local servers running different platforms, each with its own local replication unit, and a single remote replication unit having a relatively small buffer and multiple hot-swappable RAID units.
FIG. 9 is a diagram illustrating another many-to-one replication computer system according to the invention, with no remote server but including multiple local servers running different platforms, each with its own local replication unit, and a single remote replication unit having a relatively small buffer and multiple external storage volumes.
FIG. 10 is a diagram illustrating another many-to-one replication computer system according to the invention, with no remote server but including multiple local servers running different platforms, each with its own local replication unit, and a single remote replication unit having a relatively small buffer, an external storage volume with multiple partitions, and a hot-swappable RAID unit likewise having multiple partitions.
FIG. 11 is a diagram illustrating a one-to-many replication computer system according to the invention, in which a local server is connected to multiple local replication units in order to replicate data to multiple remote locations.
FIG. 12 is a diagram illustrating another one-to-many replication computer system according to the invention, in which a local server is connected to a multi-ported local replication unit in order to replicate data to multiple remote locations.
FIG. 13 is a flowchart illustrating methods of the invention.
FIG. 14 is a diagram illustrating a dual host configuration among a remote replication unit, a remote server, and a RAID unit, which can perform switchover according to the invention.
FIG. 15 is a flowchart further illustrating methods of the invention.
Detailed Description of the Preferred Embodiments
The present invention relates to computer systems and methods for flexible data replication. The invention provides non-invasive replication, replication with or without a dedicated private communications link, and replication with or without a dedicated server or another server at the destination to assist the remote replication unit. The invention also provides many-to-one data replication, including replication from local servers located at two or more geographically dispersed sites and running the same or different operating systems and/or file systems. In addition, the invention provides flexibility by allowing replicated data to be held in various combinations of one or more external storage units and/or RAID units.
The invention also provides tools and techniques for flexible data replication, including replication unit role reversal; server hot standby operation; replicated data storage options; storage and re-issue of SCSI commands relating to changed data; transactions; virtual replication units; application state recovery; and data volume resynchronization. These topics are discussed with reference to FIG. 15, although pertinent information on a given topic is not necessarily confined to FIG. 15 and the text that refers to it directly.
The invention may be embodied in various methods and systems. Unless expressly indicated otherwise, the discussion of any one type of embodiment also applies to the other types. For example, the description of the inventive systems also helps in understanding the inventive methods for configuring those systems and/or for sending data through them to obtain replicated data, and vice versa. In particular, although FIG. 15 shows a flowchart, it is not limited to methods; it also helps describe media and systems configured according to the invention.
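Computers and Networks Generally
FIG. 1 illustrates a network 100 in which a local server 102 is replicated over a conventional route 104 to a remote server 106. The conventional route 104 is not limited to a communications link; it includes modems, data transfer units, and other conventional tools and techniques for sending and/or receiving data over that link. In particular and without limitation, the conventional route 104 may include the server interfaces, link interfaces, and DTUs illustrated in FIG. 1 of US Patent No. 5,537,533 and discussed in that patent.
In addition, the conventional route 104 may include Small Computer System Interface (SCSI) extenders or standard Storage Area Network (SAN) connectors. These devices require a very high bandwidth link and minimal latency. Because distance introduces latency, their range is generally limited to about 10 to 20 miles. For instance, with a single-mode fiber configuration, a given SCSI extender may allow a distance of about 15 kilometers between the data source and the destination because of latency; with a multi-mode fiber configuration, the usable distance drops to about two-thirds of that. Such a link tolerates no delay or interruption, or only a delay or interruption lasting a small fraction of a second, or at the very most a processing delay of a few seconds. The same problems arise with mainframe channel extenders.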
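The distances quoted above can be related to propagation delay with a short Python calculation. The figure of roughly 200,000 km/s for light in optical fiber is a general physical approximation and an assumption here, not a number from the patent; the calculation covers propagation only and says nothing about the protocol timing limits of any particular extender.

```python
# Assumption: light travels through optical fiber at about 200,000 km/s,
# which is roughly 5 microseconds per kilometer in each direction.
FIBER_KM_PER_SECOND = 200_000


def round_trip_microseconds(distance_km):
    """Propagation delay for a request and its reply over the given distance."""
    return 2 * distance_km / FIBER_KM_PER_SECOND * 1_000_000


single_mode_km = 15
multi_mode_km = single_mode_km * 2 / 3   # "about two-thirds", per the text

for label, km in (("single-mode", single_mode_km), ("multi-mode", multi_mode_km)):
    delay = round_trip_microseconds(km)
    print(f"{label}: {km:.0f} km, round trip about {delay:.0f} microseconds")
```

Under this assumption, every synchronous write over a 15 km extender waits on the order of 150 microseconds for propagation alone, which suggests why such links are kept short.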
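Although the illustrated network 100 is configured for replication using conventional tools and techniques, it is also one of many networks suitable for adaptation and use according to the present invention. The adaptation involves various steps, depending on the particular embodiment of the invention. For instance, it may include disconnecting the remote server 106 if that server is no longer needed; supplementing or replacing the conventional route 104 with replication units connected according to the invention; removing replication NLMs or other specialized software from the local server 102; adding more local servers to be replicated; and/or adding remote storage in the form of external storage volumes and/or Redundant Array of Independent Disks (RAID) units. In general, however, the adaptation involves adding at least one local replication unit and at least one remote replication unit, which can be interconnected to operate according to the invention.
Before and/or after the adaptation, the network 100 may be connected to other networks 108 through a gateway or similar mechanism, including LANs, WANs, or portions of the Internet or an intranet, to form a larger network. In the illustrated network 100, the local server 102 is connected by communications links or network signal lines 110 to one or more network clients 112. Other suitable networks include multi-server networks and peer-to-peer networks. The servers 102 and clients 112 in a particular network may be uniprocessor, multiprocessor, or clustered processor machines. Each of the servers 102 and clients 112 includes an addressable storage medium such as random access memory.
Suitable clients 112 include, without limitation, personal computers; laptops 114; personal digital assistants and other mobile devices; and workstations 116. The signal lines 110 may include twisted pair, coaxial, or optical fiber cables; telephone lines; satellites; microwave relays; modulated AC power lines; RF connections; network connections; dial-up connections; portable links such as infrared links; and/or other data transmission "wires" or communications links known in the art. The links 110 may embody conventional or novel signals, and in particular may embody the series of replicated data commands and/or data structures described herein. The remote server 106 may store replicated data obtained over the conventional route 104 on an attached storage device, such as an external hard disk or a RAID subsystem 118.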
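Examples of Flexible Replication Unit Systems
FIG. 2 illustrates a system according to the invention. Unlike the conventional approaches discussed above, a system according to this figure does not require a remote server. A local server 200 or other host 200 is connected by a local link 202 to a local replication unit 204. The local replication unit 204 is connected by a journey link 206 to a remote replication unit 208. Each local replication unit includes a spoof packet generator for generating pre-acknowledgements to the local server 200, and a nonvolatile data buffer 210 for holding replicated data before it has been stored at the remote location. The remote replication unit has a destination nonvolatile storage device for holding the replicated data received from the local replication unit 204 over the journey link 206. The remote replication unit may be physically separated from the local server 200 by less than 10 miles, at least 10 miles, or at least 100 miles. These distances are only examples; because the invention can take full advantage of the journey link 206, a system according to the invention has no inherent distance limit. The individual replication units are discussed in more detail below, both in terms of the flexibility shown in the embodiments of FIGS. 2 through 12 and in terms of their components and operation.
It may be helpful to note here, however, that some embodiments of the local replication unit 204 include SCSI emulation software and/or hardware, which allows the local link 202 to be a SCSI connection and thereby makes the local replication unit 204 appear to the local server 200 or other host 200 to be a SCSI disk or other conventional SCSI device. This can be accomplished by using in the local replication unit 204 a SCSI host adapter that runs in target mode rather than the usual initiator mode. Suitable SCSI host adapters with such a target mode include at least the Adaptec 2940UWQ adapter and the QLogic QLA-1040 adapter. Likewise, the local link 202 may be a Fibre Channel connection, a USB connection, a mainframe channel extender, a V.35 CSU/DSU connection, an IEEE 1394 connection, a memory type (for example, AS/400 mirrored memory rather than disk), an IDE bus, a PCMCIA connection, a serial connection, an Ethernet connection, an FDDI connection, or another kind of standard bus for connecting disks and/or RAID subsystems to a server. Conventional replication hardware and/or software can therefore be used in the local server 200 as though the replicated data were simply being sent to a local disk, when it is in fact being sent over the journey link 206 to the remote location.
Unlike the long-distance links of the conventional approaches discussed above, the journey link 206 need not be a dedicated private communications link. Such links may still be used in some embodiments, but the invention also allows the replication units 204, 208 to communicate over a network, or over a series of networks such as the Internet, using Ethernet, FDDI, V.35, or other data link protocols; IP or other network protocols; and/or UDP, TCP, or other transport protocols, regardless of whether those protocols are routable. The two replication units 204, 208 may therefore be tens or hundreds of miles apart if desired.
The journey link 206 may be formed from a conventional link 104, with the spoofing local replication unit 204 serving as the data acquisition point. However, the journey link 206 does not demand the high bandwidth and low latency that a conventional link 104 typically requires. Unlike a SAN, for instance, a system using the journey link 206 can send replicated data from a source to a destination at an unlimited distance. The journey link 206 can also share bandwidth, as on the Internet or another wide area network. In addition, the journey link 206 and/or the replication units give the inventive systems the advantage of a high tolerance for interruptions and disconnects.
The remote replication unit 208 has a large buffer 212, so it can buffer an entire volume of the local server 200 or other host 200. In some embodiments the local replication unit 204 also has a large buffer. In one embodiment, for example, the volume of the local server 200 and the large buffers (local and remote) can each hold one terabyte of data in nonvolatile storage. This buffering may be accomplished, for example, by using a QLogic QLA-1040 adapter in the local replication unit 204 or the remote replication unit 208 to control up to one terabyte of data, substantially without further modification. An image of an entire volume of the local server 200 can thus be stored in the buffers of the replication units.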
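For increased data recovery ability, an optional local replica 230 may also be created; it is commonly called a "full" (满注) local replica because it is consistent and usable, but not necessarily up to date. The local replica can be created in several ways. These include, without limitation: using a second local replication unit 204, or a second port of a multi-ported local replication unit 204, to replicate data to a "remote" disk subsystem that is in fact geographically close to the local host 200; forking the data inside the local replication unit 204, below the unit's disk emulation layer, to create another copy that is sent over a SCSI or similar bus to a locally attached disk subsystem (the first copy being sent over the journey link 206 to the remote replication unit); or using conventional tools and techniques together with the local replication unit 204 to create and maintain the local replica 230.
The local replica 230 contains a copy of the volume of the server 200, so that recovery is possible after hardware or software errors. Because the local replica 230 is local rather than remote, however, it gives the server 200 no basic protection against natural disasters, war, terrorist attacks, physical sabotage, and other hazards tied to geographic location. Accordingly, whether or not the local replica 230 is implemented with an additional replication unit 204, it does not provide the same degree of data protection as remote replication. The local replica 230 is connected to the replication unit 204 by a path 232, which may be a conventional link like the path 104 or a novel path according to the invention. Although the local replica 230 is not shown explicitly in the other figures, one or more local replicas may also be used with the systems shown in those figures and with other systems according to the invention.
For instance, one approach uses technology from Nonstop Networks Limited or other technology to mirror two servers, with the local replication unit serving as the sole (primary) disk subsystem of the secondary server. Another approach makes all replication internal to the two replication units by making the local replication unit the sole disk subsystem of the host 200; the local replica 230 then becomes the primary disk, and the remote replica serves as the only true replica. The latter is a lower-assurance configuration, but it can still provide higher performance at lower cost.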
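One of the options described above for creating the local replica 230, forking the data inside the local replication unit, can be sketched in Python as follows. The class and attribute names are hypothetical, and the dictionary and list merely stand in for a locally attached disk subsystem and for data queued on the journey link 206.

```python
class ForkingReplicationUnit:
    """Hypothetical sketch: each write is forked to a local replica and a link queue."""

    def __init__(self):
        self.local_replica = {}   # block number -> data; stands in for replica 230
        self.journey_queue = []   # writes awaiting transmission over link 206

    def write_block(self, block_no, data):
        # One copy goes to the locally attached disk subsystem...
        self.local_replica[block_no] = data
        # ...and one copy is queued for the remote replication unit.
        self.journey_queue.append((block_no, data))


unit = ForkingReplicationUnit()
unit.write_block(7, b"payroll")
unit.write_block(7, b"payroll-v2")   # a later change to the same block
```

In this sketch the local replica keeps only the latest contents of each block, while the queue preserves every change in order for the remote side.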
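FIG. 3 illustrates a system in which a local server 200 and a local replication unit 204 communicate over a local link. The local replication unit 204 and a remote replication unit 308 communicate over a journey link 206. Unlike the remote replication unit 208, whose large nonvolatile buffer 212 can hold the data of an entire volume of the local server 200, the remote replication unit 308 has only a relatively small nonvolatile buffer 310, which holds only a few (for example, four) megabytes of data.
However, a system according to FIG. 3 includes a remote server 300 having associated nonvolatile internal or external storage. By way of illustration, FIG. 3 shows a RAID unit 312 that can be controlled at some point by the remote server 300. The RAID unit 312 is hot-swappable: a failed disk in the RAID unit 312 can be removed and replaced directly while the computer 300 is running, and the file system structures and other data on the replacement disk are set up automatically. In some cases, as indicated by the arrow from the RAID unit 312 to the server 300 in FIG. 3, the RAID unit 312 can be regarded as part of the server 300 or as connected to it in a conventional manner, for instance with dedicated replication software on the server 300.
The RAID unit 312 may instead be connected to both the remote replication unit 308 and the server 300 through the dual host connection shown in configuration 1400 of FIG. 14 and described in detail below. The dual host connection allows switchover from a first "normal replication" state to a second state. In the first state, the remote server 300 is passive, the remote RAID unit 312 or other remote disk subsystem serves only as a replica, and read requests are actively serviced by a local replica and/or the disks of the local host 200. In the second state, the remote server 300 actively services requests by reading data from the remote RAID unit 312 or other remote disk subsystem.
In the first "normal replication" state, the remote replication unit 308 receives data from the local replication unit 204 over a connection 206 such as an Ethernet and/or TCP/IP connection. As noted in connection with FIG. 2, the local link 202 may be a SCSI bus, USB, Fibre Channel, or similar connection. The remote replication unit 308 passes the data over a remote link 302 to the remote server 300 for subsequent storage on the hot-swappable RAID unit 312 or, if the dual host connection 1400 is used, sends it directly to the RAID unit 312. The remote link 302 may be, for example, a SCSI bus connection that makes the remote replication unit 308 appear to the remote server 300 to be a SCSI disk, which the remote server 300 can mirror to another "disk", namely the RAID unit 312. The remote link 302 may also be a serial, Ethernet, FDDI, USB, Fibre Channel, or other non-proprietary connection.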
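The two states of the dual host connection can be modeled as a small state machine. The Python sketch below is illustrative only: the names `Mode`, `DualHostRaid`, and `switch_over` are hypothetical, and the sketch assumes that the remote server simply declines to read while it is passive.

```python
from enum import Enum


class Mode(Enum):
    NORMAL_REPLICATION = 1   # remote server passive; RAID unit is a replica only
    REMOTE_ACTIVE = 2        # remote server services reads from the RAID unit


class DualHostRaid:
    """Hypothetical model of a RAID unit shared by two hosts."""

    def __init__(self):
        self.mode = Mode.NORMAL_REPLICATION
        self.blocks = {}

    def replicate(self, block_no, data):
        """Write arriving from the remote replication unit."""
        self.blocks[block_no] = data

    def server_read(self, block_no):
        """Read issued by the remote server; allowed only after switchover."""
        if self.mode is not Mode.REMOTE_ACTIVE:
            raise RuntimeError("remote server is passive during normal replication")
        return self.blocks[block_no]

    def switch_over(self):
        """Move from the normal replication state to the remote-active state."""
        self.mode = Mode.REMOTE_ACTIVE


raid = DualHostRaid()
raid.replicate(3, b"ledger")
raid.switch_over()
recovered = raid.server_read(3)
```

Because both hosts address the same RAID unit, no data needs to be copied at switchover; only the role of the remote server changes.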
本地复制单元204具有一类似或等同于(除了对储存于此的特定数据之外)远程复制单元小型缓冲区310的非挥发性缓冲区。本地服务器200经预确认后,置入本地复制单元204缓冲区内。对于主服务器200来说,第二“复制”写入操作只会以本地的方式进行。事实上,本地复制单元204已收到数据,并储存于该本地缓冲区内。该本地复制单元204存妥该扇区以及磁道更动数据(或是类似的区块层级数据),一直到本地复制单元204可安全地透过旅程链接206,传送资料给远程复制单元308。本地复制单元204的智能型缓冲区,可储存任何在旅程链接206上能够当地处理的资料。这些数据会被储存在本地复制单元204上,一直到远程复制单元308已成功地将其写入远程服务器300,并且送回一确认信号给本地复制单元204为止。当收到该确认信号后,本地复制单元204将已成功地传送的扇区/磁道/区块资料片段由本地非挥发性缓冲区消除掉。不同于传统系统,服务器200及300正好相反,都不需要标准档案系统以及操作系统所需的NLM或者是特地为资料复制所设计的软件。The
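FIG. 4 illustrates a system having a number of components described above, which are labeled with the same reference numerals as in the preceding figures. In the system of FIG. 4, however, a remote replication unit 408 includes both the small nonvolatile buffer 310 and a large nonvolatile buffer; the large buffer is implemented by a hot-swappable RAID unit 312 attached directly to remote replication unit 408. The small buffer 310 buffers data received over journey link 206, so that the data can be acknowledged back to local replication unit 204, and holds the data until remote replication unit 408 has stored it in the large buffer 312. No remote server is needed.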
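FIG. 5 illustrates systems in which two or more local servers 200 write to a remote replication unit 508. In this and the other figures, references to a local server 200 should generally be understood to include hosts 200 that are not servers. In other words, the invention may be used to replicate any host computer system 200 attached to a local replication unit 204. Servers are the most common example of hosts 200, but other examples include clusters of computers that are not servers, mainframes, storage area networks (SANs), and network attached storage (NAS) data sources. The local servers 200 or other hosts 200 may be physically separated from one another by less than ten miles, by at least ten miles, by up to hundreds of miles, and so on. In the system shown, every local server 200 in a given system runs on the same operating system and file system platform, but different systems according to FIG. 5 may use different platforms. For example, every server 200 in one system may be a Novell NetWare server, while in another system the servers 200 may be Microsoft Windows NT servers using the NT file system.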
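Each host 200 in the system is connected to its local replication unit 204 by a SCSI, Fibre Channel, USB, serial, or other standard storage subsystem or peripheral connection 202. The local replication units 204 are connected by journey links 206 to a single remote replication unit 508. Remote replication unit 508 has a SCSI, Fibre Channel, USB, or similar controller card for each local replication unit 204.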
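Data from a local replication unit 204 can be sent directly (that is, without passing through a remote server), over a SCSI, Fibre Channel, USB, or similar connection inside remote replication unit 508, to an individual hot-swappable RAID storage unit 312 in RAID group 512. The RAID storage units 312 may be physically external to at least a portion of remote replication unit 508, such as the portion containing an Ethernet card connected to journey link 206. Remote replication unit 508 is defined by function, however, not by packaging. In particular, unless stated otherwise (as in connection with FIG. 14), the RAID storage units 312 are considered part of remote replication unit 508. Each RAID storage unit 312 has a remote bootable volume, and data is written to it at the sector/track or block level. Remote replication unit 508 also includes a small buffer 310 for acknowledging and buffering data received over journey link 206.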
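FIG. 6 illustrates a system similar to that of FIG. 5, except that remote replication unit 608 writes data to external bootable storage volumes in a group 616. The local servers 200, which run on the same platform, actually write to the local replication units 204, which in turn write the data to remote replication unit 608. Remote replication unit 608 has a SCSI, Fibre Channel, USB, or similar controller card and a bootable storage volume 614 for each local replication unit 204. Data from each local replication unit 204 passes from remote replication unit 608, over a SCSI bus or other data line, directly to the corresponding storage volume 614. Each volume 614 is a remote bootable volume, and data is written to it at the sector/track or block level.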
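In alternative embodiments of systems that otherwise generally conform to FIG. 6 and the other figures, separate partitions may be used to hold the replicated data of the respective local servers 200, instead of holding that data on corresponding separate disks 614 (as in FIG. 6) or separate RAID storage units 312 (as in FIG. 5). In the various many-to-one systems, it may be necessary to start a process that forks itself for each new connection, and to use IPC or another mechanism to lock a replica volume against multiple concurrent replication attempts.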
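FIG. 7 illustrates a system in which the remote replication unit 708 includes both individual external storage volumes 614 and RAID units 312. Replicated data is stored by remote replication unit 708 on both storage subsystems 312, 614, to provide extra assurance that the data will be available when it is needed.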
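FIG. 7 also illustrates a system in which two or more local replication units 204 write all of the data replicated from their respective local servers 200 to one remote replication unit 708, under a single large storage volume (312, 614, or both, depending on the embodiment) mounted directly on remote replication unit 708, rather than dividing the replicated data among multiple remote storage units 312 or 614 as in FIGs. 5 and 6. The volume used by remote replication unit 708 has a partition for each local replication unit 204. Each partition provides a remote bootable "volume", and data is written at the sector/track or block level in the usual manner.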
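FIG. 7 also illustrates another system, in which the replicated data is divided among two or more storage units that are attached directly to remote replication unit 708, with a particular storage unit holding the replicated data of a given local replication unit 204. Here, however, a combination of external disks 614 and RAID units 312 is used, rather than RAID units alone (as in FIG. 5) or external disks alone (as in FIG. 6). For example, an external disk 614 may hold the data from a first local replication unit 204 while a RAID unit 312 holds the data from a second local replication unit 204. In such a system, remote replication unit 708 has a SCSI, Fibre Channel, USB, or similar controller card for each local replication unit 204, and data from a local replication unit 204 is sent directly (that is, without passing through a server such as server 300), over a SCSI, Fibre Channel, USB, or similar communication line, either to an individual external hot-swappable RAID storage unit 312 or to an external bootable disk drive 614.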
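FIG. 8 illustrates a system related to the one discussed in connection with FIG. 5. In the system of FIG. 8, however, the local servers 200 run on different platforms, as indicated by the reference numerals 822, 824, and 826. Of course, a system according to this or any other figure need not have exactly three local servers 200 and corresponding local replication units 204; counting a local server 200 and its local replication unit 204 as a pair, there may be two or more pairs. For example, one system according to FIG. 8 includes a Novell NetWare server 822 and a Microsoft Windows NT server 824, while another includes two Novell NetWare servers 822, 826 and one Microsoft Windows NT server 824.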
FIG. 9 illustrates a system related to those discussed in connection with FIGs. 6 and 8. Unlike the system of FIG. 6, the local servers 200 run on different platforms; unlike the system of FIG. 8, the remote replication unit is a unit 608 that uses a group 616 of external disk drives 614 rather than a group 512 of RAID units 312.
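FIG. 10 illustrates a system related to the one discussed in connection with FIG. 7. In a system according to FIG. 10, however, the local servers 200 run on different platforms. As with FIG. 7, in some systems the local replication units 204 may be mapped to partitions or to storage units. When mapped to partitions, a local replication unit 204 may be mapped to a partition in a RAID unit 312, to a partition on an external disk drive 614, or to a partition in a RAID unit 312 that is also replicated to an external disk drive 614. When local replication units 204 are mapped to storage units, one or more local replication units 204 may send their data through remote replication unit 708 to corresponding external disk drives 614, while one or more other local replication units 204 send their data through remote replication unit 708 to corresponding RAID units 312.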
FIG. 11 illustrates a system in which data is replicated to two or more remote locations. Such a system may be viewed as the converse of the systems of FIGs. 5 through 10: those figures show "many-to-one" replication systems (more than one local server replicated to a remote location), whereas FIG. 11 shows a "one-to-many" replication system (one local server replicated to more than one remote location). In general, the local replication units 204 replicate the same data, but using multiple local replication units 204 allows replication to continue without interruption over at least one journey link 206 even if a particular local replication unit 204 becomes unavailable. The local links 202 may all use the same kind of connection or different kinds; for example, one local link 202 may be a SCSI connection while another is a USB connection. The journey links 206 may likewise be uniform or varied. Similarly, the remote replication units may all have the same components (for example, each may use a RAID unit 312), or different components may be used at different locations.
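FIG. 12 illustrates a system in which, as in the system discussed in connection with FIG. 11, data is replicated to two or more remote locations. The local replication unit 204 of FIG. 12, however, is a multi-port replication unit. That is, it can be connected to more than one journey link 206 at a time, much as a conventional multi-port server maintains simultaneous connections. The multi-port replication unit 204 sends the data from host 200 over each active link, thereby helping to replicate host 200 to multiple remote locations that may be miles apart from one another. The multi-port local replication unit 204 needs only one local buffer and, like the replication units 204 in other systems, may optionally include a full local replica 230.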
More on Replication Units
The components and operation of replication units are discussed above in connection with FIGs. 2 through 12. The additional details provided below do not necessarily apply to every replication unit in every system according to the invention, but they help show how replication units give more flexibility to the people and enterprises responsible for ensuring that data is properly replicated.
At least some replication units can reliably emulate disk drives attached over SCSI, Fibre Channel, USB, or similar connections to the standard server drivers running on Novell NetWare and/or Microsoft Windows NT platforms. SCSI, Fibre Channel, USB, or similar emulation under other operating systems may also be provided.
Each local or remote unit is preferably configured to support I/O through a monitor, keyboard, and mouse plugged into it. Some replication units have a network address, or otherwise allow a network administrator to reach a particular replication unit on an adapted network 100 through a browser on a remote workstation 116 or by other means.
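The replication units preferably support the Simple Network Management Protocol (SNMP). Network administrators have remote access to both local and remote replication units. The replication unit 204 software provides an interface to a utility that monitors status. In particular, each local replication unit 204 acts as a network agent: it can track the number of reads and writes to local server 200, the status of each local server 200, the number of restarts or warm boots of each local server 200, and so on, and it generates SNMP traps when necessary. Local replication unit 204 may also provide network administrators with the following data: the number of blocks currently in buffer 210, an alert when buffer 210 overflows or exceeds a specified threshold, the number of blocks sent since server 200 started, and the number of blocks received since server 200 started.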
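Some local replication units 204 also provide optional dial-up support. If a customer is using the replication unit 204 over a dial-up connection and does not want to stay connected continuously, the unit 204 provides an option to send data over journey link 206 at specified times. Local replication unit 204 may also be configured not to send data to adapted network 100, or to another part of journey link 206, during periods of heavy traffic. The buffer 210 in local replication unit 204 should be large enough to hold the data received from local server 200 during these non-transmission periods.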
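The policy of withholding transmissions during configured periods can be sketched as a simple time-window check. This is an illustrative sketch only; the function name `may_transmit` and the representation of blocked periods as `(start, end)` pairs of `datetime.time` values are assumptions, not part of the described units.

```python
from datetime import time


def may_transmit(now, blocked_windows):
    """Return True unless `now` falls inside a blocked (start, end) window.

    Windows are half-open [start, end) and may wrap past midnight,
    e.g. (22:00, 02:00).
    """
    for start, end in blocked_windows:
        if start <= end:
            if start <= now < end:
                return False
        elif now >= start or now < end:  # window wraps past midnight
            return False
    return True
```

While `may_transmit` returns False, writes from the host would simply accumulate in the local buffer, which is why that buffer must be sized for the longest expected non-transmission period.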
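In general, local replication unit 204 preferably matches the performance of a high-speed RAID disk subsystem in terms of data transfer speed, reliability, and compatibility with existing server 200 platforms. Because an implementation consisting primarily of software is unlikely to meet these performance goals, local replication unit 204 preferably includes special-purpose hardware, together with any necessary firmware. Suitable software and hardware can be designed and built by those of skill in the art with particular attention to the following: conventional replication path 104; SCSI, Fibre Channel, USB, or similar controllers; individually known subsystems such as buffers 210, 212, 310, disks 614, and RAID units 312, and their interfaces; software such as FreeBSD drivers; Ethernet and individually known network interface cards (NICs); network protocols such as Ethernet and TCP/IP; the descriptions and examples given here; and other tools and techniques that are available now or become available later to such persons.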
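In general, writes to local replication unit 204 are acknowledged and written to local buffer 210, and they may also be written, over conventional path 104 or another path, to a full local replica volume 230, even though such local replication is not shown explicitly in FIGs. 3 through 12. For performance, it is generally acceptable to buffer writes in a RAM cache in local replication unit 204, in local server 200, or in both. In particular, this may be done by taking advantage of the cache of an existing hardware RAID unit 312 or another SCSI, Fibre Channel, USB, or similar cache. Reads directed to local replication unit 204 are generally serviced with the appropriate data from local replica 230.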
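When local replication unit 204 comes back online after a crash, a reboot, or another interruption, it automatically begins sending data from local buffer 210 to the remote replication unit 208, 308, 408, 508, 608, or 708. Local replication unit 204 must not issue a SCSI, Fibre Channel, USB, or similar device reset, so that it does not disrupt the operation of host 200. Data written to local replication unit buffer 210 is sent to the remote replication unit over the network or other journey link 206 in first-in, first-out order. This may be done using TCP/IP or another journey link protocol. The remote replication unit preferably maintains a complete, consistent replica, so that the remote volume is usable and can be mounted by an operating system at any time regardless of the synchronization state of the replica.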
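At least when FreeBSD-based software is used, kernel panics in local replication unit 204 should preferably not occur unless the underlying replication hardware or software has failed. Neither a configuration error in local replication unit 204 nor any behavior of host server 200 should bring the system down. Where possible, the replication unit software should be reconfigurable without a reboot, and each software change should preferably carry a unique version number. The software therefore preferably rereads all of its initialization information and configuration in response to a system call that an administrator can issue, without the replication unit interrupting data processing. Host server 200 must not be interrupted. Preferably, local replication unit 204 accepts writes from host server 200 whether or not the remote replication unit is online, and whether or not bandwidth is available on the network or other journey link 206, unless local buffer 210 has overflowed.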
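If local buffer 210 has overflowed, local replication unit 204 preferably continues to maintain local replica 230 (if present) and preferably continues to dequeue entries from the circular queue of local buffer 210. However, local replication unit 204 preferably stops adding entries to the queue until a user-level process (normally run by an administrator) tells it to begin accepting entries again. Preferably a system call, rather than a reboot, allows a user-level process to disable or re-enable the local buffer 210 queue.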
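The overflow policy just described (keep draining, stop enqueuing, resume only on an explicit request) can be sketched as follows. The class name `PausableQueue` and its methods are hypothetical and stand in for whatever mechanism an actual unit would use; the explicit `resume` call plays the role of the administrator's request.

```python
from collections import deque


class PausableQueue:
    """Bounded FIFO that stops accepting entries once it overflows.

    Hypothetical sketch: after an overflow, enqueue() keeps failing, even
    if space frees up, until resume() is called explicitly.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.accepting = True

    def enqueue(self, item):
        if not self.accepting:
            return False
        if len(self.items) >= self.capacity:
            self.accepting = False  # overflow: pause until told to resume
            return False
        self.items.append(item)
        return True

    def dequeue(self):
        """Draining continues whether or not enqueuing is paused."""
        return self.items.popleft() if self.items else None

    def resume(self):
        """Stand-in for the administrator's request to accept entries again."""
        self.accepting = True
```

Keeping the queue paused until an explicit `resume` reflects the fact that, once a write has been dropped, the remote replica is out of sync and needs operator attention rather than silent recovery.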
The replication units preferably detect the loss and restoration of bandwidth on the network or other journey link 206 automatically. For example, if the Ethernet cable of the local replication unit is unplugged and plugged back in the next day, then as long as local buffer 210 has enough room to hold the data changes that accumulate while local replication unit 204 is disconnected, no data should be lost and no intervention by a network administrator should be needed.
Monitoring software for a replication unit, or for a unit associated with it, can preferably determine whether the system was shut down cleanly after its previous boot, so that the software can assess the likelihood that the remote replica is out of sync. When power is lost, local replication unit 204 preferably loses as little data as possible. Some replication units may therefore include an uninterruptible power supply (UPS). It may be assumed that, when power fails, there is time to flush writes buffered in RAM to the local replica (if present) and/or to local buffer 210.
In one embodiment, the replication unit operating system (i.e., FreeBSD) boots from a hard disk mounted read-only, to avoid problems with FreeBSD's own file system. Configuration data is written to a smaller partition and can be recovered in either of two ways: from the information held by the peer replication unit, or by sending an SNMP alert stating that the replication unit has lost its configuration data and will remain offline until the data is reinstalled. The alert can be used when the peer replication unit cannot be reached. Some embodiments also avoid initializing the controller card, because a disk drive would not perform such an operation on its own; this avoids problems such as bus resets. In addition, if the replication unit's buffer has overflowed, the unit preferably just acknowledges writes and replicates them locally, while sending an alert that the buffer has overflowed and that the remote replica is no longer synchronized with the local replica.
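As noted above, a cold boot of local replication unit 204 preferably does not affect host system 200 if that can be avoided, particularly with respect to SCSI, Fibre Channel, USB, or similar handshaking. The buffer 210 of the local replication unit preserves the order of write requests, and local replication unit 204 sends them to the remote replication unit in the order in which they were received, so that data consistency is maintained at all times.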
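The remote replication unit receives TCP protocol data units (also called TCP packets) from local replication unit 204 and writes them to a disk subsystem (such as external disk 614 or RAID unit 312), so that the disk is, at least logically, a block-for-block copy of local replica 230, if there is one, and of the host 200 volume as it stood at an earlier time. The replicated data may be out of date, but it must remain consistent.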
For data recovery, the remote replication unit software preferably has a user-level interface: a program that can disable or re-enable reads, writes, and/or seeks of the remote replica by the replication unit software, so that the remote disk subsystem, and hence the replicated data, can be accessed by a second SCSI host on the same chain. At the remote site, the remote replication unit and a backup host server are both attached to the shared disk subsystem. For example, the remote replication unit may use SCSI ID 6 while the remote server used for recovery uses SCSI ID 7. While the remote replication unit is replicating, the remote host does not mount the shared disk. For data recovery, as part of a switchover, the remote replication unit stops accessing the shared disk and lets the backup host server mount it.
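The remote replication unit can preferably report to the user level the total number of blocks received from local replication unit 204. The remote replication unit copies data to the disk subsystem in such a way that the volume can be mounted by a host system under the same operating system as the local server 200 that produced the local volume. If the remote replication unit receives from local replication unit 204 a request to write to logical block number N, it writes the data to logical block number N of the remote replication unit's disk subsystem 312 or 614. To preserve data consistency, write requests from local replication unit 204 are written to the remote disk subsystem 312 or 614 in the order in which local replication unit 204 received them.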
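The two rules above (same logical block number, same order) can be illustrated with a small sketch in which a `bytearray` stands in for the remote disk subsystem. The function name `apply_writes` and the 512-byte block size are assumptions made for the example.

```python
BLOCK_SIZE = 512  # assumed block size for this sketch


def apply_writes(disk, requests):
    """Apply (lbn, data) write requests to `disk` strictly in arrival order.

    Each block lands at the same logical block number it had on the local
    volume, so a later write to an LBN overrides an earlier one.
    """
    for lbn, data in requests:
        offset = lbn * BLOCK_SIZE
        if len(data) != BLOCK_SIZE:
            raise ValueError("each write must be exactly one block")
        if lbn < 0 or offset + BLOCK_SIZE > len(disk):
            raise ValueError("logical block number out of range")
        disk[offset:offset + BLOCK_SIZE] = data
    return disk
```

Because the requests are replayed in order, the remote volume always corresponds to some earlier state of the local volume, which is what makes it mountable even when it is out of date.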
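Communication between local replication unit 204 and the remote replication unit over journey link 206 may use TCP, because of its error recovery and guaranteed delivery. The remote replication unit software may act as a TCP server, with local replication unit 204 acting as a client of the remote unit. Loss of network bandwidth or of the connection preferably interrupts neither local replication unit 204 nor the remote replication unit. Likewise, data recovery at the remote location preferably does not interrupt local replication unit 204. If the connection between local replication unit 204 and the remote replication unit times out or is dropped, local replication unit 204 preferably keeps trying to reconnect until the connection is re-established. Local replication unit 204 then preferably continues sending replicated data from the point at which it was interrupted, or otherwise resumes normal operation.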
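The reconnect-and-resume behavior can be sketched independently of any real network code. In the sketch below, `send` is a caller-supplied function (an assumption of the example) that returns only after the remote unit has acknowledged a block and raises `ConnectionError` when the link is down; the bounded `max_attempts` is also an illustrative simplification, since the text describes retrying until the connection is restored.

```python
def send_with_resume(blocks, send, max_attempts=5):
    """Send blocks in order, resuming at the first unacknowledged block.

    Returns the number of blocks the remote side has acknowledged.
    A real unit would wait between attempts and retry indefinitely.
    """
    next_index = 0
    attempts = 0
    while next_index < len(blocks):
        try:
            send(blocks[next_index])  # returns only after acknowledgment
            next_index += 1
            attempts = 0
        except ConnectionError:
            attempts += 1
            if attempts >= max_attempts:
                break  # give up for now; the caller may try again later
    return next_index
```

Since the index advances only after an acknowledgment, an interruption never skips a block, and the order of delivery matches the order of the original writes.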
The novel replication units are more intelligent than the original Off-SiteServer product because they run a modified operating system based on the FreeBSD UNIX operating system. One modification is a changed driver for the Qlogic SCSI controller that makes the card act as a SCSI target rather than a host, so that it can emulate a disk drive; other controllers with suitable drivers could also be used. The boot process was also modified to display a replication unit configuration utility on the console in place of a prompt, and the operating system kernel was recompiled. At the source, each replication unit 204 runs an operating system that makes it completely independent of host server 200. One of the flexible replication features provided here, therefore, is that replication unit 204 needs no initialization or connection software on host server 200 (in the original Off-SiteServer product, that software took the form of a Vinca NLM).
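Another difference is that the replication unit 204 operating system emulates a SCSI or other standard disk or data access point. As a result, replication unit 204 can be mounted, for example, as a SCSI replica disk under any operating system that supports SCSI, including at least Microsoft Windows 95, Microsoft Windows 98, Microsoft Windows NT, Novell NetWare, FreeBSD, and Linux. The disk emulation preferably applies to every standard disk operation (at least from the point of view of server 200), including not only disk reads and writes but also requests from server 200 to format the disk, partition the disk, and run whole-disk checks such as ScanDisk.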
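A system according to the invention can also maintain a full replica volume 230 locally for fault tolerance. Because replication is performed by forking the data (that is, writing it twice) below the software emulation layer of replication unit 204, replication unit 204 can maintain local volume 230 alongside a sequential buffer of data changes. This lets replication unit 204 service local reads by server 200 without undue delay, so that the system can operate without triggering a disk failure condition and without split-seek software, which eliminates potential software compatibility problems. It also lets the novel system restore replicated data to the local disk of server 200 through a local disk replication operation, without going over journey link 206. In addition, if local replica 230 is maintained, local replication unit 204 does not need a spoof packet generator to pre-acknowledge writes to host 200, because local replica 230 is not subject to the delays and risks associated with sending replicated data over journey link 206.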
A replication unit according to the invention generally includes operating system software. At least some replication units can therefore run multiple "host" applications to manipulate the replicated data they have obtained. The system can also be scaled up or down through drivers and/or suitable software and/or hardware to meet the needs of a particular environment. Processing can be spread across multiple processors, SCSI cards, and/or other "intelligent" devices to handle more operations and heavier loads. Likewise, a system can be scaled down to reduce cost while still meeting the needs of a lower-performance environment. With suitable software, local replication unit 204 can operate as a standalone intelligent disk subsystem, or can emulate the host 200 operating system for failover as a local fault-tolerance measure. If the host 200 intelligent disk subsystem fails, local replica volume 230 provides local fault tolerance by serving as a local replacement.
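The system maintains consistency and availability at the remote location in part through an intelligent buffer 210 that holds and sends data on a first-in, first-out basis. Data is thus sent to the remote location in the same order in which it was received through the emulation layer of local replication unit 204. Because packetized data does not necessarily arrive at its destination in the order in which it was sent, sequence numbers and/or timestamps may also be used.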
Some embodiments use the circular buffer and other measures described below to protect data in the event of a shutdown. In addition to the Qlogic card that serves as the disk target emulator, the local replication unit has two disk systems attached through a local SCSI disk controller. One disk holds the unit's operating system (i.e., FreeBSD) together with the associated utilities and the replication unit management software; this disk also serves as the buffer 210 disk. The other disk system attached to the replication unit is at least as large as the disk of the host 200 being replicated and serves as the local replica of the host 200 disk.
SCSI data is read from the Qlogic card and evaluated in the kernel as read and write requests. Read requests from the Qlogic card are preferably serviced from local replica disk 230 and need not travel over network 206. Write commands are copied directly to local replica disk 230 as quickly as possible and acknowledged to host system 200 (though not necessarily pre-acknowledged), and at the same time they are added to the circular buffer on the buffer disk or in nonvolatile RAM.
Each time a block is written to the circular buffer, two blocks are actually written in sequence: the data block that will be transmitted, and a block containing a timestamp and the current tail pointer of the queue, optionally with further data such as the LBN (logical block number). The latter block is the "meta-data" block. This approach is not space-efficient, but it reduces the number of disk reads and writes needed to maintain the queue pointers. If RAM is available, the queue pointers can also be maintained by keeping at least one copy of them in nonvolatile RAM, or by placing the entire circular buffer there. One way to save both space and time is to write data to the circular buffer in larger chunks, accumulating blocks in a memory buffer until there are enough to justify a write. A single meta-data block can then serve several data blocks, which reduces the number of disk writes and saves disk space.
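The batching idea can be sketched as follows, with a plain Python list standing in for the on-disk circular buffer. The names `Meta` and `flush_batch` and the exact contents of the meta-data record are assumptions of the sketch; the text only requires a timestamp, the current tail pointer, and optionally data such as LBNs.

```python
import time
from dataclasses import dataclass


@dataclass
class Meta:
    """Hypothetical meta-data record shared by one batch of data blocks."""
    timestamp: float
    tail: int
    lbns: tuple


def flush_batch(ring, pending, tail, now=None):
    """Append buffered (lbn, data) blocks, then ONE shared meta-data record.

    Returns the number of records written. With a batch of k blocks this is
    k + 1 writes, versus 2 * k when every block carries its own meta-data.
    """
    if not pending:
        return 0
    stamp = time.time() if now is None else now
    for _lbn, data in pending:
        ring.append(data)
    ring.append(Meta(stamp, tail, tuple(lbn for lbn, _ in pending)))
    written = len(pending) + 1
    pending.clear()
    return written
```

The trade-off is that blocks held in memory between flushes are exposed to loss on power failure, which is one reason the text pairs this option with nonvolatile RAM or a UPS.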
After a shutdown and restart, the head of the queue is found by searching the meta-data blocks for the most recent timestamp, and that meta-data block is then used to locate the tail pointer. This can be done with a binary search, for example. Because the buffer is circular, transmitted data need not actually be removed from the buffer (that is, deleted or zeroed); incrementing the tail pointer is enough. A full buffer is detected when the head pointer is one less than the tail pointer. The pointers refer to positions in the circular buffer, not to the data values in it (the buffer is an array, not a linked list).
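A minimal sketch of the restart search and the full-buffer test follows. To keep the binary search well defined, the sketch assumes each slot's meta-data carries a strictly increasing sequence counter rather than a timestamp, that writing starts at slot 0, and that never-written slots are marked with -1; all of these, and the function names, are assumptions of the example rather than details from the text.

```python
def find_newest(meta):
    """Locate the most recently written slot in a circular buffer.

    `meta` is a list of (seq, tail) pairs, one per slot, with seq == -1
    for slots never written. Returns (head_index, tail) or None if empty.
    """
    if not meta or meta[0][0] < 0:
        return None
    first = meta[0][0]
    lo, hi = 0, len(meta) - 1
    # Slots with seq >= first form a prefix; the newest slot is its last entry.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if meta[mid][0] >= first:
            lo = mid
        else:
            hi = mid - 1
    return lo, meta[lo][1]


def is_full(head, tail, size):
    """The buffer is full when the head is one slot behind the tail (mod size)."""
    return (head + 1) % size == tail
```

The search inspects about log2(N) meta-data blocks instead of all N, which matters when the buffer is as large as the replicated volume.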
It may not be necessary to keep a full 64-bit timestamp, because knowing the most recent second before the system went down is enough to determine the position of the last block written. For example, suppose four blocks are written in the same second and carry the same timestamp. Because the queue is ordered, the last block bearing that timestamp is the last one written. If timestamps consume too much computing resource, a simpler counter may suffice, although it would wrap around sooner than the year 2038. The size of the queue buffer depends on the end user's rate of data change and on the length of network 206 outage the customer wants to be able to withstand. The queue buffer may be as small as a few hundred megabytes or as large as the host volume being replicated.
The buffer size has no inherent minimum or maximum, and when a high rate of data change and frequent, lengthy outages on journey link 206 are expected, the buffer may need to be even larger than the host volume being replicated.
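A separate process, which may run at user level or system level, reads blocks from the circular buffer and sends them over network 206 to the remote replication unit. The transmitting process can notify the queuing process at any time of its current pointer position, and it can examine timestamps to determine when the queue is empty. It is acceptable for the tail pointer stored in the meta-data to be slightly out of date: in the worst case, when the system restarts it retransmits some blocks that it had already sent, which is tolerable as long as the number of retransmitted blocks does not become excessive. Preferably, the transmitting process can also determine the number of blocks sent since the server started. In some cases it may be assumed that the buffer can hold the entire host disk capacity. Under a "do no harm" philosophy, it is preferable not to risk degrading SCSI bus performance; data that cannot fit into an overflowed buffer is simply dropped, and a user-level process monitoring for that event is notified.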
To reduce the number of blocks that must be sent, the system can check each write against the local replica and add it to the circular buffer only if the data differs, which avoids queuing idle writes. This can be done by maintaining a hash table of checksums for every LBN on the disk; the trade-off is the processor time spent computing checksums and the memory or additional disk operations required.
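One way to picture the checksum table is the sketch below, which keeps one digest per LBN and reports whether a write actually changes the block. The class name `WriteFilter` and the choice of SHA-256 are assumptions of the example (the text says only "checksum"); a short checksum would be cheaper but could, on a collision, wrongly suppress a real change.

```python
import hashlib


class WriteFilter:
    """Hypothetical per-LBN digest table used to skip idle (no-change) writes."""

    def __init__(self):
        self._digests = {}  # lbn -> digest of the last data written there

    def should_queue(self, lbn, data):
        """Return True if `data` differs from the last write seen for `lbn`."""
        digest = hashlib.sha256(data).digest()
        if self._digests.get(lbn) == digest:
            return False  # identical rewrite: no need to send it again
        self._digests[lbn] = digest
        return True
```

In this form the table costs one digest per block that has ever been written, which is the memory side of the trade-off mentioned above.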
Overview of Methods
FIGs. 13 and 15 illustrate methods of the invention for remote data replication. Some methods include steps for installing replication units; for simplicity, these are shown collectively as an installing step 1300. For example, when any of the systems of FIGs. 2 through 12 is installed, system integrators, replication equipment vendors, and administrators may be authorized to perform some or all of step 1300. Other methods of the invention include steps for transmitting data to one or more replication units; for simplicity, these are shown collectively as a transmitting step 1302. The transmitting steps may be performed with test data, under an installer's authorization, as part of installing step 1300, but they may also be performed routinely with mission-critical data at the request of ordinary users of a system according to the invention.
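During a connecting step 1304, at least one server 200 is connected to at least one local replication unit 204. As noted above, the connection may take the form of a SCSI, Fibre Channel, USB, or other standard disk subsystem bus. Because local replication unit 204 emulates a disk subsystem, the connection made in step 1304 is essentially the same as connecting a disk subsystem to server 200, at least from the point of view of server 200. In particular, no special NLM or other replication software needs to be installed.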
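During a connecting step 1306, at least one local replication unit 204 is connected to at least one corresponding journey link 206. Depending on the circumstances, this may involve a variety of operations. For example, if journey link 206 includes a local area network, local replication unit 204 can be connected to that network like any other node; SNMP support may also be installed. If journey link 206 includes a dial-up connection originating at local replication unit 204, the dial-up parameters are configured. Likewise, if journey link 206 includes a dedicated private line such as a T1 line, similar operations are used to make the connection.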
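During a connecting step 1308, at least one remote replication unit 208, 308, 408, 508, 608, or 708 is connected to at least one corresponding journey link 206. This can generally be done in the same way that local replication unit 204 is connected in step 1306. In particular embodiments, however, the remote replication unit acts as a TCP server and local replication unit 204 is its client. In such embodiments, connecting step 1306 connects a TCP client and connecting step 1308 connects a TCP server.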
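During a testing step 1310, tests are run on the replication units. The tests may include, for example: comparing the throughput of local replication unit 204 with that of a RAID unit; copying data back from the remote location to the local location; putting incorrect configuration information into local replication unit 204 and then correcting it; rebooting local replication unit 204; cutting journey link 206; interrupting power to local replication unit 204; interrupting power to the remote replication unit; overflowing buffer 210 of local replication unit 204; and other tests. In particular, and without limitation, testing step 1310 may involve running one or more of the tests in the "Test Suite" section of this document. Testing step 1310 also involves transmitting data as described below in connection with step 1302, but for simplicity testing is shown as a single separate step in FIG. 13.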
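Transmitting step 1302 may include a transmitting step 1312, in which server 200 sends data to local replication unit 204 over a standard bus. This is possible because, unlike conventional path 104, the invention provides a replication unit that can emulate a disk or RAID subsystem.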
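During a transmitting step 1314, the replicated data is sent over journey link 206. As noted above, this may be done over a dedicated link as with conventional path 104, but it may also be done using Ethernet and/or TCP and/or other open standard protocols, together with the associated conventional network infrastructure such as local area networks and/or the Internet.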
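In some embodiments, the replicated data is timestamped by local replication unit 204 in order to maintain a sequential record of which data blocks were replicated earlier and to tie the data to a particular point in time. This is combined with remote and/or local data storage large enough to hold one or more replica volumes plus snapshots of incremental sector/track/block-level changes to those volumes, rather than just the current replica volume. In a preferred embodiment, only one snapshot is needed. That single snapshot provides a baseline, and subsequent changes are recorded in a journal, so that the state of the volume at any desired point in time (subject to the time granularity of the journal) can be recovered. The journal may be of arbitrary size, with storage added as needed to hold it, or it may be designed as a fixed-size FIFO circular buffer in which old entries are overwritten by new ones once the journal fills. In general, suitable re-replication software, together with the snapshot and (if necessary) the incremental changes, can later be used to restore the replica volume as it existed at a particular earlier point in time.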
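Point-in-time recovery from one baseline snapshot plus a journal can be sketched as a replay up to the requested time. In this illustration a volume is modeled as a dictionary from LBN to block contents and the journal as a list of `(timestamp, lbn, data)` entries in write order; these representations and the name `restore` are assumptions of the example.

```python
def restore(snapshot, journal, at_time):
    """Rebuild the volume as it stood at `at_time`.

    `snapshot` maps lbn -> data at the baseline; `journal` lists
    (timestamp, lbn, data) changes in the order they were written.
    """
    volume = dict(snapshot)  # never modify the baseline itself
    for stamp, lbn, data in journal:
        if stamp > at_time:
            break  # later changes are not part of the requested state
        volume[lbn] = data
    return volume
```

With a fixed-size circular journal, the same replay works only for times no earlier than the oldest entry still in the journal, since overwritten entries can no longer be replayed.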
During a transmitting step 1316, the replicated data is sent to a serverless replication unit. This configuration is illustrated in FIG. 2. A remote replication unit is not a conventional server, even though it shares hardware and functionality with one. A server provides more general functionality than a replication unit; a replication unit concentrates on efficiently providing essentially continuous, near-real-time remote data replication. In obtaining data over journey link 206, a remote replication unit behaves much like a remote replication server, but in other respects it behaves much more like a mounted disk. In particular, to a second server, if one is attached, the remote replication unit looks like a disk or RAID unit. If data must be copied back, the remote replication unit can return it to local server 200 over journey link 206 without any second server.
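After data has been sent from local replication unit 204 to the destination remote replication unit, the remote replication unit may process it in any of several ways. For example, the remote replication unit may simply convert the received packets into data blocks that are written to a single external disk drive 614. It may convert the received packets into data blocks and write them to an internal disk subsystem and/or disk partition. It may receive the packets, convert them into disk data blocks, and then use internal striping (RAID) software to stripe the data across several disks of a "non-intelligent" disk subsystem, which together act as RAID unit 312 in the form of an external data subsystem. The same conversion from packets to disk blocks to striped (RAID) data, with storage on an external non-intelligent disk subsystem, may instead be handled by a hardware controller card and its drivers. The remote replication unit may also write to an external intelligent RAID subsystem 312, in which case disk blocks are written to the subsystem as a data stream and striped by the intelligent RAID subsystem.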
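Instead of immediately writing received data to storage unit 312 or 614, the remote replication unit may first write the data to a remote buffer and return to the local replication unit an ACK that carries some form of "signature" of the data (such as a checksum or CRC value). Depending on whether the signature checks out, the local replication unit then responds with either an ACK-ACK or an ACK-NAK; only on receiving an ACK-ACK from the local replication unit does the remote replication unit take the data from the remote buffer and write it to the remote replica. In such an embodiment, if the remote replication unit receives an original signature from the local replication unit along with the data, the remote replication unit can itself reject the original transmission when that signature does not verify.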
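The signed-acknowledgment exchange can be sketched in a single process, with the network replaced by direct method calls. The names `RemoteUnit` and `replicate`, the `corrupt` parameter used to simulate damage in transit, and the use of CRC-32 as the signature are all assumptions made for illustration.

```python
import zlib


class RemoteUnit:
    """Hypothetical remote side: stage first, commit only after ACK-ACK."""

    def __init__(self):
        self.staged = {}   # seq -> (lbn, data) awaiting confirmation
        self.replica = {}  # lbn -> data committed to the remote replica

    def receive(self, seq, lbn, data):
        self.staged[seq] = (lbn, data)
        return zlib.crc32(data)  # signature carried back with the ACK

    def confirm(self, seq, ok):
        lbn, data = self.staged.pop(seq)
        if ok:  # ACK-ACK: commit; ACK-NAK: discard the staged copy
            self.replica[lbn] = data
        return ok


def replicate(remote, seq, lbn, data, corrupt=None):
    """Local side: send, compare signatures, then answer ACK-ACK or ACK-NAK."""
    arrived = data if corrupt is None else corrupt  # simulate link damage
    signature = remote.receive(seq, lbn, arrived)
    return remote.confirm(seq, signature == zlib.crc32(data))
```

After an ACK-NAK the local unit would still hold the block in its own buffer and could simply send it again, since nothing is deleted locally until a transfer is confirmed.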
Data may also be acknowledged in other ways. For example, the remote replication unit and the local replication unit may be treated as endpoints rather than as subsystems of one another. In that case, on the remote side the ACK is generated by the remote replication unit itself (perhaps from its cache), and on the local side the ACK is generated by the local replication unit itself (preferably from its cache). The ACK that the local replication unit sends to the host need not wait for an ACK from the remote replication unit; it requires only the local end of the journey link. The local replication unit must still be careful to wait for an ACK from the remote replication unit before deleting data blocks from the local buffer, but this can happen after the host has been acknowledged.
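If the system also has at least one second server, additional steps are possible. For example, the remote replication unit may relay the data directly to remote server 300 through the server's network operating system. That operating system may be active or passive. In either case, data received over connection 302 can be written through the operating system of server 300 to an internal local disk subsystem. This approach requires specific software for each operating system used at the remote location. The remote replication unit may also use an Internet-based data window to send and receive data between the remote replication unit and second server 300. The data window may be provided by a plug-in extension to a browser interface, or by an Internet component extension to the core operating system such as a Microsoft ActiveX extension.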
In any of the scenarios above, the local replication unit may have enough "intelligence" to relay replicated data to a single remote replication unit or to several. A "one-to-many" system such as that of FIG. 12 has three remote replication units connected by respective journey links 206 to a single multi-port local replication unit 204, and multi-port replication units may be used in the same way, alone or together with single-port replication units, in other systems according to the invention. There is no hard limit on the number of remote replication units in a given system.
A remote replication unit may also relay replicated data to a nearby replication unit, and/or to another, more distant remote replication unit for fault tolerance. A remote replication unit may act as a head end for two or more further remote replication units, suitably supervising the sequential consistency and integrity of the replicated data, in order to balance load and provide fault tolerance. N remote replication units may be connected to one another and share the same network address or Domain Name System (DNS) name to provide still more fault tolerance. The approaches described above may of course also be combined.
In embodiments that have one or more separate, fully independent remote disk subsystems attached to the remote replication unit, the remote replication unit acts, for example, as a SCSI master and writes data to the remote disk. If a second server 300 is present, that server 300 follows the remote replication unit and the remote disk subsystem on the SCSI chain. During replication, the second server 300 is normally a slave and/or passive. If the replicated local server 200 fails, remote server 300 mounts the external volume and becomes the SCSI master. At the same time, the remote replication unit unmounts the drives of its remote disk subsystem and becomes passive.
In particular, this can be done with a configuration like the one in FIG. 14, which includes a "dual host" connection 1400. In many conventional arrangements, only one host adapter is active on a SCSI chain, normally set to LUN7. At power-up or reset, the host polls all of the other LUNs to determine which devices are attached. If the system uses an adapter that supports dual hosting, the second host is normally set to LUN6 and resets or polls only LUN0 through LUN5. LUN7 can thus be regarded as primary and LUN6 as secondary. In either case, when connected as shown in FIG. 14, both hosts can access the lower-numbered target devices.
Dual host connections are not new in themselves. In particular, dual host connections using BusLogic EISA cards and Novell NetWare servers are known. However, because a Novell server cannot refresh its file allocation tables on demand, the functionality offered by a dual host connection cannot be used in that setting. General information about dual host connections is available from public sources, including an online SCSI FAQ. Without a dual host connection, remote server 300 needs a driver NLM and/or other replication-specific software so that it can receive replicated data directly from the remote replication unit and store it for later use.
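In embodiments according to the invention that use the dual host configuration 1400, the remote replication unit 208, 308, 408, 508, 608, or 708 controls the RAID unit 312 or other remote disk subsystem until it is commanded to stop so that a switchover can be performed. Until then, the remote replication unit performs remote data replication and, as described herein, acts as the SCSI master that sends data to the RAID unit 312. Meanwhile, the Novell or other secondary server 300 remains passive. This prevents the damage that could result, in the "two-into-one" arrangement of FIG. 14, if the server 300 and the remote replication unit both wrote to the RAID unit 312 or other remote disk subsystem at the same time.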
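To perform the switchover, the remote replication unit dismounts the RAID unit 312 drives and the server 300 mounts them. The server 300 then becomes the SCSI master. Because the SCSI adapter selection of the secondary server generally cannot be predetermined or forced, it is preferable for the remote replication unit to hold the second host position (LUN 6). When both machines are started, the remote replication unit may see a second reset as the drives power up. This is normal, but the remote replication unit should be able to recover at the device drive level. Note that with the dual host (not dual channel) approach, the wiring is an ordinary terminated SCSI cable chain; no additional hardware is needed. The switchover can be done entirely in software, by dismounting and mounting storage subsystems and/or disk drives and by related operations.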
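The foregoing discussion assumes a one-to-one relationship between the remote replication unit and the secondary server 300. However, a software or mechanical SCSI switch, for example, can be used to connect the remote replication unit to multiple potential host servers 300. In protocols such as Fibre Channel and/or in SAN architectures, the traditional SCSI master/slave relationship does not exist. Instead, there is an addressing relationship expressed through DNS and/or numeric addresses. In such systems, the switchover can be performed by changing addresses while the remote replication unit remains passive.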
The remote replication unit can be configured to run a complete network operating system. In the event of a disaster, the remote replication unit becomes active and acts as a fully functioning server for the information on the disk subsystem to which the replicated data was sent. The remote replication unit can also run an emulator that emulates a server under the particular host operating system used at the local site. Alternatively, the remote replication unit can run a program that shuts down the operating system and any related programs used for replication and then restarts under the particular host operating system from a separate internal disk or partition.
The remote replication unit can also be enhanced to serve continuously as a secondary server rather than being dedicated to data replication. However, doing so would seriously degrade replication performance and increase the risk of replication failure.
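If the software of the remote replication unit is substantially the same as that of the local replication unit 204, the remote replication unit can also act as a local replication unit 204. For example, when data is replicated from site A to site B and then to site C, the replication unit at site B is remote with respect to A and local with respect to C. The remote replication unit can also act as a local replication unit 204 during a restore from the remote location back to the source. That is, when data flows from A to B, the unit at A is local and the unit at B is remote; when data flows from B to A, the unit at A is remote and the unit at B is local.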
Finally, some newer systems can accommodate multiple user sessions, where a user session is a replication data relay session or a storage session. Combinations and instances of the scenarios above can occur together or separately, as appropriate. Additional processors, disks, memory, and so on may be needed to support a particular combination.
These tools and techniques can also be used in one-to-many or many-to-one replication systems according to the invention. The same holds for the discussions of packets, IP, Ethernet, token ring, and other packetized data environments, and it should be understood that other supported environments may write data as a stream rather than in packets.
Except where one step requires the result of another step as input, the steps described above and other steps may be performed in a different order and/or concurrently. For example, the connecting steps 1304, 1306, and 1308 may be performed in different orders and/or concurrently, but the testing step 1310 assumes that some or all of the specified connections are present, at least nominally. Step 1312, which transfers data to the local replication unit, necessarily precedes step 1314, which transfers that data over the journey link 206 or to the local replica 230. On the other hand, when sending to a serverless remote replication unit, the transfer step 1316 can be carried out in the same way as the transfer step 1314. Steps may also be omitted unless the claims call for them, whether or not this detailed description expressly identifies them as optional. Steps may be repeated, combined, or named differently.
Reference is now made to FIG. 15. The following description refers directly to that figure while discussing additional tools and techniques that can be used to advantage, alone or in various combinations, in embodiments of the invention: local-remote role reversal, implementation of a hot standby server state, several alternative buffer contents and buffering schemes, transactions, many-to-one replication (outlined earlier in connection with FIGS. 5-10), identification of frequently accessed data, and use of a non-authoritative secondary server.
Role reversal
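When a primary server such as server 200 becomes inoperative after its changed data has been completely sent to the remote location, replication units such as units 204 and 208 can swap roles, which lets a remote server on the WAN, such as server 200, provide functions such as disaster recovery to its network clients. U.S. Patent No. 5,537,533, the first patent of the assignee MiraLink, discusses a continuously available, remotely replicated replacement network server, but it apparently does not discuss role reversal. In a role reversal, the whole replication architecture is inverted. If both the local and the remote replication unit survive whatever event made disaster recovery necessary, then after the local-remote role reversal the formerly remote side is treated as local, and data changes recorded there are replicated back to the original local side, which now plays the remote role.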
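In one embodiment, the role reversal step 1506 is implemented as follows. First, the pair of "boxes" (replication units such as units 204 and 208) is preferably configured identically to ease the transition. Second, the kernel module that handles SCSI emulation is active in the local box and dormant in the remote box. It is this software state that actually produces the "media not ready" behavior described below. Once the local box has delivered all changed data to the remote box, the user can issue a command to reverse roles. This shuts down replication in the local box and activates the remote SCSI emulation layer, so that the remote server can now be directed to mount the remote replication unit. In this way the replication units at each site change roles and allow the servers to participate so that the change takes effect. The current role of a replication unit can be represented internally by a bit flag or other variable.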
When a replication unit reverses roles 1506 and begins operating in the remote role, the physical disk that served as the transmit buffer while the unit operated in the local role can be used as a receive buffer. In a local replication unit such as unit 204, this disk is a transmit disk that stores changed data destined for the journey link 206. In a remote replication unit, the same disk is a receive buffer that holds received 1504 changed data until it has been verified and committed to the remote replication unit's disk or other nonvolatile storage. In some embodiments, the verification level and the commit delay are programmable.
Media not ready state for the secondary server
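Using 1508 a "media not ready" state lets the secondary server 300 sit in a "hot" standby mode. Without it, the secondary server might have to be brought up only after the remote replication unit 308 is already online, so that the secondary server can probe the SCSI chain for the presence of the remote replication unit 308. In step 1508, the SCSI emulation layer of the remote replication unit answers requests from the remote server 300 with data characteristics such as data size and data availability, but denies the remote server 300 access to the data contents. The unit 308 provides these limited responses to the server 300 in standard SCSI response formats.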
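Alternatively, the secondary server 300 can be brought up without the remote replication unit 308 cabled to it. After an eventual failure, the cable is connected and a SCSI device chain scan must then be run to detect the new hardware. The server 300 then mounts the device 308. By contrast, the preferred approach 1508 uses the media not ready mode so that the volume 308 is "enabled" and "detected" but remains unmounted until failover is required.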
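The media not ready behavior can be pictured with a small model. The sketch below is illustrative only and is not the emulation layer of the described units: the class `StandbyTarget`, the exception `MediaNotReady`, and the method names are hypothetical, and a real unit would answer in standard SCSI response formats rather than with Python exceptions.

```python
class MediaNotReady(Exception):
    """Raised when the standby target refuses access to data contents."""


class StandbyTarget:
    """Hypothetical model of a volume that is detectable but not yet readable."""

    def __init__(self, block_count, block_size=512):
        self.block_count = block_count
        self.block_size = block_size
        self.blocks = {}
        self.ready = False  # False while the unit is in hot standby

    def read_capacity(self):
        # Characteristics are always reported, so the server can detect the volume.
        return self.block_count, self.block_size

    def read(self, lbn):
        # Data contents are refused until failover has been requested.
        if not self.ready:
            raise MediaNotReady("volume detected but not available")
        return self.blocks.get(lbn, bytes(self.block_size))

    def fail_over(self):
        self.ready = True


target = StandbyTarget(block_count=1024)
capacity = target.read_capacity()
try:
    target.read(0)
    refused = False
except MediaNotReady:
    refused = True
target.fail_over()
first_block = target.read(0)
```

In this model the server never has to rescan the chain at failover time, because the volume was visible all along and only the data path was withheld.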
Circular buffer
Two additional operating modes extend the usefulness of the circular data queue in the buffer by permitting a "non-consistent" replication mode (that is, one that is no longer fully reliable time-delayed replication) in which recovery can proceed as time and/or bandwidth allow. This circular queue is also called the "Scalable Smart Buffer", the "circular buffer queue", or the "CBQ". The first mode uses disk space as a FIFO (first in, first out) in the normal manner, but stores changed logical block numbers (LBNs) instead of the actual changed data. Storage per CBQ entry shrinks accordingly (128 LBNs at 4 bytes each occupy the same space as one changed data block of 512 bytes), which slows the rate at which the CBQ fills and gives the journey link 206 more time to recover. If the journey link 206 stays down long enough for the CBQ to become completely full, a full re-replication is required. However, each changed block needs to be recovered only once, so the CBQ can be collapsed into a virtual file allocation table (FAT) or similar allocation structure over blocks (such as clusters or sectors), with a checksum or cyclic redundancy check (CRC) value stored in the CBQ for each block. When the journey link 206 is restored, the local replication unit notifies 1302 the remote replication unit that re-replication is needed, and the two units exchange per-block CRCs to decide which clusters (for example) of the disk must be sent. More than 90% of a hard disk may be unchanged, for example, and need not be sent over the journey link 206. This differs from earlier replication approaches, which assume that 100% of the data differs between the local and remote drives.
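The space saving and the CRC exchange can be illustrated as follows. This is a minimal sketch under stated assumptions, not the patented implementation: blocks are modeled as Python byte strings, `zlib.crc32` stands in for whatever checksum an embodiment uses, and the function name `blocks_to_resend` is hypothetical.

```python
import zlib

BLOCK_SIZE = 512  # bytes per changed data block, as stated in the text
LBN_SIZE = 4      # bytes per logical block number, as stated in the text

# One 512-byte slot holds 128 four-byte LBNs instead of one changed block.
lbns_per_block = BLOCK_SIZE // LBN_SIZE


def blocks_to_resend(local_blocks, remote_crcs):
    """Return the LBNs whose local CRC differs from the CRC reported by the remote side."""
    return [
        lbn
        for lbn, data in enumerate(local_blocks)
        if zlib.crc32(data) != remote_crcs[lbn]
    ]


# Ten blocks on the local side; the remote copy differs only in block 3.
local = [bytes([i]) * BLOCK_SIZE for i in range(10)]
remote = list(local)
remote[3] = b"\xff" + local[3][1:]
remote_crcs = [zlib.crc32(block) for block in remote]

stale = blocks_to_resend(local, remote_crcs)
```

Only the blocks listed in `stale` would need to cross the link; the remaining blocks are skipped because their checksums already agree.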
SCSI snoop buffering
In some embodiments, the Scalable Smart Buffer (that is, the circular buffer queue) in normal mode stores changed blocks until a threshold is reached, at which point the replication unit stores 1510 the changed logical block numbers (LBNs) rather than the actual changed data. In a variation that uses "SCSI snoop buffering", the replication system buffers the actual SCSI commands instead of extracting the block data from them. This can be done as follows. Note that, as FIG. 15 indicates, different embodiments of step 1512 may include or omit one or more of the specific operations collectively labeled here with reference numeral 1512.
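A target adapter in the replication unit 204 listens 1512 passively to the SCSI bus. "Passively" here means that the physical device 204 does not participate electrically on the bus but does record 1512 what it observes there. The target adapter can use existing physical hardware of the kind used in SCSI analyzers, although for a different purpose. A SCSI analyzer is a diagnostic tool that lets a user monitor activity on a SCSI bus without actually taking part in it. The data collected 1512 from the SCSI bus by the target adapter is then interpreted 1512 with respect to activity originating from or directed to a particular real participant, or "target", on that bus. This data comprises the set of encapsulated SCSI commands as seen 1512 on the SCSI bus.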
Commands that match 1512 the filter criteria, namely commands that relate only to the SCSI bus participant of interest, are queued 1512 in the order observed, using a suitable buffering algorithm. The data collected 1512 from the SCSI bus need not be analyzed or interpreted 1512 beyond identifying 1512 commands or responses that belong to a particular participant on the bus. However, the traffic can be separated 1512 into (a) requests from a host controller on the bus and (b) commands from a host controller on the bus that are writes in nature. By buffering 1512 only the write-type commands, the buffer holds just those transactions that change data on the target participant.
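The buffered SCSI command data is then transferred 1502 over a communication link such as the journey link 206 to a second replication unit 208, 308, and so on. After receipt 1504, the commands are "replayed" 1514 on a second, physically separate SCSI bus to identical or similar participants that started in the same state as their counterparts on the first bus. As the commands read 1512 from the original SCSI bus are replayed, the replica target on the second SCSI bus is brought to the same state as the original target and holds the same data. Buses other than SCSI can be used in a similar way for command capture and replay, as well as for other features of the invention.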
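The capture, filter, and replay steps can be sketched in a few lines. The model below is a simplification made for illustration: observed commands are plain dictionaries, the second target is a Python dict, and the names `snoop` and `replay` are hypothetical. The opcode values are the standard SCSI codes for WRITE(6), WRITE(10), and READ(10).

```python
from collections import deque

WRITE_OPCODES = {0x0A, 0x2A}  # SCSI WRITE(6) and WRITE(10)


def snoop(observed, target_id):
    """Queue, in observed order, only write commands addressed to target_id."""
    queue = deque()
    for cmd in observed:
        if cmd["target"] == target_id and cmd["opcode"] in WRITE_OPCODES:
            queue.append(cmd)
    return queue


def replay(queue, disk):
    """Apply buffered write commands to a dict standing in for the second target."""
    while queue:
        cmd = queue.popleft()
        disk[cmd["lbn"]] = cmd["data"]
    return disk


observed = [
    {"target": 2, "opcode": 0x28, "lbn": 7, "data": None},    # READ(10): not buffered
    {"target": 2, "opcode": 0x2A, "lbn": 7, "data": b"new"},  # WRITE(10): buffered
    {"target": 5, "opcode": 0x2A, "lbn": 1, "data": b"x"},    # other target: ignored
]
mirror = replay(snoop(observed, target_id=2), {})
```

Because the queue preserves observed order, replaying it against a target that started in the same state reproduces the same final data, which is the property the text relies on.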
When implementing such a replication system, it is important to watch for subtle, unintended interactions between read requests and write requests. This matters especially when the SCSI bus participant of interest keeps an implicit internal state that is not immediately visible and that changes how it handles a write depending on a preceding read.
In addition, errors reported by the participants for commands captured on the monitored SCSI bus must be handled 1514 in a consistent way on the second SCSI bus, which will not necessarily produce the same errors. Likewise, an error condition that arises on the second SCSI bus can leave that bus inconsistent with the first SCSI bus in state and data.
Temporal transactions
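Temporal transaction processing 1516 uses the buffer of a replication unit 204, 208, etc. to provide transactional file system functionality. Note that different embodiments of step 1516 may include or omit one or more of the specific operations collectively labeled here with reference numeral 1516. Using operating system agents and/or kernel wedges, file opens and closes can be tracked 1516 and file operations can be timestamped, which supports roll-back 1516 of operations on file systems that do not themselves support transactions.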
In this context, a "kernel wedge" is a binary patch or source code patch that is wedged into existing binary or source code to modify the operating system. It differs from a device driver or an agent because the wedge is inserted at a place in the operating system that was not specifically designed for additional software to be linked in or otherwise inserted. By inserting 1516 code at the points in the operating system where operations such as file opens and closes occur, actions can be taken in response to those events.
This approach can be viewed as a hybrid of two copying styles: one copies a file when the file is closed, and the other copies data as it is written to the file. The approach attaches 1516 a timestamp or other label to the replicated data according to when the file was opened or closed. All changes made by a program after it opens the file are thereby associated 1516 with that open/close cycle, while any later changes made after the file is reopened are not associated with the current cycle.
After an open/close cycle completes, lack of space or other factors may make it hard to track 1516 the specific blocks associated with a file. It is nonetheless straightforward to track 1516 the precise time at which a particular open or close event occurred, and also the precise time at which a block entered the buffer. At a later time, a system administrator can therefore review the open/close log provided by the wedge and selectively remove the changed data blocks that fall within a particular time period.
Note that this approach offers relatively little benefit for applications, such as databases, that keep files open and write to them over long periods. It works well, however, for keeping a file system safe or for recovering 1516 a word processor file that was accidentally overwritten, because those operations occur within a short period, typically as quickly as possible. File system changes, such as a file save by a word processor, are tracked 1516 to a reasonably precise point in time as they occur. The replicated data change operations that correspond to those points in time are then identified 1516, picked out of the stream of data change operations being replicated, and edited 1516.
Transactions 1516 can be implemented by a remote system agent or by another program that keeps 1516 a log of data changes in the buffer and can roll back 1516 the changes over a period of time. The remote system agent resides in a remote replication unit such as unit 208 and receives 1504, 1506 data change information from the local replication unit 204 over the communication link 206.
In some embodiments, the system has a replica disk and a buffer disk at both the local and the remote site. However, a remote buffer disk such as buffer 310 is not actually used unless the remote system for some reason stops being remote and becomes local, as when the remote and local roles are swapped 1506 so that the remotely replicated data can be restored from the location to which it was replicated. The remote buffer disk can therefore be used to hold 1516 the transaction log.
These logs can be organized in a structure similar to a transaction queue, storing 1516 each data block together with information about it (LBN and timestamp) in an ordered manner. Rather than writing the data to disk immediately, the invention stores 1516 it in a buffer for a period whose length is determined by buffer space availability and/or administrator preference. When the period expires, the data is removed 1516 from the buffer and written to the replica image. From that point on, the administrator no longer has the option to undo that write. If the remote unit 208 needs to become 1506 the local unit 204, the entire remote buffer 310 must be committed to a disk, such as the RAID unit 312, before the same buffer space 310 can be assigned to data transfer operations.
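More generally, the buffer and its timestamp information make it possible to undo 1516 anything that has occurred on the replicated server 200 and has reached the buffer 310 of the remote system receiving the replicated data, but has not yet left the buffer 310 for the replica image stored on, for example, the RAID unit 312. An administrator can perform this operation with an administrative utility simply by removing 1516 the blocks in question from the remote queue.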
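The hold-then-commit behavior of the remote buffer can be modeled briefly. The following is a sketch under assumptions: the class `TransactionBuffer`, its method names, and the use of plain integer timestamps are all hypothetical, and the replica image is a dict rather than a disk.

```python
class TransactionBuffer:
    """Hypothetical model of a remote buffer that delays commits so they can be undone."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.pending = []  # (timestamp, lbn, data) in arrival order
        self.image = {}    # committed replica image: lbn -> data

    def receive(self, timestamp, lbn, data):
        self.pending.append((timestamp, lbn, data))

    def undo(self, start, end):
        """Drop buffered changes whose timestamp falls within [start, end]."""
        self.pending = [e for e in self.pending if not (start <= e[0] <= end)]

    def commit_expired(self, now):
        """Write entries older than the hold time to the image; they can no longer be undone."""
        keep = []
        for ts, lbn, data in self.pending:
            if now - ts >= self.hold_seconds:
                self.image[lbn] = data
            else:
                keep.append((ts, lbn, data))
        self.pending = keep


buf = TransactionBuffer(hold_seconds=60)
buf.receive(0, 10, b"good")
buf.receive(30, 10, b"bad")   # an accidental overwrite
buf.receive(40, 11, b"ok")
buf.undo(25, 35)              # administrator removes the bad period
buf.commit_expired(now=100)
```

The hold time is the administrator's window for correction: a longer hold allows later undo at the cost of more buffer space.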
Alternative buffering scheme
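Some replication units 204 can use a different buffering scheme that saves buffer space and time compared with a simple circular buffer. Assume that each block is written to the local replica 230 as it is received and that only LBNs are stored in an ordered queue. As used herein, an "ordered queue" is any queue, list, FIFO (first in, first out), table, or other set of one or more data structures from which items can be retrieved in the same order in which they were submitted. A circular queue, in particular, is one example of an ordered queue.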
When a block to be replicated would overwrite a block that is already in the queue and has not yet been replicated 1302 to the remote site, the pre-existing block is copied into the buffer space in the same way as in the embodiments described earlier (that is, only a pointer to the block is placed in the actual queue, and the block itself is kept in a swap space). This alternative buffering scheme lets the whole buffer run in "compact" mode while still remaining safe. Only the changed portion of each change is buffered.
"Compact mode" and "normal mode" refer to buffering modes. Compact mode implements a "best effort" policy that takes over as the buffer fills. Normal mode is the usual buffering mode, used until an administrator-defined or other threshold of free buffer space is reached. This threshold is sometimes called the "high water mark", since it is best to act when the water is high. Once the threshold is reached, the buffer operates in compact mode, which can no longer guarantee data integrity in all cases because it tracks 1510 only the LBNs that changed rather than the LBNs together with their data. Data is written to the local replica 230 in the normal way, and when an LBN is read from the queue, the data to be sent 1500 is read from the local replica 230. In many cases this works correctly and all data is replicated.
In some cases, however, a file is written and then rewritten because of further changes. Both changes are placed in the queue, but when the first change is removed from the queue, the data actually delivered 1500 comes from the second (later) change, which therefore appears on the remote replica disk 310/312 ahead of its time. This can be a serious problem because it typically overwrites file system objects. Still, compact mode is a "try it and it may work" scheme that provides some degree of protection, which is better than simply running out of buffer.
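The alternative buffering scheme of the invention improves on this and works in almost the same way. However, when a given data block is written again, the existing block on the local replica 230 is copied and stored elsewhere in the buffer. Inserting that block back into the queue is not practical; in general, too many queue elements would have to be moved to make room. Instead, the queue entry for that particular LBN can be changed to refer to the data block at its other location on the system. For example, the local replication unit 204 can use a second storage area to hold such blocks.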
One advantage of this alternative buffering scheme is that, most of the time, only a single write is needed. Occasionally a read/write/write operation 1518 is required: read the block from the local replica 230; write it to temporary storage; update the LBN entry in the queue to point to the block in temporary storage rather than the block in the replica; write the new block into the replica 230, where the previous copy of the data was held; and then add an LBN entry for the new block to the queue.
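The read/write/write sequence can be traced with a small model. This is a sketch under assumptions rather than the described unit's code: the class `LbnQueue`, its fields, and the tuple layout of queue entries are hypothetical, and the replica and temporary storage are Python dicts.

```python
class LbnQueue:
    """Hypothetical LBN-only queue that preserves displaced blocks in a side area."""

    def __init__(self, replica):
        self.replica = replica  # local replica: lbn -> data
        self.queue = []         # entries: (where, ref, lbn); where is "replica" or "temp"
        self.temp = {}          # second storage area for displaced blocks
        self.pending = {}       # lbn -> index of its newest unsent queue entry
        self._next_key = 0

    def write(self, lbn, data):
        if lbn in self.pending:
            # Read/write/write: save the old block and repoint its queue entry.
            key = self._next_key
            self._next_key += 1
            self.temp[key] = self.replica[lbn]
            self.queue[self.pending[lbn]] = ("temp", key, lbn)
        self.replica[lbn] = data
        self.queue.append(("replica", lbn, lbn))
        self.pending[lbn] = len(self.queue) - 1

    def drain(self):
        """Return (lbn, data) pairs in original write order, then empty the queue."""
        out = []
        for where, ref, lbn in self.queue:
            data = self.temp.pop(ref) if where == "temp" else self.replica[lbn]
            out.append((lbn, data))
        self.queue.clear()
        self.pending.clear()
        return out


q = LbnQueue({})
q.write(5, b"v1")
q.write(9, b"a")
q.write(5, b"v2")   # block 5 rewritten before it was sent
sent = q.drain()
```

A queue that held only LBNs without the side area would send `b"v2"` for the first entry, which is the out-of-order hazard described for compact mode; here the first entry still yields `b"v1"`.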
Remote many-to-one replication
The approach described here can be further adapted to provide a hardware/software platform for a many-to-one solution with a central backup site or service provider, as noted earlier. The local system operates as previously described. The local replication unit 204 connects to the host server system 200 over the SCSI bus and appears as a fixed disk drive, which in turn is used, for example, as part of a RAID-1 mirror. Data is then sent 1500 from the local buffer 210 to the remote site using the local replication unit 204 transfer protocol, with operating states as described elsewhere herein. A management interface can support a one-to-one view (from the perspective of the local replication system) between the local system and the remote many-to-one solution in a replication unit such as unit 508, 608, or 708.
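The remote many-to-one solution can run 1520 multiple instances of the replication system's transfer and buffer management software, for example multiple instances of the software of the remote replication units 208, 308, 408 described above. In these embodiments, however, the kernel module is replaced by a user-space control module that emulates 1520 the kernel interface of the system described above. Multiple "virtual remote replication units" (also called "virtual systems" or "virtual 1.1 systems" herein) can be hosted 1520 on the hardware platform of a server 300 or on a modified replication unit 208, 308, 408. The hardware platform can be any high-end server system that provides a shared, usable POSIX/Unix/SVR4 environment. Examples include, without limitation, Sun or IBM servers running Solaris/Linux or AIX/Linux, respectively.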
To make it easier to implement virtual system transfer software that can be run 1520 as needed, the software should be written modularly and should make no assumptions about how data flows between devices, which include for example a local buffer, a remote buffer, a local replica, a remote replica, and the kernel. Control over where data comes from and where it goes is exercised through the kernel interface, which maintains state information about replication conditions and user-initiated state changes.
In some embodiments, the hardware platform runs SAN management software that interfaces with a replication unit management layer to provide, as needed, functions such as routing devices on SAN storage to local devices (buffer devices, replica devices, change-replica devices, and so on) to implement the various operating states. The management interface of the many-to-one system may be derived from the management interface of the replication unit described earlier, using SNMP through MIB extensions and the World Wide Web. Within this management layer, a one-to-one relationship can be presented to the primary (local) replication system while the remote system is still allowed to carry out the state operations it needs. A SAN management suite can serve as a model for interface features such as setting checkpoints, making multiple copies of the replicated data, and/or changing which devices are replicated.
Identifying frequently accessed data elements without application-specific knowledge
In this paragraph and the two that follow, a data block is an example of a "data element" and a disk sector is an example of a "storage unit". A "current set" can be regarded as an abstraction of a disk drive.
A common problem for fault-tolerant systems is that an application takes no steps to recover when only part of a set of data storage operations completes before the application terminates. Applications designed for fault tolerance usually have some way to perform a set of data storage operations without treating them as valid until a final operation is carried out, so that if any operation fails the whole set is treated as invalid. Many applications, however, are not designed this way.
One way to provide fault tolerance for applications not specifically designed for it is to use application-specific information, including detailed knowledge of the operations to be performed, and to track the application's state from outside the application. If an entire transaction has not been committed as seen by the external agent that monitors the application, the transaction can be removed from the active data set. This is problematic, however, because the monitoring agent needs specific knowledge of the application's behavior and is therefore very sensitive to data changes made outside the application itself.
One approach described here identifies 1522 frequently accessed data using a monitoring agent that has no such application-specific information. The agent assumes 1522 that an application's storage transactions occur in temporally related clusters; that a transaction normally comprises a set of operations on a first group of adjacent data elements; that storage operations also occur before and/or after that set of operations; and that those storage operations fall on or near a second group of adjacent data elements, which lie at a location other than the first group and are shared by different transactions. These shared elements are called "state blocks" herein.
As an example, consider a file system write. The data file is updated by a set of one or more operations, usually involving a set of adjacent storage elements located near one another on the physical storage medium. The file system tables are then updated; these are stored at different but consistently referenced locations within a limited number of physically related storage elements. Each sector or cluster that holds the file's user data corresponds to the first group of adjacent data elements, and each sector or cluster that holds file system tables, bitmaps, or similar file system data structures corresponds to the second group of adjacent data elements.
Many applications follow a similar write strategy. To improve write performance, a given operating system may try to cluster unrelated writes into a single write operation. Data file updates therefore occur at times determined by the operating system.
According to the invention, one method of identifying 1522 a transaction is to track the storage writes that occur between updates to these special state blocks. A transaction then contains all the data written to the data file(s) between two state block updates. State blocks can be identified 1522 by running 1522 an application across its normal operating range and tracking 1522 which storage units were written, how often, and in what order. Neural networks, statistical analysis, or other similar techniques and tools can be used to extract 1522 the state block identities from the resulting log. The log accumulated over time should show that some storage units are accessed or written much more frequently than others, and those units should be treated 1522 as state blocks. If no such clear statistical pattern is found, this method does not suit the application in question. The method does not necessarily work for every application that uses storage.
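A very simple frequency heuristic shows how state blocks and transaction boundaries might be extracted from a write log. The sketch is illustrative only: the function names and the `min_share` threshold are hypothetical choices, and a plain frequency count stands in for the neural network or statistical analysis mentioned above.

```python
from collections import Counter


def find_state_blocks(write_log, min_share=0.2):
    """Treat storage units that receive at least min_share of all writes as state blocks."""
    counts = Counter(write_log)
    total = len(write_log)
    return {unit for unit, n in counts.items() if n / total >= min_share}


def split_transactions(write_log, state_blocks):
    """Group the data writes that fall between consecutive state block updates."""
    transactions, current = [], []
    for unit in write_log:
        if unit in state_blocks:
            if current:
                transactions.append(current)
            current = []
        else:
            current.append(unit)
    # 'current' holds writes not yet closed by a state block update.
    return transactions, current


# Storage unit numbers in the order they were written; unit 2 plays the role of a table block.
log = [100, 101, 2, 200, 201, 202, 2, 300, 2, 400]
state = find_state_blocks(log)
done, open_writes = split_transactions(log, state)
```

The trailing group in `open_writes` corresponds to writes that were never followed by a state block update, which is exactly the material a recovery procedure would consider rolling back.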
When this method is properly applied and the application fails and cannot recover, recovery can be assisted by successively uncommitting 1524 the data blocks, and uncommitting 1524 the state block updates, written between successive state block updates, until the application is able to recover its state. To support this uncommit capability, the invention stores, in some form of nonvolatile storage, the data units that are overwritten between state block updates. Alternatively, storage operations can be buffered before they are committed to disk, and the corresponding buffer space released once the next set of state block storage operations has been detected and processed. Reads should then be served from the buffered storage rather than from the committed copy. A table can be maintained to indicate whether a given data unit is located in the buffer or on committed storage.
Resynchronizing a non-authoritative secondary data volume from a primary data volume
The invention also provides tools and techniques for resynchronizing a non-authoritative secondary data volume, such as the remote replica disk subsystem 312 or 614, from a primary data volume such as the local replica 210, so that disaster recovery remains possible after the secondary data volume has been used in place of the primary for a period of time.
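In normal operation, data units are written to a primary data volume and then written to a secondary data volume by some means such as the replication units 204, 208. The data on the primary data volume is considered authoritative and is therefore consulted when data units need to be accessed. If the primary data volume suffers a non-destructive failure (for example a power failure, or temporary isolation from the application that uses the existing data units), the using application switches to the secondary data volume to store new data units and to read data units. A list (for example a table or other data structure) is maintained 1526 of the data units that change on the secondary data volume while the primary data volume is unavailable. When the primary data volume becomes available again, the list is consulted in order to resynchronize 1526 the contents of the secondary data volume with the contents of the primary volume. The resynchronization 1526 reads the corresponding data units from the primary volume and writes them to the secondary data volume.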
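The resynchronization can be summarized in a short function. This is a minimal sketch with assumptions stated: volumes are modeled as dicts from unit number to data, the name `resync` is hypothetical, and removing a unit that never existed on the primary is an illustrative choice that the text does not specify.

```python
def resync(primary, secondary, changed_on_secondary):
    """Overwrite every unit changed on the secondary with the primary's version."""
    for unit in sorted(changed_on_secondary):
        if unit in primary:
            secondary[unit] = primary[unit]
        else:
            # Assumption: a unit created only on the secondary is discarded.
            secondary.pop(unit, None)
    changed_on_secondary.clear()
    return secondary


primary = {1: b"a", 2: b"b"}
secondary = {1: b"a", 2: b"tmp", 3: b"new"}  # units 2 and 3 changed during the outage
changed = {2, 3}
resync(primary, secondary, changed)
```

Only the units named in the change list are touched, so the cost of resynchronization is proportional to the activity during the outage rather than to the size of the volume.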
In this scenario, changes made to the secondary data volume are assumed to be non-authoritative and are normally overwritten by the resynchronization 1526. The reason for this may be specific, for example, to the using application.
Where appropriate, the invention thus provides a simple way to re-establish the primary-secondary relationship between two data volumes. Resynchronization 1526 differs from role reversal 1506: in a role reversal, the secondary volume becomes the authoritative primary volume, whereas in resynchronization 1526 the primary volume remains authoritative.
Maintaining an ordered queue and a current copy on the same physical storage system
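As described herein, in some embodiments the replication unit 204 stores data unit writes in an ordered queue, in the order received, so that they can be read back sequentially. In some embodiments, a set of data storage units is defined as the "current copy", and these storage units can be read back 1528 as a whole from the current copy. A new storage operation on a given data unit of the storage device updates 1528 that data unit in the current copy, while earlier data units remain available 1528 to be read in order to restore an earlier system state.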
This is managed by maintaining 1528 a table (or other data structure) of the storage unit locations of the current copy. The table identifies the address of the most recent data unit for a given storage unit in the current copy. When a request is processed, the data unit is looked up 1528 in the table and read from the ordered queue at the location the table gives. Ordered read requests 1528 are handled by reading forward through the ordered queue from a known position.
依此,即无令人信服的理由需按实体分割的方式来保存两份相同资料单元的复制。本发明可避免对储存系统写入相同资料单元两次来实施一实体分割系统。注意,在不同的步骤1528具体实施例中,确可包括或略除一或更多共同标注为部分编号1528的特定操作。Accordingly, there is no compelling reason to keep two copies of the same data unit as physical divisions. The present invention can avoid writing the same data unit twice in the storage system to implement a physical partition system. Note that in different embodiments of step 1528 , one or more specific operations collectively labeled as section number 1528 may indeed be included or omitted.
当实体储存系统填满有序队列资料时,最老的有序队列单元会为逾时1528,且那些储存空间会被释放,以供新的有序队列单元运用。如在目前集合内一老的有序队列单元需为逾时,则可将其复制1528到一第二储存装置,并更新该有序集合1528以朝指向这个新的位置。这是否为一常见情境是属应用特定,然在许多情况下,本发明这项特性1528是会倾向于减少为维护目前集合及一组资料单元的有序队列观点两者时所需要的写入操作次数。When the physical storage system is full of ordered queue data, the oldest ordered queue unit will be timed out by 1528, and those storage spaces will be released for use by new ordered queue units. If an old ordered queue unit in the current set needs to be timed out, it can be copied 1528 to a second storage device and the ordered set 1528 updated to point to this new location. Whether this is a common scenario is application specific, but in many cases this feature of the present invention 1528 will tend to reduce the writes required both for maintaining the current collection and for an ordered queue view of a set of data units number of operations.
保持一有序队列的结果是可利用先前的目前集合来作为重新建构1528之用。在此,借由在一时间点上,选取1528一有序队列作为新的目前集合,扫描1528该参考窗体以参指到较新于该既选时点的有序队列的各单元,然后更新1528该参考窗体以参指到较旧的有序队列单元,而这些是会参指到目前集合的正确部分,依此方式来重建一先前目前集合。As a result of maintaining an ordered queue, the previous current set is available for reconstruction 1528 . Here, by selecting 1528 an ordered queue as a new current set at a point in time, scanning 1528 the reference frame to refer to each unit of the ordered queue that is newer at the selected time point, and then The reference window is updated 1528 to refer to older SQUs, and these would refer to the correct part of the current set, thereby recreating a previous current set.
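The combination of an ordered queue and a current-copy table can be sketched as below. This is an illustrative model under stated assumptions, not the implementation of step 1528: the queue is a Python list, the table is a dictionary, a "point in time" is represented by a queue position, and the names `OrderedQueueStore`, `store`, `read`, `read_ordered`, and `roll_back` are hypothetical.

```python
class OrderedQueueStore:
    """Append-only queue of writes plus a table locating the current copy."""

    def __init__(self):
        self.queue = []    # (storage_unit, data) tuples in arrival order
        self.current = {}  # storage_unit -> queue position of latest write

    def store(self, unit, data):
        """Write the data once and point the table at the new queue unit."""
        self.queue.append((unit, data))
        position = len(self.queue) - 1
        self.current[unit] = position
        return position

    def read(self, unit):
        """Look the unit up in the table, then read it from the queue."""
        return self.queue[self.current[unit]][1]

    def read_ordered(self, start=0):
        """Serve ordered reads by walking the queue forward from `start`."""
        return self.queue[start:]

    def roll_back(self, position):
        """Rebuild the current set as it stood at queue position `position`.

        Table entries that refer to newer queue units are re-pointed at the
        most recent older unit, or dropped if no older write exists.
        """
        for unit, pos in list(self.current.items()):
            if pos <= position:
                continue
            older = [i for i in range(position, -1, -1)
                     if self.queue[i][0] == unit]
            if older:
                self.current[unit] = older[0]
            else:
                del self.current[unit]
```

Each data unit is written exactly once, to the queue; the current copy exists only as table entries. The linear scan in `roll_back` keeps the sketch short, and a real system would presumably index the queue instead.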
In many cases this embodiment 1528 of the invention pays a performance penalty on read operations, because in some situations the reads do not fall on contiguous storage units. Store operations, however, should be efficient regardless of the order in which they arrive, because they preferably always land on contiguous storage units laid out in ordered queue fashion, for example when the ordered queue is implemented as a linear array spanning the storage units of a storage system.
Configured Storage Media and Signals
Articles of manufacture within the scope of the present invention include a computer-readable storage medium in combination with the specific physical configuration of a substrate of the computer-readable storage medium. The substrate configuration represents data and instructions which cause the computers to operate in a specific and predefined manner as described herein. Suitable storage devices include floppy disks, hard disks, tape, CD-ROMs, RAM, flash memory, and other media readable by one or more computers. Each such medium embodies a program, functions, and/or instructions that are executable by the machines to perform flexible replication method steps substantially as described herein, including without limitation methods which perform some or all of the steps illustrated in Figure 13 and methods for installing and/or using the systems illustrated in Figures 2 through 12. The invention also provides novel signals which are used in or by such programs. The signals may be embodied in "wires", RAM, disk, or other storage media or data carriers.
Additional Information
To further assist people and enterprises in understanding and properly practicing the invention, additional insights and details are provided below. These comments build on the preceding discussion, and unless otherwise indicated, a discussion of any one embodiment type (method, system, or configured storage medium) also applies to the other embodiment types.
Specific Examples of the Invention's Improvements
Many other solutions to the data protection problem (tape backup, local clustering, replication, shadowing, remote mainframe channel extension, and so on) are to some degree directly attached to, and dependent on, the host 200 operating system. This dependency creates problems for customers that the present approach can avoid. For example, reliance on dedicated software can cause compatibility problems and bugs if that software does not work fully under the current host operating system or an upgraded version of it. Software solutions that rely on dedicated host replication software can also cause performance problems, because they impose additional work on the host. Dependent software solutions can also be a source of instability. As disk volumes grow and software and operating systems become more complex, these problems weigh more heavily on dependent software solutions. In addition, if the host 200 operating system crashes, solutions that depend on that operating system stop working as well.
By contrast, at least in some embodiments, the invention does not use software that places additional load on the host computer (for example, the local server 200), and so it reduces or avoids the problems described above. If the host operating system crashes, the replication unit keeps operating and the replicated data remains usable, because the replication unit runs its own operating system. Unlike solutions that require substantial modification at their core, the invention scales readily as disk volumes grow and software becomes more complex. If the disk volume is larger, larger disks can be placed in the replication unit. If the data change rate exceeds the current ability to write to disk, a caching controller can be used and the system's memory increased. Some other solutions require the cooperation of operating system vendors in order to integrate and operate without error. Because all operating systems are expected to support SCSI and Fibre Channel (for example) for the foreseeable future, installing and using the invention requires no such cooperation.
When other solutions fail, they can take the host 200 down with them because of the close interaction described above. Because the present system operates independently of the host 200, its failure need not seriously affect the host computer. Conventional disk mirroring was originally designed to provide local fault tolerance: two disks are written in parallel, and if one disk fails the computer keeps running. The failed disk can be dismounted by the operating system in the background, and the operating system and computer continue to run without interruption. Because the replication unit of the invention can present itself as a SCSI disk and be mounted as a mirrored disk, it provides similar advantages. If the replication unit crashes, it simply needs to be dismounted. For example, if the operating system or other software on the replication unit fails, the replication unit stops emulating a disk drive, so the host 200 operating system no longer recognizes it. In response, the host 200 operating system simply dismounts the replication unit 204 and continues running.
At least some previously described replication system embodiments use a single-disk IDE buffer. Even with packet spoofing, such a smart buffer cannot keep up with high-speed SCSI RAID units that use hardware striping. The most critical data, namely the data that has not yet been transmitted to the remote location, is held on a single disk with no fault tolerance at the smart buffer. By contrast, with the present invention the local and remote replication units can mirror the single-disk buffer for fault tolerance and can also perform hardware RAID striping across multiple disks. This provides both the ability to keep up with a high-speed storage subsystem on the server and better fault tolerance. It also reduces the risk of losing buffered data if a disk of the server 200 volume or of the replication unit disks 210, 310 fails.
Limits on the data input capabilities of earlier approaches made it very difficult to offer new technology that the market would accept. For example, at least some previously described approaches do not support storage area networks (SANs) or network-attached storage (NAS). Because they require a standard remote server such as server 300, backing up and replicating the increasingly popular SAN and NAS disk subsystems becomes extremely difficult or impossible. All of these subsystems, however, can perform local mirroring over Ethernet, Fibre Channel, and/or SCSI. The present replication unit accepts several input types, including SCSI, Ethernet, and Fibre Channel inputs.
The invention also supports larger storage subsystems. Many earlier fault-tolerance solutions were designed for environments in which even a 6 gigabyte disk volume was considered large. As storage costs have fallen, disk subsystem capacities have grown quickly, and server disk volumes of 100 gigabytes are now commonplace. The invention accommodates these larger disk volumes, in part by synchronizing with the host server 200 in the background, for example within the replication unit. Offloading this work from the host server to the replication unit allows the central host server 200 to be fully replicated without a significant loss of performance. By contrast, "clustered" approaches and/or replication schemes that require the local server to handle the synchronization needed for replication degrade, or even cripple, the performance of the primary server.
Although embodiments have tried hard to avoid resynchronizing (re-mirroring) mirrored disks over the communication link, at least some previously described re-mirroring embodiments require the local server 200 to intervene when the local buffer cannot hold the entire local disk volume. A re-mirroring operation slows the central/primary/host server 200 to a crawl and can take several days, so re-mirroring is generally done on weekends, when there are fewer users and a slower network is tolerable. As disk subsystems grow, this becomes unacceptable. The invention supports non-volatile storage, not only in the remote replication unit but also in the local replication unit 204, with capacity to hold the entire disk volume that is to be replicated to the remote location. This allows the local replication unit 204 to pre-acknowledge the complete local disk volume into the local smart buffer and to perform the re-mirroring in the background from the point of view of the server 200.
In at least some previously described approaches, the maximum throughput of a T1 output, whether local or remote, slows re-mirroring even when frame relay, ATM, and/or VSAT networks are available. By contrast, the invention can flexibly provide larger I/O pipeline capacity to improve performance, so that re-mirroring becomes faster and data deployment more efficient. If the replicated data stored at the remote site cannot be accessed there, the data held at the inaccessible location can be moved at high speed to another facility over a high-speed private data network. Such data networks generally support bandwidths up to OC-48 (approximately 2.488 gigabits per second). One example is a customer that normally replicates data to Chicago and now needs to use facilities in New York to perform a restore. This kind of need arises more often than was originally anticipated.
Earlier Off-SiteServer products did not provide an open application programming interface (API). Instead, they relied entirely on closed proprietary hardware (MiraLink) and closed proprietary software (Vinca). If an enterprise customer had needs outside the scope of the product, there was generally no easy way to customize or adapt it. By contrast, the invention can provide an open API so that client programs can make such modifications for particular customers or emerging markets. In particular, and without limitation, the invention has an API that provides one or more calls to configure the replication unit without interrupting the server 200, and a call to restart the replication unit, also without interrupting the server 200.
Configuration Data
System configuration data is preferably stored in a distributed manner, so that if a replication unit loses its configuration data, the data can still be recovered from its peer unit. Basic configuration data such as network information is preferably kept in non-volatile storage (on disk, or in battery-backed semiconductor memory), so that even if the configuration data on disk is lost, it can still be restored from the replication unit's peer.
The web interface preferably provides at least the following configuration options or their equivalents: IP address (remote/local), netmask (remote/local), administrator password (shared), buffer size (local), buffer high water mark (the fill level beyond which the buffer is considered filled past an acceptable level), disk volume size (configurable up to a maximum set by the manufacturer), SCSI target logical unit number (LUN), and SNMP configuration (remote/local).
The SNMP configuration itself preferably includes the following: add/delete SNMP replication hosts (remote/local), event polling interval, buffer filled beyond the acceptable limit, network connection failure, buffer full, remote out of synchronization, and add/delete e-mail recipients.
The web interface preferably provides at least the following status information: number of data blocks in the buffer, number of data blocks sent, number of data blocks received, replication unit version, replication unit serial number, disk volume size, and whether the unit is remote or local. The web interface preferably provides a utility for unmounting the remote unit. The web interface preferably also provides a log dump report. SNMP and SMTP traps are typically used for the following events: buffer filled beyond the acceptable limit, buffer full, network connection failure, and remote out of synchronization.
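The buffer high water mark, the buffer status counters, and the trap events listed above can be illustrated with a small model. This is a sketch under assumptions: the class name `ReplicationBuffer`, the percentage-based high water mark, and the plain list standing in for SNMP/SMTP traps are all hypothetical, and no real SNMP or e-mail library is used.

```python
from collections import deque


class ReplicationBuffer:
    """Bounded block buffer that records trap-like events (hypothetical)."""

    def __init__(self, capacity_blocks, high_water_percent=80):
        self.capacity = capacity_blocks
        # Integer arithmetic avoids floating-point rounding surprises.
        self.high_water = capacity_blocks * high_water_percent // 100
        self.blocks = deque()
        self.sent = 0
        self.events = []  # stand-in for SNMP/SMTP traps

    def enqueue(self, block):
        """Add a block; return False and raise an event if the buffer is full."""
        if len(self.blocks) >= self.capacity:
            self.events.append("buffer full")
            return False
        self.blocks.append(block)
        if len(self.blocks) > self.high_water:
            self.events.append("buffer above high water mark")
        return True

    def dequeue(self):
        """Remove the oldest block, as if it had been sent to the remote unit."""
        block = self.blocks.popleft()
        self.sent += 1
        return block

    def status(self):
        """Return the counters a status page might display."""
        return {"blocks_in_buffer": len(self.blocks), "blocks_sent": self.sent}
```

In this sketch an event is appended on every enqueue above the mark; a real unit would more likely report the condition once per polling interval.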
Management tools can provide notification by e-mail, pager, or other means. Notification may be in real time and/or combined with automatic logging or automatically generated reports, and may be sent to the system administrator and/or the vendor. In embodiments that run a web server/e-mail package as the interface, the many capabilities of the web can also be used. For example, a user can access and manage the replication unit locally or remotely. Depending on individual permissions, a user can reach the replication unit from inside the company and/or from anywhere in the world. The replication unit can notify users (and the replication unit vendor) of problems and significant events through e-mail and SNMP. Custom scripts can also be written for the e-mail in order to notify different users or groups of users. Report output is not mandatory. If a customer wants a special management report, then instead of copying the required data every month and rewriting it over and over, the customer or an authorized developer can use HTML, Java, and/or other familiar tools and techniques to have the replication unit generate the report in the desired format and send it by e-mail.
Basic Hardware
In general, a system according to the invention includes basic hardware such as a standard Pentium II, Pentium III, AMD K6-3, or AMD K7 class PC-compatible computer (the names are trademarks of their respective owners). In various configurations the machine preferably has at least 64, 128, or 256 megabytes of RAM and a rack-mount case. It preferably also contains a 100 Mb Ethernet card, an FDDI adapter, or the like. For the disk drive interfaces, the machine preferably has a QLogic SCSI adapter for disk drive emulation and an Adaptec 2940UW adapter for buffer and replica control, or a DPT-brand RAID adapter supported by FreeBSD. Caching may also be used, including RAID or SCSI controller caching, caching in volatile RAM in the replication unit, caching in non-volatile RAM (static RAM or battery-backed RAM) in the replication unit, and so on. Those familiar with caching tools and techniques can readily adapt them for use in accordance with the invention.
In some embodiments, if N is the size of the disk volume to be replicated, a local replication unit 204 that holds a local replica 230 needs storage capacity of at least N for that local replica. In some embodiments, the disk system that serves as the local buffer 210 (with or without a local replica) needs a capacity of at least six-fifths of N, that is, 1.2 N. The remote replication unit has at least one disk system with capacity of at least N to hold the remote replica. In all cases the local replication unit buffer 210 may need data capacity equal to that of the remote replication unit, including the buffer and a hot-swappable RAID subsystem, in order to support local re-mirroring.
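The sizing rules in the preceding paragraph reduce to simple arithmetic, sketched below. The function name `minimum_capacities` and the dictionary keys are hypothetical; the only figures taken from the text are the minimum of N for each replica and six-fifths of N for the local buffer.

```python
def minimum_capacities(volume_size):
    """Return minimum capacities in the same unit as `volume_size` (N)."""
    return {
        "local_replica": volume_size,              # at least N
        "local_buffer": -(-6 * volume_size // 5),  # at least 6/5 N, rounded up
        "remote_replica": volume_size,             # at least N
    }
```

For a 100 gigabyte volume this gives a local buffer of at least 120 gigabytes. The rounding up to a whole unit is an assumption made so that the result never falls below 1.2 N.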
Test Suite
The measurements used to gauge the performance of a system according to the invention preferably include analytical tests that measure relative performance and Boolean (pass/fail) tests that cover conformance with the key functional specifications. The Boolean tests are passed if the specified answers to all of the questions match the test results. The Boolean tests can be used to determine suitability for delivery.
Testing is preferably carried out in a local network configuration (in which the journey link 206 lies within a single local area network) and in a local-and-remote configuration (in which the local replication unit 204 and the remote replication unit are geographically distant from each other). For example, a remote network configuration may consist of two sites connected by a T1 link 206, or by public Internet bandwidth equivalent to such a journey link 206.
The analytical tests preferably use a standard disk hardware test suite, such as Bonnie (for UNIX) or PCTools (for Windows NT and Novell users). The tests compare the performance of a raw disk drive (noting its type, size, and characteristics) with that of the flexible replication unit 204. The resulting performance figures are recorded for future reference.
The following questions are preferably asked, with corrections made as necessary until the listed answers are obtained.
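Does the host 200 operating system recognize the replication unit 204 as a disk drive of the correct size? (Yes)
Can data be read from and written to the replication unit 204 without loss? (Yes)
Can the host system 200 perform arbitrary file operations on data on the replication unit 204 for 48 hours without loss? (Yes)
Can the local replication unit 204, installed with a 100 megabyte host disk volume in a remote network configuration, successfully replicate data to the remote replication unit at 300 megabytes per hour, or faster where FDDI or other faster links are supported? (Yes) Note that 300 megabytes per hour is less than 50% of the maximum carrying capacity of a T1 connection, which is approximately 617 megabytes per hour.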
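The throughput figures in the preceding test can be checked with a few lines of arithmetic. The constants below are the two figures quoted in the text; the function names `fraction_of_t1` and `hours_to_replicate` are hypothetical helpers introduced only for this illustration.

```python
T1_CAPACITY_MB_PER_HOUR = 617  # approximate T1 capacity quoted in the text
TARGET_MB_PER_HOUR = 300       # replication rate required by the test


def fraction_of_t1(rate_mb_per_hour):
    """Return the share of the quoted T1 capacity used by a given rate."""
    return rate_mb_per_hour / T1_CAPACITY_MB_PER_HOUR


def hours_to_replicate(volume_mb, rate_mb_per_hour=TARGET_MB_PER_HOUR):
    """Return the time needed to replicate a volume at a constant rate."""
    return volume_mb / rate_mb_per_hour
```

The required rate works out to roughly 48.6% of the quoted T1 capacity, which is consistent with the "less than 50%" statement, and at that rate the 100 megabyte test volume would take about 20 minutes to replicate.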
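Can the local replication unit 204 be rebooted without preventing the attached host system 200 from operating normally, that is, can the host 200 continue to serve its intended purpose without noticeable performance degradation? (Yes)
When the local replication unit 204 comes back online, does it automatically begin sending the data left in the local replication unit 204 queue to the remote replication unit over the network or other journey link 206 (for example, using TCP sockets), without data loss? (Yes) Note that this should be verified by mounting the remote replication unit's disk drive on a host system 200 both before and after the local replication unit 204 is rebooted while attached to the host system 200. After the event the remote replica should still be mountable without significant file system repair. No data should be lost, and the data should be accepted as valid by the application that produced it. After the replication unit is physically attached to the local host system 200, can the host system 200 mount the replica, and can the applications on the host system 200 and their clients successfully use the replica's data? (Yes)
Does the replication system crash or hang when given incorrect input, such as a wrong remote IP address or an invalid SCSI ID (less than zero or greater than 15)? (No) Can the user correct the information, restart the software, and have it run normally without rebooting the replication unit? (Yes) Does all of the software display the correct version number and copyright notice? (Yes)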
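One way to satisfy the incorrect-input test is to validate settings before they are applied and to report errors instead of failing. The sketch below is an assumption about how that might look, not the patented software: the function name `validate_settings` is hypothetical, and only the 0 through 15 SCSI ID range is taken from the text.

```python
import ipaddress


def validate_settings(remote_ip, scsi_id):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    try:
        # Raises ValueError for anything that is not an IPv4 or IPv6 address.
        ipaddress.ip_address(remote_ip)
    except ValueError:
        errors.append(f"invalid remote IP address: {remote_ip!r}")
    if not isinstance(scsi_id, int) or not 0 <= scsi_id <= 15:
        errors.append(f"invalid SCSI ID: {scsi_id!r} (expected 0 through 15)")
    return errors
```

Syntax checking of this kind can reject a malformed address, but it cannot detect a well-formed address that simply points at the wrong machine; that case would have to surface later as a network connection failure event.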
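If the network cable 206 is disconnected for about 30 minutes or more while the host system 200 is performing replication or other disk I/O, can the local replication unit 204 keep operating? (Yes) Is it still recognized by the host operating system as a disk drive of the correctly configured size? (Yes) Can data be read from and written to the local replication unit 204 without loss? (Yes)
After the initial replication has been established, disconnect the network cable for about 24 hours and then perform a periodic retest. Is the local replication unit 204 still recognized by the host operating system as a disk drive of the correctly configured size? (Yes) Can data still be read from and written to the local replication unit 204 without loss? (Yes)
Similarly, after forcing the host system 200 to overflow the buffer 210 (for example, by replicating repeatedly), confirm that the local replication unit 204 still operates as normally as possible. Is the local replication unit 204 still recognized by the host operating system as a disk drive of the correctly configured size? (Yes) Can data still be read from and written to the local replication unit 204 without loss? (Yes) Can the user stop and restart the enqueuing process without rebooting the local replication unit 204? (Yes) Can the user stop and restart the dequeuing process without rebooting the local replication unit 204? (Yes) If at least some data has been replicated more than once, can the user selectively flush a particular part of the buffer, that is, flush an aborted replication operation, without flushing the entire replication? (Yes)
Disconnect the network cable or other journey link 206 for about 30 minutes while the host system 200 is performing replication or other disk-I/O-intensive operations. Once the physical network link is re-established, does the local replication unit 204 begin sending data from its queue to the remote replication unit? (Yes) Are valid statistics on buffer status still available from the local replication unit 204 (overflowed or not, number of data blocks in the buffer, number of data blocks sent from the buffer and received by the remote unit)? (Yes)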
Unplug the UPS from the local replication unit 204, shut down the local replication unit 204, and wait for it to lose power. Restore power first to the local replication unit 204 and then to the host system 200. Does the host system operate normally? (Yes) Can the local replication unit 204 restart completely without preventing the attached host system 200 from operating normally? (Yes) When the local replication unit 204 comes back online, does it automatically begin sending the data left in the local replication unit buffer 210 over the network or other journey link 206, without data loss? (Yes) Note that the last two remote replica mounting tests should be run both before and after this simulated power failure. Do they pass? (Yes)
In addition, do the preceding tests pass when the host disk volume is 200 gigabytes? (Yes)
Can the remote replication unit be shut down and the remote replica mounted by another server that runs the same operating system and serves as a standby for the first host system 200? (Yes)
Can that remote host then operate normally, with no effect on its performance? (Yes) Note that the preceding two tests rely on a remote backup host attached to the same SCSI chain as the remote replication unit and its remote replica disk subsystem 312 or 614.
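Conclusion
The invention provides tools and techniques for local and/or remote data replication. In particular, a remote data replication computer system according to the invention includes one or more flexible replication characteristics. Local replication systems (for example, those in which the source and destination are less than ten miles apart) can also have these flexible replication characteristics.
For example, the system may have a serverless destination, in that an embodiment replicates data from the local server 200, as the source, through the local replication unit 204 to a remote replication unit 208, 408, 508, 608, or 708, as the destination, without using a remote server attached to the remote replication unit.
The system may also be non-invasive, in that no software designed specifically for remote data replication needs to be installed on the local server 200. Likewise, no such software needs to be installed on the second server 300 in systems that include one. Instead, each replication unit runs its own operating system and one or more remote data replication applications (including threads, processes, tasks, and so on). For example, the replication units rather than the servers buffer the data to be replicated, create and monitor connections over the journey link 206, and transmit and receive replicated data over the journey link 206, relieving the servers of that work. The system also has a disk emulation characteristic, in that data is replicated from the local server 200 to the local replication unit 204 over a standard storage subsystem bus. Suitable standard storage subsystem buses include SCSI, Fibre Channel, USB, and other non-proprietary buses. Such a bus is also referred to herein as a "connection" to the local replication unit 204.
The system may also be characterized by a TCP journey link 206 and/or an Ethernet journey line. For example, the system replicates data from the local server 200 through the local replication unit 204, which acts as a TCP client on the journey link 206, while the remote replication unit 208, 308, 408, 508, 608, or 708 acts as the TCP server. More generally, the journey line characterization indicates that the high-bandwidth, low-latency requirements imposed by SCSI, by the original Off-SiteServer serial connections, by SAN connections, and the like are not present on the connection 206 between the local replication unit 204 and the remote replication unit.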
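The client and server roles on a TCP journey link can be illustrated with the standard Python `socket` module. Everything in this sketch other than those roles is an assumption: the 8-byte header (block number and payload length), the function names, and the use of the loopback address in place of a real wide-area link. No acknowledgement, retry, or buffering logic is shown.

```python
import socket
import struct
import threading


def recv_exact(conn, count):
    """Read exactly `count` bytes or raise if the link closes early."""
    data = b""
    while len(data) < count:
        chunk = conn.recv(count - len(data))
        if not chunk:
            raise ConnectionError("journey link closed early")
        data += chunk
    return data


def remote_unit(listener, replica):
    """Remote replication unit: the TCP server that stores received blocks."""
    conn, _ = listener.accept()
    with conn:
        while True:
            first = conn.recv(1)
            if not first:  # the client closed the link cleanly
                break
            header = first + recv_exact(conn, 7)
            block_no, length = struct.unpack("!II", header)
            replica[block_no] = recv_exact(conn, length)


def local_unit(address, blocks):
    """Local replication unit: the TCP client that sends buffered blocks."""
    with socket.create_connection(address, timeout=5) as conn:
        for block_no, payload in blocks:
            conn.sendall(struct.pack("!II", block_no, len(payload)) + payload)


def demo():
    """Send two blocks over a loopback link and return the remote replica."""
    replica = {}
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    server = threading.Thread(target=remote_unit, args=(listener, replica),
                              daemon=True)
    server.start()
    local_unit(listener.getsockname(), [(0, b"alpha"), (7, b"beta")])
    server.join(timeout=5)
    listener.close()
    return replica
```

Calling `demo()` sends two blocks across the loopback connection and returns the dictionary that the server side built from them. Because TCP tolerates latency and retransmits lost segments, a link of this kind does not have to meet the timing requirements of a storage bus.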
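The system may also be characterized by multiplicity. That is, the system can provide many-to-one replication from two or more local (primary) servers 200 to a single remote replication unit 208, 308, 408, 508, 608, or 708. The replicated data held in the non-volatile storage of the remote replication unit may then comprise one disk partition per primary network server 200, each partition holding the replicated data of the corresponding server 200; one external hard disk 614 per server 200; one RAID unit 312 per server 200; or a combination of these. The various primary (local) servers 200 may all use the same operating system or may use a mix of different operating systems. In some cases the destination non-volatile storage is large enough to hold the combined current non-volatile data of all of the primary servers 200. As another multiplicity characteristic, the system can provide one-to-many replication from a given local (primary) server 200 to two or more remote replication units 208, 308, 408, 508, 608, or 708.
The invention also provides methods, including methods for installing flexible replication units, methods for using such units, and methods for doing both. For example, one method for flexible data replication includes at least two installation steps from the group 1300. Another method for flexible data replication includes one or more transmission steps 1302.
One of the installation steps involves connecting the local server 200 to the local replication unit 204 with the standard disk subsystem bus 202, thereby allowing the local replication unit 204 to emulate a disk subsystem in communications over the link 202. Step 1306 involves connecting the local replication unit 204 to the journey link 206 for data transmission over at least one of an Ethernet connection and a TCP connection. Step 1308 involves connecting the remote replication unit 208, 308, 408, 508, 608, or 708 to the journey link 206 for data reception over at least one of an Ethernet connection and a TCP connection. After at least part of at least one of these connecting steps has been completed, a testing step 1310 tests at least one of the remote replication units 208, 308, 408, 508, 608, or 708.
One of the transmission steps 1302 is step 1312, which transmits data from the local server 200 to the local replication unit 204 over the standard disk subsystem bus 202 while the local replication unit 204 emulates a disk subsystem. Step 1314 transmits data from the local replication unit 204 over the journey link 206 to the remote replication unit 208, 308, 408, 508, 608, or 708. Step 1316, which may be carried out by the same data transmission as step 1314, transmits data from the local replication unit 204 over the journey link 206 to the remote replication unit 208, 308, 408, 508, 608, or 708 when the remote replication unit is serverless, that is, when it is not attached to a second server 300.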
In these and other embodiments, the invention may have additional features, such as those directed to role reversal 1506; hot standby server implementations 1508; various buffering and other storage features 1510, 1518, 1528; capture 1512 and playback 1514 of commands on a SCSI or other bus; transactions 1516; embodiments 1520 that run multiple remote replication unit software instances on a single hardware platform; identification 1522 of frequently accessed data on the basis of observation over time, rather than detailed prior knowledge of a given application's storage operations, in order to support application state recovery 1524; and use 1526 of a non-authoritative secondary server.
Embodiments of the invention can mask the latency of the journey link 206 even when the connection to the remote replication unit has relatively low bandwidth. This makes long-distance off-site replication practical in situations where replication was previously unavailable even over dedicated fiber, and it allows replication over low-cost network connections. Even a low-cost connection whose bandwidth is sufficient only for the average disk data change rate, rather than for the peak rate, can be used without error. Embodiments of the invention are suited not only to backup and recovery but also to use as a high-availability primary storage system. In remote many-to-one embodiments, the kernel module, or a software interface that connects the buffer to the SCSI or other transport protocol, can be replaced by a more general user-space control module that emulates the interfaces of the system's devices without a real SCSI or other transport protocol layer. These devices may include, for example, the local buffer, the remote buffer, the local replica, the remote replica, and the SCSI or other transport protocol layer. The hardware platform that runs the SAN management software may be centralized.
Particular embodiments (methods, configured storage media, and systems) of the invention are illustrated and described herein. To avoid unnecessary repetition, concepts and details applicable to one embodiment are not always restated for the other embodiments. Unless otherwise indicated, however, the descriptions herein of particular embodiments of the invention extend to the other embodiments. For example, discussions of the invention's systems also pertain to its methods, and vice versa, and descriptions of the inventive methods also pertain to the corresponding configured storage media, and vice versa.
As used herein, terms such as "a" and "the" and item designations such as "replication unit" are generally inclusive of one or more of the indicated item. The invention may be embodied in other specific forms without departing from its essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. Headings are for convenience only. The scope of the invention is indicated by the appended claims rather than by the foregoing description. All changes that come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Claims (13)
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US20946900P | 2000-06-05 | 2000-06-05 | |
| US60/209,469 | 2000-06-05 | ||
| US22393400P | 2000-08-09 | 2000-08-09 | |
| US60/223,934 | 2000-08-09 | ||
| US26214301P | 2001-01-16 | 2001-01-16 | |
| US60/262,143 | 2001-01-16 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1457457A CN1457457A (en) | 2003-11-19 |
| CN1256672C true CN1256672C (en) | 2006-05-17 |
Family
ID=27395378
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CNB018131956A Expired - Fee Related CN1256672C (en) | 2000-06-05 | 2001-06-02 | Remote data flexible replication system and method |
Country Status (10)
| Country | Link |
|---|---|
| EP (1) | EP1305711A4 (en) |
| JP (1) | JP4945047B2 (en) |
| KR (1) | KR20030066331A (en) |
| CN (1) | CN1256672C (en) |
| AU (2) | AU2001265335B2 (en) |
| BR (1) | BR0111422A (en) |
| CA (1) | CA2449984A1 (en) |
| IL (2) | IL153163A0 (en) |
| MX (1) | MXPA02012065A (en) |
| WO (1) | WO2001097030A1 (en) |
Family Cites Families (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO1994000816A1 (en) * | 1992-06-18 | 1994-01-06 | Andor Systems, Inc. | Remote dual copy of data in computer systems |
| JPH07146810A (en) * | 1993-09-27 | 1995-06-06 | Toshiba Corp | Computer system |
| KR0128271B1 (en) * | 1994-02-22 | 1998-04-15 | 윌리암 티. 엘리스 | Remote data duplexing |
| US5574950A (en) * | 1994-03-01 | 1996-11-12 | International Business Machines Corporation | Remote data shadowing using a multimode interface to dynamically reconfigure control link-level and communication link-level |
| US5592618A (en) * | 1994-10-03 | 1997-01-07 | International Business Machines Corporation | Remote copy secondary data copy validation-audit function |
| US5870537A (en) * | 1996-03-13 | 1999-02-09 | International Business Machines Corporation | Concurrent switch to shadowed device for storage controller and device errors |
| US6052797A (en) * | 1996-05-28 | 2000-04-18 | Emc Corporation | Remotely mirrored data storage system with a count indicative of data consistency |
| US6101497A (en) * | 1996-05-31 | 2000-08-08 | Emc Corporation | Method and apparatus for independent and simultaneous access to a common data set |
| US5794254A (en) * | 1996-12-03 | 1998-08-11 | Fairbanks Systems Group | Incremental computer file backup using a two-step comparison of first two characters in the block and a signature with pre-stored character and signature sets |
| JPH1139273A (en) * | 1997-07-17 | 1999-02-12 | Chubu Nippon Denki Software Kk | Remote backup system |
| JPH1185594A (en) * | 1997-09-01 | 1999-03-30 | Hitachi Ltd | Information processing system for remote copy |
| US6065018A (en) * | 1998-03-04 | 2000-05-16 | International Business Machines Corporation | Synchronizing recovery log having time stamp to a remote site for disaster recovery of a primary database having related hierarchial and relational databases |
| JPH11305947A (en) * | 1998-04-17 | 1999-11-05 | Fujitsu Ltd | Remote transfer method by magnetic disk controller |

2001
- 2001-06-02: CA, application CA002449984A, publication CA2449984A1, status: Abandoned (not active)
- 2001-06-02: AU, application AU2001265335A, publication AU2001265335B2, status: Ceased (not active)
- 2001-06-02: BR, application BR0111422-0A, publication BR0111422A, status: IP Right Cessation (not active)
- 2001-06-02: IL, application IL15316301A, publication IL153163A0, status: IP Right Grant (active)
- 2001-06-02: WO, application PCT/US2001/017920, publication WO2001097030A1, status: Ceased (not active)
- 2001-06-02: KR, application KR1020027016613A, publication KR20030066331A, status: Abandoned (not active)
- 2001-06-02: JP, application JP2002511168A, publication JP4945047B2, status: Expired - Fee Related (not active)
- 2001-06-02: CN, application CNB018131956A, publication CN1256672C, status: Expired - Fee Related (not active)
- 2001-06-02: EP, application EP01939862A, publication EP1305711A4, status: Withdrawn (not active)
- 2001-06-02: MX, application MXPA02012065A, publication MXPA02012065A, status: IP Right Grant (active)
- 2001-06-02: AU, application AU6533501A, publication AU6533501A, status: Pending (active)

2002
- 2002-11-28: IL, application IL153163A, publication IL153163A, status: IP Right Cessation (not active)
Also Published As
| Publication number | Publication date |
|---|---|
| BR0111422A (en) | 2004-02-10 |
| CA2449984A1 (en) | 2001-12-20 |
| EP1305711A4 (en) | 2007-05-02 |
| WO2001097030A1 (en) | 2001-12-20 |
| IL153163A (en) | 2008-11-26 |
| MXPA02012065A (en) | 2003-04-25 |
| AU2001265335B2 (en) | 2007-01-25 |
| JP2004523017A (en) | 2004-07-29 |
| IL153163A0 (en) | 2003-06-24 |
| JP4945047B2 (en) | 2012-06-06 |
| AU6533501A (en) | 2001-12-24 |
| EP1305711A1 (en) | 2003-05-02 |
| KR20030066331A (en) | 2003-08-09 |
| CN1457457A (en) | 2003-11-19 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN1256672C (en) | | Remote data flexible replication system and method |
| US7203732B2 (en) | | Flexible remote data mirroring |
| TW454120B (en) | | Flexible remote data mirroring |
| US8706694B2 (en) | | Continuous data protection of files stored on a remote storage device |
| JP5420242B2 (en) | | System and method for high performance enterprise data protection |
| EP1999584B1 (en) | | Method for improving mean time to data loss (mtdl) in a fixed content distributed data storage |
| US10185583B1 (en) | | Leveraging snapshots |
| US9606881B1 (en) | | Method and system for rapid failback of a computer system in a disaster recovery environment |
| AU2001265335A1 (en) | | Flexible remote data mirroring |
| CN102187311B (en) | | Methods and systems for recovering a computer system using a storage area network |
| CN1581105A (en) | | Remote copy system |
| KR20010041762A (en) | | Highly available file servers |
| CN1906593A (en) | | System and method for failover |
| AU2009324800A1 (en) | | Method and system for managing replicated database data |
| US7047261B2 (en) | | Method for file level remote copy of a storage device |
| US12235867B2 (en) | | Replication progress tracking technique |
| US12038817B2 (en) | | Methods for cache rewarming in a failover domain and devices thereof |
| WO2024174477A1 (en) | | Synchronous remote replication method and apparatus for storage system |
| Asami et al. | | Designing a self-maintaining storage system |
| JP2008033967A (en) | | External storage device, data recovery method for external storage device, and program |
| JPH09218840A (en) | | Information processing method, its apparatus, and information processing system |
| Kleiman et al. | | Using NUMA interconnects for highly available filers |
| US8941863B1 (en) | | Techniques for image duplication optimization |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | C19 | Lapse of patent right due to non-payment of the annual fee | |
| | CF01 | Termination of patent right due to non-payment of annual fee | |