JP6671708B2

JP6671708B2 - Backup restore system and backup restore method

Info

Publication number: JP6671708B2
Application number: JP2016022712A
Authority: JP
Inventors: 仁志亀井; 隆喜中村
Original assignee: Tohoku University NUC; Hitachi Ltd
Current assignee: Tohoku University NUC; Hitachi Ltd
Priority date: 2016-02-09
Filing date: 2016-02-09
Publication date: 2020-03-25
Anticipated expiration: 2036-02-09
Also published as: JP2017142605A

Description

The present invention generally relates to at least one of backup and restore, for example, to both.

To improve disaster resistance, systems that perform backup and restore over a network have been built. As shown in Non-Patent Document 1, a primary site backs up data to a remote data center. When a disaster strikes the primary site and the primary site is later recovered, the recovered primary site restores the backed-up data from the data center.

A technique is also known that makes backup and restore more efficient by processing them based on the update difference data of files. For example, as disclosed in Patent Document 1, updated file data blocks are backed up as difference data in order to manage update differences.

WO2007/131190
US6604118

With the technique disclosed in Non-Patent Document 1, the restore process takes a long time. Specifically, after the primary site damaged by a disaster has been physically recovered, data is sent from the remote data center to the recovered primary site. At that point, the data transfer takes a long time because of network bandwidth constraints.

Even with the technique disclosed in Patent Document 1, the same problem as in Non-Patent Document 1 arises when the difference data is large. In addition, for restoration, the base snapshot from which the differences are derived must first be sent to the recovered primary site, which lengthens the data transfer time.

Furthermore, if the base snapshot is lost, restoration using the differences becomes impossible, so the remote data site must store a plurality of base snapshots to improve disaster resistance. Doing so, however, increases the backup data capacity.

At least one of the following is provided: a first object server that backs up one or more backup target objects, each of which is a file or a directory, to at least one of a plurality of backup servers; and a second object server that restores one or more restore target objects, each of which is a file or a directory, from at least one of the plurality of backup servers. The one or more restore target objects may be the one or more backup target objects that have been backed up. Each of the one or more backup target objects and each of the one or more restore target objects consists of object data and metadata about that object data.

For each of the object data and the metadata in each of the one or more backup target objects, the first object server:
determines whether the target data (that data, which is object data or metadata) meets a first condition or a second condition;
when the target data meets the first condition, backs up one or more replicas of the target data to two or more backup servers; and
when the target data meets the second condition, backs up two or more pieces of data, which are fragment data of redundancy V (V is a value greater than 1) obtained according to EC (Erasure Coding) of the target data, to two or more backup servers. Note that the fragment data of redundancy V of the target data need not be fragment data obtained according to EC. For example, if the two or more pieces of data include a replica of a piece of fragment data, a lost piece of fragment data can be replaced by its replica, so redundancy of the fragment data is achieved.
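The per-data decision described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that the first condition is "the data is metadata" and the second is "the data is object data", and the names `BackupPlan` and `plan_backup` as well as the default counts (3 replicas, 10 fragments, 2 parities) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BackupPlan:
    method: str   # "replicate" (first condition) or "ec" (second condition)
    pieces: int   # number of pieces to distribute across backup servers


def plan_backup(kind: str, replicas: int = 3,
                fragments: int = 10, parities: int = 2) -> BackupPlan:
    """Choose a backup method for one piece of target data.

    Assumption: metadata meets the first condition and object data meets
    the second condition. All default counts are illustrative only.
    """
    if kind == "metadata":
        # First condition: send whole replicas to two or more servers.
        return BackupPlan("replicate", replicas)
    if kind == "object_data":
        # Second condition: send EC fragments plus parity data.
        return BackupPlan("ec", fragments + parities)
    raise ValueError(f"unknown data kind: {kind}")
```

Other embodiments may use different conditions (for example, a size threshold); only the body of `plan_backup` would change in that case.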

For each of the one or more backup target objects, the first object server registers information on the position of that backup target object in a first file system space and on the backup destination of that backup target object in management information held in at least one of the plurality of backup servers. The first file system space is a file system space (for example, a namespace) that contains the one or more backup target objects. The management information may instead be stored in a shared storage area other than the plurality of backup servers (a storage area accessible to both the first and second object servers).

For each of the one or more restore target objects, the second object server identifies the position of that restore target object in a second file system space and the backup destination of that restore target object by referring to the management information described above. The second file system space is a file system space that contains the one or more restore target objects and may be, for example, the first file system space that is to be restored.

For each of the one or more restore target objects, the second object server:
receives, as the metadata of the target object (that object), at least one replica of the metadata or fragment data of the metadata from at least one of the plurality of backup servers;
receives, as the object data of the target object, at least one replica of the object data or fragment data of the object data from at least one of the plurality of backup servers; and
restores the target object to its position in the second file system space based on the received metadata and the received object data.

The second object server may be an object server that replaces the first object server (after recovery), or may be an object server added for a purpose such as scaling out the first object server.

High restore performance and reduced backup data capacity can both be achieved. Specifically, data that meets the first condition (for example, generally small data such as metadata) is backed up as replicas, and data that meets the second condition (for example, data that can be large, such as object data) is divided into a plurality of fragment data and backed up, so the time required to transfer the one or more restore target objects from the plurality of backup servers can be shortened. In addition, because data that meets the second condition is divided by encoding according to EC (Erasure Coding), data capacity can be reduced while redundancy is maintained.
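The capacity argument can be made concrete with a short calculation. The sketch below compares the bytes stored for full replicas with the bytes stored for EC with p fragments and q parities. The function names and the example figures (a 1 GB object, 3 replicas, p = 10, q = 2) are assumptions chosen for illustration; both configurations tolerate the loss of two stored pieces.

```python
def replicated_bytes(size: int, replicas: int) -> int:
    """Total bytes stored when `replicas` full copies are kept."""
    return size * replicas


def ec_bytes(size: int, p: int, q: int) -> int:
    """Total bytes stored for p data fragments plus q parity fragments.

    Each fragment holds ceil(size / p) bytes; parity fragments are
    assumed to be the same size as data fragments.
    """
    fragment = -(-size // p)  # integer ceiling division
    return fragment * (p + q)


one_gb = 1_000_000_000
print(replicated_bytes(one_gb, 3))  # 3000000000
print(ec_bytes(one_gb, 10, 2))      # 1200000000
```

Under these assumptions EC stores 1.2 times the original size, whereas three replicas store 3 times the original size.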

FIG. 1 shows the configuration of a backup/restore system according to Embodiment 1.
FIG. 2 shows the configuration of a file server.
FIG. 3 shows the configuration of a backup server.
FIG. 4 shows the configuration of a backup setting table.
FIG. 5 shows an example of the configuration of a backup management table.
FIG. 6 is a schematic diagram of a restore process.
FIG. 7 shows the flow of a backup process.
FIG. 8 shows the flow of a restore process.
FIG. 9 shows the configuration of a backup method table according to Embodiment 2.
FIG. 10 shows the flow of a backup method determination process according to Embodiment 2.
FIG. 11 shows the configuration of a backup method table according to Embodiment 3.
FIG. 12 shows the flow of a backup method determination process according to Embodiment 3.

Several embodiments are described below.

In the following description, information may be described using an expression such as “xxx table”, but the information may be expressed in any data structure. That is, to indicate that the information does not depend on the data structure, an “xxx table” can be called “xxx information”. Also, in the following description, the configuration of each table is an example; one table may be divided into two or more tables, and all or part of two or more tables may be combined into one table.

In the following description, processing may be described with a “program” as the subject. However, a program is executed by a processor (for example, a CPU (Central Processing Unit)) and performs the defined processing while using storage resources (for example, memory) and/or a communication interface device (for example, a communication port) as appropriate, so the subject of the processing may be regarded as the processor. Processing described with a program as the subject may be processing performed by a processor or by an apparatus having that processor. The processor may also include a hardware circuit that performs part or all of the processing. A program may be installed in each controller from a program source. The program source may be, for example, a program distribution computer or a computer-readable storage medium. In the following description, two or more programs may be implemented as one program, and one program may be implemented as two or more programs.

In the following description, a reference numeral (or the common part of reference numerals) is used when elements of the same kind are described without distinction, and an element ID (or the full reference numeral of the element) may be used when elements of the same kind are described with distinction. For example, when file servers are described without particular distinction, they are written as “file server 105”, and when individual file servers are distinguished, they may be written as “file server 105A” and “file server 105B”.

FIG. 1 shows the configuration of the backup/restore system according to Embodiment 1.

The backup/restore system 100 includes a client 101, a primary site 103, a management server 104 located at the primary site 103, a file server 105A, a backup site 106, a backup server 107 located at the backup site 106, a recovery site 108, and a file server 105B located at the recovery site 108. There are a plurality of backup servers 107. The backup/restore system 100 is an example of a computer system and is called a backup/restore system because it includes the file servers 105A and 105B. If there is no file server 105B, the computer system including the file server 105A may be called a backup system. Conversely, if there is no file server 105A, the computer system including the file server 105B may be called a restore system.

The client 101, the primary site 103, the backup site 106, and the recovery site 108 are connected to one another by a communication path 102. The communication path 102 may be, for example, a communication network such as a LAN (Local Area Network) or the Internet.

The client 101 is a terminal (computer) that uses the file server 105A or the file server 105B. An end user of the client 101 uses a file access program 110 of the client 101 to connect to the file server 105A or the file server 105B and reads and writes files stored in the file server 105A or the file server 105B.

The primary site 103 is the site that provides a file access service to the client 101 before a disaster occurs.

The management server 104 of the primary site 103 is a terminal (computer) that manages the entire backup/restore system 100. The administrator of the backup/restore system 100 uses the management server 104 to configure the backup/restore system 100. For example, the administrator uses the management server 104 to set the IP address of the backup site 106A in the file server 105A. The management server 104 may be placed at any location, such as the backup site 106 or the recovery site 108.

The file server 105A in the primary site 103 is a computer that provides a file access service to the client 101. The file server 105A is an example of the first object server. The file access service may be of any type, for example, NFS (Network File System) or CIFS (Common Internet File System). The file server 105A of the primary site 103 is, for example, a server that operates before a disaster occurs.

The backup site 106 is a site that holds backup data. Data relating to at least part of the object data (for example, file data) and the metadata of an object (a file or a directory) held by the file server 105 is backed up to the backup server 107 of the backup site 106; specifically, as described later, this data is at least one of replica data, fragment data, and parity data. The backup server 107 receives data from the file server 105A periodically or irregularly and stores the received data in a disk device (an example of a nonvolatile storage device) of the backup server 107. The operations of the file server 105 and the backup server 107 are described in detail later. Because one backup server 107 exists at each backup site 106, the backup site 106 and the backup server 107 are substantially synonymous in this embodiment.

The recovery site 108 is the site that provides the file access service to the client 101 after a disaster occurs.

The file server 105B of the recovery site 108 is a computer that provides the file access service to the client 101 after a disaster occurs. The file server 105B is an example of the second object server. When the primary site 103 is physically damaged by a disaster, the file server 105B of the recovery site 108 takes over the service. To make this possible, the file server 105A of the primary site 103 backs up (saves) the data it holds to the backup server 107 of the backup site 106 periodically or irregularly. After the disaster, the file server 105B of the recovery site 108 acquires the backed-up data from the backup server 107 and resumes the file access service. The client 101 uses the resumed file access service.

In the backup/restore system 100, the file server 105A of the primary site 103 determines a backup method for each of the object data and the metadata in each of the one or more backup target objects. Specifically, it determines whether that data meets the first condition or the second condition.

Generally small data such as metadata meets the first condition. In this case, the file server 105A replicates the data to a plurality of backup servers 107.

On the other hand, data that can generally be large, such as object data, meets the second condition. In this case, the file server 105A backs up a plurality of fragment data, obtained by dividing the data, to a plurality of backup servers 107. To improve disaster resistance, encoding according to EC (Erasure Coding) is used for the data division.

When data is restored, for data that met the first condition, replica data is transferred from at least one backup server 107 to the file server 105B, and the restore is performed using the replica data. In this embodiment, the first condition is that the data is metadata. Metadata is restored with priority over file data, so the metadata can be restored promptly after a disaster. In addition, because metadata is generally small (for example, on the order of several kilobytes), the capacity overhead is small.

For data that met the second condition, fragment data is transferred in parallel from the plurality of backup servers 107 to the file server 105B. The data transfer time is therefore shorter than when the data (typically file data) is transferred from a single backup server 107. In addition, because EC is used for the data division, higher bandwidth is achieved for backup and restore while capacity overhead is kept low. The data is restored by decoding using the plurality of fragment data.
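The effect of parallel transfer can be estimated with a simple model. The sketch below is an idealized calculation, assuming that every backup server has the same link bandwidth, that the receiving file server is not the bottleneck, and that fragments are of equal size. The function name and the figures (10 GB of data, 100 MB/s per server, 10 servers) are illustrative assumptions only.

```python
def transfer_seconds(size_bytes: int, bandwidth_bytes_per_s: int,
                     servers: int = 1) -> float:
    """Idealized transfer time when the data is split evenly across
    `servers` backup servers that send their fragments in parallel."""
    per_server = size_bytes / servers
    return per_server / bandwidth_bytes_per_s


size = 10_000_000_000        # 10 GB of object data (assumed)
bandwidth = 100_000_000      # 100 MB/s per backup server (assumed)

single = transfer_seconds(size, bandwidth)              # one server
parallel = transfer_seconds(size, bandwidth, servers=10)  # ten servers
print(single, parallel)
```

In practice the gain is bounded by the inbound bandwidth of the file server 105B and by the decoding cost, which this model ignores.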

Embodiment 1 is described in detail below.

FIG. 2 shows the configuration of the file server 105.

The file server 105 has a network I/F (interface) 201, a CPU 202, a disk device 203, and a memory 205. These are connected to one another by an internal communication path 204.

The network I/F 201 is a device that is connected to the communication path 102 and is used, for example, to accept file access requests from the client 101. The CPU 202 is a device that executes programs stored in the memory 205 and is an example of a processor. The disk device 203 is a storage device that stores program files and files created by end users. The memory 205 is a device that holds programs and data structures. Programs are stored in the disk device 203, and the CPU 202 loads a program into the memory 205 when executing it. Hereinafter, unless otherwise stated, programs are assumed to be executed by the CPU after being loaded from the disk device into the memory.

The file sharing server program 206 is a program that processes file access requests from the file access program 110 of the client 101. A file access request is, for example, a request to read a file (READ) or to write data to a file (WRITE). File data received through a file access request is stored in the disk device 203. That file data is backed up to the backup server 107.

The file system program 207 is a program that manages the data and management information (metadata) of files stored in the disk device 203. The file sharing server program 206 passes file data sent from the client 101 to the file system program 207, and the file system program 207 stores the data in the disk device 203 and manages it.

The backup determination program 208 is a program that determines the backup destination backup server 107 and the backup method. The processing of the backup determination program 208 is described in detail later.

The backup data transmission/reception program 209 is a program that sends backup data to the backup server 107 over the communication path 102. The protocol used to send and receive backup data is, for example, a REST (Representational State Transfer) protocol or the SMB (Server Message Block) protocol.

The replica data creation program 210 is a program that creates the backup data of a file to be backed up. The replica data creation program 210 creates backup data in which the metadata and the data of the backup target file are separated. The replica data creation program 210 also redundantly encodes the data according to EC. EC (Erasure Coding) is an encoding method that divides data into fragment data of a certain size and creates parity data for that fragment data. The original data can be restored even if as many pieces of fragment data as there are pieces of parity data are lost. For example, suppose that the replica data creation program 210 has created, as the backup data of a certain file, 10 pieces of replica data (simply replicated metadata), 10 pieces of fragment data, and 2 pieces of parity data. The replica data creation program 210 backs these up (distributes them) to the plurality of backup servers 107. This improves the disaster resistance of the backup/restore system 100. There are various EC schemes, such as Reed-Solomon coding, and any of them may be adopted.
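The fragment-and-parity idea can be illustrated with the simplest possible code. The sketch below is not Reed-Solomon coding: it uses a single XOR parity, so it tolerates the loss of only one fragment, whereas the example in the text uses 2 pieces of parity data. The function names, the zero-byte padding, and the sample data are assumptions made for illustration.

```python
from functools import reduce
from typing import List, Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, p: int) -> Tuple[List[bytes], bytes]:
    """Split data into p equal-size fragments and one XOR parity."""
    size = -(-len(data) // p)                 # ceiling division
    padded = data.ljust(size * p, b"\0")      # zero padding (assumption)
    fragments = [padded[i * size:(i + 1) * size] for i in range(p)]
    parity = reduce(xor_bytes, fragments)
    return fragments, parity


def recover(fragments: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None) from the parity."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can rebuild only one fragment")
    rebuilt = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        rebuilt[missing[0]] = reduce(xor_bytes, survivors, parity)
    return rebuilt


def decode(fragments: List[bytes], original_len: int) -> bytes:
    """Concatenate fragments and strip the padding."""
    return b"".join(fragments)[:original_len]


data = b"backup restore system"
frags, par = encode(data, 4)
damaged = list(frags)
damaged[2] = None                             # simulate a lost backup site
restored = decode(recover(damaged, par), len(data))
```

A scheme such as Reed-Solomon coding generalizes this so that any q of the p + q stored pieces may be lost.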

The replica data restoration program 211 is a program that restores the original file using the backup data created by the replica data creation program 210. If the backup data has been made redundant by EC, it performs decoding to compute the original data.

The management program 212 is a program that creates or manages management information used by programs such as the replica data creation program 210. For example, the management program 212 sets or manages the EC scheme, the number of simple replicas, and the like. The management program 212 is run by the administrator through the management server 104. The administrator connects from the management server 104 to the file server 105 using any protocol (for example, the SSH (Secure Shell) protocol) and runs the management program 212.

The backup setting table 213 is a table, generated by the management program 212, that defines the backup method and related settings. The backup setting table 213 is described in detail later.

FIG. 3 shows the configuration of the backup server 107.

The backup server 107 has a network I/F 301, a CPU 302, a memory 303, and a disk device 310. These are connected to one another by an internal communication path 307.

The network I/F 301 is a device connected to the external communication path 102. The CPU 302 is a device that executes programs stored in the memory 303. The disk device 310 is a device that holds programs, a backup management table 308, fragment data 309 (309A, 309B, ...), parity data 311, and replica data 312. The fragment data 309 is a part of the data, namely partial data obtained by encoding the data according to EC. The parity data 311 is data used to generate one of the one or more pieces of fragment data 309. The replica data 312 is a replica of data (for example, metadata). The file system program 304 is a program that manages backup data as files.

The backup data transmission/reception program 305 is a program that saves backup data sent from the file server 105 (105A) to the disk device 310 as files and that sends backup data to the file server 105 (105B) in response to a backup data acquisition request from the file server 105 (105B). As described above, a protocol such as REST may be used to send and receive backup data.

The management program 306 is a program that configures the backup server 107. For example, the management program 306 sets the IP address of the backup server 107.

The backup management table 308 stored in the disk device 310 is a table that manages, among other things, the backup servers 107 in which backup data is stored and the EC scheme. The contents and use of the backup management table 308 are described in detail later. The backup management table 308 may be stored in at least one backup server 107 or in another shared storage area accessible to the file servers 105A and 105B.

FIG. 4 shows the configuration of the backup setting table 213.

The backup setting table 213 has a record for each of the plurality of backup servers 107 (backup sites 106). Each record stores a site ID 401, a site name 402, and a URL 403. The backup setting table 213 also stores an EC setting 410 and a replication setting 411.

The site ID 401 is the ID (for example, a serial number) of the backup site 106. The site name 402 is the name of the backup site 106. The URL 403 is the access path from the file server 105 to the backup server 107 (the storage in the backup server 107). To access the fragment data 309 and the parity data 311, the file server 105 generates an access path based on the URL 403 and acquires or sends that data. Another type of access path may be used instead of the URL 403. The EC setting 410 includes information on the policy for encoding according to EC, for example, information indicating the number of fragment data, the number of parity data, and the EC scheme. The replication setting 411 includes information indicating the number of replications (the number of replica data) used when replicating by the replication method. For example, the number of replica data (the number of replications) of metadata is determined according to the replication setting 411.
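One way to picture the backup setting table 213 is as a small set of records. The sketch below is an assumed in-memory representation: the field names mirror the reference numerals in the text, while the class names, the `access_path` helper, its `<URL>/<object ID>/<piece name>` layout, and the example URL are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SiteRecord:
    site_id: int      # site ID 401
    site_name: str    # site name 402
    url: str          # URL 403 (access path to the backup server)


@dataclass
class EcSetting:      # EC setting 410
    fragments: int    # number of fragment data
    parities: int     # number of parity data
    scheme: str       # EC scheme, e.g. "reed-solomon"


@dataclass
class BackupSettingTable:
    sites: List[SiteRecord]
    ec: EcSetting
    replicas: int     # replication setting 411 (number of replica data)


def access_path(site: SiteRecord, object_id: str, piece: str) -> str:
    """Build an access path from URL 403.

    The path layout is an assumption made for this sketch only.
    """
    return f"{site.url.rstrip('/')}/{object_id}/{piece}"


table = BackupSettingTable(
    sites=[SiteRecord(1, "site-a", "https://backup-a.example.com/store/")],
    ec=EcSetting(fragments=10, parities=2, scheme="reed-solomon"),
    replicas=3,
)
print(access_path(table.sites[0], "obj-42", "frag-0"))
```

Keeping the EC and replication settings outside the per-site records reflects the text, in which those settings apply to the table as a whole rather than to an individual site.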

FIG. 5 shows the configuration of the backup management table 308.

The backup management table 308 has a record for each of one or more backup target objects (restore target objects). Each record stores a path 501, an ID 502, a type 503, Meta 504, Data 505, MetaBUP 506, DataBUP 507, and ParityBUP 508.

The path 501 represents the path (path name) to the object in the file system space and is an example of position information of the object in that space. At restore time, the file or directory is restored to the same path (position) as the path 501. The ID 502 is the ID of the object. The type 503 indicates the type of the object (file or directory).

Meta 504 indicates the backup method for the metadata in the object. When Meta 504 is "Rep.", replicated data of the metadata is backed up. When the metadata is backed up according to EC, "(p, q)" is set, in the same way as described below for Data 505. Data 505 indicates the backup method for the object data in the object. When the object data is backed up according to EC, "(p, q)" is set as Data 505, where p is the number of fragment data and q is the number of parity data.

MetaBUP 506 gives the site name of the backup destination (backup site 106) of the metadata. DataBUP 507 gives the site name of the backup destination (backup site 106) of the object data. When one or more parity data 311 have been backed up for at least one of the metadata and the object data, ParityBUP 508 gives the site name of the backup destination (backup site 106) of those parity data 311. ParityBUP 508 may hold two kinds of site names separately: the backup destination of the parity data 311 for the metadata, and the backup destination of the parity data 311 for the object data.
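For illustration, one record of the backup management table 308 could be modeled roughly as follows. This is only a sketch: the class name, field names, Python types, the "DIR" type value and the sample site names are assumptions made for the example and do not come from the figures.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BackupRecord:
    """One record of the backup management table 308 (names are hypothetical)."""
    path: str                            # path 501
    object_id: int                       # ID 502
    kind: str                            # type 503: "FILE", or "DIR" (assumed label)
    meta: str                            # Meta 504, e.g. "Rep."
    data: Optional[Tuple[int, int]]      # Data 505: (p, q), or None for a directory
    meta_bup: List[str] = field(default_factory=list)    # MetaBUP 506
    data_bup: List[str] = field(default_factory=list)    # DataBUP 507
    parity_bup: List[str] = field(default_factory=list)  # ParityBUP 508


# Sample values are invented for the example.
record = BackupRecord(
    path="/Root/HOME/FileX", object_id=2, kind="FILE", meta="Rep.",
    data=(2, 1), meta_bup=["SiteA"], data_bup=["SiteA", "SiteB"],
    parity_bup=["SiteC"],
)
```

Keeping the site names as lists reflects that fragment data and parity data for one object may be spread over several backup sites.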

The backup management table 308 may be divided into a plurality of record groups (each consisting of one or more records), and those record groups may be distributed across the plurality of backup servers 107. At least one record of the backup management table 308 may also be stored in the shared storage area mentioned above.

The processing performed in this embodiment is described below.

FIG. 6 is a schematic diagram of the restore processing.

As an example, three objects (the directory "HOME", file X and file Y) are restored to the file server 105 of the recovery site 108 using the fragment data 309, replicated data 312 and parity data 311 stored in the two disk devices 310 of the two backup servers 107A and 107B (backup sites 106A and 106B). For one piece of object data there are two or more fragment data and one or more parity data. When the number of these data is less than or equal to the number of backup servers, each of them is backed up to a different backup server 107. When the number of data exceeds the number of backup servers, a single backup server 107 holds a plurality of data (fragment data, parity data, or both) for the one piece of object data.

The file server 105B creates a temporary directory "TMP" in the second file system space (the file system space to which the objects are restored). The temporary directory "TMP" is created, for example, as a child directory of the root directory "Root".

By referring to the backup management table 308, the file server 105B identifies, for each of the three objects, the position of the object in the second file system space and the site name of its backup destination.

The file server 105B then restores metadata before object data. Specifically, the file server 105B acquires, in parallel from the backup servers 107A and 107B, three replicated data 312 (metadata replicas) corresponding to the three objects, and temporarily restores each of the three replicated data in the temporary directory "TMP". Temporary restoration of metadata means creating a temporary object, that is, an object that has only the metadata and no object data. Three temporary objects are thus created in the temporary directory "TMP". Because a directory generally has no object data, the directory "HOME" is already temporarily restored once its metadata has been temporarily restored.

Next, the file server 105B acquires, in parallel from the backup servers 107A and 107B, a plurality of fragment data and a plurality of parity data corresponding to files X and Y. By decoding the acquired data, the file server 105B computes the two file data corresponding to files X and Y, and temporarily restores each of them. Temporary restoration of file data means storing the file data in the temporary object that corresponds to it. Files X and Y are thus each restored in the temporary directory "TMP".

Finally, the file server 105B moves each of the three objects from the temporary directory "TMP" to its proper position in the second file system space. Specifically, for example, the directory "HOME" is placed as a child directory of the root directory "Root" (moved to the position given by the path "/Root/HOME"). File X and file Y are then moved into the directory "HOME" (file X to the position given by the path "/Root/HOME/FileX", and file Y to the position given by the path "/Root/HOME/FileY"). After the three objects have been moved (that is, after the restore processing is complete), the file server 105B may delete the temporary directory "TMP".

With the temporary directory "TMP", a lower-level restore target object can be temporarily restored first, without waiting for the higher-level restore target object above it to be restored. This allows the restore processing to run faster.
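The role of the temporary directory can be sketched on a local file system as follows. This is a simplified illustration, not the patented implementation: the function name, the (path, kind, data) tuple layout, the "DIR"/"FILE" labels and the flat staging names are assumptions, and metadata handling and network transfer are left out. Objects are staged in whatever order they arrive and are moved into place parents first.

```python
import os
import shutil
import tempfile


def restore_via_tmp(root, objects):
    """Stage objects under root/TMP, then move them to their final paths.

    objects: list of (path, kind, data); kind is "DIR" or "FILE" (assumed labels).
    """
    tmp = os.path.join(root, "TMP")
    os.makedirs(tmp)
    staged = []
    for index, (path, kind, data) in enumerate(objects):
        staging = os.path.join(tmp, str(index))  # flat name, independent of parents
        if kind == "DIR":
            os.mkdir(staging)
        else:
            with open(staging, "wb") as f:
                f.write(data)
        staged.append((path, staging))
    # Move shallow paths first so every parent exists before its children.
    for path, staging in sorted(staged, key=lambda item: item[0].count("/")):
        shutil.move(staging, os.path.join(root, path.lstrip("/")))
    os.rmdir(tmp)  # the temporary directory may be deleted afterwards


# Usage: the child FileX arrives before its parent directory HOME.
with tempfile.TemporaryDirectory() as root:
    restore_via_tmp(root, [
        ("/HOME/FileX", "FILE", b"x"),
        ("/HOME", "DIR", None),
        ("/HOME/FileY", "FILE", b"y"),
    ])
    restored = sorted(os.listdir(os.path.join(root, "HOME")))
```

Because staging names do not depend on the final path, a file can be written to disk before the directory that will eventually contain it exists.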

FIG. 7 shows the flow of the backup processing.

Before a disaster, the file server 105A of the primary site 103 executes the backup processing, in which a program of the file server 105A and a program of the backup server 107 operate in cooperation. A file that the file system program 304 receives from the file access program 110 of the client 101 is first stored in the disk device 203. After a certain period has elapsed, the file data of that file is converted according to EC into two or more fragment data 309 and one or more parity data 311, and these data, together with the metadata (replicated data), are sent to two or more of the plurality of backup servers 107.

First, in the backup processing, the backup determination program 208 executes the backup method determination processing (step 701). Specifically, in the first embodiment, the backup determination program 208 determines the number of fragment data and the number of parity data for the object data of the backup target object according to the EC setting 410 of the backup setting table 213. The backup determination program 208 also determines the number of replicated data for the metadata of the backup target according to the replication setting 411 of the backup setting table 213. When the backup target is a directory, there is no object data, so only the replication setting 411 is used. In the rest of the description of FIG. 7, the backup target object is assumed to be a file.

Next, for the file data of the backup target file, the backup determination program 208 selects from the backup setting table 213 as many backup sites 106 as the sum of the number of fragment data and the number of parity data (step 702). The backup destinations (backup sites 106) may be determined in round-robin fashion based on the site ID numbers. The backup site 106 that will receive the replicated data may also be selected in step 702; it may be a backup site 106 (backup server 107) different from the backup destinations of the fragment data and parity data.
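A round-robin choice of destinations in step 702 might look like the sketch below. The function name and the `start` parameter are hypothetical; the description does not say how the starting position of the rotation is kept between backups.

```python
from itertools import cycle, islice


def select_sites(site_ids, p, q, start=0):
    """Pick p + q backup sites in round-robin order of site ID.

    `start` is the position in the sorted ID list where the rotation begins
    (an assumed parameter, not part of the described tables).
    """
    ordered = sorted(site_ids)
    return list(islice(cycle(ordered), start, start + p + q))
```

With three sites and (p, q) = (2, 1), each site receives exactly one piece of data. With only two sites, the rotation wraps around and one site receives two pieces, which matches the case where the number of data exceeds the number of backup servers.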

Next, the copy data creation program 210 encodes the file data of the backup target file according to EC (step 703). The copy data creation program 210 receives the number of fragment data and the number of parity data from the backup determination program 208, and generates that many fragment data 309 and that many parity data 311. When the backup target object is a directory, step 703 is skipped.
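Step 703 can be illustrated with the simplest erasure code, a single XOR parity block (q = 1). The description does not name a specific EC method, so the choice of XOR parity, the zero padding and the function names below are assumptions made for this sketch. A matching decoder is included to show how a lost fragment is rebuilt from the parity at restore time.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, p):
    """Split data into p equal-length fragments (zero-padded) plus one XOR parity."""
    size = -(-len(data) // p) or 1  # ceiling division, at least 1 byte per fragment
    padded = data.ljust(size * p, b"\x00")
    fragments = [padded[i * size:(i + 1) * size] for i in range(p)]
    parity = reduce(xor_bytes, fragments)
    return fragments, parity


def decode(fragments, parity, length):
    """Rebuild the data when at most one fragment is missing (passed as None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can repair only one lost fragment")
    if missing:
        present = [f for f in fragments if f is not None]
        fragments = list(fragments)
        # XOR of the parity with all surviving fragments yields the lost one.
        fragments[missing[0]] = reduce(xor_bytes, present, parity)
    return b"".join(fragments)[:length]
```

A production system would more likely use a code that tolerates several losses (q > 1); the single-parity case is used here only because it fits in a few lines.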

The backup data transmission/reception program 209 backs up (transmits) the generated fragment data 309 and parity data 311 to the respective backup servers 107 selected in step 702 (step 704). Specifically, the backup data transmission/reception program 209 receives from the backup determination program 208 a list of the backup servers 107 that are the backup destinations of the fragment data 309 and parity data 311, and transmits the data according to that list. When the backup target object is a directory, step 704 is skipped.

Next, the backup data transmission/reception program 209 transmits the metadata of the backup target file to the backup servers 107 (step 705). The backup data transmission/reception program 209 receives from the backup determination program 208 a list of the backup servers 107 that are the metadata destinations, and transmits the metadata according to that list.

Next, the backup data transmission/reception program 209 adds a record to the backup management table 308 (step 706) and registers the following in the added record (step 707).
- Path 501: the path of the backup target file.
- ID 502: the ID (serial number) of the backup target file.
- Type 503: "FILE".
- Meta 504: "Rep." (because replicated data of the metadata has been backed up).
- Data 505: "(p, q)" (p = number of fragment data, q = number of parity data).
- MetaBUP 506: the site name of the backup site 106 to which the replicated data (metadata replica) was transmitted.
- DataBUP 507: the site name of the backup site 106 to which the fragment data was transmitted.
- ParityBUP 508: the site name of the backup site 106 to which the parity data was transmitted.

FIG. 8 shows the flow of the restore processing.

The restore processing is executed by the file server 105B of the recovery site 108 after a disaster has occurred.

The backup data transmission/reception program 209 of the file server 105B connects to a backup server 107 listed in the backup setting table 213 and acquires the backup management table 308 (step 801). If the backup management table 308 is not stored on every backup server 107, the backup servers 107 may be accessed one after another until the backup management table 308 has been acquired. The backup setting table 213 may also list the URL of the backup server 107 that stores the backup management table 308, and the backup management table 308 may be acquired (referenced) using that URL.

Next, the copy data restoration program 211 selects one record 520 of the backup management table 308 (step 802) and identifies the backup method from Meta 504 and Data 505 of that record (step 803). This tells the copy data restoration program 211 how each restore target object is to be restored. The copy data restoration program 211 then identifies the backup server 107 to restore from (step 804). Here, for example, network connectivity may be checked with the ping command and the backup server 107 chosen based on the result. To acquire object data, the backup server 107 is identified based on DataBUP 507. If a backup server 107 listed in DataBUP 507 is down, the copy data restoration program 211 selects one alternative site (backup server 107) from ParityBUP 508.
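The fallback in step 804 can be sketched as below. The function name and the `is_alive` callable are hypothetical; `is_alive` stands in for whatever reachability check is used (for example, a wrapper around ping), and the policy of taking the first reachable parity site is an assumption.

```python
def choose_sources(data_sites, parity_sites, is_alive):
    """For each DataBUP site that is down, substitute one unused ParityBUP site."""
    spare = [site for site in parity_sites if is_alive(site)]
    chosen = []
    for site in data_sites:
        if is_alive(site):
            chosen.append(site)
        elif spare:
            chosen.append(spare.pop(0))  # each parity site is used at most once
        else:
            raise RuntimeError(f"no reachable substitute for {site}")
    return chosen
```

For instance, with data sites "A" and "B", parity site "C", and "B" unreachable, the sketch returns "A" and "C" as restore sources.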

Next, the copy data restoration program 211 restores the metadata from the selected backup server 107 through the backup data transmission/reception program 209 (step 805). As described above, to temporarily restore the metadata, the copy data restoration program 211 creates a temporary file in the temporary directory (step 806).

The copy data restoration program 211 then receives fragment data or parity data from the selected backup servers 107 through the backup data transmission/reception program 209 (step 807), decodes the received data according to EC, and temporarily restores the file in the temporary directory (step 808). The copy data restoration program 211 moves the temporarily restored file or directory to the position given by its original path, based on the path 501 of the backup management table 308 (step 809). This completes the restoration of one file or directory.

Next, the copy data restoration program 211 checks whether any file or directory remains unrestored (step 810). If one remains (step 810: N), processing resumes from step 802. If none remains (step 810: Y), the processing ends.

In the parallel restore processing described with reference to FIG. 6, the section processed in parallel runs from step 803 to step 809. This section can be parallelized because each file or directory can be processed individually. When it is parallelized, the steps in this section are executed by a plurality of threads or processes.
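One way to run that section on several records at once is a thread pool, sketched below. The function names and the `workers` default are assumptions; `restore_one` is a placeholder for a routine that performs steps 803 to 809 for a single record.

```python
from concurrent.futures import ThreadPoolExecutor


def restore_all(records, restore_one, workers=4):
    """Run restore_one (steps 803 to 809 for one record) on many records in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all workers and re-raises any exception they raised.
        return list(pool.map(restore_one, records))
```

Threads suit this workload because most of the time is spent waiting on network transfers; a process pool could be substituted if decoding dominates.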

As described above, in the first embodiment the backup method differs between metadata and object data. Specifically, metadata is backed up by replication, and object data is backed up in distributed form according to EC. Restoring object data requires that restoration of the metadata be complete. With this embodiment, the conditions for restoring the metadata are in place as soon as a backup site that stores the metadata has recovered. After that, the object data, which takes longer to restore than the metadata, is restored as the plurality of backup sites recover one after another. Recovery can therefore be achieved more efficiently and in a shorter time than with either method used alone.

The second embodiment is described below. The description focuses on the differences from the first embodiment; points in common with the first embodiment are omitted or simplified.

In the second embodiment, the backup method and the backup redundancy (at least one of the number of replicated data, the number of fragment data, and the number of parity data) differ depending on attributes of the backup target object, specifically the size of the data in the object and the path depth of the object (described later). Backup and restore suited to the attributes of each object can thus be expected.

For example, even when the backup target is file data, replication backup is used instead of EC if the size of the file data is below a threshold. Splitting small data according to EC and backing it up yields little benefit from parallelization at restore time. Using replication backup for file data smaller than the threshold makes it possible to omit the EC decoding (step 808), so a further speed-up of the restore processing can be expected.

FIG. 9 shows the configuration of the backup method table according to the second embodiment.

The backup method table 900 is stored in the storage unit (at least one of memory and a disk device) of the file server 105. It has records, each consisting of a pair of a parameter 901 and a value 902. In this embodiment, the parameters 901 are "metadata", "object data (large size)", "object data (small size)", and "size threshold". The values shown here are examples; the administrator sets appropriate values. The present invention is therefore not limited to this example.

The value 902 of each of the parameters 901 "metadata", "object data (large size)", and "object data (small size)" has the form (x, y, z), which expresses the relationship among a path depth (x), a backup method (y), and a redundancy (z).

The "path depth" of an object is the depth of the object in the file system space, in other words the number of links traversed from the root directory to the object. For example, for the path "/DIR1/DIR2/FileX", three links are traversed from the root directory to file X (the link from the root directory to the directory "DIR1", the link from "DIR1" to the directory "DIR2", and the link from "DIR2" to "FileX"), so the path depth is 3. In this embodiment, two path depth ranges are defined: 1 to 4 inclusive, and 5 or more. Accordingly, in FIG. 9, path depth "1" means a path depth of 1 to 4, and path depth "5" means a path depth of 5 or more. The path depth ranges are not limited to those of this embodiment.
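Under this definition, the path depth equals the number of non-empty components in a slash-separated path. A minimal sketch follows; the function names are hypothetical, and `depth_range_key` simply maps a depth to the "1" or "5" key used in FIG. 9.

```python
def path_depth(path):
    """Number of links traversed from the root directory to the object."""
    return len([part for part in path.split("/") if part])


def depth_range_key(depth):
    """Map a depth to the range key of FIG. 9: 1 for depths up to 4, 5 otherwise."""
    return 1 if depth <= 4 else 5
```

For the path "/DIR1/DIR2/FileX" the first function returns 3, which falls in the range keyed by 1.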

The "backup method" is either replication or EC. The backup method "Rep" means replication, and the backup method "EC" means EC.

The meaning of "redundancy" depends on whether the backup method is "Rep" or "EC". When the backup method is "Rep", the redundancy is the number of replicated data. When the backup method is "EC", the redundancy is the EC redundancy V, defined as (p + q) / p, where, as above, p is the number of fragment data generated and q is the number of parity data generated. For example, when one parity data is generated for two fragment data, the EC redundancy V is (2 + 1) / 2 = 1.5. The EC redundancy V is varied with path depth because objects with a small path depth (shallow objects) are considered more important. For example, other directories and files may be stored under the directory "DIR1" in addition to the directory "DIR2" and "FileX" mentioned above. These objects depend on the directory "DIR1": their restoration cannot be completed until restoration of "DIR1" is complete. It is therefore preferable to give relatively shallow objects a higher EC redundancy V than relatively deep objects.
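The definition V = (p + q) / p can be rearranged to q = p (V - 1), which gives the number of parity data needed to reach a target redundancy for a given number of fragments. The sketch below computes both directions with exact fractions; the function names are assumptions made for the example.

```python
from fractions import Fraction


def ec_redundancy(p, q):
    """EC redundancy V = (p + q) / p for p fragment data and q parity data."""
    return Fraction(p + q, p)


def parity_count(p, v):
    """Parity count q that yields redundancy v for p fragments: q = p * (v - 1)."""
    q = p * (Fraction(v) - 1)
    if q.denominator != 1:
        raise ValueError("no whole number of parity data gives this redundancy")
    return int(q)
```

For two fragments and one parity, `ec_redundancy(2, 1)` equals 3/2, matching the 1.5 in the text. Conversely, a target redundancy of 10 with two fragments requires 18 parity data.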

The parameter 901 "size threshold" is the threshold for the size of object data. In FIG. 9, the value 902 of "size threshold" is "1 MB". "Object data (large size)" is object data whose size is greater than or equal to the size threshold, and "object data (small size)" is object data whose size is less than the size threshold. In this embodiment, therefore, "object data (large size)" is object data of 1 MB or more, and "object data (small size)" is object data of less than 1 MB. There may be one size threshold or several. There may also be a size threshold for metadata.

FIG. 10 shows the flow of the backup method determination processing according to the second embodiment. The values shown in FIG. 10 follow the table of FIG. 9.

The backup method determination processing is executed by the backup determination program 208. In the first embodiment, this processing adopts replication backup for metadata and EC split backup (a backup method in which data is split according to EC and backed up) for object data.

In the second embodiment, instead of or in addition to the approach of the first embodiment, the backup method determination processing performs the following based on the backup method table 900.

The backup determination program 208 determines whether the data to be processed (the target data) in the backup target object is metadata or object data (step 1000).

When the target data is metadata, the backup determination program 208 determines the redundancy based on the path depth of the object that contains the metadata ("object A" in this paragraph) (step 1001). According to the table 900 of FIG. 9, when the path depth of object A is 1 to 4, the redundancy is set to 20, that is, 20 replicated data (metadata replicas) are generated (step 1002). When the path depth of object A is 5 or more, the redundancy is set to 10, that is, 10 replicated data are generated (step 1008).

When the target data is object data, the backup determination program 208 determines whether the size of the object data is greater than or equal to the size threshold (1 MB) (step 1003).

When the result of step 1003 is negative, the backup determination program 208 decides to back up five replicated data (step 1004). Five replicated data is one example of one or more replicated data. EC split backup is not used in this case because EC splitting yields little benefit when the data size is below the threshold.

When the result of step 1003 is affirmative, the backup determination program 208 determines the EC redundancy V based on the path depth of the object that contains the target data ("object B" in this paragraph) (step 1005). According to the table 900 of FIG. 9, when the path depth of object B is 1 to 4, the EC redundancy V is set to 10 (step 1006). When the path depth of object B is 5 or more, the EC redundancy V is set to 5 (step 1007).
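The decision flow of steps 1000 to 1008 can be condensed into one function, using the example values quoted from FIG. 9. This is a sketch only: the function name and signature are hypothetical, and interpreting "1 MB" as 1024 * 1024 bytes is an assumption.

```python
SIZE_THRESHOLD = 1024 * 1024  # "1 MB"; the exact byte count is an assumption


def decide_backup(is_metadata, depth, size=0):
    """Return (method, redundancy) following steps 1000 to 1008."""
    shallow = depth <= 4                        # ranges: up to 4, and 5 or more
    if is_metadata:                             # step 1000
        return ("Rep", 20 if shallow else 10)   # steps 1001, 1002, 1008
    if size < SIZE_THRESHOLD:                   # step 1003 negative
        return ("Rep", 5)                       # step 1004
    return ("EC", 10 if shallow else 5)         # steps 1005, 1006, 1007
```

For "Rep" the second element is the number of replicated data; for "EC" it is the EC redundancy V. In practice the constants would be read from the backup method table 900 rather than hard-coded.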

以上、実施例２によれば、バックアップ対象オブジェクトの属性に最適なバックアップ方式及び冗長度（複製データ数、断片データ数及びパリティデータ数のうちの少なくとも１つ）を決定することができる。 As described above, according to the second embodiment, it is possible to determine the backup method and the redundancy (at least one of the number of replicated data, the number of fragmented data, and the number of parity data) that are optimal for the attribute of the backup target object.

Note that the second embodiment can be combined with the first embodiment, as described above. For example, when the target data is object data, EC division backup is adopted, and the value of the EC redundancy V is determined according to the path depth of the object containing the target data.

In the second embodiment, either step 1003 or step 1005 may be omitted from the backup method determination processing shown in FIG. 10.

Hereinafter, a third embodiment will be described. The description focuses on the differences from the first and second embodiments; points in common with the first and second embodiments are omitted or simplified.

In the third embodiment, the backup method and the backup redundancy (at least one of the number of duplicate data, the number of fragment data, and the number of parity data) are determined based on the recovery rate of the backup site after a disaster (for example, a large-scale disaster) occurs. This can be expected to yield backup and restore suited to the region in which the backup/restore system according to the third embodiment is operated.

FIG. 11 shows the configuration of a backup method table according to the third embodiment.

The backup method table 1100 is a table prepared instead of or in addition to the table 900 shown in FIG. 9, and is stored in the storage unit (at least one of the memory and the disk device) of the file server 105.

The backup method table 1100 represents the relationship between the time elapsed since a disaster occurred and the recovery rate. Specifically, for example, the one-day recovery rate 1101 in the backup method table 1100 represents the recovery rate on the first day after the disaster. The three-day recovery rate 1102 represents the recovery rate on the third day after the disaster. The recovery rate is the ratio of the number of recovered (operable) backup servers 107 to the total number of backup servers. For example, when 100 backup servers 107 are installed, a recovery rate of "30%" means that 30 backup servers 107 are operable. In general, the recovery rate rises as time passes after the disaster.
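The recovery rate defined above is a simple ratio, and the following minimal sketch works through the 100-server example. The function name `recovery_rate` and its argument checks are assumptions made for illustration; the description does not specify how the rate is computed or obtained.

```python
def recovery_rate(operable_servers: int, total_servers: int) -> float:
    """Return the recovery rate in percent: operable backup servers / all backup servers."""
    if total_servers <= 0:
        raise ValueError("total_servers must be positive")
    if not 0 <= operable_servers <= total_servers:
        raise ValueError("operable_servers must be between 0 and total_servers")
    return 100.0 * operable_servers / total_servers


# Worked example from the description: 30 of 100 backup servers are operable.
one_day_rate = recovery_rate(30, 100)  # 30.0 (percent)
```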

FIG. 12 shows the flow of the backup method determination processing according to the third embodiment.

First, the backup determination program 208 determines whether the value of the one-day recovery rate 1101 is equal to or less than the threshold for the one-day recovery rate (for example, 20%) (step 1201).

If the determination in step 1201 is affirmative, the backup site is defined as being located in a region where recovery is relatively slow. Therefore, for each backup target object, replication backup is adopted for both the metadata and the object data. Specifically, for example, the backup determination program 208 decides to generate 10 pieces of duplicate data for each of the metadata and the object data (step 1205). Ten pieces of duplicate data are one example of "one or more pieces of duplicate data." The more duplicate data there are, the more the probability of a successful restore rises as even a few more backup servers 107 recover.

If the determination in step 1201 is negative, the backup determination program 208 determines whether the value of the three-day recovery rate 1102 is equal to or less than the threshold for the three-day recovery rate (for example, 50%) (step 1202).

If the determination in step 1202 is affirmative, the recovery rate is somewhat poor, so the backup determination program 208 decides on replication backup (generating 10 pieces of duplicate data) for the metadata, and on EC division backup with EC redundancy V = 3 (p (number of fragment data) = 2, q (number of parity data) = 4) for the object data (step 1204).

If the determination in step 1202 is negative, the backup site is defined as being located in a region where recovery is relatively fast. The backup determination program 208 decides on replication backup (generating five pieces of duplicate data) for the metadata, and on EC division backup with EC redundancy V = 2 (p = 4, q = 4) for the object data (step 1203).
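Steps 1201 to 1205 can be sketched as a two-stage threshold check. The sketch below is illustrative only: `EcParams`, `SitePolicy`, and `decide_backup_example3` are hypothetical names, the thresholds are the example values quoted above (20% and 50%), and the redundancy is computed as V = (p + q) / p, the relation stated in the claims, which reproduces V = 3 for (p, q) = (2, 4) and V = 2 for (p, q) = (4, 4).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EcParams:
    """Hypothetical holder for EC parameters: p fragment data, q parity data."""
    p: int
    q: int

    @property
    def redundancy(self) -> float:
        # V = (p + q) / p, the relation given in the claims.
        return (self.p + self.q) / self.p


@dataclass(frozen=True)
class SitePolicy:
    """Hypothetical result type for the third embodiment's decision."""
    metadata_replicas: int
    object_replicas: Optional[int]  # set when object data is replicated
    object_ec: Optional[EcParams]   # set when object data uses EC division backup


def decide_backup_example3(one_day_rate: float, three_day_rate: float,
                           one_day_threshold: float = 20.0,
                           three_day_threshold: float = 50.0) -> SitePolicy:
    """Choose method and redundancy from the recovery rates (in percent)."""
    if one_day_rate <= one_day_threshold:
        # Step 1205: slow-recovery region, replicate everything 10 times.
        return SitePolicy(10, 10, None)
    if three_day_rate <= three_day_threshold:
        # Step 1204: 10 metadata duplicates, EC with V = 3 for object data.
        return SitePolicy(10, None, EcParams(p=2, q=4))
    # Step 1203: fast-recovery region, 5 metadata duplicates, EC with V = 2.
    return SitePolicy(5, None, EcParams(p=4, q=4))
```

With a one-day rate of 30% and a three-day rate of 45%, for instance, the sketch selects step 1204: 10 metadata duplicates and EC division backup with p = 2 and q = 4 for object data.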

The backup method and backup redundancy determined in steps 1203 to 1205 are registered in the backup management table 308 by the backup determination program 208. Restore processing is performed based on that backup method and backup redundancy.

As described above, according to the third embodiment, changing the backup method and the backup redundancy in anticipation of the recovery status of the backup site as a whole (the relationship between the time elapsed since the disaster and the recovery rate) can be expected to make backup and restore more efficient, for example by reducing backup capacity or improving restore performance.

Note that the third embodiment can be combined with at least one of the first and second embodiments, as described above. For example, when the value of the one-day recovery rate 1101 is 20% or less and the size of the target data is less than 1 MB, replication backup may be adopted. Also, for example, even when the value of the one-day recovery rate 1101 is 20% or less, EC division backup may be adopted if the size of the target data is 1 MB or more.

Although several embodiments have been described above, they are examples for explaining the present invention and are not intended to limit the scope of the present invention to these embodiments alone. The present invention can also be carried out in various other forms.

101: client, 103: primary site, 105: file server, 106: backup site, 107: backup server, 108: recovery site

Claims

Multiple backup servers,
A first object server for backing up one or more objects, each of which is a file or directory, to at least one of the plurality of backup servers;
A second object server that restores the backed up one or more objects from at least one of the plurality of backup servers;
The file in the one or more objects comprises file data, which is object data, and metadata about the object data, and the directory in the one or more objects comprises metadata about the object,
The processor of the first object server, for each of the object data and metadata in the one or more objects,
(B1) a function of determining which of a first condition and a second condition is satisfied by target data, the target data being that data and being object data or metadata;
(B2) as a result of (B1), when the target data satisfies the first condition, a function of backing up duplicate data of the target data to two or more backup servers;
(B3) as a result of (B1), when the target data satisfies the second condition, a function of backing up, to two or more backup servers, two or more pieces of data that are fragment data of redundancy V according to EC (Erasure Coding) of the target data (V is a value based on (p + q) / p, p is the number of fragment data, and q is the number of parity data),
The processor of the first object server has a function of registering, for each of the one or more objects, information on the position of the object in a first file system space in the first object server and on the backup destination of the object, in management information stored in a storage device in at least one of the plurality of backup servers,
The processor of the second object server,
(R1) for each of the one or more objects, a function of specifying the position of the object in the second file system space of the second object server and the backup destination of the object by referring to the management information;
(R2) As a function for each of the one or more objects,
(R21) a function of receiving, from at least one of the plurality of backup servers, duplicate data of the metadata or fragment data of the metadata as metadata of the target object which is the object;
(R22) a function of receiving, when the target object is a file, duplicate data of the object data or fragment data of the object data from at least one of the plurality of backup servers as object data of the target object; and
(R23) a function of generating the target object based on the received metadata when the target object is a directory, or based on the received metadata and the received object data when the target object is a file, and arranging the target object at the position of the target object in the second file system space,
The first condition is that the target data is metadata, and the second condition is that the target data is object data, or
The first condition is that the target data is metadata, or that the target data is object data and the data size of the object data is smaller than a threshold, and the second condition is that the target data is object data and the data size of the object data is equal to or larger than the threshold.
Backup and restore system.

The processor of the second object server is configured to execute a process according to (R21) prior to a process according to (R22) for at least one of the one or more objects.
The backup / restore system according to claim 1.

The processor of the second object server,
Before starting the processing by (R2), a temporary directory is provided in the second file system space,
The processing according to (R21) and the processing according to (R22) are executed in parallel for the one or more objects,
By the processing in (R23), a directory in the one or more objects is generated in the temporary directory based on the metadata acquired in the processing of (R21), a file in the one or more objects is generated in the temporary directory based on the metadata acquired in the processing of (R21) and the object data acquired in the processing of (R22), and the directory and the file are moved from the temporary directory to the corresponding positions in the second file system space,
The backup/restore system according to claim 1.

When the target data satisfies the second condition, the value of the redundancy V is a value according to a path depth of an object including the target data;
The path depth of the object including the target data is the number of links that pass from a root directory of the first file system space to the object.
The backup / restore system according to any one of claims 1 to 3.

A backup/restore method for restoring one or more objects to an object server from a plurality of backup servers to which the one or more objects have been backed up, each of the object data and metadata in the one or more objects, each of which is a file or a directory, having been backed up according to a backup method selected from: replication backup, in which duplicate data is backed up to two or more backup servers of the plurality of backup servers; and distributed backup, in which two or more pieces of data that are fragment data of redundancy V according to EC (Erasure Coding) (V is a value based on (p + q) / p, p is the number of fragment data, and q is the number of parity data) are backed up to two or more backup servers,
The file in the one or more objects comprises file data, which is object data, and metadata about the object data, and the directory in the one or more objects comprises metadata about the object,
The object server is:
(R1) For each of the one or more objects, specify a position of the object in the file system space of the object server and a backup destination of the object;
(R2) for each of the one or more objects:
(R21) receiving, from at least one of the plurality of backup servers, duplicate data of the metadata or fragment data of the metadata as metadata of the target object which is the object;
(R22) receiving, as object data of the target object, duplicate data of the object data or fragment data of the object data from at least one of the plurality of backup servers;
(R23) generating the target object based on the received metadata when the target object is a directory, or based on the received metadata and the received object data when the target object is a file, and placing the target object at the position of the target object in the file system space,
The backup source object server:
For each of the object data and metadata in the one or more objects,
(B1) determining whether the target data, which is the data and is object data or metadata, satisfies the first condition or the second condition;
(B2) As a result of (B1), if the target data satisfies the first condition, backup the duplicate data of the target data to two or more backup servers;
(B3) As a result of (B1), when the target data satisfies the second condition, two or more data that are fragment data of the redundancy V according to the EC of the target data are backed up to two or more backup servers,
The first condition is that the target data is metadata, and the second condition is that the target data is object data, or
The first condition is that the target data is metadata, or that the target data is object data and the data size of the object data is smaller than a threshold, and the second condition is that the target data is object data and the data size of the object data is equal to or larger than the threshold.
Backup/restore method.

An object server for backing up one or more backup target objects, each of which is a file or a directory, to at least one of a plurality of backup servers,
An interface connected to the plurality of backup servers;
A processor connected to the interface,
The file in the one or more backup target objects comprises file data, which is object data, and metadata about the object data, and the directory in the one or more backup target objects comprises metadata about the object,
The processor, for each of the object data and metadata in one or more objects to be backed up, each being a file or a directory,
(B1) a function of determining which of a first condition and a second condition is satisfied by target data, the target data being that data and being object data or metadata;
(B2) as a result of (B1), when the target data satisfies the first condition, a function of backing up duplicate data of the target data to two or more backup servers;
(B3) as a result of (B1), when the target data satisfies the second condition, a function of backing up, to two or more backup servers, two or more pieces of data that are fragment data of redundancy V according to EC (Erasure Coding) of the target data (V is a value based on (p + q) / p, p is the number of fragment data, and q is the number of parity data),
The processor further has a function of registering, for each of the one or more backup target objects, information on the position of the backup target object in a first file system space and on the backup destination of the backup target object, in management information stored in a storage device in at least one of the plurality of backup servers,
The first file system space is a file system space including the one or more backup target objects,
The processor comprises:
(R1) For each of one or more objects to be restored each of which is a file or a directory, the location of the object to be restored in the second file system space and the backup destination of the object are referred to by referring to the management information. Has a function to specify,
The second file system space is a file system space including the one or more objects to be restored,
(R2) As a function for each of the one or more objects to be restored,
(R21) a function of receiving, as metadata of the target object being the restore target object, duplicate data of the metadata or fragment data of the metadata from at least one of the plurality of backup servers;
(R22) a function of receiving, when the target object is a file, duplicate data of the object data or fragment data of the object data from at least one of the plurality of backup servers as object data of the target object; and
(R23) a function of generating the target object based on the received metadata when the target object is a directory, or based on the received metadata and the received object data when the target object is a file, and arranging the target object at the position of the target object in the second file system space,
The first condition is that the target data is metadata, and the second condition is that the target data is object data, or
The first condition is that the target data is metadata, or that the target data is object data and the data size of the object data is smaller than a threshold, and the second condition is that the target data is object data and the data size of the object data is equal to or larger than the threshold.
Object server.

The processor is configured to execute a process of (R21) prior to a process of (R22) for at least one of the one or more restore target objects.
The object server according to claim 6.

The processor comprises:
Before starting the processing by (R2), a temporary directory is provided in the second file system space,
The processing according to (R21) and the processing according to (R22) are executed in parallel with respect to the one or more objects to be restored,
By the processing in (R23), a directory in the one or more objects to be restored is generated in the temporary directory based on the metadata acquired in the processing of (R21), a file in the one or more objects to be restored is generated in the temporary directory based on the metadata acquired in the processing of (R21) and the object data acquired in the processing of (R22), and the target object is moved from the temporary directory to the corresponding position in the second file system space,
The object server according to claim 6 or 7.

When the target data satisfies the second condition, the value of the redundancy V is a value according to the path depth of the backup target object including the target data;
The path depth of the backup target object including the target data is the number of links from the root directory of the first file system space to the backup target object.
The object server according to any one of claims 6 to 8.