CN101692239A

CN101692239A - Method for distributing metadata of distributed type file system

Info

Publication number: CN101692239A
Application number: CN200910153371A
Authority: CN
Inventors: 尹建伟; 张聪萍; 吴朝晖; 邓水光; 李莹; 吴健
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2009-10-19
Filing date: 2009-10-19
Publication date: 2010-04-07
Anticipated expiration: 2029-10-19
Also published as: CN101692239B

Abstract

The invention discloses a metadata allocation method for a distributed file system. The method takes a file system directory as the basic unit of hashing and uses a scalable hashing method to distribute metadata across the metadata servers, so that clients can compute the location of any metadata directly. Metadata access is efficient, the distribution is even, and the load is balanced; the method also preserves the storage locality of directories, which makes prefetching easy to implement and improves access efficiency. Each directory is given a unique, unchanging identifier, which avoids the metadata migration caused by hashing path names and improves system performance. A metadata server joining or leaving does not cause large-scale redistribution of metadata, which minimizes metadata migration and ensures high scalability.

Description

A Distributed File System Metadata Allocation Method

Technical Field

The present invention relates to the field of distributed file systems and, more specifically, to a metadata allocation method for distributed file systems that use a cluster of metadata servers (Metadata Server, MDS).

Background

The emergence of object storage devices (Object Storage Device, OSD) provides a basis for building large-scale distributed file systems at the PB (10^15 bytes) level. To manage a large-scale OSD-based distributed file system more effectively, metadata access is often separated from data access: an independent metadata server cluster manages all metadata of the file system, while a cluster of OSDs is dedicated to managing and storing file data. Figure 1 shows the structure of a distributed file system of this kind.

Metadata is data that describes data; file system metadata includes the basic attribute information of files and the directory structure information. Although metadata accounts for only a small part of the total system size, metadata operations are very frequent, making up more than 50% of all file system operations, so metadata management is very important. A good metadata allocation method lets the system achieve high performance and high scalability. Existing methods mainly include static subtree partitioning (Static Subtree Partition), dynamic subtree partitioning (Dynamic Subtree Partition), and hashing (Hash).

The static subtree partitioning method divides the entire file system into several directory subtrees and assigns the subtrees to different MDSs. Its advantages are that metadata allocation is simple and that the storage locality of directories is easy to exploit. Its disadvantages are poor scalability and an inability to balance load effectively, which easily leads to hot-spot problems.

The dynamic subtree partitioning method builds on static subtree partitioning and balances load by decomposing subtrees dynamically: when a subtree grows too large or is accessed frequently, it is split into two or more subtrees, and the resulting subtrees are migrated to other MDSs. However, this method does not solve the scalability problem, and it also involves a large amount of metadata migration.

The hashing method distributes metadata according to the hash value of some unique file identifier (for example, the inode number or the path name). Its advantage is that the location of metadata can be computed directly, and metadata is spread evenly across the MDSs, which gives good load balance. However, it loses the storage locality of file system directories, which makes path traversal expensive. Moreover, when a file (or directory) is renamed, or when metadata servers are added or removed, the hash value changes accordingly, which leads to a large amount of metadata migration and a sharp drop in system performance.

Because existing metadata allocation methods cannot meet the high-performance and high-scalability requirements that distributed file systems place on metadata server clusters, we propose a directory-based scalable hashing method for allocating metadata. The method inherits the original advantages of hashing, namely load balance and fast lookup, and in addition preserves directory locality, allows directory names to be changed without causing data migration, and causes only a small amount of metadata migration when metadata servers are added or removed, which greatly improves scalability.

Summary of the Invention

The purpose of the present invention is to provide a new, efficient, and scalable metadata allocation method, which takes a file system directory as the basic unit of hashing and uses a scalable hashing method to distribute metadata to the individual metadata servers.

Using the directory as the basic unit of hashing preserves the storage locality of directories, makes prefetching convenient, and improves system efficiency. When a directory is created, it is assigned a unique identifier that never changes, and metadata is distributed to the metadata servers according to the hash value of that identifier. This avoids the problem encountered when hashing path names: modifying a path name changes the hash value and therefore causes metadata migration.
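The effect of hashing a fixed identifier instead of a path name can be illustrated with a minimal Python sketch. The `h32` helper (the first 4 bytes of SHA-1), the `Directory` class, and the `dhu-N` identifier format are assumptions made for illustration only; the text does not specify a hash function or an identifier format.

```python
import hashlib
import itertools


def h32(key: str) -> int:
    """Hash a string to an integer in [0, 2**32) (assumed hash: truncated SHA-1)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


class Directory:
    """Hypothetical directory record whose identifier is fixed at creation time."""

    _counter = itertools.count(1)  # illustrative identifier allocator

    def __init__(self, path: str):
        self.path = path
        self.dir_id = f"dhu-{next(Directory._counter)}"  # never changes afterwards

    def rename(self, new_path: str) -> None:
        # Only the path changes; the identifier, and hence its hash, is untouched.
        self.path = new_path


d2 = Directory("/d1/d2")
placement_key_before = h32(d2.dir_id)
d2.rename("/d1/projects")
placement_key_after = h32(d2.dir_id)
```

Because placement depends only on `dir_id`, the rename leaves the placement key unchanged, so no metadata would need to move.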

An ordinary hashing algorithm that takes the hash modulo the number of nodes in the metadata server cluster cannot meet the scalability requirements of a distributed file system. When a metadata server joins or exits (or fails), the hash values change, which causes large-scale metadata migration and very high network cost. To solve this problem, we adopt a scalable hashing method. First, we apply the same hash function to both the directory identifiers and the metadata server identifiers [Addr = hash(key), where the key value is a relatively large constant]; then we allocate metadata according to the resulting hash values. Because the key value used is independent of the number of servers in the metadata server cluster, a metadata server joining or exiting does not cause large-scale reallocation of metadata and affects the allocation of only a small amount of metadata. The specific implementation is shown in Figure 2.
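The cost of the modulo approach can be seen in a small Python sketch. The hash function (truncated SHA-1) and the `dhu-N` identifiers are assumptions for illustration. For uniformly distributed hash values h, a directory keeps its server when the cluster grows from 4 to 5 nodes only if h mod 4 equals h mod 5, which holds for 4 of every 20 residues, so roughly 80% of directories would be expected to move.

```python
import hashlib


def h32(key: str) -> int:
    """Hash a string to an integer in [0, 2**32) (assumed hash: truncated SHA-1)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


def modulo_placement(dir_ids, num_servers):
    """Ordinary hashing: server index = hash(identifier) mod number of servers."""
    return {d: h32(d) % num_servers for d in dir_ids}


dir_ids = [f"dhu-{i}" for i in range(1000)]  # hypothetical directory identifiers
before = modulo_placement(dir_ids, 4)
after = modulo_placement(dir_ids, 5)         # one server joins the cluster
moved = sum(1 for d in dir_ids if before[d] != after[d])
moved_fraction = moved / len(dir_ids)
```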

In summary, allocating metadata with the directory-based scalable hashing method has the following advantages:

(1) Clients can locate metadata directly, so metadata access is efficient.

(2) Metadata is evenly distributed and the load is balanced.

(3) Using the directory as the basic unit of hashing preserves the storage locality of directories, makes prefetching easy to implement, and improves access efficiency.

(4) Assigning each directory a unique, unchanging identifier avoids the metadata migration caused by hashing path names and improves system performance.

(5) A metadata server joining or exiting does not cause large-scale redistribution of metadata; metadata migration is minimized and scalability is high.

Brief Description of the Drawings

Figure 1 shows the structure of the distributed file system.

Figure 2 illustrates the directory-based decomposition of the file system.

Figure 3 shows metadata allocation under the scalable hashing method.

Figure 4 shows the reallocation of metadata when a metadata server exits.

Figure 5 shows the reallocation of metadata when a metadata server joins.

Detailed Description

To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and specific embodiments.

Figure 1 shows the structure of the distributed file system. The system consists of three parts: clients, the metadata server cluster, and the OSD cluster. The metadata server cluster is mainly responsible for managing metadata and presenting a unified namespace to clients, and the OSD cluster is mainly responsible for storing file data and metadata. When a client operates on a file, it first uses the metadata allocation rule to locate the metadata server responsible for the file and sends a request to that server. After receiving the request, the metadata server performs the corresponding operation on the file's metadata (for example, create, query, modify, or delete) and returns the file's identifier to the client. The client then communicates with the OSD cluster, using the file identifier as the input parameter, to perform the actual file I/O.

The key processes involved in the present invention are described in detail below.

(1) Decomposition of the file system

The file system namespace is organized as a directory tree. We decompose it by directory: the first-level files and subdirectories contained in a directory form one basic unit of hashing (DHU: Directory-based Hash Unit), and each DHU corresponds to one directory. The metadata of the files and subdirectories in the same DHU is allocated to the same metadata server, while the metadata at the second level and below in the directory tree is allocated to other metadata servers. In other words, each DHU contains the metadata of its subdirectories but not the contents under those subdirectories. As shown in Figure 2, there are three levels under directory d1, and the files and directories at the first level all belong to DHU1. The directory tree has three directories, d1, d2, and d3, which correspond to three DHUs: DHU1, DHU2, and DHU3.
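The decomposition can be sketched in a few lines of Python: every file or subdirectory is grouped under the DHU of its immediate parent directory. The nesting d1/d2/d3 follows the description, while the file names `f1`, `f2`, `f3` and the use of the parent path as the DHU key are assumptions for illustration (the method itself keys DHUs by an unchanging DHU-ID, not by path).

```python
import posixpath
from collections import defaultdict


def decompose(entries):
    """Group each file or subdirectory under the DHU of its parent directory."""
    dhus = defaultdict(list)
    for path in entries:
        clean = path.rstrip("/")
        parent = posixpath.dirname(clean)
        dhus[parent].append(posixpath.basename(clean))
    return dict(dhus)


# Hypothetical contents of the tree rooted at /d1.
entries = ["/d1/f1", "/d1/d2", "/d1/d2/f2", "/d1/d2/d3", "/d1/d2/d3/f3"]
dhus = decompose(entries)
# dhus["/d1"] plays the role of DHU1, dhus["/d1/d2"] of DHU2, dhus["/d1/d2/d3"] of DHU3.
```

Note that the entry for `d2` lives in the DHU of `/d1`, while the contents of `d2` live in a separate DHU that may be placed on a different server.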

(2) Allocation of metadata

Each node in the metadata server cluster has an ID that uniquely identifies it (for example, its IP address), and each DHU has a unique, unchanging identifier that we assign, the DHU-ID.

First, we use the same hash function to hash the node IDs and the DHU-IDs onto a circle covering 0 to 2³² (the size of the range depends on the scale of the file system, and different metadata server nodes should not hash to the same virtual position). Then, starting from the position on the circle to which the hash value of a DHU-ID maps, we search clockwise and allocate the metadata contained in the DHU to the first server found. If no server is found before 2³² is passed, the metadata is allocated to the first metadata server. As shown in Figure 3, searching clockwise from DHU 1 finds MDS 1, so its metadata is allocated to MDS 1; searching clockwise from DHU 2 finds MDS 4, so its metadata is allocated to MDS 4; and so on.
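The clockwise search can be sketched with a sorted list and a binary search. This is a minimal illustration under stated assumptions, not the patented implementation: the hash function (truncated SHA-1), the `HashRing` class, and the example IP addresses are all hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 2**32


def h32(key: str) -> int:
    """Map a node ID or DHU-ID to a position on the circle [0, 2**32)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class HashRing:
    def __init__(self, server_ids):
        # Server positions on the circle, sorted so that "clockwise" means
        # "next larger position".
        self._points = sorted((h32(s), s) for s in server_ids)
        self._positions = [pos for pos, _ in self._points]

    def locate(self, dhu_id: str) -> str:
        """Return the first server at or after the DHU's position."""
        i = bisect.bisect_left(self._positions, h32(dhu_id))
        if i == len(self._positions):
            # Passed the top of the range without finding a server:
            # wrap around to the first server on the circle.
            i = 0
        return self._points[i][1]


ring = HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
owner = ring.locate("dhu-2")
```

Any client holding the list of server IDs can run `locate` locally, which is what allows metadata to be found without a central lookup.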

In addition, when metadata servers are hashed with an ordinary hash algorithm, their positions on the circle are not evenly distributed. We therefore adopt the idea of virtual metadata servers and place multiple virtual metadata servers on the circle for each physical metadata server. This suppresses the uneven distribution, gives better load balance, and minimizes the metadata reallocation needed when metadata servers are added or removed.
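Virtual metadata servers can be sketched by hashing each physical server several times under derived names. The `server#k` naming scheme, the replica count of 100, and the hash function are assumptions chosen for illustration; the text does not prescribe them.

```python
import bisect
import hashlib
from collections import Counter


def h32(key: str) -> int:
    """Map a string to a position on the circle [0, 2**32) (truncated SHA-1)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


class VirtualNodeRing:
    def __init__(self, server_ids, replicas=100):
        # Each physical server contributes `replicas` virtual positions.
        self._points = sorted(
            (h32(f"{s}#{k}"), s) for s in server_ids for k in range(replicas)
        )
        self._positions = [pos for pos, _ in self._points]

    def locate(self, dhu_id: str) -> str:
        """Return the physical server behind the first virtual node clockwise."""
        i = bisect.bisect_left(self._positions, h32(dhu_id))
        return self._points[i % len(self._points)][1]  # modulo handles wrap-around


servers = ["MDS1", "MDS2", "MDS3", "MDS4"]
ring = VirtualNodeRing(servers, replicas=100)
# Count how many of 10000 hypothetical DHUs land on each physical server.
load = Counter(ring.locate(f"dhu-{i}") for i in range(10000))
```

Inspecting `load` for different values of `replicas` gives a feel for how the number of virtual servers affects the evenness of the distribution.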

(3) Access to metadata

Access to metadata mainly consists of creating, deleting, moving, renaming, and querying files or directories, and listing (reading) directories. The following descriptions refer to Figures 2 and 3.

Creating a file or directory

Scenario: a user creates the file text under a directory (for example, /d1/d2/).

Step 1: the server that receives the request parses the directory /d1/d2/, obtains the ID of DHU2, which corresponds to d2, and computes the hash value of that DHU-ID.

Step 2: using the hash value of the DHU-ID, it finds MDS4, which manages the metadata under DHU2, and sends MDS4 a request to create the file text.

Step 3: after receiving the request, MDS4 adds a metadata entry for the file text under directory d2 and returns the identifier of the file text to the user.

Step 4: MDS4 sends a request to update the metadata of directory d2 to MDS1, the metadata server that corresponds to d1, the parent directory of d2.

Creating a directory follows a similar process to creating a file.
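The four creation steps can be traced in a toy in-memory model. Everything here is a sketch under stated assumptions: the `Cluster` class, the `dhu_of` path table (which stands in for path parsing), the `obj-N`/`dhu-N` identifiers, and the `updates` counter (which stands in for whatever directory metadata is refreshed in step 4) are hypothetical, and direct dictionary access replaces network requests.

```python
import bisect
import hashlib
import itertools


def h32(key: str) -> int:
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


class Cluster:
    """Toy model of an MDS cluster; all names are illustrative."""

    def __init__(self, server_ids):
        self._points = sorted((h32(s), s) for s in server_ids)
        self._positions = [p for p, _ in self._points]
        self.store = {s: {} for s in server_ids}  # server -> {DHU-ID: {name: entry}}
        self.dhu_of = {"/": "dhu-0"}              # path -> DHU-ID
        self._ids = itertools.count(1)

    def locate(self, dhu_id):
        i = bisect.bisect_left(self._positions, h32(dhu_id))
        return self._points[i % len(self._points)][1]

    def create(self, parent, name, is_dir=False):
        dhu_id = self.dhu_of[parent]              # step 1: DHU-ID of the parent directory
        mds = self.locate(dhu_id)                 # step 2: MDS responsible for that DHU
        entry = {"id": f"obj-{next(self._ids)}", "updates": 0}
        self.store[mds].setdefault(dhu_id, {})[name] = entry  # step 3: add the metadata
        if is_dir:
            # A new directory gets its own unchanging DHU-ID.
            self.dhu_of[parent.rstrip("/") + "/" + name] = f"dhu-{next(self._ids)}"
        if parent != "/":
            # Step 4: the parent's own entry lives in the grandparent's DHU,
            # possibly on another MDS; record that the parent was updated.
            grand, _, parent_name = parent.rstrip("/").rpartition("/")
            grand_dhu = self.dhu_of[grand or "/"]
            self.store[self.locate(grand_dhu)][grand_dhu][parent_name]["updates"] += 1
        return entry["id"]


cluster = Cluster(["MDS1", "MDS2", "MDS3", "MDS4"])
cluster.create("/", "d1", is_dir=True)
cluster.create("/d1", "d2", is_dir=True)
file_id = cluster.create("/d1/d2", "text")
```

In this model the entry for `text` ends up in the DHU of `/d1/d2`, and the entry for `d2` in the DHU of `/d1` records one update, mirroring steps 3 and 4.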

Deleting a file or directory

Deleting a file is relatively simple. We use the DHU-ID of its parent directory to find the corresponding MDS; that MDS deletes the file's metadata and returns the file's identifier to the user, and at the same time sends an update request to the MDS that holds the parent directory's metadata.

Deleting a directory is slightly more complicated, because the directory may contain several levels of nested subdirectories, and we must make sure that all of its subdirectories have been deleted before it is deleted. For example, to delete the directory /d1:

Step 1: use the DHU-ID of DHU1, which corresponds to directory d1, to find MDS1, and send a delete request to MDS1.

Step 2: MDS1 finds all subdirectories of d1 (in Figure 1 there is only directory d2), uses the DHU-ID of DHU2, which corresponds to directory d2, to find MDS4, and sends MDS4 a request to delete directory d2.

Step 3: in the same way as in steps 1 and 2, MDS4 sends a request to delete directory d3 to the MDS responsible for d3. This continues until a directory with no subdirectories is reached; all files under that directory are deleted, and when the deletion is complete, feedback is sent to MDS4.

Step 4: after receiving the feedback, MDS4 completes the deletion of the contents of d2 and then likewise sends feedback to MDS1.

Finally, MDS1 can complete the deletion of directory d1.
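The delete cascade is naturally recursive, and can be sketched as follows. Recursion stands in for the request and feedback messages between servers; the `dhus` and `owner_of` tables, the file names, and the `MDS-d3` placeholder (the text does not say which MDS manages d3) are assumptions for illustration.

```python
def delete_directory(dhu_id, dhus, owner_of, log):
    """Delete a directory's DHU after all nested subdirectories are deleted.

    dhus:     {DHU-ID: {entry name: child DHU-ID, or None for a file}}
    owner_of: {DHU-ID: name of the MDS that manages it}
    log:      list collecting (MDS, event, DHU-ID) tuples in order of occurrence
    """
    mds = owner_of[dhu_id]
    log.append((mds, "delete requested", dhu_id))
    for name, child in list(dhus[dhu_id].items()):
        if child is not None:
            # Subdirectory: forward the request to its MDS and wait for feedback.
            delete_directory(child, dhus, owner_of, log)
        del dhus[dhu_id][name]  # remove the file or the now-empty subdirectory entry
    del dhus[dhu_id]
    log.append((mds, "deleted", dhu_id))  # feedback to the requester


# d1 contains d2 and d2 contains d3, as in the example; file names are made up.
dhus = {
    "DHU1": {"f1": None, "d2": "DHU2"},
    "DHU2": {"f2": None, "d3": "DHU3"},
    "DHU3": {"f3": None},
}
owner_of = {"DHU1": "MDS1", "DHU2": "MDS4", "DHU3": "MDS-d3"}  # DHU3's owner is a placeholder
log = []
delete_directory("DHU1", dhus, owner_of, log)
completion_order = [dhu for _, event, dhu in log if event == "deleted"]
```

The deepest directory finishes first and the requested directory last, which matches the order of feedback described in steps 3 and 4.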

Moving a file or directory

Moving a file is equivalent to first performing a delete operation on the MDS that holds the source parent directory and then performing a create operation on the MDS that holds the target parent directory.

Moving a directory does not require changing the MDS on which the directory's contents reside (that is, no data migration occurs); it only requires sending update-directory-entry requests to the MDSs that hold the source parent directory and the target parent directory.
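A directory move can be sketched as two directory-entry updates that leave the moved directory's own DHU untouched. The `dhus` table, the file names, and the extra directory `d4` are hypothetical additions for illustration.

```python
def move_directory(dhus, src_parent, name, dst_parent, new_name=None):
    """Move a directory by updating two directory entries only.

    The moved directory keeps its DHU-ID, so its contents stay where they are.
    """
    child_dhu = dhus[src_parent].pop(name)          # update sent to the source parent's MDS
    dhus[dst_parent][new_name or name] = child_dhu  # update sent to the target parent's MDS
    return child_dhu


dhus = {
    "DHU1": {"d2": "DHU2", "d4": "DHU4"},  # d4 is a hypothetical second subdirectory of d1
    "DHU2": {"d3": "DHU3", "f2": None},
    "DHU3": {"f3": None},
    "DHU4": {},
}
d3_contents_before = dict(dhus["DHU3"])
moved = move_directory(dhus, "DHU2", "d3", "DHU4")  # /d1/d2/d3 becomes /d1/d4/d3
```

Since placement is computed from the DHU-ID, which `move_directory` never changes, the contents of `DHU3` remain on the same metadata server after the move.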

Querying a file or directory and listing directory contents (reading a directory)

To query a file or directory, first obtain the DHU-ID of its parent directory and use that DHU-ID to determine the MDS; the MDS then responds to the request and returns the required information.

There are two ways to list the contents of a directory: 1) by querying the metadata information of the directory's parent directory; 2) by reading the directory contents from the MDS on which the directory resides.

(4) Expansion of the metadata server cluster

As the file system grows, the metadata server cluster must grow with it; at the same time, nodes in the cluster may fail and drop out of the cluster. A metadata server joining or exiting inevitably causes some metadata migration, but the method proposed by the present invention minimizes the metadata reallocation needed when metadata servers are added or removed.

Compare with Figure 3: when metadata server MDS 1 exits (or fails), not all metadata has to be reallocated, unlike in earlier approaches. The exit of MDS 1 affects only the two DHUs in its counterclockwise direction (those between MDS 1 and MDS 3). As shown in Figure 4, only the metadata of the two DHUs originally allocated to MDS 1 is reallocated to MDS 2, and the metadata allocation of all other DHUs is unchanged.

Compare with Figure 4: when metadata server MDS x joins, as shown in Figure 5, it is likewise unnecessary to reallocate all metadata. The arrival of MDS x affects only the allocation of the DHU metadata in its counterclockwise direction: the metadata of the DHUs between MDS 3 and MDS x is transferred from MDS 2 to MDS x, and the metadata allocation of all other DHUs is unchanged.
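The limited impact of a server exiting or joining can be checked with a small Python sketch. The hash function, the server names, and the `dhu-N` identifiers are assumptions for illustration; for simplicity both the exit and the join are measured against the same four-server starting point rather than in sequence as in Figures 4 and 5.

```python
import bisect
import hashlib


def h32(key: str) -> int:
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


def placement(server_ids, dhu_ids):
    """Map every DHU-ID to the first server clockwise from its ring position."""
    points = sorted((h32(s), s) for s in server_ids)
    positions = [p for p, _ in points]
    result = {}
    for d in dhu_ids:
        i = bisect.bisect_left(positions, h32(d))
        result[d] = points[i % len(points)][1]  # modulo handles wrap-around
    return result


dhu_ids = [f"dhu-{i}" for i in range(1000)]  # hypothetical DHU identifiers
servers = ["MDS1", "MDS2", "MDS3", "MDS4"]

base = placement(servers, dhu_ids)
after_leave = placement([s for s in servers if s != "MDS1"], dhu_ids)
after_join = placement(servers + ["MDSx"], dhu_ids)

# DHUs whose server changed in each scenario.
moved_on_leave = {d for d in dhu_ids if base[d] != after_leave[d]}
moved_on_join = {d for d in dhu_ids if base[d] != after_join[d]}
```

In this model the only DHUs that move when MDS1 exits are those that were on MDS1, and the only DHUs that move when MDSx joins are those that MDSx takes over; every other assignment is unchanged.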

Claims

1. A distributed file system metadata distribution method, the distributed file system includes a client, a metadata server cluster and an OSD cluster, characterized in that:

Create a directory, using the directory of the file system as the basic unit of the hash;

Metadata distribution is carried out by adopting an extensible hash method. In the extensible hash method, first, the same hash function Addr=hash(key) is applied to the directory identifier and the identifier of the metadata server, where the key value is a constant with a relatively large value; then, the metadata is allocated according to the size of the hash value.

2. The distributed file system metadata distribution method as claimed in claim 1, characterized in that:

In the directory creation stage, each directory is assigned a unique identifier that does not change, and the metadata is distributed to different metadata servers based on the hash value of the identifier.

3. The distributed file system metadata distribution method as claimed in claim 1 or 2, characterized in that:

When the distributed file system is decomposed, the directory is used as the standard, and the first-level files or subdirectories contained in the directory are used as the basic unit of the hash. Each basic unit of the hash corresponds to a directory. The metadata of the files or subdirectories in the same basic unit of the hash will be distributed to the same metadata server, and the metadata of the second layer and below in the directory tree will be distributed to other metadata servers.

4. The distributed file system metadata distribution method as claimed in claim 1 or 2, characterized in that:

When the distributed file system distributes metadata, each node in the metadata server cluster has its uniquely identified ID, and each basic unit of the hash has a unique identifier DHU-ID,

First, use the same hash function to hash the node ID and DHU-ID to the circle of 0 to 2³²; then search clockwise from the position on the circle where the hash value of the DHU-ID is mapped, and allocate the metadata contained in the basic unit of the hash to the first server found; if no server is found after exceeding 2³², the metadata is allocated to the first metadata server.

5. The distributed file system metadata distribution method as claimed in claim 4, characterized in that:

The size of the range of the circle depends on the scale of the file system, and it should be avoided that different metadata server nodes hash to the same virtual location.

6. The distributed file system metadata distribution method as claimed in claim 1 or 2, characterized in that:

When the distributed file system creates a file or directory, the following steps are performed,

In the first step, the server receiving the request parses the directory /d1/d2/, obtains the identifier DHU-ID of the hash basic unit corresponding to d2, and calculates the hash value of the DHU-ID;

In the second step, according to the hash value of the DHU-ID, search for a second metadata server that manages metadata under the hash basic unit, and send a request to create a file or directory to the second metadata server;

Step 3: After receiving the request, the second metadata server adds the metadata of the corresponding file or directory under the directory d2, and returns the identifier of the file or directory to the user;

Step 4: The second metadata server sends a request to update the metadata of the directory d2 to the first metadata server corresponding to the parent directory d1 of d2.

7. The distributed file system metadata distribution method as claimed in claim 1 or 2, characterized in that:

When the distributed file system deletes a file, it searches for the corresponding metadata server according to the hash basic unit identifier of the parent directory of the file to be deleted, and then the metadata server obtained by the search deletes the metadata of the file, and Return the identifier of the file to the user, and at the same time send an update request to the metadata server where the metadata of the parent directory is located;

The distributed file system takes the following steps when deleting a directory,

In the first step, a third metadata server is found according to the hash basic unit identifier corresponding to the directory, and a deletion request is sent to the third metadata server;

In the second step, the third metadata server finds all subdirectories in the directory, finds the fourth metadata server according to the hash basic unit identifier corresponding to the subdirectory, and sends a request to delete the subdirectory to the fourth metadata server;

The third step is the same as the first and second steps. The fourth metadata server sends a request to delete the next subdirectory to the metadata server responsible for the next subdirectory, until there is no subdirectory under the directory to be deleted; all files in this directory are deleted, and feedback information is sent to the fourth metadata server after deletion is completed;

In the fourth step, after receiving the feedback information, the fourth metadata server completes the operation of deleting the content of the subdirectory, and sends the feedback information to the third metadata server in the same way after completion;

Finally, the third metadata server completes the operation of deleting the directory to be deleted.

8. The distributed file system metadata distribution method as claimed in claim 1 or 2, characterized in that:

When the distributed file system moves files or directories,

First perform the delete operation on the metadata server where the source parent directory is located, and then perform the create operation on the metadata server where the target parent directory is located, so as to complete the file movement;

The movement of the directory does not need to change the metadata server where the entire directory is located, but only needs to send the update directory item request to the MDS where the source parent directory and the target parent directory are located.

9. The distributed file system metadata distribution method as claimed in claim 1 or 2, characterized in that:

When the distributed file system queries a file or directory, it first obtains the hash basic unit identifier of the parent directory, determines the corresponding metadata server according to the hash basic unit identifier, and the corresponding metadata server responds to the request and returns required information;

There are two ways for the distributed file system to enumerate directory contents: 1) by querying the metadata information of the parent directory of the directory; 2) reading the directory contents in the metadata server where the directory is located.

10. The distributed file system metadata distribution method as claimed in claim 1 or 2, characterized in that:

When the first metadata server exits or fails, the exit or failure of the first metadata server only affects its counterclockwise DHUs, and only the metadata of the DHUs originally assigned to the first metadata server is redistributed to the corresponding second metadata server, while the metadata allocation of other DHUs does not change;

When the metadata server MDSx joins, the addition of MDSx only affects the redistribution of DHU metadata in the counterclockwise direction, that is, the metadata of the DHUs between the corresponding third metadata server and MDSx is transferred from the original corresponding metadata server to MDSx, while the metadata allocation of other DHUs is not changed.