
CN101217571B - Methods for write/read file operations in a multi-replica data grid system - Google Patents

Publication number
CN101217571B
Authority
CN
China
Prior art keywords
copy
view
module
write
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008100563932A
Other languages
Chinese (zh)
Other versions
CN101217571A (en
Inventor
郑纬民
武永卫
徐鹏志
杨广文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN2008100563932A
Publication of CN101217571A
Application granted
Publication of CN101217571B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

A method for write/read file operations in a multi-replica data grid system belongs to the field of high-speed file writing in multi-replica data grid systems and comprises the following steps in order. The user submits a write-file request to the replica access client. After the metadata server authenticates the request, its server distance comparison module selects the replica storage server logically closest to the user, and the corresponding replica view information is selected. After the view parsing module of the replica access client confirms that the replica view is valid, the read-write execution module of the replica access client sends a write-data request to that replica storage server. When the replica storage server has executed the write successfully, the replica view update module of the metadata server updates the replica view. The invention lets the user select the nearest replica storage server for write operations while ensuring that read operations still proceed normally.

Figure 200810056393

Description

Methods for write/read file operations in a multi-replica data grid system

Technical field

The invention relates to file operations in data grid systems, and in particular to high-speed file write operations in multi-replica data grid systems.

Background art

Data grid systems widely use data replication to improve the performance of file operations: multiple replicas of a file are stored on different data nodes, comprising one writable primary replica and several read-only replicas. For a read, the user can usually select the nearest replica manually, or the resource scheduler of the data grid automatically assigns the user a replica with low communication overhead. For a write, data is generally written to the primary replica, which is the only writable replica; once the primary replica has been updated, the other read-only replicas are updated automatically, either synchronously or asynchronously.

As the prior art above shows, file write operations in current data grids have the following deficiencies:

Because only the primary replica can be rewritten, write performance degrades when the transmission overhead between the node storing the primary replica and the user is high.

When a file is written frequently, the data node storing its primary replica may become overloaded, causing write performance to drop sharply or even crashing the system.

Summary of the invention

The object of the invention is to provide a method for nearby writing/reading of replica files in a multi-replica data grid system. The method lets the user write to the nearest replica file, which improves the performance of file write operations.

The method for supporting nearby writing/reading of replica files in a multi-replica data grid system involves a metadata server, replica storage servers and a replica access client.

The metadata server comprises a replica view storage module, a replica view update module, a server distance comparison module, a replica view selection module and an error handling module. The modules function as follows:

Replica view storage module: stores replica view information. The complete view information describing one replica consists of the address of the replica storage server and the descriptions of the available data segments of the replica stored on that server. Each data segment description is given by the starting offset within the replica and the segment length; the replica storage server address is given by the server's domain name or IP address;

Replica view update module: according to the details of a replica update, finds and deletes outdated replica view information, creates new replica view information, and modifies the data segment descriptions of the replica view;

Server distance comparison module: estimates the logical distance from each replica storage server to the user who initiated the read or write request. A greater distance means less available bandwidth on the communication link between server and user, and vice versa;

Replica view selection module: according to the user's read or write request and the output of the server distance comparison module, selects the replica view at the smallest distance from the user. For a read, the minimum-distance replica view may be assembled from multiple data segments on multiple replica storage servers; for a write, it is a single data segment on the replica storage server closest to the user;

The replica storage server stores and manages replica files, receives and executes read and write operations from users, and reports the update status of the replica files it manages to the metadata server;

Error handling module: when identity authentication fails, records the exception and sends a request-failure response to the client.
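The view structure and the nearest-server choice described for the metadata server can be illustrated with a small Python sketch. This is not the patented implementation: the class and function names are hypothetical, and the distance metric (the reciprocal of available bandwidth) is an assumption chosen only because the text states that greater distance means less available bandwidth.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataSegment:
    """One available range of a replica: start offset and length in bytes."""
    offset: int
    length: int


@dataclass
class ReplicaView:
    """View of one replica: where it lives and which ranges are usable."""
    server: str  # domain name or IP address of the replica storage server
    segments: list[DataSegment] = field(default_factory=list)


def logical_distance(bandwidth_mbps: float) -> float:
    """Assumed metric: distance shrinks as available bandwidth grows."""
    return float("inf") if bandwidth_mbps <= 0 else 1.0 / bandwidth_mbps


def nearest_server(bandwidth_to_user: dict[str, float]) -> str:
    """Pick the server with the smallest logical distance to the user."""
    return min(bandwidth_to_user,
               key=lambda s: logical_distance(bandwidth_to_user[s]))


def select_write_view(bandwidth_to_user: dict[str, float],
                      offset: int, length: int) -> ReplicaView:
    """For a write, the view is a single segment on the nearest server."""
    server = nearest_server(bandwidth_to_user)
    return ReplicaView(server, [DataSegment(offset, length)])
```

For example, `select_write_view({"a.example": 10.0, "b.example": 100.0}, 4096, 512)` yields a view on `b.example`, the server with the most available bandwidth under the assumed metric.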

The access client comprises a user access interface module, a view request module, a view parsing module, a read-write execution module and an error handling module. The modules function as follows:

User access interface module: receives the read and write file requests issued by the user;

View request module: according to the output parameters of the user access interface module, sends read and write requests to the metadata server to obtain the replica view information required for the specific read or write operation;

Replica view parsing module: parses the replica view information returned by the metadata server and groups it by the server on which each data segment resides. Data segments stored on the same server belong to the same group, and the segments within a group are sorted by starting offset. For a read, it obtains the data of each segment from the relevant replica storage servers through the read-write execution module, assembles them into one complete data segment and returns it to the user access interface module. For a write, it writes the user data into the corresponding data segment of the corresponding replica storage server through the read-write execution module;

Read-write execution module: interacts with the replica storage servers to execute read and write file operations;

Error handling module: handles errors arising from read and write view requests and from executing file reads and writes.
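The grouping and in-group sorting performed by the view parsing module can be sketched in a few lines of Python. The tuple layout `(server, offset, length)` and the function name `group_view` are illustrative assumptions, not part of the source.

```python
from collections import defaultdict


def group_view(segments):
    """Group view entries by server and sort each group by start offset.

    segments: iterable of (server, offset, length) tuples (assumed layout).
    Returns a dict mapping server -> list of (offset, length), sorted.
    """
    groups = defaultdict(list)
    for server, offset, length in segments:
        groups[server].append((offset, length))
    # Tuples sort by their first element, i.e. the start offset.
    return {server: sorted(ranges) for server, ranges in groups.items()}
```

Grouping by server lets the read-write execution module address each replica storage server once, and the offset order makes it straightforward to reassemble the data afterwards.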

The method of writing a file through the access client is implemented by the following steps in order:

Step (1). The user submits a write-file request to the user access interface module of the replica access client in the system. The request includes the identifier of the file to be written, the data to be written, and the starting offset and length of the data segment to be written;

Step (2). The user access interface module simply encapsulates the write-file request and passes it to the view request module of the replica access client, which issues a write-view request to the identity authentication module in the metadata server of the system;

Step (3). The identity authentication module of the metadata server performs simple authentication of the write-view request. If authentication fails, the error information is sent to the error handling module of the metadata server, which records the exception and sends a write-view request failure response to the view parsing module of the replica access client. If authentication succeeds, the confirmation and the user's write-view request are passed to the server distance comparison module of the metadata server, which, based on the pre-stored domain names or IP addresses of the replica storage servers, estimates and selects the replica storage server at the smallest logical distance from the user who initiated the write-view request;

Step (4). According to the replica storage server obtained from the server distance comparison module and the user's write-view request, the replica view selection module of the metadata server selects a replica view from those provided by the replica view storage module of the metadata server, generates the corresponding replica view information, and returns the result to the view parsing module of the replica access client;

Step (5). After receiving either the replica view information returned by the replica view selection module or the write-view request failure response returned by the error handling module of the metadata server, the view parsing module of the replica access client checks whether it holds valid view information. If the view is invalid, the error information is sent to the error handling module of the replica access client and returned to the user through the user access interface module. If the view is valid, the replica view information is parsed and grouped by the server on which each data segment resides, so that data segments stored on the same server belong to the same group; the data segments within each group are then sorted by starting offset, and the grouped view information is sent to the read-write execution module of the replica access client for the actual write;

Step (6). The read-write execution module of the replica access client sends a write-data request to the replica storage server according to the grouped view information received from the view parsing module;

Step (7). The replica storage server of step (6) checks the validity of the write-data request received from the read-write execution module of the replica access client. If the request is invalid, it returns an error message to that read-write execution module. If the request is valid, it performs the write on the data replica and, once the write succeeds, sends a replica update report to the replica view update module of the metadata server;

Step (8). After receiving the replica update report from the replica storage server, the replica view update module of the metadata server of step (7) updates the corresponding replica view information through the replica view storage module of the metadata server, so that the replica view selection module is supplied with accurate replica views. If the replica view update fails, the replica view update module returns a replica-update-failure confirmation to the replica storage server of step (7), which then rolls back to its state before the write and sends a write-data failure response to the read-write execution module of the replica access client of step (6);

Step (9). If the replica view update of step (8) succeeds, the replica view update module of step (8) returns a replica-update-success confirmation to the replica storage server of step (7), which sends a write-data success response to the read-write execution module of the replica access client of step (6);

Step (10). After receiving the write-data response from the replica storage server of step (7), the read-write execution module of the replica access client of step (6) examines it. If it is a write-data failure response, the error information is sent to the error handling module of the replica access client and returned to the user through the user access interface module. If it is a write-data success response, a write-success message is returned to the view parsing module, which returns the write-file response to the user through the user access interface module.
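Steps (7) to (9), in which the replica storage server writes, reports to the metadata server, and rolls back if the view update fails, can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: both classes, their method names, and the `fail_updates` switch are hypothetical, and the rollback is done by restoring a snapshot, which the source does not specify.

```python
class MetadataServer:
    """Stub holding replica views; fail_updates simulates an update failure."""

    def __init__(self, fail_updates=False):
        self.views = {}            # file_id -> list of (server, offset, length)
        self.fail_updates = fail_updates

    def report_update(self, file_id, server, offset, length):
        """Return True if the replica view was updated (steps 8 and 9)."""
        if self.fail_updates:
            return False
        self.views.setdefault(file_id, []).append((server, offset, length))
        return True


class ReplicaStorageServer:
    def __init__(self, name, metadata):
        self.name = name
        self.metadata = metadata
        self.files = {}            # file_id -> bytearray

    def write(self, file_id, offset, data):
        """Validate, write, report; roll back if the view update fails."""
        if offset < 0 or not data:
            return False           # invalid request (step 7)
        buf = self.files.setdefault(file_id, bytearray())
        snapshot = bytes(buf)      # state before the write, kept for rollback
        if len(buf) < offset:
            buf.extend(b"\x00" * (offset - len(buf)))  # pad up to the offset
        buf[offset:offset + len(data)] = data
        if self.metadata.report_update(file_id, self.name, offset, len(data)):
            return True            # success response (step 9)
        self.files[file_id] = bytearray(snapshot)      # rollback (step 8)
        return False
```

The boolean return value stands in for the write-data success or failure response that the read-write execution module examines in step (10).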

When the method is used for a read operation, after identity authentication succeeds, the server distance comparison module selects and generates the corresponding read view information according to the logical distances between the replica storage servers and the user.
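One way a read view could be built from distance information is sketched below. The source does not give the selection algorithm, so this is an assumed policy: split the requested range at every segment boundary and, for each piece, pick the nearest server that holds it. The function name and tuple layouts are hypothetical.

```python
def select_read_view(available, distance, offset, length):
    """Cover [offset, offset + length) with pieces from the nearest servers.

    available: list of (server, seg_offset, seg_length) tuples.
    distance:  dict mapping server -> logical distance to the user.
    Returns a list of (server, offset, length) pieces, or None if some
    part of the requested range is not available on any server.
    """
    end = offset + length
    # Breakpoints where the set of covering segments can change.
    points = {offset, end}
    for _, s_off, s_len in available:
        for p in (s_off, s_off + s_len):
            if offset < p < end:
                points.add(p)
    cuts = sorted(points)
    view = []
    for lo, hi in zip(cuts, cuts[1:]):
        covering = [srv for srv, s_off, s_len in available
                    if s_off <= lo and hi <= s_off + s_len]
        if not covering:
            return None
        best = min(covering, key=distance.__getitem__)
        if view and view[-1][0] == best and view[-1][1] + view[-1][2] == lo:
            srv, o, n = view[-1]
            view[-1] = (srv, o, n + (hi - lo))   # merge adjacent pieces
        else:
            view.append((best, lo, hi - lo))
    return view
```

With segments `("near", 0, 100)`, `("far", 0, 300)` and `("mid", 150, 150)`, a read of 200 bytes at offset 50 is served from `near` for bytes 50 to 99, from `far` for bytes 100 to 149, and from `mid` for bytes 150 to 249, which matches the statement that a read view may be assembled from segments on several servers.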

Before performing read or write operations, the user must obtain a handle to the file, and must release that handle after all read and write operations are complete.
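The acquire-then-release discipline for file handles maps naturally onto a Python context manager. The sketch below is illustrative only: `HandleTable`, `open_file` and the random-token handle format are assumptions, since the source does not describe how handles are issued.

```python
import secrets
from contextlib import contextmanager


class HandleTable:
    """Tracks which handles are currently open and who owns them."""

    def __init__(self):
        self._open = {}            # handle -> (user, file_id)

    def acquire(self, user, file_id):
        handle = secrets.token_hex(8)   # assumed handle format
        self._open[handle] = (user, file_id)
        return handle

    def release(self, handle):
        self._open.pop(handle, None)

    def is_valid(self, handle, user):
        """The handle must be open and must belong to the given user."""
        entry = self._open.get(handle)
        return entry is not None and entry[0] == user


@contextmanager
def open_file(table, user, file_id):
    """Acquire a handle, and release it even if an operation raises."""
    handle = table.acquire(user, file_id)
    try:
        yield handle
    finally:
        table.release(handle)
```

A caller would write `with open_file(table, "alice", "f1") as h:` and issue all reads and writes inside the block; the handle stops being valid as soon as the block exits.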

In summary, the method provided by the invention lets the user select the nearest replica storage server for write operations while ensuring that file reads proceed normally, which improves the performance of file read and write operations. In addition, the access client hides the details of reading and writing, giving the user a more convenient file operation service.

Description of drawings

Fig. 1 is the system topology diagram of the invention;

Fig. 2 is a block diagram of the main modules of the metadata server;

Fig. 3 is a block diagram of the main modules of the access client;

Fig. 4 is a flow chart of a user's file write operation;

Fig. 5 is a flow chart of a user's file read operation.

Detailed description of the embodiments

The method of writing a file through the access client is carried out in the following steps:

1. The user submits a write-file request to the access interface module of the access client. The request includes the identifier of the file to be written, the data to be written, and the starting offset and length of the data segment to be written;

2. The access interface module simply encapsulates the write-file request and hands it to the view request module, which requests from the metadata server the view information required for the write;

3. The metadata server performs simple authentication of the request. If authentication fails, an error message is returned to the access client; if it succeeds, control passes to the server distance comparison module, which estimates and selects the replica storage server closest to the user;

4. Once the nearest replica storage server has been found, the replica view selection module generates the corresponding view information and returns the result to the access client;

5. The view parsing module of the access client receives the view information returned by the metadata server and checks whether it is valid. If the view is invalid, an error message is returned to the user through the access interface module; if it is valid, the read-write execution module attempts to send a write-data request to the nearest replica storage server according to the view information;

6. The replica storage server checks the validity of the write-data request from the access client. If the request is invalid, an error message is returned to the access client; if it is valid, the write is executed. After a successful write, the replica storage server sends a replica update report to the metadata server. If the update fails, an error message is returned to the access client and the server rolls back to its state before the write; if the update succeeds, the access client is notified;

7. If the write request of the view parsing module of the access client fails, an error message is returned to the user through the access interface module; if it succeeds, the user is informed through the access interface module that the file write completed successfully.

The method of reading a file through the access client is carried out in the following steps:

1. The user submits a read-file request to the access interface module of the access client. The request includes the identifier of the file to be read and the starting offset and length of the data to be read;

2. The access interface module simply encapsulates the read-file request and hands it to the view request module, which requests from the metadata server the view information required for the read;

3. The metadata server performs simple authentication of the request. If authentication fails, an error message is returned to the access client; if it succeeds, control passes to the server distance comparison module, which estimates the distances between the replica storage servers and the user, and then to the replica view selection module;

4. The replica view selection module selects and generates the corresponding view information according to the distance information obtained from the server distance comparison module, and returns the result to the access client;

5. The view parsing module of the access client receives the view information returned by the metadata server and checks whether it is valid. If the view is invalid, an error message is returned to the user through the access interface module; if it is valid, the read-write execution module attempts to send read-data requests to one or more replica storage servers according to the view information;

6. The replica storage server checks the validity of the read-data request from the access client. If the request is invalid, an error message is returned to the access client; if it is valid, the read is executed and the data read is returned to the access client;

7. If the read request of the view parsing module of the access client fails, an error message is returned to the user through the access interface module; if it succeeds, all returned data segments are assembled into the complete data segment requested by the user and returned to the user through the access interface module.
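The final assembly in step 7 of the read procedure can be sketched as below. It assumes, beyond what the source states, that the pieces returned by the replica storage servers are contiguous and non-overlapping; the function name `assemble` and the `(piece_offset, data)` layout are hypothetical.

```python
def assemble(offset, length, pieces):
    """Concatenate returned pieces into the requested byte range.

    pieces: iterable of (piece_offset, data) pairs from the storage servers.
    Raises ValueError if the pieces leave a gap, overlap, or do not end
    exactly at offset + length.
    """
    out = bytearray()
    pos = offset
    for p_off, data in sorted(pieces):      # order by start offset
        if p_off != pos:
            raise ValueError("pieces do not tile the requested range")
        out += data
        pos += len(data)
    if pos != offset + length:
        raise ValueError("pieces do not cover the requested range")
    return bytes(out)
```

Raising on a gap corresponds to the failure branch of step 7, where an error message rather than partial data is returned to the user.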

The implementation of the invention is described in further detail below with reference to the drawings.

Fig. 1 is the network topology of a data grid system embodying the invention. It consists of two parts, a server group 1 and an access client 2. The server group comprises a metadata server 3 and multiple replica storage servers 4, with the metadata server 3 connected to the replica storage servers 4. The access client 2 is connected to both the metadata server 3 and the replica storage servers 4. A client read or write consists of two phases: first a metadata operation with the metadata server, then a data operation with the replica storage servers.

As shown in Fig. 2, the metadata server consists of six modules: the replica view update module 10, the replica view storage module 11, the identity authentication module 12, the server distance comparison module 13, the replica view selection module 14 and the error handling module 15.

The modules function as follows:

Replica view update module 10: receives replica update reports from the replica storage servers, finds and deletes outdated replica view information, creates new replica view information, modifies the data segment descriptions of the replica view, and sends an update confirmation to the replica storage server.

Replica view storage module 11: stores replica view information and provides a view query service to the other modules. The complete view information describing one replica consists of the address of the replica storage server and the descriptions of the available data segments of the replica stored on that server. Each data segment description is given by the starting offset within the replica and the segment length; the replica storage server address is given by the server's domain name or IP address.

Identity authentication module 12: performs file-handle-based authentication of the user's read and write view requests, that is, it determines whether the file handle supplied in the request is the file handle of a legitimate user.

Server distance comparison module 13: estimates the logical distance from each replica storage server to the user who initiated the read or write request. A greater distance means less available bandwidth on the communication link between server and user, and vice versa.

Replica view selection module 14: according to the user's read or write request and the output of the server distance comparison module, selects the replica view at the smallest distance from the user. For a read, the minimum-distance replica view may be assembled from multiple data segments on multiple replica storage servers; for a write, it is a single data segment on the replica storage server closest to the user.

Error handling module 15: when identity authentication fails, records the exception and sends a request-failure response to the client.
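The work of the replica view update module 10 (deleting outdated view information and adjusting segment descriptions) can be illustrated with interval arithmetic. The source does not say which view entries count as outdated, so the sketch below rests on an explicit assumption: once a range has been written on one server, the same range on every other server is stale and is cut out of that server's view. All names are hypothetical, and write lengths are assumed to be positive.

```python
def subtract(seg, cut):
    """Return the parts of seg not covered by cut; both are (offset, length)."""
    s_lo, s_hi = seg[0], seg[0] + seg[1]
    c_lo, c_hi = cut[0], cut[0] + cut[1]
    if c_hi <= s_lo or s_hi <= c_lo:
        return [seg]                          # no overlap, keep as is
    parts = []
    if s_lo < c_lo:
        parts.append((s_lo, c_lo - s_lo))     # piece before the cut
    if c_hi < s_hi:
        parts.append((c_hi, s_hi - c_hi))     # piece after the cut
    return parts


def apply_update(views, writer, offset, length):
    """Apply a write report to the views; returns a new dict.

    views: dict mapping server -> list of (offset, length) segments.
    The written range is removed from every other server (assumed stale)
    and recorded as a fresh segment on the writing server.
    """
    written = (offset, length)
    updated = {}
    for server, segs in views.items():
        if server == writer:
            continue
        kept = [part for seg in segs for part in subtract(seg, written)]
        if kept:                              # drop servers with nothing left
            updated[server] = kept
    own = [part for seg in views.get(writer, [])
           for part in subtract(seg, written)]
    updated[writer] = sorted(own + [written])
    return updated
```

Under this assumption, a later read of the written range can only be directed to the server that holds the new data, which is one way the read view can stay correct when writes go to arbitrary replicas.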

As shown in Fig. 3, the access client consists of five parts: the user access interface module 200, the view request module 201, the view parsing module 202, the read-write execution module 203 and the error handling module 204.

The modules function as follows:

User access interface module 200: receives the read and write file requests issued by the user and returns the execution results to the user.

View request module 201: according to the output parameters of the user access interface module, sends read and write view requests to the metadata server to obtain the replica view information required for the specific read or write operation.

Replica view parsing module 202: parses the replica view information returned by the metadata server and groups it by the server on which each data segment resides. Data segments stored on the same server belong to the same group, and the segments within a group are sorted by starting offset. For a read, it obtains the data of each segment from the relevant replica storage servers through the read-write execution module 203, assembles them into one complete data segment and returns it to the user access interface module 200. For a write, it writes the user data into the corresponding data segment of the corresponding replica storage server through the read-write execution module 203.

Read-write execution module 203: interacts with the replica storage servers to execute read and write file operations.

Error handling module 204: handles errors arising from read and write view requests and from executing file reads and writes.

Figure 4 is a flowchart of a user's file write operation. In the access client, the user access interface 200 performs simple encapsulation of the user's write-file request in step 300; in step 310 the view request module 201 requests from the metadata server the view information required for the write operation. In the metadata server, the identity authentication module authenticates the user's view request in step 320. If authentication succeeds in step 330, then in step 340 the server distance comparison module 13 estimates the distances and selects the replica storage server closest to the user, and in step 350 the replica view selection module 14 generates the view information and sends it to the access client; if authentication fails in step 330, the error handling module 15 sends an error report to the access client in step 360. In the access client, the view parsing module 202 receives the view response from the metadata server in step 370. If step 380 finds the view information invalid, the error handling module 204 handles the error in step 440 and the user access interface 200 informs the user in step 450 that the write failed; if step 380 finds the view valid, the write execution module 203 sends a write execution request to the replica storage server in step 390. In the replica storage server, step 400 checks the validity of the write execution request from the access client. If the request is invalid, the server returns an error message to the access client and the flow proceeds to step 420; if the request is valid, the server performs the write operation and sends a replica update report to the metadata server, whose replica view update module 10 updates the replica view in step 410, and once the update is complete the write execution result is sent to the access client in step 400. In the access client, the write execution module 203 receives the write execution result from the replica storage server in step 420, and step 430 determines whether the write succeeded. If it succeeded, the user access interface 200 returns the write response to the user in step 450; if step 430 finds that the write failed, the error handling module 204 handles the error in step 440 and the user access interface 200 informs the user in step 450 that the write failed.
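
Step 340 does not say how the distance between the user and a replica storage server is estimated. The sketch below is one possible reading, under a loudly stated assumption: it treats logical distance as the number of low-order bits left after the common IPv4 prefix of the two addresses. The metric, the function names, and the addresses are all hypothetical.

```python
import ipaddress


def logical_distance(ip_a: str, ip_b: str) -> int:
    """Assumed metric: 32 minus the length of the common IPv4 bit prefix."""
    a = int(ipaddress.IPv4Address(ip_a))
    b = int(ipaddress.IPv4Address(ip_b))
    # The highest differing bit marks where the shared prefix ends.
    return (a ^ b).bit_length()


def nearest_server(user_ip: str, server_ips):
    """Pick the server with the smallest assumed logical distance."""
    return min(server_ips, key=lambda ip: logical_distance(user_ip, ip))


servers = ["10.1.2.3", "10.1.200.7", "192.168.0.5"]
chosen = nearest_server("10.1.2.77", servers)
```

A real deployment could substitute any other estimate (measured latency, administrative domain, hop count) without changing the selection step itself.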

Figure 5 is a flowchart of a user's file read operation. In the access client, the user access interface 200 performs simple encapsulation of the user's read-file request in step 500; in step 510 the view request module 201 requests from the metadata server the view information required for the read operation. In the metadata server, the identity authentication module authenticates the user's view request in step 520. If authentication succeeds in step 530, then in step 540 the server distance comparison module 13 estimates the distance between each replica storage server and the user, and in step 550 the replica view selection module 14 generates the view information from the distances estimated in step 540 and sends it to the access client; if authentication fails in step 530, the error handling module 15 sends an error report to the access client in step 560. In the access client, the view parsing module 202 receives the view response from the metadata server in step 570. If step 580 finds the view information invalid, the error handling module 204 handles the error in step 620 and the user access interface 200 informs the user in step 650 that the read failed; if step 580 finds the view valid, the read execution module 203 sends read-data requests in step 590 to the replica storage servers named in the view. In the replica storage server, step 600 checks the validity of the read execution request from the access client. If the request is invalid, the server returns an error message to the access client and the flow proceeds to step 610; if the request is valid, the server performs the read and sends the read result to the access client. In the access client, the read execution module 203 receives the read result from the replica storage server in step 610, which also determines whether the read succeeded. If it succeeded, the view parsing module stitches the retrieved data together according to the view information in step 640, and once the whole view has been assembled the user access interface 200 returns the read response to the user in step 650; if step 610 finds that the read failed, the error handling module 204 handles the error in step 620 and the user access interface 200 informs the user in step 650 that the read failed.
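
The stitching in step 640 can be illustrated with a short sketch. It assumes each retrieved piece arrives as an `(offset, data)` pair and that the view covers the requested range exactly, with no overlap; the function name and the error behaviour are assumptions made for the example.

```python
def stitch(pieces, start, length):
    """Assemble (offset, data) pieces into one contiguous byte string.

    Raises ValueError if the pieces leave a gap, overlap, or do not
    cover exactly the range [start, start + length).
    """
    buf = bytearray(length)
    pos = start  # next file offset that still needs data
    for offset, data in sorted(pieces):
        if offset != pos:
            raise ValueError(f"gap or overlap at offset {pos}")
        buf[offset - start: offset - start + len(data)] = data
        pos += len(data)
    if pos != start + length:
        raise ValueError("view does not cover the requested range")
    return bytes(buf)


# Pieces may arrive in any order, for example from different servers.
pieces = [(104, b"grid"), (100, b"data")]
result = stitch(pieces, start=100, length=8)
```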

Claims (2)

1. A method for file write operations in a multi-replica data grid system, characterized in that it comprises the following steps, performed in order:
Step (1). The user submits a write-file request to the user access interface module of the replica access client in the system, the request comprising: the identifier of the file to be written, the data to be written, the starting-address offset of the data segment to be written, and the length of the data segment;
Step (2). After performing simple encapsulation of the write-file request, the user access interface module sends it to the view request module of the replica access client, and the view request module sends a write view request to the identity authentication module of the metadata server in the system;
Step (3). The identity authentication module of the metadata server performs simple identity authentication on the write view request. If authentication fails, error information is sent to the error handling module in the metadata server, which, after logging the exception, sends a write-view-request failure response to the view parsing module in the replica access client of the system. If authentication succeeds, the authentication-success information and the user's write view request are sent to the server distance comparison module in the metadata server, which, based on the pre-stored domain name or IP address of each replica storage server, estimates and selects the replica storage server whose logical distance from the user who initiated the write view request is smallest;
Step (4). The replica view selection module in the metadata server, based on the replica storage server obtained from the server distance comparison module and on the user's write view request, selects a replica view from the replica views provided by the replica view storage module in the metadata server, generates the corresponding replica view information, and returns the result to the view parsing module in the replica access client of the system;
Step (5). After receiving either the replica view information returned by the replica view selection module of the metadata server or the write-view-request failure response returned by the error handling module, the view parsing module of the replica access client checks whether the view information is valid. If the view is invalid, error information is sent to the error handling module of the replica access client, and an error message is returned to the user through the user access interface module. If the view is valid, the replica view information is parsed and grouped by the server on which each data segment in the replica view information resides, so that data segments stored on the same server belong to the same group; the data segments within each group are then sorted by starting-address offset, and the grouped view information is sent to the read/write execution module in the replica access client, which performs the actual data write operation;
Step (6). The read/write execution module of the replica access client sends a write-data request to the replica storage server according to the grouped view information received from the view parsing module;
Step (7). The replica storage server of step (6) checks the validity of the write-data request received from the read/write execution module of the replica access client. If the request is invalid, it returns an error message to the read/write execution module of the replica access client. If the request is valid, it performs the data write operation on the data replica and, after the write succeeds, sends a replica update report to the replica view update module in the metadata server;
Step (8). After receiving the replica update report from the replica storage server, the replica view update module of the metadata server of step (7) has the replica view storage module in the metadata server update the corresponding replica view information, so that accurate replica views can be provided to the replica view selection module. If the replica view update fails, the replica view update module returns a replica-update-failure acknowledgment to the replica storage server of step (7); the replica storage server then rolls back to its state before the write operation and sends a write-data failure response to the read/write execution module of the replica access client of step (6);
Step (9). If the replica view update of step (8) succeeds, the replica view update module of step (8) returns a replica-update-success acknowledgment to the replica storage server of step (7), and the replica storage server sends a write-data success response to the read/write execution module of the replica access client of step (6);
Step (10). After receiving the write-data response from the replica storage server of step (7), the read/write execution module of the replica access client of step (6) evaluates the response. If it is a write-data failure response, error information is sent to the error handling module of the replica access client, and an error message is returned to the user through the user access interface module. If it is a write-data success response, the write-success information is returned to the view parsing module, which returns the write-file response to the user through the user access interface module.
2. The method for file write operations in a multi-replica data grid system according to claim 1, characterized in that, when the method is used for a read operation, after authentication succeeds the server distance comparison module selects and generates the corresponding view information according to the logical distance information between the replica storage servers and the user.
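
The update-and-rollback behaviour of steps (7) to (9) can be sketched as follows. This is a simplified, single-process illustration under stated assumptions: the `ReplicaStorageServer` class, the `report_update` callback (standing in for the metadata server's replica view update module), the validity check, and the returned strings are all hypothetical and are not taken from the claims.

```python
class ReplicaStorageServer:
    """Toy replica holding one file in memory (illustrative only)."""

    def __init__(self, report_update):
        self.data = bytearray()
        # report_update stands in for the call to the metadata server's
        # replica view update module; it returns True when the view
        # update succeeds and False when it fails.
        self.report_update = report_update

    def write(self, offset, payload):
        # Step (7): reject invalid requests before touching the data.
        if offset < 0 or not payload:
            return "error: invalid request"
        snapshot = bytes(self.data)  # state before the write
        end = offset + len(payload)
        if len(self.data) < end:
            self.data.extend(b"\x00" * (end - len(self.data)))
        self.data[offset:end] = payload
        # Steps (8) and (9): report the update, roll back on failure.
        if not self.report_update(offset, len(payload)):
            self.data = bytearray(snapshot)
            return "write failed"
        return "write succeeded"


ok_server = ReplicaStorageServer(lambda offset, length: True)
ok_result = ok_server.write(0, b"abc")

failing_server = ReplicaStorageServer(lambda offset, length: False)
failed_result = failing_server.write(0, b"abc")
```

Taking the snapshot before the write keeps the replica's data consistent with the metadata server's view: the client only sees a success response once both the data and the view have been updated.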

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100563932A CN101217571B (en) 2008-01-18 2008-01-18 Methods for write/read file operations in a multi-replica data grid system

Publications (2)

Publication Number Publication Date
CN101217571A CN101217571A (en) 2008-07-09
CN101217571B true CN101217571B (en) 2010-07-28

Family

ID=39623933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100563932A Expired - Fee Related CN101217571B (en) 2008-01-18 2008-01-18 Methods for write/read file operations in a multi-replica data grid system

Country Status (1)

Country Link
CN (1) CN101217571B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100728

Termination date: 20170118
