CN115630037B - File page exchange-based client local persistent cache optimization method - Google Patents
- Publication number
- CN115630037B (application CN202211135679.6A)
- Authority
- CN
- China
- Prior art keywords
- file
- client
- page
- persistent cache
- cache
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
- G06F16/1824—Distributed file systems implemented using Network-attached Storage [NAS] architecture
- G06F16/183—Provision of network file services by network file servers, e.g. by using NFS, CIFS
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a file page exchange-based client local persistent cache optimization method, which optimizes cross-device access performance for clients of a distributed file system under the unstable networks and limited device hardware that characterize consumer scenarios. The method comprises refining the local persistent cache granularity, optimizing local persistent cache management, and simplifying the local persistent cache call stack. The basic idea is to asynchronously swap file pages that the client obtained through network requests out to a swap partition in external storage; when a file page needs to be accessed again, it is swapped back from the partition into memory, reducing network requests and access latency. Taking the file page as the minimum cache granularity, assisted by a cache management algorithm, saves device storage space and improves the cache hit rate. By applying the idea of swap partitions, the cache swap is performed directly through reads and writes of a specific file system, reducing inter-layer overhead and improving system performance.
Description
Technical Field
The invention relates to the technical field of distributed file system performance, and in particular to a file page exchange-based client local persistent cache optimization method, which optimizes cross-device access by mobile terminal devices according to the characteristics of consumer-grade scenarios.
Background
Consumer-grade mobile terminal devices have evolved rapidly over the past few decades, with an ever-expanding user base. Users generate more and more data in daily use, which increases the storage pressure on devices. To meet the demands of user data storage and backup, ensure the efficiency and performance of data access, and ultimately give users a good experience, home Network Attached Storage (NAS) services are becoming increasingly popular.
A home NAS service lets a user upload data generated on a mobile terminal device to the NAS server over the wireless local area network, then access and modify it over the WiFi connection when needed. The NAS server and the various consumer-grade mobile terminal devices form a distributed system with a star topology (server-client model), managed by a distributed file system (DFS). The consumer mobile terminal, as the client, obtains the required file pages through network requests, stores them temporarily in the local page cache (in memory), and returns them to the user-space application through the virtual file system. However, limited by the unstable networks and constrained device hardware (memory space, external storage space, etc.) of consumer scenarios, such cross-device remote file access suffers greatly in performance and reliability. Experiments show that under poor network quality and small memory space, the latency of the client device reading and writing remote files increases substantially.
Currently, mainstream distributed file systems improve the performance and reliability of client remote access by introducing a client-side persistent cache. For example, the FSCache middle layer, which supports distributed file systems such as NFS and AFS, provides interfaces to local persistence modules such as CacheFS and CacheFiles, helping the distributed file system persist fetched remote files to local external storage. When a file needs to be accessed again, the client device can read the local persistent cache directly without the network, improving performance and reducing network dependence. Coda adopts a similar design, downloading remote files in full to local external storage during the file-open flow.
However, the above schemes have problems in consumer scenarios. The storage devices of consumer mobile devices are limited in space and lifetime, so it is unsuitable to introduce too large a local persistent cache or to read, write, and erase it frequently. The CPU is also constrained in computing power: introducing a middle layer for processing increases computation and call overhead, reduces overall system performance, and hurts the user experience.
Therefore, for consumer scenarios, refining the local persistent cache granularity, optimizing local persistent cache management, and simplifying the local persistent cache call stack can improve the performance and reliability of remote file access on client devices under limited network and hardware conditions, providing a better user experience.
Disclosure of Invention
To overcome these problems, the invention aims to refine the local persistent cache granularity, optimize local persistent cache management, and simplify the local persistent cache call stack. It provides a file page exchange-based client local persistent cache optimization method that improves performance and user experience for distributed file system applications in consumer scenarios.
The specific technical scheme for realizing the aim of the invention is as follows:
A file page exchange-based client local persistent cache optimization method optimizes a distributed file system in consumer scenarios and comprises the following steps:
1) A client application program of the distributed file system in a consumer scenario makes a system call open to open a remote file;
2) After receiving a system call request for opening a remote file, an operating system kernel of the client sends a network request to a server, and the server sends back a response according to the request;
3) After receiving the server's response, the client checks, based on the file inode information in the response, whether a stale version of the file exists locally; if so, the stale version is cleared and step 4) is entered; if not, step 4) is entered directly;
4) The client starts the first part of client local persistent cache optimization management: it checks whether a persistent cache of the requested file exists in the swap partition created in local external storage. If it does, the file's LRU linked-list entry is promoted to the front of the list, indicating a cache hit; if not, a new entry is created and inserted at the front of the LRU list. After this check, step 5) is entered;
5) The operating system kernel of the client returns the information completed by the system call open to the upper layer application to finish the opening flow of the remote file;
6) After receiving a system call request to read the remote file, the client's operating system kernel first looks for the required file page in the memory page cache; if found, step 10) is entered and the flow then ends; if not, step 7) is entered;
7) The client's operating system kernel reads the file page through the readpage interface provided by the virtual file system, and at this point checks whether a persistent cache of the requested file page exists in the swap partition of local external storage; if it does, the page is swapped into memory and step 10) is entered; if not, step 8) is entered;
8) The client sends a network request for reading the file page to the server, and the server sends back a response containing the file page data according to the request;
9) After receiving the response, the client places the file page data from the response into memory for the upper-layer application to read, then enters step 10) while also executing step 11);
10) The operating system kernel of the client returns the information that the system call read has completed to the upper-layer application;
11) The client writes the requested file page into the local external-storage swap partition through the file system write interface kernel_write and records the page's persistent caching: using the page's index as a subscript, it locates the corresponding position in the bitmap of the file's LRU linked-list entry and sets that bit to 1, indicating the page is persistently cached;
12) After the requested file page has been persistently cached, the persistent-cache data volume of the local external-storage swap partition is increased by one PAGE_SIZE (the file-page size, 4 KB). The updated persistent-cache data volume is then checked against a preset threshold; if it exceeds the threshold, the file pages persistently cached for the files corresponding to LRU linked-list entries are evicted one by one, starting from the tail of the LRU list.
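The management logic of steps 4), 11), and 12) can be sketched as follows. This is a minimal illustrative model, not the patent's kernel implementation: the class and method names (`PersistentCacheManager`, `on_open`, `on_page_persisted`) and the use of a Python `OrderedDict` to stand in for the LRU linked list are all assumptions; the per-file bitmap is modeled as a set of page indices.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # the patent fixes the file-page size at 4 KB


class PersistentCacheManager:
    """In-memory LRU index over file pages persisted to a swap partition.

    A sketch of steps 4), 11), and 12) above; names are illustrative,
    not taken from the patent.
    """

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.cached_bytes = 0
        # OrderedDict models the LRU linked list: front = most recently used.
        self.entries = OrderedDict()  # file_id -> set of persisted page indices

    def on_open(self, file_id):
        # Step 4): on a hit, promote the entry to the front of the list;
        # on a miss, insert a fresh entry at the front.
        if file_id not in self.entries:
            self.entries[file_id] = set()
        self.entries.move_to_end(file_id, last=False)

    def on_page_persisted(self, file_id, page_index):
        # Step 11): record the page as persistently cached.
        pages = self.entries.setdefault(file_id, set())
        if page_index not in pages:
            pages.add(page_index)
            self.cached_bytes += PAGE_SIZE  # step 12): add one PAGE_SIZE
        # Step 12): evict whole files from the LRU tail while over threshold.
        while self.cached_bytes > self.capacity_bytes and self.entries:
            _victim, victim_pages = self.entries.popitem(last=True)
            self.cached_bytes -= PAGE_SIZE * len(victim_pages)

    def is_cached(self, file_id, page_index):
        pages = self.entries.get(file_id)
        return pages is not None and page_index in pages
```

For example, with a two-page capacity, persisting a third page evicts the least recently opened file's pages from the tail, exactly as step 12) describes.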
The optimization of the distributed file system in consumer scenarios covers distributed file systems implemented in both kernel mode and user mode, and applies to cross-device access of remote files.
The LRU linked list of the first part of client local persistent cache optimization management is stored in the client's memory and manages the persistently cached file pages in the external-storage swap partition. Each entry of the LRU linked list corresponds to one remote file, and its contents include a file identifier, file version information, the number of persistently cached file pages, and a bitmap recording which file pages are persistently cached.
The persistent cache of file data takes the file page as the minimum granularity; the per-page persistent cache record may be kept in a bitmap, a linked list, or a hash table.
Each entry of the LRU linked list of the first part of client local persistent cache optimization management may correspond to a single file page, a single file, or a set of associated files, and the LRU linked list may be replaced by a FIFO list or an LFU list.
In the invention, all cached data takes the file page as both the minimum persistent cache granularity and the minimum swap unit, realizing fine-grained cache optimization. The least-recently-used cache management represented by the LRU linked list achieves more efficient cache management within limited space and can, to a certain extent, identify the user's file-usage behavior patterns and file access hotness, providing a higher cache hit rate.
The beneficial effect of the invention is that when the client works with low network bandwidth and small memory space, the latency of the client's cross-device access to remote file data is reduced, providing a better user experience.
Drawings
Fig. 1 is a schematic diagram of the working path of the system according to the present invention.
FIG. 2 is a schematic diagram of a data structure for client-side local persistent cache optimization management in accordance with the present invention.
Fig. 3 is a timing flow chart of an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and examples.
Fig. 1 illustrates the system working paths of the invention. Working path ① represents interaction between the application program and the virtual file system through the operating system's POSIX interface. Working path ② represents interaction between the swap partition and a specific backing file system (e.g., F2FS or Ext4). Working path ③ represents interaction between the virtual file system and the backing file system. Working path ④ represents interaction between the backing file system and external storage, which is the basis for the operating system to access data on the external storage medium. Working path ⑤ represents interaction between the virtual file system and the server-side portion of the distributed file system; based on path ④, the server reads local file pages and sends the data to the client over the network. Working path ⑥ represents interaction between the virtual file system and the client portion of the distributed file system.
The swap-based client local persistent caching technique retains fetched remote file pages by persisting the file pages from memory to external storage. When a file page cached in the memory page cache is reclaimed, the client can, on a local persistent cache hit, swap the file page from the swap partition back into memory along path ②, avoiding a network request. Applying the idea of swap partitions, cached data is swapped and persisted to external storage through the virtual file system using a specific backing file system, avoiding a management middle layer, simplifying the call stack, and reducing inter-layer interaction overhead.
Considering that many files (e.g., streaming media files) need not be fetched in full when accessed by the user, and that requests do not necessarily follow a sequential access pattern, the invention adopts an on-demand caching design, which saves the swap space used for caching, reduces persistence overhead, and makes cache management more flexible. FIG. 2 illustrates the method for optimized management of client local persistent cache data. The data is managed per file using a bounded LRU linked list. Each element of the LRU list represents a remote file being accessed and includes the file's identifier, file version number, locally cached size, etc., and a bitmap recording the cached pages of the file.
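The per-file bitmap in Fig. 2 can be sketched as a compact bit array indexed by page number, as used in step 11) of the claims. The class name `PageBitmap`, the `max_pages` parameter, and the method names are illustrative assumptions; the bit-setting itself follows the patent's description (bit `i` = 1 means page `i` is persistently cached).

```python
class PageBitmap:
    """Per-file bitmap over page indices, one bit per file page.

    A sketch of the on-demand cache record of Fig. 2; names are
    illustrative assumptions.
    """

    def __init__(self, max_pages):
        # One byte covers eight pages; round up to cover max_pages bits.
        self.bits = bytearray((max_pages + 7) // 8)

    def mark_cached(self, page_index):
        # Set bit `page_index` to 1: the page now lives in the swap partition.
        self.bits[page_index >> 3] |= 1 << (page_index & 7)

    def is_cached(self, page_index):
        return bool(self.bits[page_index >> 3] & (1 << (page_index & 7)))
```

Because only the bits of actually fetched pages are set, a partially read streaming file consumes swap space proportional to the pages touched, not to the file size — which is the point of the on-demand design.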
Examples
As shown in fig. 3, a timing flow chart of an embodiment of the present invention is shown. The method is divided into two sub-processes, namely a file opening process and a file access process.
Three objects are referred to in the figure, client application, client and server.
The file opening flow is used for updating meta information of the local persistent cache of the client, and the specific flow is as follows:
1. The user-state client application program performs system call open to open a remote file;
2. After receiving the upper-layer request, the client's operating system kernel initiates a network request to the remote server to obtain and update the remote file's meta information;
3. The LRU linked list for the client's local persistent cache is updated and managed according to the obtained file information.
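The open flow above, combined with the stale-version check of step 3) of the claims, can be sketched as a single handler. All names here are illustrative assumptions: `persisted_pages` stands in for the swap-partition cache index and `local_versions` for the client's per-file version metadata.

```python
def handle_open_response(persisted_pages, local_versions, file_id, server_version):
    """Sketch of the open flow: invalidate a stale local copy, then
    register the file in the cache index.

    persisted_pages: dict file_id -> set of persisted page indices
    local_versions:  dict file_id -> version last seen from the server
    Returns True if a stale cached copy was discarded.
    """
    stale = (file_id in local_versions
             and local_versions[file_id] != server_version)
    if stale:
        # The server reports a newer version: drop every persisted page
        # of the old copy before it can be served again.
        persisted_pages.pop(file_id, None)
    local_versions[file_id] = server_version
    persisted_pages.setdefault(file_id, set())
    return stale
```

The design choice here mirrors the patent: version comparison happens once, at open time, so the per-page read path never needs to re-validate freshness against the server.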
The file access flow is used by the operating system kernel to return the requested data to the upper-layer application; it includes the added client local persistent cache optimization and proceeds as follows:
1. the user-state client application program performs system call read to read a remote file;
2. After receiving the upper layer request, the kernel of the operating system of the client side firstly inquires whether a relevant file page exists in a page cache of the local memory, and if so, the kernel directly returns to the upper layer application;
3. If the file page does not exist there, the swap partition holding the client's local persistent cache is queried; on a cache hit, the file page is swapped into memory and returned directly to the upper layer, avoiding a remote network request;
4. If the local cache also misses, the client initiates a network request and obtains the file page data from the remote server. The obtained file page is then asynchronously persisted: the data is swapped from memory into the swap partition so it can be fetched directly on the next access. After persistence completes, the persistent-cache data volume is updated; if it exceeds the space limit, persistent caches are evicted from the tail of the LRU linked list until the data volume no longer exceeds the limit.
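The three-level lookup of this access flow — memory page cache, then swap-partition cache, then network — can be sketched as follows. Parameter names are illustrative assumptions; `fetch_remote` stands in for the network request of step 4, and the asynchronous persistence is modeled synchronously for simplicity.

```python
def read_page(page_cache, swap_cache, fetch_remote, file_id, page_index):
    """Sketch of the file access flow; returns (data, source level).

    page_cache: dict modeling the in-memory page cache
    swap_cache: dict modeling the swap-partition persistent cache
    fetch_remote(file_id, page_index) -> bytes: stand-in network request
    """
    key = (file_id, page_index)
    if key in page_cache:                     # step 2: page-cache hit
        return page_cache[key], "memory"
    if key in swap_cache:                     # step 3: swap-partition hit
        page_cache[key] = swap_cache[key]     # swap the page back into memory
        return page_cache[key], "swap"
    data = fetch_remote(file_id, page_index)  # step 4: network request
    page_cache[key] = data
    swap_cache[key] = data                    # persist (async in the patent)
    return data, "network"
```

After a memory-pressure eviction clears the page cache, the same page is served from the swap partition without touching the network — the core latency benefit the invention claims.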
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211135679.6A CN115630037B (en) | 2022-09-19 | 2022-09-19 | File page exchange-based client local persistent cache optimization method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115630037A (en) | 2023-01-20 |
CN115630037B (en) | 2025-07-22 |
Family
ID=84902515
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211135679.6A | CN115630037B (en), Active | 2022-09-19 | 2022-09-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115630037B (en) |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9519591B2 (en) * | 2013-06-22 | 2016-12-13 | Microsoft Technology Licensing, Llc | Latch-free, log-structured storage for multiple access methods |
CN110188080B (en) * | 2019-05-17 | 2021-12-17 | 北京航空航天大学 | Remote file data access performance optimization method based on client-side efficient cache |
- 2022-09-19: application CN202211135679.6A filed in CN; granted as CN115630037B, status Active
Non-Patent Citations (2)
Title |
---|
Hang Li, Liang Shi, et al. An Energy-Efficient Stateful Persistent Cache Framework for Mobile Distributed File System. Preprint, DOI: 10.2139/ssrn.4543813, 2023, full text. *
Research on User Experience Optimization for Lightweight Distributed File Systems on Consumer Terminals; Xu Yuze; Master's thesis, East China Normal University (CNKI); 2023-09-15; full text. *
Also Published As
Publication number | Publication date |
---|---|
CN115630037A (en) | 2023-01-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110188080B (en) | Remote file data access performance optimization method based on client-side efficient cache | |
CN102760101B (en) | SSD-based (Solid State Disk) cache management method and system | |
US9075754B1 (en) | Managing cache backup and restore | |
CN102694828B (en) | A kind of method of distributed cache system data access and device | |
US9021222B1 (en) | Managing incremental cache backup and restore | |
CN106021381A (en) | Data access/storage method and device for cloud storage service system | |
CN106648464B (en) | Multi-node mixed block cache data reading and writing method and system based on cloud storage | |
CN106528451B (en) | Cloud storage framework and construction method for secondary cache prefetching of small files | |
CN110321301A (en) | A kind of method and device of data processing | |
CN107888687B (en) | Proxy client storage acceleration method and system based on distributed storage system | |
CN115774699B (en) | Database shared dictionary compression method and device, electronic equipment and storage medium | |
CN117539915B (en) | Data processing method and related device | |
CN107368608A (en) | The HDFS small documents buffer memory management methods of algorithm are replaced based on ARC | |
US11782842B1 (en) | Techniques for reclaiming dirty cache pages | |
CN104657461A (en) | File system metadata search caching method based on internal memory and SSD (Solid State Disk) collaboration | |
CN108959500A (en) | A kind of object storage method, device, equipment and computer readable storage medium | |
WO2023045385A1 (en) | Data processing method and related device | |
CN103491124A (en) | A method for processing multimedia message data and a distributed cache system | |
CN115630037B (en) | File page exchange-based client local persistent cache optimization method | |
CN109582233A (en) | A kind of caching method and device of data | |
US11586353B2 (en) | Optimized access to high-speed storage device | |
CN116795878B (en) | Data processing method and device, electronic equipment and medium | |
CN116821072A (en) | Dynamic caching method, device, equipment and storage medium for files | |
CN112445794B (en) | Caching method of big data system | |
CN110209343B (en) | Data storage method, device, server and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||