CN119597719A - File processing method, device, equipment, readable storage medium and program product - Google Patents

Info

Publication number
CN119597719A
Authority
CN
China
Prior art keywords
node
file
target
information
metadata server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411712846.8A
Other languages
Chinese (zh)
Inventor
魏文晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Cloud Technology Co Ltd
Original Assignee
China Telecom Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Cloud Technology Co Ltd
Priority to CN202411712846.8A
Publication of CN119597719A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioethics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to a file processing method, apparatus, computer device, computer-readable storage medium, and computer program product. The method comprises: receiving a file operation request that includes an operation path; sending the operation path to a metadata server and receiving node information, returned by the metadata server, of a target node corresponding to the operation path, wherein the target node is a node in a file cluster tree; and performing, according to the node information of the target node, processing corresponding to the file operation request on a file under the target node. The method can improve the efficiency of file operations.

Description

File processing method, apparatus, device, readable storage medium, and program product
Technical Field
The present application relates to the field of computer technology, and in particular, to a file processing method, apparatus, computer device, computer readable storage medium, and computer program product.
Background
With the rapid development of big data technology, enterprises and organizations face increasingly complex data storage and management challenges. The Hadoop Distributed File System (HDFS), a core component of the big data ecosystem, plays a key role in data storage and processing. However, as service scale expands and data volume grows rapidly, a single HDFS cluster often cannot meet all management requirements, so the coexistence of multiple HDFS clusters is increasingly common.
In the related art, an independent distributed file system serves as a unified access layer for multiple file storage systems, enabling data access and management across storage systems. However, this cluster management approach requires the metadata of the multiple storage clusters to be synchronized into the distributed file system, which increases the storage pressure on the distributed file system and lowers the efficiency of processing the managed files.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a file processing method, apparatus, computer device, computer-readable storage medium, and computer program product that can improve the processing efficiency of a file.
In a first aspect, the present application provides a file processing method, including:
receiving a file operation request, wherein the file operation request comprises an operation path;
sending the operation path to a metadata server, and receiving node information, returned by the metadata server, of a target node corresponding to the operation path, wherein the target node is a node in a file cluster tree;
and performing, according to the node information of the target node, processing corresponding to the file operation request on a file under the target node.
In one embodiment, performing, according to the node information of the target node, the processing corresponding to the file operation request on the file under the target node includes:
accessing a child node of the target node according to the node information of the target node, and acquiring the node information of the child node from the metadata server;
traversing the nodes in the file cluster tree in turn according to the node information of the child node until a first node corresponding to a target file is determined, and acquiring the node information of the first node from the metadata server;
and acquiring the target file according to the node information of the first node, and performing the processing corresponding to the file operation request on the target file.
In one embodiment, acquiring the target file according to the node information of the first node and performing the processing corresponding to the file operation request on the target file includes:
determining the node type of the first node according to the node information of the first node;
if the node type of the first node is the symbolic link type, acquiring the target file through a proxy file system pointed to by the symbolic link corresponding to the first node;
if the node type of the first node is not the symbolic link type, acquiring the target file through a local file system;
and performing the processing corresponding to the file operation request on the target file.
In one embodiment, the file operation request further includes a target path, and the method further includes:
copying file data of the target file corresponding to the first node;
acquiring node information, sent by the metadata server, of a second node corresponding to the target path;
and storing, according to the node information of the second node, the copied file data of the target file under the target path corresponding to the second node.
In one embodiment, the file operation request comprises a node rename request, the node rename request comprising a new name of the target node, the method further comprising:
creating a renamed node corresponding to the new name according to the operation path, and, in the node information of the target node, changing the node identifier of the target node to the node identifier of the renamed node to obtain the node information of the renamed node;
and sending a delete instruction to the metadata server to instruct the metadata server to delete the node information of the target node, and sending the node information of the renamed node to the metadata server for storage.
In one embodiment, the method further comprises:
receiving a file mounting request, wherein the file mounting request comprises a file mounting path;
creating a corresponding mounting node under the file mounting path;
and acquiring the node information of the mounting node, and sending the node information of the mounting node to the metadata server for storage.
In a second aspect, the present application also provides a file processing apparatus, including:
a request receiving module, configured to receive a file operation request, wherein the file operation request comprises an operation path;
a metadata receiving module, configured to send the operation path to a metadata server and receive node information, returned by the metadata server, of a target node corresponding to the operation path, wherein the target node is a node in a file cluster tree;
and a file processing module, configured to perform, according to the node information of the target node, processing corresponding to the file operation request on a file under the target node.
In a third aspect, the present application also provides a computer device comprising a memory and a processor, the memory storing a computer program, and the processor implementing the steps of the file processing method provided in the first aspect when executing the computer program.
In a fourth aspect, the present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the file processing method provided in the first aspect.
In a fifth aspect, the present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the file processing method provided in the first aspect.
With the above file processing method, apparatus, computer device, computer-readable storage medium, and computer program product, a file operation request is received, the operation path in the file operation request is sent to the metadata server, the node information of the target node corresponding to the operation path is received from the metadata server, and the processing corresponding to the file operation request is performed on the file under the target node according to that node information. The client can thus obtain the node information of the file cluster tree stored by the metadata server and use it to conveniently locate and process the file under the operation path, which improves file processing efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the related art, the drawings that are needed in the description of the embodiments of the present application or the related technologies will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other related drawings may be obtained according to these drawings without inventive effort to those of ordinary skill in the art.
FIG. 1 is an application environment diagram of a file processing method in one embodiment;
FIG. 2 is a flow diagram of a method of processing files in one embodiment;
FIG. 3 is a flow chart of a method for obtaining a target file according to an embodiment;
FIG. 4 is a schematic diagram of a file cluster tree structure in one embodiment;
FIG. 5 is a block diagram of a file processing device in one embodiment;
FIG. 6 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The file processing method provided by the embodiments of the present application can be applied to the application environment shown in FIG. 1, in which the client 102 communicates with the metadata server 104 via a network. A data storage system may store the data that the metadata server 104 needs to process. The data storage system may be integrated on the metadata server 104 or may be located on the cloud or on another network server. The metadata server 104 is configured to store and manage metadata corresponding to the HDFS file system 106. The HDFS file system 106 may include a plurality of metadata servers. The client 102 receives the file operation request and sends the operation path in the file operation request to the metadata server 104; the metadata server 104 sends the node information of the target node corresponding to the operation path to the client 102; and the client 102 performs the processing corresponding to the file operation request on the file under the target node according to the node information of the target node. The file system 106 may be stored in the metadata server 104 in a tree structure.
The client 102 may be installed on a terminal, which may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, Internet of Things device, or portable wearable device. The Internet of Things device may be a smart speaker, smart television, smart air conditioner, smart vehicle-mounted device, projection device, or the like. The portable wearable device may be a smart watch, smart bracelet, head-mounted device, or the like. The head-mounted device may be a Virtual Reality (VR) device, an Augmented Reality (AR) device, smart glasses, or the like. The metadata server 104 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services.
In an exemplary embodiment, as shown in FIG. 2, a file processing method is provided. The method is described as applied to the client in FIG. 1 and includes the following steps 202 to 206:
Step 202, a file operation request is received, where the file operation request includes an operation path.
The file operation request is a request initiated by a user to operate on a file in a file system. For example, the file operation request may be initiated by issuing an operation command, triggering an operation control, entering a corresponding page, or the like.
By way of example, the file operation request may be a file access request, a file delete request, a file mount request, a file copy request, a file rename request, and so forth. The content of the file operation request may differ between application scenarios, but every file operation request includes at least an operation path.
The operation path represents the storage location of the file in the file system. Operation path formats may differ between operating systems; for example, in the Linux operating system the operation path has the form /home/username/myfile.txt, and in the Windows operating system it has the form C:\Users\username\Desktop. In this embodiment, a plurality of file systems are stored in a tree structure under a unified naming scheme to obtain a file cluster tree. The entry of the file cluster tree may serve as the root node, each file system is a child node of the root node, and further child nodes are generated downward according to the file structure of each file system until the leaf nodes corresponding to the individual files are reached. The operation path is the set of nodes, from the root node downward, on the branch of the file cluster tree that contains the target node corresponding to the target file.
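The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the names `TreeNode`, `build_cluster_tree`, `split_operation_path`, and `nodes_on_path`, as well as the cluster names used in the usage note, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    """One node of the file cluster tree (hypothetical structure)."""
    name: str
    children: dict = field(default_factory=dict)

    def add_path(self, parts):
        """Create child nodes downward along `parts` and return the last one."""
        node = self
        for part in parts:
            node = node.children.setdefault(part, TreeNode(part))
        return node


def split_operation_path(path):
    """Split '/a/b/c' into the ordered node names ['a', 'b', 'c']."""
    return [p for p in path.split("/") if p]


def build_cluster_tree(file_systems):
    """The tree entry is the root; each file system is a child of the root."""
    root = TreeNode("/")
    for fs_name, file_paths in file_systems.items():
        fs_node = root.add_path([fs_name])
        for file_path in file_paths:
            fs_node.add_path(split_operation_path(file_path))
    return root


def nodes_on_path(root, path):
    """Return the node names from the root down to the node addressed by `path`."""
    names, node = [root.name], root
    for part in split_operation_path(path):
        node = node.children[part]  # raises KeyError if the path is not in the tree
        names.append(node.name)
    return names
```

For instance, building a tree from two hypothetical clusters `cluster_a` and `cluster_b` and calling `nodes_on_path(tree, "/cluster_a/logs/app.log")` yields the node names on that branch, starting at the root.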
Step 204, the operation path is sent to the metadata server, and node information of a target node corresponding to the operation path, returned by the metadata server, is received, wherein the target node is a node in a file cluster tree.
The metadata server stores the node information of all nodes in the file cluster tree, and the target node can be any node in the file cluster tree. The node information characterizes a node's position in the file cluster tree and its attributes. The node information includes, for example, node directory information, node type, node access time, node modification time, permissions, and the like; if the node type is the symbolic link type, the node information further includes the file system to which the symbolic link points. The node directory information characterizes the relationship between the node and its upper-level node.
The node types of the nodes in the file cluster tree may include a file type, a path type, and a symbolic link type. The file type indicates that the node corresponds to a file. The path type indicates that the node corresponds to a path; nodes of the path type are usually intermediate nodes of the file cluster tree. The symbolic link type indicates that the node corresponds to a symbolic link. A symbolic link (soft link) is a special file that contains a reference, in the form of an absolute or relative path, to another file or directory. A symbolic link acts as a shortcut: the target file or directory can be accessed through it without knowing the target's actual location. It will be readily appreciated that a symbolic link may point to a file in the client's local system or to a file in a proxy file system.
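One way to picture the node information is as a small record type. The sketch below is illustrative only: the field names, the `NodeType` enum, and the validation rule are assumptions derived from the description, and the address in the usage note is a made-up example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NodeType(Enum):
    FILE = "file"        # node corresponds to a file
    PATH = "path"        # node corresponds to a path (usually an intermediate node)
    SYMLINK = "symlink"  # node corresponds to a symbolic link


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical record for the node information kept by the metadata server."""
    node_id: int
    parent_id: int            # node directory information: link to the upper-level node
    name: str
    node_type: NodeType
    access_time: float
    modification_time: float
    permission: int
    link_target: Optional[str] = None  # file system a symbolic link points to

    def __post_init__(self):
        # Assumption: a symbolic link node must carry the file system it points to.
        if self.node_type is NodeType.SYMLINK and not self.link_target:
            raise ValueError("a symbolic link node must name the file system it points to")
```

A symbolic link node would then be created as, for example, `NodeInfo(7, 1, "remote", NodeType.SYMLINK, 0.0, 0.0, 0o755, "hdfs://namenode:8020/")`, while file and path nodes leave `link_target` unset.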
It can be understood that each operation path has a corresponding node in the file cluster tree. After receiving the operation path, the metadata server can start from the root node, find the target node corresponding to the operation path by following the nodes in the order in which they appear in the operation path, and then return the stored node information to the client in the form of a view. The view may be a tree structure or a table.
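The server-side lookup described above can be sketched as a walk over a nested dictionary. This is a simplified illustration, assuming each node is stored as a dict with hypothetical `info` and `children` keys; the source does not specify the server's internal layout.

```python
def find_target_node(root, operation_path):
    """Walk from the root, one path component at a time, to the target node."""
    node = root
    for name in (p for p in operation_path.split("/") if p):
        children = node.get("children", {})
        if name not in children:
            # Assumption: an unknown path is reported as an error.
            raise FileNotFoundError(operation_path)
        node = children[name]
    return node["info"]
```

The function visits the nodes in the order given by the operation path, so the cost of a lookup grows with the depth of the path rather than with the size of the tree.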
Step 206, processing corresponding to the file operation request is performed on the file under the target node according to the node information of the target node.
The processing corresponding to the file operation request may include, for example, accessing, copying, deleting, or renaming. After the node information of the target node is obtained, the processing corresponding to the file operation request can be performed on the file under the target node according to the node information. For example, if the file operation request is a file deletion request, then after the node information of the target node is obtained, a delete instruction is sent to the metadata server to instruct it to delete the stored node information of the target node, and the file corresponding to the target node is deleted at the same time.
Optionally, the node type of the target node may be determined according to the node information of the target node, and the cluster from which the file under the target node is to be acquired may be determined according to that node type.
In the above file processing method, a file operation request is received, the operation path in the file operation request is sent to the metadata server, the node information of the target node corresponding to the operation path is received from the metadata server, and the processing corresponding to the file operation request is performed on the file under the target node according to that node information. The file systems are stored uniformly in a file cluster tree with a tree structure, and the metadata of the file systems corresponding to the file cluster tree is stored in the metadata server, so the number and scale of the stored file systems are not limited. At the same time, the client can obtain the tree structure information of the file systems' metadata and use it to process files conveniently, which improves file processing efficiency.
In some embodiments, step 206 of performing the processing corresponding to the file operation request on the file under the target node according to the node information of the target node includes:
accessing a child node of the target node according to the node information of the target node and acquiring the node information of the child node from the metadata server; traversing the nodes in the file cluster tree in turn according to the node information of the child node until a first node corresponding to the target file is determined, and acquiring the node information of the first node from the metadata server; and acquiring the target file according to the node information of the first node and performing the processing corresponding to the file operation request on the target file.
In an actual application scenario, the node information may further include child node names. After obtaining the node information of the target node, the client may display the names of all child nodes of the target node. When the user selects one of the names, the client, in response to the selection, accesses the corresponding child node and sends the operation path corresponding to that child node to the metadata server, which returns the node information of the child node. The client may then display the names of the nodes at the next level below the child node, and so on, traversing the nodes in the file cluster tree in turn until the operation path corresponding to the target file is determined. The client sends the operation path of the target file to the metadata server, which returns the node information of the first node corresponding to the target file. The client then obtains the target file from the file system according to the node information of the first node and performs the processing corresponding to the file operation request on it, such as accessing, deleting, copying, or renaming. It is to be understood that the file systems referred to in the embodiments of the present application are all distributed file systems.
In one example, the file cluster tree is traversed layer by layer from the root node until the first node corresponding to the target file is determined. The operation path of the target file is sent to the metadata server, which returns the node information of the first node corresponding to that operation path, including, for example, the node identifier, node directory information, and node type of the first node. The target file can then be obtained according to the node type of the first node, and the processing corresponding to the file operation request is performed on it.
In this embodiment, child nodes are accessed layer by layer starting from the target node in the file cluster tree according to the node information of the target node until the first node corresponding to the target file is determined. The node information of the first node is obtained, the target file is obtained according to it, and the processing corresponding to the file operation request is then performed on the target file. The target file to be operated on can thus be determined by traversing the tree structure, so it can be found quickly and its access efficiency is improved.
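The layer-by-layer descent can be sketched with an in-memory stand-in for the metadata server. Everything here is hypothetical: the `FakeMetadataServer` class, its `get_node_info` method, the dict layout of the node information, and the `choose_child` callback that stands in for the user's selection.

```python
class FakeMetadataServer:
    """In-memory stand-in for the metadata server (hypothetical API)."""

    def __init__(self, nodes):
        self._nodes = nodes  # operation path -> node information
        self.requests = []   # recorded so that each round trip is visible

    def get_node_info(self, path):
        self.requests.append(path)
        return self._nodes[path]


def descend_to_file(server, start_path, choose_child):
    """Descend from start_path, one level per request, until a file node is reached."""
    path = start_path
    info = server.get_node_info(path)
    while info["type"] != "file":
        if not info["children"]:
            raise FileNotFoundError(f"no file below {path}")
        # choose_child models the user picking one of the displayed child names.
        child = choose_child(path, info["children"])
        path = path.rstrip("/") + "/" + child
        info = server.get_node_info(path)
    return path, info
```

In this sketch each level costs one request to the metadata server, which mirrors the description: the client asks for the node information of the selected child before it can display the next level.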
In an exemplary embodiment, acquiring the target file according to the node information of the first node and performing the processing corresponding to the file operation request on the target file includes:
determining the node type of the first node according to the node information of the first node; if the node type of the first node is the symbolic link type, acquiring the target file through the proxy file system pointed to by the symbolic link corresponding to the first node; if the node type of the first node is not the symbolic link type, acquiring the target file through a local file system; and performing the processing corresponding to the file operation request on the target file.
The node information may include a node type, which is determined from the node's attributes when the node is created. The metadata server stores the correspondence between node identifiers and node types in the file cluster tree, and the node type of the first node can be determined from this correspondence. It can be understood that operation paths, nodes, and node identifiers are in a one-to-one mapping, so once the operation path is known, the corresponding node can be located and its node identifier obtained.
Optionally, if the node type of the first node is the symbolic link type, the node information further includes the bound file system information. The metadata server stores the correspondence between node identifiers and bound file system information; from this correspondence, the file system to be accessed for the first node can be determined, and the target file is then acquired from that file system.
In one example, as shown in FIG. 3, the file cluster tree is traversed layer by layer from the root node until the target file is determined, the target file being the file the user needs to operate on. The node identifier of the first node is determined from a directory information table according to the operation path of the target file, and the node type of the first node is obtained from a node attribute table. If the node type of the first node is the symbolic link type, the file system address bound to the first node is obtained from a node path table, and it is checked whether the bound address starts with hdfs://. If it does, a proxy file system is initialized with the bound file system address, that is, the proxy file system path, and the target file is obtained from the proxy file system, which may be a remote file system. If the bound address does not start with hdfs://, the target file is obtained through the local file system. If the node type of the first node is not the symbolic link type, the target file is obtained from the local file system of the client. The directory information table stores the relationship between a node and its upper-level node; for example, it stores key-value pairs in which the key is the node identifier of the upper-level node together with the node name of the current node, and the value is the node identifier of the current node. The node attribute table stores the correspondence between node identifiers and node types, and the node path table stores the correspondence between node identifiers and bound file system addresses.
In this embodiment, the target file is obtained according to the node type of the first node: if the node type is the symbolic link type, the target file is obtained through the proxy file system pointed to by the symbolic link corresponding to the first node; otherwise, the target file is obtained through the local file system. The processing corresponding to the file operation request is then performed on the target file. In this way, remote proxy file systems are also brought into the file cluster tree for management through symbolic links, which makes file system management more lightweight and convenient and further improves file access efficiency.
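The three-table lookup and the hdfs:// check can be sketched as follows. The table contents, the `ROOT_ID` constant, the type strings, and the function names are assumptions made for illustration; the sketch only decides which file system to use and does not open any real HDFS connection.

```python
ROOT_ID = 0  # assumed identifier of the root node


def resolve_node_id(directory_table, operation_path):
    """Follow (parent id, node name) -> node id entries from the root."""
    node_id = ROOT_ID
    for name in (p for p in operation_path.split("/") if p):
        node_id = directory_table[(node_id, name)]
    return node_id


def choose_file_system(operation_path, directory_table, attribute_table, path_table):
    """Return ('proxy', address) or ('local', address_or_None) for the node."""
    node_id = resolve_node_id(directory_table, operation_path)
    if attribute_table[node_id] != "symlink":
        return ("local", None)       # not a symbolic link: client's local file system
    address = path_table[node_id]    # file system address bound to the node
    if address.startswith("hdfs://"):
        return ("proxy", address)    # initialise a proxy file system at this address
    return ("local", address)        # symbolic link that stays on the local file system
```

Keying the directory information table by the pair of upper-level node identifier and node name means two nodes with the same name under different parents never collide, and each path component resolves with a single dictionary lookup.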
In some embodiments, the file operation request further includes a target path, and the method further includes:
copying the file data of the target file corresponding to the first node; acquiring the node information, sent by the metadata server, of a second node corresponding to the target path; and storing, according to the node information of the second node, the copied file data of the target file under the target path corresponding to the second node.
The target path is the path under which the file is to be stored. Here the file operation request is a file copy request, which includes the save path of the copied file.
In an actual application scenario, after obtaining the target file, the client copies the file data of the target file, traverses the file cluster tree layer by layer from the root node according to the target path until the second node corresponding to the target path is reached, obtains the node information of the second node sent by the metadata server, and stores the copied file data under the target path corresponding to the second node according to that node information. It is easy to understand that obtaining the node information of the second node corresponding to the target path is similar to obtaining the node information of the target node corresponding to the operation path.
For example, a child node may be created under the second node, and the copied file data of the target file may be saved under that child node. It is easy to understand that the node type of the created child node may be the file type; for example, the created child node corresponds to a specific folder in which the copied file data of the target file is stored.
In this embodiment, the file data of the target file corresponding to the first node is copied, and the copied file data is then stored under the target path according to the node information of the second node corresponding to the target path, thereby implementing file copying.
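The copy flow can be sketched with two in-memory dictionaries standing in for file storage and for the node types held by the metadata server. The function name, the dict layout, and in particular the choice to name the new child node after the source file are assumptions; the source does not say how the child node is named.

```python
def copy_file(store, nodes, source_path, target_path):
    """Copy the file data at source_path into a new child node under target_path."""
    data = bytes(store[source_path])  # copy the file data of the target file
    if nodes.get(target_path) != "path":
        raise NotADirectoryError(target_path)
    # Assumption: the new child node keeps the source file's name.
    name = source_path.rsplit("/", 1)[-1]
    new_path = target_path.rstrip("/") + "/" + name
    nodes[new_path] = "file"          # create the child node under the second node
    store[new_path] = data            # save the copied data under that node
    return new_path
```

The source file and its node are left untouched, so after the call both the original path and the new path under the target path hold the same data.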
In some embodiments, the file operation request comprises a node rename request comprising a new name of the target node, the method further comprising:
creating a renamed node corresponding to the new name according to the operation path, and modifying the node identification of the target node into the node identification of the renamed node in the node information of the target node to obtain the node information of the renamed node; and sending a deleting instruction to the metadata server to instruct the metadata server to delete the node information of the target node, and sending the node information of the renamed node to the metadata server for storage.
The node renaming request is a request to rename a node. It includes the new name of the target node, that is, the node name after renaming.
During the node renaming operation, the node name of the target node corresponding to the operation path may be called the old name. In an actual application scenario, when the client receives a renaming request, it creates a renamed node corresponding to the new name according to the operation path and allocates a node identifier to the renamed node. In the node information of the target node, the node identifier of the target node is then changed to the node identifier of the renamed node, which yields the node information of the renamed node. The node information of the renamed node is sent to the metadata server for storage, and a deletion instruction is sent to the metadata server to instruct it to delete the node information of the target node.
Optionally, the node information of the target node includes the target node identifier, a key-value pair formed from the target node identifier and the previous-level node information, the node type, and the like. If the node type of the target node is the symbolic link type, the node information of the target node further includes the file system information bound to the target node identifier. After the target node identifier is replaced with the renamed node identifier in the target node information, the resulting node information of the renamed node includes the key-value pair formed from the renamed node identifier and the previous-level node information, the node type of the renamed node, and the file system information bound to the renamed node.
In this embodiment, a renamed node corresponding to the new name is created according to the operation path, the node identifier of the target node is replaced with the node identifier of the renamed node in the node information of the target node to obtain the node information of the renamed node, and the metadata server deletes the node information of the target node and stores the node information of the renamed node. The node is thereby renamed. File paths can be managed flexibly, which facilitates the organization and classification of files; a user can rename a file transparently on the basis of the file cluster tree without regard to where the file is actually stored, which improves file operation efficiency.
In some embodiments, the above method further comprises:
the method comprises the steps of receiving a file mounting request, wherein the file mounting request comprises a file mounting path, creating a corresponding mounting node under the file mounting path, acquiring node information of the mounting node, and sending the node information of the mounting node to a metadata server for storage.
The file mounting request can be characterized in the form of a file mounting command, and the file mounting request comprises a file mounting path, namely a position of a file mounted in a file cluster tree.
Optionally, the corresponding position in the file cluster tree is found according to the file mounting path, a mounting node is created there, and a node identifier is allocated to it. The node identifier of the previous-level node of the mounting node together with the node name of the mounting node serves as the key, and the node identifier of the mounting node serves as the value; this key-value pair is the node directory information of the mounting node. The node directory information, the node type, and the bound file system information together form the node information of the mounting node, which is sent to the metadata server for storage so that the mounting node can subsequently be accessed and processed.
In this embodiment, the corresponding mounting node is created under the file mounting path, and the node information of the mounting node is sent to the metadata server for storage. In other words, the mounting node is added to the file cluster tree and managed as one of its nodes, so that files can be managed flexibly and in an orderly manner, and file access and processing efficiency are improved.
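The mounting steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the application: the `MetadataStore` class, its three dictionaries, the tuple-shaped keys, and the `mount` function are hypothetical names, and an in-memory object stands in for the metadata server. The sketch only handles mount points directly under the root node.

```python
ROOT_ID = 0  # assumed node identifier of the root node


class MetadataStore:
    """Hypothetical in-memory stand-in for the metadata server."""

    def __init__(self):
        self.dentries = {}    # (parent node id, node name) -> node id
        self.node_types = {}  # node id -> node type
        self.bindings = {}    # node id -> bound file system URI
        self._next_id = 1

    def allocate_id(self):
        """Hand out the next unused node identifier."""
        node_id = self._next_id
        self._next_id += 1
        return node_id


def mount(store, fs_uri, mount_path):
    """Create a mounting node for a single-level path such as '/data'."""
    name = mount_path.strip("/")
    if not name or "/" in name:
        raise ValueError("sketch only supports mount points under the root")
    key = (ROOT_ID, name)
    if key in store.dentries:
        raise FileExistsError(mount_path)
    node_id = store.allocate_id()
    store.dentries[key] = node_id         # node directory information
    store.node_types[node_id] = "symlink" # node type
    store.bindings[node_id] = fs_uri      # bound file system information
    return node_id
```

Calling `mount(store, "hdfs://nn1/directory", "/data")` on a fresh store records all three pieces of node information for the new node in one step, which mirrors the single "send node information for storage" action in the text.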
In one exemplary embodiment, the client may integrate a multi-view management SDK (Software Development Kit). The HDFS file system may be a remote file system and/or a local file system, and there may be a plurality of HDFS file systems. Metadata of the HDFS file systems may be stored in a metadata server, which may use a storage backend such as MySQL, PostgreSQL, or Redis. In an actual application scenario, if an application service needs to read and write a distributed file system, it may initiate an operation request to the metadata server, for example by using an HDFS shell command such as hdfs dfs -get or hdfs dfs -put. The operation request passes through the virtual file system layer of the client SDK, which inherits and implements all interface semantics of the HDFS FileSystem class. FileSystem is a core class that provides a rich set of APIs (Application Programming Interfaces) for operating on files and directories in the HDFS file system. The objects involved include node objects (inodes) and directory objects (entry): node objects record the metadata information of the file system, and directory objects record its structural information, that is, the hierarchical relationship between directories. Information of the node objects and the directory objects is stored in the metadata server. The metadata server manages the file system hierarchy of the unified namespace together with information such as file attributes and permissions.
Illustratively, the file cluster tree structure is shown in FIG. 4. The file cluster tree in FIG. 4 includes two HDFS file clusters: the file cluster hdfs://nn1 is bound to the /data path, and the file cluster hdfs://nn2 is bound to the /user path. The number of file clusters in a file cluster tree in an actual application scenario is not limited to two; it may be hundreds, thousands, or more. The URI (Uniform Resource Identifier) entry of the file cluster tree is taken as the root node, whose node identifier is 0. Paths are then expressed by the node names at each level from top to bottom. For example, the first child node of the root node has the node name data and may have node identifier 1; its path is /data, since the node name of the root node can be omitted. Similarly, the second child node of the root node has the node name user and may have node identifier 2, and its path is /user. Because the node corresponding to data is of the symbolic link type and is bound to the file cluster hdfs://nn1, when the client initiates an operation request to the metadata server, the metadata server can return the metadata information corresponding to the file cluster hdfs://nn1, so that the client can display the metadata information of that file cluster synchronously.
Illustratively, in the file cluster tree shown in FIG. 4, the node directory information may be represented as key-value pairs, where the key is composed of the node identifier of the previous-level node of the current node and the node name of the current node, and the value is the node identifier of the current node, as shown in Table 1 below. For example, the previous-level node of the first child node is the root node, whose node identifier is 0, and the node name of the current node is data, so the key is 0/data and the value is the node identifier 1 of the first child node.
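The key-value layout just described can be written down directly. The following Python sketch is only an illustration: it lists the two entries named in the text (0/data and 0/user), and the helper names `make_key` and `lookup_child` are hypothetical.

```python
# Node directory information: "<previous-level node id>/<node name>" -> node id.
# Only the two entries mentioned in the text are included here.
node_directory = {
    "0/data": 1,
    "0/user": 2,
}


def make_key(parent_id, name):
    """Build the key from the previous-level node id and the node name."""
    return f"{parent_id}/{name}"


def lookup_child(directory, parent_id, name):
    """Return the node id of the named child, or None if it does not exist."""
    return directory.get(make_key(parent_id, name))
```

Because the key embeds the identifier of the previous-level node rather than the full path, resolving one path level is a single dictionary lookup, and a node can be moved or renamed without rewriting the keys of its descendants.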
TABLE 1
For example, the node attribute information corresponding to the file cluster tree shown in fig. 4 is shown in the following table 2, where only the attribute of the node type is shown in table 2, and the node attribute information may further include attributes such as file access time, file modification time, and file access authority. The file cluster information bound by the node with the node type being the symbolic link type is shown in table 3.
TABLE 2
TABLE 3
Illustratively, the client obtains a file mounting request corresponding to the mount command mount hdfs://nn2/directory /user, creates a /user node under the mounting path given in the command, and allocates node identifier 2 to it. The client then adds node identifier 2 and its node type "symbolic link" to the node attribute information of Table 2, adds node identifier 2 and the file cluster hdfs://nn2/directory bound to the symbolic link to Table 3, and adds "0/user" and node identifier 2 to Table 1.
The process of file access is illustratively described in terms of a query/user/bob. And traversing the access layer by layer from the root node according to the query path until the node corresponding to the bob file is determined, wherein the corresponding node identifier is 2, the node type of the node 2 is obtained from the node attribute information as a symbol link from the table 1, the file system path pointed by the symbol link is hdfs:// nn2/directory from the table 3, and the hdfs:// nn2/directory/bob is opened and accessed through a proxy file system.
Illustratively, the file copying process is described by taking copying /data/reports to the /user directory as an example. First, /data/reports is accessed; the final access path is hdfs://nn1/directory/reports, and the content of the file is read. Then the /user directory is opened, which finally opens hdfs://nn2/directory; a file named reports is created under that directory, and the file content read in the previous step is copied into the file created under hdfs://nn2/directory.
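The copy flow can be sketched as a read from the source cluster followed by a write to the destination cluster. In the Python illustration below, a flat `mounts` table replaces the file cluster tree to keep the example short, the `remote_files` dictionary is a fake stand-in for the remote clusters, and the file content is invented for the example; `to_uri` and `copy_file` are hypothetical names.

```python
# Simplified view mapping: mount path -> bound file system path.
mounts = {"/data": "hdfs://nn1/directory", "/user": "hdfs://nn2/directory"}

# Fake remote storage keyed by full URI; the content is made up.
remote_files = {"hdfs://nn1/directory/reports": b"quarterly figures"}


def to_uri(path):
    """Translate a path in the unified namespace into a cluster URI."""
    for prefix, base in mounts.items():
        if path == prefix:
            return base
        if path.startswith(prefix + "/"):
            return base + path[len(prefix):]
    raise FileNotFoundError(path)


def copy_file(src_path, dst_dir):
    """Read the source file, then create a same-named file in dst_dir."""
    data = remote_files[to_uri(src_path)]   # step 1: access and read
    name = src_path.rsplit("/", 1)[-1]
    dst_uri = f"{to_uri(dst_dir)}/{name}"   # step 2: open the target directory
    remote_files[dst_uri] = bytes(data)     # step 3: create the file and copy
    return dst_uri
```

The caller only ever names /data/reports and /user; which cluster serves each side is decided by the mapping, which is the transparency the embodiment aims for.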
Illustratively, if the path of the node to be renamed lies before the node whose type is the symbolic link type, the renaming is performed by the local file system; if it lies after that node, the renaming is performed by the remote file system. The local case is described by taking renaming /data to /store as an example. According to the /data path, a /store node is created in the file cluster tree and allocated node identifier 3. Node identifier 3 with the node type symbolic link is added to Table 2, node identifier 3 and hdfs://nn1/directory are added to Table 3, and "0/store" is pointed to node identifier 3. The information related to node identifier 1, the old /data node, is then deleted from Table 1, Table 2, and Table 3.
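The local renaming case amounts to three table updates for the new node followed by removal of the old node's entries. A minimal Python sketch follows, assuming the tables contain only the /data and /user entries given earlier; the function name `rename_top_level` is hypothetical, and the new node identifier is passed in rather than allocated.

```python
dentries = {"0/data": 1, "0/user": 2}       # node directory information
node_types = {1: "symlink", 2: "symlink"}   # node attribute information
bindings = {1: "hdfs://nn1/directory",      # bound file clusters
            2: "hdfs://nn2/directory"}


def rename_top_level(old_name, new_name, new_id):
    """Rename a node directly under the root by creating a renamed node."""
    old_key, new_key = f"0/{old_name}", f"0/{new_name}"
    if new_key in dentries:
        raise FileExistsError(new_name)
    old_id = dentries[old_key]  # raises KeyError if the old name is unknown
    # Create the renamed node: same node information, new node identifier.
    node_types[new_id] = node_types[old_id]
    if old_id in bindings:
        bindings[new_id] = bindings[old_id]
    dentries[new_key] = new_id
    # Delete the node information of the old node from all three tables.
    del dentries[old_key]
    del node_types[old_id]
    bindings.pop(old_id, None)
    return new_id
```

After `rename_top_level("data", "store", 3)`, the key 0/store points to node 3, node 3 is bound to hdfs://nn1/directory, and no data in the remote cluster has been touched; only the view mapping changed.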
When a file is deleted, the node corresponding to the file to be deleted is determined and its node type is checked. If the node type is the symbolic link type, the remote file system proxy deletes the lower-level files or directories; the deleted files or directories enter the recovery directory of the remote file cluster, and the remote file cluster performs physical deletion periodically. If the node type is not the symbolic link type, processing continues to the next node; if the node type of the next node is also not the symbolic link type, the corresponding file is deleted by the local file system.
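The two-phase deletion on the remote side (move to the recovery directory first, physical deletion later) can be illustrated with a small Python sketch. The `RemoteCluster` class, its method names, and the `retention` parameter are assumptions made for the example; the text only states that physical deletion happens periodically, not how the period is configured.

```python
class RemoteCluster:
    """Hypothetical stand-in for a remote file cluster with a recovery directory."""

    def __init__(self):
        self.files = {}  # path -> file content
        self.trash = {}  # path -> (file content, time of logical deletion)

    def delete(self, path, now):
        """Logical deletion: move the file into the recovery directory."""
        self.trash[path] = (self.files.pop(path), now)

    def purge(self, now, retention):
        """Physical deletion of entries that have waited at least `retention`."""
        expired = [p for p, (_, t) in self.trash.items() if now - t >= retention]
        for path in expired:
            del self.trash[path]
        return expired
```

A periodic job on the cluster would call `purge` with the current time; until then a deleted file is still recoverable from `trash`.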
In the above embodiment, multiple file systems are given a unified namespace through multi-view management and are managed according to a tree structure, so that view mapping is decoupled from file access. Files distributed across different file systems can be accessed transparently at the client, file system management becomes more flexible and lightweight, and file access and processing efficiency are improved.
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution order of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are also not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the application also provides a file processing device for implementing the file processing method described above. The solution provided by the device is implemented in a manner similar to that described for the method, so for the specific limitations in one or more embodiments of the file processing device provided below, reference may be made to the limitations of the file processing method above; they are not repeated here.
In one exemplary embodiment, as shown in FIG. 5, a file processing apparatus is provided, comprising a request receiving module 502, a metadata receiving module 504, and a file processing module 506, wherein:
A request receiving module 502, configured to receive a file operation request, where the file operation request includes an operation path;
The metadata receiving module 504 is configured to send the operation path to the metadata server, and receive node information of a target node corresponding to the operation path returned by the metadata server;
And the file processing module 506 is configured to perform processing corresponding to the file operation request on the file under the target node according to the node information of the target node.
In some embodiments, the file processing module 506 is further configured to access a child node of the target node according to the node information of the target node, and obtain the node information of the child node from the metadata server, traverse the nodes in the file cluster tree in turn according to the node information of the child node until the first node corresponding to the target file is determined, and obtain the node information of the first node from the metadata server, obtain the target file according to the node information of the first node, and perform a process corresponding to the file operation request on the target file.
In some embodiments, the file processing module 506 is further configured to determine a node type of the first node according to the node information of the first node, obtain the target file through the proxy file system pointed by the symbolic link corresponding to the first node if the node type of the first node belongs to the symbolic link type, obtain the target file through the local file system if the node type of the first node does not belong to the symbolic link type, and perform a process corresponding to the file operation request on the target file.
In some embodiments, the file operation request further includes a target path, the file processing module 506 is further configured to copy file data of a target file corresponding to the first node, obtain node information of a second node corresponding to the target path sent by the metadata server, and store the file data of the copied target file under the target path corresponding to the second node according to the node information of the second node.
In some embodiments, the file operation request comprises a node renaming request, wherein the node renaming request comprises a new name of a target node, the file processing device further comprises a renaming module, the renaming module is used for creating a renaming node corresponding to the new name according to an operation path, in node information of the target node, the node identification of the target node is modified to the node identification of the renaming node to obtain the node information of the renaming node, a deleting instruction is sent to a metadata server to instruct the metadata server to delete the node information of the target node, and the node information of the renaming node is sent to the metadata server to be stored.
In some embodiments, the file processing device further includes a file mounting module, configured to receive a file mounting request, where the file mounting request includes a file mounting path, create a corresponding mounting node under the file mounting path, obtain node information of the mounting node, and send the node information of the mounting node to the metadata server for storage.
The respective modules in the above-described file processing apparatus may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In an exemplary embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input device. The processor, the memory, and the input/output interface are connected through a system bus, and the communication interface, the display unit, and the input device are connected to the system bus through the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless communication may be implemented through Wi-Fi, a mobile cellular network, near field communication (NFC), or other technologies. The computer program, when executed by the processor, implements a file processing method. The display unit of the computer device is used to form a visual picture and may be a display screen, a projection device, or a virtual reality imaging device. The display screen may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen, a key, trackball, or touch pad arranged on the housing of the computer device, or an external keyboard, touch pad, mouse, or the like.
It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In an exemplary embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method of processing a file in the above-described embodiments when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the steps of the file processing method of the above-described embodiments.
In an embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, implements the steps of the file processing method of the above embodiments.
It should be noted that the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) involved in the present application are all information and data authorized by the user or fully authorized by all parties, and the collection, use, and processing of the related data must comply with the relevant regulations.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the computer program may carry out the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile memory and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database and the like. The processor referred to in the embodiments provided in the present application may be, but is not limited to, a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computation, an Artificial Intelligence (AI) processor, or the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to fall within the scope of the present application.
The above examples represent only a few embodiments of the application. They are described specifically and in detail, but should not therefore be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of the application shall be determined by the appended claims.

Claims (10)

1. A file processing method, the method comprising:
receiving a file operation request, wherein the file operation request comprises an operation path;
sending the operation path to a metadata server, and receiving node information of a target node corresponding to the operation path returned by the metadata server, wherein the target node is a node in a file cluster tree;
and performing, on the file under the target node, processing corresponding to the file operation request according to the node information of the target node.
2. The method according to claim 1, wherein the processing the file under the target node according to the node information of the target node and corresponding to the file operation request includes:
Accessing a child node of the target node according to the node information of the target node, and acquiring the node information of the child node from the metadata server;
Traversing the nodes in the file cluster tree in turn according to the node information of the child nodes until a first node corresponding to a target file is determined, and acquiring the node information of the first node from the metadata server;
And acquiring the target file according to the node information of the first node, and processing the target file corresponding to the file operation request.
3. The method according to claim 2, wherein the obtaining the target file according to the node information of the first node, and performing processing corresponding to the file operation request on the target file, includes:
determining the node type of the first node according to the node information of the first node;
if the node type of the first node belongs to the symbolic link type, acquiring the target file through a proxy file system pointed to by the symbolic link corresponding to the first node;
if the node type of the first node does not belong to the symbolic link type, acquiring the target file through a local file system;
and processing the target file corresponding to the file operation request.
4. The method of claim 2, wherein the file operation request further comprises a target path, the method further comprising:
copying file data of a target file corresponding to the first node;
Acquiring node information of a second node corresponding to the target path sent by the metadata server;
and according to the node information of the second node, storing the file data of the copied target file under a target path corresponding to the second node.
5. The method of claim 1, wherein the file operation request comprises a node rename request including a new name of the target node, the method further comprising:
creating a renamed node corresponding to the new name according to the operation path, and modifying the node identification of the target node into the node identification of the renamed node in the node information of the target node to obtain the node information of the renamed node;
And sending a deleting instruction to the metadata server to instruct the metadata server to delete the node information of the target node, and sending the node information of the renamed node to the metadata server for storage.
6. The method according to claim 1, wherein the method further comprises:
Receiving a file mounting request, wherein the file mounting request comprises a file mounting path;
creating corresponding mounting nodes under the file mounting paths;
And acquiring the node information of the mounting node, and sending the node information of the mounting node to a metadata server for storage.
7. A file processing apparatus, the apparatus comprising:
a request receiving module, configured to receive a file operation request, wherein the file operation request comprises an operation path;
a metadata receiving module, configured to send the operation path to a metadata server and receive node information of a target node corresponding to the operation path returned by the metadata server, wherein the target node is a node in a file cluster tree;
and a file processing module, configured to perform, on the file under the target node, processing corresponding to the file operation request according to the node information of the target node.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
CN202411712846.8A 2024-11-27 2024-11-27 File processing method, device, equipment, readable storage medium and program product Pending CN119597719A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411712846.8A CN119597719A (en) 2024-11-27 2024-11-27 File processing method, device, equipment, readable storage medium and program product

Publications (1)

Publication Number Publication Date
CN119597719A true CN119597719A (en) 2025-03-11

Family

ID=94829845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411712846.8A Pending CN119597719A (en) 2024-11-27 2024-11-27 File processing method, device, equipment, readable storage medium and program product

Country Status (1)

Country Link
CN (1) CN119597719A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination