WO2025087546A1 - Any point-in-time file system - Google Patents
Any point-in-time file system Download PDFInfo
- Publication number
- WO2025087546A1 WO2025087546A1 PCT/EP2023/080129 EP2023080129W WO2025087546A1 WO 2025087546 A1 WO2025087546 A1 WO 2025087546A1 EP 2023080129 W EP2023080129 W EP 2023080129W WO 2025087546 A1 WO2025087546 A1 WO 2025087546A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- entry
- timestamp
- file
- metadata
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1873—Versioning file systems, temporal file systems, e.g. file system supporting different historic versions of files
Definitions
- the present disclosure relates, in general, to accessing data from a file system at any point-intime (PIT).
- PIT point-intime
- a file system is a method and a data structure that the operating system uses to control how data is stored and retrieved. Without a file system, data placed in a storage medium would be a single large body of data, with no possibility of telling where a particular piece of data begins and ends, or where a particular piece of data is located. By separating the data into pieces and giving each piece a name, the data can be easily isolated and identified.
- Point-in-time recovery in the context of computers involves systems, often databases, whereby an administrator can restore or recover a set of data or a particular setting from a time in the past.
- PITR provides an option to restore application data to its state at any time since the base backup was created.
- PITR provides additional insurance against logical data corruption.
- An objective of the present disclosure is to provide a mechanism for supporting any point-intime access in a file system.
- a first aspect of the present disclosure provides a method of managing a file system, the method comprising performing an operation on a first entry comprised in a first location within the file system, the first entry having a unique name associated therewith, and adding an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
- a new type of a file system - that is, an any point-in-time (PIT) file system - can be provided.
- the any PIT file system enables the user to access files/directories at any pointin-time, even if the object has been deleted and/or changed (e.g., moved, renamed, or similar).
- the expression comprises a prefix, an infix, and/or a postfix.
- the first timestamp may comprise a value indicative of a beginning of the lifespan of the first entry.
- the second timestamp may comprise a value indicative of an end of the lifespan of the first entry.
- the value indicative of the end of the lifespan of the first entry may comprise a value indicating that the lifespan of the first entry has ended, or that an end of the lifespan of the first entry is undefined yet, such that the first entry is still active.
- the operation may comprise a move operation and the method may comprise creating a second entry in a second location within the file system, wherein the unique name associated with the second entry corresponds to the unique name associated with the first entry, replacing the first entry with a link to the second entry, and replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
- the operation may comprise a delete operation and the method may comprise replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
- the operation may comprise a rename operation and the method may comprise creating a second entry in the first location within the file system, replacing the first entry with a link to the second entry, and replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
- the operation may comprise a metadata operation and the method may comprise adding information related to a metadata change of the first entry, ordered by time, to an existing list of metadata.
- the operation may comprise a metadata operation and the method may comprise creating a second entry in the first location within the file system, wherein the first entry comprises first metadata, and the second entry comprises second metadata, replacing the first entry with a link to the second entry, and replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
- the first entry may comprise a file and an index node describing the file may comprise a key pointing to historical data versions of the file at various points-in-time, wherein the key comprises an offset in the file and the first timestamp of the file.
- the method may further comprise identifying the file at a specific point-in-time, wherein the identifying comprises dividing a pathname of the file into a sequence of tokens, starting from a root directory of the file system, going over those entries which comprise the first timestamp that comprises a value smaller or equal to the specific point in time and the second timestamp that comprises a value bigger or equal to the specific point in time in order to find the entry that matches a current token of the sequence of tokens based on a name of the file, and, in response to finding the entry, reading the index node describing the file.
- the method may further comprise accessing metadata of the file at the specific point-in-time, wherein accessing the metadata comprises reading the metadata list attached to the index node, scanning the metadata list and returning a metadata entry that is present in the list at the specific point-in-time.
- the method may further comprise accessing the file at the specific point-in-time, wherein accessing the file comprises checking whether a requested data block does not exceed a file size at the specific point-in-time based on an entry in the metadata list, in response to verifying that the file size is not exceeded, scanning through the pointer pointing to historical data versions of the file at various points-in-time until the file version at the specific point-in-time is detected, and returning the data block.
- a second aspect of the present disclosure provides a computer readable storage medium comprising computer program code, accessible by an apparatus comprising a processor, to provide instructions and/or data to the apparatus, the computer program code configured to, with the processor, cause the apparatus to perform an operation on a first entry comprised in a first location within a file system, the first entry having a unique name associated therewith, add an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
- a third aspect of the present disclosure provides a computing system, comprising a processor, a memory coupled to the processor, configured to store program code executable by the processor, the program code comprising one or more instructions to cause the computing system to perform an operation on a first entry comprised in a first location within a file system, the first entry having a unique name associated therewith, add an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
- Fig. 1 is a schematic representation of the method of managing a file system according to an example
- Fig. 2 schematically depicts a regular file system after a set of operations to aid understanding of the invention
- FIG. 3 schematically depicts a file system after a set of operations according to an example
- Fig. 4 schematically depicts a file system after a move operation according to an example
- Fig. 5 schematically depicts a file system after a delete operation according to an example
- Fig. 6 schematically depicts a file system after a rename operation according to an example
- Fig. 7 schematically depicts a file system after a metadata operation according to an example
- Fig. 8 schematically depicts an index node according to an example
- Fig. 9 schematically depicts a computing system according to an example.
- the current file systems do not support any point-in-time (PIT) access.
- PIT point-in-time
- a mechanism to support any PIT access in a file system enables the user to access files/directories at any point-in-time, even if the object has been deleted.
- the new type of a file system does not remove the history of an entry, instead storing it as a special data inside the file system itself, allowing the user to access data from any PIT.
- the file system supports any PIT access without the need to set up a backup for the file system.
- Examples in the present disclosure can be provided as methods, systems or machine-readable instructions, such as any combination of software, hardware, firmware or the like.
- Such machine-readable instructions may be included on a computer readable storage medium (including but not limited to disc storage, CD-ROM, optical storage, etc.) having computer readable program codes therein or thereon.
- the machine-readable instructions may, for example, be executed by a machine such as a general-purpose computer, user equipment such as a smart device, e.g., a smart phone, a special purpose computer, an embedded processor or processors of other programmable data processing devices to realize the functions described in the description and diagrams.
- a processor or processing apparatus may execute the machine-readable instructions.
- modules of apparatus for example, a module implementing a comparator unit, or a firewall structure and so on
- modules of apparatus may be implemented by a processor executing machine readable instructions stored in a memory, or a processor operating in accordance with instructions embedded in logic circuitry.
- the term 'processor' is to be interpreted broadly to include a CPU, processing unit, ASIC, logic unit, or programmable gate set etc.
- the methods and modules may all be performed by a single processor or divided amongst several processors.
- Such machine-readable instructions may also be stored in a computer readable storage that can guide the computer or other programmable data processing devices to operate in a specific mode.
- the instructions may be provided on a non-transitory computer readable storage medium encoded with instructions, executable by a processor.
- Fig. 1 is a schematic representation of the method of managing a file system according to an example.
- the method comprises, in block 101, performing an operation on a first entry comprised in a first location within the file system, the first entry having a unique name associated therewith.
- entity may be interpreted as a file and/or a directory within the file system.
- the unique name associated with the first entry may comprise an entry name, i.e., the name of the file and/or the name of the directory.
- the operation may comprise, for example, a move operation, a delete operation, a rename operation, and/or a metadata operation (i.e., an operation performed on the metadata associated with the first entry).
- the term “location” may relate to a specific location that the entry occupies in the memory (i.e., its memory address), or, more generally, to a directory that the entry is comprised in.
- the method comprises, in block 102, adding an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
- Fig. 2 schematically depicts a regular file system 200 after a set of operations to aid understanding of the invention.
- the Fig. exemplarily depicts four create directory 201-1, ..., 201-n operations, as well as five create file 202-1, . . . , 202-n operations.
- eight entries are created in total. The entry names remain unchanged after the set of operations has been performed.
- FIG. 3 schematically depicts a file system 300 after a set of operations according to an example. Same elements in Figs. 2 and 3 are denoted with the same reference numerals and function likewise. The set of operations depicted in Fig. 3 is the same as that depicted in Fig. 2. However, the skilled person would appreciate that, even though a specific number and type of operations are shown in the Figures, the invention is not limited thereto.
- a unique expression is added to each entry name for the directories 201-1 to 201-n as well as for the files 202-1 to 202-n.
- the expression comprises a prefix.
- the expression may also comprise an infix and/or a postfix.
- the prefix may comprise a pair of timestamps - start time and end time, separated by a dot character. This pair represents the lifetime of the entry.
- the first timestamp may comprise a value indicative of a beginning of the lifespan of the first entry - for example, when each of the files 202-n and/or the directories 201- n were created, respectively.
- the second timestamp may comprise a value indicative of an end of the lifespan of the first entry.
- the value indicative of the end of the lifespan of the first entry may comprise a value indicating that the lifespan of the entry has ended (for example, because the first entry has been deleted), or that an end of the lifespan of the first entry is undefined yet, such that the first entry is still active (i.e., that it has not been deleted/changed yet).
- the value indicating that the end of the lifespan of the first entry is undefined yet may comprise, for example, a tw value.
- a new entry may be created and the old entry may be replaced with a link to the new entry.
- the second timestamp i.e., the end time value
- the current timestamp instead of tinf
- Fig. 4 schematically depicts a file system after a move operation according to an example.
- a file fde2 402-2
- the old entry relating to file2402-2 may be replaced by a link (depicted as a dotted line) and a new entry relating to fde2 402-2 may be created under dir 2 401-2, such that the old entry points to the new entry.
- the prefix of the old entry may be updated with the timestamp of the operation (exemplarily depicted as tio).
- the algorithm may start from the root directory and only consider entries that are alive at the requested point-in-time (start time ⁇ PIT ⁇ end time).
- Fig. 5 schematically depicts a file system after a delete operation according to an example.
- a file (file 1 502-1) is deleted.
- the second timestamp may be replaced with a timestamp corresponding to a time of the delete operation.
- the second timestamp may be replaced with tio.
- the data of the file may not actually be deleted, to enable any PIT access in the future.
- Fig. 6 schematically depicts a file system after a rename operation according to an example.
- a file (file2 602-2) is renamed.
- the old entry relating to file2 602-2 may be replaced by a link (depicted as a dotted line) and a new entry relating to file 2 602-2 may be created in the same directory (dir I 601-1), such that the old entry points to the new entry.
- the new entry relating to file2 602-2 may have a different unique name associated therewith, compared to the old entry.
- the second timestamp of the old entry may be replaced with a timestamp corresponding to a time of the operation (in the example of Fig. 6, tio).
- Figs. 3-6 relate to operations that change the file system tree.
- Fig. 7 described below relates to a metadata operation that only changes the attributes of the entry without changing the hierarchy of the file system tree. Examples of such an operation comprise chmod, chown, setxattr and truncate. Below, the operations which do not change the hierarchy of the file system tree are explicitly referred to as “metadata operations”.
- Fig. 7 schematically depicts a file system after a metadata operation according to an example.
- attributes of a file (filel 702-1) are changed.
- information related to a metadata change may be added to an existing list of metadata 711 associated with the file in question.
- Information related to the metadata change may be ordered by time in the list of metadata 711, such that the latest metadata change is added to the end of the list of metadata 711.
- the list of metadata 711 may be stored in the file system on a per entry basis and consist of the metadata changes for the entry.
- the first entry in the list of metadata 711 may comprise the original metadata used during the creation.
- a metadata operation may be implemented by creating a second file in the same directory as the first file.
- the metadata of the second file may be different compared to the metadata of the first file.
- the first file may be replaced with a link to the second file, and the second timestamp of the first entry may be replaced with a timestamp corresponding to the time of the operation.
- entries e.g., files and directories
- Figs. 2-7 are referred to, in the Figures, using different reference numerals, the skilled person would readily appreciate that a single entry may be subject to a number of different operations (for example, it may be renamed, moved, and then deleted).
- Fig. 8 schematically depicts an index node according to an example.
- An index node is a data structure in a Unix-style file system that describes a file system object (for example, a file or a directory). Each index node stores the attributes and disk block locations of the object’s data.
- File system attributes may include metadata (for example, time of last change, access, modification), as well as owner and permission data.
- the index node 800 comprising a plurality of pointers may be extended by a key 801.
- the key 801 may comprise an additional indirect pointer that points to historical data versions of the file at various point-in-time.
- the indirect pointer shown in Fig. 8 comprises a triple indirect level pointer, but the invention is not limited thereto.
- the level of indirection i.e., whether the pointer comprises a double indirect pointer, a triple indirect pointer, a quadruple indirect pointer etc.
- the level of indirection may be changed depending on the amount of data history the file system should preserve.
- the key 801 may comprise an offset in the file and the first timestamp of the file.
- the key typically comprises only the offset.
- a new data may be appended into the last position, such that the last data block for each offset is pointed twice - once by the original data block and second time by the PIT pointer.
- a file name or a directory name must be mapped into the index node number, i.e., the file must be identified.
- the process may pass the file pathname to a virtual file system call, such as open(), mkdir(), rename() or stat().
- the standard procedure for performing this task may comprise analysing the pathname of the file and dividing it into a sequence of filenames called tokens.
- the filenames in the sequence of filenames identify directories within the file system.
- the process may include going over those entries that existed at tpiT (i.e., those files that comprise the first timestamp that comprises a value smaller or equal to the specific PIT and the second timestamp that comprises a value bigger or equal to the specific PIT, start time ⁇ tpiT ⁇ end time) to find the entry having a name that matches the current token. If no match has been found, the process may return “UNDEFINED”. If a match has been found, the index node associated with the entry may be read and the process can repeat for the next token until the last token in the path is reached, at which point the index node of the file has been identified.
- the identified index node of the file and the metadata attached to the index node may be read.
- the metadata list 711 (sorted chronologically by the timestamp) may be scanned and the metadata entry that is present in the list at the specific PIT may be returned.
- the index node may be read, and a check may be performed to verify whether the requested data block does not exceed the file size at time tpiT.
- the file size may be saved within the metadata entry 711. If the size of the requested data block exceeds the file size at time tpiT, “INVALID DATA” may be returned.
- the key 801 may be scanned through until the requested data block has been found. The data block may then be returned.
- Fig. 9 schematically depicts a computing system according to an example.
- the computing system 900 comprises a processor 901 and a memory 902 coupled to the processor 901, the memory 902 configured to store program code 903 executable by the processor 901, the program code 903 comprising one or more instructions to cause the computing system 900 to perform the method steps described herein.
- machine-readable instructions can be loaded onto a computer or other programmable data processing devices, so that the computer or other programmable data processing devices perform a series of operations to produce computer-implemented processing, thus the instructions executed on the computer or other programmable devices provide an operation for realizing functions specified by flow(s) in the flow charts and/or block(s) in the block diagrams.
- teachings herein may be implemented in the form of a computer or software product, such as a non-transitory machine-readable storage medium, the computer software or product being stored in a storage medium and comprising a plurality of instructions, e.g., machine readable instructions, for making a computer device implement the methods recited in the examples of the present disclosure.
- Cloud-computing environments may provide various services and applications via the Internet. These cloud-based services (e.g., software as a service, platform as a service, infrastructure as a service, etc.) may be accessible through a web browser or other remote interface of the user equipment for example. Various functions described herein may be provided through a remote desktop environment or any other cloud-based computing environment.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
In some examples, a method of managing a file system comprises performing an operation on a first entry comprised in a first location within the file system, the first entry having a unique name associated therewith, and adding an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
Description
ANY POINT-IN-TIME FILE SYSTEM
TECHNICAL FIELD
The present disclosure relates, in general, to accessing data from a file system at any point-intime (PIT).
BACKGROUND
A file system is a method and a data structure that the operating system uses to control how data is stored and retrieved. Without a file system, data placed in a storage medium would be a single large body of data, with no possibility of telling where a particular piece of data begins and ends, or where a particular piece of data is located. By separating the data into pieces and giving each piece a name, the data can be easily isolated and identified.
Point-in-time recovery (PITR) in the context of computers involves systems, often databases, whereby an administrator can restore or recover a set of data or a particular setting from a time in the past. In particular, PITR provides an option to restore application data to its state at any time since the base backup was created. PITR provides additional insurance against logical data corruption.
SUMMARY
An objective of the present disclosure is to provide a mechanism for supporting any point-intime access in a file system.
The foregoing and other objectives are achieved by the features of the independent claims.
Further implementation forms are apparent from the dependent claims, the description and the Figures.
A first aspect of the present disclosure provides a method of managing a file system, the method comprising performing an operation on a first entry comprised in a first location within the file system, the first entry having a unique name associated therewith, and adding an expression to the unique name associated with the first entry, wherein the expression comprises a first
timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
Accordingly, a new type of a file system - that is, an any point-in-time (PIT) file system - can be provided. The any PIT file system enables the user to access files/directories at any pointin-time, even if the object has been deleted and/or changed (e.g., moved, renamed, or similar).
In an implementation of the first aspect, the expression comprises a prefix, an infix, and/or a postfix.
The first timestamp may comprise a value indicative of a beginning of the lifespan of the first entry.
The second timestamp may comprise a value indicative of an end of the lifespan of the first entry.
The value indicative of the end of the lifespan of the first entry may comprise a value indicating that the lifespan of the first entry has ended, or that an end of the lifespan of the first entry is undefined yet, such that the first entry is still active.
The operation may comprise a move operation and the method may comprise creating a second entry in a second location within the file system, wherein the unique name associated with the second entry corresponds to the unique name associated with the first entry, replacing the first entry with a link to the second entry, and replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
The operation may comprise a delete operation and the method may comprise replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
The operation may comprise a rename operation and the method may comprise creating a second entry in the first location within the file system, replacing the first entry with a link to the second entry, and replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
The operation may comprise a metadata operation and the method may comprise adding information related to a metadata change of the first entry, ordered by time, to an existing list of metadata.
The operation may comprise a metadata operation and the method may comprise creating a second entry in the first location within the file system, wherein the first entry comprises first metadata, and the second entry comprises second metadata, replacing the first entry with a link to the second entry, and replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
The first entry may comprise a file and an index node describing the file may comprise a key pointing to historical data versions of the file at various points-in-time, wherein the key comprises an offset in the file and the first timestamp of the file.
The method may further comprise identifying the file at a specific point-in-time, wherein the identifying comprises dividing a pathname of the file into a sequence of tokens, starting from a root directory of the file system, going over those entries which comprise the first timestamp that comprises a value smaller or equal to the specific point in time and the second timestamp that comprises a value bigger or equal to the specific point in time in order to find the entry that matches a current token of the sequence of tokens based on a name of the file, and, in response to finding the entry, reading the index node describing the file.
The method may further comprise accessing metadata of the file at the specific point-in-time, wherein accessing the metadata comprises reading the metadata list attached to the index node, scanning the metadata list and returning a metadata entry that is present in the list at the specific point-in-time.
The method may further comprise accessing the file at the specific point-in-time, wherein accessing the file comprises checking whether a requested data block does not exceed a file size at the specific point-in-time based on an entry in the metadata list, in response to verifying that the file size is not exceeded, scanning through the pointer pointing to historical data versions of the file at various points-in-time until the file version at the specific point-in-time is detected, and returning the data block.
A second aspect of the present disclosure provides a computer readable storage medium comprising computer program code, accessible by an apparatus comprising a processor, to provide instructions and/or data to the apparatus, the computer program code configured to, with the processor, cause the apparatus to perform an operation on a first entry comprised in a first location within a file system, the first entry having a unique name associated therewith, add an expression to the unique name associated with the first entry, wherein the expression
comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
A third aspect of the present disclosure provides a computing system, comprising a processor, a memory coupled to the processor, configured to store program code executable by the processor, the program code comprising one or more instructions to cause the computing system to perform an operation on a first entry comprised in a first location within a file system, the first entry having a unique name associated therewith, add an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
These and other aspects of the invention will be apparent from the embodiment s) described below.
BRIEF DESCRIPTION OF THE DRAWINGS
In order that the present invention may be more readily understood, embodiments of the invention will now be described, by way of example, with reference to the accompanying drawings, in which:
Fig. 1 is a schematic representation of the method of managing a file system according to an example;
Fig. 2 schematically depicts a regular file system after a set of operations to aid understanding of the invention;
Fig. 3 schematically depicts a file system after a set of operations according to an example;
Fig. 4 schematically depicts a file system after a move operation according to an example;
Fig. 5 schematically depicts a file system after a delete operation according to an example;
Fig. 6 schematically depicts a file system after a rename operation according to an example;
Fig. 7 schematically depicts a file system after a metadata operation according to an example;
Fig. 8 schematically depicts an index node according to an example; and
Fig. 9 schematically depicts a computing system according to an example.
DETAILED DESCRIPTION
Example embodiments are described below in sufficient detail to enable those of ordinary skill in the art to embody and implement the systems and processes herein described. It is important to understand that embodiments can be provided in many alternate forms and should not be construed as limited to the examples set forth herein.
Accordingly, while embodiments can be modified in various ways and take on various alternative forms, specific embodiments thereof are shown in the drawings and described in detail below as examples. There is no intent to limit to the particular forms disclosed. On the contrary, all modifications, equivalents, and alternatives falling within the scope of the appended claims should be included. Elements of the example embodiments are consistently denoted by the same reference numerals throughout the drawings and detailed description where appropriate.
The terminology used herein to describe embodiments is not intended to limit the scope. The articles “a,” “an,” and “the” are singular in that they have a single referent, however the use of the singular form in the present document should not preclude the presence of more than one referent. In other words, elements referred to in the singular can number one or more, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes,” and/or “including,” when used herein, specify the presence of stated features, items, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, items, steps, operations, elements, components, and/or groups thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein are to be interpreted as is customary in the art. It will be further understood that terms in common usage should also be interpreted as is customary in the relevant art and not in an idealized or overly formal sense unless expressly so defined herein.
The current file systems do not support any point-in-time (PIT) access. According to an example, there is provided a mechanism to support any PIT access in a file system. The any PIT file system enables the user to access files/directories at any point-in-time, even if the object has been deleted. Advantageously, the new type of a file system does not remove the history of
an entry, instead storing it as a special data inside the file system itself, allowing the user to access data from any PIT. Furthermore, the file system supports any PIT access without the need to set up a backup for the file system.
Examples in the present disclosure can be provided as methods, systems or machine-readable instructions, such as any combination of software, hardware, firmware or the like. Such machine-readable instructions may be included on a computer readable storage medium (including but not limited to disc storage, CD-ROM, optical storage, etc.) having computer readable program codes therein or thereon.
The present disclosure is described with reference to flow charts and/or block diagrams of the method, devices and systems according to examples of the present disclosure. Although the flow diagrams described above show a specific order of execution, the order of execution may differ from that which is depicted. Blocks described in relation to one flow chart may be combined with those of another flow chart. In some examples, some blocks of the flow diagrams may not be necessary and/or additional blocks may be added. It shall be understood that each flow and/or block in the flow charts and/or block diagrams, as well as combinations of the flows and/or diagrams in the flow charts and/or block diagrams can be realized by machine readable instructions.
The machine-readable instructions may, for example, be executed by a machine such as a general-purpose computer, user equipment such as a smart device, e.g., a smart phone, a special purpose computer, an embedded processor or processors of other programmable data processing devices to realize the functions described in the description and diagrams. In particular, a processor or processing apparatus may execute the machine-readable instructions. Thus, modules of apparatus (for example, a module implementing a comparator unit, or a firewall structure and so on) may be implemented by a processor executing machine readable instructions stored in a memory, or a processor operating in accordance with instructions embedded in logic circuitry. The term 'processor' is to be interpreted broadly to include a CPU, processing unit, ASIC, logic unit, or programmable gate set etc. The methods and modules may all be performed by a single processor or divided amongst several processors.
Such machine-readable instructions may also be stored in a computer readable storage that can guide the computer or other programmable data processing devices to operate in a specific
mode. For example, the instructions may be provided on a non-transitory computer readable storage medium encoded with instructions, executable by a processor.
Fig. 1 is a schematic representation of the method of managing a file system according to an example. The method comprises, in block 101, performing an operation on a first entry comprised in a first location within the file system, the first entry having a unique name associated therewith. The term “entry” may be interpreted as a file and/or a directory within the file system. The unique name associated with the first entry may comprise an entry name, i.e., the name of the file and/or the name of the directory. The operation may comprise, for example, a move operation, a delete operation, a rename operation, and/or a metadata operation (i.e., an operation performed on the metadata associated with the first entry). Here, the term “location” may relate to a specific location that the entry occupies in the memory (i.e., its memory address), or, more generally, to a directory that the entry is comprised in.
The method comprises, in block 102, adding an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
Fig. 2 schematically depicts a regular file system 200 after a set of operations to aid understanding of the invention. The Fig. exemplarily depicts four create directory 201-1, ..., 201-n operations, as well as five create file 202-1, . . . , 202-n operations. In other words, in the example shown, eight entries are created in total. The entry names remain unchanged after the set of operations has been performed.
In contrast, Fig. 3 schematically depicts a file system 300 after a set of operations according to an example. Same elements in Figs. 2 and 3 are denoted with the same reference numerals and function likewise. The set of operations depicted in Fig. 3 is the same as that depicted in Fig. 2. However, the skilled person would appreciate that, even though a specific number and type of operations are shown in the Figures, the invention is not limited thereto.
In contrast to the regular file system shown in Fig. 2, according to the present invention, a unique expression is added to each entry name for the directories 201-1 to 201-n as well as for the files 202-1 to 202-n. In the example shown in Fig. 3, the expression comprises a prefix. However, the expression may also comprise an infix and/or a postfix. The prefix may comprise
a pair of timestamps - start time and end time, separated by a dot character. This pair represents the lifetime of the entry.
For each entry, the first timestamp may comprise a value indicative of a beginning of the lifespan of the first entry - for example, when each of the files 202-n and/or the directories 201- n were created, respectively. The second timestamp may comprise a value indicative of an end of the lifespan of the first entry. The value indicative of the end of the lifespan of the first entry may comprise a value indicating that the lifespan of the entry has ended (for example, because the first entry has been deleted), or that an end of the lifespan of the first entry is undefined yet, such that the first entry is still active (i.e., that it has not been deleted/changed yet). The value indicating that the end of the lifespan of the first entry is undefined yet may comprise, for example, a tw value. Once an entry in the file system is changed (e.g., the file is moved/deleted/renamed or such), a new entry may be created and the old entry may be replaced with a link to the new entry. In addition, the second timestamp (i.e., the end time value) of the old entry may be replaced with the current timestamp (instead of tinf) to indicate the new lifespan of the entry.
Fig. 4 schematically depicts a file system after a move operation according to an example. In this example, a file (fde2 402-2) is moved from one directory into another directory (from dirl 401-1 into dir2401-2). The old entry relating to file2402-2 may be replaced by a link (depicted as a dotted line) and a new entry relating to fde2 402-2 may be created under dir 2 401-2, such that the old entry points to the new entry. In addition, the prefix of the old entry may be updated with the timestamp of the operation (exemplarily depicted as tio).
Advantageously, by using an expression comprising timestamps, it may be possible to generate a file scanning algorithm for a specific point in time. The algorithm may start from the root directory and only consider entries that are alive at the requested point-in-time (start time < PIT < end time).
Fig. 5 schematically depicts a file system after a delete operation according to an example. In this example, a file (file 1 502-1) is deleted. The second timestamp may be replaced with a timestamp corresponding to a time of the delete operation. In the example of Fig. 5, since the file has been deleted at tio, the second timestamp may be replaced with tio. In contrast to conventional file systems, the data of the file may not actually be deleted, to enable any PIT access in the future.
Fig. 6 schematically depicts a file system after a rename operation according to an example. In this example, a file (file2 602-2) is renamed. Similarly to the delete operation described in relation to Fig. 4, the old entry relating to file2 602-2 may be replaced by a link (depicted as a dotted line) and a new entry relating to file 2 602-2 may be created in the same directory (dir I 601-1), such that the old entry points to the new entry. The new entry relating to file2 602-2 may have a different unique name associated therewith, compared to the old entry. The second timestamp of the old entry may be replaced with a timestamp corresponding to a time of the operation (in the example of Fig. 6, tio).
Figs. 3-6 relate to operations that change the file system tree. In contrast, Fig. 7 described below relates to a metadata operation that only changes the attributes of the entry without changing the hierarchy of the file system tree. Examples of such an operation comprise chmod, chown, setxattr and truncate. Below, the operations which do not change the hierarchy of the file system tree are explicitly referred to as “metadata operations”.
Fig. 7 schematically depicts a file system after a metadata operation according to an example. In this example, attributes of a file (filel 702-1) are changed. After the file attributes (for example, permissions) are changed, information related to a metadata change may be added to an existing list of metadata 711 associated with the file in question. Information related to the metadata change may be ordered by time in the list of metadata 711, such that the latest metadata change is added to the end of the list of metadata 711. Advantageously, by saving the metadata changes in such a manner, generation of links can be avoided, thus reducing overhead. The list of metadata 711 may be stored in the file system on a per entry basis and consist of the metadata changes for the entry. The first entry in the list of metadata 711 may comprise the original metadata used during the creation.
Alternatively, similarly to the rename operation described above, a metadata operation may be implemented by creating a second file in the same directory as the first file. The metadata of the second file may be different compared to the metadata of the first file. The first file may be replaced with a link to the second file, and the second timestamp of the first entry may be replaced with a timestamp corresponding to the time of the operation. In this implementation, there is no need to maintain a list of metadata 711.
Although the entries (e.g., files and directories) of Figs. 2-7 are referred to, in the Figures, using different reference numerals, the skilled person would readily appreciate that a single entry may
be subject to a number of different operations (for example, it may be renamed, moved, and then deleted).
Fig. 8 schematically depicts an index node according to an example. An index node (inode) is a data structure in a Unix-style file system that describes a file system object (for example, a file or a directory). Each index node stores the attributes and disk block locations of the object’s data. File system attributes may include metadata (for example, time of last change, access, modification), as well as owner and permission data.
Data operations change the actual data of the file and thus are only applicable to files (i.e., not to directories). In order to save the data history (thereby enabling any PIT access), the index node 800 comprising a plurality of pointers may be extended by a key 801. The key 801 may comprise an additional indirect pointer that points to historical data versions of the file at various point-in-time. The indirect pointer shown in Fig. 8 comprises a triple indirect level pointer, but the invention is not limited thereto. The level of indirection (i.e., whether the pointer comprises a double indirect pointer, a triple indirect pointer, a quadruple indirect pointer etc.) may be changed depending on the amount of data history the file system should preserve. As the level of indirection increases, so does the amount of history, i.e., the pointer may point to more files. The key 801 may comprise an offset in the file and the first timestamp of the file. In contrast, in a traditional file system, the key typically comprises only the offset. As such, in a traditional file system, there is no mechanism available for saving duplicates of the same offsets, ordered by timestamps. A new data may be appended into the last position, such that the last data block for each offset is pointed twice - once by the original data block and second time by the PIT pointer. By using the double pointers to the up-to-date data for each offset allows the PIT file system to serve sequential reads in the same efficiency as the original file system, without the need to duplicate data.
In most cases, in order to perform any file/directory operation on the file system, a file name or a directory name must be mapped into the index node number, i.e., the file must be identified. In order to identify a file at a specific PIT (tprr), the process may pass the file pathname to a virtual file system call, such as open(), mkdir(), rename() or stat().
The standard procedure for performing this task may comprise analysing the pathname of the file and dividing it into a sequence of filenames called tokens. The filenames in the sequence of filenames (except the last filename) identify directories within the file system. Starting from
a root directory of the file system, the process may include going over those entries that existed at tpiT (i.e., those files that comprise the first timestamp that comprises a value smaller or equal to the specific PIT and the second timestamp that comprises a value bigger or equal to the specific PIT, start time < tpiT < end time) to find the entry having a name that matches the current token. If no match has been found, the process may return “UNDEFINED”. If a match has been found, the index node associated with the entry may be read and the process can repeat for the next token until the last token in the path is reached, at which point the index node of the file has been identified.
If the metadata of the file at the specific PIT is to be accessed, the identified index node of the file and the metadata attached to the index node may be read. The metadata list 711 (sorted chronologically by the timestamp) may be scanned and the metadata entry that is present in the list at the specific PIT may be returned. Similarly, if the file data is to be accessed at the specific PIT, the index node may be read, and a check may be performed to verify whether the requested data block does not exceed the file size at time tpiT. The file size may be saved within the metadata entry 711. If the size of the requested data block exceeds the file size at time tpiT, “INVALID DATA” may be returned. In other words, in such a case, the return value may indicate that no data is available for that block (e.g., EOF = EndOfFile has been reached). In response to verifying that the file size is not exceeded, for a specific data offset, the key 801 may be scanned through until the requested data block has been found. The data block may then be returned.
Fig. 9 schematically depicts a computing system according to an example. The computing system 900 comprises a processor 901 and a memory 902 coupled to the processor 901, the memory 902 configured to store program code 903 executable by the processor 901, the program code 903 comprising one or more instructions to cause the computing system 900 to perform the method steps described herein.
According to an example, machine-readable instructions can be loaded onto a computer or other programmable data processing devices, so that the computer or other programmable data processing devices perform a series of operations to produce computer-implemented processing, thus the instructions executed on the computer or other programmable devices provide an operation for realizing functions specified by flow(s) in the flow charts and/or block(s) in the block diagrams.
Further, the teachings herein may be implemented in the form of a computer or software product, such as a non-transitory machine-readable storage medium, the computer software or product being stored in a storage medium and comprising a plurality of instructions, e.g., machine readable instructions, for making a computer device implement the methods recited in the examples of the present disclosure.
In some examples, some methods can be performed in a cloud-computing or network-based environment. Cloud-computing environments may provide various services and applications via the Internet. These cloud-based services (e.g., software as a service, platform as a service, infrastructure as a service, etc.) may be accessible through a web browser or other remote interface of the user equipment for example. Various functions described herein may be provided through a remote desktop environment or any other cloud-based computing environment.
While various embodiments have been described and/or illustrated herein in the context of fully functional computing systems, one or more of these exemplary embodiments may be distributed as a program product in a variety of forms, regardless of the particular type of computer- readable-storage media used to actually carry out the distribution. The embodiments disclosed herein may also be implemented using software modules that perform certain tasks. These software modules may include script, batch, or other executable files that may be stored on a computer-readable storage medium or in a computing system. In some embodiments, these software modules may conFig. a computing system to perform one or more of the exemplary embodiments disclosed herein. In addition, one or more of the modules described herein may transform data, physical devices, and/or representations of physical devices from one form to another.
The preceding description has been provided to enable others skilled in the art to best utilize various aspects of the exemplary embodiments disclosed herein. This exemplary description is not intended to be exhaustive or to be limited to any precise form disclosed. Many modifications and variations are possible without departing from the spirit and scope of the instant disclosure. The embodiments disclosed herein should be considered in all respects illustrative and not restrictive. Reference should be made to the appended claims and their equivalents in determining the scope of the instant disclosure.
Claims
1. A method of managing a file system, the method comprising: performing an operation on a first entry comprised in a first location within the file system, the first entry having a unique name associated therewith; adding an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
2. The method of claim 1, wherein the expression comprises a prefix, an infix, and/or a postfix.
3. The method of claim 1 or 2, wherein the first timestamp comprises a value indicative of a beginning of the lifespan of the first entry.
4. The method of claim 3, wherein the second timestamp comprises a value indicative of an end of the lifespan of the first entry.
5. The method of claim 4, wherein the value indicative of the end of the lifespan of the first entry comprises a value indicating that the lifespan of the first entry has ended, or that an end of the lifespan of the first entry is undefined yet, such that the first entry is still active.
6. The method of any preceding claim, wherein the operation comprises a move operation, the method comprising: creating a second entry in a second location within the file system, wherein the unique name associated with the second entry corresponds to the unique name associated with the first entry; replacing the first entry with a link to the second entry, and replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
7. The method of any one of claims 1 to 5, wherein the operation comprises a delete operation, the method comprising: replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
8. The method of any one of claims 1 to 5, wherein the operation comprises a rename operation, the method comprising: creating a second entry in the first location within the file system; replacing the first entry with a link to the second entry, and replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
9. The method of any one of claims 1 to 5, wherein the operation comprises a metadata operation, the method comprising: adding information related to a metadata change of the first entry, ordered by time, to an existing list of metadata.
10. The method of any one of claims 1 to 5, wherein the operation comprises a metadata operation, the method comprising: creating a second entry in the first location within the file system, wherein the first entry comprises first metadata, and the second entry comprises second metadata; replacing the first entry with a link to the second entry; replacing the second timestamp of the first entry with a timestamp corresponding to a time of the operation.
11. The method of claim 9, wherein the first entry comprises a file, wherein an index node describing the file comprises a key pointing to historical data versions of the file at various points-in-time, wherein the key comprises an offset in the file and the first timestamp of the file.
12. The method of claim 11, further comprising identifying the file at a specific point-intime, wherein the identifying comprises: dividing a pathname of the file into a sequence of tokens;
starting from a root directory of the file system, going over those entries which comprise the first timestamp that comprises a value smaller or equal to the specific point in time and the second timestamp that comprises a value bigger or equal to the specific point in time in order to find the entry that matches a current token of the sequence of tokens based on a name of the file; and in response to finding the entry, reading the index node describing the file.
13. The method of claim 12, further comprising accessing metadata of the file at the specific point-in-time, wherein accessing the metadata comprises: reading the metadata list attached to the index node; scanning the metadata list and returning a metadata entry that is present in the list at the specific point-in-time.
14. The method of claim 13, further comprising accessing the file at the specific point-intime, wherein accessing the file comprises: checking whether a requested data block does not exceed a file size at the specific point-in-time based on an entry in the metadata list; in response to verifying that the file size is not exceeded, scanning through the pointer pointing to historical data versions of the file at various points-in-time until the file version at the specific point-in-time is detected; and returning the data block.
15. A computer readable storage medium comprising computer program code, accessible by an apparatus comprising a processor, to provide instructions and/or data to the apparatus, the computer program code configured to, with the processor, cause the apparatus to: perform an operation on a first entry comprised in a first location within a file system, the first entry having a unique name associated therewith; add an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
16. A computing system, comprising: a processor;
a memory coupled to the processor, configured to store: program code executable by the processor, the program code comprising one or more instructions to cause the computing system to: perform an operation on a first entry comprised in a first location within a file system, the first entry having a unique name associated therewith; add an expression to the unique name associated with the first entry, wherein the expression comprises a first timestamp and a second timestamp, wherein the first timestamp and the second timestamp together define a lifespan of the first entry.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2023/080129 WO2025087546A1 (en) | 2023-10-27 | 2023-10-27 | Any point-in-time file system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2023/080129 WO2025087546A1 (en) | 2023-10-27 | 2023-10-27 | Any point-in-time file system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2025087546A1 true WO2025087546A1 (en) | 2025-05-01 |
Family
ID=88647294
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2023/080129 Pending WO2025087546A1 (en) | 2023-10-27 | 2023-10-27 | Any point-in-time file system |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2025087546A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080117899A1 (en) * | 2006-11-16 | 2008-05-22 | Terence Sean Sullivan | Network audio directory server and method |
US20100036895A1 (en) * | 2008-08-06 | 2010-02-11 | International Business Machines Corporation | Representation of system clock changes in time based file systems |
US9973785B1 (en) * | 2015-12-28 | 2018-05-15 | Amazon Technologies, Inc. | Automatic failover for live video streaming |
-
2023
- 2023-10-27 WO PCT/EP2023/080129 patent/WO2025087546A1/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080117899A1 (en) * | 2006-11-16 | 2008-05-22 | Terence Sean Sullivan | Network audio directory server and method |
US20100036895A1 (en) * | 2008-08-06 | 2010-02-11 | International Business Machines Corporation | Representation of system clock changes in time based file systems |
US9973785B1 (en) * | 2015-12-28 | 2018-05-15 | Amazon Technologies, Inc. | Automatic failover for live video streaming |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11914485B2 (en) | Restoration of specified content from an archive | |
US7752226B1 (en) | Reverse pathname lookup by inode identifier | |
US11507550B2 (en) | Eventual consistency in a deduplicated cloud storage system | |
US9665304B2 (en) | Storage system with fast snapshot tree search | |
US10387271B2 (en) | File system storage in cloud using data and metadata merkle trees | |
US20110131184A1 (en) | Focused backup scanning | |
US12174709B2 (en) | Reducing bandwidth during synthetic restores from a deduplication file system | |
US11544150B2 (en) | Method of detecting source change for file level incremental backup | |
JP2005018757A (en) | Quick restoration for use of file system in ultra-large-scale file system | |
EP4208777B1 (en) | Efficiently storing data in a cloud storage | |
CN111045857A (en) | Method for data backup and recovery, electronic device and computer readable storage medium | |
EP3963465B1 (en) | File system metadata deduplication | |
CN111104377A (en) | File management method, electronic device and computer-readable storage medium | |
US20130290383A1 (en) | Mapping long names in a filesystem | |
US11403024B2 (en) | Efficient restoration of content | |
US20190147047A1 (en) | Object-level image query and retrieval | |
WO2025087546A1 (en) | Any point-in-time file system | |
US12056105B2 (en) | File lifetime tracking for cloud-based object stores | |
US20190188183A1 (en) | Handling weakening of hash functions by using epochs | |
WO2025087514A1 (en) | Accessing data from a backup file system at a specific point-in-time | |
US20190079948A1 (en) | Directory tree clones | |
WO2023165691A1 (en) | Method of updating key/value pair in object storage system and object storage system | |
WO2023138788A1 (en) | Method of backing up file-system onto object storgae system and data management module | |
WO2022117203A1 (en) | Continuous data protection system and method of storing data therein |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23798749 Country of ref document: EP Kind code of ref document: A1 |