CN109828866B - XFS file fragment recovery method and device - Google Patents
- Publication number: CN109828866B
- Application number: CN201910076494.4A
- Authority: CN (China)
- Prior art keywords: file, fragment, fragments, xfs, data block
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides an XFS file fragment recovery method and device. The method comprises the following steps: step 1, obtaining a file directory structure of a disk by using a directory manager of an XFS file system; step 2, determining the file fragments and fragment types in each disk partition according to the information entropy values of the data blocks extracted from each disk partition, wherein the fragment types comprise text fragments and image fragments; step 3, acquiring the initial logical address of each file fragment in a file linked list; and step 4, splicing and recovering the XFS file fragments according to the initial logical address and the fragment type of the file fragments and the file directory structure. The device comprises a directory acquisition module, a fragment extraction module, an address query module and a splicing recovery module. The invention uses the directory manager and the space manager of the XFS file system to quickly determine the file fragments and their initial logical addresses, and thereby splices and recovers the file fragments.
Description
Technical Field
The invention relates to the technical field of data storage, in particular to an XFS file fragment recovery method and device.
Background
XFS, originally developed for the IRIX operating system, is a high-performance journaling file system that can keep file system data consistent in the event of a power outage or an operating system crash. XFS is a 64-bit file system that was later open-sourced and ported to the Linux operating system; CentOS 7 currently uses XFS with LVM by default. XFS offers good read and write performance on large files and is highly flexible. File fragmentation arises when a file is stored scattered across different places on the disk rather than in contiguous clusters. With the continuous development of data recovery technology, disk-based logical-layer data recovery has matured, but one great challenge remains: when a deleted file is split into multiple fragments, reorganizing and recovering its data becomes very difficult.
Patent application 201610625795.4 discloses a data reorganization and recovery method based on the XFS file system, which searches for data by locating the file linked list that XFS generates when storing data files. It mainly comprises the following steps: (1) loading and parsing disk sector information; (2) matching the file linked list structure; (3) parsing the file linked list structure; (4) reading the data at the corresponding block addresses; (5) reassembling the new file. Steps (2) to (5) are then repeated to traverse the hard disk sectors and thereby reassemble the fragments of the XFS file system. However, when that application searches for data, every block must be matched one by one against several file linked list structures, which makes the process complex; when the hard disk holds a large amount of data, recovering file fragments in this way is inefficient.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides an XFS file fragment recovery method and device, which use a directory manager and a space manager of the XFS file system to quickly determine file fragments and to splice and recover them.
The invention provides a method for recovering XFS file fragments, which comprises the following steps:
step 1, obtaining a file directory structure of a disk by using a directory manager of an XFS file system;
step 2, determining file fragments and fragment types in each disk partition according to the information entropy values of the data blocks extracted from each disk partition, wherein the fragment types comprise text fragments and image fragments;
step 3, acquiring the initial logical address of the file fragment in a file linked list;
step 4, splicing and recovering the XFS file fragments according to the initial logical address and the fragment type of the file fragments and the file directory structure.
Further, the step 2 specifically includes:
step 2.1, calculating the information entropy value H(n) of data block n according to formula (1):
wherein L represents the number of bytes contained in data block n, and p(i) represents the probability that a byte in the file fragment takes the value i;
step 2.2, if the information entropy value H(n) is larger than a set entropy threshold, judging that data block n is a file fragment;
step 2.3, determining the fragment type of the file fragment according to the set entropy interval of text fragments and the set entropy interval of image fragments.
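Formula (1) of step 2.1 is not reproduced in this text. A minimal reconstruction, assuming it is the standard Shannon entropy over the 256 possible byte values (an assumption, not confirmed by the source), would be the following, where c(i), the number of bytes in data block n whose value is i, is a symbol introduced here for illustration:

```latex
H(n) = -\sum_{i=0}^{255} p(i)\,\log_2 p(i), \qquad p(i) = \frac{c(i)}{L}
```

Terms with p(i) = 0 are taken as 0. With a base-2 logarithm, H(n) ranges from 0, when every byte of the block has the same value, to 8, when all 256 values are equally frequent.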
Further, the step 4 specifically includes:
step 4.1, traversing a file directory structure according to the fragment type, and determining a target directory corresponding to the file fragments;
step 4.2, determining the size and the data block sequence of each data block of the XFS file by using a space manager according to the target directory;
step 4.3, splicing and recovering the XFS file fragments according to the initial logical address of the file fragments, the size of each data block and the order of the data blocks.
In another aspect, the present invention provides an XFS file fragment restoring apparatus, including:
the directory acquisition module, used for acquiring a file directory structure of the disk by using a directory manager of the XFS file system;
the fragment extraction module, used for determining the file fragments and fragment types in each disk partition according to the information entropy value of each data block extracted from each disk partition, wherein the fragment types comprise text fragments and image fragments;
the address query module, used for acquiring the initial logical address of each file fragment in a file linked list;
the splicing recovery module, used for splicing and recovering the XFS file fragments according to the initial logical address and the fragment type of the file fragments and the file directory structure.
Further, the fragment extraction module specifically includes:
the entropy value calculation submodule, used for calculating the information entropy value H(n) of data block n according to formula (1):
wherein L represents the number of bytes contained in data block n, and p(i) represents the probability that a byte in the file fragment takes the value i;
the comparison submodule, used for judging that data block n is a file fragment if the information entropy value H(n) is larger than a set entropy threshold;
the fragment type determination submodule, used for determining the fragment type of the file fragment according to the set entropy interval of text fragments and the set entropy interval of image fragments.
Further, the splicing recovery module specifically includes:
the directory traversal submodule, used for traversing the file directory structure according to the fragment type and determining the target directory corresponding to the file fragments;
the sorting submodule, used for determining the size of each data block of the XFS file and the order of the data blocks by using a space manager according to the target directory;
the recovery submodule, used for splicing and recovering the XFS file fragments according to the initial logical address of the file fragments, the size of each data block and the order of the data blocks.
The invention has the beneficial effects that:
The invention provides an XFS file fragment recovery method and device. First, the file fragments of each disk partition are extracted by using the difference in information entropy between free data blocks and occupied data blocks (that is, file fragments). Next, the fragment types are distinguished by using the fact that different file types have different information entropy values, so that the traversal range can be narrowed according to the fragment type when the file directory structure is traversed, which improves efficiency. The space manager is then used to determine, according to the target directory found, the size of each data block of the XFS file and the order of the data blocks. Finally, the XFS file is spliced and recovered by combining this with the initial logical address acquired from the file linked list. The data processing of the invention is simple, and all file fragments can be recovered quickly.
Drawings
Fig. 1 is a schematic flow chart of an XFS file fragment recovery method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an XFS file fragment recovery apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a fragment extraction module according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a splicing recovery module according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Fig. 1 is a flowchart illustrating an XFS file fragment recovery method according to an embodiment of the present invention. As shown in fig. 1, the method comprises the steps of:
S101, obtaining a file directory structure of a disk by using a directory manager of an XFS file system;
Specifically, the XFS file system can be viewed as consisting of several modules: a hard drive, a volume manager, a cache, a transaction manager, a space manager, an I/O manager, a directory manager, and a system call and VNODE interface. The directory manager is responsible for managing the namespace of the XFS file system, so it can be used to obtain the file directory structure stored on the disk.
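As an illustration only, the directory structure obtained in S101 can be modelled as a tree of named entries. The sketch below is a hypothetical in-memory model: `DirNode`, `add_path` and `walk` are illustrative names, not XFS on-disk structures or APIs, and the entry paths are assumed to have already been read out of the file system.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator


@dataclass
class DirNode:
    """Hypothetical in-memory directory entry (not an XFS on-disk structure)."""
    name: str
    children: Dict[str, "DirNode"] = field(default_factory=dict)


def add_path(root: DirNode, path: str) -> None:
    """Insert a slash-separated path into the tree, creating nodes as needed."""
    node = root
    for part in filter(None, path.split("/")):
        node = node.children.setdefault(part, DirNode(part))


def walk(node: DirNode, prefix: str = "") -> Iterator[str]:
    """Yield the full path of every entry below `node`, depth first."""
    for name in sorted(node.children):
        path = f"{prefix}/{name}"
        yield path
        yield from walk(node.children[name], path)


# Example paths are made up for the illustration.
root = DirNode("")
for p in ["/home/a/notes.txt", "/home/a/pic.jpg", "/var/log/syslog"]:
    add_path(root, p)
paths = list(walk(root))
```

Keeping the structure as a tree makes it straightforward to restrict a later traversal to a subtree, which is the kind of narrowing that step S1041 relies on.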
S102, determining file fragments and fragment types in each disk partition according to information entropy values of the data blocks extracted in each disk partition, wherein the fragment types comprise text fragments and image fragments;
Specifically, according to the principle of information entropy: for the same degree of order, the more concentrated the data, the smaller the entropy value, and the more dispersed the data, the larger the entropy value; for the same amount of data, the more ordered the system, the lower the entropy value, and the more chaotic or dispersed the system, the higher the entropy value. Therefore, free data blocks, text fragments and image fragments have different entropy values. Consistent with the threshold test in step S1022, in which a block whose entropy exceeds the threshold is judged to be a file fragment, a free data block has the smallest entropy, a text fragment has a larger entropy, and an image fragment has the largest.
As an implementation manner, the step S102 specifically includes:
S1021, calculating the information entropy value H(n) of data block n according to formula (1):
wherein L represents the number of bytes contained in data block n, and p(i) represents the probability that a byte in the file fragment takes the value i;
S1022, if the information entropy value H(n) is larger than the set entropy threshold, determining that data block n is a file fragment;
S1023, determining the fragment type of the file fragment according to the set entropy interval of text fragments and the set entropy interval of image fragments.
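Steps S1021 to S1023 can be sketched as follows. This is a minimal illustration, assuming formula (1) is the Shannon entropy in bits per byte. The threshold and both intervals are placeholder values chosen for the example, not values given in the patent, and `block_entropy` and `classify_block` are hypothetical names.

```python
import math
from collections import Counter
from typing import Optional, Tuple

# All numeric bounds below are illustrative placeholders, not values from the patent.
ENTROPY_THRESHOLD = 1.0                            # S1022: above this, the block is a fragment
TEXT_INTERVAL: Tuple[float, float] = (1.0, 6.0)    # assumed interval for text fragments
IMAGE_INTERVAL: Tuple[float, float] = (6.0, 8.0)   # assumed interval for image fragments


def block_entropy(block: bytes) -> float:
    """S1021: Shannon entropy in bits per byte, with p(i) = count(i) / L."""
    length = len(block)
    if length == 0:
        return 0.0
    probabilities = (count / length for count in Counter(block).values())
    return sum(-p * math.log2(p) for p in probabilities)


def classify_block(block: bytes) -> Optional[str]:
    """Return 'text' or 'image', or None if the block is not judged a fragment."""
    h = block_entropy(block)
    if h <= ENTROPY_THRESHOLD:
        return None                                # S1022 not satisfied
    if TEXT_INTERVAL[0] < h <= TEXT_INTERVAL[1]:   # S1023, text interval
        return "text"
    if IMAGE_INTERVAL[0] < h <= IMAGE_INTERVAL[1]: # S1023, image interval
        return "image"
    return None


zero_block = bytes(4096)                           # every byte is 0x00
text_block = b"the quick brown fox jumps over the lazy dog " * 100
uniform_block = bytes(range(256)) * 16             # all byte values equally frequent
labels = [classify_block(b) for b in (zero_block, text_block, uniform_block)]
```

In practice the threshold and the intervals would have to be calibrated on known samples of each file type; the uniform block here only stands in for high-entropy data and is not a real image.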
S103, acquiring an initial logical address of the file fragment in a file linked list;
S104, splicing and recovering the XFS file fragments according to the initial logical addresses of the file fragments and the file directory structure.
Specifically, as an implementation manner, the step S104 specifically includes:
S1041, traversing the file directory structure according to the fragment type, and determining the target directory corresponding to the file fragments;
S1042, determining the size of each data block of the XFS file and the order of the data blocks according to the target directory by using a space manager;
Specifically, the space manager is responsible for allocating and releasing the free space of the XFS file system, and traversing the target directory yields an ordered list of entries describing the data blocks. By combining the space manager with the target directory, the storage state of the file to be recovered can be determined, such as the size of each of its data blocks and the logical order in which the data blocks are connected.
S1043, splicing and recovering the XFS file fragments according to the initial logical addresses of the file fragments, the size of each data block and the order of the data blocks.
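The splicing in S1043 can be sketched as reading each fragment at its initial logical address and concatenating the fragments in data block order. The sketch assumes that the address, size and order of every fragment are already known from S103 and S1042, and that the logical address can be treated as a byte offset into a disk image. `FragmentExtent` and `splice` are hypothetical names introduced for the illustration.

```python
import io
from dataclasses import dataclass
from typing import BinaryIO, Iterable


@dataclass(frozen=True)
class FragmentExtent:
    """Hypothetical record for one fragment of a file to be recovered."""
    order: int          # position of the fragment within the file (S1042)
    start_address: int  # initial logical address, treated as a byte offset (S103)
    size: int           # fragment size in bytes (S1042)


def splice(image: BinaryIO, extents: Iterable[FragmentExtent]) -> bytes:
    """Read each fragment from the disk image and join them in file order."""
    parts = []
    for ext in sorted(extents, key=lambda e: e.order):
        image.seek(ext.start_address)
        data = image.read(ext.size)
        if len(data) != ext.size:
            raise ValueError(f"short read at offset {ext.start_address}")
        parts.append(data)
    return b"".join(parts)


# Toy "disk image": the file "HELLO WORLD" stored as three out-of-order fragments.
disk = io.BytesIO(b"....WORLD....HELLO.... ....")
extents = [
    FragmentExtent(order=2, start_address=4, size=5),
    FragmentExtent(order=0, start_address=13, size=5),
    FragmentExtent(order=1, start_address=22, size=1),
]
recovered = splice(disk, extents)
```

Sorting by `order` rather than by address is the point of the example: the fragments sit on the disk in a different sequence from the one in which they belong in the file.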
The XFS file fragment recovery method provided by the invention first extracts the file fragments of each disk partition by using the difference in information entropy between free data blocks and occupied data blocks (that is, file fragments). It then distinguishes the fragment types by using the fact that different file types have different information entropy values, so that the traversal range can be narrowed according to the fragment type when the file directory structure is traversed, which improves efficiency. The space manager is then used to determine, according to the target directory found, the size of each data block of the XFS file and the order of the data blocks. Finally, the XFS file is spliced and recovered by combining this with the initial logical address acquired from the file linked list. The data processing of the invention is simple, and all file fragments can be recovered quickly.
Fig. 2 is a schematic structural diagram of an XFS file fragment recovery apparatus according to an embodiment of the present invention. As shown in Fig. 2, the apparatus comprises a directory acquisition module 201, a fragment extraction module 202, an address query module 203 and a splicing recovery module 204. Wherein:
The directory acquisition module 201 acquires the file directory structure of the disk by using the directory manager of the XFS file system; the fragment extraction module 202 determines the file fragments and fragment types in each disk partition according to the information entropy value of each data block extracted from each disk partition, wherein the fragment types include text fragments and image fragments; the address query module 203 acquires the initial logical address of each file fragment in a file linked list; and the splicing recovery module 204 splices and recovers the XFS file fragments according to the initial logical address of the file fragments and the file directory structure.
Specifically, as shown in Fig. 3, as an implementation manner, the fragment extraction module 202 comprises an entropy value calculation submodule 2021, a comparison submodule 2022 and a fragment type determination submodule 2023. Wherein:
The entropy value calculation submodule 2021 calculates the information entropy value H(n) of data block n according to formula (1):
wherein L represents the number of bytes contained in data block n, and p(i) represents the probability that a byte in the file fragment takes the value i. The comparison submodule 2022 determines that data block n is a file fragment if the information entropy value H(n) is larger than the set entropy threshold; the fragment type determination submodule 2023 determines the fragment type of the file fragment according to the set entropy interval of text fragments and the set entropy interval of image fragments.
As shown in Fig. 4, as an implementation manner, the splicing recovery module 204 comprises a directory traversal submodule 2041, a sorting submodule 2042 and a recovery submodule 2043. Wherein:
The directory traversal submodule 2041 traverses the file directory structure according to the fragment type and determines the target directory corresponding to the file fragments; the sorting submodule 2042 determines the size of each data block of the XFS file and the order of the data blocks according to the target directory by using the space manager; the recovery submodule 2043 splices and recovers the XFS file fragments according to the initial logical address of the file fragments, the size of each data block and the order of the data blocks.
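The way modules 201 to 204 cooperate can be sketched as a small pipeline that calls them in the order of steps S101 to S104. Everything in the sketch is hypothetical: the `RecoveryDevice` class, the simplified `Directory` and `Fragment` types and the stub lambdas are placeholders that only show the data flow between modules, not a real implementation of any of them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical, simplified types for illustration only.
Directory = Dict[str, List[str]]   # directory path -> file names
Fragment = Dict[str, object]       # e.g. {"block": 3, "type": "text"}


@dataclass
class RecoveryDevice:
    """Wires modules 201-204 together in the order of steps S101-S104."""
    acquire_directory: Callable[[], Directory]                     # module 201
    extract_fragments: Callable[[], List[Fragment]]                # module 202
    query_address: Callable[[Fragment], int]                       # module 203
    splice_recover: Callable[[Directory, List[Fragment]], bytes]   # module 204

    def run(self) -> bytes:
        directory = self.acquire_directory()
        fragments = self.extract_fragments()
        for frag in fragments:
            # Attach the initial logical address found by the address query module.
            frag["start"] = self.query_address(frag)
        return self.splice_recover(directory, fragments)


# Stub modules standing in for real implementations (made-up data).
device = RecoveryDevice(
    acquire_directory=lambda: {"/docs": ["a.txt"]},
    extract_fragments=lambda: [{"block": 7, "type": "text"}, {"block": 2, "type": "text"}],
    query_address=lambda frag: int(frag["block"]) * 4096,
    splice_recover=lambda d, frags: b"|".join(
        str(f["start"]).encode() for f in sorted(frags, key=lambda f: f["start"])
    ),
)
result = device.run()
```

Passing the modules in as callables keeps each one replaceable, which mirrors the description of the apparatus as a set of separate modules with one responsibility each.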
It should be noted that the XFS file fragment recovery apparatus provided in the embodiment of the present invention is for implementing the method described above, and the functions of the apparatus may specifically refer to the method described above, and are not described herein again.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (2)
1. An XFS file fragment recovery method, characterized by comprising the following steps:
step 1, obtaining a file directory structure of a disk by using a directory manager of an XFS file system;
step 2, determining file fragments and fragment types in each disk partition according to the information entropy values of the data blocks extracted in each disk partition, wherein the fragment types comprise text fragments and image fragments; the step 2 specifically comprises the following steps:
step 2.1, calculating the information entropy value H(n) of data block n according to formula (1):
wherein L represents the number of bytes contained in data block n, and p(i) represents the probability that a byte in the file fragment takes the value i;
step 2.2, if the information entropy value H (n) is larger than a set entropy value threshold, judging that the data block n is a file fragment;
step 2.3, determining the fragment type of the file fragment according to the set entropy interval of the text fragment and the set entropy interval of the image fragment;
step 3, acquiring the initial logical address of the file fragment in a file linked list;
step 4, splicing and recovering the XFS file fragments according to the initial logical address and the fragment type of the file fragments and the file directory structure; the method specifically comprises the following steps:
step 4.1, traversing the file directory structure according to the fragment type, and determining a target directory corresponding to the file fragments;
step 4.2, determining the size and the data block sequence of each data block of the XFS file by using a space manager according to the target directory;
step 4.3, splicing and recovering the XFS file fragments according to the initial logical address of the file fragments, the size of each data block and the order of the data blocks.
2. An XFS file fragment recovery apparatus, comprising:
the directory acquisition module is used for acquiring a file directory structure of the disk by using a directory manager of the XFS file system;
the fragment extraction module is used for determining file fragments and fragment types in each disk partition according to the information entropy of each data block extracted in each disk partition, wherein the fragment types comprise text fragments and image fragments; the fragment extraction module specifically comprises:
the entropy value calculation submodule, used for calculating the information entropy value H(n) of data block n according to formula (1):
wherein L represents the number of bytes contained in data block n, and p(i) represents the probability that a byte in the file fragment takes the value i;
the comparison submodule judges the data block n as a file fragment if the information entropy value H (n) is larger than a set entropy value threshold;
the fragment type judgment submodule determines the fragment type of the file fragment according to the set entropy interval of the text fragment and the set entropy interval of the image fragment;
the address query module is used for acquiring the initial logical address of the file fragment in a file linked list;
the splicing recovery module is used for splicing and recovering the XFS file fragments according to the initial logical address and the fragment type of the file fragments and the file directory structure; the splicing recovery module specifically comprises:
the directory traversal submodule traverses the file directory structure according to the fragment type and determines the target directory corresponding to the file fragments;
the sorting sub-module determines the size and the data block sequence of each data block of the XFS file according to the target directory by using a space manager;
the recovery submodule is used for splicing and recovering the XFS file fragments according to the initial logical address of the file fragments, the size of each data block and the order of the data blocks.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910076494.4A CN109828866B (en) | 2019-01-26 | 2019-01-26 | XFS file fragment recovery method and device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910076494.4A CN109828866B (en) | 2019-01-26 | 2019-01-26 | XFS file fragment recovery method and device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109828866A (en) | 2019-05-31 |
| CN109828866B (en) | 2023-04-14 |
Family
ID=66862436
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910076494.4A Active CN109828866B (en) | 2019-01-26 | 2019-01-26 | XFS file fragment recovery method and device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109828866B (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114862667B (en) * | 2021-02-04 | 2024-10-18 | 西安电子科技大学青岛计算技术研究院 | Two-dimensional fragment splicing system based on OpenCV image processing |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6173291B1 (en) * | 1997-09-26 | 2001-01-09 | Powerquest Corporation | Method and apparatus for recovering data from damaged or corrupted file storage media |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR100308874B1 (en) * | 1999-11-17 | 2001-11-07 | 이채홍 | Data recovering method from a fragmented data of a hard disc and computer readable medium the same |
| CN102622302B (en) * | 2011-01-26 | 2014-10-29 | 中国科学院高能物理研究所 | Recognition method for fragment data type |
| CN106155845B (en) * | 2016-08-02 | 2023-03-28 | 四川效率源信息安全技术股份有限公司 | XFS-based file system data reorganization and recovery method |
Also Published As
| Publication number | Publication date |
|---|---|
| CN109828866A (en) | 2019-05-31 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11334522B2 (en) | Distributed write journals that support fast snapshotting for a distributed file system | |
| US11416452B2 (en) | Determining chunk boundaries for deduplication of storage objects | |
| US9977746B2 (en) | Processing of incoming blocks in deduplicating storage system | |
| JP5732536B2 (en) | System, method and non-transitory computer-readable storage medium for scalable reference management in a deduplication-based storage system | |
| US9678688B2 (en) | System and method for data deduplication for disk storage subsystems | |
| US8447938B2 (en) | Backing up a deduplicated filesystem to disjoint media | |
| US8583599B2 (en) | Reducing data duplication in cloud storage | |
| US11221921B2 (en) | Method, electronic device and computer readable storage medium for data backup and recovery | |
| US20090132616A1 (en) | Archival backup integration | |
| US9940331B1 (en) | Proactive scavenging of file system snaps | |
| TW201101021A (en) | System and method for data deduplication | |
| US9679007B1 (en) | Techniques for managing references to containers | |
| KR101674176B1 (en) | Method and apparatus for fsync system call processing using ordered mode journaling with file unit | |
| CN105183399A (en) | Data writing and reading method and device based on elastic block storage | |
| US10430383B1 (en) | Efficiently estimating data compression ratio of ad-hoc set of files in protection storage filesystem with stream segmentation and data deduplication | |
| WO2021082926A1 (en) | Data compression method and apparatus | |
| WO2015096847A1 (en) | Method and apparatus for context aware based data de-duplication | |
| US11392617B2 (en) | Recovering from a failure of an asynchronous replication node | |
| CN109828866B (en) | XFS file fragment recovery method and device | |
| US11836388B2 (en) | Intelligent metadata compression | |
| US9984112B1 (en) | Dynamically adjustable transaction log | |
| US9436697B1 (en) | Techniques for managing deduplication of data | |
| US9811545B1 (en) | Storage of sparse files using parallel log-structured file system | |
| CN113220211A (en) | Data storage system, data access method and related device | |
| CN111625186B (en) | Data processing method, device, electronic equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |