WO2024169286A1 - Copy-on-write method, apparatus and device, and non-volatile readable storage medium - Google Patents
Copy-on-write method, apparatus and device, and non-volatile readable storage medium
- Publication number
- WO2024169286A1 (PCT/CN2023/131854)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- copy
- bitmap
- data block
- size
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
Definitions
- the present application relates to the technical field of cloud computing, and in particular to a copy-on-write method; and also to a copy-on-write device, equipment, and non-volatile readable storage medium.
- OCFS2 (the full name is The Oracle Clustered File System Version 2) is a general-purpose, extent-based shared-disk cluster file system. It uses network and disk-based heartbeats to determine whether the nodes in the cluster are available. It provides cross-node (server) disk space management and presents a unified view to users.
- the shared disk cluster file system needs to use a locking mechanism to serialize disk space management, that is, to prevent nodes from modifying the metadata of the file system at the same time.
- the disk is divided into several cluster groups for management. Each cluster group contains several clusters, and each cluster contains several blocks. OCFS2 supports block sizes from 512 bytes to 4KB and file system cluster sizes from 4KB to 1MB.
- In cloud computing scenarios, traditional data backup solutions mainly back up data periodically, including full backup and incremental backup.
- the main solution for incremental backup is to use reflink technology.
- the basic principle of reflink is that when copying files, only a new inode (index node) is created to make it independent of the inode of the source file, but the newly created inode and the inode of the source file will still share data block (cluster in OCFS2) information. Only when the data block is modified will the source data block information and the modified part of the information be copied to the new data block.
- the minimum copy granularity of a data block is the size of a cluster.
- the purpose of the present application is to provide a copy-on-write method that can improve the execution speed of the copy-on-write and improve the overall response capability of the system.
- Another purpose of the present application is to provide a copy-on-write device, equipment and non-volatile readable storage medium, all of which have the above technical effects.
- the present application provides a copy-on-write method, comprising:
- the block bitmap records the first address of the new data block, the first address of the source data block, and the update status of each data block in the new data block.
- the target data in the source data block is copied to the new data block with block as the minimum granularity.
- determining whether the block bitmap feature is enabled includes:
- the block bitmap feature is enabled.
- a flag indicating that the block bitmap feature is enabled is recorded in the node.
- a bit in the block bitmap indicates an update status of a data block of the new data block.
- if the data block in the new data block is not updated, the corresponding bit is zero; if the data block in the new data block is updated, the corresponding bit is one.
- the block bitmap also records the length of the bitmap used to record the update status of each data block of the new data block.
- New data blocks are allocated with cluster as the smallest unit.
- the first address of the storage space is recorded in the node.
- the data reading object is determined, and data is read from the data reading object.
- if the block bitmap feature is enabled, the block bitmap is consulted based on the first address of the storage space.
- the target data is copied to a new data block with cluster as the minimum granularity.
- the blocks to be copied are copied to new data blocks with cluster as the minimum granularity.
- the present application also provides a copy-on-write device, comprising:
- the acquisition module is set to obtain the size of the cluster and the size of the block;
- the creation module is configured to create a block bitmap if the ratio of the cluster size to the block size reaches a preset threshold
- the copy module is set to use block as the minimum granularity to copy the target data in the source data block to the new data block;
- the recording module is configured to record the first address of the new data block and the first address of the source data block and the update status of each data block in the new data block in the block bitmap.
- the present application also provides a copy-on-write device, including:
- a memory arranged to store a computer program
- a processor is configured to implement the steps of any of the above copy-on-write methods when executing a computer program.
- the present application also provides a non-volatile readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of the copy-on-write method as described in any one of the above items are implemented.
- the copy-on-write method provided by the present application includes: obtaining the size of a cluster and the size of a block; if the ratio of the size of the cluster to the size of the block reaches a preset threshold, creating a block bitmap; using block as the minimum granularity, copying the target data in the source data block to the new data block; and recording, in the block bitmap, the first address of the new data block, the first address of the source data block, and the update status of each data block in the new data block.
- the copy-on-write method copies data with block as the minimum granularity when the size of the cluster is much larger than the size of the block, which avoids invalid data copying and improves the execution speed of data copying.
- by creating a block bitmap to record the first address of the new data block, the first address of the source data block, and the data blocks in the new data block where data updates occur, it can be ensured that data access can be performed normally after the data is copied with block as the minimum granularity.
- the copy-on-write device, equipment, and non-volatile readable storage medium provided in this application all have the above-mentioned technical effects.
- FIG1 is a schematic flow chart of a copy-on-write method provided in an embodiment of the present application.
- FIG2 is a schematic diagram of a copy-on-write principle provided by an embodiment of the present application.
- FIG3 is a schematic diagram of an optional copy-on-write method provided in an embodiment of the present application.
- FIG4 is a schematic diagram of a copy-on-write device provided in an embodiment of the present application.
- FIG5 is a schematic diagram of a copy-on-write device provided in an embodiment of the present application.
- the core of this application is to provide a copy-on-write method, which can improve the execution speed of copy-on-write and improve the overall response capability of the system.
- Another core of this application is to provide a copy-on-write device, equipment and non-volatile readable storage medium, all of which have the above technical effects.
- FIG. 1 is a flow chart of a copy-on-write method provided in an embodiment of the present application.
- the method mainly includes:
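- S101: Get the size of the cluster and the size of the block.
- S102: If the ratio of the size of the cluster to the size of the block reaches a preset threshold, create a block bitmap.
- S103: Using block as the minimum granularity, copy the target data in the source data block to the new data block.
- S104: Record the first address of the new data block, the first address of the source data block, and the update status of each data block in the new data block in the block bitmap.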
- This embodiment aims to copy data with block as the minimum granularity when the size of cluster is much larger than the size of block.
- this embodiment creates a block bitmap when the size of cluster is much larger than the size of block, and records the first address of the new data block and the first address of the source data block and the data block where data update occurs in the new data block through the block bitmap.
- Target data refers to the data in the source data block that needs to be updated.
- Steps S101 and S102 can be executed after the upper layer application triggers the file copy action.
- the lower layer can first obtain the cluster_size of OCFS2, i.e. the size of the cluster, the block_size, i.e. the size of the block, and the preset threshold (the preset threshold can be given when making the file system or executing the reflink snapshot) when creating the inode node and filling in the member information. If the ratio of cluster_size to block_size is greater than or equal to the preset threshold, a block bitmap is created. If the ratio of cluster_size to block_size is less than the preset threshold, no block bitmap is created and the copy-on-write continues to use the existing implementation scheme.
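The threshold decision described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function name `should_create_block_bitmap` and the validation of the inputs are assumptions; only `cluster_size`, `block_size`, and the "greater than or equal to the preset threshold" rule come from the text.

```python
def should_create_block_bitmap(cluster_size: int, block_size: int, threshold: int) -> bool:
    """Return True when cluster_size / block_size reaches the preset threshold.

    Hypothetical helper: the name and the input checks are illustrative only.
    """
    if block_size <= 0 or cluster_size <= 0 or cluster_size % block_size != 0:
        raise ValueError("cluster_size must be a positive multiple of block_size")
    # Ratio >= threshold: create the block bitmap; otherwise keep the
    # existing cluster-granularity copy-on-write.
    return cluster_size // block_size >= threshold
```

For example, with a 1MB cluster, a 4KB block, and a threshold of 256, the ratio is exactly 256 and the block bitmap would be created; with a 4KB cluster and a 4KB block it would not.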
- the method further includes: if the ratio of the size of the cluster to the size of the block reaches a preset threshold, recording a mark in the node for indicating that the block bitmap feature is enabled.
- the feature flag BLKMAP is defined and recorded in the ocfs2_extend_rec.e_flags member of the inode node, indicating that the block bitmap is enabled. If the ratio of cluster_size to block_size is less than the preset threshold, no marking is performed, indicating that the block bitmap is not enabled and the existing implementation of copy-on-write is still used.
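A feature flag stored in a flags member is typically a single bit that is set when the feature is enabled and tested later. The sketch below assumes this; the bit value `0x04` and the helper names are hypothetical, since the text names the BLKMAP flag and the `e_flags` member but does not give a value.

```python
BLKMAP = 0x04  # hypothetical bit value; the source does not specify one


def enable_blkmap(e_flags: int) -> int:
    """Return e_flags with the BLKMAP bit set, leaving other bits untouched."""
    return e_flags | BLKMAP


def blkmap_enabled(e_flags: int) -> bool:
    """Return True if the BLKMAP bit is recorded in e_flags."""
    return bool(e_flags & BLKMAP)
```

When the ratio is below the preset threshold, nothing is written to the flags, so `blkmap_enabled` stays false and the existing copy-on-write path is used.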
- the data structure of the block bitmap includes blkno, ref_blkno and bitmap.
- blkno is used to record the first address of the new data block
- ref_blkno is used to record the first address of the source data block (shared data block)
- bitmap is used to record the update status of each data block in the new data block. Based on the bitmap, it can be known which data blocks in the new data block (such as data clusters) are updated and which data blocks are not updated.
- a bit in the bitmap of the block bitmap indicates the update status of one data block of the new data block.
- if the data block in the new data block is not updated, the corresponding bit is zero; if the data block in the new data block is updated, the corresponding bit is one.
- each bit in the bitmap corresponds to a block in the new data block. If some block data in the new data block is updated, the corresponding bit in the bitmap is 1, otherwise the bit is cleared to 0.
- the length of the bitmap is equal to the number of blocks in the new data block.
- the block bitmap further records the length of the bitmap used to record the update status of each data block of the new data block.
- bitmap is a bitmap.
- the data structure of the block bitmap includes blkno, ref_blkno, map_size and bitmap.
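The four fields can be pictured as a small record. The following is a sketch under stated assumptions: the field names `blkno`, `ref_blkno`, `map_size`, and `bitmap` come from the text, while the class name, the byte-packed layout of the bits, and the helper methods are illustrative choices and not the on-disk format.

```python
from dataclasses import dataclass, field


@dataclass
class BlockBitmap:
    blkno: int       # first address of the new data block
    ref_blkno: int   # first address of the source (shared) data block
    map_size: int    # number of bits, one per block in the new data block
    bitmap: bytearray = field(init=False)

    def __post_init__(self) -> None:
        # All bits start at 0: no block of the new data block is updated yet.
        self.bitmap = bytearray((self.map_size + 7) // 8)

    def _check(self, index: int) -> None:
        if not 0 <= index < self.map_size:
            raise IndexError("block index out of range")

    def mark_updated(self, index: int) -> None:
        """Set the bit for the block at the given index to 1."""
        self._check(index)
        self.bitmap[index // 8] |= 1 << (index % 8)

    def is_updated(self, index: int) -> bool:
        """Return True if the block at the given index has been updated."""
        self._check(index)
        return bool(self.bitmap[index // 8] & (1 << (index % 8)))
```

With `map_size` set to 256, the bitmap occupies 32 bytes, which fits comfortably in a single block.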
- Steps S103 and S104 can be executed when the upper layer application modifies the reflink snapshot file.
- the lower layer allocates new data blocks with cluster as the smallest unit and block as the smallest granularity, and only copies the block data that needs to be updated in the source data block to the new data block, and updates the bitmap in the block bitmap at the same time.
- the refcount value of the reflink count tree is updated synchronously.
- copying data in the source data block to the new data block further includes:
- the target data in the source data block is copied to the new data block with block as the minimum granularity.
- judging whether the block bitmap feature is enabled includes:
- the block bitmap feature is enabled.
- the bottom layer first determines whether the block bitmap in the inode of the reflink snapshot file has been enabled. If not, the existing copy-on-write implementation scheme is used, that is, the data in the source data block is copied to the new data block with cluster as the minimum granularity, and the refcount value of the reflink count tree is updated at the same time. If enabled, only the block data that needs to be updated in the source data block is copied to the new data block with block as the minimum granularity, and the bitmap in the block bitmap is updated at the same time. In addition, if all data in the source data block has been updated, the refcount value of the reflink count tree is updated synchronously.
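The two write paths can be contrasted with a toy model. This is a simplified sketch, not the OCFS2 code: the disk is modeled as a dictionary from block number to `bytearray`, `BLOCKS_PER_CLUSTER` is shrunk to 4 for readability, refcount handling is omitted, and the function name and parameters are hypothetical.

```python
BLOCKS_PER_CLUSTER = 4  # kept tiny for illustration; 256 in the 1M/4K case


def cow_write(disk, updated, blkno, ref_blkno, index, offset, payload, blkmap_enabled):
    """Write payload into block `index` of the new cluster; return blocks copied.

    disk:    dict mapping block number -> bytearray (toy model of the device)
    updated: list of booleans standing in for the bitmap of the block bitmap
    """
    if blkmap_enabled:
        # Block granularity: copy only the block being modified, and only once.
        to_copy = [] if updated[index] else [index]
    else:
        # Existing scheme: copy the whole cluster on the first write.
        to_copy = [] if blkno in disk else list(range(BLOCKS_PER_CLUSTER))
    for i in to_copy:
        disk[blkno + i] = bytearray(disk[ref_blkno + i])
    block = disk[blkno + index]
    block[offset:offset + len(payload)] = payload  # apply the modification
    if blkmap_enabled:
        updated[index] = True
    return len(to_copy)
```

In this model a two-byte write copies one block when the block bitmap is enabled and four blocks (the whole cluster) when it is not, which is the saving the embodiment is after.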
- the following further includes:
- the first address of the storage space is recorded in the node.
- the block bitmap is linked to the inode node. For example, a block space is used to store the block bitmap information, and a new member i_blockcount_loc is added under ocfs2_extend_rec in the inode node to record the first address blockno of the data block where the block bitmap information is located.
- the data reading object is determined, and data is read from the data reading object.
- This embodiment aims to index the new data block according to the block bitmap.
- the lower layer consults the block bitmap to confirm which block data is read from the new data block and which block data is read from the source data block, and then reads data from the source data block or the new data block.
- if the block bitmap feature is enabled, the block bitmap is consulted based on the first address of the storage space.
- the bottom layer first determines whether the block bitmap in the inode of the reflink snapshot file has been enabled. If the block bitmap is not enabled, the data is retrieved according to the record information of the reflink count tree. If the block bitmap is enabled, the block bitmap is consulted to confirm which block data is read from the new data block and which block data is read from the source data block, and then the data is read in a targeted manner.
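The read path reduces to a per-block choice between two locations. The sketch below uses a toy model with a dictionary as the disk and a list of booleans as the bitmap; the function name and parameters are hypothetical, and the non-bitmap branch simply reads from the extent's own address as a stand-in for the lookup through the reflink count tree.

```python
def read_block(disk, updated, blkno, ref_blkno, index, blkmap_enabled):
    """Return the data for block `index` of the snapshot file's cluster.

    disk:    dict mapping block number -> bytes (toy model of the device)
    updated: list of booleans standing in for the bitmap of the block bitmap
    """
    if not blkmap_enabled:
        # Simplification: without the block bitmap, the extent's own address
        # is authoritative (the real lookup goes through the reflink count tree).
        return disk[blkno + index]
    if updated[index]:
        return disk[blkno + index]      # bit is 1: read from the new data block
    return disk[ref_blkno + index]      # bit is 0: read from the shared source block
```

Blocks whose bit is 0 were never copied, so they must not be read from the new data block even though space for them has been allocated there.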
- Determining whether the block bitmap feature is enabled may include:
- the block bitmap feature is enabled.
- in the second judgment, it is first determined whether the inode of the reflink snapshot file records a mark indicating that the block bitmap feature is enabled. If the inode does not record such a mark, data is retrieved according to the record information of the reflink count tree. If the inode records such a mark, the block bitmap is consulted to confirm which block data is read from the new data block and which block data is read from the source data block, and then the data is read from the source data block or the new data block.
- the method further includes:
- the target data is copied to a new data block with cluster as the minimum granularity.
- for example, before copying the current data, the size of the target data can also be analyzed. If the size of the target data reaches a first preset value, the amount of target data is large, and copying it with block as the minimum granularity would require many copy operations. In this case, the data can be copied with cluster as the minimum granularity instead.
- the method further includes:
- the blocks to be copied are copied to new data blocks with cluster as the minimum granularity.
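The fallback to cluster granularity can be sketched as a simple selector. The text describes two separate optional checks (target data size against a first preset value, number of blocks to be copied against a second preset value); combining them with a logical OR in one function, as well as the function name and return values, are assumptions made for illustration.

```python
def choose_granularity(target_bytes: int, blocks_to_copy: int,
                       first_preset: int, second_preset: int) -> str:
    """Pick the copy granularity for one copy-on-write operation.

    Hypothetical helper: returns "cluster" when either the target data size
    or the number of blocks to copy reaches its preset value, else "block".
    """
    if target_bytes >= first_preset or blocks_to_copy >= second_preset:
        return "cluster"
    return "block"
```

A small, scattered update stays on the block path, while an update that touches most of a cluster is cheaper to handle as one cluster-sized copy.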
- OCFS2 cluster_size: 1M
- block_size: 4K
- preset threshold HT: 256
- the 4M data managed by the node inodeB of a certain file is taken as the research object.
- inodeB.extent_rec: extent_record, the data segment record
- the block bitmap feature is adaptively configured according to the cluster_size and block_size of OCFS2 and the preset threshold.
- When the upper layer application is ready to modify the data in the 0th and 2nd blocks in the 5th cluster in the reflink snapshot file (i.e., the two data blocks with blockno 1280 and 1282), the bottom layer first checks the block bitmap feature flag in the inodeA of the file, which has been enabled in this embodiment. Then, a new data block is allocated with cluster as the minimum unit, and the block bitmap information is updated at the same time.
- blkno records the starting address of the new data block: 3584
- ref_blkno records the starting address of the source data block to be updated: 1280
- map_size records the length of the new data block: 256 (a new cluster is allocated, with a total of 256 blocks)
- the bitmap is 256 consecutive bits, initialized to 0, indicating that the block data in the new data block has not yet been updated and written.
- After the copy-on-write is completed, when the upper-layer application reads the reflink snapshot file, the bottom layer first checks the block bitmap feature flag in the inodeA of the file, which has been enabled in this embodiment. Then it traverses the block bitmap. When a bit in the bitmap is set to 1, the corresponding block data is read from the new data block; otherwise the block data is still read from the source data block (shared data block).
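The numbers in this embodiment can be checked with a short calculation. The values 1M, 4K, 1280, 1282, and 3584 come from the text; the variable names, the list-based bitmap, and the `resolve` helper are illustrative assumptions.

```python
CLUSTER_SIZE = 1 << 20   # 1M
BLOCK_SIZE = 4 << 10     # 4K
blocks_per_cluster = CLUSTER_SIZE // BLOCK_SIZE  # bits needed in the bitmap

ref_blkno = 5 * blocks_per_cluster  # first block of the source cluster
blkno = 3584                        # first block of the newly allocated cluster

bitmap = [0] * blocks_per_cluster   # initialized to 0
for modified in (1280, 1282):       # the two blocks the application modifies
    bitmap[modified - ref_blkno] = 1

copied_bytes = sum(bitmap) * BLOCK_SIZE


def resolve(index: int) -> int:
    """Return the block number to read for block `index` of this cluster."""
    return blkno + index if bitmap[index] else ref_blkno + index
```

Here `blocks_per_cluster` is 256, which matches the preset threshold, so the block bitmap is enabled. Bits 0 and 2 are set, so reads of those blocks resolve to 3584 and 3586, while block 1 still resolves to 1281 in the shared source cluster. Only 8K of data is copied instead of the full 1M cluster.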
- the copy-on-write method provided by the present application includes: obtaining the size of the cluster and the size of the block; if the ratio of the size of the cluster to the size of the block reaches a preset threshold, then creating a block bitmap; using block as the minimum granularity, copying the data in the source data block to the new data block; recording the first address of the new data block and the first address of the source data block, as well as the data block where data updates occur in the new data block in the block bitmap. It can be seen that the copy-on-write method provided by the present application uses block as the minimum granularity to copy data when the size of the cluster is much larger than the size of the block, thereby avoiding invalid data copying and improving the execution speed of data copying.
- the present application also provides a copy-on-write device, and the device described below can be referred to in correspondence with the method described above.
- Figure 4 is a schematic diagram of a copy-on-write device provided in an embodiment of the present application. As shown in Figure 4, the device includes:
- An acquisition module 10 is configured to acquire the size of a cluster and the size of a block
- a creation module 20 is configured to create a block bitmap if the ratio of the size of the cluster to the size of the block reaches a preset threshold
- the copy module 30 is configured to copy the target data in the source data block to the new data block with block as the minimum granularity
- the recording module 40 is configured to record the first address of the new data block and the first address of the source data block and the update status of each data block in the new data block in the block bitmap.
- This embodiment aims to copy data with block as the minimum granularity when the size of cluster is much larger than the size of block.
- this embodiment creates a block bitmap when the size of cluster is much larger than the size of block, and records the first address of the new data block and the first address of the source data block and the data block where data update occurs in the new data block through the block bitmap.
- when the lower layer creates the inode node and fills in the member information, the acquisition module 10 can first obtain the cluster_size of OCFS2, i.e., the size of the cluster, the block_size, i.e., the size of the block, and a preset threshold (the preset threshold can be given when making the file system or executing the reflink snapshot), and calculate the ratio of cluster_size to block_size. If the ratio of cluster_size to block_size is greater than or equal to the preset threshold, the creation module 20 creates a block bitmap. If the ratio of cluster_size to block_size is less than the preset threshold, the creation module 20 does not create a block bitmap, and the copy-on-write continues to use the existing implementation scheme.
- the data structure of the block bitmap includes blkno, ref_blkno and bitmap.
- blkno is used to record the first address of the new data block
- ref_blkno is used to record the first address of the source data block (shared data block)
- bitmap is used to record which data blocks in the new data block are updated.
- the copy module 30 uses block as the minimum granularity and only copies the block data that needs to be updated in the source data block to the new data block, and the recording module 40 updates the bitmap in the block bitmap.
- a first determination module is configured to determine whether the block bitmap feature is enabled
- the copy module 30 copies the target data in the source data block to the new data block with the block as the minimum granularity.
- the lower layer first determines whether the block bitmap in the inode of the reflink snapshot file has been enabled. If not, the existing copy-on-write implementation scheme is used, that is, the data in the source data block is copied to the new data block with cluster as the minimum granularity, and the refcount value of the reflink count tree is updated at the same time. If enabled, block is used as the minimum granularity, only the block data that needs to be updated in the source data block is copied to the new data block, and the bitmap in the block bitmap is updated at the same time.
- the first judgment module includes:
- a judging unit configured to judge whether a mark representing enabling a block bitmap feature is recorded in the node
- the determination unit is configured to enable the block bitmap feature if a flag indicating that the block bitmap feature is enabled is recorded in the node.
- the marking module is configured to record a mark in the node for indicating that the block bitmap feature is enabled if the ratio of the size of the cluster to the size of the block reaches a preset threshold.
- the feature flag BLKMAP is defined and recorded in the ocfs2_extend_rec.e_flags member of the inode node, indicating that the block bitmap is enabled. If the ratio of cluster_size to block_size is less than the preset threshold, no marking is performed, indicating that the block bitmap is not enabled and the existing implementation of copy-on-write is still used.
- a bit in the block bitmap represents an update status of a data block of the new data block.
- if the data block in the new data block is not updated, the corresponding bit is zero; if the data block in the new data block is updated, the corresponding bit is one.
- each bit in the bitmap corresponds to a block in the new data block. If some block data in the new data block is updated, the corresponding bit in the bitmap is 1, otherwise the bit is cleared to 0.
- the length of the bitmap is equal to the number of blocks in the new data block.
- the block bitmap further records the length of the bitmap used to record the update status of each data block of the new data block.
- bitmap is a bitmap.
- the data structure of the block bitmap includes blkno, ref_blkno, map_size and bitmap.
- the first allocation module is configured to allocate new data blocks using cluster as the minimum unit.
- the second allocation module is configured to allocate storage space for storing information recorded in the block bitmap.
- the node recording module is configured to record the first address of the storage space in the node.
- a block space is used alone to store block bitmap information, and a new member i_blockcount_loc is added under ocfs2_extend_rec in the inode node to record the first address blockno of the data block where the block bitmap information is located.
- the reference module is configured to reference the block bitmap according to the first address of the storage space
- the reading module is configured to determine a data reading object according to the block bitmap, and read data from the data reading object.
- when the upper-layer application reads the reflink snapshot file, the lower layer consults the block bitmap to confirm which block data is read from the new data block and which block data is read from the source data block, and then reads data from the source data block or the new data block.
- a second determination module is configured to determine whether the block bitmap feature is enabled
- the query module queries the block bitmap based on the first address of the storage space.
- the bottom layer first determines whether the block bitmap in the inode of the reflink snapshot file has been enabled. If the block bitmap is not enabled, the data is retrieved according to the record information of the reflink count tree. If the block bitmap is enabled, the block bitmap is consulted to confirm which block data is read from the new data block and which block data is read from the source data block, and then the data is read in a targeted manner.
- the second judgment module is configured to: judge whether a mark indicating that the block bitmap feature is enabled is recorded in the node; if the mark indicating that the block bitmap feature is enabled is recorded in the node, the block bitmap feature is enabled.
- the second judgment module first determines whether the inode of the reflink snapshot file records a mark indicating that the block bitmap feature is enabled. If the inode does not record such a mark, data is retrieved according to the record information of the reflink count tree. If the inode records such a mark, the block bitmap is consulted to confirm which block data is read from the new data block and which block data is read from the source data block, and then the data is read in a targeted manner.
- a reference counting tree creation module is configured to create a reference counting tree.
- An update module is configured to update the reference count value of the reference count tree.
- the node creation module is configured to create nodes.
- An analysis module is configured to analyze the size of target data
- the second copy module is configured to copy the target data to a new data block with cluster as the minimum granularity if the size of the target data reaches a first preset value.
- the statistics module is set to count the number of blocks to be copied
- the third copy module is configured to copy the blocks to be copied to the new data block with cluster as the minimum granularity if the number of blocks to be copied reaches a second preset value.
- a definition module is configured to define a preset threshold value.
- the initialization module is configured to initialize the bits corresponding to each data block in the new data block to zero.
- the copy-on-write device copies data with block as the minimum granularity when the size of the cluster is much larger than the size of the block, thus avoiding invalid data copying and improving the execution speed of data copying.
- by creating a block bitmap to record the first address of the new data block, the first address of the source data block, and the data blocks in the new data block where data updates occur, it can be ensured that data access can be performed normally after the data is copied with block as the minimum granularity.
- the present application also provides a copy-on-write device. As shown in FIG5, the device includes a memory 1 and a processor 2.
- a memory 1 configured to store a computer program
- Processor 2 is configured to execute a computer program to implement the following steps:
- Get the size of the cluster and the size of the block; if the ratio of the size of the cluster to the size of the block reaches the preset threshold, create a block bitmap; use block as the minimum granularity to copy the target data in the source data block to the new data block; record the first address of the new data block and the first address of the source data block in the block bitmap, as well as the update status of each data block in the new data block.
- the copy-on-write device copies data with block as the minimum granularity when the size of the cluster is much larger than the size of the block, thus avoiding invalid data copying and improving the execution speed of data copying.
- by creating a block bitmap to record the first address of the new data block, the first address of the source data block, and the data blocks in the new data block where data updates occur, it can be ensured that data access can be performed normally after the data is copied with block as the minimum granularity.
- the present application also provides a non-volatile readable storage medium, on which a computer program is stored. When the computer program is executed by a processor, the following steps are implemented:
- Get the size of the cluster and the size of the block; if the ratio of the size of the cluster to the size of the block reaches the preset threshold, create a block bitmap; use block as the minimum granularity to copy the target data in the source data block to the new data block; record the first address of the new data block and the first address of the source data block in the block bitmap, as well as the update status of each data block in the new data block.
- the non-volatile readable storage medium may include: a U disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and other media that can store program codes.
- the non-volatile readable storage medium provided by the present application copies data with block as the minimum granularity when the size of the cluster is much larger than the size of the block, which avoids invalid data copying and improves the execution speed of data copying.
- by creating a block bitmap to record the first address of the new data block, the first address of the source data block, and the data blocks in the new data block where data updates occur, it can be ensured that the data can be accessed normally after it is copied with block as the minimum granularity.
- the steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be implemented directly using hardware, a software module executed by a processor, or a combination of the two.
- the software module may be placed in a random access memory (RAM), a memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of non-volatile readable storage medium known in the art.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Record Information Processing For Printing (AREA)
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to the Chinese patent application filed with the China Patent Office on February 14, 2023, with application number 202310108005.5 and entitled “A copy-on-write method, apparatus, device and computer-readable storage medium”, the entire contents of which are incorporated herein by reference.
The present application relates to the technical field of cloud computing, and in particular to a copy-on-write method; it also relates to a copy-on-write apparatus, a device, and a non-volatile readable storage medium.
OCFS2 (in full, The Oracle Clustered File System Version 2) is a general-purpose, extent-based shared-disk cluster file system. It uses network and disk-based heartbeats to determine whether the nodes in the cluster are available, manages disk space across nodes (servers), and presents a unified view to users. A shared-disk cluster file system needs a locking mechanism to keep disk space management orderly, that is, to prevent the file system metadata from being modified concurrently. The disk is divided into several cluster groups for management; each cluster group contains several clusters, and each cluster contains several blocks. OCFS2 supports block sizes from 512 bytes to 4KB and file system cluster sizes from 4KB to 1MB.
In cloud computing scenarios, traditional data backup schemes mainly back up data periodically, using full backups and incremental backups. Incremental backup mainly relies on reflink technology. The basic principle of reflink is that, when a file is copied, only a new inode (index node) is created, independent of the inode of the source file, while the new inode and the source file's inode still share the data block (a cluster in OCFS2) information. Only when a data block is modified are the source data block information and the modified part copied to a new data block.
At present, in existing CoW (Copy-on-Write) implementations, the minimum copy granularity of a data block is the size of one cluster. With cluster_size = 1M and block_size = 4K, if a small amount of data in the source data block changes, for example 1K, then once CoW is triggered the bottom layer has to copy at least one cluster of data, that is, 1M, although only 1K of it is effectively copied data. Because data copying consumes considerable system resources, the copy takes a long time to execute and the system responds slowly.
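The copy amplification described above can be checked with a short calculation. The sketch below is illustrative only: the helper name bytes_copied is hypothetical, and it assumes the 1K change lies inside a single copy unit (it does not straddle a block or cluster boundary).

```python
KIB = 1024
MIB = 1024 * KIB


def bytes_copied(changed_bytes: int, granularity: int) -> int:
    """Bytes copied when a change is rounded up to whole units of `granularity`."""
    units = -(-changed_bytes // granularity)  # ceiling division
    return units * granularity


# A 1K change with cluster_size = 1M and block_size = 4K.
cluster_copy = bytes_copied(1 * KIB, 1 * MIB)  # cluster-granularity CoW
block_copy = bytes_copied(1 * KIB, 4 * KIB)    # block-granularity CoW
amplification = cluster_copy // block_copy     # how much more the cluster copy moves
```

Under these assumptions the cluster-granularity copy moves 1M and the block-granularity copy moves 4K, a factor of 256.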
In view of this, how to increase the execution speed of copy-on-write and improve the overall responsiveness of the system has become a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The purpose of the present application is to provide a copy-on-write method that can increase the execution speed of copy-on-write and improve the overall responsiveness of the system. Another purpose of the present application is to provide a copy-on-write apparatus, a device, and a non-volatile readable storage medium, all of which have the above technical effects.
To solve the above technical problem, the present application provides a copy-on-write method, including:
obtaining the size of the cluster and the size of the block;
if the ratio of the size of the cluster to the size of the block reaches a preset threshold, creating a block bitmap;
copying the target data in the source data block to a new data block with block as the minimum granularity;
recording, in the block bitmap, the first address of the new data block, the first address of the source data block, and the update status of each data block in the new data block.
Optionally, before copying the target data in the source data block to the new data block with block as the minimum granularity, the method further includes:
determining whether the block bitmap feature is enabled;
if the block bitmap feature is enabled, copying the target data in the source data block to the new data block with block as the minimum granularity.
Optionally, determining whether the block bitmap feature is enabled includes:
determining whether a flag indicating that the block bitmap feature is enabled is recorded in the node;
if such a flag is recorded in the node, the block bitmap feature is enabled.
Optionally, the method further includes:
if the ratio of the size of the cluster to the size of the block reaches the preset threshold, recording in the node a flag indicating that the block bitmap feature is enabled.
Optionally, one bit of the bit bitmap in the block bitmap represents the update status of one data block of the new data block.
Optionally, if a data block in the new data block has not been updated, the corresponding bit is zero; if a data block in the new data block has been updated, the corresponding bit is one.
Optionally, the block bitmap further records the length of the bit bitmap that records the update status of each data block of the new data block.
Optionally, the method further includes:
allocating the new data block with cluster as the smallest unit.
Optionally, the method further includes:
allocating storage space for storing the information recorded by the block bitmap.
Optionally, the method further includes:
recording the first address of the storage space in the node.
Optionally, the method further includes:
consulting the block bitmap according to the first address of the storage space;
determining the data reading object according to the block bitmap, and reading data from the data reading object.
Optionally, before consulting the block bitmap according to the first address of the storage space, the method further includes:
determining whether the block bitmap feature is enabled;
if the block bitmap feature is enabled, consulting the block bitmap according to the first address of the storage space.
Optionally, the method further includes:
creating a reference count tree.
Optionally, the method further includes:
updating the reference count value of the reference count tree.
Optionally, the method further includes:
creating a node.
Optionally, the method further includes:
analyzing the size of the target data;
if the size of the target data reaches a first preset value, copying the target data to the new data block with cluster as the minimum granularity.
Optionally, the method further includes:
counting the number of blocks to be copied;
if the number of blocks to be copied reaches a second preset value, copying the blocks to be copied to the new data block with cluster as the minimum granularity.
Optionally, the method further includes:
defining the preset threshold.
Optionally, the method further includes:
initializing the bits corresponding to the data blocks in the new data block to zero.
To solve the above technical problem, the present application further provides a copy-on-write apparatus, including:
an acquisition module, configured to obtain the size of the cluster and the size of the block;
a creation module, configured to create a block bitmap if the ratio of the size of the cluster to the size of the block reaches a preset threshold;
a copy module, configured to copy the target data in the source data block to the new data block with block as the minimum granularity;
a recording module, configured to record, in the block bitmap, the first address of the new data block, the first address of the source data block, and the update status of each data block in the new data block.
To solve the above technical problem, the present application further provides a copy-on-write device, including:
a memory, configured to store a computer program;
a processor, configured to implement the steps of any of the above copy-on-write methods when executing the computer program.
To solve the above technical problem, the present application further provides a non-volatile readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of any of the above copy-on-write methods are implemented.
The copy-on-write method provided by the present application includes: obtaining the size of the cluster and the size of the block; if the ratio of the size of the cluster to the size of the block reaches a preset threshold, creating a block bitmap; copying the target data in the source data block to a new data block with block as the minimum granularity; and recording, in the block bitmap, the first address of the new data block, the first address of the source data block, and the update status of each data block in the new data block.
It can be seen that, when the size of the cluster is much larger than the size of the block, the copy-on-write method provided by the present application copies data with block as the minimum granularity, which avoids invalid data copying and improves the execution speed of data copying. In addition, by creating a block bitmap to record the first address of the new data block, the first address of the source data block, and the data blocks in the new data block where a data update has occurred, normal data access is ensured after data is copied with block as the minimum granularity.
The copy-on-write apparatus, device, and non-volatile readable storage medium provided by the present application all have the above technical effects.
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for the prior art and the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a copy-on-write method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a copy-on-write principle provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of an optional copy-on-write method provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a copy-on-write apparatus provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of a copy-on-write device provided in an embodiment of the present application.
The core of the present application is to provide a copy-on-write method that can increase the execution speed of copy-on-write and improve the overall responsiveness of the system. Another core of the present application is to provide a copy-on-write apparatus, a device, and a non-volatile readable storage medium, all of which have the above technical effects.
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art from the embodiments in the present application without creative effort fall within the scope of protection of the present application.
Please refer to FIG. 1, which is a schematic flow chart of a copy-on-write method provided in an embodiment of the present application. As shown in FIG. 1, the method mainly includes:
S101: obtaining the size of the cluster and the size of the block;
S102: if the ratio of the size of the cluster to the size of the block reaches a preset threshold, creating a block bitmap;
S103: copying the target data in the source data block to a new data block with block as the minimum granularity;
S104: recording, in the block bitmap, the first address of the new data block, the first address of the source data block, and the update status of each data block in the new data block.
This embodiment aims to copy data with block as the minimum granularity when the size of the cluster is much larger than the size of the block. To ensure that data can still be accessed correctly after being copied with block as the minimum granularity, this embodiment creates a block bitmap when the size of the cluster is much larger than the size of the block, and uses it to record the first address of the new data block, the first address of the source data block, and the data blocks in the new data block where a data update has occurred. After data is copied with block as the minimum granularity, the information recorded in the block bitmap shows, on each access, whether the data to be accessed is located in the new data block or the source data block; the new data block can then be accessed through its first address recorded in the block bitmap, or the source data block through its first address recorded in the block bitmap.
Target data refers to the data in the source data block that needs to be updated.
Steps S101 and S102 may be executed after the upper-layer application triggers a file copy. For example, after the upper-layer application triggers a file copy, the bottom layer, when creating the inode and filling in its member information, can first obtain OCFS2's cluster_size (the size of the cluster), block_size (the size of the block), and the preset threshold (which can be given when the file system is made or when the reflink snapshot is executed). It then calculates the ratio of cluster_size to block_size. If the ratio is greater than or equal to the preset threshold, a block bitmap is created. If the ratio is less than the preset threshold, no block bitmap is created and copy-on-write follows the existing implementation.
In some embodiments, the method further includes: if the ratio of the size of the cluster to the size of the block reaches the preset threshold, recording in the node a flag indicating that the block bitmap feature is enabled.
If the ratio of cluster_size to block_size is greater than or equal to the preset threshold, a feature flag BLKMAP is defined and recorded in the ocfs2_extend_rec.e_flags member of the inode, indicating that the block bitmap is enabled. If the ratio is less than the preset threshold, no flag is set, indicating that the block bitmap is not enabled and the existing copy-on-write implementation continues to be used.
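The threshold check and the feature flag can be sketched as follows. This is a minimal illustration, not the file system's actual code: the class ExtentFlags, both helper functions and the numeric bit values are assumptions introduced here; only the flag names REFCOUNTED and BLKMAP come from the text.

```python
from enum import IntFlag


class ExtentFlags(IntFlag):
    # Bit values are illustrative assumptions; the text names the flags only.
    REFCOUNTED = 0x02
    BLKMAP = 0x04


def extent_flags_for_snapshot(cluster_size: int, block_size: int, threshold: int) -> ExtentFlags:
    """Flags to record in e_flags when a reflink snapshot is taken."""
    flags = ExtentFlags.REFCOUNTED
    # Enable the block bitmap only when the cluster/block ratio reaches the threshold.
    if cluster_size // block_size >= threshold:
        flags |= ExtentFlags.BLKMAP
    return flags


def block_bitmap_enabled(flags: ExtentFlags) -> bool:
    """True when the BLKMAP flag is recorded."""
    return bool(flags & ExtentFlags.BLKMAP)
```

With cluster_size = 1M, block_size = 4K and a threshold of 256 the ratio is exactly 256, so the sketch sets both flags; with a 64K cluster the ratio is 16 and only REFCOUNTED is set.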
As shown in FIG. 2, the data structure of the block bitmap includes blkno, ref_blkno and bitmap. blkno records the first address of the new data block, ref_blkno records the first address of the source data block (the shared data block), and bitmap records the update status of each data block in the new data block. From the bitmap, it can be determined which data blocks in the new data block (for example, data clusters) have been updated and which have not.
In an optional implementation, one bit of the bit bitmap in the block bitmap represents the update status of one data block of the new data block.
If a data block in the new data block has not been updated, the corresponding bit is zero; if a data block in the new data block has been updated, the corresponding bit is one.
That is, each bit in the bitmap corresponds to one block in the new data block. If the data of a block in the new data block has been updated, the corresponding bit in the bitmap is set to 1; otherwise the bit is cleared to 0. The length of the bitmap equals the number of blocks in the new data block.
In addition, in an optional implementation, the block bitmap further records the length of the bit bitmap that records the update status of each data block of the new data block.
As shown in FIG. 2, bitmap is the bit bitmap. In this embodiment, the data structure of the block bitmap includes blkno, ref_blkno, map_size and bitmap, where map_size records the length of the bitmap. For example, if the bitmap is 20 bits long, then map_size=20.
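The four fields of the block bitmap can be modelled in memory as below. This is a hedged sketch: the class name BlockBitmap, its methods, and the packing of bits into bytes (lowest bit first) are assumptions made for illustration; the text specifies only the fields blkno, ref_blkno, map_size and bitmap and the meaning of each bit.

```python
from dataclasses import dataclass, field


@dataclass
class BlockBitmap:
    blkno: int       # first address of the new data block
    ref_blkno: int   # first address of the source (shared) data block
    map_size: int    # length of the bitmap, in bits
    bitmap: bytearray = field(init=False)

    def __post_init__(self) -> None:
        # One bit per block, rounded up to whole bytes; every bit starts at 0.
        self.bitmap = bytearray((self.map_size + 7) // 8)

    def _check(self, index: int) -> None:
        if not 0 <= index < self.map_size:
            raise IndexError(f"block index {index} outside bitmap of {self.map_size} bits")

    def mark_updated(self, index: int) -> None:
        """Set the bit for a block of the new data block that has been updated."""
        self._check(index)
        self.bitmap[index // 8] |= 1 << (index % 8)

    def is_updated(self, index: int) -> bool:
        """Return True if the block at `index` has been updated."""
        self._check(index)
        return bool(self.bitmap[index // 8] & (1 << (index % 8)))
```

For the 20-bit example in the text, BlockBitmap(blkno, ref_blkno, map_size=20) occupies three bytes in this layout, with the four spare bits left unused.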
Steps S103 and S104 may be executed when the upper-layer application modifies the reflink snapshot file. For example, when the upper-layer application modifies the reflink snapshot file, the bottom layer allocates a new data block with cluster as the smallest unit and, with block as the minimum granularity, copies only the block data that needs to be updated from the source data block to the new data block, while updating the bitmap in the block bitmap. In addition, if all data in the source data block has been updated, the refcount value of the reflink count tree is updated synchronously.
In an optional implementation, copying the data in the source data block to the new data block with block as the minimum granularity further includes:
determining whether the block bitmap feature is enabled;
if the block bitmap feature is enabled, copying the target data in the source data block to the new data block with block as the minimum granularity.
Determining whether the block bitmap feature is enabled includes:
determining whether a flag indicating that the block bitmap feature is enabled is recorded in the node;
if such a flag is recorded in the node, the block bitmap feature is enabled.
For example, when the upper-layer application modifies the reflink snapshot file, the bottom layer first determines whether the block bitmap is enabled in the inode of the reflink snapshot file. If it is not enabled, the existing copy-on-write implementation is followed: with cluster as the minimum granularity, the data in the source data block is copied to the new data block, and the refcount value of the reflink count tree is updated. If it is enabled, then with block as the minimum granularity only the block data that needs to be updated is copied from the source data block to the new data block, and the bitmap in the block bitmap is updated. In addition, if all data in the source data block has been updated, the refcount value of the reflink count tree is updated synchronously.
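The two write paths can be contrasted with a toy model. The sketch below is an illustration under stated assumptions, not the described implementation: the disk is a plain dict from block number to bytes, each update replaces a whole block, reference count handling is omitted, and the function name cow_write and its parameters are hypothetical.

```python
def cow_write(disk, src_first, new_first, blocks_per_cluster, updates, bitmap=None):
    """Apply `updates` ({block index within the cluster: new bytes}) to a new cluster.

    `bitmap` is a list of 0/1 values when the block bitmap feature is enabled,
    or None when it is not. Returns the number of blocks written.
    """
    if bitmap is None:
        # Existing scheme: copy the whole source cluster, overlaying the changes.
        for i in range(blocks_per_cluster):
            disk[new_first + i] = updates.get(i, disk[src_first + i])
        return blocks_per_cluster
    # Block bitmap scheme: write only the updated blocks and flag them.
    for i, data in updates.items():
        disk[new_first + i] = data
        bitmap[i] = 1
    return len(updates)
```

In this model a single-block update writes one block when the bitmap is enabled and blocks_per_cluster blocks when it is not, which is the saving the embodiment is after.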
On the basis of the above embodiments, some embodiments further include:
allocating storage space for storing the information recorded by the block bitmap.
recording the first address of the storage space in the node.
After the block bitmap is created, it is linked to the inode. For example, one separate block of space stores the block bitmap information, and a new member i_blockcount_loc is added under ocfs2_extend_rec in the inode to record blockno, the first address of the data block that holds the block bitmap information.
Optionally, on the basis of the above embodiments, an optional implementation further includes:
consulting the block bitmap according to the first address of the storage space;
determining the data reading object according to the block bitmap, and reading data from the data reading object.
This embodiment aims to index the new data block through the block bitmap. After copy-on-write is complete, when the upper-layer application reads the reflink snapshot file, the bottom layer consults the block bitmap to confirm which block data is to be read from the new data block and which from the source data block, and then reads the data from the source data block or the new data block accordingly.
Before consulting the block bitmap according to the first address of the storage space, the method further includes:
determining whether the block bitmap feature is enabled;
if the block bitmap feature is enabled, consulting the block bitmap according to the first address of the storage space.
For example, after copy-on-write is complete, when the upper-layer application reads the reflink snapshot file, the bottom layer first determines whether the block bitmap is enabled in the inode of the reflink snapshot file. If the block bitmap is not enabled, data is retrieved according to the information recorded in the reflink count tree. If the block bitmap is enabled, the block bitmap is consulted to confirm which block data is to be read from the new data block and which from the source data block, and the data is then read from the appropriate location.
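The read path for the enabled case can be sketched in a few lines. This is a simplified model: the disk is a dict from block number to bytes, the bitmap is a list of 0/1 values, and the function name read_block is hypothetical. The branch taken when the feature is not enabled (lookup through the reflink count tree) is deliberately left out.

```python
def read_block(disk, bitmap, blkno, ref_blkno, index):
    """Read block `index` of a cluster whose block bitmap feature is enabled.

    blkno is the first address of the new data block and ref_blkno the first
    address of the source data block, as recorded in the block bitmap.
    """
    if bitmap[index]:
        return disk[blkno + index]      # updated block: read the new data block
    return disk[ref_blkno + index]      # untouched block: read the source data block
```

The source data block is never modified, so blocks whose bit is 0 remain valid for every file that shares it.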
Determining whether the block bitmap feature is enabled may include:
determining whether a flag indicating that the block bitmap feature is enabled is recorded in the node;
if such a flag is recorded in the node, the block bitmap feature is enabled.
For example, after copy-on-write is complete, when the upper-layer application reads the reflink snapshot file, the bottom layer first determines whether a flag indicating that the block bitmap feature is enabled is recorded in the inode of the reflink snapshot file. If no such flag is recorded in the inode, data is retrieved according to the information recorded in the reflink count tree. If the flag is recorded in the inode, the block bitmap is consulted to confirm which block data is to be read from the new data block and which from the source data block, and the data is then read from the source data block or the new data block.
Optionally, some embodiments further include:
analyzing the size of the target data;
if the size of the target data reaches a first preset value, copying the target data to the new data block with cluster as the minimum granularity.
For example, before the target data is copied, its size can also be analyzed. If the size of the target data reaches the first preset value, that is, the amount of target data is large, copying with block as the minimum granularity would require many copy operations; in this case, cluster can be chosen as the minimum granularity for the copy.
Optionally, some embodiments further include:
counting the number of blocks to be copied;
if the number of blocks to be copied reaches a second preset value, copying the blocks to be copied to the new data block with cluster as the minimum granularity.
For example, before the target data is copied, the number of blocks to be copied can also be counted. If the number of blocks to be copied reaches the second preset value, that is, there are many blocks to copy, copying with block as the minimum granularity would require many copy operations; in this case, cluster can be chosen as the minimum granularity for the copy.
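The two fallback conditions can be expressed as a single decision function. This is only a sketch: the text presents the two checks as separate optional embodiments, so combining them with a logical OR is an assumption made here, and the function name choose_granularity and the example preset values are hypothetical.

```python
def choose_granularity(target_bytes: int, blocks_to_copy: int,
                       first_preset: int, second_preset: int) -> str:
    """Return "cluster" when either preset value is reached, otherwise "block"."""
    if target_bytes >= first_preset:
        return "cluster"   # the target data is large
    if blocks_to_copy >= second_preset:
        return "cluster"   # many blocks would have to be copied one by one
    return "block"
```

For instance, with a first preset value of 512K and a second preset value of 128 blocks (both made up for illustration), a 1K change touching one block stays at block granularity, while a change touching 200 blocks falls back to a cluster copy.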
An optional copy-on-write implementation is described below with reference to FIG. 3:
In OCFS2, cluster_size = 1M, block_size = 4K, and the preset threshold HT = 256. The 4M of data managed by inodeB, the node of a certain file, is taken as the object of study. inodeB.extent_rec (extent record) records e_flags=REFCOUNTED, indicating that the reflink snapshot function is supported. It also records the source file offset e_cpos=3 (starting from cluster 3 of the disk), the source data block length e_leaf_cluster=4 (4 clusters in total), and the first address of the disk location of the source data block e_blkno=768 (the data record starts from blockno (block number) 768 of the disk).
The block bitmap feature is configured adaptively according to OCFS2's cluster_size, block_size and the preset threshold. When the upper-layer application takes a reflink snapshot of the file, the bottom layer creates a new file node copy, inodeA; at this point no new data block is created and the source data is not copied. Because the ratio cluster_size/block_size=256>=HT, e_flags=REFCOUNTED|BLKMAP in inodeA.extent_rec, indicating that the block bitmap feature is added on top of support for the reflink snapshot function. The relevant information of the source data (the shared data block) is also recorded: offset e_cpos=3, source data block length e_leaf_cluster=4, and first address of the disk location of the source data block e_blkno=768. Besides the new file node copy inodeA, a new reference count tree (refcount tree) is created to record the data block sharing information after the reflink snapshot. Its refcount_rec (reference count record) records the file offset of the shared data block r_cpos=3, the shared length of the shared data block r_cluster=4, and the reference count of the shared data block r_refcount=2.
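The numbers in this example are mutually consistent, as the short check below shows. The variable names mirror the fields in the text; treating e_blkno as a disk block number counted in units of block_size is the reading assumed here.

```python
cluster_size = 1 << 20   # 1M
block_size = 4 << 10     # 4K
HT = 256                 # preset threshold

blocks_per_cluster = cluster_size // block_size   # blocks in one cluster
bitmap_enabled = blocks_per_cluster >= HT         # ratio reaches the threshold

e_blkno = 768            # first block of the shared extent
e_leaf_cluster = 4       # extent length in clusters

first_cluster = e_blkno // blocks_per_cluster                    # disk cluster holding e_blkno
extent_bytes = e_leaf_cluster * cluster_size                     # size of the shared data
last_blkno = e_blkno + e_leaf_cluster * blocks_per_cluster - 1   # last block of the extent
```

The ratio is exactly 256, so it reaches HT; block 768 is the first block of cluster 3, and the 4M extent spans blocks 768 through 1791.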
A block bitmap is created and attached to inodeA. When the upper-layer application makes a small modification to the reflink snapshot of the file, the lower layer, before allocating a new data block, detects that the block bitmap feature is enabled and allocates one block of storage, ocfs2_blkcnt_block, to hold the block bitmap information ocfs2_blkcnt_rec. Once the storage for the block bitmap has been allocated, the member i_blockcount_loc of ocfs2_extend_rec in inodeA is pointed at the starting address of the block bitmap data block, blockno: 102400.
When the upper-layer application is about to modify the data in blocks 0 and 2 of cluster 5 of the reflink snapshot file (that is, the two data blocks with blockno 1280 and 1282), the lower layer first checks the block bitmap feature flag in the file's inodeA; in this embodiment the feature is enabled. It then allocates a new data block with the cluster as the minimum unit and updates the block bitmap information: blkno records the starting address of the new data block, blkno: 3584; ref_blkno records the starting address of the part of the source data block that is about to be updated, blkno: 1280; map_size records the length of the new data block, 256 (one new cluster is allocated, 256 blocks in total); and bitmap consists of 256 contiguous bits, initialized to 0 to indicate that no block in the new data block has been written yet. Once this preparation is done, the data to be written is copied to the two block positions 3584 and 3586 in the new data block, the corresponding bits in the bitmap are set to 1, and the data in the source data block is left unmodified.
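The arithmetic of this write can be checked with a short Python sketch. It is a simplified model under stated assumptions: `cow_write` is a hypothetical helper, the record is a plain dictionary rather than an on-disk structure, and all blocks to be updated are assumed to lie in the same source cluster.

```python
BLOCKS_PER_CLUSTER = 256  # cluster_size / block_size in the embodiment


def cow_write(dirty_blknos, new_cluster_blkno):
    """Build the block bitmap record for a block-granularity copy-on-write."""
    # Start of the source cluster that contains the blocks to be updated
    # (assumes every dirty block lies in the same cluster).
    ref_blkno = (min(dirty_blknos) // BLOCKS_PER_CLUSTER) * BLOCKS_PER_CLUSTER
    record = {
        "blkno": new_cluster_blkno,
        "ref_blkno": ref_blkno,
        "map_size": BLOCKS_PER_CLUSTER,
        "bitmap": [0] * BLOCKS_PER_CLUSTER,  # nothing written yet
    }
    targets = []
    for blkno in dirty_blknos:
        offset = blkno - ref_blkno
        record["bitmap"][offset] = 1          # mark the block as updated
        targets.append(new_cluster_blkno + offset)
    return record, targets


record, targets = cow_write([1280, 1282], new_cluster_blkno=3584)
print(targets)  # [3584, 3586]
```

Only two of the 256 bits end up set, matching the two blocks that were actually written.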
After the copy-on-write completes, when the upper-layer application reads the reflink snapshot file, the lower layer first checks the block bitmap feature flag in the file's inodeA; in this embodiment the feature is enabled. It then traverses the block bitmap: where a bit in bitmap is set to 1, the corresponding block is read from the new data block; otherwise the block is still read from the source data block (the shared data block).
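The read routing can be sketched as follows. The record layout is the same simplified dictionary used for illustration only, and `resolve_read` is a hypothetical name; the values are those of the embodiment.

```python
def resolve_read(index, record):
    """Return the disk block to read for block `index` of the copied cluster."""
    if record["bitmap"][index]:
        return record["blkno"] + index   # updated: read from the new data block
    return record["ref_blkno"] + index   # untouched: read from the shared source


record = {
    "blkno": 3584,
    "ref_blkno": 1280,
    "map_size": 256,
    "bitmap": [1, 0, 1] + [0] * 253,  # blocks 0 and 2 were updated
}
reads = [resolve_read(i, record) for i in range(4)]
print(reads)  # [3584, 1281, 3586, 1283]
```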
In summary, the copy-on-write method provided by the present application includes: obtaining the size of the cluster and the size of the block; creating a block bitmap if the ratio of the cluster size to the block size reaches a preset threshold; copying data in the source data block to the new data block with the block as the minimum granularity; and recording, in the block bitmap, the starting address of the new data block, the starting address of the source data block, and the blocks of the new data block in which data has been updated. When the cluster is much larger than the block, the method therefore copies data at block granularity, which avoids copying data that does not need to change and speeds up the copy. At the same time, because the block bitmap records the starting address of the new data block, the starting address of the source data block, and the updated blocks of the new data block, data can still be accessed correctly after a block-granularity copy.
The present application also provides a copy-on-write apparatus; the apparatus described below and the method described above may be read in correspondence with each other. Refer to Figure 4, which is a schematic diagram of a copy-on-write apparatus provided in an embodiment of the present application. As shown in Figure 4, the apparatus includes:
an acquisition module 10, configured to obtain the size of the cluster and the size of the block;
a creation module 20, configured to create a block bitmap if the ratio of the cluster size to the block size reaches a preset threshold;
a copy module 30, configured to copy the target data in the source data block to the new data block with the block as the minimum granularity;
a recording module 40, configured to record, in the block bitmap, the starting address of the new data block, the starting address of the source data block, and the update status of each block in the new data block.
This embodiment copies data at block granularity when the cluster is much larger than the block. So that data can still be accessed correctly after a block-granularity copy, the embodiment creates a block bitmap in that case and uses it to record the starting address of the new data block, the starting address of the source data block, and the blocks of the new data block in which data has been updated. After such a copy, a data access consults the block bitmap to learn whether the requested data lies in the new data block or in the source data block, and then reads the new data block via its recorded starting address or the source data block via its recorded starting address.
After the upper-layer application triggers a file copy, and while the lower layer is creating the inode and filling in its members, the acquisition module 10 may first obtain the OCFS2 cluster_size (the size of the cluster), the block_size (the size of the block), and the preset threshold (which may be given when the file system is created or when the reflink snapshot is taken). The ratio of cluster_size to block_size is then calculated. If the ratio is greater than or equal to the preset threshold, the creation module 20 creates a block bitmap. If the ratio is less than the preset threshold, the creation module 20 does not create a block bitmap, and copy-on-write follows the existing implementation.
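The threshold test itself is a single comparison, sketched below. The function name is hypothetical, and the threshold of 64 and the concrete sizes are assumptions made for illustration; the text only fixes the ratio of 256 in its embodiment.

```python
def should_create_block_bitmap(cluster_size: int, block_size: int, threshold: int) -> bool:
    """Enable the block bitmap only when a cluster holds at least `threshold` blocks."""
    return cluster_size // block_size >= threshold


KIB = 1024
ASSUMED_THRESHOLD = 64  # illustrative value only

# 1 MiB clusters with 4 KiB blocks give a ratio of 256, as in the embodiment.
large_ratio = should_create_block_bitmap(1024 * KIB, 4 * KIB, ASSUMED_THRESHOLD)
# 4 KiB clusters with 4 KiB blocks give a ratio of 1.
small_ratio = should_create_block_bitmap(4 * KIB, 4 * KIB, ASSUMED_THRESHOLD)
print(large_ratio, small_ratio)  # True False
```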
The data structure of the block bitmap includes blkno, ref_blkno, and bitmap. blkno records the starting address of the new data block, ref_blkno records the starting address of the source data block (the shared data block), and bitmap records which blocks of the new data block have been updated.
When the upper-layer application modifies the reflink snapshot file, the copy module 30 works at block granularity and copies only the blocks of the source data block that need to be updated into the new data block, while the recording module 40 updates the bitmap in the block bitmap.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a first determination module, configured to determine whether the block bitmap feature is enabled;
if the block bitmap feature is enabled, the copy module 30 copies the target data in the source data block to the new data block with the block as the minimum granularity.
For example, when the upper-layer application modifies the reflink snapshot file, the lower layer first determines whether the block bitmap is enabled in the inode of the reflink snapshot file. If it is not enabled, the existing copy-on-write implementation is used: the data in the source data block is copied to the new data block with the cluster as the minimum granularity, and the refcount value of the reflink count tree is updated. If it is enabled, only the blocks of the source data block that need to be updated are copied to the new data block, with the block as the minimum granularity, and the bitmap in the block bitmap is updated.
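A rough way to see the effect of this branch is to compare how much data each path moves for the same small write. The sketch below is a simplified model, not a measurement: `plan_cow` is a hypothetical helper, the write is assumed to touch a single cluster, and the 1 MiB and 4 KiB sizes are assumptions consistent with the ratio of 256 used in the embodiment.

```python
def plan_cow(dirty_blocks: int, cluster_size: int, block_size: int, blkmap_enabled: bool):
    """Describe the copy-on-write path taken for a write that touches one cluster."""
    if blkmap_enabled:
        return {
            "granularity": "block",
            "bytes_copied": dirty_blocks * block_size,
            "metadata_updated": "block bitmap",
        }
    return {
        "granularity": "cluster",
        "bytes_copied": cluster_size,
        "metadata_updated": "refcount tree",
    }


CLUSTER = 1024 * 1024  # assumed 1 MiB cluster
BLOCK = 4096           # assumed 4 KiB block

with_bitmap = plan_cow(2, CLUSTER, BLOCK, blkmap_enabled=True)
without_bitmap = plan_cow(2, CLUSTER, BLOCK, blkmap_enabled=False)
print(with_bitmap["bytes_copied"], without_bitmap["bytes_copied"])  # 8192 1048576
```

Under these assumed sizes, updating two blocks moves 8 KiB on the block-granularity path instead of a full 1 MiB cluster.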
On the basis of the above embodiment, as an optional implementation, the first determination module includes:
a judging unit, configured to judge whether a flag indicating that the block bitmap feature is enabled is recorded in the node;
a determining unit, configured to determine that the block bitmap feature is enabled if such a flag is recorded in the node.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a marking module, configured to record in the node a flag indicating that the block bitmap feature is enabled if the ratio of the cluster size to the block size reaches the preset threshold.
For example, if the ratio of cluster_size to block_size is greater than or equal to the preset threshold, the feature flag BLKMAP is defined and recorded in the ocfs2_extend_rec.e_flags member of the inode, indicating that the block bitmap is enabled. If the ratio is less than the preset threshold, no flag is recorded, indicating that the block bitmap is not enabled, and the existing copy-on-write implementation continues to be used.
On the basis of the above embodiment, as an optional implementation, each bit of the bit map within the block bitmap represents the update status of one block of the new data block.
On the basis of the above embodiment, as an optional implementation, if a block in the new data block has not been updated, the corresponding bit is zero; if a block in the new data block has been updated, the corresponding bit is one.
For example, each bit in bitmap corresponds to one block of the new data block. If the data of a block in the new data block has been updated, the corresponding bit in bitmap is set to 1; otherwise the bit is cleared to 0. The length of bitmap equals the number of blocks in the new data block.
On the basis of the above embodiment, as an optional implementation, the block bitmap also records the length of the bit map that records the update status of each block of the new data block.
For example, bitmap is the bit map. In this embodiment, the data structure of the block bitmap includes blkno, ref_blkno, map_size, and bitmap, where map_size records the length of bitmap. For example, if bitmap is 20 bits long, then map_size = 20.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a first allocation module, configured to allocate the new data block with the cluster as the minimum unit.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a second allocation module, configured to allocate storage space for the information recorded in the block bitmap.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a node recording module, configured to record the starting address of the storage space in the node.
For example, a separate block is used to store the block bitmap information, and a new member i_blockcount_loc is added under ocfs2_extend_rec in the inode to record the starting address blockno of the data block that holds the block bitmap information.
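Storing the record in its own block and reaching it through the inode can be sketched as below. The byte layout (`<QQI` header followed by the packed bits), the 4096-byte block size, and the dictionary standing in for the disk are all assumptions of this sketch; the text does not define an on-disk format.

```python
import struct

BLOCK_SIZE = 4096                # assumed block size
HEADER = struct.Struct("<QQI")   # blkno, ref_blkno, map_size (hypothetical layout)


def pack_record(blkno: int, ref_blkno: int, map_size: int, bitmap: bytes) -> bytes:
    """Serialize a block bitmap record into exactly one block."""
    payload = HEADER.pack(blkno, ref_blkno, map_size) + bitmap
    if len(payload) > BLOCK_SIZE:
        raise ValueError("block bitmap record does not fit in one block")
    return payload.ljust(BLOCK_SIZE, b"\0")


def unpack_record(block: bytes):
    """Read a block bitmap record back from a block."""
    blkno, ref_blkno, map_size = HEADER.unpack_from(block)
    nbytes = (map_size + 7) // 8
    bitmap = block[HEADER.size:HEADER.size + nbytes]
    return blkno, ref_blkno, map_size, bitmap


bitmap = bytes([0b101]) + bytes(31)      # blocks 0 and 2 updated, 256 bits in total
disk = {102400: pack_record(3584, 1280, 256, bitmap)}
inode = {"i_blockcount_loc": 102400}     # the inode points at the bitmap block

loaded = unpack_record(disk[inode["i_blockcount_loc"]])
```

With this layout the record for a 256-block cluster occupies 52 bytes, so it fits comfortably in a single block.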
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a lookup module, configured to consult the block bitmap according to the starting address of the storage space;
a reading module, configured to determine the object to read from according to the block bitmap, and to read data from that object.
After the copy-on-write completes, when the upper-layer application reads the reflink snapshot file, the block bitmap is consulted to determine which blocks are read from the new data block and which are read from the source data block, and the data is then read from the source data block or the new data block accordingly.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a second determination module, configured to determine whether the block bitmap feature is enabled;
if the block bitmap feature is enabled, the lookup module consults the block bitmap according to the starting address of the storage space.
For example, after the copy-on-write completes, when the upper-layer application reads the reflink snapshot file, the lower layer first determines whether the block bitmap is enabled in the inode of the reflink snapshot file. If the block bitmap is not enabled, the data is located using the information recorded in the reflink count tree. If the block bitmap is enabled, the block bitmap is consulted to determine which blocks are read from the new data block and which are read from the source data block, and the data is read accordingly.
The second determination module is configured to: judge whether a flag indicating that the block bitmap feature is enabled is recorded in the node; and, if such a flag is recorded in the node, determine that the block bitmap feature is enabled.
After the copy-on-write completes, when the upper-layer application reads the reflink snapshot file, the second determination module first judges whether a flag indicating that the block bitmap feature is enabled is recorded in the inode of the reflink snapshot file. If no such flag is recorded in the inode, the data is located using the information recorded in the reflink count tree. If such a flag is recorded in the inode, the block bitmap is consulted to determine which blocks are read from the new data block and which are read from the source data block, and the data is read accordingly.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a reference count tree creation module, configured to create the reference count tree.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
an update module, configured to update the reference count value of the reference count tree.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a node creation module, configured to create the node.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
an analysis module, configured to analyze the size of the target data;
a second copy module, configured to copy the target data to the new data block with the cluster as the minimum granularity if the size of the target data reaches a first preset value.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a statistics module, configured to count the number of blocks to be copied;
a third copy module, configured to copy the blocks to be copied to the new data block with the cluster as the minimum granularity if the number of blocks to be copied reaches a second preset value.
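The two optional fallbacks above amount to simple threshold checks, sketched below. The function names and both preset values are hypothetical; the text does not state concrete values for the first or second preset value.

```python
def granularity_by_size(target_bytes: int, first_preset: int) -> str:
    """Fall back to cluster granularity when the target data is large enough."""
    return "cluster" if target_bytes >= first_preset else "block"


def granularity_by_count(blocks_to_copy: int, second_preset: int) -> str:
    """Fall back to cluster granularity when enough blocks must be copied."""
    return "cluster" if blocks_to_copy >= second_preset else "block"


FIRST_PRESET = 512 * 1024  # illustrative: 512 KiB of target data
SECOND_PRESET = 128        # illustrative: 128 of the 256 blocks in a cluster

small_write = granularity_by_count(2, SECOND_PRESET)
large_write = granularity_by_count(200, SECOND_PRESET)
print(small_write, large_write)  # block cluster
```

The idea behind such a cutoff is that once most of a cluster has to be rewritten anyway, copying the whole cluster avoids per-block bookkeeping.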
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
a definition module, configured to define the preset threshold.
On the basis of the above embodiment, as an optional implementation, the apparatus further includes:
an initialization module, configured to initialize to zero the bits corresponding to the blocks of the new data block.
When the cluster is much larger than the block, the copy-on-write apparatus provided by the present application copies data at block granularity, which avoids copying data that does not need to change and speeds up the copy. At the same time, because a block bitmap is created to record the starting address of the new data block, the starting address of the source data block, and the blocks of the new data block in which data has been updated, data can still be accessed correctly after a block-granularity copy.
The present application also provides a copy-on-write device. As shown in Figure 5, the device includes a memory 1 and a processor 2.
The memory 1 is configured to store a computer program.
The processor 2 is configured to execute the computer program to implement the following steps:
obtaining the size of the cluster and the size of the block; creating a block bitmap if the ratio of the cluster size to the block size reaches a preset threshold; copying the target data in the source data block to the new data block with the block as the minimum granularity; and recording, in the block bitmap, the starting address of the new data block, the starting address of the source data block, and the update status of each block in the new data block.
For a description of the device provided by the present application, refer to the method embodiments above; it is not repeated here.
When the cluster is much larger than the block, the copy-on-write device provided by the present application copies data at block granularity, which avoids copying data that does not need to change and speeds up the copy. At the same time, because a block bitmap is created to record the starting address of the new data block, the starting address of the source data block, and the blocks of the new data block in which data has been updated, data can still be accessed correctly after a block-granularity copy.
The present application also provides a non-volatile readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the following steps:
obtaining the size of the cluster and the size of the block; creating a block bitmap if the ratio of the cluster size to the block size reaches a preset threshold; copying the target data in the source data block to the new data block with the block as the minimum granularity; and recording, in the block bitmap, the starting address of the new data block, the starting address of the source data block, and the update status of each block in the new data block.
The non-volatile readable storage medium may include any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
For a description of the non-volatile readable storage medium provided by the present application, refer to the method embodiments above; it is not repeated here.
With the non-volatile readable storage medium provided by the present application, when the cluster is much larger than the block, data is copied at block granularity, which avoids copying data that does not need to change and speeds up the copy. At the same time, because a block bitmap is created to record the starting address of the new data block, the starting address of the source data block, and the blocks of the new data block in which data has been updated, data can still be accessed correctly after a block-granularity copy.
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the parts that are the same or similar may be cross-referenced. Because the apparatus, device, and non-volatile readable storage medium disclosed in the embodiments correspond to the method disclosed in the embodiments, they are described only briefly; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions differently for each specific application, but such implementations should not be regarded as going beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of non-volatile readable storage medium known in the art.
The copy-on-write method, apparatus, device, and non-volatile readable storage medium provided by the present application have been described in detail above. Optional examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method and its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims of the present application.
Claims (22)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310108005.5A CN115826878B (en) | 2023-02-14 | 2023-02-14 | Copy-on-write method, device, equipment and computer readable storage medium |
| CN202310108005.5 | 2023-02-14 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024169286A1 true WO2024169286A1 (en) | 2024-08-22 |
Family
ID=85521199
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/131854 Ceased WO2024169286A1 (en) | 2023-02-14 | 2023-11-15 | Copy-on-write method, apparatus and device, and non-volatile readable storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN115826878B (en) |
| WO (1) | WO2024169286A1 (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115826878B (en) * | 2023-02-14 | 2023-05-16 | 浪潮电子信息产业股份有限公司 | Copy-on-write method, device, equipment and computer readable storage medium |
| CN117708072B (en) * | 2023-07-14 | 2024-10-18 | 荣耀终端有限公司 | File copying method, terminal equipment and chip system |
| CN117891751B (en) * | 2024-03-14 | 2024-06-14 | 北京壁仞科技开发有限公司 | Memory data access method and device, electronic equipment and storage medium |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20060053139A1 (en) * | 2004-09-03 | 2006-03-09 | Red Hat, Inc. | Methods, systems, and computer program products for implementing single-node and cluster snapshots |
| US7231409B1 (en) * | 2003-03-21 | 2007-06-12 | Network Appliance, Inc. | System and method for reallocating blocks in checkpointing bitmap-based file systems |
| CN101840362A (en) * | 2009-10-28 | 2010-09-22 | 创新科存储技术有限公司 | Method and device for achieving copy-on-write snapshot |
| CN105009119A (en) * | 2013-02-28 | 2015-10-28 | 微软公司 | Granular partial recall of deduplicated files |
| CN106557274A (en) * | 2015-09-30 | 2017-04-05 | 中兴通讯股份有限公司 | Virtual snapshot processing method and processing device |
| US20190108100A1 (en) * | 2017-10-05 | 2019-04-11 | Zadara Storage, Inc. | Dedupe as an infrastructure to avoid data movement for snapshot copy-on-writes |
| CN111737221A (en) * | 2020-06-19 | 2020-10-02 | 浪潮电子信息产业股份有限公司 | Data reading and writing method, device and device and storage medium of cluster file system |
| CN115826878A (en) * | 2023-02-14 | 2023-03-21 | 浪潮电子信息产业股份有限公司 | A copy-on-write method, device, device, and computer-readable storage medium |
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7849057B1 (en) * | 2007-03-30 | 2010-12-07 | Netapp, Inc. | Identifying snapshot membership for blocks based on snapid |
| US8095770B2 (en) * | 2009-05-08 | 2012-01-10 | Oracle America Inc. | Method and system for mapping data to a process |
| CN103412824B (en) * | 2013-07-19 | 2016-08-10 | 华为技术有限公司 | Copy on write Snapshot Method and device |
| CN103984609B (en) * | 2014-05-28 | 2017-06-16 | 华为技术有限公司 | A kind of method and apparatus that checkpoint is reclaimed in file system based on copy-on-write |
| CN104331344A (en) * | 2014-11-11 | 2015-02-04 | 浪潮(北京)电子信息产业有限公司 | Data backup method and device |
| CN105988723A (en) * | 2015-02-12 | 2016-10-05 | 中兴通讯股份有限公司 | Snapshot processing method and device |
| CN107122131B (en) * | 2017-04-18 | 2020-08-14 | 杭州宏杉科技股份有限公司 | Thin provisioning method and device |
| US11531488B2 (en) * | 2017-08-07 | 2022-12-20 | Kaseya Limited | Copy-on-write systems and methods |
| CN111522507B (en) * | 2020-04-14 | 2021-10-01 | 中山大学 | A low-latency file system address space management method, system and medium |
| US11467735B2 (en) * | 2020-12-01 | 2022-10-11 | International Business Machines Corporation | I/O operations in log structured arrays |
| CN113568788B (en) * | 2021-09-26 | 2021-11-30 | 成都云祺科技有限公司 | Snapshot method, system and storage medium for Linux non-logical volume block device |
| CN115129253B (en) * | 2022-06-30 | 2025-07-04 | 苏州浪潮智能科技有限公司 | A snapshot processing method, device, equipment and medium |
-
2023
- 2023-02-14 CN CN202310108005.5A patent/CN115826878B/en active Active
- 2023-11-15 WO PCT/CN2023/131854 patent/WO2024169286A1/en not_active Ceased
Patent Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7231409B1 (en) * | 2003-03-21 | 2007-06-12 | Network Appliance, Inc. | System and method for reallocating blocks in checkpointing bitmap-based file systems |
| US20060053139A1 (en) * | 2004-09-03 | 2006-03-09 | Red Hat, Inc. | Methods, systems, and computer program products for implementing single-node and cluster snapshots |
| CN101840362A (en) * | 2009-10-28 | 2010-09-22 | 创新科存储技术有限公司 | Method and device for achieving copy-on-write snapshot |
| CN105009119A (en) * | 2013-02-28 | 2015-10-28 | 微软公司 | Granular partial recall of deduplicated files |
| CN106557274A (en) * | 2015-09-30 | 2017-04-05 | 中兴通讯股份有限公司 | Virtual snapshot processing method and processing device |
| US20190108100A1 (en) * | 2017-10-05 | 2019-04-11 | Zadara Storage, Inc. | Dedupe as an infrastructure to avoid data movement for snapshot copy-on-writes |
| CN111737221A (en) * | 2020-06-19 | 2020-10-02 | 浪潮电子信息产业股份有限公司 | Data reading and writing method, apparatus and device, and storage medium of cluster file system |
| CN115826878A (en) * | 2023-02-14 | 2023-03-21 | 浪潮电子信息产业股份有限公司 | A copy-on-write method, apparatus, device, and computer-readable storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN115826878B (en) | 2023-05-16 |
| CN115826878A (en) | 2023-03-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12216929B2 (en) | | Storage system, memory management method, and management node |
| CN115826878B (en) | | Copy-on-write method, device, equipment and computer readable storage medium |
| US10067684B2 (en) | | File access method and apparatus, and storage device |
| CN115599544A (en) | | Memory management method, device, computer equipment and storage medium |
| CN110147203B (en) | | File management method and device, electronic equipment and storage medium |
| US12141106B2 (en) | | File system cloning method and apparatus |
| CN106326229B (en) | | File storage method and device for embedded system |
| CN104618482A (en) | | Cloud data access method, server, traditional storage device and architecture |
| US10761932B2 (en) | | Data and metadata storage in storage devices |
| CN107239569A (en) | | A kind of distributed file system subtree storage method and device |
| CN107967122A (en) | | A kind of method for writing data of block device, device and medium |
| WO2024113688A1 (en) | | Flash memory device and data management method therefor |
| WO2022021280A1 (en) | | Storage controller, storage control method, solid state disk and storage system |
| US8239427B2 (en) | | Disk layout method for object-based storage devices |
| WO2024187818A1 (en) | | Data migration method, system and device and non-volatile readable storage medium |
| CN112748854B (en) | | Optimized access to a fast storage device |
| US12050775B2 (en) | | Techniques for determining and using temperature classifications with adjustable classification boundaries |
| CN113590309B (en) | | Data processing method, device, equipment and storage medium |
| Zhang et al. | | A light-weight log-based hybrid storage system |
| US8990541B2 (en) | | Compacting Memory utilization of sparse pages |
| TWI894007B (en) | | Storage management method and storage management device |
| US20240319876A1 (en) | | Caching techniques using a unified cache of metadata leaf objects with mixed pointer types and lazy content resolution |
| CN118585140A (en) | | A data aggregation method, device, distributed storage system and storage medium |
| CN119200998A (en) | | Storage management method and storage management device |
| CN119322583A (en) | | Data processing method and device and computing equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23922402; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |