CN110908966B - Method, device and equipment for calculating deduplication rate and readable storage medium - Google Patents
- Publication number
- CN110908966B CN201911122474.2A
- Authority
- CN
- China
- Prior art keywords
- file
- log information
- log
- operation request
- storage system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1748—De-duplication implemented within the file system, e.g. based on file segments
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1737—Details of further file system functions for reducing power consumption or coping with limited storage space, e.g. in mobile devices
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0608—Saving storage space on storage systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
- G06F3/0641—De-duplication techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for calculating a deduplication rate, comprising: acquiring a file operation request, executing the file operation request by using an online deduplication method, generating corresponding log information, and adding the log information to a log; reading the log and calculating a theoretical occupancy value of the storage system; and acquiring an actual occupancy value of the storage system and calculating the deduplication rate from the theoretical occupancy value and the actual occupancy value. Because each request is executed with online deduplication and recorded in the log, the theoretical occupancy value can be computed from the log whenever the deduplication rate is needed. The method does not need to stop receiving service operations, and it can accurately calculate the deduplication rate of the storage system at the current moment, which improves the accuracy of the calculation. The invention also provides a deduplication rate calculation device, equipment, and a computer-readable storage medium, which have the same beneficial effects.
Description
Technical Field
The present invention relates to the technical field of file deduplication, and in particular to a deduplication rate calculation method, a deduplication rate calculation device, deduplication rate calculation equipment, and a computer-readable storage medium.
Background
Storage space is a limited resource. Different files uploaded by the same user (for example, logs from different times) and files uploaded by different users (for example, files uploaded by different people in the same company) often include many files with identical content, or objects with identical content after slicing. To save storage space, data deduplication can be performed on the storage cluster. Once deduplication is enabled, it becomes very important to measure the deduplication rate, which is used to evaluate and compare the efficiency and performance of different deduplication algorithms, tuned configuration parameters, and implementation details. Existing methods for calculating the deduplication rate rely on offline deduplication (that is, background deduplication): after file upload is complete, the capacity of the cluster is obtained, the system stops receiving service operations and starts file deduplication, and after the deduplication logic has deleted the files with duplicate content, the capacity of the cluster storage after deduplication is obtained. The deduplication rate is then calculated from the capacities before and after deduplication. However, this yields only the average deduplication rate over a period of time, so the calculated rate is not accurate enough.
Therefore, how to solve the problem that the deduplication rate calculated by existing methods is not accurate enough is a technical problem to be solved by those skilled in the art.
Summary of the Invention
In view of this, the object of the present invention is to provide a deduplication rate calculation method, a deduplication rate calculation device, deduplication rate calculation equipment, and a computer-readable storage medium that solve the problem that the deduplication rate calculated by existing methods is not accurate enough.
To solve the above technical problem, the present invention provides a deduplication rate calculation method, including:
acquiring a file operation request, executing the file operation request by using an online deduplication method, generating corresponding log information, and adding the log information to a log;
reading the log and calculating a theoretical occupancy value of the storage system;
acquiring an actual occupancy value of the storage system, and calculating the deduplication rate by using the theoretical occupancy value and the actual occupancy value.
Optionally, reading the log and calculating the theoretical occupancy value of the storage system includes:
reading each piece of log information in the log, and determining the file size in each piece of log information and the state corresponding to the file size, where the state is a plus state or a minus state;
adding up all file sizes that have the plus state to obtain a first occupancy value, and adding up all file sizes that have the minus state to obtain a second occupancy value;
subtracting the second occupancy value from the first occupancy value to obtain the theoretical occupancy value.
Optionally, when the file operation request is a file upload operation request, acquiring the file operation request, executing it by using the online deduplication method, and generating corresponding log information includes:
acquiring the file upload operation request and a first file, and striping and slicing the first file to obtain a plurality of first objects;
calculating first fingerprint information of each first object, and matching each piece of first fingerprint information in turn against a fingerprint information database;
when the match succeeds, incrementing by one the reference count of the first target object corresponding to the first fingerprint information, where the first target object is stored in the storage system;
when the match fails, storing the first object corresponding to the first fingerprint information in the storage system;
acquiring a first file size of the first file, generating upload log information from the first file size, and determining the upload log information as the log information.
Optionally, when the file operation request is a file update operation request, acquiring the file operation request, executing it by using the online deduplication method, and generating corresponding log information includes:
acquiring the file update operation request and a second file, and determining the update file specified by the file update operation request;
acquiring the update file size of the update file, and generating deletion log information from the update file size;
storing the second file in the storage system, acquiring a second file size of the second file, and generating first log information from the second file size, where the first log information is upload log information;
forming the log information from the first log information and the deletion log information.
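The update case above can be sketched as a pair of log entries. This is a minimal illustration, not the patented implementation: the tuple layout `(size, state)`, the `"+"`/`"-"` markers, and the function name `log_info_for_update` are all assumptions made for the example, and it reads the "update file" as the existing file that the second file replaces.

```python
from typing import List, Tuple

# Hypothetical log entry layout: (file size in bytes, "+" or "-").
LogInfo = Tuple[int, str]


def log_info_for_update(old_size: int, new_size: int) -> List[LogInfo]:
    """Return deletion log information for the update file, followed by
    upload log information for the second (new) file."""
    if old_size < 0 or new_size < 0:
        raise ValueError("file sizes must be non-negative")
    return [(old_size, "-"), (new_size, "+")]


# A 1000-byte file is replaced by a 1500-byte file.
entries = log_info_for_update(old_size=1000, new_size=1500)
# Net effect of the update on the theoretical occupancy value.
net_change = sum(size if state == "+" else -size for size, state in entries)
```

Recording the update as a minus entry plus a plus entry means the theoretical occupancy value can later be computed with the same summation used for plain uploads and deletions.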
The present invention also provides a deduplication rate calculation device, including:
a log update module, configured to acquire a file operation request, execute the file operation request by using an online deduplication method, generate corresponding log information, and add the log information to a log;
a first calculation module, configured to read the log and calculate a theoretical occupancy value of the storage system;
a second calculation module, configured to acquire an actual occupancy value of the storage system and calculate the deduplication rate by using the theoretical occupancy value and the actual occupancy value.
Optionally, the first calculation module includes:
a determining unit, configured to read each piece of log information in the log, and determine the file size in each piece of log information and the state corresponding to the file size, where the state is a plus state or a minus state;
a first calculation unit, configured to add up all file sizes that have the plus state to obtain a first occupancy value, and add up all file sizes that have the minus state to obtain a second occupancy value;
a second calculation unit, configured to subtract the second occupancy value from the first occupancy value to obtain the theoretical occupancy value.
Optionally, the log update module includes:
a first object obtaining unit, configured to acquire a file upload operation request and a first file, and stripe and slice the first file to obtain a plurality of first objects;
a matching unit, configured to calculate first fingerprint information of each first object, and match each piece of first fingerprint information in turn against a fingerprint information database;
a count modification unit, configured to increment by one the reference count of the first target object corresponding to the first fingerprint information when the match succeeds, where the first target object is stored in the storage system;
a storage unit, configured to store the first object corresponding to the first fingerprint information in the storage system when the match fails;
a log generating unit, configured to acquire a first file size of the first file, generate upload log information from the first file size, and determine the upload log information as the log information.
Optionally, the log update module includes:
a determining unit, configured to acquire a file update operation request and a second file, and determine the update file specified by the file update operation request;
a first generating unit, configured to acquire the update file size of the update file, and generate deletion log information from the update file size;
a second generating unit, configured to store the second file in the storage system, acquire a second file size of the second file, and generate first log information from the second file size, where the first log information is upload log information;
a generating unit, configured to form the log information from the first log information and the deletion log information.
The present invention also provides deduplication rate calculation equipment, including a memory and a processor, where:
the memory is configured to store a computer program;
the processor is configured to execute the computer program to implement the deduplication rate calculation method described above.
The present invention also provides a computer-readable storage medium for storing a computer program, where the computer program, when executed by a processor, implements the deduplication rate calculation method described above.
The deduplication rate calculation method provided by the present invention acquires a file operation request, executes the file operation request by using an online deduplication method, generates corresponding log information, and adds the log information to a log. It reads the log and calculates the theoretical occupancy value of the storage system. It then acquires the actual occupancy value of the storage system and calculates the deduplication rate from the theoretical occupancy value and the actual occupancy value.
As can be seen, after acquiring a file operation request the method executes the request by using the online deduplication method, generates the corresponding log information, and adds it to the log. When the deduplication rate is calculated, the method uses the log to compute the theoretical occupancy value of the storage system, measures the actual occupancy value of the storage system at that moment, and calculates the deduplication rate from the two values. The method does not need to stop receiving service operations and can accurately calculate the deduplication rate of the storage system at the current moment. This improves the accuracy of the calculation and solves the problem that the deduplication rate calculated by existing methods is not accurate enough, which is of great significance for tuning the configuration parameters of deduplication algorithms and for improving efficiency and performance in their implementation.
In addition, the present invention also provides a deduplication rate calculation device, deduplication rate calculation equipment, and a computer-readable storage medium, which have the same beneficial effects.
Brief Description of the Drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a deduplication rate calculation method provided by an embodiment of the present invention;
FIG. 2 is a flowchart of theoretical value calculation provided by an embodiment of the present invention;
FIG. 3 is a flowchart of file operation request processing provided by an embodiment of the present invention;
FIG. 4 is a flowchart of another file operation request processing procedure provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a deduplication rate calculation device provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of deduplication rate calculation equipment provided by an embodiment of the present invention.
Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Please refer to FIG. 1, which is a flowchart of a deduplication rate calculation method provided by an embodiment of the present invention. The method includes:
S101: Acquire a file operation request, execute the file operation request by using an online deduplication method, generate corresponding log information, and add the log information to a log.
A file operation request is used to operate on a file or object in the storage system. The operation may be a download, deletion, copy, update, and so on, or it may add a new file or new object to the storage system. A file operation request may contain a single request or multiple requests; when it contains multiple requests, their types and the files or objects they target may be the same or different. A file operation request may include information such as a bucket name and a file name, which is used to locate the file or object being operated on.
Online deduplication, also called real-time deduplication, means that when a new file is added to the storage system, the fingerprint information of the file or object is first used to determine whether the file or object has already been stored. If it has, the file or object is not stored again and only the corresponding information is added; if it has not, the file or object is stored in the storage system. This approach reduces the amount of duplicate data written to disk, and duplicate data is removed in real time. After a file operation request has been executed with online deduplication, corresponding log information must be generated. Because duplicate data is removed in real time, the occupancy of the storage system without deduplication cannot be observed directly. The log information therefore records the size of the file or object involved in the operation, so that the theoretical change to the storage system caused by the request is preserved. Once generated, the log information is added to the log of the storage system.
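As a rough sketch of what such log information might look like, the example below records each operation as a JSON line holding the file size and a plus or minus state. The field names (`op`, `size`, `state`), the JSON encoding, and the helper names are assumptions for illustration only; the source does not specify a log format, and it names only uploads (plus) and deletions (minus) as examples of states.

```python
import json
from typing import List

# Hypothetical mapping from operation type to log state. Only these two
# cases are stated in the source; other operations are left undefined here.
STATE_BY_OPERATION = {"upload": "+", "delete": "-"}


def make_log_info(operation: str, file_size: int) -> str:
    """Build one piece of log information as a JSON line."""
    if operation not in STATE_BY_OPERATION:
        raise ValueError(f"no log state defined for operation: {operation}")
    if file_size < 0:
        raise ValueError("file size must be non-negative")
    record = {"op": operation, "size": file_size,
              "state": STATE_BY_OPERATION[operation]}
    return json.dumps(record)


def append_to_log(log: List[str], operation: str, file_size: int) -> None:
    """Add the log information for one executed request to the log."""
    log.append(make_log_info(operation, file_size))


log: List[str] = []
append_to_log(log, "upload", 4096)
append_to_log(log, "upload", 4096)  # duplicate content is still logged at full size
append_to_log(log, "delete", 4096)
```

The second upload is logged at its full size even if online deduplication stores no new data for it, which is what lets the log reflect the occupancy the system would have without deduplication.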
S102: Read the log and calculate the theoretical occupancy value of the storage system.
When the deduplication rate needs to be calculated, the log is read and the theoretical occupancy value of the storage system is computed. The embodiment of the present invention does not limit when the log is read. For example, the log may be read when an instruction to calculate the deduplication rate is detected; such an instruction may be entered manually by an operator or generated automatically when a specific instruction is detected. Alternatively, the deduplication rate may be calculated at a preset period, in which case the log is read at that period. The deduplication rate may also be calculated continuously, in which case the next log read starts as soon as the previous calculation finishes. Reading the log yields the theoretical occupancy value of the storage system, that is, the space the storage system would theoretically occupy if no data deduplication were performed.
S103: Acquire the actual occupancy value of the storage system, and calculate the deduplication rate by using the theoretical occupancy value and the actual occupancy value.
The actual occupancy value is the space occupied by the storage system after data deduplication has been performed. Once the theoretical occupancy value and the actual occupancy value have been obtained, dividing the theoretical occupancy value by the actual occupancy value gives the deduplication rate. The deduplication rate may be expressed in the form X:1.
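The division described above can be written out as a short sketch. The function names and the two-decimal formatting are assumptions for illustration; the source only states that the theoretical value is divided by the actual value and that the result may be written as X:1.

```python
def dedup_rate(theoretical: int, actual: int) -> float:
    """Deduplication rate = theoretical occupancy / actual occupancy."""
    if actual <= 0:
        raise ValueError("actual occupancy must be positive")
    if theoretical < 0:
        raise ValueError("theoretical occupancy must be non-negative")
    return theoretical / actual


def format_rate(rate: float) -> str:
    """Render the rate in the X:1 form mentioned in the text."""
    return f"{rate:.2f}:1"


# 300 GiB of logged data occupying 100 GiB on disk.
rate = dedup_rate(theoretical=300, actual=100)
label = format_rate(rate)
```

With these inputs the rate is 3.0, rendered as `3.00:1`, meaning the logged data would need three times the space if it were stored without deduplication.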
With the deduplication rate calculation method provided by the embodiment of the present invention, after a file operation request is acquired it is executed by using the online deduplication method, and the corresponding log information is generated and added to the log. When the deduplication rate is calculated, the log is used to compute the theoretical occupancy value of the storage system, the actual occupancy value of the storage system at that moment is measured, and the deduplication rate is calculated from the two values. The method does not need to stop receiving service operations and can accurately calculate the deduplication rate of the storage system at the current moment. This improves the accuracy of the calculation and solves the problem that the deduplication rate calculated by existing methods is not accurate enough, which is of great significance for tuning the configuration parameters of deduplication algorithms and for improving efficiency and performance in their implementation.
Based on the above embodiment, this embodiment describes a specific procedure for calculating the theoretical occupancy value. Please refer to FIG. 2, which is a flowchart of theoretical value calculation provided by an embodiment of the present invention, including:
S201: Read each piece of log information in the log, and determine the file size in each piece of log information and the state corresponding to the file size, where the state is a plus state or a minus state.
In this embodiment, the log information includes a file size and the state corresponding to that file size. Each piece of log information may contain one file size with one corresponding state, or several file sizes with their corresponding states. The state is either a plus state or a minus state: the plus state indicates that the theoretical occupancy value increases, and the minus state indicates that it decreases. The state recorded in the log information depends on the file operation request that was executed, and different kinds of request may map to the same state or to different states. For example, when the file operation request is a file upload operation request, the state in the generated log information is the plus state; when the file operation request is a file deletion request, the state is the minus state. Therefore, to calculate the theoretical occupancy value, each piece of log information in the log must be read and the file size and its corresponding state determined.
S202: Add up all file sizes that have the plus state to obtain a first occupancy value, and add up all file sizes that have the minus state to obtain a second occupancy value.
After the file sizes and their states have been determined, adding up all file sizes that have the plus state gives the first occupancy value. One option is to keep a running total: each time a file size with the plus state is read, it is added to the sum of the plus-state file sizes read so far, until all of them have been added. Another option is to record each plus-state file size as it is read and add them all up once every one has been recorded. Adding up all file sizes that have the minus state gives the second occupancy value. The second occupancy value may be computed in the same way as the first or in a different way; this embodiment does not limit this.
S203: Subtract the second occupancy value from the first occupancy value to obtain the theoretical occupancy value.
After the first and second occupancy values have been obtained, subtracting the second from the first gives the theoretical occupancy value of the storage system, that is, the space the storage system would occupy if no data deduplication were performed.
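Steps S201 to S203 can be sketched as follows. The representation of a log entry as a `(size, state)` tuple with `"+"` and `"-"` markers is an assumption made for the example; the source does not fix a log format.

```python
from typing import Iterable, Tuple

# Hypothetical log entry layout: (file size in bytes, "+" or "-").
LogInfo = Tuple[int, str]


def theoretical_occupancy(entries: Iterable[LogInfo]) -> int:
    """Sum plus-state sizes, sum minus-state sizes, and return the difference."""
    first_occupancy = 0   # total of all plus-state file sizes (S202)
    second_occupancy = 0  # total of all minus-state file sizes (S202)
    for size, state in entries:
        if state == "+":
            first_occupancy += size
        elif state == "-":
            second_occupancy += size
        else:
            raise ValueError(f"unknown state: {state!r}")
    return first_occupancy - second_occupancy  # S203


# Two uploads (100 and 50 bytes) and one deletion (30 bytes).
entries = [(100, "+"), (50, "+"), (30, "-")]
total = theoretical_occupancy(entries)
```

This version uses the running-total option described in S202, so the log can be streamed once without holding every entry in memory.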
Based on the above embodiment, the file operation request may be a file upload operation request. This embodiment describes how a file upload operation request is processed. Please refer to FIG. 3, which is a flowchart of file operation request processing provided by an embodiment of the present invention, including:
S301: Obtain a file upload operation request and a first file, and perform striped slicing on the first file to obtain multiple first objects.
Because the file operation request is a file upload operation request, the corresponding first file, that is, the file being uploaded, is obtained together with the request. The first file is then sliced into stripes to obtain multiple first objects. This embodiment does not limit the specific rules or procedure of striped slicing; reference may be made to the related art, and details are not repeated here. Note that there may also be only one first object: for example, when the first file is smaller than the slicing threshold and cannot be sliced, the first file itself is used as the first object.
S302: Calculate first fingerprint information for each first object, and match each piece of first fingerprint information against the fingerprint information database in turn.
Fingerprint information identifies a file or object and can be computed with a hash algorithm, for example MD5 or SHA-1. Once the first fingerprint information of each first object has been obtained, each piece of first fingerprint information is matched in turn against the fingerprint information database of the storage system; that is, it is determined whether the first fingerprint information is already stored in the fingerprint information database.
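As a rough illustration of S302, the sketch below computes fingerprints with Python's standard `hashlib` module and looks them up one by one. Representing the fingerprint information database as an in-memory `set` and the function names `fingerprint` and `match_fingerprints` are assumptions for illustration; the patent does not specify how the database is stored.

```python
import hashlib
from typing import List, Set, Tuple


def fingerprint(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of an object; MD5 and SHA-1 are the examples named in the text."""
    return hashlib.new(algorithm, data).hexdigest()


def match_fingerprints(objects: List[bytes], fingerprint_db: Set[str]) -> List[Tuple[str, bool]]:
    """Match each object's fingerprint against the database in turn.

    Returns (fingerprint, matched) pairs in the order of the input objects.
    """
    results = []
    for obj in objects:
        fp = fingerprint(obj)
        results.append((fp, fp in fingerprint_db))
    return results


# Hypothetical database that already knows the object b"AAAA".
db = {fingerprint(b"AAAA")}
for fp, matched in match_fingerprints([b"AAAA", b"BBBB"], db):
    print(fp, "hit" if matched else "miss")
```

A set gives constant-time membership checks on average, which matters because every uploaded slice triggers one lookup.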
S303: When the match succeeds, increment by one the reference count of the first target object corresponding to the first fingerprint information, where the first target object is stored in the storage system.
A successful match means that the first object corresponding to the first fingerprint information is already stored in the storage system; the stored object that corresponds to this first object is the first target object. Therefore, on a successful match, the reference count of the first target object corresponding to the first fingerprint information is incremented by one.
S304: When the match fails, store the first object corresponding to the first fingerprint information in the storage system.
A failed match means that the first object corresponding to the first fingerprint information is not yet stored in the storage system, so this first object is stored in the storage system. After the first object is stored, other operations may also be performed, for example setting the reference count of the first object to 1 and storing the corresponding first fingerprint information in the fingerprint information database.
S305: Obtain the first file volume of the first file, generate upload log information from the first file volume, and take the upload log information as the log information.
After all first objects have been processed, the first file volume of the first file is obtained, upload log information is generated from it, and the upload log information is taken as the log information. Specifically, the upload log information may include the state corresponding to the first file volume, namely the plus state, for use in calculating the theoretical occupancy value. Alternatively, the first file volume may be recorded as a positive number for use in the subsequent calculation of the theoretical occupancy value.
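The upload flow S301 to S305 can be sketched end to end as follows. This is a toy in-memory model under stated assumptions: the class name `DedupStore`, fixed-size slicing with a hypothetical `SLICE_SIZE`, SHA-1 fingerprints, and a log of `(state, volume)` tuples are all illustrative choices, since the text leaves the slicing rule and the storage layout open.

```python
import hashlib
from typing import Dict, List, Tuple

SLICE_SIZE = 4  # hypothetical slice size in bytes, kept tiny for the demo


class DedupStore:
    """Toy in-memory store with online deduplication (illustrative only)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}    # fingerprint -> stored object
        self.refcount: Dict[str, int] = {}     # fingerprint -> reference count
        self.log: List[Tuple[str, int]] = []   # (state, file volume)

    def upload(self, data: bytes) -> None:
        # S301: slice the file; a file with no full slice becomes one object.
        slices = [data[i:i + SLICE_SIZE] for i in range(0, len(data), SLICE_SIZE)] or [data]
        for obj in slices:
            fp = hashlib.sha1(obj).hexdigest()  # S302: fingerprint and look up
            if fp in self.objects:
                self.refcount[fp] += 1          # S303: match, bump the reference count
            else:
                self.objects[fp] = obj          # S304: no match, store the object
                self.refcount[fp] = 1
        self.log.append(("+", len(data)))       # S305: upload log entry, plus state

    def actual_occupancy(self) -> int:
        """Bytes actually stored after deduplication."""
        return sum(len(obj) for obj in self.objects.values())


store = DedupStore()
store.upload(b"AAAABBBB")
store.upload(b"AAAACCCC")
print(store.log)                 # [('+', 8), ('+', 8)]
print(store.actual_occupancy())  # 12: AAAA is stored only once
```

Note that the log records the full size of each uploaded file regardless of how many slices were deduplicated, which is what lets the log reproduce the occupancy the system would have without deduplication.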
Based on the above embodiment, the file operation request may be a file update operation request. This embodiment describes how a file update operation request is processed. Please refer to FIG. 4, which is another flowchart of file operation request processing provided by an embodiment of the present invention, including:
S401: Obtain a file update operation request and a second file, and determine the update file specified by the file update operation request.
Because a file update operation request updates an update file (the file to be updated) in the storage system, the corresponding second file is obtained together with the request. The file update operation request should record information identifying the update file, for example its bucket name and file name, or other information. The update file specified by the request can therefore be determined from the request.
S402: Obtain the update file volume of the update file, and generate deletion log information from the update file volume.
After the update file volume is obtained, deletion log information is generated from it. Specifically, the deletion log information may include the state corresponding to the update file volume, namely the minus state, for use in calculating the theoretical occupancy value. Alternatively, the update file volume may be recorded as a negative number for use in the subsequent calculation of the theoretical occupancy value.
S403: Store the second file in the storage system, obtain the second file volume of the second file, and generate first log information from the second file volume.
Note that the first log information is upload log information. For how upload log information is constructed, refer to step S305; details are not repeated here. Note also that the first log information and the deletion log information carry different states or signs.
S404: Combine the first log information and the deletion log information to form the log information.
The first log information and the deletion log information together form the log information corresponding to the file update operation request; once formed, the log information is added to the log. When the theoretical occupancy value is calculated, this log information can be parsed to recover the first log information and the deletion log information, which are then used in the calculation. Specifically, the first log information and the deletion log information may first be combined into a single file volume, with its state or sign, for the whole piece of log information, and that file volume and state or sign are then used to calculate the theoretical occupancy value. Alternatively, the log information may be parsed into two pieces of log information, the first log information and the deletion log information, and the theoretical occupancy value is calculated from the second file volume with its state or sign and from the update file volume with its state or sign.
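The two ways of consuming an update entry described for S404 can be compared in a short sketch. The `(state, volume)` record layout and the function names are assumptions for illustration; only the plus/minus semantics come from the text.

```python
from typing import List, Tuple

Record = Tuple[str, int]  # (state, file volume); hypothetical layout


def build_update_entry(update_file_volume: int, second_file_volume: int) -> List[Record]:
    """S402 to S404: one log entry made of an upload record and a deletion record."""
    deletion = ("-", update_file_volume)  # S402: old file leaves, minus state
    first = ("+", second_file_volume)     # S403: new file arrives, plus state
    return [first, deletion]              # S404: combined log information


def net_change(entry: List[Record]) -> int:
    """Option 1: collapse an entry into a single signed file volume."""
    return sum(volume if state == "+" else -volume for state, volume in entry)


def theoretical_occupancy(log: List[List[Record]]) -> int:
    """Option 2: treat every record separately, then subtract the two sums."""
    first = sum(v for entry in log for s, v in entry if s == "+")
    second = sum(v for entry in log for s, v in entry if s == "-")
    return first - second


# A 100-byte file is uploaded, then updated to a 60-byte version.
log = [[("+", 100)], build_update_entry(100, 60)]
print(net_change(log[1]))                 # -40
print(sum(net_change(e) for e in log))    # 60
print(theoretical_occupancy(log))         # 60
```

Both options produce the same theoretical occupancy value; collapsing first keeps one number per entry, while per-record processing keeps the first and second occupancy values available separately.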
The deduplication rate calculation apparatus provided by an embodiment of the present invention is described below. The apparatus described below and the deduplication rate calculation method described above may be cross-referenced.
Please refer to FIG. 5, which is a schematic structural diagram of a deduplication rate calculation apparatus provided by an embodiment of the present invention, including:
a log update module 510, configured to obtain a file operation request, execute the file operation request using an online deduplication method, generate corresponding log information, and add the log information to a log;
a first calculation module 520, configured to read the log and calculate a theoretical occupancy value of the storage system;
a second calculation module 530, configured to obtain an actual occupancy value of the storage system and calculate the deduplication rate from the theoretical occupancy value and the actual occupancy value.
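The role of the second calculation module can be sketched as below. This passage does not state the formula that combines the two values, so the sketch reports two commonly used conventions, a ratio and a saved fraction; both formulas and the name `dedup_metrics` are assumptions and should not be read as the formula used by the patent.

```python
from typing import Dict


def dedup_metrics(theoretical: int, actual: int) -> Dict[str, float]:
    """Combine the theoretical and actual occupancy values.

    Assumed conventions (not confirmed by the text):
      ratio          = theoretical / actual
      saved_fraction = 1 - actual / theoretical
    """
    if theoretical <= 0 or actual <= 0:
        raise ValueError("both occupancy values must be positive")
    return {
        "ratio": theoretical / actual,
        "saved_fraction": 1 - actual / theoretical,
    }


# 16 bytes were uploaded according to the log, 12 bytes are actually stored.
metrics = dedup_metrics(theoretical=16, actual=12)
print(metrics["saved_fraction"])  # 0.25
```

Whichever convention is used, the inputs are the same: the theoretical value comes from replaying the log, and the actual value is read from the storage system.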
Optionally, the first calculation module 520 includes:
a determination unit, configured to read each piece of log information in the log and determine the file volume in each piece of log information and the state corresponding to the file volume, where the state is a plus state or a minus state;
a first calculation unit, configured to add up all file volumes that have the plus state to obtain a first occupancy value, and add up all file volumes that have the minus state to obtain a second occupancy value;
a second calculation unit, configured to subtract the second occupancy value from the first occupancy value to obtain the theoretical occupancy value.
Optionally, the log update module 510 includes:
a first object obtaining unit, configured to obtain a file upload operation request and a first file, and perform striped slicing on the first file to obtain multiple first objects;
a matching unit, configured to calculate first fingerprint information for each first object, and match each piece of first fingerprint information against the fingerprint information database in turn;
a count modification unit, configured to increment by one, when the match succeeds, the reference count of the first target object corresponding to the first fingerprint information, where the first target object is stored in the storage system;
a storage unit, configured to store, when the match fails, the first object corresponding to the first fingerprint information in the storage system;
a log generating unit, configured to obtain the first file volume of the first file, generate upload log information from the first file volume, and take the upload log information as the log information.
Optionally, the log update module 510 includes:
a determining unit, configured to obtain a file update operation request and a second file, and determine the update file specified by the file update operation request;
a first generating unit, configured to obtain the update file volume of the update file, and generate deletion log information from the update file volume;
a second generating unit, configured to store the second file in the storage system, obtain the second file volume of the second file, and generate first log information from the second file volume, where the first log information is upload log information;
a generating unit, configured to combine the first log information and the deletion log information to form the log information.
The deduplication rate calculation device provided by an embodiment of the present invention is described below. The device described below and the deduplication rate calculation method described above may be cross-referenced.
Please refer to FIG. 6, which is a schematic structural diagram of a deduplication rate calculation device provided by an embodiment of the present invention. The device includes a memory and a processor, where:
the memory 610 is configured to store a computer program;
the processor 620 is configured to execute the computer program to implement the deduplication rate calculation method described above.
The computer-readable storage medium provided by an embodiment of the present invention is described below. The computer-readable storage medium described below and the deduplication rate calculation method described above may be cross-referenced.
The present invention further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of the deduplication rate calculation method described above are implemented.
The computer-readable storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should be noted that relational terms such as first and second are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device.
The deduplication rate calculation method, apparatus, device, and computer-readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, a person of ordinary skill in the art may, based on the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911122474.2A CN110908966B (en) | 2019-11-15 | 2019-11-15 | Method, device and equipment for calculating deduplication rate and readable storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN110908966A | 2020-03-24 |
| CN110908966B | 2022-06-10 |
Family
ID=69817582
Family Applications (1)
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN201911122474.2A | Active | CN110908966B (en) | 2019-11-15 | 2019-11-15 | Method, device and equipment for calculating deduplication rate and readable storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN110908966B (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103744783A (en) * | 2014-01-03 | 2014-04-23 | 华为技术有限公司 | Method for measuring performance of repeating data deleting and device |
| CN107391774A (en) * | 2017-09-15 | 2017-11-24 | 厦门大学 | The rubbish recovering method of JFS based on data de-duplication |
| CN109074226A (en) * | 2016-09-28 | 2018-12-21 | 华为技术有限公司 | Method for deduplicating data in storage system, storage system and controller |
| CN110399348A (en) * | 2019-07-19 | 2019-11-01 | 苏州浪潮智能科技有限公司 | File deduplication method, device, system, and computer-readable storage medium |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2011075610A1 (en) * | 2009-12-16 | 2011-06-23 | Renew Data Corp. | System and method for creating a de-duplicated data set |
| US9600201B2 (en) * | 2014-03-27 | 2017-03-21 | Hitachi, Ltd. | Storage system and method for deduplicating data |
Also Published As
| Publication number | Publication date |
|---|---|
| CN110908966A (en) | 2020-03-24 |
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |