
CN106445405B - Data access method and device for flash memory storage - Google Patents

Info

Publication number
CN106445405B
Authority
CN
China
Prior art keywords
data
storage
nvdimm
read
storage space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510498599.0A
Other languages
Chinese (zh)
Other versions
CN106445405A (en)
Inventor
吴忠杰
欧阳涛
Other inventors requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Memblaze Technology Co Ltd
Original Assignee
Beijing Memblaze Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Memblaze Technology Co Ltd
Priority to CN201911350374.5A (CN111007991B)
Priority to CN201510498599.0A (CN106445405B)
Priority to PCT/CN2016/094422 (WO2017025039A1)
Publication of CN106445405A
Application granted
Publication of CN106445405B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A data access method and device for flash memory storage are provided. The data access method is for a storage system that comprises a plurality of storage devices and provides a plurality of storage objects, each storage object being composed of storage resources on a storage device, the plurality of storage objects comprising one or more writable storage objects and a plurality of read-only storage objects. The method comprises: in response to a write request, writing data to a writable storage object in append mode; if the writable storage object is full, setting the writable storage object as a read-only storage object; and in response to a read request, reading data from a read-only storage object.
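The separation of writes and reads at the storage object layer can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `StorageObject` and `StorageSystem`, the record-count capacity, and the `(object_index, offset)` read addressing are all hypothetical, and a Python list stands in for the storage resources of a device.

```python
class StorageObject:
    """Toy storage object: append-only until full, then read-only (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity      # number of records the object can hold (assumption)
        self.records = []
        self.read_only = False

    def append(self, record):
        if self.read_only:
            raise ValueError("storage object is read-only")
        self.records.append(record)
        # Once the object is full it is set to read-only.
        if len(self.records) >= self.capacity:
            self.read_only = True


class StorageSystem:
    """Writes go to the writable object; reads are served from read-only objects."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.writable = StorageObject(capacity)
        self.read_only_objects = []

    def write(self, record):
        self.writable.append(record)
        if self.writable.read_only:
            # The full object joins the read-only set; a fresh writable object replaces it.
            self.read_only_objects.append(self.writable)
            self.writable = StorageObject(self.capacity)

    def read(self, object_index, offset):
        return self.read_only_objects[object_index].records[offset]
```

Because a sealed object never receives further writes, reads against it do not contend with append traffic in this sketch.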

Description

A data access method and device for flash memory storage

Technical Field

The present invention relates to high-performance storage systems. More specifically, it relates to a data access method and apparatus for a storage system in which read and write operations are separated at the storage object layer while data is output to a record carrier.

Background

In the prior art, a solid state drive (SSD) is usually accessed by logical block address (LBA). A data access generated in the file system must first determine the LBA and then access the corresponding SSD according to that LBA. Laying out data by LBA is fairly intuitive, and it allows data access performance to be improved through data caching and by exploiting the locality of application data.

This prior art evolved from disk technology and takes full advantage of the fact that disk storage devices have strong sequential access capability but weak random access capability. However, accessing an SSD by LBA cannot meet the requirement for consistent storage performance, and after the storage system has been in use for some time, the large number of data reclamation operations degrades overall system performance.

To improve the write performance of a storage system, a cache is generally provided in the system to buffer written data. One way to provide a write cache is to build the cache in memory: once a write request to a file or block device has been written to the in-memory cache, completion of the write operation can be returned to the host, and the data is then written to disk asynchronously by a background operation. This approach is usually called write-back. The other approach is to wait until the data in the cache has been synchronized to disk before returning completion of the write operation to the host; this is called write-through.
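The difference between the two policies can be sketched as follows. This is only an illustration: `WriteCache`, `disk`, and `flush` are hypothetical names, and Python dicts stand in for the in-memory cache and the disk.

```python
class WriteCache:
    """Toy write cache; dicts stand in for memory and disk (assumption)."""

    def __init__(self, write_through):
        self.write_through = write_through
        self.cache = {}   # in-memory cache: address -> data
        self.disk = {}    # stand-in for the backing disk

    def write(self, address, data):
        self.cache[address] = data
        if self.write_through:
            # Write-through: the data reaches the disk before completion is reported.
            self.disk[address] = data
        # Write-back: completion is reported now; flush() updates the disk later.
        return "complete"

    def flush(self):
        """Background step that copies cached data to the disk."""
        self.disk.update(self.cache)
        self.cache.clear()
```

With write-back, a power failure between `write` and `flush` loses the cached data if the cache is ordinary volatile memory.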

An NVDIMM is a memory for computing devices that combines DRAM-like (Dynamic Random Access Memory) high-speed data access with the data retention of non-volatile memory; even if power is lost unexpectedly, the data stored in the NVDIMM is not lost.

Referring to FIG. 1, Chinese patent application publication No. CN104239226A discloses an iSCSI storage server that uses an NVDIMM as a cache. While the storage server is operating, all iSCSI read and write commands complete their read and write operations through the disk cache. When the storage server receives an iSCSI read command, it first looks in the disk cache; if the corresponding data is found, it is returned directly to the client; if not, the data is read from disk into the disk cache and then returned to the client. When the storage server receives an iSCSI write command, it copies the data directly into the corresponding area of the disk cache.

In the prior art, the NVDIMM serves as both a read cache and a write cache, which places high demands on NVDIMM capacity. For streaming data access requests, prior-art caching schemes struggle to achieve high utilization, and frequent cache misses cause the overall performance of the storage system to thrash.

Summary of the Invention

One object of the present invention is to improve write performance while ensuring that data is not lost on power failure, thereby balancing write performance and data reliability. A further object is to provide an effective cache replacement mechanism for streaming big data that improves cache utilization, reduces performance thrashing in the storage system, and gives the storage system consistent latency. A further object is to provide a way of distributing data on flash storage media that reduces interference between read and write operations and keeps read latency within a bounded range, so that the flash storage system achieves consistent performance as well as optimal performance. In addition, the invention aims to extend the service life of the flash memory while improving system performance.

According to a first aspect of the present invention, there is provided a first NVDIMM-based data write caching method according to the first aspect of the present invention, comprising: receiving a first data write request, the first data write request indicating that first data is to be written to a first address; in response to receiving the first data write request, writing the first data to an NVDIMM; in response to completion of the operation of writing the first data to the NVDIMM, sending a message indicating that the first data write request is complete; in response to receiving the first data write request, also writing the first data to a storage device; and in response to completion of the operation of writing the first data to the storage device, releasing the storage space occupied by the first data in the NVDIMM.
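The sequence of steps in this first method can be sketched as follows. It is a simplified, single-threaded illustration: `NvdimmWriteCache`, `handle_write`, `destage`, and the request identifiers are hypothetical names, and dicts stand in for the NVDIMM and the storage device. The write to the storage device is shown as a separate step that a caller may run later or concurrently.

```python
class NvdimmWriteCache:
    """Toy model of the write path; dicts stand in for the hardware (assumption)."""

    def __init__(self):
        self.nvdimm = {}      # request id -> (address, data) held in the NVDIMM
        self.storage = {}     # address -> data on the storage device
        self.completed = []   # request ids for which completion was reported
        self._next_id = 0

    def handle_write(self, address, data):
        request_id = self._next_id
        self._next_id += 1
        # Step 1: write the data to the NVDIMM.
        self.nvdimm[request_id] = (address, data)
        # Step 2: once the NVDIMM write is done, report completion to the requester.
        self.completed.append(request_id)
        return request_id

    def destage(self, request_id):
        # Step 3: write the same data to the storage device.
        address, data = self.nvdimm[request_id]
        self.storage[address] = data
        # Step 4: once the storage write is done, release the NVDIMM space.
        del self.nvdimm[request_id]
```

The requester sees completion after step 2, so its latency is governed by the NVDIMM write rather than by the storage device.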

According to the first NVDIMM-based data write caching method of the first aspect of the present invention, there is provided a second method according to the first aspect, wherein the operation of writing the first data to the NVDIMM and the operation of writing the first data to the storage device are performed in parallel.

According to the first or second NVDIMM-based data write caching method of the first aspect of the present invention, there is provided a third method according to the first aspect, wherein writing the first data to the NVDIMM comprises: generating a first data block that includes the first data, the first address, and a sequence number, the sequence number being incremented each time a sequence number is generated; and writing the first data block to the NVDIMM.
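A data block of this kind might be modelled as below. The field names, the `make_block` helper, and the use of a process-wide counter are assumptions for illustration; the source only requires that each block carry the data, the address, and a sequence number that is incremented on every generation.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBlock:
    sequence: int   # incremented each time a sequence number is generated
    address: int    # target address named in the write request
    data: bytes     # payload of the write request


# Hypothetical monotonically increasing sequence generator.
_sequence = itertools.count()


def make_block(address, data):
    """Build the block that would be written to the NVDIMM."""
    return DataBlock(sequence=next(_sequence), address=address, data=data)
```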

According to the first to third NVDIMM-based data write caching methods of the first aspect of the present invention, there is provided a fourth method according to the first aspect, further comprising: receiving a second data write request, the second data write request indicating that second data is to be written to a second address; in response to receiving the second data write request, generating a second data block that includes the second data, the second address, and a sequence number; and writing the second data block to the NVDIMM.

According to the first to third NVDIMM-based data write caching methods of the first aspect of the present invention, there is provided a fifth method according to the first aspect, further comprising: in response to receiving a normal shutdown message, writing a first mark to the NVDIMM.

According to the fourth NVDIMM-based data write caching method of the first aspect of the present invention, there is provided a sixth method according to the first aspect, further comprising: in response to receiving a normal shutdown message, writing a first mark to the NVDIMM.

According to the sixth NVDIMM-based data write caching method of the first aspect of the present invention, there is provided a seventh method according to the first aspect, further comprising: in response to power-on, if the first mark cannot be read from the NVDIMM, reading the first data block and the second data block from the NVDIMM, and writing the first data and the second data contained in the first data block and the second data block to the storage device in ascending order of the sequence numbers in the first data block and the second data block.
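The power-on recovery described here can be sketched as follows. The mark value `CLEAN_SHUTDOWN_MARK`, the `recover` function, and the tuple layout of the blocks are hypothetical; a set stands in for the marks readable from the NVDIMM and a dict for the storage device.

```python
CLEAN_SHUTDOWN_MARK = "clean-shutdown"   # hypothetical value of the first mark


def recover(nvdimm_blocks, marks, storage):
    """Replay NVDIMM blocks to storage after an unclean shutdown.

    nvdimm_blocks: iterable of (sequence, address, data) tuples
    marks:         set of marks readable from the NVDIMM
    storage:       dict standing in for the storage device
    Returns the number of blocks replayed.
    """
    if CLEAN_SHUTDOWN_MARK in marks:
        # The mark was written on normal shutdown, so nothing needs replaying.
        return 0
    replayed = 0
    # Ascending sequence order ensures a later write to an address wins.
    for _sequence, address, data in sorted(nvdimm_blocks, key=lambda b: b[0]):
        storage[address] = data
        replayed += 1
    return replayed
```

Replaying in ascending sequence order matters when two cached blocks target the same address: the block generated later must be applied last.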

According to the fourth NVDIMM-based data write caching method of the first aspect of the present invention, there is provided an eighth method according to the first aspect, further comprising: in response to receiving an abnormal shutdown message, writing a second mark to the NVDIMM; and in response to power-on, if the second mark is read from the NVDIMM, reading the first data block and the second data block from the NVDIMM, and writing the first data and the second data contained in the first data block and the second data block to the storage device in ascending order of the sequence numbers in the first data block and the second data block.

According to the foregoing NVDIMM-based data write caching methods of the first aspect of the present invention, there is provided a ninth method according to the first aspect, wherein, in response to the first data being written to the NVDIMM, the storage space occupied by the first data on the NVDIMM is marked as occupied; and in response to release of the storage space occupied by the first data in the NVDIMM, the storage space occupied by the first data on the NVDIMM is marked as free.

According to the ninth NVDIMM-based data write caching method of the first aspect of the present invention, there is provided a tenth method according to the first aspect, wherein, when the first data is written to the NVDIMM, the first data is written to storage space of the NVDIMM that is marked as free.

According to the foregoing NVDIMM-based data write caching methods of the first aspect of the present invention, there is provided an eleventh method according to the first aspect, further comprising: writing the first address to the storage device.

According to a second aspect of the present invention, there is provided a first NVDIMM-based data write caching device according to the second aspect of the present invention, comprising: a receiving module for receiving a first data write request, the first data write request indicating that first data is to be written to a first address; an NVDIMM writing module for writing the first data to an NVDIMM in response to receiving the first data write request; a message sending module for sending a message indicating that the first data write request is complete in response to completion of the operation of writing the first data to the NVDIMM; a storage device writing module for also writing the first data to a storage device in response to receiving the first data write request; and an NVDIMM release module for releasing the storage space occupied by the first data in the NVDIMM in response to completion of the operation of writing the first data to the storage device.

According to a third aspect of the present invention, there is provided a first NVDIMM-based data write caching method according to the third aspect of the present invention, comprising: receiving a first data write request, the first data write request indicating that first data is to be written to a first address; in response to receiving the first data write request, writing the first data to an NVDIMM; in response to completion of the operation of writing the first data to the NVDIMM, sending a message indicating that the first data write request is complete; receiving a second data write request, the second data write request indicating that second data is to be written to a second address; in response to receiving the second data write request, writing the second data to the NVDIMM; in response to completion of the operation of writing the second data to the NVDIMM, sending a message indicating that the second data write request is complete; generating a first storage data block that includes the first data, the first address, the second data, and the second address; writing the first storage data block to a storage device; and in response to completion of the operation of writing the first storage data block to the storage device, releasing the storage space occupied by the first data and the second data in the NVDIMM.
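The aggregation step of the third aspect can be sketched as follows: several cached writes are combined into one storage data block, written to the storage device in a single operation, and only then released from the NVDIMM. The names `aggregate` and `destage`, the dict layout of the block, and the list standing in for the storage device are all assumptions.

```python
def aggregate(cached_writes):
    """Combine cached (address, data) pairs into one storage data block."""
    return {"entries": [(address, data) for address, data in cached_writes]}


def destage(nvdimm, storage_log):
    """Write all cached entries to storage as one block, then free the NVDIMM.

    nvdimm:      list of (address, data) pairs currently held in the NVDIMM
    storage_log: list standing in for the storage device (one item per write)
    """
    block = aggregate(nvdimm)
    storage_log.append(block)   # a single write carries both data and addresses
    nvdimm.clear()              # release the space only after the write is done
    return block
```

Keeping the addresses inside the storage data block means the block is self-describing once it is on the storage device.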

According to the first NVDIMM-based data write caching method of the third aspect of the present invention, there is provided a second method according to the third aspect, wherein writing the first data to the NVDIMM comprises: generating a first data block that includes the first data, the first address, and a sequence number; and writing the first data block to the NVDIMM; and wherein writing the second data to the NVDIMM comprises: generating a second data block that includes the second data, the second address, and a sequence number; and writing the second data block to the NVDIMM; the sequence number being incremented each time a sequence number is generated.

According to the first or second NVDIMM-based data write caching method of the third aspect of the present invention, there is provided a third method according to the third aspect, further comprising: in response to receiving a normal shutdown message, writing a first mark to the NVDIMM.

According to the third NVDIMM-based data write caching method of the third aspect of the present invention, there is provided a fourth method according to the third aspect, further comprising: in response to power-on, if the first mark cannot be read from the NVDIMM, reading the first data block and the second data block from the NVDIMM, and writing the first data and the second data contained in the first data block and the second data block to the storage device in ascending order of the sequence numbers in the first data block and the second data block.

According to the foregoing NVDIMM-based data write caching methods of the third aspect of the present invention, there is provided a fifth method according to the third aspect, wherein, in response to the first data being written to the NVDIMM, the storage space occupied by the first data on the NVDIMM is marked as occupied; and in response to release of the storage space occupied by the first data in the NVDIMM, the storage space occupied by the first data on the NVDIMM is marked as free.

According to the fifth NVDIMM-based data write caching method of the third aspect of the present invention, there is provided a sixth method according to the third aspect, wherein, when the first data is written to the NVDIMM, the first data is written to storage space of the NVDIMM that is marked as free.

According to a fourth aspect of the present invention, there is provided a first NVDIMM-based data write caching device according to the fourth aspect of the present invention, comprising: a first receiving module for receiving a first data write request, the first data write request indicating that first data is to be written to a first address; a first NVDIMM writing module for writing the first data to an NVDIMM in response to receiving the first data write request; a first message sending module for sending a message indicating that the first data write request is complete in response to completion of the operation of writing the first data to the NVDIMM; a second receiving module for receiving a second data write request, the second data write request indicating that second data is to be written to a second address; a second NVDIMM writing module for writing the second data to the NVDIMM in response to receiving the second data write request; a second message sending module for sending a message indicating that the second data write request is complete in response to completion of the operation of writing the second data to the NVDIMM; a data aggregation module for generating a first storage data block that includes the first data, the first address, the second data, and the second address; a storage device writing module for writing the first storage data block to a storage device; and an NVDIMM release module for releasing the storage space occupied by the first data and the second data in the NVDIMM in response to completion of the operation of writing the first storage data block to the storage device.

According to a fifth aspect of the present invention, there is provided a computer program whose program code, when loaded into a computer system and executed on it, causes the computer system to perform one of the methods provided according to the first and third aspects of the present invention.

According to a sixth aspect of the present invention, there is provided a computer comprising: a machine-readable memory for storing program instructions; and one or more processors for executing the program instructions stored in the memory; the program instructions causing the one or more processors to perform one of the methods provided according to the first and third aspects of the present invention.

According to a seventh aspect of the present invention, there is provided a program that causes a computer to perform one of the methods provided according to the first and third aspects of the present invention.

According to an eighth aspect of the present invention, there is provided a computer-readable storage medium having a program recorded thereon, wherein the program causes a computer to perform one of the methods provided according to the first and third aspects of the present invention.

According to a ninth aspect of the present invention, there is provided a first write cache release method for streaming data processing according to the ninth aspect of the present invention, wherein the write cache includes one or more free storage spaces and a data structure is provided to index the one or more free storage spaces, the method comprising: in response to receiving a request to release a first storage space of the write cache, determining whether a storage space adjacent to the first storage space is a free storage space; if a first free storage space adjacent to the first storage space is found, modifying a first node in the data structure that indexes the first free storage space so that the first node indexes both the first free storage space and the first storage space; and if no first free storage space adjacent to the first storage space is found, adding a new node to the data structure to index the first storage space.

According to the ninth aspect of the present invention, there is provided a second write cache release method for streaming data processing according to the ninth aspect, wherein the write cache includes one or more free storage spaces and a data structure is provided to index the one or more free storage spaces, the data structure including a plurality of nodes, each node indexing one free storage space, the method comprising: in response to receiving a request to release a first storage space of the write cache, determining whether the free storage spaces before and after the first storage space are adjacent to the first storage space;

if a first free storage space adjacent to the first storage space is found, merging the first free storage space and the first storage space; and if no first free storage space adjacent to the first storage space is found, adding a new node to the data structure to index the first storage space.

According to the second write cache release method for streaming data processing of the ninth aspect of the present invention, there is provided a third write cache release method for streaming data processing according to the ninth aspect, wherein merging the first free storage space and the first storage space comprises: modifying a first node in the data structure that indexes the first free storage space so that the first node indexes both the first free storage space and the first storage space.

According to the second write cache release method for streaming data processing of the ninth aspect of the present invention, there is provided a fourth write cache release method for streaming data processing according to the ninth aspect, wherein, if a first free storage space and a second free storage space adjacent to the first storage space are found, the first free storage space and the second free storage space are merged, the address of the first free storage space preceding the first storage space and the address of the second free storage space following the first storage space.
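The release-and-merge behaviour described in the preceding methods can be sketched with a sorted list of free extents. The choice of a sorted Python list of `[start, length]` nodes, the `release` function, and the use of `bisect` are assumptions for illustration; the source does not name a specific data structure. In the two-sided case the sketch also absorbs the released space into the merged node, which is an interpretation.

```python
import bisect


def release(free_nodes, start, length):
    """Return the extent [start, start + length) to the free list.

    free_nodes is a list of [start, length] nodes sorted by start address,
    one node per free storage space (hypothetical layout).
    """
    i = bisect.bisect_left(free_nodes, [start, 0])
    # Does the free space before the released space end exactly where it starts?
    prev_adjacent = i > 0 and free_nodes[i - 1][0] + free_nodes[i - 1][1] == start
    # Does the free space after the released space start exactly where it ends?
    next_adjacent = i < len(free_nodes) and start + length == free_nodes[i][0]

    if prev_adjacent and next_adjacent:
        # Free space on both sides: one node absorbs all three extents.
        free_nodes[i - 1][1] += length + free_nodes[i][1]
        del free_nodes[i]
    elif prev_adjacent:
        free_nodes[i - 1][1] += length          # grow the preceding node
    elif next_adjacent:
        free_nodes[i][0] = start                # grow the following node downward
        free_nodes[i][1] += length
    else:
        free_nodes.insert(i, [start, length])   # no neighbour: add a new node
```

Merging on release keeps the number of nodes small when a streaming workload frees its buffers in roughly the order it allocated them.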

根据本发明第九方面的前述用于流式数据处理的写缓存释放方法,提供了根据本发明第九方面的第五用于流式数据处理的写缓存释放方法,还包括:在所述第一存储空间中设置标记,用以指示所述第一存储空间为空闲。According to the aforementioned write cache release method for streaming data processing according to the ninth aspect of the present invention, there is provided a fifth write cache release method for streaming data processing according to the ninth aspect of the present invention, further comprising: in the first A flag is set in a storage space to indicate that the first storage space is free.

根据本发明第九方面的前述用于流式数据处理的写缓存释放方法,提供了根据本发明第九方面的第六用于流式数据处理的写缓存释放方法,其中提供指针,所述指针指向的节点索引所述第一空闲存储空间;响应于接收到分配存储空间的请求,从所述指针指向的节点开始查找空闲存储空间。According to the aforementioned write cache release method for streaming data processing according to the ninth aspect of the present invention, there is provided a sixth write cache release method for streaming data processing according to the ninth aspect of the present invention, wherein a pointer is provided, the pointer The pointed node indexes the first free storage space; in response to receiving a request for allocating storage space, the free storage space is searched from the node pointed to by the pointer.

根据本发明第九方面的第六用于流式数据处理的写缓存释放方法,提供了根据本发明第九方面的第七用于流式数据处理的写缓存释放方法,还包括:若所述指针指向的节点可满足所述分配存储空间的请求,从所述指针指向的节点所索引的空闲存储空间中分配空闲存储空间以响应所述分配存储空间的请求。The sixth write cache release method for streaming data processing according to the ninth aspect of the present invention provides the seventh write cache release method for streaming data processing according to the ninth aspect of the present invention, further comprising: if the The node pointed to by the pointer can satisfy the request for allocating storage space, and free storage space is allocated from the free storage space indexed by the node pointed to by the pointer in response to the request for allocating storage space.

According to the seventh write cache release method for streaming data processing of the ninth aspect of the present invention, an eighth write cache release method for streaming data processing according to the ninth aspect of the present invention is provided, further comprising: if no third node indexing free storage space that can satisfy the request is found, waiting for free storage space that can satisfy the request to become available.

According to the second write cache release method for streaming data processing of the ninth aspect of the present invention, a ninth write cache release method for streaming data processing according to the ninth aspect of the present invention is provided, wherein merging the first free storage space with the first storage space comprises: adding a second node to the data structure to index the first storage space and the first free storage space; and deleting from the data structure the first node that indexes the first free storage space.

According to the fourth write cache release method for streaming data processing of the ninth aspect of the present invention, a tenth write cache release method for streaming data processing according to the ninth aspect of the present invention is provided, comprising: modifying, in the data structure, the first node that indexes the first free storage space or the second node that indexes the second free storage space, so that the first node or the second node indexes the first free storage space, the first storage space, and the second free storage space.

According to the fourth write cache release method for streaming data processing of the ninth aspect of the present invention, an eleventh write cache release method for streaming data processing according to the ninth aspect of the present invention is provided, comprising: adding a new node to the data structure to index the first free storage space, the first storage space, and the second free storage space; and deleting from the data structure the first node that indexes the first free storage space and the second node that indexes the second free storage space.

According to the foregoing write cache release methods for streaming data processing of the ninth aspect of the present invention, a twelfth write cache release method for streaming data processing according to the ninth aspect of the present invention is provided, wherein the storage space of the write cache is organized as a ring buffer.

According to the foregoing write cache release methods for streaming data processing of the ninth aspect of the present invention, a thirteenth write cache release method for streaming data processing according to the ninth aspect of the present invention is provided, wherein a node indexing a free storage space comprises recording in the node the start address and the length of the free storage space it references.

According to the foregoing write cache release methods for streaming data processing of the ninth aspect of the present invention, a fourteenth write cache release method for streaming data processing according to the ninth aspect of the present invention is provided, wherein the plurality of nodes are sorted by the addresses of the free storage spaces they index.
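The release-and-merge behavior described in the ninth aspect can be illustrated with a minimal sketch. It assumes the data structure is a Python list of `[start, length]` nodes sorted by start address; the class name `FreeList` and the method name `release` are hypothetical, and ring-buffer wrap-around is deliberately left out.

```python
import bisect


class FreeList:
    """Free extents of a write cache, kept sorted by start address.

    Each node is a [start, length] pair (start address plus length).
    All names here are illustrative assumptions, not taken from the patent.
    """

    def __init__(self):
        self.nodes = []  # list of [start, length], sorted by start

    def release(self, start, length):
        """Return an extent to the free list, merging with adjacent free extents."""
        # Index of the first node whose start address is >= the released start.
        i = bisect.bisect_left(self.nodes, [start, 0])
        prev_adjacent = i > 0 and sum(self.nodes[i - 1]) == start
        next_adjacent = i < len(self.nodes) and self.nodes[i][0] == start + length
        if prev_adjacent and next_adjacent:
            # Free space on both sides: grow the earlier node, drop the later one.
            self.nodes[i - 1][1] += length + self.nodes[i][1]
            del self.nodes[i]
        elif prev_adjacent:
            # Only the preceding extent is free: extend it forward.
            self.nodes[i - 1][1] += length
        elif next_adjacent:
            # Only the following extent is free: extend it backward.
            self.nodes[i][0] = start
            self.nodes[i][1] += length
        else:
            # No free neighbor: add a new node for the released extent.
            self.nodes.insert(i, [start, length])
```

This sketch modifies an existing node when a neighbor is free, which corresponds to the "modify the node" variants; the "add a node and delete the old ones" variants would produce the same final list.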

According to a tenth aspect of the present invention, a first write cache allocation method for streaming data processing according to the tenth aspect of the present invention is provided, wherein the write cache includes one or more free storage spaces; a data structure is provided to index the one or more free storage spaces, the data structure including a plurality of nodes, each node indexing one free storage space; and a first pointer is provided that points to one of the plurality of nodes. The method comprises: in response to receiving a first request to allocate a first storage space, searching for free storage space starting from the first node to which the pointer points; if the first node can satisfy the first request, allocating free storage space starting from the low address of the free storage space indexed by the first node, in response to the first request; and modifying the first node so that it indexes the free storage space remaining after the first request has been served.

According to the first write cache allocation method for streaming data processing of the tenth aspect of the present invention, a second write cache allocation method for streaming data processing according to the tenth aspect of the present invention is provided, further comprising: if the first node cannot satisfy the first request, waiting until the first node has free storage space that can satisfy the first request.

According to the first write cache allocation method for streaming data processing of the tenth aspect of the present invention, a third write cache allocation method for streaming data processing according to the tenth aspect of the present invention is provided, further comprising: if the first node cannot satisfy the first request, traversing the plurality of nodes in order to search for free storage space that can satisfy the first request.

According to the third write cache allocation method for streaming data processing of the tenth aspect of the present invention, a fourth write cache allocation method for streaming data processing according to the tenth aspect of the present invention is provided, further comprising: if no node indexing free storage space that can satisfy the first request is found, waiting for free storage space that can satisfy the first request to become available.

According to the foregoing write cache allocation methods for streaming data processing of the tenth aspect of the present invention, a fifth write cache allocation method for streaming data processing according to the tenth aspect of the present invention is provided, further comprising: in response to receiving a request to release a second storage space, checking whether the free storage spaces before and after the second storage space are adjacent to it; if a first free storage space adjacent to the second storage space is found, merging the first free storage space with the second storage space and modifying the node that indexes the first free storage space so that it indexes the merged first free storage space and second storage space; and if no first free storage space adjacent to the second storage space is found, adding a new node to the data structure to index the second storage space.

According to the foregoing write cache allocation methods for streaming data processing of the tenth aspect of the present invention, a sixth write cache allocation method for streaming data processing according to the tenth aspect of the present invention is provided, wherein a node indexing a free storage space comprises recording in the node the start address and the length of the free storage space it references.

According to the foregoing write cache allocation methods for streaming data processing of the tenth aspect of the present invention, a seventh write cache allocation method for streaming data processing according to the tenth aspect of the present invention is provided, wherein the plurality of nodes are sorted in increasing order of the addresses of the free storage spaces they index.

According to the foregoing write cache allocation methods for streaming data processing of the tenth aspect of the present invention, an eighth write cache allocation method for streaming data processing according to the tenth aspect of the present invention is provided, wherein the storage space of the write cache is organized as a ring buffer.
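The allocation steps of the tenth aspect might look roughly like the sketch below. It assumes nodes are `[start, length]` pairs in increasing address order and models the first pointer as a list index named `cursor`. Moving the cursor to the node that served the request, wrapping the traversal around the end of the list, and returning `None` to signal "the caller must wait" are assumptions made for illustration; the class and method names are hypothetical.

```python
class FreeSpaceAllocator:
    """Allocates from a sorted list of free extents, starting at a roving pointer."""

    def __init__(self, extents):
        # Nodes are [start, length] pairs sorted by increasing start address.
        self.nodes = sorted(list(e) for e in extents)
        self.cursor = 0  # index of the node the "first pointer" refers to

    def allocate(self, size):
        """Return the start address of the allocated extent, or None if none fits."""
        n = len(self.nodes)
        for step in range(n):
            # Begin at the pointed-to node, then traverse the others in order.
            i = (self.cursor + step) % n
            start, length = self.nodes[i]
            if length >= size:
                if length == size:
                    # The node is used up entirely, so remove it.
                    del self.nodes[i]
                    self.cursor = i % len(self.nodes) if self.nodes else 0
                else:
                    # Carve from the low address; the node keeps the remainder.
                    self.nodes[i] = [start + size, length - size]
                    self.cursor = i
                return start
        return None  # no node can satisfy the request; the caller would wait
```

Allocating from the low address keeps writes moving forward through the cache, which suits the ring-buffer organization mentioned above.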

According to an eleventh aspect of the present invention, an NVDIMM-based data write caching method is provided, comprising: receiving a first data write request, the first data write request indicating that first data is to be written to a first address; in response to receiving the first data write request, writing the first data to an NVDIMM, wherein the NVDIMM includes one or more free storage spaces, a data structure is provided to index the one or more free storage spaces, the data structure includes a plurality of nodes, each node indexing one free storage space, and a first pointer is provided that points to one of the plurality of nodes; searching for free storage space starting from the first node to which the pointer points; if the first node can satisfy the first request, allocating a first free storage space starting from the low address of the free storage space indexed by the first node; modifying the first node so that it indexes the free storage space remaining after the first request has been served; writing the first data to the first free storage space; in response to completion of the operation of writing the first data to the NVDIMM, sending a message indicating that the first data write request is complete; in response to receiving the first data write request, also writing the first data to a storage device; and, in response to completion of the operation of writing the first data to the storage device, releasing the storage space occupied by the first data in the NVDIMM.
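The ordering of events in this write path (acknowledge once the NVDIMM holds the data, release NVDIMM space only after the storage device holds it) can be sketched as follows. The NVDIMM and the storage device are simulated with dictionaries, space accounting is reduced to a byte counter rather than the node-based free list, and every name is a hypothetical chosen for illustration.

```python
class NvdimmWriteCache:
    """Toy model of the write path: NVDIMM first, storage device afterwards."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.cached = {}  # address -> data held in the simulated NVDIMM
        self.disk = {}    # address -> data on the simulated storage device
        self.acks = []    # completion messages sent to the requester

    def write(self, address, data):
        """Stage data in the NVDIMM and acknowledge; False means no space yet."""
        old = self.cached.get(address, b"")
        if self.used - len(old) + len(data) > self.capacity:
            return False  # the caller would wait for space to be released
        self.cached[address] = data
        self.used += len(data) - len(old)
        # The request is reported complete as soon as the NVDIMM holds the data.
        self.acks.append(address)
        return True

    def flush(self, address):
        """Write staged data to the storage device, then free its NVDIMM space."""
        data = self.cached[address]
        self.disk[address] = data
        del self.cached[address]
        self.used -= len(data)
```

In a real system `flush` would run asynchronously after the acknowledgment; here it is an explicit call so the ordering is easy to follow.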

According to a twelfth aspect of the present invention, a write cache release apparatus is provided, wherein the write cache includes one or more free storage spaces and a data structure is provided to index the one or more free storage spaces. The apparatus comprises: a free storage space search module, configured to, in response to receiving a request to release a first storage space of the write cache, check whether a storage space adjoining the first storage space is a free storage space; a storage space merging module, configured to, if a first free storage space adjoining the first storage space is found, modify the first node in the data structure that indexes the first free storage space so that the first node indexes the first free storage space and the first storage space; and an index adding module, configured to, if no first free storage space adjoining the first storage space is found, add a new node to the data structure to index the first storage space.

According to a thirteenth aspect of the present invention, a write cache allocation apparatus is provided, wherein the write cache includes one or more free storage spaces; a data structure is provided to index the one or more free storage spaces, the data structure including a plurality of nodes, each node indexing one free storage space; and a first pointer is provided that points to one of the plurality of nodes. The apparatus comprises: a free storage space search module, configured to, in response to receiving a first request to allocate a first storage space, search for free storage space starting from the first node to which the pointer points; a storage space allocation module, configured to, if the first node can satisfy the first request, allocate free storage space starting from the low address of the free storage space indexed by the first node, in response to the first request; and a node modification module, configured to modify the first node so that it indexes the free storage space remaining after the first request has been served.

According to a fourteenth aspect of the present invention, an NVDIMM-based apparatus for writing data is provided, comprising: a receiving module, configured to receive a first data write request, the first data write request indicating that first data is to be written to a first address; and an NVDIMM writing module, configured to, in response to receiving the first data write request, write the first data to an NVDIMM, wherein the NVDIMM includes one or more free storage spaces, a data structure is provided to index the one or more free storage spaces, the data structure includes a plurality of nodes, each node indexing one free storage space, and a first pointer is provided that points to one of the plurality of nodes. The NVDIMM writing module comprises: a free storage space search module, configured to search for free storage space starting from the first node to which the pointer points; a storage space allocation module, configured to, if the first node can satisfy the first request, allocate a first free storage space from the free storage space indexed by the first node; a node modification module, configured to modify the first node so that it indexes the free storage space remaining after the first request has been served; and a data writing module, configured to write the first data to the first free storage space. The NVDIMM-based apparatus for writing data further comprises: a message sending module, configured to, in response to completion of the operation of writing the first data to the NVDIMM, send a message indicating that the first data write request is complete; a storage device writing module, configured to, in response to receiving the first data write request, also write the first data to a storage device; and an NVDIMM release module, configured to, in response to completion of the operation of writing the first data to the storage device, release the storage space occupied by the first data in the NVDIMM.

According to a fifteenth aspect of the present invention, a computer is provided, comprising: a machine-readable memory for storing program instructions; and one or more processors for executing the program instructions stored in the memory, the program instructions causing the one or more processors to perform one of the methods provided according to the ninth through eleventh aspects of the present invention.

According to a sixteenth aspect of the present invention, a program is provided that causes a computer to perform one of the methods provided according to the ninth and eleventh aspects of the present invention.

According to a seventeenth aspect of the present invention, a computer-readable storage medium having a program recorded thereon is provided, wherein the program causes a computer to perform one of the methods provided according to the ninth and eleventh aspects of the present invention.

According to an eighteenth aspect of the present invention, a first data access method for a storage system according to the eighteenth aspect of the present invention is provided. The storage system includes a plurality of storage devices and provides a plurality of storage objects, each storage object being composed of storage resources on the storage devices, and the plurality of storage objects including one or more writable storage objects and a plurality of read-only storage objects. The method comprises: in response to a write request, writing data to the writable storage object in an append-write/sequential-write manner; if the writable storage object is full, setting the writable storage object as a read-only storage object; and, in response to a read request, reading data from the read-only storage object.

According to the first data access method for a storage system of the eighteenth aspect of the present invention, a second data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein a storage object includes a first storage space from a first storage device and a second storage space from a second storage device.

According to the first data access method for a storage system of the eighteenth aspect of the present invention, a third data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein a storage object includes a portion of a first contiguous storage space from a first storage device and a portion of a second contiguous storage space from a second storage device.

According to the second or third data access method for a storage system of the eighteenth aspect of the present invention, a fourth data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein the second storage space is used to store parity data for the data in the first storage space.

According to the first data access method for a storage system of the eighteenth aspect of the present invention, a fifth data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein a storage object includes storage space from a storage device.

According to the first data access method for a storage system of the eighteenth aspect of the present invention, a sixth data access method for a storage system according to the eighteenth aspect of the present invention is provided, further comprising recording a mapping between the written data and the writable storage object.

According to the sixth data access method for a storage system of the eighteenth aspect of the present invention, a seventh data access method for a storage system according to the eighteenth aspect of the present invention is provided, further comprising: in response to a read request, looking up, according to the mapping, the read-only storage object that stores the requested data, and reading the data from that read-only storage object.

According to the foregoing data access methods for a storage system of the eighteenth aspect of the present invention, an eighth data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein if the writable storage object is full, a free storage object is also set as a writable storage object, so that the storage system includes at least one writable storage object.
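The life cycle of a storage object described above (append while writable, seal to read-only when full, promote a free object, resolve reads through the mapping) can be sketched as follows. The sketch treats "full" as "the next payload no longer fits", keeps objects in memory as byte arrays, and returns `None` for data whose object is still writable; these choices and all names are assumptions made for illustration.

```python
class StorageObject:
    """An append-only region that becomes read-only once sealed."""

    def __init__(self):
        self.data = bytearray()
        self.read_only = False


class ObjectStore:
    """Appends to one writable object and serves reads from sealed objects."""

    def __init__(self, object_count, object_capacity):
        self.capacity = object_capacity
        self.free = [StorageObject() for _ in range(object_count)]
        self.writable = self.free.pop()
        self.mapping = {}  # key -> (storage object, offset, length)

    def write(self, key, payload):
        if len(payload) > self.capacity:
            raise ValueError("payload larger than a storage object")
        if len(self.writable.data) + len(payload) > self.capacity:
            # Seal the full object and promote a free object to writable.
            self.writable.read_only = True
            self.writable = self.free.pop()
        obj = self.writable
        offset = len(obj.data)
        obj.data += payload  # append-only: data always lands at the tail
        self.mapping[key] = (obj, offset, len(payload))

    def read(self, key):
        obj, offset, length = self.mapping[key]
        if not obj.read_only:
            return None  # still writable; a cache or NVDIMM would serve this
        return bytes(obj.data[offset:offset + length])
```

Because writes go only to the writable object and reads go only to sealed objects, the two kinds of traffic never target the same object.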

According to the foregoing data access methods for a storage system of the eighteenth aspect of the present invention, a ninth data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein the storage system further includes an NVDIMM, and the method further comprises: in response to a write request, writing data to the NVDIMM; in response to completion of the operation of writing the data to the NVDIMM, sending a message indicating that the write request is complete; and, in response to the data being written to the writable storage object, releasing the space occupied by the data in the NVDIMM.

According to the foregoing data access methods for a storage system of the eighteenth aspect of the present invention, a tenth data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein the storage system further includes an NVDIMM, and the method further comprises: in response to a write request, writing data to the NVDIMM; in response to completion of the operation of writing the data to the NVDIMM, sending a message indicating that the write request is complete; and, before the writable storage object is set as a read-only storage object, reading the data from the NVDIMM in response to a read request.

According to the first through ninth data access methods for a storage system of the eighteenth aspect of the present invention, an eleventh data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein the storage system further provides a cache, and the method further comprises: in response to a write request, writing data to the cache; and, before the writable storage object is set as a read-only storage object, reading the data from the cache in response to a read request.

According to the eighteenth aspect of the present invention, a twelfth data access method for a storage system according to the eighteenth aspect of the present invention is provided, wherein the storage system includes a plurality of storage devices and an NVDIMM and provides a plurality of storage objects, each storage object being composed of storage resources on the storage devices, and the plurality of storage objects including one or more writable storage objects and a plurality of read-only storage objects. The method comprises: in response to a first write request, writing first data to the NVDIMM; in response to the first data being written to the NVDIMM, sending a message indicating that the first write request is complete; in response to a second write request, writing second data to the NVDIMM; in response to the second data being written to the NVDIMM, sending a message indicating that the second write request is complete; generating a storage data block that includes the first data and the second data; writing the storage data block to the writable storage object in an append-write/sequential-write manner; if the writable storage object is full, setting the writable storage object as a read-only storage object; and, in response to a read request, reading the first data or the second data from the read-only storage object.
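The step of generating one storage data block from several acknowledged writes can be illustrated with a small packing routine. The text does not define a block layout, so the format below (an entry count, then an address and length per entry, then the payloads back to back) is purely a hypothetical layout chosen for this sketch, as are the function names.

```python
import struct


def build_block(writes):
    """Pack [(logical_address, payload), ...] into a single storage data block.

    Hypothetical layout: count as u32, then (address u64, length u32) per
    entry, then all payloads concatenated in the same order.
    """
    header = struct.pack("<I", len(writes))
    for address, payload in writes:
        header += struct.pack("<QI", address, len(payload))
    return header + b"".join(payload for _, payload in writes)


def parse_block(block):
    """Recover {logical_address: payload} from a block built by build_block."""
    (count,) = struct.unpack_from("<I", block, 0)
    pos = 4
    entries = []
    for _ in range(count):
        address, length = struct.unpack_from("<QI", block, pos)
        entries.append((address, length))
        pos += 12  # 8-byte address plus 4-byte length, no padding with "<"
    result = {}
    for address, length in entries:
        result[address] = block[pos:pos + length]
        pos += length
    return result
```

Packing small writes into one block lets the storage object receive a single larger sequential write, while each original write has already been acknowledged from the NVDIMM.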

According to a nineteenth aspect of the present invention, a first computer according to the nineteenth aspect of the present invention is provided, comprising: a machine-readable memory for storing program instructions; and one or more processors for executing the program instructions stored in the memory, the program instructions causing the one or more processors to perform the methods provided according to the eighteenth aspect of the present invention.

According to a twentieth aspect of the present invention, a first data access apparatus for a storage system according to the twentieth aspect of the present invention is provided. The storage system includes a plurality of storage devices and provides a plurality of storage objects, each storage object being composed of storage resources on the storage devices, and the plurality of storage objects including one or more writable storage objects and a plurality of read-only storage objects. The apparatus comprises: a writing module, configured to, in response to a write request, write data to the writable storage object in an append-write/sequential-write manner; a storage object setting module, configured to, if the writable storage object is full, set the writable storage object as a read-only storage object; and a reading module, configured to, in response to a read request, read data from the read-only storage object.

According to the twentieth aspect of the present invention, a second data access apparatus for a storage system according to the twentieth aspect of the present invention is provided, wherein the storage system includes a plurality of storage devices and an NVDIMM and provides a plurality of storage objects, each storage object being composed of storage resources on the storage devices, and the plurality of storage objects including one or more writable storage objects and a plurality of read-only storage objects. The apparatus comprises: a first NVDIMM writing module, configured to, in response to a first write request, write first data to the NVDIMM; a first message sending module, configured to, in response to the first data being written to the NVDIMM, send a message indicating that the first write request is complete; a second NVDIMM writing module, configured to, in response to a second write request, write second data to the NVDIMM; a second message sending module, configured to, in response to the second data being written to the NVDIMM, send a message indicating that the second write request is complete; a generating module, configured to generate a storage data block that includes the first data and the second data; a storage object writing module, configured to write the storage data block to the writable storage object in an append-write/sequential-write manner; a storage object setting module, configured to, if the writable storage object is full, set the writable storage object as a read-only storage object; and a reading module, configured to, in response to a read request, read the first data or the second data from the read-only storage object.

Description of the Drawings

The present invention, together with its preferred mode of use and its further objects and advantages, will be best understood by reference to the following detailed description of illustrative embodiments when read in conjunction with the accompanying drawings, in which:

FIG. 1 is a block diagram of a prior-art storage system;

FIG. 2 is a block diagram of a storage system according to an embodiment of the present invention;

FIG. 3 is a block diagram of a storage system according to yet another embodiment of the present invention;

FIG. 4 is a flowchart of a data writing method according to an embodiment of the present invention;

FIG. 5 shows a data block stored on an NVDIMM according to an embodiment of the present invention;

FIG. 6 is a flowchart of a method performed during startup of a storage system according to an embodiment of the present invention;

FIG. 7 is a flowchart of a data writing method according to yet another embodiment of the present invention;

FIG. 8 shows the data organization of an NVDIMM according to an embodiment of the present invention;

FIG. 9 is a flowchart of a storage space allocation method for an NVDIMM according to an embodiment of the present invention;

FIG. 10 is a flowchart of a storage space allocation method for an NVDIMM according to yet another embodiment of the present invention;

FIG. 11 is a flowchart of a storage space release method for an NVDIMM according to an embodiment of the present invention;

FIG. 12 is a flowchart of a storage space release method for an NVDIMM according to yet another embodiment of the present invention;

FIG. 13 is a flowchart of a data writing method according to yet another embodiment of the present invention;

FIG. 14 shows a storage object according to an embodiment of the present invention;

FIG. 15 is a schematic diagram of read and write operations of a storage system according to an embodiment of the present invention;

FIG. 16 is a flowchart of a data access method for a storage system according to an embodiment of the present invention;

FIG. 17 is a flowchart of a data access method for a storage system according to yet another embodiment of the present invention; and

FIG. 18 is a flowchart of a data access method for a storage system according to still another embodiment of the present invention.

Detailed Description

FIG. 2 is a block diagram of a storage system according to an embodiment of the present invention. In the embodiment of FIG. 2, the storage system may be a computer or a server, and includes a CPU 210, an NVDIMM 220, and one or more disk (DISK) devices 230. Each disk device 230 may be a mechanical hard disk, a solid-state drive, and/or a memory card. The disk device 230 may exchange data with the CPU over, for example, SATA, IDE, USB, PCIe, NVMe, SCSI, or Ethernet. The disk device 230 may be coupled to the CPU 210 directly or through a bridge chip (not shown) such as a chipset. The NVDIMM 220 is coupled to the CPU 210 through a DIMM memory slot. The CPU 210 may include one or more CPU chips. The one or more disk devices 230 may be organized as a RAID to provide high-performance, highly reliable storage services.

Software that accesses the storage system, such as application software or database software, runs on the CPU 210. Software that provides storage services according to embodiments of the present invention also runs on the CPU 210; it responds to storage requests and operates storage devices such as the NVDIMM 220 and the disk devices 230. The present invention can therefore be implemented on servers such as application servers and database servers.

FIG. 3 is a block diagram of a storage system according to yet another embodiment of the present invention. In the embodiment of FIG. 3, a standalone storage system is provided. The storage system includes a controller 310, an NVDIMM 320, an interface 340, and one or more disk (DISK) devices 330. Each disk device 330 may be a mechanical hard disk, a solid-state drive, and/or a memory card. The disk device 330 may exchange data with the controller over, for example, SATA, IDE, USB, PCIe, NVMe, SCSI, or Ethernet. The disk device 330 may be coupled to the controller 310 directly or through a bridge chip (not shown) such as a chipset. The NVDIMM 320 is coupled to the controller 310 through a DIMM memory slot. The controller 310 may include one or more CPU chips, or one or more application-specific integrated circuits. The one or more disk devices 330 may be organized as a RAID to provide high-performance, highly reliable storage services. The interface 340 couples the storage system to a network, through which the storage system can be accessed. The interface 340 may be an interface supporting, for example, Ethernet, FC (Fibre Channel), or InfiniBand. Applications that access the storage system run on other servers and reach the storage system of FIG. 3 through the network and the interface 340. The software that provides storage services according to the embodiment of the present invention runs on the controller 310; it responds to storage requests and operates storage devices such as the NVDIMM 320 and the disk devices 330.

FIG. 4 is a flowchart of a data writing method according to an embodiment of the present invention. When data needs to be written, an application issues a data write request. Software according to an embodiment of the present invention receives the data write request issued by the application or another program. In response to receiving the data write request (410), the data is written to the NVDIMM (420). After the data has been written to the NVDIMM, a message indicating that the data write request has been processed is sent to the application or other program that issued the request (430). At this point the data has been written only to the NVDIMM and not yet to a storage device of the storage system such as the disk 230 (see FIG. 2). Because the NVDIMM is non-volatile, however, the writing method according to this embodiment sends the completion message once the data is written to the NVDIMM, and relies on subsequent steps to ensure that the data written to the NVDIMM is also written to a storage device such as the disk 230. In this way, even if power is lost between step 430 and step 450, data already written to the storage system is not lost. In the embodiment of FIG. 4, the NVDIMM acts as a write cache for the storage server. Because the NVDIMM offers high-speed data access, step 420 completes quickly and the message indicating completion of the data write request can be sent to the requester promptly, which improves the write performance of the storage system.

In response to receiving the data write request, the data is also written to the storage device (450). In this embodiment, both the step of writing the data to the storage device (450) and the step of writing the data to the NVDIMM (420) are triggered by receipt of the data write request (410). Thus, in one example, the step of writing the data to the storage device (450) occurs in parallel with the step of writing the data to the NVDIMM (420). In another example, one CPU handles the step of writing the data to the storage device (450) while another CPU handles the step of writing the data to the NVDIMM (420). In still another example, a single CPU time-shares the step of writing the data to the storage device (450) and the step of writing the data to the NVDIMM (420). In still another example, the step of writing the data to the NVDIMM (420) is performed after the step of writing the data to the storage device (450).

In response to completion of the step of writing the data to the storage device (450), an instruction is issued to release the space that the data written in step 420 occupies in the NVDIMM. The storage device may be a storage device such as the disk 230 (see FIG. 2). Because the data cached in the NVDIMM is released or deleted promptly after it has been written to the storage device, the storage capacity of the NVDIMM need not be large and can be far smaller than that of the storage devices of the storage system, such as the disk 230 (see FIG. 2). Releasing data in the NVDIMM may consist of marking the storage space occupied by the data as free, without performing any delete, write, or erase operation on the NVDIMM. In the embodiment of FIG. 4, in response to receiving a data write request, the data is written to the storage device (450) and the corresponding data is released from the NVDIMM (460). Data therefore does not stay cached in the NVDIMM for long, so the storage space of the NVDIMM is quickly reused, which reduces the overall storage capacity required of the NVDIMM and gives the storage system sustained high write performance.

Those skilled in the art will appreciate that multiple data write requests may be received. For each data write request, the step of writing the data to the NVDIMM (420) and the step of writing the data to the storage device (450) are each performed. After the data has been written to the NVDIMM (420), a message indicating completion of the data write request is sent (430). After the data has been written to the storage device (450), the space that the data occupies in the NVDIMM is released (460).
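The flow of FIG. 4 can be summarized in a short sketch. This is an illustration under stated assumptions, not the patented implementation: the class `WriteCache`, the dictionaries standing in for the NVDIMM and the disk, and the per-request identifier are all hypothetical. The step numbers in the comments refer to FIG. 4.

```python
import itertools


class WriteCache:
    """Hypothetical model of the FIG. 4 flow; dicts stand in for real devices."""

    def __init__(self):
        self.nvdimm = {}  # request id -> (logical address, data)
        self.disk = {}    # logical address -> data
        self.acks = []    # completion messages sent to requesters
        self._ids = itertools.count(1)

    def handle_write(self, addr, data):
        req_id = next(self._ids)
        self.nvdimm[req_id] = (addr, data)  # step 420: write to the NVDIMM
        self.acks.append(req_id)            # step 430: report completion
        return req_id

    def flush(self, req_id):
        addr, data = self.nvdimm[req_id]
        self.disk[addr] = data              # step 450: write to the storage device
        del self.nvdimm[req_id]             # step 460: release the NVDIMM space


cache = WriteCache()
rid = cache.handle_write(0x10, b"payload")
acked_before_flush = list(cache.acks)
on_disk_before_flush = 0x10 in cache.disk
cache.flush(rid)
```

The completion message is recorded before `flush` runs, which mirrors the point made above: the request is acknowledged as soon as the data is in non-volatile memory, and the NVDIMM entry is dropped only after the disk write.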

FIG. 5 illustrates data blocks stored on an NVDIMM according to an embodiment of the present invention. In the embodiment of FIG. 5, data block 510, data block 520, data block 530, and data block 540 are stored on the NVDIMM 500. As an example, data block 510 includes at least data, a logical address, and a sequence number. In one example, data blocks 510, 520, 530, and 540 have the same size. The data blocks 510, 520, 530, and 540 may also have different sizes, in which case each data block records its own size.

Referring to FIG. 4, the received data write request includes the data to be written and the logical address to which the data is to be written. To write the data to the NVDIMM, a data block 510 (FIG. 5) is generated, and the data to be written and its logical address, both taken from the data write request, are recorded in data block 510. When data block 510 is generated, a sequence number is also generated and included in data block 510. The sequence number is monotonically increasing and identifies the order in which data write requests were received, so the order in which the data blocks were generated can be derived from their sequence numbers. To obtain an increasing sequence number, in one example, each time a data write request is received and a data block (for example, data block 510) is generated, the sequence number is incremented and the incremented value is recorded in data block 510.

When data needs to be recovered from the NVDIMM, the data and logical address in a data block (for example, data block 510) make it possible to reconstruct the data write request that wrote the data to that logical address, and the sequence numbers in the data blocks make it possible to determine the order in which the data write requests were received. Accurately identifying the order in which the data write requests were received is very important: write requests that occur at different times may write different data to the same address, and the order in which they are executed determines the data that finally resides at that address.
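A data block of the kind shown in FIG. 5 might be modeled as follows. The field names, the `make_block` helper, and the use of a process-wide counter are assumptions for illustration; the source only requires that each block carry data, a logical address, and an increasing sequence number.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBlock:
    """Hypothetical layout of one NVDIMM data block (see FIG. 5)."""
    seq: int           # increasing sequence number: order of receipt
    logical_addr: int  # logical address from the write request
    data: bytes        # data from the write request


_next_seq = itertools.count(1)  # incremented once per received write request


def make_block(logical_addr, data):
    """Build the data block for a newly received data write request."""
    return DataBlock(seq=next(_next_seq), logical_addr=logical_addr, data=data)


# Two requests to the same logical address: the sequence number records
# which one arrived later and therefore which data must win.
older = make_block(0x100, b"old")
newer = make_block(0x100, b"new")
```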

Various kinds of metadata are also recorded in the NVDIMM. The metadata may record which storage areas in the NVDIMM are free and/or occupied. When a data block is written to the NVDIMM, it is written to a free storage area, and the storage area that received the data is marked as occupied. When the space that data occupies in the NVDIMM is released after the data has been written to a storage device such as the disk 230 (see FIG. 2 and step 460 of FIG. 4), the corresponding storage area is marked as free in the metadata. If the NVDIMM does not have enough free storage area when data is to be written, the write cannot complete. The write may be suspended temporarily and performed once occupied storage areas in the NVDIMM have been released.

The metadata may also record a flag that indicates whether a data recovery operation should be performed when the storage system is powered on. In one example, the flag is set to a first value when the storage system starts up and to a second value when the storage system shuts down normally. In this case, if the flag is found to hold the first value when the storage system starts up, the storage system was not shut down properly, and data recovery must be performed using the data stored in the NVDIMM. The method of recovering data from the NVDIMM is described in detail later with reference to the drawings. In another example, the first value is written to the flag when the storage system shuts down abnormally; if the flag is found to hold the first value when the storage system starts up, the storage system was not shut down properly and data recovery must be performed using the data stored in the NVDIMM.
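The first example of the flag protocol can be sketched as below. The constants `RUNNING` and `CLEAN` stand for the "first value" and "second value" of the text; their numeric values, the `Metadata` class, and the function names are hypothetical.

```python
RUNNING = 1  # hypothetical "first value": set at startup
CLEAN = 2    # hypothetical "second value": set at normal shutdown


class Metadata:
    """Stands in for the flag kept in NVDIMM metadata; survives power loss."""

    def __init__(self):
        self.flag = CLEAN


def boot(meta):
    """Check the flag left by the previous run, then mark the system as running."""
    needs_recovery = meta.flag == RUNNING
    meta.flag = RUNNING
    return needs_recovery


def normal_shutdown(meta):
    meta.flag = CLEAN


meta = Metadata()
first_boot = boot(meta)         # flag was CLEAN
normal_shutdown(meta)
after_clean_stop = boot(meta)   # flag was CLEAN again
# Simulated power loss: normal_shutdown() is never called before the next boot.
after_power_loss = boot(meta)   # flag still holds RUNNING
```

The flag must be read before it is overwritten at startup, otherwise the evidence of an abnormal shutdown would be lost.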

FIG. 6 is a flowchart of a method executed during startup of a storage system according to an embodiment of the present invention. When the storage system is powered on (600), the flag recorded in the NVDIMM is accessed (620) to determine whether the storage system needs to recover data from the NVDIMM. After a normal shutdown, as an example, all data cached in the NVDIMM has already been written to a storage device such as the disk 230, so no data needs to be recovered from the NVDIMM. If the storage system experienced an abnormal shutdown, for example because of an unexpected event such as a power failure while a data write request was being executed, the data corresponding to the data write request has been written to the NVDIMM but not to a storage device such as the disk 230. In this case, the data needs to be recovered from the NVDIMM.

The flag recorded in the NVDIMM is accessed to determine whether the previous shutdown of the storage system was a normal shutdown (630). If it was, no data needs to be recovered from the NVDIMM, and the method of FIG. 6 proceeds to step 660 and ends. If it was not, all data blocks are read from the NVDIMM (640). In one example, referring to FIG. 5, data block 510, data block 520, data block 530, and data block 540 are read from the NVDIMM. In a further embodiment, the metadata recorded in the NVDIMM is also accessed to identify the occupied data blocks, so that only the occupied data blocks are read from the NVDIMM and the free data blocks need not be read.

The data blocks that have been read are sorted in ascending order of the sequence numbers recorded in them, and the data corresponding to each data block is written to a storage device such as the disk 230 in that order. A smaller sequence number means that the data write request corresponding to the data block occurred earlier. If the sequence number of data block 510 (see FIG. 5) is smaller than that of data block 520, the data write request corresponding to data block 510 occurred before the data write request corresponding to data block 520. In one example, the sequence number of data block 510 is smaller than that of data block 520, the sequence number of data block 520 is smaller than that of data block 530, and the sequence number of data block 530 is smaller than that of data block 540; the data corresponding to the data blocks is then written to a storage device such as the disk 230 (see FIG. 2) in the order data block 510, data block 520, data block 530, data block 540. Specifically, the data and the logical address are obtained from each data block, and the data is written to a storage device such as the disk 230 according to the logical address. In one example, a data write request is regenerated from the logical address and the data, and the data is written to the storage device based on that data write request.
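The replay step can be illustrated with a few lines of code. The tuple layout `(sequence number, logical address, data)` and the dictionary standing in for the disk are assumptions; the ordering rule is the one stated above.

```python
def recover(blocks, disk):
    """Replay NVDIMM data blocks onto the disk in ascending sequence-number order.

    blocks: iterable of (seq, logical_addr, data) tuples read from the NVDIMM.
    disk:   dict standing in for a storage device such as disk 230.
    """
    for _seq, logical_addr, data in sorted(blocks, key=lambda b: b[0]):
        disk[logical_addr] = data  # a later write to the same address overwrites
    return disk


# Blocks may be read back in any physical order; two of them target 0x10.
blocks_read = [(3, 0x20, b"C"), (1, 0x10, b"A"), (2, 0x10, b"B")]
disk_state = recover(blocks_read, {})
```

Because block 2 is replayed after block 1, address `0x10` ends up holding the data of the later request, which is why the sequence numbers matter.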

FIG. 7 is a flowchart of a data writing method according to yet another embodiment of the present invention. When data needs to be written, an application issues a data write request. Software according to an embodiment of the present invention receives the data write request issued by the application or another program. In another embodiment, the storage system according to the present invention receives, over a network, data write requests issued by applications, other programs, or other servers.

In response to receiving a first data write request (710), in the embodiment of FIG. 7, first data is written to the NVDIMM (712). The first data write request includes the first data and a first logical address to which the first data is to be written. After the first data has been written to the NVDIMM, a message indicating that the first data write request has been processed is sent to the application, other program, or server that issued the request (714). At this point the data has been written only to the NVDIMM and not yet to a storage device of the storage system such as the disk 230 (see FIG. 2). Because the NVDIMM is non-volatile, however, the writing method according to this embodiment sends the completion message once the data is written to the NVDIMM, and relies on subsequent steps to ensure that the data written to the NVDIMM is also written to a storage device such as the disk 230.

In the embodiment of FIG. 7, the first data corresponding to the first data write request is not written to a storage device such as the disk 230 immediately after the first data write request is received. Instead, the method waits until a second data write request is received (720). Processing the first and second write requests together reduces the number of write operations the storage device performs while still guaranteeing data reliability, which improves the performance of the storage system. The second data write request includes second data and a second logical address to which the second data is to be written.

In response to receiving the second data write request (720), the second data is written to the NVDIMM (722). After the second data has been written to the NVDIMM, a message indicating that the second data write request has been processed is sent to the application, other program, or server that issued the request (724).

In response to receiving the first and second data write requests, the first data and the second data are written to the storage device (730). In one example, a storage data block is generated, the first data and the second data are recorded in the storage data block, and the storage data block is written to the storage device. The storage data block may have the same size as a physical storage block of the storage device. In another example, when the first data write request and the second data write request are contiguous in logical address, the second data is appended after the first data and the result is written to the storage device.

In response to the first data and the second data being written to the storage device, the first data and the second data are released in the NVDIMM (740). In one example, a first storage block corresponding to the first data and a second storage block corresponding to the second data are released in the NVDIMM.

It should be noted that the data merged in step 730 may come from two or more data write requests. In another example, if the first data and the second data are not suitable for merging after the second data write request is received, the first data and the second data are written to the storage device separately. In the embodiment of FIG. 7, the blocks written to the NVDIMM may be data blocks as shown in FIG. 5. If an abnormal power failure occurs after the first data and the second data have been written to the NVDIMM according to the embodiment of FIG. 7, the data can likewise be recovered from the NVDIMM by the embodiment shown in FIG. 6.
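The merging behavior of FIG. 7 might look like the following sketch. The class `MergingWriter`, the size-based trigger for flushing, and the list-based stand-ins for the NVDIMM and the disk are assumptions; the source leaves the criterion for when data is "suitable for merging" open. Step numbers in the comments refer to FIG. 7.

```python
class MergingWriter:
    """Hypothetical model of FIG. 7: acknowledge each write at once, merge disk writes."""

    def __init__(self, block_size):
        self.block_size = block_size  # assumed size of one storage data block
        self.nvdimm = []              # pending (logical address, data) entries
        self.disk_writes = []         # one element per write issued to the disk
        self.acks = 0

    def handle_write(self, addr, data):
        self.nvdimm.append((addr, data))  # steps 712 / 722: write to the NVDIMM
        self.acks += 1                    # steps 714 / 724: report completion
        pending = sum(len(d) for _, d in self.nvdimm)
        if pending >= self.block_size:
            self.flush()

    def flush(self):
        storage_data_block = list(self.nvdimm)       # holds all pending data
        self.disk_writes.append(storage_data_block)  # step 730: one device write
        self.nvdimm.clear()                          # step 740: release NVDIMM space


writer = MergingWriter(block_size=8)
writer.handle_write(0, b"AAAA")
writes_after_first = len(writer.disk_writes)
writer.handle_write(4, b"BBBB")
```

Two requests produce two acknowledgements but only one write to the storage device, which is the reduction in device write operations that the embodiment aims for.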

FIG. 8 illustrates the data organization of an NVDIMM according to an embodiment of the present invention. Referring to FIG. 8, the storage space in the NVDIMM is organized in a manner similar to a ring buffer (810). FIG. 8 marks the head of the queue, which indicates the position at which data is normally written to the NVDIMM. FIG. 8 also marks the tail of the queue. Storage space in the NVDIMM that is unused or has been released is free storage space. As shown in FIG. 8, the NVDIMM includes free storage space 812, free storage space 814, and free storage space 816. A data structure is used to index the free storage space in the NVDIMM. In the embodiment of FIG. 8, for example, the free storage space of the NVDIMM is organized as a linked list 830. Data structures such as linear lists and trees may also be used to index the free storage space in the NVDIMM. The NVDIMM also includes one or more data blocks, shown cross-hatched in FIG. 8. A data block represents occupied storage space in the NVDIMM.

Referring to FIG. 8, the linked list 830 includes node 832, node 834, and node 836. Each node indexes a contiguous free address range in the NVDIMM. Node 832 indexes free storage space 812, node 834 indexes free storage space 814, and node 836 indexes free storage space 816. Node 836 indexes free storage space 816 by storing its start address, length, and/or end address. Node 834 indexes free storage space 814 by storing its start address, length, and/or end address. Node 832 indexes free storage space 812 by storing its start address, length, and/or end address. Node 832, node 834, and node 836 are organized as a doubly linked list. A pointer 820 is also provided to index node 836. Referring to FIG. 8, the free storage spaces of the NVDIMM are ordered in the clockwise direction. When no address wrap-around has occurred (wrap-around means that the address reaches or passes its maximum value and starts again from the beginning of the address space), node 832, node 834, and node 836 are ordered by the addresses of the free storage areas they index.

Free storage space 816 is adjacent to free storage space 814, and free storage space 814 is adjacent to free storage space 812. As FIG. 8 shows, however, data blocks lie between free storage space 816 and free storage space 812 or free storage space 814, so free storage space 816 is not contiguous with either of them.

When free storage space is requested or allocated from the NVDIMM, the search for a free storage area starts from node 836, which is indexed by pointer 820. With the storage space allocation and release methods according to embodiments of the present invention, the node indexed by pointer 820 is very likely to hold a free storage area that satisfies the request, which makes storage space allocation more efficient. In one example, when a suitable free storage area 816 is found at node 836, the start address of free storage area 816 is returned to represent the allocated storage space, and the start address and/or length of the free storage area recorded in node 836 is modified. The head of the queue changes accordingly. As another example, when no suitable free storage area is found at node 836, the linked list 830 is traversed to find one. For example, if suitable free storage space is found at node 832, storage space is allocated from free storage area 812, which node 832 indexes, and the index information of node 832 is modified accordingly.
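The head-first allocation strategy can be sketched as follows. This is a simplified illustration: a Python list stands in for the doubly linked list 830, an index stands in for pointer 820, the addresses and lengths are made up, and address wrap-around and removal of exhausted nodes are not modeled.

```python
class FreeRegion:
    """One node of the free list: a contiguous free range (start address, length)."""

    def __init__(self, start, length):
        self.start = start
        self.length = length


class Allocator:
    """Hypothetical allocator; a list stands in for linked list 830."""

    def __init__(self, regions, head=0):
        self.regions = regions
        self.head = head  # stands in for pointer 820: index of the head node

    def allocate(self, size):
        """Try the head node first, then the remaining nodes in list order."""
        n = len(self.regions)
        for i in range(n):
            region = self.regions[(self.head + i) % n]
            if region.length >= size:
                addr = region.start
                region.start += size   # shrink the free region from its start
                region.length -= size
                return addr
        return None  # nothing fits: the caller waits for space to be released


# Regions named after free storage spaces 816, 814 and 812 of FIG. 8.
space_816 = FreeRegion(start=0, length=16)
space_814 = FreeRegion(start=64, length=32)
space_812 = FreeRegion(start=128, length=64)
alloc = Allocator([space_816, space_814, space_812])

a = alloc.allocate(8)     # satisfied by the head region
b = alloc.allocate(16)    # head has only 8 bytes left, so the list is traversed
c = alloc.allocate(100)   # no region is large enough
```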

Normally, data is written at the head of the NVDIMM and deleted at the tail. Referring to FIG. 4, after data has been written to the storage device (450), the space that the data occupies in the NVDIMM is released (460). Data in the NVDIMM is thus released some time after it is written, so when data is written to the NVDIMM, a suitable free storage area will normally be found at node 836, which is indexed by pointer 820.

In another example according to the present invention, a certain amount of space is reserved in the NVDIMM to store state information of the NVDIMM, including, for example, various kinds of metadata. The metadata can record the free and/or occupied storage areas in the NVDIMM. The metadata can also record a flag bit that indicates whether a data recovery operation is to be performed when the storage system is powered on. Those skilled in the art will appreciate that the metadata may also be stored outside the NVDIMM.

FIG. 9 is a flowchart of a storage space allocation method for an NVDIMM according to an embodiment of the present invention. In the embodiment according to FIG. 9, free storage space must be allocated from the NVDIMM when data is written to it. Referring to FIG. 8, the NVDIMM may include one or more free storage spaces (812, 814 and 816), which are indexed by linked list 830 or another data structure.

In response to receiving a request to allocate storage space (910), free storage space is looked up in node 836 (the first node) indexed by pointer 820 (see FIG. 8) (920). Free storage space 816 indexed by node 836 is the queue head of the NVDIMM ring buffer and has a high probability of containing a free buffer that can satisfy the allocation request. If the first node can satisfy the request (930), free storage space is allocated from the first node (940). In one example, the start address of the free storage space of the first node is provided in response to the request. If the first node cannot satisfy the request (930), the other nodes of the data structure, for example linked list 830, are traversed to find free storage space (950). For example, if free storage space 816 corresponding to node 836 of FIG. 8 cannot satisfy the request, linked list 830 is traversed to determine whether free storage space 814 corresponding to node 834 satisfies it and, if necessary, whether free storage space 812 corresponding to node 832 satisfies it. In a further embodiment, if no free storage space satisfying the request is found after linked list 830 has been traversed, the method waits for free storage space to appear in the NVDIMM as storage space is released. In a still further embodiment, because storage space is most likely to be released near free storage space 816, node 836 is searched first once storage space in the NVDIMM is found to have been released.

In an embodiment according to the present invention, if free storage space 816 satisfying the allocation request is found through node 836, node 836 is also modified so that it indexes the free storage space remaining after the allocation. For example, the start address, end address and/or length of the free storage space stored in node 836 is modified.
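The allocation flow of FIG. 9 can be sketched as follows. This is a minimal illustration rather than the patented implementation: a Python list stands in for linked list 830, the `head` index plays the role of pointer 820, the names `FreeNode` and `allocate` are hypothetical, and ring-buffer wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FreeNode:
    start: int   # start address of the indexed free storage space
    length: int  # length of the indexed free storage space


def allocate(nodes: List[FreeNode], head: int, size: int) -> Optional[int]:
    """Return the start address of the allocated space, or None if nothing fits."""
    # Check the head node first (920-940), then the remaining nodes (950).
    order = [head] + [i for i in range(len(nodes)) if i != head]
    for i in order:
        node = nodes[i]
        if node.length >= size:
            addr = node.start
            # Shrink the node so it indexes what is left after the allocation.
            node.start += size
            node.length -= size
            return addr
    return None  # caller would wait for space to be released


nodes = [FreeNode(0, 100), FreeNode(300, 50), FreeNode(800, 200)]
first = allocate(nodes, head=2, size=150)   # satisfied by the head node
second = allocate(nodes, head=2, size=80)   # head too small, falls back to traversal
third = allocate(nodes, head=2, size=500)   # nothing fits
```

A node whose length drops to zero is kept in the list here for simplicity; a real implementation would likely remove it.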

FIG. 10 is a flowchart of a storage space allocation method for an NVDIMM according to yet another embodiment of the present invention. In the embodiment according to FIG. 9, linked list 830 is traversed so that storage space satisfying the allocation request is sought among all the free storage space of the NVDIMM. According to the present invention, however, free storage space is more likely to exist at the queue head of the NVDIMM ring buffer than at other positions in the buffer. Therefore, in the embodiment according to FIG. 10, free storage space is sought only in the first node (node 836), which indexes the queue head of the ring buffer. When the free storage space corresponding to the first node cannot satisfy the allocation request, linked list 830 is not traversed; instead, the method waits for free storage space that satisfies the allocation request to appear. This further improves the efficiency of storage space allocation.

Referring to FIG. 10, in response to receiving a request to allocate storage space (1010), free storage space is looked up in node 836 (the first node) indexed by pointer 820 (see FIG. 8) (1020). Free storage space 816 indexed by node 836 is the queue head of the NVDIMM ring buffer and has a high probability of containing a free buffer that can satisfy the allocation request. If the first node can satisfy the request (1030), free storage space is allocated from the first node (1040). In one example, the start address of the free storage space of the first node is provided in response to the request. If the first node cannot satisfy the request (1030), the method waits for free storage space that satisfies the allocation request to appear (1050). In one example, in response to storage space of the NVDIMM being released, node 836 (the first node) is checked again for free storage space that satisfies the allocation request (1040). In another example, in response to storage space of the NVDIMM being released, it is determined whether the storage space released by the release request can satisfy the allocation request, and free storage space is then allocated.
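The head-only variant of FIG. 10 can be sketched with a condition variable: the allocator inspects only the head free area and blocks until a release makes it large enough. The class and method names are hypothetical, the release is assumed to adjoin the end of the head free area, and ring-buffer wraparound is omitted.

```python
import threading


class HeadOnlyAllocator:
    """Allocates only from the head free area; waits instead of traversing."""

    def __init__(self, start: int, length: int):
        self._cond = threading.Condition()
        self.start = start    # start address of the head free area
        self.length = length  # length of the head free area

    def allocate(self, size: int, timeout=None):
        with self._cond:
            # Steps 1020/1030: check the first node; step 1050: wait if it is too small.
            fits = self._cond.wait_for(lambda: self.length >= size, timeout)
            if not fits:
                return None  # timed out while waiting
            addr = self.start
            self.start += size   # step 1040: hand out the front of the free area
            self.length -= size
            return addr

    def release_adjacent(self, size: int):
        """Model a release that extends the head free area, then wake waiters."""
        with self._cond:
            self.length += size
            self._cond.notify_all()
```

For example, an allocator created with `HeadOnlyAllocator(0, 100)` returns address 0 for a 60-byte request, and a second 60-byte request blocks until `release_adjacent` grows the free area.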

FIG. 11 is a flowchart of a storage space release method for an NVDIMM according to an embodiment of the present invention. In response to receiving a request to release a first storage space (1110), it is determined whether the first storage space is contiguous with an adjacent free storage space in the NVDIMM (1120). If it is, the first storage space and the adjacent free storage space are merged (1130). If it is not, a new node is provided to index the first storage space (1140). For example, referring to FIG. 8, a new node is created to index the released first storage space and is inserted into linked list 830 at the position given by the address of the first storage space, so that the nodes of linked list 830 remain sorted by the addresses of the free storage spaces they index. In other words, the free storage spaces indexed by the nodes of linked list 830 are arranged clockwise by address.

In an embodiment according to the present invention, the nodes of linked list 830 are sorted by the addresses of the free storage spaces they index. Using the address of the first storage space, the node that indexes the free storage space adjacent to the first storage space is found in linked list 830, and it is determined whether the first storage space is contiguous with that free storage space. In one example, a node indexing a free storage space includes the start address, end address and/or length of the free storage space. Whether the first storage space is contiguous with the adjacent free storage space is determined by comparing the address of the free storage space with the address of the first storage space.

In one example, referring to FIG. 8, if the first storage space to be released is adjacent to free storage space 816 and the end address of the first storage space equals the start address of free storage space 816, node 836, which indexes free storage space 816, is modified so that it indexes both the first storage space and free storage space 816. The first storage space and free storage space 816 are merged in this way. For example, the start address, length and/or end address of the free storage space indexed by node 836 is modified so that node 836 indexes the merged space. In another example, the merge is performed by creating a new node that indexes the first storage space and free storage space 816, inserting the new node into linked list 830, and deleting node 836. In this example, because node 836 pointed to by pointer 820 is deleted, pointer 820 is also made to point to the newly created node.

In yet another example, if the free storage spaces adjacent to the first storage space to be released are free storage space 816 and free storage space 812, but the first storage space is contiguous with neither of them, a new node is created to index the first storage space and is inserted into linked list 830.

Those skilled in the art will appreciate that, although the embodiment of FIG. 8 uses a linked list to organize the nodes that index free storage spaces, various data structures can be used for this purpose. For example, a tree or a linear list can be used to organize the nodes and to sort and/or efficiently search them.

FIG. 12 is a flowchart of a storage space release method for an NVDIMM according to yet another embodiment of the present invention. In response to receiving a request to release a first storage space (1210), it is determined whether a forward-contiguous free storage space exists for the first storage space (1220), in other words, whether the free storage space preceding the first storage space in the NVDIMM is contiguous with it. If a forward-contiguous free storage space is found, a backward-contiguous free storage space is then looked up (1230), in other words, it is determined whether the free storage space following the first storage space in the NVDIMM is contiguous with it. If a backward-contiguous free storage space is found in operation 1230, free storage spaces contiguous with the first storage space exist both before and after it in the NVDIMM, and the forward-contiguous free storage space, the first storage space and the backward-contiguous free storage space are merged (1250). In one example, the merged free storage space is indexed by a single node of linked list 830.

If no backward-contiguous free storage space is found in operation 1230, a free storage space contiguous with the first storage space exists before it in the NVDIMM but not after it, and the forward-contiguous free storage space is merged with the first storage space (1260). In one example, the node of linked list 830 that indexes the forward-contiguous free storage space is modified so that it indexes the merged free storage space.

If no forward-contiguous free storage space is found in operation 1220, it is determined whether a backward-contiguous free storage space exists for the first storage space (1235), in other words, whether the free storage space following the first storage space in the NVDIMM is contiguous with it. If a backward-contiguous free storage space is found, the first storage space is merged with it (1270). In one example, the node of linked list 830 that indexes the backward-contiguous free storage space is modified so that it indexes the merged free storage space.

If no backward-contiguous free storage space is found in operation 1235, no free storage space adjoins the first storage space in the NVDIMM. A new node is then provided to index the first storage space (1280) and is inserted into linked list 830 at the position given by the address of the first storage space, so that the nodes of linked list 830 remain sorted by the addresses of the free storage areas they index.
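The four outcomes of FIG. 12 can be sketched as a coalescing release on an address-sorted free list. This is an illustrative sketch only: a sorted Python list of `(start, length)` tuples stands in for linked list 830, the function name `release` is hypothetical, and ring-buffer wraparound is not modeled.

```python
import bisect
from typing import List, Tuple


def release(free: List[Tuple[int, int]], start: int, length: int) -> None:
    """Return [start, start+length) to the address-sorted free list, merging neighbors."""
    starts = [s for s, _ in free]
    i = bisect.bisect_left(starts, start)  # position of the first free area after `start`

    # Step 1220: does the preceding free area end exactly where the released space begins?
    prev_touches = i > 0 and free[i - 1][0] + free[i - 1][1] == start
    # Steps 1230/1235: does the following free area begin exactly where it ends?
    next_touches = i < len(free) and start + length == free[i][0]

    if prev_touches and next_touches:      # step 1250: merge all three into one node
        p_start, p_len = free[i - 1]
        free[i - 1] = (p_start, p_len + length + free[i][1])
        del free[i]
    elif prev_touches:                     # step 1260: extend the preceding node
        free[i - 1] = (free[i - 1][0], free[i - 1][1] + length)
    elif next_touches:                     # step 1270: extend the following node backwards
        free[i] = (start, length + free[i][1])
    else:                                  # step 1280: insert a new node in address order
        free.insert(i, (start, length))


free_list = [(0, 10), (50, 10)]
release(free_list, 20, 5)    # isolated: new node inserted
release(free_list, 10, 10)   # touches both neighbors: three-way merge
release(free_list, 40, 10)   # touches only the following area
release(free_list, 25, 5)    # touches only the preceding area
```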

FIG. 13 is a flowchart of a data writing method according to yet another embodiment of the present invention. When data needs to be written, an application issues a data write request. Software according to an embodiment of the present invention receives the data write request issued by the application or by another program. In another embodiment, the storage system according to the present invention receives, over a network, data write requests issued by applications, other programs or other servers. The NVDIMM is used as a write cache in the data writing method according to the present invention.

In response to receiving a data write request (1310), the embodiment according to FIG. 13 determines whether the NVDIMM has enough free storage space to hold the data corresponding to the request (1320). If it does, free storage space is allocated in the NVDIMM (1330), the data corresponding to the request is written to the NVDIMM (1350), and a message indicating that the data write request has been completed is then sent to the application, other program or server that issued the request (1360). At this point the data has been written only to the NVDIMM and not yet to a storage device of the storage system such as disk 230 (see FIG. 2). Because the NVDIMM is non-volatile, however, the writing method according to the embodiment of the present invention issues the completion message once the data is in the NVDIMM, and subsequent steps ensure that the data written to the NVDIMM is written to a storage device of the storage system such as disk 230 or disk 330 (see FIG. 3).

If the NVDIMM does not have enough contiguous free storage space to satisfy the data write request, the method waits for free storage space to appear in the NVDIMM (1340). As storage space in the NVDIMM is released, a larger contiguous free storage space will appear in the NVDIMM to satisfy the data write request.

In response to receiving the data write request (1310), the data corresponding to the request is also written to the storage device (1370). When the step of writing the data to the storage device (1370) has completed, an instruction is given to release the space that the data corresponding to the request occupies in the NVDIMM (1380). The storage device may be a storage device such as disk 230 (see FIG. 2). As the space occupied by data in the NVDIMM is released, enough contiguous free storage space will appear in the NVDIMM to meet the need to allocate storage space in it. In another example, storage space may be released from the NVDIMM using the embodiments shown in FIG. 11 or FIG. 12.

In the embodiment according to FIG. 13, in response to receiving a data write request, the data is written to the storage device (1370) and the corresponding data is released from the NVDIMM (1380). Data is therefore not cached in the NVDIMM for long, so the storage space of the NVDIMM is quickly reused. This reduces the overall storage capacity required of the NVDIMM and allows free storage space to be obtained from the NVDIMM in time to respond to data write requests.

Those skilled in the art will appreciate that multiple data write requests will be received. For each data write request, the step of writing the data to the NVDIMM (1350) and the step of writing the data to the storage device (1370) are both performed. After the data is written to the NVDIMM (1350), a message indicating completion of the data write request is sent (1360). After the data is written to the storage device (1370), the space occupied by that data in the NVDIMM is released (1380).
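The ordering of steps 1310 to 1380 can be sketched with in-memory stand-ins. Everything here is hypothetical scaffolding: dictionaries model the NVDIMM and the storage device, `acks` models the completion messages, and the background write to the storage device is reduced to an explicit `flush_one` call so that the sketch stays deterministic.

```python
from collections import deque


class WriteCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.nvdimm = {}        # request id -> data held in the write cache
        self.disk = {}          # request id -> data on the storage device
        self.pending = deque()  # requests cached but not yet on the storage device
        self.acks = []          # completion messages sent to requesters

    def handle_write(self, req_id, data: bytes) -> bool:
        if self.capacity - self.used < len(data):  # step 1320
            return False                           # step 1340: caller must wait
        self.used += len(data)                     # step 1330: allocate space
        self.nvdimm[req_id] = data                 # step 1350: write to NVDIMM
        self.acks.append(req_id)                   # step 1360: acknowledge early
        self.pending.append(req_id)
        return True

    def flush_one(self):
        """Write the oldest cached request to the storage device, then free its space."""
        if not self.pending:
            return None
        req_id = self.pending.popleft()
        data = self.nvdimm[req_id]
        self.disk[req_id] = data                   # step 1370
        del self.nvdimm[req_id]                    # step 1380: release NVDIMM space
        self.used -= len(data)
        return req_id
```

The point of the ordering is that the acknowledgement depends only on the NVDIMM write, while the release of NVDIMM space depends only on the storage device write.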

Referring again to FIG. 8, in response to receiving a data write request, node 836 is obtained through pointer 820, and at least part of free storage area 816 corresponding to node 836 is allocated for writing the data. After the data is written, free storage area 816 becomes smaller and may not be large enough to serve a new data write request. In the data writing method according to the embodiment of the present invention, the data block at the queue tail of the NVDIMM (indicated by tail) was written earliest, so it will usually, with high probability, soon be written to the storage device and released in the NVDIMM. In other words, the data block at the queue tail will soon be released. As the data block at the queue tail is released, the space it occupied is merged into free storage area 816 indexed by node 836, so that free storage area 816 grows and can serve new data write requests.

In an embodiment according to the present invention, when data is written to the storage device, the written data forms a data stream and its distribution on the storage device is localized, whereas the data to be read is distributed across the entire storage space of the storage device. This data distribution makes full use of the high random read performance and high sequential write performance of SSDs.

FIG. 14 shows storage objects according to an embodiment of the present invention. In an embodiment according to the present invention, the storage resources provided by the storage devices are organized into one or more storage objects. In one example, the storage resources (for example, blocks) of SSDs are pooled and formed into one or more storage objects. A storage object is also called a container. When data is written to or read from a storage device, the storage object is the basic unit of access. The number of storage objects is determined by the capacity of the storage system.

Referring to FIG. 14, disk 0, disk 1, disk 2 and disk 3 are storage devices such as SSDs. In other examples, disk 0, disk 1, disk 2 and disk 3 may also be mechanical hard disks. As an example, disk 0 provides an address space as a storage device. The address space of disk 0 is divided into multiple chunks, for example chunk 0 (1410), chunk 1 (1412), ... chunk n in FIG. 14. A chunk may include multiple blocks. Similarly, the address space of each of disk 1, disk 2 and disk 3 is divided into multiple chunks; for example, chunk 1420 and chunk 1422 are provided by disk 3.

In the example of FIG. 14, the chunks provided by disk 0, disk 1, disk 2 and disk 3 form a storage resource pool. Several chunks from the storage resource pool are organized into a storage object (shown as a container in FIG. 14). For example, container 0 includes chunk 0 from disk 0 (chunk 0-0), chunk 1 from disk 1 (chunk 1-1) and chunk 0 from disk 2 (chunk 2-0), and container 1 includes chunk 2 from disk 0 (chunk 0-2) and chunk 2 from disk 1 (chunk 1-2). A storage system according to an embodiment of the present invention provides multiple containers, each of which may include the same or a different number of chunks. A data protection mechanism such as RAID can be provided within a container. For example, in container 0, chunk 0-0 and chunk 1-1 store user data, while chunk 2-0 stores the parity data corresponding to chunk 0-0 and chunk 1-1. Within container 0, the parity data is not limited to chunk 2-0 and may also be stored in chunk 0-0 or chunk 1-1. As another example, chunk 0-2 and chunk 1-2 of container 1 both store user data. As yet another example, the chunks that make up container 0 (chunk 0-0, chunk 1-1 and chunk 2-0) come from different disks, so accesses to container 0 are spread over disk 0, disk 1 and disk 2. This increases the parallelism of disk accesses and improves storage performance. Those skilled in the art will also appreciate that the disks in a storage system may have different capacities and may be replaced, so the chunks within a container may also come from the same disk.
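As a small illustration of parity data within a container, the sketch below assumes a RAID-5-style XOR parity across equally sized chunks. The source only says "a data protection mechanism such as RAID", so the XOR scheme, the function name `xor_bytes` and the chunk contents are all assumptions made for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized chunks."""
    if len(a) != len(b):
        raise ValueError("chunks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical contents of the two user-data chunks of container 0.
chunk_0_0 = b"user"
chunk_1_1 = b"data"

# Chunk 2-0 holds the parity of the two data chunks.
chunk_2_0 = xor_bytes(chunk_0_0, chunk_1_1)

# If the disk holding chunk 0-0 fails, its contents can be rebuilt
# from the surviving data chunk and the parity chunk.
recovered = xor_bytes(chunk_2_0, chunk_1_1)
```

Because each chunk of container 0 resides on a different disk, the loss of any single disk leaves two of the three chunks available, which is enough to rebuild the third under this assumed scheme.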

The example of FIG. 14 shows containers formed from chunks. Containers may also be formed in other ways. In another example, no chunks are provided and a container corresponds to a segment of the storage space of a disk, for example a contiguous range of logical addresses.

In an embodiment according to the present invention, to improve storage system performance, data is written to a container by append writes or sequential writes. In other words, a container can only be written from beginning to end, and no position in a container can be overwritten.

FIG. 15 is a schematic diagram of read and write operations of a storage system according to an embodiment of the present invention. The storage system shown in FIG. 15 includes container 0, container 1, container 2 and container 3. The containers in the storage system fall into two classes: writable containers and read-only containers. In FIG. 15, container 3 is a writable container, while container 0, container 1 and container 2 are read-only containers. The storage system has at least one writable container, for example exactly one. Limiting the number of writable containers keeps the data written to the storage system localized on the storage devices. When a writable container becomes full, it is turned into a read-only container, and the storage system creates a new writable container for write requests. Writable containers serve only write requests, and read-only containers serve only read requests.

Referring to FIG. 15, when the storage system receives a write request, it writes the data to the NVDIMM. After the data is written to the NVDIMM, a message indicating completion of the write request is issued. After the write request is received, the data is also written to writable container 3. Data is appended to writable container 3, and data already written to it is not updated. Once container 3 is full, it is set as a read-only container and no longer serves write requests. With this way of writing, the write operations on the disk devices of the storage system are essentially all sequential writes of large blocks of data, so the FTL (Flash Translation Layer) of each disk device works in its optimal state and garbage collection is not triggered frequently inside the disk device, yielding the highest performance. With this data distribution, writes always occur in a local region of the disk device, and the disk device is always operated on in large data blocks. The write amplification factor within the disk device is therefore greatly reduced, which extends the overall service life of the disk device.

Continuing with FIG. 15, read requests in the storage system are served by read-only containers. In FIG. 15, container 0, container 1 and container 2 are read-only containers. When the storage system receives a read request, it uses a mapping mechanism to find the read-only container that holds the requested data and reads the data from that container. When data is written to container 3, the correspondence between the data and container 3 is recorded. Once writable container 3 is full, it too becomes a read-only container; it then no longer serves write requests and serves only read requests.

In this way, read and write operations are separated at the container level, which reduces the coupling between them, reduces the interference between reads and writes, and makes read latency and write latency more consistent. This differs completely from the traditional LBA-based data layout. In an embodiment according to the present invention, after a large amount of data has been written to the storage system, the written data is distributed over almost arbitrary positions on multiple disks. A read request is then served by reading data from multiple positions on multiple disks, which makes full use of the high random read performance of SSDs. At any given moment, write requests go to the writable container, so the target locations of write requests and read requests are separate, which reduces the influence of read and write requests on each other.

In a further embodiment, the storage system provides a cache for the writable containers. While a writable container such as container 3 is not yet full, the cache serves read requests for data already written to writable container 3. In the embodiment according to Figure 15, an NVDIMM may be used as the cache for the writable container, serving read requests for data already written to the writable container. In another example, the memory of the storage system or another high-speed storage medium is used as the cache for the writable container.

Figure 16 is a flowchart of a data access method for a storage system according to an embodiment of the present invention. In response to receiving a write request (1610), data is written to a writable storage object (for example, container 3 in Figure 15) (1620). The data is written to the writable storage object in an append-write manner. When the writable storage object is full (1630), the storage object is set as a read-only storage object (1640). In response to receiving a read request (1650), data is read from a read-only storage object (1660).

When data is written to the writable storage object, the correspondence between the data and the storage object is also recorded. When a read request is served, the recorded correspondence is used to determine the read-only storage object that stores the data, and the data is read from that read-only storage object.
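The flow of Figure 16 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `StorageObject` and `Store`, the byte-sized `capacity`, the single writable object, and the in-memory `bytearray` standing in for storage devices are all hypothetical. The sketch also treats an object as full when the next write no longer fits, which the text does not specify.

```python
class StorageObject:
    """Append-only container with a fixed capacity in bytes (assumed unit)."""

    def __init__(self, object_id, capacity):
        self.object_id = object_id
        self.capacity = capacity
        self.buffer = bytearray()
        self.read_only = False

    def free_space(self):
        return self.capacity - len(self.buffer)

    def append(self, data):
        if self.read_only:
            raise PermissionError("storage object is read-only")
        offset = len(self.buffer)
        self.buffer += data
        return offset

    def read(self, offset, length):
        return bytes(self.buffer[offset:offset + length])


class Store:
    """One writable object at a time; full objects become read-only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.objects = []
        self.mapping = {}  # logical address -> (object_id, offset, length)
        self.writable = self._new_object()

    def _new_object(self):
        obj = StorageObject(len(self.objects), self.capacity)
        self.objects.append(obj)
        return obj

    def _seal(self):
        # Steps 1630/1640: a full object is set to read-only and replaced.
        self.writable.read_only = True
        self.writable = self._new_object()

    def write(self, address, data):
        # Steps 1610/1620: append the data and record where it went.
        if len(data) > self.capacity:
            raise ValueError("data larger than a storage object")
        if len(data) > self.writable.free_space():
            self._seal()  # assumption: "does not fit" counts as full
        offset = self.writable.append(data)
        self.mapping[address] = (self.writable.object_id, offset, len(data))
        if self.writable.free_space() == 0:
            self._seal()

    def read(self, address):
        # Steps 1650/1660: reads are served from read-only objects only.
        object_id, offset, length = self.mapping[address]
        obj = self.objects[object_id]
        if not obj.read_only:
            raise LookupError("data is still in the writable object")
        return obj.read(offset, length)
```

In this sketch a read of data that still sits in the writable object raises `LookupError`; the cache embodiments described in the text are what would serve such reads.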

Figure 17 is a flowchart of a data access method for a storage system according to yet another embodiment of the present invention. Referring also to Figure 15, in response to receiving a write request (1710), data is written to the NVDIMM (1770). The NVDIMM serves as the write cache of the storage system. After the data is written to the NVDIMM, a message indicating that the write request is complete is sent (1780), which reduces write latency. In response to receiving the write request, the data is also written to a writable storage object (for example, container 3 in Figure 15) (1720). When the data is written to the writable storage object, the correspondence between the data and the storage object is also recorded, so that the data can later be read. When the writable storage object is full (1730), the writable storage object is set as a read-only storage object (1740). In one example, there is only one writable storage object in the storage system at any time. When that writable storage object is full, it is set to read-only, and the storage system creates a new writable storage object to carry write requests. In another example, several writable storage objects exist in the storage system at the same time. In response to receiving a read request (1750), data is read from a read-only storage object (1760).

In one example, the NVDIMM also serves as a cache for the writable storage object. If a read request arrives for data that has already been written to the writable storage object while that object is not yet full, the requested data is read from the NVDIMM. After the writable storage object is full, the data cached in the NVDIMM for that writable storage object is released.
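The NVDIMM variant of Figure 17 might look like the sketch below. It is only an illustration: a Python `dict` stands in for the NVDIMM, a `bytearray` stands in for the writable storage object, and the names `NvdimmCachedStore`, `acks` and `capacity` are hypothetical. The write path is synchronous here for brevity, whereas the text acknowledges the request as soon as the NVDIMM write completes.

```python
class NvdimmCachedStore:
    """NVDIMM acts as write cache and as read cache for the writable object."""

    def __init__(self, capacity):
        self.capacity = capacity     # bytes per storage object (assumed unit)
        self.sealed = []             # read-only storage objects
        self.writable = bytearray()  # the single writable storage object
        self.mapping = {}            # address -> (object index, offset, length)
        self.nvdimm = {}             # address -> data, models the NVDIMM
        self.acks = []               # completion messages sent to requesters

    def _seal(self):
        # Step 1740: the full object becomes read-only, and the NVDIMM
        # entries that were caching its data are released.
        index = len(self.sealed)
        self.sealed.append(bytes(self.writable))
        self.writable = bytearray()
        for address, (obj, _, _) in self.mapping.items():
            if obj == index:
                self.nvdimm.pop(address, None)

    def write(self, address, data):
        data = bytes(data)
        if len(data) > self.capacity:
            raise ValueError("data larger than a storage object")
        if len(self.writable) + len(data) > self.capacity:
            self._seal()  # assumption: "does not fit" counts as full
        self.nvdimm[address] = data   # step 1770
        self.acks.append(address)     # step 1780: acknowledge early
        offset = len(self.writable)   # step 1720: append to the object
        self.writable += data
        self.mapping[address] = (len(self.sealed), offset, len(data))
        if len(self.writable) == self.capacity:
            self._seal()              # step 1730

    def read(self, address):
        index, offset, length = self.mapping[address]
        if index < len(self.sealed):
            # Steps 1750/1760: served by a read-only storage object.
            return self.sealed[index][offset:offset + length]
        # Object not yet full: served from the NVDIMM cache.
        return self.nvdimm[address]
```

Sealing before the NVDIMM write, when the data does not fit, keeps the cache entry of an overwritten address from being released together with the old object.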

In another example, the NVDIMM serves as the write cache of the storage system. After the data has been written to the writable storage object, the corresponding data is released from the NVDIMM. The storage system provides a cache for the writable storage object in memory or in another high-speed storage medium. In response to receiving a write request, the data is cached in the cache provided for the writable storage object. If a read request arrives for data that has already been written to the writable storage object while that object is not yet full, the requested data is read from the cache provided for the writable storage object. After the writable storage object is full, the data stored for that object in the cache provided for the writable storage object is released.
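A sketch of this second variant follows, again under stated assumptions. The NVDIMM is modelled as a queue of pending writes that is drained by a hypothetical `destage()` call, and a plain `dict` named `ram_cache` models the cache held in memory. None of these names, nor the `"done"` return value used as the completion message, come from the source.

```python
from collections import deque


class WriteCacheStore:
    """NVDIMM is a pure write cache; a RAM cache serves the writable object."""

    def __init__(self, capacity):
        self.capacity = capacity     # bytes per storage object (assumed unit)
        self.sealed = []             # read-only storage objects
        self.writable = bytearray()  # the single writable storage object
        self.mapping = {}            # address -> (object index, offset, length)
        self.nvdimm = deque()        # pending (address, data), models the NVDIMM
        self.ram_cache = {}          # address -> data for the writable object

    def write(self, address, data):
        data = bytes(data)
        if len(data) > self.capacity:
            raise ValueError("data larger than a storage object")
        self.nvdimm.append((address, data))  # persist in the NVDIMM first
        self.ram_cache[address] = data       # also cache for later reads
        return "done"                        # completion message

    def _seal(self):
        index = len(self.sealed)
        self.sealed.append(bytes(self.writable))
        self.writable = bytearray()
        # Release cached data of the sealed object, but keep entries that
        # a newer, still pending write to the same address has replaced.
        pending = {address for address, _ in self.nvdimm}
        for address, (obj, _, _) in self.mapping.items():
            if obj == index and address not in pending:
                self.ram_cache.pop(address, None)

    def destage(self):
        """Move pending data from the NVDIMM into the writable object."""
        while self.nvdimm:
            address, data = self.nvdimm[0]
            if len(self.writable) + len(data) > self.capacity:
                self._seal()  # assumption: "does not fit" counts as full
            self.mapping[address] = (len(self.sealed), len(self.writable), len(data))
            self.writable += data
            self.nvdimm.popleft()  # release the NVDIMM space
            if len(self.writable) == self.capacity:
                self._seal()

    def read(self, address):
        if address in self.ram_cache:
            return self.ram_cache[address]
        index, offset, length = self.mapping[address]
        return self.sealed[index][offset:offset + length]
```

Compared with using the NVDIMM as the read cache, this arrangement frees NVDIMM space as soon as the data reaches the writable object, at the cost of holding a second copy in memory until the object is sealed.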

Figure 18 is a flowchart of a data access method for a storage system according to still another embodiment of the present invention. In response to receiving a first data write request (1810), in the embodiment according to Figure 18, the first data is written to the NVDIMM (1812). The first data write request includes the first data and a first logical address to which the first data is to be written. After the first data is written to the NVDIMM, a message is sent to the application, other program or server that issued the write request, indicating that the first data write request has been processed (1814).

In the embodiment according to Figure 18, after the first data write request is received, the first data corresponding to the first write request is not written to the writable storage object immediately. Instead, the storage system waits to receive a second write request (1820). By processing the first write request and the second write request together, the number of write operations performed by the storage devices can be reduced while data reliability is preserved, which improves the performance of the storage system.

In response to receiving the second data write request (1820), the second data is written to the NVDIMM (1822). After the second data is written to the NVDIMM, a message is sent to the application, other program or server that issued the write request, indicating that the second data write request has been processed (1824).

In response to receiving the first write request and the second write request, the first data and the second data are written to the writable storage object (for example, container 3 in Figure 15) (1830). In one example, the first data is aggregated with the second data, so that the data block written to the writable storage object has a larger size. In another example, when the first data write request and the second data write request are contiguous in logical address, the second data is appended after the first data and the result is written to the writable storage object. In still another example, the first data and the second data are aggregated in the NVDIMM, and the aggregated data in the NVDIMM is then written to the writable storage object.
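The aggregation in step 1830 can be illustrated with a short Python helper. The function name `aggregate` and the `(logical_address, data)` tuple layout are assumptions made for this sketch; the source does not prescribe a data format.

```python
def aggregate(requests):
    """Concatenate the payloads of (logical_address, data) write requests.

    Returns the aggregated block together with one extent per request,
    (logical_address, offset_in_block, length), so that the mapping
    between each piece of data and its location can still be recorded.
    """
    block = bytearray()
    extents = []
    for address, data in requests:
        extents.append((address, len(block), len(data)))
        block += data
    return bytes(block), extents
```

For instance, `aggregate([(0, b"abcd"), (4096, b"ef")])` yields a single six-byte block, so the storage device sees one larger append instead of two small ones.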

The data is written to the writable storage object in an append-write manner. The correspondence between the first data and the writable storage object, and the correspondence between the second data and the writable storage object, are recorded. When the writable storage object is full (1850), the storage object is set as a read-only storage object (1860). In response to receiving a read request (1870), the storage object that stores the requested data is determined from the recorded correspondence between data and storage objects, and the data is read from the read-only storage object (1875).

In one example, in response to the first data and the second data being written to the writable storage object, the first data and the second data are released from the NVDIMM (1840). In another example, the NVDIMM is used as a cache for the writable storage object. If a read request arrives for data that has already been written to the writable storage object while that object is not yet full, the requested data is read from the NVDIMM. After the writable storage object is full, the data cached in the NVDIMM for that writable storage object is released.

It should be noted that the data merged in step 1830 may come from two or more data write requests. In another example, after the second data write request is received, if the first data and the second data are not suitable for merging, the first data and the second data are written to the writable storage object separately.
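The following Python sketch combines the steps of Figure 18 for two or more requests. It assumes that "suitable for merging" means contiguous in logical address, which is only one of the examples given in the text, and it omits object capacity and sealing. The names `BatchingStore`, `flush`, `appends` and `log` are hypothetical.

```python
class BatchingStore:
    """Stage writes in the NVDIMM, then merge contiguous runs on flush."""

    def __init__(self):
        self.nvdimm = []        # staged (address, data), models the NVDIMM
        self.acks = []          # one completion message per write request
        self.log = bytearray()  # writable storage object, append-only
        self.appends = 0        # append operations issued to the device
        self.mapping = {}       # address -> (offset, length) in the object

    def write(self, address, data):
        self.nvdimm.append((address, bytes(data)))  # steps 1812 / 1822
        self.acks.append(address)                   # steps 1814 / 1824

    def _append_run(self, run):
        # One append for the whole run; record each request's location.
        offset = len(self.log)
        for address, data in run:
            self.mapping[address] = (offset, len(data))
            offset += len(data)
        self.log += b"".join(data for _, data in run)
        self.appends += 1

    def flush(self):
        # Step 1830: requests that continue the previous one in logical
        # address are merged; any other request starts a separate write.
        run = []
        for address, data in self.nvdimm:
            if run and run[-1][0] + len(run[-1][1]) != address:
                self._append_run(run)
                run = []
            run.append((address, data))
        if run:
            self._append_run(run)
        self.nvdimm.clear()  # step 1840: release the staged data
```

With writes at logical addresses 0, 4 and 100, the first two are merged into one append and the third is written separately, so three acknowledged requests cost two device writes.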

Embodiments of the present invention also provide a computer program comprising computer program code which, when loaded into a computer system and executed on the computer system, causes the computer system to perform the method described above.

Embodiments of the present invention further provide a program comprising program code which, when loaded into a storage device and executed on the storage device, causes the storage device to perform the method described above.

It should be understood that each block of the block diagrams and flowcharts, and combinations of blocks in the block diagrams and flowcharts, can be implemented by various means including computer program instructions. These computer program instructions may be loaded onto a general-purpose computer, a special-purpose computer or another programmable data control device to produce a machine, such that the instructions executing on the computer or other programmable data control device create means for implementing the functions specified in one or more flowchart blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data control device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including computer-readable instructions for implementing the functions specified in one or more flowchart blocks. The computer program instructions may also be loaded onto a computer or other programmable data control device to cause a series of operations to be performed on that device, producing a computer-implemented process such that the instructions executing on the computer or other programmable data control device provide operations for implementing the functions specified in one or more flowchart blocks.

Accordingly, blocks of the block diagrams and flowcharts support combinations of means for performing the specified functions, combinations of operations for performing the specified functions, and combinations of program instruction means for performing the specified functions. It should also be understood that each block of the block diagrams and flowcharts, and combinations of blocks in the block diagrams and flowcharts, can be implemented by special-purpose hardware-based computer systems that perform the specified functions or operations, or by combinations of special-purpose hardware and computer instructions.

The data access method for a storage system and the corresponding apparatus have been disclosed above. Those skilled in the art will also appreciate that the methods or operation flows disclosed in the present invention can be implemented by software, firmware, or any combination thereof. The software or firmware implementing the methods or operation flows of the embodiments of the present invention may be executed by the CPU of a host that accesses the storage device. The software or firmware implementing the methods or operations of the embodiments of the present invention may be stored on a network server, on a host that accesses the storage device, and/or on the storage device.

Although the present invention has been described with reference to examples, these are for purposes of explanation only and do not limit the invention. Changes, additions and/or deletions may be made to the embodiments without departing from the scope of the invention.

Many modifications and other embodiments of the invention set forth herein will come to mind to those skilled in the art to which these embodiments pertain, having the benefit of the teachings presented in the foregoing description and the associated drawings. Therefore, it should be understood that the invention is not limited to the specific embodiments disclosed, and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (10)

1. A data access method for a storage system, the storage system comprising a plurality of storage devices and an NVDIMM, the storage system providing a plurality of storage objects, each storage object being composed of storage resources on the storage devices, the plurality of storage objects comprising one or more writable storage objects and a plurality of read-only storage objects; the method comprising:
in response to a write request, writing data to the NVDIMM, and writing the data to the writable storage object in an append-write manner;
in response to completion of the operation of writing the data to the NVDIMM, sending a message indicating that the write request is complete;
in response to the data being written to the writable storage object, releasing the space occupied by the data in the NVDIMM;
if the writable storage object is full, setting the writable storage object as a read-only storage object;
in response to a read request, reading data from the read-only storage object.

2. The method according to claim 1, wherein a storage object comprises a first storage space from a first storage device and a second storage space from a second storage device.

3. The method according to claim 2, wherein the second storage space is used to store check data for the data in the first storage space.

4. The method according to any one of claims 1 to 3, further comprising recording a mapping relationship between the written data and the writable storage object.

5. The method according to claim 4, further comprising, in response to a read request, looking up the read-only storage object that stores the requested data according to the mapping relationship, and reading the data from the read-only storage object that stores the requested data.

6. The method according to any one of claims 1 to 3, further comprising:
before the writable storage object is set as a read-only storage object, in response to a read request, reading the data from the NVDIMM.

7. The method according to any one of claims 1 to 3, wherein the storage system further provides a cache, the method further comprising:
in response to a write request, writing data to the cache;
before the writable storage object is set as a read-only storage object, in response to a read request, reading the data from the cache.

8. A data access method for a storage system, wherein the storage system comprises a plurality of storage devices and an NVDIMM, the storage system provides a plurality of storage objects, each storage object being composed of storage resources on the storage devices, and the plurality of storage objects comprises one or more writable storage objects and a plurality of read-only storage objects; the method comprising:
in response to a first write request, writing first data to the NVDIMM;
in response to the first data being written to the NVDIMM, sending a message indicating that the first write request is complete;
in response to a second write request, writing second data to the NVDIMM;
in response to the second data being written to the NVDIMM, sending a message indicating that the second write request is complete;
generating a storage data block, the storage data block comprising the first data and the second data;
writing the storage data block to the writable storage object in an append-write/sequential-write (append) manner;
if the writable storage object is full, setting the writable storage object as a read-only storage object;
in response to a read request, reading the first data or the second data from the read-only storage object.

9. A computer, comprising: a machine-readable memory for storing program instructions; and one or more processors for executing the program instructions stored in the memory; the program instructions being configured to cause the one or more processors to perform the method according to any one of claims 1 to 8.

10. A data access apparatus for a storage system, the storage system comprising a plurality of storage devices and an NVDIMM, the storage system providing a plurality of storage objects, each storage object being composed of storage resources on the storage devices, the plurality of storage objects comprising one or more writable storage objects and a plurality of read-only storage objects; the apparatus comprising:
a writing module, configured to, in response to a write request, write data to the NVDIMM and write the data to the writable storage object in an append-write manner;
a module configured to, in response to completion of the operation of writing the data to the NVDIMM, send a message indicating that the write request is complete;
a module configured to, in response to the data being written to the writable storage object, release the space occupied by the data in the NVDIMM;
a storage object setting module, configured to set the writable storage object as a read-only storage object if the writable storage object is full;
a reading module, configured to, in response to a read request, read data from the read-only storage object.
CN201510498599.0A 2015-08-13 2015-08-13 Data access method and device for flash memory storage Active CN106445405B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201911350374.5A CN111007991B (en) 2015-08-13 2015-08-13 Method for separating read-write requests based on NVDIMM and computer thereof
CN201510498599.0A CN106445405B (en) 2015-08-13 2015-08-13 Data access method and device for flash memory storage
PCT/CN2016/094422 WO2017025039A1 (en) 2015-08-13 2016-08-10 Flash storage oriented data access method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510498599.0A CN106445405B (en) 2015-08-13 2015-08-13 Data access method and device for flash memory storage

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN201911350374.5A Division CN111007991B (en) 2015-08-13 2015-08-13 Method for separating read-write requests based on NVDIMM and computer thereof

Publications (2)

Publication Number Publication Date
CN106445405A CN106445405A (en) 2017-02-22
CN106445405B true CN106445405B (en) 2020-02-07

Family

ID=57983858

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201911350374.5A Active CN111007991B (en) 2015-08-13 2015-08-13 Method for separating read-write requests based on NVDIMM and computer thereof
CN201510498599.0A Active CN106445405B (en) 2015-08-13 2015-08-13 Data access method and device for flash memory storage

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201911350374.5A Active CN111007991B (en) 2015-08-13 2015-08-13 Method for separating read-write requests based on NVDIMM and computer thereof

Country Status (2)

Country Link
CN (2) CN111007991B (en)
WO (1) WO2017025039A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108519885B (en) * 2017-02-28 2021-07-23 武汉斗鱼网络科技有限公司 A Flash-based public resource loading method and device
CN107025289B (en) * 2017-04-14 2018-12-11 腾讯科技(深圳)有限公司 A kind of method and relevant device of data processing
CN107454094A (en) * 2017-08-23 2017-12-08 北京明朝万达科技股份有限公司 A kind of data interactive method and system
CN109558070B (en) * 2017-09-27 2023-09-15 北京忆恒创源科技股份有限公司 Scalable storage system architecture
CN109558236B (en) * 2017-09-27 2023-07-25 北京忆恒创源科技股份有限公司 Method for accessing stripes and storage system thereof
CN110018784B (en) 2018-01-09 2023-01-10 阿里巴巴集团控股有限公司 Data processing method, device and computing device
CN108491333A (en) * 2018-03-21 2018-09-04 广州多益网络股份有限公司 Method for writing data, device, equipment and the medium of buffer circle
CN111290974B (en) * 2018-12-07 2024-09-03 北京忆恒创源科技股份有限公司 Cache elimination method for storage device and storage device
CN112115206B (en) * 2019-06-19 2025-05-16 北京京东尚科信息技术有限公司 A method and device for processing object storage metadata
CN111666046B (en) * 2020-05-20 2023-07-25 西安奥卡云数据科技有限公司 Data storage method, device and equipment
CN114527934A (en) * 2022-01-12 2022-05-24 珠海泰芯半导体有限公司 Flash memory control method and device, storage medium and electronic equipment
CN115904255B (en) * 2023-01-19 2023-05-16 苏州浪潮智能科技有限公司 Data request method, device, equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226561A (en) * 2012-01-26 2013-07-31 阿普赛尔有限公司 Content addressable stores based on sibling groups
CN104238962A (en) * 2014-09-16 2014-12-24 华为技术有限公司 Method and device for writing data into cache

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7457982B2 (en) * 2003-04-11 2008-11-25 Network Appliance, Inc. Writable virtual disk of read-only snapshot file objects
US7783611B1 (en) * 2003-11-10 2010-08-24 Netapp, Inc. System and method for managing file metadata during consistency points
US7360112B2 (en) * 2005-02-07 2008-04-15 International Business Machines Corporation Detection and recovery of dropped writes in storage devices
CN101727299B (en) * 2010-02-08 2011-06-29 北京同有飞骥科技股份有限公司 RAID5-orientated optimal design method for writing operation in continuous data storage
CN103150128A (en) * 2013-03-25 2013-06-12 中国人民解放军国防科学技术大学 Implementation method of solid state drive (SSD) and disk-based reliable mixed storage system
US9552176B2 (en) * 2013-04-12 2017-01-24 Microsoft Technology Licensing, Llc Block storage using a hybrid memory device
US8954619B1 (en) * 2013-08-07 2015-02-10 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Memory module communication control
EP2905706B1 (en) * 2013-12-02 2017-04-05 Huawei Technologies Co., Ltd. Data processing device and data processing method
CN104021093A (en) * 2014-06-24 2014-09-03 浪潮集团有限公司 Power-down protection method for memory device based on NVDIMM (non-volatile dual in-line memory module)
CN104239226A (en) * 2014-10-10 2014-12-24 浪潮集团有限公司 Method for designing iSCSI storage server with independent cache
CN104375959A (en) * 2014-12-01 2015-02-25 浪潮集团有限公司 Method for achieving data protection by adopting NVDIMM (non-volatile memory Module) on POWERPC (Power on remote control Unit) cloud storage platform
CN104765575B (en) * 2015-04-23 2017-09-15 成都博元时代软件有限公司 information storage processing method
CN104765574A (en) * 2015-04-23 2015-07-08 成都博元时代软件有限公司 Data cloud storage method

Also Published As

Publication number Publication date
CN111007991B (en) 2024-01-26
CN106445405A (en) 2017-02-22
CN111007991A (en) 2020-04-14
WO2017025039A1 (en) 2017-02-16

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100192 room A302, building B-2, Dongsheng Science Park, Zhongguancun, 66 xixiaokou Road, Haidian District, Beijing

Patentee after: Beijing yihengchuangyuan Technology Co.,Ltd.

Address before: Room A302, building B-2, Dongsheng Science Park, Zhongguancun, No. 66, xixiaokou Road, Shijingshan District, Beijing 100192

Patentee before: MEMBLAZE TECHNOLOGY (BEIJING) Co.,Ltd.

CP03 Change of name, title or address