HK1082976B - Multi-protocol storage appliance that provides integrated support for file and block access protocols - Google Patents
Description
Technical Field
The present invention relates to storage systems, and more particularly, to a multi-protocol storage appliance that supports file and block access protocols.
Background
A storage system is a computer that provides storage services related to the organization of information on writable persistent storage devices such as memory, tape, or disk. Storage systems are typically provided in a Storage Area Network (SAN) or Network Attached Storage (NAS) environment. When used within a NAS environment, the storage system may be implemented as a file server containing an operating system that implements a file system to logically organize the information as a hierarchy of directories and files on, for example, disks. Each "on-disk" file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data of the file. Alternatively, a directory may be implemented as a specially formatted file in which information about other files and directories is stored.
The file server or filer may also be configured to operate according to a client/server model of information delivery, allowing many client systems (clients) to access shared resources, such as files, stored on the filer. Sharing of files is a feature of NAS systems, enabled by the semantic level of their access to files and file systems. Storage of information in a NAS system is typically deployed over a computer network, such as Ethernet, comprising a geographically distributed collection of interconnected communication links that allow clients to remotely access the information (files) on the filer. Clients typically communicate with the filer by exchanging discrete frames or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
In the client/server model, the client may comprise an application executing on a computer that "connects" to the filer over a computer network, such as a point-to-point link, a shared local area network, a wide area network, or a virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the filer by issuing file system protocol messages (in the form of data packets) to the filer over the network that identify one or more files to be accessed, without regard to the specific locations, e.g., blocks, at which the data is stored on disk. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), Network File System (NFS), and Direct Access File System (DAFS) protocols, the usefulness of the filer is enhanced for networking clients.
A SAN is a high-speed network that enables the establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension of the storage bus, and as such, the operating system of the storage system enables access to stored information using block-based access protocols over the "extended bus". In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as the Small Computer System Interface (SCSI) protocol encapsulated over FC or over TCP/IP/Ethernet.
SAN arrangements or deployments allow the decoupling of storage from the storage system, such as an application server, and some level of information storage sharing at the application server level. There are, however, environments in which a SAN is dedicated to a single server. In some SAN deployments, the information is organized in the form of databases, while in others a file-based organization is employed. Where the information is organized as files, the client requesting the information maintains file mappings and manages file semantics, while its requests (and the server responses) address the information in terms of block addressing on disk using, for example, a logical unit number (lun).
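The contrast between file-level (NAS) and block-level (SAN) addressing described above can be sketched as follows. This is a minimal illustration only; the request shapes and field names are invented for the sketch and are not drawn from the patent.

```python
from dataclasses import dataclass

# Illustrative request shapes only; names are not from the patent.

@dataclass
class FileRequest:          # NAS-style: names a file, not disk locations
    path: str               # e.g. "/exports/db/records.dat"
    offset: int             # byte offset within the file
    length: int             # bytes to read

@dataclass
class BlockRequest:         # SAN-style: names a lun and block addresses
    lun: int                # logical unit number
    lba: int                # logical block address on the emulated disk
    num_blocks: int         # blocks to read

# The NAS client never sees blocks; the SAN client never sees file names.
nas_read = FileRequest("/exports/db/records.dat", offset=8192, length=4096)
san_read = BlockRequest(lun=0, lba=16, num_blocks=8)
```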
Previous approaches generally employ two separate solutions for SAN and NAS environments. For those approaches that provide a single solution for both environments, NAS functionality is typically "deployed" on the SAN storage system platform using, for example, a "sidecar" device connected to the SAN platform. However, even these prior systems typically split storage into different SAN and NAS storage domains. That is, the storage spaces of the SAN and NAS domains do not co-exist and are physically partitioned by a configuration process implemented by, for example, a user (system administrator).
An example of such a prior system is the Symmetrix® system platform available from EMC® Corporation. In general terms, individual disks of the SAN storage system (the Symmetrix system) are allocated to a NAS sidecar device (e.g., a Celerra™ device), which in turn exports those disks to NAS clients via, for example, the NFS and CIFS protocols. The system administrator decides the number of disks and the locations of the "slices" (extents) of those disks that are combined to construct "user-defined volumes", and then decides how those volumes are used. The term "volume" as traditionally used in a SAN environment denotes a storage entity that is constructed by specifying physical disks and extents within those disks via operations that combine those extents/disks into a user-defined volume storage entity. Notably, the SAN-based disks and the NAS-based disks that contain user-defined volumes are physically partitioned within the system platform.
System administrators typically present their decisions through a complex user interface to users who can understand the underlying physical aspects of the system. That is, the user interface primarily surrounds the physical disk structure and the management that must be performed by the system administrator to provide a view of the SAN platform on behalf of the client. For example, the user interface may prompt the administrator to specify the physical disks required to construct the user-defined volume and the extent sizes in those disks. In addition, the interface also prompts the administrator to specify the physical locations of those extents and disks and the manner in which they are "glued together" (organized) and made visible (exported) to the SAN client as a user-defined volume corresponding to a certain disk or lun. Once a physical disk and its extents are selected for building a volume, only those disks/extents contain the volume. The system administrator must also specify a form of reliability for that constructed volume, such as a redundant array of independent (or inexpensive) disks (RAID) protection level and/or mirroring. The RAID group is then overlaid on top of those selected disks/extents.
In summary, prior system approaches require a system administrator to carefully configure the physical layout of disks and their organization to create a user-defined volume exported as a single lun to a SAN client. All the management associated with this prior approach is built on a physical disk basis. To increase the size of a user-defined volume, a system administrator adds disks and recalculates RAID computations to include redundant information associated with the data stored on the disks that make up the volume. Obviously, this is a complicated and expensive process. The present invention aims to provide a simple and efficient integrated solution for SAN and NAS storage environments.
Disclosure of Invention
The present invention relates to a multi-protocol storage appliance that serves file and block protocol access to information stored on storage devices in an integrated manner for both Network Attached Storage (NAS) and Storage Area Network (SAN) deployments. The storage operating system of the appliance implements a file system that cooperates with novel virtualization modules to provide a virtualization system that "virtualizes" the storage space provided by the appliance. Notably, the file system provides volume management functionality for block-based access to information stored on the devices. The virtualization system allows the file system to logically organize the information into named file, directory, and virtual disk (vdisk) storage objects, thereby providing an integrated NAS and SAN appliance approach to storage by enabling file-based access to files and directories, while further enabling block-based access to vdisks.
In the illustrative embodiment, the virtualization module is implemented, for example, as a vdisk module and a Small Computer System Interface (SCSI) target module. The vdisk module provides a data path from the block-based SCSI target module to blocks managed by the file system. The vdisk module also interacts with the file system to allow an administrative interface, such as a streamlined User Interface (UI), to be accessed in response to a system administrator issuing commands to the multi-protocol storage appliance. In addition, the vdisk module manages SAN deployments by, among other things, implementing a comprehensive set of vdisk commands issued by a system administrator via the UI. These vdisk commands are converted into basic file system operations that interact with the file system and SCSI target module to implement the vdisks.
The SCSI target module, in turn, initiates emulation of a disk or logical unit number (lun) by providing a mapping process that translates logical block accesses to luns specified in access requests into virtual block accesses to vdisks, and converts vdisks into luns in response to these requests. Thus, the SCSI target module provides a translation layer of the virtualization system between SAN block (lun) space and file system space, where luns are represented as vdisks. By "deploying" SAN virtualization on top of the file system, the multi-protocol storage appliance reverses the approaches taken by prior systems, thereby providing a single unified storage platform for substantially all storage access protocols.
Advantageously, the integrated multi-protocol storage appliance provides access control and, where appropriate, sharing of files and vdisks for all protocols while preserving data integrity. The storage appliance also provides embedded/integrated virtualization functionality that eliminates the need for users to allocate storage resources when creating NAS and SAN storage objects. These functions include virtualized storage space that allows SAN and NAS objects to coexist with respect to global space management within the appliance. Furthermore, the integrated storage appliance provides simultaneous support for block access protocols to the same vdisk as well as a heterogeneous SAN environment with support for clustering.
Brief description of the drawings
The above and other advantages of the invention may be better understood by referring to the following description in conjunction with the accompanying drawings in which like reference numbers indicate identical or functionally similar elements:
FIG. 1 is a schematic block diagram of a multi-protocol storage appliance configured to operate in a Storage Area Network (SAN) and Network Attached Storage (NAS) environment in accordance with the present invention;
FIG. 2 is a schematic block diagram of a storage operating system of a multi-protocol storage appliance that may be advantageously used with the present invention;
FIG. 3 is a schematic block diagram of a virtualization system implemented by a file system interacting with a virtualization module in accordance with the present invention; and
FIG. 4 is a flowchart illustrating the sequence of steps involved in accessing information stored on a multi-protocol storage appliance over a SAN network.
Detailed description of illustrative embodiments
The present invention is directed to a multi-protocol storage appliance that serves file and block protocol access to information stored on storage devices in an integrated manner. In this context, the term integrated multi-protocol appliance denotes a computer having features such as simplicity of storage service management and ease of storage reconfiguration, including reusable storage space, for users (system administrators) and clients of Network Attached Storage (NAS) and Storage Area Network (SAN) deployments. The storage appliance may provide NAS services through a file system, while the same appliance provides SAN services through SAN virtualization, including logical unit number (lun) emulation.
FIG. 1 is a schematic block diagram of a multi-protocol storage appliance 100 configured to provide storage services related to the organization of information in storage devices, such as disks 130. The storage device 100 is illustratively embodied as a storage system comprising a processor 122, a memory 124, a plurality of network adapters 125, 126, and a storage adapter 128 interconnected by a system bus 123. The multi-protocol storage appliance 100 also includes a storage operating system 200 that provides a virtualization system (and in particular a file system) to logically organize information as a hierarchy of named directories, files, and virtual disk (vdisk) storage objects on the disks 130.
Clients of a NAS-based network environment have a storage perspective for files, while clients of a SAN-based network environment have a storage perspective for blocks or disks. To this end, the multi-protocol storage appliance 100 provides (exports) disks to SAN clients by creating lun or vdisk objects. A vdisk object (hereinafter "vdisk") is a special file type that is implemented by the virtualization system and is converted to an emulated disk that the SAN client sees. Thereafter, the multi-protocol storage appliance makes these emulated disks accessible to the SAN clients through controlled exports, as further described herein.
In the illustrative embodiment, memory 124 comprises storage locations addressable by the processors and adapters for storing software program code and data structures associated with the present invention. The processor and adapters may, in turn, comprise processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures. A portion of storage operating system 200 typically resides in memory and is executed by a processing element, and storage operating system 200 functionally organizes the storage device by, inter alia, invoking storage operations supported by the storage service implemented by the device. It will be apparent to those skilled in the art that other processing and memory means, including various computer readable media, may be used for storing and executing program instructions pertaining to the systems and methods of the invention described herein.
The network adapter 125 couples the storage appliance to a plurality of clients 160a,b over a point-to-point link, a wide area network, a virtual private network implemented over a public network (the Internet), or a shared local area network, hereinafter referred to as an illustrative Ethernet network 165. Thus, the network adapter 125 may comprise a Network Interface Card (NIC) having the mechanical, electrical, and signaling circuitry needed to connect the appliance to a network switch, such as a conventional Ethernet switch 170. For this NAS-based network environment, the clients are configured to access information stored on the multi-protocol appliance as files. The clients 160 communicate with the storage appliance over the network 165 by exchanging discrete frames or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
The clients 160 may be general-purpose computers configured to execute applications over a variety of operating systems, including the UNIX® and Microsoft® Windows™ operating systems. When accessing information (in the form of files and directories) over a NAS-based network, the clients typically employ file-based access protocols. Thus, each client 160 may request the services of the storage appliance 100 by issuing file access protocol messages (in the form of data packets) to the appliance over the network 165. For example, a client 160a running the Windows operating system may communicate with the storage appliance 100 using the Common Internet File System (CIFS) protocol over TCP/IP. On the other hand, a client 160b running the UNIX operating system may communicate with the multi-protocol appliance using either the Network File System (NFS) protocol over TCP/IP or the Direct Access File System (DAFS) protocol over a Virtual Interface (VI) transport in accordance with the Remote Direct Memory Access (RDMA) protocol over TCP/IP. It will be apparent to those skilled in the art that clients running other types of operating systems may also communicate with the integrated multi-protocol storage appliance using other file access protocols.
The storage network "target" adapter 126 also couples the multi-protocol storage appliance 100 to clients 160 that may be further configured to access the stored information as blocks or disks. For this SAN-based network environment, the storage appliance is coupled to an illustrative Fibre Channel (FC) network 185. FC is a networking standard describing a suite of protocols and media that is primarily found in SAN deployments. The network target adapter 126 may comprise an FC Host Bus Adapter (HBA) having the mechanical, electrical, and signaling circuitry needed to connect the appliance 100 to a SAN network switch, such as a conventional FC switch 180. In addition to providing FC access, the FC HBA may offload Fibre Channel network processing operations for the storage appliance.
In accessing information (in the form of blocks, disks, or vdisks) over a SAN-based network, the client 160 typically employs a block-based access protocol, such as the Small Computer System Interface (SCSI) protocol. SCSI is a peripheral input/output (I/O) interface with standard device-independent protocols that allows different peripheral devices, such as disks 130, to be connected to the storage device 100. In SCSI technology, a client 160 operating in a SAN environment is the initiator that initiates requests and commands for data. Thus, the multi-protocol storage appliance is a target configured to respond to requests issued by initiators according to a request/response protocol. The initiator and target have endpoint addresses that contain World Wide Names (WWNs) according to the FC protocol. The WWN is a unique identifier consisting of an 8-byte number, such as a node name or port name.
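The 8-byte WWN mentioned above is conventionally displayed as colon-separated hexadecimal octets. A minimal sketch of that formatting follows; the sample bytes and the helper name are invented for illustration.

```python
# Hedged sketch: format an 8-byte FC World Wide Name in its
# conventional colon-separated hex notation. Sample bytes are invented.
def format_wwn(raw: bytes) -> str:
    if len(raw) != 8:
        raise ValueError("a WWN is exactly 8 bytes")
    return ":".join(f"{b:02x}" for b in raw)

node_name = bytes([0x50, 0x0a, 0x09, 0x80, 0x00, 0x01, 0x02, 0x03])
print(format_wwn(node_name))   # 50:0a:09:80:00:01:02:03
```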
The multi-protocol storage appliance 100 supports various SCSI-based protocols used in SAN deployments, including SCSI encapsulated over TCP/IP (iSCSI) and SCSI encapsulated over FC (FCP). Thus, an initiator (hereinafter client 160) may request the services of a target (hereinafter storage appliance 100) by issuing iSCSI and FCP messages over the networks 165, 185 to access information stored on the disks. It will be apparent to those skilled in the art that clients may also use other block access protocols to request the services of the integrated multi-protocol storage appliance. By supporting multiple block access protocols, the multi-protocol storage appliance provides a unified and coherent access solution to vdisks/luns in a heterogeneous SAN environment.
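Both iSCSI and FCP ultimately carry SCSI command descriptor blocks (CDBs); only the transport framing differs. As a hedged illustration, the following packs a standard 10-byte SCSI READ(10) CDB (opcode 0x28) such as an initiator might issue; the helper name is ours, not from the patent.

```python
import struct

# Hedged sketch: pack a standard 10-byte SCSI READ(10) CDB.
# Both iSCSI and FCP encapsulate CDBs like this one.
def read10_cdb(lba: int, num_blocks: int) -> bytes:
    return struct.pack(">BBIBHB",
                       0x28,         # READ(10) opcode
                       0,            # flags
                       lba,          # 4-byte logical block address (big-endian)
                       0,            # group number
                       num_blocks,   # 2-byte transfer length (big-endian)
                       0)            # control byte

cdb = read10_cdb(lba=2048, num_blocks=8)
assert len(cdb) == 10 and cdb[0] == 0x28
```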
The storage adapter 128 cooperates with the storage operating system 200 running on the storage device to access information requested by the client. Information may be stored on disk 130 or other similar medium suitable for storing information. The storage adapter includes I/O interface circuitry that couples to the disks through an I/O interconnect configuration, such as a conventional high performance FC serial link topology. The information is retrieved by the storage adapter and, if necessary, processed by the processor 122 (or the adapter 128 itself) before being forwarded to the network adapters 125, 126 over the system bus 123, where the information is formatted into data packets or messages and returned to the client.
The storage of information on appliance 100 is preferably implemented as one or more storage volumes (e.g., VOL1-2 150) that comprise a cluster of physical storage disks 130, defining an overall logical arrangement of disk space. The disks within a volume are typically organized as one or more groups of redundant arrays of independent (or inexpensive) disks (RAID). RAID implementations enhance the reliability/integrity of data storage by writing data "stripes" across a given number of physical disks in the RAID group and by storing redundant information with respect to the striped data. The redundant information enables recovery of data lost when a storage device fails. It will be apparent to those skilled in the art that other redundancy techniques, such as mirroring, may be employed in accordance with the present invention.
In particular, each volume 150 is comprised of an array of physical disks 130 organized into RAID groups 140, 142, and 144. According to the illustrative RAID level 4 configuration, the physical disks of each RAID group include those disks configured to store striped data (D) and those disks configured to store parity (P) for the data. It should be noted that other RAID level configurations (e.g., RAID 5) are also contemplated for use with the teachings described herein. In an illustrative embodiment, a minimum of one parity disk and one data disk may be employed. However, a typical implementation may be three data disks and one parity disk per RAID group and at least one RAID group per volume.
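The parity protection described above can be illustrated with XOR arithmetic. This is a toy sketch only; a real RAID implementation operates on fixed-size blocks and handles many additional cases, and short byte strings stand in for disk stripes here.

```python
# Hedged sketch of the RAID-4 parity idea: parity is the XOR of the
# data stripes, so any single lost disk can be reconstructed from the
# surviving disks plus the parity disk.
def xor_parity(stripes):
    parity = bytearray(len(stripes[0]))
    for stripe in stripes:
        for i, b in enumerate(stripe):
            parity[i] ^= b
    return bytes(parity)

data_disks = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity_disk = xor_parity(data_disks)

# Disk 1 fails: rebuild its contents from the survivors plus parity.
rebuilt = xor_parity([data_disks[0], data_disks[2], parity_disk])
assert rebuilt == data_disks[1]
```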
To facilitate access to the disks 130, the storage operating system 200 implements a write-anywhere file system of the novel virtualization system that "virtualizes" the storage space provided by the disks 130. The file system logically organizes the information as a hierarchy of named directories and file objects (hereinafter "directories" and "files") on the disk. Each "on-disk" file may be implemented as a collection of disk blocks configured to store information, such as data, while a directory may be implemented as a specially formatted file in which names and links to other files and directories are stored. The virtualization system allows the file system to further logically organize the information as a hierarchy of named vdisks on disks, thereby providing an integrated NAS and SAN appliance approach to storage by allowing file-based (NAS) access to named files and directories, while also allowing block-based (SAN) access to the named vdisks on file-based storage platforms. The file system simplifies the complexity of managing the underlying physical storage in a SAN deployment.
As noted above, a vdisk is a special file type in a volume that derives from a normal (regular) file, but has associated export controls and operation restrictions that support emulation of a disk. Unlike a file that can be created by a client using, for example, the NFS or CIFS protocols, a vdisk is created on the multi-protocol storage appliance as a special typed file (object) via, for example, a User Interface (UI). Illustratively, the vdisk is a multi-inode object comprising a special file inode that holds data and at least one associated stream inode that holds attributes, including security information. The special file inode functions as the main container for storing data associated with the emulated disk, such as application data. The stream inode stores attributes that allow luns and exports to persist over, for example, reboot operations, while also enabling management of the vdisk as a single disk object in relation to SAN clients. An example of a vdisk and its associated inodes that may be advantageously used with the present invention is described in co-pending and commonly assigned U.S. Patent Application Serial No. (112056-0069) titled Storage Virtualization by Layering Vdisks on a File System, which is hereby incorporated by reference as though fully set forth herein.
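The multi-inode structure described above might be modeled as follows. This is a conceptual sketch only, assuming an in-memory stand-in for on-disk inodes; all type and field names are illustrative and not drawn from the patent.

```python
from dataclasses import dataclass, field

# Hedged sketch of the multi-inode vdisk shape: one inode holds the
# emulated disk's data, and an associated stream inode holds
# attributes (lun identity, export state, security) that must
# survive reboots. All names are illustrative.

@dataclass
class StreamInode:
    lun_id: int
    exported_to: list = field(default_factory=list)  # initiator WWNs
    security: dict = field(default_factory=dict)     # security attributes

@dataclass
class Vdisk:
    data_inode: bytearray        # backing container for the emulated disk
    attr_stream: StreamInode     # persists lun/export state across reboots

vd = Vdisk(bytearray(1 << 20), StreamInode(lun_id=0))
vd.attr_stream.exported_to.append("50:0a:09:80:00:01:02:03")
```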
In an illustrative embodiment, the storage operating system is preferably the NetApp® Data ONTAP™ operating system available from Network Appliance, Inc., Sunnyvale, California, which implements the Write Anywhere File Layout (WAFL™) file system. However, it is expressly contemplated that any appropriate storage operating system, including a write in-place file system, may be enhanced for use in accordance with the inventive principles described herein. Thus, where the term "WAFL" is employed, it should be taken broadly to refer to any storage operating system that is otherwise adaptable to the teachings of this invention.
The term "storage operating system" as used herein generally refers to computer-executable code operable on a computer that manages data access and, in the case of a multi-protocol storage appliance, may implement data access semantics, such as the Data ONTAP storage operating system, which is implemented as a microkernel. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows NT®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
In addition, those skilled in the art will appreciate that the inventive systems and methods described herein may be applied to any type of special purpose (e.g., storage serving appliance) or general purpose computer, including a standalone computer or portion thereof, implemented as or including a storage system. Furthermore, the teachings of the present invention may be adapted to a variety of storage system architectures including, but not limited to, network-attached storage environments, storage area networks, and disk assemblies directly attached to client or host computers. Thus, the term "storage system" should also be taken broadly to include such configurations, in addition to any subsystems configured to perform a storage function and associated with other devices or systems.
FIG. 2 is a schematic block diagram of a storage operating system 200 that may be advantageously used with the present invention. The storage operating system comprises a series of software layers organized to form an integrated network protocol stack or, more generally, a multi-protocol engine that provides data paths for clients to access information stored on the multi-protocol storage appliance using block and file access protocols. The protocol stack includes a media access layer 210 of network drivers (e.g., gigabit Ethernet drivers) that interfaces with network protocol layers, such as the IP layer 212 and its supporting transport mechanisms, the TCP layer 214 and the User Datagram Protocol (UDP) layer 216. A file system protocol layer provides multi-protocol file access and, to that end, includes support for the DAFS protocol 218, the NFS protocol 220, the CIFS protocol 222, and the Hypertext Transfer Protocol (HTTP) 224. A VI layer 226 implements the VI architecture to provide the Direct Access Transport (DAT) capabilities, such as RDMA, required by the DAFS protocol 218.
The iSCSI driver layer 228 provides block protocol access above the TCP/IP network protocol layer, while the FC driver layer 230 cooperates with the FC HBA 126 to receive and send block access requests and responses to the integrated storage device. The FC and iSCSI drivers provide FC-specific and iSCSI-specific access control to luns (vdisks), managing the export of vdisks to either iSCSI or FCP, or to both iSCSI and FCP, when accessing a single vdisk in a multi-protocol storage appliance. In addition, the storage operating system includes a disk storage layer 240 that implements a disk storage protocol, such as a RAID protocol, and a disk drive layer 250 that implements a disk access protocol, such as a SCSI protocol.
Bridging the disk software layer with the integrated network protocol stack layer is a virtualization system 300 according to the present invention. Fig. 3 is a schematic block diagram of a virtualization system 300 implemented by a file system 320 in cooperation with virtualization modules illustratively embodied as, for example, a vdisk module 330 and a SCSI target module 310. It should be noted that the vdisk module 330, file system 320, and SCSI target module 310 may be implemented in software, hardware, firmware, or a combination thereof. The vdisk module 330 is layered on (and interacts with) the file system 320 to provide a data path from the block-based SCSI target module to blocks managed by the file system. The vdisk module also allows management interfaces, such as a streamlined user interface (UI 350), to be accessed in response to a system administrator issuing commands to the multi-protocol storage appliance 100. In essence, the vdisk module 330 manages SAN deployments, particularly by implementing a comprehensive set of vdisk (lun) commands issued by a system administrator via the UI 350. These vdisk commands are converted into basic file system operations ("primitives") that interact with the file system 320 and the SCSI target module 310 to implement the vdisks.
The SCSI target module 310, in turn, initiates emulation of a disk or lun by providing a mapping process that translates logical block access to luns specified in access requests into virtual block access to specific vdisk file types, and vdisks into luns in response to these requests. The SCSI target module is illustratively deployed between the FC and iSCSI drivers 228, 230 and the file system 320, providing a translation layer of the virtualization system 300 between SAN block (lun) space and file system space, where luns are represented as vdisks. By "deploying" SAN virtualization on top of the file system 320, the multi-protocol storage appliance reverses the approaches taken by prior systems, thereby providing a single unified storage platform for substantially all storage access protocols.
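The lun-to-vdisk translation performed by the SCSI target module can be sketched conceptually as follows. The class and method names, the in-memory backing store, and the 4 kB block size are all illustrative assumptions for this sketch, not details of the patent's implementation.

```python
# Hedged sketch of the translation layer: a logical block access to a
# lun becomes a byte-range access into the vdisk "file" that backs it.
BLOCK_SIZE = 4096

class ScsiTargetSketch:
    def __init__(self):
        self.lun_to_vdisk = {}          # lun id -> backing store (bytes-like)

    def map_lun(self, lun: int, vdisk_file: bytearray):
        self.lun_to_vdisk[lun] = vdisk_file

    def read_blocks(self, lun: int, lba: int, num_blocks: int) -> bytes:
        vdisk = self.lun_to_vdisk[lun]  # lun space -> file system space
        start = lba * BLOCK_SIZE        # block address -> byte offset
        return bytes(vdisk[start:start + num_blocks * BLOCK_SIZE])

target = ScsiTargetSketch()
backing = bytearray(16 * BLOCK_SIZE)    # stands in for a vdisk file
backing[BLOCK_SIZE:BLOCK_SIZE + 4] = b"data"
target.map_lun(0, backing)
assert target.read_blocks(0, lba=1, num_blocks=1)[:4] == b"data"
```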
According to the present invention, a file system provides the capability for file-based access to information stored in a storage device, such as a disk. In addition, the file system provides volume management capabilities for block-based access to stored information. That is, in addition to providing file system semantics (e.g., dividing storage into discrete objects and naming of those storage objects), the file system 320 provides functionality typically associated with volume managers. As described herein, these functions include: (i) an aggregation of disks, (ii) an aggregation of storage bandwidth of disks, and (iii) a reliability guarantee, such as mirroring and/or parity (RAID), to provide one or more storage objects layered on a file system. One feature of the multi-protocol storage appliance is the simplicity of use associated with these volume management capabilities, especially when used in SAN deployments.
The file system 320 illustratively implements the WAFL file system, which has an on-disk format representation that is block-based using, for example, 4 kilobyte (kB) blocks and using inodes to describe the files. The WAFL file system uses files to store metadata describing the layout of its file system; these metadata files include, among others, an inode file. A file handle, i.e., an identifier that includes an inode number, is used to retrieve an inode from disk. A description of the structure of the file system, including the inode file, is provided in U.S. Patent No. 5,819,292, titled "Method for Maintaining Consistent States of a File System and for Creating User-Accessible Read-Only Copies of a File System", by David Hitz et al., issued October 6, 1998, which is hereby incorporated by reference as though fully set forth herein.
Broadly speaking, all inodes of the file system are organized into the inode file. A file system (FS) information block specifies the layout of information in the file system and includes the inode of a file that contains all other inodes of the file system. Each volume has an FS info block that is preferably stored at a fixed location within, e.g., a RAID group of the file system. The inode of the root FS info block may directly reference (point to) blocks of the inode file or may reference indirect blocks of the inode file that, in turn, reference direct blocks of the inode file. Embedded within each direct block of the inode file are inodes, each of which may reference indirect blocks that, in turn, reference data blocks of a file or vdisk.
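The traversal just described — from the FS info block through the inode file to the inode of a particular file or vdisk — may be sketched as follows. This is an illustrative Python model only; the structure and class names are hypothetical and do not reflect the actual on-disk WAFL layout.

```python
# Hypothetical sketch of the inode-tree traversal described above.
# All structures are illustrative; the actual on-disk format differs.

class Inode:
    def __init__(self, number, block_pointers):
        self.number = number
        self.block_pointers = block_pointers  # references to blocks

class FSInfoBlock:
    """Stored at a fixed location; references the inode file's root inode."""
    def __init__(self, inode_file_root):
        self.inode_file_root = inode_file_root

def lookup_inode(fsinfo, inode_number):
    """Walk the inode file to find the inode describing a file or vdisk."""
    inode_file = fsinfo.inode_file_root
    # Each direct block of the inode file has inodes embedded within it.
    for block in inode_file.block_pointers:
        for inode in block:
            if inode.number == inode_number:
                return inode
    raise KeyError(inode_number)

# Example: an inode file whose single direct block embeds two inodes.
data_inode = Inode(7, block_pointers=[b"4kB data block"])
inode_file = Inode(0, block_pointers=[[Inode(5, []), data_inode]])
fsinfo = FSInfoBlock(inode_file)
```

The inode retrieved this way may then reference indirect blocks that, in turn, reference the data blocks of the file or vdisk.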
According to one aspect of the invention, the file system implements access operations to the vdisks 322 and to files 324 and directories (dir 326) that coexist with respect to global space management of storage units, such as volumes 150 and/or quota trees (qtrees) 328. The quota tree 328 is a special directory that has attributes of logical child volumes within the namespace of the physical volume. Each file system storage object (file, directory, or vdisk) is illustratively associated with a quota tree, and quotas, security attributes, and other items can be allocated per quota tree. Vdisks and files/directories may be layered above a quota tree 328, which quota tree 328 is in turn layered above volume 150, as abstracted by a file system "virtualization" layer 320.
Note that the vdisk storage objects in the file system 320 are associated with SAN deployments of the multi-protocol storage appliance, while the file and directory storage objects are associated with NAS deployments of the appliance. Files and directories are generally not accessible via FC or SCSI block access protocols; however, the files may be converted to vdisks and then accessed via SAN or NAS protocols. Vdisks are accessible as luns via SAN (FC and SCSI) protocols and as files via NAS (NFS and CIFS) protocols.
In another aspect of the invention, the virtualization system 300 provides a virtualized storage space that allows SAN and NAS storage objects to coexist with respect to global space management of the file system 320. To this end, the virtualization system 300 exploits characteristics of the file system, including its inherent ability to aggregate disks and abstract them into a single pool of storage. For example, the system 300 leverages the volume management capability of the file system 320 to organize a collection of disks 130 into one or more volumes 150 representing a global pool of storage space. The global pool is then made available for both SAN and NAS deployments through the creation of vdisks 322 and files 324, respectively. In addition to sharing the same global storage space, the vdisks and files share the same pool of available storage from which space is drawn as the SAN and/or NAS deployments are expanded. Unlike prior systems, there is no physical partitioning of disks within the global storage space of the multi-protocol storage appliance.
The multi-protocol storage appliance substantially simplifies management of the global storage space by allowing a user to manage both NAS and SAN storage objects from a single pool of storage resources. In particular, free block space is managed at a fine-grained block level from the global free pool for both SAN and NAS deployments. If those storage objects were managed separately (discretely), the user would typically need to keep a certain amount of "spare" disks on hand for each class of object to respond to changes in, e.g., business goals. The overhead required to maintain that discrete approach is greater than managing those objects from a single resource pool, where only a single set of spare disks is needed for business-driven expansion. Blocks freed by vdisk operations may be immediately reused by NAS objects (and vice versa), and the details of this management are transparent to the administrator. This represents a "total cost of ownership" advantage of the integrated multi-protocol storage appliance.
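The shared-pool behavior described above can be illustrated with a minimal sketch: blocks released by a SAN (vdisk) object return to the one global free pool and are immediately available to a NAS (file) object. The class and method names here are illustrative, not part of the disclosed implementation.

```python
# Illustrative model of the single global free pool shared by SAN and
# NAS storage objects; names and granularity are hypothetical.
class GlobalPool:
    def __init__(self, total_blocks):
        self.free = set(range(total_blocks))

    def allocate(self, n):
        """Draw n free blocks from the shared pool, regardless of object type."""
        if n > len(self.free):
            raise MemoryError("pool exhausted")
        return {self.free.pop() for _ in range(n)}

    def release(self, blocks):
        """Return blocks to the pool, where any object may reuse them."""
        self.free |= blocks

pool = GlobalPool(100)
vdisk_blocks = pool.allocate(60)   # a SAN object draws from the shared pool
pool.release(vdisk_blocks)         # vdisk shrunk or destroyed: blocks returned
file_blocks = pool.allocate(80)    # a NAS object may immediately reuse them
```

With discretely managed pools, the second allocation of 80 blocks would have required spare disks dedicated to the NAS side; with a single pool it succeeds from the space the vdisk just released.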
The virtualization system 300 also provides reliability guarantees for the SAN and NAS storage objects coexisting in the global storage space of the multi-protocol appliance 100. In particular, reliability in the face of disk failures is a feature inherited from the file system 320 of the appliance 100 through techniques such as RAID or mirroring, rather than being performed at the physical block level as in a conventional SAN system. This simplifies management by allowing the administrator to make global decisions about the underlying redundant physical storage that apply to both vdisks and NAS objects in the file system.
As described above, the file system 320 organizes information into file, directory, and vdisk objects within the volumes 150 of the disks 130. Underlying each volume 150 is a collection of RAID groups 140-144 that provide protection and reliability against disk failure within the volume. Information serviced by the multi-protocol storage appliance is illustratively protected according to a RAID 4 configuration. This level of protection may be extended to include, e.g., synchronous mirroring on the appliance platform. A vdisk 322 created in such a RAID 4-protected volume "inherits" the added protection when synchronous mirroring is specified for the volume 150. In this case, the synchronous mirroring protection is not an attribute of the vdisk but rather an attribute of the underlying volume and a reliability guarantee of the file system 320. This "inheritance" feature of the multi-protocol storage appliance simplifies management of a vdisk because the system administrator does not have to deal with reliability issues on a per-vdisk basis.
In addition, the virtualization system 300 aggregates the bandwidth of the disks 130 without requiring the user to know the physical construction of those disks. The file system 320 is configured to write (store) data on disks as a continuous stripe on those disks according to input/output (I/O) storage operations that aggregate the bandwidth of all disks of a volume used to store the data. When information is stored or retrieved for a vdisk, I/O operations are not directed to the disk specified by the user. Rather, those operations are transparent to the user, as the file system "stripes" the data across all of the disks of the volume in a reliable manner according to its write anywhere layout policy. Due to the virtualization of block storage, the I/O bandwidth to the vdisk may be the maximum bandwidth of the underlying physical disk of the file system, regardless of the size of the vdisk (unlike typical physical implementations of luns in traditional block access products).
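The bandwidth-aggregation behavior above can be sketched as a simple round-robin stripe layout: consecutive logical blocks of a write land on successive disks of the volume, so a read or write of a vdisk engages every spindle regardless of the vdisk's size. This toy function is illustrative only and does not model the actual write-anywhere allocation policy.

```python
# Minimal illustration of striping data across all disks of a volume so
# that vdisk I/O bandwidth aggregates every underlying disk.
# Round-robin placement is a simplification of the real layout policy.
def stripe(blocks, num_disks):
    """Assign consecutive logical blocks round-robin across the disks."""
    layout = {d: [] for d in range(num_disks)}
    for i, block in enumerate(blocks):
        layout[i % num_disks].append(block)
    return layout

# Eight consecutive blocks spread over a four-disk volume: every disk
# participates, so the transfer proceeds at the aggregate bandwidth.
layout = stripe(list(range(8)), num_disks=4)
```

Because placement is decided by the file system rather than the user, a small vdisk still enjoys the full bandwidth of the volume's disks, which is the point made in the paragraph above.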
In addition, the virtualization system leverages (inherits) the placement, management, and block allocation policies of the file system so that vdisks function properly within the multi-protocol storage appliance. Vdisk block placement is a function of the underlying virtualized file system; there is no permanent physical binding of file system blocks to SCSI logical block addresses, so that binding is subject to modification. The vdisk may thus be transparently reorganized to, for example, change its data access pattern behavior.
For SAN and NAS deployments, the block allocation policy is independent of the physical properties of the disks (e.g., geometry, size, cylinders, sector size). The file system provides file-based management of files 324 and directories 326, and in accordance with the present invention, vdisks 322 reside within volumes 150. When a disk is added to an array of connected multi-protocol storage devices, that disk is incorporated into an existing volume to increase the overall volume space, which space may be used for any purpose, such as more vdisks or more files.
The management of the integrated multi-protocol storage appliance 100 is simplified by using the UI 350 and vdisk command set available to the system administrator. The UI 350 illustratively includes a command line interface (CLI 352) and a graphical user interface (GUI 354) that are used to implement a vdisk command set to, among other things, create a vdisk, increase/decrease the size of the vdisk, and/or destroy the vdisk. The storage space of the destroyed vdisk may then be reused, e.g., for NAS-based files, depending on the virtual storage space characteristics of the device 100. The vdisk may be increased ("grown") or decreased ("shrunk") under user control while preserving the block and NAS multi-protocol access to its application data.
The UI 350 simplifies management of the multi-protocol SAN/NAS storage appliance by, e.g., obviating the need for the system administrator to explicitly configure and specify the disks to be used when creating a vdisk. To create a vdisk, the system administrator need only issue a vdisk ("lun create") command through, e.g., the CLI 352 or GUI 354. The vdisk command specifies creation of a vdisk (lun) along with the intended size of the vdisk and a path descriptor (pathname) for the vdisk. In response, the file system 320 cooperates with the vdisk module 330 to "virtualize" the storage space provided by the underlying disks and create the vdisk specified by the create command. In particular, the vdisk module 330 processes the vdisk command to "call" basic operations ("primitives") in the file system 320 that implement the high-level notion of vdisks (luns). For example, the "lun create" command is translated into a series of file system primitives that create a vdisk with the correct information and size and at the correct location. These file system primitives include operations to create a file inode (CreateFile), create a stream inode (CreateStream), and store information in the stream inode (StreamWrite).
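The command-to-primitive translation just described can be sketched as follows. The primitive names mirror the description above (create a file inode, create a stream inode, store information in the stream inode); their exact spellings and arguments are illustrative assumptions, not the disclosed interface.

```python
# Hedged sketch: expanding a high-level "lun create" command into the
# ordered file system primitives described in the text. The primitive
# names and argument shapes are illustrative assumptions.
def lun_create(size, path):
    """Translate 'lun create' into a series of file system primitives."""
    return [
        ("CreateFile", path, size),            # create the special vdisk file inode
        ("CreateStream", path, "attributes"),  # create the associated stream inode
        ("StreamWrite", path, {"size": size}), # persist vdisk attributes in the stream
    ]

# A hypothetical 20 GB vdisk at an illustrative path.
ops = lun_create(size=20 * 2**30, path="/vol/vol0/lun0")
```

The administrator thus specifies only a size and a pathname; the decomposition into primitives, like the choice of disks, is hidden inside the virtualization system.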
The result of the lun create command is that a vdisk 322 is created with the specified size and is RAID protected without the administrator having to explicitly specify such protection. Storage of information on the disks of the multi-protocol storage appliance is not typed; rather, "raw" bits are stored on the disks. The file system organizes those bits into vdisks and RAID groups across all of the disks within a volume. Thus, the created vdisk 322 does not need to be explicitly configured, because the virtualization system 300 creates the vdisk in a manner that is transparent to the user. The created vdisk inherits high-performance characteristics, such as reliability and storage bandwidth, of the underlying volume created by the file system.
The CLI 352 and/or GUI 354 also interact with the vdisk module 330 to introduce attributes and persistent lun map bindings that assign numbers to a created vdisk. These lun map bindings are thereafter used to export the vdisk to clients as a specific SCSI identifier (ID). In particular, the created vdisk may be exported via a lun mapping technique to enable a SAN client to "view" (access) the disk. Vdisks (luns) generally require strictly controlled access in a SAN environment; sharing of luns in a SAN environment typically occurs only in limited circumstances, such as clustered file systems, clustered operating systems, and multi-pathing configurations. A system administrator of the multi-protocol storage appliance determines which vdisks (luns) are exported to SAN clients. Once a vdisk is exported as a lun, the client may access the vdisk over the SAN network using a block access protocol, such as FCP or iSCSI.
SAN clients typically identify and address disks by logical numbers, or luns. However, a "manageability" feature of the multi-protocol storage appliance is that the system administrator can manage vdisks, and their addressing, by logical names. To that end, the vdisk module 330 of the multi-protocol storage appliance maps logical names to vdisks. For example, when creating a vdisk, the system administrator allocates the vdisk at the "right size" and gives it a name that is generally meaningful to its intended application (e.g., /vol/vol0/database to hold a database). The management interface provides name-based management of the luns/vdisks (and files) exported from the storage appliance to clients, thereby providing a consistent and uniform naming scheme for block-based (and file-based) storage.
The multi-protocol storage appliance manages export control of vdisks by logical names through the use of initiator groups (igroups). An igroup is a logical named entity that is assigned one or more addresses associated with one or more initiators (depending on whether a clustered environment is configured). An "igroup create" command essentially associates (binds) those addresses, which may be WWN addresses or iSCSI IDs, with a logical name, i.e., the igroup. A "lun map" command is then used to export one or more vdisks to the igroup, i.e., to make the vdisks "visible" to the igroup. In this sense, the "lun map" command is the equivalent of an NFS export or a CIFS share. The WWN addresses or iSCSI IDs thus identify the clients that are allowed to access the vdisks specified by the lun map command. Thereafter, the logical name is used with all operations in the storage operating system. This logical naming abstraction pervades the vdisk command set, including the interaction between users and the multi-protocol storage appliance. In particular, the igroup naming convention is used for all subsequent export operations and listings of luns exported to the various SAN clients.
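The igroup-based export control described above can be modeled briefly: "igroup create" binds initiator addresses (WWNs or iSCSI IDs) to a logical name, "lun map" makes a vdisk visible to an igroup, and an initiator may access a vdisk only through such a mapping. Function names, the example WWN, and the paths are illustrative.

```python
# Illustrative model of igroup export control; names are hypothetical.
igroups = {}   # igroup name -> set of initiator addresses (WWNs / iSCSI IDs)
lun_maps = {}  # vdisk path  -> set of igroup names it is exported to

def igroup_create(name, addresses):
    """Bind one or more initiator addresses to a logical igroup name."""
    igroups[name] = set(addresses)

def lun_map(vdisk, igroup):
    """Export a vdisk to an igroup, making it 'visible' to that group."""
    lun_maps.setdefault(vdisk, set()).add(igroup)

def may_access(initiator_addr, vdisk):
    """An initiator sees a vdisk only via an igroup the vdisk is mapped to."""
    return any(initiator_addr in igroups[g] for g in lun_maps.get(vdisk, ()))

igroup_create("db-hosts", ["10:00:00:00:c9:2b:cc:01"])  # an example WWN
lun_map("/vol/vol0/lun0", "db-hosts")
```

As the text notes, this plays the role that an NFS export or CIFS share plays on the NAS side: visibility is granted per logical name rather than per raw address in every subsequent operation.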
Figure 4 is a simplified flowchart illustrating the sequence of steps involved in accessing information stored in a multi-protocol storage appliance over a SAN network. Here, the client communicates with the storage device 100 over a network coupled to the device using a block access protocol. If the client is client 160a running a Windows operating system, the block access protocol is illustratively the FCP protocol used over network 185. On the other hand, if the client is client 160b running a UNIX operating system, the block access protocol is illustratively the iSCSI protocol used over network 165. The sequence begins at step 400 and proceeds to step 402 where the client generates a request to access information residing in the multi-protocol storage appliance and, at step 404, the request is forwarded as a conventional FCP or iSCSI block access request over the network 185, 165.
The request is received at the network adapter 126, 125 of the storage device 100 at step 406, processed therein by the integrated network protocol stack, and passed to the virtualization system 300 at step 408. In particular, if the request is an FCP request, it is handled by FC driver 230 as, for example, a 4k block request to access (i.e., read/write) data. If the request is an iSCSI protocol request, it is received in the media access layer (Intel gigabit Ethernet) and passed to the virtualization system through the TCP/IP network protocol layer.
Command and control operations associated with the SCSI protocol, including addressing information, are typically directed to disks or luns; but file system 320 does not recognize luns. Thus, the SCSI target module 310 of the virtualization system initiates emulation of the lun in order to respond to the SCSI command contained in the request (step 410). To this end, the SCSI target module has a set of application programming interfaces (APIs 360) that are based on the SCSI protocol and that implement a consistent interface with the iSCSI and FCP drivers 228, 230. The SCSI target module also implements a mapping/translation process that essentially translates luns to vdisks. At step 412, the SCSI target module maps the requested addressing information, such as FC routing information, to the internal structure of the file system.
File system 320 is illustratively a message-based system; thus, the SCSI target module 310 transforms the SCSI request into a message representing an operation for the file system. For example, the message generated by the SCSI target module may include an operation (e.g., read, write) as well as a path name (e.g., path descriptor) and a file name (e.g., special file name) of the vdisk object represented in the file system. The SCSI target module 310 passes the message to the file system layer 320 as, for example, a function call 365, where the operation is performed.
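The transformation performed by the SCSI target module in this step can be sketched as follows: a block-level request on a lun (operation, logical block address, block count) becomes a file system message carrying the operation and the pathname of the vdisk's special file. The message fields follow the description above; the exact layout and the 512-byte block size are assumptions.

```python
# Sketch of the SCSI-to-file-system translation described above.
# Field names, the lun table, and the block size are illustrative.
def scsi_to_fs_message(scsi_request, lun_to_vdisk, block_size=512):
    """Map a SCSI request on a lun to a file system operation message."""
    vdisk_path = lun_to_vdisk[scsi_request["lun"]]
    return {
        "operation": scsi_request["op"],            # e.g., read or write
        "pathname": vdisk_path,                      # path of the vdisk's file
        "offset": scsi_request["lba"] * block_size,  # byte offset in the file
        "length": scsi_request["blocks"] * block_size,
    }

msg = scsi_to_fs_message(
    {"lun": 0, "op": "read", "lba": 8, "blocks": 2},
    lun_to_vdisk={0: "/vol/vol0/lun0"},
)
```

The resulting message is what the file system layer receives (e.g., via the function call 365), insulating it from any knowledge of luns.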
In response to receiving the message, the file system 320 maps the pathname to an inode structure to obtain a file handle corresponding to the vdisk 322. Given the file handle, the storage operating system 200 can convert it to a disk block and thereby retrieve the block (inode) from disk. Broadly speaking, the file handle is an internal representation of the data structure, i.e., a representation of the inode data structure used internally within the file system. The file handle generally consists of a plurality of components, including a file ID (inode number), a snapshot ID, a generation ID, and flags. The file system uses the file handle to retrieve, from the file system structure implemented on the disks 130, the special file inode containing the vdisk and at least one associated stream inode.
At step 414, if the requested data is not resident "in core", i.e., in memory 124, the file system generates an operation to load (retrieve) the requested data from disk 130. If the information is not in memory, the file system 320 indexes the inode file with the inode number to access the appropriate entry and retrieve the logical Volume Block Number (VBN). The file system passes the logical VBN to the disk storage (RAID) layer 240, which maps the logical number to a disk block number and sends the latter to the appropriate drive (e.g., SCSI) of the disk drive layer 250. The disk drive accesses the disk block number from disk 130 and loads the requested data block into memory 124. At step 416, the requested data is processed by the virtualization system 300. For example, the data may be processed in connection with a read or write operation for the vdisk or in connection with a seek command for the vdisk.
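The read path in step 414 can be condensed into a short sketch: if the block is not "in core," the logical volume block number (VBN) is mapped to a disk and disk block number, the driver fetches the block, and it is loaded into memory. The fixed-width VBN-to-disk mapping here is purely illustrative; the actual RAID layer mapping is more involved.

```python
# Hedged sketch of the read path described above; the modulo-based
# VBN -> (disk, disk block number) mapping is an illustrative stand-in
# for the real RAID-layer translation.
def read_block(vbn, cache, raid_width, disks):
    if vbn in cache:                 # already resident "in core" (memory 124)
        return cache[vbn]
    disk = vbn % raid_width          # RAID layer maps the logical VBN ...
    dbn = vbn // raid_width          # ... to a disk and disk block number
    data = disks[disk][dbn]          # disk driver accesses the block
    cache[vbn] = data                # requested block loaded into memory
    return data

# Two small example disks holding three blocks.
disks = {0: {0: b"a", 1: b"c"}, 1: {0: b"b"}}
cache = {}
block = read_block(2, cache, raid_width=2, disks=disks)
```

A subsequent request for the same VBN is then satisfied from the in-core copy without touching the disk driver at all.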
The SCSI target module 310 of the virtualization system 300 emulates support for the conventional SCSI protocol by providing meaningful "emulation" information about the requested vdisk. This information is either computed by the SCSI target module or stored persistently, for example in the vdisk's attribute stream inode. At step 418, the SCSI target module 310 loads the requested block-based information (converted from the file-based information provided by the file system 320) into a block access (SCSI) protocol message. For example, the SCSI target module 310 may load information, such as the size of the vdisk, into SCSI protocol messages in response to SCSI inquiry command requests. Upon completion of the request, the storage device (and operating system) returns a reply (e.g., as a SCSI "Capacity" response message) to the client over the network (step 420). The sequence then ends at step 422.
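One concrete piece of the "emulation" information mentioned above is the answer to a capacity query: the vdisk's file-based size must be presented in the block-based terms SCSI expects. The sketch below shows that conversion; the dictionary layout is a simplification, not the SCSI wire format.

```python
# Illustrative construction of capacity information for a vdisk:
# the file-based size is converted into the block-based values a
# SCSI client expects. Field names are not the actual wire format.
def capacity_response(vdisk_size_bytes, block_size=512):
    """Express the vdisk's size as a last LBA and block size."""
    num_blocks = vdisk_size_bytes // block_size
    return {"last_lba": num_blocks - 1, "block_size": block_size}

resp = capacity_response(vdisk_size_bytes=1 * 2**20)  # a 1 MB example vdisk
```

As the text notes, such values may either be computed on demand by the SCSI target module or stored persistently, e.g., in the vdisk's attribute stream inode.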
It should be noted that the software "path" through the storage operating system layers described above that is needed to perform data storage access for client requests received at the multi-protocol storage appliance may alternatively be implemented in hardware. That is, in an alternative embodiment of the invention, the storage access request data path through the operating system layers (including the virtualization system 300) may be implemented as logic circuitry embodied in a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC). Such a hardware implementation increases the performance of the storage service provided by the appliance 100 in response to file access or block access requests issued by the clients 160. Moreover, in another alternative embodiment of the invention, the processing elements of the network and storage adapters 125-128 may be configured to offload some or all of the packet processing and storage access operations, respectively, from the processor 122, thereby increasing the performance of the storage service provided by the multi-protocol storage appliance. It is expressly contemplated that the various processes, architectures, and procedures described herein can be implemented in hardware, firmware, or software.
Advantageously, the integrated multi-protocol storage appliance provides access control and, where appropriate, sharing of files and vdisks for all protocols while preserving data integrity. The storage appliance also provides embedded/integrated virtualization capabilities that eliminate the need for users to allocate storage resources when creating NAS and SAN storage objects. These capabilities include virtualized storage space that allows SAN and NAS storage objects to coexist with respect to global space management within the appliance. Furthermore, the integrated storage appliance provides simultaneous support for block access protocols (iSCSI and FCP) to the same vdisk and a heterogeneous SAN environment with support for clustering. In summary, the multi-protocol storage appliance provides a single unified storage platform for all storage access protocols.
The foregoing description has been directed to specific embodiments of this invention. It will be apparent that other variations and modifications may be made to the described embodiments, with the attainment of some or all of their advantages. For example, it is expressly contemplated that the teachings of this invention can be implemented as software, including a computer-readable medium having program instructions executing on a computer, hardware, firmware, or a combination thereof. Accordingly, this description is to be taken only by way of example and not to otherwise limit the scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.
Claims (18)
1. A multi-protocol storage appliance adapted to provide file protocol and block protocol access to information stored on storage devices for network attached storage and storage area network deployments, the appliance comprising:
a storage operating system adapted to implement a file system cooperating with a virtualization module to virtualize a storage space provided by the storage device, the storage space storing information from file protocols and block protocol access requests in an integrated manner.
2. The multi-protocol storage appliance of claim 1 wherein the file system logically organizes the information into files, directories and virtual disks to provide an integrated network attached storage and storage area network appliance method for storage by allowing file-based access to the files and directories while also allowing block-based access to the virtual disks, each of the virtual disks being a special file type that is converted to an emulated disk.
3. The multi-protocol storage appliance of claim 1 wherein the virtualization module comprises a virtual disk module and a small computer system interface target module.
4. The multi-protocol storage appliance of claim 3 wherein the virtual disk module is layered on the file system to allow a management interface to access in response to a system administrator issuing commands to the multi-protocol storage appliance.
5. The multi-protocol storage appliance of claim 4 wherein the management interface comprises a user interface.
6. The multi-protocol storage appliance of claim 5 wherein the virtual disk module manages the storage area network deployment by implementing a set of virtual disk commands issued via the user interface.
7. The multi-protocol storage appliance of claim 6 wherein the virtual disk commands are translated into basic file system operations that interact with the file system and the small computer system interface target module to implement the virtual disk.
8. The multi-protocol storage appliance of claim 7 wherein the small computer system interface target module initiates emulation of a disk or logical unit number by providing a mapping program that translates the logical unit number to a virtual disk.
9. The multi-protocol storage appliance of claim 8 wherein the small computer system interface target module provides a translation layer between a storage area network block space and a file system space.
10. The multi-protocol storage appliance of claim 1 wherein the virtualized storage space allows storage area network and network attached storage objects to coexist with respect to global space management of the file system.
11. The multi-protocol storage appliance of claim 10 wherein the file system cooperates with the virtualization module to provide a virtualization system that provides reliability guarantees for the storage area network and network attached storage objects that coexist in the virtualized storage space.
12. The multi-protocol storage appliance of claim 1 wherein the file system provides volume management capabilities for block-based access to the information stored on the storage devices.
13. The multi-protocol storage appliance of claim 12 wherein the storage device is a disk.
14. The multi-protocol storage appliance of claim 1 wherein the file system provides (i) file system semantics including naming of storage objects and (ii) functionality associated with a volume manager.
15. The multi-protocol storage appliance of claim 14 wherein the functionality associated with the volume manager comprises at least one of:
an aggregation of the storage devices;
an aggregation of storage bandwidth of the device; and
reliability guarantees, including mirrored or redundant arrays of independent disks.
16. A method for providing file and block protocol access to information stored on storage devices of a multi-protocol storage appliance for network attached storage and storage area network deployment, the method comprising the steps of:
providing a single storage space to hold the stored information;
providing a network attached storage service with (i) a network adapter connecting the device to a first network and (ii) a file system capability allowing the device to access information stored as files in response to file-based requests issued by a network attached storage client; and
providing a storage area network service employing (i) a network target adapter coupling the device to a second network and (ii) volume management capabilities that allow the device to access information as virtual disk storage in response to block-based requests issued by a storage area network client.
17. The method of claim 16, further comprising the steps of:
providing name-based management of the files and virtual disks stored on the multi-protocol storage appliance, thereby providing a consistent naming scheme for file-based and block-based storage; and
providing a hierarchy of named files and virtual disks stored on the storage appliance, wherein each of the virtual disks is a special file type that is converted to an emulated disk.
18. A method for providing file and block protocol access to storage objects stored on storage devices of a multi-protocol storage appliance for network attached storage and storage area network deployment, the method comprising the steps of:
organizing the storage as one or more volumes representing a global storage space;
allowing storage area network and network attached storage objects to coexist in the global storage space;
receiving, at a multi-protocol engine of the storage device, block-based and file-based requests to access the storage area network and network attached storage objects; and
accessing the storage area network and network attached storage objects and returning them in response to the block-based and file-based requests.
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/215,917 | 2002-08-09 | ||
| US10/215,917 US7873700B2 (en) | 2002-08-09 | 2002-08-09 | Multi-protocol storage appliance that provides integrated support for file and block access protocols |
| PCT/US2003/023597 WO2004015521A2 (en) | 2002-08-09 | 2003-07-28 | Multi-protocol storage appliance that provides integrated support for file and block access protocols |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1082976A1 HK1082976A1 (en) | 2006-06-23 |
| HK1082976B true HK1082976B (en) | 2008-08-15 |