
CN1285993C - Input/output processing unit of storage device - Google Patents

Info

Publication number
CN1285993C
Authority
CN
China
Prior art keywords
cpu
interface
processing unit
chip
bridge sheet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CN 03148526
Other languages
Chinese (zh)
Other versions
CN1567146A (en)
Inventor
郑珉
胡鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN 03148526
Publication of CN1567146A
Application granted
Publication of CN1285993C
Anticipated expiration
Expired - Lifetime

Landscapes

  • Multi Processors (AREA)

Abstract

The present invention discloses an input/output (IO) processing unit of a storage device. It comprises at least: one or more CPUs capable of operating independently, each with its own memory and its own CPU reset module; one or more bridge chips for protocol conversion; and disk interface modules connected to the storage units of the storage device. Each CPU is connected to its memory and to an independent CPU reset module. The CPUs are connected to the bridge chips through a high-speed protocol channel that can support more than one CPU subsystem, the bridge chips are connected to one another through the same high-speed protocol channel, and each bridge chip is connected to a disk interface module through a PCI bus. The IO processing unit strengthens the self-protection capability of the storage device during IO and reduces crashes, thereby improving system availability. It also raises the IO processing speed of the storage device and avoids bandwidth bottlenecks such as that of the PCI bus.

Description

Input/output processing device of a storage device

Technical Field

The present invention relates to input/output (IO) processing technology, and in particular to an IO processing device of a storage device.

Background Art

As computer processing speed and storage technology advance rapidly, the IO processing capability of computer equipment has become a major factor in its performance and availability. In particular, growing demand for storage capacity has produced large storage devices that operate independently, such as disk arrays. Such a device contains its own storage units and IO processing device and can provide storage space and IO services to several servers at once, so the quality of its IO processing device directly affects the performance of the whole system.

Figure 1 shows the basic structure of the storage-device IO processing device commonly used in the industry today. It comprises a CPU 101, a north bridge chip 102, a south bridge chip 103, a memory 104, and several Fibre Channel (FC) modules 105 for connecting to storage such as disk arrays. Each FC module 105 includes an FC HBA and an FC interface; the FC interface connects to a storage unit made up of hard disks. The CPU 101 connects to the north bridge chip 102 through the CPU front-side bus, the north bridge chip 102 connects to the RAM and to the south bridge chip 103, and all FC modules 105 connect to the south bridge chip 103 through a single PCI bus. This structure resembles that of a PC and has two drawbacks. First, system availability is hard to improve: all FC modules 105 sit on the same PCI bus and are served by a single CPU 101 with no protection mechanism, so a fault anywhere stops the system, and the likelihood of a system crash is high. Second, the IO bandwidth bottleneck is pronounced: several FC interfaces share the bandwidth of one PCI bus, which is insufficient for the FC interfaces to run at full load, and the chained arrangement of CPU 101, north bridge chip 102, south bridge chip 103 and FC interfaces also tends to leave the CPU 101 short of IO and interrupt-handling capacity, making it hard to raise system speed.

Summary of the Invention

In view of this, the main object of the present invention is to provide an IO processing device of a storage device that strengthens the self-protection capability of the storage device during IO and reduces crashes, thereby improving system availability, and that further raises the IO processing speed of the storage device and avoids bandwidth bottlenecks such as that of PCI.

An input/output processing device of a storage device according to the invention comprises at least: one or more CPUs that can work independently, as many memories as there are CPUs, and as many CPU reset modules as there are CPUs; and one or more bridge chips for protocol conversion, with as many disk interface modules, for connecting to the storage units of the storage device, as there are bridge chips. Each CPU has a memory and an independent CPU reset module connected to it. The one or more CPUs are connected to the bridge chips through a high-speed protocol channel that can support more than one CPU subsystem, the bridge chips are connected to one another through the same high-speed protocol channel, and each bridge chip is connected to a disk interface module through a PCI bus.

In the device, the high-speed protocol channel supporting more than one CPU subsystem may be a HyperTransport I/O interface channel, in which case the CPU is a CPU with a HyperTransport I/O interface and the bridge chip is one that converts the HyperTransport I/O interface protocol to the PCI protocol.

In the device, the high-speed protocol channel supporting more than one CPU subsystem may be an IB interface channel, in which case the bridge chip consists of an IB switch and a TCA, the CPU is connected to the IB switch of the bridge chip through an HCA, and the TCA is connected to the disk interface module.

In the device, the CPU reset module is a programmable logic array chip programmed with reset-watchdog logic.

In the device, the disk interface module may include an FC chip for Fibre Channel protocol processing and an FC interface; the FC chip is connected to the bridge chip through a PCI bus and the FC interface is connected to the storage unit.

In the device, the disk interface module may include an iSCSI chip for iSCSI protocol processing and an iSCSI interface; the iSCSI chip is connected to the bridge chip through a PCI bus and the iSCSI interface is connected to the storage unit.

In the device, the memory is random access memory.

As the above scheme shows, the storage-device IO processing device of the invention uses two or more CPU subsystems that can work independently, which adds a protection mechanism to the IO processing device and improves system availability. Each disk interface module has a PCI bus to itself, which increases bandwidth and effectively removes the PCI bottleneck. The high-speed transmission bus and the cooperation of several CPUs greatly raise the IO processing speed of the IO processing device.

Brief Description of the Drawings

Figure 1 is a schematic structural diagram of an existing storage-device IO processing device;

Figure 2 is a schematic structural diagram of an embodiment of the invention with two CPUs and four FC modules;

Figure 3 is a schematic diagram of the connection between the reset modules and the CPUs in the invention;

Figure 4 is a signal flow diagram of the embodiment with two CPUs and four FC modules;

Figure 5 is a schematic structural diagram of an embodiment of the invention with two CPUs and two FC modules;

Figure 6 is a schematic structural diagram of an embodiment of the invention with two CPUs and five FC modules;

Figure 7 is a schematic structural diagram of an embodiment of the invention with three CPUs and six FC modules;

Figure 8 is a schematic structural diagram of an embodiment of the invention with two CPUs and four iSCSI modules.

Detailed Description of the Embodiments

The invention is described in further detail below with reference to the drawings and specific embodiments.

The invention uses two or more CPU subsystems. Each CPU is connected, through a high-speed protocol channel and a group of bridge chips for protocol conversion, to a group of disk interface modules such as FC modules. In normal operation the CPU subsystems cooperate, and each CPU controls the IO of some of the bridge chips and of the disk interface modules attached to them. Memory is shared among the CPUs: a CPU can treat the memory of the other CPUs, and the bridge chips and disk interface modules they control, as its own IO space. When one CPU fails and crashes, the other CPUs take over its work, which improves the availability of the storage device's IO processing device.

Figure 2 shows the structure of the storage-device IO processing device in a preferred implementation of the invention. It comprises two CPUs 101, two memories 104, two independently operating CPU reset modules 201, four bridge chips 202 and four FC modules 105. The structure is left-right symmetric. The four bridge chips 202 are connected in series by HyperTransport I/O Interface (HT) high-speed signal lines to form a bridge chain, and each bridge chip 202 has one FC module 105 attached through its own PCI bus. The two CPUs 101 sit at the left and right ends of the bridge chain and connect to the bridge chips 202 at those ends through HT high-speed signal lines. Each CPU 101 has its own memory 104 and a CPU reset module 201 that resets the CPU 101 when it fails.

In this embodiment, data is exchanged between the CPUs 101 and the bridge chips 202, and between adjacent bridge chips 202, using the HT protocol. HT is a high-speed channel protocol. Besides high speed and high performance, it provides a general-purpose interconnect for the system, reduces the number of internal buses, can connect several relatively independent CPU subsystems, and allows the CPU subsystems to share their memories 104.

Supporting the HT protocol therefore requires a CPU that provides an HT interface itself. Such CPUs include the MIPS-core-based BROADCOM BCM 1250, MIPS1125 and MIPS1280, and the x86-based Hummer series.

In this embodiment the CPU 101 is a Broadcom BCM 1250. It has two 64-bit MIPS cores running at 600 MHz to 1 GHz and two 64-bit double data rate random access memory (DDR RAM) channels running at up to 200 MHz (400 Mbit/s per data pin), for a maximum RAM bandwidth of 6.4 GB/s, and it can connect directly to DDR RAM. The CPU 101 also provides a high-speed LDT HyperTransport interface with a bandwidth of 400M × 2 × 2 × 8 = 12.8 Gbit/s, which can connect directly to a bridge chip 202. In addition, the Generic I/O interface and the Reset pin of the BCM 1250 can be connected to the CPU reset module 201 of this embodiment.
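
The bandwidth figures in this paragraph can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the reading of the factors in 400M × 2 × 2 × 8 as clock rate, double data rate, two directions and an 8-bit link is an assumption based on the 8-bit 400 MHz DDR HT interface described for the bridge chip.

```python
def ddr_bandwidth_bytes_per_s(clock_hz, bus_bits, channels):
    """Peak DDR bandwidth: two transfers per clock on each bus_bits-wide channel."""
    transfers_per_s = clock_hz * 2
    return transfers_per_s * bus_bits // 8 * channels


def ht_bandwidth_bits_per_s(clock_hz, link_bits, directions=2):
    """Peak HT bandwidth: two transfers per clock, per direction, link_bits wide."""
    return clock_hz * 2 * directions * link_bits


# Two 64-bit DDR channels at 200 MHz.
ram_bytes_per_s = ddr_bandwidth_bytes_per_s(200_000_000, 64, 2)

# One 8-bit HT link at 400 MHz, counting both directions.
ht_bits_per_s = ht_bandwidth_bits_per_s(400_000_000, 8)

print(ram_bytes_per_s / 1e9, "GB/s")   # 6.4 GB/s
print(ht_bits_per_s / 1e9, "Gbit/s")   # 12.8 Gbit/s
```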

The memory 104 in this embodiment is DDR RAM.

The bridge chip 202 in this embodiment mainly performs protocol conversion between HT and PCI. A three-port bridge chip with HT interfaces is used: it has two HT interfaces and one PCI interface, and all three must be able to exchange data with one another, so that the CPU 101 can read and write the registers of chips on the PCI bus through the bridge chip 202 as IO-space accesses. This embodiment uses API's HyperTransport PCI bridge. It provides two 8-bit, 400 MHz DDR HT interfaces, one on each side, which allow the bridge chips 202 to be chained, and one 66 MHz × 64-bit PCI interface, which connects to an FC module 105 over a PCI bus. The chip also supports non-transparent bridging between the PCI interface and the HT interfaces, which prevents data crosstalk between PCI and HT.

The FC module 105 in this embodiment consists of an FC chip and an FC interface. The FC chip mainly handles FC protocol processing, covering the FC-0, FC-1, FC-2, FC-3 and FC-4 layers. The FC chip may be an LSI919. This chip has an embedded ARM core for FC protocol processing and an internal DMA module that can send data directly to the memory 104. It integrates a 2G FC SerDes and directly provides one bidirectional 2.125 Gbit/s differential signal for connection to an SFP optical module. Externally it provides a 66 MHz × 64-bit PCI interface, which matches the PCI interface of the HyperTransport PCI bridge chip 202.

Figure 3 shows the CPU reset circuit of this embodiment. A single programmable logic array chip (FPGA) implements two independent sets of reset-watchdog logic. Each watchdog, that is, each reset module 201, is connected to the Generic I/O interface and the Reset pin of one CPU 101, and the two sets of reset logic work independently without interfering with each other. In operation, each watchdog in the FPGA is set to output a reset signal to the Reset pin of its CPU 101 after a relatively long interval, while the CPU 101 clears the watchdog through the Generic I/O interface at a much shorter interval, zeroing the watchdog count. The watchdog therefore resets the CPU 101 only when the CPU 101 has failed to clear it for long enough. Both the watchdog's reset interval and the CPU's clearing interval can be set freely. For example, the CPU 101 may clear the watchdog every 10 milliseconds and the watchdog may be set to reset the CPU 101 after 500 milliseconds; if the CPU 101 then fails to clear the watchdog for 500 consecutive milliseconds, the watchdog outputs a reset signal to the Reset pin and resets the CPU 101. In addition, startup and initialization of the CPU 101 take about 30 seconds. To keep the watchdog from sending another reset to the Reset pin during that time, the minimum interval between two resets from the watchdog can be set to 1 minute, leaving enough time for the CPU 101 to finish starting up and initializing.
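
The watchdog behaviour described above can be modelled in software to check the timing. The following is a minimal sketch, not the FPGA logic itself: the class and function names are hypothetical, time is simulated in 10 ms steps, and the hung CPU is assumed never to resume clearing the watchdog. The 10 ms, 500 ms and 1 minute values are the example values from the text.

```python
class ResetWatchdog:
    """Software model of one reset watchdog; all times are in milliseconds."""

    def __init__(self, timeout_ms=500, min_reset_gap_ms=60_000):
        self.timeout_ms = timeout_ms              # reset after this long without a clear
        self.min_reset_gap_ms = min_reset_gap_ms  # minimum spacing between two resets
        self.last_clear_ms = 0
        self.last_reset_ms = None

    def clear(self, now_ms):
        """Called by the CPU to zero the watchdog count."""
        self.last_clear_ms = now_ms

    def tick(self, now_ms):
        """Return True if a reset pulse is issued at now_ms."""
        if now_ms - self.last_clear_ms < self.timeout_ms:
            return False
        if (self.last_reset_ms is not None
                and now_ms - self.last_reset_ms < self.min_reset_gap_ms):
            return False  # still inside the boot window of the previous reset
        self.last_reset_ms = now_ms
        return True


def simulate(hang_at_ms, end_ms, step_ms=10):
    """CPU clears the watchdog every step_ms until it hangs; return reset times."""
    watchdog = ResetWatchdog()
    resets = []
    for now in range(0, end_ms + 1, step_ms):
        if now < hang_at_ms:
            watchdog.clear(now)
        if watchdog.tick(now):
            resets.append(now)
    return resets


print(simulate(hang_at_ms=1000, end_ms=70_000))  # [1490, 61490]
```

In this run the last clear happens at 990 ms, so the first reset comes 500 ms later, and the second reset is held back until a full minute has passed.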

The workflow of the storage-device IO processing device in this embodiment is as follows.

During power-on initialization, the left and right CPUs power up and run their self-tests at the same time.

After the left CPU powers up, it detects all four bridge chips by accessing their IO addresses. It installs drivers only for the left-1 and left-2 bridge chips, however, and initializes their registers. It then detects the left-1 and left-2 FC chips attached to those bridge chips and initializes them.

The left CPU accesses the physical address space corresponding to the right CPU to check whether that CPU is present. If it is, the left CPU sets up a logical address mapping for the right CPU and reserves a memory region as the communication space with the right CPU. The left CPU then finishes initialization and begins normal operation.

If the left CPU finds during initialization that the right CPU is absent, for example because it has crashed, is damaged, or is not fitted, the left CPU also initializes the left-3 and left-4 bridge chips and the left-3 and left-4 FC chips, finishes initialization, and begins normal operation. The left CPU then controls all bridge chips and FC chips on its own.

Once in normal operation, the left CPU checks the presence of the right CPU at regular intervals, for example every second. One way to do this is to reserve a location in the right CPU's communication register that holds a number, have the right CPU add 1 to the number at regular intervals, for example every millisecond, and have the left CPU check the number every second. If the number has not changed since the previous check, the right CPU may be absent, for example because it has crashed.
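
The counter-based presence check can be sketched as follows. This is an illustration only: `HeartbeatMonitor` is a hypothetical name, the counter value is passed in directly rather than read from a shared register, and treating the very first reading as "present" is an assumption the text does not specify.

```python
class HeartbeatMonitor:
    """Decides whether the peer CPU is present by watching its counter change."""

    def __init__(self):
        self.last_seen = None  # counter value observed at the previous check

    def peer_present(self, counter_value):
        """Return False if the counter has not moved since the previous check."""
        present = self.last_seen is None or counter_value != self.last_seen
        self.last_seen = counter_value
        return present


monitor = HeartbeatMonitor()
print(monitor.peer_present(5))     # True: first reading, nothing to compare against
print(monitor.peer_present(1005))  # True: the peer incremented the counter
print(monitor.peer_present(1005))  # False: no change, the peer may have crashed
```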

After the right CPU powers up, it detects the four bridge chips by accessing their IO addresses.

It then waits for a period, for example 1 minute, so that the left CPU can finish its initialization.

The right CPU then accesses the physical address space corresponding to the left CPU to check whether that CPU is present. If it is, the right CPU sets up a logical address mapping for the left CPU and reserves a memory region as the communication space with the left CPU. Both CPUs are now working normally. Through this communication the left CPU assigns the left-3 and left-4 bridge chips to the right CPU, and the right CPU initializes the left-3 and left-4 bridge chips and the left-3 and left-4 FC chips. When initialization is complete, the right CPU begins normal operation.

If the right CPU finds during initialization that the left CPU is absent, for example because it has crashed, is damaged, or is not fitted, the right CPU initializes the left-1 through left-4 bridge chips and the left-1 through left-4 FC chips. When initialization is complete it begins normal operation, controlling all bridge chips and FC chips on its own.

Once in normal operation, the right CPU, like the left CPU, checks the presence of the left CPU every second.
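
The outcome of the power-on sequence for the two CPUs can be summarized as a small decision function. This is a sketch under stated assumptions: the names `BRIDGES` and `assign_bridges` are hypothetical, and the function models only which bridge chips (with their FC chips) each CPU ends up initializing, not the register accesses themselves.

```python
BRIDGES = ["left1", "left2", "left3", "left4"]


def assign_bridges(left_present, right_present):
    """Return {cpu: [bridge chips it initializes]} after power-on initialization."""
    if left_present and right_present:
        # Normal case: the left CPU keeps left-1/left-2 and hands left-3/left-4
        # to the right CPU.
        return {"left": BRIDGES[:2], "right": BRIDGES[2:]}
    if left_present:
        return {"left": list(BRIDGES)}   # right CPU absent: left takes everything
    if right_present:
        return {"right": list(BRIDGES)}  # left CPU absent: right takes everything
    return {}


print(assign_bridges(True, True))
print(assign_bridges(True, False))
```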

During initialization in this embodiment, each of the two CPUs sets aside part of its memory that the other CPU is allowed to read and write. To each CPU, that part of the other CPU's memory, together with the FC chips, appears as a readable and writable IO space. Using the HT protocol in this way, the two CPUs communicate by reading and writing part of each other's memory. Alternatively, they can communicate by reading and writing the same register inside a shared FPGA, or over a serial port or Ethernet.

Figure 4 shows the data flow in the IO processing device when both CPUs 101 are working normally:

The two CPUs 101 work in multiprocessor (MP) mode, each running its own operating system, for example Linux. Both CPUs 101 can access all four bridge chips 202 and the FC chips attached to them at the same time. Under normal conditions, the left CPU 101 handles only the traffic of FC interfaces 1 and 2, and the right CPU 101 handles only the traffic of FC interfaces 3 and 4. This removes the PCI bandwidth bottleneck at the FC interfaces, increases bandwidth, and greatly raises the IO processing speed of the storage device.

FC interfaces 1 and 4 are the system's front-end interfaces; in a storage network they normally connect to the FC interfaces of servers and other computers. FC interfaces 2 and 3 are the system's back-end interfaces and normally connect to FC hard disk modules, that is, the storage units.

Taking the left half of Figure 4 as an example, a server's access request first reaches the FC chip through FC interface 1. The ARM core embedded in the LSI919 FC chip processes the FC protocol, after which the chip's DMA module transfers the server's disk access request in PCI burst mode through the bridge chip 202 and the CPU 101 into the DDR RAM memory 104. The CPU 101 then processes the request in the memory 104. This includes checking that the request is valid, looking it up in the cache, and starting a read from the hard disk. The processed data is placed in the memory 104, from where the FC chip of FC interface 2 fetches it by DMA and sends it to the hard disk module connected to FC interface 2.

The two CPUs 101 of the IO processing device are connected to the two watchdogs of the same FPGA, one watchdog per CPU. If the left CPU subsystem crashes or otherwise fails, communication between the two CPUs 101 is interrupted, and the right CPU 101 learns of the left CPU's state through its once-per-second check. Having found the left CPU 101 absent, the right CPU 101 reconfigures the registers of the FC chips of FC interfaces 1 and 2, so that all four FC chips send all traffic to the right CPU subsystem. The processing capacity of the IO processing device drops by about 50%, but its bandwidth does not, and every port continues to work, which greatly improves system availability.

Because the left CPU 101 has failed, it stops clearing the left watchdog in the FPGA; after 500 consecutive milliseconds without a clear, the left watchdog resets the left CPU 101. Once the left CPU 101 has been reset and is working normally again, it takes back FC interfaces 1 and 2 and normal operation resumes.
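
Putting the example timings together gives a rough failover timeline for a left-CPU crash. The sketch below is an estimate under simplifying assumptions that are not in the original: the right CPU's checks fall on whole multiples of the check period, the left CPU's last watchdog clear coincides with the moment it hangs, and the reboot takes exactly the quoted 30 seconds. The function name is hypothetical.

```python
def failover_timeline(hang_ms, check_period_ms=1000, wd_timeout_ms=500, boot_ms=30_000):
    """Estimate when each recovery step happens after the left CPU hangs at hang_ms."""
    # The right CPU notices at the first check where the counter has not changed
    # since the previous check, i.e. the first check a full period after the hang.
    periods_until_hang = -(-hang_ms // check_period_ms)  # ceiling division
    takeover_ms = (periods_until_hang + 1) * check_period_ms
    # The left watchdog fires once the timeout elapses without a clear.
    watchdog_reset_ms = hang_ms + wd_timeout_ms
    # The left CPU takes back its FC interfaces after it has rebooted.
    left_back_ms = watchdog_reset_ms + boot_ms
    return {
        "watchdog_reset_ms": watchdog_reset_ms,
        "right_takeover_ms": takeover_ms,
        "left_back_ms": left_back_ms,
    }


print(failover_timeline(1200))
# {'watchdog_reset_ms': 1700, 'right_takeover_ms': 3000, 'left_back_ms': 31700}
```

With these example values the watchdog resets the failed CPU before the peer has even noticed the failure, and the right CPU carries all four FC interfaces for a little under 30 seconds.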

The IO processing device in this embodiment uses two independently operating CPU subsystems, which gives the system's IO processing dual protection and more than doubles its IO processing capability. With two CPUs, other numbers of FC modules, two or more, can also be supported.

Figure 5 shows a structure in which two CPUs 101 support two FC modules 105. Here one FC interface is the front-end interface, connected to servers in the storage network, and the other is the back-end interface, connected to the storage units. One CPU 101 works normally while the other is on standby and monitors the presence of the working CPU 101 in real time; as soon as it finds the working CPU 101 absent, it takes over. After the original working CPU 101 has been reset, it can either remain on standby and monitor the other CPU 101, or take over again from the current CPU 101, which then returns to standby.

Figure 6 shows a structure in which two CPUs 101 support five FC modules 105. Two of the five FC modules 105 are front-end interfaces connected to servers, and the other three are back-end interfaces connected to the storage units. In normal operation, one CPU 101 is assigned two FC modules 105 and the other is assigned three. When one CPU 101 is absent because of a fault, the other CPU 101 promptly initializes the FC modules 105 and bridge chips 202 that the failed CPU 101 controlled and takes over its work until the failed CPU 101 has been reset.

The invention can also use more than two CPU subsystems. Figure 7 shows a structure in which three CPUs 101 support six bridge chips 202. Each CPU 101 supports two bridge chips 202 and their FC modules 105, forming three CPU subsystems. An HT switch chip 701 connects the end bridge chips 202 of the three CPU subsystems through HT high-speed signal lines and carries communication among them. Priorities can be assigned to the three CPUs 101. In normal operation each CPU 101 controls its own share of the bridge chips 202 and FC modules 105 and the three cooperate; any CPU 101 can treat the memory 104 of the other CPUs 101, and the bridge chips 202 and FC modules 105 they control, as its own IO space. When one CPU 101 fails and is absent, one of the other two is chosen, in the preset priority order, to take over its work. If two CPUs 101 are absent at once, the remaining CPU 101 takes over the work of both. The IO processing capability of the device is then only about 30% of the original, but its bandwidth is not reduced, any FC interface can still carry IO, and the system remains usable. Once a failed CPU 101 has been reset, normal operation can resume.
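
The priority-based takeover among three CPUs can be illustrated with a short function. This is a sketch, not the patented mechanism: the CPU and module names are hypothetical, and it assumes that "chosen in the preset priority order" means the highest-priority surviving CPU takes over every orphaned module.

```python
HOME = {
    "cpu_a": ["fc1", "fc2"],
    "cpu_b": ["fc3", "fc4"],
    "cpu_c": ["fc5", "fc6"],
}
PRIORITY = ["cpu_a", "cpu_b", "cpu_c"]  # highest priority first (assumed order)


def reassign(home, priority, alive):
    """Return {fc_module: cpu serving it} given the set of CPUs still present."""
    survivors = [cpu for cpu in priority if cpu in alive]
    owner = {}
    for cpu, modules in home.items():
        if cpu in alive:
            target = cpu           # a healthy CPU keeps its own modules
        elif survivors:
            target = survivors[0]  # highest-priority survivor takes over
        else:
            target = None          # no CPU left to serve the module
        for module in modules:
            owner[module] = target
    return owner


print(reassign(HOME, PRIORITY, {"cpu_a", "cpu_b"}))  # fc5 and fc6 move to cpu_a
print(reassign(HOME, PRIORITY, {"cpu_c"}))           # cpu_c serves all six modules
```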

The case of four or more CPUs is similar to the case of three CPUs.

The FC module described in the present invention may be an electrical interface module or an optical interface module. In addition, the present invention may use an iSCSI interface supporting the iSCSI protocol, with an iSCSI module composed of an iSCSI chip and an iSCSI interface serving as the disk interface module. Referring to FIG. 8, a schematic structural diagram shows an IO processing unit with four disk interface modules and dual CPUs 101. Its structure is basically the same as that of the embodiment shown in FIG. 2, except that iSCSI modules 801 replace the FC modules 105 shown in FIG. 2.

In addition, other types of high-speed protocols that can support two or more CPU subsystems may be used for data exchange between the CPU and the bridge chips, and between one bridge chip and another.

For example, the IB (InfiniBand) interface protocol can be used. This protocol supports multiple CPU subsystems and allows memory sharing. When the IB protocol is used, there is no restriction on the interface type of the CPU, but the CPU must be used together with IB devices. A combination of a CPU and an HCA can replace the CPU with an HT interface used in the HT case, and a combination of an IB protocol switch (IB Switch) and a TCA can replace the bridge chip used in the HT case; the overall structure of the IO processing unit and the other chip types remain unchanged. The HCA, IB Switch, and TCA are all IB devices. The HCA connects the CPU to the IB protocol channel; chips of this type include Intel's 82808AA. The IB Switch connects the HCA and the TCA, and a BDC42116 may be used. The TCA is a bridge chip from IB to the PCI interface, and chips such as the BDC22104 and MT21108 may be used. A structure using the IB protocol achieves the same effect as the HT case.
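The substitution of HT components by IB components described above can be summarized as a small lookup table. The sketch below is only a restatement of the text in code form; the dictionary keys and the function name `ib_equivalent` are hypothetical labels, while the device names and part numbers are the ones given in the description.

```python
from typing import Dict, List, Tuple

# HT-case component -> IB-case combination that replaces it (per the description).
HT_TO_IB: Dict[str, Tuple[str, ...]] = {
    "cpu_with_ht_interface": ("CPU", "HCA"),
    "bridge_chip": ("IB Switch", "TCA"),
}

# Example parts named in the description for each IB device type.
IB_EXAMPLE_PARTS: Dict[str, List[str]] = {
    "HCA": ["82808AA"],
    "IB Switch": ["BDC42116"],
    "TCA": ["BDC22104", "MT21108"],
}


def ib_equivalent(ht_component: str) -> Tuple[str, ...]:
    """Return the IB-case combination that stands in for an HT-case component."""
    return HT_TO_IB[ht_component]
```

Everything outside this table, such as the disk interface modules and the PCI buses, is unchanged between the HT and IB variants according to the text.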

The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (7)

1. An input/output processing unit of a storage device, characterized in that it comprises at least:
more than one CPU capable of working independently, memories equal in number to the CPUs, and CPU reset modules equal in number to the CPUs;
more than one bridge chip for performing protocol conversion, and disk interface modules, equal in number to the bridge chips, for connecting to the storage units of the storage device;
wherein each CPU is connected to a memory and to an independent CPU reset module; and
the CPUs are connected to the bridge chips through a high-speed protocol channel capable of supporting more than one CPU subsystem, the bridge chips are connected to one another through the high-speed protocol channel, and each bridge chip is connected to a disk interface module through a PCI bus.
2. The input/output processing unit according to claim 1, characterized in that the high-speed protocol channel supporting more than one CPU subsystem is a HyperTransport (HT) input/output interface channel, the CPU is a CPU having an HT input/output interface, and the bridge chip is a bridge chip that converts the HT input/output interface protocol into the PCI protocol.
3. The input/output processing unit according to claim 1, characterized in that the high-speed protocol channel supporting more than one CPU subsystem is an IB interface channel, the bridge chip is composed of an IB switch and a TCA, the CPU is connected to the IB switch of the bridge chip through an HCA, and the TCA is connected to the disk interface module.
4. The input/output processing unit according to claim 1, characterized in that the CPU reset module is a programmable logic array chip programmed with a watchdog reset program.
5. The input/output processing unit according to claim 1, characterized in that the disk interface module comprises an FC chip for Fibre Channel protocol processing and an FC interface, the FC chip is connected to the bridge chip through the PCI bus, and the FC interface is connected to the storage unit.
6. The input/output processing unit according to claim 1, characterized in that the disk interface module comprises an iSCSI chip for iSCSI protocol processing and an iSCSI interface, the iSCSI chip is connected to the bridge chip through the PCI bus, and the iSCSI interface is connected to the storage unit.
7. The input/output processing unit according to claim 1, characterized in that the memory is a random access memory.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 03148526 CN1285993C (en) 2003-06-30 2003-06-30 Input/output processing unit of storage device

Publications (2)

Publication Number    Publication Date
CN1567146A (en)    2005-01-19
CN1285993C (en)    2006-11-22

Family

ID=34472303

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 03148526 Expired - Lifetime CN1285993C (en) 2003-06-30 2003-06-30 Input/output processing unit of storage device

Country Status (1)

Country Link
CN (1) CN1285993C (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8116330B2 (en) * 2009-06-01 2012-02-14 Lsi Corporation Bridge apparatus and methods for coupling multiple non-fibre channel devices to a fibre channel arbitrated loop
CN103150279B (en) * 2013-04-02 2015-05-06 无锡江南计算技术研究所 Method allowing host and baseboard management controller to share device
CN105094699B (en) * 2015-07-15 2018-10-02 浪潮(北京)电子信息产业有限公司 A kind of Cloud Server storage system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CX01 Expiry of patent term

Granted publication date: 20061122