WO2016062037A1 - Method, apparatus and system for information transmission and controller fault handling through interface cards - Google Patents

Info

Publication number
WO2016062037A1
Authority
WO
WIPO (PCT)
Prior art keywords
controller
interface card
controllers
notification message
fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2015/076658
Other languages
French (fr)
Chinese (zh)
Inventor
唐觅
陈明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of WO2016062037A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/34: Signalling channels for network management communication
    • H04L41/344: Out-of-band transfers

Definitions

  • the present invention relates to the field of communications, and in particular, to a method for transmitting information through an interface card, a controller fault handling method, an apparatus, and a system.
  • controllers are designed with redundancy. If one controller fails, the system's services are not affected, because all services are taken over by the redundant controllers. This design is very important for storage reliability.
  • in the prior art, each interface card used for controller control and management belongs to a single controller. As shown in Figure 1, interface card 1 is connected to controller A, and interface card 2 is connected to controller B. Each interface card serves only one controller. For example, if controller A fails, the service on its link stops and interface card 1 can no longer be used. One controller can also be connected to multiple interface cards; Figure 1 shows one interface card per controller as an example, but in every case each interface card can serve only one controller.
  • Embodiments of the present invention provide a method, an apparatus, and a system for transmitting information through an interface card, which are used to solve the problem that, when a controller fails, the services on its interface card are interrupted and the interface card and the devices connected to it can no longer be used.
  • a storage system comprising:
  • M controllers for controlling the system;
  • the M controllers include a main controller and M-1 slave controllers as redundant, and M is a positive integer;
  • N interface cards, wherein each interface card is coupled to at least two controllers and is used for relaying signals transmitted by a controller or signals transmitted to a controller, or for processing signals from a controller; N is an integer less than or equal to M.
  • the interface card is connected to the controller through a PCIE bus.
  • a serial control bus and/or a parallel control bus is further connected between the interface card and the controller for transmitting control signals.
  • the system further includes at least one storage device, wherein each storage device is coupled to at least one interface card so that the controller can interact with the storage device via the corresponding interface card.
  • a second aspect of the present invention provides a method for transmitting information through an interface card, including:
  • when a first controller serving as the primary controller among the M controllers fails, a second controller of the M controllers competes to become the new primary controller;
  • the second controller performs information relaying by using at least a first interface card of the N interface cards, wherein the first interface card is connected to both the first controller and the second controller.
  • the method further includes:
  • the second controller receives a first fault notification message sent by the first controller, where the first fault notification message is used to notify the second controller that the first controller has a fault.
  • after the second controller of the M controllers has competed to become the new primary controller, the method further includes:
  • the second controller receives a second fault notification message sent by a third controller of the M controllers, where the second fault notification message is used to notify the second controller that the third controller has failed;
  • the second controller removes information of the third controller from a list of redundant controllers; wherein the redundant controller list is used to record information that can be used as redundant controllers.
  • a third aspect of the present invention provides a controller fault processing method, including:
  • a first interface card, which among the N interface cards is connected to and serves the first controller, receives a second fault notification message sent by the first controller, where the second fault notification message is used to notify the first interface card that the first controller has a fault;
  • the first interface card controls the port connected to the first controller to enter an inactive state according to the second failure notification message to stop communication with the first controller.
  • the method further includes:
  • the first interface card receives a master control notification message sent by a second controller of the M controllers, where the master control notification message is used to notify the first interface card that the second controller has competed to become the new primary controller;
  • the first interface card controls, according to the master control notification message, that a port connected to the second controller enters an active state to communicate with the second controller through a port connected to the second controller.
  • a fourth aspect of the present invention provides a method for transmitting information through an interface card, including:
  • when the first controller, which is the primary controller of the M controllers included in the storage system, fails, the first controller sends a second fault notification message to a first interface card, where the second fault notification message is used to notify the first interface card that the first controller is faulty; the first interface card is the interface card, among the N interface cards included in the storage system, that is connected to and serves the first controller;
  • the first interface card controls, according to the second failure notification message, that a port connected to the first controller enters an inactive state to stop communication with the first controller;
  • when a second controller of the M controllers has competed to become the new primary controller, the second controller sends a master control notification message to the first interface card, where the master control notification message is used to notify the first interface card that the second controller has become the new primary controller; the first interface card is also connected to the second controller;
  • the first interface card controls, according to the master control notification message, that a port connected to the second controller enters an active state to communicate with the second controller through a port connected to the second controller.
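The four steps of this aspect can be sketched as a small message-driven model. This is only an illustration under stated assumptions: the class names `FaultNotification`, `MasterNotification`, and `InterfaceCard`, the controller names, and the two-state port model are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from enum import Enum


class PortState(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


@dataclass(frozen=True)
class FaultNotification:
    sender: str  # controller reporting that it has failed


@dataclass(frozen=True)
class MasterNotification:
    sender: str  # controller announcing that it is the new primary


@dataclass
class InterfaceCard:
    # Hypothetical model: one port per connected controller.
    ports: dict[str, PortState] = field(default_factory=dict)

    def handle(self, msg) -> None:
        if msg.sender not in self.ports:
            raise KeyError(f"no port connected to {msg.sender}")
        if isinstance(msg, FaultNotification):
            # Stop communicating with the failed controller.
            self.ports[msg.sender] = PortState.INACTIVE
        elif isinstance(msg, MasterNotification):
            # Start communicating with the new primary controller.
            self.ports[msg.sender] = PortState.ACTIVE
        else:
            raise TypeError(f"unexpected message: {msg!r}")


# The first controller fails, then the second controller takes over.
card = InterfaceCard({"first": PortState.ACTIVE, "second": PortState.INACTIVE})
card.handle(FaultNotification("first"))
card.handle(MasterNotification("second"))
```

The card itself never changes hands; only the state of its two ports is swapped, which is what lets the devices attached to the card stay in service.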
  • a controller comprising:
  • an operation module, configured to cause the controller to compete to become the new main controller when a first controller serving as the main controller of the M controllers fails;
  • a communication module configured to perform information relaying by using at least one of the N interface cards; wherein the first interface card is respectively connected to the first controller and the controller.
  • the controller further includes a receiving module, configured to receive, before the operation module causes the controller to compete to become the new primary controller, a first fault notification message sent by the first controller, where the first fault notification message is used to notify the controller that the first controller has a fault.
  • the controller further includes a receiving module and a removing module;
  • the receiving module is configured to receive, after the operation module has caused the controller to compete to become the new primary controller, a second fault notification message sent by a third controller of the M controllers, where the second fault notification message is used to notify the controller that the third controller has failed;
  • the removal module is configured to remove information of the third controller from a list of redundant controllers; wherein the redundant controller list is used to record information that can be used as redundant controllers.
  • an interface card including:
  • a receiving module, configured to receive a second fault notification message sent by a first controller when the first controller, which is the master controller of the M controllers, fails, where the second fault notification message is used to notify the interface card that the first controller is faulty; the interface card is the interface card, among the N interface cards, that is connected to and serves the first controller;
  • a control module, configured to control, according to the second fault notification message, a port connected to the first controller to enter an inactive state so as to stop communication with the first controller.
  • the receiving module is further configured to receive, after receiving the second fault notification message sent by the first controller, a master control notification message sent by a second controller of the M controllers, where the master control notification message is used to notify the interface card that the second controller has competed to become the new master controller;
  • the control module is further configured to control, according to the master control notification message, a port connected to the second controller to enter an active state so as to communicate with the second controller through that port.
  • a storage system including:
  • a first controller, configured to send a second fault notification message to a first interface card when the first controller, which is the master controller of the M controllers included in the storage system, fails, where the second fault notification message is used to notify the first interface card that the first controller has a fault; the first interface card is the interface card, among the N interface cards included in the storage system, that is connected to and serves the first controller;
  • the first interface card is configured to control, according to the second fault notification message, a port connected to the first controller to enter an inactive state to stop communication with the first controller;
  • a second controller, configured to send a master control notification message to the first interface card when the second controller of the M controllers has competed to become the new master controller, where the master control notification message is used to notify the first interface card that the second controller has become the new master controller; the first interface card is also connected to the second controller;
  • the first interface card is further configured to control, according to the master control notification message, the port connected to the second controller to enter an active state so as to communicate with the second controller through that port.
  • in the embodiments of the present invention, an interface card is connected to at least two controllers. If one of the controllers connected to an interface card fails, the interface card can stop serving that controller and, because it is also connected to other controllers, continue to serve them. In this way, even if a controller fails, the interface card can remain in use as long as the card itself is not faulty. Compared with the prior art, the services on the interface card are not interrupted, and other hardware devices connected to the interface card can continue to transmit information through it, which improves the reliability of the system.
  • the interface card and the device connected to the interface card can continue to be used, which also saves hardware resources to a certain extent and improves the utilization of the interface card.
  • the number of interface cards can be reduced to a certain extent, the system structure tends to be simple, and the volume of the system is reduced.
  • FIG. 1 is a structural diagram of a storage system in the prior art
  • FIG. 2 is a schematic structural diagram of a storage system according to an embodiment of the present invention.
  • FIG. 3 is a detailed structural diagram of an implementation manner of a storage system according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of another implementation manner of a storage system according to an embodiment of the present invention.
  • FIG. 5 is a main flowchart of a method for transmitting information by using an interface card according to an embodiment of the present invention
  • FIG. 6 is a main flowchart of a controller fault processing method according to an embodiment of the present invention.
  • FIG. 7 is a main flowchart of another method for transmitting information by using an interface card according to an embodiment of the present invention.
  • FIG. 8 is a main structural block diagram of a controller in an embodiment of the present invention.
  • FIG. 9 is a main structural block diagram of an interface card according to an embodiment of the present invention.
  • An embodiment of the present invention provides a storage system, including: M controllers for controlling the system, where the M controllers include one main controller and M-1 redundant slave controllers, and M is a positive integer; and N interface cards, where each interface card is coupled to at least two controllers and is used for relaying signals transmitted by a controller or transmitted to a controller, or for processing signals from a controller; N is an integer less than or equal to M.
  • the terms "system" and "network" are used interchangeably herein.
  • the term "and/or" herein merely describes an association between objects and indicates that three relationships are possible. For example, "A and/or B" may indicate three cases: A exists alone, both A and B exist, or B exists alone.
  • the character "/" herein, unless otherwise specified, generally indicates an "or" relationship between the objects before and after it.
  • an embodiment of the present invention provides a storage system, which may include M controllers 201 and N interface cards 202.
  • M controllers 201 are used to control the system.
  • the M controllers 201 include one main controller 201 and M-1 as redundant slave controllers 201, and M is a positive integer. That is, the M-1 slave controllers 201 are backups of the master controller 201, and when the master controller 201 fails, one of the slave controllers 201 can continue to operate as the master controller 201, allowing the system operation to continue.
  • N interface cards 202, each of which is connected to at least two controllers 201 of the M controllers 201 and is used to relay signals sent by, or sent to, the controllers 201 connected to that interface card 202; the interface card 202 is also used to process signals from the controllers 201 connected to it.
  • N is an integer less than or equal to M, that is, in the system, the number of interface cards 202 is less than or equal to the number of controllers 201.
  • one interface card 202 corresponds to multiple controllers 201 (that is, it is connected to at least two controllers 201), while one controller 201 may correspond to only one interface card 202 or to multiple interface cards 202.
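The structural constraints stated above (every interface card connects to at least two controllers, and N is at most M) can be checked mechanically. The following is a minimal sketch; `StorageTopology`, its fields, and the controller and card names are hypothetical and are used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StorageTopology:
    """Hypothetical model of which controllers each interface card is wired to."""
    controllers: set[str]
    # Maps an interface-card name to the set of controllers it is connected to.
    card_links: dict[str, set[str]] = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return the violated constraints; an empty list means the topology is valid."""
        problems = []
        m, n = len(self.controllers), len(self.card_links)
        if n > m:
            problems.append(f"N={n} interface cards exceeds M={m} controllers")
        for card, linked in sorted(self.card_links.items()):
            unknown = linked - self.controllers
            if unknown:
                problems.append(f"{card} links to unknown controllers {sorted(unknown)}")
            if len(linked & self.controllers) < 2:
                problems.append(f"{card} must connect to at least two controllers")
        return problems


# Dual-controller system sharing one card, as in the embodiment.
shared = StorageTopology({"A", "B"}, {"card1": {"A", "B"}})
# Prior-art wiring: each card belongs to exactly one controller.
prior_art = StorageTopology({"A", "B"}, {"card1": {"A"}, "card2": {"B"}})
```

Here `shared.validate()` returns an empty list, while `prior_art.validate()` reports one violation per card, since neither card reaches a second controller.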
  • the interface card 202 can have multiple types of ports that can be connected to different hardware modules. Then the signal sent by the controller 201 can be relayed through the interface card 202. For example, the format of the signal sent by the controller 201 to the interface card 202 is format 1. If the controller 201 wants to send the signal to the hardware module 1, and the corresponding signal format of the hardware module 1 is format 2, the interface card 202 can The format of the received signal is converted from format 1 to format 2 and then sent to the hardware module 1. Similarly, the signal sent by the hardware module 1 to the controller 201 is also relayed through the interface card 202.
  • the controller 201 can also send signals to the interface card 202, such as signals for controlling the interface card 202, or signals to the interface card 202 to inform the status of the controller 201, etc., which the interface card 202 can process.
  • the interface card 202 can also send a signal to the controller 201, such as a signal for notifying the controller 201 of the status of the interface card 202, and the like.
  • the interface card 202 and the controller 201 can be connected through a PCIE (Peripheral Component Interconnect Express) bus.
  • for an interface card 202 and a controller 201 connected to it, there may be two PCIE buses, PCIE TX and PCIE RX, called the PCIE transmit bus and the PCIE receive bus respectively, where transmit and receive are both defined from the point of view of the controller 201.
  • through PCIE TX the controller 201 can transmit signals at high speed, and through PCIE RX the controller 201 can receive signals.
  • in FIG. 3, which shows two controllers 201, a denotes PCIE TX and b denotes PCIE RX.
  • the PCIE TX and the PCIE RX can be dimensioned according to actual service requirements, for example with link widths such as x4, x8, or x16, and they are the main service data channels between the controller 201 and the interface card 202.
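To give a sense of what choosing between x4, x8, and x16 means for throughput, the sketch below computes the raw per-direction bandwidth of a link from the per-lane signalling rate and line-code efficiency. The application does not state which PCIe generation is used, so the generation is an assumed parameter here, and the function name is hypothetical. The result excludes packet and protocol overhead.

```python
# Per-lane raw rate in GT/s and line-code efficiency for PCIe generations 1 to 3.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gbytes_per_s(generation: int, lanes: int) -> float:
    """Raw one-direction bandwidth in GB/s, before packet and protocol overhead."""
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s times efficiency gives usable Gbit/s per lane; divide by 8 for bytes.
    return rate_gt_s * efficiency * lanes / 8


# Compare the three widths named in the text for an assumed Gen3 link.
gen3_widths = {lanes: link_bandwidth_gbytes_per_s(3, lanes) for lanes in (4, 8, 16)}
```

Bandwidth scales linearly with the lane count, so an x16 link carries four times the traffic of an x4 link of the same generation.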
  • in addition, a serial control bus and/or a parallel control bus may be connected between the interface card 202 and the controller 201 for transmitting control signals.
  • in the illustrated example, a serial control bus and a parallel control bus are both connected: c denotes the serial control bus and d denotes the parallel control bus.
  • the serial control bus transmits low-speed signals and is mainly used for handshaking between the controller 201 and the interface card 202. For example, the controller 201 can read the type, status, and alarm information of the interface card 202, and the interface card 202 can acquire status information of the controller 201, such as its master-slave status, that is, whether the controller 201 is the master controller 201 or a slave controller 201.
  • the parallel control bus transmits high-speed signals and is mainly used by the controller 201 to control the state of the interface card 202 and to upgrade the firmware, version, and so on of the interface card 202. It carries general basic control signals such as the in-position signal, the power-on enable signal, the interrupt signal, and the reset signal.
  • serial control bus and the parallel control bus are all buses that can be customized according to the required functions, and are not limited to a fixed form.
  • each of the N interface cards 202 can be connected to the server, so that information exchange can be implemented between the server and the system through the interface card 202.
  • the storage system may further include at least one storage device, wherein each storage device is connected to at least one interface card 202, so that the controller 201 and the storage device can exchange information through the corresponding interface card 202.
  • the storage device may for example be a hard disk or may be other types of devices for storing information.
  • Figure 3 takes as an example one interface card 202 that is connected to two storage devices.
  • each interface card 202 can have a control module, and the function of the interface card 202 is implemented by the control module.
  • the control module in the interface card 202 is a module that can process, control, convert or switch signals between the controller 201 and the interface card 202, and may be implemented as one or more chips or as external on-board circuitry. According to the state of, or a command from, the controller 201, it changes and switches the state of the ports on the interface card 202 that are connected to the controllers 201. It can also convert signals from the PCIE protocol format to various other protocol formats, for example to the FC (Fibre Channel) protocol format, the GE (Gigabit Ethernet) protocol format, or the SAS (Serial Attached SCSI) protocol format.
  • the control module can also implement a mirroring (NT) function to back up data in real time with the controller 201.
  • there are connection lines between the two controllers 201: for example, a connection line for transmitting mirrored data (that is, backup data), a connection line for transmitting heartbeat information, and possibly also a serial control bus and/or a parallel control bus.
  • likewise, any two of the controllers 201 may be connected by a connection line for transmitting mirrored data, a connection line for transmitting heartbeat information, and a serial control bus and/or a parallel control bus.
  • the two controllers 201 perform state information transmission through mirroring and heartbeat, and monitor each other's status and service characteristics in real time.
  • Both FIG. 2 and FIG. 3 described above show redundant architectures based on the commonly used arrangement of dual storage controllers with one interface card 202.
  • in normal operation, the service flow is transmitted between the main controller 201 and the interface card 202 over the PCIE bus.
  • the interface card 202 converts each PCIE-format message into a message of another protocol bus format, such as the FC, GE, or SAS protocol format, and the interface card 202 can be connected to a server port at the front end or to a disk at the back end.
  • when the main controller 201 fails, it notifies the slave controller 201 through the heartbeat signal or the power-down interrupt signal between the controllers 201, and the slave controller 201 can then compete to become the main controller 201. Because the interface card 202 is connected to both controllers 201, the interface card 202 switches its port states accordingly and continues to work, serving the new main controller 201. The service of the interface card 202 therefore continues, and service is maintained with the server at the front end of the interface card 202 or the cascading enclosure at its back end.
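The fault detection described above can be sketched as follows. The text says only that the failure is conveyed by the heartbeat signal or the power-down interrupt signal; treating a heartbeat that has been silent for longer than a timeout as a fault is an assumption of this sketch, and `PeerMonitor`, `role_after_check`, and the timeout value are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    timeout_s: float          # assumed: silence longer than this counts as a fault
    last_beat_s: float = 0.0  # time of the most recent heartbeat from the peer

    def on_heartbeat(self, now_s: float) -> None:
        self.last_beat_s = now_s

    def peer_failed(self, now_s: float, power_down_interrupt: bool = False) -> bool:
        # A power-down interrupt is an immediate fault; otherwise use the timeout.
        return power_down_interrupt or (now_s - self.last_beat_s) > self.timeout_s


def role_after_check(role: str, monitor: PeerMonitor, now_s: float,
                     power_down_interrupt: bool = False) -> str:
    """A slave takes over as main when its monitor reports that the main failed."""
    if role == "slave" and monitor.peer_failed(now_s, power_down_interrupt):
        return "main"
    return role


monitor = PeerMonitor(timeout_s=1.0)
monitor.on_heartbeat(10.0)
still_slave = role_after_check("slave", monitor, now_s=10.5)
after_timeout = role_after_check("slave", monitor, now_s=11.6)
after_interrupt = role_after_check("slave", monitor, now_s=10.1, power_down_interrupt=True)
```

With more than one slave controller, the competition among them would need an additional tie-breaking rule, which this sketch does not model.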
  • FIG. 4 provides another possible schematic diagram of the storage system.
  • each controller 201 can connect two interface cards 202.
  • for example, the controller 201 on the left initially acts as the main controller 201 and uses the interface card 202 on the left.
  • when the controller 201 on the left fails, it notifies the controller 201 on the right.
  • the failed controller 201 also notifies each interface card 202 connected to it, so that each interface card 202 changes the state of the corresponding port.
  • the controller 201 on the right then competes to become the main controller 201.
  • the controller 201 on the right may continue to use the interface card 202 on the left, may choose to use the interface card 202 on the right, or may use both the interface card 202 on the left and the interface card 202 on the right.
  • the choice may follow pre-set rules, may be random, and so on.
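The choice of interface cards by the new main controller could be sketched as below. The function name `select_cards`, the card names, and the shape of the rule (a predicate over card names) are hypothetical; the text only says that the choice may follow a pre-set rule or be random.

```python
import random
from typing import Callable, Optional


def select_cards(available: list[str],
                 rule: Optional[Callable[[str], bool]] = None,
                 rng: Optional[random.Random] = None) -> list[str]:
    """Pick the interface cards the new main controller will use.

    If a pre-set rule is given and matches at least one card, use the matching
    cards; otherwise fall back to picking one available card at random.
    """
    if not available:
        raise ValueError("no usable interface card")
    if rule is not None:
        chosen = [card for card in available if rule(card)]
        if chosen:
            return chosen
    rng = rng or random.Random()
    return [rng.choice(available)]


cards = ["left", "right"]
keep_left = select_cards(cards, rule=lambda card: card == "left")
use_both = select_cards(cards, rule=lambda card: True)
random_pick = select_cards(cards, rng=random.Random(0))
```

Passing a seeded `random.Random` makes the random branch reproducible, which is convenient when testing failover behaviour.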
  • the interface cards 202 and the plurality of controllers 201 implement a redundant design by being cross-connected, and this cross-connected redundancy makes the services on the interface cards 202 more reliable.
  • even if a controller 201 and an interface card 202 both fail, as long as there are still controllers 201 and interface cards 202 that have not failed, the services at the front and back ends of the interface cards 202 can continue, which avoids service interruption and greatly improves system reliability.
  • an interface card 202 can be connected to a plurality of controllers 201 to be able to serve a plurality of controllers 201.
  • for example, the controller 201 on the left is the master controller 201, and the interface card 202 serves the controller 201 on the left. If the main controller 201 fails, the slave controller 201 on the right can compete to become the main controller 201, and the interface card 202 can continue to work by serving the controller 201 on the right.
  • the interface card 202 therefore does not become unusable merely because a controller 201 connected to it has failed. This preserves the continuity of the services on the interface card 202 as far as possible, allows other devices connected to the interface card 202 to remain in use, improves the reliability of the system, saves hardware resources, and improves the utilization of the interface card 202.
  • an embodiment of the present invention provides a method for transmitting information through an interface card, which may be applied to the storage system shown in FIG. 2, FIG. 3, and FIG. The main flow of the method is described below.
  • Step 501 When the first controller as the main controller of the M controllers fails, the second controller of the M controllers competes for the new main controller.
  • the M controllers 201 include a main controller 201 and M-1 as redundant slave controllers 201.
  • the M controllers 201 belong to a storage system, and the storage system further includes N interface cards 202, wherein each interface card 202 is coupled to at least two controllers 201 and is used for relaying signals transmitted by a controller 201 or transmitted to a controller 201, or for processing signals from a controller 201; N is an integer less than or equal to M.
  • the first controller is the controller 201 on the left side of the figure
  • the second controller is the controller 201 on the right side of the figure.
  • when the storage system starts to work, it is first powered on and then initialized. After initialization, the main controller 201 detects the in-position signal of the interface card 202, that is, determines whether the interface card 202 has been inserted into the correct slot. If the interface card 202 is not in position, it needs to be inserted. If the in-position signal is detected to be pulled low (the signal is normally active low), the driver reads the binary value that encodes the type of the interface card 202 and determines whether that type is supported by the system. If the type is not supported, the driver raises an alarm message, the red light on the interface card 202 is lit, and the user can replace the card with an interface card 202 that the system supports.
  • if the type of the interface card 202 is supported, the system sends a power-on enable signal, a clock signal and a reset signal to the interface card 202 so that the interface card 202 begins operation.
  • the interface card 202 and the main controller 201 then automatically negotiate the port; negotiating the port means determining information such as the transmission bandwidth and transmission rate of the transmission channel between the interface card 202 and the main controller 201.
  • after negotiation, the interface card 202 and the main controller 201 can communicate.
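The bring-up sequence above can be sketched as a function that returns the steps taken for a given slot. The type codes in `SUPPORTED_TYPES`, the function name `bring_up`, and the step strings are all hypothetical; the application does not specify the binary encoding of card types.

```python
# Hypothetical binary type codes; the real encoding is not given in the text.
SUPPORTED_TYPES = {0b001: "FC", 0b010: "GE", 0b011: "SAS"}


def bring_up(in_position_level: int, type_code: int) -> list[str]:
    """Return the bring-up steps for one slot.

    in_position_level is the logic level of the in-position signal, which is
    active low: 0 means a card is seated in the slot.
    """
    if in_position_level != 0:
        return ["card not in position: insert card"]
    steps = ["card in position"]
    if type_code not in SUPPORTED_TYPES:
        steps.append("unsupported type: raise alarm, light red LED")
        return steps
    steps.append(f"type {SUPPORTED_TYPES[type_code]} supported")
    steps.append("send power-on enable, clock and reset signals")
    steps.append("negotiate transmission bandwidth and rate")
    steps.append("ready to communicate")
    return steps


empty_slot = bring_up(in_position_level=1, type_code=0b001)
bad_card = bring_up(in_position_level=0, type_code=0b111)
good_card = bring_up(in_position_level=0, type_code=0b010)
```

The early returns mirror the text: a missing or unsupported card never receives the power-on enable, clock, or reset signals.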
  • the main controller 201 at this time is, for example, the controller 201 on the left side in FIG.
  • the method may further include:
  • the second controller receives a first fault notification message sent by the first controller, where the first fault notification message is used to notify the second controller that the first controller has a fault.
  • when the main controller 201 fails, it notifies the slave controller 201 of the failure through the power-down interrupt signal or the heartbeat signal, that is, it sends the first fault notification message to the slave controller 201. The slave controller 201 then competes to become the main controller 201; the new main controller 201 is, for example, the controller 201 on the right side of the figure. At the same time, the original main controller 201 also notifies the interface card 202 of the failure; alternatively, the interface card 202 can itself detect the status of each controller 201 at regular intervals, periodically, or at random times.
  • The controller 201 on the right side notifies the interface card 202, through the serial control bus or the parallel control bus between it and the interface card 202, that it has competed to become the master controller 201, that is, that a master-slave switchover has occurred between the controllers 201. The port connecting the interface card 202 to the original main controller 201 enters an inactive state or a mirrored state, and the port connecting the interface card 202 to the controller 201 on the right can start working, that is, it enters the active state.
  • Controlling a port on the interface card 202 to enter the inactive state, the mirrored state, or the active state may be performed by a control module in the interface card 202.
  • When a port is controlled to enter the mirrored state, it can be used to transfer mirror data, that is, backup data. A port in the mirrored state is then equivalent to a backup port of another port.
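  • The three port states handled by the control module can be sketched as a small state holder. The Python below is illustrative only: `PortState`, `ControlModule` and its method names are hypothetical, and starting every non-master port in the inactive state is an assumption, since the text does not state the initial state of those ports.

```python
from enum import Enum


class PortState(Enum):
    INACTIVE = "inactive"
    MIRRORED = "mirrored"
    ACTIVE = "active"


class ControlModule:
    """Tracks one port per connected controller on a single interface card."""

    def __init__(self, controller_ids, master_id):
        # Assumption: only the port facing the current master starts active.
        self.ports = {
            cid: PortState.ACTIVE if cid == master_id else PortState.INACTIVE
            for cid in controller_ids
        }

    def on_fault(self, controller_id, mirror=False):
        """Fault notification: the port to that controller stops carrying service."""
        self.ports[controller_id] = PortState.MIRRORED if mirror else PortState.INACTIVE

    def on_new_master(self, controller_id):
        """Master control notification: the port to the new master starts working."""
        self.ports[controller_id] = PortState.ACTIVE
```

  • For the two-controller example in the text, `on_fault("left")` followed by `on_new_master("right")` leaves the left port inactive and the right port active.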
  • the method may further include:
  • The second controller receives a second fault notification message sent by a third controller of the M controllers 201, where the second fault notification message is used to notify the second controller that the third controller has a fault;
  • the second controller removes information of the third controller from a list of redundant controllers; wherein the redundant controller list is used to record information of each controller 201 that can be redundant.
  • the system includes three controllers 201, which are a controller 1, a controller 2 and a controller 3, respectively.
  • the controller 1 is a master controller 201
  • The controller 2 and the controller 3 are slave controllers 201.
  • After the controller 1 fails, the controller 2 competes to become the main controller 201, while the controller 3 remains a slave controller 201.
  • While the controller 2 is working, if the controller 3 also fails, the controller 3 notifies the controller 2 through the power-down interrupt signal or the heartbeat signal, that is, it sends the second fault notification message to the controller 2, and the controller 2 can remove the information of the controller 3 from the list of redundant controllers.
  • The controller 3 also notifies the interface card 202 of the fault information, or the interface card 202 can itself detect the state of each controller 201 periodically, at set times, or randomly, and the interface card 202 controls the port connected to the controller 3 to enter the inactive or mirrored state.
  • When a slave controller 201 fails, the service between the main controller 201 and the interface card 202 is not affected. If there are other slave controllers 201 in the system, for example a fourth controller, then when the controller 2 acting as the main controller 201 fails, the controller 2 does not send the fault information to the faulty controller 3; that is, when a master-slave switchover is required, a faulty slave controller 201 is not selected, and a fault-free slave controller 201 is selected instead. Of course, if there is no other slave controller 201 in the system and the controller 2 acting as the main controller 201 also fails, the system may stop running.
  • After the controller 3 recovers from the fault, it can notify the controller 2 by means of a heartbeat signal or the like, and the controller 2 brings the controller 3 back into the selection range for master-slave switching, that is, adds the information of the controller 3 back to the list of redundant controllers.
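  • The bookkeeping of the redundant controller list can be illustrated with a short Python sketch. The class and method names are hypothetical, and picking the first eligible entry is an assumption made only to keep the example deterministic; the text says that the controllers compete and does not specify how the winner is chosen.

```python
class RedundantControllerList:
    """Kept by the current main controller; holds slaves eligible for switchover."""

    def __init__(self, slave_ids):
        # A list keeps insertion order, which makes the pick below deterministic.
        self._eligible = list(slave_ids)

    def on_fault(self, controller_id):
        """Second fault notification received: drop the faulty slave."""
        if controller_id in self._eligible:
            self._eligible.remove(controller_id)

    def on_recovery(self, controller_id):
        """Recovery reported (for example by heartbeat): make the slave eligible again."""
        if controller_id not in self._eligible:
            self._eligible.append(controller_id)

    def pick_successor(self):
        """Return a fault-free slave, or None when no redundancy is left."""
        return self._eligible[0] if self._eligible else None
```

  • A `None` result corresponds to the case in which no fault-free slave controller remains and the system may stop running.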
  • When recovering from a failure, the slave controller 201 can also notify the interface card 202.
  • the controller 201 and the interface card 202 can exchange information.
  • The controller 201 can send a notification message to the interface card 202 in real time, at set times, or when a state transition occurs, to inform the interface card 202 of the current state of the controller 201; alternatively, the interface card 202 can send a probe message to the controller 201 in real time, at set times, or randomly, to ascertain the current state of the controller 201.
  • Likewise, the interface card 202 can send a notification message to the controller 201 in real time, at set times, or when a state transition occurs, to inform the controller 201 of the current state of the interface card 202; alternatively, the controller 201 can send a probe message to the interface card 202 in real time, at set times, or randomly, to ascertain the current state of the interface card 202.
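  • The two ways of exchanging state, notification on a state transition and probing, can be contrasted in a few lines of Python. This is a sketch under assumptions: the `Node` class, its attributes and the state strings are invented for illustration, and direct method calls stand in for messages on the control bus.

```python
class Node:
    """A controller or an interface card that exposes a current state."""

    def __init__(self, name, state="normal"):
        self.name = name
        self.state = state
        self.peers = []   # nodes to notify when this node changes state
        self.known = {}   # last state learned for each peer, keyed by peer name

    def set_state(self, new_state):
        """Notification path: tell every peer when the state actually changes."""
        if new_state != self.state:
            self.state = new_state
            for peer in self.peers:
                peer.known[self.name] = new_state

    def probe(self, other):
        """Probe path: ask the other node for its state and record the answer."""
        self.known[other.name] = other.state
        return other.state
```

  • Probing does not rely on the other side being able to send anything on its own initiative, which may matter when a controller fails without managing to issue its notification.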
  • Step 502: The second controller relays information through at least a first interface card of the N interface cards, where the first interface card is connected to both the first controller and the second controller.
  • The second controller competes to become the main controller 201.
  • the first controller and the second controller are connected to the same interface card 202.
  • the interface card 202 can continue to be used.
  • the second controller can continue to use the interface card 202, which is referred to herein as the first interface card.
  • an embodiment of the present invention provides a controller fault processing method, which may be applied to the storage system shown in FIG. 2, FIG. 3, and FIG. The main flow of the method is described below.
  • Step 601: When a first controller acting as the main controller among the M controllers fails, a first interface card among the N interface cards, which is connected to and serves the first controller, receives a second fault notification message sent by the first controller, where the second fault notification message is used to notify the first interface card that the first controller has a fault.
  • The M controllers 201 include one main controller 201 and M-1 redundant slave controllers 201, and the M controllers 201 and the N interface cards 202 belong to a storage system, where each interface card 202 is connected to at least two controllers 201 and is used for relaying signals transmitted by the controllers 201 or transmitted to the controllers 201, or for processing signals from the controllers 201; N is an integer less than or equal to M.
  • For example, the controller 201 on the left side is the main controller 201. When it fails, it may send the first failure notification message to the slave controller 201 and, at the same time, send the second failure notification message to the interface card 202. The specific process has been described in the introduction of FIG. 2 to FIG. 5 and is not repeated here.
  • the method may further include:
  • The first interface card receives a master control notification message sent by a second controller of the M controllers 201, where the master control notification message is used to notify the first interface card that the second controller has competed to become the new master controller 201;
  • the first interface card controls, according to the master control notification message, that a port connected to the second controller enters an active state to communicate with the second controller through a port connected to the second controller.
  • For example, the controller 201 on the left side is the main controller 201. When it fails, it can send the first failure notification message to the slave controller 201, and the slave controller 201 then competes to become the new main controller 201. The new main controller 201 sends the master control notification message to the first interface card, and after receiving the master control notification message, the first interface card activates the port connected to the new main controller 201 and communicates with the new main controller 201 through that port.
  • Step 602 The first interface card controls, according to the second failure notification message, that a port connected to the first controller enters an inactive state to stop communication with the first controller.
  • Upon receiving the second failure notification message, the interface card 202 can control the port connected to the controller 201 on the left to enter an inactive state, so that communication with the controller 201 on the left is stopped.
  • The master control notification message may also be sent to the interface card 202, and the interface card 202 may then control the port connected to the controller 201 on the right to enter an active state, thereby communicating with the controller 201 on the right.
  • an embodiment of the present invention provides another method for transmitting information through an interface card, which may be applied to the storage system shown in FIG. 2, FIG. 3, and FIG. The main flow of the method is described below.
  • Step 701: When a first controller acting as the main controller among the M controllers included in the storage system fails, the first controller sends a second fault notification message to a first interface card, where the second fault notification message is used to notify the first interface card that the first controller has a fault, and the first interface card is the interface card, among the N interface cards included in the storage system, that is connected to and serves the first controller.
  • For example, the controller 201 on the left side is the main controller 201. When it fails, it may send the first failure notification message to the slave controller 201 and, at the same time, send the second failure notification message to the interface card 202. The specific process has been described in the introduction of FIG. 2 to FIG. 5 and is not repeated here.
  • Step 702 The first interface card controls, according to the second failure notification message, that a port connected to the first controller enters an inactive state to stop communication with the first controller.
  • Upon receiving the second failure notification message, the interface card 202 can control the port connected to the controller 201 on the left to enter an inactive state, so that communication with the controller 201 on the left is stopped.
  • Step 703: When a second controller of the M controllers competes to become the new main controller, the second controller sends a master control notification message to the first interface card, where the master control notification message is used to notify the first interface card that the second controller has competed to become the new main controller, and the first interface card is connected to the second controller.
  • Step 704: The first interface card controls, according to the master control notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.
  • For example, the controller 201 on the left is the main controller 201. When it fails, it may send the first fault notification message to the slave controller 201, and the slave controller 201 then competes to become the new main controller 201. The new main controller 201 sends the master control notification message to the first interface card, and after receiving it, the first interface card activates the port connected to the new main controller 201 to communicate with the new main controller 201. The specific process has been described in the introduction of FIG. 2 to FIG. 5 and is not repeated here.
  • An embodiment of the present invention provides a controller, which may be the controller 201 in the storage system shown in FIG. 2 to FIG. 4, that is, the controller 201 described in the flows of FIG. 5 to FIG. 7. In particular, the controller 201 may be the second controller described in the flows of FIG. 5 to FIG. 7.
  • the controller 201 can include an operation module 801 and a communication module 802.
  • The operation module 801 is configured to cause the controller 201 to compete to become the new main controller 201 when a first controller acting as the main controller 201 among the M controllers 201 fails.
  • The M controllers 201 include one main controller 201 and M-1 redundant slave controllers 201. The M controllers 201 belong to the storage system, and the storage system further includes N interface cards 202, where each interface card 202 is connected to at least two controllers 201 and is used for relaying signals transmitted by the controllers 201 or transmitted to the controllers 201, or for processing signals from the controllers 201; N is an integer less than or equal to M.
  • The communication module 802 is configured to relay information through at least a first interface card of the N interface cards 202, where the first interface card is connected to both the first controller and the controller 201.
  • The controller 201 may further include a receiving module, configured to: before the operation module 801 causes the controller 201 to compete to become the new main controller 201, receive a first failure notification message sent by the first controller, where the first failure notification message is used to notify the controller 201 that the first controller has a fault.
  • the controller 201 may further include the receiving module and the removing module.
  • The receiving module is configured to receive, after the operation module 801 causes the controller 201 to compete to become the new main controller 201, a second failure notification message sent by a third controller of the M controllers 201, where the second failure notification message is used to notify the controller 201 that the third controller has a fault;
  • the removal module is configured to remove information of the third controller from a list of redundant controllers; wherein the redundant controller list is used to record information of each controller 201 that can be redundant.
  • An embodiment of the present invention provides an interface card, which may be the interface card 202 in the storage system shown in FIG. 2 to FIG. 4, that is, the interface card 202 described in the flows of FIG. 5 to FIG. 7. In particular, the interface card 202 may be the first interface card described in the flows of FIG. 5 to FIG. 7.
  • the interface card 202 can include a receiving module 901 and a control module 902.
  • The receiving module 901 is configured to receive, when a first controller acting as the main controller 201 among the M controllers 201 fails, a second fault notification message sent by the first controller, where the second fault notification message is used to notify the interface card 202 that the first controller has a fault, and the interface card 202 is the interface card 202, among the N interface cards 202, that is connected to and serves the first controller;
  • the control module 902 is configured to control, according to the second failure notification message, a port connected to the first controller to enter an inactive state to stop communication with the first controller.
  • The receiving module 901 is further configured to: after receiving the second fault notification message sent by the first controller, receive a master control notification message sent by a second controller of the M controllers 201, where the master control notification message is used to notify the interface card 202 that the second controller has competed to become the new master controller 201;
  • The control module 902 is further configured to control, according to the master control notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.
  • an embodiment of the present invention further provides a storage system, which may be the storage system shown in FIG. 2 to FIG. 4, that is, the storage system described in the flowcharts of FIG. 5-7.
  • the storage system can include a first controller, a first interface card, and a second controller.
  • The storage system may include multiple controllers 201 and multiple interface cards 202; here only two controllers 201 (that is, the first controller and the second controller) and one interface card 202 (that is, the first interface card) are taken as an example.
  • The first controller is configured to send a second fault notification message to the first interface card when the first controller, acting as the main controller 201 among the M controllers 201 included in the storage system, fails, where the second fault notification message is used to notify the first interface card that the first controller has a fault, and the first interface card is the interface card, among the N interface cards 202 included in the storage system, that is connected to and serves the first controller;
  • the first interface card is configured to control, according to the second fault notification message, a port connected to the first controller to enter an inactive state to stop communication with the first controller;
  • The second controller is configured to send a master control notification message to the first interface card when the second controller competes to become the new main controller 201 among the M controllers 201, where the master control notification message is used to notify the first interface card that the second controller has competed to become the new main controller 201, and the first interface card is connected to the second controller;
  • The first interface card is further configured to control, according to the master control notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.
  • An embodiment of the present invention provides a storage system, including: M controllers 201 for controlling the system, where the M controllers 201 include one main controller 201 and M-1 redundant slave controllers 201, and M is a positive integer; and N interface cards 202, where each interface card 202 is connected to at least two controllers 201 and is used for relaying signals transmitted by the controllers 201 or transmitted to the controllers 201, or for processing signals from the controllers 201, and N is an integer less than or equal to M.
  • An interface card 202 is connected to at least two controllers 201. If one of the controllers 201 connected to an interface card 202 fails, the interface card 202 can stop serving that controller 201. Because the interface card 202 is also connected to other controllers 201, it can continue to serve them. In this way, even if a controller 201 fails, the interface card 202 can continue to be used as long as the interface card 202 has no fault. Compared with the prior art, the services in the interface card 202 are not interrupted, and other hardware devices connected to the interface card 202 can continue to transmit information through the interface card 202, thereby ensuring the reliability of the system.
  • the interface card 202 and the device connected to the interface card 202 can continue to be used, which also saves hardware resources to some extent and improves the utilization of the interface card 202.
  • the number of the interface cards 202 can be reduced to some extent, the system structure tends to be simple, and the volume of the system is reduced.
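  • The wiring constraint and its benefit can be checked with a short Python sketch. The function names and the dictionary layout (interface card name mapped to the set of controllers it is wired to) are assumptions chosen for illustration.

```python
def validate_topology(controllers, card_links):
    """card_links maps each interface card to the set of controllers it is wired to."""
    if len(card_links) > len(controllers):
        raise ValueError("N must be less than or equal to M")
    for card, linked in card_links.items():
        if len(linked & set(controllers)) < 2:
            raise ValueError(f"{card} must be connected to at least two controllers")


def usable_cards(card_links, failed):
    """Cards that still reach at least one controller that has not failed."""
    return sorted(card for card, linked in card_links.items() if linked - set(failed))
```

  • With `{"card1": {"A", "B"}, "card2": {"A", "B"}}`, both cards stay usable after controller A fails. With the one-controller-per-card wiring of FIG. 1, `{"card1": {"A"}, "card2": {"B"}}`, only `card2` remains usable, and `validate_topology` rejects that wiring.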
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the modules or units is only a logical function division.
  • In actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • When the integrated unit is implemented in the form of a software functional unit and is sold or used as a standalone product, it can be stored in a computer readable storage medium.
  • the technical solution of the present application in essence or the contribution to the prior art, or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium.
  • a number of instructions are included to cause a computer device (which may be a personal computer, server, or network device, etc.) or a processor to perform all or part of the steps of the methods described in various embodiments of the present application.
  • The foregoing storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.

Landscapes

  • Hardware Redundancy (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)

Abstract

Disclosed is a storage system for solving the technical problems that, when a controller fails, services in its interface card are interrupted and the apparatus connected to the interface card can no longer be used. The system comprises: a number M of controllers (201) for controlling the system, wherein the M controllers (201) comprise a master controller (201) and a number M-1 of slave controllers (201) as redundancy, and M is a positive integer; and a number N of interface cards (202), wherein each interface card (202) is connected to at least two controllers (201), so as to relay signals transmitted by the controllers (201) or signals transmitted to the controllers (201), or to process signals from the controllers (201), and N is an integer less than or equal to M. Also disclosed are a method and corresponding apparatus for information transmission and fault handling for the controllers (201) through the interface cards (202).

Description

Method, apparatus and system for information transmission and controller fault handling through interface cards

Technical Field

The present invention relates to the field of communications, and in particular, to a method for transmitting information through an interface card, a controller fault handling method, an apparatus, and a system.

Background

At present, the vast majority of storage systems in the industry use two or more controllers in a redundant design, which is driven by the high reliability required of storage. With such redundancy, if one controller fails, system services are not affected: all services are taken over by another, redundant controller and continue to run. This design is very important for storage reliability.

The interface cards that a controller controls and manages, however, belong to individual controllers. As shown in FIG. 1, interface card 1 is connected to controller A, and interface card 2 is connected to controller B; each interface card serves only one controller. For example, if controller A fails, the services on that link stop, and interface card 1 can no longer be used. A controller may also be connected to multiple interface cards; FIG. 1 takes one interface card as an example, but each interface card can serve only one controller.

It can be seen that once a controller fails, all interface cards belonging to that controller also become unusable. Generally, an interface card is connected to other devices such as disks. With the working mode of the prior art, once the interface card stops being used, the services in the interface card are interrupted, and devices such as disks connected to the interface card may no longer be able to receive or send information, which means that the devices connected to the interface card can no longer be used.

Summary of the Invention

Embodiments of the present invention provide a method for transmitting information through an interface card, a controller fault handling method, an apparatus, and a system, which are used to solve the technical problem that, when a controller fails, services in the interface card are interrupted and devices connected to the interface card can no longer be used.

In a first aspect of the present invention, a storage system is provided, including:

M controllers, configured to control the system, where the M controllers include one main controller and M-1 redundant slave controllers, and M is a positive integer; and

N interface cards, where each interface card is connected to at least two controllers and is configured to relay signals transmitted by the controllers or signals transmitted to the controllers, or to process signals from the controllers, and N is an integer less than or equal to M.

With reference to the first aspect, in a first possible implementation of the first aspect, the interface card is connected to the controller through a PCIE bus.

With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, a serial control bus and/or a parallel control bus is further connected between the interface card and the controller, and is used to transmit control signals.

With reference to the first aspect or the first or second possible implementation of the first aspect, in a third possible implementation of the first aspect, the system further includes at least one storage device, where each storage device is connected to at least one interface card, so that the controller and the storage device exchange information through the corresponding interface card.

In a second aspect of the present invention, a method for transmitting information through an interface card is provided, including:

When a first controller acting as the main controller among M controllers fails, a second controller of the M controllers competes to become the new main controller;

The second controller relays information through at least a first interface card of N interface cards, where the first interface card is connected to both the first controller and the second controller.

With reference to the second aspect, in a first possible implementation of the second aspect, before the second controller of the M controllers competes to become the new main controller, the method further includes:

The second controller receives a first fault notification message sent by the first controller, where the first fault notification message is used to notify the second controller that the first controller has a fault.

With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, after the second controller of the M controllers competes to become the new main controller, the method further includes:

The second controller receives a second fault notification message sent by a third controller of the M controllers, where the second fault notification message is used to notify the second controller that the third controller has a fault;

The second controller removes information of the third controller from a list of redundant controllers, where the list of redundant controllers is used to record information of each controller that can serve as redundancy.

In a third aspect of the present invention, a controller fault handling method is provided, including:

When a first controller acting as the main controller among M controllers fails, a first interface card among N interface cards, which is connected to and serves the first controller, receives a second fault notification message sent by the first controller, where the second fault notification message is used to notify the first interface card that the first controller has a fault;

The first interface card controls, according to the second fault notification message, the port connected to the first controller to enter an inactive state, so as to stop communication with the first controller.

With reference to the third aspect, in a first possible implementation of the third aspect, after the first interface card receives the second fault notification message sent by the first controller, the method further includes:

The first interface card receives a master control notification message sent by a second controller of the M controllers, where the master control notification message is used to notify the first interface card that the second controller has competed to become the new main controller;

The first interface card controls, according to the master control notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.

本发明的第四方面,提供一种通过接口卡传输信息的方法,包括:A fourth aspect of the present invention provides a method for transmitting information through an interface card, including:

当存储系统中包括的M个控制器中作为主控制器的第一控制器发生故障时,所述第一控制器向第一接口卡发送第二故障通知消息,所述第二故障通知消息用于通知所述第一接口卡,所述第一控制器出现了故障;其中,所述第一接口卡为所述存储系统中包括N个接口卡中与所述第一控制器相连、且为所述第一控制器服务的接口卡; When the first controller that is the primary controller of the M controllers included in the storage system fails, the first controller sends a second failure notification message to the first interface card, where the second failure notification message is used. The first interface card is configured to notify the first interface card that the first controller is faulty; wherein the first interface card is connected to the first controller in the N interface cards included in the storage system, and is An interface card served by the first controller;

The first interface card controls, according to the second fault notification message, the port connected to the first controller to enter an inactive state, so as to stop communication with the first controller;

When a second controller of the M controllers contends to become the new master controller, the second controller sends a master notification message to the first interface card, where the master notification message is used to notify the first interface card that the second controller has become the new master controller through contention; the first interface card is connected to the second controller;

The first interface card controls, according to the master notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.

A fifth aspect of the present invention provides a controller, including:

an operation module, configured to cause the controller to contend to become the new master controller when a first controller serving as the master controller among M controllers fails; and

a communication module, configured to relay information through at least a first interface card of N interface cards, where the first interface card is connected to both the first controller and the controller.

With reference to the fifth aspect, in a first possible implementation of the fifth aspect, the controller further includes a receiving module, configured to receive, before the operation module causes the controller to contend to become the new master controller, a first fault notification message sent by the first controller, where the first fault notification message is used to notify the controller that the first controller has failed.

With reference to the fifth aspect or the first possible implementation of the fifth aspect, in a second possible implementation of the fifth aspect, the controller further includes a receiving module and a removal module;

The receiving module is configured to receive, after the operation module causes the controller to contend to become the new master controller, a second fault notification message sent by a third controller of the M controllers, where the second fault notification message is used to notify the controller that the third controller has failed;

The removal module is configured to remove information about the third controller from a redundant controller list, where the redundant controller list records information about each controller that is available as a redundant controller.
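The receiving and removal modules described above can be sketched as follows. This is only an illustrative model: the class name `MasterController`, the controller names, and the use of a plain list of names as the redundant controller list are assumptions, not details given in this document.

```python
from dataclasses import dataclass, field


@dataclass
class MasterController:
    """Hypothetical model of a controller that has become the new master."""

    name: str
    # Redundant controller list: names of controllers still usable as standbys.
    redundant: list[str] = field(default_factory=list)

    def on_fault_notification(self, sender: str) -> None:
        """Receiving module + removal module: drop a failed standby from the list."""
        if sender in self.redundant:
            self.redundant.remove(sender)


master = MasterController("ctrl-2", redundant=["ctrl-3", "ctrl-4"])
master.on_fault_notification("ctrl-3")
print(master.redundant)  # ['ctrl-4']
```

A notification from a controller that is not in the list is simply ignored in this sketch; a real implementation might log it instead.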

A sixth aspect of the present invention provides an interface card, including:

a receiving module, configured to receive, when a first controller serving as the master controller among M controllers fails, a second fault notification message sent by the first controller, where the second fault notification message is used to notify the interface card that the first controller has failed; the interface card is an interface card, among N interface cards, that is connected to the first controller and serves the first controller; and

a control module, configured to control, according to the second fault notification message, the port connected to the first controller to enter an inactive state, so as to stop communication with the first controller.

With reference to the sixth aspect, in a first possible implementation of the sixth aspect, the receiving module is further configured to receive, after receiving the second fault notification message sent by the first controller, a master notification message sent by a second controller of the M controllers, where the master notification message is used to notify the interface card that the second controller has become the new master controller through contention;

The control module is further configured to control, according to the master notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.

A seventh aspect of the present invention provides a storage system, including:

a first controller, configured to send a second fault notification message to a first interface card when the first controller, which serves as the master controller among M controllers included in the storage system, fails, where the second fault notification message is used to notify the first interface card that the first controller has failed; the first interface card is an interface card, among N interface cards included in the storage system, that is connected to the first controller and serves the first controller;

the first interface card, configured to control, according to the second fault notification message, the port connected to the first controller to enter an inactive state, so as to stop communication with the first controller; and

a second controller, configured to send a master notification message to the first interface card when the second controller, which is one of the M controllers, contends to become the new master controller, where the master notification message is used to notify the first interface card that the second controller has become the new master controller through contention; the first interface card is connected to the second controller;

The first interface card is further configured to control, according to the master notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.

In the embodiments of the present invention, one interface card is connected to at least two controllers. If one of the controllers connected to an interface card fails, the interface card can stop serving that controller; because the interface card is also connected to other controllers, it can continue to serve them. In this way, even if a controller fails, the interface card can continue to serve other controllers and remain in use as long as the card itself is not faulty. Compared with the prior art, services on the interface card are not interrupted, and other hardware devices connected to the interface card can continue to transmit information through it, which safeguards the reliability of the system.

In addition, because the interface card and the devices connected to it can remain in use, hardware resources are saved to some extent and the utilization of the interface card is improved. Furthermore, the technical solutions in the embodiments of the present invention can reduce the number of interface cards to some extent, which simplifies the system structure and helps reduce the size of the system.

Brief Description of the Drawings

FIG. 1 is an architecture diagram of a storage system in the prior art;

FIG. 2 is a simplified architecture diagram of a storage system according to an embodiment of the present invention;

FIG. 3 is a detailed architecture diagram of one implementation of the storage system according to an embodiment of the present invention;

FIG. 4 is a simplified architecture diagram of another implementation of the storage system according to an embodiment of the present invention;

FIG. 5 is a main flowchart of a method for transmitting information through an interface card according to an embodiment of the present invention;

FIG. 6 is a main flowchart of a controller fault handling method according to an embodiment of the present invention;

FIG. 7 is a main flowchart of another method for transmitting information through an interface card according to an embodiment of the present invention;

FIG. 8 is a main structural block diagram of a controller according to an embodiment of the present invention;

FIG. 9 is a main structural block diagram of an interface card according to an embodiment of the present invention.

Detailed Description

An embodiment of the present invention provides a storage system, including: M controllers, configured to control the system, where the M controllers include one master controller and M-1 redundant slave controllers, and M is a positive integer; and N interface cards, where each interface card is connected to at least two controllers and is configured to relay signals transmitted by, or transmitted to, the controllers, or to process signals from the controllers; N is an integer less than or equal to M.

In the embodiments of the present invention, one interface card is connected to at least two controllers. If one of the controllers connected to an interface card fails, the interface card can stop serving that controller; because the interface card is also connected to other controllers, it can continue to serve them. In this way, even if a controller fails, the interface card can continue to serve other controllers and remain in use as long as the card itself is not faulty. Compared with the prior art, services on the interface card are not interrupted, and other hardware devices connected to the interface card can continue to transmit information through it, which safeguards the reliability of the system.

In addition, because the interface card and the devices connected to it can remain in use, hardware resources are saved to some extent and the utilization of the interface card is improved. Furthermore, the technical solutions in the embodiments of the present invention can reduce the number of interface cards to some extent, which simplifies the system structure and helps reduce the size of the system.

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

In addition, the terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean that A exists alone, that both A and B exist, or that B exists alone. In addition, unless otherwise specified, the character "/" herein generally indicates an "or" relationship between the objects before and after it.

The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Referring to FIG. 2, an embodiment of the present invention provides a storage system that may include M controllers 201 and N interface cards 202. FIG. 2 uses M=2 and N=1 as an example.

The M controllers 201 are configured to control the system. The M controllers 201 include one master controller 201 and M-1 redundant slave controllers 201, where M is a positive integer. That is, the M-1 slave controllers 201 act as backups of the master controller 201: when the master controller 201 fails, one of the slave controllers 201 can take over as the master controller 201 so that the system keeps working.

Each of the N interface cards 202 is connected to at least two of the M controllers 201. An interface card 202 is configured to relay signals transmitted by, or transmitted to, the controllers 201 connected to it; alternatively, the interface card 202 is configured to process signals from the controllers 201 connected to it.

N is an integer less than or equal to M; that is, in the system, the number of interface cards 202 is less than or equal to the number of controllers 201. In this embodiment of the present invention, one interface card 202 corresponds to multiple controllers 201 (that is, it is connected to at least two controllers 201), while one controller 201 may correspond to only one interface card 202 or to multiple interface cards 202.
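The two structural constraints stated above (N is at most M, and every interface card is connected to at least two controllers) can be checked with a short sketch. The function name `validate_topology` and the dictionary representation of the connections are hypothetical choices made for illustration only.

```python
def validate_topology(links: dict[str, set[str]], controllers: set[str]) -> list[str]:
    """Return the violated constraints; an empty list means the topology is valid.

    links maps each interface card name to the controllers it is wired to.
    """
    problems = []
    # N (number of cards) must not exceed M (number of controllers).
    if len(links) > len(controllers):
        problems.append("N must not exceed M")
    for card, connected in links.items():
        unknown = connected - controllers
        if unknown:
            problems.append(f"{card} references unknown controllers {sorted(unknown)}")
        # Each card needs at least two known controllers to provide redundancy.
        if len(connected & controllers) < 2:
            problems.append(f"{card} must connect to at least two controllers")
    return problems


# Topology of FIG. 2: M=2 controllers sharing N=1 interface card.
fig2 = validate_topology({"card0": {"A", "B"}}, {"A", "B"})
# A card wired to a single controller offers no redundancy.
bad = validate_topology({"card0": {"A"}}, {"A", "B"})
```

Here `fig2` is an empty list, while `bad` contains one message reporting that `card0` must connect to at least two controllers.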

The interface card 202 may have multiple types of ports and can therefore be connected to different hardware modules, so signals sent by the controller 201 can be relayed through the interface card 202. For example, suppose the signal that the controller 201 sends to the interface card 202 is in format 1, the controller 201 wants to send the signal to hardware module 1, and hardware module 1 uses format 2. The interface card 202 can then convert the received signal from format 1 to format 2 before sending it to hardware module 1. Likewise, signals sent by hardware module 1 to the controller 201 are relayed through the interface card 202.
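The relay-with-conversion behavior in the example above can be illustrated with a toy sketch. The format names, the `F2:` prefix standing in for a real protocol encoding, and the `CONVERTERS` table are all made up for illustration; the document does not specify how the conversion is implemented.

```python
from typing import Callable

# Hypothetical converters keyed by (source format, target format).
CONVERTERS: dict[tuple[str, str], Callable[[bytes], bytes]] = {
    ("format1", "format2"): lambda payload: b"F2:" + payload,
    ("format2", "format1"): lambda payload: payload.removeprefix(b"F2:"),
}


def relay(payload: bytes, src_format: str, dst_format: str) -> bytes:
    """Convert a payload between the formats used on two ports of the card."""
    if src_format == dst_format:
        return payload  # nothing to convert
    return CONVERTERS[(src_format, dst_format)](payload)


# Controller -> hardware module 1, then the reverse direction.
to_module = relay(b"read", "format1", "format2")
to_controller = relay(to_module, "format2", "format1")
```

Keying the table by format pair keeps each direction of the relay explicit, which mirrors the point that traffic in both directions passes through the interface card.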

The controller 201 may also send signals to the interface card 202 itself, for example signals for controlling the interface card 202 or signals informing the interface card 202 of the status of the controller 201, and the interface card 202 can process these signals. Similarly, the interface card 202 may send signals to the controller 201, for example signals informing the controller 201 of the status of the interface card 202.

Optionally, referring to FIG. 3, the interface card 202 and the controller 201 may be connected through a PCIE (Peripheral Component Interconnect Express) bus. FIG. 3 uses M=2 and N=1 as an example.

Specifically, between one interface card 202 and one controller 201 connected to it there may be two PCIE buses, PCIE TX and PCIE RX, referred to as the PCIE transmit bus and the PCIE receive bus, respectively. Transmit and receive here are defined from the point of view of the controller 201: the controller 201 sends signals at high speed through PCIE TX and receives signals through PCIE RX. FIG. 3 shows two controllers 201, where a denotes PCIE TX and b denotes PCIE RX.

In this embodiment of the present invention, PCIE TX and PCIE RX may be designed according to actual service requirements and may use different standard bandwidths such as X4, X8, or X16. They form the main service data channel between the controller 201 and the interface card 202.

Optionally, still referring to FIG. 3, a serial control bus and/or a parallel control bus may also be connected between the interface card 202 and the controller 201 to carry control signals. FIG. 3 uses the case in which both a serial control bus and a parallel control bus are connected as an example; in FIG. 3, c denotes the serial control bus and d denotes the parallel control bus.

The serial control bus carries low-speed signals and is mainly used for handshake operations between the controller 201 and the interface card 202. For example, the controller 201 can use it to read the type, status, and alarm information of the interface card 202, and the interface card 202 can use it to obtain status information about the controller 201, such as master/slave status information indicating whether the controller 201 is currently the master controller 201 or a slave controller 201.

The parallel control bus carries high-speed signals and is mainly used by the controller 201 to control the status of the interface card 202 and to upgrade the firmware, version, and so on of the interface card 202. It includes common basic control signals, such as the presence (in-position) signal, the power-on enable signal, interrupt signals, and the reset signal.

In general, whereas PCIE is an internationally standardized protocol, the serial control bus and the parallel control bus can both be custom-defined according to the required functions and are not limited to any fixed form.

Optionally, still referring to FIG. 3, each of the N interface cards 202 may be connected to a server, so that the server and the system can exchange information through the interface card 202.

Optionally, still referring to FIG. 3, the storage system may further include at least one storage device, where each storage device is connected to at least one interface card 202, so that the controller 201 and the storage device can exchange information through the corresponding interface card 202. The storage device may be, for example, a hard disk, or another type of device for storing information. FIG. 3 uses one interface card 202 connected to two storage devices as an example.

In this embodiment of the present invention, each interface card 202 may have a control module, and the functions of the interface card 202 are implemented by the control module.

The control module in the interface card 202 is a module that can process, control, convert, or switch signals between the controller 201 and the interface card 202, and it may consist of one or more chips or external on-board circuits. According to the status or commands of the controller 201, it can change and switch the state of the ports on the interface card 202 that are connected to the controllers 201. It can also convert signals from the PCIE protocol format to various other protocol formats, for example from PCIE to FC (Fibre Channel), from PCIE to GE (Gigabit Ethernet), or from PCIE to SAS (Serial Attached SCSI). The control module may further implement a mirroring (NT) function so as to back up data with the controller 201 in real time.

In addition, as can be seen in FIG. 3, there are also connection lines between the two controllers 201. For example, between the two controllers 201 there may be a connection line for transmitting mirrored data (that is, backup data) and a connection line for transmitting heartbeat information, and there may also be a serial control bus and/or a parallel control bus.

In this embodiment of the present invention, if M>2, every pair of controllers 201 may have connection lines between them, for example a connection line for transmitting mirrored data and a connection line for transmitting heartbeat information, and may also have a serial control bus and/or a parallel control bus.

The two controllers 201 exchange status information through the mirroring and heartbeat links, and each monitors the other's status and service characteristics in real time.

FIG. 2 and FIG. 3 above both show a commonly used redundant architecture in which dual storage controllers share one interface card 202. During normal operation, the master controller 201 exchanges service data with the interface card 202 over the PCIE bus. The interface card 202 generally contains a switch chip that converts PCIE-format messages into messages of other protocol bus formats, such as FC, GE, or SAS, and the interface card 202 may be connected to devices such as a front-end server port or back-end disks. When the master controller 201 fails, it notifies the slave controller 201 through the heartbeat signal or a power-down interrupt signal between the controllers 201, and the slave controller 201 may then contend to become the master controller 201. Because the interface card 202 is connected to both controllers 201, the interface card 202 switches its port states accordingly and continues to work, now serving the new master controller 201. The services of the interface card 202 therefore continue, and services with the front-end server or the back-end cascaded enclosure of the interface card 202 are maintained.

FIG. 4 is a simplified architecture diagram of another possible implementation of the storage system, using M=2 and N=2 as an example. As can be seen from FIG. 4, each controller 201 may be connected to two interface cards 202.

For example, the left controller 201 initially acts as the master controller 201 and uses the left interface card 202. When the left controller 201 fails, it notifies the right controller 201 and also notifies each interface card 202 connected to it, so that each interface card changes the corresponding port state. The right controller 201 then contends to become the master controller 201. At this point, the right controller 201 may continue to use the left interface card 202, may use the right interface card 202, or may use both interface cards 202 at the same time. The choice may depend on the requirements of the particular controller 201, follow a preset rule, or be made at random, among other options.
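The choice the new master controller makes among its connected interface cards can be sketched as a small policy function. The policy names (`keep_previous`, `all`, `random`) and the function `choose_cards` are hypothetical labels for the three options mentioned above (a preset rule, using both cards, or a random choice); the document does not prescribe any particular selection logic.

```python
import random
from typing import Optional


def choose_cards(connected: list[str], previous: str, policy: str,
                 rng: Optional[random.Random] = None) -> list[str]:
    """Pick the interface card(s) the new master controller will use."""
    if not connected:
        raise ValueError("the new master has no connected interface card")
    if policy == "keep_previous":
        # Preset rule: keep the card the failed master was using, if reachable.
        return [previous] if previous in connected else connected[:1]
    if policy == "all":
        # Use every connected card at the same time.
        return list(connected)
    if policy == "random":
        rng = rng or random.Random()
        return [rng.choice(connected)]
    raise ValueError(f"unknown policy: {policy}")


# FIG. 4: the right controller is connected to both cards.
kept = choose_cards(["left", "right"], previous="left", policy="keep_previous")
both = choose_cards(["left", "right"], previous="left", policy="all")
```

In this sketch `kept` is `['left']` and `both` is `['left', 'right']`.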

FIG. 4 shows a redundant architecture with multiple controllers 201 and multiple interface cards 202, in which the interface cards 202 and the controllers 201 are cross-connected. Cross-connection makes the services of the interface cards 202 more reliable: even when a controller 201 and an interface card 202 fail at the same time, as long as a working controller 201 and a working interface card 202 remain, the front-end and back-end services of the interface card 202 can continue. Service interruption is avoided and system reliability is considerably improved.

In summary, one interface card 202 can be connected to multiple controllers 201 and can therefore serve multiple controllers 201. Taking FIG. 2 as an example, the left controller 201 is initially the master controller 201, and the interface card 202 serves the left controller 201. If the master controller 201 fails, the right slave controller 201 can contend to become the master controller 201, and the interface card 202 can go on to serve the right controller 201. The failure of one controller 201 therefore does not render the interface card 202 connected to it unusable. This preserves the continuity of services on the interface card 202 as far as possible, allows other devices connected to the interface card to remain in use, improves system reliability, saves hardware resources, and improves the utilization of the interface card 202.

Referring to FIG. 5, based on the same inventive concept, an embodiment of the present invention provides a method for transmitting information through an interface card. The method may be applied to the storage systems shown in FIG. 2, FIG. 3, and FIG. 4. The main procedure of the method is described below.

Step 501: When a first controller serving as the master controller among M controllers fails, a second controller of the M controllers contends to become the new master controller. The M controllers 201 include one master controller 201 and M-1 redundant slave controllers 201 and belong to a storage system. The storage system further includes N interface cards 202, where each interface card 202 is connected to at least two controllers 201 and is configured to relay signals transmitted by, or transmitted to, the controllers 201, or to process signals from the controllers 201; N is an integer less than or equal to M.

Take the system architectures in FIG. 2 and FIG. 3 as an example. For instance, the first controller is the controller 201 on the left of the figure, and the second controller is the controller 201 on the right.

Before the storage system starts working, it is first powered on and then initialized. After initialization, the master controller 201 detects the presence signal of the interface card 202, that is, it determines whether the interface card 202 has been inserted into the correct slot. If the interface card 202 is not present, it needs to be inserted. If the presence signal of the interface card is detected to be pulled low (the signal is usually active low), the driver reads the binary value that encodes the type of the interface card 202 and determines whether that type is supported by the system. If the type of the interface card 202 is not supported by the system, the driver raises an alarm, the red indicator on the interface card 202 lights up, and the user can insert an interface card 202 that the system supports instead.
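The detection sequence above (active-low presence check followed by a type check) can be sketched as follows. The type codes in `SUPPORTED_TYPES`, the function name `probe_card`, and the returned status strings are hypothetical; the document only says that the driver evaluates a binary type value.

```python
# Hypothetical binary type codes for the card types the system supports.
SUPPORTED_TYPES = {0b0001: "FC", 0b0010: "GE", 0b0011: "SAS"}


def probe_card(presence_level: int, type_code: int) -> str:
    """Classify a slot after initialization.

    presence_level is the sampled presence signal: 0 means a card is
    inserted (active low), 1 means the slot is empty.
    """
    if presence_level != 0:
        return "absent"        # no card detected: the user must insert one
    if type_code not in SUPPORTED_TYPES:
        return "unsupported"   # driver raises an alarm, red indicator lights up
    return "ready"             # card recognized: power-on enable may follow


status_ok = probe_card(presence_level=0, type_code=0b0010)
status_empty = probe_card(presence_level=1, type_code=0b0010)
status_bad = probe_card(presence_level=0, type_code=0b1111)
```

The three calls yield `"ready"`, `"absent"`, and `"unsupported"`, respectively.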

After the interface card 202 has been recognized (that is, the interface card 202 is inserted into the correct slot, is not faulty, and is of a type supported by the system), the system sends a power-on enable signal, a clock signal, and a reset signal to the interface card 202 so that it starts working. The interface card 202 and the master controller 201 then auto-negotiate the port: through negotiation between the interface card 202 and the master controller 201, parameters of the transmission channel between them, such as the transmission bandwidth and the transmission rate, are determined. Once negotiation is complete, the interface card 202 and the master controller 201 can communicate. The master controller 201 at this point is, for example, the left controller 201 in FIG. 2.

Optionally, in this embodiment of the present invention, before the second controller among the M controllers 201 competes to become the new main controller 201, the method may further include:

The second controller receives a first fault notification message sent by the first controller, where the first fault notification message is used to notify the second controller that the first controller has failed.

When the main controller 201 fails, it notifies the slave controller 201 of its fault information through a power-down interrupt signal or a heartbeat signal, that is, it sends the first fault notification message to the slave controller 201. The slave controller 201 then competes to become the main controller 201; the new main controller 201 is, for example, the controller 201 on the right in FIG. 2. At the same time, the original main controller 201 also notifies the interface card 202 of the fault information, or the interface card 202 can automatically detect the state of each controller 201 periodically, at scheduled times, or at random.

The controller 201 on the right then notifies the interface card 202, over the serial control bus or parallel control bus between it and the interface card 202, that it has become the main controller 201, that is, that a master-slave switchover has occurred between the controllers 201. The working state of the ports connecting the interface card 202 to the controllers 201 must be switched accordingly. The port connecting the interface card 202 to the controller 201 on the left can stop working, that is, it enters the inactive state or the mirror state, and the port connecting the interface card 202 to the controller 201 on the right can start working, that is, it enters the active state. Putting a port on the interface card 202 into the inactive, mirror, or active state may specifically be performed by the control module in the interface card 202.

Putting a port into the mirror state means that the port can be controlled to transmit mirror data, that is, backup data. A port in the mirror state in effect becomes the backup port of another port.
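The port switching performed by the interface card at a master-slave switchover can be sketched as a small pure function. This is an illustration under stated assumptions: the names `PortState` and `switch_ports`, the controller labels `"left"` and `"right"`, and the starting state of the slave-facing port (inactive) are hypothetical, as the text does not specify them.

```python
from enum import Enum


class PortState(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    MIRROR = "mirror"


def switch_ports(ports, failed_id, new_master_id, mirror_failed=False):
    """Return the port states after a master-slave switchover.

    ports maps a controller id to the state of the card port facing it.
    """
    updated = dict(ports)
    # The port facing the failed controller stops carrying service traffic.
    updated[failed_id] = PortState.MIRROR if mirror_failed else PortState.INACTIVE
    # The port facing the new main controller starts working.
    updated[new_master_id] = PortState.ACTIVE
    return updated


# Assumed starting point: left controller is main, right controller is slave.
before = {"left": PortState.ACTIVE, "right": PortState.INACTIVE}
after = switch_ports(before, failed_id="left", new_master_id="right")
```

Passing `mirror_failed=True` models the alternative in which the port facing the former main controller enters the mirror state instead of the inactive state.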

Further, in this embodiment of the present invention, after the second controller among the M controllers 201 competes to become the new main controller 201, the method may further include:

The second controller receives a second fault notification message sent by a third controller among the M controllers 201, where the second fault notification message is used to notify the second controller that the third controller has failed;

The second controller removes the information of the third controller from a redundant controller list, where the redundant controller list records information about each controller 201 that can serve as a redundant controller.

For example, the system includes three controllers 201: controller 1, controller 2, and controller 3. Initially, controller 1 is the main controller 201, and controller 2 and controller 3 are slave controllers 201.

When controller 1 fails, controller 2 competes to become the main controller 201, and controller 3 remains a slave controller 201.

While controller 2 is working, if controller 3 also fails, controller 3 notifies controller 2 through a power-down interrupt signal or a heartbeat signal, that is, it sends the second fault notification message to controller 2, and controller 2 can remove the information of controller 3 from the redundant controller list. At the same time, controller 3 also notifies the interface card 202 of the fault information, or the interface card 202 can automatically detect the state of each controller 201 periodically, at scheduled times, or at random; the interface card 202 then puts the port connected to controller 3 into the inactive state or the mirror state.

In this embodiment of the present invention, when a slave controller 201 fails, the service between the main controller 201 and the interface card 202 is not affected. Suppose the system has other slave controllers 201, for example a fourth controller. Then, when controller 2, acting as the main controller 201, fails, controller 2 does not send fault information to the faulty controller 3; that is, when a master-slave switchover is needed, a faulty slave controller 201 is not selected and a fault-free slave controller 201 is selected instead. Of course, if the system has no other slave controller 201 and controller 2, acting as the main controller 201, also fails, the system may stop running.

When controller 3 recovers from its fault, it can notify controller 2, for example through a heartbeat signal, and controller 2 again includes controller 3 among the candidates for master-slave switchover, that is, it adds the information of controller 3 back to the redundant controller list. Of course, when a slave controller 201 recovers from a fault, it can also notify the interface card 202.
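The maintenance of the redundant controller list in the example above can be sketched as follows. The class `RedundantControllerList` and its method names are hypothetical, and the sketch deliberately does not model how a successor is chosen among the candidates, since the text only says that a fault-free slave controller competes to become the main controller.

```python
class RedundantControllerList:
    """Tracks which slave controllers are currently eligible to take over."""

    def __init__(self, standby_ids):
        self._standby = list(standby_ids)

    def on_fault(self, controller_id):
        # A faulty slave is dropped so it cannot be chosen at switchover.
        if controller_id in self._standby:
            self._standby.remove(controller_id)

    def on_recovery(self, controller_id):
        # A recovered slave becomes eligible again.
        if controller_id not in self._standby:
            self._standby.append(controller_id)

    def candidates(self):
        return list(self._standby)


# Controller 2 is the main controller; controllers 3 and 4 are slaves.
redundant = RedundantControllerList([3, 4])
redundant.on_fault(3)        # second fault notification from controller 3
after_fault = redundant.candidates()
redundant.on_recovery(3)     # controller 3 reports recovery
after_recovery = redundant.candidates()
```

After the fault only controller 4 remains a candidate, and after recovery controller 3 is appended again; the ordering of the list carries no meaning in this sketch.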

In this embodiment of the present invention, the controller 201 and the interface card 202 can exchange information. For example, the controller 201 can send a notification message to the interface card 202 in real time, at scheduled times, or on a state transition to inform the interface card 202 of the current state of the controller 201; alternatively, the interface card 202 can send a probe message to the controller 201 in real time, at scheduled times, or at random to learn the current state of the controller 201. Likewise, the interface card 202 can send a notification message to the controller 201 in real time, at scheduled times, or on a state transition to inform the controller 201 of the current state of the interface card 202; alternatively, the controller 201 can send a probe message to the interface card 202 in real time, at scheduled times, or at random to learn the current state of the interface card 202.

Step 502: The second controller relays information through at least a first interface card among the N interface cards, where the first interface card is connected to both the first controller and the second controller.

When the first controller fails, the second controller competes to become the main controller 201. In FIG. 2 and FIG. 3, the first controller and the second controller are connected to the same interface card 202. Although the first controller has failed, this interface card 202 can still be used, so the second controller can continue to use it. This interface card 202 is referred to here as the first interface card.

Referring to FIG. 6, based on the same inventive concept, an embodiment of the present invention provides a controller fault handling method, which can be applied to the storage systems shown in FIG. 2, FIG. 3, and FIG. 4. The main flow of the method is described below.

Step 601: When a first controller serving as the main controller among M controllers fails, a first interface card among N interface cards that is connected to the first controller and serves the first controller receives a second fault notification message sent by the first controller, where the second fault notification message is used to notify the first interface card that the first controller has failed. The M controllers 201 include one main controller 201 and M-1 redundant slave controllers 201; the M controllers 201 and the N interface cards 202 belong to a storage system; each interface card 202 is connected to at least two of the controllers 201 and is used to relay signals transmitted by or to the controllers 201, or to process signals from the controllers 201; and N is an integer less than or equal to M.

Taking FIG. 2 and FIG. 3 as an example, suppose the controller 201 on the left is the main controller 201. When it fails, it can send the first fault notification message to the slave controller 201 and, at the same time, send the second fault notification message to the interface card 202. The specific process has been described with reference to FIG. 2 to FIG. 5 and is not repeated here.

Optionally, in this embodiment of the present invention, after the first interface card receives the second fault notification message sent by the first controller, the method may further include:

The first interface card receives a master notification message sent by a second controller among the M controllers 201, where the master notification message is used to notify the first interface card that the second controller has become the new main controller 201;

According to the master notification message, the first interface card puts the port connected to the second controller into the active state, so as to communicate with the second controller through that port.

Taking FIG. 2 and FIG. 3 as an example, suppose the controller 201 on the left is the main controller 201. When it fails, it can send the first fault notification message to the slave controller 201, and the slave controller 201 competes to become the new main controller 201. The new main controller 201 sends the master notification message to the first interface card. After receiving the master notification message, the first interface card activates the port connected to the new main controller 201 so as to communicate with it. The specific process has been described with reference to FIG. 2 to FIG. 5 and is not repeated here.

Step 602: According to the second fault notification message, the first interface card puts the port connected to the first controller into the inactive state, so as to stop communication with the first controller.

On receiving the second fault notification message, the interface card 202 can put the port connected to the controller 201 on the left into the inactive state, which stops communication with the controller 201 on the left.

Of course, after the controller 201 on the right becomes the main controller 201, it can also send a master notification message to the interface card 202, and the interface card 202 can put the port connected to the controller 201 on the right into the active state so as to communicate with the controller 201 on the right.

The specific implementation process has been described with reference to FIG. 2 to FIG. 5 and is not repeated here.

Referring to FIG. 7, based on the same inventive concept, an embodiment of the present invention provides another method for transmitting information through an interface card, which can be applied to the storage systems shown in FIG. 2, FIG. 3, and FIG. 4. The main flow of the method is described below.

Step 701: When a first controller serving as the main controller among the M controllers included in the storage system fails, the first controller sends a second fault notification message to a first interface card, where the second fault notification message is used to notify the first interface card that the first controller has failed, and the first interface card is the interface card, among the N interface cards included in the storage system, that is connected to the first controller and serves the first controller.

Taking FIG. 2 and FIG. 3 as an example, suppose the controller 201 on the left is the main controller 201. When it fails, it can send the first fault notification message to the slave controller 201 and, at the same time, send the second fault notification message to the interface card 202. The specific process has been described with reference to FIG. 2 to FIG. 5 and is not repeated here.

Step 702: According to the second fault notification message, the first interface card puts the port connected to the first controller into the inactive state, so as to stop communication with the first controller.

On receiving the second fault notification message, the interface card 202 can put the port connected to the controller 201 on the left into the inactive state, which stops communication with the controller 201 on the left.

Step 703: When a second controller among the M controllers becomes the new main controller, the second controller sends a master notification message to the first interface card, where the master notification message is used to notify the first interface card that the second controller has become the new main controller, and the first interface card is connected to the second controller.

Step 704: According to the master notification message, the first interface card puts the port connected to the second controller into the active state, so as to communicate with the second controller through that port.

Continuing with FIG. 2 and FIG. 3 as an example, suppose the controller 201 on the left is the main controller 201. When it fails, it can send the first fault notification message to the slave controller 201, and the slave controller 201 competes to become the new main controller 201. The new main controller 201 sends the master notification message to the first interface card. After receiving the master notification message, the first interface card activates the port connected to the new main controller 201 so as to communicate with it. The specific process has been described with reference to FIG. 2 to FIG. 5 and is not repeated here.
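Steps 701 to 704 can be read as the first interface card reacting to two kinds of messages. The sketch below is only an illustration of that sequence: the message classes `FaultNotification` and `MasterNotification`, the class `FirstInterfaceCard`, and the boolean port representation are hypothetical and are not defined in the source, and the port facing the slave controller is assumed to start inactive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultNotification:
    controller_id: str   # controller reporting its own fault


@dataclass(frozen=True)
class MasterNotification:
    controller_id: str   # controller that has become the new main controller


class FirstInterfaceCard:
    def __init__(self, controller_ids, master_id):
        # True = port active, False = port inactive (assumed initial state).
        self.port_active = {cid: cid == master_id for cid in controller_ids}

    def receive(self, message):
        if isinstance(message, FaultNotification):
            self.port_active[message.controller_id] = False   # step 702
        elif isinstance(message, MasterNotification):
            self.port_active[message.controller_id] = True    # step 704
        else:
            raise TypeError(f"unexpected message: {message!r}")


card = FirstInterfaceCard(["first", "second"], master_id="first")
card.receive(FaultNotification("first"))      # step 701
state_after_fault = dict(card.port_active)
card.receive(MasterNotification("second"))    # step 703
state_after_switch = dict(card.port_active)
```

Between the two messages neither port is active in this sketch, which corresponds to the interval after the first controller has failed and before the second controller has announced itself as the new main controller.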

Referring to FIG. 8, based on the same inventive concept, an embodiment of the present invention provides a controller. The controller may be the controller 201 in the storage systems shown in FIG. 2 to FIG. 4, that is, the controller 201 described in the flows of FIG. 5 to FIG. 7; in particular, it may be the second controller described in those flows. The controller 201 may include an operation module 801 and a communication module 802.

The operation module 801 is configured to make the controller 201 compete to become the new main controller 201 when a first controller serving as the main controller 201 among M controllers 201 fails. The M controllers 201 include one main controller 201 and M-1 redundant slave controllers 201; the M controllers 201 belong to the storage system; the storage system further includes N interface cards 202, where each interface card 202 is connected to at least two controllers 201 and is used to relay signals transmitted by or to the controllers 201, or to process signals from the controllers 201; and N is an integer less than or equal to M.

The communication module 802 is configured to relay information through at least a first interface card among the N interface cards 202, where the first interface card is connected to both the first controller and the controller.

Optionally, in this embodiment of the present invention, the controller 201 may further include a receiving module, configured to: before the operation module 801 makes the controller 201 compete to become the new main controller 201, receive a first fault notification message sent by the first controller, where the first fault notification message is used to notify the controller 201 that the first controller has failed.

Optionally, in this embodiment of the present invention, the controller 201 may further include the receiving module and a removal module;

The receiving module is configured to: after the operation module 801 makes the controller 201 compete to become the new main controller 201, receive a second fault notification message sent by a third controller among the M controllers 201, where the second fault notification message is used to notify the controller 201 that the third controller has failed;

The removal module is configured to remove the information of the third controller from a redundant controller list, where the redundant controller list records information about each controller 201 that can serve as a redundant controller.

Referring to FIG. 9, based on the same inventive concept, an embodiment of the present invention provides an interface card. The interface card may be the interface card 202 in the storage systems shown in FIG. 2 to FIG. 4, that is, the interface card 202 described in the flows of FIG. 5 to FIG. 7; in particular, it may be the first interface card described in those flows. The interface card 202 may include a receiving module 901 and a control module 902.

The receiving module 901 is configured to: when a first controller serving as the main controller 201 among M controllers 201 fails, receive a second fault notification message sent by the first controller, where the second fault notification message is used to notify the interface card 202 that the first controller has failed, and the interface card 202 is the interface card 202, among N interface cards 202, that is connected to the first controller and serves the first controller;

The control module 902 is configured to put, according to the second fault notification message, the port connected to the first controller into the inactive state, so as to stop communication with the first controller.

Optionally, in this embodiment of the present invention, the receiving module 901 is further configured to: after receiving the second fault notification message sent by the first controller, receive a master notification message sent by a second controller among the M controllers 201, where the master notification message is used to notify the interface card 202 that the second controller has become the new main controller 201; and the control module 902 is further configured to put, according to the master notification message, the port connected to the second controller into the active state, so as to communicate with the second controller through that port.

Based on the same inventive concept, an embodiment of the present invention further provides a storage system. The storage system may be the storage system shown in FIG. 2 to FIG. 4, that is, the storage system described in the flows of FIG. 5 to FIG. 7. The storage system may include a first controller, a first interface card, and a second controller. In this embodiment of the present invention, the storage system may include multiple controllers 201 and multiple interface cards 202; two controllers 201 (the first controller and the second controller) and one interface card 202 (the first interface card) are used here only as an example.

The first controller is configured to: when the first controller, serving as the main controller 201 among the M controllers 201 included in the storage system, fails, send a second fault notification message to the first interface card, where the second fault notification message is used to notify the first interface card that the first controller has failed, and the first interface card is the interface card, among the N interface cards 202 included in the storage system, that is connected to the first controller and serves the first controller;

The first interface card is configured to put, according to the second fault notification message, the port connected to the first controller into the inactive state, so as to stop communication with the first controller;

The second controller is configured to: when the second controller among the M controllers 201 becomes the new main controller 201, send a master notification message to the first interface card, where the master notification message is used to notify the first interface card that the second controller has become the new main controller 201, and the first interface card is connected to the second controller;

The first interface card is further configured to put, according to the master notification message, the port connected to the second controller into the active state, so as to communicate with the second controller through that port.

An embodiment of the present invention provides a storage system, including: M controllers 201, configured to control the system, where the M controllers 201 include one main controller 201 and M-1 redundant slave controllers 201, and M is a positive integer; and N interface cards 202, where each interface card 202 is connected to at least two controllers 201 and is used to relay signals transmitted by or to the controllers 201, or to process signals from the controllers 201, and N is an integer less than or equal to M.
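The structural constraints stated for this storage system (each interface card connected to at least two controllers, and N no greater than M) can be checked with a short routine. This is a sketch for illustration only: the function `validate_topology`, the identifiers such as `"c1"` and `"card1"`, and the wording of the problem strings are hypothetical.

```python
def validate_topology(controllers, card_links):
    """Check the stated constraints for M controllers and N interface cards.

    controllers: iterable of controller ids (M of them).
    card_links: dict mapping each interface card id to the set of
                controller ids it is connected to (N entries).
    Returns a list of human-readable problems; empty means valid.
    """
    controllers = set(controllers)
    problems = []
    # N must be an integer less than or equal to M.
    if len(card_links) > len(controllers):
        problems.append("N exceeds M")
    for card, linked in card_links.items():
        unknown = set(linked) - controllers
        if unknown:
            problems.append(f"{card}: unknown controllers {sorted(unknown)}")
        # Each interface card must be connected to at least two controllers.
        if len(set(linked) & controllers) < 2:
            problems.append(f"{card}: fewer than two controllers")
    return problems
```

For instance, a system with controllers `c1` and `c2` sharing a single card passes the check, while a card wired to only one controller is reported.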

In this embodiment of the present invention, one interface card 202 is connected to at least two controllers 201. If one of the controllers 201 connected to an interface card 202 fails, the interface card 202 can stop serving that controller 201; because the interface card 202 is also connected to other controllers 201, it can continue to serve them. In this way, even if a controller 201 fails, the interface card 202, as long as it is itself fault-free, can continue to serve the other controllers 201 and remain in use. Compared with the prior art, the service on the interface card 202 is not interrupted, and other hardware devices connected to the interface card 202 can continue to transmit information through it, which safeguards the reliability of the system.

In addition, because both the interface card 202 and the devices connected to it can remain in use, hardware resources are saved to some extent and the utilization of the interface card 202 is improved. Furthermore, the technical solution in this embodiment of the present invention can reduce the number of interface cards 202 to some extent, which simplifies the system structure and helps reduce the size of the system.

A person skilled in the art can clearly understand that, for convenience and brevity of description, only the division into the foregoing functional modules is used as an example. In practical applications, the foregoing functions may be allocated to different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. For the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into modules or units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The foregoing embodiments are only intended to describe the technical solutions of this application in detail. The description of the embodiments is only intended to help in understanding the method and core idea of the present invention, and should not be construed as limiting the present invention. Any variation or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention.

Claims (16)

1. A storage system, comprising: M controllers, configured to control the system, wherein the M controllers include one primary controller and M-1 slave controllers serving as redundancy, and M is a positive integer; and N interface cards, wherein each interface card is connected to at least two controllers and is configured to relay signals transmitted by the controllers or signals transmitted to the controllers, or to process signals from the controllers, and N is an integer less than or equal to M.
2. The method according to claim 1, wherein the interface card is connected to the controller through a Peripheral Component Interconnect Express (PCIE) bus.
3. The method according to claim 2, wherein a serial control bus and/or a parallel control bus is further connected between the interface card and the controller and is used to transmit control signals.
4. The method according to any one of claims 1 to 3, wherein the system further comprises at least one storage device, and each storage device is connected to at least one interface card, so that the controller and the storage device exchange information through the corresponding interface card.
5. A method for transmitting information through an interface card, comprising: when a first controller serving as the primary controller among M controllers fails, competing, by a second controller among the M controllers, to become the new primary controller; and relaying information, by the second controller, through at least a first interface card among the N interface cards, wherein the first interface card is connected to both the first controller and the second controller.
6. The method according to claim 5, wherein before the second controller among the M controllers competes to become the new primary controller, the method further comprises: receiving, by the second controller, a first fault notification message sent by the first controller, wherein the first fault notification message is used to notify the second controller that the first controller has failed.
7. The method according to claim 5 or 6, wherein after the second controller among the M controllers competes to become the new primary controller, the method further comprises: receiving, by the second controller, a second fault notification message sent by a third controller among the M controllers, wherein the second fault notification message is used to notify the second controller that the third controller has failed; and removing, by the second controller, information about the third controller from a redundant controller list, wherein the redundant controller list is used to record information about each controller that can serve as redundancy.
8. A controller fault handling method, comprising: when a first controller serving as the primary controller among M controllers fails, receiving, by a first interface card that is among N interface cards, is connected to the first controller, and serves the first controller, a second fault notification message sent by the first controller, wherein the second fault notification message is used to notify the first interface card that the first controller has failed; and controlling, by the first interface card according to the second fault notification message, the port connected to the first controller to enter an inactive state, so as to stop communication with the first controller.
9. The method according to claim 8, wherein after the first interface card receives the second fault notification message sent by the first controller, the method further comprises: receiving, by the first interface card, a master control notification message sent by a second controller among the M controllers, wherein the master control notification message is used to notify the first interface card that the second controller has become the new primary controller through competition; and controlling, by the first interface card according to the master control notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.
10. A method for transmitting information through an interface card, comprising: when a first controller serving as the primary controller among M controllers included in a storage system fails, sending, by the first controller, a second fault notification message to a first interface card, wherein the second fault notification message is used to notify the first interface card that the first controller has failed, and the first interface card is an interface card that is among N interface cards included in the storage system, is connected to the first controller, and serves the first controller; controlling, by the first interface card according to the second fault notification message, the port connected to the first controller to enter an inactive state, so as to stop communication with the first controller; when a second controller among the M controllers becomes the new primary controller through competition, sending, by the second controller, a master control notification message to the first interface card, wherein the master control notification message is used to notify the first interface card that the second controller has become the new primary controller through competition, and the first interface card is connected to the second controller; and controlling, by the first interface card according to the master control notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.
11. A controller, comprising: an operation module, configured to cause the controller to compete to become the new primary controller when a first controller serving as the primary controller among M controllers fails; and a communication module, configured to relay information through at least a first interface card among the N interface cards, wherein the first interface card is connected to both the first controller and the controller.
12. The controller according to claim 11, further comprising a receiving module, configured to: before the operation module causes the controller to compete to become the new primary controller, receive a first fault notification message sent by the first controller, wherein the first fault notification message is used to notify the controller that the first controller has failed.
13. The controller according to claim 11 or 12, further comprising a receiving module and a removing module, wherein the receiving module is configured to: after the operation module causes the controller to compete to become the new primary controller, receive a second fault notification message sent by a third controller among the M controllers, wherein the second fault notification message is used to notify the controller that the third controller has failed; and the removing module is configured to remove information about the third controller from a redundant controller list, wherein the redundant controller list is used to record information about each controller that can serve as redundancy.
14. An interface card, comprising: a receiving module, configured to receive, when a first controller serving as the primary controller among M controllers fails, a second fault notification message sent by the first controller, wherein the second fault notification message is used to notify the interface card that the first controller has failed, and the interface card is an interface card that is among N interface cards, is connected to the first controller, and serves the first controller; and a control module, configured to control, according to the second fault notification message, the port connected to the first controller to enter an inactive state, so as to stop communication with the first controller.
15. The interface card according to claim 14, wherein the receiving module is further configured to: after receiving the second fault notification message sent by the first controller, receive a master control notification message sent by a second controller among the M controllers, wherein the master control notification message is used to notify the interface card that the second controller has become the new primary controller through competition; and the control module is further configured to control, according to the master control notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.
16. A storage system, comprising: a first controller, configured to send a second fault message to a first interface card when the first controller, serving as the primary controller among M controllers included in the storage system, fails, wherein the second fault notification message is used to notify the first interface card that the first controller has failed, and the first interface card is an interface card that is among N interface cards included in the storage system, is connected to the first controller, and serves the first controller; the first interface card, configured to control, according to the second fault notification message, the port connected to the first controller to enter an inactive state, so as to stop communication with the first controller; and a second controller, configured to send a master control notification message to the first interface card when the second controller among the M controllers becomes the new primary controller through competition, wherein the master control notification message is used to notify the first interface card that the second controller has become the new primary controller through competition, and the first interface card is connected to the second controller; wherein the first interface card is further configured to control, according to the master control notification message, the port connected to the second controller to enter an active state, so as to communicate with the second controller through that port.
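
The failover sequence recited in the claims above (fault notification, port deactivation, competition, master control notification, port activation, and maintenance of the redundant controller list) can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `InterfaceCard`, `StorageSystem`, `fail_primary`, `fail_standby` and `build_demo` are hypothetical, and the rule that the lowest-numbered standby controller wins the competition is an assumption, since the claims do not say how the winner is chosen.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class InterfaceCard:
    """Hypothetical interface card: one port per attached controller."""
    name: str
    ports: Dict[int, bool]  # controller id -> is the port active?

    def on_fault_notification(self, controller_id: int) -> None:
        # Deactivate the port to the failed controller (claims 8 and 14).
        self.ports[controller_id] = False

    def on_master_notification(self, controller_id: int) -> None:
        # Activate the port to the new primary controller (claims 9 and 15).
        self.ports[controller_id] = True


@dataclass
class StorageSystem:
    primary: int
    standby: List[int]  # plays the role of the redundant controller list
    cards: List[InterfaceCard]


def fail_primary(system: StorageSystem) -> int:
    """Simulate a primary-controller fault and return the new primary's id."""
    failed = system.primary
    # Step 1: every card attached to the failed primary gets the fault notice.
    for card in system.cards:
        if failed in card.ports:
            card.on_fault_notification(failed)
    if not system.standby:
        raise RuntimeError("no redundant controller left to take over")
    # Step 2: competition. ASSUMPTION: the lowest id wins; the claims do
    # not specify the selection rule.
    winner = min(system.standby)
    system.standby.remove(winner)
    system.primary = winner
    # Step 3: the winner announces itself to every card attached to it.
    for card in system.cards:
        if winner in card.ports:
            card.on_master_notification(winner)
    return winner


def fail_standby(system: StorageSystem, controller_id: int) -> None:
    """Drop a failed standby from the redundant list (claims 7 and 13)."""
    if controller_id in system.standby:
        system.standby.remove(controller_id)


def build_demo() -> StorageSystem:
    # M = 3 controllers, N = 2 cards, each card attached to two controllers.
    card_a = InterfaceCard("A", {0: True, 1: False})
    card_b = InterfaceCard("B", {1: False, 2: False})
    return StorageSystem(primary=0, standby=[1, 2], cards=[card_a, card_b])


if __name__ == "__main__":
    system = build_demo()
    print(fail_primary(system), system.cards[0].ports, system.cards[1].ports)
```

In this sketch, card A is attached to both the failed controller and its successor, so it plays the role of the "first interface card": its port to controller 0 is deactivated and its port to controller 1 is activated, with no change to the card itself.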
PCT/CN2015/076658 2014-10-24 2015-04-15 Method, apparatus and system for information transmission and controller fault handling through interface cards Ceased WO2016062037A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410579922.2A CN104410510B (en) 2014-10-24 2014-10-24 Pass through the method, apparatus and system of interface card transmission information
CN201410579922.2 2014-10-24

Publications (1)

Publication Number Publication Date
WO2016062037A1 (en) 2016-04-28

Family

ID=52648108

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/076658 Ceased WO2016062037A1 (en) 2014-10-24 2015-04-15 Method, apparatus and system for information transmission and controller fault handling through interface cards

Country Status (2)

Country Link
CN (1) CN104410510B (en)
WO (1) WO2016062037A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109542198A (en) * 2018-11-20 2019-03-29 郑州云海信息技术有限公司 A kind of method and apparatus that control PCIe card powers on
CN111737062A (en) * 2020-06-24 2020-10-02 浙江大华技术股份有限公司 Backup processing method, device and system
CN112000286A (en) * 2020-08-13 2020-11-27 北京浪潮数据技术有限公司 Four-control full-flash-memory storage system and fault processing method and device thereof

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104410510B (en) * 2014-10-24 2018-07-03 华为技术有限公司 Pass through the method, apparatus and system of interface card transmission information
CN105335101B (en) * 2015-09-29 2018-11-20 浪潮(北京)电子信息产业有限公司 A kind of data processing method and system
CN106059791B (en) * 2016-05-13 2020-04-14 华为技术有限公司 A link switching method and storage device for services in a storage system
CN106302480B (en) * 2016-08-19 2019-05-10 浪潮(北京)电子信息产业有限公司 A communication method based on NTB hardware and SCSI communication protocol
US11909635B2 (en) * 2021-03-05 2024-02-20 Juniper Networks, Inc. Hardware-assisted fast data path switchover for a network device with redundant forwarding components
CN114880254B (en) * 2022-04-02 2025-09-16 锐捷网络股份有限公司 Table item reading method and device and network equipment
CN115391105B (en) * 2022-08-31 2025-07-18 杭州宏杉科技股份有限公司 Storage control method and device applied to storage equipment
CN115657975B (en) * 2022-12-29 2023-03-31 浪潮电子信息产业股份有限公司 Disk data read-write control method, related components and front-end shared card
CN117439971B (en) * 2023-10-10 2024-12-13 深圳市佳合丰新能源科技有限公司 ADDRESS ALLOCATION METHOD, SYSTEM, COMPUTER DEVICE AND STORAGE MEDIUM

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1753376A (en) * 2005-10-27 2006-03-29 杭州华为三康技术有限公司 Biprimary controlled network equipment and its master back-up switching method
CN1909559A (en) * 2006-08-30 2007-02-07 杭州华为三康技术有限公司 Interface board based on rapid periphery components interconnection and method for switching main-control board
CN102195845A (en) * 2010-03-03 2011-09-21 杭州华三通信技术有限公司 Method, device and equipment for realizing active-standby switching of main control board
CN203482216U (en) * 2013-09-24 2014-03-12 浙江大华系统工程有限公司 Network equipment
CN104410510A (en) * 2014-10-24 2015-03-11 华为技术有限公司 Method, device and system for processing failure in controller where information is transmitted through interface card

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101068140B (en) * 2007-06-27 2010-06-16 中兴通讯股份有限公司 A device and method for realizing primary/standby PCI device switching
CN101252531A (en) * 2008-04-02 2008-08-27 杭州华三通信技术有限公司 Equipment, system and method for realizing load sharing and main standby switching

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109542198A (en) * 2018-11-20 2019-03-29 郑州云海信息技术有限公司 A kind of method and apparatus that control PCIe card powers on
CN109542198B (en) * 2018-11-20 2022-02-18 郑州云海信息技术有限公司 Method and equipment for controlling power-on of PCIE card
CN111737062A (en) * 2020-06-24 2020-10-02 浙江大华技术股份有限公司 Backup processing method, device and system
CN112000286A (en) * 2020-08-13 2020-11-27 北京浪潮数据技术有限公司 Four-control full-flash-memory storage system and fault processing method and device thereof
CN112000286B (en) * 2020-08-13 2023-02-28 北京浪潮数据技术有限公司 Four-control full-flash-memory storage system and fault processing method and device thereof

Also Published As

Publication number Publication date
CN104410510A (en) 2015-03-11
CN104410510B (en) 2018-07-03

Similar Documents

Publication Publication Date Title
CN104410510B (en) Pass through the method, apparatus and system of interface card transmission information
US8127059B1 (en) Apparatus for interconnecting hosts with storage devices
EP2052326B1 (en) Fault-isolating sas expander
US20190235465A1 (en) Backplane-based plc system with hot swap function
US10275373B2 (en) Hot swappable device and method
KR20210094069A (en) Alternative protocol selection
US12519740B2 (en) Method to reset switch when controller fault is detected
CN101557379B (en) Link reconfiguration method for PCIE interface and device thereof
CN110419035B (en) USB host to host automatic switching
CN108737188B (en) A network card failover system
CN100418047C (en) Disk array device and its control method
CN101488105B (en) Method for implementing high availability of memory double-controller and memory double-controller system
EP2137906B1 (en) Communicating configuration information over standard interconnect link
US9116881B2 (en) Routing switch apparatus, network switch system, and routing switching method
US11061462B2 (en) Remote terminal apparatus enabled to reset a plug-and-play compatible device even fixedly connected without removing the device from the apparatus, control method thereof, computer system, and non-transitory recording medium
CN111181766B (en) Redundant FC network system and method for realizing dynamic configuration of switch
CN103970705A (en) Multi-path server architecture design with redundant and symmetrical hot-plugging IO boxes
JP6134720B2 (en) Connection method
JP5176914B2 (en) Transmission device and system switching method for redundant configuration unit
JP7746581B2 (en) Storage system, data processing method, and device
CN118606117A (en) A four-controller interconnected mirroring system, data transmission method, device and medium
CN120803374B (en) Storage systems and storage system clusters
CN111475440A (en) Communication control method and device based on asynchronous transmission protocol and electronic equipment
GB2489838A (en) Processor trace circuit, which shares a bus with the processor being monitored
JP2001086146A (en) Control method of FC_AL system

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15851877; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 15851877; Country of ref document: EP; Kind code of ref document: A1)