CN108023814A - SDN control plane failure emergency systems and method - Google Patents
SDN control plane failure emergency system and method
- Publication number
- CN108023814A (application CN201711242547.2A)
- Authority
- CN
- China
- Prior art keywords
- controller
- plane
- control
- switch
- control plane
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/38—Flow based routing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses an SDN control plane failure emergency system and method. The system includes: a single-control-plane multi-controller component, which switches the emergency system into single-control-plane multi-controller mode when a controller suffers a single point of failure; a dual-plane hybrid in-band transmitter, which switches the emergency system into dual-plane hybrid in-band transmission mode when the direct connections between some switches and the controller fail; and a single-data-plane self-learner, which switches the emergency system into single-data-plane self-learning mode when the control plane fails, extends a distributed routing protocol into the switch in a form compatible with the OpenFlow protocol, and extends the parsing of protocol packets into actions supported by the OpenFlow protocol, so that, together with flow entries, packets are processed and forwarded entirely in the OpenFlow processing channel. The system improves the controllability of the overall network and the stability and reliability of the SDN network.
Description
Technical Field
The present invention relates to the technical field of network architecture, and in particular to an SDN (Software Defined Networking) control plane failure emergency system and method.
Background
As a new type of network architecture, SDN (Software Defined Networking) effectively overcomes the limited extensibility of traditional network functions and greatly improves the programmability and manageability of the network.
However, the centralized control architecture of SDN gives the controller a very special position in the network: once a data plane switch loses its connection to the controller, data transmission in the underlying network is affected. Moreover, most SDN networks are deployed in single-controller out-of-band mode, as shown in Fig. 1. The entire network is managed by only one controller (all switches are managed by the same controller), and every switch has a direct link to the controller. In this deployment mode, loss of connection between a switch and the controller falls into two main cases: a single point of failure of the controller, or a failure of the connection between the switch and the controller (including long-term link congestion, physical link failure, and link port failure).
When a controller single point of failure occurs, all switches lose their connection to the controller. The switches cannot generate the flow entries required for forwarding on their own, so until the controller recovers, all data transmission in the network is blocked and the entire network is paralyzed.
When a connection failure occurs between switches and the controller, only some switches may lose their connection, in which case data transmission under the disconnected switches is blocked. In the most extreme case all switches lose their connection to the controller, which is equivalent to a controller single point of failure.
Whatever the failure, the network fluctuates severely while switches are disconnected: hosts may be unable to communicate, and the entire network may even be completely paralyzed. This is unacceptable for commercial networks and urgently needs to be solved.
Summary of the Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present invention is to propose an SDN control plane failure emergency system that reduces the impact of SDN control plane failures on the underlying network and improves the stability and reliability of the SDN network.
Another object of the present invention is to propose an SDN control plane failure emergency method.
To achieve the above objects, an embodiment of one aspect of the present invention proposes an SDN control plane failure emergency system, including: a single-control-plane multi-controller component, configured to switch the emergency system into single-control-plane multi-controller mode when a controller suffers a single point of failure, and to deploy multiple controllers in the control plane as redundant backups so as to resolve the controller single point of failure in the control plane; a dual-plane hybrid in-band transmitter, configured to switch the emergency system into dual-plane hybrid in-band transmission mode when the direct connections between some switches and the controller fail, and to carry control flows through other switches in the data plane, where the transmission of the control flows is itself controlled by the controller, which issues flow tables to manage the paths of the control flows so that all traffic in the network is directly managed by the controller; and a single-data-plane self-learner, configured to switch the emergency system into single-data-plane self-learning mode when the control plane fails, to extend a distributed routing protocol into the switch in a form compatible with the OpenFlow protocol, and to extend the parsing of protocol packets into actions supported by the OpenFlow protocol, so that, together with the flow entries, packets are processed and forwarded entirely in the OpenFlow processing channel.
The SDN control plane failure emergency system of the embodiments of the present invention resolves controller single points of failure through the single-control-plane multi-controller component; resolves direct-connection failures between some switches and the controller through the dual-plane hybrid in-band transmitter, which improves the controllability of the overall network; and resolves complete failure of the control plane through the single-data-plane self-learner. It thereby keeps the underlying data plane communicating normally, reduces network recovery delay, improves network reliability, and reduces the burden on network administrators.
In addition, the SDN control plane failure emergency system according to the above embodiments of the present invention may also have the following additional technical features:
Further, in one embodiment of the present invention, a direct-connection failure between some switches and the controller is detected by the switch through periodically sent ECHO packets: if no ECHO reply packet is received within a specified time, the switch is judged to be disconnected from the controller, and the switch attempts to re-establish the TCP (Transmission Control Protocol) connection with the controller.
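The ECHO-based detection described above can be sketched as a small timeout check. This is only an illustrative model, not the patent's implementation: the class name `EchoMonitor`, the method names, and the 15-second reply window are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class EchoMonitor:
    """Illustrative model of switch-side controller liveness tracking."""
    reply_timeout: float = 15.0  # assumed window (seconds) with no ECHO reply
    last_reply_at: float = 0.0   # time the last ECHO reply arrived
    connected: bool = True

    def on_echo_reply(self, now: float) -> None:
        """Record an ECHO reply from the controller."""
        self.last_reply_at = now
        self.connected = True

    def check(self, now: float) -> str:
        """Return the next action for the switch at time `now`."""
        if now - self.last_reply_at > self.reply_timeout:
            # No reply within the specified time: treat the controller
            # as disconnected and try to re-establish the TCP connection.
            self.connected = False
            return "reconnect_tcp"
        return "send_echo"
```

A caller would invoke `check` on a timer and `on_echo_reply` whenever a reply arrives; a real switch would also drive the actual socket, which is omitted here.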
Further, in one embodiment of the present invention, the single-data-plane self-learner further includes: a processing module, configured to process ARP packets by adding flow entries and to process OSPF packets by adding actions, where the added actions are carried in flow entries containing special actions; and a self-learning module, configured to perform entry self-learning on received packets.
Further, in one embodiment of the present invention, the self-learning module further includes: a layer-2 entry self-learning unit, configured to learn the mapping between MAC (Media Access Control) addresses and port numbers, which determines the final output port of a packet; an ARP entry self-learning unit, configured to learn the mapping between the IP (Internet Protocol) address and the MAC address carried in ARP packets, which later guides the forwarding of IP packets sent by hosts in other network segments to hosts in the local network segment; and a layer-3 entry self-learning unit, configured to compute the next-hop switch toward each destination network segment in the network, which later guides the forwarding of IP packets between hosts in different network segments.
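To make the ARP and layer-3 tables more concrete, here is a minimal sketch of what the two learned tables might hold. The class `L3Tables`, its method names, and the use of longest-prefix matching to pick the next hop are assumptions for illustration; the source only states that a next-hop switch is computed for each destination network segment.

```python
import ipaddress


class L3Tables:
    """Illustrative ARP table and next-hop table for a self-learning switch."""

    def __init__(self):
        self.arp = {}     # IP address -> MAC address, learned from ARP packets
        self.routes = {}  # destination network -> next-hop switch name

    def learn_arp(self, sender_ip: str, sender_mac: str) -> None:
        """Record the IP-to-MAC mapping carried in an ARP packet."""
        self.arp[ipaddress.ip_address(sender_ip)] = sender_mac

    def mac_for(self, ip: str):
        """Return the learned MAC for `ip`, or None if unknown."""
        return self.arp.get(ipaddress.ip_address(ip))

    def add_route(self, prefix: str, next_hop: str) -> None:
        """Record the computed next-hop switch for a destination segment."""
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def next_hop(self, dst_ip: str):
        """Pick the next hop by longest-prefix match (an assumed policy)."""
        dst = ipaddress.ip_address(dst_ip)
        matches = [net for net in self.routes if dst in net]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda net: net.prefixlen)]
```

In this model, inter-segment packets consult `next_hop`, while packets arriving for a host in the local segment consult `mac_for` and then the layer-2 table for the output port.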
Further, in one embodiment of the present invention, the self-learning module also includes:
a basic disaster-recovery entry unit, configured to steer packets so that they jump between the different network layers.
To achieve the above objects, an embodiment of another aspect of the present invention proposes an SDN control plane failure emergency method, including the following steps: when a controller suffers a single point of failure, switching the emergency system into single-control-plane multi-controller mode, and deploying multiple controllers in the control plane as redundant backups so as to resolve the controller single point of failure in the control plane; when the direct connections between some switches and the controller fail, switching the emergency system into dual-plane hybrid in-band transmission mode, and carrying control flows through other switches in the data plane, where the transmission of the control flows is itself controlled by the controller, which issues flow tables to manage the paths of the control flows so that all traffic in the network is directly managed by the controller; and when the control plane fails, switching the emergency system into single-data-plane self-learning mode, extending a distributed routing protocol into the switch in a form compatible with the OpenFlow protocol, and extending the parsing of protocol packets into actions supported by the OpenFlow protocol, so that, together with the flow entries, packets are processed and forwarded entirely in the OpenFlow processing channel.
The SDN control plane failure emergency method of the embodiments of the present invention resolves controller single points of failure through the single-control-plane multi-controller mode; resolves direct-connection failures between some switches and the controller through dual-plane hybrid in-band transmission, which improves the controllability of the overall network; and resolves complete failure of the control plane through single-data-plane self-learning. It thereby keeps the underlying data plane communicating normally, reduces network recovery delay, improves network reliability, and reduces the burden on network administrators.
In addition, the SDN control plane failure emergency method according to the above embodiments of the present invention may also have the following additional technical features:
Further, in one embodiment of the present invention, when the direct connections between some switches and the controller fail, the method further includes: the switch probes the connection through periodically sent ECHO packets; if no ECHO reply packet is received within a specified time, the switch is judged to be disconnected from the controller, and the switch attempts to re-establish the TCP connection with the controller.
Further, in one embodiment of the present invention, after the emergency system enters single-data-plane self-learning mode, the method further includes: processing ARP packets by adding flow entries, and processing OSPF packets by adding actions, where the added actions are carried in flow entries containing special actions; and performing entry self-learning on received packets.
Further, in one embodiment of the present invention, performing entry self-learning on received packets further includes: learning the mapping between MAC addresses and port numbers, which determines the final output port of a packet; learning the mapping between the IP address and the MAC address carried in ARP packets, which later guides the forwarding of IP packets sent by hosts in other network segments to hosts in the local network segment; and computing the next-hop switch toward each destination network segment in the network, which later guides the forwarding of IP packets between hosts in different network segments.
Further, in one embodiment of the present invention, performing entry self-learning on received packets also includes: steering packets so that they jump between the different network layers.
Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a deployment diagram of an SDN control plane failure emergency system in the related art;
Fig. 2 is a schematic structural diagram of an SDN control plane failure emergency system according to an embodiment of the present invention;
Fig. 3 is a flowchart of the layer-2 entry self-learning unit according to an embodiment of the present invention;
Fig. 4 is a flowchart of the ARP entry self-learning unit according to an embodiment of the present invention;
Fig. 5 is a flowchart of the layer-3 entry self-learning unit according to an embodiment of the present invention;
Fig. 6 is a flowchart of distinguishing different packets according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of a network deployment according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of mode switching according to an embodiment of the present invention;
Fig. 9 is a flowchart of an SDN control plane failure emergency method according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.
Before introducing the SDN control plane failure emergency system and method of the embodiments of the present invention, the approaches used in the related art when an SDN controller fails are introduced first.
For SDN controller failure, the related art offers two lines of solution: one starts from the control plane, and the other starts from the data plane.
(1) Multiple controllers (standby controllers)
For the control plane, the related art proposes a solution consisting of standby controllers plus a data sharing center. The SDN network system contains a data sharing center and multiple controllers. Under normal working conditions, the primary controller handles the OpenFlow messages of network devices and manages network data flows, while the standby controllers monitor the state of the primary controller and synchronize information with it. Controller state synchronization mainly covers network view synchronization, synchronization of controller network service application state, and controller data synchronization. For network view redundancy and synchronization, a Hot active/standby strategy is used: the primary controller pushes real-time changes of the network view within its domain to all standby controllers. For redundancy and synchronization of controller network service applications, a Warm active/standby strategy is used: the primary controller periodically or conditionally pushes snapshots of the service applications to all standby controllers. For controller data synchronization, a reliable data sharing center is used between the controllers.
The system handles a failure in two phases: a failure detection phase and a recovery phase. Failure detection phase: the primary controller failure detection mechanism is based on a heartbeat component, whose main parameters are the interval between heartbeat messages and the time interval within which a standby controller repairs a primary controller failure.
Recovery phase: when a primary controller failure is detected, the system enters the recovery phase, which mainly consists of the following steps:
First, a new primary controller is designated; the new primary controller is the standby controller with the highest ID (or IP);
Second, the new primary controller notifies the other controllers of its change of state;
Third, the controller network services and applications are recovered;
Finally, the control network interface is started.
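The first recovery step, choosing the standby controller with the highest ID, can be sketched as follows. The function name `elect_primary` and the use of plain integers as controller IDs are assumptions for illustration; the remaining steps (notification, service recovery, interface start-up) are only listed as labels, not implemented.

```python
def elect_primary(controller_ids, failed_primary):
    """Pick the new primary: the surviving controller with the highest ID.

    `controller_ids` is any iterable of comparable IDs (integers here;
    IP addresses would need a suitable comparable representation).
    """
    candidates = [cid for cid in controller_ids if cid != failed_primary]
    if not candidates:
        raise ValueError("no standby controller available")
    return max(candidates)


# Illustrative labels for the steps that follow the election.
RECOVERY_STEPS = (
    "designate new primary",
    "notify other controllers of state change",
    "recover network services and applications",
    "start control network interface",
)
```

For example, with controllers 1, 2, 3 and 4 where controller 4 was the failed primary, `elect_primary([1, 2, 3, 4], 4)` selects controller 3.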
The overall system is based on the HAC (High Availability Controller) architecture. The primary controller and the standby controllers periodically exchange heartbeat messages to detect controller failure, and at the same time they synchronize information using different strategies for different types of information. When a standby controller finds that the primary controller has failed, it enters the recovery phase, becomes the new primary controller, and downloads part of the data from the data sharing center to complete the synchronization of all data. The new primary controller then takes over the entire network, which resolves the severe problems caused by the failure of a single controller. This approach achieves low synchronization overhead and a failure time of 40 ms to 50 ms, which is acceptable to network service applications.
(2) Hybrid mode in the data plane
In an SDN network, the flow entries of a switch guide the forwarding of data packets, and these entries are normally issued by the controller. If all controllers in the network fail, or all direct connections between the controllers and the switches fail, the switches cannot generate the entries required for forwarding on their own; once the entries issued by the controller expire, the underlying network is completely paralyzed. For this extreme case, in which the underlying forwarding devices are entirely out of the controller's control, a solution based on traditional routing has been proposed: processing logic similar to that of a traditional network is added to the switch to keep the network communicating normally.
An example is the dual-channel processing supported by OVS (Open vSwitch, a software switch), which provides an OpenFlow processing channel and a traditional processing channel with partial functionality. The dual-channel mechanism lets packets entering OVS go through the OpenFlow processing channel first; the special Normal port then hands packets over to the traditional channel, where layer-2 forwarding, layer-3 routing, and similar functions are carried out according to traditional protocols. This preserves packet forwarding and routing when no controller is available.
An OVS that is entirely out of the controller's control has two working modes: fail-standalone mode and fail-secure mode. Both can keep forwarding packets after the controller fails, and both rely on flow entries whose output is the Normal port to guarantee layer-2 forwarding without a controller. The Normal channel performs layer-2 forwarding based on the mapping among MAC, port, and VLAN (Virtual Local Area Network): when OVS receives a packet it performs simple learning, storing the mapping among the packet's source MAC address, ingress port, and VLAN. For packets received afterwards, if the destination MAC address is found in the Normal channel's table, the packet is forwarded directly out of the corresponding port; if the lookup fails, the packet is flooded.
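The learn-then-lookup behaviour of the Normal channel described above is ordinary MAC learning, and a minimal sketch may help. This is a simplified model, not Open vSwitch code: the class name `NormalPipeline`, the `FLOOD` sentinel, and the table keyed by `(vlan, mac)` are assumptions for the example.

```python
FLOOD = "FLOOD"  # sentinel meaning "send out of all other ports"


class NormalPipeline:
    """Simplified model of layer-2 learning keyed by VLAN and MAC."""

    def __init__(self):
        self.table = {}  # (vlan, mac) -> port on which that MAC was seen

    def handle(self, src_mac, dst_mac, in_port, vlan=0):
        """Learn the source mapping, then return the output port or FLOOD."""
        # Learning step: remember where the source MAC lives in this VLAN.
        self.table[(vlan, src_mac)] = in_port
        # Lookup step: forward to the known port, otherwise flood.
        return self.table.get((vlan, dst_mac), FLOOD)
```

The first packet toward an unknown destination is flooded; once the destination replies, its port is learned and later packets are forwarded directly.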
In fail-standalone mode, once the switch has actively probed the controller connection three times and found it down, ovs-vswitchd automatically takes over the forwarding logic of the device (it keeps trying to connect to the controller in the background and leaves this mode as soon as the connection is restored). In this mode, after the loss of connection is confirmed, a hidden entry with the highest priority, a wildcard match, and the Normal port as output is installed automatically. OVS then behaves as an ordinary MAC-learning switch, handing received packets to the Normal channel to guarantee layer-2 forwarding.
In fail-secure mode, the switch drops any packets or messages destined for the controller (it keeps trying to connect to the controller in the background and leaves this mode as soon as the connection is restored). In this mode no entry with the Normal port is configured automatically; OVS forwards only according to existing entries, and the entries already in the flow table are deleted as their hard_timeout and idle_timeout expire. To guarantee layer-2 communication between hosts in this mode, an entry must therefore be installed in advance with hard_timeout and idle_timeout both set to 0 and the Normal port as output. Setting both timeouts to 0 ensures that the entry never expires, so matching packets are handed to the Normal channel and simple layer-2 forwarding is preserved. The drawback of fail-secure mode compared with fail-standalone mode is that entries with the Normal port must be installed in advance.
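The role of the two timeouts can be illustrated with a small model of flow entry expiry, in which a timeout of 0 means "never expire", as in OpenFlow. The class `FlowEntry` and its field names other than `idle_timeout` and `hard_timeout` are assumptions for the sketch; it does not reproduce any switch's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    """Illustrative flow entry with OpenFlow-style timeouts (0 = no timeout)."""
    actions: str
    installed_at: float
    last_matched_at: float
    idle_timeout: int = 0  # seconds without a match before removal
    hard_timeout: int = 0  # seconds after installation before removal

    def expired(self, now: float) -> bool:
        """Return True if either non-zero timeout has elapsed."""
        if self.hard_timeout and now - self.installed_at >= self.hard_timeout:
            return True
        if self.idle_timeout and now - self.last_matched_at >= self.idle_timeout:
            return True
        return False


# A pre-installed fallback entry: both timeouts are 0, so it never expires.
fallback = FlowEntry(actions="NORMAL", installed_at=0.0, last_matched_at=0.0)
# An ordinary controller-issued entry that ages out.
ordinary = FlowEntry(actions="output:2", installed_at=0.0,
                     last_matched_at=0.0, idle_timeout=10, hard_timeout=30)
```

After the controller is lost, entries such as `ordinary` age out while `fallback` remains, which is why the fallback has to be installed before the failure.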
However, the related art has several limitations. First, its applicable scenarios are limited: it only addresses a controller single point of failure or the case in which all switches are disconnected from the controller. It does not consider direct-connection failures that affect only some switches, it cannot transition smoothly between solutions, and it lacks strategies for different degrees of failure. Second, it lacks an overall system: each technique starts from a single plane only. The multi-controller mode solves the problem from the control plane and the hybrid mode solves it from the data plane, and the two are independent of each other. Failures in a real network environment are often complex and variable and may combine several situations; no mature system exists yet, and a single solution cannot fully adapt to a complex, changing real network. Third, compatibility is poor: in the hybrid-mode solution, letting packets pass from the OpenFlow channel into the Normal channel requires not only provisions in the OpenFlow protocol but also support from OpenFlow-hybrid switches. Few hybrid switches currently support both the OpenFlow channel and a traditional processing channel with a Normal port, and such devices are complex and expensive.
In addition, adding Normal channel processing to an SDN network runs counter to the SDN idea of separating control from forwarding, and it offers neither extensibility nor flexibility. Finally, different network segments cannot communicate: in the hybrid-mode solution, even for an OVS that supports the Normal function, the Normal channel itself supports no routing protocol, so it cannot learn the network topology on its own once the controller is disconnected. As a result, only communication within the same network segment, that is, simple layer-2 forwarding, is supported after the controller is disconnected, which limits the growth of the network and narrows the application scenarios.
The present invention addresses the problems above by proposing an SDN control plane failure emergency system and method.
The SDN control plane failure emergency system and method according to embodiments of the present invention are described below with reference to the drawings, beginning with the system.
Fig. 2 is a schematic structural diagram of an SDN control plane failure emergency system according to an embodiment of the present invention.
As shown in Fig. 2, the SDN control plane failure emergency system 10 includes a single-control-plane multi-controller 100, a dual-plane hybrid in-band transmitter 200, and a single-data-plane autonomous learner 300.
The single-control-plane multi-controller 100 is configured to put the emergency system into single-control-plane multi-controller mode when a controller suffers a single-point failure, and to deploy multiple controllers on the control plane as redundant backups so that a single controller failure is tolerated. The dual-plane hybrid in-band transmitter 200 is configured to put the emergency system into dual-plane hybrid in-band transmission mode when some switches lose their direct connection to the controller. In this mode, control traffic is carried through other switches on the data plane, and its transmission remains under the controller's control: the controller installs flow entries that determine the path of the control traffic, so that all traffic in the network is directly managed by the controller. The single-data-plane autonomous learner 300 is configured to put the emergency system into single-data-plane autonomous learning mode when the control plane fails. In this mode, a distributed routing protocol is added to the switch as an OpenFlow-compatible extension, and parsing of the protocol's packets is implemented as actions supported by OpenFlow, so that, together with flow entries, packets are processed and forwarded entirely through the OpenFlow pipeline. The system 10 of the embodiment can reduce the impact of SDN control plane failures on the underlying network and improves the stability and reliability of the SDN network.
It can be understood that the system of the embodiment may include the single-control-plane multi-controller 100, the dual-plane hybrid in-band transmitter 200, and the single-data-plane autonomous learner 300, so that it can switch smoothly between levels according to the failure condition.
The single-control-plane multi-controller 100 deploys multiple controllers on the control plane as redundant backups to tolerate a single controller failure. For example, one controller is selected as the master to manage the network, and information is synchronized among the controllers. When the master fails, a new master is elected from the standby controllers and all services are switched to it. The switchover is very short and is not perceptible on the data plane. The single-control-plane multi-controller 100 may use a technical solution from the related art; to avoid repetition, it is not described in detail here.
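The master/standby switchover described above can be illustrated with a minimal sketch. The class name, the method names, and the election policy (first live controller in configured order) are assumptions made for illustration; the source does not specify how the new master is chosen or how state is synchronized.

```python
class ControllerCluster:
    """Toy model of a redundant controller group (all names are hypothetical)."""

    def __init__(self, controllers):
        self.controllers = list(controllers)  # configured order
        self.alive = set(self.controllers)
        self.master = self._elect()

    def _elect(self):
        # Assumed policy: the first live controller in configured order wins.
        for name in self.controllers:
            if name in self.alive:
                return name
        return None  # no controller left: the control plane has failed

    def fail(self, name):
        """Mark a controller as failed and re-elect if it was the master."""
        self.alive.discard(name)
        if name == self.master:
            self.master = self._elect()
        return self.master
```

When `fail` returns `None`, no controller remains, which corresponds to the case handled by the single-data-plane autonomous learner 300.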
The dual-plane hybrid in-band transmitter 200 covers control plane failures other than a single-point controller failure, such as a physical failure of some control links or a failure of link ports, which cause some switches to lose contact with the controller. While the connection is healthy, the controller installs flow entries with a specific priority and specific match fields, chosen according to the characteristics of control traffic, that direct control traffic out of the designated direct-connection port toward the controller. Each switch actively monitors its connection to the controller. If the connection is interrupted, the switch deletes its current flow entries for forwarding control packets and forwards control packets according to disaster-recovery entries installed in advance. Initially, the switch floods control packets out of all ports to neighboring switches, which forward them to the controller. If several paths reach the controller, the path that replies first is used as the fixed path from then on (the fixed path is not necessarily the shortest). Once the switch has re-established its connection to the controller through the data plane, the controller can compute the optimal path between the switch and itself and install flow entries so that subsequent control packets follow that path. (What counts as optimal is decided by the developer, who implements the algorithm as an application on the controller; it is not specifically limited here.)
The single-data-plane autonomous learner 300 covers the extreme case in which all controllers fail or all control links fail, so that every switch loses contact with the controller. After losing the connection, a switch generates flow entries locally to take over the controller's role. The switch then behaves much like a switch in a related-art network, except that it does not use the related-art data path to process packets; it processes them by OpenFlow flow-table matching. In the embodiment, the switch actively monitors its connection to the controller; if the connection is interrupted, it installs disaster-recovery entries and starts running a distributed routing protocol such as RIP or OSPF. Unlike a hybrid switch, which hands packets to the related-art pipeline, here all packets, including RIP or OSPF protocol packets, are handled by the OpenFlow pipeline. Parsing of protocol packets is implemented as actions supported by OpenFlow: by matching protocol packets and executing the corresponding parsing actions, the switch generates flow entries on its own that direct forwarding and routing between different network segments.
Further, in one embodiment of the present invention, when some switches lose their direct connection to the controller, each switch probes the connection with periodically sent ECHO packets. If no ECHO reply arrives within a specified time, the switch concludes that it is disconnected from the controller and tries to re-establish the TCP connection.
It can be understood that when the direct connection between some switches and the controller fails in an SDN network, control packets from those switches can no longer reach the controller through the original direct-connection port, and those switches are disconnected from the controller. So that a switch can detect its connection state in real time and recover with the corresponding strategy, the switch in the embodiment probes with periodically sent ECHO packets. If no ECHO reply arrives within a specified time (which varies dynamically with the actual link delay), the switch concludes that it is disconnected and tries to re-establish the TCP connection with the controller.
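A minimal sketch of the ECHO-based liveness check follows. The source says only that the timeout varies with the actual link delay; the smoothed round-trip estimate and the multiplier used here are assumptions, and `EchoMonitor` and its methods are hypothetical names rather than part of any switch implementation.

```python
class EchoMonitor:
    """Tracks one outstanding ECHO request and an adaptive reply timeout."""

    def __init__(self, initial_rtt=0.05, alpha=0.125, factor=4.0):
        self.srtt = initial_rtt    # smoothed round-trip time, in seconds
        self.alpha = alpha         # weight given to each new RTT sample (assumed)
        self.factor = factor       # timeout = factor * srtt (assumed policy)
        self.pending_since = None  # send time of the unanswered ECHO, if any

    @property
    def timeout(self):
        return self.factor * self.srtt

    def on_echo_sent(self, now):
        # Keep the oldest unanswered request as the reference point.
        if self.pending_since is None:
            self.pending_since = now

    def on_echo_reply(self, now):
        if self.pending_since is not None:
            sample = now - self.pending_since
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
            self.pending_since = None

    def is_disconnected(self, now):
        """True once an ECHO has gone unanswered for longer than the timeout."""
        return (self.pending_since is not None
                and now - self.pending_since > self.timeout)
```

A switch-side loop would call `on_echo_sent` on each probe and, once `is_disconnected` returns true, start the reconnection procedure described in the text.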
Because the direct link has failed, the controller is unreachable over it. Once the switch detects the disconnection, control traffic matches the disaster-recovery uplink entry (Table 1). Initially the switch floods to establish a connection with the controller, so its control packets reach all neighboring switches. Each neighbor holds an uplink entry toward the controller (Table 2) whose match fields are the controller's IP address and the TCP destination port number and whose action is to forward out of a given port. (The neighbor is assumed to be able to reach the controller, either over its own direct link or through its own neighbors.) Forwarded by the neighbors, the control packets of the switch with the failed link reach the controller. Table 1 shows the uplink entry installed by the switch, and Table 2 shows the uplink entry installed by the controller.
Table 1
Table 2
After the controller receives the switch's control packet, it sends the reply to the requesting switch along the reverse of the uplink path. Although the switch's first control packet is broadcast to the surrounding switches, once a path to the controller is established, that path is used for the time being. While the TCP connection is being re-established, a switch that receives control packets from neighboring switches also creates OpenFlow entries through port learning to record the mapping between MAC addresses and ports (Table 3). This records the uplink path of the control packets, so that downlink traffic from the controller follows the correct unicast path back to the requesting switch (Tables 4 and 5). In addition, after the requesting switch receives the controller's reply, it creates an OpenFlow entry toward the controller whose match fields and action have the same form as the neighbor's uplink entry, except that the output port depends on the actual port. All later uplink traffic to the controller matches this entry, so the switch no longer broadcasts to reach the controller. In this way a switch whose direct connection has failed can reconnect to the controller and remain under its management. Table 3 shows the port-learning entry recorded by the switch, Table 4 the downlink entry installed by the switch, and Table 5 the downlink entry installed by the controller.
Table 3
Table 4
Table 5
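The flood-then-learn reconnection described above can be sketched as a small simulation. This is an illustration under stated assumptions, not the patented procedure: hop count stands in for "first reply wins", the flood is allowed to continue past the first ring of neighbors, and the function name and data layout are hypothetical.

```python
from collections import deque

def flood_and_learn(links, src, reachable):
    """Flood a control packet from `src` and learn the reverse path.

    links:     {switch: [neighbouring switches]}
    src:       switch whose direct controller link has failed
    reachable: switches that still hold an uplink entry to the controller
    Returns (uplink_path, learned), where learned[sw] is the neighbour from
    which `sw` first saw the packet (the port-learning record).
    """
    learned = {src: None}
    queue = deque([src])
    while queue:
        sw = queue.popleft()
        if sw in reachable:
            # First switch with an uplink entry: walk the learned records back.
            path = []
            while sw is not None:
                path.append(sw)
                sw = learned[sw]
            return path[::-1], learned
        for nbr in links[sw]:
            if nbr not in learned:  # ignore later copies of the flood
                learned[nbr] = sw
                queue.append(nbr)
    return None, learned            # controller unreachable from src
```

The `learned` mapping plays the role of the port-learning entries: following it backwards gives the unicast path that downlink replies take to the requesting switch.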
After the switch has connected to the controller using the uplink and downlink disaster-recovery entries it installed itself, the controller computes, from the real-time connectivity of the network, the optimal path for control packets between itself and the requesting switch, and installs the corresponding entries on that switch and on the switches along the path. The controller thus keeps managing the routing of control packets in this mode and reduces their latency. In this way, a switch whose direct connection has failed can reconnect to the controller and accept its management.
The controller keeps a control-traffic management application running at all times and installs control-traffic entries on every switch that is connected to it. Using the switch port information it has obtained and the current network topology, the application determines in real time the optimal path from each switch to the controller. While the direct path is healthy, it installs entries that forward over the direct path; when the direct path fails, it selects a path for the switch according to a policy. (The policy is chosen by the developer of the controller application and is not specifically limited here.)
The control-traffic management application may be triggered by a specific Packet_in message. Each switch periodically sends the controller a specific Packet_in message containing the switch's MAC address, its DPID, and an identifier. After recognizing the message by its identifier, the controller triggers the application, which reads the current network information and switch port information, selects the optimal path for the requesting switch's control traffic, and installs entries on every switch along the path. The uplink entries have the structure of Table 2, and the downlink entries have the structure of Table 4.
In addition, from the moment a switch is disconnected until it reconnects to the controller, the flow entries it already holds for data plane forwarding remain in place, so existing data plane forwarding is unaffected. After the connection is restored, the controller reinstalls the data plane entries, keeping the impact of the connection failure on underlying communication as small as possible.
With the dual-plane hybrid in-band transmitter 200, a switch whose direct connection has failed can still exchange OpenFlow messages with the controller through neighboring switches. The controller also adjusts the path between itself and the switch dynamically as the network changes, keeping control traffic latency low and improving service quality. At the moment of disconnection the switch retains its data plane entries and maintains existing communication, reducing the impact of the failure on packet forwarding.
The dual-plane hybrid in-band transmitter 200 of the embodiment solves the problem of switches losing controller control because some control connections have failed. It lets the whole system recover smoothly from control plane failures of different severity, so that normal communication is restored within a short time whatever the severity, which improves network reliability.
Further, in one embodiment of the present invention, the single-data-plane autonomous learner 300 includes a processing module and a self-learning module. The processing module processes ARP packets by adding flow entries and processes OSPF packets by adding actions, where the actions are carried in entries that contain special actions. The self-learning module learns entries from received packets.
It can be understood that OpenFlow is a southbound protocol commonly used in SDN architectures. Building on its multi-table pipeline, the embodiment uses a selective multi-table processing scheme. Once a packet enters the switch, a basic disaster-recovery entry directs it to a type-classification table, which determines the packet type and directs the packet to the matching processing table. The processing of each packet type may itself use several tables so that different cases are handled correctly.
Because the underlying network is completely detached from the control plane, the switches must, as in a related-art network, run a distributed routing protocol to obtain routes for the whole network. In a related-art network, ARP packets and routing protocol packets must already have appeared before hosts exchange IP packets. (The routing protocol may be OSPF, RIP, IS-IS, or another protocol; OSPF is used as the example here.) From an analysis of the basic requirements of packet forwarding, the embodiment finds that learning from and correctly processing ARP and OSPF packets is sufficient to achieve the intended result.
In the embodiment, the single-data-plane autonomous learner 300 is divided into a processing module and a self-learning module. The processing module handles received packets correctly: it processes ARP packets directly by adding entries, and it processes OSPF packets by adding actions. The self-learning module learns entries from received packets and is the core of the single-data-plane autonomous learner 300. It is divided roughly by network layer into a Layer 2 entry self-learning unit, an ARP entry self-learning unit, and a Layer 3 entry self-learning unit. To avoid repetition, they are not detailed at this point.
Further, in one embodiment of the present invention, the self-learning module includes a Layer 2 entry self-learning unit, an ARP entry self-learning unit, and a Layer 3 entry self-learning unit. The Layer 2 entry self-learning unit learns the mapping between MAC addresses and port numbers, which determines the port out of which a packet is finally forwarded. The ARP entry self-learning unit learns the correspondence between IP addresses and MAC addresses from ARP packets, which later guides IP packets sent by hosts on other segments to hosts on the local segment. The Layer 3 entry self-learning unit computes the next-hop switch toward each destination segment in the network, which later guides the forwarding of IP packets between hosts on different segments.
It can be understood that the Layer 2 entry self-learning unit learns the mapping between MAC addresses and port numbers, which determines the final output port of a packet. Its design is very close to the operation of a bridge in a related-art network, which records the source MAC address and ingress port of each packet entering the bridge. The difference is that the network in the embodiment may contain Layer 3 as well as Layer 2 segments, so, taking the actual role of each port into account, ports are learned only from ARP and OSPF packets. Specifically, as shown in Fig. 3, for each ARP or OSPF packet it receives, the switch parses the key fields of the packet, extracts the source MAC address, and records the port on which the packet arrived. From this information it constructs a new OpenFlow entry: the match field is the destination MAC, its value is the packet's source MAC address, and the action is to forward out of a specified port, namely the packet's ingress port. Finally, the switch checks whether the entry already exists in the designated flow table; if so, it updates the existing entry, and if not, it adds the new entry to the table.
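The Layer 2 learning step can be sketched as follows. The flow table is modeled as a plain dictionary and the field and action names (`eth_dst`, `output`) are illustrative labels, not the API of OVS or of any OpenFlow library.

```python
def learn_l2(flow_table, src_mac, in_port):
    """Install or refresh the entry 'eth_dst == src_mac -> output in_port'.

    flow_table maps a match tuple to an action tuple. Returns "updated" if
    an entry for this MAC already existed, otherwise "added".
    """
    match = ("eth_dst", src_mac)        # match on destination MAC = learned source MAC
    existed = match in flow_table
    flow_table[match] = ("output", in_port)  # forward out of the ingress port
    return "updated" if existed else "added"
```

Calling the function again for the same MAC on a different port overwrites the stored port, which mirrors the update-or-add check at the end of the paragraph above.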
The ARP entry self-learning unit learns the correspondence between IP addresses and MAC addresses from ARP packets. The entries it learns later guide IP packets sent by hosts on other segments to hosts on the local segment. The approach is essentially the same as Layer 2 learning, except that what is learned is the IP-to-MAC correspondence. The processing of ARP packets is shown in Fig. 4. For each ARP request or reply it receives, the switch parses the header and extracts the source IP address and source MAC address. From this information it constructs a new OpenFlow entry: the match fields are the IP protocol and the destination IP, the value of the destination IP being the packet's source IP address, and the action is to set the packet's destination MAC to the source MAC address of the ARP packet. Finally, the switch checks whether the entry already exists in the designated flow table; if so, it updates the existing entry, and if not, it adds the new entry to the table.
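As an illustration of the ARP learning step, the sketch below parses the 28-byte ARP payload for IPv4 over Ethernet and records the sender's IP-to-MAC mapping. The dictionary-based flow table and the labels `ipv4_dst` and `set_eth_dst` are assumptions made for the example; only the ARP field layout is standard.

```python
import ipaddress
import struct

# ARP payload for IPv4 over Ethernet: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes in total).
ARP_FMT = "!HHBBH6s4s6s4s"

def learn_arp(flow_table, arp_payload):
    """Record 'ipv4_dst == sender IP -> set eth_dst to sender MAC'."""
    fields = struct.unpack(ARP_FMT, arp_payload[:28])
    sender_mac_raw, sender_ip_raw = fields[5], fields[6]
    ip = str(ipaddress.IPv4Address(sender_ip_raw))
    mac = ":".join(f"{b:02x}" for b in sender_mac_raw)
    # Assigning to the key updates an existing entry or adds a new one.
    flow_table[("ipv4_dst", ip)] = ("set_eth_dst", mac)
    return ip, mac
```

Both ARP requests and replies carry the sender fields, so the same function applies to either, as the text describes.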
The main goal of the Layer 3 entry self-learning unit is to compute the next-hop switch toward each destination segment in the network and to install the corresponding entries, which later guide the forwarding of IP packets between hosts on different segments. As shown in Fig. 5, the unit extracts the contents of OSPF Link State Update (LSU) packets, stores them in a database, and builds Layer 3 forwarding entries from the synchronized database. The switches must therefore be able to receive and send OSPF packets. OVS parses each packet and stores the extracted link-state information in a local database. Each switch runs Dijkstra's algorithm over the database to find the next-hop switch on the shortest path to every destination segment. It then reads from the database the destination segment, the MAC address of the next-hop switch port toward that segment, and the MAC address of the corresponding forwarding port on the local switch. From this information it constructs a new OpenFlow entry: the match fields are the IP protocol and the destination segment, given as an address with a prefix length, and the actions are to set the packet's destination MAC to the MAC address of the next-hop switch port and its source MAC to the MAC address of the forwarding port. Finally, the switch checks whether the entry already exists in the designated flow table; if so, it updates the existing entry, and if not, it adds the new entry to the table.
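The next-hop computation can be sketched with a standard Dijkstra search that remembers the first hop of each shortest path. The graph layout (`{node: {neighbour: cost}}`) is an assumed stand-in for the link-state database; turning each result into a flow entry with the MAC rewrites described above is left out.

```python
import heapq

def next_hops(graph, source):
    """Return {destination: first hop on a shortest path from source}.

    graph: {node: {neighbour: link cost}}, with non-negative costs.
    """
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source, None)]  # (distance, node, first hop used to reach it)
    done = set()
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue            # stale queue entry
        done.add(node)
        if hop is not None:
            first_hop[node] = hop
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nbr not in done and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # A neighbour of the source is its own first hop.
                heapq.heappush(heap, (nd, nbr, nbr if node == source else hop))
    return first_hop
```

Each switch would run this with itself as `source`, then look up the port MAC addresses for the returned next hop when building the Layer 3 entry.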
Further, in one embodiment of the present invention, the self-learning module also includes a basic disaster-recovery entry unit, which directs packets to jump between the tables for the different network layers.
It can be understood that, as shown in Fig. 6, every packet in OVS first enters table 0 by default for matching. An entry is added there to check the state of the controller connection: if the controller is healthy, the entries installed by the controller are used for forwarding; if the controller has failed, the packet jumps to the type-classification table. Note that this connection-state entry must have a higher priority than the entry that sends packets up to the controller, so that the connection state is checked first. The type-classification table determines the packet type, distinguishing ARP packets, IP packets exchanged between hosts, and OSPF packets exchanged between switches, and adds the corresponding jump, forward, drop, or self-learning actions for each type. ARP packets are handled in two parts. First, when a host requests the gateway's MAC address, the switch acts as the gateway and replies to the ARP request directly from its entries. Second, ARP requests within the same segment are flooded out of all ports, while ARP replies within the same segment are forwarded by unicast after a table lookup. In addition, a match-all (default) entry is installed in each table, with its action set to drop, broadcast, or another action as the case requires. Finally, entries are added to decrement the TTL of IP packets exchanged between hosts.
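The table 0 check and the type-classification step can be sketched as a single dispatch function. The table names returned here are hypothetical labels; the EtherType values and the OSPF protocol number are the standard assigned values.

```python
ETH_TYPE_ARP = 0x0806   # EtherType for ARP
ETH_TYPE_IPV4 = 0x0800  # EtherType for IPv4
IP_PROTO_OSPF = 89      # IP protocol number assigned to OSPF

def dispatch(packet, controller_connected):
    """Return the (hypothetical) name of the table that handles `packet`.

    packet is a dict with 'eth_type' and, for IPv4, 'ip_proto'.
    """
    # Table 0: the connection-state check has the highest priority.
    if controller_connected:
        return "controller_entries"
    # Type-classification table, reached only when the controller is down.
    eth_type = packet.get("eth_type")
    if eth_type == ETH_TYPE_ARP:
        return "arp_table"
    if eth_type == ETH_TYPE_IPV4:
        if packet.get("ip_proto") == IP_PROTO_OSPF:
            return "ospf_table"
        return "ip_table"
    return "default_entry"  # match-all entry: drop, broadcast, or other
```

In a real pipeline each branch would be a flow entry with a goto-table instruction; the function form only makes the priority order explicit.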
In addition, building on the connection-state detection mechanism in OVS, entries are installed according to the connection state. A change in the state value triggers installation of the disaster-recovery entries, and all disaster-recovery entries are deleted automatically when the connection to the controller is restored. These entries are therefore installed only at the moment the controller is disconnected and are absent while the controller is healthy, which reduces the processing load in the normally connected state.
With the basic disaster-recovery entry unit, even when every switch is outside the controller's control, the distributed routing protocol running on the switches and the basic entry learning ensure that the underlying devices can still communicate normally, reducing the impact of the failure on packet forwarding.
For example, in a specific embodiment of the present invention, the application scenario of the SDN control plane failure emergency system is as follows:
When different failures occur in the control plane, the system can switch smoothly between levels. The whole process requires no manual intervention and is completed autonomously by the system, which keeps data plane transmission working and greatly reduces the impact of control plane failures on the data plane.
As shown in Figure 7, the embodiment deploys multiple controllers in the control plane, and each controller has a direct link to each switch, so that normal OpenFlow communication can be established. (Here "direct" means that a path exists over a network outside the data plane. The controller and switch are not necessarily directly connected: they may be connected through many existing network switches or through a single switch, depending on the actual network deployment, so the diagram uses a cloud to cover all cases.)
The embodiment does not discuss link failures that may occur inside the cloud; it discusses only failures of the links that connect switches to the cloud.
In the initial mode the network is in its normal working state. The control plane elects one controller as the master controller, and the remaining controllers act as backup controllers. Assume that C1 is the current master controller in the diagram. As shown in sub-figure A of Figure 7, in normal working mode the master controller C1 manages data plane transmission.
Assume the current master controller C1 fails. As shown in sub-figure B of Figure 7, the backup controllers C2 and C3 detect the failure of the master controller and elect a new master controller, assumed to be C2. C2 downloads and synchronizes C1's state from the data sharing center, completes the full switchover of controller duties, notifies the switches that it has become the new master controller, and takes over data transmission of the underlying switches (① in Figure 8). The controller scheme and deployment mode are not fixed and can be chosen according to actual needs during network deployment; no specific limitation is imposed here.
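The failover step can be illustrated as follows. The text leaves the election scheme open, so the rule used here (the live controller with the lowest identifier wins) is purely an assumption, as are the function names and the dictionary standing in for the data sharing center.

```python
def elect_master(controllers, alive):
    """Pick a new master among live controllers (assumed rule: lowest ID wins)."""
    candidates = sorted(c for c in controllers if c in alive)
    return candidates[0] if candidates else None


def fail_over(controllers, alive, shared_store):
    """Elect a new master and have it copy the shared state.

    `shared_store` is a hypothetical stand-in for the data sharing center.
    """
    new_master = elect_master(controllers, alive)
    if new_master is None:
        return None, {}
    synced_state = dict(shared_store)  # the new master downloads the old master's state
    return new_master, synced_state
```

With controllers C1, C2, C3 and C1 down, `fail_over(["C1", "C2", "C3"], {"C2", "C3"}, store)` returns C2 together with a copy of the shared state, matching the scenario in the text.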
After taking over the network, the new master controller C2 starts the control-flow management application (C1 also runs this application while it is working normally). Based on the port information reported by the switches, the application determines in real time the fault status of each switch port and of the link the port belongs to. If everything is working normally, the application installs flow entries that direct the control flow to be forwarded through the directly connected port. Control flow transmission at this point is out-of-band: each switch carries its control flow over a dedicated network that is not part of the data plane.
In single-control-plane multi-controller mode, the delay of switching to a backup controller after a controller single-point failure is extremely short and is imperceptible to the data plane, so data plane transmission is unaffected when a single controller fails.
Assume the control link of switch S1 suffers a physical failure (a link port failure or prolonged congestion). As shown in sub-figure C of Figure 7, its control flow can no longer be forwarded to the controller over that link, so switch S1 detects that its connection to the controller is interrupted.
The switch actively deletes the flow entry, installed by the controller application, that directs the control flow out of the directly connected port (if this entry remained, it would keep steering the control flow toward the failed link, so it must be deleted), and forwards control packets according to the disaster-recovery flow entries installed in advance. Because the switch is not under controller control at this point, it randomly selects a path through the data plane to reach the controller. Assume the control flow of switch S1 is forwarded via switch S2 to the controller and the connection is re-established. Once the connection is re-established, the control-flow management application takes over forwarding of the control flow again and selects a new forwarding path for the switch's control flow based on the current link failures and other factors. Control flow transmission at this point is in-band: the switch carries its control flow over the data plane network.
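One way to picture the in-band fallback is a search over the data plane for a switch whose control link still works. The sketch below uses breadth-first search, which returns a path with the fewest hops; that is an assumption made for illustration, since the text says the switch first picks a path at random and the controller application later chooses a path using criteria that are not specified. The function name and the dictionary layout are hypothetical.

```python
from collections import deque


def control_path(switch, control_link_up, neighbors):
    """Return the switches a control flow traverses to reach the controller.

    control_link_up: switch -> True if its direct control link works
    neighbors:       switch -> iterable of adjacent data plane switches
    Returns [switch] for out-of-band, a longer list for in-band,
    or None when no switch with a working control link is reachable.
    """
    if control_link_up.get(switch, False):
        return [switch]                      # out-of-band: use the direct link
    queue = deque([[switch]])
    seen = {switch}
    while queue:
        path = queue.popleft()
        for nbr in sorted(neighbors.get(path[-1], ())):
            if nbr in seen:
                continue
            seen.add(nbr)
            if control_link_up.get(nbr, False):
                return path + [nbr]          # in-band via the data plane
            queue.append(path + [nbr])
    return None                              # stay in self-learning mode
```

In the scenario above, S1 with a failed control link and a neighbor S2 whose link works gets the path S1, S2. When every control link is down the function returns `None`, which corresponds to remaining in self-learning mode.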
In the dual-plane hybrid in-band transmitter, the switch loses its connection to the controller for a period of time while switching from out-of-band to in-band. To keep the underlying network forwarding normally, once the switch detects the lost connection it enters single-data-plane self-learning mode, that is, the self-learning unit (② in Figure 8), and begins generating entries on its own to guide data forwarding. Because self-learning needs some time to converge, the entries already installed by the controller are not deleted and continue to guide packet forwarding; once those entries time out and are removed, the self-learned entries take effect. This double safeguard of retained controller-installed entries plus self-learned entries greatly reduces the impact on the data plane during the switchover. After the switchover completes, the controller takes over the switch again, and the switch automatically leaves single-data-plane self-learning mode and enters in-band management mode (③ in the transition diagram).
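The double safeguard amounts to a lookup order: a controller-installed entry is used until it times out, after which the self-learned entry applies. A small sketch under assumed data shapes (the dictionary layout, field names, and `select_out_port` are hypothetical):

```python
def select_out_port(dst, controller_entries, learned_entries, now):
    """Prefer an unexpired controller-installed entry, else a self-learned one.

    controller_entries: dst -> {"out_port": int, "expires_at": float}
    learned_entries:    dst -> out_port
    Returns None when neither table has a usable entry.
    """
    entry = controller_entries.get(dst)
    if entry is not None and now < entry["expires_at"]:
        return entry["out_port"]
    return learned_entries.get(dst)
```

While self-learning is still converging, `learned_entries` may be empty, but forwarding continues as long as the controller entry has not expired.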
Assume the control links of all switches now suffer physical failures (link port failures or prolonged congestion). As shown in sub-figure D of Figure 7, no switch has a fault-free path for its control flow to reach the controller, so every switch detects that it has lost its connection to the controller.
At this point the switches act exactly as they do when switching to in-band mode. The only difference is that if at least one working control link exists, the switch to in-band mode succeeds; otherwise no switch can establish a connection to the controller, and all of them remain in single-data-plane self-learning mode (④ in Figure 8).
Assume the control links of some switches recover. Because a switch in single-data-plane self-learning mode keeps trying in the background to connect to the controller, it re-establishes the connection over the recovered control link. The controller takes over the switch again, and the switch automatically leaves single-data-plane self-learning mode and enters in-band management mode (⑤ in Figure 8). The reconnection process is identical to that of switching to in-band mode and is not repeated here.
Assume the control links of all switches have now recovered. The control-flow management application on the controller determines from the port information reported by the switches that all links are back to normal. It therefore updates the flow entries that guide control flow forwarding in all switches, selects the direct path for every control flow, and restores out-of-band mode (⑥ in Figure 8).
The process above covers the full cycle from failure to recovery. It is carried out entirely by the switches and controllers on their own, and throughout it the system does as much as possible to keep underlying data transmission running, greatly reducing the impact of failures on the underlying network. The process describes only one order of failure and recovery, but the system is not limited to that order: it works normally under any order of failures. In addition, the case where all controllers fail is handled in the same way as the case where all control links fail and is not described in detail here.
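The failure and recovery walk-through can be summarized as a small state machine. The sketch below is a reading of the text rather than a reproduction of Figure 8: the mode and event names are hypothetical, the circled numbers in the comments follow the narrative order, and the transition from in-band back to self-learning on a lost connection is an assumption.

```python
from enum import Enum


class Mode(Enum):
    OUT_OF_BAND = "out-of-band"
    SELF_LEARNING = "single-data-plane self-learning"
    IN_BAND = "in-band"


TRANSITIONS = {
    (Mode.OUT_OF_BAND, "master_failover"): Mode.OUT_OF_BAND,       # 1: backup takes over
    (Mode.OUT_OF_BAND, "connection_lost"): Mode.SELF_LEARNING,     # 2: switch loses controller
    (Mode.SELF_LEARNING, "reconnected_in_band"): Mode.IN_BAND,     # 3 and 5: reconnect via data plane
    (Mode.SELF_LEARNING, "reconnect_failed"): Mode.SELF_LEARNING,  # 4: no working control link
    (Mode.IN_BAND, "connection_lost"): Mode.SELF_LEARNING,         # assumed, not stated explicitly
    (Mode.IN_BAND, "all_links_restored"): Mode.OUT_OF_BAND,        # 6: direct paths reinstalled
}


def step(mode: Mode, event: str) -> Mode:
    """Apply one event; events with no listed transition leave the mode unchanged."""
    return TRANSITIONS.get((mode, event), mode)


def run(events, mode: Mode = Mode.OUT_OF_BAND) -> Mode:
    for event in events:
        mode = step(mode, event)
    return mode
```

Feeding the events in the order of the walk-through ends in out-of-band mode again, and no step requires manual intervention.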
In summary, the SDN control plane failure emergency system of the embodiment has the following advantages:
(1) Wide applicability: the SDN control plane failure emergency system provides different modes for different control plane failures, can handle the failures that may occur in the control plane as well as arbitrary combinations of them, and is suitable for any SDN network deployment.
(2) Improved network reliability: the system switches smoothly between levels and minimizes the impact of control plane failures, ideally achieving seamless switchover of the network.
(3) Communication preserved after some switches lose their connection: after the connection between some switches and the controller is interrupted, the system can switch to in-band mode and use data plane switches to forward control packets, so that the controller re-establishes its connection with the switches and data plane communication is maintained.
(4) Improved network controllability: a controllable in-band management module lets the controller manage in-band control flows in real time, so that policies can be applied to control flow forwarding. This improves network controllability and is in line with the SDN philosophy.
(5) Communication preserved after the control plane fails completely: with the layer-3 self-learning mode, switches can generate layer-3 routing entries on their own after a complete control plane failure, keeping the network communicating normally.
(6) Wider deployment range: the switches are suitable for layer-3 networks, which widens the range of networks in which the system can be deployed.
The SDN control plane failure emergency system proposed in the embodiment solves the controller single-point failure problem with the single-control-plane multi-controller unit; solves direct connection failures between some switches and the controller with the dual-plane hybrid in-band transmitter, improving the controllability of the whole network; and solves complete control plane failure with the single-data-plane autonomous learner. It switches automatically between levels according to the real-time connection state between switches and controllers, which keeps the underlying data plane communicating normally, reduces network recovery delay, and improves network reliability. Switching requires no manual intervention, which reduces the burden on network administrators.
Next, the SDN control plane failure emergency method proposed according to the embodiment is described with reference to the accompanying drawings.
Figure 9 is a flowchart of the SDN control plane failure emergency method of one embodiment of the present invention.
As shown in Figure 9, the SDN control plane failure emergency method includes the following steps:
In step S901, when a controller single-point failure occurs, the emergency system is switched into single-control-plane multi-controller mode, in which multiple controllers are deployed in the control plane as redundant backups to resolve the controller single-point failure.
In step S902, when a direct connection failure occurs between some switches and the controller, the emergency system is switched into dual-plane hybrid in-band transmission mode. Control flows are carried through other switches in the data plane, and their transmission remains under the controller's control: the controller installs flow tables to manage the control flow paths, so that all traffic in the network is managed directly by the controller.
In step S903, when the control plane fails, the emergency system is switched into single-data-plane autonomous learning mode. The distributed routing protocol is added to the switch as an extension compatible with the OpenFlow protocol, and the parsing of protocol packets is implemented as actions supported by the OpenFlow protocol, so that, together with the flow entries, packets are processed and forwarded entirely through the OpenFlow pipeline.
Further, in one embodiment of the present invention, when a direct connection failure occurs between some switches and the controller, the method further includes: the switch probes the connection with periodically sent ECHO packets; if no ECHO reply is received within the specified time, the switch is judged to be disconnected from the controller, and the switch attempts to re-establish the TCP connection with the controller.
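The ECHO-based liveness check reduces to a timeout on the most recent reply. The sketch below assumes timestamps are passed in by the caller so the logic stays independent of any real clock or OpenFlow library; the class name, method names, and timeout value are hypothetical.

```python
class EchoMonitor:
    """Tracks ECHO replies and reports when the controller is considered lost."""

    def __init__(self, timeout: float, start: float = 0.0):
        self.timeout = timeout      # hypothetical value chosen by the operator
        self.last_reply = start
        self.connected = True

    def on_echo_reply(self, now: float) -> None:
        self.last_reply = now
        self.connected = True

    def check(self, now: float) -> str:
        """Return the action the switch should take at time `now`."""
        if now - self.last_reply > self.timeout:
            self.connected = False
            return "reconnect"      # try to re-establish the TCP connection
        return "ok"
```

A switch loop would call `check` on every probe interval and trigger its disaster-recovery handling the first time the result changes to `"reconnect"`.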
Further, in one embodiment of the present invention, after the emergency system is switched into single-data-plane autonomous learning mode, the method further includes: processing ARP packets by adding entries, and processing OSPF packets by adding actions; and performing entry self-learning on received packets, wherein the actions include entries containing special actions.
Further, in one embodiment of the present invention, performing entry self-learning on received packets further includes: learning the mapping between MAC addresses and port numbers, which determines the final output port of a packet; learning the mapping between IP addresses and MAC addresses from ARP packets, which later guides the forwarding of IP packets from hosts in other network segments to hosts in the local segment; and calculating the next-hop switch toward every destination network segment in the network, which later guides the forwarding of IP packets between hosts in different network segments.
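The three kinds of learned state can be illustrated with two lookup tables and a shortest-path computation. This is a sketch under assumptions: the text does not say how the next hop is calculated, so Dijkstra's algorithm over link costs is used here because that is what OSPF itself does; all function names and the graph layout are hypothetical.

```python
import heapq


def learn_mac(mac_table, src_mac, in_port):
    """Layer 2: remember which port a source MAC address was seen on."""
    mac_table[src_mac] = in_port


def learn_arp(arp_table, sender_ip, sender_mac):
    """Remember the IP-to-MAC mapping carried in an ARP packet."""
    arp_table[sender_ip] = sender_mac


def next_hops(graph, source):
    """Layer 3: first hop on a shortest path from `source` to every other switch.

    graph: switch -> {neighbor: positive link cost}
    """
    first_hop = {}
    visited = set()
    heap = [(0, source, None)]
    while heap:
        dist, node, hop = heapq.heappop(heap)
        if node in visited:
            continue                 # a shorter path to this node was already settled
        visited.add(node)
        if hop is not None:
            first_hop[node] = hop
        for nbr, cost in graph[node].items():
            if nbr not in visited:
                # Paths leaving the source start at that neighbor; others inherit the hop.
                heapq.heappush(heap, (dist + cost, nbr, nbr if hop is None else hop))
    return first_hop
```

For a triangle in which S1 to S3 costs 5 but S1 to S2 and S2 to S3 each cost 1, `next_hops` reports S2 as the next hop from S1 toward both S2 and S3.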
Further, in one embodiment of the present invention, performing entry self-learning on received packets also includes: guiding packets to jump between different network layers.
It should be noted that the foregoing explanation of the embodiment of the SDN control plane failure emergency system also applies to the SDN control plane failure emergency method of this embodiment and is not repeated here.
The SDN control plane failure emergency method proposed in the embodiment solves the controller single-point failure problem with the single-control-plane multi-controller unit; solves direct connection failures between some switches and the controller with the dual-plane hybrid in-band transmitter, improving the controllability of the whole network; and solves complete control plane failure with the single-data-plane autonomous learner. It switches automatically between levels according to the real-time connection state between switches and controllers, which keeps the underlying data plane communicating normally, reduces network recovery delay, improves network reliability, and reduces the burden on network administrators.
In the description of the present invention, it should be understood that orientation or position terms such as "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", and "circumferential" are based on the orientations or positions shown in the drawings. They are used only to simplify the description of the present invention and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the present invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the number of the technical features indicated. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example two or three, unless otherwise specifically defined.
In the present invention, unless otherwise clearly specified and limited, terms such as "installed", "connected", "coupled", and "fixed" should be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; and it may be an internal communication between two elements or an interaction between two elements. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific situation.
In the present invention, unless otherwise clearly specified and limited, a first feature being "on" or "under" a second feature may mean that the two features are in direct contact or that they are in indirect contact through an intermediary. Moreover, a first feature being "above", "over", or "on top of" a second feature may mean that the first feature is directly or obliquely above the second feature, or simply that the first feature is at a higher level than the second feature. A first feature being "below", "beneath", or "under" a second feature may mean that the first feature is directly or obliquely below the second feature, or simply that the first feature is at a lower level than the second feature.
In this specification, a description that refers to "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and cannot be construed as limiting the present invention. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711242547.2A CN108023814A (en) | 2017-11-30 | 2017-11-30 | SDN control plane failure emergency systems and method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711242547.2A CN108023814A (en) | 2017-11-30 | 2017-11-30 | SDN control plane failure emergency systems and method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN108023814A (en) | 2018-05-11 |
Family
ID=62077685
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201711242547.2A Pending CN108023814A (en) | SDN control plane failure emergency systems and method | 2017-11-30 | 2017-11-30 |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN108023814A (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103929333A (en) * | 2014-05-08 | 2014-07-16 | 陈桂芳 | Implementation method for SDN controller pool |
| US20160226817A1 (en) * | 2015-02-03 | 2016-08-04 | Electronics And Telecommunications Research Institute | Apparatus and method for creating block-type structure using sketch-based user interaction |
Non-Patent Citations (5)
| Title |
|---|
| TAO HUANG等: "Building SDN-Based Agricultural Vehicular Sensor", 《SENSORS》 * |
| 晏思宇等: "基于OVS的SDN移动自组网络架构设计及实现", 《无线电通信技术》 * |
| 杨帆等: "OVS的编程扩展技术", 《电信科学》 * |
| 洪硕果等: "一种SDN网络的故障自动恢复方案", 《计算机技术与发展》 * |
| 胡延楠: "软件定义网络关键技术及相关问题的研究", 《中国博士学位论文信息科技辑》 * |
Cited By (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109450811A (en) * | 2018-11-30 | 2019-03-08 | 新华三云计算技术有限公司 | Flow control methods, device and server |
| US12395229B2 (en) | 2018-11-30 | 2025-08-19 | Nokia Technologies Oy | Failure recovery of sidelink with beamforming |
| CN113273309A (en) * | 2018-11-30 | 2021-08-17 | 诺基亚技术有限公司 | Side link failure recovery with beamforming |
| CN109714437B (en) * | 2019-02-03 | 2020-10-16 | 北京邮电大学 | Emergency communication network system |
| CN109714437A (en) * | 2019-02-03 | 2019-05-03 | 北京邮电大学 | Emergency Communications Network system |
| CN112558504A (en) * | 2019-09-10 | 2021-03-26 | 中国电信股份有限公司 | Method, device and system for forwarding critical path information based on OSPF protocol |
| CN112558504B (en) * | 2019-09-10 | 2021-11-02 | 中国电信股份有限公司 | Method, device and system for forwarding critical path information based on OSPF protocol |
| CN111431763B (en) * | 2020-03-18 | 2021-07-27 | 紫光云技术有限公司 | Connectivity detection method for SDN controller |
| CN111431763A (en) * | 2020-03-18 | 2020-07-17 | 紫光云技术有限公司 | Connectivity detection method for SDN controller |
| CN114172919A (en) * | 2020-08-21 | 2022-03-11 | 三星电子株式会社 | Data center |
| CN115086978A (en) * | 2021-03-11 | 2022-09-20 | 中国移动通信集团四川有限公司 | Network function virtualization SDN network system |
| CN115086978B (en) * | 2021-03-11 | 2024-05-07 | 中国移动通信集团四川有限公司 | Network Function Virtualization SDN Network System |
| EP4672707A1 (en) * | 2024-06-26 | 2025-12-31 | New H3C Technologies Co., Ltd. | METHOD AND DEVICE FOR HANDLING CONNECTION FAULTS, STORAGE MEDIUM AND PROGRAM PRODUCT |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180511 |