HK1218192B - A method and system of updating conversation allocation in link aggregation - Google Patents

Publication number
HK1218192B
Authority
HK
Hong Kong
Prior art keywords
network device
port
conversation
aggregation
mask
Prior art date
Application number
HK16106078.4A
Other languages
Chinese (zh)
Other versions
HK1218192A1 (en)
Inventor
Panagiotis Saltsidis
János FARKAS
Balázs Peter GERÖ
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/135,556 (US9553798B2)
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Publication of HK1218192A1
Publication of HK1218192B

Description

Method and system for updating conversation allocation in link aggregation

Technical Field

Embodiments of the present invention relate generally to link aggregation and, more particularly, to methods and apparatus for conversation-sensitive collection for a link aggregation group (LAG).

Background Art

As illustrated in FIG. 1A, link aggregation is a network configuration and process used to aggregate multiple links between a pair of nodes 120, 122 in a network so that user data can be transmitted on each of the links participating in a link aggregation group (LAG) 101 (see, e.g., Institute of Electrical and Electronics Engineers (IEEE) standard 802.1AX). Aggregating multiple network connections in this fashion can increase throughput beyond what a single connection can sustain and/or can provide resiliency in case one of the links fails. The "Distributed Resilient Network Interconnect" (DRNI) 102 (see Clause 8 of IEEE 802.1AX-REV/D1.0) specifies extensions to link aggregation so that link aggregation can be used on a network interface even between more than two nodes, for example between the four nodes K, L, M, and O illustrated in FIG. 1B.

As shown in FIG. 1B, a LAG is formed between network 150 and network 152. More specifically, a LAG is formed between LAG virtual nodes or "portals" 112, 114. The first LAG virtual node or portal 112 includes a first node (K) and a second node (L). The second LAG virtual node or portal 114 includes a third node (M) and a fourth node (O). These nodes may also be referred to as "portal systems." Note that the first and second LAG virtual nodes or portals 112, 114 may each include a single node or more than two nodes. LAG nodes K and M are connected as peer nodes, and LAG nodes L and O are also connected as peer nodes. As used in this application, a "LAG virtual node" refers to a DRNI portal in the IEEE documentation discussed above (i.e., two or more nodes that appear as a single node to their respective peers). Additionally, the statement that virtual node or portal 112 "includes" two nodes K, L means that virtual node or portal 112 is emulated by nodes K, L; this can be referred to as an "emulated system." Similarly, the statement that virtual node or portal 114 "includes" two nodes M, O means that virtual node or portal 114 is emulated by nodes M, O. Note that link aggregation group 161 is also formed between the K-M and L-O links.

Multiple nodes participating in the LAG appear to their peering partner in the LAG to be the same virtual node or portal with a single system ID. The system ID is used to identify each node (e.g., node K, node L, node M, and node O). The system ID is included in the Link Aggregation Control Protocol Data Units (LACPDUs) sent between the individual partner nodes of the LAG (e.g., between K and M or between L and O). The system ID can be generated from the identifiers of the constituent nodes of a portal, using any individual identifier or any combination of them. A common and unique system ID for the corresponding LAG virtual node or portal can be generated consistently. Thus, as shown in FIG. 1B, node K and node L belong to the same network 150, are part of the same DRNI portal 112 (i.e., the same LAG virtual node), and use a common system ID of "K" for the emulated LAG virtual node 112. Similarly, nodes M and O of network 152 are seen by nodes K and L as a single LAG virtual node or portal 114 with system ID "M."
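The consistent derivation of a common system ID described above can be sketched as follows. This is a minimal illustration, not the method of the embodiments: the function name `portal_system_id`, the choice of the numerically lowest identifier, and the example identifier values are all assumptions made here. The text only requires that every node of a portal arrive at the same ID from the constituent identifiers.

```python
def portal_system_id(node_ids):
    """Derive one portal-wide system ID from the constituent node identifiers.

    Assumption for illustration: pick the numerically lowest identifier.
    The description permits any individual identifier or any combination,
    as long as every node of the portal derives the same value.
    """
    if not node_ids:
        raise ValueError("a portal needs at least one node")
    return min(node_ids)


# Hypothetical 48-bit identifiers for nodes K and L of portal 112.
node_k = 0x02_00_00_00_00_0A
node_l = 0x02_00_00_00_00_0B

# Both nodes compute the same ID regardless of the order in which they
# learn each other's identifiers.
id_seen_by_k = portal_system_id([node_k, node_l])
id_seen_by_l = portal_system_id([node_l, node_k])
```

Because the rule is deterministic and order-independent, nodes K and L can each compute the portal ID locally and still advertise the same value to their partners.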

FIG. 1B also shows the DRNI link allocation of a particular service (see the bold link between K and M in FIG. 1B). The allocation of a service to an interface may involve a virtual local area network (VLAN), and the identifier of the service may be a VLAN identifier (VID), such as a service VID (i.e., "S-VID") (typically identifying services on network-to-network interfaces (NNIs)) or a customer VID (i.e., "C-VID") (typically identifying services on user-to-network interfaces (UNIs)). (Note that B-VIDs are indistinguishable from S-VIDs because they have the same Ethertype.) In the example of FIG. 1B, the service is allocated to the upper link (between the upper nodes K, M). The upper link is thus chosen as the "working" link, and the lower link (between nodes L, O) is the "standby" or "protection" link. Service link allocation, i.e., using the same physical link for frame transmission in both the forward and backward directions, is highly desirable.

Transmitted frames may be dynamically redistributed, and such redistribution may result from a removed or added link or from a change in the load-balancing scheme. Traffic redistribution that occurs in the middle of a traffic flow can cause out-of-order frames. To ensure that frames are not duplicated or reordered by this redistribution, link aggregation uses a marker protocol. The purpose of the marker protocol is to detect when all the frames of a given traffic flow have been successfully received at the remote peer node. To achieve this, LACP transmits a marker protocol data unit (PDU) on each port channel link. The partner system responds to a received marker PDU once it has received all the frames transmitted on that link before the marker PDU, sending a marker response PDU for each marker PDU received. Once the local system has received the marker response PDUs on all member links of the portal, it can redistribute the frames in the traffic flow without risk of frame misordering. However, ensuring that the marker response PDU works properly in a DRNI, where either or both peer nodes of a LAG may comprise multiple systems, can be problematic. Measures must therefore be taken to ensure that frame ordering is maintained for certain sequences of frames exchanged between ports in such a LAG, referred to as conversations.
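The marker handshake described above can be sketched as a small bookkeeping structure. This is an illustrative model only: the class name `MarkerTracker`, its methods, and the link labels are assumptions introduced here, and no actual PDU is built or sent. It shows only the rule that redistribution waits until a marker response has arrived on every member link.

```python
class MarkerTracker:
    """Tracks marker responses for one pending redistribution (illustrative only)."""

    def __init__(self, member_links):
        # Assumption: a marker PDU has just been sent on every member link,
        # so a response is awaited on each of them.
        self.awaiting_response = set(member_links)

    def on_marker_response(self, link):
        # discard() tolerates duplicate or unexpected responses.
        self.awaiting_response.discard(link)

    def safe_to_redistribute(self):
        # Frames may be moved only once every link has answered, i.e. the
        # partner has received everything sent before the markers.
        return not self.awaiting_response


# Hypothetical labels for the two links of FIG. 1B.
tracker = MarkerTracker(["K-M", "L-O"])
tracker.on_marker_response("K-M")
still_waiting = not tracker.safe_to_redistribute()  # L-O has not answered yet
tracker.on_marker_response("L-O")
```

The sketch assumes a single local system sees all responses, which is exactly the assumption that becomes hard to satisfy when a portal is made up of several systems.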

Summary of the Invention

A method of updating conversation allocation in link aggregation is disclosed. The method is implemented at a network device for updating conversation allocation on links of a link aggregation group. The network device is communicatively coupled to aggregation ports through links of the link aggregation group, and it processes conversations, each consisting of an ordered sequence of frames. The method starts with verifying that an implementation of a conversation-sensitive Link Aggregation Control Protocol (LACP) is operational. It is then determined that operation through enhanced Link Aggregation Control Protocol Data Units (LACPDUs) is possible. The enhanced LACPDUs can be used to update conversation allocation information, and the determination is based at least partially on a compatibility check between a first set of operational parameters of the network device and a second set of operational parameters of a partner network device, where the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device. Then, based on a determination that the conversation allocation state is incorrect, the conversation allocation state of an aggregation port of the link aggregation group is updated, where the conversation allocation state of the aggregation port indicates a list of conversations transmitted through the aggregation port.
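The three steps of the method above (verify, check compatibility, update) can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the names `AggregationPort` and `update_conversation_allocation`, the representation of operational parameters as a dictionary, and the use of plain equality as the compatibility check are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AggregationPort:
    # True when conversation-sensitive LACP is operational on this port.
    conversation_sensitive_lacp: bool
    # First set of operational parameters (hypothetical representation).
    params: dict
    # Conversation allocation state: conversation IDs carried by this port.
    conversations: set = field(default_factory=set)


def update_conversation_allocation(port, partner_params, expected):
    """Return True if the port's conversation allocation state was updated."""
    # Step 1: verify that conversation-sensitive LACP is operational.
    if not port.conversation_sensitive_lacp:
        return False
    # Step 2: compatibility check between actor and partner parameters.
    # Assumption: "compatible" means equal values for every actor parameter.
    if any(partner_params.get(k) != v for k, v in port.params.items()):
        return False
    # Step 3: update only when the current state is found to be incorrect.
    if port.conversations != set(expected):
        port.conversations = set(expected)
        return True
    return False


port = AggregationPort(True, {"port_algorithm": "C-VID"}, {1, 2})
changed = update_conversation_allocation(port, {"port_algorithm": "C-VID"}, {2, 3})
```

In this sketch an incompatible partner or an inoperative protocol leaves the allocation state untouched, which mirrors the ordering of the steps in the summary.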

A network device configured to update conversation allocation in link aggregation is disclosed. The network device is configured to be communicatively coupled to aggregation ports through links of a link aggregation group and to process conversations, where each conversation consists of an ordered sequence of frames. The network device contains a network processor and an aggregation port controller configured to transmit frames to and receive frames from an aggregation port of the link aggregation group. The network processor includes an aggregation controller configured to: verify that an implementation of a conversation-sensitive Link Aggregation Control Protocol (LACP) is operational; determine that operation through enhanced Link Aggregation Control Protocol Data Units (LACPDUs) is possible, where the enhanced LACPDUs can be used to update conversation allocation information, where the determination is based on a compatibility check between a first set of operational parameters of the network device and a second set of operational parameters of a partner network device, and where the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device; and update a conversation allocation state of the aggregation port of the link aggregation group based on a determination that the conversation allocation state is incorrect, where the conversation allocation state of the aggregation port indicates a list of conversations transmitted through the aggregation port.

A non-transitory computer-readable storage medium is disclosed, having stored therein instructions that, when executed by a processor, cause the processor to perform operations for updating conversation allocation in link aggregation. The operations are implemented at a network device for updating conversation allocation on links of a link aggregation group. The network device is configured to be communicatively coupled to aggregation ports through links of the link aggregation group and to process conversations, each consisting of an ordered sequence of frames. The operations start with verifying that an implementation of a conversation-sensitive Link Aggregation Control Protocol (LACP) is operational. It is then determined that operation through enhanced Link Aggregation Control Protocol Data Units (LACPDUs) is possible. The enhanced LACPDUs can be used to update conversation allocation information, and the determination is based at least partially on a compatibility check between a first set of operational parameters of the network device and a second set of operational parameters of a partner network device, where the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device. Then, based on a determination that the conversation allocation state is incorrect, the conversation allocation state of an aggregation port of the link aggregation group is updated, where the conversation allocation state of the aggregation port indicates a list of conversations transmitted through the aggregation port.

A computer program is disclosed, having instructions that, when executed by a processor, cause the processor to perform operations implemented at a network device for updating conversation allocation on links of a link aggregation group. The network device is configured to be communicatively coupled to aggregation ports through links of the link aggregation group and to process conversations, each consisting of an ordered sequence of frames. The operations start with verifying that an implementation of a conversation-sensitive Link Aggregation Control Protocol (LACP) is operational. It is then determined that operation through enhanced Link Aggregation Control Protocol Data Units (LACPDUs) is possible. The enhanced LACPDUs can be used to update conversation allocation information, and the determination is based at least partially on a compatibility check between a first set of operational parameters of the network device and a second set of operational parameters of a partner network device, where the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device. Then, based on a determination that the conversation allocation state is incorrect, the conversation allocation state of an aggregation port of the link aggregation group is updated, where the conversation allocation state of the aggregation port indicates a list of conversations transmitted through the aggregation port.

A carrier comprising the computer program mentioned above is also disclosed, where the carrier is one of an electronic signal, an optical signal, a radio signal, or a computer-readable storage medium.

Embodiments of the present invention provide a mechanism for updating the conversation allocation of ports of a link aggregation group between network devices, so that the frame ordering of a sequence of exchanged frames can be maintained across the network devices. Embodiments of the present invention can be used in network devices that implement a link aggregation group between a pair of nodes or between portals, where each portal (such as in a DRNI system) contains multiple nodes.

Brief Description of the Drawings

The present invention may best be understood by referring to the following description and the accompanying drawings, which illustrate embodiments of the invention. In the drawings:

FIG. 1A is a diagram of one embodiment of a link aggregation group between two network devices.

FIG. 1B is a diagram of one embodiment of two portals connecting two networks via a link aggregation group.

FIG. 2 is a diagram of one embodiment of a link aggregation sublayer.

FIG. 3 is a flowchart illustrating a process of updating the conversation allocation of an aggregation port according to one embodiment of the present invention.

FIG. 4A illustrates a conversation mask TLV of an aggregation port according to one embodiment of the present invention.

FIG. 4B illustrates a conversation mask state field within the conversation mask TLV of an aggregation port according to one embodiment of the present invention.

FIG. 4C illustrates a port operational conversation mask of an aggregation port of a link aggregation group at a network device according to one embodiment of the present invention.

FIG. 5A is a diagram of one embodiment of a port conversation service mapping TLV.

FIG. 5B is a diagram of one embodiment of an aggregated administrative service conversation map.

FIG. 6 is another flowchart illustrating a process of updating the conversation allocation of an aggregation port according to one embodiment of the present invention.

FIG. 7 is a flowchart illustrating updating the conversation mask of an aggregation port upon receiving a long LACPDU according to one embodiment of the present invention.

FIGS. 8A-D illustrate a sequence of updating the conversation mask of an aggregation port according to one embodiment of the present invention.

FIG. 9 is a flowchart of one embodiment of a process for conversation-sensitive collection for a link aggregation group.

FIG. 10 is a flowchart of another embodiment of a process for conversation-sensitive collection for a link aggregation group.

FIG. 11 is a diagram of one embodiment of a network device implementing conversation-sensitive collection for a link aggregation group.

FIGS. 12A-C illustrate the conversation mask-1 to mask-3 TLVs of an aggregation port according to one embodiment of the present invention.

FIG. 13 illustrates the set of TLVs required to support the conversation-sensitive frame collection and distribution functionality according to one embodiment of the present invention.

Detailed Description

In the following description, numerous specific details are set forth. However, it is to be understood that embodiments of the present invention may be practiced without these specific details. In other instances, well-known circuits, structures, and techniques have not been shown in detail in order not to obscure the understanding of this description.

Those skilled in the art will appreciate, however, that the present invention may be practiced without such specific details. In other instances, control structures, gate-level circuits, and full software instruction sequences have not been shown in detail in order not to obscure the invention. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.

References in the specification to "one embodiment," "an embodiment," "an example embodiment," and the like indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include that feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such a feature, structure, or characteristic in connection with other embodiments, whether or not explicitly described.

Terminology

The following terms may be used in the description.

Actor: The local entity (i.e., node or network device) in a Link Aggregation Control Protocol (LACP) exchange.

Aggregation Key: A parameter associated with each Aggregation Port and with each Aggregator of an Aggregation System, identifying those Aggregation Ports that can be aggregated together. Aggregation Ports in an Aggregation System that share the same Aggregation Key value are potentially able to aggregate together.

Aggregation Port: A Service Access Point (SAP) in an Aggregation System that is supported by an Aggregator.

Aggregation System: A uniquely identifiable entity comprising (among other things) an arbitrary grouping of one or more Aggregation Ports for the purpose of aggregation. An instance of an aggregated link always occurs between two Aggregation Systems. A physical device may comprise a single Aggregation System or more than one Aggregation System.

Aggregator Client: The layered entity immediately above the Link Aggregation Sublayer, for which the Link Aggregation Sublayer provides an instance of the Internal Sublayer Service (ISS).

Aggregator: A logical media access control (MAC) address, bound to one or more Aggregation Ports, through which the Aggregator Client is provided access to the physical media.

Conversation: A set of frames transmitted from one end station to another, where all the frames form an ordered sequence and where the communicating end stations require that ordering to be maintained among the set of frames exchanged.

Data Terminal Equipment (DTE): Any source or destination of data connected to the local area network.

Distributed Relay (DR): A functional entity, distributed over a Portal by a DR Function in each of the Aggregation Systems comprising the Portal, that distributes outgoing frames from Gateways to Aggregators and distributes incoming frames from Aggregators to Gateways.

Distributed Resilient Network Interconnect (DRNI): Link aggregation expanded to include a Portal and an Aggregation System, or two Portals.

DR Function: The part of a Distributed Relay residing within a single Portal System.

Gateway: A connection, typically virtual (not a physical link between systems), that connects a Distributed Relay to a system and consists of a Gateway Link and two Gateway Ports.

Internal Sublayer Service (ISS): An augmented version of the MAC service, defined in IEEE Std 802.1AC-2012.

Link Aggregation Group (LAG): A group of links that appear to an Aggregator Client as if they were a single link. A Link Aggregation Group can connect two Aggregation Systems, an Aggregation System and a Portal, or two Portals. One or more conversations may be associated with each link that is part of a Link Aggregation Group.

Partner: The remote entity (i.e., node or network device) in a Link Aggregation Control Protocol exchange.

Port Conversation Identifier (ID): A conversation identifier value that is used to select frames passing through an Aggregation Port.

Portal: One end of a DRNI, including one or more Aggregation Systems, each with physical links that together comprise a Link Aggregation Group. The Portal's Aggregation Systems cooperate to emulate the presence of a single Aggregation System to which the entire Link Aggregation Group is attached.

Type/Length/Value (TLV): A short, variable-length encoding of an information element consisting of sequential type, length, and value fields, where the type field identifies the type of information, the length field indicates the length of the information field in octets, and the value field contains the information itself. The type value is locally defined and needs to be unique within the protocol defined in this standard.
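The TLV definition above can be illustrated with a short encoder and decoder. This is a minimal sketch under stated assumptions: the function names `encode_tlv` and `decode_tlvs`, the one-octet width of the type and length fields, and the example type number are hypothetical, and the length field here counts only the value octets, following the definition as worded above. Real LACPDU TLVs may define the length differently.

```python
import struct


def encode_tlv(tlv_type, value):
    """Encode one TLV: 1-octet type, 1-octet length, then the value octets."""
    if not 0 <= tlv_type <= 255:
        raise ValueError("type must fit in one octet")
    if len(value) > 255:
        raise ValueError("value too long for a one-octet length field")
    # Assumption: the length field counts the value octets only.
    return struct.pack("!BB", tlv_type, len(value)) + value


def decode_tlvs(data):
    """Decode a concatenation of TLVs into a list of (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!BB", data, offset)
        offset += 2
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, value))
        offset += length
    return tlvs


# Hypothetical type number 0x0A carrying a 4-octet bit mask.
encoded = encode_tlv(0x0A, bytes([0b10100000, 0, 0, 0]))
decoded = decode_tlvs(encoded + encode_tlv(0x00, b""))
```

Because each TLV states its own length, a receiver can skip types it does not understand and continue parsing the rest of the data unit.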

In the following description and claims, the terms "coupled" and "connected," along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. "Coupled" is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, cooperate or interact with each other. "Connected" is used to indicate the establishment of communication between two or more elements that are coupled with each other. A "set," as used herein, refers to any positive whole number of items, including one item.

An electronic device (e.g., an end station, a network device) stores and transmits (internally and/or with other electronic devices over a network) code (composed of software instructions) and data using machine-readable media, such as non-transitory machine-readable media (e.g., machine-readable storage media such as magnetic disks, optical disks, read-only memory, flash memory devices, and phase-change memory) and transitory machine-readable transmission media (e.g., electrical, optical, acoustical, or other forms of propagated signals, such as carrier waves and infrared signals). In addition, such electronic devices include hardware, such as a set of one or more processors coupled to one or more other components, e.g., one or more non-transitory machine-readable storage media (to store code and/or data) and network connections (to transmit code and/or data using propagating signals), as well as, in some cases, user input/output devices (e.g., a keyboard, a touchscreen, and/or a display). The coupling of the set of processors and the other components is typically through one or more interconnects within the electronic device (e.g., busses and possibly bridges). Thus, a non-transitory machine-readable medium of a given electronic device typically stores instructions for execution on one or more processors of that electronic device. One or more parts of an embodiment of the present invention may be implemented using different combinations of software, firmware, and/or hardware.

As used herein, a network device (e.g., a router, switch, or bridge) is a piece of networking equipment, including hardware and software, that communicatively interconnects other equipment on the network (e.g., other network devices, end stations). Some network devices are "multiple services network devices" that provide support for multiple networking functions (e.g., routing, bridging, switching, Layer 2 aggregation, session border control, Quality of Service, and/or subscriber management) and/or provide support for multiple application services (e.g., data, voice, and video). Subscriber end stations (e.g., servers, workstations, laptops, netbooks, palmtops, mobile phones, smartphones, multimedia phones, Voice over Internet Protocol (VoIP) phones, user equipment, terminals, portable media players, GPS units, gaming systems, set-top boxes) access content/services provided over the Internet and/or content/services provided on virtual private networks (VPNs) overlaid on (e.g., tunneled through) the Internet. The content and/or services are typically provided by one or more end stations (e.g., server end stations) belonging to a service or content provider, or by end stations participating in a peer-to-peer (P2P) service, and may include, for example, public webpages (e.g., free content, storefronts, search services), private webpages (e.g., username/password-accessed webpages providing email services), and/or corporate networks over VPNs. Typically, subscriber end stations are coupled (e.g., through customer premises equipment coupled to an access network, wired or wirelessly) to edge network devices, which are coupled (e.g., through one or more core network devices) to other edge network devices, which are coupled to other end stations (e.g., server end stations).

Network devices are commonly separated into a control plane and a data plane (sometimes referred to as a forwarding plane or a media plane). In the case that the network device is a router (or is implementing routing functionality), the control plane typically determines how data (e.g., packets) is to be routed (e.g., the next hop for the data and the outgoing port for that data), and the data plane is in charge of forwarding that data. For example, the control plane typically includes one or more routing protocols that communicate with other network devices to exchange routes and select those routes based on one or more routing metrics. Examples include exterior gateway protocols such as Border Gateway Protocol (BGP) (RFC 4271); Interior Gateway Protocols (IGPs) such as Open Shortest Path First (OSPF) (RFC 2328 and 5340), Intermediate System to Intermediate System (IS-IS) (RFC 1142), and Routing Information Protocol (RIP) (version 1 RFC 1058, version 2 RFC 2453, and next generation RFC 2080); Label Distribution Protocol (LDP) (RFC 5036); and Resource Reservation Protocol (RSVP) (RFC 2205, 2210, 2211, 2212, as well as RSVP-Traffic Engineering (TE): Extensions to RSVP for LSP Tunnels RFC 3209, Generalized Multi-Protocol Label Switching (GMPLS) Signaling RSVP-TE RFC 3473, RFC 3936, 4495, and 4558). In addition, the control plane also typically includes ISO Layer 2 control protocols such as Rapid Spanning Tree Protocol (RSTP), Multiple Spanning Tree Protocol (MSTP), and Shortest Path Bridging (SPB), which have been standardized by various standards bodies (e.g., SPB has been defined in IEEE Std 802.1aq-2012).

Routes and adjacencies are stored in one or more routing structures (e.g., Routing Information Base (RIB), Label Information Base (LIB), one or more adjacency structures) on the control plane. The control plane programs the data plane with information (e.g., adjacency and route information) based on the routing structures. For example, the control plane programs the adjacency and route information into one or more forwarding structures (e.g., Forwarding Information Base (FIB), Label Forwarding Information Base (LFIB), and one or more adjacency structures) on the data plane. The data plane uses these forwarding and adjacency structures when forwarding traffic.

Each of the routing protocols downloads route entries to a main RIB based on certain route metrics (the metrics can be different for different routing protocols). Each of the routing protocols can store the route entries, including the route entries that are not downloaded to the main RIB, in a local RIB (e.g., an OSPF local RIB). A RIB module that manages the main RIB selects routes from the routes downloaded by the routing protocols (based on a set of metrics) and downloads those selected routes (sometimes referred to as active route entries) to the data plane. The RIB module can also cause routes to be redistributed between routing protocols. For Layer 2 forwarding, the network device can store one or more bridging tables that are used to forward data based on the Layer 2 information in that data.

Typically, a network device includes a set of one or more line cards, a set of one or more control cards, and optionally a set of one or more service cards (sometimes referred to as resource cards). These cards are coupled together through one or more interconnect mechanisms (e.g., a first full mesh coupling the line cards and a second full mesh coupling all of the cards). The set of line cards makes up the data plane, while the set of control cards provides the control plane and exchanges packets with external network devices through the line cards. The set of service cards can provide specialized processing (e.g., Layer 4 to Layer 7 services such as firewall, Internet Protocol Security (IPsec) (RFC 4301 and 4309), Intrusion Detection System (IDS), peer-to-peer (P2P), Voice over IP (VoIP) Session Border Controller, and mobile wireless gateways (Gateway General Packet Radio Service (GPRS) Support Node (GGSN), Evolved Packet Core (EPC) Gateway)). By way of example, a service card may be used to terminate IPsec tunnels and execute the attendant authentication and encryption algorithms.

As used herein, a node forwards IP packets on the basis of some of the IP header information in the IP packet, where the IP header information includes the source IP address, destination IP address, source port, destination port (where "source port" and "destination port" refer herein to protocol ports, as opposed to physical ports of a network device), transport protocol (e.g., User Datagram Protocol (UDP) (RFC 768, 2460, 2675, 4113, and 5405), Transmission Control Protocol (TCP) (RFC 793 and 1180)), and differentiated services (DSCP) values (RFC 2474, 2475, 2597, 2983, 3086, 3140, 3246, 3247, 3260, 4594, 5865, 3289, 3290, and 3317). Nodes are implemented in network devices. A physical node is implemented directly on the network device, whereas a virtual node is a software, and possibly hardware, abstraction implemented on the network device. Thus, multiple virtual nodes may be implemented on a single network device.

A network interface may be physical or virtual, and an interface address is an IP address assigned to a network interface, whether it is a physical network interface or a virtual network interface. A physical network interface is hardware in a network device through which a network connection is made (e.g., wirelessly through a wireless network interface controller (WNIC) or by plugging a cable into a port connected to a network interface controller (NIC)). Typically, a network device has multiple physical network interfaces. A virtual network interface may be associated with a physical network interface, with another virtual interface, or stand on its own (e.g., a loopback interface, a point-to-point protocol interface). A network interface (physical or virtual) may be numbered (a network interface with an IP address) or unnumbered (a network interface without an IP address). A loopback interface (and its loopback address) is a specific type of virtual network interface (and IP address) of a node (physical or virtual) often used for management purposes; such an IP address is referred to as the nodal loopback address. The IP addresses assigned to the network interfaces of a network device are referred to as IP addresses of that network device; at a more granular level, the IP addresses assigned to network interfaces that are assigned to a node implemented on a network device can be referred to as IP addresses of that node.

Some network devices provide support for implementing VPNs (Virtual Private Networks) (e.g., Layer 2 VPNs and/or Layer 3 VPNs). For example, the network devices where a provider's network and a customer's network are coupled are respectively referred to as PEs (Provider Edge) and CEs (Customer Edge). In a Layer 2 VPN, forwarding typically is performed on the CEs on either end of the VPN, and traffic is sent across the network (e.g., through one or more PEs coupled by other network devices). Layer 2 circuits are configured between the CEs and PEs (e.g., an Ethernet port, an ATM permanent virtual circuit (PVC), a Frame Relay PVC). In a Layer 3 VPN, routing typically is performed by the PEs. By way of example, an edge network device that supports multiple contexts may be deployed as a PE; a context may be configured with a VPN protocol, and thus that context is referred to as a VPN context.

Some network devices provide support for VPLS (Virtual Private LAN Service) (RFC 4761 and 4762). For example, in a VPLS network, subscriber end stations access content/services provided through the VPLS network by coupling to CEs, which are coupled through PEs coupled by other network devices. VPLS networks can be used for implementing triple-play network applications (e.g., data applications (e.g., high-speed Internet access), video applications (e.g., television services such as IPTV (Internet Protocol Television), VoD (Video-on-Demand) services), and voice applications (e.g., VoIP (Voice over Internet Protocol) services)), VPN services, etc. VPLS is a type of Layer 2 VPN that can be used for multi-point connectivity. VPLS networks also allow subscriber end stations that are coupled with CEs at separate geographical locations to communicate with each other across a Wide Area Network (WAN) as if they were directly attached to each other in a Local Area Network (LAN) (referred to as an emulated LAN).

In VPLS networks, each CE typically attaches, possibly through an access network (wired and/or wireless), to a bridge module of a PE via an attachment circuit (e.g., a virtual link or connection between the CE and the PE). The bridge module of the PE attaches to an emulated LAN through an emulated LAN interface. Each bridge module acts as a "Virtual Switch Instance" (VSI) by maintaining a forwarding table that maps MAC addresses to pseudowires and attachment circuits. PEs forward frames (received from CEs) to destinations (e.g., other CEs, other PEs) based on the MAC destination address field included in those frames.

Network devices can also support native Layer 2 network technologies and device types, including VLAN bridged networks supported by C-VLAN bridges, provider bridges, provider backbone bridges, and provider backbone bridges with traffic engineering (TE) (as defined in IEEE Std 802.1ad-2005, IEEE Std 802.1ah-2008, IEEE Std 802.1Qay-2009, and IEEE Std 802.1Q-2011), as well as similar technologies and network device types. The above listing of network device types and supported technologies is provided by way of example and not limitation. Those skilled in the art will understand that other technologies, standards, and device types can be included as network devices as used herein.

Link Aggregation Sublayer

FIG. 2 is a diagram of one embodiment of link aggregation sublayer 200. Aggregator client 202 communicates with a set of aggregation ports 292, 294, 296 through aggregator 250. In one embodiment, aggregator 250 presents a standard IEEE Std 802.1Q Internal Sublayer Service (ISS) interface to aggregator client 202. Aggregator 250 binds to one or more aggregation ports, including aggregation ports 292, 294, 296. Aggregator 250 distributes frame transmissions from aggregator client 202 to aggregation ports 292, 294, 296, collects received frames from aggregation ports 292, 294, 296, and passes them to aggregator client 202 transparently.

The binding of aggregation ports 292, 294, 296 to aggregator 250 is managed by link aggregation control 210, which is responsible for determining which links can be aggregated, aggregating them, binding aggregation ports to an appropriate aggregator, and monitoring conditions to determine when a change in aggregation is needed. Such determination and binding can be under manual control, through direct manipulation of the state variables of link aggregation (e.g., through aggregation keys) by a network manager. In addition, automatic determination, configuration, binding, and monitoring can occur through the use of Link Aggregation Control Protocol (LACP) 214. LACP 214 uses peer exchanges across the links to determine, on an ongoing basis, the aggregation capability of the various links, and continuously provides the maximum level of aggregation capability achievable between a given pair of aggregation systems.

An aggregation system can contain multiple aggregators, serving multiple aggregator clients. A given aggregation port is bound to (at most) a single aggregator at any time. An aggregator client is served by a single aggregator at a time.

Frame ordering is maintained for certain sequences of frame exchanges between aggregator clients (known as conversations). Frame distributor 234 ensures that all frames of a given conversation are passed to a single aggregation port. For a given conversation, frame collector 224 is required to pass frames to aggregator client 202 in the order in which they are received from the aggregation port. Frame collector 224 is otherwise free to select frames received from aggregation ports 292, 294, 296 in any order. Since there is no means for frames to be misordered on a single link, this ensures that frame ordering is maintained for any conversation. Conversations can be moved among aggregation ports within a link aggregation group, both for load balancing and to maintain availability in the event of link failures.
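The single-port rule for conversations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `FrameDistributor`, the modulo-based default assignment, and the `move` helper are hypothetical and are not taken from the patent or from IEEE 802.1AX.

```python
from dataclasses import dataclass, field


@dataclass
class FrameDistributor:
    """Toy frame distributor: each conversation ID maps to exactly one port."""
    ports: list
    assignment: dict = field(default_factory=dict)

    def port_for(self, conversation_id: int) -> int:
        # Assumed default policy: modulo over the configured ports.
        if conversation_id not in self.assignment:
            index = conversation_id % len(self.ports)
            self.assignment[conversation_id] = self.ports[index]
        return self.assignment[conversation_id]

    def move(self, conversation_id: int, new_port: int) -> None:
        # A conversation may move (load balancing, link failure), but it
        # is still carried by a single port at any one time.
        if new_port not in self.ports:
            raise ValueError(f"port {new_port} is not in this aggregation group")
        self.assignment[conversation_id] = new_port


distributor = FrameDistributor(ports=[292, 294, 296])
frames = [(7, "first"), (8, "other"), (7, "second")]
sent = [(distributor.port_for(cid), payload) for cid, payload in frames]
```

Because both frames of conversation 7 leave on the same port, and a single link cannot misorder frames, the collector on the far side receives them in their original order.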

Aggregation ports 292, 294, 296 are each assigned a media access control (MAC) address that is unique over the link aggregation group and over any bridged local area network (LAN) (e.g., a LAN complying with IEEE 802.1Q bridging) to which the link aggregation group is connected. These MAC addresses are used as the source addresses for frame exchanges that are initiated by entities within link aggregation sublayer 270 itself (i.e., LACP 214 and Marker protocol exchanges).

Aggregator 250 (and other aggregators, if deployed) is assigned a MAC address that is unique over the link aggregation group and over the bridged LAN (e.g., a LAN complying with IEEE 802.1Q bridging) to which the link aggregation group is connected. From the perspective of aggregator client 202, this address is used as the MAC address of the link aggregation group, both as the source address for transmitted frames and as the destination address for received frames. The MAC address of aggregator 250 can be one of the MAC addresses of an aggregation port in the associated link aggregation group.

Distributed Resilient Network Interconnect (DRNI)

Link aggregation creates a link aggregation group, which is a collection of one or more physical links that appears to higher layers to be a single logical link. The link aggregation group has two ends, each terminating in an aggregation system. DRNI expands the concept of link aggregation so that, at either end or both ends of a link aggregation group, the single aggregation system is replaced by a portal, each portal composed of one or more aggregation systems.

DRNI is created by using a distributed relay to interconnect two or more systems, each running link aggregation, to create a portal. Each aggregation system in the portal (i.e., each portal system) runs link aggregation with a single aggregator. The distributed relay enables the portal systems to jointly terminate a link aggregation group. To all other aggregation systems to which the portal is connected, the link aggregation group appears to terminate in a separate emulated aggregation system created by the portal systems.

Set of Embodiments for Updating Conversation Allocation

FIG. 3 is a flow diagram illustrating a process of updating the conversation allocation of an aggregation port according to one embodiment of the invention. The operations of this and other flow diagrams will be described with reference to the exemplary embodiments of the other figures (e.g., the embodiment illustrated in FIG. 11). However, it should be understood that the operations of the flow diagrams can be performed by embodiments of the invention other than those discussed with reference to these other figures, and that the embodiments of the invention discussed with reference to these other figures can perform operations different from those discussed with reference to the flow diagrams.

The process illustrated in FIG. 3 may be implemented in a network containing one or more network devices deploying one or more link aggregation groups, such as network devices 120 and 122 of FIG. 1A and the network devices making up portals 112 and 114 of FIG. 1B. The process is for updating the conversation allocation of an aggregation port of a link aggregation group as the link aggregation group carries one or more conversations, where each conversation is associated with a service or an application in the network.

The process starts at block 301 with verifying that an implementation of a conversation-sensitive link aggregation control protocol (LACP) is operational. The implementation of the conversation-sensitive LACP needs to be operational; that is, LACP needs to be able to coordinate conversation-sensitive frame collection and distribution between a pair of actor and partner network devices. For example, the verification of block 301 may be performed by verifying that the implementation of the conversation-sensitive LACP can transmit and receive LACPDUs indicating the port algorithms (used to assign frames to the various conversations) of the actor and partner network devices, respectively. That is, the verification includes verifying that the port algorithm used by the network device (the actor network device) can be sent to the partner network device through the implementation of the conversation-sensitive LACP. In an alternative or additional embodiment, the verification includes verifying the consistency of the conversation identifier digests and the conversation service mapping digests, as discussed in more detail herein below. Without the verification, the network device does not know whether it can communicate conversation-sensitive information through LACP, and the processes for receiving conversation-sensitive LACP information are ignored. When the verification fails, the network device sends out a notification for management action.

After verifying that the implementation of the conversation-sensitive LACP is operational, the process flows to block 303. At block 303, the network device determines that operations through enhanced LACPDUs are possible, based at least partially on a compatibility check. Enhanced LACPDUs are LACPDUs that can be used for updating conversation allocation information through the link aggregation group, and they cannot operate under all conditions. The compatibility check determines whether a set of operational parameters of the network device associated with the aggregation port matches the corresponding set of operational parameters of a partner network device associated with a matching port at the partner network device. The partner network device is a remote network device that is communicatively coupled with the network device. If the sets of operational parameters match each other, process 300 continues. Optionally, if the sets of operational parameters do not match, a notification may be sent out to a management system of the link aggregation group so that a network operator can resolve the mismatch.
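A compatibility check of this kind amounts to comparing two sets of operational parameters and reporting any that differ. The sketch below is a hedged illustration only: the parameter names (`port_algorithm`, `conversation_id_digest`, `service_mapping_digest`) and their values are hypothetical placeholders, and the text does not specify which parameters a real implementation compares or how it reports a mismatch.

```python
_MISSING = object()  # sentinel so an absent parameter never equals a present one


def compatibility_check(actor_params: dict, partner_params: dict) -> list:
    """Return the names of parameters that differ; an empty list means compatible."""
    names = sorted(set(actor_params) | set(partner_params))
    return [
        name
        for name in names
        if actor_params.get(name, _MISSING) != partner_params.get(name, _MISSING)
    ]


# Hypothetical operational parameters for one aggregation port and its peer.
actor = {
    "port_algorithm": "c-vid",
    "conversation_id_digest": "ab12",
    "service_mapping_digest": "9f00",
}
partner = dict(actor, service_mapping_digest="0000")

mismatches = compatibility_check(actor, partner)
if mismatches:
    # Stand-in for the optional notification to the management system.
    print("parameter mismatch, notify management system:", mismatches)
```

Returning the list of mismatched names, instead of a bare Boolean, gives the operator something concrete to resolve when the optional notification is sent.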

An enhanced LACPDU is different from a traditional LACPDU. A traditional LACPDU (such as an LACPDU complying with version 1 of IEEE Std 802.1AX) has a frame size of 128 octets. Even if every bit of the 128 octets were used to indicate a conversation state, a traditional LACPDU could represent at most 128 × 8 = 1024 conversations. However, a link aggregation group may support more than 1024 conversations. For example, some embodiments may need to support up to 4096 conversations. For these embodiments a traditional LACPDU is insufficient, and process 300 therefore uses a different type of LACPDU, referred to as an enhanced LACPDU. In one embodiment, the enhanced LACPDU contains fields for a port algorithm TLV, a port conversation ID digest TLV, a port conversation mask TLV, and/or a port conversation service mapping TLV.
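The capacity argument above is simple arithmetic and can be checked directly. The helper names below are illustrative only; the figures of 128 octets and 4096 conversations come from the text.

```python
def max_conversations(octets: int) -> int:
    """Upper bound on one-bit-per-conversation flags that fit in `octets`."""
    return octets * 8


def octets_needed(conversations: int) -> int:
    """Octets required to hold one bit per conversation (ceiling division)."""
    return -(-conversations // 8)


TRADITIONAL_LACPDU_OCTETS = 128

print(max_conversations(TRADITIONAL_LACPDU_OCTETS))  # 1024
print(octets_needed(4096))                           # 512
```

Since 512 octets of mask alone exceed the entire 128-octet traditional frame, supporting 4096 conversations requires a larger PDU.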

After confirming that operations through enhanced LACPDUs are possible, the process goes to block 305, where a conversation allocation state of an aggregation port of the link aggregation group at the network device is updated. The update is based on a determination that the conversation allocation state of the aggregation port is incorrect. The conversation allocation state indicates a list of the conversations carried through the aggregation port. For example, when each conversation is identified by a conversation identifier (ID), the conversation allocation state of the aggregation port may contain a set of conversation IDs indicating the set of conversations passing through the port.

Under some circumstances, the conversation allocation state of an aggregation port of the network device may lose synchronization with that of the matching aggregation port of the partner network device. For example, an aggregation port of a link aggregation group at the network device may be set to transmit/receive the conversations identified as conversations 1-5, so that the conversation allocation state of the aggregation port indicates that conversations 1-5 pass through the aggregation port. However, the matching port of the link aggregation group at the partner network device may be set to transmit/receive the conversations identified as conversations 1-7 (e.g., because some other port at the partner network device has gone out of service). The conversation allocation state of the aggregation port at the network device is then out of synchronization with the partner network device, and it is thus considered incorrect. A similar problem occurs when another port of the same link aggregation group at the network device is set to transmit/receive the conversations identified as conversations 5-7. In that case, the conversation allocation state of the aggregation port is out of synchronization with the other port of the same link aggregation group, because conversation 5 cannot pass through both ports and still preserve the ordering of its frames. In other words, a synchronization failure can be characterized simply as a failure or malfunction of the distribution algorithm (or a related process) on one side of the LAG to ensure that a conversation is allocated to only a single port. Once the conversation allocation state of the aggregation port is determined to be incorrect, it is updated, for example, to match the conversation allocation state of the matching port at the partner network device, or to be consistent with the conversation allocation state of the other port of the same link aggregation group at the network device.
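The two out-of-sync situations in this example (a mismatch with the partner's matching port, and a conversation claimed by two ports of the same group) can be detected with plain set operations. The following is a minimal sketch, assuming each port's conversation allocation state is held as a Python set of conversation IDs; the function names and the port numbers 1 and 2 are hypothetical.

```python
def conflicting_conversations(port_states: dict) -> dict:
    """Map each conversation ID claimed by more than one local port to those ports."""
    owners = {}
    for port, conversations in port_states.items():
        for cid in conversations:
            owners.setdefault(cid, []).append(port)
    return {cid: sorted(ports) for cid, ports in owners.items() if len(ports) > 1}


def out_of_sync_with_partner(actor_state: set, partner_state: set) -> set:
    """Conversation IDs on which the actor port and the partner port disagree."""
    return actor_state ^ partner_state  # symmetric difference


# The example from the text: the local port carries conversations 1-5,
# the partner's matching port carries 1-7, and a second local port carries 5-7.
local_ports = {1: set(range(1, 6)), 2: set(range(5, 8))}
partner_port = set(range(1, 8))

print(conflicting_conversations(local_ports))                 # {5: [1, 2]}
print(out_of_sync_with_partner(local_ports[1], partner_port)) # {6, 7}
```

An empty result from both checks would indicate that the port's conversation allocation state is consistent; any non-empty result is a reason to update it.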

Embodiments of TLVs for Communicating the Conversation Allocation State of an Aggregation Port

The conversation allocation state of an aggregation port needs to be represented in a data format that can be transmitted through LACP. In one embodiment of the invention, a TLV format is used to communicate the conversation allocation state of an aggregation port. FIG. 4A illustrates a conversation mask TLV for an aggregation port according to one embodiment of the invention. Conversation mask TLV 400 contains four fields: TLV type 402, conversation mask length 404, conversation mask state 406, and port operation conversation mask 408. In other embodiments, the fields may not be positioned in the order illustrated in FIG. 4A, and other embodiments may contain more or fewer fields.

TLV type 402 indicates the nature of the information carried in the TLV tuple. In one embodiment, the conversation mask TLV is identified by the integer value 0x06. Conversation mask length 404 (labeled Conversation_Mask_Length in FIG. 4A) indicates the length, in octets, of the TLV tuple. The total length of the conversation mask TLV is 515 octets, so the field contains the value 515. In other embodiments, conversation mask length 404 contains a value greater or less than 515.

Conversation mask state 406 (labeled Conversation_Mask_State in FIG. 4A) indicates the state of the conversation mask. FIG. 4B illustrates the conversation mask state field within a conversation mask TLV of an aggregation port according to one embodiment of the invention. Conversation mask state 450 (an embodiment of conversation mask state 406) contains eight bits (one octet), of which seven are reserved for future use (reserved 411-414). The one remaining bit is a flag indicating whether the conversation mask used by the frame distributor of the aggregation port of the link aggregation group at the network device is the same as the conversation mask used by the frame distributor of the associated aggregation port of the link aggregation group at the partner network device. The flag is thus a synchronization flag, referred to as ActPar_Sync 410 in FIG. 4B. In one embodiment, the synchronization flag is a Boolean value: it indicates TRUE if the conversation mask used by the frame distributor of the aggregation port at the local network device is the same as the conversation mask used by the frame distributor of the aggregation port at the partner network device, and FALSE otherwise.
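One way to picture the synchronization flag is as a single bit derived by comparing the two distributors' masks. The sketch below rests on assumptions that the text does not settle: the position of ActPar_Sync within the octet (taken here as the most significant bit) and the value of the seven reserved bits (taken here as zero). The function names are hypothetical.

```python
ACTPAR_SYNC_BIT = 0x80  # assumption: ActPar_Sync occupies the most significant bit


def conversation_mask_state(actor_mask: bytes, partner_mask: bytes) -> int:
    """Build a one-octet state value: sync flag set only when both masks are equal."""
    in_sync = actor_mask == partner_mask
    # The seven reserved bits are left as zero (assumption).
    return ACTPAR_SYNC_BIT if in_sync else 0x00


def is_in_sync(state_octet: int) -> bool:
    """Read the ActPar_Sync flag back out of a state octet."""
    return bool(state_octet & ACTPAR_SYNC_BIT)


same = conversation_mask_state(bytes(512), bytes(512))
different = conversation_mask_state(bytes(512), b"\x80" + bytes(511))
```

Here `same` carries the flag and `different` does not, because the second pair of masks disagrees on whether the first conversation passes through the port.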

Port operation conversation mask 408 (labeled Port_Oper_Conversation_Mask in FIG. 4A) contains the value of a Boolean vector, indexed by port conversation identifier (ID), that indicates whether the indexed port conversation ID is distributed through the particular aggregation port. In one embodiment, the Boolean vector is constructed according to a priority selection agreement. The priority selection agreement directs a given conversation to a single aggregation port of the link aggregation group. Based on this information, the port operation conversation mask can be constructed to indicate which of the conversations, indexed by conversation identifier, are carried on the aggregation port.

FIG. 4C illustrates the port operation conversation mask of an aggregation port of a link aggregation group at a network device according to one embodiment of the invention. Port conversation mask 470 (labeled Port_Oper_Conversation_Mask in FIG. 4C) is an embodiment of port operation conversation mask 408. It contains 4096 bits (512 octets × 8 bits/octet = 4096 bits), and each bit indicates whether a given conversation is transmitted or received through the aggregation port (transmitted if the mask is a distributor mask, received if it is a collector mask). As illustrated, reference 420 at bit 0 is for conversation 0, reference 421 at bit 1 is for conversation 1 (that is, conversation ID = 1, and likewise for the other conversations), and references 422 at bit 2 and 423 at bit 3 are for conversations 2 and 3, respectively. Finally, reference 424 at bit 4095 indicates whether conversation 4095 is transmitted on the aggregation port. In one embodiment, the Boolean value for a conversation indicates "true" when the conversation is transmitted through the aggregation port. When a link aggregation group supports up to 4096 conversations (addressable with 12 bits), the 512-octet port conversation mask 470 can indicate every possible combination of conversations transmitted through the aggregation port. Note that some embodiments may support more or fewer than 4096 conversations, and the length of the port operation conversation mask may be chosen accordingly to accommodate a different maximum number of conversations.
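As a minimal sketch, the 4096-bit port conversation mask can be held in a 512-octet buffer indexed by conversation ID. The bit ordering within each octet is not given in the text; placing conversation 0 in the most significant bit of the first octet is an assumption, and the helper names are hypothetical.

```python
MAX_CONVERSATIONS = 4096            # addressable with 12 bits
MASK_OCTETS = MAX_CONVERSATIONS // 8  # 512 octets


def new_mask() -> bytearray:
    """Return an all-false mask: no conversation uses the port."""
    return bytearray(MASK_OCTETS)


def _locate(conversation_id: int):
    """Map a conversation ID to (octet index, bit within that octet)."""
    if not 0 <= conversation_id < MAX_CONVERSATIONS:
        raise ValueError("conversation ID out of range")
    # Assumption: conversation 0 is the most significant bit of octet 0.
    return conversation_id // 8, 0x80 >> (conversation_id % 8)


def set_conversation(mask: bytearray, conversation_id: int, carried: bool) -> None:
    """Mark whether the conversation is carried on this aggregation port."""
    octet, bit = _locate(conversation_id)
    if carried:
        mask[octet] |= bit
    else:
        mask[octet] &= ~bit & 0xFF


def carries(mask, conversation_id: int) -> bool:
    """Return True if the conversation is carried on this aggregation port."""
    octet, bit = _locate(conversation_id)
    return bool(mask[octet] & bit)
```

An embodiment supporting a different maximum number of conversations would change only MAX_CONVERSATIONS in this sketch.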

Note that conversation mask TLV 400 contains 515 octets (one octet each for the type, length, and state fields plus the 512-octet mask), which is much longer than the 128 octets of a LACPDU in version 1 of the IEEE 802.1AX standard. Thus, in one embodiment of the invention, a "long" LACPDU is needed to carry the conversation mask TLV.

In another embodiment, the port conversation mask is implemented using multiple TLVs. FIGS. 12A-C illustrate an embodiment in which the conversation mask of an aggregation port is implemented using three TLVs (conversation mask-1 through mask-3 TLVs) according to one embodiment of the invention. Referring to FIG. 12A, conversation mask-1 TLV 1200 contains four fields, similar to conversation mask TLV 400 of FIG. 4A: TLV type 1202, conversation mask-1 length 1204, conversation mask state 1206, and port operation conversation mask-1 1208.

TLV type 1202 identifies the type of information carried in the TLV tuple. In one embodiment, the conversation mask-1 TLV may be identified by the integer 0x06. Conversation mask-1 length 1204 (labeled Conversation_Mask_1_Length in FIG. 12A) indicates the length, in octets, of the TLV tuple. In one example embodiment, the length of the conversation mask-1 TLV is 195 octets, so field 1204 contains the value 195. In one embodiment, conversation mask state 1206 and port operation conversation mask-1 1208 are fields constructed similarly to conversation mask state 406 and port operation conversation mask 408 described above with reference to FIGS. 4B and 4C, respectively.

FIG. 12B illustrates the conversation mask-2 TLV of an aggregation port according to one embodiment of the invention. Conversation mask-2 TLV 1210 contains three fields: TLV type 1212, conversation mask-2 length 1214, and port operation conversation mask-2 1216. These fields serve functions similar to those of the corresponding fields of conversation mask-1 TLV 1200.

FIG. 12C illustrates the conversation mask-3 TLV of an aggregation port according to one embodiment of the invention. Conversation mask-3 TLV 1220 also contains three fields: TLV type 1222, conversation mask-3 length 1224, and port operation conversation mask-3 1226. These fields serve functions similar to those of the corresponding fields of conversation mask-1 TLV 1200. In one example embodiment, the length of the conversation mask-3 TLV is 130 octets, and the total length of the three port conversation masks combined is 512 octets. The first two conversation mask TLVs already carry 384 octets of the port operation conversation mask (192 octets each), leaving only 128 octets for the third, which together with its type and length fields gives the TLV length of 130 octets. The combined 512 octets equal the size of the port conversation mask described above with reference to FIGS. 4A-C. Those skilled in the art will therefore understand that an alternative embodiment with three conversation mask TLVs can be used in place of a single TLV, and that, by the same principle, the TLV can be divided into any number of separate TLVs. Similarly, where embodiments are discussed herein with reference to a single conversation mask TLV, alternative embodiments with multiple conversation mask TLVs are also contemplated.
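The division of a 512-octet mask across the three TLVs can be sketched as follows. The type values 0x06 to 0x08 and the lengths 195 and 130 come from the description; the length 194 for the conversation mask-2 TLV is inferred from its three fields (1 + 1 + 192), and the function name and byte layout are assumptions for illustration.

```python
MASK_TLV_TYPES = (0x06, 0x07, 0x08)   # conversation mask-1, -2, -3
CHUNK_SIZES = (192, 192, 128)         # octets of mask carried by each TLV


def build_mask_tlvs(mask: bytes, state_octet: int) -> list:
    """Split a 512-octet mask into three type-length-value byte strings."""
    if len(mask) != sum(CHUNK_SIZES):
        raise ValueError("mask must be exactly 512 octets")
    tlvs = []
    offset = 0
    for index, (tlv_type, size) in enumerate(zip(MASK_TLV_TYPES, CHUNK_SIZES)):
        chunk = bytes(mask[offset:offset + size])
        offset += size
        # Only the first TLV carries the conversation mask state octet.
        state = bytes([state_octet]) if index == 0 else b""
        # The length counts the whole tuple: type + length + state + mask.
        length = 2 + len(state) + size
        tlvs.append(bytes([tlv_type, length]) + state + chunk)
    return tlvs
```

Each resulting length (195, 194, 130) fits in a single length octet, which is one plausible reason for splitting the 515-octet single TLV.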

FIG. 5A is a diagram of one embodiment of a TLV that may be included in an enhanced LACPDU to exchange information about the state of the conversation ID digests maintained by each aggregation system. The TLV is referred to herein as the port conversation service mapping TLV. The TLV includes a set of fields 502, 505, and 506, defined as follows. The first field is TLV_type 502, which contains a value indicating that the TLV type is a port conversation service mapping digest. A digest is a cryptographic hash or similar processing of data that produces an identifier usable to uniquely (or nearly uniquely) identify the processed data, enabling error checking and content comparison (for example, if the contents of two files differ, their digests will differ). This field indicates the nature of the information carried in the TLV tuple. In one embodiment, the port conversation service mapping digest TLV may be identified by the integer value 0x0A. The second field is the Port_Conversation_ServiceMapping Digest_Length field 505, which indicates the length (in octets) of the TLV tuple. In one embodiment, the port conversation service mapping digest TLV uses a length value of 18 (0x12). The third field is Actor_Conversation_Service_Mapping_Digest 506. This field contains the value of the message digest (MD5) computed from aAggAdminServiceConversationMap[] for exchange with the partner system. aAggAdminServiceConversationMap[] is an array of service ID to conversation ID mappings maintained by the network device. There are 4096 variables, aAggAdminServiceConversationMap[0] through aAggAdminServiceConversationMap[4095], indexed by port conversation ID. In general, each contains a set of service IDs that is unique within the array. If the service IDs represent VIDs, only a single VID is applicable, whereas if the service IDs represent I-SIDs, more than one I-SID is possible. The partner systems can compare the MD5 digest values to determine whether the mappings maintained by each partner system differ.
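The digest TLV can be illustrated with Python's standard hashlib module. The description states only that an MD5 digest is computed from aAggAdminServiceConversationMap[]; the byte serialization of the map used below (sorted 32-bit service IDs preceded by a 16-bit count) is purely an assumption, as are the function names.

```python
import hashlib
import struct

DIGEST_TLV_TYPE = 0x0A     # from the description
DIGEST_TLV_LENGTH = 18     # type (1) + length (1) + MD5 digest (16)


def service_map_digest(service_map) -> bytes:
    """MD5 over the map; service_map holds one set of service IDs per conversation ID."""
    md5 = hashlib.md5()
    for service_ids in service_map:
        ids = sorted(service_ids)
        # Hypothetical serialization: 16-bit count, then 32-bit IDs.
        md5.update(struct.pack("!H", len(ids)))
        for service_id in ids:
            md5.update(struct.pack("!I", service_id))
    return md5.digest()


def build_digest_tlv(service_map) -> bytes:
    """Return the 18-octet port conversation service mapping digest TLV."""
    tlv = bytes([DIGEST_TLV_TYPE, DIGEST_TLV_LENGTH]) + service_map_digest(service_map)
    assert len(tlv) == DIGEST_TLV_LENGTH
    return tlv
```

Both partners would have to serialize the map identically for the comparison to be meaningful, so an actual implementation would follow whatever serialization the governing specification defines.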

FIG. 5B is a diagram of one embodiment of the aggregated administrative service conversation map. The diagram illustrates the fields of the map, which is an array indexed by port conversation ID and containing service IDs or integers representing service IDs. In one embodiment, the aggregated administrative service conversation map (aAggAdminServiceConversationMap[]) is an array of integers (such as 32-bit or 64-bit integers). In other embodiments, the array may hold values of any size, number, or type. The map can be used to translate a service ID into a conversation ID and vice versa: a conversation ID can be used to index into the array to retrieve the service ID, and the array can be traversed to find a service ID, in which case the corresponding index is the conversation ID.
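The two lookups described above can be sketched as follows, assuming the simplest case of one integer service ID per array entry and None for an unassigned entry; both conventions and the function names are assumptions for illustration.

```python
def service_id_for(conversation_map, conversation_id: int):
    """Forward lookup: index the array by conversation ID."""
    return conversation_map[conversation_id]


def conversation_id_for(conversation_map, service_id: int):
    """Reverse lookup: traverse the array; the matching index is the conversation ID."""
    for conversation_id, entry in enumerate(conversation_map):
        if entry == service_id:
            return conversation_id
    return None  # service ID is not mapped to any conversation
```

The forward lookup is constant time while the reverse lookup is linear in the array size; an implementation performing many reverse lookups might keep a separate dictionary keyed by service ID.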

FIG. 13 illustrates the set of TLVs needed to support conversation-sensitive frame collection and distribution functionality according to one embodiment of the invention. The set includes the port algorithm TLV, the port conversation ID digest TLV, the port conversation mask-1 through mask-3 TLVs, and the port conversation service mapping TLV, each of which has been discussed herein. In one example embodiment, port algorithm TLV 1302 has a type field value of 0x04, the port conversation ID digest TLV has a type field value of 0x05, and the port conversation mask-1 through mask-3 TLVs have type field values of 0x06 through 0x08, respectively. In one embodiment, the set of TLVs forms an enhanced LACPDU to implement the embodiments of the invention illustrated in FIGS. 3-6 and discussed herein.

Another Set of Embodiments for Updating Conversation Allocation

FIG. 6 is another flow diagram illustrating a process for updating the conversation allocation of an aggregation port according to one embodiment of the invention. The process may be implemented in a network containing one or more network devices that deploy a link aggregation group (such as network devices 120 and 122 of FIG. 1A). The process may also be implemented at portals 112 and 114 of FIG. 1B. Note that process 600 is illustrated with blocks 602-616, which are enclosed by the dotted-line blocks 301-305 to indicate that process 600 is one embodiment of the invention implementing process 300.

Referring to FIG. 6, the process begins at block 602 with initializing conversation-sensitive LACP. In one embodiment, the initialization includes recording the default port algorithm of the partner network device as the current operational port algorithm of the partner network device at the network device (for example, using a recordDefaultPortAlgorithm() function). The initialization includes recording the default port conversation identifier (ID) digest of the partner network device as the current operational conversation port digest of the partner network device at the network device (for example, using a recordDefaultConversationPortDigest() function). The initialization may further include recording the default conversation mask of the partner network device as the current operational conversation mask of the partner network device at the network device (for example, using a recordDefaultConversationMask() function). In addition, the initialization may include recording the default conversation service mapping digest of the partner network device as the current operational conversation service mapping digest of the partner network device at the network device (for example, using a recordDefaultConversationServiceMappingDigest() function). Each of these functions records the corresponding default value as a current operational parameter of the partner network device at the network device. Once the operational parameters of the partner network device have been recorded with default values, conversation-sensitive LACP is initialized.

The process continues at block 603, where the network device receives, from the partner network device, information about the port algorithm, the port conversation ID digest, and/or the conversation service mapping digest. The received information is used to record the parameter values as operational values at the network device. The network device receives the information as TLVs embedded in a LACPDU. The information about the port algorithm identifies the port algorithm and is carried in recordPortAlgorithmTLV, and the carried value is recorded as the current operational parameter value of the partner network device (for example, the operational parameter Partner_Port_Algorithm). The information about the port conversation ID digest is carried in recordConversationPortDigestTLV, and the carried value is recorded as the current operational parameter value of the partner network device (for example, the operational parameter Partner_Conversation_PortList_Digest). Additionally, the information about the conversation service mapping digest is carried in recordConversationServiceMappingDigestTLV, and the carried value is recorded as the current operational parameter value of the partner network device (for example, the operational parameter Partner_Admin_Conversation_PortList_Digest). Once the information is received, conversation sensitivity is verified to be operational, as described for block 301 of FIG. 3. Similar to block 303 of FIG. 3, process 600 then flows to blocks 604-606 and performs operations to determine, based at least in part on compatibility checks, that operation through enhanced LACPDUs is possible.

Referring to FIG. 6, at block 604 the network device determines whether the port algorithm it uses is the same as that used by the partner network device of the link aggregation group. The operational port algorithm of the network device may be stored in a variable such as Actor_Port_Algorithm of the link aggregation group, and the operational port algorithm of the partner network device may be stored in a variable such as Partner_Port_Algorithm of the same link aggregation group. The network device compares the two variables and determines whether they are consistent. For example, a function such as Differ_Port_Algorithms may be used, which returns a Boolean indicating whether the port algorithms used by the network device and the partner network device at the two ends of the same link aggregation group are the same. If the two variables are inconsistent, a notification is optionally sent so that the operator of the link aggregation group can resolve the discrepancy.

If the two variables are consistent, the flow goes to block 605, where the network device determines whether the conversation ID digest it uses is the same as that used by the partner network device of the link aggregation group. The operational conversation ID digest of the network device may be stored in a digest such as Actor_Conversation_PortList_Digest, and the operational conversation ID digest of the partner network device may be stored in a digest such as Partner_Conversation_PortList_Digest. The network device compares the two digests and determines whether they are consistent. For example, a function such as Differ_Port_Conversation_Digests may be used, which returns a Boolean indicating whether the port conversation digests used by the network device and the partner network device at the two ends of the same link aggregation group are the same. If the two digests are inconsistent, a notification is optionally sent so that the operator of the link aggregation group can resolve the discrepancy.

If the two digests are consistent, the flow goes to block 606, where the network device determines whether the conversation service mapping digest it uses is the same as that used by the partner network device of the link aggregation group. The operational conversation service mapping digest of the network device may be stored in a digest such as Actor_Conversation_Service_Mapping_Digest, and the operational conversation service mapping digest of the partner network device may be stored in a digest such as Partner_Conversation_Service_Mapping_Digest. The network device compares the two digests and determines whether they are consistent. For example, a function such as Differ_Conversation_Service_Digests may be used, which returns a Boolean indicating whether the conversation service mapping digests used by the network device and the partner network device at the two ends of the same link aggregation group are the same. If the two digests are inconsistent, a notification is optionally sent so that the operator of the link aggregation group can resolve the discrepancy.

Note that in some embodiments of the invention, the order of the determinations in blocks 604-606 may differ from that illustrated in FIG. 6. In addition, some embodiments of the invention may deploy more or fewer compatibility checks than illustrated.
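The three compatibility checks of blocks 604-606 can be sketched as a single comparison routine. The field names mirror the Actor_ and Partner_ variables named in the description, while the LagParameters container, the returned list of mismatch names, and the check order are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class LagParameters:
    """Operational parameters held for one end of a link aggregation group."""
    port_algorithm: int
    conversation_port_list_digest: bytes
    conversation_service_mapping_digest: bytes


def compatibility_mismatches(actor: LagParameters, partner: LagParameters) -> list:
    """Return the names of the checks that fail; an empty list means compatible."""
    checks = [
        ("port algorithm", actor.port_algorithm, partner.port_algorithm),
        ("conversation ID digest",
         actor.conversation_port_list_digest,
         partner.conversation_port_list_digest),
        ("conversation service mapping digest",
         actor.conversation_service_mapping_digest,
         partner.conversation_service_mapping_digest),
    ]
    # The order of the checks is not significant, consistent with the note above.
    return [name for name, local, remote in checks if local != remote]
```

Returning the names of the failed checks, rather than a single Boolean, gives the optional operator notification something specific to report.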

Once it is determined that the operational parameters of the network device and the partner network device of the same link aggregation group (some of which may be regarded as administrative parameters) are compatible, and both declare long LACPDUs (which may also be called version 2 LACPDUs), processing of conversation-sensitive information received in long LACPDUs becomes possible. Each long LACPDU is more than 128 octets long. As discussed above, enhanced LACPDUs are needed to update conversation allocation information, because traditional LACPDUs may support only up to 1024 conversations. The long LACPDU is one embodiment of the enhanced LACPDU, and other embodiments of the enhanced LACPDU are feasible for supporting the disclosed invention. In its general form, an enhanced LACPDU can carry the control information needed to exchange conversation allocation information over the links of the link aggregation group between the local network device and the partner network device. Some embodiments may not use long LACPDUs, for example when the LACP implementation supports no more than 1024 conversations. In other embodiments, long LACPDUs are used. Because each long LACPDU is longer than 128 octets, it can support more conversations than a traditional 128-octet LACPDU. For example, a long LACPDU can carry the conversation mask TLV illustrated in FIG. 4A, which can indicate the conversation allocation state of up to 4096 conversations. Long LACPDUs take more network resources to process and transmit, and always allowing their transmission may not be efficient. Thus, block 608 may set a timer to provide a time window in which the network device transmits long LACPDUs. Once the timer expires, the network device no longer transmits long LACPDUs, and the process ends without updating the conversation allocation. With the timer for long LACPDUs set, the network device has determined that operation through enhanced LACPDUs (long LACPDUs in this embodiment of the invention) is possible, as described for block 303 of FIG. 3. Similar to block 305 of FIG. 3, process 600 then flows to blocks 608-622 and updates the conversation state of the aggregation port.

Referring to FIG. 6, at block 608 the network device receives one or more long LACPDUs from the partner network device indicating a different operational conversation allocation state at the partner network device. The operational conversation allocation state at the partner network device is the operational conversation mask of the partner network device conveyed by the received long LACPDU. The received long LACPDU may contain the operational conversation allocation state embedded in a single conversation mask TLV. In another embodiment, the operational conversation allocation state of the partner network device is embedded in multiple conversation mask TLVs (such as conversation mask-1 through mask-3 illustrated in FIGS. 12A-C).

In one embodiment with multiple conversation mask TLVs, a function (such as recordReceivedConversationMaskTLV) is executed. The function records the parameter value of ActPar_Sync carried in the received port conversation mask-1 TLV as the current operational parameter value of Partner_ActPar_Sync. It concatenates the values of Port_Oper_Conversation_Mask_1, Port_Oper_Conversation_Mask_2, and Port_Oper_Conversation_Mask_3 carried in the port conversation mask-1 TLV, port conversation mask-2 TLV, and port conversation mask-3 TLV, respectively, and records the concatenation as the current value of the partner operational conversation mask variable. When the operational conversation allocation states at the partner network device and the local network device are compared at block 616, the function compares the port operational conversation mask variable with the partner operational conversation mask variable.
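A minimal sketch of the recording step follows. It takes the already-extracted field values rather than parsing raw TLVs; the Python function name, its return shape, and the default position of the ActPar_Sync bit are assumptions and not taken from the description.

```python
def record_received_conversation_mask_tlvs(state_octet: int,
                                           mask_1: bytes,
                                           mask_2: bytes,
                                           mask_3: bytes,
                                           sync_bit: int = 0x80):
    """Return (Partner_ActPar_Sync, partner operational conversation mask).

    state_octet is the conversation mask state from the mask-1 TLV.
    sync_bit is an assumed bit position for ActPar_Sync.
    """
    partner_actpar_sync = bool(state_octet & sync_bit)
    # Concatenate the three partial masks in TLV order: mask-1, mask-2, mask-3.
    partner_oper_conversation_mask = bytes(mask_1) + bytes(mask_2) + bytes(mask_3)
    return partner_actpar_sync, partner_oper_conversation_mask
```

The concatenated result can then be compared octet for octet with the local port operational conversation mask.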

Alternatively, the network device may not receive a long LACPDU but instead, at block 612, detect a change in the operational state or administrative configuration of the link aggregation group at a port. The network device may hold a variable for each port of the aggregation group to track changes in the operational state of that port. For example, the network device may maintain a ChangeActorOperDist variable for each port, which is set to true when the frame distribution state changes. A variable ChangeAggregationPorts may be expressed as the logical OR of the ChangeActorOperDist variables of all aggregation ports. The per-port ChangeActorOperDist variable may also track administrative configuration changes. For example, the variable may be set to true if a new administrative value is detected for the aggregation port selection priority list tracked by aAggConversationAdminPort[] (which contains the administrative values of the aggregation port selection priority list for the referenced port conversation ID) or for aAggAdminServiceConversationMap[] (which contains the sets of service IDs). Thus, also at block 612, the network device updates its operational conversation allocation state. In one embodiment, the update is performed by updating its operational conversation mask. In both cases, at block 616, the network device updates the collection conversation mask of the port. In one embodiment, the collection conversation mask is an operational Boolean vector. It is indexed by port conversation ID and indicates whether the indexed port conversation ID is allowed to reach the aggregator when received through the port. The network device then checks whether its operational conversation mask matches the mask used by the partner network device. In one embodiment, the verification is performed by checking the Partner_Oper_Conversation_Mask variable at the network device.
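The aggregate change indication can be sketched in a few lines. The variable names follow the description; representing the per-port flags as a dictionary keyed by port number is an assumption.

```python
def change_aggregation_ports(change_actor_oper_dist: dict) -> bool:
    """Logical OR of the per-port ChangeActorOperDist flags.

    change_actor_oper_dist maps a port number to that port's flag.
    """
    return any(change_actor_oper_dist.values())
```

For example, change_aggregation_ports({1: False, 2: True}) evaluates to True, so a change on any single port is enough to trigger an update.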

In one embodiment, the network device sets the collection conversation mask of the port differently depending on whether the conversation masks of all aggregation ports in the network device (including intra-portal ports (IPPs) in the portal case) have been updated. If all conversation masks on all ports have been updated, the network device sets the collection conversation mask of the port equal to the updated port operational conversation mask (which may be obtained from the current conversation port list through an update function such as updateConversationMask). If the update of the conversation masks of other ports in the network device is still in progress, the network device sets the collection conversation mask of the port equal to the Boolean vector resulting from a logical AND (for example, through the updateConversationMask function) between the current collection conversation mask and the updated port operational conversation mask.
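The two cases for setting the collection conversation mask can be sketched as follows, with masks held as byte strings of equal length. The function name and the Boolean parameter are hypothetical; only the two rules (assign the updated mask, or AND it with the current collection mask) come from the description.

```python
def update_collection_mask(current_collection: bytes,
                           updated_oper: bytes,
                           all_ports_updated: bool) -> bytes:
    """Return the new collection conversation mask for one port."""
    if len(current_collection) != len(updated_oper):
        raise ValueError("masks must have the same length")
    if all_ports_updated:
        # Every port has its new mask: collect exactly what is distributed.
        return bytes(updated_oper)
    # Other ports are still updating: accept only conversations allowed
    # by both the current collection mask and the updated mask.
    return bytes(a & b for a, b in zip(current_collection, updated_oper))
```

The AND in the transitional case can only clear bits, never set them, so during the update a conversation is not newly admitted on this port while another port may still be collecting it.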

The network device indicates that the collection conversation mask is not synchronized with the distribution conversation mask (using the ActPar_Sync bit of the conversation mask state field of the conversation mask TLV, for example as illustrated in FIG. 4B). As discussed above, the network device may hold a variable for each port of the aggregation group to track changes in the operational state of that port, such as a per-port ChangeActorOperDist variable that tracks the operational ports of the network device that distribute frames. The network device sets this variable to "false" to indicate that there is no frame distribution state change.

When the operational conversation mask of the port matches the operational conversation mask of the associated port at the partner network device, the process goes to block 622, and because both network devices (partners) have the same operational conversation mask, the sending of long LACPDUs stops. When the operational conversation mask of the port does not match the operational conversation mask of the associated port at the partner network device, the process goes to block 617, where the lack of synchronization is detected.

Then, at block 618, the network device sets a timer for sending an updated long LACPDU to the remote network device. When the conversation masks are not synchronized, it sets an update-local setting to "true" (e.g., using updateLocal to indicate that the local conversation mask does not need to be updated).

Embodiments of Updating the Conversation Mask

FIG. 7 is a flowchart illustrating updating the conversation mask of an aggregation port upon receiving a long LACPDU according to one embodiment of the invention. Method 700 may be implemented in an aggregation controller of a network device. At block 705 in FIG. 7, a long LACPDU containing a different conversation allocation state at the partner network device is received at the network device. The network device determines whether the partner network device has sent a different conversation mask, that is, whether the long LACPDU contains a conversation mask of the partner network device that differs from the conversation mask at the port, by checking a partner operational conversation mask variable such as Partner_Oper_Conversation_Mask. The partner operational conversation mask variable is a variable associated with each link aggregation port. In one embodiment, the variable is stored in a storage device within the network device.

In one embodiment, the partner operational conversation mask is conveyed through a conversation mask TLV, as illustrated in FIG. 4A. When a new conversation mask TLV is received, the partner operational conversation mask embedded in it updates the partner operational conversation mask variable of the aggregation port. The partner operational conversation mask variable of the link aggregation port is thereby synchronized with the conversation mask of the partner network device operating on the port of the link aggregation group. The network device compares the partner operational conversation mask variable with the port operational conversation mask, and a difference triggers block 616.

Referring to FIG. 7, at block 707, the collection conversation mask of the aggregation port is updated based on an update conversation mask function (e.g., the updateConversationMask function).

At block 708, the network device sets an indication that the conversation mask used at the network device differs from the conversation mask used at the partner network device. In one embodiment, a conversation mask state value (such as the ActPar_Sync bit at reference 410 of FIG. 4B) may be set to indicate the difference.

Although FIG. 7 illustrates an order of operations, the order may differ in other embodiments of the invention; for example, blocks 707-708 may be ordered differently in another embodiment.

Note that although aggregation ports are used in the discussion of FIG. 7, method 700 may also be implemented for a portal of a distributed resilient network interconnect (DRNI) system whose network devices also implement aggregation ports, as shown in FIG. 1B.

FIGS. 8A-D illustrate a sequence of updating the conversation mask of an aggregation port according to one embodiment of the invention. Each figure includes the partner conversation mask variable associated with the aggregation port, together with the values of the collection mask, the distribution mask, and the conversation mask state associated with the aggregation port. In FIG. 8A, the aggregation port operates in a normal state. The collection conversation mask and the distribution conversation mask are identical, both 01010101…00. In this example, the link aggregation group supports up to 4096 conversations, so the collection conversation mask and the distribution conversation mask each contain 4096 bits (512 octets). For simplicity of illustration, only the first 10 bits and the last 2 bits of the masks are shown, so the discussion focuses on conversations 0 to 9 (conversation IDs 0 to 9) and conversations 4094 and 4095 (conversation IDs 4094 and 4095). As illustrated, the aggregation port handles conversations 1, 3, 5, and 7: the port distributes and collects frames of the same conversations 1, 3, 5, and 7. The partner conversation mask variable is the same as the collection conversation mask and the distribution conversation mask, and it indicates that the matching port of the link aggregation group at the remote network device carries conversations 1, 3, 5, and 7. The conversation mask state therefore indicates, by setting the ActPar_Sync bit to 1, that the collection conversation mask and the distribution conversation mask are encoded the same as the partner conversation mask, so the conversation mask state is 10000000.

In FIG. 8B, an anomaly occurs in the link aggregation group, and the partner conversation mask variable is updated to a different value. The triggering event may be a link failure, a failure of a link aggregation system of a portal, or some other event. The anomaly may trigger transmission of one or more enhanced LACPDUs, such as long LACPDUs, which are received at the network device. An embedded TLV (such as the conversation mask TLV 400 illustrated in FIG. 4A) is used to update the partner conversation mask variable associated with the aggregation port. The changed bit values of the partner conversation mask variable are highlighted by underlining, and the same notation applies to FIGS. 8C-D. The partner conversation mask variable now indicates that the partner network device transmits conversations 0-3 to the aggregation port: it no longer transmits conversations 5 and 7, and it has added conversations 0 and 2.

The network device then stores the partner conversation mask variable and, as before, keeps the aggregation port collecting and distributing frames of conversations 1, 3, 5, and 7. Because the conversation mask used by the local network device (the actor) now differs from that of the remote system (the partner), the conversation mask state indicated by the ActPar_Sync bit is reset to 0, and the variable updateLocal is set to 1 to indicate that the local conversation mask needs to be recalculated.

In FIG. 8C, a long LACPDU has arrived and, if not all ports on the local network device have been updated to match the same conditions as the partner, the collection conversation mask and the distribution conversation mask are updated through a logical AND operation between the current collection conversation mask and the updated port operational conversation mask (e.g., through an update operation such as executing the updateConversationMask function). The collection conversation mask and the distribution conversation mask are thus updated to 01010000..00 (i.e., the port collects frames only from the common conversations 1 and 3). Then, in FIG. 8D, all ports on the local network device have been updated to match the same conditions as the partner and, accordingly, are reported with the same Actor_Oper_Port_State.Distributing value. If the connected port on the remote partner is down, it sets the ActPar_Sync bit to 1, indicating that the partner port at the partner network device has completed synchronization of the collection conversation mask with the distribution conversation mask. The collection conversation mask can then be set equal to the distribution conversation mask, and only frames of conversations 0-3 are collected, following the partner conversation mask variable.
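The mask values in FIGS. 8A-D can be reproduced with a short worked example. It is a sketch only: it tracks just the first 10 bits of the 4096-bit masks, assumes that the updated port operational conversation mask equals the partner mask 1111000000 (conversations 0-3), and uses helper names (`bits`, `show`, `conversations`) invented for the illustration.

```python
def bits(text):
    """Parse a string such as "0101" into a list of booleans."""
    return [ch == "1" for ch in text]

def show(mask):
    """Render a mask back into a string of 0s and 1s."""
    return "".join("1" if b else "0" for b in mask)

def conversations(mask):
    """List the conversation IDs whose bit is set."""
    return [i for i, b in enumerate(mask) if b]

collection_8a = bits("0101010100")   # FIG. 8A: conversations 1, 3, 5, 7
partner_8b = bits("1111000000")      # FIG. 8B: partner now sends 0-3

# FIG. 8C: other ports are still updating, so take the logical AND.
collection_8c = [c and p for c, p in zip(collection_8a, partner_8b)]

# FIG. 8D: all ports updated, so adopt the updated mask in full.
collection_8d = list(partner_8b)

print(show(collection_8c), conversations(collection_8c))  # 0101000000 [1, 3]
print(show(collection_8d), conversations(collection_8d))  # 1111000000 [0, 1, 2, 3]
```

The intermediate mask in FIG. 8C never contains a conversation that is absent from either the old or the new allocation, which is what allows collection to continue safely while the remaining ports are updated.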

FIG. 9 is a flowchart of one embodiment of a process for conversation-sensitive collection for a link aggregation group. The illustrated process is implemented in conjunction with the frame collection process. That is, the process concerns the handling of frames containing regular data traffic, as opposed to the handling of LACPDUs discussed herein above. As also described herein above, the frame collection process receives frames from the aggregation ports and collects them based on the port algorithm utilized by the frame distributor of the partner system. Where conversation-sensitive collection and distribution are implemented, the illustrated process enforces the conversation allocation of each aggregation port. Conversations are allocated to specific ports, so that frames of a given conversation arriving at a non-allocated aggregation port are out of order, as a result of a reallocation of the conversation to another aggregation port or a similar issue.

The process may be initiated in response to receiving a frame on a link in a link aggregation group associated with the network device performing the process (block 901). The network devices communicating over the link aggregation group may be part of a DRNI portal or a similar network configuration. The received frame may be in any type of communication format, such as an Ethernet frame or a similar communication unit. The frame may be received via an aggregation port and passed to the frame collector of the network device. In one embodiment, conversation-sensitive frame collection can be enabled and disabled through a management function or configuration. In other embodiments, conversation-sensitive frame collection is always implemented. Where conversation-sensitive frame collection is configurable, the frame collector may check whether it is currently enabled (block 903). If conversation-sensitive frame collection is not enabled, the received frame is forwarded to the aggregator client (block 905). The frame collector organizes the frames received from all aggregator ports according to the aggregation algorithm or distribution process employed by the partner system.

Where conversation-sensitive collection is enabled, a conversation identifier of the received frame may be determined (block 907). The conversation identifier may be determined using any process or technique that utilizes information within the received frame, provided that the frame distributor and the frame collector use the same process or technique so that both deterministically arrive at the same conversation identifier. In one example implementation, a service identifier is extracted from the received frame. The service identifier may be any field or combination of fields in the received frame, such as a virtual local area network (VLAN) identifier (VID) field or a backbone service instance identifier (I-SID). The service identifier may then be translated into a conversation identifier. The translation may use any local data structure, such as a lookup table, a mapping array, or a similar data structure, to map service identifiers to conversation identifiers.

The resulting conversation identifier may then be compared with a conversation mask or a similar data structure that tracks the conversations allocated to a particular aggregation port (block 911). Where a match is found, the received frame is part of a conversation that has been allocated to the aggregation port on which it was received and is therefore in proper order, and the frame collector may pass the frame to the aggregator client. If no match is found in the conversation mask or similar tracking structure, however, the received frame has been received out of order on the wrong aggregation port and is discarded (block 913).

FIG. 10 is a flowchart of another embodiment of a process for conversation-sensitive collection for a link aggregation group. This embodiment provides an example implementation of the process described above with respect to FIG. 9. The initiation in response to receiving a frame (block 901) may take the form of receiving a frame pointer or identifier from the MAC of an aggregation port of the network device, where the aggregation port is associated with a link aggregation group (block 1001). The link aggregation group may be defined between two partner systems that are aggregation systems and DRNI portals. The frame may be stored in a network processor or in any memory device, buffer, cache, register, or similar storage location within the network device. The pointer or similar identifier may provide location information for accessing the frame.

The received frame may be in any type of communication format, such as an Ethernet frame or a similar communication unit. The frame may be received via an aggregation port and passed via the control parser/multiplexer and the aggregator parser/multiplexer to the frame collector of the network device, where the frame collector is a subcomponent of the aggregator of the link aggregation sublayer executed by the network processor of the network device. In one embodiment, conversation-sensitive frame collection can be enabled and disabled through a management function or configuration. In other embodiments, conversation-sensitive frame collection is always implemented. Where conversation-sensitive frame collection is configurable, the frame collector may check whether it is currently enabled (block 903) by checking whether a flag or similar status indicator (e.g., an "enable error conversation drop" flag) is set in the configuration of the aggregator or a similar location (block 1003). If conversation-sensitive frame collection is not enabled, the received frame, frame pointer, or similar frame identifier is forwarded to the aggregator client (blocks 905, 1005). The frame collector collects the frames received from all aggregator ports according to the aggregation algorithm or distribution process employed by the partner system.

The frame is processed to determine the associated conversation identifier (block 907) using any function (e.g., a DeterminePortConversationID function) that relies on a deterministic process shared between the frame collector and the frame distributor. In one example embodiment, such a function may determine the conversation identifier by accessing the frame to extract a service ID, where the frame content and format are first examined to determine the service ID format and location by comparing the frame header information with the frame conversation assignment configuration information (block 1007). Depending on the frame format, the frame format and configuration information may indicate that the service ID takes the form of a 12-bit VID field, a 24-bit I-SID field, a similar field, or a combination thereof. The configuration may specify that any field or set of fields is used as the service ID of the received frame. The process then continues by retrieving the service ID from the frame using the service ID type and location information (block 1009). For example, the frame pointer and the location information may take the form of an address and an offset, respectively, enabling the frame collector to access and retrieve the value at the specified location.
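As one concrete case of retrieving a service ID at a known offset, the sketch below reads the 12-bit VID from a single-tagged Ethernet frame. It assumes a customer VLAN tag with TPID 0x8100 immediately after the two 6-octet MAC addresses; the function name `extract_vid` is hypothetical, and I-SID extraction and stacked tags are deliberately left out.

```python
import struct

TPID_8021Q = 0x8100  # tag protocol identifier of an IEEE 802.1Q C-VLAN tag

def extract_vid(frame: bytes):
    """Return the 12-bit VID of a single-tagged frame, or None if untagged."""
    if len(frame) < 16:
        # Too short to hold two MAC addresses plus a 4-octet VLAN tag.
        return None
    # Offset 12 skips destination and source MAC (6 octets each).
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        return None
    # The low 12 bits of the tag control information are the VID.
    return tci & 0x0FFF
```

The fixed offset (12) and the mask (0x0FFF) play the role of the "location information" and "service ID format" that the configuration would supply in the process described above.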

The retrieved service ID may then be used to obtain (i.e., be translated into) the corresponding conversation identifier (block 909). The translation may take the form of a table lookup using a conversation service mapping table (i.e., an aAggAdminServiceConversationMap[] array, which is indexed by conversation identifier and stores service IDs). The lookup may use the service ID as an index, may traverse the data structure to match the service ID, or may perform a similar lookup operation on the conversation service mapping digest. The lookup operation returns the corresponding conversation identifier of the received frame.
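Because the mapping array is indexed by conversation identifier while the collector starts from a service ID, one option is to build a reverse index once and consult it per frame. The sketch below assumes each array entry is a list of service IDs and that the first conversation listing a service ID wins; both choices, and the name `build_service_index`, are assumptions of the example rather than behavior stated in the text.

```python
def build_service_index(conversation_map):
    """Invert a conversation-indexed map into a service-ID-indexed dict.

    conversation_map[conv_id] is assumed to be a list of service IDs.
    """
    index = {}
    for conv_id, service_ids in enumerate(conversation_map):
        for service_id in service_ids:
            # setdefault keeps the first conversation that lists this ID.
            index.setdefault(service_id, conv_id)
    return index

# Hypothetical map: conversation 1 carries VIDs 100 and 101, conversation 2
# carries VID 200, conversation 0 carries nothing.
service_index = build_service_index([[], [100, 101], [200]])
print(service_index.get(101))   # 1
print(service_index.get(999))   # None (no conversation for this service ID)
```

Traversing the array on every frame would give the same answer; the dictionary simply trades memory for a constant-time lookup.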

A check may then be made as to whether the received frame has a conversation identifier of a conversation that has been allocated to the aggregation port on which it was received (block 911). This check may be resolved by accessing the conversation mask of the aggregation port through which the frame was received, where the conversation mask is a bitmap or similar data structure used to track the conversations allocated to the aggregation port (block 1013). If the bit corresponding to the conversation identifier is set to Boolean "true", the frame is associated with a conversation properly allocated to the aggregation port and may be forwarded to the aggregator client (block 1005). If the corresponding bit in the conversation mask is set to Boolean "false", however, the frame is discarded (blocks 913, 1015), because it is associated with a conversation not allocated to the aggregation port through which it was received, indicating that it was sent erroneously or out of order due to a reallocation process or a similar change.
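The forward-or-discard decision of blocks 903 through 913 can be sketched as a single function. The conversation mask is modeled as a list of Booleans indexed by conversation ID, and the function and parameter names (`collect_frame`, `conversation_sensitive`) are hypothetical; treating an out-of-range conversation ID as a discard is also an assumption.

```python
def collect_frame(conversation_id, port_mask, conversation_sensitive=True):
    """Decide whether the frame collector forwards or discards a frame.

    port_mask[i] is True when conversation i is allocated to the port on
    which the frame arrived.
    """
    if not conversation_sensitive:
        # Conversation-sensitive collection disabled: always forward.
        return "forward"
    in_range = 0 <= conversation_id < len(port_mask)
    if in_range and port_mask[conversation_id]:
        # Bit is "true": the conversation belongs to this port.
        return "forward"
    # Bit is "false" (or ID unknown): wrong port, so drop the frame.
    return "discard"
```

For a port whose mask allocates conversations 1 and 3, a frame of conversation 3 is forwarded to the aggregator client, while a frame of conversation 5 is dropped unless conversation-sensitive collection is disabled.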

FIG. 11 is a diagram of one embodiment of a network device implementing conversation-sensitive collection for a link aggregation group in a network. The network device may process conversations, where each conversation is for a service or an application in the network. The network device 1180 may implement the link aggregation sublayer 1170 described herein above with respect to FIG. 2 and support the link aggregation functions described herein above. The network device 1180 may include a network processor 1100, a set of ports 1140, a storage device 1150, and similar network device components. The components of the network device are provided by way of example and not limitation. The network device 1180 may implement the aggregation functions and the link aggregation sublayer 1170 using any number or type of processors and in any configuration. In other embodiments, the aggregation functions, the link aggregation sublayer, and the related components are distributed over a set of network processors, a set of line cards, and their constituent general-purpose and application-specific processors, or are similarly implemented in the network device architecture.

The ports 1140 may connect the network device to any number of other network devices via a physical medium such as Ethernet, optical fiber, or a similar medium. Any number or variety of ports may be present in the network device 1180. Any combination or subset of the ports 1140 may be organized and managed as a link aggregation group or a DRNI portal, where the network device acts as an aggregation system.

The set of storage devices 1150 within the network device 1180 may be any type of memory device, cache, register, or similar storage device serving as working memory and/or persistent storage. Any number and variety of storage devices 1150 may be used to store the data of the network device, including programming data and received data traffic to be processed by the network device 1180. In one embodiment, a digest database 1152 or a similar organization of the conversation service mapping digests, the conversation allocation states listing the conversations transmitted through the aggregation ports, and similar data structures described herein above may be stored in such data structures. Other data structures stored in the storage device 1150 may include aAggAdminServiceConversationMap[] and similar data structures. In other embodiments, these data structures may be conceived as independent and may be distributed over any number of separate storage devices 1150 within the network device 1180.

The set of network processors 1100 may implement the aggregation functions and the link aggregation sublayer 1170 described herein above. The aggregation functions may include the aggregator client 1172 and the link aggregation sublayer 1170, which may include a control parser/multiplexer 1102, an aggregation controller 1106, a frame collector 1125, a frame distributor 1120, and a client interface 1111. As further described herein above, the aggregator client 1172 may provide higher-level functions of the network device, such as layer 3 functions and similar higher-level functions.

The aggregation controller 1106, further described herein above, may implement the link aggregation control and link aggregation control protocol functions. These functions manage the configuration and allocation of link aggregation groups, DRNI portals, and similar aspects. The control parser/multiplexer 1102 identifies LACPDUs among the other data traffic received on the aggregation ports and forwards them, sending the LACPDUs to the aggregation controller 1106 and the other data traffic to the link aggregation sublayer 1170.

The link aggregation sublayer 1170, further described herein above, manages the collection and distribution of frames according to a distribution algorithm. Within the link aggregation sublayer 1170, the frame collector 1125 receives frames and organizes them according to the distribution algorithm shared with the partner system over the link aggregation group. The frame distributor 1120 prepares and selects outbound frames for transmission over the set of aggregation ports according to the distribution algorithm. The client interface 1111 receives frames from, and transmits frames to, the aggregator client 1172. Inbound frames are passed from the frame collector 1125 to the aggregator client 1172, and outbound frames are passed from the frame distributor 1120 to the aggregator client 1172.

As discussed herein above with respect to conversation-sensitive collection for a link aggregation group, the frame collector 1125 is configured to determine the conversation identifier of a received frame (e.g., using a DetermineConversationID function which, in one example embodiment, extracts a service identifier from the frame and translates the service identifier into a conversation identifier, although any deterministic process shared between the frame collector and the frame distributor may be utilized), to compare the conversation identifier with the port conversation allocation, to discard the frame in response to a mismatch between the conversation identifier and the port conversation allocation, and to forward the frame to the aggregator client in response to a match between the conversation identifier and the port conversation allocation. Furthermore, in one example embodiment, the frame collector 1125 may check whether conversation-sensitive collection is enabled; may receive a frame pointer from an aggregation port associated with the link aggregation group; may extract the service identifier from the frame by comparing the frame header information with the frame conversation assignment configuration to determine the service identifier format and location and retrieving the service identifier from the frame at the determined location; may translate the service identifier into a conversation identifier by looking up the service identifier in the conversation service mapping digest to obtain the conversation identifier; may compare the conversation identifier with the port conversation allocation by accessing the conversation mask of the aggregation port using the conversation identifier as an index; and may discard the frame in response to finding a Boolean "false" at the location in the conversation mask identified by using the conversation identifier as an index.

In one embodiment, the aggregation controller 1106 verifies that an implementation of a conversation-sensitive link aggregation control protocol (LACP) is operational. The verification is performed by the aggregation controller 1106 initializing the implementation of LACP and then performing at least one of the following: (1) receiving an identifier of the algorithm used to assign frames to port conversation identifiers at the partner network device; (2) receiving a conversation identifier digest from the partner network device; and (3) receiving a conversation service mapping digest from the partner network device. The received parameters may be stored in the storage device 1150 (e.g., in the digest database 1152).

After verifying that the implementation of LACP is operational, the aggregation controller 1106 then determines whether operation through enhanced LACPDUs is possible. As discussed herein above, enhanced LACPDUs can be used to update conversation allocation information, and the determination is based on a compatibility check between a set of operational parameters of the network device 1180 and another matching set of operational parameters of the partner network device of the network device 1180. The partner network device is the remote network device at the other end of the link aggregation group of the network device 1180. In one embodiment, the enhanced LACPDUs are long LACPDUs, meaning that they are more than 128 octets in length.

In one embodiment, the compatibility check includes (1) determining that a first algorithm used to assign frames to port conversation identifiers at the network device is consistent with a second algorithm, received from the partner network device, used to assign frames to port conversation identifiers; (2) determining that a first conversation identifier digest of the network device is consistent with a second conversation identifier digest received from the partner network device; and (3) determining that a first conversation service mapping digest is consistent with a second conversation service mapping digest received from the partner network device. If the compatibility check passes, the aggregation controller 1106 processes the received collection-sensitive information and sets a timer providing a time window for the transmission of enhanced LACPDUs. If the timer expires and no enhanced LACPDU has been received, default configuration parameters are set for the partner, and another verification/compatibility check cycle needs to be initiated.
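The three-part compatibility check can be sketched as a comparison of two parameter sets. This is illustrative only: "consistent" is interpreted here as byte-for-byte equality, and the class name, field names, and sample values are hypothetical rather than taken from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConversationParams:
    """Operational parameters exchanged for the check (names assumed)."""
    port_algorithm: bytes          # identifier of the frame-to-ID algorithm
    conversation_id_digest: bytes  # digest of the conversation ID table
    service_map_digest: bytes      # digest of the conversation service map

def is_compatible(local: ConversationParams, partner: ConversationParams) -> bool:
    """Return True only if all three parameters agree on both sides."""
    return (
        local.port_algorithm == partner.port_algorithm
        and local.conversation_id_digest == partner.conversation_id_digest
        and local.service_map_digest == partner.service_map_digest
    )

local = ConversationParams(b"\x00\x01", b"digest-a", b"digest-b")
same = ConversationParams(b"\x00\x01", b"digest-a", b"digest-b")
other = ConversationParams(b"\x00\x01", b"digest-a", b"digest-x")
print(is_compatible(local, same))    # True
print(is_compatible(local, other))   # False
```

Comparing digests rather than the full tables keeps the check cheap, since only short fixed-size values need to be carried in the protocol exchange.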

If the compatibility check fails, enhanced LACPDUs cannot be used and manual intervention may be required; the aggregation controller 1106 may therefore optionally send out a notification indicating that the compatibility check failed.

When the compatibility check passes, aggregation controller 1106 may be configured to update the conversation allocation state of an aggregation port of the link aggregation group based on a determination that the conversation allocation state is incorrect. In one embodiment, the conversation allocation state of the aggregation port is represented by a conversation mask of the aggregation port. The conversation mask may be represented by a conversation mask type/length/value (TLV), which contains (1) a TLV type field; (2) a conversation mask length field; (3) a conversation mask state field; and (4) a port operational conversation mask field. The structure of each field has been discussed above. Note that the conversation mask may be represented by one or more conversation mask TLVs, as illustrated in FIGS. 4A-C and 12A-C and discussed above.
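As a rough illustration of the four-field conversation mask TLV described above, the following Python sketch packs and unpacks one TLV. The type code, the position of the synchronization bit in the state field, the one-octet field widths, the 128-octet mask size, and the convention that the length covers the whole TLV are all assumptions chosen for the example, not values taken from this description.

```python
import struct

CONVERSATION_MASK_TLV_TYPE = 0x04  # hypothetical type code
STATE_IN_SYNC_BIT = 0x80           # hypothetical position of the "masks agree" bit


def encode_conversation_mask_tlv(conversation_ids, mask_octets=128, in_sync=True):
    """Build type | length | state | mask, with one mask bit per conversation ID."""
    if not 1 <= mask_octets <= 252:
        raise ValueError("mask must fit in a TLV with a one-octet length field")
    mask = bytearray(mask_octets)
    for cid in conversation_ids:
        if not 0 <= cid < mask_octets * 8:
            raise ValueError(f"conversation ID {cid} does not fit in this mask")
        mask[cid // 8] |= 0x80 >> (cid % 8)  # most significant bit first
    state = STATE_IN_SYNC_BIT if in_sync else 0
    length = 2 + 1 + mask_octets  # assumed to count the type and length octets too
    return struct.pack("!BBB", CONVERSATION_MASK_TLV_TYPE, length, state) + bytes(mask)


def decode_conversation_mask_tlv(data):
    """Return (type, in_sync, sorted list of conversation IDs set in the mask)."""
    tlv_type, length, state = struct.unpack_from("!BBB", data)
    mask = data[3:length]
    ids = [i * 8 + bit
           for i, octet in enumerate(mask)
           for bit in range(8)
           if octet & (0x80 >> bit)]
    return tlv_type, bool(state & STATE_IN_SYNC_BIT), ids
```

A mask covering more conversation identifiers than fit in one TLV would, under these assumptions, be split across several TLVs, in line with the statement that one or more conversation mask TLVs may be used.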

Thus, in some embodiments, the aggregation controller may be regarded as comprising: a verification unit for verifying that an implementation of a conversation-sensitive link aggregation control protocol (LACP) is operational; a determination unit for determining that operation via enhanced link aggregation control protocol data units (LACPDUs) is possible, wherein the enhanced LACPDUs can be used to update conversation allocation information, wherein the determination is based on a compatibility check between a first set of operating parameters of the network device and a second set of operating parameters of a partner network device, and wherein the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device; and an updating unit for updating a first conversation allocation state of an aggregation port of the link aggregation group based on a determination that the first conversation allocation state is incorrect, wherein the first conversation allocation state of the aggregation port indicates a first list of conversations transmitted through the aggregation port.

Updating the conversation allocation state may be based on a determination that a first conversation allocation state of an aggregation port of the link aggregation group at the network device differs from a second conversation allocation state of the aggregation port received from the partner network device, wherein the second conversation allocation state indicates a second list of conversations received through the link aggregation group. Alternatively, updating the conversation allocation state may be based on detecting a change in the operational state of a neighboring aggregation port of the aggregation group at the network device. Note that the network device may set a timer to provide a time window in which the network device transmits long LACPDUs. Once the timer expires, the network device is prohibited from transmitting enhanced LACPDUs (for example, the long LACPDUs discussed above), and the process of updating the conversation allocation ends. Where a timer for long LACPDUs is set, the network device first determines that operation using enhanced LACPDUs is possible, as described in block 303 of FIG. 3.
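The two triggers for an update and the transmit window for long LACPDUs can be sketched as follows. This is only an illustrative Python outline: the function and class names are hypothetical, masks are modeled as byte strings, and the window duration is whatever the caller chooses, since the description does not fix a value.

```python
import time


def allocation_state_incorrect(local_mask: bytes, partner_mask: bytes,
                               neighbor_port_state_changed: bool) -> bool:
    """True if the local and partner views differ, or a neighboring port changed state."""
    return local_mask != partner_mask or neighbor_port_state_changed


class LongLacpduWindow:
    """Time window during which long (enhanced) LACPDUs may be transmitted."""

    def __init__(self, duration_s: float, clock=time.monotonic):
        self._clock = clock                      # injectable clock, useful for testing
        self._deadline = clock() + duration_s    # the window starts when the timer is set

    def may_transmit(self) -> bool:
        """Transmission is allowed only until the timer expires."""
        return self._clock() < self._deadline
```

Injecting the clock keeps the sketch deterministic in tests; a real implementation would more likely hook into the protocol state machine's own timers.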

For clarity, some terminology has changed between this document and the priority documents; all such changes are between equivalent terms. "Data stream", as used herein and in the priority documents, is to be understood as referring to an ordered sequence of frames, which is equivalent to a "conversation". Earlier references to a link aggregation group "level" introduced a dichotomy between a "link level" and a link aggregation group "level"; stating that a conversation identifier identifies a conversation at the link aggregation group level is equivalent to stating that a conversation identifier identifies a conversation at a given link aggregation group. Where "each frame" in a set of frames received by the network device is recited, a specific "received frame" is within that set of frames.

Although the invention has been described in terms of several example embodiments, those skilled in the art will recognize that the invention is not limited to the embodiments described and can be practiced with modifications and alterations within the spirit and scope of the appended claims. The description is therefore to be regarded as illustrative rather than limiting.

Claims (28)

1. A method implemented by a network device for updating conversation allocation on links of a link aggregation group, wherein the network device is communicatively coupled to an aggregation port via the links of the link aggregation group, wherein the network device processes conversations, and wherein each conversation consists of an ordered sequence of frames, the method comprising the steps of:
verifying (301) that an implementation of a conversation-sensitive link aggregation control protocol (LACP) is operational;
determining (303) that operation via enhanced link aggregation control protocol data units (LACPDUs) is possible, wherein the enhanced LACPDUs can be used to update conversation allocation information, wherein the determination is based at least in part on a compatibility check between a first set of operating parameters of the network device and a second set of operating parameters of a partner network device, and wherein the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device; and
updating (305) a first conversation allocation state of an aggregation port of the link aggregation group based on a determination that the first conversation allocation state is incorrect, wherein the first conversation allocation state of the aggregation port of the link aggregation group indicates a first list of conversations transmitted through the aggregation port.

2. The method of claim 1, wherein verifying that the implementation of the conversation-sensitive LACP is operational comprises the steps of:
initializing (602) the implementation of the conversation-sensitive LACP at the network device; and
receiving (603) at least one of:
an identifier of an algorithm used to assign frames to port conversation identifiers at the partner network device;
a conversation identifier digest from the partner network device; and
a conversation service mapping digest from the partner network device.

3. The method of claim 1 or 2, wherein the compatibility check between the first set of operating parameters of the network device and the second set of operating parameters of the partner network device comprises:
determining (604) that a first algorithm for assigning frames to port conversation identifiers at the network device is consistent with a second algorithm for assigning frames to port conversation identifiers received from the partner network device;
determining (605) that a first conversation identifier digest of the network device is consistent with a second conversation identifier digest received from the partner network device; and
determining (606) that a first conversation service mapping digest is consistent with a second conversation service mapping digest received from the partner network device.

4. The method of claim 1 or 2, wherein the enhanced LACPDUs are long LACPDUs, and wherein each long LACPDU is more than 128 octets in length.

5. The method of claim 1 or 2, wherein the first conversation allocation state of the aggregation port is represented by a conversation mask of the aggregation port.

6. The method of claim 5, wherein the conversation mask of the aggregation port is represented by one or more conversation mask type/length/values (TLVs).

7. The method of claim 6, wherein the conversation mask TLV comprises:
a TLV type field;
a conversation mask length field;
a conversation mask state field; and
a port operational conversation mask field.

8. The method of claim 7, wherein the conversation mask state field contains one bit indicating whether a first conversation mask of the aggregation port of the network device and a second conversation mask of a matching aggregation port of the partner network device are consistent, wherein a collection conversation mask and a distribution conversation mask respectively indicate lists of conversations collected in a collection function and distributed in a distribution function.

9. The method of claim 7, wherein the port operational conversation mask field indicates the conversation allocation state of the aggregated link.

10. The method of claim 1 or 2, wherein the determination that the first conversation allocation state is incorrect is based on at least one of:
determining that the first conversation allocation state of the aggregation port of the link aggregation group at the network device differs from a second conversation allocation state of the aggregation port received from the partner network device, wherein the second conversation allocation state indicates a second list of conversations received through the link aggregation group; and
detecting a change in an operational state or a management configuration of the aggregation group at the network device, wherein the operational state is associated with each port of the aggregation group at the network device.

11. The method of claim 1 or 2, further comprising:
setting a timeout timer after determining that operation via the enhanced LACPDUs is possible, wherein operation via the enhanced LACPDUs is prohibited after the timeout period.

12. The method of claim 1 or 2, wherein each conversation is for a service or an application in a network.

13. A network device (1180) configured to be communicatively coupled to an aggregation port via links of a link aggregation group, wherein the network device is configured to process conversations, and wherein each conversation consists of an ordered sequence of frames, the network device comprising:
a set of aggregation ports (1140) configured to receive frames on the links of the link aggregation group; and
a network device (1100), comprising:
an aggregation controller (1106) configured to verify that an implementation of a conversation-sensitive link aggregation control protocol (LACP) is operational;
the aggregation controller being further configured to determine that operation via enhanced link aggregation control protocol data units (LACPDUs) is possible, wherein the enhanced LACPDUs can be used to update conversation allocation information, wherein the determination is based on a compatibility check between a first set of operating parameters of the network device and a second set of operating parameters of a partner network device, and wherein the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device; and
the aggregation controller being further configured to update a first conversation allocation state of an aggregation port of the link aggregation group based on a determination that the first conversation allocation state is incorrect, wherein the first conversation allocation state of the aggregation port of the link aggregation group indicates a first list of conversations transmitted through the aggregation port.

14. The network device of claim 13, further comprising:
a storage device (1150) configured to store the conversation allocation state of the list of conversations transmitted through the aggregation port.

15. The network device of claim 13 or 14, wherein the network processor further comprises:
a link aggregation sublayer, comprising:
a control parser and multiplexer configured to process LACPDUs and frames from the aggregation ports of the link aggregation group and to transmit them to the aggregation ports of the link aggregation group;
a client interface configured to receive conversation-sensitive frames from the partner network device and to transmit to the partner network device;
a frame distributor configured to distribute frames from the partner network device to the aggregation ports; and
a frame collector configured to collect frames from the aggregation ports to the partner network device; and
an aggregator client configured to interact with the link aggregation sublayer to process frames.

16. The network device of claim 13 or 14, wherein the aggregation controller verifies that the implementation of the conversation-sensitive LACP is operational by:
initializing the implementation of the conversation-sensitive LACP at the network device; and
receiving at least one of:
an identifier of an algorithm used to assign frames to port conversation identifiers at the partner network device;
a conversation identifier digest from the partner network device; and
a conversation service mapping digest from the partner network device.

17. The network device of claim 13 or 14, wherein the aggregation controller performs the compatibility check between the first set of operating parameters of the network device and the second set of operating parameters of the partner network device by:
determining that a first algorithm for assigning frames to port conversation identifiers at the network device is consistent with a second algorithm for assigning frames to port conversation identifiers received from the partner network device;
determining that a first conversation identifier digest of the network device is consistent with a second conversation identifier digest received from the partner network device; and
determining that a first conversation service mapping digest is consistent with a second conversation service mapping digest received from the partner network device.

18. The network device of claim 13 or 14, wherein the enhanced LACPDUs are long LACPDUs, wherein each long LACPDU is more than 128 octets in length.

19. The network device of claim 13 or 14, wherein the first conversation allocation state of the aggregation port is represented by a conversation mask of the aggregation port.

20. The network device of claim 19, wherein the conversation mask of the aggregation port is represented by one or more conversation mask type/length/values (TLVs).

21. The network device of claim 20, wherein the conversation mask TLV comprises:
a TLV type field;
a conversation mask length field;
a conversation mask state field; and
a port operational conversation mask field.

22. The network device of claim 21, wherein the conversation mask state field contains one bit indicating whether a first conversation mask of the aggregation port of the network device and a second conversation mask of a matching aggregation port of the partner network device are consistent, wherein a collection conversation mask and a distribution conversation mask respectively indicate lists of conversations collected in a collection function and distributed in a distribution function of the aggregation port.

23. The network device of claim 21, wherein the port operational conversation mask field indicates the conversation allocation state of the aggregated link.

24. The network device of claim 13 or 14, wherein the determination that the first conversation allocation state is incorrect is based on at least one of:
determining that the first conversation allocation state of the aggregation port of the link aggregation group at the network device differs from a second conversation allocation state of the aggregation port received from the partner network device, wherein the second conversation allocation state indicates a second list of conversations received through the link aggregation group; and
detecting a change in an operational state or a management configuration of the aggregation group at the network device, wherein the operational state is associated with each port of the aggregation group at the network device.

25. The network device of claim 13 or 14, wherein the link aggregation controller sets a timeout timer after determining that operation via enhanced LACPDUs is possible, wherein operation via enhanced LACPDUs is prohibited after the timeout period.

26. The network device of claim 13 or 14, wherein each conversation is for a service or an application in a network.

27. A non-transitory computer-readable storage medium having instructions stored therein which, when executed by a processor, cause the processor to perform operations implemented by a network device for updating conversation allocation on links of a link aggregation group, wherein the network device is configured to be communicatively coupled to an aggregation port via the links of the link aggregation group, wherein the network device is configured to process conversations, and wherein each conversation consists of an ordered sequence of frames, the operations comprising the steps of:
verifying (301) that an implementation of a conversation-sensitive link aggregation control protocol (LACP) is operational;
determining (303) that operation via enhanced link aggregation control protocol data units (LACPDUs) is possible, wherein the enhanced LACPDUs can be used to update conversation allocation information, wherein the determination is based at least in part on a compatibility check between a first set of operating parameters of the network device and a second set of operating parameters of a partner network device, and wherein the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device; and
updating (305) a first conversation allocation state of an aggregation port of the link aggregation group based on a determination that the first conversation allocation state is incorrect, wherein the first conversation allocation state of the aggregation port of the link aggregation group indicates a first list of conversations transmitted through the aggregation port.

28. An apparatus implemented by a network device for updating conversation allocation on links of a link aggregation group, wherein the network device is configured to be communicatively coupled to an aggregation port via the links of the link aggregation group, wherein the network device is configured to process conversations, and wherein each conversation consists of an ordered sequence of frames, the apparatus comprising:
means for verifying (301) that an implementation of a conversation-sensitive link aggregation control protocol (LACP) is operational;
means for determining (303) that operation via enhanced link aggregation control protocol data units (LACPDUs) is possible, wherein the enhanced LACPDUs can be used to update conversation allocation information, wherein the determination is based at least in part on a compatibility check between a first set of operating parameters of the network device and a second set of operating parameters of a partner network device, and wherein the partner network device is a remote network device of the link aggregation group communicatively coupled to the network device; and
means for updating (305) a first conversation allocation state of an aggregation port of the link aggregation group based on a determination that the first conversation allocation state is incorrect, wherein the first conversation allocation state of the aggregation port of the link aggregation group indicates a first list of conversations transmitted through the aggregation port.
HK16106078.4A 2013-04-23 2014-03-07 A method and system of updating conversation allocation in link aggregation HK1218192B (en)

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
US201361815200P 2013-04-23 2013-04-23
US201361815203P 2013-04-23 2013-04-23
US61/815200 2013-04-23
US61/815203 2013-04-23
US201361865125P 2013-08-12 2013-08-12
US61/865125 2013-08-12
US14/135,556 US9553798B2 (en) 2013-04-23 2013-12-19 Method and system of updating conversation allocation in link aggregation
US14/135556 2013-12-19
PCT/SE2014/050282 WO2014175804A1 (en) 2013-04-23 2014-03-07 A method and system of updating conversation allocation in link aggregation

Publications (2)

Publication Number Publication Date
HK1218192A1 HK1218192A1 (en) 2017-02-03
HK1218192B HK1218192B (en) 2019-08-30

Similar Documents

Publication Publication Date Title
US11949599B2 (en) Method and system of implementing conversation-sensitive collection for a link aggregation group
CN105122748B (en) Method and system for implementing session-sensitive collection of link aggregation groups
US9654418B2 (en) Method and system of supporting operator commands in link aggregation group
US10819833B2 (en) Dynamic re-route in a redundant system of a packet network
DK2989759T3 (en) PROCEDURE AND SYSTEM TO SUPPORT DRCP (DISTRIBUTED RELAY CONTROL PROTOCOL) OPERATIONS BY COMMUNICATION FAILURE
US10680910B2 (en) Virtualized proactive services
US20170070416A1 (en) Method and apparatus for modifying forwarding states in a network device of a software defined network
US11006319B2 (en) 5G fixed mobile convergence user plane encapsulation
EP3304812A1 (en) Method and system for resynchronization of forwarding states in a network forwarding device
EP3732833A1 (en) Method and system for enabling broadband roaming services
HK1218192B (en) A method and system of updating conversation allocation in link aggregation
TW201448537A (en) A method and system of updating conversation allocation in link aggregation
WO2017149364A1 (en) Coordinated traffic reroute in an inter-chassis redundancy system
OA17571A (en) A method and system for supporting distributed relay control protocol (DRCP) operations upon communication failure.