
US20160080247A1 - Optimal forwarding in a network implementing a plurality of logical networking schemes - Google Patents


Info

Publication number
US20160080247A1
US20160080247A1
Authority
US
United States
Prior art keywords
network
rbridge
gateways
vxlan
logical network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/947,134
Inventor
Yibin Yang
Chiajen Tsai
Liqin Dong
Shyam Kapadia
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cisco Technology Inc
Original Assignee
Cisco Technology Inc
Application filed by Cisco Technology Inc filed Critical Cisco Technology Inc
Priority to US14/947,134
Assigned to CISCO TECHNOLOGY, INC. Assignors: DONG, LIQIN; KAPADIA, SHYAM; TSAI, CHIAJEN; YANG, YIBIN
Publication of US20160080247A1
Legal status: Abandoned

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/46: Interconnection of networks
    • H04L12/4633: Interconnection of networks using encapsulation techniques, e.g. tunneling
    • H04L12/4641: Virtual LANs, VLANs, e.g. virtual private networks [VPN]
    • H04L45/00: Routing or path finding of packets in data switching networks
    • H04L45/12: Shortest path evaluation
    • H04L45/124: Shortest path evaluation using a combination of metrics
    • H04L45/66: Layer 2 routing, e.g. in Ethernet based MAN's

Definitions

  • TRILL: Transparent Interconnect of Lots of Links
  • RBridge: a packet-forwarding device (e.g., a switch or bridge) configured to implement the TRILL protocol
  • VLAN: virtual local area network
  • FGL: fine-grained labeling
  • VxLAN: virtual extensible local area network
  • VNI: VxLAN segment ID/VxLAN network identifier
  • VTEP: VxLAN tunnel endpoint
  • SVI: switch virtual interface
  • ASIC: application-specific integrated circuit
  • A VxLAN gateway can determine the RBridge to which a VxLAN server is connected through three bindings: (1) the binding between the VxLAN server's MAC address and the VTEP's IP address (VxLAN learning), (2) the binding between the VTEP's IP address and the VTEP's MAC address (ARP), and (3) the binding between the VTEP's MAC address and the ingress RBridge's nickname (MAC learning). For example, RBridge RB12 can determine that VxLAN server 1 is connected to RBridge RB22 as follows. First, through VxLAN learning, RBridge RB12 can find the binding between VxLAN server 1's MAC address and the IP address associated with VTEP vtep1 using its VxLAN table. Next, because RBridge RB12 is in the same subnet as VTEP vtep1, RBridge RB12 can find the MAC address associated with VTEP vtep1 using its ARP table. Finally, RBridge RB12 can find the RBridge to which VxLAN server 1 is connected based on the binding between VTEP vtep1's MAC address and RBridge RB22's RBridge nickname using its MAC address table.
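  • The lookup chain can be sketched in Python as follows; the table contents and the MAC/IP values are illustrative assumptions, not the patent's actual data structures:

```python
# Minimal sketch of the three-binding lookup described above.
# All table contents are hypothetical placeholders.

vxlan_table = {"mac_vm1": "10.0.0.1"}    # (1) VxLAN learning: server MAC -> VTEP IP
arp_table = {"10.0.0.1": "mac_vtep1"}    # (2) ARP: VTEP IP -> VTEP MAC
mac_table = {"mac_vtep1": "RB22"}        # (3) MAC learning: VTEP MAC -> ingress RBridge

def rbridge_for_vxlan_server(server_mac):
    """Resolve the RBridge a VxLAN server is connected to via the three bindings."""
    vtep_ip = vxlan_table[server_mac]    # binding (1)
    vtep_mac = arp_table[vtep_ip]        # binding (2)
    return mac_table[vtep_mac]           # binding (3)

assert rbridge_for_vxlan_server("mac_vm1") == "RB22"   # VxLAN server 1 sits behind RB22
```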
  • Accordingly, using the link state information, the VxLAN gateway (e.g., RBridge RB12) can compute its own path cost from/to RBridge RB22 as 10. Provided it has learned the RBridge nicknames of the other VxLAN gateways, the VxLAN gateway can also compute the path costs of RBridges RB21 and RB22 from/to RBridge RB22 as 20 and 0, respectively.
  • Using the computed path costs, a VxLAN gateway (e.g., RBridge RB12, RB21 or RB22) can determine the optimal forwarding path and the optimal VxLAN gateway. It should be understood that traffic flows from the source node to the VxLAN gateway over the FGL network and from the VxLAN gateway to the destination node over the VxLAN, or alternatively, from the source node to the VxLAN gateway over the VxLAN and from the VxLAN gateway to the destination node over the FGL network. This is shown in FIG. 2.
  • Due to the difference in encapsulation overhead, the optimal forwarding path is the forwarding path having the fewest hops in the logical network having the greater encapsulation overhead. In other words, the optimal forwarding path is chosen such that traffic makes fewer hops in the logical network associated with the larger encapsulation overhead (e.g., the VxLAN) and more hops in the logical network associated with the smaller encapsulation overhead (e.g., the FGL network), because VxLAN encapsulation overhead exceeds FGL encapsulation overhead. In the example of FIG. 2, the optimal VxLAN gateway is therefore RBridge RB22 and the optimal forwarding path is through RBridge RB22.
  • To account for this difference, the VxLAN gateways can be configured to calculate an encapsulation overhead metric. For example, the encapsulation overhead metric (“E O/H”) can optionally be defined as:

        E O/H = (average packet size + additional encapsulation overhead) / (average packet size)     (1)

  • It should be understood that the encapsulation overhead metric provided in Eqn. (1) is provided only as an example and that the encapsulation overhead metric can be defined in other ways. As discussed above, the per-frame encapsulation overhead of VxLAN encapsulation exceeds that of FGL encapsulation by 44 bytes. Accordingly, the encapsulation overhead metric calculated using Eqn. (1) is (440 + 44)/440 = 1.1, assuming an average packet size of 440 bytes. This disclosure contemplates that the average packet size can optionally be more or less than 440 bytes, which is provided only as an example.
  • Table 1 shows the total path costs computed for the multiple forwarding paths between physical server pm1 and VxLAN server 1 of FIG. 2, assuming an encapsulation overhead of 44 bytes per frame and an average packet size of 440 bytes.
        TABLE 1

        Forwarding Path    FGL Path Cost    VxLAN Path Cost    E O/H    Total Path Cost
        Via RB12           10               10                 1.1      21
        Via RB21           0                20                 1.1      22
        Via RB22           20               0                  1.1      20

  • As shown above in Table 1, the optimal forwarding path is via RBridge RB22.
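  • The computation behind Table 1 can be sketched as follows; the cost values mirror the example, while the function and variable names are assumptions made for illustration:

```python
# Sketch of the weighted total-path-cost computation behind Table 1.
# Names and data layout are illustrative assumptions.

AVG_PACKET_SIZE = 440   # bytes, example value from the text
EXTRA_OVERHEAD = 44     # additional bytes per frame for VxLAN vs. FGL

def encap_overhead_metric(avg=AVG_PACKET_SIZE, extra=EXTRA_OVERHEAD):
    """Eqn. (1): (average packet size + extra overhead) / average packet size."""
    return (avg + extra) / avg              # (440 + 44) / 440 = 1.1

# gateway -> (FGL cost pm1 to gateway, VxLAN cost gateway to VxLAN server 1)
paths = {"RB12": (10, 10), "RB21": (0, 20), "RB22": (20, 0)}

e_oh = encap_overhead_metric()
totals = {gw: round(fgl + e_oh * vxlan, 2) for gw, (fgl, vxlan) in paths.items()}
optimal = min(totals, key=totals.get)

print(totals)   # {'RB12': 21.0, 'RB21': 22.0, 'RB22': 20.0}
print(optimal)  # RB22, the lowest total path cost, matching Table 1
```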
  • After determining the optimal forwarding path, the VxLAN gateways can be configured to notify the RBridges and VTEPs in the network 10 of which RBridge is the optimal VxLAN gateway.
  • For example, when traffic flowing from physical server pm1 toward VxLAN server 1 reaches RBridge RB12, RBridge RB12 performs VxLAN encapsulation and transmits the encapsulated frame to the VxLAN IP multicast address. The distribution tree 16 rooted at RBridge RB12 is shown as a dashed line in FIG. 1. RBridge RB12 learns the binding between physical server pm1's MAC address and RBridge RB21's RBridge nickname through MAC address learning, and therefore, RBridge RB12 can compute the FGL path costs between physical server pm1 and all of the VxLAN gateways (e.g., RBridges RB12, RB21 and RB22).
  • When VxLAN server 1 responds with a unicast frame to physical server pm1, VTEP vtep1 encapsulates the frame using RBridge RB12's learned IP address as the destination IP address. RBridge RB12 can learn the binding between VxLAN server 1's MAC address and VTEP vtep1's IP address, for example, through the three bindings discussed above. RBridge RB12 can then compute the VxLAN path costs between VxLAN server 1 and all of the VxLAN gateways (e.g., RBridges RB12, RB21 and RB22).
  • The ability of RBridge RB12 to compute the total path costs for the other VxLAN gateways assumes that RBridge RB12 has learned the RBridge nicknames of the other VxLAN gateways, for example, by exchanging messages using the link state protocol including the VxLAN Gateway Information TLV. Additionally, RBridge RB12 can weight the path costs over the VxLAN because VxLAN encapsulation has a higher encapsulation overhead as compared to FGL encapsulation.
  • Upon computing the total path costs, for example, as shown in Table 1 above, RBridge RB12 realizes that it is not in the optimal forwarding path.
  • Accordingly, RBridge RB12 can optionally be configured to notify one or more RBridges in the network 10 to use the optimal forwarding path, e.g., via RBridge RB22, instead of the forwarding path via RBridge RB12. In particular, RBridge RB12 can be configured to notify the RBridge to which physical server pm1 is connected (e.g., RBridge RB21) and VxLAN server 1's VTEP (e.g., VTEP vtep1) to use the optimal path via RBridge RB22.
  • An implicit approach that can be used by a VxLAN gateway to notify an RBridge or a VTEP of the optimal forwarding path is provided below. It should be understood that the implicit approach does not require any protocol changes. To notify an RBridge, a VxLAN gateway can encapsulate FGL frames using the desired optimal VxLAN gateway's RBridge nickname as the ingress RBridge nickname. The RBridge to which the physical server is connected can then learn the binding between the desired MAC address and RBridge nickname and redirect traffic to the optimal VxLAN gateway.
  • For example, RBridge RB12 (e.g., a non-optimal VxLAN gateway) can decapsulate VxLAN frames from VxLAN server 1 and can encapsulate the frames with FGL headers using RBridge RB22's (e.g., an optimal VxLAN gateway) RBridge nickname, instead of its own, as the ingress RBridge nickname. Then, the RBridge to which physical server pm1 is connected (e.g., RBridge RB21) can learn the desired binding between VxLAN server 1's MAC address and RBridge RB22's RBridge nickname and redirect traffic to RBridge RB22.
  • To notify a VTEP, a VxLAN gateway can encapsulate VxLAN frames using the desired optimal VxLAN gateway's IP address as the source IP address. The VTEP can then learn the desired binding between the MAC address and IP address and redirect the traffic to the optimal VxLAN gateway.
  • For example, RBridge RB12 (e.g., a non-optimal VxLAN gateway) can decapsulate FGL frames from physical server pm1 and can encapsulate the frames with VxLAN headers using RBridge RB22's (e.g., an optimal VxLAN gateway) IP address, instead of its own, as the source IP address. VTEP vtep1 can then learn the desired binding between physical server pm1's MAC address and RBridge RB22's IP address and redirect the traffic to RBridge RB22.
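  • Both directions of the implicit approach can be sketched as follows, with frames modeled as plain dictionaries; the nickname “RB22” and the IP address 192.0.2.22 are placeholders standing in for the optimal gateway's identity:

```python
# Hedged sketch of the implicit notification approach: a non-optimal
# gateway re-encapsulates traffic with the optimal gateway's identity so
# that downstream MAC/VTEP learning redirects future traffic. Frame
# fields are illustrative, not actual header layouts.

OPTIMAL_NICKNAME = "RB22"          # optimal gateway's RBridge nickname
OPTIMAL_GATEWAY_IP = "192.0.2.22"  # optimal gateway's VxLAN encapsulation IP

def reencap_toward_physical_server(vxlan_frame):
    """VxLAN -> FGL: claim the optimal gateway's nickname as ingress nickname,
    so the physical server's RBridge learns the desired MAC/nickname binding."""
    return {"ingress_nickname": OPTIMAL_NICKNAME,   # not this gateway's own
            "payload": vxlan_frame["payload"]}

def reencap_toward_vxlan_server(fgl_frame):
    """FGL -> VxLAN: claim the optimal gateway's IP as the source IP, so the
    VTEP learns the desired MAC/IP binding."""
    return {"src_ip": OPTIMAL_GATEWAY_IP,           # not this gateway's own
            "payload": fgl_frame["payload"]}
```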
  • An explicit approach that can be used by a VxLAN gateway to notify an RBridge or a VTEP of the optimal forwarding path is provided below. Although the explicit approach requires a protocol change, it provides the benefit of fast rerouting when a VxLAN gateway in the optimal forwarding path fails.
  • To notify an RBridge, a VxLAN gateway can use the TRILL ESADI protocol to notify the RBridge to which the physical server is connected of a plurality of bindings between the VxLAN server's MAC address and the RBridge nicknames of the VxLAN gateways, along with the associated VxLAN path costs. Because the RBridge receives a plurality of bindings with associated VxLAN path costs, it can switch to the next-best VxLAN gateway if the optimal VxLAN gateway is detected as unreachable by the link state protocol.
  • To carry this information, the VxLAN gateway can be configured to use a modified MAC Reachability TLV, i.e., a VxLAN MAC Reachability TLV. The VxLAN MAC Reachability TLV can include a list of tuples, including but not limited to, one or more VxLAN server MAC addresses and the associated VxLAN gateway RBridge nicknames and VxLAN path costs.
  • When the RBridge receives the VxLAN MAC Reachability TLV, it can compute the total path costs based on its FGL path costs to the VxLAN gateways and the advertised VxLAN path costs. For example, RBridge RB12 can use the VxLAN MAC Reachability TLV to announce the bindings of VxLAN server 1's MAC address and the RBridge nicknames of RBridges RB12, RB21 and RB22 (e.g., the VxLAN gateways) with respective VxLAN path costs of 10, 20 and 0.
  • When the RBridge to which physical server pm1 is connected (e.g., RBridge RB21) receives the VxLAN MAC Reachability TLV, it can compute the total path costs via RBridges RB12, RB21 and RB22 as 21, 22 and 20, respectively, based on its FGL path costs to RBridges RB12, RB21 and RB22 of 10, 0 and 20, respectively, and the advertised VxLAN path costs of 10, 20 and 0. RBridge RB21 can then redirect the traffic to RBridge RB22 because it is the VxLAN gateway associated with the lowest total path cost.
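  • The receiving RBridge's computation can be sketched as follows, assuming the TLV is modeled as (server MAC, gateway nickname, VxLAN path cost) tuples and that the advertised VxLAN costs are weighted by the encapsulation overhead metric, which reproduces the totals above:

```python
# Sketch of gateway ranking at the RBridge (e.g., RB21) from a VxLAN MAC
# Reachability TLV. The TLV layout and names are assumptions.

E_OH = 1.1                                       # encapsulation overhead metric
fgl_costs = {"RB12": 10, "RB21": 0, "RB22": 20}  # RB21's FGL costs to the gateways

tlv = [("mac_vm1", "RB12", 10),
       ("mac_vm1", "RB21", 20),
       ("mac_vm1", "RB22", 0)]

def rank_gateways(tlv, fgl_costs, e_oh=E_OH):
    """Total cost per gateway, best first; later entries serve as failover."""
    return sorted((round(fgl_costs[gw] + e_oh * vx, 2), gw) for _mac, gw, vx in tlv)

print(rank_gateways(tlv, fgl_costs))
# [(20.0, 'RB22'), (21.0, 'RB12'), (22.0, 'RB21')] -> redirect to RB22;
# if RB22 becomes unreachable, fall back to RB12, the next-best gateway.
```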
  • To notify a VTEP, a VxLAN gateway can use a control protocol, e.g., VxLAN Gateway Address Distribution Information (VGADI), to inform the VTEP of a plurality of bindings between a physical server's MAC address and the IP addresses of the VxLAN gateways, along with the associated total path costs. The VxLAN gateway can unicast its protocol data units (“PDUs”) to the IP address of the intended VTEP. Each PDU can carry a VxLAN Gateway Reachability TLV, which includes a list of tuples, including but not limited to, one or more physical server MAC addresses and the associated VxLAN gateway IP addresses and total path costs.
  • For example, RBridge RB12 can use the VxLAN Gateway Reachability TLV to inform VTEP vtep1 of the bindings between physical server pm1's MAC address and the IP addresses of RBridges RB12, RB21 and RB22 (e.g., the VxLAN gateways) with respective total path costs of 21, 22 and 20. VTEP vtep1 can then redirect the traffic to RBridge RB22 because it is the optimal VxLAN gateway associated with the lowest total path cost.
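  • The VTEP-side selection can be sketched as follows, assuming the VxLAN Gateway Reachability TLV is modeled as (server MAC, gateway IP, total path cost) tuples; the IP addresses are placeholder assumptions for the gateways' encapsulation addresses:

```python
# Sketch of VTEP-side gateway selection from a VxLAN Gateway Reachability
# TLV. Tuple layout and addresses are illustrative assumptions.

gateway_reachability_tlv = [
    ("mac_pm1", "192.0.2.12", 21),   # via RB12
    ("mac_pm1", "192.0.2.21", 22),   # via RB21
    ("mac_pm1", "192.0.2.22", 20),   # via RB22: lowest total path cost
]

def vtep_select_gateway(tlv, server_mac):
    """Pick the gateway IP advertising the lowest total path cost."""
    candidates = [(cost, ip) for mac, ip, cost in tlv if mac == server_mac]
    return min(candidates)[1]

assert vtep_select_gateway(gateway_reachability_tlv, "mac_pm1") == "192.0.2.22"
```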
  • The logical operations described herein with respect to the various figures may be implemented (1) as a sequence of computer-implemented acts or program modules (i.e., software) running on a computing device, (2) as interconnected machine logic circuits or circuit modules (i.e., hardware) within the computing device, and/or (3) as a combination of software and hardware of the computing device. Thus, the logical operations discussed herein are not limited to any specific combination of hardware and software. The implementation is a matter of choice dependent on the performance and other requirements of the computing device. Accordingly, the logical operations described herein are referred to variously as operations, structural devices, acts, or modules. These operations, structural devices, acts and modules may be implemented in software, in firmware, in special-purpose digital logic, or any combination thereof. It should also be appreciated that more or fewer operations may be performed than shown in the figures and described herein. These operations may also be performed in a different order than those described herein.
  • Referring now to FIG. 4, example operations 400 for determining an optimal forwarding path are illustrated. The network can be a network including RBridges configured to implement both the FGL and VxLAN networking schemes, e.g., the network 10 shown in FIG. 1. As discussed above, RBridges configured to implement both the FGL and VxLAN networking schemes are VxLAN gateways, and the example operations 400 can be carried out by a VxLAN gateway.
  • First, one or more RBridge nicknames can be learned. As discussed above, each RBridge nickname is uniquely associated with one of the VxLAN gateways in the network.
  • Next, a path cost over the FGL network between each of the VxLAN gateways and a source node is determined, and a path cost over the VxLAN between each of the VxLAN gateways and a destination node is determined.
  • Then, an encapsulation overhead metric associated with switching packets over the VxLAN can be determined.
  • Next, one of the VxLAN gateways can be selected as an optimal VxLAN gateway. The selection can be based on the path cost over the FGL network between each of the VxLAN gateways and the source node, the path cost over the VxLAN between each of the VxLAN gateways and the destination node, and the encapsulation overhead metric.
  • Finally, one or more RBridges in the network can be notified of the selection, which facilitates the ability of the RBridges to redirect traffic via the optimal VxLAN gateway.
  • The process may execute on any type of computing architecture or platform. Referring now to FIG. 5, an example computing device upon which embodiments of the invention may be implemented is illustrated. For example, the RBridges and servers discussed above may each be a computing device, such as computing device 500 shown in FIG. 5.
  • The computing device 500 may include a bus or other communication mechanism for communicating information among various components of the computing device 500. Computing device 500 typically includes at least one processing unit 506 and system memory 504. System memory 504 may be volatile (such as random access memory (RAM)), non-volatile (such as read-only memory (ROM), flash memory, etc.), or some combination of the two.
  • The processing unit 506 may be a standard programmable processor that performs arithmetic and logic operations necessary for operation of the computing device 500. Alternatively, the processing unit 506 can be an ASIC.
  • Computing device 500 may have additional features/functionality.
  • For example, computing device 500 may include additional storage such as removable storage 508 and non-removable storage 510 including, but not limited to, magnetic or optical disks or tapes.
  • Computing device 500 may also contain network connection(s) 516 that allow the device to communicate with other devices.
  • Computing device 500 may also have input device(s) 514 such as a keyboard, mouse, touch screen, etc.
  • Output device(s) 512 such as a display, speakers, printer, etc. may also be included.
  • The additional devices may be connected to the bus in order to facilitate communication of data among the components of the computing device 500. All these devices are well known in the art and need not be discussed at length here.
  • The processing unit 506 may be configured to execute program code encoded in tangible, computer-readable media.
  • Computer-readable media refers to any media that is capable of providing data that causes the computing device 500 (i.e., a machine) to operate in a particular fashion.
  • Various computer-readable media may be utilized to provide instructions to the processing unit 506 for execution.
  • Common forms of computer-readable media include, for example, magnetic media, optical media, physical media, memory chips or cartridges, a carrier wave, or any other medium from which a computer can read.
  • Example computer-readable media may include, but are not limited to, volatile media, non-volatile media and transmission media.
  • Volatile and non-volatile media may be implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data and common forms are discussed in detail below.
  • Transmission media may include coaxial cables, copper wires and/or fiber optic cables, as well as acoustic or light waves, such as those generated during radio-wave and infra-red data communication.
  • Example tangible, computer-readable recording media include, but are not limited to, an integrated circuit (e.g., field-programmable gate array or application-specific IC), a hard disk, an optical disk, a magneto-optical disk, a floppy disk, a magnetic tape, a holographic storage medium, a solid-state device, RAM, ROM, electrically erasable program read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices.
  • The processing unit 506 may execute program code stored in the system memory 504. The bus may carry data to the system memory 504, from which the processing unit 506 receives and executes instructions. The data received by the system memory 504 may optionally be stored on the removable storage 508 or the non-removable storage 510 before or after execution by the processing unit 506.
  • Computing device 500 typically includes a variety of computer-readable media.
  • Computer-readable media can be any available media that can be accessed by device 500 and includes both volatile and non-volatile media, removable and non-removable media.
  • Computer storage media include volatile and non-volatile, and removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • System memory 504 , removable storage 508 , and non-removable storage 510 are all examples of computer storage media.
  • Computer storage media include, but are not limited to, RAM, ROM, electrically erasable program read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 500 . Any such computer storage media may be part of computing device 500 .
  • The various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination thereof. Thus, the methods and apparatuses of the presently disclosed subject matter, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium wherein, when the program code is loaded into and executed by a machine, such as a computing device, the machine becomes an apparatus for practicing the presently disclosed subject matter.
  • In the case of program code execution on programmable computers, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
  • One or more programs may implement or utilize the processes described in connection with the presently disclosed subject matter, e.g., through the use of an application programming interface (API), reusable controls, or the like.
  • Such programs may be implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language and it may be combined with hardware implementations.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

An example method for determining an optimal forwarding path across a network having gateways configured to implement a plurality of logical networking protocols can include determining a path cost over a first logical network between each of the gateways and a source node and a path cost over a second logical network between each of the gateways and a destination node. Additionally, the method can include determining an encapsulation cost difference between switching packets over the first and second logical networks. The method can also include determining an encapsulation overhead metric associated with one of the first or second logical networks, and weighting one of the first or second path costs by the encapsulation overhead metric. Further, the method can include selecting one of the gateways as an optimal gateway. The selection can be based on the computed path costs.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. patent application Ser. No. 13/898,572, filed on May 21, 2013, entitled “OPTIMAL FORWARDING FOR TRILL FINE-GRAINED LABELING AND VXLAN INTERWORKING,” the disclosure of which is expressly incorporated herein by reference in its entirety.
  • BACKGROUND
  • IETF Transparent Interconnect of Lots of Links (“TRILL”) provides an architecture of Layer 2 control and forwarding that offers benefits such as pair-wise optimal forwarding, loop mitigation, multipathing and provisioning-free operation. The TRILL protocol is described in detail in Perlman et al., “RBridges: Base Protocol Specification,” available at http://tools.ietf.org/html/draft-ietf-trill-rbridge-protocol-16. The TRILL base protocol supports approximately four-thousand customer (or tenant) identifications through the use of inner virtual local area network (“VLAN”) tags. The number of tenant identifications provided by the TRILL base protocol is insufficient for large multi-tenant data center deployments. Thus, a fine-grained labeling (“FGL”) networking scheme has been proposed to increase the number of tenant identifications to approximately sixteen million through the use of two inner VLAN tags. The FGL networking scheme is described in detail in Eastlake et al., “TRILL: Fine-Grained Labeling,” available at http://tools.ietf.org/html/draft-ietf-trill-fine-labeling-01.
  • Virtual extensible local area network (“VxLAN”) is a networking scheme that provides a Layer 2 overlay on top of Layer 3 network infrastructure. Similar to FGL, VxLAN supports approximately sixteen million tenant identifications. Specifically, according to VxLAN, customer frames are encapsulated with a VxLAN header containing a VxLAN segment ID/VxLAN network identifier (“VNI”), which is a 24-bit field to identify virtual Layer 2 networks for different tenants. The VxLAN networking scheme is discussed in detail in Mahalingham et al., “VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks,” available at http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-01.
  • As two complementary network virtualization schemes, TRILL FGL and VxLAN can co-exist in a multi-tenant data center. To facilitate their interworking, VxLAN origination and termination capabilities can be built into application-specific integrated circuits (“ASICs”) already supporting TRILL. In other words, packet-switching devices can be built with VxLAN gateway functionality. A VxLAN gateway can be configured to push FGL frames into VxLAN tunnels, as well as decapsulate frames from VxLAN tunnels for further forwarding as FGL frames. Accordingly, traffic can flow over the same physical network either natively in FGL or overlay in VxLAN.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The components in the drawings are not necessarily to scale relative to each other. Like reference numerals designate corresponding parts throughout the several views.
  • FIG. 1 is a block diagram illustrating an example physical network;
  • FIG. 2 is a block diagram illustrating forwarding paths in two logical networks over the network shown in FIG. 1;
  • FIGS. 3A-3B are block diagrams illustrating example frame formats according to networking schemes discussed herein;
  • FIG. 4 is a flow diagram illustrating example operations for determining an optimal forwarding path across the network shown in FIG. 1; and
  • FIG. 5 is a block diagram of an example computing device.
  • DETAILED DESCRIPTION
  • Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. Methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure. As used in the specification, and in the appended claims, the singular forms “a,” “an,” “the” include plural referents unless the context clearly dictates otherwise. The term “comprising” and variations thereof as used herein is used synonymously with the term “including” and variations thereof and are open, non-limiting terms. While implementations will be described for determining an optimal forwarding path across a physical network where FGL and VxLAN networking schemes are implemented, it will become evident to those skilled in the art that the implementations are not limited thereto, but are applicable for determining an optimal forwarding path across a network that implements two different logical networking schemes.
  • Methods, systems and devices for determining an optimal forwarding path across a network that implements two different logical networking schemes are provided herein. The methods, systems and devices can compute the total path costs for traffic flowing via a plurality of forwarding paths, while accounting for the differences in the encapsulation overhead associated with the logical networking schemes. Optionally, the path costs over the logical network with the greater encapsulation overhead can be weighted accordingly. After computing the total path costs, the optimal path among the plurality of forwarding paths can be determined and optionally used when the traffic is forwarded over the network.
  • Referring now to FIG. 1, a block diagram illustrating an example physical network 10 is shown. For example, the network 10 can be a multi-tenant data center deployment where FGL and VxLAN networking schemes are implemented for network virtualization. The network 10 can include RBridges RB11, RB12, RB13, RB21, RB22 and RB23, physical server pm1 and VxLAN servers 1 and 2. Virtual machines vm1 and vm2 run on VxLAN servers 1 and 2, respectively. The RBridges and servers discussed above can be communicatively connected through one or more communication links. This disclosure contemplates that the communication links may be any suitable communication links. For example, a communication link may be implemented by any medium that facilitates data exchange between the network elements including, but not limited to, wired, wireless and optical links. It should be understood that the network 10 shown in FIG. 1 is provided only as an example. A person of ordinary skill in the art may provide the functionalities described herein in a network having more or fewer elements than shown in FIG. 1.
  • RBridges are packet-forwarding devices (e.g., switches, bridges, etc.) that are configured to implement the TRILL protocol. The TRILL protocol is well-known in the art and is therefore not discussed in further detail herein. TRILL links 12 between the RBridges are shown as solid lines in FIG. 1. In addition, each of RBridges RB11, RB12, RB13, RB21, RB22 and RB23 can be configured to support the FGL networking scheme. As discussed above, according to the FGL networking scheme, two inner VLAN tags are used to increase the number of available tenant identifications as compared to the number of tenant identifications available using the TRILL base protocol. RBridges RB12, RB21 and RB22 (e.g., the shaded RBridges in FIG. 1) can be configured to support the VxLAN networking scheme in addition to the FGL networking scheme. Similar to the FGL networking scheme, the VxLAN networking scheme increases the number of available tenant identifications. The FGL and VxLAN networking schemes are optionally implemented in large multi-tenant data centers due to the large number of available tenant identifications. RBridges RB12, RB21 and RB22 are also referred to as “VxLAN gateways” below because RBridges RB12, RB21 and RB22 can interface with both the FGL and VxLAN logical networks.
  • As shown in FIG. 1, three servers are communicatively connected to the network 10 through edge RBridges RB21, RB22 and RB23. Optionally, the servers are connected to the network 10 through classic Ethernet links 14 shown as dotted-dashed lines in FIG. 1. In particular, physical server pm1 is connected to RBridge RB21. It should be understood that physical server pm1 is not configured or capable of performing VxLAN encapsulation/decapsulation. Additionally, VxLAN servers 1 and 2 are connected to RB22 and RB23, respectively. It should be understood that VxLAN servers 1 and 2 are configured or capable of performing VxLAN encapsulation/decapsulation. VxLAN servers 1 and 2 have respective VTEPs vtep1 and vtep2 to originate and terminate VxLAN tunnels for their respective virtual machines vm1 and vm2.
  • When traffic (e.g., a packet, frame, etc.) is forwarded from one server to another (e.g., from physical server pm1 to VxLAN server 1), the traffic can be transported in two formats—natively in FGL and overlay in VxLAN. Conceptually, the traffic traverses two logical networks (e.g., the FGL and VxLAN networks) on top of the same physical network 10. This is shown in FIG. 2, which is a block diagram illustrating the forwarding paths in the two logical networks over the network 10 of FIG. 1. It should be understood that a plurality of (or multiple) forwarding paths exist between physical server pm1 and VxLAN server 1 due to the fact that there are multiple VxLAN gateways (i.e., RB12, RB21 and RB22) in the network 10. Thus, the traffic flowing from physical server pm1 can reach VxLAN server 1 via RBridges RB12, RB21 or RB22 (i.e., the VxLAN gateways). When there are multiple forwarding paths available, it is desirable to configure the RBridges to perform optimal forwarding across the network 10. In other words, it is desirable to configure the RBridges to identify and use the optimal VxLAN gateway when forwarding traffic.
  • In the example implementations described below for determining an optimal forwarding path in the network 10, it is assumed that all links in the network 10 are the same (e.g., 10 G links) and that all links have the same default metric value of 10. Although all links in the network 10 are assumed to be equal for the purpose of the examples, this disclosure contemplates that all of the links in the network 10 may not be equal. It should be understood that in an arbitrary network topology the path costs of the multiple forwarding paths can be different due to the link metric values and/or the hop count. Further, even in a two-tier fat tree network topology with equal link metric values (e.g., the network topology shown in FIG. 1), differences in path costs can exist, for example, due to the differences in the encapsulation overhead incurred by the networking schemes.
  • As discussed in further detail below, example techniques for determining an optimal forwarding path are provided with reference to the two-tier fat tree network topology shown in FIG. 1. This disclosure contemplates that the example techniques are also applicable to arbitrary network topologies. With reference to FIG. 2, the multiple forwarding paths for traffic flowing from physical server pm1 to VxLAN server 1, e.g., via each of RBridges RB12, RB21 and RB22 (e.g., the VxLAN gateways) are illustrated. The FGL paths 22 are shown by dotted-dashed lines and the VxLAN paths 24 are shown by solid lines in FIG. 2. Further, the FGL path costs between physical server pm1 and each of RBridges RB12, RB21 and RB22 are 10, 0 and 20, respectively. For example, there is one hop (e.g., from RBridge RB21 to RBridge RB12) between physical server pm1 and RBridge RB12. It should be understood that in the examples described herein the first hops (e.g., between physical server pm1 and RBridge RB21 and between VxLAN server 1 and RBridge RB22) are ignored because these hops will be the same regardless of the chosen forwarding path. The VxLAN path costs between VxLAN server 1 and each of RBridges RB12, RB21 and RB22 are 10, 20 and 0, respectively. For example, there are two hops (e.g., from RBridge RB21 to RBridge RB12 to RBridge RB22) between RBridge RB21 and VxLAN server 1.
  • Considering the two-tier fat tree network topology of FIG. 1, each of the multiple forwarding paths between physical server pm1 and VxLAN server 1 appears to have the same total path cost (e.g., 20) on the surface. However, due to differences between FGL and VxLAN encapsulations, the forwarding path with the fewest hops over the VxLAN (e.g., when RBridge RB22 is the VxLAN gateway) is actually the optimal path due to the encapsulation overhead introduced by VxLAN encapsulation as compared to FGL encapsulation. For example, referring now to FIGS. 3A-3B, block diagrams illustrating example frame formats according to networking schemes discussed herein are shown. In FIGS. 3A-3B, the original customer frame (e.g., inner source and destination MAC addresses and packet payload) is shaded. FIG. 3A illustrates an example FGL frame, which adds 32 bytes to the original customer frame. FIG. 3B illustrates an example VxLAN frame, which adds 76 bytes to the original customer frame. The VxLAN tunnel over the network 10, therefore, introduces an additional 44-byte encapsulation overhead per frame as compared to using FGL. Thus, the optimal forwarding path is via RBridge RB22 (e.g., RBridge RB22 acts as the VxLAN gateway). It should be understood that the fields/sizes shown in the example frames of FIGS. 3A-3B are provided only as examples and that the FGL frame and/or the VxLAN frame may have more or fewer fields/sizes than those shown.
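  • As a quick arithmetic check of the figures above (the constants come directly from the example frame formats in FIGS. 3A-3B):

```python
# Overhead comparison drawn from the example frame formats.

FGL_OVERHEAD = 32     # bytes an FGL frame adds to the original customer frame
VXLAN_OVERHEAD = 76   # bytes a VxLAN frame adds to the original customer frame

print(VXLAN_OVERHEAD - FGL_OVERHEAD)   # 44 additional bytes per VxLAN frame
```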
  • Path Cost Computation
  • To facilitate optimal forwarding across the network 10, the total path cost over the two logical networks (e.g., the FGL and the VxLAN networks) can be computed. Additionally, differences between the frame formats of the two logical networks (e.g., the FGL and VxLAN networks) can be taken into consideration when computing the total path cost. Further, gateway devices (e.g., RBridges RB12, RB21 and RB22 or the VxLAN gateways) can be configured to carry out the total path cost computation because the gateways connect the logical networks.
  • As discussed above, the VxLAN gateways such as RBridges RB12, RB21 and RB22, for example, can be configured to carry out the total path cost computation. Further, as discussed above, the VxLAN gateways are RBridges and therefore are configured to implement the TRILL protocol. As such, the VxLAN gateways can learn the network topology by exchanging link state information using the TRILL IS-IS link state protocol. This disclosure contemplates that the VxLAN gateways can optionally use other standard or proprietary protocols for exchanging link state information. Using the link state information, the VxLAN gateways can compute their own path costs to/from any of the RBridges in the network 10. For example, RBridge RB12 (e.g., one of the VxLAN gateways) can compute its path cost to each of RBridges RB21, RB22 and RB23 as 10 using the link state information. In addition, the VxLAN gateways can compute the path costs of the other VxLAN gateways to/from any of the RBridges in the network 10 if the VxLAN gateways know the RBridge nicknames of the other VxLAN gateways. For example, provided RBridge RB12 (e.g., one of the VxLAN gateways) knows the RBridge nickname associated with RBridge RB21 (e.g., one of the other VxLAN gateways), it can compute the path cost between RBridge RB21 and each of RBridges RB22 and RB23 as 20 using the link state information. Thus, to calculate the total path costs across the two logical networks (e.g., the FGL and VxLAN networks) for traffic flowing between a source node (e.g., physical server pm1) and a destination node (e.g., VxLAN server 1), the VxLAN gateways can determine which RBridges the source and destination nodes are connected to, respectively, and then compute the total path costs between the source node and each of the VxLAN gateways and the total path costs between each of the VxLAN gateways and the destination node. In FIG. 2, it should be understood that the traffic will traverse the FGL network between the source node (e.g., physical server pm1) and the VxLAN gateway and traverse the VxLAN between the VxLAN gateway and the destination node (e.g., VxLAN server 1).
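  • For illustration, the following sketch computes such path costs from a toy link-state view of the two-tier fat tree of FIG. 1; the adjacency table and the uniform link metric of 10 are assumptions drawn from the example, not an implementation of TRILL IS-IS:

```python
# Dijkstra-style path cost computation over a toy link-state database
# for the FIG. 1 fat tree (spines RB11-RB13, leaves RB21-RB23).

import heapq

links = {
    "RB11": ["RB21", "RB22", "RB23"],
    "RB12": ["RB21", "RB22", "RB23"],
    "RB13": ["RB21", "RB22", "RB23"],
    "RB21": ["RB11", "RB12", "RB13"],
    "RB22": ["RB11", "RB12", "RB13"],
    "RB23": ["RB11", "RB12", "RB13"],
}

def path_cost(src, dst, metric=10):
    """Lowest-cost distance between two RBridges in the link-state view."""
    dist, heap = {src: 0}, [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            return d
        for nbr in links[node]:
            if d + metric < dist.get(nbr, float("inf")):
                dist[nbr] = d + metric
                heapq.heappush(heap, (d + metric, nbr))
    return None

assert path_cost("RB12", "RB21") == 10   # RB12's own cost to a leaf RBridge
assert path_cost("RB21", "RB22") == 20   # RB21's cost to RB22 via a spine
```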
  • To facilitate the VxLAN gateways learning the RBridge nicknames of the other VxLAN gateways in the network 10, each of the VxLAN gateways can be configured to advertise its respective RBridge nickname and, optionally, the IP address used for VxLAN encapsulation. It should be understood that each of the VxLAN gateways can be associated with a unique identifier (e.g., the RBridge nickname) according to the TRILL protocol. Optionally, the RBridge nickname can be included in a Type Length Value (TLV) in the link state protocol used for disseminating the link state information. This is also referred to as the VxLAN Gateway Information TLV herein. Optionally, the link state protocol can be the TRILL IS-IS link state protocol. The VxLAN Gateway Information TLV can optionally include the IP address used for VxLAN encapsulation, as well as the RBridge nickname. For example, if RBridge RB21 (e.g., one of the VxLAN gateways) announces its RBridge nickname using the VxLAN Gateway Information TLV, then RBridge RB12 (e.g., one of the VxLAN gateways) can compute the total path cost for RBridge RB21 to/from the other RBridges in the network 10 in addition to its own path cost to/from the other RBridges in the network 10. Accordingly, a VxLAN gateway can compute path costs between each of the other VxLAN gateways and each of the RBridges in the network 10 provided it knows the RBridge nicknames for the other VxLAN gateways. Additionally, as discussed in detail below, a VxLAN gateway can optionally use the IP address for VxLAN encapsulation when notifying the other RBridges in the network of the optimal VxLAN gateway.
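  • By way of illustration only, a VxLAN Gateway Information TLV carrying an RBridge nickname and a VxLAN encapsulation IP address might be encoded as in the Python sketch below. The type code (0xF0) and the field ordering are assumptions made for this example; the disclosure does not fix a concrete on-the-wire format.

```python
import struct
import ipaddress

def encode_gateway_info_tlv(nickname: int, vtep_ip: str) -> bytes:
    # Value: 16-bit RBridge nickname followed by a 4-byte IPv4 address.
    value = struct.pack("!H", nickname) + ipaddress.IPv4Address(vtep_ip).packed
    # Type (hypothetical code 0xF0), Length, Value.
    return struct.pack("!BB", 0xF0, len(value)) + value

tlv = encode_gateway_info_tlv(nickname=0x0015, vtep_ip="10.1.1.21")
```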
  • In addition, to compute the total path cost for each of multiple forwarding paths between the source and destination nodes, the VxLAN gateways can determine the RBridges to which the source and destination nodes, respectively, are connected. The determination differs depending on whether the source or destination node is a physical server (e.g., physical server pm1) or a VxLAN server (e.g., VxLAN server 1 or 2). A VxLAN gateway processing traffic from a physical server can determine which RBridge the physical server is connected to via MAC learning. In other words, the VxLAN gateway can determine the RBridge to which the physical server is connected from the binding between the physical server's MAC address and an RBridge nickname in its MAC address table. For example, when the traffic flows from physical server pm1 to VxLAN server 1 through RBridge RB12, the VxLAN gateway (e.g., RBridge RB12) learns the binding between physical server pm1's MAC address and the RBridge nickname associated with ingress RBridge RB21, e.g., the RBridge to which physical server pm1 is connected. Then, using the link state information exchanged through the link state protocol, the VxLAN gateway can compute its own path cost from/to RBridge RB21 as 10. In addition, provided that the VxLAN gateway has obtained the RBridge nicknames of the other VxLAN gateways in the network 10 (e.g., RBridges RB21 and RB22), the VxLAN gateway can also compute path costs of RBridges RB21 and RB22 from/to RBridge RB21 as 0 and 20, respectively.
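  • A minimal sketch of this MAC-learning lookup, with illustrative table contents, is shown below:

```python
# MAC address table: MAC address -> ingress RBridge nickname bindings learned
# from observed traffic (entries are examples only).
mac_table = {"00:00:00:00:00:01": "RB21"}   # pm1's MAC (placeholder) -> RB21

def ingress_rbridge_for_physical_server(server_mac: str) -> str:
    # The gateway learned this binding when it first saw traffic from pm1.
    return mac_table[server_mac]
```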
  • The process for determining the RBridge to which a VxLAN server is connected is discussed below. The IP addresses used by the VxLAN gateways (e.g., RBridges RB12, RB21 and RB22) and the VTEPs (e.g., VTEPs vtep1 and vtep2) as the source IP addresses for VxLAN encapsulation are in the same IP subnet. This can be achieved by: (1) putting all VTEPs in the same VLAN and (2) configuring the switch virtual interfaces (“SVIs”) of the VLAN in the VxLAN gateways. For example, VTEPs vtep1 and vtep2 can be configured to transmit VxLAN encapsulated frames in VLAN “X” and the SVIs for VLAN “X” can be configured on RBridges RB12, RB21 and RB22. The VxLAN gateways can then determine the RBridge to which a VxLAN server is connected through the following bindings: (1) the binding between the MAC address associated with a VxLAN server and the IP address associated with the VTEP (e.g., VxLAN learning), (2) the binding between the IP address associated with the VTEP and the MAC address associated with the VTEP (e.g., ARP), and (3) the binding between the MAC address associated with the VTEP and the RBridge nickname of the ingress RBridge (e.g., MAC learning).
  • For example, when the traffic flows from VxLAN server 1 to physical server pm1 through RBridge RB12, RBridge RB12 can determine that VxLAN server 1 is connected to RBridge RB22 through the following three bindings. First, through VxLAN learning, RBridge RB12 can find the binding between the MAC address associated with VxLAN server 1 and the IP address associated with VTEP vtep1 using its VxLAN table. Next, because RBridge RB12 is in the same subnet as VTEP vtep1, RBridge RB12 can find the MAC address associated with VTEP vtep1 using its ARP table. Then, through MAC learning, RBridge RB12 can find the binding between VTEP vtep1's MAC address and RBridge RB22's RBridge nickname using its MAC address table, thereby determining that VxLAN server 1 is connected to RBridge RB22. Using the link state information exchanged through the link state protocol, the VxLAN gateway can compute its own path cost from/to RBridge RB22 as 10. In addition, provided that the VxLAN gateway (e.g., RBridge RB12) has obtained the RBridge nicknames of the other VxLAN gateways in the network 10 (e.g., RBridges RB21 and RB22), the VxLAN gateway can also compute the path costs of RBridges RB21 and RB22 from/to RBridge RB22 as 20 and 0, respectively.
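  • The three-step binding resolution described above can be sketched as follows; the table contents are illustrative stand-ins for state the gateway would hold:

```python
vxlan_table = {"server1-mac": "10.0.0.1"}   # (1) VxLAN learning: server MAC -> VTEP IP
arp_table = {"10.0.0.1": "vtep1-mac"}       # (2) ARP: VTEP IP -> VTEP MAC
mac_table = {"vtep1-mac": "RB22"}           # (3) MAC learning: VTEP MAC -> ingress RBridge

def ingress_rbridge_for_vxlan_server(server_mac: str) -> str:
    vtep_ip = vxlan_table[server_mac]       # which VTEP fronts the server
    vtep_mac = arp_table[vtep_ip]           # resolve the VTEP's MAC address
    return mac_table[vtep_mac]              # which RBridge the VTEP sits behind

ingress_rbridge_for_vxlan_server("server1-mac")   # -> "RB22"
```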
  • After computing the total path costs of the multiple forwarding paths between the source and destination nodes (e.g., physical server pm1 and VxLAN server 1), the VxLAN gateway (e.g., RBridge RB12, RB21 or RB22) can determine the optimal forwarding path and the optimal VxLAN gateway. It should be understood that traffic flows from the source node to the VxLAN gateway over the FGL network and from the VxLAN gateway to the destination node over the VxLAN. Alternatively or additionally, it should be understood that traffic flows from the source node to the VxLAN gateway over the VxLAN and from the VxLAN gateway to the destination node over the FGL network. This is shown in FIG. 2. The optimal forwarding path is the one with the fewest hops in the logical network having the greater encapsulation overhead. In other words, the optimal forwarding path is chosen such that traffic makes fewer hops in the logical network associated with the larger encapsulation overhead (e.g., the VxLAN) and more hops in the logical network associated with the smaller encapsulation overhead (e.g., the FGL network). In the example implementations discussed herein, VxLAN encapsulation overhead exceeds FGL encapsulation overhead. Thus, the optimal VxLAN gateway is RBridge RB22 and the optimal forwarding path is through RBridge RB22.
  • Optionally, the VxLAN gateways can be configured to calculate an encapsulation overhead metric. The encapsulation overhead metric (“EO/H”) can optionally be defined as:
  • $$E_{O/H} = 1 + \frac{\text{Per-Frame Encapsulation Overhead}}{\text{Average Packet Size}} \qquad (1)$$
  • It should be understood that the encapsulation overhead metric provided in Eqn. (1) is provided only as an example and that the encapsulation overhead metric can be defined in other ways. In the examples provided above, the per-frame encapsulation overhead of VxLAN encapsulation exceeds that of FGL encapsulation by 44 bytes. The encapsulation overhead metric calculated using Eqn. (1) is therefore 1.1, assuming an average packet size of 440 bytes. This disclosure contemplates that the average packet size can optionally be more or less than 440 bytes, which is provided only as an example. Then, the total path costs for the multiple forwarding paths can optionally be computed by weighting the path costs (e.g., Weighted Path Cost = EO/H × Path Cost) between each of the VxLAN gateways and the destination node by the encapsulation overhead metric. Table 1 below shows the total path costs computed for the multiple forwarding paths between physical server pm1 and VxLAN server 1 of FIG. 2, assuming an encapsulation overhead of 44 bytes per frame and an average packet size of 440 bytes; a minimal computation sketch follows the table.
  • TABLE 1

    Forwarding Path   FGL Path Cost   VxLAN Path Cost   EO/H   Total Path Cost
    Via RB12          10              10                1.1    21
    Via RB21          0               20                1.1    22
    Via RB22          20              0                 1.1    20
    As shown above in Table 1, the optimal forwarding path is via RBridge RB22.
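  • The computation behind Table 1 can be sketched in a few lines of Python; the 44-byte per-frame overhead and 440-byte average packet size are the example values used above:

```python
def encapsulation_overhead_metric(per_frame_overhead: float, avg_packet_size: float) -> float:
    return 1 + per_frame_overhead / avg_packet_size   # Eqn. (1)

eo_h = encapsulation_overhead_metric(44, 440)         # -> 1.1

fgl = {"RB12": 10, "RB21": 0, "RB22": 20}             # pm1 -> gateway, over FGL
vxlan = {"RB12": 10, "RB21": 20, "RB22": 0}           # gateway -> VxLAN server 1

totals = {gw: fgl[gw] + eo_h * vxlan[gw] for gw in fgl}
optimal = min(totals, key=totals.get)                 # -> "RB22" (total 20.0)
```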
  • Optimal Forwarding Notification
  • Optionally, upon determining the optimal forwarding path and optimal VxLAN gateway, the VxLAN gateways can be configured to notify the RBridges and VTEPs in the network 10 of which RBridge is the optimal VxLAN gateway. Consider the following initial traffic flow between physical server pm1 and VxLAN server 1 in FIG. 2. First, physical server pm1 sends a unicast frame to VxLAN server 1. Because the destination lookup fails in RBridge RB21, the frame is sent along the distribution tree to all other RBridges in the network 10, including RBridges RB12 and RB22 (e.g., VxLAN gateways). Optionally, for multi-destination frame handling, only distribution tree root RBridge RB12 performs VxLAN encapsulation and transmits the encapsulated frame to the VxLAN IP multicast address. The distribution tree 16 rooted at RBridge RB12 is shown as a dashed line in FIG. 1. Additionally, RBridge RB12 learns the binding between physical server pm1's MAC address and RBridge RB21's RBridge nickname through MAC address learning, and therefore, RBridge RB12 can compute the FGL path costs between physical server pm1 and all of the VxLAN gateways (e.g., RBridges RB12, RB21 and RB22). In addition, VxLAN server 1 responds with a unicast frame to physical server pm1. As discussed above, VTEP vtep1 encapsulates the frame, using RBridge RB12's learned IP address as the destination IP address. After RBridge RB12 receives the frame, RBridge RB12 can learn the binding between VxLAN server 1's MAC address and VTEP vtep1's IP address, for example, through the three bindings discussed above. RBridge RB12 can then compute the VxLAN path costs between VxLAN server 1 and all of the VxLAN gateways (e.g., RBridges RB12, RB21 and RB22). The ability of RBridge RB12 to compute the total path costs for the other VxLAN gateways assumes that RBridge RB12 has learned the RBridge nicknames of the other VxLAN gateways, for example, by exchanging messages using the link state protocol including the VxLAN Gateway Information TLV. Additionally, RBridge RB12 can weight the path costs over the VxLAN because VxLAN encapsulation has a higher encapsulation overhead as compared to FGL encapsulation.
  • Upon computing the total path costs, for example, as shown in Table 1 above, RBridge RB12 realizes that it is not in the optimal forwarding path. RBridge RB12 can optionally be configured to notify one or more RBridges in the network 10 to use the optimal forwarding path, e.g., via RBridge RB22, instead of the forwarding path via RBridge RB12. For example, RBridge RB12 can be configured to notify the RBridge to which physical server pm1 is connected (e.g., RBridge RB21) and VxLAN server 1's VTEP (e.g., VTEP vtep1) to use the optimal path via RBridge RB22.
  • Optionally, an implicit approach, described below, can be used by a VxLAN gateway to notify an RBridge or a VTEP of the optimal forwarding path. It should be understood that the implicit approach does not require any protocol changes. A VxLAN gateway can encapsulate FGL frames using the desired optimal VxLAN gateway's RBridge nickname as the ingress RBridge nickname. Thus, the RBridge to which the physical server is connected can learn the binding between the desired MAC address and RBridge nickname and redirect traffic to the optimal VxLAN gateway. For example, RBridge RB12 (e.g., a non-optimal VxLAN gateway) can decapsulate VxLAN frames from VxLAN server 1 and can encapsulate the frames with FGL headers using RBridge RB22's (e.g., an optimal VxLAN gateway) RBridge nickname, instead of its own, as the ingress RBridge nickname. Then, the RBridge to which physical server pm1 is connected (e.g., RBridge RB21) can learn the desired binding between VxLAN server 1's MAC address and RBridge RB22's RBridge nickname and redirect traffic to RBridge RB22. Additionally, a VxLAN gateway can encapsulate VxLAN frames using the desired optimal VxLAN gateway's IP address as the source IP address. The VTEP can learn the desired binding between the MAC address and IP address and redirect the traffic to the optimal VxLAN gateway. For example, RBridge RB12 (e.g., a non-optimal VxLAN gateway) can decapsulate FGL frames from physical server pm1 and can encapsulate the frames with VxLAN headers using RBridge RB22's (e.g., an optimal VxLAN gateway) IP address, instead of its own, as the source IP address. Then, VTEP vtep1 can learn the desired binding between physical server pm1's MAC address and RBridge RB22's IP address and redirect the traffic to RBridge RB22.
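  • A minimal sketch of the implicit approach follows; the frame representations are simplified placeholders rather than actual FGL or VxLAN header layouts:

```python
def fgl_encapsulate(inner_frame: bytes, optimal_gateway_nickname: str) -> dict:
    # Stamp the optimal gateway's nickname as the ingress nickname (not our
    # own) so the egress RBridge learns a binding that points at that gateway.
    return {"ingress_nickname": optimal_gateway_nickname, "payload": inner_frame}

def vxlan_encapsulate(inner_frame: bytes, optimal_gateway_ip: str) -> dict:
    # Stamp the optimal gateway's IP as the outer source IP (not our own) so
    # the VTEP learns a binding that points at that gateway.
    return {"outer_src_ip": optimal_gateway_ip, "payload": inner_frame}
```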
  • Optionally, an explicit approach, described below, can be used by a VxLAN gateway to notify an RBridge or a VTEP of the optimal forwarding path. Although the explicit approach requires a protocol change, it provides the benefit of fast rerouting when a VxLAN gateway in the optimal forwarding path fails. A VxLAN gateway can use the TRILL End Station Address Distribution Information ("ESADI") protocol to notify an RBridge to which the physical server is connected of the plurality of bindings between the VxLAN server's MAC address and RBridge nicknames of the VxLAN gateways and the associated VxLAN path costs. In other words, using ESADI, a VxLAN gateway can notify the RBridge to which the physical server is connected of a plurality of bindings with associated VxLAN path costs so that the RBridge can switch to the next-best VxLAN gateway if the optimal VxLAN gateway is detected as unreachable by the link state protocol. The VxLAN gateway can be configured to use a modified MAC Reachability TLV, i.e., a VxLAN MAC Reachability TLV. The VxLAN MAC Reachability TLV can include a list of tuples, including but not limited to, one or more VxLAN server MAC addresses and associated VxLAN gateway RBridge nicknames and VxLAN path costs. When the RBridge receives the VxLAN MAC Reachability TLV, it can compute the total path costs based on its FGL path costs to VxLAN gateways and the advertised VxLAN path costs. For example, RBridge RB12 can use the VxLAN MAC Reachability TLV to announce the bindings of VxLAN server 1's MAC address and the RBridge nicknames of RBridges RB12, RB21 and RB22 (e.g., the VxLAN gateways) with respective VxLAN path costs of 10, 20 and 0. When the RBridge to which physical server pm1 is connected (e.g., RBridge RB21) receives the VxLAN MAC Reachability TLV, it can compute total path costs via RBridges RB12, RB21 and RB22 as 21, 22 and 20, respectively, based on its FGL path costs to RBridges RB12, RB21 and RB22 of 10, 0 and 20, respectively, and the advertised VxLAN path costs of 10, 20 and 0. RBridge RB21 can then redirect the traffic to RBridge RB22 because it is the VxLAN gateway associated with the lowest total path cost.
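  • The receiving RBridge's computation under the explicit approach can be sketched as follows; the costs mirror the example above, and the 1.1 weight is the example encapsulation overhead metric from Eqn. (1):

```python
fgl_cost = {"RB12": 10, "RB21": 0, "RB22": 20}    # RB21's own FGL costs to the gateways
advertised = {"RB12": 10, "RB21": 20, "RB22": 0}  # VxLAN path costs from the TLV
EO_H = 1.1                                        # example weighting from Eqn. (1)

def best_gateway(reachable: set) -> str:
    # Skip gateways the link state protocol reports as unreachable, which
    # yields fast reroute to the next-best gateway.
    totals = {gw: fgl_cost[gw] + EO_H * advertised[gw]
              for gw in advertised if gw in reachable}
    return min(totals, key=totals.get)

best_gateway({"RB12", "RB21", "RB22"})   # -> "RB22" (total 20.0)
best_gateway({"RB12", "RB21"})           # if RB22 fails -> "RB12" (total 21.0)
```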
  • Additionally, a VxLAN gateway can use a control protocol (e.g., VxLAN Gateway Address Distribution Information ("VGADI")) to notify a VTEP of the plurality of bindings between a physical server's MAC address and the IP addresses of the VxLAN gateways and the associated total path costs. For example, according to VGADI, a VxLAN gateway can unicast its protocol data units ("PDUs") to the IP address of the intended VTEP. Each PDU can carry a VxLAN Gateway Reachability TLV, which includes a list of tuples, including but not limited to, one or more physical server MAC addresses and associated VxLAN gateway IP addresses and total path costs. For example, RBridge RB12 can use the VxLAN Gateway Reachability TLV to inform VTEP vtep1 of the bindings between physical server pm1's MAC address and the IP addresses of RBridges RB12, RB21 and RB22 (e.g., the VxLAN gateways) with respective total path costs of 21, 22 and 20. VTEP vtep1 can then redirect the traffic to RBridge RB22 because it is the optimal VxLAN gateway associated with the lowest total path cost.
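  • The contents of such a VxLAN Gateway Reachability TLV might be represented as in the sketch below; the structure and the gateway IP addresses are assumptions made for illustration only:

```python
gateway_reachability_tlv = {
    "server_mac": "pm1-mac",          # physical server pm1's MAC (placeholder)
    "bindings": [                     # (gateway IP, total path cost) tuples
        ("10.1.1.12", 21),            # via RB12
        ("10.1.1.21", 22),            # via RB21
        ("10.1.1.22", 20),            # via RB22
    ],
}
# The VTEP redirects traffic to the gateway with the lowest total path cost.
optimal_ip = min(gateway_reachability_tlv["bindings"], key=lambda b: b[1])[0]
```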
  • It should be appreciated that the logical operations described herein with respect to the various figures may be implemented (1) as a sequence of computer implemented acts or program modules (i.e., software) running on a computing device, (2) as interconnected machine logic circuits or circuit modules (i.e., hardware) within the computing device and/or (3) a combination of software and hardware of the computing device. Thus, the logical operations discussed herein are not limited to any specific combination of hardware and software. The implementation is a matter of choice dependent on the performance and other requirements of the computing device. Accordingly, the logical operations described herein are referred to variously as operations, structural devices, acts, or modules. These operations, structural devices, acts and modules may be implemented in software, in firmware, in special purpose digital logic, and any combination thereof. It should also be appreciated that more or fewer operations may be performed than shown in the figures and described herein. These operations may also be performed in a different order than those described herein.
  • Referring now to FIG. 4, a flow diagram illustrating example operations 400 for determining an optimal forwarding path across a network is shown. The network can be the network including RBridges configured to implement both FGL networking and VxLAN schemes, e.g., the network 10 shown in FIG. 1. RBridges configured to implement both FGL and VxLAN networking schemes are VxLAN gateways. As discussed above, the example operations 400 can be carried out by a VxLAN gateway. At 402, one or more RBridge nicknames can be learned. As discussed above, each RBridge nickname is uniquely associated with one of the VxLAN gateways in the network. At 404, a path cost over the FGL network between each of the VxLAN gateways and a source node is determined. Additionally, at 406, a path cost over the VxLAN between each of the VxLAN gateways and a destination node is determined. At 408, an encapsulation overhead metric associated with switching packets over the VxLAN can be determined. Then, at 410, one of the VxLAN gateways can be selected as an optimal VxLAN gateway. The selection can be based on the path cost over the FGL network between each of the VxLAN gateways and the source node, the path cost over the VxLAN between each of the VxLAN gateways and the destination node and the encapsulation overhead metric. Optionally, after selecting an optimal VxLAN gateway, one or more RBridges in the network can be notified of the selection. This facilitates the ability of the RBridges to re-direct traffic via the optimal VxLAN gateway.
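  • Tying the operations together, 402-410 can be sketched as a single selection routine with illustrative inputs:

```python
def select_optimal_vxlan_gateway(gateways, fgl_cost, vxlan_cost,
                                 per_frame_overhead, avg_packet_size):
    eo_h = 1 + per_frame_overhead / avg_packet_size           # 408
    totals = {gw: fgl_cost[gw] + eo_h * vxlan_cost[gw]        # 404, 406, weighting
              for gw in gateways}
    return min(totals, key=totals.get)                        # 410

select_optimal_vxlan_gateway(
    gateways=["RB12", "RB21", "RB22"],                        # nicknames learned at 402
    fgl_cost={"RB12": 10, "RB21": 0, "RB22": 20},
    vxlan_cost={"RB12": 10, "RB21": 20, "RB22": 0},
    per_frame_overhead=44, avg_packet_size=440)               # -> "RB22"
```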
  • When the logical operations described herein are implemented in software, the process may execute on any type of computing architecture or platform. For example, referring to FIG. 5, an example computing device upon which embodiments of the invention may be implemented is illustrated. In particular, the RBridges and servers discussed above may be a computing device, such as computing device 500 shown in FIG. 5. The computing device 500 may include a bus or other communication mechanism for communicating information among various components of the computing device 500. In its most basic configuration, computing device 500 typically includes at least one processing unit 506 and system memory 504. Depending on the exact configuration and type of computing device, system memory 504 may be volatile (such as random access memory (RAM)), non-volatile (such as read-only memory (ROM), flash memory, etc.), or some combination of the two. This most basic configuration is illustrated in FIG. 5 by dashed line 502. The processing unit 506 may be a standard programmable processor that performs arithmetic and logic operations necessary for operation of the computing device 500. Alternatively or additionally, the processing unit 506 can be an ASIC.
  • Computing device 500 may have additional features/functionality. For example, computing device 500 may include additional storage such as removable storage 508 and non-removable storage 510 including, but not limited to, magnetic or optical disks or tapes. Computing device 500 may also contain network connection(s) 516 that allow the device to communicate with other devices. Computing device 500 may also have input device(s) 514 such as a keyboard, mouse, touch screen, etc. Output device(s) 512 such as a display, speakers, printer, etc. may also be included. The additional devices may be connected to the bus in order to facilitate communication of data among the components of the computing device 500. All these devices are well known in the art and need not be discussed at length here.
  • The processing unit 506 may be configured to execute program code encoded in tangible, computer-readable media. Computer-readable media refers to any media that is capable of providing data that causes the computing device 500 (i.e., a machine) to operate in a particular fashion. Various computer-readable media may be utilized to provide instructions to the processing unit 506 for execution. Common forms of computer-readable media include, for example, magnetic media, optical media, physical media, memory chips or cartridges, a carrier wave, or any other medium from which a computer can read. Example computer-readable media may include, but are not limited to, volatile media, non-volatile media and transmission media. Volatile and non-volatile media may be implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data and common forms are discussed in detail below. Transmission media may include coaxial cables, copper wires and/or fiber optic cables, as well as acoustic or light waves, such as those generated during radio-wave and infra-red data communication. Example tangible, computer-readable recording media include, but are not limited to, an integrated circuit (e.g., field-programmable gate array or application-specific IC), a hard disk, an optical disk, a magneto-optical disk, a floppy disk, a magnetic tape, a holographic storage medium, a solid-state device, RAM, ROM, electrically erasable program read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices.
  • In an example implementation, the processing unit 506 may execute program code stored in the system memory 504. For example, the bus may carry data to the system memory 504, from which the processing unit 506 receives and executes instructions. The data received by the system memory 504 may optionally be stored on the removable storage 508 or the non-removable storage 510 before or after execution by the processing unit 506.
  • Computing device 500 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by device 500 and includes both volatile and non-volatile media, removable and non-removable media. Computer storage media include volatile and non-volatile, and removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. System memory 504, removable storage 508, and non-removable storage 510 are all examples of computer storage media. Computer storage media include, but are not limited to, RAM, ROM, electrically erasable program read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 500. Any such computer storage media may be part of computing device 500.
  • It should be understood that the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination thereof. Thus, the methods and apparatuses of the presently disclosed subject matter, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium wherein, when the program code is loaded into and executed by a machine, such as a computing device, the machine becomes an apparatus for practicing the presently disclosed subject matter. In the case of program code execution on programmable computers, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. One or more programs may implement or utilize the processes described in connection with the presently disclosed subject matter, e.g., through the use of an application programming interface (API), reusable controls, or the like. Such programs may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language and it may be combined with hardware implementations.
  • Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (20)

What is claimed:
1. A method for determining an optimal forwarding path across a network, the network including a plurality of gateways configured to implement respective networking protocols for switching packets over a first logical network and a second logical network, the method comprising:
determining a path cost over the first logical network between each of the gateways and a source node, wherein the first logical network is a Transparent Interconnect of Lots of Links (“TRILL”) fine-grained labeling (“FGL”) network;
determining a path cost over the second logical network between each of the gateways and a destination node;
determining an encapsulation cost difference between switching packets over the second logical network and switching packets over the TRILL FGL network;
determining an encapsulation overhead metric associated with switching packets over the second logical network, wherein the encapsulation overhead metric is proportional to the encapsulation cost difference;
weighting the path cost over the second logical network between each of the gateways and the destination node by the encapsulation overhead metric; and
selecting one of the gateways as an optimal gateway, wherein the selection is based on the path cost over the TRILL FGL network between each of the gateways and the source node and the weighted path cost over the second logical network between each of the gateways and the destination node.
2. The method of claim 1, further comprising learning one or more RBridge nicknames, each RBridge nickname being uniquely associated with one of the gateways in the network, wherein learning one or more RBridge nicknames further comprises transmitting or receiving a message using a link state protocol, the message comprising at least one of an RBridge nickname and an IP address associated with one of the gateways in the network.
3. The method of claim 1, wherein the source node comprises a physical server, and the method further comprises determining an RBridge to which the physical server is connected using a media access control (“MAC”) address table, wherein the path cost over the TRILL FGL network between each of the gateways and the source node is determined as a path cost over the TRILL FGL network between each of the gateways and the RBridge to which the physical server is connected.
4. The method of claim 1, further comprising notifying at least one of an RBridge to which the source node is connected and an RBridge to which the destination node is connected of the optimal gateway.
5. The method of claim 4, wherein notifying at least one of an RBridge to which the source node is connected and an RBridge to which the destination node is connected of the optimal gateway further comprises:
encapsulating a frame with at least one of an RBridge nickname or an IP address associated with the optimal gateway; and
transmitting the encapsulated frame.
6. The method of claim 4, wherein notifying at least one of an RBridge to which the source node is connected of the optimal gateway further comprises advertising a plurality of bindings between a MAC address associated with the destination node and RBridge nicknames and path costs associated with the gateways in the network.
7. The method of claim 4, wherein notifying at least one of an RBridge to which the destination node is connected of the optimal gateway further comprises advertising a plurality of bindings between a MAC address associated with the source node and IP addresses and path costs associated with the gateways in the network.
8. The method of claim 1, wherein the second logical network is a VxLAN.
9. A non-transitory computer-readable recording medium having computer-executable instructions stored thereon for determining an optimal forwarding path across a network, the network including a plurality of gateways configured to implement respective networking protocols for switching packets over a first logical network and a second logical network, that, when executed by a gateway, cause the gateway to:
determine a path cost over the first logical network between each of the gateways and a source node, wherein the first logical network is a Transparent Interconnect of Lots of Links (“TRILL”) fine-grained labeling (“FGL”) network;
determine a path cost over the second logical network between each of the gateways and a destination node;
determine an encapsulation cost difference between switching packets over the second logical network and switching packets over the TRILL FGL network;
determine an encapsulation overhead metric associated with switching packets over the second logical network, wherein the encapsulation overhead metric is proportional to the encapsulation cost difference;
weight the path cost over the second logical network between each of the gateways and the destination node by the encapsulation overhead metric; and
select one of the gateways as an optimal gateway, wherein the selection is based on the path cost over the TRILL FGL network between each of the gateways and the source node and the weighted path cost over the second logical network between each of the gateways and the destination node.
10. The non-transitory computer-readable recording medium of claim 9, having further computer-executable instructions stored thereon that, when executed by the gateway, cause the gateway to learn one or more RBridge nicknames, each RBridge nickname being uniquely associated with one of the gateways in the network, wherein learning one or more RBridge nicknames further comprises transmitting or receiving a message using a link state protocol, the message comprising at least one of an RBridge nickname and an IP address associated with one of the gateways in the network.
11. The non-transitory computer-readable recording medium of claim 9, wherein the source node comprises a physical server, and the non-transitory computer-readable recording medium having further computer-executable instructions stored thereon that, when executed by the gateway, cause the gateway to determine an RBridge to which the physical server is connected using a media access control (“MAC”) address table, wherein the path cost over the TRILL FGL network between each of the gateways and the source node is determined as a path cost over the TRILL FGL network between each of the gateways and the RBridge to which the physical server is connected.
12. The non-transitory computer-readable recording medium of claim 9, having further computer-executable instructions stored thereon that, when executed by the gateway, cause the gateway to notify at least one of an RBridge to which the source node is connected and an RBridge to which the destination node is connected of the optimal gateway.
13. The non-transitory computer-readable recording medium of claim 12, wherein notifying at least one of an RBridge to which the source node is connected and an RBridge to which the destination node is connected of the optimal gateway further comprises:
encapsulating a frame with at least one of an RBridge nickname or an IP address associated with the optimal gateway; and
transmitting the encapsulated frame.
14. The non-transitory computer-readable recording medium of claim 12, wherein notifying at least one of an RBridge to which the source node is connected of the optimal gateway further comprises advertising a plurality of bindings between a MAC address associated with the destination node and RBridge nicknames and path costs associated with the gateways in the network.
15. The non-transitory computer-readable recording medium of claim 12, wherein notifying at least one of an RBridge to which the destination node is connected of the optimal gateway further comprises advertising a plurality of bindings between a MAC address associated with the source node and IP addresses and path costs associated with the gateways in the network.
16. The non-transitory computer-readable recording medium of claim 9, wherein the second logical network is a VxLAN.
17. A method for determining an optimal forwarding path across a network, the network including a plurality of gateways configured to implement respective networking protocols for switching packets over a first logical network and a second logical network, the method comprising:
determining a path cost over the first logical network between each of the gateways and a source node;
determining a path cost over the second logical network between each of the gateways and a destination node;
determining an encapsulation cost difference between switching packets over the second logical network and switching packets over the first logical network;
determining an encapsulation overhead metric associated with switching packets over the second logical network, wherein the encapsulation overhead metric is proportional to the encapsulation cost difference;
weighting the path cost over the second logical network between each of the gateways and the destination node by the encapsulation overhead metric; and
selecting one of the gateways as an optimal gateway, wherein the selection is based on the path cost over the first logical network between each of the gateways and the source node and the weighted path cost over the second logical network between each of the gateways and the destination node.
18. The method of claim 17, wherein the first logical network is a Transparent Interconnect of Lots of Links (“TRILL”) fine-grained labeling (“FGL”) network.
19. The method of claim 18, further comprising learning one or more RBridge nicknames, each RBridge nickname being uniquely associated with one of the gateways in the network.
20. The method of claim 17, wherein the second logical network is a VxLAN.
US14/947,134 2013-05-21 2015-11-20 Optimal forwarding in a network implementing a plurality of logical networking schemes Abandoned US20160080247A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/947,134 US20160080247A1 (en) 2013-05-21 2015-11-20 Optimal forwarding in a network implementing a plurality of logical networking schemes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/898,572 US9203738B2 (en) 2013-05-21 2013-05-21 Optimal forwarding for trill fine-grained labeling and VXLAN interworking
US14/947,134 US20160080247A1 (en) 2013-05-21 2015-11-20 Optimal forwarding in a network implementing a plurality of logical networking schemes

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/898,572 Continuation US9203738B2 (en) 2013-05-21 2013-05-21 Optimal forwarding for trill fine-grained labeling and VXLAN interworking

Publications (1)

Publication Number Publication Date
US20160080247A1 true US20160080247A1 (en) 2016-03-17

Family

ID=51935348

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/898,572 Active 2033-12-14 US9203738B2 (en) 2013-05-21 2013-05-21 Optimal forwarding for trill fine-grained labeling and VXLAN interworking
US14/947,134 Abandoned US20160080247A1 (en) 2013-05-21 2015-11-20 Optimal forwarding in a network implementing a plurality of logical networking schemes

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US13/898,572 Active 2033-12-14 US9203738B2 (en) 2013-05-21 2013-05-21 Optimal forwarding for trill fine-grained labeling and VXLAN interworking

Country Status (1)

Country Link
US (2) US9203738B2 (en)

Families Citing this family (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9374323B2 (en) * 2013-07-08 2016-06-21 Futurewei Technologies, Inc. Communication between endpoints in different VXLAN networks
US9910686B2 (en) 2013-10-13 2018-03-06 Nicira, Inc. Bridging between network segments with a logical router
US9647883B2 (en) 2014-03-21 2017-05-09 Nicria, Inc. Multiple levels of logical routers
US9893988B2 (en) 2014-03-27 2018-02-13 Nicira, Inc. Address resolution using multiple designated instances of a logical router
US9509603B2 (en) * 2014-03-31 2016-11-29 Arista Networks, Inc. System and method for route health injection using virtual tunnel endpoints
CN105306613A (en) * 2014-07-24 2016-02-03 中兴通讯股份有限公司 MAC address notification method and device and acquisition device for ESADI
CN105515999B (en) * 2014-09-24 2020-05-19 中兴通讯股份有限公司 Quick convergence method and device for end system address distribution information protocol
CN104243318B (en) * 2014-09-29 2018-10-09 新华三技术有限公司 MAC address learning method and device in VXLAN networks
US10511458B2 (en) 2014-09-30 2019-12-17 Nicira, Inc. Virtual distributed bridging
US10250443B2 (en) 2014-09-30 2019-04-02 Nicira, Inc. Using physical location to modify behavior of a distributed virtual network element
US9853873B2 (en) 2015-01-10 2017-12-26 Cisco Technology, Inc. Diagnosis and throughput measurement of fibre channel ports in a storage area network environment
US9787605B2 (en) 2015-01-30 2017-10-10 Nicira, Inc. Logical router with multiple routing components
US9900250B2 (en) 2015-03-26 2018-02-20 Cisco Technology, Inc. Scalable handling of BGP route information in VXLAN with EVPN control plane
US10222986B2 (en) 2015-05-15 2019-03-05 Cisco Technology, Inc. Tenant-level sharding of disks with tenant-specific storage modules to enable policies per tenant in a distributed storage system
US10063467B2 (en) 2015-05-18 2018-08-28 Cisco Technology, Inc. Virtual extensible local area network performance routing
US11588783B2 (en) 2015-06-10 2023-02-21 Cisco Technology, Inc. Techniques for implementing IPV6-based distributed storage space
US10348625B2 (en) 2015-06-30 2019-07-09 Nicira, Inc. Sharing common L2 segment in a virtual distributed router environment
US10778765B2 (en) 2015-07-15 2020-09-15 Cisco Technology, Inc. Bid/ask protocol in scale-out NVMe storage
US10129142B2 (en) 2015-08-11 2018-11-13 Nicira, Inc. Route configuration for logical router
US10057157B2 (en) 2015-08-31 2018-08-21 Nicira, Inc. Automatically advertising NAT routes between logical routers
CN106559325B (en) * 2015-09-25 2020-06-09 华为技术有限公司 Path detection method and device
US10095535B2 (en) 2015-10-31 2018-10-09 Nicira, Inc. Static route types for logical routers
US9892075B2 (en) 2015-12-10 2018-02-13 Cisco Technology, Inc. Policy driven storage in a microserver computing environment
US10536297B2 (en) * 2016-03-29 2020-01-14 Arista Networks, Inc. Indirect VXLAN bridging
CN107332812B (en) * 2016-04-29 2020-07-07 新华三技术有限公司 Method and device for realizing network access control
US10140172B2 (en) 2016-05-18 2018-11-27 Cisco Technology, Inc. Network-aware storage repairs
CN106101008B (en) * 2016-05-31 2019-08-06 新华三技术有限公司 A kind of transmission method and device of message
US20170351639A1 (en) 2016-06-06 2017-12-07 Cisco Technology, Inc. Remote memory access using memory mapped addressing among multiple compute nodes
US10664169B2 (en) 2016-06-24 2020-05-26 Cisco Technology, Inc. Performance of object storage system by reconfiguring storage devices based on latency that includes identifying a number of fragments that has a particular storage device as its primary storage device and another number of fragments that has said particular storage device as its replica storage device
US10153973B2 (en) 2016-06-29 2018-12-11 Nicira, Inc. Installation of routing tables for logical router in route server mode
US11563695B2 (en) 2016-08-29 2023-01-24 Cisco Technology, Inc. Queue protection using a shared global memory reserve
US10454758B2 (en) * 2016-08-31 2019-10-22 Nicira, Inc. Edge node cluster network redundancy and fast convergence using an underlay anycast VTEP IP
CN106302258B (en) * 2016-09-08 2019-06-04 杭州迪普科技股份有限公司 A kind of message forwarding method and device
WO2018058104A1 (en) 2016-09-26 2018-03-29 Nant Holdings Ip, Llc Virtual circuits in cloud networks
US10545914B2 (en) 2017-01-17 2020-01-28 Cisco Technology, Inc. Distributed object storage
US10243823B1 (en) 2017-02-24 2019-03-26 Cisco Technology, Inc. Techniques for using frame deep loopback capabilities for extended link diagnostics in fibre channel storage area networks
US10713203B2 (en) 2017-02-28 2020-07-14 Cisco Technology, Inc. Dynamic partition of PCIe disk arrays based on software configuration / policy distribution
US10254991B2 (en) 2017-03-06 2019-04-09 Cisco Technology, Inc. Storage area network based extended I/O metrics computation for deep insight into application performance
US10303534B2 (en) 2017-07-20 2019-05-28 Cisco Technology, Inc. System and method for self-healing of application centric infrastructure fabric memory
US10686734B2 (en) 2017-09-26 2020-06-16 Hewlett Packard Enterprise Development Lp Network switch with interconnected member nodes
US10404596B2 (en) 2017-10-03 2019-09-03 Cisco Technology, Inc. Dynamic route profile storage in a hardware trie routing table
US10942666B2 (en) 2017-10-13 2021-03-09 Cisco Technology, Inc. Using network device replication in distributed storage clusters
US10374827B2 (en) 2017-11-14 2019-08-06 Nicira, Inc. Identifier that maps to different networks at different datacenters
US10511459B2 (en) 2017-11-14 2019-12-17 Nicira, Inc. Selection of managed forwarding element for bridge spanning multiple datacenters
CN112702251B (en) * 2019-10-22 2022-09-23 华为技术有限公司 Message detection method, connectivity negotiation relationship establishment method and related equipment

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4816957B2 (en) * 2007-03-07 2011-11-16 日本電気株式会社 Relay device, route selection system, route selection method, and program
US9054999B2 (en) * 2012-05-09 2015-06-09 International Business Machines Corporation Static TRILL routing
US9380132B2 (en) * 2011-06-27 2016-06-28 Marvell Israel (M.I.S.L.) Ltd. FCoE over trill
WO2013117166A1 (en) * 2012-02-08 2013-08-15 Hangzhou H3C Technologies Co., Ltd. Implement equal cost multiple path of trill network
US9614759B2 (en) * 2012-07-27 2017-04-04 Dell Products L.P. Systems and methods for providing anycast MAC addressing in an information handling system
US9401862B2 (en) * 2013-02-07 2016-07-26 Dell Products L.P. Optimized internet small computer system interface path
JP6217138B2 (en) * 2013-05-22 2017-10-25 富士通株式会社 Packet transfer apparatus and packet transfer method
US9203749B2 (en) * 2013-05-29 2015-12-01 Cisco Technology, Inc. System, devices and methods for facilitating coexistence of VLAN labeling and fine-grained labeling RBridges
US9565105B2 (en) * 2013-09-04 2017-02-07 Cisco Technology, Inc. Implementation of virtual extensible local area network (VXLAN) in top-of-rack switches in a network environment

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100172249A1 (en) * 2005-11-02 2010-07-08 Hang Liu Method for Determining a Route in a Wireless Mesh Network Using a Metric Based On Radio and Traffic Load
US8102781B2 (en) * 2008-07-31 2012-01-24 Cisco Technology, Inc. Dynamic distribution of virtual machines in a communication network
US20120076150A1 (en) * 2010-09-23 2012-03-29 Radia Perlman Controlled interconnection of networks using virtual nodes
US20130259050A1 (en) * 2010-11-30 2013-10-03 Donald E. Eastlake, III Systems and methods for multi-level switching of data frames
US20130332602A1 (en) * 2012-06-06 2013-12-12 Juniper Networks, Inc. Physical path determination for virtual network packet flows
US20140029437A1 (en) * 2012-07-24 2014-01-30 Fujitsu Limited Information processing system, information processing method, and relay apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Michael Barbehenn; A Note on the Complexity of Dijkstra’s Algorithm for Graphs with Weighted Vertices; IEEE TRANSACTIONS ON COMPUTERS, VOL. 47, NO. 2, FEBRUARY 1998 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10412047B2 (en) * 2017-08-17 2019-09-10 Arista Networks, Inc. Method and system for network traffic steering towards a service device
US20190356632A1 (en) * 2017-08-17 2019-11-21 Arista Networks, Inc. Method and system for network traffic steering towards a service device
US11012412B2 (en) 2017-08-17 2021-05-18 Arista Networks, Inc. Method and system for network traffic steering towards a service device
US10721651B2 (en) 2017-09-29 2020-07-21 Arista Networks, Inc. Method and system for steering bidirectional network traffic to a same service device
US11277770B2 (en) 2017-09-29 2022-03-15 Arista Networks, Inc. Method and system for steering bidirectional network traffic to a same service device
US10764234B2 (en) 2017-10-31 2020-09-01 Arista Networks, Inc. Method and system for host discovery and tracking in a network using associations between hosts and tunnel end points
US10917342B2 (en) 2018-09-26 2021-02-09 Arista Networks, Inc. Method and system for propagating network traffic flows between end points based on service and priority policies
US11463357B2 (en) 2018-09-26 2022-10-04 Arista Networks, Inc. Method and system for propagating network traffic flows between end points based on service and priority policies
US10749789B2 (en) 2018-12-04 2020-08-18 Arista Networks, Inc. Method and system for inspecting broadcast network traffic between end points residing within a same zone
US10848457B2 (en) 2018-12-04 2020-11-24 Arista Networks, Inc. Method and system for cross-zone network traffic between different zones using virtual network identifiers and virtual layer-2 broadcast domains
US10855733B2 (en) 2018-12-04 2020-12-01 Arista Networks, Inc. Method and system for inspecting unicast network traffic between end points residing within a same zone

Also Published As

Publication number Publication date
US9203738B2 (en) 2015-12-01
US20140348166A1 (en) 2014-11-27

Similar Documents

Publication Publication Date Title
US9203738B2 (en) Optimal forwarding for trill fine-grained labeling and VXLAN interworking
US9680751B2 (en) Methods and devices for providing service insertion in a TRILL network
EP3497893B1 (en) Segment routing based on maximum segment identifier depth
ES2588739T3 (en) Method, equipment and system for mapping a service instance
US8830998B2 (en) Separation of edge and routing/control information for multicast over shortest path bridging
US9167501B2 (en) Implementing a 3G packet core in a cloud computer with openflow data and control planes
US7408941B2 (en) Method for auto-routing of multi-hop pseudowires
US8345697B2 (en) System and method for carrying path information
EP3197107B1 (en) Message transmission method and apparatus
WO2016165492A1 (en) Method and apparatus for implementing service function chain
US11128489B2 (en) Maintaining data-plane connectivity between hosts
CN104396197B (en) Selecting between equal-cost shortest paths in 802.1aq networks using separate tie-breakers
CN109314666A (en) Virtual tunnel endpoints for congestion-aware load balancing
US20130100858A1 (en) Distributed switch systems in a trill network
CN112868214B (en) Coordinated load transfer OAM records within packets
WO2020173198A1 (en) Message processing method, message forwarding apparatus, and message processing apparatus
EP3528441B1 (en) Message forwarding
CN106170952A (en) Method and system for deploying a maximally redundant tree in a data network
CN107872389B (en) Method, apparatus, and computer-readable storage medium for service load balancing
US11362954B2 (en) Tunneling inter-domain stateless internet protocol multicast packets
CN111740907A (en) Message transmission method, device, equipment and machine readable storage medium
US20250047590A1 (en) Packet Sending Method, Network Device, and Communication System
US20230164070A1 (en) Packet sending method, device, and system
US20130279513A1 (en) Systems and methods for pseudo-link creation
US10164795B1 (en) Forming a multi-device layer 2 switched fabric using internet protocol (IP)-router / switched networks

Legal Events

Date Code Title Description
AS Assignment

Owner name: CISCO TECHNOLOGY, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:YANG, YIBIN;TSAI, CHIAJEN;DONG, LIQIN;AND OTHERS;REEL/FRAME:037100/0629

Effective date: 20130520

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION