Disclosure of Invention
The application provides a method and a related device for handling network congestion, which can effectively avoid network congestion and improve network bandwidth utilization.
A first aspect of the application provides a method of handling network congestion. A first network device determines a target port, where the target port is an egress port that has entered a pre-congestion state or a congestion state. The first network device sends a first notification to at least one second network device. The at least one second network device includes one or more network devices capable of sending data streams to hosts under the target port via at least two forwarding paths. The first notification includes information of the network device where the target port is located and information of the target port. The at least one second network device is determined according to the role of the first network device, the attribute of the target port, and the role of the network device where the target port is located.
In the above method, when an egress port in the network enters a pre-congestion state or a congestion state, the first network device notifies the second network device of that egress port. The second network device thus obtains the information of the egress port and, when subsequently forwarding packets, avoids sending them onto a forwarding path that includes the egress port, thereby avoiding network congestion.
Optionally, when the network device where the target port is located is the first network device, the first network device monitors the egress ports of the first network device, and when the buffer usage of one egress port of the first network device exceeds a port buffer threshold, the first network device determines that the egress port is the target port.
Optionally, when the network device where the target port is located is the first network device, the first network device monitors the egress port queues of the first network device, and when the length of an egress port queue exceeds a queue buffer threshold, the first network device determines that the egress port where the egress port queue is located is the target port.
In this application, whether an egress port has entered the congestion state or the pre-congestion state can be determined not only from the buffer usage of the egress port but also from the length of an egress port queue within the egress port, so that the notification and handling of network congestion can be implemented flexibly.
Optionally, the network device where the target port is located is a third network device, and the first network device receives a second notification sent by the third network device, where the second notification includes information of the third network device and information of the target port. The first network device determines the target port based on the second notification.
In this application, the first network device can also receive notifications sent by other network devices to obtain information about ports that those devices have found to be in the pre-congestion state or the congestion state, so that congestion handling can be implemented across the entire network.
Optionally, the information of the network device where the target port is located includes an identifier of the network device where the target port is located, and the information of the target port includes an identifier of the target port or an identifier of the forwarding path where the target port is located. Alternatively, the information of the network device where the target port is located further includes a role of that network device, where the role indicates the position of the network device in the network, and the information of the target port further includes an attribute of the target port, where the attribute indicates the direction of the data flows sent through the target port.
In this application, the notification can include various types of information to adapt to different types of network architectures, which improves the applicability of the solution.
Optionally, before the first network device sends the first notification to the at least one second network device, the first network device further determines that there is no idle egress port on the first network device capable of forwarding the target data stream corresponding to the target port. The target data stream is a data stream corresponding to a target address range, the target address range is the address range corresponding to the hosts under the target port, and the target address range is determined according to the information of the network device where the target port is located and the information of the target port.
In this application, when an idle egress port exists on the first network device, the first network device forwards the target data stream through that idle egress port, which reduces how often the forwarding path of the target data stream is switched and reduces the impact of switching the forwarding path of the target data stream on other network devices.
Optionally, the information of the target port may further include an identifier of a target egress port queue, where the target egress port queue is an egress port queue in the target port that has entered a congestion state or a pre-congestion state, and the target data flow is a data flow that corresponds to the target address range and has a priority corresponding to the identifier of the egress port queue.
In this application, the congestion-avoidance processing can be performed only on the data flows corresponding to the egress port queue that has entered the pre-congestion state or the congestion state, which reduces the impact on other data flows while avoiding network congestion.
Optionally, the first network device stores information of a network device where the target port is located and information of the target port. Further, the first network device may also store a state of the target port.
Further, the first network device also sets an aging time for the stored information. Thus, when the first network device receives a subsequent data stream, the first network device can process the received data stream according to the stored information, so as to avoid sending the data stream to a forwarding path where the target port is located, and reduce network congestion.
A second aspect of the application provides a method of handling network congestion. The second network device receives a first notification from the first network device, where the first notification includes information of the network device where a target port is located and information of the target port, the target port is a port that has entered a pre-congestion state or a congestion state, and the second network device is a network device capable of sending data streams to hosts under the target port through at least two forwarding paths. The second network device determines a target data stream, where a first forwarding path of the target data stream includes the target port. The second network device determines whether an idle egress port capable of forwarding the target data stream exists on the second network device, and obtains a determination result. The second network device processes the target data stream according to the determination result.
In this application, the second network device processes the target data stream according to the received information about the target port that has entered the pre-congestion state or the congestion state, so the target data stream can be kept off the forwarding path where the target port is located, and network congestion is avoided.
Optionally, when there is an idle egress port on the second network device capable of forwarding the target data stream, the second network device sends the target data stream through the idle egress port, and the second forwarding path where the idle egress port is located does not include the target port.
In this application, the second network device forwards the target data stream through the idle egress port on the second network device, which prevents the information of the target port from being spread to other network devices and avoids oscillation in the network.
Optionally, when there is no idle egress port on the second network device capable of forwarding the target data stream, the second network device forwards the target data stream through the first forwarding path. Further, the second network device generates a second notification, where the second notification includes information of the network device where the target port is located and information of the target port. The second network device sends the second notification to at least one third network device, where the at least one third network device includes one or more network devices capable of sending data streams to hosts under the target port via at least two forwarding paths.
When the second network device has no idle egress port capable of forwarding the target data stream, the second network device forwards the target data stream through the first forwarding path, which avoids loss of the received data stream. Further, the second network device also spreads the information of the target port to the third network device through the second notification. After the third network device receives the second notification, the third network device can perform the processing for avoiding network congestion, so that network congestion is avoided.
Optionally, when the second network device is directly connected to the source host of the target data stream, the second network device further sends a backpressure message to the source host of the target data stream, where the backpressure message is used to cause the source host to perform an operation of handling network congestion.
The second network device sends a backpressure message to the source host of the target data stream, so that excessive data streams can be prevented from entering the network at the source, thereby avoiding network congestion.
Optionally, the second network device determines a target address range according to the information of the network device where the target port is located and the information of the target port, where the target address range is the address range corresponding to the hosts under the target port, and the second network device determines a data stream whose destination address belongs to the target address range as the target data stream.
Optionally, the first notification further includes an identifier of a target egress port queue, where the target egress port queue is an egress port queue in the target port that has entered a pre-congestion state or a congestion state, and the second network device determines, as the target data flow, a data flow whose destination address belongs to the target address range and whose priority corresponds to the identifier of the egress port queue.
Optionally, the second network device stores the information of the network device where the target port is located and the information of the target port. Further, the second network device may also store the state of the target port.
A third aspect of the application provides a network device for handling network congestion. The application does not limit how the functional modules of the network device are divided: they may be divided according to the steps of the method for handling network congestion in the first aspect, or according to specific implementation requirements. The functional modules may be hardware modules or software modules, and they may be deployed on the same physical device or on different physical devices.
A fourth aspect of the application provides a network device for handling network congestion. The application does not limit how the functional modules of the network device are divided: they may be divided according to the steps of the method for handling network congestion in the second aspect, or according to specific implementation requirements. The functional modules may be hardware modules or software modules, and they may be deployed on the same physical device or on different physical devices.
A fifth aspect of the application provides a network device for handling network congestion. The apparatus comprises a memory for storing program code and a processor for invoking the program code to implement the method of handling network congestion in the first aspect of the application and any possible designs thereof, and to implement the method of handling network congestion in the second aspect of the application and any possible designs thereof.
A sixth aspect of the application provides a chip which, when operated, is capable of implementing the method of handling network congestion in the first aspect of the application and any possible designs thereof, and of implementing the method of handling network congestion in the second aspect of the application and any possible designs thereof.
A seventh aspect of the present application provides a storage medium having stored therein program code which, when run, enables a device (switch, router, server, etc.) running the program code to implement the method of handling network congestion in the first aspect of the present application and any possible designs thereof, and to implement the method of handling network congestion in the second aspect of the present application and any possible designs thereof.
An eighth aspect of the present application provides a data centre network comprising a first network device for implementing the method of handling network congestion in the first aspect of the present application and any possible designs thereof, and a second network device for implementing the method of handling network congestion in the second aspect of the present application and any possible designs thereof.
For the advantageous effects of the third to eighth aspects of the present application, reference may be made to the description of the advantageous effects of the first and second aspects and their respective possible designs; details are not repeated here.
Detailed Description
The embodiment of the application provides a method and a related device for processing network congestion, which are applied to a system comprising a plurality of network devices. Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a network system according to an embodiment of the present application, where the network system adopts a Clos architecture. The network system includes an access layer 1110, an aggregation layer 1120, and a core layer 1130. The access layer 1110 includes a plurality of access devices T1-T8, the aggregation layer 1120 includes a plurality of aggregation devices A1-A8, and the core layer 1130 includes a plurality of core devices C1-C4. Each access layer device is connected to one or more hosts Hx. The Clos architecture in fig. 1 is a multi-plane architecture, where multi-plane means that there are a plurality of core device groups and each aggregation device is connected only to the core devices in one core device group. For example, fig. 1 has a core device group (C1, C2) and a core device group (C3, C4), where the core device group (C1, C2) consists of core devices C1 and C2 and the core device group (C3, C4) consists of core devices C3 and C4. Each core device group, together with the aggregation devices connected to it, forms a forwarding plane. For example, the core device group (C1, C2) and the aggregation devices A1, A3, A5 and A7 form one forwarding plane, and the core device group (C3, C4) and the aggregation devices A2, A4, A6 and A8 form another forwarding plane. Optionally, the access devices and aggregation devices in fig. 1 may also form different points of delivery (pods), where each pod includes a certain number of access devices and aggregation devices, and the access devices in a pod are connected to all the aggregation devices in that pod. For example, pod 1 includes aggregation devices A1 and A2, access device T1 in pod 1 is connected to aggregation devices A1 and A2, and access device T2 is also connected to aggregation devices A1 and A2. Each core device of the core layer is connected to all the pods. In the present application, fig. 1 shows a number of ports to illustrate the connection relationships between devices; in the following figures related to Clos networks, the ports are not drawn, for brevity. Further, the multi-plane Clos architecture in fig. 1 may be replaced with a single-plane Clos architecture, i.e., each core device is connected to all aggregation devices. The access device in the present application may be a switch, and the aggregation device and the core device may be switches or routers.
Fig. 2 is a schematic structural diagram of another network system according to an embodiment of the present application. As shown in fig. 2, the network architecture includes a plurality of switch groups (4 are shown in fig. 2), each of which may be referred to as a pod. Each pod includes N switches, and the number (identifier) of each switch adopts an xy format, where x indicates the pod to which the switch belongs and y indicates the number of the switch within that pod. For example, in fig. 2 pod 1 includes switches 11, 12, 13, ..., 1N, pod 2 includes switches 21, 22, 23, ..., 2N, pod 3 includes switches 31, 32, 33, ..., 3N, and pod 4 includes switches 41, 42, 43, ..., 4N. The N switches in each switch group are directly connected in pairs. Each switch is directly connected to the corresponding switch in each other pod, forming N inter-group planes. Corresponding switches are switches with the same number (identifier) in different switch groups. For example, switches 11, 21, 31 and 41 are corresponding switches of one another; switches 11, 21, 31 and 41 are interconnected to form the left inter-group plane in fig. 2, and switches 1N, 2N, 3N and 4N are interconnected to form the right inter-group plane in fig. 2. Direct connection means that no other network device such as a switch or a router exists between the two switches, although devices for providing connections or for enhancing signals may be present. Ports connecting switches in different switch groups are called inter-group ports, and ports connecting switches in the same switch group are called intra-group ports. The switches in a pod have the same configuration or specification. Each pod forms an intra-group plane. Further, each switch shown in fig. 2 is also connected to one or more hosts; only hosts H1 and H2 under switch 11 are shown in fig. 2.
Based on the network system shown in fig. 1 or fig. 2, as shown in fig. 3, the present application provides a method for handling network congestion. The method is implemented by the first network device and the second network device cooperating with each other. The first network device may be any of the devices of fig. 1 or 2, and the second network device may be determined by the first network device or may be preconfigured. The method is described below in connection with fig. 3.
In step 301, the first network device determines a target port.
The target port is an egress port that has entered a congestion state or a pre-congestion state. The pre-congestion state refers to a state in which congestion is about to occur but has not yet occurred.
In one implementation, the target port is an egress port of the first network device, and step 301 may include steps 301-1 and 301-2.
In step 301-1, the first network device monitors the egress ports of the first network device. In the present application, the first network device may be any network device. When the first network device forwards packets, a packet to be sent enters an egress port queue of an egress port, and each egress port has a plurality of (e.g., 8) egress port queues. The first network device monitors its egress ports either by monitoring each egress port of the first network device or by monitoring each egress port queue of the first network device. For example, the first network device monitors whether the buffer usage of each egress port exceeds a first threshold, or monitors whether the length of each egress port queue exceeds a second threshold. The first threshold indicates a proportion or a number of bytes of the buffer of one egress port that is occupied and may also be referred to as a port buffer threshold; the second threshold indicates a proportion or a number of bytes of the buffer of one egress port queue that is occupied and may also be referred to as a queue buffer threshold.
In step 301-2, the first network device determines the target port according to the monitoring result.
Optionally, when the buffer usage of an egress port exceeds the first threshold, the first network device determines that egress port as the target port. The first threshold may be a pre-congestion threshold or a congestion threshold. When the buffer usage of the egress port exceeds the pre-congestion threshold, the egress port enters the pre-congestion state. When the buffer usage of the egress port exceeds the congestion threshold, the egress port enters the congestion state.
Optionally, when the length of one of the egress port queues exceeds the second threshold, the first network device determines the egress port where that egress port queue is located as the target port. This egress port queue may be referred to as the target egress port queue. The first network device allocates a buffer for each egress port queue, and the maximum length of an egress port queue refers to the size of the buffer allocated to that egress port queue. When packets enter the buffer area corresponding to the egress port queue, the amount of data stored in the buffer area is the length of the egress port queue. The second threshold may be a length (a number of bytes) or a ratio. For example, the maximum length of egress port queue A is 2 MB and the second threshold is 70%; if the amount of data stored in the buffer area of egress port queue A reaches or exceeds 1.4 MB, it is determined that egress port queue A has entered the pre-congestion state or the congestion state (as configured). The first network device then determines that the egress port where egress port queue A is located is the target port.
In another implementation, the first network device is not the network device where the target port is located, and step 301 includes the first network device receiving a notification A sent by a third network device, where the third network device is the network device where the target port is located. The notification A includes information of the third network device and information of the target port. The first network device determines the target port based on the information of the target port in the notification. Further, the notification A may also include an identifier of the egress port queue in the target port that has entered the pre-congestion state or the congestion state.
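As a minimal illustration of the threshold checks in steps 301-1 and 301-2, the following Python sketch shows how a device might classify one egress port from its buffer usage and its queue lengths. The function names, the data structures and the concrete threshold values are assumptions made only for this sketch and are not part of the claimed implementation.

    # Hypothetical sketch of the monitoring in steps 301-1 and 301-2.
    # Thresholds, names and data structures are illustrative only.

    PRE_CONGESTION_THRESHOLD = 0.70   # pre-congestion threshold (ratio of buffer occupied)
    CONGESTION_THRESHOLD = 0.90       # congestion threshold (ratio of buffer occupied)


    def classify(used, capacity):
        """Map a buffer occupancy to a state string."""
        ratio = used / capacity
        if ratio >= CONGESTION_THRESHOLD:
            return "congestion"
        if ratio >= PRE_CONGESTION_THRESHOLD:
            return "pre-congestion"
        return "normal"


    def check_egress_port(port_buffer_used, port_buffer_size, queues):
        """Return (state, queue_id) for one egress port.

        queues is a list of (queue_length_bytes, queue_buffer_bytes) pairs,
        one per egress port queue of the port.
        """
        # Step 301-1, option A: buffer usage of the whole egress port (first threshold).
        state = classify(port_buffer_used, port_buffer_size)
        if state != "normal":
            return state, None
        # Step 301-1, option B: length of each egress port queue (second threshold).
        for queue_id, (length, buffer_size) in enumerate(queues):
            state = classify(length, buffer_size)
            if state != "normal":
                return state, queue_id
        return "normal", None


    # Example from the text: queue 3 has a 2 MB buffer; with a 70% threshold,
    # 1.4 MB of queued data puts the egress port into the pre-congestion state.
    state, queue_id = check_egress_port(
        port_buffer_used=1_500_000, port_buffer_size=8_000_000,
        queues=[(0, 2_000_000)] * 3 + [(1_400_000, 2_000_000)])
    assert state == "pre-congestion" and queue_id == 3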
Optionally, after determining the target port, the first network device further stores congestion information, where the congestion information includes the information of the target port and the information of the network device where the target port is located. The congestion information may also include the state of the target port, so that a data flow can be processed according to the congestion information when the data flow is subsequently received. Further, the first network device sets an aging time for the congestion information and deletes the congestion information when the aging time is reached.
In step 302, the first network device sends a notification B to the at least one second network device. The notification B includes the information of the network device where the target port is located and the information of the target port.
Optionally, the notification B may further include a type of the notification B, where the type indicates that the target port carried in the notification B is a port that has entered a pre-congestion state or a congestion state. Optionally, the information of the target port in the notification B includes the state of the target port, where the state is the pre-congestion state or the congestion state. Optionally, the notification B further includes an identifier of the egress port queue in the target port that has entered the congestion state or the pre-congestion state. In the present application, the information of the network device where the target port is located and the information of the target port that are included in the notification B are collectively referred to as congestion information.
The first network device may send the notification B to the at least one second network device in a multicast manner, or may send the notification B to each of the at least one second network device in a unicast manner.
In one embodiment, the information of the network device where the target port is located includes an identifier of the network device, and the information of the target port includes an identifier of the target port or an identifier of the forwarding path where the target port is located. The identifier of the forwarding path where the target port is located may be the identifiers of the network devices on that forwarding path. In another embodiment, the information of the network device where the target port is located includes an identifier of the network device and a role of the network device, and the information of the target port includes the identifier of the target port and an attribute of the target port.
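By way of illustration, the congestion information carried in notification B might be represented as follows. The field names, the use of a Python dataclass and the example values (taken from the scenario of fig. 4) are assumptions for this sketch only; the application does not prescribe any particular message format.

    # Illustrative structure for notification B; field names are assumptions.
    from dataclasses import dataclass
    from typing import Optional


    @dataclass
    class CongestionNotification:
        device_id: str                         # device where the target port is located, e.g. "C2"
        port_id: str                           # identifier of the target port, e.g. "P4"
        device_role: Optional[str] = None      # e.g. "core", "aggregation" or "access"
        port_attribute: Optional[str] = None   # e.g. "downlink" or "uplink"
        queue_id: Optional[str] = None         # target egress port queue, e.g. "Q3"
        port_state: Optional[str] = None       # "pre-congestion" or "congestion"
        notification_type: str = "CONGESTION"  # type indicating a (pre-)congestion notification


    # Notification B for the scenario of fig. 4: downlink port 4 of core
    # device C2, with egress port queue 3 in the pre-congestion state.
    notification_b = CongestionNotification(
        device_id="C2", port_id="P4", device_role="core",
        port_attribute="downlink", queue_id="Q3", port_state="pre-congestion")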
The at least one second network device may be preconfigured, or may be determined by the first network device according to a preset rule. The at least one second network device includes one or more network devices capable of sending data streams to hosts under the target port via at least two forwarding paths. Alternatively, the at least one second network device includes one or more network devices that are capable of sending data streams to hosts under the target port through at least two forwarding paths and that have the smallest number of hops from the network device where the target port is located. The hosts under the target port are the near-end hosts capable of receiving data streams through the target port. The at least one second network device is determined based on the role of the network device where the target port is located, the attribute of the target port, and the role of the first network device. The attribute of the target port indicates the forwarding direction of the data flows at the target port, and the role of a network device indicates the location of the network device in the network system.
In the network system shown in fig. 1, the role of a network device may be access device, aggregation device, or core device. The port attribute may be uplink port or downlink port: a port of an access device connected to an aggregation device is an uplink port, a port of an aggregation device connected to a core device is an uplink port, a port of an aggregation device connected to an access device is a downlink port, and a port of a core device connected to an aggregation device is a downlink port. In the network system shown in fig. 1, the near-end hosts are hosts that can be reached without crossing a core device. For example, in fig. 4 the near-end hosts under port 4 of core device C2 are the hosts connected to access devices T7 and T8; in fig. 6 the near-end hosts under port 3 of aggregation device A7 are the hosts connected to access devices T7 and T8; in fig. 7 the near-end hosts under port 3 of access device T7 are the hosts connected to access device T7; in fig. 8 the near-end hosts under port 1 of aggregation device A1 are the hosts connected to access devices T1 and T2; in fig. 9 the near-end hosts under port 7 of core device C1 are the hosts connected to access devices T7 and T8; and in fig. 10 the near-end hosts under port 1 of access device T7 are the hosts connected to access device T7.
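For the Clos system of fig. 1, the way the at least one second network device follows from the role of the device where the target port is located and the attribute of the target port can be sketched as below. The rules mirror the scenarios later described with reference to fig. 4, fig. 6, fig. 7 and fig. 8; the topology helper functions are assumptions made for this sketch.

    # Simplified selection of the second network devices for the Clos system
    # of fig. 1. The "topo" helpers (peer_of, aggregation_devices_in_plane,
    # access_devices, connected_access_devices) are assumed for this sketch.

    def select_second_devices(role, port_attribute, device_id, port_id, topo):
        """Return the network devices to be notified about the target port."""
        if role == "core" and port_attribute == "downlink":
            # Fig. 4: the aggregation devices in the same forwarding plane,
            # except the one reached through the target port.
            excluded = topo.peer_of(device_id, port_id)
            return [d for d in topo.aggregation_devices_in_plane(device_id)
                    if d != excluded]
        if role == "aggregation" and port_attribute == "downlink":
            # Fig. 6: every access device except the one connected to the port.
            excluded = topo.peer_of(device_id, port_id)
            return [d for d in topo.access_devices() if d != excluded]
        if role == "access" and port_attribute == "downlink":
            # Fig. 7: every access device other than the notifying device.
            return [d for d in topo.access_devices() if d != device_id]
        if role == "aggregation" and port_attribute == "uplink":
            # Fig. 8: only the directly connected access devices (the devices
            # with the smallest hop count from the notifying device).
            return topo.connected_access_devices(device_id)
        # Uplink port of an access device: no notification is sent;
        # back-pressure towards the source hosts is used instead.
        return []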
In the network system shown in fig. 2, the port attribute may be intra-group port or inter-group port. Ports connecting switches within the same switch group are referred to as intra-group ports, for example, the ports connecting switch 11 and switch 12. Ports connecting switches in different switch groups are referred to as inter-group ports, for example, the ports connecting switch 1N and switch 2N. The role of a network device may be intra-group switch or inter-group switch: switches belonging to the same switch group are intra-group switches of each other, and switches belonging to different switch groups are inter-group switches of each other. For example, the switches 11 and 12 in pod 1 are intra-group switches of each other, and the switch 1N in pod 1 is an inter-group switch with respect to the switch 2N in pod 2. In the network system shown in fig. 2, the near-end hosts are the hosts under the switch to which the target port is directly connected. For example, in fig. 11 the near-end hosts under port 3 of switch 3N are the hosts connected to switch 33, and in fig. 12 the near-end hosts under port 2 of switch 1N are the hosts connected to switch 2N.
Before step 302, the first network device may further determine whether there is an idle egress port on the first network device capable of forwarding the target data stream; step 302 is performed when there is no idle egress port, and when there is an idle egress port, the first network device forwards the target data stream through that idle egress port.
The target data stream is a data stream corresponding to a target address range, where the target address range is the address range corresponding to the hosts under the target port, and the target address range is determined according to the information of the network device where the target port is located and the information of the target port. When the first network device has determined only a target port that has entered a pre-congestion state or a congestion state, the target data flow includes the data flows to the hosts under the target port. When the first network device has also determined an egress port queue that has entered a pre-congestion state or a congestion state, the target data flow includes the data flows that are addressed to the hosts under the target port and whose priority corresponds to the identifier of that egress port queue. Alternatively, the target data flow may be an elephant flow among the data flows addressed to the hosts under the target port, or an elephant flow among the data flows addressed to the hosts under the target port and having a priority corresponding to the identifier of the egress port queue that has entered the congestion state or the pre-congestion state. An elephant flow is a data flow whose rate (total number of bytes per unit time) exceeds a set threshold.
The packets in a data stream carry priorities. When a network device forwards data streams, data streams with the same priority are scheduled to the same egress port queue, so that packets with different priorities enter different egress port queues of an egress port; the priority of a packet therefore has a correspondence with the identifier of an egress port queue. When all network devices in the network system forward data streams using the same scheduling rule, one network device can know, from the priority of a received data stream, the identifier of the egress port queue that the data stream maps to on another network device.
When the target port is a downlink port under the Clos architecture shown in fig. 1, a data flow corresponding to the target address range means a data flow whose destination address belongs to the target address range. When the target port is an uplink port under the Clos architecture shown in fig. 1, a data flow corresponding to the target address range means a data flow whose destination address does not belong to the target address range. When the target port is an intra-group port or an inter-group port under the architecture shown in fig. 2, a data flow corresponding to the target address range means a data flow whose destination address belongs to the target address range.
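The determination of the target data stream described above can be summarised in the following sketch. The helper for mapping a priority to an egress port queue and the use of a simple address container are assumptions; they only illustrate the matching rules stated in the text.

    # Illustrative classification of a received data stream as the target
    # data stream. The fixed priority-to-queue mapping is an assumption.

    def queue_for_priority(priority):
        # Assumes every device uses the same scheduling rule, mapping
        # priority p to egress port queue "Qp" (e.g. priority 3 -> "Q3").
        return "Q" + str(priority)


    def is_target_stream(destination, priority, target_address_range,
                         port_attribute, target_queue_id=None):
        """Return True if the stream must be handled as the target data stream."""
        in_range = destination in target_address_range

        # Downlink port (fig. 1) or intra-group / inter-group port (fig. 2):
        # the destination address must belong to the target address range.
        # Uplink port (fig. 1): the destination address must NOT belong to it.
        matches = (not in_range) if port_attribute == "uplink" else in_range

        # If the notification carried a target egress port queue, only the
        # streams whose priority maps to that queue are affected.
        if target_queue_id is not None:
            matches = matches and queue_for_priority(priority) == target_queue_id
        return matches


    # Fig. 4 example: a stream to host H7 (inside the target address range)
    # with priority 3, for a notification about queue Q3 of a downlink port.
    assert is_target_stream("H7", 3, {"H7", "H8"}, "downlink", "Q3")
    # Fig. 8 example: for an uplink target port, only streams leaving the
    # target address range are affected.
    assert is_target_stream("H7", 3, {"H1", "H2"}, "uplink", "Q3")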
In step 303, the second network device receives the notification B.
The second network device is any one of the at least one second network device. Optionally, after receiving the notification B, the second network device stores the information of the target port carried in the notification B and the information of the network device where the target port is located. The second network device may also store the state of the target port. For example, the second network device sets a first table for storing information of ports that have entered the pre-congestion state or the congestion state, and each entry of the first table includes information of a target port and information of the network device where the target port is located. For another example, the second network device sets a second table, where each entry of the second table includes information of a target port, information of the network device where the target port is located, and the state of the target port. Further, the second network device may set an aging time for the information of each target port and delete the information of the target port after the aging time is reached.
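A minimal sketch of the first table (or second table) and the aging mechanism described above is given below; the entry layout, the timer handling and the aging time value are assumptions made for this sketch.

    # Hypothetical congestion-information table with per-entry aging.
    import time


    class CongestionTable:
        def __init__(self, aging_time=5.0):   # aging time in seconds, illustrative
            self.aging_time = aging_time
            self.entries = {}  # (device_id, port_id) -> {"state": ..., "expires": ...}

        def add(self, device_id, port_id, state=None):
            # One entry per target port: the device where the port is located,
            # the port itself and, optionally, its state (second table of the text).
            self.entries[(device_id, port_id)] = {
                "state": state,
                "expires": time.monotonic() + self.aging_time,
            }

        def _purge(self):
            # Entries whose aging time has been reached are deleted.
            now = time.monotonic()
            self.entries = {key: value for key, value in self.entries.items()
                            if value["expires"] > now}

        def lookup(self, device_id, port_id):
            self._purge()
            return self.entries.get((device_id, port_id))


    table = CongestionTable()
    table.add("C2", "P4", state="pre-congestion")
    assert table.lookup("C2", "P4")["state"] == "pre-congestion"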
In step 304, the second network device determines a target data stream.
Since the second network device receives notification B, the second network device is not the network device where the target port is located.
In one implementation, the second network device determines a target address range according to the information of the target port in the notification B and the information of the network device where the target port is located, stores the target address range, and determines a subsequently received data stream whose destination address belongs to the target address range as a target data stream. For example, the second network device obtains the destination address of a received data stream; if the destination address belongs to the target address range, or if the destination address belongs to the target address range and the priority of the data stream corresponds to the identifier of the target egress port queue, the second network device determines the data stream as the target data stream. The target address range is the address range corresponding to the hosts under the target port, and the first forwarding path of the target data stream (i.e., the initial forwarding path before the notification B is received) includes the target port.
In step 305, the second network device determines whether there is an idle egress port on the second network device capable of forwarding the target data stream, obtains a determination result, and processes the target data stream according to the determination result.
The idle egress port is another egress port on the second network device that has not entered a congestion state or a pre-congestion state and that is different from the current egress port of the target data stream. The port buffer usage of the idle egress port does not exceed the first threshold, or no egress port queue in the idle egress port has a length exceeding the second threshold.
For example, in the Clos architecture shown in fig. 4, when the first network device is the core device C2, the target port is the downlink port 4, and the second network device is the aggregation device A1, the target address range determined by the aggregation device A1 is the address range corresponding to the hosts connected to the access devices T7 and T8. When the aggregation device A1 receives a data stream whose destination address belongs to the target address range, the aggregation device A1 determines whether an idle egress port exists among the uplink ports of the aggregation device A1, where the forwarding path on which the idle egress port is located does not include the downlink port 4 of the core device C2.
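The check in step 305 can be sketched as follows; the per-port counters and the candidate-port structure are assumptions for this sketch, and the first and second thresholds are the port buffer threshold and queue buffer threshold introduced in step 301. In the example of fig. 4, the candidate ports would be the uplink ports of aggregation device A1 whose forwarding paths do not include downlink port 4 of core device C2.

    # Illustrative check for an idle egress port on the second network device.
    # candidate_ports maps a port identifier to its buffer counters; it is
    # assumed to contain only ports whose forwarding paths avoid the target port.

    def is_idle(port, first_threshold=0.7, second_threshold=0.7):
        """A port is idle when its buffer usage does not exceed the first
        threshold and none of its egress port queues exceeds the second."""
        if port["buffer_used"] / port["buffer_size"] > first_threshold:
            return False
        return all(length / capacity <= second_threshold
                   for length, capacity in port["queues"])


    def find_idle_egress_port(candidate_ports, current_port_id):
        """Return an idle egress port other than the current egress port of
        the target data stream, or None when there is none (step 307)."""
        for port_id, port in candidate_ports.items():
            if port_id == current_port_id:
                continue
            if is_idle(port):
                return port_id
        return None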
The second network device processes the target data stream according to the determination result, including step 306 and step 307.
In step 306, when an idle egress port exists on the second network device, the second network device sends the target data stream through the idle egress port.
In this application, the forwarding path on which the idle egress port determined by the second network device for the target data stream is located is referred to as the second forwarding path of the target data stream, and the second forwarding path does not include the target port.
In step 307, when no idle egress port exists on the second network device, the second network device forwards the target data stream over its initial forwarding path (i.e., the first forwarding path), that is, without changing the egress port of the target data stream on the second network device.
Further, since there is no idle egress port on the second network device, the second network device also informs at least one third network device, which is capable of sending data flows to hosts under the target port via at least two forwarding paths, of the pre-congestion state or congestion state of the target port. Optionally, the second network device generates a notification C according to the information of the network device where the target port is located and the information of the target port, and sends the notification C to the third network device. The at least one third network device may be preconfigured on the second network device, or may be determined by the second network device according to the information of the network device where the target port is located and the information of the target port.
With the method shown in fig. 3, when an egress port or an egress port queue of any network device in the network system shown in fig. 1 or fig. 2 enters a pre-congestion state or a congestion state, that network device can send a notification so that the network devices receiving the notification perform the processing for handling network congestion. This processing includes reselecting a forwarding path for the target data stream so as to avoid sending the target data stream to the egress port, and further includes sending a notification to other network devices to spread the information of the target port. The method shown in fig. 3 can therefore avoid network congestion. In addition, the method can achieve load balancing across the whole network and improve the utilization of network resources.
Different implementations of the various steps in the method shown in fig. 3 are described below in connection with fig. 4-12.
Fig. 4 is a schematic diagram of the processing procedure when the target port is a downlink port of a core device in the multi-plane Clos architecture shown in fig. 1. As shown in fig. 4, the thin solid line indicates the link where the target port is located, and the thick solid lines indicate the forwarding paths of the notification. A data stream (denoted as data stream 1) from host H2 to host H7 arrives at core device C2 via access device T2 and aggregation device A1, and core device C2 forwards data stream 1 to aggregation device A7 via egress port queue 3 (Q3) of port 4 (P4). During the forwarding of data stream 1, core device C2 monitors that the length of egress port queue 3 exceeds the second threshold, determines that egress port queue 3 has entered the pre-congestion state, and further determines that port 4 is the target port (step 301).
Core device C2 first checks whether there is another idle egress port on core device C2 that can reach host H7, and when there is none, core device C2 sends a multicast notification to the aggregation devices other than aggregation device A7 connected to port 4 (step 302). In the multi-plane scenario, these aggregation devices and core device C2 belong to the same forwarding plane. In fig. 4, if core device C2 sends the notification in a multicast manner, core device C2 determines a target multicast group corresponding to port 4, where the multicast source of the target multicast group is core device C2 and the multicast ports are the ports connected to aggregation devices A1, A3 and A5, assumed to be port 1, port 2 and port 3. Core device C2 then sends the multicast notification over port 1, port 2 and port 3, where the multicast notification includes the identifier of core device C2 (C2) and the identifier of port 4 (P4); optionally, the multicast notification may also include one or more of the role of core device C2, the port attribute of port 4 (downlink port) and the identifier of egress port queue 3 (Q3). In addition, core device C2 may also store the congestion information of port 4. The multicast notification arrives at aggregation devices A1, A3 and A5. The processing procedure of the aggregation device is described below by taking aggregation device A1 as an example.
Aggregation device A1 receives the multicast notification sent by core device C2 (step 303). Optionally, aggregation device A1 obtains the congestion information in the multicast notification (for example, "C2, P4" or "C2, P4, Q3, downlink"), stores the congestion information, and sets an aging time. Aggregation device A1 determines the target data stream (step 304). In determining the target data stream, aggregation device A1 determines the address range (the target address range) of the hosts under port P4 of core device C2, and determines a data stream whose destination address belongs to the target address range, or a data stream whose destination address belongs to the target address range and whose priority corresponds to Q3, as the target data stream.
When determining the address range of the hosts corresponding to port P4 of core device C2, in one optional manner, since the target port P4 is a downlink port, aggregation device A1 determines the address range of all hosts connected under aggregation device A7, which is the aggregation device connected to port P4.
In one embodiment, addresses may be assigned to network devices and hosts according to the network architecture. For example, each network device in fig. 1 is assigned a number, and the number is the identifier of the network device. As shown in fig. 5, the number next to each block representing a network device is a specific implementation of that device's identifier; for example, 10 may be the value of the identifier of C2. Each combination of a network device identifier and a downlink port identifier can uniquely identify a device at the next lower layer. For example, the combination of core device 10 and port 00 may identify aggregation device A1 (000), and the combination of port 1111 with the identifier (00) of the pod where aggregation device A1 (000) is located may identify access device T2 (1111). The address of a host includes the port of the access device to which the host is connected and the identifier of that access device under the aggregation device. According to the above addressing rule, the address of host H2 may be xx.xx.001111.1110.
Based on the addressing rule shown in fig. 5, if the network device identifier carried in the multicast notification received by aggregation device A1 is 10 and the port identifier is 11, the host address range determined by aggregation device A1 from the multicast notification is the set of addresses whose low-order bits 5 to 10 fall between 110000 and 111111, and the determined priority is the priority corresponding to Q3, for example 3. Aggregation device A1 determines a received data stream whose destination address falls within this host address range and whose priority is 3 as the target data stream.
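Purely for illustration, the sketch below assumes the address layout suggested by the example above: the four low-order bits of a host address identify the port of the access device, the next four bits identify the access device within the pod, and bits 9 and 10 identify the pod. Under this assumption, the target address range announced by core device 10 for port 11 is simply the set of addresses whose pod field equals 11. The layout itself is an assumption of this sketch, not a limitation of the application.

    # Assumed layout (low-order end first): 4 bits host port, 4 bits access
    # device, 2 bits pod identifier. This layout is illustrative only.
    HOST_PORT_BITS = 4
    ACCESS_DEVICE_BITS = 4
    POD_BITS = 2


    def pod_field(address):
        """Extract the pod identifier (bits 9-10) from a host address."""
        return (address >> (HOST_PORT_BITS + ACCESS_DEVICE_BITS)) & ((1 << POD_BITS) - 1)


    def in_target_range(address, congested_port_id):
        """A downlink port of a core device leads to exactly one pod, so the
        target address range is the set of addresses whose pod field equals
        the identifier of the congested port (port identifier 11 -> pod 0b11)."""
        return pod_field(address) == congested_port_id


    # Host H2 (address ending in 00 1111 1110) lies outside the announced
    # range, while a hypothetical host in pod 0b11 lies inside it.
    h2 = 0b00_1111_1110
    host_in_pod_3 = 0b11_0011_0101
    assert not in_target_range(h2, 0b11)
    assert in_target_range(host_in_pod_3, 0b11)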
In another optional manner of determining the address range of all hosts connected under port P4 of core device C2, aggregation device A1 determines the address range by means of a table lookup. For example, three tables are stored on each network device: the first table stores the correspondence among a core device, the ports of the core device and the aggregation devices; the second table stores the connection relationship among an aggregation device, the ports of the aggregation device and the access devices; and the third table stores the connection relationship between an access device and the host addresses. After receiving the multicast notification, aggregation device A1 determines from the device identifier (C2) that the role of the network device is core device, looks up the first table according to C2 and P4 to obtain aggregation device A7, then looks up access devices T7 and T8 in the second table according to aggregation device A7, and finally looks up the addresses of the hosts connected to access devices T7 and T8 in the third table, generating a host address list corresponding to the congestion information. Alternatively, the three tables may be combined into one table that stores the correspondence among core devices, aggregation devices, access devices and host addresses.
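The table-lookup alternative can be illustrated with the following sketch, in which dictionaries stand in for the three tables and only the entries needed for the fig. 4 scenario are shown; the table contents and host addresses are illustrative assumptions.

    # Hypothetical contents of the three lookup tables, reduced to the
    # entries needed for the scenario of fig. 4.

    # First table: (core device, port of the core device) -> aggregation device
    CORE_PORT_TO_AGGREGATION = {("C2", "P4"): "A7"}

    # Second table: aggregation device -> access devices connected to it
    AGGREGATION_TO_ACCESS = {"A7": ["T7", "T8"]}

    # Third table: access device -> addresses of the hosts connected to it
    ACCESS_TO_HOSTS = {"T7": ["H7"], "T8": ["H8"]}


    def hosts_behind(core_device, core_port):
        """Chain the three tables to resolve the host address list (the
        target address range) for a congested downlink port of a core device."""
        aggregation_device = CORE_PORT_TO_AGGREGATION[(core_device, core_port)]
        hosts = []
        for access_device in AGGREGATION_TO_ACCESS[aggregation_device]:
            hosts.extend(ACCESS_TO_HOSTS[access_device])
        return hosts


    # Aggregation device A1 resolves the congestion information "C2, P4" into
    # the addresses of the hosts under access devices T7 and T8.
    assert hosts_behind("C2", "P4") == ["H7", "H8"]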
After determining the target data flow (assumed to be data flow 1), aggregation device A1 determines whether an idle uplink egress port exists on aggregation device A1 (because the target port P4 is a downlink port of the core device and a downlink port of the core device corresponds to an uplink port of the aggregation device, aggregation device A1 needs to determine whether an idle uplink port exists) (step 305). When an idle uplink egress port exists, aggregation device A1 uses the idle uplink egress port as the egress port of the target data flow and forwards the target data flow through it (step 306). When there is no idle uplink egress port, aggregation device A1 continues to forward the target data flow through the initial forwarding path corresponding to the target data flow (step 307).
Before the congestion information ages out, the aggregation device A1 may process any received data flow according to the above method.
In addition, after aggregation device A1 performs step 307, the congestion information is further spread to the access devices. That is, aggregation device A1 also generates another notification and sends it to access devices T1 and T2 (step 302). This notification includes the congestion information. After receiving the notification, access devices T1 and T2 perform the corresponding processing. The processing procedure of the access device is described below by taking access device T2 as an example.
When access device T2 receives the notification (step 303), access device T2 obtains the congestion information in the notification, stores the congestion information, and sets an aging time, similarly to aggregation device A1. Access device T2 determines a target address range according to the congestion information and determines a target data stream according to the target address range (step 304), determines whether an idle egress port capable of forwarding the target data stream exists on access device T2 (step 305), forwards the target data stream through the idle egress port if it exists (step 306), and forwards the target data stream through the initial forwarding path of the target data stream if it does not exist (step 307). In addition, access device T2 determines the source host of the target data flow and sends a backpressure message to the source host, where the backpressure message is used to inform the source host to perform an operation of avoiding network congestion. The operation of avoiding network congestion may be reducing the rate at which data is sent to access device T2, or reducing the rate at which the target data stream is sent to access device T2. The manner in which access device T2 determines and processes the target data stream is similar to that of aggregation device A1; for details not described here, refer to the description of the processing of aggregation device A1.
Through the above process, after an egress port enters the pre-congestion state or the congestion state, a core device in the Clos system can send congestion information to the aggregation devices, and an aggregation device can send the congestion information to the access devices. Each network device that receives the congestion information performs the operation of handling network congestion, so network congestion can be avoided and the bandwidth utilization of the whole Clos system can be improved.
Fig. 6 is a schematic diagram of the processing procedure when the target port is a downlink port of an aggregation device in the multi-plane Clos architecture shown in fig. 1, where the thin solid line indicates the link where the target port is located and the thick solid lines indicate the forwarding paths of the notification. As shown in fig. 6, assume that host H2 sends data stream 1 to host H7 and data stream 1 enters queue 3 of egress port 3 on aggregation device A7. Aggregation device A7 detects that the length of queue 3 of egress port 3 exceeds the second threshold and determines that queue 3 has entered the pre-congestion state, so egress port 3 is the target port. There is no idle downlink port on aggregation device A7. Aggregation device A7 sends a notification to a plurality of second network devices (step 302). The plurality of second network devices are determined according to the port attribute (downlink port) of egress port 3 and the role (aggregation device) of aggregation device A7, and include all access devices except access device T7 to which egress port 3 is connected. The notification includes the identifier of aggregation device A7 (A7) and the identifier of egress port 3 (P3). Optionally, the notification may further include one or more of the role of aggregation device A7 (aggregation device), the attribute of egress port 3 (downlink port), and the identifier of queue 3 (Q3). The notification may be sent in a unicast or multicast manner.
The notification sent by the aggregation device A7 to the access device T8 may reach the access device T8 directly, and the notifications sent to the access devices T1-T6 reach the core devices C1 and C2 belonging to the same forwarding plane as the aggregation device A7.
Since core devices C1 and C2 cannot send data streams to the hosts under egress port 3 of aggregation device A7 through at least two forwarding paths, core devices C1 and C2 are not destinations of the notification; after receiving the notification, core devices C1 and C2 forward the notification through ports other than the port on which the notification was received (fig. 6 shows only the forwarding path of core device C2).
After being forwarded by core device C1 or C2, the notification arrives at aggregation devices A1, A3 and A5, which belong to the same forwarding plane as aggregation device A7. Since aggregation devices A1, A3 and A5 cannot send data streams to the hosts under egress port 3 of aggregation device A7 through at least two forwarding paths, aggregation devices A1, A3 and A5 are not destinations of the notification either, and they also forward the received notification. Taking aggregation device A1 as an example, after receiving the notification, aggregation device A1 replicates the notification and forwards it through its downlink ports, that is, sends it to the connected access devices T1 and T2.
In the scenario shown in fig. 6, since the destinations of the notification sent by aggregation device A7 are the access devices other than access device T7, the core devices and the aggregation devices only forward the notification after receiving it. After any of the access devices T1-T6 and T8 receives the notification, it performs steps 304-307 in the manner described in the above embodiments.
Through the above process, after an egress port enters the pre-congestion state or the congestion state, an aggregation device in the Clos system can notify the congestion information to all access devices other than the access device connected to that egress port. Each access device that receives the congestion information performs the operation of handling network congestion. This process can therefore avoid network congestion and improve the bandwidth utilization of the whole Clos system.
Fig. 7 is a schematic diagram of the processing procedure when the target port is a downlink port of an access device in the multi-plane Clos architecture. As shown in fig. 7, the thin solid line indicates the link where the target port is located, and the thick solid lines indicate the forwarding paths of the notification. Assume that host H2 sends data stream 1 to host H7, data stream 1 enters queue 3 of egress port 3 on access device T7, and access device T7 detects that the length of queue 3 of egress port 3 exceeds the second threshold, determines that queue 3 has entered the pre-congestion state, and further determines that port 3 is the target port. In addition, there is no other downlink port on access device T7 that can reach host H7. Access device T7 generates a notification including the identifier of access device T7 (T7) and the identifier of egress port 3 (P3). Further, the notification may also include one or more of the role of access device T7 (access device), the attribute of egress port 3 (downlink port), and the identifier of queue 3 (Q3). Access device T7 sends the notification to a plurality of second network devices, where the plurality of second network devices include all access devices except access device T7. Furthermore, since access device T7 is directly connected to host H7, access device T7 knows the address of host H7, so the notification may also include the address of host H7; an access device receiving the notification can then determine the target data stream directly from the address of host H7. The notification may be sent in a unicast or multicast manner.
Similarly to the process described for fig. 6, upon receiving the notification, an aggregation device or core device forwards the notification according to the destination address of the notification. Each access device, upon receiving the notification, performs operations similar to those of access device T2 in fig. 4.
In the scenarios shown in fig. 4, fig. 6 and fig. 7, the target ports are all downlink ports. In other embodiments, the target port may be an uplink port.
Fig. 8 is a schematic diagram of the processing procedure when the target port is an uplink port of an aggregation device in the multi-plane Clos architecture, where the thin solid line indicates the link where the target port is located and the thick solid lines indicate the forwarding paths of the notification. Taking data stream 1 sent from host H2 to host H7 as an example, during the forwarding of data stream 1, aggregation device A1 monitors that the length of egress port queue 3 (Q3) of port 1 (P1), through which data stream 1 is forwarded, exceeds the second threshold and determines that egress port queue 3 has entered the pre-congestion state, so egress port 1 is the target port. Aggregation device A1 checks whether there is another idle egress port (uplink port) on aggregation device A1 that can reach host H7. If there is, aggregation device A1 switches data stream 1 to the idle egress port and sends data stream 1 through it. When there is no other idle egress port that can reach host H7, aggregation device A1 sends a notification to the access devices connected to aggregation device A1 in a multicast or unicast manner (step 302), where the notification includes the identifier of aggregation device A1 (A1) and the identifier of port 1 (P1). Optionally, the notification may also include the role of aggregation device A1, the attribute of port 1 (uplink port) and the identifier of egress port queue 3 (Q3). In fig. 8, although aggregation devices A3, A5 and A7 can each send data streams to the hosts under the target port through at least two forwarding paths, they are not the devices with the smallest hop count from aggregation device A1, so aggregation device A1 sends the notification only to access devices T1 and T2 and does not send it to aggregation devices A3, A5 and A7 or to other access devices. The notification arrives at access devices T1 and T2. The processing of the access device is described below by taking access device T2 as an example.
After receiving the notification (step 303), the access device T2 obtains the congestion information in the notification, stores the congestion information, and sets an aging time. The access device T2 determines the target address range corresponding to the aggregation device A1, that is, the addresses of the hosts under all access devices connected to the aggregation device A1, and determines, as the target data stream, a data stream whose destination address does not belong to the target address range, or a data stream whose destination address does not belong to the target address range and whose priority corresponds to Q3 (step 304). In this embodiment, because the uplink port of the aggregation device A1 has entered the pre-congestion state and data streams sent to the hosts under the aggregation device A1 do not pass through that uplink port, the access device T2 determines a data stream sent to a host outside the management range of the aggregation device A1 as the target data stream. After determining the target data stream, the access device T2 determines whether there is an idle egress port (uplink port) on the access device T2 corresponding to the congestion information (step 305). If there is an idle egress port, the access device T2 forwards the target data stream through the idle egress port (step 306); if there is no idle egress port, the access device T2 sends the target data stream through the initial forwarding path of the target data stream (step 307). Further, the access device T2 determines the source host of the target data stream and sends a backpressure message to the source host, where the backpressure message is used to inform the source host to perform an operation of handling network congestion. The operation of handling network congestion may be to reduce the rate at which data is sent to the access device T2 or to reduce the rate at which the target data stream is sent to the access device T2.
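The receiving-side handling in steps 303 to 307 could be sketched as follows; the helper names (store_congestion_info, target_address_range_of, find_idle_uplink_port, send_backpressure) and the flow attributes are assumptions for illustration only.

```python
def handle_notification(access_device, notification, flows, aging_time_s=1.0):
    """Sketch of steps 303-307 on an access device such as T2 (names are illustrative)."""
    # Step 303: store the congestion information with an aging time.
    access_device.store_congestion_info(notification, aging_time=aging_time_s)

    # Step 304: hosts under the aggregation device that reported the congested uplink port.
    target_range = access_device.target_address_range_of(notification.device_id)
    for flow in flows:
        is_target = flow.destination not in target_range
        if notification.queue_id is not None:
            is_target = is_target and flow.priority_matches(notification.queue_id)
        if not is_target:
            continue

        # Steps 305-307: prefer an idle uplink port whose path avoids the target port.
        idle_port = access_device.find_idle_uplink_port(avoiding=notification)
        if idle_port is not None:
            access_device.forward(flow, via=idle_port)          # step 306
        else:
            access_device.forward(flow, via=flow.initial_path)  # step 307
            # One reading of the embodiment: ask the source host to slow down
            # (the backpressure message could also be sent in both branches).
            access_device.send_backpressure(flow.source_host)
```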
In another scenario, when the target port is an uplink port of the access device, the access device determines a data stream sent to the uplink port as the target data stream, and determines whether there is an idle egress port (uplink port) on the access device that can forward the target data stream. If such an idle egress port exists, the access device sends the target data stream through the idle egress port; if no such idle egress port exists, the access device determines the source host of the target data stream and sends a backpressure message to the source host, where the backpressure message is used to notify the source host to perform an operation of handling network congestion. It can be seen that, when the target port is an uplink port of the access device, the access device does not need to send a notification.
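For completeness, a minimal sketch of this purely local handling, assuming the same illustrative helpers as in the previous sketches:

```python
def handle_congested_access_uplink(access_device, uplink_port, flows):
    """Sketch: the target port is an uplink port of the access device itself; no notification is sent."""
    for flow in flows:
        if flow.egress_port != uplink_port:
            continue  # only streams sent to the congested uplink port are target data streams
        idle_port = access_device.find_idle_uplink_port(avoiding=uplink_port)
        if idle_port is not None:
            access_device.forward(flow, via=idle_port)
        else:
            access_device.send_backpressure(flow.source_host)  # fall back to slowing the source
```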
The method of the present application shown in fig. 3 can also be applied to a single-plane Clos architecture. In the single-plane Clos architecture, each core device is connected to all aggregation devices.
Fig. 9 is a schematic diagram of a processing procedure when the target port is a downlink port of the core device under the single-plane Clos architecture. As shown in fig. 9, the thin solid line indicates the link where the target port is located, and the thick solid line indicates a notified forwarding path. A data stream (denoted as data stream 1) sent from the host H2 to the host H7 arrives at the core device C1 via the access device T2 and the aggregation device A1, and the core device C1 forwards the data stream 1 to the aggregation device A7 via the egress port queue 3 (Q3) of the port 7 (P7). During the forwarding of the data stream 1, the core device C1 monitors that the length of the egress port queue 3 exceeds the second threshold, determines that the egress port queue 3 enters the pre-congestion state, and further determines that the port 7 is the target port (step 301). Because there is no idle egress port on the core device C1 with the same attribute as the port 7 (that is, no idle downlink egress port), the core device C1 sends a notification to all aggregation devices other than the aggregation device A7, where the notification includes the congestion information described with reference to fig. 4. After receiving the notification, an aggregation device (for example, A1) determines the target address range according to the notification (that is, the addresses of the hosts connected to the access devices T7 and T8), and, after receiving a data stream, determines the target data stream according to the target address range. The aggregation device then determines whether there is an idle egress port (uplink port) that can forward the target data stream. If there is an idle egress port, the aggregation device switches the target data stream to the idle egress port; if there is no idle egress port, the aggregation device forwards the data stream through the current egress port of the target data stream, regenerates a notification according to the congestion information, and sends the notification to all access devices connected to the aggregation device.
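A rough sketch of how an aggregation device such as A1 might relay the congestion information when it cannot reroute locally, again reusing the illustrative CongestionNotification structure and assumed helper names:

```python
def relay_congestion(aggregation_device, notification, flow):
    """Sketch: an aggregation device either reroutes the target data stream or diffuses the notification."""
    target_range = aggregation_device.target_address_range_of(
        notification.device_id, notification.port_id)  # hosts reached through the target port
    if flow.destination not in target_range:
        return  # not a target data stream

    idle_port = aggregation_device.find_idle_uplink_port(avoiding=notification)
    if idle_port is not None:
        aggregation_device.forward(flow, via=idle_port)   # path through another core device
    else:
        aggregation_device.forward(flow, via=flow.current_path)
        # Regenerate the notification and spread it one level further down.
        relayed = CongestionNotification(
            device_id=notification.device_id, port_id=notification.port_id,
            role=notification.role, port_attribute=notification.port_attribute,
            queue_id=notification.queue_id)
        for access_device in aggregation_device.connected_access_devices():
            aggregation_device.send_notification(relayed, to=access_device)
```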
After receiving the notification, an access device (for example, T2) determines the target data stream according to the congestion information. When there is an idle egress port (uplink port) that can forward the target data stream, the access device switches the target data stream to the idle egress port; when there is no idle egress port, the access device sends a backpressure message to the source host of the target data stream, where the backpressure message is used to notify the source host to perform an operation of handling network congestion.
Fig. 10 is a schematic diagram of a processing procedure when the target port is a downlink port of the aggregation device under the single-plane Clos architecture. As shown in fig. 10, the thin solid line indicates the link where the target port is located, and the thick solid line indicates a notified forwarding path. A data stream (denoted as data stream 1) sent from the host H2 to the host H7 arrives at the aggregation device A7 via the access device T2, the aggregation device A1 and the core device C1, and the aggregation device A7 forwards the data stream 1 to the access device T7 via the egress port queue 3 (Q3) of the port 1 (P1). In the process of forwarding the data stream 1, the aggregation device A7 monitors that the length of the egress port queue 3 exceeds the second threshold, determines that the egress port queue 3 enters the pre-congestion state, and further determines that the port 1 is the target port (step 301). Because there is no idle egress port on the aggregation device A7 with the same attribute as the port 1 (that is, no idle downlink egress port), the aggregation device A7 sends a notification to all core devices and to the other access devices (for example, the access device T8) connected to the aggregation device A7, where the notification includes the congestion information (for the congestion information, refer to the foregoing embodiments). In this case, the core devices C1 and C2 can also send data streams to the host under the port 1 of the aggregation device A7 through at least two forwarding paths, and the core devices C1 and C2 are only one hop away from the aggregation device A7, so the aggregation device A7 sends the notification to the core devices C1 and C2 and to the access device T8.
After receiving the notification, a core device (for example, C1) determines the target data stream according to the congestion information. If there is an idle downlink egress port on the core device that can forward the target data stream, the core device sends the target data stream through the idle downlink egress port. If there is no such idle downlink egress port on the core device, the core device sends a notification to the aggregation devices other than the aggregation device A7, where the notification includes the congestion information.
After receiving the notification sent by the core device, any aggregation device performs the same operation as the aggregation device A1 in fig. 9.
After receiving the notification, any access device in fig. 10 performs the same operation as the access device T2 in fig. 9.
The processing procedure when the target port is a downlink port of the access device under the single-plane Clos architecture is similar to the processing procedure when the target port is a downlink port of the access device under the multi-plane Clos architecture. Likewise, the processing method when the target port is an uplink port under the single-plane Clos architecture is similar to the processing method when the target port is an uplink port under the multi-plane Clos architecture.
The method of fig. 3 of the present application may also be applied to the network architecture of fig. 2. In the network architecture shown in fig. 2, the identifier of each switch may be the number of the switch. For example, in the number xy of a switch, x represents the number of the pod where the switch is located, and y represents the number of the switch within that pod; switch 11 is switch number 1 within pod 1. In this way, a first switch can learn the role of a second switch from the number of the second switch, and can also learn the attribute of a port of the second switch.
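Purely as an illustration of this numbering convention (the two-character format and the helper below are assumptions, not part of the application), a receiving switch might derive the pod and in-pod index like this:

```python
def parse_switch_number(number: str):
    """Sketch: split a switch number 'xy' into pod number x and in-pod number y.

    Assumes the illustrative two-character form used in the example above (e.g. '11', '3N');
    the role of the switch and the attributes of its ports could then be inferred from the
    in-pod number under a deployment-specific convention.
    """
    pod = number[0]           # x: the pod where the switch is located
    index_in_pod = number[1]  # y: the switch's number within that pod
    return pod, index_in_pod

pod, idx = parse_switch_number("3N")  # pod '3', switch 'N' within pod 3
```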
Fig. 11 is a schematic diagram of a processing procedure when the target port is an intra-group port in the architecture shown in fig. 2. Assume that, in the process of sending the data stream 1 to the switch 33, the switch 3N monitors that the length of the egress port queue 3 of the port 3 exceeds the second threshold, determines that the egress port queue 3 enters the pre-congestion state, and further determines that the port 3 is the target port (step 301). The switch 3N sends a notification to the plurality of second network devices, where the notification includes an identifier of the switch 3N and an identifier of the port 3 (step 302). Alternatively, the identifier of the switch 3N may be obtained by parsing the identifier of the port 3, in which case the identifier of the switch 3N and the identifier of the port 3 may share a single field. The notification may also include an identifier of the egress port queue 3. When the identifier of a switch in the network architecture shown in fig. 2 takes another form, the notification may also include the attribute of the port 3 and the role of the switch 3N (inter-group switch). The plurality of second network devices includes the inter-group switches connected to the switch 3N, that is, the switches 1N, 2N and 4N, each of which is only one hop away from the switch 3N. The switch 3N sends the notification to the switches 1N, 2N and 4N in a multicast or unicast manner. The following describes how the switches 1N, 2N and 4N process the notification, taking the switch 1N as an example.
After receiving the notification (step 303), the switch 1N obtains the congestion information in the notification, stores the congestion information, and sets an aging time. The switch 1N determines the target data stream according to the congestion information (step 304), where the target data stream is a data stream addressed to a host connected to the switch 3N, or a data stream that is addressed to a host connected to the switch 3N and whose priority corresponds to the egress port queue 3. The switch 1N determines whether there is an idle egress port on the switch 1N that can send the target data stream, that is, an idle inter-group port (step 305). If there is an idle egress port, the switch 1N forwards the target data stream through the idle egress port (step 306); if there is no idle egress port, the switch 1N sends the target data stream through the initial forwarding path of the target data stream (step 307). The switch 1N also sends a notification to the other switches in the same switch group based on the congestion information. The switches 11, 12 and 13 receive the notification and perform a process similar to that of the access devices under the Clos architecture.
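One possible way to implement the stored congestion information with an aging time, sketched under assumed names (the expiry interval and the store structure are illustrative):

```python
import time

class CongestionStore:
    """Sketch: keep received congestion information only until its aging time expires."""

    def __init__(self):
        self._entries = {}  # (device_id, port_id) -> expiry timestamp

    def store(self, notification, aging_time_s: float = 1.0):
        key = (notification.device_id, notification.port_id)
        self._entries[key] = time.monotonic() + aging_time_s

    def is_congested(self, device_id: str, port_id: str) -> bool:
        """True while the entry has not aged out; expired entries are dropped lazily."""
        key = (device_id, port_id)
        expiry = self._entries.get(key)
        if expiry is None:
            return False
        if time.monotonic() >= expiry:
            del self._entries[key]  # aged out: the port is treated as normal again
            return False
        return True
```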
Under the architecture shown in fig. 2, the hosts may be assigned addresses according to the network architecture. That is, the address of each host may be determined according to the number of the switch to which the host is connected; for example, the address of a host connected under the switch 1N is 1N.xxx.xxx. According to this addressing rule, when the intra-group port between the switch 3N and the switch 33 is the target port, the target data stream is a data stream whose destination address is 33.xxx.xxx.
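Under this assumed addressing rule, classifying target data streams reduces to a prefix match on the destination address; the following sketch (with an illustrative dotted-string address format) shows the idea:

```python
def is_target_stream(destination_address: str, congested_switch_number: str) -> bool:
    """Sketch: with addresses of the form '<switch-number>.xxx.xxx', a stream is a target
    data stream when its destination lies under the switch behind the target port."""
    return destination_address.split(".", 1)[0] == congested_switch_number

# Example: the intra-group port between switch 3N and switch 33 is the target port.
is_target_stream("33.17.5", "33")  # True: forwarded toward hosts under switch 33
is_target_stream("1N.17.5", "33")  # False: not affected by the target port
```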
In the network architecture shown in fig. 2, when the target port is a port of the switch connected to the host, the processing procedure of the switch is similar to that when the target port is an intra-group port.
Fig. 12 is a schematic diagram of a processing procedure when the target port is an inter-group port in the architecture shown in fig. 2. Assume that, in the process of sending the data stream 1 to the switch 2N, the switch 1N monitors that the length of the egress port queue 3 of the port 2 of the switch 1N exceeds the second threshold, that is, the egress port queue 3 enters the pre-congestion state, and the switch 1N therefore determines the port 2 as the target port (step 301). The hosts under the target port are the hosts connected to the switch 2N. The switch 1N sends a notification to the plurality of second network devices (step 302), where the plurality of second network devices includes the intra-group switches connected to the switch 1N, that is, the switches 11, 12, 13, and so on. The notification includes an identifier (1N) of the switch 1N and an identifier (P2) of the port 2, and may also include an identifier of the egress port queue 3 (Q3). When the identifier of a switch in the system shown in fig. 2 takes another form, the notification may also include the attribute of the port 2 and the role of the switch 1N (intra-group switch). The switch 1N sends the notification to the switches 11, 12, 13, and so on, in a multicast or unicast manner. The switches 11, 12, 13, and so on, receive the notification and perform a process similar to that of the access devices under the Clos architecture.
As can be seen from the description of the foregoing embodiments, the method provided in fig. 3 of the present application can issue a notification to other network devices in the network after detecting that an egress port or an egress port queue enters a congestion state or a pre-congestion state. A network device that receives the notification selects an idle egress port for the target data stream, or continues to diffuse the state of the egress port or the egress port queue through the network, so that the network devices of the whole network can perform operations of handling network congestion, and network congestion can be avoided under various network architectures. In addition, a network device in the present application can forward the target data stream through an idle egress port after receiving the notification, which achieves end-to-end load balancing in the whole network and improves the utilization rate of network resources. Furthermore, when the target data stream is determined according to the egress port queue, the present application can adjust only the forwarding path of the data stream that causes the congestion, without affecting normal data streams, thereby further improving data stream forwarding efficiency.
Further, an embodiment of the present application provides a network device 1300, where the network device 1300 may be any network device in fig. 1 or fig. 2. As shown in fig. 13, the network device 1300 includes a determining unit 1310 and a sending unit 1320; optionally, the network device 1300 further includes a receiving unit 1330 and a storage unit 1340. The network device 1300 is configured to implement the functions of the first network device in fig. 3.
A determining unit 1310, configured to determine a target port, where the target port is an output port that enters a pre-congestion state or a congestion state. A sending unit 1320, configured to send a first notification to at least one second network device, where the at least one second network device includes one or more network devices capable of sending a data stream to a host under the target port through at least two forwarding paths, and the first notification includes information of the network device where the target port is located and information of the target port.
Optionally, the network device where the target port is located is the first network device, and the determining unit is configured to monitor an output port of the first network device, and determine that the output port is the target port when a cache usage amount of one output port of the first network device exceeds a port cache threshold.
Optionally, the network device where the target port is located is the first network device, and the determining unit is configured to monitor an egress port queue of the first network device, and determine that the egress port where the egress port queue is located is the target port when a length of the egress port queue exceeds a queue buffer threshold.
Optionally, the network device where the target port is located is a third network device, the receiving unit 1330 is configured to receive a second notification sent by the third network device, where the second notification includes information of the third network device and information of the target port, and the determining unit determines the target port according to the second notification.
Optionally, the information of the network device where the target port is located includes an identifier of the network device where the target port is located, and the information of the target port includes an identifier of the target port or an identifier of a forwarding path where the target port is located.
Optionally, the information of the network device where the target port is located further includes a role of the network device where the target port is located, where the role indicates a location of the network device where the target port is located, and the information of the target port further includes an attribute of the target port, where the attribute indicates a direction in which the target port sends the data stream.
Optionally, the determining unit is further configured to determine that there is no idle egress port on the network device that can forward the target data stream corresponding to the target port. The target data stream is a data stream corresponding to a target address range, where the target address range is an address range corresponding to the hosts under the target port, and the target address range is determined according to the information of the network device where the target port is located and the information of the target port.
Optionally, the information of the target port may further include an identifier of a target egress port queue, where the target egress port queue is an egress port queue in the target port that enters a congestion state or a pre-congestion state, and the target data stream is a data stream that corresponds to the target address range and whose priority corresponds to the identifier of the egress port queue.
Optionally, the storage unit 1340 is configured to store information of a network device where the target port is located and information of the target port. The storage unit 1340 is also used for storing the status of the target port.
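As a structural illustration only (the class and method names below are assumptions, not definitions from the application), the unit decomposition of the network device 1300 described above could be modelled as follows:

```python
class NetworkDevice1300:
    """Sketch of the unit decomposition described above for the first network device."""

    def __init__(self, determining_unit, sending_unit, receiving_unit=None, storage_unit=None):
        self.determining_unit = determining_unit  # unit 1310: finds the target port
        self.sending_unit = sending_unit          # unit 1320: sends the first notification
        self.receiving_unit = receiving_unit      # optional unit 1330: receives a second notification
        self.storage_unit = storage_unit          # optional unit 1340: stores port information and state

    def on_congestion(self, second_notification=None):
        # The target port is found locally, or derived from a second notification if one was received.
        if second_notification is not None and self.receiving_unit is not None:
            target = self.determining_unit.from_notification(second_notification)
        else:
            target = self.determining_unit.monitor_egress_ports()
        if target is None:
            return
        if self.storage_unit is not None:
            self.storage_unit.store(target)       # information and state of the target port
        self.sending_unit.send_first_notification(target)
```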
Further, an embodiment of the present application provides a network device 1400, where the network device 1400 may be any network device in fig. 1 or fig. 2. As shown in fig. 14, the network device 1400 includes a receiving unit 1410, a first determining unit 1420, a second determining unit 1430 and a processing unit 1440. Optionally, the network device 1400 further includes a storage unit 1450. The network device 1400 is configured to implement the functions of the second network device in fig. 3.
A receiving unit 1410, configured to receive a first notification from a first network device, where the first notification includes information of a network device where a target port is located and information of the target port, where the target port is a port that enters a pre-congestion state or a congestion state, and the second network device is a network device capable of sending a data stream to a host under the target port through at least two forwarding paths. A first determining unit 1420, configured to determine a target data flow, where a first forwarding path of the target data flow includes the target port. And a second determining unit 1430, configured to determine whether an idle output port capable of forwarding the target data stream exists on the second network device, to obtain a determination result. And a processing unit 1440, configured to process the target data stream according to the determination result.
Optionally, when there is an idle egress port on the network device capable of forwarding the target data stream, the processing unit 1440 sends the target data stream through the idle egress port, where the second forwarding path where the idle egress port is located does not include the target port.
Optionally, when there is no free egress port on the network device capable of forwarding the target data stream, the processing unit 1440 forwards the target data stream through the first forwarding path.
Optionally, the processing unit 1440 is further configured to generate a second notification, where the second notification includes information of the network device where the target port is located and information of the target port, and send the second notification to at least one third network device, where the at least one third network device includes one or more network devices capable of sending a data stream to a host under the target port through at least two forwarding paths.
Optionally, the processing unit 1440 is further configured to send a backpressure message to the source host of the target data stream, where the backpressure message is used to cause the source host to perform an operation for handling network congestion.
Optionally, the first determining unit 1420 is configured to determine a target address range according to information of a network device where the target port is located and information of the target port, where the target address range is an address range corresponding to a host under the target port, and determine a data stream with a destination address belonging to the target address range as the target data stream.
Optionally, the first notification further includes an identifier of a target egress port queue, where the target egress port queue is an egress port queue in the target port that enters a pre-congestion state or a congestion state, and the first determining unit 1420 is configured to determine, as the target data stream, a data stream whose destination address belongs to the target address range and whose priority corresponds to the identifier of the egress port queue.
Optionally, the storage unit 1450 is configured to store information of a network device where the target port is located and information of the target port. The storage unit 1450 is further configured to store a state of the target port.
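Similarly, a minimal sketch of the unit pipeline of the network device 1400 (again with assumed class and method names):

```python
class NetworkDevice1400:
    """Sketch of the unit decomposition described above for the second network device."""

    def __init__(self, receiving_unit, first_determining_unit, second_determining_unit,
                 processing_unit, storage_unit=None):
        self.receiving_unit = receiving_unit                     # unit 1410
        self.first_determining_unit = first_determining_unit    # unit 1420: finds the target data stream
        self.second_determining_unit = second_determining_unit  # unit 1430: looks for an idle egress port
        self.processing_unit = processing_unit                   # unit 1440
        self.storage_unit = storage_unit                         # optional unit 1450

    def on_first_notification(self):
        notification = self.receiving_unit.receive_first_notification()
        if self.storage_unit is not None:
            self.storage_unit.store(notification)
        flow = self.first_determining_unit.determine_target_data_stream(notification)
        idle_port = self.second_determining_unit.find_idle_egress_port(flow)
        self.processing_unit.process(flow, idle_port)  # forward via idle port, or keep the first path
```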
The network devices of fig. 13 and fig. 14 cooperate with each other to implement the method shown in fig. 3, so as to avoid network congestion and implement load balancing of the entire network.
Further, the network devices of fig. 13 and fig. 14 may be implemented by a network device 1500 as shown in fig. 15. The network device 1500 may include a processor 1510, a memory 1520 and a bus system 1530. The processor 1510 and the memory 1520 are connected by the bus system 1530, the memory 1520 is configured to store program code, and the processor 1510 is configured to execute the program code stored in the memory 1520. For example, the processor 1510 may invoke the program code stored in the memory 1520 to perform the method of handling network congestion in the embodiments of the application. In the embodiments of the present application, the processor 1510 may be a central processing unit (CPU), or the processor 1510 may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor 1510 may include one or more processing cores. The memory 1520 may include a read-only memory (ROM) device or a random-access memory (RAM) device; any other suitable type of storage device may also be used as the memory 1520. The memory 1520 may include data 1522 that is accessed by the processor 1510 over the bus system 1530, and may further include an operating system 1523 to support the operation of the network device 1500. The bus system 1530 may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus; for clarity of illustration, however, the various buses are labeled as the bus system 1530 in the drawing. Optionally, the network device 1500 may also include one or more output devices, such as a communication interface 1540. The network device 1500 can communicate with other devices through the communication interface 1540, and the communication interface 1540 may be connected to the processor 1510 via the bus system 1530.

From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented in hardware, or by means of software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied in the form of a hardware product or a software product. The hardware product may be a dedicated chip. The software product may be stored on a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, or the like) and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform the methods according to the embodiments of the application.
The foregoing is merely a preferred embodiment of the application. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations shall also fall within the scope of the present application.