CN120111001B - Receiver-driven hybrid traffic transmission method and computing network - Google Patents
Receiver-driven hybrid traffic transmission method and computing network
- Publication number
- CN120111001B (application number CN202510580957.6A / CN202510580957A)
- Authority
- CN
- China
- Prior art keywords
- rate
- delay
- credit
- sending rate
- packet
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2458—Modification of priorities while in transit
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/125—Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/25—Flow control; Congestion control with rate being modified by the source upon detecting a change of network conditions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/28—Flow control; Congestion control in relation to timing considerations
- H04L47/283—Flow control; Congestion control in relation to timing considerations in response to processing delays, e.g. caused by jitter or round trip time [RTT]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/30—Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/32—Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/39—Credit based
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/62—Queue scheduling characterised by scheduling criteria
- H04L47/6295—Queue scheduling characterised by scheduling criteria using multiple queues, one for each individual QoS, connection, flow or priority
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The application relates to a receiver-driven hybrid traffic transmission method and a computing network. The method comprises: constructing a network topology structure of a computing center; classifying, at the receiving end, the sub-flows in the hybrid traffic by type and generating rate-based credit packets for each type of traffic; performing data transmission in a receiver-driven credit scheduling mode; applying a selective packet-loss and loss-recovery mechanism when unscheduled data packets arrive at a switch, allowing new flows to send unscheduled data packets at the link rate; actively limiting the sending rate of credit packets at the switch, ensuring that link capacity is fully utilized while excessive buffer occupancy is avoided; and adopting a delay-based congestion control mechanism during traffic transmission to achieve fair coexistence of heterogeneous traffic. The method performs well in scenarios where heterogeneous traffic coexists and effectively preserves high-performance characteristics such as low buffer occupancy and rapid convergence.
Description
Technical Field
The present application relates to the field of data transmission technologies, and in particular, to a receiver-driven hybrid traffic transmission method and a computing network.
Background
In recent years, applications such as scientific computing and large artificial intelligence models have iterated and innovated rapidly, data volumes have grown exponentially, and the demand for computing power keeps rising. According to International Data Corporation forecasts, global data volume is expected to reach 284 zettabytes (ZB) in the near future. Traditional systems that use a single supercomputer or data center as the computing and storage core are increasingly stretched thin when processing massive data, and the scarcity and high cost of computing resources constrain the further development of artificial intelligence.
The computing network aims to tightly integrate computing center nodes (such as high-performance computing systems and data centers) distributed in different areas, build a unified and efficient network architecture and provide high-quality service for data users in a large range and long distance. The strong network transmission capability is a key for realizing rapid and accurate scheduling of computing resources, and is also a basis for guaranteeing the application performance of the computing network. Therefore, optimizing data transmission efficiency is an important way to improve overall performance of a computing network.
However, as a single computing center is gradually expanded from a simple cluster to a wide area region consisting of multiple clusters, traffic heterogeneity within the computing network is also increasing. This heterogeneity mainly results from data transmissions over different distances (e.g., wide area networks and data center networks), different congestion control algorithm configurations of cloud users on virtual machines, and the gradual deployment of new transport protocols. This increasing heterogeneity may cause the computing network to enter a state where different transport protocol data streams coexist in a mix, as shown in fig. 1. Fairness issues arise when data flows of different transport protocols coexist in queuing at the same switch port.
In a computing network, fig. 1 illustrates heterogeneous transport protocol traffic generated by data transmissions based on different transport protocols. On the one hand, data transmission inside a computing center generally adopts a high-speed data center transmission protocol, so that quick and efficient data exchange is realized, and different services may correspond to different transmission protocols. For example, data transmission from server 4 to server 2 and data transmission from server 5 to server 3 may employ protocol 1 and protocol 2, respectively. On the other hand, when the traffic extends to multiple computing centers (e.g., from server 6 to server 1), the Wide Area Network (WAN) transmission protocol (i.e., protocol 3) assumes the task of trans-regional data transmission.
The introduction of different transport protocols exacerbates the unfairness of bandwidth allocation and may even lead to bandwidth starvation for some transport streams. Existing studies indicate that heterogeneous transport protocol (Heterogeneous Transport-Protocol, HTP) traffic has fairness issues. For example, ERA considers that sender-driven transport protocols tend to occupy more bandwidth than receiver-driven protocols, and Harmonia indicates that the root cause of unfair bandwidth allocation between transport protocols employing different congestion signals lies in the differences in their network congestion detection mechanisms. However, a comprehensive and thorough understanding of the root cause of this fairness problem is still lacking.
Disclosure of Invention
Based on this, it is necessary to provide a receiving-end driven hybrid traffic transmission method and a computing network in order to solve the above technical problems.
A receiver-driven hybrid traffic transmission method, the method comprising:
Constructing a network topology structure of the computing center.
Classifying, by the receiving end, the sub-flows in the hybrid traffic by type, and generating rate-based credit packets for each type of traffic.
Performing data transmission in a receiver-driven credit scheduling mode, in which data packets and credit packets are transmitted through independent queues, and data transmission is strictly performed according to one-to-one credit scheduling.
When an unscheduled data packet arrives at the switch, performing data packet transmission using a selective packet-loss and loss-recovery mechanism, allowing new flows to send unscheduled data packets at the link rate.
Actively limiting the sending rate of credit packets at the switch, ensuring that link capacity is fully utilized while excessive buffer occupancy is avoided.
Adopting a delay-based congestion control mechanism during traffic transmission to achieve fair coexistence of heterogeneous traffic.
In one embodiment, the selective packet loss and packet loss recovery mechanism includes:
When an unscheduled packet arrives at the switch, it is discarded if the buffer occupancy exceeds a preset threshold, and this discarding operation is not applicable to the scheduled packet, thereby ensuring that the buffer occupancy is not too high.
The discarded unscheduled data packet will be detected by the receiving end and guaranteed to be retransmitted as a scheduled data packet in the next round trip time RTT.
In one embodiment, the credit packet sending rate is limited to 5% of the switch port link capacity, and the minimum Ethernet frame size of a credit packet is 64 bytes.
In one embodiment, the delay-based congestion control mechanism includes:
The RTT is calculated through the loop formed by a credit packet and its data packet. The target delay is calculated from the current sending rate, the maximum sending rate, the minimum sending rate, the base delay, and the target scale factor, where the base delay controls the base portion of the target delay and the target scale factor controls the scaled portion. The rate of change of the sending rate is then determined from the target delay, the round trip time RTT, the maximum sending rate, the minimum sending rate, and the base delay.
If the round trip time RTT is less than or equal to the target delay, the network is not congested, and the new sending rate is determined from the sending rate change rate and the current sending rate.
If the round trip time RTT is greater than the target delay, congestion has occurred in the network; if deceleration is permitted, the new sending rate is determined from the sending rate change rate and the current sending rate.
In one embodiment, the target delay is:
$t_{target} = k + p \cdot \dfrac{\ln(r_{max}/r_{cur})}{\ln(r_{max}/r_{min})}$;
wherein $t_{target}$ represents the target delay, k represents the base delay parameter, p represents the scaling delay parameter, $r_{cur}$ represents the current sending rate, $r_{max}$ represents the maximum sending rate, and $r_{min}$ represents the minimum sending rate.
In one embodiment, the rate of change of the sending rate is:
$ratio = \left(\dfrac{r_{max}}{r_{min}}\right)^{m \cdot (t_{target} - RTT)/p}$;
wherein $ratio$ represents the rate of change of the sending rate, $t_{target}$ represents the target delay, RTT represents the round trip time, p represents the scaling delay parameter, $r_{max}$ represents the maximum sending rate, $r_{min}$ represents the minimum sending rate, and m represents the scaling rate parameter.
In one embodiment, the round trip time RTT is:
$RTT = t_{cur} - t_{stamp}$;
wherein $t_{cur}$ represents the current time and $t_{stamp}$ represents the timestamp carried in the data packet.
A computing network is also provided. The computing network is a data center network comprising a sending end, a receiving end, and data transmission links, and the computing network adopts any of the above receiver-driven hybrid traffic transmission methods to realize hybrid traffic transmission.
According to the above receiver-driven hybrid traffic transmission method and computing network, a network topology of the computing center is constructed; the receiving end classifies the sub-flows in the hybrid traffic by type and generates rate-based credit packets for each type of traffic; data transmission is performed in a receiver-driven credit scheduling mode, in which data packets and credit packets are transmitted through independent queues and data transmission strictly follows one-to-one credit scheduling; when unscheduled data packets arrive at a switch, a selective packet-loss and loss-recovery mechanism is applied, allowing new flows to send unscheduled data packets at the link rate; the sending rate of credit packets is actively limited at the switch, ensuring full utilization of link capacity while avoiding excessive buffer occupancy; and a delay-based congestion control mechanism is adopted during traffic transmission to achieve fair coexistence of heterogeneous traffic. The method performs well in scenarios where heterogeneous traffic coexists and effectively preserves high-performance characteristics such as low buffer occupancy and rapid convergence.
Drawings
FIG. 1 is a schematic diagram of a prior art computing network;
fig. 2 is a flow chart of a method for transmitting a mixed traffic driven by a receiving end in an embodiment;
FIG. 3 (a) is a schematic diagram illustrating the operation of FairHet in one embodiment in coexistence with a Swift flow;
FIG. 3 (b) is a timeline diagram of FairHet transmissions in one embodiment;
FIG. 4 (a) is a diagram of self-convergence at FairHet with and without rate limiting enabled in one embodiment;
FIG. 4 (b) is a diagram of buffer occupancy at FairHet with and without rate limiting enabled in one embodiment;
FIG. 5 (a) is a diagram illustrating the self-convergence performance of Poseidon in one embodiment;
FIG. 5 (b) is a diagram illustrating self-convergence performance of FairHet in one embodiment;
FIG. 6 is a schematic diagram of a dumbbell topology in another embodiment;
FIG. 7 (a) is a diagram of Jain fairness index under a first parameter configuration in an embodiment;
FIG. 7 (b) is a diagram of fairness index under a second parameter configuration in an embodiment;
FIG. 8 (a) is a diagram illustrating the convergence performance of DCTCP and FairHet in one embodiment when they coexist;
FIG. 8 (b) is a diagram illustrating the convergence performance of DCTCP and FlexPass in one embodiment when they coexist;
FIG. 8 (c) is a diagram illustrating the convergence performance of the coexistence of Swift and FairHet in one embodiment;
FIG. 9 (a) is a diagram illustrating the maximum buffer occupancy and average buffer occupancy of DCTCP in different coexistence scenarios according to one embodiment;
FIG. 9 (b) is a diagram illustrating the maximum buffer occupancy and average buffer occupancy for a different coexistence scenario for Swift in one embodiment;
FIG. 10 (a) is a schematic diagram of a 99% fractional FCT curve of HTP flow at different coexistence ratios in a DCTCP and ExpressPass coexistence scenario according to an embodiment;
FIG. 10 (b) is a schematic diagram of a 99% fractional FCT curve of HTP flow at different coexistence ratios in a DCTCP and FlexPass coexistence scenario according to an embodiment;
FIG. 10 (c) is a schematic diagram of a 99% fractional FCT curve of HTP flow at different coexistence ratios in a DCTCP and FairHet coexistence scenario according to an embodiment;
FIG. 10 (d) is a schematic diagram of a 99% fractional FCT curve of HTP flow at different coexistence ratios in a coexistence scene of Swift and FairHet according to an embodiment;
FIG. 11 (a) is a schematic diagram of an average FCT curve of HTP traffic at different coexistence ratios in a DCTCP and ExpressPass coexistence scenario according to an embodiment;
FIG. 11 (b) is a schematic diagram of an average FCT curve of HTP traffic at different coexistence ratios in a DCTCP and FlexPass coexistence scenario according to an embodiment;
FIG. 11 (c) is a schematic diagram of an average FCT curve of HTP traffic at different coexistence ratios in a DCTCP and FairHet coexistence scenario according to an embodiment;
Fig. 11 (d) is a schematic diagram of an average FCT curve of HTP traffic at different coexistence ratios in a coexistence scene of Swift and FairHet in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The application provides a receiver-driven hybrid traffic transmission method, which mainly comprises a receiver-driven credit scheduling mode, a selective packet-loss and loss-recovery mechanism, a credit packet sending rate limit, and a delay-based congestion control mechanism. The receiver-driven hybrid traffic transmission protocol is referred to as the FairHet protocol for short.
In one embodiment, as shown in fig. 2, a receiver-driven hybrid traffic transmission method is provided, the method comprising the following steps:
Step 100, constructing a network topology structure of the computing center.
In particular, the computing center network topology may be, but is not limited to, a dumbbell topology or a three-tier Clos topology.
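As a non-limiting illustration, a dumbbell topology of the kind mentioned above can be represented as a simple adjacency map; the node names, link capacity, and helper function below are assumptions for illustration only, not part of the claimed method:

```python
# Illustrative sketch (not from the patent): a dumbbell topology as an
# adjacency map. Two switches share one bottleneck link, each attached
# to its own group of servers. All names and capacities are assumed.

def build_dumbbell(n_left: int, n_right: int, link_gbps: float = 100.0):
    """Return (nodes, links) for a dumbbell topology."""
    links = {}  # (a, b) -> capacity in Gbps

    def add_link(a, b):
        links[(a, b)] = link_gbps
        links[(b, a)] = link_gbps  # full-duplex link

    for i in range(n_left):
        add_link(f"server_L{i}", "switch_L")
    for i in range(n_right):
        add_link(f"server_R{i}", "switch_R")
    add_link("switch_L", "switch_R")  # the shared bottleneck

    nodes = {n for pair in links for n in pair}
    return nodes, links

nodes, links = build_dumbbell(3, 3)
```

A three-tier Clos topology could be built the same way, with an extra layer of spine switches above the leaf switches.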
Step 102, the receiving end classifies the sub-flows according to the types of the sub-flows in the mixed flow, and generates credit packets based on the rate for each type of flow.
In particular, because the computing network places high demands on the delay, packet loss rate, throughput, and other aspects of data transmission, conventional data center transmission protocols perform poorly in such a complex environment. Existing transmission protocols rely on congestion signals such as ECN and RTT, which are coarse-grained: they can only reflect the network queue state or the full-link transmission delay, and cannot provide a basis for precise rate adjustment at the sending end. In addition, the INT signal can provide accurate link load information, but the periodicity of rate adjustment limits the accuracy of sending-side rate adjustment.
The receiver-driven transport protocol aims to pre-allocate network resources before traffic transmission, thereby preventing and avoiding network congestion. The main approach is to use credits as congestion signals, which can provide network status information at the granularity of individual data packets. By employing a credit-scheduling-based adjustment strategy that triggers the transmission of each data packet individually, receiver-driven protocols can finely and accurately schedule the transmission of network traffic.
Step 104, performing data transmission in a receiver-driven credit scheduling mode, in which data packets and credit packets are transmitted through independent queues, and data transmission is strictly performed according to one-to-one credit scheduling.
Specifically, active rate control of the credit schedule in the receiver-driven credit schedule is still a key aspect in the receiver-driven protocol, because it can allocate bandwidth with fine granularity in advance, thereby achieving faster fair convergence and reducing buffer occupancy. Furthermore, it is necessary to employ rate-based credit packet control, rather than window-based control, to avoid bursty packets at the sender, which may increase the buffer burden on the data path. Therefore, we keep the receiver-side driving mechanism from ExpressPass, as shown in fig. 3 (a) and fig. 3 (b), where fig. 3 (a) is a schematic diagram of the operation process when FairHet and the Swift traffic coexist, and fig. 3 (b) is a schematic diagram of the timeline of FairHet transmission. Data and credits are transmitted through separate queues, respectively, and data transmission is strictly performed according to a one-to-one credit schedule.
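As a non-limiting sketch, the rate-based, one-to-one credit scheduling described above can be expressed as follows; the class names, pacing interval, and packet fields are assumptions for illustration, not the protocol's actual implementation:

```python
from collections import deque

# Illustrative sketch (assumption, not the patent's code): the receiver
# paces credit packets on a separate queue at a fixed rate, and the sender
# emits exactly one data packet per credit received (one-to-one schedule).

class Receiver:
    def __init__(self, credit_interval_us: float):
        self.credit_interval_us = credit_interval_us  # pacing gap between credits
        self.next_credit_time = 0.0

    def maybe_issue_credit(self, now_us: float):
        """Issue at most one credit per pacing interval (rate-based, not window-based)."""
        if now_us >= self.next_credit_time:
            self.next_credit_time = now_us + self.credit_interval_us
            return {"type": "credit", "timestamp": now_us}
        return None

class Sender:
    def __init__(self, data: list):
        self.pending = deque(data)

    def on_credit(self, credit):
        """Strict one-to-one schedule: one credit triggers one data packet."""
        if self.pending:
            seq = self.pending.popleft()
            return {"type": "data", "seq": seq, "timestamp": credit["timestamp"]}
        return None

# Drive one credit/data exchange loop at assumed time instants.
rx, tx = Receiver(credit_interval_us=8.0), Sender(data=[0, 1, 2])
delivered = []
for t in [0.0, 4.0, 8.0, 16.0, 24.0]:
    c = rx.maybe_issue_credit(t)
    if c:
        pkt = tx.on_credit(c)
        if pkt:
            delivered.append(pkt["seq"])
```

Because the sender never transmits without a credit, the receiver's credit pacing directly bounds the data rate on the reverse path, which is what allows bandwidth to be allocated in advance.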
Step 106, when an unscheduled data packet arrives at the switch, performing data packet transmission using a selective packet-loss and loss-recovery mechanism, allowing new flows to send unscheduled data packets at the link rate.
Specifically, as observed in existing advanced credit scheduling protocols, the combination of selective packet dropping and packet-loss recovery effectively solves the first-RTT problem of credit scheduling protocols while keeping buffer occupancy low. The first-RTT problem refers to the dilemma that, after sending a credit request packet, the sender must wait one round trip time RTT before receiving credit packets from the receiving end. If no data packets are sent during this period, the network sits idle and is underutilized; on the other hand, if data packets are sent excessively during this period, buffer accumulation, queuing delay, and even packet loss may result. To address this problem, we employ a selective packet-loss and loss-recovery mechanism that allows a new flow to burst data packets at the link rate, as shown in fig. 3 (b). When an unscheduled packet arrives at the switch, it is discarded if the buffer occupancy exceeds a small threshold (e.g., 2-8 KB), thereby ensuring that the buffer occupancy does not grow too high. This dropping mechanism does not apply to scheduled packets. A discarded unscheduled data packet is detected by the receiving end and is guaranteed to be retransmitted as a scheduled data packet within the next round trip time RTT. This approach helps achieve low buffer occupancy and avoids wasting bandwidth. In addition, it eliminates the bandwidth preemption problem caused by the unrestricted transmission of unscheduled packets, as observed in the Homa protocol.
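As a non-limiting sketch, the selective-drop rule above can be expressed as follows; the threshold value and field names are assumptions chosen within the 2-8 KB range mentioned above:

```python
# Illustrative sketch of the selective-drop rule: unscheduled packets are
# dropped once buffer occupancy crosses a small threshold; scheduled
# packets are always enqueued. The receiver later re-requests dropped data
# as scheduled packets in the next RTT. Threshold and fields are assumed.

UNSCHEDULED_DROP_THRESHOLD = 8 * 1024  # e.g. 8 KB, from the 2-8 KB range above

class SwitchPort:
    def __init__(self):
        self.buffer = []    # enqueued packets
        self.occupancy = 0  # bytes currently buffered

    def enqueue(self, pkt) -> bool:
        """Return True if enqueued, False if selectively dropped."""
        if not pkt["scheduled"] and self.occupancy > UNSCHEDULED_DROP_THRESHOLD:
            return False    # drop applies only to unscheduled packets
        self.buffer.append(pkt)
        self.occupancy += pkt["size"]
        return True

port = SwitchPort()
# Fill the buffer past the threshold with scheduled traffic ...
for _ in range(9):
    port.enqueue({"scheduled": True, "size": 1024})
# ... then an unscheduled first-RTT burst packet is dropped,
dropped = not port.enqueue({"scheduled": False, "size": 1024})
# ... while a scheduled packet is still accepted.
kept = port.enqueue({"scheduled": True, "size": 1024})
```

The asymmetry is the point: scheduled packets were already paid for by credits, so dropping them would waste pre-allocated bandwidth, whereas unscheduled first-RTT packets are opportunistic and cheap to retransmit.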
Step 108, actively limiting the sending rate of the credit packet on the switch, ensuring that the link capacity is fully utilized and avoiding excessive buffer occupation.
Specifically, as a credit scheduling protocol, we also consider the impact of a rate limiting mechanism (i.e., actively limiting the sending rate of credit packets at the switch) on traffic transmission, as shown in fig. 3 (a). With credit rate limiting, the sending rate of credit packets is limited to 5% of the switch port link capacity, and the minimum Ethernet frame size of a credit packet is 64 bytes. This limitation ensures that the corresponding data packets (maximum Ethernet frame size 1526 bytes) can fully utilize the link capacity without causing excessive buffer occupancy. We compare self-convergence performance and buffer occupancy with and without rate limiting: fig. 4 (a) shows the self-convergence of FairHet with and without rate limiting, and fig. 4 (b) shows the corresponding buffer occupancy. In fig. 4 (a), the two black lines represent two FairHet flows without rate limiting, and the two red lines represent two rate-limited FairHet flows. Compared with rate-limited FairHet, FairHet without rate limiting shows more significant rate fluctuations and higher buffer occupancy (up to 110 times the mean and up to 7 times the maximum), as shown in fig. 4 (b). Therefore, we choose to enable the rate limiting function on the switch.
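As a non-limiting sketch, the 5% credit rate limit can be modeled as a token bucket on the switch's credit queue; the bucket depth, byte accounting, and offered-load pattern below are assumptions for illustration:

```python
# Illustrative token-bucket sketch of limiting credit packets to 5% of port
# capacity (a hedged reading of the mechanism above, not the patent's exact
# implementation). Each 64-byte credit is forwarded only when enough of the
# 5% budget has accumulated; bucket depth of one frame is an assumption.

CREDIT_FRAME_BYTES = 64

class CreditRateLimiter:
    def __init__(self, link_gbps: float, fraction: float = 0.05):
        self.rate_Bps = link_gbps * 1e9 / 8 * fraction  # credit budget, bytes/s
        self.tokens = float(CREDIT_FRAME_BYTES)          # allow one credit at t=0
        self.last_t = 0.0

    def allow(self, now_s: float) -> bool:
        """Forward a credit packet only if the 5% budget permits it."""
        self.tokens = min(
            CREDIT_FRAME_BYTES,  # bucket depth: one credit frame
            self.tokens + (now_s - self.last_t) * self.rate_Bps,
        )
        self.last_t = now_s
        if self.tokens >= CREDIT_FRAME_BYTES:
            self.tokens -= CREDIT_FRAME_BYTES
            return True
        return False

# At 100 Gbps, 5% leaves 5 Gbps for credits, i.e. one 64 B credit roughly
# every 102.4 ns; offering credits twice as fast gets about half through.
lim = CreditRateLimiter(link_gbps=100.0)
sent = sum(lim.allow(i * 60e-9) for i in range(10))  # offer one credit every 60 ns
```

Since each forwarded credit triggers at most one data packet on the reverse direction, bounding the credit rate indirectly bounds the scheduled data rate, which is how the buffer stays small while the link stays busy.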
Step 110, adopting a delay-based congestion control mechanism during traffic transmission to achieve fair coexistence of heterogeneous traffic.
In particular, with the rapid increase of link capacity, more and more flows can complete their transmission within one round trip time RTT. This trend requires a more timely and accurate response to network congestion. In order to achieve faster response to network congestion, it is important to obtain more accurate network information. However, the congestion signal (credit packet loss rate) used in the conventional credit scheduling protocol fails to provide detailed information about the data path. This limitation may lead to unfair bandwidth allocation when coexisting with other protocols using different congestion signals. In addition, the use of selective packet loss mechanisms further exacerbates the instability of the packet loss rate itself. Thus, exploring and utilizing other signals becomes a necessary means to address these challenges.
The most important improvement is to enable the credit scheduling protocol to recognize congestion signals in the data path. Since adjusting the ECN threshold may affect the performance of other flows traversing the same hop, we instead use delay as the congestion signal so as to accommodate a wider range of scenarios. The ECN signal and the delay signal can be converted into each other. In current delay-based transmission protocols, measuring whether congestion occurs in the network against a specific target delay value is the mainstream method, as in the congestion control of Swift and Poseidon. Inspired by the fairness behavior of Poseidon, the present application employs a delay-based congestion control mechanism.
In the above method, a network topology of the computing center is constructed; the receiving end classifies the sub-flows in the hybrid traffic by type and generates rate-based credit packets for each type of traffic; data transmission is performed in a receiver-driven credit scheduling mode, in which data packets and credit packets are transmitted through independent queues and data transmission strictly follows one-to-one credit scheduling; when unscheduled data packets arrive at a switch, a selective packet-loss and loss-recovery mechanism is applied, allowing new flows to send unscheduled data packets at the link rate; the sending rate of credit packets is actively limited at the switch, ensuring full utilization of link capacity while avoiding excessive buffer occupancy; and a delay-based congestion control mechanism is adopted during traffic transmission to achieve fair coexistence of heterogeneous traffic. The method performs well in scenarios where heterogeneous traffic coexists and effectively preserves high-performance characteristics such as low buffer occupancy and rapid convergence.
In one embodiment, the selective packet-loss and loss-recovery mechanism in step 106 includes: when an unscheduled data packet arrives at the switch, the packet is discarded if the buffer occupancy exceeds a preset threshold, and this discarding operation does not apply to scheduled packets, thereby ensuring that buffer occupancy does not grow too high; a discarded unscheduled data packet is detected by the receiving end and retransmitted as a scheduled data packet within the next round trip time RTT.
In one embodiment, the sending rate of the credit packets in step 108 is limited to 5% of the switch port link capacity, and the minimum Ethernet frame size of a credit packet is 64 bytes.
In one embodiment, the delay-based congestion control mechanism in step 110 uses delay as the congestion signal. The mechanism includes: calculating the RTT through the loop formed by a credit packet and its data packet; calculating the target delay from the current sending rate, the maximum sending rate, the minimum sending rate, the base delay, and the target scale factor, where the base delay controls the base portion of the target delay and the target scale factor controls the scaled portion; and determining the rate of change of the sending rate from the target delay, the round trip time RTT, the maximum sending rate, the minimum sending rate, and the base delay. If the round trip time RTT is less than or equal to the target delay, the network is not congested, and the new sending rate is determined from the sending rate change rate and the current sending rate; if the RTT is greater than the target delay, the network is congested, and if deceleration is permitted, the new sending rate is determined from the sending rate change rate and the current sending rate.
In one embodiment, the target delay is:
$t_{target} = k + p \cdot \dfrac{\ln(r_{max}/r_{cur})}{\ln(r_{max}/r_{min})}$;
wherein $t_{target}$ represents the target delay, k represents the base delay parameter, p represents the scaling delay parameter, $r_{cur}$ represents the current sending rate, $r_{max}$ represents the maximum sending rate, and $r_{min}$ represents the minimum sending rate.
In one embodiment, the rate of change of the sending rate is:
$ratio = \left(\dfrac{r_{max}}{r_{min}}\right)^{m \cdot (t_{target} - RTT)/p}$;
wherein $ratio$ represents the rate of change of the sending rate (with the current rate initialized to the initial rate), $t_{target}$ represents the target delay, RTT represents the round trip time, p represents the scaling delay parameter, $r_{max}$ represents the maximum sending rate, $r_{min}$ represents the minimum sending rate, and m represents the scaling rate parameter.
In one embodiment, the round trip time RTT is:
$RTT = t_{cur} - t_{stamp}$;
wherein $t_{cur}$ represents the current time and $t_{stamp}$ represents the timestamp carried in the data packet.
Specifically, the delay-based congestion control mechanism calculates the RTT through the loop formed by a credit packet and its data packet, and calculates the target delay according to the current sending rate. If the RTT is greater than the target delay, the network is congested and the rate needs to be decreased; otherwise, when the RTT is less than or equal to the target delay, the network can accommodate more traffic. The algorithm follows the method proposed by Swift, in which the target delay is inversely related to the rate. It uses a logarithmic function to calculate the target delay, ensuring that the target delay is small at higher rates, as shown in Algorithm 1. Furthermore, the algorithm employs multiplicative increase and multiplicative decrease, so that a flow can monotonically increase or decrease its rate. Compared with the popular additive-increase multiplicative-decrease approach, multiplicative increase and multiplicative decrease allow the algorithm to reach the convergence rate faster as rates increase.
One advantage of this algorithm is that the queue length during convergence can be set through parameter adjustment, which makes it well suited to different coexistence transmission environments. By adjusting the parameters, FairHet can achieve fairness with other protocols. In this algorithm, the parameter p controls the scaled portion of the target delay, while the parameter k controls the base portion. A larger value of p results in a higher RTT during convergence and reduced rate fluctuations. A smaller value of k results in a lower converged RTT, but may also cause network underutilization. To achieve fairness with protocols that have lower convergence queue lengths, we can reduce k and tune p for better results, and vice versa.
The pseudocode of Algorithm 1, the delay-based congestion control mechanism, is as follows:
1 Parameters: k (base delay parameter), p (scaling delay parameter), m (scaling rate parameter)
2 Initialize: cur_rate ← initial_rate
3 Function ReceiveData(packet):
4   RTT ← cur_time − packet.timestamp // calculate the current RTT
5   if packet.timestamp ≥ t_last_decrease then // determine whether the slow-down threshold is reached
6     can_decrease ← true
7   // calculate the current target delay
8   // calculate the current ratio
9   if RTT ≤ target then // determine whether the RTT reaches the target
10    new_rate ← ratio × cur_rate // if reached, perform multiplicative increase
11  else // if not reached and a decrease is allowed, perform multiplicative decrease
12    if can_decrease then
13      new_rate ← ratio × cur_rate
14      can_decrease ← false
15      t_last_decrease ← now
16  return new_rate
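As a concrete illustration, the mechanism of Algorithm 1 can be sketched in Python. This is a hedged sketch, not the patented implementation: the exact log-based target-delay formula, the clipped multiplicative step `ratio`, and the omission of the scaling rate parameter m are our assumptions; the text only states that the target delay shrinks logarithmically as the rate grows and that both branches scale the rate multiplicatively.

```python
import math

class DelayBasedCC:
    MIN_RATE = 20e3  # minimum allocated bandwidth (20 Kbps, taken from the text)

    def __init__(self, initial_rate, k=1e-6, p=40e-6, max_rate=100e9):
        self.cur_rate = initial_rate      # current sending rate (bps)
        self.k = k                        # base delay parameter (seconds)
        self.p = p                        # scaling delay parameter (seconds)
        self.max_rate = max_rate          # link capacity (bps)
        self.can_decrease = True
        self.t_last_decrease = 0.0

    def target_delay(self):
        # Assumed log form: the target falls from roughly k + p near MIN_RATE
        # toward k near max_rate, so targets are small at high rates.
        frac = math.log(self.max_rate / self.cur_rate) / math.log(self.max_rate / self.MIN_RATE)
        return self.k + self.p * frac

    def on_data(self, pkt_timestamp, now):
        rtt = now - pkt_timestamp                    # RTT of the credit/data loop
        if pkt_timestamp >= self.t_last_decrease:    # slow-down threshold reached
            self.can_decrease = True
        target = self.target_delay()
        # Clipped multiplicative step: ratio > 1 below the target, < 1 above it.
        ratio = min(max(target / rtt, 0.8), 1.25)
        if rtt <= target:                            # multiplicative increase
            self.cur_rate = min(ratio * self.cur_rate, self.max_rate)
        elif self.can_decrease:                      # multiplicative decrease
            self.cur_rate = max(ratio * self.cur_rate, self.MIN_RATE)
            self.can_decrease = False
            self.t_last_decrease = now
        return self.cur_rate
```

A flow below the target delay ramps up by at most 25% per update, while a flow above it backs off by at most 20% no more than once per RTT, gated by `can_decrease` as in the pseudocode.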
To verify that our improvements do not compromise the high performance of the original protocol, we performed a self-convergence experiment; the results are shown in fig. 5 (a) and 5 (b). We compare the self-convergence performance of FairHet and Poseidon under different parameter configurations. We first determined a suitable value of the base delay parameter k to ensure that the bottleneck link is not underutilized, and then observed the buffer occupancy under different values of the scaling delay parameter p. We found that, across configurations that provide fast and fair self-convergence, the buffer occupancy of Poseidon varies with p and eventually stabilizes at about 45 KB. In contrast, FairHet's buffer occupancy is unaffected by changes in p and always remains at an extremely low level. FairHet also maintains low rate fluctuations throughout, which can be attributed to its bandwidth allocation mechanism; the introduction of rate-limiting functionality further helps to bound buffer occupancy. At the same time, the delay-based transport protocol prevents FairHet from easily suffering bandwidth preemption despite its low convergence buffer occupancy. As a result, FairHet adapts to different transport coexistence environments better than Poseidon while maintaining the original high performance.
It should be understood that, although the steps in the flowchart of fig. 2 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 2 may include multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different times, and these sub-steps or stages are not necessarily executed sequentially; they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one illustrative embodiment, we evaluate FairHet using the OMNeT++ simulator in two network topologies: a small dumbbell topology and a large Clos topology. We use throughput as the evaluation metric for long flows and flow completion time (FCT) as the metric for short flows. Our main findings are as follows:
Experimental results show that FairHet achieves fair and fast convergence when coexisting with two legacy transport protocols (DCTCP and Swift) in the small topology, while effectively maintaining a low buffer occupancy.
In large-scale simulations, we compared FairHet with ExpressPass and FlexPass when coexisting with DCTCP, and compared FairHet with ExpressPass when coexisting with Swift. FairHet improves the 99th-percentile flow completion time (FCT) and the overall average FCT across the whole deployment, with especially significant benefits under high-throughput traffic coexistence.
During coexistence, FairHet has minimal influence on legacy flows, thereby effectively maintaining their performance.
(1) Dumbbell topology simulation
This section verifies whether FairHet operates as designed. We performed coexistence experiments between FairHet and two widely deployed conventional transport protocols with different congestion signals, DCTCP and Swift, in the dumbbell topology shown in fig. 6. Each switch has a 12 MB shared buffer with a selective-drop threshold of 8 KB. A dynamic buffer management mechanism is implemented according to literature 1 (S. Arslan, Y. Li, G. Kumar, N. Dukkipati, Bolt: Sub-RTT congestion control for ultra-low latency, in 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23), 2023, pp. 219–236.). Each link runs at 100 Gbps with a propagation delay of 500 ns, resulting in a base RTT of about 3.4 μs for the maximum Ethernet frame (1526 B). For DCTCP we set the ECN threshold to 20 KB, and for Swift we set the hop scaling factor to 1 μs.
Micro-benchmark for the base delay parameter k and the scaling delay parameter p: we initially set the base delay parameter k to the base RTT, and then use the scaling delay parameter p to scale the queuing delay at the corresponding queue length (e.g., the ECN threshold). Subsequently, by substituting the expected bandwidth (e.g., half of the link capacity) into the target-delay formula of Algorithm 1, we can determine the initial value of the scaling delay parameter p. In this experiment, the base delay parameter k is set to 1 μs (the drain delay at the ECN threshold), the minimum allocated bandwidth is set to 20 Kbps, and the expected bandwidth is half the link capacity. Thus, the coefficient of p in the target-delay computation is 0.04, and the initial value of p is set to 40 μs. We then tune the scaling delay parameter p for better optimization. Fig. 7 (a) and 7 (b) show Jain fairness index diagrams under different parameter configurations, wherein fig. 7 (a) corresponds to the first parameter configuration and fig. 7 (b) to the second. For DCTCP, a Jain fairness index close to 1 is achieved at p = 17 μs, as shown in fig. 7 (a), and the same fairness index is achieved at p = 20 μs, as shown in fig. 7 (b).
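The Jain fairness index used in fig. 7 (a) and 7 (b) is a standard metric; as a reference (this helper is ours, not part of the patent's method), it can be computed as:

```python
def jain_index(throughputs):
    """Jain's fairness index J = (sum x)^2 / (n * sum x^2).

    J is 1.0 when all flows receive equal throughput and approaches 1/n
    when a single flow takes everything.
    """
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(x * x for x in throughputs))
```

For example, two flows sharing a link equally give an index of 1.0, while one flow starving the other gives 0.5.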
The convergence performance of the dumbbell topology under the optimal parameter configuration is shown in fig. 8 (a) to 8 (c), wherein fig. 8 (a) shows the convergence performance when DCTCP and FairHet coexist, fig. 8 (b) when DCTCP and FlexPass coexist, and fig. 8 (c) when Swift and FairHet coexist. The results in fig. 8 (a) to 8 (c) indicate that FairHet achieves very small rate fluctuations when coexisting with DCTCP and Swift.
Fig. 9 (a) and 9 (b) show the maximum and average buffer occupancy of DCTCP and Swift in different coexistence scenarios, wherein fig. 9 (a) corresponds to DCTCP and fig. 9 (b) to Swift; method one is ExpressPass, method two is DCTCP, method three is FlexPass, method four is FairHet, and method five is Swift. For DCTCP, FairHet reduces the average buffer occupancy from 24 KB to 17 KB and the maximum buffer occupancy from 245 KB to 78 KB, only one third of the ExpressPass value. In addition, when facing DCTCP, both FlexPass and FairHet not only exhibit faster convergence and lower buffer occupancy, but also avoid interfering with innocent flows in the data center network. For Swift, FairHet reduces the average buffer occupancy from 59 KB to 18 KB and the maximum buffer occupancy from 86 KB to 21 KB, each more than a threefold reduction compared with ExpressPass. These buffer occupancy figures are close to the performance of DCTCP and Swift in the self-convergence scenario.
(2) Large scale simulation
Simulation setup: this embodiment constructs a three-layer Clos topology comprising 8 core switches, 16 aggregation switches, 32 top-of-rack (ToR) switches, and 192 hosts. The host processing delay is set to 2 μs. Links between switches run at 100 Gbps with a propagation delay of 2 μs; links between hosts and switches run at 100 Gbps with a propagation delay of 1 μs. Under this configuration, the base round trip time (RTT) is about 24 μs for frames ranging from the maximum Ethernet frame (1526 bytes) down to the minimum Ethernet frame (64 bytes). Each switch is equipped with a 12 MB shared buffer, and the selective packet-loss threshold is set to 8 KB. The application implements a dynamic buffer management mechanism. In the experiments we compared FairHet with ExpressPass and FlexPass in the DCTCP coexistence scenario, and FairHet with ExpressPass in the Swift coexistence scenario. The aggressiveness factor of ExpressPass is set to 2.0, the hop scaling factor of Swift is set to 2 μs, and the ECN threshold of the other ECN-based protocols is set to 100 KB, which is sufficient to support a throughput of 100 Gbps.
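As a sanity check on the quoted 24 μs figure, the base RTT under the stated delays can be reproduced with a short calculation. The 6-link longest path through the Clos fabric and the placement of one host-processing delay at each end are our assumptions:

```python
# Assumed longest path: host -> ToR -> aggregation -> core -> aggregation -> ToR -> host.
HOST_LINK_PROP = 1e-6     # host <-> ToR propagation delay (s)
SWITCH_LINK_PROP = 2e-6   # switch <-> switch propagation delay (s)
HOST_PROCESSING = 2e-6    # per-host processing delay (s)

one_way_prop = 2 * HOST_LINK_PROP + 4 * SWITCH_LINK_PROP   # 10 us one way
base_rtt = 2 * one_way_prop + 2 * HOST_PROCESSING          # 24 us round trip

# Serialization adds well under 1 us: one 1526 B data frame plus one 64 B
# credit frame per hop at 100 Gbps.
serialization = 6 * (1526 + 64) * 8 / 100e9                # roughly 0.76 us
```

Propagation and host processing alone account for the quoted 24 μs; frame serialization only nudges it upward, consistent with "about 24 μs".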
The method generates network traffic load based on a real traffic-size distribution, adopting a typical Web-search workload model. The workload contains 49% small flows (0-10 KB), 3% medium flows (10 KB-100 KB), 18% large flows (100 KB-1 MB), and 20% extra-large flows (greater than 1 MB), with an average flow size of 1.6 MB. Each flow randomly selects a pair of hosts as source and destination, and flow arrivals follow a Poisson process. By adjusting the arrival rate, a 3:1 oversubscription ratio is created at the ToR uplinks to produce a resource-contention scenario. For this traffic model, the flow completion time (FCT) is used herein as the primary performance metric. Specifically, we report the average FCT of all flows to measure the bandwidth utilization efficiency of the system, and, for small flows (less than 100 KB), we compute the 99th-percentile FCT to evaluate tail-delay performance.
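The workload generation described above can be sketched as follows. The bucket boundaries for the uniform draw inside each size class, the 30 MB cap on the largest class, and the renormalization of the quoted percentages are our assumptions; the patent only specifies the class shares and the Poisson arrival process:

```python
import random

# (probability, low bytes, high bytes); the quoted shares sum to 0.9,
# so they are renormalized below.
SIZE_BUCKETS = [
    (0.49, 1, 10_000),
    (0.03, 10_000, 100_000),
    (0.18, 100_000, 1_000_000),
    (0.20, 1_000_000, 30_000_000),   # "greater than 1 MB"; cap is assumed
]

def sample_flow_size(rng):
    """Draw a flow size from the bucketed Web-search distribution."""
    r = rng.random() * sum(w for w, _, _ in SIZE_BUCKETS)
    for w, lo, hi in SIZE_BUCKETS:
        if r < w:
            return rng.randint(lo, hi)   # uniform within the bucket (assumption)
        r -= w
    return SIZE_BUCKETS[-1][2]

def poisson_arrivals(rng, rate_per_s, duration_s):
    """Exponential inter-arrival gaps yield a Poisson arrival process."""
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate_per_s)
        if t >= duration_s:
            return times
        times.append(t)
```

Tuning `rate_per_s` is what produces the 3:1 oversubscription at the ToR uplinks in the setup above.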
Performance benefit under HTP traffic coexistence: we designed experiments in which the new scheme coexists with the legacy transport protocols (i.e., DCTCP and Swift), gradually increasing the deployment ratio of the new scheme from 0% to 100%. The experimental results are shown in Table 1.
TABLE 1 comparison of overall FCT Performance for all flows when co-existing with DCTCP or Swift at different deployment rates
For DCTCP, with the base delay parameter k set to 1 μs and the scaling delay parameter p set to 40 μs in the reference configuration, we obtained the performance results in Table 2. ExpressPass is observed to cause a significant 99th-percentile FCT degradation of up to 32.51%. FlexPass mitigates this degradation during coexistence, but its 99th-percentile FCT is higher at full deployment. In contrast, FairHet consistently achieves a lower small-flow 99th-percentile FCT than ExpressPass and FlexPass throughout the gradual deployment. The overall average FCT exhibits a similar trend: both FlexPass and FairHet mitigate the degradation associated with ExpressPass. However, FlexPass has a higher overall FCT when deployed alone, whereas FairHet, relying on its selective-drop mechanism, exhibits a lower overall average FCT.
For Swift, with the hop scaling factor set to 2 μs, the 99th-percentile FCT is higher while the average FCT is lower than under DCTCP. We can therefore adjust FairHet's parameter configuration to better accommodate this new scenario. FairHet1 denotes the configuration with base delay parameter k = 1 μs and scaling delay parameter p = 40 μs, and FairHet2 denotes the configuration with k = 0.5 μs and p = 80 μs. The FairHet2 configuration lowers the lower bound of the target delay and raises its upper bound, aiming to accommodate a wider range of delays. Consequently, FairHet2 achieves a lower 99th-percentile FCT and overall average FCT in the coexistence case than ExpressPass and FairHet1, but because its scaling delay parameter p is larger, its FCT is higher than FairHet1's when deployed alone. Even without the new configuration, FairHet1 still reduces the overall average FCT.
To evaluate each transport protocol in more depth, we plot the 99th-percentile FCT of HTP traffic at different coexistence ratios, as shown in fig. 10 (a) to 10 (d): fig. 10 (a), 10 (b) and 10 (c) show DCTCP coexisting with ExpressPass, FlexPass and FairHet respectively, and fig. 10 (d) shows Swift coexisting with FairHet. We also plot the corresponding average-FCT curves in fig. 11 (a) to 11 (d), with the same scenario assignment: fig. 11 (a), 11 (b) and 11 (c) for DCTCP coexisting with ExpressPass, FlexPass and FairHet respectively, and fig. 11 (d) for Swift coexisting with FairHet.
FairHet operates in its optimal configuration. For DCTCP, deploying ExpressPass causes serious side effects, increasing the tail delay of legacy flows by up to 33% and the average FCT by up to 41%, as shown in fig. 10 (a) and 11 (a). Using FlexPass ameliorates this, limiting the tail-delay increase to 9% and the average-FCT increase to 19%, as shown in fig. 10 (b) and 11 (b). In contrast, FairHet has minimal impact on conventional traffic during deployment, as shown in fig. 10 (c) and 11 (c). For Swift, a reasonable FairHet configuration effectively mitigates the FCT degradation in both tail delay and average delay, as shown in fig. 10 (d) and 11 (d).
The FairHet protocol is a receiver-driven, delay-based protocol that aims to alleviate these problems with minimal modifications while maintaining the existing high-performance characteristics. Experimental verification shows that a protocol's bandwidth-preemption capability is related to its convergence queue length. To accommodate HTP traffic coexistence scenarios, FairHet combines receiver-driven traffic management, keeps the convergence buffer occupancy low through rate limiting, and incorporates delay-based congestion control to prevent FairHet from being vulnerable to bandwidth preemption due to that low occupancy. Compared with existing solutions, FairHet has lower deployment difficulty and a wider application range. Evaluation results show that FairHet achieves fair and rapid convergence when coexisting with the two traditional transport protocols DCTCP and Swift in a small-scale topology, with behavior close to the self-convergence scenario. Furthermore, FairHet significantly improves the 99th-percentile FCT and overall average FCT in large-scale simulations and provides significant benefits under HTP traffic coexistence.
In one embodiment, the computing network for receiver-driven hybrid traffic transmission is a data center network comprising a sending end, a receiving end, and data transmission links, wherein the computing network adopts any one of the above receiver-driven hybrid traffic transmission methods to realize hybrid traffic transmission.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The above examples merely represent a few embodiments of the present application; they are described in detail but are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within its scope. Accordingly, the protection scope of the application should be determined by the appended claims.
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202510580957.6A CN120111001B (en) | 2025-05-07 | 2025-05-07 | Receiver-driven hybrid traffic transmission method and computing network |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN120111001A CN120111001A (en) | 2025-06-06 |
| CN120111001B true CN120111001B (en) | 2025-07-22 |
Family
ID=95874356
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202510580957.6A Active CN120111001B (en) | 2025-05-07 | 2025-05-07 | Receiver-driven hybrid traffic transmission method and computing network |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN120111001B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112468405A (en) * | 2020-11-30 | 2021-03-09 | 中国人民解放军国防科技大学 | Data center network congestion control method based on credit and reaction type |
| CN116032893A (en) * | 2022-12-14 | 2023-04-28 | 新讯数字科技(杭州)有限公司 | Data channel service system for IMS network and implementation method |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7272144B2 (en) * | 2002-06-26 | 2007-09-18 | Arris International, Inc. | Method and apparatus for queuing data flows |
| WO2022089715A1 (en) * | 2020-10-26 | 2022-05-05 | Huawei Technologies Co., Ltd. | Method of managing data transmission for ensuring per-flow fair bandwidth sharing |
| US20240163219A1 (en) * | 2022-11-11 | 2024-05-16 | Praveen Vaddadi | System and method for data transfer and request handling among a plurality of resources |
| CN119105866B (en) * | 2024-08-16 | 2025-10-03 | 南京航空航天大学 | Distributed cluster resource autonomous scheduling method based on DSACO |
| CN119603231A (en) * | 2024-12-04 | 2025-03-11 | 海南大学 | A method for orderly traffic scheduling based on dynamic priority in data center network |
Also Published As
| Publication number | Publication date |
|---|---|
| CN120111001A (en) | 2025-06-06 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||