[go: up one dir, main page]

WO2001065394A1 - Commande de trafic de bord-a-bord pour internet - Google Patents

Commande de trafic de bord-a-bord pour internet Download PDF

Info

Publication number
WO2001065394A1
WO2001065394A1 PCT/US2001/006263 US0106263W WO0165394A1 WO 2001065394 A1 WO2001065394 A1 WO 2001065394A1 US 0106263 W US0106263 W US 0106263W WO 0165394 A1 WO0165394 A1 WO 0165394A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
congestion
nodes
edge
epoch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2001/006263
Other languages
English (en)
Other versions
WO2001065394A9 (fr
Inventor
David Harrison
Shivkumar Kalyanaraman
Sthanunathan Ramakrishanan
Prasad Bagal
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rensselaer Polytechnic Institute
Original Assignee
Rensselaer Polytechnic Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rensselaer Polytechnic Institute filed Critical Rensselaer Polytechnic Institute
Priority to US10/204,222 priority Critical patent/US20040037223A1/en
Priority to AU2001245355A priority patent/AU2001245355A1/en
Publication of WO2001065394A1 publication Critical patent/WO2001065394A1/fr
Anticipated expiration legal-status Critical
Publication of WO2001065394A9 publication Critical patent/WO2001065394A9/fr
Ceased legal-status Critical Current

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00Traffic control in data switching networks
    • H04L47/10Flow control; Congestion control
    • H04L47/30Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00Traffic control in data switching networks
    • H04L47/10Flow control; Congestion control
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00Traffic control in data switching networks
    • H04L47/10Flow control; Congestion control
    • H04L47/12Avoiding congestion; Recovering from congestion
    • H04L47/122Avoiding congestion; Recovering from congestion by diverting traffic away from congested entities

Definitions

  • the present invention generally relates to computer network traffic
  • the present invention relates to
  • QoS Quality of Service
  • tail circuits (to remote locations), international circuits and peering points, may
  • QoS Quality-of-Service
  • the present invention provides a congestion control
  • edge-to-edge traffic control For the Internet, edge-to-edge traffic control.
  • edge-to-edge traffic trucking building blocks herein referred to as
  • IP network
  • TCP Transmission Control Protocol
  • t e overlay involves the exchange of control
  • the edge-to-edge virtual links thus created can be used as the
  • bandwidth- based denial of service attacks can be isolated, simple differentiated services can be offered, and new
  • FIG. 1 illustrates a network system for the implementation of the
  • FIG. 2 illustrates another view of a network system for the
  • FIG. 3 illustrates a detailed network system to for use in describing the
  • FIG. 4 shows a chart illustrating dynamic edge-to-edge regulation
  • FIG. 5(a) illustrates a queue at a bottleneck
  • FIG. 5(b) illustrates a queue distributed across edges.
  • FIG. 1 a network node bottleneck 100.
  • the present invention is based upon the observation that at all times, the sum of the output rates of flows passing through a particular single network node bottleneck 100 is less than or equal to the capacity of ( ⁇ ) 102 at the bottleneck 100, as illustrated in FIG. 1. Most importantly, this condition holds during periods of congestion called “congestion epochs" .
  • a “congestion epoch” is defined as any period when the instantaneous queue length exceeds a queue bound which is larger than the maximum steady state queue fluctuations. Chosen this way, the congestion epoch is the period of full utilization incurred when the mean aggregate load
  • ( ⁇ ) at a single bottleneck 100 exceeds mean capacity ( ⁇ ).
  • Congestion epoch does not involve packet loss in its definition and is a basis for "early" detection.
  • a single bottleneck 100 is used in the following description, although the present invention is applicable to a network of bottlenecks 100 as well.
  • the output rates of flows (v,) can be measured at the receiver and fed back to the sender.
  • each sender imposes a rate limit r, such that r, ⁇ — min( ⁇ v, , r,) where ⁇ ⁇ 1.
  • the increase-decrease policy of the present invention is not the same as the well known additive -increase multiplicative-decrease (AIMD) policy, because the decrease policy of the present invention is based upon tl e output rate (v,) and not the input rate ( ⁇ ,).
  • the policy of the present invention is hereto referred to as AIMD-ER (Additive Increase and Multiplicative Decrease using Egress Rate).
  • the remaining part of the basic approach is a method to detect the congestion epochs in the system.
  • the present invention utilizes two method for this purpose. The first method assumes that the interior routers assist in the determination of the start and duration of a congestion epoch. In the second method, edges detect congestion epochs without the involvement of interior routers.
  • the interior router promiscuously marks a bit in packets whenever the instantaneous queue length exceeds a carefully designed threshold.
  • the second method does not involve support from interior routers.
  • the edges To detect the beginning of a congestion epoch, the edges rely on the observation that each flow's contribution to the queue length (or accumulation), q is equal to the integral J( ⁇ j.(T) - Vj (T))dT. If this accumulation is larger than a predefined threshold, the flow assumes the beginning of a congestion epoch.
  • the end of the congestion epoch is detected when a one-way delay sample comes close to the minimum one-way delay.
  • the present invention assumes that the network architecture is partitioned into traffic control classes.
  • a traffic control class is a set of networks with consistent policies applied by a single administrative entity or cooperating administrative entities or peers. Specifically, it is assumed that edge-to-edge controlled traffic is isolated from other traffic which is not edge-to-edge controlled.
  • the architecture has three primary components: the ingress edge 202, the interior router 204, and the egress edge 206. Nodes within a traffic control class that are connected to nodes outside the class and implement edge-to-edge control are known as edge routers 202, 206. Any remaining routers within a class are called interior routers 204.
  • the methods of the present invention can be implemented on conventional hardware such as that of FIG. 2, where the ingress edge 202, the interior router 204, and egress edge 206 employ the means for performing the methods herein described.
  • the ingress node 202 regulates each edge-to-edge virtual link to a rate limit of r ; .
  • the actual input rate i.e. departure rate from the ingress 202, and denoted XT
  • the present invention also assumes that the ingress node 202 uses an observation interval T for each edge-to-edge virtual link originating at this ingress 202.
  • a congestion epoch begins when an interior router promiscuously marks a congestion bit on all packets once the instantaneous queue exceeds a carefully designed queue threshold parameter. Since interior routers 204 participate explicitly, the present invention refers to this as the Explicit Edge Control (EEC) method.
  • EEC Explicit Edge Control
  • the egress node 206 declares the beginning of a new congestion epoch upon seeing the first packet with tl e congestion bit set. A new control packet is created and the declaration of congestion along with the measured output rate at the egress 206 is fed back to the ingress 202. The interval used by the egress node 206 to measure and average the output rate is resynchronized with the beginning of this new congestion epoch.
  • the congestion epoch continues in every successive observation interval where at least one packet from the edge-to-edge virtual link is seen with the congestion bit set. At the end of such intervals, tl e egress 206 sends a control packet with the latest value of the exponentially averaged output rate.
  • the default response of the ingress edge 202 upon receipt of control packets is to reduce the virtual link's rate limit (r ; ) to the smootiied output rate scaled down by a multiplicative factor (v ; ⁇ 3 : 0 ( ⁇ ( 1).
  • the congestion epoch ends in the first interval when no packets from the link are marked with the congestion bit.
  • the egress 206 merely stops sending control packets and the ingress 202 assumes the end of a congestion epoch when two intervals pass without seeing a control packet.
  • the ingress node 202 uses a leaky bucket rate shaper whose rate limit (r) can be varied dynamically based upon feedback.
  • the amount of traffic "I" entering the network over any time interval [t, t + T ] after shaping is:
  • the queue threshold parameter can be set as N ⁇ where N is the number of virtual links, not end-to-end flows, passing through the bottleneck.
  • a rough estimate of N which suffices for this method, can be based upon the number of routing entries, and/or the knowledge of the number of edges whose traffic passes through the node. The objective is to allow at most ⁇ burstiness per active virtual link before signaling congestion.
  • edge-to-edge virtual links occurs in a manner similar to TCP slow start, and is defined by the present invention as "rate -based slow start. " As long as there are sufficient packets to send the rate limit doubles each interval,when a rate-decrease occurs (in a congestion epoch), a rate threshold 'rthreshi", is set to the new value of tl e rate limit after the decrease. The function of this variable is similar to tl e "SSTHRESH" variable used in TCP. While the rate-limit r, tracks the actual departure rate ⁇ instruct rthresh, serves as an upper bound for slow start.
  • the rate limit r is allowed to increase multiplicatively until it reaches rthresh , or receives a congestion notification.
  • tl e is allowed to increase linearly by ⁇ /T once per measurement interval T. The dynamics of these variables are illustrated in FIG. 4.
  • the rate-decrease during a congestion epoch is based upon the measured and smoothed egress rate v,.
  • the response to congestion is to limit the departure rates ( ⁇ ,) to values smaller than v, consistently during the congestion epoch.
  • the measurement interval T used by all edge systems (both ingress 202 and egress 206) is set to the class-wide maximum edge-to-edge round-trip propagation and transmission delays, max_ertt, plus the time to mark all virtual links passing through the bottleneck when congestion occurs.
  • the time to mark all virtual links can be roughly estimated as N max ⁇ / ⁇ mm where ⁇ min is the smallest capacity within the class and N max is a reasonable bound on the number of virtual links passing through the bottleneck.
  • the method optionally delays backoff through a method known as “delayed feedback response. " Specifically, the feedback received by the ingress node 202 is enforced after a delay of max_ertt
  • the ingress 202 backs off by ⁇ 2 when packet loss occurs.
  • tl e present invention also provides for edge-to-edge congestion control without interior router 204 involvement, herein referred to as Implicit Edge Control (IEC).
  • IEC Implicit Edge Control
  • IEC infers the congestion state by estimating the contribution of each virtual link to the queue length q ; by integrating the difference in ingress and egress rates.
  • tlie estimate exceeds a threshold, IEC declares congestion.
  • IEC ends the congestion epoch when the delay on a control packet drops to within e of the minimum measured delay.
  • IEC and Explicit Edge Control (EEC) are identical, as described above.
  • each virtual link signals congestion when its (contribution ( "accumulation" ), q i5 to the queue length exceeds ⁇ .
  • the total accumulation is N ⁇ which is the congestion epoch detection criterion used in the EEC method.
  • the accumulation q i can be calculated using the following observation: Assume a sufficiendy large interval ⁇ . If tl e average input rate during this period ⁇ is ⁇ ; and the average output rate is V; , the accumulation caused by this flow during the period ⁇ is ( ⁇ t - - v : ) x ⁇ . The accumulation measured during this period can be added to a running estimate of accumulation q ; which can then be compared against a maximum accumulation reference parameter. More accurately stated:
  • the average interval for v is delayed by the propagation delay so that any fluid entering the virtual link by the time t can leave by time u unless it is backlogged.
  • the computation of q excludes packets in the bandwidth-delay product, if the bandwidth-delay product is constant.
  • the ingress node 202 sends two control packets in each interval T (but no faster than the real data rate), ⁇ is tl e inter-departure time of control packets at the sender.
  • is tl e inter-departure time of control packets at the sender.
  • the ingress inserts a timestamp and the measured average input rate ( ⁇ ; ).
  • the average output rate v is measured over the time interval between arrivals of consecutive control packets at the egress 206.
  • the egress node 206 now has all the three quantities required to do the computation: ( ⁇ ; - v ; ) x ⁇ and add it to a running estimate of accumulation.
  • the running estimate of accumulation is also reset at the end of a congestion epoch to avoid propagation of measurement errors.
  • One way of implementing the control packet flow required for this mechanism without adding extra traffic is for the ingress 202 to piggy-back rate and timestamp information in a shim header on two data packets in each interval T. Interior IP routers ignore the shim headers, while the egress 206 strips them out.
  • the detection of the end of a congestion epoch, or in general an un- congested network is based upon samples of one-way delay.
  • the egress 202 updates the minimum one-way delay seen so far. Every time a one-way delay sample is within e of the minimum one-way delay, the egress 206 declares that the network is un- congested and stops sending negative feedback.
  • the minimum oneway delay captures the fixed components of delay such as transmission, propagation and processing (not queuing delays). The delay over and above this minimum one-way delay is a rough measure of queuing delays.
  • edge-to-edge control can be used to distribute backlog across the edges, as illustrated in FIGS. 5(a) and 5(b). This increases the effective number of buffers allowing more TCP connections to obtain large enough windows to survive loss without timing out. This reduces TCP's bias against tiny flows and thus improves fairness.
  • goodput is defined herein as the number of transmitted payload bits excluding retransmissions per unit time) when many TCP connections compete for the bottleneck. As expected, this improvement increases as congestion is distributed across more edges.
  • Edge-to-edge control can also be combined with Packeteer TCP rate control (TCPR) to provide a low-loss-end-to-end service for TCP connections.
  • TCPR Packeteer TCP rate control
  • Low-loss it is meant that the method typically does not incur loss in the steady-state.
  • IEC Packeteer TCP rate control
  • the combined IED + TCPR method does not require upgrading either end-systems or the interior network.
  • both the Explicit Edge Control ( EEC ) and the Implicit dge Control (IEC) methods can be deployed one class at a time improving performance as the number of edge controlled class increases. For example, deployment can be piggybacked with the roll-out of services or MPLS, SII C fl these techniques can work with either architecture. Both methods are transparent to end-systems, but require software components to be installed at the edges of the network. Such edge components can be installed as upgrades to routers or stand-alone units.
  • the above described apparat us and method proude loi an improved data network by elevating congestion at network bottlenecks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

L'invention concerne un appareil et un procédé de recouvrement pour augmenter la commande d' encombrement au mieux et la qualité de service dans Internet, appelée commande de trafic de bord-à-bord (Fig. 3). L'architecture de base fonctionne au niveau de la couche réseau et implique de repousser l'encombrement à partir de l' intérieur d' un réseau, de le distribuer dans des noeuds de bord (202, 206, Fig. 3) où les problèmes d' encombrement moindres peuvent être traités à l'aide de procédés flexibles, sophistiqués et moins coûteux. Les blocs de base de trafic de bord-à-bord ainsi créés peuvent être utilisés en tant que base de différentes applications. Ces applications incluent la commande des flux TCP et non TCP, l'amélioration de la variabilité dimensionnelle intermédiaire, le développement des services différenciés simples et l'isolation des attaques portant sur les refus des services de largeur de bande. Les procédés sont flexibles, peuvent être combinés avec d'autres protocoles (comme le MPLS et autres services), ne nécessitent pas de normalisation et peuvent être déployés rapidement.
PCT/US2001/006263 2000-02-29 2001-02-28 Commande de trafic de bord-a-bord pour internet Ceased WO2001065394A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US10/204,222 US20040037223A1 (en) 2001-02-28 2001-02-28 Edge-to-edge traffic control for the internet
AU2001245355A AU2001245355A1 (en) 2000-02-29 2001-02-28 Edge-to-edge traffic control for the internet

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US18579500P 2000-02-29 2000-02-29
US60/185,795 2000-02-29

Publications (2)

Publication Number Publication Date
WO2001065394A1 true WO2001065394A1 (fr) 2001-09-07
WO2001065394A9 WO2001065394A9 (fr) 2002-12-27

Family

ID=22682474

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/006263 Ceased WO2001065394A1 (fr) 2000-02-29 2001-02-28 Commande de trafic de bord-a-bord pour internet

Country Status (2)

Country Link
AU (1) AU2001245355A1 (fr)
WO (1) WO2001065394A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100450081C (zh) * 2005-06-10 2009-01-07 华为技术有限公司 进行流量控制的方法和系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5818815A (en) * 1994-09-19 1998-10-06 International Business Machines Corporation Method and an apparatus for shaping the output traffic in a fixed length cell switching network node
US5878228A (en) * 1996-11-15 1999-03-02 Northern Telecom Limited Data transfer server with time slots scheduling base on transfer rate and predetermined data
US6151300A (en) * 1996-05-10 2000-11-21 Fujitsu Network Communications, Inc. Method and apparatus for enabling flow control over multiple networks having disparate flow control capability
US6185187B1 (en) * 1997-12-10 2001-02-06 International Business Machines Corporation Method and apparatus for relative rate marking switches

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5818815A (en) * 1994-09-19 1998-10-06 International Business Machines Corporation Method and an apparatus for shaping the output traffic in a fixed length cell switching network node
US6151300A (en) * 1996-05-10 2000-11-21 Fujitsu Network Communications, Inc. Method and apparatus for enabling flow control over multiple networks having disparate flow control capability
US5878228A (en) * 1996-11-15 1999-03-02 Northern Telecom Limited Data transfer server with time slots scheduling base on transfer rate and predetermined data
US6185187B1 (en) * 1997-12-10 2001-02-06 International Business Machines Corporation Method and apparatus for relative rate marking switches

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100450081C (zh) * 2005-06-10 2009-01-07 华为技术有限公司 进行流量控制的方法和系统

Also Published As

Publication number Publication date
AU2001245355A1 (en) 2001-09-12
WO2001065394A9 (fr) 2002-12-27

Similar Documents

Publication Publication Date Title
US20040037223A1 (en) Edge-to-edge traffic control for the internet
Al-Saadi et al. A survey of delay-based and hybrid TCP congestion control algorithms
Parsa et al. Improving TCP congestion control over internets with heterogeneous transmission media
Kuzmanovic et al. TCP-LP: A distributed algorithm for low priority data transfer
US8125910B2 (en) Communication system
Wang et al. Adaptive bandwidth share estimation in TCP Westwood
EP1417808B1 (fr) Procédé et système pour le contrôle de congestion dans un réseau de communications
US6839321B1 (en) Domain based congestion management
Kalampoukas et al. Explicit window adaptation: A method to enhance TCP performance
Grieco et al. TCP westwood and easy RED to improve fairness in high-speed networks
JP2008530841A (ja) ネットワーク監視
Wang et al. A simple refinement of slow-start of TCP congestion control
Jain et al. The TCP bandwidth-delay product revisited: network buffering, cross traffic, and socket buffer auto-sizing
Man et al. ImTCP: TCP with an inline measurement mechanism for available bandwidth
Aweya et al. Enhancing TCP performance with a load‐adaptive RED mechanism
Capone et al. Bandwidth estimates in the TCP congestion control scheme
Biswal et al. Comparative analysis of compound tcp with various end-to-end congestion control mechanisms
Zhang et al. Optimizing TCP start-up performance
Aweya et al. Enhancing network performance with TCP rate control
WO2001065394A1 (fr) Commande de trafic de bord-a-bord pour internet
Chan et al. Performance improvement of congestion avoidance mechanism for TCP Vegas
Shihada et al. Threshold-based TCP Vegas over optical burst switched networks
Kim et al. The FB-RED algorithm for TCP over ATM
Lu et al. EQF: An explicit queue-length feedback for TCP congestion control in datacenter networks
Ho et al. An enhanced slow-start mechanism for TCP Vegas

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: C2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: C2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

COP Corrected version of pamphlet

Free format text: PAGES 1/5-5/5, DRAWINGS, REPLACED BY NEW PAGES 1/5-5/5; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWE Wipo information: entry into national phase

Ref document number: 10204222

Country of ref document: US

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP