
CN103733582B - Method and apparatus for scalable packet scheduling of a large number of sessions - Google Patents

Method and apparatus for scalable packet scheduling of a large number of sessions

Info

Publication number
CN103733582B
Authority
CN
China
Prior art keywords
packet
queue
session
queues
end time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201280039828.7A
Other languages
Chinese (zh)
Other versions
CN103733582A (en)
Inventor
刘德明
易恳
刘品中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/210,576 (US20130044755A1)
Application filed by Huawei Technologies Co Ltd
Publication of CN103733582A
Application granted
Publication of CN103733582B
Status: Expired - Fee Related
Anticipated expiration


Abstract

An apparatus comprising: a plurality of queues configured to buffer a plurality of packets corresponding to a plurality of sessions; a scheduler configured to schedule the packets out of the different queues for forwarding at the egress of each corresponding queue based on an end time of each packet; and an egress link coupled to the scheduler and configured to transfer the scheduled packets from all of the queues using a bandwidth shared by all of the queues, wherein the end time is computed dynamically based on the amount of bandwidth allocated to the corresponding queue, and wherein the queues are assigned corresponding weights to share the total bandwidth.

Description

Method and apparatus for scalable packet scheduling of a large number of sessions

Cross-Reference to Related Applications

Not applicable.

Statement Regarding Federally Sponsored Research or Development

Not applicable.

Reference to a Microfiche Appendix

Not applicable.

Technical Field


Background

Modern communication and data networks consist of nodes that transport data through the network. The nodes may include routers, switches, bridges, or combinations thereof that transport individual data packets or frames through the network. A node may forward multiple packets corresponding to different sessions or flows in parallel. The packets of the different sessions or flows may be received on multiple ingress ports and forwarded on multiple egress ports of the node. Additionally, packets of different flows may be enqueued or buffered for some time in corresponding queues or buffers before being sent out of the node. Packets in different queues may be forwarded over the same egress link and thus share the available or assigned bandwidth of that link. A scheduler at the node is typically used to schedule and coordinate the forwarding, over the same egress link, of the packets buffered in the different queues, for example by selecting packets from the different queues at different time slots designated by the scheduler.

Summary

In one embodiment, the disclosure includes an apparatus comprising: a plurality of queues configured to buffer a plurality of packets corresponding to a plurality of sessions; a scheduler configured to schedule the packets out of the different queues for forwarding at the egress of each corresponding queue based on an end time of each packet; and an egress link coupled to the scheduler and configured to forward the packets scheduled from all of the queues using a bandwidth shared by all of the queues, wherein the end time is computed dynamically based on the amount of bandwidth allocated to the corresponding queue, and wherein the queues are assigned corresponding weights to share the total bandwidth.

In another embodiment, the disclosure includes a network component comprising: a receiver configured to receive a plurality of packets corresponding to a plurality of sessions; one or more storage units configured to store a plurality of queues that buffer the packets of the corresponding sessions; a logic unit configured to compute an end time for each packet detected at the head of a corresponding queue and to assign the detected packets to time slots of a calendar queue so that the packets are forwarded in ascending order of end time; and a transmitter configured to send, over an output link, the packets assigned to the time slots in the order of the time slots. In yet another embodiment, the disclosure includes a method implemented by a network device comprising: scanning a plurality of queues of a plurality of packet sessions to detect any backlogged packets in the queues; assigning a plurality of packets detected at the heads of the queues to a plurality of time slots in a calendar table in ascending order of a plurality of end times computed for the packets with respect to the bandwidth allocated to the packet sessions; scanning the time slots in the calendar table in sequence to detect the assigned packets; and forwarding the detected assigned packets in order on a shared egress link.

These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.

Brief Description of the Drawings

For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.

FIG. 1 is a schematic diagram of an embodiment of a scheduler architecture.

FIG. 2 is a schematic diagram of an embodiment of a typical calendar queue.

FIG. 3 is a schematic diagram of another embodiment of a calendar queue with poor scalability.

FIG. 4 is a schematic diagram of another embodiment of a calendar queue with improved scalability.

FIG. 5 is a schematic diagram of an embodiment of a multi-level scheduler hierarchy.

FIG. 6 is a schematic diagram of an embodiment of a multi-level scheduling scheme.

FIG. 7 is a chart of an embodiment of a packet scheduling workload.

FIG. 8 is a chart of an embodiment of scheduling fairness.

FIG. 9 is a chart of an embodiment of the number of slots scanned for sending packets.

FIG. 10 is a flowchart of an embodiment of a packet scheduling and forwarding method.

FIG. 11 is a schematic diagram of an embodiment of a network unit.

FIG. 12 is a schematic diagram of an embodiment of a general-purpose computer system.

Detailed Description

It should be understood at the outset that, although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques described below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

Disclosed herein is a system and method for improved packet scheduling and forwarding, for example at a network node. A scheduler may be used to efficiently schedule the forwarding of a plurality of packets that correspond to a plurality of sessions and are buffered in a plurality of corresponding queues at the network node. A packet scheduling policy or algorithm may be implemented to handle the scalability problem caused by skipping idle queues, which arises, for example, when a large number of sessions are handled and a substantial portion of the sessions are idle. The packet scheduling policy or algorithm may skip a bounded number of idle queues to serve or forward the packets in the queues. The algorithm may have a time complexity of O(1) for all or most packet arrangements or conditions. Similar to other algorithms, such as the weighted fair queuing (WFQ) algorithm, the algorithm may also have a fairness property for handling the packets of different sessions. The policy or algorithm may be implemented efficiently using software only, software with limited hardware support, or hardware.

FIG. 1 illustrates an embodiment of a scheduler architecture 100 that may be used to schedule packets of different sessions or flows for forwarding. The scheduler architecture 100 may be implemented in a network component or node, e.g., a router, bridge, switch, or any other component of a network that forwards packets or frames. The packets may belong to different sessions or flows that may be received or generated at the network component. The scheduler architecture 100 may comprise a plurality of queues or buffers 110, a scheduling unit or scheduler 120 coupled to all of the queues 110, and an output or egress link 130 coupled to the scheduler 120. The queues or buffers 110 may buffer or temporarily store incoming or generated packets until the packets can be sent by the scheduler 120 to the egress link 130 for transmission. The scheduler 120 may select packets from the different queues for transmission, in some order, over the same shared egress link 130 based on a scheduling algorithm that ensures fairness in the packet selection process, as described below. The egress link 130 may forward or transmit all of the packets of the different sessions and may have a fixed, assigned bandwidth that is shared among all of the sessions.
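
The following Python sketch (names and structure are illustrative, not taken from the patent) shows one way the elements of FIG. 1 could be modeled: per-session queues 110 with weights, a scheduler 120, and a shared egress link 130.

```python
from collections import deque

class SessionQueue:
    """One per-session queue (element 110); the weight governs its bandwidth share."""
    def __init__(self, session_id: int, weight: float):
        self.session_id = session_id
        self.weight = weight            # w_i used when sharing the egress bandwidth
        self.packets = deque()          # buffered packets (here: packet lengths in bytes)

class EgressLink:
    """Shared egress link (element 130) with a fixed total bandwidth R."""
    def __init__(self, bandwidth_bps: float):
        self.bandwidth_bps = bandwidth_bps
    def transmit(self, session_id: int, pkt_len: int) -> None:
        print(f"tx session {session_id}: {pkt_len} bytes")   # placeholder for the physical port

class Scheduler:
    """Scheduler (element 120): selects packets from the session queues for the link."""
    def __init__(self, queues: list, link: EgressLink):
        self.queues = queues
        self.link = link
```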

In general, the scheduling algorithm may ensure fairness in allocating portions of the total bandwidth of the egress link 130, as in the WFQ scheduling algorithm. Under WFQ, n sessions (n is an integer) may share one output link of bandwidth R such that each session i has a weight w_i. Each session may have a guaranteed rate r_i = (w_i / Σ_j w_j) · R. WFQ may emulate the fluid model of generalized processor sharing (GPS) and define a virtual time V(t), where V(t) = 0 when no session has backlogged packets and otherwise dV(t)/dt = R / Σ_{i∈B(t0,t)} w_i, where B(t0, t) is the set of sessions backlogged during [t0, t]. WFQ may further define, for the k-th packet of session i, a virtual start time S_i^k = max(V(a_i^k), F_i^{k-1}) and a corresponding virtual end time F_i^k = S_i^k + L_i^k / w_i, where a_i^k is the arrival time of the k-th packet of session i and L_i^k is the length of the k-th packet of session i. The virtual end time is determined when the packet is enqueued, i.e., when the packet is added to its queue. The packets in the queues are served in ascending order of the virtual end times of the backlogged packets. Different schemes may be used to reduce the complexity of WFQ, for example using a calendar queue or using self-clocked fair queuing (SCFQ). SCFQ may reduce the virtual time overhead of WFQ by using the virtual end time of the packet currently being transmitted as the current virtual time.
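
As a point of reference, the sketch below illustrates the classical WFQ bookkeeping just described: the virtual finish time is stamped at enqueue time and packets are served in ascending finish-time order. It is a simplified illustration (the virtual-time clock V(t) is passed in rather than derived from the GPS fluid model), and all names are illustrative.

```python
import heapq
from dataclasses import dataclass, field

@dataclass
class WfqState:
    heap: list = field(default_factory=list)         # entries: (F_i^k, session_id, pkt_len)
    last_finish: dict = field(default_factory=dict)  # F_i^{k-1} per session

def wfq_enqueue(state: WfqState, session_id: int, weight: float,
                pkt_len: int, v_arrival: float) -> float:
    """Stamp the k-th packet of session i with S_i^k and F_i^k at enqueue time."""
    start = max(v_arrival, state.last_finish.get(session_id, 0.0))  # S_i^k
    finish = start + pkt_len / weight                               # F_i^k = S_i^k + L_i^k / w_i
    state.last_finish[session_id] = finish
    heapq.heappush(state.heap, (finish, session_id, pkt_len))
    return finish

def wfq_dequeue(state: WfqState):
    """Serve the backlogged packet with the smallest virtual finish time."""
    return heapq.heappop(state.heap) if state.heap else None
```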

FIG. 2 illustrates an embodiment of a typical calendar queue 200, which may be used to reduce the computational complexity of the WFQ scheduling algorithm. A calendar queue may also be referred to as a calendar table. The calendar queue 200 may comprise a plurality of time slots that correspond to a plurality of sessions or queues. The time slots may be traversed in sequence to forward the corresponding assigned packets at each time slot. The time slots may be processed in a circular manner by restarting at the first time slot (slot 0) after the last time slot (slot 9) has been processed. Although ten time slots (0 through 9) are shown in FIG. 2, the calendar queue 200 may comprise a substantially larger number of time slots, e.g., up to about 1,000,000 time slots or more. Sorted backlogged packets may be assigned to the time slots using WFQ, and the backlogged packets may then be forwarded according to the sequence of the assigned time slots.

One or more backlogged packets (e.g., P1, P2, P3, P4, etc.) may be assigned to each time slot using WFQ. A backlogged packet may be scheduled to be served (or forwarded) at the entry or time slot of the calendar queue 200 determined by the packet's quantized end time. The calendar queue 200 may be traversed repeatedly, and a packet assigned to a time slot, if present, may be served. At each current time, one of the time slots may be scanned for an assigned packet. If a packet is found, the packet may be transmitted from its corresponding queue over the shared output link before moving to the next time slot in the calendar queue 200. Otherwise, the next time slot may be scanned for an assigned packet. This process may be repeated in a circular sequence, in which all of the time slots in the calendar queue 200 may be traversed in order multiple times. Using the calendar queue 200 may reduce the run-time (e.g., real-time) computational complexity of WFQ. However, the calendar queue 200 may be quite sparse, e.g., it may comprise a large number of unassigned or empty time slots corresponding to empty queues, and these slots may still be scanned before an assigned time slot is reached. This may make scalability substantially poor and may reduce overall performance.
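
A minimal sketch of the calendar-queue mechanics described above follows: packets are placed in the slot given by their quantized end time, and the slots are scanned cyclically. NUM_SLOTS and SLOT_GRANULARITY are assumed parameters for illustration, not values taken from the patent.

```python
NUM_SLOTS = 1_000_000        # the text mentions calendars of about 1,000,000 slots
SLOT_GRANULARITY = 500.0     # assumed amount of "end time" quantized into one slot

calendar = [[] for _ in range(NUM_SLOTS)]   # each slot holds the packets assigned to it

def assign_to_slot(end_time: float, packet) -> int:
    """Quantize the packet's end time into a slot index and append the packet there."""
    slot = int(end_time // SLOT_GRANULARITY) % NUM_SLOTS
    calendar[slot].append(packet)
    return slot

def scan_next(current_slot: int, transmit) -> int:
    """Scan slots cyclically from current_slot, serving the first non-empty slot found."""
    for _ in range(NUM_SLOTS):                       # at most one full pass
        if calendar[current_slot]:
            for packet in calendar[current_slot]:
                transmit(packet)                     # send on the shared output link
            calendar[current_slot].clear()
            return (current_slot + 1) % NUM_SLOTS
        current_slot = (current_slot + 1) % NUM_SLOTS
    return current_slot                              # calendar was entirely empty
```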

FIG. 3 illustrates an embodiment of another calendar queue 300 with poor scalability. The calendar queue 300 may comprise about 1,000,000 time slots (from 0 to 999,999), of which only the first time slot (slot 0) may be assigned a packet. The unassigned time slots may correspond to empty queues that contain no backlogged packets. This may occur because not all sessions or flows are active all the time, and hence some of the queues may be empty. Traversing such a large number of unassigned or empty time slots in the calendar queue 300 may require substantial overhead, for example in a software implementation. In this case, with only one packet assigned to one time slot, only about one millionth of the total bandwidth may be used, and the remaining bandwidth may be wasted even though the remaining empty time slots are still being scanned.

To improve the scalability and performance of packet scheduling and forwarding for different sessions, the scheduling algorithm may be improved so that a larger portion of the time slots in the calendar queue are assigned packets, resulting in a denser calendar queue. This may lead to better utilization of the total bandwidth, since substantially all of the scanned time slots in the calendar queue are assigned packets from non-empty queues that contain backlogged packets.

Specifically, if there is no packet backlog at the queue, the new virtual time for the k-th packet of session i may be defined as V_i^k = 0. Otherwise, V_i^k is set equal to the end time of the packet currently being served. In contrast to WFQ, the start time and end time of a packet may be computed as late as when the packet moves to the head or egress of its queue, rather than at enqueue time. The improved algorithm may define a virtual start time S_i^k = max(V_i^k, F_i^{k-1}) and a corresponding virtual end time F_i^k = S_i^k + (Σ_{j∈B} w_j / (w_i · R)) · L_i^k, where B is the set of all active sessions at the time the packet moves to the head of its queue. The remaining parameters are similar to the corresponding parameters described above.
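
The sketch below is one reading of these definitions in Python: the start and end times are computed only when a packet reaches the head of its queue, and the coefficient depends on which sessions are active at that moment. It is an interpretation with illustrative names, not the claimed implementation.

```python
def end_time_at_head(weight_i: float, pkt_len: int, active_weights: list,
                     R: float, v_i: float, last_finish_i: float) -> tuple:
    """Compute S_i^k and F_i^k when the k-th packet of session i reaches the queue head.

    v_i is V_i^k: 0.0 if queue i had no backlog, otherwise the end time of the
    packet currently being served.  active_weights holds w_j for every session j
    in B (the sessions active at this moment); R is the shared link bandwidth.
    """
    start = max(v_i, last_finish_i)                   # S_i^k
    coeff = sum(active_weights) / (weight_i * R)      # grows or shrinks as sessions go active/idle
    finish = start + coeff * pkt_len                  # F_i^k = S_i^k + coeff * L_i^k
    return start, finish
```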

According to the algorithm presented above, only a packet at the head of its corresponding queue is assigned to an entry of the calendar queue or table determined by its end time. As a result, most of the time slots in the calendar queue may each be assigned a packet, and the time slots may therefore be traversed more quickly to find packets to be served. In some cases the calendar queue may still contain unassigned time slots corresponding to empty queues, but the number of such slots may be substantially reduced, and hence the performance and scalability of packet scheduling and forwarding may be substantially improved. This may also substantially improve bandwidth utilization, since most or a larger portion of the shared output link bandwidth may be used to transmit packets at the different time slots.

As described above, the improved algorithm may provide a work-conserving policy.

This work-conserving policy may correspond to O(1) work-conserving scheduling.

Further, packets may be assigned to the time slots of the calendar queue or table without using a physical timer. The end times may also be computed dynamically, because the coefficient used to compute the end time (Σ_{j∈B} w_j / (w_i · R)) may vary with the amount of bandwidth allocated to the session, instead of being a fixed value as in WFQ. The dynamic change in the coefficient may reflect sessions switching between active (having backlogged packets) and idle (having no backlogged packets). This may make the calendar table denser regardless of the number of sessions considered. On average, or in general, about S time slots may be scanned to serve about S packets (S is an integer). The algorithm may also support quality of service (QoS) requirements, in which different packet sessions may have different weights or priorities when sharing the egress link bandwidth.

FIG. 4 illustrates an embodiment of another calendar queue 400 with improved scalability based on the improved algorithm described above. The calendar queue 400 may comprise about 1,000,000 time slots (from 0 to 999,999), of which most or a large number may each be assigned a packet located at the head of a corresponding queue. The calendar queue 400 may contain no unassigned time slots, or a negligible number of unassigned time slots corresponding to empty queues without backlogged packets. The calendar queue 400 is denser than the calendar queue 300 obtained using WFQ and may therefore be processed with little computational overhead, e.g., mostly in software. In other embodiments, hardware may also be used instead of, or together with, software.

The algorithm above may also be implemented in a multi-level queuing hierarchy, in which a queue at one level may be coupled to a plurality of queues at a lower level. As such, packets originating from the lower-level queues may be scheduled using the improved scheme described above and forwarded to the higher-level queue. A scheduler at each level may implement the improved scheduling algorithm to forward packets from the different queues over a shared output link. FIG. 5 illustrates an embodiment of a multi-level scheduler hierarchy 500 in which the improved scheduling scheme may be implemented at each level. The multi-level scheduler hierarchy 500 may be implemented in a network component or node, e.g., a router, bridge, switch, or any other component of a network that forwards packets or frames. The packets may belong to different sessions or flows that may be received or generated at the network component. Alternatively, the multi-level scheduler hierarchy 500 may be implemented across a plurality of nodes that are coupled in a multi-level hierarchy, e.g., in a tree topology. In that case, the packets may belong to different sessions or flows that may be forwarded along the nodes at the different levels.

The multi-level scheduler hierarchy 500 may comprise at least two levels of queues and corresponding schedulers, where each level may be based on the scheduler architecture 100. As such, a first-level scheduler architecture 501 may comprise a plurality of first-level queues or buffers 510 (queues 4, 5, and 6), a first-level scheduling unit or scheduler 520 coupled to all of the first-level queues 510, and a first-level output or egress link 530 coupled to the first-level scheduler 520. Additionally, a second-level scheduler architecture 502 coupled to the first-level scheduler architecture 501 may comprise a plurality of second-level queues or buffers 512 (queues 1, 2, and 3), a second-level scheduling unit or scheduler 522 coupled to all of the second-level queues 512, and a second-level output or egress link 532 coupled to the second-level scheduler 522. The second-level scheduler architecture 502 may be coupled to the first-level scheduler architecture 501 by coupling one of the second-level queues 512 (queue 3) to the first-level egress link 530. The components of the first-level scheduler architecture 501 and the similar components of the second-level scheduler architecture 502 may be configured similarly to the corresponding components of the scheduler architecture 100.

The first-level scheduler architecture 501 and the second-level scheduler architecture 502 may be implemented in the same network node. Alternatively, the first-level scheduler architecture 501 may be implemented in a first node of a tree, and the second-level scheduler architecture 502 may be implemented in a second node at the next higher level of the tree that is coupled to the first node. The first-level scheduler 520 may use a first-level calendar queue 540 to implement the improved scheduling algorithm described above, efficiently scheduling packets and forwarding them from the first-level queues 510 (queues 4, 5, and 6) over the first-level egress link 530 to the second-level queue 512 (queue 3). Similarly, the second-level scheduler 522 may use a second-level calendar queue 542 to implement the improved scheduling algorithm described above, efficiently scheduling packets and forwarding them from the second-level queues 512 (queues 1, 2, and 3) over the second-level egress link 532. As such, the first-level calendar queue 540 and the second-level calendar queue 542 may be substantially dense, e.g., similar to the calendar queue 400.

FIG. 6 illustrates an embodiment of a multi-level scheduling scheme 600 for a multi-level scheduler hierarchy. The multi-level scheduler hierarchy may comprise three levels of queues and corresponding schedulers (not shown), where each level may be based on the scheduler architecture 100. The three levels of queues may comprise first-level queues, including queues 4 through 12, which may have corresponding weights (w_i) as indicated in FIG. 6. The three levels of queues may also comprise second-level queues, including queues 1, 2, and 3, which may also have corresponding weights (w_i), and a third-level queue 0. The first-level and second-level queues may have corresponding first-level and second-level schedulers (not shown) that implement the improved scheduling algorithm described above (using corresponding calendar tables) to forward packets over shared egress links (e.g., to the higher-level queues). Specifically, the packets of queues 4, 5, and 6 may be scheduled and forwarded over a shared link to queue 1; the packets of queues 7, 8, and 9 may be scheduled and forwarded over a shared link to queue 2; and the packets of queues 10, 11, and 12 may be scheduled and forwarded over a shared link to queue 3. Further, the packets of queues 1, 2, and 3 may be scheduled and forwarded to queue 0. The packets at queue 0 may then be forwarded over the same output link.
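
To make the hierarchy of FIG. 6 concrete, the Python sketch below wires three first-level schedulers into a second-level scheduler. The weights for queues 1 and 2 (5 and 4) follow the fairness discussion below; all other weights, and the class and field names, are assumed for illustration only.

```python
class LevelScheduler:
    """One scheduler per level; its egress feeds one queue of the parent level."""
    def __init__(self, name: str, child_weights: dict):
        self.name = name
        self.child_weights = child_weights   # weight w_i per child queue at this level
        self.calendar = []                   # per-level calendar table (details omitted)

# Level 1: queues 4-6 feed queue 1, 7-9 feed queue 2, 10-12 feed queue 3 (weights assumed).
level1 = [
    LevelScheduler("into queue 1", {4: 1.0, 5: 2.0, 6: 3.0}),
    LevelScheduler("into queue 2", {7: 1.0, 8: 1.0, 9: 2.0}),
    LevelScheduler("into queue 3", {10: 2.0, 11: 2.0, 12: 1.0}),
]

# Level 2: queues 1-3 feed queue 0, whose packets leave on the shared output link.
level2 = LevelScheduler("into queue 0", {1: 5.0, 2: 4.0, 3: 3.0})
```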

FIG. 7 illustrates an embodiment of a packet scheduling workload 700 corresponding to the multi-level scheduling scheme 600. The packet scheduling workload 700 is shown as a curve of the computed average number of instructions per packet dequeue versus the calendar queue slot size (in bytes). The instructions used to dequeue packets (i.e., to remove packets from the head or egress of a queue) were executed on an i686 processor platform. The curve shows that the number of instructions decreases from a slot size of about 50 bytes to a slot size of about 500 bytes. This indicates that the average number of instructions per dequeue may depend on the slot size used in the calendar queue. However, over the same range of examined slot sizes, the average number of instructions for enqueuing a packet, i.e., adding a packet to a queue (curve not shown), was found to be a fixed value of about 32 instructions. This indicates that the average number of instructions for packet enqueuing (i.e., adding a packet at the entry of a queue) may be independent of the slot size in the calendar queue.

FIG. 8 illustrates an embodiment of scheduling fairness 800 for the multi-level scheduling scheme 600. The scheduling fairness 800 is shown as curves of computed fairness ratios versus the calendar queue slot size (in bytes) used above. Specifically, the fairness ratio was computed for queue 1 relative to queue 2 in the multi-level scheduling scheme 600. Ideally, the fairness ratio should be equal to the ratio of the weight of queue 1 to the weight of queue 2, i.e., 5/4 = 1.25. The two curves shown correspond to serving 250 packets and serving 500 packets from the calendar tables of queue 1 and queue 2. The curves show that the larger the number of packets served, the closer the fairness ratio is to the ideal fairness ratio. Further, the fairness ratios for the different slot sizes were found to be approximately equal when 500 packets were served. The fairness when serving 1,000 packets (not shown) was also computed over the same range of slot sizes and was found to be approximately equal to the ideal fairness ratio of 1.25. This shows that, as the number of packets increases, the improved scheduling algorithm achieves a fairness ratio closer to the ideal or desired fairness ratio between the queues.

FIG. 9 illustrates an embodiment of the number of scanned slots 900 used for sending packets in the multi-level scheduling scheme 600. The number of scanned slots 900 is shown as two curves of the number of scanned slots versus the calendar queue slot size (in bytes) used above. Specifically, the number of scanned slots shown corresponds to serving 1,000 packets in the calendar queue. The two curves correspond to the actual number of scanned slots and to a corresponding estimate based on theoretical analysis. The two values were found to be approximately equal, and hence the two curves coincide. The curves indicate that the number of scanned slots decreases as the calendar queue slot size increases, allowing more packets to be served per time slot. Therefore, increasing the slot size may increase the rate at which packets are forwarded out of their queues and may improve bandwidth utilization and scalability.

FIG. 10 illustrates an embodiment of a packet scheduling and forwarding method 1000, which may be used to schedule and forward a plurality of packets of a plurality of sessions in parallel, i.e., sharing the same egress link bandwidth. The packet scheduling and forwarding method 1000 may be based on the improved scheduling algorithm described above and may be implemented by one network node, or by a plurality of network nodes in a multi-level queuing or scheduling scenario. The method may begin at block 1010, where a plurality of queues may be scanned to detect backlogged packets in the queues. At block 1020, a plurality of packets detected at the heads of the queues may be assigned to a plurality of time slots in a calendar table in ascending order of packet end times, where the end times are computed dynamically with respect to the bandwidth allocated to the packets' sessions, as described above. At block 1030, the time slots in the calendar queue may be scanned in sequence to detect the assigned packets. At block 1040, the detected assigned packets may be forwarded in order on the same egress link or port. Blocks 1010 through 1040 may be repeated to continue forwarding the packets received/generated and enqueued at the network node.
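
Combining the pieces above, the sketch below is a simplified, single-threaded reading of method 1000 (blocks 1010 through 1040). In a real implementation the calendar is scanned continuously and the virtual time advances as packets are served; those details, and all names, are assumptions made for illustration.

```python
from collections import deque

def scheduling_round(queues: dict, weights: dict, R: float,
                     calendar: list, slot_granularity: float,
                     last_finish: dict, v_current: float, transmit) -> None:
    """One pass of method 1000. queues maps session id -> deque of packet lengths."""
    # Blocks 1010/1020: scan the queues, stamp each head packet, assign it to a slot.
    active = [i for i, q in queues.items() if q]          # backlogged sessions (set B)
    total_w = sum(weights[i] for i in active)
    for i in active:
        pkt_len = queues[i][0]                            # head-of-queue packet only
        start = max(v_current, last_finish.get(i, 0.0))
        finish = start + pkt_len * total_w / (weights[i] * R)
        last_finish[i] = finish
        slot = int(finish // slot_granularity) % len(calendar)
        calendar[slot].append((i, pkt_len))
    # Blocks 1030/1040: scan the slots in order and forward the assigned packets.
    for slot in range(len(calendar)):
        for i, pkt_len in calendar[slot]:
            transmit(i, pkt_len)
            queues[i].popleft()                           # the head packet has been sent
        calendar[slot].clear()

# Example with assumed values: two sessions sharing a link of rate R = 1.0
# queues = {1: deque([300, 500]), 2: deque([1500])}
# scheduling_round(queues, {1: 5.0, 2: 4.0}, 1.0, [[] for _ in range(1000)], 500.0,
#                  {}, 0.0, lambda i, n: print(f"tx session {i}: {n} bytes"))
```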

FIG. 11 illustrates an embodiment of a network unit 1100, which may be any device that transports and processes data through a network, e.g., the label switching system 100. For instance, the network unit 1100 may be located at any of the network components described above, e.g., at an RP, an LC, an edge node, a forwarding node, or a server. The network unit 1100 may comprise one or more ingress ports or units 1110 coupled to a receiver (Rx) 1112 for receiving packets, objects, or type-length-values (TLVs) from other network components. The network unit 1100 may comprise a logic unit 1120 that determines which network components to send the packets to. The logic unit 1120 may also implement or support the dynamic configuration and forwarding method 1200 and the service reachability forwarding schemes 900 and/or 1000. The logic unit 1120 may be implemented using hardware, software, or both. The network unit 1100 may also comprise one or more egress ports or units 1130 coupled to a transmitter (Tx) 1132 for transmitting packets or data to the other network components. The components of the network unit 1100 may be arranged as shown in FIG. 11.

The network components described above may be implemented on any general-purpose network component, such as a computer or network component with sufficient processing power, memory resources, and network throughput capability to handle the necessary workload placed upon it. FIG. 12 illustrates a typical general-purpose network component 1200 suitable for implementing one or more embodiments of the components disclosed herein. The network component 1200 includes a processor 1202 (which may be referred to as a central processing unit or CPU) that is in communication with memory devices including secondary storage 1204, read-only memory (ROM) 1206, random access memory (RAM) 1208, input/output (I/O) devices 1210, and network connectivity devices 1212. The processor 1202 may be implemented as one or more CPU chips, or may be part of one or more application-specific integrated circuits (ASICs).

The secondary storage 1204 is typically comprised of one or more disk drives or tape drives and is used for non-volatile storage of data and as an overflow data storage device if the RAM 1208 is not large enough to hold all working data. The secondary storage 1204 may be used to store programs that are loaded into the RAM 1208 when such programs are selected for execution. The ROM 1206 is used to store instructions and perhaps data that are read during program execution. The ROM 1206 is a non-volatile memory device that typically has a small memory capacity relative to the larger memory capacity of the secondary storage 1204. The RAM 1208 is used to store volatile data and perhaps to store instructions. Access to both the ROM 1206 and the RAM 1208 is typically faster than access to the secondary storage 1204.

At least one embodiment is disclosed, and variations, combinations, and/or modifications of the embodiment(s) and/or features of the embodiment(s) made by a person having ordinary skill in the art are within the scope of the disclosure. Alternative embodiments that result from combining, integrating, and/or omitting features of the embodiment(s) are also within the scope of the disclosure. Where numerical ranges or limitations are expressly stated, such express ranges or limitations should be understood to include iterative ranges or limitations of like magnitude falling within the expressly stated ranges or limitations (e.g., from about 1 to about 10 includes 2, 3, 4, etc.; greater than 0.10 includes 0.11, 0.12, 0.13, etc.). For example, whenever a numerical range with a lower limit, R_l, and an upper limit, R_u, is disclosed, any number falling within the range is specifically disclosed. In particular, the following numbers within the range are specifically disclosed: R = R_l + k*(R_u - R_l), wherein k is a variable ranging from 1% to 100% with a 1% increment, i.e., k is 1%, 2%, 3%, 4%, 5%, ..., 70%, 71%, 72%, ..., 95%, 96%, 97%, 98%, 99%, or 100%. Moreover, any numerical range defined by two R numbers as defined above is also specifically disclosed. Use of the term "optionally" with respect to any element of a claim means that the element is required, or alternatively the element is not required, both alternatives being within the scope of the claim. Use of broader terms such as "comprises", "includes", and "having" should be understood to provide support for narrower terms such as "consisting of", "consisting essentially of", and "comprised substantially of". Accordingly, the scope of protection is not limited by the description set out above but is defined by the claims that follow, that scope including all equivalents of the subject matter of the claims. Each and every claim is incorporated as further disclosure into the specification, and the claims are embodiment(s) of the disclosure. The discussion of a reference in the disclosure is not an admission that it is prior art, especially any reference that has a publication date after the priority date of this application. The disclosures of all patents, patent applications, and publications cited in the disclosure are hereby incorporated by reference, to the extent that they provide exemplary, procedural, or other details supplementary to the disclosure.

While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods may be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system, or certain features may be omitted or not implemented.

In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component, whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.

Claims (21)

1. An apparatus for scalable packet scheduling of a large number of sessions, comprising:
a plurality of queues configured to buffer a plurality of packets corresponding to a plurality of sessions;
a scheduler configured to schedule the packets out of the different queues for forwarding at the egress of each corresponding queue based on an end time of each packet; and
an egress link coupled to the scheduler and configured to transfer the scheduled packets from all of the queues using a total bandwidth shared by all of the queues,
wherein the end time is computed dynamically based on the amount of bandwidth allocated to the corresponding queue,
wherein the queues are assigned corresponding weights to share the total bandwidth, and
wherein the end time is computed only for a packet at the head of a queue, and only packets located at the heads of the queues are scheduled.
2. The apparatus according to claim 1, wherein the dynamic computation of the end time reflects sessions switching between active and idle.
3. The apparatus according to claim 1, wherein the weights assigned to the queues are based on quality of service (QoS) requirements of the respective sessions.
4. The apparatus according to claim 1, further comprising:
a plurality of second queues configured to buffer a plurality of packets corresponding to a plurality of sessions, the plurality of second queues comprising one second queue coupled to the egress link;
a second scheduler configured to schedule the packets out of the different second queues for forwarding at the egress of each corresponding second queue based on an end time of each packet; and
a second egress link coupled to the second scheduler and configured to transfer the scheduled packets from all of the second queues using a total bandwidth shared by all of the second queues.
5. The apparatus according to claim 4, wherein the queues, the scheduler, the egress link, the second queues, the second scheduler, and the second egress link correspond to the same network node.
6. The apparatus according to claim 4, wherein the queues, the scheduler, and the egress link correspond to a first network node, and wherein the second queues, the second scheduler, and the second egress link correspond to a second network node that is located at a higher level in a tree and is coupled to the first network node.
7. The apparatus according to claim 1, wherein the scheduled packets are assigned to a plurality of corresponding time slots in a calendar table, wherein the assigned packets are forwarded on the egress link according to the order of the time slots, and wherein the calendar table is substantially dense and comprises a number of unassigned time slots that is far smaller than the number of assigned time slots.
8. The apparatus according to claim 1, wherein the end time is computed for the k-th packet of session i as F_i^k = S_i^k + (Σ_{j∈B} w_j / (w_i · R)) · L_i^k, wherein S_i^k is the start time computed for the k-th packet, V_i^k = 0 if there is no packet backlog at the i-th queue, or otherwise V_i^k is set equal to the end time of the last packet that was served at the i-th queue, w_i is the weight of session i in terms of allocated bandwidth, B is the set of all active sessions at the time the k-th packet moves to the head of the queue, R is the total bandwidth of the output link shared among all of the sessions, and L_i^k is the length of the k-th packet of session i.
9. A network component for scalable packet scheduling of a large number of sessions, comprising:
a receiver configured to receive a plurality of packets corresponding to a plurality of sessions;
one or more storage units configured to store a plurality of queues that buffer the packets of the corresponding sessions;
a logic unit configured to compute an end time for each packet detected at the head of a corresponding queue, and to assign the detected packets to time slots of a calendar queue so that the packets are forwarded in ascending order of the end times; and
a transmitter configured to send, over an output link, the packets assigned to the time slots according to the order of the time slots.
10. The network component according to claim 9, wherein the computation of the end time of a packet is based on a coefficient that changes dynamically according to the amount of bandwidth allocated to the session of the packet.
11. The network component according to claim 9, wherein the end time is computed for the k-th packet of session i as F_i^k = S_i^k + (Σ_{j∈B} w_j / (w_i · R)) · L_i^k, wherein S_i^k is the start time computed for the k-th packet, V_i^k = 0 if there is no packet backlog at the i-th queue, or otherwise V_i^k is set equal to the end time of the last packet that was served at the i-th queue, w_i is the weight of session i in terms of allocated bandwidth, B is the set of all active sessions at the time the k-th packet moves to the head of the queue, R is the total bandwidth of the output link shared among all of the sessions, and L_i^k is the length of the k-th packet of session i.
12. The network component according to claim 11, wherein assigning the packets to be forwarded according to the end times F_i^k provides O(1) work-conserving scheduling.
13. The network component according to claim 12, wherein, on average, S time slots are scanned in the calendar queue to serve S packets.
14. The network component according to claim 9, wherein the number of time slots is equal to 1,000,000 time slots or more.
15. A method for scalable packet scheduling of a large number of sessions, comprising:
scanning a plurality of queues of a plurality of packet sessions to detect any backlogged packets in the queues;
assigning a plurality of packets detected at the heads of the queues to a plurality of time slots in a calendar table in ascending order of a plurality of end times, wherein the end times are computed for the packets with respect to the bandwidth allocated to the packet sessions;
scanning the time slots in the calendar table in sequence to detect the assigned packets; and
forwarding the detected assigned packets in order on a shared egress link.
16. The method according to claim 15, wherein the queues are scanned continuously to detect any new backlogged packets in the queues, and wherein the time slots are scanned recursively by restarting at the first time slot after the last time slot has been scanned.
17. The method according to claim 15, wherein only the packets detected at the egresses of the queues are assigned to the time slots, according to the ascending order of their computed end time values.
18. The method according to claim 15, wherein the number of instructions in the packet dequeuing process is related to the size of the time slots, and wherein the average number of instructions in the packet enqueuing process is independent of the size of the time slots.
19. The method according to claim 15, wherein, as the number of packets increases, the packets belonging to the different queues are forwarded closer to the ideal fairness ratio between the queues.
20. The method according to claim 15, wherein increasing the time slot size of the calendar queue reduces the number of time slots scanned to forward the assigned packets.
21. The method according to claim 15, wherein the end time is computed for the k-th packet of session i as F_i^k = S_i^k + (Σ_{j∈B} w_j / (w_i · R)) · L_i^k, wherein S_i^k is the start time computed for the k-th packet, V_i^k = 0 if there is no packet backlog at the i-th queue, or otherwise V_i^k is set equal to the end time of the last packet that was served at the i-th queue, w_i is the weight of session i in terms of allocated bandwidth, B is the set of all active sessions at the time the k-th packet moves to the head of the queue, R is the total bandwidth of the output link shared among all of the sessions, and L_i^k is the length of the k-th packet of session i.
CN201280039828.7A 2011-08-16 2012-08-14 Method and apparatus for scalable packet scheduling of a large number of sessions Expired - Fee Related CN103733582B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/210,576 US20130044755A1 (en) 2011-08-16 2011-08-16 Scalable Packet Scheduling Policy for Vast Number of Sessions
US13/210,576 2011-08-16
PCT/US2012/050773 WO2013025703A1 (en) 2011-08-16 2012-08-14 A scalable packet scheduling policy for vast number of sessions

Publications (2)

Publication Number Publication Date
CN103733582A CN103733582A (en) 2014-04-16
CN103733582B true CN103733582B (en) 2016-11-30



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6810012B1 (en) * 1996-08-13 2004-10-26 Nanying Yin Queue service interval based cell schedular with hierarchical queuing configurations
US5905730A (en) * 1997-04-04 1999-05-18 Ascend Communications, Inc. High speed packet scheduling method and apparatus
US7426215B2 (en) * 2004-04-06 2008-09-16 Intel Corporation Method and apparatus for scheduling packets
CN101075963A (en) * 2007-07-02 2007-11-21 中兴通讯股份有限公司 Method and device for controlling dynamically based on network QoS

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BSFQ: Bin Sort Fair Queueing; Shun Y. Cheung and Corneliu S. Pencea; The Conference on Computer Communications; 2002-06-23; entire document *


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20161130)