WO2001022667A1 - Method and system for efficient routing table compression and fast routing lookups - Google Patents
- Publication number
- WO2001022667A1 (PCT/SE2000/001841)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pointer
- bits
- level
- code word
- bit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- H04Q3/665—Circuit arrangements for traffic distributors
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/74591—Address table lookup; Address filtering using content-addressable memories [CAM]
- H04Q2213/13034—A/D conversion, code compression/expansion
- H04Q2213/13056—Routines, finite state machines
- H04Q2213/13097—Numbering, addressing
- H04Q2213/13141—Hunting for free outlet, circuit or channel
- H04Q2213/13196—Connection circuit/link/trunk/junction, bridge, router, gateway
- H04Q2213/13204—Protocols
- H04Q2213/13216—Code signals, frame structure
- H04Q2213/13353—Routing table, map memory
- H04Q2213/13389—LAN, internet
Definitions
- TITLE METHOD AND SYSTEM FOR EFFICIENT ROUTING TABLE COMPRESSION AND FAST ROUTING LOOKUPS
- the present invention relates generally to a method and system of IP routing lookups in a routing table, comprising entries of arbitrary length prefixes with associated next-hop information in a next-hop table, to determine where IP datagrams are to be forwarded.
- the Internet is an interconnected set of networks, wherein each of the constituent networks retains its identity, and special mechanisms are needed for communication across multiple networks.
- the constituent networks are referred to as subnetworks.
- Each subnetwork in the Internet supports communication among the devices connected to that subnetwork.
- IWUs (interworking units)
- a particular IWU is a router, which is used to connect two networks that may or may not be similar.
- the router employs an internet protocol (IP) present in each router and each host of the network.
- IP internet protocol
- IP provides a connectionless or datagram service between stations.
- Routing is generally accomplished by maintaining a routing table in each station and router that indicates, for each possible destination network, the next router to which the IP datagram should be sent.
- IP routers do a routing lookup in a routing table to determine where IP datagrams are to be forwarded.
- the result of the operation is the next-hop on the path towards the destination.
- An entry in a routing table is conceptually an arbitrary length prefix with associated next-hop information. Routing lookups must find the routing entry with the longest matching prefix.
- IP routing lookups have been inherently slow and complex operations; prior art solutions have led to a proliferation of techniques to avoid using them.
- IP router designs can use special-purpose hardware to do IP processing. This can be an inflexible solution: any changes in the IP format or protocol could invalidate such designs. The flexibility of software and the rapid performance increase of general-purpose processors make such solutions preferable.
- a further objective is to increase routing lookup speeds with a conventional microprocessor.
- Another objective is to minimize lookup time in the forwarding table.
- Another further objective of the invention is to provide a data structure which can fit entirely in the cache of a conventional microprocessor.
- an IP routing lookup method and system, in which the system is a data structure that can represent large routing tables in a very compact form and can be searched quickly using few memory references.
- An advantage of the invention is an efficient lookup implementation resulting in better instruction cache behavior and more stable lookup costs.
- FIG 1 is a schematical view of a router design
- FIG 2 is an illustrative view of a direct next-hop encoded in the code word at level 1 in the data structure according to the invention
- FIG 3 is an illustrative view of a direct next-hop encoded in a pointer at level 1 in the data structure according to the invention
- FIG 4 is an illustrative view of a pointer to the level 2 CPA encoded in the code word and pointer at level 1 in the data structure according to the invention
- FIG 5 is an illustrative view of a direct next-hop encoded in the code word at level 2 in the data structure according to the invention
- FIG 6 is an illustrative view of a direct next-hop encoded in the pointer at level 2 in the data structure according to the invention
- FIG 7 is an illustrative view of a pointer to the level 3 CPA encoded in the code word and pointer at level 2 in the data structure according to the invention.
- FIG 8 is an illustrative view of a direct next-hop encoded in the code word at level 3 in the data structure according to the invention
- FIG 9 is an illustrative view of a next-hop index encoded in a pointer at level 3 in the data structure according to the invention
- FIG 10 is a high level view of the method according to the invention for creating data structure means for representing the set of dominating points in the IP address space for each prefix in a routing table;
- FIG 11 is a block diagram of a first embodiment of a hardware implementation layout of the system for IP routing lookup in a routing table according to the invention.
- FIG 12 is a block diagram of a second embodiment of a hardware implementation layout of the system for IP routing lookup in a routing table according to the invention
- FIG 13 is an illustrative view of code words used in a third embodiment of a hardware implementation of the system for IP routing lookup in a routing table according to the invention
- FIG 14 is a block diagram of a third embodiment of a hardware implementation layout of the system for IP routing lookup in a routing table according to the invention.
- FIG 15 is an illustrative view of a 32 bits level 1 code word used in a software implementation of the invention.
Detailed Description of the Invention
- a router design comprises a number of network inbound interfaces 1, network outbound interfaces 2, forwarding engines 3, and a network processor 4, all of which are interconnected by a connection fabric 5.
- Inbound interfaces 1 send packet headers to the forwarding engines 3 through the connection fabric 5.
- the forwarding engines 3 determine which outgoing interface 2 the packet should be sent to. This information is sent back to the inbound interface 1, which forwards the packet to the outbound interface 2.
- the only task of a forwarding engine 3 is to process packet headers. All other tasks, such as participating in routing protocols, resource reservation, handling packets that need extra attention, and other administrative duties, are handled by the network processor 4.
- Each forwarding engine 3 uses a local version of the routing table, a forwarding table, downloaded from the network processor 4 and stored in storage means in the forwarding engine 3, to make its routing decisions. It is not necessary to download a new forwarding table for each routing update. Routing updates can be frequent but since routing protocols need some time to converge, forwarding tables can grow a little stale and need not change more than at most once per second.
- the network processor 4 needs a dynamic routing table designed for fast updates and fast generation of forwarding tables.
- the forwarding tables can be optimized for lookup speed and need not be dynamic.
- the data structure should need few instructions during lookup and keep the entities naturally aligned as much as possible to avoid expensive instructions and cumbersome bit-extraction operations.
- a routing table is a database of prefixes where each prefix is associated with next-hop information in the form of an index into a next-hop table.
- a prefix is essentially a bit string consisting of 0 to 32 bits - the length of the prefix. The first bit is numbered 31, the second is numbered 30, and so on until the bit string ends.
- bits 31, 30, 29, 28, and 27 are specified. This means that such a prefix matches any IP address where bits 31, 30, 29, 28, and 27 of the IP address equal the corresponding bits of the prefix. Since only the five first (most significant) bits of the prefix are specified, it does not matter what bits 0 to 26 of the IP address are.
- a prefix of length zero matches any IP address.
- the result of a routing lookup is always the index associated with the longest matching prefix in the routing table.
- an IP address starting with 101... matches both the prefix 10* and the prefix 101*, where the * denotes the end of the prefix.
- the prefix 101* is longer than 10*. Therefore, the preferred result of a routing lookup among these two prefixes is the index associated with the prefix 101*.
- a prefix can also be represented by a range of size 2^i of the IP address space 0..4294967295, for some non-negative integer i.
- the starting point of such a range must be a multiple of 2^i. Otherwise, it is not a prefix. This is called the power of 2 property.
- the prefix 10* starts at 2147483648 and ends at 3221225471.
- the size of the range is 1073741824, which equals 2^30.
- the starting point equals 2·2^30.
- Every prefix is a range possessing the power of 2 property and vice versa. Therefore, we do not need to distinguish between ranges possessing the power of 2 property and prefixes.
- the length of a prefix is defined as the number of bits specified and the size of a prefix is defined as the size of the range of the IP address space covered by the prefix.
- Computing the binary representation of the starting point of a prefix is achieved by padding the bit string representation of the prefix with zeroes to the right until the bit string becomes 32 bits long.
- the size of a prefix of length i is given by 2^(32-i).
Representing the Routing Table as a Partition of Prefixes
- in a routing table there will typically be a few large prefixes that completely cover smaller prefixes.
- redundant information is removed by creating an auxiliary routing table containing exactly the same routing information but with no covering prefixes - a partition of prefixes. Covering prefixes are recursively split in half until the sizes of the remaining prefixes from the larger prefixes are the same as the smaller prefixes covered, which means that they can be removed. Then, the remaining prefixes from the covering prefix are removed. This method is used repeatedly, starting with the smallest prefixes and continuing to larger and larger prefixes as long as there are larger prefixes left that cover smaller prefixes. The result is a set of prefixes that forms a partition of the IP address space.
- the prefix 001* is covered by the prefix 0*.
- the prefix * still covers all prefixes and is therefore split
- a routing table can also be represented as a binary tree, where each node has zero, one or two children. A node with zero children is called external node or leaf and a node with one or two children is called internal node.
- the next-hop index for the prefix 01* is stored in the right (1st) child of the left (0th) child of the root. The next-hop index for * is stored in the root.
- the tree representation of the final prefix partition resulting from applying the method described above, possesses the following important properties:
- the method is sometimes referred to as tree completion - the tree representation of the original routing table is completed into a full binary tree.
- prefix partitioning yields a set of prefixes, where each prefix is associated with next-hop information, that properly partitions the IP address space; it suffices to represent the partition as the set of the starting points (first IP addresses) of the prefixes.
- the information required for representing the table from the example above is the following.
- the XTC is a data structure for efficiently representing the set of dominating points by using three levels of compressed pointer arrays.
- Each compressed pointer array (CPA) consists of a chunk and an array of associated 20 bits pointers and/or 15 bits next-hop indices.
- the CPA constituting the first level represents an uncompressed pointer array of size 65536 pointers (bits 31..16 of the IP address) and the CPAs constituting levels 2 and 3 represent 256 pointers each.
- a compressed pointer array is represented by a chunk and a list of pointers.
- the chunk consists of code words, where each code word represents 16 bits of an implicit bit array representing the chunk.
- the size of the implicit bit array is the same as the maximum number of pointers in the CPA.
- the bit with index i is set if and only if there is a prefix starting at position i. By starting point, we mean that the bits inspected (31..16 at level 1, 15..8 at level 2, and 7..0 at level 3) of the dominating point, when indexing into the CPA, equal i.
- the implicit bit array is, in turn, represented as 16 implicit bit masks. There are three classes of bit masks:
- next-hop index is encoded directly in the code word.
- Each such code word represents 16 pointers that would be required if the pointer array was not compressed.
- the code words 4..1 will be identical.
- the bit masks with more than two bits set represent prefixes which do not completely cover a bit mask. For each such prefix, the starting point and end point lie in the same bit mask.
- the number of different kinds of bit masks of length 2^x with at least one bit set is given by the following recurrence equation:
- bit masks with two or more bits set.
- Such bit masks and the corresponding pointers are represented by a combination of a code word and a pointer list associated with the code word.
- Each pointer either constitutes a 15 bits next-hop index or a 20 bits pointer to the next level CPA.
- for the prefix with the smallest size (for a prefix with starting point and end point within the same bit, the size is 1).
- Each prefix in the set is either of size 1, 2, 4, or 8. If the size is 16 it covers the bit mask completely and the direct encoding described above is applicable (the bit mask has exactly one bit set). All prefixes that have a larger size than the prefix with the smallest size can be repeatedly split in half until the sizes of all prefixes are equal.
- a part of the IP address will be used to index into the CPA.
- the high 12 (on level 1) or 4 (otherwise) bits are the index of the code word.
- the 4 low (least significant) bits will be used to index into the bit mask of the code word (if the code word does not contain an encoded direct next-hop).
- when indexing into the bit mask we will either index a bit that is set (among the set bits) to retrieve the pointer associated with that bit, or we index a bit which is not set. If the bit is not set, we will need to retrieve the pointer associated with the closest set bit to the left of the indexed bit.
- there is a pointer list associated with each bit mask/code word.
- Such pointer lists will either contain 2, 4, 8, or 16 pointers.
- the index of the closest bit set to the left, among the set bits, (and also the index of the corresponding pointer in the pointer list) can be computed by a simple shift operation. That is, the low 4 bits used to index into the bit mask, in the lookup, are shifted code bits to the right. This is to be compared to the costly map table lookup as is used in the FTC algorithm.
The Chunk
- In the XTC data structure, the chunk consists of an array of code words of fixed size (the size depends on the level). To reduce the number of bits required for representing the index of the first pointer associated with each code word (the beginning of the pointer list associated with the code word), the pointers in the XTC data structure are stored in the same array as the chunks. The pointer array associated with a chunk is stored immediately after the last code word of the chunk.
- the pointer array consists of either 20 bits pointers and/or 15 bits next-hop indices. It is represented by an array of 16 bits integers, where the most significant bit indicates whether the low 15 bits represent a next-hop index or the low 15 bits of a 20 bits pointer.
- To represent a 20 bits pointer to the next level we use 5 bits of the code word for representing the high 5 bits of the 20 bits pointer. These bits are simply appended to the pointer when it is retrieved. This clearly requires that, in any list of pointers to the next level associated with the same bit mask, bits 15..19 must be identical. This is achieved by moving the chunks at the next level during construction if the low 15 bits of the pointers wrap while building the chunks at the next level associated with the bit mask in construction at the current level (see Section 3 for details).
- each code word consists of:
- if the code word directly encodes a next-hop, there are 15 bits representing the next-hop index. Otherwise there are:
- Level 1: a 65536 Elements Compressed Pointer Array
- the first level (level 1) of the XTC data structure is represented by a CPA with 4096 code words and 0..65536 pointers.
- the first step in the lookup is to use bits 31..20 of the IP address as index to the 24 bits code word which is retrieved (in the software implementation, we store the 24 bits code word in a 32 bits integer for alignment reasons). If the most significant bit of the code word is set (represented by white text on black background in Figure 2), the following 15 bits represent a next-hop index which can be retrieved immediately and the lookup is completed. The remaining bits of the code word are unused in this case.
- bits 19..16 of the IP address are used to index into the code word to find the pointer in the pointer list. This is achieved by shifting the 4 bits of the IP address with the 2 bits integer retrieved from the code word (the code), and, thereafter, adding the 16 bits offset also retrieved from the code word (see Figure 3). The resulting number is the index of the pointer stored in the pointer list associated with the chunk. If the most significant bit of the pointer is set, the low 15 bits constitute a next-hop index and the lookup is completed (Figure 3).
- if the pointer does not represent a direct next-hop, its low 15 bits are retrieved and combined with the 5 bits stored in the code word, to form a 20 bits pointer to the level 2 CPA (Figure 4) where the lookup continues.
- the compressed pointer arrays at level 2 are stored in one big array of 16 bits integers, where the size of each CPA depends on the number of pointers in its pointer list, which is stored immediately after the list of code words. Therefore, it is not possible to index the ith CPA at level 2. Instead, the level 2 CPAs have to be referenced by their starting points (the index of the first code word) and that is the reason for needing 20 bits pointers when representing large routing tables.
- Level 2: a 256 Elements Compressed Pointer Array
- the second level of the XTC consists of a big array of level 2 CPAs. Each such CPA consists of 16 code words and 0..256 pointers.
- the third step in the lookup is to use bits 15..12 of the IP address as index to the 16 bits code word which is retrieved. If the most significant bit of the code word is set (represented by white text on black background in Figure 5), the following 15 bits represent a next-hop index which can be retrieved immediately and the lookup is completed.
- otherwise, bits 11..8 of the IP address are used to index into the code word to find the pointer in the pointer list. This is achieved by shifting the 4 bits of the IP address with the 2 bits integer retrieved from the code word (the code), and, thereafter, adding the 8 bits offset also retrieved from the code word (see Figure 6). The resulting number is the index of the pointer stored in the pointer list associated with the chunk. If the most significant bit of the pointer is set, the low 15 bits constitute a next-hop index and the lookup is completed (Figure 6).
- if the pointer does not represent a direct next-hop, its low 15 bits are retrieved and combined with the 5 bits stored in the code word, to form a 20 bits pointer to the level 3 CPA (Figure 7) where the lookup continues.
- the compressed pointer arrays at level 3 are also stored in one big array of 16 bits integers, where the size of each CPA depends on the number of pointers in its pointer list. Hence, also here a 20 bits pointer is needed for representing the location of the next level CPA.
- Level 3: a 256 Elements Compressed Pointer Array
- the third level of the XTC consists of a big array of level 3 CPAs. Each such CPA consists of 16 code words and 0..256 pointers.
- the lookup continues as follows in a similar way as at level 2.
- the fifth step is to use bits 7..4 of the IP address as index to the 16 bits code word which is retrieved. If the most significant bit of the code word is set (represented by white text on black background in Figure 8), the following 15 bits represent a next-hop index which can be retrieved immediately and the lookup is completed.
- otherwise, bits 3..0 of the IP address are used to index into the code word to find the pointer in the pointer list. This is achieved by shifting the 4 bits of the IP address with the 2 bits integer retrieved from the code word (the code), and, thereafter, adding the 8 bits offset also retrieved from the code word (see Figure 9).
- the resulting number is the index of the pointer stored in the pointer list associated with the chunk.
- at this point, all bits of the IP address have been inspected. Hence, the pointer retrieved from the pointer list must represent a next-hop index. Therefore, this step completes the lookup (see Figure 9).
- the next-hop index is computed in 6 memory accesses, accessing a total of 13 (14) bytes, in the worst case.
- the XTC is built by traversing the routing table entries, building the next-hop table, and constructing the compressed XTC representation, in a single pass.
- the only requirement on the construction is that the routing entries are traversed in a certain order, and that a function is called on each entry.
- the representation of the dynamic routing table, from which the XTC forwarding table is built is completely independent from the XTC construction algorithm.
- the complete builder process consists of three steps that are executed simultaneously (see Figure 10).
- the dynamic routing table is traversed in sorted order with respect to the IP address (starting point) and the length of the IP netmask. Entries with smaller values of the IP address are processed before entries with larger IP addresses. For entries with equal IP addresses, the entry with the shortest IP netmask (shortest prefix) is processed first.
- the tree traversal is intimately related with the representation of the dynamic routing table used, which in Effnet's case is a Radix Tree, and is therefore not described.
- Each routing entry is sent to the routing entry conversion and next-hop table generation process as shown in Figure 10.
- This process efficiently splits routing entries into prefixes and discards prefixes that cover other prefixes as described in the introduction. It is also responsible for generating the next-hop table from the next-hop information (interface, MAC address, ATM channel etc.) that is present in the dynamic routing table.
- each prefix resulting from splitting each routing entry is fed into the XTC builder process which is described in detail in the following section.
- the same routing entry conversion and next-hop table generation process can be used both for the FTC builder process and the XTC builder process (and also for other representations). Therefore, a detailed description is not given here. (It will be subject to a separate patent application.)
- each prefix generated immediately propagates to the XTC construction in progress.
- Each prefix has a starting point and an end point as described in the introduction.
- the prefixes are fed to the XTC construction in sorted order.
- the starting point of the next prefix to be processed equals the end point of the current prefix plus one.
- bucket3: Buckets containing 16 pointers associated with the current level 3 code word.
- offset1: Pointer offset associated with current level 1 code word.
- offset2: Pointer offset associated with current level 2 code word.
- offset3: Pointer offset associated with current level 3 code word.
- mask1: Bit mask associated with current level 1 code word.
- mask2: Bit mask associated with current level 2 code word.
- mask3: Bit mask associated with current level 3 code word.
- For each prefix processed, the expect variable is checked against the starting point to ensure that the stream of prefixes is valid. The processing of each prefix consists of an initialization phase, a construction phase, and a completion phase.
- the whole construction is initialized by setting base1, base2, and base3 to the memory addresses of the respective levels, and by zeroing pos2, pos3, from2, and from3.
- Level 1 construction is also initialized by setting chunk1 to base1, and pointer1 to the memory address where the level 1 pointer list is to be stored, i.e. to base1 + 8192 if the pointer list is stored immediately after the chunk.
- offset1 is set to zero.
- the initialization performs the following.
- the depth of construction is 12, 16, 20, 24, 28, or 32
- the following formula is used:
- the initialization continues by processing the particular initialization for the depth computed. However, for one prefix, construction at several depths may need to be initialized, starting at the greatest depth.
- Depth 32 initialization: Determine if the construction of a new level 3 code word begins by using the following formula: !(expect & ((1 << 4) - 1))
- level 3 code word construction is initialized by setting mask3 to 0, bucket3[0], ..., bucket3[15] to EMPTY, and then continuing with the depth 28 initialization.
- bits 15..19 of from3 and pos3 are compared to ensure that they are equal. Recall that from3 and pos3 are the positions of the first level 3 chunk and the current level 3 chunk, respectively, associated with the current level 2 code word in construction. Also, recall that the 20 bits pointers to these chunks will be stored at level 2 with the 5 high bits in the code word and the low 15 bits in the pointers (represented by from3 and pos3) associated with the code word. Therefore, it is required that bits 15..19 of from3 and pos3 are equal.
- Moving the CPAs is then achieved by repeatedly assigning base3[index] to base3[index + delta], where index loops from pos3 - 1 to from3. It remains to update the pointers stored in the buckets associated with the level 2 code word in construction since the locations of the level 3 CPAs associated with these have changed. This is achieved by increasing each pointer stored in the bucket2 array (EMPTY buckets are skipped) by delta and then keeping only the low 15 bits by a simple bit masking operation. Finally, from3 is set to wrap and pos3 is increased by delta, before completing the depth 28 initialization, by setting chunk3 to base3 + pos3, pointer3 to chunk3 + 16, and offset3 to zero, and continuing with the depth 24 initialization.
- level 2 code word construction is initialized by setting mask2 to 0, bucket2[0], ..., bucket2[15] to EMPTY, and then continuing with the depth 20 initialization.
- the depth 20 initialization is completed by setting chunk2 to base2 + pos2, pointer2 to chunk2 + 16, which is the size of the chunk, and offset2 to zero. If the bits are not equal, it is needed to adjust the compressed pointer arrays built at level 2, starting from from2, so that the high 5 bits become equal and the 20 bits pointer encoding is possible. This is achieved by computing pos2 & 0xf8000 and assigning this value to a wrap variable. Then the distance delta of the adjustment (how many positions to move the CPAs) is computed as the difference between wrap and from2.
- Moving the CPAs is then achieved by repeatedly assigning base2[index] to base2[index + delta], where index loops from pos2 - 1 to from2. It remains to update the pointers stored in the buckets associated with the level 1 code word in construction since the locations of the level 2 CPAs associated with these have changed. This is achieved by increasing each pointer stored in the bucket1 array (EMPTY buckets are skipped) by delta and then keeping only the low 15 bits by a simple bit masking operation. Finally, from2 is set to wrap and pos2 is increased by delta, before completing the depth 20 initialization, by setting chunk2 to base2 + pos2, pointer2 to chunk2 + 16, and offset2 to zero, and continuing with the depth 16 initialization.
- Depth 16 initialization: Determine if the construction of a new level 1 code word begins by using the following formula: !(expect & ((1 << 20) - 1))
- level 1 code word construction is initialized by setting mask1 to 0, bucket1[0], ..., bucket1[15] to EMPTY, and then continuing with the depth 12 initialization.
- the expect variable is updated by setting it to the end point of the prefix being processed plus one. Then, the processing continues with the construction phase where construction takes place at the depth computed in the beginning of the processing.
- a direct next-hop stored in a level 3 pointer is built as follows. Compute a bit index bix by extracting bits 0..3 from the starting point of the prefix. Then, bit bix of mask3 is set and the next-hop index of the prefix is stored (without encoding) in bucket3[bix]. In the completion phase, the content of the buckets will be moved to the pointer array associated with the level 3 code word in construction.
- Direct next-hops directly encoded in level 3 code words are built as follows.
- the indices, start and stop, of the first and last code word to be built are computed by extracting bits 4..7 of the starting point and end point of the prefix respectively.
- the next-hop index of the prefix is encoded by setting bit 15 and stored in the chunk by assigning the encoded direct next-hop to chunk3[start], ..., chunk3[stop].
- a direct next-hop stored in a level 2 pointer is built as follows. Compute a bit index bix by extracting bits 8..11 from the starting point of the prefix. Then, bit bix of mask2 is set and the next-hop index of the prefix is encoded by setting bit 15 and is thereafter stored in bucket2[bix]. In the completion phase, the content of the buckets will be moved to the pointer array associated with the level 2 code word in construction.
- Direct next-hops directly encoded in level 2 code words are built as follows.
- the indices, start and stop, of the first and last code word to be built are computed by extracting bits 12..15 of the starting point and end point of the prefix respectively.
- the next-hop index of the prefix is encoded by setting bit 15 and stored in the chunk by assigning the encoded direct next-hop to chunk2[start], ..., chunk2[stop].
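The depth 24 encodings above might be sketched as follows. The names bucket2 and chunk2 follow the text; the helper function names are our own, and a direct next-hop is marked by setting bit 15 of the 16 bits entry as described.

```c
#include <assert.h>
#include <stdint.h>

/* mark a next-hop index as a direct next-hop by setting bit 15 */
static uint16_t encode_direct_nexthop(uint16_t nexthop_index)
{
    return (uint16_t)(nexthop_index | (1u << 15));
}

/* bits 8..11 of the prefix starting point select the level 2 bucket */
static unsigned level2_bucket_index(uint32_t start)
{
    return (start >> 8) & 0xf;
}

/* bits 12..15 of the start and end points select the first and last
 * level 2 code word; the encoded next-hop is stored in all of them */
static void store_in_level2_chunk(uint16_t chunk2[16], uint32_t start,
                                  uint32_t end, uint16_t nexthop_index)
{
    for (unsigned i = (start >> 12) & 0xf; i <= ((end >> 12) & 0xf); i++)
        chunk2[i] = encode_direct_nexthop(nexthop_index);
}
```

The same pattern, with different bit ranges, applies at the other depths described in this section.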
- Depth 16 construction. A direct next-hop stored in a level 1 pointer is built as follows. Compute a bit index bix by extracting bits 16..19 from the starting point of the prefix. Then, bit bix of mask1 is set and the next-hop index of the prefix is encoded by setting bit 15 and is thereafter stored in bucket1[bix]. In the completion phase, the content of the buckets will be moved to the pointer array associated with the level 1 code word in construction.
- Direct next-hops directly encoded in level 1 code words are built as follows.
- the indices, start and stop, of the first and last code word to be built are computed by extracting bits 20..31 of the starting point and end point of the prefix respectively.
- the next-hop index of the prefix is encoded by shifting it 8 bits to the left, setting bit 23, and stored in the chunk by assigning the encoded direct next-hop to chunk1[start], ..., chunk1[stop].
- the level 1 chunk contains 24 bits integers and in this case, the lowest 8 bits are unused. In the software implementation, these 24 bits will be stored in a 32 bits integer for alignment reasons and this requires a slightly changed encoding.
- the processing of the prefix is completed with the completion phase, which starts at the same depth as where the construction took place. For one prefix, completion at several depths may be required starting at the greatest depth. Depth 32 completion:
- the pointers are retrieved from the buckets and stored in the pointer array associated with the code word by looping through the buckets specified in the table above. For each non-EMPTY bucket the content of the bucket is assigned to pointer3[offset3], and for each EMPTY bucket pointer3[offset3 - 1] is assigned to pointer3[offset3]. Thereafter, offset3 is increased by one before proceeding with the next bucket. After processing each bucket listed in the table above, the code word is completed and the processing continues with depth 28 completion.
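The bucket-to-pointer-array loop above can be sketched as follows. An EMPTY bucket repeats the previous pointer, since the prefix covering it continues; the EMPTY sentinel value and the function signature are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EMPTY 0xffffu   /* assumed sentinel for an empty bucket */

/* Flush nbuckets buckets into the pointer array starting at offset;
 * returns the updated offset. */
static unsigned flush_buckets(const uint16_t *bucket, unsigned nbuckets,
                              uint16_t *pointer, unsigned offset)
{
    for (unsigned i = 0; i < nbuckets; i++) {
        if (bucket[i] != EMPTY)
            pointer[offset] = bucket[i];           /* take the bucket's pointer */
        else
            pointer[offset] = pointer[offset - 1]; /* repeat the previous one */
        offset++;
    }
    return offset;
}
```

The same loop is reused, with pointer2/offset2 and pointer1/offset1, in the depth 24 and depth 20 completions described below.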
- the level 2 bit index bix is computed by extracting bits 8..11 from the starting point of the prefix followed by setting bit bix of mask2.
- the low 15 bits of pos3, which is the position of the level 3 chunk being completed, are stored in bucket2[bix], followed by adding 16 + offset3 to pos3 to prepare for the construction of the next level 3 chunk, before proceeding to the depth 24 completion.
- the index tix of the code word to be completed is computed by extracting bits 12..15 from the starting point of the prefix. Then mask2 is analyzed to compute the 2 bits shift value shift. This is achieved by the same simple procedure as used in the depth 32 completion. Thereafter, the shift value and the encoding are stored in the code word.
- bits 15..19 of the pointers to the level 3 CPAs associated with the code word are extracted from from3 and encoded as bits 10..14 of the code word, followed by setting from3 to pos3 to prepare for the next level 2 code word construction.
- the pointers stored in the buckets must be moved to the pointer list associated with the code word. Depending on the shift value, pointers are either retrieved from 2, 4, 8, or 16 buckets in the same way as in the depth 32 completion.
- the pointers are retrieved from the buckets and stored in the pointer array associated with the code word by looping through the buckets specified in the table above. For each non-EMPTY bucket the content of the bucket is assigned to pointer2[offset2], and for each EMPTY bucket pointer2[offset2 - 1] is assigned to pointer2[offset2]. Thereafter, offset2 is increased by one before proceeding with the next bucket. After processing each bucket listed in the table above, the code word is completed and the processing continues with depth 20 completion.
- the level 1 bit index bix is computed by extracting bits 16..19 from the starting point of the prefix followed by setting bit bix of mask1.
- the low 15 bits of pos2, which is the position of the level 2 chunk being completed, are stored in bucket1[bix], followed by adding 16 + offset2 to pos2 to prepare for the construction of the next level 2 chunk, before proceeding to the depth 16 completion.
- Depth 16 completion. Determine if the construction of a level 1 code word ends by using the following formula (recall that the expect variable has been updated after the initialization phase):
- bits 15..19 of the pointers to the level 2 CPAs associated with the code word are extracted from from2 and encoded as bits 20..23 of the code word, followed by setting from2 to pos2 to prepare for the next level 1 code word construction.
- the pointers stored in the buckets must be moved to the pointer list associated with the code word.
- pointers are either retrieved from 2, 4, 8, or 16 buckets in the same way as in the depth 32 completion.
- the pointers are retrieved from the buckets and stored in the pointer array associated with the code word by looping through the buckets specified in the table above. For each non-EMPTY bucket the content of the bucket is assigned to pointer1[offset1], and for each EMPTY bucket pointer1[offset1 - 1] is assigned to pointer1[offset1]. Thereafter, offset1 is increased by one before proceeding with the next bucket. After processing each bucket listed in the table above, the code word is completed and the processing continues with depth 12 completion. Depth 12 completion:
- the XTC lookup is extremely suitable for hardware implementations.
- In Figure 11, a rough layout of a direct hardware implementation of the algorithm described in the previous sections is shown.
- the IP address is fed into the lookup from the left and progresses through the lookup in six steps until the next-hop index is available to the right. It is assumed that the data structure is stored with each level in a separate memory. That is, one for the one level 1 chunk, one for the array of associated level 1 pointers, one for level 2 chunks and pointers, and one for level 3 chunks and pointers (these can also be mapped into one memory). If the lookup is completed in less than six steps, the result is locked by feeding bit 15 of the result into the chain of multiplexers.
- the code words at level 1 cannot easily be represented by 24 bits without adding an extra memory access. Therefore, these are represented by 32 bits words with the pointer offset stored in the high 16 bits (see Figure 15) . Except for the 8 least significant bits, which are unused if the code word does not represent a direct next-hop, the low 15 bits are used in the same way as at level 2 and 3.
- the performance improvements of the XTC data structure when compared to the FTC data structure come from trading memory usage for memory accesses, simplicity and speed. It is a well known fact that one can easily do an extremely simple routing lookup in one memory access if the memory usage does not matter at all. This is achieved by simply having a 2^32 elements array where the next-hop index for each possible IP address is stored in the slot with the same index as the numerical value of the IP address.
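The one-memory-access scheme just described can be sketched in a few lines, here scaled down to an 8-bit address space for illustration (a real 2^32-entry table of 16 bits next-hop indices would occupy 8 GB):

```c
#include <assert.h>
#include <stdint.h>

/* One memory access per lookup, at an enormous memory cost: the table
 * is indexed directly by the numerical value of the address. */
static uint16_t naive_lookup(const uint16_t table[256], uint8_t addr)
{
    return table[addr];
}
```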
- each code word represents more than 16 bits to reduce the number of code words. For instance, if each code word represents 256 bits instead of 16, 8 bits of the code word must be used for the shift value. In the worst case, the number of added pointers resulting from the simplified XTC encoding then increases by a factor of 256/9. The total number of code words will be 2^24, which is still quite large.
- a better way of increasing the performance without wasting so much memory is to reduce the lookup to four memory accesses. This is easily achieved by building an XTC with only two levels, where for instance level 1 represents the high 20 bits and each level 2 chunk represents the low 12 bits. Thus, each chunk at each level increases by a factor of 16, which also gives a rough estimate of the total growth of the data structure. Possibly, this could be combined with the idea of using "larger" code words at level 1 as described above.
- the most straightforward way of increasing the speed somewhat is to only use XTC techniques at levels 2 and 3 while using direct addressing at level 1. This immediately removes one memory access.
Abstract
A system for IP routing lookup in a routing table, comprising entries of arbitrary length prefixes with associated next-hop information in a next-hop table, for determining where IP datagrams are to be forwarded, and data structure means for representing the set of dominating points in the IP address space for each prefix in the routing table, said means comprising three levels of compressed pointer arrays (CPA), each including a chunk and an array of associated pointers, wherein each pointer comprises a next-hop index or a pointer to the next level of chunk.
Description
APPLICANT: EFFNET GROUP AB
TITLE: METHOD AND SYSTEM FOR EFFICIENT ROUTING TABLE COMPRESSION AND FAST ROUTING LOOKUPS
Field of the Invention
The present invention relates generally to a method and system of IP routing lookups in a routing table, comprising entries of arbitrary length prefixes with associated next-hop information in a next-hop table, to determine where IP datagrams are to be forwarded.
Description of the Prior Art
The Internet is an interconnected set of networks, wherein each of the constituent networks retains its identity, and special mechanisms are needed for communication across multiple networks. The constituent networks are referred to as subnetworks.
Each subnetwork in the Internet supports communication among the devices connected to that subnetwork.
Further, subnetworks are connected by devices referred to as interworking units (IWUs).
A particular IWU is a router, which is used to connect two networks that may or may not be similar. The router employs an internet protocol (IP) present in each router and each host of the network.
IP provides a connectionless or datagram service between stations.
Routing is generally accomplished by maintaining a routing table in each station and router that indicates, for each possible destination network, the next router to which the IP datagram should be sent.
IP routers do a routing lookup in a routing table to determine where IP datagrams are to be forwarded. The result of the operation is the next-hop on the path towards the destination. An entry in a routing table is conceptually an arbitrary length prefix with associated next-hop information. Routing lookups must find the routing entry with the longest matching prefix.
Since IP routing lookups have been inherently slow and complex, operations with prior art solutions have led to a proliferation of techniques to avoid using them.
Most current IP router designs use caching techniques wherein the routing entries of the most recently used destination addresses are kept in a cache. The technique relies on there being enough locality in the traffic so that the cache hit rate is sufficiently high and the cost of a routing lookup is amortized over several packets. These caching methods have worked well in the past. However, as the current rapid growth of the Internet increases the required size of address caches, hardware caches may become uneconomical .
IP router designs can use special-purpose hardware to do IP processing. This can be an inflexible solution. Any changes in the IP format or protocol could invalidate such designs. The flexibility of software and the rapid performance increase of general purpose processors makes such solutions preferable.
Summary of the Invention
It is therefore an objective of the present invention to provide an improved IP routing lookup method and system for performing full routing lookups for each IP packet, which method and system overcome the above mentioned disadvantages.
A further objective is to increase routing lookup speeds with a conventional microprocessor.
Another objective is to minimize lookup time in the forwarding table.
Another further objective of the invention is to provide a data structure which can fit entirely in the cache of a conventional microprocessor.
Consequently, memory accesses will be orders of magnitude faster than if the data structure needs to reside in memory consisting of, for example, relatively slow DRAM. These objectives are obtained with the IP routing lookup method and system according to the invention, which system is a data structure that can represent large routing tables in a very compact form and which can be searched quickly using few memory references.
An advantage of the invention is an efficient lookup implementation resulting in better instruction cache behavior and more stable lookup costs.
Brief Description of the Drawings
In order to explain the invention in more detail and the advantages and features of the invention preferred embodiments will be described in detail below, reference be- ing made to the accompanying drawings, in which
FIG 1 is a schematic view of a router design;
FIG 2 is an illustrative view of a direct next-hop encoded in the code word at level 1 in the data structure according to the invention;
FIG 3 is an illustrative view of a direct next-hop encoded in a pointer at level 1 in the data structure according to the invention;
FIG 4 is an illustrative view of a pointer to the level 2 CPA encoded in the code word and pointer at level 1 in the data structure according to the invention;
FIG 5 is an illustrative view of a direct next-hop encoded in the code word at level 2 in the data structure according to the invention;
FIG 6 is an illustrative view of a direct next-hop encoded in the pointer at level 2 in the data structure according to the invention;
FIG 7 is an illustrative view of a pointer to the level 3 CPA encoded in the code word and pointer at level 2 in the data structure according to the invention;
FIG 8 is an illustrative view of a direct next-hop encoded in the code word at level 3 in the data structure according to the invention;
FIG 9 is an illustrative view of a next-hop index encoded in a pointer at level 3 in the data structure according to the invention;
FIG 10 is a high level view of the method according to the invention for creating data structure means for representing the set of dominating points in the IP address space for each prefix in a routing table;
FIG 11 is a block diagram of a first embodiment of a hardware implementation layout of the system for IP routing lookup in a routing table according to the invention;
FIG 12 is a block diagram of a second embodiment of a hardware implementation layout of the system for IP routing lookup in a routing table according to the invention;
FIG 13 is an illustrative view of code words used in a third embodiment of a hardware implementation of the system for IP routing lookup in a routing table according to the invention;
FIG 14 is a block diagram of a third embodiment of a hardware implementation layout of the system for IP routing lookup in a routing table according to the invention;
FIG 15 is an illustrative view of a 32 bits level 1 code word used in a software implementation of the invention.
Detailed Description of the Invention
With reference to FIG 1, a router design comprises a number of network inbound interfaces 1, network outbound interfaces 2, forwarding engines 3, and a network processor 4, all of which are interconnected by a connection fabric 5. Inbound interfaces 1 send packet headers to the forwarding engines 3 through the connection fabric 5. The forwarding engines 3 in turn determine which outgoing interface 2 the packet should be sent to. This information is sent back to the inbound interface 1, which forwards the packet to the outbound interface 2. The only task of a forwarding engine 3 is to process packet headers. All other tasks, such as participating in routing protocols, resource reservation, handling packets that need extra attention, and other administrative duties, are handled by the network processor 4.
Each forwarding engine 3 uses a local version of the routing table, a forwarding table, downloaded from the network processor 4 and stored in storage means in the forwarding engine 3, to make its routing decisions. It is not necessary to download a new forwarding table for each routing update. Routing updates can be frequent but since routing protocols need some time to converge, forwarding tables can grow a little stale and need not change more than at most once per second. The network processor 4 needs a dynamic routing table designed for fast updates and fast generation of forwarding tables. The forwarding tables, on the other hand, can be optimized for lookup speed and need not be dynamic.
To minimize lookup time, two parameters, the number of memory accesses required during lookup, and the size of the data structure, have to be simultaneously minimized in the data structure in the forwarding table.
Reducing the number of memory accesses required during a lookup is important because memory accesses are relatively slow and usually the bottleneck of lookup procedures. If the data structure can be made small enough, it can fit entirely in the cache of a conventional microprocessor. This means that memory accesses will be orders of magnitude faster than if the data structure needs to reside in a memory consisting of a relatively slow DRAM, as is the case for Patricia trees. If the forwarding table does not fit entirely in the cache, it is still beneficial if a large fraction of the table can reside in the cache. Locality in traffic patterns will keep the most frequently used pieces of the data structure in the cache, so that most lookups will be fast. Moreover, it becomes feasible to use fast SRAM for the small amount of needed external memory. SRAM is expensive, and more expensive the faster it is. For a given cost, the SRAM can be faster if less is needed.
As secondary design goals, the data structure should need few instructions during lookup and keep the entities naturally aligned as much as possible to avoid expensive instructions and cumbersome bit-extraction operations.
The Routing Table Lookup Problem
A routing table is a data base of prefixes where each prefix is associated with next-hop information in the form of an index into a next-hop table. A prefix is essentially a bit string consisting of 0 to 32 bits - the length of the prefix. The first bit is numbered 31, the second is numbered 30, and so on until the bit string ends. In a prefix of length 5, for example, bits 31, 30, 29, 28, and 27 are specified. This means that such a prefix matches any IP address where bits 31, 30, 29, 28, and 27 of the IP address equal the corresponding bits of the prefix. Since only the five first (most significant) bits of the prefix are specified, it does not matter what bits 0 to 26 of the IP address are. A prefix of length zero matches any IP address.
The result of a routing lookup is always the in- dex associated with the longest matching prefix in the routing table.
- For example, an IP address starting with 101..., where bit 31 is the leftmost bit, matches both the prefix 10* and the prefix 101*, where the * denotes the end of the prefix. However, the prefix 101* is longer than 10*. Therefore, the preferred result of a routing lookup among these two prefixes is the index associated with the prefix 101*.
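The matching rule in the example above can be sketched as follows. This is an illustration only: a prefix is given as its significant bits and its length, and it matches an address whose top bits equal those significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Does the prefix (given as its len significant bits) match addr?
 * A prefix of length zero matches any address. */
static int prefix_matches(uint32_t addr, uint32_t bits, unsigned len)
{
    return len == 0 || (addr >> (32 - len)) == bits;
}
```

An address starting 101... matches both 10* (bits 0b10, length 2) and 101* (bits 0b101, length 3), and the longest-match rule picks the latter.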
- A prefix can also be represented by a range of size 2^i of the IP address space 0..4294967295, for some non-negative integer i. The starting point of such a range must be a multiple of 2^i. Otherwise, it is not a prefix. This is called the power of 2 property. For example, the prefix 10* starts at 2147483648 and ends at 3221225471. The size of the range is 1073741824, which equals 2^30. The starting point equals 2 * 2^30.
- Every prefix is a range possessing the power of 2 property and vice versa. Therefore, we do not need to distinguish between ranges possessing the power of 2 property and prefixes. The length of a prefix is defined as the number of bits specified and the size of a prefix is defined as the size of the range of the IP address space covered by the prefix. Computing the binary representation of the starting point of a prefix is achieved by padding the bit string representation of the prefix with zeroes to the right until the bit string becomes 32 bits long. The size of a prefix of length i is given by 2^(32-i).
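The prefix-to-range arithmetic above can be written out directly: padding the prefix bits with zeroes to the right gives the starting point, and a prefix of length len spans 2^(32-len) addresses.

```c
#include <assert.h>
#include <stdint.h>

/* starting point: the prefix bits padded with zeroes to 32 bits */
static uint32_t prefix_start(uint32_t bits, unsigned len)
{
    return len == 0 ? 0 : bits << (32 - len);
}

/* size of the covered range: 2^(32-len) */
static uint64_t prefix_size(unsigned len)
{
    return 1ull << (32 - len);
}
```

For the prefix 10* from the example (bits 0b10, length 2), this gives start 2 * 2^30 = 2147483648 and size 2^30 = 1073741824, hence end point 3221225471.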
Representing the Routing Table as a Partition of Prefixes
- In a routing table, there will typically be a few large prefixes that completely cover smaller prefixes. To reduce the amount of memory required for representing the routing table, redundant information is removed by creating an auxiliary routing table containing exactly the same routing information but with no covering prefixes - a partition of prefixes. Covering prefixes are recursively split in half until the sizes of the remaining prefixes from the larger prefixes are the same as the smaller prefixes covered, which means that they can be removed. Then, the remaining prefixes from the covering prefix are removed. This method is used repeatedly, starting with the smallest prefixes and continuing to larger and larger prefixes as long as there are larger prefixes left that cover smaller prefixes. The result is a set of prefixes that forms a partition of the IP address space.
- For example, consider the following routing table.
- The prefix 001* is covered by the prefix 0*.
Therefore, the larger prefix is split, which results in
resulting in the routing table
The prefix * still covers all prefixes and is therefore split
By repeatedly splitting 1* we get
- Finally, we have the following routing table where the prefixes properly partition the IP address space.
- A routing table can also be represented as a binary tree, where each node has zero, one or two children. A node with zero children is called an external node or leaf and a node with one or two children is called an internal node. In such a tree, the next-hop index for the prefix 01* is stored in the right (1st) child of the left (0th) child of the root. The next-hop index for * is stored in the root. The tree representation of the final prefix partition, resulting from applying the method described above, possesses the following important properties:
• All internal nodes have exactly two children
- Only external nodes contain next-hop information
- The method is sometimes referred to as tree completion - the tree representation of the original routing table is completed into a full binary tree.
Representing the Partition of Prefixes as the Set of Starting Points of the Prefixes
- Since the prefix partitioning yields a set of prefixes, where each prefix is associated with next-hop information, that properly partitions the IP address space, it suffices to represent the partition as the set of the starting points (first IP addresses) of the prefixes.
Hence, the information required for representing the table from the example above is the following.
Finding the Closest Dominating Points in the Set of Prefixes
- Given a set of points in the set of real numbers, we define the distance between two points p1 and p2 in the norm L∞ (L-infinity) as lim_(k→∞) (|p1 - p2|^k)^(1/k) = |p1 - p2| (in 1 dimension, the L∞ distance equals the Euclidean distance). Moreover, we say that the point p2 dominates the point p1 (p1 is dominated by p2) if and only if p1 < p2. When a routing table is represented as the set P of starting points of the prefixes, as described in the previous section, the routing lookup problem is reduced to the problem of finding the closest dominating point in P. For example, to perform a routing lookup of the IP address 112.16.0.20 in the routing table above we simply need to find the point, in the set of dominating points 0.0.0.0, 32.0.0.0, 64.0.0.0, 128.0.0.0, located at the shortest L∞ distance from 112.16.0.20. Obviously, that point is 128.0.0.0 and the result of the routing lookup is next-hop index 4.
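The closest-dominating-point search can be sketched over a sorted array of starting points: since p2 dominates p1 if and only if p1 < p2, the answer is simply the first point strictly greater than the address. The linear scan here is for clarity only; the XTC data structure described next is precisely a fast way of performing this search.

```c
#include <assert.h>
#include <stdint.h>

/* Return the index of the closest dominating point of addr in the
 * sorted array points, or -1 if no point dominates addr. */
static int closest_dominating(const uint32_t *points, int n, uint32_t addr)
{
    for (int i = 0; i < n; i++)
        if (addr < points[i])
            return i;
    return -1;
}
```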
The XTC Data Structure
- The XTC is a data structure for efficiently representing the set of dominating points by using three levels of compressed pointer arrays. Each compressed pointer array (CPA) consists of a chunk and an array of associated 20 bits pointers and/or 15 bits next-hop indices. The CPA constituting the first level represents an uncompressed pointer array of size 65536 pointers (bits 31..16 of the IP address) and the CPAs constituting levels 2 and 3 represent 256 pointers each.
Compressed Pointer Arrays and Sets of Dominating Points
- A compressed pointer array is represented by a chunk and a list of pointers. The chunk consists of code words, where each code word represents 16 bits of an implicit bit array representing the chunk. The size of the implicit bit array is the same as the maximum number of pointers in the CPA. In the implicit bit array, the bit with index i is set if and only if there is a prefix starting at position i. By starting point, we mean that the bits inspected (31..16 at level 1, 15..8 at level 2, and 7..0 at level 3) of the dominating point, when indexing into the CPA, equal i. The implicit bit array is, in turn, represented as 16 implicit bit masks. There are three classes of bit masks:
- Empty bit mask with no bits set
- Bit mask with exactly one bit set
- Bit mask with two or more bits set
By the power of 2 property, a bit mask with one bit set represents a prefix where the starting point within the chunk is a multiple of 16. It completely covers the whole bit mask, and all the empty bit masks immediately following.
- For example, if there is a prefix with next-hop index 15 starting at position 64 = 4 * 16 (the dominating point representing the prefix equals 64) and ending at position 127 = 7 * 16 + 15, bit mask 4 has only the first bit set while bit masks 5, 6, and 7 are empty.
- For bit masks with one bit set and the empty bit masks immediately following, the next-hop index is encoded directly in the code word. Each such code word represents 16 pointers that would be required if the pointer array was not compressed. In the example above, the code words 4..7 will be identical.
- The bit masks with two or more bits set represent prefixes which do not completely cover a bit mask. For each such prefix, the starting point and end point lie in the same bit mask. By the power of two property, the number of different kinds of bit masks of length 2^i with at least one bit set is given by the following recurrence equation:
- A(0) = 1, A(i) = 1 + A(i - 1)^2
- This means that for i = 1, 2, and 3 the bit masks shown in the table below are possible. For i = 4 there are 677 different cases, which are not listed.
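The recurrence can be evaluated directly to check the counts just stated:

```c
#include <assert.h>

/* Number of bit masks of length 2^i with at least one bit set, under
 * the power of 2 property: A(0) = 1, A(i) = 1 + A(i-1)^2. */
static unsigned num_bitmasks(unsigned i)
{
    if (i == 0)
        return 1;
    unsigned prev = num_bitmasks(i - 1);
    return 1 + prev * prev;
}
```

This gives 2, 5, and 26 for i = 1, 2, and 3, and 677 for i = 4, matching the cases mentioned above.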
- Hence, by the power of 2 property, there are 676 different kinds of 16 bits bit masks with two or more bits set. Such bit masks and the corresponding pointers are represented by a combination of a code word and a pointer list associated with the code word. Each pointer either constitutes a 15 bits next-hop index or a 20 bits pointer to the next level CPA.
Initially, there will be one pointer per bit set in the bit mask. However, in the XTC algorithm, the number of pointers (and bits) will be slightly increased during the encoding in order to reduce the complexity compared to the FTC algorithm.
In the FTC, 10 bits are used to encode the kind of the bit mask as an index into a map table. The map table requires memory accesses, hence slowing down the lookup. To improve the speed, a simpler encoding using only 2 bits instead of 10 is used in the XTC. Moreover, the map table is not used at all, thus reducing the number of memory accesses drastically.
- By inspecting the set of prefixes whose starting points lie in the same bit mask, we can identify the prefix with the smallest size (for a prefix with starting point and end point within the same bit, the size is 1). Each prefix in the set is either of size 1, 2, 4, or 8. If the size is 16 it covers the bit mask completely and the direct encoding described above is applicable (the bit mask has exactly one bit set). All prefixes that have a larger size than the prefix with the smallest size can be repeatedly split in half until the sizes of all prefixes are equal. Each time a prefix is split, a zero bit in the bit mask is set (flipped) and a copy of the pointer associated with the closest set bit to the left of the flipped bit is added directly after the copied pointer in the pointer list. Each split increases the number of pointers by one. After this step we have one of the cases shown in the table below.
- Instead of dealing with 676 different kinds of bit masks as in the FTC algorithm, we now have only four kinds of bit masks. It is easily shown that the number of extra pointers introduced by using this method is 11 if the number of initially set bits is 5. Hence, in the worst case the number of pointers required in the XTC representation compared to the FTC representation is increased by a factor of 16/5. Experiments have shown that the typical increase is a factor of 2.
- In the lookup, a part of the IP address will be used to index into the CPA. The high 12 (on level 1) or 4 (otherwise) bits are the index of the code word. The 4 low (least significant) bits will be used to index into the bit mask of the code word (if the code word does not contain an encoded direct next-hop). By indexing into the bit mask, we will either index a bit that is set (among the set bits) to retrieve the pointer associated with that bit, or we index a bit which is not set. If the bit is not set, we will need to retrieve the pointer associated with the closest set bit to the left of the indexed bit. As mentioned above, there is a pointer list associated with each bit mask/code word. Such pointer lists will either contain 2, 4, 8, or 16 pointers. By encoding bit masks as described in the table above, the index of the closest bit set to the left, among the set bits (and also the index of the corresponding pointer in the pointer list), can be computed by a simple shift operation. That is, the low 4 bits used to index into the bit mask, in the lookup, are shifted code bits to the right. This is to be compared to the costly map table lookup used in the FTC algorithm.
In the XTC data structure, the chunk consists of an array of code words of fixed size (the size depends on the level). To reduce the number of bits required for representing the index of the first pointer associated with each code word (the beginning of the pointer list associated with the code word), the pointers in the XTC data structure are stored in the same array as the chunks. The pointer array associated with a chunk is stored immediately after the last code word of the chunk.
As mentioned above, the pointer array consists of either 20 bits pointers and/or 15 bits next-hop indices. It is represented by an array of 16 bits integers, where the most significant bit indicates whether the low 15 bits represent a next-hop index or the low 15 bits of a 20 bits pointer. To represent a 20 bits pointer to the next level, we use 5 bits of the code word for representing the high 5 bits of the 20 bits pointer. These bits are simply appended to the pointer when it is retrieved. This clearly requires that in any list of pointers to the next level associated with the same bit mask, bits 15..19 must be identical. This is achieved by moving the chunks at the next level during construction if the low 15 bits of the pointers wrap while building the chunks at the next level associated with the bit mask in construction at the current level (see Section 3 for details).
To summarize, each code word consists of:
• 1 bit that indicates whether it is a direct next-hop or not.
If it is a next-hop, there are 15 bits representing the next-hop index. Otherwise there are:
• 2 bits for the code (shift value)
• 5 bits for the high bits of the associated pointers
• 16 (level 1) or 8 (level 2 and 3) bits for representing the pointer offset, i.e. the index of the first pointer associated with the code word, in the pointer array stored directly after the chunk.
Level 1: a 65536 Elements Compressed Pointer Array The first level (level 1) of the XTC data structure is represented by a CPA with 4096 code words and 0..65536 pointers. The first step in the lookup is to use bits 31..20 of the IP address as index to the 24 bits code word, which is retrieved (in the software implementation, we store the 24 bits code word in a 32 bits integer for alignment reasons). If the most significant bit of the code word is set (represented by white text on black background in Figure 2), the following 15 bits represent a next-hop index which can be retrieved immediately and the lookup is completed. Bits 7..0 are unused in this case.
In the second step, bits 19..16 of the IP address are used to index into the code word to find the pointer in the pointer list. This is achieved by shifting the 4 bits of the IP address by the 2 bits integer retrieved from the code word (the code), and thereafter adding the 16 bits offset also retrieved from the code word (see Figure 3). The resulting number is the index of the pointer stored in the pointer list associated with the chunk. If the most significant bit of the pointer is set, the low 15 bits constitute a next-hop index and the lookup is completed (Figure 3).
If the pointer does not represent a direct next-hop, its low 15 bits are retrieved and combined with the 5 bits stored in the code word to form a 20 bits pointer to the level 2 CPA (Figure 4), where the lookup continues.
The compressed pointer arrays at level 2 are stored in one big array of 16 bits integers, where the size of each CPA depends on the number of pointers in its pointer list, which is stored immediately after the list of code words. Therefore, it is not possible to index the ith CPA at level 2. Instead, the level 2 CPAs have to be referenced by their starting points (the index of the first code word), and that is the reason for needing 20 bits pointers when representing large routing tables.
Level 2: a 256 Elements Compressed Pointer Array The second level of the XTC consists of a big array of level 2 CPAs. Each such CPA consists of 16 code words and 0..256 pointers. After the location of the level 2 CPA is computed at level 1, the lookup continues as follows. The third step in the lookup is to use bits 15..12 of the IP address as index to the 16 bits code word, which is retrieved. If the most significant bit of the code word is set (represented by white text on black background in Figure 5), the following 15 bits represent a next-hop index which can be retrieved immediately and the lookup is completed. In the fourth step, bits 11..8 of the IP address are used to index into the code word to find the pointer in the pointer list. This is achieved by shifting the 4 bits of the IP address by the 2 bits integer retrieved from the code word (the code), and thereafter adding the 8 bits offset also retrieved from the code word (see Figure 6). The resulting number is the index of the pointer stored in the pointer list associated with the chunk. If the most significant bit of the pointer is set, the low 15 bits constitute a next-hop index and the lookup is completed (Figure 6).
If the pointer does not represent a direct next-hop, its low 15 bits are retrieved and combined with the 5 bits stored in the code word to form a 20 bits pointer to the level 3 CPA (Figure 7), where the lookup continues.
In the same way as at level 2, the compressed pointer arrays at level 3 are also stored in one big array of 16 bits integers, where the size of each CPA depends on the number of pointers in its pointer list. Hence, also here a 20 bits pointer is needed for representing the location of the next level CPA.
Level 3: a 256 Elements Compressed Pointer Array The third level of the XTC consists of a big array of level 3 CPAs. Each such CPA consists of 16 code words and 0..256 pointers. After the location of the level 3 CPA is computed at level 2, the lookup continues in a similar way as at level 2. The fifth step is to use bits 7..4 of the IP address as index to the 16 bits code word, which is retrieved. If the most significant bit of the code word is set (represented by white text on black background in Figure 8), the following 15 bits represent a next-hop index which can be retrieved immediately and the lookup is completed.
In the sixth step, bits 3..0 of the IP address are used to index into the code word to find the pointer in the pointer list. This is achieved by shifting the 4 bits of the IP address by the 2 bits integer retrieved from the code word (the code), and thereafter adding the 8 bits offset also retrieved from the code word (see Figure 9). The resulting number is the index of the pointer stored in the pointer list associated with the chunk. At this stage, all bits of the IP address have been inspected. Hence, the pointer retrieved from the pointer list must represent a next-hop index. Therefore, this step completes the lookup (see Figure 9).
With these last four sections, we complete the description of the XTC data structure. We have described in great detail how chunks and code words are represented at each level and how the information is encoded in them. Finally, the lookup performed at each level, resulting in either a next-hop index or a pointer to the next level, has been described. The resulting lookup is performed in six steps. In each step, a 16 bits memory access is performed, except for the first access, where 24 (32 in the software implementation) bits are accessed. By using the XTC, the next-hop index is computed in 6 memory accesses, accessing a total of 13 (14) bytes, in the worst case.
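The six lookup steps can be collected into a single routine. The following is a hedged reconstruction from the description above; the struct layout, the array names (chunk1, ptr1, cpa2, cpa3), and the exact bit positions of the code word fields are assumptions consistent with the text, not the authors' implementation:

```c
#include <stdint.h>
#include <assert.h>

/* Assumed memory layout: level 1 chunk and pointer list separate, the
 * level 2 and level 3 CPAs each in one big array of 16-bit integers. */
typedef struct {
    const uint32_t *chunk1;  /* 4096 level 1 code words (24 bits in 32) */
    const uint16_t *ptr1;    /* level 1 pointer list */
    const uint16_t *cpa2;    /* big array holding level 2 CPAs */
    const uint16_t *cpa3;    /* big array holding level 3 CPAs */
} xtc_t;

static uint16_t xtc_lookup(const xtc_t *t, uint32_t ip)
{
    /* steps 1-2: level 1 */
    uint32_t cw = t->chunk1[ip >> 20];
    if (cw & (1u << 23))
        return (cw >> 8) & 0x7fff;              /* direct next-hop */
    uint32_t high5 = (cw >> 3) & 0xf8000;       /* bits 18..22 -> 15..19 */
    uint32_t code  = (cw >> 16) & 3;
    uint16_t p = t->ptr1[(cw & 0xffff) + (((ip >> 16) & 0xf) >> code)];
    if (p & 0x8000)
        return p & 0x7fff;                      /* next-hop index */
    uint32_t cpa = high5 | p;                   /* 20-bit level 2 pointer */

    /* steps 3-4: level 2 */
    uint16_t cw2 = t->cpa2[cpa + ((ip >> 12) & 0xf)];
    if (cw2 & 0x8000)
        return cw2 & 0x7fff;
    high5 = ((uint32_t)(cw2 >> 10) & 0x1f) << 15;
    code  = (cw2 >> 8) & 3;
    p = t->cpa2[cpa + 16 + (cw2 & 0xff) + (((ip >> 8) & 0xf) >> code)];
    if (p & 0x8000)
        return p & 0x7fff;
    cpa = high5 | p;                            /* 20-bit level 3 pointer */

    /* steps 5-6: level 3 */
    uint16_t cw3 = t->cpa3[cpa + ((ip >> 4) & 0xf)];
    if (cw3 & 0x8000)
        return cw3 & 0x7fff;
    code = (cw3 >> 8) & 3;
    p = t->cpa3[cpa + 16 + (cw3 & 0xff) + ((ip & 0xf) >> code)];
    return p & 0x7fff;                          /* must be a next-hop */
}
```

Note how each level performs one code word access and (if no direct next-hop is encoded) one pointer access, matching the six-step, 13-byte worst case stated above.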
XTC Construction
To achieve the fastest possible construction, the XTC is built by traversing the routing table entries, building the next-hop table, and constructing the compressed XTC representation in a single pass. The only requirements on the construction are that the routing entries are traversed in a certain order and that a function is called on each entry. Hence, the representation of the dynamic routing table, from which the XTC forwarding table is built, is completely independent of the XTC construction algorithm.
Routing Entry Conversion and Next-hop Table Construction
The complete builder process consists of three steps that are executed simultaneously (see Figure 10). In the first step, the dynamic routing table is traversed in sorted order with respect to the IP address (starting point) and the length of the IP netmask. Entries with smaller values of the IP address are processed before entries with larger IP addresses. For entries with equal IP addresses, the entry with the shortest IP netmask (shortest prefix) is processed first. The tree traversal is intimately related to the representation of the dynamic routing table used, which in Effnet's case is a Radix Tree, and is therefore not described.
Each routing entry is sent to the routing entry conversion and next-hop table generation process as shown in Figure 10. This process efficiently splits routing entries into prefixes and discards prefixes that cover other prefixes, as described in the introduction. It is also responsible for generating the next-hop table from the next-hop information (interface, MAC address, ATM channel, etc.) present in the dynamic routing table. Finally, each prefix resulting from splitting each routing entry (splitting a routing entry can result in several prefixes) is fed into the XTC builder process, which is described in detail in the following section. The same routing entry conversion and next-hop table generation process can be used both for the FTC builder process and the XTC builder process (and also for other representations). Therefore, its detailed description is not given here. (It will be subject to a separate patent application.)
Depth First, Left to Right XTC Construction When the routing entries are traversed and split into prefixes as described, each prefix generated immediately propagates to the XTC construction in progress. Each prefix has a starting point and an end point as described in the introduction. By the conversion described in the previous section, the prefixes are fed into the XTC construction in sorted order. Moreover, the starting point of the next prefix to be processed equals the end point of the current prefix plus one.
All three levels of the XTC are built simultaneously. Depending on the length of the prefix currently processed, construction takes place at different levels. Sometimes, construction work is required at all levels for one prefix. This requires an immense amount of bookkeeping in the construction. We keep track of the following:
chunk1 Address to the level 1 chunk currently in construction.
chunk2 Address to the level 2 chunk currently in construction.
chunk3 Address to the level 3 chunk currently in construction.
pointer1 Address to the level 1 pointer list currently in construction.
pointer2 Address to the level 2 pointer list currently in construction.
pointer3 Address to the level 3 pointer list currently in construction.
pos2 Position of the level 2 chunk in construction with respect to base2.
pos3 Position of the level 3 chunk in construction with respect to base3.
from2 Position of the first level 2 chunk associated with the current level 1 code word.
from3 Position of the first level 3 chunk associated with the current level 2 code word.
bucket1[] Buckets containing 16 pointers associated with the current level 1 code word.
bucket2[] Buckets containing 16 pointers associated with the current level 2 code word.
bucket3[] Buckets containing 16 pointers associated with the current level 3 code word.
offset1 Pointer offset associated with the current level 1 code word.
offset2 Pointer offset associated with the current level 2 code word.
offset3 Pointer offset associated with the current level 3 code word.
mask1 Bit mask associated with the current level 1 code word.
mask2 Bit mask associated with the current level 2 code word.
mask3 Bit mask associated with the current level 3 code word.
For each prefix processed, the expect variable is checked against the starting point to ensure that the stream of prefixes is valid. The processing of each prefix consists of an initialization phase, a construction phase, and a completion phase. In the initialization phase of the first prefix (starting point zero), the whole construction is initialized by setting base1, base2, and base3 to the memory addresses of the respective levels, and by zeroing pos2, pos3, from2, and from3. Level 1 construction is also initialized by setting chunk1 to base1, and pointer1 to the memory address where the level 1 pointer list is to be stored, i.e. to base1 + 8192 if the pointer list is stored immediately after the chunk. Moreover, offset1 is set to zero.
Then, for all prefixes (including the first), the initialization performs the following. First, the depth of construction (12, 16, 20, 24, 28, or 32) is computed by analyzing the net mask stored in the prefix. To compute the depth from the net mask m, the following formula is used:
32 - ((!(m & 0xf) + !(m & 0xf0) + !(m & 0xf00) + !(m & 0xf000) + !(m & 0xf0000)) << 2),
where & denotes the bitwise and operation, ! denotes the logical not operation, meaning !x = 1 if x = 0, and 0 otherwise, and << denotes the left bit shift operation. Note also that we usually represent bit masks as integers in hexadecimal representation, denoted by 0x in the beginning.
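As a concrete check, the formula can be wrapped in a function (the name is ours) and evaluated for a few net masks:

```c
#include <stdint.h>
#include <assert.h>

/* Depth of construction from the net mask m, per the formula above:
 * each all-zero low nibble of the mask reduces the depth by 4. */
static int depth_from_mask(uint32_t m)
{
    return 32 - ((!(m & 0xf) + !(m & 0xf0) + !(m & 0xf00) +
                  !(m & 0xf000) + !(m & 0xf0000)) << 2);
}
```

For example, a /24 net mask 0xffffff00 has its two lowest nibbles zero, giving a depth of 32 - 8 = 24.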
The initialization continues by processing the particular initialization for the depth computed. However, for one prefix, construction at several depths may need to be initialized, starting at the greatest depth.
Depth 32 initialization: Determine if the construction of a new level 3 code word begins by using the following formula: !(expect & ((1 << 4) - 1))
If the result is false (zero), no level 3 code word initialization is required and the processing continues with the construction phase. Otherwise, level 3 code word construction is initialized by setting mask3 to 0, bucket3[0], ..., bucket3[15] to EMPTY, and then continuing with the depth 28 initialization.
Depth 28 initialization:
Determine if the construction of a new level 3 chunk begins by using the following formula: !(expect & ((1 << 8) - 1)) If the result is false (zero), no level 3 chunk initialization is required and the processing continues with the construction phase. Otherwise, bits 15..19 of from3 and pos3 are compared to ensure that they are equal. Recall that from3 and pos3 are the positions of the first level 3 chunk and the current level 3 chunk, respectively, associated with the current level 2 code word in construction. Also, recall that the 20 bits pointers to these chunks will be stored at level 2 with the 5 high bits in the code word and the low 15 bits in the pointers (represented by from3 and pos3) associated with the code word. Therefore, it is required that bits 15..19 of from3 and pos3 are equal. If the bits are equal, the depth 28 initialization is completed by setting chunk3 to base3 + pos3, pointer3 to chunk3 + 16, which is the size of the chunk, and offset3 to zero. If the bits are not equal, it is necessary to adjust the compressed pointer arrays built at level 3, starting from from3, so that the high 5 bits become equal and the 20 bits pointer encoding is possible. This is achieved by computing pos3 & 0xf8000 and assigning this value to a wrap variable. Then the distance delta of the adjustment (how many positions to move the CPAs) is computed as the difference between wrap and from3. Moving the CPAs is then achieved by repeatedly assigning base3[index] to base3[index + delta], where index loops from pos3 - 1 down to from3. It remains to update the pointers stored in the buckets associated with the level 2 code word in construction, since the locations of the level 3 CPAs associated with these have changed. This is achieved by increasing each pointer stored in the bucket2 array (EMPTY buckets are skipped) by delta and then keeping only the low 15 bits by a simple bit masking operation. Finally, from3 is set to wrap and pos3 is increased by delta, before completing the depth 28 initialization by setting chunk3 to base3 + pos3, pointer3 to chunk3 + 16, and offset3 to zero, and continuing with the depth 24 initialization.
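The adjustment just described can be sketched as follows (a reconstruction under the bookkeeping names above; EMPTY, the exact types, and the in-out parameter style are our assumptions):

```c
#include <stdint.h>
#include <assert.h>

#define EMPTY 0xffffffffu

/* Hedged sketch of the depth 28 wrap adjustment: move the level 3 CPAs
 * built since from3 so that bits 15..19 of from3 and pos3 agree, and
 * re-point the pending level 2 bucket pointers accordingly. */
static void adjust_level3(uint16_t *base3, uint32_t *from3, uint32_t *pos3,
                          uint32_t bucket2[16])
{
    uint32_t wrap  = *pos3 & 0xf8000;       /* target high 5 bits */
    uint32_t delta = wrap - *from3;         /* how far to move the CPAs */

    for (int32_t i = (int32_t)*pos3 - 1; i >= (int32_t)*from3; i--)
        base3[i + delta] = base3[i];        /* shift CPAs upwards */

    for (int b = 0; b < 16; b++)            /* patch pending buckets */
        if (bucket2[b] != EMPTY)
            bucket2[b] = (bucket2[b] + delta) & 0x7fff;

    *from3 = wrap;
    *pos3 += delta;
}
```

After the call, all level 3 CPAs referenced by the current level 2 code word share the same bits 15..19, so the 5-bit/15-bit split of the 20 bits pointer is again encodable.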
Depth 24 initialization:
Determine if the construction of a new level 2 code word begins by using the following formula: !(expect & ((1 << 12) - 1))
If the result is false (zero), no level 2 code word initialization is required and the processing continues with the construction phase. Otherwise, level 2 code word construction is initialized by setting mask2 to 0, bucket2[0], ..., bucket2[15] to EMPTY, and then continuing with the depth 20 initialization.
Depth 20 initialization:
Determine if the construction of a new level 2 chunk begins by using the following formula: !(expect & ((1 << 16) - 1)) If the result is false (zero), no level 2 chunk initialization is required and the processing continues with the construction phase. Otherwise, bits 15..19 of from2 and pos2 are compared to ensure that they are equal. Recall that from2 and pos2 are the positions of the first level 2 chunk and the current level 2 chunk, respectively, associated with the current level 1 code word in construction. Also, recall that the 20 bits pointers to these chunks will be stored at level 1 with the 5 high bits in the code word and the low 15 bits in the pointers (represented by from2 and pos2) associated with the code word. Therefore, it is required that bits 15..19 of from2 and pos2 are equal. If the bits are equal, the depth 20 initialization is completed by setting chunk2 to base2 + pos2, pointer2 to chunk2 + 16, which is the size of the chunk, and offset2 to zero. If the bits are not equal, it is necessary to adjust the compressed pointer arrays built at level 2, starting from from2, so that the high 5 bits become equal and the 20 bits pointer encoding is possible. This is achieved by computing pos2 & 0xf8000 and assigning this value to a wrap variable. Then the distance delta of the adjustment (how many positions to move the CPAs) is computed as the difference between wrap and from2. Moving the CPAs is then achieved by repeatedly assigning base2[index] to base2[index + delta], where index loops from pos2 - 1 down to from2. It remains to update the pointers stored in the buckets associated with the level 1 code word in construction, since the locations of the level 2 CPAs associated with these have changed. This is achieved by increasing each pointer stored in the bucket1 array (EMPTY buckets are skipped) by delta and then keeping only the low 15 bits by a simple bit masking operation. Finally, from2 is set to wrap and pos2 is increased by delta, before completing the depth 20 initialization by setting chunk2 to base2 + pos2, pointer2 to chunk2 + 16, and offset2 to zero, and continuing with the depth 16 initialization.
Depth 16 initialization: Determine if the construction of a new level 1 code word begins by using the following formula: !(expect & ((1 << 20) - 1))
If the result is false (zero), no level 1 code word initialization is required and the processing continues with the construction phase. Otherwise, level 1 code word construction is initialized by setting mask1 to 0, bucket1[0], ..., bucket1[15] to EMPTY, and then continuing with the depth 12 initialization.
Depth 12 initialization:
No initialization is required here; it is taken care of when the whole construction is initialized.
After the initialization phase, the expect variable is updated by setting it to the end point of the prefix being processed plus one. Then the processing continues with the construction phase, where construction takes place at the depth computed at the beginning of the processing.
Depth 32 construction:
A direct next-hop stored in a level 3 pointer is built as follows. Compute a bit index bix by extracting bits 0..3 from the starting point of the prefix. Then, bit bix of mask3 is set and the next-hop index of the prefix is stored (without encoding) in bucket3[bix]. In the completion phase, the contents of the buckets will be moved to the pointer array associated with the level 3 code word in construction.
Depth 28 construction:
Direct next-hops directly encoded in level 3 code words are built as follows. The indices, start and stop, of the first and last code word to be built are computed by extracting bits 4..7 of the starting point and end point of the prefix, respectively. Then the next-hop index of the prefix is encoded by setting bit 15 and stored in the chunk by assigning the encoded direct next-hop to chunk3[start], ..., chunk3[stop].
Depth 24 construction:
A direct next-hop stored in a level 2 pointer is built as follows. Compute a bit index bix by extracting bits 8..11 from the starting point of the prefix. Then, bit bix of mask2 is set and the next-hop index of the prefix is encoded by setting bit 15 and is thereafter stored in bucket2[bix]. In the completion phase, the contents of the buckets will be moved to the pointer array associated with the level 2 code word in construction.
Depth 20 construction:
Direct next-hops directly encoded in level 2 code words are built as follows. The indices, start and stop, of the first and last code word to be built are computed by extracting bits 12..15 of the starting point and end point of the prefix, respectively. Then the next-hop index of the prefix is encoded by setting bit 15 and stored in the chunk by assigning the encoded direct next-hop to chunk2[start], ..., chunk2[stop].
Depth 16 construction: A direct next-hop stored in a level 1 pointer is built as follows. Compute a bit index bix by extracting bits 16..19 from the starting point of the prefix. Then, bit bix of mask1 is set and the next-hop index of the prefix is encoded by setting bit 15 and is thereafter stored in bucket1[bix]. In the completion phase, the contents of the buckets will be moved to the pointer array associated with the level 1 code word in construction.
Depth 12 construction:
Direct next-hops directly encoded in level 1 code words are built as follows. The indices, start and stop, of the first and last code word to be built are computed by extracting bits 20..31 of the starting point and end point of the prefix, respectively. Then the next-hop index of the prefix is encoded by shifting 8 bits to the left, setting bit 23, and stored in the chunk by assigning the encoded direct next-hop to chunk1[start], ..., chunk1[stop]. Recall that the level 1 chunk contains 24 bits integers and, in this case, the lowest 8 bits are unused. In the software implementation, these 24 bits are stored in a 32 bits integer for alignment reasons, and this requires a slightly changed encoding.
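The depth 12 encoding can be sketched as follows (the function name is ours):

```c
#include <stdint.h>
#include <assert.h>

/* Hedged sketch of the depth 12 construction: encode a next-hop index
 * directly into level 1 code words (24 bits, held in 32 bits integers
 * in the software implementation), covering code words start..stop. */
static void build_depth12(uint32_t *chunk1, unsigned start, unsigned stop,
                          uint16_t nhop)
{
    /* shift the next-hop 8 bits left, set bit 23 to mark a direct hop */
    uint32_t enc = ((uint32_t)nhop << 8) | (1u << 23);
    for (unsigned i = start; i <= stop; i++)
        chunk1[i] = enc;
}
```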
The processing of the prefix ends with the completion phase, which starts at the same depth as where the construction took place. For one prefix, completion at several depths may be required, starting at the greatest depth.
Depth 32 completion:
Determine if the construction of a level 3 code word ends by using the following formula (recall that the expect variable has been updated after the initialization phase):
!(expect & ((1 << 4) - 1))
If the result is false (zero), one or more prefixes are needed to complete the level 3 code word and the processing of the current prefix is completed. Otherwise, the index tix of the code word to be completed is computed by extracting bits 4..7 from the starting point of the prefix. Then mask3 is analyzed to compute the 2 bits shift value shift. This is achieved by the following simple procedure:
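The listing of this procedure is not reproduced in the text; the following is a plausible reconstruction (the function name is ours), relying on the fact that after prefix splitting the set bits of a completed mask are evenly spaced, so testing a few low bits suffices:

```c
#include <stdint.h>
#include <assert.h>

/* Hedged reconstruction: derive the 2 bits shift value from a completed
 * bit mask whose set bits have uniform spacing (bit 0 always set). */
static unsigned shift_from_mask(uint16_t mask)
{
    if (mask & 0x0002) return 0;   /* every bit set: 16 pointers */
    if (mask & 0x0004) return 1;   /* every 2nd bit: 8 pointers  */
    if (mask & 0x0010) return 2;   /* every 4th bit: 4 pointers  */
    return 3;                      /* bits 0 and 8:  2 pointers  */
}
```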
Thereafter, the encoding (shift << 8) + offset3 of the code word is computed and stored in chunk3[tix]. To complete the code word, the pointers stored in the buckets must be moved to the pointer list associated with the code word. Depending on the shift value, pointers are retrieved from either 2, 4, 8, or 16 buckets.
Shift value  Buckets
0            0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
1            0, 2, 4, 6, 8, 10, 12, 14
2            0, 4, 8, 12
3            0, 8
The pointers are retrieved from the buckets and stored in the pointer array associated with the code word by looping through the buckets specified in the table above. For each non-EMPTY bucket, the content of the bucket is assigned to pointer3[offset3], and for each EMPTY bucket, pointer3[offset3 - 1] is assigned to pointer3[offset3]. Thereafter, offset3 is increased by one before proceeding with the next bucket. After processing each bucket listed in the table above, the code word is completed and the processing continues with the depth 28 completion.
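A sketch of this bucket flush (the names and the EMPTY sentinel are our assumptions):

```c
#include <stdint.h>
#include <assert.h>

#define EMPTY 0xffffffffu

/* Hedged sketch of moving bucket contents to the pointer list at the
 * depth 32 completion: buckets are visited with stride (1 << shift),
 * and an EMPTY bucket repeats the previously emitted pointer. */
static unsigned flush_buckets(const uint32_t bucket3[16], unsigned shift,
                              uint16_t *pointer3, unsigned offset3)
{
    for (unsigned b = 0; b < 16; b += 1u << shift) {
        pointer3[offset3] = (bucket3[b] != EMPTY)
                          ? (uint16_t)bucket3[b]
                          : pointer3[offset3 - 1];
        offset3++;
    }
    return offset3;               /* updated pointer offset */
}
```

The stride reproduces the bucket selection of the table above: shift 0 visits all 16 buckets, shift 3 visits only buckets 0 and 8.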
Depth 28 completion:
Determine if the construction of a level 3 chunk ends by using the following formula (recall that the expect variable has been updated after the initialization phase): !(expect & ((1 << 8) - 1))
If the result is false (zero), one or more prefixes are needed to complete the level 3 chunk and the processing of the current prefix is completed. Otherwise, the level 2 bit index bix is computed by extracting bits 8..11 from the starting point of the prefix, followed by setting bit bix of mask2. Finally, the low 15 bits of pos3, which is the position of the level 3 chunk being completed, are stored in bucket2[bix], followed by adding 16 + offset3 to pos3 to prepare for the construction of the next level 3 chunk, before proceeding to the depth 24 completion.
Depth 24 completion:
Determine if the construction of a level 2 code word ends by using the following formula (recall that the expect variable has been updated after the initialization phase): !(expect & ((1 << 12) - 1))
If the result is false (zero), one or more prefixes are needed to complete the level 2 code word and the processing of the current prefix is completed. Otherwise, the index tix of the code word to be completed is computed by extracting bits 12..15 from the starting point of the prefix. Then mask2 is analyzed to compute the 2 bits shift value shift. This is achieved by the same simple procedure as used in the depth 32 completion. Thereafter, the encoding
((from3 & 0xf8000) >> 5) + (shift << 8) + offset2 of the code word is computed and stored in chunk2[tix]. In addition to the encoding made in the depth 32 completion, bits 15..19 of the pointers to the level 3 CPAs associated with the code word are extracted from from3 and encoded as bits 10..14 of the code word, followed by setting from3 to pos3 to prepare for the next level 2 code word construction. To complete the code word, the pointers stored in the buckets must be moved to the pointer list associated with the code word. Depending on the shift value, pointers are retrieved from either 2, 4, 8, or 16 buckets in the same way as in the depth 32 completion. The pointers are retrieved from the buckets and stored in the pointer array associated with the code word by looping through the buckets specified in the table above. For each non-EMPTY bucket, the content of the bucket is assigned to pointer2[offset2], and for each EMPTY bucket, pointer2[offset2 - 1] is assigned to pointer2[offset2]. Thereafter, offset2 is increased by one before proceeding with the next bucket. After processing each bucket listed in the table above, the code word is completed and the processing continues with the depth 20 completion.
Depth 20 completion:
Determine if the construction of a level 2 chunk ends by using the following formula (recall that the expect variable has been updated after the initialization phase):
!(expect & ((1 << 16) - 1))
If the result is false (zero), one or more prefixes are needed to complete the level 2 chunk and the processing of the current prefix is completed. Otherwise, the level 1 bit index bix is computed by extracting bits 16..19 from the starting point of the prefix, followed by setting bit bix of mask1. Finally, the low 15 bits of pos2, which is the position of the level 2 chunk being completed, are stored in bucket1[bix], followed by adding 16 + offset2 to pos2 to prepare for the construction of the next level 2 chunk, before proceeding to the depth 16 completion.
Depth 16 completion: Determine if the construction of a level 1 code word ends by using the following formula (recall that the expect variable has been updated after the initialization phase):
!(expect & ((1 << 20) - 1))
If the result is false (zero), one or more prefixes are needed to complete the level 1 code word and the processing of the current prefix is completed. Otherwise, the index tix of the code word to be completed is computed by extracting bits 20..31 from the starting point of the prefix. Then mask1 is analyzed to compute the 2 bits shift value shift. This is achieved by the same simple procedure as used in the depth 32 completion. Thereafter, the encoding
((from2 & 0xf8000) << 3) + (shift << 16) + offset1 of the code word is computed and stored in chunk1[tix]. In addition to the encoding made in the depth 32 completion, bits 15..19 of the pointers to the level 2 CPAs associated with the code word are extracted from from2 and encoded as bits 18..22 of the code word, followed by setting from2 to pos2 to prepare for the next level 1 code word construction. To complete the code word, the pointers stored in the buckets must be moved to the pointer list associated with the code word. Depending on the shift value, pointers are retrieved from either 2, 4, 8, or 16 buckets in the same way as in the depth 32 completion. The pointers are retrieved from the buckets and stored in the pointer array associated with the code word by looping through the buckets specified in the table above. For each non-EMPTY bucket, the content of the bucket is assigned to pointer1[offset1], and for each EMPTY bucket, pointer1[offset1 - 1] is assigned to pointer1[offset1]. Thereafter, offset1 is increased by one before proceeding with the next bucket. After processing each bucket listed in the table above, the code word is completed and the processing continues with the depth 12 completion.
Depth 12 completion:
The XTC construction is completed. In the construction software, some statistics regarding the construction are updated, but no work is really necessary to perform the completion at depth 12.
With this section, the complete method of processing a list of routing entries repeatedly fed into the construction, compressing and converting them into the XTC representation, has been described.
XTC Lookup Implementations In this section we describe a number of possible implementations of the XTC lookup, both in hardware and in software. Some of these are more or less straightforward, but some deviate slightly from the description made above and require minor changes in the representation of the data structure and the builder process. Straight Forward Hardware Implementation
The XTC lookup is extremely suitable for hardware implementations. In Figure 11, a rough layout of a direct hardware implementation of the algorithm described in the previous sections is shown. The IP address is fed into the lookup from the left and progresses through the lookup in six steps until the next-hop index is available to the right. It is assumed that the data structure is stored with each level in a separate memory. That is, one for the one level 1 chunk, one for the array of associated level 1 pointers, one for level 2 chunks and pointers, and one for level 3 chunks and pointers (these can also be mapped into one memory). If the lookup is completed in less than six steps, the result is locked by feeding bit 15 of the result into the chain of multiplexers.
A variation of the design stores the pointers in a separate memory instead of storing the pointer associated with a chunk directly after the chunk.
This gives a slightly simpler design compared to the original design, as shown in Figure 14. The reason is that each chunk occupies exactly 16 24-bit words in the chunk memory of the level it belongs to, which implies that the starting point of a chunk is always a multiple of 16. Hence, 4 bits are saved in the chunk reference, so that the 15 bits of the pointer suffice to refer to the next level chunk. Therefore, it is no longer necessary to encode the high bits of the pointers in the code word, nor to extract these bits to get a "big enough" pointer. The lookup can be pipelined in six steps, which means that each time a memory access is completed, a routing lookup is also completed. With this rather simple lookup design, a performance of over 100 million packets per second can be achieved.
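The address arithmetic behind the saved 4 bits can be sketched as follows; the function name and the word-addressed memory model are assumptions for illustration.

```c
#include <stdint.h>

/* Each chunk is padded to exactly 16 24-bit words, so a chunk always
 * starts at a word address that is a multiple of 16.  A 15-bit pointer
 * therefore suffices to address a chunk: its word address is the
 * pointer value shifted left by 4 (multiplied by 16), with no high
 * bits needed from the code word. */
static uint32_t chunk_word_address(uint32_t pointer15)
{
    return (pointer15 & 0x7FFFu) << 4;  /* 15-bit pointer -> word address */
}
```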
Straight Forward Software Implementation
In the software implementation, the code words at level 1 cannot easily be represented by 24 bits without adding an extra memory access. Therefore, these are represented by 32-bit words with the pointer offset stored in the high 16 bits (see Figure 15). Except for the 8 least significant bits, which are unused if the code word does not represent a direct next-hop, the low 15 bits are used in the same way as at levels 2 and 3.
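Accessing such a 32-bit level 1 code word can be sketched as follows; the exact bit positions (flag bit at bit 15, next-hop index in the low 15 bits) and the helper names are assumptions based on the description of Figure 15, not a definitive layout.

```c
#include <stdint.h>

/* Level 1 code word in the software variant: a 32-bit word with the
 * 16-bit pointer offset in the high half, the low half read like a
 * level 2/3 code word (assumed: flag bit at bit 15, 15-bit next-hop
 * index in bits 14..0 when the flag is set). */
static int      cw1_is_next_hop(uint32_t cw) { return (cw >> 15) & 1u; }
static uint32_t cw1_next_hop(uint32_t cw)    { return cw & 0x7FFFu;    }
static uint32_t cw1_offset(uint32_t cw)      { return cw >> 16;        }
```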
Moreover, the whole data structure is stored in a sufficiently large consecutive block of memory with the space allocated for each of the three levels specified by a constant. Except for these small deviations, the software version of the lookup is a straightforward implementation of the algorithm described.
Performance Improvements
The performance improvements of the XTC data structure when compared to the FTC data structure come from trading memory usage for memory accesses, simplicity and speed. It is a well-known fact that one can easily do an extremely simple routing lookup in one memory access if memory usage does not matter at all. This is achieved by simply having a 2^32-element array where the next-hop index for each possible IP address is stored in the slot with the same index as the numerical value of the IP address.
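The one-access scheme can be sketched directly; with 15-bit next-hop indices stored in 16-bit slots, the table would occupy 2^32 × 2 bytes = 8 GB, which is the memory cost the text alludes to.

```c
#include <stdint.h>
#include <stdlib.h>

/* One-memory-access lookup: a flat table with one slot per possible
 * 32-bit IP address; the numerical value of the address is the index. */
static uint16_t *next_hop_table;  /* 2^32 entries in a real deployment */

static uint16_t lookup_flat(uint32_t ip)
{
    return next_hop_table[ip];
}
```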
To achieve a trade-off, it is possible to do an XTC lookup in only two memory accesses by using only one level with 2^28 code words, which still consumes an extreme amount of memory. A possibility might be to let each code word represent more than 16 bits to reduce the number of code words. For instance, if each code word represents 256 bits instead of 16, 8 bits of the code word must be used for the shift value. In the worst case, the number of added pointers resulting from the simplified XTC encoding then increases by a factor of 256/9. The total number of code words will be 2^24, which is still quite large.
A better way of increasing performance without wasting so much memory is to reduce the lookup to four memory accesses. This is easily achieved by building an XTC with only two levels, where for instance level 1 represents the high 20 bits and each level 2 chunk represents the low 12 bits. Thus, each chunk at each level increases by a factor of 16, which also gives a rough estimate of the total growth of the data structure. Possibly, this could be combined with the idea of using "larger" code words at level 1 as described above.
The most straightforward way of increasing the speed somewhat is to use XTC techniques only at levels 2 and 3 while using direct addressing at level 1. This immediately removes one memory access.
The focus in this section has been on ideas of how to increase performance by wasting more memory. If the goal instead is to compress the representation as much as possible, the opposite of these ideas can be used, i.e. "smaller" code words and more levels.
It should be apparent that the present invention provides an improved IP routing lookup method and system that fully satisfies the aims and advantages set forth above. Although the invention has been described in conjunction with specific embodiments thereof, alternatives, modifications and variations will be apparent to those skilled in the art.
Claims
1. A system for IP routing lookup in a routing table, comprising entries of arbitrary length prefixes with associated next-hop information in a next-hop table, for determining where IP datagrams are to be forwarded, characterized by data structure means for representing the set of dominating points in the IP address space for each prefix in the routing table, said means comprising three levels of compressed pointer arrays (CPA), each including a chunk and an array of associated pointers, wherein each pointer comprises a next-hop index or a pointer to the next level of chunk.
2. A system according to claim 1, characterized in that said pointers comprise 20 bits and said next-hop indices comprise 15 bits.
3. A system according to claim 1 or 2, characterized in that the CPA of the first level represents an uncompressed pointer array of size 65536 pointers, bits 31..16 of the IP address, and the CPAs of levels 2 and 3 represent 256 pointers each, bits 15..8 and 7..0 of the IP address respectively.
4. A system according to any of the preceding claims 1-3, characterized in that the chunk comprises code words, wherein each code word represents 16 bits of an implicit bit array representing the chunk, and in that the size of the implicit bit array is the same as the maximum number of pointers in the CPA.
5. A system according to claim 4, characterized in that in the implicit bit array the bit with index i is set if and only if there is a prefix starting at position i.
6. A system according to claim 5, characterized in that the implicit bit array is represented by implicit 16 bits bit masks, wherein each bit mask corresponds to a code word, including an empty bit mask with no bits set, a bit mask with exactly one bit set, and/or a bit mask with two or more bits set.
7. A system according to claim 6, characterized in that the starting point within the chunk is a multiple of 16, and it completely covers the whole bit mask, and all the empty bit masks immediately following.
8. A system according to claim 7, characterized in that for bit masks with one bit set and the empty bit masks immediately following, the next-hop index is encoded directly in the code word.
9. A system according to claim 8, characterized by 676 different kinds of 16 bits bit masks, having two or more bits set, wherein the bit masks and the corresponding pointers are represented by a combination of a code word and a list of pointers associated with the code word.
10. A system according to claim 9, characterized by means capable of identifying a prefix with the smallest size in a set of prefixes having their starting points in the same bit mask, dividing prefixes having a larger size than the prefix with the smallest size in half until the sizes of all prefixes have an equal smallest size, wherein each time a prefix is split, a zero bit is flipped in the bitmask and a copy of the pointer associated with the closest set bit to the left of the flipped bit is added directly after the copied pointer in the pointer list.
11. A system according to claim 10, characterized in that the pointer list comprises 2, 4, 8 or 16 pointers associated with each code word.
12. A system according to claim 10, characterized in that each code word comprises: 1 bit that indicates if it is a direct next-hop or not; if it is a next-hop, 15 bits representing the next-hop index; and if it is not a next-hop, 2 bits shift value for the code, 5 bits for the high bits of the associated pointers, and 16 bits for level 1 or 8 bits for levels 2 and 3 for representing the pointer offset, i.e. the index of the first pointer associated with the code word, in the pointer array stored directly after the chunk.
13. A method of IP routing lookup in a routing table, comprising entries of arbitrary length routing prefixes with associated next-hop information in a next-hop table, for determining where IP datagrams are to be forwarded, characterized by the steps of: accessing a first level code word (23..0) in a chunk in a first level compressed pointer array at a location corresponding to a first index part (31..20) of the IP address, and if the most significant bit of the code word is set, retrieving the next-hop index bits (22..8) in the code word.
14. A method according to claim 13, characterized by the further steps of: if the most significant bit of the code word is not set, shifting a second index part (19..16) of the IP address with shift bits (17..16) of the code word, adding offset bits (14..0) of the code word to the shifted second index part forming a pointer index, accessing a first level pointer at the pointer index location in a list of pointers of the first level compressed pointer array, and if the most significant bit of the pointer is set, retrieving the next-hop index bits (14..0) in the pointer.
15. A method according to claim 14, characterized by the further step of: if the most significant bit of the pointer is not set, retrieving and combining the remaining bits of the pointer (14..0) with a pointer bits part (22..18) of the code word forming a 20-bit second level starting pointer to the starting point of a second level compressed pointer array of a set of second level compressed pointer arrays.
16. A method according to claim 15, characterized by the further steps of: accessing a second level code word (15..0) in a chunk in the second level compressed pointer array at a location corresponding to a third index part (15..12) of the IP address, and if the most significant bit of the code word is set, retrieving the next-hop index bits (14..0) in the code word.
17. A method according to claim 16, characterized by the further steps of: if the most significant bit of the code word is not set, shifting a fourth index part (11..8) of the IP address with shift bits (9..8) of the code word, adding offset bits (7..0) of the code word to the shifted fourth index part forming a pointer index, accessing a second level pointer at the pointer index location in a list of pointers of the second level compressed pointer array, and if the most significant bit of the pointer is set, retrieving the next-hop index bits (14..0) in the pointer.
18. A method according to claim 17, characterized by the further steps of: if the most significant bit of the second level pointer is not set, retrieving and combining the remaining bits of the pointer with a pointer bits part of the code word forming a 20-bit third level starting pointer to the starting point of a third level compressed pointer array of a set of third level compressed pointer arrays.
19. A method according to claim 18, characterized by the further steps of: accessing a third level code word (15..0) in a chunk in the third level compressed pointer array at a location corresponding to a fifth index part (7..4) of the IP address, and if the most significant bit of the code word is set, retrieving the next-hop index bits (14..0) in the code word.
20. A method according to claim 19, characterized by the further steps of: if the most significant bit of the code word is not set, shifting a sixth index part (3..0) of the IP address with shift bits (9..8) of the code word, adding offset bits (7..0) of the code word to the shifted sixth index part forming a pointer index, accessing a third level pointer at the pointer index location in a list of pointers of the third level compressed pointer array, retrieving the next-hop index bits (14..0) in the pointer.
21. A computer program product directly loadable into the internal memory of a digital computer, characterized in that said product comprises software code means for performing the steps of any of the claims 13-20.
22. A computer program product comprising a computer readable medium, characterized in that said medium stores computer program code means which, when loaded on a computer, cause the computer to perform the steps of any of the claims 13-20.
23. A system for IP routing lookup in a routing table, comprising entries of arbitrary length prefixes with associated next-hop information in a next-hop table, for determining where IP datagrams are to be forwarded, characterized by means for performing the steps of any of the claims 13-20.
24. A method for creating data structure means for use in a system of IP routing lookup in a routing table, comprising entries of arbitrary length routing prefixes with associated next-hop information in a next-hop table, for determining where IP datagrams are to be forwarded, characterized by the step of: creating data structure means for representing the set of dominating points in the IP address space for each prefix in the routing table, comprising three levels of compressed pointer arrays (CPA), each including a chunk and an array of associated pointers, wherein each pointer comprises a next-hop index or a pointer to the next level of chunk.
25. A computer program product directly loadable into the internal memory of a digital computer, characterized in that said product comprises software code means for performing the step of claim 24.
26. A computer program product comprising a computer readable medium, characterized in that said medium stores computer program code means which, when loaded on a computer, cause the computer to perform the step of claim 24.
27. System for creating data structure means for use in a system of IP routing lookup in a routing table, comprising entries of arbitrary length routing prefixes with associated next-hop information in a next-hop table, for determining where IP datagrams are to be forwarded, characterized by means for performing the step of claim 24.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU78193/00A AU7819300A (en) | 1999-09-22 | 2000-09-22 | Method and system for efficient routing table compression and fast routing lookups |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| SE9903460-5 | 1999-09-22 | ||
| SE9903460A SE9903460L (en) | 1999-09-22 | 1999-09-22 | Method and system for fast routing lookups |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2001022667A1 true WO2001022667A1 (en) | 2001-03-29 |
Family
ID=20417140
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/SE2000/001841 Ceased WO2001022667A1 (en) | 1999-09-22 | 2000-09-22 | Method and system for efficient routing table compression and fast routing lookups |
Country Status (4)
| Country | Link |
|---|---|
| AU (1) | AU7819300A (en) |
| SE (1) | SE9903460L (en) |
| TW (1) | TW508926B (en) |
| WO (1) | WO2001022667A1 (en) |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2003063427A1 (en) * | 2002-01-21 | 2003-07-31 | Rockstorm Technologies Ab | Method and system for fast ip routing lookup using forwarding tables with guaranteed compression rate and lookup performance |
| CN1330190C (en) * | 2001-08-14 | 2007-08-01 | 中兴通讯股份有限公司 | Method for selecting route based on user' IP address route |
| WO2008119282A1 (en) * | 2007-03-31 | 2008-10-09 | Huawei Technologies Co., Ltd. | Method for routing lookup and system for forwarding |
| US9361404B2 (en) | 2014-05-07 | 2016-06-07 | Red Hat Israel, Ltd. | Offline radix tree compression with key sequence skip |
| US9442927B2 (en) | 2014-05-07 | 2016-09-13 | Red Hat Israel, Ltd. | Offline generation of compressed radix tree with key sequence skip |
| WO2016183944A1 (en) * | 2015-05-19 | 2016-11-24 | 中兴通讯股份有限公司 | Message forwarding configuration method and apparatus for communication device, and message forwarding method |
| CN119766724A (en) * | 2024-11-25 | 2025-04-04 | 天翼云科技有限公司 | Routing table updating method and device, electronic equipment and readable storage medium |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR100983150B1 (en) | 2005-11-14 | 2010-09-20 | 미쓰비시덴키 가부시키가이샤 | Network unit and programmable controller using it |
| TWI413910B (en) * | 2008-01-25 | 2013-11-01 | Univ Nat Taiwan | Numerical data range interval query method and system |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO1999013619A2 (en) * | 1997-09-09 | 1999-03-18 | Sics Swedish Institute Of Computer Science | A lookup device and a method for classification and forwarding of packets |
| WO1999014906A1 (en) * | 1997-09-15 | 1999-03-25 | Effnet Group Ab | Method and system for fast routing lookups |
- 1999-09-22: SE SE9903460A patent/SE9903460L/en not_active Application Discontinuation
- 2000-09-21: TW TW089119541A patent/TW508926B/en active
- 2000-09-22: WO PCT/SE2000/001841 patent/WO2001022667A1/en not_active Ceased
- 2000-09-22: AU AU78193/00A patent/AU7819300A/en not_active Abandoned
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO1999013619A2 (en) * | 1997-09-09 | 1999-03-18 | Sics Swedish Institute Of Computer Science | A lookup device and a method for classification and forwarding of packets |
| WO1999014906A1 (en) * | 1997-09-15 | 1999-03-25 | Effnet Group Ab | Method and system for fast routing lookups |
Non-Patent Citations (4)
| Title |
|---|
| CHEUNG G., MCCANNE S.: "Optimal routing table design for IP address lookup under memory constraints", INFOCOM '99, EIGHTEENTH ANNUAL JOINT CONFERENCE OF THE IEEE COMPUTER AND COMMUNICATIONS SOCIETIES, XP002936425 * |
| CHIUEH T., PRADHAN P.: "High-performance IP routing table lookup using CPU caching", INFOCOM '99, EIGHTEENTH ANNUAL JOINT CONFERENCE OF THE IEEE COMPUTER AND COMMUNICATIONS SOCIETIES, XP002936424 * |
| NEN-FU HUANG, SHI-MING ZHAO, JEN-YI PAN, CHI-AN SU: "A fast IP routing lookup scheme for gigabit switching routers", INFOCOM '99, EIGHTEENTH ANNUAL JOINT CONFERENCE OF THE IEEE COMPUTER AND COMMUNICATIONS SOCIETIES, XP002936423 * |
| TZENG H.H.-Y. AND PRZYGIENDA T.: "On fast address-lookup algorithms", IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, vol. 17, no. 6, June 1999 (1999-06-01), pages 1067 - 1082, ISSN: 0733-8716, XP002936422 * |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1330190C (en) * | 2001-08-14 | 2007-08-01 | 中兴通讯股份有限公司 | Method for selecting route based on user' IP address route |
| WO2003063427A1 (en) * | 2002-01-21 | 2003-07-31 | Rockstorm Technologies Ab | Method and system for fast ip routing lookup using forwarding tables with guaranteed compression rate and lookup performance |
| WO2008119282A1 (en) * | 2007-03-31 | 2008-10-09 | Huawei Technologies Co., Ltd. | Method for routing lookup and system for forwarding |
| US9361404B2 (en) | 2014-05-07 | 2016-06-07 | Red Hat Israel, Ltd. | Offline radix tree compression with key sequence skip |
| US9442927B2 (en) | 2014-05-07 | 2016-09-13 | Red Hat Israel, Ltd. | Offline generation of compressed radix tree with key sequence skip |
| WO2016183944A1 (en) * | 2015-05-19 | 2016-11-24 | 中兴通讯股份有限公司 | Message forwarding configuration method and apparatus for communication device, and message forwarding method |
| CN106302181A (en) * | 2015-05-19 | 2017-01-04 | 中兴通讯股份有限公司 | The message of communication equipment forwards collocation method, device and message forwarding method |
| CN106302181B (en) * | 2015-05-19 | 2020-06-26 | 中兴通讯股份有限公司 | Message forwarding configuration method, device and message forwarding method of communication equipment |
| CN119766724A (en) * | 2024-11-25 | 2025-04-04 | 天翼云科技有限公司 | Routing table updating method and device, electronic equipment and readable storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| AU7819300A (en) | 2001-04-24 |
| TW508926B (en) | 2002-11-01 |
| SE9903460D0 (en) | 1999-09-22 |
| SE9903460L (en) | 2001-03-23 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP1168723B1 (en) | Method and apparatus for longest matching prefix determination in a communication network | |
| US6266706B1 (en) | Fast routing lookup system using complete prefix tree, bit vector, and pointers in a routing table for determining where to route IP datagrams | |
| US6985483B2 (en) | Methods and systems for fast packet forwarding | |
| US6067574A (en) | High speed routing using compressed tree process | |
| US7899067B2 (en) | Method and apparatus for generating and using enhanced tree bitmap data structures in determining a longest prefix match | |
| US7433871B2 (en) | Efficient ipv4/ipv6 best matching prefix method and apparatus | |
| US7415472B2 (en) | Comparison tree data structures of particular use in performing lookup operations | |
| EP2037381A1 (en) | Database and database processing methods | |
| WO2003079618A2 (en) | System and method for longest prefix match internet protocol lookup | |
| WO2004040400A2 (en) | Methods and systems for fast binary network address lookups using parent node information stored in routing tables entries | |
| US7478109B1 (en) | Identification of a longest matching prefix based on a search of intervals corresponding to the prefixes | |
| US11327974B2 (en) | Field variability based TCAM splitting | |
| WO2001022667A1 (en) | Method and system for efficient routing table compression and fast routing lookups | |
| US6996559B1 (en) | IP address resolution methods and apparatus | |
| WO2002098055A2 (en) | Load balancing in ip address lookup | |
| US20050114393A1 (en) | Dynamic forwarding method using binary search | |
| JP2001517024A (en) | Method and system for fast routing lookup | |
| WO2000077984A1 (en) | Arrangement, system and method relating to datacommunication | |
| Chang et al. | Efficient IP routing table lookup scheme | |
| Wang et al. | A fast table update scheme for high-performance IP forwarding | |
| KR100572693B1 (en) | How to Look Up Internet Protocol Packets | |
| Chan et al. | High-performance IP forwarding with efficient routing-table update | |
| Yazdani et al. | Performing IP lookup on very high line speed | |
| Wang et al. | Routing interval: a new concept for IP lookups | |
| Wang et al. | An efficient IP routing lookup by using routing interval |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
| AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
| DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
| 122 | Ep: pct application non-entry in european phase | ||
| NENP | Non-entry into the national phase |
Ref country code: JP |