
WO2011078108A1 - Pattern-matching method and device for a multiprocessor environment - Google Patents


Info

Publication number
WO2011078108A1
WO2011078108A1 · PCT/JP2010/072869 (JP2010072869W)
Authority
WO
WIPO (PCT)
Prior art keywords
packet
core
search key
pattern matching
pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2010/072869
Other languages
French (fr)
Japanese (ja)
Inventor
Norio Yamagaki (山垣 則夫)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Priority to JP2011547531A priority Critical patent/JPWO2011078108A1/en
Publication of WO2011078108A1 publication Critical patent/WO2011078108A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques

Definitions

  • The present invention relates to a multiprocessor environment composed of a plurality of processors. In particular, it relates to a system in which a plurality of processors and a shared memory shared by those processors are regarded as one unit, and a plurality of such units exist.
  • The present invention relates to a system that performs pattern matching by dividing a pattern management table across the shared memories of the units and executing pattern matching in parallel using the plurality of processors in the system.
  • Pattern matching, which identifies whether an input character string matches a stored character string (pattern), is an important elemental technology in the network field. For example, it is used in network relay devices such as routers and switches to identify the flow to which an input packet belongs from its header information (hereinafter referred to as flow search processing). It is also used in a technique called Deep Packet Inspection (DPI), which performs pattern matching not only on the packet header but also on the data in the packet payload to identify flows more finely, provide QoS (Quality of Service) per flow, perform load distribution control, and so on.
  • DPI Deep Packet Inspection
  • TCP Transmission Control Protocol
  • UDP User Datagram Protocol
  • IP Internet Protocol
  • A sequence of packets specified using header fields such as the source port number and destination port number is called a flow.
  • A flow is typically identified using a technique called hashing.
  • The set of fields described above that identifies the flow is used as a search key, and the search key is converted into a bit string of fixed length, called a hash value, by applying a hash function to it.
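As an illustration of the step above, the following minimal Python sketch maps a search key (a tuple of header fields) to a fixed-length hash value. The specification does not name a concrete hash function; SHA-256 truncated to the table width is used here purely as a stand-in.

```python
import hashlib

def hash_value(search_key: tuple, table_bits: int = 16) -> int:
    """Convert a search key into a bit string of fixed length (a hash value).

    The hash function is illustrative; any function that spreads keys
    uniformly over 2**table_bits buckets would serve.
    """
    data = "|".join(str(field) for field in search_key).encode()
    digest = hashlib.sha256(data).digest()
    # Truncate the digest to table_bits bits.
    return int.from_bytes(digest, "big") % (1 << table_bits)

# Example 5-tuple: source IP, destination IP, protocol, source port, destination port
key = ("192.0.2.1", "198.51.100.7", 6, 49152, 80)
h = hash_value(key)
```

The same key always yields the same hash value, which is what lets the hash value serve as an index into a flow table.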
  • FIG. 17 shows a configuration example of a hash table.
  • Since the flow table has a large capacity, it is constructed on the main memory 6.
  • When a packet is received by the packet transmission / reception block 10, it is written to the main memory 6, for example using a DMA (Direct Memory Access) function, via the MAC or reception FIFO.
  • The packet transmission / reception block 10 then either interrupts the processor core 11-1 of the CPU 11 or issues an event to an event queue. By interrupt or as a result of polling, the processor core 11-1 detects the arrival of the packet and reads its header information from the main memory 6.
  • DMA Direct Memory Access
  • The processor core 11-1 derives a hash value from the set of header fields serving as the search key, reads the flow-table entry corresponding to the hash value from the main memory 6, and compares the search keys. If the search keys match, the packet is determined to belong to the flow corresponding to that entry. If they do not match, the core follows the pointer stored in the entry, reads the next entry, and repeats the comparison, thereby confirming whether the packet belongs to any registered flow. The processing performed when the packet's search key matches no entry in the flow table, for example registering a new flow-table entry, depends on the individual system specification and is omitted here.
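The lookup loop described above (read entry, compare keys, follow the pointer) can be sketched as follows. This is a minimal model, not the patented apparatus: chains are singly linked for brevity (the patent's table links entries bidirectionally), and Python's built-in hash() stands in for the hash function.

```python
class FlowEntry:
    """One flow-table entry: a search key, a flow ID, and a chain pointer."""
    def __init__(self, search_key, flow_id):
        self.search_key = search_key
        self.flow_id = flow_id
        self.next = None  # next entry with the same hash value, or None

class FlowTable:
    """Hash table whose buckets are chains of FlowEntry objects."""
    def __init__(self, table_bits=8):
        self.mask = (1 << table_bits) - 1
        self.buckets = [None] * (1 << table_bits)

    def _index(self, search_key):
        return hash(search_key) & self.mask  # stand-in hash function

    def insert(self, search_key, flow_id):
        entry = FlowEntry(search_key, flow_id)
        i = self._index(search_key)
        entry.next = self.buckets[i]  # prepend to the chain
        self.buckets[i] = entry

    def lookup(self, search_key):
        """Follow the chain, comparing search keys, until a match or the end."""
        entry = self.buckets[self._index(search_key)]
        while entry is not None:
            if entry.search_key == search_key:
                return entry.flow_id  # packet belongs to this flow
            entry = entry.next
        return None  # unregistered flow
```

A miss (None) corresponds to the unregistered-flow case whose handling the specification leaves to the individual system.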
  • FIG. 19 shows a schematic configuration example of a normal server architecture composed of two CPUs having two processor cores.
  • FIG. 19 shows a configuration in which the CPU 12 and the CPU 13 having two processor cores are provided instead of the CPU 11 in the configuration shown in FIG.
  • the CPU 12 includes two processor cores 12-1 and 12-2 and an L2 cache 12-0
  • the CPU 13 includes two processor cores 13-1 and 13-2 and an L2 cache 13-0.
  • Patent Document 3 discloses replacement control for reducing the miss rate of a shared cache in a multi-core processor environment such as that shown in FIG. 19, and Patent Document 4 discloses exclusive control of a shared memory in a multiprocessor environment.
  • Patent Document 5 discloses a prefetch function for a shared cache in a multi-core processor environment, but it does not solve the problem that arises when different data must be referenced for each process, as in flow search processing.
  • In that approach, longest-match search (LPM: Longest Prefix Match) is targeted, and a part of the data area storing the data serving as the search key is placed in a memory area with high-speed access performance, such as a cache.
  • LPM Longest Prefix Match
  • Patent Document 6, as shown in its FIG. 1, discloses a configuration in which (a plurality of) processors, a cache memory, a memory, peripheral devices (peripherals), and a packet controller are connected via a system bus (see, for example, the seventh paragraph).
  • Pattern matching logic and a hash mechanism configured in the packet receiver compare a pattern defined in advance in a register of the pattern matching logic against the address and other fields of a received packet to determine whether the packet may be received (see, for example, the ninth paragraph).
  • That is, Patent Document 6 performs pattern matching inside the packet controller; pattern matching is not performed in each processor.
  • In Patent Document 6, part of the packet data permitted for reception by pattern matching is written into the cache memory and used for packet processing by the processor.
  • However, the patterns (flow information) used for pattern matching are not written into the cache memory, and the cache memory is not used as a flow management table.
  • FIG. 7 of Patent Document 7 discloses a configuration in which a plurality of micro engines (processors), a memory (SRAM, DRAM) shared by them, a packet receiving unit, and the like are connected by a bus (see, for example, paragraph 50).
  • Each micro engine includes a local memory and a computing core (processor core).
  • The flow state information (patterns and state-transition information used in pattern matching) held in the shared memory (SRAM) is read into the local memory in a micro engine, and the pattern matching process is performed by the processor core in that micro engine (see, for example, paragraph 55).
  • In Patent Document 7, there are main memories such as SRAM and DRAM shared by a plurality of processors, and a local memory per processor, but there is no cache memory that can be shared by a plurality of processors. That is, apart from the main memory (SRAM, DRAM, etc.) shared by all processors, there is no cache memory shared by a specific subset of processors.
  • main memory SRAM, DRAM, etc.
  • As a result, the flow table is built on main memory with low access performance, and the flow table on the main memory must be accessed multiple times.
  • Even if a part of the flow table is cached in a memory, such as a cache memory, that can be accessed faster than the main memory, the flow to which an arriving packet belongs is essentially random: there is little dependency between arriving packets, so a cached subset of the flow table is unlikely to be hit, and the effect of reducing the number of main-memory accesses is small.
  • An object of the present invention is to provide a pattern matching system capable of realizing a pattern matching process having a high-speed search performance corresponding to a network band.
  • The first and second pattern matching apparatuses of the present invention include one or a plurality of multi-core CPUs (1, 2, ..., M in FIGS. 1 and 6), each having a plurality of processor cores and a shared memory shared among those cores; a packet transmission / reception block (5 in FIGS. 1 and 6); a search key extraction block (4 in FIGS. 1 and 6) that extracts from a received packet the header information serving as the search key in the flow search process, generates and issues a packet reception event, and writes the packet to the main memory; a main memory such as a DRAM (6 in FIGS. 1 and 6) into which the packet reception event and, if necessary, the packet itself are written; and a bus interconnecting these components.
  • A pattern table based on a hash table is configured in the shared memory in each multi-core CPU. With this configuration, the search key extraction block extracts in advance the set of fields serving as the search key from the packet header information and notifies a processor core of it, and the pattern matching process is performed using the pattern table in the shared memory, which can be accessed faster than the main memory; the object of the present invention is thereby achieved.
  • the third pattern matching device of the present invention has a final result determination block (7 in FIG. 9).
  • Each multi-core CPU holds, in its shared memory, a sub pattern table obtained by dividing one original pattern table. With this configuration, the search key extraction block extracts in advance the set of fields serving as the search key from the packet header information, notifies one processor core of each multi-core CPU, and thereby distributes the pattern matching; the object of the present invention is achieved by performing the pattern matching process using the pattern tables in the shared memories, which can be accessed faster than the main memory.
  • The fourth pattern matching device of the present invention has a configuration in which the final result determination block (7 in FIG. 9) of the third pattern matching device is removed and a hash value calculation block (9 in FIG. 13) is added.
  • In the fourth device, the search key extraction block (8 in FIG. 13) extracts in advance the set of fields serving as the search key from the header information of the packet, and the hash value calculation block (9 in FIG. 13) computes its hash value.
  • From the hash value, the multi-core CPU whose pattern table holds the group of pattern entries corresponding to that hash value can be identified, and pattern matching is performed in one processor core of that multi-core CPU.
  • The object of the present invention is achieved by performing the pattern matching process using the pattern table in the shared memory, which can be accessed faster than the main memory.
  • Brief description of the drawings: FIG. 2 is a flowchart showing the operation of the apparatus according to the first embodiment of the present invention; further figures show the flowchart of step A4 in FIG. 2, an example of a flow table, the operation of each embodiment, the block diagrams of the apparatus according to the second and third embodiments, and an example of the division of the flow table.
  • FIG. 1 is a block diagram showing the configuration of the first exemplary embodiment of the present invention.
  • The first embodiment includes a multi-core CPU 1; a packet transmission / reception block 5 for transmitting and receiving packets; a search key extraction block 4 that extracts from received packets the header information serving as the search key in flow search processing, generates and issues packet reception events, and writes packets to the main memory; a main memory 6, such as a DRAM, into which packet reception events and the packets themselves are written; and a bus 3 interconnecting the multi-core CPU 1, the search key extraction block 4, and the main memory 6.
  • the multi-core CPU 1 includes N processor cores 1-1, 1-2,..., 1-N, and a shared memory 1-0 shared by the N processor cores.
  • The shared memory 1-0 holds the entire flow table, configured as a hash table. When the shared memory is also used as a data cache, the flow table is held in an area logically separated from the data cache.
  • When the network is Ethernet (registered trademark), the packet transmission / reception block 5 has a MAC function and, after shaping the input bit stream into an Ethernet (registered trademark) frame, outputs it to the search key extraction block 4. Since such a packet transmission / reception block 5 is well known to those skilled in the art, its detailed configuration is omitted.
  • the search key extraction block 4 receives the packet received from the packet transmission / reception block 5 and extracts header information serving as a search key in the flow search process.
  • a packet reception event including the extracted search key information is generated and written to the event queue for the multi-core CPU 1, and the packet after the search key is extracted is written to the main memory.
  • the DMA function is used for the writing process.
  • this block reads packet data from the main memory and outputs it to the packet transmission / reception block 5 as it is.
  • In the present embodiment, the multi-core CPU 1, the search key extraction block 4, and the main memory 6 are connected by a bus 3, but the connection may instead be made through a chipset, such as the North Bridge or South Bridge provided in a server, through a crossbar, or through a ring bus.
  • The present embodiment may be configured on a motherboard, such as that of a server, or everything except the main memory 6 may be configured on one chip as a single SoC (System-on-a-Chip).
  • Packets input from the network are received by the packet transmission / reception block 5.
  • The packet received by the packet transmission / reception block 5 is shaped into an Ethernet (registered trademark) frame (in the present specification, frames and packets are collectively referred to as packets) and is then output to the search key extraction block 4 (step A1).
  • When receiving a packet from the packet transmission / reception block 5, the search key extraction block 4 extracts predetermined fields serving as the search key in the flow search process from the packet header information, assigns an identifier that uniquely identifies the packet within the apparatus, and writes them into the FIFO included in the search key extraction block 4 (step A2).
  • The packet header fields used as the search key include, for example, the source IP address (32 bits), destination IP address (32 bits), and protocol number (8 bits) defined in the IP header, and fields defined in the TCP / UDP header, such as the source and destination port numbers.
  • the search key extraction block 4 generates and issues an event for notifying the multi-core CPU 1 of packet reception, writes it in the event queue, and writes the corresponding packet in the main memory 6 (step A3).
  • the event generated here only needs to include the extracted search key information and the identifier of the corresponding packet.
  • When the event queue is configured in the main memory 6, for example, the event is written into the event queue in the main memory 6.
  • The event queue area for the multi-core CPU 1 and the area for writing packet data are logically separated, and their base addresses are assumed to be known to the multi-core CPU 1.
  • The multi-core CPU 1 monitors the event queue by polling; if an event is present, it reads the leading event and a specific processor core performs the flow search process (step A4).
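One polling round of the scheme above can be sketched as follows. This is a simplified model under stated assumptions: a deque stands in for the event queue in memory, a plain dict stands in for the hash-based flow table, and the event layout (search key plus packet identifier) follows the description in step A2.

```python
from collections import deque

# A packet-reception event carries the extracted search key and an
# identifier that is unique to the packet within the device.
event_queue = deque()

def poll_and_dispatch(flow_table: dict, results: dict) -> bool:
    """One polling round: if an event is queued, consume the leading event
    and perform the flow search for it (step A4). Returns True when an
    event was processed, False when the queue was empty."""
    if not event_queue:
        return False
    search_key, packet_id = event_queue.popleft()
    # dict.get returns None for an unregistered flow.
    results[packet_id] = flow_table.get(search_key)
    return True

# Usage: one event arrives and is resolved against a one-entry flow table.
event_queue.append((("192.0.2.1", 80), "pkt1"))
results = {}
poll_and_dispatch({("192.0.2.1", 80): 7}, results)
```

In the apparatus the queue lives in the main memory 6 and the lookup runs against the hash table in the shared memory 1-0; the control flow, however, is the same.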
  • FIG. 3 is a flowchart for explaining the detailed operation of step A4.
  • the multi-core CPU 1 When there is an event in the event queue, the multi-core CPU 1 reads the top event and assigns it to a specific processor core (step B1).
  • For example, a specific processor core, such as the processor core 1-1, may specialize in polling: after reading an event from the event queue, it assigns the event to another free processor core that is not currently processing.
  • Alternatively, each processor core may collect events by polling the event queue itself. In the latter case, since every processor core accesses the memory area holding the same event queue, arbitration among the processor cores is necessary.
  • the processor core 1-k to which the event is assigned calculates a hash value using a preset hash function for the search key stored in the event (step B2).
  • a hash function and the hash value are well known to those skilled in the art, a detailed description thereof will be omitted.
  • FIG. 4 shows an example of a flow table configured in the shared memory 1-0.
  • the flow table is configured based on a hash table, and a plurality of entries having the same hash value are connected in a bidirectional list.
  • One entry holds a search key, a flow ID for specifying a flow, an address value serving as a pointer to the next entry in the list, and a pointer to the previous entry in the list.
  • When the combined data length of the search key, the flow ID, the pointer to the next entry, and the pointer to the previous entry is longer than the word length of the shared memory 1-0, an entry is composed of a plurality of consecutive words.
  • Here, the address value means a logical address value indicating the first word of each entry, obtained for example by ignoring the lower bits of the physical address.
  • Entries are basically added to the free space and managed by a bidirectional list.
  • When an entry is to be added at an address value at which another entry is already managed, the entry whose hash value equals that address value takes priority, and the already-managed entry is assumed to be moved to another free area.
  • the hash value may be calculated in a range smaller than the logical address space indicating the first word of the entry.
  • the flow table may be configured according to the configuration of the hash table.
  • In that case, the entry's address value can itself be used as the flow ID identifying the flow, so there is no need to hold a flow ID in the entry.
  • the processor core 1-k that has read the entry from the flow table determines whether the search key of the read entry matches the search key of the event (step B5).
  • If they match, the flow ID is acquired (step B6) and the flow search process (step A4) ends. If they do not match, the core checks whether a next entry is connected by the link (step B7). If there is a next entry, its pointer is used as the address value (step B8), the next entry is read (step B4), and the comparison is repeated. If there is no next entry, the corresponding flow is not registered; the result is determined to be an unregistered flow (step B9), and the flow search process (step A4) ends.
  • the corresponding processor core 1-k outputs the result obtained in the flow search process (step A4) (step A5).
  • Flow search processing is generally performed in order to identify the flow to which a packet belongs and then to apply individual processing, such as QoS processing, to that flow. A processing block that performs such processing, for example a CPU, is provided via the bus 3 or by some other communication means. Since this embodiment focuses on the flow search processing, these processing blocks are omitted; the result output destination in step A5 is assumed to be such a per-flow processing block.
  • If no matching search key is found, it is determined that the flow has not been registered, and the process ends.
  • In the above description, the packet itself is received from the network by the packet transmission / reception block 5; however, a configuration in which only the packet header is input to the flow search device of the present embodiment is also possible.
  • the packet transmission / reception block 5 receives a packet header input from the outside and outputs it to the search key extraction block 4.
  • the search key extraction block 4 extracts a set of fields serving as a search key from the packet header, and generates and issues an event. In this case, since the payload data of the packet is not received, it is not necessary to write the packet data to the main memory 6.
  • As shown in FIG. 5, when packets pkt1, pkt2, ... arrive, the search keys Key#1, Key#2, ... are extracted from them.
  • Each search key is assigned to one of the plurality of processor cores in the multi-core CPU, and a flow search process is performed.
  • As a method of assigning search keys to the plurality of processor cores in the multi-core CPU, sequential assignment by a technique such as round-robin is conceivable, but the method is not limited thereto.
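The round-robin assignment named above can be sketched in a few lines. This is only an illustration of the policy the specification mentions as one possibility; the function name and dict-of-lists result shape are not from the specification.

```python
def round_robin_dispatch(search_keys, num_cores):
    """Assign each arriving search key to one of num_cores processor
    cores in turn (round-robin), returning {core index: [keys]}."""
    assignments = {core: [] for core in range(num_cores)}
    for i, key in enumerate(search_keys):
        assignments[i % num_cores].append(key)
    return assignments

# Keys Key#1..Key#3 over 2 cores: core 0 gets Key#1 and Key#3, core 1 gets Key#2.
dispatched = round_robin_dispatch(["Key#1", "Key#2", "Key#3"], 2)
```

Round-robin spreads the load evenly regardless of packet content, which matches the goal stated later of smoothing the flow search load; a content-aware policy would be equally valid under the specification's wording.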
  • As described above, the search key extraction block extracts in advance the set of fields serving as the search key from the packet header information and notifies the processor core of it; since the processor core no longer needs to read the packet header stored in the main memory, the processing performance of the flow search can be improved.
  • Furthermore, since the flow table is configured on a shared memory shared by the plurality of processor cores in the multi-core CPU, entries can be read faster than when the flow table is configured on a main memory such as a DRAM.
  • FIG. 6 is a block diagram showing the configuration of the second exemplary embodiment of the present invention.
  • The second embodiment of the present invention includes a multi-core CPU 1, a multi-core CPU 2, ..., a multi-core CPU M; a packet transmission / reception block 5 for transmitting and receiving packets; a search key extraction block 4 that extracts from received packets the header information serving as the search key in the flow search process, generates and issues packet reception events, and writes packets to the main memory; a main memory 6, such as a DRAM, into which packet reception events and, if necessary, the packets themselves are written; and a bus 3 to which the multi-core CPUs 1 through M, the search key extraction block 4, and the main memory 6 are connected.
  • This embodiment is the same as the first embodiment shown in FIG. 1 except that the multi-core CPUs 2, ..., M are added.
  • The multi-core CPU 2 has the same form as the multi-core CPU 1 and includes N processor cores 2-1, 2-2, ..., 2-N and a shared memory 2-0 shared by the N processor cores.
  • The shared memory 2-0 holds a flow table configured as a hash table. The same configuration applies to the other multi-core CPUs, and the flow tables configured in the shared memories of all multi-core CPUs have the same contents.
  • When the shared memory is also used as a data cache, the flow table is held in an area logically separated from the data cache.
  • the multi-core CPU 1, the multi-core CPU 2,..., The multi-core CPU M, the search key extraction block 4, and the main memory 6 are connected via the bus 3.
  • this connection form may be a form connected through a chip set called North Bridge or South Bridge provided in a server or the like, a form of a crossbar, or a ring bus.
  • the present embodiment may be configured on a motherboard such as a server, or may be configured on one chip as a single SoC except for the main memory 6.
  • the flowchart of FIG. 7 is a modification of the flowchart of the first embodiment, in which step A3 is replaced with step A6 and step A5 is replaced with step A7. Since other steps are the same as those in the first embodiment, detailed description thereof is omitted.
  • the search key extraction block 4 receives a packet input from the network via the packet transmission / reception block 5 (step A1), and performs processing for step A2 such as search key extraction.
  • the search key extraction block 4 generates and issues an event for notifying one of a plurality of multi-core CPUs of packet reception, writes it in the event queue, and writes the corresponding packet in the main memory 6 ( Step A6).
  • the generated event only needs to include the extracted search key information and the identifier of the packet, as in the first embodiment.
  • In the present embodiment, the event queues are configured in the main memory 6; it is assumed that the event queue area of each multi-core CPU and the area for writing packet data are logically separated, and that the base address of its event queue area is known to each multi-core CPU.
  • The event is issued to only one of the plurality of multi-core CPUs. To smooth the flow search load across the multi-core CPUs, the destination may be selected, for example, by the round-robin method, in the order multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M, multi-core CPU 1, and so on; however, the selection method is not limited thereto.
  • Each of the M multi-core CPUs monitors its own event queue by polling; if an event is present, it reads the first event and one of its own processor cores performs the flow search process (step A4).
  • step A4 is a process performed independently by each multi-core CPU, and the contents of each process are the same as those in the first embodiment, and thus detailed description thereof is omitted here.
  • The processor core in each multi-core CPU that performed the actual flow search process then outputs the result obtained (step A7).
  • In the first embodiment, the processor cores of one multi-core CPU performed this processing; here, among the processor cores of the plurality of multi-core CPUs, the processor core that actually performed the flow search outputs the result, which is the difference from step A5 in the first embodiment.
  • the other processing contents of step A7 are the same as those of step A5, so the details are omitted.
  • The flow tables configured in the shared memories of the multi-core CPUs must have the same contents in all multi-core CPUs. Therefore, if an unregistered flow is registered as a new flow, the update made to the flow table by a processor core in one multi-core CPU must also be applied to the flow tables in the other multi-core CPUs.
  • the packet itself is received from the network by the packet transmission / reception block 5, but the processing when only the packet header is input is the same as in the first embodiment.
  • The possible locations of the event queues (the main memory 6, the shared memory in each multi-core CPU, a specific memory or registers in the search key extraction block 4, etc.) are also the same as in the first embodiment, so details are omitted.
  • a search key is sent to one of a plurality of multi-core CPUs.
  • Each transmitted search key is assigned to one of a plurality of processor cores in each multi-core CPU, and a flow search process is performed by the processor core in each multi-core CPU.
  • As a method of assigning to the plurality of processor cores in each multi-core CPU, for example, round-robin may be used, but the method is not limited to this.
  • the flow search process is distributed to each multi-core CPU.
  • the processing performance of the flow search can be further improved according to the increased number of multi-core CPUs.
  • Since processing is distributed across a plurality of multi-core CPUs, the frequency of contention for access to the shared memory among the processor cores within one multi-core CPU is reduced, enabling more efficient flow search processing.
  • FIG. 9 is a block diagram showing the configuration of the third exemplary embodiment of the present invention.
  • The third embodiment of the present invention includes a multi-core CPU 1, a multi-core CPU 2, ..., a multi-core CPU M; a packet transmission / reception block 5 for transmitting and receiving packets; a search key extraction block 4 that extracts from received packets the header information serving as the search key in the flow search process, generates and issues packet reception events, and writes packets to the main memory; a final result determination block 7 that collects the results of the individual flow search processes in the plurality of multi-core CPUs and selects the definitive flow search result; a main memory 6, such as a DRAM, into which packet reception events and, if necessary, the packets themselves are written; and a bus 3 to which the multi-core CPUs 1 through M, the search key extraction block 4, the main memory 6, and the final result determination block 7 are connected.
  • This embodiment is the same as the second embodiment except that a final result determination block 7 is added to the second embodiment shown in FIG.
  • Each subflow table is basically a flow table based on a hash table, as in the first and second embodiments, so the contents of each entry are not described again; however, each subflow table is a completely independent table. That is, a flow whose hash value is assigned to one subflow table is stored in an empty entry area within that same subflow table. Therefore, when the hash value is calculated over a range smaller than the logical address space indicating the first words of entries in consideration of collisions, this property must hold within each subflow table. This is supplemented as follows.
  • Suppose the hash value is calculated over a range smaller than the logical address space indicating the first words of entries, and the entry areas that cannot be addressed by a hash value are located at large address values. If such a table is divided into subflow tables, the entry areas that cannot be addressed by a hash value become biased toward the Mth subflow table and its neighbors. In that case, collisions occur frequently in the subflow tables that contain only hash-addressable entry areas, while in the subflow tables that contain many, or only, non-hash-addressable entry areas, the entry areas cannot be used effectively, and an efficient flow table cannot be constructed. For this reason, an entry area allowing for collisions must be secured for each subflow table.
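The division of one flow table into M subflow tables can be sketched as a partition of the hash-value range. The even, contiguous split below is one possible partitioning, not the one mandated by the specification; as noted above, each subflow table would additionally need its own collision (overflow) area, which this sketch omits.

```python
def split_table(table_size: int, num_cpus: int):
    """Build the [low, high) hash-value range owned by each of the
    num_cpus subflow tables, splitting table_size buckets evenly and
    giving any remainder to the last subflow table."""
    per_cpu = table_size // num_cpus
    ranges = []
    for m in range(num_cpus):
        low = m * per_cpu
        high = table_size if m == num_cpus - 1 else low + per_cpu
        ranges.append((low, high))
    return ranges

def subtable_for(hash_value: int, table_size: int, num_cpus: int) -> int:
    """Identify which multi-core CPU's subflow table owns a hash value
    under the even split above."""
    per_cpu = table_size // num_cpus
    return min(hash_value // per_cpu, num_cpus - 1)
```

In the second embodiment an event goes to one CPU chosen by load; here the hash value itself determines the owning CPU, so a key can only ever reside in one subflow table.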
  • The flowchart of FIG. 11 is a modification of the flowchart of the second embodiment shown in FIG. 7: step A6 is replaced with step A8, step A7 is replaced with step A10, and a final result determination step (step A9) is added between the flow search processing (step A4) and the result output (step A10). Since the other steps are the same as in the first and second embodiments, their detailed description is omitted.
  • The search key extraction block 4 receives a packet input from the network via the packet transmission/reception block 5 (step A1) and performs the processing of step A2, such as search key extraction.
  • The search key extraction block 4 generates and issues an event notifying all of the plurality of multi-core CPUs of packet reception, writes the event into the event queue of each multi-core CPU, and writes the corresponding packet to the main memory 6 (step A8).
  • the information on the event to be generated and the event queue for each multi-core CPU are the same as those in the second embodiment, and thus the details are omitted.
  • Each of the M multi-core CPUs monitors its own event queue for events by polling. If an event is present in the queue, the multi-core CPU reads the first event and performs a flow search process with one of its own specific processor cores (step A4).
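The per-CPU polling described above can be sketched as follows. The one-queue-per-CPU structure, the event layout, and the `flow_search` stub are illustrative assumptions standing in for the embodiment's event queue and flow search process (step A4).

```python
from collections import deque

# Assumed: one event queue per multi-core CPU (here, 4 CPUs).
event_queues = [deque() for _ in range(4)]

def flow_search(event):
    # Placeholder for the hash-based flow table lookup performed by a
    # specific processor core; returns an illustrative result tuple.
    return ("searched", event["packet_id"])

def poll_once(cpu_id):
    """One polling iteration by the designated processor core of cpu_id."""
    queue = event_queues[cpu_id]
    if not queue:                 # no event present: nothing to do this round
        return None
    event = queue.popleft()       # read the first (oldest) event
    return flow_search(event)     # run the flow search on this CPU
```

In a real system each CPU would run this loop continuously; here `poll_once` shows a single iteration for clarity.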
  • step A4 is a process performed independently by each multi-core CPU, and the contents of each process are the same as those in the first and second embodiments, and thus detailed description thereof is omitted here.
  • The processor core that has performed the flow search process outputs the obtained result, together with an identifier for the processed packet, to the final result determination block 7.
  • The final result determination block 7 checks whether any of the results identifies a flow ID. If one does, that result is output as the final result; if none of the search results identifies a flow ID, the final result is output as an unregistered flow (steps A9 and A10). The output destination of this final result is the same as in the first and second embodiments, so its details are omitted.
  • The final result is confirmed and determined once the search results from all M multi-core CPUs have been collected. Alternatively, by preparing a variable, register, or the like that can hold the final result for the packet being processed, the results may be checked sequentially each time a search result is returned from a multi-core CPU.
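The determination rule of step A9 can be sketched as follows, assuming each multi-core CPU reports either a concrete flow ID or `None` ("flow ID not identified"). The names, `M`, and the `"unregistered-flow"` marker are illustrative assumptions.

```python
M = 4  # assumed number of multi-core CPUs contributing results

def determine_final_result(results):
    """results: list of M per-CPU outcomes for one packet, each a flow ID
    or None. Since the subflow tables are disjoint, at most one CPU can
    identify the flow; if none does, the flow is unregistered."""
    assert len(results) == M          # wait until all M results have arrived
    for flow_id in results:
        if flow_id is not None:
            return flow_id            # some CPU identified the flow
    return "unregistered-flow"        # no CPU could identify a flow ID
```

The sequential-checking variant mentioned above would apply the same rule incrementally, short-circuiting as soon as a non-`None` result arrives.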
  • the packet itself is received from the network by the packet transmission / reception block 5, but the processing when only the packet header is input is the same as in the first embodiment.
  • The event queue may be constructed from the main memory 6, the shared memory in each multi-core CPU, a dedicated memory or registers in the search key extraction block 4, and the like.
  • The flow tables in the plurality of multi-core CPUs all have independent contents. Distributed processing across the multi-core CPUs is therefore not performed; however, in addition to the effects of the first embodiment, the number of flow entries is M times that of the second embodiment, making it possible to register, manage, and search many more flows.
  • FIG. 13 is a block diagram showing the configuration of the fourth exemplary embodiment of the present invention.
  • The fourth embodiment of the present invention comprises: multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M; a packet transmission/reception block 5 for transmitting and receiving packets; a search key extraction block 8 for extracting, from received packets, the header information serving as the search key for the flow search process; a hash value calculation block 9 for calculating a hash value for the extracted search key, generating and issuing a packet reception event, and writing the packet to the main memory; and a main memory 6, such as a DRAM, into which a packet reception event or the packet itself is written as necessary. The multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M, the search key extraction block 8, the hash value calculation block 9, and the main memory 6 are connected to one another.
  • This embodiment differs from the third embodiment shown in FIG. 9 in that the final result determination block 7 is removed, the search key extraction block 4 is replaced by the search key extraction block 8, and a hash value calculation block 9 is added; the rest is the same as in the second embodiment.
  • As in the third embodiment, each multi-core CPU holds one of the M subflow tables obtained by dividing the original flow table. Since this flow table is the same as in the third embodiment, its detailed description is omitted.
  • the multi-core CPU 1, the multi-core CPU 2,..., The multi-core CPU M, the hash value calculation block 9, and the main memory 6 are connected by the bus 3.
  • This connection may take the form of a chipset, such as the North Bridge or South Bridge found in servers and the like, a crossbar, or a ring bus.
  • the present embodiment may be configured on a motherboard such as a server, or may be configured on one chip as a single SoC except for the main memory 6.
  • Steps A1 and A2 in the flowchart of FIG. 13 are the same as steps A1 and A2 in the first, second, and third embodiments, so their detailed description is omitted: the search key extraction block 8 receives a packet input from the network via the packet transmission/reception block 5 (step A1) and performs the processing of step A2, such as search key extraction.
  • the search key extraction block 8 outputs the extracted search key and the packet received from the packet transmission / reception block 5 to the hash value calculation block 9, and the hash value calculation block 9 calculates a hash value for the received search key.
  • The hash value calculation performed in the hash value calculation block 9 is the same as the hash value calculation (step B2) within the flow search process (step A4) of the first, second, and third embodiments, so its detailed description is omitted.
  • When the hash value calculation block 9 has calculated the hash value, it generates and issues an event notifying one of the multi-core CPUs of packet reception, writes the event to that CPU's event queue, and writes the corresponding packet to the main memory 6 (step A12).
  • The event to be generated need only include the calculated hash value in addition to the extracted search key information and the corresponding packet identifier.
  • The configuration of the event queue for each multi-core CPU is the same as in the third embodiment.
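The event contents described above (extracted search key, packet identifier, and precomputed hash value) might be represented as a simple record; the field names are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class PacketReceptionEvent:
    """Illustrative layout of the packet reception event issued by the
    hash value calculation block 9 in this embodiment."""
    search_key: bytes     # extracted header fields (e.g. the 5-tuple)
    packet_id: int        # identifier of the packet written to main memory
    hash_value: int       # precomputed by the hash value calculation block
```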
  • The event generated by the hash value calculation block 9 is issued to one of the M multi-core CPUs. The method of selecting the multi-core CPU to which this event is issued is described below.
  • As in the third embodiment, each multi-core CPU in this embodiment holds, on its shared memory, one of the M subflow tables into which the original flow table is divided.
  • The base address of each subflow table is set in advance. Therefore, once the hash value calculation block 9 has calculated the hash value, it can determine in which of the M multi-core CPUs' subflow tables the entry group corresponding to that hash value is registered.
  • According to the calculated hash value, the hash value calculation block 9 issues a packet reception event to the multi-core CPU in which the entry group corresponding to that hash value is registered.
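A minimal sketch of this hash-to-CPU selection follows, assuming each subflow table covers a preset contiguous hash range recorded as a (base, limit) pair; the concrete ranges and all names are illustrative assumptions, not values from the embodiment.

```python
# Assumed: a 16-bit hash space split evenly across 4 multi-core CPUs,
# mirroring the preset base addresses of the subflow tables.
SUBTABLE_RANGES = [(0, 16384), (16384, 32768), (32768, 49152), (49152, 65536)]

def target_cpu(hash_value):
    """Return the index of the multi-core CPU whose subflow table holds
    the entry group for hash_value."""
    for cpu_id, (base, limit) in enumerate(SUBTABLE_RANGES):
        if base <= hash_value < limit:
            return cpu_id
    raise ValueError("hash value outside all subflow table ranges")
```

The hash value calculation block would then write the packet reception event to the event queue of `target_cpu(hash_value)` only.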
  • FIG. 15 is a flowchart for explaining more detailed operation of step A13.
  • This flow search process is a process common to the processor cores that perform the flow search process in the M multi-core CPUs, and will be described below as a process for any processor core in any multi-core CPU.
  • When the flow search process (step A13) ends, the processor core outputs the result (step A7). Since this result output step A7 is the same as in the second embodiment, its detailed description is omitted.
  • The subflow tables configured in the shared memory of each multi-core CPU all have independent contents. Consequently, if an unregistered flow whose flow ID could not be identified is to be registered as a new flow, a flow registration process is required; since the flow search process for one packet is performed by only one processor core of one multi-core CPU, this registration process is independent for each multi-core CPU.
  • the packet itself is received from the network by the packet transmission / reception block 5, but the processing when only the packet header is input is the same as in the first embodiment.
  • The event queue for each multi-core CPU is configured in an area logically separate from the flow table in that CPU's shared memory; events may be written directly from the hash value calculation block 9 into the event queue in the shared memory, or the event queue may be constructed using a dedicated memory or registers inside the hash value calculation block 9.
  • In this embodiment, the hash value is calculated not by each processor core but by the hash value calculation block, and the multi-core CPU holding the flow table to be searched can be identified from the calculated hash value. Therefore, unlike the third embodiment, there is no need to run the flow search process on one processor core of every multi-core CPU; the flow search processing is efficiently distributed across a plurality of processor cores, and flow search performance improves.
  • (Appendix 3) Transmitting and receiving packets; extracting a set of arbitrary fields serving as a search key from the header information of a received packet; generating and issuing a packet reception event; and storing the packet data in the main memory,
  • (Appendix 4) The pattern matching method according to Appendix 3, wherein a packet reception event including the extracted search key is generated and issued, received packet reception events are sequentially assigned to one of a plurality of processor cores of the multi-core CPU, and pattern matching processing is performed by the processor core to which each packet reception event is assigned.
  • (Appendix 6) The pattern matching method according to Appendix 5, wherein a packet reception event including the extracted search key is generated, and the generated packet reception events are sequentially issued to one of a plurality of multi-core CPUs.
  • (Appendix 9) The pattern matching method according to Appendix 8, wherein a packet reception event including the extracted search key is generated, and the generated packet reception event is issued to all of the plurality of multi-core CPUs.
  • (Appendix 11) The pattern matching method according to Appendix 1 or Appendix 2, wherein packets are transmitted and received; a set of arbitrary fields serving as a search key is extracted from the header information of a received packet; a hash value for the extracted search key is calculated, a packet reception event is generated and issued, and the packet data is stored in the main memory; one of the sub-pattern tables, obtained by dividing the pattern table into the same number of parts as there are multi-core CPUs, is configured on the shared memory of each of a plurality of multi-core CPUs; and pattern matching processing is performed by one of the plurality of multi-core CPUs.
  • The shared memory includes a pattern table containing search patterns, and the pattern matching apparatus performs pattern matching processing on a search key, which is the search target, in parallel by the plurality of processor cores.
  • The pattern matching apparatus according to Appendix 14 or Appendix 15, comprising: one multi-core CPU; a packet transmission/reception block for transmitting and receiving packets; a search key extraction block for extracting a set of arbitrary fields serving as a search key from the header information of a received packet, generating and issuing a packet reception event, and storing the packet data in the main memory; a main memory for storing packet data; and a bus connecting the multi-core CPU, the search key extraction block, and the main memory.
  • The pattern matching apparatus according to Appendix 14 or Appendix 15, comprising: a plurality of multi-core CPUs; a packet transmission/reception block for transmitting and receiving packets; a search key extraction block for extracting a set of arbitrary fields serving as a search key from the header information of a received packet, generating and issuing a packet reception event, and storing the packet data in the main memory; a main memory for storing packet data; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, and the main memory; wherein the same pattern table is configured on the shared memory of each of the plurality of multi-core CPUs.
  • The pattern matching apparatus according to Appendix 19, wherein the multi-core CPU that has received a packet reception event sequentially assigns received packet reception events to one of its plurality of processor cores.
  • (Appendix 21) The pattern matching apparatus according to Appendix 14 or Appendix 15, comprising: a plurality of multi-core CPUs; a packet transmission/reception block for transmitting and receiving packets; a search key extraction block for extracting a set of arbitrary fields serving as a search key from the header information of a received packet, generating and issuing a packet reception event, and storing the packet data in the main memory; a main memory for storing packet data; a final result determination block for collecting the pattern matching results of the processor cores in each multi-core CPU and determining the final pattern matching result; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, the main memory, and the final result determination block; wherein one of the sub-pattern tables, obtained by dividing the pattern table into the same number of parts as there are multi-core CPUs, is configured on the shared memory of each of the plurality of multi-core CPUs.
  • (Appendix 22) The pattern matching apparatus according to Appendix 21, wherein a packet reception event including the extracted search key is generated, and the generated packet reception event is issued to all of the plurality of multi-core CPUs.
  • The pattern matching apparatus according to Appendix 22, wherein each of the plurality of multi-core CPUs sequentially assigns received packet reception events to one of its plurality of processor cores.
  • The pattern matching apparatus according to Appendix 14 or Appendix 15, comprising: a plurality of multi-core CPUs; a packet transmission/reception block for transmitting and receiving packets; a search key extraction block for extracting a set of arbitrary fields serving as a search key from the header information of a received packet; a hash value calculation block for calculating a hash value for the extracted search key, generating and issuing a packet reception event, and storing the packet data in the main memory; a main memory for storing packet data; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, the hash value calculation block, and the main memory; wherein one of the sub-pattern tables, obtained by dividing the pattern table into the same number of parts as there are multi-core CPUs, is configured on the shared memory of each of the plurality of multi-core CPUs.
  • the multi-core CPU that has received the packet reception event 26.
  • a network device, such as a switch or router, that identifies the flow to which a packet belongs from a specific set of fields in the packet header and performs flow-specific processing such as QoS processing or load distribution,
  • an appliance device such as a load balancer.
  • Reference numerals: 1, 2, M multi-core CPU; 3 bus; 4, 8 search key extraction block; 5, 10 packet transmission/reception block; 6 main memory; 7 final result determination block; 9 hash value calculation block; 11, 12, 13 CPU; 1-0, 2-0, M-0 shared memory; 11-0, 12-0, 13-0 L2 cache; 1-1, 1-2, ..., 1-N, 2-1, 2-2, ..., 2-N, 11-1, 12-1, 12-2, 13-1, 13-2, M-1, M-2, ..., M-N processor core


Abstract

Disclosed is a pattern-matching device that operates in a multi-core processor environment having a multi-core CPU provided with a plurality of processor cores and a shared memory shared among the processor cores within the same CPU. The shared memory contains a pattern table holding search patterns. The pattern-matching device executes, in parallel across the plurality of processor cores, a pattern-matching process that matches the search target, namely a search key, against those patterns.

Description

Pattern matching method and apparatus in a multiprocessor environment

The present invention relates to a multiprocessor environment composed of a plurality of processors. In particular, it relates to a system in which a plurality of processors and a shared memory shared by those processors form one unit and a plurality of such units exist: a pattern management table is divided across the shared memories of the units, and pattern matching processing is performed in parallel by the plurality of processors in the system.

Pattern matching processing, which determines whether an input character string matches a given string (pattern), is an important elemental technology in the network field. For example, in a network relay device such as a router or switch, the header information of an input packet is used to identify the flow to which the packet belongs (hereinafter, flow search processing). It is also used in a technique called Deep Packet Inspection (DPI), which applies pattern matching not only to packet header information but also to the data in the packet payload, identifying flows in finer detail so as to provide QoS (Quality of Service) for each flow, perform load distribution control, and the like.

In flow search processing, a sequence of packets identified by fields such as the source IP (Internet Protocol) address, destination IP address, and protocol number defined in the IP header of an input packet, together with the source port number and destination port number defined in the TCP (Transmission Control Protocol)/UDP (User Datagram Protocol) header, is generally called a flow. A flow is identified using, for example, a technique called the hash method, or hashing. In the hash method, the set of fields identifying a flow is used as a search key; a computation called a hash function converts the search key into a bit string of a specific length called a hash value, and flow search processing is performed using a table that stores, at the entry position corresponding to the obtained hash value, the search key and a pointer to the next entry. Such a table is called a hash table; in particular, since it stores the search keys that identify flows, it is also called a flow table. The hash method is described in, for example, Patent Document 1, so its details are omitted; FIG. 17 shows a configuration example of a hash table.
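As a concrete illustration of the hash method described above, the sketch below builds a search key from the 5-tuple of header fields and folds it into a fixed-width hash value usable as a flow table index. The use of SHA-1, the key encoding, and all names are illustrative choices, not the hash function of the cited documents.

```python
import hashlib

HASH_BITS = 16  # assumed width of the table index

def make_search_key(src_ip, dst_ip, proto, sport, dport):
    """Concatenate the 5-tuple fields into a single search key."""
    return f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()

def hash_value(search_key):
    """Fold the search key into a hash value in [0, 2**HASH_BITS)."""
    digest = hashlib.sha1(search_key).digest()
    return int.from_bytes(digest[:4], "big") % (2 ** HASH_BITS)
```

Since the hash function is deterministic, all packets of the same flow map to the same flow table entry position.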

FIG. 18 shows a schematic configuration example of an ordinary server architecture including a single CPU (Central Processing Unit) with a single processor core; an overview of flow search processing on this server follows. In the example of FIG. 18, a CPU 11 having a single processor core, a main memory 6 such as a DRAM (Dynamic Random Access Memory), and a packet transmission/reception block 10 including a MAC (Media Access Control) and transmission/reception FIFOs (First-In First-Out) are connected through a bus 3. The CPU 11 includes a processor core 11-1 and a level-2 cache (L2 cache) 11-0. The CPU 11, the main memory 6, and the packet transmission/reception block 10 may be connected through a chipset called a North Bridge or South Bridge; here, these chipsets are also represented as the bus 3.

Since the flow table is generally large, it is constructed on the main memory 6. When a packet is received by the packet transmission/reception block 10, it passes through the MAC and reception FIFO and is written to the main memory 6 using, for example, a DMA (Direct Memory Access) function. When a packet is received, the packet transmission/reception block 10 either raises an interrupt to the processor core 11-1 of the CPU 11 or issues an event to an event queue polled by the processor core 11-1; the processor core 11-1 thereby detects the arrival of the packet and reads its header information from the main memory 6. The processor core 11-1 then derives a hash value from the set of header fields serving as the search key, reads the flow table entry corresponding to the hash value from the main memory 6, and compares the search keys. If the keys match, the packet is judged to belong to the flow corresponding to that entry. If they do not match, the pointer stored in the entry is followed, the next entry is read, and the comparison of the search key against that entry's stored key is repeated, thereby checking whether the packet belongs to any flow in the chain. Processing for the case where the packet's search key matches no entry in the flow table to the end, for example registering a table entry as a new flow, depends on individual system specifications and is omitted here.
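The entry-comparison loop described above (compare the stored search key; on a mismatch, follow the pointer to the next entry) can be sketched as follows. The entry layout and names are illustrative assumptions.

```python
class Entry:
    """One flow table entry: a stored search key, its flow ID, and a
    pointer to the next entry in the collision chain."""
    def __init__(self, key, flow_id, next_entry=None):
        self.key = key
        self.flow_id = flow_id
        self.next = next_entry

def lookup(table, h, search_key):
    """table: mapping from hash value to the first Entry of its chain.
    Returns the flow ID if the key matches some entry, else None
    (an unregistered flow)."""
    entry = table.get(h)
    while entry is not None:
        if entry.key == search_key:   # keys match: flow identified
            return entry.flow_id
        entry = entry.next            # follow the pointer to the next entry
    return None
```

Each iteration of the loop corresponds to one additional main-memory access in the hardware description above, which is why long collision chains degrade search throughput.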

Such pattern matching processing in the network field, particularly flow search processing, is performed for every input packet, so it must be realized with a search throughput high enough to match the bandwidth of the network. Therefore, when a flow table is constructed on the main memory and the flow search is executed in software by the CPU as described above, each search requires multiple accesses to the flow table on the main memory, whose access performance is low, and degraded processing performance becomes a problem. Holding part of the main-memory flow table in a cache memory inside the CPU, for example the L2 cache 11-0 in FIG. 18, can reduce the latency of main-memory accesses. However, in a network, packets belonging to many different flows arrive almost at random, so when each operation must reference different data, as with the table entries in flow search processing, the cache does not function effectively, and a dramatic reduction in main-memory accesses cannot be expected. Similarly, even if, as in Patent Document 2, a table caching part of the hash table is prepared separately from the data cache of the CPU, the same problem is considered to exist.

Meanwhile, multi-core CPUs (multi-core processors) incorporating a plurality of processor cores have appeared in recent years, and servers equipped with several of them are also available. FIG. 19 shows a schematic configuration example of an ordinary server architecture composed of two CPUs each having two processor cores: in the configuration of FIG. 18, the CPU 11 is replaced by a CPU 12 and a CPU 13, each with two processor cores. The CPU 12 includes processor cores 12-1 and 12-2 and an L2 cache 12-0; the CPU 13 includes processor cores 13-1 and 13-2 and an L2 cache 13-0. Such a server has four processor cores, and performing flow search processing in parallel across them improves search performance compared with the configuration of FIG. 18. However, since the flow table is constructed in the main memory, accesses to the slow main memory are still required, and arbitration of access contention among the processor cores accessing the same main memory becomes necessary. The processor cores therefore cannot be used effectively, and a performance problem remains. More generally, the same problem exists in any server equipped with a plurality of CPUs each having a plurality of processor cores. Likewise, although such CPUs have cache memories, the problem described above, which arises when each operation must reference different data, remains unsolved.

Patent Document 3 discloses replacement control for reducing the miss rate of a shared cache in a multi-core processor environment such as that of FIG. 19; Patent Document 4 discloses exclusive control over a shared memory in a multiprocessor environment; and Patent Document 5 discloses a prefetch function for a shared cache in a multi-core processor environment. None of these, however, solves the underlying problem of operations, such as flow search processing, that must reference different data each time. Furthermore, Non-Patent Document 1, targeting longest prefix match (LPM) search, discloses a technique that places part of the data area storing search-key data in a memory region with fast access performance, such as a cache, to reduce the number of accesses to slow memory.

Patent Document 6 discloses, as shown in its FIG. 1, a configuration in which one or more processors, a cache memory, a memory, peripherals, and a packet controller are connected by a system bus (see, for example, paragraph 7).

When the packet controller receives a packet, the pattern matching logic and hash mechanism configured in its packet reception unit compare predefined patterns, held in registers within the pattern matching logic, against addresses and the like in the received packet, to determine whether the packet may be received (see, for example, paragraph 9).

Furthermore, part of each packet permitted to be received is written into the cache memory in the system and held for subsequent packet processing by the processor (see, for example, paragraph 11).

However, as described above, Patent Document 6 has a pattern matching function inside the packet controller, and the pattern matching is performed there; pattern matching is never performed in the individual processors.

Also, as described above, the cache memory of Patent Document 6 holds part of the packet data permitted for reception by the pattern matching, for use in packet processing by the processor; the patterns to be matched (flow information) are never written into the cache memory, and it is never used as a flow management table.

 FIG. 7 of Patent Document 7 discloses a configuration in which a plurality of microengines (processors), memory shared among them (SRAM, DRAM), a packet receiving unit, and other components are connected by a bus (see, for example, paragraph 50). Each microengine contains a local memory and a computation core (processor core).

 Flow state information held in the shared memory (SRAM), namely the patterns and the state-transition information used in pattern matching, is read into the local memory inside a microengine, and the pattern matching is performed by the processor core of that microengine (see, for example, paragraph 55).

 However, as described above, Patent Document 7 has main memories such as SRAM and DRAM that are shared among the processors, but the only faster, cache-like memory is the local memory attached to each individual processor; no cache memory shared by a plurality of processors exists. In other words, there is no cache memory shared by a particular number of processors in addition to the main memory (SRAM, DRAM, etc.) shared by all of them.

 Also, as described above, in the pattern matching process of Patent Document 7, pattern information is held in the main memory shared by the processors, and the cache-like memory (the local memory inside a microengine) merely reads that information from main memory and uses it for the duration of the pattern matching of a given packet. The patterns to be matched (flow information) are not written into a cache memory, and no cache memory is used directly as a flow management table.

Patent Document 1: JP-A-11-53199
Patent Document 2: JP-T-2002-511703
Patent Document 3: JP-A-2002-373115
Patent Document 4: JP-A-2008-293368
Patent Document 5: JP-A-2007-207240
Patent Document 6: JP-T-2007-525037
Patent Document 7: JP-T-2008-544728

Non-Patent Document 1: Butler Lampson, Venkatachary Srinivasan, George Varghese, "IP Lookups Using Multiway and Multicolumn Search", IEEE/ACM Transactions on Networking, Vol. 7, No. 3, 1999, pp. 324-334.

 Pattern matching in a multiprocessor environment in the network field, as described above, must be realized with a search throughput high enough to match the network bandwidth; the problem is that existing approaches cannot provide processing performance that satisfies this requirement.

 The reason is that the flow table is built in main memory, whose access performance is low, and each search requires multiple accesses to that table. Moreover, even if part of the flow table is cached in a memory accessible faster than main memory (such as a cache memory) in order to reduce the number of main-memory accesses, when the flows to which arriving packets belong are random there is little correlation between successive packets, so the probability of a packet matching one of the cached entries is small and the reduction in main-memory accesses is minor.

 An object of the present invention is to provide a pattern matching system that realizes pattern matching with search performance fast enough to match the network bandwidth.

 The first and second pattern matching apparatuses of the present invention comprise: one or more multi-core CPUs, each having a plurality of processor cores and a shared memory shared among those cores (1 in FIG. 1; 1, 2, ..., M in FIG. 6); a packet transmission/reception block that transmits and receives packets (5 in FIGS. 1 and 6); a search key extraction block (4 in FIGS. 1 and 6) that extracts, from each received packet, the header information used as the search key in flow search processing, generates and issues a packet reception event, and writes the packet to main memory; a main memory such as a DRAM (6 in FIGS. 1 and 6) into which packet reception events and, if necessary, the packets themselves are written; and a bus (3 in FIGS. 1 and 6) interconnecting the multi-core CPUs, the search key extraction block, and the main memory. A pattern table based on a hash table is constructed in the shared memory inside the multi-core CPU. With this configuration, the search key extraction block extracts the set of header fields forming the search key from the packet in advance and notifies a processor core of it, and the pattern matching is performed using the pattern table in the shared memory, which can be accessed faster than main memory; the object of the present invention is thereby achieved.

 The third pattern matching apparatus of the present invention adds a final result determination block (7 in FIG. 9) to the second apparatus. Each multi-core CPU holds, in its shared memory, a sub-pattern table obtained by partitioning the single original pattern table. With this configuration, the search key extraction block extracts the search key fields from the packet header in advance and notifies one processor core of each multi-core CPU, so that the pattern matching is performed in a distributed manner, again using the pattern tables in the shared memories, which can be accessed faster than main memory; the object of the present invention is thereby achieved.

 The fourth pattern matching apparatus of the present invention removes the final result determination block of the third apparatus (7 in FIG. 9) and adds a hash value calculation block (9 in FIG. 13). With this configuration, the search key extraction block (8 in FIG. 13) extracts the search key fields from the packet header in advance, and the hash value calculation block (9 in FIG. 13) computes a hash value of the search key; this hash value identifies the multi-core CPU whose pattern table holds the group of pattern entries corresponding to that value. The pattern matching is then performed on one processor core of that multi-core CPU, using the pattern table in its shared memory, which can be accessed faster than main memory; the object of the present invention is thereby achieved.
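As an illustration only (not part of the patent text), the way the hash value calculation block could identify the multi-core CPU holding the relevant pattern entries can be sketched as below. The modulo-M partitioning and the CRC-32 hash are assumptions chosen for the sketch; the embodiment only requires that the hash value of the search key determine the owning CPU.

```python
import zlib

NUM_CPUS = 4  # M multi-core CPUs (assumed value for illustration)

def owning_cpu(search_key: bytes) -> int:
    """Return the index of the multi-core CPU whose sub-pattern table
    holds the pattern entry group for this search key's hash value.
    Partitioning by modulo over CRC-32 is an assumption, not the patent's."""
    return zlib.crc32(search_key) % NUM_CPUS
```

The same key always maps to the same CPU, so the event for a given flow is always dispatched to the CPU that holds that flow's sub-pattern table.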

 According to the present invention, pattern matching with search performance fast enough to match the network bandwidth can be realized.

FIG. 1 is a block diagram showing the configuration of an apparatus according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the apparatus according to the first embodiment.
FIG. 3 is a flowchart showing step A4 in FIG. 2.
FIG. 4 is a diagram showing an example of a flow table.
FIG. 5 is a diagram outlining the operation of the apparatus according to the first embodiment.
FIG. 6 is a block diagram showing the configuration of an apparatus according to the second embodiment of the present invention.
FIG. 7 is a flowchart showing the operation of the apparatus according to the second embodiment.
FIG. 8 is a diagram outlining the operation of the apparatus according to the second embodiment.
FIG. 9 is a block diagram showing the configuration of an apparatus according to the third embodiment of the present invention.
FIG. 10 is a diagram showing an example of partitioning the flow table in the third embodiment.
FIG. 11 is a flowchart showing the operation of the apparatus according to the third embodiment.
FIG. 12 is a diagram outlining the operation of the apparatus according to the third embodiment.
FIG. 13 is a block diagram showing the configuration of an apparatus according to the fourth embodiment of the present invention.
FIG. 14 is a flowchart showing the operation of the apparatus according to the fourth embodiment.
FIG. 15 is a flowchart showing step A13 in FIG. 14.
FIG. 16 is a diagram outlining the operation of the apparatus according to the fourth embodiment.
FIG. 17 is an example configuration of a hash table.
FIG. 18 is an example architecture of software-based flow search processing on a server with an ordinary single CPU (single core).
FIG. 19 is an example architecture of software-based flow search processing on a server with two ordinary CPUs (dual core).

 Hereinafter, embodiments for carrying out the present invention will be described in detail with reference to the drawings.

<Embodiment 1>

 FIG. 1 is a block diagram showing the configuration of the first embodiment of the present invention. Referring to FIG. 1, the first embodiment comprises: a multi-core CPU 1; a packet transmission/reception block 5 that transmits and receives packets; a search key extraction block 4 that extracts, from each received packet, the header information used as the search key in flow search processing, generates and issues packet reception events, and writes packets to main memory; a main memory 6 such as a DRAM into which packet reception events and, if necessary, the packets themselves are written; and a bus 3 interconnecting the multi-core CPU 1, the search key extraction block 4, and the main memory 6.

 The multi-core CPU 1 includes N processor cores 1-1, 1-2, ..., 1-N and a shared memory 1-0 shared by those N cores. The shared memory 1-0 holds the entire flow table, which is organized as a hash table. When this shared memory is also used as a data cache, the flow table is held in a region logically separated from the data cache.

 When the network is Ethernet (registered trademark), the packet transmission/reception block 5 provides the MAC function: it shapes the input bit stream into Ethernet frames and then outputs them to the search key extraction block 4. Since such a packet transmission/reception block is well known to those skilled in the art, its detailed configuration is omitted.

 The search key extraction block 4 receives packets from the packet transmission/reception block 5 and extracts the header information that serves as the search key in flow search processing. It generates a packet reception event containing the extracted search key, writes it to the event queue for the multi-core CPU 1, and writes the packet (after key extraction) to main memory; a DMA function is generally used for this write. When a packet is transmitted, this block reads the packet data from main memory and outputs it unchanged to the packet transmission/reception block 5.

 Although FIG. 1 shows the multi-core CPU 1, the search key extraction block 4, and the main memory 6 connected by the bus 3, they may instead be connected through a chipset such as the North Bridge or South Bridge provided in servers, or in a crossbar or ring bus topology. The present embodiment may be built on a motherboard such as a server motherboard, or everything except the main memory 6 may be integrated on a single chip as one SoC (System-on-a-Chip).

 Next, the operation of the first embodiment of the present invention will be described in detail with reference to FIG. 1 and the flowchart of FIG. 2.

 A packet arriving from the network is received by the packet transmission/reception block 5. When the network is Ethernet (registered trademark), for example, the received data is shaped into an Ethernet frame (herein, frames and packets are collectively called packets) and then output to the search key extraction block 4 (step A1).

 On receiving a packet from the packet transmission/reception block 5, the search key extraction block 4 extracts predetermined fields of the packet header as the search key for flow search processing, assigns the packet an identifier unique within the apparatus, and writes them to a FIFO contained in the search key extraction block 4 (step A2). As the packet header fields used for the search key, for example, the set of five fields consisting of the source IP address (32 bits), destination IP address (32 bits), and protocol number (8 bits) defined in the IP header together with the source port number (16 bits) and destination port number (16 bits) defined in the TCP/UDP header may be used; in this case the search key is 104 bits (13 bytes) long.
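To make step A2 concrete, the extraction of the five-field, 13-byte search key can be sketched as follows. This is an illustrative sketch, not part of the patent text; it assumes a plain IPv4 packet carrying TCP or UDP, with the header length taken from the IHL field.

```python
import struct

def extract_search_key(packet: bytes) -> bytes:
    """Extract the 104-bit (13-byte) search key from a raw IPv4 packet:
    source IP (32b) + destination IP (32b) + protocol (8b)
    + source port (16b) + destination port (16b)."""
    ihl = (packet[0] & 0x0F) * 4      # IPv4 header length in bytes (IHL field)
    proto = packet[9]                 # protocol number (6 = TCP, 17 = UDP)
    src_ip = packet[12:16]
    dst_ip = packet[16:20]
    # The first four bytes of the TCP/UDP header are the two port numbers.
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    return src_ip + dst_ip + bytes([proto]) + struct.pack("!HH", src_port, dst_port)
```

The resulting 13-byte value is what the search key extraction block would place in the packet reception event, alongside the packet identifier.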

 Next, the search key extraction block 4 generates and issues an event notifying the multi-core CPU 1 of the packet reception, writes it to the event queue, and writes the corresponding packet to the main memory 6 (step A3). The generated event need only contain the extracted search key and the identifier of the corresponding packet; if the event queue is located in the main memory 6, for example, the event is written to that queue. When the event queue resides in the main memory 6, the event queue region for the multi-core CPU 1 and the region into which packet data is written are logically separated, and their base addresses are assumed to be known to the multi-core CPU 1.

 Meanwhile, the multi-core CPU 1 monitors the event queue by polling; when an event is present, it reads the event at the head of the queue, and a particular processor core performs the flow search processing (step A4).

 FIG. 3 is a flowchart explaining the operation of step A4 in more detail.

 When an event exists in the event queue, the multi-core CPU 1 reads the head event and assigns it to a particular processor core (step B1). In a CPU with a plurality of processor cores, such as the multi-core CPU 1, one approach is to dedicate a specific core (for example, processor core 1-1) to polling; that core reads events from the event queue and assigns them to other idle cores. Alternatively, each processor core may itself poll the event queue and take events directly; in the latter case, because all cores access the memory region holding the same event queue, arbitration among the cores is required. In this embodiment an existing technique is used for this event dispatch, and it suffices that events are distributed evenly over the processor cores (excluding any core dedicated to polling). In the following, for simplicity, the core to which an event is assigned is denoted processor core 1-k (k = 1, ..., N), and the processing of processor core 1-k is described.

 Processor core 1-k, to which the event has been assigned, computes a hash value of the search key stored in the event using a preset hash function (step B2). Hash functions and hash values are well known to those skilled in the art, so a detailed description is omitted.

 Next, processor core 1-k uses the computed hash value as an address value (step B3) and reads an entry from the flow table (step B4). FIG. 4 shows an example of the flow table constructed in the shared memory 1-0. The flow table is based on a hash table, and multiple entries sharing the same hash value are linked in a doubly-linked list. Each entry holds a search key, a flow ID identifying the flow, an address value serving as a pointer to the next entry in the list, and a pointer to the previous entry. If the combined length of the search key, flow ID, next-entry pointer, and previous-entry pointer shown in FIG. 4 exceeds the word length of the shared memory 1-0, one entry occupies multiple consecutive words; in that case the address value means the logical address of the first word of each entry (for example, the lower bits of the physical address are ignored). Entries are basically added to free space and managed via the doubly-linked list. On the other hand, when an entry is to be added at an address value (that is, a hash value) at which another entry is already stored, the entry whose own hash value equals that address has higher priority, and the entry already stored there is moved to another free area. Note that FIG. 4 is only an example: to allow for collisions, the hash value may be computed over a range smaller than the logical address space of entry head words, and the flow table may follow any other existing hash-table organization. If the flow table is organized so that entries are never relocated, the address value itself can serve as the flow ID, so the flow ID need not be held in the entry.

 Having read an entry from the flow table, processor core 1-k checks whether the search key of the read entry matches the search key in the event (step B5). If they match, it obtains the flow ID held in the entry (step B6) and ends the flow search processing (step A4). If they do not match, it checks whether a next linked entry exists (step B7); if one does, it takes that entry's next pointer as the address value (step B8) and reads the next entry (step B4), repeating this process to carry out the flow search. If no next entry exists, the corresponding flow is not registered, so the result is judged to be "flow unregistered" (step B9) and the flow search processing (step A4) ends.

 Finally, processor core 1-k outputs the result obtained in the flow search processing (step A4) (step A5). In general, flow search is performed in order to identify the flow to which a packet belongs and to apply per-flow processing to it, as in QoS processing; it is therefore assumed that processing blocks, CPUs, or the like performing such processing are attached via the bus 3 or some other communication means. Because this embodiment focuses on the flow search processing itself, those processing blocks are omitted, but the result output in step A5 is assumed to go to such a per-flow processing block. In the description above, if no entry with a matching search key existed, the search ended with the judgment "flow unregistered"; alternatively, processing may be performed to register that flow as a new flow. Similarly, flows already registered as entries can be deleted. Because these operations depend heavily on the organization of the hash table serving as the flow table, their details are omitted.

 Furthermore, as for transmitting a packet after some processing has been applied in such a per-flow processing block, it is assumed that the search key extraction block 4 reads the packet from the main memory 6 and outputs the packet data to the packet transmission/reception block 5, which then transmits it; since this embodiment focuses on flow search processing, the details are omitted.

 Alternatively, although in this embodiment the packet itself is received from the network by the packet transmission/reception block 5, a case is also conceivable in which only packet headers are input to the flow search apparatus of this embodiment. In that case, the packet transmission/reception block 5 receives an externally supplied packet header and outputs it to the search key extraction block 4, which extracts the set of search key fields from the header and generates and issues an event. Since no packet payload data is received in this case, no packet data need be written to the main memory 6.

 Furthermore, although this embodiment assumes that the event queue for the multi-core CPU 1 is located in the main memory 6, the queue may instead be placed in the shared memory 1-0 of the multi-core CPU 1, in a region logically separated from the flow table, with the search key extraction block 4 writing events directly into it; or it may be built from memory or registers inside the search key extraction block. In any case, any region identifiable in the address space visible to the multi-core CPU 1 suffices.

 Finally, the processing outline of this embodiment is explained using FIG. 5. As shown in FIG. 5, when packets pkt1, pkt2, ... arrive, the search keys Key#1, Key#2, ... are extracted in order, and each key is assigned to one of the processor cores in the multi-core CPU for flow search processing. The keys may be assigned to the cores sequentially, for example by a round-robin scheme, but the assignment method is not limited to this.
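As a minimal sketch of the round-robin assignment mentioned above (one possible scheme, not mandated by the embodiment), extracted keys can be mapped to core indices as follows:

```python
from itertools import cycle

def dispatch(keys, num_cores):
    """Assign extracted search keys to processor cores in round-robin order.
    Returns (key, core_index) pairs; core 0 would be skipped if it were
    dedicated to polling, an option this sketch does not model."""
    cores = cycle(range(num_cores))
    return [(key, next(cores)) for key in keys]
```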

 Next, the effects of the first embodiment of the present invention are described.

 As described above, in this embodiment the search key extraction block extracts the set of header fields forming the search key from the packet in advance and notifies the processor core of it; since the core then has no need to read the packet header stored in main memory, the processing performance of the flow search is improved.

 Also, in this embodiment the flow table is constructed in a shared memory shared by the processor cores of the multi-core CPU; the access latency of reading entries is greatly reduced compared with a flow table constructed in main memory such as a DRAM, so the processing performance of the flow search can be improved.

 From the above, it is possible to provide a pattern matching system that realizes pattern matching with search performance fast enough to match the network bandwidth.

 Next, a second embodiment of the present invention will be described in detail with reference to the drawings.

<Embodiment 2>

 FIG. 6 is a block diagram showing the configuration of the second embodiment of the present invention. Referring to FIG. 6, the second embodiment comprises: multi-core CPUs 1, 2, ..., M; a packet transmission/reception block 5 that transmits and receives packets; a search key extraction block 4 that extracts, from each received packet, the header information used as the search key in flow search processing, generates and issues packet reception events, and writes packets to main memory; a main memory 6 such as a DRAM into which packet reception events and, if necessary, the packets themselves are written; and a bus 3 connecting the multi-core CPUs 1 through M, the search key extraction block 4, and the main memory 6. This embodiment adds the multi-core CPUs 2, ..., M to the first embodiment shown in FIG. 1; the rest is the same as the first embodiment.

 The multi-core CPU 2 has the same form as the multi-core CPU 1: it includes N processor cores 2-1, 2-2, ..., 2-N and a shared memory 2-0 shared by the N processor cores, and the shared memory 2-0 holds a flow table configured as a hash table. This form is the same for the other multi-core CPUs, and the flow tables configured in the shared memories of the multi-core CPUs all have the same contents. When the shared memory is used as a data cache, the flow table is held in an area logically separated from the data cache.
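The hash-table flow table held in each shared memory can be modeled roughly as below; the bucket count, the stand-in hash function, and the 5-tuple key layout are assumptions of this sketch, not details fixed by the text:

```python
NUM_BUCKETS = 256  # assumed table size for the sketch

def bucket_index(search_key):
    # Stand-in hash over the header-field tuple; the actual hash
    # function used in the embodiments is not reproduced here.
    return sum(search_key) % NUM_BUCKETS

class FlowTable:
    """Hash-table-based flow table with per-bucket collision chains."""
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def register(self, search_key, flow_id):
        self.buckets[bucket_index(search_key)].append((search_key, flow_id))

    def lookup(self, search_key):
        # Compare stored keys against the query key to resolve collisions.
        for key, flow_id in self.buckets[bucket_index(search_key)]:
            if key == search_key:
                return flow_id
        return None  # unregistered flow

table = FlowTable()
table.register((167772161, 167772162, 1234, 80, 6), 7)  # e.g. a TCP 5-tuple
```

A lookup that finds no matching entry returns the unregistered-flow outcome, which the later steps of the embodiments handle separately.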

 As shown in FIG. 6, in this embodiment, as in the first embodiment, multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M, the search key extraction block 4, and the main memory 6 are connected by the bus 3; however, they may instead be connected through a chipset, called a North Bridge or South Bridge, provided in a server or the like, or in a crossbar or ring-bus form. This embodiment may be built on the motherboard of a server or the like, or everything except the main memory 6 may be integrated on a single chip as one SoC.

 Next, the operation of the second embodiment of the present invention will be described in detail with reference to FIG. 6 and the flowchart of FIG. 7.

 The flowchart of FIG. 7 is a modification of the flowchart of the first embodiment in which step A3 is replaced with step A6 and step A5 with step A7. The other steps are the same as in the first embodiment, so their detailed description is omitted.

 The search key extraction block 4 receives a packet input from the network via the packet transmission/reception block 5 (step A1) and performs the processing of step A2, such as search key extraction.

 Next, the search key extraction block 4 generates and issues an event for notifying one of the multi-core CPUs of packet reception, writes it to an event queue, and writes the corresponding packet to the main memory 6 (step A6). As in the first embodiment, the generated event only needs to include the extracted search key information and an identifier of the packet. When the event queues are configured in the main memory 6, for example, the event is written to the event queue in the main memory 6; the event queue area of each multi-core CPU is logically separated from the area into which packet data is written, and the base address of each multi-core CPU's event queue area is assumed to be known to that multi-core CPU. The event is issued to only one of the multi-core CPUs, and the multi-core CPU to which the event is issued may be selected, for example, in round-robin order (multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M, multi-core CPU 1, ...) so as to even out the flow search load across the multi-core CPUs, but the selection method is not limited to this.

 Meanwhile, each of the M multi-core CPUs monitors, by polling, whether an event exists in its own event queue; if one does, it reads the event at the head of the queue and performs the flow search process on one of its processor cores (step A4). Step A4 is performed independently by each multi-core CPU, and its content is the same as in the first embodiment, so a detailed description is omitted here.
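The polling loop each multi-core CPU runs can be sketched as follows; the event layout (a dict carrying `search_key` and `packet_id`) and the queue type are assumptions of this sketch:

```python
from collections import deque

def poll_once(event_queue, flow_search):
    """One polling step: if the queue is non-empty, remove the head event
    and run the flow search for its search key; otherwise do nothing."""
    if not event_queue:
        return None  # no event this round; keep polling
    event = event_queue.popleft()  # read the event at the head of the queue
    return flow_search(event["search_key"])

queue = deque([{"search_key": "Key#1", "packet_id": 1}])
result = poll_once(queue, lambda key: "flow-for-" + key)
```

In the embodiment the `flow_search` callback would be the hash-table lookup performed by one of that CPU's processor cores.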

 Finally, the processor core in the multi-core CPU that performed the actual flow search process outputs the result obtained there (step A7). This differs from step A5 in the first embodiment in that, whereas in the first embodiment the processor cores of one multi-core CPU performed this processing, here the result is output by whichever processor core, among those of the multiple multi-core CPUs, actually performed the flow search. The rest of step A7 is the same as step A5, so the details are omitted.

 In this embodiment, the flow tables configured in the shared memories of the multi-core CPUs have the same contents in all multi-core CPUs. Therefore, when an unregistered flow is to be registered as a new flow, an update made to the flow table by a processor core in one multi-core CPU must also be applied to the flow tables in the other multi-core CPUs.

 Also, in this embodiment, the packet itself is received from the network by the packet transmission/reception block 5, but the processing when only the packet header is input is the same as in the first embodiment, and the possible locations of the event queues (the main memory 6, the shared memory in each multi-core CPU, specific memory or registers in the search key extraction block 4, and so on) are also the same as in the first embodiment, so the details are omitted.

 Finally, an overview of the processing in the present embodiment described above will be given with reference to FIG. 8. As shown in FIG. 8, in the flow search processing of this embodiment, when packets pkt1, pkt2, ... arrive, the search keys Key#1, Key#2, ... are extracted in order, and each search key is sent to one of the multiple multi-core CPUs. Each sent search key is assigned to one of the processor cores in that multi-core CPU, and the flow search process is performed by that processor core. The assignment to the processor cores in each multi-core CPU may use a technique such as round-robin, for example, but is not limited to this.

 Next, the operation and effects of the second embodiment of the present invention will be described.

 As described above, in this embodiment, in addition to the effects of the first embodiment, there are multiple multi-core CPUs and the flow search processing is distributed across them, so the processing performance of the flow search improves further in proportion to the added multi-core CPUs. Moreover, since distributed processing across multiple multi-core CPUs is possible, contention for access to the shared memory among the processor cores within one multi-core CPU also becomes less frequent, enabling more efficient flow search processing.

 Next, a third embodiment of the present invention will be described in detail with reference to the drawings.
<Embodiment 3>
 FIG. 9 is a block diagram showing the configuration of the third embodiment of the present invention. Referring to FIG. 9, the third embodiment includes multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M; a packet transmission/reception block 5 that transmits and receives packets; a search key extraction block 4 that extracts, from a received packet, the header information serving as the search key in the flow search process, generates and issues packet reception events, and writes packets to the main memory; a final result determination block 7 that collects the results of the individual flow search processes in the multiple multi-core CPUs and selects the final flow search result; a main memory 6, such as a DRAM, into which packet reception events and, if necessary, the packets themselves are written; and a bus 3 that interconnects multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M, the search key extraction block 4, the main memory 6, and the final result determination block 7. This embodiment is the second embodiment shown in FIG. 6 with the final result determination block 7 added, and is otherwise the same as the second embodiment.

 In the second embodiment, the same flow table was configured in the shared memory of every multi-core CPU, but in this embodiment each shared memory holds one of M parts into which the original flow table is divided. When the shared memory is used as a data cache, the flow table is held in an area logically separated from the data cache. FIG. 10 shows an example of the division of the flow table in this embodiment. As shown in FIG. 10, the original flow table is divided into M parts, and each sub-flow table k (k = 1, ..., M) is configured in the shared memory of multi-core CPU k (k = 1, ..., M). Each sub-flow table is basically a hash-table-based flow table, as in the first and second embodiments, so a description of the contents of each entry is omitted; however, each sub-flow table is completely self-contained. That is, a flow whose hash value is assigned to a given sub-flow table is stored in a free entry area within that same sub-flow table. Therefore, if, to allow for collisions, the hash value is computed over a range smaller than the logical address space indicating the first word of an entry, each sub-flow table must be configured so that this property holds, which is explained as follows.

 For example, suppose that in the original flow table, to allow for collisions, the hash value is computed over a range smaller than the logical address space indicating the first word of an entry, and the entry areas that cannot be addressed by a hash value are located at the high end of the address range. If this table is simply divided into sub-flow tables, the entry areas that cannot be addressed by a hash value end up concentrated in, say, the M-th sub-flow table. In that case, collisions occur frequently in a sub-flow table that has only hash-addressable entry areas, while a sub-flow table that has many non-hash-addressable entry areas, or only such areas, cannot use its entry areas effectively, so an efficient flow table cannot be constructed. For this reason, collision-tolerant entry areas must be reserved per sub-flow table.
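The per-sub-table collision headroom described above can be illustrated as follows; the hash range, the value of M, and the spare-entry count are arbitrary example values chosen for the sketch:

```python
HASH_RANGE = 1024     # assumed hash-value range of the original table
M = 4                 # number of multi-core CPUs / sub-flow tables
SPARE_PER_TABLE = 32  # per-sub-table entries reserved for collisions

def subtable_for_hash(h):
    """Sub-flow table (1-based, matching sub-flow table k) owning hash h,
    with the hash range split evenly across the M tables."""
    return h // (HASH_RANGE // M) + 1

def subtable_capacity():
    """Hash-addressable entries plus this table's own collision headroom;
    the spare entries are reserved per sub-table, not pooled globally."""
    return HASH_RANGE // M + SPARE_PER_TABLE
```

Reserving the spare entries per sub-table, rather than leaving them concentrated at the end of the original table, is what keeps every sub-flow table independently able to absorb collisions.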

 As shown in FIG. 9, in this embodiment, as in the first and second embodiments, multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M, the search key extraction block 4, the main memory 6, and the final result determination block 7 are connected by the bus 3; however, they may instead be connected through a chipset, called a North Bridge or South Bridge, provided in a server or the like, or in a crossbar or ring-bus form. This embodiment may be built on the motherboard of a server or the like, or everything except the main memory 6 may be integrated on a single chip as one SoC.

 Next, the operation of the third embodiment of the present invention will be described in detail with reference to FIG. 9 and the flowchart of FIG. 11.

 The flowchart of FIG. 11 is a modification of the flowchart of the second embodiment shown in FIG. 7, in which step A6 is replaced with step A8, step A7 with step A10, and a final result determination step (step A9) is added between the flow search process (step A4) and the result output step (step A10). The other steps are the same as in the first and second embodiments, so their detailed description is omitted.

 The search key extraction block 4 receives a packet input from the network via the packet transmission/reception block 5 (step A1) and performs the processing of step A2, such as search key extraction.

 Next, the search key extraction block 4 generates and issues an event for notifying all of the multi-core CPUs of packet reception, writes it to the event queues, and writes the corresponding packet to the main memory 6 (step A8). The information contained in the generated event and the event queue of each multi-core CPU are the same as in the second embodiment, so the details are omitted.

 Meanwhile, each of the M multi-core CPUs monitors, by polling, whether an event exists in its own event queue; if one does, it reads the event at the head of the queue and performs the flow search process on one of its processor cores (step A4). Step A4 is performed independently by each multi-core CPU, and its content is the same as in the first and second embodiments, so a detailed description is omitted here.

 Among the processor cores in each multi-core CPU, the core that performed the flow search process outputs the result obtained there, together with an identifier identifying the processed packet, to the final result determination block 7. When all M search results for the same packet, from the M multi-core CPUs, are available, the final result determination block 7 checks whether any of them identified a flow ID; if one did, that result is output as the final result. If none of the search results identified a flow ID, the final result is output as an unregistered flow (steps A9 and A10). The output destination of this final result is the same as in the first and second embodiments, so the details are omitted. Furthermore, while the above confirms and determines the final result once the search results from all M multi-core CPUs are available, it is also possible to prepare a variable, register, or the like that identifies the final result for the packet being processed and to check the results one by one as each multi-core CPU returns its search result.
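The decision rule of step A9 can be sketched as follows; the encoding of a per-CPU miss as `None` and of the final unregistered outcome as a string are assumptions of this sketch:

```python
UNREGISTERED = "unregistered"

def decide_final_result(results):
    """results: the M per-CPU outcomes for one packet, where None means
    that CPU's sub-flow table held no matching entry. Because the
    sub-flow tables are disjoint, at most one CPU can report a flow ID."""
    for flow_id in results:
        if flow_id is not None:
            return flow_id
    return UNREGISTERED
```

The incremental variant mentioned in the text would call the same check each time a result arrives instead of waiting for all M.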

 In this embodiment, the sub-flow tables configured in the shared memories of the multi-core CPUs all have independent contents. Therefore, when the final result determination block 7 registers as a new flow an unregistered flow for which no flow ID could be identified, the flow must be registered in the appropriate sub-flow table among those in the multi-core CPUs.

 Also, in this embodiment, the packet itself is received from the network by the packet transmission/reception block 5, but the processing when only the packet header is input is the same as in the first embodiment, and the possible locations of the event queues (the main memory 6, the shared memory in each multi-core CPU, specific memory or registers in the search key extraction block 4, and so on) are also the same as in the first and second embodiments, so the details are omitted.

 Finally, an overview of the processing in the present embodiment described above will be given with reference to FIG. 12. As shown in FIG. 12, in the flow search processing of this embodiment, when packets pkt1, pkt2, ... arrive, the search keys Key#1, Key#2, ... are extracted in order, and each search key is sent to all of the multi-core CPUs. Each sent search key is assigned to one of the processor cores in each multi-core CPU, the flow search process is performed by that processor core, and finally the final result is determined by the final result determination block. The assignment to the processor cores in each multi-core CPU may use a technique such as round-robin, for example, but is not limited to this.

 Next, the operation and effects of the third embodiment of the present invention will be described.

 As described above, in this embodiment, unlike the second embodiment, the flow tables in the multiple multi-core CPUs all have independent contents. Distributed processing across the multi-core CPUs is therefore not possible, but in addition to the effects of the first embodiment, the number of flow entries is M times that of the second embodiment, so more flows can be registered, managed, and searched.

 Next, a fourth embodiment of the present invention will be described in detail with reference to the drawings.
<Embodiment 4>
 FIG. 13 is a block diagram showing the configuration of the fourth embodiment of the present invention. Referring to FIG. 13, the fourth embodiment includes multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M; a packet transmission/reception block 5 that transmits and receives packets; a search key extraction block 8 that extracts, from a received packet, the header information serving as the search key in the flow search process; a hash value calculation block 9 that calculates a hash value for the extracted search key, generates and issues packet reception events, and writes packets to the main memory; a main memory 6, such as a DRAM, into which packet reception events and, if necessary, the packets themselves are written; and a bus 3 that interconnects multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M, the search key extraction block 8, the hash value calculation block 9, and the main memory 6. This embodiment is the third embodiment shown in FIG. 9 with the final result determination block 7 removed, the search key extraction block 4 changed to the search key extraction block 8, and the hash value calculation block 9 added, and is otherwise the same as the second embodiment.

 Also, in the shared memory of each multi-core CPU in this embodiment, one of M parts into which the original flow table is divided is configured, as in the third embodiment. This flow table is the same as in the third embodiment, so its detailed description is omitted.

 In FIG. 13 as well, as in the first, second, and third embodiments, multi-core CPU 1, multi-core CPU 2, ..., multi-core CPU M, the hash value calculation block 9, and the main memory 6 are connected by the bus 3; however, they may instead be connected through a chipset, called a North Bridge or South Bridge, provided in a server or the like, or in a crossbar or ring-bus form. This embodiment may be built on the motherboard of a server or the like, or everything except the main memory 6 may be integrated on a single chip as one SoC.

 Next, the operation of the fourth embodiment of the present invention will be described in detail with reference to FIG. 13 and the flowchart of FIG. 14.

 Steps A1 and A2 in the flowchart of FIG. 14 are the same as steps A1 and A2 in the first, second, and third embodiments, so their detailed description is omitted: the search key extraction block 8 receives a packet input from the network via the packet transmission/reception block 5 (step A1) and performs the processing of step A2, such as search key extraction.

 Next, the search key extraction block 8 outputs the extracted search key and the packet received from the packet transmission/reception block 5 to the hash value calculation block 9, and the hash value calculation block 9 calculates a hash value for the received search key (step A11). The hash value calculated in the hash value calculation block 9 is the same as the hash value calculated in step B2 of the flow search process (step A4) of the first, second, and third embodiments, so its detailed description is omitted.

 After calculating the hash value, the hash value calculation block 9 generates and issues an event for notifying one of the multi-core CPUs of packet reception, writes it to the event queue, and writes the corresponding packet to the main memory 6 (step A12). The generated event only needs to contain, in addition to the extracted search key information and the identifier of the corresponding packet, the calculated hash value; the configuration of the event queue of each multi-core CPU is the same as in the third embodiment.

 Also, the event generated and issued by the hash value calculation block 9 goes to one of the M multi-core CPUs. The method of selecting the multi-core CPU to which the event is issued is described below. As mentioned above, each multi-core CPU in this embodiment, as in the third embodiment, holds in its shared memory one of the M sub-flow tables into which the original flow table is divided. The base address covered by each sub-flow table is set in the hash value calculation block 9 in advance. Therefore, once the hash value calculation block 9 has calculated the hash value, it can determine in which sub-flow table, among the M multi-core CPUs, the entry group corresponding to that hash value is registered. According to the calculated hash value, the hash value calculation block 9 issues the packet reception event to the multi-core CPU in which the entry group corresponding to that hash value is registered.
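The hash-based selection of the target CPU can be sketched as follows; the base-address list stands in for the preconfigured sub-flow-table ranges, and all names and values are illustrative assumptions:

```python
SUBTABLE_BASE = [0, 256, 512, 768]  # assumed first hash value of each CPU's slice

def target_cpu(hash_value):
    """Index of the multi-core CPU whose sub-flow table covers hash_value:
    the highest base address not exceeding the hash wins."""
    for cpu in range(len(SUBTABLE_BASE) - 1, -1, -1):
        if hash_value >= SUBTABLE_BASE[cpu]:
            return cpu
    raise ValueError("hash value below the first base address")

def make_event(search_key, packet_id, hash_value):
    """Reception event carrying the key, the packet identifier, and the
    precomputed hash, as step A12 requires."""
    return {"search_key": search_key, "packet_id": packet_id, "hash": hash_value}

event = make_event("Key#1", packet_id=1, hash_value=300)
```

Because the hash alone identifies the single CPU that can hold the matching entry, the event is issued to that CPU only, with no broadcast and no final-result arbitration.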

 Meanwhile, each of the M multi-core CPUs monitors, by polling, whether an event exists in its own event queue; if one does, it reads the event at the head of the queue and performs the flow search process on one of its processor cores (step A13).

 FIG. 15 is a flowchart for explaining the operation of step A13 in more detail. This flow search process is common to the processor cores that perform the flow search process in the M multi-core CPUs, and is described below as the processing of an arbitrary processor core in an arbitrary multi-core CPU.

 Referring to FIG. 15, the flow search process (step A13) in this embodiment is the processing flow of the flow search process (step A4) of the first, second, and third embodiments with the hash value calculation (step B2) removed; as described above, the received event contains the hash value calculated in the hash value calculation block 9. In the flow search process of this embodiment, the hash value contained in the received event is used in place of the hash value that would have been calculated in step B2 of the first, second, and third embodiments. The other details of the flow search process (step A13) in this embodiment are the same as the flow search process (step A4) of the first, second, and third embodiments, so their detailed description is omitted.

 When the flow search process (step A13) ends, the processor core outputs its result (step A7). This result output step A7 is the same as in the second embodiment, so a detailed description is omitted. In this embodiment, the sub-flow tables held in the shared memory of the multi-core CPUs all have mutually independent contents. Consequently, when an unregistered flow whose flow ID could not be identified is to be registered as a new flow, a flow registration process is required; unlike in the third embodiment, however, the flow search for a given packet is performed by only one processor core of one multi-core CPU, so the registration process is independent for each multi-core CPU.

 In this embodiment the packet itself is received from the network by the packet transmission/reception block 5, but the processing when only the packet header is input is the same as in the first embodiment. Furthermore, the event queue for each multi-core CPU may be placed in that CPU's shared memory, in an area logically separated from the flow table, with the hash value calculation block 9 writing events directly into it, or it may be built from dedicated memory and registers inside the hash value calculation block 9. In either case, there is no problem as long as the area can be identified, in the address space visible to each multi-core CPU, as the event queue addressed to that CPU.

 Finally, an overview of the processing of this embodiment is described with reference to FIG. 16. In the flow search process of this embodiment, as shown in FIG. 16, when packets pkt1, pkt2, ... arrive, the search keys Key#1, Key#2, ... are extracted in order and a hash value is computed for each search key. Next, according to the computed hash value, the search key and the hash value are sent to the multi-core CPU in which the entry group for that hash value is registered. The delivered search key and hash value are assigned to one of the processor cores within that multi-core CPU, and that processor core performs the flow search process. The assignment to the processor cores within each multi-core CPU may use, for example, a round-robin scheme, but is not limited to this.
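The two-stage dispatch described above can be sketched as follows. The hash function (CRC-32), the CPU and core counts, and the round-robin policy are illustrative assumptions; the patent only requires that the hash identify the CPU holding the matching entry group, and names round-robin as one possible core-assignment scheme.

```python
import zlib
from itertools import cycle

M = 4        # number of multi-core CPUs (illustrative)
N_CORES = 2  # processor cores per CPU (illustrative)

# One round-robin selector per CPU; round-robin is only one possible policy.
core_selector = [cycle(range(N_CORES)) for _ in range(M)]

def dispatch(search_key: bytes):
    # 1. Compute a hash over the search key extracted from the packet header.
    h = zlib.crc32(search_key)
    # 2. The hash selects the multi-core CPU whose sub-table registers
    #    the entry group for this hash value.
    cpu = h % M
    # 3. Within that CPU, assign the event to one processor core (round-robin here).
    core = next(core_selector[cpu])
    return cpu, core, h

for key in (b"Key#1", b"Key#2", b"Key#1"):
    print(dispatch(key))
```

Note that the CPU choice is a pure function of the key, so all packets of one flow reach the same sub-flow table, while the core choice may vary per event.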

 Next, the effects of the fourth embodiment of the present invention are described.

 As described above, in this embodiment, as in the third embodiment, the flow tables in the multi-core CPUs all have mutually independent contents, so more flows can be registered, managed, and searched. In addition, in this embodiment the hash value is computed not by each processor core but by the hash value calculation block, and the computed hash value identifies the multi-core CPU holding the flow table in which the flow search should be performed. Therefore, unlike in the third embodiment, it is unnecessary to run the flow search on one processor core of every multi-core CPU; the flow searches are distributed efficiently over the processor cores, which improves flow search performance.

 Part or all of the above embodiments can also be described as in the following supplementary notes, but are not limited to them.

 (Supplementary note 1) A pattern matching method in a multi-core processor environment having a multi-core CPU composed of a plurality of processor cores and a shared memory shared among the processor cores within the same CPU, the method comprising: constructing a pattern table containing search patterns in the shared memory; and executing, in parallel on the plurality of processor cores, pattern matching of a search key to be looked up.

 (Supplementary note 2) The pattern matching method according to supplementary note 1, wherein the pattern table is configured as a hash table.

 (Supplementary note 3) The pattern matching method according to supplementary note 1 or 2, comprising: transmitting and receiving packets; extracting, from the header information of a received packet, a set of arbitrary fields serving as a search key; generating and issuing a packet reception event while storing the packet data in a main memory; and performing the pattern matching process on one multi-core CPU.

 (Supplementary note 4) The pattern matching method according to supplementary note 3, comprising: generating and issuing a packet reception event containing the extracted search key; sequentially assigning received packet reception events to one of the plurality of processor cores of the multi-core CPU; and performing the pattern matching process on the processor core to which the packet reception event was assigned.

 (Supplementary note 5) The pattern matching method according to supplementary note 1 or 2, comprising: transmitting and receiving packets; extracting, from the header information of a received packet, a set of arbitrary fields serving as a search key; generating and issuing a packet reception event while storing the packet data in a main memory; constructing the same pattern table in the shared memory of each of a plurality of multi-core CPUs; and performing the pattern matching process on one of the plurality of multi-core CPUs.

 (Supplementary note 6) The pattern matching method according to supplementary note 5, wherein a packet reception event containing the extracted search key is generated and is issued sequentially to one of the plurality of multi-core CPUs.

 (Supplementary note 7) The pattern matching method according to supplementary note 6, wherein the multi-core CPU that receives the packet reception event sequentially assigns it to one of the plurality of processor cores of that multi-core CPU.

 (Supplementary note 8) The pattern matching method according to supplementary note 1 or 2, comprising: transmitting and receiving packets; extracting, from the header information of a received packet, a set of arbitrary fields serving as a search key; generating and issuing a packet reception event while storing the packet data in a main memory; constructing, in the shared memory of each of a plurality of multi-core CPUs, one of the sub-pattern tables obtained by dividing the pattern table into as many parts as there are multi-core CPUs; performing the pattern matching process on all of the plurality of multi-core CPUs; and collecting the pattern matching results produced by the processor cores of each multi-core CPU to determine the final pattern matching result.

 (Supplementary note 9) The pattern matching method according to supplementary note 8, wherein a packet reception event containing the extracted search key is generated and is issued to all of the plurality of multi-core CPUs.

 (Supplementary note 10) The pattern matching method according to supplementary note 9, wherein each of the plurality of multi-core CPUs sequentially assigns a received packet reception event to one of the plurality of processor cores of that multi-core CPU.

 (Supplementary note 11) The pattern matching method according to supplementary note 1 or 2, comprising: transmitting and receiving packets; extracting, from the header information of a received packet, a set of arbitrary fields serving as a search key; calculating a hash value for the extracted search key, generating and issuing a packet reception event, and storing the packet data in a main memory; constructing, in the shared memory of each of a plurality of multi-core CPUs, one of the sub-pattern tables obtained by dividing the pattern table into as many parts as there are multi-core CPUs; and performing the pattern matching process on one of the plurality of multi-core CPUs.

 (Supplementary note 12) The pattern matching method according to supplementary note 11, wherein a packet reception event containing the extracted search key and the calculated hash value is generated and is issued to the multi-core CPU whose sub-pattern table holds the entry group corresponding to the calculated hash value.

 (Supplementary note 13) The pattern matching method according to supplementary note 12, wherein the multi-core CPU that receives the packet reception event sequentially assigns it to one of the plurality of processor cores of that multi-core CPU.

 (Supplementary note 14) A pattern matching device in a multi-core processor environment having a multi-core CPU comprising a plurality of processor cores and a shared memory shared among the processor cores within the same CPU, wherein the shared memory contains a pattern table containing search patterns, and pattern matching of a search key to be looked up is executed in parallel by the plurality of processor cores.

 (Supplementary note 15) The pattern matching device according to supplementary note 14, wherein the pattern table is configured as a hash table.

 (Supplementary note 16) The pattern matching device according to supplementary note 14 or 15, comprising: one multi-core CPU; a packet transmission/reception block that transmits and receives packets; a search key extraction block that extracts, from the header information of a received packet, a set of arbitrary fields serving as a search key, generates and issues a packet reception event, and stores the packet data in a main memory; a main memory that stores packet data; and a bus connecting the multi-core CPU, the search key extraction block, and the main memory.

 (Supplementary note 17) The pattern matching device according to supplementary note 16, wherein a packet reception event containing the extracted search key is generated and issued, and packet reception events are sequentially assigned to one of the plurality of processor cores of the multi-core CPU.

 (Supplementary note 18) The pattern matching device according to supplementary note 14 or 15, comprising: a plurality of multi-core CPUs; a packet transmission/reception block that transmits and receives packets; a search key extraction block that extracts, from the header information of a received packet, a set of arbitrary fields serving as a search key, generates and issues a packet reception event, and stores the packet data in a main memory; a main memory that stores packet data; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, and the main memory, wherein the same pattern table is constructed in the shared memory of each of the plurality of multi-core CPUs.

 (Supplementary note 19) The pattern matching device according to supplementary note 18, wherein a packet reception event containing the extracted search key is generated and issued to one of the plurality of multi-core CPUs.

 (Supplementary note 20) The pattern matching device according to supplementary note 19, wherein the multi-core CPU that receives the packet reception event sequentially assigns it to one of the plurality of processor cores of that multi-core CPU.

 (Supplementary note 21) The pattern matching device according to supplementary note 14 or 15, comprising: a plurality of multi-core CPUs; a packet transmission/reception block that transmits and receives packets; a search key extraction block that extracts, from the header information of a received packet, a set of arbitrary fields serving as a search key, generates and issues a packet reception event, and stores the packet data in a main memory; a main memory that stores packet data; a final result determination block that collects the pattern matching results from the processor cores of each multi-core CPU and determines the final pattern matching result; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, the main memory, and the final result determination block, wherein one of the sub-pattern tables obtained by dividing the pattern table into as many parts as there are multi-core CPUs is constructed in the shared memory of each multi-core CPU.

 (Supplementary note 22) The pattern matching device according to supplementary note 21, wherein a packet reception event containing the extracted search key is generated and issued to all of the plurality of multi-core CPUs.

 (Supplementary note 23) The pattern matching device according to supplementary note 22, wherein each of the plurality of multi-core CPUs sequentially assigns a received packet reception event to one of the plurality of processor cores of that multi-core CPU.

 (Supplementary note 24) The pattern matching device according to supplementary note 14 or 15, comprising: a plurality of multi-core CPUs; a packet transmission/reception block that transmits and receives packets; a search key extraction block that extracts, from the header information of a received packet, a set of arbitrary fields serving as a search key; a hash value calculation block that calculates a hash value for the extracted search key, generates and issues a packet reception event, and stores the packet data in a main memory; a main memory that stores packet data; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, the hash value calculation block, and the main memory, wherein one of the sub-pattern tables obtained by dividing the pattern table into as many parts as there are multi-core CPUs is constructed in the shared memory of each multi-core CPU.

 (Supplementary note 25) The pattern matching device according to supplementary note 24, wherein a packet reception event containing the extracted search key and the calculated hash value is generated and issued to the multi-core CPU whose sub-pattern table holds the entry group corresponding to the calculated hash value.

 (Supplementary note 26) The pattern matching device according to supplementary note 25, wherein the multi-core CPU that receives the packet reception event sequentially assigns it to one of the plurality of processor cores of that multi-core CPU.

 Although representative embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alternatives may be made without departing from the spirit and scope of the invention as defined in the claims. Moreover, the inventor intends that the equivalent scope of the claimed invention be maintained even if the claims are amended during prosecution.

 This application claims priority based on Japanese Patent Application No. 2009-289011 filed on December 21, 2009, the entire disclosure of which is incorporated herein.

 The present invention can be applied to network devices, such as switches and routers, that identify the flow to which a packet belongs from a specific set of fields in the packet's header information and perform flow-specific processing such as QoS processing or load distribution, and to appliance devices such as load balancers.

1 Multi-core CPU
2 Multi-core CPU
3 Bus
4 Search key extraction block
5 Packet transmission/reception block
6 Main memory
7 Final result determination block
8 Search key extraction block
9 Hash value calculation block
10 Packet transmission/reception block
11 CPU
12 Multi-core CPU
13 Multi-core CPU
M Multi-core CPU
1-0 Shared memory
1-1, 1-2, 1-N Processor cores
2-0 Shared memory
2-1, 2-2, 2-N Processor cores
11-0 L2 cache
11-1 Processor core
12-0 L2 cache
12-1, 12-2 Processor cores
13-0 L2 cache
13-1, 13-2 Processor cores
M-0 Shared memory
M-1, M-2, M-N Processor cores

Claims (14)

1. A pattern matching device in a multi-core processor environment having a multi-core CPU comprising a plurality of processor cores and a shared memory shared among the processor cores within the same CPU, wherein the shared memory contains a pattern table containing search patterns, and pattern matching of a search key to be looked up is executed in parallel by the plurality of processor cores.

2. The pattern matching device according to claim 1, wherein the pattern table is configured as a hash table.

3. The pattern matching device according to claim 1 or 2, comprising: one multi-core CPU; a packet transmission/reception block that transmits and receives packets; a search key extraction block that extracts, from the header information of a received packet, a set of arbitrary fields serving as a search key, generates and issues a packet reception event, and stores the packet data in a main memory; a main memory that stores packet data; and a bus connecting the multi-core CPU, the search key extraction block, and the main memory.

4. The pattern matching device according to claim 3, wherein a packet reception event containing the extracted search key is generated and issued, and packet reception events are sequentially assigned to one of the plurality of processor cores of the multi-core CPU.

5. The pattern matching device according to claim 1 or 2, comprising: a plurality of multi-core CPUs; a packet transmission/reception block that transmits and receives packets; a search key extraction block that extracts, from the header information of a received packet, a set of arbitrary fields serving as a search key, generates and issues a packet reception event, and stores the packet data in a main memory; a main memory that stores packet data; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, and the main memory, wherein the same pattern table is constructed in the shared memory of each of the plurality of multi-core CPUs.

6. The pattern matching device according to claim 5, wherein a packet reception event containing the extracted search key is generated and issued to one of the plurality of multi-core CPUs.

7. The pattern matching device according to claim 6, wherein the multi-core CPU that receives the packet reception event sequentially assigns it to one of the plurality of processor cores of that multi-core CPU.

8. The pattern matching device according to claim 1 or 2, comprising: a plurality of multi-core CPUs; a packet transmission/reception block that transmits and receives packets; a search key extraction block that extracts, from the header information of a received packet, a set of arbitrary fields serving as a search key, generates and issues a packet reception event, and stores the packet data in a main memory; a main memory that stores packet data; a final result determination block that collects the pattern matching results from the processor cores of each multi-core CPU and determines the final pattern matching result; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, the main memory, and the final result determination block, wherein one of the sub-pattern tables obtained by dividing the pattern table into as many parts as there are multi-core CPUs is constructed in the shared memory of each multi-core CPU.

9. The pattern matching device according to claim 8, wherein a packet reception event containing the extracted search key is generated and issued to all of the plurality of multi-core CPUs.

10. The pattern matching device according to claim 9, wherein each of the plurality of multi-core CPUs sequentially assigns a received packet reception event to one of the plurality of processor cores of that multi-core CPU.

11. The pattern matching device according to claim 1 or 2, comprising: a plurality of multi-core CPUs; a packet transmission/reception block that transmits and receives packets; a search key extraction block that extracts, from the header information of a received packet, a set of arbitrary fields serving as a search key; a hash value calculation block that calculates a hash value for the extracted search key, generates and issues a packet reception event, and stores the packet data in a main memory; a main memory that stores packet data; and a bus connecting the plurality of multi-core CPUs, the search key extraction block, the hash value calculation block, and the main memory, wherein one of the sub-pattern tables obtained by dividing the pattern table into as many parts as there are multi-core CPUs is constructed in the shared memory of each multi-core CPU.

12. The pattern matching device according to claim 11, wherein a packet reception event containing the extracted search key and the calculated hash value is generated and issued to the multi-core CPU whose sub-pattern table holds the entry group corresponding to the calculated hash value.

13. The pattern matching device according to claim 12, wherein the multi-core CPU that receives the packet reception event sequentially assigns it to one of the plurality of processor cores of that multi-core CPU.

14. A pattern matching method in a multi-core processor environment having a multi-core CPU composed of a plurality of processor cores and a shared memory shared among the processor cores within the same CPU, the method comprising: constructing a pattern table containing search patterns in the shared memory; and executing, in parallel on the plurality of processor cores, pattern matching of a search key to be looked up.
PCT/JP2010/072869 2009-12-21 2010-12-20 Pattern-matching method and device for a multiprocessor environment Ceased WO2011078108A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2011547531A JPWO2011078108A1 (en) 2009-12-21 2010-12-20 Pattern matching method and apparatus in multiprocessor environment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2009289011 2009-12-21
JP2009-289011 2009-12-21

Publications (1)

Publication Number Publication Date
WO2011078108A1 true WO2011078108A1 (en) 2011-06-30

Family

ID=44195634

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/072869 Ceased WO2011078108A1 (en) 2009-12-21 2010-12-20 Pattern-matching method and device for a multiprocessor environment

Country Status (2)

Country Link
JP (1) JPWO2011078108A1 (en)
WO (1) WO2011078108A1 (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014030300A1 (en) * 2012-08-23 2014-02-27 日本電気株式会社 Matching system, matching method, and matching program
KR20150077372A (en) * 2013-12-27 2015-07-07 엑스플라이언트 인코포레이션 Method and system for reconfigurable parallel lookups using multiple shared memories
JP2015171089A (en) * 2014-03-10 2015-09-28 日本電気株式会社 Switching device, data transmission control method and data transmission control program
JP2015179970A (en) * 2014-03-19 2015-10-08 アンリツネットワークス株式会社 packet relay device
JP2016005282A (en) * 2014-06-19 2016-01-12 エックスプライアント, インコーポレイテッド Method and apparatus for identifying packet structure using unique identifier of packet
JP2016515367A (en) * 2013-03-13 2016-05-26 クゥアルコム・インコーポレイテッドQualcomm Incorporated Network element with distributed flow table
CN108462715A (en) * 2018-04-24 2018-08-28 王颖 Network information filtering system based on MPI WM string matching parallel algorithm
WO2018173799A1 (en) * 2017-03-24 2018-09-27 住友電気工業株式会社 Switch device, communication control method, and communication control program
US10397113B2 (en) 2014-06-19 2019-08-27 Cavium, Llc Method of identifying internal destinations of network packets and an apparatus thereof
US10560399B2 (en) 2014-06-19 2020-02-11 Cavium, Llc Method of dynamically renumbering ports and an apparatus thereof
US10616380B2 (en) 2014-06-19 2020-04-07 Cavium, Llc Method of handling large protocol layers for configurable extraction of layer information and an apparatus thereof
US10785169B2 (en) 2013-12-30 2020-09-22 Marvell Asia Pte, Ltd. Protocol independent programmable switch (PIPS) for software defined data center networks
US11050859B2 (en) 2014-06-19 2021-06-29 Marvell Asia Pte, Ltd. Method of using bit vectors to allow expansion and collapse of header layers within packets for enabling flexible modifications and an apparatus thereof
CN113595822A (en) * 2021-07-26 2021-11-02 北京恒光信息技术股份有限公司 Data packet management method, system and device
JP2022021779A (en) * 2020-07-22 2022-02-03 富士通株式会社 Packet transmission device and packet transmission method
US11756818B2 (en) 2010-12-20 2023-09-12 Ev Group E. Thallner Gmbh Accommodating device for retaining wafers
WO2025141729A1 (en) * 2023-12-26 2025-07-03 日本電気株式会社 Parallel processing system, parallel processing method, and parallel processing device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008204292A (en) * 2007-02-21 2008-09-04 Toshiba Corp Memory management system
JP2008210027A (en) * 2007-02-23 2008-09-11 Toshiba Corp Computer system and parallelizing compiler
JP2009037403A (en) * 2007-08-01 2009-02-19 Fujitsu Ltd Effective use of core memory in multi-core processors
WO2009148021A1 (en) * 2008-06-03 2009-12-10 株式会社日立製作所 Packet analysis apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHINJI SHIOTA: "[Gairon] Microprocessor no Genjo" [Overview: The Current State of Microprocessors], UNIX MAGAZINE, vol. 22, no. 4, 1 October 2007 (2007-10-01), pages 26 - 27 *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11756818B2 (en) 2010-12-20 2023-09-12 Ev Group E. Thallner Gmbh Accommodating device for retaining wafers
WO2014030300A1 (en) * 2012-08-23 2014-02-27 日本電気株式会社 Matching system, matching method, and matching program
JP2016515367A (en) * 2013-03-13 2016-05-26 Qualcomm Incorporated Network element with distributed flow table
KR20150077372A (en) * 2013-12-27 2015-07-07 엑스플라이언트 인코포레이션 Method and system for reconfigurable parallel lookups using multiple shared memories
JP2015172925A (en) * 2013-12-27 2015-10-01 エックスプライアント, インコーポレイテッド Method and system for reconfigurable parallel lookups using multiple shared memories
KR102391602B1 (en) 2013-12-27 2022-04-27 마벨 아시아 피티이 엘티디. Method and system for reconfigurable parallel lookups using multiple shared memories
US10785169B2 (en) 2013-12-30 2020-09-22 Marvell Asia Pte, Ltd. Protocol independent programmable switch (PIPS) for software defined data center networks
US12301456B2 (en) 2013-12-30 2025-05-13 Marvell Asia Pte, LTD Protocol independent programmable switch (PIPS) for software defined data center networks
US11824796B2 (en) 2013-12-30 2023-11-21 Marvell Asia Pte, Ltd. Protocol independent programmable switch (PIPS) for software defined data center networks
JP2015171089A (en) * 2014-03-10 2015-09-28 日本電気株式会社 Switching device, data transmission control method and data transmission control program
JP2015179970A (en) * 2014-03-19 2015-10-08 アンリツネットワークス株式会社 packet relay device
US11799989B2 (en) 2014-06-19 2023-10-24 Marvell Asia Pte, Ltd. Method of using bit vectors to allow expansion and collapse of header layers within packets for enabling flexible modifications and an apparatus thereof
US10560399B2 (en) 2014-06-19 2020-02-11 Cavium, Llc Method of dynamically renumbering ports and an apparatus thereof
US12381963B2 (en) 2014-06-19 2025-08-05 Marvell Asia Pte, LTD Method of using bit vectors to allow expansion and collapse of header layers within packets for enabling flexible modifications and an apparatus thereof
JP2016005282A (en) * 2014-06-19 2016-01-12 エックスプライアント, インコーポレイテッド Method and apparatus for identifying packet structure using unique identifier of packet
US10397113B2 (en) 2014-06-19 2019-08-27 Cavium, Llc Method of identifying internal destinations of network packets and an apparatus thereof
US11050859B2 (en) 2014-06-19 2021-06-29 Marvell Asia Pte, Ltd. Method of using bit vectors to allow expansion and collapse of header layers within packets for enabling flexible modifications and an apparatus thereof
US10616380B2 (en) 2014-06-19 2020-04-07 Cavium, Llc Method of handling large protocol layers for configurable extraction of layer information and an apparatus thereof
US11637803B2 (en) 2017-03-24 2023-04-25 Sumitomo Electric Industries, Ltd. Switch device and communication control method
JP2018164164A (en) * 2017-03-24 2018-10-18 住友電気工業株式会社 Switch device, communication control method, and communication control program
WO2018173799A1 (en) * 2017-03-24 2018-09-27 住友電気工業株式会社 Switch device, communication control method, and communication control program
CN111373699A (en) * 2017-03-24 2020-07-03 住友电气工业株式会社 Switching device, communication control method, and communication control program
CN108462715B (en) * 2018-04-24 2021-03-12 王颖 Network information filtering method based on MPI WM (pulse Width modulation) string matching parallel algorithm
CN108462715A (en) * 2018-04-24 2018-08-28 王颖 Network information filtering system based on MPI WM string matching parallel algorithm
JP2022021779A (en) * 2020-07-22 2022-02-03 富士通株式会社 Packet transmission device and packet transmission method
JP7468219B2 (en) 2020-07-22 2024-04-16 富士通株式会社 Packet transmission device and packet transmission method
CN113595822A (en) * 2021-07-26 2021-11-02 北京恒光信息技术股份有限公司 Data packet management method, system and device
CN113595822B (en) * 2021-07-26 2024-03-22 北京恒光信息技术股份有限公司 Data packet management method, system and device
WO2025141729A1 (en) * 2023-12-26 2025-07-03 日本電気株式会社 Parallel processing system, parallel processing method, and parallel processing device

Also Published As

Publication number Publication date
JPWO2011078108A1 (en) 2013-05-09

Similar Documents

Publication Publication Date Title
WO2011078108A1 (en) Pattern-matching method and device for a multiprocessor environment
US9154442B2 (en) Concurrent linked-list traversal for real-time hash processing in multi-core, multi-thread network processors
Varvello et al. Multi-layer packet classification with graphics processing units
US8539199B2 (en) Hash processing in a network communications processor architecture
US8321385B2 (en) Hash processing in a network communications processor architecture
US8407707B2 (en) Task queuing in a network communications processor architecture
US7240141B2 (en) Programmable inter-virtual channel and intra-virtual channel instructions issuing rules for an I/O bus of a system-on-a-chip processor
US8761204B2 (en) Packet assembly module for multi-core, multi-thread network processors
US20120002546A1 (en) Multicasting traffic manager in a network communications processor architecture
Paul et al. MG-Join: A scalable join for massively parallel multi-GPU architectures
US20050078601A1 (en) Hash and route hardware with parallel routing scheme
US7768907B2 (en) System and method for improved Ethernet load balancing
US20110254590A1 (en) Mapping address bits to improve spread of banks
CN115917520A (en) System for providing LPM implementations for programmable data planes via distributed algorithms
KR102461039B1 (en) Gpu-native packet i/o method and apparatus for gpu application on commodity ethernet
US10616116B1 (en) Network traffic load balancing using rotating hash
US9515929B2 (en) Traffic data pre-filtering
Watanabe et al. Accelerating NFV application using CPU-FPGA tightly coupled architecture
US7293158B2 (en) Systems and methods for implementing counters in a network processor with cost effective memory
Hong et al. Kafe: Can os kernels forward packets fast enough for software routers?
Zhu et al. Hermes: an integrated CPU/GPU microarchitecture for IP routing
US20240297846A1 (en) Network-on-chip packetization and routing method and apparatus for scalable high-performance networking on and off chip
US7340570B2 (en) Engine for comparing a key with rules having high and low values defining a range
CN101645852B (en) Equipment and method for classifying network packet
CN106302259B (en) Method and router for processing message in network on chip

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10839338

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2011547531

Country of ref document: JP

122 Ep: pct application non-entry in european phase

Ref document number: 10839338

Country of ref document: EP

Kind code of ref document: A1