JP4320004B2

JP4320004B2 - XPath processing method, XPath processing device, XPath processing program, and storage medium storing the program

Info

Publication number: JP4320004B2
Application number: JP2005195572A
Authority: JP
Inventors: 秀一西岡; 真鬼塚; 雅司山室
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc USA
Current assignee: NTT Inc; NTT Inc USA
Priority date: 2005-07-04
Filing date: 2005-07-04
Publication date: 2009-08-26
Anticipated expiration: 2025-07-04
Also published as: JP2007011998A

Description

The present invention relates to the processing of XPath expressions using automata and, in particular, to an XPath expression processing method, an XPath expression processing device, an XPath expression processing program, and a storage medium storing the program, each of which performs differential addition of XPath expressions.

For example, NewsML is a new news distribution format based on XML (Extensible Markup Language). NewsML allows news materials such as articles, images, video, and audio to be combined freely and delivered to a variety of devices, including websites, mobile phones, and televisions (TV data broadcasting). A NewsML recipient (user) obtains the information he or she needs by registering search conditions with a filter engine. Because the search conditions are written in an XML query language, the filter engine must both process the individual XPath (XML Path Language) expressions and handle the addition of new XPath expressions.

When input XML data (such as stream data) is filtered using XPath expressions, it is effective to use an automaton derived from those XPath expressions. An automaton is a general term for a mathematical model of a computing mechanism such as a computer; it has inputs, outputs, and states. A deterministic finite automaton (DFA) is an automaton in which each input determines exactly one transition destination. A nondeterministic finite automaton (NFA), by contrast, is an automaton in which a state may have more than one transition destination for a given input.

In the conventional technique for adding to XPath expressions that have already been input, a separate DFA for processing the XPath expressions was generated each time an XPath expression was added, so the number of DFAs grew with every addition. The technique disclosed in Patent Document 1 addresses this in a system that uses XPath expressions to describe each user's desired conditions and that extracts and distributes, from XML data, the partial XML data desired by multiple users: when a user adds an XPath expression, the DFA derived from the XPath expressions can be updated with the difference between its state before and after the addition.
Patent Document 1: JP 2003-323429 A (Claim 1)

In the related art for differential update (differential addition) of a DFA, such as Patent Document 1, XML data is required when an XPath expression is differentially added. As a result, during evaluation of XML data (for example, data conversion processing), the automaton update processing that reflects additions and deletions of XPath expressions becomes overhead, and evaluation performance degrades. Users therefore have to wait longer to obtain the latest XML data. This matters most when the XML data is time-sensitive news such as traffic information, where it is important that the user receives the data without delay.

Accordingly, the main object of the present invention is to solve the above problems by differentially updating the automaton used to evaluate XML data while allowing that automaton to process XML data quickly.

To solve the above problems, the present invention provides an XPath expression processing method for updating a pre-update DFA, which represents pre-update XPath expressions input for processing XML data, into a post-update DFA, which represents the input post-update XPath expressions. In this method a computer executes: a step of reading the pre-update DFA derived for processing the pre-update XPath expressions and holding its start state as state ds; a step of reading the XPath expression to be added, deriving a differential NFA from it, and holding the start state of the differential NFA as state ns; a step of calling a first subroutine, which performs the XPath expression addition operation, with state ds and state ns as arguments; and a step of reflecting the differential NFA in the pre-update NFA for processing the pre-update XPath expressions to obtain a post-update NFA.
The first subroutine executes, when state ds exists: a step of adding state ns to the NFA state group held by state ds; a step of acquiring the transition group from state ns; a step of selecting the unprocessed transitions of the acquired transition group one at a time as transition ne and calling a second subroutine with state ds and transition ne as arguments; and a step of ending the subroutine when no unprocessed transition remains to be selected as transition ne.
The second subroutine executes, in order: a step of, when the transition group held by state ds contains no transition matching transition ne, newly adding transition ne to the transition group of state ds and ending the subroutine; and a step of, when the transition group held by state ds contains a transition matching transition ne, selecting the unprocessed transitions one at a time as transition de, holding the DFA state at the destination of transition de as state ds1, holding the NFA state at the destination of transition ne as state ns1, and calling the first subroutine with state ds1 and state ns1 as arguments. The subroutines thereby update the pre-update DFA into the post-update DFA.
The present invention further provides an XPath expression processing device that is the computer, an XPath expression processing program executed by the computer, and a storage medium storing the XPath expression processing program.
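The two mutually recursive subroutines of the addition method can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class and function names (NFAState, NFAEdge, DFAState, add_state, add_edge) are hypothetical, a transition "matches" only when the labels are identical (wildcard handling is omitted), and a newly added transition is given a destination of None on the assumption that it would be generated later on demand.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(eq=False)
class NFAState:
    name: str
    edges: List["NFAEdge"] = field(default_factory=list)


@dataclass(eq=False)
class NFAEdge:
    label: str
    to: NFAState


@dataclass(eq=False)
class DFAState:
    nfa_states: Set[str] = field(default_factory=set)
    # label -> destination state; None stands for "not generated yet" (assumption).
    edges: Dict[str, Optional["DFAState"]] = field(default_factory=dict)


def add_state(ds: Optional[DFAState], ns: NFAState) -> None:
    """First subroutine: record ns in ds, then walk the transitions of ns."""
    if ds is None:  # corresponds to "when state ds exists"
        return
    ds.nfa_states.add(ns.name)
    for ne in ns.edges:
        add_edge(ds, ne)


def add_edge(ds: DFAState, ne: NFAEdge) -> None:
    """Second subroutine: add ne to ds, or descend along matching transitions."""
    # Simplifying assumption: exact label equality is the only kind of match.
    matching = [label for label in ds.edges if label == ne.label]
    if not matching:
        ds.edges[ne.label] = None  # new transition, destination left unbuilt
        return
    for label in matching:
        ds1 = ds.edges[label]  # DFA state at the destination of de
        ns1 = ne.to            # NFA state at the destination of ne
        add_state(ds1, ns1)


# Hypothetical pre-update DFA for the single expression /a/b.
d2 = DFAState({"S3"})
d1 = DFAState({"S2"}, {"b": d2})
d0 = DFAState({"S0", "S1"}, {"a": d1})

# Hypothetical differential NFA for the added expression /a/c.
n3 = NFAState("S6")
n2 = NFAState("S5", [NFAEdge("c", n3)])
n1 = NFAState("S4", [NFAEdge("a", n2)])

add_state(d0, n1)  # ds = DFA start state, ns = differential NFA start state
```

In this small example the existing "a" transition is reused, so only the NFA state groups along that path grow, and the single genuinely new transition ("c") is attached to the existing DFA state; no XML data is consulted at any point.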

Because the information of the XPath expression to be updated is merely applied as a difference to the DFA generated so far (the existing DFA), no separate new DFA needs to be generated, and memory space is used effectively. Moreover, because the differential update does not require XML data, the XPath expression information can be differentially updated quickly.
Furthermore, the differential update of XPath expression information to the DFA can be performed reliably, and memory space can be used still more efficiently.
Differential addition to the DFA can thus be performed quickly while memory space is used effectively.

The present invention also provides an XPath expression processing method for updating a pre-update DFA, which represents pre-update XPath expressions input for processing XML data, into a post-update DFA, which represents the input post-update XPath expressions. In this method a computer executes: a step of reading the pre-update DFA derived for processing the pre-update XPath expressions and holding its start state as state ds; a step of deriving the NFA to be deleted and holding its start state as state ns; a step of calling a third subroutine, which performs the XPath expression deletion operation, with state ds and state ns as arguments; and a step of deleting the NFA to be deleted from the pre-update NFA for processing the pre-update XPath expressions to obtain a post-update NFA.
The third subroutine executes, when state ds exists: a step of deleting state ns from the NFA state group held by state ds and, when state ds then holds no NFA state group, deleting everything under state ds; a step of acquiring the transition group from state ns; a step of selecting the unprocessed transitions of the acquired transition group one at a time as transition ne and calling a fourth subroutine with state ds and transition ne as arguments; and a step of ending the subroutine when no unprocessed transition remains to be selected as transition ne.
The fourth subroutine executes, in order: a step of ending the subroutine when the transition group held by state ds contains no transition matching transition ne; and a step of, when the transition group held by state ds contains a transition matching transition ne, selecting the unprocessed transitions one at a time as transition de, holding the DFA state at the destination of transition de as state ds1, holding the NFA state at the destination of transition ne as state ns1, and calling the third subroutine with state ds1 and state ns1 as arguments. The subroutines thereby update the pre-update DFA into the post-update DFA.
The present invention further provides an XPath expression processing device that is the computer, an XPath expression processing program executed by the computer, and a storage medium storing the XPath expression processing program.
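The deletion subroutines mirror the addition ones and can be sketched in the same way. Again this is only an illustration under assumptions: the names (remove_state, remove_edge and the three classes) are hypothetical, matching is by exact label only, and "deleting everything under state ds" is interpreted here as clearing the transitions of an emptied state and dropping the parent's transition to it, which the text does not spell out in this form.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(eq=False)
class NFAState:
    name: str
    edges: List["NFAEdge"] = field(default_factory=list)


@dataclass(eq=False)
class NFAEdge:
    label: str
    to: NFAState


@dataclass(eq=False)
class DFAState:
    nfa_states: Set[str] = field(default_factory=set)
    edges: Dict[str, Optional["DFAState"]] = field(default_factory=dict)


def remove_state(ds: Optional[DFAState], ns: NFAState) -> None:
    """Third subroutine: drop ns from ds, then walk the transitions of ns."""
    if ds is None:  # corresponds to "when state ds exists"
        return
    ds.nfa_states.discard(ns.name)
    if not ds.nfa_states:
        ds.edges.clear()  # assumed reading of "delete everything under ds"
    for ne in ns.edges:
        remove_edge(ds, ne)


def remove_edge(ds: DFAState, ne: NFAEdge) -> None:
    """Fourth subroutine: descend along transitions of ds that match ne."""
    matching = [label for label in ds.edges if label == ne.label]
    if not matching:
        return
    for label in matching:
        ds1 = ds.edges[label]  # DFA state at the destination of de
        ns1 = ne.to            # NFA state at the destination of ne
        remove_state(ds1, ns1)
        # Assumption: a destination left with no NFA states is unlinked here.
        if ds1 is not None and not ds1.nfa_states:
            del ds.edges[label]


# Hypothetical pre-update DFA for the expressions /a/b and /a/c.
d3 = DFAState({"S6"})
d2 = DFAState({"S3"})
d1 = DFAState({"S2", "S5"}, {"b": d2, "c": d3})
d0 = DFAState({"S0", "S1", "S4"}, {"a": d1})

# NFA of the expression to be deleted, /a/c.
n3 = NFAState("S6")
n2 = NFAState("S5", [NFAEdge("c", n3)])
n1 = NFAState("S4", [NFAEdge("a", n2)])

remove_state(d0, n1)
```

After the call, the DFA in this example is back to the shape it would have for /a/b alone, without the DFA having been rebuilt from scratch.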

Because the information of the XPath expression to be updated is merely applied as a difference to the DFA generated so far (the existing DFA), no separate new DFA needs to be generated, and memory space is used effectively. Moreover, because the differential update does not require XML data, the XPath expression information can be differentially updated quickly.
Furthermore, the differential update of XPath expression information to the DFA can be performed reliably, and memory space can be used still more efficiently.
Differential deletion from the DFA can thus be performed quickly while memory space is used effectively.

According to the present invention, XPath expressions to be added or deleted are applied as differences to the existing DFA immediately, so no differential DFA update operation occurs while XML data is being evaluated. Because the deterministic automaton is updated at the time an XPath expression is added or deleted, evaluation of XML data suffers no performance degradation from addition or deletion operations.

Embodiments of the XPath expression processing method of the present invention are described in detail below with reference to the drawings. The XPath expression processing method described below is also embodied as an XPath expression processing device and an XPath expression processing program. The first embodiment describes a differential update that adds a new XPath expression to the existing XPath expressions, and the second embodiment describes a differential update that deletes some of the existing XPath expressions. The first embodiment is described first.

FIG. 1 is a diagram showing an outline of a filter engine to which the XPath expression processing method is applied. XML data generated in XML format by a data provider (not shown) is transmitted over a network such as an intranet to the filter engine 10, which executes the XPath expression processing method. As in the conventional example, each user who receives XML data registers the conditions for the data he or she wants (a personal profile) with the filter engine 10 in advance, in the form of an XML query.

The filter engine 10 filters and converts incoming XML data, such as a news source, according to the registered personal profiles and distributes the result to the individual users as XML data. NewsML is a specific example of such XML data. As noted above, NewsML is a new XML-based news distribution format that allows news materials such as articles, images, video, and audio to be combined freely and delivered to a variety of devices such as websites and mobile phones. It is also well suited to structuring and centrally managing such news materials.

The filter engine 10 also incorporates the XPath expression processing device and the XPath expression processing program recited in the claims. XML is a metalanguage, that is, a language for creating languages, recommended by the W3C (World Wide Web Consortium) as an Internet standard. XML data (also called an XML document) is a document or data written in a language created with XML.

The filter engine 10 extracts, from the XML data, the XML data that matches the XPath expressions representing the users' desired conditions. As described in detail later, the filter engine 10 is characterized in that every addition or deletion of an XPath expression is reflected, immediately and in full, in the DFA derived from the XPath expressions. Consequently, no operation related to updating XPath expressions occurs while XML data is being processed, which speeds up processing.

FIG. 2 is a block diagram showing the internal configuration of the filter engine 10 of FIG. 1. As shown in FIG. 2, the filter engine 10 includes an XML parsing module 11, a query parsing module 12, a data extraction module 13, a DFA creation module 13B, an automaton management unit 15, a data conversion module 16, a DFA update module 17, and an update automaton management unit 18. The filter engine 10 comprises a computer, which has a main control device composed of a CPU and RAM, an external storage device such as a hard disk, and a NIC (Network Interface Card) for communication, together with a router. The automaton management unit 15 and the update automaton management unit 18 are stored, for example, in the external storage device.

The XML parsing module (XML parser) 11 parses the input XML data, converts it into internal-format XML data (SAX (Simple API for XML) events), and outputs the result to the data extraction module 13. Parsing here means reading XML data written in text form and analyzing the document elements, attributes, and so on designated by XML tags; this embodiment places no particular restriction on the parsing procedure. Two standard APIs (Application Programming Interfaces) exist for manipulating XML data through the XML parsing module 11: DOM (Document Object Model) and SAX. In this embodiment, the XML parsing module 11 supports the latter, SAX.

The SAX-compliant XML parsing module 11 reads the XML data sequentially and invokes the registered handlers each time it detects an XML tag (start tag, end tag, or empty-element tag). A handler is a program that defines methods for processing the elements of XML data through the SAX interface. A tag is a character string written in XML data to mark the position of an element and to hold its attributes.
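As an illustration of the handler mechanism, the following sketch uses Python's standard xml.sax module, which is only a stand-in for whatever SAX parser the filter engine actually uses. The PathLogger class and the sample document are hypothetical; the handler keeps a stack of open elements and records the path of every element whose start tag is reported.

```python
import xml.sax


class PathLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records the path of each element seen."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of the currently open elements
        self.paths = []   # one entry per start tag, in document order

    def startElement(self, name, attrs):
        # Called for a start tag and also for an empty-element tag.
        self.stack.append(name)
        self.paths.append("/" + "/".join(self.stack))

    def endElement(self, name):
        # Called for an end tag and also right after an empty-element tag.
        self.stack.pop()


handler = PathLogger()
xml.sax.parseString(b"<a><c><b/></c><b></b></a>", handler)
```

Because the parser pushes events to the handler as it reads, the whole document never has to be held in memory, which is what makes the SAX style a natural fit for stream filtering.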

The query parsing module 12 generates XPath expressions. Specifically, it parses each personal profile to be added (search conditions written in the XML query language) and separates it into a "data conversion operation" and the "individual XPath expressions" that constitute the data extraction operation. The XPath expressions are converted into an automaton by the DFA creation module 13B and then output to the data extraction module 13, while the data conversion operation is output to the data conversion module 16. XPath is a language for pointing to specific parts of XML data; with an XPath expression, any position in the data can be designated even if no anchors or the like are embedded in the XML data.

The DFA creation module 13B converts each XPath expression input from the query parsing module 12 into an NFA, merges that NFA into the combined NFA, generates a DFA from the combined NFA, and registers (adds) it in the automaton management unit 15.

Using the DFA stored in the automaton management unit 15, the data extraction module 13 extracts partial XML from the XML data input to the XML parsing module 11 and outputs it to the data conversion module 16 as internal-format XML data (the filtered internal-format XML data).

The data conversion module 16 performs a predetermined conversion based on the data conversion operation and the extracted internal-format XML data, and outputs the result as filtered XML data (converted XML data). The present invention places no particular restriction on the predetermined conversion.

Next, the automaton management unit 15 is described in detail. FIG. 3 shows the in-memory data organization of the automaton management unit 15 of FIG. 2. Details of the DFA are shown in FIG. 5.

As shown in FIG. 3, an NFA is generated for each XPath expression. The NFAs generated in this way are joined at a single root node, which is connected to each individual NFA by an epsilon edge (the combined NFA). An epsilon edge is a transition on the "empty string" as commonly defined for automata. The DFA is generated and updated incrementally from the combined NFA. Alternatively, the necessary DFA states are generated and updated incrementally in response to the combined NFA and the input XML data.

FIG. 4 shows the data structures of FIG. 3 as they appear in a concrete program. In FIG. 4, the Variable class (class Variable) is instantiated once per XPath expression and has the following attributes. An instance is, for example in object-oriented programming, an object actually created using a class definition as a template.
* "id", an internal identifier that differs for each instance
* "XPath expression", the representation of the individual XPath expression
* "varName", a name by which the user distinguishes the Variable

The State class (class State) represents a state of an NFA and has the following attributes.
* "edges", the set of edges representing transitions from this state to other states
* "type", which indicates whether this state is terminal or non-terminal (when an XPath expression is converted into an NFA, the last state is terminal and all other states are non-terminal)
* "var", which indicates the Variable instance (that is, the XPath expression) from which this state was generated

The DFAState class (class DFAState) represents a state of a DFA. It is defined by inheriting from the State class in the object-oriented sense and has the following attribute.
* "states", the group of NFA states needed when generating the DFA from the NFA
Although edges is inherited from the State class, it may be redefined using a hash or map structure instead of a list to speed up processing.

The Edge class (class Edge) represents an edge between NFA states or between DFA states and has the following attributes.
* "to", the state at the destination of this edge
* "from", the state at the source of this edge
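The four classes described above could be written down roughly as follows. This is a sketch of the described attributes only, not the code of FIG. 4: the Python field names are adaptations (var_name for varName, xpath for the XPath expression, from_ because from is a reserved word in Python), and the label field on Edge is an assumption added so that the list-based variant has somewhere to keep the transition label.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Variable:
    id: int          # internal identifier, different for each instance
    xpath: str       # the individual XPath expression
    var_name: str    # name by which the user distinguishes the Variable


@dataclass(eq=False)
class State:
    edges: List["Edge"] = field(default_factory=list)  # outgoing transitions
    type: str = "non-terminal"                         # or "terminal"
    var: Optional[Variable] = None                     # originating Variable


@dataclass(eq=False)
class DFAState(State):
    # NFA states that this DFA state stands for.
    states: List[State] = field(default_factory=list)


@dataclass(eq=False)
class Edge:
    from_: State          # state at the source of the edge
    to: Optional[State]   # state at the destination of the edge
    label: str = ""       # assumption: the element name or "*" on the edge


# Hypothetical usage: the first step of an NFA for /a//b.
v = Variable(1, "/a//b", "x")
s1 = State(var=v)
s2 = State(var=v)
s1.edges.append(Edge(from_=s1, to=s2, label="a"))
d = DFAState(states=[s1])
```

Replacing the edges list on DFAState with a dictionary keyed by label would correspond to the hash or map variant mentioned above and makes the lookup of a transition by element name a constant-time operation.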

FIG. 5 illustrates the in-memory data structure of the DFA of FIG. 4. Unlike FIG. 4, however, FIG. 5 implements "edges" with a map structure rather than a list.

The DFAState at the left of FIG. 5 is an instance of the DFAState class and represents one state of the DFA. In the state shown in FIG. 5, its attributes states, edges, type, var, and varId have each been set. Here, states holds the group of NFA states, and edges holds pairs of an edge label and a reference to an Edge instance. Because the state is non-terminal, type is "non-terminal". var is a reference to a Variable instance. In this example varId is 3, meaning that this DFAState reflects the XPath expressions up to and including the third Variable.

The Edge instances are located in the middle (left of center) of FIG. 5, and the DFAState instances that are the destinations of those Edges are located right of center. As FIG. 5 shows, the destination of an Edge is also allowed to be Null (the DFAState has not yet been generated).

From the already constructed DFA managed by the automaton management unit 15, the DFA update module 17 identifies the pre-update automaton states that correspond to states accepted by the post-update automaton, and updates each identified state differentially.

In other words, differential updating means updating the pre-update DFA into the post-update DFA using a differential XPath expression, namely the difference between the pre-update XPath expressions and the post-update XPath expressions, instead of using all of the post-update XPath expressions. The automaton management unit 15 manages the pre-update XPath expressions, the pre-update NFA derived from them (the NFA in the middle of FIG. 3), and the pre-update DFA derived from the pre-update NFA (the DFA at the bottom of FIG. 3). The update automaton management unit 18 manages the differential NFA derived from the differential XPath expression.

The DFA update module 17 then partially updates the pre-update DFA into the post-update DFA using the differential NFA. That is, the DFA update module 17 converts the differential XPath expression into a differential NFA and adds the differential NFA to the pre-update NFA and the pre-update DFA, thereby differentially adding the information of the differential XPath expression.

Next, an example of pre-update data shown in FIG. 6 and an example of post-update data shown in FIG. 7 are described.

FIG. 6 shows an example of the data managed by the automaton management unit 15. From three XPath expressions (the pre-update XPath expressions; see FIG. 6(a)), the DFA creation module 13B derives the NFA corresponding to each expression, merges those NFAs into an integrated NFA (the pre-update NFA; see FIG. 6(b)), and derives a DFA from it (the pre-update DFA; see FIG. 6(c)).

For example, "/a//b" shown in FIG. 6(a) consists of states S1, S2, and S3 and transitions a, *, and b. Because //label matches elements named label at any depth of the input XML data, it is mapped in the NFA to a self-transition on the wildcard (*). "/d//b" consists of states S4, S5, and S6 and transitions d, *, and b, and "/a/c/b" consists of states S7, S8, S9, and S10 and transitions a, c, and b.
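The mapping from a simple XPath expression to its NFA can be sketched as below. The function xpath_to_nfa and its tuple representation of transitions are hypothetical, and the sketch handles only the child (/) and descendant (//) steps with plain element names that appear in the example; predicates, attributes, and explicit wildcards in the expression are outside its scope.

```python
import re
from typing import List, Tuple

Transition = Tuple[str, str, str]  # (source state, label, destination state)


def xpath_to_nfa(xpath: str, first_id: int) -> Tuple[List[Transition], str]:
    """Return the transitions and the terminal state for a simple XPath."""
    # Split "/a//b" into [("/", "a"), ("//", "b")].
    steps = re.findall(r"(//|/)([^/]+)", xpath)
    cur = first_id
    transitions: List[Transition] = []
    for axis, name in steps:
        if axis == "//":
            # Any depth: wildcard self-transition on the current state.
            transitions.append((f"S{cur}", "*", f"S{cur}"))
        transitions.append((f"S{cur}", name, f"S{cur + 1}"))
        cur += 1
    return transitions, f"S{cur}"


nfa_ab = xpath_to_nfa("/a//b", 1)
nfa_db = xpath_to_nfa("/d//b", 4)
nfa_acb = xpath_to_nfa("/a/c/b", 7)
```

With the starting numbers 1, 4, and 7, the state names produced for the three expressions line up with S1 to S3, S4 to S6, and S7 to S10 as listed above.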

Next, the DFA creation module 13B joins these three NFAs by introducing state S0 and epsilon transitions from S0 to the start state of each NFA (creating the integrated NFA shown in FIG. 6(b)).

The DFA creation module 13B then generates a DFA equivalent to this integrated NFA (FIG. 6(c)). A DFA can be created, for example, by subset construction; this example uses a lazily evaluated DFA of the kind shown in Patent Document 1, so some destination states are NULL. Each DFA state consists of a transition group and an NFA state group listing the NFA states it corresponds to. To simplify the notation, the prefix S is omitted from NFA states in the NFA state groups and only the numbers are written. Thus, for example, state S0 in FIG. 6(b) corresponds to the "0" in state {0, 1, 4, 7} in FIG. 6(c).
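How the start state {0, 1, 4, 7} arises, and what lazy evaluation means in practice, can be sketched as follows. The NFA table encodes the integrated NFA as described in the text; the names EPS, eps_closure, move, and LazyDFA are hypothetical, and treating a "*" transition as matching every element name is the usual wildcard reading rather than a detail taken from FIG. 6(c). The states computed beyond the start state are the output of this sketch and are not claimed to reproduce the figure.

```python
EPS = None  # marker for an epsilon transition

# Integrated NFA: state number -> list of (label, destination).
NFA = {
    0: [(EPS, 1), (EPS, 4), (EPS, 7)],
    1: [("a", 2)],
    2: [("*", 2), ("b", 3)],   # /a//b
    4: [("d", 5)],
    5: [("*", 5), ("b", 6)],   # /d//b
    7: [("a", 8)],
    8: [("c", 9)],
    9: [("b", 10)],            # /a/c/b
}


def eps_closure(states):
    """All NFA states reachable from `states` through epsilon transitions."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for label, t in NFA.get(s, []):
            if label is EPS and t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def move(states, symbol):
    """One subset-construction step on an element name."""
    targets = {t for s in states for label, t in NFA.get(s, [])
               if label in (symbol, "*")}
    return eps_closure(targets)


class LazyDFA:
    """DFA whose transitions are computed only when first requested."""

    def __init__(self):
        self.start = eps_closure({0})
        self.table = {}  # (DFA state, symbol) -> DFA state

    def step(self, state, symbol):
        key = (state, symbol)
        if key not in self.table:  # destination not generated yet
            self.table[key] = move(state, symbol)
        return self.table[key]


dfa = LazyDFA()
after_a = dfa.step(dfa.start, "a")
after_ac = dfa.step(after_a, "c")
after_acb = dfa.step(after_ac, "b")
```

Only the three transitions actually followed are stored in dfa.table; every other destination stays ungenerated, which plays the role of the NULL destinations mentioned above.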

FIG. 7 shows an example of the data stored in the storage device. The update automaton management unit 18 manages the XPath expression to be added (the difference XPath expression; see FIG. 7(a)) and the NFA derived from it (the difference NFA; see FIG. 7(b)). The automaton management unit 15 manages the NFA obtained by adding the difference NFA to the pre-update NFA of FIG. 6(b) (the post-update NFA; see FIG. 7(c)).

FIGS. 8 and 9 show the post-update DFA that results when the DFA update module 17 adds the difference NFA of FIG. 7(b) to the pre-update DFA of FIG. 6(c). Hatched states are states that have been updated.

The processing of the DFA update module 17 is described in detail below with reference to flowcharts. The DFA update module 17 performs a differential update of the DFA according to the flowcharts in FIGS. 10 to 12.

FIG. 10 shows the DFA update flow. In S200, the DFA derived for processing the plurality of XPath expressions (the pre-update DFA for the pre-update XPath expressions) is read, and its start state is held as state ds. In S210, the XPath expression to be added (the difference XPath expression) is read, an NFA (the difference NFA) is derived from it, and its start state is held as state ns. In S220, the subroutine "DFA state update" is called with state ds and state ns as arguments. At any point in the DFA update flow of FIG. 10, the DFA update module 17 also applies the difference NFA of FIG. 7(b) to the pre-update NFA of FIG. 6(b) to obtain the post-update NFA of FIG. 7(c).

FIG. 11 shows the details of the "DFA state update" subroutine, specifically the flow for adding an XPath expression. To support the lazily evaluated DFA described in Patent Document 1 as well as a regular DFA, S310 determines whether state ds is NULL (not yet evaluated). If it is NULL (S310, Yes), the state to be updated has not been evaluated, so the subroutine returns. If state ds exists (S310, No), S320 adds the NFA state ns of the XPath expression being added to the NFA state group held by state ds.

Next, to update the DFA using the transitions out of NFA state ns, S330 obtains the transition group of state ns. S340 determines whether the obtained transition group contains a transition for which S350 and S360 have not yet been performed. If there is no unprocessed transition, the DFA has been updated for every transition and the subroutine returns. If there is one, S350 selects one unprocessed transition and holds it as transition ne, and S360 calls the subroutine "DFA state update by transition" with state ds and transition ne as arguments.

FIG. 12 shows the details of the "DFA state update by transition" subroutine. S400 identifies, in the transition group held by DFA state ds, the transitions that match the NFA-side transition ne. S410 determines whether any matching transition exists. If none exists, the flow proceeds to S450, where transition ne is added to the transition group of state ds, and the subroutine returns. If a matching transition exists, the flow proceeds to S420, which determines whether any matching transition is still unprocessed.

If no unprocessed transition remains, the DFA update for transition ne is complete and the subroutine returns. If one remains, S430 selects one unprocessed transition and holds it as transition de. In S440, the DFA state that transition de leads to is held as state ds1 and the NFA state that transition ne leads to is held as state ns1. The subroutine "DFA state update" of S220 is then called with state ds1 and state ns1 as arguments.

As FIGS. 10 to 12 show, the present invention is characterized in that it walks the states of the NFA derived from the update XPath expression in order along its transitions, identifies during that walk the DFA transitions that match each NFA transition, checks the NFA state group and transition group held by each DFA state, and updates them with their difference from the difference NFA.
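The two mutually recursive subroutines of FIGS. 11 and 12 can be sketched in Python as below. This is an illustrative reading under stated assumptions, not the patented implementation: the class `DfaState`, the matching rule (a named label matches the same-named DFA transition, otherwise `other`, which is then split; `*` matches every DFA transition), the use of `None` as the destination of a newly added transition, and the early return when `ns` is already in the group (added so that cycles terminate) are all choices made for this example. Leaf states are given empty transition tables to stand for unevaluated parts of a lazy DFA.

```python
import copy

OTHER = "other"  # default DFA transition, named as in the worked example

class DfaState:
    def __init__(self, nfa_states, transitions=None):
        self.nfa_states = set(nfa_states)     # NFA state group
        self.transitions = transitions or {}  # label -> DfaState, or None for NULL

def update_state(ds, ns, nfa):
    """FIG. 11: merge NFA state ns into DFA state ds, then follow ns's transitions."""
    if ds is None:              # S310: state not yet evaluated
        return
    if ns in ds.nfa_states:     # assumption of this sketch: stop on revisits
        return
    ds.nfa_states.add(ns)       # S320
    for label, ns1 in nfa[ns]:  # S330 to S350
        update_by_transition(ds, label, ns1, nfa)  # S360

def update_by_transition(ds, label, ns1, nfa):
    """FIG. 12: apply one NFA transition (label, ns1) to DFA state ds."""
    if label == "*":
        matching = list(ds.transitions)
    elif label in ds.transitions:
        matching = [label]
    elif OTHER in ds.transitions:
        # Split "other" into "other" and label, copying the destination structure.
        ds.transitions[label] = copy.deepcopy(ds.transitions[OTHER])
        matching = [label]
    else:
        matching = []
    if not matching:            # S410 (none), then S450
        ds.transitions[OTHER if label == "*" else label] = None
        return
    for de in matching:         # S420 to S440
        update_state(ds.transitions[de], ns1, nfa)

# Pre-update DFA for /a//b, /d//b, /a/c/b and a difference NFA with
# S11 -a-> S12, S12 -d-> S13, S12 -*-> S12.
s28 = DfaState({2, 8}, {"b": DfaState({2, 3}), "c": DfaState({2, 9}),
                        OTHER: DfaState({2})})
start = DfaState({0, 1, 4, 7}, {"a": s28, "d": DfaState({5})})
diff_nfa = {11: [("a", 12)], 12: [("d", 13), ("*", 12)], 13: []}
update_state(start, 11, diff_nfa)
print(sorted(start.nfa_states), sorted(s28.nfa_states))
```

In this sketch only the DFA states that the difference NFA can reach are touched; the state {5} reached by d from the start state is left as it was.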

To make the processing of FIGS. 10 to 12 concrete, the following applies it to the example of FIGS. 6 to 9 and describes how the DFA update module 17 performs the differential update of the DFA.

In S200 of FIG. 10, state ds holds the state consisting of {0, 1, 4, 7}. In S210, state ns holds state S11 of FIG. 7. In S220, the DFA state update is performed with this state ds and state ns.

In S310 of FIG. 11, state ds {0, 1, 4, 7} is not NULL, so S320 adds state S11 (state ns) to state ds, and the NFA state group held by state ds becomes {0, 1, 4, 7, 11}. S330 then obtains a as the transition out of state ns. Because transition a is unprocessed in S340, S350 holds a as transition ne.

Next, S360 performs "DFA state update by transition" with state ds and transition ne.

In S400 of FIG. 12, among the transitions a and d of state ds ({0, 1, 4, 7, 11}), transition a is identified as matching transition ne (a). Because a matching transition exists in S410, the flow proceeds to S420, and because transition a is unprocessed, it proceeds to S430. S430 takes the matching transition a and holds it as transition de. S440 holds the DFA state {2, 8} reached by transition de (a) as state ds1 and the NFA state S12 reached by transition ne (a) as state ns1, and calls the "DFA state update" subroutine of S220 with these as arguments.

The first recursive call of the DFA state update subroutine proceeds as follows.

In S310 of FIG. 11, state ds {2, 8} is not NULL, so S320 adds state S12 (state ns) to state ds, and the NFA state group held by state ds becomes {2, 8, 12}. S330 then obtains d and * as the transitions out of state ns. Because transition d is unprocessed in S340, S350 holds d as transition ne.

Next, S360 performs "DFA state update by transition" with state ds and transition ne. In S400 of FIG. 12, among the transitions b, c, and other of state ds ({2, 8, 12}), transition other is identified as matching transition ne (d). Because a matching transition exists in S410, the flow proceeds to S420, and because transition d is unprocessed, it proceeds to S430.

S430 takes the matching transition d and holds it as transition de. Because the DFA transition that d matched was other, the other transition is split into other and d, and the structure of its destination is copied. S440 holds the DFA state {2} reached by transition de (d) as state ds1 and the NFA state S13 reached by transition ne (d) as state ns1, and calls the "DFA state update" subroutine of S220 with these as arguments.

The second recursive call of the DFA state update subroutine proceeds as follows.

In S310 of FIG. 11, state ds {2} is not NULL, so S320 adds state S13 (state ns) to state ds, and the NFA state group held by state ds becomes {2, 13}. S330 then tries to obtain the transitions out of state ns, but there are none. Because no unprocessed transition exists in S340, the subroutine returns. FIG. 8 shows the DFA at this point.

With the S220 call in FIG. 12 complete, the flow returns to S420. No unprocessed transition remains in S420, so the subroutine returns.

With the S360 call in FIG. 11 complete, the flow returns to S340. Because transition * is unprocessed in S340, S350 holds * as transition ne. S360 then performs "DFA state update by transition" with state ds and transition ne. In S400 of FIG. 12, among the transitions d, b, c, and other of state ds ({2, 8, 12}), all of d, b, c, and other are identified as matching transition ne (*). Because matching transitions exist in S410, the flow proceeds to S420, and because transition d is unprocessed, it proceeds to S430. S430 takes the matching transition d and holds it as transition de.

S440 holds the DFA state {2, 13} reached by transition de (d) as state ds1 and the NFA state S12 reached by transition ne (*) as state ns1, and calls the "DFA state update" subroutine of S220 with these as arguments.

Repeating this processing recursively updates the DFA of FIG. 6 until it finally becomes the DFA shown in FIG. 9.

This concludes the description of the first embodiment (differential addition). The second embodiment (differential deletion) is described next. The configuration of the filter engine 10 in the second embodiment is the same as in the first embodiment, so its description is omitted.

FIG. 13 shows an example of the data to be deleted from the pre-update data of FIG. 6. FIGS. 14 to 16 show the DFA that results from deleting the NFA of FIG. 13(b) from the DFA of FIG. 6. Hatched states are states that have been updated.

The DFA update module 17 performs the differential update (differential deletion) of the DFA according to the flows in FIGS. 17 to 19.

FIG. 17 shows the DFA deletion flow. In S500, the DFA derived for processing the plurality of XPath expressions is read, and its start state is held as state ds. In S510, the NFA corresponding to the expression to be deleted is read, and its start state is held as state ns. In S520, the subroutine "DFA state deletion" is called with state ds and state ns as arguments. After the DFA state deletion, S530 deletes the NFA corresponding to the deletion target.

FIG. 18 shows the details of the "DFA state deletion" subroutine, specifically the flow for deleting an XPath expression. As in differential addition, S610 determines whether state ds is NULL (not yet evaluated). If it is NULL (S610, Yes), the target state has not been evaluated, so the subroutine returns.

If state ds exists, S620 deletes the NFA state ns of the XPath expression being deleted from the NFA state group held by state ds. S630 then checks the NFA state group held by state ds. If the group is empty, S640 deletes the part of the DFA under state ds. If the group is not empty, S650 obtains the transition group of state ns so that the deletion can proceed along the transitions out of NFA state ns.

S660 determines whether the transition group obtained in S650 contains a transition for which S670 and S680 have not yet been performed. If there is no unprocessed transition (S660, none), the DFA deletion has been completed for every transition. If there is one (S660, present), S670 selects one unprocessed transition and holds it as transition ne. S680 then calls the subroutine "DFA state deletion by transition" with state ds and transition ne as arguments, and the flow returns to S660.

FIG. 19 shows the details of the "DFA state deletion by transition" subroutine. S700 identifies, in the transition group held by DFA state ds, the transitions that match the NFA-side transition ne. S710 determines whether any matching transition exists. If none exists, the subroutine returns. If one exists, the flow proceeds to S720, which determines whether any matching transition is still unprocessed. If none remains, the DFA deletion for transition ne is complete and the subroutine returns.

If an unprocessed transition remains, S730 selects one and holds it as transition de. In S740, the DFA state that transition de leads to is held as state ds1 and the NFA state that transition ne leads to is held as state ns1. The subroutine "DFA state deletion" of S520 is then called with state ds1 and state ns1 as arguments.
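The deletion flow of FIGS. 17 to 19 can be sketched in the same style. The following Python code is a simplified illustration under assumptions that go beyond the text: `delete_state` returns `True` when a DFA state has lost all of its NFA states so that the caller can drop the transition leading to it, a named label is matched only against the same-named DFA transition, `*` matches every DFA transition, and a state that no longer contains `ns` is skipped. The class `DfaState` is likewise hypothetical.

```python
OTHER = "other"

class DfaState:
    def __init__(self, nfa_states, transitions=None):
        self.nfa_states = set(nfa_states)     # NFA state group
        self.transitions = transitions or {}  # label -> DfaState, or None for NULL

def delete_state(ds, ns, nfa):
    """FIG. 18: remove NFA state ns from ds; return True if ds became empty."""
    if ds is None or ns not in ds.nfa_states:  # S610, plus a revisit guard
        return False
    ds.nfa_states.discard(ns)                  # S620
    if not ds.nfa_states:                      # S630
        ds.transitions.clear()                 # S640: drop everything under ds
        return True
    for label, ns1 in nfa[ns]:                 # S650 to S670
        delete_by_transition(ds, label, ns1, nfa)  # S680
    return False

def delete_by_transition(ds, label, ns1, nfa):
    """FIG. 19: follow one NFA transition (label, ns1) out of DFA state ds."""
    if label == "*":
        matching = list(ds.transitions)
    else:
        matching = [label] if label in ds.transitions else []
    for de in matching:                        # S720 to S740
        if delete_state(ds.transitions.get(de), ns1, nfa):
            ds.transitions.pop(de, None)       # unlink the emptied state

# Pre-update DFA for /a//b, /d//b, /a/c/b; delete the NFA of /a//b (S1, S2, S3).
s28 = DfaState({2, 8}, {"b": DfaState({2, 3}), "c": DfaState({2, 9}),
                        OTHER: DfaState({2})})
start = DfaState({0, 1, 4, 7}, {"a": s28, "d": DfaState({5})})
deleted_nfa = {1: [("a", 2)], 2: [("b", 3), ("*", 2)], 3: []}
delete_state(start, 1, deleted_nfa)
# S530 would then remove S1 to S3 from the integrated NFA itself.
print(sorted(start.nfa_states), sorted(s28.nfa_states), sorted(s28.transitions))
```

After the call, the state reached by a holds only NFA state 8 and keeps only its c transition, which is what remains of "/a/c/b" once "/a//b" is gone.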

To make the processing of FIGS. 17 to 19 concrete, the following applies it to the example of FIG. 6 and FIGS. 13 to 16 and describes how the DFA update module 17 performs the differential update of the DFA.

In S500 of FIG. 17, state ds holds the state consisting of {0, 1, 4, 7}. In S510, state ns holds state S1 of FIG. 13. In S520, the DFA state deletion is performed with this state ds and state ns.

In S610 of FIG. 18, state ds {0, 1, 4, 7} is not NULL, so S620 deletes state S1 (state ns) from state ds, and the NFA state group held by state ds becomes {0, 4, 7}. Because state ds {0, 4, 7} still holds NFA states in S630, S650 obtains a as the transition out of state ns. Because transition a is unprocessed in S660, S670 holds a as transition ne. S680 then performs "DFA state deletion by transition" with state ds and transition ne.

In S700 of FIG. 19, among the transitions a and d of state ds {0, 4, 7}, transition a is identified as matching transition ne (a). Because a matching transition exists in S710, the flow proceeds to S720, and because transition a is unprocessed, it proceeds to S730. S730 takes the matching transition a and holds it as transition de. S740 holds the DFA state {2, 8} reached by transition de (a) as state ds1 and the NFA state S2 reached by transition ne (a) as state ns1, and calls the "DFA state deletion" subroutine of S520 with these as arguments.

The first recursive call of the DFA state deletion subroutine proceeds as follows.

In S610 of FIG. 18, state ds {2, 8} is not NULL, so S620 deletes state S2 (state ns) from state ds, and the NFA state group held by state ds becomes {8}. Because state ds {8} still holds an NFA state in S630, S650 obtains b and * as the transitions out of state ns. Because transition b is unprocessed in S660, S670 holds b as transition ne. S680 then performs "DFA state deletion by transition" with state ds and transition ne.

In S700 of FIG. 19, among the transitions b, c, and other of state ds {8}, the transitions b and other are identified as matching transition ne (b). Because matching transitions exist in S710, the flow proceeds to S720, and because transition b is unprocessed, it proceeds to S730. S730 takes the matching transition b and holds it as transition de. S740 holds the DFA state {2, 3} reached by transition de (b) as state ds1 and the NFA state S3 reached by transition ne (b) as state ns1, and calls the "DFA state deletion" subroutine of S520 with these as arguments.

The second recursive call of the DFA state deletion subroutine proceeds as follows.

In S610 of FIG. 18, state ds {2, 3} is not NULL, so S620 deletes state S3 (state ns) from state ds, and the NFA state group held by state ds becomes {2}. Because state ds {2} still holds an NFA state in S630, S650 tries to obtain the transitions out of state ns. No unprocessed transition exists in S660, so the subroutine returns. FIG. 14 shows the DFA at this point.

With the S520 call in FIG. 19 complete, the flow returns to S720. Because transition other is unprocessed in S720, the flow proceeds to S730. S730 takes the matching transition other and holds it as transition de. S740 holds the DFA state {2} reached by transition de (other) as state ds1 and the NFA state S2 reached by transition ne (other) as state ns1, and calls the "DFA state deletion" subroutine of S520 with these as arguments.

The third recursive call of the DFA state deletion subroutine proceeds as follows.

In S610 of FIG. 18, state ds {2} is not NULL, so S620 deletes state S2 (state ns) from state ds, and the NFA state group held by state ds becomes NULL (empty). Because state ds no longer holds any NFA state in S630, S640 deletes d8 and the subroutine returns.

With the S520 call in FIG. 19 complete, the flow returns to S720. No unprocessed transition remains in S720, so the subroutine returns. FIG. 15 shows the DFA at this point.

With the S680 call in FIG. 18 complete, the flow returns to S660. Because transition * is unprocessed in S660, S670 holds * as transition ne. S680 then performs "DFA state deletion by transition" with state ds and transition ne.

In S700 of FIG. 19, among the transitions b, c, and other of state ds {8}, all of b, c, and other are identified as matching transition ne (*). Because matching transitions exist in S710, the flow proceeds to S720, and because transition c is unprocessed, it proceeds to S730. S730 takes the matching transition c and holds it as transition de. S740 holds the DFA state {2, 9} reached by transition de (c) as state ds1 and the NFA state S2 reached by transition ne (*) as state ns1, and calls the "DFA state deletion" subroutine of S520 with these as arguments.

Repeating this processing updates the DFA of FIG. 6 until it finally becomes the DFA shown in FIG. 16. Once the DFA state update of S520 in FIG. 17 has finished, S530 deletes from the NFA of FIG. 6 the NFA corresponding to the XPath expression being deleted, which gives the NFA of FIG. 16.

This concludes the description of the second embodiment (differential deletion). As the first embodiment (FIGS. 10 to 12) and the second embodiment (FIGS. 17 to 19) show, the present invention is characterized in that it walks the states of the pre-update NFA derived from the pre-update XPath expressions in order along its transitions (depth-first, breadth-first, or the like), identifies during that walk the transitions of the pre-update DFA that match each transition of the pre-update NFA, checks the NFA state group and transition group held by each state of the pre-update DFA, and updates them with their difference from the difference NFA.

The present invention is not limited to the embodiments described above and can be modified in many ways. For example, NewsML is only one example, and the XML data is not limited to NewsML; it may be NITF (News Industry Text Format) data or the like. The filter engine 10 may be installed, for example, by an ASP (Application Service Provider) for corporate or individual users, or by a company or organization for its own employees or members. The means and method shown in the embodiments for differentially adding XPath expression information to a DFA are likewise only an example, and the present invention is not limited to them. The flows described above may also be transmitted over a network as an XPath expression processing program, or stored on a storage medium such as a CD-ROM and distributed.

The filter engine 10 need not be a single device; its functions may be distributed across several devices. For example, the device that performs the filtering (the XML parse module 11, the data extraction module 13, and the data conversion module 16) and the device that creates the automaton used for filtering (the query parse module 12, the DFA creation module 13B, the automaton management unit 15, the DFA update module 17, and the update automaton management unit 18) may be configured as separate devices. This distributes the load across the devices and makes high-speed processing possible.

FIG. 1 is a block diagram outlining a filter engine that executes the XPath expression processing method according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the internal configuration of the filter engine of FIG. 1 according to an embodiment of the present invention.
FIG. 3 is an explanatory diagram showing the in-memory data organization of the automaton management unit in FIG. 2 according to an embodiment of the present invention.
FIG. 4 is an explanatory diagram showing the concrete program-level data structure of FIG. 3 according to an embodiment of the present invention.
FIG. 5 is an explanatory diagram showing the in-memory data organization of the DFA of FIG. 4 according to an embodiment of the present invention.
FIG. 6 is a state transition diagram showing an example of the pre-update data according to an embodiment of the present invention.
FIG. 7 is a state transition diagram showing an example of the data to be added according to the first embodiment of the present invention.
FIG. 8 is a state transition diagram showing an example of DFA addition (intermediate state) according to the first embodiment of the present invention.
FIG. 9 is a state transition diagram showing an example of DFA addition (final state) according to the first embodiment of the present invention.
FIG. 10 is a flowchart showing the DFA addition processing according to the first embodiment of the present invention.
FIG. 11 is a flowchart showing the DFA state update subroutine according to the first embodiment of the present invention.
FIG. 12 is a flowchart showing the DFA state update by transition subroutine according to the first embodiment of the present invention.
FIG. 13 is a state transition diagram showing the XPath expression to be deleted and its NFA according to the second embodiment of the present invention.
FIG. 14 is a state transition diagram showing an example of DFA deletion (intermediate state) according to the second embodiment of the present invention.
FIG. 15 is a state transition diagram showing an example of DFA deletion (intermediate state) according to the second embodiment of the present invention.
FIG. 16 is a state transition diagram showing an example of DFA deletion (final state) according to the second embodiment of the present invention.
FIG. 17 is a flowchart showing the DFA deletion processing according to the second embodiment of the present invention.
FIG. 18 is a flowchart showing the DFA state deletion subroutine according to the second embodiment of the present invention.
FIG. 19 is a flowchart showing the DFA state deletion by transition subroutine according to the second embodiment of the present invention.

Explanation of symbols

10 Filter engine
11 XML parsing module
12 Query parsing module
13 Data extraction module
13B DFA creation module
15 Automaton management unit
16 Data conversion module
17 DFA update module
18 Update automaton management unit

Claims

1. An XPath expression processing method for updating a pre-update DFA, which represents the pre-update XPath expression input for processing XML data, to an updated DFA, which represents the input post-update XPath expression, wherein
a computer executes:
a step of reading the pre-update DFA derived to process the pre-update XPath expression and holding its start state as a state ds;
a step of reading the XPath expression to be added, deriving a difference NFA from it, and holding its start state as a state ns;
a step of calling a first subroutine, which performs the operation of adding the XPath expression, with the state ds and the state ns as arguments; and
a step of reflecting the difference NFA in the pre-update NFA for processing the pre-update XPath expression to produce the post-update NFA,
the first subroutine executes:
a step of adding the state ns to the NFA state group held by the state ds, if the state ds exists;
a step of obtaining the transition group from the state ns; and
a step of selecting unprocessed transitions one at a time from the obtained transition group as a transition ne and calling a second subroutine with the state ds and the transition ne as arguments, and of ending this subroutine when no unprocessed transition remains to be selected as the transition ne, and
the second subroutine executes:
a step of adding the transition ne to the transition group of the state ds and ending this subroutine, if the transition group held by the state ds contains no transition matching the transition ne; and
a step of, if the transition group held by the state ds contains a transition matching the transition ne, selecting the unprocessed ones of those transitions one at a time as a transition de, holding the DFA state at the destination of the transition de as a state ds1, holding the NFA state at the destination of the transition ne as a state ns1, and calling the first subroutine in turn with the state ds1 and the state ns1 as arguments, the second subroutine thereby being configured as a subroutine that updates the pre-update DFA to the updated DFA.
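
As an illustration only, the first and second subroutines of the addition procedure can be sketched in Python roughly as follows. The sketch assumes XPath expressions restricted to child-axis paths such as /a/b, so that each difference NFA is a linear chain and two transitions match when their element names are equal. The names NfaState, DfaState, build_path_nfa, update_state and update_by_transition are hypothetical and do not come from the patent. The claim states only that an unmatched transition ne is added to the state ds; creating a fresh DFA state as its destination and filling it by recursion is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class NfaState:
    name: str
    # Outgoing transitions as (element name, destination NfaState) pairs.
    transitions: list = field(default_factory=list)


@dataclass(eq=False)
class DfaState:
    # Names of the NFA states that this DFA state stands for.
    nfa_states: set = field(default_factory=set)
    # Outgoing transitions as (element name, destination DfaState) pairs.
    transitions: list = field(default_factory=list)


def build_path_nfa(xpath, tag):
    """Build a linear difference NFA for a child-axis-only path (assumption)."""
    start = NfaState(f"{tag}:0")
    current = start
    for i, step in enumerate(xpath.strip("/").split("/"), start=1):
        nxt = NfaState(f"{tag}:{i}")
        current.transitions.append((step, nxt))
        current = nxt
    return start


def update_state(ds, ns):
    """First subroutine: record ns in ds, then handle each transition of ns."""
    if ds is None:
        return  # nothing to update when the DFA state does not exist
    ds.nfa_states.add(ns.name)
    for ne in ns.transitions:
        update_by_transition(ds, ne)


def update_by_transition(ds, ne):
    """Second subroutine: merge one NFA transition ne into DFA state ds."""
    label, ns1 = ne
    matching = [de for de in ds.transitions if de[0] == label]
    if not matching:
        # No matching transition: add one to ds and stop.
        # Assumption: its destination is a new DFA state filled by recursion.
        ds1 = DfaState()
        ds.transitions.append((label, ds1))
        update_state(ds1, ns1)
        return
    for _, ds1 in matching:
        # Matching transition de exists: descend into its destination.
        update_state(ds1, ns1)


# Usage: add /a/b and then /a/c to an initially empty DFA.
dfa = DfaState()
update_state(dfa, build_path_nfa("/a/b", "q1"))
update_state(dfa, build_path_nfa("/a/c", "q2"))
```

In this usage, the second call reuses the existing transition on a and adds only the new branch on c, so the DFA is extended by the difference rather than rebuilt from the whole NFA.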

2. The XPath expression processing method according to claim 1, for updating a pre-update DFA, which represents the pre-update XPath expression input for processing XML data, to an updated DFA, which represents the input post-update XPath expression, wherein
the computer executes:
a step of reading the pre-update DFA derived to process the pre-update XPath expression and holding its start state as a state ds;
a step of deriving the NFA to be deleted and holding its start state as a state ns;
a step of calling a third subroutine, which performs the operation of deleting the XPath expression, with the state ds and the state ns as arguments; and
a step of deleting the NFA to be deleted from the pre-update NFA for processing the pre-update XPath expression to produce the post-update NFA,
the third subroutine executes:
a step of, if the state ds exists, deleting the state ns from the NFA state group held by the state ds and, when the state ds no longer holds any NFA state group, deleting everything subordinate to the state ds;
a step of obtaining the transition group from the state ns; and
a step of selecting unprocessed transitions one at a time from the obtained transition group as a transition ne and calling a fourth subroutine with the state ds and the transition ne as arguments, and of ending this subroutine when no unprocessed transition remains to be selected as the transition ne, and
the fourth subroutine executes:
a step of ending this subroutine, if the transition group held by the state ds contains no transition matching the transition ne; and
a step of, if the transition group held by the state ds contains a transition matching the transition ne, selecting the unprocessed ones of those transitions one at a time as a transition de, holding the DFA state at the destination of the transition de as a state ds1, holding the NFA state at the destination of the transition ne as a state ns1, and calling the third subroutine in turn with the state ds1 and the state ns1 as arguments, the fourth subroutine thereby being configured as a subroutine that updates the pre-update DFA to the updated DFA.
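
As an illustration only, the third and fourth subroutines of the deletion procedure can be sketched in Python roughly as follows. The sketch assumes XPath expressions restricted to child-axis paths such as /a/b, with transitions matching when their element names are equal. The names NfaState, DfaState, build_path_nfa, add_path, delete_state and delete_by_transition are hypothetical; add_path is a compact helper included only to build a DFA to delete from. The claim does not say how the transition leading to an emptied DFA state is removed, so pruning it in the caller is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class NfaState:
    name: str
    # Outgoing transitions as (element name, destination NfaState) pairs.
    transitions: list = field(default_factory=list)


@dataclass(eq=False)
class DfaState:
    # Names of the NFA states that this DFA state stands for.
    nfa_states: set = field(default_factory=set)
    # Outgoing transitions as (element name, destination DfaState) pairs.
    transitions: list = field(default_factory=list)


def build_path_nfa(xpath, tag):
    """Build a linear NFA for a child-axis-only path (assumption)."""
    start = NfaState(f"{tag}:0")
    current = start
    for i, step in enumerate(xpath.strip("/").split("/"), start=1):
        nxt = NfaState(f"{tag}:{i}")
        current.transitions.append((step, nxt))
        current = nxt
    return start


def add_path(ds, ns):
    """Helper (hypothetical) that merges an NFA into the DFA for the demo."""
    ds.nfa_states.add(ns.name)
    for label, ns1 in ns.transitions:
        targets = [t for lab, t in ds.transitions if lab == label]
        if not targets:
            targets = [DfaState()]
            ds.transitions.append((label, targets[0]))
        for ds1 in targets:
            add_path(ds1, ns1)


def delete_state(ds, ns):
    """Third subroutine: remove ns from ds, then handle each transition of ns."""
    if ds is None:
        return  # nothing to delete when the DFA state does not exist
    ds.nfa_states.discard(ns.name)
    if not ds.nfa_states:
        # ds no longer stands for any NFA state: drop everything below it.
        ds.transitions.clear()
    for ne in ns.transitions:
        delete_by_transition(ds, ne)


def delete_by_transition(ds, ne):
    """Fourth subroutine: follow transitions of ds that match ne."""
    label, ns1 = ne
    matching = [de for de in ds.transitions if de[0] == label]
    for de in matching:  # an empty list means "no match": simply return
        ds1 = de[1]
        delete_state(ds1, ns1)
        if not ds1.nfa_states:
            # Assumption: also prune the transition to an emptied DFA state.
            ds.transitions.remove(de)


# Usage: build a DFA for /a/b and /a/c, then delete /a/c again.
q1 = build_path_nfa("/a/b", "q1")
q2 = build_path_nfa("/a/c", "q2")
dfa = DfaState()
add_path(dfa, q1)
add_path(dfa, q2)
delete_state(dfa, q2)
```

In this usage, only the branch on c is removed; the shared state reached on a survives because it still holds an NFA state of the remaining expression.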

3. An XPath expression processing program that causes a computer to execute the XPath expression processing method according to claim 1 or claim 2.

4. A computer-readable recording medium storing the XPath expression processing program according to claim 3.

5. An XPath expression processing apparatus that updates a pre-update DFA, which represents the pre-update XPath expression input for processing XML data, to an updated DFA, which represents the input post-update XPath expression, wherein
the apparatus executes:
a step of reading the pre-update DFA derived to process the pre-update XPath expression and holding its start state as a state ds;
a step of reading the XPath expression to be added, deriving a difference NFA from it, and holding its start state as a state ns;
a step of calling a first subroutine, which performs the operation of adding the XPath expression, with the state ds and the state ns as arguments; and
a step of reflecting the difference NFA in the pre-update NFA for processing the pre-update XPath expression to produce the post-update NFA,
the first subroutine executes:
a step of adding the state ns to the NFA state group held by the state ds, if the state ds exists;
a step of obtaining the transition group from the state ns; and
a step of selecting unprocessed transitions one at a time from the obtained transition group as a transition ne and calling a second subroutine with the state ds and the transition ne as arguments, and of ending this subroutine when no unprocessed transition remains to be selected as the transition ne, and
the second subroutine executes:
a step of adding the transition ne to the transition group of the state ds and ending this subroutine, if the transition group held by the state ds contains no transition matching the transition ne; and
a step of, if the transition group held by the state ds contains a transition matching the transition ne, selecting the unprocessed ones of those transitions one at a time as a transition de, holding the DFA state at the destination of the transition de as a state ds1, holding the NFA state at the destination of the transition ne as a state ns1, and calling the first subroutine in turn with the state ds1 and the state ns1 as arguments, the second subroutine thereby being configured as a subroutine that updates the pre-update DFA to the updated DFA.

6. The XPath expression processing apparatus according to claim 5, which updates a pre-update DFA, which represents the pre-update XPath expression input for processing XML data, to an updated DFA, which represents the input post-update XPath expression, wherein
the apparatus executes:
a step of reading the pre-update DFA derived to process the pre-update XPath expression and holding its start state as a state ds;
a step of deriving the NFA to be deleted and holding its start state as a state ns;
a step of calling a third subroutine, which performs the operation of deleting the XPath expression, with the state ds and the state ns as arguments; and
a step of deleting the NFA to be deleted from the pre-update NFA for processing the pre-update XPath expression to produce the post-update NFA,
the third subroutine executes:
a step of, if the state ds exists, deleting the state ns from the NFA state group held by the state ds and, when the state ds no longer holds any NFA state group, deleting everything subordinate to the state ds;
a step of obtaining the transition group from the state ns; and
a step of selecting unprocessed transitions one at a time from the obtained transition group as a transition ne and calling a fourth subroutine with the state ds and the transition ne as arguments, and of ending this subroutine when no unprocessed transition remains to be selected as the transition ne, and
the fourth subroutine executes:
a step of ending this subroutine, if the transition group held by the state ds contains no transition matching the transition ne; and
a step of, if the transition group held by the state ds contains a transition matching the transition ne, selecting the unprocessed ones of those transitions one at a time as a transition de, holding the DFA state at the destination of the transition de as a state ds1, holding the NFA state at the destination of the transition ne as a state ns1, and calling the third subroutine in turn with the state ds1 and the state ns1 as arguments, the fourth subroutine thereby being configured as a subroutine that updates the pre-update DFA to the updated DFA.