JP2002297569A

JP2002297569A - Structured document conversion device and query conversion device

Info

Publication number: JP2002297569A
Application number: JP2001099375A
Authority: JP
Inventors: Nobuko Itani; 宣子井谷
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2001-03-30
Filing date: 2001-03-30
Publication date: 2002-10-11
Anticipated expiration: 2021-03-30
Also published as: JP4689856B2

Abstract

(57) Abstract

Problem: The present invention relates to a structured document conversion device and a query conversion device. It aims to reduce the working memory required and to improve data access efficiency by making the hierarchy of a structured document shallower.

Solution: The structured document conversion device comprises a structured document holding unit 1 that holds a structured document; a partial region acquisition unit 2 that acquires, from the structured document, the partial region enclosed between the start tag and end tag of a predetermined element name; a structure conversion unit 3 that converts the acquired partial region into a single-layer structure; and a structured document output unit 4 that replaces the partial region of the structured document with the converted region and outputs the result. With this configuration, the partial region enclosed between the start tag and end tag of a predetermined element name is converted into a single-layer structure.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001] Field of the Invention

The present invention relates to a structured document conversion device and a query conversion device used for converting the structure of structured documents such as XML documents.

[0002] In recent years, many systems, companies, and individuals have become connected through the Internet, and a wide range of data is now exchanged through EDI (Electronic Data Interchange), EC (Electronic Commerce), mobile phone services, services for digital television, Web services, and the like. In step with this, there is a movement to unify the formats of data handled by computers.

[0003] The aim of this movement is to make data formats, which until now differed from computer to computer and from application to application, usable across different computers and applications. The rules for this unification were formally recommended by the W3C (World Wide Web Consortium) in February 1998 as XML (eXtensible Markup Language). XML is a subset of a similar set of rules, SGML (Standard Generalized Markup Language).

[0004] In addition, DOM (Document Object Model), a standard interface for expanding an XML document into objects in memory and manipulating those objects, was recommended by the W3C in October 1998.

[0005] Description of the Related Art

As described above, structured documents such as XML documents have been known. For such structured documents, the following terms are used below, based on the XML rules: a character string enclosed in "<" and ">" is a tag; "<character string>" is a start tag; "</character string>" is an end tag; the character string enclosed between a start tag and an end tag is an element; the name of an element written inside a tag is an element name; and additional information attached to an element is an attribute.

[0006] A structured document describes its data structure by embedding tags in the document itself. Because the data structure is embedded in the document as tags, it is highly flexible and extensible. Moreover, because tags are written as text that is meaningful to a human reader, data that was previously handled by a single independent system can easily be handled by other systems as well.

[0007] Problems to be Solved by the Invention

The conventional art described above has the following problem. An XML document, for example, can take a tree-shaped data structure and can express deep hierarchies. A hierarchical structure is also easier for people to organize and leads to fewer mistakes in data manipulation. On a processing system, however, the deeper the hierarchy becomes, the more memory (storage capacity) is needed to represent the structure, and the worse the data access efficiency becomes.

[0008] An object of the present invention is to solve this conventional problem: by making the hierarchy of a structured document shallower, it reduces the working memory required and improves data access efficiency.

[0009] Means for Solving the Problems

To achieve the above object, the present invention is configured as follows.

[0010] (1): A structured document conversion device comprises a structured document holding unit that holds a structured document; a partial region acquisition unit that acquires, from the structured document, the partial region enclosed between the start tag and end tag of a predetermined element name; a structure conversion unit that converts the acquired partial region into a single-layer structure; and a structured document output unit that replaces the partial region of the structured document with the converted region and outputs the result. The device is characterized in that the partial region enclosed between the start tag and end tag of a predetermined element name is converted into a single-layer structure.

[0011] (2): In the structured document conversion device of (1), the structure conversion unit comprises element acquisition means for acquiring each element of the partial region; element name acquisition means for acquiring, for each element, the names of the elements in which it is nested; element name generation means for generating a new element name by joining the acquired element names; and structured document generation means for generating a structured document by enclosing each element in tags bearing the generated element name. The device is characterized in that the element name used in the single-layer structure is generated by joining the nested element names.

[0012] (3): The structured document conversion device of (2) comprises element name acquisition means for acquiring element names from the partial region in order from the inside of the nesting to the outside, and character string generation means for generating a character string in which a predetermined delimiter code is placed between the element names arranged in the order acquired. The device is characterized in that the element name used in the single-layer structure consists of the element names arranged from the inside of the nesting to the outside, with the predetermined delimiter code between them.

[0013] (4): A query conversion device that converts queries on structured documents in a structured document processing system comprises a structure conversion rule holding unit that holds structure conversion rules, and a query conversion unit that converts a query according to the structure conversion rules. The device is characterized in that a query on a structured document is converted according to the structure conversion rules before being passed to document processing.

[0014] (5): A structured document conversion device comprises an attribute-bearing tag detection unit that detects tags having attributes, and an attribute conversion unit that converts each attribute into the name of an element one layer below the element concerned and converts the attribute value into that element's content. The device is characterized in that tags having attributes are detected and their attributes are converted into elements in the layer below the tag.

[0015] (Operation) (a): In (1), the partial region acquisition unit acquires, from the structured document held in the structured document holding unit, the partial region enclosed between the start tag and end tag of a predetermined element name, and the structure conversion unit converts the acquired partial region into a single-layer structure. The structured document output unit then replaces the partial region of the structured document with the converted region and outputs the result.

[0016] In this way, the structured document conversion device converts the partial region enclosed between the start tag and end tag of a predetermined element name into a single-layer structure and outputs it. The hierarchy of the structured document therefore becomes shallower, which reduces the working memory required and improves data access efficiency.

[0017] (b): In (2), the element acquisition means acquires each element of the partial region, the element name acquisition means acquires, for each element, the names of the elements in which it is nested, the element name generation means generates a new element name by joining the acquired element names, and the structured document generation means generates a structured document by enclosing each element in tags bearing the generated element name.

[0018] In this way, the structure conversion unit generates the element names used in the single-layer structure by joining the nested element names. The hierarchy of the structured document can therefore be made shallower while still conforming to the description rules for structured documents. The converted document can be handled by existing structured document processing systems, the working memory required is reduced, and data access efficiency is improved.

[0019] (c): In (3), the element name acquisition means acquires element names from the partial region in order from the inside of the nesting to the outside, and the character string generation means generates a character string in which a predetermined delimiter code is placed between the element names arranged in the order acquired.

[0020] In this way, the structure conversion unit forms the element names used in the single-layer structure by arranging the element names from the inside of the nesting to the outside, with the predetermined delimiter code between them. The hierarchy of the structured document can therefore be made shallower while still conforming to the description rules for structured documents. The converted document can be handled by existing structured document processing systems, the working memory required is reduced, and data access efficiency is improved.

[0021] (d): In (4), the query conversion unit converts a query according to the structure conversion rules held in the structure conversion rule holding unit. In this way, the query conversion device converts a query on a structured document according to the structure conversion rules before passing it to document processing. Processing can therefore be requested with the same query after the structure conversion as before it, and the user benefits from the reduced working memory and improved data access efficiency without being aware of the conversion.

[0022] (e): In (5), the attribute-bearing tag detection unit detects tags having attributes, and the attribute conversion unit converts each attribute into the name of an element one layer below the element concerned and converts the attribute value into that element's content. In this way, the structured document conversion device detects tags having attributes and converts the attributes into elements in the layer below the tag. The hierarchy of a structured document that contains attributes can therefore also be made shallower, which reduces the working memory required and improves data access efficiency.
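The attribute conversion in (5) can be illustrated with a minimal Python sketch. It assumes the standard library parser `xml.etree.ElementTree`; the function name `attributes_to_children` and the sample element and attribute names are hypothetical and do not appear in the patent.

```python
import xml.etree.ElementTree as ET


def attributes_to_children(root: ET.Element) -> ET.Element:
    """Turn every attribute into a child element (illustrative sketch only)."""
    # Snapshot the elements first so that newly inserted children are not revisited.
    for elem in list(root.iter()):
        if not elem.attrib:
            continue  # the tag has no attributes, nothing to convert
        for position, (attr_name, attr_value) in enumerate(list(elem.attrib.items())):
            child = ET.Element(attr_name)   # attribute name becomes the element name
            child.text = attr_value         # attribute value becomes the element content
            elem.insert(position, child)    # placed before the existing children
        elem.attrib.clear()
    return root


sample = ET.fromstring('<telephone type="work"><outside>1</outside></telephone>')
converted = attributes_to_children(sample)
print(ET.tostring(converted, encoding="unicode"))
```

Once attributes have become ordinary child elements, the same flattening applied to nested elements can be applied to them as well.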

[0023] Embodiments of the Invention

Embodiments of the present invention are described in detail below with reference to the drawings.

[0024] §1: Structure conversion of a structured document (part 1)

FIG. 1 shows explanatory diagram (part 1) of the structure conversion of a structured document, and FIG. 2 shows explanatory diagram (part 2). In FIGS. 1 and 2, (a) shows text-based conversion, (b) shows object-based conversion, and (c) shows query conversion.

[0025] The conversions shown in FIGS. 1 and 2 make the hierarchy of a structured document shallower in order to reduce the working memory required, improve data access efficiency, and raise the performance of the processing device or processing system. The structure conversion processing is described below using concrete examples.

[0026] (1) Text-based structure conversion example

As shown in FIG. 1(a), this example performs the conversion on text. The name of an outer element and the names of the elements nested inside it are joined with a delimiter code (for example, "-") to form a new element name. In this example the original data was nested three layers deep, and the structure conversion reduces the nesting to a single layer, making the hierarchy of the structured document shallower. The details are as follows.

[0027] In the example of FIG. 1, "name" (氏名) and "surname" (姓) are joined with "-", as are "name" and "given name" (名). Likewise, "company" (会社), "address" (住所), and "postal code" (郵便番号) are joined with "-"; "company", "address", and "address" are joined with "-"; "company", "telephone" (電話), and "outside line" (外線) are joined with "-"; and "company", "telephone", and "extension" (内線) are joined with "-". This converts the structure from three layers to one.
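The outer-to-inner joining described for FIG. 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses `xml.etree.ElementTree` rather than operating on raw tag text, the element names are English stand-ins for the Japanese names in the figure, the sample values are placeholders, and the helper name `flatten_region` is hypothetical.

```python
import xml.etree.ElementTree as ET

DELIMITER = "-"  # delimiter code; the description gives "-" as an example


def flatten_region(region: ET.Element, delimiter: str = DELIMITER) -> ET.Element:
    """Return a copy of `region` whose leaf elements all sit one layer below it."""
    flat = ET.Element(region.tag, dict(region.attrib))

    def walk(node: ET.Element, path: list) -> None:
        children = list(node)
        if not children:
            # Leaf: the new element name is the joined path, outer name first.
            leaf = ET.SubElement(flat, delimiter.join(path))
            leaf.text = node.text
            return
        for child in children:
            walk(child, path + [child.tag])

    for child in region:
        walk(child, [child.tag])
    return flat


SOURCE = (
    "<person>"
    "<name><surname>A</surname><given>B</given></name>"
    "<company>"
    "<address><zip>C</zip><address>D</address></address>"
    "<telephone><outside>E</outside><extension>F</extension></telephone>"
    "</company>"
    "</person>"
)

flat = flatten_region(ET.fromstring(SOURCE))
print([element.tag for element in flat])
```

The sketch ignores mixed content (text that sits next to child elements), which the example in the description does not contain.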

[0028] (2) Object-based structure conversion example

In FIG. 2(b), "directory" (名簿), "individual" (個人), "name" (氏名), "company" (会社), "address" (住所), "telephone" (電話), "surname" (姓), "given name" (名), "postal code" (郵便番号), "address" (住所), "outside line" (外線), and "extension" (内線) are each called a node. "Directory" is the parent node of "individual", "individual" is the parent node of "name" and "company", and "name" is the parent node of "surname" and "given name". Conversely, "surname" and "given name" are child nodes of "name", and "name" and "company" are child nodes of "individual".

[0029] A tree built up from a single node, such as the portion enclosed by the broken line in the figure, is called a subtree. Its topmost node ("individual" in the figure) is called the root, and "surname", "given name", "postal code", "address", "outside line", and "extension" are each called a leaf.

[0030] As shown in FIG. 2(b), this example performs the conversion on objects. The element name of a parent node and the element names of its child nodes are joined with a delimiter code (for example, "-") to form an element name at the parent node's layer. In this example the original data was nested three layers deep, and the structure conversion reduces the nesting to a single layer, making the hierarchy of the structured document shallower. The details are as follows.

[0031] Within the subtree enclosed by the dotted line in the figure, the original objects form three layers (for example, "company", "telephone", "extension"). Applying the structure conversion indicated by the arrows in the figure to each of these three-layer clusters reduces each cluster to a single layer.

[0032] In this case, "name" and "surname" are joined with "-", as are "name" and "given name". Likewise, "company", "address", and "postal code" are joined with "-"; "company", "address", and "address" are joined with "-"; "company", "telephone", and "outside line" are joined with "-"; and "company", "telephone", and "extension" are joined with "-". This converts the structure from three layers to one.
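The object-based variant can be sketched in the same spirit on an in-memory tree. The `Node` class below is a hypothetical stand-in for a DOM-like object and is not the DOM interface itself; the node names are English stand-ins and the values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def flatten_subtree(root: Node, delimiter: str = "-") -> Node:
    """Return a new subtree in which every leaf is a direct child of the root."""
    leaves: List[Node] = []

    def collect(node: Node, path: List[str]) -> None:
        if not node.children:
            # Leaf node: join parent-to-child names into one element name.
            leaves.append(Node(delimiter.join(path), node.value))
            return
        for child in node.children:
            collect(child, path + [child.name])

    for child in root.children:
        collect(child, [child.name])
    return Node(root.name, root.value, leaves)


person = Node("person", children=[
    Node("name", children=[Node("surname", "A"), Node("given", "B")]),
    Node("company", children=[
        Node("telephone", children=[Node("outside", "C"), Node("extension", "D")]),
    ]),
])

flat_person = flatten_subtree(person)
print([child.name for child in flat_person.children])
```

Building a new subtree instead of mutating the original mirrors the description, in which the converted subtree replaces the designated one on output.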

[0033] (3) Query conversion

In query conversion, a query on a structured document is converted according to structure conversion rules held in advance and is then passed to document processing. In the example shown in FIG. 2(c), a description for obtaining the element content under the three layers "individual"/"name"/"surname" (個人/氏名/姓) is converted into a description for obtaining the element content of "individual-name-surname" (個人−氏名−姓).

[0034] §2: Structure conversion of a structured document (part 2)

FIG. 3 shows explanatory diagram (part 3) of the structure conversion of a structured document, and FIG. 4 shows explanatory diagram (part 4). In FIGS. 3 and 4, (a) shows text-based conversion, (b) shows object-based conversion, and (c) shows query conversion.

[0035] In contrast to the examples of FIGS. 1 and 2, the conversions shown in FIGS. 3 and 4 generate element names arranged from the inside of the nesting to the outside, that is, from leaf to root. Here too, the hierarchy of the structured document is made shallower in order to reduce the working memory required, improve data access efficiency, and raise the performance of the processing device or processing system. The structure conversion processing is described below using concrete examples.

[0036] (1) Text-based structure conversion example

As shown in FIG. 3(a), this example performs the conversion on text and arranges the element names from the inside of the nesting to the outside. Apart from the order of the element names, it is the same as the structure conversion of FIG. 1.

[0037] FIG. 3(a) shows the text after conversion; the original data is the same as in FIG. 1(a).

[0038] In the converted text, "surname" and "name" are joined with "-", as are "given name" and "name". Likewise, "postal code", "address", and "company" are joined with "-"; "address", "address", and "company" are joined with "-"; "outside line", "telephone", and "company" are joined with "-"; and "extension", "telephone", and "company" are joined with "-". This converts the structure from three layers to one.
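The only difference from the outer-to-inner case is the order in which the names are joined, which a short Python sketch makes concrete. The function name `flattened_name` and the English element names are illustrative assumptions.

```python
from typing import List, Sequence


def flattened_name(path: Sequence[str], delimiter: str = "-", inner_first: bool = True) -> str:
    """Join a root-to-leaf path of element names, innermost name first by default."""
    ordered = reversed(path) if inner_first else path
    return delimiter.join(ordered)


# Paths are listed outer to inner, as they appear in the nested source document.
paths: List[List[str]] = [
    ["name", "surname"],
    ["company", "address", "zip"],
    ["company", "telephone", "extension"],
]

names = [flattened_name(path) for path in paths]
print(names)
```

Passing `inner_first=False` reproduces the outer-to-inner names of FIGS. 1 and 2 from the same paths.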

[0039] (2) Object-based structure conversion example

As shown in FIG. 4(b), this example performs the conversion on objects and arranges the element names in the direction from leaf to root. Apart from the order of the element names, it is the same as the structure conversion of FIG. 2. FIG. 4(b) shows the object data after conversion; the original data is the same as in FIG. 2.

[0040] In the example shown in FIG. 4(b), "surname" and "name" are joined with "-", as are "given name" and "name". Likewise, "postal code", "address", and "company" are joined with "-"; "address", "address", and "company" are joined with "-"; "outside line", "telephone", and "company" are joined with "-"; and "extension", "telephone", and "company" are joined with "-". This converts the structure from three layers to one.

[0041] (3) Query conversion

In query conversion, a query on a structured document is converted according to structure conversion rules held in advance and is then passed to document processing. In the example shown in FIG. 3(c), a description for obtaining the element content of "surname" (姓) is converted into a description for obtaining the element content of "surname*" (姓＊). Here "*" is the prefix-match symbol and means that any characters may follow "surname".
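Prefix matching against inner-first element names can be sketched with shell-style wildcards. The sketch assumes Python's `fnmatch.fnmatchcase`, whose `*` matches any run of characters; the helper names, the dictionary representation of the flattened document, and the values are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict


def to_prefix_query(leaf_name: str) -> str:
    """Convert a query on a leaf element name into a prefix-match query."""
    return leaf_name + "*"


def select(elements: Dict[str, str], pattern: str) -> Dict[str, str]:
    """Return the flattened elements whose names match the wildcard pattern."""
    return {name: value for name, value in elements.items() if fnmatchcase(name, pattern)}


# Flattened document with inner-first names, as in FIG. 3 (values are placeholders).
flat_elements = {
    "surname-name": "A",
    "given-name": "B",
    "extension-telephone-company": "D",
}

pattern = to_prefix_query("surname")
result = select(flat_elements, pattern)
print(pattern, result)
```

A bare prefix match would also accept any other element name that merely begins with the same characters, so an implementation might prefer to include the delimiter in the pattern; that refinement is an assumption and is not stated in the description.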

[0042] §3: Structured document conversion device and query conversion device

FIG. 5 is an explanatory diagram of the devices. In FIG. 5, (a) shows the text-based structure conversion device, (b) shows the object-based structure conversion device, and (c) shows the query conversion device.

[0043] (1) Text-based structure conversion device

The text-based structure conversion device comprises a structured document holding unit 1 that holds a structured document; a partial region acquisition unit 2 that acquires, from the structured document held in the structured document holding unit 1, the partial region enclosed between the start tag and end tag of a predetermined element name; a structure conversion unit 3 that converts the partial region acquired by the partial region acquisition unit 2 into a single-layer structure; a structured document output unit 4 that replaces the partial region of the structured document with the converted region and outputs the result; and a partial region designation unit 5 (described in detail later) that designates the partial region to be acquired. The device converts the partial region enclosed between the start tag and end tag of a predetermined element name into a single-layer structure.

[0044] The text-based structure conversion device is realized by any computer, such as a personal computer or workstation. The structured document holding unit 1, partial region acquisition unit 2, structure conversion unit 3, structured document output unit 4, and partial region designation unit 5 are each realized by the computer's CPU executing a program.

[0045] The device operates as follows. The structured document to be converted is held in the structured document holding unit 1 in advance. When the partial region acquisition unit 2 receives designation information from the partial region designation unit 5, it acquires the partial region designated by that information from the structured document in the structured document holding unit 1, and the structure conversion unit 3 performs the structure conversion on the acquired partial region.

[0046] The structured document output unit 4 then takes in the partial region data converted by the structure conversion unit 3 and also retrieves the data outside the converted partial region, unchanged, from the structured document holding unit 1. It replaces the partial region of the structured document with the converted region and outputs the result.
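The flow through units 1 to 5 can be sketched end to end as follows. This is a simplified illustration: it parses the document with `xml.etree.ElementTree` instead of scanning tag text, assumes designated regions are not nested inside one another, and uses hypothetical names (`convert_document`, `leaf_paths`) and placeholder content.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, List, Optional, Tuple


def leaf_paths(node: ET.Element, path: List[str]) -> Iterator[Tuple[List[str], Optional[str]]]:
    """Yield (path of element names, text) for every leaf below `node`."""
    if len(node) == 0:
        yield path, node.text
    for child in node:
        yield from leaf_paths(child, path + [child.tag])


def convert_document(document_text: str, region_name: str, delimiter: str = "-") -> str:
    root = ET.fromstring(document_text)              # holding unit 1: the held document
    for region in list(root.iter(region_name)):      # acquisition unit 2, designated by name (unit 5)
        leaves = [pair for child in region for pair in leaf_paths(child, [child.tag])]
        for child in list(region):                   # conversion unit 3: drop the nested layers
            region.remove(child)
        for path, text in leaves:                    # and rebuild the region as one layer
            ET.SubElement(region, delimiter.join(path)).text = text
    # Output unit 4: everything outside the designated regions is emitted unchanged.
    return ET.tostring(root, encoding="unicode")


document = (
    "<directory><title>Staff</title>"
    "<person><name><surname>A</surname><given>B</given></name></person>"
    "</directory>"
)
output = convert_document(document, "person")
print(output)
```

Only the designated `person` region is rewritten; the sibling `title` element passes through untouched, which corresponds to the output unit retrieving the unconverted data as is.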

[0047] (2) Object-based structure conversion device

The object-based structure conversion device comprises an object holding unit 11 that holds objects obtained by expanding a structured document into a tree structure in memory; a subtree acquisition unit 12 that acquires a predetermined subtree from the tree structure; a structure conversion unit 13 that converts the acquired subtree into a single-layer tree; an object output unit 14 that replaces the designated subtree of the tree structure with the converted subtree and outputs the result; and a subtree designation unit 15 (described in detail later) that designates the subtree to be acquired. The device converts a predetermined subtree of the tree structure, obtained by expanding a structured document in memory, into a single-layer tree.

[0048] The object-based structure conversion device is realized by any computer, such as a personal computer or workstation. The object holding unit 11, subtree acquisition unit 12, structure conversion unit 13, object output unit 14, and subtree designation unit 15 are each realized by the computer's CPU executing a program.

[0049] The device operates as follows. Object data is held in the object holding unit 11 in advance. When the subtree acquisition unit 12 receives designation information from the subtree designation unit 15, it acquires the subtree designated by that information from the object data in the object holding unit 11, and the structure conversion unit 13 performs the structure conversion on the acquired subtree.

[0050] The object output unit 14 then takes in the subtree data converted by the structure conversion unit 13 and also retrieves the data outside the converted subtree, unchanged, from the object holding unit 11. It replaces the subtree of the object data with the converted object data and outputs the result.

[0051] (3) Query structure conversion device

The query structure conversion device has a structure conversion rule holding unit 24 that holds structure conversion rules, and a query conversion unit 22 that converts a query requested by a client 21 according to those rules. A query on a structured document is converted according to the structure conversion rules and then passed to a database processing unit 23.

【００５２】この場合のシステム（構造化文書処理シス
テム）は、例えば、クライアント・サーバ−システムで
あり、クライアント２１からサーバへのクエリーの変換
要求に応じて、サーバ（クエリー変換装置に対応）側で
クエリーの変換処理を行う。そして、データベース処理
部２３が、クエリーの変換後のデータに応じて、サーバ
のデータベースからデータを取得し、クライアントへ返
す。The system (structured document processing system) in this case is, for example, a client-server system, in which the server (corresponding to the query conversion device) performs query conversion processing in response to a query conversion request sent from the client 21 to the server. The database processing unit 23 then acquires data from the server's database according to the converted query and returns it to the client.

【００５３】なお、前記クエリー変換部２２、データベ
ース処理部２３、データベースは、それぞれサーバ側の
処理手段であり、例えば、サーバのＣＰＵがプログラム
を実行することにより実現するものである。また、前記
構造変換規則保持部２４は、例えば、サーバのハードデ
ィスク装置で構成する。The query conversion section 22, database processing section 23, and database are processing means on the server side, and are realized, for example, by the CPU of the server executing a program. Further, the structure conversion rule holding unit 24 is constituted by, for example, a hard disk device of a server.

【００５４】クエリーの構造変換装置の処理は次の通り
である。先ず、構造変換規則保持部２４に、予め、クエ
リーの変換を行う際の構造変換規則を格納しておく。こ
の状態でクライアント２１からサーバに対してクエリー
の要求が出されると、サーバのクエリー変換部２２は、
構造変換規則保持部２４を参照し、該規則に従って、ク
エリーの変換を行い、データベース処理部２３に変換後
のデータを渡す。データベース処理部２３は、クエリー
の変換後のデータを受け取ると、そのデータに応じてデ
ータベースからデータを取得し、クライアントへ返す。The processing of the query structure conversion device is as follows. First, the structure conversion rules to be used when converting queries are stored in the structure conversion rule holding unit 24 in advance. When the client 21 then issues a query request to the server, the query conversion unit 22 of the server refers to the structure conversion rule holding unit 24, converts the query according to the rules, and passes the converted data to the database processing unit 23. On receiving the converted query data, the database processing unit 23 acquires data from the database according to it and returns the data to the client.
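
The rule-driven query conversion described above might look like the following sketch. The text does not specify the query language or the rule format here, so this assumes slash-separated path queries and a rule table that maps a path in the original hierarchical document to the corresponding path in the flattened document. `STRUCTURE_RULES`, `convert_query`, and the sample paths are hypothetical names chosen for illustration.

```python
# Hypothetical structure conversion rules (the kind of data unit 24 would hold):
# path in the original hierarchy -> path in the flattened document.
STRUCTURE_RULES = {
    "person/name/first": "person/first.name",
    "person/name/last": "person/last.name",
}


def convert_query(query, rules):
    """Rewrite a path query written against the hierarchical structure."""
    # Try longer rule paths first so the most specific rule wins.
    for old_path in sorted(rules, key=len, reverse=True):
        if query == old_path or query.startswith(old_path + "/"):
            return rules[old_path] + query[len(old_path):]
    return query  # no rule applies: pass the query through unchanged


converted = convert_query("person/name/first", STRUCTURE_RULES)
untouched = convert_query("record/id", STRUCTURE_RULES)
```

With this arrangement the client can keep writing queries against the hierarchy it understands, while the database side only ever sees paths that exist in the flattened data.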

【００５５】§４：属性変換の説明属性変換の説明図を図６に示す。図６において、(a) 図
は属性の変換、(b) 図は一層構造への変換を示す。な
お、この例は、テキストベースの変換例であり、元のテ
キストデータを、図示矢印で示す方向への属性変換をし
ている。§4: Description of attribute conversion FIG. 6 shows an explanatory diagram of attribute conversion. In FIG. 6, (a) shows the conversion of the attribute, and (b) shows the conversion to the one-layer structure. This example is a text-based conversion example in which original text data is subjected to attribute conversion in the direction indicated by the arrow in the figure.

【００５６】この例では、要素“姓”の属性だった“か
な”を、“姓”の下層の要素名にしている。このデータ
をさらに階層を一層に構造変換すると、図６の(a) 図の
ようになる。In this example, “kana”, which was an attribute of the element “surname”, is made an element name in the layer below “surname”. When this data is further converted into a one-layer structure, the result is as shown in FIG. 6(a).

【００５７】また、図６の(a) 図に示す変換後のデータ
を、更に、図１に示す「一層構造への変換」を行った場
合のデータを図６の (b)図に示す。FIG. 6(b) shows the data obtained by further applying the “conversion to a one-layer structure” shown in FIG. 1 to the converted data shown in FIG. 6(a).
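
As a rough illustration of the attribute conversion described in this section, the sketch below turns each attribute into a child element whose name is the attribute name and whose content is the attribute value. It uses Python's standard `xml.etree.ElementTree`. The function name `attributes_to_elements`, the romanized sample data, and the placement of the new child ahead of any existing children are assumptions; FIG. 6 is not reproduced here, so the exact output layout of the original may differ.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(root):
    """Turn every attribute into a child element, modifying the tree in place."""
    for elem in list(root.iter()):  # snapshot, since children are added below
        if not elem.attrib:
            continue
        # Keep the original attribute order and place the new elements
        # ahead of any children the element already has.
        for position, (name, value) in enumerate(list(elem.attrib.items())):
            child = ET.Element(name)
            child.text = value
            elem.insert(position, child)
        elem.attrib.clear()
    return root


doc = ET.fromstring('<person><surname kana="yamada">Yamada</surname></person>')
attributes_to_elements(doc)
converted = ET.tostring(doc, encoding="unicode")
```

Note that the element's own text stays where it was, so an element that had both text and attributes ends up with mixed content. A subsequent one-layer conversion would then need a rule for that text.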

【００５８】§５：部分領域指定部／部分木指定部の説
明部分領域指定部／部分木指定部の説明図を図７に示す。
図７において、(a) 図は部分領域取得部、 (b)図は部分
木取得部を示す。§5: Description of partial area designating unit / partial tree designating unit FIG. 7 is an explanatory diagram of the partial area designating unit / partial tree designating unit.
In FIG. 7, (a) shows a partial area acquisition unit, and (b) shows a partial tree acquisition unit.

【００５９】(1) ：前記図５の(a) 図に示す部分領域指
定部５は、小部分領域取得部３１が前記構造化文書保持
部１から構造化文書中の小部分領域を取得し、保持部３
２が保持する。その後、構造検索部３３が前記保持部３
２と同じ構造をしている領域を検索する。(1): In the partial area designation unit 5 shown in FIG. 5(a), the small partial area acquisition unit 31 acquires a small partial area of the structured document from the structured document holding unit 1, and the holding unit 32 holds it. The structure search unit 33 then searches for areas that have the same structure as the one held in the holding unit 32.

【００６０】この場合、構造検索部３３は、構造化文書
中の小さな部分領域から始まり、同じ要素名を持つデー
タ構造が検索できたか否かを判断し、同じ要素名を持つ
データ構造が検索できた場合、部分領域拡大部３５は、
部分領域を一回り大きくし、更に、構造検索部３３が同
じ要素名を持つデータ構造を検索する。In this case, the structure search unit 33 starts from a small partial area in the structured document and determines whether a data structure with the same element names has been found. If one has been found, the partial area enlarging unit 35 enlarges the partial area by one step, and the structure search unit 33 again searches for a data structure with the same element names.

【００６１】この検索は同じデータ構造が見つからなく
なるまで、これを繰り返す。このようにして決定した繰
り返し出現しているデータ構造を、前記一層構造への構
造変換の対象として、部分領域取得部２へ渡す。This search is repeated until the same data structure cannot be found. The repeatedly appearing data structure determined in this way is passed to the partial area acquisition unit 2 as a target of the structure conversion into the one-layer structure.
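
The search-and-enlarge loop described above can be sketched as follows. This is one possible reading, not the device's actual algorithm: "the same structure" is interpreted as having the same element name with the same sequence of child structures, and "enlarging by one step" as moving to the parent element. The names `signature` and `find_repeated_region` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def signature(elem):
    """Structural fingerprint: the element name plus its children's fingerprints."""
    return (elem.tag, tuple(signature(child) for child in elem))


def find_repeated_region(root, start):
    """Grow outward from `start` while the same structure still appears elsewhere."""
    parents = {child: parent for parent in root.iter() for child in parent}
    everything = list(root.iter())

    def repeats(candidate):
        wanted = signature(candidate)
        return any(
            other is not candidate and signature(other) == wanted
            for other in everything
        )

    best = None
    current = start
    while current is not None and repeats(current):
        best = current                  # largest repeating region found so far
        current = parents.get(current)  # enlarge the region by one level
    return best


doc = ET.fromstring(
    "<list>"
    "<person><name><first>A</first><last>B</last></name></person>"
    "<person><name><first>C</first><last>D</last></name></person>"
    "</list>"
)
start = doc.find("person/name/first")
target = find_repeated_region(doc, start)
```

Starting from `<first>`, the region grows to `<name>` and then `<person>`, and stops at `<list>` because no second `<list>` exists. The last region that still repeated is returned as the conversion target.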

【００６２】なお、前記の例は、部分領域の指定をプロ
グラムの実行により自動的に行う例であるが、このよう
な例に限らず、人手により行うことも可能である。この
場合、部分領域指定部５に、テーブルデータを設定して
おき、このテーブルデータを部分領域取得部２が参照す
ることで、指定された部分領域を取得することも可能で
ある。Although the above example is an example in which the designation of the partial area is automatically performed by executing the program, the present invention is not limited to such an example, and it is also possible to manually designate the partial area. In this case, it is also possible to set the table data in the partial area specifying unit 5 and to obtain the specified partial area by referring to the table data by the partial area obtaining unit 2.

【００６３】(2) ：前記図５の(b) に示す部分木指定部
１５は、小部分木取得部４１が前記オブジェクト保持部
１１からオブジェクトデータの小部分木を取得し、保持
部４２が保持する。その後、構造検索部４３が保持部４
２のデータを検索する。この場合、構造検索部４３は、
オブジェクトデータベース中の小さな部分木から始ま
り、同じ要素名を持つデータ構造が検索できたか否かを
判断し、同じ要素名を持つデータ構造が検索できた場
合、部分領域拡大部４５は部分木を一回り大きくし、更
に、構造検索部４３が同じ要素名を持つデータ構造を検
索する。(2): In the subtree designation unit 15 shown in FIG. 5(b), the small subtree acquisition unit 41 acquires a small subtree of the object data from the object holding unit 11, and the holding unit 42 holds it. The structure search unit 43 then searches the data in the holding unit 42. In this case, the structure search unit 43 starts from a small subtree in the object database and determines whether a data structure with the same element names has been found. If one has been found, the partial area enlarging unit 45 enlarges the subtree by one step, and the structure search unit 43 again searches for a data structure with the same element names.

【００６４】この検索は同じデータ構造が見つからなく
なるまで、これを繰り返す。このようにして検索した繰
り返し出現しているデータ構造を、前記一層構造への構
造変換の対象として、部分木取得部１２へ渡す。なお、
ＤＴＤ（Document Type Definition；文書型定義）やス
キーマから使用されるデータ構造を解析し、構造変換の
対象としてもよい。This search is repeated until the same data structure can no longer be found. The repeatedly appearing data structure found in this way is passed to the subtree acquisition unit 12 as the target of the structure conversion into the one-layer structure. Alternatively, the data structures in use may be analyzed from a DTD (Document Type Definition) or a schema and taken as the target of structure conversion.

【００６５】また、この場合にも、部分木の指定をプロ
グラムの実行により自動的に行う例であるが、このよう
な例に限らず、人手により行うことも可能である。この
場合、部分木指定部１５に、テーブルデータを設定して
おき、このテーブルデータを部分木取得部１２が参照す
ることで、指定された部分木を取得することも可能であ
る。Also in this case, the subtree is specified automatically by executing a program, but the present invention is not limited to such an example, and it is also possible to perform the specification manually. In this case, it is also possible to set the table data in the subtree specifying unit 15 and to obtain the specified subtree by referring to the table data by the subtree obtaining unit 12.

【００６６】前記の説明に対し、次の構成を付記する。The following configuration is added to the above description.

【００６７】（付記１）構造化文書を保持する構造化
文書保持部と、構造化文書から予め定めた要素名の開始
タグと終了タグに挟まれた部分領域を取得する部分領域
取得部と、取得した部分領域を１層構造に変換する構造
変換部と、構造化文書の部分領域を、変換した構造化文
書に置き換えて出力する構造化文書出力部を備え、構造
化文書の予め定めた要素名の開始タグと終了タグに挟ま
れた部分領域を１層構造に変換することを特徴とする構
造化文書変換装置。(Supplementary Note 1) A structured document conversion device comprising: a structured document holding unit that holds a structured document; a partial area acquisition unit that acquires, from the structured document, a partial area sandwiched between the start tag and the end tag of a predetermined element name; a structure conversion unit that converts the acquired partial area into a one-layer structure; and a structured document output unit that replaces the partial area of the structured document with the converted structured document and outputs the result, wherein the partial area sandwiched between the start tag and the end tag of the predetermined element name of the structured document is converted into a one-layer structure.

【００６８】（付記２）前記構造変換部は、部分領域
の各要素を取得する要素取得手段と、各要素について
入れ子になっている要素名を取得する要素名取得手段
と、取得した要素名を結合して新しい要素名を生成する
要素名生成手段と、生成した要素名のタグで各要素を挟
んで構造化文書を生成する構造化文書生成手段を備え、
１層構造にしたときの要素名を、入れ子になっている要
素名を結合して生成することを特徴とする（付記１）記
載の構造化文書変換装置。(Supplementary Note 2) The structured document conversion device according to (Supplementary Note 1), wherein the structure conversion unit comprises: element acquisition means for acquiring each element of the partial area; element name acquisition means for acquiring, for each element, the element names in which it is nested; element name generation means for combining the acquired element names to generate a new element name; and structured document generation means for generating a structured document by enclosing each element in tags of the generated element name, and wherein the element names used in the one-layer structure are generated by combining the nested element names.

【００６９】（付記３）前記部分領域から要素名を入
れ子の内側から外側の順に取得する要素名取得手段と、
取得した順に並べた要素名の間に予め区切りコード挟ん
だ文字列を生成する文字列生成手段を備え、１層構造に
したときの要素名を、入れ子の内側から外側の順に要素
名を並べ、間に予め区切りコードを挟んだものとするこ
とを特徴とする（付記２）記載の構造化文書変換装置。(Supplementary Note 3) The structured document conversion device according to (Supplementary Note 2), comprising: element name acquisition means for acquiring element names from the partial area in order from the inside to the outside of the nesting; and character string generation means for generating a character string in which a predetermined delimiter code is inserted between the element names arranged in the order acquired, wherein each element name in the one-layer structure consists of the element names arranged in order from the inside to the outside of the nesting with the predetermined delimiter code between them.
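
Supplementary Notes 1 to 3 together describe a text-based pipeline: cut out the region between a start tag and an end tag, build each new element name by joining the nested names from the inside outward with a delimiter, and splice the one-layer result back into the document. The sketch below is one way to illustrate that. The `.` delimiter, the function names `joined_name` and `flatten_region`, and the sample document are assumptions, and for brevity it handles only the first matching region, with no attributes on the start tag and no nesting of the same element name inside itself.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

DELIMITER = "."  # assumed delimiter code


def joined_name(names_outer_to_inner):
    """Join nested element names from the innermost to the outermost."""
    return DELIMITER.join(reversed(names_outer_to_inner))


def flatten_region(text, element_name):
    """Flatten the first <element_name>...</element_name> region of `text`."""
    start_tag, end_tag = f"<{element_name}>", f"</{element_name}>"
    begin = text.find(start_tag)
    end = text.find(end_tag, begin)
    if begin == -1 or end == -1:
        return text  # region not present: return the text unchanged
    end += len(end_tag)
    region = ET.fromstring(text[begin:end])

    parts = [start_tag]

    def walk(node, path):
        for child in node:
            names = path + [child.tag]
            if len(child) == 0:  # leaf: wrap its text in the generated name
                name = joined_name(names)
                parts.append(f"<{name}>{escape(child.text or '')}</{name}>")
            else:
                walk(child, names)

    walk(region, [])
    parts.append(end_tag)
    # Everything outside the region is copied through as-is.
    return text[:begin] + "".join(parts) + text[end:]


source = (
    "<record><id>7</id>"
    "<person><name><first>Taro</first><last>Yamada</last></name></person>"
    "</record>"
)
flattened = flatten_region(source, "person")
```

Because the generated names encode the original nesting, the hierarchy can in principle be recovered from the flat form by splitting each name on the delimiter.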

【００７０】（付記４）構造化文書処理システムにお
いて、構造化文書に対するクエリーを変換するクエリー
変換装置であって、構造変換規則を保持する構造変換規
則保持部と、構造変換規則に従ってクエリーを変換する
クエリー変換部を備え、構造変換規則に従って、構造化
文書に対するクエリーを変換してから文書処理に渡すこ
とを特徴とするクエリー変換装置。(Supplementary Note 4) A query conversion device that converts a query on a structured document in a structured document processing system, comprising: a structure conversion rule holding unit that holds structure conversion rules; and a query conversion unit that converts a query according to the structure conversion rules, wherein a query on a structured document is converted according to the structure conversion rules and then passed to document processing.

【００７１】（付記５）属性を持っているタグを検出
する属性付タグ検出部と、属性をその要素の下層の要素
名に、属性値を要素に変換する属性変換部を備え、属性
を持っているタグを検出し、属性をそのタグの下層の要
素に変換することを特徴とする構造化文書変換装置。(Supplementary Note 5) A structured document conversion device comprising: an attributed-tag detection unit that detects a tag having an attribute; and an attribute conversion unit that converts the attribute into an element name in the layer below that element and the attribute value into an element, wherein a tag having an attribute is detected and the attribute is converted into an element in the layer below that tag.

【００７２】（付記６）構造化文書をメモリ上で木構
造に展開したオブジェクトを保持するオブジェクト保持
部と、木構造から予め定めた部分木を取得する部分木取
得部と、取得した部分木を１階層の木に変換する構造変
換部と、木構造の指定部分木を、変換した部分木に置き
換えて出力するオブジェクト出力部を備え、構造化文書
をメモリ上に展開した木構造の予め定めた部分木を１階
層の木に変換することを特徴とする構造化文書変換装
置。(Supplementary Note 6) A structured document conversion device comprising: an object holding unit that holds an object obtained by expanding a structured document into a tree structure in memory; a subtree acquisition unit that acquires a predetermined subtree from the tree structure; a structure conversion unit that converts the acquired subtree into a one-level tree; and an object output unit that replaces the designated subtree of the tree structure with the converted subtree and outputs the result, wherein a predetermined subtree of the tree structure obtained by expanding the structured document in memory is converted into a one-level tree.

【００７３】（付記７）前記構造変換部において、取
得した部分木の各要素を取得する要素取得手段と、各要
素について部分木の根から各要素に対応付けられた節点
への経路にある要素名を取得する要素名取得手段と、取
得した要素名を結合して新しい要素名を生成する要素名
生成手段と、生成した要素名を１階層の木の節点とした
部分木を生成する部分木生成手段を備え、１階層の木に
変換したときの要素名を、部分木の根からの経路にある
要素名を結合して生成したものとすることを特徴とする
（付記６）記載の構造化文書変換装置。(Supplementary Note 7) The structured document conversion device according to (Supplementary Note 6), wherein the structure conversion unit comprises: element acquisition means for acquiring each element of the acquired subtree; element name acquisition means for acquiring, for each element, the element names on the path from the root of the subtree to the node associated with that element; element name generation means for combining the acquired element names to generate a new element name; and subtree generation means for generating a subtree in which the generated element names are the nodes of a one-level tree, and wherein the element names after conversion into the one-level tree are generated by combining the element names on the path from the root of the subtree.

【００７４】（付記８）前記部分木から要素名を部分
木の葉から根の方向に取得する手段と、取得した順に並
べた要素名の間に区切りコードを挟んだ文字列を生成す
る手段を備え、１階層の木としたときの要素名を、部分
木の葉から根の方向に順に並べ、予め定めた区切りコー
ドを挟んだものとすることを特徴とする（付記６）記載
の構造化文書変換装置。(Supplementary Note 8) Means for acquiring an element name from the partial tree in the direction from the leaf to the root of the partial tree, and means for generating a character string in which a delimiter code is interposed between the element names arranged in the acquired order, The structured document conversion device according to (Supplementary Note 6), wherein the element names in the case of a one-level tree are arranged in order from the leaves of the partial tree to the root and sandwich a predetermined delimiter code.

【００７５】（付記９）前記構造化文書から繰り返し
出現しているデータ構造を検出し、検出した部分木を、
前記構造変換対象とすることを特徴とする（付記１）又
は（付記６）記載の構造化文書変換装置。(Supplementary Note 9) The structured document conversion device according to (Supplementary Note 1) or (Supplementary Note 6), wherein a repeatedly appearing data structure is detected in the structured document and the detected subtree is taken as the target of the structure conversion.

【００７６】[0076]

【発明の効果】以上説明したように、本発明によれば次
のような効果がある。As described above, the present invention has the following effects.

【００７７】(1) ：構造化文書の階層を浅くすることに
より、データアクセス効率の改善、動作記憶容量の削減
が期待できる。(1): By reducing the depth of the structured document, improvement in data access efficiency and reduction in operation storage capacity can be expected.

【００７８】(2) ：人が設計する際には、理解しやすい
階層構造を扱え、計算機上では、効率のいいフラットに
近い形でデータを扱える。(2): People can work with an easy-to-understand hierarchical structure at design time, while the computer handles the data in an efficient, nearly flat form.

【００７９】(3) ：請求項１では、部分領域取得部が構
造化文書保持部に保持している構造化文書から、予め定
めた要素名の開始タグと終了タグに挟まれた部分領域を
取得し、構造変換部が前記取得した部分領域を１層構造
に変換する。そして、構造化文書出力部は、構造化文書
の部分領域を、前記変換した構造化文書に置き換えて出
力する。(3): In claim 1, the partial area acquisition unit acquires, from the structured document held in the structured document holding unit, the partial area sandwiched between the start tag and the end tag of a predetermined element name, and the structure conversion unit converts the acquired partial area into a one-layer structure. The structured document output unit then replaces the partial area of the structured document with the converted structured document and outputs the result.

【００８０】このようにして、構造化文書変換装置は、
構造化文書の予め定めた要素名の開始タグと終了タグに
挟まれた部分領域を１層構造に変換して出力する。従っ
て、構造化文書の階層を浅くすることができ、動作記憶
容量の削減を可能にすると共に、データアクセス効率を
改善することができる。In this way, the structured document conversion device converts the partial area sandwiched between the start tag and the end tag of a predetermined element name of the structured document into a one-layer structure and outputs it. The hierarchy of the structured document can therefore be made shallower, which reduces the operation storage capacity required and improves data access efficiency.

【００８１】(4) ：請求項２では、要素取得手段が部分
領域の各要素を取得し、要素名取得手段が各要素につい
て入れ子になっている要素名を取得し、要素名生成手段
が取得した要素名を結合して新しい要素名を生成し、構
造化文書生成手段が取得した要素名を結合して新しい要
素名を生成する。(4): In claim 2, the element acquisition means acquires each element of the partial area, the element name acquisition means acquires the nested element names for each element, the element name generation means combines the acquired element names to generate a new element name, and the structured document generation means combines the acquired element names to generate a new element name.

【００８２】このようにして、構造変換部は１層構造に
したときの要素名を、入れ子になっている要素名を結合
して生成する。従って、構造化文書の階層を浅くするこ
とができ、動作記憶容量の削減を可能にすると共に、デ
ータアクセス効率を改善することができる。In this way, the structure conversion unit generates the element names in the one-layer structure by combining the nested element names. Therefore, the hierarchical level of the structured document can be reduced, and the operation storage capacity can be reduced, and the data access efficiency can be improved.

【００８３】(5) ：請求項３では、要素名取得手段が部
分領域から要素名を入れ子の内側から外側の順に取得
し、文字列生成手段が取得した順に並べた要素名の間に
予め区切りコード挟んだ文字列を生成する。(5): In claim 3, the element name acquisition means acquires element names from the partial area in order from the inside to the outside of the nesting, and the character string generation means generates a character string in which a predetermined delimiter code is inserted between the element names arranged in the order acquired.

【００８４】このようにして、構造変換部は１層構造に
したときの要素名を、入れ子の内側から外側の順に要素
名を並べ、間に予め区切りコードを挟んだものとする。
従って、構造化文書の階層を浅くすることができ、動作
記憶容量の削減を可能にすると共に、データアクセス効
率を改善することができる。In this way, the structure conversion unit arranges the element names in the one-layer structure in the order from the inner side to the outer side of the nest, with a delimiter code interposed therebetween in advance.
Therefore, the hierarchical level of the structured document can be reduced, and the operation storage capacity can be reduced, and the data access efficiency can be improved.

【００８５】(6) 請求項４では、クエリー変換部が構造
変換規則保持部が保持している構造変換規則に従ってク
エリーを変換する。このようにして、クエリー変換装置
は、構造変換規則に従って構造化文書に対するクエリー
を変換してから文書処理に渡す。従って、構造化文書の
階層を浅くすることができ、動作記憶容量の削減を可能
にすると共に、データアクセス効率を改善することがで
きる。(6) In claim 4, the query conversion unit converts the query according to the structure conversion rule held by the structure conversion rule holding unit. In this way, the query conversion device converts the query for the structured document according to the structure conversion rule and then passes the query to the document processing. Therefore, the hierarchical level of the structured document can be reduced, and the operation storage capacity can be reduced, and the data access efficiency can be improved.

【００８６】(7) ：請求項５では、属性付タグ検出部が
属性を持っているタグを検出し、属性変換部が属性をそ
の要素の下層の要素名に、属性値を要素に変換する。こ
のようにして、構造化文書変換装置は属性を持っている
タグを検出し、属性をそのタグの下層の要素に変換す
る。従って、構造化文書の階層を浅くすることができ、
動作記憶容量の削減を可能にすると共に、データアクセ
ス効率を改善することができる。(7): In claim 5, the attributed-tag detection unit detects a tag having an attribute, and the attribute conversion unit converts the attribute into an element name in the layer below that element and the attribute value into an element. In this way, the structured document conversion device detects a tag having an attribute and converts the attribute into an element in the layer below that tag. The hierarchy of the structured document can therefore be made shallower, which reduces the operation storage capacity required and improves data access efficiency.

[Brief description of the drawings]

【図１】本発明の実施の形態における構造化文書の構造
変換説明図（その１）であり、(a) 図はテキストベース
を示す。FIG. 1 is an explanatory diagram (part 1) of a structure conversion of a structured document according to an embodiment of the present invention, and FIG. 1 (a) shows a text base.

【図２】本発明の実施の形態における構造化文書の構造
変換説明図（その２）であり、 (b)図はオブジェクトベ
ース、(c) 図はクエリーの変換を示す。FIG. 2 is an explanatory diagram (part 2) of a structure conversion of a structured document according to the embodiment of the present invention. FIG. 2 (b) shows an object base, and FIG. 2 (c) shows a query conversion.

【図３】本発明の実施の形態における構造化文書の構造
変換説明図（その３）であり、(a) 図はテキストベー
ス、(c) 図はクエリーの変換を示す。FIG. 3 is an explanatory diagram (part 3) of a structure conversion of a structured document according to the embodiment of the present invention; FIG. 3 (a) shows a text base, and FIG. 3 (c) shows a query conversion.

【図４】本発明の実施の形態における構造化文書の構造
変換説明図（その４）であり、(b) 図はオブジェクトベ
ースを示す。FIG. 4 is an explanatory diagram (part 4) of a structure conversion of a structured document according to the embodiment of the present invention, and FIG. 4 (b) shows an object base.

【図５】本発明の実施の形態における構造化文書変換装
置を示した図であり、(a) 図はテキストベースの構造変
換装置、(b) 図はオブジェクトベースの構造変換装置、
(c) 図はクエリーの構造変換装置を示す。FIG. 5 is a diagram showing the structured document conversion device according to the embodiment of the present invention, in which (a) shows the text-based structure conversion device, (b) shows the object-based structure conversion device, and (c) shows the query structure conversion device.

【図６】本発明の実施の形態における属性の変換説明図
であり、(a) 図は属性の変換、(b) 図は一層構造への変
換を示す。FIG. 6 is an explanatory diagram of attribute conversion according to the embodiment of the present invention, in which (a) shows attribute conversion and (b) shows conversion to a one-layer structure.

【図７】本発明の実施の形態における部分領域指定部／
部分木指定部の説明図であり、(a) 図は部分領域指定
部、(b) 図は部分木指定部を示す。FIG. 7 is an explanatory diagram of the partial area designation unit / subtree designation unit according to the embodiment of the present invention, in which (a) shows the partial area designation unit and (b) shows the subtree designation unit.

[Explanation of symbols]

１構造化文書保持部２部分領域取得部３構造変換部４構造化文書出力部５部分領域指定部１１オブジェクト保持部１２部分木取得部１３構造変換部１４オブジェクト出力部１５部分木指定部２１クライアント２２クエリー変換部２３データベース処理部２４構造変換規則保持部２５データベース３１小部分領域取得部３２保持部３３構造検索部３４対象部分領域決定部３５部分領域拡大部４１小部分木取得部４２保持部４３構造検索部４４対象部分木決定部４５部分木拡大部
1 Structured document holding unit
2 Partial area acquisition unit
3 Structure conversion unit
4 Structured document output unit
5 Partial area designation unit
11 Object holding unit
12 Subtree acquisition unit
13 Structure conversion unit
14 Object output unit
15 Subtree designation unit
21 Client
22 Query conversion unit
23 Database processing unit
24 Structure conversion rule holding unit
25 Database
31 Small partial area acquisition unit
32 Holding unit
33 Structure search unit
34 Target partial area determination unit
35 Partial area enlarging unit
41 Small subtree acquisition unit
42 Holding unit
43 Structure search unit
44 Target subtree determination unit
45 Subtree enlarging unit

Claims

[Claims]

1. A structured document conversion device comprising: a structured document holding unit that holds a structured document; a partial area acquisition unit that acquires, from the structured document, a partial area sandwiched between the start tag and the end tag of a predetermined element name; a structure conversion unit that converts the acquired partial area into a one-layer structure; and a structured document output unit that replaces the partial area of the structured document with the converted structured document and outputs the result, wherein the partial area sandwiched between the start tag and the end tag of the predetermined element name of the structured document is converted into a one-layer structure.

2. The structured document conversion device according to claim 1, wherein the structure conversion unit comprises: element acquisition means for acquiring each element of the partial area; element name acquisition means for acquiring, for each element, the element names in which it is nested; element name generation means for combining the acquired element names to generate a new element name; and structured document generation means for generating a structured document by enclosing each element in tags of the generated element name, and wherein the element names used in the one-layer structure are generated by combining the nested element names.

3. The structured document conversion device according to claim 2, comprising: element name acquisition means for acquiring element names from the partial area in order from the inside to the outside of the nesting; and character string generation means for generating a character string in which a predetermined delimiter code is inserted between the element names arranged in the order acquired, wherein each element name in the one-layer structure consists of the element names arranged in order from the inside to the outside of the nesting with the predetermined delimiter code between them.

4. A query conversion device that converts a query on a structured document in a structured document processing system, comprising: a structure conversion rule holding unit that holds structure conversion rules; and a query conversion unit that converts a query according to the structure conversion rules, wherein a query on a structured document is converted according to the structure conversion rules and then passed to document processing.

5. A structured document conversion device comprising: an attributed-tag detection unit that detects a tag having an attribute; and an attribute conversion unit that converts the attribute into an element name in the layer below that element and the attribute value into an element, wherein a tag having an attribute is detected and the attribute is converted into an element in the layer below that tag.