
WO2006004266A1 - Xml processor having function for filtering tree path, method of filtering tree path and recording medium thereof - Google Patents

Info

Publication number
WO2006004266A1
WO2006004266A1 PCT/KR2005/000878 KR2005000878W WO2006004266A1 WO 2006004266 A1 WO2006004266 A1 WO 2006004266A1 KR 2005000878 W KR2005000878 W KR 2005000878W WO 2006004266 A1 WO2006004266 A1 WO 2006004266A1
Authority
WO
WIPO (PCT)
Prior art keywords
tree path
tree
document
information
xml
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/KR2005/000878
Other languages
French (fr)
Inventor
Seong-Kook Shin
Hyok-Sung Choi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020040074822A (KR100580198B1)
Application filed by Samsung Electronics Co Ltd
Priority to JP2007506077A (JP2007531151A)
Priority to EP05789798A (EP1730652A4)
Priority to CN2005800140684A (CN1950816B)
Publication of WO2006004266A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/81 Indexing, e.g. XML tags; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/143 Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G06F40/154 Tree transformation for tree-structured or markup documents, e.g. XSLT, XSL-FO or stylesheets

Definitions

  • the present invention relates to tree path filtering, and more particularly, to an extensible markup language (XML) processor having a tree path filtering function, a tree path filtering method, and a recording medium storing a program to implement the method.
  • XML: extensible markup language
  • An XML processor parses an XML based document and is classified as either a stream-based XML processor or a tree-based XML processor.
  • the stream-based XML processor reads characters one by one in a stream and parses an XML based document sequentially. Since the stream-based XML processor does not store information while parsing an XML document, it does not need a large amount of memory space. However, since the operator must store any necessary information personally, the operator bears a heavy workload. Generally, the stream-based XML processor is used when an XML document has a simple format and a relatively large size, and the memory capacity of the system is small.
  • the tree-based XML processor parses an XML based document as one of a stream, a buffer, and a file, and builds a tree-like structure corresponding to the XML document in a memory. That is, all elements included in the XML document are stored in the memory. These elements include attributes or metadata.
  • an XML document is an XML based MultiPhoto Video (MPV) document as shown in FIG. 1
  • MPV: MultiPhoto Video
  • the tree based XML processor builds a tree structure, as shown in FIG. 2, in a memory.
  • the tree-based XML processor is used.
  • since the MP3 player can reproduce only audio signals, the elements stored in the memory, except the elements related to audio signals, become information that is not used. Accordingly, when the tree-based XML processor is used, the memory capacity can be consumed unnecessarily.
  • the present invention provides an extensible markup language (XML) processor having a function filtering a tree path for an XML based document so that a tree structure can be built by using necessary elements among elements included in the XML based document, a tree path filtering method and a recording medium storing a program to implement the method.
  • a tree path is filtered based on tree path policy information defined in advance and based on the filtered tree path, the tree structure is built such that unnecessary use of the memory capacity can be prevented. Accordingly, even in a system having a smaller memory capacity, a tree-based XML processor can be used.
  • FIG. 1 illustrates an example of an XML based MultiPhoto Video (MPV) document
  • FIG. 2 is a tree view corresponding to the MPV document shown in FIG. 1;
  • FIG. 3 is a functional block diagram of a system including an XML processor according to an embodiment of the present invention.
  • FIG. 4 illustrates an example of a tree view obtained by an XML processor according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of the operations performed by a tree path filtering method according to an embodiment of the present invention.
  • a tree path filtering method for filtering a tree path of an input document, the method comprising: detecting a tree path satisfying a predetermined tree path policy from the document; and building a tree structure corresponding to the document by using the detected tree path.
  • the tree path policy may define an element to be included in the tree structure as tree path information.
  • the tree path information may be described as any one of a character string in which a term and a stem are described between symbols distinguishing elements, a character string which is described in order of the term, the stem and the symbol, a character string which is described in order of the symbol, the term, and the stem, and a character string described in order of the term and the stem.
  • the detecting of the tree path may include: parsing the document in units of tree paths or elements; determining whether or not data identical to the tree path or element obtained by the parsing is included in the tree path information; if the tree path or element obtained by the parsing is included in the tree path information, detecting the tree path or element as satisfying the tree path policy; and if the tree path or element obtained by the parsing is not included in the tree path information, ignoring the tree path or element.
  • the document may be an extensible markup language based document, and the method is operated by a processor parsing the XML based document.
  • a processor having a function for filtering a tree path in an input document, including: a parser parsing the document and filtering a tree path of necessary data among the parsed data; a tree path policy storing unit storing information on a predetermined tree path policy; and an information providing unit referring to information stored in the tree path policy storing unit and if data parsed in the parser satisfies the predetermined tree path policy, providing information indicating that the parsed data is necessary data to the parser.
  • the parser may detect necessary data among the parsed data based on the information provided by the information providing unit, indicating that the parsed data is necessary data, and build a tree structure corresponding to the document by using the detected data.
  • the tree path information may be described as any one of a character string in which a term and a stem are described between symbols distinguishing elements, a character string which is described in order of the term, the stem and the symbol, a character string which is described in order of the symbol, the term, and the stem, and a character string described in order of the term and the stem.
  • the information providing unit may determine that the parsed data satisfies the tree path policy.
  • the information providing unit may include: a comparison unit comparing the parsed data with the tree path information defined in the tree path policy.
  • a computer readable recording medium having embodied thereon a computer program to perform a method for filtering a tree path in an input document, wherein the tree path filtering method may include: detecting in the document a tree path based on a tree path or element satisfying a predetermined tree path policy; and by using the detected tree path, building a tree structure corresponding to the document.
  • a system including an XML processor has an XML processor 300 and a memory 310.
  • the system is an apparatus capable of using an XML based document, such as a computer system and a DVD player.
  • the XML processor 300 parses an XML document being input, filters a tree path of the XML document and builds a tree structure for the input XML document in the memory 310.
  • the XML processor 300 has an XML parser 301, a comparison unit 302, and a tree path policy storing unit 303.
  • An XML document is an XML based document.
  • the XML parser 301 parses the XML document if the XML document is input, and detects XML data in units of tree paths.
  • the XML parser 301 detects '| mpv:still | nmf:Metadata | dc:Properties | dc:creator | seoung-kook shin |' as one tree path, detects '| mpv:still | nmf:Metadata | dc:Properties | dc:title | central Park |' as one tree path, '| mpv:still | nmf:Metadata | dcterms:Properties | dcterms:created | 2004-03-14T... |' as one tree path, and '| mpv:still | mpv:LastURL | STILL001.JPG |' as one tree path.
  • the thus detected XML data is transmitted to the comparison unit 302. If a signal from the comparison unit 302 indicating that the detected XML data is necessary data is input, the XML parser 301 builds a tree structure including the detected XML data in the memory 310. However, if a signal from the comparison unit 302 indicating that the detected XML data is unnecessary data is input, the XML parser 301 ignores the detected XML data and detects XML data (or XML tree path) next to the detected XML data in the XML document and repeats the process described above.
  • the comparison unit 302 compares it with XML data stored in the tree path policy storing unit 303 and determines whether or not the input XML data satisfies the tree path policy. That is, if there is data corresponding to the tree path policy information in the input XML data, the comparison unit 302 determines that the input XML data satisfies the tree path policy, and provides a signal indicating that the detected XML data is necessary data, to the XML parser 301.
  • the comparison unit provides a signal indicating that the input XML data is necessary data, to the XML parser 301.
  • the comparison unit 302 determines that the input XML data does not satisfy the tree path policy and provides a signal indicating that the detected XML data is unnecessary data, to the XML parser 301. For example, if the input XML data is '| mpv:still | nmf:Metadata | dc:Properties | dc:creator | seoung-kook shin |' and tree path policy information is '* | dc:title | *', the comparison unit provides a signal indicating that the input XML data is unnecessary data, to the XML parser 301.
  • the comparison unit 302 can be defined as an information providing unit providing information obtained by determining whether or not the input XML data satisfies the tree path policy.
  • the XML data provided by the XML parser 301 and the XML data stored in the tree path policy storing unit 303 can have a tree path format.
  • tree path policy information stored in the tree path policy storing unit 303 can be defined as tree path information such as at least '* | dc:title | *'.
  • the tree path policy information can be defined as tree path information including a <dc:title> element, such as '| mpv:still | nmf:Metadata | dc:Properties | dc:title | central Park |' and '* | dc:title | *'.
  • the comparison unit 302 performs a comparison to determine whether or not in element information stored in the tree path policy storing unit 303 there is information identical to the parsed XML data, and based on the comparison result, provides a signal indicating whether or not the parsed XML data is necessary, to the XML parser 301.
  • tree path information stored in the tree path policy storing unit 303 is '* | dc:title | *'
  • tree paths, excluding the tree paths including the 'dc:title' element, are regarded as unnecessary tree paths by the comparison unit 302. Accordingly, a tree structure built by the XML parser 301 in the memory 310 is as shown in FIG. 4.
  • Tree path policy information stored in the tree path policy storing unit 303 can be defined based on XML tree path information as shown in Table 1 :
  • tree path policy information stored in the tree path policy storing unit 303 can be defined by considering the function of a system.
  • the tree path policy information can be defined as XML data having tree path information related to audio.
  • tree path policy information including an audio element such as '* | *:audio | *' can be defined.
  • the tree path policy may define an element to be included in the tree to be built as tree path information.
  • the element can include an attribute and can be regarded as one node in the tree structure.
  • the tree path information form can be described as any one of a character string in which a term and a stem are described between symbols distinguishing elements (| term stem |), a character string which is described in order of a term, a stem and the symbol (term stem |), a character string which is described in order of the symbol, a term and a stem (| term stem), and a character string which is described in order of a term and a stem.
  • the memory 310 stores an XML document having a tree structure based on the XML data filtered by the XML parser 301.
  • FIG. 5 is a flowchart of the operations performed by a tree path filtering method according to an exemplary embodiment of the present invention.
  • An XML document is parsed in units of tree paths or elements as described with reference to FIG. 3 in operation 501. It is determined whether or not the XML data parsed in units of tree paths or elements satisfies a tree path policy set in advance in operation 502. That is, if the tree path policy information set in advance includes data (or a character string) identical to XML data for one tree path or element parsed and detected, it is determined that the XML data for the one tree path or element satisfies the tree path policy set in advance.
  • the XML data for one tree path or element satisfies the tree path policy
  • the XML data for the tree path or element is detected as a tree path or element to be used when a tree structure corresponding to the XML data is built in operation 503. If XML data of all tree paths or elements included in the XML document are checked with respect to whether or not the tree path policy is satisfied in operation 504, a tree path for the XML document is built by using XML data for the detected tree path or element in operation 505.
  • the XML data parsed in units of tree paths or elements does not satisfy the tree path policy, the XML data is not detected such that when a tree structure corresponding to the XML document is built, the XML data in units of tree paths or elements is not used in operation 506. That is, the XML data parsed in units of tree paths or elements is processed such that the XML data is ignored when the tree structure is built. Then, operation 504 is performed and the process described above is repeatedly performed.
  • FIG. 5 shows an example in which after all XML data of an input document are checked with respect to whether or not a tree path policy is satisfied, a tree structure corresponding to the input document is built.
  • the tree path filtering method can also be implemented such that whenever XML data satisfying a tree path policy is detected, the tree structure is built.
  • a program for performing a method of filtering a tree path when a tree structure corresponding to an XML document is built in a tree-based XML processor as described above can also be embodied as computer readable codes on a computer readable recording medium.
  • the computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the Internet).
  • the computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. Also, functional programs, codes, and code segments for accomplishing the present invention can be easily construed by programmers skilled in the art to which the present invention pertains.
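
The flowchart operations 501 to 506 listed above, together with the variant that builds the tree whenever matching data is detected, might be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the nested-dictionary tree, the unprefixed element names, and the use of Python's `xml.etree.ElementTree.iterparse` are all hypothetical choices, and the policy is simplified to a set of wanted element names.

```python
import io
import xml.etree.ElementTree as ET


def parse_paths(xml_text):
    """Operation 501: yield each root-to-leaf tree path as a tuple."""
    stack = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            continue
        if len(elem) == 0:  # leaf element: emit its path plus its text
            yield tuple(stack) + ((elem.text or "").strip(),)
        stack.pop()
        elem.clear()  # drop data that has already been reported


def satisfies(path, wanted):
    """Operation 502: a path satisfies the policy if it names a wanted element."""
    return any(segment in wanted for segment in path[:-1])


def add_path(tree, path):
    """Insert one tree path into a nested-dict tree."""
    node = tree
    for segment in path:
        node = node.setdefault(segment, {})


def build_after_checking_all(xml_text, wanted):
    """FIG. 5 order: check every path first, then build the tree."""
    detected = []
    for path in parse_paths(xml_text):  # the loop ends when operation 504 finds no more data
        if satisfies(path, wanted):
            detected.append(path)  # operation 503
        # paths that fail the check are ignored (operation 506)
    tree = {}
    for path in detected:  # operation 505
        add_path(tree, path)
    return tree


def build_incrementally(xml_text, wanted):
    """Variant: extend the tree as soon as a matching path is detected."""
    tree = {}
    for path in parse_paths(xml_text):
        if satisfies(path, wanted):
            add_path(tree, path)
    return tree


DOC = (
    "<still><Metadata><Properties>"
    "<creator>seoung-kook shin</creator><title>central Park</title>"
    "</Properties></Metadata><LastURL>STILL001.JPG</LastURL></still>"
)
```

Both functions produce the same tree for the same input; the incremental variant simply avoids holding the intermediate list of detected paths.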

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Document Processing Apparatus (AREA)

Abstract

An extensible markup language (XML) processor having a function for filtering a tree path for an XML based document so that a tree structure can be built by using necessary elements among elements included in the XML based document, a tree path filtering method and a recording medium storing a program to implement the method are provided. The method includes: detecting a tree path satisfying a predetermined tree path policy from a document; and building a tree structure corresponding to the document by using the detected tree path.

Description

XML PROCESSOR HAVING FUNCTION FOR FILTERING TREE PATH, METHOD OF FILTERING TREE PATH AND RECORDING MEDIUM THEREOF
Technical Field
[1] The present invention relates to tree path filtering, and more particularly, to an extensible markup language (XML) processor having a tree path filtering function, a tree path filtering method, and a recording medium storing a program to implement the method.
Background Art
[2] An XML processor parses an XML based document and is classified as either a stream-based XML processor or a tree-based XML processor. The stream-based XML processor reads characters one by one in a stream and parses an XML based document sequentially. Since the stream-based XML processor does not store information while parsing an XML document, it does not need a large amount of memory space. However, since the operator must store any necessary information personally, the operator bears a heavy workload. Generally, the stream-based XML processor is used when an XML document has a simple format and a relatively large size, and the memory capacity of the system is small.
[3] The tree-based XML processor parses an XML based document as one of a stream, a buffer, and a file, and builds a tree-like structure corresponding to the XML document in a memory. That is, all elements included in the XML document are stored in the memory. These elements include attributes or metadata.
[4] For example, if an XML document is an XML based MultiPhoto Video (MPV) document as shown in FIG. 1, the tree-based XML processor builds a tree structure, as shown in FIG. 2, in a memory. Generally, the tree-based XML processor is used when an XML document is very complicated, has a relatively small size, and has internal cross references, and the memory capacity of the system is large.
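
The contrast between the two processor types described in paragraphs [2] to [4] can be illustrated with a short Python sketch. It is an illustration only: the sample document, the function names, and the use of the standard library's `xml.etree.ElementTree` are assumptions rather than part of the disclosure, and `iterparse` serves merely as a stand-in for a stream-based processor.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, far simpler than the MPV file of FIG. 1.
DOC = (
    "<album>"
    "<still><title>Central Park</title></still>"
    "<audio><title>Track 1</title></audio>"
    "</album>"
)


def titles_streaming(xml_text):
    """Stream style: handle each element as it closes, then discard it."""
    titles = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "title":
            titles.append(elem.text)  # the caller must keep what it needs
        elem.clear()  # free the element's content once it has been seen
    return titles


def titles_tree(xml_text):
    """Tree style: the whole document is held in memory before any query."""
    root = ET.fromstring(xml_text)
    return [title.text for title in root.iter("title")]
```

Both functions return the same titles. The difference lies in what stays in memory: the streaming version keeps only the list it builds, whereas the tree version keeps every element of the document.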
Disclosure of Invention
Technical Problem
[5] In the tree-based XML processor, since all elements are stored in a memory, unnecessary elements can be stored in the memory. For example, in case of an MP3 player, if the MP3 player can reproduce only audio signals but an input XML document includes a 'still' element in addition to an 'audio' element, the 'still' element is also stored in the memory disposed in the MP3 player.
[6] However, as described above, since the MP3 player can reproduce only audio signals, the elements stored in the memory, except the elements related to audio signals, become information that is not used. Accordingly, when the tree-based XML processor is used, the memory capacity can be consumed unnecessarily.
Technical Solution
[7] The present invention provides an extensible markup language (XML) processor having a function for filtering a tree path for an XML based document so that a tree structure can be built by using only the necessary elements among the elements included in the XML based document, a tree path filtering method, and a recording medium storing a program to implement the method.
Advantageous Effects
[8] In the present invention as described above, when a tree structure corresponding to an XML document is built in a tree-based XML processor, a tree path is filtered based on tree path policy information defined in advance and based on the filtered tree path, the tree structure is built such that unnecessary use of the memory capacity can be prevented. Accordingly, even in a system having a smaller memory capacity, a tree-based XML processor can be used.
Description of Drawings
[9] FIG. 1 illustrates an example of an XML based MultiPhoto Video (MPV) document;
[10] FIG. 2 is a tree view corresponding to the MPV document shown in FIG. 1;
[11] FIG. 3 is a functional block diagram of a system including an XML processor according to an embodiment of the present invention;
[12] FIG. 4 illustrates an example of a tree view obtained by an XML processor according to an embodiment of the present invention; and
[13] FIG. 5 is a flowchart of the operations performed by a tree path filtering method according to an embodiment of the present invention.
Best Mode
[14] According to an aspect of the present invention, there is provided a tree path filtering method for filtering a tree path of an input document, the method comprising: detecting a tree path satisfying a predetermined tree path policy from the document; and building a tree structure corresponding to the document by using the detected tree path.
[15] The tree path policy may define an element to be included in the tree structure as tree path information.
[16] The tree path information may be described as any one of a character string in which a term and a stem are described between symbols distinguishing elements, a character string which is described in order of the term, the stem and the symbol, a character string which is described in order of the symbol, the term, and the stem, and a character string described in order of the term and the stem.
[17] The detecting of the tree path may include: parsing the document in units of tree paths or elements; determining whether or not data identical to the tree path or element obtained by the parsing is included in the tree path information; if the tree path or element obtained by the parsing is included in the tree path information, detecting the tree path or element as satisfying the tree path policy; and if the tree path or element obtained by the parsing is not included in the tree path information, ignoring the tree path or element.
[18] The document may be an extensible markup language based document, and the method is operated by a processor parsing the XML based document.
[19] According to another aspect of the present invention, there is provided a processor having a function for filtering a tree path in an input document, including: a parser parsing the document and filtering a tree path of necessary data among the parsed data; a tree path policy storing unit storing information on a predetermined tree path policy; and an information providing unit referring to information stored in the tree path policy storing unit and if data parsed in the parser satisfies the predetermined tree path policy, providing information indicating that the parsed data is necessary data to the parser.
[20] The parser may detect necessary data among the parsed data based on the information provided by the information providing unit, indicating that the parsed data is necessary data, and build a tree structure corresponding to the document by using the detected data.
[21] The tree path information may be described as any one of a character string in which a term and a stem are described between symbols distinguishing elements, a character string which is described in order of the term, the stem and the symbol, a character string which is described in order of the symbol, the term, and the stem, and a character string described in order of the term and the stem.
[22] If the parsed data is included in the tree path information defined in the tree path policy, the information providing unit may determine that the parsed data satisfies the tree path policy.
[23] The information providing unit may include: a comparison unit comparing the parsed data with the tree path information defined in the tree path policy.
[24] According to still another aspect of the present invention, there is provided a computer readable recording medium having embodied thereon a computer program to perform a method for filtering a tree path in an input document, wherein the tree path filtering method may include: detecting in the document a tree path based on a tree path or element satisfying a predetermined tree path policy; and by using the detected tree path, building a tree structure corresponding to the document.
Mode for Invention
[25] The present invention will now be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown.
[26] Referring to FIG. 3, a system including an XML processor according to an exemplary embodiment of the present invention has an XML processor 300 and a memory 310. The system is an apparatus capable of using an XML based document, such as a computer system and a DVD player.
[27] The XML processor 300 parses an XML document being input, filters a tree path of the XML document and builds a tree structure for the input XML document in the memory 310. For this, the XML processor 300 has an XML parser 301, a comparison unit 302, and a tree path policy storing unit 303.
[28] An XML document is an XML based document. The XML parser 301 parses the XML document if the XML document is input, and detects XML data in units of tree paths. For example, when XML data is detected in units of tree paths in FIG. 2, the XML parser 301 detects '| mpv:still | nmf:Metadata | dc:Properties | dc:creator | seoung-kook shin |' as one tree path, detects '| mpv:still | nmf:Metadata | dc:Properties | dc:title | central Park |' as one tree path, '| mpv:still | nmf:Metadata | dcterms:Properties | dcterms:created | 2004-03-14T... |' as one tree path, and '| mpv:still | mpv:LastURL | STILL001.JPG |' as one tree path.
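
Detection in units of tree paths, as described in paragraph [28], can be sketched as a walk that emits one root-to-leaf path per leaf element. The sketch is an assumption-laden illustration: the function names are hypothetical, the sample uses unprefixed element names instead of the `mpv:`, `nmf:` and `dc:` prefixes to avoid namespace declarations, and it walks an already parsed tree purely to show the shape of the output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free stand-in for a fragment of the MPV document.
SAMPLE = (
    "<still><Metadata><Properties>"
    "<creator>seoung-kook shin</creator><title>central Park</title>"
    "</Properties></Metadata><LastURL>STILL001.JPG</LastURL></still>"
)


def tree_paths(elem, prefix=()):
    """Yield one tuple per leaf: the element names from the root, then the text."""
    path = prefix + (elem.tag,)
    if len(elem) == 0:  # a leaf element closes a tree path
        yield path + ((elem.text or "").strip(),)
        return
    for child in elem:
        yield from tree_paths(child, path)


def format_path(path):
    """Render a path in the '| a | b | c |' notation used in the description."""
    return "| " + " | ".join(path) + " |"
```

Calling `format_path` on each result of `tree_paths(ET.fromstring(SAMPLE))` gives three strings, one per leaf, in document order.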
[29] The thus detected XML data is transmitted to the comparison unit 302. If a signal from the comparison unit 302 indicating that the detected XML data is necessary data is input, the XML parser 301 builds a tree structure including the detected XML data in the memory 310. However, if a signal from the comparison unit 302 indicating that the detected XML data is unnecessary data is input, the XML parser 301 ignores the detected XML data and detects XML data (or XML tree path) next to the detected XML data in the XML document and repeats the process described above.
[30] If XML data is input from the XML parser 301, the comparison unit 302 compares it with XML data stored in the tree path policy storing unit 303 and determines whether or not the input XML data satisfies the tree path policy. That is, if there is data corresponding to the tree path policy information in the input XML data, the comparison unit 302 determines that the input XML data satisfies the tree path policy, and provides a signal indicating that the detected XML data is necessary data, to the XML parser 301. For example, if the input XML data is '| mpv:still | nmf:Metadata | dc:Properties | dc:title | central Park |' and tree path policy information is '* | dc:title | *', the comparison unit provides a signal indicating that the input XML data is necessary data, to the XML parser 301.
[31] However, if there is no data corresponding to the tree path policy information in the input XML data, the comparison unit 302 determines that the input XML data does not satisfy the tree path policy and provides a signal indicating that the detected XML data is unnecessary data, to the XML parser 301. For example, if the input XML data is '| mpv:still | nmf:Metadata | dc:Properties | dc:creator | seoung-kook shin |' and tree path policy information is '* | dc:title | *', the comparison unit provides a signal indicating that the input XML data is unnecessary data, to the XML parser 301.
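
The decision made by the comparison unit 302 in paragraphs [30] and [31] might look like the following sketch. The disclosure does not define the exact matching rules, so the sketch assumes that a lone `*` segment stands for any run of zero or more path segments and that other segments are compared with shell-style wildcards; the function names are hypothetical.

```python
from fnmatch import fnmatchcase


def segments(text):
    """Split '| a | b |' or '* | a | *' into a list of trimmed segments."""
    return [part.strip() for part in text.strip().strip("|").split("|")]


def _match(pattern, path):
    """Match a list of policy segments against a list of path segments."""
    if not pattern:
        return not path
    head, rest = pattern[0], pattern[1:]
    if head == "*":  # assumed meaning: zero or more whole segments
        return any(_match(rest, path[i:]) for i in range(len(path) + 1))
    return bool(path) and fnmatchcase(path[0], head) and _match(rest, path[1:])


def is_necessary(xml_data, policy):
    """Return True if the tree path satisfies the tree path policy."""
    return _match(segments(policy), segments(xml_data))
```

With the policy `'* | dc:title | *'`, the `dc:title` path from paragraph [30] is reported as necessary and the `dc:creator` path from paragraph [31] as unnecessary.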
[32] Accordingly, the comparison unit 302 can be defined as an information providing unit providing information obtained by determining whether or not the input XML data satisfies the tree path policy.
[33] Thus, the XML data provided by the XML parser 301 and the XML data stored in the tree path policy storing unit 303 can have a tree path format. For example, if an XML document is an MPV file as shown in FIG. 1 and only <dc:title> metadata is needed, tree path policy information stored in the tree path policy storing unit 303 can be defined as tree path information such as at least '* | dc:title | *'. The tree path policy information can be defined as tree path information including a <dc:title> element, such as '| mpv:still | nmf:Metadata | dc:Properties | dc:title | central Park |' and '* | dc:title | *'.
[34] When the XML parser 301 parses the XML data in units of elements and provides the parsed data to the comparison unit 302, XML data provided by the XML parser 301 does not have a tree path format, but the tree path policy storing unit 303 has a tree path format as described above. In this case, the comparison unit 302 performs a comparison to determine whether or not in element information stored in the tree path policy storing unit 303 there is information identical to the parsed XML data, and based on the comparison result, provides a signal indicating whether or not the parsed XML data is necessary, to the XML parser 301.
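
The element-unit comparison of paragraph [34], in which a single element is checked against policy information kept in tree path format, can be sketched as below. The helper names are hypothetical, and the sketch assumes that "identical" means an exact, case-sensitive match on the element name.

```python
def policy_elements(policies):
    """Collect the concrete element names mentioned in tree path policies."""
    names = set()
    for policy in policies:
        for segment in policy.strip().strip("|").split("|"):
            segment = segment.strip()
            if segment and segment != "*":  # wildcards carry no element name
                names.add(segment)
    return names


def element_is_necessary(element_name, policies):
    """Return True if the parsed element appears in the policy information."""
    return element_name in policy_elements(policies)
```

For the policy `'* | dc:title | *'`, only the element name `dc:title` is reported as necessary.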
[35] If the tree path information stored in the tree path policy storing unit 303 is '* | dc:title | *', tree paths, excluding the tree paths including the 'dc:title' element, are regarded as unnecessary tree paths by the comparison unit 302. Accordingly, a tree structure built by the XML parser 301 in the memory 310 is as shown in FIG. 4.
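
The effect described in paragraph [35] can be sketched by building a tree from only the paths that contain `dc:title`. The nested-dictionary representation and the function names are assumptions made for illustration; the result is not a reproduction of FIG. 4, which is not available in this text.

```python
# The four tree paths quoted in paragraph [28], as tuples of segments.
PATHS = [
    ("mpv:still", "nmf:Metadata", "dc:Properties", "dc:creator", "seoung-kook shin"),
    ("mpv:still", "nmf:Metadata", "dc:Properties", "dc:title", "central Park"),
    ("mpv:still", "nmf:Metadata", "dcterms:Properties", "dcterms:created", "2004-03-14T..."),
    ("mpv:still", "mpv:LastURL", "STILL001.JPG"),
]


def filter_paths(paths, element):
    """Keep only the tree paths that include the given element."""
    return [path for path in paths if element in path]


def build_tree(paths):
    """Merge tree paths into a nested dict, sharing common prefixes."""
    root = {}
    for path in paths:
        node = root
        for segment in path:
            node = node.setdefault(segment, {})
    return root
```

Comparing `build_tree(PATHS)` with `build_tree(filter_paths(PATHS, "dc:title"))` shows the saving: the filtered tree holds a single branch instead of four.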
[36] Tree path policy information stored in the tree path policy storing unit 303 can be defined based on XML tree path information as shown in Table 1 :
[37] Table 1
[38] (Table 1 is reproduced as images imgf000006_0001 and imgf000007_0001 in the original publication.)
[39] Also, tree path policy information stored in the tree path policy storing unit 303 can be defined by considering the function of a system. For example, where the system is an MP3 player, the tree path policy information can be defined as XML data having tree path information related to audio. For example, tree path policy information including an audio element, such as '*|*:audio|*', can be defined.
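A policy chosen according to the function of the system can be sketched as below. The device table `DEVICE_POLICIES`, the function `has_matching_element` and the element name 'mpv:audio' are hypothetical and serve only as illustration. The sketch assumes that a policy of the form '*|X|*' is satisfied when any segment of the path matches X, where X may itself contain a '*' (for example, to accept any namespace prefix). Segment matching uses `fnmatchcase` from the Python standard library.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from system type to its tree path policies.
DEVICE_POLICIES = {
    "mp3_player": ["*|*:audio|*"],
}


def has_matching_element(tree_path, policy):
    """Return True if every non-wildcard policy segment matches some path segment.

    Assumption: segments such as '*:audio' are shell-style patterns, so
    '*:audio' accepts 'mpv:audio' as well as any other prefix.
    """
    segments = [s.strip() for s in tree_path.split("|") if s.strip()]
    wanted = [s.strip() for s in policy.split("|")
              if s.strip() and s.strip() != "*"]
    return all(any(fnmatchcase(seg, w) for seg in segments) for w in wanted)


audio_policy = DEVICE_POLICIES["mp3_player"][0]
audio_path = "|mpv:album|mpv:audio|song.mp3|"   # hypothetical path
still_path = "|mpv:still|nmf:Metadata|dc:Properties|dc:title|central Park|"
kept = [p for p in (audio_path, still_path)
        if has_matching_element(p, audio_policy)]
```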
[40] Thus, the tree path policy may define an element to be included in the tree to be built as tree path information. The element can include an attribute and can be regarded as one node in the tree structure. The tree path information can be described in any one of the following forms: a character string in which a term and a stem are described between symbols distinguishing elements (|term stem|), a character string which is described in order of a term, a stem and the symbol (term stem|), a character string which is described in order of the symbol, a term and a stem (|term stem), and a character string which is described in order of a term and a stem (term stem).
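The four character string forms can be reduced to a common representation, as in the following sketch. The function name `parse_tree_path_item` is hypothetical, and the sketch assumes that the term is the first whitespace-delimited token and the stem is the remainder; the description does not define the two parts more precisely.

```python
def parse_tree_path_item(text):
    """Normalize one tree path item to a (term, stem) pair.

    Accepts '|term stem|', 'term stem|', '|term stem' and 'term stem'.
    Assumption: the term is the first token and the stem is the rest.
    """
    body = text.strip().strip("|").strip()
    term, _, stem = body.partition(" ")
    return term, stem.strip()


forms = [
    "|dc:title central Park|",
    "dc:title central Park|",
    "|dc:title central Park",
    "dc:title central Park",
]
normalized = {parse_tree_path_item(f) for f in forms}
```

All four forms normalize to the same pair, so a comparison unit written this way would not need to know which form a given policy entry uses.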
[41] The memory 310 stores an XML document having a tree structure based on the XML data filtered by the XML parser 301.
[42] FIG. 5 is a flowchart of the operations performed by a tree path filtering method according to an exemplary embodiment of the present invention. An XML document is parsed in units of tree paths or elements as described with reference to FIG. 3 in operation 501. It is determined whether or not the XML data parsed in units of tree paths or elements satisfies a tree path policy set in advance in operation 502. That is, if the tree path policy information set in advance includes data (or a character string) identical to XML data for one tree path or element parsed and detected, it is determined that the XML data for the one tree path or element satisfies the tree path policy set in advance.
[43] If the XML data for one tree path or element satisfies the tree path policy, the XML data for the tree path or element is detected as a tree path or element to be used when a tree structure corresponding to the XML data is built in operation 503. If XML data of all tree paths or elements included in the XML document are checked with respect to whether or not the tree path policy is satisfied in operation 504, a tree path for the XML document is built by using XML data for the detected tree path or element in operation 505.
[44] However, if among tree paths or elements included in the XML document there are tree paths or elements not checked with respect to whether or not the tree path policy is satisfied, processing returns to operation 502 and the process described above is repeatedly performed for a tree path or element parsed next.
[45] If the XML data parsed in units of tree paths or elements does not satisfy the tree path policy, the XML data is not detected, so that the XML data in units of tree paths or elements is not used when a tree structure corresponding to the XML document is built, in operation 506. That is, the XML data parsed in units of tree paths or elements is processed such that the XML data is ignored when the tree structure is built. Then, operation 504 is performed and the process described above is repeatedly performed.
[46] FIG. 5 shows an example in which after all XML data of an input document are checked with respect to whether or not a tree path policy is satisfied, a tree structure corresponding to the input document is built. However, the tree path filtering method can also be implemented such that whenever XML data satisfying a tree path policy is detected, the tree structure is built.
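The flow of FIG. 5 can be sketched as follows, with operation numbers noted in comments. This is an illustration under stated assumptions rather than the embodiment itself: the input is assumed to be a list of already parsed tree path strings, the policy is assumed to have the form '*|element|*', the tree is represented as nested dictionaries, and the names `build_tree` and `filter_and_build` are hypothetical.

```python
def build_tree(paths):
    """Operation 505: build a nested-dict tree from the detected tree paths."""
    root = {}
    for path in paths:
        node = root
        for seg in (s.strip() for s in path.split("|")):
            if seg:
                node = node.setdefault(seg, {})
    return root


def filter_and_build(parsed_paths, policy_element):
    """Operations 502 to 506 for a policy of the form '*|policy_element|*'."""
    detected = []
    for path in parsed_paths:                  # repeated until operation 504 is satisfied
        segments = [s.strip() for s in path.split("|") if s.strip()]
        if policy_element in segments:         # operation 502
            detected.append(path)              # operation 503
        # otherwise the path is ignored (operation 506)
    return build_tree(detected)                # operation 505


parsed = [
    "|mpv:still|nmf:Metadata|dc:Properties|dc:creator|seoung-kook shin|",
    "|mpv:still|nmf:Metadata|dc:Properties|dc:title|central Park|",
]
tree = filter_and_build(parsed, "dc:title")
```

For the alternative in which the tree structure is built whenever satisfying XML data is detected, the same node insertion loop could be run inside the filtering loop instead of after it; the resulting tree is the same.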
[47] A program for performing a method of filtering a tree path when a tree structure corresponding to an XML document is built in a tree-based XML processor as described above can also be embodied as computer readable codes on a computer readable recording medium. The computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the Internet).
[48] The computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. Also, functional programs, codes, and code segments for accomplishing the present invention can be easily construed by programmers skilled in the art to which the present invention pertains.
[49] While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.

Claims

[1] 1. A method of filtering a tree path of an input document, the method comprising: detecting a tree path satisfying a predetermined tree path policy from the document; and building a tree structure corresponding to the document by using the detected tree path.
[2] 2. The method of claim 1, wherein the tree path policy includes information on an element to be included in the tree structure as tree path information.
[3] 3. The method of claim 2, wherein the tree path information is described as any one of a character string in which a term and a stem are described between symbols distinguishing elements, a character string which is described in order of the term, the stem and the symbol, a character string which is described in order of the symbol, the term, and the stem, and a character string described in order of the term and the stem.
[4] 4. The method of claim 2, wherein the detecting of the tree path comprises: parsing the document in units of tree paths or elements; determining whether or not data identical to the tree path or element obtained by the parsing is included in the tree path information; if the tree path or element obtained by the parsing is included in the tree path information, detecting the tree path or element as satisfying the tree path policy; and if the tree path or element obtained by the parsing is not included in the tree path information, ignoring the tree path or element.
[5] 5. The method of claim 1, wherein the document is an extensible markup language (XML)-based document, and the method is operated by a processor parsing the XML-based document.
[6] 6. A processor having a function for filtering a tree path in an input document, comprising: a parser parsing the document and filtering a tree path of necessary data among the parsed data; a tree path policy storing unit storing information on a predetermined tree path policy; and an information providing unit referring to information stored in the tree path policy storing unit and if data parsed in the parser satisfies the predetermined tree path policy, providing information indicating that the parsed data is necessary data to the parser.
[7] 7. The processor of claim 6, wherein the parser parses the document in units of tree paths or elements.
[8] 8. The processor of claim 6, wherein the parser detects necessary data among the parsed data based on the information provided by the information providing unit, indicating that the parsed data is necessary data, and builds a tree structure corresponding to the document by using the detected data.
[9] 9. The processor of claim 8, wherein the information on the predetermined tree path policy defines an element to be included in the tree structure as tree path information.
[10] 10. The processor of claim 9, wherein the tree path information is described as any one of a character string in which a term and a stem are described between symbols distinguishing elements, a character string which is described in order of the term, the stem and the symbol, a character string which is described in order of the symbol, the term, and the stem, and a character string described in order of the term and the stem.
[11] 11. The processor of claim 9, wherein if the parsed data is included in the tree path information defined in the tree path policy, the information providing unit determines that the parsed data satisfies the tree path policy.
[12] 12. The processor of claim 11, wherein the information providing unit comprises a comparison unit comparing the parsed data with the tree path information defined in the tree path policy.
[13] 13. The processor of claim 6, wherein the document is an XML-based document.
[14] 14. A computer readable recording medium having embodied thereon a computer program to perform a method of filtering a tree path in an input document, the method comprising: detecting a tree path satisfying a predetermined tree path policy from the document; and building a tree structure corresponding to the document by using the detected tree path.
PCT/KR2005/000878 2004-04-02 2005-03-25 Xml processor having function for filtering tree path, method of filtering tree path and recording medium thereof Ceased WO2006004266A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2007506077A JP2007531151A (en) 2004-04-02 2005-03-25 XML processor having tree path filtering function, tree path filtering method, and recording medium storing program for performing the method
EP05789798A EP1730652A4 (en) 2004-04-02 2005-03-25 XML PROCESSOR COMPRISING A TREE PATH FILTERING FUNCTION, TREE PATH FILTERING METHOD AND RECORDING MEDIUM THEREOF
CN2005800140684A CN1950816B (en) 2004-04-02 2005-03-25 Processor having function of filtering tree path and method of filtering tree path

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US55854904P 2004-04-02 2004-04-02
US60/558,549 2004-04-02
KR1020040074822A KR100580198B1 (en) 2004-04-02 2004-09-18 XML processor having function for filtering tree path, method for filtering tree path and recording medium storing a program to implement thereof
KR10-2004-0074822 2004-09-18

Publications (1)

Publication Number Publication Date
WO2006004266A1 true WO2006004266A1 (en) 2006-01-12

Family

ID=35783046

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2005/000878 Ceased WO2006004266A1 (en) 2004-04-02 2005-03-25 Xml processor having function for filtering tree path, method of filtering tree path and recording medium thereof

Country Status (3)

Country Link
EP (1) EP1730652A4 (en)
JP (1) JP2007531151A (en)
WO (1) WO2006004266A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929100A (en) * 2019-10-23 2020-03-27 东软集团股份有限公司 Method and device for acquiring value taking path, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010045025A (en) * 1999-11-02 2001-06-05 최한석 A logical structure information extractor for xml documents
KR20020023048A (en) * 2000-09-22 2002-03-28 구자홍 Method for layout of Documents using Extensible Markup Language and system for the same
WO2002031978A2 (en) * 2000-10-10 2002-04-18 Koninklijke Philips Electronics N.V. Programmable remote control device
KR20030083904A (en) * 2002-04-23 2003-11-01 엘지전자 주식회사 Processing method for structure information of xml document

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9928210D0 (en) * 1999-11-29 2000-01-26 Medical Data Service Gmbh Method
JP3984129B2 (en) * 2001-09-10 2007-10-03 富士通株式会社 Structured document processing system
KR100484138B1 (en) * 2002-05-08 2005-04-18 삼성전자주식회사 XML indexing method for regular path expression queries in relational database and data structure thereof.
US20040010752A1 (en) * 2002-07-09 2004-01-15 Lucent Technologies Inc. System and method for filtering XML documents with XPath expressions

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP1730652A4 *

Also Published As

Publication number Publication date
EP1730652A4 (en) 2009-11-11
EP1730652A1 (en) 2006-12-13
JP2007531151A (en) 2007-11-01


Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200580014068.4

Country of ref document: CN

AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2005789798

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2007506077

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Ref document number: DE

WWP Wipo information: published in national office

Ref document number: 2005789798

Country of ref document: EP