WO2006051956A1 - Server device and search method - Google Patents
- Publication number
- WO2006051956A1 (application PCT/JP2005/020881)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- document
- definition file
- unit
- user
- vocabulary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/106—Display of layout of documents; Previewing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/131—Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/14—Tree-structured documents
- G06F40/143—Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
- G06F40/154—Tree transformation for tree-structured or markup documents, e.g. XSLT, XSL-FO or stylesheets
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
Definitions
- the present invention relates to a processing technology for a document described in XML, and particularly relates to a server device and a search method for searching a definition file describing a processing method for a document described in XML.
- XML is attracting attention as a format suitable for sharing data with others via a network, and applications for creating, displaying, and editing XML documents have been developed (see, for example, Patent Document 1).
- An XML document is created based on a vocabulary (tag set) defined by a document type definition or the like.
- Patent Document 1 Japanese Patent Laid-Open No. 2001-290804
- the present invention has been made in view of such circumstances, and an object thereof is to provide a technique for supporting creation of a new vocabulary.
- This server device includes: a definition file holding unit that holds definition files, each describing how to process elements contained in a document written in a markup language; a database that stores information about the definition files; a search request receiving unit that receives a search request including information indicating the function of a definition file; a search unit that searches the database based on the information indicating the function; and an answer unit that presents the search results produced by the search unit.
- The answer unit may present a combination of definition files that scores highly in the search.
- The server device may further include a transmission unit that receives an acquisition request for a definition file presented by the answer unit and transmits the definition file to the requester.
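The search by function and the presentation of high-scoring combinations can be sketched roughly as follows. This is only an illustration under assumptions: the record layout (`DefinitionFileInfo`), the keyword-coverage score, the file names, and the limit on combination size are all hypothetical and are not taken from the source, which does not specify how scores are computed.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class DefinitionFileInfo:
    """Database record describing one definition file (hypothetical schema)."""
    name: str
    functions: frozenset  # keywords describing what the definition file can do


def score(requested, files):
    """Assumed score: fraction of requested function keywords covered by the files."""
    covered = set()
    for f in files:
        covered |= f.functions
    return len(requested & covered) / len(requested)


def search(database, requested, max_combo=2):
    """Return (score, names) for single files and small combinations, best first."""
    requested = frozenset(requested)
    if not requested:
        return []
    results = []
    for size in range(1, max_combo + 1):
        for combo in combinations(database, size):
            s = score(requested, combo)
            if s > 0:
                results.append((s, tuple(f.name for f in combo)))
    # Higher score first; among equal scores, prefer fewer files, then name order.
    results.sort(key=lambda r: (-r[0], len(r[1]), r[1]))
    return results


# Hypothetical database contents, for illustration only.
DATABASE = [
    DefinitionFileInfo("grades-table.vcd", frozenset({"table", "average"})),
    DefinitionFileInfo("grades-chart.vcd", frozenset({"pie-chart"})),
    DefinitionFileInfo("memo.vcd", frozenset({"text"})),
]

results = search(DATABASE, {"table", "pie-chart"})
best_score, best_files = results[0]
```

In this sketch no single file covers both requested functions, so the top result is the pair of definition files that together cover them.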
- FIG. 1 is a diagram showing a configuration of a document processing apparatus according to a base technology.
- FIG. 2 is a diagram showing an example of an XML document to be processed.
- FIG. 3 is a diagram showing an example of mapping the XML document shown in FIG. 2 to a table described in HTML.
- FIG. 4 (a) is a diagram showing an example of a definition file for mapping the XML document shown in FIG. 2 to the table shown in FIG.
- FIG. 4 (b) is a diagram showing an example of a definition file for mapping the XML document shown in FIG. 2 to the table shown in FIG.
- FIG. 5 is a diagram showing an example of a screen displayed by mapping the XML document described in the grade management vocabulary shown in FIG. 2 to HTML according to the correspondence shown in FIG.
- FIG. 6 is a diagram showing an example of a graphical user interface presented to the user by the definition file generation unit in order for the user to generate a definition file.
- FIG. 7 is a diagram showing another example of the screen layout generated by the definition file generation unit.
- FIG. 8 is a diagram showing an example of an XML document editing screen by the document processing apparatus.
- FIG. 9 is a diagram showing another example of an XML document edited by the document processing apparatus.
- FIG. 10 is a diagram showing an example of a screen displaying the document shown in FIG.
- FIG. 11 (a) is a diagram showing a basic configuration of a document processing system.
- FIG. 11 (b) is a diagram showing a block diagram of the entire document processing system.
- FIG. 11 (c) is a diagram showing a block diagram of the entire document processing system.
- FIG. 14 is a diagram showing details of the relationship between the program starter and other configurations.
- FIG. 15 is a diagram showing the details of the structure of the application service loaded by the program startup unit.
- FIG. 16 is a diagram showing details of the core component.
- FIG. 17 is a diagram showing details of the document management unit.
- FIG. 18 is a diagram showing details of an undo framework and an undo command.
- FIG. 19 is a diagram showing how a document is loaded in the document processing system.
- FIG. 20 is a diagram showing an example of a document and its expression.
- FIG. 21 is a diagram showing a relationship between a model and a controller.
- FIG. 22 is a diagram showing details of the plug-in subsystem, the vocabulary connection, and the connector.
- FIG. 23 shows an example of a VCD file.
- FIG. 24 is a diagram showing a procedure for loading a compound document in the document processing system.
- FIG. 25 is a diagram showing a procedure for loading a compound document in the document processing system.
- FIG. 26 is a diagram showing a procedure for loading a compound document in the document processing system.
- FIG. 27 is a diagram showing a procedure for loading a compound document in the document processing system.
- FIG. 28 is a diagram showing a procedure for loading a compound document in the document processing system.
- FIG. 29 is a diagram showing a command flow.
- FIG. 30 is a diagram showing a configuration of a vocabulary server according to the first embodiment.
- FIG. 31 is a diagram showing a configuration of a document processing apparatus according to the first embodiment.
- FIG. 32 is a diagram showing a configuration of a schema generation device according to the second exemplary embodiment.
- FIG. 33 is a diagram showing an example of a definition file that is a target of schema generation in the schema generation device.
- FIG. 34 is a diagram showing an example of an XML document processed by the definition file shown in FIG. 33.
- FIG. 36 (a) is a diagram showing another example of a definition file that is a target of schema generation in the schema generation device.
- FIG. 36 (b) is a diagram showing another example of a definition file that is a target of schema generation in the schema generation device.
- FIG. 36 (c) is a diagram showing another example of a definition file that is a target of schema generation in the schema generation device.
- FIG. 36 (d) is a diagram showing another example of a definition file that is a target of schema generation in the schema generation device.
- FIG. 36 (e) is a diagram showing another example of a definition file that is a target of schema generation in the schema generation device.
- FIG. 37 (a) is a diagram showing an example of an XML document processed by the definition file shown in FIGS. 36 (a) to (e).
- FIG. 37 (b) is a diagram showing an example of an XML document processed by the definition file shown in FIGS. 36 (a) to (e).
- FIG. 37 (c) is a diagram showing an example of an XML document processed by the definition file shown in FIGS. 36 (a) to (e).
- FIG. 38 (a) is a diagram showing an example of a schema generated by the schema generation unit from the definition file shown in FIGS. 36 (a) to (e).
- FIG. 38 (b) is a diagram showing an example of a schema generated by the schema generation unit from the definition file shown in FIGS. 36 (a) to (e).
- FIG. 1 shows the configuration of the document processing apparatus 20 according to the base technology.
- the document processing device 20 processes a structured document in which data in a document is classified into a plurality of components having a hierarchical structure.
- the document processing apparatus 20 includes a main control unit 22, an editing unit 24, a DOM unit 30, a CSS unit 40, an HTML unit 50, an SVG unit 60, and a VC unit 80 which is an example of a conversion unit.
- In hardware terms, these components are realized by the CPU and memory of any computer and by programs loaded into that memory; what is depicted here are the functional blocks realized by their cooperation. Those skilled in the art will therefore understand that these functional blocks can be realized in various forms: by hardware only, by software only, or by a combination of the two.
- the main control unit 22 provides a framework for loading plug-ins and executing commands.
- the editing unit 24 provides a framework for editing XML documents.
- the document display and editing functions in the document processing device 20 are realized by plug-ins, and necessary plug-ins are loaded by the main control unit 22 or the editing unit 24 according to the document type.
- The main control unit 22 or the editing unit 24 refers to the namespace of the XML document to be processed, determines which vocabulary the document is described in, and loads the display or editing plug-in corresponding to that vocabulary.
- The document processing device 20 has a display-system and editing-system plug-in for each vocabulary (tag set), such as an HTML unit 50 that displays and edits HTML documents and an SVG unit 60 that displays and edits SVG documents.
- When an HTML document is edited, the HTML unit 50 is loaded; when an SVG document is edited, the SVG unit 60 is loaded.
- When a compound document containing both HTML and SVG components is processed, both the HTML unit 50 and the SVG unit 60 are loaded.
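The namespace-based choice of plug-ins can be illustrated with a minimal sketch. The registry `PLUGINS` and the helper names are assumptions made for illustration; the source does not describe how the main control unit 22 stores this correspondence. The namespace URIs are the standard XHTML and SVG ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: namespace URI -> plug-in responsible for that vocabulary.
PLUGINS = {
    "http://www.w3.org/1999/xhtml": "HTML unit 50",
    "http://www.w3.org/2000/svg": "SVG unit 60",
}


def namespace_of(tag):
    """Extract the namespace URI from an ElementTree tag of the form '{uri}local'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def plugins_to_load(xml_text):
    """Return the plug-ins needed for every vocabulary used, in document order."""
    needed = []
    for element in ET.fromstring(xml_text).iter():
        plugin = PLUGINS.get(namespace_of(element.tag))
        if plugin and plugin not in needed:
            needed.append(plugin)
    return needed


# A small compound document: XHTML embedded in SVG.
COMPOUND = """<svg xmlns="http://www.w3.org/2000/svg">
  <rect width="10" height="10"/>
  <foreignObject>
    <html xmlns="http://www.w3.org/1999/xhtml"><body><p>text</p></body></html>
  </foreignObject>
</svg>"""

loaded = plugins_to_load(COMPOUND)
```

For the compound document above, both plug-ins are selected, the SVG one first because the SVG element appears first.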
- With this configuration, the user can select and install only the functions needed and add or remove functions later as required, so the storage area of the recording medium that stores the program, such as a hard disk, can be used effectively, and memory is not wasted during program execution.
- The configuration is also highly extensible: a developer can support a new vocabulary in the form of a plug-in, which makes development easier, and a user can add functions easily and at low cost by adding plug-ins.
- The editing unit 24 receives an editing instruction event from the user via the user interface, notifies an appropriate plug-in of the event, and controls processing such as re-execution (redo) and cancellation (undo) of the event.
- The DOM unit 30 includes a DOM providing unit 32, a DOM generation unit 34, and an output unit 36, and implements functions conforming to the Document Object Model (DOM), which is defined to provide an access method for handling an XML document as data.
- the DOM provider 32 is a DOM implementation that satisfies the interface defined in the editing unit 24.
- The DOM generation unit 34 generates a DOM tree from the XML document. As described later, when the XML document to be processed is mapped to another vocabulary by the VC unit 80, a source tree corresponding to the mapping source XML document and a destination tree corresponding to the mapping destination XML document are generated.
- the output unit 36 outputs the DOM tree as an XML document at the end of editing, for example.
- the CSS unit 40 includes a CSS analysis unit 42, a CSS providing unit 44, and a rendering unit 46, and provides a display function compliant with CSS.
- the CSS analysis unit 42 has a function of a parser that analyzes the syntax of CSS.
- the CSS provider 44 is an implementation of a CSS object and performs CSS cascade processing on the DOM tree.
- the rendering unit 46 is a CSS rendering engine, and is used to display a document described in a vocabulary such as HTML that is laid out using CSS.
- the HTML unit 50 displays or edits a document described in HTML.
- The SVG unit 60 displays or edits a document described in SVG.
- These display/editing systems are realized in the form of plug-ins. They include display units (Canvas) 56 and 66 that display documents, control units (Editlet) 52 and 62 that send and receive events including editing instructions, and editing units (Zone) 54 and 64 that receive editing commands and edit the DOM, respectively.
- the control unit 52 or 62 receives a DOM tree editing command from the outside, the editing unit 54 or 64 changes the DOM tree, and the display unit 56 or 66 updates the display.
- In terms of the MVC (Model-View-Controller) architecture, the display units 56 and 66 correspond to “View”, the control units 52 and 62 correspond to “Controller”, and the editing units 54 and 64 together with the DOM entity correspond to “Model”.
- The document processing apparatus 20 of the base technology enables not only editing of an XML document in a tree display format but also editing suited to each vocabulary.
- the HTML unit 50 provides a user interface for editing an HTML document in a manner similar to a word processor
- the SVG unit 60 provides a user interface for editing an SVG document in a manner similar to an image drawing tool.
- the VC unit 80 includes a mapping unit 82, a definition file acquisition unit 84, and a definition file generation unit 86.
- The VC unit 80 maps a document described in one vocabulary to another vocabulary, and thereby provides a framework for displaying or editing the document with a display/editing plug-in that supports the mapping destination vocabulary. In this base technology, this function is called Vocabulary Connection (VC).
- the definition file acquisition unit 84 acquires a script file in which the mapping definition is described. This definition file describes the correspondence (connection) between nodes for each node. At this time, whether to edit the element value or attribute value of each node may be specified. Also, an arithmetic expression using the element value or attribute value of the node may be described.
- the mapping unit 82 refers to the script file acquired by the definition file acquisition unit 84, causes the DOM generation unit 34 to generate a destination tree, and manages the correspondence between the source tree and the destination tree.
- the definition file generator 86 provides a graphical user interface for the user to generate a definition file.
- The VC unit 80 monitors the connection between the source tree and the destination tree. When an editing instruction is received from the user via the user interface provided by the plug-in responsible for display, the VC unit 80 first changes the corresponding node in the source tree.
- When the DOM unit 30 issues a mutation event indicating that the source tree has been changed, the VC unit 80 receives the mutation event and, to synchronize the destination tree with the change in the source tree, changes the destination tree node corresponding to the changed node.
- A plug-in that displays/edits the destination tree, for example the HTML unit 50, receives a mutation event indicating that the destination tree has been changed and updates the display with reference to the changed destination tree.
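The edit flow described here (change the source node first, let a mutation event drive the destination node, let the display react to the destination change) can be sketched as follows. The classes `Node` and `VocabularyConnection` and their methods are hypothetical stand-ins, not the apparatus's actual interfaces, and a plain callback stands in for a DOM mutation event.

```python
class Node:
    """A minimal tree node that notifies listeners when its value changes."""

    def __init__(self, name, value=""):
        self.name = name
        self.value = value
        self.listeners = []

    def set_value(self, value):
        self.value = value
        for listener in self.listeners:
            listener(self)  # stands in for issuing a mutation event


class VocabularyConnection:
    """Keeps each destination node in step with the source node it is connected to."""

    def __init__(self):
        self.destination_of = {}  # id(source node) -> destination node
        self.source_of = {}       # id(destination node) -> source node

    def connect(self, source, destination):
        self.destination_of[id(source)] = destination
        self.source_of[id(destination)] = source
        source.listeners.append(self.on_source_changed)
        destination.value = source.value  # initial mapping, no event issued

    def on_source_changed(self, source):
        # Follow the source change; the destination then notifies its own listeners.
        self.destination_of[id(source)].set_value(source.value)

    def edit(self, destination, value):
        # An edit made through the display is applied to the source node first.
        self.source_of[id(destination)].set_value(value)


mathematics = Node("mathematics", "50")   # node in the source tree
cell = Node("TD")                         # node in the destination tree
redraws = []
cell.listeners.append(lambda node: redraws.append(node.value))  # display plug-in

vc = VocabularyConnection()
vc.connect(mathematics, cell)
vc.edit(cell, "70")
```

The display callback is never called directly by the edit; it fires only because the destination node changed in response to the source node.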
- When the document processing device 20 reads a document to be processed, the DOM generation unit 34 generates a DOM tree from the XML document. The main control unit 22 or the editing unit 24 then refers to the namespace to determine the vocabulary describing the document. If a plug-in corresponding to the vocabulary is installed in the document processing apparatus 20, the plug-in is loaded to display/edit the document. If the plug-in is not installed, the apparatus checks whether a mapping definition file exists. When the definition file exists, the definition file acquisition unit 84 acquires it, a destination tree is generated according to the definition, and the document is displayed/edited by the plug-in corresponding to the mapping destination vocabulary.
- For a compound document, the corresponding parts of the document are displayed and edited by the plug-ins corresponding to each vocabulary, as described later. If the definition file does not exist, the document source or tree structure is displayed, and editing is performed on that display screen.
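The three-step fallback described above (dedicated plug-in, then mapping via a definition file, then source or tree display) can be written as a short sketch. The function name, the dictionary-based lookups, and the vocabulary labels are assumptions for illustration only.

```python
def choose_display(vocabulary, installed_plugins, definition_files):
    """Pick how to display a document, following the three-step fallback in the text.

    installed_plugins: dict mapping vocabulary -> plug-in name (hypothetical)
    definition_files:  dict mapping source vocabulary -> destination vocabulary (hypothetical)
    """
    # 1. A plug-in for the vocabulary itself is installed.
    if vocabulary in installed_plugins:
        return ("plug-in", installed_plugins[vocabulary])
    # 2. A definition file maps the vocabulary to one that has a plug-in.
    destination = definition_files.get(vocabulary)
    if destination in installed_plugins:
        return ("vocabulary connection", installed_plugins[destination])
    # 3. Otherwise fall back to showing the source or the tree structure.
    return ("source or tree display", None)


INSTALLED = {"html": "HTML unit 50", "svg": "SVG unit 60"}
DEFINITIONS = {"grades": "html"}

direct = choose_display("svg", INSTALLED, DEFINITIONS)
mapped = choose_display("grades", INSTALLED, DEFINITIONS)
fallback = choose_display("unknown", INSTALLED, DEFINITIONS)
```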
- FIG. 2 shows an example of an XML document to be processed.
- This XML document is used to manage student grade data.
- The component “score”, which is the top node of the XML document, has under it a plurality of “student” components, one for each student.
- The component “student” has an attribute “name” and child elements “national language”, “mathematics”, “science”, and “society”. The attribute “name” stores the name of the student.
- The components “national language”, “mathematics”, “science”, and “society” store the grades for national language, mathematics, science, and society, respectively.
- For example, the student named “A” has a national language grade of “90”, a mathematics grade of “50”, a science grade of “75”, and a society grade of “60”.
- The vocabulary (tag set) used in this document is referred to below as the “grade management vocabulary”.
- Because the document processing apparatus 20 of the base technology does not have a plug-in that supports display/editing of the grade management vocabulary, the VC function is used to display this document by a method other than source display or tree display.
- The user interface with which users create a definition file themselves is described later; here, the description proceeds on the assumption that a definition file has already been prepared.
- FIG. 3 shows an example of mapping the XML document shown in FIG. 2 to a table described in HTML.
- The “student” node in the grade management vocabulary is associated with a row (“TR” node) of the table (“TABLE” node) in HTML, and the attribute value “name” is associated with the first column of each row.
- The second column is associated with the element value of the “national language” node, the third column with that of the “mathematics” node, the fourth column with that of the “science” node, and the fifth column with that of the “society” node.
- the XML document shown in FIG. 2 can be displayed in an HTML table format.
- the sixth column specifies the formula for calculating the weighted average of national language, mathematics, science, and society, and displays the average score of the students. In this way, by making it possible to specify an arithmetic expression in the definition file, more flexible display is possible, and user convenience during editing can be improved. Note that the sixth column specifies that editing is not possible, so that only the average score cannot be edited individually. In this way, by making it possible to specify whether or not editing can be performed in the mapping definition, it is possible to prevent erroneous operations by the user.
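The column mapping described above, including the computed sixth column that is marked as not editable, can be sketched as follows. The element names, the equal weights, and the `data-editable` attribute are assumptions for illustration; the actual tag names of FIG. 2 and the weights used in the definition file are not given in this text.

```python
import xml.etree.ElementTree as ET

# Illustrative source document; tag names are assumed English renderings.
SOURCE = """<score>
  <student name="A">
    <national_language>90</national_language>
    <mathematics>50</mathematics>
    <science>75</science>
    <society>60</society>
  </student>
</score>"""

SUBJECTS = ["national_language", "mathematics", "science", "society"]
WEIGHTS = {subject: 1.0 for subject in SUBJECTS}  # assumed equal weights


def weighted_average(student):
    """Weighted average of the subject grades of one <student> element."""
    total = sum(WEIGHTS[s] * float(student.findtext(s)) for s in SUBJECTS)
    return total / sum(WEIGHTS.values())


def to_html_table(source_root):
    """Map each <student> to a table row: name, four grades, computed average."""
    table = ET.Element("table")
    for student in source_root.iter("student"):
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = student.get("name")
        for subject in SUBJECTS:
            ET.SubElement(row, "td").text = student.findtext(subject)
        # Sixth column: derived value, flagged so an editor would not allow edits.
        average = ET.SubElement(row, "td", {"data-editable": "false"})
        average.text = f"{weighted_average(student):g}"
    return table


table = to_html_table(ET.fromstring(SOURCE))
cells = [td.text for td in table.iter("td")]
```

With equal weights, student “A” from the text gets an average of 68.75 in the sixth cell.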
- FIGS. 4(a) and 4(b) show an example of a definition file for mapping the XML document shown in FIG. 2 to the table shown in FIG.
- This definition file is written in the script language defined for definition files.
- The definition file contains command definitions and display templates.
- In this example, “add student” and “delete student” are defined as commands, associated respectively with an operation that inserts a “student” node into the source tree and an operation that deletes a “student” node from the source tree.
- As the template, headings such as “name” and “national language” are displayed in the first row of the table, and the contents of each “student” node are displayed in the second and subsequent rows.
- FIG. 5 shows an example of a screen displayed by mapping the XML document described in the grade management vocabulary shown in FIG. 2 to HTML according to the correspondence shown in FIG.
- Table 90 shows, from the left, each student's name, national language grade, mathematics grade, science grade, society grade, and average score.
- The user can edit the XML document on this screen. For example, if the value in the second row and third column is changed to “70”, the element value in the source tree corresponding to this node, that is, the mathematics grade of the student “B”, is changed to “70”.
- At this time, the VC unit 80 changes the corresponding portion of the destination tree so that the destination tree follows the source tree, and the HTML unit 50 updates the display based on the changed destination tree. Therefore, in the table on the screen as well, the mathematics grade of the student “B” is changed to “70”, and the average score is changed to “55”.
- The commands “add student” and “delete student” are displayed in the menu, as defined in the definition file shown in FIGS. 4(a) and 4(b).
- When the user selects one of these commands, a “student” node is added to or deleted from the source tree.
- Such a tree-structure editing function may be provided to the user in the form of a command.
- Alternatively, a command for adding or deleting a table row may be associated with an operation for adding or deleting a “student” node.
- FIG. 6 shows an example of a graphical user interface that the definition file generator 86 presents to the user in order for the user to generate a definition file.
- In the area 91 on the left side of the screen, the mapping source XML document is displayed as a tree.
- The area 92 on the right side of the screen shows the screen layout of the mapping destination XML document.
- This screen layout can be edited by the HTML unit 50, and the user creates a screen layout for displaying the document in the area 92 on the right side of the screen.
- By dragging a node of the mapping source XML document displayed in the area 91 on the left side of the screen and dropping it into the HTML screen layout displayed in the area 92 on the right side, the user specifies the connection between the mapping source node and the mapping destination node. For example, if “mathematics”, a child element of the element “student”, is dropped into the first row and third column of Table 90 on the HTML screen, a connection is established between the “mathematics” node and the “TD” node in the third column.
- Each node can be designated for editing.
- An arithmetic expression can also be embedded in the display screen.
- the definition file generation unit 86 generates a definition file describing the screen layout and the connection between the nodes.
- FIG. 7 shows another example of the screen layout generated by the definition file generation unit 86.
- a table 90 and a pie chart 93 are created on the screen for displaying the XML document described in the grade management vocabulary.
- This pie chart 93 is described in SVG.
- Since the document processing apparatus 20 of the base technology can process a compound document that includes a plurality of vocabularies in one XML document, a table 90 described in HTML and a pie chart 93 described in SVG can be displayed on one screen, as in this example.
- FIG. 8 shows an example of an XML document editing screen by the document processing apparatus 20.
- In the example of FIG. 8, one screen is divided into multiple areas, and the XML document to be processed is displayed in a different display format in each area: the source of the document is displayed in the area 94, the tree structure of the document in the area 95, and the table described in HTML shown in FIG. 5 in the area 96.
- Documents can be edited on any of these screens.
- When the user edits the document on any of these screens, the source tree is changed, and the plug-in responsible for displaying each screen updates its screen to reflect the change in the source tree.
- Specifically, the display unit of the plug-in responsible for displaying each editing screen is registered as a listener for mutation events that report changes to the source tree. When the source tree is changed by any plug-in or by the VC unit 80, all the display units displaying the editing screens receive the issued mutation event and update their screens.
- If a plug-in displays the document via the VC function, the VC unit 80 changes the destination tree following the change of the source tree, and the display unit of that plug-in then updates the screen with reference to the changed destination tree.
- For example, when the source display and the tree display are realized by dedicated plug-ins, the source display plug-in and the tree display plug-in refer directly to the source tree for display, without using a destination tree.
- In this case, when editing is performed on any screen, the source display plug-in and the tree display plug-in update their screens with reference to the changed source tree, and the HTML unit 50, which is in charge of the screen in the area 96, updates its screen by referring to the destination tree changed following the change of the source tree.
- The source display and the tree display can also be realized by using the VC function. That is, the source and the tree structure can be laid out in HTML, the XML document mapped to that HTML, and the result displayed by the HTML unit 50.
- In this case, three destination trees are generated: source format, tree format, and tabular format.
- When editing is performed on any of the screens, the VC unit 80 changes the source tree and then changes each of the three destination trees (source format, tree format, and tabular format), and the HTML unit 50 refers to those destination trees and updates the three screens.
- the convenience of the user can be improved by displaying the document in a plurality of display formats on one screen.
- For example, the user can display and edit a document in a format that is easy to understand visually, using the table 90 or the like, while grasping the hierarchical structure of the document through the source display or the tree display.
- In the example above, one screen is divided so that the document is displayed in multiple display formats at the same time, but the document may instead be displayed in a single display format on one screen, with the display format switched by a user instruction.
- the main control unit 22 receives a display format switching request from the user, and instructs each plug-in to switch the display.
- FIG. 9 shows another example of an XML document edited by the document processing apparatus 20.
- an XHTML document is embedded in the “foreignObject” tag of the SVG document, and a mathematical expression written in MathML is included in the XHTML document.
- The editing unit 24 refers to the namespace and distributes the drawing work to the appropriate display systems.
- In the example of FIG. 9, the editing unit 24 first causes the SVG unit 60 to draw a rectangle and then causes the HTML unit 50 to draw the XHTML document.
- It further causes a MathML unit (not shown) to draw the mathematical expression. In this way, a compound document including a plurality of vocabularies is displayed appropriately.
- FIG. 10 shows the display result.
- While a document is being edited, the displayed menu may be switched according to the position of the cursor (caret). That is, when the cursor is in the area where the SVG document is displayed, the menu provided by the SVG unit 60 or the commands defined in the definition file for mapping the SVG document are displayed; when the cursor is in the area where the XHTML document is displayed, the menu provided by the HTML unit 50 or the commands defined in the definition file for mapping the XHTML document are displayed. An appropriate user interface can thereby be provided according to the editing position.
- If a compound document contains a part described in a vocabulary for which no suitable plug-in or mapping definition file exists, that part may be displayed in the source display or the tree display.
- Conventionally, when another document is embedded in a document, its contents cannot be displayed unless the application that displays the embedded document is installed.
- In the base technology, even without such an application, the contents can be grasped by displaying the XML document, which is composed of text data, in the source display or the tree display. This is a feature unique to text-based documents such as XML.
- A tag of another vocabulary may be used in a document described in a certain vocabulary. Such an XML document is not valid, but as long as it is well-formed it can still be processed as an XML document. In this case, the inserted tags of the other vocabulary may be mapped by a definition file. For example, tags such as “Important” and “Most important” may be used in an XHTML document, and the parts enclosed by these tags may be highlighted or sorted and displayed in order of importance.
- When the user edits a document, the plug-in or VC unit 80 responsible for the edited part changes the source tree. Mutation event listeners can be registered for each node in the source tree. Normally, the display unit of the plug-in or the VC unit 80 corresponding to the vocabulary to which each node belongs is registered as a listener.
- When the source tree is changed, the DOM providing unit 32 traces from the changed node up through the hierarchy and, if a listener is registered, issues a mutation event to that listener. For example, in the document shown in FIG. 9, if a node below the <html> node is changed, a mutation event is notified to the HTML unit 50 registered as a listener on the <html> node, and a mutation event is also notified to the SVG unit 60 registered as a listener on the <svg> node above it.
- The HTML unit 50 then updates the display with reference to the changed source tree.
- Since no node belonging to its own vocabulary has been changed, the SVG unit 60 may ignore the mutation event.
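The upward tracing of listeners can be illustrated with a small sketch. The `Node` and `Plugin` classes, the vocabulary labels, and the rule that a plug-in redraws only for nodes of its own vocabulary are simplifying assumptions, not the actual interfaces of the DOM providing unit 32.

```python
class Node:
    """Source-tree node with a parent link and registered mutation listeners."""

    def __init__(self, name, vocabulary, parent=None):
        self.name = name
        self.vocabulary = vocabulary
        self.parent = parent
        self.listeners = []


class Plugin:
    """Display plug-in that redraws only when a node of its own vocabulary changed."""

    def __init__(self, name, vocabulary):
        self.name = name
        self.vocabulary = vocabulary
        self.redraws = 0

    def on_mutation(self, changed):
        if changed.vocabulary == self.vocabulary:
            self.redraws += 1
        # Otherwise the event is simply ignored.


def dispatch_mutation(changed):
    """Walk from the changed node up to the root, notifying every listener found."""
    notified = []
    node = changed
    while node is not None:
        for plugin in node.listeners:
            plugin.on_mutation(changed)
            notified.append(plugin.name)
        node = node.parent
    return notified


# Tree shaped like the compound document of FIG. 9: XHTML embedded in SVG.
svg = Node("svg", "svg")
foreign_object = Node("foreignObject", "svg", parent=svg)
html = Node("html", "xhtml", parent=foreign_object)
paragraph = Node("p", "xhtml", parent=html)

svg_unit = Plugin("SVG unit 60", "svg")
html_unit = Plugin("HTML unit 50", "xhtml")
svg.listeners.append(svg_unit)
html.listeners.append(html_unit)

order = dispatch_mutation(paragraph)
```

Both plug-ins are notified, nearest listener first, but only the HTML plug-in redraws.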
- the overall layout may change as the display is updated by the HTML unit 50.
- the layout of the display area for each plug-in is updated by a configuration that manages the layout of the screen, for example, a plug-in that is responsible for displaying the top node.
- the HTML unit 50 first draws a part that it is in charge of and determines the size of the display area. Then, it notifies the configuration that manages the layout of the screen of the size of the display area after the change, and requests a layout update.
- the configuration that manages the layout of the screen receives the notification and re-lays out the display area for each plug-in. In this way, the display of the edited part is updated appropriately, and the layout of the entire screen is updated.
- Documents written in a markup language are usually expressed in the form of a tree data structure in browsers and other applications. This structure corresponds to the tree of the results of parsing the document.
- The DOM (Document Object Model) is a well-known tree-based data structure model used to represent and manipulate documents. The DOM provides a standard set of objects for representing documents, including HTML and XML documents. It has two basic components: a standard model of how the objects representing the components of a document are connected, and a standard interface for accessing and manipulating those objects.
- A DOM tree is a hierarchical representation of a document based on the contents of the corresponding DOM.
- A DOM tree contains a “root” and one or more “nodes” that originate from the root. In some cases, the root represents the entire document. Intermediate nodes can represent elements such as a table and the rows and columns in that table.
- a “leaf” in a DOM tree usually represents data such as text or images that cannot be further decomposed.
- Each node in the DOM tree may be associated with attributes that describe the parameters of the element represented by the node, such as font, size, color, and indentation.
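- The root, intermediate nodes, leaves, and attributes described above can be illustrated with a minimal sketch. It assumes Python's standard `xml.dom.minidom` module; the sample markup and the helper name `describe` are hypothetical and are not taken from the document processing system itself.

```python
from xml.dom import minidom

# Hypothetical sample document; the table markup is only an illustration.
SAMPLE = '<table border="1"><row><cell color="red">A</cell><cell>B</cell></row></table>'


def describe(node, depth=0, out=None):
    """Walk a DOM tree depth-first and record each node together with its depth."""
    if out is None:
        out = []
    if node.nodeType == node.TEXT_NODE:
        # Leaves hold data (here text) that cannot be decomposed further.
        out.append((depth, "text", node.data))
    elif node.nodeType == node.ELEMENT_NODE:
        # Attributes describe parameters of the element represented by the node.
        out.append((depth, node.tagName, dict(node.attributes.items())))
    for child in node.childNodes:
        describe(child, depth + 1, out)
    return out


document = minidom.parseString(SAMPLE)
nodes = describe(document.documentElement)
for depth, name, detail in nodes:
    print("  " * depth + name, detail)
```

- Here the `table` element is the root of the tree, `row` and `cell` are intermediate nodes, and the text nodes are leaves.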
- HTML is a commonly used language for creating documents, but it is a language for formatting and layout, not a language for data description. The nodes in a DOM tree that represents an HTML document are elements predefined as HTML formatting tags, and HTML usually does not provide functions for data detailing or data tagging/labeling. It is therefore often difficult to formulate queries for data in HTML documents.
- the goal of network designers is to allow documents on the web to be queried and processed by software applications. Any hierarchically structured language that is independent of the display method can be queried and processed in this way. Markup languages such as XML (eXtensible Markup Language) can provide these features.
- XML: eXtensible Markup Language
- HTML: HyperText Markup Language
- XSL: Extensible Stylesheet Language
- XPath provides common syntax and semantics for specifying the location of parts of an XML document.
- An example of its functionality is traversing (moving through) a DOM tree corresponding to an XML document. XPath also provides basic functions for manipulating the strings, numbers, and Booleans associated with various representations of XML documents.
- XPath operates not on the surface syntax of an XML document, such as the line numbers or character positions seen when the document is viewed as text, but on its abstract, logical structure, such as the DOM tree. Using XPath, a location can be specified through the hierarchical structure of the DOM tree of an XML document, for example. In addition to its use for addressing, XPath is also designed to be used to test whether a node in a DOM tree matches a pattern. More details on XPath can be found at http://www.w3.org/TR/xpath.
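- The two uses of XPath mentioned above, addressing and pattern testing, can be sketched as follows. The sketch assumes Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath; the sample document and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for illustration.
SAMPLE = """
<library>
  <book id="b1" lang="en"><title>First</title></book>
  <book id="b2" lang="ja"><title>Second</title></book>
</library>
"""

root = ET.fromstring(SAMPLE)

# Addressing: select nodes by their position in the hierarchy.
titles = [title.text for title in root.findall("./book/title")]

# Pattern test: select only the nodes that satisfy a predicate.
japanese_ids = [book.get("id") for book in root.findall("./book[@lang='ja']")]

print(titles)
print(japanese_ids)
```

- Neither expression refers to line or character positions in the text; both address the logical tree structure only.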
- the following description uses a well-known GUI (Graphical User Interface) paradigm called MVC (Model-View-Controller).
- the MVC paradigm divides an application or part of an application interface into three parts: a model (M), a view (V), and a controller (C).
- MVC was originally developed to assign the traditional roles of input, processing, and output to the GUI world.
- the controller acts to interpret input such as mouse and keyboard input from the user and map these user actions to commands sent to the model and / or view to bring about appropriate changes.
- the model manages one or more data elements, responds to queries about its state, and responds to instructions to change its state. The view manages a rectangular area of the display and presents data to the user through a combination of graphics and text.
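- The division of roles in MVC can be sketched in a few lines. This is only an illustration under simplifying assumptions: the class names, the listener mechanism, and the string-based "rendering" are hypothetical and stand in for a real display.

```python
class Model:
    """Manages a data element and notifies listeners when its state changes."""

    def __init__(self):
        self.text = ""
        self.listeners = []

    def insert(self, s):
        self.text += s
        for listener in self.listeners:
            listener(self)


class View:
    """Presents the model's data; here it renders to a string instead of a screen area."""

    def __init__(self, model):
        self.rendered = ""
        model.listeners.append(self.refresh)

    def refresh(self, model):
        self.rendered = f"[{model.text}]"


class Controller:
    """Interprets user input and maps it to commands sent to the model."""

    def __init__(self, model):
        self.model = model

    def key_pressed(self, char):
        self.model.insert(char)


model = Model()
view = View(model)
controller = Controller(model)
for ch in "abc":
    controller.key_pressed(ch)
print(view.rendered)
```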
- An example of a document processing system is described in connection with FIGS. 11-29.
- FIG. 11 (a) shows a conventional configuration example of elements that function as the basis of a document processing system of the type described later.
- Configuration 10 includes a processor 11, such as a CPU or microprocessor, connected to memory 12 by communication path 13.
- the memory 12 may be in any ROM and/or RAM format available now or in the future.
- the communication path 13 is typically provided as a bus.
- An input/output interface 16 for a user input device 14, such as a mouse, a keyboard, or a voice recognition system, and for a display device 15 (or other user interface) is also connected to the bus used for communication between the processor 11 and the memory 12.
- This configuration may be a stand-alone, a networked form with multiple terminals and one or more servers connected, or may be configured in any known manner.
- the present invention is not limited by the arrangement of these components, by whether the architecture is centralized or distributed, or by the communication methods of the various components.
- the present system and the embodiments discussed herein are described as including several components and subcomponents that provide various functionalities. These components and subcomponents can be implemented with hardware alone, software alone, or a combination of hardware and software to provide the noted functionality. Furthermore, the hardware, software, and combinations thereof can be realized by general-purpose computing devices, dedicated hardware, or combinations thereof. Thus, the configuration of a component or subcomponent includes a general-purpose or dedicated computing device that executes specific software to provide the functionality of the component or subcomponent.
- FIG. 11 (b) shows an overall block diagram of an example of the document processing system.
- In this document processing system, documents are generated and edited.
- These documents may be described in any language having markup language characteristics, such as XML.
- the document processing system can be regarded as having two basic configurations.
- the first configuration is an “execution environment” 101, which is the environment in which the document processing system operates.
- the execution environment provides basic utilities and functions that support the system as well as the user during document processing and management.
- the second configuration is an “application” 102 composed of applications running in the execution environment. These applications include the document itself and various representations of the document.
- the execution environment 101 includes a ProgramInvoker (program invoker: program activation unit) 103.
- ProgramInvoker 103 is a basic program that is accessed to activate the document processing system. For example, when a user logs on to the document processing system and starts it, ProgramInvoker 103 is executed.
- ProgramInvoker 103 can, for example, read and execute functions added to the document processing system as plug-ins, start and run applications, and read properties related to a document.
- the functions of ProgramInvoker 103 are not limited to these.
- ProgramInvoker 103 finds the application, launches it, and runs it.
- Plug-in subsystem 104 is used as a highly flexible and efficient configuration for adding functionality to a document processing system.
- the plug-in subsystem 104 can also be used to modify or delete functionality that exists in the document processing system.
- a wide variety of functions can be added or modified using the plug-in subsystem. For example, it is possible to add an Editlet function that supports drawing a document on the screen.
- the Editlet plug-in also supports editing of vocabularies that are added to the system.
- the plug-in subsystem 104 includes a ServiceBroker (service broker) 1041.
- ServiceBroker 1041 mediates services added to the document processing system by managing the plug-ins added to the document processing system.
- Individual functions that achieve the desired functionality are added to the system in the form of a Service 1042.
- Available types of Service 1042 include, but are not limited to, the Application Service, the ZoneFactory (zone factory: zone generation unit) Service, the Editlet (editlet: editing unit) Service, the CommandFactory (command factory: command generation unit) Service, the ConnectXPath (connect XPath: XPath management unit) Service, and the CSSComputation (CSS computation: CSS calculation unit) Service.
- A plug-in is a unit that can contain one or more ServiceProviders (service providers).
- Each ServiceProvider has one or more classes of Service associated with it. For example, by using a single plug-in with the appropriate software application, one or more services can be added to the system, thereby adding the corresponding functionality to the system.
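- The relationship among plug-ins, ServiceProviders, Services, and the ServiceBroker can be sketched as a simple registry. The class names follow the text, but the methods `add_plugin` and `lookup`, the constructor arguments, and the sample plug-in are assumptions made for illustration and are not the system's actual interfaces.

```python
from collections import defaultdict


class ServiceProvider:
    """Groups one or more services; each service is tagged with a service type."""

    def __init__(self, name, services):
        self.name = name
        self.services = services  # list of (service_type, callable) pairs


class Plugin:
    """A unit that can contain one or more ServiceProviders."""

    def __init__(self, name, providers):
        self.name = name
        self.providers = providers


class ServiceBroker:
    """Keeps track of the services contributed by installed plug-ins."""

    def __init__(self):
        self._services = defaultdict(list)

    def add_plugin(self, plugin):
        for provider in plugin.providers:
            for service_type, service in provider.services:
                self._services[service_type].append(service)

    def lookup(self, service_type):
        return list(self._services.get(service_type, []))


# Hypothetical plug-in contributing an Editlet service and a ZoneFactory service.
svg_plugin = Plugin("svg", [
    ServiceProvider("svg-provider", [
        ("Editlet", lambda node: f"draw {node}"),
        ("ZoneFactory", lambda root: f"zone for {root}"),
    ]),
])

broker = ServiceBroker()
broker.add_plugin(svg_plugin)
editlets = broker.lookup("Editlet")
print(editlets[0]("circle"))
```

- Installing a single plug-in thus adds several services at once, and other parts of the system look services up by type rather than by plug-in.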
- Command subsystem 105 is used to execute instructions in the form of commands related to document processing.
- a user can execute an operation on a document by executing a series of instructions. For example, a user edits an XML DOM tree corresponding to an XML document in the document processing system by issuing an instruction in the form of a command, and processes the XML document. These commands may be entered using keystrokes, mouse clicks, or other valid user interface actions.
- One command may execute more than one instruction. In this case, these instructions are wrapped in one command and executed sequentially. For example, suppose a user wants to replace an incorrect word with a correct word. In this case, the first instruction finds the wrong word in the document, the second instruction deletes the wrong word, and the third instruction inserts the correct word. These three instructions may be wrapped in one command.
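- The word-replacement example can be sketched as three small instructions wrapped in one command. The function names are hypothetical, and a plain string stands in for the document.

```python
def find_word(text, word):
    """Instruction 1: locate the first occurrence of the word, or return -1."""
    return text.find(word)


def delete_span(text, start, length):
    """Instruction 2: delete `length` characters starting at `start`."""
    return text[:start] + text[start + length:]


def insert_at(text, position, word):
    """Instruction 3: insert `word` at `position`."""
    return text[:position] + word + text[position:]


def replace_word_command(text, wrong, correct):
    """One command wrapping the three instructions, executed sequentially."""
    position = find_word(text, wrong)
    if position < 0:
        return text  # nothing to replace
    text = delete_span(text, position, len(wrong))
    return insert_at(text, position, correct)


result = replace_word_command("the quick brwon fox", "brwon", "brown")
print(result)
```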
- the command may have an associated function, for example, an "Undo" function, described in detail below. These functions may also be assigned to some base classes that are used to create objects.
- a key component of the command subsystem 105 is CommandInvoker (command invoker: command activation unit) 1051, which acts to selectively give and execute commands.
- Although FIG. 11 (b) shows only one CommandInvoker, one or more CommandInvokers can be used, and one or more commands can be executed simultaneously.
- CommandInvoker 1051 holds the functions and classes necessary for executing commands.
- a Command 1052 to be executed is loaded into Queue 1053.
- CommandInvoker 1051 creates a command thread that runs continuously. If no Command is already running in CommandInvoker 1051, the Command 1052 intended to be executed by CommandInvoker 1051 is executed.
- If CommandInvoker 1051 is already executing a Command, the new Command is placed at the end of Queue 1053. Each CommandInvoker 1051 executes only one Command at a time. CommandInvoker 1051 performs exception handling when execution of the specified Command fails.
- Command types executed by CommandInvoker 1051 include, but are not limited to, UndoableCommand 1054, AsynchronousCommand 1055, and VCCommand 1056.
- UndoableCommand 1054 is a command whose result can be canceled if the user so desires. Examples of UndoableCommands include cut, copy, and text insertion. In operation, when a user selects a part of a document and applies a cut command to that part, the UndoableCommand allows the cut part to be restored to its state before the cut.
- VCCommand 1056 is stored in a Vocabulary Connection Descriptor (VCD) script file. VCCommands are user-specified commands that can be defined by the programmer.
- A VCCommand may be a combination of more abstract commands, for example, commands for adding an XML fragment, deleting an XML fragment, or setting an attribute. These commands are specifically focused on document editing.
- AsynchronousCommand 1055 is a command from the system, such as loading or saving a document, and is executed asynchronously, separately from UndoableCommand and VCCommand. Because AsynchronousCommand is not an UndoableCommand, it cannot be canceled.
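- The queueing behaviour of a CommandInvoker (one Command at a time, new Commands appended to the end of the queue, exception handling on failure) can be sketched as follows. This is a single-threaded simplification: the text describes a continuously running command thread, whereas here the hypothetical method `run_pending` drains the queue explicitly. The method names and the sample commands are assumptions.

```python
from collections import deque


class CommandInvoker:
    """Queues commands and executes them one at a time, in arrival order."""

    def __init__(self):
        self.queue = deque()
        self.log = []      # names of commands that completed
        self.errors = []   # (name, message) for commands that failed

    def submit(self, name, action):
        # A new command is always placed at the end of the queue.
        self.queue.append((name, action))

    def run_pending(self):
        while self.queue:
            name, action = self.queue.popleft()
            try:
                action()
            except Exception as exc:  # exception handling for a failed command
                self.errors.append((name, str(exc)))
            else:
                self.log.append(name)


document = []


def failing_save():
    raise RuntimeError("disk unavailable")  # simulated failure


invoker = CommandInvoker()
invoker.submit("insert-hello", lambda: document.append("Hello"))
invoker.submit("save", failing_save)
invoker.submit("insert-world", lambda: document.append("World"))
invoker.run_pending()
print(document, invoker.errors)
```

- A failed command is recorded and does not prevent the commands queued behind it from running.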
- Resource 109 is an object that provides various functions to various classes. For example, string resources, icons, and default key bindings are examples of resources used in the system.
- the application component 102, which is the second main feature of the document processing system, is executed in the execution environment 101.
- Application component 102 includes the actual document and various logical and physical representations of the document in the system.
- the application component 102 includes the configuration of the system used to manage the document.
- the application component 102 further includes UserApplication 106, an application core 108, a user interface 107, and CoreComponent 110.
- UserApplication 106 is loaded on the system together with ProgramInvoker 103.
- UserApplication 106 is the glue that connects the document, the various representations of the document, and the user interface required to interact with the document. For example, suppose a user wants to generate a set of documents that are part of a project. When these documents are loaded, an appropriate representation of each document is generated. The user interface function is added as part of UserApplication 106. In other words, UserApplication 106 holds both the representation of the documents that allows the user to interact with the documents that form part of the project, and various aspects of the documents. Once UserApplication 106 has been created, the user can simply load UserApplication 106 on the execution environment whenever the user wants to interact with the documents that form part of the project.
- CoreComponent 110 provides a way to share documents between multiple Panes.
- A Pane displays the DOM tree and handles the physical layout of the screen.
- a physical screen consists of multiple Panes, each of which depicts an individual piece of information. A document visible to the user on the screen can appear in one or more Panes. Also, two different documents may appear in two different Panes on the screen.
- the physical layout of the screen is also in the form of a tree.
- a Pane can be a RootPane 1084 or a SubPane 1085.
- RootPane 1084 is the Pane at the root of the Pane tree.
- SubPane 1085 is any Pane other than RootPane 1084.
- CoreComponent 110 also provides fonts and serves as a source of multiple functional operations for documents, such as a toolkit.
- An example of a task performed by CoreComponent 110 is moving the mouse cursor between multiple Panes.
- Another example of a task to be performed is to mark a part of a document in one pane and copy it onto another pane that contains a different document.
- the application component 102 consists of documents that are processed and managed by the system. This includes various logical and physical representations of documents within the system.
- the application core 108 is a component of the application component 102. Its function is to hold the actual document with all the data it contains.
- the application core 108 includes DocumentManager (document manager: document management unit) 1081 and the Document (document) 1082 itself.
- DocumentManager 1081 manages Document 1082.
- DocumentManager 1081 is also connected to RootPane 1084, SubPane 1085, the ClipBoard (clipboard) utility 1087, and the SnapShot (snapshot) utility 1088.
- the ClipBoard utility 1087 provides a way to keep the portion of the document that the user decides to add to the clipboard. For example, a user may want to cut a part of a document and save it in a new document for later review. In such a case, the clipped part is added to ClipBoard.
- the SnapShot utility 1088 allows the current state of an application to be stored when the application transitions from one state to another.
- Another component of the application component 102 is the user interface 107, which provides a means for a user to physically interact with the system.
- the user interface is used by users to upload, delete, edit, and manage documents.
- the user interface includes Frame 1071, MenuBar (menu bar) 1072, StatusBar (status bar) 1073, and URLBar (URL bar) 1074.
- Frame 1071 is considered to be an active area of the physical screen, as is generally known.
- MenuBar 1072 is a screen area that contains menus offering choices to the user.
- StatusBar 1073 is a screen area that displays the execution status of the application.
- URLBar 1074 provides an area for entering URL addresses to navigate the Internet.
- FIG. 12 shows the details of DocumentManager 1081. This includes the data structures and components used to represent the document within the document processing system. For simplicity, the configuration described in this subsection is described using the MVC paradigm.
- DocumentManager 1081 includes DocumentContainer (document container) 203, which holds and hosts all documents in the document processing system.
- the toolkit 201 attached to DocumentManager 1081 provides various tools used by DocumentManager 1081.
- DomService is a tool provided by the toolkit 201 that provides all the functions needed to create, maintain, and manage the DOM corresponding to a document.
- IOManager (input/output manager) is another tool provided by the toolkit 201.
- StreamHandler is a tool that handles uploading documents using bitstreams.
- the model (M) includes a DOM tree model 202 of the document. As mentioned above, all documents are represented as DOM trees in the document processing system. The document also forms part of the DocumentContainer 203.
- a DOM tree representing a document is a tree of Nodes 2021.
- Zone 209, which is a subset of the DOM tree, contains an associated region of one or more Nodes in the DOM tree. For example, only a part of a document may be displayed on the screen, and this visible part of the document is displayed using Zone 209.
- A Zone is generated using ZoneFactory (zone factory: zone generation unit) 205.
- A Zone represents a part of the DOM and may use one or more “namespaces”.
- a namespace is a collection of names that are unique within that namespace. In other words, the same name does not appear twice in a namespace.
- the Facet 2022 is another component within the model (M) part of the MVC paradigm. A Facet is used to edit Nodes in the Zone. Facet 2022 organizes access to the DOM using procedures that can be performed without affecting the contents of the Zone itself. As explained next, these procedures perform important and useful operations related to Nodes.
- Each Node has a corresponding Facet. Instead of directly manipulating Nodes in the DOM, the integrity of the DOM is protected by using Facets to perform the operations. If operations were performed directly on Nodes, several plug-ins could modify the DOM at the same time, resulting in inconsistencies.
- the DOM standard established by the W3C defines a standard interface for operating on Nodes, but in practice there are operations specific to each vocabulary or each Node, and it is convenient to provide these operations as APIs. In the document processing system, such node-specific APIs are provided as Facets and attached to each Node. This makes it possible to add useful APIs while complying with the DOM standard. In addition, because the specific APIs are added on top of a standard DOM implementation rather than by implementing a specific DOM for each vocabulary, a variety of vocabularies can be handled in a unified manner, and a document in which multiple vocabularies are mixed in any combination can be processed appropriately.
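- The idea of attaching a node-specific API to a standard DOM node, rather than extending the DOM implementation itself, can be sketched as a wrapper class. The sketch assumes Python's standard `xml.dom.minidom`; the `ListFacet` class, its methods, and the `<list>`/`<item>` vocabulary are hypothetical.

```python
from xml.dom import minidom


class Facet:
    """Wraps a standard DOM node and offers extra, node-specific operations."""

    def __init__(self, node):
        self.node = node


class ListFacet(Facet):
    """Hypothetical Facet for a <list> element of some vocabulary."""

    def append_item(self, text):
        doc = self.node.ownerDocument
        item = doc.createElement("item")
        item.appendChild(doc.createTextNode(text))
        self.node.appendChild(item)
        return item

    def item_texts(self):
        return [
            child.firstChild.data if child.firstChild else ""
            for child in self.node.childNodes
            if child.nodeType == child.ELEMENT_NODE and child.tagName == "item"
        ]


document = minidom.parseString("<list><item>one</item></list>")
facet = ListFacet(document.documentElement)
facet.append_item("two")
print(facet.item_texts())
```

- The underlying node remains an ordinary DOM node, so code that knows only the standard DOM interface continues to work on the same tree.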
- a vocabulary is a set of tags (for example, XML tags) belonging to a namespace.
- a namespace has a set of unique names (here, tags).
- A vocabulary appears as a subtree of the DOM tree that represents the XML document. This subtree contains a Zone.
- the boundaries of a tag set are defined by Zones.
- Zone 209 is generated using a service called ZoneFactory 205. As described above, Zone 209 is an internal representation of a part of the DOM tree that represents a document. A logical representation is required to provide access to such a part of the document. This logical representation informs the computer how the document is logically represented on the screen.
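- One way to picture how vocabularies appear as subtrees is to look for the elements at which the namespace changes. The sketch below assumes, purely for illustration, that a new Zone begins wherever an element's namespace differs from its parent's; the text does not state how ZoneFactory 205 actually determines Zone boundaries. It uses Python's standard `xml.etree.ElementTree`, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical compound document mixing the XHTML and SVG vocabularies.
SAMPLE = """
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>text</p>
    <svg xmlns="http://www.w3.org/2000/svg"><circle r="5"/></svg>
  </body>
</html>
"""


def namespace_of(element):
    """Return the namespace URI from an ElementTree tag of the form '{uri}local'."""
    return element.tag[1:].split("}")[0] if element.tag.startswith("{") else ""


def zone_roots(element, parent_ns=None, found=None):
    """Collect (namespace, local name) for each element whose namespace differs from its parent's."""
    if found is None:
        found = []
    ns = namespace_of(element)
    if ns != parent_ns:
        found.append((ns, element.tag.split("}")[-1]))
    for child in element:
        zone_roots(child, ns, found)
    return found


zones = zone_roots(ET.fromstring(SAMPLE))
for ns, name in zones:
    print(name, ns)
```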
- Canvas 210 is a service that acts to provide a logical layout corresponding to the Zone.
- the Pane 211 is a physical screen layout corresponding to the logical layout provided by the Canvas 210.
- the user sees only the rendering of the document with text and images on the display screen. Therefore, the document must be drawn on the screen, by the process of drawing characters and images on the screen.
- the document is rendered on the screen by Canvas 210 based on the physical layout provided by Pane 211.
- a Canvas 210 corresponding to Zone 209 is generated using Editlet 206.
- the DOM of the document is edited using Editlet 206 and Canvas 210.
- Editlet 206 and Canvas 210 use the Facets corresponding to one or more Nodes in Zone 209. These services do not operate directly on the Zone or on Nodes in the DOM. Facets are operated using Command 207.
- a user generally interacts with the screen by moving a cursor on the screen or typing a command.
- the Canvas 210 that provides a logical layout on the screen accepts this cursor operation.
- Canvas 210 causes the Facet to execute the corresponding action.
- the cursor subsystem 204 functions as the controller (C) of the MVC paradigm with respect to DocumentManager 1081.
- Canvas 210 also has the task of handling events. For example, Canvas 210 handles events such as mouse clicks, focus movements, and similar actions triggered by the user.
- Documents in a document processing system can be viewed from at least four perspectives.
- Zone, Facet, Canvas, and Pane represent the components of the document processing system that correspond to the above four viewpoints.
- the undo subsystem 212 implements the undoable component of the document manager. UndoManager (undo manager: undo management unit) 2121 holds all operations on documents that can be canceled by the user.
- the undo subsystem 212 supports such operations.
- the UndoManager 2121 holds the operation of such an UndoableEdit (undoable edit) 2122.
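- The way an UndoManager holds UndoableEdits can be sketched as a stack of reversible operations. The class names follow the text, but the methods `record` and `undo`, the constructor arguments, and the word-list "document" are assumptions made for illustration.

```python
class UndoableEdit:
    """One reversible operation: remembers how to cancel its own effect."""

    def __init__(self, description, undo_action):
        self.description = description
        self.undo_action = undo_action


class UndoManager:
    """Holds the edits that the user may still cancel, most recent last."""

    def __init__(self):
        self._edits = []

    def record(self, edit):
        self._edits.append(edit)

    def undo(self):
        """Cancel the most recent edit; return its description, or None if there is none."""
        if not self._edits:
            return None
        edit = self._edits.pop()
        edit.undo_action()
        return edit.description


words = ["teh", "cat"]
manager = UndoManager()


def replace_word(index, new_word):
    old_word = words[index]
    words[index] = new_word
    # Record how to restore the previous word.
    manager.record(UndoableEdit(
        f"replace {old_word!r} with {new_word!r}",
        lambda: words.__setitem__(index, old_word),
    ))


replace_word(0, "the")
after_edit = list(words)
undone = manager.undo()
print(after_edit, words, undone)
```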
- the controller part of MVC may be equipped with the cursor subsystem 204.
- the cursor subsystem 204 receives input from the user. These inputs generally have the character of commands and/or editing operations. Therefore, the cursor subsystem 204 can be thought of as the controller (C) part of the MVC paradigm associated with DocumentManager 1081.
- Canvas 210 represents a logical layout of a document to be presented on the screen.
- Canvas 210 may include a box tree 208 that logically represents how the document will look on the screen. This box tree 208 is included in the view (V) part of the MVC paradigm associated with DocumentManager 1081.
- the document processing system aims to provide an environment in which an XML document can be handled by mapping it to another representation and in which, when the mapped representation is edited, the edits are reflected in the original XML document while consistency is maintained.
- a document described in a markup language, for example, an XML document, is created based on a vocabulary defined by a document type definition.
- a vocabulary is a set of tags. Since a vocabulary may be defined arbitrarily, there can be an infinite number of vocabularies. However, it is not practical to provide a dedicated processing/management environment for each of the many possible vocabularies. Vocabulary connection provides a way to solve this problem.
- a document may be described in two or more markup languages.
- Documents may be written in, for example, XHTML (eXtensible HyperText Markup Language), SVG (Scalable Vector Graphics), MathML (Mathematical Markup Language), or other markup languages.
- a markup language may be regarded in the same way as a vocabulary or tag set in XML.
- a vocabulary is processed using a vocabulary plug-in.
- Documents written in a vocabulary for which no plug-in is available in the document processing system are displayed by mapping them to documents in another vocabulary for which a plug-in is available. With this feature, a document in a vocabulary that has no plug-in can still be displayed properly.
- a vocabulary connection includes the ability to obtain a definition file and map between two different vocabularies based on the obtained definition file.
- a document written in one vocabulary can be mapped to another vocabulary.
- the vocabulary connection enables a document to be displayed and edited by a display / editing plug-in corresponding to the vocabulary to which the document is mapped.
- each document is generally described in a document processing system as a DOM tree having a plurality of nodes.
- the “definition file” describes the correspondence between each node and other nodes. It is specified whether the element value and attribute value of each node can be edited. An arithmetic expression using the element value or attribute value of the node may be described.
- Using the feature of mapping a DOM tree by applying the definition file, a destination DOM tree is generated. In this way, the relationship between the source DOM tree and the destination DOM tree is constructed and maintained.
- the vocabulary connection monitors the correspondence between the source DOM tree and the destination DOM tree.
- Upon receiving an edit instruction from the user, the vocabulary connection changes the associated node in the source DOM tree. A “mutation event” is issued to indicate that the source DOM tree has changed, and the destination DOM tree is changed accordingly.
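- Generating a destination tree from a source tree by applying a definition can be sketched as follows. The sketch reduces the "definition file" to a hypothetical dictionary that maps each source tag to a destination tag; a real definition file also describes editability and arithmetic expressions, which are omitted here. It assumes Python's standard `xml.etree.ElementTree`, and the `report` vocabulary is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical definition: which destination tag each source tag maps to.
DEFINITION = {"report": "html", "heading": "h1", "para": "p"}


def map_tree(source, definition):
    """Build a destination tree by applying the definition to each source node."""
    destination = ET.Element(definition[source.tag])
    destination.text = source.text
    for child in source:
        destination.append(map_tree(child, definition))
    return destination


source = ET.fromstring("<report><heading>Sales</heading><para>Up 3%</para></report>")
destination = map_tree(source, DEFINITION)
print(ET.tostring(destination, encoding="unicode"))
```

- Because the destination uses tags from a vocabulary that already has a display plug-in, the source document can be displayed without a dedicated plug-in of its own.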
- the vocabulary connection subsystem that is a part of the document processing system provides a function that enables a plurality of expressions of a document.
- FIG. 13 shows a Vocabulary Connection (VC) subsystem 300.
- the VC subsystem 300 provides a way to maintain the consistency of two alternative representations of the same document.
- the two representations may be representations of the same document in two different vocabularies.
- one may be the source DOM tree and the other may be the destination DOM tree.
- the functions of the vocabulary connection subsystem 300 are realized in the document processing system using a plug-in called VocabularyConnection 301.
- For each Vocabulary 305 in which the document is represented, a corresponding plug-in is required. For example, if a part of a document is written in HTML and the rest is written in SVG, vocabulary plug-ins corresponding to HTML and SVG are required.
- the VocabularyConnection plug-in 301 generates an appropriate VCCanvas 310 for the Zone 209 or Pane 211 corresponding to the appropriate Vocabulary 305 document.
- Using VocabularyConnection 301, changes to Zone 209 in the source DOM tree are transferred to the corresponding Zone in another DOM tree 306 according to conversion rules.
- the conversion rules are described in a vocabulary connection descriptor (Vocabulary Connection Descriptor: VCD).
- Connector 304 connects a source node of the source DOM tree to a destination node of the destination DOM tree. Connector 304 watches for modifications (changes) to the source node in the source DOM tree and to the source document corresponding to the source node, and then modifies the corresponding node of the destination DOM tree. Connector 304 is the only object that can modify the destination DOM tree. For example, the user can make modifications only to the source document and the corresponding source DOM tree. Connector 304 then makes the corresponding modifications to the destination DOM tree.
- the connectors 304 are logically linked to form a tree structure.
- the tree formed by the connector 304 is called ConnectorTree (connector tree).
- Connector 304 is generated using a service called ConnectorFactory (connector factory: connector generation unit) 303.
- ConnectorFactory 303 generates Connectors 304 from the source document and links them to form a ConnectorTree.
- VocabularyConnectionManager 302 holds ConnectorFactory 303.
- a vocabulary is a set of tags in a namespace.
- Vocabulary 305 is generated for a document by VocabularyConnection 301. This is done by parsing the document file and generating an appropriate VocabularyConnectionManager 302 for mapping between the source DOM and the destination DOM. In addition, appropriate associations are created among ConnectorFactory 303, which generates Connectors; ZoneFactory 205, which generates Zone 209; and Editlet 206, which generates a Canvas corresponding to the nodes in the Zone.
- the corresponding VocabularyConnectionManager 302 is deleted.
- Vocabulary 305 generates VCCanvas 310. Further, a Connector 304 and a destination DOM tree 306 are generated correspondingly.
- the source DOM and Canvas correspond to the model (M) and the view (V), respectively. However, such a representation is only meaningful if the target vocabulary can be drawn on the screen. The drawing is done by a vocabulary plug-in.
- Vocabulary plug-ins are provided for major vocabularies such as XHTML, SVG, and MathML. A vocabulary plug-in is used in conjunction with the target vocabulary. These provide a way to map between vocabularies using vocabulary connection descriptors.
- mapping is meaningful only when the target vocabulary can be mapped and its method of drawing on the screen is predefined.
- Such rendering methods, for example that of XHTML, are standards defined by organizations such as the W3C.
- VCCanvas is used when a vocabulary connection is required.
- In that case, a Canvas for the source is not generated, because a view of the source cannot be generated directly.
- VCCanvas is generated using ConnectorTree. This VCCanvas only handles event conversion and does not assist in rendering the document on the screen.
- the purpose of the vocabulary connection subsystem is to simultaneously generate and maintain two representations of the same document.
- the second representation is also in the form of a DOM tree, which has already been described as a destination DOM tree. DestinationZone, Canvas and Pane are needed to see the document in the second representation.
- When VCCanvas is created, a corresponding DestinationPane 307 is created. In addition, an associated DestinationCanvas 308 and a corresponding BoxTree 309 are generated. Similarly, VCCanvas 310 is associated with Pane 211 and Zone 209 for the source document.
- DestinationCanvas 308 provides a logical layout of the document in the second representation.
- DestinationCanvas 308 provides user interface features such as cursors and selections to depict documents in the destination representation. Events that occur in Destination Canvas 308 are supplied to the Connector.
- DestinationCanvas 308 notifies Connector 304 of mouse events, keyboard events, drag-and-drop events, and events specific to the vocabulary of the destination (second) representation of the document.
- the vocabulary connection command subsystem 313 generates a VCCommand (vocabulary connection command) 315 that is used to execute instructions related to the vocabulary connection subsystem 300.
- the VCCommand can be generated using the built-in CommandTemplate 318 and/or by generating commands from scratch using the script language in the script subsystem 314.
- the command templates include, for example, an “If” command template, a “When” command template, and an “Insert” command template. These templates are used to create VCCommands.
- the XPath subsystem 316 is an important component of the document processing system and supports the implementation of the vocabulary connection.
- Connector 304 generally contains XPath information. As mentioned above, one of the tasks of the vocabulary connection is to reflect changes in the source DOM tree in the destination DOM tree.
- the XPath information contains one or more XPath expressions that are used to determine the subset of the source DOM tree that should be monitored for changes/modifications.
- the source DOM tree is a DOM tree or Zone that represents a document in a vocabulary before being converted to another vocabulary.
- the node of the source DOM tree is called the source node.
- the destination DOM tree is a DOM tree or Zone that represents the same document in a different vocabulary after conversion by mapping, as described above in connection with the vocabulary connection.
- the node of the destination DOM tree is called the destination node.
- ConnectorTree is a hierarchical expression based on a Connector representing the correspondence between a source node and a destination node. The Connector monitors the source node and modifications made to the source document and modifies the destination DOM tree. The Connector is the only object that is allowed to modify the destination DOM tree.
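- The role of a Connector, which monitors a source node and is the only object that writes to the corresponding destination node, can be sketched as an observer. The `Node` class, its listener list, and the `transform` parameter are hypothetical simplifications; a real Connector works on DOM trees and uses XPath expressions to decide which source nodes to monitor.

```python
class Node:
    """Minimal tree node that reports changes to registered listeners."""

    def __init__(self, name, value=""):
        self.name = name
        self.value = value
        self.listeners = []

    def set_value(self, value):
        self.value = value
        for listener in self.listeners:
            listener(self)  # plays the role of a mutation event


class Connector:
    """Links one source node to one destination node and keeps them in step."""

    def __init__(self, source, destination, transform=str):
        self.source = source
        self.destination = destination
        self.transform = transform
        source.listeners.append(self.on_source_changed)
        self.on_source_changed(source)  # initial synchronisation

    def on_source_changed(self, source):
        # Only the Connector writes to the destination node.
        self.destination.value = self.transform(source.value)


price = Node("price", "100")
cell = Node("td")
connector = Connector(price, cell, transform=lambda v: f"{v} yen")
initial = cell.value
price.set_value("120")
print(initial, "->", cell.value)
```

- User edits are applied to the source node only; the destination follows through the Connector, which keeps the two representations consistent.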
- An event is a method for describing and executing a user action executed on a program.
- Previously, programs had to actively gather information in order to understand user actions and carry them out. This means, for example, that after a program initializes itself, it enters a loop that repeatedly checks for user actions so that it can respond when the user acts on the screen, keyboard, mouse, and so on.
- This process is cumbersome.
- It also requires the program to keep looping, consuming CPU cycles, while it waits for the user to do something.
- The document processing system defines and uses its own events and its own ways of handling them.
- A mouse event is an event that arises from a user's mouse action. User actions involving the mouse are passed to the mouse event by Canvas 210.
- The Canvas can be said to be at the forefront of the system's interaction with the user. If necessary, the front-most Canvas passes event-related content to its children.
- A keystroke event flows from Canvas 210. Keystroke events have immediate focus; that is, they relate to the work being done at that moment.
- A keystroke event input on Canvas 210 is passed to its parent. Keystrokes are handled by a separate event that can handle string insertion. The event that handles string input occurs when a character is entered using the keyboard. Other events include, for example, drag events, drop events, and other events that are handled in the same way as mouse events.
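The contrast with a polling loop can be illustrated with a small callback-based dispatcher. This is only a sketch under stated assumptions: the `Canvas` class, its `on` and `dispatch` methods, and the rule "hand the event to the parent when no local handler exists" are hypothetical simplifications of the event flow described above.

```python
from collections import defaultdict


class Canvas:
    """Hypothetical canvas that dispatches events to registered handlers."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.handlers = defaultdict(list)

    def on(self, event_type, handler):
        """Register a handler; no loop has to poll for user actions."""
        self.handlers[event_type].append(handler)

    def dispatch(self, event_type, payload):
        """Run local handlers; if none exist, pass the event to the parent."""
        if self.handlers[event_type]:
            return [handler(payload) for handler in self.handlers[event_type]]
        if self.parent is not None:
            return self.parent.dispatch(event_type, payload)
        return []


root = Canvas("root")
child = Canvas("child", parent=root)
root.on("keystroke", lambda key: f"root inserted {key!r}")
child.on("mouse", lambda pos: f"child clicked at {pos}")

results = child.dispatch("keystroke", "a") + child.dispatch("mouse", (3, 4))
print(results)
```

The program does work only when an event arrives, which is the advantage over repeatedly checking for user actions.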
- XHTMLCanvas 1106, an example of a DestinationCanvas, receives events such as mouse events, keyboard events, drag-and-drop events, and events specific to the vocabulary. These events are notified to the Connector 304. More specifically, as shown in FIG. 21(b), the event flow in the VocabularyConnection plug-in 301 passes through SourcePane 1103, VCCanvas 1104, DestinationPane 1105, DestinationCanvas 1106, the destination DOM tree, and the ConnectorTree.
- ProgramInvoker 103 is the basic program in the execution environment and is executed to start the document processing system. As shown in FIG. 11(b) and FIG. 11(c), UserApplication 106, ServiceBroker 1041, CommandInvoker 1051, and Resource 109 are all connected to ProgramInvoker 103. As described above, the application 102 is a component that is executed in the execution environment. Similarly, ServiceBroker 1041 manages the plug-ins that add various functions to the system. CommandInvoker 1051, on the other hand, executes instructions provided by the user and holds the classes and functions used to execute the commands.
- ServiceBroker 1041 will be described in more detail with reference to FIG. 14(b). As described above, ServiceBroker 1041 manages the plug-ins (and related services) that add various functions to the system.
- Service 1042 is the lowest layer at which features can be added to or changed in the document processing system. A Service consists of two parts, ServiceCategory 401 and ServiceProvider 402. As shown in FIG. 14(c), one ServiceCategory 401 can have a plurality of related ServiceProviders 402. ServiceCategory 401 defines the type of Service, while each ServiceProvider implements some or all of a specific ServiceCategory.
- Services can be classified into three types: 1) a “feature service” that provides a specific feature to the document processing system, 2) an “application service” that is an application executed by the document processing system, and 3) an “environment service” that provides features required throughout the document processing system.
- An example of a Service is shown in FIG. 14(d).
- In the application Category, the system utility is an example of a corresponding ServiceProvider.
- Similarly, Editlet 206 is a Category, and HTMLEditlet and SVGEditlet are the corresponding ServiceProviders.
- ZoneFactory 205 is another Category of Service and has a corresponding ServiceProvider (not shown).
- A plug-in, already described as adding functionality to the document processing system, may be regarded as a unit consisting of several ServiceProviders 402 and their associated classes. Each plug-in has its dependencies and its ServiceCategory 401 described in a declaration file.
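The category/provider relationship can be sketched as a simple registry. The `ServiceBroker` class below, its method names, and the dictionary-shaped plug-in declaration are all hypothetical; the text does not specify the declaration file format, so the structure used here is an assumption made for illustration.

```python
from collections import defaultdict


class ServiceBroker:
    """Hypothetical registry mapping a service category to its providers."""

    def __init__(self):
        self._providers = defaultdict(list)

    def load_plugin(self, declaration):
        """Register every provider listed in a plug-in declaration."""
        for category, provider in declaration["providers"]:
            self._providers[category].append(provider)

    def providers_for(self, category):
        """Return the providers registered for one category."""
        return list(self._providers.get(category, []))


broker = ServiceBroker()
broker.load_plugin({
    "name": "editlets",  # illustrative plug-in name, not from the text
    "providers": [("Editlet", "HTMLEditlet"), ("Editlet", "SVGEditlet")],
})
print(broker.providers_for("Editlet"))
```

One category holds several providers, which matches the statement that one ServiceCategory can have a plurality of related ServiceProviders.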
- FIG. 14(e) shows further details of the relationship between ProgramInvoker 103 and UserApplication 106.
- The necessary documents and data are loaded from storage. All necessary plug-ins are loaded onto ServiceBroker 1041.
- ServiceBroker 1041 holds and manages all plug-ins. Plug-ins can be physically added to the system, or their functionality can be loaded from storage. When the content of a plug-in is loaded, ServiceBroker 1041 defines the corresponding plug-in. Next, the corresponding UserApplication 106 is generated, loaded into the execution environment 101, and attached to ProgramInvoker 103.
- Command 1052 is a command used in the document processing system to process a document such as an XML document and to edit the corresponding XML DOM tree.
- CommandInvoker 1051 holds the classes and functions necessary for executing Command 1052.
- ServiceBroker 1041 is also executed in ProgramInvoker 103.
- UserApplication 106 is connected to the user interface 107 and CoreComponent 110.
- CoreComponent 110 provides a way to share documents among all Panes.
- CoreComponent 110 also provides fonts and serves as a toolkit for the Panes.
- FIG. 15(b) shows the relationship between Frame 1071, MenuBar 1072, and StatusBar 1073.
- FIG. 16(a) provides further explanation of the application core 108, which holds all documents, parts of documents, and data belonging to the documents.
- CoreComponent 110 is attached to DocumentManager 1081, which manages the documents 1082.
- DocumentManager 1081 is the owner of all documents 1082 stored in the memory associated with the document processing system.
- DocumentManager 1081 is also connected to RootPane 1084 to facilitate display of the documents on the screen.
- The functions of ClipBoard 1087, SnapShot 1088, Drag & Drop 601, and Overlay 602 are also attached to CoreComponent 110.
- SnapShot 1088 is used to restore the application state. When the user starts SnapShot 1088, the current state of the application is detected and stored. Then, when the application changes to another state, the contents of the stored state are retained. SnapShot 1088 is illustrated in FIG. 16(b). In operation, when the application moves from one URL to another, SnapShot 1088 remembers the previous state so that back and forward operations can be carried out seamlessly.
- FIG. 17(a) provides further explanation of DocumentManager 1081 and of how documents are organized and maintained in DocumentManager.
- DocumentManager 1081 manages the documents 1082.
- One of the plurality of documents is the RootDocument (root document) 701.
- The remaining documents are SubDocuments (subdocuments) 702.
- DocumentManager 1081 is connected to RootDocument 701.
- RootDocument 701 is connected to all SubDocuments 702.
- DocumentManager 1081 is coupled to DocumentContainer 203, which is an object that manages all documents 1082.
- DOMService 703, a tool that forms part of toolkit 201 (for example, an XML toolkit), generates a DOM tree based on a document managed by DocumentManager 1081.
- Each Document 705 is managed by the corresponding DocumentContainer 203, regardless of whether it is a RootDocument 701 or a SubDocument 702.
- FIG. 17(b) shows how documents A to E are arranged hierarchically.
- Document A is the RootDocument.
- Documents B to D are SubDocuments of Document A.
- Document E is a SubDocument of Document D.
- The left side of FIG. 17(b) shows an example in which the same document hierarchy is displayed on the screen.
- Document A, which is the RootDocument, is displayed as the basic frame.
- Documents B to D, which are SubDocuments of Document A, are displayed as subframes within basic frame A.
- Document E, which is a SubDocument of Document D, is displayed on the screen as a subframe of subframe D.
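The hierarchy of documents A to E, and its mapping onto nested frames, can be written down as a small tree. The `Document` class and the `frame_outline` function are hypothetical names used only for this sketch; indentation stands in for frame nesting on the screen.

```python
class Document:
    """Hypothetical document node; a document without a parent is the root."""

    def __init__(self, name, parent=None):
        self.name = name
        self.subdocuments = []
        if parent is not None:
            parent.subdocuments.append(self)


def frame_outline(document, depth=0):
    """Return one indented line per document, mirroring nested frames."""
    lines = ["  " * depth + document.name]
    for sub in document.subdocuments:
        lines.extend(frame_outline(sub, depth + 1))
    return lines


a = Document("A")                                      # RootDocument
b, c, d = (Document(name, parent=a) for name in "BCD")  # SubDocuments of A
e = Document("E", parent=d)                            # SubDocument of D
print("\n".join(frame_outline(a)))
```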
- An UndoManager (undo manager) 706 and an UndoWrapper (undo wrapper) 707 are generated for each DocumentContainer 203. UndoManager 706 and UndoWrapper 707 are used to execute undoable commands. With this feature, changes made to a document by editing operations can be undone. Changes to a SubDocument are also closely related to the RootDocument. The undo operation takes into account changes that affect other documents in the hierarchy, which guarantees, for example, that consistency is maintained among all documents in a chained hierarchy such as the one shown in FIG. 17(b).
- UndoWrapper 707 wraps the undo objects related to the SubDocuments in DocumentContainer 203 and binds them to the undo object related to the RootDocument.
- UndoWrapper 707 collects the undo objects that can be used by UndoableEditAcceptor (undoable edit acceptor) 709.
- UndoManager 706 and UndoWrapper 707 are connected to UndoableEditAcceptor 709 and UndoableEditSource (undoable edit source) 708.
- Document 705 may itself be the UndoableEditSource 708, that is, the source of undoable edit objects.
- FIGS. 18(a) and 18(b) provide further details of the undo framework and undo commands.
- UndoCommand 801, RedoCommand 802, and UndoableEditCommand 803 are commands that can be queued in CommandInvoker 1051, as shown in FIG. 11(b), and are executed in order.
- UndoableEditCommand 803 is further attached to UndoableEditSource 708 and UndoableEditAcceptor 709.
- The “foo” EditCommand 805 is an example of an UndoableEditCommand.
- FIG. 18(b) shows the execution of an UndoableEditCommand.
- UndoableEditAcceptor 709 is attached to UndoableEditSource 708, which is the DOM tree of Document 705.
- Document 705 is edited using the DOM API.
- In the third step S3, a mutation event listener is notified that a change has been made. That is, in this step, the listener that monitors all changes in the DOM tree detects the editing operation.
- The UndoableEdit is stored as an object of UndoManager 706.
- UndoableEditAcceptor 709 is detached from UndoableEditSource 708.
- UndoableEditSource 708 can be Document 705 itself.
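The attach, edit, notify, store, and detach sequence can be sketched as follows. This is a simplified illustration, not the system's undo framework: `UndoManager`, `EditSource`, and `run_undoable` are hypothetical names, a plain dictionary stands in for the DOM tree, and each undoable edit is represented by a closure that reverses one change.

```python
class UndoManager:
    """Hypothetical undo manager holding a stack of undoable edits."""

    def __init__(self):
        self.edits = []

    def accept(self, undo_action):
        self.edits.append(undo_action)

    def undo(self):
        if self.edits:
            self.edits.pop()()


class EditSource:
    """Hypothetical edit source: a dict standing in for a DOM tree."""

    def __init__(self):
        self.data = {}
        self.acceptor = None  # attached only while a command runs

    def set(self, key, value):
        missing = object()
        old = self.data.get(key, missing)
        self.data[key] = value
        if self.acceptor is not None:
            # Mutation listener: record how to reverse this change.
            if old is missing:
                self.acceptor.accept(lambda: self.data.pop(key))
            else:
                self.acceptor.accept(lambda: self.data.__setitem__(key, old))


def run_undoable(source, manager, key, value):
    source.acceptor = manager      # attach the acceptor to the source
    try:
        source.set(key, value)     # edit through the source's API
    finally:
        source.acceptor = None     # detach the acceptor again


doc = EditSource()
manager = UndoManager()
run_undoable(doc, manager, "title", "Draft")
run_undoable(doc, manager, "title", "Final")
manager.undo()
print(doc.data)
```

Because the acceptor is attached only for the duration of a command, edits made outside `run_undoable` are not recorded in this sketch.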
- FIG. 19(a) shows an overview of how a document is loaded into the document processing system. Each step is described in detail in relation to a specific example in FIGS. 24 to 28.
- The document processing system generates a DOM from a binary data stream consisting of the data contained in a document.
- An ApexNode (apex node) is generated for the part of the document that is the target of attention and belongs to a Zone.
- The corresponding Pane is identified.
- The identified Pane creates a Zone and a Canvas from the ApexNode and the physical screen surface.
- The Zone then creates a Facet for each node and provides the information the Facets need.
- The Canvas generates a data structure for rendering the nodes from the DOM tree.
- The document is loaded from storage 901.
- A DOM tree 902 of the document is generated.
- A corresponding DocumentContainer 903 is generated to hold the document.
- DocumentContainer 903 is attached to DocumentManager 904.
- The DOM tree includes a root node and, in some cases, multiple secondary nodes.
- The DOM tree may have, for example, an SVG subtree as well as an XHTML subtree.
- The XHTML subtree has XHTML ApexNode 905.
- The SVG subtree has SVG ApexNode 906.
- In step 1, ApexNode 906 is attached to Pane 907, which is the logical layout of the screen.
- Pane 907 requests a ZoneFactory for ApexNode 906 from the CoreComponent, which is the PaneOwner (pane owner) 908.
- PaneOwner 908 returns a ZoneFactory and an Editlet, which is a CanvasFactory, for ApexNode 906.
- In step 4, Pane 907 generates Zone 909. Zone 909 is attached to Pane 907.
- Zone 909 generates a Facet for each node and attaches it to the corresponding node.
- Pane 907 generates Canvas 910.
- Canvas 910 is attached to Pane 907.
- Canvas 910 includes various Commands.
- Canvas 910 builds a data structure for rendering the document on the screen. For XHTML, this includes a box tree structure.
- FIG. 19(b) shows an overview of the Zone configuration using the MVC paradigm.
- The model (M) includes the Zone and the Facets.
- The view (V) corresponds to the Canvas and the data structure. Because Commands execute control operations on the document and its various relationships, the control (C) corresponds to the Commands included in the Canvas.
- The document used in this example contains both text and images.
- Text is represented using XHTML, and images are represented using SVG.
- FIG. 20 details the MVC representation of the relationship between the document components and the corresponding objects.
- Document 1001 is attached to DocumentContainer 1002, which holds Document 1001.
- The document is represented by the DOM tree 1003.
- The DOM tree includes ApexNode 1004.
- An ApexNode is represented by a black circle. Nodes that are not apex nodes are represented by white circles. A Facet used to edit a node is represented by a triangle and is attached to the corresponding node. Because the document has text and images, the DOM tree of this document contains an XHTML part and an SVG part.
- ApexNode 1004 is the top node of the XHTML subtree. It is attached to XHTMLPane 1005, the top Pane for the physical representation of the XHTML part of the document. ApexNode 1004 is also attached to XHTMLZone 1006, which is part of the document's DOM tree.
- The Facet corresponding to Node 1004 is also attached to XHTMLZone 1006.
- XHTMLZone 1006 is attached to XHTMLPane 1005.
- XHTMLEditlet generates XHTMLCanvas 1007, which is a logical representation of the document.
- XHTMLCanvas 1007 is attached to XHTMLPane 1005.
- XHTMLCanvas 1007 creates BoxTree 1009 for the XHTML component of Document 1001.
- The various Commands 1008 required to hold and render the XHTML part of the document are also added to XHTMLCanvas 1007.
- SVGZone 1011 is the part of Document 1001's DOM tree that represents the SVG component of the document.
- ApexNode 1010 is attached to SVGPane 1013, the top Pane in the physical representation of the SVG part of the document.
- SVGCanvas 1012, the logical representation of the SVG part of the document, is generated by SVGEditlet and attached to SVGPane 1013.
- Data structures and commands for rendering the SVG portion of the document on the screen are attached to the SVGCanvas.
- The data structure may include circles, lines, rectangles, and so on, as shown.
- FIG. 21(a) shows a simplified MV relationship in the XHTML component of Document 1001.
- The model is XHTMLZone 1101 for the XHTML component of Document 1001.
- The tree of the XHTMLZone includes several Nodes and their corresponding Facets.
- The corresponding XHTMLZone and Pane are part of the model (M) of the MVC paradigm.
- The view (V) of the MVC paradigm is the corresponding XHTMLCanvas 1102 and BoxTree of the XHTML component of Document 1001.
- The XHTML portion of the document is rendered on the screen using the Canvas and the Commands it contains. Events such as keyboard and mouse input proceed in the reverse direction, as shown.
- The SourcePane has an additional function: it serves as a DOM holder.
- FIG. 21(b) shows the vocabulary connection for the components of Document 1001 shown in FIG. 21(a).
- SourcePane 1103, which acts as a DOM holder, contains the document's source DOM tree.
- The ConnectorTree is created by the ConnectorFactory and creates DestinationPane 1105, which also functions as the destination DOM holder.
- DestinationPane 1105 is laid out in a box-like format as XHTMLDestinationCanvas 1106.
- FIGS. 22(a) to 22(c) show further details of the plug-in subsystem, the vocabulary connection, and the Connector, respectively.
- The plug-in subsystem is used to add or replace functionality in the document processing system.
- The plug-in subsystem includes ServiceBroker 1041.
- ZoneFactoryService 1201, which is attached to ServiceBroker 1041, generates a Zone for a part of a document.
- EditletService 1202 is also attached to ServiceBroker 1041.
- EditletService 1202 generates a Canvas corresponding to a Node in the Zone.
- Examples of ZoneFactory are XHTMLZoneFactory 1211 and SVGZoneFactory 1212, which generate XHTMLZone and SVGZone, respectively.
- The text component of the document may be represented by generating an XHTMLZone, and the image may be represented using an SVGZone.
- Examples of EditletService include XHTMLEditlet 1221 and SVGEditlet 1222.
- FIG. 22(b) shows further details of the vocabulary connection.
- The vocabulary connection is an important feature of the document processing system and enables a document to be represented and displayed consistently in two different ways.
- VCManager 302, which holds ConnectorFactory 303, is part of the vocabulary connection subsystem.
- ConnectorFactory 303 generates a Connector 304 for the document.
- The Connector monitors the nodes in the source DOM and modifies the nodes in the destination DOM to maintain consistency between the two representations.
- Template 317 represents conversion rules for several nodes.
- A vocabulary connection descriptor (VCD) file is a list of Templates, each representing a rule that transforms an element or set of elements satisfying a particular path or rule into another element.
- Template 317 and CommandTemplate 318 are all attached to VCManager 302.
- VCManager is an object that manages all sections in a VCD file. One VCManager object is created for each VCD file.
- FIG. 22(c) provides further details of the Connector.
- ConnectorFactory 303 generates Connectors from the source document.
- ConnectorFactory 303 is attached to Vocabulary, Template, and ElementTemplate, and generates VocabularyConnector, TemplateConnector, and ElementConnector, respectively.
- VCManager 302 holds ConnectorFactory 303.
- The corresponding VCD file is read in order to generate the Vocabulary.
- ConnectorFactory 303 is then generated.
- This ConnectorFactory 303 is associated with the ZoneFactory that creates the Zone and with the Editlet that creates the Canvas.
- The VCCanvas also creates a Connector for the ApexNode in the source DOM tree or Zone. Child Connectors are generated recursively as needed. The ConnectorTree is created from the set of templates in the VCD file.
- A template is a collection of rules for converting elements of a markup language into other elements. For example, each template is matched against the source DOM tree or Zone. If it matches, an apex Connector is created. For example, the template “A/*/D” matches all branches that start with node A and end with node D, regardless of what nodes lie in between. Similarly, “//B” matches all “B” nodes from the root.
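Path-based matching of this kind can be tried out with the limited XPath support in Python's standard `xml.etree.ElementTree` module. The sample tree and its `id` attributes are invented for illustration. Note that in standard XPath a `*` step matches exactly one element level, so the sketch also shows the any-depth form `A//D`, which is closer to the behaviour the text describes for “A/*/D”; how the system itself interprets the pattern is not confirmed here.

```python
import xml.etree.ElementTree as ET

tree = ET.fromstring(
    "<root>"
    "<A><X><D id='1'/></X><D id='2'/></A>"
    "<B id='3'/><C><B id='4'/></C>"
    "</root>"
)

# "A/*/D": D elements exactly one level below a child of A (standard semantics).
one_level = [d.get("id") for d in tree.findall("A/*/D")]

# "A//D": D elements at any depth below A.
any_depth = [d.get("id") for d in tree.findall("A//D")]

# ".//B": every B element below the root, at any depth.
all_b = [b.get("id") for b in tree.findall(".//B")]

print(one_level, any_depth, all_b)
```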
- FIG. 23 shows an example of a VCD script that uses VCManager and the ConnectorFactoryTree for the “MySampleXML” file. It shows the vocabulary section and the template section in the script file, and the corresponding components in VCManager.
- In the vocabulary section, the tag is “vcd:vocabulary”, the attribute “match” is “sample:root”, “label” is “MySampleXML”, and “call-template” is “sample template”.
- The Vocabulary, in the VCManager for “MySampleXML”, includes “sample:root” as its apex element.
- The corresponding UI label is “MySampleXML”.
- In the template section, the tag is “vcd:template” and the name is “sample:template”.
- FIGS. 24 to 28 show in detail how the document “MySampleXML” is loaded.
- The document is loaded from the storage 1405.
- DOMService generates the DOM tree and the DocumentContainer 1401 corresponding to DocumentManager 1406.
- DocumentContainer 1401 is attached to DocumentManager 1406.
- The document contains XHTML and MySampleXML subtrees.
- XHTML ApexNode 1403 is the top node of XHTML with the tag “xhtml:html”.
- The ApexNode 1404 of “MySampleXML” is the top node of “MySampleXML” with the tag “sample:root”.
- In step 2, shown in FIG. 24(b), the RootPane generates the XHTMLZone, Facets, and Canvas of the document. Pane 1407, XHTMLZone 1408, XHTMLCanvas 1409, and BoxTree 1410 are generated corresponding to ApexNode 1403.
- In step 3, shown in FIG. 24(c), the XHTMLZone discovers a tag it does not know, “sample:root”, and a SubPane is generated from the XHTMLCanvas area.
- In step 4, shown in FIG. 25, the SubPane obtains a ZoneFactory that can handle “sample:root” and generate an appropriate Zone.
- This ZoneFactory is in a Vocabulary that can execute the ZoneFactory. The Vocabulary contains the contents of the VocabularySection of “MySampleXML”.
- In step 5, shown in FIG. 26, the Vocabulary corresponding to “MySampleXML” generates DefaultZone 1601. SubPane 1501 is provided so that a corresponding Editlet can be generated to create the corresponding Canvas. The Editlet generates a VCCanvas and then calls the TemplateSection, which includes the ConnectorFactoryTree. The ConnectorFactoryTree generates all the Connectors that make up the ConnectorTree.
- Each Connector creates a destination DOM object.
- Some of the Connectors contain xpath information.
- The xpath information contains one or more xpath expressions that determine the subset of the source DOM tree to be monitored for updates or modifications.
- In step 7, shown in FIG. 28, the Vocabulary creates a DestinationPane for the destination DOM tree from the Pane of the source DOM. This is done based on the SourcePane.
- The ApexNode in the destination tree is attached to the DestinationPane and the corresponding Zone.
- The DestinationPane is provided with its own Editlet, which creates a DestinationCanvas and builds the data structure and commands to render the document in the format of the destination.
- FIG. 29(a) shows the flow when an event occurs on a node that has no corresponding source node and exists only in the destination tree.
- Events acquired by the Canvas, such as mouse events and keyboard events, pass through the destination tree and are sent to the ElementTemplateConnector. Because the ElementTemplateConnector has no corresponding source node, the transmitted event is not an edit operation on a source node. If the transmitted event matches a command described in the CommandTemplate, the ElementTemplateConnector executes the corresponding action. If there is no matching command, the ElementTemplateConnector ignores the transmitted event.
- FIG. 29(b) shows the flow when an event occurs on a node of the destination tree that is associated with a source node by a TextOfConnector.
- The TextOfConnector obtains the text node from the node specified by an XPath in the source DOM tree and maps it to a node of the destination DOM tree.
- Events acquired by the Canvas, such as mouse events and keyboard events, pass through the destination tree and are transmitted to the TextOfConnector.
- The TextOfConnector maps the transmitted event to an edit command for the corresponding source node and places it in Queue 1053.
- An edit command is a set of DOM API calls that are executed via a Facet. When the queued command is executed, the source node is edited.
- The TextOfConnector reconstructs the destination tree so that changes in the source node are reflected in the corresponding destination node.
- ConnectorFactory re-evaluates this control statement, rebuilds the TextOfConnector, and then rebuilds the destination tree.
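The flow of FIG. 29(b), in which a destination-side event becomes a queued edit on the source node and the destination is then rebuilt, can be sketched as follows. The `TextOfConnector` class here is a hypothetical simplification: a `deque` stands in for the command queue, an ElementTree path stands in for the XPath, and direct element access stands in for edits made through a Facet.

```python
from collections import deque
import xml.etree.ElementTree as ET


class TextOfConnector:
    """Hypothetical connector tying a destination node to a source node's text."""

    def __init__(self, source_root, path, dest_node, queue):
        self.source_root = source_root
        self.path = path
        self.dest_node = dest_node
        self.queue = queue
        self.rebuild()

    def rebuild(self):
        """Reflect the current source text in the destination node."""
        self.dest_node.text = self.source_root.find(self.path).text

    def on_text_input(self, new_text):
        """Map a destination-side event to a queued edit on the source node."""
        def edit_command():
            self.source_root.find(self.path).text = new_text
            self.rebuild()
        self.queue.append(edit_command)


source = ET.fromstring("<student><name>Alice</name></student>")
cell = ET.Element("td")
queue = deque()
connector = TextOfConnector(source, "name", cell, queue)

connector.on_text_input("Bob")      # event arrives; nothing changes yet
text_before = cell.text
while queue:
    queue.popleft()()               # execute queued edit commands in order
print(text_before, cell.text)
```

The destination node changes only after the queued command has edited the source, so the source tree remains the single point of truth in this sketch.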
- The definition file generation unit 86 provides a UI for generating a definition file. Users can create definition files using this UI, and can create and edit XML documents using a vocabulary they have defined themselves.
- A new vocabulary is generated.
- An XML document generated using this definition file must describe the namespace URI of this vocabulary.
- Namespace URIs must not be duplicated. If a user owns an Internet domain, for example, a unique namespace URI can be formed by appending an appropriate string to the domain name. A user who does not own a domain, however, has difficulty assigning a unique URI. Therefore, the vocabulary server provides a service that issues a unique namespace URI in response to a user request.
- FIG. 30 shows the configuration of the vocabulary server 3400.
- The vocabulary server 3400 includes a search request reception unit 3410, a search unit 3412, a response unit 3414, a transmission unit 3416, an issue request reception unit 3420, a namespace URI issue unit 3422, a registration unit 3424, a VCD database 3430, and a VC information holding unit 3432.
- The search request reception unit 3410 receives a search request for a definition file from the user. Search requests may be received in natural language or as keywords indicating the purpose or function. When the search request reception unit 3410 receives a search request in natural language, it may generate keywords by breaking the sentence down into parts of speech and extracting the nouns. Synonyms may also be expanded so that synonymous terms are searched as well. Further, when keywords are received in Japanese, for example, they may be expanded using a bilingual dictionary such as a Japanese-English dictionary so that tag names written in a foreign language such as English are also hit.
- The VCD to be searched may be a VCD that is designed to be customized.
- A VCD that processes a huge vocabulary containing many elements may be divided into several element groups. It may also be divided into parts by function, such as a VCD describing a display/editing template, a VCD describing a UI, and a VCD describing document processing commands.
- The user can select a VCD having the desired function from each category.
- An explanation of the VCD may be written as a comment in the VCD file to be searched. An element for holding comments may be provided, and the comment written in that element.
- The comment may include, for example, a description of the schema of the tag set processed by the VCD, a description of the view, and a description of the functions.
- The description of the schema may be information indicating what structure or type of XML document is targeted, for example, a tag set that represents a list of items or a tag set that represents a map combining keys and values.
- The description of the view may be information indicating a display format such as a tabular view, a bulleted view, or a bar graph view.
- The description of the functions may be information indicating the functions provided by the VCD, such as a table aggregation function or a statistical analysis function. Comments near the template section of the VCD may be treated as comments about the view, and comments near the command section may be treated as comments about the functions.
- The search unit 3412 searches the VCD database 3430 based on the keywords received or generated by the search request reception unit 3410.
- The VCD database 3430 stores keywords, explanations, the namespace of the tag set that the definition file targets, and the element names, attribute names, and schemas of the elements included in the tag set.
- The search unit 3412 searches by any technique, such as Boolean search, vector search, clustering, or filtering, and scores the search results. Scoring may take into account the similarity of the structure indicated by the schema in addition to the similarity of the explanatory text.
- The response unit 3414 presents the candidates with the highest scores to the user.
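One simple way to realize keyword search with synonym expansion and scoring is sketched below. Everything in it is an assumption made for illustration: the `SYNONYMS` table, the database entries, the file names, and the scoring rule (count of overlapping terms) are invented, and the text leaves the actual search technique open (Boolean search, vector search, clustering, filtering, and so on).

```python
SYNONYMS = {"grade": {"score", "mark"}, "student": {"pupil"}}  # illustrative


def expand(keywords):
    """Add synonyms so that related terms are also searched."""
    expanded = set(keywords)
    for word in keywords:
        expanded |= SYNONYMS.get(word, set())
    return expanded


def search(database, keywords):
    """Score each entry by the number of matching terms, best first."""
    terms = expand(keywords)
    scored = [(len(terms & entry["terms"]), entry["name"]) for entry in database]
    return [name for score, name in sorted(scored, key=lambda s: -s[0]) if score > 0]


database = [
    {"name": "table-view.vcd", "terms": {"table", "row", "score"}},
    {"name": "bar-graph.vcd", "terms": {"graph", "bar"}},
    {"name": "grade-book.vcd", "terms": {"student", "grade", "score"}},
]
print(search(database, ["student", "grade"]))
```

A fuller version would also weigh the structural similarity of schemas, as the text suggests; this sketch scores on keyword overlap only.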
- Together with the VCD candidates extracted by the search, VCDs that provide additional functions usable with those VCDs may be displayed, as may VCDs derived from them.
- For example, when the natural-language sentence "manage student's grade" is input as a search key, the response unit 3414 presents the VCD shown in FIGS. 4(a) and 4(b) as a candidate.
- The response unit 3414 may also present as a candidate a combination of the VCD shown in FIGS. 4(a) and 4(b) and a VCD having a statistical processing function, which describes commands and UI logic for statistically processing tabular data.
- The user views the search results presented by the response unit 3414 and selects a definition file that matches the desired function and purpose.
- The transmission unit 3416 reads the definition file selected by the user from the VC information holding unit 3432 and transmits it.
- FIG. 31 shows the configuration of the document processing apparatus according to the first embodiment.
- The document processing apparatus 100 includes an acquisition unit 29 and a conversion code generation unit 71 in addition to the configuration of the document processing apparatus 20 of the base technology shown in FIG.
- The acquisition unit 29 acquires the definition file from the vocabulary server 3400.
- Using, for example, the definition file generation unit 86 of the document processing apparatus 20, the user generates a definition file that combines the desired functions, based on the definition file acquired from the vocabulary server 3400.
- The definition file can be customized by modifying the structure of the tag set processed by the acquired definition file, for example by deleting unnecessary elements, changing the display format, or changing elements to attributes.
- As one form of customization, a tag with a generic name may be renamed to a specific name. For example, the tag names “key” and “value” can be changed to the specific tag names “name” and “score”. Functions can also be customized, for example by adding and deleting commands. Logic such as commands and UI described in another definition file may also be incorporated.
- When the definition file is completed, the conversion code generation unit 71 generates a tool that converts an XML document created with the new definition file into an XML document that can be processed with the original definition file used as a component.
- This conversion tool may be described as a definition file template or in XSLT. If the specification of the tag set processed by the definition file is changed by editing the definition file in the definition file generation unit 86, an XML document generated using the new definition file can no longer be processed by the original definition file used as a component. Therefore, even if a useful application has been prepared for the tag set processed by the original definition file, it can no longer be used. However, the conversion code generation unit 71 generates code that converts a document generated using the new definition file into a format that can be processed by the original definition file, so the document can be converted to the original tag set and such applications can be used.
- The definition file generation unit 86 may treat each change to the definition file that alters the tag set specification, such as changing an element name or attribute name, changing an element to an attribute, changing an attribute to an element, or adding or deleting an element or attribute, as a command.
- The definition file generation unit 86 may change the corresponding part of the definition file and notify the conversion code generation unit 71 of the conversion code associated with the change.
- The conversion code generation unit 71 records the notified conversion code. That is, the conversion code generation unit 71 accumulates conversion codes that undo the changes to the tag set specification made in the definition file generation unit 86, as in an undo operation. Eventually, a conversion code is generated that traces the change history in the definition file generation unit 86 backwards and restores the original tag set format.
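The idea of accumulating reverse conversions like an undo history can be sketched for the simplest kind of change, a tag rename. The `ConversionRecorder` class and the `convert` function are hypothetical, only renames are handled (not element-to-attribute changes or additions and deletions), and the text itself suggests that a real conversion tool might instead be written as a definition file template or in XSLT.

```python
import xml.etree.ElementTree as ET


class ConversionRecorder:
    """Hypothetical recorder of tag renames made while editing a definition file."""

    def __init__(self):
        self.renames = []  # (old_name, new_name) in the order they were made

    def record_rename(self, old_name, new_name):
        self.renames.append((old_name, new_name))

    def to_original(self, tag):
        """Undo the recorded renames, newest first, for a single tag name."""
        for old_name, new_name in reversed(self.renames):
            if tag == new_name:
                tag = old_name
        return tag


def convert(document, recorder):
    """Rewrite every element tag back to the original tag set, in place."""
    for element in document.iter():
        element.tag = recorder.to_original(element.tag)
    return document


recorder = ConversionRecorder()
recorder.record_rename("key", "name")
recorder.record_rename("value", "score")
recorder.record_rename("score", "points")   # a later rename of the same tag

new_doc = ET.fromstring("<item><name>Alice</name><points>90</points></item>")
print(ET.tostring(convert(new_doc, recorder), encoding="unicode"))
```

Replaying the history newest-first is what lets a chain of renames (`value` to `score` to `points`) resolve back to the original tag name.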
- an XML document created with a new definition file is converted into a format that can be processed with a definition file that has become a part.
- Various prepared applications can be used. For example, in order to create a definition file for managing student grades, suppose that a definition file for displaying a table was obtained and the definition file was created based on it. At this time, for example, if an application that statistically processes the data of the XML document created by the table vocabulary is prepared, the XML document created by the grade management vocabulary is converted into an XML instance of the table vocabulary. By doing so, it is possible to statistically process student performance using existing applications.
- a conversion tool When an XML document created with a new definition file is opened using a definition file based on it, a conversion tool can be applied before generating DOM and automatically edited with the definition file based on it. You can generate D ⁇ M after converting it to an XML document. If the conversion tool has been prepared as a definition file, conversion may be performed first using a conversion definition file, and then processing may be performed using the base definition file.
- The namespace URI issuing unit 3422 appends a user ID to a domain name that it manages and issues a unique namespace URI. The version number of the definition file may be included in the namespace URI.
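- A minimal sketch of such an issuing step is shown below. The class name `NamespaceUriIssuer`, the domain `xmlns.example.com`, the extra `vocabulary` path segment, and the path layout are all assumptions made for illustration; the description only states that a user ID is appended to a managed domain name and that a version number may be included.

```python
from urllib.parse import quote


class NamespaceUriIssuer:
    """Hypothetical issuer that builds namespace URIs under one managed domain."""

    def __init__(self, domain):
        self.domain = domain
        self.issued = set()

    def issue(self, user_id, vocabulary, version=None):
        # Percent-encode each segment so it is safe inside a URI path.
        parts = [quote(user_id, safe=""), quote(vocabulary, safe="")]
        if version is not None:
            parts.append(quote(str(version), safe=""))
        uri = f"http://{self.domain}/" + "/".join(parts)
        # Refuse to hand out the same URI twice so each one stays unique.
        if uri in self.issued:
            raise ValueError(f"namespace URI already issued: {uri}")
        self.issued.add(uri)
        return uri


issuer = NamespaceUriIssuer("xmlns.example.com")
uri = issuer.issue("alice", "grades", version="1.0")
print(uri)
```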
- The registration unit 3424 also functions as a notification unit: it notifies the user of the namespace URI issued by the namespace URI issuing unit 3422 and registers the URI in the VCD database 3430.
- The registration unit 3424 acquires the definition file from the user and stores it in the VC information holding unit 3432.
- The VC information holding unit 3432 may give each user a directory for storing definition files, and the name of the directory in which a definition file is actually stored may be issued as the namespace URI.
- The registration unit 3424 may also acquire from the user files related to the vocabulary, such as a manual describing the specification of the definition file, a schema, and related information, and store them in the user's directory in the VC program holding unit 3432. The registered files may be transmitted upon request. The registration unit 3424 may also obtain from the user keywords that indicate the function and purpose of the definition file and register them in the VCD database 3430.
- The registration unit 3424 may extract element names, attribute names, command names, and the like from the definition file and register them in the VCD database 3430. Further, the registration unit 3424 may extract keywords from a manual acquired from the user and register them in the VCD database 3430.
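- The registration of keywords and extracted names, and a later search over them, can be sketched with a small in-memory index. The `VocabularyRegistry` class, its methods, and the sample URIs and terms are hypothetical; the actual structure of the VCD database 3430 is not described here.

```python
from collections import defaultdict


class VocabularyRegistry:
    """Hypothetical in-memory stand-in for a vocabulary database."""

    def __init__(self):
        # term (lower-cased) -> set of namespace URIs registered under it
        self._index = defaultdict(set)

    def register(self, namespace_uri, terms):
        """Index a vocabulary under keywords, element names, attribute names, etc."""
        for term in terms:
            self._index[term.lower()].add(namespace_uri)

    def search(self, *terms):
        """Return the namespace URIs registered under every given term."""
        if not terms:
            return []
        matches = [self._index.get(term.lower(), set()) for term in terms]
        return sorted(set.intersection(*matches))


registry = VocabularyRegistry()
registry.register("http://xmlns.example.com/alice/grades", ["grades", "student", "table"])
registry.register("http://xmlns.example.com/bob/table", ["table", "row", "cell"])

print(registry.search("table"))
print(registry.search("table", "student"))
```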
- FIG. 32 shows a configuration of a schema generation device that is an example of a document processing device according to the second embodiment.
- The schema generation device 75 acquires a definition file, refers to the templates described in it, extracts the elements and attributes contained in XML documents created using the definition file, estimates their structure, and generates a document type definition such as a schema or a DTD.
- The schema generation device 75 includes an acquisition unit 76 that acquires an XML document, a definition file, and the like, an analysis unit 77 that analyzes the acquired definition file, and a schema generation unit 78 that generates a schema.
- The schema generation device 75 may be incorporated in the document processing device 20 or may be provided as a standalone device.
- The analysis unit 77 estimates that the latter element is a child element of the former, for example when the template assigned to one element calls the template of another element.
- The analysis unit 77 may estimate the structure of elements and attributes by referring to the commands and logic described in the definition file. For example, if a UI command that adds a certain element is described, it can be estimated that the element may appear multiple times.
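- The two estimation rules above can be sketched as follows. The input is a deliberately simplified, hypothetical summary of a definition file (a mapping from each element to the elements whose templates it calls, plus the set of elements that some UI command can add); parsing a real definition file is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass
class ChildEstimate:
    name: str
    repeatable: bool


def estimate_children(template_calls, addable_elements):
    """Estimate child elements and whether each may appear more than once.

    template_calls: element name -> names of elements whose templates it calls
    addable_elements: names of elements that a UI command can add
    """
    estimates = {}
    for parent, called in template_calls.items():
        estimates[parent] = [
            # A called template implies a child; an "add" command implies repetition.
            ChildEstimate(name=child, repeatable=child in addable_elements)
            for child in called
        ]
    return estimates


estimates = estimate_children(
    template_calls={"hello": ["world"], "log_book": ["report"]},
    addable_elements={"report"},
)
for parent, children in estimates.items():
    print(parent, children)
```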
- FIG. 33 shows an example of a definition file to be processed.
- The definition file has a “vocabulary” element that declares the vocabulary to be processed, and the value of this element's “match” attribute is the root element of the vocabulary.
- The vocabulary to be processed has the namespace “http://xmlns.xfytec.com/samples/hello”, and the element name of the root element is “hello”.
- The elements and attributes that can appear below the “hello” element can be inferred by examining the template assigned to the “hello” element and the templates that it calls. In this template, the “hello” element has no text among its children, only a “world” element.
- This “world” element has editable text as its child, and there are no restrictions on the text.
- The editability of the text is expressed by “text-of”, which indicates that the text can be edited. If no “type” attribute is specified, the text can be edited freely.
- FIG. 34 shows an example of an XML document processed by the definition file shown in FIG.
- The analysis unit 77 may estimate the structure of the document by referring not only to the definition file 3501 but also to the XML document 3502.
- FIG. 35 shows an example of a schema generated by the schema generation unit 78 from the definition file of FIG.
- This schema 3503 is a RelaxNG schema, but other types of schema, such as XML Schema and DTD, can be output in the same way.
- In this schema 3503, exactly one “world” element always appears under the “hello” element.
- How this restriction is described can be changed through settings.
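- A schema of this shape can be produced with a few lines of code. The sketch below is not the schema generation unit 78 itself: the function `build_relaxng`, the `(child, repeatable)` input format, and the choice to name each “define” after its element are assumptions. Only the RelaxNG namespace and element names (`grammar`, `start`, `define`, `ref`, `element`, `oneOrMore`, `text`) come from the RelaxNG specification.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"


def rng(tag):
    """Qualify a RelaxNG element name in ElementTree's {namespace}tag form."""
    return f"{{{RNG_NS}}}{tag}"


def build_relaxng(root_name, structure, vocabulary_ns):
    """Build a RelaxNG grammar from an estimated structure.

    structure: element name -> list of (child name, repeatable) pairs;
    an element with no children is treated as holding free text.
    """
    grammar = ET.Element(rng("grammar"), {"ns": vocabulary_ns})
    start = ET.SubElement(grammar, rng("start"))
    ET.SubElement(start, rng("ref"), {"name": root_name})
    for element_name, children in structure.items():
        define = ET.SubElement(grammar, rng("define"), {"name": element_name})
        element = ET.SubElement(define, rng("element"), {"name": element_name})
        if not children:
            ET.SubElement(element, rng("text"))
        for child_name, repeatable in children:
            # A bare ref means exactly one occurrence; oneOrMore allows repeats.
            holder = ET.SubElement(element, rng("oneOrMore")) if repeatable else element
            ET.SubElement(holder, rng("ref"), {"name": child_name})
    return grammar


schema = build_relaxng(
    "hello",
    {"hello": [("world", False)], "world": []},
    "http://xmlns.xfytec.com/samples/hello",
)
print(ET.tostring(schema, encoding="unicode"))
```

- ElementTree writes the RelaxNG namespace with a generated prefix such as `ns0`, which is still namespace-correct XML.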
- The names given to the “ref” and “define” elements are selected from the template name, the element name, and the template mode.
- FIGS. 36 (a) to (e) show other examples of the definition file to be processed.
- This definition file 3601 processes a daily report vocabulary.
- The root element of this vocabulary is named “daily report” and has the namespace “http://xmlns.xfytec.com/samples/day_report”.
- The definition file 3601 includes a section described by a “command” element and a section described by a “new_fragment” element.
- The “command” element describes a command used when editing with the definition file 3601, that is, a dedicated command. Such commands include operations for editing the structure, such as adding an element (that is, a subtree) or adding an attribute value.
- The “new-fragment” element describes the minimum structure of a document generated with this definition file 3601.
- The analysis unit 77 can infer the appearance pattern of elements and the number of appearances of essential elements and element groups from the description of the “command” element.
- The essential elements can be inferred from the description of the “new-fragment” element.
- FIGS. 37 (a) to (c) show examples of XML documents processed by the definition file shown in FIGS. 36 (a) to (e).
- When the analysis unit 77 additionally refers to and analyzes the XML document 3602, information that does not appear in the definition file 3601, such as the “src” attribute of the “picture” element, can be complemented.
- Alternatively, a simplified schema may be output without referring to the XML document 3602.
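- Complementing the estimate with an XML instance can be sketched as below. The function names and the sample document are hypothetical; the idea illustrated is only that attributes observed in an instance are merged into what was already known from the definition file.

```python
import xml.etree.ElementTree as ET


def collect_attributes(xml_text):
    """Return element name -> set of attribute names observed in an XML instance."""
    observed = {}
    for node in ET.fromstring(xml_text).iter():
        observed.setdefault(node.tag, set()).update(node.attrib)
    return observed


def complement(known, observed):
    """Merge attributes seen in an instance into those known from the definition file."""
    merged = {name: set(attrs) for name, attrs in known.items()}
    for name, attrs in observed.items():
        merged.setdefault(name, set()).update(attrs)
    return merged


# Assumed starting point: the definition file revealed no attributes for "picture".
known = {"log_book": set(), "report": set(), "picture": set()}
instance = '<log_book><report><picture src="site.png"/></report></log_book>'
merged = complement(known, collect_attributes(instance))
print(merged["picture"])
```

- Skipping the `complement` step corresponds to outputting the simplified schema from the definition file alone.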
- FIGS. 38 (a) and 38 (b) show examples of schemas generated by the schema generation unit 78 from the definition files shown in FIGS. 36 (a) to (e).
- This schema 3603 specifies “http://www.xfytec.com/2005/xfy-datatypes”, and the VCD data types can be resolved through this URL.
- The RelaxNG schema for SVG is output separately in the same directory. Since SVG is a W3C standard, the schema is obtained from the W3C.
- Each “define” element is created by inferring the subordinate templates from the templates in the definition file 3601. The inference identifies the templates that can actually come under an element from the mode specification or the “apply-templates” element, and lists the elements that may exist below it. Naturally, if there is a template that matches all nodes, the elements that appear in that template can appear below every element.
- The minimum required elements are determined from the information in the “new_fragment” element.
- From the “new_fragment” element in the definition file 3601 in FIG. 36 (a), it can be seen that there is always a “report” element immediately under the root element “log_book”.
- Repeatable element groups are also determined. In this example, it can be seen that the “report” element and the “paragraph” element that is always contained in it are repeatable.
- The attribute “mixed” appearing in a template also indicates a repeatable element. An element described with “mixed” is split in two when the Enter key is pressed inside it. In other words, the “paragraph” element is repeatable.
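- Why splitting on the Enter key implies repeatability can be seen in a small sketch. The function `split_element` is hypothetical and handles only text-only elements, which is a simplification of real mixed content: after the split, the parent necessarily holds two sibling elements of the same name.

```python
import xml.etree.ElementTree as ET


def split_element(parent, index, offset):
    """Split the text-only child at parent[index] into two siblings at offset."""
    first = parent[index]
    text = first.text or ""
    second = ET.Element(first.tag, dict(first.attrib))
    first.text, second.text = text[:offset], text[offset:]
    # Whatever followed the original element now follows the second half.
    second.tail, first.tail = first.tail, None
    parent.insert(index + 1, second)
    return second


report = ET.fromstring("<report><paragraph>Morning. Afternoon.</paragraph></report>")
split_element(report, 0, len("Morning."))
print([p.text for p in report.findall("paragraph")])
```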
- In this way, a schema can be generated automatically by referring to a definition file. A more accurate schema can be generated by also referring to an XML instance.
- The present invention can be used in a server device that supports the creation of new vocabularies.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Document Processing Apparatus (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US11/667,685 US20080005085A1 (en) | 2004-11-12 | 2005-11-14 | Server Device and Search Method |
| JP2006545026A JPWO2006051956A1 (ja) | 2004-11-12 | 2005-11-14 | Server device and search method |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2004-329321 | 2004-11-12 | ||
| JP2004329321 | 2004-11-12 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2006051956A1 (ja) | 2006-05-18 |
Family
ID=36336621
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2005/020881 Ceased WO2006051956A1 (ja) | 2004-11-12 | 2005-11-14 | Server device and search method |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20080005085A1 (ja) |
| JP (1) | JPWO2006051956A1 (ja) |
| WO (1) | WO2006051956A1 (ja) |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20090083213A1 (en) * | 2007-09-20 | 2009-03-26 | Haynes Thomas R | Method and System for Fast Navigation in a Hierarchical Tree Control |
| US8200618B2 (en) * | 2007-11-02 | 2012-06-12 | International Business Machines Corporation | System and method for analyzing data in a report |
| US9262185B2 (en) * | 2010-11-22 | 2016-02-16 | Unisys Corporation | Scripted dynamic document generation using dynamic document template scripts |
| US20150169529A1 (en) * | 2013-12-16 | 2015-06-18 | Sap Ag | Mobile device data rendering |
| CN114816645B (zh) * | 2022-05-17 | 2022-11-22 | 三峡高科信息技术有限责任公司 | Method and system for information system label configuration and multilingual translation switching |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2002245264A (ja) * | 2001-02-19 | 2002-08-30 | Hitachi Information Systems Ltd | XML DTD management system and method, XML DTD distribution system and method, and program |
Family Cites Families (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH1049549A (ja) * | 1996-05-29 | 1998-02-20 | Matsushita Electric Ind Co Ltd | Document retrieval device |
| US6012053A (en) * | 1997-06-23 | 2000-01-04 | Lycos, Inc. | Computer system with user-controlled relevance ranking of search results |
| US6356920B1 (en) * | 1998-03-09 | 2002-03-12 | X-Aware, Inc | Dynamic, hierarchical data exchange system |
| US6611840B1 (en) * | 2000-01-21 | 2003-08-26 | International Business Machines Corporation | Method and system for removing content entity object in a hierarchically structured content object stored in a database |
| US20010014899A1 (en) * | 2000-02-04 | 2001-08-16 | Yasuyuki Fujikawa | Structural documentation system |
| US6898761B2 (en) * | 2000-05-01 | 2005-05-24 | Raytheon Company | Extensible markup language genetic algorithm |
| EP1360611A2 (en) * | 2000-12-12 | 2003-11-12 | Time Warner Entertainment Company, L.P. | Digital asset data type definitions |
| JP3835193B2 (ja) * | 2001-03-30 | 2006-10-18 | セイコーエプソン株式会社 | Digital content creation system and digital content creation program |
| JP2003186590A (ja) * | 2001-12-17 | 2003-07-04 | Sharp Corp | Device operation learning apparatus |
| CN1282068C (zh) * | 2002-04-05 | 2006-10-25 | 精工爱普生株式会社 | Apparatus enabling a printer to print a desired design page, and operating method thereof |
| DK1502201T3 (da) * | 2002-05-03 | 2010-05-03 | American Power Conv Corp | Method and apparatus for collecting and displaying network device information |
| JP2004133662A (ja) * | 2002-10-10 | 2004-04-30 | Nec Corp | Vocabulary management system, global VMS server, public VMS server, and program |
| US8473399B2 (en) * | 2003-03-04 | 2013-06-25 | Siebel Systems, Inc. | Invoice data object for a common data object format |
| US7543286B2 (en) * | 2003-11-18 | 2009-06-02 | Microsoft Corporation | Method and system for mapping tags to classes using namespaces |
| US7725460B2 (en) * | 2003-12-08 | 2010-05-25 | Ebay Inc. | Method and system for a transparent application of multiple queries across multiple data sources |
| US7437709B2 (en) * | 2004-02-19 | 2008-10-14 | International Business Machines Corporation | Providing assistance for editing markup document based on inferred grammar |
| US7281018B1 (en) * | 2004-05-26 | 2007-10-09 | Microsoft Corporation | Form template data source change |
| KR20060006581A (ko) * | 2004-07-16 | 2006-01-19 | 김만철 | Golf club swing speed measuring device |
| ATE510259T1 (de) * | 2005-01-31 | 2011-06-15 | Ontoprise Gmbh | Mapping web services to ontologies |
-
2005
- 2005-11-14 WO PCT/JP2005/020881 patent/WO2006051956A1/ja not_active Ceased
- 2005-11-14 US US11/667,685 patent/US20080005085A1/en not_active Abandoned
- 2005-11-14 JP JP2006545026A patent/JPWO2006051956A1/ja active Pending
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2002245264A (ja) * | 2001-02-19 | 2002-08-30 | Hitachi Information Systems Ltd | XML DTD management system and method, XML DTD distribution system and method, and program |
Non-Patent Citations (1)
| Title |
|---|
| HINOKIYAMA M.: "XML Vocabulary Sakusei no Point o Shiru [Kohen]", JAVA WORLD, KABUSHIKI KAISHA IDG JAPAN, vol. 5, no. 1, 1 January 2001 (2001-01-01), pages 199 - 205, XP003007349 * |
Also Published As
| Publication number | Publication date |
|---|---|
| US20080005085A1 (en) | 2008-01-03 |
| JPWO2006051956A1 (ja) | 2008-05-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
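| WO2006051905A1 (ja) | | Data processing device and data processing method |
| WO2006137530A1 (ja) | | Document processing device |
| WO2006121051A1 (ja) | | Document processing device and document processing method |
| WO2006085455A1 (ja) | | Document processing device and document processing method |
| WO2006051870A1 (ja) | | Data processing device, document processing device, and document processing method |
| WO2006051715A1 (ja) | | Document processing device and document processing method |
| WO2006137565A1 (ja) | | Document processing device and document processing method |
| WO2006051961A1 (ja) | | Data processing device and data processing method |
| WO2006051964A1 (ja) | | Data processing system, data processing method, and management server |
| WO2006051962A1 (ja) | | Data processing device and data processing method |
| WO2006051965A1 (ja) | | Data processing device and data processing method |
| WO2006051975A1 (ja) | | Document processing device |
| WO2006046666A1 (ja) | | Document processing device and document processing method |
| WO2007105364A1 (ja) | | Document processing device and document processing method |
| WO2006051969A1 (ja) | | Document processing device and document processing method |
| WO2006051954A1 (ja) | | Document processing device and document processing method |
| WO2006051960A1 (ja) | | Document processing device and document processing method |
| WO2006051713A1 (ja) | | Document processing device and document processing method |
| WO2006120926A1 (ja) | | Input form design device and input form design method |
| WO2006051904A1 (ja) | | Data processing device and data processing method |
| WO2006051955A1 (ja) | | Server device and namespace issuing method |
| WO2006046668A1 (ja) | | Document processing device and document processing method |
| WO2006051966A1 (ja) | | Document management device and document management method |
| WO2006051959A1 (ja) | | Document processing device and document processing method |
| WO2006051956A1 (ja) | | Server device and search method |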
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KN KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | WWE | WIPO information: entry into national phase | Ref document number: 2006545026. Country of ref document: JP |
| | WWE | WIPO information: entry into national phase | Ref document number: 11667685. Country of ref document: US |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 05806027. Country of ref document: EP. Kind code of ref document: A1 |
| | WWP | WIPO information: published in national office | Ref document number: 11667685. Country of ref document: US |