JP2002342342A

JP2002342342A - Document management method and its execution system, and its processing program and recording medium

Info

Publication number: JP2002342342A
Application number: JP2001147955A
Authority: JP
Inventors: Yukie Kanie; 幸恵蟹江; Yuki Aoyama; ゆき青山; Masayoshi Matsumoto; 正義松本; Toru Takahashi; 亨高橋; Mitsuhiro Osada; 充弘長田; Yoshimune Shibata; 吉宗柴田
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2001-05-17
Filing date: 2001-05-17
Publication date: 2002-11-29

Abstract

(57) [Abstract]
[Problem] To provide a technique that allows a group of documents carrying appropriate bibliographic information to be used as an effective means of information sharing.
[Solution] The method comprises the steps of: identifying the document creation application used to create a document to be registered; analyzing the standard document type and the contents of the document to be registered, converting each paragraph in the document into structured document description language data in which the style name assigned to the paragraph is the tag name and the character string contained in the paragraph is the content of that tag, and outputting the data; applying a bibliographic information extraction definition rule to the structured document description language data to extract from it the partial structures corresponding to bibliographic information, and outputting bibliographic information setting screen data reflecting the extraction result; and displaying the bibliographic information setting screen data and accepting a document registration request from the registrant.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a document management system for storing and managing documents, and in particular to a technique that is effective when applied to a document management system that stores and manages documents created with various commercially available document creation applications in application-specific data formats.

[0002]

[Prior Art] In recent years, to raise office productivity, it has become common practice to store documents created in each department in a document management system and use them as a means of sharing information between departments. Each department uses various commercially available document creation applications, producing documents in application-specific data formats.

[0003] In these conventional document creation applications, formatting such as font, character size, and text alignment can be set for each paragraph in a document, and a named collection of such settings is called a style. In addition, to make standard documents such as reports and quotations easier to create, the character strings, formatting, styles, and so on common to each kind of standard document can be saved together as a template document. For example, a user who wants to create a report can produce a standard document that follows the report format by editing the template document for reports.

[0004] Many document management systems attach certain bibliographic information (title, creation date, creator, and so on) to each stored document so that this information can be used for searching. Such a system displays a bibliographic information setting screen like the one shown in FIG. 4 when a document is registered and prompts the user to enter the bibliographic information. The data entered on the screen is managed by the document management system as the bibliographic information of the registered document and is used when users search the stored documents for the one they want. When the stored documents carry bibliographic information that accurately represents their contents, a user can locate the desired document from the bibliographic information alone, without consulting the document contents.

[0005]

[Problems to be Solved by the Invention] However, being forced to fill in the bibliographic information setting screen at registration time makes document registration tedious. Moreover, unless the bibliographic information entered is based on what the document says and is sufficient to identify the document uniquely, it is inadequate as information for searching.

[0006] As work is increasingly divided up to improve productivity, the user who creates a document (hereinafter, the creator) and the user who decides its bibliographic information and registers it in the document management system (hereinafter, the registrant) are often different people. In that case, the registrant must understand the contents of the document in order to decide on bibliographic information that uniquely identifies it, which increases the registrant's workload further. As a result, in practice documents are given only as much bibliographic information as can be supplied without understanding their contents. Documents therefore accumulate without appropriate bibliographic information, and the opportunities to make use of them are limited. Even when the creator registers the document personally, re-entering as bibliographic information what has already been written in the document is tedious, so sufficient bibliographic information is often not provided.

[0007] An object of the present invention is to solve the above problems by providing a technique that makes it unnecessary for the registrant to enter bibliographic information during document registration and that allows a group of documents carrying appropriate bibliographic information to be used as an effective means of information sharing.

[0008]

[Means for Solving the Problems] In a document management system that stores and manages document data created by document creation applications, the present invention converts the document data into structured document description language data and extracts the partial structures that correspond to bibliographic information.

[0009] In document registration according to the present invention, standard document data generated in an application-specific data format by a document creation application is accepted as the document to be registered, and the document creation application used to create it is identified.

[0010] Next, the standard document type and the contents of the document to be registered are analyzed, and each paragraph in the document is converted into structured document description language data in which the style name assigned to the paragraph is the tag name and the character string contained in the paragraph is the content of that tag. This data is then output.

[0011] A bibliographic information extraction definition rule is then applied to the structured document description language data to extract from it the partial structures corresponding to bibliographic information, and bibliographic information setting screen data reflecting the extraction result is output. The bibliographic information setting screen data is then displayed, and a document registration request from the registrant is accepted.

[0012] As described above, according to the present invention, structured document description language data in which the contents of the document are expressed in a structured document description language is output, bibliographic information is extracted from that data according to a bibliographic information extraction definition rule, and the extraction result is presented to the registrant already reflected in the bibliographic information setting screen, so the registrant does not need to enter the bibliographic information. This makes it possible to give the document to be registered appropriate bibliographic information based on its contents, so that the documents stored in the document management system can be used as an effective means of information sharing.

[0013] As described above, the document management system of the present invention converts document data into structured document description language data and extracts the partial structures corresponding to bibliographic information. The registrant therefore does not need to enter bibliographic information during document registration, and a group of documents carrying appropriate bibliographic information can be used as an effective means of information sharing.

[0014]

[Embodiments of the Invention] A document management system according to one embodiment, which stores and manages document data created by document creation applications, is described below.

[0015] FIG. 1 is a diagram outlining the processing of the document registration unit of this embodiment, which accepts standard document data as a document to be registered and stores it in a document database. As shown in FIG. 1, the document management system of this embodiment has a document data receiving unit 21, a bibliographic information extraction unit 22, a full-XML output unit 221, a definition rule application unit 222, and a bibliographic information setting screen display unit 23.

[0016] The document data receiving unit 21 is a processing unit that accepts standard document data 1 created with a commercially available document creation application as the document to be registered and identifies the document creation application used to create it.

[0017] The bibliographic information extraction unit 22 is a processing unit that determines the standard document type, extracts from the contents of the document to be registered the partial structures corresponding to bibliographic information, and outputs bibliographic information setting screen data reflecting the extraction result.

[0018] The full-XML output unit 221 is a structured conversion output unit that analyzes the standard document type and contents of the document to be registered, converts each paragraph in the document into structured document description language data in which the style name assigned to the paragraph is the tag name and the character string contained in the paragraph is the content of that tag, and outputs the result.

[0019] The definition rule application unit 222 is a processing unit that applies a bibliographic information extraction definition rule to the structured document description language data, thereby extracting from it the partial structures corresponding to bibliographic information, and outputs bibliographic information setting screen data reflecting the extraction result. The bibliographic information setting screen display unit 23 is a processing unit that displays the bibliographic information setting screen data output by the bibliographic information extraction unit 22 and accepts a document registration request from the registrant.

[0020] The program that causes the document management system to function as the document data receiving unit 21, the bibliographic information extraction unit 22, the full-XML output unit 221, the definition rule application unit 222, and the bibliographic information setting screen display unit 23 is assumed to be recorded on a recording medium such as a CD-ROM, stored on a magnetic disk or the like, and then loaded into memory and executed. The recording medium on which the program is recorded may be a medium other than a CD-ROM. The program may be installed from the recording medium onto an information processing apparatus and used, or the recording medium may be accessed over a network to use the program.

[0021] The definition rule database 3 in the figure manages bibliographic information extraction definition rules 31, 32, ..., which differ for each standard document, in association with the standard document type to which each applies. Each bibliographic information extraction definition rule is used to extract the partial structures corresponding to bibliographic information from document data written in XML (eXtensible Markup Language), and its output format can be freely defined by the author of the rule. In this embodiment, the extraction result is output as bibliographic information setting screen data in which the result is embedded as the initial values of the input areas of the bibliographic information setting screen.

[0022] In this embodiment, the document data receiving unit 21 identifies the document creation application used to create the document to be registered from the extension of the document file.
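The extension-based identification can be sketched as a simple table lookup. This is only an illustration: the table `APPLICATION_BY_EXTENSION`, its entries, and the function name `identify_application` are hypothetical, since the patent names neither concrete extensions nor concrete applications.

```python
from pathlib import Path

# Hypothetical extension-to-application table; the patent does not list
# concrete extensions or product names.
APPLICATION_BY_EXTENSION = {
    ".doc": "word-processor-A",
    ".jtd": "word-processor-B",
}


def identify_application(file_name: str) -> str:
    """Return the application registered for the file's extension."""
    extension = Path(file_name).suffix.lower()  # compare case-insensitively
    try:
        return APPLICATION_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError(
            f"no document creation application registered for {extension!r}"
        ) from None
```

A file whose extension is not in the table is rejected here rather than guessed at. The patent does not say how such files are handled, so that choice is an assumption of this sketch.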

[0023] The bibliographic information extraction unit 22 consists of the full-XML output unit 221 and the definition rule application unit 222. The full-XML output unit 221 analyzes the standard document type and contents of the document to be registered and outputs full-XML data 5, in which each paragraph in the document is replaced by an XML element whose tag name is the style name assigned to the paragraph and which encloses the character string contained in the paragraph. The definition rule application unit 222 applies to the full-XML data the bibliographic information extraction definition rule for the standard document type of the document to be registered, and outputs bibliographic information setting screen data 6 reflecting the extraction result.

[0024] The full-XML output unit 221 processes document data created with various commercially available document creation applications. In this embodiment, the macro functions provided by each document creation application are used to analyze the standard document type and contents of document data in that application's own data format. A full-XML output unit 221 is therefore prepared for each document creation application used to create documents.

[0025] When the standard document data 1 is input to the document registration unit 2, the document data receiving unit 21 first identifies the document creation application used to create the standard document data 1.

[0026] Next, the bibliographic information extraction unit 22 uses the full-XML output unit 221 for that document creation application to determine the standard document type of the standard document data 1 and to produce the full-XML output. It then uses the definition rule application unit 222 to apply the bibliographic information extraction definition rule associated with that standard document type, thereby extracting the bibliographic information and outputting bibliographic information setting screen data reflecting the extraction result.
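The flow through the bibliographic information extraction unit 22 can be sketched as two lookups and two calls. The function and parameter names below are hypothetical. The per-application converters and per-type rules are passed in as plain callables, which is an assumption made to keep the sketch self-contained rather than a description of the actual implementation.

```python
def extract_screen_data(document, application, converters, rules):
    """Run conversion and rule application for one document.

    converters: maps an application name to a callable that returns
                (standard_document_type, full_xml) for a document.
    rules:      maps a standard document type to a callable that turns
                full-XML data into setting screen data.
    """
    # One full-XML output unit is prepared per document creation application.
    converter = converters[application]
    # The converter both determines the document type and emits full-XML.
    document_type, full_xml = converter(document)
    # The rule is selected by the standard document type.
    rule = rules[document_type]
    # Applying the rule yields the setting screen data.
    return rule(full_xml)
```

Keeping both tables outside the function mirrors the text: supporting a new application means adding a converter, and supporting a new document type means adding a rule.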

[0027] Next, the bibliographic information setting screen display unit 23 displays the bibliographic information setting screen data output by the definition rule application unit 222. FIG. 10 shows an example of the bibliographic information setting screen displayed by the bibliographic information setting screen display unit 23. The registrant checks the extraction results shown as the initial values of the input areas on this screen, corrects them if necessary, and then selects the OK button 404. This requests that the character strings shown in the input areas be set as the bibliographic information of the standard document data 1 and that the standard document data 1 be registered in the document database 4.

[0028] FIG. 2 is a diagram showing the system configuration of the document management system of this embodiment. The document management server 10, connected to the network 15, is a computer with document management functions including the document registration unit 2, and the client terminal 20 is a computer that can access the document management server 10 via the network 15.

[0029] This embodiment does not restrict the means of communication between the document management server 10 and the client terminal 20, but the description here assumes communication over the Internet using HTTP. That is, the document management server 10 has an HTTP server unit 142 and functions as an HTTP server.

[0030] The document management server 10 consists of a display 11, a data input device 12 such as a keyboard, a CPU 13, a memory 14, the definition rule database 3, which manages the bibliographic information extraction definition rules, and the document database 4, which stores document data created with commercially available document creation applications. The memory 14 holds a document management unit 141, which includes the document registration unit 2, and the HTTP server unit 142.

[0031] The client terminal 20 consists of a display 201, a data input device 202, a CPU 203, and a memory 24. The memory 24 holds a document creation application 241 and a Web browser unit 242.

[0032] In this system configuration, the registrant starts a Web browser on the client terminal 20 and communicates with the document management server 10 through it, and can thereby request registration of a document in the document database 4 on the document management server 10.

[0033] FIG. 3 is a diagram showing an example of the document registration screen displayed in the Web browser started on the client terminal 20 of this embodiment. When the registrant enters the name of the document file to be registered in the registration target file name input area 304 and selects the bibliographic information setting button 306, the document registration unit 2 on the document management server 10 is invoked.

[0034] FIG. 4 is a diagram showing an example of a bibliographic information setting screen according to the conventional method. As shown in FIG. 4, the conventional bibliographic information setting screen is used to enter a title, creator, summary, material ID, project name, and distribution destination as the bibliographic information to be attached to the document to be registered. When the registrant enters data in the title input area 4021, creator input area 4022, summary input area 4023, material ID input area 4024, project name input area 4025, and distribution destination input area 4026 within the bibliographic information input area 402 and selects the OK button 404, the bibliographic information entered on this screen is attached to the document file specified on the document registration screen 30, and the file is registered in the document database 4.

[0035] FIG. 5 is a diagram showing an example of a document to be registered in this embodiment. It was created following a predetermined report format, and its standard document type is "報告書" (report).

[0036] Paragraph 505 in the figure has the "題名" (title) style. That is, the formatting "font = Gothic, character size = 14 points, alignment = centered" was set for the paragraph containing the character string "○○中間報告" (○○ interim report), and this set of settings was given the name "題名" style.
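A style in this sense is a named bundle of formatting settings attached to a paragraph. A minimal sketch of that data model follows. The class and field names are hypothetical, and only the values for paragraph 505 come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    """A named collection of paragraph formatting settings."""
    name: str
    font: str
    size_pt: int
    alignment: str


@dataclass
class Paragraph:
    """A paragraph carries its text plus the style assigned to it."""
    style: Style
    text: str


# Paragraph 505 of FIG. 5 as described in the text.
TITLE_STYLE = Style(name="題名", font="Gothic", size_pt=14, alignment="center")
title_paragraph = Paragraph(style=TITLE_STYLE, text="○○中間報告")
```

Only `style.name` and `text` matter for the later conversion to full-XML. The formatting fields are what the document creation application uses for display.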

[0037] Similarly, every paragraph in the document is assumed to have some style set: paragraph 501 has the "項目名" (item name) style, paragraph 502 the "資料ＩＤ" (material ID) style, paragraph 503 the "項目名" style, paragraph 504 the "配布先" (distribution destination) style, paragraph 506 the "日付" (date) style, paragraph 507 the "所属" (affiliation) style, paragraph 508 the "氏名" (name) style, paragraph 509 the "要旨" (summary) style, paragraph 510 the "見出し" (heading) style, and paragraph 511 the "段落" (paragraph) style.

[0038] In this embodiment, a template for each document creation application is prepared in advance for each standard document type, such as reports, daily sales reports, and meeting minutes, and documents to be registered are created from these templates. The macro function that outputs the contents of a document to be registered as full-XML is likewise prepared to match the format of each template.

[0039] FIG. 6 is a flowchart showing the bibliographic information extraction process performed by the bibliographic information extraction unit 22 of this embodiment. First, in step 2201, the bibliographic information extraction unit 22 receives from the document data receiving unit 21 the standard document file to be registered and the name of the document creation application used to create it.

[0040] In step 2202, the bibliographic information extraction unit 22 starts the full-XML output unit 221 corresponding to the received document creation application and reads the macro function prepared for the template of the standard document file to be registered.

[0041] In step 2203, the full-XML output unit 221 uses the macro function to determine the standard document type of the standard document file received in step 2201. In step 2204, it executes the macro function with an interpreter or the like and outputs the contents of the standard document file received in step 2201 as full-XML. FIG. 7 shows an example of the full-XML data output from the document data of FIG. 5.

[0042] FIG. 7 is a diagram showing an example of the full-XML data output from the document data of FIG. 5 in this embodiment. Because the document to be registered shown in FIG. 5 is a report, it was created from the report template, and the macro function for reports is therefore associated with it. When the full-XML output unit 221 of this embodiment executes that macro function, it first outputs the tag <報告書> (report) of FIG. 7, then reads the item name "資料ＩＤ" (material ID) of FIG. 5 and outputs <項目名>資料ＩＤ</項目名> of FIG. 7, and continues in the same way to output full-XML data like that of FIG. 7.
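The conversion described here can be sketched as follows, assuming the paragraphs have already been read out of the document as (style name, text) pairs. The patent does this with application-specific macro functions, while this sketch swaps in Python's standard `xml.etree.ElementTree`. The function name `to_full_xml` is hypothetical, and the affiliation and name strings are made-up placeholders because FIG. 5 is not reproduced in the text.

```python
import xml.etree.ElementTree as ET


def to_full_xml(document_type, paragraphs):
    """Wrap each (style_name, text) paragraph in an element named after its style."""
    root = ET.Element(document_type)  # e.g. <報告書> for a report
    for style_name, text in paragraphs:
        element = ET.SubElement(root, style_name)  # style name becomes the tag name
        element.text = text                        # paragraph text becomes the content
    return ET.tostring(root, encoding="unicode")


# A few paragraphs modelled on FIG. 5; the last two texts are placeholders.
paragraphs = [
    ("項目名", "資料ＩＤ"),
    ("題名", "○○中間報告"),
    ("所属", "第一開発部"),
    ("氏名", "山田太郎"),
]
full_xml = to_full_xml("報告書", paragraphs)
```

ElementTree does not check that a style name is a legal XML name, so a real converter would need to reject or rewrite style names containing spaces or other characters that XML names do not allow.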

[0043] In step 2205, the bibliographic information extraction definition rule associated with the standard document type obtained in step 2203 is retrieved from the definition rule database 3. FIG. 8 shows an example of the bibliographic information extraction definition rule associated with the standard document type "報告書" (report).

[0044] FIG. 8 is a diagram showing an example of the bibliographic information extraction definition rule associated with the standard document type "報告書" (report) in this embodiment. As shown in FIG. 8, a bibliographic information extraction definition rule is a transformation rule for XML data written in XSLT 1.0 (W3C Recommendation, 16 November 1999). It specifies that partial structures such as "題名" (title) and "所属" (affiliation) contained in full-XML data like that of FIG. 7 are extracted, and that screen data in which they are embedded as the initial values of input areas such as "タイトル" (Title) and "作成者" (Creator) of the bibliographic information setting screen is output in HTML (HyperText Markup Language) format.

[0045] A document to be registered contains items such as "題名" (title), "所属" (affiliation), and "氏名" (name), depending on its standard document type, whereas the bibliographic information setting screen uses item names such as "タイトル" (Title) and "作成者" (Creator) that differ from those in the document, as shown in FIG. 4. The bibliographic information extraction definition rule of FIG. 8 expresses the rule for extracting, from contents that differ for each standard document type, the items used on the bibliographic information setting screen.

[0046] Step 2206 is the definition rule application process executed by the definition rule application unit 222. The bibliographic information extraction definition rule retrieved in step 2205 is applied to the full-XML data output in step 2204, whereby bibliographic information is extracted from the full-XML data and bibliographic information setting screen data reflecting the extraction result is output. FIG. 9 shows an example of the bibliographic information setting screen data output by applying the bibliographic information extraction definition rule of FIG. 8 to the full-XML data of FIG. 7.
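The rule application of step 2206 can be sketched as follows. This is an illustrative Python stand-in that uses the standard library's `xml.etree.ElementTree` in place of an XSLT processor, so it is not the rule of FIG. 8 itself. The sample document `FULL_XML` (including its root tag and text values), the mapping `RULE`, and the function `apply_rule` are all assumptions introduced for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical full-XML input: each paragraph's style name is the tag name.
# The root tag and the text values are invented for illustration.
FULL_XML = """<文書>
  <題名>月次報告</題名>
  <所属>開発部</所属>
  <氏名>山田</氏名>
  <本文>今月の進捗を報告する。</本文>
</文書>"""

# Hypothetical rule: screen item -> source tags whose text is joined.
RULE = {"タイトル": ["題名"], "作成者": ["所属", "氏名"]}


def apply_rule(full_xml: str, rule: dict[str, list[str]]) -> str:
    """Extract the substructures named by the rule and emit HTML form data."""
    root = ET.fromstring(full_xml)
    rows = []
    for screen_item, tags in rule.items():
        # findtext returns None for a missing tag; such tags are skipped.
        parts = [root.findtext(tag) for tag in tags]
        value = " ".join(p.strip() for p in parts if p and p.strip())
        rows.append(
            f'<label>{escape(screen_item)} '
            f'<input type="text" name="{escape(screen_item)}" '
            f'value="{escape(value)}"></label>'
        )
    return "<form>\n" + "\n".join(rows) + "\n</form>"


if __name__ == "__main__":
    print(apply_rule(FULL_XML, RULE))
```

The extracted strings are escaped before being written into the `value` attributes, so text containing quotes or ampersands cannot break the generated screen data.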

[0047] FIG. 9 is a diagram showing an example of the HTML-format bibliographic information setting screen data output from the full-XML data in the present embodiment. When this screen data is displayed by the bibliographic information setting screen display unit 23, a bibliographic information setting screen such as that shown in FIG. 10 appears.

[0048] FIG. 10 is a diagram showing a display example when the HTML-format bibliographic information setting screen data shown in FIG. 9 is displayed in a Web browser in the present embodiment. The "題名" (subject), "所属" (affiliation), "氏名" (name), and other items extracted by the bibliographic information extraction unit 22 are displayed as "タイトル" (Title), "作成者" (Creator), and so on, embedded in the title input area 4081, the creator input area 4082, and so on, respectively. The registrant can correct the character string in each input area as needed. When the OK button 404 is selected, the character strings in the input areas are attached to the document to be registered as its bibliographic information, and the document is registered in the document database 4.
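The behaviour on selecting the OK button can be illustrated with a short Python sketch. The class `DocumentDatabase` is a hypothetical in-memory stand-in for the document database 4, and the function `on_ok`, the file name, and the field values are assumptions made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentDatabase:
    """Hypothetical in-memory stand-in for the document database 4."""

    records: list[dict] = field(default_factory=list)

    def register(self, file_name: str, bibliographic_info: dict[str, str]) -> int:
        """Store a document with its bibliographic information; return its index."""
        self.records.append(
            {"file": file_name, "bibliographic_info": dict(bibliographic_info)}
        )
        return len(self.records) - 1


def on_ok(
    db: DocumentDatabase,
    file_name: str,
    initial_values: dict[str, str],
    edits: dict[str, str],
) -> int:
    """Apply the registrant's corrections over the extracted values, then register."""
    final_values = {**initial_values, **edits}
    return db.register(file_name, final_values)
```

For example, calling `on_ok(db, "report.doc", {"作成者": "開発部 山田"}, {"作成者": "開発部 山田太郎"})` would store the corrected creator string, while any field the registrant leaves untouched keeps its extracted initial value.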

[0049] As described above, according to the present embodiment, structured document description language data in which the description content of a document is replaced with a structured document description language representation is output, bibliographic information is extracted based on the bibliographic information extraction definition rule, and the extraction result is displayed on the bibliographic information setting screen, so the registrant no longer needs to enter bibliographic information. Because bibliographic information based on the description content of the document can thus be attached to the document to be registered, the group of documents accumulated in the document management system can be used as an effective means of information sharing.

[0050] As explained above, the document management system of the present embodiment converts document data into structured document description language data and extracts the partial structures corresponding to bibliographic information. This makes it unnecessary for the registrant to enter bibliographic information during document registration and allows a group of documents to which appropriate bibliographic information has been attached to be used as an effective means of information sharing.

[0051]

[Effects of the Invention] According to the present invention, document data is converted into structured document description language data and the partial structures corresponding to bibliographic information are extracted. This makes it unnecessary for the registrant to enter bibliographic information during document registration and allows a group of documents to which appropriate bibliographic information has been attached to be used as an effective means of information sharing.

[Brief description of the drawings]

FIG. 1 is a diagram showing an outline of the processing of the document registration unit of the present embodiment, which receives fixed document data as a document to be registered and stores it in the document database.

FIG. 2 is a diagram showing the system configuration of the document management system of the present embodiment.

FIG. 3 is a diagram showing an example of the document registration screen displayed in the Web browser started on the client terminal 20 of the present embodiment.

FIG. 4 is a diagram showing an example of a bibliographic information setting screen according to the conventional method.

FIG. 5 is a diagram showing an example of a document to be registered in the present embodiment.

FIG. 6 is a flowchart showing the procedure of the bibliographic information extraction process performed by the bibliographic information extraction unit 22 of the present embodiment.

FIG. 7 is a diagram showing an example of the full-XML data output from the document data of FIG. 5 in the present embodiment.

FIG. 8 is a diagram showing an example of the bibliographic information extraction definition rule associated with the fixed document type "報告書" (report) in the present embodiment.

FIG. 9 is a diagram showing an example of the HTML-format bibliographic information setting screen data output from the full-XML data in the present embodiment.

FIG. 10 is a diagram showing a display example when the HTML-format bibliographic information setting screen data shown in FIG. 9 is displayed in a Web browser in the present embodiment.

[Explanation of symbols]

1 ... fixed document data, 2 ... document registration unit, 3 ... definition rule database, 31, 32 ... bibliographic information extraction definition rules, 4 ... document database, 5 ... XML data, 6 ... bibliographic information setting screen data, 21 ... document data receiving unit, 22 ... bibliographic information extraction unit, 221 ... full-XML output unit, 222 ... definition rule application unit, 23 ... bibliographic information setting screen display unit, 10 ... document management server, 11 ... display, 12 ... data input device, 13 ... CPU, 14 ... memory, 141 ... document management unit, 142 ... HTTP server unit, 15 ... network, 20 ... client terminal, 201 ... display, 202 ... data input device, 203 ... CPU, 24 ... memory, 241 ... document creation application, 242 ... Web browser unit, 30 ... document registration screen, 302 ... registration target file name reference button, 304 ... registration target file name input area, 306 ... bibliographic information setting button, 308 ... cancel button, 40 ... bibliographic information setting screen, 402 ... bibliographic information input area, 404 ... OK button, 406 ... cancel button, 4021 ... title input area, 4022 ... creator input area, 4023 ... summary input area, 4024 ... material ID input area, 4025 ... project name input area, 4026 ... distribution destination input area, 501 to 511 ... paragraphs, 4081 ... title input area, 4082 ... creator input area.

Continuation of front page

(72) Inventor: Masayoshi Matsumoto, c/o Business Solutions Division, Hitachi, Ltd., 890 Kashimada, Saiwai-ku, Kawasaki-shi, Kanagawa

(72) Inventor: Toru Takahashi, c/o Business Solutions Division, Hitachi, Ltd., 890 Kashimada, Saiwai-ku, Kawasaki-shi, Kanagawa

(72) Inventor: Mitsuhiro Osada, c/o Software Division, Hitachi, Ltd., 5030 Totsuka-cho, Totsuka-ku, Yokohama-shi, Kanagawa

(72) Inventor: Yoshimune Shibata, c/o Software Division, Hitachi, Ltd., 5030 Totsuka-cho, Totsuka-ku, Yokohama-shi, Kanagawa

F-terms (reference): 5B009 NA05; 5B075 NK04 NK31

Claims

[Claims]

1. A document management method for storing and managing document data created by a document creation application, comprising: a step of receiving document data created by the document creation application as a document to be registered and identifying the document creation application used to create the document; a step of analyzing the fixed document type and description content of the document to be registered, converting each paragraph in the document into structured document description language data in which the style name assigned to the paragraph is the tag name and the character string contained in the paragraph is the content of the tag, and outputting the data; a step of applying a bibliographic information extraction definition rule to the structured document description language data, thereby extracting from the structured document description language data a partial structure corresponding to bibliographic information, and outputting bibliographic information setting screen data reflecting the extraction result; and a step of displaying the bibliographic information setting screen data and accepting a document registration request from a registrant.

2. The document management method according to claim 1, wherein the bibliographic information extraction definition rule is managed in association with the fixed document type of the documents to which it applies, and the bibliographic information extraction definition rule associated with the fixed document type of the document to be registered is applied to the structured document description language data output from the description content of the document to be registered.

3. A document management system for accumulating and managing document data created by a document creation application, comprising: a document data receiving unit that receives document data created by the document creation application as a document to be registered and identifies the document creation application used to create the document; a structured conversion output unit that analyzes the fixed document type and description content of the document to be registered, converts each paragraph in the document into structured document description language data in which the style name assigned to the paragraph is the tag name and the character string contained in the paragraph is the content of the tag, and outputs the data; a definition rule application unit that applies a bibliographic information extraction definition rule to the structured document description language data, thereby extracting from the structured document description language data a partial structure corresponding to bibliographic information, and outputs bibliographic information setting screen data reflecting the extraction result; and a bibliographic information setting screen display unit that displays the bibliographic information setting screen data and accepts a document registration request from a registrant.

4. A program for causing a computer to function as a document management system for storing and managing document data created by a document creation application, the program causing the computer to function as: a document data receiving unit that receives document data created by the document creation application as a document to be registered and identifies the document creation application used to create the document; a structured conversion output unit that analyzes the fixed document type and description content of the document to be registered, converts each paragraph in the document into structured document description language data in which the style name assigned to the paragraph is the tag name and the character string contained in the paragraph is the content of the tag, and outputs the data; a definition rule application unit that applies a bibliographic information extraction definition rule to the structured document description language data, thereby extracting from the structured document description language data a partial structure corresponding to bibliographic information, and outputs bibliographic information setting screen data reflecting the extraction result; and a bibliographic information setting screen display unit that displays the bibliographic information setting screen data and accepts a document registration request from a registrant.

5. A computer-readable recording medium on which is recorded a program for causing a computer to function as a document management system for accumulating and managing document data created by a document creation application, the program causing the computer to function as: a document data receiving unit that receives document data created by the document creation application as a document to be registered and identifies the document creation application used to create the document; a structured conversion output unit that analyzes the fixed document type and description content of the document to be registered, converts each paragraph in the document into structured document description language data in which the style name assigned to the paragraph is the tag name and the character string contained in the paragraph is the content of the tag, and outputs the data; a definition rule application unit that applies a bibliographic information extraction definition rule to the structured document description language data, thereby extracting from the structured document description language data a partial structure corresponding to bibliographic information, and outputs bibliographic information setting screen data reflecting the extraction result; and a bibliographic information setting screen display unit that displays the bibliographic information setting screen data and accepts a document registration request from a registrant.