JP2013149068A

JP2013149068A - Method, system, and program for analyzing relevancy between files

Info

Publication number: JP2013149068A
Application number: JP2012008766A
Authority: JP
Inventors: Daisuke Bando (坂東 大輔)
Original assignee: Hitachi Solutions Ltd
Current assignee: Hitachi Solutions Ltd
Priority date: 2012-01-19
Filing date: 2012-01-19
Publication date: 2013-08-01

Abstract

PROBLEM TO BE SOLVED: To mechanically analyze files that are not associated with each other by hyperlinks, so that a search system or the like can extract related files having a "weak relevance" and display them on a screen.
SOLUTION: File name and file path information is acquired from indexed file information, and the data of each file is then acquired based on its file path. It is determined whether the file data contains a character string indicating a file name that conforms to a file name character string determination rule. If it does, the file path of the file containing the string is stored in a second database as a reference source file path, and the file name found is stored as a reference destination file name. For the file at each reference source file path, the reference destination file name is then used as a search key to determine whether a file with a matching file name exists. If one exists, the file path of that file is acquired from the first database, and the pair is extracted as information on files having a weak relevance in which the relation between reference source and reference destination has been identified.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for analyzing the relevance between large numbers of files stored on a file server, and in particular to a method, system, and program for analyzing the relevance between files that allow a search system to display, within the search results, files related to a retrieved file.

As computers have become faster and HDD capacities larger, enormous numbers of files are now being created. This has increased the need for search systems that can find a desired file quickly and accurately among them. In particular, because files are often related to one another, the need to find other files related to a given file has grown substantially.

For Web pages in HTML (Hyper Text Mark-up Language) format stored in hypermedia databases such as the WWW (World Wide Web), the search engine Google (a registered trademark of Google Inc.) provides a function that displays other Web pages related to a given Web page, based on the correlation of the hyperlinks between Web pages.
Patent Documents 1 and 2 below are known technical documents related to the present invention.

Patent Document 1: JP 2002-304393 A
Patent Document 2: US 6,285,999 B1

Patent Documents 1 and 2 above determine the relevance between files on the premise of the correlation of hyperlinks between Web pages.
However, even when files are related to one another, the relation is not always made explicit by a hyperlink; it may instead be implied by one file mentioning the name of another.
In the former case, the hyperlink explicitly gives the path of the linked file, so the relation is evident and can be regarded as a "strong relevance".
In the latter case, the path of the linked file is not given directly, and the relation has to be found from data such as the file name and the update date and time, so it can be regarded as a "weak relevance".

Conventionally, a system could mechanically recognize files having a "strong relevance", that is, files associated by hyperlinks, as related files. For files having a "weak relevance", that is, files not associated by hyperlinks, a person had to judge for themselves that the files were related.
As a result, related files with a "strong relevance" could be retrieved mechanically by the system, whereas related files with a "weak relevance" required human judgment, so the search could not be automated and finding them by hand took an enormous amount of time and effort.

An object of the present invention is to provide a method, system, and program that mechanically analyze files having a "weak relevance", which are not associated by hyperlinks, so that a search system or the like can extract related files having a "weak relevance" and display them on a screen.

To solve the above problem, the method for analyzing relevance between files according to the present invention comprises:
a first step of acquiring file name and file path information from indexed file information stored in a first database;
a second step of acquiring the data of each file based on the acquired file path, determining whether the file data contains a character string indicating a file name that conforms to a file name character string determination rule, and, if it does, storing in a second database the file path of the file containing the string as a reference source file path and the file name found as a reference destination file name, together with the associated detection date and time, this processing being executed for all file data; and
a third step of acquiring the reference source file paths stored in the second database, searching, for the file at each reference source file path, the index information stored in the first database using the reference destination file name stored in the second database as a search key, determining whether a file with a file name matching the search key exists, and, if one exists, acquiring the file path of that file from the first database and storing in a third database the reference source file path, the reference destination file path, and the date and time at which the relevance was determined, as information on files having a weak relevance in which the relation between reference source and reference destination has been identified, this processing being executed for all file data.
The method thereby extracts information on files whose reference source and reference destination, not identified by a hyperlink, have a weak relevance.
Further, the indexed file information stored in the first database includes the date and time at which creation of that indexed file information was completed, and the method comprises a fourth step in which, in the third step, that creation completion date and time is compared with the date and time at which a file with a file name matching the search key was determined to exist, and the reference source file path, the reference destination file path, and the date and time at which the relevance was determined are stored in the third database only when the difference is within an allowable range.
Further, the method comprises a fifth step in which, when the file path of a file tied to an arbitrary link is specified, the third database is searched using that file path as a search key and, if a reference destination file path matching the search key exists, the link to that reference destination file path and the date and time at which the relevance was determined are acquired from the third database and displayed.
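The third and fourth steps can be illustrated with a minimal in-memory sketch. The record types, field names, and the function `resolve_weak_links` are hypothetical and do not appear in the patent; the three databases are stood in for by plain Python lists, and the allowable range is assumed to be a simple time difference.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class IndexedFile:
    """One row of the first database (indexed file information)."""
    file_path: str
    file_name: str
    indexed_at: datetime


@dataclass(frozen=True)
class Mention:
    """One row of the second database (a file name found inside a file)."""
    source_path: str
    target_name: str
    detected_at: datetime


@dataclass(frozen=True)
class WeakLink:
    """One row of the third database (a resolved weak-relevance pair)."""
    source_path: str
    target_path: str
    judged_at: datetime


def resolve_weak_links(index, mentions, judged_at, tolerance):
    """Third and fourth steps: resolve mentioned names to indexed paths."""
    links = []
    for mention in mentions:
        for entry in index:
            if entry.file_name != mention.target_name:
                continue  # search key does not match this indexed file
            # Fourth step (assumed reading): keep the pair only if the index
            # entry's completion time is close enough to the judgement time.
            if abs(judged_at - entry.indexed_at) <= tolerance:
                links.append(
                    WeakLink(mention.source_path, entry.file_path, judged_at)
                )
    return links


now = datetime(2012, 1, 19, 12, 0)
index = [
    IndexedFile("/share/specs/design.doc", "design.doc", now - timedelta(hours=1)),
    IndexedFile("/share/old/design.doc", "design.doc", now - timedelta(days=30)),
]
mentions = [Mention("/share/minutes/meeting.txt", "design.doc", now)]
links = resolve_weak_links(index, mentions, now, tolerance=timedelta(days=1))
```

With a one-day tolerance, only the recently indexed `design.doc` is linked to the meeting notes; the copy indexed thirty days earlier falls outside the allowable range and is skipped.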

The system for analyzing relevance between files according to the present invention comprises:
first means for acquiring file name and file path information from indexed file information stored in a first database;
second means for acquiring the data of each file based on the acquired file path, determining whether the file data contains a character string indicating a file name that conforms to a file name character string determination rule, and, if it does, storing in a second database the file path of the file containing the string as a reference source file path and the file name found as a reference destination file name, together with the associated detection date and time, this processing being executed for all file data; and
third means for acquiring the reference source file paths stored in the second database, searching, for the file at each reference source file path, the index information stored in the first database using the reference destination file name stored in the second database as a search key, determining whether a file with a file name matching the search key exists, and, if one exists, acquiring the file path of that file from the first database and storing in a third database the reference source file path, the reference destination file path, and the date and time at which the relevance was determined, as information on files having a weak relevance in which the relation between reference source and reference destination has been identified, this processing being executed for all file data.
The system thereby extracts information on files whose reference source and reference destination, not identified by a hyperlink, have a weak relevance.
Further, the indexed file information stored in the first database includes the date and time at which creation of that indexed file information was completed, and the system comprises fourth means that, in the third step, compares that creation completion date and time with the date and time at which a file with a file name matching the search key was determined to exist, and stores the reference source file path, the reference destination file path, and the date and time at which the relevance was determined in the third database only when the difference is within an allowable range.
Further, the system comprises fifth means that, when the file path of a file tied to an arbitrary link is specified, searches the third database using that file path as a search key and, if a reference destination file path matching the search key exists, acquires the link to that reference destination file path and the date and time at which the relevance was determined from the third database and displays them.

The program for analyzing relevance between files according to the present invention causes a file relevance analysis server to function as:
first means for acquiring file name and file path information from indexed file information stored in a first database;
second means for acquiring the data of each file based on the acquired file path, determining whether the file data contains a character string indicating a file name that conforms to a file name character string determination rule, and, if it does, storing in a second database the file path of the file containing the string as a reference source file path and the file name found as a reference destination file name, together with the associated detection date and time, this processing being executed for all file data; and
third means for acquiring the reference source file paths stored in the second database, searching, for the file at each reference source file path, the index information stored in the first database using the reference destination file name stored in the second database as a search key, determining whether a file with a file name matching the search key exists, and, if one exists, acquiring the file path of that file from the first database and storing in a third database the reference source file path, the reference destination file path, and the date and time at which the relevance was determined, as information on files having a weak relevance in which the relation between reference source and reference destination has been identified, this processing being executed for all file data.
The program thereby extracts information on files whose reference source and reference destination, not identified by a hyperlink, have a weak relevance.
Further, the indexed file information stored in the first database includes the date and time at which creation of that indexed file information was completed, and the program causes the server to function as fourth means that, in the third step, compares that creation completion date and time with the date and time at which a file with a file name matching the search key was determined to exist, and stores the reference source file path, the reference destination file path, and the date and time at which the relevance was determined in the third database only when the difference is within an allowable range.
Further, the program causes the server to function as fifth means that, when the file path of a file tied to an arbitrary link is specified, searches the third database using that file path as a search key and, if a reference destination file path matching the search key exists, acquires the link to that reference destination file path and the date and time at which the relevance was determined from the third database and displays them.

According to the present invention, file name and file path information is acquired from the indexed file information stored in the first database, and the data of each file is then acquired based on the acquired file path. It is determined whether the file data contains a character string indicating a file name that conforms to the file name character string determination rule and, if it does, the file path of the file containing the string is stored in the second database as a reference source file path and the file name found as a reference destination file name, together with the associated detection date and time; this processing is executed for all file data. The reference source file paths stored in the second database are then acquired and, for the file at each reference source file path, the index information stored in the first database is searched using the reference destination file name stored in the second database as a search key to determine whether a file with a matching file name exists. If one exists, the file path of that file is acquired from the first database, and the reference source file path, the reference destination file path, and the date and time at which the relevance was determined are stored in the third database as information on files having a weak relevance in which the relation between reference source and reference destination has been identified; this processing is also executed for all file data. Because information on files whose reference source and reference destination, not identified by a hyperlink, have a weak relevance is extracted in this way, a list of related files having a "weak relevance" can be created by mechanical analysis of the files.
This allows a search system or the like to display related files having a "weak relevance" on the screen.

FIG. 1 is a system configuration diagram of the first embodiment of the present invention.
FIG. 2 is a data configuration diagram of the indexed file information storage DB 2.
FIG. 3 is a data configuration diagram of the file name appearance location information storage DB 3.
FIG. 4 is a data configuration diagram of the "weak relevance" information storage DB 4.
FIG. 5 is a flowchart showing the operation of the file name character string extraction means 8.
FIG. 6 is a flowchart showing the operation of the file name appearance location extraction means 9.
FIG. 7 is a flowchart showing the operation of the reference destination file path specifying means 10.
FIG. 8 is a flowchart showing the operation of the "weak relevance" analysis means 11.
FIG. 9 is a flowchart showing the operation of the related file display means 12.
FIG. 10 is a conceptual diagram of a screen on which links to related files are displayed by the related file display means 12.
FIG. 11 is a conceptual diagram of the related file information display screen of the related file display means 12 when no related file exists.
FIG. 12 is a conceptual diagram of the related file information display screen of the related file display means 12 when a related file exists.

A first embodiment of the present invention is described in detail below with reference to the drawings.
FIG. 1 is a system configuration diagram of the first embodiment of the present invention.
The system for analyzing relevance between files shown in FIG. 1 connects a relevance analysis server 1 and a file server 17 (hereinafter "the servers") so that they can communicate with each other over a wired or wireless communication line such as a LAN (Local Area Network) 16.
In FIG. 1 the servers are connected via the LAN 16, but the connection is not limited to a LAN; they may be connected via, for example, a WAN (Wide Area Network) or the Internet.
Also, in FIG. 1 the servers are connected on the same LAN segment, but this is only one example and any configuration may be used. Furthermore, FIG. 1 shows one relevance analysis server 1 and one file server 17, but there may be two or more of each. The relevance analysis server 1 and the file server 17 also need not be separate devices; for example, the functions of both can be implemented on a single device.

With this configuration, the relevance analysis server 1 creates a list of the files on the file server 17 that have a "weak relevance" to one another.
The relevance analysis server 1 and the file server 17 are devices such as PCs.
The relevance analysis server 1 is provided with an indexed file information storage DB 2, a file name appearance location information storage DB 3, and a "weak relevance" information storage DB 4.
The relevance analysis server 1 also stores index data 5, a file name character string determination rule list 13, reference destination specifying setting data 14, and scheduler setting data 15.

The storage device 18 is a device such as a magnetic disk and is built into or externally connected to the file server 17. It stores the search target files 19 and is configured so that the relevance analysis server 1 can access them.
The index data 5 is data derived from the lexical analysis and normalization that the indexer 7 performs while indexing the search target files 19.

The file name character string determination rule list 13 stores, in regular expression form, the list of determination rules that the file name character string extraction means 8 consults when identifying and extracting character strings that indicate file names.
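As an illustration only, a rule list of this kind could be held as one regular expression per line and compiled when it is loaded. The function `load_rules` and the two sample patterns are assumptions made for this sketch; the patent does not specify how the list is stored or parsed.

```python
import re


def load_rules(lines):
    """Compile one regular expression per non-empty line of the rule list."""
    rules = []
    for line in lines:
        line = line.strip()
        if line:  # blank lines are skipped (an assumption of this sketch)
            rules.append(re.compile(line))
    return rules


# Hypothetical rules: a run of word characters or hyphens, then an extension.
rules = load_rules([r"[\w\-]+\.doc\b", "", r"[\w\-]+\.pdf\b"])
first_hit = rules[0].search("see plan.doc for details")
```

Compiling the patterns once at load time means a malformed rule raises `re.error` immediately, before any file is scanned.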

The reference destination specifying setting data 14 is a setting value giving the allowable range of the date and time difference that the reference destination file path specifying means 10 consults when specifying a reference destination file path.
The scheduler setting data 15 is a setting value giving the execution interval that the scheduler 6 consults in order to run the file relevance analysis periodically.

The relevance analysis server 1 comprises a scheduler 6, an indexer 7, file name character string extraction means 8, file name appearance location extraction means 9, reference destination file path specifying means 10, "weak relevance" analysis means 11, and related file display means 12.

To create the list of related files periodically, the scheduler 6 refers to the list creation processing execution interval in the scheduler setting data 15 and runs the file name appearance location extraction means 9 and the "weak relevance" analysis means 11 to create the list.

The file name character string extraction means 8 extracts character strings indicating file names from the indexed file information storage DB 2, based on the information in the file name character string determination rule list 13. This operation is described later as the processing flow of the file name character string extraction means 8 (S501 onward).

The file name appearance location extraction means 9 stores, in the file name appearance location information storage DB 3, the file path of each reference source file that refers to another file together with the reference destination file name extracted by the file name character string extraction means 8. This operation is described later as the processing flow of the file name appearance location extraction means 9 (S601 onward).

The reference destination file path specifying means 10 extracts the reference destination file path from the indexed file information storage DB 2, based on the reference destination file name extracted by the file name appearance location extraction means 9 and on the reference destination specifying setting data 14. This operation is described later as the processing flow of the reference destination file path specifying means 10 (S701 onward).

The "weak relevance" analysis means 11 creates a list of correspondences between the reference source file paths 311 in the file name appearance location information storage DB 3 and the reference destination file paths specified by the reference destination file path specifying means 10, and stores that list in the "weak relevance" information storage DB 4.
This operation is described later as the processing flow of the "weak relevance" analysis means 11 (S801 onward).

The related file display means 12 refers to the information stored in the "weak relevance" information storage DB 4 and displays on the screen the related files that have a "weak relevance" to a given file. This operation is described later as the processing flow of the related file display means 12 (S901 onward).

FIG. 2 is a data configuration diagram of the indexed file information storage DB 2.
The indexed file information storage DB 2 stores the information defined by the indexed file information table 21.
The indexed file information table 21 consists of a file path 211, a file name 212, and an index creation completion date and time 213. For each search target file 19 that the indexer 7 (not shown) has finished indexing, the file path 211, the file name 212, and the indexing completion date and time 213 are registered.

FIG. 3 is a data configuration diagram of the file name appearance location information storage DB 3.
The file name appearance location information storage DB 3 stores the information defined by the file name appearance location information table 31.
The file name appearance location information table 31 consists of a reference source file path 311, a reference destination file name 312, and an appearance location extraction date and time 313. The file name appearance location extraction means 9 registers the file path 311 of the reference source file, the file name 312 of the reference destination, and the system date and time 313 at the time of extraction.

FIG. 4 is a data configuration diagram of the "weak relationship" information storage DB 4.
The "weak relationship" information storage DB 4 stores the information defined by the "weak relationship" information table 41. The "weak relationship" information table 41 consists of a reference source file path 411, a reference destination file path 412, and a related information update date and time 413. The "weak relationship" analysis means 11 registers the file path 411 of the reference source file, the reference destination file path 412 retrieved from the indexed file information storage DB 2, and the system date and time 413 at the time of analysis.
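The three tables of FIGS. 2 to 4 can be sketched as a relational schema. The following is only an illustration: the use of SQLite, the table names, the column names, and the choice of text columns for dates are assumptions made here, not something the description specifies. The numerals in the comments refer to the fields named above.

```python
import sqlite3

# Hypothetical table and column names; numerals refer to FIGS. 2 to 4.
SCHEMA = """
CREATE TABLE indexed_file_info (        -- table 21 in DB 2
    file_path        TEXT PRIMARY KEY,  -- 211
    file_name        TEXT NOT NULL,     -- 212
    index_completed  TEXT NOT NULL      -- 213
);
CREATE TABLE file_name_appearance (     -- table 31 in DB 3
    source_path      TEXT NOT NULL,     -- 311
    target_name      TEXT NOT NULL,     -- 312
    extracted_at     TEXT NOT NULL      -- 313
);
CREATE TABLE weak_relation (            -- table 41 in DB 4
    source_path      TEXT NOT NULL,     -- 411
    target_path      TEXT NOT NULL,     -- 412
    updated_at       TEXT NOT NULL      -- 413
);
"""


def create_databases(conn: sqlite3.Connection) -> None:
    """Create the three illustrative tables in one connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_databases(conn)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
)
```

Keeping all three tables in one database is a simplification; the description treats DB 2, DB 3, and DB 4 as separate stores.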

FIG. 5 is a flowchart showing the operation of the file name character string extraction means 8.
The file name character string extraction means 8 is a process that, taking each file at a file path indicated by the index data 5 as a reference source, extracts character strings of file names that conform to the file name character string determination rules.
The process of FIG. 5 is called from within the process shown in the flowchart of FIG. 6.
First, the file name character string determination rule list 13 is consulted, and the rules described in it in regular-expression form (for example, "*.doc", "*.xls", "*.pdf", "*.txt") are read (S501).

Next, for the file specified when the process was called from the file name appearance location extraction means 9, the index data 5 is searched using the rules read in S501 as search keys (S502). Each character string hit by the search is then regarded as a file name character string and extracted, and the extraction result is returned to the process of the file name appearance location extraction means 9 (S503).
If the number of strings extracted in S503 is zero, a NULL value is returned to the process of the file name appearance location extraction means 9; if more than one string is extracted, all of them are returned.
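Steps S501 to S503 can be illustrated with a small sketch. It makes several assumptions that are not in the description: the rules are treated as extension patterns of the form "*.ext" and converted into a single regular expression, the file stem is taken to be any run of characters other than whitespace, quotes, and path separators, and the search runs over plain text rather than over the index data 5. The function names are hypothetical.

```python
import re
from typing import List, Optional

# Example rules taken from the text (S501).
RULES = ["*.doc", "*.xls", "*.pdf", "*.txt"]


def compile_rules(rules: List[str]) -> "re.Pattern[str]":
    """Build one regex from "*.ext" rules (assumed rule shape)."""
    exts = [re.escape(rule.split(".", 1)[1]) for rule in rules]
    # Stem: anything except whitespace, quotes, and path separators (assumption).
    return re.compile(r"[^\s\"'/\\]+\.(?:" + "|".join(exts) + r")\b", re.IGNORECASE)


def extract_file_names(text: str, rules: List[str] = RULES) -> Optional[List[str]]:
    """S502-S503: return every matching string, or None when nothing matches."""
    hits = compile_rules(rules).findall(text)
    return hits or None
```

Returning `None` for zero hits mirrors the NULL value described for S503, and returning the full list mirrors the case where several file names are found in one file.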

FIG. 6 is a flowchart showing the operation of the file name appearance location extraction means 9.
The file name appearance location extraction means 9 is a process that extracts, from each reference source file, information on the locations where reference destination file names appear. As described above, the lead-up to S601 in FIG. 6 is that the scheduler 6, which periodically creates the list of related files, refers to the list creation processing execution interval in the scheduler setting data 15 and starts the process.
First, the information on indexed files is acquired from the indexed file information storage DB 2 (S601).
Next, the data of every file indicated by a file path in the acquired indexed file information is acquired one file at a time, and the call to the file name character string extraction means 8 and steps S602 to S605 are repeated for each file.

Within this loop, the file name character string extraction means 8 is called first. A conditional branch is then taken depending on whether the file name character string extraction means 8 extracted a file name character string that conforms to a rule defined in the file name character string determination rule list 13 (S602). If nothing was extracted, processing moves on to the next indexed file data.
If something was extracted, steps S603 to S605 are repeated until all extracted file name character strings have been processed (a single indexed file may refer to several file name character strings, so each string is processed separately).

For each extracted string, the value of the file path 211 in the indexed file information table 21 (that is, the file path of the file from which a file name character string conforming to a rule defined in the file name character string determination rule list 13 was extracted) is stored in the reference source file path 311 of the file name appearance location information table 31 (S603).
Next, the value of the file name character string extracted by the file name character string extraction means 8 is stored in the reference destination file name 312 of the file name appearance location information table 31 (S604).
Next, the system date and time at the time of extraction is stored in the appearance location extraction date and time 313 of the file name appearance location information table 31 (S605).
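The nested loop of FIG. 6 can be sketched as follows. This is a minimal illustration under stated assumptions: rows of table 31 are modelled as plain tuples, and file reading, name extraction, and the clock are passed in as callables (`read_text`, `extract_names`, `now`), all of which are hypothetical names introduced here.

```python
from datetime import datetime
from typing import Callable, Iterable, List, Optional, Tuple

# (reference source path 311, reference destination name 312, extracted at 313)
Row31 = Tuple[str, str, datetime]


def extract_appearances(
    indexed_paths: Iterable[str],
    read_text: Callable[[str], str],
    extract_names: Callable[[str], Optional[List[str]]],
    now: Callable[[], datetime] = datetime.now,
) -> List[Row31]:
    rows: List[Row31] = []
    for path in indexed_paths:                   # outer loop over files from S601
        names = extract_names(read_text(path))   # call to extraction means 8
        if not names:                            # S602: nothing extracted
            continue
        for name in names:                       # inner loop: one record per name
            rows.append((path, name, now()))     # S603, S604, S605
    return rows
```

Passing the extractor in as a parameter keeps the loop independent of how file names are actually recognised.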

FIG. 7 is a flowchart showing the operation of the reference destination file path specifying means 10.
The reference destination file path specifying means 10 is a process that identifies the reference destination file path from the reference destination file name 312 in the file name appearance location information table 31.
The process of FIG. 7 is called from within the process shown in the flowchart of FIG. 8.
The indexed file information storage DB 2 is searched using, as a search key against the file name 212, the reference destination file name 312 of the file name appearance location information table 31 that was specified when the process was called from the "weak relationship" analysis means 11 (S701).
A conditional branch is taken depending on whether the search produced a hit (S702).

If the search produced no hit, the process ends and a NULL value is returned to the process of the "weak relationship" analysis means 11.
If the search produced a hit, steps S703 to S708 are repeated until all hit files have been processed (files with the same name as a given reference destination file name 312 may exist at several file paths, so the file at each path is processed separately).

Next, the appearance location extraction date and time 313 of the record corresponding to the reference source file path 311 that was specified when the process was called from the "weak relationship" analysis means 11 is acquired from the file name appearance location information storage DB 3 (S703).
Next, the update date and time 213 of the hit file (the index creation completion date and time) is acquired from the indexed file information storage DB 2 (S704).
The difference between the acquired appearance location extraction date and time 313 and the update date and time 213 is calculated (S705).
The allowable range value is acquired from the reference destination specifying setting data 14 (S706).
A conditional branch is taken depending on whether the difference falls within the allowable range acquired in S706 (S707).

If the difference is within the allowable range, the file path 211 of the file is acquired from the indexed file information storage DB 2 (S708).
If several values of the file path 211 are acquired in S708 (that is, files with the same name as the reference destination file name 312 exist at several file paths), they are returned together to the process of the "weak relationship" analysis means 11 after the loop exits.
Steps S703 to S707 serve to exclude files whose index completion date and time is old, and are added as necessary.
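The lookup and optional date filter of FIG. 7 can be sketched as below. The details are assumptions made for illustration: indexed records are modelled as a `NamedTuple`, the difference in S705 is taken as an absolute value, a `tolerance` of `None` stands for omitting the optional steps S703 to S707, and `None` is returned when every hit is filtered out (a case the description does not address).

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional, Sequence


class IndexedFile(NamedTuple):
    file_path: str              # 211
    file_name: str              # 212
    index_completed: datetime   # 213


def resolve_target_paths(
    target_name: str,
    extracted_at: datetime,
    indexed: Sequence[IndexedFile],
    tolerance: Optional[timedelta] = None,
) -> Optional[List[str]]:
    hits = [f for f in indexed if f.file_name == target_name]   # S701
    if not hits:                                                # S702: no hit
        return None
    paths: List[str] = []
    for f in hits:
        if tolerance is not None:                               # optional S703-S707
            diff = abs(extracted_at - f.index_completed)        # S705 (absolute value assumed)
            if diff > tolerance:                                # S707: outside the range
                continue
        paths.append(f.file_path)                               # S708
    return paths or None                                        # returned together after the loop
```

All surviving paths are returned in one list, which corresponds to returning the values together after the loop exits.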

FIG. 8 is a flowchart showing the operation of the "weak relationship" analysis means 11.
The "weak relationship" analysis means 11 is a process that analyzes "weak relationships" and derives the correspondence between reference source file paths 411 and reference destination file paths 412.
After the process shown in the flowchart of FIG. 6 ends, the scheduler 6 starts the process of FIG. 8. First, the information on the reference source files is acquired from the file name appearance location information storage DB 3 (S801).
Then, until all reference source files have been processed, the call to the reference destination file path specifying means 10, the conditional branch of S802, and the inner loop of S803 to S805 are repeated.

Within this loop, the reference destination file path specifying means 10 is called first.
A conditional branch is taken depending on whether the reference destination file path specifying means 10 was able to identify a path (S802).
If no path could be identified, processing moves on to the next reference source file.
If a path could be identified, steps S803 to S805 are repeated until all identified paths have been processed. The value of the reference source file path 311 of the reference source file is stored in the reference source file path 411 of the "weak relationship" information table 41 (S803). The value of the reference destination file path identified by the reference destination file path specifying means 10 is stored in the reference destination file path 412 of the "weak relationship" information table 41 (S804). Finally, the system date and time at the time of identification is stored in the related information update date and time 413 of the "weak relationship" information table 41 (S805).
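The analysis loop of FIG. 8 can be sketched in a few lines. As an illustration only, rows of tables 31 and 41 are modelled as tuples, and the path lookup and the clock are supplied as callables; the names `analyze_weak_relations`, `resolve`, and `now` are hypothetical.

```python
from datetime import datetime
from typing import Callable, Iterable, List, Optional, Tuple

Row31 = Tuple[str, str, datetime]   # (source path 311, target name 312, extracted at 313)
Row41 = Tuple[str, str, datetime]   # (source path 411, target path 412, updated at 413)


def analyze_weak_relations(
    appearances: Iterable[Row31],
    resolve: Callable[[str, datetime], Optional[List[str]]],
    now: Callable[[], datetime] = datetime.now,
) -> List[Row41]:
    relations: List[Row41] = []
    for source_path, target_name, extracted_at in appearances:   # rows acquired in S801
        paths = resolve(target_name, extracted_at)               # call to specifying means 10
        if not paths:                                            # S802: path not identified
            continue
        for target_path in paths:                                # inner loop S803-S805
            relations.append((source_path, target_path, now()))
    return relations
```

A reference source that names a file found at two paths therefore yields two rows, one per reference destination path.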

FIG. 9 is a flowchart showing the operation of the related file display means 12.
The related file display means 12 is a process that displays a list of the files related to a given file.
The lead-up to S901 in FIG. 9 is as follows: processing starts when, in a screen 100 such as the one shown in the conceptual diagram of FIG. 10, the link 102 to related files associated with a certain file 101 is clicked (specified).
First, the file path 103 of the file 101 associated with the link 102 is acquired (S901). The "weak relationship" information storage DB 4 is then searched using that file path as a search key against the reference source file path 411 (S902).
A conditional branch is taken depending on whether the search produced a hit (S903). If there was no hit, a message 111 indicating "no related files" is displayed in the screen 110, as shown in FIG. 11 (S904).

If the search produced a hit, steps S905 and S906 are repeated until all hit records have been processed. That is, the reference destination file path 412 corresponding to the hit reference source file path 411 is acquired from the "weak relationship" information storage DB 4 (S905). Next, the related information update date and time 413 corresponding to the hit reference source file path 411 is acquired from the "weak relationship" information storage DB 4 (S906). Finally, a list 121 of links to the reference destination file paths 412 and the related information update dates and times 413 acquired in S905 and S906 is displayed in the screen 120 (S907).
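The lookup behind FIG. 9 can be sketched as follows. This is a simplified illustration: the screens 110 and 120 are replaced by a list of text lines, rows of table 41 are modelled as tuples, and the function name, the message wording, and the date format are assumptions.

```python
from datetime import datetime
from typing import Iterable, List, Tuple

Row41 = Tuple[str, str, datetime]   # (source path 411, target path 412, updated at 413)

NO_RELATED = "No related files"     # wording is arbitrary, as noted for FIG. 11


def render_related(relations: Iterable[Row41], file_path: str) -> List[str]:
    # S902: search table 41 with the clicked file's path as the key on 411.
    hits = [(target, updated) for source, target, updated in relations if source == file_path]
    if not hits:                    # S903: no hit
        return [NO_RELATED]         # S904
    # S905-S907: one line per record, showing the target path and the update time.
    return [f"{target}  (updated {updated:%Y-%m-%d %H:%M})" for target, updated in hits]
```

In a real screen each line would be a link to the reference destination file path rather than plain text.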

FIG. 10 is a conceptual diagram of a screen of the related file display means 12 on which the link to related files is displayed. The content of the screen 100 is only an example. The wording of the link 102 to related files is not limited to what is illustrated in FIG. 10; any wording is acceptable as long as it conveys that the link leads to related files.

FIG. 11 is a conceptual diagram of the related file information display screen of the related file display means 12 when no related file exists. The content of the screen 110 is only an example. The wording of the "no related files" message 111 is not limited to what is illustrated in FIG. 11; any wording that conveys this is acceptable.

FIG. 12 is a conceptual diagram of the related file information display screen of the related file display means 12 when related files exist. The content of the screen 120 is only an example. The wording of the list 121 is not limited to what is illustrated in FIG. 12; any wording is acceptable as long as it conveys the list of files related to a given file.

DESCRIPTION OF SYMBOLS
1 Relevance analysis server
2 Indexed file information storage DB
3 File name appearance location information storage DB
4 "Weak relationship" information storage DB
5 Index data
6 Scheduler
7 Indexer
8 File name character string extraction means
9 File name appearance location extraction means
10 Reference destination file path specifying means
11 "Weak relationship" analysis means
12 Related file display means
13 File name character string determination rule list
14 Reference destination specifying setting data
15 Scheduler setting data
16 Network
17 File server
18 Storage device connected to file server 17
19 Search target file

Claims

A method for analyzing relevancy between files, comprising:
a first step of acquiring file name and file path information from indexed file information stored in a first database;
a second step of executing, for all file data, a process of acquiring the data of each file based on the acquired file path, determining whether a character string indicating a file name that conforms to a file name character string determination rule is described in the file data, and, if it is described, storing in a second database the file path of the file in which it is described as a reference source file path and the described file name as a reference destination file name, in association with information on the detection date and time; and
a third step of executing, for all file data, a process of acquiring the reference source file paths stored in the second database, searching, for the file of each reference source file path, the indexed file information stored in the first database using the reference destination file name stored in the second database as a search key, determining whether a file whose file name matches the search key exists, and, if it exists, acquiring the file path of that file from the first database and storing in a third database, as information on files having a weak relationship in which the relation between reference source and reference destination has been identified, the reference source file path, the reference destination file path, and the date and time at which the relevancy was determined,
whereby information on files having a weak relationship, in which the reference source and the reference destination are not specified by a hyperlink, is extracted.

The method for analyzing relevancy between files according to claim 1, wherein the indexed file information stored in the first database includes information on the creation completion date and time of the indexed file information, the method further comprising a fourth step of comparing, in the third step, the creation completion date and time of the indexed file information with the date and time at which a file whose file name matches the search key was determined to exist, and storing the reference source file path, the reference destination file path, and the date and time at which the relevancy was determined in the third database only when the difference is within an allowable range.

The method for analyzing relevancy between files according to claim 1 or 2, further comprising a fifth step of, when the file path of a file associated with an arbitrary link is specified, searching the third database using that file path as a search key and, if a reference source file path matching the search key exists, acquiring from the third database and displaying a link to the corresponding reference destination file path and the date and time at which the relevancy was determined.

A system for analyzing relevancy between files, comprising:
first means for acquiring file name and file path information from indexed file information stored in a first database;
second means for executing, for all file data, a process of acquiring the data of each file based on the acquired file path, determining whether a character string indicating a file name that conforms to a file name character string determination rule is described in the file data, and, if it is described, storing in a second database the file path of the file in which it is described as a reference source file path and the described file name as a reference destination file name, in association with information on the detection date and time; and
third means for executing, for all file data, a process of acquiring the reference source file paths stored in the second database, searching, for the file of each reference source file path, the indexed file information stored in the first database using the reference destination file name stored in the second database as a search key, determining whether a file whose file name matches the search key exists, and, if it exists, acquiring the file path of that file from the first database and storing in a third database, as information on files having a weak relationship in which the relation between reference source and reference destination has been identified, the reference source file path, the reference destination file path, and the date and time at which the relevancy was determined,
whereby information on files having a weak relationship, in which the reference source and the reference destination are not specified by a hyperlink, is extracted.

The system for analyzing relevancy between files according to claim 4, wherein the indexed file information stored in the first database includes information on the creation completion date and time of the indexed file information, the system further comprising fourth means for comparing, in the processing of the third means, the creation completion date and time of the indexed file information with the date and time at which a file whose file name matches the search key was determined to exist, and storing the reference source file path, the reference destination file path, and the date and time at which the relevancy was determined in the third database only when the difference is within an allowable range.

The system for analyzing relevancy between files according to claim 4 or 5, further comprising fifth means for, when the file path of a file associated with an arbitrary link is specified, searching the third database using that file path as a search key and, if a reference source file path matching the search key exists, acquiring from the third database and displaying a link to the corresponding reference destination file path and the date and time at which the relevancy was determined.

A program for analyzing relevancy between files, the program causing a server that analyzes relevancy between files to function as:
first means for acquiring file name and file path information from indexed file information stored in a first database;
second means for executing, for all file data, a process of acquiring the data of each file based on the acquired file path, determining whether a character string indicating a file name that conforms to a file name character string determination rule is described in the file data, and, if it is described, storing in a second database the file path of the file in which it is described as a reference source file path and the described file name as a reference destination file name, in association with information on the detection date and time; and
third means for executing, for all file data, a process of acquiring the reference source file paths stored in the second database, searching, for the file of each reference source file path, the indexed file information stored in the first database using the reference destination file name stored in the second database as a search key, determining whether a file whose file name matches the search key exists, and, if it exists, acquiring the file path of that file from the first database and storing in a third database, as information on files having a weak relationship in which the relation between reference source and reference destination has been identified, the reference source file path, the reference destination file path, and the date and time at which the relevancy was determined,
whereby information on files having a weak relationship, in which the reference source and the reference destination are not specified by a hyperlink, is extracted.

The program for analyzing relevancy between files according to claim 7, wherein the indexed file information stored in the first database includes information on the creation completion date and time of the indexed file information, the program further causing the server to function as fourth means for comparing, in the processing of the third means, the creation completion date and time of the indexed file information with the date and time at which a file whose file name matches the search key was determined to exist, and storing the reference source file path, the reference destination file path, and the date and time at which the relevancy was determined in the third database only when the difference is within an allowable range.

The program for analyzing relevancy between files according to claim 7 or 8, further causing the server to function as fifth means for, when the file path of a file associated with an arbitrary link is specified, searching the third database using that file path as a search key and, if a reference source file path matching the search key exists, acquiring from the third database and displaying a link to the corresponding reference destination file path and the date and time at which the relevancy was determined.