JPH06103127A - Device for managing hash file data and method thereof

Info

Publication number: JPH06103127A
Application number: JP4252384A
Authority: JP
Inventors: Toshihiko Kogure; 俊彦小暮
Original assignee: Kanebo Ltd
Current assignee: Kanebo Ltd
Priority date: 1992-09-22
Filing date: 1992-09-22
Publication date: 1994-04-15

Abstract

PURPOSE: To provide a hash file data management device capable of maintaining a large quantity of data at high speed. CONSTITUTION: Input data are divided according to the hash value ranges 0-10 and 11-20 and stored in input files IF1 and IF2. Data input devices 1 and 2 input the data from the input files IF1 and IF2 to a database management device 3. A hash file constituting one logical file is physically divided into subfiles SF1 and SF2 according to the hash value ranges 0-10 and 11-20. The data input by the data input devices 1 and 2 are stored independently and in parallel in the corresponding subfiles SF1 and SF2 through disk controllers 6 and 7.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a hash file data management device and a hash file data management method for managing data by a hashing method.

[0002]

[Prior Art] The hashing method (randomization method) is a technique for quickly retrieving an arbitrary record from a plurality of records stored in a database. In the hashing method, a key that identifies a record (record key) is converted into another numerical value by a fixed function, and that value is used as the storage address on the storage medium. The function is called a hash function, and its value is called a hash value.

[0003] Let k be the key and f the hash function. The hash value p is then given by the following equation.

[0004] p = f(k)

Obtaining the hash value p from the key k with the hash function f in this way is called hashing (randomization). Each record is stored at the storage address obtained by the hash function, and on access the same hash function is used to determine the storage address of the record. The hashing method therefore allows records to be retrieved quickly.
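The relationship p = f(k) can be illustrated with a minimal Python sketch. The byte-sum hash function, the slot count `NUM_SLOTS`, and the helper names `store` and `lookup` are assumptions made for illustration only; the publication does not specify a particular hash function.

```python
# Minimal sketch of the hashing method: p = f(k) gives the storage address.
# NUM_SLOTS and the byte-sum hash are illustrative assumptions, not from the patent.
NUM_SLOTS = 21  # hypothetical number of storage addresses (hash values 0..20)


def f(k: str) -> int:
    """Toy hash function: sum of the key's bytes modulo the slot count."""
    return sum(k.encode("utf-8")) % NUM_SLOTS


slots = [None] * NUM_SLOTS  # one storage area per hash value


def store(k: str, data: str) -> None:
    """Store a record at the address given by its hash value."""
    slots[f(k)] = (k, data)


def lookup(k: str):
    """Recompute the address with the same hash function and fetch the record."""
    entry = slots[f(k)]
    return entry[1] if entry is not None and entry[0] == k else None
```

Because `lookup` recomputes the address instead of scanning, retrieval cost does not depend on the number of stored records. This sketch simply overwrites on a collision; a real hash file would need a collision-handling scheme.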

[0005] When hash values are computed with the above equation, several keys may yield the same hash value. This is called a collision, and keys that share a hash value are called synonyms. The hash function is chosen so that such collisions are as few as possible.
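Collisions can be made visible by grouping keys by hash value. The following Python sketch is illustrative only; the function name `find_synonyms` and the idea of passing the hash function as a parameter are assumptions, not part of the publication.

```python
from collections import defaultdict


def find_synonyms(keys, hash_fn):
    """Group keys by hash value and return only the groups that collide."""
    buckets = defaultdict(list)
    for k in keys:
        buckets[hash_fn(k)].append(k)
    # A hash value shared by two or more keys is a collision; its keys are synonyms.
    return {p: ks for p, ks in buckets.items() if len(ks) > 1}
```

Running such a check over a representative key set for several candidate hash functions is one simple way to compare how many collisions each would produce.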

[0006] FIG. 8 is a block diagram showing a conventional hash file data management device. An input file IF stores a plurality of records. A data input device 11 reads the records stored in the input file IF one at a time. A database management device 12 converts the key of each record supplied by the data input device 11 into a hash value and passes the record and the hash value to a disk controller 13. The database management device 12 also tells the disk controller 13 which type of processing to perform. The disk controller 13 accesses a hash file 14 based on the hash value and performs the indicated processing on the hash file 14.

[0007]

[Problems to Be Solved by the Invention] Conventionally, hash file design has focused on the logical structure, and little attention has been paid to which data storage area holds data with a given hash value. Consequently, even when one hash file is physically divided into a plurality of subfiles and multiple access through a plurality of disk controllers is possible, the database management device cannot issue access instructions to the disk controllers in an order that makes effective use of them. For example, it issues accesses one record at a time in order of the hash values of the data keys.

[0008] As a result, access instructions are concentrated on one particular disk controller while the other disk controllers sit idle waiting for access.

[0009] Thus, although individual data items in a hash file can be accessed quickly, high-speed processing is not achieved when a large amount of data is accessed.

[0010] An object of the present invention is to provide a hash file data management device and a hash file data management method capable of maintaining a large amount of data at high speed.

[0011]

[Means for Solving the Problems] A hash file data management device according to the present invention comprises hash file means, a plurality of input means, and access means.

[0012] The hash file means stores data by the hashing method. The hash file means is physically divided into a plurality of subfiles according to continuous ranges of hash values of the data to be stored, and constitutes one logical file. Each of the plurality of input means is assigned a different continuous range of hash values and inputs data whose hash values fall within its assigned range. The access means calculates the hash value of the data input by each of the plurality of input means and accesses the corresponding subfile based on the calculated hash value. The access means can access at least two subfiles simultaneously. A hash file data management method according to the present invention includes the following steps.

[0013] A hash file constituting one logical file is physically divided into a plurality of subfiles according to continuous ranges of hash values of the data to be stored. The data to be input is assigned in advance to a plurality of input means according to continuous ranges of hash values. The assigned data are input simultaneously by the plurality of input means. The hash value of each input data item is calculated, and the corresponding subfiles are accessed simultaneously based on the calculated hash values.
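The step of assigning input data to the input means in advance can be sketched as follows. This is an illustration under stated assumptions: the hash function `hash_key` (which takes the numeric prefix of a key such as "5-A") and the function name `assign_to_input_files` are hypothetical, since the publication does not define the hashing method.

```python
def hash_key(key: str) -> int:
    """Hypothetical hashing method: the numeric prefix of the key ("5-A" -> 5)."""
    return int(key.split("-")[0])


def assign_to_input_files(records, ranges):
    """Split records into one input file per continuous hash value range.

    records: iterable of tuples whose first element is the key.
    ranges:  list of inclusive (low, high) hash value ranges, one per input means.
    """
    files = [[] for _ in ranges]
    for record in records:
        h = hash_key(record[0])
        for i, (low, high) in enumerate(ranges):
            if low <= h <= high:
                files[i].append(record)
                break
        else:
            # No range covers this hash value, so the record cannot be assigned.
            raise ValueError(f"hash value {h} is outside every range")
    return files
```

With the ranges (0, 10) and (11, 20), the two returned lists play the role of two separate input files, each of which can then be read by its own input means.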

[0014]

[Operation] In the hash file data management device and the hash file data management method according to the present invention, the hash file constituting one logical file is physically divided into a plurality of subfiles according to ranges of hash values of the data to be stored, and the data to be input is assigned to a plurality of input means according to ranges of hash values and input simultaneously. This reduces the concentration of simultaneous accesses on any particular subfile, and input processing proceeds concurrently.

[0015] The processing time is therefore shortened when a large amount of data is maintained in a batch. As a result, a large amount of data can be maintained at high speed without impairing the quick data retrieval function.

[0016]

[Embodiments] Embodiments of the present invention will now be described in detail with reference to the drawings.

[0017] FIG. 1 is a block diagram showing the configuration of a hash file data management device according to an embodiment of the present invention. The device includes data input devices 1 and 2, a database management device 3, a hashing device 4, a relation table 5, disk controllers 6 and 7, and a hash file 8. The hash file 8 is physically divided into two subfiles SF1 and SF2, which together constitute one logical file.

[0018] The data input device 1 reads data from the input file IF1 and passes it to the database management device 3. The data input device 2 reads data from the input file IF2 and passes it to the database management device 3. The input file IF1 stores a plurality of records having hash values "0" to "10", and the input file IF2 stores a plurality of records having hash values "11" to "20".

[0019] The hashing device 4 converts the key of each record input by the data input devices 1 and 2 into a hash value using a predetermined hashing method (hash function).

[0020] The relation table 5 gives the correspondence between hash value ranges and subfiles. FIG. 2 shows an example of the relation table 5. In the relation table of FIG. 2, hash values "0" to "10" are assigned to the subfile SF1, and hash values "11" to "20" are assigned to the subfile SF2.
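A relation table of this kind can be represented as a list of inclusive ranges. The sketch below uses the ranges of FIG. 2; the names `RELATION_TABLE` and `subfile_for` are hypothetical and chosen only for illustration.

```python
# Relation table in the style of FIG. 2: inclusive hash value range -> subfile name.
RELATION_TABLE = [
    (0, 10, "SF1"),
    (11, 20, "SF2"),
]


def subfile_for(hash_value: int) -> str:
    """Return the subfile whose hash value range contains hash_value."""
    for low, high, subfile in RELATION_TABLE:
        if low <= hash_value <= high:
            return subfile
    raise ValueError(f"hash value {hash_value} is not covered by the relation table")
```

For example, `subfile_for(5)` yields "SF1" and `subfile_for(14)` yields "SF2", matching the assignments described above.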

[0021] The database management device 3 refers to the relation table 5 and passes each record supplied by the data input devices 1 and 2, together with its hash value, to one of the disk controllers 6 and 7.

[0022] The disk controllers 6 and 7 access the subfiles SF1 and SF2 in the hash file 8, respectively, based on the records and corresponding hash values supplied by the database management device 3.

[0023] The database management device 3 and the hashing device 4 can process the records supplied by the data input devices 1 and 2 simultaneously. The disk controllers 6 and 7 can therefore access the subfiles SF1 and SF2 independently and in parallel.

[0024] Next, the operation of the hash file data management device of FIG. 1 will be described with reference to the flowchart of FIG. 3.

[0025] FIG. 4 shows an example of the input files IF1 and IF2. Each of the input files IF1 and IF2 stores eight records, each containing a key, data, and a processing class. Processing class "1" denotes addition, and processing class "2" denotes change.

[0026] Here, it is assumed that the data input devices 1 and 2 read the input files IF1 and IF2 shown in FIG. 4.

[0027] First, the data input device 1 reads the first record, which has the key "5-A", from the input file IF1 (step S11). The database management device 3 holds the record and passes it to the hashing device 4. The hashing device 4 obtains the hash value "5" from the key "5-A" by the predetermined hashing method (step S12).
The record is stored and given to the hashing device 4. The hashing device 4 obtains the hash value "5" from the key "5-A" by a predetermined hashing method (step S12).

[0028] The database management device 3 refers to the relation table 5 and selects the subfile SF1 as the file to be processed (step S13). It then obtains the address of the data storage area in the subfile SF1 from the hash value "5" and determines the type of processing from the processing class of the record. Here the processing class is "1", so it passes the record, without the processing class, to the disk controller 6 and instructs it to add the record.

[0029] The disk controller 6 accordingly accesses the subfile SF1 (step S14) and adds the record (step S15). As a result, as shown in FIG. 5, the record having the key "5-A" is added to the data storage area corresponding to the hash value "5" in the subfile SF1.

[0030] Meanwhile, the data input device 2 reads the first record, which has the key "14-A", from the input file IF2 (step S21). The database management device 3 holds the record and passes it to the hashing device 4. The hashing device 4 obtains the hash value "14" from the key "14-A" by the predetermined hashing method (step S22).

[0031] The database management device 3 refers to the relation table 5 and selects the subfile SF2 as the file to be processed (step S23). It then obtains the address of the data storage area in the subfile SF2 from the hash value "14" and determines the type of processing from the processing class of the record. Here the processing class is "2", so it passes the record, without the processing class, to the disk controller 7 and instructs it to change the record.

[0032] The disk controller 7 accordingly accesses the subfile SF2 (step S24) and changes the record (step S25). As a result, as shown in FIG. 5, the record in the data storage area corresponding to the hash value "14" in the subfile SF2 is replaced with the record having the key "14-A".

[0033] The processing of steps S11 to S15 and the processing of steps S21 to S25 are performed independently and in parallel. The subfiles SF1 and SF2 can therefore be accessed simultaneously.
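The two flows, steps S11 to S15 and steps S21 to S25, can be modeled with two threads. This is a sketch under stated assumptions, not the patented implementation: the hash function `hash_key` (numeric prefix of the key), the in-memory dictionaries standing in for subfiles, and the per-subfile lock standing in for a disk controller are all hypothetical. It keeps one record per hash value and omits collision handling.

```python
import threading

RELATION_TABLE = [(0, 10, "SF1"), (11, 20, "SF2")]  # ranges as in FIG. 2
ADD, CHANGE = "1", "2"  # processing classes: "1" = addition, "2" = change


def hash_key(key: str) -> int:
    """Hypothetical hashing method: numeric prefix of the key ("5-A" -> 5)."""
    return int(key.split("-")[0])


def select_subfile(hash_value: int) -> str:
    """Look up the subfile for a hash value in the relation table."""
    for low, high, name in RELATION_TABLE:
        if low <= hash_value <= high:
            return name
    raise ValueError(f"no subfile for hash value {hash_value}")


def run_input_device(records, subfiles, locks):
    """Process one input file; the step numbers refer to the flowchart of FIG. 3."""
    for key, data, processing_class in records:   # read a record (S11 / S21)
        h = hash_key(key)                          # compute the hash value (S12 / S22)
        name = select_subfile(h)                   # choose the subfile (S13 / S23)
        with locks[name]:                          # access the subfile (S14 / S24)
            if processing_class == ADD:
                subfiles[name][h] = (key, data)    # add the record (S15 / S25)
            elif processing_class == CHANGE and h in subfiles[name]:
                subfiles[name][h] = (key, data)    # replace the existing record


def maintain(input_file_1, input_file_2, subfiles):
    """Run both input flows independently and wait for both to finish."""
    locks = {name: threading.Lock() for name in subfiles}
    workers = [
        threading.Thread(target=run_input_device, args=(recs, subfiles, locks))
        for recs in (input_file_1, input_file_2)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

Because each input file only contains keys whose hash values belong to one subfile, the two threads never contend for the same lock in this configuration, which is the property the embodiment relies on.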

[0034] In the same way, the subfile SF1 is processed based on the second to eighth records in the input file IF1, and the subfile SF2 is processed based on the second to eighth records in the input file IF2. As a result, the contents of the subfiles SF1 and SF2 in the hash file 8 are updated as shown in FIG. 5.

[0035] In the hash file data management device of FIG. 1, the subfiles SF1 and SF2 can be accessed independently and in parallel, so a large amount of data in the hash file 8 can be maintained in a short time.

[0036] FIG. 6 is a block diagram showing the operation of the hash file data management device when the hash value ranges into which the input files IF1 and IF2 are divided differ from the hash value ranges into which the subfiles SF1 and SF2 are divided.

[0037] The input file IF1 stores a plurality of records having hash values "0" to "1010", and the input file IF2 stores a plurality of records having hash values "1011" to "2000".

[0038] The database management device 3 basically passes the data supplied by the data input device 1 to the disk controller 6 and the data supplied by the data input device 2 to the disk controller 7. Normally, therefore, records having hash values "0" to "1010" would be stored in the subfile SF1, and records having hash values "1011" to "2000" would be stored in the subfile SF2.

[0039] However, when the range of hash values supplied by the data input device 1 differs from the hash value range of the subfile SF1, or the range of hash values supplied by the data input device 2 differs from the hash value range of the subfile SF2, the database management device 3 can pass part of the data supplied by the data input device 1 to the disk controller 7, or part of the data supplied by the data input device 2 to the disk controller 6.

[0040] In the example of FIG. 6, hash values "0" to "1000" are assigned to the subfile SF1, and hash values "1001" to "2000" are assigned to the subfile SF2. In this case, of the data supplied by the data input device 1, the database management device 3 passes the data having hash values "0" to "1000" to the disk controller 6 and the data having hash values "1001" to "1010" to the disk controller 7.
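The routing in the FIG. 6 example can be sketched with a binary search over the upper bounds of the subfile ranges. The names `UPPER_BOUNDS`, `CONTROLLERS`, `controller_for`, and `route` are hypothetical; only the ranges and the controller numbers come from the text.

```python
from bisect import bisect_left

UPPER_BOUNDS = [1000, 2000]  # highest hash value held by SF1 and SF2 (FIG. 6 example)
CONTROLLERS = [6, 7]         # disk controller serving SF1 and SF2, respectively


def controller_for(hash_value: int) -> int:
    """Pick the disk controller from the hash value, not from the input device."""
    if not 0 <= hash_value <= UPPER_BOUNDS[-1]:
        raise ValueError(f"hash value {hash_value} is out of range")
    # bisect_left returns the index of the first range whose upper bound >= hash_value.
    return CONTROLLERS[bisect_left(UPPER_BOUNDS, hash_value)]


def route(hash_values):
    """Group the hash values supplied by one input device by target controller."""
    routed = {controller: [] for controller in CONTROLLERS}
    for h in hash_values:
        routed[controller_for(h)].append(h)
    return routed
```

Applied to the hash values 0 to 1010 supplied by the data input device 1, `route` sends 0 to 1000 to controller 6 and the remaining 1001 to 1010 to controller 7, as described above.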

[0041] The configuration and operation of the other parts are the same as those of the hash file data management device of FIG. 1.

[0042] The hash file data management device of FIG. 6 provides substantially the same effect as the hash file data management device of FIG. 1. This is explained with reference to FIG. 7.

[0043] As shown in FIG. 7, assume that the hash file is divided into three subfiles SF1, SF2, and SF3 at the hash values H1 and H2, and that the input data is divided into three groups DA1, DA2, and DA3 at the hash values H1′ and H2′. Here, H1 < H1′ and H2 < H2′.

[0044] The input data of the groups DA1, DA2, and DA3 are input simultaneously from three data input devices and stored in the subfiles SF1, SF2, and SF3, respectively, using three disk controllers. However, part of the data in the group DA1, denoted HA, is stored in the subfile SF2 along with the data of the group DA2. Likewise, part of the data in the group DA2, denoted HB, is stored in the subfile SF3 along with the data of the group DA3.

[0045] Two disk controllers may therefore access one subfile at the same time, in which case the access speed of that subfile decreases. However, most of the data, that is, everything except the portions HA and HB, is stored in its corresponding subfile.

[0046] Therefore, even in this case, the time required to process all the data in the hash file is approximately 1/(number of divisions) of that required by the conventional hash file data management device.
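The 1/(number of divisions) figure can be checked with a rough back-of-the-envelope model. The sketch below assumes that every record costs the same access time, that the disk controllers overlap completely, and that contention from the overlapping portions HA and HB is negligible; these are simplifying assumptions, and the function name `relative_time` is hypothetical.

```python
def relative_time(records_per_subfile):
    """Estimate parallel processing time as a fraction of the serial time.

    The serial time is proportional to the total record count; the parallel
    time is set by the busiest subfile, since the subfiles are processed at once.
    """
    total = sum(records_per_subfile)
    if total == 0:
        raise ValueError("no records to process")
    return max(records_per_subfile) / total
```

With two equally loaded subfiles the estimate is 0.5, and with three it is about 0.33. A slightly uneven split such as 1001 and 1000 records, as in the FIG. 6 ranges with one record per hash value, still gives a value very close to 0.5.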

[0047] Data is stored most efficiently, however, when the input data and the hash file are divided at the same hash value ranges.

[0048] In the above embodiments, the input data and the hash file are each divided into two parts, but the input files and the hash file may be divided into three or more parts. In that case, the data in the hash file can be maintained in an even shorter time.

[0049]

[Effects of the Invention] As described above, the present invention makes it possible to reduce the concentration of simultaneous accesses on a particular subfile. A large amount of data can therefore be maintained at high speed without impairing the quick data retrieval function.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a hash file data management device according to an embodiment of the present invention.

FIG. 2 is a diagram showing an example of a relation table.

FIG. 3 is a flowchart for explaining the operation of the hash file data management device of FIG. 1.

FIG. 4 is a diagram showing an example of the input files.

FIG. 5 is a diagram showing the state of the hash file after processing.

FIG. 6 is a block diagram showing the configuration of a hash file data management device according to another embodiment of the present invention.

FIG. 7 is a diagram for explaining the effect of the hash file data management device of FIG. 6.

FIG. 8 is a block diagram showing the configuration of a conventional hash file data management device.

[Explanation of symbols]

1, 2: data input devices
3: database management device
4: hashing device
5: relation table
6, 7: disk controllers
8: hash file
IF1, IF2: input files
SF1, SF2: subfiles

In the drawings, the same reference numerals denote the same or corresponding parts.

Claims

1. A hash file data management device comprising: hash file means in which data is stored by a hashing method, the hash file means being physically divided into a plurality of subfiles according to continuous ranges of hash values of the data to be stored and constituting one logical file; a plurality of input means, each assigned a different continuous range of hash values, for inputting data having hash values within the assigned range; and access means for calculating a hash value of the data input by each of the plurality of input means and accessing the corresponding subfile based on the calculated hash value, wherein the access means is capable of simultaneously accessing at least two subfiles.

2. A hash file data management method comprising: physically dividing a hash file constituting one logical file into a plurality of subfiles according to continuous ranges of hash values of the data to be stored; assigning the data to be input in advance to a plurality of input means according to continuous ranges of hash values; simultaneously inputting the assigned data by the plurality of input means; and calculating a hash value of each input data item and simultaneously accessing the corresponding subfiles based on the calculated hash values.