JP2006031631A

JP2006031631A - Data base processing system and method, and program for data base processing

Info

Publication number: JP2006031631A
Application number: JP2004213407A
Authority: JP
Inventors: Toshiyuki Inazaki; 敏之稲崎
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2004-07-21
Filing date: 2004-07-21
Publication date: 2006-02-02

Abstract

PROBLEM TO BE SOLVED: To provide a data base retrieval system capable of speeding up processing on the whole, for all data base processing systems.

SOLUTION: The data base processing system is provided with a grouping means 11 for grouping transaction data constituted of a set of a series of processing for a master data base by a predetermined rule, a master extraction means 13 for extracting a sub data base including data which can process the transaction data from the master data base on the basis of the transaction data in a grouped specific group and a transaction processing means 14 for processing the transaction data grouped for the sub data base.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to a database processing system, and in particular to a database processing system, such as a batch system or an online system, that obtains output results by performing search processing such as database lookups on a large volume of transaction data. It also relates to a database processing method and a database processing program.

Techniques for speeding up database processing have long been under development; for example, the following two techniques exist.

First, "caching" keeps data that has been accessed once in the database in memory for a certain period, so that the next access to the same data is served from memory without a disk access, which speeds up processing. However, the unit copied into a cache is generally a physical block, so when the keys of the transaction data are spread over a wide range that is large relative to the cache capacity, the cache hit rate drops.

Second, "pre-sorting" sorts the transaction data in advance in the order of the keys used for database access, and builds into the business logic that the DB is accessed only when the key value of the transaction data breaks (changes), which speeds up processing. However, pre-sorting cannot be applied to processing in which the order of the transaction data is meaningful and sorting is therefore unsuitable. In addition, when the key distribution in the transaction data is wide and the rate at which key values repeat consecutively is low, sorting may increase the processing time of the system as a whole and be counterproductive. Furthermore, it applies only to batch processing that can be sorted in advance and so lacks generality, and the business logic must be written to handle the key breaks.

Patent Document 1 below discloses a system that creates a specific database serving a business purpose required by a user, based on instruction information from that user, and performs processing on that database. However, in such a configuration the target database changes whenever the user changes, so effective processing cannot be expected for a newly created database.

Patent Document 1: JP 2002-366401 A

Thus, in the conventional database search systems described above, each lookup takes time when the population of records to be searched is large, and with a large volume of transactions the total processing time becomes long. This has been addressed by, for example, using caching to exploit physically fast devices effectively, but because such approaches act uniformly and generically on the system, the areas in which they are effective are subject to conditions and limits. That is, they cannot speed up all database processing.

Taking business characteristics into account can make processing markedly more efficient in certain specific environments, but this generally requires a business-specific approach, with much case-by-case work such as analyzing the characteristics and then judging how effective the approach will be. Here too, not all database processing can be sped up.

Accordingly, an object of the present invention is to remedy the disadvantages of the conventional examples above and, in particular, to provide a database processing system that can speed up overall processing for any database processing system.

One form of the database processing system according to the present invention is characterized by comprising:
grouping means for grouping, according to predetermined rules, transaction data consisting of a set of a series of processes for a master database;
master extraction means for extracting from the master database, based on the transaction data in a specific group, a sub-database containing the data with which that transaction data can be processed; and
transaction processing means for processing the grouped transaction data against the sub-database.

The grouping means may be configured to group the transaction data according to its processing characteristics with respect to the master database on which it operates. Specifically, the grouping means may group based on data in the transaction data that is related to the search key for the master database.

In addition to the above configuration, the grouping means may set a plurality of predetermined rules to be used for grouping, and may use all or some of those rules so that the number of groups falls within a predetermined numerical range.

With the above configuration, rules for grouping according to feature values of the transaction data, such as the key value used for master lookup or the processing time slot, are set in advance, and the transaction data is divided into a plurality of groups based on them. A plurality of rules is used so that the data is divided into a predetermined number of groups. Next, a temporary sub-database is extracted and generated from the master database based on the characteristics common to the transaction data in each group. In particular, the sub-database is extracted so as to contain the data with which the transaction data in a given group can be processed; for example, a sub-database containing the search keys used in the grouping is generated. The transaction data of each group is then processed against the sub-database corresponding to that group.

Because the sub-database is highly likely to contain the data needed to process the transaction data, processing accuracy is maintained; and because its population is small, processing time can be shortened and the load on the computer that performs the transaction processing can be reduced.

As another form of the present invention, a database processing program is provided. The program causes a computer, which processes transaction data stored in a predetermined storage device against a master database that is a group of data to be processed stored in a predetermined storage device, to realize:
grouping means for grouping, according to predetermined rules, transaction data consisting of a set of a series of processes for the master database;
master extraction means for extracting from the master database, based on the transaction data in a specific group, a sub-database containing the data with which that transaction data can be processed; and
transaction processing means for processing the grouped transaction data against the sub-database.

As a further form of the present invention, a database processing method is provided. The method uses a computer to process transaction data stored in a predetermined storage device against a master database that is a group of data to be processed stored in a predetermined storage device, and is characterized by comprising:
a grouping step of grouping, according to predetermined rules, transaction data consisting of a set of a series of processes for the master database;
a master extraction step of extracting from the master database, based on the transaction data in a specific group, a sub-database containing the data with which that transaction data can be processed; and
a transaction processing step of processing the grouped transaction data against the sub-database.

The database processing program and the database processing method configured as above operate in the same way as the database processing system described above, and can therefore achieve the above object.

Because the present invention is configured and functions as described above, it maintains the accuracy of database processing for any system that performs transaction processing; and because the population to be processed is small, processing time can be shortened and the load on the computer that performs the transaction processing can be reduced. These are excellent effects not available in the prior art.

The database processing system of the present invention is, for example, a batch system or an online system that obtains output results by performing search processing such as database lookups. It is characterized in that it divides the transaction data into groups by a certain index value, automatically generates for each group one or more temporary databases (sub-databases) from the original database (master database) according to a regularity that reflects the distribution characteristics of the group's transaction data, and uses the generated temporary databases to obtain results identical or equivalent to those of the original processing. As a result, the system as a whole obtains its results in less processing time, and database processing can be sped up while the burden on the system is kept down.

In this way, processing can be sped up beyond the limits of conventional general-purpose techniques without modifying the core business programs. In particular, the present invention is applicable to both batch processing and online query processing and can improve throughput. Under a constant load, this leads to a better average response time in online processing and a shorter elapsed time in batch processing. Because the business characteristics are easier to analyze accurately for batch processing than for online processing, the expected benefit is greater for batch processing. The system configuration is described in detail below with reference to an embodiment.

A first embodiment of the present invention will be described with reference to FIGS. 1 to 7. FIG. 1 is a functional block diagram showing the configuration of a computer serving as the database processing system. FIG. 2 is an explanatory diagram showing how database processing is performed. FIG. 3 is an explanatory diagram showing the structure of the databases. FIG. 4 is an explanatory diagram showing the operation of the system. FIG. 5 is an explanatory diagram showing part of the operation. FIGS. 6 and 7 are explanatory diagrams illustrating the effects of the present invention.

As described above, the database processing system described in this embodiment is, for example, a batch system or an online system that obtains output results by performing search processing such as database lookups. It is not limited to such systems, however; any system that processes a plurality of predetermined transaction data against a database may be used.

Here, the search processing based on transaction data is shown as "process A" in FIG. 2: the database is assumed to have a simple structure like a master table (a product master), and the process looks up a name (product name) in the master using an input code value (product code) as the key. In this embodiment, the processing assumed as "process A" is executed against the sub-database, as described later. The transaction data contains other related information in addition to the "product code" item.

<Configuration>
As shown in FIG. 1, the database processing system of the present invention is built on a computer 1. The computer 1 is a general-purpose computer comprising a CPU 10 as an arithmetic processing unit, a hard disk 20 as a storage device, and a memory 30 such as RAM. A database processing program that realizes the database processing characteristic of the present invention is installed on the CPU 10, whereby the processing units 11 to 15 described below are constructed. Storage units 21 to 23, which store preset data, are formed on the hard disk 20, and storage units 31 to 33, which store data generated by the processing units 11 to 15, are formed in the memory 30. This configuration is described in detail below.

First, a master database storage unit 23 is formed on the hard disk 20 to store the master database, which is the database to be processed. A transaction data storage unit 21 is also formed to store the transaction data, which consists of a set of a series of processes for the master database. In addition, a grouping rule data storage unit 22 is formed to store grouping rule data, which represents the classification criteria for grouping the transaction data stored in the transaction data storage unit 21. Because the grouping rule data is used by the dispatch processing unit 11 when grouping, it is described in detail together with that unit. The data stored on the hard disk 20 may instead be stored in a storage device of another computer connected to the computer 1.

Next, with the predetermined program installed as described above, the following are constructed on the CPU 10: a dispatch processing unit 11 (grouping means) that groups, according to predetermined rules, the transaction data consisting of a set of a series of processes for the master database; a transaction analysis processing unit 12 that analyzes the grouped transaction data; a master extraction processing unit 13 (master extraction means) that extracts from the master database, based on the transaction data in a specific group, a sub-database containing the data with which that transaction data can be processed; a transaction processing unit 14 (transaction processing means) that processes the grouped transaction data against the sub-database; and an output processing unit 15 that outputs the processing results.

The dispatch processing unit 11 groups the transaction data by a certain index value (hereinafter called the grouping function). Data that reflects the processing characteristics of the transaction data with respect to the master database is selected as one of the arguments of the grouping function. In particular, if the transaction data contains an item correlated with the distribution of the database search keys, that item is selected as one of the arguments. An instruction to select such a grouping function is either stored in the grouping rule data storage unit 22 or built into the dispatch processing unit 11 in advance, and the dispatch processing unit 11 performs the grouping based on that grouping function.

Specifically, the grouping function should preferably reflect the distribution of the key values used for master lookup. For example, a plurality of variables that are moderately aggregated and correlated with the distribution of the "product code", such as a "branch code", are selected from the transaction data as arguments. For batch processing, it is desirable that these variables appear in reasonably long consecutive runs in the order in which the data is processed.
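As a minimal sketch of such a grouping function, the following Python groups transactions by a "branch code" item. The record layout, the field names `branch_code` and `product_code`, and the function names are assumptions made for illustration; the description does not prescribe any of them.

```python
from collections import defaultdict

# Hypothetical transaction records; field names are illustrative only.
transactions = [
    {"branch_code": "B01", "product_code": "P100"},
    {"branch_code": "B02", "product_code": "P200"},
    {"branch_code": "B01", "product_code": "P101"},
    {"branch_code": "B02", "product_code": "P200"},
]

def grouping_function(tx):
    """Return the value that decides which group a transaction belongs to."""
    return tx["branch_code"]

def dispatch(txs, g):
    """Split transactions into groups keyed by g(tx), preserving input order."""
    groups = defaultdict(list)
    for tx in txs:
        groups[g(tx)].append(tx)
    return dict(groups)

groups = dispatch(transactions, grouping_function)
```

In this toy data each branch touches only a few product codes, which is the kind of correlation between the grouping variable and the lookup key that the paragraph above calls for.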

Even when no variable at all is correlated with the search key, data that reflects the processing characteristics of the transaction data with respect to the master database, such as the following, is selected as an argument of the grouping function. In an online system, candidates include the processing time slot in the transaction data, classification by the first two digits of the access terminal number, and blocks of 10,000 records in receipt-number order. In a batch processing system, candidates include the connection source, the record number, the input data type, and blocks of 10,000 records in data order. As applications, adopting "the first two digits of the product code" as a variable, or adopting conditions that vary from day to day, such as statistics from the previous day's processing results, may also be considered. These variables are chosen because they can generally be expected to correlate with the key value distribution better than a random function would.
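A few of these fallback candidates can be sketched as small Python functions. The function names, the string formats of the terminal number and timestamp, and the choice of the hour as the "time slot" are all assumptions for illustration.

```python
def by_record_chunk(record_number, chunk_size=10_000):
    """Group records in input order, chunk_size records per group (1-based group id)."""
    return (record_number - 1) // chunk_size + 1

def by_terminal_prefix(terminal_number):
    """Group by the first two digits of the access terminal number (given as a string)."""
    return terminal_number[:2]

def by_hour(timestamp_hhmmss):
    """Group by processing time slot, taken here as the hour of an 'HH:MM:SS' string."""
    return int(timestamp_hhmmss[:2])
```

Each function maps one attribute of a transaction to a group identifier, so any of them can serve as an argument of the grouping function or be combined with others.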

Which grouping function to use is specified by the user of the system according to the characteristics of the database and of the transaction processing, and is set by input to the computer 1.

In batch processing, when the variable P = (P1, P2, …) occurs in reasonably long consecutive runs in the order in which the data is processed, a new grouping function G is constructed in place of the original grouping function Go: G counts up each time the value of Go(P) changes (referred to as a break of P). In batch processing, work tends to be scheduled in units of groups, so this is used to avoid the problems that arise when the order of the data must be preserved or when the grouping step itself would impose a high load.
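The break-counting function G can be sketched as a single pass over the records in processing order. This is an illustrative reading of the paragraph above; `break_counting_groups` and the sample data are hypothetical names, not part of the original disclosure.

```python
def break_counting_groups(records, g0):
    """Yield (group_number, record); group_number increases by one each time
    g0(record) differs from the value for the previous record (a "break")."""
    group_number = 0
    previous = object()  # sentinel that compares unequal to any real value
    for record in records:
        value = g0(record)
        if value != previous:
            group_number += 1
            previous = value
        yield group_number, record

# Branch codes in processing order; the identity function stands in for Go.
branches = ["A", "A", "B", "B", "A"]
numbers = [n for n, _ in break_counting_groups(branches, lambda b: b)]
# numbers == [1, 1, 2, 2, 3]: the second run of "A" forms its own group.
```

Because a recurring value starts a new group rather than rejoining an earlier one, the records never have to be reordered or buffered across groups.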

In the following, T denotes the set of all transaction data, M the set of all data in the master database, Φ the master lookup function (the function that maps data in T to data in M), G the grouping function, and {1, 2, …, N} the range of G.

The dispatch processing unit 11 stores the grouped transaction data in the grouped transaction data storage unit 31 formed in the memory 30. In doing so, it adjusts the number of groups so that the division is reasonably balanced: when there are too many groups, it aggregates arguments or return values or drops a function; when there are too few, it adds an argument or another function. That is, it has a function of automatically grouping so that the number of groups falls within a predetermined numerical range by increasing or decreasing the rules used for grouping. As described later, this is so that each sub-database becomes small enough to make effective use of the system cache.
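One simple way to realize this adjustment is to apply the rules in priority order and stop at the first combination whose group count lands in the target range. The sketch below covers only the "add rules when there are too few groups" direction; coarsening arguments or return values is not modeled. The function `choose_rules` and the sample fields are assumptions for illustration.

```python
def choose_rules(txs, rules, min_groups, max_groups):
    """Return the shortest prefix of `rules` whose combined key yields a group
    count within [min_groups, max_groups], or None if no prefix qualifies."""
    for k in range(1, len(rules) + 1):
        active = rules[:k]
        keys = {tuple(rule(tx) for rule in active) for tx in txs}
        if min_groups <= len(keys) <= max_groups:
            return active
        if len(keys) > max_groups:
            break  # adding further rules can only split groups further
    return None

# Hypothetical transactions with two candidate grouping variables.
txs = [
    {"branch": "B01", "hour": 9},
    {"branch": "B01", "hour": 10},
    {"branch": "B02", "hour": 9},
    {"branch": "B02", "hour": 11},
]
rules = [lambda tx: tx["branch"], lambda tx: tx["hour"]]

selected = choose_rules(txs, rules, min_groups=3, max_groups=5)
```

With this data the branch rule alone gives two groups, which is below the range, so the hour rule is added and the combined key gives four groups.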

Next, the transaction analysis processing unit 12 is described. It scans each group of transaction data (the data in 31) and identifies the values used as keys for database lookup. For each group, it stores those key values as group analysis data in the group analysis data storage unit 32 in the memory 30.
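The scan performed by the analysis step amounts to collecting the distinct lookup keys per group. A minimal sketch follows, assuming each transaction is a dictionary with a `product_code` field; both the layout and the name `analyze_groups` are hypothetical.

```python
def analyze_groups(groups, key_field="product_code"):
    """For each group, collect the distinct key values used for master lookup."""
    return {
        group_id: {tx[key_field] for tx in txs}
        for group_id, txs in groups.items()
    }

# Hypothetical grouped transactions (the output of the dispatch step).
groups = {
    1: [{"product_code": "P100"}, {"product_code": "P101"}, {"product_code": "P100"}],
    2: [{"product_code": "P200"}],
}
analysis = analyze_groups(groups)
```

Using a set removes duplicate keys, so the analysis data grows with the number of distinct keys in a group rather than with the number of transactions.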

Next, the master extraction processing unit 13 is described. For each group of transaction data, it automatically generates, from the master database that is the original search target, a temporary database (hereinafter called a sub-database) that reflects the distribution of the transaction data within that group. The sub-database can be built by extracting from the master database the data corresponding to the key values that the transaction analysis processing unit 12 stored in the group analysis data storage unit 32. A sub-database is a subset of the master database and contains the data with which the transaction data in the group can be processed.
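As a rough illustration of master extraction, the sketch below models the master database as an in-memory dictionary from product code to product name and builds one sub-database per group from that group's key set. A real master would live in a DBMS; the dictionary model, the sample data, and `extract_sub_database` are assumptions.

```python
def extract_sub_database(master, keys):
    """Return the subset of `master` whose keys appear in `keys`.
    Keys that are missing from the master are skipped."""
    return {key: master[key] for key in keys if key in master}

# Hypothetical product master and per-group key sets from the analysis step.
master = {"P100": "Widget", "P101": "Gadget", "P200": "Gizmo", "P300": "Doohickey"}
group_keys = {1: {"P100", "P101"}, 2: {"P200", "P100"}}

sub_databases = {
    group_id: extract_sub_database(master, keys)
    for group_id, keys in group_keys.items()
}
```

Each sub-database holds only the rows its group can reference, so it is much smaller than the master whenever the group's keys are concentrated.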

Specifically, let Tn be the set of transactions belonging to group n, and let Mn be the subset of the master database referenced by the keys of Tn (the sub-database). In batch processing, Mn is determined by a full scan of the group before processing of the group, as a scheduling unit, begins. In online processing, Mn is built in two stages: an initial prediction and additions as needed. The configuration must therefore guarantee that the data to be looked up by the transaction about to be processed exists in the sub-database. A simple way to do this is to query the sub-database for every transaction, but the sub-database is then searched twice per transaction, which weakens the effect of the invention. To limit this, another index value in the transaction data is normally chosen for which presence is already guaranteed by some other condition, and only data that falls outside that index is checked by querying the sub-database.

The set Mn built in this way is the minimum needed to avoid a miss during search processing, so it does no harm if it also contains a small amount of extra data. In online processing in particular, it is difficult to generate Φ(Tn), the image of Tn under Φ, in advance; a predicted set M'n is therefore prepared beforehand, and Mn is built as the union of Φ(Tn) and M'n. As shown in FIG. 3, the sub-databases generally overlap in the elements they contain, and all the sub-databases taken together do not necessarily make up the original master database.
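The relationships among these sets can be summarized in the notation defined above. The following is a restatement for orientation rather than a formula taken from the original text; it assumes the `amsmath` package, and treating M'n as empty in batch processing is our reading of the description.

```latex
\begin{align*}
  G &: T \to \{1, 2, \dots, N\}, & \Phi &: T \to M, \\
  T_n &= \{\, t \in T \mid G(t) = n \,\}, & T &= \bigcup_{n=1}^{N} T_n, \\
  M_n &= \Phi(T_n) \cup M'_n, & \bigcup_{n=1}^{N} M_n &\subseteq M .
\end{align*}
```

The intersection of two sub-databases M_i and M_j (i ≠ j) need not be empty, and the inclusion on the last line can be strict, which matches the overlap and incomplete coverage shown in FIG. 3.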

As for when to generate the sub-databases, in batch processing it is most efficient to scan all the transaction data of each group in advance and generate the sub-database in one pass. In online processing, as described above, the sub-database is created in advance from a prediction based on assumed grouping conditions (this advance creation is optional), and missing entries are added as needed. In that case, a separate step that always checks whether the sub-database is sufficient is required.
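For the online case, the "predict first, then add what is missing" behavior can be sketched as a small wrapper around the master. The class name `OnlineSubDatabase`, the dictionary model of the master, and the access counter are assumptions for illustration, and the sketch performs the presence check on every lookup rather than using a guard index to skip it.

```python
class OnlineSubDatabase:
    """Sub-database seeded with a predicted subset and filled on demand."""

    def __init__(self, master, predicted_keys=()):
        self._master = master
        # Initial prediction (M'n); this seeding step is optional.
        self._rows = {k: master[k] for k in predicted_keys if k in master}
        self.master_accesses = 0  # counts fallbacks to the master, for illustration

    def lookup(self, key):
        if key not in self._rows:                # the "is it present?" check
            self.master_accesses += 1
            self._rows[key] = self._master[key]  # add the missing entry
        return self._rows[key]

master = {"P100": "Widget", "P200": "Gizmo"}
sub = OnlineSubDatabase(master, predicted_keys=["P100"])
first = sub.lookup("P100")   # served from the predicted subset
second = sub.lookup("P200")  # missing, so fetched from the master and added
third = sub.lookup("P200")   # now served from the sub-database
```

A key that is absent from the master raises `KeyError` in this sketch; how a real system should report such a miss is outside what the description specifies.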

The sub-databases created in this way are stored in the sub-master database storage unit 33 in the memory 30, and the transaction processing unit 14 applies each one, in place of the master database, to the corresponding group of transactions. Results equivalent to those of the original processing can thus be obtained without changing the database lookup processing.

<Operation>
Next, the operation of the system configured as above is described with reference to FIGS. 4 and 5. FIG. 4 is an explanatory diagram showing the processing in the CPU 10. FIG. 5 is an explanatory diagram of the processing when there are multiple groups.

First, the dispatch processing unit 11 groups the transaction data using a grouping function. That is, the transaction data T (21) is decomposed into T1 to TN, and group n (31) is generated. As shown in FIG. 5, this dispatch processing may generate a plurality of groups (group 1 (31a), group 2 (31b)); the number of groups generated is not limited.

Next, the transaction analysis processing unit 12 analyzes the grouped transaction data, and the resulting analysis data 32 is stored. That is, the information needed to create Mn from Tn is analyzed. Using this analysis data 32, the master extraction processing unit 13 performs master extraction on the master database 23 to generate the sub-database 33; that is, M1 to MN are created from M. When a plurality of groups are generated, a corresponding plurality of sub-databases (sub-DB 1 (33a), sub-DB 2 (33b)) are generated (see FIG. 5).

Thereafter, the transaction processing unit 14 processes the transaction data of each group against the sub-database 33 (33a, 33b) corresponding to that group. Output data 40 representing the output results for the transactions is thereby generated.
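The four stages of the operation (dispatch, transaction analysis, master extraction, transaction processing) can be sketched end to end as below. All function names, the dict-based data model, and the grouping rule (first character of the key) are hypothetical choices made for this illustration; the specification does not prescribe them.

```python
from collections import defaultdict

def dispatch(transactions, group_of):
    """Dispatch: split T into groups T1..TN using a grouping function."""
    groups = defaultdict(list)
    for t in transactions:
        groups[group_of(t)].append(t)
    return dict(groups)

def analyze(group):
    """Transaction analysis: collect what is needed to build Mn (here, the key set)."""
    return {t["key"] for t in group}

def extract(master, needed_keys):
    """Master extraction: build the sub-database Mn from the master M."""
    return {k: master[k] for k in needed_keys if k in master}

def process(group, sub_db):
    """Transaction processing: the usual lookup logic, run against Mn instead of M."""
    return [{"key": t["key"], "amount": t["qty"] * sub_db[t["key"]]["price"]}
            for t in group if t["key"] in sub_db]

def run(transactions, master, group_of):
    output = []
    for _, group in sorted(dispatch(transactions, group_of).items()):
        sub_db = extract(master, analyze(group))
        output.extend(process(group, sub_db))
    return output

# Illustrative data (assumed, not from the specification).
master = {"A1": {"price": 10}, "A2": {"price": 20}, "B1": {"price": 30}}
transactions = [{"key": "B1", "qty": 1}, {"key": "A1", "qty": 2}, {"key": "A2", "qty": 3}]
result = run(transactions, master, group_of=lambda t: t["key"][0])
```

Note that `process` is unaware of whether it receives the master or a sub-database, which reflects the point that the lookup processing itself does not need to change.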

In this way, the sub-database is highly likely to contain the data needed to process the transaction data, so the accuracy of the processing is maintained. Because the sub-database is also much smaller than the master database, the processing time can be shortened and the load on the computer performing the transaction processing can be reduced.

The above effect will be further described with reference to FIGS. 6 and 7. FIG. 6 shows the relationship between the transaction data and the key value distribution. The time taken by the search processing for one item of transaction data correlates with the size of the master being searched. If the scale order of this correlation is plotted on the vertical axis, the processing time for the entire set of transactions (processing A) is proportional to the area of the graph in FIG. 6. The areas compare as follows.
Without grouping and division: Area = T × M
With grouping and division: Area = Σ_{n=1}^{N} (Tn × Mn)
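To make the comparison concrete, the two areas can be evaluated for sample figures. The numbers below are purely hypothetical (they are not measurements from the specification), and the functions simply restate the two formulas.

```python
def area_ungrouped(t_count: int, m_size: int) -> int:
    """Area = T × M."""
    return t_count * m_size

def area_grouped(groups) -> int:
    """Area = sum over n of Tn × Mn, with groups given as (Tn, Mn) pairs."""
    return sum(t_n * m_n for t_n, m_n in groups)

# Hypothetical workload: 1,000,000 transactions against a 100,000-row master,
# split into 4 groups of 250,000 transactions, each needing a 30,000-row sub-database.
T, M = 1_000_000, 100_000
groups = [(250_000, 30_000)] * 4

ungrouped = area_ungrouped(T, M)
grouped = area_grouped(groups)
ratio = grouped / ungrouped
```

With these assumed figures the grouped area is 30% of the ungrouped one, even though the four sub-databases together hold more rows than the master (they may overlap).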

As is clear from the above, the area is smaller when grouping and division are performed, because the groups Tn together make up T while each Mn is no larger than M. The present invention requires "transaction analysis" processing and "master extraction" processing, so some overhead is incurred. However, this overhead is normally proportional to the size of the transaction data and tends to correlate only weakly with the size of the master, so the advantage of applying the present invention grows as the master becomes larger. This is shown in FIG. 7. In the figure, (A) is a graph of the transaction processing time when the present invention is not applied, and (B) is a graph of the total processing time, including transaction processing, when the present invention is applied. (B) is the sum of the processing time of the transaction processing alone when the present invention is applied (B1) and the overhead processing time when the present invention is applied (B2), that is, (B) = (B1) + (B2). As the figure shows, processing becomes faster once the number of data items in the database being searched exceeds the value a.
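The break-even value a can be illustrated with a simple linear cost model. The model is an assumption introduced here, not something stated in the specification: c is a per-lookup cost coefficient, k is the overhead per transaction, and every sub-database is taken to be a fixed fraction ρ of the master.

```latex
\begin{align*}
A(M)   &= c\,T\,M, \\
B_1(M) &= c \sum_{n=1}^{N} T_n M_n = c\,\rho\,T\,M
          \quad \Bigl(M_n = \rho M,\ 0 < \rho < 1,\ \textstyle\sum_{n} T_n = T\Bigr), \\
B_2    &= k\,T, \\
B(M) = B_1(M) + B_2 < A(M)
       &\iff M > \frac{k}{c\,(1-\rho)} = a .
\end{align*}
```

Under these assumptions the overhead B2 does not grow with M, so the gap between (A) and (B) widens linearly once M passes a, which is consistent with the qualitative picture described for FIG. 7.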

In addition to the above configuration, the conventional cache technology described earlier can be used in combination to speed up processing further. Loading data into a cache causes random access to the original database. In the present invention, when applied to batch processing, this cost can in some cases be kept lower than with cache technology. Although this depends on the implementation of the database processing system, the reason is that batch processing permits sequential access, such as matching processing.
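One way sequential matching could be used for batch extraction is a single forward pass over a key-sorted master and a sorted list of needed keys. This is a sketch only: the function name `extract_by_merge` and the representation of the master as a sorted sequence of (key, row) pairs are assumptions, and an actual system may organize its storage differently.

```python
def extract_by_merge(sorted_master, sorted_keys):
    """Extract needed rows in one sequential pass (merge-style matching).

    sorted_master: iterable of (key, row) pairs in ascending key order.
    sorted_keys:   needed keys in ascending order (duplicates are tolerated).
    """
    result = []
    keys = iter(sorted_keys)
    key = next(keys, None)
    for row_key, row in sorted_master:
        # Skip needed keys that sort before the current master row:
        # they are either absent from the master or already matched.
        while key is not None and key < row_key:
            key = next(keys, None)
        if key is None:
            break  # no more keys to find; stop reading the master
        if key == row_key:
            result.append((row_key, row))
            key = next(keys, None)
    return result
```

Because both inputs are read strictly in order, the master is never accessed randomly, which is the property the passage attributes to batch processing.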

The present invention has industrial applicability because, when incorporated into a database processing system that processes a large amount of transaction data, it can speed up the processing of that transaction data.

FIG. 1 is a functional block diagram showing the configuration of a computer that operates as the database processing system of the present invention.
FIG. 2 is an explanatory diagram showing an example of general transaction processing.
FIG. 3 is an explanatory diagram showing the relationship between the master database and the sub-databases.
FIG. 4 is an explanatory diagram showing the operation of the database processing system.
FIG. 5 is an explanatory diagram showing a part of the operation of the database processing system.
FIG. 6 is a diagram showing the relationship between the transaction distribution and the key value distribution.
FIG. 7 is a diagram comparing the processing times of the conventional example and the present invention.

Explanation of symbols

1 Computer (database processing system)
10 CPU
11 Dispatch processing unit (grouping means)
12 Transaction analysis processing unit
13 Master extraction processing unit (master extraction means)
14 Transaction processing unit (transaction processing means)
15 Output processing unit
20 Hard disk
21 Transaction data storage unit (transaction data)
22 Grouping rule data storage unit
23 Master database storage unit (master database)
30 Memory
31 Grouped transaction data storage unit
32 Group analysis data storage unit
33 Sub-database storage unit (sub-database)

Claims

1. A database processing system comprising:
grouping means for grouping, according to a predetermined rule, transaction data consisting of a set of processes to be performed on a master database;
master extraction means for extracting from the master database, based on the transaction data in a specific group resulting from the grouping, a sub-database including data with which that transaction data can be processed; and
transaction processing means for processing the grouped transaction data against the sub-database.

2. The database processing system according to claim 1, wherein the grouping means performs the grouping according to characteristics of the processing that the transaction data performs on the master database.

3. The database processing system according to claim 1, wherein the grouping means performs the grouping based on data, present in the transaction data, that relates to a search key for the master database.

4. The database processing system according to claim 1, wherein the grouping means sets a plurality of predetermined rules to be used for the grouping, and uses all or some of the plurality of rules so that the number of groups falls within a predetermined numerical range.

5. A database processing program for causing a computer, which processes transaction data stored in a predetermined storage device against a master database that is a group of data to be processed stored in a predetermined storage device, to realize:
grouping means for grouping, according to a predetermined rule, transaction data consisting of a set of processes to be performed on the master database;
master extraction means for extracting from the master database, based on the transaction data in a specific group resulting from the grouping, a sub-database including data with which that transaction data can be processed; and
transaction processing means for processing the grouped transaction data against the sub-database.

6. A database processing method for processing, using a computer, transaction data stored in a predetermined storage device against a master database that is a group of data to be processed stored in a predetermined storage device, the method comprising:
a grouping step of grouping, according to a predetermined rule, transaction data consisting of a set of processes to be performed on the master database;
a master extraction step of extracting from the master database, based on the transaction data in a specific group resulting from the grouping, a sub-database including data with which that transaction data can be processed; and
a transaction processing step of processing the grouped transaction data against the sub-database.