JP2015049542A

JP2015049542A - Distributed database system and selection unit

Info

Publication number: JP2015049542A
Application number: JP2013178656A
Authority: JP
Inventors: 浩一高岡; Koichi Takaoka
Original assignee: Panasonic Intellectual Property Management Co Ltd
Current assignee: Panasonic Intellectual Property Management Co Ltd
Priority date: 2013-08-29
Filing date: 2013-08-29
Publication date: 2015-03-16
Also published as: WO2015029341A1

Abstract

PROBLEM TO BE SOLVED: To enable a client to access data without being conscious of the storage device that stores the data. SOLUTION: A distributed database system includes a plurality of storage devices 10 and a selection device 20. The storage devices 10 have a function of communicating via a telecommunication line 30, and form a group for each type of stored data. The selection device 20 accesses required data specified by identification information notified via the telecommunication line 30. The identification information includes first identification information for identifying the group and second identification information for identifying a storage location of data in the group. The selection device 20 includes: a first extraction unit 21 that accesses the group where the required data is stored, by using the first identification information; and a second extraction unit 22 that accesses the required data by using the second identification information in the group accessed by the first extraction unit 21.

Description

The present invention relates to a distributed database system including a plurality of storage devices, and to a selection device used in that system.

In recent years, distributed database technology has come into use in which a plurality of storage devices communicating over a telecommunication line are integrated by a database management system, so that a user can handle them as if they were a database built on a single storage device.

In the technique described in Patent Document 1, when data managed by a plurality of resources in a plurality of servers is requested, a data access process creates a management table in which the requested data is classified in the order of server, resource, and item. The data access process then collects the data based on the management table and arranges it in the order requested, so that the data requested by the client can be displayed or otherwise handled all at once.

Patent Document 1: JP 2000-57074 A

The technique described in Patent Document 1 solves the problem of enabling a client to acquire data all at once without being aware of how the data is arranged. That is, the data access process performs both the process of collecting data that is distributed across and managed by a plurality of servers and the process of arranging the data in order.

In this configuration, however, the client must specify the server that manages the data in order to tell the data access process which data it requests. The client therefore cannot access the data unless it knows which data resides on which server.

An object of the present invention is to provide a distributed database system that enables a client to access data without being aware of the storage device that stores the data, and further to provide a selection device used in the distributed database system.

A distributed database system according to the present invention includes: a plurality of storage devices that have a function of communicating over a telecommunication line and that form a group for each type of data to be stored; and a selection device that accesses required data specified by identification information notified over the telecommunication line. The identification information includes first identification information that identifies the group and second identification information that identifies a storage location of the data within the group. The selection device includes a first extraction unit that uses the first identification information to access the group in which the required data is stored, and a second extraction unit that uses the second identification information to access the required data in the group accessed by the first extraction unit.

In this distributed database system, the selection device is preferably a server that communicates with the storage devices over the telecommunication line.

A selection device according to the present invention is the server used in the distributed database system described above.

According to the configuration of the present invention, the plurality of storage devices form a group for each type of data to be stored, and the first identification information, which identifies the group, is used to narrow down the storage devices in which the data is stored. The second identification information, which identifies the storage location of the data within the group, is then used to access the required data in that group. The client can therefore access the data without being aware of the storage device that stores it.

FIG. 1 is a block diagram showing an embodiment.
FIG. 2 is a schematic configuration diagram showing one usage example of the embodiment.
FIG. 3 is a schematic configuration diagram showing another usage example of the embodiment.
FIG. 4 is a schematic diagram showing a configuration example of the storage device in the embodiment.

The distributed database system described below (hereinafter abbreviated as "distributed database") includes a plurality of storage devices 10 and a selection device 20, as shown in FIG. 1. The storage devices 10 have a function of communicating over a telecommunication line 30 and form groups G1 and G2, one for each type of data to be stored. The selection device 20 accesses required data specified by identification information notified over the telecommunication line 30. The identification information includes first identification information that identifies a group and second identification information that identifies a storage location of the data within the groups G1 and G2. The selection device 20 includes a first extraction unit 21 that uses the first identification information to access the group G1 or G2 in which the required data is stored, and a second extraction unit 22 that uses the second identification information to access the required data in the group G1 or G2 accessed by the first extraction unit 21.

The storage device 10 may be implemented in any form that can store data, such as dedicated storage like a hard disk device, storage handled by a database management system, or a data table. However, the storage device 10 must be able to communicate over the telecommunication line 30. The storage device 10 is therefore implemented as a server formed by a computer system.

In the groups G1 and G2 formed by the storage devices 10, a database server 41 (that is, a server running a database management system) is provided for each of the groups G1 and G2. A plurality of database servers 41 may be provided within a group G1 or G2. The selection device 20 is implemented by a dedicated server provided with an application program interface (also called an application programming interface; API). That is, the selection device 20 is desirably a server that communicates with the storage devices 10 over the telecommunication line 30. When the telecommunication line 30 is the Internet, the selection device 20 is configured as a web server. Although the illustrated example shows only one selection device 20, a plurality of selection devices 20 may of course be provided as necessary. The API may instead be provided in the database server 41, in which case the selection device 20 is provided in the database server 41. Alternatively, the API may be provided in one of the storage devices 10, which is then used as the selection device 20.

In the illustrated example, the selection device 20 communicates over the telecommunication line 30, via a server 42, with a terminal device 43 that is a client and with equipment 44 that is the managed object. The selection device 20 also communicates with the database server 41 over the telecommunication line 30. The server 42 is provided to monitor or control the equipment 44 over the telecommunication line 30, and data that the server 42 acquires from the equipment 44 is stored in the storage devices 10 through the selection device 20. When the terminal device 43 requests data stored in the storage devices 10, the server 42 reads the required data from the storage devices 10 through the selection device 20. That is, the selection device 20 has a function of accessing data in the storage devices 10 in response to a request from the server 42. Here, accessing data means both retrieving data stored in a storage device 10 and storing data at a required location in a storage device 10.

In FIG. 1, the telecommunication line 30 to which the server 42 is connected and the telecommunication line 30 to which the selection device 20 is connected are drawn separately because they carry different data, but the two telecommunication lines 30 may be physically the same. When the server 42 mainly collects data, the server 42 and the equipment 44 desirably use M2M (Machine-to-Machine) technology.

In the configuration example shown in FIG. 1, the server 42 and the selection device 20 are separated so that servers 42 providing various services can share the distributed database, but a configuration in which the server 42 also serves as the selection device 20 may be adopted. The functions of the server 42 are described later. The telecommunication line 30 is assumed to be a VPN (Virtual Private Network) over the Internet, but may instead be a telecommunication line 30 that uses a dedicated communication path.

Assume now that a distributed database is used to handle 4 million or more data items concerning electricity, gas, water, and the like for 150,000 facilities. Automatically collecting data on this scale requires, for example, enough performance to process a total data volume of 20 terabytes or more for 3 million or more points. Examples of facilities include business buildings (office buildings, commercial facilities, and the like), residential buildings (apartment buildings and the like), parks, and sports facilities. In addition to data on the usage of electricity, gas, water, and the like (hereinafter referred to as "energy data"), the data desirably includes data on the configuration of the equipment installed in a facility, data on the operating state of the equipment, and so on. A shorter automatic collection cycle is being demanded: the cycle is currently about 10 minutes, but shortening it to 1 minute or less is also being requested.
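To give a feel for what shortening the collection cycle means, the following sketch estimates the average write rate. It assumes, purely for illustration, that each of the 3 million points reports exactly one sample per collection cycle; the text does not state this, and the function name is hypothetical.

```python
def writes_per_second(points: int, interval_seconds: float) -> float:
    """Average samples per second, assuming one sample per point per cycle."""
    return points / interval_seconds


POINTS = 3_000_000  # "3 million or more points" from the text

# Current cycle of about 10 minutes versus the requested 1 minute.
rate_10_min = writes_per_second(POINTS, 10 * 60)
rate_1_min = writes_per_second(POINTS, 60)

print(rate_10_min)  # 5000.0
print(rate_1_min)   # 50000.0
```

Under this assumption, moving from a 10-minute to a 1-minute cycle raises the average load tenfold, from about 5,000 to about 50,000 writes per second.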

As described above, if various kinds of data such as equipment configuration and equipment operating state can be managed together with energy data, various services can be provided by combining and processing these data in different ways. Examples include a service that collects energy data, a service that visualizes trends in energy data, a service that manages the total usage of electricity, gas, and water at a facility, and a service that monitors the operating state of equipment installed at a facility.

The types of service provided using the distributed database are determined by a contract with the user who receives the service, and the server 42 provides the services determined by that contract. Because the services are provided over the telecommunication line 30, they can be offered over a wide area, overseas as well as domestically. For this reason, the distributed database is desirably built as a cloud.

Because the types of data used differ depending on the type of service provided to the user, the storage devices 10 that store the data are desirably separated according to data type. The amount of information differs by data type, so when several types of data are mixed and handled as one batch, the amount of information varies widely depending on which types the batch contains. Consequently, if mixed data of this kind is stored together in a single storage device 10, the access time of that storage device 10 varies widely.

In the present embodiment, the storage devices 10 form groups G1 and G2, one for each type of data to be stored. When there are three types of data as described above (energy data, data on equipment configuration, and data on equipment operating state), three groups of storage devices 10 are provided, and each type of data is stored in a different group. Forming such groups G1 and G2 suppresses variation in the access time of the storage devices 10 within any one group.

In the configuration example shown in FIG. 2, the storage devices 10 form three groups: a group G1 that stores master management data, a group G2 that stores energy data, and a group G3 that stores equipment operation data. The master management data is data on the configuration of the equipment and holds, for example, the hierarchical structure of the equipment, such as user (business operator) - business site - building - floor - equipment, together with the equipment specifications.

The configuration example shown in FIG. 2 illustrates a case in which various servers 42 provide various services by using the distributed database system (cloud) described above as a common platform. The servers 42 in the upper row mainly provide services that supply data (information) to the terminal devices 43, and the servers 42 in the lower row mainly provide services that collect data from the equipment 44.

In the illustrated example, the servers 42 in the lower row provide energy collection, BEMS (Building and Energy Management System) monitoring, refrigeration and air conditioning, and remote monitoring services, respectively, and the servers 42 in the upper row provide information distribution, total amount management, visualization, and analysis and diagnosis services, respectively. BEMS monitoring and remote monitoring are services that remotely monitor or remotely control the equipment 44 using data collected from the equipment 44. BEMS monitoring supports control for reducing the power consumption of lighting fixtures, air conditioners, and the like in office buildings and tenant buildings, while remote monitoring performs remote control of power storage equipment and the like. Refrigeration and air conditioning is a service that performs remote monitoring and the like of freezer and refrigerator showcases and air conditioning equipment in stores. Information distribution is a service that distributes information such as product information and energy-saving case studies, and total amount management is a service that presents energy usage in graphs and the like. Visualization is a service that presents the operating status of photovoltaic power generation equipment, power storage equipment, and the like in graphs and the like. Analysis and diagnosis is a service that analyzes and diagnoses the equipment 44 so that it operates efficiently and without waste.

The configuration shown in FIG. 2 is only an example, and the types and contents of the services are selected as appropriate. As shown in FIG. 3, in a configuration with a plurality of storage devices 10 that are geographically distributed, storage devices 10 of one or more types of group may be placed in each region. The illustrated example shows three regions D1, D2, and D3, each enclosed by a dash-dot line. In two of the regions, D1 and D3, groups G2 and G3 are used in combination, and in the remaining region D2 only group G2 is used. The storage devices 10 of group G3 in region D1, at the left end of the figure, store, for example, equipment operation data on the operating state of photovoltaic power generation equipment and power storage equipment. The storage devices 10 of group G3 in region D3, at the right end of the figure, store, for example, equipment operation data on the operating state of store freezers and refrigerators.

As in the example shown in FIG. 3, storage devices 10 of the same group G1, G2, or G3 may be distributed across different regions. For example, storage devices 10 belonging to group G2, which stores energy data, may be provided in both Tokyo and Osaka. A plurality of storage devices 10 belonging to the same group G1, G2, or G3 may also be placed in the same region.
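The regional layout described for FIG. 3 can be written down as a small lookup table. The sketch below is only an illustration: the dictionary and function names are hypothetical, and the table encodes nothing beyond the region and group labels given in the text.

```python
from typing import Dict, List, Set

# Which groups have storage devices in which region (labels from FIG. 3).
REGION_GROUPS: Dict[str, Set[str]] = {
    "D1": {"G2", "G3"},
    "D2": {"G2"},
    "D3": {"G2", "G3"},
}


def regions_hosting(group: str) -> List[str]:
    """Return the regions that contain storage devices of the given group."""
    return sorted(
        region for region, groups in REGION_GROUPS.items() if group in groups
    )


print(regions_hosting("G2"))  # ['D1', 'D2', 'D3']
print(regions_hosting("G3"))  # ['D1', 'D3']
```

A table of this kind shows that a single group, such as G2, can span several regions while remaining one logical group.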

When the data needed to provide a service is distributed across a plurality of storage devices 10, the server 42 must access data in all of those storage devices 10. Even in this case, the user can access the required data from the terminal device 43 or the equipment 44 without being aware of the configuration or location of the storage devices 10 in which the data is stored. In other words, even when data stored in a plurality of storage devices 10 is accessed, access to the required data from the terminal device 43 or the equipment 44 proceeds seamlessly, as if a single database server were being accessed.

The selection device 20 accesses the data needed for the user to receive a service on the user's behalf. That is, on receiving a request from the server 42, the selection device 20 accesses the storage devices 10 according to the type of service. The user receives the service through the server 42, and the server 42 accesses the data appropriate to the service it provides by using the distributed database built as a cloud. Because the server 42 notifies the selection device 20 of identification information corresponding to the type of service, the selection device 20 accesses the required data using the first identification information and the second identification information contained in the identification information, as described above.

In FIGS. 2 and 3, the selection device 20 is drawn as spanning the plurality of groups G1, G2, and G3 or the plurality of regions D1, D2, and D3. Because the selection device 20 is part of the distributed database as a cloud, it may be divided as appropriate. That is, as noted above, the distributed database may include a plurality of selection devices 20.

Hereinafter, the identification information is called a "global ID". The global ID contains an "issuer ID", which is the first identification information, and a "local ID", which is the second identification information. The issuer ID is defined so as to be unique within the distributed database, and the local ID is defined so as to be unique within the scope of its issuer ID.

The issuer ID is information that identifies a group of storage devices 10. In the present embodiment, it combines information identifying the group G1, G2, or G3 with information identifying the location of the user who receives the service (normally assumed to be a business operator) and information identifying the user. Each location and each user is assigned a unique number that serves as its identifying information. Likewise, each group is assigned a number that is unique to that group. The location is set in a unit selected from country, prefecture (region), municipality, and so on. The issuer ID is therefore represented, for example, by the tuple (country number, user number, group number). The information identifying the location, the user, and the group need not be numeric.
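The structure of the global ID can be sketched as a pair of small value types. This is a minimal illustration only: the class names, field names, and the hyphen-separated text encoding are assumptions made for the example, and the document does not specify any wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssuerId:
    """First identification information: (country number, user number, group number)."""
    country_number: int
    user_number: int
    group_number: int


@dataclass(frozen=True)
class GlobalId:
    """Issuer ID plus a local ID that is unique within that issuer ID."""
    issuer: IssuerId
    local_id: int

    def encode(self) -> str:
        # Hypothetical text form; the source does not define an encoding.
        i = self.issuer
        return f"{i.country_number}-{i.user_number}-{i.group_number}-{self.local_id}"

    @classmethod
    def decode(cls, text: str) -> "GlobalId":
        country, user, group, local = (int(part) for part in text.split("-"))
        return cls(IssuerId(country, user, group), local)


gid = GlobalId(IssuerId(country_number=81, user_number=1001, group_number=2), local_id=42)
print(gid.encode())  # 81-1001-2-42
```

Because both types are frozen dataclasses, two global IDs with the same fields compare equal and can be used as dictionary keys.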

The local ID, on the other hand, only needs to be unique within its issuer ID, so information representing an order is normally assigned as the local ID. Specifically, a number representing the order in which data was stored in the storage device 10 is used as the local ID. Using the local ID identifies the location within the group where the data is stored, which makes it possible to store data at that location and to access the data stored there. In other words, the local ID can be said to identify the data within the group. The reason for using a sequence number as the local ID is described later.
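One simple way to obtain local IDs that reflect storage order is a separate counter per issuer ID. The sketch below assumes this approach for illustration; the class name, the tuple representation of the issuer ID, and the choice to start counting at 0 are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Issuer ID as (country number, user number, group number).
Issuer = Tuple[int, int, int]


class LocalIdAllocator:
    """Hands out sequence numbers that are unique within each issuer ID."""

    def __init__(self) -> None:
        self._next: Dict[Issuer, int] = defaultdict(int)

    def allocate(self, issuer: Issuer) -> int:
        local_id = self._next[issuer]
        self._next[issuer] += 1
        return local_id


allocator = LocalIdAllocator()
energy = (81, 1001, 2)
operation = (81, 1001, 3)
print(allocator.allocate(energy))     # 0
print(allocator.allocate(energy))     # 1
print(allocator.allocate(operation))  # 0
```

The same local ID value can appear under different issuer IDs without conflict, since uniqueness is only required within one issuer ID.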

When the selection device 20 receives a global ID, the first extraction unit 21 uses the issuer ID to identify the group of storage devices 10. That is, when the selection device 20 is asked to access data, the first extraction unit 21 identifies the group of storage devices 10 from the country number, user number, and group number, and narrows the target down to the storage devices 10 of that group. The second extraction unit 22 then uses the local ID to identify the location within the group where the required data is stored, and accesses the required data.

As described above, the selection device 20 identifies the location where the required data is stored through two-stage processing by the first extraction unit 21 and the second extraction unit 22, and then accesses the target data.
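The two-stage processing can be sketched with an in-memory stand-in for the storage groups. This is an illustration under stated assumptions, not the patented implementation: each group is modelled as a plain Python list, the local ID is the list index, and all class and method names are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Issuer = Tuple[int, int, int]   # (country number, user number, group number)
GlobalId = Tuple[Issuer, int]   # (issuer ID, local ID)


class SelectionDevice:
    """Toy selection device: resolves a global ID in two stages."""

    def __init__(self) -> None:
        # Assumption: one list per issuer ID stands in for a group of storage devices.
        self._groups: Dict[Issuer, List[Any]] = {}

    def store(self, issuer: Issuer, data: Any) -> GlobalId:
        group = self._groups.setdefault(issuer, [])
        group.append(data)
        return (issuer, len(group) - 1)  # local ID = order of storage

    def fetch(self, global_id: GlobalId) -> Any:
        issuer, local_id = global_id
        group = self._groups[issuer]  # stage 1: first extraction unit picks the group
        return group[local_id]        # stage 2: second extraction unit picks the location


device = SelectionDevice()
gid = device.store((81, 1001, 2), {"kwh": 12.5})
print(gid)                # ((81, 1001, 2), 0)
print(device.fetch(gid))  # {'kwh': 12.5}
```

The caller only ever handles the global ID; which group (and, in a real system, which storage device) holds the data stays hidden behind `store` and `fetch`.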

Providing services of the kind described above generates a large volume of data at relatively short intervals, so the input and output involved in collecting and supplying data becomes a bottleneck, and exclusive control is performed more frequently. A distributed database is an effective way to deal with this kind of problem. In a distributed database, the data is distributed across a plurality of storage devices 10, which makes it possible to avoid the bottleneck that arises in a database built on a single storage device 10.

However, when the distributed database is built from a plurality of storage devices 10 as physical entities, storage devices 10 with different hardware resources may be mixed. In a database, differences in hardware resources, in particular CPU (Central Processing Unit) performance and storage capacity, strongly affect throughput. Variation in the hardware resources of the storage devices 10 that make up the distributed database therefore causes variation in the write time when storing data and in the response time when supplying data.

In addition, when data is stored in the plurality of storage devices 10 that make up the distributed database, accesses may concentrate on a single storage device 10, in which case that storage device 10 becomes a bottleneck. That is, the throughput of the distributed database as a whole decreases.

The present embodiment adopts the following configuration so that bottlenecks are suppressed even when the hardware resources of the storage devices 10 that make up the distributed database vary. Adopting this configuration makes it possible to provide a distributed database whose throughput is unlikely to drop even when storage devices 10 with inferior hardware resources are mixed in.

As described above, the distributed database includes a plurality of storage devices 10, and the distributed database and the terminal device 43 communicate over the telecommunication line 30 to form a client-server system in which the terminal device 43 is the client. This client-server system has a three-tier architecture consisting of a presentation layer, an application layer, and a data layer, and the presentation layer is implemented by the terminal device 43.

The storage devices 10 correspond to the data layer, and the selection device 20, the database server 41, and the server 42 correspond to the application layer. The server 42 has a function of issuing to the database server 41 a request corresponding to the processing requested by the terminal device 43, and of returning the response of the database server 41 to the terminal device 43.

As shown in FIG. 4, the storage devices 10 are divided into a plurality of platform modules (hereinafter "PF modules") 11, each serving as a virtual server. The selection device 20 has not only a function of handling the plurality of storage devices 10 as one database but also a function of integrating the plurality of PF modules 11. Like the function of integrating the storage devices 10 into one database, this function is realized by an API provided in the selection device 20. That is, the API of the selection device 20 includes functions or instructions for selecting the PF module 11 that corresponds to a request from the terminal device 43.

Because the selection device 20 manages the PF modules 11, each PF module 11 functions, as seen from the server 42, equivalently to an individual server. That is, each PF module 11 appears to have both a data storage function and a database management system function. The individual PF modules 11 are constructed so that their storage capacities are equal. In other words, the PF modules 11 all have the same storage capacity regardless of the storage capacity of the underlying storage devices 10. Put differently, the distributed database can be regarded as a combination of a plurality of PF modules 11 of equal storage capacity.

Now assume that, in order to suppress bottlenecks, the load is distributed across 100 PF modules 11 in the distributed database. If only one storage device 10 with a storage capacity of 50 gigabytes is used when the distributed database is first introduced, all 100 PF modules 11 must be provided on that one storage device 10, so each PF module 11 has a storage capacity of 500 megabytes.

Suppose next that a 150-gigabyte storage device 10 is newly introduced as the amount of data to be stored increases. The two storage devices 10 have a combined storage capacity of 200 gigabytes, so if 100 PF modules 11 are configured as in the single-device case, each PF module 11 has a storage capacity of 2 gigabytes. The first storage device 10 has 50 gigabytes and can therefore hold 25 PF modules 11, and the second storage device 10 has 150 gigabytes and can therefore hold 75 PF modules 11.

Suppose further that, in the same way, the number of storage devices 10 is increased as the amount of data to be stored grows. Assume here that 100 storage devices 10, each with a storage capacity of 50 gigabytes, are provided. The distributed database then has 5 terabytes in total, so if 100 PF modules 11 are provided, each PF module 11 has a storage capacity of 50 gigabytes. Since 100 PF modules 11 are required, each storage device 10 corresponds to exactly one PF module 11.
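The three capacity scenarios above follow the same arithmetic, which can be sketched as below. This is only an illustration: the function name `partition_modules` and the rule of rejecting capacities that are not whole multiples of the module size are assumptions, not part of the described embodiment. Capacities are in decimal gigabytes, matching the figures in the text.

```python
from fractions import Fraction


def partition_modules(device_capacities_gb, module_count=100):
    """Split the total capacity into `module_count` equal PF modules.

    Returns (module_size_gb, modules_per_device). Hypothetical helper:
    it assumes each device holds a whole number of modules.
    """
    total = sum(device_capacities_gb)
    module_size = Fraction(total, module_count)
    per_device = []
    for capacity in device_capacities_gb:
        count = Fraction(capacity) / module_size
        if count.denominator != 1:
            raise ValueError("capacity is not a whole multiple of the module size")
        per_device.append(int(count))
    return module_size, per_device


# One 50 GB device: 100 modules of 0.5 GB (500 MB) each.
size_one, counts_one = partition_modules([50])

# 50 GB + 150 GB: 2 GB per module, split 25 / 75.
size_two, counts_two = partition_modules([50, 150])

# One hundred 50 GB devices: 50 GB per module, one module per device.
size_many, counts_many = partition_modules([50] * 100)
```

Because the module count stays fixed at 100, adding devices changes only the module size and the number of modules hosted on each device.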

As a general tendency, the higher the processing capability of a storage device 10, the larger its storage capacity. Because the PF modules 11 here have equal storage capacity, the throughput of the PF modules 11 is approximately equal regardless of the processing capability of the storage devices 10. The storage area of each PF module 11 is further subdivided, but this is not essential to the invention and is not described here.

The local ID (second identification information) extracted by the second extraction unit 22 provided in the selection device 20 may be identification information that identifies a PF module 11. In that case, identifying where the required data is stored additionally requires identification information indicating the storage location of the data within the PF module 11.

To distribute the load so that data is spread across the plurality of PF modules 11, each piece of data is given, for example, a data ID that is a non-negative integer (a natural number including 0), and the remainder obtained by dividing the data ID by the number of PF modules 11 is used as the local ID. That is, if the data ID is X and the number of PF modules 11 is K, the local ID is the remainder of X divided by K (that is, X mod K). Local IDs are numbered starting from 0. Accordingly, when the remainder is 0, the PF module 11 whose local ID is 0 is selected.

In this case, when the data IDs of data stored successively in the storage devices 10 of the same group differ by 1, the data is stored successively in different PF modules 11. The data is thus distributed across different PF modules 11, much as in striping across multiple hard disks. As a result, access does not concentrate on a single PF module 11, and access to the data is leveled.
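The remainder rule and its striping effect can be illustrated with a short sketch. The function name `local_id` and the choice of K = 4 are assumptions made for the example. The text specifies only that the local ID is X mod K and that local IDs start from 0.

```python
from collections import Counter


def local_id(data_id: int, module_count: int) -> int:
    """Return the local ID (X mod K) for a non-negative data ID."""
    if data_id < 0:
        raise ValueError("data ID must be a non-negative integer")
    if module_count <= 0:
        raise ValueError("module count must be positive")
    return data_id % module_count


K = 4  # illustrative number of PF modules, not taken from the source

# Consecutive data IDs cycle through the modules, as in disk striping.
placements = [local_id(x, K) for x in range(8)]

# Over many consecutive IDs, every module receives the same share.
load = Counter(local_id(x, K) for x in range(1000))
```

With consecutive data IDs, `placements` cycles through local IDs 0 to K - 1, and `load` shows an even count per module, which is the leveling described above.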

As described above, the global ID consists of an issuer ID and a local ID; the issuer ID is used to determine the group of storage devices 10 in which the data is stored, and the local ID is then used to determine the PF module 11 in which the data is stored. Accordingly, when required data is extracted, specifying the type of data narrows the search to a group of storage devices 10, and the PF module 11 can be identified within the group using the local ID. In the end, the data only needs to be accessed within the scope of a single PF module 11, so access to the data is easy.
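The two-stage lookup can be sketched as follows, purely as an illustration. The class and field names (`GlobalId`, `SelectionDevice`, `key`), the use of in-memory dictionaries as PF modules, and the issuer ID values are all assumptions. The `key` field stands in for the in-module location information that the text says is needed once the local ID has identified a PF module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalId:
    issuer_id: str  # first identification information: selects the group
    local_id: int   # second identification information: selects the PF module
    key: str        # hypothetical location of the data inside the module


class SelectionDevice:
    """Toy stand-in for selection device 20 (hypothetical structure)."""

    def __init__(self, groups):
        # groups maps an issuer ID to a list of PF modules (plain dicts here).
        self._groups = groups

    def first_extraction(self, gid):
        """Stage 1: use the issuer ID to pick the group."""
        return self._groups[gid.issuer_id]

    def second_extraction(self, group, gid):
        """Stage 2: use the local ID to pick the PF module in that group."""
        return group[gid.local_id]

    def put(self, gid, value):
        module = self.second_extraction(self.first_extraction(gid), gid)
        module[gid.key] = value

    def get(self, gid):
        module = self.second_extraction(self.first_extraction(gid), gid)
        return module[gid.key]


# Two groups with four PF modules each (illustrative sizes and names).
groups = {
    "issuer-a": [{} for _ in range(4)],
    "issuer-b": [{} for _ in range(4)],
}
device = SelectionDevice(groups)
gid = GlobalId(issuer_id="issuer-a", local_id=2, key="reading-001")
device.put(gid, "42 kWh")
value = device.get(gid)
```

Only one module of one group is touched by each `put` or `get`, which mirrors the point that access is confined to a single PF module.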

In the configuration example above, the selection device 20 divides the data ID and takes the remainder to determine the PF module 11 that stores the data, but another relationship, such as random numbers, may be used to distribute data across the PF modules 11.
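The source does not say how such an alternative relationship would be realized. One possibility, shown here only as an assumption, is to replace the plain remainder with a deterministic hash of the data ID, which spreads IDs in a pseudo-random way while still letting the same ID be mapped to the same PF module on later reads. The function name `hashed_module` is hypothetical, and this is a swapped-in technique rather than the method of the embodiment.

```python
import hashlib


def hashed_module(data_id: int, module_count: int) -> int:
    """Map a data ID to a PF module index via SHA-256 (illustrative only)."""
    if module_count <= 0:
        raise ValueError("module count must be positive")
    digest = hashlib.sha256(str(data_id).encode("ascii")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce mod K.
    return int.from_bytes(digest[:8], "big") % module_count


K = 4  # illustrative number of PF modules
modules_used = {hashed_module(x, K) for x in range(1000)}
```

A truly random assignment would additionally require recording which module each data ID was sent to, whereas a hash keeps the mapping recomputable from the data ID alone.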

As the above examples show, because the PF modules 11 are configured virtually on top of the physical storage devices 10, not only is the load leveled, but the number of storage devices 10 can also be increased in stages according to the amount of data to be stored. That is, scalability of both the throughput and the storage capacity of the distributed database can be ensured. The distributed database described in the embodiment can also be constructed regardless of whether the owner's equipment is entrusted to a specialist operator (housing) or the specialist operator's equipment is used (hosting). The distributed database can therefore be operated in an environment suited to the service provided by the server 42.

DESCRIPTION OF SYMBOLS
10 Storage device
20 Selection device
21 First extraction unit
22 Second extraction unit
30 Telecommunication line
41 Database server
42 Server
43 Terminal device
44 Equipment
G1, G2 Groups

Claims

A distributed database system comprising:
a plurality of storage devices, each having a function of communicating through a telecommunication line, the storage devices forming a group for each type of data to be stored; and
a selection device that accesses required data specified by identification information notified through the telecommunication line,
wherein the identification information includes
first identification information for identifying the group, and
second identification information for identifying the data in the group, and
the selection device includes
a first extraction unit that uses the first identification information to access the group in which the required data is stored, and
a second extraction unit that uses the second identification information to access the required data within the group accessed by the first extraction unit.

The distributed database system according to claim 1, wherein the selection device is a server that communicates with the storage devices through the telecommunication line.

A selection device that is the server used in the distributed database system according to claim 2.