JP2016053829A

JP2016053829A - Information processing method, program, and information processing device

Info

Publication number: JP2016053829A
Application number: JP2014179367A
Authority: JP
Inventors: 雄田中; Takeshi Tanaka; 白井　太三; Taizo Shirai; 太三白井; 洋平川元; Yohei Kawamoto; 紘一作本; Koichi Sakumoto
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2014-09-03
Filing date: 2014-09-03
Publication date: 2016-04-14
Also published as: WO2016035448A1

Abstract

PROBLEM TO BE SOLVED: To efficiently derive a condition for minimizing a decrease in an information amount and abstracting information.SOLUTION: An information processing method includes retrieval processing for retrieving a minimal origin satisfying anonymity based on a prescribed condition in a first bundle defined on the basis of an order relation for abstracting classifications more with a combination of a plurality of classifications for dividing attribute values which can be taken by a plurality of object attributes as a first origin, and reference point derivation processing for deriving a new reference point on the basis of a minimal traverse of a complementary set of second origins corresponding to the retrieved minimal origin in a second bundle in which an inclusion relation of the second origins is defined as an order relation with each subset of a finite set defined on the basis of the number of divisions in each object attribute as the second origin, executes retrieval processing based on the new reference point and reference point derivation processing based on a result of the retrieval processing until a new reference point is not derived any more, and outputs a combination of divisions indicated by each of a series of retrieved minimal origins.SELECTED DRAWING: Figure 1

Description

本開示は、情報処理方法、プログラム、及び情報処理装置に関する。 The present disclosure relates to an information processing method, a program, and an information processing apparatus.

近年では、ネットワークや各種センサ等の発達に伴い、各種情報を収集する手段が多様化しており、収集される情報も多岐に亘り、このような収集手段により収集された各種情報が様々なビジネスに活用されている。また、近年では、例えば、データマイニングと呼ばれる技術等を活用することで、個々のデータのみでは明示されておらず導き出すことが困難な情報を、データベース等のような複数のデータの集合から抽出することも可能となってきている。 In recent years, with the development of networks and various sensors, the means for collecting various information has been diversified, and the information collected is diverse. Various information collected by such collecting means can be used in various businesses. It is utilized. In recent years, for example, by utilizing a technique called data mining, information that is not clearly shown only by individual data and is difficult to derive is extracted from a set of a plurality of data such as a database. It is also possible.

米国特許出願公開第２０１０／０３３２５３７号明細書US Patent Application Publication No. 2010/0332537

一方で、収集される情報の中には、個人を特定することが可能な情報も含まれており、当該個人のプライバシの保護を要する場合もある。このような課題に対して、収集された各種情報を抽象化することで匿名性を高め、個人のプライバシを保護することも可能ではあるが、情報の抽象化に伴い情報量が減少し、データの有用性が損なわれる場合もある。 On the other hand, the collected information includes information that can identify an individual, and may require protection of the privacy of the individual. In response to such issues, it is possible to improve anonymity by abstracting various collected information and protect personal privacy, but the amount of information decreases with the abstraction of information, and data The usefulness of may be impaired.

そこで、本開示では、情報量の減少を最低限に抑えて情報を抽象化するための条件を効率よく導出することが可能な、情報処理方法、プログラム、及び情報処理装置を提案する。 Therefore, the present disclosure proposes an information processing method, a program, and an information processing apparatus that can efficiently derive a condition for abstracting information while minimizing a decrease in the amount of information.

本開示によれば、プロセッサが、複数の属性を含むレコードを複数含むテーブルの当該複数の属性のうち、少なくとも２以上の属性を対象属性として、当該対象属性が取り得る属性値それぞれが段階的により抽象化されるように区分する複数の分類の、前記２以上の対象属性間における各組み合わせを第１の元として、前記分類をより抽象化する順序関係に基づき規定される第１の束のうち、前記第１の元それぞれに対応する前記分類に基づき前記テーブルを抽象化した場合に、当該テーブルの前記レコード間で、所定の条件に基づく匿名性を満たす前記第１の元を基準点として、当該匿名性を満たす極小元を探索する探索処理を実行することと、前記対象属性ごとの前記分類の数に基づき規定される有限集合の各部分集合を第２の元とし、当該第２の元の包含関係を順序関係として規定される第２の束のうち、探索された前記極小元にあらかじめ対応付けられた前記第２の元の補集合を特定し、当該補集合の極小横断それぞれに対応する前記第１の元のうち、前記匿名性を満たす第１の元を新たな前記基準点として導出する基準点導出処理を実行することと、を含み、プロセッサが、特定された前記新たな基準点に基づく前記探索処理と、当該探索処理により探索される前記極小元に基づく前記基準点導出処理とを、前記新たな基準点が導出されなくなるまで実行し、探索された一連の前記極小元それぞれが示す前記分類の組み合わせを、前記テーブルを抽象化するための条件の候補として出力する、情報処理方法が提供される。 According to the present disclosure, each attribute value that can be taken by the target attribute is gradually increased by using at least two or more attributes among the plurality of attributes of the table including a plurality of records including the plurality of attributes as the target attribute. Among the first bundles defined based on the order relation that makes the classification more abstract, with each combination between the two or more target attributes as a first element of a plurality of classifications to be abstracted When the table is abstracted based on the classification corresponding to each of the first elements, the first element that satisfies anonymity based on a predetermined condition is used as a reference point between the records of the table. Performing a search process for searching for a minimal element satisfying the anonymity, and each subset of a finite set defined based on the number of classifications for each target attribute as a second element, Among the second bundles defined as the order relation of the second original inclusion relation, the second original complement set previously associated with the searched minimal element is specified, and Performing a reference point derivation process for deriving a first element satisfying the anonymity among the first elements corresponding to each of the minimum crossings as the new reference point, and a processor is specified. The search process based on the new reference point and the reference point derivation process based on the minimal element searched by the search process are executed until the new reference point is not derived, and a series of searched There is provided an information processing method for outputting a combination of the classifications indicated by each of the minimal elements as a candidate for a condition for abstracting the table.

また、本開示によれば、コンピュータに、複数の属性を含むレコードを複数含むテーブルの当該複数の属性のうち、少なくとも２以上の属性を対象属性として、当該対象属性が取り得る属性値それぞれが段階的により抽象化されるように区分する複数の分類の、前記２以上の対象属性間における各組み合わせを第１の元として、前記分類をより抽象化する順序関係に基づき規定される第１の束のうち、前記第１の元それぞれに対応する前記分類に基づき前記テーブルを抽象化した場合に、当該テーブルの前記レコード間で、所定の条件に基づく匿名性を満たす前記第１の元を基準点として、当該匿名性を満たす極小元を探索する探索処理と、前記対象属性ごとの前記分類の数に基づき規定される有限集合の各部分集合を第２の元とし、当該第２の元の包含関係を順序関係として規定される第２の束のうち、探索された前記極小元にあらかじめ対応付けられた前記第２の元の補集合を特定し、当該補集合の極小横断それぞれに対応する前記第１の元のうち、前記匿名性を満たす第１の元を新たな前記基準点として導出する基準点導出処理と、を実行させ、特定された前記新たな基準点に基づく前記探索処理と、当該探索処理により探索される前記極小元に基づく前記基準点導出処理とを、前記新たな基準点が導出されなくなるまで実行させる、探索された一連の前記極小元それぞれが示す前記分類の組み合わせを、前記テーブルを抽象化するための条件の候補として出力させる、プログラムが提供される。 Further, according to the present disclosure, each attribute value that can be taken by the target attribute is set to the computer with at least two or more attributes among the plurality of attributes of the table including a plurality of records including the plurality of attributes. A first bundle defined based on an order relationship that makes the classification more abstract, with each combination between the two or more target attributes as a first element of a plurality of classifications that are classified so as to be more abstract When the table is abstracted based on the classification corresponding to each of the first elements, the first element satisfying anonymity based on a predetermined condition between the records of the table is a reference point. As a second element, a search process for searching for a minimal element satisfying the anonymity and each subset of a finite set defined based on the number of classifications for each target attribute, Among the second bundles that define the original inclusion relation as the order relation, specify the second original complement that is associated in advance with the searched minimal element, and each minimal traversal of the complement A reference point deriving process for deriving the first element satisfying the anonymity among the first elements corresponding to the new reference point as the new reference point, and based on the specified new reference point The classification indicated by each of the searched series of minimal elements, wherein a search process and the reference point derivation process based on the minimal element searched by the search process are executed until the new reference point is not derived. Is provided as a candidate for a condition for abstracting the table.

また、本開示によれば、複数の属性を含むレコードを複数含むテーブルの当該複数の属性のうち、少なくとも２以上の属性を対象属性として、当該対象属性が取り得る属性値それぞれが段階的により抽象化されるように区分する複数の分類の、前記２以上の対象属性間における各組み合わせを第１の元として、前記分類をより抽象化する順序関係に基づき規定される第１の束のうち、前記第１の元それぞれに対応する前記分類に基づき前記テーブルを抽象化した場合に、当該テーブルの前記レコード間で、所定の条件に基づく匿名性を満たす前記第１の元を基準点として、当該匿名性を満たす極小元を探索する探索処理を実行する極小元探索部と、前記対象属性ごとの前記分類の数に基づき規定される有限集合の各部分集合を第２の元とし、当該第２の元の包含関係を順序関係として規定される第２の束のうち、探索された前記極小元にあらかじめ対応付けられた前記第２の元の補集合を特定し、当該補集合の極小横断それぞれに対応する前記第１の元のうち、前記匿名性を満たす第１の元を新たな前記基準点として導出する基準点導出処理を実行する基準点導出部と、特定された前記新たな基準点に基づく前記探索処理と、当該探索処理により探索される前記極小元に基づく前記基準点導出処理とが、前記新たな基準点が導出されなくなるまで実行された結果に基づき、探索された一連の前記極小元それぞれが示す前記分類の組み合わせを、前記テーブルを抽象化するための条件の候補として出力する出力部と、を備える、情報処理装置が提供される。 Further, according to the present disclosure, at least two or more attributes of the plurality of attributes of the table including a plurality of records including a plurality of attributes are set as target attributes, and each attribute value that can be taken by the target attribute is abstracted in stages. Of the first bundle defined based on the order relationship that makes the classification more abstract, with each combination between the two or more target attributes as a first element of a plurality of classifications that are classified to be When the table is abstracted based on the classification corresponding to each of the first elements, the first element satisfying anonymity based on a predetermined condition is used as a reference point between the records of the table. A minimal element search unit that executes a search process for searching for a minimal element that satisfies anonymity, and each subset of a finite set defined based on the number of classifications for each target attribute as a second element, Among the second bundles defined as the order relation of the second original inclusion relation, the second original complement set previously associated with the searched minimal element is specified, and A reference point deriving unit that executes a reference point deriving process for deriving a first element satisfying the anonymity as a new reference point among the first elements corresponding to each of the minimum crossings, and the specified new The search process based on the reference point and the reference point derivation process based on the minimal element searched by the search process are searched based on the results executed until the new reference point is not derived. There is provided an information processing apparatus comprising: an output unit that outputs the combination of classifications indicated by each of a series of the minimal elements as a candidate for a condition for abstracting the table.

以上説明したように本開示によれば、情報量の減少を最低限に抑えて情報を抽象化するための条件を効率よく導出することが可能な、情報処理方法、プログラム、及び情報処理装置が提供される。 As described above, according to the present disclosure, there is provided an information processing method, a program, and an information processing apparatus capable of efficiently deriving a condition for abstracting information while minimizing a decrease in the amount of information. Provided.

なお、上記の効果は必ずしも限定的なものではなく、上記の効果とともに、または上記の効果に代えて、本明細書に示されたいずれかの効果、または本明細書から把握され得る他の効果が奏されてもよい。 Note that the above effects are not necessarily limited, and any of the effects shown in the present specification, or other effects that can be grasped from the present specification, together with or in place of the above effects. May be played.

本開示の実施形態に係る情報処理装置が適用された情報処理システムの概略的なシステム構成の一例を示している。1 illustrates an example of a schematic system configuration of an information processing system to which an information processing apparatus according to an embodiment of the present disclosure is applied. テーブルの構成の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of a structure of a table. テーブルの一部の属性の属性値を一般化することで、より匿名性を高めたテーブルを生成する処理の概要について説明するための説明図である。It is explanatory drawing for demonstrating the outline | summary of the process which produces | generates the table which improved more anonymity by generalizing the attribute value of the one part attribute of a table. （ｋ，ｔ）−匿名性について説明するための説明図である。It is explanatory drawing for demonstrating (k, t) -anonymity. 属性の一般化階層について説明するための説明図である。It is explanatory drawing for demonstrating the generalization hierarchy of an attribute. 属性の一般化階層について説明するための説明図である。It is explanatory drawing for demonstrating the generalization hierarchy of an attribute. 一般化戦略からなる束について説明するための説明図である。It is explanatory drawing for demonstrating the bundle which consists of generalization strategies. ハイパーグラフ及び横断について説明するための説明図である。It is explanatory drawing for demonstrating a hypergraph and crossing. ハイパーグラフ及び横断について説明するための説明図である。It is explanatory drawing for demonstrating a hypergraph and crossing. 同実施形態に係る情報処理装置の機能構成の一例について説明するための説明図である。It is an explanatory view for explaining an example of a functional configuration of the information processing apparatus according to the embodiment. 対象となる属性の属性値を一般化する処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process which generalizes the attribute value of the attribute used as object. 対象となる属性の属性値を、入力として指定された一般化戦略と、当該属性の一般化階層とに基づき一般化する処理の処理結果の一例である。It is an example of the process result of the process which generalizes the attribute value of the attribute which becomes object based on the generalization strategy designated as an input and the generalization hierarchy of the said attribute. 所望の一般化戦略が、（ｋ，ｔ）−匿名性を満たすことが可能な一般化戦略か否かを判定する処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process which determines whether a desired generalization strategy is a generalization strategy which can satisfy | fill (k, t) -anonymity. 般化戦略からなる束と離散集合から誘導される束との間の対応関係の一例について示した図である。It is the figure shown about an example of the correspondence between the bundle | flux which consists of a generalization strategy, and the bundle | flux induced | guided | derived from a discrete set. 一般化戦略からなる束の元を、離散集合から誘導される束の元に変換する処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process which converts the element of the bundle which consists of a generalization strategy into the element of the bundle induced | guided | derived from a discrete set. 一般化戦略からなる束の元を、離散集合から誘導される束の元に変換する処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process which converts the element of the bundle which consists of a generalization strategy into the element of the bundle induced | guided | derived from a discrete set. 離散集合から誘導される束の元を、一般化戦略からなる束の元に変換する処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process which converts the element of the bundle | flux induced | guided | derived from a discrete set into the element of the bundle | flux which consists of a generalization strategy. 一般化戦略からなる束における各一般化ノードから、（ｋ，ｔ）−匿名性を満たす極小元を探索する処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process which searches for the minimal element which satisfy | fills (k, t) -anonymity from each generalized node in the bundle | flux which consists of generalization strategies. 一般化戦略からなる束における各一般化ノードから、（ｋ，ｔ）−匿名性を満たす極小元を探索する処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process which searches for the minimal element which satisfy | fills (k, t) -anonymity from each generalized node in the bundle | flux which consists of generalization strategies. 解析部の一連の処理の流れの一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the flow of a series of processes of an analysis part. 解析部の一連の処理の流れの一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the flow of a series of processes of an analysis part. 基準点導出部の処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process of a control | reference point deriving part. 基準点導出部の処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process of a control | reference point deriving part. 基準点導出部の処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process of a control | reference point deriving part. 極小元探索部の処理の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of a process of the local minimum element search part. 極小元探索部の処理結果の一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the process result of a local minimum element search part. 同実施形態に係る情報処理装置による処理の計算量について説明するための説明図である。It is explanatory drawing for demonstrating the computational complexity of the process by the information processing apparatus which concerns on the embodiment. 変形例１に係る解析部の一連の処理の流れの一例について説明するための説明図である。10 is an explanatory diagram for describing an example of a flow of a series of processes of an analysis unit according to Modification 1. FIG. 変形例１に係る解析部の一連の処理の流れの一例について説明するための説明図である。10 is an explanatory diagram for describing an example of a flow of a series of processes of an analysis unit according to Modification 1. FIG. 変形例２に係る情報処理システムの一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the information processing system which concerns on the modification 2. FIG. 変形例２に係る情報処理システムの一例について説明するための説明図である。It is explanatory drawing for demonstrating an example of the information processing system which concerns on the modification 2. FIG. 同実施形態に係る情報処理装置のハードウェア構成の一例を示した図である。It is the figure which showed an example of the hardware constitutions of the information processing apparatus which concerns on the embodiment.

以下に添付図面を参照しながら、本開示の好適な実施の形態について詳細に説明する。なお、本明細書及び図面において、実質的に同一の機能構成を有する構成要素については、同一の符号を付することにより重複説明を省略する。 Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In addition, in this specification and drawing, about the component which has the substantially same function structure, duplication description is abbreviate | omitted by attaching | subjecting the same code | symbol.

なお、説明は以下の順序で行うものとする。
１．概要
２．用語の定義
２．１．（ｋ，ｔ）−匿名性
２．２．属性の一般化階層
２．３．一般化戦略
２．４．ハイパーグラフと横断
３．機能構成
４．処理
４．１．属性値の一般化
４．２．（ｋ，ｔ）−匿名性の判定
４．３．元の変換
４．４．極小元の探索
４．５．一連の処理の流れ
５．計算量
６．変形例
６．１．変形例１：一般化戦略の探索に係る処理負荷の軽減
６．２．変形例２：システム構成の一例
７．ハードウェア構成
８．まとめ The description will be made in the following order.
1. Overview 2. Definition of terms 2.1. (K, t) -anonymity 2.2. Generalized hierarchy of attributes 2.3. Generalization strategy 2.4. 2. Hypergraph and crossing Functional configuration Processing 4.1. Generalization of attribute values 4.2. (K, t) -anonymity determination 4.3. Original transformation 4.4. Search for local minimum 4.5. 4. Flow of a series of processes Calculation amount Modification 6.1. Modification 1: Reduction of processing load related to search for generalization strategy 6.2. Modification 2: Example of system configuration 7. Hardware configuration Summary

＜１．概要＞
まず、図１を参照して、本開示の実施形態に係る情報処理装置１０が適用された情報処理システム１の一例について説明したうえで、本実施形態に係る情報処理装置１０の課題について整理する。図１は、本実施形態に係る情報処理装置１０が適用された情報処理システム１の概略的なシステム構成の一例を示している。 <1. Overview>
First, with reference to FIG. 1, an example of the information processing system 1 to which the information processing apparatus 10 according to the embodiment of the present disclosure is applied will be described, and the problems of the information processing apparatus 10 according to the present embodiment will be organized. . FIG. 1 shows an example of a schematic system configuration of an information processing system 1 to which an information processing apparatus 10 according to the present embodiment is applied.

図１に示すように、情報処理システム１は、例えば、各種データが蓄積されたＤＢ（データベース）５０と、ＤＢサーバ７０と、１以上のクライアント８０と、情報処理装置１０と、管理者端末３０と、一般化ＤＢ４０とを含む。なお、情報処理システム１は、一般化ＤＢ４０を参照する解析サーバ６０を含んでもよい。 As illustrated in FIG. 1, the information processing system 1 includes, for example, a DB (database) 50 in which various types of data are stored, a DB server 70, one or more clients 80, an information processing apparatus 10, and an administrator terminal 30. And generalized DB40. The information processing system 1 may include an analysis server 60 that refers to the generalized DB 40.

ＤＢ５０は、所謂リレーショナルデータベースに相当し、複数の属性を含むデータをレコードとして、当該レコードを複数含むテーブルｄ１０を含んで構成される。例えば、図２は、ＤＢ５０に含まれるテーブルｄ１０の構成の一例について説明するための説明図である。なお、図２に示すテーブルｄ１０は、患者の病歴をデータとして管理するためのテーブルｄ１０の一例を示しており、当該患者の年齢、性別、及び病気等の情報を、患者ごとに管理している。 The DB 50 corresponds to a so-called relational database, and includes a table d10 including a plurality of records with data including a plurality of attributes as records. For example, FIG. 2 is an explanatory diagram for explaining an example of the configuration of the table d10 included in the DB 50. The table d10 shown in FIG. 2 shows an example of the table d10 for managing the patient's medical history as data, and manages information on the patient's age, sex, illness, etc. for each patient. .

図２において、参照符号ｄ１１は、「属性」を示しており、個々のデータ（後述する「レコード」）が含む特徴を示す項目に相当する。また、参照符号ｄ１３は、「属性値」を示しており、属性ｄ１１の取る具体的な値に相当する。また、参照符号ｄ１５は、「レコード」であり、各属性ｄ１１に対応する属性値ｄ１３それぞれの集合からなる。なお、レコードは、行（row）やタプル（tuble）と呼ばれる場合もある。 In FIG. 2, reference symbol d <b> 11 indicates “attribute” and corresponds to an item indicating a feature included in each piece of data (“record” to be described later). Reference sign d13 indicates an “attribute value” and corresponds to a specific value taken by the attribute d11. Reference numeral d15 is a “record”, and includes a set of attribute values d13 corresponding to the attributes d11. A record is sometimes called a row or a tuple.

また、テーブルｄ１０は、レコードｄ１５の集合である。なお、本説明では、図２に示すように、テーブルｄ１０は、属性ｄ１１ごとの属性値ｄ１３を列としてレコードｄ１５を形成し、当該レコードｄ１５を行方向に配列することで形成されているものとする。即ち、テーブルｄ１０は、各レコードｄ１５を、属性ｄ１１ごとの属性値ｄ１３の列ベクトルとしてリスト化して表現される。 The table d10 is a set of records d15. In this description, as shown in FIG. 2, the table d10 is formed by forming a record d15 with the attribute value d13 for each attribute d11 as a column, and arranging the record d15 in the row direction. To do. That is, the table d10 is expressed by listing each record d15 as a column vector of attribute values d13 for each attribute d11.

なお、本説明では、本実施形態に係る情報処理装置１０の特徴をよりわかりやすくするために、属性ｄ１１を、識別子ｄ１１１と、準識別子ｄ１１３と、機微属性ｄ１１５とに分類して説明する。 In this description, the attribute d11 is classified into an identifier d111, a quasi-identifier d113, and a sensitive attribute d115 in order to make the characteristics of the information processing apparatus 10 according to the present embodiment easier to understand.

識別子ｄ１１１は、各レコードｄ１５を一意に特定するための属性であり、具体的な一例として、主キー（primary key）が挙げられる。識別子ｄ１１１の具体的な一例としては、アカウントＩＤやサービスＩＤのような、所謂ＩＤ（identifier）が挙げられる。 The identifier d111 is an attribute for uniquely specifying each record d15, and a specific example is a primary key. A specific example of the identifier d111 is a so-called ID (identifier) such as an account ID or a service ID.

機微属性ｄ１１５は、例えば、テーブルｄ１０の特性を決定づけていると考えられる属性に相当する。具体的な一例として、図２に示すように、患者の病歴をデータとして管理するテーブルｄ１０の場合には、患者の「病気」を示す属性ｄ１１は、テーブルｄ１０を特徴づける（当該テーブルｄ１０に特徴的な）属性であり、機微属性ｄ１１５に相当する。 The sensitive attribute d115 corresponds to, for example, an attribute that is considered to determine the characteristics of the table d10. As a specific example, as shown in FIG. 2, in the case of the table d10 that manages the patient's medical history as data, the attribute d11 indicating the patient's “disease” characterizes the table d10 (characterized by the table d10). Attribute) and corresponds to the sensitive attribute d115.

準識別子ｄ１１３は、単体では各レコードｄ１５を識別することが困難ではあるが、他の属性ｄ１１と組み合わせることで、各レコードｄ１５を一意に特定し得る、もしくは、特定のレコードｄ１５を絞り込み得る属性を示している。図２に示す例では、参照符号ｄ１１３ａ及びｄ１１３ｂで示された「年齢」及び「性別」を示す属性は、これらの属性を組み合わせることで、対象となる患者（個人）を絞り込み得ることから、準識別子ｄ１１３に相当する。 Although it is difficult to identify each record d15 by itself, the quasi-identifier d113 is an attribute that can uniquely identify each record d15 by combining with other attributes d11 or can narrow down a specific record d15. Show. In the example shown in FIG. 2, the attributes indicating “age” and “gender” indicated by the reference signs d113a and d113b can be narrowed down by combining these attributes to narrow down target patients (individuals). It corresponds to the identifier d113.

ＤＢサーバ７０は、ＤＢ５０を参照または更新するためのサーバである。また、クライアント８０は、ネットワークｎ１を介してＤＢサーバ７０にアクセスすることで、ＤＢ５０内のデータを参照または更新するための端末を示している。 The DB server 70 is a server for referring to or updating the DB 50. In addition, the client 80 is a terminal for referring to or updating data in the DB 50 by accessing the DB server 70 via the network n1.

情報処理装置１０は、ＤＢ５０に含まれる各テーブルのうち、少なくとも一部のテーブルｄ１０について、一部の属性ｄ１１の属性値ｄ１３を一般化（抽象化）することで、より匿名性を高めたテーブルｄ３０を生成する。 The information processing device 10 generalizes (abstracts) the attribute values d13 of some of the attributes d11 for at least some of the tables d10 included in the DB 50, thereby increasing anonymity. d30 is generated.

例えば、図３は、テーブルｄ１０の一部の属性ｄ１１の属性値ｄ１３を一般化（抽象化）することで、より匿名性を高めたテーブルｄ３０を生成する処理の概要について説明するための説明図である。なお、本説明では、情報処理装置１０の処理をわかりやすくするために、図２に示したテーブルｄ１０のように、テーブルｄ１０が所謂ユーザの個人情報を管理するテーブルである場合を例に説明する。また、機微属性ｄ１１５が個人のプライバシに関わる属性ｄ１１を示しており、準識別子ｄ１１３が個人を特定し得る属性ｄ１１であるものとして説明する。 For example, FIG. 3 is an explanatory diagram for explaining an outline of processing for generating a table d30 with higher anonymity by generalizing (abstracting) attribute values d13 of some attributes d11 of the table d10. It is. In this description, in order to make the processing of the information processing apparatus 10 easy to understand, a case where the table d10 is a table for managing so-called personal information of a user, such as the table d10 shown in FIG. 2, will be described as an example. . Further, it is assumed that the sensitive attribute d115 indicates the attribute d11 related to the privacy of the individual, and the quasi-identifier d113 is the attribute d11 that can specify the individual.

具体的には、図３に示す例では、情報処理装置１０は、テーブルｄ１０における「年齢」及び「性別」を示す属性ｄ１１３ａ及びｄ１１３ｂの属性値を一般化（抽象化）している。 Specifically, in the example illustrated in FIG. 3, the information processing apparatus 10 generalizes (abstracts) the attribute values of the attributes d113a and d113b indicating “age” and “sex” in the table d10.

例えば、情報処理装置１０は、年齢を示す属性ｄ１１３ａの属性値ｄ１５を、１０代や２０代のように世代を示す属性値に分類することで、当該属性ｄ１１３ａの属性値ｄ１５を一般化し、年齢を示す属性値の匿名性を高めている。 For example, the information processing apparatus 10 generalizes the attribute value d15 of the attribute d113a by classifying the attribute value d15 of the attribute d113a indicating the age into attribute values indicating the generation such as teens or twenties. The anonymity of the attribute value indicating is increased.

具体的には、年齢を示す属性ｄ１１３ａの属性値の定義域を｛０，１，・・・，１２５｝、世代の定義域｛１０歳未満，１０代，２０代，・・・，９０代，１００歳以上｝とした場合に、情報処理装置１０は、年齢を示す属性値０〜９を、１０歳未満を示す属性値に対応付ける。同様に、情報処理装置１０は、年齢を示す属性値１０〜１９を、１０代を示す属性値に対応付けることとなる。 Specifically, the definition area of the attribute value of the attribute d113a indicating the age is {0, 1,..., 125}, the generation definition area {under 10 years old, teens, 20s,. , 100 years old or older}, the information processing apparatus 10 associates attribute values 0 to 9 indicating age with attribute values indicating less than 10 years old. Similarly, the information processing apparatus 10 associates attribute values 10 to 19 indicating age with attribute values indicating teenagers.

即ち、図３に示す例では、情報処理装置１０は、属性ｄ１１３ａの属性値のうち、２０〜２９は、２０代を示す属性値［２０，２９］に対応付け、３０〜３９は、３０代を示す属性値［３０，３９］に対応付けることで、年齢を示す属性値の匿名性を高めている。 That is, in the example illustrated in FIG. 3, the information processing apparatus 10 associates 20 to 29 with the attribute value [20, 29] indicating the 20s among the attribute values of the attribute d113a, and 30 to 39 indicates the 30s. Is associated with the attribute value [30, 39] indicating the anonymity of the attribute value indicating the age.

同様に、情報処理装置１０は、性別を示す属性ｄ１１３ｂの属性値｛男，女，未登録｝を、性別を示す情報が登録されているか否か（即ち、性別を示す情報の登録状況）を示す属性値｛登録，未登録｝に対応付けている。具体的には、情報処理装置１０は、属性ｄ１１３ｂの属性値のうち、「男」及び「女」は、性別を示す情報が登録されていることを示す属性値「登録」に対応付けることで、性別を示す属性値の匿名性を高めている。 Similarly, the information processing apparatus 10 uses the attribute value {male, female, unregistered} of the attribute d113b indicating gender as to whether or not information indicating gender is registered (that is, the registration status of information indicating gender). The attribute value {registered, unregistered} is indicated. Specifically, the information processing apparatus 10 associates “male” and “female” among the attribute values of the attribute d113b with the attribute value “registration” indicating that information indicating sex is registered, Anonymity of attribute values indicating gender is increased.

このような一般化に伴い、例えば、「ＩＤ」が「１」〜「２」で示されたレコードや、「ＩＤ」が「３」〜「４」で示されたレコードを、「年齢」及び「性別」の属性に基づき識別することが困難となる。 With such generalization, for example, records with “ID” “1” to “2” and records with “ID” “3” to “4” are changed to “age” and It becomes difficult to identify based on the attribute of “sex”.

一方で、「ＩＤ」が「７」で示されたレコードは、「年齢」及び「性別」を示す属性ｄ１１３ａ及びｄ１１３ｂの属性値を一般化しても、４０代であり、かつ、性別の情報が登録されているレコードが他に存在しないため、一意に特定され得る。このことは、「ＩＤ」が「８」で示されたレコードについても同様である。そのため、情報処理装置１０は、属性値が一般化されたテーブルｄ３０のうち、上記に示した一意に特定され得るレコードを、あらかじめ決められたレコード数以下の範囲で削除する。 On the other hand, the record whose “ID” is “7” is in the 40s even if the attribute values of the attributes d113a and d113b indicating “age” and “sex” are generalized, and the sex information is Since no other record exists, it can be uniquely identified. The same applies to the record whose “ID” is “8”. Therefore, the information processing apparatus 10 deletes the above-described uniquely identifiable records in the table d30 whose attribute values are generalized within a range equal to or less than the predetermined number of records.

このような構成により、テーブルｄ３０中のいずれのレコードを抽出し、当該レコードに対応する患者の識別を試みたとしても、対応する患者の可能性を２通り未満に絞り込むことが困難となり、匿名性が担保される。 With such a configuration, even if any record in the table d30 is extracted and identification of the patient corresponding to the record is attempted, it becomes difficult to narrow down the possibility of the corresponding patient to less than two ways, and the anonymity Is secured.

ここで、図３に示すテーブルｄ１０の例では、悪意のある第三者が「年齢」及び「性別」を基に個人を特定し得る場合には、対象となるレコードを一意に特定し得た。そのため、悪意のある第三者は、「年齢」及び「性別」の情報を基に特定の個人に対応するレコードをテーブルｄ１０中から特定し、当該特定したレコードに基づき、当該個人の「病気」という新たな情報を知り得ることとなる。なお、「年齢」及び「性別」等の既知の情報に基づき、特定の個人に対応するレコードをテーブル中から特定することを、以降では、「再識別」と記載する場合がある。 Here, in the example of the table d10 shown in FIG. 3, when a malicious third party can identify an individual based on “age” and “gender”, the target record can be uniquely identified. . Therefore, a malicious third party specifies a record corresponding to a specific individual from the table d10 based on the information of “age” and “gender”, and based on the specified record, the “disease” of the individual You will get new information. Note that specifying a record corresponding to a specific individual from the table based on known information such as “age” and “gender” may be referred to as “re-identification” hereinafter.

一方で、テーブルｄ３０に示すように、いずれのレコードを抽出したとしても、当該レコードと「年齢」及び「性別」が重複する他のレコードが存在する場合には、悪意のある第三者は、「年齢」及び「性別」を基に個人を一意に特定することが困難となる。即ち、テーブルｄ３０は、テーブルｄ１０よりも匿名性が高められていることがわかる。 On the other hand, as shown in the table d30, even if any record is extracted, if there is another record in which the record and "age" and "gender" overlap, the malicious third party It becomes difficult to uniquely identify an individual based on “age” and “gender”. That is, it can be seen that the anonymity of the table d30 is higher than that of the table d10.

以上のようにして、情報処理装置１０は、テーブルｄ１０について、一部の属性ｄ１１の属性値ｄ１３を一般化（抽象化）することで、より匿名性を高めたテーブルｄ３０を生成する。そして、情報処理装置１０は、生成したテーブルｄ３０を、一般化ＤＢ４０に出力する。一般化ＤＢ４０は、情報処理装置１０により生成されたテーブルｄ３０を管理するための構成である。 As described above, the information processing apparatus 10 generates a table d30 with higher anonymity by generalizing (abstracting) attribute values d13 of some attributes d11 with respect to the table d10. Then, the information processing apparatus 10 outputs the generated table d30 to the generalized DB 40. The generalized DB 40 is a configuration for managing the table d30 generated by the information processing apparatus 10.

なお、本説明では、情報処理システム１の構成をよりわかりやすくするために、ＤＢ５０と一般化ＤＢ４０とを別々の構成として説明しているが、必ずしもこのような構成には限定されない。具体的な一例として、ＤＢ５０と一般化ＤＢ４０とを一つのＤＢとして構成し、当該ＤＢにおいて、テーブルｄ１０と、当該テーブルｄ１０を基に生成されたテーブルｄ３０とを管理してもよい。 In this description, in order to make the configuration of the information processing system 1 easier to understand, the DB 50 and the generalized DB 40 are described as separate configurations. However, the configuration is not necessarily limited to such a configuration. As a specific example, the DB 50 and the generalized DB 40 may be configured as one DB, and the table d10 and the table d30 generated based on the table d10 may be managed in the DB.

また、管理者端末３０は、所謂ＤＢ５０や一般化ＤＢ４０の管理者が、情報処理装置１０を操作するためのインタフェースの一例である。 The administrator terminal 30 is an example of an interface for the administrator of the so-called DB 50 or generalized DB 40 to operate the information processing apparatus 10.

なお、ＤＢサーバ７０は、一般化ＤＢ４０中の、テーブルｄ３０を参照できるように構成されていてもよい。また、解析サーバ６０は、一般化ＤＢ４０に記憶されたテーブルｄ３０中の情報を、例えば、データマイニング等の技術を利用することで解析し、当該解析の結果を、ネットワークｎ１１を介してクライアント８０に出力してもよい。 The DB server 70 may be configured to refer to the table d30 in the generalized DB 40. Further, the analysis server 60 analyzes the information in the table d30 stored in the generalized DB 40 by using, for example, a technique such as data mining, and the analysis result is sent to the client 80 via the network n11. It may be output.

以上、図１を参照して、本実施形態に係る情報処理装置１０が適用された情報処理システム１の概略的なシステム構成の一例について説明した。 Heretofore, an example of a schematic system configuration of the information processing system 1 to which the information processing apparatus 10 according to the present embodiment is applied has been described with reference to FIG.

上記に説明したように、情報処理装置１０は、ＤＢ５０に含まれる少なくとも一部のテーブルｄ１０について、一部の属性ｄ１１の属性値ｄ１３を一般化し、必要に応じて一部のレコードを削除することで、より匿名性を高めたテーブルｄ３０を生成する。 As described above, the information processing apparatus 10 generalizes the attribute values d13 of some attributes d11 for at least some tables d10 included in the DB 50, and deletes some records as necessary. Thus, the table d30 with higher anonymity is generated.

なお、属性値をより一般化するほど、匿名性はより高くなる傾向にある。これに対して、属性値を一般化することで、従前では識別されていた各属性値が、同じ属性値としてまとめられることになるため、個々のレコードを識別するための情報が損なわれ、情報量が減少する。そのため、属性値をより一般化するほど、データの有用性がより損なわれる傾向にある。 In addition, the anonymity tends to be higher as the attribute value is more generalized. On the other hand, by generalizing attribute values, each attribute value that was previously identified is collected as the same attribute value, so the information for identifying individual records is impaired, and information The amount decreases. Therefore, as the attribute value becomes more general, the usefulness of the data tends to be impaired.

また、削除するレコード数の最大値が増大するほど、匿名性をより高くすることが可能となる傾向にある。一方で、レコードの削除が、情報量の減少につながることは言うまでもなく、削除するレコード数が増大すれば、情報量がより減少することとなる。即ち、削除するレコード数の最大値が増大するほど、母集合の範囲がより制限される可能性が高くなり、データの有用性がより損なわれる可能性がある。 In addition, as the maximum number of records to be deleted increases, anonymity tends to become higher. On the other hand, it goes without saying that deletion of a record leads to a decrease in the amount of information, and as the number of records to be deleted increases, the amount of information further decreases. That is, as the maximum value of the number of records to be deleted increases, there is a higher possibility that the range of the population is more restricted, and the usefulness of data may be further impaired.

このように、匿名性の担保と、データとしての有用性とはトレードオフの関係にある。そのため、所定の条件に基づく匿名性（後述する、（ｋ，ｔ）−匿名性）を満たし、かつ、情報量の減少を最小限に抑えることが可能な、属性値を抽象化するための条件を特定することが望ましい。 Thus, the guarantee of anonymity and the usefulness as data are in a trade-off relationship. Therefore, a condition for abstracting attribute values that satisfies anonymity based on a predetermined condition (to be described later, (k, t) -anonymity) and can minimize a decrease in the amount of information. It is desirable to specify.

その一方で、複数の属性ｄ１１について属性値ｄ１３を一般化することで匿名性を担保する場合には、どの属性ｄ１１についてどこまで一般化すれば（即ち、一般化のレベルをどの程度に設定すれば）、所定の条件に基づく匿名性が担保されるかは、テーブルｄ１０内のデータ（即ち、各レコードｄ１５）に依存する。そのため、所定の条件に基づく匿名性を満たし、かつ、情報量の減少を最小限に抑えることが可能な、属性値を抽象化するための条件を特定するための計算量を事前に見積もることが困難であった。 On the other hand, when anonymity is ensured by generalizing the attribute value d13 for a plurality of attributes d11, to what extent the attribute d11 is generalized (that is, how much the level of generalization is set) ) Whether the anonymity based on a predetermined condition is secured depends on the data in the table d10 (that is, each record d15). Therefore, it is possible to estimate in advance the amount of calculation for specifying the conditions for abstracting attribute values that can satisfy anonymity based on predetermined conditions and minimize the decrease in the amount of information. It was difficult.

また、一般化の対象となる属性ｄ１１の数が増えるほど、複数の属性ｄ１１間における一般化のレベルの組み合わせの数は増大する。このような特性から、所定の条件に基づく匿名性を満たし、かつ、情報量の減少を最小限に抑えることが可能な、属性値を抽象化するための条件を効率よく導出することが可能な仕組みが求められている。 Further, as the number of attributes d11 to be generalized increases, the number of combinations of generalization levels among a plurality of attributes d11 increases. From these characteristics, it is possible to efficiently derive conditions for abstracting attribute values that can satisfy anonymity based on a predetermined condition and can minimize a decrease in the amount of information. A mechanism is required.

そこで、本実施形態に係る情報処理装置１０として、所定の条件に基づく匿名性を満たし、かつ、情報量の減少を最小限に抑えることが可能な、属性値を抽象化するための条件を効率よく導出することが可能な仕組みを提案する。以下に、本実施形態に係る情報処理装置１０について、さらに詳しく説明する。 Therefore, as the information processing apparatus 10 according to the present embodiment, a condition for abstracting attribute values that satisfies anonymity based on a predetermined condition and can minimize a reduction in the amount of information is efficiently used. We propose a mechanism that can be derived well. Hereinafter, the information processing apparatus 10 according to the present embodiment will be described in more detail.

＜２．用語の定義＞
本実施形態に係る情報処理装置１０の詳細について説明するにあたり、まず、本説明で使用する用語の定義について以下にまとめる。 <2. Definition of terms>
In describing details of the information processing apparatus 10 according to the present embodiment, first, definitions of terms used in the present description are summarized below.

［２．１．（ｋ，ｔ）−匿名性］
本項では、図４を参照して、匿名性の条件である（ｋ，ｔ）−匿名性について説明する。図４は、（ｋ，ｔ）−匿名性について説明するための説明図である。なお、図４に示す例は、図３と同様に、患者の病歴をデータとして管理するためのテーブルｄ１０を、「年齢」及び「性別」の属性ｄ１１３ａ及びｄ１１３ｂそれぞれの属性値を一般化することで、匿名性を高めたテーブルｄ３０を生成した場合を示している。 [2.1. (K, t) -anonymity]
In this section, (k, t) -anonymity, which is a condition for anonymity, will be described with reference to FIG. FIG. 4 is an explanatory diagram for explaining (k, t) -anonymity. In the example shown in FIG. 4, as in FIG. 3, the table d10 for managing the patient's medical history as data is generalized with the attribute values of the “age” and “sex” attributes d113a and d113b. The case where the table d30 which raised anonymity is produced | generated is shown.

図４において、参照符号ｄ３５１〜ｄ３５５は、「年齢」及び「性別」の属性ｄ１１３ａ及びｄ１１３ｂの属性値を一般化した場合に、各属性間の属性値の組み合わせが等しい集合を示している。なお、以降の説明では、この各属性間の属性値の組み合わせが等しい集合を「等価クラス」と記載する場合がある。例えば、等価クラスｄ３５１は、「年齢」を示す属性ｄ３１３ａの属性値が、「２０代」を示す「［２０，２９］」であり、「性別」を示す属性ｄ３１３ｂの属性値が、性別を示す情報が登録されていることを示す「登録」である集合を示している。同様に、等価クラスｄ３５３は、「年齢」を示す属性ｄ３１３ａの属性値が、「３０代」を示す「［３０，３９］」であり、「性別」を示す属性ｄ３１３ｂの属性値が、性別を示す情報が登録されていないことを示す「未登録」である集合を示している。 In FIG. 4, reference symbols d351 to d355 indicate sets in which the attribute value combinations between the attributes are equal when the attribute values of the “age” and “sex” attributes d113a and d113b are generalized. In the following description, a set having the same combination of attribute values between attributes may be referred to as an “equivalent class”. For example, in the equivalence class d351, the attribute value of the attribute d313a indicating “age” is “[20, 29]” indicating “20's”, and the attribute value of the attribute d313b indicating “sex” indicates gender. A set of “registration” indicating that information is registered is shown. Similarly, in the equivalent class d353, the attribute value of the attribute d313a indicating “age” is “[30, 39]” indicating “30's”, and the attribute value of the attribute d313b indicating “sex” is gender. A set of “unregistered” indicating that the indicated information is not registered is shown.

ここで、テーブル中の任意の等価クラスを抽出した場合に、当該等価クラスのレコード数がいずれの場合においてもｋ以上である場合に、当該テーブルが「ｋ−匿名性（k-Anonymity）」を満たすという。即ち、ｋ−匿名性を満たすテーブルは、当該テーブル中のいずれのレコードに対して再識別が試みられたとしても、再識別の対象の候補をｋ通り未満に絞り込むことが困難である。 Here, when an arbitrary equivalent class is extracted from the table, if the number of records of the equivalent class is k or more in any case, the table indicates “k-anonymity”. Satisfy. That is, for a table that satisfies k-anonymity, it is difficult to narrow down candidates for re-identification to less than k, no matter which record in the table is re-identified.

また、テーブルがｋ−匿名性を満たさない場合においても、等価クラスのレコード数がｋ未満であるようなレコードを全て削除することで、ｋ−匿名性を満たすことが可能な場合がある。このように、レコード数がｔ以下の範囲でレコードを削除することで、ｋ−匿名性を満たすことが可能な場合を、「（ｋ，ｔ）−匿名性（（k,t）-Anonymity）」を満たすという。 Even when the table does not satisfy k-anonymity, it may be possible to satisfy k-anonymity by deleting all records whose number of records in the equivalent class is less than k. In this way, by deleting records in a range where the number of records is equal to or less than t, the case where k-anonymity can be satisfied is expressed as “(k, t) -anonymity ((k, t) -Anonymity)”. "

具体的な一例として、図４に示す例において、テーブルｄ１０及びｄ３０が、ｋ＝２、ｔ＝２とした（２，２）−匿名性を満たすか否かについて以下に説明する。 As a specific example, whether or not the tables d10 and d30 satisfy (2, 2) -anonymity with k = 2 and t = 2 in the example illustrated in FIG. 4 will be described below.

例えば、テーブルｄ１０は、「年齢」を示す属性ｄ１１３ａの属性値がそれぞれ異なる。そのため、例えば、「年齢」が「２９」、「性別」が「女」のレコードに対して再識別を試みた場合には、「ＩＤ」が「２」のレコードが一意に特定される。このことは、他のレコードについても同様である。即ち、テーブルｄ１０は、２−匿名性を満たしてないことがわかる。 For example, in the table d10, the attribute value of the attribute d113a indicating “age” is different. Therefore, for example, when re-identification is attempted for a record whose “age” is “29” and “sex” is “female”, the record whose “ID” is “2” is uniquely identified. The same applies to other records. That is, it can be seen that the table d10 does not satisfy 2-anonymity.

また、テーブルｄ１０において、各レコードは、「年齢」及び「性別」の属性ｄ１１３ａ及びｄ１１３ｂに着目した場合に、各属性間の属性値の組み合わせが等しい他のレコードが存在しない。即ち、テーブルｄ１０は、レコード数が２以下の範囲でレコードの削除を行ったとしても、２−匿名性を満たさない。即ち、テーブルｄ１０は、（２，２）−匿名性を満たさないこととなる。 Further, in the table d10, when focusing on the attributes d113a and d113b of the “age” and “gender”, there is no other record having the same combination of attribute values between the attributes. That is, the table d10 does not satisfy 2-anonymity even if records are deleted in a range where the number of records is 2 or less. That is, the table d10 does not satisfy (2, 2) -anonymity.

これに対して、テーブルｄ３０は、（２，２）−匿名性を満たしている。具体的には、テーブルｄ３０は、レコード数が２未満の等価クラスｄ３５４及びｄ３５３を含むため、２−匿名性は満たしていない。しかしながら、レコード数が２未満の等価クラスｄ３５４及びｄ３５３に対応するレコードを削除することで、当該レコード削除後のテーブルｄ３０は、２−匿名性を満たす。このとき、等価クラスｄ３５４及びｄ３５３に対応するレコードのレコード数は２（即ち、２以下）であるため、テーブルｄ３０は、（２，２）−匿名性を満たすこととなる。 On the other hand, the table d30 satisfies (2, 2) -anonymity. Specifically, since the table d30 includes equivalence classes d354 and d353 having the number of records of less than 2, 2-anonymity is not satisfied. However, by deleting records corresponding to the equivalent classes d354 and d353 having the number of records less than 2, the table d30 after the record deletion satisfies 2-anonymity. At this time, since the number of records corresponding to the equivalent classes d354 and d353 is 2 (that is, 2 or less), the table d30 satisfies (2, 2) -anonymity.

以上、図４を参照して、匿名性の条件である（ｋ，ｔ）−匿名性について説明した。 The anonymity condition (k, t) -anonymity has been described above with reference to FIG.

［２．２．属性の一般化階層］
次に、図５及び図６を参照して、属性の一般化階層について説明する。図５及び図６は、属性の一般化階層について説明するための説明図である。 [2.2. Generalized hierarchy of attributes]
Next, a generalized hierarchy of attributes will be described with reference to FIGS. 5 and 6 are explanatory diagrams for explaining a generalized hierarchy of attributes.

ここで、「属性の一般化階層」とは、一の属性について属性値を段階的に一般化した場合に、一般化前の属性値と一般化後の属性値との間の一般化の対応関係を階層構造（所謂、木構造）で示したものである。 Here, “attribute generalization hierarchy” means the generalization correspondence between the attribute value before generalization and the attribute value after generalization when attribute values are generalized step by step for one attribute The relationship is shown in a hierarchical structure (so-called tree structure).

例えば、図５は、図２に示したテーブルｄ１０のうち、「年齢」を示す属性ｄ１１３ａの属性値を一般化（抽象化）するための属性の一般化階層の一例を示している。なお、以降では、図５に示す、属性ｄ１１３ａを一般化（抽象化）するための属性の一般化階層、即ち、属性ｄ１１３ａの一般化階層を、「属性の一般化階層ｄ２１」と記載する場合がある。 For example, FIG. 5 illustrates an example of an attribute generalization hierarchy for generalizing (abstracting) the attribute value of the attribute d113a indicating “age” in the table d10 illustrated in FIG. In the following, the attribute generalization hierarchy for generalizing (abstracting) the attribute d113a shown in FIG. 5, that is, the generalization hierarchy of the attribute d113a will be described as “attribute generalization hierarchy d21”. There is.

図５に示す例では、「年齢」を示す属性ｄ１１３ａの属性値ｄ１３を２段階に分けて一般化した場合について示している。図５において、参照符号ｄ２２は、「年齢」を示す属性の属性値ｄ１３を段階的に一般化した場合の、属性の一般化階層ｄ２１における各階層の高さを示している。なお、以降では、各階層の高さを「一般化レベル」と記載する場合がある。 In the example shown in FIG. 5, the attribute value d13 of the attribute d113a indicating “age” is generalized in two stages. In FIG. 5, reference symbol d22 indicates the height of each layer in the attribute generalization layer d21 when the attribute value d13 of the attribute indicating “age” is generalized stepwise. Hereinafter, the height of each layer may be described as a “generalization level”.

図５において、参照符号ｄ２１０は、「年齢」を示す属性が取り得る一連の属性値、即ち、定義域｛０，１，２，・・・，１２５｝で示された属性値に対応する階層を示している。なお、図５に示す属性の一般化階層ｄ２１では、「年齢」を示す属性が取り得る一連の属性値、即ち、一般化が施されていない属性値の階層ｄ２１０の一般化レベルを「０」に設定している。 In FIG. 5, reference symbol d <b> 210 indicates a series of attribute values that can be taken by the attribute indicating “age”, that is, a hierarchy corresponding to the attribute value indicated by the domain {0, 1, 2,. Is shown. Note that in the attribute generalization layer d21 shown in FIG. 5, a series of attribute values that can be taken by the attribute indicating “age”, that is, the generalization level of the attribute value layer d210 that has not been generalized is set to “0”. Is set.

また、参照符号ｄ２１１は、階層ｄ２１０で示された「年齢」を示す属性の一連の属性値を一般化することで、「世代」を示す属性値に分類した場合の、当該「世代」を示す属性値に対応する階層を示している。なお、図５に示す例では、「世代」を示す属性値の定義域は、｛１０歳未満，１０代，２０代，・・・，９０代，１００歳以上｝で示される。 The reference sign d211 indicates the “generation” when the attribute value indicating the “age” indicated in the hierarchy d210 is generalized to be classified into the attribute value indicating the “generation”. The hierarchy corresponding to the attribute value is shown. In the example shown in FIG. 5, the definition area of the attribute value indicating “generation” is indicated by {under 10 years old, 10s, 20s,..., 90s, over 100 years}.

また、図５に示す例では、階層ｄ２１０における属性値「０」〜「９」は、階層ｄ２１１における「１０歳未満」を示す属性値「［０，９］」に対応付けられている。即ち、属性の一般化階層ｄ２１は、階層ｄ２１０における属性値「０」〜「９」を、階層ｄ２１１における属性値「［０，９］」に一般化することを示している。同様にして、階層ｄ２１０における属性値「１０」〜「１９」は、階層ｄ２１１における「１０代」を示す属性値「［１０，１９］」に一般化されることとなる。なお、図５に示す属性の一般化階層ｄ２１では、階層ｄ２１１の一般化レベルを「１」に設定している。 In the example illustrated in FIG. 5, the attribute values “0” to “9” in the hierarchy d210 are associated with the attribute value “[0, 9]” indicating “under 10 years” in the hierarchy d211. That is, the attribute generalization hierarchy d21 indicates that the attribute values “0” to “9” in the hierarchy d210 are generalized to the attribute values “[0, 9]” in the hierarchy d211. Similarly, the attribute values “10” to “19” in the hierarchy d210 are generalized to the attribute value “[10, 19]” indicating “10th generation” in the hierarchy d211. Note that, in the attribute generalization layer d21 shown in FIG. 5, the generalization level of the layer d211 is set to “1”.

また、参照符号ｄ２１２は、階層ｄ２１１で示された「世代」を示す属性値を、さらに一般化した属性値に分類した場合の、当該属性値に対応する階層を示している。なお、図５に示した属性の一般化階層ｄ２１では、階層ｄ２１２は、各年齢を識別しない場合の属性値を示しており、定義域は｛［０，１２５］｝となる。即ち、階層ｄ２１１の属性値を階層ｄ２１２の属性値に一般化すると、階層ｄ２１１の属性値はいずれも、階層ｄ２１２の属性値「［０，１２５］」に一般化されることとなる。なお、図５に示す属性の一般化階層ｄ２１では、階層ｄ２１２の一般化レベルを「２」に設定している。 Reference numeral d212 indicates a hierarchy corresponding to the attribute value when the attribute value indicating "generation" indicated by the hierarchy d211 is further classified into a generalized attribute value. In the attribute generalization hierarchy d21 shown in FIG. 5, the hierarchy d212 indicates attribute values when each age is not identified, and the definition area is {[0, 125]}. That is, when the attribute value of the hierarchy d211 is generalized to the attribute value of the hierarchy d212, any attribute value of the hierarchy d211 is generalized to the attribute value “[0, 125]” of the hierarchy d212. In the attribute generalization hierarchy d21 shown in FIG. 5, the generalization level of the hierarchy d212 is set to “2”.

このように、属性の一般化階層ｄ２１は、より一般化されるほど階層の高さが高くなるように（即ち、一般化レベルが高くなるように）構成される。 In this way, the attribute generalization hierarchy d21 is configured such that the higher the generalization hierarchy, the higher the hierarchy (that is, the higher the generalization level).

同様に、図６は、図２に示したテーブルｄ１０のうち、「性別」を示す属性ｄ１１３ｂの属性値を一般化（抽象化）するための属性の一般化階層の一例を示している。なお、以降では、図５に示す、属性ｄ１１３ｂを一般化（抽象化）するための属性の一般化階層、即ち、属性ｄ１１３ｂの一般化階層を、「属性の一般化階層ｄ２３」と記載する場合がある。 Similarly, FIG. 6 shows an example of an attribute generalization hierarchy for generalizing (abstracting) the attribute value of the attribute d113b indicating “sex” in the table d10 shown in FIG. In the following, the attribute generalization hierarchy for generalizing (abstracting) the attribute d113b shown in FIG. 5, that is, the generalization hierarchy of the attribute d113b is described as “attribute generalization hierarchy d23”. There is.

図６に示す例では、「性別」を示す属性ｄ１１３ｂの属性値ｄ１３を２段階に分けて一般化した場合について示している。図６において、参照符号ｄ２４は、「性別」を示す属性の属性値ｄ１３を段階的に一般化した場合の、属性の一般化階層ｄ２３における各階層の高さ（即ち、一般化レベル）を示している。 In the example shown in FIG. 6, the attribute value d13 of the attribute d113b indicating “sex” is generalized in two stages. In FIG. 6, reference numeral d24 indicates the height (that is, the generalization level) of each hierarchy in the attribute generalization hierarchy d23 when the attribute value d13 of the attribute indicating “gender” is generalized stepwise. ing.

図６において、参照符号ｄ２３０は、「性別」を示す属性が取り得る一連の属性値、即ち、定義域｛男，女，未登録｝で示された属性値に対応する階層を示している。なお、図６に示す属性の一般化階層ｄ２３では、「性別」を示す属性が取り得る一連の属性値、即ち、一般化が施されていない属性値の階層ｄ２３０の一般化レベルを「０」に設定している。 In FIG. 6, a reference symbol d <b> 230 indicates a hierarchy corresponding to a series of attribute values that can be taken by the attribute indicating “gender”, that is, the attribute value indicated in the domain {male, female, unregistered}. Note that in the attribute generalization layer d23 shown in FIG. 6, a series of attribute values that can be taken by the attribute indicating “gender”, that is, the generalization level of the attribute value layer d230 that has not been generalized is set to “0”. Is set.

また、参照符号ｄ２３１は、階層ｄ２３０で示された「性別」を示す属性の一連の属性値を一般化することで、「性別を示す情報の登録状況」を示す属性値に分類した場合の、当該属性値に対応する階層を示している。なお、図５に示す例では、「性別を示す情報の登録状況」を示す属性値の定義域は、｛登録，未登録｝で示される。 Further, the reference sign d231 is obtained by generalizing a series of attribute values indicating the “sex” indicated in the hierarchy d230 to be classified into attribute values indicating the “registration status of information indicating sex”. A hierarchy corresponding to the attribute value is shown. In the example shown in FIG. 5, the definition area of the attribute value indicating “registration status of information indicating gender” is indicated by {registered, unregistered}.

また、図６に示す例では、階層ｄ２１０における属性値「男」及び「女」は、階層ｄ２３１において「性別を示す情報が登録されている」ことを示す属性値「登録」に対応付けられている。即ち、属性の一般化階層ｄ２３は、階層ｄ２３０における属性値「男」及び「女」を、階層ｄ２３１における属性値「登録」に一般化することを示している。同様にして、階層ｄ２３０における属性値「未登録」は、階層ｄ２３１において「性別を示す情報が登録されていない」ことを示す属性値「未登録」に対応付けられている。なお、図６に示す属性の一般化階層ｄ２３では、階層ｄ２３１の一般化レベルを「１」に設定している。 In the example illustrated in FIG. 6, the attribute values “male” and “female” in the hierarchy d210 are associated with the attribute value “registration” indicating that “information indicating sex” is registered in the hierarchy d231. Yes. That is, the attribute generalization hierarchy d23 indicates that the attribute values “male” and “female” in the hierarchy d230 are generalized to the attribute value “registration” in the hierarchy d231. Similarly, the attribute value “unregistered” in the hierarchy d230 is associated with the attribute value “unregistered” indicating that “information indicating gender is not registered” in the hierarchy d231. Note that in the attribute generalization layer d23 shown in FIG. 6, the generalization level of the layer d231 is set to “1”.

また、参照符号ｄ２３２は、階層ｄ２３１で示された「性別を示す情報の登録状況」を示す属性値を、さらに一般化した属性値に分類した場合の、当該属性値に対応する階層を示している。なお、図６に示した属性の一般化階層ｄ２３では、階層ｄ２３２は、性別を識別しない場合の属性値を示しており、定義域を｛不明｝とすることで、各患者の性別が不明であるものと取扱った場合を示している。即ち、階層ｄ２３１の属性値を階層ｄ２３２の属性値に一般化すると、階層ｄ２３１の属性値はいずれも、階層ｄ２３２の属性値「不明」に一般化されることとなる。なお、図６に示す属性の一般化階層ｄ２３では、階層ｄ２３２の一般化レベルを「２」に設定している。 Reference numeral d232 indicates a hierarchy corresponding to the attribute value when the attribute value indicating the “registration status of information indicating gender” indicated in the hierarchy d231 is further classified into a generalized attribute value. Yes. In addition, in the attribute generalization hierarchy d23 shown in FIG. 6, the hierarchy d232 shows the attribute value when the gender is not identified, and the gender of each patient is unknown by setting the definition area to {unknown}. It shows the case where it is handled as a certain thing. That is, when the attribute value of the hierarchy d231 is generalized to the attribute value of the hierarchy d232, any attribute value of the hierarchy d231 is generalized to the attribute value “unknown” of the hierarchy d232. Note that in the attribute generalization hierarchy d23 shown in FIG. 6, the generalization level of the hierarchy d232 is set to “2”.

なお、属性の一般化階層ｄ２３は、図５を参照して説明した属性の一般化階層ｄ２１と同様に、より一般化されるほど階層の高さが高くなるように（即ち、一般化レベルが高くなるように）構成される。 Note that the attribute generalization hierarchy d23 is set so that the higher the generalization level is, the higher the hierarchy becomes higher (that is, the generalization level is the same as the attribute generalization hierarchy d21 described with reference to FIG. 5). Configured to be higher).

以上、図５及び図６を参照して、属性の一般化階層について説明した。 The attribute generalization hierarchy has been described above with reference to FIGS. 5 and 6.

［２．３．一般化戦略］
次に、テーブルｄ１０を一般化するための一般化戦略の一例について説明する。図５及び図６を参照して説明したように、複数の属性それぞれについて一般化階層が与えられている場合には、各属性に対してどの高さまで一般化を行うかの違い（即ち、各属性の一般化階層の高さの組み合わせ）により、一般化の方法が複数存在することとなる。 [2.3. Generalization strategy]
Next, an example of a generalization strategy for generalizing the table d10 will be described. As described with reference to FIG. 5 and FIG. 6, when a generalization hierarchy is given for each of a plurality of attributes, a difference in which generalization is performed for each attribute (that is, for each attribute) There are a plurality of generalization methods depending on the combination of the height of the generalization hierarchy of attributes.

例えば、図５及び図６に示す例では、「年齢」を示す属性ｄ１１３ａの一般化階層の高さは０（年齢）、１（世代）、２（区別なし）であり、「性別」を示す属性ｄ１１３ｂの一般化階層の高さは０（性別）、１（登録状況）、２（不明（区別なし））である。ここで、「年齢」を示す属性ｄ１１３ａの一般化階層の高さをｖ_１、「性別」を示す属性ｄ１１３ｂの一般化階層の高さをｖ_２とした場合に、［ｖ_１，ｖ_２］の組み合わせは、［０，０］、［０，１］、・・・、［２，１］、［２，２］となる。このように、一般化の対象となる各属性の一般化階層の高さの組み合わせを、本説明では「一般化戦略（Generalization Strategy）」と記載する場合がある。 For example, in the example shown in FIGS. 5 and 6, the height of the generalized hierarchy of the attribute d113a indicating “age” is 0 (age), 1 (generation), 2 (no distinction), and indicates “sex”. The height of the generalized hierarchy of the attribute d113b is 0 (gender), 1 (registration status), 2 (unknown (no distinction)). Here, the height of the generalization hierarchy of attributes d113a indicating "age" _{v 1,} the height of the generalization hierarchy of attributes d113b indicating "sex" in the case of the _{_{_{v 2, [v 1, v}}} 2] The combinations are [0, 0], [0, 1],..., [2, 1], [2, 2]. As described above, the combination of the heights of the generalization hierarchies of the attributes to be generalized may be described as “Generalization Strategy” in this description.

例えば、一般化戦略［１，２］を適用した場合には、「年齢」を示す属性ｄ１１３ａの属性値は、高さが「１」の「世代」を示す属性値に一般化され、「性別」を示す属性ｄ１１３ｂの属性値は、高さが「２」の「不明（区別なし）」を示す属性値に一般化される。このように、各一般化戦略は、テーブルｄ１０における「年齢」及び「性別」を示す属性ｄ１１３ａ及びｄ１１３ｂそれぞれの属性値を一般化するための指標を示している。 For example, when the generalization strategy [1, 2] is applied, the attribute value of the attribute d113a indicating “age” is generalized to the attribute value indicating “generation” having a height of “1”. The attribute value of the attribute d113b indicating “” is generalized to an attribute value indicating “unknown (no distinction)” having a height of “2”. As described above, each generalization strategy indicates an index for generalizing the attribute values of the attributes d113a and d113b indicating “age” and “sex” in the table d10.

ここで、ある一般化戦略ａと、当該一般化戦略ａの各要素のうちいずれか一つの要素（即ち、いずれか一つの属性の一般化階層の高さ）を１だけ増加させた（即ち、一階層分だけ一般化した）一般化戦略ｂの大小関係を、ａ＜ｂとする。具体的な一例として、一般化戦略［１，１］と［１，２］とは、［１，１］＜［１，２］の関係にある。同様に、一般化戦略［１，２］と［２，２］とは、［１，２］＜［２，２］の関係にある。一方で、一般化戦略［１，２］と［２，１］との間には、大小関係は定義されない。このような大小関係を順序関係として規定し、一連の一般化戦略［ｖ_１，ｖ_２］＝［０，０］、［０，１］、・・・、［２，１］、［２，２］を元とすることで、一般化戦略［ｖ_１，ｖ_２］は束（Lattice）をなす。この束を、以降では「一般化戦略からなる束」と呼ぶ場合がある。 Here, a generalization strategy a and any one of the elements of the generalization strategy a (that is, the height of the generalization hierarchy of any one attribute) are increased by 1 (that is, The magnitude relationship of the generalization strategy b (generalized for one layer) is assumed to be a <b. As a specific example, the generalization strategies [1,1] and [1,2] have a relationship of [1,1] <[1,2]. Similarly, the generalization strategies [1,2] and [2,2] have a relationship of [1,2] <[2,2]. On the other hand, no magnitude relationship is defined between the generalization strategies [1,2] and [2,1]. Such a magnitude relationship is defined as an order relationship, and a series of generalization strategies [v ₁ , v ₂ ] = [0, 0], [0, 1],..., [2, 1], [2, 2], the generalization strategy [v ₁ , v ₂ ] forms a lattice. Hereinafter, this bundle may be referred to as a “bundle composed of generalization strategies”.

例えば、図７は、一般化戦略からなる束について説明するための説明図である。図７は、図５及び図６に示した、「年齢」及び「性別」を示す属性ｄ１１３ａ及びｄ１１３ｂそれぞれの一般化階層に基づき規定される一般化戦略［ｖ_１，ｖ_２］を元とした、一般化戦略からなる束の一例を示している。 For example, FIG. 7 is an explanatory diagram for explaining a bundle of generalization strategies. FIG. 7 is based on the generalization strategy [v ₁ , v ₂ ] defined based on the generalization hierarchy of the attributes d113a and d113b indicating “age” and “sex” shown in FIG. 5 and FIG. Shows an example of a bundle of generalization strategies.

また、本説明では、半順序関係が定義されている束＜Ｌ，≦＞に対して、以下に（条件１）及び（条件２）で示す性質を満たす関数ｑを、束＜Ｌ，≦＞のクエリ関数と呼ぶ。
（条件１）ｑ：Ｌ→{０，１}；束の台集合を定義域とするブール値への写像
（条件２）任意のａ，ｂ∈Ｌに対して、ａ≦（≧）ｂならばｑ（ａ）≦（≧）ｑ（ｂ） Further, in this description, for a bundle <L, ≦> in which a partial order relation is defined, a function q that satisfies the properties shown in (Condition 1) and (Condition 2) below is expressed as a bundle <L, ≦>. Called the query function.
(Condition 1) q: L → {0,1}; mapping to a Boolean value with a set of bundles as a domain (Condition 2) For any a, bεL, if a ≦ (≧) b Q (a) ≦ (≧) q (b)

また、半順序が定義された束＜Ｌ，≦＞と、上述したクエリ関数ｑとが与えられている場合に、以下に示す（条件３）及び（条件４）を満たす元（即ち、一般化戦略）ａ＝［ｖ_１，ｖ_２］を、極小元と呼ぶ。
（条件３）ｑ（ａ）＝１
（条件４）任意のａ＞ｂ∈Ｌに対してｑ（ｂ）＝０ Further, when a bundle <L, ≦> in which a partial order is defined and the above-described query function q are given, elements satisfying (Condition 3) and (Condition 4) shown below (ie, generalization) Strategy) a = [v ₁ , v ₂ ] is called a minimal element.
(Condition 3) q (a) = 1
(Condition 4) q (b) = 0 for any a> bεL

例えば、図７に示す例では、一般化戦略［２，２］、［２，１］、［１，２］、［１，１］、［２，０］が、クエリ関数ｑに基づき、ｑ（（ｖ_１，ｖ_２））＝１を満たす。また、一般化戦略［０，２］、［０，１］、［１，０］、［０，０］が、クエリ関数ｑに基づき、ｑ（（ｖ_１，ｖ_２））＝０を満たす。このとき、参照符号ｄ５０は、ｑ（（ｖ_１，ｖ_２））＝１を満たす元（即ち、一般化戦略）と、ｑ（（ｖ_１，ｖ_２））＝０を満たす元の境界を示しており、元［１，１］及び［２，０］が極小元となる。 For example, in the example shown in FIG. 7, the generalization strategies [2, 2], [2, 1], [1, 2], [1, 1], [2, 0] are based on the query function q, ((V ₁ , v ₂ )) = 1 is satisfied. Further, the generalization strategies [0, 2], [0, 1], [1, 0], [0, 0] satisfy q ((v ₁ , v ₂ )) = 0 based on the query function q. . At this time, the reference sign d50 has a boundary between an element satisfying q ((v ₁ , v ₂ )) = 1 (that is, a generalized strategy) and an element satisfying q ((v ₁ , v ₂ )) = 0. The elements [1, 1] and [2, 0] are the minimal elements.

以上、図７を参照して、テーブルｄ１０を一般化するための一般化戦略の一例と、当該一般化戦略からなる束の一例とについて説明した。 The example of the generalization strategy for generalizing the table d10 and the example of the bundle composed of the generalization strategy have been described above with reference to FIG.

［２．４．ハイパーグラフと横断］
次に、図８及び図９を参照して、ハイパーグラフ及び横断について説明する。図８及び図９は、ハイパーグラフ及び横断について説明するための説明図である。 [2.4. Hypergraph and crossing]
Next, the hypergraph and the crossing will be described with reference to FIGS. 8 and 9 are explanatory diagrams for explaining the hypergraph and the crossing.

ハイパーグラフについて説明するにあたり、まず、図８を参照して、頂点の集合Ｖと、集合Ｖ中の２つ頂点を結ぶエッジＥからなるグラフ（Ｖ，Ｅ）について説明する。例えば、図８に示す例は、Ｖ＝｛１，２，３，４，５｝、Ｅ＝｛｛１，４｝，｛１，５｝，｛２，３｝，｛２，４｝，｛４，５｝｝となるグラフ（Ｖ，Ｅ）を示している。ここで、例えば、エッジＥ＝｛１，４｝は、頂点「１」と頂点「４」とを結ぶエッジを示している。同様に、エッジＥ＝｛２，３｝は、頂点「２」と頂点「３」とを結ぶエッジを示している。 In describing the hypergraph, first, a graph (V, E) including a vertex set V and an edge E connecting two vertices in the set V will be described with reference to FIG. For example, in the example shown in FIG. 8, V = {1, 2, 3, 4, 5}, E = {{1, 4}, {1, 5}, {2, 3}, {2, 4}, The graph (V, E) which becomes {4, 5}} is shown. Here, for example, the edge E = {1, 4} indicates an edge connecting the vertex “1” and the vertex “4”. Similarly, an edge E = {2, 3} indicates an edge connecting the vertex “2” and the vertex “3”.

これに対して、図８に示すグラフ（Ｖ，Ｅ）のエッジＥを、２つ以上の頂点を結ぶハイパーエッジＨに一般化したグラフを、ハイパーグラフ（Ｖ，Ｈ）と呼ぶ。図９は、ハイパーグラフ（Ｖ，Ｈ）の一例を示している。例えば、図９に示す例は、Ｖ＝｛１，２，３，４，５｝、Ｈ＝｛｛１，２，４｝，｛１，５｝，｛２，３，５｝，｛１，４，５｝｝となるハイパーグラフ（Ｖ，Ｈ）を示している。ここで、例えば、ハイパーエッジＨ＝｛１，２，４｝は、頂点「１」、頂点「２」、及び頂点「４」を結ぶハイパーエッジを示している。同様に、ハイパーエッジＨ＝｛１，５｝は、頂点「１」及び頂点「５」を結ぶハイパーエッジを示している。 On the other hand, a graph obtained by generalizing the edge E of the graph (V, E) shown in FIG. 8 into a hyper edge H connecting two or more vertices is called a hyper graph (V, H). FIG. 9 shows an example of the hypergraph (V, H). For example, in the example shown in FIG. 9, V = {1, 2, 3, 4, 5}, H = {{1, 2, 4}, {1, 5}, {2, 3, 5}, {1 , 4, 5}}, a hypergraph (V, H) is shown. Here, for example, the hyper edge H = {1, 2, 4} indicates a hyper edge connecting the vertex “1”, the vertex “2”, and the vertex “4”. Similarly, the hyper edge H = {1, 5} indicates a hyper edge connecting the vertex “1” and the vertex “5”.

次に、図９を参照して、横断について説明する。ハイパーグラフ（Ｖ，Ｈ）の任意のハイパーエッジと少なくとも一部を共有するＶの部分集合を「横断」と呼ぶ。また、横断のうち、いずれの要素を取り除いても横断とならなくなるものを「極小横断」と呼ぶ。 Next, crossing will be described with reference to FIG. A subset of V that shares at least part with any hyperedge of the hypergraph (V, H) is called “crossing”. In addition, a crossing that does not become a crossing if any element is removed is called a “minimal crossing”.

例えば、図９に示す例では、Ｖの部分集合｛１，２，４｝は、ハイパーエッジＨ＝｛｛１，２，４｝，｛１，５｝，｛２，３，５｝，｛１，４，５｝｝のいずれとも一部を共有するため、横断に相当する。また、部分集合｛１，２，４｝から、要素「４」を取り除いた部分集合｛１，２｝は、ハイパーエッジＨ＝｛｛１，２，４｝，｛１，５｝，｛２，３，５｝，｛１，４，５｝｝のいずれとも一部を共有する。このことから、部分集合｛１，２，４｝は、極小横断ではないことがわかる。 For example, in the example shown in FIG. 9, the subset {1, 2, 4} of V has hyperedges H = {{1, 2, 4}, {1, 5}, {2, 3, 5}, { 1, 4, 5}} shares a part and corresponds to crossing. Further, the subset {1, 2} obtained by removing the element “4” from the subset {1, 2, 4} is a hyper edge H = {{1, 2, 4}, {1, 5}, {2 , 3, 5} and {1, 4, 5}} share a part. This shows that the subset {1, 2, 4} is not a minimal crossing.

また、Ｖの部分集合｛１，２｝は、前述通り横断に相当するが、要素「１」及び「２」のいずれかを取り除くと、ハイパーエッジＨ＝｛｛１，２，４｝，｛１，５｝，｛２，３，５｝，｛１，４，５｝｝の少なくともいずれかと一部を共有しない状態となる。このことから、部分集合｛１，２｝は、極小横断に相当する。同様に、Ｖの部分集合｛１，５｝も、極小横断となる。 Further, the subset {1, 2} of V corresponds to the crossing as described above. However, if any one of the elements “1” and “2” is removed, the hyper edge H = {{1, 2, 4}, { 1, 5}, {2, 3, 5}, and {1, 4, 5}} are not shared in part. From this, the subset {1, 2} corresponds to the minimum crossing. Similarly, the subset {1, 5} of V is also a minimal crossing.

以上、図８及び図９を参照して、ハイパーグラフ及び横断について説明した。 The hypergraph and the crossing have been described above with reference to FIGS. 8 and 9.

＜３．機能構成＞
次に、図１０を参照して、本実施形態に係る情報処理装置１０の機能構成の一例について説明する。図１０は、本実施形態に係る情報処理装置１０の機能構成の一例について説明するための説明図である。 <3. Functional configuration>
Next, an example of a functional configuration of the information processing apparatus 10 according to the present embodiment will be described with reference to FIG. FIG. 10 is an explanatory diagram for describing an example of a functional configuration of the information processing apparatus 10 according to the present embodiment.

図１０に示すように、情報処理装置１０は、ＤＢ情報取得部１１と、ＤＢ情報出力部１２と、階層情報取得部１３と、解析部１４と、解析結果出力部１５と、一般化条件取得部１６と、ＤＢ情報変換部１７と、変換結果出力部１８とを含む。 As illustrated in FIG. 10, the information processing apparatus 10 includes a DB information acquisition unit 11, a DB information output unit 12, a hierarchy information acquisition unit 13, an analysis unit 14, an analysis result output unit 15, and a generalized condition acquisition. Unit 16, DB information conversion unit 17, and conversion result output unit 18.

ＤＢ情報取得部１１は、ＤＢ５０に含まれるテーブルのうち、処理対象となるテーブルｄ１０の情報（即ち、属性の定義情報や、各レコード等の、テーブルを形成するデータ）を、当該ＤＢ５０から取得する。なお、処理対象となるテーブルｄ１０とは、一部の属性の属性値を一般化（抽象化）することで、より匿名性を高めたテーブルｄ３０を生成するための入力情報となるテーブルを示している。ＤＢ情報取得部１１は、取得したテーブルｄ１０の情報を、ＤＢ情報出力部１２、解析部１４、及びＤＢ情報変換部１７のそれぞれに出力する。なお、ＤＢ情報取得部１１は、取得したテーブルｄ１０の情報を、図示しない記憶部に保持または記憶させることで、当該テーブルの情報を、当該記憶部を介して間接的に、ＤＢ情報出力部１２、解析部１４、及びＤＢ情報変換部１７のそれぞれに出力してもよい。 The DB information acquisition unit 11 acquires, from the DB 50, information on the table d10 to be processed (that is, data defining the table, such as attribute definition information and each record) among the tables included in the DB 50. . Note that the table d10 to be processed is a table serving as input information for generating a table d30 with higher anonymity by generalizing (abstracting) attribute values of some attributes. Yes. The DB information acquisition unit 11 outputs the acquired information of the table d10 to each of the DB information output unit 12, the analysis unit 14, and the DB information conversion unit 17. The DB information acquisition unit 11 holds or stores the acquired information of the table d10 in a storage unit (not shown), so that the information of the table is indirectly stored in the DB information output unit 12 via the storage unit. The data may be output to each of the analysis unit 14 and the DB information conversion unit 17.

ＤＢ情報出力部１２は、ＤＢ情報取得部１１から処理対象となるテーブルｄ１０の情報を取得し、取得したテーブルｄ１０の情報を、管理者端末３０を介して管理者に提示する。これにより、管理者は、テーブルｄ１０のうち、属性値の一般化（抽象化）の対象となる属性や、当該属性の属性値を一般化（抽象化）するための属性の一般化階層を定義することが可能となる。 The DB information output unit 12 acquires information on the table d10 to be processed from the DB information acquisition unit 11, and presents the acquired information on the table d10 to the administrator via the administrator terminal 30. Thereby, the administrator defines the attribute to be generalized (abstracted) of the attribute value and the attribute generalization hierarchy for generalizing (abstracting) the attribute value of the attribute in the table d10. It becomes possible to do.

階層情報取得部１３は、処理対象となるテーブルｄ１０の各属性のうち、管理者端末３０を介して管理者より指定された、属性値の一般化（抽象化）の対象となる属性を示す情報と、当該属性それぞれの一般化階層を示す情報とを、当該管理者端末３０から取得する。そして、階層情報取得部１３は、取得した属性値の一般化（抽象化）の対象となる属性を示す情報と、当該属性それぞれの一般化階層を示す情報とを、解析部１４に出力する。 The hierarchy information acquisition unit 13 is information indicating attributes to be generalized (abstracted) of attribute values specified by the administrator via the administrator terminal 30 among the attributes of the table d10 to be processed. And information indicating the generalized hierarchy of each attribute are acquired from the administrator terminal 30. Then, the hierarchy information acquisition unit 13 outputs to the analysis unit 14 information indicating an attribute that is a target of generalization (abstraction) of the acquired attribute value and information indicating a generalization hierarchy of each attribute.

また、階層情報取得部１３は、管理者端末３０を介して管理者より指定された匿名性の条件を示す情報を、当該管理者端末３０から取得してもよい。なお、匿名性の条件を示す情報とは、例えば、（ｋ，ｔ）−匿名性の条件である、ｋ−匿名性を満たすための等価クラスに含まれるレコードのレコード数の閾値ｋと、削除するレコードのレコード数の最大値ｔとに相当する。そして、階層情報取得部１３は、管理者端末３０から取得した匿名性の条件を示す情報を、解析部１４に出力する。 Further, the hierarchy information acquisition unit 13 may acquire information indicating anonymity conditions specified by the administrator via the administrator terminal 30 from the administrator terminal 30. Note that the information indicating the condition of anonymity is, for example, (k, t) -anonymity condition, the threshold value k of the number of records included in the equivalence class for satisfying k-anonymity, and deletion This corresponds to the maximum value t of the number of records to be recorded. Then, the hierarchy information acquisition unit 13 outputs information indicating the condition of anonymity acquired from the administrator terminal 30 to the analysis unit 14.

なお、匿名性の条件を示す情報の取得元は、必ずしも管理者端末３０には限定されない。具体的な一例として、階層情報取得部１３は、情報処理装置１０内の記憶部（図示しない）にあらかじめ記憶された、匿名性の条件を示す情報を取得し、取得した当該情報を解析部１４に出力してもよい。 Note that the acquisition source of the information indicating the anonymity condition is not necessarily limited to the administrator terminal 30. As a specific example, the hierarchical information acquisition unit 13 acquires information indicating anonymity conditions stored in advance in a storage unit (not shown) in the information processing apparatus 10, and the acquired information is analyzed by the analysis unit 14. May be output.

解析部１４は、ＤＢ情報取得部１１から処理対象となるテーブルｄ１０の情報を取得する。また、解析部１４は、階層情報取得部１３から、処理対象となるテーブルｄ１０の各属性のうち、属性値の一般化（抽象化）の対象となる属性を示す情報と、当該属性それぞれの一般化階層を示す情報とを取得する。また、解析部１４は、階層情報取得部１３から、匿名性の条件を示す情報を取得する。 The analysis unit 14 acquires information on the table d10 to be processed from the DB information acquisition unit 11. Further, the analysis unit 14 receives information indicating an attribute that is a target of generalization (abstraction) of the attribute value from among the attributes of the table d10 to be processed from the hierarchy information acquisition unit 13, and general information of each of the attributes. Information indicating the hierarchization hierarchy. Further, the analysis unit 14 acquires information indicating anonymity conditions from the hierarchy information acquisition unit 13.

解析部１４は、取得した属性値の一般化（抽象化）の対象となる属性を示す情報と、当該属性それぞれの一般化階層を示す情報とに基づき、テーブルｄ１０を、より匿名性を高めたテーブルｄ３０に変換するための一般化戦略（即ち、各属性の一般化階層の高さの組み合わせ）を規定する。なお、これにより規定される一般化戦略は、対象となる属性それぞれの属性値がより一般化（抽象化）されるように区分するための各分類の、当該対象となる属性間における組み合わせに相当する。そして、解析部１４は、既定した各一般化戦略のうち、取得した匿名性の条件を満たし、かつ、情報量の減少を最小限に抑えることが可能な一般化戦略を抽出する。 The analysis unit 14 further increases the anonymity of the table d10 based on the information indicating the attribute to be generalized (abstracted) of the acquired attribute value and the information indicating the generalized hierarchy of each attribute. A generalization strategy (that is, a combination of the heights of the generalization hierarchies of each attribute) for conversion into the table d30 is defined. In addition, the generalization strategy specified by this is equivalent to the combination of each classification for classifying the attribute values so that the attribute values of each target attribute are more generalized (abstracted). To do. And the analysis part 14 extracts the generalization strategy which can satisfy | fill the acquired anonymity conditions and can suppress the reduction | decrease in an information amount to the minimum among each predetermined generalization strategy.

そこで、上述した解析部１４の構成及び動作を、以下に詳しく説明する。図１０に示すように、解析部１４は、解析データ生成部１４１と、極小元探索部１４３と、基準点導出部１４５とを含む。 Therefore, the configuration and operation of the analysis unit 14 described above will be described in detail below. As shown in FIG. 10, the analysis unit 14 includes an analysis data generation unit 141, a local minimum search unit 143, and a reference point derivation unit 145.

解析データ生成部１４１は、属性値の一般化（抽象化）の対象となる属性それぞれの一般化階層を示す情報に基づき、各属性の一般化階層の高さの組み合わせを抽出し、抽出した各組み合わせを一般化戦略として規定する。 The analysis data generation unit 141 extracts a combination of heights of the generalized hierarchies of each attribute based on the information indicating the generalized hierarchies of the attributes to be generalized (abstracted) of the attribute values. Specify the combination as a generalization strategy.

そして、解析データ生成部１４１は、既定した各一般化戦略を元として、各一般化戦略が要素として有する各属性の一般化階層の高さをより一般化（抽象化）する順序関係（換言すると、各属性の分類をより抽象化する順序関係）に基づき、一般化戦略からなる束を規定する。このとき、解析データ生成部１４１は、例えば、ある一般化戦略ａと、当該一般化戦略ａの各要素のうちいずれか一つの要素を１だけ増加させた（即ち、一階層分だけ一般化した）一般化戦略ｂの大小関係をａ＜ｂとする順序関係に基づき、一般化戦略からなる束を規定する。 Then, the analysis data generation unit 141 is based on each predetermined generalization strategy, and the order relation (in other words, the generalization hierarchy of each attribute that each generalization strategy has as an element is more generalized (in other words, abstracted). And a bundle of generalization strategies based on an order relation that abstracts the classification of each attribute. At this time, the analysis data generation unit 141 increases, for example, one generalized strategy a and any one element of the generalized strategy a by 1 (that is, generalized by one layer) ) Define a bundle of generalization strategies based on an order relationship in which the magnitude relation of the generalization strategy b is a <b.

具体的な一例として、解析データ生成部１４１は、図５及び図６に示した「年齢」及び「性別」のそれぞれに対応する属性の一般化階層から、図７に示した一般化戦略からなる束を規定することとなる。なお、以降では、一般化戦略からなる束における元を、特に「一般化ノード」と記載する場合がある。また、以降では、図５及び図６に示した属性の一般化階層と、図７に示した一般化戦略からなる束を対象として処理する場合を例に説明する。 As a specific example, the analysis data generation unit 141 includes the generalization strategy shown in FIG. 7 from the generalization hierarchy of attributes corresponding to “age” and “gender” shown in FIGS. 5 and 6. A bundle will be defined. In the following, the element in the bundle composed of generalization strategies may be described as “generalized node”. Further, hereinafter, a case will be described as an example in which processing is performed on a bundle composed of the attribute generalization hierarchy shown in FIGS. 5 and 6 and the generalization strategy shown in FIG.

なお、一般化戦略からなる束が「第１の束」の一例に相当する。また、一般化戦略からなる束における各元、即ち、各属性の一般化階層の高さの組み合わせに基づき規定される一般化戦略が、「第１の元」の一例に相当する。 Note that a bundle composed of generalization strategies corresponds to an example of a “first bundle”. Further, each element in a bundle composed of generalized strategies, that is, a generalized strategy defined based on a combination of heights of generalized hierarchies of each attribute corresponds to an example of “first element”.

極小元探索部１４３は、既定された一般化戦略からなる束における各一般化ノードのうち、取得した匿名性の条件（即ち、（ｋ，ｔ）−匿名性）を満たすことが可能な一般化戦略に対応する極小元を探索する。なお、（ｋ，ｔ）−匿名性を満たすことが可能な一般化戦略とは、当該一般化戦略に基づきテーブルｄ１０を一般化（抽象化）した場合に、一般化後のテーブルｄ３０が、（ｋ，ｔ）−匿名性を満たすことが可能な一般化戦略を示すものとする。 The minimal element search unit 143 is a generalization capable of satisfying the acquired anonymity condition (that is, (k, t) -anonymity) among the generalized nodes in the bundle composed of the predetermined generalization strategies. Search for the minimal element corresponding to the strategy. Note that the generalized strategy that can satisfy (k, t) -anonymity is that when the table d10 is generalized (abstracted) based on the generalized strategy, the generalized table d30 is ( k, t) —denote a generalization strategy that can satisfy anonymity.

具体的には、極小元探索部１４３は、既定された一般化戦略からなる束における一般化ノードに対応する一般化戦略が、（ｋ，ｔ）−匿名性を満たすことが可能な一般化戦略か否かを判定する処理を、クエリ関数ｑとして規定している。なお、以降では、（ｋ，ｔ）−匿名性を満たすことが可能な一般化戦略に対応する元（即ち、一般化ノード）を、単に「（ｋ，ｔ）−匿名性を満たす元」と記載する場合がある。 Specifically, the minimal element search unit 143 is a generalized strategy in which a generalized strategy corresponding to a generalized node in a bundle of predetermined generalized strategies can satisfy (k, t) -anonymity. Is determined as a query function q. In the following, an element corresponding to a generalized strategy that can satisfy (k, t) -anonymity (that is, a generalized node) is simply referred to as “element satisfying (k, t) -anonymity”. May be described.

そして、極小元探索部１４３は、既定された一般化戦略からなる束において、（ｋ，ｔ）−匿名性を満たすいずれかの元（一般化ノード）を基準点として、当該元の各要素のうち少なくともいずれかをより具体化する方向に向けて、（ｋ，ｔ）−匿名性を満たす極小元を探索する。 Then, the minimal element search unit 143 uses, as a reference point, any element (generalized node) that satisfies (k, t) -anonymity in a bundle of predetermined generalized strategies. A minimal element satisfying (k, t) -anonymity is searched for in a direction to make at least one of them more specific.

具体的には、極小元探索部１４３は、（ｋ，ｔ）−匿名性を満たすことが確認された元の各要素（即ち、一般化階層の高さ）のうち、いずれかの要素をより具体化する方向にシフトさせ、当該シフト後の元が（ｋ，ｔ）−匿名性を満たす元か否かを判定する。そして、極小元探索部１４３は、シフト後の元が（ｋ，ｔ）−匿名性を満たす場合には、いずれかの要素をさらに具体化する方向にシフトさせ、シフト後の元が（ｋ，ｔ）−匿名性を満たさない場合には、いずれかの要素を一般化（抽象化）する方向にシフトさせる。このように、極小元探索部１４３は、いずれかの要素をシフトさせ、シフト後の元が（ｋ，ｔ）−匿名性を満たす元か否かを判定する処理を逐次実行することで、（ｋ，ｔ）−匿名性を満たす極小元を探索する。 Specifically, the minimal element search unit 143 uses any element among the original elements (that is, the height of the generalized hierarchy) confirmed to satisfy (k, t) -anonymity. It shifts in the direction which materializes, and it is determined whether the element after the said shift is an element which satisfies (k, t) -anonymity. Then, when the element after the shift satisfies (k, t) -anonymity, the minimal element search unit 143 shifts one of the elements in a direction that further embodies, and the element after the shift becomes (k, t t)-If anonymity is not satisfied, shift any element in the direction of generalization (abstraction). Thus, the minimal element search unit 143 shifts any element and sequentially executes a process of determining whether the element after the shift satisfies (k, t) -anonymity, k, t) —search for a minimal element satisfying anonymity.

なお、極小元探索部１４３が、（ｋ，ｔ）−匿名性を満たす極小元を探索できれば、その探索方法は、特に限定されない。具体的な一例として、極小元探索部１４３は、所謂二分探索に基づき、（ｋ，ｔ）−匿名性を満たす極小元を探索してもよい。 The search method is not particularly limited as long as the minimum element search unit 143 can search for a minimum element satisfying (k, t) -anonymity. As a specific example, the local minimum search unit 143 may search for a local minimum satisfying (k, t) -anonymity based on a so-called binary search.

また、各属性それぞれの属性値を最も一般化（抽象化）する分類（一般化階層）の組み合わせを示す一般化戦略は、（ｋ，ｔ）−匿名性を満たし得る。そのため、当該一般化戦略に対応する一般化ノードは、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合の基準点となり得る。具体的な一例として、図５及び図６に示した「年齢」及び「性別」のそれぞれに対応する属性の一般化階層の高さを、それぞれ「２」に設定した場合の一般化戦略に対応する一般化ノード（即ち、図７において（２，２）で示された一般化ノード）は、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合の基準点となり得る。 Moreover, the generalization strategy which shows the combination of the classification (generalization hierarchy) which generalizes (abstracts) the attribute value of each attribute most can satisfy (k, t) -anonymity. Therefore, the generalized node corresponding to the generalization strategy can be a reference point when searching for a minimal element that satisfies (k, t) -anonymity. As a specific example, it corresponds to the generalization strategy when the height of the generalization hierarchy of the attribute corresponding to each of “age” and “gender” shown in FIGS. 5 and 6 is set to “2” respectively. The generalized node (that is, the generalized node indicated by (2, 2) in FIG. 7) can be a reference point when searching for a minimal element that satisfies (k, t) -anonymity.

以上のように、極小元探索部１４３は、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす極小元を探索し、探索された極小元の情報を保持する。また、極小元探索部１４３は、（ｋ，ｔ）−匿名性を満たす極小元の探索結果を基準点導出部１４５に通知する。 As described above, the minimum element search unit 143 searches for a minimum element that satisfies (k, t) -anonymity from a bundle of generalization strategies, and holds information on the searched minimum element. Further, the minimal element search unit 143 notifies the reference point deriving unit 145 of the search result of the minimal element that satisfies (k, t) -anonymity.

基準点導出部１４５は、極小元探索部１４３から、（ｋ，ｔ）−匿名性を満たす極小元の探索結果を取得する。基準点導出部１４５は、取得した当該極小元の探索結果に基づき、既定された一般化戦略からなる束において、極小元探索部１４３による探索がなされていない一般化ノードのうち、（ｋ，ｔ）−匿名性を満たす一般化ノードを導出し、導出した一般化ノードを新たな基準点の候補として、極小元探索部１４３に出力する。 The reference point deriving unit 145 acquires a search result of a minimal element satisfying (k, t) -anonymity from the minimal element searching unit 143. The reference point deriving unit 145 determines (k, t) among the generalized nodes that have not been searched by the minimal element search unit 143 in a bundle of predetermined generalization strategies based on the obtained search result of the minimal element. )-A generalized node that satisfies anonymity is derived, and the derived generalized node is output to the local minimum search unit 143 as a new reference point candidate.

この新たな基準点の候補の出力を受けて、極小元探索部１４３は、当該候補を基準点として新たに（ｋ，ｔ）−匿名性を満たす極小元を探索するとともに、当該探索結果を基準点導出部１４５に出力する。 In response to the output of this new reference point candidate, the local minimum search unit 143 searches for a local minimum satisfying (k, t) -anonymity using the candidate as a reference point, and uses the search result as a reference. The data is output to the point deriving unit 145.

なお、極小元探索部１４３が（ｋ，ｔ）−匿名性を満たす極小元を探索する処理と、当該探索結果に基づき、基準点導出部１４５が新たな基準点の候補を導出する処理との詳細については、別途後述する。 The minimum element search unit 143 searches for a minimum element satisfying (k, t) -anonymity, and the reference point deriving unit 145 derives a new reference point candidate based on the search result. Details will be described later.

以上のようにして、極小元探索部１４３は、基準点導出部１４５と相互に連携して動作し、基準点導出部１４５により新たな基準点の候補が抽出されなくなるまで、（ｋ，ｔ）−匿名性を満たす極小元を探索して保持する。そして、極小元探索部１４３は、保持された一連の極小元（即ち、探索された一連の（ｋ，ｔ）−匿名性を満たす極小元）それぞれに対応する一般化戦略を、解析結果出力部１５に出力する。 As described above, the local minimum element search unit 143 operates in cooperation with the reference point deriving unit 145 until the reference point deriving unit 145 no longer extracts a new reference point candidate (k, t). -Search for and maintain a minimal element that satisfies anonymity. Then, the minimal element search unit 143 generates a generalized strategy corresponding to each of the stored series of minimal elements (that is, the searched series of (k, t) -minimum elements satisfying anonymity), and the analysis result output unit 15 is output.

解析結果出力部１５は、極小元探索部１４３により探索された一連の極小元それぞれに対応する一般化戦略を、当該極小元探索部１４３から取得する。そして、解析結果出力部１５は、取得した一連の一般化戦略を、（ｋ，ｔ）−匿名性を満たすようにテーブルｄ１０を一般化（抽象化）するための条件（即ち、一般化戦略）の候補として、管理者端末３０を介して管理者に提示する。なお、このとき提示される条件、即ち、探索された一連の極小元それぞれに対応する一般化戦略が、あらかじめ指定された（ｋ，ｔ）−匿名性を満たすようにテーブルｄ１０を一般化し、かつ、情報量の減少を最小限に抑えることが可能な一般化戦略に相当する。 The analysis result output unit 15 acquires a generalization strategy corresponding to each of a series of minimum elements searched by the minimum element search unit 143 from the minimum element search unit 143. The analysis result output unit 15 then generalizes (abstracts) the table d10 so that the acquired series of generalization strategies satisfy (k, t) -anonymity (ie, generalization strategy). To the administrator via the administrator terminal 30 as candidates. Note that the conditions presented at this time, that is, the table d10 is generalized so that the generalization strategy corresponding to each of the searched series of minimal elements satisfies the previously specified (k, t) -anonymity, and This corresponds to a generalization strategy that can minimize the decrease in the amount of information.

一般化条件取得部１６は、解析部１４により規定された一連の一般化戦略のうち、管理者端末３０を介して管理者に指定された一般化戦略に基づき、テーブルｄ１０を一般化（抽象化）するための条件を、当該管理者端末３０から取得する。なお、テーブルｄ１０を一般化（抽象化）するための条件とは、管理者により指定された一般化戦略に基づき、各属性の属性値を一般化（抽象化）するための分類を示す情報や、（ｋ，ｔ）−匿名性の条件を示す情報に相当する。なお、以降では、テーブルｄ１０を一般化（抽象化）するための条件を、「テーブルｄ１０の一般化条件」と記載する場合がある。 The generalization condition acquisition unit 16 generalizes (abstracts) the table d10 based on the generalization strategy specified by the administrator via the administrator terminal 30 among the series of generalization strategies defined by the analysis unit 14. ) Is acquired from the administrator terminal 30. The conditions for generalizing (abstracting) the table d10 are information indicating a classification for generalizing (abstracting) attribute values of each attribute based on the generalization strategy specified by the administrator. , (K, t) —corresponds to information indicating anonymity conditions. Hereinafter, a condition for generalizing (abstracting) the table d10 may be referred to as “generalization condition for the table d10”.

そして、一般化条件取得部１６は、取得したテーブルｄ１０の一般化条件を示す情報を、ＤＢ情報変換部１７に出力する。 Then, the generalization condition acquisition unit 16 outputs information indicating the generalization condition of the acquired table d10 to the DB information conversion unit 17.

ＤＢ情報変換部１７は、ＤＢ情報取得部１１から、処理対象となるテーブルｄ１０の情報を取得する。また、ＤＢ情報変換部１７は、ＤＢ情報変換部１７から、テーブルｄ１０の一般化条件を示す情報を取得する。 The DB information conversion unit 17 acquires information on the table d10 to be processed from the DB information acquisition unit 11. Also, the DB information conversion unit 17 acquires information indicating the generalization condition of the table d10 from the DB information conversion unit 17.

ＤＢ情報変換部１７は、テーブルｄ１０の一般化条件を示す情報に基づき、テーブルｄ１０の中の対象となる属性の属性値を一般化することで、当該テーブルｄ１０の情報を、より匿名性を高めたテーブルｄ３０の情報に変換する。 The DB information conversion unit 17 improves the anonymity of the information of the table d10 by generalizing the attribute value of the target attribute in the table d10 based on the information indicating the generalization condition of the table d10. Is converted into information of the table d30.

具体的には、ＤＢ情報変換部１７は、一般化の対象となる属性の属性値を、テーブルｄ１０の一般化条件として指定された分類中の属性値に対応付けることで、当該属性の属性値を一般化する。例えば、ＤＢ情報変換部１７は、年齢を示す属性の属性値を、世代を示す属性値に対応付けることで、当該年齢を示す属性の属性値を抽象化する。 Specifically, the DB information conversion unit 17 associates the attribute value of the attribute to be generalized with the attribute value in the classification specified as the generalization condition in the table d10, thereby obtaining the attribute value of the attribute. Generalize. For example, the DB information conversion unit 17 abstracts the attribute value of the attribute indicating the age by associating the attribute value of the attribute indicating the age with the attribute value indicating the generation.

また、ＤＢ情報変換部１７は、テーブルｄ１０の一般化条件として指定された（ｋ，ｔ）−匿名性の条件に基づき、属性値が抽象化されたテーブルｄ１０から、等価クラスのレコード数がｋ未満であるようなレコードを、レコード数がｔ以下の範囲で削除する。以上のようにして、ＤＢ情報変換部１７は、テーブルｄ１０の情報を、より匿名性を高めたテーブルｄ３０の情報に変換する。 Further, the DB information conversion unit 17 determines that the number of equivalent class records is k from the table d10 in which the attribute values are abstracted based on the (k, t) -anonymity condition specified as the generalization condition of the table d10. Records that are less than the number of records are deleted within a range where the number of records is t or less. As described above, the DB information conversion unit 17 converts the information in the table d10 into information in the table d30 with higher anonymity.

そして、ＤＢ情報変換部１７は、テーブルｄ１０の情報を変換することで生成されたテーブルｄ３０の情報を、変換結果出力部１８を介して一般化ＤＢ４０に出力する。変換結果出力部１８は、ＤＢ情報変換部１７により生成されたテーブルｄ３０の情報を、一般化ＤＢ４０に出力するためのインタフェースに相当する。 Then, the DB information conversion unit 17 outputs the information of the table d30 generated by converting the information of the table d10 to the generalized DB 40 via the conversion result output unit 18. The conversion result output unit 18 corresponds to an interface for outputting the information of the table d30 generated by the DB information conversion unit 17 to the generalized DB 40.

以上、図１０を参照して、本実施形態に係る情報処理装置１０の機能構成の一例について説明した。 Heretofore, an example of the functional configuration of the information processing apparatus 10 according to the present embodiment has been described with reference to FIG.

＜４．処理＞
次に、情報処理装置１０の一連の処理の流れについて、特に、解析部１４の処理、即ち、処理対象となるテーブルｄ１０を一般化（抽象化）するための一般化戦略の候補を導出する処理に着目して、さらに詳しく説明する。なお、本説明では、解析部１４の処理として、まず、「属性値の一般化」、「（ｋ，ｔ）−匿名性の判定」、「元の変換」、及び「極小元の探索」に係る各部分処理について説明し、その後に、「一連の処理の流れ（即ち、全体的な処理の流れ）」について説明する。 <4. Processing>
Next, with regard to the flow of a series of processes of the information processing apparatus 10, in particular, the process of the analysis unit 14, that is, the process of deriving generalization strategy candidates for generalizing (abstracting) the table d10 to be processed. This will be described in more detail with a focus on. In this description, as the processing of the analysis unit 14, first, “generalization of attribute value”, “(k, t) -anonymity determination”, “original conversion”, and “minimum element search” are performed. Each of the partial processes will be described, and then “a series of process flows (that is, an overall process flow)” will be described.

［４．１．属性値の一般化］
まず、図１１を参照して、テーブルｄ１０の各レコードのうち、対象となる属性の属性値を、入力として指定された一般化戦略と、当該属性の一般化階層とに基づき一般化する処理の一例について説明する。図１１は、対象となる属性の属性値を一般化する処理の一例について説明するための説明図であり、同処理のアルゴリズムを所謂プログラムコードの形式で模式的に提示した図である。なお、図１１に示す例は、処理の流れをわかりやすくするために一部の情報を抽象化して示しており、必ずしも特定のプログラム言語のコンパイラに基づきコンパイルできる形式で示したものではない。 [4.1. Generalization of attribute values]
First, referring to FIG. 11, the process of generalizing the attribute value of the target attribute among the records in the table d10 based on the generalization strategy specified as input and the generalization hierarchy of the attribute. An example will be described. FIG. 11 is an explanatory diagram for explaining an example of the process of generalizing the attribute value of the target attribute, and is a diagram schematically showing the algorithm of the process in a so-called program code format. In the example shown in FIG. 11, some information is abstracted for easy understanding of the processing flow, and is not necessarily shown in a format that can be compiled based on a compiler of a specific programming language.

図１１に示した「getFrequentSets」関数は、入力として、変数ｖ＝（ｖ_１，ｖ_２，・・・，ｖ_ｎ）で示された一般化戦略（即ち、一般化ノード）と、変数Ｔで示された処理対象となるテーブルｄ１０と、変数（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ）で示された各属性の一般化階層を示す情報とを取得する。 The “getFrequentSets” function shown in FIG. 11 takes as input a generalization strategy (that is, a generalized node) indicated by variables v = (v ₁ , v ₂ ,..., V _n ) and a variable T. The table d10 to be processed and the information indicating the generalized hierarchy of each attribute indicated by the variables (H ₁ , H ₂ ,..., H _n ) are acquired.

なお、（ｖ_１，ｖ_２，・・・，ｖ_ｎ）のそれぞれは、互いに異なる属性に対応しており、各属性について指定された一般化階層の高さ（換言すると、当該属性の属性値を、一般化するための分類）をそれぞれ示している。同様に、（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ）は、互いに異なる属性に対応しており、各属性の一般化階層をそれぞれ示している。例えば、Ｈ_ｉ（ｖ_ｉ）は、i番目の属性の一般化階層のうち、高さがｖ_ｉに対応する階層（分類）を示している。 Note that each of (v ₁ , v ₂ ,..., V _n ) corresponds to a different attribute, and the height of the generalized hierarchy specified for each attribute (in other words, the attribute value of the attribute) , Classification for generalization). Similarly, (H ₁ , H ₂ ,..., H _n ) correspond to mutually different attributes, and indicate generalized hierarchies of the respective attributes. For example, H _i (v _i ) indicates a hierarchy (classification) whose height corresponds to v _i among the generalized hierarchies of the i-th attribute.

具体的には、図１１に示した「getFrequentSets」関数は、入力として取得したテーブルＴの各レコードｒそれぞれを、等価クラスに割り当てる。このとき、同関数は、レコードｒの各属性の属性値をｒ_ｉとした場合に、当該属性の一般化階層における高さｖ_ｉの属性（分類）に一般化した場合の値ｄ_ｉに置き換えることで、各レコードｒを割り当てるべき等価クラスを特定している。 Specifically, the “getFrequentSets” function shown in FIG. 11 assigns each record r of the table T acquired as an input to the equivalence class. At this time, when the attribute value of each attribute of the record r is r _i , the function replaces the value d _i when generalized to the attribute (classification) of the height v _i in the generalization hierarchy of the attribute. Thus, the equivalence class to which each record r is to be assigned is specified.

より具体的には、「getFrequentSets」関数は、テーブルＴ中の各レコードｒについて、任意の高さｖ_ｉについて、属性値ｒ_ｉ∈ｄ_ｉ∈Ｈ_ｉ（ｖ_ｉ）を満足する値の組み合わせｄ＝（ｄ_１，ｄ_２，・・・，ｄ_ｎ）を特定する。そして、同関数は、特定した値の組み合わせｄに応じて、各レコードｒを、当該値の組み合わせｄに対応する等価クラスに割り当てる。以上のようにして、同関数は、各レコードｒが割り当てられた等価クラスの集合Ｆ’を生成し、生成した集合Ｆ’を出力する。 More specifically, the “getFrequentSets” function is a combination of values satisfying the attribute value r _i ∈d _i ∈ H _i (v _i ) for an arbitrary height v _i for each record r in the table T. = (D ₁ , d ₂ ,..., D _n ) is specified. The function assigns each record r to an equivalence class corresponding to the value combination d according to the specified value combination d. As described above, the function generates a set F ′ of equivalence classes to which each record r is assigned, and outputs the generated set F ′.

例えば、図１２は、対象となる属性の属性値を、入力として指定された一般化戦略と、当該属性の一般化階層とに基づき一般化する処理の処理結果の一例を示している。図１２に示す例は、図２に示したテーブルｄ１０の「年齢」及び「性別」を示す属性ｄ１１３ａ及びｄ１１３ｂそれぞれの属性値を一般化した場合の一例を示している。なお、図１２に示す例では、属性ｄ１１３ａ及びｄ１１３ｂそれぞれの属性値を一般化するために、図５及び図６に示した属性の一般化階層を使用し、各一般化階層の高さを「１」とした一般化ノード（１，１）に基づき、各属性値を一般化している。 For example, FIG. 12 illustrates an example of a processing result of a process of generalizing an attribute value of a target attribute based on a generalization strategy specified as an input and a generalization hierarchy of the attribute. The example illustrated in FIG. 12 illustrates an example in which the attribute values of the attributes d113a and d113b indicating “age” and “sex” in the table d10 illustrated in FIG. 2 are generalized. In the example shown in FIG. 12, in order to generalize the attribute values of the attributes d113a and d113b, the attribute generalization hierarchy shown in FIGS. 5 and 6 is used, and the height of each generalization hierarchy is set to “ Each attribute value is generalized based on the generalized node (1, 1) set to “1”.

即ち、図１２に示す例の場合には、「getFrequentSets」関数は、テーブルｄ１０の各レコードが割り当てられた等価クラスｄ３５１〜ｄ３５５を、集合Ｆ’として出力することとなる。 That is, in the example shown in FIG. 12, the “getFrequentSets” function outputs the equivalence classes d351 to d355 to which the records of the table d10 are assigned as the set F ′.

以上、図１１及び図１２を参照して、テーブルｄ１０の各レコードのうち、対象となる属性の属性値を、入力として指定された一般化戦略と、当該属性の一般化階層とに基づき一般化する処理の一例について説明した。 As described above, with reference to FIGS. 11 and 12, the attribute value of the target attribute among the records in the table d10 is generalized based on the generalization strategy specified as input and the generalization hierarchy of the attribute. An example of the processing to be performed has been described.

［４．２．（ｋ，ｔ）−匿名性の判定］
次に、図１３を参照して、所望の一般化戦略が、（ｋ，ｔ）−匿名性を満たすことが可能な一般化戦略か否かを判定する処理の一例について説明する。図１３は、所望の一般化戦略が、（ｋ，ｔ）−匿名性を満たすことが可能な一般化戦略か否かを判定する処理の一例について説明するための説明図であり、同処理のアルゴリズムを所謂プログラムコードの形式で模式的に提示した図である。なお、図１３に示す例は、処理の流れをわかりやすくするために一部の情報を抽象化して示しており、必ずしも特定のプログラム言語のコンパイラに基づきコンパイルできる形式で示したものではない。 [4.2. (K, t)-Determination of Anonymity]
Next, an example of processing for determining whether or not a desired generalization strategy is a generalization strategy that can satisfy (k, t) -anonymity will be described with reference to FIG. FIG. 13 is an explanatory diagram for explaining an example of processing for determining whether or not a desired generalization strategy is a generalization strategy that can satisfy (k, t) -anonymity. It is the figure which presented the algorithm typically in the form of what is called a program code. In the example shown in FIG. 13, some information is abstracted for easy understanding of the processing flow, and is not necessarily shown in a format that can be compiled based on a compiler of a specific programming language.

図１３に示した「kAnonymityCheck」関数は、入力として、変数ｖ＝（ｖ_１，ｖ_２，・・・，ｖ_ｎ）で示された一般化戦略（即ち、一般化ノード）と、（ｋ，ｔ）−匿名性の閾値（即ち、ｋ及びｔ）と、変数Ｔで示された処理対象となるテーブルｄ１０と、変数（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ）で示された各属性の一般化階層を示す情報とを取得する。なお、（ｖ_１，ｖ_２，・・・，ｖ_ｎ）及び（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ）の定義は、図１１を参照して前述した、属性値を一般化するための「getFrequentSets」関数の場合と同様である。 The “kAnonymityCheck” function shown in FIG. 13 has, as inputs, a generalization strategy (that is, a generalized node) indicated by variables v = (v ₁ , v ₂ ,..., V _n ), and (k, t)-Anonymity threshold (ie, k and t), table d10 to be processed indicated by variable T, and each indicated by variable (H ₁ , H ₂ ,..., H _n ) Get information indicating the generalized hierarchy of attributes. The definitions of (v ₁ , v ₂ ,..., V _n ) and (H ₁ , H ₂ ,..., H _n ) generalize the attribute values described above with reference to FIG. This is the same as the case of the “getFrequentSets” function.

具体的には、図１３に示した「kAnonymityCheck」関数は、入力として取得したテーブルＴの各レコードｒを、入力として取得した一般化戦略ｖに基づき、等価クラスに割り当てる。なお、各レコードｒを等価クラスに割り当てる処理は、図１１に基づき前述した「getFrequentSets」関数に相当する。 Specifically, the “kAnonymityCheck” function shown in FIG. 13 assigns each record r of the table T acquired as an input to an equivalence class based on the generalization strategy v acquired as an input. The process of assigning each record r to the equivalence class corresponds to the “getFrequentSets” function described above with reference to FIG.

また、「kAnonymityCheck」関数は、各レコードｒが割り当てられた各等価クラスについて、レコード数が入力として取得した閾値ｋ以上か否かを判定し、レコード数がｋ未満となる等価クラスのサイズ(等価クラスの元の数)cを計上する。そして、同関数は、計上されたレコード数がｋ未満となる等価クラスのサイズｃが、入力として取得した閾値ｔ以下の場合には「真（True）」を出力し、閾値ｔを超える場合には「偽（False）」を出力する。 In addition, the “kAnonymityCheck” function determines, for each equivalent class to which each record r is assigned, whether or not the number of records is greater than or equal to the threshold value k acquired as an input, and the size of the equivalent class with which the number of records is less than k (equivalent Count the original number of classes) c. The function outputs “True” when the size c of the equivalent class in which the number of recorded records is less than k is equal to or smaller than the threshold value t acquired as an input, and when it exceeds the threshold value t. Outputs "False".

以上、図１３を参照して、所望の一般化戦略が、（ｋ，ｔ）−匿名性を満たすことが可能な一般化戦略か否かを判定する処理の一例について説明した。 The example of the process for determining whether or not the desired generalization strategy is a generalization strategy that can satisfy (k, t) -anonymity has been described above with reference to FIG.

［４．３．元の変換］
前述したように、情報処理装置１０の基準点導出部１４５は、一般化戦略からなる束から、極小元探索部１４３が探索した（ｋ，ｔ）−匿名性を満たす極小元の探索結果を基に、（ｋ，ｔ）−匿名性を満たす一般化ノードを、新たな基準点の候補として特定する。このとき、基準点導出部１４５は、各属性の属性値を一般化する分類の数（換言すると、各属性の一般化階層の高さ）に基づき離散集合から誘導される束を規定し、一般化戦略からなる束と当該離散集合から誘導される束との間で元を変換することで、新たな基準点の候補を特定する。 [4.3. Original conversion]
As described above, the reference point deriving unit 145 of the information processing apparatus 10 is based on the search result of the minimal element satisfying (k, t) -anonymity searched by the minimal element searching unit 143 from the bundle of generalization strategies. In addition, a generalized node satisfying (k, t) -anonymity is specified as a new reference point candidate. At this time, the reference point deriving unit 145 defines a bundle derived from the discrete set based on the number of classifications that generalize the attribute values of each attribute (in other words, the height of the generalization hierarchy of each attribute). A new reference point candidate is specified by transforming the element between the bundle formed by the conversion strategy and the bundle derived from the discrete set.

そこで、本項では、まず、図１４を参照して、各属性の属性値を一般化する分類の数に基づく離散集合から誘導される束について説明し、その後、図１４〜図１７を参照して、元の変換に係る処理の一例について説明する。なお、以降では、各属性の属性値を一般化する分類の数に基づく離散集合から誘導される束を、単に、「離散集合から誘導される束」と記載する場合がある。 Therefore, in this section, first, a bundle derived from a discrete set based on the number of classifications that generalizes the attribute values of each attribute will be described with reference to FIG. 14, and then refer to FIGS. An example of processing related to the original conversion will be described. In the following, a bundle derived from a discrete set based on the number of classifications that generalizes the attribute values of each attribute may be simply referred to as a “bundle derived from the discrete set”.

なお、離散集合から誘導される束とは、有限集合［ｎ］：＝｛１，２，・・・，ｎ｝の冪集合をＢ（［ｎ］）とした場合に、冪集合B([n])の各元の包含関係を半順序関係として規定される束＜Ｂ（［ｎ］），⊆＞に相当する。 Note that a bundle derived from a discrete set refers to a 冪 set B ([[] when a 冪 set of a finite set [n]: = {1, 2,..., N} is B ([n]). n]) corresponds to a bundle <B ([n]), ⊆> defined as a partial order relationship.

例えば、図１４は、一般化戦略からなる束と離散集合から誘導される束との間の対応関係の一例について示した図であり、各束の間での元の対応関係を示している。なお、図１４に示す例では、図５及び図６に示した属性の一般化階層に基づく一般化戦略からなる束（図７参照）を適用した場合の一例を示している。 For example, FIG. 14 is a diagram showing an example of a correspondence relationship between a bundle composed of a generalization strategy and a bundle derived from a discrete set, and shows an original correspondence relationship between each bundle. The example shown in FIG. 14 shows an example in which a bundle (see FIG. 7) composed of generalization strategies based on the attribute generalization hierarchy shown in FIGS. 5 and 6 is applied.

図１４に示す例では、各属性の属性値を一般化する分類（換言すると、各属性の一般化階層のうち、属性値そのものを示す高さ「０」の階層を除いたもの）の数に基づき規定される有限集合（離散集合）の各部分集合を元として、当該部分集合間の包含関係を順序関係とすることで、離散集合から誘導される束が規定されている。具体的には、各属性の属性値を一般化する分類の数の和を最大とする正の整数の集合の各部分集合を元として、当該部分集合間の間の包含関係を順序関係とすることで、離散集合から誘導される束が規定されている。 In the example shown in FIG. 14, the number of classifications that generalize the attribute values of each attribute (in other words, the generalized hierarchy of each attribute, excluding the hierarchy of height “0” indicating the attribute value itself). Based on each subset of a finite set (discrete set) defined on the basis of the order, the bundle derived from the discrete set is defined by using the inclusion relation between the subsets as an order relationship. Specifically, based on each subset of a set of positive integers that maximizes the sum of the number of classifications that generalize the attribute values of each attribute, the inclusion relationship between the subsets is an order relationship. Thus, a bundle derived from a discrete set is defined.

なお、図１４に示す例では、離散集合から誘導される束の正の整数の有限集合｛１，２，３，４｝のうち、要素「１」及び「２」に基づく有限集合｛１，２｝の部分集合により、図５に示した年齢を示す属性の高さ「０」〜「２」で示された各分類（即ち、一般化階層）を示している。 In the example shown in FIG. 14, among the positive integer finite sets {1, 2, 3, 4} of bundles derived from the discrete sets, the finite sets {1, 2, 3 based on the elements “1” and “2” are used. 2} represents the respective classifications (namely, generalized hierarchies) indicated by the heights “0” to “2” of the attributes indicating the age shown in FIG.

具体的には、年齢を示す属性の一般化階層おける高さ「０」の分類（即ち、属性値そのもの）は、離散集合から誘導される束において、正の整数の部分集合｛１，２｝で示される。また、年齢を示す属性の一般化階層おける高さ「１」の分類は、離散集合から誘導される束において、正の整数の部分集合｛１｝で示される。また、年齢を示す属性の一般化階層おける高さ「２」の分類は、離散集合から誘導される束において、正の整数の部分集合｛｝（即ち、空集合）で示される。 Specifically, the classification of height “0” in the generalized hierarchy of attributes indicating age (that is, the attribute value itself) is a subset of positive integers {1, 2} in a bundle derived from a discrete set. Indicated by Further, the classification of the attribute indicating the age with the height “1” in the generalized hierarchy is represented by a positive integer subset {1} in the bundle derived from the discrete set. In addition, the classification of the attribute indicating the age at the height “2” in the generalized hierarchy is represented by a positive integer subset {} (that is, an empty set) in a bundle derived from the discrete set.

同様に、図１４に示す例では、離散集合から誘導される束の正の整数の有限集合｛１，２，３，４｝のうち、要素「３」及び「４」に基づく有限集合｛３，４｝の部分集合により、図６に示した性別を示す属性の高さ「０」〜「２」で示された各分類（即ち、一般化階層）を示している。 Similarly, in the example shown in FIG. 14, among the positive integer finite sets {1, 2, 3, 4} of bundles derived from the discrete sets, the finite set {3 based on the elements “3” and “4” , 4} represents the respective classifications (namely, generalized hierarchies) indicated by the attribute heights “0” to “2” indicating the gender shown in FIG.

具体的には、性別を示す属性の一般化階層おける高さ「０」の分類（即ち、属性値そのもの）は、離散集合から誘導される束において、正の整数の部分集合｛３，４｝で示される。また、性別を示す属性の一般化階層おける高さ「１」の分類は、離散集合から誘導される束において、正の整数の部分集合｛３｝で示される。また、性別を示す属性の一般化階層おける高さ「２」の分類は、離散集合から誘導される束において、正の整数の部分集合｛｝（即ち、空集合）で示される。 Specifically, the classification of the attribute indicating gender having a height of “0” in the generalized hierarchy (that is, the attribute value itself) is a subset of positive integers {3,4} in a bundle derived from the discrete set. Indicated by In addition, the classification of the attribute indicating sex by the height “1” in the generalized hierarchy is represented by a positive integer subset {3} in a bundle derived from the discrete set. In addition, the classification of the attribute indicating gender with the height “2” in the generalized hierarchy is represented by a positive integer subset {} (that is, an empty set) in a bundle derived from the discrete set.

即ち、各属性に対応する複数の分類のそれぞれは、より具体化された分類ほど、当該属性に関連付けられた、複数の連続した正の整数に対応した要素のうち、より多くの要素が、値のより小さい要素から順次加えられた部分集合に対応付けられている。 That is, for each of the plurality of classifications corresponding to each attribute, the more specific the classification is, the more elements among the elements corresponding to the plurality of consecutive positive integers associated with the attribute are the values. Are associated with subsets added sequentially starting with smaller elements.

なお、このとき、一般化戦略からなる束の最大元（２，２）は、離散集合から誘導される束の最小限｛｝（即ち、空集合）に対応付けられ、一般化戦略からなる束の最小元（０，０）は、離散集合から誘導される束の最大元｛１，２，３，４｝に対応付けられる。 At this time, the maximum element (2, 2) of the bundle composed of the generalization strategy is associated with the minimum {} (ie, empty set) of the bundle derived from the discrete set, and the bundle composed of the generalization strategy. Is associated with the maximum element {1, 2, 3, 4} of the bundle derived from the discrete set.

以上のようにして、離散集合から誘導される束を規定することで、一般化階層からなる束の各元（即ち、一般化ノード）が、当該離散集合から誘導される束の元に対応付けられる。 By defining a bundle derived from a discrete set as described above, each element of the bundle consisting of the generalized hierarchy (that is, a generalized node) is associated with the element of the bundle derived from the discrete set. It is done.

具体的な一例として、図１４に示す例では、一般化戦略からなる束の一般化ノード（０，１）は、離散集合から誘導される束の元｛１，２，３｝に対応付けられる。同様に、一般化戦略からなる束の一般化ノード（２，１）は、離散集合から誘導される束の元｛３｝に対応付けられる。 As a specific example, in the example illustrated in FIG. 14, a generalized node (0, 1) of a bundle composed of a generalization strategy is associated with a bundle element {1, 2, 3} derived from a discrete set. . Similarly, a bundle generalization node (2, 1) consisting of a generalization strategy is associated with a bundle element {3} derived from a discrete set.

以上のような構成のもと、一般化戦略からなる束と、離散集合から誘導される束との間で、一方の束の元を他方の束の元に変換する処理の一例について以下に説明する。 Based on the above configuration, an example of processing for converting one bundle element into another bundle element between a bundle consisting of a generalized strategy and a bundle derived from a discrete set will be described below. To do.

まず、図１４及び図１５を参照して、一般化戦略からなる束の元（即ち、一般化ノード）を、離散集合から誘導される束の元に変換する処理の一例について説明する。図１５は、一般化戦略からなる束の元を、離散集合から誘導される束の元に変換する処理の一例について説明するための説明図であり、同処理のアルゴリズムを所謂プログラムコードの形式で模式的に提示した図である。なお、図１５に示す例は、処理の流れをわかりやすくするために一部の情報を抽象化して示しており、必ずしも特定のプログラム言語のコンパイラに基づきコンパイルできる形式で示したものではない。 First, with reference to FIGS. 14 and 15, an example of processing for converting a bundle element (that is, a generalized node) composed of a generalization strategy into a bundle element derived from a discrete set will be described. FIG. 15 is an explanatory diagram for explaining an example of a process for converting a bundle element composed of a generalization strategy into a bundle element derived from a discrete set. The algorithm of the process is expressed in a so-called program code format. It is the figure shown typically. Note that the example shown in FIG. 15 abstracts a part of information for easy understanding of the flow of processing, and is not necessarily shown in a format that can be compiled based on a compiler of a specific programming language.

図１５に示した「set」関数は、入力として、変数ｖ＝（ｖ_１，ｖ_２，・・・，ｖ_ｎ）で示された一般化ノードと、（ｈ_１，ｈ_２，・・・，ｈ_ｎ）で示された各属性の一般化階層の高さの最大値を示す情報とを取得する。なお、（ｖ_１，ｖ_２，・・・，ｖ_ｎ）の定義は、図１１を参照して前述した、属性値を一般化するための「getFrequentSets」関数の場合と同様である。また、（ｈ_１，ｈ_２，・・・，ｈ_ｎ）のそれぞれは、互いに異なる属性に対応しており、各属性の一般化階層の高さの最大値を示している。即ち、ｈ_ｉは、ｉ番目の属性の一般化階層の高さの最大値を示していることとなる。 The “set” function shown in FIG. 15 has, as inputs, a generalized node indicated by variables v = (v ₁ , v ₂ ,..., V _n ), and (h ₁ , h ₂ ,. , H _n ), the information indicating the maximum value of the generalized hierarchy height of each attribute is acquired. The definition of (v ₁ , v ₂ ,..., V _n ) is the same as that of the “getFrequentSets” function for generalizing attribute values described above with reference to FIG. Each of (h ₁ , h ₂ ,..., H _n ) corresponds to a different attribute and indicates the maximum height of the generalized hierarchy of each attribute. That is, h _i indicates the maximum value of the generalized hierarchy of the i-th attribute.

図１５に示す「set」関数は、図１４に示した一般化戦略からなる束の元と離散集合から誘導される束の元との対応関係に基づき、一般化戦略からなる束の最小元（０，０）を入力とした場合に、離散集合から誘導される束の最大元｛１，２，３，４｝を出力する。同様に、同関数は、一般化戦略からなる束の最大元（２，２）を入力とした場合に、離散集合から誘導される束の最小限｛｝を出力する。 The “set” function shown in FIG. 15 is based on the correspondence between the bundle element composed of the generalized strategy shown in FIG. 14 and the bundle element derived from the discrete set, and the minimum element of the bundle composed of the generalized strategy ( When 0,0) is input, the maximum element {1, 2, 3, 4} of bundles derived from the discrete set is output. Similarly, this function outputs the minimum {} of bundles derived from a discrete set when the maximum element (2, 2) of bundles composed of generalization strategies is input.

また、一般化戦略からなる束の最小元に対して、少なくともいずれかの属性に対応する要素が加算された元を、「set」関数の入力とした場合に着目する。この場合には、同関数は、離散集合から誘導される束の最大元に対応する集合から、一般化ノードにおいて値が加算された属性に対応する要素のうち、より値の大きい要素が、当該加算分だけ削除された部分集合に対応する元を出力する。 Further, attention is paid to a case where an element obtained by adding an element corresponding to at least one of the attributes to the minimum element of a bundle composed of generalization strategies is used as an input of the “set” function. In this case, from the set corresponding to the largest element of the bundle derived from the discrete set, the same function is the element with the larger value among the elements corresponding to the attribute added with the value in the generalized node. The element corresponding to the subset deleted by the addition is output.

具体的な一例として、一般化戦略からなる束の最小元（０，０）に対して、性別を示す属性に対応する要素が１加算された（即ち、当該属性を１段階だけ一般化した）一般化ノード（０，１）に着目する。この場合には、離散集合から誘導される束の最大元に対応する集合｛１，２，３，４｝のうち、要素「３」と「４」とが、性別を示す属性に対応している。そのため、「set」関数は、集合｛１，２，３，４｝から、要素「３」と「４」とのうち、より大きい要素を１つだけ削除した部分集合｛１，２，３｝に対応する元を、一般化ノード（０，１）に対応する元として出力する。 As a specific example, the element corresponding to the attribute indicating sex is added to the minimum element (0, 0) of the bundle consisting of the generalization strategy (that is, the attribute is generalized only by one level). Focus on the generalized node (0, 1). In this case, in the set {1, 2, 3, 4} corresponding to the maximum element of the bundle derived from the discrete set, the elements “3” and “4” correspond to the attribute indicating gender. Yes. Therefore, the “set” function is a subset {1, 2, 3} obtained by deleting only one larger element from the elements “3” and “4” from the set {1, 2, 3, 4}. Is output as an element corresponding to the generalized node (0, 1).

また、他の一例として、一般化戦略からなる束の最小元（０，０）に対して、年齢を示す属性に対応する要素が２加算された（即ち、当該属性を２段階だけ一般化した）一般化ノード（２，０）に着目する。この場合には、離散集合から誘導される束の最大元に対応する集合｛１，２，３，４｝のうち、要素「１」と「２」とが、年齢を示す属性に対応している。そのため、「set」関数は、集合｛１，２，３，４｝から、要素「１」と「２」とのうち、より大きい要素を２つだけ削除した部分集合｛３，４｝に対応する元を、一般化ノード（２，０）に対応する元として出力する。 As another example, the element corresponding to the attribute indicating age is added to the minimum element (0, 0) of the bundle consisting of the generalization strategy (that is, the attribute is generalized by only two stages). Note the generalized node (2, 0). In this case, in the set {1, 2, 3, 4} corresponding to the maximum element of the bundle derived from the discrete set, the elements “1” and “2” correspond to the attribute indicating the age. Yes. Therefore, the “set” function corresponds to the subset {3, 4} in which only two larger elements of the elements “1” and “2” are deleted from the set {1, 2, 3, 4}. Is output as an element corresponding to the generalized node (2, 0).

以上のようにして、「set」関数は、図１４に示した一般化戦略からなる束の元と離散集合から誘導される束の元との対応関係に基づき、入力として取得した一般化ノードｖを、離散集合から誘導される束の元Ｖ＝｛ａ_１，ａ_２，・・・｝に変換する。 As described above, the “set” function is the generalized node v acquired as an input based on the correspondence between the bundle elements formed by the generalization strategy shown in FIG. 14 and the bundle elements derived from the discrete set. Is converted into a bundle element V = {a ₁ , a ₂ ,...

以上、図１４及び図１５を参照して、一般化戦略からなる束の元（即ち、一般化ノード）を、離散集合から誘導される束の元に変換する処理の一例について説明した。 As described above, with reference to FIGS. 14 and 15, an example of processing for converting a bundle element (that is, a generalized node) including a generalization strategy into a bundle element derived from a discrete set has been described.

次に、図１６及び図１７を参照して、離散集合から誘導される束の元を、一般化戦略からなる束の元（即ち、一般化ノード）に変換する処理の一例について説明する。図１６及び図１７は、離散集合から誘導される束の元を、一般化戦略からなる束の元（即ち、一般化ノード）に変換する処理の一例について説明するための説明図である。また、図１７は、同処理のアルゴリズムを所謂プログラムコードの形式で模式的に提示した図である。なお、図１７に示す例は、処理の流れをわかりやすくするために一部の情報を抽象化して示しており、必ずしも特定のプログラム言語のコンパイラに基づきコンパイルできる形式で示したものではない。 Next, an example of processing for converting a bundle element derived from a discrete set into a bundle element (ie, a generalized node) composed of a generalization strategy will be described with reference to FIGS. FIGS. 16 and 17 are explanatory diagrams for explaining an example of processing for converting a bundle element derived from a discrete set into a bundle element (ie, a generalized node) including a generalization strategy. FIG. 17 is a diagram schematically showing the algorithm of the processing in a so-called program code format. Note that the example shown in FIG. 17 abstracts a part of information for easy understanding of the flow of processing, and is not necessarily shown in a format that can be compiled based on a compiler of a specific programming language.

まず、図１６を参照して、離散集合から誘導される束の元を、一般化戦略からなる束の元（即ち、一般化ノード）に変換する処理の概要について説明する。図１４を参照するとわかるように、一般化戦略からなる束の元と、離散集合から誘導される束の元とを１対１で対応付けた場合に、離散集合から誘導される束の元全てに対して、一般化戦略からなる束の元が対応付けられるとは限らない。 First, with reference to FIG. 16, an outline of processing for converting a bundle element derived from a discrete set into a bundle element (ie, a generalized node) composed of a generalization strategy will be described. As can be seen with reference to FIG. 14, all the bundle elements derived from the discrete set when the bundle elements formed by the generalization strategy and the bundle elements derived from the discrete set are associated one-to-one. On the other hand, a bundle element made up of a generalization strategy is not always associated.

具体的な一例として、前述した「set」関数により、一般化戦略からなる束の元を、離散集合から誘導される束の元に変換した場合に、変換後の元として、例えば、部分集合｛１，２，４｝に対応する元や、部分集合｛１，４｝に対応する元は出力されない。そのため、「set」関数の逆変換を行ったとしても、必ずしも、離散集合から誘導される束の元全てを、一般化戦略からなる束の元に変換できるとは限らない。 As a specific example, when a bundle element composed of a generalization strategy is converted into a bundle element derived from a discrete set by the above-described “set” function, for example, a subset { Elements corresponding to 1, 2, 4} and elements corresponding to the subset {1, 4} are not output. Therefore, even if the inverse transformation of the “set” function is performed, it is not always possible to transform all the elements of the bundle derived from the discrete set into the elements of the bundle consisting of the generalization strategy.

以上のような状況を鑑みて、図１７に示す「vec」関数は、離散集合から誘導される束の元全てを、一般化戦略からなる束におけるいずれかの元（即ち、一般化ノード）に対応付けるアルゴリズムを提供する。そこで、図１７を参照して、離散集合から誘導される束の元を、一般化戦略からなる束の元（即ち、一般化ノード）に変換する、「vec」関数の詳細について以下に説明する。 In view of the above situation, the “vec” function shown in FIG. 17 converts all elements of a bundle derived from a discrete set into any element (that is, a generalized node) in a bundle composed of a generalization strategy. Provide the matching algorithm. The details of the “vec” function for converting a bundle element derived from a discrete set into a bundle element consisting of a generalization strategy (ie, a generalized node) will be described below with reference to FIG. .

図１７に示した「vec」関数は、入力として、変数Ｖ＝｛ａ_１，ａ_２，・・・｝で示された離散集合から誘導される束の元と、（ｈ_１，ｈ_２，・・・，ｈ_ｎ）で示された各属性の一般化階層の高さの最大値を示す情報とを取得する。なお、（ｈ_１，ｈ_２，・・・，ｈ_ｎ）の定義は、図１５を参照して前述した、一般化戦略からなる束の元を、離散集合から誘導される束の元に変換する「set」関数の場合と同様である。 The “vec” function shown in FIG. 17 has, as inputs, a bundle element derived from a discrete set indicated by a variable V = {a ₁ , a ₂ ,...}, (H ₁ , h ₂ , .., H _n ), information indicating the maximum value of the height of the generalized hierarchy of each attribute is acquired. Note that the definition of (h ₁ , h ₂ ,..., H _n ) is to convert a bundle element composed of the generalization strategy described above with reference to FIG. 15 into a bundle element derived from a discrete set. This is the same as the case of the “set” function.

「vec」関数は、入力として取得した離散集合から誘導される束の元を、「set」関数により一般化戦略からなる束の一般化ノードが対応付けられた、離散集合から誘導される束の元のいずれかに対応付ける。具体的には、同関数は、入力として取得した元が示す正の整数の有限集合（即ち、有限集合｛１，２，３，４｝の部分集合）において、各属性に関連付けられた要素に着目して、当該要素よりも低い値を示す他の要素が欠落している場合には、当該他の要素を補間することで、他の元に対応付ける。 The “vec” function is a bundle derived from a discrete set that is associated with a bundle generalized node consisting of a generalization strategy by the “set” function. Map to one of the original. Specifically, in the finite set of positive integers indicated by the element acquired as input (that is, a subset of the finite set {1, 2, 3, 4}), the function is applied to the element associated with each attribute. When attention is paid to other elements that are lower in value than the element, the other elements are interpolated to correspond to other elements.

例えば、図１６において、離散集合から誘導される束の元のうち、有限集合｛２，３｝で示された元に着目する。前述したように、図１４及び図１６に示した離散集合から誘導される束の元では、要素「１」及び「２」に基づく有限集合｛１，２｝の部分集合により、年齢を示す属性の分類が示される。即ち、有限集合｛２，３｝に着目した場合に、年齢を示す属性に関連付けられた要素は「２」であり、当該要素より低い値を示す要素「１」が欠落していることがわかる。この場合には、「vec」関数は、有限集合｛２，３｝に対して、年齢を示す属性に対応する要素「１」を補間する。 For example, in FIG. 16, attention is paid to the elements indicated by the finite set {2, 3} among the bundle elements derived from the discrete set. As described above, in the bundle derived from the discrete set shown in FIG. 14 and FIG. 16, the attribute indicating the age is represented by a subset of the finite set {1, 2} based on the elements “1” and “2”. Classification is shown. That is, when focusing on the finite set {2, 3}, it is understood that the element associated with the attribute indicating age is “2” and the element “1” indicating a value lower than the element is missing. . In this case, the “vec” function interpolates the element “1” corresponding to the attribute indicating age with respect to the finite set {2, 3}.

また、図１４及び図１６に示した離散集合から誘導される束の元では、要素「３」及び「４」に基づく有限集合｛３，４｝の部分集合により、性別を示す属性の分類が示される。即ち、有限集合｛２，３｝に着目した場合に、性別を示す属性に関連付けられた要素は「３」であり、当該要素より低い値を示す要素は存在しない。この場合には、「vec」関数は、有限集合｛２，３｝に対して、性別を示す属性に対応する要素を新たに補完することはない。 Further, in the bundle source derived from the discrete set shown in FIGS. 14 and 16, the attribute classification indicating sex is determined by the subset of the finite set {3, 4} based on the elements “3” and “4”. Indicated. That is, when focusing on the finite set {2, 3}, the element associated with the attribute indicating gender is “3”, and there is no element indicating a value lower than that element. In this case, the “vec” function does not newly supplement the element corresponding to the attribute indicating sex with respect to the finite set {2, 3}.

以上のようにして、「vec」関数は、離散集合から誘導される束において、有限集合｛２，３｝で示された元を、有限集合｛１，２，３｝で示された他の元に対応付ける。なお、離散集合から誘導される束において有限集合｛１，２，３｝で示された元は、図１４及び図１６に示すように、一般化戦略からなる束の一般化ノード（０，１）に対応付けられる。即ち、「vec」関数は、有限集合｛２，３｝を要素の補間により有限集合｛１，２，３｝に対応付けることで、離散集合から誘導される束において有限集合｛２，３｝で示された元を、一般化戦略からなる束の一般化ノード（０，１）に変換する。 As described above, in the bundle derived from the discrete set, the “vec” function replaces the element indicated by the finite set {2, 3} with the other set indicated by the finite set {1, 2, 3}. Map to the original. In the bundle derived from the discrete set, the element indicated by the finite set {1, 2, 3} is a bundle generalized node (0, 1) consisting of a generalization strategy as shown in FIGS. ). That is, the “vec” function associates the finite set {2, 3} with the finite set {1, 2, 3} by interpolating elements, so that in the bundle derived from the discrete set, the finite set {2, 3} The indicated element is transformed into a bundle of generalized nodes (0, 1) consisting of generalization strategies.

また他の一例として、図１６において、離散集合から誘導される束の元のうち、有限集合｛１，４｝で示された元に着目する。この場合には、年齢を示す属性に関連付けられた要素は「１」であり、当該要素より低い値を示す要素は存在しない。この場合には、「vec」関数は、有限集合｛１，４｝に対して、年齢を示す属性に対応する要素を新たに補間することはない。 As another example, in FIG. 16, attention is paid to elements indicated by a finite set {1, 4} among bundle elements derived from a discrete set. In this case, the element associated with the attribute indicating age is “1”, and there is no element indicating a value lower than that element. In this case, the “vec” function does not newly interpolate an element corresponding to the attribute indicating age for the finite set {1, 4}.

また、有限集合｛１，４｝に着目した場合に、性別を示す属性に関連付けられた要素は「４」であり、当該要素より低い値を示す要素「３」が欠落していることがわかる。この場合には、「vec」関数は、有限集合｛１，４｝に対して、性別を示す属性に対応する要素「３」を補間する。 Further, when focusing on the finite set {1, 4}, it is understood that the element associated with the attribute indicating gender is “4” and the element “3” indicating a value lower than the element is missing. . In this case, the “vec” function interpolates the element “3” corresponding to the attribute indicating sex with respect to the finite set {1, 4}.

以上のようにして、「vec」関数は、離散集合から誘導される束において、有限集合｛１，４｝で示された元を、有限集合｛１，３，４｝で示された他の元に対応付ける。なお、離散集合から誘導される束において有限集合｛１，３，４｝で示された元は、図１４及び図１６に示すように、一般化戦略からなる束の一般化ノード（１，０）に対応付けられる。即ち、「vec」関数は、有限集合｛１，４｝を要素の補間により有限集合｛１，３，４｝に対応付けることで、離散集合から誘導される束において有限集合｛１，４｝で示された元を、一般化戦略からなる束の一般化ノード（１，０）に変換する。 As described above, in the bundle derived from the discrete set, the “vec” function replaces the element represented by the finite set {1, 4} with the other represented by the finite set {1, 3, 4}. Map to the original. Note that in a bundle derived from a discrete set, an element represented by a finite set {1, 3, 4} is a bundle generalized node (1,0) consisting of a generalization strategy as shown in FIGS. ). In other words, the “vec” function associates the finite set {1, 4} with the finite set {1, 3, 4} by interpolating elements, so that in the bundle derived from the discrete set, the finite set {1, 4} The indicated element is transformed into a bundle of generalized nodes (1, 0) consisting of generalization strategies.

同様にして、離散集合から誘導される束において、有限集合｛１，２，４｝、｛２，３，４｝、｛２，４｝で示された各元は、有限集合｛１，２，３，４｝で示された他の元に対応付けられる。なお、離散集合から誘導される束において有限集合｛１，２，３，４｝で示された元は、図１４及び図１６に示すように、一般化戦略からなる束の一般化ノード（０，９）に対応付けられる。即ち、「vec」関数は、離散集合から誘導される束において、有限集合｛１，２，４｝、｛２，３，４｝、｛２，４｝で示された各元を、一般化戦略からなる束の一般化ノード（０，０）に変換する。 Similarly, in a bundle derived from a discrete set, each element indicated by a finite set {1, 2, 4}, {2, 3, 4}, {2, 4} is represented by a finite set {1, 2, , 3, 4}. In the bundle derived from the discrete set, the element indicated by the finite set {1, 2, 3, 4} is a bundle generalized node (0) as shown in FIG. 14 and FIG. , 9). That is, the “vec” function generalizes each element represented by a finite set {1, 2, 4}, {2, 3, 4}, {2, 4} in a bundle derived from a discrete set. Convert to a bundle of generalized nodes (0, 0) consisting of strategies.

以上のようにして、「vec」関数は、図１６に示すように、入力として取得した離散集合から誘導される束の元Ｖ＝｛ａ_１，ａ_２，・・・｝を、一般化戦略からなる束の一般化ノードｖ＝（ｖ_１，ｖ_２，・・・，ｖ_ｎ）に変換する。 As described above, as shown in FIG. 16, the “vec” function uses a bundle element V = {a ₁ , a ₂ ,... Is converted into a generalized node v = (v ₁ , v ₂ ,..., V _n ).

以上、図１６及び図１７を参照して、離散集合から誘導される束の元を、一般化戦略からなる束の元（即ち、一般化ノード）に変換する処理の一例について説明した。 As described above, with reference to FIGS. 16 and 17, an example of processing for converting a bundle element derived from a discrete set into a bundle element (that is, a generalized node) including a generalization strategy has been described.

［４．４．極小元の探索］
次に、図１８及び図１９を参照して、前述した解析部１４の極小元探索部１４３が、一般化戦略からなる束における各一般化ノードから、（ｋ，ｔ）−匿名性を満たす極小元を探索する処理の一例について説明する。図１８及び図１９は、一般化戦略からなる束における各一般化ノードから、（ｋ，ｔ）−匿名性を満たす極小元を探索する処理の一例について説明するための説明図である。 [4.4. Search for local minimum]
Next, referring to FIG. 18 and FIG. 19, the local minimum search unit 143 of the analysis unit 14 described above satisfies the minimum satisfying (k, t) -anonymity from each generalized node in the bundle composed of the generalization strategy. An example of a process for searching for an element will be described. 18 and 19 are explanatory diagrams for explaining an example of processing for searching for a minimal element satisfying (k, t) -anonymity from each generalized node in a bundle composed of generalization strategies.

まず、図１８を参照して、一般化戦略からなる束における各一般化ノードから、（ｋ，ｔ）−匿名性を満たす極小元を探索するための「AMSS(A Most Specific Sentences)」関数の処理の流れについて説明する。図１８は、同処理のアルゴリズムを所謂プログラムコードの形式で模式的に提示した図である。なお、図１８に示す例は、処理の流れをわかりやすくするために一部の情報を抽象化して示しており、必ずしも特定のプログラム言語のコンパイラに基づきコンパイルできる形式で示したものではない。 First, referring to FIG. 18, an “AMSS (A Most Specific Sentences)” function for searching for a minimal element satisfying (k, t) −anonymity from each generalized node in a bundle composed of generalization strategies. The flow of processing will be described. FIG. 18 is a diagram schematically showing the algorithm of the processing in a so-called program code format. In the example shown in FIG. 18, some information is abstracted for easy understanding of the processing flow, and is not necessarily shown in a format that can be compiled based on a compiler of a specific programming language.

図１８に示した「AMSS」関数は、入力として、変数ｖ＝（ｖ_１，ｖ_２，・・・，ｖ_ｎ）で示された（ｋ，ｔ）−匿名性を満たす一般化ノードと、変数Ｔで示された処理対象となるテーブルｄ１０と、変数（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ）で示された各属性の一般化階層を示す情報とを取得する。なお、（ｖ_１，ｖ_２，・・・，ｖ_ｎ）及び（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ）の定義は、図１１を参照して前述した、属性値を一般化するための「getFrequentSets」関数の場合と同様である。 The “AMSS” function shown in FIG. 18 has a generalized node satisfying (k, t) -anonymity represented by variables v = (v ₁ , v ₂ ,..., V _n ) as inputs, A table d10 to be processed indicated by the variable T and information indicating the generalized hierarchy of each attribute indicated by the variables (H ₁ , H ₂ ,..., H _n ) are acquired. The definitions of (v ₁ , v ₂ ,..., V _n ) and (H ₁ , H ₂ ,..., H _n ) generalize the attribute values described above with reference to FIG. This is the same as the case of the “getFrequentSets” function.

「AMSS」関数は、極小元探索部１４３の動作として前述したように、一般化戦略からなる束において、（ｋ，ｔ）−匿名性を満たす元（一般化ノード）を基準点として、少なくともいずれかの要素をより具体化する方向に向けて、（ｋ，ｔ）−匿名性を満たす極小元を探索する。 As described above as the operation of the minimal element search unit 143, the “AMSS” function is at least any of the bundles (generalized nodes) satisfying (k, t) -anonymity in the bundle composed of generalized strategies. A minimal element that satisfies (k, t) -anonymity is searched for in a direction to further embody such an element.

具体的には、「AMSS」関数は、入力として指定された（ｋ，ｔ）−匿名性を満たす一般化ノードｖを、（ｋ，ｔ）−匿名性を満たす極小元を探索するための基準点として設定する。そして、同関数は、（ｋ，ｔ）−匿名性を満たすことが確認された元の各要素（即ち、一般化階層の高さ）のうち、いずれかの要素をより具体化する方向にシフトさせ、当該シフト後の元が（ｋ，ｔ）−匿名性を満たす元か否かを判定する。なお、（ｋ，ｔ）−匿名性の判定には、図１３を参照して前述した「kAnonymityCheck」関数を使用すればよい。 Specifically, the “AMSS” function is a criterion for searching for a generalized node v that satisfies (k, t) -anonymity specified as an input, and a minimal element that satisfies (k, t) -anonymity. Set as a point. Then, the function is shifted in a direction to make one of the elements more specific among the original elements (that is, the height of the generalized hierarchy) confirmed to satisfy (k, t) -anonymity. And determine whether or not the element after the shift satisfies (k, t) -anonymity. For the determination of (k, t) -anonymity, the “kAnonymityCheck” function described above with reference to FIG. 13 may be used.

そして、「AMSS」関数は、シフト後の元が（ｋ，ｔ）−匿名性を満たす場合には、いずれかの要素をさらに具体化する方向にシフトさせ、シフト後の元が（ｋ，ｔ）−匿名性を満たさない場合には、いずれかの要素を一般化（抽象化）する方向にシフトさせる。このように、同関数は、いずれかの要素をシフトさせ、シフト後の元が（ｋ，ｔ）−匿名性を満たす元か否かを判定する処理を逐次実行することで、（ｋ，ｔ）−匿名性を満たす極小元を探索する。 Then, when the element after the shift satisfies (k, t) -anonymity, the “AMSS” function shifts any element in a direction to further embody, and the element after the shift becomes (k, t )-If anonymity is not satisfied, shift any element in the direction of generalization (abstraction). As described above, the same function sequentially shifts any element and sequentially determines whether the element after the shift satisfies (k, t) -anonymity. )-Search for a minimal element that satisfies anonymity.

なお、図１８に示す例では、「AMSS」関数は、（ｋ，ｔ）−匿名性を満たす極小元を探索に、所謂二分探索と呼ばれる方法を用いている。 In the example shown in FIG. 18, the “AMSS” function uses a so-called binary search method for searching for a minimal element satisfying (k, t) -anonymity.

ここで、図１９を参照して、図１８に示した「AMSS」関数に基づき、極小元探索部１４３が、（ｋ，ｔ）−匿名性を満たす極小元の探索に係る処理について、具体的な例を挙げて説明する。図１９に示す例は、図５及び図６に示した属性の一般化階層に基づく一般化戦略からなる束（図７参照）を適用し、最大元（２，２）を基準点とした場合の一例について示している。なお、図１９に示す境界ｄ５０は、（ｋ，ｔ）−匿名性を満たす元と、（ｋ，ｔ）−匿名性を満たさない元の境界を示しており、当該境界ｄ５０から最大元側（即ち、図面の下側）の各元が、（ｋ，ｔ）−匿名性を満たす元に相当する。もちろん、境界ｄ５０から最小元側（即ち、図面の上側）の各元は、（ｋ，ｔ）−匿名性を満たさない元に相当することは言うまでもない。 Here, referring to FIG. 19, based on the “AMSS” function shown in FIG. 18, the minimum element search unit 143 specifically describes the process related to the search for the minimum element satisfying (k, t) -anonymity. A specific example will be described. In the example shown in FIG. 19, a bundle (see FIG. 7) composed of generalization strategies based on the attribute generalization hierarchy shown in FIGS. 5 and 6 is applied, and the maximum element (2, 2) is used as a reference point. An example is shown. Note that a boundary d50 shown in FIG. 19 indicates an original boundary that satisfies (k, t) -anonymity and an original boundary that does not satisfy (k, t) -anonymity, and the maximum element side ( That is, each element on the lower side of the drawing corresponds to an element satisfying (k, t) -anonymity. Of course, it goes without saying that each element on the minimum element side (that is, the upper side of the drawing) from the boundary d50 corresponds to an element that does not satisfy (k, t) -anonymity.

図１９に示す例では、極小元探索部１４３は、まず、年齢の属性に対応する要素をより具体化する方向（即ち、一般化戦略の高さを低くする方向）にシフトさせている。即ち、極小元探索部１４３は、一般化ノード（２，２）から、年齢の属性に対応する要素をより具体化する方向にシフトさせた一般化ノード（１，２）の（ｋ，ｔ）−匿名性を判定する。ここでは、一般化ノード（１，２）は、（ｋ，ｔ）−匿名性を満たすものとする。 In the example illustrated in FIG. 19, the local minimum search unit 143 first shifts the element corresponding to the age attribute to a more specific direction (that is, a direction of lowering the height of the generalization strategy). In other words, the local minimum search unit 143 shifts the element corresponding to the age attribute from the generalized node (2, 2) to a more specific direction (k, t) of the generalized node (1, 2). -Determine anonymity. Here, it is assumed that the generalized node (1, 2) satisfies (k, t) -anonymity.

一般化ノード（１，２）が（ｋ，ｔ）−匿名性を満たすため、極小元探索部１４３は、当該一般化ノード（１，２）から、年齢の属性に対応する要素をより具体化する方向にシフトさせた一般化ノード（０，２）の（ｋ，ｔ）−匿名性を判定する。ここでは、一般化ノード（０，２）は、（ｋ，ｔ）−匿名性を満たさないものとする。 Since the generalized node (1, 2) satisfies (k, t) -anonymity, the local minimum search unit 143 more specificizes the element corresponding to the age attribute from the generalized node (1, 2). The (k, t) -anonymity of the generalized node (0, 2) shifted in the direction to be determined is determined. Here, it is assumed that the generalized node (0, 2) does not satisfy (k, t) -anonymity.

一般化ノード（０，２）は、（ｋ，ｔ）−匿名性を満たさないため、極小元探索部１４３は、一般化ノード（１，２）から、性別の属性に対応する要素をより具体化する方向にシフトさせた一般化ノード（１，１）の（ｋ，ｔ）−匿名性を判定する。ここでは、一般化ノード（１，１）は、（ｋ，ｔ）−匿名性を満たすものとする。 Since the generalized node (0, 2) does not satisfy (k, t) -anonymity, the local minimum search unit 143 more specifically identifies the element corresponding to the gender attribute from the generalized node (1, 2). The (k, t) -anonymity of the generalized node (1, 1) shifted in the direction to be converted is determined. Here, it is assumed that the generalized node (1, 1) satisfies (k, t) -anonymity.

一般化ノード（１，１）が（ｋ，ｔ）−匿名性を満たすため、極小元探索部１４３は、当該一般化ノード（１，１）から、性別の属性に対応する要素をより具体化する方向にシフトさせた一般化ノード（１，０）の（ｋ，ｔ）−匿名性を判定する。ここでは、一般化ノード（１，０）は、（ｋ，ｔ）−匿名性を満たさないものとする。 Since the generalized node (1, 1) satisfies (k, t) -anonymity, the local minimum search unit 143 more specificizes the element corresponding to the gender attribute from the generalized node (1, 1). The (k, t) -anonymity of the generalized node (1, 0) shifted in the direction to be determined is determined. Here, it is assumed that the generalized node (1, 0) does not satisfy (k, t) -anonymity.

なお、一般化ノード（０，１）は、一般化ノード（０，２）の判定結果に基づき、（ｋ，ｔ）−匿名性を満たさないことが既に判定されている。これは、一般化ノード（０，１）が、一般化ノード（０，２）のうち、年齢の属性に対応する要素をさらに具体化した一般化ノードに相当するためである。具体的には、（ｋ，ｔ）−匿名性を満たさない一般化ノードを、さらに具体化した他の一般化ノードは、（ｋ，ｔ）−匿名性を満たさない。そのため、一般化ノード（０，２）が（ｋ，ｔ）−匿名性を満たさないと判定された場合には、当該一般化ノード（０，２）をさらに具体化した一般化ノード（０，１）は、当該一般化ノード（０，２）の判定結果をもって、（ｋ，ｔ）−匿名性を満たさないこととなる。 Note that it is already determined that the generalized node (0, 1) does not satisfy (k, t) -anonymity based on the determination result of the generalized node (0, 2). This is because the generalized node (0, 1) corresponds to a generalized node further embodying elements corresponding to the age attribute among the generalized nodes (0, 2). Specifically, the generalized node that does not satisfy (k, t) -anonymity, and another generalized node that is further specified does not satisfy (k, t) -anonymity. For this reason, when it is determined that the generalized node (0, 2) does not satisfy (k, t) -anonymity, the generalized node (0, 2) that further specifies the generalized node (0, 2). 1) does not satisfy (k, t) -anonymity with the determination result of the generalized node (0, 2).

ここで、一般化ノード（１，１）をａとした場合に、ａ＞ｂの関係を有する一般化ノードｂは、（０，１）及び（１，０）となる。このとき、（ｋ，ｔ）−匿名性を判定する処理をクエリ関数ｑとした場合に、ａが極小元であるとは、ｑ（ａ）＝１であり、かつ、任意のａ＞ｂ∈Ｌに対してｑ（ｂ）＝０が成り立つことであった。なお、一般化ノード（０，１）については、実際にｑ（ａ）を計算しなくても、前述の通り、一般化ノード（０，２）の判定結果に基づき、ｑ（ａ）＝０であることは自明である。そのため、極小元探索部１４３は、一般化ノード（１，１）を極小元と判定し、当該一般化ノード（１，１）を出力する。 Here, when a is a generalized node (1, 1), generalized nodes b having a> b relationship are (0, 1) and (1, 0). At this time, when the process of determining (k, t) -anonymity is a query function q, a is a minimal element, q (a) = 1, and any a> b∈ Q (b) = 0 holds for L. As for the generalized node (0, 1), q (a) = 0 based on the determination result of the generalized node (0, 2) as described above without actually calculating q (a). It is self-evident. Therefore, the minimal element search unit 143 determines the generalized node (1, 1) as the minimal element, and outputs the generalized node (1, 1).

以上、図１８及び図１９を参照して、極小元探索部１４３が、一般化戦略からなる束における各一般化ノードから、（ｋ，ｔ）−匿名性を満たす極小元を探索する処理の一例について説明した。 As described above, with reference to FIG. 18 and FIG. 19, an example of processing in which the minimal element search unit 143 searches for a minimal element satisfying (k, t) -anonymity from each generalized node in a bundle composed of generalization strategies. Explained.

［４．５．一連の処理の流れ］
次に、図２０及び図２１を参照して、前述した解析部１４が、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす極小元を探索し、探索された一連の極小元それぞれに対応する一般化戦略を出力する処理の一例について説明する。図２０及び図２１は、解析部１４の一連の処理の流れの一例について説明するための説明図である。 [4.5. Flow of a series of processing]
Next, referring to FIG. 20 and FIG. 21, the analysis unit 14 described above searches for a minimum element satisfying (k, t) -anonymity from a bundle of generalization strategies, and a series of searched minimums An example of a process for outputting a generalization strategy corresponding to each element will be described. 20 and 21 are explanatory diagrams for explaining an example of a flow of a series of processes of the analysis unit 14.

図２０は、解析部１４の一連の処理の一例を示した「All-MSS（All Most Specific Sentences）」関数のアルゴリズムを、所謂プログラムコードの形式で模式的に提示した図である。なお、図２０に示す例は、処理の流れをわかりやすくするために一部の情報を抽象化して示しており、必ずしも特定のプログラム言語のコンパイラに基づきコンパイルできる形式で示したものではない。 FIG. 20 is a diagram schematically showing an algorithm of an “All-MSS (All Most Specific Sentences)” function showing an example of a series of processes of the analysis unit 14 in a so-called program code format. Note that the example shown in FIG. 20 abstracts a part of information for easy understanding of the processing flow, and is not necessarily shown in a format that can be compiled based on a compiler of a specific programming language.

図２０に示した「All-MSS」関数は、入力として、（ｋ，ｔ）−匿名性の閾値（即ち、ｋ及びｔ）と、変数Ｔで示された処理対象となるテーブルｄ１０と、変数（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ）で示された各属性の一般化階層を示す情報とを取得する。なお、（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ）の定義は、図１１を参照して前述した、属性値を一般化するための「getFrequentSets」関数の場合と同様である。 The “All-MSS” function shown in FIG. 20 has (k, t) -anonymity thresholds (that is, k and t) as input, a table d10 to be processed indicated by a variable T, and a variable Information indicating the generalized hierarchy of each attribute indicated by (H ₁ , H ₂ ,..., H _n ) is acquired. The definition of (H ₁ , H ₂ ,..., H _n ) is the same as that of the “getFrequentSets” function for generalizing attribute values described above with reference to FIG.

また、図２１は、図２０に示した「All-MSS」関数の一連の処理の流れを示したフローチャートである。以下に、図２１を参照して、「All-MSS」関数の一連の処理の流れ、即ち、解析部１４の一連の処理の流れの一例について説明する。なお、本説明では、図５及び図６に示した属性の一般化階層に基づく一般化戦略からなる束（図７参照）を適用したものとして説明する。なお、この場合には、一般化戦略からなる束の各一般化ノードは、図１４及び図１６に示した離散集合から誘導される束の元に対応付けられるものとする。 FIG. 21 is a flowchart showing a flow of a series of processes of the “All-MSS” function shown in FIG. Hereinafter, with reference to FIG. 21, an example of a flow of a series of processes of the “All-MSS” function, that is, an example of a series of processes of the analysis unit 14 will be described. In this description, it is assumed that a bundle (see FIG. 7) composed of generalization strategies based on the attribute generalization hierarchy shown in FIGS. 5 and 6 is applied. In this case, it is assumed that each generalized node of the bundle composed of the generalization strategy is associated with the element of the bundle derived from the discrete set shown in FIGS. 14 and 16.

（ステップＳ１０１）
解析部１４は、ＤＢ情報取得部１１から処理対象となるテーブルｄ１０（即ち、Ｔ）の情報を取得する。また、解析部１４は、階層情報取得部１３から、処理対象となるテーブルｄ１０の各属性のうち、属性値の一般化（抽象化）の対象となる属性を示す情報と、当該属性それぞれの一般化階層を示す情報（即ち、（Ｈ_１，Ｈ_２，・・・，Ｈ_ｎ））とを取得する。また、解析部１４は、階層情報取得部１３から、匿名性の条件を示す情報として、（ｋ，ｔ）−匿名性の閾値（即ち、ｋ及びｔ）を取得する。 (Step S101)
The analysis unit 14 acquires information on the table d10 (that is, T) to be processed from the DB information acquisition unit 11. Further, the analysis unit 14 receives information indicating an attribute that is a target of generalization (abstraction) of the attribute value from among the attributes of the table d10 to be processed from the hierarchy information acquisition unit 13, and general information of each of the attributes. Information indicating the categorization hierarchy (that is, (H ₁ , H ₂ ,..., H _n )) is acquired. Further, the analysis unit 14 acquires (k, t) -anonymity thresholds (that is, k and t) as information indicating anonymity conditions from the hierarchy information acquisition unit 13.

そして、解析データ生成部１４１は、既定した各一般化戦略を元として、各一般化戦略が要素として有する各属性の一般化階層の高さをより一般化（抽象化）する順序関係（換言すると、各属性の分類をより抽象化する順序関係）に基づき、一般化戦略からなる束を規定する。 Then, the analysis data generation unit 141 is based on each predetermined generalization strategy, and the order relation (in other words, the generalization hierarchy of each attribute that each generalization strategy has as an element is more generalized (in other words, abstracted). And a bundle of generalization strategies based on an order relation that abstracts the classification of each attribute.

（ステップＳ１０３）
極小元探索部１４３は、まず、一般化戦略からなる束の最大元を基準点として、「AMSS」関数に基づき、（ｋ，ｔ）−匿名性を満たす極小元ｖを取得する。具体的には、極小元探索部１４３は、図１９に基づき前述したように、一般化戦略からなる束の最大元（２，２）を基準点とした場合に、（ｋ，ｔ）−匿名性を満たす極小元ｖ＝（１，１）を取得する。極小元探索部１４３は、（ｋ，ｔ）−匿名性を満たす極小元の探索結果として、取得した極小元ｖ＝（１，１）を基準点導出部１４５に通知する。 (Step S103)
First, the minimal element search unit 143 obtains a minimal element v satisfying (k, t) -anonymity based on the “AMSS” function with the maximum element of a bundle composed of generalized strategies as a reference point. Specifically, as described above with reference to FIG. 19, the minimal element search unit 143 uses (k, t) -anonymity when the maximum element (2, 2) of the bundle composed of the generalization strategy is used as a reference point. The minimal element v = (1, 1) satisfying the property is acquired. The minimal element searching unit 143 notifies the reference point deriving unit 145 of the acquired minimal element v = (1, 1) as a minimum element search result that satisfies (k, t) -anonymity.

（ステップＳ１０５）
基準点導出部１４５は、極小元探索部１４３から、（ｋ，ｔ）−匿名性を満たす極小元の探索結果として、極小元ｖ＝（１，１）を取得する。基準点導出部１４５は、取得した極小元ｖを入力として「set」関数に基づき、当該極小元ｖに対応する、離散集合から誘導される束の元Ｖを特定する。 (Step S105)
The reference point deriving unit 145 acquires the minimal element v = (1, 1) as the minimal element search result satisfying (k, t) -anonymity from the minimal element searching unit 143. The reference point deriving unit 145 specifies the bundle element V derived from the discrete set corresponding to the minimal element v based on the “set” function based on the acquired minimal element v.

例えば、図２２は、基準点導出部１４５の処理の一例について説明するための説明図であり、基準点導出部１４５が、取得した極小元ｖ＝（１，１）に対応する、離散集合から誘導される束の元を特定する処理を模式的に示している。なお、図２２において、参照符号ｄ５２は、一般化戦略からなる束に示された境界ｄ５０を、離散集合から誘導される束側に示したものである。即ち、境界ｄ５２から最小元側（即ち、図面の下側）の各元を、一般化戦略からなる束の一般化ノードに変換した場合に、当該一般化ノードは（ｋ，ｔ）−匿名性を満たすこととなる。また、境界ｄ５２から最大元側（即ち、図面の上側）の各元を、一般化戦略からなる束の一般化ノードに変換した場合に、当該一般化ノードは（ｋ，ｔ）−匿名性を満たさないこととなる。 For example, FIG. 22 is an explanatory diagram for explaining an example of the processing of the reference point deriving unit 145, and the reference point deriving unit 145 starts from the discrete set corresponding to the acquired minimal element v = (1, 1). The process which specifies the origin of the induced | guided | derived bundle is typically shown. In FIG. 22, reference numeral d52 indicates the boundary d50 shown in the bundle made up of the generalization strategy on the bundle side derived from the discrete set. That is, when each element on the minimum element side (that is, the lower side of the drawing) from the boundary d52 is converted into a generalized node of a bundle composed of the generalized strategy, the generalized node is (k, t) -anonymity. Will be satisfied. Further, when each element on the maximum element side (that is, the upper side of the drawing) from the boundary d52 is converted into a generalized node of a bundle composed of a generalization strategy, the generalized node has (k, t) -anonymity. It will not satisfy.

即ち、基準点導出部１４５は、取得した極小元ｖ＝（１，１）に対応する、離散集合から誘導される束の元Ｖとして、Ｖ＝｛１，３｝を特定する。 That is, the reference point deriving unit 145 specifies V = {1, 3} as the bundle element V derived from the discrete set corresponding to the acquired minimal element v = (1, 1).

次に、基準点導出部１４５は、特定した離散集合から誘導される束の元Ｖ＝｛１，３｝の補集合を導出し、導出した補集合を変数Ｃで示された集合に代入する。例えば、図２３は、基準点導出部１４５の処理の一例について説明するための説明図である。図２３に示す例では、基準点導出部１４５は、離散集合から誘導される束の元Ｖ＝｛１，３｝の補集合に対応する元｛２，４｝を導出し、導出した元｛２，４｝を集合Ｃに代入する。 Next, the reference point deriving unit 145 derives a complement of the bundle element V = {1, 3} derived from the identified discrete set, and substitutes the derived complement into the set indicated by the variable C. . For example, FIG. 23 is an explanatory diagram for explaining an example of processing of the reference point deriving unit 145. In the example shown in FIG. 23, the reference point deriving unit 145 derives an element {2, 4} corresponding to a complement of the bundle element V = {1, 3} derived from the discrete set, and derives the derived element { 2,4} is substituted into set C.

（ステップＳ１０７）
次いで、基準点導出部１４５は、変数Ｄで示された集合を、空集合で初期化する。なお、当該集合Ｄは、離散集合から誘導される束の各元のうち、後述する処理が既に完了している元を保持するための集合であり、当該元に対して再度処理が実行されるのを防ぐために設けられている。 (Step S107)
Next, the reference point deriving unit 145 initializes the set indicated by the variable D with an empty set. Note that the set D is a set for holding elements of bundles derived from the discrete sets that have already been processed, and the process is performed again on the elements. It is provided to prevent this.

（ステップＳ１０９）
基準点導出部１４５は、離散集合から誘導される束の元が示す正の整数の集合から、極小横断に相当する部分集合を抽出する関数Ｔｒに対して、導出した補集合Ｃを入力する。そして、基準点導出部１４５は、関数Ｔｒの出力として、補集合Ｃの極小横断に相当する部分集合Ｔｒ（Ｃ）を取得し、当該部分集合Ｔｒ（Ｃ）の要素のうち、集合Ｄに含まれない要素に対応する離散集合から誘導される束の元を、変数Ｘで示された元として抽出する。 (Step S109)
The reference point deriving unit 145 inputs the derived complement C to the function Tr that extracts a subset corresponding to the minimum crossing from the set of positive integers indicated by the bundle elements derived from the discrete set. Then, the reference point deriving unit 145 acquires a subset Tr (C) corresponding to the minimal crossing of the complement C as an output of the function Tr, and is included in the set D among the elements of the subset Tr (C). The bundle element derived from the discrete set corresponding to the non-element is extracted as the element indicated by the variable X.

具体的な一例として、図２３に示すように、基準点導出部１４５は、離散集合から誘導される束の元｛２，４｝の極小横断に相当する部分集合Ｔｒ（Ｃ）として、Ｔｒ（Ｃ）＝｛｛２｝，｛４｝｝を取得する。 As a specific example, as shown in FIG. 23, the reference point deriving unit 145 uses Tr (C) as a subset Tr (C) corresponding to the minimal crossing of the element {2, 4} of the bundle derived from the discrete set. C) = {{2}, {4}} is acquired.

なお、関数Ｔｒは、入力された集合（例えば、補集合Ｃ）をハイパーエッジとするハイパーグラフの双対を計算する、所謂ハイパーグラフ双対化（Hypergraph Dualization）のアルゴリズムとして知られる処理に相当する。そのため、ハイパーグラフ双対化のアルゴリズムに基づく処理であれば、関数Ｔｒの処理内容は特に限定されない。また、ハイパーグラフ双対化のアルゴリズムに基づく処理に限らず、入力された正の整数の集合から、極小横断に相当する部分集合を抽出することが可能であれば、関数Ｔｒの処理内容は特に限定されないことは言うまでもない。 The function Tr corresponds to a process known as a so-called hypergraph dualization algorithm for calculating a hypergraph dual with the input set (for example, the complementary set C) as a hyper edge. Therefore, the processing content of the function Tr is not particularly limited as long as the processing is based on the hypergraph dualization algorithm. Further, the processing contents of the function Tr are not particularly limited as long as it is possible to extract a subset corresponding to the minimum crossing from the set of input positive integers, not limited to the processing based on the hypergraph dualization algorithm. It goes without saying that it is not done.

（ステップＳ１１１）
離散集合から誘導される束の元Ｘが存在する場合には（ステップＳ１０９、ＹＥＳ）、基準点導出部１４５は、「vec」関数に基づき、当該元Ｘに対応する、一般化戦略からなる束の一般化ノードｖを特定する。 (Step S111)
When there is a bundle element X derived from the discrete set (step S109, YES), the reference point deriving unit 145 is based on the “vec” function and includes a bundle composed of generalization strategies corresponding to the element X. Identifies a generalized node v.

例えば、図２４は、基準点導出部１４５の処理の一例について説明するための説明図である。図２４に示す例では、基準点導出部１４５が、「vec」関数に基づき、離散集合から誘導される束の元｛２｝及び元｛４｝それぞれについて、対応する一般化戦略からなる束の一般化ノードｖを特定している。具体的には、基準点導出部１４５は、離散集合から誘導される束の元｛２｝に対応する一般化ノードとして、一般化ノードｖ＝（０，２）を特定する。同様に、基準点導出部１４５は、離散集合から誘導される束の元｛４｝に対応する一般化ノードとして、一般化ノードｖ＝（２，０）を特定する。 For example, FIG. 24 is an explanatory diagram for explaining an example of processing of the reference point deriving unit 145. In the example shown in FIG. 24, the reference point deriving unit 145 uses a “vec” function to generate a bundle consisting of a corresponding generalized strategy for each of the elements {2} and {4} of the bundle derived from the discrete set. A generalized node v is specified. Specifically, the reference point deriving unit 145 identifies the generalized node v = (0, 2) as the generalized node corresponding to the bundle element {2} derived from the discrete set. Similarly, the reference point deriving unit 145 identifies the generalized node v = (2, 0) as the generalized node corresponding to the bundle element {4} derived from the discrete set.

（ステップＳ１１３）
そして、基準点導出部１４５は、特定した一般化ノードｖが、（ｋ，ｔ）−匿名性を満たすか否かを判定する。なお、（ｋ，ｔ）−匿名性を満たすか否かの判定には、前述した「kAnonymityCheck」関数を適用すればよい。 (Step S113)
Then, the reference point deriving unit 145 determines whether or not the identified generalized node v satisfies (k, t) -anonymity. Note that the above-described “kAnonymityCheck” function may be applied to determine whether or not (k, t) −anonymity is satisfied.

（ステップＳ１１９）
特定した一般化ノードｖが（ｋ，ｔ）−匿名性を満たさない場合には（ステップＳ１１３、ＮＯ）、基準点導出部１４５は、当該一般化ノードｖの特定元である、離散集合から誘導される束の元Ｘ（即ち、「vec」関数に入力した元Ｘ）を集合Ｄに加える。例えば、図２４に示す例では、離散集合から誘導される束の元｛２｝に対応する一般化ノードｖ＝（０，２）は、（ｋ，ｔ）−匿名性を満たさない。そのため、基準点導出部１４５は、一般化ノードｖ＝（０，２）の特定元である、離散集合から誘導される束の元Ｘ＝｛２｝を集合Ｄに加える。 (Step S119)
When the identified generalized node v does not satisfy (k, t) -anonymity (NO in step S113), the reference point deriving unit 145 derives from the discrete set that is the identification source of the generalized node v. The bundle element X (ie, the element X input to the “vec” function) is added to the set D. For example, in the example shown in FIG. 24, the generalized node v = (0, 2) corresponding to the bundle element {2} derived from the discrete set does not satisfy (k, t) -anonymity. Therefore, the reference point deriving unit 145 adds the bundle element X = {2} derived from the discrete set, which is the identification element of the generalized node v = (0, 2), to the set D.

（ステップＳ１１５）
これに対して、特定した一般化ノードｖが（ｋ，ｔ）−匿名性を満たす場合には（ステップＳ１１３、ＹＥＳ）、基準点導出部１４５は、当該一般化ノードｖを新たな基準点の候補として、極小元探索部１４３に出力する。例えば、図２４に示す例では、離散集合から誘導される束の元｛４｝に対応する一般化ノードｖ＝（２，０）は、（ｋ，ｔ）−匿名性を満たす。そのため、基準点導出部１４５は、一般化ノードｖ＝（２，０）を、新たな基準点の候補として、極小元探索部１４３に出力する。 (Step S115)
On the other hand, when the specified generalized node v satisfies (k, t) -anonymity (step S113, YES), the reference point deriving unit 145 sets the generalized node v as a new reference point. The candidate is output to the local minimum search unit 143 as a candidate. For example, in the example shown in FIG. 24, the generalized node v = (2, 0) corresponding to the bundle element {4} derived from the discrete set satisfies (k, t) -anonymity. Therefore, the reference point deriving unit 145 outputs the generalized node v = (2, 0) as a new reference point candidate to the local minimum search unit 143.

この新たな基準点の候補の出力を受けて、極小元探索部１４３は、当該候補を基準点として新たに（ｋ，ｔ）−匿名性を満たす極小元ｗを探索する。そして、極小元探索部１４３は、（ｋ，ｔ）−匿名性を満たす極小元の探索結果として、探索された極小元ｗを基準点導出部１４５に通知する。 Upon receiving the output of the new reference point candidate, the local minimum element search unit 143 searches the local minimum element w that satisfies (k, t) -anonymity with the candidate as a reference point. Then, the minimal element search unit 143 notifies the reference point derivation unit 145 of the searched minimal element w as a minimal element search result that satisfies (k, t) -anonymity.

（ステップＳ１１７）
基準点導出部１４５は、取得した極小元ｗに対応する、離散集合から誘導される束の元Ｖを特定し、特定した元Ｖの補集合を集合Ｃに加える。 (Step S117)
The reference point deriving unit 145 identifies the bundle element V derived from the discrete set corresponding to the acquired minimal element w, and adds the complement of the identified element V to the set C.

（ステップＳ１０９）
そして、極小元探索部１４３及び基準点導出部１４５は、ステップＳ１１１〜Ｓ１１７に示された一連の処理を、部分集合Ｔｒ（Ｃ）の要素のうち、集合Ｄに含まれない要素に対応する離散集合から誘導される束の元Ｘが存在する限り継続する（ステップＳ１０９、ＹＥＳ）。 (Step S109)
Then, the minimal element search unit 143 and the reference point deriving unit 145 perform the series of processing shown in steps S111 to S117 for discrete elements corresponding to elements not included in the set D among the elements of the subset Tr (C). The process continues as long as there is a bundle element X derived from the set (YES in step S109).

例えば、図２５は、極小元探索部１４３の処理の一例について説明するための説明図であり、極小元探索部１４３が、一般化ノードｖ＝（２，０）を新たな基準点として、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合について示している。なお、図２５に示す例では、一般化ノードｖ＝（２，０）自体が、（ｋ，ｔ）−匿名性を満たす極小元に相当する。 For example, FIG. 25 is an explanatory diagram for explaining an example of processing of the minimal element search unit 143. The minimal element search unit 143 uses the generalized node v = (2, 0) as a new reference point ( k, t) —shows a case where a minimal element satisfying anonymity is searched. In the example shown in FIG. 25, the generalized node v = (2, 0) itself corresponds to a minimal element that satisfies (k, t) -anonymity.

そして、離散集合から誘導される束の元Ｘが存在しない場合には（ステップＳ１０９、ＮＯ）、極小元探索部１４３は、探索された一連の（ｋ，ｔ）−匿名性を満たす極小元それぞれに対応する一般化戦略を、解析結果出力部１５に出力する。 When there is no bundle element X derived from the discrete set (step S109, NO), the minimal element search unit 143 determines each of the minimal elements satisfying the searched series of (k, t) -anonymity. Is output to the analysis result output unit 15.

例えば、図２６は、極小元探索部１４３の処理結果の一例について説明するための説明図であり、図５及び図６に示した属性の一般化階層に基づく一般化戦略からなる束（図７参照）から探索された、（ｋ，ｔ）−匿名性を満たす極小元の一例を示している。即ち、図２６に示す例では、一般化ノード（１，１）及び（２，０）が、（ｋ，ｔ）−匿名性を満たす極小元に相当する。そのため、極小元探索部１４３は、一般化ノード（１，１）及び（２，０）に対応する一般化戦略を、解析結果出力部１５に出力することとなる。 For example, FIG. 26 is an explanatory diagram for explaining an example of the processing result of the local minimum search unit 143, and is a bundle of generalization strategies based on the generalization hierarchy of attributes shown in FIGS. 5 and 6 (FIG. 7). An example of a local minimum satisfying (k, t) -anonymity searched from (see) is shown. In other words, in the example illustrated in FIG. 26, the generalized nodes (1, 1) and (2, 0) correspond to local minimums that satisfy (k, t) -anonymity. Therefore, the minimal element search unit 143 outputs the generalization strategy corresponding to the generalized nodes (1, 1) and (2, 0) to the analysis result output unit 15.

なお、ステップＳ１０９として示したように、解析部１４は、補集合Ｃをハイパーエッジとするハイパーグラフの双対を計算する処理を繰り返し計算することとなる。そのため、同処理として、ハイパーグラフの双対化が逐次的に実行可能なアルゴリズム、即ち、新たなハイパーエッジが追加されたとしても、既に双対を計算したハイパーグラフについては、再計算を必要としないアルゴリズムを適用してもよい。このようなアルゴリズムを適用することで、解析部１４は、（ｋ，ｔ）−匿名性を満たす極小元の探索に係る計算量をより削減することが可能となる。 As shown in step S109, the analysis unit 14 repeatedly calculates a process of calculating a hypergraph dual with the complement C as a hyper edge. Therefore, as the same processing, an algorithm that can sequentially execute hypergraph dualization, that is, an algorithm that does not require recalculation for a hypergraph that has already been calculated for a dual even if a new hyper edge is added. May be applied. By applying such an algorithm, the analysis unit 14 can further reduce the amount of calculation related to the search for the minimal element that satisfies (k, t) -anonymity.

以上、図２０及び図２１を参照して、前述した解析部１４が、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす極小元を探索し、探索された一連の極小元それぞれに対応する一般化戦略を出力する処理の一例について説明した。 As described above, with reference to FIGS. 20 and 21, the analysis unit 14 described above searches for a minimum element satisfying (k, t) -anonymity from a bundle of generalization strategies, and a series of searched minimum elements An example of the process of outputting the generalization strategy corresponding to each has been described.

＜５．計算量＞
次に、図２７を参照して、本実施形態に係る情報処理装置１０が、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合の、当該探索に係る処理の計算量について説明する。図２７は、本実施形態に係る情報処理装置１０が、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合の、当該探索に係る処理の計算量について説明するための説明図である。なお、本説明では、情報処理装置１０が、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合の計算量を、（ｋ，ｔ）−匿名性を満たすか否かを判定するクエリ関数ｑ（即ち、「kAnonymityCheck」関数）の実行回数として説明する。また、図２７は、一例として、図５及び図６に示した「年齢」及び「性別」を示す属性の一般化階層に基づく一般化戦略からなる束を示している。 <5. Calculation amount>
Next, with reference to FIG. 27, the information processing apparatus 10 according to the present embodiment performs the search when searching for a minimal element satisfying (k, t) -anonymity from a bundle of generalization strategies. The calculation amount of such processing will be described. FIG. 27 shows the calculation amount of processing related to the search when the information processing apparatus 10 according to the present embodiment searches for a minimal element satisfying (k, t) -anonymity from a bundle of generalization strategies. It is explanatory drawing for demonstrating. In this description, a query for determining whether or not (k, t) -anonymity is satisfied is the amount of calculation when the information processing apparatus 10 searches for a minimal element that satisfies (k, t) -anonymity. The number of executions of the function q (that is, the “kAnonymityCheck” function) will be described. FIG. 27 shows, as an example, a bundle of generalization strategies based on the generalization hierarchy of attributes indicating “age” and “gender” shown in FIGS. 5 and 6.

まず、情報処理装置１０が、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合の計算量を説明するための用語として、「入力のサイズ」、「出力のサイズ」、「ボーダー」、及び「束の幅」の定義について説明する。 First, as terms for explaining the amount of calculation when the information processing apparatus 10 searches for a minimal element satisfying (k, t) -anonymity from a bundle of generalization strategies, “input size”, “ The definitions of “output size”, “border”, and “bundle width” will be described.

（入力のサイズ）
「入力のサイズ」とは、一般化戦略の表現長に相当する。入力のサイズＲｉは、一般化の対象となる属性の数ｍと、各属性の一般化階層における階層の数ｍ_ｉとに基づき、以下に示す（式１）に基づき規定される。 (Input size)
The “input size” corresponds to the expression length of the generalization strategy. Size Ri inputs, the number m of attributes to be generalized, based on the number m _i of the hierarchy in general hierarchy of each attribute is defined based on the following equation (1).

例えば、図５示した「年齢」を示す属性の一般化階層は、高さが０、１、２の３つの階層を有しており、階層の数ｍ_ｉ＝３となる。同様に、図６示した「性別」を示す属性の一般化階層は、高さが０、１、２の３つの階層を有しており、階層の数ｍ_ｉ＝３となる。そのため、「年齢」及び「性別」を示す属性の属性値を一般化するための一般化戦略の数は、図２７に示した、（０，０），（０，１），・・・，（１，２），（２，２）の９つとなる。したがって、図２７に示す例では、入力のサイズＲｉ＝ｌｏｇ９となる。 For example, the generalized hierarchy of the attribute indicating “age” shown in FIG. 5 has three hierarchies with heights 0, 1, and 2, and the number of hierarchies m _i = 3. Similarly, the generalized hierarchies of the attribute indicating “sex” shown in FIG. 6 have three hierarchies of height 0, 1, and 2, and the number of hierarchies m _i = 3. Therefore, the number of generalization strategies for generalizing attribute values of attributes indicating “age” and “gender” is (0, 0), (0, 1),. There are nine (1,2,) and (2,2). Therefore, in the example shown in FIG. 27, the input size Ri = log9.

（出力のサイズ）
「出力のサイズ」とは、情報処理装置１０が出力する一般化戦略の数に相当し、即ち、情報処理装置１０により探索された（ｋ，ｔ）−匿名性を満たす極小元の数に相当する。 (Output size)
The “size of output” corresponds to the number of generalization strategies output by the information processing apparatus 10, that is, corresponds to the number of minimal elements satisfying (k, t) -anonymity searched by the information processing apparatus 10 To do.

図２７に示す例において、境界ｄ５０を、（ｋ，ｔ）−匿名性を満たす元と、（ｋ，ｔ）−匿名性を満たさない元の境界とした場合に、（ｋ，ｔ）−匿名性を満たす極小元は、一般化ノード（１，１）及び（２，０）の２つとなる。即ち、図２７に示す例では、出力のサイズは「２」となる。 In the example shown in FIG. 27, when the boundary d50 is an original boundary that satisfies (k, t) -anonymity and an original boundary that does not satisfy (k, t) -anonymity, (k, t) -anonymity There are two minimal elements satisfying the characteristics, generalized nodes (1, 1) and (2, 0). That is, in the example shown in FIG. 27, the output size is “2”.

（ボーダー）
「ボーダー」とは、束上の境界にあたる一般化戦略の集合に相当し、「正のボーダー」と、「負のボーダー」とに分けられる。 (border)
The “border” corresponds to a set of generalized strategies corresponding to the boundaries on the bundle, and is divided into a “positive border” and a “negative border”.

「正のボーダー」とは、（ｋ，ｔ）−匿名性を満たす極小な一般化戦略（即ち、（ｋ，ｔ）−匿名性を満たす極小元）の集合に相当する。図２７に示す例では、一般化ノード（１，１）及び（２，０）の集合が、正のボーダーに相当する。即ち、正ボーダーのサイズと、出力のサイズとは一致することとなる。 The “positive border” corresponds to a set of minimal generalization strategies satisfying (k, t) -anonymity (that is, (k, t) -minimum elements satisfying anonymity). In the example shown in FIG. 27, a set of generalized nodes (1, 1) and (2, 0) corresponds to a positive border. That is, the size of the main border and the output size are the same.

「負のボーダー」とは、（ｋ，ｔ）−匿名性を満たさない極大な一般化戦略の集合に相当する。図２７に示す例では、一般化ノード（０，２）及び（１，０）の集合が、負のボーダーに相当する。 A “negative border” corresponds to a set of maximal generalization strategies that do not satisfy (k, t) -anonymity. In the example shown in FIG. 27, a set of generalized nodes (0, 2) and (1, 0) corresponds to a negative border.

（束の幅）
「束の幅」とは、束上の元のうち、「行き先」が最も多い元の当該行き先の数に相当する。ここで、一般化ノードＢが、一般化ノードＡの「行き先」であるとは、順序関係において、Ａ＞Ｂであり、かつ、Ａ＞Ｃ＞Ｂを満たす一般化ノードＣが存在しないことを意味する。なお、ここでノード間の関係を示す記号として「＞」を用いたが、この記号は、数値間の大小関係を示す記号ではなく、ノード間の順序関係を示しており、所謂「succeed」と呼ばれる記号に相当する。また、以降の説明では、ノード間の関係を示す記号として、「＞」、「＜」、「≧」、及び「≦」を使用した場合には、これらの記号は、数値間の大小関係を示す記号ではなく、ノード間の順序関係を示しているものとする。 (Bundle width)
The “bundle width” corresponds to the number of destinations having the largest “destination” among the sources on the bundle. Here, the generalized node B is the “destination” of the generalized node A. In the order relation, A> B and there is no generalized node C that satisfies A>C> B. means. Here, “>” is used as a symbol indicating a relationship between nodes, but this symbol indicates an order relationship between nodes, not a symbol indicating a magnitude relationship between numerical values, so-called “succeed”. It corresponds to the symbol called. In the following description, when “>”, “<”, “≧”, and “≦” are used as symbols indicating the relationship between the nodes, these symbols indicate the magnitude relationship between the numerical values. It is assumed that the order relationship between the nodes is shown instead of the symbol.

具体的には、図２７に示す例において、一般化ノード（１，２）の行き先は、一般化ノード（０，２）及び（１，１）であり、当該行き先の数は２つとなる。なお、図２７に示す例において、行き先の数が２つを超える一般化ノードは存在しない。そのため、図２７に示す例では、束の幅は「２」となる。 Specifically, in the example shown in FIG. 27, the destinations of the generalized nodes (1, 2) are the generalized nodes (0, 2) and (1, 1), and the number of the destinations is two. In the example shown in FIG. 27, there is no generalized node having more than two destinations. Therefore, in the example shown in FIG. 27, the width of the bundle is “2”.

（計算量）
以上を踏まえ、出力のサイズをＮ、側の幅をｗ、負のボーダーのサイズをＢ⁻とした場合には、情報処理装置１０が、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合の計算量Ｑは、以下に示す（式２）で示される。なお、出力のサイズＮは、入力のサイズと、一般化の対象となるデータに依存する。 (Calculation amount)
Based on the above, when the output size is N, the side width is w, and the negative border size is B ⁻ , the information processing apparatus 10 searches for a minimal element that satisfies (k, t) -anonymity. The calculation amount Q in this case is expressed by the following (Formula 2). The output size N depends on the input size and the data to be generalized.

ここで、（ｋ，ｔ）−匿名性を満たし、かつ、情報量の減少を最小限に抑えることが可能な一般化戦略、即ち、（ｋ，ｔ）−匿名性を満たす一連の極小元を特定する処理は、所謂列挙問題に相当する。このような列挙問題を解く従来のアルゴリズムは、一般的にはヒューリスティックなアルゴリズムに相当し、計算量を事前に算出することが困難であった。 Here, a generalized strategy that satisfies (k, t) -anonymity and can minimize a decrease in the amount of information, that is, a series of minimal elements that satisfy (k, t) -anonymity. The specified process corresponds to a so-called enumeration problem. Conventional algorithms for solving such enumeration problems generally correspond to heuristic algorithms, and it has been difficult to calculate the amount of calculation in advance.

これに対して、本実施形態に係る情報処理装置１０は、ハイパーグラフ双対化のアルゴリズムを、探索元（即ち、（ｋ，ｔ）−匿名性を満たす極小元）を列挙するために利用している。このようなハイパーグラフ双対化のアルゴリズムには、論理的な上限が与えられているものもある。そのため、本実施形態に係る情報処理装置１０が、（ｋ，ｔ）−匿名性を満たし、かつ、情報量の減少を最小限に抑えることが可能な一般化戦略を特定（列挙）する処理の計算量を事前に算出することが可能となる。 In contrast, the information processing apparatus 10 according to the present embodiment uses the hypergraph dualization algorithm to enumerate search sources (that is, (k, t) -minimum elements satisfying anonymity). Yes. Some of these hypergraph dualization algorithms have a logical upper limit. Therefore, the information processing apparatus 10 according to the present embodiment specifies (enumerates) a generalization strategy that satisfies (k, t) -anonymity and can minimize a decrease in the amount of information. It is possible to calculate the calculation amount in advance.

また、本実施形態に係る情報処理装置１０の処理として説明したアルゴリズムに依れば、必ずしも全ての一般化戦略（即ち、一般化ノード）に対して、（ｋ，ｔ）−匿名性を満たすか否かの判定をおこなう必要はない。具体的な一例として、図２０〜図２６を参照して説明した例では、一般化ノード（２，１）に対して、（ｋ，ｔ）−匿名性を満たすか否かの判定を行われていないが、（ｋ，ｔ）−匿名性を満たす極小元（１，１）及び（２，０）を特定している。このように、本実施形態に係る情報処理装置１０の処理として説明したアルゴリズムに依れば、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす一連の極小元を効率よく探索（列挙）し、当該探索に係る処理の負荷を低減することが可能となる。 Further, according to the algorithm described as the processing of the information processing apparatus 10 according to the present embodiment, does (k, t) -anonymity necessarily satisfy all generalization strategies (ie, generalized nodes)? There is no need to determine whether or not. As a specific example, in the example described with reference to FIGS. 20 to 26, it is determined whether or not (k, t) -anonymity is satisfied for the generalized node (2, 1). However, the minimum elements (1, 1) and (2, 0) satisfying (k, t) -anonymity are specified. As described above, according to the algorithm described as the processing of the information processing apparatus 10 according to the present embodiment, a series of minimal elements satisfying (k, t) -anonymity is efficiently searched from a bundle of generalization strategies. (Enumeration) and the processing load related to the search can be reduced.

以上、図２７を参照して、本実施形態に係る情報処理装置１０が、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす極小元を探索する場合の、当該探索に係る処理の計算量について説明した。 As described above, with reference to FIG. 27, the information processing apparatus 10 according to the present embodiment relates to the search when searching for a minimal element satisfying (k, t) -anonymity from a bundle of generalization strategies. The calculation amount of processing has been described.

＜６．変形例＞
次に、本実施形態に係る情報処理システム１の変形例について説明する。 <6. Modification>
Next, a modified example of the information processing system 1 according to the present embodiment will be described.

［６．１．変形例１：一般化戦略の探索に係る処理負荷の軽減］
まず、変形例１として、図２８及び図２９を参照して、情報処理装置１０の解析部１４が、一般化戦略からなる束から（ｋ，ｔ）−匿名性を満たす一連の極小元を探索する処理の他の一例について説明する。図２８及び図２９は、変形例１に係る解析部１４の一連の処理の流れの一例について説明するための説明図である。 [6.1. Modification 1: Reduction of processing load related to search for generalization strategy]
First, as Modification 1, with reference to FIG. 28 and FIG. 29, the analysis unit 14 of the information processing apparatus 10 searches for a series of minimal elements satisfying (k, t) -anonymity from a bundle of generalization strategies. Another example of the processing to be performed will be described. 28 and 29 are explanatory diagrams for describing an example of a flow of a series of processes of the analysis unit 14 according to the first modification.

図２８は、変形例１に係る解析部１４の一連の処理の一例を示した「All-MSS」関数のアルゴリズムを、所謂プログラムコードの形式で模式的に提示した図である。なお、図２８に示す例は、処理の流れをわかりやすくするために一部の情報を抽象化して示しており、必ずしも特定のプログラム言語のコンパイラに基づきコンパイルできる形式で示したものではない。また、図２９は、図２８に示した「All-MSS」関数の一連の処理の流れを示したフローチャートである。 FIG. 28 is a diagram schematically showing an algorithm of the “All-MSS” function showing an example of a series of processes of the analysis unit 14 according to the first modification in a so-called program code format. In the example shown in FIG. 28, some information is abstracted for easy understanding of the processing flow, and is not necessarily shown in a format that can be compiled based on a compiler of a specific programming language. FIG. 29 is a flowchart showing a flow of a series of processes of the “All-MSS” function shown in FIG.

なお、変形例１に係る解析部１４は、図２０及び図２１を参照して前述した実施形態に係る解析部１４の処理の一部を変更することで、（ｋ，ｔ）−匿名性を満たす一連の極小元を探索する処理の処理量を低減している。 In addition, the analysis part 14 which concerns on the modification 1 changes (k, t) -anonymity by changing a part of process of the analysis part 14 which concerns on embodiment mentioned above with reference to FIG.20 and FIG.21. The amount of processing for searching for a series of minimal elements to be satisfied is reduced.

具体的には、変形例１に係る解析部１４は、一般化ノードが（ｋ，ｔ）−匿名性を満たさないと判定した場合には、この（ｋ，ｔ）−匿名性を満たさない一般化ノードを、集合Ｅに加える。この（ｋ，ｔ）−匿名性を満たさない一般化ノードからは、いずれの要素をより具体化する方向にシフトさせたとしても、シフト後の一般化ノードが（ｋ，ｔ）−匿名性を満たさない。そこで、変形例１に係る解析部１４は、（ｋ，ｔ）−匿名性を満たす極小元を探索するための新たな基準点の候補を特定した際に、当該候補と集合Ｅとを比較することで、特定した新たな基準点の候補の中から集合Ｅに含まれる一般化ノードを事前に除く。このような構成により、変形例１に係る解析部１４は、（ｋ，ｔ）−匿名性の判定に係る処理（即ち、図１３に示した「kAnonymityCheck」関数）の実行回数をより低減し、（ｋ，ｔ）−匿名性を満たす一連の極小元を探索する処理の処理量を低減する。 Specifically, when it is determined that the generalized node does not satisfy (k, t) -anonymity, the analysis unit 14 according to the first modification generally does not satisfy this (k, t) -anonymity. Add a node to set E. From the generalized node that does not satisfy this (k, t) -anonymity, even if any element is shifted in a more specific direction, the generalized node after the shift has (k, t) -anonymity. Do not meet. Therefore, when the analysis unit 14 according to Modification 1 identifies a new reference point candidate for searching for a minimal element satisfying (k, t) -anonymity, the analysis unit 14 compares the candidate with the set E. Thus, the generalized nodes included in the set E are previously removed from the identified new reference point candidates. With such a configuration, the analysis unit 14 according to Modification 1 further reduces the number of executions of the process related to the determination of (k, t) -anonymity (that is, the “kAnonymityCheck” function illustrated in FIG. 13), (K, t) —Reduces the amount of processing for searching for a series of local minimums satisfying anonymity.

そこで、以降では、図２９を参照して、特に、図２１に示した処理とは異なる部分に着目して、より詳しく説明する。 Therefore, in the following, with reference to FIG. 29, a detailed description will be given, particularly focusing on portions different from the processing shown in FIG.

（ステップＳ１０１〜Ｓ１０５）
ステップＳ１０１〜Ｓ１０５に係る処理は、図２１に示した例と同様のため詳細な説明は省略する。即ち、極小元探索部１４３は、まず、一般化戦略からなる束の最大元を基準点として（ｋ，ｔ）−匿名性を満たす極小元を探索する。そして、基準点導出部１４５は、探索された極小元を離散集合から誘導される元に変換し、変換された元の補集合を、集合Ｃに代入する。 (Steps S101 to S105)
Since the process concerning step S101-S105 is the same as that of the example shown in FIG. 21, detailed description is abbreviate | omitted. That is, the minimal element search unit 143 first searches for a minimal element that satisfies (k, t) -anonymity with the maximum element of a bundle composed of generalized strategies as a reference point. Then, the reference point deriving unit 145 converts the searched minimal element into an element derived from the discrete set, and substitutes the converted original complement into the set C.

（ステップＳ１０７’）
次いで、基準点導出部１４５は、変数Ｄで示された集合と、変数Ｅで示された集合とを、空集合で初期化する。 (Step S107 ′)
Next, the reference point deriving unit 145 initializes the set indicated by the variable D and the set indicated by the variable E with an empty set.

（ステップＳ１２１）
次に、基準点導出部１４５は、集合Ｅと、特定した一般化ノードｖとに基づき、ｗ≧ｖの関係を満たす集合Ｅの要素ｗ（即ち、一般化ノードｗ）が存在するか否かを判定する。 (Step S121)
Next, the reference point deriving unit 145 determines whether or not there is an element w (that is, a generalized node w) of the set E that satisfies the relationship w ≧ v based on the set E and the specified generalized node v. Determine.

（ステップＳ１２３）
ｗ≧ｖの関係を満たす集合Ｅの要素ｗが存在する場合には（ステップＳ１２１、ＹＥＳ）、基準点導出部１４５は、特定された一般化ノードｖの特定元である、離散集合から誘導される束の元Ｘ（即ち、「vec」関数に入力した元Ｘ）を集合Ｄに加える。 (Step S123)
When there is an element w of the set E that satisfies the relationship w ≧ v (YES in step S121), the reference point deriving unit 145 is derived from the discrete set that is the identification source of the identified generalized node v. The element X of the bundle (ie, the element X input to the “vec” function) is added to the set D.

（ステップＳ１１３）
ｗ≧ｖの関係を満たす集合Ｅの要素ｗが存在しない場合には（ステップＳ１２１、ＮＯ）、基準点導出部１４５は、特定された一般化ノードｖ（ｋ，ｔ）−匿名性を満たすか否かを判定する。なお、（ｋ，ｔ）−匿名性を満たすか否かの判定には、前述した「kAnonymityCheck」関数を適用すればよい。 (Step S113)
If there is no element w of the set E that satisfies the relationship of w ≧ v (step S121, NO), the reference point deriving unit 145 satisfies the specified generalized node v (k, t) -anonymity? Determine whether or not. Note that the above-described “kAnonymityCheck” function may be applied to determine whether or not (k, t) −anonymity is satisfied.

（ステップＳ１１５）
特定した一般化ノードｖが（ｋ，ｔ）−匿名性を満たす場合には（ステップＳ１１３、ＹＥＳ）、基準点導出部１４５は、当該一般化ノードｖを新たな基準点の候補として、極小元探索部１４３に出力する。
（ステップＳ１１７）
そして、基準点導出部１４５は、取得した極小元ｗに対応する、離散集合から誘導される束の元Ｖを特定し、特定した元Ｖの補集合を集合Ｃに加える。 (Step S115)
When the identified generalized node v satisfies (k, t) -anonymity (step S113, YES), the reference point deriving unit 145 sets the generalized node v as a new reference point candidate and uses the minimal element. The data is output to the search unit 143.
(Step S117)
Then, the reference point deriving unit 145 identifies the bundle element V derived from the discrete set corresponding to the acquired minimal element w, and adds the complement of the identified element V to the set C.

（ステップＳ１３１）
一方で、特定した一般化ノードｖが（ｋ，ｔ）−匿名性を満たさない場合には（ステップＳ１１３、ＮＯ）、基準点導出部１４５は、ｗ＜ｖの関係を満たす集合Ｅの要素ｗが存在するか否かを判定する。 (Step S131)
On the other hand, when the specified generalized node v does not satisfy (k, t) -anonymity (step S113, NO), the reference point deriving unit 145 has the element w of the set E that satisfies the relationship of w <v. It is determined whether or not exists.

（ステップＳ１３３）
ｗ＜ｖの関係を満たす集合Ｅの要素ｗが存在する場合には（ステップＳ１３１、ＹＥＳ）、基準点導出部１４５は、集合Ｅから当該要素ｗを除き、特定した一般化ノードｖを加える（即ち、Ｅ←Ｅ＼｛ｗ｝∪｛ｖ｝）。 (Step S133)
When there is an element w of the set E that satisfies the relationship of w <v (step S131, YES), the reference point deriving unit 145 removes the element w from the set E and adds the specified generalized node v ( That is, E ← E \ {w} ∪ {v}).

（ステップＳ１３５）
また、ｗ＜ｖの関係を満たす集合Ｅの要素ｗが存在しな場合には（ステップＳ１３１、ＮＯ）、基準点導出部１４５は、集合Ｅに、特定した一般化ノードｖを加える。 (Step S135)
If there is no element w of the set E that satisfies the relationship w <v (step S131, NO), the reference point deriving unit 145 adds the specified generalized node v to the set E.

（ステップＳ１３５）
そして、基準点導出部１４５は、特定された一般化ノードｖの特定元である、離散集合から誘導される束の元Ｘ（即ち、「vec」関数に入力した元Ｘ）を集合Ｄに加える。 (Step S135)
Then, the reference point deriving unit 145 adds the element X of the bundle derived from the discrete set that is the specifying element of the specified generalized node v (that is, the element X input to the “vec” function) to the set D. .

（ステップＳ１０９）
以上のようにして、極小元探索部１４３及び基準点導出部１４５は、ステップＳ１１１〜Ｓ１３７に示された一連の処理を、部分集合Ｔｒ（Ｃ）の要素のうち、集合Ｄに含まれない要素に対応する離散集合から誘導される束の元Ｘが存在する限り継続する（ステップＳ１０９、ＹＥＳ）。 (Step S109)
As described above, the minimal element search unit 143 and the reference point derivation unit 145 perform the series of processes shown in Steps S111 to S137 among the elements of the subset Tr (C) that are not included in the set D. As long as there is a bundle element X derived from the discrete set corresponding to (YES in step S109).

そして、離散集合から誘導される束の元Ｘが存在しない場合には（ステップＳ１０９、ＮＯ）、極小元探索部１４３は、探索された一連の（ｋ，ｔ）−匿名性を満たす極小元それぞれに対応する一般化戦略を、解析結果出力部１５に出力する。以降の処理は、図２１に示した例と同様のため詳細な説明は省略する。 When there is no bundle element X derived from the discrete set (step S109, NO), the minimal element search unit 143 determines each of the minimal elements satisfying the searched series of (k, t) -anonymity. Is output to the analysis result output unit 15. Subsequent processing is the same as the example shown in FIG.

以上、図２８及び図２９を参照して、変形例１に係る解析部１４が、一般化戦略からなる束から（ｋ，ｔ）−匿名性を満たす一連の極小元を探索する処理の他の一例について説明した。上記に説明したように、変形例１に係る解析部１４は、（ｋ，ｔ）−匿名性を満たす極小元を探索するための新たな基準点の候補を特定した際に、当該候補と集合Ｅとを比較することで、特定した新たな基準点の候補の中から集合Ｅに含まれる一般化ノードを事前に除く。このような構成により、変形例１に係る解析部１４は、（ｋ，ｔ）−匿名性の判定に係る処理（即ち、図１３に示した「kAnonymityCheck」関数）の実行回数をより低減し、（ｋ，ｔ）−匿名性を満たす一連の極小元を探索する処理の処理量を低減することが可能となる。 As described above, with reference to FIG. 28 and FIG. 29, the analysis unit 14 according to the modified example 1 is another process of searching for a series of minimal elements satisfying (k, t) -anonymity from a bundle of generalization strategies. An example has been described. As described above, when the analysis unit 14 according to Modification 1 identifies a new reference point candidate for searching for a minimal element that satisfies (k, t) -anonymity, By comparing with E, generalized nodes included in the set E are excluded in advance from the identified new reference point candidates. With such a configuration, the analysis unit 14 according to Modification 1 further reduces the number of executions of the process related to the determination of (k, t) -anonymity (that is, the “kAnonymityCheck” function illustrated in FIG. 13), It is possible to reduce the amount of processing for searching for a series of minimal elements satisfying (k, t) -anonymity.

［６．２．変形例２：システム構成の一例］
次に、変形例２として、本実施形態に係る情報処理システムのシステム構成の一例について説明する。 [6.2. Modification 2: Example of system configuration]
Next, as a second modification, an example of a system configuration of the information processing system according to the present embodiment will be described.

例えば、図３０は、変形例２に係る情報処理システム１の一例について説明するための説明図である。図３０において、参照符号１０’は、前述した実施形態に係る情報処理装置１０（図１参照）に相当する。即ち、図３０に示す例は、前述した実施形態に係る情報処理装置１０を、複数の情報処理装置１０１〜１０ｎにより構成される分散処理システム１０’として実現した場合について示している。 For example, FIG. 30 is an explanatory diagram for describing an example of the information processing system 1 according to the second modification. In FIG. 30, reference numeral 10 ′ corresponds to the information processing apparatus 10 (see FIG. 1) according to the above-described embodiment. That is, the example illustrated in FIG. 30 illustrates a case where the information processing apparatus 10 according to the above-described embodiment is realized as a distributed processing system 10 ′ including a plurality of information processing apparatuses 101 to 10 n.

前述したように、本実施形態に係る情報処理装置１０は、ＤＢ５０に記憶されたテーブルｄ１０を解析することで一般化戦略の候補を探索し、管理者から指定された一般化戦略に基づきテーブルｄ１０を、より匿名性が向上したテーブルｄ３０に変換する。このような特性から、特に、解析の対象となる属性（即ち、一般化の対象となる属性）の数や、レコード数が増大するほど、情報処理装置１０の処理負荷は増大する。 As described above, the information processing apparatus 10 according to the present embodiment searches for a generalization strategy candidate by analyzing the table d10 stored in the DB 50, and based on the generalization strategy designated by the administrator, the table d10. Is converted into a table d30 with improved anonymity. From such characteristics, in particular, the processing load of the information processing apparatus 10 increases as the number of attributes to be analyzed (that is, attributes to be generalized) and the number of records increase.

そのため、前述した実施形態に係る情報処理装置１０（図１参照）を、図３０に示すように、分散処理システム１０’として構成することで、個々の情報処理装置（即ち、情報処理装置１０１〜１０ｎ）の処理負荷を軽減してもよい。この場合には、図１０に示した情報処理装置１０の各構成のうち、特に、解析部１４やＤＢ情報変換部１７の処理を、複数の情報処理装置１０１〜１０ｎにより分散処理するとよい。 Therefore, the information processing apparatus 10 (see FIG. 1) according to the above-described embodiment is configured as a distributed processing system 10 ′ as illustrated in FIG. The processing load of 10n) may be reduced. In this case, among the components of the information processing apparatus 10 shown in FIG. 10, in particular, the processing of the analysis unit 14 and the DB information conversion unit 17 may be distributed by the plurality of information processing apparatuses 101 to 10n.

また、図３１は、変形例２に係る情報処理システム１の一例について説明するための説明図であり、図３１において、参照符号５０’は、前述した実施形態に係るＤＢ５０（図１参照）に相当する。即ち、図３１に示す例は、前述した実施形態に係るＤＢ５０を、複数のＤＢ５０１〜５０ｍにより構成される分散ＤＢ５０’として実現した場合について示している。 FIG. 31 is an explanatory diagram for explaining an example of the information processing system 1 according to the modified example 2. In FIG. 31, reference numeral 50 ′ denotes the DB 50 (see FIG. 1) according to the above-described embodiment. Equivalent to. That is, the example illustrated in FIG. 31 illustrates a case where the DB 50 according to the above-described embodiment is realized as a distributed DB 50 ′ configured by a plurality of DBs 501 to 50 m.

特に、所謂データベースに対するデータの入出力は、当該データベースを実現するためのハードウェア（例えば、サーバやストレージ）の入出力性能に依存し、当該入出力性能がボトルネックとなる場合もある。特に、本実施形態に係る情報処理システム１では、処理対象となるテーブルｄ１０のレコード数が増大するほど、テーブルｄ１０のデータ量は増大する。そのため、テーブルｄ１０のレコード数が増大するほど、ＤＢ５０を構成するハードウェアの入出力性能がボトルネックとなり、情報処理装置１０がＤＢ５０からテーブルｄ１０の情報を取得するための時間が増大する場合がある。 In particular, the input / output of data to a so-called database depends on the input / output performance of hardware (for example, a server or a storage) for realizing the database, and the input / output performance may become a bottleneck. In particular, in the information processing system 1 according to the present embodiment, the data amount of the table d10 increases as the number of records in the table d10 to be processed increases. Therefore, as the number of records in the table d10 increases, the input / output performance of the hardware configuring the DB 50 becomes a bottleneck, and the time for the information processing apparatus 10 to acquire the information of the table d10 from the DB 50 may increase. .

そのため、前述した実施形態に係るＤＢ５０（図１参照）を、図３１に示すように、分散ＤＢ５０’として構成することで、ハードウェアの入出力性能に依存するボトルネックを解消してもよい。 Therefore, the bottleneck that depends on the input / output performance of the hardware may be eliminated by configuring the DB 50 (see FIG. 1) according to the above-described embodiment as a distributed DB 50 ′ as shown in FIG. 31.

具体的な一例として、テーブルｄ１０中の各レコードを、分散ＤＢ５０’の各データベース５０１〜５０ｍに分散して管理してもよい。このような構成により、例えば、情報処理装置１０が、分散ＤＢ５０’からテーブルｄ１０の情報（特に、テーブルｄ１０の各レコード）を取得する際に、当該情報の取得に係る時間を最大で１／ｍに短縮することも可能となる。 As a specific example, each record in the table d10 may be distributed and managed in each database 501 to 50m of the distributed DB 50 '. With such a configuration, for example, when the information processing apparatus 10 acquires information of the table d10 (particularly, each record of the table d10) from the distributed DB 50 ′, the time required for acquiring the information is 1 / m at the maximum. It is also possible to shorten it.

なお、図３１に示す例では、ＤＢ５０を分散ＤＢとして構成する例について説明したが、一般化ＤＢ４０（図１参照）を分散ＤＢとして構成してもよい。即ち、一般化ＤＢ４０を分散ＤＢとして構成することで、情報処理装置１０が、テーブルｄ１０の情報に基づき生成したテーブルｄ３０の情報を一般化ＤＢ４０に出力する際に、当該一般化ＤＢ４０へのテーブルｄ３０の情報の出力に係る時間を短縮することが可能となる。 In the example shown in FIG. 31, the example in which the DB 50 is configured as a distributed DB has been described, but the generalized DB 40 (see FIG. 1) may be configured as a distributed DB. That is, by configuring the generalized DB 40 as a distributed DB, when the information processing apparatus 10 outputs the information of the table d30 generated based on the information of the table d10 to the generalized DB 40, the table d30 to the generalized DB 40 is displayed. It is possible to shorten the time required to output the information.

また、図３０及び図３１に示す構成を組み合わせてもよい。即ち、情報処理装置１０を分散処理システム１０’として構成し、ＤＢ５０を分散ＤＢ５０’として構成してもよい。もちろん、一般化ＤＢ４０を分散ＤＢとして構成してもよい。 Further, the configurations shown in FIGS. 30 and 31 may be combined. That is, the information processing apparatus 10 may be configured as a distributed processing system 10 ′, and the DB 50 may be configured as a distributed DB 50 ′. Of course, the generalized DB 40 may be configured as a distributed DB.

以上、図３０及び図３１を参照して、変形例２に係る情報処理システムのシステム構成の一例について説明した。 The example of the system configuration of the information processing system according to the modification 2 has been described above with reference to FIGS. 30 and 31.

＜７．ハードウェア構成＞
次に、図３２を参照して、本開示の実施形態に係る情報処理装置１０のハードウェア構成の一例について説明する。図３２は、本実施形態に係る情報処理装置１０のハードウェア構成の一例を示した図である。 <7. Hardware configuration>
Next, an example of a hardware configuration of the information processing apparatus 10 according to the embodiment of the present disclosure will be described with reference to FIG. FIG. 32 is a diagram illustrating an example of a hardware configuration of the information processing apparatus 10 according to the present embodiment.

図３２に示すように、本実施形態に係る情報処理装置１０は、プロセッサ９０１と、メモリ９０３と、ストレージ９０５と、通信デバイス９１１と、バス９１５とを含む。また、情報処理装置１０は、操作デバイス９０７と、報知デバイス９０９とを含んでもよい。 As illustrated in FIG. 32, the information processing apparatus 10 according to the present embodiment includes a processor 901, a memory 903, a storage 905, a communication device 911, and a bus 915. The information processing apparatus 10 may include an operation device 907 and a notification device 909.

プロセッサ９０１は、例えばＣＰＵ（Central Processing Unit）、ＧＰＵ（Graphics Processing Unit）、ＤＳＰ（Digital Signal Processor）又はＳｏＣ（System on Chip）であってよく、情報処理装置１０の様々な処理を実行する。プロセッサ９０１は、例えば、各種演算処理を実行するための電子回路により構成することが可能である。なお、前述した解析部１４や、ＤＢ情報変換部１７は、プロセッサ９０１により実現され得る。 The processor 901 may be, for example, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), or a SoC (System on Chip), and executes various processes of the information processing apparatus 10. The processor 901 can be configured by, for example, an electronic circuit for executing various arithmetic processes. The analysis unit 14 and the DB information conversion unit 17 described above can be realized by the processor 901.

メモリ９０３は、ＲＡＭ（Random Access Memory）及びＲＯＭ（Read Only Memory）を含み、プロセッサ９０１により実行されるプログラム及びデータを記憶する。ストレージ９０５は、半導体メモリ又はハードディスクなどの記憶媒体を含み得る。 The memory 903 includes a RAM (Random Access Memory) and a ROM (Read Only Memory), and stores programs and data executed by the processor 901. The storage 905 can include a storage medium such as a semiconductor memory or a hard disk.

操作デバイス９０７は、ユーザが所望の操作を行うための入力信号を生成する機能を有する。操作デバイス９０７は、例えばボタン及びスイッチなどユーザが情報を入力するための入力部と、ユーザによる入力に基づいて入力信号を生成し、プロセッサ９０１に供給する入力制御回路などから構成されてよい。 The operation device 907 has a function of generating an input signal for a user to perform a desired operation. The operation device 907 may include an input unit for inputting information such as buttons and switches, and an input control circuit that generates an input signal based on an input by the user and supplies the input signal to the processor 901.

報知デバイス９０９は、出力デバイスの一例であり、例えば、液晶ディスプレイ（ＬＣＤ：Liquid Crystal Display）装置、有機ＥＬ（ＯＬＥＤ：Organic Light Emitting Diode）ディスプレイなどのデバイスであってよい。この場合には、報知デバイス９０９は、画面を表示することにより、ユーザに対して所定の情報を報知することができる。 The notification device 909 is an example of an output device, and may be a device such as a liquid crystal display (LCD) device or an organic light emitting diode (OLED) display, for example. In this case, the notification device 909 can notify the user of predetermined information by displaying the screen.

また、他の一例として、報知デバイス９０９は、ＬＥＤ（Light Emitting Diode）のように、点灯又は点滅のパターンにより、所定の情報をユーザに報知するデバイスであってもよい。また、報知デバイス９０９は、スピーカ等のように、所定の音響信号を出力することで、所定の情報をユーザに報知するデバイスであってもよい。 As another example, the notification device 909 may be a device that notifies the user of predetermined information using a lighting or blinking pattern, such as an LED (Light Emitting Diode). Further, the notification device 909 may be a device that notifies a user of predetermined information by outputting a predetermined acoustic signal, such as a speaker.

通信デバイス９１１は、本開示の実施形態に係る情報処理装置１０が備える通信手段であり、ネットワークを介して外部装置と通信する。通信デバイス９１１は、有線または無線用の通信インタフェースである。通信デバイス９１１を、無線通信インタフェースとして構成するバイには、当該通信デバイス９１１は、通信アンテナ、ＲＦ（Radio Frequency）回路、ベースバンドプロセッサなどを含んでもよい。 The communication device 911 is a communication unit included in the information processing apparatus 10 according to the embodiment of the present disclosure, and communicates with an external device via a network. The communication device 911 is a wired or wireless communication interface. In the case where the communication device 911 is configured as a wireless communication interface, the communication device 911 may include a communication antenna, an RF (Radio Frequency) circuit, a baseband processor, and the like.

通信デバイス９１１は、外部装置から受信した信号に各種の信号処理を行う機能を有し、受信したアナログ信号から生成したデジタル信号をプロセッサ９０１に供給することが可能である。 The communication device 911 has a function of performing various kinds of signal processing on a signal received from an external device, and can supply a digital signal generated from the received analog signal to the processor 901.

バス９１５は、プロセッサ９０１、メモリ９０３、ストレージ９０５、操作デバイス９０７、報知デバイス９０９、及び通信デバイス９１１を相互に接続する。バス９１５は、複数の種類のバスを含んでもよい。 The bus 915 connects the processor 901, the memory 903, the storage 905, the operation device 907, the notification device 909, and the communication device 911 to each other. The bus 915 may include a plurality of types of buses.

また、コンピュータに内蔵されるプロセッサ、メモリ、及びストレージなどのハードウェアを、上記した情報処理装置１０が有する構成と同等の機能を発揮させるためのプログラムも作成可能である。また、当該プログラムを記録した、コンピュータに読み取り可能な記憶媒体も提供され得る。 In addition, it is possible to create a program for causing hardware such as a processor, a memory, and a storage built in a computer to exhibit functions equivalent to the configuration of the information processing apparatus 10 described above. A computer-readable storage medium that records the program can also be provided.

＜８．まとめ＞
以上説明したように、本実施形態に係る情報処理装置１０は、処理対象となるテーブルｄ１０中の対象となる属性それぞれについて、当該属性の属性値を段階的により一般化されるように区分するための複数の分類に基づく、当該属性の一般化階層を取得する。そして、情報処理装置１０は、当該属性間における分類の組み合わせ（即ち、一般化階層の高さの組み合わせ）を元として、当該分類をより一般化（抽象化）する順序関係に基づき、一般化戦略からなる束を規定する。 <8. Summary>
As described above, the information processing apparatus 10 according to the present embodiment classifies the attribute values of the attribute so as to be generalized step by step for each attribute to be processed in the table d10 to be processed. The generalized hierarchy of the attribute is acquired based on the plurality of classifications. Then, the information processing apparatus 10 performs a generalization strategy based on an order relation that generalizes (abstracts) the classification based on a combination of classifications between the attributes (that is, a combination of generalized hierarchy heights). A bundle consisting of

情報処理装置１０は、既定した一般化戦略からなる束のうち、（ｋ，ｔ）−匿名性を満たす元を基準点として、当該元の各要素のうち、いずれかの要素をより具体化する方向にシフトさせることで、（ｋ，ｔ）−匿名性を満たす極小元を探索する。 The information processing apparatus 10 uses the element satisfying (k, t) -anonymity as a reference point in the bundle of the predetermined generalization strategies, and more specifically embodies any element of the original elements. By shifting in the direction, a local minimum satisfying (k, t) -anonymity is searched.

（ｋ，ｔ）−匿名性を満たす極小元を探索したら、情報処理装置１０は、探索した当該極小元に基づき、新たな基準点を導出する。具体的には、情報処理装置１０は、特定した極小元を、離散集合から誘導される束の元に変換し、変換された元の補集合から極小横断を求め、当該極小横断に対応する一般化戦略からなる束の元のうち、（ｋ，ｔ）−匿名性を満たす元を新たな基準点の候補とする。 When searching for a minimal element that satisfies (k, t) -anonymity, the information processing apparatus 10 derives a new reference point based on the searched minimal element. Specifically, the information processing apparatus 10 converts the specified minimal element into a bundle element derived from a discrete set, obtains a minimal crossing from the transformed original complement, and corresponds to the minimal crossing in general. Among the bundle elements composed of the conversion strategies, an element satisfying (k, t) -anonymity is set as a new reference point candidate.

そして、情報処理装置１０は、新たな基準点に基づく（ｋ，ｔ）−匿名性を満たす極小元を探索に係る処理と、当該探索の結果に基づく新たな基準点の導出に係る処理とを、新たな基準点の候補が導出されなくなるまで実行する。以上のようにして、情報処理装置１０は、（ｋ，ｔ）−匿名性を満たす一連の極小元を探索し、探索された一連の極小元それぞれに対応する一般化戦略を出力する。なお、探索された一連の極小元それぞれに対応する一般化戦略が、（ｋ，ｔ）−匿名性を満たすようにテーブルｄ１０を一般化し、かつ、情報量の減少を最小限に抑えることが可能な一般化戦略に相当する。 Then, the information processing apparatus 10 performs processing related to searching for a minimal element satisfying (k, t) -anonymity based on a new reference point, and processing related to derivation of a new reference point based on the search result. , And so on until no new reference point candidates are derived. As described above, the information processing apparatus 10 searches for a series of minimum elements satisfying (k, t) -anonymity, and outputs a generalization strategy corresponding to each of the searched series of minimum elements. It should be noted that the generalization strategy corresponding to each searched series of minimal elements can generalize the table d10 so as to satisfy (k, t) -anonymity, and minimize the reduction in the amount of information. Equivalent to a generalized strategy.

なお、（ｋ，ｔ）−匿名性を満たし、かつ、情報量の減少を最小限に抑えることが可能な一般化戦略、即ち、（ｋ，ｔ）−匿名性を満たす一連の極小元を特定する処理は、所謂列挙問題に相当する。このような列挙問題を解く従来のアルゴリズムは、一般的にはヒューリスティックなアルゴリズムに相当し、計算量を事前に算出することが困難であった。 It should be noted that (k, t) -generalization strategy that satisfies anonymity and can minimize the decrease in the amount of information, that is, identifies a series of minimal elements that satisfy (k, t) -anonymity. This processing corresponds to a so-called enumeration problem. Conventional algorithms for solving such enumeration problems generally correspond to heuristic algorithms, and it has been difficult to calculate the amount of calculation in advance.

また、本実施形態に係る情報処理装置１０に依れば、必ずしも全ての一般化戦略（即ち、一般化ノード）に対して、（ｋ，ｔ）−匿名性を満たすか否かの判定をおこなう必要はない。そのため、本実施形態に係る情報処理装置１０に依れば、一般化戦略からなる束から、（ｋ，ｔ）−匿名性を満たす一連の極小元を探索する（列挙する）処理の負荷を、低減することが可能となる。 Further, according to the information processing apparatus 10 according to the present embodiment, it is not necessarily determined whether (k, t) -anonymity is satisfied for all generalization strategies (that is, generalized nodes). There is no need. Therefore, according to the information processing apparatus 10 according to the present embodiment, the load of processing for searching for (listing) a series of minimal elements satisfying (k, t) -anonymity from a bundle of generalization strategies, It becomes possible to reduce.

以上、添付図面を参照しながら本開示の好適な実施形態について詳細に説明したが、本開示の技術的範囲はかかる例に限定されない。本開示の技術分野における通常の知識を有する者であれば、特許請求の範囲に記載された技術的思想の範疇内において、各種の変更例または修正例に想到し得ることは明らかであり、これらについても、当然に本開示の技術的範囲に属するものと了解される。 The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the technical scope of the present disclosure is not limited to such examples. It is obvious that a person having ordinary knowledge in the technical field of the present disclosure can come up with various changes or modifications within the scope of the technical idea described in the claims. Of course, it is understood that it belongs to the technical scope of the present disclosure.

また、本明細書に記載された効果は、あくまで説明的または例示的なものであって限定的ではない。つまり、本開示に係る技術は、上記の効果とともに、または上記の効果に代えて、本明細書の記載から当業者には明らかな他の効果を奏しうる。 Further, the effects described in the present specification are merely illustrative or exemplary and are not limited. That is, the technology according to the present disclosure can exhibit other effects that are apparent to those skilled in the art from the description of the present specification in addition to or instead of the above effects.

なお、以下のような構成も本開示の技術的範囲に属する。
（１）
プロセッサが、複数の属性を含むレコードを複数含むテーブルの当該複数の属性のうち、少なくとも２以上の属性を対象属性として、当該対象属性が取り得る属性値それぞれが段階的により抽象化されるように区分する複数の分類の、前記２以上の対象属性間における各組み合わせを第１の元として、前記分類をより抽象化する順序関係に基づき規定される第１の束のうち、前記第１の元それぞれに対応する前記分類に基づき前記テーブルを抽象化した場合に、当該テーブルの前記レコード間で、所定の条件に基づく匿名性を満たす前記第１の元を基準点として、当該匿名性を満たす極小元を探索する探索処理を実行することと、
前記対象属性ごとの前記分類の数に基づき規定される有限集合の各部分集合を第２の元とし、当該第２の元の包含関係を順序関係として規定される第２の束のうち、探索された前記極小元にあらかじめ対応付けられた前記第２の元の補集合を特定し、当該補集合の極小横断それぞれに対応する前記第１の元のうち、前記匿名性を満たす第１の元を新たな前記基準点として導出する基準点導出処理を実行することと、
を含み、
プロセッサが、特定された前記新たな基準点に基づく前記探索処理と、当該探索処理により探索される前記極小元に基づく前記基準点導出処理とを、前記新たな基準点が導出されなくなるまで実行し、探索された一連の前記極小元それぞれが示す前記分類の組み合わせを、前記テーブルを抽象化するための条件の候補として出力する、情報処理方法。
（２）
前記匿名性は、前記テーブル中において、前記２以上の対象属性それぞれの前記属性値の組み合わせが等しい前記レコードの集合を示す等価クラスのうち、レコード数が所定の数ｋ未満の前記等価クラスに対応する前記レコードを、所定の数ｔ以下の範囲で削除した場合に、当該レコードを削除した後の前記テーブル中の前記レコードに基づく前記等価クラスのそれぞれについて、レコード数が前記所定の数ｋ以上となることを条件とする、前記（１）に記載の情報処理方法。
（３）
前記第２の束は、前記対象属性ごとの前記分類の数の和を最大とする正の整数の有限集合の冪集合を前記第２の元として、当該第２の元の包含関係を順序関係として規定される、前記（１）または（２）に記載の情報処理方法。
（４）
前記第１の束における前記第１の元それぞれと、前記第２の束における前記第２の元それぞれとは、前記第１の束の最大元と前記第２の束の最小元とが対応し、かつ、前記第１の束の最小元と前記第２の束の最大元と対応するように、前記第１の束及び前第２の束それぞれの順序関係に基づき、あらかじめ対応付けられる、前記（３）に記載の情報処理装置。
（５）
前記第２の束において、
前記対象属性それぞれには、前記正の整数の有限集合の各要素うち、互いに異なる複数の連続した正の整数に対応した要素が関連付けられ、
当該対象属性に対応する前記複数の分類それぞれは、前記複数の連続した正の整数に対応した当該要素に基づく有限集合の部分集合に対応付けられる、前記（３）または（４）に記載の情報処理装置。
（６）
前記第２の束において、
前記対象属性に対応する複数の分類それぞれは、より具体化された当該分類ほど、当該対象属性に関連付けられた前記複数の連続した正の整数に対応した要素のうち、より多くの当該要素が、値のより小さい要素から順次加えられた部分集合に対応付けられる、前記（５）に記載の情報処理装置。
（７）
前記極小横断に対応する前記第１の元は、当該極小横断に対して、前記対象属性ごとに、当該対象属性に関連付けられた前記複数の連続した正の整数に対応した要素のうち、当該極小横断が含む当該要素よりも値の小さい他の要素が補間された、正の整数の有限集合に対応する前記第２の元に基づき特定される、前記（６）に記載の情報処理装置。
（８）
最初に実行される前記探索処理では、前記第１の束の最大元を前記基準点として、前記極小元が探索される、前記（１）〜（７）のいずれか一項に記載の情報処理方法。
（９）
前記最大元は、前記第１の束のうち、前記２以上の対象属性それぞれの前記属性値を最も抽象化する前記分類の組み合わせに対応する前記第１の元である、前記（８）に記載の情報処理方法。
（１０）
前記探索処理では、前記基準点から、前記２以上の対象属性のうち少なくともいずれかに対応する前記分類がより具体化される方向に向けて、前記極小元が探索される、前記（１）〜（９）のいずれか一項に記載の情報処理方法。
（１１）
前記探索処理では、前記２以上の対象属性のうち少なくともいずれかに対応する前記分類を、より具体化される方向に向けた二分探索に基づき、前記極小元が探索される、前記（１０）に記載の情報処理装置。
（１２）
前記極小横断それぞれに対応する前記第１の元のうち、前記探索処理に基づき探索済みの前記第１の元は、前記新たな基準点の候補から除かれる、前記（１）〜（１１）のいずれか一項に記載の情報処理方法。
（１３）
前記極小横断は、前記補集合をハイパーエッジとするハイパーグラフの双対化に基づき特定される、前記（１）〜（１２）のいずれか一項に記載の情報処理方法。
（１４）
前記探索処理と前記基準点導出処理との少なくともいずれかは、複数のプロセッサにより分散処理される、前記（１）〜（１３）のいずれか一項に記載の情報処理方法。
（１５）
コンピュータに、
複数の属性を含むレコードを複数含むテーブルの当該複数の属性のうち、少なくとも２以上の属性を対象属性として、当該対象属性が取り得る属性値それぞれが段階的により抽象化されるように区分する複数の分類の、前記２以上の対象属性間における各組み合わせを第１の元として、前記分類をより抽象化する順序関係に基づき規定される第１の束のうち、前記第１の元それぞれに対応する前記分類に基づき前記テーブルを抽象化した場合に、当該テーブルの前記レコード間で、所定の条件に基づく匿名性を満たす前記第１の元を基準点として、当該匿名性を満たす極小元を探索する探索処理と、
前記対象属性ごとの前記分類の数に基づき規定される有限集合の各部分集合を第２の元とし、当該第２の元の包含関係を順序関係として規定される第２の束のうち、探索された前記極小元にあらかじめ対応付けられた前記第２の元の補集合を特定し、当該補集合の極小横断それぞれに対応する前記第１の元のうち、前記匿名性を満たす第１の元を新たな前記基準点として導出する基準点導出処理と、
を実行させ、
特定された前記新たな基準点に基づく前記探索処理と、当該探索処理により探索される前記極小元に基づく前記基準点導出処理とを、前記新たな基準点が導出されなくなるまで実行させる、探索された一連の前記極小元それぞれが示す前記分類の組み合わせを、前記テーブルを抽象化するための条件の候補として出力させる、プログラム。
（１６）
複数の属性を含むレコードを複数含むテーブルの当該複数の属性のうち、少なくとも２以上の属性を対象属性として、当該対象属性が取り得る属性値それぞれが段階的により抽象化されるように区分する複数の分類の、前記２以上の対象属性間における各組み合わせを第１の元として、前記分類をより抽象化する順序関係に基づき規定される第１の束のうち、前記第１の元それぞれに対応する前記分類に基づき前記テーブルを抽象化した場合に、当該テーブルの前記レコード間で、所定の条件に基づく匿名性を満たす前記第１の元を基準点として、当該匿名性を満たす極小元を探索する探索処理を実行する極小元探索部と、
前記対象属性ごとの前記分類の数に基づき規定される有限集合の各部分集合を第２の元とし、当該第２の元の包含関係を順序関係として規定される第２の束のうち、探索された前記極小元にあらかじめ対応付けられた前記第２の元の補集合を特定し、当該補集合の極小横断それぞれに対応する前記第１の元のうち、前記匿名性を満たす第１の元を新たな前記基準点として導出する基準点導出処理を実行する基準点導出部と、
特定された前記新たな基準点に基づく前記探索処理と、当該探索処理により探索される前記極小元に基づく前記基準点導出処理とが、前記新たな基準点が導出されなくなるまで実行された結果に基づき、探索された一連の前記極小元それぞれが示す前記分類の組み合わせを、前記テーブルを抽象化するための条件の候補として出力する出力部と、
を備える、情報処理装置。 The following configurations also belong to the technical scope of the present disclosure.
(1)
The processor can abstract each of the attribute values that can be taken by the target attribute by using at least two or more attributes as the target attribute among the plurality of attributes of the table including a plurality of records including the plurality of attributes. The first element of the first bundle defined based on the order relationship that makes the classification more abstract, with each combination between the two or more target attributes as a first element of a plurality of classification categories When the table is abstracted based on the classification corresponding to each, the minimum satisfying the anonymity with the first element satisfying the anonymity based on a predetermined condition between the records of the table as a reference point Performing a search process to search for the source;
Search among the second bundles in which each subset of the finite set defined based on the number of classifications for each target attribute is a second element and the inclusion relation of the second element is defined as an order relation A first element satisfying the anonymity among the first elements corresponding to each of the minimal crossings of the complementary set is identified in advance and is associated with the minimal element. Executing a reference point derivation process for deriving as a new reference point;
Including
A processor executes the search process based on the specified new reference point and the reference point derivation process based on the local minimum searched by the search process until the new reference point is not derived. An information processing method for outputting a combination of the classifications indicated by each of the searched series of minimum elements as a candidate for a condition for abstracting the table.
(2)
The anonymity corresponds to the equivalence class having a record number less than a predetermined number k among equivalence classes indicating the set of records in which the combination of the attribute values of the two or more target attributes is equal in the table. When the records to be deleted are deleted in a range of a predetermined number t or less, the number of records is equal to or more than the predetermined number k for each of the equivalence classes based on the records in the table after the records are deleted. The information processing method according to (1), on the condition that
(3)
The second bundle is an ordered relation of the inclusion relation of the second element, with the second element being a power set of a finite set of positive integers that maximizes the sum of the number of classifications for each target attribute. The information processing method according to (1) or (2), defined as:
(4)
Each of the first elements in the first bundle and each of the second elements in the second bundle correspond to a maximum element of the first bundle and a minimum element of the second bundle. And, in correspondence with the minimum element of the first bundle and the maximum element of the second bundle, it is associated in advance based on the order relationship of the first bundle and the previous second bundle, The information processing apparatus according to (3).
(5)
In the second bundle,
Each of the target attributes is associated with an element corresponding to a plurality of consecutive positive integers different from each other in each element of the finite set of positive integers,
The information according to (3) or (4), wherein each of the plurality of classifications corresponding to the target attribute is associated with a subset of a finite set based on the element corresponding to the plurality of consecutive positive integers. Processing equipment.
(6)
In the second bundle,
Each of the plurality of classifications corresponding to the target attribute has more specific elements among elements corresponding to the plurality of consecutive positive integers associated with the target attribute as the classification is more specific. The information processing apparatus according to (5), wherein the information processing apparatus is associated with a subset sequentially added from an element having a smaller value.
(7)
The first element corresponding to the minimum crossing is, for each of the target attributes, for the minimum crossing, among the elements corresponding to the plurality of consecutive positive integers associated with the target attribute. The information processing apparatus according to (6), specified based on the second element corresponding to a finite set of positive integers interpolated with other elements having a value smaller than the element included in the crossing.
(8)
The information processing according to any one of (1) to (7), wherein in the search process executed first, the minimal element is searched with the maximum element of the first bundle as the reference point. Method.
(9)
The maximum element is the first element corresponding to the combination of the classifications that most abstracts the attribute values of the two or more target attributes in the first bundle, according to (8). Information processing method.
(10)
In the search process, the minimal element is searched from the reference point toward a direction in which the classification corresponding to at least one of the two or more target attributes is more specific. The information processing method according to any one of (9).
(11)
In the search process, the minimum element is searched based on a binary search in which the classification corresponding to at least one of the two or more target attributes is directed to a more specific direction. The information processing apparatus described.
(12)
Of the first elements corresponding to each of the minimum crossings, the first element that has been searched based on the search process is excluded from the candidates for the new reference point. The information processing method according to any one of the above.
(13)
The information processing method according to any one of (1) to (12), wherein the minimal crossing is specified based on dualization of a hypergraph with the complement set as a hyperedge.
(14)
The information processing method according to any one of (1) to (13), wherein at least one of the search process and the reference point derivation process is distributed by a plurality of processors.
(15)
On the computer,
A plurality of divisions such that at least two or more attributes of a plurality of attributes of a table including a plurality of records including a plurality of attributes are set as target attributes, and attribute values that can be taken by the target attributes are abstracted in stages. Corresponding to each of the first elements of the first bundle defined on the basis of the order relation that makes the classification more abstract, with each combination of the two or more target attributes of the classification as a first element When the table is abstracted based on the classification to be performed, the minimal element satisfying the anonymity is searched with the first element satisfying the anonymity based on a predetermined condition as a reference point between the records of the table Search process to
Search among the second bundles in which each subset of the finite set defined based on the number of classifications for each target attribute is a second element and the inclusion relation of the second element is defined as an order relation A first element satisfying the anonymity among the first elements corresponding to each of the minimal crossings of the complementary set is identified in advance and is associated with the minimal element. A reference point deriving process for deriving as a new reference point;
And execute
A search is performed to execute the search process based on the specified new reference point and the reference point derivation process based on the local minimum searched by the search process until the new reference point is not derived. A program for outputting a combination of the classifications indicated by each of the series of minimum elements as a candidate for a condition for abstracting the table.
(16)
A plurality of divisions such that at least two or more attributes of a plurality of attributes of a table including a plurality of records including a plurality of attributes are set as target attributes, and attribute values that can be taken by the target attributes are abstracted in stages. Corresponding to each of the first elements of the first bundle defined on the basis of the order relation that makes the classification more abstract, with each combination of the two or more target attributes of the classification as a first element When the table is abstracted based on the classification to be performed, the minimal element satisfying the anonymity is searched with the first element satisfying the anonymity based on a predetermined condition as a reference point between the records of the table A minimal element search unit that executes search processing to perform,
Search among the second bundles in which each subset of the finite set defined based on the number of classifications for each target attribute is a second element and the inclusion relation of the second element is defined as an order relation A first element satisfying the anonymity among the first elements corresponding to each of the minimal crossings of the complementary set is identified in advance and is associated with the minimal element. A reference point derivation unit for executing a reference point derivation process for deriving a reference point as a new reference point;
The search process based on the specified new reference point and the reference point derivation process based on the local minimum searched by the search process are executed until the new reference point is not derived. An output unit that outputs a combination of the classifications indicated by each of the searched series of minimal elements as a candidate for a condition for abstracting the table;
An information processing apparatus comprising:

１情報処理システム
１０情報処理装置
１１ＤＢ情報取得部
１２ＤＢ情報出力部
１３階層情報取得部
１４解析部
１４１解析データ生成部
１４３極小元探索部
１４５基準点導出部
１５解析結果出力部
１６一般化条件取得部
１７ＤＢ情報変換部
１８変換結果出力部
３０管理者端末
４０一般化ＤＢ
５０ＤＢ
６０解析サーバ
７０ＤＢサーバ
８０クライアント

DESCRIPTION OF SYMBOLS 1 Information processing system 10 Information processing apparatus 11 DB information acquisition part 12 DB information output part 13 Hierarchical information acquisition part 14 Analysis part 141 Analysis data generation part 143 Minimal element search part 145 Reference point derivation part 15 Analysis result output part 16 Generalization conditions Acquisition unit 17 DB information conversion unit 18 Conversion result output unit 30 Administrator terminal 40 Generalized DB
50 DB
60 Analysis server 70 DB server 80 Client

Claims

The processor can abstract each of the attribute values that can be taken by the target attribute by using at least two or more attributes as the target attribute among the plurality of attributes of the table including a plurality of records including the plurality of attributes. The first element of the first bundle defined based on the order relationship that makes the classification more abstract, with each combination between the two or more target attributes as a first element of a plurality of classification categories When the table is abstracted based on the classification corresponding to each, the minimum satisfying the anonymity with the first element satisfying the anonymity based on a predetermined condition between the records of the table as a reference point Performing a search process to search for the source;
Search among the second bundles in which each subset of the finite set defined based on the number of classifications for each target attribute is a second element and the inclusion relation of the second element is defined as an order relation A first element satisfying the anonymity among the first elements corresponding to each of the minimal crossings of the complementary set is identified in advance and is associated with the minimal element. Executing a reference point derivation process for deriving as a new reference point;
Including
A processor executes the search process based on the specified new reference point and the reference point derivation process based on the local minimum searched by the search process until the new reference point is not derived. An information processing method for outputting a combination of the classifications indicated by each of the searched series of minimum elements as a candidate for a condition for abstracting the table.

The anonymity corresponds to the equivalence class having a record number less than a predetermined number k among equivalence classes indicating the set of records in which the combination of the attribute values of the two or more target attributes is equal in the table. When the records to be deleted are deleted in a range of a predetermined number t or less, the number of records is equal to or more than the predetermined number k for each of the equivalence classes based on the records in the table after the records are deleted. The information processing method according to claim 1, wherein:

The second bundle is an ordered relation of the inclusion relation of the second element, with the second element being a power set of a finite set of positive integers that maximizes the sum of the number of classifications for each target attribute. The information processing method according to claim 1, defined as:

Each of the first elements in the first bundle and each of the second elements in the second bundle correspond to a maximum element of the first bundle and a minimum element of the second bundle. In addition, the first bundle and the previous second bundle are associated in advance so as to correspond to the minimum element of the first bundle and the maximum element of the second bundle. Item 4. The information processing device according to Item 3.

In the second bundle,
Each of the target attributes is associated with an element corresponding to a plurality of consecutive positive integers different from each other in each element of the finite set of positive integers,
The information processing apparatus according to claim 3, wherein each of the plurality of classifications corresponding to the target attribute is associated with a subset of a finite set based on the element corresponding to the plurality of consecutive positive integers.

In the second bundle,
Each of the plurality of classifications corresponding to the target attribute has more specific elements among elements corresponding to the plurality of consecutive positive integers associated with the target attribute as the classification is more specific. The information processing apparatus according to claim 5, wherein the information processing apparatus is associated with a subset sequentially added from an element having a smaller value.

The first element corresponding to the minimum crossing is, for each of the target attributes, for the minimum crossing, among the elements corresponding to the plurality of consecutive positive integers associated with the target attribute. The information processing apparatus according to claim 6, wherein the information is specified based on the second element corresponding to a finite set of positive integers interpolated with other elements having a value smaller than the element included in the crossing.

The information processing method according to claim 1, wherein in the search process executed first, the minimal element is searched with the maximum element of the first bundle as the reference point.

The maximum element is the first element corresponding to the combination of the classifications that most abstracts the attribute values of the two or more target attributes in the first bundle. Information processing method.

2. The minimal element is searched for in the search process from the reference point toward a direction in which the classification corresponding to at least one of the two or more target attributes is more specific. Information processing method.

11. The minimal element is searched for in the search process based on a binary search in which the classification corresponding to at least one of the two or more target attributes is directed to a more specific direction. Information processing device.

2. The information processing method according to claim 1, wherein, among the first elements corresponding to each of the minimum crossings, the first element that has already been searched based on the search process is excluded from the new reference point candidates. .

The information processing method according to claim 1, wherein the minimal crossing is specified based on dualization of a hypergraph with the complement set as a hyperedge.

The information processing method according to claim 1, wherein at least one of the search process and the reference point derivation process is distributedly processed by a plurality of processors.

On the computer,
A plurality of divisions such that at least two or more attributes of a plurality of attributes of a table including a plurality of records including a plurality of attributes are set as target attributes, and attribute values that can be taken by the target attributes are abstracted in stages. Corresponding to each of the first elements of the first bundle defined on the basis of the order relation that makes the classification more abstract, with each combination of the two or more target attributes of the classification as a first element When the table is abstracted based on the classification to be performed, the minimal element satisfying the anonymity is searched with the first element satisfying the anonymity based on a predetermined condition as a reference point between the records of the table Search process to
Search among the second bundles in which each subset of the finite set defined based on the number of classifications for each target attribute is a second element and the inclusion relation of the second element is defined as an order relation A first element satisfying the anonymity among the first elements corresponding to each of the minimal crossings of the complementary set is identified in advance and is associated with the minimal element. A reference point deriving process for deriving as a new reference point;
And execute
A search is performed to execute the search process based on the specified new reference point and the reference point derivation process based on the local minimum searched by the search process until the new reference point is not derived. A program for outputting a combination of the classifications indicated by each of the series of minimum elements as a candidate for a condition for abstracting the table.

A plurality of divisions such that at least two or more attributes of a plurality of attributes of a table including a plurality of records including a plurality of attributes are set as target attributes, and attribute values that can be taken by the target attributes are abstracted in stages. Corresponding to each of the first elements of the first bundle defined on the basis of the order relation that makes the classification more abstract, with each combination of the two or more target attributes of the classification as a first element When the table is abstracted based on the classification to be performed, the minimal element satisfying the anonymity is searched with the first element satisfying the anonymity based on a predetermined condition as a reference point between the records of the table A minimal element search unit that executes search processing to perform,
Search among the second bundles in which each subset of the finite set defined based on the number of classifications for each target attribute is a second element and the inclusion relation of the second element is defined as an order relation A first element satisfying the anonymity among the first elements corresponding to each of the minimal crossings of the complementary set is identified in advance and is associated with the minimal element. A reference point derivation unit for executing a reference point derivation process for deriving a reference point as a new reference point;
The search process based on the specified new reference point and the reference point derivation process based on the local minimum searched by the search process are executed until the new reference point is not derived. An output unit that outputs a combination of the classifications indicated by each of the searched series of minimal elements as a candidate for a condition for abstracting the table;
An information processing apparatus comprising: