CN113159918B - A Bank Customer Group Mining Method Based on Federated Clique Percolation
- Publication number
- CN113159918B CN202110380531.8A
- Authority
- CN
- China
- Prior art keywords
- bank
- client
- group
- customer
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/02—Banking, e.g. interest calculation or account maintenance
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Technology Law (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical Field
The invention relates to the technical field of federated learning, and in particular to a bank customer group mining method based on federated clique percolation.
Background
Banks need to understand customer behavior and mine value around customer needs, which means they must obtain more customer information to support accurate analysis. In recent years, Internet privacy leaks have occurred repeatedly, drawing growing attention from users, and governments have placed increasing emphasis on network security. The European Union brought the General Data Protection Regulation (GDPR) into force in 2018 to protect users' data privacy, and many countries, including China and the United States, have likewise enacted and refined privacy protection regulations that penalize privacy breaches. Federated learning is a decentralized, privacy-preserving distributed machine learning framework proposed by Google. It supports distributed parallel processing of large-scale data through decentralized computation, and relies on local computation and encrypted transmission to ensure that the banks' secret data is not leaked during computation. Studying a bank customer group mining method based on federated clique percolation is therefore of significant value. By combining the customer information of multiple banks while protecting customers' data privacy, such a method complies with privacy protection law, makes full use of the customer data the banks hold, and mines bank customer groups more accurately. The resulting groups can in turn be used to build high-quality bank customer profiles, deliver targeted advertising, and detect financial crime.
Summary of the Invention
The purpose of the invention is to provide a bank customer group mining method based on federated clique percolation that divides bank customers into groups more accurately while protecting their privacy.
To achieve the above purpose, the technical solution of the invention is a bank customer group mining method based on federated clique percolation. A system is provided that comprises a bank-side overlapping customer identification module, a bank-side customer similarity calculation module, a coordinator-side customer similarity aggregation module, a bank-side customer network k-clique discovery module, a bank-side customer network k-clique percolation module, and a bank-side customer group division module. The system mines bank customer groups according to the following steps:
Step S1: The bank-side overlapping customer identification module reads, at each bank P_h, the bank customer network G(V, E, R, A), where V is the customer set, E is the edge set, R is the feature set, and A is the customer feature matrix. One bank is selected at random to generate an RSA key pair and send the RSA public key to all other banks. The selected bank encrypts its customer IDs with the RSA public key and computes the intersection with the customer IDs that each other bank has encrypted with the same public key. The selected bank then takes the common intersection of these pairwise intersections to obtain the overlapping customer set and sends it to the other banks, so that every bank P_h holds the common overlapping customer set X_h;
Step S2: The bank-side customer similarity calculation module selects one bank at random to generate a key pair for a homomorphic encryption algorithm and send the key pair to all other banks. Each bank P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the homomorphic public key, and sends it to the coordinator; here a_h is a customer's feature vector, and the coordinator is the data aggregator that aggregates the encrypted data sent by the banks P_h. The coordinator receives the encrypted dimensions |a_h| from all banks, adds them in encrypted form to obtain the global customer feature matrix dimension, and sends the result to every bank P_h. Each bank P_h decrypts the global dimension with the homomorphic private key, computes the local similarity between overlapping customers from the global dimension and A_h to obtain the local similarity matrix S_h of the overlapping customers, encrypts S_h with the homomorphic public key, and sends the encrypted S_h to the coordinator;
Step S3: The coordinator-side customer similarity aggregation module receives, at the coordinator, the encrypted local similarity matrices S_h sent by the banks P_h. The coordinator adds the encrypted matrices S_h to obtain the encrypted global customer similarity matrix and sends it to every bank P_h;
Step S4: The bank-side customer network k-clique discovery module finds, at each bank P_h, all k-cliques in the overlapping customer network formed by the overlapping customers, obtaining the k-clique set;
Step S5: The bank-side customer network k-clique percolation module decrypts, at each bank P_h, the global customer similarity matrix received from the coordinator with the homomorphic private key. Each bank P_h then performs k-clique percolation using the decrypted global similarity matrix and the k-clique set, obtaining the clique graph;
Step S6: The bank customer group division module computes the connected components of the clique graph at each bank P_h. The set of nodes in each connected component is one bank customer group, and the set of connected components is the group division C of the overlapping customers X_h on the bank customer network G. The final group division result C of the bank customer network is output.
In one embodiment of the invention, step S1 specifically comprises the following steps:
Step S11: Randomly select a bank P_i (i ∈ h) to generate an RSA key pair and send the RSA public key to every other bank P_j (j ∈ h, j ≠ i);
Step S12: Bank P_i encrypts the customers V_i of its bank customer network G_i with the RSA public key, computes the intersection with each other bank P_j under privacy protection, and decrypts with the RSA private key to obtain X_{i,j};
Step S13: Bank P_i takes the common intersection of the pairwise intersection customers X_{i,j} to obtain the overlapping customer set X_i = ∪{X_{i,j}};
Step S14: Bank P_i sends the overlapping customer set X_i to the other banks P_j, so that every bank P_h obtains the overlapping customer set X_h = X_i.
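Steps S11 to S14 name RSA but do not spell out the intersection protocol. The sketch below uses the well-known RSA blind-signature private set intersection as one possible reading; this substitution is an assumption. The toy modulus built from two Mersenne primes, the helper names `h`, `h2` and `psi`, and the choice of SHA-256 are all illustrative. Both roles run in one process here, and the blinding party P_j learns the intersection, which it would then forward to the key holder P_i.

```python
import hashlib
import math
import secrets

# Toy RSA parameters built from two Mersenne primes (assumption: illustration
# only, far too small for real use). D is known only to the key holder P_i.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def h(customer_id: str) -> int:
    """Hash a customer ID into the integers modulo N."""
    digest = hashlib.sha256(customer_id.encode()).digest()
    return int.from_bytes(digest, "big") % N


def h2(value: int) -> str:
    """Second hash applied to RSA signatures before they are compared."""
    return hashlib.sha256(str(value).encode()).hexdigest()


def psi(ids_i: set, ids_j: set) -> set:
    """Return the customer IDs shared by P_i (key holder) and P_j."""
    # P_j blinds each hashed ID with a fresh random factor r^E mod N.
    blinded = {}
    for cid in ids_j:
        r = secrets.randbelow(N - 2) + 2
        while math.gcd(r, N) != 1:
            r = secrets.randbelow(N - 2) + 2
        blinded[cid] = (r, h(cid) * pow(r, E, N) % N)
    # P_i signs the blinded values without seeing the underlying IDs ...
    signed = {cid: pow(y, D, N) for cid, (_, y) in blinded.items()}
    # ... and publishes hashed signatures of its own customer IDs.
    tags_i = {h2(pow(h(cid), D, N)) for cid in ids_i}
    # P_j removes the blinding factor and keeps IDs whose tags match.
    return {
        cid
        for cid, (r, _) in blinded.items()
        if h2(signed[cid] * pow(r, -1, N) % N) in tags_i
    }
```

For three banks, following the prose of step S13 (common intersection), the pairwise results can be combined with `set.intersection(psi(ids_1, ids_2), psi(ids_1, ids_3))`, where `ids_1`, `ids_2` and `ids_3` are hypothetical ID sets.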
In one embodiment of the invention, step S2 specifically comprises the following steps:
Step S21: Bank P_i generates a key pair for the homomorphic encryption algorithm and sends the key pair to the other banks P_j;
Step S22: Each bank P_h computes the dimension |a_h| of its customer feature matrix A_h and encrypts |a_h| with the homomorphic public key to obtain the encrypted local customer feature matrix dimension E(|a_h|), which it sends to the coordinator; here E() is the encryption function;
Step S23: The coordinator receives the encrypted local customer feature matrix dimensions E(|a_h|) sent by the banks P_h;
Step S24: The coordinator adds the values E(|a_h|) to obtain the encrypted global customer feature matrix dimension;
Step S25: The coordinator sends the encrypted global customer feature matrix dimension to every bank P_h;
Step S26: Bank P_h receives the encrypted global dimension from the coordinator and decrypts it with the homomorphic private key to obtain the global customer feature matrix dimension; here D() is the decryption function;
Step S27: Bank P_h computes the local similarity matrix S_h of the overlapping customers X_h from the global customer feature matrix dimension and the customer feature matrix A_h; the similarity between overlapping customers is computed as shown in formula (4);
where a_i and a_j are the attribute vectors of overlapping customers v_i and v_j, ⊕ is the XOR operation, |a_i| is the length of the feature vector a_i, and s(a_i, a_j) is the similarity between customers v_i and v_j;
Step S28: Bank P_h encrypts S_h with the homomorphic public key to obtain E(S_h) and sends E(S_h) to the coordinator.
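Formula (4) itself is not reproduced in this text, so the following is only a sketch of one similarity that fits the surrounding definitions (XOR over attribute vectors, normalization by a feature dimension). It assumes binary features and counts the positions where two customers agree, divided by the global dimension, so that the local values from all banks add up to a normalized Hamming similarity over the full feature vector. The function name `local_similarity` and the dictionary layout are hypothetical.

```python
from itertools import combinations


def local_similarity(features, global_dim):
    """Pairwise local similarities between overlapping customers at one bank.

    features: hypothetical dict mapping customer ID -> list of 0/1 features
              (the rows of A_h restricted to the overlapping customers X_h).
    global_dim: decrypted global feature dimension (sum of all |a_h|).
    """
    sims = {}
    for u, v in combinations(sorted(features), 2):
        # x ^ y is 0 exactly where the two binary features agree.
        agreements = sum(1 for x, y in zip(features[u], features[v]) if x ^ y == 0)
        # Assumed normalization: divide by the global, not the local, dimension.
        sims[(u, v)] = agreements / global_dim
    return sims
```

With two banks that each hold two features, a pair of customers agreeing on one feature at the first bank and two at the second gets local values 0.25 and 0.5, which sum to the global similarity 0.75.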
In one embodiment of the invention, step S3 specifically comprises the following steps:
Step S31: The coordinator receives the encrypted local similarity matrices E(S_h) sent by the banks P_h;
Step S32: The coordinator adds the encrypted local similarity matrices E(S_h) to obtain the encrypted global similarity matrix;
Step S33: The coordinator sends the encrypted global similarity matrix to every bank P_h.
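The text does not name the homomorphic encryption algorithm. As an illustration of how a coordinator can add ciphertexts without decrypting them, the sketch below assumes the additively homomorphic Paillier scheme with a toy key and fixed-point encoding of similarities. The names `encrypt`, `decrypt`, `add_encrypted`, `aggregate` and the constant `SCALE` are hypothetical, and the key size is far too small for real use.

```python
import math
import secrets

# Toy Paillier key (assumption: illustration only). The banks share the
# private values LAM and MU; the coordinator only ever handles ciphertexts.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
N2 = N * N
LAM = math.lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)
SCALE = 10**6  # fixed-point scale used to encode similarities as integers


def encrypt(m: int) -> int:
    """Paillier encryption with generator g = N + 1."""
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Paillier decryption: L(c^LAM mod N^2) * MU mod N, with L(x) = (x - 1) // N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N2


def encrypt_matrix(matrix):
    """Bank side: encode each similarity as a scaled integer and encrypt it."""
    return [[encrypt(round(s * SCALE)) for s in row] for row in matrix]


def aggregate(encrypted_matrices):
    """Coordinator side: entry-wise homomorphic sum of the encrypted matrices."""
    total = encrypted_matrices[0]
    for mat in encrypted_matrices[1:]:
        total = [
            [add_encrypted(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(total, mat)
        ]
    return total


def decrypt_matrix(matrix):
    """Bank side: decrypt and undo the fixed-point scaling."""
    return [[decrypt(c) / SCALE for c in row] for row in matrix]
```

The same `add_encrypted` call covers the dimension aggregation in step S24: `decrypt(add_encrypted(encrypt(2), encrypt(3)))` yields 5.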
In one embodiment of the invention, step S4 specifically comprises the following steps:
Step S41: Bank P_h computes, on G_h, the overlapping customer network formed by the overlapping customer set X_h;
Step S42: Bank P_h uses a k-clique discovery algorithm to find the k-cliques in the overlapping customer network, obtaining the k-clique set; a k-clique is a sub-network of k overlapping customers in which every overlapping customer is associated with every other overlapping customer.
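The k-clique discovery algorithm is not specified, so the sketch below simply enumerates every k-subset of the overlapping customers and keeps those whose members are all pairwise linked. This brute-force approach is an assumption chosen for clarity and only suits small networks. The function names and the edge-list representation are hypothetical.

```python
from itertools import combinations


def overlapping_network(edges, overlap):
    """Step S41: keep only edges of G_h whose endpoints are both in X_h."""
    return {frozenset((u, v)) for u, v in edges if u in overlap and v in overlap}


def find_k_cliques(edges, overlap, k):
    """Step S42: list all k-cliques of the overlapping customer network."""
    linked = overlapping_network(edges, overlap)
    return [
        candidate
        for candidate in combinations(sorted(overlap), k)
        # A k-clique requires every pair of its members to be linked.
        if all(frozenset(pair) in linked for pair in combinations(candidate, 2))
    ]
```

For a network with edges a-b, a-c, b-c, c-d and d-e and overlap set {a, b, c, d}, `find_k_cliques(..., k=3)` returns the single triangle `("a", "b", "c")`.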
In one embodiment of the invention, step S5 specifically comprises the following steps:
Step S51: Bank P_h decrypts the encrypted global similarity matrix with the homomorphic private key to obtain the global similarity matrix;
Step S52: Bank P_h takes each k-clique in the k-clique set as a node and constructs the clique graph;
Step S53: Bank P_h computes the similarity between each pair of k-cliques from the decrypted global similarity matrix; if the similarity exceeds the set threshold α, an edge between the two k-cliques is added to the clique graph. The threshold α is set to 0.8, and the similarity between two k-cliques is computed as shown in formula (5);
where v_i and v_j are customers in the overlapping customer network, C_p and C_q are k-cliques, s(v_i, v_j) is the similarity between customers v_i and v_j, s(C_p, C_q) is the similarity between k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if customers v_i and v_j are associated in the overlapping customer network and 0 otherwise.
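Formula (5) is not reproduced in this text, so the clique similarity below is a stand-in rather than the patented measure. It averages s(v_i, v_j)·Ind(v_i, v_j) over all cross pairs of distinct customers, which is an assumption that merely uses the same ingredients. The thresholding with α = 0.8 follows step S53. The clique similarity is passed in as a function so that the actual formula can be substituted; all names are hypothetical.

```python
from itertools import combinations

ALPHA = 0.8  # threshold stated in step S53


def placeholder_clique_similarity(cp, cq, sim, linked):
    """Stand-in for formula (5): mean of s(vi, vj) * Ind(vi, vj) over cross pairs.

    sim: dict mapping frozenset({vi, vj}) -> decrypted global similarity.
    linked: set of frozenset({vi, vj}) pairs that are associated (Ind = 1).
    """
    pairs = [(u, v) for u in cp for v in cq if u != v]
    if not pairs:
        return 0.0
    score = sum(sim[frozenset((u, v))] for u, v in pairs if frozenset((u, v)) in linked)
    return score / len(pairs)


def build_clique_graph(cliques, clique_sim, alpha=ALPHA):
    """Steps S52 and S53: nodes are clique indices; add an edge when similarity > alpha."""
    edges = set()
    for p, q in combinations(range(len(cliques)), 2):
        if clique_sim(cliques[p], cliques[q]) > alpha:
            edges.add((p, q))
    return edges
```

Keeping the similarity as a parameter means only `placeholder_clique_similarity` needs replacing once the exact formula is available.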
In one embodiment of the invention, step S6 specifically comprises the following steps:
Step S61: Bank P_h computes the connected components of the clique graph;
Step S62: Bank P_h merges all customers in each connected component into one customer group, obtaining the bank customer group set C;
Step S63: Bank P_h writes the customers v_{i,j} of each group C_i in the bank customer group set C as a row vector R_i = (v_{i,j});
Step S64: Output the vector set {R_i}, 0<i<m, where m is the number of customer groups; each row represents one customer group.
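Steps S61 to S64 can be sketched with a union-find structure over the clique graph. The representation (cliques as tuples, clique-graph edges as index pairs) and the function name `customer_groups` are assumptions made for illustration.

```python
def customer_groups(cliques, clique_edges):
    """Merge the customers of each connected component of the clique graph.

    cliques: list of tuples of customer IDs (the nodes of the clique graph).
    clique_edges: iterable of (p, q) index pairs (the edges of the clique graph).
    Returns one sorted row of customer IDs per group, as in steps S63 and S64.
    """
    parent = list(range(len(cliques)))

    def find(x):
        # Follow parent links to the component root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in clique_edges:
        parent[find(p)] = find(q)

    groups = {}
    for index, clique in enumerate(cliques):
        groups.setdefault(find(index), set()).update(clique)
    return sorted(sorted(group) for group in groups.values())
```

With cliques `("a", "b", "c")`, `("b", "c", "d")`, `("x", "y", "z")` and a single clique-graph edge `(0, 1)`, the result is two rows: `["a", "b", "c", "d"]` and `["x", "y", "z"]`.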
The invention also provides a computer-readable storage medium storing computer program instructions executable by a processor; when the processor runs the instructions, the method steps described above are carried out.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention proposes a bank customer group mining method based on federated clique percolation that mines the relationships between bank customers more accurately while protecting their privacy.
(2) The invention proposes a new clique similarity measure that considers both network topology and feature information, improving the accuracy of dividing bank customer groups by clique percolation.
Description of the Drawings
FIG. 1 is a flowchart of an embodiment of the invention.
Detailed Description
The technical solution of the invention is described in detail below with reference to the accompanying drawings.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the present application. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which this application belongs.
It should also be noted that the terminology used here is only for describing specific embodiments and is not intended to limit the exemplary embodiments of the present application. As used here, the singular forms are intended to include the plural forms unless the context clearly indicates otherwise. Furthermore, the terms "comprising" and "including", when used in this specification, indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.
As shown in FIG. 1, this embodiment provides a bank customer group mining method based on federated clique percolation. A system is provided that comprises a bank-side overlapping customer identification module, a bank-side customer similarity calculation module, a coordinator-side customer similarity aggregation module, a bank-side customer network k-clique discovery module, a bank-side customer network k-clique percolation module, and a bank-side customer group division module. The system mines bank customer groups according to the following steps:
Step S1: The bank-side overlapping customer identification module reads, at each bank P_h, the bank customer network G(V, E, R, A), where V is the customer set, E is the edge set, R is the feature set, and A is the customer feature matrix. One bank is selected at random to generate an RSA key pair and send the RSA public key to all other banks. The selected bank encrypts its customer IDs with the RSA public key and computes the intersection with the customer IDs that each other bank has encrypted with the same public key. The selected bank then takes the common intersection of these pairwise intersections to obtain the overlapping customer set and sends it to the other banks, so that every bank P_h holds the common overlapping customer set X_h;
Step S2: The bank-side customer similarity calculation module selects one bank at random to generate a key pair for a homomorphic encryption algorithm and send the key pair to all other banks. Each bank P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the homomorphic public key, and sends it to the coordinator. Here a_h is a customer's feature vector, and the coordinator is the data aggregator that aggregates the encrypted data sent by the banks P_h. The coordinator receives the encrypted dimensions |a_h| from all banks, adds them in encrypted form to obtain the global customer feature matrix dimension, and sends the result to every bank P_h. Each bank P_h decrypts the global dimension with the homomorphic private key and computes the local similarity between overlapping customers from the global dimension and A_h, obtaining the local similarity matrix S_h of the overlapping customers. Each bank P_h then encrypts S_h with the homomorphic public key and sends the encrypted S_h to the coordinator;
Step S3: The coordinator-side customer similarity aggregation module receives, at the coordinator, the encrypted local similarity matrices S_h sent by the banks P_h. The coordinator adds the encrypted matrices S_h to obtain the encrypted global customer similarity matrix and sends it to every bank P_h;
Step S4: The bank-side customer network k-clique discovery module finds, at each bank P_h, all k-cliques in the overlapping customer network formed by the overlapping customers, obtaining the k-clique set;
Step S5: The bank-side customer network k-clique percolation module decrypts, at each bank P_h, the global customer similarity matrix received from the coordinator with the homomorphic private key. Each bank P_h then performs k-clique percolation using the decrypted global similarity matrix and the k-clique set, obtaining the clique graph;
Step S6: The bank customer group division module computes the connected components of the clique graph at each bank P_h. The set of nodes in each connected component is one bank customer group, and the set of connected components is the group division C of the overlapping customers X_h on the bank customer network G. The final group division result C of the bank customer network is output.
Further, step S1 specifically comprises the following steps:
Step S11: Randomly select a bank P_i (i ∈ h) to generate an RSA key pair and send the RSA public key to every other bank P_j (j ∈ h, j ≠ i);
Step S12: Bank P_i encrypts the customers V_i of its bank customer network G_i with the RSA public key, computes the intersection with each other bank P_j under privacy protection, and decrypts with the RSA private key to obtain X_{i,j};
Step S13: Bank P_i takes the common intersection of the pairwise intersection customers X_{i,j} to obtain the overlapping customer set X_i = ∪{X_{i,j}};
Step S14: Bank P_i sends the overlapping customer set X_i to the other banks P_j, so that every bank P_h obtains the overlapping customer set X_h = X_i.
Further, step S2 specifically comprises the following steps:
Step S21: Bank P_i generates a key pair for the homomorphic encryption algorithm and sends the key pair to the other banks P_j;
Step S22: Each bank P_h computes the dimension |a_h| of its customer feature matrix A_h and encrypts |a_h| with the homomorphic public key to obtain the encrypted local customer feature matrix dimension E(|a_h|), which it sends to the coordinator; here E() is the encryption function;
Step S23: The coordinator receives the encrypted local customer feature matrix dimensions E(|a_h|) sent by the banks P_h;
Step S24: The coordinator adds the values E(|a_h|) to obtain the encrypted global customer feature matrix dimension;
Step S25: The coordinator sends the encrypted global customer feature matrix dimension to every bank P_h;
Step S26: Bank P_h receives the encrypted global dimension from the coordinator and decrypts it with the homomorphic private key to obtain the global customer feature matrix dimension; here D() is the decryption function;
Step S27: Bank P_h computes the local similarity matrix S_h of the overlapping customers X_h from the global customer feature matrix dimension and the customer feature matrix A_h; the similarity between overlapping customers is computed as shown in formula (7);
where a_i and a_j are the attribute vectors of overlapping customers v_i and v_j, ⊕ is the XOR operation, |a_i| is the length of the feature vector a_i, and s(a_i, a_j) is the similarity between customers v_i and v_j;
Step S28: Bank P_h encrypts S_h with the homomorphic public key to obtain E(S_h) and sends E(S_h) to the coordinator.
Further, step S3 specifically comprises the following steps:
Step S31: The coordinator receives the encrypted local similarity matrices E(S_h) sent by the banks P_h;
Step S32: The coordinator adds the encrypted local similarity matrices E(S_h) to obtain the encrypted global similarity matrix;
Step S33: The coordinator sends the encrypted global similarity matrix to every bank P_h.
Further, step S4 specifically comprises the following steps:
Step S41: Bank P_h computes, on G_h, the overlapping customer network formed by the overlapping customer set X_h;
Step S42: Bank P_h uses a k-clique discovery algorithm to find the k-cliques in the overlapping customer network, obtaining the k-clique set. A k-clique is a sub-network of k overlapping customers in which every overlapping customer is associated with every other overlapping customer.
Further, step S5 specifically comprises the following steps:
Step S51: Bank P_h decrypts the encrypted global similarity matrix with the homomorphic private key to obtain the global similarity matrix;
Step S52: Bank P_h takes each k-clique in the k-clique set as a node and constructs the clique graph;
Step S53: Bank P_h computes the similarity between each pair of k-cliques from the decrypted global similarity matrix; if the similarity exceeds the set threshold α, an edge between the two k-cliques is added to the clique graph. The threshold α is set to 0.8, and the similarity between two k-cliques is computed as shown in formula (8);
where v_i and v_j are customers in the overlapping customer network, C_p and C_q are k-cliques, s(v_i, v_j) is the similarity between customers v_i and v_j, s(C_p, C_q) is the similarity between k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if customers v_i and v_j are associated in the overlapping customer network and 0 otherwise.
Further, step S6 specifically includes the following steps:
Step S61: The bank end Ph computes the connected components of the clique graph;
Step S62: The bank end Ph merges all customers of each connected component into one customer group, obtaining the bank customer group set C;
Step S63: The bank end Ph writes the customers vi,j of each group Ci in the bank customer group set C as a row vector Ri = (vi,j);
Step S64: Output the vector set {Ri}, 0<i<m, where m is the number of customer groups and each row represents one customer group.
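Steps S61 to S64 amount to finding connected components of the clique graph and taking the union of the customers in each component. A minimal sketch follows, assuming the clique graph is given as a list of cliques plus a set of index pairs; the name `merge_groups` and the use of a union-find structure are choices made for this illustration, not something the description prescribes.

```python
def merge_groups(cliques, edges):
    """Merge the customers of each connected component into one row vector.

    cliques: list of customer sets (the nodes of the clique graph).
    edges:   iterable of (p, q) index pairs (the edges of the clique graph).
    """
    parent = list(range(len(cliques)))

    def find(x):
        # Follow parent links to the component root, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # S61: connected components via union-find.
    for p, q in edges:
        parent[find(p)] = find(q)

    # S62: union of all customers belonging to the same component.
    groups = {}
    for index, clique in enumerate(cliques):
        groups.setdefault(find(index), set()).update(clique)

    # S63/S64: one sorted row vector per customer group.
    return [sorted(members) for members in groups.values()]


rows = merge_groups(
    [frozenset("abc"), frozenset("bcd"), frozenset("xyz")],
    {(0, 1)},
)
```

Here the first two cliques are linked and collapse into the group (a, b, c, d), while the isolated clique (x, y, z) remains a group of its own.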
As will be appreciated by those skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, which implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present invention and do not limit the present invention in any other form. Any person skilled in the art may use the technical content disclosed above to change or modify it into an equivalent embodiment. However, any simple modification, equivalent change, or variation made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the protection scope of the technical solution of the present invention.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110380531.8A CN113159918B (en) | 2021-04-09 | 2021-04-09 | A Bank Customer Group Mining Method Based on Federal Regiment Penetration |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113159918A | 2021-07-23 |
CN113159918B | 2022-06-07 |
Family
ID=76889211
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110380531.8A, Expired - Fee Related, CN113159918B (en) | A Bank Customer Group Mining Method Based on Federal Regiment Penetration | 2021-04-09 | 2021-04-09 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113159918B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114387064B (en) * | 2022-01-13 | 2024-07-19 | 福州大学 | Electronic commerce platform potential customer recommendation method and system based on comprehensive similarity |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110572253A (en) * | 2019-09-16 | 2019-12-13 | 济南大学 | A method and system for enhancing the privacy of federated learning training data |
CN111309788A (en) * | 2020-03-08 | 2020-06-19 | 山西大学 | Community structure discovery method and system for bank customer transaction network |
CN111666460A (en) * | 2020-05-27 | 2020-09-15 | 中国平安财产保险股份有限公司 | User portrait generation method and device based on privacy protection and storage medium |
CN111967910A (en) * | 2020-08-18 | 2020-11-20 | 中国银行股份有限公司 | User passenger group classification method and device |
CN112199702A (en) * | 2020-10-16 | 2021-01-08 | 鹏城实验室 | Privacy protection method, storage medium and system based on federal learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9369273B2 (en) * | 2014-02-26 | 2016-06-14 | Raytheon Bbn Technologies Corp. | System and method for mixing VoIP streaming data for encrypted processing |
Non-Patent Citations (1)
Title |
---|
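A privacy-preserving distributed association rule mining method; Gui Qiong et al.; 《微电子学与计算机》 (Microelectronics & Computer); 2009-09-05 (No. 09); full text * |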
Also Published As
Publication number | Publication date |
---|---|
CN113159918A (en) | 2021-07-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220607 |