CN113159918B - A Bank Customer Group Mining Method Based on Federated Clique Percolation - Google Patents

Info

Publication number
CN113159918B
Authority
CN
China
Prior art keywords
bank
client
group
customer
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202110380531.8A
Other languages
Chinese (zh)
Other versions
CN113159918A (en
Inventor
郭昆
魏明洋
郭文忠
刘西蒙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202110380531.8A
Publication of CN113159918A
Application granted
Publication of CN113159918B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a bank customer group mining method based on federated clique percolation, in which a coordinator acts as the data aggregator. The method comprises the following steps: the banks run a privacy-preserving intersection over their customer networks to obtain the overlapping customers; each bank computes a local customer similarity matrix, encrypts it with homomorphic encryption, and sends it to the coordinator; the coordinator aggregates the local similarity matrices into an encrypted global similarity matrix and sends it to each bank; each bank finds all k-cliques in the overlapping customer network formed by the overlapping customers, obtaining a k-clique set; each bank performs k-clique percolation on the overlapping customer network using the decrypted global similarity matrix and the k-clique set; and each bank partitions the bank customer groups according to the k-clique percolation result and outputs the mining result. The invention combines the customer information of multiple banks to improve the accuracy of bank customer group mining while protecting the privacy of bank customer data.

Description

A Bank Customer Group Mining Method Based on Federated Clique Percolation

Technical Field

The invention relates to the technical field of federated learning, and in particular to a bank customer group mining method based on federated clique percolation.

Background

Banks need to understand customer behavior and mine value around customer needs, which in turn means they need more customer information to support accurate analysis. In recent years Internet privacy breaches have occurred one after another, drawing growing attention from users, and governments have placed increasing emphasis on network security. The European Union's General Data Protection Regulation (GDPR), which took effect in 2018, protects users' data privacy, and many countries, including China and the United States, have likewise enacted and refined privacy protection regulations that penalize privacy breaches. Federated learning is a decentralized, privacy-preserving distributed machine learning framework proposed by Google. It uses decentralized distributed computation to process large-scale data in parallel, and relies on local computation and encrypted transmission to ensure that each bank's secret data is not leaked during computation. Studying a bank customer group mining method based on federated clique percolation is therefore of significant value. Mining customer groups from the combined customer information of multiple banks, while protecting customer data privacy, complies with privacy protection law and makes full use of the customer data the banks hold, so that customer groups are mined more accurately. The resulting groups can then be used to build high-quality customer profiles, deliver targeted advertising, and detect financial crime.

Summary of the Invention

The purpose of the present invention is to provide a bank customer group mining method based on federated clique percolation that partitions bank customer groups more accurately while protecting the privacy of bank customers.

To achieve this purpose, the technical solution of the present invention is a bank customer group mining method based on federated clique percolation. A system is provided that comprises a bank-side overlapping customer identification module, a bank-side customer similarity calculation module, a coordinator-side customer similarity aggregation module, a bank-side customer network k-clique discovery module, a bank-side customer network k-clique percolation module, and a bank-side customer group partition module. The system mines bank customer groups in the following steps:

Step S1: At each bank P_h, the bank-side overlapping customer identification module reads the bank customer network G(V, E, R, A), where V is the customer set, E is the edge set, R is the feature set, and A is the customer feature matrix. One bank is selected at random to generate an RSA key pair and send the RSA public key to all other banks. The selected bank encrypts its customer IDs with the RSA public key and computes the intersection with the customer IDs that each other bank has encrypted with the same public key. The selected bank then takes the common intersection of the pairwise intersections to obtain the overlapping customer set and sends it to the other banks, so that each bank P_h holds the common overlapping customer set X_h;

Step S2: The bank-side customer similarity calculation module selects one bank at random to generate a key pair for the homomorphic encryption algorithm and send the key pair to all other banks. Each bank P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the homomorphic public key, and sends it to the coordinator. Here a_h is a customer's feature vector, and the coordinator is the data aggregator that aggregates the encrypted data sent by the banks P_h. The coordinator receives the encrypted dimensions |a_h| from the banks, adds them in encrypted form to obtain the global customer feature matrix dimension, and sends the result to each bank P_h. Each bank P_h decrypts the global dimension with the homomorphic private key and, from the global dimension and its customer feature matrix A_h, computes the local similarity between overlapping customers, obtaining the local similarity matrix S_h of the overlapping customers. Each bank P_h then encrypts S_h with the homomorphic public key and sends it to the coordinator;

Step S3: At the coordinator, the coordinator-side customer similarity aggregation module receives the local customer similarity matrices S_h sent by the banks P_h. The coordinator adds the encrypted S_h to obtain the encrypted global customer similarity matrix and sends it to each bank P_h;

Step S4: At each bank P_h, the bank-side customer network k-clique discovery module finds all k-cliques in the overlapping customer network formed by the overlapping customers, obtaining a k-clique set;

Step S5: At each bank P_h, the bank-side customer network k-clique percolation module decrypts the global customer similarity matrix received from the coordinator with the homomorphic private key. Each bank P_h then performs k-clique percolation using the decrypted global similarity matrix and the k-clique set, obtaining a clique graph;

Step S6: The bank customer group partition module computes the connected components of the clique graph at each bank P_h. The node set within each connected component is one bank customer group, and the set of connected components is the group partition C of the overlapping customers X_h on the bank customer network G. The final group partition result C of the bank customer network is output.

In an embodiment of the present invention, step S1 specifically comprises the following steps:

Step S11: A bank P_i (i ∈ h) is selected at random to generate an RSA key pair and send the RSA public key to the other banks P_j (j ∈ h, j ≠ i);

Step S12: Bank P_i encrypts the customers V_i of its bank customer network G_i with the RSA public key, computes the privacy-preserving intersection with each other bank P_j, and decrypts the result with the RSA private key to obtain X_{i,j};

Step S13: Bank P_i takes the common intersection customers of the pairwise intersections X_{i,j}, obtaining the overlapping customer set X_i = ∪{X_{i,j}};

Step S14: Bank P_i sends the overlapping customer set X_i to the other banks P_j, so that every bank P_h holds the overlapping customer set X_h = X_i.
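The text does not name the private set intersection protocol behind step S12. One common way to realize an RSA-based intersection is a blind-signature exchange, sketched below for two banks. Everything here is an assumption made for illustration: the function names, the SHA-256 hashing, the toy key built from two Mersenne primes (far too small for real use), and the fact that the bank without the private key is the one that learns the intersection.

```python
import hashlib
import secrets

# Toy RSA parameters built from two Mersenne primes; illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held only by the key owner


def h_to_int(customer_id: str) -> int:
    """Hash a customer ID to an integer modulo N."""
    digest = hashlib.sha256(customer_id.encode()).digest()
    return int.from_bytes(digest, "big") % N


def tag(signature: int) -> str:
    """Second hash, so raw RSA signatures are never compared directly."""
    return hashlib.sha256(str(signature).encode()).hexdigest()


def private_intersection(owner_ids, other_ids):
    """Return the IDs in `other_ids` that also appear in `owner_ids`.

    `owner_ids` belong to the bank holding the private key D;
    `other_ids` belong to a bank that only knows the public key (N, E).
    """
    # Other bank: blind each hashed ID with a random factor r^E.
    blinds = [secrets.randbelow(N - 2) + 2 for _ in other_ids]
    blinded = [h_to_int(x) * pow(r, E, N) % N for x, r in zip(other_ids, blinds)]

    # Key owner: sign the blinded values and tag its own signed IDs.
    signed_blinded = [pow(b, D, N) for b in blinded]
    owner_tags = {tag(pow(h_to_int(y), D, N)) for y in owner_ids}

    # Other bank: remove the blinding factor and compare tags.
    unblinded = [s * pow(r, -1, N) % N for s, r in zip(signed_blinded, blinds)]
    return {x for x, s in zip(other_ids, unblinded) if tag(s) in owner_tags}
```

In this sketch only blinded values and hashed signatures cross the boundary between the two parties, so neither side sees the other's non-overlapping customer IDs in the clear.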

In an embodiment of the present invention, step S2 specifically comprises the following steps:

Step S21: Bank P_i generates a key pair for the homomorphic encryption algorithm and sends the key pair to the other banks P_j;

Step S22: Each bank P_h computes the dimension |a_h| of its customer feature matrix A_h and encrypts |a_h| with the homomorphic public key, obtaining the encrypted local customer feature matrix dimension E(|a_h|), which it sends to the coordinator; here E() is the encryption function;

Step S23: The coordinator receives the encrypted local customer feature matrix dimensions E(|a_h|) sent by the banks P_h;

Step S24: The coordinator adds the E(|a_h|) to obtain the encrypted global customer feature matrix dimension, Σ_h E(|a_h|);

Step S25: The coordinator sends the encrypted global customer feature matrix dimension to each bank P_h;

Step S26: Bank P_h receives the encrypted global dimension from the coordinator and decrypts it with the homomorphic private key to obtain the global customer feature matrix dimension; here D() is the decryption function;

Step S27: From the global customer feature matrix dimension and its customer feature matrix A_h, bank P_h computes the local similarity matrix S_h of the overlapping customers X_h; the similarity between overlapping customers is computed as in formula (4):

[Formula (4): image BDA0003012635790000037]

where a_i and a_j are the attribute vectors of overlapping customers v_i and v_j, ⊕ is the XOR operation, |a_i| is the length of the feature vector a_i, and s(a_i, a_j) is the similarity of customers v_i and v_j;

Step S28: Bank P_h encrypts S_h with the homomorphic public key to obtain E(S_h) and sends E(S_h) to the coordinator.
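Formula (4) survives only as an image reference, so its exact form is not recoverable here. The sketch below shows one form that is consistent with the surrounding description, under two loudly stated assumptions: feature vectors are binary, and a bank's local similarity is the number of features on which two customers agree (counted with XOR) divided by the global feature dimension, so that the local matrices of all banks add up to the overall fraction of matching features. The function and parameter names are hypothetical.

```python
from typing import Dict, List


def local_similarity(a_i: List[int], a_j: List[int], global_dim: int) -> float:
    """Assumed form: matching binary features at this bank / global feature count."""
    if len(a_i) != len(a_j):
        raise ValueError("feature vectors must have the same length")
    mismatches = sum(x ^ y for x, y in zip(a_i, a_j))  # XOR marks differing bits
    return (len(a_i) - mismatches) / global_dim


def local_similarity_matrix(features: Dict[str, List[int]],
                            overlap: List[str],
                            global_dim: int) -> List[List[float]]:
    """Build S_h over the overlapping customers, in the order given by `overlap`."""
    return [[local_similarity(features[u], features[v], global_dim)
             for v in overlap] for u in overlap]
```

For example, with local vectors `[1, 0, 1]` and `[1, 1, 1]` and a global dimension of 5, two of the three local features agree, so this bank contributes 2/5 to the pair's global similarity.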

In an embodiment of the present invention, step S3 specifically comprises the following steps:

Step S31: The coordinator receives the encrypted local similarity matrices E(S_h) sent by the banks P_h;

Step S32: The coordinator adds the encrypted local similarity matrices E(S_h) to obtain the encrypted global similarity matrix, Σ_h E(S_h);

Step S33: The coordinator sends the encrypted global similarity matrix to each bank P_h.
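The text does not say which homomorphic encryption algorithm is used. Paillier is a widely used additively homomorphic scheme, and the following minimal sketch uses it to illustrate how a coordinator could add encrypted similarity matrices without seeing their contents. The choice of Paillier, the toy key built from two Mersenne primes (not secure), the fixed-point `SCALE`, and all function names are assumptions for illustration.

```python
import secrets

# Toy Paillier key from two Mersenne primes; illustration only, not secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)   # decryption constant
SCALE = 10**6          # fixed-point scale for similarity values


def encrypt(m: int) -> int:
    """Encrypt a non-negative integer m < N with generator g = N + 1."""
    r = secrets.randbelow(N - 1) + 1
    return (1 + m * N) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Recover the plaintext integer from a ciphertext."""
    return (pow(c, PHI, N2) - 1) // N * MU % N


def encrypt_matrix(s):
    """Bank side: encode each similarity as a scaled integer, then encrypt it."""
    return [[encrypt(round(x * SCALE)) for x in row] for row in s]


def aggregate(encrypted_matrices):
    """Coordinator: multiply ciphertexts entrywise, which adds the plaintexts."""
    total = encrypted_matrices[0]
    for mat in encrypted_matrices[1:]:
        total = [[a * b % N2 for a, b in zip(ra, rb)]
                 for ra, rb in zip(total, mat)]
    return total


def decrypt_matrix(enc):
    """Bank side: decrypt and undo the fixed-point scaling."""
    return [[decrypt(c) / SCALE for c in row] for row in enc]
```

The coordinator only ever calls `aggregate`, which needs no key material; the banks, which share the key pair in the described method, call `encrypt_matrix` and `decrypt_matrix`.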

In an embodiment of the present invention, step S4 specifically comprises the following steps:

Step S41: Bank P_h computes, on G_h, the overlapping customer network formed by the overlapping customer set X_h;

Step S42: Bank P_h uses a k-clique discovery algorithm to find the k-cliques in the overlapping customer network, obtaining the k-clique set. A k-clique is a sub-network of k overlapping customers in which every overlapping customer is associated with every other overlapping customer.
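The k-clique discovery algorithm is not specified in the text. As a minimal sketch of the definition in step S42, the brute-force enumeration below checks every set of k customers for pairwise association. The adjacency-dictionary representation and the function name are assumptions, and the exhaustive search is only practical for small networks.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def find_k_cliques(adj: Dict[str, Set[str]], k: int) -> List[FrozenSet[str]]:
    """Return every set of k customers that are pairwise connected."""
    cliques = []
    for nodes in combinations(sorted(adj), k):
        # A k-clique requires an edge between every pair of its members.
        if all(v in adj[u] for u, v in combinations(nodes, 2)):
            cliques.append(frozenset(nodes))
    return cliques
```

Here `adj` maps each overlapping customer to the set of customers it is associated with, and is expected to be symmetric.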

In an embodiment of the present invention, step S5 specifically comprises the following steps:

Step S51: Bank P_h decrypts the encrypted global similarity matrix with the homomorphic private key to obtain the global similarity matrix;

Step S52: Bank P_h takes each k-clique in the k-clique set as a node and constructs the clique graph;

Step S53: Using the decrypted global similarity matrix, bank P_h computes the similarity between each pair of k-cliques; if the similarity is greater than the preset threshold α, an edge between the two k-cliques is added to the clique graph. The threshold α is set to 0.8, and the similarity between two k-cliques is computed as in formula (5):

[Formula (5): images BDA0003012635790000041 and BDA0003012635790000042]

where v_i and v_j are customers in the overlapping customer network, C_p and C_q are k-cliques, s(v_i, v_j) is the similarity of customers v_i and v_j, s(C_p, C_q) is the similarity of k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if customers v_i and v_j are associated in the overlapping customer network and 0 otherwise.
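Formula (5) is available only as image references, so the clique similarity below is a stand-in, not the patented measure. As an assumption that combines the two ingredients the text names, it averages s(v_i, v_j) · Ind(v_i, v_j) over all pairs of distinct customers drawn from the two cliques. The sketch then adds an edge between two cliques whenever that value exceeds the threshold α. All names and data layouts are hypothetical.

```python
from itertools import combinations


def clique_similarity(cp, cq, sim, adj):
    """Assumed measure: mean of s(vi, vj) * Ind(vi, vj) over cross pairs."""
    pairs = [(u, v) for u in cp for v in cq if u != v]
    if not pairs:
        return 0.0
    # Ind(vi, vj) is 1 only when the two customers are associated.
    total = sum(sim[u][v] for u, v in pairs if v in adj[u])
    return total / len(pairs)


def build_clique_graph(cliques, sim, adj, alpha=0.8):
    """Nodes are clique indices; an edge joins cliques with similarity > alpha."""
    edges = set()
    for p, q in combinations(range(len(cliques)), 2):
        if clique_similarity(cliques[p], cliques[q], sim, adj) > alpha:
            edges.add((p, q))
    return edges
```

Here `sim[u][v]` is the decrypted global similarity of customers u and v, and `adj` maps each customer to its associated customers. Because the measure mixes topology (through Ind) with feature similarity, two cliques that share many members but whose customers have dissimilar features can still fall below the threshold.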

In an embodiment of the present invention, step S6 specifically comprises the following steps:

Step S61: Bank P_h computes the connected components of the clique graph;

Step S62: Bank P_h merges all customers of each connected component into one customer group, obtaining the bank customer group set C;

Step S63: Bank P_h writes the customers v_{i,j} of each group C_i in the bank customer group set C as a row vector R_i = (v_{i,j});

Step S64: The vector set {R_i} is output, 0 < i < m, where m is the number of customer groups and each row represents one customer group.
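Steps S61 to S64 amount to a connected-components computation followed by a merge of clique members. A minimal sketch using union-find is shown below; the input layout (a list of cliques plus a set of index pairs for clique-graph edges) and the function name are assumptions made for illustration.

```python
def customer_groups(cliques, edges):
    """Merge the customers of each connected component of the clique graph."""
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union the two cliques at the ends of every clique-graph edge.
    for p, q in edges:
        parent[find(p)] = find(q)

    groups = {}
    for idx, clique in enumerate(cliques):
        groups.setdefault(find(idx), set()).update(clique)
    # One row per group, customers sorted for a stable output.
    return sorted(sorted(g) for g in groups.values())
```

Each returned row plays the role of one vector R_i. A customer that belongs to cliques in different components appears in more than one row, so the groups may overlap.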

The present invention also provides a computer-readable storage medium storing computer program instructions executable by a processor; when the processor executes the computer program instructions, the method steps described above are implemented.

Compared with the prior art, the present invention has the following beneficial effects:

(1) The invention proposes a bank customer group mining method based on federated clique percolation that mines the relationships between bank customers more accurately while protecting their privacy.

(2) The invention proposes a new clique similarity measure that takes both network topology and feature information into account, improving the accuracy of partitioning bank customer groups by clique percolation.

Description of Drawings

FIG. 1 is a flowchart of an embodiment of the present invention.

Detailed Description

The technical solution of the present invention is described in detail below with reference to the accompanying drawing.

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the present application. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.

It should also be noted that the terminology used herein is for describing specific embodiments only and is not intended to limit the exemplary embodiments of the present application. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.

As shown in FIG. 1, this embodiment provides a bank customer group mining method based on federated clique percolation. A system is provided that comprises a bank-side overlapping customer identification module, a bank-side customer similarity calculation module, a coordinator-side customer similarity aggregation module, a bank-side customer network k-clique discovery module, a bank-side customer network k-clique percolation module, and a bank-side customer group partition module. The system mines bank customer groups in the following steps:

Step S1: At each bank P_h, the bank-side overlapping customer identification module reads the bank customer network G(V, E, R, A), where V is the customer set, E is the edge set, R is the feature set, and A is the customer feature matrix. One bank is selected at random to generate an RSA key pair and send the RSA public key to all other banks. The selected bank encrypts its customer IDs with the RSA public key and computes the intersection with the customer IDs that each other bank has encrypted with the same public key. The selected bank then takes the common intersection of the pairwise intersections to obtain the overlapping customer set and sends it to the other banks, so that each bank P_h holds the common overlapping customer set X_h;

Step S2: The bank-side customer similarity calculation module selects one bank at random to generate a key pair for the homomorphic encryption algorithm and send the key pair to all other banks. Each bank P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the homomorphic public key, and sends it to the coordinator. Here a_h is a customer's feature vector, and the coordinator is the data aggregator that aggregates the encrypted data sent by the banks P_h. The coordinator receives the encrypted dimensions |a_h| from the banks, adds them in encrypted form to obtain the global customer feature matrix dimension, and sends the result to each bank P_h. Each bank P_h decrypts the global dimension with the homomorphic private key and, from the global dimension and its customer feature matrix A_h, computes the local similarity between overlapping customers, obtaining the local similarity matrix S_h of the overlapping customers. Each bank P_h then encrypts S_h with the homomorphic public key and sends it to the coordinator;

Step S3: At the coordinator, the coordinator-side customer similarity aggregation module receives the local customer similarity matrices S_h sent by the banks P_h. The coordinator adds the encrypted S_h to obtain the encrypted global customer similarity matrix and sends it to each bank P_h;

Step S4: At each bank P_h, the bank-side customer network k-clique discovery module finds all k-cliques in the overlapping customer network formed by the overlapping customers, obtaining a k-clique set;

Step S5: At each bank P_h, the bank-side customer network k-clique percolation module decrypts the global customer similarity matrix received from the coordinator with the homomorphic private key. Each bank P_h then performs k-clique percolation using the decrypted global similarity matrix and the k-clique set, obtaining a clique graph;

Step S6: The bank customer group partition module computes the connected components of the clique graph at each bank P_h. The node set within each connected component is one bank customer group, and the set of connected components is the group partition C of the overlapping customers X_h on the bank customer network G. The final group partition result C of the bank customer network is output.

Further, step S1 specifically comprises the following steps:

Step S11: A bank P_i (i ∈ h) is selected at random to generate an RSA key pair and send the RSA public key to the other banks P_j (j ∈ h, j ≠ i);

Step S12: Bank P_i encrypts the customers V_i of its bank customer network G_i with the RSA public key, computes the privacy-preserving intersection with each other bank P_j, and decrypts the result with the RSA private key to obtain X_{i,j};

Step S13: Bank P_i takes the common intersection customers of the pairwise intersections X_{i,j}, obtaining the overlapping customer set X_i = ∪{X_{i,j}};

Step S14: Bank P_i sends the overlapping customer set X_i to the other banks P_j, so that every bank P_h holds the overlapping customer set X_h = X_i.

Further, step S2 specifically includes the following steps:

Step S21: bank end P_i generates a key pair for the homomorphic encryption algorithm and sends the key pair to the other bank ends P_j;

Step S22: each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm to obtain the encrypted local customer feature matrix dimension E(|a_h|), and sends E(|a_h|) to the coordinator, where E() is the encryption function;

Step S23: the coordinator receives the encrypted local customer feature matrix dimensions E(|a_h|) sent by the bank ends P_h;

Step S24: the coordinator adds the E(|a_h|) together to obtain the global customer feature matrix dimension (Figure BDA0003012635790000062);

Step S25: the coordinator sends the encrypted global customer feature matrix dimension (Figure BDA0003012635790000063) to each bank end P_h;

Step S26: bank end P_h receives (Figure BDA0003012635790000064) from the coordinator and decrypts it with the private key of the homomorphic encryption algorithm (Figure BDA0003012635790000065) to obtain the global customer feature matrix dimension (Figure BDA0003012635790000066), where D() is the decryption function;

Step S27: bank end P_h computes the local similarity matrix S_h of the overlapping customers X_h from the global customer feature matrix dimension (Figure BDA0003012635790000067) and the customer feature matrix A_h; the similarity between overlapping customers is computed as shown in formula (7):

Figure BDA0003012635790000068

where a_i and a_j are the attribute vectors of overlapping customers v_i and v_j, the operator shown in Figure BDA0003012635790000069 is the XOR operation, |a_i| is the length of the feature vector a_i, and s(a_i, a_j) is the similarity between customers v_i and v_j;

Step S28: bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm to obtain E(S_h), and sends E(S_h) to the coordinator.
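The encrypted aggregation in steps S21 to S26 can be sketched with an additively homomorphic scheme. The text does not name the scheme, so the code below assumes a toy Paillier cryptosystem; the key size and the function names `encrypt`, `decrypt`, and `aggregate` are illustrative only. The coordinator works on ciphertexts without holding the private key, while the bank ends share the key pair as in step S21.

```python
import secrets
from math import gcd

# Toy Paillier key from two Mersenne primes; far too small for real use.
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # valid because gcd(PHI, N) == 1 for these primes


def encrypt(m: int) -> int:
    """E(m) = (1 + N)^m * r^N mod N^2 with a fresh random r."""
    r = secrets.randbelow(N - 2) + 2
    while gcd(r, N) != 1:
        r = secrets.randbelow(N - 2) + 2
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """D(c) = L(c^PHI mod N^2) * MU mod N, where L(u) = (u - 1) // N."""
    return (pow(c, PHI, N2) - 1) // N * MU % N


def aggregate(ciphertexts):
    """Coordinator side: multiplying ciphertexts adds the hidden plaintexts."""
    total = 1  # 1 is a valid encryption of 0
    for c in ciphertexts:
        total = total * c % N2
    return total


# Each bank end encrypts its local dimension |a_h|; the coordinator only
# ever handles ciphertexts, and a bank end decrypts the aggregated value.
local_dims = [3, 5, 4]
global_dim = decrypt(aggregate(encrypt(d) for d in local_dims))
```

The same pattern would apply entry by entry to the similarity matrices in step S28, assuming the fractional similarities are first scaled to integers, which Paillier requires and which the text does not discuss.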

Further, step S3 specifically includes the following steps:

Step S31: the coordinator receives the encrypted local similarity matrices E(S_h) sent by the bank ends P_h;

Step S32: the coordinator adds the encrypted local similarity matrices E(S_h) together to obtain the global similarity matrix (Figure BDA0003012635790000071);

Step S33: the coordinator sends the encrypted global similarity matrix (Figure BDA0003012635790000072) to each bank end P_h.
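The following plaintext sketch shows one way local similarity matrices could add up to a global one. Formula (7) survives only as an image here, so the code assumes a Hamming-style reading: each bank counts the binary features on which two customers agree and divides by the global feature dimension, which makes the per-bank values additive. That reading, and the names `local_similarity`, `local_matrix`, and `aggregate`, are assumptions and not the patent's stated formula.

```python
def local_similarity(a_i, a_j, global_dim):
    """Share of the global feature dimension on which two binary vectors agree."""
    mismatches = sum(x ^ y for x, y in zip(a_i, a_j))
    return (len(a_i) - mismatches) / global_dim


def local_matrix(features, global_dim):
    """S_h: pairwise similarities of the overlapping customers at one bank."""
    ids = sorted(features)
    return [[local_similarity(features[u], features[v], global_dim) for v in ids]
            for u in ids]


def aggregate(matrices):
    """Element-wise sum of local matrices (plaintext stand-in for step S32)."""
    size = len(matrices[0])
    return [[sum(m[r][c] for m in matrices) for c in range(size)]
            for r in range(size)]


# Two banks hold 3 and 2 binary features for the same overlapping customers.
bank_1 = {"u": [1, 0, 1], "v": [1, 1, 1]}
bank_2 = {"u": [0, 1], "v": [0, 0]}
global_similarity = aggregate([local_matrix(bank_1, 5), local_matrix(bank_2, 5)])
```

Under this assumption customers u and v agree on 3 of the 5 global features, so the off-diagonal entries of `global_similarity` come out near 0.6, and each customer has similarity 1 with itself.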

Further, step S4 specifically includes the following steps:

Step S41: bank end P_h computes, on G_h, the overlapping customer network (Figure BDA0003012635790000073) formed by the overlapping customer set X_h;

Step S42: bank end P_h uses a k-clique discovery algorithm to find the k-cliques in the overlapping customer network (Figure BDA0003012635790000074), obtaining the k-clique set. A k-clique is a sub-network of k overlapping customers in which every overlapping customer is associated with all the other overlapping customers.
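Steps S41 and S42 can be sketched as follows. The text does not specify the k-clique discovery algorithm, so this example uses brute-force enumeration, which is only practical for small networks; the function names `induced_subgraph` and `k_cliques` and the edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def induced_subgraph(edges, overlap):
    """Keep only the edges of G_h whose two endpoints are overlapping customers."""
    return {frozenset((u, v)) for u, v in edges
            if u in overlap and v in overlap and u != v}


def k_cliques(edges, overlap, k):
    """Enumerate every set of k overlapping customers that are pairwise linked."""
    sub = induced_subgraph(edges, overlap)
    return [set(group) for group in combinations(sorted(overlap), k)
            if all(frozenset(pair) in sub for pair in combinations(group, 2))]


edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
overlap = {"a", "b", "c", "d"}
triangles = k_cliques(edges, overlap, 3)
```

Customer "e" is not an overlapping customer, so the edge between "d" and "e" is dropped before cliques are searched, and the only 3-clique found is the triangle of "a", "b", and "c".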

Further, step S5 specifically includes the following steps:

Step S51: bank end P_h uses the private key of the homomorphic encryption algorithm to decrypt the global similarity matrix (Figure BDA0003012635790000075), obtaining (Figure BDA0003012635790000076);

Step S52: bank end P_h takes each k-clique in the k-clique set as a node and constructs the clique graph (Figure BDA0003012635790000077);

Step S53: bank end P_h computes the similarity between two k-cliques from the decrypted global similarity matrix (Figure BDA0003012635790000078); if the similarity is greater than the set threshold α, an edge between the two k-cliques is added to (Figure BDA0003012635790000079). The threshold α is set to 0.8, and the similarity between two k-cliques is computed as shown in formula (8):

Figure BDA00030126357900000710

Figure BDA00030126357900000711

where v_i and v_j are customers in the overlapping customer network (Figure BDA00030126357900000712), C_p and C_q are k-cliques, s(v_i, v_j) is the similarity between customers v_i and v_j, s(C_p, C_q) is the similarity between k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if customers v_i and v_j are associated in (Figure BDA00030126357900000713) and 0 otherwise.
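The clique graph construction in steps S52 and S53 can be sketched as below. Formula (8) is only available as an image here, so `clique_similarity` is a hypothetical stand-in that averages s(v_i, v_j) over the cross pairs for which Ind(v_i, v_j) = 1; the real formula may weight or normalise differently. Only the threshold α = 0.8 and the rule of adding an edge when the similarity exceeds α come from the text.

```python
from itertools import combinations


def clique_similarity(c_p, c_q, sim, linked):
    """Hypothetical stand-in for formula (8): mean s(v_i, v_j) over linked cross pairs."""
    pairs = [(u, v) for u in c_p for v in c_q if u != v and linked(u, v)]
    if not pairs:
        return 0.0
    return sum(sim[u][v] for u, v in pairs) / len(pairs)


def clique_graph(cliques, sim, linked, alpha=0.8):
    """Nodes are clique indices; an edge joins two cliques whose similarity exceeds alpha."""
    return {(p, q) for p, q in combinations(range(len(cliques)), 2)
            if clique_similarity(cliques[p], cliques[q], sim, linked) > alpha}


# Illustrative data: customers a, b, c are mutually linked and similar.
names = "abcde"
links = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]}
sim = {u: {v: (0.9 if u in "abc" and v in "abc" else 0.1) for v in names}
       for u in names}
cliques = [{"a", "b"}, {"b", "c"}, {"d", "e"}]
graph_edges = clique_graph(cliques, sim, lambda u, v: frozenset((u, v)) in links)
```

With this data only the first two cliques are joined, because the third clique has no associated customers in common with either of them.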

Further, step S6 specifically includes the following steps:

Step S61: bank end P_h computes the connected components of the clique graph (Figure BDA00030126357900000714);

Step S62: bank end P_h merges all customers of each connected component into one customer group, obtaining the bank customer group set C;

Step S63: bank end P_h writes the customers v_{i,j} of each group C_i in the bank customer group set C as a row vector R_i = (v_{i,j});

Step S64: output the vector set {R_i}, 0<i<m, where m is the number of customer groups and each row represents one customer group.
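Steps S61 to S64 amount to finding connected components of the clique graph and flattening each one into a row of customers. The sketch below uses a small union-find for the components; the text does not prescribe a particular algorithm, and the function names and the sorted output order are illustrative assumptions.

```python
def connected_components(num_nodes, edges):
    """Group clique-graph nodes 0..num_nodes-1 with a small union-find."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p, q in edges:
        parent[find(p)] = find(q)
    groups = {}
    for node in range(num_nodes):
        groups.setdefault(find(node), []).append(node)
    return list(groups.values())


def customer_groups(cliques, edges):
    """Merge the customers of each connected component into one sorted row R_i."""
    rows = []
    for component in connected_components(len(cliques), edges):
        members = set().union(*(cliques[n] for n in component))
        rows.append(sorted(members))
    return sorted(rows)


cliques = [{"a", "b", "c"}, {"b", "c", "d"}, {"x", "y", "z"}]
rows = customer_groups(cliques, {(0, 1)})
```

Here the first two cliques are linked in the clique graph, so their customers merge into one group, while the third clique forms a group of its own.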

As will be appreciated by those skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The above are only preferred embodiments of the present invention and do not limit the present invention in any other form. Any person skilled in the art may use the technical content disclosed above to change or modify it into equivalent embodiments. However, any simple modification, equivalent change, or adaptation made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the protection scope of the technical solution of the present invention.

Claims (6)

1. A bank customer group mining method based on federated clique percolation, characterized in that a bank customer group mining system based on federated clique percolation is provided, the system comprising a bank-end overlapping customer identification module, a bank-end customer similarity calculation module, a coordinator-end customer similarity aggregation module, a bank-end customer network k-clique discovery module, a bank-end customer network k-clique percolation module, and a bank-end customer group division module; the system performs bank customer group mining according to the following steps:
Step S1: the bank-end overlapping customer identification module reads, at each bank end P_h, a bank customer network G(V, E, R, A), where V is the customer set, E is the edge set, R is the feature set, and A is the customer feature matrix; one bank end is randomly selected to generate an RSA key pair and send the RSA public key to the other bank ends; the selected bank end encrypts its customer IDs with the RSA public key and computes the intersection with the customer IDs that each other bank end has encrypted with the RSA public key; the selected bank end computes the common intersection of the obtained intersections to get the overlapping customer set and sends the overlapping customer set to the other bank ends, so that each bank end P_h obtains the common overlapping customer set X_h;
Step S2: the bank-end customer similarity calculation module randomly selects one bank end to generate a key pair for the homomorphic encryption algorithm and send the key pair to the other bank ends; each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm, and sends it to the coordinator, where a_h is the customer feature vector and the coordinator is the data aggregation party that aggregates the encrypted data sent by all bank ends P_h; the coordinator receives the encrypted customer feature matrix dimensions |a_h| sent by the bank ends P_h, adds the |a_h| in their encrypted state to obtain the global customer feature matrix dimension, and sends the global customer feature matrix dimension to each bank end P_h; each bank end P_h decrypts the global customer feature matrix dimension with the private key of the homomorphic encryption algorithm and, from the global customer feature matrix dimension and the customer feature matrix A_h, computes the local similarity between overlapping customers to obtain the local similarity matrix S_h of the overlapping customers; each bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm and sends S_h to the coordinator;
Step S3: the coordinator-end customer similarity aggregation module receives, at the coordinator, the local similarity matrices S_h sent by the bank ends P_h; the coordinator adds the S_h in their encrypted state to obtain the encrypted global customer similarity matrix and sends the global similarity matrix to each bank end P_h;
Step S4: the bank-end customer network k-clique discovery module finds, at each bank end P_h, all k-cliques on the overlapping customer network formed by the overlapping customers to obtain the k-clique set, where a k-clique is a sub-network of k overlapping customers in which every overlapping customer is associated with all the other overlapping customers;
Step S5: the bank-end customer network k-clique percolation module decrypts, at each bank end P_h, the global customer similarity matrix sent by the coordinator with the private key of the homomorphic encryption algorithm; each bank end P_h performs k-clique percolation according to the decrypted global similarity matrix and the k-clique set to obtain the clique graph (Figure FDA0003574110400000011);
Step S6: the bank-end customer group division module computes, at each bank end P_h, the connected components of the clique graph (Figure FDA0003574110400000012); the node set of each connected component is one bank customer group, and the set of connected components is the customer group set C of the overlapping customer set X_h on the bank customer network G; the customer group set C of the final bank customer network is output;
step S1 specifically comprises the following steps:
Step S11: randomly select one bank end P_i to generate an RSA key pair and send the RSA public key to the other bank ends P_j;
Step S12: bank end P_i encrypts the customers V_i of its bank customer network G_i with the RSA public key, computes the intersection with each other bank end P_j, and decrypts with the RSA private key to obtain X_{i,j};
Step S13: bank end P_i computes the common intersection of the obtained intersections X_{i,j} to get the common overlapping customer set X_i = ∪{X_{i,j}};
Step S14: bank end P_i sends the common overlapping customer set X_i to the other bank ends P_j, and each bank end P_h obtains the common overlapping customer set X_h = X_i;
step S2 specifically comprises the following steps:
Step S21: bank end P_i generates a key pair for the homomorphic encryption algorithm and sends the key pair to the other bank ends P_j;
Step S22: each bank end P_h computes the dimension |a_h| of its customer feature matrix A_h, encrypts |a_h| with the public key of the homomorphic encryption algorithm to obtain the encrypted local customer feature matrix dimension E(|a_h|), and sends E(|a_h|) to the coordinator, where E() is the encryption function;
Step S23: the coordinator receives the encrypted local customer feature matrix dimensions E(|a_h|) sent by the bank ends P_h;
Step S24: the coordinator adds the E(|a_h|) together to obtain the global customer feature matrix dimension (Figure FDA0003574110400000021);
Step S25: the coordinator sends the encrypted global customer feature matrix dimension (Figure FDA0003574110400000022) to the bank ends P_h;
Step S26: each bank end P_h receives (Figure FDA0003574110400000023) from the coordinator and decrypts (Figure FDA0003574110400000024) with the private key of the homomorphic encryption algorithm to obtain (Figure FDA0003574110400000025), where D() is the decryption function;
Step S27: each bank end P_h computes the local similarity matrix S_h of the overlapping customers from (Figure FDA0003574110400000026) and the customer feature matrix A_h, where the similarity between overlapping customers is computed as shown in formula (1):
Figure FDA0003574110400000027
where a_i is the attribute vector of overlapping customer v_i, a_j is the attribute vector of overlapping customer v_j, the operator shown in Figure FDA0003574110400000028 is the XOR operation, |a_i| is the dimension of the feature vector a_i, and s(a_i, a_j) is the similarity between customers v_i and v_j;
Step S28: bank end P_h encrypts S_h with the public key of the homomorphic encryption algorithm to obtain E(S_h), and sends E(S_h) to the coordinator.
2. The bank customer group mining method based on federated clique percolation according to claim 1, characterized in that step S3 specifically comprises the following steps:
Step S31: the coordinator receives the encrypted local similarity matrices E(S_h) sent by the bank ends P_h;
Step S32: the coordinator adds the E(S_h) together to obtain the global similarity matrix (Figure FDA0003574110400000029);
Step S33: the coordinator sends the global similarity matrix (Figure FDA0003574110400000031) to the bank ends P_h.
3. The bank customer group mining method based on federated clique percolation according to claim 2, characterized in that step S4 specifically comprises the following steps:
Step S41: bank end P_h computes, on G_h, the overlapping customer network (Figure FDA0003574110400000032) formed by the overlapping customers;
Step S42: bank end P_h uses a k-clique discovery algorithm to find the k-cliques in the overlapping customer network (Figure FDA0003574110400000033), obtaining the k-clique set.
4. The bank customer group mining method based on federated clique percolation according to claim 3, characterized in that step S5 specifically comprises the following steps:
Step S51: each bank end P_h decrypts (Figure FDA0003574110400000034) with the private key of the homomorphic encryption algorithm to obtain (Figure FDA0003574110400000035);
Step S52: each bank end P_h takes each k-clique in the k-clique set as a node and constructs the clique graph (Figure FDA0003574110400000036);
Step S53: each bank end P_h computes the similarity between two k-cliques from the decrypted (Figure FDA0003574110400000037); if the similarity is greater than the preset threshold α, an edge between the two k-cliques is added to (Figure FDA0003574110400000038); the threshold α is 0.8, and the similarity between two k-cliques is computed as shown in the following formulas:
Figure FDA0003574110400000039
Figure FDA00035741104000000310
where v_i and v_j are customers in the overlapping customer network (Figure FDA00035741104000000311), C_p and C_q are k-cliques, s(v_i, v_j) is the similarity between customers v_i and v_j, s(C_p, C_q) is the similarity between k-cliques C_p and C_q, and the function Ind(v_i, v_j) returns 1 if customers v_i and v_j are associated in (Figure FDA00035741104000000312) and 0 otherwise.
5. The bank customer group mining method based on federated clique percolation according to claim 4, characterized in that step S6 specifically comprises the following steps:
Step S61: each bank end P_h computes the connected components of the clique graph (Figure FDA00035741104000000313);
Step S62: each bank end P_h merges all customers of each connected component into one customer group, obtaining the bank customer group set C;
Step S63: each bank end P_h writes the customers v_{i,j} of each group C_i in the bank customer group set C as a row vector R_i = (v_{i,j});
Step S64: output the vector set {R_i}, 0<i<m, where m is the number of customer groups and each row represents one customer group.
6. A computer-readable storage medium, having stored thereon computer program instructions executable by a processor, the computer program instructions being capable of, when executed by the processor, implementing the method steps of any of claims 1-5.
CN202110380531.8A 2021-04-09 2021-04-09 A Bank Customer Group Mining Method Based on Federal Regiment Penetration Expired - Fee Related CN113159918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110380531.8A CN113159918B (en) 2021-04-09 2021-04-09 A Bank Customer Group Mining Method Based on Federal Regiment Penetration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110380531.8A CN113159918B (en) 2021-04-09 2021-04-09 A Bank Customer Group Mining Method Based on Federal Regiment Penetration

Publications (2)

Publication Number Publication Date
CN113159918A CN113159918A (en) 2021-07-23
CN113159918B true CN113159918B (en) 2022-06-07

Family

ID=76889211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110380531.8A Expired - Fee Related CN113159918B (en) 2021-04-09 2021-04-09 A Bank Customer Group Mining Method Based on Federal Regiment Penetration

Country Status (1)

Country Link
CN (1) CN113159918B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114387064B (en) * 2022-01-13 2024-07-19 福州大学 Electronic commerce platform potential customer recommendation method and system based on comprehensive similarity

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110572253A (en) * 2019-09-16 2019-12-13 济南大学 A method and system for enhancing the privacy of federated learning training data
CN111309788A (en) * 2020-03-08 2020-06-19 山西大学 Community structure discovery method and system for bank customer transaction network
CN111666460A (en) * 2020-05-27 2020-09-15 中国平安财产保险股份有限公司 User portrait generation method and device based on privacy protection and storage medium
CN111967910A (en) * 2020-08-18 2020-11-20 中国银行股份有限公司 User passenger group classification method and device
CN112199702A (en) * 2020-10-16 2021-01-08 鹏城实验室 Privacy protection method, storage medium and system based on federal learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9369273B2 (en) * 2014-02-26 2016-06-14 Raytheon Bbn Technologies Corp. System and method for mixing VoIP streaming data for encrypted processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
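A privacy-preserving distributed association rule mining method (一种隐私保护的分布式关联规则挖掘方法); Gui Qiong (桂琼) et al.; Microelectronics & Computer (《微电子学与计算机》); 2009-09-05 (No. 09); full text *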

Also Published As

Publication number Publication date
CN113159918A (en) 2021-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220607