CN115801233A - Method and client for constructing confusion set - Google Patents
Method and client for constructing confusion set Download PDFInfo
- Publication number
- CN115801233A CN115801233A CN202211229255.6A CN202211229255A CN115801233A CN 115801233 A CN115801233 A CN 115801233A CN 202211229255 A CN202211229255 A CN 202211229255A CN 115801233 A CN115801233 A CN 115801233A
- Authority
- CN
- China
- Prior art keywords
- client
- server
- field
- encrypted
- confusion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/08—Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/40—Network security protocols
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Storage Device Security (AREA)
Abstract
Description
技术领域technical field
本说明书实施例属于隐私计算技术领域,尤其涉及一种实现构造混淆集的方法和客户端。The embodiments of this specification belong to the technical field of privacy computing, and in particular relate to a method and a client for realizing the construction of a confusion set.
背景技术Background technique
隐私保护计算(Privacy-Preserving Computing)是在保护数据本身不对外泄露的前提下实现数据分析计算的技术集合,实现数据的可用不可见。通过隐私保护计算技术,可以在充分保护数据和隐私安全的前提下,实现数据价值的转化和释放。Privacy-Preserving Computing (Privacy-Preserving Computing) is a technology collection that realizes data analysis and calculation on the premise of protecting the data itself from leakage, and realizes that the data is available and invisible. Through privacy protection computing technology, the conversion and release of data value can be realized under the premise of fully protecting data and privacy security.
目前实现隐私保护计算的主流技术主要包括三大方向:第一类是以多方安全计算(Secure Multi-Party Computation,SMPC)为代表的基于密码学的隐私计算技术;第二类是以联邦学习(Federated Learning,FL)为代表的人工智能与隐私保护技术融合衍生的技术;第三类是以可信执行环境(Trust Execution Environment)为代表的基于可信硬件的机密计算(Confidential Computing,CC)技术。此外,还包括差分隐私 (DifferentialPrivacy,DP)等。差分隐私(Differential Privacy,DP)实际则是对计算结果的保护,而不是针对计算过程;联邦学习、安全多方计算以及机密计算则是对计算过程以及计算过程中间结果进行保护。At present, the mainstream technologies for realizing privacy-preserving computing mainly include three directions: the first type is cryptography-based privacy computing technology represented by Secure Multi-Party Computation (SMPC); the second type is federated learning ( Federated Learning (FL) is a technology derived from the fusion of artificial intelligence and privacy protection technology; the third category is Confidential Computing (CC) technology based on trusted hardware represented by Trusted Execution Environment (Trust Execution Environment) . In addition, it also includes differential privacy (DifferentialPrivacy, DP) and so on. Differential Privacy (DP) actually protects the calculation results, not the calculation process; federated learning, secure multi-party computing, and confidential computing protect the calculation process and the intermediate results of the calculation process.
第一类的多方安全计算,又包括四大基础技术,分别是混淆电路(GarbledCircuit, GC)、秘密分享(Secret Sharing)、不经意传输(Oblivious Transfer)和同态加密 (Homomorphic Encryption,HE)。其中,同态加密是一种特殊的加密算法,在密文基础上直接进行计算,与基于解密后的明文是一样的计算结果,其又包括半同态加密 (PartiallyHomomorphic Encryption,PHE)和全同态加密(Fully Homomorphic Encryption,FHE)。The first type of multi-party secure computing includes four basic technologies, namely Garbled Circuit (GC), Secret Sharing (Secret Sharing), Oblivious Transfer (Oblivious Transfer) and Homomorphic Encryption (HE). Among them, homomorphic encryption is a special encryption algorithm. It is directly calculated on the basis of ciphertext, and the calculation result is the same as that based on the decrypted plaintext. It also includes Partially Homomorphic Encryption (PHE) and identical Fully Homomorphic Encryption (FHE).
安全多方计算凭借其坚实的安全理论基础提供输入秘密数据的隐私保护能力,实现隐私保护计算过程的安全。目前安全多方计算主要有两条实施技术路线,包括通用安全多方计算和特定问题安全多方计算。前者可以解决各类计算问题,但是这种“万能型”的技术路线通常体系庞大,各种开销较大;后者针对特定问题设计专用协议,如隐私集合求交PSI(Private Set Intersection,PSI),隐私信息检索(Privacy Information Retrieval,PIR)等,往往能够以比通用安全多方计算协议更低的代价得到计算结果,但是需要领域专家针对应用场景进行精心设计,一般无法适用于通用场景且设计成本较高。Secure multi-party computing relies on its solid security theoretical foundation to provide privacy protection capabilities for inputting secret data, and to achieve privacy-protected computing process security. At present, there are two main implementation technical routes for secure multi-party computation, including general secure multi-party computation and problem-specific secure multi-party computation. The former can solve various computing problems, but this "universal" technical route usually has a huge system and various expenses; the latter designs special protocols for specific problems, such as private set intersection PSI (Private Set Intersection, PSI) , Privacy Information Retrieval (PIR), etc., can often obtain calculation results at a lower cost than general secure multi-party computing protocols, but requires domain experts to carefully design application scenarios, and generally cannot be applied to general scenarios and the design cost higher.
隐私集合求交是参与双方在不泄露任何额外信息的情况下,得到双方持有数据的交集。额外的信息指的是除了双方的数据交集以外的任何信息。隐私集合求交在现实场景中非常有用,比如在纵向联邦学习中做数据对齐,或是在社交软件中通过通讯录做好友发现等。Privacy set intersection is the intersection of the data held by both parties without revealing any additional information. Additional information refers to any information other than the data intersection of the two parties. Privacy set intersection is very useful in real-world scenarios, such as data alignment in vertical federated learning, or friend discovery through address books in social software.
隐私信息检索是客户端从数据库检索信息的一种方法。检索过程中,查询方隐藏查询目标标识,数据服务方提供匹配的查询结果却无法获知具体的查询对象。Private information retrieval is a method for clients to retrieve information from a database. During the retrieval process, the query party hides the query target identifier, and the data server provides matching query results but cannot know the specific query object.
发明内容Contents of the invention
本说明书的目的在于提供一种实现构造混淆集的方法和客户端,包括:The purpose of this specification is to provide a method and client for constructing a confusion set, including:
一种实现构造混淆集的方法,客户端接收服务端发送的查询基,所述查询基由数据库加密后得到;所述客户端与服务端对同一目标执行的加/解密采用可交换顺序的加/解密算法;A method for realizing the construction of a confusion set, the client receives the query base sent by the server, and the query base is obtained after being encrypted by the database; the encryption/decryption performed by the client and the server on the same target adopts an exchangeable sequence of encryption /decryption algorithm;
所述客户端发送经自身加密的敏感字段至服务端,并通过与服务端的交互得到由服务端加密的同一敏感字段;The client sends the encrypted sensitive field to the server, and obtains the same sensitive field encrypted by the server through interaction with the server;
所述客户端在查询基中根据所述由服务端加密的敏感字段检索,得到匹配记录的第一标识集合;The client retrieves in the query base according to the sensitive field encrypted by the server, and obtains a first identification set of matching records;
所述客户端选取查询基中除ID字段和感兴趣字段外的至少一个点,将所选的至少一个点作为检索条件,根据该检索条件和原有的感兴趣字段构造检索语句并在查询基上执行检索,得到匹配记录的第二标识集合;The client selects at least one point in the query base except the ID field and the field of interest, uses the selected at least one point as a retrieval condition, constructs a retrieval statement according to the retrieval condition and the original field of interest, and constructs a retrieval statement in the query base Execute retrieval on the above to obtain the second identification set of matching records;
所述客户端根据第一标识集合和第二标识集合构造混淆集。The client constructs a confusion set according to the first identity set and the second identity set.
一种实现构造混淆集的客户端,该客户端与服务端对同一目标执行的加/解密采用可交换顺序的加/解密算法,且:A client that realizes the construction of a confusion set, the encryption/decryption performed by the client and the server on the same target adopts an encryption/decryption algorithm that can be exchanged, and:
所述客户端配置有查询基,所述查询基由所述服务端将数据库加密后得到;The client is configured with a query base, and the query base is obtained after the server encrypts the database;
所述客户端发送经自身加密的敏感字段至服务端,并通过与服务端的交互得到由服务端加密的同一敏感字段;在查询基中根据所述由服务端加密的敏感字段检索,得到匹配记录的第一标识集合;选取查询基中除ID字段和感兴趣字段外的至少一个点,将所选的至少一个点作为检索条件,根据该检索条件和原有的感兴趣字段构造检索语句并在查询基上执行检索,得到匹配记录的第二标识集合;根据第一标识集合和第二标识集合构造混淆集。The client sends the sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server; in the query base, according to the search based on the sensitive field encrypted by the server, a matching record is obtained The first identification set; select at least one point in the query base except the ID field and the field of interest, use the selected at least one point as a retrieval condition, construct a retrieval statement according to the retrieval condition and the original field of interest, and in Searching is performed on the query basis to obtain a second identification set of matching records; a confusion set is constructed according to the first identification set and the second identification set.
一种实现构造混淆集的客户端,包括:A client that implements the construction of a confusion set, including:
处理器,processor,
存储器,存储有程序,其中在所述处理器执行所述程序时,执行上述的方法。The memory stores a program, wherein the above method is executed when the processor executes the program.
一种存储介质,用于存储程序,其中所述程序在被执行时使得客户端执行上述的方法。A storage medium is used for storing a program, wherein when the program is executed, the client executes the above-mentioned method.
上述实施例中,通过构造检索语句并在查询基上执行检索,得到第二标识集合构造,进而构造混淆集,可以使得混淆集更合理,从而避免因混淆集构造不合理而导致客户端隐私泄露。In the above-mentioned embodiment, by constructing the retrieval statement and executing the retrieval on the query base, the structure of the second identification set is obtained, and then the confusion set is constructed, which can make the confusion set more reasonable, thereby avoiding the privacy leakage of the client due to the unreasonable structure of the confusion set .
附图说明Description of drawings
为了更清楚地说明本说明书实施例的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本说明书中记载的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions of the embodiments of this specification, the following will briefly introduce the drawings that need to be used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments recorded in this specification. , for those skilled in the art, other drawings can also be obtained according to these drawings without paying creative labor.
图1是一实施例的流程示意图;Fig. 1 is a schematic flow chart of an embodiment;
图2是一实施例的流程示意图;Fig. 2 is a schematic flow chart of an embodiment;
图3是一实施例的流程示意图。Fig. 3 is a schematic flow diagram of an embodiment.
具体实施方式Detailed ways
为了使本技术领域的人员更好地理解本说明书中的技术方案,下面将结合本说明书实施例中的附图,对本说明书实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本说明书一部分实施例,而不是全部的实施例。基于本说明书中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都应当属于本说明书保护的范围。In order to enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification will be clearly and completely described below in conjunction with the drawings in the embodiments of this specification. Obviously, the described The embodiments are only some of the embodiments in this specification, not all of them. Based on the embodiments in this specification, all other embodiments obtained by persons of ordinary skill in the art without creative efforts shall fall within the protection scope of this specification.
如前所述,PIR是客户端从数据库检索信息的一种方法。PIR方案是由Chor B等在1995 年提出的解决保护用户查询隐私的方案。PIR方案的主要目的是,保证查询用户向服务器上的数据库提交的查询请求,在用户查询的隐私信息不被泄漏的条件下完成查询,即在检索过程中服务器不知道用户具体查询信息及检索出的数据项。As mentioned earlier, PIR is a method for clients to retrieve information from a database. The PIR scheme is a scheme proposed by Chor B et al. in 1995 to solve the problem of protecting user query privacy. The main purpose of the PIR scheme is to ensure that the query request submitted by the query user to the database on the server is completed under the condition that the private information queried by the user is not leaked, that is, the server does not know the specific query information of the user and the retrieved information during the retrieval process. data item.
隐私信息检索的应用场景包括有:The application scenarios of private information retrieval include:
i.病患想通过医药系统查询其疾病的治疗药物,如果以该疾病名为查询条件,医疗系统将会得知该病人可能患有这样的疾病,从而病人的隐私被泄露,通过隐私信息查询可以避免此类泄露问题。i. The patient wants to inquire about the treatment drugs for his disease through the medical system. If the disease is named as the query condition, the medical system will know that the patient may suffer from such a disease, and the patient's privacy will be leaked. Query through private information Such leakage problems can be avoided.
ii.在域名、商标申请过程,用户需要首先向相关数据库提交自己申请的域名或商标信息以查询是否已存在,但有不想让服务提供方知晓自己的申请名称,从而能够抢先注册。ii. During the domain name and trademark application process, users need to first submit their applied domain name or trademark information to the relevant database to check whether it already exists, but if they do not want the service provider to know their applied name, they can pre-register.
iii.在证券市场中,某用户想查询某个股票信息,但又不能将自己感兴趣的股票泄露给服务方从而影响股票价格和自己的偏好。iii. In the securities market, a user wants to inquire about a certain stock information, but he cannot disclose the stock he is interested in to the service provider so as to affect the stock price and his own preference.
一个简单的实现方案是数据库把所有数据发送给客户端,但无法保护数据库安全,即无法保证服务端的隐私。能够同时保证客户端和数据库隐私安全的PIR,称为对称的PIR (Symmetrical PIR,SPIR),同时保证客户端和数据库两者之一隐私安全的PIR,称为非对称的PIR(Asymmetrical PIR,APIR)。根据数据库副本的个数分为多副本PIR和单副本PIR。多副本PIR协议要求多个数据库副本之间不能合谋,这在现实场景中很难满足,因此考虑更多的是单副本PIR。单副本PIR只能达到计算安全(Computational PIR,CPIR)。在大多数 PIR方案中,总是假设客户端知道想要检索的是数据库的第几个比特(单比特)。但是在现实场景中,客户端往往是根据关键字检索(并不知道该关键字对应数据库的具体位置),且希望取回的是字符串(多比特)。总而言之,一个实用的PIR通常需要最好同时满足对称、单副本、按关键字检索、返回字符串等多个条件,并达到计算效率和通信效率的平衡。通过同态加密、不经意传输(Oblivious Transfer,OT)、单向陷门函数(One-way TrapdoorFunction) 等密码学技术,可以满足或部分满足上述条件。A simple implementation is that the database sends all data to the client, but it cannot protect the security of the database, that is, the privacy of the server cannot be guaranteed. A PIR that can ensure the privacy and security of the client and the database at the same time is called a symmetrical PIR (Symmetrical PIR, SPIR), and a PIR that can ensure the privacy and security of one of the client and the database at the same time is called an asymmetrical PIR (Asymmetrical PIR, APIR). ). According to the number of database copies, it is divided into multi-copy PIR and single-copy PIR. The multi-copy PIR protocol requires that multiple database copies cannot collude, which is difficult to meet in real-world scenarios, so more consideration is given to single-copy PIR. Single-copy PIR can only achieve Computational PIR (CPIR). In most PIR schemes, it is always assumed that the client knows which bit (single bit) of the database it wants to retrieve. However, in real scenarios, the client often searches based on keywords (it does not know the specific location of the database corresponding to the keyword), and what it wants to get back is a string (multi-bit). All in all, a practical PIR usually needs to meet multiple conditions such as symmetry, single copy, retrieval by keyword, and return string at the same time, and achieve a balance between computing efficiency and communication efficiency. Through cryptographic techniques such as homomorphic encryption, oblivious transfer (Oblivious Transfer, OT), and one-way trapdoor function (One-way Trapdoor Function), the above conditions can be met or partially met.
本说明书提供一种实现隐私信息检索的方法实施例。This specification provides an embodiment of a method for realizing private information retrieval.
该实施例中,服务端(Server)可以预先将数据库加密后得到查询基,并发送该查询基至客户端。In this embodiment, the server (Server) may encrypt the database in advance to obtain the query base, and send the query base to the client.
一般的,服务端本地具有数据库,可以供客户端查询。服务端本地的数据库例如为如下:Generally, the server has a local database, which can be queried by the client. The local database on the server is, for example, as follows:
表1、服务端具有的数据库Table 1. Databases owned by the server
上述表1的例子中,包括ID、Name、Age、Native_place这4个字段,例如有id_0,...id_9 共10条记录,每一行为一个记录。其中,id_0,...id_9为每一行记录的标识。In the above example of Table 1, there are four fields including ID, Name, Age, and Native_place. For example, there are 10 records id_0, ... id_9, and each row is a record. Among them, id_0, ... id_9 is the identification of each row of records.
为了让客户端可以进行检索,而又不暴露服务端的隐私安全,服务端可以加密该数据库,得到查询基。加密方式可以采用RSA(一种使用广泛的非对称加密算法,1977年由罗纳德·李维斯特(Ron Rivest)、阿迪·萨莫尔(Adi Shamir)和伦纳德·阿德曼(LeonardAdleman)一起提出的)或ECC(Elliptical Curve Cryptography,椭圆曲线密码学)加密。具体的,服务端可以使用RSA私钥/ECC私钥α对数据加密,即对除了ID列的其它每个字段(即每个单元格中的数据)采用RSA私钥/ECC私钥α进行加密。In order to allow the client to search without exposing the privacy of the server, the server can encrypt the database to obtain the query base. The encryption method can use RSA (a widely used asymmetric encryption algorithm, developed by Ronald Rivest (Ron Rivest), Adi Shamir (Adi Shamir) and Leonard Adleman (Leonard Adleman) in 1977 proposed together) or ECC (Elliptical Curve Cryptography, elliptic curve cryptography) encryption. Specifically, the server can use the RSA private key/ECC private key α to encrypt data, that is, use the RSA private key/ECC private key α to encrypt each field except the ID column (that is, the data in each cell) .
采用ECC加解密算法的情况下,具体的,服务端可以生成一个秘密值α并妥善保存,该秘密值α也就是ECC私钥。此外,服务端可以将name字段的值通过一个哈希函数转换为椭圆曲线上的一个点,可以表达为Hash(C)或表达为H(C)。In the case of using the ECC encryption and decryption algorithm, specifically, the server can generate a secret value α and store it properly. The secret value α is also the ECC private key. In addition, the server can convert the value of the name field into a point on the elliptic curve through a hash function, which can be expressed as Hash(C) or H(C).
根据椭圆曲线上标量乘法的运算性质,椭圆曲线上的一个点P和一个整数k,计算Q=kP很容易,且得到的结果Q也是该椭圆曲线上的一个点;反之,如果知道椭圆曲线上的一个点对P、 Q,求解Q=kP中使等式成立的k的值很难。According to the operational properties of scalar multiplication on the elliptic curve, it is easy to calculate Q=kP for a point P on the elliptic curve and an integer k, and the obtained result Q is also a point on the elliptic curve; A point pair P, Q of , it is very difficult to find the value of k in Q=kP that makes the equation valid.
这里,根据椭圆曲线上的标量乘法运算α·H(C)很容易计算得到,但是知道α·H(C)的结果和H(C)却很难推算出α的值。很难得到α的值的情况下,知道α·H(C)的结果,也很难得到知道H(C)的值。Here, it is easy to calculate α·H(C) according to the scalar multiplication operation on the elliptic curve, but it is difficult to calculate the value of α after knowing the result of α·H(C) and H(C). When it is difficult to obtain the value of α, knowing the result of α·H(C) is also difficult to obtain the value of H(C).
进而,服务端采用秘密值α加密后的数据库如下所示:Furthermore, the database encrypted by the server using the secret value α is as follows:
表2、服务端采用ECC私钥加密后的查询基Table 2. The query base encrypted by the server using the ECC private key
需要说明的是,上述hash函数,不仅能将原始输入转换为固定长度和格式的输出,还能将输出转换为椭圆曲线上的一个点的x轴坐标。例如采用curve25519这样的椭圆曲线,任意的 256bits数据都可以作为这条椭圆曲线上的一个合法的x轴坐标。相应的,可以采用sha256或 sha3-256,也可以采用sha384、sha512或者sha3-384、sha3-512的结果中截取256bits。更广泛的说,任意hash值(不局限于hash结果是256bits)可以对椭圆曲线的阶取模,取模结果与生成元点乘之积(标量乘法)即为该椭圆曲线上的一个点。It should be noted that the above hash function can not only convert the original input into an output with a fixed length and format, but also convert the output into the x-axis coordinate of a point on the elliptic curve. For example, if an elliptic curve such as curve25519 is used, any 256bits data can be used as a legal x-axis coordinate on this elliptic curve. Correspondingly, sha256 or sha3-256 can be used, or 256 bits can be intercepted from the result of sha384, sha512 or sha3-384, sha3-512. More broadly speaking, any hash value (not limited to 256bits) can take the modulus of the order of the elliptic curve, and the product of the modulus result and the generator point (scalar multiplication) is a point on the elliptic curve.
进而,服务端可以将该查询基发送至需要进行检索的客户端。一种方式中,服务端可以直接发送该查询基至客户端,例如直接发送至客户端的设备,或者发送至客户端的代理服务器之类;另一种方式中,服务端可以在一个统一资源定位系统(Uniform ResourceLocator, URL)上发布该查询基,进而客户端可以从该URL上获取该查询基。Furthermore, the server can send the query base to the client that needs to retrieve it. In one way, the server can directly send the query base to the client, for example, directly to the client's device, or to the client's proxy server; in another way, the server can be in a uniform resource location system (Uniform ResourceLocator, URL) to publish the query base, and then the client can obtain the query base from the URL.
相应的,客户端可以接收到该查询基,并将接收到的查询基保存在本地。Correspondingly, the client can receive the query base and store the received query base locally.
类似的,采用RSA的情况下,服务端可以生成一个秘密值α并妥善保存,该秘密值也就是 RSA私钥。此外,服务端可以将name字段的值通过一个哈希函数转换为椭圆曲线上的一个点,可以表达为Hash(C)或表达为H(C)。Similarly, in the case of RSA, the server can generate a secret value α and store it properly, which is the RSA private key. In addition, the server can convert the value of the name field into a point on the elliptic curve through a hash function, which can be expressed as Hash(C) or H(C).
根据模幂运算的性质,已知秘密值α,对于一个大质数q和底数g,计算p=gαmod q很容易;反之,如果知道p、q和底数g,求解p=gαmod q中使等式成立的α的值很难。底数g也称为原根。According to the nature of modular exponentiation, the secret value α is known, and for a large prime number q and base g, it is easy to calculate p=g α mod q; on the contrary, if p, q and base g are known, it is easy to solve p=g α mod q The value of α for which the equation holds is difficult. The base g is also called the original root.
这里,根据模拟运算计算(H(C))αmod q很容易,但是知道(H(C))αmod q的结果和H(C)、 q却很难推算出α的值。很难得到α的值的情况下,知道(H(C))αmod q的结果,也很难得到知道H(C)的值。后续,将形如(H(C))αmod q的表达式省略mod q,简略表示为(H(C))α。Here, it is easy to calculate (H(C)) α mod q based on simulation operations, but it is difficult to deduce the value of α knowing the result of (H(C)) α mod q and H(C) and q. When it is difficult to obtain the value of α, knowing the result of (H(C)) α mod q is also difficult to obtain the value of H(C). In the following, an expression of the form (H(C)) α mod q will omit mod q, and be simply expressed as (H(C)) α .
进而,服务端采用秘密值α加密后的数据库如下所示:Furthermore, the database encrypted by the server using the secret value α is as follows:
表3、服务端采用RSA私钥加密后的查询基Table 3. The query base encrypted by the server using the RSA private key
进而,服务端可以将该查询基发送至需要进行检索的客户端。类似的,服务端可以直接发送该查询基至客户端,例如直接发送至客户端的设备,或者发送至客户端的代理服务器之类;另一种方式中,服务端可以在一个统一资源定位系统(Uniform ResourceLocator,URL) 上发布该查询基,进而客户端可以从该URL上获取该查询基。Furthermore, the server can send the query base to the client that needs to retrieve it. Similarly, the server can directly send the query base to the client, such as directly to the client's device, or to the client's proxy server; ResourceLocator, URL), and then the client can obtain the query base from the URL.
相应的,客户端可以接收到该查询基,并将接收到的查询基保存在本地。Correspondingly, the client can receive the query base and store the received query base locally.
例如图1所示,客户端与服务端之间的交互过程可以包括如下步骤:For example, as shown in Figure 1, the interaction process between the client and the server may include the following steps:
S110:客户端发送经自身加密的敏感字段至服务端,并通过与服务端的交互得到由服务端加密的同一敏感字段。S110: The client sends the encrypted sensitive field to the server, and obtains the same sensitive field encrypted by the server through interaction with the server.
例如,客户端的检索条件为Age字段值为25,而25为敏感字段,即不希望让对端知道。为了避免让服务端知道客户端检索条件是Age字段的值25,客户端可以将该25加密。例如,采用 RSA/ECC私钥加密,客户端采用的加密算法与服务端生成查询基采用的加密算法相同。For example, the retrieval condition of the client is that the value of the Age field is 25, and 25 is a sensitive field, that is, the peer end is not expected to know. In order to prevent the server from knowing that the retrieval condition of the client is the value 25 of the Age field, the client can encrypt the value 25. For example, RSA/ECC private key encryption is used, and the encryption algorithm used by the client is the same as that used by the server to generate the query base.
具体的,采用RSA私钥加密的情况下,客户端自身生成秘密β并妥善保存。进而,客户端可以采用自身私钥β对25加密。具体的,可以是对25或对25的hash值加密。这里以对25的hash 加密为例加以说明,对25直接加密的情况类似,客户端与服务端采用相同的hash算法。例如,客户端采用与服务端相同的大质数q作为模数。客户端可以将对25采用β对25的hash值进行RSA 加密,得到(H(25))β。则客户端发送至服务端的敏感字段可以为(H(25))β,其中,(H(25))β表示敏感字段的值25的密文。Specifically, in the case of RSA private key encryption, the client itself generates the secret β and keeps it properly. Furthermore, the client can encrypt 25 with its own private key β. Specifically, it may be to encrypt 25 or a hash value of 25. Here we take the hash encryption of 25 as an example to illustrate. The situation of direct encryption of 25 is similar, and the client and server use the same hash algorithm. For example, the client uses the same large prime number q as the modulus as the server. The client can perform RSA encryption on the hash value of 25 using β to obtain (H(25)) β . Then the sensitive field sent by the client to the server may be (H(25)) β , where (H(25)) β represents the ciphertext of the value 25 of the sensitive field.
另一方面,客户端也可以构造检索语句,并将检索语句中的敏感字段加密后得到隐私字段,并用隐私字段替换敏感字段,将替换后的隐私检索语句发送至服务端。On the other hand, the client can also construct a search statement, encrypt the sensitive fields in the search statement to obtain the private field, replace the sensitive field with the private field, and send the replaced private search statement to the server.
例如,客户端构造的查询语句为select Name where Age=25。For example, the query statement constructed by the client is select Name where Age=25.
为了保护隐私,即不让服务端获得查询的是Age=25这个条件,例如是将其中的25隐私保护起来,结果如下:In order to protect privacy, that is, the age=25 condition is what prevents the server from obtaining the query. For example, the privacy of 25 of them is protected, and the result is as follows:
select Name where Age=?select Name where Age=?
其中,?表示替换后的检索语句。in,? Indicates the replaced search statement.
具体的,客户端可以将25用RSA私钥加密。例如,客户端可以将对25采用与服务端相同hash 函数进行hash计算,进而采用β对25的hash值进行RSA加密,得到(H(25))β。则客户端发送至服务端的查询语句例如为如下:Specifically, the client can encrypt 25 with an RSA private key. For example, the client can perform hash calculation on 25 using the same hash function as the server, and then use β to perform RSA encryption on the hash value of 25 to obtain (H(25)) β . The query statement sent by the client to the server is, for example, as follows:
select Name where Age=(H(25))β select Name where Age=(H(25)) β
如前所述,(H(25))β为密文,即为上面检索语句中的“?”代表的内容,服务端获得后并不能知晓其中的β和25。As mentioned above, (H(25)) β is the ciphertext, that is, the content represented by "?" in the above search statement, and the server cannot know β and 25 in it after obtaining it.
采用ECC私钥加密的情况下,客户端采用与服务端相同的椭圆曲线,即具有相同的椭圆曲线参数和生成元。客户端自身生成秘密β并妥善保存。进而,客户端可以采用自身私钥β对25 加密。具体的,可以是对25的hash值加密,客户端与服务端采用相同的hash算法。例如,客户端可以采用β对25的hash值进行ECC加密,得到β·H(25)。则客户端发送至服务端的敏感字段可以为β·H(25),其中,β·H(25)表示敏感字段的值25的密文。In the case of ECC private key encryption, the client uses the same elliptic curve as the server, that is, has the same elliptic curve parameters and generators. The client itself generates the secret β and keeps it securely. Furthermore, the client can encrypt 25 with its own private key β. Specifically, the hash value of 25 may be encrypted, and the client and server use the same hash algorithm. For example, the client can use β to perform ECC encryption on the hash value of 25 to obtain β·H(25). Then the sensitive field sent by the client to the server may be β·H(25), where β·H(25) represents the ciphertext of the value 25 of the sensitive field.
另一方面,客户端也可以构造检索语句,并将检索语句中的敏感字段加密后得到隐私字段,并用隐私字段替换敏感字段,将替换后的隐私检索语句发送至服务端。On the other hand, the client can also construct a search statement, encrypt the sensitive fields in the search statement to obtain the private field, replace the sensitive field with the private field, and send the replaced private search statement to the server.
例如,客户端构造的查询语句为select Name where Age=25。For example, the query statement constructed by the client is select Name where Age=25.
为了保护隐私,即不让服务端获得查询的是Age=25这个条件,例如是将其中的25隐私保护起来,结果如下:In order to protect privacy, that is, the age=25 condition is what prevents the server from obtaining the query. For example, the privacy of 25 of them is protected, and the result is as follows:
select Name where Age=?select Name where Age=?
其中,?表示替换后的检索语句。in,? Indicates the replaced search statement.
具体的,客户端可以将25用ECC私钥加密。例如,客户端采用与服务端相同的椭圆曲线,即具有相同的椭圆曲线参数和生成元。客户端可以将检索语句中的敏感字段用自身ECC私钥加密后替换,将替换后的隐私检索语句发送至服务端。例如客户端自身生成秘密β并妥善保存。此外,客户端可以将对25采用与服务端相同hash函数进行hash计算,进而采用β对25的hash 值进行ECC加密,得到β·H(25)。则客户端发送至服务端的查询语句例如为如下:Specifically, the client can encrypt 25 with an ECC private key. For example, the client uses the same elliptic curve as the server, that is, has the same elliptic curve parameters and generators. The client can encrypt and replace the sensitive fields in the search statement with its own ECC private key, and send the replaced private search statement to the server. For example, the client itself generates the secret β and keeps it properly. In addition, the client can perform hash calculation on 25 using the same hash function as the server, and then use β to perform ECC encryption on the hash value of 25 to obtain β·H(25). The query statement sent by the client to the server is, for example, as follows:
select Name where Age=β·H(25)select Name where Age=β·H(25)
如前所述,β·H(25)为密文,即为上面检索语句中的“?”代表的内容,服务端获得后并不能知晓其中的β和25。As mentioned above, β·H(25) is the ciphertext, that is, the content represented by "?" in the above search statement, and the server cannot know β and 25 in it after obtaining it.
所述客户端通过与服务端的交互得到由服务端加密的同一敏感字段,可以包括服务端采用自身密钥对由客户端加密的敏感字段再次加密后发送至客户端,客户端采用自身密钥对两次加密后的敏感字段解密得到由服务端加密的敏感字段。该内容的核心是需要找到一个满足连续两次加密操作(两方先后加密)可以交换顺序进行解密的加密算法。根据ECC的密码学性质,双方约定采用相同的椭圆曲线,即具有相同的椭圆曲线参数和生成元,各自持有私钥α和β,加密操作为用α(或β)进行标量乘法运算,不论先用α加密后用β加密还是先用β加密后用α加密,都可以用相同或不同的顺序解密,即可以对加密结果用不同的顺序解密。类似的,根据RSA的密码学性质加密,双方约定采用一个相同的大质数q和原根g,各自持有私钥α和β,加密操作为用α(或β)求幂并用q取模,不论先用α加密后用β加密还是先用β加密后用α加密,都可以用相同或不同的顺序解密,即可以对加密结果用不同的顺序解密。整体来说,这里客户端和服务端对同一目标执行的加/解密采用可交换顺序的加/解密算法。The client obtains the same sensitive field encrypted by the server through interaction with the server, which may include that the server uses its own key to re-encrypt the sensitive field encrypted by the client and sends it to the client, and the client uses its own key to pair The sensitive fields encrypted twice are decrypted to obtain the sensitive fields encrypted by the server. The core of this content is to find an encryption algorithm that satisfies two consecutive encryption operations (two parties encrypt successively) and can exchange order for decryption. According to the cryptographic nature of ECC, the two parties agree to use the same elliptic curve, that is, have the same elliptic curve parameters and generators, each hold the private key α and β, and the encryption operation is scalar multiplication with α (or β), regardless of Whether it is first encrypted with α and then encrypted with β or encrypted with β first and then encrypted with α, it can be decrypted in the same or different order, that is, the encrypted results can be decrypted in different orders. Similarly, according to the cryptographic nature of RSA encryption, the two parties agree to use the same large prime number q and original root g, each holding private keys α and β, and the encryption operation is to use α (or β) to exponentiate and use q to take the modulus, Regardless of first encrypting with α and then encrypting with β or first encrypting with β and then encrypting with α, they can be decrypted in the same or different order, that is, the encrypted results can be decrypted in different orders. On the whole, the encryption/decryption performed by the client and the server on the same target adopts the encryption/decryption algorithm with interchangeable order.
具体的,可以是服务端收到隐私检索语句后,对隐私字段再次加密后返回至客户端,也可以是服务端收到客户端发送的经客户端自身加密的敏感字段后,服务端对加密的敏感字段再次用服务端自身密钥加密后返回至客户端。进而,客户端采用自身密钥对两次加密后的敏感字段解密得到由服务端加密的敏感字段。Specifically, it may be that after the server receives the privacy search statement, it encrypts the private field again and returns it to the client, or it may be that after the server receives the sensitive field encrypted by the client itself from the client, the server encrypts the encrypted field. Sensitive fields of the server are encrypted again with the server's own key and returned to the client. Furthermore, the client uses its own key to decrypt the twice-encrypted sensitive field to obtain the sensitive field encrypted by the server.
例如,情况1:服务端可以接收到客户端发送的(H(25))β。For example, Case 1: The server can receive (H(25)) β sent by the client.
服务端可以对加密后的敏感字段(亦即隐私字段)再次加密,并将再次加密后的敏感字段返回至客户端。具体的,服务端可以对隐私字段(H(25))β采用自身的RSA私钥α进行再次加密,得到((H(25))β)α。The server may re-encrypt the encrypted sensitive field (ie, the private field), and return the re-encrypted sensitive field to the client. Specifically, the server can re-encrypt the privacy field (H(25)) β with its own RSA private key α to obtain ((H(25)) β ) α .
例如,情况1':服务端可以接收到客户端发送的隐私检索语句select Name whereAge=(H(25))β。这样,服务端可以获得该隐私检索语句中的隐私字段(H(25))β。For example, case 1': the server may receive the privacy search statement select Name whereAge=(H(25)) β sent by the client. In this way, the server can obtain the privacy field (H(25)) β in the privacy retrieval statement.
服务端可以对隐私字段再次加密,并将再次加密后的隐私字段返回至客户端。The server can re-encrypt the privacy field, and return the re-encrypted privacy field to the client.
具体的,服务端可以对隐私字段(H(25))β采用自身的RSA私钥α进行再次加密,得到 ((H(25))β)α,具体过程类似上述,这里不再赘述。Specifically, the server can re-encrypt the privacy field (H(25)) β with its own RSA private key α to obtain ((H(25)) β ) α . The specific process is similar to the above, and will not be repeated here.
例如,情况2:服务端可以接收到客户端发送的β·H(25)。For example, case 2: the server can receive the β·H(25) sent by the client.
服务端可以对隐私字段再次加密,并将再次加密后的隐私字段返回至客户端。具体的,服务端可以对隐私字段β·H(25)采用自身的ECC私钥α进行再次加密,得到α·β·H(25)。The server can re-encrypt the privacy field, and return the re-encrypted privacy field to the client. Specifically, the server can re-encrypt the privacy field β·H(25) with its own ECC private key α to obtain α·β·H(25).
例如,情况2':服务端可以接收到客户端发送的隐私检索语句select Name whereAge=β·H(25)。这样,服务端可以获得该隐私检索语句中的隐私字段β·H(25)。For example, case 2': the server may receive the privacy retrieval statement select Name whereAge=β·H(25) sent by the client. In this way, the server can obtain the privacy field β·H(25) in the privacy retrieval sentence.
服务端可以对隐私字段再次加密,并将再次加密后的隐私字段返回至客户端。The server can re-encrypt the privacy field, and return the re-encrypted privacy field to the client.
具体的,服务端可以对隐私字段β·H(25)采用自身的ECC私钥α进行再次加密,得到α·β·H(25),具体过程类似上述,这里不再赘述。Specifically, the server can re-encrypt the privacy field β·H(25) with its own ECC private key α to obtain α·β·H(25). The specific process is similar to the above and will not be repeated here.
服务端采用自身密钥对由客户端加密的敏感字段(即隐私字段)再次加密后发送至客户端后,客户端可以采用自身密钥对两次加密后的隐私字段解密得到由服务端加密的敏感字段。After the server uses its own key to re-encrypt the sensitive field (that is, the privacy field) encrypted by the client and sends it to the client, the client can use its own key to decrypt the twice-encrypted privacy field to obtain the encrypted data from the server. Sensitive fields.
例如,对应上面情况1和1',客户端接收到服务端发送的((H(25))β)α,其中的幂次方运算存在性质如下:((H(25))β)α=(H(25))βα=(H(25))αβ=((H(25))α)β。进而,客户端可以采用自身私钥β的逆元对两次加密后的敏感字段解密,如下:这样,客户端得到由服务端加密的同一敏感字段,即(H(25))α。For example, corresponding to the above cases 1 and 1', the client receives the ((H(25)) β ) α sent by the server, and the existence property of the power operation is as follows: ((H(25)) β ) α = (H(25)) βα = (H(25)) αβ = ((H(25)) α ) β . Furthermore, the client can use the inverse of its own private key β Decrypt the sensitive fields encrypted twice, as follows: In this way, the client gets the same sensitive field encrypted by the server, namely (H(25)) α .
对应上面情况2和2',客户端接收到服务端发送的α·β·H(25),其中的标量乘法运算存在性质如下:α·β·H(25)=β·α·H(25)。进而,客户端可以采用自身私钥β的逆元β-1对两次加密后的敏感字段解密,如下:β-1·α·β·H(25)=β-1·β·α·H(25)=α·H(25)。这样,客户端同样得到由服务端加密的同一敏感字段,即α·H(25)。Corresponding to the above cases 2 and 2', the client receives the α·β·H(25) sent by the server, and the existence property of the scalar multiplication operation is as follows: α·β·H(25)=β·α·H(25 ). Furthermore, the client can use the inverse element β -1 of its own private key β to decrypt the sensitive fields encrypted twice, as follows: β -1 ·α·β·H(25)=β -1 ·β·α·H (25)=α·H(25). In this way, the client also gets the same sensitive field encrypted by the server, namely α·H(25).
需要说明的是,RSA中,根据欧拉定理,pk·sk=1mod(p-1)·(q-1),其中p和q两个大质数,所以pk和sk互为逆元。类似的,ECC中,pk=sk*G,G为ECC选定曲线上的一个生成元,所以pk和sk也是互为逆元。It should be noted that in RSA, according to Euler's theorem, pk·sk=1mod(p-1)·(q-1), where p and q are two large prime numbers, so pk and sk are inverses of each other. Similarly, in ECC, pk=sk*G, G is a generator on the selected curve of ECC, so pk and sk are also mutually inverse elements.
S120:客户端在查询基中根据所述由服务端加密的敏感字段检索,得到匹配记录的标识,并将该标识返回至服务端。S120: The client searches in the query base according to the sensitive field encrypted by the server, obtains an identifier of a matching record, and returns the identifier to the server.
S110执行后,客户端可以得到由服务端加密的同一敏感字段。After S110 is executed, the client can obtain the same sensitive field encrypted by the server.
客户端可以基于该由服务端加密的敏感字段在查询基中查询。例如,客户端解密后得到由服务端加密的隐私字段α·H(25)或(H(25))α,从而客户端基于该隐私字段在查询基中查询,例如分别在表2或表3中查询,可以得到Age中包含该隐私字段的记录为ID=d_1的这条记录,ID 为这条记录的标识。这样,客户基于该由服务端加密的隐私字段在查询基中查询,匹配到记录后可以定位得到匹配记录的标识。Clients can query the query base based on this sensitive field encrypted by the server. For example, the client decrypts and obtains the privacy field α·H(25) or (H(25)) α encrypted by the server, so that the client queries in the query base based on the privacy field, for example, in Table 2 or Table 3 Query in , it can be obtained that the record containing the privacy field in Age is the record with ID=d_1, and ID is the identifier of this record. In this way, the client queries the query base based on the private field encrypted by the server, and after matching records, the client can locate and obtain the identification of the matching records.
所述客户端将匹配记录的标识返回至服务端,可以包括两种情况。The client returns the identifier of the matching record to the server, which may include two situations.
一种是S110中,客户端构造检索语句,并将检索语句中的敏感字段加密后得到隐私字段,并用隐私字段替换敏感字段,将替换后的隐私检索语句发送至服务端的情况。该情况下,客户端可以直接将匹配记录的标识返回至服务端。One is that in S110, the client constructs a search statement, encrypts sensitive fields in the search statement to obtain a private field, replaces the sensitive field with the private field, and sends the replaced private search statement to the server. In this case, the client can directly return the identifier of the matching record to the server.
另一种是S110中,客户端发送经自身加密的敏感字段的值至服务端的情况。该情况下,S120中,客户端可以构造检索语句,例如检索语句为:The other is in S110, the client sends the value of the sensitive field encrypted by itself to the server. In this case, in S120, the client may construct a search statement, for example, the search statement is:
select Name where ID=id_1select Name where ID=id_1
这样,S120中,客户端可以将构造的上述检索式发送至服务端,该检索式中包含了匹配记录的标识,并指示感兴趣的字段是Name,也就是select后面紧跟的字段名称。In this way, in S120, the client can send the above constructed search formula to the server, the search formula includes the identification of the matching record, and indicates that the field of interest is Name, that is, the name of the field immediately following select.
换句话说,S110和S120这两个步骤中,可以选择在其中的一个步骤中发送检索式,该检索式中包含了感兴趣字段。In other words, in the two steps of S110 and S120, one of the steps may choose to send a search formula, and the search formula includes the field of interest.
S130:服务端返回所述数据库中所述标识对应记录中的感兴趣字段的值至客户端。S130: The server returns the value of the field of interest in the record corresponding to the identifier in the database to the client.
仍然按照上述例子,服务端接收到客户端发来的标识后,可以在所述数据库中查找所述标识对应记录,并按照S110或S120的感兴趣字段取出查找到的记录中的相应值,并将该取出的感兴趣字段的值返回至客户端。例如,返回id_1对应记录中的Name=B,即返回B至客户端。Still according to the above example, after receiving the identifier sent by the client, the server can search the database for the record corresponding to the identifier, and retrieve the corresponding value in the found record according to the field of interest in S110 or S120, and Return the retrieved value of the field of interest to the client. For example, return Name=B in the record corresponding to id_1, that is, return B to the client.
上述实施例中,通过将查询基预先配置到客户端的形式,实现不暴露数据库明文的情况下由客户端通过与服务端交互及查询基定位要查询的字段在查询基中的标识,进一步根据标识向服务端发起查询,得到标识对应记录中的感兴趣字段的值。相对于传统的多副本PIR,显然不需要多个副本数据库之间不能合谋的前提假设,实用性更好。相对于传统的单副本PIR 中只能实现比特位检索的情形,本实施例不需要关注要检索的关键字在数据库中的具体位置 (比特位置),可以实现字符串的查询,且可以支持结构化查询语句(Structured Query Language,SQL)。本实施例中数据库仍然保持在服务端,同时将数据库加密得到的查询基配置到客户端,以便于客户端检索时基于查询基进行数据定位以得到记录的标识,同时查询基的加密特性使得客户端不会获得数据库的内容,保证了服务端对数据库的隐私保护。整体来说,本实施例中数据库、查询基的形式,在一个服务端配置数据库和一个客户端配置查询基的情况下可以称为“非对称双副本”,在多个客户端配置查询基的情况下可以称为“非对称多副本”。In the above embodiment, by pre-configuring the query base in the form of the client, the client can locate the identifier of the field to be queried in the query base by interacting with the server and the query base without exposing the plaintext of the database, and further according to the identifier Initiate a query to the server to obtain the value identifying the field of interest in the corresponding record. Compared with the traditional multi-copy PIR, it obviously does not need the premise that multiple copy databases cannot collude, and the practicability is better. Compared with the situation in which only bits can be retrieved in the traditional single-copy PIR, this embodiment does not need to pay attention to the specific position (bit position) of the keyword to be retrieved in the database, can realize the query of the character string, and can support the structure Structured Query Language (SQL). In this embodiment, the database remains on the server, and at the same time, the query base encrypted by the database is configured to the client, so that the client can locate the data based on the query base to obtain the identification of the record, and the encryption feature of the query base enables the client to The client will not obtain the contents of the database, which ensures the privacy protection of the database by the server. On the whole, the form of database and query base in this embodiment can be called "asymmetric double copy" when a server configures a database and a client configures a query base. The case can be called "asymmetric multi-copy".
上述实施例中,通过SQL查询语句,客户端可以发起对感兴趣字段的查询,例如上述select Name...中要查询的Name字段。这在一定程度上暴露了客户端的感兴趣字段。另一种方式中,可以查询符合条件的记录,即符合条件的整行数据,这样可以保护客户端的隐私,但是需要服务端返回整条记录,这就一定程度上暴露了服务端的整行数据。例如S110/S120中通过“select*where Age=?”或“select*where ID=id_1”这样的检索语句。这样,服务端返回的结果可以是id_1这条的记录,例如如下:In the above embodiment, the client can initiate a query on the field of interest through the SQL query statement, for example, the Name field to be queried in the above select Name.... This somewhat exposes fields of interest to the client. In another way, you can query qualified records, that is, the entire row of data that meets the conditions, which can protect the privacy of the client, but requires the server to return the entire record, which exposes the entire row of data on the server to a certain extent. For example, in S110/S120, a search statement such as "select*where Age=?" or "select*where ID=id_1" is used. In this way, the result returned by the server can be the record of id_1, for example as follows:
id_1B 25shanghaiid_1B 25shanghai
另外,为了保证传输过程的安全,所述服务端可以加密返回所述数据库中所述标识对应记录/对应记录中感兴趣字段的值至所述客户端。例如,服务端可以采用与客户端协商所得的对称密钥对数据库中所述标识对应记录/对应记录中感兴趣字段的值加密后返回至所述客户端,或者采用所述客户端的非对称密钥中的公钥对数据库中所述标识对应记录/对应记录中感兴趣字段的值加密后返回至所述客户端,从而客户端可以用自身私钥解密,以及采用数字信封方式等等。In addition, in order to ensure the security of the transmission process, the server may encrypt and return the value of the corresponding record in the database/the value of the field of interest in the corresponding record to the client. For example, the server can use the symmetric key negotiated with the client to encrypt the value of the field of interest in the corresponding record/record in the database and return it to the client, or use the asymmetric key of the client to The public key in the key encrypts the corresponding record in the database/the value of the field of interest in the corresponding record and returns it to the client, so that the client can decrypt it with its own private key, and use a digital envelope, etc.
上述S120中,客户端直接将匹配得到的ID返回至服务端,虽然可以从服务端获得该ID对应的记录或记录中的感兴趣字段,如S130,但是,这会一定程度上暴露客户端的隐私,即会让服务端知道客户端想查询的标识是id_1。为了保护客户端的隐私,可以通过下述实施例中的方式实现:In the above S120, the client directly returns the matched ID to the server. Although the record corresponding to the ID or the field of interest in the record can be obtained from the server, such as S130, this will expose the privacy of the client to a certain extent. , which will let the server know that the identifier that the client wants to query is id_1. In order to protect the privacy of the client, it can be implemented in the following embodiments:
S210:客户端发送经自身加密的敏感字段至服务端,并通过与服务端的交互得到由服务端加密的同一敏感字段。S210: The client sends the encrypted sensitive field to the server, and obtains the same sensitive field encrypted by the server through interaction with the server.
例如,客户端的检索条件为Age字段值为25,而25为敏感字段,即不希望让对端知道。为了避免让服务端知道客户端检索条件是Age字段的值25,客户端可以将该25加密。例如,采用 RSA/ECC私钥加密,客户端采用的加密算法与服务端生成查询基采用的加密算法相同。For example, the retrieval condition of the client is that the value of the Age field is 25, and 25 is a sensitive field, that is, the peer end is not expected to know. In order to prevent the server from knowing that the retrieval condition of the client is the value 25 of the Age field, the client can encrypt the value 25. For example, RSA/ECC private key encryption is used, and the encryption algorithm used by the client is the same as that used by the server to generate the query base.
具体的,采用RSA私钥加密的情况下,客户端自身生成秘密β并妥善保存。进而,客户端可以采用自身私钥β对25加密。具体的,可以是对25或对25的hash值加密。这里以对25的hash 加密为例加以说明,对25直接加密的情况类似,客户端与服务端采用相同的hash算法。例如,客户端采用与服务端相同的大质数q作为模数。客户端可以将对25采用β对25的hash值进行RSA 加密,得到(H(25))β。则客户端发送至服务端的敏感字段可以为(H(25))β,其中,(H(25))β表示敏感字段的值25的密文。Specifically, in the case of RSA private key encryption, the client itself generates the secret β and keeps it properly. Furthermore, the client can encrypt 25 with its own private key β. Specifically, it may be to encrypt 25 or a hash value of 25. Here we take the hash encryption of 25 as an example to illustrate. The situation of direct encryption of 25 is similar, and the client and server use the same hash algorithm. For example, the client uses the same large prime number q as the modulus as the server. The client can perform RSA encryption on the hash value of 25 using β to obtain (H(25)) β . Then the sensitive field sent by the client to the server may be (H(25)) β , where (H(25)) β represents the ciphertext of the value 25 of the sensitive field.
采用ECC私钥加密的情况下,客户端采用与服务端相同的椭圆曲线,即具有相同的椭圆曲线参数和生成元。客户端自身生成秘密β并妥善保存。进而,客户端可以采用自身私钥β对25 加密。具体的,可以是对25的hash值加密,客户端与服务端采用相同的hash算法。例如,客户端可以采用β对25的hash值进行ECC加密,得到β·H(25)。则客户端发送至服务端的敏感字段可以为β·H(25),其中,β·H(25)表示敏感字段的值25的密文。In the case of ECC private key encryption, the client uses the same elliptic curve as the server, that is, has the same elliptic curve parameters and generators. The client itself generates the secret β and keeps it securely. Furthermore, the client can encrypt 25 with its own private key β. Specifically, the hash value of 25 may be encrypted, and the client and server use the same hash algorithm. For example, the client can use β to perform ECC encryption on the hash value of 25 to obtain β·H(25). Then the sensitive field sent by the client to the server may be β·H(25), where β·H(25) represents the ciphertext of the value 25 of the sensitive field.
所述客户端通过与服务端的交互得到由服务端加密的同一敏感字段,可以包括服务端采用自身密钥对由客户端加密的敏感字段再次加密后发送至客户端,客户端采用自身密钥对两次加密后的敏感字段解密得到由服务端加密的敏感字段。该内容的核心是需要找到一个满足连续两次加密操作(两方先后加密)可以交换顺序进行解密的加密算法。根据ECC的密码学性质,双方约定采用相同的椭圆曲线,即具有相同的椭圆曲线参数和生成元,各自持有私钥α和β,加密操作为用α(或β)进行标量乘法运算,不论先用α加密后用β加密还是先用β加密后用α加密,都可以用相同或不同的顺序解密,即可以对加密结果用不同的顺序解密。类似的,根据RSA的密码学性质加密,双方约定采用一个相同的大质数q和原根g,各自持有私钥α和β,加密操作为用α(或β)求幂并用q取模,不论先用α加密后用β加密还是先用β加密后用α加密,都可以用相同或不同的顺序解密,即可以对加密结果用不同的顺序解密。整体来说,这里客户端和服务端对同一目标执行的加/解密采用可交换顺序的加/解密算法。The client obtains the same sensitive field encrypted by the server through interaction with the server, which may include that the server uses its own key to re-encrypt the sensitive field encrypted by the client and sends it to the client, and the client uses its own key to pair The sensitive fields encrypted twice are decrypted to obtain the sensitive fields encrypted by the server. The core of this content is to find an encryption algorithm that satisfies two consecutive encryption operations (two parties encrypt successively) and can exchange order for decryption. According to the cryptographic nature of ECC, the two parties agree to use the same elliptic curve, that is, have the same elliptic curve parameters and generators, each hold the private key α and β, and the encryption operation is scalar multiplication with α (or β), regardless of Whether it is first encrypted with α and then encrypted with β or encrypted with β first and then encrypted with α, it can be decrypted in the same or different order, that is, the encrypted results can be decrypted in different orders. Similarly, according to the cryptographic nature of RSA encryption, the two parties agree to use the same large prime number q and original root g, each holding private keys α and β, and the encryption operation is to use α (or β) to exponentiate and use q to take the modulus, Regardless of first encrypting with α and then encrypting with β or first encrypting with β and then encrypting with α, they can be decrypted in the same or different order, that is, the encrypted results can be decrypted in different orders. On the whole, the encryption/decryption performed by the client and the server on the same target adopts the encryption/decryption algorithm with interchangeable order.
具体的,可以是服务端收到客户端发送的经客户端自身加密的敏感字段后,服务端对加密的敏感字段再次用服务端自身密钥加密后返回至客户端。进而,客户端采用自身密钥对两次加密后的敏感字段解密得到由服务端加密的敏感字段。Specifically, after the server receives the encrypted sensitive field sent by the client, the server encrypts the encrypted sensitive field again with the server's own key and returns it to the client. Furthermore, the client uses its own key to decrypt the twice-encrypted sensitive field to obtain the sensitive field encrypted by the server.
例如,情况1:服务端可以接收到客户端发送的(H(25))β。For example, Case 1: The server can receive (H(25)) β sent by the client.
服务端可以对加密后的敏感字段(亦即隐私字段)再次加密,并将再次加密后的敏感字段返回至客户端。具体的,服务端可以对隐私字段(H(25))β采用自身的RSA私钥α进行再次加密,得到((H(25))β)α。The server may re-encrypt the encrypted sensitive field (ie, the private field), and return the re-encrypted sensitive field to the client. Specifically, the server can re-encrypt the privacy field (H(25)) β with its own RSA private key α to obtain ((H(25)) β ) α .
例如,情况2:服务端可以接收到客户端发送的β·H(25)。For example, case 2: the server can receive the β·H(25) sent by the client.
服务端可以对隐私字段再次加密,并将再次加密后的隐私字段返回至客户端。具体的,服务端可以对隐私字段β·H(25)采用自身的ECC私钥α进行再次加密,得到α·β·H(25)。The server can re-encrypt the privacy field, and return the re-encrypted privacy field to the client. Specifically, the server can re-encrypt the privacy field β·H(25) with its own ECC private key α to obtain α·β·H(25).
服务端采用自身密钥对由客户端加密的敏感字段(即隐私字段)再次加密后发送至客户端后,客户端可以采用自身密钥对两次加密后的隐私字段解密得到由服务端加密的敏感字段。After the server uses its own key to re-encrypt the sensitive field (that is, the privacy field) encrypted by the client and sends it to the client, the client can use its own key to decrypt the twice-encrypted privacy field to obtain the encrypted data from the server. Sensitive fields.
例如,对应上面情况1,客户端接收到服务端发送的((H(25))β)α,其中的幂次方运算存在性质如下:((H(25))β)α=(H(25))βα=(H(25))αβ=((H(25))α)β。进而,客户端可以采用自身私钥β的逆元对两次加密后的敏感字段解密,如下: 这样,客户端得到由服务端加密的同一敏感字段,即(H(25))α。For example, corresponding to the above case 1, the client receives ((H(25)) β ) α sent by the server, and the existence property of the power operation is as follows: ((H(25)) β ) α =(H( 25)) βα = (H(25)) αβ = ((H(25)) α ) β . Furthermore, the client can use the inverse of its own private key β Decrypt the sensitive fields encrypted twice, as follows: In this way, the client gets the same sensitive field encrypted by the server, namely (H(25)) α .
对应上面情况2,客户端接收到服务端发送的α·β·H(25),其中的标量乘法运算存在性质如下:α·β·H(25)=β·α·H(25)。进而,客户端可以采用自身私钥β的逆元β-1对两次加密后的敏感字段解密,如下:β-1·α·β·H(25)=β-1·β·α·H(25)=α·H(25)。这样,客户端同样得到由服务端加密的同一敏感字段,即α·H(25)。Corresponding to the above case 2, the client receives the α·β·H(25) sent by the server, and the existence property of the scalar multiplication operation is as follows: α·β·H(25)=β·α·H(25). Furthermore, the client can use the inverse element β -1 of its own private key β to decrypt the sensitive fields encrypted twice, as follows: β -1 ·α·β·H(25)=β -1 ·β·α·H (25)=α·H(25). In this way, the client also gets the same sensitive field encrypted by the server, namely α·H(25).
需要说明的是,RSA中,根据欧拉定理,pk·sk=1mod(p-1)·(q-1),其中p和q两个大质数,所以pk和sk互为逆元。类似的,ECC中,pk=sk·G,G为ECC选定曲线上的一个生成元,所以pk和sk也是互为逆元。It should be noted that in RSA, according to Euler's theorem, pk·sk=1mod(p-1)·(q-1), where p and q are two large prime numbers, so pk and sk are inverses of each other. Similarly, in ECC, pk=sk·G, G is a generating element on the selected curve of ECC, so pk and sk are also mutually inverse elements.
S220:客户端在查询基中根据所述由服务端加密的敏感字段检索,得到匹配记录的标识。S220: The client searches in the query base according to the sensitive field encrypted by the server, and obtains the identifier of the matching record.
S210执行后,客户端可以得到由服务端加密的同一敏感字段。After S210 is executed, the client can obtain the same sensitive field encrypted by the server.
客户端可以基于该由服务端加密的敏感字段在查询基中查询。例如,客户端解密后得到由服务端加密的隐私字段α·H(25)或(H(25))α,从而客户端基于该隐私字段在查询基中查询,例如分别在表2或表3中查询,可以得到Age中包含该隐私字段的记录为ID=d_1的这条记录,ID 为这条记录的标识。这样,客户基于该由服务端加密的隐私字段在查询基中查询,匹配到记录后可以定位得到匹配记录的标识。Clients can query the query base based on this sensitive field encrypted by the server. For example, the client decrypts and obtains the privacy field α·H(25) or (H(25)) α encrypted by the server, so that the client queries in the query base based on the privacy field, for example, in Table 2 or Table 3 Query in , it can be obtained that the record containing the privacy field in Age is the record with ID=d_1, and ID is the identifier of this record. In this way, the client queries the query base based on the private field encrypted by the server, and after matching records, the client can locate and obtain the identification of the matching records.
S230:服务端采用不经意传输方式返回所述数据库中包含所述匹配标识在内的预定大小标识集合对应记录中的感兴趣字段的值至客户端。S230: The server returns, to the client, the value of the field of interest in the record corresponding to the identifier set of a predetermined size including the matching identifier in the database by means of inadvertent transmission.
S220中,客户端并不将匹配得到的ID返回至服务端,这样服务端无法获知客户端想要查找的是哪一条或哪几条记录;S210中客户端发送的经过客户端加密的敏感字段使得服务端也无法获知客户端查找的敏感字段将命中哪一条或哪几条记录,而只有客户端自己知道。这样,保护了客户端的隐私。但是,最终仍需要完成检索,这就需要服务端将客户端想查询的记录返回至客户端。In S220, the client does not return the matching ID to the server, so that the server cannot know which record or records the client wants to find; the sensitive field encrypted by the client sent by the client in S210 This makes it impossible for the server to know which record or records will be hit by the sensitive fields searched by the client, and only the client knows. In this way, the privacy of the client is protected. However, the retrieval still needs to be completed in the end, which requires the server to return the records that the client wants to query to the client.
这里,服务端可以采用不经意传输方式。Here, the server can use the oblivious transmission mode.
不经意传输(Oblivious Transfer,OT)可以基于RSA、ECC等实现,可以实现2选1、n选1和m选1、m选k(k<m<n)等多种OT。以2选1OT为例说明其原理,发送者有两个秘密,分别是m1和m2,需要发送2个秘密至接收者,接收者只能选择解密其中的1个而无法获知另一个,同时发送者也无法得知接收者选择的是哪一个。以RSA为例,2选1的一个简单的实施流程如下:Oblivious Transfer (OT) can be implemented based on RSA, ECC, etc., and can realize multiple OTs such as 2 to 1, n to 1, m to 1, m to k (k<m<n). Take 2 choose 1OT as an example to illustrate the principle. The sender has two secrets, namely m1 and m2, and needs to send two secrets to the receiver. The receiver can only choose to decrypt one of them and cannot know the other. At the same time, send The recipient also has no way of knowing which one the recipient has chosen. Taking RSA as an example, a simple implementation process of choosing 1 from 2 is as follows:
首先,发送者生成两对不同的公私钥,并公开两个公钥,记这两个公钥分别为公钥1和公钥2。假设接收者希望知道m1,但不希望发送人知道他想要的是m1。接收者生成一个随机数r,再用公钥1对r进行加密,传给发送者。发送者用自身的两个私钥对这个加密后的r进行解密,用私钥1解密得到r1,用私钥2解密得到r2。显然,只有r1是和r相等的,r2则是一串毫无意义的数(也是解密结果)。但发送者不知道接收者加密时用的哪个公钥,因此发送者也不知道自己算出来的r1和r2中的哪个才是真的r。发送者接收到m1和m2后,用r1对m1进行对称加密,用r2对m2进行对称加密,并将两个对称加密结果发送至接收者。接收者本地具有的r=r1,所以接收者用r对发来的两个结果分别进行对称解密可以得到m1,但是无法解密得到m2,这是因为接收者所具有的r≠r2,接收者也就无法用正确的对称密钥进行解密得到m2的值。这个过程中,发送者也不知道接收者算出的是m1和m2中的哪一个。First, the sender generates two different pairs of public and private keys, and discloses the two public keys, which are recorded as public key 1 and public key 2 respectively. Suppose the receiver wants to know m1, but does not want the sender to know that he wants m1. The receiver generates a random number r, encrypts r with the public key 1, and sends it to the sender. The sender uses his own two private keys to decrypt the encrypted r, decrypts with private key 1 to obtain r1, and decrypts with private key 2 to obtain r2. Obviously, only r1 is equal to r, and r2 is a series of meaningless numbers (also the result of decryption). But the sender does not know which public key the receiver uses for encryption, so the sender does not know which of the r1 and r2 calculated by himself is the real r. After receiving m1 and m2, the sender uses r1 to symmetric encrypt m1, uses r2 to symmetric encrypt m2, and sends the two symmetric encryption results to the receiver. The receiver has r=r1 locally, so the receiver uses r to symmetrically decrypt the two sent results to get m1, but cannot decrypt to get m2, because the receiver has r≠r2, and the receiver also It is impossible to decrypt with the correct symmetric key to obtain the value of m2. In this process, the sender does not know which of m1 and m2 is calculated by the receiver.
有了2选1作为基础,可以将2个公私钥对扩展为n个公私钥对,就成为了n选1的OT。n选1 的核心在于,服务端用n个不同密钥分别加密所述数据表中的n个记录/对应记录中感兴趣字段的值得到n个加密结果,并发送该n个加密结果至客户端;客户端采用匹配标识对应的密钥解密所述服务端发送的n个加密结果中匹配标识对应的1个加密结果。With 2 to 1 as the basis, 2 public-private key pairs can be expanded into n public-private key pairs, which becomes an OT of n to 1. The core of choosing 1 from n is that the server uses n different keys to encrypt the n records in the data table/values of the fields of interest in the corresponding records to obtain n encryption results, and send the n encryption results to the client end; the client uses the key corresponding to the matching identifier to decrypt one encrypted result corresponding to the matching identifier among the n encrypted results sent by the server.
结合本说明书上述实施例,假设服务端的数据表中具有总计n条记录,这样,客户端的查询基中相应的也具有n条加密的记录。为了方便,数据记录的ID按照顺序标识为id_0、id_1、 id_2、...id_n-1。一个简单的实施流程如下:In combination with the above-mentioned embodiments of this specification, it is assumed that there are a total of n records in the data table of the server, so that the query base of the client also has n encrypted records correspondingly. For convenience, the IDs of the data records are identified as id_0, id_1, id_2, . . . id_n-1 in sequence. A simple implementation process is as follows:
S231:服务端预先生成n对不同的公私钥对并公开公钥。S231: The server pre-generates n pairs of different public-private key pairs and discloses the public keys.
这里的n等于数据库中的记录的数量。Here n is equal to the number of records in the database.
服务端生成n对不同的公私钥对(pk-sk;pk是publick key,表示公钥;sk是secretkey,表示私钥;公钥可以公开,私钥需要保密),例如分别是pk0-sk0,pk1-sk1,pk2-sk2,...,pkn-1-skn-1,并公开这n个公钥,即公开pk0,pk1,pk2,...,pkn-1。服务端公开这n个有序的公钥后,客户端可以获得这n个公钥。The server generates n pairs of different public-private key pairs (pk-sk; pk is a public key, indicating a public key; sk is a secretkey, indicating a private key; the public key can be made public, and the private key needs to be kept secret), for example, pk 0 -sk 0 , pk 1 -sk 1 , pk 2 -sk 2 , ..., pk n-1 -sk n-1 , and publicize these n public keys, that is, publicize pk 0 , pk 1 , pk 2 , ... , pk n-1 . After the server discloses the n ordered public keys, the client can obtain the n public keys.
S232:客户端生成随机数r,并用期望获得的ID对应的公钥对r进行加密后发送至服务端。S232: The client generates a random number r, encrypts r with the public key corresponding to the desired ID, and sends it to the server.
这里假设客户端希望获得id_1的那条记录,同时不希望服务端知道客户端想要获得的记录是id_1的那条。这样,客户端可以采用pk1对r进行加密后发送至服务端。上述的有序,主要是指ID和公钥有对应关系,而这样的对应关系可以被客户端知晓。例如上述例子中,客户端希望获得id_1的那条记录但同时不希望服务端知道客户端想要获得的记录是id_1的那条,客户端可以采用id_1对应的pk1对r进行加密后发送至服务端;类似的,客户端希望获得id_t 的那条记录但同时不希望服务端知道客户端想要获得的记录是id_t的那条,客户端可以采用 id_t对应的pkt对r进行加密后发送至服务端。Here it is assumed that the client wants to obtain the record of id_1, and at the same time does not want the server to know that the record the client wants to obtain is the record of id_1. In this way, the client can use pk 1 to encrypt r and send it to the server. The order mentioned above mainly means that there is a corresponding relationship between the ID and the public key, and such a corresponding relationship can be known by the client. For example, in the above example, the client wants to obtain the record of id_1 but does not want the server to know that the record that the client wants to obtain is the record of id_1. The client can use the pk 1 corresponding to id_1 to encrypt r and send it to Server; similarly, the client wants to obtain the record of id_t but at the same time does not want the server to know that the record that the client wants to obtain is the record of id_t. The client can use the pk t corresponding to id_t to encrypt r and send it to the server.
S233:服务端接收到加密的r后,用n个私钥分别对其解密。S233: After receiving the encrypted r, the server uses n private keys to decrypt it respectively.
服务端分别用sk0,sk1,sk2,...,skn-1分别解密经过pk1加密的随机数r。例如,服务端用sk0解密得到r0,用sk1解密得到r1,...,用skn-1解密得到r(n-1)。The server uses sk 0 , sk 1 , sk 2 , ..., sk n-1 to decrypt the random number r encrypted by pk 1 respectively. For example, the server decrypts with sk 0 to obtain r0, decrypts with sk 1 to obtain r1, ..., and decrypts with sk n-1 to obtain r(n-1).
显然,只有r1是和r相等的,因为只有用sk1进行解密的才是用对应pk1加密的;而用不对应pk1的sk0、sk2、...、skn-1解密得到的结果r0、r2、...、r(n-1)都不会与r相同。通过解密,服务端只是得到形式相同的解密结果,并不知道真正的r是什么,也不知道客户端是用哪个公钥进行加密的。换句话说,服务端不知道客户端加密r时用的哪个公钥,因此服务端也不知道解密得到的n个结果r0、r1、r2、...、r(n-1)中的哪个才是真正的r。Obviously , only r1 is equal to r, because only those decrypted with sk 1 are encrypted with the corresponding pk 1 ; and decrypted with sk 0 , sk 2 , ..., sk n-1 that do not correspond to pk 1 to get The results of r0, r2, ..., r(n-1) will not be the same as r. Through decryption, the server only gets the decryption result in the same form, and does not know what the real r is, nor does it know which public key the client uses for encryption. In other words, the server does not know which public key the client uses to encrypt r, so the server does not know which of the n results r0, r1, r2, ..., r(n-1) obtained by decryption is the real r.
S234:服务端将数据库中每条记录按照序号采用对应序号的解密结果进行对称加密,将对称加密后的结果发送至客户端。S234: The server performs symmetric encryption on each record in the database according to the serial number using the decryption result corresponding to the serial number, and sends the symmetrically encrypted result to the client.
例如,服务端将id_0这条记录采用r0进行对称加密,将id_1这条记录采用r1进行对称加密,...,将id_n-1这条记录采用r(n-1)进行对称加密,并将这n个对称加密结果发送至客户端。For example, the server uses r0 for symmetric encryption of the id_0 record, uses r1 for symmetric encryption of the id_1 record, ..., uses r(n-1) for symmetric encryption of the id_n-1 record, and The n symmetric encryption results are sent to the client.
S235:客户端采用所述随机数r对所述对称加密结果中期望获得的ID对应的加密结果进行对称解密,得到检索结果。S235: The client uses the random number r to perform symmetric decryption on the encrypted result corresponding to the desired ID in the symmetric encrypted result to obtain a search result.
客户端采用所述随机数r对所述对称加密结果中期望获得的ID对应的加密结果进行对称解密。具体的,例如上述S232中客户端期望获得的是id_1对应的那条记录/记录中的感兴趣字段的值,则客户端采用对应的公钥pk1对所述随机数r进行加密;S233中,服务端用对应的私钥sk1对解密结果进行解密,得到的r1=r,而用不对应pk1的sk0、sk2、...、skn-1解密得到的结果r0、r2、...、r(n-1)都不会与r相同;S234中,服务端用r0、r1、r2、...、r(n-1)分别对对应的id_0、id_1、...、id_n-1这些记录/记录中的感兴趣字段的值进行对称加密,并将这n个对称加密结果发送至客户端;S235中,客户端采用所述随机数r对所述n个对称加密结果进行对称解密。其中,n个对称解密结果中,只有id_1的加密结果是用r对称加密的,因此这里只有对id_1的加密结果采用r进行解密才能得到正确的值。从而,客户端可以获得正确的检索结果,即获得对应记录/对应记录中感兴趣字段的值。The client uses the random number r to symmetrically decrypt the encrypted result corresponding to the desired ID in the symmetric encrypted result. Specifically, for example, what the client expects to obtain in S232 above is the value of the field of interest in the record/record corresponding to id_1, then the client uses the corresponding public key pk 1 to encrypt the random number r; in S233 , the server uses the corresponding private key sk 1 to decrypt the decryption result, and the obtained r1=r, and uses sk 0 , sk 2 , ..., sk n-1 that do not correspond to pk 1 to decrypt the results r0, r2 , ..., r(n-1) will not be the same as r; in S234, the server uses r0, r1, r2, ..., r(n-1) to match the corresponding id_0, id_1, ... ., id_n-1 these records/values of interest fields in the records are symmetrically encrypted, and the n symmetric encryption results are sent to the client; in S235, the client uses the random number r to encrypt the n symmetric The encrypted result is symmetrically decrypted. Among them, among the n symmetric decryption results, only the encrypted result of id_1 is encrypted symmetrically with r, so here only the encrypted result of id_1 is decrypted with r to obtain the correct value. Therefore, the client can obtain a correct retrieval result, that is, obtain the corresponding record/the value of the field of interest in the corresponding record.
当然,为了减少计算量,客户端可以仅采用所述随机数r对所述n个对称加密结果中期望获得的ID对应的加密结果进行对称解密,即客户端仅采用r对id_1的加密结果进行解密,从而获得id_1的对应记录/对应记录中感兴趣字段的值,而无须采用r对id_0、id_2、...、id_n-1 这些对称加密结果进行对称解密,因为客户端可以知道这些加密结果并非采用r进行的对称加密,即使采用r进行对称解密也无法解出正确结果。Of course, in order to reduce the amount of calculation, the client may only use the random number r to perform symmetric decryption on the encryption result corresponding to the desired ID among the n symmetric encryption results, that is, the client only uses r to decrypt the encryption result of id_1 Decrypt to obtain the corresponding record of id_1/the value of the field of interest in the corresponding record, without using r to perform symmetric decryption on the symmetric encryption results of id_0, id_2, ..., id_n-1, because the client can know these encryption results It is not symmetric encryption using r, even if r is used for symmetric decryption, the correct result cannot be solved.
为了更清楚的呈现,这里对客户端采用所述随机数r对所述n个对称加密结果进行对称解密,进行如下解释:For a clearer presentation, the client uses the random number r to perform symmetric decryption on the n symmetric encryption results, as follows:
S234中,服务端用r0、r1、r2、...、r(n-1)分别对对应的id_0、id_1、...、id_n-1这些记录/记录中的感兴趣字段的值进行对称加密:In S234, the server uses r0, r1, r2, ..., r(n-1) to symmetrize the values of the fields of interest in the records/records corresponding to id_0, id_1, ..., id_n-1 encryption:
Enc(id_0,r0),其中r0≠r;Enc(id_0, r0), where r0≠r;
Enc(id_1,r1),其中r1=r;Enc(id_1, r1), where r1 = r;
Enc(id_2,r2),其中r2≠r;Enc(id_2, r2), where r2≠r;
......
Enc(id_n-1,r(n-1)),其中r(n-1)≠r;Enc(id_n-1, r(n-1)), where r(n-1)≠r;
上述Enc表示加密(Encrypt),Enc()括号中的前一部分的id_0、id_1、id_2、...、id_n-1 表示n条记录/n条记录中感兴趣字段的值,后一部分的r0、r1、r2、...、r(n-1)表示加密密钥。The above Enc means encryption (Encrypt). The id_0, id_1, id_2, ..., id_n-1 of the former part in the Enc() brackets represent the value of the field of interest in n records/n records, and the r0, id_n-1 of the latter part r1, r2, . . . , r(n-1) represent encryption keys.
S235中,客户端采用随机数r对所述对S234中的加密结果进行对称解密。具体的,客户端采用所述随机数r对下述内容分别进行对称解密:In S235, the client uses the random number r to symmetrically decrypt the encryption result in S234. Specifically, the client uses the random number r to symmetrically decrypt the following content respectively:
Dec(Enc(id_0,r0),r),其中r0≠r;Dec(Enc(id_0, r0), r), where r0≠r;
Dec(Enc(id_1,r1),r),其中r1=r;Dec(Enc(id_1, r1), r), where r1=r;
Dec(Enc(id_2,r2),r),其中r2≠r;Dec(Enc(id_2, r2), r), where r2≠r;
......
Dec(Enc(id_n-1,r(n-1),r),其中r(n-1)≠r;Dec(Enc(id_n-1, r(n-1), r), where r(n-1)≠r;
上述Dec表示解密(Decrypt),Dec()中的前一部分表示解密对象,这里也就是上面的加密结果,Dec()中的后一部分表示解密采用的密钥。The above-mentioned Dec means decryption (Decrypt), the first part in Dec() means the decryption object, which is the above encryption result, and the latter part in Dec() means the key used for decryption.
可见,客户端只能解密得到id_1的那条记录,而无法推测出其它记录。这是因为服务端只有对id_1的那条记录采用了随机数r进行对称加密,而对其它ID采用的并非随机数r进行的对称加密,而客户端也无法获得除r1=r以外的r0、r2、...、r(n-1)。It can be seen that the client can only decrypt the record with id_1, but cannot infer other records. This is because the server only uses random number r for symmetric encryption on the record of id_1, but uses non-random number r for symmetric encryption on other IDs, and the client cannot obtain r0, r2,...,r(n-1).
需要说明的是,S231可以是在S230之后,或者是在S230之前,这里并不限制。It should be noted that S231 may be after S230 or before S230, which is not limited here.
上面是所述客户端在查询基中检索得到匹配记录的数量为1时,通过n选1不经意传输,将所述数据库中包含所述匹配标识在内的所有所述标识对应记录/对应记录中感兴趣字段的值传输至所述客户端。The above is that when the number of matching records retrieved by the client in the query base is 1, it selects 1 from n and inadvertently transmits all the corresponding records/corresponding records of the identification including the matching identification in the database The value of the field of interest is transmitted to the client.
上述实施例中,服务端并不知道客户端查询的是哪个或哪些ID,而是将数据库中的所有记录均加密返回至客户端,保护了客户端的隐私。但是,S233中,服务端用n个私钥分别对接收到加密的r进行解密,这样进行大量的非对称解密计算,需要消耗大量的CPU和内存资源。并且,S234中传输n个对称加密后的结果也将占用大量带宽。尤其是当n的数量比较大时,服务端的计算量较大,带宽占用也较大。In the above embodiment, the server does not know which ID or IDs the client is querying, but encrypts all the records in the database and returns them to the client, which protects the privacy of the client. However, in S233, the server uses n private keys to decrypt the received encrypted r respectively, so that a large number of asymmetric decryption calculations need to consume a large amount of CPU and memory resources. Moreover, the transmission of n symmetrically encrypted results in S234 will also occupy a large amount of bandwidth. Especially when the number of n is relatively large, the calculation amount of the server is relatively large, and the bandwidth occupation is also large.
此外,可能匹配的结果大于1,例如为k条(k>1),则可以通过n选k不经意传输来实现。关于n选k,一种实现方案是将n个记录中的每k个组成一个集合,每个集合对应一个公私钥对,这样总计会有(C表示组合公式,n里任选k个构成的组合的数量)。接下来,采用选1的不经意传输方式将所述数据库中包含所述匹配标识在内的所有所述标识对应记录/对应记录中感兴趣字段的值传输至所述客户端。选1的不经意传输方式,实现过程类似于上述n选1不经意传输的实现过程。即所述服务端用个不同密钥分别加密所述数据表中的每k个记录/对应记录中感兴趣字段的值得到个加密结果,并发送该个加密结果至客户端;所述客户端采用匹配标识对应的密钥解密所述服务端发送的个加密结果中匹配标识对应的1个加密结果。具体实现类似上述S231-S235的过程,这里不再赘述。In addition, it is possible that the matching result is greater than 1, for example, k items (k>1), which can be realized by selecting n from k and inadvertently transmitting. As for choosing k from n, one implementation scheme is to form a set of each k of n records, and each set corresponds to a public-private key pair, so that there will be a total of (C represents the combination formula, the number of combinations composed of k optional in n). Next, use In the inadvertent transmission mode of selecting 1, all records corresponding to the identifier/values of fields of interest in the corresponding records including the matching identifier in the database are transmitted to the client. The realization process of the oblivious transmission mode of choosing 1 is similar to the realization process of the above-mentioned n choosing 1 oblivious transmission. That is, the server uses A different key encrypts the value of the field of interest in each k record/corresponding record in the data table respectively to obtain encrypted results, and send the encrypted results to the client; the client uses the key corresponding to the matching identifier to decrypt the message sent by the server One encrypted result corresponding to the matching identifier among the encrypted results. The specific implementation of the process similar to the above S231-S235 will not be repeated here.
需要说明的是,上述的r也可以是非对称密钥中的公钥,这样,客户端接收到由r加密的结果后,可以采用自身的私钥对其解密得到结果。即S234和S235中,服务端将id_0这条记录采用r0进行非对称加密,将id_1这条记录采用r1进行非对称加密,...,将id_n-1这条记录采用r(n-1)进行非对称加密,并将这n个加密结果发送至客户端,客户端接收到这些加密的结果后,可以采用自身私钥对其中期望获得的ID对应的加密结果进行非对称解密得到结果。下面也类似,不再重复。It should be noted that the above r can also be the public key in the asymmetric key, so that after receiving the result encrypted by r, the client can use its own private key to decrypt it to obtain the result. That is, in S234 and S235, the server asymmetrically encrypts the id_0 record using r0, uses r1 to asymmetrically encrypt the id_1 record, ..., uses r(n-1) for the id_n-1 record Perform asymmetric encryption and send the n encrypted results to the client. After receiving the encrypted results, the client can use its own private key to asymmetrically decrypt the encrypted results corresponding to the desired ID to obtain the results. The following is also similar and will not be repeated.
基于此,本说明书给出以下增加了构造混淆集的一种实施方式:Based on this, this specification provides the following implementation mode with the addition of constructing confusion sets:
S310:客户端发送经自身加密的敏感字段至服务端,并通过与服务端的交互得到由服务端加密的同一敏感字段。S310: The client sends the encrypted sensitive field to the server, and obtains the same sensitive field encrypted by the server through interaction with the server.
例如,客户端的检索条件为Age字段值为25,而25为敏感字段,即不希望让对端知道。为了避免让服务端知道客户端检索条件是Age字段的值25,客户端可以将该25加密。例如,采用 RSA/ECC私钥加密,客户端采用的加密算法与服务端生成查询基采用的加密算法相同。For example, the retrieval condition of the client is that the value of the Age field is 25, and 25 is a sensitive field, that is, the peer end is not expected to know. In order to prevent the server from knowing that the retrieval condition of the client is the value 25 of the Age field, the client can encrypt the value 25. For example, RSA/ECC private key encryption is used, and the encryption algorithm used by the client is the same as that used by the server to generate the query base.
具体的,采用RSA私钥加密的情况下,客户端自身生成秘密β并妥善保存。进而,客户端可以采用自身私钥β对25加密。具体的,可以是对25或对25的hash值加密。这里以对25的hash 加密为例加以说明,对25直接加密的情况类似,客户端与服务端采用相同的hash算法。例如,客户端采用与服务端相同的大质数q作为模数。客户端可以将对25采用β对25的hash值进行RSA 加密,得到(H(25))β。则客户端发送至服务端的敏感字段可以为(H(25))β,其中,(H(25))β表示敏感字段的值25的密文。Specifically, in the case of RSA private key encryption, the client itself generates the secret β and keeps it properly. Furthermore, the client can encrypt 25 with its own private key β. Specifically, it may be to encrypt 25 or a hash value of 25. Here we take the hash encryption of 25 as an example to illustrate. The situation of direct encryption of 25 is similar, and the client and server use the same hash algorithm. For example, the client uses the same large prime number q as the modulus as the server. The client can perform RSA encryption on the hash value of 25 using β to obtain (H(25)) β . Then the sensitive field sent by the client to the server may be (H(25)) β , where (H(25)) β represents the ciphertext of the value 25 of the sensitive field.
采用ECC私钥加密的情况下,客户端采用与服务端相同的椭圆曲线,即具有相同的椭圆曲线参数和生成元。客户端自身生成秘密β并妥善保存。进而,客户端可以采用自身私钥β对25 加密。具体的,可以是对25的hash值加密,客户端与服务端采用相同的hash算法。例如,客户端可以采用β对25的hash值进行ECC加密,得到β·H(25)。则客户端发送至服务端的敏感字段可以为β·H(25),其中,β·H(25)表示敏感字段的值25的密文。In the case of ECC private key encryption, the client uses the same elliptic curve as the server, that is, has the same elliptic curve parameters and generators. The client itself generates the secret β and keeps it securely. Furthermore, the client can encrypt 25 with its own private key β. Specifically, the hash value of 25 may be encrypted, and the client and server use the same hash algorithm. For example, the client can use β to perform ECC encryption on the hash value of 25 to obtain β·H(25). Then the sensitive field sent by the client to the server may be β·H(25), where β·H(25) represents the ciphertext of the value 25 of the sensitive field.
所述客户端通过与服务端的交互得到由服务端加密的同一敏感字段,可以包括服务端采用自身密钥对由客户端加密的敏感字段再次加密后发送至客户端,客户端采用自身密钥对两次加密后的敏感字段解密得到由服务端加密的敏感字段。该内容的核心是需要找到一个满足连续两次加密操作(两方先后加密)可以交换顺序进行解密的加密算法。根据ECC的密码学性质,双方约定采用相同的椭圆曲线,即具有相同的椭圆曲线参数和生成元,各自持有私钥α和β,加密操作为用α(或β)进行标量乘法运算,不论先用α加密后用β加密还是先用β加密后用α加密,都可以用相同或不同的顺序解密,即可以对加密结果用不同的顺序解密。类似的,根据RSA的密码学性质加密,双方约定采用一个相同的大质数q和原根g,各自持有私钥α和β,加密操作为用α(或β)求幂并用q取模,不论先用α加密后用β加密还是先用β加密后用α加密,都可以用相同或不同的顺序解密,即可以对加密结果用不同的顺序解密。整体来说,这里客户端和服务端对同一目标执行的加/解密采用可交换顺序的加/解密算法。The client obtains the same sensitive field encrypted by the server through interaction with the server, which may include that the server uses its own key to re-encrypt the sensitive field encrypted by the client and sends it to the client, and the client uses its own key to pair The sensitive fields encrypted twice are decrypted to obtain the sensitive fields encrypted by the server. The core of this content is to find an encryption algorithm that satisfies two consecutive encryption operations (two parties encrypt successively) and can exchange order for decryption. According to the cryptographic nature of ECC, the two parties agree to use the same elliptic curve, that is, have the same elliptic curve parameters and generators, each hold the private key α and β, and the encryption operation is scalar multiplication with α (or β), regardless of Whether it is first encrypted with α and then encrypted with β or encrypted with β first and then encrypted with α, it can be decrypted in the same or different order, that is, the encrypted results can be decrypted in different orders. Similarly, according to the cryptographic nature of RSA encryption, the two parties agree to use the same large prime number q and original root g, each holding private keys α and β, and the encryption operation is to use α (or β) to exponentiate and use q to take the modulus, Regardless of first encrypting with α and then encrypting with β or first encrypting with β and then encrypting with α, they can be decrypted in the same or different order, that is, the encrypted results can be decrypted in different orders. On the whole, the encryption/decryption performed by the client and the server on the same target adopts the encryption/decryption algorithm with interchangeable order.
具体的,可以是服务端收到客户端发送的经客户端自身加密的敏感字段后,服务端对加密的敏感字段再次用服务端自身密钥加密后返回至客户端。进而,客户端采用自身密钥对两次加密后的敏感字段解密得到由服务端加密的敏感字段。Specifically, after the server receives the encrypted sensitive field sent by the client, the server encrypts the encrypted sensitive field again with the server's own key and returns it to the client. Furthermore, the client uses its own key to decrypt the twice-encrypted sensitive field to obtain the sensitive field encrypted by the server.
例如,情况1:服务端可以接收到客户端发送的(H(25))β。For example, Case 1: The server can receive (H(25)) β sent by the client.
服务端可以对加密后的敏感字段(亦即隐私字段)再次加密,并将再次加密后的敏感字段返回至客户端。具体的,服务端可以对隐私字段(H(25))β采用自身的RSA私钥α进行再次加密,得到((H(25))β)α。The server may re-encrypt the encrypted sensitive field (ie, the private field), and return the re-encrypted sensitive field to the client. Specifically, the server can re-encrypt the privacy field (H(25)) β with its own RSA private key α to obtain ((H(25)) β ) α .
例如,情况2:服务端可以接收到客户端发送的β·H(25)。For example, case 2: the server can receive the β·H(25) sent by the client.
服务端可以对隐私字段再次加密,并将再次加密后的隐私字段返回至客户端。具体的,服务端可以对隐私字段β·H(25)采用自身的ECC私钥α进行再次加密,得到α·β·H(25)。The server can re-encrypt the privacy field, and return the re-encrypted privacy field to the client. Specifically, the server can re-encrypt the privacy field β·H(25) with its own ECC private key α to obtain α·β·H(25).
服务端采用自身密钥对由客户端加密的敏感字段(即隐私字段)再次加密后发送至客户端后,客户端可以采用自身密钥对两次加密后的隐私字段解密得到由服务端加密的敏感字段。After the server uses its own key to re-encrypt the sensitive field (that is, the privacy field) encrypted by the client and sends it to the client, the client can use its own key to decrypt the twice-encrypted privacy field to obtain the encrypted data from the server. Sensitive fields.
例如,对应上面情况1,客户端接收到服务端发送的((H(25))β)α,其中的幂次方运算存在性质如下:((H(25))β)α=(H(25))βα=(H(25))αβ=((H(25))α)β。进而,客户端可以采用自身私钥β的逆元对两次加密后的敏感字段解密,如下: 这样,客户端得到由服务端加密的同一敏感字段,即(H(25))α。For example, corresponding to the above case 1, the client receives ((H(25)) β ) α sent by the server, and the existence property of the power operation is as follows: ((H(25)) β ) α =(H( 25)) βα = (H(25)) αβ = ((H(25)) α ) β . Furthermore, the client can use the inverse of its own private key β Decrypt the sensitive fields encrypted twice, as follows: In this way, the client gets the same sensitive field encrypted by the server, namely (H(25)) α .
对应上面情况2,客户端接收到服务端发送的α·β·H(25),其中的标量乘法运算存在性质如下:α·β·H(25)=β·α·H(25)。进而,客户端可以采用自身私钥β的逆元β-1对两次加密后的敏感字段解密,如下:β-1·α·β·H(25)=β-1·β·α·H(25)=α·H(25)。这样,客户端同样得到由服务端加密的同一敏感字段,即α·H(25)。Corresponding to the above case 2, the client receives the α·β·H(25) sent by the server, and the existence property of the scalar multiplication operation is as follows: α·β·H(25)=β·α·H(25). Furthermore, the client can use the inverse element β -1 of its own private key β to decrypt the sensitive fields encrypted twice, as follows: β -1 ·α·β·H(25)=β -1 ·β·α·H (25)=α·H(25). In this way, the client also gets the same sensitive field encrypted by the server, namely α·H(25).
需要说明的是,RSA中,根据欧拉定理,pk·sk=1mod(p-1)·(q-1),其中p和q两个大质数,所以pk和sk互为逆元。类似的,ECC中,pk=sk·G,G为ECC选定曲线上的一个生成元,所以pk和sk也是互为逆元。It should be noted that in RSA, according to Euler's theorem, pk·sk=1mod(p-1)·(q-1), where p and q are two large prime numbers, so pk and sk are inverses of each other. Similarly, in ECC, pk=sk·G, G is a generating element on the selected curve of ECC, so pk and sk are also mutually inverse elements.
S320:客户端在查询基中根据所述由服务端加密的敏感字段检索,得到匹配记录的标识。S320: The client searches in the query base according to the sensitive field encrypted by the server, and obtains an identifier of a matching record.
S310执行后,客户端可以得到由服务端加密的同一敏感字段。After S310 is executed, the client can obtain the same sensitive field encrypted by the server.
客户端可以基于该由服务端加密的敏感字段在查询基中查询。例如,客户端解密后得到由服务端加密的隐私字段α·H(25)或(H(25))α,从而客户端基于该隐私字段在查询基中查询,例如分别在表2或表3中查询,可以得到Age中包含该隐私字段的记录为ID=d_1的这条记录,ID 为这条记录的标识。这样,客户基于该由服务端加密的隐私字段在查询基中查询,匹配到记录后可以定位得到匹配记录的标识。Clients can query the query base based on this sensitive field encrypted by the server. For example, the client decrypts and obtains the privacy field α·H(25) or (H(25)) α encrypted by the server, so that the client queries in the query base based on the privacy field, for example, in Table 2 or Table 3 Query in , it can be obtained that the record containing the privacy field in Age is the record with ID=d_1, and ID is the identifier of this record. In this way, the client queries the query base based on the private field encrypted by the server, and after matching records, the client can locate and obtain the identification of the matching records.
S330:服务端采用不经意传输方式返回所述数据库中包含所述匹配标识在内的预定大小标识集合对应记录中的感兴趣字段的值至客户端。S330: The server returns, to the client, the value of the field of interest in the record corresponding to the identifier set of a predetermined size including the matching identifier in the database by means of inadvertent transmission.
S310中客户端发送的经过客户端加密的敏感字段使得服务端也无法获知客户端查找的敏感字段将命中哪一条记录,而只有客户端自己知道。这样,保护了客户端的隐私。但是,最终仍需要完成检索,这就需要服务端将客户端想查询的记录返回至客户端。The encrypted sensitive field sent by the client in S310 makes it impossible for the server to know which record the sensitive field searched by the client will hit, and only the client knows. In this way, the privacy of the client is protected. However, the retrieval still needs to be completed in the end, which requires the server to return the records that the client wants to query to the client.
上述图2对应的实施例给出了n选1不经意传输的实现方式,这里,可以采用m选1的不经意传输,其中m<n。S320中,客户端还可以不将匹配得到的ID单独返回至服务端,而是将匹配得到的ID与其它一些伪造的ID混淆组合在一起构造成混淆集,将混淆集发送至服务端,这样服务端无法准确获知客户端想要查找的是混淆集中的哪一条记录,并且需要保证客户端只能获得其中要查找的那一条记录,而无法获得其它记录。S320中,客户端发送的混淆集可以连同检索语句一并发送,例如select Name where ID=混淆集。或者,混淆集也可以在下面的S332 中发送,这里并不限定。The above-mentioned embodiment corresponding to FIG. 2 provides an implementation manner of choosing 1 from n for oblivious transmission. Here, the oblivious transmission of selecting 1 from m may be used, where m<n. In S320, the client can also not return the matched ID to the server separately, but mix the matched ID with other forged IDs to form a confusion set, and send the confusion set to the server, thus The server cannot know exactly which record in the confusion set the client wants to find, and it needs to ensure that the client can only get the record it wants to find, but not other records. In S320, the confusion set sent by the client may be sent together with a search statement, for example, select Name where ID=confusion set. Alternatively, the confusion set can also be sent in the following S332, which is not limited here.
结合本说明书上述实施例,假设服务端的数据表中具有总计n条记录,这样,客户端的查询基中相应的也具有n条加密的记录。为了方便,数据记录的ID按照顺序标识为id_0、id_1、 id_2、...id_n-1。一个简单的实施流程如下:In combination with the above-mentioned embodiments of this specification, it is assumed that there are a total of n records in the data table of the server, so that the query base of the client also has n encrypted records correspondingly. For convenience, the IDs of the data records are identified as id_0, id_1, id_2, . . . id_n-1 in sequence. A simple implementation process is as follows:
S331:服务端预先生成n对不同的公私钥对并公开公钥。S331: The server pre-generates n pairs of different public-private key pairs and discloses the public keys.
服务端生成n对不同的公私钥对(pk-sk;pk是publick key,表示公钥;sk是secretkey,表示私钥),例如分别是pk0-sk0,pk1-sk1,pk2-sk2,...,pkn-1-skn-1,并公开这 n个公钥,即公开pk0,pk1,pk2,...,pkn-1。服务端公开这n个有序的公钥后,客户端可以获得这n个公钥。The server generates n pairs of different public-private key pairs (pk-sk; pk is a public key, indicating a public key; sk is a secretkey, indicating a private key), for example, pk 0 -sk 0 , pk 1 -sk 1 , pk 2 -sk 2 , ..., pk n-1 -sk n-1 , and publicize these n public keys, that is, publicize pk 0 , pk 1 , pk 2 , ..., pk n-1 . After the server discloses the n ordered public keys, the client can obtain the n public keys.
S332:客户端生成包含期望获得ID在内的m大小的混淆集,并生成随机数r,并用期望获得的ID对应的公钥对r进行加密后与混淆集一并发送至服务端。S332: The client generates a confusion set of size m including the desired ID, generates a random number r, encrypts r with the public key corresponding to the desired ID, and sends it to the server together with the confusion set.
这里假设客户端希望获得id_1的那条记录,同时不希望服务端知道客户端想要获得的记录是id_1的那条,便生成m大小的混淆集,m=4时这个混淆集例如为:{id_1,id_2,id_3,id_4}。Here, it is assumed that the client wants to obtain the record of id_1, and at the same time does not want the server to know that the record that the client wants to obtain is the record of id_1, so it generates a confusion set of m size. When m=4, the confusion set is, for example: { id_1, id_2, id_3, id_4}.
这4个ID与公钥对例如存在以下对应关系:These 4 IDs have the following corresponding relationship with the public key pair, for example:
pk1,id_1pk 1 , id_1
pk2,id_2 pk2 , id_2
pk3,id_3pk 3 , id_3
pk4,id_4pk 4 , id_4
客户端可以采用pk1对r进行加密后与混淆集一并发送至服务端。例如:The client can use pk 1 to encrypt r and send it to the server together with the confusion set. For example:
此外,客户端可以将混淆集连同检索语句一并发送至服务端。这样,客户端可以采用pk1对 r进行加密后与包含混淆集的检索语句一并发送,例如:In addition, the client can send the confusion set together with the retrieval statement to the server. In this way, the client can use pk 1 to encrypt r and send it together with the retrieval statement containing the confusion set, for example:
select Name where ID={id_1,id_2,id_3,id_4}|Enc(r,pk1)select Name where ID={id_1,id_2,id_3,id_4}|Enc(r,pk 1 )
其中,“|”用于分割前面的检索语句和后面的加密后的随机数,下同。Among them, "|" is used to separate the previous search statement from the encrypted random number, the same below.
S333:服务端接收到混淆集和加密的r后,用对应的m个私钥分别对加密的r进行解密。S333: After receiving the confusion set and the encrypted r, the server uses the corresponding m private keys to decrypt the encrypted r respectively.
服务端分别用sk1,sk2,sk3,sk4,分别解密经过pk1加密的随机数r。例如,服务端用sk1解密得到r1,用sk2解密得到r2,用sk3解密得到r3,用sk4解密得到r4。The server uses sk 1 , sk 2 , sk 3 , and sk 4 to decrypt the random number r encrypted by pk 1 respectively. For example, the server decrypts with sk 1 to obtain r1, decrypts with sk 2 to obtain r2, decrypts with sk 3 to obtain r3, and decrypts with sk 4 to obtain r4.
显然,只有r1是和r相等的,因为只有用sk1进行解密的才是用对应pk1加密的;而用不对应pk1的sk2、sk3、sk4解密得到的结果r2、r3、r4都不会与r相同。通过解密,服务端只是得到形式相同的解密结果,并不知道真正的r是什么,也不知道客户端是用哪个公钥进行加密的。换句话说,服务端不知道客户端加密r时用的哪个公钥,因此服务端也不知道解密得到的4个结果r1、r2、r3、r4中的哪个才是真正的r。 Obviously , only r1 is equal to r, because only those decrypted with sk 1 are encrypted with the corresponding pk 1 ; and the results r2, r3 , Neither r4 will be the same as r. Through decryption, the server only gets the decryption result in the same form, and does not know what the real r is, nor does it know which public key the client uses for encryption. In other words, the server does not know which public key the client uses to encrypt r, so the server does not know which of the four decrypted results r1, r2, r3, and r4 is the real r.
此外,服务端接收到混淆集{id_1,id_2,id_3,id_4}后,可以从混淆集中得知客户端想要获取的数据是混淆集中4个ID中的1个,但不确定是其中哪一个,从而保护了客户端隐私。In addition, after the server receives the confusion set {id_1, id_2, id_3, id_4}, it can know from the confusion set that the data the client wants to obtain is one of the 4 IDs in the confusion set, but it is not sure which one it is , thereby protecting the privacy of the client.
S334:服务端将混淆集中指定的记录采用对应序号的解密结果进行对称加密,将对称加密后的结果发送至客户端。S334: The server performs symmetric encryption on the records specified in the confusion set using the decryption result of the corresponding serial number, and sends the symmetric encrypted result to the client.
例如,服务端将id_1这条记录采用r1进行对称加密,将id_2这条记录采用r2进行对称加密,将id_3这条记录采用r3进行对称加密,将id_4这条记录采用r4进行对称加密,并将这4 个对称加密结果发送至客户端。For example, the server encrypts the record id_1 using r1 for symmetric encryption, the record id_2 uses r2 for symmetric encryption, the record id_3 uses r3 for symmetric encryption, the record id_4 uses r4 for symmetric encryption, and These 4 symmetric encrypted results are sent to the client.
S335:客户端采用所述随机数r对所述对称加密结果中期望获得的ID对应的加密结果进行对称解密,得到检索结果。S335: The client uses the random number r to perform symmetric decryption on the encrypted result corresponding to the desired ID in the symmetric encrypted result to obtain a search result.
客户端采用所述随机数r对所述对称加密结果中期望获得的ID对应的加密结果进行对称解密。具体的,例如上述S332中客户端期望获得的是id_1对应的那条记录/记录中的感兴趣字段的值,则客户端采用对应的公钥pk1对所述随机数r进行加密;S333中,服务端用对应的私钥sk1对解密结果进行解密,得到的r1=r,而用不对应pk1的sk2、sk3、sk4解密得到的结果r2、 r3、r4都不会与r相同;S334中,服务端用r1、r2、r3、r4分别对对应的id_1、id_2、id_3、 id_4这些记录/记录中的感兴趣字段的值进行对称加密,并将这个对称加密结果发送至客户端; S335中,客户端采用所述随机数r对所述4个对称加密结果进行对称解密。其中,4个对称解密结果中,只有id_1的加密结果是用r对称加密的,因此这里只有对id_1的加密结果采用r进行解密才能得到正确的值。从而,客户端可以获得正确的检索结果,即获得对应记录/对应记录中感兴趣字段的值。The client uses the random number r to symmetrically decrypt the encrypted result corresponding to the desired ID in the symmetric encrypted result. Specifically, for example, what the client expects to obtain in S332 above is the value of the field of interest in the record/record corresponding to id_1, then the client uses the corresponding public key pk 1 to encrypt the random number r; in S333 , the server uses the corresponding private key sk 1 to decrypt the decryption result, and the obtained r1=r, and the results r2, r3, and r4 obtained by decrypting with sk 2 , sk 3 , and sk 4 that do not correspond to pk 1 will not be consistent with r is the same; in S334, the server uses r1, r2, r3, and r4 to symmetrically encrypt the values of the fields of interest in the corresponding records/records of id_1, id_2, id_3, and id_4, and send the symmetric encryption result to Client; In S335, the client uses the random number r to symmetrically decrypt the four symmetric encryption results. Among them, among the four symmetric decryption results, only the encrypted result of id_1 is symmetrically encrypted with r, so here only the encrypted result of id_1 is decrypted with r to get the correct value. Therefore, the client can obtain a correct retrieval result, that is, obtain the corresponding record/the value of the field of interest in the corresponding record.
当然,为了减少计算量,客户端可以仅采用所述随机数r对所述4个对称加密结果中期望获得的ID对应的加密结果进行对称解密,即客户端仅采用r对id_1的加密结果进行解密,从而获得id_1的对应记录/对应记录中感兴趣字段的值,而无须采用r对id_2、id_3、id_4这些对称加密结果进行对称解密,因为客户端可以知道这些加密结果并非采用r进行的对称加密,即使采用r进行对称解密也无法解出正确结果。Of course, in order to reduce the amount of calculation, the client can only use the random number r to perform symmetric decryption on the encryption result corresponding to the expected ID among the four symmetric encryption results, that is, the client only uses r to decrypt the encryption result of id_1 Decrypt to obtain the corresponding record of id_1/the value of the field of interest in the corresponding record, without using r to perform symmetric decryption on the symmetric encryption results of id_2, id_3, id_4, because the client can know that these encryption results are not symmetric using r Encryption, even if r is used for symmetric decryption, the correct result cannot be solved.
上述S331~S335仅是示例性的一种实现方式。在另一种实现方式中,可以将混淆集的构建和传输与OT协议的执行进行解耦,用OT协议来传输密钥。具体的,一方面,客户端可以将m 大小的混淆集发送至服务端,当然客户端知道m大小的混淆集中的第几个是真正想要获得的结果的标识,另一方面,服务端可以生成m个对称密钥,通过m选1的OT,客户端可以获得其中指定的一个对称密钥,即客户端获得真正想要获得结果的那个标识对应的对称密钥。这样,服务端可以将客户端混淆集中m个标识对应的记录采用对应的对称密钥加密后发送至客户端,从而客户端对其中真正想要获得的结果采用正确的对称密钥解密,从而得到结果。其中,服务端可以预先生成m个对称密钥,这样可以在进行OT交互之前批量完成密钥准备的工作,而不会占用OT协议执行的时间。The foregoing S331 to S335 are only exemplary implementation manners. In another implementation, the construction and transmission of the confusion set can be decoupled from the execution of the OT protocol, and the key can be transmitted using the OT protocol. Specifically, on the one hand, the client can send the m-size confusion set to the server. Of course, the client knows which number in the m-size confusion set is the identity of the result it really wants to obtain. On the other hand, the server can Generate m symmetric keys, and through the OT of selecting 1 from m, the client can obtain one of the specified symmetric keys, that is, the client obtains the symmetric key corresponding to the identity that the client really wants to obtain. In this way, the server can encrypt the records corresponding to the m identifiers in the client confusion set with the corresponding symmetric keys and send them to the client, so that the client can decrypt the results that the client really wants to obtain with the correct symmetric key, thus obtaining result. Among them, the server can pre-generate m symmetric keys, so that the key preparation can be completed in batches before OT interaction without taking up the execution time of the OT protocol.
上面是所述客户端在查询基中检索得到匹配记录的数量为1时,通过m选1不经意传输,服务端将所述混淆集中指明的m个标识对应记录/对应记录中感兴趣字段的值传输至所述客户端。此外,可能匹配的结果大于1,例如为k条(k>1),则可以通过m选k不经意传输来实现。m选k 不经意传输的核心在于,客户端构造m大小的混淆集并发送至服务端,1<m<n,该m个混淆集中的1个包含所述匹配的k个记录的标识,例如m=4,k=2,匹配标识为id_1和id_3,则构造的混淆集例如为{{id_1,id_3},{id_2和id_4},{id_3和id_4},{id_2}},显然其中第一个为匹配标识构成的子集合;进而,服务端生成m个对称密钥,通过m选1的OT协议使客户端获得其指定的那个对称密钥;服务端用m个不同对称密钥分别加密所述混淆集中的m个子集合得到m个加密结果子集合,并发送该m个加密结果子集合至客户端;客户端采用获得的对称密钥解密k个匹配标识构成的那个子集合,从而获得正确的解密结果。具体实现与上述将混淆集的构建和传输与OT协议的执行进行解耦,用OT协议来传输密钥的过程类似,不再展开。The above is that when the number of matching records retrieved by the client in the query base is 1, the server selects 1 from m and inadvertently transmits, and the server identifies the corresponding records/the values of the fields of interest in the corresponding records specified in the confusion set transmitted to the client. In addition, if the possible matching results are greater than 1, for example k (k>1), it can be realized by selecting k from m and inadvertently transmitting. The core of m choosing k inadvertent transmission is that the client constructs m-sized confusion sets and sends them to the server, 1<m<n, one of the m confusion sets contains the identifiers of the matching k records, for example m = 4, k = 2, the matching identifiers are id_1 and id_3, then the constructed confusion set is, for example, {{id_1, id_3}, {id_2 and id_4}, {id_3 and id_4}, {id_2}}, obviously the first one In order to match the sub-set formed by the identity; furthermore, the server generates m symmetric keys, and the client obtains the symmetric key specified by it through the OT protocol where m selects 1; the server uses m different symmetric keys to encrypt all Obtain m subsets of encrypted results from the m subsets in the above obfuscation set, and send the m subsets of encrypted results to the client; the client uses the obtained symmetric key to decrypt the subset composed of k matching identities to obtain the correct The decryption result. The specific implementation is similar to the above-mentioned process of decoupling the construction and transmission of the confusion set from the execution of the OT protocol, and using the OT protocol to transmit the key, and will not be expanded.
上述S310~S330的例子提供了构造混淆集的方案,以保护客户端隐私。在有些情况下,混淆集可能构造的并不合理,而仍然容易使服务器猜测到客户端真正想查询的ID。The above examples of S310-S330 provide a solution for constructing a confusion set to protect the privacy of the client. In some cases, the confusion set may be constructed unreasonably, but it is still easy for the server to guess the ID that the client really wants to query.
例如上述表1中的例子,可以注意到id_4与id_7这两行中Age和Native_place是分别相同的,也就是说,无法通过Age和Native_place区分这两行。上述S332中,客户端例如构造的混淆集为{id_1,id_4},构造的检索语句为:For example, in the example in Table 1 above, it can be noticed that the Age and Native_place in the two rows of id_4 and id_7 are respectively the same, that is, the two rows cannot be distinguished by Age and Native_place. In the above S332, for example, the confusion set constructed by the client is {id_1,id_4}, and the constructed retrieval statement is:
select Name where ID={id_1,id_4}select Name where ID={id_1,id_4}
服务端收到该包含混淆集的检索语句后,可以得知该检索语句中的感兴趣字段是Name,而不是Age和Native_place,也就说明之前客户端在S310和S320中是敏感字段应当是Age和/ 或Native_place。而混淆集中的元素id_4,从Age和/或Native_place都无法与id_7这一行相区分。也就是说,混淆集中的元素如果是通过敏感字段Age和/或Native_place来定位得到的,则仅有id_4而没有id_7是不合理的。这个检索语句中,混淆集中如果有id_4,则也应当有id_7。这样,服务端可以推测出现有的混淆集{id_1,id_4}中,id_4是伪造的元素,而id_1才是真正要查询的ID,这样就一定程度上造成了客户端的隐私泄露。After the server receives the search statement containing the confusion set, it can know that the field of interest in the search statement is Name, not Age and Native_place, which means that the sensitive field of the client in S310 and S320 should be Age and/or Native_place. The element id_4 in the confusion set cannot be distinguished from the row id_7 from Age and/or Native_place. That is to say, if the elements in the confusion set are located through the sensitive field Age and/or Native_place, it is unreasonable to only have id_4 but not id_7. In this search statement, if there is id_4 in the confusion set, there should also be id_7. In this way, the server can speculate that in the existing confusion set {id_1, id_4}, id_4 is a forged element, and id_1 is the real ID to be queried, thus causing the privacy of the client to be leaked to a certain extent.
上述内容是混淆集构造不合理导致客户端隐私泄露的一个示例。为了避免混淆集构造不合理,本说明书还提供一种构造混淆集的实施例。The above content is an example of the leakage of client privacy caused by the unreasonable construction of the confusion set. In order to avoid unreasonable construction of the confusion set, this specification also provides an embodiment of constructing the confusion set.
则构造混淆集的方法如图3所示可以包括:Then the method for constructing the confusion set may include as shown in Figure 3:
S410:客户端发送经自身加密的敏感字段至服务端,并通过与服务端的交互得到由服务端加密的同一敏感字段。S410: The client sends the encrypted sensitive field to the server, and obtains the same sensitive field encrypted by the server through interaction with the server.
S420:客户端在查询基中根据所述由服务端加密的敏感字段检索,得到匹配记录的第一标识集合。S420: The client searches in the query base according to the sensitive field encrypted by the server, and obtains a first identification set of matching records.
S410、S420与前述S310、S320分别类似,不再赘述。假设客户端发送至服务端的敏感字段为(H(25))β或β·H(25),则客户端通过与服务端的交互接收到由服务端加密的同一敏感字段为((H(25))β)α或α·β·H(25)。客户端可以采用自身私钥β的对两次加密后的敏感字段解密,例如为(H(25))α或α·H(25)。这样,客户端在本地查询基中根据所述由服务端加密的敏感字段(H(25))α或α·H(25)检索,得到匹配记录的第一标识为id_1。后续主要以表2所示为例加以说明。S410 and S420 are similar to the aforementioned S310 and S320 respectively, and will not be repeated here. Suppose the sensitive field sent by the client to the server is (H(25)) β or β·H(25), then the client receives the same sensitive field encrypted by the server through the interaction with the server as ((H(25) ) β ) α or α·β·H(25). The client can use its private key β to decrypt the twice-encrypted sensitive field, for example, (H(25)) α or α·H(25). In this way, the client searches in the local query base according to the sensitive field (H(25)) α or α·H(25) encrypted by the server, and obtains the first identifier of the matching record as id_1. In the following, Table 2 will be used as an example for illustration.
S430:客户端选取查询基中除ID字段和感兴趣字段外的至少一个点,将所选的至少一个点作为检索条件,根据该检索条件和原有的感兴趣字段构造检索语句并在查询基上执行检索,得到匹配记录的第二标识集合。S430: The client selects at least one point in the query base except for the ID field and the field of interest, uses the selected at least one point as a retrieval condition, constructs a retrieval statement according to the retrieval condition and the original field of interest, and generates the query in the query base. Execute retrieval on the above to obtain the second identification set of matching records.
假设混淆集大小为n,即混淆集中包含n个元素。通过本说明书实施例可以生成n个元素构成的混淆集,例如每个元素为一个标识,且n个标识中包含匹配记录的标识,即前述的第一标识集合。Suppose the size of the confusion set is n, that is, the confusion set contains n elements. According to the embodiment of this specification, a confusion set composed of n elements can be generated, for example, each element is an identifier, and the n identifiers include identifiers of matching records, that is, the aforementioned first identifier set.
S430中,仍以表2或表3为例,客户端选取查询基中除ID字段和感兴趣字段外的至少一个点,例如为1个点,具体例如是选择α·H(34)这个点。查询基中的一个点可以定义为由行和列确定的一个字段值。In S430, still taking Table 2 or Table 3 as an example, the client selects at least one point in the query base except the ID field and the field of interest, for example, one point, for example, select the point α·H(34) . A point in the query base can be defined as a field value determined by rows and columns.
进而,客户端可以将所选的至少一个点作为检索条件,根据该检索条件和原有的感兴趣字段构造检索语句。原有的感兴趣即客户端原本期望查询的感兴趣字段,例如原本即希望查询Name。例如选择α·H(34)这个点,对应的字段名为Age,则构造的检索语句可以如下:Furthermore, the client may use at least one selected point as a retrieval condition, and construct a retrieval statement according to the retrieval condition and the original field of interest. The original interest refers to the field of interest that the client originally expected to query, for example, it originally wanted to query Name. For example, if the point α·H(34) is selected, and the corresponding field name is Age, then the constructed search statement can be as follows:
select Name where Age=α·H(34)select Name where Age=α·H(34)
客户端可以在查询基上执行该检索语句,得到匹配记录的第二标识为id_4和id_7,作为第二标识集合{id_4,id_7}。The client can execute the search statement on the query basis, and obtain the second identifiers of the matching records as id_4 and id_7 as the second identifier set {id_4, id_7}.
此外,还可以进一步选查询基中除ID字段和感兴趣字段外的点,将所选的点作为检索条件,根据该检索条件和原有的感兴趣字段构造检索语句并在查询基上执行检索,得到匹配记录的标识添加进第二标识集合。例如,构造的检索语句如下:In addition, you can further select points in the query base other than the ID field and the field of interest, use the selected point as a retrieval condition, construct a retrieval statement based on the retrieval condition and the original field of interest, and execute the retrieval on the query base , and the identifiers of the matching records are obtained and added to the second identifier set. For example, the search statement constructed is as follows:
select Name where Native_place=α·H(anhui)select Name where Native_place=α·H(anhui)
则匹配记录的标识为id_0和id_2。Then the identifiers of the matching records are id_0 and id_2.
将id_0和id_2添加进第二标识集合中,得到第二标识集合为:{{id_4,id_7},{id_0, id_2}}。Add id_0 and id_2 into the second identity set to obtain the second identity set: {{id_4, id_7}, {id_0, id_2}}.
S440:客户端根据第一标识集合和第二标识集合构造混淆集。S440: The client constructs a confusion set according to the first identity set and the second identity set.
客户端可以将第一标识id_1和第二标识为id_4和id_7共同作为混淆集,即混淆集为{id_1, id_4,id_7}。这样,得到n=3的混淆集。进而,执行S331~S335中的其它步骤,可以使得混淆集更合理,从而避免因混淆集构造不合理而导致客户端隐私泄露。最后,服务端可以通过3 选1不经意传输将真正感兴趣字段的值返回至客户端。The client may use the first identifier id_1 and the second identifier id_4 and id_7 together as a confusion set, that is, the confusion set is {id_1, id_4, id_7}. In this way, a confusion set of n=3 is obtained. Furthermore, executing other steps in S331-S335 can make the confusion set more reasonable, thereby avoiding privacy leakage of the client due to unreasonable structure of the confusion set. Finally, the server can return the value of the field of real interest to the client through 1-of-3 inadvertent transmission.
其中,混淆集中元素的顺序并不限定,例如可以是{id_4,id_1,id_7},{id_4,id_7, id_1},{id_7,id_1,id_4},{id_7,id_4,id_1},{id_1,id_7,id_4}之类,可以是随机顺序,只要后续在OT中可以指定真正要传输的ID即可。下面例子中也类似,不再举例说明。Among them, the order of the elements in the confusion set is not limited, for example, it can be {id_4, id_1, id_7}, {id_4, id_7, id_1}, {id_7, id_1, id_4}, {id_7, id_4, id_1}, {id_1, id_7 , id_4} and the like can be in random order, as long as the real ID to be transmitted can be specified in OT later. The same is true in the following examples, which will not be illustrated again.
除此之外,还可以是n=2,例如混淆集为{id_1,{id_4,id_7}},即混淆集中的第二个元素为集合{id_4,id_7}。最后,服务端可以通过2选1不经意传输将真正感兴趣字段的值返回至客户端。另外,还可以是{id_1,{id_4,id_7},{id_0,id_2}},这时n=3,即3选1。In addition, it can also be n=2, for example, the confusion set is {id_1, {id_4, id_7}}, that is, the second element in the confusion set is the set {id_4, id_7}. Finally, the server can return the value of the field of real interest to the client through 2-to-1 inadvertent transmission. In addition, it can also be {id_1, {id_4, id_7}, {id_0, id_2}}, at this time n=3, that is, choose 1 from 3.
此外,S410、S420中的敏感字段例如是α·H(25),则得到的第一标识集合为{id_1}。例如第二标识集合为通过构建的检索式select Name where Age=α·H(46)和selectName where Age=α·H(56)分别检索得到的结果,第二标识集合为{id_3,id_9}。则,客户端根据第一标识集合和第二标识集合构造混淆集,可以是将第一标识集合与第二标识集合中的元素平铺后构成混淆集。这样,混淆集例如是{id_1,id_3,id_9}。进而,可以采用n选1不经意传输将真正感兴趣字段的值返回至客户端,这里n=3,混淆集中匹配记录数量1,即可以通过3选1不经意传输将id_0中Name字段的值返回至客户端。此外,当第一标识集合中的元素为k时,可以采用n选k不经意传输。In addition, the sensitive field in S410 and S420 is, for example, α·H(25), then the obtained first identification set is {id_1}. For example, the second identification set is the result obtained by searching respectively through the constructed search formulas select Name where Age=α·H(46) and selectName where Age=α·H(56), and the second identification set is {id_3, id_9}. Then, the client constructs a confusion set according to the first identity set and the second identity set, which may be formed by tiling elements in the first identity set and the second identity set. Thus, the confusion set is eg {id_1, id_3, id_9}. Furthermore, the value of the field of real interest can be returned to the client by inadvertent transmission of n to 1, where n=3, the number of matching records in the confusion set is 1, that is, the value of the Name field in id_0 can be returned to the client through inadvertent transmission of 3 to 1 client. In addition, when the number of elements in the first identity set is k, k can be selected from n for inadvertent transmission.
除了将第一标识集合与第二标识集合中的元素平铺后构成混淆集外,还可以是将第一标识集合中的元素平铺,并与第二标识集合共同构成混淆集。例如上述,第一标识集合为{id_0, id_6}。例如第二标识集合仍然为{id_3,id_9},则混淆集可以为{id_0,id_6,{id_3,id_9}}。进而,可以采用n选k不经意传输将真正感兴趣字段的值返回至客户端,这里n=3,混淆集中匹配记录数量2,即可以通过3选2不经意传输将id_0、id_6中Name字段的值返回至客户端。当第一标识集合中的元素为1时,n选k即为n选1。In addition to tiling elements in the first identification set and the second identification set to form a confusion set, elements in the first identification set may also be tiled and together with the second identification set to form a confusion set. For example, as mentioned above, the first identity set is {id_0, id_6}. For example, the second identification set is still {id_3, id_9}, and the confusion set may be {id_0, id_6, {id_3, id_9}}. Furthermore, the value of the field of real interest can be returned to the client by using n selection and k inadvertent transmission, where n=3, and the number of matching records in the confusion set is 2, that is, the value of the Name field in id_0 and id_6 can be inadvertently transmitted by 3 selections and 2 Return to client. When the element in the first identification set is 1, n choose k is n choose 1.
类似的,还可以是将第二标识集合中的元素平铺,并与第一标识集合共同构成混淆集。例如上述,第一标识集合为{id_0,id_6}。例如第二标识集合为通过构建的检索式select Name where Age=α·H(46)和select Name where Age=α·H(56)分别检索得到的结果,第二标识集合为{id_3,id_9}。则混淆集可以为{{id_0,id_6},id_3,id_9}。进而,可以采用n选k不经意传输将真正感兴趣字段的值返回至客户端,这里n=3,混淆集中匹配记录数量1,这2个匹配记录的标识作为1个集合称为混淆集中的1个元素,即可以通过3选1不经意传输将{id_0, id_6}中Name字段的值返回至客户端。Similarly, elements in the second identification set may also be tiled to form a confusion set together with the first identification set. For example, as mentioned above, the first identification set is {id_0, id_6}. For example, the second identification set is the result obtained by searching respectively through the constructed retrieval formulas select Name where Age=α·H(46) and select Name where Age=α·H(56), and the second identification set is {id_3, id_9} . Then the confusion set can be {{id_0, id_6}, id_3, id_9}. Furthermore, the value of the field of real interest can be returned to the client by using n selection and k inadvertent transmission, where n=3, the number of matching records in the confusion set is 1, and the identifiers of these two matching records are regarded as a set called 1 in the confusion set element, that is, the value of the Name field in {id_0, id_6} can be returned to the client through inadvertent transmission of 1 out of 3.
但是,不能将除ID字段和感兴趣字段外所选的至少一个点对应的无法区分的至少两行的 ID拆分到不同集合中。例如,第一标识集合为{id_0,id_6},第二标识集合为{{id_4,id_7}, id_3,id_9}。其中,如前所述,除ID字段和感兴趣字段Name外,剩余的字段Age和Native_place 这两个字段无法区分id_4和id_7,因此需要将id_4和id_7放入同一个集合中,即{id_4,id_7},不能拆分。反过来说,所述混淆集中除ID字段和感兴趣字段外无法区分的行的标识放置在同一个最小集合中,这一点也符合其它实现方式的要求,从而避免混淆集构造的不合理。However, the IDs of at least two rows corresponding to indistinguishable at least one selected point other than the ID field and the field of interest cannot be split into different sets. For example, the first identity set is {id_0, id_6}, and the second identity set is {{id_4, id_7}, id_3, id_9}. Among them, as mentioned above, in addition to the ID field and the field of interest Name, the remaining fields Age and Native_place cannot distinguish between id_4 and id_7, so id_4 and id_7 need to be put into the same set, that is, {id_4, id_7}, cannot be split. Conversely, the identifiers of indistinguishable rows in the confusion set except for the ID field and the field of interest are placed in the same minimum set, which also meets the requirements of other implementations, thereby avoiding unreasonable construction of the confusion set.
此外,还可以是将第一标识集合、第二标识集合中的元素混合后构成混淆集。例如上述,第一标识集合为{id_0,id_6},例如第二标识集合仍然为{id_3,id_9},则混淆集可以为{{id_0, id_6},{id_0,id_3},id_9}。id_3和id_9是除ID字段和感兴趣字段外可以区分的,因此对应的行的标识也可以拆分,即不放置在同一个最小集合中。进而,可以采用n选k不经意传输将真正感兴趣字段的值返回至客户端,这里n=3,混淆集中匹配记录数量2,而这2个匹配记录的标识作为1个集合称为混淆集中的1个元素,即可以通过3选1不经意传输将{id_0,id_6}中 Name字段的值返回至客户端。In addition, elements in the first identification set and the second identification set may also be mixed to form a confusion set. For example, as mentioned above, the first identification set is {id_0, id_6}, for example, the second identification set is still {id_3, id_9}, then the confusion set can be {{id_0, id_6}, {id_0, id_3}, id_9}. id_3 and id_9 can be distinguished except for the ID field and the field of interest, so the identifiers of the corresponding rows can also be split, that is, they are not placed in the same minimum set. Furthermore, the value of the field of real interest can be returned to the client by using n selection and k inadvertent transmission, where n=3, the number of matching records in the confusion set is 2, and the identifiers of these two matching records are regarded as a set called the confusion set 1 element, that is, the value of the Name field in {id_0, id_6} can be returned to the client through inadvertent transmission of 1 out of 3.
上述例子中给出了一个条件字段的情形,两个及以上的条件字段也是类似。具体的,S430 中,仍以表2为例,客户端选取查询基中除ID字段和感兴趣字段外的两个点,具体例如是选择α·H(34)和α·H(shanghai)这2个点。The above example shows the case of one conditional field, and it is similar for two or more conditional fields. Specifically, in S430, still taking Table 2 as an example, the client selects two points in the query base except the ID field and the field of interest, for example, select α·H(34) and α·H(shanghai) 2 points.
进而,客户端可以将所选的2个点作为检索条件,根据该检索条件和原有的感兴趣字段构造检索语句。原有的感兴趣即客户端原本期望查询的感兴趣字段,例如原本即希望查询Name。例如选择α·H(34)和α·H(shanghai)这2个点,对应的字段名分别为Age和Native_place,则构造的检索语句可以如下:Furthermore, the client can use the selected two points as retrieval conditions, and construct a retrieval statement based on the retrieval conditions and the original fields of interest. The original interest refers to the field of interest that the client originally expected to query, for example, it originally wanted to query Name. For example, if two points α·H(34) and α·H(shanghai) are selected, and the corresponding field names are Age and Native_place respectively, then the constructed search statement can be as follows:
select Name where Age=α·H(34)or Native_place=α·H(shanghai)select Name where Age=α·H(34) or Native_place=α·H(shanghai)
客户端可以在查询基上执行该检索语句,得到匹配的第二标识为id_4、id_7、id_1、id_5,作为第二标识集合{id_4,id_7,id_1、id_5}。The client can execute the search statement on the query basis, and obtain the matched second identifiers as id_4, id_7, id_1, id_5 as the second identifier set {id_4, id_7, id_1, id_5}.
上述检索语句中,where这一检索条件中,Age=α·H(34)是SQL中的一个谓词,Native_place=α·H(shanghai)是SQL中的另一个谓词。两个谓词之间的连接词,这里是or,此外也可以是and之类。换句话说,连接词可以是随机的一种连接词。更广泛的,多于两个谓词时,相邻两个谓词之间的连接词是随机的一种连接词。In the above retrieval statement, in the retrieval condition of where, Age=α·H(34) is a predicate in SQL, and Native_place=α·H(shanghai) is another predicate in SQL. The connecting word between two predicates, here is or, and it can also be and and the like. In other words, a connective can be a random kind of connective. More broadly, when there are more than two predicates, the connectives between two adjacent predicates are random ones.
以下介绍本说明书一实施例中的一种实现构造混淆集的客户端,该客户端与服务端对同一目标执行的加/解密采用可交换顺序的加/解密算法,且:The following introduces a client that implements the construction of a confusion set in an embodiment of this specification. The encryption/decryption performed by the client and the server on the same target adopts an encryption/decryption algorithm that can be exchanged, and:
所述客户端配置有查询基,所述查询基由所述服务端将数据库加密后得到;The client is configured with a query base, and the query base is obtained after the server encrypts the database;
所述客户端发送经自身加密的敏感字段至服务端,并通过与服务端的交互得到由服务端加密的同一敏感字段;在查询基中根据所述由服务端加密的敏感字段检索,得到匹配记录的第一标识集合;选取查询基中除ID字段和感兴趣字段外的至少一个点,将所选的至少一个点作为检索条件,根据该检索条件和原有的感兴趣字段构造检索语句并在查询基上执行检索,得到匹配记录的第二标识集合;根据第一标识集合和第二标识集合构造混淆集。The client sends the sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server; in the query base, according to the search based on the sensitive field encrypted by the server, a matching record is obtained The first identification set; select at least one point in the query base except the ID field and the field of interest, use the selected at least one point as a retrieval condition, construct a retrieval statement according to the retrieval condition and the original field of interest, and in Searching is performed on the query basis to obtain a second identification set of matching records; a confusion set is constructed according to the first identification set and the second identification set.
以下介绍本说明书一实施例中的一种实现构造混淆集的客户端,包括:The following introduces a client that implements the construction of a confusion set in an embodiment of this specification, including:
处理器,processor,
存储器,存储有程序,其中在所述处理器执行所述程序时,执行上述图3中的方法。The memory stores a program, wherein when the processor executes the program, the above-mentioned method in FIG. 3 is executed.
以下介绍本说明书一实施例中的一种存储介质,用于存储程序,其中所述程序在被执行时使得客户端执行上述图3中的方法。A storage medium in an embodiment of this specification is introduced below, which is used to store a program, wherein the program causes the client to execute the method in FIG. 3 above when executed.
在20世纪90年代,对于一个技术的改进可以很明显地区分是硬件上的改进(例如,对二极管、晶体管、开关等电路结构的改进)还是软件上的改进(对于方法流程的改进)。然而,随着技术的发展,当今的很多方法流程的改进已经可以视为硬件电路结构的直接改进。设计人员几乎都通过将改进的方法流程编程到硬件电路中来得到相应的硬件电路结构。因此,不能说一个方法流程的改进就不能用硬件实体模块来实现。例如,可编程逻辑器件(Programmable Logic Device,PLD)(例如现场可编程门阵列(Field Programmable GateArray,FPGA))就是这样一种集成电路,其逻辑功能由用户对器件编程来确定。由设计人员自行编程来把一个数字系统“集成”在一片PLD上,而不需要请芯片制造厂商来设计和制作专用的集成电路芯片。而且,如今,取代手工地制作集成电路芯片,这种编程也多半改用“逻辑编译器(logic compiler)”软件来实现,它与程序开发撰写时所用的软件编译器相类似,而要编译之前的原始代码也得用特定的编程语言来撰写,此称之为硬件描述语言(Hardware Description Language,HDL),而HDL也并非仅有一种,而是有许多种,如ABEL(Advanced Boolean Expression Language)、AHDL(Altera Hardware DescriptionLanguage)、Confluence、 CUPL(Cornell University Programming Language)、HDCal、JHDL(Java Hardware Description Language)、Lava、Lola、MyHDL、PALASM、RHDL(RubyHardware Description Language) 等,目前最普遍使用的是VHDL(Very-High-SpeedIntegrated Circuit Hardware Description Language)与Verilog。本领域技术人员也应该清楚,只需要将方法流程用上述几种硬件描述语言稍作逻辑编程并编程到集成电路中,就可以很容易得到实现该逻辑方法流程的硬件电路。In the 1990s, the improvement of a technology can be clearly distinguished as an improvement in hardware (for example, improvements in circuit structures such as diodes, transistors, and switches) or improvements in software (improvement in method flow). However, with the development of technology, the improvement of many current method flows can be regarded as the direct improvement of the hardware circuit structure. Designers almost always get the corresponding hardware circuit structure by programming the improved method flow into the hardware circuit. Therefore, it cannot be said that the improvement of a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (Programmable Logic Device, PLD) (such as a Field Programmable Gate Array (Field Programmable Gate Array, FPGA)) is such an integrated circuit, and its logic function is determined by programming the device by a user. It is programmed by the designer to "integrate" a digital system on a PLD, instead of asking a chip manufacturer to design and make a dedicated integrated circuit chip. Moreover, nowadays, instead of making integrated circuit chips by hand, this kind of programming is mostly realized by "logic compiler (logic compiler)" software, which is similar to the software compiler used when writing programs. The original code of the computer must also be written in a specific programming language, which is called a hardware description language (Hardware Description Language, HDL), and there is not only one kind of HDL, but many kinds, such as ABEL (Advanced Boolean Expression Language) , AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, RHDL (Ruby Hardware Description Language), etc., currently the most commonly used is VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. It should also be clear to those skilled in the art that only a little logical programming of the method flow in the above-mentioned hardware description languages and programming into an integrated circuit can easily obtain a hardware circuit for realizing the logic method flow.
控制器可以按任何适当的方式实现,例如,控制器可以采取例如微处理器或处理器以及存储可由该(微)处理器执行的计算机可读程序代码(例如软件或固件)的计算机可读介质、逻辑门、开关、专用集成电路(Application Specific Integrated Circuit,ASIC)、可编程逻辑控制器和嵌入微控制器的形式,控制器的例子包括但不限于以下微控制器:ARC 625D、 Atmel AT91SAM、Microchip PIC18F26K20以及Silicone Labs C8051F320,存储器控制器还可以被实现为存储器的控制逻辑的一部分。本领域技术人员也知道,除了以纯计算机可读程序代码方式实现控制器以外,完全可以通过将方法步骤进行逻辑编程来使得控制器以逻辑门、开关、专用集成电路、可编程逻辑控制器和嵌入微控制器等的形式来实现相同功能。因此这种控制器可以被认为是一种硬件部件,而对其内包括的用于实现各种功能的装置也可以视为硬件部件内的结构。或者甚至,可以将用于实现各种功能的装置视为既可以是实现方法的软件模块又可以是硬件部件内的结构。The controller may be implemented in any suitable way, for example the controller may take the form of a microprocessor or processor and a computer readable medium storing computer readable program code (such as software or firmware) executable by the (micro)processor , logic gates, switches, Application Specific Integrated Circuits (ASICs), programmable logic controllers, and embedded microcontrollers, examples of controllers include but are not limited to the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicone Labs C8051F320, the memory controller can also be implemented as part of the memory's control logic. Those skilled in the art also know that, in addition to realizing the controller in a purely computer-readable program code mode, it is entirely possible to make the controller use logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded The same function can be realized in the form of a microcontroller or the like. Therefore, such a controller can be regarded as a hardware component, and the devices included in it for realizing various functions can also be regarded as structures within the hardware component. Or even, means for realizing various functions can be regarded as a structure within both a software module realizing a method and a hardware component.
上述实施例阐明的系统、装置、模块或单元,具体可以由计算机芯片或实体实现,或者由具有某种功能的产品来实现。一种典型的实现设备为服务器系统。当然,本说明书不排除随着未来计算机技术的发展,实现上述实施例功能的计算机例如可以为个人计算机、膝上型计算机、车载人机交互设备、蜂窝电话、相机电话、智能电话、个人数字助理、媒体播放器、导航设备、电子邮件设备、游戏控制台、平板计算机、可穿戴设备或者这些设备中的任何设备的组合。The systems, devices, modules, or units described in the above embodiments can be specifically implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a server system. Of course, this description does not exclude that with the development of future computer technology, the computer that realizes the functions of the above embodiments may be, for example, a personal computer, a laptop computer, a vehicle-mounted human-computer interaction device, a cellular phone, a camera phone, a smart phone, a personal digital assistant , media players, navigation devices, email devices, game consoles, tablet computers, wearable devices, or any combination of these devices.
虽然本说明书一个或多个实施例提供了如实施例或流程图所述的方法操作步骤,但基于常规或者无创造性的手段可以包括更多或者更少的操作步骤。实施例中列举的步骤顺序仅仅为众多步骤执行顺序中的一种方式,不代表唯一的执行顺序。在实际中的装置或终端产品执行时,可以按照实施例或者附图所示的方法顺序执行或者并行执行(例如并行处理器或者多线程处理的环境,甚至为分布式数据处理环境)。术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、产品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、产品或者设备所固有的要素。在没有更多限制的情况下,并不排除在包括所述要素的过程、方法、产品或者设备中还存在另外的相同或等同要素。例如若使用到第一,第二等词语用来表示名称,而并不表示任何特定的顺序。Although one or more embodiments of the present specification provide the operation steps of the method described in the embodiment or the flowchart, more or fewer operation steps may be included based on conventional or non-inventive means. The sequence of steps enumerated in the embodiments is only one of the execution sequences of many steps, and does not represent the only execution sequence. When an actual device or terminal product is executed, the methods shown in the embodiments or drawings can be executed sequentially or in parallel (such as a parallel processor or multi-thread processing environment, or even a distributed data processing environment). The term "comprising", "comprising" or any other variation thereof is intended to cover a non-exclusive inclusion such that a process, method, product, or apparatus comprising a set of elements includes not only those elements, but also other elements not expressly listed elements, or also elements inherent in such a process, method, product, or apparatus. Without further limitations, it is not excluded that there are additional identical or equivalent elements in a process, method, product or device comprising said elements. For example, if the words first, second, etc. are used, they are used to indicate names and do not indicate any specific order.
为了描述的方便,描述以上装置时以功能分为各种模块分别描述。当然,在实施本说明书一个或多个时可以把各模块的功能在同一个或多个软件和/或硬件中实现,也可以将实现同一功能的模块由多个子模块或子单元的组合实现等。以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。For the convenience of description, when describing the above devices, functions are divided into various modules and described separately. Of course, when implementing one or more of this specification, the functions of each module can be realized in the same or more software and/or hardware, and the modules that realize the same function can also be realized by a combination of multiple submodules or subunits, etc. . The device embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components can be combined or integrated. to another system, or some features may be ignored, or not implemented. In another point, the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
本说明书是参照根据本说明书实施例的方法、装置(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The description is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It should be understood that each procedure and/or block in the flowchart and/or block diagram, and a combination of procedures and/or blocks in the flowchart and/or block diagram can be realized by computer program instructions. These computer program instructions may be provided to a general purpose computer, special purpose computer, embedded processor, or processor of other programmable data processing equipment to produce a machine such that the instructions executed by the processor of the computer or other programmable data processing equipment produce a An apparatus for realizing the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instructions The device realizes the function specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, thereby The instructions provide steps for implementing the functions specified in the flow chart or blocks of the flowchart and/or the block or blocks of the block diagrams.
在一个典型的配置中,计算设备包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
内存可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RAM)和/或非易失性内存等形式,如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。Memory may include non-permanent storage in computer readable media, in the form of random access memory (RAM) and/or nonvolatile memory such as read only memory (ROM) or flash RAM. Memory is an example of computer readable media.
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁磁盘存储、石墨烯存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括暂存电脑可读媒体(transitory media),如调制的数据信号和载波。Computer-readable media, including both permanent and non-permanent, removable and non-removable media, can be implemented by any method or technology for storage of information. Information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic cassettes, magnetic tape magnetic disk storage, graphene storage or other magnetic storage devices or any other non-transmission medium that can be used to store information that can be accessed by computing devices. As defined herein, computer-readable media excludes transitory computer-readable media, such as modulated data signals and carrier waves.
本领域技术人员应明白,本说明书一个或多个实施例可提供为方法、系统或计算机程序产品。因此,本说明书一个或多个实施例可采用完全硬件实施例、完全软件实施例或结合软件和硬件方面的实施例的形式。而且,本说明书一个或多个实施例可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。Those skilled in the art should understand that one or more embodiments of this specification may be provided as a method, system or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present description may employ a computer program embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein. The form of the product.
本说明书一个或多个实施例可以在由计算机执行的计算机可执行指令的一般上下文中描述,例如程序模块。一般地,程序模块包括执行特定任务或实现特定抽象数据类型的例程、程序、对象、组件、数据结构等等。也可以在分布式计算环境中实践本本说明书一个或多个实施例,在这些分布式计算环境中,由通过通信网络而被连接的远程处理设备来执行任务。在分布式计算环境中,程序模块可以位于包括存储设备在内的本地和远程计算机存储介质中。One or more embodiments of this specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the present specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including storage devices.
本说明书中的各个实施例均采用递进的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于系统实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。在本说明书的描述中,参考术语“一个实施例”、“一些实施例”、“示例”、“具体示例”、或“一些示例”等的描述意指结合该实施例或示例描述的具体特征、结构、材料或者特点包含于本说明书的至少一个实施例或示例中。在本说明书中,对上述术语的示意性表述不必须针对的是相同的实施例或示例。而且,描述的具体特征、结构、材料或者特点可以在任一个或多个实施例或示例中以合适的方式结合。此外,在不相互矛盾的情况下,本领域的技术人员可以将本说明书中描述的不同实施例或示例以及不同实施例或示例的特征进行结合和组合。Each embodiment in this specification is described in a progressive manner, the same and similar parts of each embodiment can be referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, for the system embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for relevant parts, refer to part of the description of the method embodiment. In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific examples", or "some examples" mean that specific features described in connection with the embodiment or example , structures, materials or features are included in at least one embodiment or example of this specification. In this specification, the schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the described specific features, structures, materials or characteristics may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art can combine and combine different embodiments or examples and features of different embodiments or examples described in this specification without conflicting with each other.
以上所述仅为本说明书一个或多个实施例的实施例而已,并不用于限制本本说明书一个或多个实施例。对于本领域技术人员来说,本说明书一个或多个实施例可以有各种更改和变化。凡在本说明书的精神和原理之内所作的任何修改、等同替换、改进等,均应包含在权利要求范围之内。The above description is only an example of one or more embodiments of this specification, and is not intended to limit one or more embodiments of this specification. For those skilled in the art, various modifications and changes may occur in one or more embodiments of this description. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of this specification shall be included in the scope of the claims.
Claims (11)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211229255.6A CN115801233B (en) | 2022-10-09 | 2022-10-09 | A method and client for constructing a confusion set |
| PCT/CN2022/135252 WO2024077734A1 (en) | 2022-10-09 | 2022-11-30 | Method and client for realizing construction of confusion set |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211229255.6A CN115801233B (en) | 2022-10-09 | 2022-10-09 | A method and client for constructing a confusion set |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN115801233A true CN115801233A (en) | 2023-03-14 |
| CN115801233B CN115801233B (en) | 2025-05-27 |
Family
ID=85432668
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211229255.6A Active CN115801233B (en) | 2022-10-09 | 2022-10-09 | A method and client for constructing a confusion set |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN115801233B (en) |
| WO (1) | WO2024077734A1 (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119449493B (en) * | 2025-01-09 | 2025-07-08 | 杭州高新区(滨江)区块链与数据安全研究院 | Voice communication privacy protection method, system and medium based on voice obfuscation |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130173917A1 (en) * | 2011-12-30 | 2013-07-04 | Christopher J. Clifton | Secure search and retrieval |
| US10691754B1 (en) * | 2015-07-17 | 2020-06-23 | Hrl Laboratories, Llc | STAGS: secure, tunable, and accountable generic search in databases |
| CN112270006A (en) * | 2020-11-02 | 2021-01-26 | 重庆邮电大学 | Searchable encryption method for hiding search mode and access mode in e-commerce platform |
| CN113886887A (en) * | 2021-10-25 | 2022-01-04 | 支付宝(杭州)信息技术有限公司 | Data query method and device based on multi-party secure computing |
| CN114637746A (en) * | 2022-03-07 | 2022-06-17 | 杭州博盾习言科技有限公司 | Conditional anonymity query method, system and device based on privacy computing |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11093635B2 (en) * | 2016-05-05 | 2021-08-17 | The Johns Hopkins University | Apparatus and method for private information retrieval |
| US10289816B1 (en) * | 2018-06-08 | 2019-05-14 | Gsfm Llc | Methods, systems, and devices for an encrypted and obfuscated algorithm in a computing environment |
| US11239996B2 (en) * | 2019-12-18 | 2022-02-01 | International Business Machines Corporation | Weighted partial matching under homomorphic encryption |
| US11645409B2 (en) * | 2020-12-18 | 2023-05-09 | Seagate Technology Llc | Search and access pattern hiding verifiable searchable encryption for distributed settings with malicious servers |
| CN114065252B (en) * | 2021-11-19 | 2025-03-07 | 北京数牍科技有限公司 | A method, device and computer equipment for finding intersection of private sets with conditional retrieval |
| CN114036565B (en) * | 2021-11-19 | 2024-03-08 | 上海勃池信息技术有限公司 | Private information retrieval system and private information retrieval method |
| CN114386089B (en) * | 2021-12-07 | 2025-02-14 | 北京数牍科技有限公司 | A method for finding intersection of private sets based on multi-party conditional retrieval |
-
2022
- 2022-10-09 CN CN202211229255.6A patent/CN115801233B/en active Active
- 2022-11-30 WO PCT/CN2022/135252 patent/WO2024077734A1/en not_active Ceased
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130173917A1 (en) * | 2011-12-30 | 2013-07-04 | Christopher J. Clifton | Secure search and retrieval |
| US10691754B1 (en) * | 2015-07-17 | 2020-06-23 | Hrl Laboratories, Llc | STAGS: secure, tunable, and accountable generic search in databases |
| CN112270006A (en) * | 2020-11-02 | 2021-01-26 | 重庆邮电大学 | Searchable encryption method for hiding search mode and access mode in e-commerce platform |
| CN113886887A (en) * | 2021-10-25 | 2022-01-04 | 支付宝(杭州)信息技术有限公司 | Data query method and device based on multi-party secure computing |
| CN114637746A (en) * | 2022-03-07 | 2022-06-17 | 杭州博盾习言科技有限公司 | Conditional anonymity query method, system and device based on privacy computing |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024077734A1 (en) | 2024-04-18 |
| CN115801233B (en) | 2025-05-27 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Zeng et al. | Forward secure public key encryption with keyword search for outsourced cloud storage | |
| US12135811B2 (en) | Encrypted information retrieval | |
| Wang et al. | Inverted index based multi-keyword public-key searchable encryption with strong privacy guarantee | |
| Boneh et al. | Private database queries using somewhat homomorphic encryption | |
| Li et al. | Proxy re‐encryption with equality test for secure data sharing in Internet of Things‐based healthcare systems | |
| WO2024066008A1 (en) | Method and system for implementing retrieval of privacy information, and server and client | |
| US20240372709A1 (en) | Encrypted information retrieval | |
| WO2024066013A1 (en) | Privacy information retrieval implementation | |
| US12244693B2 (en) | Multi-key information retrieval | |
| CN105282167A (en) | Searchable certificateless public key encryption method | |
| US11223472B2 (en) | Encrypted message search method, message transmission/reception system, server, terminal and program | |
| Zhang et al. | Secure and efficient searchable public key encryption for resource constrained environment based on pairings under prime order group | |
| JP2018526666A (en) | Personal search index with enhanced confidentiality | |
| CN115801233B (en) | A method and client for constructing a confusion set | |
| Yang et al. | ABKS‐CSC: attribute‐based keyword search with constant‐size ciphertexts | |
| JP2010160235A (en) | Retrieval system, terminal device, database device, retrieval method, and program | |
| Bera et al. | Searchable attribute-based proxy re-encryption: keyword privacy, verifiable expressive search, and outsourced decryption | |
| CN115664722A (en) | Method, system, server and client for realizing private information retrieval | |
| Zhang et al. | Efficient Boolean keywords search over encrypted cloud data in public key setting | |
| He et al. | Hierarchical conditional proxy re-encryption: A new insight of fine-grained secure data sharing | |
| Yin et al. | Attribute-Based Secure Keyword Search for Cloud Computing | |
| Nita et al. | Advanced Encryption Schemes |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |