Method, device, terminal and storage medium for determining target set

Info

Publication number
CN115048367B
Authority
CN
China
Prior art keywords
character
column
combination column
data
combination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210626216.3A
Other languages
Chinese (zh)
Other versions
CN115048367A (en
Inventor
韩哲
蒋嘉琦
陈鑫
吴浩然
李亚朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Du Xiaoman Technology Beijing Co Ltd
Original Assignee
Du Xiaoman Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Du Xiaoman Technology Beijing Co Ltd
Priority to CN202210626216.3A
Publication of CN115048367A
Application granted
Publication of CN115048367B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245: Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a method, device, terminal and storage medium for determining a target set, including: receiving a first data source and a second data source; determining a first combination column and a second combination column based on the first data source and the second data source respectively; analyzing the first combination column and the second combination column in a negotiated interactive manner to determine a separator; determining a first index number set corresponding to the first combination column and a second index number set corresponding to the second combination column based on the first combination column, the second combination column and the separator, so as to obtain a target set through the first index number set and the second index number set. The present invention flexibly selects multiple columns of data for combination to obtain a combination column corresponding to the multiple columns of data. The target set can be obtained without manually converting the multiple columns of data in the combination column into a single column of data. The user can flexibly select and freely combine column data to form a combination column according to needs, and conveniently and efficiently implement the intersection operation of different combination columns.

Description

Target set determining method, device, terminal and storage medium
Technical Field
The present application relates to the field of data security technologies, and in particular, to a method, an apparatus, a terminal, and a storage medium for determining a target set.
Background
Private set intersection (PSI) means that the participating parties obtain the intersection of the data they hold without revealing any additional information. PSI is typically used before joint computation by multiple vendors to find the samples common to the parties' data without revealing information about non-common samples.
At present, private set intersection over a combination of multiple columns of data (i.e. a column combination) is generally performed indirectly: a user must clean the data of the different column combinations, convert each cleaned column combination into single-column data, and then input the single-column data into the system to perform the PSI operation.
However, performing private set intersection on multi-column data combinations in this way involves cumbersome steps and is therefore inefficient.
Disclosure of Invention
The application mainly aims to provide a method, a device, a terminal and a storage medium for determining a target set, so as to solve the problem of low efficiency in the related art.
To achieve the above object, in a first aspect, the present application provides a method for determining a target set, including:
receiving a first data source and a second data source;
determining a first combined column and a second combined column based on the first data source and the second data source, respectively, wherein the combined column is formed by combining a plurality of columns of data;
analyzing the first combined column and the second combined column by utilizing a negotiation interaction mode, and determining a separator;
determining a first index number set corresponding to the first combined column and a second index number set corresponding to the second combined column based on the first combined column, the second combined column and the separator, so as to obtain a target set through the first index number set and the second index number set.
In one possible implementation, determining the first combined column and the second combined column based on the first data source and the second data source, respectively, includes:
Generating a first data table and a second data table based on the first data source and the second data source, respectively;
Selecting a preset number of column data from the first data table and the second data table respectively to obtain a preset number of first column data and a preset number of second column data;
combining the preset number of first column data and the preset number of second column data respectively to obtain a first combined column and a second combined column.
In one possible implementation, a first data source is sent by a first client and a second data source is sent by a second client;
analyzing the first combination column and the second combination column by utilizing a negotiation interaction mode to determine separators, wherein the method comprises the following steps:
Determining a separator based on the first combined column and the second combined column in the case where the first client is a negotiation initiator;
In the case where the second client is the negotiation initiator, the separator is determined based on the first combined column and the second combined column.
In one possible implementation, in a case where the first client is a negotiation initiator, determining the separator based on the first combined column and the second combined column includes:
determining a first character difference set and a second character difference set based on the first combined column and the second combined column, respectively;
if either the first character difference set or the second character difference set is empty, acquiring a current timestamp and determining the separator based on the current timestamp, wherein the separator is obtained by sequentially performing string conversion, a hash operation and string truncation on the current timestamp;
if the first character difference set and the second character difference set are not empty, selecting any character from the first character difference set as a target character;
and if the target character exists in the second character difference set, taking the target character as a separator.
In one possible implementation, the method further includes:
If the second character difference set does not have the target character, taking the second client as a negotiation initiator, and selecting any character from the second character difference set as the target character;
If the target character exists in the first character difference set, the target character is used as a separator;
And if the target character does not exist in the first character difference set, repeating the step of selecting any character from the first character difference set as the target character.
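The alternating proposal flow described above can be sketched as follows. This is a minimal single-process simulation, not the patented implementation: the function name `negotiate_separator`, the `max_rounds` cap and the `rng` parameter are assumptions added for illustration (the text does not state a round limit), and in a real deployment each party would hold only its own difference set and exchange candidates over a network.

```python
import random


def negotiate_separator(a_except, b_except, max_rounds=1000, rng=None):
    """Simulate the negotiation between party A (initiator) and party B.

    a_except / b_except are the character difference sets of each party.
    Returns an agreed character, or None when no agreement is reached
    (the caller would then fall back to the timestamp-based separator).
    """
    rng = rng or random.Random()
    # An empty difference set means no unused character is available.
    if not a_except or not b_except:
        return None
    proposer, responder = a_except, b_except
    for _ in range(max_rounds):  # hypothetical cap, not in the source text
        # The current initiator picks any character from its own set.
        candidate = rng.choice(sorted(proposer))
        # The other party accepts it only if it is also unused on its side.
        if candidate in responder:
            return candidate
        # Otherwise the roles swap and the other party proposes.
        proposer, responder = responder, proposer
    return None
```

Any character returned is, by construction, absent from the data values of both parties, which is the property the negotiation is meant to guarantee.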
In one possible implementation, in a case where the second client is a negotiation initiator, determining the separator based on the first combined column and the second combined column includes:
if either the first character difference set or the second character difference set is empty, acquiring a current timestamp and determining the separator based on the current timestamp, wherein the separator is obtained by sequentially performing string conversion, a hash operation and string truncation on the current timestamp;
If the first character difference set and the second character difference set are not empty, selecting any character from the second character difference set as a target character;
and if the target character exists in the first character difference set, taking the target character as a separator.
In one possible implementation, the method further includes:
if the first character difference set does not have the target character, the first client is used as a negotiation initiator, and any character is selected from the first character difference set to be used as the target character;
If the target character exists in the second character difference set, the target character is used as a separator;
and if the second character difference set does not have the target character, repeating the step of selecting any character from the second character difference set as the target character.
In one possible implementation, determining the first character difference set and the second character difference set based on the first combined column and the second combined column, respectively, includes:
Counting all characters in the first combination column to form a first character set, and differencing the preset character set and the first character set to obtain a first character difference set;
Counting all characters in the second combination column to form a second character set, and differencing the preset character set and the second character set to obtain a second character difference set.
In one possible implementation, determining, based on the first combined column, the second combined column, and the separator, a first set of index numbers corresponding to the first combined column and a second set of index numbers corresponding to the second combined column includes:
preprocessing the first combination column, the second combination column and the separator to obtain first combination data corresponding to the first combination column, a third index number set corresponding to the first combination data, and a second combination data corresponding to the second combination column and a fourth index number set corresponding to the second combination data;
and carrying out intersection operation on the first combined data and the second combined data, and combining the third index number set and the fourth index number set to obtain a first index number set corresponding to the first combined column and a second index number set corresponding to the second combined column.
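As a rough illustration of this preprocessing and intersection step, the sketch below joins each record's column values with the separator, remembers which index numbers produced each joined string, and then intersects the two sides. It rests on several assumptions: the names `preprocess` and `intersect_with_indexes` are hypothetical, rows are modelled as `(index, values)` pairs, and an ordinary in-memory set intersection stands in for the actual PSI protocol, which would not expose either side's data in the clear.

```python
def preprocess(rows, separator):
    """Map each joined string to the index numbers of the rows producing it."""
    joined = {}
    for index, values in rows:
        joined.setdefault(separator.join(values), []).append(index)
    return joined


def intersect_with_indexes(rows_a, rows_b, separator):
    """Return the index number sets of A and B for the common records."""
    data_a = preprocess(rows_a, separator)
    data_b = preprocess(rows_b, separator)
    # Plain set intersection used here only as a stand-in for PSI.
    common = data_a.keys() & data_b.keys()
    index_a = sorted(i for key in common for i in data_a[key])
    index_b = sorted(i for key in common for i in data_b[key])
    return index_a, index_b
```

Each party can then look up the rows of its own table by the returned index numbers to obtain the target set.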
In a second aspect, an embodiment of the present invention provides a device for determining a target set, including:
the data receiving module is used for receiving the first data source and the second data source;
a combined column determining module for determining a first combined column and a second combined column based on the first data source and the second data source, respectively, wherein the combined column is formed by combining a plurality of columns of data;
the separator determining module is used for analyzing the first combined column and the second combined column by utilizing a negotiation interaction mode to determine a separator;
the target set determining module is used for determining a first index number set corresponding to the first combined column and a second index number set corresponding to the second combined column based on the first combined column, the second combined column and the separator, so that a target set is obtained through the first index number set and the second index number set.
In a third aspect, an embodiment of the present invention provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method for determining a target set according to any one of the above when executing the computer program.
In a fourth aspect, an embodiment of the present invention provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for determining a target set according to any one of the above.
The embodiments of the invention provide a method, a device, a terminal and a storage medium for determining a target set. The method comprises: receiving a first data source and a second data source; determining a first combined column and a second combined column based on the first data source and the second data source respectively; analyzing the first combined column and the second combined column by means of negotiation interaction to determine a separator; and determining, based on the first combined column, the second combined column and the separator, a first index number set corresponding to the first combined column and a second index number set corresponding to the second combined column, so as to obtain the target set through the first index number set and the second index number set. According to the method, multiple columns of data are flexibly selected and combined to obtain the combined columns (namely the first combined column and the second combined column) corresponding to the multiple columns of data. The multiple columns of data in the combined columns do not need to be manually converted into single-column data: private set intersection is performed directly on the first combined column and the second combined column to obtain the first index number set and the second index number set, and the corresponding data are then queried directly by index number to obtain the target set. A user can flexibly select and freely combine column data as required to form the combined columns, so that the intersection of multi-column data combinations is automated, integrated and flexible, and the user does not need to clean the data manually.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application, are incorporated in and constitute a part of this specification. The drawings and their description are illustrative of the application and are not to be construed as unduly limiting the application. In the drawings:
FIG. 1 is a flowchart of a method for determining a target set according to an embodiment of the present invention;
FIG. 2 is a flowchart of an implementation of a method for determining a target set according to another embodiment of the present invention;
FIG. 3 is a schematic diagram of a data table formed by storing source data according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating the selecting and numbering operations of column data according to an embodiment of the present invention;
FIG. 5 is a flowchart of a method for negotiating delimiters according to an embodiment of the present invention;
FIG. 6 is a flowchart of a negotiation implementation of a first client (A) as a negotiation initiator implementing a separator provided by an embodiment of the present invention;
FIG. 7 is a flowchart of a negotiation implementation of a second client (B) as a negotiation initiator implementing a separator provided by an embodiment of the present invention;
FIG. 8 is a schematic diagram of a data table formed by the intersection preprocessing according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of PSI operation results provided by an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a device for determining a target set according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of a terminal according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein.
It should be understood that, in various embodiments of the present invention, the sequence number of each process does not imply an execution order. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present invention.
It should be understood that in the present invention, "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements that are expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present invention, "plurality" means two or more. "and/or" merely describes an association relationship between associated objects and means that three relationships may exist; for example, "A and/or B" may mean A alone, both A and B, or B alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship. "comprising A, B and C" and "comprising A, B, C" mean that all three of A, B and C are comprised; "comprising A, B or C" means that one of A, B and C is comprised; "comprising A, B and/or C" means that any one, any two, or all three of A, B and C are comprised.
It should be understood that in the present invention, "B corresponding to A", "A corresponding to B", or "B corresponds to A" means that B is associated with A and that B can be determined from A. Determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information. "A matches B" means that the similarity between A and B is greater than or equal to a preset threshold.
As used herein, "if" may be interpreted, depending on the context, as "when", "in response to determining" or "in response to detecting".
The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the following description will be made by way of specific embodiments with reference to the accompanying drawings.
In one embodiment, as shown in fig. 1, a method for determining a target set is provided, including the following steps:
Step S101, receiving a first data source and a second data source.
Referring to FIG. 2, the two data sources on which private set intersection is performed in the present application are a first data source sent by a first client A and a second data source sent by a second client B. When the first data source and the second data source are received, they must be imported, and a first data table and a second data table are generated respectively after the import. The imported data sources may take forms such as CSV files, MySQL and Hive.
Step S102, determining a first combined column and a second combined column based on the first data source and the second data source respectively.
Each combined column is formed by combining multiple columns of data. A first data table and a second data table are generated based on the first data source and the second data source respectively, and each contains a plurality of columns of data. Determining the first combined column is described taking a first data table with N columns of data as an example. Specifically, three columns, namely column 1, column 4 and column 10, are selected from the first data table; the selected columns are numbered and then combined in numbering order to form the first combined column, namely column 1, column 4 and column 10. The second combined column is formed from the second data source in a similar manner, which is not specifically limited herein.
Step S103, analyzing the first combined column and the second combined column by means of negotiation interaction, and determining the separator.
The negotiation interaction mode means that the first client and the second client determine the separator through a negotiation mode, that is, the obtained separator needs to be determined by the first client and the second client together and is approved by both the first client and the second client.
Step S104, based on the first combination column, the second combination column and the separator, determining a first index number set corresponding to the first combination column and a second index number set corresponding to the second combination column, so as to obtain a target set through the first index number set and the second index number set.
The embodiment of the invention provides a method for determining a target set, comprising: receiving a first data source and a second data source; determining a first combined column and a second combined column based on the first data source and the second data source respectively; analyzing the first combined column and the second combined column by means of negotiation interaction to determine a separator; and determining, based on the first combined column, the second combined column and the separator, a first index number set corresponding to the first combined column and a second index number set corresponding to the second combined column, so as to obtain the target set through the first index number set and the second index number set. According to the method, multiple columns of data are flexibly selected and combined to obtain the combined columns (namely the first combined column and the second combined column) corresponding to the multiple columns of data. The multiple columns of data in the combined columns do not need to be manually converted into single-column data: private set intersection is performed directly on the first combined column and the second combined column to obtain the first index number set and the second index number set, and the corresponding data are then queried directly by index number to obtain the target set. A user can flexibly select and freely combine column data as required to form the combined columns, so that private set intersection over different combined columns is carried out conveniently and efficiently.
In one embodiment, step S102 includes:
Step S201, a first data table and a second data table are generated based on the first data source and the second data source respectively.
After a data source is imported, a data table corresponding to the data source is automatically generated; that is, a first data table corresponding to the first data source and a second data table corresponding to the second data source are generated, and the imported data source is stored in the form of the data table shown in FIG. 3. Specifically, the first row consists of the index header index and the column names column-1, column-2, and so on, corresponding to the data source. Each remaining row consists of a unique index value and the data values, where index is automatically generated and ordered.
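To make the table layout concrete, here is a minimal sketch that imports CSV text and prepends an automatically generated index column. The function name `build_table` is hypothetical, and starting the index at 0 is an assumption; the text only states that index values are unique, automatically generated and ordered.

```python
import csv
import io


def build_table(csv_text):
    """Build a table whose first row is the header and whose first column is index."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # First row: the index header followed by the source column names.
    table = [["index"] + header]
    # Remaining rows: a unique, ordered index value followed by the data values.
    for index, row in enumerate(reader):
        table.append([index] + row)
    return table
```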
Step S202, selecting preset number of column data from a first data table and a second data table respectively to obtain preset number of first column data and preset number of second column data;
Step S203, respectively combining the preset number of first column data and the preset number of second column data to obtain a first combined column and a second combined column.
The first column data and the second column data are the column data selected from the first data table and the second data table respectively; neither term refers to one specific column of a data table.
The selection and numbering of column data is described with reference to FIG. 4, in which the tables to which the column data belong are distinguished by columnA and columnB: columnA denotes a column of the first data table and columnB denotes a column of the second data table. Specifically, columnA-2, columnA-50 and columnA-52 are selected from the first data table and numbered 1, 3 and 2 respectively, and columnB-1, columnB-3 and columnB-82 are selected from the second data table and numbered 2, 1 and 3 respectively. The first combined column is therefore columnA-2, columnA-52, columnA-50, and the second combined column is columnB-3, columnB-1, columnB-82.
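The numbering rule in this example can be sketched as a sort on the assigned numbers. The helper names `combine_columns` and `project` are hypothetical, and representing table rows as dictionaries keyed by column name is an assumption made only to keep the sketch short.

```python
def combine_columns(selection):
    """Order the selected column names by their assigned numbers.

    selection maps a column name to the number the user gave it.
    """
    return [name for name, _ in sorted(selection.items(), key=lambda kv: kv[1])]


def project(rows, ordered_names):
    """Extract the combined-column values of each row, in combined order."""
    return [tuple(row[name] for name in ordered_names) for row in rows]
```

With the numbering from the FIG. 4 example, `combine_columns({"columnA-2": 1, "columnA-50": 3, "columnA-52": 2})` yields columnA-2, columnA-52, columnA-50, which is the first combined column given in the text.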
After the first combined column and the second combined column are determined, the separator is determined through negotiation interaction. The quality of the intersection result in the present application depends on this separator: the values of the columns in the first combined column and in the second combined column are each spliced, with the separator between them, into one character string for comparison, so if the separator appears in the data values of any column, the accuracy of the PSI result is affected.
As shown in FIG. 4, the values of columns columnA-2, columnA-52, columnA-50 of party A must match the corresponding values of columns columnB-3, columnB-1, columnB-82 of party B one to one. Assume that party A has a record whose values in columnA-2, columnA-52, columnA-50 are "a", "B" and "c" respectively, and party B has a record whose values in columnB-3, columnB-1, columnB-82 are "a", "B" and "c" respectively. These two records obviously do not match, and the PSI over the column combinations should correctly miss. However, if "," is chosen as the separator, the comparison method of the present application converts the records of both party A and party B into the string "a, B, c", and the records hit during intersection. To avoid such errors, the present application adds the separator negotiation step to ensure that the separator selected by A and B does not appear in the data values of either party.
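The kind of false hit described above can be reproduced with a small sketch. The record values below are hypothetical and chosen only to show the effect; they are not the values from the patent's example. The helper name `join_record` is likewise an assumption.

```python
def join_record(values, separator):
    """Splice the column values of one record into a single string."""
    return separator.join(values)


# Hypothetical records: the column values differ, but some contain a comma.
record_a = ("a,", "b", "c")
record_b = ("a", ",b", "c")

# With "," as the separator, both records collapse to the same string.
collides = join_record(record_a, ",") == join_record(record_b, ",")

# With a character absent from both parties' data, they stay distinct.
distinct = join_record(record_a, "|") != join_record(record_b, "|")
```

Here `collides` and `distinct` are both true: the comma makes two different records compare equal, while a separator that appears in neither party's data keeps them apart.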
In an embodiment, referring to FIGS. 5 to 7, the first data source is sent by the first client and the second data source is sent by the second client. Because the data sources come from different entities, the specific implementation of step S103 is described below with each entity in turn acting as the negotiation initiator.
In the case where the first client is a negotiation initiator, determining the separator based on the first combined column and the second combined column comprises:
(1) A first character difference set and a second character difference set are determined based on the first combined column and the second combined column, respectively.
Specifically, all characters in the first combination column are collected to form a first character set, and the difference between a preset character set and the first character set is taken to obtain the first character difference set. Likewise, all characters in the second combination column are collected to form a second character set, and the difference between the preset character set and the second character set is taken to obtain the second character difference set.
A denotes the first client and B denotes the second client. Party A and party B each scan the data values of their PSI combination columns: party A scans the data values in the first combination column, and party B scans the data values in the second combination column. Each party collects all characters appearing in its own data values, forming a first character set A_CharSet and a second character set B_CharSet.
Assuming that the characters of the ASCII code table form a character set ASC, the difference between ASC and A_CharSet and the difference between ASC and B_CharSet are computed, giving the first character difference set A_EXCEPT and the second character difference set B_EXCEPT.
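The computation of the character difference sets can be sketched as follows. This is a minimal illustration, assuming the preset character set is the 128 ASCII characters; the function name `char_difference_set` and the sample rows are hypothetical.

```python
# Preset character set: the 128 ASCII characters, as assumed in the text.
ASC = {chr(code) for code in range(128)}

def char_difference_set(combined_column_rows, preset=ASC):
    """Return the preset characters that never occur in the given rows.

    combined_column_rows: iterable of records, each a sequence of strings.
    """
    used = set()
    for row in combined_column_rows:
        for value in row:
            used.update(value)  # add every character of the value
    return preset - used

# Hypothetical data for party A and party B.
rows_a = [["a,b", "c", "d"], ["x", "y|z", "1"]]
rows_b = [["a", "b,c", "d"]]

A_EXCEPT = char_difference_set(rows_a)
B_EXCEPT = char_difference_set(rows_b)
```

Each party runs this scan only on its own data, so only candidate characters, not data values, need to be exchanged during negotiation.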
(2) If either the first character difference set or the second character difference set is empty, a current time stamp is obtained, and a separator is determined based on the current time stamp.
The separator is obtained by sequentially performing character string conversion, hash operation and character string interception on the current timestamp, and the separator determining step is described in a specific embodiment below.
As shown in FIG. 6, when party A is the negotiation initiator and party B is the participant, if at least one of the first character difference set A_EXCEPT and the second character difference set B_EXCEPT is empty, party A obtains the current timestamp and converts it into string form, Str(GetCurrentTimeMillis()), and then performs a hash operation (including, but not limited to, MD5, SHA1, SHA256, etc.) on the string to obtain Hash(Str(GetCurrentTimeMillis())). Finally, the character string composed of the first 16 bytes of the hash result, namely Str(Byte(0,15)[Hash(Str(GetCurrentTimeMillis()))]), is taken as the final negotiated separator and sent to party B, and party A's flow ends. Party B receives Str(Byte(0,15)[Hash(Str(GetCurrentTimeMillis()))]) from party A and uses it as the final negotiated separator, and party B's flow ends.
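The timestamp fallback can be sketched as below. This is a hedged illustration rather than the applicant's implementation: SHA256 is picked from the listed hash options, the function name `timestamp_separator` is hypothetical, and rendering the first 16 bytes as hexadecimal text is an assumption, since the text does not say how the bytes are turned into a string.

```python
import hashlib
import time

def timestamp_separator(now_ms=None):
    """Derive a fallback separator from a millisecond timestamp.

    Steps: timestamp -> string -> hash -> first 16 bytes -> string.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # stand-in for GetCurrentTimeMillis()
    text = str(now_ms)                                       # Str(...)
    digest = hashlib.sha256(text.encode("utf-8")).digest()   # Hash(...)
    # Assumption: the first 16 bytes are rendered as hexadecimal text.
    return digest[:16].hex()
```

Note that a separator derived in this way is not strictly guaranteed to be absent from the data; it is a long pseudo-random string that is merely very unlikely to occur in either party's values.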
(3) If neither the first character difference set nor the second character difference set is empty, any character is selected from the first character difference set as a target character, and if the target character exists in the second character difference set, the target character is taken as the separator.
Referring to FIG. 6, when party A is the negotiation initiator and party B is the participant, if neither the first character difference set A_EXCEPT nor the second character difference set B_EXCEPT is empty, party A selects one character a from the set A_EXCEPT and sends a to party B.
After receiving the character a from party A, party B checks whether a is in the second character difference set B_EXCEPT. If so, party B confirms a as the final negotiated separator, reports to party A that a can be used as the separator, and ends its flow; party A then also confirms a as the final negotiated separator and ends its flow.
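A single proposal round of this exchange can be sketched as follows. The function names `propose` and `respond` and the sample sets are hypothetical; `min()` is used only to make the "any character" choice deterministic in the sketch.

```python
def propose(initiator_except):
    """Initiator side: pick a candidate from its own difference set."""
    # Any element would do; min() just keeps the sketch deterministic.
    return min(initiator_except)

def respond(candidate, responder_except):
    """Responder side: accept only if the candidate is also unused locally."""
    return candidate in responder_except

# Hypothetical difference sets for party A and party B.
A_EXCEPT = {"|", "#", "~"}
B_EXCEPT = {"#", "^"}

a = propose(A_EXCEPT)            # party A sends this character to party B
accepted = respond(a, B_EXCEPT)  # party B reports whether it can be used
```

Only the single candidate character and a yes/no answer cross the boundary between the parties in this round.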
(4) If the target character does not exist in the second character difference set, party B reports to party A that the target character cannot be used as the separator, party A removes the target character from the first character difference set, and the process with party B as the negotiation initiator is executed; for details, see the process in which the second client is the negotiation initiator. If the negotiation initiated by party B also fails, party A and party B continue to alternate as negotiation initiator until a separator is negotiated.
If a is not in the second character difference set B_EXCEPT, the separator is determined by letting party B and party A take turns as negotiation initiator. Specifically, when party A starts as the negotiation initiator, the flow shown in FIG. 6 is followed; if that flow does not conclude with a separator, party B becomes the negotiation initiator and negotiation is carried out again following the flow shown in FIG. 7. If the negotiation has still not concluded, party A again acts as the negotiation initiator. The case in which party B starts as the negotiation initiator is similar and is not described again here.
Referring to FIGS. 5 to 7, when party A is the negotiation initiator, any character a is selected from A_EXCEPT and sent to party B. If a is not in the second character difference set B_EXCEPT, party B becomes the negotiation initiator: if neither the first character difference set A_EXCEPT nor the second character difference set B_EXCEPT is empty, party B selects one character b from the set B_EXCEPT and sends b to party A.
After receiving the character b from party B, party A checks whether b is in the first character difference set A_EXCEPT. If so, party A confirms b as the final negotiated separator, reports to party B that b can be used as the separator, and ends its flow; party B then also confirms b as the final negotiated separator and ends its flow.
If b is not in the first character difference set A_EXCEPT, party A becomes the negotiation initiator again. If either A_EXCEPT or B_EXCEPT is empty, the current timestamp is obtained and the separator is determined based on it; if neither is empty, party A selects one character c from the set A_EXCEPT and sends c to party B. Party A and party B take turns as negotiation initiator until the separator is determined.
In addition, when party A is the negotiation initiator and the target character a does not exist in the second character difference set, a is removed from the first character difference set, and the remaining characters are then checked against the second character difference set in the same way until the separator is determined.
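The alternating negotiation as a whole can be simulated in one process as sketched below. This is a simplified, single-machine illustration: the function name `negotiate`, the `fallback` placeholder standing in for the timestamp-derived separator, and the use of `min()` to pick "any character" are all assumptions of the sketch.

```python
def negotiate(a_except, b_except, initiator="A", fallback="<timestamp-separator>"):
    """Alternate initiators until a character unused by both parties is found.

    Returns (separator, rounds). `fallback` stands in for the
    timestamp-derived separator used when a difference set is empty.
    """
    pools = {"A": set(a_except), "B": set(b_except)}  # local working copies
    rounds = 0
    while True:
        # An empty difference set on either side triggers the fallback.
        if not pools["A"] or not pools["B"]:
            return fallback, rounds
        rounds += 1
        responder = "B" if initiator == "A" else "A"
        candidate = min(pools[initiator])    # "any character"; min() for determinism
        if candidate in pools[responder]:
            return candidate, rounds         # both parties confirm the separator
        pools[initiator].discard(candidate)  # rejected: never propose it again
        initiator = responder                # hand the initiator role over
```

For example, `negotiate({"#", "|"}, {"|", "^"})` proposes "#" (rejected by B), then "^" (rejected by A), then "|", which both sides accept. Because every rejected proposal removes one character from a finite set, the loop always ends, either with an agreed character or with the fallback.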
In the case where the second client is the negotiation initiator, determining the separator based on the first combined column and the second combined column includes:
(1) If either the first character difference set or the second character difference set is empty, a current time stamp is obtained, and a separator is determined based on the current time stamp. The separator is obtained by sequentially performing character string conversion, hash operation and character string interception on the current timestamp.
(2) If neither the first character difference set nor the second character difference set is empty, any character is selected from the second character difference set as a target character, and if the target character exists in the first character difference set, the target character is taken as the separator.
(3) If the first character difference set does not have the target character, any character is selected from the first character difference set as the target character, if the second character difference set has the target character, the target character is used as the separator, and if the second character difference set does not have the target character, the step of selecting any character from the second character difference set as the target character is repeatedly executed.
Steps (1) - (3) in this embodiment are the same as steps (2) - (4) in determining the separator if the first client is the negotiation initiator, and are not described here again.
In one embodiment, step S104 includes:
Step S301, preprocessing the first combination column, the second combination column and the separator to obtain first combination data corresponding to the first combination column, a third index number set corresponding to the first combination data, and a second combination data corresponding to the second combination column and a fourth index number set corresponding to the second combination data.
Specifically, assume that columnA-2, columnA-52 and columnA-50 in the first combination column have a record with values "a", "b" and "c", respectively, that columnB-3, columnB-1 and columnB-82 in the second combination column have a record with values "a", "b" and "c", respectively, and that "|" is used as the separator. One item of combination data formed from the first combination column is then "a|b|c", and one item of combination data formed from the second combination column is "a|b|c". Once all records in the first and second combination columns have been spliced with the separator, the first combination data and the second combination data are obtained.
Since the first combination column and the second combination column are selected from fig. 3, the index numbers corresponding to each value in the first combination column and the second combination column can be found from fig. 3, and are the third index number set and the fourth index number set respectively.
The step of preprocessing the first combination column and the second combination column comprises the steps of establishing a first empty table corresponding to the first combination column and a second empty table corresponding to the second combination column, and then respectively inserting combination data formed in the first combination column and the second combination column and corresponding index numbers into the first empty table and the second empty table according to the header fields in the first empty table and the second empty table to obtain a third data table and a fourth data table.
Specifically, two temporary empty data tables, namely the first empty table and the second empty table, are created. Each has two header fields, index-set and column-group, which represent the index number set and the spliced column-combination data value (that is, the combination data), respectively.
The first data table and the second data table are traversed row by row. According to the column combination and column numbers selected by the user, the data values are taken out one by one and spliced, with the separator inserted between them, into a combined data value value-group, which is inserted into the empty table together with its index number index. The completed data table is shown in FIG. 8.
When data is inserted, in order to avoid repeated insertion of the same combined data, whether the value-group to be inserted exists in the data value corresponding to the column-group column in the data table is checked. If not, the index and the value-group are directly inserted, and if so, the corresponding record is found in the data table, and the index is added into the index-set of the corresponding row.
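The preprocessing step, including the duplicate check, can be sketched with a dictionary keyed by the spliced value. This is a minimal sketch: the function name `build_group_table`, the use of row position as index number, and the sample rows are assumptions made for illustration.

```python
def build_group_table(rows, selected_columns, separator):
    """Map each spliced value-group to the set of source row indexes.

    rows: list of records (lists of strings); the list position is
          used as the index number.
    selected_columns: column positions chosen for the combination column.
    """
    table = {}  # column-group value -> index-set
    for index, row in enumerate(rows):
        value_group = separator.join(row[c] for c in selected_columns)
        # New value-group: insert a row. Existing one: extend its index-set.
        table.setdefault(value_group, set()).add(index)
    return table

# Hypothetical source table for party A; rows 0 and 2 share a value-group.
rows_a = [
    ["a", "b", "c", "x"],
    ["d", "e", "f", "y"],
    ["a", "b", "c", "z"],
]
TB_A = build_group_table(rows_a, selected_columns=[0, 1, 2], separator="|")
```

Here `TB_A` holds one entry for "a|b|c" with index-set {0, 2} and one for "d|e|f" with index-set {1}, so identical combined values are stored once and still point back to every source row.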
Step S302, carrying out intersection operation on the first combined data and the second combined data, and combining the third index number set and the fourth index number set to obtain a first index number set corresponding to the first combined column and a second index number set corresponding to the second combined column.
The intersection operation refers to private set intersection.
After the above preprocessing, a private set intersection (Private Set Intersection, PSI) operation may be performed to determine the index number sets, as shown in FIG. 9. The specific steps are as follows:
(1) After the intersection preprocessing, party A and party B generate a third data table TB-A and a fourth data table TB-B, respectively.
(2) A private set intersection operation is performed on the column-group column data of the data tables TB-A and TB-B. Solutions commonly used in the art may be adopted, including but not limited to oblivious transfer, hashing, public-key encryption, garbled circuits and homomorphic encryption, and all combined data values value-group in the intersection are recorded. For example, as shown in FIG. 9, the combined data values of rows 1 and 3 of party A are PSI hits, and the combined data values of rows 2 and 3 of party B are PSI hits.
(3) Based on the combined data values value-group in the intersection, the corresponding record rows are located in the data tables TB-A and TB-B, and all corresponding index-set values are found. The index-set values of all intersection rows of party A and of party B are then aggregated separately to form the index sets IndexSet PSI (A) and IndexSet PSI (B) of the combination intersection. For example, as shown in FIG. 9, IndexSet PSI (A) obtained by the combination intersection on party A's side is {0,3,4}, and IndexSet PSI (B) obtained on party B's side is {2,4,5,6,7}.
Then, by traversing the index values in the sets IndexSet PSI (A) and IndexSet PSI (B) and using them as index numbers to look up the data tables generated when the source data were imported (as shown in FIG. 3), the source data rows corresponding to the PSI multi-column combination intersection result can be obtained. The effect of the PSI combination intersection is equivalent to the condition "the columnA-2 data value equals the columnB-3 data value" and "the columnA-52 data value equals the columnB-1 data value" and "the columnA-50 data value equals the columnB-82 data value".
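The way the intersection result is mapped back to index number sets can be sketched as follows. A plain set intersection is used here purely as a stand-in for a real PSI protocol, which would not expose non-matching values to the other party; the function name `intersect_tables` and the two tables are hypothetical.

```python
def intersect_tables(tb_a, tb_b):
    """Return (index_set_a, index_set_b) for value-groups present in both tables.

    tb_a, tb_b: dicts mapping a column-group value to its index-set.
    Plain set intersection stands in for the PSI protocol.
    """
    hits = tb_a.keys() & tb_b.keys()  # value-groups in the intersection
    index_set_a = set().union(*(tb_a[v] for v in hits))
    index_set_b = set().union(*(tb_b[v] for v in hits))
    return index_set_a, index_set_b

# Hypothetical preprocessed tables for party A and party B.
TB_A = {"a|b|c": {0, 3}, "d|e|f": {1}, "g|h|i": {4}}
TB_B = {"a|b|c": {2, 4}, "g|h|i": {5, 6, 7}, "x|y|z": {0}}

index_set_a, index_set_b = intersect_tables(TB_A, TB_B)
```

With these tables the two hits are "a|b|c" and "g|h|i", giving {0, 3, 4} on party A's side and {2, 4, 5, 6, 7} on party B's side; each party can then use its own index set to fetch the matching source rows.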
The above results may be downloaded, that is, after the PSI operation is completed, the user may view the statistical information of the multi-column combined intersection result, and may download the record row data in the intersection in the data table (as shown in fig. 3), so as to perform the result analysis.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present invention in any way.
The following are device embodiments of the invention. For details not described therein, reference may be made to the corresponding method embodiments described above.
FIG. 10 shows a schematic structural diagram of a target set determining device according to an embodiment of the present invention. For convenience of explanation, only the portions related to the embodiment of the present invention are shown. The target set determining device includes a data receiving module 1001, a combined column determining module 1002, a separator determining module 1003, and a target set determining module 1004, specifically as follows:
A data receiving module 1001, configured to receive a first data source and a second data source;
a combined column determining module 1002, configured to determine a first combined column and a second combined column based on the first data source and the second data source, where the combined column is formed by combining multiple columns of data;
The separator determining module 1003 is configured to analyze the first combined column and the second combined column by using a negotiation interaction manner, and determine a separator;
the target set determining module 1004 is configured to determine, based on the first combined column, the second combined column, and the separator, a first index number set corresponding to the first combined column and a second index number set corresponding to the second combined column, so as to obtain a target set through the first index number set and the second index number set.
In one possible implementation, the combined column determination module 1002 includes:
the table generation sub-module is used for generating a first data table and a second data table based on the first data source and the second data source respectively;
The column data selecting sub-module is used for selecting preset number of column data from the first data table and the second data table respectively to obtain preset number of first column data and preset number of second column data;
and the column data combination sub-module is used for respectively combining the first column data with the preset number and the second column data with the preset number to obtain a first combination column and a second combination column.
In one possible implementation, a first data source is sent by a first client and a second data source is sent by a second client;
The separator determination module 1003 includes:
A first negotiation submodule, configured to determine a separator based on the first combination column and the second combination column in a case where the first client is a negotiation initiator;
and the second negotiation submodule is used for determining the separator based on the first combination column and the second combination column under the condition that the second client is a negotiation initiator.
In one possible implementation, the first negotiation submodule includes:
A first character difference set determining unit configured to determine a first character difference set and a second character difference set based on the first combination column and the second combination column, respectively;
The first judging unit is used for acquiring a current time stamp if any difference set of the first character difference set or the second character difference set is empty, and determining a separator based on the current time stamp, wherein the separator is obtained by sequentially performing character string conversion, hash operation and character string interception on the current time stamp;
The second judging unit is used for selecting any character from the first character difference set as a target character if the first character difference set and the second character difference set are not empty;
And the third judging unit is used for taking the target character as a separator if the target character exists in the second character difference set.
In one possible implementation, the first negotiation submodule further includes:
A fourth judging unit, configured to select any character from the second character difference set as a target character if the second character difference set does not have the target character;
A fifth judging unit, configured to take the target character as a separator if the target character exists in the first character difference set;
and a sixth judging unit for repeating the step of selecting any character from the first character difference set as the target character if the target character does not exist in the first character difference set.
In one possible implementation, the second negotiation sub-module comprises:
a seventh judging unit, configured to obtain a current timestamp if any one of the first character difference set or the second character difference set is empty, and determine a separator based on the current timestamp, where the separator is obtained by sequentially performing character string conversion, hash operation, and character string interception on the current timestamp;
an eighth judging unit, configured to select any character from the second character difference set as a target character if neither the first character difference set nor the second character difference set is empty;
And a ninth judging unit, configured to take the target character as the separator if the target character exists in the first character difference set.
In one possible implementation, the second negotiation submodule further includes:
A tenth judging unit, configured to select any character from the first character difference set as a target character if the target character does not exist in the first character difference set;
An eleventh judging unit for taking the target character as a separator if the target character exists in the second character difference set;
and a twelfth judging unit for repeating the step of selecting any character from the second character difference set as the target character if the target character does not exist in the second character difference set.
In one possible implementation, the first character difference set determining unit or the second character difference set determining unit includes:
The first statistics subunit is used for counting all characters in the first combination column to form a first character set, and differencing the preset character set and the first character set to obtain a first character difference set;
And the second statistics subunit is used for counting all characters in the second combination column to form a second character set, and performing difference between the preset character set and the second character set to obtain a second character difference set.
In one possible implementation, the target set determination module 1004 includes:
the preprocessing submodule is used for preprocessing the first combination column, the second combination column and the separator to obtain first combination data corresponding to the first combination column, a third index number set corresponding to the first combination data, second combination data corresponding to the second combination column and a fourth index number set corresponding to the second combination data;
And the PSI operation sub-module is used for carrying out the intersection operation on the first combined data and the second combined data, and combining the third index number set and the fourth index number set to obtain a first index number set corresponding to the first combined column and a second index number set corresponding to the second combined column.
FIG. 11 is a schematic diagram of a terminal according to an embodiment of the present invention. As shown in FIG. 11, the terminal 11 of this embodiment includes a processor 110, a memory 111, and a computer program 112 stored in the memory 111 and executable on the processor 110. When executing the computer program 112, the processor 110 implements the steps in the above-described embodiments of the method for determining a target set, for example steps 101 to 104 shown in FIG. 1. Alternatively, when executing the computer program 112, the processor 110 implements the functions of the modules/units in the above-described embodiments of the target set determining device, for example the functions of the modules/units 1001 to 1004 shown in FIG. 10.
The present invention also provides a readable storage medium having a computer program stored therein, which when executed by a processor is configured to implement the method for determining a target set provided in the above-described various embodiments.
The readable storage medium may be a computer storage medium or a communication medium. Communication media include any medium that facilitates transfer of a computer program from one place to another. Computer storage media can be any available media that can be accessed by a general-purpose or special-purpose computer. For example, a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Alternatively, the readable storage medium may be integral to the processor. The processor and the readable storage medium may reside in an application-specific integrated circuit (ASIC). In addition, the ASIC may reside in a user device. The processor and the readable storage medium may also reside as discrete components in a communication device. The readable storage medium may be read-only memory (ROM), random-access memory (RAM), a CD-ROM, magnetic tape, a floppy disk, an optical data storage device, etc.
The present invention also provides a program product comprising execution instructions stored in a readable storage medium. The at least one processor of the device may read the execution instructions from the readable storage medium, the execution instructions being executed by the at least one processor to cause the device to implement the method of determining a set of targets provided by the various embodiments described above.
In the above device embodiments, it should be understood that the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules in a processor.
The foregoing embodiments are merely for illustrating the technical solution of the present invention, but not for limiting the same, and although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that the technical solution described in the foregoing embodiments may be modified or substituted for some of the technical features thereof, and that these modifications or substitutions should not depart from the spirit and scope of the technical solution of the embodiments of the present invention and should be included in the protection scope of the present invention.

Claims (9)

1.一种目标集合的确定方法,其特征在于,包括:1. A method for determining a target set, comprising: 接收第一数据源和第二数据源;receiving a first data source and a second data source; 分别基于所述第一数据源和所述第二数据源,确定第一组合列和第二组合列,其中,所述组合列是由多列数据组合形成;Determine a first combined column and a second combined column based on the first data source and the second data source respectively, wherein the combined column is formed by combining multiple columns of data; 利用协商交互方式对所述第一组合列、所述第二组合列进行分析,确定分隔符;Analyzing the first combination column and the second combination column in a negotiation interaction manner to determine a separator; 基于所述第一组合列、所述第二组合列和所述分隔符,确定所述第一组合列对应的第一索引号集合和所述第二组合列对应的第二索引号集合,以通过所述第一索引号集合和所述第二索引号集合得到目标集合;Based on the first combination column, the second combination column, and the delimiter, determining a first index number set corresponding to the first combination column and a second index number set corresponding to the second combination column, so as to obtain a target set through the first index number set and the second index number set; 所述第一数据源由第一客户端发送,所述第二数据源由第二客户端发送;The first data source is sent by a first client, and the second data source is sent by a second client; 所述利用协商交互方式对所述第一组合列、所述第二组合列进行分析,确定分隔符,包括:The analyzing the first combination column and the second combination column in a negotiation interaction manner to determine a separator includes: 在所述第一客户端为协商发起方的情况下,基于所述第一组合列和所述第二组合列,确定所述分隔符;In a case where the first client is a negotiation initiator, determining the delimiter based on the first combination column and the second combination column; 在所述第二客户端为协商发起方的情况下,基于所述第一组合列和所述第二组合列,确定所述分隔符;In a case where the second client is the negotiation initiator, determining the delimiter based on the first combination column and the second combination column; 所述在所述第一客户端为协商发起方的情况下,基于所述第一组合列和所述第二组合列,确定所述分隔符,包括:The determining the delimiter based on the first combination column and the second combination column when the first client is the negotiation initiator includes: 
分别基于所述第一组合列和所述第二组合列,确定第一字符差集和第二字符差集;Determine a first character difference set and a second character difference set based on the first combination column and the second combination column respectively; 若所述第一字符差集或第二字符差集任一差集为空的情况下,获取当前时间戳,并基于所述当前时间戳,确定所述分隔符,其中,所述分隔符是对所述当前时间戳依次进行字符串转换、hash操作和字符串截取得到;If either the first character difference set or the second character difference set is empty, obtaining a current timestamp, and determining the delimiter based on the current timestamp, wherein the delimiter is obtained by sequentially performing string conversion, hash operation, and string truncation on the current timestamp; 若所述第一字符差集和所述第二字符差集均不为空的情况下,从所述第一字符差集中选取任一字符作为目标字符;If both the first character difference set and the second character difference set are not empty, selecting any character from the first character difference set as the target character; 若所述第二字符差集中存在所述目标字符,将所述目标字符作为所述分隔符;If the target character exists in the second character difference set, use the target character as the separator; 所述分别基于所述第一组合列和所述第二组合列,确定第一字符差集和第二字符差集,包括:The determining the first character difference set and the second character difference set based on the first combination column and the second combination column respectively includes: 统计所述第一组合列中的所有字符以形成第一字符集,并将预设字符集与所述第一字符集作差,得到第一字符差集;Counting all characters in the first combination column to form a first character set, and performing a subtraction between a preset character set and the first character set to obtain a first character difference set; 统计所述第二组合列中的所有字符以形成第二字符集,并将所述预设字符集与所述第二字符集作差,得到第二字符差集。All characters in the second combination column are counted to form a second character set, and the preset character set is subtracted from the second character set to obtain a second character difference set. 2.如权利要求1所述目标集合的确定方法,其特征在于,所述分别基于所述第一数据源和所述第二数据源,确定第一组合列和第二组合列,包括:2. 
The method for determining a target set according to claim 1, wherein determining the first combination column and the second combination column based on the first data source and the second data source respectively comprises: 分别基于所述第一数据源和所述第二数据源,生成第一数据表和第二数据表;generating a first data table and a second data table based on the first data source and the second data source respectively; 分别从所述第一数据表和所述第二数据表中选取预设数量的列数据,得到预设数量的第一列数据和预设数量的第二列数据;Selecting a preset number of columns of data from the first data table and the second data table respectively to obtain a preset number of first columns of data and a preset number of second columns of data; 分别将所述预设数量的第一列数据和所述预设数量的第二列数据进行组合,得到所述第一组合列和第二组合列。The preset number of first column data and the preset number of second column data are respectively combined to obtain the first combined column and the second combined column. 3.如权利要求1所述目标集合的确定方法,其特征在于,还包括:3. The method for determining a target set according to claim 1, further comprising: 若所述第二字符差集中不存在所述目标字符,将所述第二客户端作为协商发起方,并从所述第二字符差集中选取任一字符作为所述目标字符;If the target character does not exist in the second character difference set, taking the second client as the negotiation initiator, and selecting any character from the second character difference set as the target character; 若所述第一字符差集中存在所述目标字符,将所述目标字符作为所述分隔符;If the target character exists in the first character difference set, use the target character as the separator; 若所述第一字符差集中不存在所述目标字符,重复执行所述从所述第一字符差集中选取任一字符作为目标字符的步骤。If the target character does not exist in the first character difference set, the step of selecting any character from the first character difference set as the target character is repeated. 4.如权利要求1所述目标集合的确定方法,其特征在于,所述在所述第二客户端为协商发起方的情况下,基于所述第一组合列和所述第二组合列,确定所述分隔符,包括:4. 
The method for determining a target set according to claim 1, wherein, when the second client is the negotiation initiator, determining the delimiter based on the first combination column and the second combination column comprises: 若所述第一字符差集或第二字符差集任一差集为空的情况下,获取当前时间戳,并基于所述当前时间戳,确定所述分隔符,其中,所述分隔符是对所述当前时间戳依次进行字符串转换、hash操作和字符串截取得到;If either the first character difference set or the second character difference set is empty, obtaining a current timestamp, and determining the delimiter based on the current timestamp, wherein the delimiter is obtained by sequentially performing string conversion, hash operation, and string truncation on the current timestamp; 若所述第一字符差集和所述第二字符差集均不为空的情况下,从所述第二字符差集中选取任一字符作为目标字符;If both the first character difference set and the second character difference set are not empty, selecting any character from the second character difference set as the target character; 若所述第一字符差集中存在所述目标字符,将所述目标字符作为所述分隔符。If the target character exists in the first character difference set, the target character is used as the separator. 5.如权利要求4所述目标集合的确定方法,其特征在于,还包括:5. The method for determining a target set according to claim 4, further comprising: 若所述第一字符差集中不存在所述目标字符,将所述第一客户端作为协商发起方,并从所述第一字符差集中选取任一字符作为所述目标字符;If the target character does not exist in the first character difference set, taking the first client as the negotiation initiator, and selecting any character from the first character difference set as the target character; 若所述第二字符差集中存在所述目标字符,将所述目标字符作为所述分隔符;If the target character exists in the second character difference set, use the target character as the separator; 若所述第二字符差集中不存在所述目标字符,重复执行所述从所述第二字符差集中选取任一字符作为目标字符的步骤。If the target character does not exist in the second character difference set, the step of selecting any character from the second character difference set as the target character is repeated. 6.如权利要求1所述目标集合的确定方法,其特征在于,所述基于所述第一组合列、所述第二组合列和所述分隔符,确定所述第一组合列对应的第一索引号集合和所述第二组合列对应的第二索引号集合,包括:6. 
The method for determining a target set according to claim 1, wherein determining a first index number set corresponding to the first combination column and a second index number set corresponding to the second combination column based on the first combination column, the second combination column, and the delimiter comprises:
preprocessing the first combination column, the second combination column, and the delimiter to obtain first combination data corresponding to the first combination column and a third index number set corresponding to the first combination data, and second combination data corresponding to the second combination column and a fourth index number set corresponding to the second combination data;
performing an intersection operation on the first combination data and the second combination data and, in combination with the third index number set and the fourth index number set, obtaining the first index number set corresponding to the first combination column and the second index number set corresponding to the second combination column.
7.
A device for determining a target set, comprising:
a data receiving module, configured to receive a first data source and a second data source;
a combination column determining module, configured to determine a first combination column and a second combination column based on the first data source and the second data source respectively, wherein each combination column is formed by combining multiple columns of data;
a delimiter determining module, configured to analyze the first combination column and the second combination column in a negotiation interaction manner to determine a delimiter;
a target set determining module, configured to determine, based on the first combination column, the second combination column, and the delimiter, a first index number set corresponding to the first combination column and a second index number set corresponding to the second combination column, so as to obtain a target set from the first index number set and the second index number set;
wherein the first data source is sent by a first client, and the second data source is sent by a second client;
wherein analyzing the first combination column and the second combination column in a negotiation interaction manner to determine the delimiter comprises:
when the first client is the negotiation initiator, determining the delimiter based on the first combination column and the second combination column;
when the second client is the negotiation initiator, determining the delimiter based on the first combination column and the second combination column;
wherein, when the first client is the negotiation initiator, determining the delimiter based on the first combination column and the second combination column comprises:
determining a first character difference set and a second character difference set based on the first combination column and the second combination column respectively;
if either the first character difference set or the second character difference set is empty, obtaining a current timestamp and determining the delimiter based on the current timestamp, wherein the delimiter is obtained by sequentially performing string conversion, a hash operation, and string truncation on the current timestamp;
if neither the first character difference set nor the second character difference set is empty, selecting any character from the first character difference set as the target character;
if the target character exists in the second character difference set, using the target character as the delimiter;
wherein determining the first character difference set and the second character difference set based on the first combination column and the second combination column respectively comprises:
collecting all characters in the first combination column to form a first character set, and subtracting the first character set from a preset character set to obtain the first character difference set;
collecting all characters in the second combination column to form a second character set, and subtracting the second character set from the preset character set to obtain the second character difference set.
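The delimiter determination recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The preset character set, the choice of SHA-256, the truncation length of 8, and all function names are assumptions made for illustration; the claims name only "a preset character set", "a hash operation", and "string truncation". The sketch also runs both sides in one process, so the alternating initiator roles collapse into a search for a character present in both difference sets, and it falls back to the timestamp when no such character exists, a case the claims do not address.

```python
import hashlib
import time

# Hypothetical preset character set; the claims do not enumerate it.
PRESET_CHARSET = frozenset("|#~^;")


def char_difference_set(combination_column, preset=PRESET_CHARSET):
    """Return the preset characters that never occur in the combination column."""
    used = set()
    for value in combination_column:
        used.update(value)  # adds every character of the string
    return set(preset) - used


def timestamp_delimiter(timestamp=None, length=8):
    """Fallback: string conversion, hash, then truncation of a timestamp.

    SHA-256 and length=8 are assumptions, not taken from the claims.
    """
    if timestamp is None:
        timestamp = time.time()
    text = str(timestamp)
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return digest[:length]


def negotiate_delimiter(first_column, second_column, timestamp=None):
    """Pick a delimiter that appears in neither combination column."""
    first_diff = char_difference_set(first_column)
    second_diff = char_difference_set(second_column)
    # Either difference set empty: use the timestamp-derived delimiter.
    if not first_diff or not second_diff:
        return timestamp_delimiter(timestamp)
    # sorted() makes "select any character" deterministic in this sketch.
    for target in sorted(first_diff):
        if target in second_diff:
            return target
    # No shared character: assumed fallback, not specified by the claims.
    return timestamp_delimiter(timestamp)
```

Calling `negotiate_delimiter(["a|b", "c#d"], ["x~y"])` returns a preset character that is absent from both columns, so it can later separate the combined fields without ambiguity.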
8. A terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method for determining a target set according to any one of claims 1 to 6.
9. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for determining a target set according to any one of claims 1 to 6.
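As a rough illustration of the column combination in claim 2 and the intersection step in claim 6, the following Python sketch joins selected columns with the negotiated delimiter and returns the row index numbers on each side whose combined value occurs on both sides. The row representation (a list of dicts), the function names, and the plain in-memory intersection are assumptions; the claims do not specify the preprocessing or how the intersection is computed.

```python
def combine_columns(table, column_names, delimiter):
    """Join the selected columns of every row into one combined string."""
    return [
        delimiter.join(str(row[name]) for name in column_names)
        for row in table
    ]


def build_index(combination_column):
    """Map each combined value to the list of row index numbers holding it."""
    index = {}
    for i, value in enumerate(combination_column):
        index.setdefault(value, []).append(i)
    return index


def index_number_sets(first_column, second_column):
    """Return (first_ids, second_ids): rows whose combined value is shared."""
    first_index = build_index(first_column)
    second_index = build_index(second_column)
    common = first_index.keys() & second_index.keys()
    first_ids = sorted(i for value in common for i in first_index[value])
    second_ids = sorted(i for value in common for i in second_index[value])
    return first_ids, second_ids


# Hypothetical sample data for the two clients.
table_a = [
    {"name": "ann", "city": "x"},
    {"name": "bob", "city": "y"},
    {"name": "cy", "city": "z"},
]
table_b = [
    {"name": "bob", "city": "y"},
    {"name": "ann", "city": "q"},
    {"name": "cy", "city": "z"},
]

first_column = combine_columns(table_a, ["name", "city"], "|")
second_column = combine_columns(table_b, ["name", "city"], "|")
first_ids, second_ids = index_number_sets(first_column, second_column)
```

The delimiter matters because, without one, the field pairs ("ab", "c") and ("a", "bc") would both combine to "abc" and be matched incorrectly.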
CN202210626216.3A 2022-06-02 2022-06-02 Method, device, terminal and storage medium for determining target set Active CN115048367B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210626216.3A CN115048367B (en) 2022-06-02 2022-06-02 Method, device, terminal and storage medium for determining target set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210626216.3A CN115048367B (en) 2022-06-02 2022-06-02 Method, device, terminal and storage medium for determining target set

Publications (2)

Publication Number Publication Date
CN115048367A CN115048367A (en) 2022-09-13
CN115048367B CN115048367B (en) 2025-08-05

Family

ID=83159889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210626216.3A Active CN115048367B (en) 2022-06-02 2022-06-02 Method, device, terminal and storage medium for determining target set

Country Status (1)

Country Link
CN (1) CN115048367B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11809588B1 (en) * 2023-04-07 2023-11-07 Lemon Inc. Protecting membership in multi-identification secure computation and communication

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101141582A (en) * 2006-09-07 2008-03-12 Lg电子株式会社 Digital television receiver and method for processing a digital television signal
CA3084360A1 (en) * 2018-01-08 2019-07-11 Equifax Inc. Facilitating entity resolution, keying, and search match without transmitting personally identifiable information in the clear

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6128662A (en) * 1997-08-29 2000-10-03 Cisco Technology, Inc. Display-model mapping for TN3270 client
CN1812773B (en) * 2003-05-28 2012-06-27 莫诺索尔克斯有限公司 Polyethylene oxide-based films and drug delivery systems made therefrom
CN104765915B (en) * 2015-03-30 2017-08-04 中南大学 Three-dimensional laser scanning data modeling method and system
CN109543154B (en) * 2018-10-11 2021-07-23 天津字节跳动科技有限公司 Type conversion method and device of table data, storage medium and electronic equipment
CN111259082B (en) * 2020-02-11 2023-07-21 深圳市六因科技有限公司 A method to realize full data synchronization in a big data environment
CN112799850A (en) * 2021-02-26 2021-05-14 重庆度小满优扬科技有限公司 Model training method, model prediction method, and model control system
CN113138825A (en) * 2021-04-28 2021-07-20 北京乐学帮网络技术有限公司 Information display method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN115048367A (en) 2022-09-13

Similar Documents

Publication Publication Date Title
EP3767483A1 (en) Method, device, system, and server for image retrieval, and storage medium
CN107798216B (en) Alignment of high similarity sequences using divide and conquer
CN108733317B (en) Data storage method and device
CN112037061A (en) Processing method and device for transactions in block chain, electronic equipment and storage medium
CN111708807B (en) Data flattening processing method, device, equipment and storage medium
CN115048367B (en) Method, device, terminal and storage medium for determining target set
CN107735783A (en) Method and apparatus for searching for image
CN111209591A (en) Storage structure sorted according to time and quick query method
CN115276889B (en) Decoding processing method, decoding processing device, computer equipment and storage medium
CN114970464A (en) Method, device, terminal equipment and storage medium for generating identification
Alshammary et al. Reviewing and evaluating existing file carving techniques for jpeg files
CN113204683B (en) Information reconstruction method and device, storage medium and electronic equipment
CN110797082A (en) Method and system for storing and reading gene sequencing data
CN113128848B (en) Data quality monitoring method of all-service index, electronic equipment and storage medium
CN117892355B (en) Multiparty data joint analysis method and system based on privacy protection
CN112883301A (en) Method and device for generating short link based on 55 system and storage medium
CN115698996A (en) Data storage servers and client devices for securely storing data
CN112751802A (en) Application identification method, system and equipment for encrypted traffic
CN116992486A (en) Cryptography-based united blacklist multiparty privacy query method and system
CN106469086B (en) Event processing method and device
CN113282662B (en) Block information processing method, device, equipment and medium
CN114025024A (en) A data transmission method and device
CN114443126A (en) Multi-version image processing method, information push method, device and electronic device
CN112445822B (en) Data query method and device, electronic equipment and computer readable storage medium
US20240354342A1 (en) Compact Probabilistic Data Structure For Storing Streamed Log Lines

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant