CN117521038B - Ownership verification methods, processing methods, equipment and media for structured data sets - Google Patents

Ownership verification methods, processing methods, equipment and media for structured data sets

Info

Publication number
CN117521038B
Authority
CN
China
Prior art keywords
data
structured
data set
watermark
secret information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311467146.2A
Other languages
Chinese (zh)
Other versions
CN117521038A (en
Inventor
朱文涛
周文红
张超
刘洋
杨立宝
王铎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN202311467146.2A
Publication of CN117521038A
Priority to PCT/CN2024/125335 (WO2025098109A1)
Application granted
Publication of CN117521038B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/16 Program or content traceability, e.g. by watermarking
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Technology Law (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Editing Of Facsimile Originals (AREA)
  • Storage Device Security (AREA)
  • Image Processing (AREA)

Abstract

本申请公开了一种结构化数据集的权属验证方法、处理方法、设备与介质;方法基于秘密信息生成与业务数据外观一致的水印数据,并将水印数据按预设比例混杂于业务数据之中,达到难以剥离、较好隐藏的效果。数据集验证方需要在向数据集所有方获取秘密信息之后才能实现分辨水印数据和业务数据,使得本申请水印数据具有较好的安全性,难以被数据集盗用方恶意利用。另一方面,本申请基于水印数据与业务数据所形成的比例特征确认结构化数据集的权属,而且引入了秘密信息,相较于直接通过水印数据确认权属的方式更难以被破解,能够较好地保护数据集所有方的合法权益。本申请可以广泛应用于信息安全领域。

The present application discloses a method, processing method, device and medium for verifying the ownership of a structured data set; the method generates watermark data that is consistent with the appearance of business data based on secret information, and mixes the watermark data into the business data in a preset proportion, so as to achieve an effect that is difficult to remove and well hidden. The data set verifier needs to obtain secret information from the data set owner before being able to distinguish between watermark data and business data, so that the watermark data of the present application has good security and is difficult to be maliciously used by the data set thief. On the other hand, the present application confirms the ownership of a structured data set based on the proportional characteristics formed by the watermark data and the business data, and introduces secret information, which is more difficult to crack than the method of directly confirming ownership through watermark data, and can better protect the legitimate rights and interests of the data set owner. The present application can be widely used in the field of information security.

Description

Rights verification method, processing method, device and medium for structured data set
Technical Field
The application relates to the field of information security, in particular to a rights verification method, a processing method, equipment and a medium of a structured data set.
Background
The concept of watermarking is common in the field of multimedia copyright. For example, additional data for copyright identification, such as author information, is embedded into multimedia content files such as images, audio and video, in a form either visible or invisible to the human eye, so that the copyright ownership of the content can be determined and the legitimate rights and interests of the author protected. Embedded watermarking techniques are widely used to confirm rights to unstructured data.
Structured data, such as records built from fields like phone numbers and identification numbers, does not support embedding watermarks in the data itself, so other watermarking means are required. In the related art, structured data is generally watermarked with a column watermark or a row watermark. A column watermark is an extra data field with no (or little) practical meaning, or simply a decorative marker, added to the existing data in format. A row watermark synthesizes multiple groups of forged data based on the original structured data and mixes them into the data set, so that the structured data is marked by the forged data. The disadvantage of the column watermark is that the added data fields with no (or little) practical meaning are very easy to distinguish: after other persons or organizations (hereinafter referred to as the data set thief) acquire the structured data, the column watermark is easily identified and stripped by machine means, after which the ownership of the structured data is difficult to establish. The disadvantage of the row watermark is that the forged structured data usually differs noticeably from the business data in format or content and is difficult to blend in completely, so the data set thief can easily remove it. Once the data set thief removes the watermark, the structured data can be maliciously used, and the legitimate rights and interests of the data set owner are difficult to protect.
Disclosure of Invention
In view of this, the embodiments of the present application provide a method, a processing method, a device and a medium for verifying rights of a structured data set.
One aspect of the present application provides a method of rights verification for a structured dataset, comprising the steps of:
obtaining a structured data set, wherein the structured data set comprises a plurality of pieces of structured data, each piece of structured data is business data or watermark data, and the business data and the watermark data satisfy the same preset data format;
obtaining, from a target object to be verified, secret information corresponding to the structured data set, a proportion label of the watermark data, and a specific mathematical property corresponding to the watermark data, wherein the specific mathematical property constrains the check value, calculated from the secret information and the watermark data through a preset mathematical rule, to conform to a preset mathematical characteristic; the probability that a check value calculated through the preset mathematical rule from the secret information and any data satisfying the preset data format conforms to the preset mathematical characteristic is smaller than a first threshold; and the proportion label is larger than the first threshold;
identifying the watermark data from the structured data set according to the secret information and the specific mathematical property;
and counting the proportion result of the watermark data in the structured data set, and determining the ownership relationship between the target object and the structured data set according to the proportion result and the proportion label.
Further, in some embodiments, said identifying said watermark data from said structured dataset according to said secret information and said particular mathematical property comprises:
calculating a first check value from the secret information and the structured data through the preset mathematical rule;
judging, according to the specific mathematical property, whether the first check value conforms to the preset mathematical characteristic;
and if the first check value conforms to the preset mathematical characteristic, determining the structured data to be watermark data.
Further, in some embodiments, the determining the ownership relationship between the target object and the structured data set according to the proportion result and the proportion label comprises:
calculating a difference value between the proportion result and the proportion label;
and if the difference value is smaller than a second threshold, determining that the target object is the owner of the structured data set.
Further, in some embodiments, the calculating the difference value between the proportion result and the proportion label comprises:
calculating the difference between the proportion result and the proportion label, and taking the absolute value of the difference as the difference value;
or calculating the difference between the proportion result and the proportion label, and taking the ratio of the absolute value of the difference to the proportion label as the difference value.
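The two variants of the difference value can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names, the `relative` switch and the example numbers (a 5% proportion label and a second threshold of 0.01) are assumptions chosen for the example.

```python
def difference_value(ratio_result: float, ratio_label: float,
                     relative: bool = False) -> float:
    """Difference between the observed watermark proportion and the label.

    relative=False: absolute value of the difference.
    relative=True:  absolute difference as a fraction of the proportion label.
    """
    diff = abs(ratio_result - ratio_label)
    return diff / ratio_label if relative else diff


def is_owner(ratio_result: float, ratio_label: float,
             second_threshold: float, relative: bool = False) -> bool:
    """Ownership is confirmed when the difference value is below the second threshold."""
    return difference_value(ratio_result, ratio_label, relative) < second_threshold


if __name__ == "__main__":
    # Hypothetical numbers: label 5%, observed 4.8%, second threshold 0.01.
    print(is_owner(0.048, 0.05, second_threshold=0.01))
    # Same observation judged with the relative variant and a 10% tolerance.
    print(is_owner(0.048, 0.05, second_threshold=0.10, relative=True))
```

The relative variant makes the second threshold independent of how large the proportion label is, which may be convenient when different data sets use different labels.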
In another aspect, the application discloses a method for processing a structured dataset, comprising the steps of:
obtaining an original data set and rights label information, wherein the original data set is used for storing structured data, the structured data satisfies a preset data format, and the rights label information comprises secret information, a proportion label and a specific mathematical property; the specific mathematical property constrains the check value, calculated from the secret information and watermark data through a preset mathematical rule, to conform to a preset mathematical characteristic; the probability that a check value calculated through the preset mathematical rule from the secret information and any data satisfying the preset data format conforms to the preset mathematical characteristic is smaller than a first threshold; and the proportion label is larger than the first threshold;
determining watermark data from data satisfying the predetermined data format based on the secret information and the specific mathematical property;
determining the target number of watermark data needing to be added into the original data set according to the number of business data contained in the original data set and the proportion label;
and adding the watermark data of the target quantity into the original data set to obtain a target data set.
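One way to picture these steps is rejection sampling: synthesize candidate records in the preset format and keep those whose check value has the required feature. The sketch below rests on several assumptions that are not stated in the application: HMAC-SHA256 as the preset mathematical rule, "the check value starts with a run of 1 bits" as the preset mathematical characteristic, an 11-digit number beginning with 1 as the preset data format, and the proportion label defined as watermark count divided by total count after insertion. All names are hypothetical.

```python
import hashlib
import hmac
import random


def check_value(secret: bytes, record: str) -> bytes:
    # Assumed preset mathematical rule: HMAC-SHA256 keyed with the secret information.
    return hmac.new(secret, record.encode("utf-8"), hashlib.sha256).digest()


def is_watermark(secret: bytes, record: str, one_bits: int = 16) -> bool:
    # Assumed preset characteristic: the check value starts with `one_bits` 1 bits.
    tag = int.from_bytes(check_value(secret, record), "big")
    return tag >> (256 - one_bits) == (1 << one_bits) - 1


def target_count(n_business: int, ratio_label: float) -> int:
    # Assumes label = w / (n + w), hence w = label * n / (1 - label).
    return round(ratio_label * n_business / (1 - ratio_label))


def make_watermarks(secret: bytes, count: int, rng: random.Random,
                    one_bits: int = 16) -> list:
    """Draw random records in the assumed format until `count` of them qualify."""
    found = set()
    while len(found) < count:
        candidate = "1" + "".join(rng.choice("0123456789") for _ in range(10))
        if is_watermark(secret, candidate, one_bits):
            found.add(candidate)
    return sorted(found)


if __name__ == "__main__":
    rng = random.Random()
    w = target_count(95, 0.05)
    # one_bits=8 keeps the demo fast; 16 bits needs about 65536 tries per record.
    print(w, make_watermarks(b"hypothetical-secret", w, rng, one_bits=8))
```

A production version would also need to ensure that synthesized records do not collide with real business records, which this sketch does not check.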
Further, in some embodiments, obtaining the rights label information comprises:
acquiring association information corresponding to the original data set, wherein the association information is used for representing the ownership of the original data set;
and generating the secret information according to the association information.
Further, in some embodiments, said adding the target number of pieces of watermark data to the original data set to obtain a target data set comprises:
determining the target number of insertion positions in the original data set;
and adding each piece of watermark data at one insertion position in the original data set to obtain the target data set.
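The insertion step can be sketched as below. The application does not say how insertion positions are chosen; drawing them uniformly at random is an assumption made here so that watermark records are not clustered at the end of the data set. The function name is hypothetical.

```python
import random


def insert_watermarks(original: list, watermarks: list, rng: random.Random) -> list:
    """Place each watermark record at its own randomly chosen position.

    The relative order of the original records is preserved.
    """
    total = len(original) + len(watermarks)
    # One distinct position in the target data set per watermark record.
    positions = set(rng.sample(range(total), len(watermarks)))
    originals = iter(original)
    marks = iter(watermarks)
    return [next(marks) if i in positions else next(originals)
            for i in range(total)]


if __name__ == "__main__":
    data = ["rec-a", "rec-b", "rec-c", "rec-d"]
    print(insert_watermarks(data, ["wm-1", "wm-2"], random.Random()))
```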
Another aspect of the application discloses a rights verification apparatus for a structured dataset, comprising:
a first acquisition unit for obtaining a structured data set, wherein the structured data set comprises a plurality of pieces of structured data, each piece of structured data is business data or watermark data, and the business data and the watermark data satisfy the same preset data format;
a second acquisition unit for obtaining, from a target object to be verified, secret information corresponding to the structured data set, a proportion label of the watermark data, and a specific mathematical property corresponding to the watermark data, wherein the specific mathematical property constrains the check value, calculated from the secret information and the watermark data through a preset mathematical rule, to conform to a preset mathematical characteristic;
a processing unit for identifying the watermark data from the structured data set according to the secret information and the specific mathematical property;
and a statistics unit for counting the proportion result of the watermark data in the structured data set, and determining the ownership relationship between the target object and the structured data set according to the proportion result and the proportion label.
In another aspect, the application discloses an electronic device comprising a processor and a memory;
the memory is used for storing programs;
The processor executes the program to realize the rights verification method of the structured data set or the processing method of the structured data set.
In another aspect, the application discloses a computer readable storage medium storing a program for execution by a processor to implement a method of rights verification for a structured data set or a method of processing a structured data set.
The embodiments of the application have the following advantages: in the rights verification method, processing method, device and medium for a structured data set, the watermark data has an appearance consistent with that of the business data, is difficult to strip by machine or manual means, and can be well hidden in the business data of the structured data set. No data set verifier can distinguish the watermark data from the business data without first obtaining the secret information from the data set owner, so the watermark data has good security and is difficult to exploit maliciously. On the other hand, the application confirms the ownership of the structured data set based on the proportion characteristic formed by the watermark data and the business data, and introduces the secret information, so that the legitimate rights and interests of the data set owner are better protected than when ownership is confirmed directly through the watermark data.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a prior art column watermark;
FIG. 2 is a schematic diagram of a prior art line watermark;
FIG. 3 is a flow chart of a method for verifying ownership of a structured dataset provided in an embodiment of the present application;
FIG. 4 is a schematic flow chart of determining watermark data from structured data according to an embodiment of the present application;
FIG. 5 is a flow chart of a method of processing a structured data set according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a predetermined data format of business data provided in an embodiment of the present application;
FIG. 7 is a flow chart of generation of secret information provided in an embodiment of the present application;
FIG. 8 is a schematic diagram of watermark data inserted in an original dataset provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of a device for verifying ownership of a structured dataset according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The concept of watermarking is common in the field of multimedia copyright. For example, additional data for copyright identification, such as author information, is embedded into multimedia content files such as images, audio and video, in a form either visible or invisible to the human eye, so that the copyright ownership of the content can be determined and the legitimate rights and interests of the author protected. Embedded watermarking techniques are widely used to confirm rights to unstructured data.
In the related art, watermarking techniques for structured data sets fall mainly into two types: column watermarking and row watermarking.
In particular, a column watermark is an extra data field with no (or little) practical meaning, or simply a decorative marker, added to the existing data in format. As shown in FIG. 1, adding a decorative marker to the mobile phone number 12300762185 turns it into { #12300762185# }, identifying "this is my data", where { # and # } form the column watermark. The disadvantage of the column watermark is that the added data fields with no (or little) practical meaning are very easy to distinguish: after other persons or organizations (hereinafter referred to as the data set thief) acquire the structured data, the column watermark is easily identified and stripped by machine means, after which the ownership of the structured data is difficult to establish.
In contrast, the row watermark does not change the data along the column dimension. Instead, multiple groups of forged data are generated on the basis of the original structured data and mixed into the data set, and the structured data is marked by the forged data; in other words, the watermark amounts to inserting entire false data records. As shown in FIG. 2, a set of fake subscribers is inserted into the mobile phone number data set, and their mobile phone numbers use 111xxxxx, a number range that does not actually exist, to identify "this is my data". The disadvantage of the row watermark is that the forged structured data usually differs noticeably from the business data in format or content (for example, practitioners in the industry can easily recognize the number 111xxxxx as belonging to a fake user) and is difficult to blend in completely, so the data set thief can easily remove it.
The embodiments of the application improve on the existing row watermarking technology and provide a rights verification method, a processing method, a device and a medium for a structured data set.
In the embodiments of the application, the data set owner refers to the entity that is legally entitled to the various rights and interests related to the structured data set. In the information age, however, structured data sets are frequently acquired illegally and used maliciously, which seriously affects the information security of data set owners and prevents them from enjoying their legitimate rights. Therefore, the data watermarking technology of the embodiments of the application is needed to protect the information security of the structured data set.
In the embodiment of the application, the data set verification party refers to an object entity which wants to confirm the rights of the structured data set.
In order to solve the difficulty in verifying the rights of the structured data set, as shown in fig. 3, an embodiment of the present application proposes a method for verifying the rights of the structured data set, which can be applied to a data set verifier. Specifically, the method comprises the following steps:
Step 310, obtaining a structured data set, wherein the structured data set comprises a plurality of pieces of structured data, each piece of structured data is business data or watermark data, and the business data and the watermark data meet the same preset data format;
Step 320, obtaining, from a target object to be verified, secret information corresponding to the structured data set, a proportion label of the watermark data, and a specific mathematical property corresponding to the watermark data, wherein the specific mathematical property constrains the check value, calculated from the secret information and the watermark data through a preset mathematical rule, to conform to a preset mathematical characteristic, and the probability that a check value calculated through the preset mathematical rule from the secret information and any data satisfying the preset data format conforms to the preset mathematical characteristic is smaller than a first threshold;
Step 330, identifying the watermark data from the structured data set according to the secret information and the specific mathematical property;
Step 340, counting the proportion result of the watermark data in the structured data set, and determining the ownership relationship between the target object and the structured data set according to the proportion result and the proportion label.
In the embodiment of the application, a rights verification method of a structured data set is provided, and the method can be applied to a data set verifier. Specifically, first, a structured dataset that needs to be subject to rights verification may be acquired and an object to be verified may be determined, where in the embodiment of the present application, the object is marked as a target object. The target object may be an object that claims to hold a structured dataset that requires rights verification, or an object that provides the structured dataset to the dataset verifier, which is not limiting in embodiments of the present application.
In the embodiment of the application, the obtained structured data set includes a plurality of pieces of structured data that satisfy the same predetermined data format; for example, every piece of structured data has the same number of fields, and corresponding fields have the same data format. Each piece of structured data in the structured data set is either business data or watermark data. Business data is normal, real data and may include, for example, fields such as mobile phone numbers and identity card numbers; watermark data is structured false data. In the embodiment of the application, the business data and the watermark data satisfy the same preset data format and are indistinguishable in format and content. In other words, the watermark data carries no visible mark (in contrast to the example above, in which the synthesized mobile phone numbers are set to begin with 111, a prefix that does not actually exist), and the watermark data in the embodiment of the application has no warning effect either. Thus, without the corresponding secret information, no one can determine whether a piece of structured data is business data or watermark data, i.e., no one can distinguish business data from watermark data.
The secret information corresponding to the structured data set, the proportion label of the watermark data, and the specific mathematical property of the watermark data may then be obtained from the target object. In the embodiment of the application, for each structured data set whose ownership needs to be confirmable, the data set owner selects in advance the secret information, the proportion label of the watermark data and the specific mathematical property of the watermark data. The secret information must be kept secret and is disclosed to a data set verifier (such as a supervisory department or a law enforcement agency) only when necessary. Based on the secret information, the data set owner can generate watermark data having the specific mathematical property. A piece of watermark data is indistinguishable in appearance from a piece of business data, but among all possible data satisfying the preset data format, only a small proportion of synthesized data satisfies the specific mathematical property preset by the data set owner and thus qualifies as watermark data. Therefore, the data set owner can control the proportion of watermark data inserted in the structured data set and record this proportion as the proportion label. After the data set verifier obtains the secret information, it can use the secret information to identify each piece of watermark data, and then judge whether the data set belongs to the target object by checking the proportion in which watermark data occurs in the structured data set.
In the embodiment of the application, the specific mathematical property corresponding to the watermark data means that the check value calculated from the secret information and the watermark data through the preset mathematical rule conforms to the preset mathematical characteristic, that is, the check value exhibits a specific pattern or feature in the mathematical sense. The check value can be calculated from the secret information and the watermark data in various ways, for example by using suitable cryptographic algorithms. Illustratively, a message authentication code (Message Authentication Code), produced by a specialized class of algorithms in cryptography, may be used as the check value. Such an algorithm takes a key and data as input (the key being selected in advance and kept secret; both may be readable strings or arbitrary bit strings) and outputs a data fingerprint of standard length (e.g. 256 bits), also known as a data digest or message authentication code, that has the form of a random number. Although the construction of a message authentication code is not complex, its output is unpredictable and can only be determined once both the key and the data are given. The mathematical properties of the message authentication code also include: (a) the data fingerprint is sensitive to both the key and the data, and a change in either input (even of only 1 bit) results in a completely different output; and (b) it is easy to calculate the data fingerprint from the key and the data, but for a key of sufficient strength and data from a sufficiently large sample space, it is impractical to recover the key or the data from the data fingerprint.
In the embodiment of the application, when a message authentication code algorithm is used, secret information can be used as a secret key, structured data can be used as data, and a check value corresponding to each piece of structured data can be calculated through the algorithm.
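As a minimal sketch of this step, the check value of one record can be computed with HMAC-SHA256 from Python's standard library. HMAC-SHA256 is only one possible message authentication code; the choice of algorithm, the comma-joined serialization of a record, and the example key and record are assumptions made for illustration.

```python
import hashlib
import hmac


def check_value(secret: bytes, record: str) -> bytes:
    """Return a 256-bit check value: key = secret information, data = one record."""
    return hmac.new(secret, record.encode("utf-8"), hashlib.sha256).digest()


if __name__ == "__main__":
    secret = b"hypothetical-owner-secret"      # kept secret by the data set owner
    record = "12300762185,some other field"    # one serialized piece of structured data
    tag = check_value(secret, record)
    print(len(tag) * 8, "bits:", tag.hex())
    # Changing the key or the record changes the output unpredictably.
    print(check_value(secret, record + "x").hex())
```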
In the embodiment of the application, the check value calculated from the secret information and the watermark data conforms to the preset mathematical characteristic, and the specific form of the characteristic is not limited. Illustratively, taking the message authentication code as the check value: in some embodiments, the check value corresponding to the watermark data starts with 16 consecutive 1 bits, i.e. the two bytes ffff in hexadecimal notation; in some embodiments, it ends with 20 consecutive 0 bits; and in some embodiments, its second and penultimate bytes are both hexadecimal aa. Since the message authentication code has the form of a fixed-length random number, each of these examples is a small-probability event, so the specific mathematical property restricts only a small portion of the synthesized data to be watermark data. In other words, the probability that the check value calculated through the preset mathematical rule from the secret information and any data satisfying the preset data format conforms to the preset mathematical characteristic is smaller than a first threshold. The first threshold may be determined according to the total amount of data in the preset data format and the actual requirements; in general, it should be as small as possible to reduce possible interference with normal business data. For example, the first threshold may be set to 2^-10. In the embodiment of the application, the size of the first threshold is not limited.
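The three example characteristics can be written as simple predicates over the check value bytes. The sketch below also lists the probability of each under the idealized assumption that the check value behaves like a uniformly random bit string, and compares it with the example first threshold of 2^-10. The function names are hypothetical.

```python
def starts_with_16_one_bits(tag: bytes) -> bool:
    # First two bytes are ff ff.
    return tag[:2] == b"\xff\xff"


def ends_with_20_zero_bits(tag: bytes) -> bool:
    # Last two bytes are zero (16 bits) and so is the low nibble before them (4 bits).
    return tag[-2:] == b"\x00\x00" and (tag[-3] & 0x0F) == 0


def second_and_penultimate_bytes_are_aa(tag: bytes) -> bool:
    return tag[1] == 0xAA and tag[-2] == 0xAA


# Probability of each event for a uniformly random check value (idealized model).
PROBABILITY = {
    "starts_with_16_one_bits": 2 ** -16,
    "ends_with_20_zero_bits": 2 ** -20,
    "second_and_penultimate_bytes_are_aa": 2 ** -16,
}
FIRST_THRESHOLD = 2 ** -10  # example value from the text

if __name__ == "__main__":
    for name, p in PROBABILITY.items():
        print(f"{name}: p = {p:.2e}, below first threshold: {p < FIRST_THRESHOLD}")
```

Under this model, a proportion label of 5% exceeds each of these probabilities by more than three orders of magnitude.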
It may be appreciated that, in the embodiment of the present application, based on the specific mathematical property that has been set, once the data set verifier obtains the secret information through the authorization of the target object (and only then), it can check piece by piece whether the structured data satisfies the specific mathematical property, and thereby identify whether each piece of structured data is watermark data. Accordingly, the legitimate data set owner can reveal to the data set verifier the watermark data that appears to be business data and is intermixed in the structured data set, and prove its rights to the structured data set based on the proportion in which watermark data occurs in the structured data set. A key effect of this process is that only the data set owner and authorized data set verifiers can distinguish the business data from the watermark data in the structured data set, so only they can count the proportion result of the watermark data. If the target object is not the legitimate owner of the data set, it cannot know the correct secret information. If the target object cannot provide any secret information, it can be determined not to be the legitimate owner of the structured data set. If the target object provides wrong secret information, the check values calculated from the wrong secret information and the structured data cannot correctly identify, based on the specific mathematical property, whether each piece of structured data is watermark data; the resulting proportion result will differ greatly from the correct proportion label, so the target object can likewise be determined not to be the legitimate owner of the structured data set.
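The verifier's side of this process can be sketched end to end as follows. As before, HMAC-SHA256 as the preset mathematical rule, "starts with a run of 1 bits" as the preset mathematical characteristic, the absolute difference as the difference value, and all function and parameter names are assumptions for illustration rather than the application's definitive procedure.

```python
import hashlib
import hmac


def is_watermark(secret: bytes, record: str, one_bits: int = 16) -> bool:
    """True if the record's check value starts with `one_bits` consecutive 1 bits."""
    digest = hmac.new(secret, record.encode("utf-8"), hashlib.sha256).digest()
    tag = int.from_bytes(digest, "big")
    return tag >> (256 - one_bits) == (1 << one_bits) - 1


def verify(dataset: list, secret: bytes, ratio_label: float,
           second_threshold: float, one_bits: int = 16) -> bool:
    """Decide whether the party that supplied `secret` owns `dataset`."""
    if not secret or not dataset:
        # No secret information provided (or nothing to check): ownership not shown.
        return False
    # Check every record one by one and count those identified as watermark data.
    hits = sum(is_watermark(secret, record, one_bits) for record in dataset)
    ratio_result = hits / len(dataset)
    # Ownership is confirmed only if the observed proportion matches the label.
    return abs(ratio_result - ratio_label) < second_threshold
```

With a wrong secret, each record passes the check only with the small background probability (2^-16 for the 16-bit characteristic), so the proportion result stays near zero and falls far from a label such as 5%.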
Of course, it should be noted that, in the embodiment of the present application, in order to more accurately and clearly determine the rights of the target object to the structured data set, generally, the owner of the data set needs to add a certain amount of watermark data into the structured data set, so that the proportion of the watermark data far exceeds the probability that the conventional verification value calculated by satisfying the data and the secret information in the predetermined data format accords with the preset mathematical characteristic. I.e. the value of the proportional flag is much larger than the first threshold, it is ensured that the structured data set is obviously distinguishable from the watermark data processed, and in the embodiment of the application, the size of the proportional flag is not particularly limited, and may be set to 5% for example.
It can be understood that compared with the similar technology, the application has the following four main characteristics:
1. The method can be used for proving the rights of all parties of the data set to the structured data set, even if fields in the structured data set have 'non-modifiable data' such as identification card numbers, mobile phone numbers and the like;
2. Similar to some watermark-related algorithms, the application involves cryptographic techniques, but it is not limited to a specific cryptographic algorithm: any cryptographic algorithm that can generate a check value for structured data based on secret information (such as a message authentication code or a deterministic digital signature) can serve as the underlying algorithm of the preset mathematical rule;
3. Unlike image watermarks or audio and video watermarks, the application does not perform copyright identification such as content tracing by extracting information such as the producer's identity from the data. Instead, it uses the secret information to check, one by one, whether the check value of each piece of structured data in the structured data set satisfies the specific mathematical property, thereby identifying the watermark data, and determines the rights to the structured data set based on the proportion result of the watermark data in the structured data set;
4. When the target object proves its claimed rights to the structured data to a data set verifier (such as a regulatory authority or a law enforcement agency), the verifier must be authorized by the target object to obtain the secret information corresponding to the structured data; an unauthorized party cannot carry out the verification.
In some embodiments, referring to fig. 4, said identifying said watermark data from said structured dataset according to said secret information and said particular mathematical property comprises:
Calculating the secret information and the structured data through the preset mathematical rule to obtain a first check value;
Judging whether the first check value accords with the preset mathematical characteristic according to the specific mathematical property;
and if the first check value accords with the preset mathematical characteristic, determining the structured data as watermark data.
In the embodiment of the application, when watermark data is determined from structured data, secret information and the structured data can be used for calculation through preset mathematical rules to obtain a check value, and the check value is recorded as a first check value. Then, according to the specific mathematical property, it may be determined whether the first check value meets the preset mathematical characteristic, if the first check value meets the preset mathematical characteristic, it may be determined as watermark data, and, conversely, if the first check value does not meet the preset mathematical characteristic, it may be determined as service data.
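The identification step above, together with the subsequent counting of the proportion result, can be sketched as follows. This is only an illustration under stated assumptions: HMAC-SHA-256 stands in for the preset mathematical rule, the preset mathematical characteristic is passed in as a predicate, and all function names are hypothetical.

```python
import hashlib
import hmac
from typing import Callable, Iterable


def first_check_value(secret: bytes, record: bytes) -> bytes:
    # Assumed preset mathematical rule: HMAC-SHA-256 keyed by the secret information.
    return hmac.new(secret, record, hashlib.sha256).digest()


def is_watermark(secret: bytes, record: bytes,
                 has_feature: Callable[[bytes], bool]) -> bool:
    # A record is watermark data if its check value has the preset characteristic;
    # otherwise it is treated as service data.
    return has_feature(first_check_value(secret, record))


def watermark_proportion(secret: bytes, records: Iterable[bytes],
                         has_feature: Callable[[bytes], bool]) -> float:
    # Proportion result: share of records identified as watermark data.
    items = list(records)
    if not items:
        return 0.0
    hits = sum(is_watermark(secret, r, has_feature) for r in items)
    return hits / len(items)
```

A verifier would call `watermark_proportion` with the secret information revealed by the target object and a predicate such as "the check value starts with the bytes ff ff".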
In some embodiments, the determining the ownership of the target object and the structured dataset based on the scale result and the scale label comprises:
calculating a difference value between the proportional result and the proportional label;
and if the difference value is smaller than a second threshold value, determining that the target object is the ownership party of the structured dataset.
In the embodiment of the application, when the rights relation between the target object and the structured data set is determined according to the proportion result and the proportion label, the service data may have the condition of conforming to specific mathematical properties, and the structured data set may also have the condition of being modified, added and deleted by a data set stealer and the like. Thus, there may be cases where the ratio results and ratio labels are not exactly identical. In the embodiment of the application, the difference value between the proportional result and the proportional label can be calculated, and the difference value can be flexibly set according to the requirement. For example, in some embodiments, the difference between the scale result and the scale label may be calculated and then the absolute value of the difference may be determined as the difference value, and in some embodiments, the ratio of the absolute value of the difference to the scale label (or the scale result) may also be calculated and the ratio may be determined as the difference value. It will be appreciated that in the embodiment of the present application, the larger the difference value between the scale result and the scale label, the less likely the target object is the owner of the structured dataset, and the smaller the difference value between the scale result and the scale label, the more likely the target object is the owner of the structured dataset. In the embodiment of the application, a threshold value can be set and recorded as a second threshold value, if the calculated difference value is small and is within the second threshold value, the target object can be determined to be the ownership party of the structured data set, and if the calculated difference value is large and is larger than or equal to the second threshold value, the target object can be determined not to be the ownership party of the structured data set.
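The two ways of computing the difference value described above, and the comparison with the second threshold, can be sketched as below. The function and parameter names are hypothetical; the relative variant here divides by the proportion label, which is one of the two options the text mentions.

```python
def difference_value(result: float, label: float, relative: bool = False) -> float:
    """Absolute difference, or the difference relative to the proportion label."""
    diff = abs(result - label)
    if not relative:
        return diff
    if label == 0:
        raise ValueError("proportion label must be non-zero for the relative form")
    return diff / label


def is_owner(result: float, label: float, second_threshold: float,
             relative: bool = False) -> bool:
    # Owner only if the difference value is strictly below the second threshold;
    # a value greater than or equal to the threshold rejects the claim.
    return difference_value(result, label, relative) < second_threshold
```

For example, with a proportion label of 0.05 and a second threshold of 0.01, a measured proportion of 0.048 is accepted, while a measured proportion close to zero (as would result from wrong secret information) is rejected.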
Referring to fig. 5, in an embodiment of the present application, there is further provided a method for processing a structured data set, where the method may be used to generate a structured data set with watermark data, and may be used by all parties of the data set. Specifically, the processing method comprises the following steps:
step 510, acquiring an original data set and rights label information, wherein the original data set is used for storing structured data, the structured data meets a preset data format, the rights label information comprises secret information, a proportion label and specific mathematical properties, the specific mathematical properties are used for constraining a check value obtained through calculation by using the secret information and watermark data to meet preset mathematical characteristics, and the probability that any one of the data meeting the preset data format and the check value obtained through calculation by using the secret information meets the preset mathematical characteristics is smaller than a first threshold value, wherein the proportion label is larger than the first threshold value;
Step 520 of determining watermark data from data satisfying said predetermined data format based on said secret information and said specific mathematical property;
Step 530, determining the target number of watermark data to be added to the original data set according to the number of service data contained in the original data set and the proportion label;
step 540, adding the target number of watermark data to the original data set to obtain a target data set.
In an embodiment of the application, a method for processing a structured dataset is provided, which can be used for all parties of the dataset. In particular, the data set owner may obtain the original data set and the ownership tag information, wherein the original data set is a structured data set that may be used to store related structured data. The structured data satisfies a predetermined data format, and watermark data is searched and screened based on the predetermined data format. In the embodiment of the present application, the specific case of the predetermined data format is not limited. Illustratively, as shown in FIG. 6, when the original data set is used for storing a mobile phone number, it presents a predetermined data format of country code+national destination code+user number, and similarly, when the original data set is used for storing a user address, it presents a predetermined data format of province+city+county/district+town/street+community. In the embodiment of the application, the knowledge of the predetermined data format is beneficial to making watermark data similar to real business data, so that any party cannot judge whether the structured data set is business data or watermark data under the condition of not having specific secret information, namely, any person cannot distinguish the business data from the watermark data.
It should be specifically noted that in the embodiment of the present application, all data in the target data set may be set to be watermark data, and in this case, the obtained original data set may not include any real service data.
In an embodiment of the present application, the rights label information is used to mark the rights to the original data set, and may include secret information, a proportion label, and a specific mathematical property. The meaning of each has been described in the foregoing embodiments and is not repeated here. In the embodiment of the application, the three types of information in the rights label information are generally not coupled to one another, so they can be selected in any order. The secret information must be kept secret and is disclosed only to a specific authorized data set verifier (such as a regulatory authority or a law enforcement agency) when necessary; the other information needs to be disclosed to the data set verifier and may also be published to the general public.
In the embodiment of the application, based on the set rights label information, a piece of candidate data can be selected from the data satisfying the predetermined data format according to a certain strategy (selected randomly or traversed in a certain order) and a check value can be calculated for it through the preset mathematical rule. If the calculation result satisfies the selected specific mathematical property, the candidate data is output as watermark data; if it does not, the candidate data is discarded (it is not watermark data). In this way, watermark data can be determined from the data satisfying the predetermined data format. In the embodiment of the present application, the target number of watermark data to be added to the original data set is determined according to the number of pieces of service data contained in the original data set and the proportion label. For example, if the original data set contains 9500 pieces of service data and the proportion label is 5%, 500 pieces of watermark data need to be obtained. Once the specified number of watermark data has been obtained, the watermark data can be mixed into the service data so that the proportion of watermark data in the resulting data set equals the preselected proportion label, which yields the processed target data set.
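The target number and the candidate search described above can be sketched as follows. If n pieces of service data are to end up with watermark proportion p in the final data set, the number of watermark records w satisfies w / (n + w) = p, i.e. w = n·p / (1 − p). The code is an illustration only: it assumes HMAC-SHA-256 as the preset mathematical rule, numeric candidates traversed in ascending order, and a byte-prefix characteristic; all names are hypothetical.

```python
import hashlib
import hmac
from fractions import Fraction
from itertools import count


def target_watermark_count(num_service: int, proportion_label: str) -> int:
    # Solve w / (n + w) = p for w, using exact arithmetic to avoid rounding drift.
    p = Fraction(proportion_label)
    if not 0 < p < 1:
        raise ValueError("this sketch only handles 0 < proportion label < 1")
    return round(num_service * p / (1 - p))


def find_watermarks(secret: bytes, start: int, how_many: int,
                    prefix: bytes) -> list[str]:
    # Traverse candidates in ascending order and keep those whose check value
    # starts with the required prefix (b"\xff\xff" would mean 16 leading 1 bits).
    found: list[str] = []
    if how_many <= 0:
        return found
    for n in count(start):
        candidate = str(n)
        tag = hmac.new(secret, candidate.encode("utf-8"), hashlib.sha256).digest()
        if tag.startswith(prefix):
            found.append(candidate)
            if len(found) == how_many:
                return found
    return found


print(target_watermark_count(9500, "0.05"))  # 500
```

A shorter prefix such as `b"\xff"` (probability 2^-8) keeps a demonstration fast; a 16-bit prefix needs about 65536 candidates per watermark record on average.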
In the embodiment of the application, watermark data cannot be distinguished from business data in appearance, and only a small proportion of the total possible data can meet specific mathematical properties to become watermark data. This means that a large amount of data needs to be computed one by one to find out which relatively few watermark data satisfy a specific mathematical property. The specific searching process can be selected randomly or the data can be traversed by a certain condition, and the application is not limited to the specific searching process.
In some embodiments, referring to FIG. 7, obtaining the ownership tag information includes:
Acquiring association information corresponding to the original data set, wherein the association information is used for representing rights of the original data set;
And generating the secret information according to the association information.
In some cases, to make the rights to the relevant data set easier to establish, the embodiment of the application can also generate the secret information from information with natural-language meaning when processing the data set, which facilitates any subsequent ownership proof. Specifically, in the embodiment of the present application, association information corresponding to the original data set may be obtained, where the association information characterizes the rights to the original data set, for example "for the exclusive use of company XXX". Secret information may then be generated from the association information: the association information may be used directly as the secret information, or may be processed first and then used as the secret information, which is not limited in the present application.
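One possible way of "processing" the association information before using it as secret information is sketched below. The approach (appending a random hexadecimal suffix so that the secret keeps its readable ownership statement but is hard to guess) is an assumption made for illustration and is not prescribed by the application; the function name and the separator are hypothetical.

```python
import secrets


def make_secret_information(association_info: str, random_bytes: int = 16) -> str:
    # Keep the human-readable ownership statement and append an
    # unpredictable suffix from the operating system's CSPRNG.
    suffix = secrets.token_hex(random_bytes)
    return f"{association_info}|{suffix}"


secret_information = make_secret_information("for the exclusive use of company XXX")
```

The owner would need to store the full generated string, since the suffix cannot be recomputed later.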
In some embodiments, said adding said target amount of said watermark data to said original data set resulting in a target data set comprises:
determining the target number of insertion locations in the raw dataset;
And adding each piece of watermark data to one insertion position in the original data set to obtain a target data set.
In the embodiment of the application, adding the target number of watermark data to the original data set can be implemented by mixing. As shown in fig. 8, mixing in the embodiment of the present application means inserting watermark data relatively uniformly among the original service data, so that it is impossible to tell whether a piece of data is service data or watermark data from its position in the data set (its row in the data table). Specifically, a target number of insertion positions may be determined in the original data set, and each piece of watermark data is then added at one of the insertion positions, resulting in the target data set. For example, if the original data set contains 9500 pieces of service data and 500 pieces of watermark data are inserted at random, the resulting structured data set contains 10000 pieces of data in total, of which 5% are watermark data, but the positions at which the watermark data appears follow no pattern and cannot be inferred. In this way, the security of the rights to the structured data set can be improved.
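The mixing step can be sketched as choosing the target number of slots uniformly at random among the rows of the final data set and filling those slots with watermark data, while the service data keeps its original relative order. This is one possible realization, not the application's prescribed one; the function name and the optional `rng` parameter are hypothetical.

```python
import random
from typing import Optional, Sequence


def mix_in_watermarks(service: Sequence[str], watermarks: Sequence[str],
                      rng: Optional[random.Random] = None) -> list[str]:
    rng = rng or random.SystemRandom()
    total = len(service) + len(watermarks)
    # Insertion positions: one slot in the final table per watermark record.
    slots = set(rng.sample(range(total), len(watermarks)))
    marks = list(watermarks)
    rng.shuffle(marks)  # avoid leaking the order in which watermarks were found
    service_iter, mark_iter = iter(service), iter(marks)
    return [next(mark_iter) if i in slots else next(service_iter)
            for i in range(total)]
```

With 9500 service records and 500 watermark records, the result has 10000 rows, 5% of which are watermark data at unpredictable positions.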
The following describes and illustrates the technical scheme of the present application in connection with a specific application scenario example.
The embodiment of the application is applicable both to the case in which the service data contains only a single field and to the case in which the service data contains a plurality of fields; one embodiment of each is described below. The examples are intended to illustrate the concepts and mathematical calculations involved in the present application and are not real-world cases (for example, fictitious mobile phone numbers in the 123 segment are used). To simplify the description and make the calculations in the embodiments easy to check, the following technical conventions apply to all of the examples below:
Character strings are encoded according to the internationally used UTF-8 rule. For example, the string "数学" ("mathematics"), which consists of two Chinese characters, is encoded as an array of 6 bytes whose hexadecimal representation is e695b0e5ada6. For compatibility with most programming languages, watermark data verification adopts the internationally widely used message authentication code HMAC-SHA-256, whose result is an array of 32 bytes; the related test vectors are given in IETF RFC 4231.
The technical convention above is merely to better describe the embodiments and is in no way meant to limit the application in any way in terms of generality. In practical application, the application is not limited to specific codes and specific cryptographic algorithms, for example, chinese standard GB 18030 and the like can be adopted for character codes. The underlying message authentication code of the mathematical nature of the watermark data may be in the form of a Chinese standard HMAC-SM3 or CMAC-SM4, or an international SHA-3 based KMAC, or the like.
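The two conventions above can be checked with the Python standard library. The snippet takes "数学" as the two-character string whose UTF-8 encoding matches the hexadecimal value given above, and uses test case 2 of RFC 4231 (key "Jefe") to confirm that an HMAC-SHA-256 implementation behaves as expected.

```python
import hashlib
import hmac

# UTF-8 encoding of a two-character Chinese string: 3 bytes per character.
encoded = "数学".encode("utf-8")
print(len(encoded), encoded.hex())  # 6 e695b0e5ada6

# RFC 4231, test case 2: key "Jefe", data "what do ya want for nothing?".
tag = hmac.new(b"Jefe", b"what do ya want for nothing?", hashlib.sha256).hexdigest()
print(tag)
# 5bdcc146bf60754e6a042426089575c75a003f089d2739839dec58b964ec3843
```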
Embodiment one: providing ownership proof for a data set that contains only a single field.
Suppose that a batch of mobile phone numbers in the 123 segment is used in a test project conducted by a carrier in February 2024. To tell them apart, the company uses only mobile phone numbers whose message authentication code starts with 16 consecutive 1 bits, i.e. with hexadecimal ffff, and this batch of numbers is never reassigned to normal services. In other words, a mobile phone number whose message authentication code starts with ffff is watermark data (the secret information required to calculate the message authentication code is known only to the company itself). The company uses only such mobile phone numbers in the test project, meaning that it generates a structured data set in which the proportion of watermark data is 100%. According to the method for processing structured data sets provided by the application, the company, as the data set owner, can generate a data set consisting entirely of watermark data by traversing upward from 12300000000: 12300023180, 12300034919, 12300078978, 12300088650, 12300393151, 12300421487, 12300600146, 12300814686, 12300857998, 12301037953, ...
Assume that the mobile phone numbers are exposed to the outside after the project is carried out. To allay possible concerns, the company declares to the regulatory authority that all of the mobile phone numbers are numbers reserved for testing (rather than numbers assigned to real personal users), and reveals to the regulatory authority (and only to it) the pre-selected secret information used for calculating the message authentication codes, "China XX Co., Ltd., February 2024, test only" (this key is for illustration only; real secret information needs to be a value that cannot be guessed).
The regulatory authority, as the data set verifier, encodes the secret information and the mobile phone numbers as character strings in UTF-8 and substitutes them into the HMAC-SHA-256 formula to calculate the message authentication codes shown in Table 1 below (only 10 pieces of data are shown owing to limited space):
TABLE 1
The company also informs the regulatory authority that the specific mathematical property of the watermark data is that the resulting message authentication code starts with 16 consecutive 1 bits. For any piece of random data, the probability of satisfying this property is 2^-16, i.e. the probability that a piece of data happens to be watermark data is only about 1.5 in 100,000. As shown in Table 1, the regulatory authority verifies that all of the exposed mobile phone numbers are watermark data, so it is reasonable to believe that these numbers are test numbers pre-selected by the company, rather than numbers assigned to real personal users and then leaked in a security incident. If a set of data had been leaked through a security incident, it would not be feasible to find, after the fact, secret information under which all of the data satisfies the specific mathematical property (and is thus verified as watermark data). Moreover, the secret information provided by the company, "China XX Co., Ltd., February 2024, test only", itself states the rights to the data set, which further supports the conclusion that the structured data set consists of pre-selected test numbers.
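The probabilities in this embodiment can be made explicit. The worked figures below assume that, for a key chosen independently of the data, each message authentication code behaves like an independent, uniformly random 256-bit string; this modeling assumption is standard for HMAC but is not stated in the application, and N denotes the number of leaked records.

```latex
\begin{align*}
\Pr[\text{one record passes by chance}] &= 2^{-16} \approx 1.53 \times 10^{-5} \\
\Pr[\text{all } N \text{ records pass by chance}] &= \left(2^{-16}\right)^{N} = 2^{-16N} \\
N = 10:\quad 2^{-160} &\approx 6.8 \times 10^{-49} \\
\mathbb{E}[\text{candidates tried per watermark record}] &= 2^{16} = 65536
\end{align*}
```

Even for the 10 records shown in Table 1, the chance that unrelated numbers all satisfy the property under a given key is negligible, and the owner's search cost of roughly 65536 trials per number remains practical.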
Embodiment two: providing ownership proof for a data set that contains a plurality of fields.
When verifying data containing multiple fields, all key fields need to participate in the calculation (to determine whether the data satisfies the specific mathematical property). When the data set owner generates watermark data, all key fields participate in the calculation that produces the authentication information (the HMAC-SHA-256 message authentication code in this embodiment), and one or more of the key fields may be synthetic values as the case requires. A simple way to let all key fields participate in the calculation is to concatenate the values of all key fields directly as character strings and then encode them (UTF-8 encoding in this embodiment).
Assume that Company A and Company B are competing home electronics companies, and that customer data of Company A (comprising at least 3 fields: identification card number, mobile phone number, name) has long been stolen by Company B through illegal means. Company A wants to provide ownership proof for its customer data set, so each quarter it selects a piece of secret information, generates watermark data for that quarter's transaction customer data set, and inserts it at a proportion of 5% (1 piece of watermark data for every 19 pieces of service data, with randomly selected insertion positions). Without the secret information, Company B cannot recognize the watermark data and does not even know that it exists.
Assume that Table 2 below is a sample of the watermark data (each field may be synthetic) that Company A inserted into its customer data set in the first quarter of 2025. The fields of each row can be concatenated and then encoded, in the simple way described above, to calculate the message authentication code:
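The simple way of letting all key fields participate in the calculation can be sketched as follows: the field values are concatenated as strings, UTF-8 encoded, and passed to HMAC-SHA-256. The function name and the sample values are hypothetical.

```python
import hashlib
import hmac
from typing import Sequence


def record_check_value(secret: str, fields: Sequence[str]) -> bytes:
    # Direct string concatenation of all key fields, then UTF-8 encoding.
    joined = "".join(fields)
    return hmac.new(secret.encode("utf-8"), joined.encode("utf-8"),
                    hashlib.sha256).digest()


# Hypothetical record: identification card number, mobile phone number, name.
tag = record_check_value("example secret",
                         ["110101190212280049", "12300762185", "Example Name"])
```

One design consideration: plain concatenation does not record field boundaries, so `["ab", "c"]` and `["a", "bc"]` yield the same check value. Fixed-width fields such as identification card numbers and mobile phone numbers limit this ambiguity; a delimiter or length prefix would remove it but is not part of the scheme as described.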
TABLE 2
Identification card number    Mobile phone number    Name
110101190212280049            12300762185            Liu Laoda A
110102190112290078            12303833627            Wang Xiaoer A
110103190012300255            12300144692            Zhang San
11010418991231081X            12301137414            Liwu four-element bag
When a law enforcement agency investigates Company B's illegal act, Company A reveals to the agency that the secret information corresponding to a certain data set is "China Company A Co., Ltd., first quarter of 2025, test only", and informs it that the specific mathematical property of the watermark data is that the resulting message authentication code ends with 20 consecutive 0 bits. For any piece of random data, the probability of satisfying this property is 2^-20, i.e. the probability that a piece of random data happens to be watermark data is less than 1 part per million.
After being authorized to obtain the secret information, the law enforcement agency verifies the data set and finds that about 5% of the data is watermark data satisfying the specific mathematical property claimed by Company A, for example as shown in Table 3 below:
TABLE 3
It will be appreciated that if Company B had not stolen Company A's data, the probability of Company A's watermark data appearing in the data set would be less than 1 part per million. Since Company A's watermark data accounts for about 5% of the data set, the law enforcement agency confirms that the rights to this batch of data belong to Company A, and accepts Company A's assertion that its data was stolen by Company B.
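The gap between chance matches and the observed proportion can be quantified with a short calculation. The data set size N = 10000 is a hypothetical figure chosen for illustration, and the check values of unrelated records are modeled as independent and uniformly random, which is an assumption rather than a statement from the application.

```latex
\begin{align*}
p &= 2^{-20} \approx 9.54 \times 10^{-7} \\
\mathbb{E}[\text{chance matches among } N = 10000 \text{ records}] &= N p \approx 9.5 \times 10^{-3} \\
\Pr[\text{at least one chance match}] &\le N p \approx 0.0095 \\
\text{observed matches at a } 5\% \text{ proportion} &\approx 0.05\,N = 500
\end{align*}
```

Under these assumptions, an unrelated data set of this size would most likely contain no matching record at all, whereas the seized data set contains about 500.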
It can be appreciated that in the embodiment of the present application, watermark data is hidden in service data, so that the watermark data and the service data cannot be distinguished in aspects of format, content, and the like. Without specific secret information, anyone cannot determine whether the data is business data or watermark data. The application confirms the rights of the structured data set based on the proportion characteristic formed by the watermark data and the service data, and introduces the secret information, so that the rights and interests of all parties of the data set can be better protected compared with the way of directly confirming the rights and interests through the watermark data.
The rights verification apparatus for structured data sets according to embodiments of the present application is described below with reference to the accompanying drawings.
Referring to fig. 9, a rights verification apparatus for a structured data set according to an embodiment of the present application includes:
a first obtaining unit 910, configured to obtain a structured data set, where the structured data set includes a plurality of pieces of structured data, each piece of structured data is service data or watermark data, and the service data and the watermark data satisfy the same predetermined data format;
The second obtaining unit 920 is configured to obtain, from a target object to be verified, secret information corresponding to the structured data set, a proportion label of the watermark data, and a specific mathematical property corresponding to the watermark data, where the specific mathematical property is used to constrain the check value calculated from the secret information and the watermark data according to a preset mathematical rule to conform to a preset mathematical characteristic, and the probability that the check value calculated, according to the preset mathematical rule, from any data satisfying the predetermined data format and the secret information conforms to the preset mathematical characteristic is smaller than a first threshold;
A processing unit 930 for identifying the watermark data from the structured dataset based on the secret information and the specific mathematical property;
And a statistics unit 940, configured to count a proportion result of the watermark data in the structured data set, and determine an ownership relationship between the target object and the structured data set according to the proportion result and the proportion label.
It can be understood that the content in the above method embodiment is applicable to the embodiment of the present device, and the specific functions implemented by the embodiment of the present device are the same as those of the embodiment of the above method, and the achieved beneficial effects are the same as those of the embodiment of the above method.
Referring to fig. 10, an embodiment of the present application provides an electronic device including:
at least one processor 1010;
At least one memory 1020 for storing at least one program;
the at least one program, when executed by the at least one processor 1010, causes the at least one processor 1010 to implement a method of ownership verification of the structured data set or a method of processing the structured data set.
Similarly, the content in the above method embodiment is applicable to the present electronic device embodiment, and the functions specifically implemented by the present electronic device embodiment are the same as those of the above method embodiment, and the beneficial effects achieved by the present electronic device embodiment are the same as those achieved by the above method embodiment.
The embodiment of the present application also provides a computer readable storage medium, in which a program executable by the processor 1010 is stored, where the program executable by the processor 1010 is used to perform the ownership verification method of the structured data set or the processing method of the structured data set described above when executed by the processor 1010.
Similarly, the content in the above method embodiment is applicable to the present computer-readable storage medium embodiment, and the functions specifically implemented by the present computer-readable storage medium embodiment are the same as those of the above method embodiment, and the beneficial effects achieved by the above method embodiment are the same as those achieved by the above method embodiment.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present application are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the application is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present application. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the application as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the application, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method of the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include an electrical connection (an electronic device) having one or more wires, a portable computer diskette (a magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium may even be paper or other suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, the steps or methods may be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having appropriate combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the foregoing description of the present specification, reference to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the spirit and scope of the application as defined by the appended claims and their equivalents.
While the preferred embodiment of the present application has been described in detail, the present application is not limited to the embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present application, and the equivalent modifications or substitutions are intended to be included in the scope of the present application as defined in the appended claims.

Claims (10)

1. A method for verifying ownership of a structured data set, comprising the steps of:
obtaining a structured data set, wherein the structured data set comprises a plurality of pieces of structured data, each piece of structured data is business data or watermark data, and the business data and the watermark data satisfy the same preset data format;
obtaining, from a target object to be verified, secret information corresponding to the structured data set, a proportion label of the watermark data, and a specific mathematical property corresponding to the watermark data, wherein the specific mathematical property constrains a check value, calculated from the secret information and the watermark data according to a preset mathematical rule, to conform to a preset mathematical characteristic; the probability that a check value calculated according to the preset mathematical rule from the secret information and any data satisfying the preset data format conforms to the preset mathematical characteristic is smaller than a first threshold; and the proportion label is larger than the first threshold;
identifying the watermark data from the structured data set according to the secret information and the specific mathematical property; and
counting a proportion result of the watermark data in the structured data set, and determining the ownership relationship between the target object and the structured data set according to the proportion result and the proportion label.
2. The method for verifying ownership of a structured data set according to claim 1, wherein identifying the watermark data from the structured data set according to the secret information and the specific mathematical property comprises:
calculating a first check value from the secret information and a piece of the structured data according to the preset mathematical rule;
determining, according to the specific mathematical property, whether the first check value conforms to the preset mathematical characteristic; and
if the first check value conforms to the preset mathematical characteristic, determining that the piece of structured data is watermark data.
3. The method for verifying ownership of a structured data set according to claim 1, wherein determining the ownership relationship between the target object and the structured data set according to the proportion result and the proportion label comprises:
calculating a difference value between the proportion result and the proportion label; and
if the difference value is smaller than a second threshold, determining that the target object is the owner of the structured data set.
4. The method for verifying ownership of a structured data set according to claim 3, wherein calculating the difference value between the proportion result and the proportion label comprises:
calculating the difference between the proportion result and the proportion label, and taking the absolute value of the difference as the difference value;
or calculating the difference between the proportion result and the proportion label, and taking the ratio of the absolute value of the difference to the proportion label as the difference value.
5. A method for processing a structured data set, comprising the steps of:
obtaining an original data set and ownership label information, wherein the original data set is used for storing structured data, the structured data satisfies a preset data format, and the ownership label information comprises secret information, a proportion label, and a specific mathematical property; the specific mathematical property constrains a check value, calculated from the secret information and watermark data according to a preset mathematical rule, to conform to a preset mathematical characteristic; the probability that a check value calculated according to the preset mathematical rule from the secret information and any data satisfying the preset data format conforms to the preset mathematical characteristic is smaller than a first threshold; and the proportion label is larger than the first threshold;
determining watermark data from among data satisfying the preset data format according to the secret information and the specific mathematical property;
determining, according to the number of pieces of business data contained in the original data set and the proportion label, a target number of pieces of watermark data to be added to the original data set; and
adding the target number of pieces of watermark data to the original data set to obtain a target data set.
6. The method for processing a structured data set according to claim 5, wherein obtaining the ownership label information comprises:
acquiring association information corresponding to the original data set, wherein the association information is used for representing the ownership of the original data set; and
generating the secret information according to the association information.
7. The method for processing a structured data set according to claim 5, wherein adding the target number of pieces of watermark data to the original data set to obtain the target data set comprises:
determining the target number of insertion positions in the original data set; and
adding each piece of watermark data at one of the insertion positions in the original data set to obtain the target data set.
8. An ownership verification apparatus for a structured data set, comprising:
a first acquisition unit, configured to acquire a structured data set, wherein the structured data set comprises a plurality of pieces of structured data, each piece of structured data is business data or watermark data, and the business data and the watermark data satisfy the same preset data format;
a second acquisition unit, configured to acquire, from a target object to be verified, secret information corresponding to the structured data set, a proportion label of the watermark data, and a specific mathematical property corresponding to the watermark data, wherein the specific mathematical property constrains a check value, calculated from the secret information and the watermark data according to a preset mathematical rule, to conform to a preset mathematical characteristic;
a processing unit, configured to identify the watermark data from the structured data set according to the secret information and the specific mathematical property; and
a statistics unit, configured to count a proportion result of the watermark data in the structured data set, and to determine the ownership relationship between the target object and the structured data set according to the proportion result and the proportion label.
9. An electronic device comprising a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method of any one of claims 1-7.
10. A computer readable storage medium, characterized in that the storage medium stores a program, which is executed by a processor to implement the method of any one of claims 1-7.
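
By way of illustration only, the processing method of claims 5 to 7 and the verification method of claims 1 to 4 can be sketched in Python as follows. The claims do not fix a concrete rule, characteristic, or data format, so everything concrete here is an assumption made for the sketch: the preset mathematical rule is taken to be HMAC-SHA256 over the record keyed by the secret information; the preset mathematical characteristic is divisibility of the check value by 1024, so that an arbitrary record matches with probability of about 1/1024 (playing the role of the first threshold); the preset data format is a hypothetical "8-digit ID,age" text record; the proportion label is read as the share of watermark data in the final data set; and the difference value is the relative one of claim 4. The names `random_record`, `check_value`, `is_watermark`, `embed` and `verify` are likewise hypothetical.

```python
import hashlib
import hmac
import random

MODULUS = 1024  # assumed characteristic: check value divisible by 1024


def random_record(rng: random.Random) -> str:
    """Draw a record in the assumed preset data format: '8-digit ID,age'."""
    return f"{rng.randrange(10**8):08d},{rng.randrange(18, 90)}"


def check_value(secret: bytes, record: str) -> int:
    """Assumed preset mathematical rule: HMAC-SHA256 read as an integer."""
    digest = hmac.new(secret, record.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


def is_watermark(secret: bytes, record: str) -> bool:
    """Assumed preset mathematical characteristic: divisibility by MODULUS."""
    return check_value(secret, record) % MODULUS == 0


def embed(business, secret, proportion_label, rng):
    """Add watermark records so that they make up roughly proportion_label
    of the resulting data set (assumed reading of the proportion label)."""
    target = round(proportion_label * len(business) / (1 - proportion_label))
    marks = []
    while len(marks) < target:
        candidate = random_record(rng)
        # Keep only format-conforming records whose check value matches.
        if is_watermark(secret, candidate) and candidate not in marks:
            marks.append(candidate)
    dataset = list(business)
    for mark in marks:
        # One randomly chosen insertion position per watermark record.
        dataset.insert(rng.randrange(len(dataset) + 1), mark)
    return dataset


def verify(dataset, secret, proportion_label, second_threshold=0.2):
    """Count matching records and compare the observed proportion with the
    proportion label, using the relative difference value."""
    if not dataset:
        return False
    hits = sum(is_watermark(secret, record) for record in dataset)
    proportion_result = hits / len(dataset)
    difference = abs(proportion_result - proportion_label) / proportion_label
    return difference < second_threshold


if __name__ == "__main__":
    rng = random.Random(2023)
    # Secret derived from a hypothetical owner identifier.
    secret = hashlib.sha256(b"hypothetical owner identifier").digest()
    business = [random_record(rng) for _ in range(2000)]
    marked = embed(business, secret, 0.05, rng)
    print("owner verified:", verify(marked, secret, 0.05))
    print("other party verified:", verify(marked, b"other secret", 0.05))
```

In this sketch the proportion label (0.05) is far larger than the chance that an arbitrary record matches (about 0.001). A party presenting the wrong secret information would therefore be expected to observe a proportion near 0.001 rather than near the label, which is what allows the comparison against the second threshold to separate owners from non-owners.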
CN202311467146.2A 2023-11-06 2023-11-06 Ownership verification methods, processing methods, equipment and media for structured data sets Active CN117521038B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202311467146.2A CN117521038B (en) 2023-11-06 2023-11-06 Ownership verification methods, processing methods, equipment and media for structured data sets
PCT/CN2024/125335 WO2025098109A1 (en) 2023-11-06 2024-10-16 Ownership verification method and processing method for structured data set, device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311467146.2A CN117521038B (en) 2023-11-06 2023-11-06 Ownership verification methods, processing methods, equipment and media for structured data sets

Publications (2)

Publication Number Publication Date
CN117521038A CN117521038A (en) 2024-02-06
CN117521038B true CN117521038B (en) 2025-09-05

Family

ID=89744868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311467146.2A Active CN117521038B (en) 2023-11-06 2023-11-06 Ownership verification methods, processing methods, equipment and media for structured data sets

Country Status (2)

Country Link
CN (1) CN117521038B (en)
WO (1) WO2025098109A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117521038B (en) * 2023-11-06 2025-09-05 中国电信股份有限公司 Ownership verification methods, processing methods, equipment and media for structured data sets
CN119150264A (en) * 2024-11-19 2024-12-17 杭州半云科技有限公司 Data watermark implantation and identification method for lossless data content
CN120277641B (en) * 2025-06-06 2025-08-12 南京信息工程大学 A federated learning copyright protection method and system based on mutual information

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740316A (en) * 2018-12-27 2019-05-10 北京三未信安科技发展有限公司 A kind of insertion of dynamic watermark, verification method and system and dynamic watermark processing system
CN112948895A (en) * 2019-12-10 2021-06-11 航天信息股份有限公司 Data watermark embedding method, watermark tracing method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7539872B2 (en) * 2003-05-23 2009-05-26 Purdue Research Foundation Method and system for rights assessment over digital data through watermarking
CN107832626B (en) * 2017-11-30 2019-09-17 中国人民解放军国防科技大学 Structured data right confirming method oriented to data circulation
US11699209B2 (en) * 2020-10-22 2023-07-11 Huawei Cloud Computing Technologies Co., Ltd. Method and apparatus for embedding and extracting digital watermarking for numerical data
CN116702103A (en) * 2023-06-19 2023-09-05 建信金融科技有限责任公司 Database watermark processing method, database watermark tracing method and device
CN117521038B (en) * 2023-11-06 2025-09-05 中国电信股份有限公司 Ownership verification methods, processing methods, equipment and media for structured data sets

Also Published As

Publication number Publication date
CN117521038A (en) 2024-02-06
WO2025098109A1 (en) 2025-05-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant