CN102929896A - Data mining method based on privacy protection - Google Patents

Info

Publication number
CN102929896A
CN102929896A CN2011102329325A CN201110232932A CN102929896A CN 102929896 A CN102929896 A CN 102929896A CN 2011102329325 A CN2011102329325 A CN 2011102329325A CN 201110232932 A CN201110232932 A CN 201110232932A CN 102929896 A CN102929896 A CN 102929896A
Authority
CN
China
Prior art keywords
data
decision tree
method based
privacy protection
mining method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011102329325A
Other languages
Chinese (zh)
Inventor
丁力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jurong Jintai Science & Technology Park Co Ltd
Original Assignee
Jurong Jintai Science & Technology Park Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jurong Jintai Science & Technology Park Co Ltd
Priority to CN2011102329325A
Publication of CN102929896A
Current legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of data mining and relates in particular to a data mining method based on privacy protection. The method comprises the following steps: transformation of the original real data, in which potential numeric attribute data are discretized and a transition probability matrix is then set for every attribute; generation of a decision tree, in which the transformed data records are counted and the decision tree is generated step by step, recursively, from the transformed training sample data set S, the determined set of splitting attributes, the split points, and the data subset identifiers; and generation of classification rules, in which the generated decision tree is pruned to produce the classification rules. The method is applicable to non-character data and to non-uniformly distributed original data, and it can also transform label attributes; in addition, a classification tree built on the transformed data set has higher accuracy.

Description

Data mining method based on privacy protection
Technical field
The invention belongs to the field of data mining and relates in particular to a data mining method based on privacy protection.
Background art
Today's society is one of information explosion, and the development of the Internet has further intensified the exchange and spread of information. Together these have greatly increased the demand for extracting useful information from large volumes of data. Such data, and the information derived from them, are an asset to every industry because they faithfully record the essential state of its operations. Faced with data on this scale, however, traditional data analysis methods such as data retrieval and statistical analysis can obtain only surface-level information and cannot reach the inherent, deeper information in the data, so managers face the predicament of being rich in data but poor in knowledge. Extracting knowledge that is useful for business decisions from these data is therefore very important.
Summary of the invention
The invention provides a data mining method based on privacy protection, comprising the following steps:
Transformation of the original real data: potential numeric attribute data are discretized, and a transition probability matrix is then set for every attribute;
Generation of a decision tree: at the server side, the transformed data records are counted, and the decision tree is generated step by step, recursively, using the transformed training sample data set S, the determined set of splitting attributes, the split points, and the data subset identifiers;
Generation of classification rules: the decision tree generated above is pruned to produce the classification rules.
The data mining method based on privacy protection of the invention is applicable to non-character data and to non-uniformly distributed original data, and it can also transform label attributes; a classification tree built on the transformed data set has higher accuracy.
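The transformation step described above can be illustrated with a minimal Python sketch. The text does not say how the discretization cut points are chosen or what form the transition probability matrix takes, so everything specific here is an assumption made for illustration: the fixed cut points, the "keep the value with probability p_keep, otherwise switch uniformly to another value" matrix, the schema layout, and the function names discretize, make_transition_matrix, perturb and transform_record. The label attribute is perturbed like any other attribute.

```python
import random
from bisect import bisect_right


def discretize(value, cut_points):
    """Map a numeric value to a bin index, given sorted cut points (assumed)."""
    return bisect_right(cut_points, value)


def make_transition_matrix(k, p_keep):
    """Build a k x k row-stochastic matrix (assumed form, not from the source).

    Entry [i][j] is the probability of reporting value j when the true value
    is i: p_keep on the diagonal, the remainder spread evenly elsewhere.
    """
    if k == 1:
        return [[1.0]]
    off = (1.0 - p_keep) / (k - 1)
    return [[p_keep if i == j else off for j in range(k)] for i in range(k)]


def perturb(value_index, matrix, rng):
    """Sample the reported value index from row `value_index` of the matrix."""
    row = matrix[value_index]
    return rng.choices(range(len(row)), weights=row, k=1)[0]


def transform_record(record, schema, rng):
    """Discretize numeric attributes, then perturb every attribute.

    `schema` is a hypothetical layout: each attribute has a "matrix" plus
    either "cuts" (numeric attribute) or "values" (categorical attribute).
    """
    out = {}
    for name, spec in schema.items():
        raw = record[name]
        if "cuts" in spec:
            index = discretize(raw, spec["cuts"])
        else:
            index = spec["values"].index(raw)
        out[name] = perturb(index, spec["matrix"], rng)
    return out


if __name__ == "__main__":
    rng = random.Random(42)
    schema = {
        "age": {"cuts": [30, 50], "matrix": make_transition_matrix(3, 0.8)},
        "buys": {"values": ["no", "yes"], "matrix": make_transition_matrix(2, 0.9)},
    }
    print(transform_record({"age": 35, "buys": "yes"}, schema, rng))
```

Only the perturbed indices would leave the data owner in this sketch; the server would need to know the matrices, but not the original values, to work with the transformed records.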
Description of drawings
Fig. 1 is a flow chart of the steps of the data mining method based on privacy protection.
Embodiment
The steps of the data mining method based on privacy protection are shown in the flow chart of Fig. 1 and comprise the following:
Transformation of the original real data: potential numeric attribute data are discretized, and a transition probability matrix is then set for every attribute;
Generation of a decision tree: at the server side, the transformed data records are counted, and the decision tree is generated step by step, recursively, using the transformed training sample data set S, the determined set of splitting attributes, the split points, and the data subset identifiers;
Generation of classification rules: the decision tree generated above is pruned to produce the classification rules.
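How the server might use the counted, transformed records to pick a splitting attribute can be sketched as follows. The text does not describe this computation, so the sketch rests on stated assumptions: each attribute and the class label were perturbed independently with a "keep with probability p_keep, otherwise switch uniformly" matrix, the server undoes that perturbation on the joint counts in closed form, and the split is chosen by ID3-style information gain. The names invert_axis, reconstruct_joint, entropy, information_gain and choose_split are hypothetical.

```python
from math import log2


def invert_axis(counts, p_keep):
    """Estimate original counts from perturbed counts along one attribute.

    Assumes observed[j] = true[j] * p_keep + (total - true[j]) * off,
    with off = (1 - p_keep) / (k - 1); requires p_keep != 1 / k.
    """
    k = len(counts)
    if k == 1:
        return list(counts)
    total = sum(counts)
    off = (1.0 - p_keep) / (k - 1)
    return [(c - total * off) / (p_keep - off) for c in counts]


def reconstruct_joint(table, keep_attr, keep_cls):
    """table[a][c] holds observed counts of (attribute value a, class c)."""
    n_a, n_c = len(table), len(table[0])
    # Undo the attribute perturbation, one class column at a time.
    cols = [invert_axis([table[a][c] for a in range(n_a)], keep_attr)
            for c in range(n_c)]
    step = [[cols[c][a] for c in range(n_c)] for a in range(n_a)]
    # Undo the class-label perturbation, one attribute row at a time.
    rows = [invert_axis(row, keep_cls) for row in step]
    # Sampling noise can push estimates below zero; clip them.
    return [[max(x, 0.0) for x in row] for row in rows]


def entropy(counts):
    total = sum(counts)
    if total <= 0:
        return 0.0
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def information_gain(joint):
    """Information gain of splitting on the attribute described by `joint`."""
    class_totals = [sum(col) for col in zip(*joint)]
    total = sum(class_totals)
    if total <= 0:
        return 0.0
    remainder = sum(sum(row) / total * entropy(row) for row in joint)
    return entropy(class_totals) - remainder


def choose_split(tables, keeps, keep_cls):
    """Return the attribute with the highest estimated gain, plus all gains."""
    gains = {name: information_gain(reconstruct_joint(t, keeps[name], keep_cls))
             for name, t in tables.items()}
    return max(gains, key=gains.get), gains


if __name__ == "__main__":
    observed = {"age": [[30, 14], [12, 44]], "noise": [[21, 29], [21, 29]]}
    best, gains = choose_split(observed, {"age": 0.8, "noise": 0.8}, 0.9)
    print(best, gains)
```

A recursive tree builder would call choose_split on each data subset and stop when no attribute yields a useful gain; pruning and rule extraction would then operate on the resulting tree and are not shown here.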

Claims (1)

1. A data mining method based on privacy protection, characterized in that it comprises the following steps:
Transformation of the original real data: potential numeric attribute data are discretized, and a transition probability matrix is then set for every attribute;
Generation of a decision tree: at the server side, the transformed data records are counted, and the decision tree is generated step by step, recursively, using the transformed training sample data set S, the determined set of splitting attributes, the split points, and the data subset identifiers;
Generation of classification rules: the decision tree generated above is pruned to produce the classification rules.
CN2011102329325A 2011-08-13 2011-08-13 Data mining method based on privacy protection Pending CN102929896A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011102329325A CN102929896A (en) 2011-08-13 2011-08-13 Data mining method based on privacy protection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011102329325A CN102929896A (en) 2011-08-13 2011-08-13 Data mining method based on privacy protection

Publications (1)

Publication Number Publication Date
CN102929896A true CN102929896A (en) 2013-02-13

Family

ID=47644695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011102329325A Pending CN102929896A (en) 2011-08-13 2011-08-13 Data mining method based on privacy protection

Country Status (1)

Country Link
CN (1) CN102929896A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103605749A (en) * 2013-11-20 2014-02-26 同济大学 Privacy protection associated rule data digging method based on multi-parameter interference
CN104731976A (en) * 2015-04-14 2015-06-24 海量云图(北京)数据技术有限公司 Method for finding and sorting private data in data table
CN104798043A (en) * 2014-06-27 2015-07-22 华为技术有限公司 Data processing method and computer system
CN113011484A (en) * 2021-03-12 2021-06-22 大商所飞泰测试技术有限公司 Graphical demand analysis and test case generation method based on classification tree and decision tree

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090248682A1 (en) * 2008-04-01 2009-10-01 Certona Corporation System and method for personalized search
CN201859444U (en) * 2010-04-07 2011-06-08 苏州市职业大学 Data excavation device for privacy protection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
葛伟平 等: "基于隐私保护的分类挖掘", 《计算机研究与发展》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103605749A (en) * 2013-11-20 2014-02-26 同济大学 Privacy protection associated rule data digging method based on multi-parameter interference
CN104798043A (en) * 2014-06-27 2015-07-22 华为技术有限公司 Data processing method and computer system
WO2015196476A1 (en) * 2014-06-27 2015-12-30 华为技术有限公司 Data processing method and computer system
US9984336B2 (en) 2014-06-27 2018-05-29 Huawei Technologies Co., Ltd. Classification rule sets creation and application to decision making
CN104731976A (en) * 2015-04-14 2015-06-24 海量云图(北京)数据技术有限公司 Method for finding and sorting private data in data table
CN104731976B (en) * 2015-04-14 2018-03-30 海量云图(北京)数据技术有限公司 The discovery of private data and sorting technique in tables of data
CN113011484A (en) * 2021-03-12 2021-06-22 大商所飞泰测试技术有限公司 Graphical demand analysis and test case generation method based on classification tree and decision tree
CN113011484B (en) * 2021-03-12 2023-12-26 大商所飞泰测试技术有限公司 Graphical demand analysis and test case generation method based on classification tree and judgment tree

Similar Documents

Publication Publication Date Title
Kubina et al. Use of big data for competitive advantage of company
CN103778200B (en) A kind of message information source abstracting method and its system
CN104199972A (en) Named entity relation extraction and construction method based on deep learning
US9928288B2 (en) Automatic modeling of column and pivot table layout tabular data
CN114911870A (en) Fusion management framework for multi-source heterogeneous industrial data
CN106503872A (en) A kind of business process system construction method based on basic business active set
CN102929896A (en) Data mining method based on privacy protection
CN102122280A (en) Method and system for intelligently extracting content object
JP5309543B2 (en) Information search server, information search method and program
CN108280561A (en) A kind of discrete manufacture mechanical product quality source tracing method based on comentropy and Weighted distance
CN120196741A (en) Text event association method based on large language model
CN106557881B (en) A method for building a business process system based on the execution sequence of business activities
CN104636324B (en) Topic source tracing method and system
CN105573972A (en) Report check formula generation method and apparatus
Li et al. Vandalism detection in OpenStreetMap via user embeddings
CN118503429A (en) Knowledge graph-based rapid classifying method and system for scientific and technological achievements
CN109614491B (en) Further mining method based on mining result of data quality detection rule
CN102929888A (en) Data mining method based on web
Qiu et al. How Much Do Women Build Open Source Infrastructure?
CN110955754A (en) A Model Construction Method for Repeated Call Analysis and Recognition
CN117348863B (en) Low-code development method and device for industrial software, electronic equipment and storage medium
Brock Júnior et al. Development of a logical model for geotechnical databases
CN106447165A (en) A heuristic job classification method and device
CN107729346A (en) A kind of new method for excavating the hidden transition of business procedure
CN107480822A (en) A TrieTree-Based Method Forecasting the Development Dynamics of Listed Enterprises

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130213