CN119203244A - A governance method for large model data leakage - Google Patents
- Publication number
- CN119203244A (CN202411686760.2A)
- Authority
- CN
- China
- Prior art keywords
- characteristic
- port
- model training
- data
- training terminal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Computer Hardware Design (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention relates to the field of data management, and in particular to a governance method for large model data leakage. Rule features are screened from instruction response data to establish a characteristic behavior database for a model training terminal. When the model training terminal is accessed, instruction response data acquired in real time is compared with the data in the characteristic behavior database to determine a behavior fit characterization coefficient for the terminal, from which the feature fit category of the current model training terminal is identified so that its interaction can be controlled in time. Under the weak feature fit category, the data transmission rule of each normal feature port is verified to judge whether the corresponding normal feature port should be closed.
Description
Technical Field
The invention relates to the field of data management, in particular to a large-model data leakage management method.
Background
With the development of computer and internet technology, data models under different architectures are widely applied in various fields, and access terminals are widely used in their application. Various terminals, especially terminals used for model training, may contain a large amount of sensitive data, so terminal security is important, and technologies for terminal security and data leakage prevention are receiving increasing attention.
For example, CN114676456A discloses a data privacy protection method, device, and storage medium based on edge computing, relating to the technical field of information security. The method comprises acquiring a plurality of training sub-models in a federated machine learning mechanism, wherein each training sub-model is obtained by a different end user training an initial model on a plurality of original data, and the plurality of original data are the acquired data of the different end users; training the target training sub-model on a target edge device according to the model parameters to obtain a target training model; and detecting, according to the target training model, whether each original data carries a data leakage risk. That application addresses the problem of high data leakage risk in the related art.
However, when a terminal is accessed by authorized visitors, habit differences give the terminal's data interaction and the instructions of the corresponding instruction input device a certain regularity in the data dimension. The prior art does not use this regularity to identify abnormal access in time, so terminal security is low and data is easily leaked.
Disclosure of Invention
When a terminal is accessed by authorized visitors, habit differences give the terminal's data interaction and the instructions of the corresponding instruction input device a certain regularity in the data dimension; the prior art does not use this regularity to identify abnormal access in time, so terminal security is low and data is easily leaked. To solve this problem, the invention provides a governance method for large model data leakage, which comprises the following steps:
Acquiring instruction response data in a history period of a model training terminal, wherein the instruction response data comprises data interaction amounts of the model training terminal and each port and instruction triggering targets corresponding to instruction input equipment;
Screening rule features based on the instruction response data to establish a feature behavior database aiming at the model training terminal, wherein the rule features comprise feature port combinations and feature function component combinations;
Responding to the model training terminal being accessed, acquiring instruction response data of the model training terminal within a period, and comparing it with the data in the characteristic behavior database;
Determining a behavior fit characterization coefficient corresponding to the model training terminal according to the comparison result, so as to identify the feature fit category of the current model training terminal;
Controlling data interaction of the model training terminal based on the feature fit category of the current access behavior, including,
Invoking the characteristic behavior database to identify a normal characteristic port, verifying a data transmission rule of the normal characteristic port, judging whether to close the corresponding normal characteristic port, and synchronously sending verification requirement information to the model training terminal so as to judge whether to send an early warning signal according to feedback information of the model training terminal;
or continuously collecting interaction data of the model training terminal within a period and instruction data of the corresponding instruction input device to judge the feature fit category of the model training terminal.
Further, the process of screening feature port combinations based on the instruction response data includes,
Determining a first conditional probability of data interaction between the model training terminal and different port combinations in a plurality of accessed processes;
Determining a characteristic port combination based on a first conditional probability corresponding to each port combination, recording a data interaction quantity average value of each port data interaction in the characteristic port combination, and storing the data interaction quantity average value in a characteristic behavior database;
And if the first conditional probability corresponding to the port combination is greater than a predetermined port combination conditional probability threshold, judging the port combination as a characteristic port combination.
Further, the process of screening feature function component combinations based on the instruction response data includes,
Acquiring a second conditional probability that each functional component combination is triggered in the process of accessing the model training terminal for a plurality of times;
Determining a characteristic functional component combination based on a second conditional probability corresponding to the functional component combination, and recording the triggering frequency of each functional component in the characteristic functional component combination;
And if the second conditional probability corresponding to the function component combination is greater than the predetermined function component combination probability threshold, determining that the current function component combination is the characteristic function component combination.
Further, the process of comparing the instruction response data of the model training terminal in the acquisition period with the data in the characteristic behavior database comprises,
Recording ports generating data interaction currently, if the current port combination exists in the characteristic behavior database, determining that the port combination is matched, and recording the current data interaction quantity average value corresponding to each port;
respectively calculating the difference between the current data interaction quantity average value corresponding to each port and the data interaction quantity average value stored in the characteristic behavior database, and obtaining the data interaction quantity matching characteristic after averaging the obtained difference values;
Recording the current triggered functional components, if the current functional component combination exists in the characteristic behavior database, determining that the functional component combination is matched, and recording the current triggering frequency of each functional component;
And calculating the difference between the current trigger frequency of each functional component and the trigger frequency stored in the characteristic behavior database, and obtaining the trigger frequency matching characteristic after averaging the obtained difference values.
Further, the process of determining the behavior fit characterization coefficient corresponding to the model training terminal according to the comparison result comprises,
Calculating the ratio of a preset reference data interaction quantity matching threshold value to a data interaction quantity matching characteristic to obtain a first fitting characterization factor;
Calculating the ratio of a preset reference trigger frequency matching threshold value to a trigger frequency matching characteristic to obtain a second fit characterization factor;
And determining the sum of the first fit characterization factor and the second fit characterization factor as the behavior fit characterization coefficient.
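In symbols, the comparison and the coefficient described above can be written as follows. The symbol names are introduced here for illustration only, and taking the absolute value of each difference is an assumption, since the text says only "difference". Here n is the number of matched ports, m the number of matched functional components, T_D and T_F the preset reference matching thresholds, M_D and M_F the matching characteristics, and K the behavior fit characterization coefficient.

```latex
\[
M_D = \frac{1}{n}\sum_{i=1}^{n}\left|\bar{d}_i^{\,\mathrm{cur}} - \bar{d}_i^{\,\mathrm{db}}\right|,
\qquad
M_F = \frac{1}{m}\sum_{j=1}^{m}\left|f_j^{\,\mathrm{cur}} - f_j^{\,\mathrm{db}}\right|
\]
\[
F_1 = \frac{T_D}{M_D}, \qquad F_2 = \frac{T_F}{M_F}, \qquad K = F_1 + F_2
\]
```

Under this reading, smaller deviations from the stored behavior give larger factors and therefore a larger K.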
Further, the process of identifying the feature fit category of the current model training terminal comprises,
If the behavior fit characterization coefficient is greater than or equal to a preset behavior fit characterization coefficient threshold, judging that the current model training terminal is of the strong feature fit category;
and if the behavior fit characterization coefficient is smaller than the preset behavior fit characterization coefficient threshold, judging that the current model training terminal is of the weak feature fit category.
Further, controlling data interaction of the model training terminal based on the feature fit category of the current access behavior comprises,
If the current model training terminal is in a weak feature fitting type, calling the feature behavior database to identify a normal feature port, verifying a data transmission rule of the normal feature port, judging whether to close the corresponding normal feature port, and synchronously sending verification requirement information to the model training terminal so as to judge whether to send an early warning signal according to feedback information of the model training terminal;
And if the current model training terminal is of a strong characteristic fitting type, continuously collecting interaction data of the model training terminal in a period and instruction data corresponding to the instruction input equipment to judge the characteristic fitting type of the model training terminal.
Further, the process of calling the characteristic behavior database to identify the normal characteristic port comprises,
Arranging each characteristic port combination according to the corresponding first conditional probability descending order to obtain a characteristic port combination sequence;
Selecting, from the head of the characteristic port combination sequence, a preset proportion of the characteristic port combination serial numbers;
marking each characteristic port combination based on each characteristic port combination serial number;
identifying ports contained in the marked feature port combinations as normal feature ports;
Each characteristic port combination corresponds to a unique serial number, and the characteristic port combination sequence consists of each unique serial number.
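The selection of normal feature ports can be sketched as below. This is a minimal illustration, not the patented implementation: the function name, the dictionary layout, the default proportion of 0.5, and rounding the count up are all assumptions, since the text specifies only "a preset proportion".

```python
import math

def normal_feature_ports(feature_probs, proportion=0.5):
    """Return the ports contained in the top share of feature port combinations.

    feature_probs maps each feature port combination (a frozenset of port
    numbers) to its first conditional probability. `proportion` is a
    hypothetical preset; the source does not give a value.
    """
    # Sort combinations by first conditional probability, highest first.
    ordered = sorted(feature_probs, key=feature_probs.get, reverse=True)
    # Give each combination a unique serial number (1 = most probable).
    serials = {serial: combo for serial, combo in enumerate(ordered, start=1)}
    # Keep the serial numbers at the head of the sequence (rounded up).
    keep = math.ceil(len(ordered) * proportion)
    marked = [serials[s] for s in range(1, keep + 1)]
    # Every port in a marked combination counts as a normal feature port.
    return set().union(*marked)

feature_probs = {
    frozenset({1, 4}): 0.40,
    frozenset({2, 3}): 0.35,
    frozenset({5}): 0.25,
}
ports = normal_feature_ports(feature_probs, proportion=0.5)
```

With three combinations and a proportion of 0.5, the two most probable combinations are marked, so port 5 is not treated as a normal feature port.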
Further, verifying the data transmission rule of the normal feature port, determining whether to close the corresponding normal feature port includes,
Acquiring the current data interaction quantity of each normal characteristic port, respectively constructing a data interaction quantity time domain curve, and determining the normal data interaction characteristics;
Calling a characteristic behavior database to acquire data interaction amounts corresponding to the normal characteristic ports, constructing a data interaction amount sample time domain curve, and determining sample data interaction characteristics;
comparing the normal data interaction characteristics corresponding to the normal characteristic ports with the sample data interaction characteristics to determine characteristic difference quantity;
if the characteristic difference is greater than or equal to a preset characteristic difference threshold, judging that the normal characteristic port is closed;
The normal data interaction characteristics and the sample data interaction characteristics each comprise the average amplitude of the data interaction quantity, the average slope of the rising segments, and the average slope of the falling segments.
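The verification of the transmission rule can be sketched as follows. The three curve features come from the text; everything else is an assumption made for illustration: equally spaced samples, the relative-difference aggregation used for the characteristic difference quantity, the threshold value of 0.5, and all function names.

```python
from statistics import mean

def curve_features(samples, dt=1.0):
    """Extract the three features named in the text from a time-domain curve."""
    slopes = [(b - a) / dt for a, b in zip(samples, samples[1:])]
    rising = [s for s in slopes if s > 0]
    falling = [s for s in slopes if s < 0]
    return {
        "mean_amplitude": mean(samples),
        "mean_rising_slope": mean(rising) if rising else 0.0,
        "mean_falling_slope": mean(falling) if falling else 0.0,
    }

def feature_difference(current, sample):
    """Assumed aggregation: mean relative difference over the three features."""
    diffs = []
    for name, ref in sample.items():
        denom = abs(ref) or 1.0  # avoid dividing by zero for flat curves
        diffs.append(abs(current[name] - ref) / denom)
    return mean(diffs)

def should_close_port(current_curve, sample_curve, threshold=0.5):
    """Close the port when the difference reaches the (hypothetical) threshold."""
    diff = feature_difference(curve_features(current_curve),
                              curve_features(sample_curve))
    return diff >= threshold

sample_curve = [10, 20, 30, 20, 10]    # stored behavior for one port
current_curve = [10, 60, 110, 60, 10]  # much larger and steeper transfer
close_port = should_close_port(current_curve, sample_curve)
```

A curve identical to the stored sample yields a difference of zero and leaves the port open, while the steeper curve above exceeds the assumed threshold.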
Further, the feedback information comprises a key, and if the key does not match a preset key, it is judged that the early warning signal is to be sent.
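The key check can be sketched in a few lines. The function name and the string representation of the key are assumptions; the constant-time comparison via `hmac.compare_digest` is a common hardening choice and is not something the source prescribes.

```python
import hmac

def early_warning_required(feedback_key: str, preset_key: str) -> bool:
    """Return True when the key in the feedback does not match the preset key."""
    # compare_digest avoids leaking how many leading characters matched.
    matches = hmac.compare_digest(feedback_key.encode("utf-8"),
                                  preset_key.encode("utf-8"))
    return not matches

warn = early_warning_required("guess-123", "correct-key")
```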
Compared with the prior art, the invention establishes a characteristic behavior database for the model training terminal based on the rule features of the instruction response data. When the model training terminal is accessed, instruction response data acquired in real time is compared with the data in the characteristic behavior database to determine the behavior fit characterization coefficient of the terminal and identify its current feature fit category, so that the interaction of the model training terminal is controlled in time. Under the weak feature fit category, the data transmission rule of each normal feature port is verified to judge whether the corresponding normal feature port should be closed.
In particular, the invention establishes the characteristic behavior database for the model training terminal based on the rule features of the instruction response data. In practice, because of the operating habits of an authorized operator, the instructions of the corresponding instruction input device show a certain regularity when the model training terminal is accessed over a long time: for example, some functional components of the model training terminal are triggered intensively while others are rarely triggered. The data interaction between the model training terminal and each port therefore also shows regularity: for example, data interaction occurs with specific ports and the data interaction quantity stays within a certain range. The invention therefore treats feature port combinations and feature functional component combinations as rule features and builds a dedicated characteristic behavior database, which provides data support for the subsequent identification of the feature fit category, allows the data interaction of the model training terminal to be controlled adaptively, ensures security, and reduces the risk of data leakage.
In particular, the invention calculates the behavior fit characterization coefficient and classifies the feature fit category of the model training terminal. The coefficient characterizes the difference between the regularity reflected by the instruction response data when the model training terminal is accessed and the regularity reflected by the instruction response data in the normal state. Whether the current model training terminal falls into the weak feature fit category can therefore be determined in time, access by a visitor masquerading as an authorized visitor can be identified, subsequent intervention can be made in time, and the data interaction of the model training terminal can be controlled adaptively to ensure security.
In particular, the invention identifies normal feature ports. In practice, the normal feature ports are determined from the characteristic behavior database and represent the port combinations with higher data interaction frequency in the normal state, which also carry a higher intrusion risk. Furthermore, the invention judges in time whether to close a normal feature port based on its transmission rule, and the verification of the transmission rule considers both the data interaction quantity of the normal feature port and how that quantity changes. In practice, if a masquerading operator accesses the model training terminal to steal data, interfere with data interaction, or contaminate the model by feeding data into it, this is reflected in the transmission rule. The invention therefore identifies this situation and closes the transmission port in time, which ensures the security of the model training terminal and reduces the risk of data leakage.
Drawings
FIG. 1 is a schematic diagram of the steps of a method for managing large model data leakage according to an embodiment of the invention;
FIG. 2 is a logic decision diagram for identifying feature fit categories of a current model training terminal according to an embodiment of the invention;
FIG. 3 is a logic decision diagram of an embodiment of the present invention for determining whether to close a corresponding normal feature port;
FIG. 4 is a logic diagram of an embodiment of the present invention for determining whether to issue an early warning signal.
Detailed Description
The invention will be further described with reference to examples for the purpose of making the objects and advantages of the invention more apparent, it being understood that the specific examples described herein are given by way of illustration only and are not intended to be limiting.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
Referring to FIG. 1 to FIG. 4: FIG. 1 is a schematic diagram of the steps of a method for managing large model data leakage according to an embodiment of the invention; FIG. 2 is a logic decision diagram for identifying the feature fit category of the current model training terminal according to an embodiment of the invention; FIG. 3 is a logic decision diagram for determining whether to close a corresponding normal feature port according to an embodiment of the invention; and FIG. 4 is a logic decision diagram for determining whether to issue an early warning signal according to an embodiment of the invention. The method for managing large model data leakage according to the present embodiment includes:
S1, acquiring instruction response data in a history period of a model training terminal, wherein the instruction response data comprises data interaction amounts of the model training terminal and each port and instruction triggering targets corresponding to instruction input equipment;
S2, screening rule features based on the instruction response data to establish a characteristic behavior database for the model training terminal, wherein the rule features comprise feature port combinations and feature function component combinations;
S3, responding to the model training terminal being accessed, acquiring instruction response data of the model training terminal within a period, and comparing it with the data in the characteristic behavior database;
S4, determining a behavior fit characterization coefficient corresponding to the model training terminal according to the comparison result, so as to identify the feature fit category of the current model training terminal;
S5, controlling data interaction of the model training terminal based on the feature fit category of the current access behavior, comprising,
Invoking the characteristic behavior database to identify a normal characteristic port, verifying a data transmission rule of the normal characteristic port, judging whether to close the corresponding normal characteristic port, and synchronously sending verification requirement information to the model training terminal so as to judge whether to send an early warning signal according to feedback information of the model training terminal;
or continuously collecting interactive data of the model training terminal in a period and judging the characteristic fitting type of the model training terminal according to instruction data of the corresponding instruction input equipment.
Specifically, the specific structure of the model training terminal is not limited. It is a logic component capable of executing logic operations and only needs to be able to load or store a model for training, for example a computer, which is not described in detail here.
Specifically, in implementation, the instruction input device is connected to the model training terminal and is used to input instructions to the model training terminal to realize corresponding functions; examples are a keyboard and a mouse.
Specifically, a functional component is configured in the model training terminal to implement a corresponding function, for example a piece of software in the model training terminal. In general, a functional component can be triggered by a control instruction from the instruction input device to implement the corresponding function, which is not described in detail here.
Specifically, the model training terminal being accessed means that a visitor accesses the model training terminal and controls it through the instruction input device.
Specifically, in a computer network, a port is a logical communication endpoint used to distinguish between different network services or processes running on the terminal. Each port is identified by a unique port number, which is a 16-bit number ranging from 0 to 65535; this is not described in detail here.
Specifically, in step S2, the process of screening feature port combinations based on the instruction response data includes,
Determining a first conditional probability of data interaction between the model training terminal and different port combinations over a plurality of accesses, wherein it can be understood that a single access runs from the visitor starting to access the model training terminal to the visitor stopping access;
It can be understood that, because control instructions differ, the model training terminal may exchange data with different ports during a single access.
Taking a total of 5 ports as an example, the port serial numbers are 1, 2, 3, 4 and 5 respectively;
If the current port combination is ports 1, 2 and 3, and the model training terminal has been accessed 10 times, 3 of which occurred under the current port combination, then the first conditional probability corresponding to the port combination is 3/10, that is, the ratio of the number of accesses under the current port combination to the total number of accesses;
Determining a characteristic port combination based on a first conditional probability corresponding to each port combination, recording a data interaction quantity average value of each port data interaction in the characteristic port combination, and storing the data interaction quantity average value in a characteristic behavior database;
And if the first conditional probability corresponding to the port combination is greater than a predetermined port combination conditional probability threshold, judging the port combination as a characteristic port combination.
Specifically, the port combination conditional probability threshold is determined from the average of the first conditional probabilities of all port combinations: it is set to the product of that average and a precision coefficient, and the precision coefficient is selected within the interval [1.05, 1.15].
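The screening of feature port combinations can be sketched as follows. This is a minimal illustration under stated assumptions: each access is represented as a dictionary mapping port number to data interaction quantity, the function name and returned layout are hypothetical, and the precision coefficient of 1.10 is simply one value from the stated interval.

```python
from collections import Counter, defaultdict
from statistics import mean

def build_port_features(sessions, precision=1.10):
    """Screen feature port combinations from per-access port volumes.

    sessions: list of dicts {port: data interaction quantity}, one per access.
    Returns (probabilities, threshold, database entries for feature combos).
    """
    combos = [frozenset(s) for s in sessions]  # ports used in each access
    counts = Counter(combos)
    # First conditional probability: accesses under the combo / total accesses.
    probs = {combo: n / len(sessions) for combo, n in counts.items()}
    # Threshold: average first conditional probability times the coefficient.
    threshold = mean(probs.values()) * precision
    db = {}
    for combo, p in probs.items():
        if p > threshold:
            volumes = defaultdict(list)
            for s in sessions:
                if frozenset(s) == combo:
                    for port, quantity in s.items():
                        volumes[port].append(quantity)
            db[combo] = {
                "probability": p,
                "mean_volume": {port: mean(v) for port, v in volumes.items()},
            }
    return probs, threshold, db

sessions = (
    [{1: 100, 2: 50, 3: 10}] * 3
    + [{1: 80, 4: 40}, {1: 120, 4: 60}, {1: 100, 4: 50},
       {1: 100, 4: 50}, {1: 100, 4: 50}]
    + [{2: 5, 5: 5}] * 2
)
probs, threshold, db = build_port_features(sessions)
```

In this sample, the combination of ports 1, 2 and 3 occurs in 3 of 10 accesses, matching the 3/10 example in the text, but only the combination of ports 1 and 4 (5 of 10 accesses) exceeds the threshold and is stored.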
Specifically, in step S2, the process of screening feature function component combinations based on the instruction response data includes,
Acquiring a second conditional probability that each functional component combination is triggered in the process of accessing the model training terminal for a plurality of times;
It can be understood that the second conditional probability is the ratio of the number of accesses under the functional component combination to the total number of accesses of the model training terminal;
And determining a characteristic functional component combination based on the second conditional probability corresponding to the functional component combination, and recording the triggering frequency of each functional component in the characteristic functional component combination, wherein the triggering frequency is the number of times the component is triggered within a reference time, and the reference time can be selected from 3 min to 10 min so that the acquired instruction response data is representative.
And if the second conditional probability corresponding to the function component combination is greater than the predetermined function component combination probability threshold, determining that the current function component combination is the characteristic function component combination.
Specifically, in implementation, the functional component combination probability threshold is determined based on a second conditional probability average that each functional component combination is triggered, and is set to be a product of the second conditional probability average and the precision coefficient.
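The triggering frequency defined above (number of triggers within a reference time) can be sketched as below. The event-log format of (component name, timestamp in seconds), the function name, and the component names are assumptions; the 300-second default is one value inside the 3 min to 10 min range given in the text.

```python
from collections import Counter

def trigger_frequencies(events, window_start, reference_seconds=300):
    """Count triggers per functional component inside one reference window.

    events: iterable of (component_name, timestamp_in_seconds) pairs.
    """
    window_end = window_start + reference_seconds
    return Counter(
        name for name, timestamp in events
        if window_start <= timestamp < window_end
    )

events = [("editor", 5), ("editor", 120), ("trainer", 200), ("editor", 400)]
freqs = trigger_frequencies(events, window_start=0)
```

The trigger at 400 seconds falls outside the 300-second window, so only two triggers are counted for the hypothetical "editor" component.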
According to the invention, the characteristic behavior database for the model training terminal is established based on the rule features of the instruction response data. In practice, because of the operating habits of an authorized operator, the instructions of the corresponding instruction input device show a certain regularity when the model training terminal is accessed over a long time: for example, some functional components of the model training terminal are triggered intensively while others are rarely triggered. The data interaction between the model training terminal and each port therefore also shows regularity: for example, data interaction occurs with specific ports and the data interaction quantity stays within a certain range. Feature port combinations and feature functional component combinations are therefore treated as rule features and a dedicated characteristic behavior database is built, which provides data support for the subsequent identification of the feature fit category, allows the data interaction of the model training terminal to be controlled adaptively, ensures security, and reduces the risk of data leakage.
Specifically, in step S3, the process of comparing the instruction response data of the model training terminal in the acquisition period with the data in the characteristic behavior database comprises,
Recording the ports currently generating data interaction; if the current port combination exists in the characteristic behavior database, determining that the port combination is matched, and recording the current data interaction quantity average value corresponding to each port, wherein it can be understood that the data interaction quantity of each port can be recorded synchronously for each time period within a certain time, and the average value then calculated;
respectively calculating the difference between the current data interaction quantity average value corresponding to each port and the data interaction quantity average value stored in the characteristic behavior database, and obtaining the data interaction quantity matching characteristic after averaging the obtained difference values;
Recording the current triggered functional components, if the current functional component combination exists in the characteristic behavior database, determining that the functional component combination is matched, and recording the current triggering frequency of each functional component;
And calculating the difference between the current trigger frequency of each functional component and the trigger frequency stored in the characteristic behavior database, and obtaining the trigger frequency matching characteristic after averaging the obtained difference values.
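The two matching characteristics follow the same pattern, so one helper can serve both. This is a sketch under assumptions: the function name is hypothetical, absolute differences are used although the text says only "difference", and an unmatched combination is signalled by returning None.

```python
from statistics import mean

def matching_feature(current, stored):
    """Mean absolute difference between current and stored per-key values.

    current / stored: dicts keyed by port (mean data interaction quantity)
    or by functional component (trigger frequency).
    Returns None when the combination is not the one stored in the database.
    """
    if set(current) != set(stored):
        return None  # combination not matched
    return mean([abs(current[key] - stored[key]) for key in stored])

stored_volume = {1: 100, 4: 50}    # averages from the characteristic database
current_volume = {1: 110, 4: 44}   # averages measured during this access
volume_match = matching_feature(current_volume, stored_volume)
```

Here the per-port differences are 10 and 6, so the data interaction quantity matching characteristic is 8; passing trigger frequencies instead of volumes gives the trigger frequency matching characteristic in the same way.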
Specifically, in step S4, the process of determining the behavior fit characterization coefficient corresponding to the model training terminal according to the comparison result includes,
Calculating the ratio of a preset reference data interaction quantity matching threshold value to a data interaction quantity matching characteristic to obtain a first fitting characterization factor;
Calculating the ratio of a preset reference trigger frequency matching threshold value to a trigger frequency matching characteristic to obtain a second fit characterization factor;
And determining the sum of the first fit characterization factor and the second fit characterization factor as the behavior fit characterization coefficient.
Specifically, the reference data interaction quantity matching threshold is determined based on the data interaction quantity average value of the port stored in the characteristic behavior database, and is typically set to between 0.25 and 0.35 times the data interaction quantity average value.
Specifically, the reference trigger frequency matching threshold is determined based on the trigger frequency of the functional component stored in the characteristic behavior database, and is typically set to between 0.15 times and 0.25 times the trigger frequency.
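The coefficient calculation can be sketched as follows. The function names and the numeric inputs are illustrative assumptions, and the small `eps` guard against a zero matching characteristic (a perfect match) is an addition that the source does not discuss.

```python
def fit_factor(reference_threshold, matching_feature, eps=1e-9):
    """Ratio of the reference matching threshold to the matching characteristic."""
    # eps is an assumed guard: a perfect match would otherwise divide by zero.
    return reference_threshold / max(matching_feature, eps)

def behavior_fit_coefficient(volume_ref, volume_match, freq_ref, freq_match):
    """Sum of the first and second fit characterization factors."""
    first = fit_factor(volume_ref, volume_match)
    second = fit_factor(freq_ref, freq_match)
    return first + second

# Illustrative numbers: a stored mean volume of 75 with multiplier 0.30 gives a
# reference threshold of 22.5; a stored trigger frequency of 10 with
# multiplier 0.20 gives a reference threshold of 2.0.
coefficient = behavior_fit_coefficient(
    volume_ref=22.5, volume_match=8.0,
    freq_ref=2.0, freq_match=1.6,
)
```

With these inputs the two factors are 2.8125 and 1.25. Each factor exceeds 1 exactly when the measured deviation is smaller than its reference threshold, which is why a larger coefficient indicates behavior closer to the stored pattern.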
According to the invention, the behavior fit characterization coefficient is calculated and the feature fit category of the model training terminal is classified. The coefficient characterizes the difference between the regularity reflected by the instruction response data when the model training terminal is accessed and the regularity reflected by the instruction response data in the normal state. Whether the current model training terminal falls into the weak feature fit category can therefore be determined in time, access by a visitor masquerading as an authorized visitor can be identified, subsequent intervention can be made in time, and the data interaction of the model training terminal can be controlled adaptively to ensure security.
Specifically, in step S4, the process of identifying the characteristic fitting category of the current model training terminal comprises:
If the behavior fit characterization coefficient is greater than or equal to a preset fit characterization coefficient threshold, determining that the current model training terminal is of the strong characteristic fitting category;
And if the behavior fit characterization coefficient is smaller than the preset fit characterization coefficient threshold, determining that the current model training terminal is of the weak characteristic fitting category.
Specifically, the fit characterization coefficient threshold is set with reference to the first fit characterization factor and the second fit characterization factor. It can be understood that when the first fit characterization factor and the second fit characterization factor are each close to 1, the difference is approaching the acceptable upper limit; to characterize this upper limit of difference, a person skilled in the art can select the fit characterization coefficient threshold within the interval [2.15, 2.25].
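The category decision can be sketched as a single comparison. The names below and the default threshold of 2.2 (one value inside the stated interval) are assumptions made for illustration.

```python
from enum import Enum


class FitCategory(Enum):
    STRONG = "strong"
    WEAK = "weak"


def classify_fit(coefficient: float, threshold: float = 2.2) -> FitCategory:
    """Return the characteristic fitting category for a coefficient.

    The default threshold is an assumed value from the interval [2.15, 2.25].
    """
    if not 2.15 <= threshold <= 2.25:
        raise ValueError("threshold is expected within [2.15, 2.25]")
    # Greater than or equal to the threshold means strong; otherwise weak.
    return FitCategory.STRONG if coefficient >= threshold else FitCategory.WEAK
```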
Specifically, in step S5, the process of controlling the data interaction of the model training terminal based on the characteristic fitting category of the current access behavior comprises:
If the current model training terminal is of the weak characteristic fitting category, calling the characteristic behavior database to identify normal characteristic ports, verifying the data transmission rule of each normal characteristic port to determine whether to close the corresponding normal characteristic port, and synchronously sending verification requirement information to the model training terminal, so as to determine whether to send an early warning signal according to the feedback information of the model training terminal;
And if the current model training terminal is of the strong characteristic fitting category, continuing to collect, within each period, the interaction data of the model training terminal and the instruction data corresponding to the instruction input device, so as to determine the characteristic fitting category of the model training terminal.
Specifically, in step S5, the process of calling the characteristic behavior database to identify the normal characteristic ports comprises:
Arranging the characteristic port combinations in descending order of their corresponding first conditional probabilities to obtain a characteristic port combination sequence;
Selecting a preset proportion of the characteristic port combination serial numbers from the head of the characteristic port combination sequence;
Marking the characteristic port combinations corresponding to the selected serial numbers;
Identifying the ports contained in the marked characteristic port combinations as normal characteristic ports;
Wherein each characteristic port combination corresponds to a unique serial number, and the characteristic port combination sequence consists of these unique serial numbers.
Specifically, so that the combinations at the head of the characteristic port combination sequence, which are the most strongly characteristic, are selected, the preset proportion is chosen within the range of 15% to 30%.
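The port identification steps above can be sketched as follows. The record layout (serial number, port set, first conditional probability), the function name, the default proportion of 20%, and the choice to round the selected count up with a minimum of one are all assumptions for illustration.

```python
import math


def normal_feature_ports(
    combinations: list[tuple[int, frozenset[int], float]],
    proportion: float = 0.20,  # assumed value; the text gives 15% to 30%
) -> set[int]:
    """Return the ports contained in the head of the sorted combination sequence.

    Each entry is (serial_number, ports, first_conditional_probability).
    """
    if not combinations:
        return set()
    # Arrange combinations in descending order of first conditional probability.
    ordered = sorted(combinations, key=lambda c: c[2], reverse=True)
    # Sequence of unique serial numbers, as described in the text.
    sequence = [serial for serial, _, _ in ordered]
    # Assumption: round up so that at least one combination is always selected.
    count = max(1, math.ceil(proportion * len(sequence)))
    marked = set(sequence[:count])
    # Ports contained in the marked combinations are the normal characteristic ports.
    ports: set[int] = set()
    for serial, combo_ports, _ in ordered:
        if serial in marked:
            ports |= combo_ports
    return ports
```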
Specifically, in step S5, the process of verifying the data transmission rule of the normal characteristic port and determining whether to close the corresponding normal characteristic port comprises:
Acquiring the current data interaction quantity of each normal characteristic port, constructing a data interaction quantity time domain curve for each port, and determining the normal data interaction characteristics;
Calling the characteristic behavior database to acquire the data interaction quantity corresponding to each normal characteristic port, constructing a data interaction quantity sample time domain curve, and determining the sample data interaction characteristics;
Comparing the normal data interaction characteristics corresponding to each normal characteristic port with the sample data interaction characteristics to determine a characteristic difference quantity;
If the characteristic difference quantity is greater than or equal to a preset characteristic difference quantity threshold, determining to close the normal characteristic port;
Wherein the normal data interaction characteristics and the sample data interaction characteristics each comprise the average amplitude of the data interaction quantity, the average slope of the rising segments, and the average slope of the falling segments.
Specifically, the characteristic difference quantity is determined based on the comparison of the normal data interaction characteristics with the sample data interaction characteristics, which comprises:
Determining the difference between the average amplitude of the data interaction quantity in the normal data interaction characteristics and the average amplitude of the data interaction quantity in the sample data interaction characteristics, and taking the ratio of that difference to the average amplitude of the data interaction quantity in the sample data interaction characteristics to obtain a first characteristic difference quantity;
Determining the difference between the average slope of the rising segments in the normal data interaction characteristics and the average slope of the rising segments in the sample data interaction characteristics, and taking the ratio of that difference to the average slope of the rising segments in the sample data interaction characteristics to obtain a second characteristic difference quantity;
Determining the difference between the average slope of the falling segments in the normal data interaction characteristics and the average slope of the falling segments in the sample data interaction characteristics, and taking the ratio of that difference to the average slope of the falling segments in the sample data interaction characteristics to obtain a third characteristic difference quantity;
Determining the sum of the first characteristic difference quantity, the second characteristic difference quantity, and the third characteristic difference quantity as the characteristic difference quantity.
It can be understood that the normal data interaction characteristics and the sample data interaction characteristics can be extracted from the corresponding time domain curves, which is not described in detail herein.
It can be understood that the data interaction quantity of a normal characteristic port is allowed a certain deviation, and the characteristic difference quantity threshold is set to characterize a large difference; a person skilled in the art can select the characteristic difference quantity threshold within the interval [0.65, 0.85].
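The verification of the transmission rule can be sketched as below. The text does not specify how the characteristics are extracted from a time domain curve, so the extraction here (slopes between consecutive, evenly spaced samples) is an assumption, as are all names and the default threshold of 0.75. Taking the absolute value of each relative difference is also an assumption, made so that deviations in opposite directions do not cancel in the sum.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class CurveFeatures:
    mean_amplitude: float
    mean_rising_slope: float
    mean_falling_slope: float


def extract_features(samples: list[float], dt: float = 1.0) -> CurveFeatures:
    """Assumed extraction: slopes between consecutive evenly spaced samples."""
    slopes = [(b - a) / dt for a, b in zip(samples, samples[1:])]
    rising = [s for s in slopes if s > 0]
    falling = [s for s in slopes if s < 0]
    return CurveFeatures(
        mean_amplitude=fmean(samples),
        mean_rising_slope=fmean(rising) if rising else 0.0,
        mean_falling_slope=fmean(falling) if falling else 0.0,
    )


def relative_difference(current: float, sample: float) -> float:
    """Difference divided by the sample value (absolute value is an assumption)."""
    if sample == 0:
        raise ValueError("sample characteristic must be non-zero")
    return abs(current - sample) / abs(sample)


def feature_difference(current: CurveFeatures, sample: CurveFeatures) -> float:
    """Sum of the first, second, and third characteristic difference quantities."""
    return (
        relative_difference(current.mean_amplitude, sample.mean_amplitude)
        + relative_difference(current.mean_rising_slope, sample.mean_rising_slope)
        + relative_difference(current.mean_falling_slope, sample.mean_falling_slope)
    )


def should_close_port(
    current_samples: list[float],
    stored_samples: list[float],
    threshold: float = 0.75,  # assumed value from the interval [0.65, 0.85]
) -> bool:
    current = extract_features(current_samples)
    sample = extract_features(stored_samples)
    return feature_difference(current, sample) >= threshold
```

In this sketch a port whose current curve matches the stored sample curve stays open, while a curve with doubled amplitude and doubled slopes produces a difference quantity of 3 and the port is closed.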
The invention identifies the normal characteristic ports. In practice, the normal characteristic ports are determined based on the characteristic behavior database and represent the port combinations with higher data interaction frequency in the normal state, which carry a higher intrusion risk. Furthermore, the invention determines in time whether to close a normal characteristic port based on its transmission rule; verification of the transmission rule takes into account both the data interaction quantity of the normal characteristic port and how that quantity changes. In practice, if a masquerading operator accesses the model training terminal to steal data, interfere with data interaction, or pollute the model by feeding data into it, this is reflected in the transmission rule. The invention therefore recognizes these situations and closes the transmission port in time, ensuring the security of the model training terminal and reducing the risk of data leakage.
Specifically, the feedback information includes a key, and if the key does not match a predetermined key, it is determined that an early warning signal is to be generated.
It can be understood that the verification requirement information includes a request for the model training terminal to provide the key.
The key may be preset and may use either asymmetric or symmetric encryption; a person skilled in the art can verify whether the key matches according to the corresponding encryption mode, which is not described in detail herein.
The generated early warning signal can be sent to a monitoring end for monitoring, which is not described in detail herein.
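For the simplest case of a preset shared key, the key check that decides whether to generate an early warning signal might look like the sketch below. The function name is hypothetical, and the constant-time comparison via the standard library's `hmac.compare_digest` is a design choice of this sketch rather than something the text prescribes; an asymmetric scheme would need signature verification instead.

```python
import hmac


def needs_early_warning(feedback_key: bytes, predetermined_key: bytes) -> bool:
    """Return True when the key in the feedback does not match the preset key."""
    # compare_digest avoids leaking, through timing, how much of the key matched.
    return not hmac.compare_digest(feedback_key, predetermined_key)
```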
The governance method for large model data leakage of the present invention, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411686760.2A CN119203244B (en) | 2024-11-25 | 2024-11-25 | A governance method for large model data leakage |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN119203244A (en) | 2024-12-27 |
| CN119203244B (en) | 2025-03-25 |
Family
ID=94076391
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411686760.2A Active CN119203244B (en) | 2024-11-25 | 2024-11-25 | A governance method for large model data leakage |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119203244B (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119449454A (en) * | 2024-11-28 | 2025-02-14 | 北京独创时代科技有限公司 | Information security detection management system |
| CN119922007A (en) * | 2025-01-24 | 2025-05-02 | 北京和润诚科技有限公司 | A data transmission protection method |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140098705A1 (en) * | 2010-12-30 | 2014-04-10 | Adaptive Spectrum And Signal Alignment, Inc. | Management center for communication system customer premises equipment |
| CN109508485A (en) * | 2018-10-30 | 2019-03-22 | 平安医疗健康管理股份有限公司 | A kind of data processing model dissemination method, device, server and storage medium |
| CN117633626A (en) * | 2023-12-04 | 2024-03-01 | 中国建设银行股份有限公司 | Model updating method, device and computer equipment |
| CN118467930A (en) * | 2024-07-09 | 2024-08-09 | 西安传显行风网络科技有限公司 | Abnormal data processing method applied to robot |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119449454A (en) * | 2024-11-28 | 2025-02-14 | 北京独创时代科技有限公司 | Information security detection management system |
| CN119449454B (en) * | 2024-11-28 | 2025-10-31 | 北京独创时代科技有限公司 | An information security detection and management system |
| CN119922007A (en) * | 2025-01-24 | 2025-05-02 | 北京和润诚科技有限公司 | A data transmission protection method |
| CN119922007B (en) * | 2025-01-24 | 2025-09-12 | 北京和润诚科技有限公司 | Data transmission protection method |
Also Published As
| Publication number | Publication date |
|---|---|
| CN119203244B (en) | 2025-03-25 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |