Disclosure of Invention
The embodiment of the application provides a business data security management system and method based on big data, which are used for solving the problem that unknown threat detection capability is limited in the prior art.
In a first aspect, an embodiment of the present application provides a business data security management method based on big data, including:
receiving a business activity data stream from an enterprise internal system;
dynamically analyzing abnormal behavior patterns in the business activity data stream, and evaluating the risk level of the business activity data stream in real time according to a preset security policy;
automatically adjusting the data encryption strength according to the risk level, and implementing fine-grained access control on high-risk data;
encrypting the business activity data stream based on the data encryption strength, and storing the encrypted business activity data stream in a distributed manner across a plurality of data centers at different physical locations through a distributed storage technology;
when a predefined threat scenario is detected, automatically triggering an early warning mechanism and generating a report, and executing corresponding protective measures according to the security policy.
Optionally, the dynamically analyzing the abnormal behavior pattern in the business activity data stream and evaluating the risk level of the business activity data stream in real time according to a preset security policy includes:
dynamically monitoring each activity index in the business activity data stream by using a machine learning model to identify potential abnormal behavior patterns, wherein the machine learning model can adjust a detection threshold according to historical data and current environmental changes;
applying a behavior analysis algorithm to the abnormal behavior pattern, and generating a behavior score of the abnormal behavior pattern in combination with a preset behavior feature library;
processing the behavior score by using a preset risk assessment framework to obtain a risk assessment result, wherein the risk assessment framework is used to make the risk assessment of the business activity data stream comprehensive and accurate;
and evaluating the risk level of the business activity data stream according to the behavior score and the risk assessment result.
Optionally, the automatically adjusting the encryption strength of the data according to the risk level and performing fine-grained access control on the high-risk data includes:
distinguishing different types of the business activity data stream through a data classification technique;
dynamically adjusting the encryption strength of the different types of business activity data streams according to the risk level by using an adaptive encryption algorithm, wherein the encryption strength includes the length of the encryption key, and the adaptive encryption algorithm automatically selects an appropriate encryption key length as the risk level changes;
determining a business activity data stream whose risk level is higher than a preset risk level as high-risk data, and implementing fine-grained access control on the high-risk data, wherein the fine-grained access control at least includes setting access permission rules.
Optionally, the encrypting the business activity data stream based on the data encryption strength, and storing the encrypted business activity data stream in a distributed manner across a plurality of data centers at different physical locations through a distributed storage technology, includes:
encrypting the business activity data stream based on the data encryption strength to generate an encrypted business activity data stream;
dividing the encrypted business activity data stream into a plurality of data segments by using a data segmentation algorithm;
determining storage locations of the data segments in the data centers through a hash allocation strategy, and storing the data segments to the corresponding storage locations through a distributed storage technology.
Optionally, the automatically triggering an early warning mechanism and generating a report when a predefined threat scenario is detected, and executing corresponding protective measures according to the security policy, includes:
Monitoring all activity indexes in the business activity data stream in real time, and identifying potential threat behaviors in the business activity data stream through an anomaly detection algorithm;
Setting a predefined threat scene rule set, and judging that a predefined threat scene is detected when the potential threat behavior accords with any condition in the threat scene rule set;
Triggering an early warning mechanism when the predefined threat scene is detected, sending an instant alarm notification to a preset contact person or a management system, and recording the occurrence time and specific conditions of the threat scene;
Generating a detailed threat report document, the report document including the specific form, occurrence time, scope of impact, and possible cause analysis of the threat scenario;
according to the preset security policy, corresponding protective measures are automatically selected and executed, wherein the protective measures at least comprise isolating affected data resources, suspending related account authorities and starting a data recovery flow;
And implementing the protective measures through an automatic script or a preconfigured workflow engine, and monitoring the execution state of the protective measures to ensure the effective execution of the protective measures.
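As a non-limiting illustration, the rule-set matching, report generation, and selection of protective measures described above may be sketched as follows. The rule names, event fields (`failed_logins`, `exported_rows`), thresholds, and measure identifiers are hypothetical and are not specified by the application.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class ThreatRule:
    """One entry of the predefined threat scenario rule set (hypothetical shape)."""
    name: str
    condition: Callable[[dict], bool]
    measures: list


# Hypothetical rule set; thresholds are illustrative assumptions only.
RULES = [
    ThreatRule("repeated_login_failure",
               lambda e: e.get("failed_logins", 0) >= 5,
               ["suspend_account"]),
    ThreatRule("bulk_export",
               lambda e: e.get("exported_rows", 0) > 10_000,
               ["isolate_data", "suspend_account"]),
]


def evaluate(event: dict, rules=RULES) -> Optional[dict]:
    """Return a threat report if the event meets any rule, otherwise None."""
    matched = [r for r in rules if r.condition(event)]
    if not matched:
        return None
    return {
        # Occurrence time of the detected scenario (UTC, ISO 8601).
        "detected_at": datetime.now(timezone.utc).isoformat(),
        "scenarios": [r.name for r in matched],
        "event": event,
        # De-duplicated protective measures to hand to a workflow engine.
        "measures": sorted({m for r in matched for m in r.measures}),
    }
```

In such a sketch, the returned `measures` list would be passed to an automation script or workflow engine, which executes the measures and reports their execution state.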
Optionally, the adjusting the encryption strength of the business data streams of different types according to the risk level further includes:
defining an encryption intensity factor, and adjusting the encryption intensity of different types of business data streams according to the encryption intensity factor and the risk level;
wherein the encryption intensity factor $E(t)$ is determined according to the risk level $R(t)$ of the business data stream, the importance factor $I(t)$ of the business data stream, the sensitivity factor $S(t)$ of the business data stream, the historical access frequency $F(t)$ of the business data stream, the data volume $V(t)$ of the business data stream, and the time $t$:
$$E(t) = f\big(R(t), I(t), S(t), F(t), V(t), t\big)$$
wherein the encryption intensity factor $E(t)$ can be calculated by the following formula:
$$E(t) = \sum_{i=1}^{5} w_i(t)\, x_i(t)$$
wherein $x_i(t)$ represents $R(t)$, $I(t)$, $S(t)$, $\log(F(t)+1)$, and $\sqrt{V(t)}$ respectively, and $w_i(t)$ is a time-varying weight coefficient that satisfies
$$\sum_{i=1}^{5} w_i(t) = 1$$
The weight coefficient $w_i(t)$ is obtained through machine learning model prediction:
$$w_i(t) = \mathrm{LSTM}(\text{historical data of } w_i)$$
Further comprises:
correcting the encryption intensity factor $E(t)$ based on a preset time decay factor $A(t)$, a dynamic adjustment factor $DAF(t)$, a fluctuation factor $V_f(t)$, and a historical trend factor $H_f(t)$;
the corrected encryption intensity factor $E(t)$ is calculated by the following formula:
[formula missing in the source text]
wherein:
$\sum_{i} w_i(t)\, x_i(t)$ is the basic calculation part of the original encryption intensity factor, which calculates the basic encryption intensity factor by weighted summation, where $w_i(t)$ is the weight corresponding to factor $i$ at time $t$, $x_i(t)$ is the actual value of factor $i$ at time $t$, and $x_i(t)$ represents $R(t)$, $I(t)$, $S(t)$, $\log(F(t)+1)$, and $\sqrt{V(t)}$, i.e. the risk level, the importance factor, the sensitivity factor, the logarithmic transformation of the historical access frequency, and the square root transformation of the data volume;
$A(t)$ represents a time decay factor that decreases exponentially as the gap from the initial time point $t_0$ increases, expressed as $A(t) = e^{-\alpha (t - t_0)}$, where $\alpha$ is a constant representing the time decay rate;
$DAF(t)$ represents a dynamic adjustment factor, which reflects the variation trend of the encryption intensity by calculating the average encryption intensity over the last $w$ time points, expressed as [expression missing in the source text];
$V_f(t)$ represents a fluctuation factor, which reflects the degree to which the encryption intensity fluctuates over time, expressed as [expression missing in the source text], where $\bar{E}$ is the average encryption intensity over the $w$ time points;
$H_f(t)$ represents a historical trend factor, which measures the trend of the encryption intensity over time, expressed as [expression missing in the source text];
$e$ is a constant, and $\gamma$ and $\theta$ are adjustment coefficients that adjust the degree of influence of the dynamic adjustment factor $DAF(t)$ and the historical trend factor $H_f(t)$, respectively, on the encryption intensity.
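As a non-limiting illustration, the weighted-summation base part and the exponential time decay factor described above may be computed as in the following sketch. The function names, the use of the natural logarithm, the requirement that the weights sum to 1, and all numeric values are assumptions made for illustration; the sketch does not combine the two quantities, because the corrected formula is not reproduced in the text.

```python
import math


def base_intensity(R: float, I: float, S: float, F: float, V: float,
                   weights: list) -> float:
    """Weighted sum over R, I, S, log(F + 1) and sqrt(V)."""
    x = [R, I, S, math.log(F + 1), math.sqrt(V)]
    if len(weights) != len(x):
        raise ValueError("one weight is required per factor")
    # Assumption: the weights are normalized so that they sum to 1.
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights are assumed to sum to 1")
    return sum(w * xi for w, xi in zip(weights, x))


def time_decay(t: float, t0: float, alpha: float) -> float:
    """Exponential decay A(t) = exp(-alpha * (t - t0))."""
    return math.exp(-alpha * (t - t0))


# Illustrative inputs only.
e_base = base_intensity(R=0.8, I=0.5, S=0.5, F=0.0, V=4.0, weights=[0.2] * 5)
a_now = time_decay(t=10.0, t0=0.0, alpha=0.1)
```

In the application, the weights would be predicted by an LSTM model rather than fixed as in this sketch.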
Optionally, the splitting the encrypted business data stream into a plurality of data segments by using a data splitting algorithm includes:
Defining a data segmentation factor, and based on the data segmentation factor, segmenting the encrypted business activity data stream into a plurality of data segments by using a data segmentation algorithm;
wherein the data segmentation factor $D(t)$ is determined according to the total size $T(t)$ of the business data stream, the importance factor $I(t)$ of the business data stream, the sensitivity factor $S(t)$ of the business data stream, the historical access frequency $F(t)$ of the business data stream, the data volume $V(t)$ of the business data stream, and the time $t$:
$$D(t) = g\big(T(t), I(t), S(t), F(t), V(t), t\big)$$
the data segmentation factor $D(t)$ is calculated by the following formula:
[formula missing in the source text]
wherein $n(t)$ represents the number of data segments into which the business data stream is divided at time $t$, $x_i(t)$ represents $I(t)$, $S(t)$, $\log(F(t)+1)$, and $\sqrt{V(t)}$ respectively, and $w_i(t)$ is a time-varying weight coefficient that satisfies
$$\sum_{i=1}^{4} w_i(t) = 1$$
Further comprises:
correcting the data segmentation factor $D(t)$ based on a preset redundancy factor $R_f(t)$, a data distribution balance factor $DBF(t)$, a load fluctuation factor $L_f(t)$, and a load trend factor $LH_f(t)$;
the corrected data segmentation factor $D(t)$ is calculated by the following formula:
[formula missing in the source text]
wherein:
$T(t)$ represents the total size of the business data stream at time $t$, and $n(t)$ represents the number of data segments into which the business data stream is divided at time $t$;
$\sum_{i} w_i(t)\, x_i(t)$ represents the result of a weighted summation of different attributes (e.g., importance, sensitivity, logarithmic transformation of historical access frequency, square root transformation of data volume), where $w_i(t)$ is a time-varying weight coefficient and $x_i(t)$ is the corresponding attribute at time $t$;
$R_f(t)$ is a redundancy factor representing the number of additional data segments added at time $t$ to improve data reliability, calculated as [expression missing in the source text], where reed(t) is a time-varying coefficient used to adjust the redundancy proportion;
$DBF(t)$ is a data distribution balance factor, which measures whether the data distribution among the data centers is balanced, calculated as [expression missing in the source text], where $K$ is the number of data centers, $C_k(t)$ is the load of the $k$-th data center at time $t$, and $\bar{C}(t)$ is the average load of all data centers at time $t$;
$L_f(t)$ is a load fluctuation factor, which represents the degree of load fluctuation among the data centers, calculated as [expression missing in the source text];
$LH_f(t)$ is a load trend factor, which measures the trend of the data center load over time, calculated as [expression missing in the source text], where $C(t)$ denotes a reference load value at time $t$, typically a small positive value chosen to avoid a zero denominator;
$\delta$ and $\eta$ are adjustment coefficients that adjust the influence of $DBF(t)$ and $LH_f(t)$, respectively.
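As a non-limiting illustration, dividing an encrypted stream into segments and assigning each segment to a data center through a hash allocation strategy may be sketched as follows. The equal-size split, the use of SHA-256 with modulo placement, and the segment and data center identifiers are assumptions; the application does not fix a particular segmentation algorithm or hash function, and the sketch does not use the segmentation factor D(t).

```python
import hashlib


def split_segments(data: bytes, n: int) -> list:
    """Split data into at most n contiguous segments of near-equal size."""
    if n <= 0:
        raise ValueError("n must be positive")
    size = max(1, -(-len(data) // n))  # ceiling division, at least 1 byte
    return [data[i:i + size] for i in range(0, len(data), size)]


def assign_center(segment_id: str, centers: list) -> str:
    """Map a segment identifier to a data center by hashing (assumed strategy)."""
    digest = hashlib.sha256(segment_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(centers)
    return centers[index]


# Hypothetical data center names and stream identifier.
CENTERS = ["dc-east", "dc-west", "dc-north"]
encrypted = bytes(range(10))  # stands in for an encrypted data stream
segments = split_segments(encrypted, 4)
placement = {f"stream-1/{i}": assign_center(f"stream-1/{i}", CENTERS)
             for i in range(len(segments))}
```

Modulo placement remaps most segments when the number of data centers changes; a deployment that expects the set of data centers to change may prefer consistent hashing instead.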
In a second aspect, an embodiment of the present application provides a business data security management system based on big data, including:
a receiving module for receiving a stream of business activity data from an internal system of an enterprise;
the analysis module is used for dynamically analyzing the abnormal behavior mode in the business activity data stream and evaluating the risk level of the business activity data stream in real time according to a preset security policy;
the control module is used for automatically adjusting the data encryption intensity according to the risk level and implementing fine-granularity access control on the high-risk data;
the storage module dispersedly stores the encrypted business activity data stream in a plurality of data centers with different physical positions through a distributed storage technology;
and the early warning module automatically triggers an early warning mechanism and generates a report when a predefined threat scene is detected, and executes corresponding protective measures according to the security policy.
In a third aspect, an embodiment of the present application provides a computing device, including a processing component and a storage component, where the storage component stores one or more computer instructions, and the one or more computer instructions are used to be invoked and executed by the processing component to implement a business data security management method based on big data according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer storage medium storing a computer program, where the computer program when executed by a computer implements a business data security management method based on big data as described in the first aspect.
In the embodiment of the application, a business activity data stream from an enterprise internal system is received, an abnormal behavior mode in the business activity data stream is dynamically analyzed, the risk level of the business activity data stream is evaluated in real time according to a preset security policy, the data encryption intensity is automatically adjusted according to the risk level, fine-granularity access control is implemented on high-risk data, the business activity data stream is encrypted based on the data encryption intensity, an early warning mechanism is automatically triggered and a report is generated when a predefined threat scene is detected, and corresponding protective measures are executed according to the security policy. The technical scheme provided by the application not only enhances the safety of business data, but also improves the efficiency and the intelligent level of data management, and provides omnibearing data security guarantee for enterprises.
These and other aspects of the application will be more readily apparent from the following description of the embodiments.
Detailed Description
In order to enable those skilled in the art to better understand the present application, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present application with reference to the accompanying drawings.
In some of the flows described in the specification and claims of the present application and in the foregoing figures, a plurality of operations occurring in a particular order are included, but it should be understood that the operations may be performed out of order or performed in parallel, with the order of operations such as 101, 102, etc., being merely used to distinguish between the various operations, the order of the operations themselves not representing any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first" and "second" herein are used to distinguish different messages, devices, modules, etc., and do not represent a sequence, and are not limited to the "first" and the "second" being different types.
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
Fig. 1 is a flowchart of a business data security management method based on big data according to an embodiment of the present application, as shown in fig. 1, the method includes:
With the deepening of digital transformation, the amount of business activity data generated by the internal systems of enterprises keeps growing. These data contain not only a large number of business operation records, but also sensitive information related to finance, customers, suppliers, and so on. To secure these data, a system is needed that can receive and analyze the business data stream in real time and dynamically adjust the data protection policy according to the risk assessment results.
Currently, the data protection measures of most enterprises mainly include static data encryption, fixed access control lists (ACLs), and rule-based intrusion detection systems (IDSs). Although these measures provide some degree of data protection, they fall short in the face of dynamically changing data environments and complex security threats.
The existing data protection schemes have the following defects. Static data encryption cannot dynamically adjust the encryption strength according to the actual risk level of the data, so data of different risk levels receive the same protection, which causes either wasted resources or insufficient protection. A fixed access control list is difficult to adapt to the continuously changing internal requirements of an enterprise; in particular, access rights cannot be flexibly adjusted when new threats appear. A rule-based intrusion detection system depends on a predefined rule base, so its ability to detect unknown threats is limited and it cannot be updated in real time to cope with new attack modes.
The embodiment of the application provides a business data security management system and method based on big data, which are used for solving the problem that unknown threat detection capability is limited in the prior art.
In a first aspect, an embodiment of the present application provides a business data security management method based on big data, including:
101. receiving a stream of business activity data from an enterprise internal system;
This step refers to collecting and receiving data related to business activities from the various internal information systems of the enterprise (e.g., ERP systems, CRM systems, financial systems, etc.). Such data may include transaction records, order information, customer profiles, financial statements, and other data closely related to the operation of the enterprise.
The purpose of receiving these data streams is to enable their subsequent analysis, processing, and protection, ensuring the security and integrity of the data while also providing valuable business insight to the enterprise.
Assume that a company runs an ERP system that records all business activities of the company, including but not limited to purchase orders, sales orders, inventory changes, and financial transactions. To ensure the security and compliance of these data, the company decides to deploy a big data-based business data security management system. The ERP system transmits the business data stream to the data security management system through an API interface or another data transmission protocol (such as FTP or SFTP). The data formats involved include but are not limited to CSV files, XML files, JSON objects, and other structured data formats. The data security management system includes a dedicated data acquisition module responsible for listening for data transmission requests from the ERP system or other internal systems; upon receiving a data transmission request, the data acquisition module immediately starts to receive and buffer the incoming data stream. The received data stream may undergo preliminary cleaning and format conversion so that the data meet the requirements of the subsequent processing modules; this preprocessing may include removing duplicate records, supplementing missing fields, and unifying formats. The preprocessed data may be temporarily stored in a buffer or temporary database to await further analysis and processing, and the data security management system may set a scheduled task to periodically import the buffered data into a long-term storage solution.
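As a non-limiting illustration, the preprocessing described above (format unification, supplementing missing fields, removing duplicate records) may be sketched as follows for CSV and JSON payloads. The field names in `REQUIRED_FIELDS` and the choice of `None` as the placeholder for missing values are hypothetical; an actual system would use the schema of its own ERP export.

```python
import csv
import io
import json

# Hypothetical schema; the application does not define specific field names.
REQUIRED_FIELDS = ("order_id", "customer_id", "amount")


def parse_records(payload: str, fmt: str) -> list:
    """Parse a CSV or JSON payload into a list of dict records."""
    if fmt == "json":
        data = json.loads(payload)
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    raise ValueError(f"unsupported format: {fmt}")


def preprocess(records: list) -> list:
    """Unify field formats, mark missing fields, and drop duplicate records."""
    seen = set()
    cleaned = []
    for rec in records:
        # Format unification: lower-case keys, strip whitespace from strings.
        norm = {str(k).strip().lower(): (v.strip() if isinstance(v, str) else v)
                for k, v in rec.items()}
        # Missing or empty required fields get an explicit None placeholder.
        for name in REQUIRED_FIELDS:
            if norm.get(name) in (None, ""):
                norm[name] = None
        if norm["amount"] is not None:
            norm["amount"] = float(norm["amount"])
        # Duplicate removal keyed on the required fields.
        key = tuple(norm[name] for name in REQUIRED_FIELDS)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(norm)
    return cleaned
```

For example, `preprocess(parse_records(payload, "csv"))` would yield records ready to be written to the buffer or temporary database.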
102. Dynamically analyzing abnormal behavior patterns in the business activity data stream, and evaluating the risk level of the business activity data stream in real time according to a preset security policy;
This step aims at identifying the abnormal behavior patterns that may exist in the business data streams by monitoring the data streams in real time, and evaluating the risk levels of the data streams according to preset security policies. Through dynamic analysis, the system can discover potential security threats in time and take corresponding protective measures according to the risk level.
Assume that a retail company records the purchasing behavior of all customers in its ERP system. To ensure data security, the company deploys a big data-based business data security management system to monitor the business data streams. The system receives the business data streams from the ERP system in real time; the data include customers' purchase records, login behaviors, refund requests, and the like. A monitoring module analyzes the data in real time to identify any behavior that does not conform to normal patterns. The system dynamically monitors the activity indexes in the data streams by using a pre-trained machine learning model (such as an anomaly detection algorithm or a deep learning model). For example, if the purchase amount of a customer suddenly increases within a certain time period, or an IP address repeatedly attempts to log in and fails, the system marks the behavior as a potential abnormal pattern. For each identified abnormal behavior pattern, the system generates a behavior score by combining a preset behavior feature library with historical data. The behavior score is input into a preset risk assessment framework, which comprehensively considers various factors (such as the behavior score, historical behavior patterns, and current environmental changes) to evaluate the overall risk level of the data stream. The system adjusts the risk level in real time and feeds this information back to the subsequent processing steps; for example, for high-risk or highly sensitive data, the system may immediately restrict access or strengthen encryption.
The application recognizes that, although existing business data security management systems can perform basic data stream monitoring and abnormal behavior identification, their dynamic adaptability and accuracy are limited. Specifically, traditional monitoring systems often rely on a fixed threshold to judge abnormal behavior, and such a static threshold easily produces false alarms or missed alarms in a dynamically changing business environment. Risk is often estimated from a single dimension (such as purchase amount), ignoring the combined influence of multiple factors (such as purchase frequency and login behavior), so the risk assessment is not comprehensive enough. In addition, the system cannot adjust its own detection strategy according to historical data and changes in the current environment, so its adaptability degrades over long-term operation. To solve these technical problems, the embodiment of the application provides an alternative scheme that achieves more intelligent and comprehensive risk assessment by introducing a machine learning model and a multi-level behavior analysis algorithm.
The alternative scheme is as follows:
Optionally, the "dynamically analyzing abnormal behavior patterns in the business data stream and evaluating the risk level of the business data stream in real time according to a preset security policy" in step 102 includes:
dynamically monitoring each activity index in the business activity data stream by using a machine learning model to identify potential abnormal behavior patterns, wherein the machine learning model can adjust a detection threshold according to historical data and current environmental changes; applying a behavior analysis algorithm to the abnormal behavior pattern and generating a behavior score of the abnormal behavior pattern in combination with a preset behavior feature library; processing the behavior score by using a preset risk assessment framework to obtain a risk assessment result, wherein the risk assessment framework is used to make the risk assessment of the business activity data stream comprehensive and accurate; and evaluating the risk level of the business activity data stream according to the behavior score and the risk assessment result.
Assume that a retail enterprise deploys the improved business data security management system described above. On a given day, the system detects abnormal purchasing behavior by a customer: the customer makes multiple large purchases within one hour, and the purchase amounts are three times the customer's average purchase amount over the past month. The machine learning model adjusts the detection threshold according to historical data and the current environment and marks the behavior as a potential abnormal pattern. The system applies a behavior analysis algorithm, combined with a preset behavior feature library, and generates a behavior score of 85 out of a maximum of 100. The risk assessment framework comprehensively considers factors such as the behavior score (85), the historical behavior pattern (the customer's purchasing behavior over the past month), and the current environmental change (multiple large purchases within a short time), and obtains a risk assessment result of high risk. According to this result, the system assigns the customer's purchasing behavior a high risk level, immediately strengthens the encryption of the related data, and applies fine-grained access control to the sensitive data related to the customer.
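As a non-limiting illustration, a detection threshold that adapts to historical data, a behavior score, and a mapping from score to risk level may be sketched as follows. This sketch substitutes a simple statistical rule (mean plus k standard deviations) for the machine learning model of the application; the scaling of the score and the cut-off values 80 and 50 are assumptions chosen only for illustration.

```python
from statistics import mean, pstdev


def adaptive_threshold(history: list, k: float = 3.0) -> float:
    """Threshold recomputed from history: mean + k population std deviations."""
    return mean(history) + k * pstdev(history)


def behavior_score(value: float, history: list) -> float:
    """Map the deviation from the historical mean onto a 0-100 score."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return 100.0 if value > mu else 0.0
    z = (value - mu) / sigma
    # Assumed scaling: a deviation of 5 standard deviations maps to 100.
    return max(0.0, min(100.0, 20.0 * z))


def risk_level(score: float) -> str:
    """Assumed cut-offs for the risk assessment result."""
    if score >= 80:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


# Hypothetical purchase amounts of one customer over the past month.
history = [100, 110, 90, 105, 95]
```

Because the threshold is recomputed from `history`, it shifts as new observations are appended, which is the adaptive behavior the alternative scheme relies on.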
Through the alternative scheme, the system not only can identify potential security threats more accurately, but also can adjust the detection strategy in time according to the dynamically-changed business environment, thereby effectively improving the accuracy and comprehensiveness of risk assessment. In addition, the system can automatically take corresponding encryption and access control measures according to the risk level, so that the safety of business data is further ensured.
103. Automatically adjusting the data encryption intensity according to the risk level, and implementing fine-granularity access control on the high-risk data;
The core of this step is to dynamically adjust the strength of data encryption according to the previously assessed risk level and to implement more stringent access control measures for high risk data. In this way, the system can assign different protection levels according to the importance and sensitivity of the data, ensuring better protection of high risk data.
Suppose a financial institution needs to protect the transaction records and other sensitive information of its customers. The institution deploys a big data-based business data security management system that can dynamically adjust the data encryption strength according to the risk level and implement fine-grained access control on high-risk data. The system first classifies the received business data streams to distinguish different types of data (such as transaction records, personal identity information, and financial statements), and then determines which data are high-risk according to the risk level evaluated in the previous step. For data marked as high-risk, the system encrypts the data with a stronger encryption configuration and a longer key length (such as AES-256 instead of AES-128). The adaptive encryption algorithm automatically selects an appropriate encryption key length as the risk level changes, so as to balance data security against performance. For high-risk data, the system implements fine-grained access control, which means that even authenticated users must pass a further authorization check before accessing the data; for example, certain sensitive data may be viewable only by senior managers of a specific department, or accessible only after multi-factor authentication (such as two-factor authentication).
In the existing business data security management system, although basic encryption and access control can be provided according to different data types, some problems still exist in practical application:
The application recognizes the following. A traditional system generally uses a fixed encryption strength, applying the same encryption algorithm and key length regardless of the risk level of the data, which results in either insufficient encryption strength or excessive performance cost. A traditional system can only apply access control to the data as a whole and cannot manage access rights for specific data items at a finer granularity, which may allow unauthorized users to access sensitive data. Moreover, when the risk level of the data changes, manually adjusting the encryption strength and access control rules is time-consuming and error-prone. To solve these technical problems, the embodiment of the application provides an alternative scheme that achieves more flexible and accurate data protection by introducing a data classification technique, an adaptive encryption algorithm, and fine-grained access control.
The alternative scheme is as follows:
Optionally, the "automatically adjusting data encryption strength according to the risk level and implementing fine-grained access control on high risk data" in step 103 includes:
distinguishing different types of the business data streams through a data classification technique; dynamically adjusting the encryption strength of the different types of business data streams according to the risk level by using an adaptive encryption algorithm, wherein the encryption strength includes the length of the encryption key, and the adaptive encryption algorithm automatically selects an appropriate encryption key length as the risk level changes; and determining a business data stream whose risk level is higher than a preset value as high-risk data and implementing fine-grained access control on the high-risk data, wherein the fine-grained access control at least includes setting access permission rules.
Suppose a financial institution needs to protect the transaction records and other sensitive information of its customers. The institution deploys a big data-based business data security management system, and the specific implementation steps are as follows:
The system classifies the received business data stream to distinguish different types of data such as transaction records, personal identity information, and financial statements; the transaction records may include the purchase history of customers, the personal identity information may include names, identity card numbers, and the like, and the financial statements may include financial status and the like. The system determines the encryption strength for each data type according to the previously evaluated risk level. For high-risk data (such as transaction records and personal identity information), the system uses the AES-256 encryption algorithm and automatically selects a longer key length (such as 256 bits) according to the risk level. For low-risk data (such as financial statements), the system uses the AES-128 encryption algorithm and selects a shorter key length (such as 128 bits). The system implements fine-grained access control on the high-risk data: it sets access permission rules for the transaction records, so that only strictly authenticated finance department personnel can view them, and every access is recorded and audited.
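As a non-limiting illustration, selecting a key length from the risk level and enforcing an audited, fine-grained access rule may be sketched as follows. The 128-bit and 256-bit lengths follow the example above; the intermediate 192-bit tier, the class and field names, and the department names are assumptions. The sketch only generates key material of the selected length and does not perform the encryption itself.

```python
import secrets
from dataclasses import dataclass, field

# Low and high tiers follow the example in the text; "medium" is an assumption.
KEY_BITS_BY_RISK = {"low": 128, "medium": 192, "high": 256}


def select_key_bits(risk_level: str) -> int:
    """Choose an AES key length (in bits) for the given risk level."""
    return KEY_BITS_BY_RISK[risk_level]


def generate_key(risk_level: str) -> bytes:
    """Generate random key material of the selected length."""
    return secrets.token_bytes(select_key_bits(risk_level) // 8)


@dataclass
class AccessRule:
    allowed_departments: set
    require_second_factor: bool = True


@dataclass
class AccessController:
    rules: dict  # data type -> AccessRule, defined only for high-risk types
    audit_log: list = field(default_factory=list)

    def can_access(self, user: str, department: str, data_type: str,
                   second_factor_ok: bool) -> bool:
        rule = self.rules.get(data_type)
        if rule is None:
            allowed = True  # no fine-grained rule registered for this type
        else:
            allowed = (department in rule.allowed_departments
                       and (second_factor_ok or not rule.require_second_factor))
        # Every access attempt is recorded for later audit.
        self.audit_log.append((user, data_type, allowed))
        return allowed


controller = AccessController(
    rules={"transaction_record": AccessRule({"finance"})})
```

Keeping the audit entry inside `can_access` means denied attempts are logged as well as granted ones, which matches the requirement that each access be recorded and audited.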
Through the alternative scheme, the system not only can dynamically adjust the encryption strength according to the risk level of the data, but also can implement stricter fine-grained access control on the high-risk data. Therefore, the data security is improved, and the performance and efficiency of the system are ensured, so that the problems of static encryption strength and coarse-granularity access control in the prior art are effectively solved.
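The key-length selection and access rule in the financial-institution example can be sketched as follows. This is a minimal illustration only: the threshold value 0.6, the role name "finance" and the names `AccessRule`, `ProtectionPlan`, `plan_protection` and `may_access` are assumptions introduced here, since the text specifies only "a preset value" and "a set access authority rule".

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: the source only says "a preset value".
HIGH_RISK_THRESHOLD = 0.6


@dataclass(frozen=True)
class AccessRule:
    """A fine-grained access rule: who may read, and whether to audit."""
    allowed_roles: frozenset
    require_strong_auth: bool = True
    audit_every_access: bool = True


@dataclass(frozen=True)
class ProtectionPlan:
    data_type: str
    algorithm: str
    key_bits: int
    access_rule: Optional[AccessRule] = None


def plan_protection(data_type: str, risk_level: float) -> ProtectionPlan:
    """Pick a key length from the risk level; attach a rule to high-risk data."""
    if risk_level > HIGH_RISK_THRESHOLD:
        rule = AccessRule(allowed_roles=frozenset({"finance"}))
        return ProtectionPlan(data_type, "AES-256", 256, rule)
    return ProtectionPlan(data_type, "AES-128", 128, None)


def may_access(plan: ProtectionPlan, role: str,
               strongly_authenticated: bool, audit_log: list) -> bool:
    """Check one access request and record it when the rule asks for auditing."""
    rule = plan.access_rule
    if rule is None:
        return True  # low-risk data: no fine-grained rule in this sketch
    allowed = role in rule.allowed_roles and (
        strongly_authenticated or not rule.require_strong_auth)
    if rule.audit_every_access:
        audit_log.append((plan.data_type, role, allowed))
    return allowed
```

In this sketch a transaction record with a risk level of 0.9 receives a 256-bit key and a rule that admits only the assumed "finance" role, and every request against it, allowed or denied, is appended to the audit log.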
In the existing business data security management system, although the encryption strength can be adjusted according to the risk level of different data, some problems still exist in practical application:
Conventional encryption strength adjustment methods are generally static and cannot adjust the encryption strength in real time according to dynamic changes of the data stream. Existing methods may adjust the encryption strength based on only a single factor (such as the risk level) and neglect other important factors (such as importance, sensitivity and historical access frequency). Existing systems also lack a mechanism capable of adaptively adjusting the encryption strength according to historical data and current environmental changes. In order to solve these technical problems, the embodiment of the invention provides an alternative scheme that realizes more intelligent and comprehensive adjustment of the data encryption intensity by introducing an encryption intensity factor and a correction mechanism thereof.
The alternative scheme is as follows:
optionally, the adjusting the encryption strength of the business data streams of different types according to the risk level further includes:
defining an encryption intensity factor, and adjusting the encryption intensity of the different types of business data streams based on the encryption intensity factor, wherein the encryption intensity factor E(t) is determined according to the risk level R(t), the importance factor I(t), the sensitivity factor S(t), the historical access frequency F(t) and the data volume V(t) of the business data stream, and the time t:
$$E(t) = f\big(R(t), I(t), S(t), F(t), V(t), t\big)$$
wherein the encryption strength factor E(t) can be calculated by the following formula:
$$E(t) = \sum_{i=1}^{5} w_i(t)\, x_i(t)$$
wherein $x_i(t)$ represents, in turn, $R(t)$, $I(t)$, $S(t)$, $\log(F(t)+1)$ and $\sqrt{V(t)}$, and $w_i(t)$ is a time-varying weight coefficient that satisfies
$$\sum_{i=1}^{5} w_i(t) = 1$$
The weight coefficient $w_i(t)$ is obtained through machine learning model prediction:
$$w_i(t) = \mathrm{LSTM}\big(\text{historical data of } w_i\big)$$
Further comprises:
Correcting the encryption intensity factor E(t) based on a preset time attenuation factor A(t), a dynamic adjustment factor DAF(t), a fluctuation factor $V_f(t)$ and a historical trend factor $H_f(t)$;
The corrected encryption strength factor E(t) is calculated from the following quantities:
$\sum_{i} w_i(t)\, x_i(t)$ is the basic calculation part of the original encryption intensity factor, which calculates the basic encryption intensity factor by weighted summation, where $w_i(t)$ is the weight corresponding to the i-th factor at time t and $x_i(t)$ is the actual value of the i-th factor at time t; $x_i(t)$ represents $R(t)$, $I(t)$, $S(t)$, $\log(F(t)+1)$ and $\sqrt{V(t)}$, that is, the risk level, the importance factor, the sensitivity factor, the logarithmic transformation of the historical access frequency and the square root transformation of the data volume;
A(t) represents a time decay factor that decreases exponentially as the gap from the initial time point $t_0$ increases, expressed as $A(t) = e^{-\alpha (t - t_0)}$, where $\alpha$ is a constant representing the time decay rate;
DAF(t) represents a dynamic adjustment factor, which reflects the trend of variation of the encryption intensity by calculating the average encryption intensity over the last w time points;
$V_f(t)$ represents a fluctuation factor, which reflects the degree to which the encryption strength fluctuates over time relative to $\bar{E}(t)$, the average value of the encryption intensity over the w time points;
$H_f(t)$ represents a historical trend factor, which measures the trend change of the encryption intensity over time;
E is a constant, and $\gamma$ and $\theta$ are adjustment coefficients for adjusting the degree of influence of the dynamic adjustment factor DAF(t) and the historical trend factor $H_f(t)$ on the encryption strength.
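As a minimal sketch, the time decay factor and two window statistics over recent encryption intensities can be computed as shown below. The exponential form of A(t) follows the description and the worked value $e^{-0.1(10-0)}$ used later in the text. The window average follows the verbal definition of DAF(t). Using the population standard deviation as the fluctuation measure is an assumption made here for illustration, because the exact expression for $V_f(t)$ is not available in this text; the function names are likewise hypothetical.

```python
import math
from statistics import fmean, pstdev


def time_decay(t: float, t0: float, alpha: float) -> float:
    """A(t) = exp(-alpha * (t - t0)): shrinks as t moves away from t0."""
    return math.exp(-alpha * (t - t0))


def window_mean(history: list, w: int) -> float:
    """Average encryption intensity over the last w recorded time points."""
    if w <= 0:
        raise ValueError("w must be positive")
    return fmean(history[-w:])


def window_spread(history: list, w: int) -> float:
    """Spread of the last w intensities around their mean.

    Assumption: the population standard deviation stands in for the
    fluctuation measure, whose exact formula is not given here.
    """
    if w <= 0:
        raise ValueError("w must be positive")
    return pstdev(history[-w:])
```

With alpha = 0.1, t = 10 and t0 = 0, `time_decay` returns $e^{-1}$, which rounds to the 0.37 used in the worked example.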
Suppose a financial institution needs to protect the transaction records and other sensitive information of its customers. The institution deploys a business data security management system based on big data, and the specific implementation steps are as follows:
Assume that, for a certain business data stream at time t, the risk level R(t) = 0.8, the importance factor I(t) = 0.7, the sensitivity factor S(t) = 0.9, the historical access frequency F(t) = 5 and the data volume V(t) = 1000. The values $x_i(t)$ are then:
$$x(t) = \big[\,0.8,\ 0.7,\ 0.9,\ \log(5+1),\ \sqrt{1000}\,\big]$$
Assuming that the weight coefficients $w_i(t)$ are [0.2, 0.1, 0.3, 0.15, 0.25], the basic weighted sum is:
$$\sum_{i=1}^{5} w_i(t)\, x_i(t) = 0.2 \cdot 0.8 + 0.1 \cdot 0.7 + 0.3 \cdot 0.9 + 0.15 \cdot \log 6 + 0.25 \cdot \sqrt{1000}$$
Assuming that the time attenuation factor $A(t) = e^{-0.1(10-0)} = e^{-1} \approx 0.37$, and that DAF(t) = 10, $V_f(t) = 5$, $H_f(t) = 15$, $\gamma = 0.5$ and $\theta = 0.5$, the corrected encryption intensity factor is $E(t) \approx 5.72$.
Based on the corrected encryption strength factor $E(t) \approx 5.72$, the system selects the corresponding encryption algorithm and key length. For example, if E(t) > 5, the AES-256 encryption algorithm is used; otherwise, the AES-128 encryption algorithm is used. Here E(t) > 5, so the AES-256 encryption algorithm is used, and the specific key length is determined from E(t).
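The weighted summation and the threshold rule of this example can be sketched as follows. The sketch assumes the natural logarithm for log(F(t)+1), which the text does not state explicitly, and the function names are hypothetical. It covers only the basic weighted sum and the final algorithm choice; the correction step that leads to the value 5.72 is not reproduced, because its formula is not available in this text.

```python
import math


def transformed_inputs(R: float, I: float, S: float, F: float, V: float) -> list:
    """Build x_i(t): R, I, S, log(F + 1), sqrt(V). Natural log is assumed."""
    return [R, I, S, math.log(F + 1), math.sqrt(V)]


def base_factor(x: list, w: list) -> float:
    """Basic encryption intensity factor: the weighted sum of x_i with w_i."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    if not math.isclose(sum(w), 1.0):
        raise ValueError("weights are expected to sum to 1")
    return sum(wi * xi for wi, xi in zip(w, x))


def select_algorithm(e: float, threshold: float = 5.0) -> str:
    """Threshold rule from the example: above 5 use AES-256, else AES-128."""
    return "AES-256" if e > threshold else "AES-128"


x = transformed_inputs(R=0.8, I=0.7, S=0.9, F=5, V=1000)
weights = [0.2, 0.1, 0.3, 0.15, 0.25]
basic = base_factor(x, weights)      # about 8.67 before any correction
choice = select_algorithm(5.72)      # corrected value quoted in the text
```

Under the natural-log assumption the uncorrected weighted sum is about 8.67, dominated by the square-root term for the data volume; the corrected value 5.72 quoted in the example exceeds the threshold of 5 and therefore maps to AES-256.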
By adopting the alternative scheme, the system not only can adjust the encryption strength in real time according to the dynamic change of the data stream, but also can comprehensively consider a plurality of factors (such as risk level, importance, sensitivity, historical access frequency and the like), thereby protecting the data security more intelligently and comprehensively. In addition, by introducing a time attenuation factor, a dynamic adjustment factor, a fluctuation factor and a historical trend factor, the system can better adapt to the change of the data stream and ensure the rationality of encryption intensity.
104. Encrypting the business activity data stream based on the data encryption intensity, and dispersedly storing the encrypted business activity data stream in a plurality of data centers with different physical positions through a distributed storage technology;
This step encrypts the business activity data stream according to the data encryption intensity determined in the previous step, ensuring that the data is not illegally accessed or tampered with during transmission and storage. It also uses a distributed storage technology to store the encrypted data dispersedly in a plurality of different data centers, which improves the availability and security of the data and reduces the risk of single-point failure.
Assume that a nationwide company needs to ensure the security and high availability of its business data. The company deploys a business data security management system based on big data and adopts a distributed storage technology to protect the data. The specific implementation steps are as follows. The system determines the encryption intensity required by each data stream according to the previously evaluated risk level, and encrypts the business data stream using an appropriate encryption algorithm (such as AES) and key length. The encrypted business data stream is divided into a plurality of data segments, and the storage position of each data segment is determined by a hash allocation strategy, so that the data segments are stored dispersedly in different data centers. This improves the storage efficiency of the data and ensures that, even if one data center fails, the data can be recovered from the other data centers. The encrypted data segments are managed and stored by a distributed storage technology, and storing the data dispersedly in a plurality of data centers ensures redundancy and high availability of the data at the physical level.
The application recognizes that existing business data security management systems can realize the basic functions of data encryption and distributed storage, but still have some problems in practical application. Existing systems generally use a fixed encryption algorithm and key length and cannot dynamically adjust the encryption strength according to the risk level of the data, so data protection is not flexible enough. Data is generally stored centrally in one or a few data centers, so once these centers fail, the data may be lost or unavailable. Existing threat detection and response mechanisms are usually passive, without real-time monitoring or automatic protective measures, so threat response is slow and ineffective. In order to solve these technical problems, the embodiment of the application provides an alternative scheme that realizes more intelligent and comprehensive data protection by introducing a data segmentation algorithm, a hash allocation strategy, and real-time monitoring with automatic protective measures.
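The segmentation and hash allocation described in this example can be sketched as follows. The text names only "a hash allocation strategy"; the sketch swaps in rendezvous (highest-random-weight) hashing with SHA-256 as one concrete choice, and the fixed segment size, the replica count of 2 and the function names are assumptions made for illustration.

```python
import hashlib


def split_segments(ciphertext: bytes, segment_size: int) -> list:
    """Cut the encrypted stream into fixed-size segments (last may be shorter)."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [ciphertext[i:i + segment_size]
            for i in range(0, len(ciphertext), segment_size)]


def assign_centers(segment_id: str, centers: list, replicas: int = 2) -> list:
    """Rank data centers by the hash of (segment_id, center), keep the top ones.

    Assumption: rendezvous hashing stands in for the unspecified hash
    allocation strategy. Storing each segment in `replicas` distinct
    centers lets it survive the failure of any single center.
    """
    ranked = sorted(
        centers,
        key=lambda c: hashlib.sha256(f"{segment_id}:{c}".encode()).digest(),
        reverse=True,
    )
    return ranked[:replicas]


def placement(ciphertext: bytes, segment_size: int, centers: list) -> dict:
    """Map each segment index to the data centers that should hold it."""
    segments = split_segments(ciphertext, segment_size)
    return {i: assign_centers(f"seg-{i}", centers) for i in range(len(segments))}
```

Because the ranking depends only on the segment identifier and the data center names, any node can recompute where a segment lives without consulting a central index, which is one reason a hash-based placement suits dispersed storage.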
The alternative scheme is as follows:
optionally, the "encrypt the business data stream based on the data encryption strength and store the encrypted business data stream in a plurality of data centers with different physical locations in a distributed storage technology" in step 104 includes:
Encrypting the business activity data stream based on the data encryption intensity to generate an encrypted business activity data stream; segmenting the encrypted business activity data stream into a plurality of data segments by using a data segmentation algorithm; determining the storage positions of the data segments in the data centers by using a hash allocation strategy; and storing the data segments to the corresponding storage positions by using a distributed storage technology.
Optionally, the automatically triggering an early warning mechanism and generating a report when a predefined threat scenario is detected, while executing corresponding protective measures according to the security policy, includes:
monitoring all activity indexes in the business activity data stream in real time, and identifying potential threat behaviors in the business activity data stream through an anomaly detection algorithm;
setting a predefined threat scene rule set, and judging that a predefined threat scene is detected when the potential threat behavior meets any condition in the threat scene rule set;
triggering the early warning mechanism when the predefined threat scene is detected, sending an instant alarm notification to a preset contact person or management system, and recording the occurrence time and specific conditions of the threat scene;
generating a detailed threat report document, wherein the report document includes the specific manifestation, occurrence time, influence range and possible cause analysis of the threat scene;
automatically selecting and executing corresponding protective measures according to the preset security policy, wherein the protective measures include at least isolating affected data resources, suspending related account authorities and starting a data recovery flow;
executing the protective measures through an automated script or a preconfigured workflow engine, and monitoring the execution state to ensure that the protective measures are effectively executed.
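The rule-set check, alarm notification and report generation described above can be sketched as follows. The rule names, the thresholds and the names `ThreatReport`, `RULES`, `detect` and `handle` are hypothetical and chosen only to illustrate the "any condition in the rule set" logic.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class ThreatReport:
    scenario: str          # which rule matched (the manifestation)
    occurred_at: datetime  # occurrence time
    affected: str          # influence range, here a single account
    details: dict          # raw indicators kept for cause analysis


# Hypothetical rule set: name -> predicate over a dict of activity indexes.
RULES = {
    "repeated_login_failure": lambda m: m.get("failed_logins", 0) > 5,
    "abnormally_large_transaction": lambda m: m.get("amount", 0) > 1_000_000,
}


def detect(metrics: dict, rules: dict = RULES) -> Optional[str]:
    """Return the first rule the behaviour satisfies, or None if none match."""
    for name, predicate in rules.items():
        if predicate(metrics):
            return name
    return None


def handle(account: str, metrics: dict,
           notify: Callable[[str], None]) -> Optional[ThreatReport]:
    """On a match, send an alert and build the report; otherwise do nothing."""
    scenario = detect(metrics)
    if scenario is None:
        return None
    notify(f"ALERT {scenario} on {account}")
    return ThreatReport(scenario, datetime.now(timezone.utc), account, dict(metrics))
```

Passing `notify` as a callable keeps the sketch independent of the delivery channel, so the same logic could feed an email gateway, a short message service or a management console.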
It is assumed that a nationwide company needs to ensure the security and high availability of its business data. The company deploys a business data security management system based on big data, and the specific implementation steps are as follows:
The system determines the data encryption intensity according to the risk level. Supposing that the risk level of a certain business activity data stream is high, the system selects the AES-256 encryption algorithm and encrypts with a 256-bit key. The encrypted business activity data stream is divided into a plurality of data segments, and the storage position of each data segment is determined by a hash allocation strategy. Supposing that there are three data centers in total (A, B and C), the hash function assigns data segment 1 to data center A, data segment 2 to data center B and data segment 3 to data center C. The data segments are managed and stored using the Hadoop HDFS distributed storage technology to ensure high availability and redundancy of the data. The system monitors all activity indexes in the business activity data stream in real time and identifies potential threat behaviors through an anomaly detection algorithm. Supposing that more than 5 login attempts are made on a certain account within one hour, the system marks the account as a potential threat. The system sets a predefined threat scene rule set and, when the potential threat behavior meets any condition in the rule set, judges that a predefined threat scene is detected. When the predefined threat scene is detected, the system triggers the early warning mechanism, sends an alarm notification to a preset contact person or management system, and records the occurrence time and specific conditions of the threat scene. The system automatically generates a detailed threat report, which includes the specific manifestation, occurrence time, influence range and possible cause analysis of the threat scene. The system then automatically selects and executes corresponding protective measures according to the preset security policy. In this case, the system suspends the login authority of the account and starts the data recovery flow. Through an automated script or a preconfigured workflow engine, the system implements the protective measures and monitors their execution state to ensure that they are effectively executed.
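The "more than 5 login attempts within one hour" condition from this example can be sketched with a sliding window. The class name `LoginAttemptMonitor` is hypothetical, and the sketch assumes that timestamps are given in seconds and arrive in non-decreasing order for each account.

```python
from collections import defaultdict, deque


class LoginAttemptMonitor:
    """Flag an account once it exceeds max_attempts within a sliding window."""

    def __init__(self, max_attempts: int = 5, window_seconds: int = 3600):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # account -> recent timestamps

    def record(self, account: str, timestamp: float) -> bool:
        """Record one attempt; return True if the account is now a potential threat.

        Assumption: timestamps are in seconds and non-decreasing per account.
        """
        recent = self._attempts[account]
        recent.append(timestamp)
        # Drop attempts that have fallen out of the one-hour window.
        while recent and timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_attempts
```

Six attempts one minute apart trip the flag on the sixth call, whereas attempts spaced fifteen minutes apart never accumulate more than four inside the window and are not flagged.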
Through the above-mentioned alternative scheme, the system not only can dynamically adjust the encryption intensity according to the risk level of the data and improve the flexibility of data protection, but also can realize the high availability and redundancy of the data through the data segmentation and hash allocation strategy, so as to ensure that the data can still be recovered from other data centers even if a certain data center fails. In addition, the system can also monitor all activity indexes in the business activity data stream in real time, identify potential threat behaviors through an anomaly detection algorithm, automatically trigger an early warning mechanism and execute corresponding protective measures, and improve the threat response speed and effect while ensuring the data safety.
The application recognizes that existing business data security management systems can realize the basic functions of data encryption and distributed storage, but still have some problems in practical application. Existing systems generally use a fixed segmentation strategy to divide data segments and cannot segment flexibly according to the dynamic characteristics of the data stream (such as total size, importance and sensitivity). The distribution of data segments among different data centers is often unbalanced, so that some data centers are overloaded while others are underused. Existing systems also give insufficient consideration to data redundancy and cannot quickly recover data when a single data center fails.
In order to solve the technical problems, the embodiment of the invention provides an alternative scheme, and by introducing the data segmentation factors and the correction mechanism thereof, the more intelligent and dynamic data segmentation and storage strategies are realized.
The alternative scheme is as follows:
optionally, the splitting the encrypted business data stream into a plurality of data segments by using a data splitting algorithm includes:
Defining a data segmentation factor, and based on the data segmentation factor, segmenting the encrypted business activity data stream into a plurality of data segments by using a data segmentation algorithm;
Wherein the data segmentation factor D(t) is determined according to the total size T(t) of the business data stream, the importance factor I(t) of the business data stream, the sensitivity factor S(t) of the business data stream, the historical access frequency F(t) of the business data stream, the data volume V(t) of the business data stream and the time t:
D(t)=g(T(t),I(t),S(t),F(t),V(t),t)
the data segmentation factor D (t) is calculated by the following formula:
Where n(t) represents the number of data segments into which the business data stream is partitioned at time t, $x_i(t)$ represents $I(t)$, $S(t)$, $\log(F(t)+1)$ and $\sqrt{V(t)}$, and $w_i(t)$ is a time-varying weight coefficient that satisfies $\sum_i w_i(t) = 1$.
Further comprises:
Correcting the data division factor D(t) based on a preset redundancy factor $R_f(t)$, a data distribution balance factor DBF(t), a load fluctuation factor $L_f(t)$ and a load trend factor $LH_f(t)$;
the corrected data division factor D(t) is calculated from the following quantities:
T(t) represents the total size of the business data stream at time t, and n(t) represents the number of data segments into which the business data stream is partitioned at time t;
$\sum_i w_i(t)\, x_i(t)$ represents the result of a weighted summation of different attributes (e.g., importance, sensitivity, logarithmic transformation of historical access frequency, square root transformation of data volume), where $w_i(t)$ is a time-varying weight coefficient and $x_i(t)$ is the corresponding attribute at time t;
$R_f(t)$ is a redundancy factor representing the number of additional data segments added at time t to improve data reliability, and its calculation uses reed(t), a time-varying coefficient for adjusting the redundancy proportion;
DBF(t) is a data distribution balance factor measuring whether the data distribution between the data centers is balanced, and its calculation uses K, the number of data centers, $C_k(t)$, the load of the k-th data center at time t, and $\bar{C}(t)$, the average load of all data centers at time t;
$L_f(t)$ is a load fluctuation factor representing the degree of load fluctuation among the data centers;
$LH_f(t)$ is a load trend factor measuring the trend of the data center load over time;
C(t) represents a reference load value at time t, usually a small positive value that avoids a zero denominator;
$\delta$ and $\eta$ are adjustment coefficients for adjusting the influence of DBF(t) and $LH_f(t)$, respectively.
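The load quantities named above, the per-center load $C_k(t)$ and the average load $\bar{C}(t)$, can be illustrated with a small sketch. The exact expressions for DBF(t) and $L_f(t)$ are not available in this text, so the relative spread computed below (population standard deviation divided by the mean) is an assumed stand-in for a balance measure, not the formula of this application; the function names are hypothetical.

```python
from statistics import fmean, pstdev


def load_summary(loads: dict) -> dict:
    """Summarise data center loads C_k(t) at one point in time.

    Assumption: relative spread (std / mean) is used as an illustrative
    imbalance measure; 0.0 means the load is perfectly balanced.
    """
    values = list(loads.values())
    mean = fmean(values)       # average load over the K data centers
    spread = pstdev(values)    # how far individual loads sit from the average
    relative = spread / mean if mean > 0 else 0.0
    return {"mean": mean, "spread": spread, "relative_spread": relative}


def least_loaded(loads: dict) -> str:
    """Pick the data center with the lowest current load for the next segment."""
    return min(loads, key=loads.get)
```

A placement policy could consult such a summary before storing new segments, steering them toward the least loaded data center whenever the relative spread grows.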
It is assumed that a nationwide company needs to ensure the security and high availability of its business data. The company deploys a business data security management system based on big data, and the specific implementation steps are as follows:
Assume that, for a certain business data stream at time t, the total size T(t) = 1000 MB, the importance factor I(t) = 0.8, the sensitivity factor S(t) = 0.9, the historical access frequency F(t) = 5, the data volume V(t) = 1000 and the number of data segments n(t) = 10, and that the weight coefficients $w_i(t)$ are [0.2, 0.1, 0.3, 0.15, 0.25]. The weighted sum of the attributes and the data segmentation factor D(t) are calculated from these values.
Assuming further that the redundancy factor $R_f(t) = 5$, the data distribution balance factor DBF(t) = 0.8, the load fluctuation factor $L_f(t) = 0.5$, the load trend factor $LH_f(t) = 0.3$, and the adjustment coefficients $\delta = 0.5$ and $\eta = 0.5$, the corrected data division factor is $D(t) \approx 165.14$.
According to the corrected data division factor $D(t) \approx 165.14$, the system divides the data stream into a plurality of data segments and determines their storage positions through a hash allocation strategy. The data segments are stored dispersedly in different data centers, so that even if one data center fails, the data can still be recovered from the other data centers.
Through the above-mentioned alternative scheme, the system not only can dynamically adjust the data segmentation strategy according to the dynamic characteristics (such as total size, importance, sensitivity and the like) of the data stream, but also can ensure the balanced distribution and high availability of the data among a plurality of data centers through the redundancy factor, the data distribution balance factor, the load fluctuation factor and the load trend factor. In addition, by introducing the factors, the system can better adapt to the change of the data flow, ensure the rationality of data segmentation and the high efficiency of storage, thereby effectively solving the problems of static segmentation, unbalanced storage and the like in the prior art.
105. When a predefined threat scene is detected, an early warning mechanism is automatically triggered and a report is generated, and corresponding protective measures are executed according to the security policy.
The key of this step is that when a predefined threat scenario is detected, the system can automatically take a series of measures, including triggering an early warning mechanism, generating a detailed threat report, and executing corresponding safeguards according to a preset security policy. This process is intended to ensure that the relevant personnel are informed in time when the threat occurs and that effective action is taken to mitigate or eliminate the risk of the threat.
Assume that an online payment platform needs to protect the transaction data and account information of its users. The platform deploys a business data security management system based on big data, which automatically triggers an early warning mechanism and executes protective measures when a predefined threat scene is detected. The system monitors all activity indexes in the business activity data stream in real time and identifies potential threat behaviors through an anomaly detection algorithm, for example a large number of abnormal login attempts on a certain account or an abnormally large transaction amount within a short time. The system sets a predefined threat scene rule set, which contains a plurality of known threat patterns such as consecutive failed login attempts, abnormally large transactions and operations in abnormal time periods. When the detected potential threat behavior meets any condition in the threat scene rule set, the system determines that a predefined threat scene is detected. The system then automatically triggers the early warning mechanism, sends an instant alarm notification to a preset contact person or management system, and records the occurrence time and specific conditions of the threat scene. For example, the system can send an email or short message alarm to the security team and generate a warning message on the console. The system automatically generates a detailed threat report document, which includes the specific manifestation, occurrence time, influence range and possible cause analysis of the threat scene, together with suggested handling measures and a next-step plan so that the security team can respond quickly. The system automatically selects and executes corresponding protective measures according to the preset security policy; the protective measures may include isolating affected data resources, suspending related permissions and starting a data recovery process. Through an automated script or a preconfigured workflow engine, the system ensures that the protective measures are effectively executed and continuously monitors their execution state.
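The policy-driven execution and monitoring of protective measures can be sketched as a small dispatcher. The policy table, the measure names and the function `run_safeguards` are hypothetical; the sketch stands in for the automated script or workflow engine mentioned above and is not a description of any particular engine.

```python
from typing import Callable, Dict, List


def run_safeguards(scenario: str,
                   policy: Dict[str, List[str]],
                   actions: Dict[str, Callable[[], None]]) -> Dict[str, str]:
    """Run every measure the policy lists for a scenario and track its state.

    Returns a mapping from measure name to "done" or "failed: <reason>",
    so the caller can monitor whether each measure actually took effect.
    """
    status = {}
    for name in policy.get(scenario, []):
        try:
            actions[name]()
            status[name] = "done"
        except Exception as exc:  # keep going so later measures still run
            status[name] = f"failed: {exc}"
    return status
```

Collecting a per-measure status rather than stopping at the first error means that a failed data recovery step does not prevent the account suspension from being applied, and the failure remains visible for follow-up.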
Fig. 2 is a schematic structural diagram of a business data security management system based on big data according to an embodiment of the present application, and as shown in fig. 2, the device includes:
a receiving module 21 for receiving a stream of business activity data from an internal system of an enterprise;
The analysis module 22 is configured to dynamically analyze abnormal behavior patterns in the business activity data stream, and evaluate risk levels of the business activity data stream in real time according to a preset security policy;
a control module 23, configured to automatically adjust data encryption intensity according to the risk level, and implement fine-grained access control on high-risk data;
a storage module 24, configured to dispersedly store the encrypted business activity data stream in a plurality of data centers with different physical positions through a distributed storage technology;
an early warning module 25, configured to automatically trigger an early warning mechanism and generate a report when a predefined threat scene is detected, and to execute corresponding protective measures according to the security policy.
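How the five modules could be chained can be sketched as follows. The callable signatures, the alert threshold of 0.8 and the class name `Pipeline` are assumptions introduced for illustration; the module numbers in the comments refer to FIG. 2.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Pipeline:
    analyze: Callable[[dict], float]         # analysis module 22: event -> risk level
    choose_key_bits: Callable[[float], int]  # control module 23: risk -> key length
    store: Callable[[dict, int], None]       # storage module 24: encrypt and store
    warn: Callable[[dict, float], None]      # early warning module 25
    alert_threshold: float = 0.8             # hypothetical alerting threshold

    def run(self, events: Iterable[dict]) -> int:
        """Receiving module 21: consume the stream and pass each event along."""
        processed = 0
        for event in events:
            risk = self.analyze(event)
            self.store(event, self.choose_key_bits(risk))
            if risk >= self.alert_threshold:
                self.warn(event, risk)
            processed += 1
        return processed
```

Each module is injected as a plain callable, so an individual module can be replaced or tested in isolation without changing the others.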
The business data security management system based on big data shown in fig. 2 may implement a business data security management method based on big data shown in the embodiment shown in fig. 1, and its implementation principle and technical effects are not repeated. The specific manner in which the various modules and units perform operations in the big data based business data security management system in the above embodiments has been described in detail in the embodiments related to the method, and will not be described in detail herein.
In one possible design, a big data based business data security management apparatus of the embodiment of FIG. 2 may be implemented as a computing device, as shown in FIG. 3, which may include a storage component 31 and a processing component 32;
The storage component 31 stores one or more computer instructions for execution by the processing component 32.
The processing component 32 is configured to: receive a business data stream from an internal system of an enterprise; dynamically analyze abnormal behavior patterns in the business data stream and evaluate the risk level of the business data stream in real time according to a preset security policy; automatically adjust the data encryption strength according to the risk level and implement fine-grained access control on high-risk data; encrypt the business data stream based on the data encryption strength; and, when a predefined threat scenario is detected, automatically trigger an early warning mechanism and generate a report while executing corresponding safeguards according to the security policy.
Wherein the processing component 32 may include one or more processors to execute computer instructions to perform all or part of the steps of the methods described above. Of course, the processing component may also be implemented as one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic elements for executing the methods described above.
The storage component 31 is configured to store various types of data to support operations at the terminal. The memory component may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
Of course, the computing device may also include other components, such as input/output interfaces, display components, communication components, and the like.
The input/output interface provides an interface between the processing component and a peripheral interface module, which may be an output device, an input device, etc.
The communication component is configured to facilitate wired or wireless communication between the computing device and other devices, and the like.
The computing device may be a physical device or an elastic computing host provided by the cloud computing platform, and at this time, the computing device may be a cloud server, and the processing component, the storage component, and the like may be a base server resource rented or purchased from the cloud computing platform.
The embodiment of the application also provides a computer storage medium which stores a computer program, and the computer program can realize the business data security management method based on big data in the embodiment shown in the figure 1 when being executed by a computer.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without undue effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
It should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present application, and not for limiting the same, and although the present application has been described in detail with reference to the above-mentioned embodiments, it should be understood by those skilled in the art that the technical solution described in the above-mentioned embodiments may be modified or some technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the spirit and scope of the technical solution of the embodiments of the present application.