
CN118469273A - Abnormal behavior and risk detection method and system in cloud environment and storage medium

Info

Publication number
CN118469273A
Authority
CN
China
Prior art keywords
data
risk
model
module
target processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410403252.2A
Other languages
Chinese (zh)
Inventor
顾斌
刘涛
于中阳
王亚菁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Xinsaiyun Computing Technology Co ltd
Shanghai Jimu Galaxy Digital Technology Co ltd
Original Assignee
Shanghai Xinsaiyun Computing Technology Co ltd
Shanghai Jimu Galaxy Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Xinsaiyun Computing Technology Co ltd, Shanghai Jimu Galaxy Digital Technology Co ltd
Priority to CN202410403252.2A
Publication of CN118469273A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method, a system, and a storage medium for detecting abnormal behaviors and risks in a cloud environment. In the method, a data collection module collects data to be processed, comprising login information data, resource utilization data, and system configuration data; the data to be processed is preprocessed to obtain a training set and a test set; a model training module selects an SVM model as the target processing model; a risk assessment module uses the test set to verify and evaluate the trained target processing model, calculating the accuracy and F1 score of the current target processing model as its performance indicators; and the model training module performs feature extraction on data newly generated within a preset time period to obtain an updated training set and further trains the original target processing model, so that the target processing model is updated periodically. The method improves prediction accuracy.

Description

Abnormal behavior and risk detection method and system in cloud environment and storage medium
Technical Field
The invention relates to the technical field of cloud security application, in particular to a method and a system for detecting abnormal behaviors and risks in a cloud environment and a storage medium.
Background
With the advance of new infrastructure construction and the accelerating application of information technology, cloud computing has developed rapidly and been widely adopted. More and more enterprises and individuals choose to migrate their business and data into the cloud for greater flexibility, scalability, and efficiency. As a distributed network system, cloud computing can process massive amounts of data in a relatively short time, and it has become an important transformation and a necessary direction of development in China's informatization field. The idea of moving enterprises to the cloud continues to gain ground, and enterprises and institutions are actively deploying software applications on powerful cloud servers to accelerate their digital transformation.
However, cloud server security is drawing growing concern. As network security risks increase, attacks against cloud environments are growing in both frequency and variety. Once a server is attacked, the resulting losses can be huge and hard to recover. First, business secrets and private user data may be obtained illegally, leading to data leakage and loss and destroying users' trust in the enterprise. Second, large-scale attacks may bring application servers down, causing major online service failures for users and economic losses for the enterprise. Finally, leaked data may be used for illegal activities, seriously affecting residents and public safety.
Therefore, security monitoring of cloud environments is a very urgent problem. Traditional security monitoring methods rely primarily on rule-based static analysis and signature detection techniques, which generally fail to discover new, unknown security threats in a timely manner. In addition, due to the dynamic and highly virtualized nature of the cloud environment, conventional security detection methods are often difficult to apply in the cloud environment.
Disclosure of Invention
The technical scheme adopted by the embodiment of the invention relates to a method and a system for detecting abnormal behaviors and risks in a cloud environment and a storage medium. According to the method, a support vector machine is used for model training, abnormal behaviors and risks are identified, and risk levels are determined through risk level assessment.
The invention aims to provide an abnormal behavior and risk detection method in a cloud environment, and solves the technical problems pointed out in the prior art.
The invention provides a method for detecting abnormal behaviors and risks in a cloud environment, which comprises the following operation steps:
the data collection module collects data to be processed, the collected data comprising login information data, resource utilization data, and system configuration data;
the data to be processed is preprocessed to obtain a training set and a test set;
the model training module selects an SVM model as the target processing model;
the model training module initializes the target processing model and trains it with the training set to obtain a trained target processing model;
the risk assessment module uses the test set to verify and evaluate the trained target processing model, calculating the accuracy and F1 score of the current target processing model as its performance indicators;
after the target processing model has been evaluated, the trained target processing model is deployed into the cloud computing environment, and the running state of the system and the input data of user behavior are monitored in real time;
the model training module performs feature extraction on data newly generated within a preset time period to obtain an updated training set, and further trains the original target processing model so as to update it periodically.
Preferably, as an embodiment, the login information data includes: login frequency, login IP address, commonly used IP address, login geographic location, login time, and login device;
the resource utilization data includes: CPU utilization, memory utilization, network bandwidth usage, file access characteristics, network traffic characteristics, traffic direction, and behavior pattern data; the abnormal behavior patterns mainly comprise frequently changing behavior patterns, unusual time patterns, abnormal data access patterns, and abnormal file upload behavior;
the system configuration data includes: software version information, network configuration information, and security configuration information.
Preferably, as an embodiment, preprocessing the data to be processed to obtain a training set and a test set specifically comprises the following steps:
the data encoding processing module performs behavior encoding on the text-type and behavior-type login information data to obtain encoded login information data features;
the data cleaning processing module cleans the resource utilization data and the system configuration data to obtain cleaned data features;
the data standardization processing module standardizes the login information data features and the cleaned data features respectively to obtain standardized feature data;
the data marking and dividing processing module labels the standardized feature data and then divides the data set to obtain a training set and a test set.
Preferably, as an embodiment, accuracy of the current target processing model = number of correctly predicted samples / total number of samples × 100%;
wherein the number of correctly predicted samples is the number of samples the model classifies correctly on the test set, and the total number of samples is the total number of samples in the test set;
F1 score of the current target processing model: $F1 = \frac{2 \times TP}{2 \times TP + FP + FN}$;
TP is the number of true positive samples, i.e. the number of positive-class samples the model correctly predicts as positive; FP is the number of false positive samples, i.e. the number of negative-class samples the model erroneously predicts as positive; FN is the number of false negative samples, i.e. the number of positive-class samples the model erroneously predicts as negative.
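As a minimal sketch of the two performance indicators defined above, the following Python function computes accuracy (in percent) and the F1 score from true and predicted labels. The function name, the sample labels, and the choice of 1 as the positive class are illustrative assumptions and are not taken from the method itself.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy in percent, F1 score) for two equal-length label lists.

    `positive` names the label treated as the positive class (an assumption here).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) * 100.0
    denominator = 2 * tp + fp + fn
    # When there are no positives at all, F1 is undefined; 0.0 is returned by convention.
    f1 = (2 * tp) / denominator if denominator else 0.0
    return accuracy, f1


# Hypothetical test-set labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

In this toy example TP = 3, FP = 1, and FN = 1, so both indicators can be checked by hand against the formulas above.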
Preferably, as an embodiment, the training set and the test set are divided according to a preset proportion, specifically: 70% of the data is used as the training set and 30% of the data is used as the test set.
Preferably, as an embodiment, deploying the trained target processing model into the cloud computing environment after the target processing model has been evaluated, and monitoring the running state of the system and the input data of user behavior in real time, specifically comprises the following steps:
the running state of the system and user behavior are monitored in real time, and when the prediction result is normal, the current user behavior is allowed to continue;
when the prediction result is abnormal, the risk assessment module performs a risk level assessment by collecting statistics on N prediction results for the current user behavior within a preset time:
when the ratio of the number M of abnormal prediction results to the total number N of prediction results is greater than 5% and less than 10%, the risk is judged to be level 1; when the ratio is greater than 10% and less than 20%, the risk is judged to be level 2; when the ratio is greater than 20%, the risk is judged to be level 3;
when the risk is judged to be level 1, only an administrator alert is issued;
when the risk is judged to be level 2, user identity verification is initiated; the user behavior is allowed to continue if verification passes and is blocked if verification fails;
when the risk is judged to be level 3, the user behavior is blocked immediately and an alert is issued.
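The risk grading and response rules above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the returned action strings are hypothetical, a ratio of 5% or less is treated as level 0 (no escalation), and ratios of exactly 10% and 20% are assigned to the higher level, since the text leaves those boundaries open.

```python
def risk_level(m, n):
    """Grade risk from M abnormal predictions out of N predictions in the time window.

    Integer arithmetic is used so that boundary ratios are compared exactly.
    """
    if n <= 0 or m < 0 or m > n:
        raise ValueError("require 0 <= m <= n and n > 0")
    if 100 * m >= 20 * n:   # ratio of 20% or more (boundary handling is an assumption)
        return 3
    if 100 * m >= 10 * n:   # ratio from 10% up to 20%
        return 2
    if 100 * m > 5 * n:     # ratio above 5% and below 10%
        return 1
    return 0                # assumption: 5% or less triggers no escalation


def respond(level, identity_verified=False):
    """Map a risk level to a hypothetical action name."""
    if level == 3:
        return "block_and_alert"
    if level == 2:
        # Level 2 asks the user to verify identity before continuing.
        return "allow" if identity_verified else "block"
    if level == 1:
        return "alert_admin"
    return "allow"


print(risk_level(7, 100), respond(risk_level(7, 100)))
print(risk_level(15, 100), respond(risk_level(15, 100), identity_verified=True))
print(risk_level(25, 100), respond(risk_level(25, 100)))
```

Keeping the grading and the response in separate functions mirrors the split between the evaluation and response steps described in the text.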
The invention provides an abnormal behavior and risk detection system in a cloud environment, which comprises a data processing module, a model training module and a risk assessment module; the data processing module comprises a data collection module;
the data collection module is used for collecting data to be processed, the collected data comprising login information data, resource utilization data, and system configuration data;
the data processing module is used for preprocessing the data to be processed to obtain a training set and a test set;
the model training module is used for selecting an SVM model as the target processing model, initializing the target processing model, and training it with the training set to obtain a trained target processing model;
the risk assessment module is used for verifying and evaluating the trained target processing model with the test set, calculating the accuracy and F1 score of the current target processing model as its performance indicators; and, after the target processing model has been evaluated, deploying the trained target processing model into the cloud computing environment and monitoring the running state of the system and the input data of user behavior in real time;
the model training module is further used for training the original target processing model on an updated training set, obtained by feature extraction from data newly generated within a preset time period, so as to update the target processing model periodically, adapt it to new data and requirements, and keep the model effective and accurate over time.
Preferably, as an embodiment, the data processing module further comprises a data encoding processing module, a data cleaning processing module, a data standardization processing module, and a data marking and dividing processing module, wherein:
the data encoding processing module is used for performing behavior encoding on the text-type and behavior-type login information data to obtain encoded login information data features;
the data cleaning processing module is used for cleaning the resource utilization data and the system configuration data to obtain cleaned data features;
the data standardization processing module is used for standardizing the login information data features and the cleaned data features respectively to obtain standardized feature data;
the data marking and dividing processing module is used for labeling the standardized feature data and then dividing the data set to obtain a training set and a test set.
Preferably, as an embodiment, the risk assessment module further comprises an evaluation sub-module and a response time sub-module.
Preferably, as an embodiment, the evaluation sub-module is used for monitoring the running state of the system and user behavior in real time, and allowing the current user behavior to continue when the prediction result is normal; when the prediction result is abnormal, a risk level assessment is performed by collecting statistics on N prediction results for the current user behavior within a preset time: when the ratio of the number M of abnormal prediction results to the total number N of prediction results is greater than 5% and less than 10%, the risk is judged to be level 1; when the ratio is greater than 10% and less than 20%, the risk is judged to be level 2; when the ratio is greater than 20%, the risk is judged to be level 3;
the response time sub-module is used for issuing only an administrator alert when the risk is judged to be level 1; initiating user identity verification when the risk is judged to be level 2, allowing the user behavior to continue if verification passes and blocking it if verification fails; and blocking the user behavior immediately and issuing an alert when the risk is judged to be level 3.
The invention provides a storage medium, wherein a computer program is stored on the storage medium, and the computer program realizes the steps of the abnormal behavior and risk detection method in the cloud environment when being executed by a processor.
Compared with the prior art, the embodiment of the invention has at least the following technical advantages:
Analysis of the above method, system, and storage medium for detecting abnormal behaviors and risks in a cloud environment shows that, in a specific application, the support vector machine predicts whether user behavior is normal or abnormal, and, to further improve monitoring, a risk assessment module is added to assess the risk of abnormal behavior, which improves prediction accuracy. Different actions can be executed according to the assessed risk level: identity verification is added to determine whether the user is trustworthy, and high-risk behavior is blocked directly when detected, reducing the impact of abnormal behavior on the cloud environment.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of overall operation steps of a method for detecting abnormal behavior and risk in a cloud environment according to a first embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an abnormal behavior and risk detection system in a cloud environment according to a first embodiment of the present invention.
Reference numerals: a data processing module 10; a data collection module 11; a data encoding processing module 12; a data cleaning processing module 13; a data normalization processing module 14; a data marking and dividing processing module 15; a model training module 20; a risk assessment module 30; an evaluation sub-module 31; a response time sub-module 32.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and fully below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.
The invention will now be described in further detail with reference to specific examples thereof in connection with the accompanying drawings.
Example 1
As shown in fig. 1, the invention further provides a method for detecting abnormal behaviors and risks in a cloud environment, which comprises the following operation steps:
Step S1: the data collection module collects data to be processed, the collected data comprising login information data, resource utilization data, and system configuration data; the data to be processed is preprocessed to obtain a training set and a test set;
Step S2: the model training module selects an SVM model as the target processing model;
Step S3: the model training module initializes the target processing model and trains it with the training set to obtain a trained target processing model;
Step S4: the risk assessment module uses the test set to verify and evaluate the trained target processing model, calculating the accuracy and F1 score of the current target processing model as its performance indicators;
Step S5: after the target processing model has been evaluated, the trained target processing model is deployed into the cloud computing environment, and the running state of the system and the input data of user behavior are monitored in real time.
Step S6: the model training module further trains the original target processing model according to the updated training set obtained by extracting the characteristics of the newly generated data in the preset time period, so as to update the target processing model regularly, thereby adapting to new data and requirements and ensuring the continuous effectiveness and accuracy of the model.
The technical scheme adopted by this embodiment detects abnormal behaviors and risks in the cloud environment more accurately and, after detection, verifies the identity of the actor so that the behavior is blocked or allowed according to its safety.
Preferably, as an embodiment, the login information data includes: login frequency, login IP address, commonly used IP address, login geographic location, login time (weekday/weekend, day/night), and login device;
the resource utilization data includes: CPU utilization; memory utilization; network bandwidth usage; file access characteristics (number of file reads and writes, file size, file type); network traffic characteristics (number of data packets, data packet size, traffic direction (inbound/outbound)); and behavior pattern data (including normal and abnormal behavior patterns). Abnormal behavior patterns mainly include frequently changing behavior patterns (frequent switching of user identity, a large number of login attempts, etc.), unusual time patterns (activity during non-working hours), abnormal data access patterns (unauthorized access, illegal downloads), and abnormal file upload behavior;
the system configuration data includes: software version information; network configuration information (firewall rules, network access control lists, etc.); and security configuration information (cryptographic policies, access control rules, etc.).
Preferably, as an embodiment, preprocessing the data to be processed to obtain a training set and a test set specifically comprises the following steps:
the data encoding processing module performs behavior encoding on the text-type and behavior-type login information data to obtain encoded login information data features;
the data cleaning processing module cleans the resource utilization data and the system configuration data to obtain cleaned data features;
the data standardization processing module standardizes the login information data features and the cleaned data features respectively to obtain standardized feature data;
the data marking and dividing processing module labels the standardized feature data and then divides the data set to obtain a training set and a test set.
The preprocessing operation mainly comprises encoding, cleaning, standardizing, labeling, and dividing the data, which ensures the quality and integrity of the data.
The data preprocessing process mainly comprises three parts:
Text-type and behavior-type data are behavior-encoded so that the support vector machine can read them for training:
Behavior: logging in or out from a commonly used IP is 0; logging in or out over an external IP connection on a working day is 1; logging in or out over an external IP connection on a non-working day is 2; downloading a file is 3; uploading a file is 4; browsing a file is 5; deleting a file is 6; viewing or sending internal or external mail is 7.
Time period: 9:00 to 18:00 is 0, and the rest of the day is 1.
Region: the usual place of residence is 0; any other location is 1.
Behavior mapping feature X = {activity, time, location}
Form a behavioral sequence feature set x= { X1, X2,..
If any one of the behavior, the time period, or the region is abnormal, the sample is treated as abnormal.
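The encoding scheme above can be illustrated with a short sketch that maps one event to the triple {activity, time, location}. The dictionary keys, function name, and sample events are hypothetical; treating 18:00 itself as outside the 9-to-18 period is also an assumption, since the text does not say which endpoint is included.

```python
from datetime import datetime

# Activity codes follow the numbering in the text; the key names are hypothetical.
ACTIVITY_CODES = {
    "login_common_ip": 0,
    "login_external_ip_workday": 1,
    "login_external_ip_non_workday": 2,
    "file_download": 3,
    "file_upload": 4,
    "file_browse": 5,
    "file_delete": 6,
    "mail_view_send": 7,
}


def encode_event(activity, timestamp, location, usual_location):
    """Encode one event as (activity code, time code, location code)."""
    activity_code = ACTIVITY_CODES[activity]
    # 0 for the 9:00-18:00 period, 1 otherwise (18:00 counted as outside: assumption).
    time_code = 0 if 9 <= timestamp.hour < 18 else 1
    # 0 when the event comes from the user's usual place of residence, 1 otherwise.
    location_code = 0 if location == usual_location else 1
    return (activity_code, time_code, location_code)


daytime = encode_event("file_download", datetime(2024, 4, 3, 10, 30), "Shanghai", "Shanghai")
night = encode_event("file_delete", datetime(2024, 4, 6, 23, 0), "Berlin", "Shanghai")
print(daytime, night)
```

A sequence of such triples, one per event, would then form the behavior sequence feature set that is passed on to standardization and training.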
The data cleaning processing module cleans the numerical features: CPU utilization, memory utilization, network bandwidth usage, file access characteristics (number of file reads and writes, file size, file type), and network traffic characteristics (number of data packets, data packet size, traffic direction (inbound/outbound)). The data processing module handles the missing values and outliers in the data.
Identifying missing values: the collected data is defined as {x1, x2, x3, ..., xn}, in which some values may be missing or abnormal; the data processing module identifies the positions of missing values in the data set and fills them.
The embodiment of the invention fills missing values with the mean, calculated as follows:
$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$
where $x_i$ is a non-missing data point in the feature column and N is the number of non-missing values. Once calculated, the mean is filled into each missing position.
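A minimal sketch of mean filling for one feature column is shown below. Representing a missing value as `None`, the function name, and the sample CPU readings are assumptions made for illustration.

```python
def fill_missing_with_mean(column):
    """Replace None entries with the mean of the non-missing values in the column."""
    present = [x for x in column if x is not None]
    if not present:
        raise ValueError("column has no non-missing values to average")
    mean = sum(present) / len(present)  # N is the count of non-missing values
    return [mean if x is None else x for x in column]


# Hypothetical CPU utilization readings (percent) with two gaps.
cpu = [40.0, None, 60.0, 50.0, None]
filled = fill_missing_with_mean(cpu)
print(filled)
```

Here the three observed values average to 50.0, which is written into both gaps while the observed values are left unchanged.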
Data standardization processing module: the data processing module standardizes the data with the Z-score method, converting the features to a common scale and thereby eliminating dimensional differences among them.
The specific calculation formula is as follows:
$Z_i = (x_i - \mu) / \sigma$;
where $x_i$ is a training data point, $Z_i$ is the standardized data point, $\mu$ is the mean of the feature, and $\sigma$ is the standard deviation of the feature.
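The Z-score step can be sketched in a few lines. The function name and sample values are hypothetical, and the use of the population standard deviation (dividing by n rather than n - 1) is an assumption, since the text does not specify which variant is intended.

```python
import math


def z_score(column):
    """Standardize a feature column to zero mean and unit standard deviation."""
    n = len(column)
    if n == 0:
        raise ValueError("column must not be empty")
    mu = sum(column) / n
    # Population standard deviation (an assumption; the sample variant divides by n - 1).
    sigma = math.sqrt(sum((x - mu) ** 2 for x in column) / n)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros instead of dividing by 0.
        return [0.0] * n
    return [(x - mu) / sigma for x in column]


z = z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(z)
```

For this sample the mean is 5 and the standard deviation is 2, so the value 9.0 maps to 2.0 and the value 2.0 maps to -1.5. When new data arrives at monitoring time, it would normally be scaled with the mean and standard deviation computed on the training data.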
Data marking and dividing processing module: the data is labeled as normal behavior or abnormal behavior. Labeling is based mainly on known security events or abnormal behaviors in the historical data.
For example, a user logging in or out over an external IP connection on a non-working day is labeled as abnormal behavior.
After labeling, the data set is divided into a training set and a test set using a common split: 70% of the data is used as the training set and 30% as the test set.
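One way to realize the 70/30 division is sketched below. Shuffling before the split, the fixed seed, and the function name are assumptions; the text only fixes the proportions.

```python
import random


def split_dataset(samples, train_fraction=0.7, seed=0):
    """Shuffle the labeled samples and split them into (training set, test set)."""
    shuffled = list(samples)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility (assumption)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten hypothetical labeled samples, identified here only by index.
train, test = split_dataset(list(range(10)))
print(len(train), len(test))
```

With ten samples this yields seven for training and three for testing, and every sample lands in exactly one of the two sets.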
Preferably, as an embodiment, accuracy of the current target processing model = number of correctly predicted samples / total number of samples × 100%;
wherein the number of correctly predicted samples is the number of samples the model classifies correctly on the test set, and the total number of samples is the total number of samples in the test set;
F1 score of the current target processing model: $F1 = \frac{2 \times TP}{2 \times TP + FP + FN}$;
TP is the number of true positive samples, i.e. the number of positive-class samples the model correctly predicts as positive; FP is the number of false positive samples, i.e. the number of negative-class samples the model erroneously predicts as positive; FN is the number of false negative samples, i.e. the number of positive-class samples the model erroneously predicts as negative.
Preferably, as an embodiment, the training set and the test set are divided according to a preset proportion, specifically: 70% of the data is used as the training set and 30% of the data is used as the test set.
Preferably, as an embodiment, in a specific implementation of step S2, the model training module selects the SVM model as the target processing model. The invention selects the Gaussian radial basis function (RBF) as the kernel for model training. The Gaussian RBF is one of the most commonly used kernel functions and suits most cases, especially those where the data distribution is complex and hard to separate. It implicitly maps data into an infinite-dimensional feature space, so it can flexibly fit decision boundaries of many shapes and has strong fitting capability.
In support vector machines, the anomaly detection problem can be implemented by a classification hyperplane, while gaussian radial basis functions are used to map the input data to a high-dimensional feature space, making it easier for the data to be separated in this space. The formula for anomaly detection can be expressed as:
wherein: f(x) is the predicted label, representing the predicted category (normal or abnormal) of the input data point x; α_i is the Lagrange multiplier of the support vector machine; y_i is the class label of the training data point x_i, taking the value 1 (normal) or -1 (abnormal); K(x_i, x) is the Gaussian radial basis function, measuring the similarity between the input data point x and the training data point x_i; ρ is the threshold of the model, used to determine whether the input data point x is abnormal behavior.
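The expression itself is not reproduced in this text. A form consistent with the symbols defined here is the standard kernel SVM decision function, given as an assumption rather than as the exact formula of the invention; the summation limit n (the number of training points) and the kernel width γ are likewise not named in the source:

```latex
f(x) = \operatorname{sgn}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(x_i, x) - \rho \right),
\qquad
K(x_i, x) = \exp\!\left( -\gamma \, \lVert x_i - x \rVert^{2} \right)
```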
In the anomaly detection process, whether the input data point x is abnormal is determined according to the prediction result of the model and the threshold ρ. In general, when the result of comparing the value of f(x) with the threshold ρ exceeds a predetermined threshold, the input data point x is predicted to be abnormal.
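As a numerical illustration of a thresholded kernel prediction, the sketch below evaluates a weighted sum of Gaussian radial basis function values and compares it with ρ. It assumes the usual SVM convention in which a score below zero is labeled abnormal (-1); the support vectors, multipliers, ρ and the kernel width `gamma` are hypothetical values, not parameters taken from the invention.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian radial basis function K(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def decision_score(x, support_vectors, alphas, labels, rho, gamma=0.5):
    """Sum of alpha_i * y_i * K(x_i, x) over the support vectors, minus rho."""
    total = sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, alpha, y in zip(support_vectors, alphas, labels)
    )
    return total - rho


def predict(x, support_vectors, alphas, labels, rho, gamma=0.5):
    """Return 1 (normal) when the score is non-negative, otherwise -1 (abnormal)."""
    score = decision_score(x, support_vectors, alphas, labels, rho, gamma)
    return 1 if score >= 0 else -1


# Hypothetical trained parameters: one normal and one abnormal support vector.
support_vectors = [(0.0, 0.0), (3.0, 3.0)]
alphas = [1.0, 1.0]
labels = [1, -1]
rho = 0.0

print(predict((0.1, 0.1), support_vectors, alphas, labels, rho))  # near the normal vector
print(predict((2.9, 3.1), support_vectors, alphas, labels, rho))  # near the abnormal vector
```

A point close to the normal support vector receives a positive score, while a point close to the abnormal support vector receives a negative one.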
Preferably, as an embodiment, after the target processing model has been evaluated and found to be free of abnormalities, the trained target processing model is deployed into the cloud computing environment, and the running state of the system and the input data of user behaviors are monitored in real time, which specifically comprises the following steps:
monitoring the running state of the system and user behavior in real time, and allowing the current user behavior to continue when the prediction result is normal;
when the prediction result is abnormal, the risk assessment module performs risk level assessment by collecting statistics on the N prediction results for the current user behavior within a preset time:
when the proportion of the number M of prediction results indicating abnormal user behavior to the number N of all prediction results is greater than 5% and less than 10%, the risk is judged to be a level 1 risk; when the proportion is greater than 10% and less than 20%, the risk is judged to be a level 2 risk; when the proportion is greater than 20%, the risk is judged to be a level 3 risk;
when the risk assessment is judged to be a level 1 risk, only an administrator alarm reminder is initiated;
when the risk assessment is judged to be a level 2 risk, user identity verification is initiated; the user behavior is allowed to continue if the identity verification passes and is blocked if the identity verification fails;
when the risk assessment is judged to be a level 3 risk, the user behavior is immediately blocked and an alarm reminder is initiated.
In addition, when more than Mmax abnormal behaviors are initiated at the same time, the situation is also treated as a level 3 risk, that is, access is immediately blocked and an alarm is raised.
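The grading and response rules can be sketched as two small functions. Several points are assumptions, because the text leaves them open: a ratio of 5% or less is treated as level 0 (no action), a ratio of exactly 10% or 20% is assigned to the higher level, and the action names are hypothetical labels.

```python
def risk_level(abnormal_count, total_count, concurrent_abnormal=0, m_max=None):
    """Grade risk from M abnormal predictions out of N predictions.

    Assumptions not stated in the source: ratios of 5% or less map to
    level 0, and exact boundary values (10%, 20%) go to the higher level.
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    # More than m_max simultaneous abnormal behaviors is treated as level 3.
    if m_max is not None and concurrent_abnormal > m_max:
        return 3
    # Integer comparisons avoid floating-point issues at the boundaries.
    if abnormal_count * 100 >= 20 * total_count:
        return 3
    if abnormal_count * 100 >= 10 * total_count:
        return 2
    if abnormal_count * 100 > 5 * total_count:
        return 1
    return 0


def respond(level, identity_verified=False):
    """Map a risk level to an action name (the names are hypothetical)."""
    if level == 1:
        return "alert_admin"
    if level == 2:
        # Identity verification is initiated; its outcome decides the action.
        return "allow" if identity_verified else "block"
    if level == 3:
        return "block_and_alert"
    return "allow"


print(risk_level(7, 100), respond(risk_level(7, 100)))
print(risk_level(15, 100), respond(risk_level(15, 100), identity_verified=True))
print(risk_level(25, 100), respond(risk_level(25, 100)))
```

Keeping the grading separate from the response makes it possible to adjust the percentage thresholds without touching the actions taken at each level.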
Example two
The second embodiment of the invention provides an abnormal behavior and risk detection system in a cloud environment, which comprises a data processing module 10, a model training module 20 and a risk assessment module 30; the data processing module comprises a data collecting module 11;
The data collection module 11 is used for collecting data to be processed; the collected data to be processed comprises: login information data, resource utilization data and system configuration data;
the data processing module 10 is configured to preprocess the data to be processed to obtain a training set and a testing set;
The model training module 20 is configured to select an SVM model as the target processing model; the model training module starts the target processing model and trains it by using the training set to obtain a trained target processing model;
The risk assessment module 30 is configured to perform verification and evaluation on the trained target processing model by using the test set, and to calculate the accuracy and the F1 score index of the current target processing model as performance indexes of the target processing model during the verification and evaluation; after the target processing model has been evaluated and found to be free of abnormalities, the trained target processing model is deployed into the cloud computing environment, and the running state of the system and the input data of user behaviors are monitored in real time;
The model training module 20 is further configured to perform feature extraction according to newly generated data in a preset time period to obtain an updated training set, further train an original target processing model, and update the target processing model periodically, so as to adapt to new data and requirements, and ensure continuous validity and accuracy of the model.
Preferably, as an embodiment, the data processing module 10 further includes a data encoding processing module 12, a data cleaning processing module 13, a data standardization processing module 14, and a data marking and dividing processing module 15, wherein:
the data encoding processing module 12 is configured to perform behavior encoding on the text-based and behavior-based login information data to obtain encoded login information data characteristics;
the data cleaning processing module 13 is configured to perform cleaning processing on the resource utilization rate data and the system configuration data to obtain cleaned data features;
The data standardization processing module 14 is configured to perform standardization processing on the login information data features and the cleaned data features respectively to obtain standardized feature data;
the data marking and dividing processing module 15 is configured to perform feature marking on the standardized feature data, and then perform data set dividing processing to obtain a training set and a testing set.
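The encoding, cleaning and standardization stages handled by these sub-modules could be sketched as below. The concrete techniques (integer coding of text values, dropping records with missing values, z-score standardization) and the sample values are assumptions chosen for illustration; this passage does not name the specific methods.

```python
import statistics


def encode_categories(values):
    """Map each distinct text value to an integer code, in order of first appearance."""
    codes = {}
    return [codes.setdefault(value, len(codes)) for value in values]


def clean(rows):
    """Drop rows that contain a missing (None) value."""
    return [row for row in rows if all(v is not None for v in row)]


def standardize(column):
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]


# Hypothetical login devices (text-type login information data).
devices = ["laptop", "phone", "laptop", "tablet"]
device_codes = encode_categories(devices)

# Hypothetical (CPU %, memory %) resource utilization rows, one with a gap.
usage_rows = clean([(35.0, 60.0), (None, 55.0), (90.0, 80.0)])
cpu_z = standardize([row[0] for row in usage_rows])

print(device_codes)
print(usage_rows)
print(cpu_z)
```

The standardized features would then be marked with labels and divided into the training set and the testing set.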
Preferably, as an embodiment, the risk assessment module further comprises an evaluation sub-module 31 and a response time sub-module 32.
Preferably, as an embodiment, the evaluation sub-module 31 is configured to monitor the running state of the system and user behavior in real time, and to allow the current user behavior to continue when the prediction result is normal; when the prediction result is abnormal, the risk assessment module performs risk level assessment by collecting statistics on the N prediction results for the current user behavior within a preset time: when the proportion of the number M of prediction results indicating abnormal user behavior to the number N of all prediction results is greater than 5% and less than 10%, the risk is judged to be a level 1 risk; when the proportion is greater than 10% and less than 20%, the risk is judged to be a level 2 risk; when the proportion is greater than 20%, the risk is judged to be a level 3 risk;
the response time sub-module 32 is configured to initiate only an administrator alarm reminder when the risk assessment is judged to be a level 1 risk; to initiate user identity verification when the risk assessment is judged to be a level 2 risk, allowing the user behavior to continue if the identity verification passes and blocking it if the identity verification fails; and to immediately block the user behavior and initiate an alarm reminder when the risk assessment is judged to be a level 3 risk.
The invention provides a storage medium, wherein a computer program is stored on the storage medium, and the computer program realizes the steps of the abnormal behavior and risk detection method in the cloud environment when being executed by a processor.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; modifications of the technical solutions described in the foregoing embodiments, or equivalent substitutions of some or all of the technical features thereof, may be made by those of ordinary skill in the art; such modifications and substitutions do not depart from the spirit of the invention.

Claims (11)

1. The abnormal behavior and risk detection method in the cloud environment is characterized by comprising the following operation steps:
a data collection module collects data to be processed; the collected data to be processed comprises: login information data, resource utilization data and system configuration data;
the data to be processed is preprocessed to obtain a training set and a testing set;
a model training module selects an SVM model as a target processing model;
the model training module starts the target processing model and trains the target processing model by using the training set to obtain a trained target processing model;
a risk assessment module uses the test set to perform verification and evaluation on the trained target processing model, and during the verification and evaluation, the accuracy and the F1 score index of the current target processing model are calculated and used as performance indexes of the target processing model;
after the target processing model has been evaluated and found to be free of abnormalities, the trained target processing model is deployed into a cloud computing environment, and the running state of the system and the input data of user behaviors are monitored in real time;
the model training module performs feature extraction on newly generated data within a preset time period to obtain an updated training set, and further trains the original target processing model so as to periodically update the target processing model.
2. The method for detecting abnormal behavior and risk in a cloud environment according to claim 1, wherein the login information data includes: login frequency, login IP address, commonly used IP address, login geographic location, login time and login device;
the resource utilization data includes: CPU utilization, memory utilization, network bandwidth utilization information, file access characteristics, network traffic characteristics and flow direction, and behavior pattern data; the abnormal behavior patterns mainly comprise: frequently changing behavior patterns, unusual time patterns, abnormal data access patterns and abnormal file upload behavior;
the system configuration data includes: software version information; network configuration information; security configuration information.
3. The method for detecting abnormal behaviors and risks in a cloud environment according to claim 2, wherein the preprocessing of the data to be processed to obtain a training set and a testing set specifically comprises the following operation steps:
the data coding processing module performs behavior coding on the text type and behavior type login information data to obtain coded login information data characteristics;
the data cleaning processing module cleans the resource utilization rate data and the system configuration data to obtain cleaned data characteristics;
the data standardization processing module respectively carries out standardization processing on the login information data characteristics and the cleaned data characteristics to obtain standard processed characteristic data;
the data marking and dividing processing module performs feature marking on the feature data after standard processing, and then performs data set dividing processing to obtain a training set and a testing set.
4. The method for detecting abnormal behavior and risk in a cloud environment according to claim 3, wherein the accuracy of the current target processing model = (number of correctly predicted samples / total number of samples) × 100%;
wherein the number of correctly predicted samples is the number of samples that the model classifies correctly on the test set, and the total number of samples is the total number of samples in the test set;
the F1 score index of the current target processing model is F1 = (2 × TP) / (2 × TP + FP + FN);
wherein TP represents the number of true positive samples, i.e. the number of positive-class samples that the model correctly predicts as positive; FP represents the number of false positive samples, i.e. the number of negative-class samples that the model erroneously predicts as positive; FN represents the number of false negative samples, i.e. the number of positive-class samples that the model erroneously predicts as negative.
5. The method for detecting abnormal behavior and risk in a cloud environment according to claim 4, wherein the training set and the testing set are divided according to a preset proportion, specifically: 70% of the data is used as the training set and 30% of the data is used as the testing set.
6. The method for detecting abnormal behavior and risk in a cloud environment according to claim 4, wherein after the target processing model has been evaluated and found to be free of abnormalities, the trained target processing model is deployed into the cloud computing environment, and the running state of the system and the input data of user behaviors are monitored in real time, specifically comprising the following operation steps:
monitoring the running state of the system and user behavior in real time, and allowing the current user behavior to continue when the prediction result is normal;
when the prediction result is abnormal, the risk assessment module performs risk level assessment by collecting statistics on the N prediction results for the current user behavior within a preset time:
when the proportion of the number M of prediction results indicating abnormal user behavior to the number N of all prediction results is greater than 5% and less than 10%, the risk is judged to be a level 1 risk; when the proportion is greater than 10% and less than 20%, the risk is judged to be a level 2 risk; when the proportion is greater than 20%, the risk is judged to be a level 3 risk;
when the risk assessment is judged to be a level 1 risk, only an administrator alarm reminder is initiated;
when the risk assessment is judged to be a level 2 risk, user identity verification is initiated; the user behavior is allowed to continue if the identity verification passes and is blocked if the identity verification fails;
when the risk assessment is judged to be a level 3 risk, the user behavior is immediately blocked and an alarm reminder is initiated.
7. The abnormal behavior and risk detection system in the cloud environment is characterized by comprising a data processing module, a model training module and a risk assessment module; the data processing module comprises a data collection module;
the data collection module is used for collecting data to be processed; the collected data to be processed comprises: login information data, resource utilization data and system configuration data;
the data processing module is used for preprocessing the data to be processed to obtain a training set and a testing set;
the model training module is used for selecting an SVM model as a target processing model; the model training module starts the target processing model and trains the target processing model by using the training set to obtain a trained target processing model;
the risk assessment module is used for performing verification and evaluation on the trained target processing model by using the test set, and for calculating the accuracy and the F1 score index of the current target processing model as performance indexes of the target processing model during the verification and evaluation; after the target processing model has been evaluated and found to be free of abnormalities, the trained target processing model is deployed into a cloud computing environment, and the running state of the system and the input data of user behaviors are monitored in real time;
The model training module is also used for further training the original target processing model according to the updated training set obtained by extracting the characteristics of the newly generated data in the preset time period, so as to update the target processing model regularly, thereby adapting to new data and requirements and ensuring the continuous effectiveness and accuracy of the model.
8. The system for detecting abnormal behavior and risk in a cloud environment of claim 7, wherein said data processing module further comprises a data encoding processing module, a data cleaning processing module, a data standardization processing module, and a data marking and dividing processing module, wherein:
the data encoding processing module is used for performing behavior encoding on the text-type and behavior-type login information data to obtain encoded login information data features;
the data cleaning processing module is used for cleaning the resource utilization data and the system configuration data to obtain cleaned data features;
the data standardization processing module is used for performing standardization processing on the login information data features and the cleaned data features respectively to obtain standardized feature data;
the data marking and dividing processing module is used for performing feature marking on the standardized feature data, and then performing data set dividing processing to obtain a training set and a testing set.
9. The system for detecting abnormal behavior and risk in a cloud environment of claim 7, wherein said risk assessment module further comprises an evaluation sub-module and a response time sub-module.
10. The system for detecting abnormal behavior and risk in a cloud environment according to claim 9, wherein the evaluation sub-module is configured to monitor the running state of the system and user behavior in real time, and to allow the current user behavior to continue when the prediction result is normal; when the prediction result is abnormal, the risk assessment module performs risk level assessment by collecting statistics on the N prediction results for the current user behavior within a preset time: when the proportion of the number M of prediction results indicating abnormal user behavior to the number N of all prediction results is greater than 5% and less than 10%, the risk is judged to be a level 1 risk; when the proportion is greater than 10% and less than 20%, the risk is judged to be a level 2 risk; when the proportion is greater than 20%, the risk is judged to be a level 3 risk;
the response time sub-module is used for initiating only an administrator alarm reminder when the risk assessment is judged to be a level 1 risk; for initiating user identity verification when the risk assessment is judged to be a level 2 risk, allowing the user behavior to continue if the identity verification passes and blocking it if the identity verification fails; and for immediately blocking the user behavior and initiating an alarm reminder when the risk assessment is judged to be a level 3 risk.
11. A storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the abnormal behaviour and risk detection method in a cloud environment according to any one of claims 1 to 5.
CN202410403252.2A 2024-04-03 2024-04-03 Abnormal behavior and risk detection method and system in cloud environment and storage medium Pending CN118469273A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410403252.2A CN118469273A (en) 2024-04-03 2024-04-03 Abnormal behavior and risk detection method and system in cloud environment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410403252.2A CN118469273A (en) 2024-04-03 2024-04-03 Abnormal behavior and risk detection method and system in cloud environment and storage medium

Publications (1)

Publication Number Publication Date
CN118469273A true CN118469273A (en) 2024-08-09

Family

ID=92167340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410403252.2A Pending CN118469273A (en) 2024-04-03 2024-04-03 Abnormal behavior and risk detection method and system in cloud environment and storage medium

Country Status (1)

Country Link
CN (1) CN118469273A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119271520A (en) * 2024-12-09 2025-01-07 福建喜购宝信息科技股份有限公司 An intelligent control system of keyboard and mouse based on Internet of Things
CN119271520B (en) * 2024-12-09 2025-02-18 福建喜购宝信息科技股份有限公司 Keyboard mouse intelligent control system based on thing networking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination