CN119341803A

CN119341803A - Network security strategy generation method and system based on AI big model

Info

Publication number: CN119341803A
Application number: CN202411443539.4A
Authority: CN
Inventors: 倪林; 鲜明; 何俊; 王欣玫; 张帅
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2024-10-16
Filing date: 2024-10-16
Publication date: 2025-01-21
Anticipated expiration: 2044-10-16
Also published as: CN119341803B

Abstract

The present invention relates to the field of Internet of Things security, and specifically to a network security strategy generation method and system based on an AI large model. The method first acquires a scenario library of historical security events and a strategy library containing various strategies, and at the same time obtains the pending security event currently received by the network. Based on the data information of each dimension, it performs scenario matching analysis between the pending security event and the historical security events and screens out, from all historical security events, the security events similar to the pending security event. It then iteratively trains on the historical security events using a weak classifier constructed on a target dimension to obtain the criticality of that dimension and, after screening out the key dimensions, performs a further scenario matching analysis between the pending security event and the similar security events, thereby generating a strategy set for the pending security event. The present invention can eliminate the interference of redundant dimensions with the scenario matching of network security events and improve the effect of security strategy generation.

Description

Network security policy generation method and system based on AI large model

Technical Field

The invention relates to the field of security of the Internet of things, in particular to a network security policy generation method and system based on an AI large model.

Background

In the field of Internet of Things security, traditional defenses such as firewalls and intrusion detection systems struggle to cope with evolving network threats, including but not limited to advanced persistent threats, zero-day exploits and targeted attacks against specific organizations. These threats seriously endanger the security of the Internet of Things, so corresponding security policies need to be generated to counter them and keep the network healthy.

In the related art, a security event currently received by a network is generally scene-matched against historical security events, and the policy of a historical security event whose scene resembles the current one is used as the policy for solving the current network security event. However, a network security event generally has a large number of features in different dimensions, and the features in partially redundant dimensions degrade the scene matching between the current network security event and the historical security events, so the generated security policy is poor.

Disclosure of Invention

To solve the technical problem that features in partially redundant dimensions degrade the scene matching between the current network security event and the historical security events and thus lead to poor security policy generation, the invention provides a network security policy generation method and system based on an AI large model. The adopted technical scheme is as follows:

The invention provides a network security policy generation method based on an AI large model, which comprises the following steps:

acquiring a scene library and a policy library about network security, wherein the scene library comprises a plurality of historical security events, and simultaneously acquiring a to-be-processed security event currently received by the network, wherein the to-be-processed security event and the historical security events each comprise data information of a plurality of dimensions;

obtaining the scene similarity of each historical security event according to the difference of the data information of each dimension between the to-be-processed security event and each historical security event, screening similar security events of the to-be-processed security event from all the historical security events based on the scene similarity, and marking the similar security events and the dissimilar security events differently to obtain a label for each historical security event;

taking any dimension as a target dimension, taking a decision tree that is constructed from all the historical security events and relates to the target dimension as a weak classifier, iteratively training the labeled historical security events based on the weak classifier to obtain the classification accuracy of each iteration, judging whether to terminate the iterative training according to the difference in classification accuracy between each iteration and the previous iteration, obtaining the criticality of the target dimension according to the classification accuracy of the last iteration, the difference in classification accuracy between the last iteration and the first iteration, and the number of training iterations, and screening out the key dimensions from all the dimensions according to the criticality of each dimension and the number of similar security events of the to-be-processed security event;

obtaining a similar scene set of the to-be-processed security event according to the difference of the data information of each key dimension between the to-be-processed security event and the similar security events; and selecting a policy set of the to-be-processed security event from the policy library based on the similar scene set.

Further, the obtaining the scene similarity of each historical security event includes:

Dividing all dimensions into statistical feature dimensions and content feature dimensions based on the form of data information of each dimension, wherein the data information of the statistical feature dimensions is in a numerical form, and the data information of the content feature dimensions is in a text form;

taking a security event to be processed or any one historical security event as a target security event, and taking a vector formed by data information of each statistical feature dimension of the target security event as a first feature vector of the target security event;

processing the data information of each content feature dimension of the target security event by using a bag-of-words model to obtain a second feature vector of the target security event;

Taking the absolute value of the cosine similarity of the first feature vector between the security event to be processed and each historical security event as the first similarity of each historical security event, and taking the absolute value of the cosine similarity of the second feature vector between the security event to be processed and each historical security event as the second similarity of each historical security event;

And taking the average value of the first similarity and the second similarity as the scene similarity of each historical security event.

Further, the screening similar security events of the security events to be processed from all the historical security events based on the scene similarity comprises:

and taking the historical security events with the scene similarity larger than a preset first similarity threshold value as the similar security events of the security events to be processed.

Further, the obtaining the classification accuracy of each iteration process includes:

using an Adaboost algorithm, and classifying the labeled historical security events by using a weak classifier to obtain correctly classified historical security events in each iteration process;

The number of historical security events that are correctly classified in each iteration is used as the numerator, the number of all historical security events is used as the denominator, and the ratio is used as the classification accuracy of that iteration.

Further, the determining whether to terminate the iterative training according to the difference of the classification accuracy between each iterative process and the previous iterative process includes:

If the absolute value of the difference in classification accuracy between an iteration and the previous iteration is smaller than a preset judgment threshold, the iterative training is terminated; otherwise, the iterative training continues.

Further, the obtaining the criticality of the target dimension includes:

taking the reciprocal of the absolute value of the difference value of the classification accuracy between the last iteration process and the first iteration process as a first key parameter of the target dimension;

taking the inverse number of the iterative training times as a second key parameter of the target dimension;

And normalizing the sum of the classification accuracy of the last iteration process, the first key parameter and the second key parameter of the target dimension to obtain the key degree of the target dimension.

Further, the screening the key dimension from all dimensions includes:

taking the number of similar security events of the security events to be processed as a numerator, taking the number of all the historical security events as a denominator, and taking the ratio as a similar scene ratio value of the security events to be processed;

Taking the difference between 1 and the similar scene ratio value as a weight parameter of the security event to be processed, and rounding up the product of the weight parameter and the number of all dimensions to obtain the minimum dimension number;

taking any multiple dimensions as a dimension group, wherein the number of the dimensions contained in each dimension group is greater than or equal to the minimum number of dimensions and less than or equal to the number of all dimensions;

Taking the average value of the key degrees of all the dimensions in each dimension group as the overall key degree of each dimension group;

And taking all dimensions in the dimension group corresponding to the maximum value of the overall criticality as critical dimensions.
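The screening of key dimensions described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_key_dimensions`, the dictionary layout and the clamp to at least one dimension are assumptions. Rather than enumerating every dimension group, the sketch relies on the fact that, among all groups containing at least the minimum number of dimensions, the group made of the highest-criticality dimensions at exactly the minimum size has the largest average criticality (ties aside).

```python
def select_key_dimensions(criticality, num_similar, num_historical):
    """Pick the dimension group with the highest mean criticality.

    criticality: dict mapping dimension name -> criticality (hypothetical layout).
    num_similar: number of similar security events of the pending event.
    num_historical: number of all historical security events.
    """
    # Minimum dimension number = ceil((1 - num_similar / num_historical) * D),
    # computed with integers to avoid floating-point rounding surprises.
    numerator = (num_historical - num_similar) * len(criticality)
    min_dims = -(-numerator // num_historical)
    # Assumption: keep at least one dimension when every event is similar.
    min_dims = max(min_dims, 1)
    # The mean of the top-k criticalities never increases as k grows, so the
    # top-min_dims dimensions form a group with the maximum overall criticality.
    ranked = sorted(criticality, key=criticality.get, reverse=True)
    return ranked[:min_dims]


scores = {
    "traffic_size": 0.9,
    "connections": 0.2,
    "failed_logins": 0.7,
    "access_pattern": 0.4,
    "file_access": 0.1,
}
key_dims = select_key_dimensions(scores, num_similar=4, num_historical=10)
print(key_dims)
```

With 4 similar events out of 10 and 5 dimensions, the weight parameter is 0.6 and the minimum dimension number is 3, so three dimensions are kept.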

Further, the obtaining the similar scene set of the security event to be processed includes:

Based on the calculation method of the scene similarity of each historical security event, calculating scene similarity parameters of each similar security event according to the difference of the data information of each key dimension between the security event to be processed and the similar security event;

And taking the set formed by the similar security events with the scene similar parameters larger than a preset second similar threshold value as a similar scene set of the security events to be processed.
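As a rough sketch of this second matching pass, the code below recomputes a similarity using only the key dimensions and keeps the similar security events above the second threshold. It assumes, for brevity, that every key dimension is numeric, so only the cosine-based first similarity is used. The names `similar_scene_set`, `key_dims` and `second_threshold`, and the value 0.8 in the example, are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def similar_scene_set(pending, similar_events, key_dims, second_threshold):
    """Return ids of similar events whose key-dimension similarity exceeds the threshold."""
    pending_vec = [pending[d] for d in key_dims]
    selected = []
    for event_id, features in similar_events.items():
        event_vec = [features[d] for d in key_dims]
        if abs(cosine(pending_vec, event_vec)) > second_threshold:
            selected.append(event_id)
    return selected


pending = {"traffic_size": 1.0, "failed_logins": 2.0, "noise": 100.0}
candidates = {
    "e1": {"traffic_size": 2.0, "failed_logins": 4.0, "noise": 0.0},
    "e2": {"traffic_size": 5.0, "failed_logins": 0.0, "noise": 100.0},
}
scene_set = similar_scene_set(
    pending, candidates, key_dims=["traffic_size", "failed_logins"], second_threshold=0.8
)
print(scene_set)
```

Here the redundant `noise` dimension is ignored, so `e1` is kept even though its `noise` value differs sharply from the pending event.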

Further, the selecting a policy set of the to-be-processed security event from the policy library based on the similar scene set includes:

selecting from the policy library the policies used for solving all the historical security events in the similar scene set, and taking the set formed by all the selected policies as the policy set of the to-be-processed security event.
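The selection step above amounts to collecting, without duplicates, the policies associated with each event in the similar scene set. A minimal sketch follows; the dictionary keyed by event id and the policy strings are hypothetical, since the source does not describe how the library is stored.

```python
def select_policy_set(similar_scene_set, policy_library):
    """Collect the policies linked to each event in the similar scene set.

    policy_library: dict mapping event id -> list of policies (assumed layout).
    """
    policies = []
    for event_id in similar_scene_set:
        for policy in policy_library.get(event_id, []):
            if policy not in policies:  # keep first occurrence, drop repeats
                policies.append(policy)
    return policies


library = {
    "e1": ["block source IP", "force password reset"],
    "e2": ["rate-limit connections"],
    "e3": ["block source IP", "isolate device"],
}
policy_set = select_policy_set(["e1", "e3"], library)
print(policy_set)
```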

The invention also provides a network security policy generation system based on the AI large model, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes any one of the steps of the network security policy generation method based on the AI large model when executing the computer program.

The invention has the following beneficial effects:

The invention recognizes that features in partially redundant dimensions of a network security event degrade the scene matching between the current network security event and the historical security events, which leads to poor security policy generation. The invention therefore first acquires a scene library and a policy library about network security, together with the to-be-processed security event currently received by the network. Because the policy for solving the to-be-processed security event is unknown while the policies for solving the historical security events are known, and because the data information of each dimension reflects the characteristics of a security event, the invention first analyzes the difference in the data information of each dimension between the to-be-processed security event and each historical security event, expresses the degree of similarity between them as a scene similarity, and screens out the similar security events of the to-be-processed security event. Since the similar security events resemble the scene of the to-be-processed security event, a similar scene set can be further constructed from them so that a policy for the to-be-processed security event is generated effectively. The steps above, however, perform scene matching directly on the data information of all dimensions, even though the dimensions include several redundant ones and different dimensions differ in importance for scene matching. The invention therefore analyzes each target dimension: it iteratively trains the labeled historical security events with the constructed weak classifier, obtains the classification accuracy of each iteration, expresses the importance of the target dimension in scene matching as a criticality, screens out the key dimensions, and thereby eliminates the interference of redundant dimensions with scene matching. The policy set of the to-be-processed security event is then generated effectively from the constructed similar scene set.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a network security policy generation method based on an AI large model according to an embodiment of the present invention.

Detailed Description

To further describe the technical means adopted by the present invention to achieve its intended purpose, and their effects, the specific implementation, structure, features and effects of the network security policy generation method and system based on an AI large model according to the present invention are described in detail below. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following specifically describes a specific scheme of a network security policy generation method and system based on an AI large model provided by the invention with reference to the drawings.

Referring to fig. 1, a flowchart of a network security policy generation method based on an AI large model according to an embodiment of the invention is shown, where the method includes:

Step S1, a scene library and a strategy library about network security are obtained, wherein the scene library comprises a plurality of historical security events, and meanwhile, the security events to be processed currently received by the network are obtained, and the security events to be processed and the historical security events comprise data information with a plurality of dimensions.

During security maintenance of the Internet of Things, a network system generally receives a large number of network security events and applies corresponding security policies to deal with them. The embodiment of the invention therefore first uses the Bert-BiLSTM-CRF algorithm in the AI large model to extract a large number of historical security events from the network, constructs a scene library from all the historical security events, and constructs a policy library from the policies used to solve each historical security event. The historical security events in the scene library and the policies in the policy library are associated with each other; that is, for any historical security event in the scene library, the policy library contains a corresponding policy for solving it. The Bert-BiLSTM-CRF algorithm is a technical means well known to a person skilled in the art and is not described further herein.

Meanwhile, the network continues to receive new network security events. The embodiment of the invention takes a new network security event received by the network as the to-be-processed security event and subsequently performs scene matching analysis between it and the historical security events, thereby effectively generating a policy for solving the to-be-processed security event. It should be noted that both the to-be-processed security event and the historical security events have features in a number of different dimensions, for example traffic size, number of connections, number of failed logins, access pattern and file access behavior. The data information of different dimensions is usually represented in either numerical form or text form; for example, the data information of traffic size and number of connections is numerical, and the data information of access pattern is textual.

And S2, obtaining scene similarity of each historical security event according to the difference of data information of each dimension between the security event to be processed and each historical security event, screening similar security events of the security event to be processed from all the historical security events based on the scene similarity, and respectively marking the similar security events and the dissimilar security events differently to obtain labels of the historical security events.

The policy for solving the to-be-processed security event is unknown, whereas the policies for solving the historical security events are known, and the purpose of the embodiment of the invention is to solve the to-be-processed security event with known policies. A similarity analysis, namely a scene matching analysis, is therefore performed between the to-be-processed security event and the historical security events in the scene library, so that the to-be-processed security event can be solved with the policies of historical security events whose scene characteristics resemble its own. To this end, the difference in the data information of each dimension between the to-be-processed security event and each historical security event is analyzed first, the obtained scene similarity expresses how similar the network security scenes represented by the two events are, and the similar security events of the to-be-processed security event are then screened out from the historical security events based on the scene similarity.

Preferably, in an embodiment of the present invention, the method for acquiring the scene similarity of each historical security event specifically includes:

first, since the network security event is composed of features of multiple dimensions, and data information of each dimension mainly takes two forms, namely, a text form and a numerical form, in the subsequent process of quantitatively analyzing the network security event, all dimensions need to be divided into two dimension types, namely, a statistical feature dimension and a content feature dimension, based on the form of the data information of each dimension, wherein the data information of the statistical feature dimension is in the numerical form, and the data information of the content feature dimension is in the text form.

Then, the security event to be processed or any one of the historical security events is taken as the target security event. Because the data information of the statistical feature dimensions is numerical, the vector formed by the data information of each statistical feature dimension of the target security event can be used directly as the first feature vector of the target security event. For example, if the statistical feature dimensions of the target security event are the traffic size, the number of connections and the number of failed login attempts, and the corresponding data information is 1, 2 and 3 respectively, the first feature vector of the target security event is (1, 2, 3).

Since the data information of the content feature dimensions is expressed in text form, it can be quantized with the bag-of-words model in the AI large model: the data information of each content feature dimension of the target security event is processed with the bag-of-words model to obtain the second feature vector of the target security event. The bag-of-words model is a technical means well known to those skilled in the art and is not described further herein. The first feature vector and the second feature vector of the security event to be processed and of each historical security event are obtained by the same method.
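To make the bag-of-words step concrete, the following sketch turns the text of a content feature dimension into a count vector over a shared vocabulary. It is only an illustration: lower-casing and whitespace tokenization are assumptions (text in other languages would need a proper tokenizer), and the function names are hypothetical.

```python
from collections import Counter


def build_vocabulary(texts):
    """Sorted list of distinct tokens seen in any of the texts."""
    return sorted({token for text in texts for token in text.lower().split()})


def bag_of_words(text, vocabulary):
    """Count vector of `text` over `vocabulary` (tokens outside it are ignored)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


vocabulary = build_vocabulary(["remote login burst", "login from new host"])
second_feature_vector = bag_of_words("login login remote", vocabulary)
print(vocabulary)
print(second_feature_vector)
```

Because every event is vectorized over the same vocabulary, the resulting second feature vectors have equal length and can be compared directly.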

The absolute value of the cosine similarity of the first feature vectors of the to-be-processed security event and each historical security event can be used as the first similarity of that historical security event, and the absolute value of the cosine similarity of their second feature vectors can be used as the second similarity of that historical security event. The larger the first similarity and the second similarity are, the more similar the scenes represented by the to-be-processed security event and the historical security event are. The two are therefore combined, and their average value is used as the scene similarity of each historical security event.
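The computation just described can be sketched as below: the scene similarity is the mean of the absolute cosine similarities of the first and second feature vectors. The dictionary keys `stats` and `text_vec` and the handling of zero vectors are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as not similar
    return dot / (norm_u * norm_v)


def scene_similarity(pending, historical):
    """Average of the first (numeric) and second (text) similarity."""
    first = abs(cosine(pending["stats"], historical["stats"]))
    second = abs(cosine(pending["text_vec"], historical["text_vec"]))
    return (first + second) / 2.0


pending_event = {"stats": (1, 2, 3), "text_vec": [1, 0]}
historical_event = {"stats": (2, 4, 6), "text_vec": [0, 1]}
similarity = scene_similarity(pending_event, historical_event)
print(similarity)
```

In this example the numeric vectors point in the same direction (first similarity close to 1) while the text vectors share no tokens (second similarity 0), so the scene similarity is about 0.5.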

In other embodiments of the present invention, the first similarity of each historical security event may be calculated by analyzing the euclidean distance or the manhattan distance of the first feature vector between the security event to be processed and each historical security event, and the second similarity of each historical security event may be calculated by analyzing the euclidean distance or the manhattan distance of the second feature vector between the security event to be processed and each historical security event, for example, the calculated euclidean distance or manhattan distance may be mapped in a negative correlation manner, so as to calculate the first similarity and the second similarity, and an average value of the first similarity and the second similarity may be normalized, so as to obtain the scene similarity of each historical security event.
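For the distance-based alternative, the source only states that the distance is mapped in a negatively correlated way. The sketch below uses the mapping 1 / (1 + d) purely as an assumed example of such a mapping; it already yields values in (0, 1], so the average of the two similarities stays in that range without further normalization.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_to_similarity(distance):
    """Assumed negative-correlation mapping: larger distance, smaller similarity."""
    return 1.0 / (1.0 + distance)


first_similarity = distance_to_similarity(euclidean((0, 0), (3, 4)))
second_similarity = distance_to_similarity(manhattan((0, 0), (3, 4)))
scene_similarity = (first_similarity + second_similarity) / 2.0
print(first_similarity, second_similarity, scene_similarity)
```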

As an example, in one embodiment of the present invention, the expression of the scene similarity for each historical security event may specifically be, for example:

$$A_n = \frac{|C_{n,1}| + |C_{n,2}|}{2}$$

where $A_n$ represents the scene similarity of the $n$-th historical security event, $C_{n,1}$ represents the cosine similarity of the first feature vectors of the to-be-processed security event and the $n$-th historical security event, $C_{n,2}$ represents the cosine similarity of their second feature vectors, $|C_{n,1}|$ represents the first similarity of the $n$-th historical security event, and $|C_{n,2}|$ represents the second similarity of the $n$-th historical security event.

The larger the scene similarity is, the more similar the network security scene is shown between each historical security event and the security event to be processed is, so that the similar security event of the security event to be processed can be primarily screened from all the historical security events based on the scene similarity, and the similar scene set of the security event to be processed can be further constructed from the similar security events through dimension reduction analysis, so that the strategy of the security event to be processed can be more effectively generated.

Preferably, in one embodiment of the present invention, a historical security event with a scene similarity greater than a preset first similarity threshold is used as a similar security event of a security event to be processed, where the preset first similarity threshold has a value range of (0.5, 1), in one embodiment of the present invention, the preset first similarity threshold is set to 0.6, and a specific value of the preset first similarity threshold may also be set by an implementer according to a specific implementation scenario, which is not limited herein.

Once the similar security events have been obtained through the above process, the scene library contains two types of historical security events, namely similar security events and dissimilar security events. The similar and dissimilar security events in the scene library are marked differently to obtain the label of each historical security event, so that the constructed weak classifier can subsequently be used to iteratively classify the labeled historical security events. This achieves dimension reduction for the scene matching analysis of network security events, eliminates the influence of redundant dimensions on scene matching, and improves the final policy generation. The values 1 and -1 can be used for marking: for example, the similar security events in the scene library are marked as 1 and the dissimilar security events are marked as -1, so that the label of each historical security event is 1 or -1. A historical security event labeled 1 is a similar security event of the to-be-processed security event, and one labeled -1 is a dissimilar security event. Other marking schemes may be used in other embodiments of the invention, which is not limited herein.
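The screening and labeling steps can be illustrated together. The sketch below uses the example threshold of 0.6 from this embodiment and the 1 / -1 marking; the function name and the list-based input are assumptions.

```python
FIRST_SIMILARITY_THRESHOLD = 0.6  # example value given for this embodiment


def label_events(scene_similarities, threshold=FIRST_SIMILARITY_THRESHOLD):
    """Label each historical event: 1 if similar (similarity > threshold), else -1."""
    return [1 if s > threshold else -1 for s in scene_similarities]


labels = label_events([0.9, 0.6, 0.3, 0.61])
print(labels)
```

Note that the comparison is strict, so an event whose scene similarity equals the threshold exactly is treated as dissimilar.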

Step S3, taking any dimension as a target dimension, taking a decision tree constructed through all historical security events and related to the target dimension as a weak classifier, carrying out iterative training on the historical security events with labels based on the weak classifier to obtain the classification accuracy of each iterative process, judging whether to terminate iterative training according to the difference of the classification accuracy between each iterative process and the previous iterative process, obtaining the key degree of the target dimension according to the classification accuracy of the last iterative process, the difference of the classification accuracy between the last iterative process and the first iterative process and the number of iterative training, and screening the key dimension from all the dimensions according to the key degree of each dimension and the number of similar security events of the security events to be processed.

Several redundant dimensions exist among all the dimensions, and different dimensions differ in importance for scene matching between the to-be-processed security event and the historical security events. If the data information of all dimensions is used directly for this scene matching analysis, the redundant dimensions reduce the matching accuracy, so that the policy generated for the to-be-processed security event is poor. The importance of each dimension in the scene matching between the to-be-processed security event and the historical security events therefore needs to be analyzed so that the key dimensions can be selected and the influence of the redundant dimensions on scene matching eliminated.

The embodiment of the invention first takes any one dimension as the target dimension, constructs a decision tree relating to the target dimension from all the historical security events in the scene library, and uses this decision tree as a weak classifier. The construction of a decision tree is a technical means well known to a person skilled in the art and is not limited herein; typically a single-layer decision tree, namely a decision stump, is constructed as the weak classifier. The labeled historical security events can then be iteratively trained based on the weak classifier to obtain the classification accuracy of each iteration, whether to terminate the iterative training is judged according to the difference in classification accuracy between each iteration and the previous iteration, and the number of training iterations is recorded. Based on the classification accuracy of each iteration and the number of training iterations, the importance of the target dimension for scene matching between the to-be-processed security event and the historical security events can subsequently be analyzed, thereby eliminating the influence of the redundant dimensions.
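A single-layer decision tree (decision stump) on one target dimension can be sketched as follows. The sketch assumes the target dimension is numeric and searches for the threshold and polarity with the lowest weighted error; the function names and the candidate-threshold scheme (midpoints between sorted distinct values) are illustrative choices, not taken from the source.

```python
def stump_predict(value, threshold, polarity):
    """Predict +polarity when value > threshold, otherwise -polarity."""
    return polarity if value > threshold else -polarity


def train_stump(values, labels, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    distinct = sorted(set(values))
    # Candidate thresholds: one below the minimum, then midpoints between neighbours.
    thresholds = [distinct[0] - 1.0]
    thresholds += [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(
                w
                for x, y, w in zip(values, labels, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


traffic = [1, 2, 8, 9]          # data of the target dimension for four events
labels = [-1, -1, 1, 1]         # -1 = dissimilar, 1 = similar
weights = [0.25] * 4            # uniform sample weights
best_stump = train_stump(traffic, labels, weights)
print(best_stump)
```

In this toy data the similar events all have large traffic values, so a threshold of 5.0 separates the two labels without error.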

Preferably, in one embodiment of the present invention, the method for obtaining the classification accuracy of each iteration process specifically includes:

Firstly, an Adaboost algorithm is used, a weak classifier is used for classifying historical security events with labels, and the historical security events which are correctly classified in each iteration process are obtained, wherein the Adaboost algorithm is a technical means well known to a person skilled in the art, details are not repeated here, then the number of the historical security events which are correctly classified in each iteration process is used as a numerator, the number of all the historical security events is used as a denominator, and the ratio is used as the classification accuracy of each iteration process.

As an example, in one embodiment of the present invention, the expression of the classification accuracy of each iterative process may specifically be, for example:

$$T_i = \frac{K_i}{N}$$

where $T_i$ represents the classification accuracy of the $i$-th iteration, $K_i$ represents the number of historical security events correctly classified in the $i$-th iteration, and $N$ represents the number of all historical security events in the scene library.
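As a small worked example of this ratio, the sketch below counts the correctly classified events and divides by the total number of events. The function name and the list inputs are assumptions.

```python
def classification_accuracy(predictions, labels):
    """Share of events whose predicted label equals the true label."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


accuracy = classification_accuracy([1, -1, 1, 1], [1, -1, -1, 1])
print(accuracy)
```

Three of the four events are classified correctly here, so the accuracy for this iteration is 0.75.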

Preferably, in one embodiment of the present invention, the method for determining whether to terminate iterative training specifically includes:

In the process of iteratively training the labeled historical security events with the AdaBoost algorithm, a small change in the classification accuracy of an iteration process relative to the previous iteration process indicates that the iterative training has converged. Therefore, when the absolute value of the difference in classification accuracy between an iteration process and the previous iteration process is smaller than a preset judgment threshold, the iterative training is considered to have converged and is terminated; otherwise, convergence has not been reached and the iterative training continues. The value range of the preset judgment threshold is (0, 0.1). In one embodiment of the invention, the preset judgment threshold is set to 0.05; its specific value may also be set by the implementer according to the specific implementation scene.

It should be noted that the first iteration process has no previous iteration process, so in this embodiment the judgment of whether to terminate the iteration starts from the second iteration process; that is, when the iterative training terminates, at least two iteration processes have been performed.
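As a non-limiting illustration, the iterative training and termination judgment described above may be sketched in Python as follows. The sketch assumes binary labels of +1 and -1, a single numeric target dimension, and that the classification accuracy of each iteration is measured on the combined (boosted) classifier; these choices, the function names `best_stump` and `train_on_dimension`, and the `max_rounds` safety cap are assumptions made for illustration and are not specified in the original text.

```python
import math

def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best decision stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def train_on_dimension(xs, ys, judgment_threshold=0.05, max_rounds=50):
    """Run AdaBoost with stumps on one dimension; return accuracies T_1..T_M."""
    n = len(xs)
    weights = [1.0 / n] * n          # uniform sample weights at the start
    scores = [0.0] * n               # running weighted vote of all stumps
    accuracies = []
    for _ in range(max_rounds):      # max_rounds is an assumed safety cap
        err, thr, pol = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        preds = [pol if x >= thr else -pol for x in xs]
        scores = [s + alpha * p for s, p in zip(scores, preds)]
        # T_i = K_i / N: correctly classified events over all events
        correct = sum(1 for s, y in zip(scores, ys) if (1 if s >= 0 else -1) == y)
        accuracies.append(correct / n)
        # Re-weight samples so misclassified events count more in the next round
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # The termination check starts from the second iteration
        if len(accuracies) >= 2 and abs(accuracies[-1] - accuracies[-2]) < judgment_threshold:
            break
    return accuracies

accuracies = train_on_dimension([1, 2, 3, 4], [-1, -1, 1, 1])
iteration_count = len(accuracies)    # M, the recorded number of iterative trainings
```

The returned list holds the classification accuracy of every iteration process, and its length is the recorded number of iterative trainings, so both quantities needed by the later criticality analysis come from one call.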

After the historical security events have been iteratively trained with the weak classifier on the target dimension, the higher the classification accuracy of the last iteration process, the smaller the difference in classification accuracy between the last iteration process and the first iteration process, and the smaller the number of iterative trainings, the more important the target dimension is in scene matching analysis between the security event to be processed and the historical security events. Therefore, the classification accuracy of the last iteration process, the difference in classification accuracy between the last iteration process and the first iteration process, and the number of iterative trainings can be analyzed to obtain the criticality of the target dimension, which reflects that importance. Key dimensions can subsequently be selected based on the criticality, thereby improving the effect of policy generation for the security event to be processed.

Preferably, in one embodiment of the present invention, the method for acquiring the criticality of the target dimension specifically includes:

Taking the reciprocal of the absolute value of the difference value of the classification accuracy between the last iteration process and the first iteration process as a first key parameter of the target dimension, taking the reciprocal of the number of iterative training as a second key parameter of the target dimension, and carrying out normalization processing on the classification accuracy of the last iteration process and the sum value of the first key parameter and the second key parameter of the target dimension to obtain the key degree of the target dimension.

In one embodiment of the present invention, the normalization may specifically be, for example, min-max normalization, and normalization in the subsequent steps may also use min-max normalization. In other embodiments of the present invention, other normalization methods may be selected according to the specific range of values, which is not described in detail here.

As an example, in one embodiment of the present invention, the expression of the criticality of the target dimension may specifically be, for example:

$$G = \mathrm{norm}\left(T' + \frac{1}{\left|T' - T_1\right|} + \frac{1}{M}\right)$$

wherein $G$ represents the criticality of the target dimension, $T'$ represents the classification accuracy of the last iteration process, $T_1$ represents the classification accuracy of the first iteration process, and $M$ represents the number of iterative trainings; $\frac{1}{\left|T' - T_1\right|}$ represents the first key parameter of the target dimension; $\frac{1}{M}$ represents the second key parameter of the target dimension; and $\mathrm{norm}()$ represents the normalization function.

The criticality of each dimension can be obtained by the same method as described above.
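As a non-limiting illustration, the criticality computation may be sketched in Python as follows. The sketch assumes that the classification accuracy of the last iteration process and the two key parameters are combined by summation before normalization, that min-max normalization is applied across all dimensions, and that a small constant `eps` guards against a zero accuracy difference; the summation reading, `eps`, the handling of equal values, and all function names are assumptions made for illustration.

```python
def raw_criticality(accuracies, eps=1e-6):
    """Combine T', 1/|T' - T_1| and 1/M for one dimension (before normalization)."""
    t_last, t_first = accuracies[-1], accuracies[0]
    m = len(accuracies)                                # number of iterative trainings
    first_key = 1.0 / (abs(t_last - t_first) + eps)    # eps is an assumed guard
    second_key = 1.0 / m
    return t_last + first_key + second_key             # assumed to be a sum

def min_max_normalize(values):
    """Scale values to [0, 1]; if all are equal, return 1.0 for each (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def criticality_by_dimension(accuracy_history):
    """Map each dimension name to its normalized criticality G."""
    names = list(accuracy_history)
    raw = [raw_criticality(accuracy_history[name]) for name in names]
    return dict(zip(names, min_max_normalize(raw)))

# Hypothetical per-dimension accuracy sequences, for illustration only
history = {
    "src_ip": [0.90, 0.95],
    "port": [0.50, 0.60, 0.70, 0.72],
    "payload": [0.60, 0.80, 0.82],
}
criticality = criticality_by_dimension(history)
```

In this hypothetical input, `src_ip` converges in two iterations with high final accuracy and a small accuracy change, so it receives the highest criticality, whereas `port` needs the most iterations and receives the lowest.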

When the network security scene represented by the security event to be processed is an uncommon, niche scene, more dimensions are needed for scene matching analysis, so as to ensure that an effective security policy can also be generated for a security event to be processed in a niche scene. Whether the network security scene represented by the security event to be processed is an uncommon, niche scene can be characterized by the number of similar security events of the security event to be processed in the scene library: the smaller the number of similar security events, the more the network security scene represented by the security event to be processed is an uncommon, niche scene; conversely, the larger the number, the more it is a common scene. Therefore, the number of similar security events of the security event to be processed can be analyzed in combination with the criticality of each dimension, so that key dimensions are selected from all dimensions and the influence of redundant dimensions on subsequent analysis is eliminated.

First, the number of similar security events of the security event to be processed is taken as the numerator, the number of all historical security events is taken as the denominator, and the ratio is taken as the similar scene proportion value of the security event to be processed. A small similar scene proportion value indicates that the security event to be processed represents a niche scene, for which more dimensions are required in subsequent analysis. Therefore, the difference between the value 1 and the similar scene proportion value can be taken as the weight parameter of the security event to be processed, and the product of the weight parameter and the number of all dimensions is rounded up to obtain the minimum dimension number.

As an example, in one embodiment of the present invention, the expression of the minimum number of dimensions may specifically be, for example:

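$$F_{\min} = \left\lceil \left(1 - \frac{E}{N}\right) \times H \right\rceil$$

wherein $F_{\min}$ represents the minimum dimension number, $E$ represents the number of similar security events of the security event to be processed, and $N$ represents the number of all historical security events in the scene library; $\frac{E}{N}$ represents the similar scene proportion value of the security event to be processed; $1 - \frac{E}{N}$ represents the weight parameter of the security event to be processed; $\lceil \cdot \rceil$ represents rounding up; and $H$ represents the number of all dimensions.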
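As a non-limiting illustration, the minimum dimension number may be computed as in the following Python sketch. The function name `minimum_dimension_count` is hypothetical, and the use of exact rational arithmetic is a design choice of this sketch, made so that floating-point rounding cannot push a product that is exactly an integer up to the next integer when it is rounded up.

```python
import math
from fractions import Fraction

def minimum_dimension_count(similar_count, history_count, dimension_count):
    """F_min = ceil((1 - E / N) * H), evaluated with exact fractions."""
    proportion = Fraction(similar_count, history_count)   # similar scene proportion E / N
    weight = 1 - proportion                                # weight parameter 1 - E / N
    return math.ceil(weight * dimension_count)

# 20 similar events among 100 historical events, 5 dimensions in total
f_min = minimum_dimension_count(20, 100, 5)   # (1 - 0.2) * 5 = 4, so F_min is 4
```

With 30 similar events instead of 20, the product is 3.5 and is rounded up to 4, so a rarer scene (smaller E) never yields a smaller minimum dimension number than a more common one.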

Then, the dimensions are combined into dimension groups, where the number of dimensions included in each dimension group is greater than or equal to the minimum dimension number and less than or equal to the number of all dimensions. For example, if the number of all dimensions is 5 and the minimum dimension number is 4, any 4 dimensions may be used as one dimension group, and all 5 dimensions may also be used as one dimension group.

Further, the average of the criticality of all dimensions in each dimension group can be used as the overall criticality of that dimension group. The larger the overall criticality, the more critical the dimensions contained in the dimension group, so all dimensions in the dimension group with the maximum overall criticality can be used as the key dimensions.
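As a non-limiting illustration, the construction of dimension groups and the selection of key dimensions may be sketched in Python as follows. The function name `select_key_dimensions`, the example dimension names, and the treatment of a minimum dimension number of zero as one are assumptions made for illustration.

```python
from itertools import combinations

def select_key_dimensions(criticality, min_count):
    """Return the dimension group with the largest mean criticality.

    criticality: dict mapping dimension name to its criticality G.
    min_count:   minimum dimension number F_min.
    """
    names = sorted(criticality)
    best_group, best_score = None, float("-inf")
    # Group sizes run from F_min to the number of all dimensions;
    # a size of at least 1 is enforced here as an assumption.
    for size in range(max(min_count, 1), len(names) + 1):
        for group in combinations(names, size):
            score = sum(criticality[d] for d in group) / size   # overall criticality
            if score > best_score:
                best_group, best_score = group, score
    return list(best_group), best_score

# Hypothetical criticality values for five dimensions
criticality = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.7, "e": 0.2}
key_dimensions, overall = select_key_dimensions(criticality, 4)
```

Because adding a lower-criticality dimension can only lower or maintain a group's average, the winning group under this rule consists of the F_min dimensions with the highest criticality (ties aside), so sorting by criticality and taking the top F_min entries would give the same result without enumerating every group.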

The key dimensions for scene matching between the security event to be processed and the historical security events are thus obtained. The key dimensions can subsequently be analyzed to construct, from the similar security events of the security event to be processed, a similar scene set of the security event to be processed, which improves the scene matching between the security event to be processed and the historical security events and in turn improves policy generation for the security event to be processed.

Step S4: obtaining a similar scene set of the security event to be processed according to the difference in the data information of each key dimension between the security event to be processed and each similar security event, and selecting a policy set of the security event to be processed from the policy library based on the similar scene set.

Because the key dimensions are more important in scene matching between the security event to be processed and the historical security events, the similar security events can be screened again according to the difference in the data information of each key dimension between the security event to be processed and each similar security event, so as to construct the similar scene set of the security event to be processed. The similar scene set can subsequently be used to effectively generate the policy set of the security event to be processed.

Preferably, in one embodiment of the present invention, the method for acquiring a similar scene set of a security event to be processed specifically includes:

Similar to the method for calculating the scene similarity in step S2, all key dimensions are first divided into statistical feature key dimensions and content feature key dimensions based on the form of the data information of each key dimension: the data information of a statistical feature key dimension is in numerical form, and the data information of a content feature key dimension is in text form.

The security event to be processed or any similar security event is taken as the security event to be detected. A vector formed by the data information of all statistical feature key dimensions of the security event to be detected is taken as the third feature vector of the security event to be detected, and the data information of all content feature key dimensions of the security event to be detected is processed with a bag-of-words model to obtain the fourth feature vector of the security event to be detected.

The absolute value of the cosine similarity of the third feature vector between the security event to be processed and each similar security event is used as the third similarity of each similar security event, the absolute value of the cosine similarity of the fourth feature vector between the security event to be processed and each similar security event is used as the fourth similarity of each similar security event, and the average value of the third similarity and the fourth similarity is used as the scene similarity parameter of each similar security event.

The larger the scene similarity parameter, the closer the scene represented by the similar security event is to the scene represented by the security event to be processed. Therefore, the set formed by the similar security events whose scene similarity parameters are larger than a preset second similarity threshold can be used as the similar scene set of the security event to be processed. The value range of the preset second similarity threshold is (0.5, 1). In one embodiment of the invention, the preset second similarity threshold is set to 0.8; its specific value may also be set by the implementer according to the specific implementation scene and is not limited herein.

As an example, in one embodiment of the present invention, the expression of the scene similarity parameter for each similar security event may specifically be, for example:

$$B_r = \frac{\left|D_{r,3}\right| + \left|D_{r,4}\right|}{2}$$

wherein $B_r$ represents the scene similarity parameter of the r-th similar security event, $D_{r,3}$ represents the cosine similarity of the third feature vector between the security event to be processed and the r-th similar security event, $D_{r,4}$ represents the cosine similarity of the fourth feature vector between the security event to be processed and the r-th similar security event, $\left|D_{r,3}\right|$ represents the third similarity of the r-th similar security event, and $\left|D_{r,4}\right|$ represents the fourth similarity of the r-th similar security event.
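As a non-limiting illustration, the scene similarity parameter and the construction of the similar scene set may be sketched in Python as follows. The event layout (a dictionary with `stats` and `texts` entries), whitespace tokenization, a vocabulary built from the two events being compared, a cosine similarity of zero for an all-zero vector, and all function names are assumptions made for illustration; the original text only specifies a bag-of-words model and cosine similarity.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def bag_of_words(texts, vocabulary):
    """Count how often each vocabulary word occurs in the given texts."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return [counts[word] for word in vocabulary]

def scene_similarity_parameter(pending, candidate):
    """B_r = (|D_r3| + |D_r4|) / 2 over the key dimensions of two events."""
    vocabulary = sorted({
        word
        for event in (pending, candidate)
        for text in event["texts"]
        for word in text.lower().split()
    })
    d3 = cosine(pending["stats"], candidate["stats"])            # third feature vectors
    d4 = cosine(bag_of_words(pending["texts"], vocabulary),
                bag_of_words(candidate["texts"], vocabulary))    # fourth feature vectors
    return (abs(d3) + abs(d4)) / 2

def similar_scene_set(pending, similar_events, second_threshold=0.8):
    """Keep the similar events whose B_r exceeds the second similarity threshold."""
    return [event for event in similar_events
            if scene_similarity_parameter(pending, event) > second_threshold]

# Hypothetical events restricted to their key dimensions
pending = {"stats": [1.0, 2.0], "texts": ["port scan detected"]}
near = {"stats": [2.0, 4.0], "texts": ["port scan detected"]}
far = {"stats": [2.0, -1.0], "texts": ["phishing email"]}
scene_set = similar_scene_set(pending, [near, far])   # keeps only `near`
```

In a deployment, the vocabulary would more plausibly be fixed once over the whole scene library so that fourth feature vectors can be precomputed; building it per pair here only keeps the sketch self-contained.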

The similar scene set contains the historical security events whose scenes are similar to the scene represented by the security event to be processed, so the policy set of the security event to be processed can be selected from the policy library based on the similar scene set.

Preferably, in one embodiment of the present invention, the method for acquiring the policy set of the security event to be processed specifically includes:

Because the policy library and the scene library are interrelated, that is, the policy library contains the policies for solving the historical security events, the policies for solving all historical security events in the similar scene set can be selected from the policy library, and the set formed by all selected policies is used as the policy set of the security event to be processed, thereby realizing effective generation of the policy for the security event to be processed.
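As a non-limiting illustration, the selection of the policy set may be sketched in Python as follows. The sketch assumes the policy library is a mapping from a historical security event identifier to the list of policies that resolved it, and that duplicate policies are kept only once; the mapping layout, the identifiers, the policy names, and the function name `build_policy_set` are hypothetical.

```python
def build_policy_set(similar_scene_ids, policy_library):
    """Collect the policies of every event in the similar scene set, without duplicates."""
    policy_set = []
    seen = set()
    for event_id in similar_scene_ids:
        # Events without a recorded policy contribute nothing (assumption)
        for policy in policy_library.get(event_id, []):
            if policy not in seen:
                seen.add(policy)
                policy_set.append(policy)
    return policy_set

# Hypothetical policy library keyed by historical security event identifier
policy_library = {
    "evt-001": ["block_source_ip", "rate_limit_port"],
    "evt-002": ["block_source_ip", "isolate_device"],
    "evt-003": ["reset_credentials"],
}
policies = build_policy_set(["evt-001", "evt-002"], policy_library)
```

Preserving first-seen order means that, if the similar scene set is sorted by scene similarity parameter beforehand, the policies of the closest historical events appear first in the resulting policy set.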

After the strategy set of the security event to be processed is constructed, the strategy set can be used for generating a corresponding scenario subsequently so as to solve the security event of the current Internet of things.

One embodiment of the invention provides a network security policy generation system based on an AI large model, which comprises a memory, a processor and a computer program, wherein the memory is used for storing the corresponding computer program, the processor is used for running the corresponding computer program, and the computer program can realize the method described in the steps S1-S4 when running in the processor.

It should be noted that the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.

Claims

1. A network security policy generation method based on an AI large model, the method comprising:

2. The AI-large-model-based network security policy generation method of claim 1, wherein the obtaining scene similarity for each historical security event comprises:

3. The AI-large-model-based network security policy generation method of claim 1, wherein the screening similar security events of the security events to be processed from all the historical security events based on the scene similarity comprises:

4. The AI-large-model-based network security policy generation method of claim 1, wherein the obtaining the classification accuracy of each iterative process comprises:

5. The AI-large-model-based network security policy generation method of claim 1, wherein determining whether to terminate iterative training based on the difference in classification accuracy between each iterative process and the previous iterative process comprises:

6. The AI-large-model-based network security policy generation method of claim 1, wherein obtaining the criticality of the target dimension comprises:

7. The AI-large-model-based network security policy generation method of claim 1, wherein the screening out key dimensions from all dimensions comprises:

8. The AI-large-model-based network security policy generation method of claim 2, wherein obtaining a set of similar scenarios for a security event to be processed comprises:

9. The AI-large-model-based network security policy generation method of claim 1, wherein selecting a policy set of security events to be processed from a policy repository based on the set of similar scenarios comprises:

10. A network security policy generation system based on an AI large model, the system comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 9 when executing the computer program.