CN117675384B

CN117675384B - Intelligent park data security management method and system

Info

Publication number: CN117675384B
Application number: CN202311695183.9A
Authority: CN
Inventors: 施美英
Original assignee: Shenzhen Hezong Data Technology Co ltd
Current assignee: Shenzhen Hezong Data Technology Co ltd
Priority date: 2023-12-12
Filing date: 2023-12-12
Publication date: 2024-07-16
Anticipated expiration: 2043-12-12
Also published as: CN117675384A

Abstract

The invention belongs to the technical field of data security management and discloses a data security management method and system for an intelligent park. The method comprises the following steps: S1, constructing an access control matrix; S2, acquiring behavior pattern features and space track information of each person, and constructing a personnel behavior quadruple for each person from them; S3, training a hidden Markov model based on a personnel behavior analysis engine, and inputting the personnel behavior quadruples into the hidden Markov model for detection, dividing them into a normal behavior interval and an abnormal behavior interval; S4, judging whether the access behavior of a person accessing a data asset is abnormal by combining the access control matrix and the personnel behavior quadruple; if the access behavior is abnormal, sending a real-time early warning for the access behavior and starting a response mechanism. The method improves the capability to monitor and protect the security of data assets in the park and enhances the overall efficiency of data security management.

Description

Intelligent park data security management method and system

Technical Field

The invention relates to the technical field of data security management, in particular to a method and a system for data security management of an intelligent park.

Background

The patent with the application publication number of CN116843484A discloses a financial insurance data security management method based on the Internet of things, which comprises a device deployment module, a data collection module, a data encryption module, a data transmission module, a central storage module, a data monitoring module, a permission authentication module and a remote management module; according to the financial insurance data safety management method based on the Internet of things, firstly, hardware equipment of the Internet of things is deployed for control, all financial insurance data are encrypted through an encryption algorithm and then transmitted to a central storage through a safety transmission channel, data monitoring is carried out on the central storage, after abnormal conditions pass permission authentication, the data enter a central storage area for operation, and emergency conditions can be rapidly operated through remote equipment.

The patent with application publication number CN115496428A discloses an industrial safety management method and system based on big data. The method comprises: constructing a security management framework, constructing a database and acquiring data to be transmitted; analyzing the data to be transmitted through big data, extracting data keywords, and dividing the data to obtain data packets to be authorized; sending the data packets to be authorized to the respective functional departments and receiving approval comments; and sending unauthorized data packets back to the data sender for modification, re-approving them after modification, and allowing the data to be sent out once approval is complete. That invention analyzes the data, determines the functional departments the data relates to, extracts the parts related to each functional department and sends them to the corresponding departments so that each department audits its own data; the data may be sent outside only after the audit is finished, which greatly improves data security and provides a guarantee for data transmission.

The internal network of an existing park holds a large amount of data, including business data assets, and under existing security management measures data leakage events have occurred more than once. The main problems are as follows: the daily behavior patterns of staff in the park are not effectively modeled, so when abnormal data access occurs, managers cannot tell whether an illegal operation has taken place; even if the monitoring center notices suspicious download activity, it is difficult to take targeted emergency measures immediately, let alone apply customized handling mechanisms for events of different levels. The root cause is that existing systems cannot correlate and analyze the various monitoring data in real time, and risk judgment of operation behaviors relies entirely on the administrator's experience, so interpretation is delayed and operation is cumbersome. When a large amount of confidential data flows out across the intranet boundary in a short time, existing systems can only isolate the whole network indiscriminately, which interrupts normal business and has a wide impact.

In view of the above, the present invention provides a method and system for security management of data in an intelligent park to solve the above-mentioned problems.

Disclosure of Invention

In order to overcome the defects in the prior art, the invention provides the following technical scheme: a data security management method for an intelligent park, comprising the following steps:

S1, constructing an access control matrix;

S2, acquiring behavior pattern characteristics and space track information of each person, and constructing a personnel behavior quadruple according to the behavior pattern characteristics and the space track information of each person;

S3, training a hidden Markov model based on a personnel behavior analysis engine, and inputting personnel behavior quadruples into the hidden Markov model to detect and divide the personnel behavior quadruples into a normal behavior interval and an abnormal behavior interval;

S4, judging whether the access behavior of personnel to access the data asset is abnormal or not by combining the access control matrix and the personnel behavior quadruple; if the access behavior is abnormal, sending real-time early warning to the access behavior and starting a response mechanism.

Further, the construction mode of the access control matrix comprises the following steps:

Collecting all data assets within the campus, including but not limited to databases, file servers, and network devices; dividing the data assets into p categories according to the importance and sensitivity of the data assets;

constructing an access control matrix $E_{o \times p}$, wherein o is the number of personnel roles in the park and p is the number of data asset classes; according to the working requirements and the access strategies, configuring the access right of each personnel role to each data asset class in the matrix, the access rights being the matrix elements $a_{ij}$; $a_{ij}$ is the access right of personnel role i to data asset class j and takes the value 0 or 1, where 0 indicates no access right and 1 indicates access right;

the access control matrix is monitored in real time, and when the personnel roles, the data assets or the access strategies are changed, the access control matrix is updated.
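The matrix construction and update steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `AccessControlMatrix`, its methods, and the example role and asset class names are assumptions chosen for the sketch.

```python
from typing import List


class AccessControlMatrix:
    """Role-by-asset-class matrix E with entries a_ij in {0, 1}."""

    def __init__(self, roles: List[str], asset_classes: List[str]) -> None:
        self.roles = list(roles)
        self.asset_classes = list(asset_classes)
        # Start from "deny everything": every a_ij is 0.
        self.matrix = [[0] * len(self.asset_classes) for _ in self.roles]

    def set_permission(self, role: str, asset_class: str, allowed: bool) -> None:
        i = self.roles.index(role)
        j = self.asset_classes.index(asset_class)
        self.matrix[i][j] = 1 if allowed else 0

    def can_access(self, role: str, asset_class: str) -> bool:
        i = self.roles.index(role)
        j = self.asset_classes.index(asset_class)
        return self.matrix[i][j] == 1

    def add_role(self, role: str) -> None:
        # Update on a role change: a new role gets a row of zeros.
        self.roles.append(role)
        self.matrix.append([0] * len(self.asset_classes))


# Hypothetical roles and asset classes for illustration only.
acm = AccessControlMatrix(
    roles=["research", "test", "management"],
    asset_classes=["core", "important", "general"],
)
acm.set_permission("research", "core", True)
acm.set_permission("test", "general", True)
```

Starting from an all-zero matrix means a role or asset class added later is denied by default until a policy explicitly grants access.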

Further, the method for acquiring the behavior pattern characteristics of each person includes:

v video monitoring devices are deployed in critical areas of the park; when a person role i enters the coverage range of video monitoring equipment Cj, image sequence data V { i, j } is generated; V { i, j } represents the image sequence of person role i under video monitoring device Cj;

Summarizing the data of the personnel role i across the video monitoring equipment to obtain image sequence data V { i } of the i-th personnel role; extracting a video frame image set V_ { i } from the image sequence data V { i } according to the frame rate parameter f; for each frame image in the video frame image set V_ { i }, detecting and tracking personnel based on a computer vision deep learning algorithm to obtain a personnel area image sequence U_ { i };

identifying behavior pattern features through a personnel area image sequence U_ { i }; the behavior pattern features include an activity time sequence B_ { i } and an activity object sequence O_ { i }; the active time sequence B_ { i } is a time stamp corresponding to each frame of image; the active object sequence O_ { i } is a sequence formed by interaction operation categories for the interaction behavior of the personnel;

The identification mode of the interactive operation category comprises the following steps:

Carrying out semantic segmentation on a personnel area image sequence U_ { i }, segmenting e areas in the image, determining a semantic category for each area, and defining the semantic category as a semantic area;

judging personnel interaction behaviors between personnel and each region by detecting coincidence and relative motion relations among the regions in the image sequence and checking front-back correspondence relations of images according to time sequence;

combining the distinguished semantic regions with the detected personnel interaction behaviors to determine the personnel interaction operation category; the method for determining the interaction operation type of the personnel comprises the following steps:

On each frame of the image sequence, determining an object interacted by personnel at the time, and acquiring a corresponding IP port when the interacted object is park public electronic equipment; monitoring interaction behaviors of the public electronic equipment of the corresponding park according to the IP ports; and acquiring the corresponding interactive operation category according to the interactive behavior.

Further, the method for acquiring the space track information of each person includes:

setting RFID read-write equipment at c key space positions in the park; when a person role i passes RFID read-write equipment Rk, the RFID tag of person role i is scanned, and the space coordinates and time stamp are recorded to form a positioning data sequence L { i, k }; L { i, k } represents the positioning data sequence of person role i under the RFID read-write equipment Rk; summarizing the data of person role i across the RFID read-write equipment to obtain the space coordinate sequence data L { i } of the i-th personnel role;

Filtering out abnormal or discontinuous coordinate points in the data in the L { i } sequence; dividing a space organization form into beta functional areas according to an actual layout map of a park; defining a coordinate range of each functional area;

dividing L { i } into R access events of access areas, wherein each access event has corresponding entry and exit coordinates and corresponding time stamps;

and judging the entering and leaving functional areas according to the coordinates of the access event, sequencing and recording the accessed functional area sequences and the corresponding entering and leaving time for each person to form a symbol sequence representing the space track, namely finishing the extraction of the space track information.
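The division of L { i } into access events can be sketched as below. The sketch assumes that each functional area is an axis-aligned rectangle and that points lying outside every area are the "abnormal" points to be filtered; the function names and the sample areas are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float, float]          # (x, y, timestamp)
Area = Tuple[float, float, float, float]    # (x_min, y_min, x_max, y_max)


def locate(x: float, y: float, areas: Dict[str, Area]) -> Optional[str]:
    """Return the name of the functional area containing (x, y), if any."""
    for name, (x0, y0, x1, y1) in areas.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None


def access_events(
    track: List[Point], areas: Dict[str, Area]
) -> List[Tuple[str, float, float]]:
    """Collapse a track into (area, enter_time, leave_time) events."""
    events: List[Tuple[str, float, float]] = []
    current: Optional[str] = None
    enter = leave = 0.0
    for x, y, t in sorted(track, key=lambda p: p[2]):
        area = locate(x, y, areas)
        if area is None:
            continue  # assumed filter: drop points outside every area
        if area != current:
            if current is not None:
                events.append((current, enter, leave))
            current, enter = area, t
        leave = t
    if current is not None:
        events.append((current, enter, leave))
    return events


areas = {"lobby": (0, 0, 10, 10), "lab": (10, 0, 20, 10)}
track = [(1, 1, 0.0), (5, 5, 10.0), (99, 99, 15.0), (12, 3, 20.0), (15, 4, 30.0)]
events = access_events(track, areas)
```

The resulting list of area names with entry and exit times is one possible form of the symbol sequence that represents a person's space track.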

Further, defining a person ID for each person; correlating the behavior pattern characteristics of the personnel with the space track information to form a personnel behavior quadruple; the personnel behavior quadruple comprises personnel ID, behavior pattern characteristics, space track information and time period; the time period is represented by the in-out time in the space trajectory information;

the acquisition process of the personnel behavior analysis engine comprises a data preparation stage and an engine training stage;

The data preparation phase includes:

For each personnel role in the park, collecting at least k personnel behavior quadruples as a training sample set, wherein the at least k personnel behavior quadruples comprise normal behavior patterns and abnormal behavior patterns; preprocessing the training sample set, wherein the preprocessing comprises removing redundant and repeated data and converting text descriptions in the training sample set into numerical representations;

extracting personnel interaction features and personnel track features from the personnel behavior quadruples in the training sample set; the personnel interaction features comprise time sequence data of personnel interaction operations, the operation category distribution and the operation transfer matrix; the personnel track features comprise time sequence data and spatial distribution features of personnel activity tracks;

The method for extracting the personnel interaction features and the personnel track features from the personnel behavior quadruples in the training sample set comprises the following steps:

the time sequence data of the interaction operation is obtained from the active time sequence B_ { i }, and the original time stamp information is kept;

analyzing the active object sequence O_ { i }, extracting all the interactive operation categories appearing to form an operation category set C; traversing the active object sequence O_ { i }, counting the occurrence times of each category ck in the whole sequence, and marking as n (ck);

calculating the probability of each operation category occurring among all operations:

$$P(c_k) = \frac{n(c_k)}{\sum_{j} n(c_j)}$$

wherein $\sum_{j} n(c_j)$ is the sum of the occurrence counts of all operation categories;

calculating the occurrence probability P (ck) for each category ck, and collecting all P (ck) values to form probability distribution of operation categories, namely operation category distribution;

Traversing the active object sequence O_ { i }, and counting the times of transferring each operation category ci to the category cj to form a transfer count matrix M; the rows and columns of the transfer count matrix M are operation types; the matrix elements of the transfer count matrix M are the transfer occurrence times from ci to cj;

Calculating the transition probabilities

$$P_{ij} = \frac{M_{ij}}{\sum_{k} M_{ik}}$$

i.e., the ratio of the number of transitions from ci to cj to the number of transitions from ci to all categories; wherein $M_{ij}$ is the number of transitions from ci to cj and $\sum_{k} M_{ik}$ is the number of transitions from ci to all categories; all $P_{ij}$ values form a matrix, namely the Markov transfer matrix among operation categories, i.e., the operation transfer matrix;

the time sequence data of the personnel movement track is obtained from time stamp data in the space coordinate sequence data L { i };

Extracting all position coordinate data from the space coordinate sequence data L { i }; calculating the maximum value, the minimum value and the range of the coordinate data;

Dividing the whole space into V grid cells, and calculating the occurrence times of coordinates for each grid cell to obtain the distribution frequency of the corresponding positions of each grid cell in the space; and calculating the proportion of the frequency to the total frequency, and converting the proportion into probability, namely the spatial distribution characteristic.
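The feature extraction steps above can be sketched as follows. The sketch assumes that the operation category distribution is each category's count divided by the total count, that the transfer probability is the row-normalized transfer count, and that the grid is an nx-by-ny partition of the bounding box of the coordinates. All function names and sample values are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def category_distribution(ops: List[str]) -> Dict[str, float]:
    """P(c_k) = n(c_k) / sum_j n(c_j)."""
    counts = Counter(ops)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def transfer_matrix(ops: List[str]) -> Tuple[List[str], List[List[float]]]:
    """P_ij = M_ij / sum_k M_ik; rows and columns sorted by category name."""
    cats = sorted(set(ops))
    index = {c: i for i, c in enumerate(cats)}
    m = [[0] * len(cats) for _ in cats]
    for a, b in zip(ops, ops[1:]):
        m[index[a]][index[b]] += 1          # transfer count matrix M
    p = []
    for row in m:
        total = sum(row)
        # A category with no observed successor keeps an all-zero row.
        p.append([n / total if total else 0.0 for n in row])
    return cats, p


def spatial_distribution(
    coords: List[Tuple[float, float]], nx: int, ny: int
) -> Dict[Tuple[int, int], float]:
    """Share of coordinate points falling in each cell of an nx-by-ny grid."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x_min, y_min = min(xs), min(ys)
    # Guard against a zero range when all points share one coordinate.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    counts: Dict[Tuple[int, int], int] = {}
    for x, y in coords:
        # Clamp so points on the upper boundary land in the last cell.
        cx = min(int((x - x_min) / x_span * nx), nx - 1)
        cy = min(int((y - y_min) / y_span * ny), ny - 1)
        counts[(cx, cy)] = counts.get((cx, cy), 0) + 1
    return {cell: n / len(coords) for cell, n in counts.items()}


ops = ["file", "file", "network", "file", "application"]
op_dist = category_distribution(ops)
cats, P = transfer_matrix(ops)
grid_probs = spatial_distribution(
    [(0.0, 0.0), (1.0, 1.0), (9.0, 9.0), (10.0, 10.0)], nx=2, ny=2
)
```

Only occupied grid cells appear in the result; empty cells implicitly have probability zero.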

Further, the engine training phase includes:

The personnel interaction characteristics and the personnel track characteristics are spliced and fused to represent the personnel behavior characteristics; formatting the personnel behavior characteristic representation into time series data and carrying out normalization processing; obtaining a training initial set; dividing the training initial set into a training set and a verification set;

constructing a basic structure of an LSTM model, wherein the basic structure consists of an input layer, an h-layer LSTM layer and a full-connection layer;

The input layer represents the dimension of the current time step feature, and the dimension is set as the dimension of the human behavior feature; the LSTM layer is used for mining time sequence association between time step characteristics; the full connection layer converts the output of the last LSTM layer into a characteristic representation;

Inputting the training set into the LSTM model and setting the training hyperparameters of the LSTM model, wherein the training hyperparameters comprise the learning rate, number of training rounds, batch size and regularization strength; training the parameters of the LSTM model through an error back propagation algorithm;

the error back propagation algorithm comprises the following modes:

Defining a loss function of the LSTM model, wherein the loss function adopts a mean square error function;

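The mean square error function is

$$J(\theta) = \frac{1}{\omega} \sum_{\varepsilon=1}^{\omega} \left( y_{\varepsilon} - \hat{y}_{\varepsilon} \right)^{2}$$

wherein ω is the number of training set samples selected in a single iteration; $y_{\varepsilon}$ is the real output variable of sample ε; $\hat{y}_{\varepsilon}$ is the output variable the model predicts for sample ε when taking $x_{\varepsilon}$ as input;

initializing the model parameter θ; at each iteration, calculating the model output $\hat{y}_{\varepsilon}$ under the current parameters, then calculating the loss function $J(\theta)$ and its derivative with respect to the parameter, $\frac{\partial J(\theta)}{\partial \theta}$; wherein $\frac{\partial J(\theta)}{\partial \theta}$ is the derivative term of the loss function J(θ) with respect to the parameter θ; updating the parameter θ by the gradient descent method;

The gradient descent method is as follows:

$$\theta' = \theta - \eta \frac{\partial J(\theta)}{\partial \theta}$$

wherein θ' is the parameter for the next iteration; η is the learning rate and represents the step length when the parameters are updated;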

Evaluating the model on the verification set using performance indexes; selecting the best-performing model over g rounds of training;

Evaluating the performance indexes of the LSTM model by K-fold cross-validation; and optimizing by adjusting parameters to obtain the personnel behavior analysis engine of the personnel role.
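The mean square error loss and the gradient descent update used in training can be illustrated on a deliberately tiny model. The sketch below is not an LSTM: it assumes a one-parameter stand-in model ŷ = θ·x so that the loss, its derivative and the update θ' = θ − η·∂J/∂θ can be written out by hand. The function names and data are hypothetical.

```python
from typing import List, Tuple


def mse_and_grad(theta: float, batch: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return J(theta) and dJ/dtheta for the toy model y_hat = theta * x."""
    omega = len(batch)  # number of samples in this iteration
    loss = sum((y - theta * x) ** 2 for x, y in batch) / omega
    grad = sum(-2.0 * x * (y - theta * x) for x, y in batch) / omega
    return loss, grad


def gradient_descent(
    batch: List[Tuple[float, float]],
    theta: float = 0.0,
    eta: float = 0.05,
    steps: int = 200,
) -> float:
    """Repeat the update theta' = theta - eta * dJ/dtheta."""
    for _ in range(steps):
        _, grad = mse_and_grad(theta, batch)
        theta = theta - eta * grad
    return theta


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x
theta_hat = gradient_descent(batch)
final_loss, _ = mse_and_grad(theta_hat, batch)
```

In an actual LSTM the derivative is obtained by back propagation through time rather than by a closed-form expression, but the update rule has the same shape.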

Further, the training mode of the hidden Markov model includes:

taking the personnel behavior analysis engine as a hidden state of the hidden Markov model, and taking the personnel behavior quadruple as an observation state of the hidden Markov model; based on a personnel behavior analysis engine, calculating probability distribution of transition between different hidden states as state transition probability of a hidden Markov model;

The calculation mode of transition probability distribution between hidden states comprises the following steps:

In the hidden Markov model established by the personnel behavior analysis engine, defining a hidden state set S = {S1, S2, ..., SN}, where N is the number of hidden states;

extracting the transition sequences among all occurring hidden states from the training sample data set; traversing all the extracted hidden state transition sequences, and counting the number of transitions $C(S_i \to S_j)$ between each pair of hidden states Si and Sj, namely the number of times that state Si is transferred to state Sj; simultaneously counting the total number of occurrences $C(S_i)$ of each state Si;

calculating the transition probability from each state Si to state Sj according to the formula

$$P_{ij} = \frac{C(S_i \to S_j)}{C(S_i)}$$

wherein $C(S_i)$ represents the total number of occurrences of state Si; repeating the calculation yields the transition probability matrix between hidden states, which is the probability distribution of transitions between different hidden states;

After obtaining the transition probability matrix, setting the output probability of the hidden Markov model and the initial state distribution parameters; wherein the output probability is given by a human behavior analysis engine;

extracting behavior pattern features and space trajectory information of a person, expressed as a behavior feature sequence X = {x1, x2, ..., xT}; wherein xt represents the behavior feature vector of the t-th time step and T is the number of time steps;

Based on the trained personnel behavior analysis engine, obtaining the hidden state set S = {S1, S2, ..., SN} and the inter-state transition probability matrix P; the hidden state set S comprises N hidden states, and the transition probability matrix P is an N × N matrix; the matrix element Pij of the transition probability matrix P represents the transition probability from the hidden state Si to the hidden state Sj;

Inputting the characteristic vector xt of each time step t in the behavior characteristic sequence X into a hidden Markov model in sequence for calculation to obtain an output result of each time step;

calculating the most likely corresponding hidden state sequence by using a forward-backward algorithm, obtaining the most likely hidden state sequence Q = {q1, q2, ..., qT};

Substituting the obtained most probable hidden state sequence Q into a hidden Markov model, and calculating probability P (X|model parameters) of the behavior feature sequence X under given model parameters;

Testing the model on the verification set l times, and terminating the model training if the performance index on the verification set shows no obvious improvement for r consecutive tests; obtaining a trained hidden Markov model;

Setting a region threshold, judging an abnormal mode if the probability P (X|model parameters) of the behavior feature sequence X under the given model parameters is larger than or equal to the region threshold, and judging a normal mode if the probability P (X|model parameters) of the behavior feature sequence X under the given model parameters is smaller than the region threshold;

collecting samples of the training set judged to be in a normal mode, and forming a normal behavior interval; and collecting samples of the training set judged to be in an abnormal mode to form an abnormal behavior interval.
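The probability P (X|model parameters) used above can be computed with the forward recursion, the forward half of the forward-backward algorithm. The sketch below is an assumption-laden illustration: it treats each behavior feature vector as already discretized into an observation symbol, and it sums over all hidden state paths rather than evaluating a single sequence Q. The names `pi`, `A`, `B` and the numeric values are hypothetical.

```python
from typing import List


def forward_likelihood(
    obs: List[int],
    pi: List[float],
    A: List[List[float]],
    B: List[List[float]],
) -> float:
    """P(X | model) for a discrete HMM via the forward recursion.

    pi[i]   initial probability of hidden state S_i
    A[i][j] transition probability P_ij from S_i to S_j
    B[i][o] probability that state S_i emits observation symbol o
    """
    n = len(pi)
    # alpha[i] = P(x_1..x_t, q_t = S_i), initialised at t = 1.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)


pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
likelihood = forward_likelihood([0, 0, 1], pi, A, B)
```

For long sequences the products underflow, so a practical version would work with scaled or log-domain values; that refinement is omitted here for brevity.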

Further, the method for judging whether the access behavior of the personnel to the data asset is abnormal by combining the access control matrix and the personnel behavior quadruple comprises the following steps:

When person p accesses data asset q, determining whether $a_{pq}$ is equal to 1; if $a_{pq} = 0$, the access behavior of person p is judged to be abnormal; if $a_{pq} = 1$, acquiring the personnel behavior quadruple corresponding to person p;

Judging whether the personnel behavior quadruple corresponding to person p belongs to the abnormal behavior interval; if it belongs to the abnormal behavior interval, the access behavior of person p is judged to be abnormal; if it does not belong to the abnormal behavior interval, the access behavior of person p is judged to be normal.
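The two-stage judgment can be sketched as follows. The sketch reads the second check as "abnormal only if the quadruple falls in the abnormal behavior interval", represents that interval as a plain set of quadruples, and indexes the matrix by role and asset class. These representations and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class BehaviorQuadruple:
    person_id: str
    behavior_pattern: Tuple[str, ...]   # interaction operation categories
    spatial_track: Tuple[str, ...]      # visited functional areas
    time_period: Tuple[float, float]    # (enter_time, leave_time)


def is_access_abnormal(
    role_index: int,
    asset_class_index: int,
    matrix: List[List[int]],
    quadruple: BehaviorQuadruple,
    abnormal_interval: Set[BehaviorQuadruple],
) -> bool:
    # Stage 1: no permission in the access control matrix means abnormal.
    if matrix[role_index][asset_class_index] == 0:
        return True
    # Stage 2: permission exists, so the behavior interval decides.
    return quadruple in abnormal_interval


matrix = [[1, 0]]  # one role, two asset classes
normal_q = BehaviorQuadruple("emp-001", ("file",), ("lobby", "lab"), (0.0, 30.0))
odd_q = BehaviorQuadruple("emp-002", ("network",), ("server-room",), (0.0, 5.0))
abnormal_interval = {odd_q}

permitted_and_normal = is_access_abnormal(0, 0, matrix, normal_q, abnormal_interval)
```

Checking the matrix first keeps the cheap permission lookup ahead of the behavior check, so an unauthorized access is flagged without consulting the model output at all.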

Further, the method for sending the real-time early warning and starting the response mechanism comprises the following steps:

Constructing a topic-based publish/subscribe pattern model; subscribing corresponding topics by security management personnel, and receiving early warning information in real time; the content of the early warning message comprises event time, related personnel, assets and operations; the publishing/subscribing mode model supports the instant messaging application to push the early warning to the mobile terminal;

constructing an access audit database and predefining an access event data model; the access event data model comprises the dimensions to be recorded; when the access behavior is abnormal, writing the event record into the access audit database synchronously in batch mode;

Acquiring the IP address of the person and finding the switch network port to which it is connected; calling the API of the gatekeeper device and issuing an access control policy; the content of the access control policy includes rules that prohibit network access from the IP address; the gatekeeper device forcibly disconnects the network connection to stop subsequent access.
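The topic-based publish/subscribe model and the audit record can be sketched with an in-memory broker. This is a simplified stand-in under stated assumptions: a real deployment would use a message broker and a database, and the class `AlertBus`, the topic name and the alert fields shown here are hypothetical. The gatekeeper API call is not modeled because the source does not describe its interface.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass
class Alert:
    event_time: str
    person_id: str
    asset: str
    operation: str


class AlertBus:
    """In-memory topic-based publish/subscribe broker."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Alert], None]]] = (
            defaultdict(list)
        )

    def subscribe(self, topic: str, handler: Callable[[Alert], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, alert: Alert) -> int:
        """Deliver the alert to every handler on the topic; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(alert)
        return len(handlers)


bus = AlertBus()
audit_log: List[Alert] = []  # stand-in for the access audit database
bus.subscribe("abnormal-access", audit_log.append)
delivered = bus.publish(
    "abnormal-access",
    Alert("2023-12-12T09:30:00", "emp-001", "core", "download"),
)
```

Because subscribers register per topic, security managers responsible for different asset classes could subscribe only to the topics that concern them.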

The intelligent park data security management system is implemented based on the intelligent park data security management method and comprises: the matrix construction module, which is used for constructing an access control matrix;

the quadruple construction module is used for acquiring the behavior pattern characteristics and the space track information of each person and constructing a person behavior quadruple according to the behavior pattern characteristics and the space track information of each person;

The comprehensive processing module comprises a model training unit and an interval dividing unit;

the model training unit is used for training the hidden Markov model based on the personnel behavior analysis engine;

the interval dividing unit is used for inputting the personnel behavior quadruple into the hidden Markov model to be detected and divided into a normal behavior interval and an abnormal behavior interval;

the early warning response module is used for combining the access control matrix and the personnel behavior quadruple to judge whether the access behavior of personnel to access the data asset is abnormal or not; if the access behavior is abnormal, sending real-time early warning to the access behavior and starting a response mechanism.

The intelligent park data security management method and system have the technical effects and advantages that:

By constructing an access control matrix, modeling multi-source heterogeneous video and space track data, and training a hidden Markov model, the behavior patterns of the various personnel in the park, including features such as active time periods, interactive operation types and the spatial position change tracks within working areas, can be depicted comprehensively and accurately. On this basis, each operation in which personnel in the park access data assets can be monitored in real time; when an abnormal access event is captured, the risk level of the event can be judged rapidly, and early warning information of a corresponding degree can be sent as the situation requires. Meanwhile, the system triggers related security emergency response measures, such as disconnecting the network connection and remotely locking accounts, so as to promptly prevent further data leakage or damage and minimize the harm caused by a security event. By combining these technical means, the capability to monitor and protect the security of data assets in the park is improved, and the overall efficiency of data security management is enhanced.

Drawings

FIG. 1 is a schematic diagram of a method for security management of data in an intelligent park according to the present invention;

FIG. 2 is a schematic diagram of a system for security management of data in an intelligent park according to the present invention;

FIG. 3 is a schematic diagram of an electronic device of the present invention;

FIG. 4 is a schematic diagram of a storage medium of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Example 1

Referring to fig. 1, a method for managing data security of an intelligent park according to the present embodiment includes:

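S1, constructing an access control matrix;

S2, acquiring behavior pattern characteristics and space track information of each person, and constructing a personnel behavior quadruple according to the behavior pattern characteristics and the space track information of each person;

S3, training a hidden Markov model based on a personnel behavior analysis engine, and inputting personnel behavior quadruples into the hidden Markov model to detect and divide the personnel behavior quadruples into a normal behavior interval and an abnormal behavior interval;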

S4, judging whether the access behavior of personnel to access the data asset is abnormal or not by combining the access control matrix and the personnel behavior quadruple; if the access behavior is abnormal, sending real-time early warning to the access behavior and starting a response mechanism;

further, the construction mode of the access control matrix includes:

collecting all data assets within the campus, including but not limited to databases, file servers, and network devices; dividing the data assets into p categories according to the importance and sensitivity of the data assets; for example, core data, important data, general data, and the like;

monitoring the access control matrix in real time, and updating the access control matrix in time when personnel roles, data assets or access strategies are changed;

It should be noted that the importance of a data asset reflects its value to, or impact on, the operation of the park's business; the sensitivity reflects the degree of damage caused if the data asset is leaked or damaged; core data refers to the most critical data assets related to the survival and development of the park, e.g., financial databases and design drawings; important data refers to data supporting the operation of the park's critical business, such as employee information, customer information and design documents under development; general data refers to data assets of lesser importance and sensitivity, such as internal communication information and log files;

In particular, there are typically many different types of employees within a campus, such as research and development personnel, testers, and administrators, etc.; the type and amount of information that these personnel groups need to access and process at work also varies; these persons are divided into a plurality of "person roles" according to job responsibilities. For example, dividing the personnel in a campus into research roles, test roles, management roles, and the like;

further, the method for obtaining the behavior pattern characteristics and the space trajectory information of each person includes:

v video monitoring devices are deployed in critical areas of the park; RFID read-write equipment is set at c key space positions in the park; when a person role i enters the coverage range of video monitoring equipment Cj, video recording is triggered to generate image sequence data V { i, j }; V { i, j } represents the image sequence of person role i under device Cj; when a person role i passes RFID read-write equipment Rk, the RFID tag of person role i is scanned, and the space coordinates and time stamp are recorded to form a positioning data sequence L { i, k }; L { i, k } represents the positioning data sequence of person role i under the RFID read-write equipment Rk;

Summarizing the data of the personnel role i in the video monitoring equipment to obtain image sequence data V { i } of the ith personnel role; summarizing the data of the personnel role i in the RFID read-write equipment to obtain the spatial coordinate sequence data of the ith personnel role as L { i };

Extracting a video frame image set V_ { i } from the image sequence data V { i } according to the frame rate parameter f; for each frame image in the video frame image set V_ { i }, detecting and tracking personnel based on a computer vision deep learning algorithm to obtain a personnel area image sequence U_ { i };

The specific implementation mode of the personnel detection and tracking method comprises the following steps:

Preprocessing the extracted video frame image data, wherein the preprocessing comprises image denoising, contrast enhancement and the like;

On the preprocessed image, performing personnel identification by using a deep-learning-based personnel detection algorithm such as YOLO or Faster R-CNN; these detection algorithms can rapidly locate the area where a person appears in the image and give a bounding box;

After detecting a person, extracting a person corresponding area image, and tracking the position movement of the person in a subsequent frame image;

Because false detections may exist in detection and tracking, the tracking results need to be filtered, and tracks that are discontinuous or show abrupt position changes are deleted; the region images of the tracked personnel are combined in time order to form the image sequence U_ { i };
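One way to filter tracking results is sketched below. It assumes that "discontinuous" means too many missing frames between consecutive detections and that an invalid position change means a jump larger than a pixel threshold. The thresholds `max_gap` and `max_jump` and the function name are hypothetical and would need tuning for a real camera setup.

```python
from typing import List, Tuple

Box = Tuple[int, float, float]  # (frame_index, center_x, center_y)


def is_valid_track(track: List[Box], max_gap: int = 5, max_jump: float = 50.0) -> bool:
    """Reject tracks with long frame gaps or abrupt position jumps."""
    for (f0, x0, y0), (f1, x1, y1) in zip(track, track[1:]):
        if f1 - f0 > max_gap:
            return False  # discontinuous: too many missing frames
        if ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 > max_jump:
            return False  # abrupt position change between detections
    return True


tracks = [
    [(0, 10.0, 10.0), (1, 12.0, 11.0), (2, 14.0, 12.0)],  # smooth
    [(0, 10.0, 10.0), (1, 300.0, 10.0)],                  # abrupt jump
    [(0, 10.0, 10.0), (20, 11.0, 10.0)],                  # long gap
]
kept = [t for t in tracks if is_valid_track(t)]
```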

It should be noted that, the identification manner of the interaction category includes:

Carrying out semantic segmentation on the personnel area image sequence U_ { i }, and segmenting out a plurality of areas in the image, wherein each area determines a semantic category, such as a desk area, a chair area and a computer area;

Determining interaction behaviors between personnel and each region, such as movements of moving, holding, sitting and the like, by detecting coincidence and relative motion relations among the frame regions in the image sequence; judging by examining the front-back correspondence of the images according to the time sequence;

Combining the distinguished semantic regions with the detected personnel interaction behaviors to determine the personnel interaction operation category; for example, if the actions such as sitting down and hand movement of the person in the computer area are detected, the interaction object is judged to be a computer; on each frame of the image sequence, determining an object interacted by personnel at the time, and acquiring a corresponding IP port when the interacted object is a computer or other public electronic equipment in a park; monitoring interaction behaviors of the corresponding computers or other public electronic equipment in the park according to the IP ports; acquiring corresponding interactive operation categories according to the interactive behaviors; the interactive operation class specifically comprises a file operation class, an application program class, a system class, a network activity class and the like; for example, creating/saving/downloading/uploading files, copying/cutting/pasting/deleting files, modifying file attributes, searching/filtering/ordering file lists, decompressing or compressing files and loading external storage devices, belonging to the file operation class;

For example, starting/closing/logging in to applications, using office software such as word processing/spreadsheet/presentation/mail clients, using image/video/audio processing software, accessing database applications, etc., belong to the application program class;

It should be noted that, the determination of the interactive object is combined with the facility map or directory in the park, so as to ensure that the recognition result accurately matches the actual scene object, and not just the visual area divided by the image;

Because the RFID is possibly interfered by signals or equipment errors in the reading and writing process, abnormal or discontinuous coordinate points of data in the L { i } sequence are filtered out, and the influence of error data is eliminated;

Dividing the park space into β functional areas, such as office area A, conference room B and the like, according to the actual layout map of the park; defining a coordinate range for each functional area;

Determining, from the coordinates of each access event, which functional area is entered or left; for each person, recording in order the sequence of functional areas visited and the corresponding entry and exit times, forming a symbol sequence that represents the spatial trajectory, which completes the extraction of the space track information;
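The mapping from filtered RFID coordinates to a sequence of visited functional areas can be sketched as follows. This is only an illustration under stated assumptions: functional areas are taken to be axis-aligned rectangles, readings are `(timestamp, x, y)` tuples already sorted by time, and the names `Zone`, `locate` and `extract_trajectory` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A functional area, assumed here to be an axis-aligned rectangle."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return self.x_min <= x < self.x_max and self.y_min <= y < self.y_max


def locate(zones, x, y):
    """Return the name of the first zone containing (x, y), or None."""
    for zone in zones:
        if zone.contains(x, y):
            return zone.name
    return None


def extract_trajectory(zones, readings):
    """Turn time-ordered (timestamp, x, y) readings into a list of
    (zone_name, enter_time, leave_time) visits."""
    visits = []
    current, entered, last_seen = None, None, None
    for t, x, y in readings:
        name = locate(zones, x, y)
        if name != current:
            # The person left `current` (if any) and entered `name`.
            if current is not None:
                visits.append((current, entered, last_seen))
            current, entered = name, t
        last_seen = t
    if current is not None:
        visits.append((current, entered, last_seen))
    return visits
```

Readings that fall outside every defined area are treated as "no area" and close the current visit; a deployment might instead prefer to bridge short gaps.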

Further, the construction mode of the personnel behavior quadruple comprises the following steps:

Defining a person ID for each person; correlating the behavior pattern characteristics of the personnel with the space track information to form a personnel behavior quadruple; the personnel behavior quadruple comprises personnel ID, behavior pattern characteristics, space track information and time period;

Specifically, the person ID uniquely identifies a person; the space track information represents the sequence of spatial position changes of the person within the time period; the time period represents the time span of the behavior pattern and the spatial trajectory, and is given by the entry and exit times in the space track information;

For example, one behavior quadruple of person X is (X, [open File A, edit File A, print File A], [zone 1, zone 2, zone 3], 2022-05-06 08:00-12:00); it indicates that between 8:00 and 12:00 on May 6, 2022, person X performed the operations of opening file A, editing file A and printing file A (behavior pattern), and that the spatial position of person X changed through zone 1, zone 2 and zone 3 (spatial trajectory);
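One possible in-memory representation of the quadruple, populated with the example of person X, is sketched below. The class name `BehaviorQuadruple` and its field names are hypothetical; the patent only specifies the four components.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class BehaviorQuadruple:
    """(person ID, behavior pattern features, space track information, time period)."""
    person_id: str
    behavior_pattern: Tuple[str, ...]
    spatial_trajectory: Tuple[str, ...]
    period: Tuple[datetime, datetime]  # (entry time, exit time)

    def duration_hours(self):
        start, end = self.period
        return (end - start).total_seconds() / 3600.0


# The example quadruple of person X from the text.
example = BehaviorQuadruple(
    person_id="X",
    behavior_pattern=("open File A", "edit File A", "print File A"),
    spatial_trajectory=("zone 1", "zone 2", "zone 3"),
    period=(datetime(2022, 5, 6, 8, 0), datetime(2022, 5, 6, 12, 0)),
)
```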

further, the acquiring process of the personnel behavior analysis engine comprises the following steps:

For each personnel role in the park, collecting at least k personnel behavior quadruples as a training sample set, the at least k personnel behavior quadruples including both normal behavior patterns and abnormal behavior patterns; preprocessing the training sample set, removing redundant and repeated data, and converting the text descriptions in the training sample set into numerical representations;

It should be noted that, the method for extracting the personnel interaction feature and the personnel track feature from the personnel behavior quadruple includes:

The operation category distribution is obtained by counting all the interactive categories appearing in the active object sequence O_ { i } and probability distribution thereof;

Specifically, analyzing the active object sequence O_ { i }, extracting all the interactive operation categories that appear, and forming an operation category set C; for example, the operation category set C includes, but is not limited to, {open file, edit file, save file, ..., start application, close application, etc.};

traversing the active object sequence O_ { i }, counting the occurrence times of each category ck in the whole sequence, and marking as n (ck);

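Calculating the probability of each operation category among all operations, $P(c_k) = n(c_k) / \sum_{j} n(c_j)$, where the denominator is the total number of operations in the sequence;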

Calculating the occurrence probability P (ck) for each category ck, and collecting all P (ck) values, namely forming the probability distribution of the operation category;

Optionally, normalizing the probability distribution to make the sum of the probabilities be 1 so as to eliminate the influence of the length of the sample sequence;
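The counting and normalization steps above can be sketched as follows, assuming the active object sequence is available as a plain list of operation category labels. The function name `operation_category_distribution` is hypothetical.

```python
from collections import Counter


def operation_category_distribution(sequence):
    """Return {category: probability} for the categories in `sequence`.

    n(ck) is the number of times category ck occurs; dividing by the total
    number of operations makes the probabilities sum to 1.
    """
    counts = Counter(sequence)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}
```

For instance, the sequence `["open file", "edit file", "edit file", "save file"]` gives "edit file" a probability of 0.5 and the other two categories 0.25 each.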

The operation transition matrix is obtained by calculating the transition probabilities among the different interactive operation categories and constructing a Markov transition matrix among the operation categories;

Specifically, traversing the active object sequence O_ { i }, and counting the number of times each operation category ci transfers to category cj to form a transition count matrix M; the rows and columns of the transition count matrix M are the operation categories; the matrix element Mij of the transition count matrix M is the number of transitions from ci to cj;

Calculating the transition probabilities $P_{ij} = M_{ij} / \sum_{k} M_{ik}$, i.e., the probability of ci transferring to cj is the ratio of the number of transitions from ci to cj to the number of transitions from ci to all categories; $\sum_{k} M_{ik}$ is the number of transitions from ci to all categories; $M_{ij}$ is the number of transitions from ci to cj;

All Pij values form a matrix, namely the Markov transition matrix among the operation categories;
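A minimal sketch of building the count matrix M and the row-normalized probabilities Pij from a label sequence follows. The function name `transition_matrix` is hypothetical, and assigning an all-zero row to a category that never appears as the source of a transition is an assumption; the text does not say how such rows are handled.

```python
def transition_matrix(sequence, categories):
    """Return (M, P): transition counts and row-normalized probabilities.

    M[i][j] counts how often categories[i] is immediately followed by
    categories[j]; P[i][j] = M[i][j] / sum_k M[i][k].
    """
    index = {c: i for i, c in enumerate(categories)}
    n = len(categories)
    counts = [[0] * n for _ in range(n)]
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[index[prev]][index[nxt]] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # Assumption: a category with no outgoing transitions gets a zero row.
        probs.append([m / total if total else 0.0 for m in row])
    return counts, probs
```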

The acquisition of the spatial distribution characteristics is to count the position distribution state of the position coordinates;

Specifically, all position coordinate data are extracted from the space coordinate sequence data L { i }; calculating the maximum value, the minimum value and the range of the coordinate data;

Dividing the whole space into a plurality of grid units, or classifying coordinates by adopting a clustering method; calculating the frequency of occurrence of coordinates for each grid cell to obtain the distribution frequency of the corresponding position of each grid cell in the space; calculating the proportion of the frequency to the total frequency, and converting the proportion into probability, namely a position distribution state;
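The grid variant of the position distribution can be sketched as below, assuming two-dimensional `(x, y)` coordinates and square cells of a chosen size. The function name and the `cell_size` parameter are hypothetical, and the clustering alternative mentioned in the text is not shown.

```python
from collections import Counter


def grid_position_distribution(points, cell_size):
    """Return {(col, row): probability} over the grid cells that are visited.

    Cells are indexed relative to the minimum x and y found in `points`.
    """
    if not points:
        return {}
    x_min = min(x for x, _ in points)
    y_min = min(y for _, y in points)
    cells = Counter(
        (int((x - x_min) // cell_size), int((y - y_min) // cell_size))
        for x, y in points
    )
    total = sum(cells.values())
    return {cell: n / total for cell, n in cells.items()}
```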

The personnel interaction characteristics and the personnel track characteristics are spliced and fused to represent the personnel behavior characteristics; formatting the personnel behavior characteristic representation into time series data and carrying out normalization processing; obtaining a training initial set; dividing the training initial set into a training set and a verification set; the dividing ratio is set according to actual conditions; for example, an 80% training set and a 20% validation set;

the splicing and fusion mode is as follows:

constructing a fixed-length vector representation comprising an interactive feature field and a track feature field; for example, the vector length is set to m+n, the first m dimensions represent interaction features, and the last n dimensions represent trajectory features;

The personnel interaction features and the personnel track features are spliced in sequence into the defined fixed-length vector; the first m dimensions take the interaction features and the last n dimensions take the track features, finally forming a behavior feature vector that integrates the two types of features; the numerical values of the features are normalized, for example by min-max normalization, which maps the feature values to the [0,1] interval;
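The splicing and min-max normalization can be illustrated with the short sketch below. It assumes that normalization is applied per dimension across a batch of feature vectors and that a constant dimension maps to 0; neither detail is specified in the text, and the function names are hypothetical.

```python
def splice(interaction_features, trajectory_features):
    """Concatenate m interaction dims and n trajectory dims into one vector."""
    return list(interaction_features) + list(trajectory_features)


def min_max_normalize(vectors):
    """Map every dimension of `vectors` to [0, 1] using its own min and max."""
    if not vectors:
        return []
    dims = len(vectors[0])
    lows = [min(v[d] for v in vectors) for d in range(dims)]
    highs = [max(v[d] for v in vectors) for d in range(dims)]
    normalized = []
    for v in vectors:
        normalized.append([
            # Assumption: a dimension with no spread is mapped to 0.0.
            (v[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dims)
        ])
    return normalized
```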

The format of the time series data comprises:

sequentially taking the behavior feature vectors of the individual in each time slice along the time step sequence; obtaining time step sequence data, wherein the format is (characteristic, time);

Constructing the basic structure of an LSTM model, consisting of an input layer, h LSTM layers and a fully connected layer, where h may be 2 or 3;

the error back propagation algorithm comprises the following modes:

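Mean square error function, in its standard form: $J(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$, where $y_i$ is the target value, $\hat{y}_i$ is the model output for the i-th sample and N is the number of samples;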

Initializing a model parameter theta; for example, input layer matrices and hidden layer matrices, etc.;

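At each iteration, calculating the model output $\hat{y}$ under the current parameters, then calculating the loss function $J(\theta)$ and its derivative with respect to the parameters, $\partial J(\theta) / \partial \theta$; wherein $\partial J(\theta) / \partial \theta$ is the derivative term of the loss function J(θ) with respect to the parameter θ;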

updating the parameter theta by using a gradient descent method;
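The loop of initializing θ, evaluating the loss and its derivative, and updating by gradient descent can be illustrated on a deliberately small model. The sketch below uses a one-feature linear model rather than an LSTM, so it shows only the update rule and not backpropagation through time; the learning rate, iteration count and function names are assumptions.

```python
def mse(params, xs, ys):
    """Mean square error J(theta) of the linear model y = w * x + b."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(params, xs, ys):
    """Partial derivatives of J(theta) with respect to w and b."""
    w, b = params
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def gradient_descent(xs, ys, learning_rate=0.05, iterations=2000):
    """Start from theta = (0, 0) and repeatedly step against the gradient."""
    params = (0.0, 0.0)
    for _ in range(iterations):
        dw, db = mse_gradient(params, xs, ys)
        params = (params[0] - learning_rate * dw, params[1] - learning_rate * db)
    return params
```

In a real LSTM the same update is applied to every weight matrix, with the derivatives obtained by backpropagation, normally through a deep learning framework.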

Evaluating the model on the verification set using indexes such as accuracy and recall rate; selecting the model with optimal performance over g training runs;

Evaluating the performance indexes of the LSTM model by α-fold cross-validation; optimizing by adjusting parameters to obtain the personnel behavior analysis engine of the personnel role;

Specifically, the training initial set is randomly split into α equal folds; for example, with α=5 the set is divided into 5 folds;

For each fold, taking that fold as the cross test set and the remaining α-1 folds as the cross training set, then training and testing an LSTM model; repeating the process α times, so that each fold is used in turn as the cross test set;

Calculating performance indexes, e.g., accuracy, recall rate, F1 score, etc., on the test set in each round of testing;

Averaging the performance indexes of the α rounds of testing to obtain the average performance of the LSTM model on the data set; repeating the α-fold cross-validation after adjusting parameters, increasing or decreasing the number of network layers of the LSTM model, changing the model structure and the like, and selecting the LSTM model with optimal performance by comparing the performance indexes;
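The fold rotation and averaging can be sketched without any machine learning library as follows. The callback `train_and_score`, which stands in for training an LSTM on the cross training set and returning one performance index on the cross test set, is hypothetical, as are the function names and the fixed shuffle seed.

```python
import random


def alpha_fold_indices(n_samples, alpha, seed=0):
    """Yield (train_indices, test_indices) for each of the alpha folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one when n_samples is not a multiple of alpha.
    folds = [indices[i::alpha] for i in range(alpha)]
    for i in range(alpha):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(samples, alpha, train_and_score, seed=0):
    """Average the score returned by train_and_score over all alpha rounds."""
    scores = []
    for train_idx, test_idx in alpha_fold_indices(len(samples), alpha, seed):
        train = [samples[i] for i in train_idx]
        test = [samples[i] for i in test_idx]
        scores.append(train_and_score(train, test))
    return sum(scores) / len(scores)
```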

further, the training mode of the hidden markov model includes:

Extracting from the training data all transitions that occur between hidden states, such as {(S1→S2), (S3→S5), ..., (SM→SN)}; these transitions reflect the actual transition conditions between the hidden states in the training data;

Traversing all the extracted hidden state transitions, and counting the number of transitions C(Si→Sj) between each pair of hidden states Si and Sj, namely the number of times that state Si transfers to state Sj; simultaneously counting the total number of occurrences C(Si) of each state Si;

Calculating the transition probability from each state Si to state Sj according to the formula $P(S_j \mid S_i) = C(S_i \to S_j) / C(S_i)$, wherein C(Si) represents the total number of occurrences of state Si; repeating the calculation yields the transition probability matrix between hidden states, which is the probability distribution of transitions between the different hidden states;

After obtaining the transition probability matrix, setting the output probability distribution and initial state distribution parameters of the hidden Markov model; the output probability is given by a personnel behavior analysis engine, and the initial state distribution parameters can be set according to actual conditions; it should be noted that the initial state distribution parameters include the following parameters:

An initial state vector π; π represents the probability distribution over all hidden states at the initial time t=1; π may be estimated directly from the training samples by counting the initial probability of each state, or all states may be assumed equally likely;

Relevant parameters of the observation sequence, including an observation symbol set and a length range of the observation sequence;

Parameters of the observation probability distribution, which comprise the parameters of a Gaussian distribution or, for a discrete probability distribution, the probability mass function;

Extracting the behavior pattern features and space track information of the person from the training set, expressed as a behavior feature sequence X = {x1, x2, ..., xT}; wherein xt represents the behavior feature vector of the t-th time step, which is a multidimensional vector; the dimension of the vector xt is equal to the sum of the dimension of the behavior pattern features and the dimension of the space track information; for example, if the behavior pattern is 5-dimensional and the spatial trajectory is 3-dimensional, xt is an 8-dimensional vector;

Solving for the most probable corresponding hidden state sequence by using the forward-backward algorithm; based on recursion, the algorithm calculates, at each time step t, the forward probability and the backward probability of each hidden state given the observation xt; the most probable hidden state sequence Q = {q1, q2, ..., qT} is obtained from the product of the forward probability and the backward probability;

Substituting the obtained most probable hidden state sequence Q into the hidden Markov model, and calculating the probability P(X|model parameters) of the behavior feature sequence X under the given model parameters; this probability value reflects the degree to which the sample X accords with the currently trained model and can be used as one of the criteria for judging abnormality;
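The forward and backward recursions can be sketched for a discrete-observation hidden Markov model as below. This is a simplified illustration: observations are assumed to be integer symbol indices with an emission table `B`, whereas the text obtains output probabilities from the personnel behavior analysis engine; the names `pi`, `A`, `B` and all function names are assumptions. The decoding shown picks the most probable state at each time step from the product of forward and backward probabilities, which is not necessarily the jointly most probable path that a Viterbi decoder would return.

```python
def forward(obs, pi, A, B):
    """Forward probabilities alpha_t(s) for a non-empty observation list."""
    n = len(pi)
    alphas = [[pi[s] * B[s][obs[0]] for s in range(n)]]
    for o in obs[1:]:
        prev = alphas[-1]
        alphas.append([
            B[s][o] * sum(prev[r] * A[r][s] for r in range(n)) for s in range(n)
        ])
    return alphas


def backward(obs, A, B):
    """Backward probabilities beta_t(s), with beta_T(s) = 1."""
    n = len(A)
    betas = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = betas[0]
        betas.insert(0, [
            sum(A[s][r] * B[r][o] * nxt[r] for r in range(n)) for s in range(n)
        ])
    return betas


def posterior_decode(obs, pi, A, B):
    """Return (per-step most probable states, P(X | model parameters))."""
    alphas = forward(obs, pi, A, B)
    betas = backward(obs, A, B)
    likelihood = sum(alphas[-1])
    states = [
        max(range(len(pi)), key=lambda s: a[s] * b[s])
        for a, b in zip(alphas, betas)
    ]
    return states, likelihood
```

For long sequences the probabilities underflow, so practical implementations scale each step or work in log space.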

Setting a region threshold, judging an abnormal mode if the probability P (X|model parameters) of the behavior feature sequence X under the given model parameters is larger than or equal to the region threshold, and judging a normal mode if the probability P (X|model parameters) of the behavior feature sequence X under the given model parameters is smaller than the region threshold; the setting of the regional threshold ensures the balance of detection accuracy and recall rate in practical application;

Collecting samples of the training set judged to be in a normal mode, and forming a normal behavior interval; collecting samples of the training set judged to be in an abnormal mode, and forming an abnormal behavior interval;

It should be noted that the personnel behavior analysis engine is an LSTM model whose hidden-layer state can be used to construct the hidden state of the hidden Markov model; the hidden state in the LSTM model reflects the combination of the current input and past information, has temporal correlation, and can well represent the temporal dynamics of personnel behavior; the hidden state of the LSTM model is therefore taken as the hidden state of the hidden Markov model;

The advantages of the LSTM in capturing time-series features and of the hidden Markov model in judging time-series anomalies are thus combined; the LSTM hidden state carries rich dynamic information and provides information support for the hidden Markov model, and the hidden Markov model outputs a confidence result to realize the sequence judgment;

further, the method for determining whether the access behavior of the personnel to access the data asset is abnormal by combining the access control matrix and the personnel behavior quadruple comprises the following steps:

When person p accesses data asset q, determining whether a_pq is equal to 1; if a_pq = 0, the access behavior of person p is judged abnormal, that is, person p has no access right to data asset q but attempts to access it, which is an illegal operation;

If a_pq = 1, acquiring the personnel behavior quadruple corresponding to person p;

judging whether a person behavior quadruple corresponding to the person p belongs to an abnormal behavior interval or not; if the access behavior belongs to the abnormal behavior interval, judging that the access behavior of the person p is abnormal;

if the access behavior of the person p does not belong to the abnormal behavior interval, judging that the access behavior of the person p is not abnormal;
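The two-stage decision, permission check first and behavior interval check second, can be sketched as follows. The access control matrix is assumed to be a nested list of 0/1 values, and `is_in_abnormal_interval` is a hypothetical callback standing in for the hidden Markov model's classification of the quadruple; the `Verdict` names are likewise illustrative.

```python
from enum import Enum


class Verdict(Enum):
    NO_PERMISSION = "abnormal: no access right"
    ABNORMAL_BEHAVIOR = "abnormal: behavior in abnormal interval"
    NORMAL = "not abnormal"


def judge_access(matrix, p, q, quadruple, is_in_abnormal_interval):
    """Judge the access of person (role) p to data asset q."""
    # Stage 1: a_pq = 0 means p has no right to q, so the attempt is illegal.
    if matrix[p][q] == 0:
        return Verdict.NO_PERMISSION
    # Stage 2: a_pq = 1, so the behavior quadruple decides the outcome.
    if is_in_abnormal_interval(quadruple):
        return Verdict.ABNORMAL_BEHAVIOR
    return Verdict.NORMAL
```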

Further, the method for sending the real-time early warning and starting the response mechanism includes:

Constructing an access audit database and predefining an access event data model; the access event data model includes the dimensions to be recorded, for example event time, operation and source identification; when the access behavior is abnormal, the event record is synchronously written into the access audit database in a batch processing mode;
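A minimal sketch of such an audit store, using SQLite from the Python standard library, is given below. The table name `access_audit`, the column names and the function names are assumptions chosen to mirror the three example dimensions (event time, operation, source identification); the patent does not prescribe a database engine or schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS access_audit (
    event_time TEXT NOT NULL,
    operation  TEXT NOT NULL,
    source_id  TEXT NOT NULL
)
"""


def open_audit_db(path=":memory:"):
    """Open (or create) the audit database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def write_events(conn, events):
    """Write a batch of (event_time, operation, source_id) records in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO access_audit (event_time, operation, source_id) "
            "VALUES (?, ?, ?)",
            events,
        )


def count_events(conn):
    return conn.execute("SELECT COUNT(*) FROM access_audit").fetchone()[0]
```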

Acquiring the IP address of the person and finding the switch network port to which it is connected; calling the API of the network gatekeeper device and issuing an access control policy; the content of the access control policy includes rules that prohibit network access for that IP; the network gatekeeper device forcibly disconnects the network connection to stop subsequent access;

By constructing an access control matrix, modeling multi-source heterogeneous video and spatial trajectory data, and training a hidden Markov model, the method and system of the invention can comprehensively and accurately characterize the behavior patterns of the various personnel in the park, including features such as the active time periods in their working areas, the types of interactive operation and the trajectories of spatial position changes. On this basis, every operation by which personnel in the park access data assets can be monitored in real time; when an abnormal access event is captured, the risk level of the event can be judged quickly and early warning information of a corresponding severity can be sent. Meanwhile, the system triggers the related security emergency response measures, such as disconnecting the network connection and remotely locking accounts, so as to stop the subsequent effects of data leakage or damage in time and minimize the harm caused by a security event; by comprehensively using these technical means, the capability to monitor and protect the security of the data assets in the park is improved, and the overall efficiency of data security management is enhanced.

Example 2

Referring to fig. 2, for content not described in detail in this embodiment, see the description of Example 1; an intelligent park data security management system is provided, including: the matrix construction module, used for constructing an access control matrix;

The early warning response module is used for combining the access control matrix and the personnel behavior quadruple to judge whether the access behavior of personnel to access the data asset is abnormal or not; if the access behavior is abnormal, sending real-time early warning to the access behavior and starting a response mechanism; all the modules are connected in a wired and/or wireless mode, so that data transmission among the modules is realized.

Example 3

Referring to fig. 3, an electronic device is also provided according to another aspect of the present application. The electronic device may include one or more processors and one or more memories. The memory stores computer readable code which, when executed by the one or more processors, performs the intelligent park data security management method described above.

The method or system according to embodiments of the application may also be implemented by means of the architecture of the electronic device shown in fig. 3. As shown in fig. 3, the electronic device may include an input device, one or more operators, one or more memories, one or more controllers, an output device, and the like. A memory in the electronic device, such as ROM 503 or hard disk 507, may store the intelligent park data security management method provided by the present application. Of course, the architecture shown in fig. 3 is merely exemplary, and one or more components of the electronic device shown in fig. 3 may be omitted as needed when implementing different devices.

Example 4

Referring to FIG. 4, a computer readable storage medium 600 according to one embodiment of the application is shown. Computer readable storage medium 600 has stored thereon computer readable instructions. When the computer readable instructions are executed by a processor, the intelligent park data security management method according to the embodiments of the present application described with reference to the above drawings may be performed. Storage medium 600 includes, but is not limited to, volatile memory and/or nonvolatile memory. Volatile memory can include, for example, Random Access Memory (RAM), cache memory (cache), and the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, and the like.

In addition, according to embodiments of the present application, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, the present application provides a non-transitory machine-readable storage medium storing machine-readable instructions executable by a processor to perform the steps of the intelligent park data security management method provided by the present application. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU).

Claims

1. A method for securely managing data in an intelligent campus, comprising: S1, constructing an access control matrix;

The construction mode of the access control matrix comprises the following steps:

Constructing an access control matrix Eo×p, wherein o is the number of personnel roles in the park; p is the number of data asset classes; according to the working requirements and the access strategies, configuring access rights of each personnel role to each data asset class in a matrix, wherein the access rights are matrix elements aij; aij is the access right of the person role i to the data asset class j, the value is 0 or 1,0 indicates no access right, and 1 indicates access right;

monitoring the access control matrix in real time, and updating the access control matrix when personnel roles, data assets or access strategies are changed;

Defining a person ID for each person; correlating the behavior pattern characteristics of the personnel with the space track information to form a personnel behavior quadruple; the personnel behavior quadruple comprises personnel ID, behavior pattern characteristics, space track information and time period; the time period is represented by the in-out time in the space trajectory information;

The data preparation phase includes:

Calculating the probability of each operation category among all operations, $P(c_k) = n(c_k) / \sum_{j} n(c_j)$, where the denominator is the total number of operations in the sequence;

Calculating the transition probabilities $P_{ij} = M_{ij} / \sum_{k} M_{ik}$, i.e., the probability of ci transferring to cj is the ratio of the number of transitions from ci to cj to the number of transitions from ci to all categories; $\sum_{k} M_{ik}$ is the number of transitions from ci to all categories; $M_{ij}$ is the number of transitions from ci to cj; all Pij values are formed into a matrix, namely the Markov transition matrix among the operation categories, i.e., the operation transition matrix;

2. The method for securely managing data in an intelligent campus according to claim 1, wherein the means for obtaining behavior pattern characteristics of each person comprises:

v video monitoring devices are deployed in a critical area in a park; when a person role i enters the coverage range of video monitoring equipment Cj, generating image sequence data V { i, j }; v { i, j } represents the image sequence of person i under video surveillance device Cj;

Summarizing the data of the personnel role i in the video monitoring equipment to obtain image sequence data V { i } of the ith personnel role; extracting a video frame image set V_ { i } from the image sequence data V { i } according to the frame rate parameter f; detecting and tracking personnel roles in each frame image of the video frame image set V_ { i } with a computer-vision deep learning algorithm to obtain a personnel area image sequence U_ { i };

3. The method for securely managing data in an intelligent campus according to claim 2, wherein the means for obtaining the spatial trajectory information of each person comprises:

4. The method of claim 3, wherein the engine training phase comprises:

the error back propagation algorithm comprises the following modes:

Mean square error function

5. The method for securely managing data in an intelligent campus of claim 4, wherein the training mode of the hidden markov model comprises:

6. The method for managing data security in an intelligent park according to claim 5, wherein the method for determining whether the access behavior of the person to the data asset is abnormal by combining the access control matrix and the person behavior quadruple comprises:

7. The method for intelligent campus data security management according to claim 6, wherein the means for sending the real-time early warning and initiating the response mechanism comprises:

8. A smart campus data security management system implemented based on the smart campus data security management method of any one of claims 1 to 7, comprising: the matrix construction module is used for constructing an access control matrix;