Disclosure of Invention
The embodiments of the application provide a traceable data layering and grading method and device based on a genetic bionic model algorithm, aiming to improve the accuracy of data layering and grading.
In one aspect, the application provides a traceable data layering and grading method based on a genetic bionic model algorithm, which comprises the following steps:
acquiring target data to be managed;
determining data layering and grading information of each piece of target data;
taking the data layering and grading information of each piece of target data as an initial individual in a genetic algorithm;
optimizing the plurality of initial individuals by using the genetic algorithm to obtain optimized data layering and grading information;
and storing the corresponding target data in a layered and graded manner according to the optimized data layering and grading information.
In some embodiments, the data layering and grading information includes data layering information and data grading information, and the determining of the data layering and grading information of each piece of target data includes:
acquiring a data source and a data type of each piece of target data, wherein the data source comprises at least one of enterprise data, personal data and public data, and the data type comprises a first-layer subclass, a second-layer subclass, a third-layer subclass, a fourth-layer subclass and a fifth-layer subclass;
determining the data layering information of each piece of target data based on the data source and the data type according to a preset layering strategy;
and determining the data grading information of each piece of target data.
In some embodiments, the determining of the data grading information of each piece of target data includes:
acquiring the data value and the data influence degree of each piece of target data, wherein the data value comprises at least one of public interests, industry field interests, personal interests and organization interests;
and determining the data grading information of each piece of target data based on the data value and the data influence degree according to a preset grading strategy.
In some embodiments, the optimizing of the plurality of initial individuals by using the genetic algorithm to obtain optimized data layering and grading information includes:
for each initial individual, acquiring a storage performance index of the corresponding target data when the target data is stored according to the corresponding data layering and grading information, wherein the storage performance index comprises at least one of query speed, storage cost and data redundancy;
determining a data layering and grading score of the corresponding initial individual based on the storage performance index;
and optimizing the plurality of initial individuals according to the data layering and grading scores by using the genetic algorithm to obtain the optimized data layering and grading information.
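As a minimal sketch of how a data layering and grading score could be derived from the storage performance index, the following Python example combines query speed, storage cost and data redundancy into one value. The weights, the normalization bounds and the names StorageMetrics and score are assumptions made for illustration; the application does not specify a scoring formula.

```python
from dataclasses import dataclass


@dataclass
class StorageMetrics:
    query_ms: float      # average query latency in milliseconds (lower is better)
    cost_per_gb: float   # storage cost per gigabyte (lower is better)
    redundancy: float    # fraction of duplicated data in [0, 1] (lower is better)


# Hypothetical weights and upper bounds; not taken from the application.
WEIGHTS = {"query": 0.5, "cost": 0.3, "redundancy": 0.2}
MAX_QUERY_MS = 1000.0
MAX_COST_PER_GB = 10.0


def _goodness(value: float, upper: float) -> float:
    """Map a lower-is-better metric to [0, 1], where 1 is best."""
    ratio = min(max(value / upper, 0.0), 1.0)
    return 1.0 - ratio


def score(m: StorageMetrics) -> float:
    """Return a score in [0, 1]; higher means a better layering and grading choice."""
    return (
        WEIGHTS["query"] * _goodness(m.query_ms, MAX_QUERY_MS)
        + WEIGHTS["cost"] * _goodness(m.cost_per_gb, MAX_COST_PER_GB)
        + WEIGHTS["redundancy"] * _goodness(m.redundancy, 1.0)
    )


fast = StorageMetrics(query_ms=50.0, cost_per_gb=1.0, redundancy=0.05)
slow = StorageMetrics(query_ms=800.0, cost_per_gb=8.0, redundancy=0.60)
print(score(fast) > score(slow))
```

A weighted sum is only one possible choice; any monotone combination of the three indicators could serve as the score used by the genetic algorithm.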
In some embodiments, the optimizing of the plurality of initial individuals according to the data layering and grading scores by using the genetic algorithm to obtain the optimized data layering and grading information includes:
selecting from the plurality of initial individuals according to the data layering and grading scores to obtain a plurality of first individuals;
performing a crossover operation on the plurality of first individuals to obtain a plurality of second individuals;
performing a mutation operation on the plurality of second individuals to obtain a plurality of third individuals;
and determining the optimized data layering and grading information based on the plurality of third individuals.
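The selection step and the final choice among the third individuals can be sketched as follows. This example assumes fitness-proportional (roulette-wheel) selection and assumes that an individual is a (layer, grade) pair; the application states only that selection is made according to the scores, so both are illustrative assumptions, as are the names select and best.

```python
import random
from typing import Callable, List, Sequence, Tuple

Individual = Tuple[int, int]  # assumed encoding: (layer index, grade index)


def select(
    population: Sequence[Individual],
    scores: Sequence[float],
    k: int,
    rng: random.Random,
) -> List[Individual]:
    """Draw k first individuals with probability proportional to their scores."""
    if sum(scores) <= 0:
        # Degenerate case: no individual has a positive score, so pick uniformly.
        return [rng.choice(population) for _ in range(k)]
    return rng.choices(population, weights=scores, k=k)


def best(
    population: Sequence[Individual],
    score_fn: Callable[[Individual], float],
) -> Individual:
    """Pick the highest-scoring individual, e.g. among the third individuals."""
    return max(population, key=score_fn)


rng = random.Random(0)
initial = [(1, 1), (2, 3), (5, 6)]
first = select(initial, scores=[0.0, 0.0, 1.0], k=10, rng=rng)
print(first)
```

Selection with replacement lets high-scoring individuals appear several times among the first individuals, which is the usual way a genetic algorithm biases the next generation toward better candidates.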
In some embodiments, the data layering and grading information includes data layering information and data grading information, and the performing of the crossover operation on the plurality of first individuals to obtain a plurality of second individuals includes:
selecting a plurality of first individuals according to a preset crossover probability;
exchanging the data layering information and/or the data grading information between different first individuals among the selected first individuals to obtain a plurality of second individuals;
the performing of the mutation operation on the plurality of second individuals to obtain a plurality of third individuals includes:
selecting a plurality of second individuals according to a preset mutation probability;
and adjusting the data layering information and/or the data grading information of each selected second individual to obtain a plurality of third individuals.
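The crossover and mutation operations described above can be sketched as follows. The example assumes that an individual is a (layer, grade) pair with five layers and six grades, that crossover swaps only the grade component between paired individuals, and that mutation redraws one component at random; these choices and the function names are illustrative assumptions, not the method fixed by the application.

```python
import random
from typing import List, Tuple

Individual = Tuple[int, int]   # assumed encoding: (layer, grade)
LAYERS = list(range(1, 6))     # five layers
GRADES = list(range(1, 7))     # six grades


def crossover(first: List[Individual], p_cross: float, rng: random.Random) -> List[Individual]:
    """Pick individuals with probability p_cross, pair them, and swap their grades."""
    second = list(first)
    chosen = [i for i in range(len(second)) if rng.random() < p_cross]
    rng.shuffle(chosen)
    # Pair the chosen indices two by two; an unpaired leftover is kept unchanged.
    for a, b in zip(chosen[::2], chosen[1::2]):
        (layer_a, grade_a), (layer_b, grade_b) = second[a], second[b]
        second[a], second[b] = (layer_a, grade_b), (layer_b, grade_a)
    return second


def mutate(second: List[Individual], p_mut: float, rng: random.Random) -> List[Individual]:
    """With probability p_mut, redraw either the layer or the grade of an individual."""
    third = []
    for layer, grade in second:
        if rng.random() < p_mut:
            if rng.random() < 0.5:
                layer = rng.choice(LAYERS)
            else:
                grade = rng.choice(GRADES)
        third.append((layer, grade))
    return third


rng = random.Random(1)
first = [(1, 1), (2, 2), (3, 3), (4, 4)]
second = crossover(first, p_cross=1.0, rng=rng)
third = mutate(second, p_mut=0.1, rng=rng)
print(second, third)
```

Keeping the crossover probability high and the mutation probability low is a common starting point, but the preset probabilities are tuning parameters that the application leaves open.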
In some embodiments, after the corresponding target data is stored in a layered and graded manner according to the optimized data layering and grading information, the method further includes:
for each piece of target data, acquiring associated data having a lineage relationship with the target data;
taking the optimized data layering and grading information of the target data as the data layering and grading information of the associated data;
and storing the associated data according to the data layering and grading information of the associated data.
In some embodiments, after the corresponding target data is stored in a layered and graded manner according to the optimized data layering and grading information, the method further includes:
for each piece of target data, determining a security access policy of the target data based on the optimized data layering and grading information of the target data;
and restricting access requests to the target data according to the security access policy.
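As a minimal sketch of deriving a security access policy from the grade and using it to restrict access requests, the following example assumes six grades (6 being the most sensitive), a numeric user clearance, and an audit requirement for grade 4 and above. The mapping, the audit threshold and the names AccessPolicy, policy_for and is_allowed are hypothetical; the application does not define the policy contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    min_clearance: int   # lowest user clearance allowed to read the data
    require_audit: bool  # whether each access must be logged for audit


def policy_for(grade: int) -> AccessPolicy:
    """Map a data grade (1 = lowest, 6 = highest) to an access policy (assumed rule)."""
    if not 1 <= grade <= 6:
        raise ValueError(f"grade must be in 1..6, got {grade}")
    return AccessPolicy(min_clearance=grade, require_audit=grade >= 4)


def is_allowed(user_clearance: int, grade: int) -> bool:
    """Allow an access request only if the user's clearance reaches the data grade."""
    return user_clearance >= policy_for(grade).min_clearance


print(is_allowed(user_clearance=3, grade=5))  # request is refused
print(is_allowed(user_clearance=5, grade=5))  # request is granted
```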
In another aspect, the application also provides a traceable data layering and grading device based on the genetic bionic model algorithm, which comprises:
a first acquisition module, used for acquiring target data to be managed;
a first determining module, used for determining data layering and grading information of each piece of target data;
a second determining module, used for taking the data layering and grading information of each piece of target data as an initial individual in a genetic algorithm;
a genetic algorithm module, used for optimizing the plurality of initial individuals by using the genetic algorithm to obtain optimized data layering and grading information;
and a data storage module, used for storing the corresponding target data in a layered and graded manner according to the optimized data layering and grading information.
In another aspect, the application also provides a computer device, including:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to perform any of the above traceable data layering and grading methods based on the genetic bionic model algorithm.
In another aspect, the application further provides a computer-readable storage medium on which a computer program is stored, where the computer program is loaded by a processor to perform any of the above traceable data layering and grading methods based on the genetic bionic model algorithm.
In another aspect, the application also provides a computer program product, including a computer program or instructions which, when executed by a processor, implement the traceable data layering and grading method based on the genetic bionic model algorithm as described in any one of the above.
The traceable data layering and grading method and device based on the genetic bionic model algorithm provided by the embodiments of the application draw on the principle of genetic evolution in nature to optimize data management and processing, combining a genetic algorithm with the idea of layered and graded data management. By simulating genetic mechanisms of biological evolution such as selection, crossover and mutation, the method optimizes the data classification and grading process and thereby improves the efficiency and security of data management. It can be applied to data classification, storage, retrieval and other tasks in a big data environment to improve the efficiency and accuracy of data processing. In addition, traceable basic model data with distinct tag attributes can be formed, which, combined with layered and graded data management, is of great benefit to the regional power grid.
Detailed Description
The technical solutions in the embodiments of the application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.
In the description of the application, the terms "first", "second" and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the application, "a plurality" means two or more, unless explicitly defined otherwise.
Any embodiment described as "exemplary" in this disclosure is not necessarily to be construed as preferred or advantageous over other embodiments. The present application is not intended to be limited to the embodiments shown but is to be accorded the widest scope consistent with the principles and features disclosed herein.
It should be noted that, because the methods of the embodiments of the application are executed in a computer device, the objects processed by the computer device all exist in the form of data or information. For example, time is essentially time information. It can be understood that sizes, numbers, positions and the like mentioned in the subsequent embodiments are likewise corresponding data that the computer device can process, which will not be described in detail herein.
First, definitions of terms that may be involved in the embodiments of the present application will be described, specifically as follows:
(1) Genetic algorithm (GA): an optimization algorithm that simulates the process of biological evolution. Drawing on Darwin's theory of evolution and Mendel's theory of genetics, it searches for an optimal solution through selection, crossover (or recombination) and mutation. In data science, genetic algorithms are applied to tasks such as feature selection, cluster analysis and predictive modeling, which can be regarded as a manifestation of "data genetics".
(2) Data bionics: bionics is the design and manufacture of engineering systems or products that imitate the principles and behavior of biological systems. In the field of data science, data bionics uses the data processing and storage mechanisms of biological systems to improve and optimize data technology, including drawing inspiration from organisms to design more efficient and intelligent data processing algorithms and storage schemes.
(3) Layering and grading: an important method in data analysis. By classifying and sorting data, it helps people better understand the distribution and characteristics of the data so that more accurate decisions can be made.
(4) Power grid dispatching data: the various data used for dispatching and managing the power grid, including but not limited to real-time power production information, dispatching instructions and control information. These data are transmitted through the power dispatching data network to ensure safe, stable and economical operation of the power grid.
(5) Data lifecycle: the whole process of data from generation to extinction, covering stages such as creation, collection, storage, processing, analysis, archiving and deletion.
(6) Public data: data disclosed according to relevant requirements, and data included in the national public data category (except data involving confidential matters or personal privacy). In principle, a company's public data does not involve original detail data and mainly covers indexes, reports and similar data reported according to relevant supervision requirements.
(7) Personal data: personal information data, recorded electronically or otherwise, relating to an identified or identifiable natural person, excluding anonymized information.
(8) Enterprise data: raw data and derived data collected and generated by a company in activities such as power production, enterprise management and customer service. Enterprise data is classified, sorted and graded by business line and does not overlap with personal data or public data.
(9) Data set: a data form in which data records are aggregated.
(10) Wide table data: a database table in which the indexes, dimensions and attributes related to a business theme are associated together.
(11) Personal sensitive information: personal information data that, once leaked or illegally used, can easily lead to infringement of a natural person's dignity or harm to personal or property safety, including biometric, specific identity, medical and health, financial account and whereabouts information, as well as the personal data of minors under fourteen years old.
(12) Biometric feature information: personal data obtained by technical processing of the physical, biological or behavioral features of a natural person, which can identify that person alone or in combination with other information. It includes facial recognition features, irises, fingerprints, genes, voiceprints, gait, palm prints, auricles, eye prints and the like.
(13) Personal basic information: basic information about a natural person, such as name, birthday, age, sex, nationality, native place, political affiliation, marital status, family relationships, address, personal telephone number, email address and hobbies.
(14) Important power users: customers that play an important role in the society, economy or daily life of a country or region (city), for whom an interruption of power supply may cause personal injury, considerable environmental pollution, considerable economic loss or serious disruption of public order, or who have special requirements for power supply reliability. Recognition as an important power customer is generally applied for by the customer and approved by power authorities of the relevant grade or above.
(15) Raw data: data that has not been processed.
(16) Derived data: data generated through processing activities such as statistics, association, mining, aggregation and de-identification.
In the related art, data layering and grading may employ the following schemes:
Scheme 1: data is layered, maintained and graded through manual configuration, custom settings and similar means, relying on working experience. For data with relatively uniform characteristics and similar attributes, manual layering and grading has a certain practical value. However, for large batches of data with complex characteristics, many attributes and little in common, layering and grading by manual configuration or simple noun matching produces many errors; the data ends up coarsely classified and unclearly graded, which hinders fast data retrieval and fails to meet data security requirements.
Scheme 2: automatic layering and grading of data is realized with a simple, well-defined tag system or knowledge graph technology. The technique relies on mature data tags or a mature knowledge graph and layers and grades data through automated judgment, which removes the risk of human intervention, reduces the cost of manual configuration and improves the accuracy of data layering and grading. However, data content involving unknown data tags or an incomplete knowledge graph cannot be identified and judged well; such data can only be layered and graded after the relevant data tags and knowledge graph have been completed, so the scheme lacks self-updating, self-judging layered and graded data management.
For Scheme 1, manual layered and graded data management relies on manual operation, manual configuration and individual data management experience, and has four main defects. First, low efficiency: as the data volume keeps growing, the workload of manual classification and grading grows with it, and in a big data context it consumes a large amount of time and labor. Second, frequent errors: manual classification and grading can be subjective and inconsistent, so the results may be inaccurate or biased, and human factors such as fatigue and negligence can also cause errors. Third, difficulty in sustaining the work: data is continuously generated and changed, while manual classification and grading can usually only handle the current data set and must be redone whenever new data is produced. Fourth, high cost: enterprises must invest substantial manpower and resources, including recruiting and training specialized data classification personnel and purchasing and maintaining related tools and equipment, which raises operating costs.
For Scheme 2, automatic layering and grading with a simple, well-defined tag system or knowledge graph technology has the following defects. First, dependence on the built-in tag system and knowledge graph algorithm: if the built-in algorithm is poorly designed or its rules are inaccurate, the classification result may be biased or wrong, so the system must fully account at design time for the diversity and complexity of the data and the specific needs of different industries and scenarios. Second, poor adaptability to new types of data: as data is continuously generated and changed, new data types and data patterns keep emerging, and an automatic classification system faced with them has to readjust its algorithm and rules or even retrain its model, which increases system complexity and maintenance cost. Third, insufficient security and privacy protection: the system may come into contact with sensitive data or private information during classification, and if its security and privacy protection measures are not in place, data may be leaked or abused.
Compared with Scheme 1, the traceable data layering and grading method and device based on the genetic bionic model algorithm have the following advantages. First, higher efficiency: relying on data genetic and data bionic algorithms, artificial intelligence techniques identify the features of the data and layer and grade it automatically, which greatly improves data management efficiency. Second, higher accuracy: based on big data analysis, the features, attributes and internal and external association relations of the data are extracted so that automatic layering and grading is achieved better and faster, and comparison against massive historical data improves the accuracy of layering and grading. Third, better sustainability: artificial intelligence analysis based on the data genetic and bionic algorithms can learn and search continuously, so that as the data volume expands, more data attributes and characteristics are mastered, which benefits layered and graded management. Fourth, lower cost: layering and grading is carried out by machines and related data algorithms with little or no manual intervention, which greatly reduces labor cost, improves the efficiency of layered and graded management and reduces operating cost.
Compared with Scheme 2, the traceable data layering and grading method and device based on the genetic bionic model algorithm have the following advantages. First, no dependence on a built-in tag system or knowledge graph algorithm: the data genetic and data bionic algorithms automatically analyze and identify the features, attributes and association relations of the data and adapt well to its diversity and complexity. Second, strong adaptability to new types of data: as data is continuously generated and changed, the data genetic and data bionic algorithms automatically carry out inheritance, crossover and mutation of the data and obtain its lineage relationships, realizing layered and graded management. Third, high data security and privacy: because the data genetic algorithm is used and the layering and grading attributes of data are inherited along the data lineage, the security and privacy of the data can be ensured.
In summary, the traceable data layering and grading method and device based on the genetic bionic model algorithm provided by the embodiments of the application automatically identify the features of power grid dispatching data by using data genetic and data bionic algorithms, and optimize the data classification and grading process by simulating genetic mechanisms of biological evolution such as selection, crossover and mutation, thereby improving the efficiency and security of data management. The main points are as follows:
1. Power grid dispatching data is acquired, and the relevance among the various data is analyzed according to their attributes. A data genetic algorithm performs feature selection and cluster analysis through selection, recombination and mutation operations and, in combination with the layering and grading management requirements, automatically optimizes the layering and grading process of the power grid dispatching data. Using the layering and grading results, data bionic technology is applied to realize full lifecycle management of the data, improving data management efficiency and data security protection.
2. Traceable basic model data is constructed. A data genetic bionic algorithm defines the features of the acquired regional power grid dispatching data, and data tags are formulated to form identifiable, traceable basic model data. This data has distinct feature attributes that cannot be changed, and all data derived from the basic data automatically inherits its relevant feature attributes, so that layered and graded management and traceability tracking of the data are achieved quickly.
3. According to factors such as data value, data influence degree and frequency of use, data is divided into different layers and grades, and different management and control measures are adopted for data of different layers and grades. This helps ensure the confidentiality, integrity and availability of the data and prevents data leakage and attacks.
4. Layered and graded data management realized with the data genetic algorithm and bionic algorithm is strongly adaptive, highly robust and well optimized. In terms of adaptivity, the classification and grading rules can be adjusted adaptively for different data sets and business requirements. In terms of robustness, problems such as data noise and outliers can be handled effectively, improving the stability and reliability of data management. In terms of optimization, simulating the biological evolution process performs a global search that can find a better data classification and grading scheme.
Specifically, referring to FIG. 1 and FIG. 3, in an embodiment, the traceable data layering and grading method based on the genetic bionic model algorithm may include:
101. Acquire target data to be managed.
In the embodiments of the application, the target data to be managed is generally power grid dispatching data that needs to be stored in a layered and graded manner in a preset database. Because power grid dispatching data is complex and important, it requires layered and graded management.
In some embodiments of the application, after the target data to be managed is acquired, it may be preprocessed (for example cleaned, converted and standardized) to form traceable basic model data with distinct tag attributes and to ensure data quality.
102. Determine the data layering and grading information of each piece of target data.
In the embodiments of the application, layering refers to dividing the target data into different layers according to factors such as data value, data influence degree and data access frequency. Grading refers to further grading the data within each layer so that the target data can be managed at a finer granularity. Accordingly, the data layering and grading information may include data layering information and data grading information. The data is divided into different layers and grades, and different management and control measures are adopted for data of different layers and grades, which helps ensure the confidentiality, integrity and availability of the data and prevents data leakage and attacks.
In some embodiments of the application, the target data is initially classified according to its characteristics to obtain the corresponding data layering and grading information. For example, a data layering and grading strategy may be formulated based on business requirements and data characteristics, and the target data may then be initially classified according to that strategy. The strategy may take into account factors such as the data source, the data type, the data value and the data influence degree.
The following illustrates a data layering and grading strategy. The overall strategy can be summarized as "three dimensions, five layers, six grades". "Three dimensions" means that data layering is carried out across three data sources: enterprise data, personal data and public data. "Five layers" means that, when the target data is classified layer by layer, the data types form five layers of subclasses (first-layer, second-layer, third-layer, fourth-layer and fifth-layer subclasses). "Six grades" means that, based on the data value and the degree of impact after leakage or misuse, and according to the power grid company's catalog of confidential matters and its negative list for data sharing, the power grid dispatching data is divided into six grades from high to low (grade 6, grade 5, grade 4, grade 3, grade 2 and grade 1).
In the data layering strategy, data can be divided into five layers (professional field, business topic, business object, data entity and data object) according to relevant standards and operation guidelines and in combination with the actual business management of the power grid company. The first, second and third layers are classified by business characteristics into professional field, business topic and business object; the fourth and fifth layers are classified by data characteristics into data entity and data attribute. An example of the data layering strategy is shown in Table 1 below:
TABLE 1
It can be seen that determining the data layering and grading information of each piece of target data may include: acquiring a data source and a data type of each piece of target data, wherein the data source comprises at least one of enterprise data, personal data and public data, and the data type comprises a first-layer subclass, a second-layer subclass, a third-layer subclass, a fourth-layer subclass and a fifth-layer subclass; and determining the data layering information of each piece of target data based on the data source and the data type according to a preset layering strategy.
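One way to represent the data layering information described above is a record that holds the data source and the five-layer subclass path. The following sketch is illustrative only: the class names, the storage_key helper and the example path values are assumptions, since the contents of Table 1 are not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence, Tuple


class DataSource(Enum):
    ENTERPRISE = "enterprise"
    PERSONAL = "personal"
    PUBLIC = "public"


@dataclass(frozen=True)
class LayeringInfo:
    source: DataSource
    # First-layer to fifth-layer subclass, from business characteristics
    # (layers 1-3) down to data characteristics (layers 4-5).
    path: Tuple[str, ...]


def layer_of(source: DataSource, path: Sequence[str]) -> LayeringInfo:
    """Build layering information, requiring exactly five non-empty subclasses."""
    if len(path) != 5 or not all(path):
        raise ValueError("a layering path needs five non-empty subclasses")
    return LayeringInfo(source=source, path=tuple(path))


def storage_key(info: LayeringInfo) -> str:
    """Join source and path into a key usable for layered storage (assumed format)."""
    return "/".join((info.source.value,) + info.path)


# Hypothetical example values, not taken from Table 1.
info = layer_of(
    DataSource.ENTERPRISE,
    ["dispatching", "generation", "generating unit", "output record", "active power"],
)
print(storage_key(info))
```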
In the data grading strategy, following relevant regulations, and according to the importance of the data to economic and social development and the degree of impact on public interests, industry fields, or the rights and interests of individuals and organizations if the data is tampered with, destroyed, leaked, or illegally obtained or used, data is divided overall into three levels from high to low: core data, important data and general data. That is, the data grading information may include at least one of core data, important data and general data. An example of the data grading strategy is shown in Table 2 below:
TABLE 2
It can be seen that determining the data grading information of each piece of target data may include: acquiring a data value and a data influence degree of each piece of target data, wherein the data value comprises at least one of public interests, industry field interests, personal interests and organization interests, and the data influence degree comprises at least one of particularly serious hazards, general hazards and no hazards; and determining the data grading information of each piece of target data based on the data value and the data influence degree according to a preset grading strategy.
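A preset grading strategy of this kind can be pictured as a rule that maps the affected interests and the influence degree to core, important or general data. The rule below is a hypothetical stand-in, since the contents of Table 2 are not reproduced here; only the input categories and the three output levels come from the description.

```python
from enum import Enum
from typing import Set


class Interest(Enum):
    PUBLIC = "public"
    INDUSTRY = "industry"
    PERSONAL = "personal"
    ORGANIZATION = "organization"


class Impact(Enum):
    NONE = 0
    GENERAL = 1
    ESPECIALLY_SERIOUS = 2


class Level(Enum):
    GENERAL = 1
    IMPORTANT = 2
    CORE = 3


def grade(interests: Set[Interest], impact: Impact) -> Level:
    """Assumed rule: wide-reaching interests plus severe impact give a higher level."""
    wide = bool(interests & {Interest.PUBLIC, Interest.INDUSTRY})
    if impact is Impact.ESPECIALLY_SERIOUS and wide:
        return Level.CORE
    if impact is Impact.ESPECIALLY_SERIOUS or (impact is Impact.GENERAL and wide):
        return Level.IMPORTANT
    return Level.GENERAL


print(grade({Interest.PUBLIC}, Impact.ESPECIALLY_SERIOUS))
print(grade({Interest.PERSONAL}, Impact.GENERAL))
```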
103. Take the data layering and grading information of each piece of target data as an initial individual in the genetic algorithm.
In the embodiments of the application, the data layering and grading information of the target data serves as the initial population of the genetic algorithm: the data layering and grading information of each piece of target data is encoded as an initial individual for subsequent processing, and each initial individual remains associated with its target data. The encoding can be chosen based on actual requirements; for example, a binary encoding rule may be used. Automatically generating genetic codes forms traceable basic model data and facilitates full lifecycle management of the data.
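As one possible binary encoding, the following sketch packs a layer index (1 to 5) and a grade index (1 to 6) into a 6-bit string, three bits each. The bit layout and the function names encode and decode are assumptions chosen for illustration; the application says only that a binary encoding rule may be used.

```python
from typing import Tuple


def encode(layer: int, grade: int) -> str:
    """Encode (layer, grade) as a 6-character bit string: 3 bits layer, 3 bits grade."""
    if not 1 <= layer <= 5:
        raise ValueError(f"layer must be in 1..5, got {layer}")
    if not 1 <= grade <= 6:
        raise ValueError(f"grade must be in 1..6, got {grade}")
    return format(layer, "03b") + format(grade, "03b")


def decode(bits: str) -> Tuple[int, int]:
    """Decode a 6-character bit string, rejecting codes outside the valid ranges."""
    if len(bits) != 6 or any(b not in "01" for b in bits):
        raise ValueError(f"expected 6 binary digits, got {bits!r}")
    layer, grade = int(bits[:3], 2), int(bits[3:], 2)
    if not (1 <= layer <= 5 and 1 <= grade <= 6):
        raise ValueError(f"code {bits!r} does not map to a valid layer and grade")
    return layer, grade


code = encode(3, 5)
print(code, decode(code))
```

With three bits per component, some bit patterns (for example a layer of 0 or a grade of 7) are invalid, so bit-level crossover or mutation would need to repair or discard such individuals; decode raises ValueError to make that case explicit.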
104. And (3) optimizing a plurality of initial individuals by utilizing a genetic algorithm to obtain optimized data layering and grading information.
In the embodiment of the application, the genetic algorithm is a search heuristic algorithm, is an optimization algorithm for simulating the natural biological evolution process, and searches the optimal solution in the solution space through operations such as selection, crossing, mutation and the like. In data management, genetic algorithms may be used to optimize data hierarchy information to enable target data to be partitioned and managed according to more rational data hierarchy information.
105. Store the corresponding target data in a layered and graded manner according to the optimized data layering and grading information.
In the embodiment of the application, storing the corresponding target data in a layered and graded manner according to the optimized data layering and grading information means applying the optimized information to actual data management work, thereby realizing layered and graded management and control of the data.
In some embodiments of the present application, the efficiency of layered and graded management of related data may be further improved based on a data bionic algorithm. Specifically, after the corresponding target data is stored according to the optimized data layering and grading information, the method may further include: for each target data, acquiring associated data that has a data lineage relationship with the target data; taking the optimized data layering and grading information of the target data as the data layering and grading information of the associated data; and storing the associated data according to that information. Taking power grid power generation data as an example of target data, the optimized data layering and grading information of the power generation data can be obtained through the genetic algorithm. Associated data having a lineage relationship with the power generation data can then be obtained; for example, associated data such as the power consumption of the power grid and the power generation of power plants can be derived from attribute data in the power generation data such as power, electric quantity, generation time, load rate, peak-valley difference, maximum value, minimum value, year-over-year ratio, and period-over-period ratio. The optimized data layering and grading information of the power generation data is then used as the data layering and grading information of the associated data, which improves the efficiency and accuracy of layering and grading the associated data. In addition, when data is searched, a data bionic technique (for example, an ant colony algorithm, which simulates the information transmission mechanism of ants during foraging) can be adopted to optimize the data search path and thereby improve data search efficiency.
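The lineage-based propagation described above can be sketched as follows. The dictionary layout, the identifiers, and the function name `propagate_to_lineage` are hypothetical; how lineage is actually recorded is not specified in the application.

```python
def propagate_to_lineage(optimized, lineage):
    """Assign each target's optimized (layer, grade) to its associated data.

    optimized: dict mapping target data id -> (layer, grade)
    lineage:   dict mapping target data id -> list of associated data ids
    Returns a new dict covering both target data and associated data.
    """
    result = dict(optimized)
    for target_id, associated_ids in lineage.items():
        if target_id not in optimized:
            continue  # no optimized information to inherit
        for assoc_id in associated_ids:
            # Assumption: data that already has its own optimized
            # information keeps it and is not overwritten.
            result.setdefault(assoc_id, optimized[target_id])
    return result


optimized = {"grid_generation": (2, "important")}
lineage = {"grid_generation": ["grid_consumption", "plant_output"]}
print(propagate_to_lineage(optimized, lineage))
```

Because the associated data simply inherits the information of its source, no separate genetic optimization run is needed for it, which is where the efficiency gain described above would come from.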
In some embodiments of the present application, different management and control measures may also be taken for data of different layers and levels to ensure the confidentiality, integrity, and availability of the data and to prevent data leakage and attacks. Specifically, after the corresponding target data is stored according to the optimized data layering and grading information, the method may further include: determining a security access policy for the target data according to the optimized data layering and grading information, and restricting access requests to the target data according to the security access policy. For example, the access to and use of the target data by different users can be restricted by configuring the security access policy.
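One possible shape for such a security access policy is a minimum clearance per grade. The clearance values and the function name below are illustrative assumptions only; the application does not define a concrete policy.

```python
# Assumed policy: each grade requires a minimum user clearance level.
REQUIRED_CLEARANCE = {"general": 0, "important": 1, "core": 2}


def is_access_allowed(user_clearance: int, data_grade: str) -> bool:
    """Return True if the user's clearance meets the grade's requirement."""
    return user_clearance >= REQUIRED_CLEARANCE[data_grade]


# A user with clearance 1 may read important data but not core data.
print(is_access_allowed(1, "important"), is_access_allowed(1, "core"))
```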
It can be seen that the traceable data layering and grading method based on the genetic bionic model algorithm provided by the embodiment of the application applies the principle of genetic evolution in nature to optimize data management and processing, and is a data management approach that combines the genetic algorithm with the idea of layered and graded data management. By simulating genetic mechanisms in biological evolution such as selection, crossover, and mutation, the method aims to optimize the layering and grading of data, thereby improving the efficiency and security of data management. It can be applied to data classification, storage, retrieval, and other tasks in a big data environment to improve the efficiency and accuracy of data processing.
Referring to FIG. 2 and FIG. 3, on the basis of the embodiment shown in FIG. 1, optimizing the plurality of initial individuals by using the genetic algorithm to obtain the optimized data layering and grading information may include:
201. For each initial individual, acquire a storage performance index of the corresponding target data when that target data is stored according to the corresponding data layering and grading information.
In an embodiment of the application, the storage performance index includes at least one of query speed, storage cost, and data redundancy. The query speed refers to the time taken to successfully query the target data; the storage cost refers to the hardware cost, software cost, and the like incurred in storing the target data; and the data redundancy refers to unnecessary repeated storage of data in storage.
202. Based on the storage performance index, determine a data layering and grading score for the corresponding initial individual.
In the embodiment of the application, the data layering and grading score of an initial individual represents the classification accuracy and grading rationality of the data layering and grading information of the corresponding target data, and reflects how good or bad that initial individual is. The score may be calculated by a preset function; for example, the query speed, the storage cost, and the data redundancy may be weighted and summed to obtain the corresponding data layering and grading score.
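The weighted-sum scoring can be sketched as below. The weights, the assumption that each index has been normalized to the range 0 to 1 with lower values being better, and the "one minus penalty" form are illustrative choices, not values from the application.

```python
def layering_grading_score(query_time, storage_cost, redundancy,
                           weights=(0.5, 0.3, 0.2)):
    """Weighted-sum score; higher is better.

    All three inputs are assumed to be normalized to [0, 1], where a
    lower value means faster queries, cheaper storage, or less
    redundancy. The weights are hypothetical and sum to 1.
    """
    w_q, w_c, w_r = weights
    penalty = w_q * query_time + w_c * storage_cost + w_r * redundancy
    return 1.0 - penalty


# An individual with faster queries scores higher, all else being equal.
print(layering_grading_score(0.2, 0.5, 0.1) > layering_grading_score(0.6, 0.5, 0.1))
```

Subtracting the weighted penalty from 1 keeps the convention that a larger score marks a better individual, which is what the selection operation expects.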
203. Optimize the plurality of initial individuals according to the data layering and grading scores by using the genetic algorithm to obtain the optimized data layering and grading information.
In the embodiment of the application, optimizing the plurality of initial individuals according to the data layering and grading scores by using the genetic algorithm may include: performing a selection operation on the plurality of initial individuals according to the data layering and grading scores to obtain a plurality of first individuals, that is, selecting excellent individuals as parents for the subsequent crossover operation; performing a crossover operation on the plurality of first individuals to obtain a plurality of second individuals; performing a mutation operation on the plurality of second individuals to obtain a plurality of third individuals; and determining the optimized data layering and grading information based on the plurality of third individuals.
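The selection operation can be sketched with tournament selection, a common selection scheme chosen here purely for illustration; the application only states that excellent individuals are selected according to their scores. The function name and parameters are hypothetical.

```python
import random


def tournament_select(population, scores, n_parents, k=2, rng=None):
    """Pick n_parents individuals; each is the best of k random contenders."""
    if rng is None:
        rng = random.Random()
    parents = []
    for _ in range(n_parents):
        # Draw k distinct contenders and keep the one with the highest score.
        contenders = rng.sample(range(len(population)), k)
        winner = max(contenders, key=lambda i: scores[i])
        parents.append(population[winner])
    return parents


population = ["ind_a", "ind_b", "ind_c", "ind_d"]
scores = [0.1, 0.9, 0.4, 0.2]
first_individuals = tournament_select(population, scores, n_parents=4,
                                      rng=random.Random(0))
print(first_individuals)
```

A larger tournament size `k` increases selection pressure: with `k` equal to the population size, the best individual always wins.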
In a further embodiment, performing the crossover operation on the plurality of first individuals to obtain the plurality of second individuals may include: selecting first individuals from the plurality of first individuals according to a preset crossover probability, and exchanging data layering information and/or data grading information between different selected first individuals to obtain the plurality of second individuals. It can be seen that the crossover operation simulates biological gene crossover: partial genes of two parent individuals are exchanged to generate new offspring individuals.
In a further embodiment, performing the mutation operation on the plurality of second individuals to obtain the plurality of third individuals may include: selecting second individuals from the plurality of second individuals according to a preset mutation probability, and adjusting the data layering information and/or data grading information of each selected second individual to obtain the plurality of third individuals, thereby completing the mutation operation. It can be seen that the mutation operation randomly changes some genes of the offspring individuals with a certain probability, so as to increase the diversity of the population and avoid becoming trapped in a locally optimal solution.
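The crossover and mutation operations described above can be sketched as follows, assuming each individual is represented as a `(layer, grade)` pair. The layer range, the pairing of neighbors, and the 50/50 choice of which component to mutate are illustrative assumptions.

```python
import random

LAYERS = (1, 2, 3, 4, 5)                   # assumed layer indices
GRADES = ("general", "important", "core")  # grading levels named in the text


def crossover(parents, p_cross, rng):
    """Swap the grading component between neighboring pairs with probability p_cross."""
    children = list(parents)
    for i in range(0, len(children) - 1, 2):
        if rng.random() < p_cross:
            (l1, g1), (l2, g2) = children[i], children[i + 1]
            children[i], children[i + 1] = (l1, g2), (l2, g1)
    return children


def mutate(individuals, p_mut, rng):
    """With probability p_mut, re-draw either the layer or the grade at random."""
    mutated = []
    for layer, grade in individuals:
        if rng.random() < p_mut:
            if rng.random() < 0.5:
                layer = rng.choice(LAYERS)
            else:
                grade = rng.choice(GRADES)
        mutated.append((layer, grade))
    return mutated


rng = random.Random(42)
second = crossover([(1, "core"), (5, "general")], p_cross=1.0, rng=rng)
third = mutate(second, p_mut=0.1, rng=rng)
print(second, third)
```

This sketch exchanges only the grading component during crossover; exchanging the layering component, or both, would follow the same pattern.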
In a further embodiment, determining the optimized data layering and grading information based on the plurality of third individuals is an iterative optimization process: the selection, crossover, and mutation operations are performed repeatedly, the data layering and grading information is improved step by step over the iterations, and the optimized data layering and grading information is finally obtained.
During the iterative process, the parameters of the genetic algorithm (such as the population size required in the selection operation, the crossover probability required in the crossover operation, and the mutation probability required in the mutation operation) can be adjusted according to actual conditions to improve the performance of the genetic algorithm.
In a further embodiment, the number of iterations used when determining the optimized data layering and grading information based on the plurality of third individuals may be determined by the target storage locations of the corresponding target data. Owing to the confidentiality requirements of power grid dispatching data, dispatching data of higher importance usually needs to be stored in the power grid intranet, whereas dispatching data of relatively lower importance may be stored in the power grid extranet as actually required; however, the data management and maintenance costs of the intranet are usually higher than those of the extranet. Specifically, the target storage location of the target data associated with each of the plurality of third individuals is obtained, yielding a plurality of target storage locations. If, among the plurality of target storage locations, the number located in the power grid intranet is greater than or equal to the number located in the power grid extranet, iterative optimization is performed on the plurality of third individuals for a first preset number of iterations to determine the optimized data layering and grading information. This gives priority to the accuracy of the layering and grading information of dispatching data stored in the intranet, thereby reducing data management and maintenance costs in the power grid as far as possible.
If, among the plurality of target storage locations, the number located in the power grid intranet is smaller than the number located in the power grid extranet, iterative optimization is performed on the plurality of third individuals for a second preset number of iterations to determine the optimized data layering and grading information. The second preset number of iterations is smaller than the first preset number of iterations, which reduces the time consumed by the genetic algorithm while still keeping the layering and grading information of the dispatching data reasonably accurate, and prevents the cost of layering and grading from becoming too high.
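The choice between the two preset iteration counts can be sketched as follows. The location labels `"intranet"` and `"extranet"`, the default counts of 200 and 50, and the function name are hypothetical; the application only requires the second count to be smaller than the first.

```python
def choose_iterations(storage_locations, first_iterations=200,
                      second_iterations=50):
    """Pick the iteration count from where the target data will be stored."""
    if second_iterations >= first_iterations:
        raise ValueError("second_iterations must be smaller than first_iterations")
    intranet = sum(1 for loc in storage_locations if loc == "intranet")
    extranet = sum(1 for loc in storage_locations if loc == "extranet")
    # Intranet-heavy populations get the larger iteration budget,
    # favoring accuracy; otherwise the smaller budget saves time.
    return first_iterations if intranet >= extranet else second_iterations


print(choose_iterations(["intranet", "intranet", "extranet"]))
print(choose_iterations(["extranet", "extranet", "intranet"]))
```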
From the above embodiments, it can be seen that the traceable data layering and grading method based on the genetic bionic model algorithm provided by the embodiment of the application has the following technical effects:
1. Improving data processing efficiency. By simulating natural selection and genetic mechanisms, the data genetic bionic algorithm can automatically and efficiently classify and grade massive data. The algorithm can identify patterns and features in the data, so that the types and levels of data can be divided more accurately and the accuracy of data management is improved. The automated nature of the algorithm reduces manual intervention and the possibility of human error, and improves the efficiency and speed of data processing. This is particularly important for large-scale data sets, where it can significantly shorten the data governance cycle. The algorithm can also dynamically adjust the layering of the data as the data changes, further improving processing efficiency.
2. Enhancing data security and privacy protection. The traceable data layering and grading method based on the genetic bionic model algorithm can apply security access policies of different levels to data of different levels, which reduces the risk of data leakage and abuse and improves data security. Sensitive data with high data value and a high data influence degree can be identified more accurately through the data genetic bionic algorithm and given focused protection. For example, sensitive data is stored and transmitted in encrypted form and unauthorized access is restricted, to ensure that the sensitive data is not illegally acquired or used.
3. Facilitating data sharing and analysis. The traceable data layering and grading method based on the genetic bionic model algorithm can clearly determine access and use authority of data of different levels, and provides clear boundaries for data sharing, so that the data sharing and exchange can be facilitated on the premise of compliance, and the data utilization value is improved. Through the optimized data index structure, the data retrieval speed in the data analysis process is increased, more accurate and timely data support is provided for a decision maker, and better business decisions are helped to be made.
4. Strengthening compliant operation. The traceable data layering and grading method based on the genetic bionic model algorithm helps enterprises and organizations better comply with relevant regulations and industry standards. Making the data classification and grading standards explicit, together with the corresponding security management and protection requirements, reduces the risk of violations and of legal disputes.
5. Improving the economic benefit of data services. Combining the traceable basic model data formed by the genetic bionic algorithm with the layered and graded data management method has high practical value in the regional power grid industry. It addresses the difficulty of layering and grading power grid data that spans multiple departments, dimensions, and services, raises the level of regional power grid data management, promotes the replacement of manually operated management models with advanced technology, and brings considerable convenience to regional power grids.
In order to better implement the traceable data layering and grading method based on the genetic bionic model algorithm in the embodiment of the present application, the embodiment of the present application further provides, on the basis of that method, a traceable data layering and grading device based on the genetic bionic model algorithm. As shown in FIG. 4, the traceable data layering and grading device 400 based on the genetic bionic model algorithm includes:
a first obtaining module 401, configured to obtain target data to be managed;
a first determining module 402, configured to determine data hierarchical information of each of the target data;
a second determining module 403, configured to use the data hierarchical information of each of the target data as an initial individual in a genetic algorithm;
a genetic algorithm module 404, configured to perform optimization processing on the plurality of initial individuals by using the genetic algorithm, so as to obtain optimized data hierarchical information;
a data storage module 405, configured to perform data hierarchical storage on the corresponding target data according to the optimized data hierarchical information.
The embodiment of the application also provides a computer device that integrates any of the traceable data layering and grading devices based on the genetic bionic model algorithm described above. FIG. 5 shows a schematic structural diagram of the computer device according to an embodiment of the present application. Specifically:
The computer device may include a processor 501 having one or more processing cores, a memory 502 comprising one or more computer-readable storage media, a power supply 503, an input unit 504, and other components. Those skilled in the art will appreciate that the structure shown in FIG. 5 does not limit the computer device, which may include more or fewer components than shown, may combine certain components, or may use a different arrangement of components. Wherein:
The processor 501 is the control center of the computer device and uses various interfaces and lines to connect the various parts of the overall computer device, and by running or executing software programs and/or modules stored in the memory 502, and invoking data stored in the memory 502, performs various functions of the computer device and processes the data, thereby performing overall monitoring of the computer device. Optionally, the processor 501 may include one or more processing cores, and preferably the processor 501 may integrate an application processor and a modem processor, wherein the application processor primarily processes operating systems, user interfaces, application programs, etc., and the modem processor primarily processes wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 501.
The memory 502 may be used to store software programs and modules, and the processor 501 executes various functional applications and data processing by running the software programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to the use of the computer device, and the like. In addition, the memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 502 may also include a memory controller to provide the processor 501 with access to the memory 502.
The computer device further includes a power supply 503 for powering the various components, and preferably the power supply 503 may be logically coupled to the processor 501 via a power management system such that functions such as charge, discharge, and power consumption management are performed by the power management system. The power supply 503 may also include one or more of any of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
The computer device may also include an input unit 504, which input unit 504 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the computer device may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 501 in the computer device loads executable files corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and the processor 501 executes the application programs stored in the memory 502, so as to implement various functions as follows:
Obtaining target data to be managed; determining data layering and grading information of each target data; taking the data layering and grading information of each target data as an initial individual in a genetic algorithm; optimizing the plurality of initial individuals by using the genetic algorithm to obtain optimized data layering and grading information; and storing the corresponding target data in a layered and graded manner according to the optimized data layering and grading information.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a computer-readable storage medium, which may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like. The storage medium stores a computer program which, when loaded by a processor, performs the steps of any traceable data layering and grading method based on the genetic bionic model algorithm provided by the embodiment of the application. For example, when loaded by the processor, the computer program may perform the following steps:
Obtaining target data to be managed; determining data layering and grading information of each target data; taking the data layering and grading information of each target data as an initial individual in a genetic algorithm; optimizing the plurality of initial individuals by using the genetic algorithm to obtain optimized data layering and grading information; and storing the corresponding target data in a layered and graded manner according to the optimized data layering and grading information.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of an electronic device reads the computer instructions from the computer-readable storage medium and executes them, so that the electronic device implements the traceable data layering and grading method based on the genetic bionic model algorithm described in any of the above embodiments, for example:
Obtaining target data to be managed; determining data layering and grading information of each target data; taking the data layering and grading information of each target data as an initial individual in a genetic algorithm; optimizing the plurality of initial individuals by using the genetic algorithm to obtain optimized data layering and grading information; and storing the corresponding target data in a layered and graded manner according to the optimized data layering and grading information.
In the foregoing embodiments, each embodiment is described with its own emphasis. For portions of an embodiment that are not described in detail, reference may be made to the detailed descriptions of the other embodiments above, which are not repeated here.
In the implementation, each unit or structure may be implemented as an independent entity, or may be implemented as the same entity or several entities in any combination, and the implementation of each unit or structure may be referred to the foregoing method embodiments and will not be repeated herein.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
The traceable data layering and grading method and device based on the genetic bionic model algorithm provided by the embodiment of the application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the above embodiments is intended only to help in understanding the method and core idea of the application. Meanwhile, those skilled in the art may make changes to the specific implementation and the scope of application in accordance with the idea of the application. In summary, the content of this description should not be construed as limiting the application.