CN119829683A - Government affair data sharing system, method, equipment and storage medium - Google Patents

Government affair data sharing system, method, equipment and storage medium

Info

Publication number
CN119829683A
Authority
CN
China
Prior art keywords
data
government
government affair
government data
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510068517.2A
Other languages
Chinese (zh)
Inventor
田野
弓恒勃
丁兵
颜恺华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd filed Critical Inspur Cloud Information Technology Co Ltd
Priority to CN202510068517.2A priority Critical patent/CN119829683A/en
Publication of CN119829683A publication Critical patent/CN119829683A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a government affair data sharing system, method, device and storage medium, relating to the technical field of big data. The system comprises a standardized processing module, a real-time synchronization processing module, an encryption and authentication module, and a storage and construction module. The standardized processing module constructs a target flow for original government affair data and processes the data based on the target flow to complete standardized processing. The real-time synchronization processing module captures changed government affair data, sends it to a distributed message queue, transmits the government affair data in real time through the distributed message queue, and uses a distributed coordination mechanism to guarantee synchronization of the government affair data. The encryption and authentication module performs fine-grained access control and identity authentication on a data visitor and sends the encrypted government affair data to the data visitor. The storage and construction module stores the standardized and synchronized government affair data in a distributed storage system and constructs a data query interface and a target retrieval engine, so that data query and retrieval based on the data query interface and the target retrieval engine realize sharing of the government affair data.

Description

Government affair data sharing system, method, equipment and storage medium
Technical Field
The invention relates to the technical field of big data, and in particular to a government affair data sharing system, method, device and storage medium.
Background
With the rapid development of e-government, government departments have accumulated large amounts of data resources. However, for historical reasons and because of technical barriers, these data tend to be scattered across different systems, forming "data islands". This leads to repeated data collection, inconsistent information and difficulty in sharing, which seriously affects the efficiency and quality of government services. Although data sharing platforms have been established in some places, the following problems remain: data standards are not uniform, which makes data integration harder; real-time performance is insufficient, so immediate service requirements are difficult to meet; security protection mechanisms are imperfect, which creates a risk of data leakage; data service interfaces are inflexible and difficult to adapt to diverse application requirements; and an effective data quality management mechanism is lacking, which affects the availability and reliability of the data.
Disclosure of Invention
In view of the above, the present invention aims to provide a government affair data sharing system, method, device and storage medium that can realize real-time updating, secure sharing, flexible access and intelligent analysis of government affair data, and thereby effectively solve the problems of data dispersion, non-uniform formats, difficult sharing and security risks in traditional government affair data management. The specific scheme is as follows:
In a first aspect, the present application discloses a government affair data sharing system, comprising:
The standardized processing module is used for defining a data schema for the original government affair data by using the data schema definition language of a target data serialization system so as to generate corresponding serialization code, and for constructing a target flow for the original government affair data based on Apache NiFi, so that the original government affair data is processed based on the target flow and its standardized processing is completed;
The real-time synchronization processing module is used for connecting each source of original government affair data to a distributed message queue, capturing changed government affair data through Debezium, sending the changed government affair data to the distributed message queue, transmitting the corresponding government affair data in real time by using the distributed message queue, and guaranteeing synchronization of the government affair data by using the distributed coordination mechanism of ZooKeeper;
The encryption and authentication module is used for encrypting the transmission of the government affair data by using the Transport Layer Security protocol, performing fine-grained access control and identity authentication on a data visitor, and sending the corresponding encrypted government affair data to the data visitor after the identity authentication is passed;
The storage and construction module is used for storing the standardized government affair data and the synchronized government affair data in a distributed storage system, constructing a data query interface through GraphQL, and constructing a target retrieval engine, so that the data visitor can perform corresponding data query and data retrieval based on the data query interface, the distributed storage system and the target retrieval engine to realize sharing of the government affair data.
Optionally, the data schema comprises the fields, types and nesting relations of the original government affair data; the target flow comprises any one or a combination of extraction, transformation and loading; and the transformation comprises any one or a combination of format conversion, data cleaning and data integration.
Optionally, the system further includes:
The processing and checking module is used for performing target processing and quality checking on a target quantity of the original government affair data by using Apache Spark, wherein the target processing comprises statistical analysis and/or data mining, and the quality checking is used for checking whether the field values of the original government affair data are within a preset value range, whether the target fields of the original government affair data are completely filled in, and whether the original government affair data from different data sources are consistent.
Optionally, the system further includes:
The data processing module is used for constructing a breakpoint resume function based on the Flink checkpoint mechanism, so that the processing state of the current government affair data is recorded at preset checkpoints and, when system operation fails, processing of the current government affair data can continue from the checkpoint and its recorded processing state after the system is restored.
Optionally, the system further includes:
The access behavior monitoring module is used for constructing a normal access behavior model by collecting and analyzing historical access data, and for monitoring the current access behavior of the data visitor in real time by using the normal access behavior model, so that a corresponding alarm is sent when the current access behavior is found to be a preset abnormal access behavior.
Optionally, the system further includes:
The security detection module is used for configuring an automated test script and periodically performing security detection on the government affair data sharing system based on the automated test script, so as to ensure the security of the government affair data sharing system.
Optionally, the system further includes:
The visual analysis module is used for constructing a visual analysis platform based on Apache Superset, so that the data visitor can, according to analysis requirements, drag and configure the government affair data on the visual analysis platform to generate corresponding charts and visually analyze the government affair data through the charts.
In a second aspect, the application discloses a government affair data sharing method, which comprises the following steps:
Defining a data schema for original government affair data by using the data schema definition language of a target data serialization system to generate corresponding serialization code, and constructing a target flow for the original government affair data based on Apache NiFi, so as to process the original government affair data based on the target flow and complete standardized processing of the original government affair data;
Connecting each source of original government affair data to a distributed message queue, capturing changed government affair data through Debezium, sending the changed government affair data to the distributed message queue, transmitting the corresponding government affair data in real time by using the distributed message queue, and guaranteeing synchronization of the government affair data by using the distributed coordination mechanism of ZooKeeper;
Carrying out data transmission encryption on the government affair data by utilizing a transmission layer security protocol, carrying out fine-granularity access control and identity authentication on a data visitor, and sending the corresponding encrypted government affair data to the data visitor after the identity authentication is passed;
And storing the standardized government affair data and the synchronized government affair data into a distributed storage system, constructing a data query interface through GraphQL technology, and constructing a target retrieval engine so that the data visitor can perform corresponding data query and data retrieval based on the data query interface, the distributed storage system and the target retrieval engine to realize sharing of the government affair data.
In a third aspect, the present application discloses an electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the government affair data sharing method described above.
In a fourth aspect, the present application discloses a computer readable storage medium for storing a computer program, where the computer program when executed by a processor implements a government affair data sharing method as described above.
The government affair data sharing system comprises a standardized processing module, a real-time synchronization processing module, an encryption and authentication module, and a storage and construction module. The standardized processing module defines a data schema for original government affair data by using the data schema definition language of a target data serialization system to generate corresponding serialization code, and constructs a target flow for the original government affair data based on Apache NiFi, so that the data is processed based on the target flow and its standardized processing is completed. The real-time synchronization processing module connects each source of original government affair data to a distributed message queue, captures changed government affair data through Debezium, sends the changed data to the distributed message queue, transmits the corresponding government affair data in real time through the queue, and uses the distributed coordination mechanism of ZooKeeper to guarantee synchronization of the government affair data. The encryption and authentication module encrypts the transmission of the government affair data by using the Transport Layer Security protocol, performs fine-grained access control and identity authentication on a data visitor, and sends the corresponding encrypted government affair data to the data visitor after the identity authentication is passed. The storage and construction module stores the standardized and synchronized government affair data in a distributed storage system, constructs a data query interface through GraphQL, and constructs a target retrieval engine, so that the data visitor can perform data query and data retrieval based on the data query interface, the distributed storage system and the target retrieval engine to realize sharing of the government affair data.
Through the data standardization service, the application unifies data formats and structures so that the data of different departments and systems can be mutually understood and exchanged. The real-time synchronization service ensures the timeliness of the data, so that each department can obtain the latest data in real time: change data in the databases is captured and processed quickly and the latest data is transmitted to the related systems and applications in real time, allowing government departments to make timely decisions based on real-time data. The security of government affair data is ensured across data transmission, access control, data storage and computation, which prevents data from being illegally obtained, tampered with or leaked during exchange and sharing and protects information security. The application thereby solves the problem of efficient cross-department and cross-system exchange and sharing of government affair data, overcomes the government affair data island dilemma, and significantly improves the efficiency and quality of government services.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a government affair data sharing system according to the present application;
FIG. 2 is a flow chart of a government affair data sharing method disclosed by the application;
FIG. 3 is a block diagram of an electronic device according to the present disclosure.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
At present, although data sharing platforms have been established in some places, the following problems remain: data standards are not uniform, which makes data integration harder; real-time performance is insufficient, so immediate service requirements are difficult to meet; security protection mechanisms are imperfect, which creates a risk of data leakage; data service interfaces are inflexible and difficult to adapt to diverse application requirements; and an effective data quality management mechanism is lacking, which affects the availability and reliability of the data. To solve these technical problems, the application discloses a government affair data sharing system, method, device and storage medium that can realize real-time updating, secure sharing, flexible access and intelligent analysis of government affair data, and that effectively solve the problems of data dispersion, non-uniform formats, difficult sharing and security risks in traditional government affair data management.
Referring to fig. 1, an embodiment of the present invention discloses a government affair data sharing system, including:
The standardized processing module 11 is configured to define a data schema for original government affair data by using the data schema definition language of a target data serialization system to generate corresponding serialization code, and to construct a target flow for the original government affair data based on Apache NiFi, so as to process the original government affair data based on the target flow and complete standardized processing of the original government affair data.
In this embodiment, Apache Avro is adopted as the data serialization system. Apache Avro provides a way to define a data schema: according to the characteristics and requirements of government affair data, a developer can use Avro IDL (the data schema definition language) to define the data structure, that is, the data schema, which includes the fields, types and nesting relations of the data. Multi-language-compatible serialization code is then generated automatically, which realizes unified serialization and deserialization of data across different systems and languages and facilitates data interaction between systems written in different programming languages. Meanwhile, metadata is managed with Apache Atlas. Metadata management covers information such as the definition, source, use and circulation path of government affair data. One important function is the data lineage graph, through which the origin of the data, its transformation process and its flow through different systems and processes can be clearly traced; this helps in understanding where data comes from and where it goes, and facilitates data management and troubleshooting. Automatic registration and updating of metadata is achieved through the Atlas REST API, a set of RESTful API (Application Programming Interface) interfaces provided by Apache Atlas. Metadata is data that describes government affair data, such as its name, type, storage location, creation time, update time and association relations. As government affair data is generated, updated and circulated, the related metadata needs to be registered in Atlas in time so that it can be effectively managed and tracked, and when data changes or new data is added, the metadata needs to be updated accordingly.
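The notion of a data schema with fields, types and nesting relations can be illustrated with a minimal sketch. It assumes a hypothetical `ResidentRecord` record written in Avro's JSON schema form (rather than Avro IDL), and the `conforms` function is a simplified stand-in written for illustration; it is not the validation performed by the Avro library.

```python
# Hypothetical schema in Avro's JSON form: fields, types, and one nested record.
RESIDENT_SCHEMA = {
    "type": "record",
    "name": "ResidentRecord",
    "fields": [
        {"name": "resident_id", "type": "string"},
        {"name": "age", "type": "int"},
        {"name": "address", "type": {
            "type": "record",
            "name": "Address",
            "fields": [
                {"name": "city", "type": "string"},
                {"name": "street", "type": "string"},
            ],
        }},
    ],
}

# Mapping from a few Avro primitive type names to Python types (illustrative subset).
PRIMITIVES = {"string": str, "int": int, "boolean": bool, "double": float}


def conforms(value, schema):
    """Return True if value matches the simplified schema."""
    if isinstance(schema, str):
        expected = PRIMITIVES[schema]
        # bool is a subclass of int in Python, so reject it for "int".
        if expected is int and isinstance(value, bool):
            return False
        return isinstance(value, expected)
    if schema["type"] == "record":
        if not isinstance(value, dict):
            return False
        names = {f["name"] for f in schema["fields"]}
        if set(value) != names:
            return False
        return all(conforms(value[f["name"]], f["type"]) for f in schema["fields"])
    raise ValueError(f"unsupported schema type: {schema['type']}")
```

A record with a nested `address` passes the check, while a record whose `age` is a string or that omits a field is rejected.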
An ETL (Extract-Transform-Load) flow is developed based on Apache NiFi, an easy-to-use, powerful and reliable system for pulling, processing and distributing data that automates the management of data flows between systems. Here, transformation mainly refers to operations on the data such as format conversion, data cleaning (for example removing duplicate data, correcting erroneous data and supplementing missing data) and data integration (merging or associating data from different sources according to certain rules), so that the data meets the requirements of subsequent processing and analysis.
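The three transformation operations named above can be sketched in plain Python. This is only an illustration of the concepts and is not a NiFi flow; the record layout (`id`, `name`, `registered`, `permit`) and all function names are hypothetical.

```python
from datetime import datetime


def normalize(record):
    """Format conversion: trim strings and convert dates to ISO 8601."""
    return {
        "id": record["id"].strip(),
        "name": record["name"].strip(),
        "registered": datetime.strptime(record["registered"], "%Y/%m/%d").date().isoformat(),
    }


def deduplicate(records):
    """Data cleaning: keep the first record seen for each id."""
    seen = {}
    for r in records:
        seen.setdefault(r["id"], r)
    return list(seen.values())


def integrate(residents, permits):
    """Data integration: attach permits from a second source by resident id."""
    by_id = {}
    for p in permits:
        by_id.setdefault(p["resident_id"], []).append(p["permit"])
    return [dict(r, permits=by_id.get(r["id"], [])) for r in residents]


def run_pipeline(raw, permits):
    """Convert formats, remove duplicates, then merge in the second source."""
    return integrate(deduplicate([normalize(r) for r in raw]), permits)
```

Normalizing before deduplicating matters here: two records whose ids differ only in surrounding whitespace are recognized as the same record only after format conversion.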
In addition, the application uses Apache Spark for large-scale data processing and quality checking. The data processing includes complex computation (such as statistical analysis and data mining) and data aggregation (gathering scattered data into more meaningful information) over massive government affair data. The quality checking detects and evaluates quality dimensions of the data such as accuracy, completeness, consistency and timeliness through a series of preset rules and algorithms, for example checking whether values are within a reasonable range, whether required fields are completely filled in, and whether data from different data sources are consistent, so as to monitor data quality in real time. Specifically, a data quality checking system based on a rule engine (such as Drools) is designed. It supports flexible configuration of complex rules, so that data quality rules such as the value range of a specific field or the logical relationships among data can be customized according to the characteristics and service requirements of different government affair data, thereby achieving accurate quality monitoring. Furthermore, the application introduces machine learning algorithms to realize intelligent metadata management and adaptive adjustment, automatically discovering hidden associations and business logic in the data and dynamically updating the metadata and lineage graphs. A deep learning model is also adopted for real-time data quality monitoring and automatic repair: it detects quality problems and automatically triggers a repair mechanism.
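The three example checks (value range, required fields, cross-source consistency) can be sketched as small rule functions. This is plain Python for illustration, not Spark or Drools code; the field names and the 0 to 150 age range are assumptions.

```python
def check_range(record, field, low, high):
    """Accuracy: is the field present and within [low, high]?"""
    value = record.get(field)
    return value is not None and low <= value <= high


def check_required(record, required_fields):
    """Completeness: return the required fields that are missing or empty."""
    return [f for f in required_fields if record.get(f) in (None, "")]


def check_consistency(record_a, record_b, fields):
    """Consistency: return the fields whose values differ between two sources."""
    return [f for f in fields if record_a.get(f) != record_b.get(f)]


def quality_report(record, other_source):
    """Run all three checks on one record (rule parameters are hypothetical)."""
    return {
        "age_in_range": check_range(record, "age", 0, 150),
        "missing_fields": check_required(record, ["id", "name", "age"]),
        "inconsistent_fields": check_consistency(record, other_source, ["name", "age"]),
    }
```

Keeping each rule as a separate function with its parameters passed in mirrors the idea of configurable rules: a new value range or a new required field is a change of arguments, not of code.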
The automatic data processing and quality monitoring flow reduces errors caused by manual intervention, thereby improving the overall quality of government affair data, ensuring that the data is more reliable and providing powerful support for government decision and government affair service.
The real-time synchronization processing module 12 is configured to access each original government data to a distributed message queue, capture changed government data through Debezium, send the changed government data to the distributed message queue, transmit corresponding government data in real time by using the distributed message queue, and guarantee synchronization of the government data by using a distributed coordination mechanism of a ZooKeeper.
In this embodiment, Apache Kafka is used as the distributed message queue, combined with Apache Flink for real-time stream processing. Using the Kafka Connect framework, custom Source and Sink connectors are developed to support access to various government affair data sources (such as different types of databases and file systems), so that government affair data of all kinds can be conveniently fed into the Kafka message queue. This realizes high-throughput real-time data transmission and allows large volumes of real-time data from each government affair data source to be processed efficiently. A real-time data processing pipeline is developed based on Apache Flink. It supports complex event processing and can analyze and process the incoming change data in real time, for example identifying event sequences that match a specific pattern or computing data indicators in real time. Kafka has strong large-scale data stream processing capability and ensures the timeliness and reliability of the data. Flink provides stream processing with support for complex event processing and state management, realizing real-time data transformation and analysis. Exactly-once semantics based on Kafka transactions are implemented, so that across the whole synchronization process from the data source to final storage or processing, each piece of data is processed exactly once regardless of faults or retries and is neither lost nor processed repeatedly, which ensures the accuracy of data synchronization.
The application uses Debezium to capture change data, that is, the data changes produced by insert, update and delete operations in the government affair databases. Debezium monitors the transaction log of the database in real time, detects data changes promptly, and sends the captured change data to the Kafka message queue in the form of messages.
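How a downstream consumer might apply such change messages can be sketched as follows. The sketch assumes events shaped like the Debezium change event envelope, with an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read) and `before`/`after` row images; the in-memory `table`, the `id` key and the `apply_change` function are hypothetical, and reading from Kafka is left out.

```python
def apply_change(table, event, key="id"):
    """Apply one change event to an in-memory replica keyed by `key`."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Create, snapshot read, and update all carry the new row in "after".
        row = event["after"]
        table[row[key]] = row
    elif op == "d":
        # Delete carries the removed row in "before".
        table.pop(event["before"][key], None)
    else:
        raise ValueError(f"unknown op: {op}")
    return table
```

Because inserts and updates are both applied as an upsert and a delete of a missing key is a no-op, replaying the same event twice leaves the replica unchanged.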
The system also comprises a data processing module, which is used for constructing a breakpoint resume function based on the Flink checkpoint mechanism. Flink records the current data processing state at preset checkpoints. If a fault (such as node downtime or a network interruption) occurs while the system is running, then after the system is restored, data processing can continue from the position and processing state of the last checkpoint without starting again from the beginning. This improves the fault tolerance of the system and ensures the continuity and integrity of data processing.
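The checkpoint-and-resume idea can be sketched with a toy job that keeps a running total. This is a simulation written for illustration and does not use the Flink API; the `CheckpointedSum` class, the checkpoint interval and the simulated failure are all assumptions.

```python
class CheckpointedSum:
    """Toy stateful job: sums records and snapshots (offset, total) periodically."""

    def __init__(self, interval):
        self.interval = interval
        self.offset = 0            # number of records processed so far
        self.total = 0             # running state
        self.checkpoint = (0, 0)   # last saved (offset, total)

    def process(self, records, fail_at=None):
        """Process records from the current offset; optionally simulate a crash."""
        while self.offset < len(records):
            if fail_at is not None and self.offset == fail_at:
                raise RuntimeError("simulated failure")
            self.total += records[self.offset]
            self.offset += 1
            if self.offset % self.interval == 0:
                self.checkpoint = (self.offset, self.total)
        return self.total

    def restore(self):
        """Roll position and state back to the last checkpoint."""
        self.offset, self.total = self.checkpoint


def run_with_recovery(records, interval, fail_at):
    """Run the job once with a simulated failure; return (result, resume offset)."""
    job = CheckpointedSum(interval)
    try:
        job.process(records, fail_at=fail_at)
    except RuntimeError:
        job.restore()              # resume from the last checkpoint, not from zero
        resumed_from = job.offset
        return job.process(records), resumed_from
    return job.total, 0
```

Restoring the offset and the state together is the essential point: records after the checkpoint are reprocessed, but their earlier partial effect on the total is discarded, so the final result matches a run without failure.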
The application implements a distributed coordination mechanism based on ZooKeeper, whose function is to coordinate the work of the nodes in the distributed environment. During data synchronization, it ensures that different nodes hold a consistent view of the data state, guarantees the order and reliability of data synchronization, and avoids data conflicts and inconsistencies. For example, when multiple consumers fetch data from a Kafka queue for processing, ZooKeeper can coordinate their consumption progress so that no data is consumed repeatedly or missed.
The encryption and authentication module 13 is configured to encrypt the transmission of the government affair data by using the Transport Layer Security protocol, perform fine-grained access control and identity authentication on a data visitor, and send the corresponding encrypted government affair data to the data visitor after the identity authentication is passed.
In this embodiment, TLS (Transport Layer Security) is used to encrypt data transmission, so as to ensure the confidentiality of government affair data during network transmission and prevent data theft or tampering. Apache Shiro is used to realize fine-grained access control and identity authentication. Specifically, permissions are assigned to different users or system roles by defining a series of permission policies and rules, and only users or systems that have been authenticated and hold the corresponding permissions can access specific government affair data resources. For example, for sensitive data, only an administrator of a specific level can perform read and write operations, while an ordinary user has read-only access. Meanwhile, the ELK (Elasticsearch, Logstash, Kibana) stack is integrated to realize comprehensive log collection and auditing. Logstash collects the log information generated by each part of the government affair system, including data access logs and system operation logs; Elasticsearch indexes and stores it; and Kibana provides a visual interface through which an administrator can query, analyze and audit the logs, so that potential security problems and abnormal behaviors can be found in time. The application also develops an API gateway based on Apache Knox, which provides a unified secure access entrance and performs unified management and security verification of external requests. Only legitimate requests can access the government affair data services through the gateway, which prevents illegal access and malicious attacks.
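The role-based example above (administrators may read and write sensitive data, ordinary users may only read it) can be sketched as a permission lookup. This is a plain-Python illustration and does not use the Apache Shiro API; the role names, resource names and the `is_permitted` function are hypothetical.

```python
# Hypothetical permission policy: each role maps to allowed (resource, action) pairs.
ROLE_PERMISSIONS = {
    "admin": {
        ("sensitive", "read"), ("sensitive", "write"),
        ("public", "read"), ("public", "write"),
    },
    "user": {("sensitive", "read"), ("public", "read")},
}


def is_permitted(user, resource, action):
    """Allow access only to authenticated users holding a role with the permission."""
    if not user.get("authenticated"):
        return False
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in user["roles"]
    )
```

Authentication is checked before authorization, so an unauthenticated caller is refused even if it claims an administrator role.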
During data transmission, a data desensitization technique based on homomorphic encryption supports computing directly on encrypted data: specific computations, such as statistical analysis, can be carried out on data in its encrypted state while data privacy is preserved, which protects the security of the data and still meets part of the data analysis requirements. Meanwhile, a hardware security module (HSM) based on PKCS #11 is used to manage keys, which improves key security. The PKCS #11 standard provides a secure way to manage keys, and the hardware security module stores the keys in a dedicated hardware device isolated from the operating system, preventing the keys from being illegally obtained or tampered with. The hardware security module also supports operations such as key generation, storage, encryption and decryption, and provides a secure key backup and recovery mechanism that ensures the security and availability of the keys when the hardware device fails or is replaced.
The application uses an automated security testing flow based on OWASP ZAP to scan for security vulnerabilities periodically. OWASP ZAP is a widely used open-source security testing tool. By configuring automated test scripts, comprehensive security scans are performed periodically on the government affair big data platform, including detection of common Web (World Wide Web) security vulnerabilities such as SQL (Structured Query Language) injection and cross-site scripting, so that potential security hazards are discovered and repaired in time and the security of the platform is ensured. Meanwhile, the application uses an abnormal access detection system based on machine learning to identify potential security threats in real time. The system builds a normal access behavior model by collecting and analyzing a large amount of historical access data and monitors current access behavior in real time. When it finds access behavior that deviates significantly from the normal model (for example a user frequently accessing sensitive data within a short time, or access from an unusual geographic location), it raises an alarm promptly so that an administrator can take corresponding preventive and remedial measures. In this way, the security of government affair data is ensured across data transmission, access control, data storage and computation, which prevents data from being illegally obtained, tampered with or leaked during exchange and sharing and protects information security.
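The idea of comparing current access behavior against a model built from history can be sketched with a deliberately simple statistical baseline. Note that this swaps in a z-score threshold on per-window access counts as a stand-in for the machine learning model, whose details the text does not specify; the function names and the threshold of 3.0 are assumptions.

```python
from statistics import mean, pstdev


def build_baseline(history):
    """Summarize historical access counts per time window as (mean, std dev)."""
    return mean(history), pstdev(history)


def is_anomalous(count, baseline, threshold=3.0):
    """Flag a count whose distance from the mean exceeds `threshold` std devs."""
    mu, sigma = baseline
    if sigma == 0:
        # With no historical variation, any different count is a deviation.
        return count != mu
    return abs(count - mu) / sigma > threshold
```

For a user whose history is around ten accesses per window, a window with sixty accesses would be flagged, while a window with eleven would not.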
The storage and construction module 14 is configured to store the standardized government affair data and the synchronized government affair data in a distributed storage system, build a data query interface using GraphQL, and build a target search engine, so that the data visitor can perform the corresponding data queries and data searches based on the data query interface, the distributed storage system, and the target search engine, thereby sharing the government affair data.
In this embodiment, the present application stores the standardized and synchronized data in a distributed storage system, which may be a distributed file system or a distributed database, to support large-scale data storage and highly concurrent access. GraphQL is used to provide a flexible data query interface: the client specifies exactly the data fields and query conditions it needs, which avoids the over-fetching or under-fetching of data that can occur with a traditional RESTful API, reduces unnecessary data transmission, and improves data retrieval efficiency. Together, GraphQL and Apache Solr meet the diverse government affair data query needs of different users and application scenarios.
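The benefit of letting the client name the fields it needs can be shown without a GraphQL server. The sketch below is a plain-Python analogue of GraphQL field selection, not an implementation of GraphQL itself; the record layout and field names are hypothetical.

```python
def select_fields(record, selection):
    """Return only the requested fields, recursing into nested objects and lists.

    `selection` maps a field name to None (take the value as-is) or to a
    nested selection dict, mirroring how a GraphQL query nests its fields.
    """
    result = {}
    for name, sub_selection in selection.items():
        value = record[name]
        if sub_selection is None:
            result[name] = value
        elif isinstance(value, list):
            result[name] = [select_fields(item, sub_selection) for item in value]
        else:
            result[name] = select_fields(value, sub_selection)
    return result


if __name__ == "__main__":
    # Hypothetical stored policy document with more fields than most clients need.
    policy = {
        "id": "p-1",
        "title": "Housing subsidy policy",
        "body": "(long full text)",
        "department": {"name": "Housing Bureau", "phone": "000"},
        "attachments": [{"name": "a.pdf", "size": 10}],
    }
    # Roughly what a query like { title department { name } attachments { name } } asks for.
    wanted = {"title": None, "department": {"name": None}, "attachments": {"name": None}}
    print(select_fields(policy, wanted))
```

Only the selected fields leave the server, so a list view that needs titles does not pay for transferring full document bodies.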
In addition, Apache Solr is used to build a full-text search engine that performs efficient full-text search over the text content of government affair data, for example searching for keywords in policy documents and official documents to locate relevant information quickly, and supports complex search requirements such as keyword search, fuzzy queries, and semantic queries. The application integrates TensorFlow Serving to deploy machine learning models and provide intelligent data analysis services, including data prediction (such as predicting trends in demand for government services or in socio-economic development), data classification (such as classifying government service items), and intelligent recommendation (such as recommending suitable government service items to citizens), helping government departments make better-informed decisions. The system also comprises a visual analysis module, configured to build a visual analysis platform based on Apache Superset, so that a data visitor can, according to analysis needs, drag and configure the government affair data on the platform, generate corresponding charts, and analyze the government affair data visually through those charts. The Apache Superset-based platform supports user-defined dashboards and reports: through simple drag-and-drop and configuration operations, users can display government affair data as intuitive charts (such as bar charts, line charts, and maps), which makes the data easy to analyze visually, helps users quickly gain insight into the information behind the data, and provides strong support for decision making.
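A keyword or fuzzy search against Solr is ultimately an HTTP request to a collection's `select` handler. The sketch below only builds such a request URL; the host, the collection name `gov_docs`, and the field names `content`, `id`, and `title` are assumptions for illustration, and a real client would also escape query-parser special characters in user input.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, collection, keyword, fuzzy=False, rows=10):
    """Build a Solr /select URL for a single-term keyword search.

    With fuzzy=True the term gets the "~1" suffix, which the standard query
    parser treats as a fuzzy match within edit distance 1.
    """
    term = f"{keyword}~1" if fuzzy else keyword
    params = {
        "q": f"content:{term}",  # search the (assumed) full-text field
        "fl": "id,title",        # return only the fields the caller needs
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/solr/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_solr_select_url("http://localhost:8983", "gov_docs", "subsidy", fuzzy=True))
```

The resulting URL can be fetched with any HTTP client; the response lists the matching documents restricted to the requested fields.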
The application also uses a Redis-based cache layer to optimize access performance for hot data. Frequently accessed government affair data (such as popular policy documents and commonly used statistics) is cached in Redis; when a client requests the data again, it can be served directly from the cache, which greatly improves data access speed and reduces the load on back-end storage and processing. Furthermore, Apache Airflow is used to build ETL (Extract, Transform, Load) and machine learning workflows that automate data processing and model training. Airflow can define a series of tasks together with their execution order and dependencies, such as periodically extracting data from government data sources, cleaning and transforming the data, and training a machine learning model, so that the whole data processing and model training pipeline is scheduled and managed automatically, which improves efficiency and reduces errors that manual intervention might introduce. An intelligent question-answering system based on a knowledge graph is also developed to provide a natural language interface. A government affair knowledge graph represents the entities, relations, and other information in government affair data in structured form; the question-answering system uses natural language processing to understand the user's question, then queries and reasons over the knowledge graph to give the user an accurate answer, providing a more convenient and intelligent way to interact with government affair data services.
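The hot-data caching described above follows the common cache-aside pattern: look in the cache first, fall back to the back-end store on a miss, and keep the result for a limited time. The sketch below uses an in-memory dictionary as a stand-in for Redis so that it runs without a server; the class name, the loader callback, and the TTL value are hypothetical.

```python
import time


class HotDataCache:
    """Cache-aside lookup with a per-entry time-to-live (in-memory stand-in for Redis)."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader  # called on a miss to fetch from back-end storage
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}     # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: load from the back end and refresh the cache.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    def load_from_storage(key):
        print("back-end read for", key)
        return f"document:{key}"

    cache = HotDataCache(load_from_storage, ttl_seconds=60)
    cache.get("policy-2025-01")  # first request reaches the back end
    cache.get("policy-2025-01")  # repeat request is served from the cache
    print(cache.hits, cache.misses)
```

With Redis the dictionary would be replaced by reads and writes against the Redis server, with the expiry handled by Redis itself, but the control flow is the same.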
Thus, the knowledge-graph-based intelligent question-answering system provides a natural language mode of interaction that makes it convenient for citizens to obtain government information, and the flexible data services encourage innovative applications of government information, such as intelligent government assistants and personalized government service recommendation.
The application unifies data formats and structures through the data standardization service, so that data from different departments and systems can be mutually understood and exchanged, while the real-time synchronization service keeps the data timely, so that every department can obtain the latest data in real time. Changed data in databases can be captured and processed quickly and the latest data delivered to the relevant systems and applications in real time, allowing government departments to make timely decisions based on real-time data. The security of government affair data is ensured across data transmission, access control, data storage, and computation; illegal acquisition, tampering, or leakage of data during exchange and sharing is prevented; and information security is protected. The application also solves the problem of efficiently exchanging and sharing government affair data across departments and systems, overcomes the government data silo problem, and significantly improves the efficiency and quality of government services.
Referring to Fig. 2, an embodiment of the present application discloses a government affair data sharing method, which comprises the following steps:
Step S11: define a data schema for the original government affair data using the data schema definition language of a target data serialization system to generate corresponding serialization code, and build a target flow for the original government affair data based on Apache NiFi, so that the original government affair data is processed based on the target flow and its standardization is completed.
Step S12: connect each item of original government affair data to a distributed message queue, capture changed government affair data through Debezium, send the changed government affair data to the distributed message queue, transmit the corresponding government affair data in real time using the distributed message queue, and guarantee synchronization of the government affair data using the distributed coordination mechanism of ZooKeeper.
Step S13: encrypt the government affair data in transit using the Transport Layer Security protocol, perform fine-grained access control and identity authentication on a data visitor, and send the corresponding encrypted government affair data to the data visitor after identity authentication succeeds.
Step S14: store the standardized government affair data and the synchronized government affair data in a distributed storage system, build a data query interface using GraphQL, and build a target search engine, so that the data visitor can perform the corresponding data queries and data searches based on the data query interface, the distributed storage system, and the target search engine, thereby sharing the government affair data.
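For step S12, the consumer side of change data capture can be sketched as follows. Debezium change events carry an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read) together with `before` and `after` row images; the sketch assumes each event has already been read from the message queue and reduced to that payload, and it keeps the synchronized copy in a plain dictionary. The function name, the key field `id`, and the sample rows are hypothetical.

```python
def apply_change_event(replica, event, key_field="id"):
    """Apply one change event to an in-memory replica keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Create, snapshot read, and update all carry the new row in "after".
        row = event["after"]
        replica[row[key_field]] = row
    elif op == "d":
        # Deletes carry the removed row in "before".
        row = event["before"]
        replica.pop(row[key_field], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


if __name__ == "__main__":
    replica = {}
    events = [
        {"op": "c", "before": None, "after": {"id": 1, "status": "submitted"}},
        {"op": "u", "before": {"id": 1, "status": "submitted"},
         "after": {"id": 1, "status": "approved"}},
    ]
    for event in events:  # events must be applied in the order they were produced
        apply_change_event(replica, event)
    print(replica)
```

Because updates and deletes overwrite or remove by key, replaying an event twice leaves the replica unchanged, which is useful when a queue delivers a message more than once.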
The application unifies data formats and structures through the data standardization service, so that data from different departments and systems can be mutually understood and exchanged, while the real-time synchronization service keeps the data timely, so that every department can obtain the latest data in real time. Changed data in databases can be captured and processed quickly and the latest data delivered to the relevant systems and applications in real time, allowing government departments to make timely decisions based on real-time data. The security of government affair data is ensured across data transmission, access control, data storage, and computation; illegal acquisition, tampering, or leakage of data during exchange and sharing is prevented; and information security is protected. The application also solves the problem of efficiently exchanging and sharing government affair data across departments and systems, overcomes the government data silo problem, and significantly improves the efficiency and quality of government services.
Further, an embodiment of the present application also discloses an electronic device. Fig. 3 is a block diagram of an electronic device 20 according to an exemplary embodiment, and the content of the figure is not to be regarded as limiting the scope of use of the present application.
Fig. 3 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 is configured to store a computer program, which is loaded and executed by the processor 21 to implement the relevant steps of the government affair data sharing method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in this embodiment may specifically be a computer.
In this embodiment, the power supply 23 provides the operating voltage for each hardware device on the electronic device 20. The communication interface 24 establishes a data transmission channel between the electronic device 20 and external devices; the communication protocol it follows may be any communication protocol applicable to the technical solution of the present application and is not specifically limited here. The input/output interface 25 obtains external input data or outputs data to external devices; its specific interface type may be selected according to the needs of the application and is not specifically limited here.
The memory 22, as a carrier for storing resources, may be a read-only memory, a random access memory, a magnetic disk, an optical disk, or the like. The resources stored on it may include an operating system 221, a computer program 222, and so on, and the storage may be temporary or permanent.
The operating system 221 manages and controls the hardware devices on the electronic device 20 and the computer program 222, and may be Windows Server, NetWare, Unix, Linux, or the like. In addition to the computer program that performs the government affair data sharing method executed by the electronic device 20 as disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs for performing other specific tasks.
Furthermore, the application also discloses a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the government affair data sharing method described above. For the specific steps of the method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
In this specification, the embodiments are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be understood by reference to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
Those of skill in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing illustrates the principles and embodiments of the present application. The specific examples are provided only to aid understanding of those principles and embodiments and are in no way limiting; those of ordinary skill in the art will appreciate, in light of the above teachings, that the principles and embodiments of the present application may be varied.

Claims (10)

1. A government data sharing system, characterized by comprising:
a standardization processing module, configured to define a data schema for original government data using the data schema definition language of a target data serialization system so as to generate corresponding serialization code, and to build a target flow for the original government data based on Apache NiFi, so that the original government data is processed based on the target flow to complete the standardization of the original government data;
a real-time synchronization processing module, configured to connect each item of original government data to a distributed message queue, capture changed government data through Debezium, send the changed government data to the distributed message queue, transmit the corresponding government data in real time using the distributed message queue, and guarantee synchronization of the government data using the distributed coordination mechanism of ZooKeeper;
an encryption and authentication module, configured to encrypt the government data in transit using the Transport Layer Security protocol, perform fine-grained access control and identity authentication on a data visitor, and send the corresponding encrypted government data to the data visitor after identity authentication succeeds;
a storage and construction module, configured to store the standardized government data and the synchronized government data in a distributed storage system, build a data query interface using GraphQL, and build a target search engine, so that the data visitor can perform corresponding data queries and data searches based on the data query interface, the distributed storage system, and the target search engine, thereby sharing the government data.

2. The government data sharing system according to claim 1, characterized in that the data schema includes the fields, types, and nesting relationships of the original government data; the target flow includes any one or a combination of extraction, transformation, and loading; and the transformation includes any one or a combination of format conversion, data cleaning, and data integration.

3. The government data sharing system according to claim 1, characterized by further comprising:
a processing and inspection module, configured to use Apache Spark to perform target processing and quality inspection on a target quantity of the original government data; the target processing includes statistical analysis and/or data mining; the quality inspection checks whether the field values of the original government data fall within a preset value range, whether the target fields of the original government data are completely filled in, and whether the original government data from different data sources are consistent.

4. The government data sharing system according to claim 1, characterized by further comprising:
a data processing module, configured to build a resumable processing function based on the Flink checkpoint mechanism, wherein, when a fault occurs during system operation, the function records the processing state of the current government data according to preset checkpoints, so that after the system recovers, processing of the current government data continues based on the checkpoint and the processing state.

5. The government data sharing system according to claim 1, characterized by further comprising:
an access behavior monitoring module, configured to build a normal access behavior model by collecting and analyzing historical access data, and to monitor the current access behavior of the data visitor in real time using the normal access behavior model, so that a corresponding alert is sent when the normal access behavior model detects that the current access behavior is a preset abnormal access behavior.

6. The government data sharing system according to claim 1, characterized by further comprising:
a security detection module, configured to configure automated test scripts and periodically perform security testing on the government data sharing system based on the automated test scripts, so as to keep the government data sharing system secure.

7. The government data sharing system according to any one of claims 1 to 6, characterized by further comprising:
a visual analysis module, configured to build a visual analysis platform based on Apache Superset, so that the data visitor can, according to analysis needs, drag and configure the government data on the visual analysis platform, generate corresponding charts, and analyze the government data visually through the charts.

8. A government data sharing method, characterized by comprising:
defining a data schema for original government data using the data schema definition language of a target data serialization system so as to generate corresponding serialization code, and building a target flow for the original government data based on Apache NiFi, so that the original government data is processed based on the target flow to complete the standardization of the original government data;
connecting each item of original government data to a distributed message queue, capturing changed government data through Debezium, sending the changed government data to the distributed message queue, transmitting the corresponding government data in real time using the distributed message queue, and guaranteeing synchronization of the government data using the distributed coordination mechanism of ZooKeeper;
encrypting the government data in transit using the Transport Layer Security protocol, performing fine-grained access control and identity authentication on a data visitor, and sending the corresponding encrypted government data to the data visitor after identity authentication succeeds;
storing the standardized government data and the synchronized government data in a distributed storage system, building a data query interface using GraphQL, and building a target search engine, so that the data visitor can perform corresponding data queries and data searches based on the data query interface, the distributed storage system, and the target search engine, thereby sharing the government data.

9. An electronic device, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the government data sharing method according to claim 8.

10. A computer-readable storage medium, characterized in that it is used to store a computer program, wherein the computer program, when executed by a processor, implements the government data sharing method according to claim 8.
CN202510068517.2A 2025-01-16 2025-01-16 Government affair data sharing system, method, equipment and storage medium Pending CN119829683A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510068517.2A CN119829683A (en) 2025-01-16 2025-01-16 Government affair data sharing system, method, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510068517.2A CN119829683A (en) 2025-01-16 2025-01-16 Government affair data sharing system, method, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN119829683A true CN119829683A (en) 2025-04-15

Family

ID=95292367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510068517.2A Pending CN119829683A (en) 2025-01-16 2025-01-16 Government affair data sharing system, method, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN119829683A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120123990A (en) * 2025-04-30 2025-06-10 浪潮云信息技术股份公司 Government function efficiency optimization method and system based on multimodal data fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination