CN114780752B - Method, system, device and storage medium for constructing federated knowledge graph - Google Patents

Info

Publication number
CN114780752B
Authority
CN
China
Prior art keywords
target
graph
edge
knowledge graph
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210514582.XA
Other languages
Chinese (zh)
Other versions
CN114780752A (en
Inventor
汪河言
李金龙
刘攀
季江舟
贺瑶函
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Merchants Bank Co Ltd
Original Assignee
China Merchants Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Merchants Bank Co Ltd
Priority to CN202210514582.XA
Publication of CN114780752A
Application granted
Publication of CN114780752B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a federated knowledge graph construction method, system, device and storage medium. The method comprises: acquiring multi-source heterogeneous data in a target domain and generating multi-source data tables based on the multi-source heterogeneous data; performing classification analysis on the multi-source data tables to obtain target graph information, wherein the target graph information comprises different types of graph entities, entity attributes, different types of graph edges and edge attributes; generating different entity files and edge relationship files based on the target graph information; and constructing a target federated knowledge graph based on the different entity files and edge relationship files. The method addresses the technical problem that a knowledge graph is difficult to construct from multi-party data because the data are scattered and lack association.

Description

Federated knowledge graph construction method, system, device and storage medium
Technical Field
The application relates to the technical field of the Internet, and in particular to a federated knowledge graph construction method, system, device and storage medium.
Background
The knowledge graph, known in library and information science as knowledge domain visualization or knowledge domain mapping, is a family of graphs that show the development process and structural relationships of knowledge. Constructing a knowledge graph requires identifying the entities in various data and the association relationships among them. However, in many application scenarios in the financial industry there is no unified knowledge framework, so the data are scattered and lack association, which makes it difficult to construct a knowledge graph from multi-party data.
Disclosure of Invention
The main purpose of the application is to provide a federated knowledge graph construction method, system, device and storage medium, aiming to solve the technical problem in the prior art that a knowledge graph is difficult to construct from multi-party data because the data are scattered and lack association.
In order to achieve the above object, the present application provides a federated knowledge graph construction method, which includes:
acquiring multi-source heterogeneous data in a target domain, and generating multi-source data tables based on the multi-source heterogeneous data;
performing classification analysis on the multi-source data tables to obtain target graph information, wherein the target graph information comprises different types of graph entities, entity attributes, different types of graph edges and edge attributes;
generating different entity files and edge relationship files based on the target graph information;
and constructing a target federated knowledge graph based on the different entity files and edge relationship files.
Optionally, the step of performing classification analysis on the multi-source data tables to obtain the target graph information includes:
in the multi-source data tables, in combination with the business scenario of the current target domain, selecting the target fields whose query frequency meets a preset frequency threshold as the different types of graph entities, and determining the entity attributes corresponding to each graph entity;
based on the graph entities, selecting from the multi-source data tables the target fields that associate graph entities of the same type, and the target fields that associate graph entities of different types, as the different types of graph edges, and determining the edge attributes corresponding to each graph edge, wherein the graph edges represent the association relationships among the graph entities.
Optionally, the step of generating the different entity files and edge relationship files based on the target graph information includes:
generating each entity file according to the target fields corresponding to the different graph entities in the target graph information;
and generating each edge relationship file according to the target fields corresponding to the different graph edges in the target graph information.
Optionally, the step of generating the multi-source data tables based on the multi-source heterogeneous data includes:
processing the multi-source heterogeneous data by a preset natural language processing method to generate the multi-source data tables.
Optionally, after the step of constructing the target federated knowledge graph based on the different entity files and edge relationship files, the method further includes:
constructing a visual web page for the target federated knowledge graph;
acquiring an operation instruction of a target user on the visual web page;
querying the target federated knowledge graph for target return information corresponding to the operation instruction;
and, based on the target return information, performing visual drawing in the visual web page through a preset drawing algorithm.
Optionally, the step of performing visual drawing through a preset drawing algorithm based on the target return information to obtain a target drawing includes:
importing the target return information into a pre-built force-directed graph layout;
and dynamically calling a preset drawing function in the force-directed graph layout, and drawing the target return information based on the drawing function.
Optionally, the step of drawing the target return information based on the drawing function to obtain the target drawing includes:
if the target return information contains graph entities, drawing nodes according to a preset node style based on the graph entity data;
if the target return information contains graph edges, drawing edges according to a preset edge style based on the number of graph edges between the graph entities.
The application also provides a federated knowledge graph construction system, which is a virtual system and comprises:
an acquisition module, configured to acquire multi-source heterogeneous data and generate multi-source data tables based on the multi-source heterogeneous data;
an analysis module, configured to perform classification analysis on the multi-source data tables to obtain target graph information, wherein the target graph information comprises different types of graph entities, entity attributes, different types of graph edges and edge attributes;
a generation module, configured to generate different entity files and edge relationship files based on the target graph information;
and a construction module, configured to construct a target federated knowledge graph based on the different entity files and edge relationship files.
Optionally, the analysis module is further configured to:
in the multi-source data tables, in combination with the business scenario of the current target domain, select the target fields whose query frequency meets a preset frequency threshold as the different types of graph entities, and determine the entity attributes corresponding to each graph entity;
based on the graph entities, select from the multi-source data tables the target fields that associate graph entities of the same type, and the target fields that associate graph entities of different types, as the different types of graph edges, and determine the edge attributes corresponding to each graph edge, wherein the graph edges represent the association relationships among the graph entities.
Optionally, the generation module is further configured to:
generate each entity file according to the target fields corresponding to the different graph entities in the target graph information;
and generate each edge relationship file according to the target fields corresponding to the different graph edges in the target graph information.
Optionally, the acquisition module is further configured to:
process the multi-source heterogeneous data by a preset natural language processing method to generate the multi-source data tables.
Optionally, the federated knowledge graph construction system is further configured to:
construct a visual web page for the target federated knowledge graph;
acquire an operation instruction of a target user on the visual web page;
query the target federated knowledge graph for target return information corresponding to the operation instruction;
and, based on the target return information, perform visual drawing in the visual web page through a preset drawing algorithm.
Optionally, the federated knowledge graph construction system is further configured to:
import the target return information into a pre-built force-directed graph layout;
and dynamically call a preset drawing function in the force-directed graph layout, and draw the target return information based on the drawing function.
Optionally, the federated knowledge graph construction system is further configured to:
if the target return information contains graph entities, draw nodes according to a preset node style based on the graph entity data;
if the target return information contains graph edges, draw edges according to a preset edge style based on the number of graph edges between the graph entities.
The application also provides a federated knowledge graph construction device, which is a physical device and comprises a memory, a processor and a federated knowledge graph construction program stored in the memory, wherein the federated knowledge graph construction program, when executed by the processor, implements the steps of the federated knowledge graph construction method described above.
The application also provides a storage medium, which is a computer-readable storage medium storing a federated knowledge graph construction program, wherein the federated knowledge graph construction program, when executed by a processor, implements the steps of the federated knowledge graph construction method described above.
The application provides a federated knowledge graph construction method, system, device and storage medium. Multi-source heterogeneous data in the target domain are first acquired, and multi-source data tables are generated from them. The multi-source data tables are then classified and analyzed to obtain target graph information, which comprises different types of graph entities, entity attributes, different types of graph edges and edge attributes. Different entity files and edge relationship files are generated based on the target graph information, and a target federated knowledge graph is constructed based on those files. By classifying and analyzing the data tables corresponding to the multi-source heterogeneous data, selecting the graph entities, entity attributes, graph edges and edge attributes needed to build the graph, and generating the corresponding entity files and edge relationship files, the application constructs a federated knowledge graph that brings the various data together, so that business personnel can better expand business based on the association information in the federated knowledge graph.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a schematic flow chart of a first embodiment of the federated knowledge graph construction method of the present application;
FIG. 2 is a schematic flow chart of a second embodiment of the federated knowledge graph construction method of the present application;
FIG. 3 is a schematic flow chart of a third embodiment of the federated knowledge graph construction method of the present application;
FIG. 4 is a system framework diagram of the federated knowledge graph construction method of the present application;
FIG. 5 is a schematic structural diagram of the federated knowledge graph construction device in the hardware operating environment according to an embodiment of the present application;
FIG. 6 is a schematic diagram of the functional modules of the federated knowledge graph construction device of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
An embodiment of the present application provides a federated knowledge graph construction method. In a first embodiment of the method, referring to FIG. 1, the federated knowledge graph construction method includes:
Step S10, acquiring multi-source heterogeneous data in a target domain, and generating multi-source data tables based on the multi-source heterogeneous data;
In this embodiment, the target domain may be the medical, financial, aerospace or another domain; the present application is described using the financial domain as the target domain. The multi-source heterogeneous data are corporate-banking, retail and other data aggregated through a unified knowledge framework, where the corporate-banking data include information on the relationships between companies and individuals.
Specifically, the multi-source heterogeneous data needed to build the graph, such as corporate-banking and retail data, are acquired; these data are then parsed according to the unified knowledge framework, in combination with an NLP (natural language processing) method, to generate the multi-source data tables.
Step S20, performing classification analysis on the multi-source data tables to obtain target graph information, wherein the target graph information comprises different types of graph entities, entity attributes, different types of graph edges and edge attributes;
In this embodiment, it should be noted that the target graph information is the knowledge graph schema. It constrains the format of the data to be added to the knowledge graph and is equivalent to a data model for the domain.
Specifically, according to the business scenario of the current target domain, field information with a high query frequency in that scenario is selected and subjected to classification analysis. From the classified field information, the fields that satisfy the conditions for constructing an entity are selected as graph entities. Based on the graph entities, the field information in the multi-source data tables that describes a graph entity is selected as its entity attributes. In addition, the field information in the multi-source data tables that associates graph entities of the same type, and the field information that associates graph entities of different types, is selected as graph edges, and the field information that describes a graph edge is selected as its edge attributes. In this way, rich association relationships are mined between company and company (transaction, investment and similar relationships), company and individual (legal representative, shareholder, and director/supervisor/senior-executive relationships) and individual and individual (child, parent, business and similar relationships).
For example, in a financial loan scenario, suppose the data record that A is the legal representative of company B, the address of company B, the establishment date of company B, and the investment of company B in company C. Company B and company C are set as graph entities, and the establishment date and address are entity attributes of company B. The relationship between company B and company C is an investment relationship, which is taken as a graph edge, and the investment time, investment amount and the like are taken as edge attributes of that graph edge.
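As an illustration only, the loan-scenario example above could be modelled with two small record types. This is a minimal sketch, not the data model of the application: the class names `GraphEntity` and `GraphEdge`, the edge type string `INVESTS_IN`, and all attribute values are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class GraphEntity:
    """A graph entity (node) with a type and descriptive attributes."""
    entity_type: str                      # e.g. "Company" or "Person"
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class GraphEdge:
    """A graph edge: an association between two entities, with attributes."""
    edge_type: str                        # e.g. "INVESTS_IN" (hypothetical name)
    source: str                           # name of the source entity
    target: str                           # name of the target entity
    attributes: dict = field(default_factory=dict)


# Hypothetical values standing in for the fields named in the example.
company_b = GraphEntity(
    "Company", "Company B",
    {"address": "(placeholder address)", "established": "2010-01-01"},
)
company_c = GraphEntity("Company", "Company C")
investment = GraphEdge(
    "INVESTS_IN", company_b.name, company_c.name,
    {"invested_on": "2020-06-30", "amount": 1_000_000},
)
```

The establishment date and address sit on the entity, while the investment time and amount sit on the edge, mirroring the split between entity attributes and edge attributes described above.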
Step S30, generating different entity files and edge relationship files based on the target graph information;
In this embodiment, specifically, the fields corresponding to the different entities are selected to generate entity files for individuals, companies, groups and the like. At the same time, the fields corresponding to the different edges in the schema are selected to generate edge relationship files for company-company (transaction, group, private investment and similar relationships), company-individual (legal representative, shareholder, and director/supervisor/senior-executive relationships) and individual-individual (child, parent, business and similar relationships).
Step S30, generating different entity files and edge relationship files based on the target graph information, comprises the following steps:
Step S31, generating each entity file according to the target fields corresponding to the different graph entities in the target graph information;
Step S32, generating each edge relationship file according to the target fields corresponding to the different graph edges in the target graph information.
In this embodiment, specifically, the target fields corresponding to the different graph entities are selected according to the target graph information to generate separate entity files for individuals, companies, groups and the like, and the target fields corresponding to the different graph edges in the target graph information are selected to generate a plurality of edge relationship files.
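The following sketch shows one way steps S31 and S32 might be realized. It assumes, purely for illustration, that the files are CSV and use the header conventions of Neo4j's bulk importer (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`); the function names, the `start`/`end` row keys and the sample rows are hypothetical and do not come from the application.

```python
import csv
import io


def write_entity_file(rows, id_field, label, attr_fields):
    """Render one entity file (one entity type) as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([f"{id_field}:ID", *attr_fields, ":LABEL"])
    for row in rows:
        writer.writerow(
            [row[id_field], *(row.get(f, "") for f in attr_fields), label]
        )
    return buf.getvalue()


def write_edge_file(rows, edge_type, attr_fields):
    """Render one edge relationship file (one edge type) as CSV text.

    Each row is assumed to carry "start" and "end" entity identifiers.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([":START_ID", ":END_ID", *attr_fields, ":TYPE"])
    for row in rows:
        writer.writerow(
            [row["start"], row["end"],
             *(row.get(f, "") for f in attr_fields), edge_type]
        )
    return buf.getvalue()


company_csv = write_entity_file(
    [{"company_id": "c1", "name": "Company B", "address": "X"}],
    id_field="company_id", label="Company", attr_fields=["name", "address"],
)
invest_csv = write_edge_file(
    [{"start": "c1", "end": "c2", "amount": 100}],
    edge_type="INVESTS_IN", attr_fields=["amount"],
)
```

Producing one file per entity type and one per edge type keeps each file's column set fixed, which matches the per-type files (individuals, companies, groups) described above.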
Step S40, constructing a target federated knowledge graph based on the different entity files and edge relationship files.
In this embodiment, it should be noted that a knowledge graph is essentially a semantic network that represents the relationships between entities. It is composed of individual pieces of knowledge, each represented as an SPO (Subject-Predicate-Object) triple, for example (entity 1, relationship, entity 2) or (entity, attribute, attribute value). The nodes in the knowledge graph represent entities, and the edges represent the association relationships between entities. Nodes and edges may also carry labels, where a label identifies the category to which the node or edge belongs.
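To make the triple representation concrete, the sketch below flattens a small set of entities and edges into SPO triples. The dictionary layout and the function name `to_triples` are assumptions made for this example only.

```python
def to_triples(entities, edges):
    """Flatten entities and edges into (subject, predicate, object) triples."""
    triples = []
    # (entity, attribute, attribute value) triples
    for entity in entities:
        for attribute, value in entity["attributes"].items():
            triples.append((entity["name"], attribute, value))
    # (entity 1, relationship, entity 2) triples
    for edge in edges:
        triples.append((edge["source"], edge["type"], edge["target"]))
    return triples


triples = to_triples(
    entities=[{"name": "Company B", "attributes": {"address": "X"}}],
    edges=[{"source": "Company B", "type": "invests_in", "target": "Company C"}],
)
```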
Specifically, based on the different entity files and edge relationship files, the nodes, node attributes and node labels required to build the graph are determined, together with the association relationships, relationship attributes and relationship labels between the nodes. On that basis, a target federated knowledge graph containing the associations between company and company, company and individual, and individual and individual is constructed. For example, the machine on which the neo4j database is deployed is reached through the Linux ssh command, and the entity files and edge relationship files are copied to that machine through the Linux scp command. The neo4j service is then stopped with the neo4j stop command, the entity files and edge relationship files are imported with the neo4j-admin import command, and after the import completes neo4j is restarted with the neo4j start command.
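The command sequence described above can be sketched as a list of argument vectors that a deployment script might pass to `subprocess.run`. This is an assumption-laden illustration: the host name, remote directory and file names are hypothetical, and the `--nodes=Label=file` and `--relationships=TYPE=file` forms follow the Neo4j 4.x `neo4j-admin import` syntax; other Neo4j versions use a different sub-command, so the exact invocation should be checked against the installed version.

```python
def build_import_commands(host, remote_dir, entity_files, edge_files):
    """Build the copy / stop / import / start command sequence.

    entity_files maps a node label to a file name; edge_files maps a
    relationship type to a file name. Nothing is executed here.
    """
    files = [*entity_files.values(), *edge_files.values()]
    node_flags = [
        f"--nodes={label}={remote_dir}/{name}"
        for label, name in entity_files.items()
    ]
    rel_flags = [
        f"--relationships={rel_type}={remote_dir}/{name}"
        for rel_type, name in edge_files.items()
    ]
    import_cmd = " ".join(["neo4j-admin", "import", *node_flags, *rel_flags])
    return [
        ["scp", *files, f"{host}:{remote_dir}/"],  # copy files to the server
        ["ssh", host, "neo4j stop"],               # stop the service
        ["ssh", host, import_cmd],                 # offline bulk import
        ["ssh", host, "neo4j start"],              # restart the service
    ]


commands = build_import_commands(
    host="user@graph-host",                        # hypothetical host
    remote_dir="/data/import",                     # hypothetical directory
    entity_files={"Company": "company.csv"},
    edge_files={"INVESTS_IN": "company_invests_company.csv"},
)
```

The service is stopped before the import because the import described here is an offline bulk load into the database files.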
The embodiment of the application provides a federated knowledge graph construction method. Multi-source heterogeneous data in the target domain are first acquired, and multi-source data tables are generated from them. The multi-source data tables are then classified and analyzed to obtain target graph information, which comprises different types of graph entities, entity attributes, different types of graph edges and edge attributes. Different entity files and edge relationship files are generated based on the target graph information, and a target federated knowledge graph is constructed based on those files. By classifying and analyzing the data tables corresponding to the multi-source heterogeneous data, selecting the graph entities, entity attributes, graph edges and edge attributes needed to build the graph, and generating the corresponding entity files and edge relationship files, a target federated knowledge graph containing the various data is constructed, so that business personnel can better expand business based on the association information in the federated knowledge graph.
Further, referring to FIG. 2, step S20, performing classification analysis on the multi-source data tables to obtain target graph information, specifically includes:
Step S21, in the multi-source data tables, in combination with the business scenario of the current target domain, selecting the target fields whose query frequency meets a preset frequency threshold as the different types of graph entities, and determining the entity attributes corresponding to each graph entity;
Step S22, based on the graph entities, selecting from the multi-source data tables the target fields that associate graph entities of the same type, and the target fields that associate graph entities of different types, as the different types of graph edges, and determining the edge attributes corresponding to each graph edge, wherein the graph edges represent the association relationships among the graph entities.
In this embodiment, specifically, in combination with the business scenario of the current target domain, the target fields in the multi-source data tables whose query frequency in that scenario exceeds the preset frequency threshold are selected. These target fields are subjected to classification analysis to obtain classified field information, and the field information that satisfies the entity construction conditions is selected from it as the graph entities. For each graph entity, a preset number of fields describing that entity are selected as its entity attributes. After the graph entities have been determined, the multi-source data tables are analyzed as a whole, and the field information that associates graph entities of the same type, and the field information that associates graph entities of different types, is selected as the graph edges; that is, a graph edge represents an association relationship between entities. A preset number of fields describing each graph edge are then selected as its edge attributes.
With this scheme, the various data are aggregated, the entities are determined in combination with the business scenario of the current target domain, and the associations among the entities are analyzed, so that the federated knowledge graph can be constructed from the entities and the association relationships between them.
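The threshold-based selection in step S21 can be illustrated with a few lines of code. This is a sketch under stated assumptions: the per-field query counts, the field names and the threshold value are hypothetical, and the comparison uses "greater than or equal to", which is one reading of "meets a preset frequency threshold".

```python
def select_target_fields(query_counts, threshold):
    """Return the fields whose query count meets the threshold, sorted by name."""
    return sorted(
        field_name
        for field_name, count in query_counts.items()
        if count >= threshold
    )


# Hypothetical query statistics for one business scenario.
query_counts = {"company_name": 120, "legal_representative": 45, "remark": 3}
target_fields = select_target_fields(query_counts, threshold=40)
```

The subsequent classification of the selected fields into entities, attributes and edges depends on the business scenario and is not modelled here.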
Further, referring to FIG. 3, based on the first embodiment of the present application, in another embodiment of the present application, after the step of constructing the target federated knowledge graph based on the different entity files and edge relationship files, the method further includes:
Step A10, constructing a visual web page for the target federated knowledge graph;
In this embodiment, before the visual web page is constructed, the developer encapsulates the query statements for the target federated knowledge graph as function interfaces, so that when the user clicks on the front end (the visual web page), the interface can be called to obtain data from the target federated knowledge graph. For example, a node query takes the name of a specific node passed in from outside as the center node, queries all nodes at distance 1 from that center node, and returns the result classified by edge relationship type. A graph algorithm query takes the names of several nodes passed in from outside, generates query statements that execute graph algorithms such as community detection, shortest path and node similarity, and returns the execution result.
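The node query described above (all nodes at distance 1 from a center node, grouped by edge relationship type) can be sketched over an in-memory edge list. A deployed system would issue the equivalent query against the graph database; the function name, the triple layout and the sample data below are assumptions for illustration.

```python
from collections import defaultdict


def neighbors_by_edge_type(edges, center):
    """Group the distance-1 neighbors of `center` by edge relationship type.

    `edges` is an iterable of (source, edge_type, target) triples; the edge
    direction is ignored when collecting neighbors.
    """
    grouped = defaultdict(list)
    for source, edge_type, target in edges:
        if source == center:
            grouped[edge_type].append(target)
        elif target == center:
            grouped[edge_type].append(source)
    return dict(grouped)


edges = [
    ("Company B", "invests_in", "Company C"),
    ("A", "legal_representative_of", "Company B"),
    ("Company D", "invests_in", "Company B"),
    ("Company C", "trades_with", "Company E"),
]
result = neighbors_by_edge_type(edges, "Company B")
```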
Step A20, acquiring an operation instruction of a target user on the visual web page;
In this embodiment, the operation instruction includes a single click, a double click, a right click and the like.
Step A30, querying the target federated knowledge graph for target return information corresponding to the operation instruction;
In this embodiment, it should be noted that different operation instructions return different data. For example, when the operation instruction of the target user is a single click, the coordinate position of the click is detected and it is determined whether that position falls on a node or an edge, so that the field information of that node or edge is obtained from the target federated knowledge graph. When the operation instruction is a double click, the coordinate position of the double click is detected and it is determined whether that position falls on a node; if it does, the node is expanded, that is, all nodes at distance 1 from that node are obtained from the target federated knowledge graph, and the result is returned classified by the relationship type of the edges between the nodes.
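Deciding whether a click position falls on a node or an edge is a hit test. The sketch below assumes circular nodes and straight edges, with a hypothetical node radius and edge tolerance in pixels; the application does not specify these details.

```python
import math


def distance_to_segment(px, py, a, b):
    """Shortest distance from point (px, py) to the segment a-b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of the point onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def hit_test(x, y, nodes, edges, node_radius=10.0, edge_tolerance=3.0):
    """Return ("node", name), ("edge", (a, b)) or None for a click at (x, y).

    Nodes are checked first so that a click on a node is not reported as a
    click on an edge that ends at that node.
    """
    for name, (nx, ny) in nodes.items():
        if math.hypot(x - nx, y - ny) <= node_radius:
            return ("node", name)
    for a, b in edges:
        if distance_to_segment(x, y, nodes[a], nodes[b]) <= edge_tolerance:
            return ("edge", (a, b))
    return None


nodes = {"A": (0.0, 0.0), "B": (100.0, 0.0)}
edges = [("A", "B")]
```

With these sample coordinates, a click at (2, 2) lands on node A, a click at (50, 1) lands on the edge between A and B, and a click at (50, 50) hits nothing.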
And step A40, performing visual drawing in the visual webpage through a preset drawing algorithm based on the target return information.
In this embodiment, the preset drawing algorithm includes a force-directed graph layout algorithm, where the force-directed graph layout algorithm uses nodes as charges, calculates a combined force of attractive force and repulsive force by calculation of each node, and moves the position of the node based on the combined force.
Step A40, performing visual drawing through a preset drawing algorithm based on the target return information to obtain a target drawing, specifically includes:
Step A41, importing the target return information into a pre-constructed force-directed graph layout;
Step A42, dynamically calling a drawing function preset in the force-directed graph layout, and drawing the target return information based on the drawing function.
In this embodiment, specifically, a force-directed graph layout is created and a drawing function is set in it; the target return information is then imported into the layout and drawn visually based on the drawing function, while a preset mechanical simulation model dynamically adjusts the coordinate positions of the nodes.
Step A42, drawing the target return information based on the drawing function to obtain the target drawing, specifically includes:
Step A421, if the target return information contains graph entities, drawing nodes in a preset node style based on the graph entity data;
Step A422, if the target return information contains graph edges, drawing edges in a preset edge style based on the number of graph edges between the graph entities.
In this embodiment, specifically, when the target return information contains graph entities, the color, highlighting, and other styles used to draw each node (graph entity) are controlled according to the type of the graph entity; for example, the arc function is used to drive the 2D Canvas state machine. If the graph entities include persons and companies, the graph entities corresponding to persons and those corresponding to companies can be drawn in different colors.
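The type-dependent node styling can be sketched as a lookup from entity type to style, producing one arc-style drawing command per node. The style table, the colors, and the command dictionary layout are hypothetical; in a browser the commands would be replayed against a 2D Canvas context rather than returned as data.

```python
import math

# Hypothetical style table keyed by graph entity type.
NODE_STYLES = {
    "person": {"fill": "#4e79a7", "radius": 12},
    "company": {"fill": "#f28e2b", "radius": 16},
}
DEFAULT_STYLE = {"fill": "#999999", "radius": 10}


def node_draw_commands(entities, positions):
    """Translate graph entities into full-circle arc commands with a per-type style."""
    commands = []
    for entity in entities:
        style = NODE_STYLES.get(entity["type"], DEFAULT_STYLE)
        x, y = positions[entity["id"]]
        commands.append({
            "op": "arc",
            "x": x,
            "y": y,
            "radius": style["radius"],
            "start": 0.0,
            "end": 2 * math.pi,   # a full circle
            "fill": style["fill"],
        })
    return commands
```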
Further, when the target return information contains graph edges, the edges are drawn in different ways depending on the number of edges between the nodes. When there is a single edge between two nodes, the connecting line is drawn according to the preset edge style field; for example, the lineTo function is used to drive the 2D Canvas state machine. When there are multiple edges between two nodes, the first edge is drawn with the single-edge method and the remaining edges are drawn with a preset connecting-line drawing mode, for example as quadratic Bezier curves using the quadraticCurveTo method. For example, if A and B have both a friend relationship and a trade relationship, the edge corresponding to the friend relationship can be drawn as a straight line and the edge corresponding to the trade relationship can be drawn as a curve.
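The multi-edge rule can be sketched as a function that emits a straight-line command for the first edge and quadratic Bezier commands for the rest. Placing the control points on alternating sides of the segment midpoint, and the `bend` spacing, are assumptions made for this illustration; the application only states that the remaining edges use a preset curve drawing mode.

```python
import math


def edge_draw_commands(p1, p2, relations, bend=30.0):
    """First edge: straight line. Remaining edges: quadratic Bezier curves.

    p1, p2: (x, y) positions of the two graph entities
    relations: relation labels of the edges between them, in drawing order
    """
    (x1, y1), (x2, y2) = p1, p2
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy) or 1e-6
    nx, ny = -dy / length, dx / length          # unit normal to the segment
    mx, my = (x1 + x2) / 2, (y1 + y2) / 2       # segment midpoint
    commands = []
    for index, relation in enumerate(relations):
        if index == 0:
            commands.append({"op": "lineTo", "relation": relation, "to": (x2, y2)})
            continue
        # Offsets alternate sides and grow: +1, -1, +2, -2, ... times `bend`.
        side = 1 if index % 2 == 1 else -1
        offset = bend * ((index + 1) // 2) * side
        commands.append({
            "op": "quadraticCurveTo",
            "relation": relation,
            "control": (mx + nx * offset, my + ny * offset),
            "to": (x2, y2),
        })
    return commands
```

With the friend and trade relationships from the example, the friend edge becomes the `lineTo` command and the trade edge becomes a `quadraticCurveTo` command whose control point lies off the straight line, so the two edges do not overlap.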
In this technical solution, a visual web page of the target federated knowledge graph is constructed, the operation instruction of a target user on the visual web page is obtained, the target return information corresponding to the operation instruction is queried in the target federated knowledge graph, and visual drawing is performed in the visual web page through a preset drawing algorithm based on the target return information. The knowledge graph is thereby visualized, which lowers the threshold for business personnel to use the technology, so that business personnel with a limited technical background can use the federated knowledge graph conveniently and quickly for business expansion.
Further, referring to fig. 4, fig. 4 is a system framework diagram of the federated knowledge graph construction method of the present application. Specifically, data from different data sources (multi-source heterogeneous data) are collected and multiple data tables (multi-source data tables) are generated. Based on each multi-source data table, the association relationships between graph entities are extracted using NLP technology, and multiple entity files and edge relationship files are generated. Based on these entity files and edge relationship files, the target federated knowledge graph is constructed and stored in a neo4j graph database, and interface functions corresponding to the query statements are designed. When the user performs a click operation on the front-end page, the corresponding interface function is triggered to obtain the target return data from the target federated knowledge graph, and the target return data is then drawn visually.
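The step from data tables to entity files and edge relationship files can be sketched as follows. The CSV layout, the column names (`person`, `company`, `relation`), and the function name `build_files` are assumptions for illustration; the application does not specify the file format, and a real import into a graph database would follow that database's own import conventions.

```python
import csv
import io


def build_files(rows):
    """Split rows of a data table into entity CSV text and edge-relationship CSV text.

    rows: list of dicts with hypothetical keys "person", "company", "relation".
    """
    entities = {}   # entity name -> entity type (deduplicated)
    edges = []      # (source, relation, target)
    for row in rows:
        entities[row["person"]] = "person"
        entities[row["company"]] = "company"
        edges.append((row["person"], row["relation"], row["company"]))

    entity_buf = io.StringIO()
    writer = csv.writer(entity_buf, lineterminator="\n")
    writer.writerow(["name", "type"])
    writer.writerows(sorted(entities.items()))

    edge_buf = io.StringIO()
    writer = csv.writer(edge_buf, lineterminator="\n")
    writer.writerow(["source", "relation", "target"])
    writer.writerows(edges)

    return entity_buf.getvalue(), edge_buf.getvalue()


sample_rows = [
    {"person": "Alice", "company": "Acme", "relation": "WORKS_AT"},
    {"person": "Bob", "company": "Acme", "relation": "INVESTS_IN"},
]
entity_csv, edge_csv = build_files(sample_rows)
```

Keeping entities and edges in separate files means each entity is written once even when it appears in many rows, while every relationship row is preserved.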
Referring to fig. 5, fig. 5 is a schematic structural diagram of the federated knowledge graph construction device in the hardware operating environment according to an embodiment of the present application.
As shown in fig. 5, the federated knowledge graph construction device may include a processor 1001 (such as a CPU), a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. Optionally, the memory 1005 may also be a storage device separate from the processor 1001.
Optionally, the federated knowledge graph construction device may further include a rectangular user interface, a network interface, a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like. The rectangular user interface may include a display screen (Display) and an input sub-module such as a keyboard (Keyboard), and optionally may also include a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a WiFi interface).
Those skilled in the art will appreciate that the structure shown in fig. 5 does not limit the federated knowledge graph construction device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components.
As shown in fig. 5, the memory 1005, which is a computer storage medium, may include an operating system, a network communication module, and a federated knowledge graph construction program. The operating system is a program that manages and controls the hardware and software resources of the federated knowledge graph construction device and supports the operation of the federated knowledge graph construction program and other software and/or programs. The network communication module is used to implement communication between the components within the memory 1005 and with other hardware and software in the federated knowledge graph construction system.
In the federated knowledge graph construction device shown in fig. 5, the processor 1001 is configured to execute the federated knowledge graph construction program stored in the memory 1005 to implement the steps of any of the federated knowledge graph construction methods described above.
The specific implementation of the federated knowledge graph construction device is basically the same as the embodiments of the federated knowledge graph construction method described above and is not repeated here.
In addition, referring to fig. 6, fig. 6 is a schematic diagram of the functional modules of the federated knowledge graph construction device of the present application. The present application further provides a federated knowledge graph construction system, which includes:
The acquisition module, used for acquiring multi-source heterogeneous data and generating multi-source data tables based on the multi-source heterogeneous data;
The analysis module, used for classifying and analyzing each multi-source data table to obtain target graph information, wherein the target graph information includes different types of graph entities, entity attributes, different types of graph edges, and edge attributes;
The generation module, used for generating different entity files and edge relationship files based on the target graph information;
The construction module, used for constructing a target federated knowledge graph based on the different entity files and edge relationship files.
Optionally, the analysis module is further configured to:
In the multi-source data tables, in combination with the business scenario of the current target domain, select the target fields whose query frequency meets a preset frequency threshold as the different types of graph entities, and determine the entity attributes corresponding to each graph entity;
Based on each graph entity, select from the multi-source data tables the target fields that associate graph entities of the same type and the target fields that associate graph entities of different types as the different types of graph edges, and determine the edge attributes corresponding to each graph edge, wherein the graph edges represent the association relationships between the graph entities.
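The entity selection rule can be sketched as a simple filter over per-field query counts. Interpreting "meets the preset frequency threshold" as "greater than or equal to the threshold" is an assumption, as are the field names and counts in the example.

```python
def select_entity_fields(query_counts, threshold):
    """Return the fields whose query frequency meets the threshold, in sorted order.

    query_counts: {field name: number of times the field was queried}
    """
    return sorted(
        field for field, count in query_counts.items() if count >= threshold
    )


# Hypothetical query statistics for fields of the multi-source data tables.
counts = {"customer_id": 120, "company_name": 80, "remark": 3}
entity_fields = select_entity_fields(counts, threshold=50)
```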
Optionally, the generation module is further configured to:
Generate each entity file according to the target fields corresponding to the different graph entities in the target graph information;
Generate each edge relationship file according to the target fields corresponding to the different graph edges in the target graph information.
Optionally, the acquisition module is further configured to:
Process the multi-source heterogeneous data by a preset natural language processing method to generate each multi-source data table.
Optionally, the federated knowledge graph construction system is further configured to:
Construct a visual web page of the target federated knowledge graph;
Acquire an operation instruction of a target user on the visual web page;
Query the target federated knowledge graph for target return information corresponding to the operation instruction;
Perform visual drawing in the visual web page through a preset drawing algorithm based on the target return information.
Optionally, the federated knowledge graph construction system is further configured to:
Import the target return information into a pre-constructed force-directed graph layout;
Dynamically call a drawing function preset in the force-directed graph layout, and draw the target return information based on the drawing function.
Optionally, the federated knowledge graph construction system is further configured to:
If the target return information contains graph entities, draw nodes in a preset node style based on the graph entity data;
If the target return information contains graph edges, draw edges in a preset edge style based on the number of graph edges between the graph entities.
The specific implementation of the federated knowledge graph construction system is basically the same as the embodiments of the federated knowledge graph construction method described above and is not repeated here.
An embodiment of the present application provides a storage medium, where the storage medium is a computer-readable storage medium that stores one or more programs, and the one or more programs are executable by one or more processors to implement the steps of the federated knowledge graph construction method according to any one of the foregoing embodiments.
The specific implementation of the computer-readable storage medium of the present application is basically the same as the embodiments of the federated knowledge graph construction method described above and is not repeated here.
The foregoing describes only preferred embodiments of the present application and is not intended to limit the scope of the application; any equivalent structure or equivalent process based on the content disclosed herein, whether applied directly or indirectly, likewise falls within the scope of the application.

Claims (7)

1. A method for constructing a federated knowledge graph, characterized in that the method comprises:
acquiring multi-source heterogeneous data in a target domain, and generating multi-source data tables based on the multi-source heterogeneous data;
classifying and analyzing each of the multi-source data tables to obtain target graph information, wherein the target graph information includes different types of graph entities, entity attributes, different types of graph edges, and edge attributes;
generating different entity files and edge relationship files based on the target graph information;
constructing a target federated knowledge graph based on the different entity files and edge relationship files;
wherein, after the step of constructing a target federated knowledge graph based on the different entity files and edge relationship files, the method further comprises:
constructing a visual web page of the target federated knowledge graph;
obtaining an operation instruction of a target user on the visual web page;
querying the target federated knowledge graph for target return information corresponding to the operation instruction;
performing visual drawing in the visual web page through a preset drawing algorithm based on the target return information;
wherein the step of querying the target federated knowledge graph for target return information corresponding to the operation instruction comprises:
when the target user issues a single-click instruction, detecting the click coordinate position and determining whether the click coordinate position is within a node or on an edge; if it is within a node or on an edge, obtaining the field information corresponding to the node or edge from the target federated knowledge graph;
when the target user issues a double-click instruction, detecting the double-click coordinate position and determining whether the double-click coordinate position is within a node; if the double-click coordinate position is within a node, expanding the node and returning the result classified according to the relationship type of the edges between the nodes;
wherein the step of performing visual drawing through a preset drawing algorithm based on the target return information to obtain a target drawing comprises:
importing the target return information into a pre-constructed force-directed graph layout;
dynamically calling a drawing function preset in the force-directed graph layout, and drawing the target return information based on the drawing function;
wherein the step of drawing the target return information based on the drawing function comprises:
if the target return information contains graph entities, drawing nodes in a preset node style based on the graph entity data;
if the target return information contains graph edges, drawing edges in a preset edge style based on the number of graph edges between graph entities;
wherein the step of drawing edges in a preset edge style based on the number of graph edges between graph entities comprises:
when there is a single edge between graph entities, drawing the connection line according to the preset edge style field;
when there are multiple edges between graph entities, drawing the first edge as a connection line according to the preset edge style field, and drawing the remaining edges with a preset connection line drawing method.
2. The method for constructing a federated knowledge graph according to claim 1, wherein the step of classifying and analyzing each of the multi-source data tables to obtain target graph information comprises:
in the multi-source data tables, in combination with the business scenario of the current target domain, selecting the target fields whose query frequency meets a preset frequency threshold as the different types of graph entities, and determining the entity attributes corresponding to each of the graph entities;
based on each of the graph entities, selecting from the multi-source data tables the target fields associating graph entities of the same type and the target fields associating graph entities of different types as the different types of graph edges, and determining the edge attributes corresponding to each of the graph edges, wherein the graph edges represent the association relationships between the graph entities.
3. The method for constructing a federated knowledge graph according to claim 1, wherein the step of generating different entity files and edge relationship files based on the target graph information comprises:
generating each entity file according to the target fields corresponding to different graph entities in the target graph information;
and generating each edge relationship file according to the target fields corresponding to different graph edges in the target graph information.
4. The method for constructing a federated knowledge graph according to claim 1, wherein the step of generating each multi-source data table based on the multi-source heterogeneous data comprises:
processing the multi-source heterogeneous data by a preset natural language processing method to generate each of the multi-source data tables.
5. A federated knowledge graph construction system, characterized in that the federated knowledge graph construction system comprises:
an acquisition module, configured to acquire multi-source heterogeneous data and generate multi-source data tables based on the multi-source heterogeneous data;
an analysis module, configured to classify and analyze each of the multi-source data tables to obtain target graph information, wherein the target graph information includes different types of graph entities, entity attributes, different types of graph edges, and edge attributes;
a generation module, configured to generate different entity files and edge relationship files based on the target graph information;
a construction module, configured to construct a target federated knowledge graph based on the different entity files and edge relationship files;
wherein, after constructing the target federated knowledge graph based on the different entity files and edge relationship files, the system is further configured for:
constructing a visual web page of the target federated knowledge graph;
obtaining an operation instruction of a target user on the visual web page;
querying the target federated knowledge graph for target return information corresponding to the operation instruction;
performing visual drawing in the visual web page through a preset drawing algorithm based on the target return information;
wherein querying the target federated knowledge graph for target return information corresponding to the operation instruction comprises:
when the target user issues a single-click instruction, detecting the click coordinate position and determining whether the click coordinate position is within a node or on an edge; if it is within a node or on an edge, obtaining the field information corresponding to the node or edge from the target federated knowledge graph;
when the target user issues a double-click instruction, detecting the double-click coordinate position and determining whether the double-click coordinate position is within a node; if the double-click coordinate position is within a node, expanding the node and returning the result classified according to the relationship type of the edges between the nodes;
wherein performing visual drawing through a preset drawing algorithm based on the target return information to obtain a target drawing comprises:
importing the target return information into a pre-constructed force-directed graph layout;
dynamically calling a drawing function preset in the force-directed graph layout, and drawing the target return information based on the drawing function;
wherein drawing the target return information based on the drawing function comprises:
if the target return information contains graph entities, drawing nodes in a preset node style based on the graph entity data;
if the target return information contains graph edges, drawing edges in a preset edge style based on the number of graph edges between graph entities;
wherein drawing edges in a preset edge style based on the number of graph edges between graph entities comprises:
when there is a single edge between graph entities, drawing the connection line according to the preset edge style field;
when there are multiple edges between graph entities, drawing the first edge as a connection line according to the preset edge style field, and drawing the remaining edges with a preset connection line drawing method.
6. A federated knowledge graph construction device, characterized in that the federated knowledge graph construction device comprises: a memory, a processor, and a federated knowledge graph construction program stored in the memory,
wherein the federated knowledge graph construction program, when executed by the processor, implements the steps of the federated knowledge graph construction method according to any one of claims 1 to 4.
7. A storage medium, which is a computer-readable storage medium, characterized in that a federated knowledge graph construction program is stored on the computer-readable storage medium, and the federated knowledge graph construction program, when executed by a processor, implements the steps of the federated knowledge graph construction method according to any one of claims 1 to 4.
CN202210514582.XA 2022-05-12 2022-05-12 Method, system, device and storage medium for constructing federated knowledge graph Active CN114780752B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210514582.XA CN114780752B (en) 2022-05-12 2022-05-12 Method, system, device and storage medium for constructing federated knowledge graph

Publications (2)

Publication Number Publication Date
CN114780752A CN114780752A (en) 2022-07-22
CN114780752B true CN114780752B (en) 2025-08-22

Family

ID=82437502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210514582.XA Active CN114780752B (en) 2022-05-12 2022-05-12 Method, system, device and storage medium for constructing federated knowledge graph

Country Status (1)

Country Link
CN (1) CN114780752B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant