CN117493582A - Model result output method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN117493582A CN117493582A CN202311857269.7A CN202311857269A CN117493582A CN 117493582 A CN117493582 A CN 117493582A CN 202311857269 A CN202311857269 A CN 202311857269A CN 117493582 A CN117493582 A CN 117493582A
- Authority
- CN
- China
- Prior art keywords
- target
- entity
- relation
- score
- knowledge graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Animal Behavior & Ethology (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a method and device for outputting a model result, an electronic device, and a storage medium, and belongs to the technical field of artificial intelligence. The method comprises the following steps: obtaining entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph, wherein the entities comprise head entities and tail entities, the knowledge graph comprises a plurality of triplets, and each triplet comprises a head entity, a tail entity, and the relation between the head entity and the tail entity; for each triplet in the knowledge graph, calculating a score of the head entity, the relation, and the tail entity according to the entity vectors and the relation vector; and obtaining a target question to be inferred, and using a language model to output, according to the scores, the target answer with the highest degree of matching to the target question. The method and device solve the technical problem in the related art that the inference results output by a language model have low accuracy.
Description
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a method and device for outputting a model result, an electronic device, and a storage medium.
Background
Large language models (LLMs) are deep learning models trained on large amounts of text data that can generate natural language text or understand the meaning of language text. Large language models can handle various natural language tasks, such as text classification, question answering, and dialogue, and are an important route to artificial intelligence; compared with the traditional knowledge graph approach, they are more efficient, more intelligent, and more extensible. However, large language models are black-box models, and it is often difficult for them to capture and provide factual knowledge; that is, they lack factual knowledge. As a result, in the related art the accuracy of the inference results output by a language model is low.
In view of the above problems in the related art, no effective solution has been found yet.
Disclosure of Invention
The application provides a method and a device for outputting a model result, electronic equipment and a storage medium, so as to solve the technical problems in the related art.
According to an embodiment of the present application, there is provided a method for outputting a model result, including: obtaining entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph, wherein the entities comprise head entities and tail entities, the knowledge graph comprises a plurality of triplets, and each triplet comprises a head entity, a tail entity, and the relation between the head entity and the tail entity; for each triplet in the knowledge graph, calculating a score of the head entity, the relation, and the tail entity according to the entity vectors and the relation vector; and obtaining a target question to be inferred, and using a language model to output, according to the scores, the target answer with the highest degree of matching to the target question.
According to another embodiment of the present application, there is provided an output device for a model result, including: an acquisition module, configured to acquire entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph, wherein the entities comprise head entities and tail entities, the knowledge graph comprises a plurality of triplets, and each triplet comprises a head entity, a tail entity, and the relation between the head entity and the tail entity; a calculation module, configured to calculate, for each triplet in the knowledge graph, a score of the head entity, the relation, and the tail entity according to the entity vectors and the relation vector; and an output module, configured to acquire a target question to be inferred and to use a language model to output, according to the scores, the target answer with the highest degree of matching to the target question.
Optionally, the output module includes a first output unit, configured to perform, for each triplet in the knowledge graph, a first conversion on the head entity h and the tail entity t through the relation r, obtaining a relation representation e_h = r × h of the head entity h and a relation representation e_t = r × t of the tail entity t; calculate a first score of the head entity h, the relation r, and the tail entity t using the formula score1(h, r, t) = e_h · e_t + m, where m is a tuning threshold of the knowledge graph and score1(h, r, t) denotes the first score; minimize the first score of positive triplets in the knowledge graph and maximize the first score of negative triplets in the knowledge graph, obtaining a triplet set whose first scores are within a preset threshold; train the language model with the triplet set to obtain an inference model, and input the target head entity and target relation of the target question into the inference model to obtain the target tail entity with the maximum score; and output the target tail entity with the maximum score as the target answer that best matches the target question.
Optionally, the output device of the model result further comprises an expansion module, which is used for forming the target head entity, the target relation and the target tail entity into a target triplet; and adding the target triples to the knowledge graph.
Optionally, the output device for the model result further comprises a mapping module, configured to invoke the sigmoid activation function, map the calculated scores of the head entity h, the relation r, and the tail entity t into the probability space [0, 1] using the sigmoid function, and take the mapped value as the first score.
Optionally, the output module includes a second output unit, configured to perform, for each triplet in the knowledge graph, a second conversion on the head entity h and the tail entity t through the relation r, obtaining a relation representation e_h = h + r of the head entity h and a relation representation e_t = t + r of the tail entity t; calculate a second score of the head entity h, the relation r, and the tail entity t using the formula score2(h, r, t) = ||e_h + r + m − e_t||_2, where m is a tuning threshold of the knowledge graph, ||·||_2 denotes the L2 norm, and score2(h, r, t) is the second score; minimize the second score of positive triplets in the knowledge graph and maximize the second score of negative triplets in the knowledge graph, obtaining a triplet set whose second scores are within a preset threshold; train the language model with the triplet set to obtain an inference model, and input the target head entity and target relation of the target question into the inference model to obtain the target tail entity with the minimum score; and output the target tail entity with the minimum score as the target answer that best matches the target question.
Optionally, the output device of the model result further includes a training module, configured to obtain a head entity vector corresponding to the target head entity, a tail entity vector corresponding to the target tail entity, and a relationship vector corresponding to the target relationship; and inputting the head entity vector, the tail entity vector and the relation vector as training data into an initial language model for training, and obtaining the language model after training.
Optionally, the output module includes a third output unit, configured to perform, for each triplet in the knowledge graph, a third conversion on the head entity h and the tail entity t through the relation r, obtaining a relation representation e_h = h − r × r_h of the head entity h and a relation representation e_t = t − r × r_t of the tail entity t, where r_h and r_t are the projection vectors of the relation r on the head entity h and the tail entity t, respectively; calculate a third score of the head entity h, the relation r, and the tail entity t using the formula score3(h, r, t) = ||e_h + r + m − e_t||_2, where m is a tuning threshold of the knowledge graph, ||·||_2 denotes the L2 norm, and score3(h, r, t) is the third score; minimize the third score of positive triplets in the knowledge graph and maximize the third score of negative triplets in the knowledge graph, obtaining a triplet set whose third scores are within a preset threshold; acquire prompt words for the target question, train the language model with the triplet set and the prompt words to obtain an inference model, and input the target head entity and target relation of the target question into the inference model to obtain the target tail entity with the minimum score; and output the target tail entity with the minimum score as the target answer that best matches the target question.
According to a further embodiment of the present application, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
According to yet another embodiment of the present application, there is also provided an electronic device including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other via the communication bus; wherein: a memory for storing a computer program; and a processor for executing the steps of the method by running a program stored on the memory.
According to yet another embodiment of the present application, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of the above method.
According to the embodiments of the present application, entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph are obtained, wherein the entities comprise head entities and tail entities, the knowledge graph comprises a plurality of triplets, and each triplet comprises a head entity, a tail entity, and the relation between them; for each triplet in the knowledge graph, a score of the head entity, the relation, and the tail entity is calculated according to the entity vectors and the relation vector; a target question to be inferred is obtained, and a language model outputs, according to the scores, the target answer with the highest degree of matching to the target question. Reasoning with a language model is efficient and extensible but suffers from a lack of factual knowledge, whereas a knowledge graph is a structured knowledge model that explicitly stores rich factual knowledge and can supply external knowledge for reasoning and interpretation. By combining the structured knowledge data of the knowledge graph with the language model, the accuracy of the inference results output by the language model is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a block diagram of the hardware architecture of a computer according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of outputting model results in accordance with an embodiment of the present application;
FIG. 3 is a schematic diagram of a system flow combining a language model and a knowledge graph in an embodiment of the present application;
fig. 4 is a block diagram of a model result output device according to an embodiment of the present application.
Detailed Description
In order to make the solution of the present application better understood by those skilled in the art, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of protection of the present application. It should be noted that, where no conflict arises, the embodiments and the features in the embodiments may be combined with each other.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
The method embodiment provided in the first embodiment of the present application may be executed in a mobile phone, a computer, a tablet or a similar computing device. Taking a computer as an example, fig. 1 is a block diagram of a hardware structure of a computer according to an embodiment of the present application. As shown in fig. 1, the computer may include one or more processors 102 (only one is shown in fig. 1) (the processor 102 may include, but is not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA) and a memory 104 for storing data, and optionally, a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those of ordinary skill in the art that the configuration shown in FIG. 1 is merely illustrative and is not intended to limit the configuration of the computer described above. For example, the computer may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program of application software and a module, such as a computer program corresponding to a method for outputting a model result in the embodiment of the present application, and the processor 102 executes the computer program stored in the memory 104, thereby performing various functional applications and data processing, that is, implementing the method described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 104 may further include memory located remotely from processor 102, which may be connected to the computer via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communications provider of a computer. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
In this embodiment, a method for outputting a model result is provided, and fig. 2 is a flowchart of a method for outputting a model result according to an embodiment of the present application, as shown in fig. 2, where the flowchart includes the following steps:
step S10, entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph are obtained, wherein the entities comprise a head entity and a tail entity, the knowledge graph comprises a plurality of triples, and each triplet comprises the head entity, the tail entity and the relation between the head entity and the tail entity;
step S20, calculating scores of a head entity, a relation and a tail entity according to the entity vector and the relation vector aiming at each triplet in the knowledge graph;
and step S30, obtaining a target question to be inferred, and outputting a target answer with highest matching degree with the target question according to the score by adopting a language model.
A knowledge graph (KG) stores structured knowledge as a set of triplets KG = {(h, r, t)} ⊆ E × R × E, where E and R denote the sets of entities and relations, respectively. The knowledge graph comprises a plurality of triplets, each triplet comprising a head entity h, a tail entity t, and the relation r between the head entity h and the tail entity t. In this embodiment, the knowledge graph may be learned by a representation learning algorithm based on tensor decomposition to obtain embedded vector representations of the entities and relations, thereby obtaining the entity vectors corresponding to all entities and the relation vectors corresponding to all relations in the knowledge graph. For each triplet in the knowledge graph, the score of the head entity, the relation, and the tail entity is calculated according to the entity vectors and the relation vector; the target question is then reasoned about according to the scores by combining the knowledge graph and the language model, and the target answer with the highest degree of matching to the target question is output.
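As a concrete illustration of these data structures, the following Python sketch builds a toy triplet set and placeholder embedding tables; the example triples, the embedding dimension, and the random initialization are illustrative stand-ins for the tensor-decomposition representation learning described above, not part of the patent.

```python
import numpy as np

# Toy triplet set standing in for KG = {(h, r, t)} ⊆ E × R × E.
triples = [
    ("Beijing", "capital_of", "China"),
    ("Paris", "capital_of", "France"),
]
entities = sorted({h for h, _, _ in triples} | {t for _, _, t in triples})
relations = sorted({r for _, r, _ in triples})

dim = 64  # embedding dimension (a hyperparameter)
rng = np.random.default_rng(0)

# Embedding tables: in practice these would be learned by the
# tensor-decomposition algorithm; random vectors stand in for that here.
entity_vec = {e: rng.normal(size=dim) for e in entities}
relation_vec = {r: rng.normal(size=dim) for r in relations}
```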
Through the above steps, entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph are obtained; for each triplet in the knowledge graph, a score of the head entity, the relation, and the tail entity is calculated according to the entity vectors and the relation vector; a target question to be inferred is obtained, and a language model outputs, according to the scores, the target answer with the highest degree of matching to the target question. Reasoning with a language model is efficient and extensible but lacks factual knowledge; the knowledge graph, as a structured knowledge model, explicitly stores rich factual knowledge and can provide external knowledge for reasoning and interpretation. Combining the language model with the knowledge graph therefore improves the accuracy of the inference results output by the language model.
In one implementation of this embodiment, the knowledge graph is enhanced by a language model: the step of outputting the target answer with the highest matching degree with the target question according to the score by adopting the language model comprises the following steps:
step A1, for each triplet in the knowledge graph, performing a first conversion on the head entity h and the tail entity t through the relation r, so as to obtain a relation expression e_h=r×h of the head entity h, and a relation expression e_t=r×t of the tail entity t;
step A2, calculating a first score of the head entity h, the relation r and the tail entity t by adopting the following formula: score1 (h, r, t) =e_h_h_r_e_t+m, where m is a tuning threshold of the knowledge graph, used to compensate for the loss in the training process, and score1 (h, r, t) represents a first score;
and calculating the inner product of the embedded vectors of the head entity h, the relation r and the tail entity t, taking the sum of the inner product and the tuning threshold value as a first score1 (h, r, t), wherein the score of the triplet is the confidence degree of the triplet measured by calculating the distance between the entities, and the smaller the score is, the higher the confidence degree of the triplet is.
Step A3, minimizing the first score of the positive triples in the knowledge graph, maximizing the first score of the negative triples in the knowledge graph, and obtaining a triples set with the first score within a preset threshold;
in this embodiment, the triples in the knowledge-graph may be optimized by random gradient descent: minimizing the first score of the positive triples in the knowledge graph, and maximizing the first score of the negative triples in the knowledge graph, wherein the positive triples are correct triples, the negative triples are wrong triples, and a triplet set with the first score within a preset threshold value is obtained.
Step A4, training the language model with the triplet set to obtain an inference model, and inputting the target head entity and target relation of the target question into the inference model to obtain the target tail entity with the maximum score;
and step A5, outputting the target tail entity as the target answer that best matches the target question.
A language model is trained with the triplet set to obtain an inference model; the target head entity and target relation of the target question are extracted and input into the inference model to obtain the target tail entity in the triplet with the largest score1(h, r, t), and the target tail entity with the maximum score is output as the target answer that best matches the target question, i.e., the prediction result of the model.
In this embodiment, after outputting the target tail entity as the target answer that best matches the target question, the method further includes: forming a target triplet by the target head entity, the target relation and the target tail entity; and adding the target triples to the knowledge graph.
When no answer matching the target question can be found in the knowledge graph, the language model performs natural language analysis and reasoning: triplet component values are calculated from the vectors, new entities and relations, namely a target head entity, a target relation, and a target tail entity, are obtained based on these values, the three components are formed into a target triplet, and the target triplet is added to the knowledge graph to expand it, thereby improving the reasoning capability of the knowledge graph.
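Continuing the sketch, the prediction and expansion steps might look as follows; the exhaustive search over the entity set and the helper name are illustrative assumptions:

```python
def predict_tail_max(head, relation, m=1.0):
    # Step A4: rank every candidate tail by score1 and keep the maximum;
    # exhaustive search over the entity set is assumed for clarity.
    return max(entities, key=lambda t: score1(head, relation, t, m))

tail = predict_tail_max("Beijing", "capital_of")
triples.append(("Beijing", "capital_of", tail))  # expand the knowledge graph
```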
In this embodiment, before minimizing the first score of the positive triplets in the knowledge graph, the method further comprises: invoking the sigmoid activation function; and mapping the calculated scores of the head entity h, the relation r, and the tail entity t into the probability space [0, 1] using the sigmoid function, and taking the mapped value as the first score.
The calculated score score1 of the head entity h, the relation r, and the tail entity t is mapped into the probability space [0, 1] using the sigmoid function: score1(h, r, t) = sigmoid(e_h · e_t + m), where m is a tuning threshold of the knowledge graph. Because the raw triplet score can take large values, mapping confines it to the fixed range [0, 1]; the mapped value is used as the first score for subsequent processing, which reduces the magnitude of the score and improves the convergence rate of the model.
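Continuing the sketch, the mapping is a one-line change:

```python
def score1_mapped(h, r, t, m=1.0):
    # Map the raw first score into the probability space [0, 1] via the
    # sigmoid, keeping later losses on a bounded range as the text argues.
    return 1.0 / (1.0 + np.exp(-score1(h, r, t, m)))
```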
In another implementation manner of this embodiment, enhancing a language model with a knowledge graph, and outputting, by using the language model, a target answer with a highest matching degree with the target question according to the score includes:
step B1, for each triplet in the knowledge graph, performing a second conversion on the head entity h and the tail entity t through the relation r, so as to obtain a relation expression e_h=h+r of the head entity h and a relation expression e_t=t+r of the tail entity t;
step B2, calculating a second score of the head entity h, the relation r and the tail entity t by adopting the following formula: score2 (h, r, t) = ||e_h+r+m-e_t|_2, wherein m is a tuning threshold value of the knowledge graph, 2 represents the L2 norm, score2 (h, r, t) is the second score;
in this embodiment, the formula score2 (h, r, t) = ||e_h+r+m-e_t|_2 is used for the positive triples in the knowledge graph to calculate the second score2 (h, r, t) of the positive triples, and the negative head entity h ' is randomly selected by negative sampling for the negative triples (h ', r, t) in the knowledge graph to calculate the second score2 (h ', r, t) of the negative triples.
Step B3, minimizing the second score of the positive triplet in the knowledge graph, and maximizing the second score of the negative triplet in the knowledge graph to obtain a triplet set of which the second score is within a preset threshold;
in this embodiment, the triples in the knowledge-graph are optimized by random gradient descent: and minimizing the second score of the positive triplet in the knowledge graph, maximizing the second score of the negative triplet in the knowledge graph, and obtaining a triplet set of which the second score is within a preset threshold.
Step B4, training the language model with the triplet set to obtain an inference model, and inputting the target head entity and target relation of the target question into the inference model to obtain the target tail entity with the minimum score;
and step B5, outputting the target tail entity with the smallest score as the target answer that best matches the target question.
A language model is trained with the triplet set to obtain an inference model; the target head entity and target relation of the target question are extracted and input into the inference model to obtain the target tail entity with the minimum score2(h, r, t), and the target tail entity is output as the target answer that best matches the target question, i.e., the prediction result of the model.
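The following sketch, continuing the Python example above, covers the second score, negative sampling by head corruption, a margin-ranking surrogate for the minimize/maximize objective of step B3, and the minimum-score tail prediction of step B4; the function names, the margin form, and adding the scalar m inside the norm are our assumptions:

```python
import random

def score2(h, r, t, m=1.0):
    """Second score: ||e_h + r + m - e_t||_2 with the second conversion
    e_h = h + r, e_t = t + r; m enters element-wise (an assumption)."""
    r_v = relation_vec[r]
    e_h = entity_vec[h] + r_v
    e_t = entity_vec[t] + r_v
    return float(np.linalg.norm(e_h + r_v + m - e_t, ord=2))

def corrupt_head(triple):
    # Negative sampling: replace the head with a random entity h'.
    _, r, t = triple
    return (random.choice(entities), r, t)

def margin_loss(positives, gamma=1.0):
    # Surrogate for step B3: drive positive scores down, negative scores up.
    negatives = [corrupt_head(p) for p in positives]
    return sum(max(0.0, gamma + score2(*p) - score2(*n))
               for p, n in zip(positives, negatives))

def predict_tail_min(head, relation):
    # Step B4: the tail entity with the smallest second score is the answer.
    return min(entities, key=lambda t: score2(head, relation, t))
```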
In this embodiment, after outputting the target tail entity as the target answer that best matches the target question, the method further includes: acquiring a head entity vector corresponding to the target head entity, a tail entity vector corresponding to the target tail entity and a relation vector corresponding to the target relation; and inputting the head entity vector, the tail entity vector and the relation vector as training data into an initial language model for training, and obtaining the language model after training.
In this embodiment, the triplet vector values in the knowledge graph are used as feature values for training the language model; using the structured factual knowledge of the knowledge graph as features to train the language model remedies the language model's lack of factual knowledge.
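For illustration, such feature vectors could be assembled by concatenating the three embeddings; this layout is an assumption, continuing the sketch above:

```python
def triple_features(h, r, t):
    # Concatenate head, relation, and tail vectors into one feature vector
    # used as training data for the initial language model (layout assumed).
    return np.concatenate([entity_vec[h], relation_vec[r], entity_vec[t]])
```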
In another implementation manner of this embodiment, the reasoning is performed by fusing a language model and a unified model of a knowledge graph, and the outputting, by using the language model, the target answer with the highest matching degree with the target question according to the score includes:
step C1, for each triplet in the knowledge graph, performing a third conversion on the head entity h and the tail entity t through the relation r, so as to obtain a relation expression e_h=h-r_r_h of the head entity h, and a relation expression e_t=t-r_r_t of the tail entity t, where r_h and r_t are projection vectors of the relation r on the head entity h and the tail entity t respectively;
in the third conversion of the head entity h and the tail entity t through the relation r, r_h and r_t are projection vectors of the relation r on the head entity h and the tail entity t respectively, and are used for maintaining strong type constraint of the relation.
Step C2, calculating a third score of the head entity h, the relation r, and the tail entity t using the following formula: score3(h, r, t) = ||e_h + r + m − e_t||_2, where m is a tuning threshold of the knowledge graph, ||·||_2 denotes the L2 norm, and score3(h, r, t) is the third score;
in this embodiment, the formula score3 (h, r, t) = |e_h+r+m-e_t|_2 is used for the positive triples in the knowledge graph to calculate the third score3 (h, r, t) of the positive triples, and the negative head entity h ' is randomly selected by negative sampling for the negative triples (h ', r, t) in the knowledge graph to calculate the second score3 (h ', r, t) of the negative triples.
Step C3, minimizing third scores of positive triples in the knowledge graph, maximizing third scores of negative triples in the knowledge graph, and obtaining a triplet set with the third scores within a preset threshold;
in this embodiment, the triples in the knowledge-graph are optimized by random gradient descent: minimizing the third score of the positive triples in the knowledge graph, maximizing the third score of the negative triples in the knowledge graph, and obtaining the triples set of which the first score is within a preset threshold.
Step C4, acquiring prompt words for the target question, training the language model with the triplet set and the prompt words to obtain an inference model, and inputting the target head entity and target relation of the target question into the inference model to obtain the target tail entity with the minimum score;
and step C5, outputting the target tail entity with the smallest score as the target answer that best matches the target question.
A language model is trained with the triplet set to obtain an inference model; in the inference task, the target head entity and target relation of the target question are extracted and input into the inference model to obtain the target tail entity with the minimum score3(h, r, t), and the target tail entity is output as the target answer that best matches the target question, i.e., the prediction result of the model.
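Continuing the sketch, one plausible reading of the third conversion computes r_h and r_t as orthogonal projections of r onto h and t; both that reading and the element-wise product for "r × r_h" are assumptions about the patent's wording:

```python
def score3(h, r, t, m=1.0):
    """Third score with the projection-based conversion
    e_h = h - r * r_h, e_t = t - r * r_t."""
    h_v, t_v, r_v = entity_vec[h], entity_vec[t], relation_vec[r]
    r_h = (np.dot(r_v, h_v) / np.dot(h_v, h_v)) * h_v  # projection of r on h
    r_t = (np.dot(r_v, t_v) / np.dot(t_v, t_v)) * t_v  # projection of r on t
    e_h = h_v - r_v * r_h  # element-wise product, mirroring "r × r_h"
    e_t = t_v - r_v * r_t
    return float(np.linalg.norm(e_h + r_v + m - e_t, ord=2))
```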
In this embodiment, a language model and a knowledge graph are fused to perform reasoning, and for a given natural language problem and knowledge graph entity and relationship, a unified model is used to perform reasoning and question-answering, and a corresponding answer or entity is output. The inference accuracy and coverage can be further improved.
Referring to fig. 3, prompt engineering is added in this embodiment. A prompt is a natural-language input sequence that specifies a task for the language model. A prompt may contain several elements: an instruction, context, and input text. The instruction is a phrase that directs the model to perform a particular task; the context provides background for the input text or a few examples; the input text is the text the model must process. Prompt engineering can improve the ability of language models on a variety of complex tasks, such as question answering, sentiment classification, and common-sense reasoning, and chain-of-thought prompting enables complex reasoning through intermediate reasoning steps. Since knowledge graph data and model training data are limited, in this embodiment the prompt words for the target question, obtained through prompt engineering, can serve as initialization data for model training. This embodiment thus uses large-scale natural language data and knowledge graph data to enhance the reasoning and question-answering capability of the model.
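As an illustration of the prompt elements just listed, a hypothetical prompt builder, continuing the Python sketch; the wording and layout are ours, not the patent's:

```python
def build_prompt(question, facts, k=3):
    # Assemble the three prompt elements named above:
    # instruction + context (a few knowledge-graph facts) + input text.
    instruction = "Answer the question using the facts below."
    context = "\n".join(f"({h}, {r}, {t})" for h, r, t in facts[:k])
    return f"{instruction}\n\nFacts:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("What is the capital of France?", triples))
```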
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method described in the embodiments of the present application.
Example 2
This embodiment also provides a device for outputting a model result, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 4 is a block diagram of a model result output device according to an embodiment of the present application, and as shown in fig. 4, the device includes:
an obtaining module 40, configured to obtain entity vectors corresponding to all entities and relationship vectors corresponding to all relationships in a knowledge graph, where the entities include a head entity and a tail entity, and the knowledge graph includes a plurality of triples, each triplet including a head entity, a tail entity, and a relationship between the head entity and the tail entity;
a calculation module 42, configured to calculate, for each triplet in the knowledge graph, scores of a head entity, a relationship, and a tail entity according to the entity vector and the relationship vector;
and the output module 44 is used for acquiring the target questions to be inferred, and outputting target answers with highest matching degree with the target questions according to the scores by adopting a language model.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
Example 3
Embodiments of the present application also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:
s1, obtaining entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph, wherein the entities comprise a head entity and a tail entity, the knowledge graph comprises a plurality of triples, and each triplet comprises the head entity, the tail entity and the relation between the head entity and the tail entity;
s2, calculating scores of a head entity, a relation and a tail entity according to the entity vector and the relation vector aiming at each triplet in the knowledge graph;
s3, obtaining target questions to be inferred, and outputting target answers with highest matching degree with the target questions by adopting a language model according to the scores.
Alternatively, in the present embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or various other media capable of storing a computer program.
Embodiments of the present application also provide an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Optionally, the electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
s1, obtaining entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph, wherein the entities comprise a head entity and a tail entity, the knowledge graph comprises a plurality of triples, and each triplet comprises the head entity, the tail entity and the relation between the head entity and the tail entity;
s2, calculating scores of a head entity, a relation and a tail entity according to the entity vector and the relation vector aiming at each triplet in the knowledge graph;
s3, obtaining target questions to be inferred, and outputting target answers with highest matching degree with the target questions by adopting a language model according to the scores.
Alternatively, specific examples in this embodiment may refer to examples described in the foregoing embodiments and optional implementations, and this embodiment is not described herein.
The foregoing embodiment numbers of the present application are merely for describing, and do not represent advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology content may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of the units, such as the division of the units, is merely a logical function division, and may be implemented in another manner, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence or as the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or various other media capable of storing program code.
The foregoing is merely a preferred embodiment of the present application and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present application and are intended to be comprehended within the scope of the present application.
Claims (9)
1. A method of outputting a model result, the method comprising:
obtaining entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph, wherein the entities comprise a head entity and a tail entity, the knowledge graph comprises a plurality of triples, and each triplet comprises the head entity, the tail entity and the relation between the head entity and the tail entity;
calculating, for each triplet in the knowledge graph, scores of a head entity, a relation and a tail entity according to the entity vector and the relation vector;
acquiring a target problem to be inferred, and outputting a target answer with highest matching degree with the target problem according to the score by adopting a language model;
the step of outputting the target answer with the highest matching degree with the target question according to the score by adopting a language model comprises the following steps:
for each triplet in the knowledge graph, performing first conversion on the head entity h and the tail entity t through the relation r to obtain a relation expression e_h=r×h of the head entity h and a relation expression e_t=r×t of the tail entity t;
calculating a first score of the head entity h, the relation r and the tail entity t using the formula: score1(h, r, t) = e_h · e_t + m, wherein m is a tuning threshold of the knowledge graph, and score1(h, r, t) represents the first score;
minimizing a first score of a positive triplet in the knowledge graph, maximizing the first score of a negative triplet in the knowledge graph, and acquiring a triplet set of which the first score is within a preset threshold;
training a language model by adopting the triplet set to obtain an inference model, and inputting a target head entity and a target relation of the target problem into the inference model to obtain a target tail entity with the maximum score;
and outputting the target tail entity with the maximum score as a target answer which is matched with the target question.
2. The method of claim 1, wherein after outputting the target tail entity with the largest score as the target answer that best matches the target question, the method further comprises:
forming a target triplet by the target head entity, the target relation and the target tail entity;
and adding the target triples to the knowledge graph.
3. The method of claim 1, wherein prior to minimizing the first score for the positive triplet in the knowledge-graph, the method further comprises:
calling an activating function sigmoid function;
and mapping the scores of the head entity h, the relation r and the tail entity t to a probability space of [0, 1] by adopting the sigmoid function, and taking the mapped value as a first score.
4. The method of claim 1, wherein outputting the target answer with the highest degree of matching to the target question according to the score using a language model comprises:
for each triplet in the knowledge graph, performing second conversion on the head entity h and the tail entity t through the relation r to obtain a relation representation e_h=h+r of the head entity h and a relation representation e_t=t+r of the tail entity t;
calculating a second score of the head entity h, the relation r and the tail entity t using the formula: score2(h, r, t) = ||e_h + r + m − e_t||_2, wherein m is a tuning threshold of the knowledge graph, ||·||_2 denotes the L2 norm, and score2(h, r, t) is the second score;
minimizing a second score of the positive triplet in the knowledge graph, and maximizing the second score of the negative triplet in the knowledge graph to obtain a triplet set of which the second score is within a preset threshold;
training a language model by adopting the triplet set to obtain an inference model, and inputting a target head entity and a target relation of the target problem into the inference model to obtain a target tail entity with the minimum score;
and outputting the target tail entity with the minimum score as a target answer which is matched with the target question.
5. The method of claim 4, wherein after outputting the target tail entity with the smallest score as the target answer that best matches the target question, the method further comprises:
acquiring a head entity vector corresponding to the target head entity, a tail entity vector corresponding to the target tail entity and a relation vector corresponding to the target relation;
and inputting the head entity vector, the tail entity vector and the relation vector as training data into an initial language model for training, and obtaining the language model after training.
6. The method of claim 1, wherein outputting the target answer with the highest degree of matching to the target question according to the score using a language model comprises:
for each triplet in the knowledge graph, performing third conversion on the head entity h and the tail entity t through the relation r to obtain a relation expression e_h = h − r × r_h of the head entity h and a relation expression e_t = t − r × r_t of the tail entity t, wherein r_h and r_t are projection vectors of the relation r on the head entity h and the tail entity t respectively;
calculating a third score of the head entity h, the relation r and the tail entity t using the formula: score3(h, r, t) = ||e_h + r + m − e_t||_2, wherein m is a tuning threshold of the knowledge graph, ||·||_2 denotes the L2 norm, and score3(h, r, t) is the third score;
minimizing a third score of the positive triplet in the knowledge graph, and maximizing the third score of the negative triplet in the knowledge graph to obtain a triplet set of which the third score is within a preset threshold;
acquiring a prompt word of the target problem, training a language model by adopting the triplet set and the prompt word to obtain an inference model, and inputting a target head entity and a target relation of the target problem into the inference model to obtain a target tail entity with the minimum score;
and outputting the target tail entity with the minimum score as a target answer which is matched with the target question.
7. A model result output device, comprising:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring entity vectors corresponding to all entities and relation vectors corresponding to all relations in a knowledge graph, the entities comprise a head entity and a tail entity, the knowledge graph comprises a plurality of triples, and each triplet comprises the head entity, the tail entity and the relation between the head entity and the tail entity;
the calculation module is used for calculating, for each triplet in the knowledge graph, scores of a head entity, a relation and a tail entity according to the entity vector and the relation vector;
the output module is used for acquiring target questions to be inferred and outputting target answers with highest matching degree with the target questions by adopting a language model according to the scores;
the output module comprises a first output unit, a second output unit and a third output unit, wherein the first output unit is used for carrying out first conversion on the head entity h and the tail entity t through the relation r to obtain a relation expression e_h=r×h of the head entity h and a relation expression e_t=r×t of the tail entity t; calculating a first score of the head entity h, the relation r and the tail entity t using the formula: score1 (h, r, t) =e_h_h_e_t+m, wherein m is a tuning threshold of the knowledge graph, and score1 (h, r, t) represents a first score; minimizing a first score of a positive triplet in the knowledge graph, maximizing the first score of a negative triplet in the knowledge graph, and acquiring a triplet set of which the first score is within a preset threshold; training the language model by adopting the triplet set to obtain an inference model, and inputting a target head entity and a target relation of the target problem into the inference model to obtain a target tail entity with the maximum score; and outputting the target tail entity with the maximum score as a target answer which is matched with the target question.
8. The electronic equipment is characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus; wherein:
a memory for storing a computer program;
a processor for executing the method steps of any one of claims 1 to 6 by running a program stored on a memory.
9. A storage medium comprising a stored program, wherein the program when run performs the method steps of any one of claims 1 to 6.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311857269.7A CN117493582B (en) | 2023-12-29 | 2023-12-29 | Model result output method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311857269.7A CN117493582B (en) | 2023-12-29 | 2023-12-29 | Model result output method and device, electronic equipment and storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN117493582A true CN117493582A (en) | 2024-02-02 |
| CN117493582B CN117493582B (en) | 2024-04-05 |
Family
ID=89685407
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202311857269.7A Active CN117493582B (en) | 2023-12-29 | 2023-12-29 | Model result output method and device, electronic equipment and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117493582B (en) |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140297571A1 (en) * | 2013-03-29 | 2014-10-02 | International Business Machines Corporation | Justifying Passage Machine Learning for Question and Answer Systems |
| CN110737763A (en) * | 2019-10-18 | 2020-01-31 | 成都华律网络服务有限公司 | Chinese intelligent question-answering system and method integrating knowledge map and deep learning |
| CN111597314A (en) * | 2020-04-20 | 2020-08-28 | 科大讯飞股份有限公司 | Reasoning question-answering method, device and equipment |
| CN112380325A (en) * | 2020-08-15 | 2021-02-19 | 电子科技大学 | Knowledge graph question-answering system based on joint knowledge embedded model and fact memory network |
| CN113360604A (en) * | 2021-06-23 | 2021-09-07 | 中国科学技术大学 | Knowledge graph multi-hop question-answering method and model based on cognitive inference |
| CN113987203A (en) * | 2021-10-27 | 2022-01-28 | 湖南大学 | A knowledge graph reasoning method and system based on affine transformation and bias modeling |
| US20220292262A1 (en) * | 2021-03-10 | 2022-09-15 | At&T Intellectual Property I, L.P. | System and method for hybrid question answering over knowledge graph |
| CN115114421A (en) * | 2022-06-21 | 2022-09-27 | 青岛海信网络科技股份有限公司 | Question-answer model training method |
| CN117076688A (en) * | 2023-08-18 | 2023-11-17 | 中国工商银行股份有限公司 | Knowledge question and answer method based on domain knowledge graph and its device and electronic equipment |
-
2023
- 2023-12-29 CN CN202311857269.7A patent/CN117493582B/en active Active
Patent Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140297571A1 (en) * | 2013-03-29 | 2014-10-02 | International Business Machines Corporation | Justifying Passage Machine Learning for Question and Answer Systems |
| CN110737763A (en) * | 2019-10-18 | 2020-01-31 | 成都华律网络服务有限公司 | Chinese intelligent question-answering system and method integrating knowledge map and deep learning |
| CN111597314A (en) * | 2020-04-20 | 2020-08-28 | 科大讯飞股份有限公司 | Reasoning question-answering method, device and equipment |
| CN112380325A (en) * | 2020-08-15 | 2021-02-19 | 电子科技大学 | Knowledge graph question-answering system based on joint knowledge embedded model and fact memory network |
| US20220292262A1 (en) * | 2021-03-10 | 2022-09-15 | At&T Intellectual Property I, L.P. | System and method for hybrid question answering over knowledge graph |
| CN113360604A (en) * | 2021-06-23 | 2021-09-07 | 中国科学技术大学 | Knowledge graph multi-hop question-answering method and model based on cognitive inference |
| CN113987203A (en) * | 2021-10-27 | 2022-01-28 | 湖南大学 | A knowledge graph reasoning method and system based on affine transformation and bias modeling |
| CN115114421A (en) * | 2022-06-21 | 2022-09-27 | 青岛海信网络科技股份有限公司 | Question-answer model training method |
| CN117076688A (en) * | 2023-08-18 | 2023-11-17 | 中国工商银行股份有限公司 | Knowledge question and answer method based on domain knowledge graph and its device and electronic equipment |
Non-Patent Citations (1)
| Title |
|---|
| CHANG Pan; CAO Yang: "Research on the improved TransH model in the field of knowledge representation and reasoning", Journal of Guangxi University (Natural Science Edition), no. 02, 25 April 2020 (2020-04-25) * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN117493582B (en) | 2024-04-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112365892B (en) | Man-machine conversation method, device, electronic device and storage medium | |
| CN107908803B (en) | Response method and device, storage medium and terminal for question and answer interaction | |
| CN110019838B (en) | Intelligent question-answering system and intelligent terminal | |
| CN110019729A (en) | Intelligent answer method and storage medium, terminal | |
| CN118861218A (en) | Sample data generation method, device, electronic device and storage medium | |
| CN110019728A (en) | Automatic interaction method and storage medium, terminal | |
| CN106708950B (en) | Data processing method and device for intelligent robot self-learning system | |
| CN114691815B (en) | Model training method, device, electronic device and storage medium | |
| CN110019730B (en) | Automatic interaction system and intelligent terminal | |
| CN110795558B (en) | Label acquisition method and device, storage medium and electronic device | |
| CN117874163A (en) | Inference prompt method, answer inference method, device, equipment and storage medium | |
| US20210319069A1 (en) | Corpus processing method, apparatus and storage medium | |
| CN117573804A (en) | Text processing method, electronic device and computer-readable storage medium | |
| CN117493582B (en) | Model result output method and device, electronic equipment and storage medium | |
| CN111931503A (en) | Information extraction method and device, equipment and computer readable storage medium | |
| CN115600818A (en) | Multidimensional scoring method, device, electronic equipment and storage medium | |
| CN119829709B (en) | Training of large question-answering models and methods and devices for question-answering tasks | |
| CN118588313B (en) | Hospital data dictionary mapping method, device, computing equipment and storage medium | |
| CN117744801A (en) | Conversation method, device and equipment of psychology big model based on artificial intelligence | |
| CN113868415B (en) | Knowledge base generation method and device, storage medium and electronic equipment | |
| CN110704587B (en) | Text answer searching method and device | |
| WO2023040545A1 (en) | Data processing method and apparatus, device, storage medium, and program product | |
| CN113761149A (en) | Dialogue information processing method, device, computer equipment and storage medium | |
| CN115510199A (en) | Data processing method, device and system | |
| CN117786416B (en) | A model training method, device, equipment, storage medium and product |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |