
CN115391557B - A relation extraction method that integrates entity type representation and relation representation - Google Patents

A relation extraction method that integrates entity type representation and relation representation

Info

Publication number
CN115391557B
CN115391557B (Application CN202211037489.0A)
Authority
CN
China
Prior art keywords
sub
representation
subject
relation
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211037489.0A
Other languages
Chinese (zh)
Other versions
CN115391557A (en)
Inventor
刘峤
徐远扬
骆妲
赵海睿
甘洋镭
侯睿
代婷婷
佟飘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202211037489.0A
Publication of CN115391557A
Application granted
Publication of CN115391557B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31Indexing; Data structures therefor; Storage structures
    • G06F16/313Selection or weighting of terms for indexing
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • G06N5/025Extracting rules from data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Machine Translation (AREA)

Abstract


This invention discloses a relation extraction method that integrates entity type representation and relation representation, belonging to the field of relation extraction technology. This invention relates to a text-subject-object weak correlation semantic representation mechanism, which reduces the extraction model's dependence on subject-object semantic association by introducing entity type information to replace entity word meaning information. Based on this, this invention further models the abstract semantic information of entity relations and integrates it with the contextual semantic representation containing subject-object type information to generate a semantic mapping of entity relations, achieving a more accurate prediction effect for subject-relationship-object triples.

Description

Relation extraction method for fusing entity type representation and relation representation
Technical Field
The invention relates to the field of relation extraction, and in particular to a relation extraction method that fuses entity type representation and relation representation.
Background
Against the background of today's information explosion, information extraction technology extracts important information from massive unstructured text and reorganizes it into structured information that is easy to use in downstream tasks (such as knowledge graph construction, search engine knowledge bases, and question-answering system knowledge bases). Relation extraction is an important subfield of information extraction, aimed at extracting structured relation triples, i.e. (subject, relation, object), from unstructured text to characterize the associations between entities.
Most existing relation extraction methods use a joint or pipeline approach based on named entity recognition. During modeling, the named-entity subjects and objects are identified first; based on the recognition result, the semantic information of the subjects and objects is then emphasized in the features used for relation extraction, while the global context of the sentence is ignored. The model therefore degrades, to a certain extent, into a relation-matching model over entity pairs, which harms the soundness and robustness of relation extraction. This patent therefore provides a relation extraction method that fuses entity type representation and relation representation, improving the model's robustness on unseen entities or sentences.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a relation extraction method that fuses entity type representation and relation representation. Starting from semantic representation, entity type representation, and relation representation, it uses a text-subject-object weakly correlated semantic representation mechanism and a relation-feature fusion mechanism to build a novel relation extraction model that can effectively capture the context of sentences and extract entity pairs and relations from unstructured text, solving the problems mentioned in the background section.
To achieve this purpose, the invention provides a relation extraction method that fuses entity type representation and relation representation, comprising the following steps:
Step S10: for the natural language text input to the system, encode the text's semantic information, entity type information, and relation information based on the Word-Piece segmentation method, and output the Word-Piece semantic representation, the entity type representation, and the relation representation;
Step S20: based on the output Word-Piece semantic representation, extract the subject and the object in the text using BERT and binary tagging;
Step S30: replace the word-sense representations of the extracted subject and object with the output entity type representations to weaken the subject-object semantic association, constructing a weakly correlated semantic representation mechanism for the subject and object in the text and generating a new text with weak semantic association between subject and object;
Step S40: construct a relation encoder based on the BERT representation model, encode the new weakly associated text, extract its high-level abstract semantic information, and output a text-subject-object weakly correlated contextual semantic vector representation by combining bidirectional context information;
Step S50: construct a fusion mechanism for the text-subject-object weakly correlated contextual semantic information and the relation information; the fused representation vector is used to capture subject-relation-object triples.
Preferably, the specific steps of step S10 are as follows:
Step S101: the natural language text input to the system is a word sequence s = {w_1, ..., w_l}, where w_i, i ∈ {1, 2, ..., l}, is the i-th word in the sentence and l is the number of words in the sentence to be extracted. A Word-Piece representation model based on BPE (byte-pair encoding) is constructed to represent the words in a vector space; each word of the input sentence is split into fine-grained subwords, and the subword representation sequence T = {t_1, ..., t_L} is output, where t_i, i ∈ {1, 2, ..., L}, is the i-th subword in the sentence and L is the subword length of the sentence after Word-Piece segmentation;
Step S102: entity types and relation types are vectorized before being input to the system. Let ε be the set of entity types and R the set of relation types. For any entity type e ∈ ε and any relation type r ∈ R of the input system, an entity-type and a relation representation model based on a multi-layer perceptron are constructed, converting the discrete entity-type and relation-type symbols into continuous high-dimensional representation vectors that carry fine-grained semantic information of the entity types and relation types.
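The discrete-symbol-to-vector step above can be sketched as a one-hot lookup followed by a one-hidden-layer perceptron. This is a minimal illustrative sketch, not the patent's implementation: the class name `TypeEmbedder`, the example label set, and all layer sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class TypeEmbedder:
    """Map a discrete entity-type / relation-type id to a continuous
    vector via one-hot lookup plus a one-hidden-layer perceptron
    (illustrative sketch; sizes are assumptions, not from the patent)."""
    def __init__(self, n_types, hidden=32, dim=16):
        self.W1 = rng.standard_normal((n_types, hidden)) * 0.1
        self.b1 = np.zeros(hidden)
        self.W2 = rng.standard_normal((hidden, dim)) * 0.1
        self.b2 = np.zeros(dim)

    def __call__(self, type_id):
        onehot = np.zeros(self.W1.shape[0])
        onehot[type_id] = 1.0
        h = np.tanh(onehot @ self.W1 + self.b1)  # hidden layer
        return h @ self.W2 + self.b2             # dense representation vector

entity_types = ["PERSON", "ORG", "LOC"]          # hypothetical label set ε
embed = TypeEmbedder(len(entity_types))
vec = embed(entity_types.index("ORG"))
print(vec.shape)  # (16,)
```

Each type id thus receives its own learnable dense vector, which is what later steps substitute for the subject/object word-sense representations.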
Preferably, the specific steps of step S20 are as follows:
Step S201: construct a named-entity encoder based on the BERT neural network representation model, take the subword sequence T = {t_1, ..., t_L} as the input of the system encoder, and pass it through N Transformer encoder blocks in sequence. With fine-tuned parameters, the bidirectional context of each token is deeply encoded, and a deep bidirectional language representation vector sequence is output:
h_α = Trans(h_{α-1}), α ∈ [1, N]
where Trans denotes a Transformer encoder block and h_{α-1} is the encoding result of the previous Transformer encoder block;
Step S202: build a named-entity subject decoder and an object decoder based on a fully connected neural network to extract candidate subjects and candidate objects from the subword sequence. Taking the output of the last encoder block h_N as the decoder input, compute for each token i of the subword sequence the probability that it is a subject-span start, subject-span end, object-span start, or object-span end:
p_i^type = σ(W_type h_i + b_type), type ∈ {start_s, end_s, start_o, end_o}
where W_type and b_type are the learnable weight and bias parameters of the fully connected neural network and σ is the sigmoid activation function;
Each computed probability p_i^type, type ∈ {start_s, end_s, start_o, end_o}, is compared with a preset threshold of 0.5 (a hyperparameter set manually from prior knowledge and hyperparameter experiments). The system judges from the output probability whether the token carries the tag of that type: if the probability exceeds the threshold, the corresponding tag d_i^type, type ∈ {start_s, end_s, start_o, end_o}, is assigned 1, otherwise 0;
According to the tags determined above, the corresponding tag sequences of subject-span starts, subject-span ends, object-span starts, and object-span ends, d^start_s, d^end_s, d^start_o, d^end_o, are output;
Step S203: for each 1-tag in the subject start-point sequence d^start_s, search for the nearest 1-tag to its right in the subject end-point sequence d^end_s to form a potential subject span sub_i;
Applying this operation to every 1-tag in the subject and object start-point sequences outputs a potential subject span sequence H_sub = (sub_1, ..., sub_m) and a potential object span sequence H_obj = (obj_1, ..., obj_n); pairing them two by two forms the potential subject-object span pair sequence H = (sub_1, obj_1), ..., (sub_{m×n}, obj_{m×n});
where m and n are the numbers of potential subjects and potential objects extracted from the subword sequence, respectively.
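The threshold-and-pair decoding of steps S202-S203 can be sketched as follows. This is a toy sketch on hand-made probabilities (the probabilities and the 0.5 threshold are from the text; the function name and the 8-token example are assumptions):

```python
def decode_spans(p_start, p_end, threshold=0.5):
    """Binary-tagging span decoder: threshold each probability into a
    0/1 tag, then pair every start tag with the nearest end tag to its
    right to form a candidate span."""
    starts = [i for i, p in enumerate(p_start) if p > threshold]
    ends = [i for i, p in enumerate(p_end) if p > threshold]
    spans = []
    for s in starts:
        right_ends = [e for e in ends if e >= s]
        if right_ends:
            spans.append((s, min(right_ends)))  # nearest end to the right
    return spans

# toy start/end probabilities over an 8-token subword sequence
subj = decode_spans([.1, .9, .1, .1, .8, .1, .1, .1],
                    [.1, .1, .7, .1, .1, .9, .1, .1])
obj = decode_spans([.1, .1, .1, .1, .1, .1, .9, .1],
                   [.1, .1, .1, .1, .1, .1, .1, .8])
pairs = [(s, o) for s in subj for o in obj]  # Cartesian pairing, m × n
print(subj, obj, pairs)
```

Here two subject spans and one object span yield m × n = 2 candidate subject-object pairs, matching the pairing described above.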
Preferably, the specific steps of step S30 are as follows:
Step S301: construct a text-subject-object weakly correlated semantic representation mechanism, inputting entity type information to weaken the subject-object semantic association. For a given subject-object span pair (sub_i, obj_j), i ≠ j, replace the representation vectors of the corresponding spans in the subword sequence T = {t_1, ..., t_L} with the corresponding entity-type representation vectors e(sub_i), e(obj_j), thereby weakening the subject-object semantic association, and output the new text representation sequence T_2, where L_2 is the length of the subword sequence after replacement. At the same time, output the positions of the type representation vectors e(sub_i), e(obj_j) in the new sequence T_2: (s_1, ..., s_m) is the subject replacement position sequence with length m, and (o_1, ..., o_n) is the object replacement position sequence with length n.
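The replacement mechanism is easiest to see at the symbol level. The sketch below collapses each span into a type marker; note this is illustrative only (the patent replaces dense representation vectors, not strings, and the sentence and markers here are invented):

```python
def replace_with_type(tokens, span, type_label):
    """Collapse a subject/object span into its entity-type marker,
    weakening the word-sense information of the entity mention
    (string-level stand-in for the vector replacement in step S301)."""
    s, e = span
    return tokens[:s] + [type_label] + tokens[e + 1:]

sent = ["Steve", "Jobs", "founded", "Apple", "Inc", "in", "1976"]
t1 = replace_with_type(sent, (0, 1), "[PERSON]")  # subject span
t2 = replace_with_type(t1, (2, 3), "[ORG]")       # object span (indices shifted)
print(t2)  # ['[PERSON]', 'founded', '[ORG]', 'in', '1976']
```

The resulting sequence T_2 is shorter than T (length L_2), and the positions of the inserted type vectors are recorded for the max-pooling step later.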
Preferably, the specific steps of step S40 are as follows:
Step S401: for a subject-object pair (sub_i, obj_j), i ≠ j, construct a relation encoder based on the BERT neural network representation model, take the new text representation sequence T_2 as the input of the system encoder, and pass it through N Transformer encoder blocks in sequence. With fine-tuned parameters, the bidirectional context of each token is deeply encoded, and a deep bidirectional language representation vector sequence h_α = Trans(h_{α-1}), α ∈ [1, N], is output, where Trans denotes a Transformer encoder block and h_{α-1} is the encoding result of the previous Transformer encoder block. The output of the relation encoder is the encoding result of the last Transformer encoder block, i.e. the text-subject-object weakly correlated contextual semantic representation h = (h_1, ..., h_{L_2}), where h_i, i ∈ {1, 2, ..., L_2}, is the contextual encoding of token t_i of the subword sequence.
Preferably, the specific steps of step S50 are as follows:
Step S501: construct a relation decoder based on a fully connected linear neural network, and for a subject-object pair (sub_i, obj_j), i ≠ j, with the system's input natural language sentence s = {w_1, ..., w_l}, compute the probability of outputting relation r_k ∈ R:
H_sub = MaxPooling(h_{s_1}, ..., h_{s_i})
H_obj = MaxPooling(h_{o_1}, ..., h_{o_j})
H = H_sub + H_obj
p_{i,j,k} = σ(W(H; e(r_k)) + b)
where e(r_k) is the representation vector of relation r_k ∈ R, (h_{s_1}, ..., h_{s_i}) and (h_{o_1}, ..., h_{o_j}) are the values of the encoder's output semantic representation at positions (s_1, ..., s_i) and (o_1, ..., o_j), MaxPooling denotes the max-pooling operation, the output subject representation H_sub and object representation H_obj are added to form the overall entity representation H, and W and b are the learnable weight and bias parameters of the fully connected linear neural network, with σ the sigmoid activation function;
If the computed probability p_{i,j,k} exceeds the preset threshold of 0.6 (a hyperparameter set manually from prior knowledge and hyperparameter experiments), the system judges that the subject-object pair (sub_i, obj_j), i ≠ j, holds relation r_k ∈ R in the natural language sentence s = {w_1, ..., w_l}. The occurrence probability is computed for every entity pair in H = (sub_1, obj_1), ..., (sub_{m×n}, obj_{m×n}) and every relation r_k ∈ R. The final output is the set of all triples whose probability exceeds the preset threshold, i.e. the relation extraction result Result = ((sub_1, r_1, obj_1), ..., (sub_n, r_n, obj_n)) of the natural language sentence s = {w_1, ..., w_l}, where n is the number of extracted triples.
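The fusion-and-scoring formulas of step S501 can be sketched numerically. This is a minimal sketch with random stand-in weights; the function name, dimensions, and random data are assumptions, and only the structure (max-pool, add, concatenate with e(r_k), sigmoid) follows the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relation_prob(h, subj_pos, obj_pos, rel_vec, W, b):
    """Max-pool the encoder states at the subject and object replacement
    positions, add them into the overall entity representation H,
    concatenate the relation vector e(r_k), and score with sigmoid(W(.)+b)."""
    H_sub = h[subj_pos].max(axis=0)       # MaxPooling over subject positions
    H_obj = h[obj_pos].max(axis=0)        # MaxPooling over object positions
    H = H_sub + H_obj                     # overall entity representation
    fused = np.concatenate([H, rel_vec])  # (H ; e(r_k))
    return sigmoid(fused @ W + b)

rng = np.random.default_rng(1)
h = rng.standard_normal((10, 8))          # stand-in encoder output, L_2=10, dim 8
p = relation_prob(h, [2], [5],
                  rng.standard_normal(8),   # stand-in e(r_k)
                  rng.standard_normal(16),  # stand-in W
                  0.0)                      # stand-in b
print(float(p))  # a probability in (0, 1), compared against the 0.6 threshold
```

In the full method this score is computed for every (sub_i, obj_j) pair and every relation type, and triples above the threshold form the extraction result.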
The beneficial effects of the invention are as follows:
1) Taking natural language text as its research object, the invention provides a relation extraction method that fuses entity type representation and relation representation. Starting from semantic representation, entity type representation, and relation representation, it uses a text-subject-object weakly correlated semantic representation mechanism and a relation-feature fusion mechanism to build a novel relation extraction model that can effectively capture the context of sentences and extract entity pairs and relations from unstructured text.
2) Besides semantic representation, the method outputs entity type representation and relation representation. It extracts subjects and objects simultaneously with a BERT-based neural network model, pairs the extracted subjects and objects, replaces the encoding of each subject and object with its type information according to the pairing result to obtain new semantic information, re-encodes this new semantic information with a BERT-based neural network model, and predicts whether a relation exists using max pooling and a multi-layer perceptron. By designing a text-subject-object weakly correlated semantic representation mechanism and replacing entity semantic information with entity type information, the invention reduces the extraction model's dependence on the subject-object semantic association.
Drawings
FIG. 1 is a flow chart of the steps of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
Referring to FIG. 1, the invention provides a relation extraction method that fuses weakly correlated semantic representation and relation representation; the specific steps are as follows:
Step 1: for the natural language text input to the system, encode the text's semantic information, entity type information, and relation information based on the Word-Piece segmentation method, and output the Word-Piece semantic representation, the entity type representation, and the relation representation.
Step 1-1: the natural language text input to the system is a word sequence s = {w_1, ..., w_l}, where w_i, i ∈ {1, 2, ..., l}, is the i-th word in the sentence and l is the number of words in the sentence to be extracted. The system constructs a Word-Piece representation model based on BPE (byte-pair encoding) to represent the words in a vector space, splits each word of the input sentence into fine-grained subwords, and outputs the subword representation sequence T = {t_1, ..., t_L}, where t_i, i ∈ {1, 2, ..., L}, is the i-th subword in the sentence and L is the subword length of the sentence after Word-Piece segmentation;
Step 1-2: entity types and relation types are vectorized before being input to the system. Let ε be the set of entity types and R the set of relation types. For any entity type e ∈ ε and any relation type r ∈ R of the input system, an entity-type and a relation representation model based on a multi-layer perceptron are constructed, converting the discrete entity-type and relation-type symbols into continuous high-dimensional representation vectors that carry fine-grained semantic information of the entity types and relation types.
Step 2: based on the output Word-Piece semantic representation, extract the subject and the object in the text using BERT and binary tagging.
Step 2-1: construct a named-entity encoder based on the BERT neural network representation model. The subword sequence T = {t_1, ..., t_L} is taken as the input of the system encoder and passed through N Transformer encoder blocks in sequence. With fine-tuned parameters, the bidirectional context of each token is deeply encoded, and a deep bidirectional language representation vector sequence h_α = Trans(h_{α-1}), α ∈ [1, N], is output, where Trans denotes a Transformer encoder block and h_{α-1} is the encoding result of the previous Transformer encoder block;
Step 2-2: build a named-entity subject decoder and object decoder based on a fully connected neural network to extract candidate subjects and candidate objects from the subword sequence. Taking the output of the last encoder block h_N as the decoder input, compute for each token i of the subword sequence the probability that it is a subject-span start, subject-span end, object-span start, or object-span end:
p_i^type = σ(W_type h_i + b_type), type ∈ {start_s, end_s, start_o, end_o}
where W_type and b_type are the learnable weight and bias parameters of the fully connected neural network and σ is the sigmoid activation function.
Each computed probability p_i^type, type ∈ {start_s, end_s, start_o, end_o}, is compared with a preset threshold of 0.5 (a hyperparameter set manually from prior knowledge and hyperparameter experiments). The system judges from the output probability whether the token carries the tag of that type: if the probability exceeds the threshold, the corresponding tag d_i^type, type ∈ {start_s, end_s, start_o, end_o}, is assigned 1, otherwise 0;
The system then outputs the corresponding tag sequences of subject-span starts, subject-span ends, object-span starts, and object-span ends: d^start_s, d^end_s, d^start_o, d^end_o;
Step 2-3: for each 1-tag in the subject start-point sequence d^start_s, the system searches for the nearest 1-tag to its right in the subject end-point sequence d^end_s to form a potential subject span sub_i, and performs the same operation on the object sequences to output a potential object span obj_i. Applying this to every 1-tag in the subject and object start-point sequences outputs the potential subject span sequence H_sub = (sub_1, ..., sub_m) and the potential object span sequence H_obj = (obj_1, ..., obj_n); pairing them two by two forms the potential subject-object span pair sequence H = (sub_1, obj_1), ..., (sub_{m×n}, obj_{m×n}), where m and n are the numbers of potential subjects and objects extracted from the subword sequence, respectively;
Step 3: replace the word-sense representations of the extracted subject and object with the output entity type representations to weaken the subject-object semantic association, constructing a weakly correlated semantic representation mechanism for the subject and object in the text and generating a new text with weak semantic association between subject and object.
Step 3-1: construct the text-subject-object weakly correlated semantic representation mechanism, inputting additional entity type information to weaken the subject-object semantic association. For a given subject-object span pair (sub_i, obj_j), i ≠ j, the system replaces the representation vectors of the corresponding spans in the subword sequence T = {t_1, ..., t_L} with the corresponding entity-type representation vectors e(sub_i), e(obj_j), in order to weaken the subject-object semantic association, and outputs the new text representation sequence T_2, where L_2 is the length of the subword sequence after replacement. At the same time, it outputs the positions of the type representation vectors e(sub_i), e(obj_j) in the new sequence T_2: (s_1, ..., s_m) is the subject replacement position sequence with length m, and (o_1, ..., o_n) is the object replacement position sequence with length n;
Step 4: construct a relation encoder based on the BERT representation model, encode the new weakly associated text, extract its high-level abstract semantic information, and output a text-subject-object weakly correlated contextual semantic vector representation by combining bidirectional context information.
Step 4-1: for a subject-object pair (sub_i, obj_j), i ≠ j, construct a relation encoder based on the BERT neural network representation model. The text representation sequence T_2 is taken as the input of the system encoder and passed through N Transformer encoder blocks in sequence. With fine-tuned parameters, the bidirectional context of each token is deeply encoded, and a deep bidirectional language representation vector sequence h_α = Trans(h_{α-1}), α ∈ [1, N], is output, where Trans denotes a Transformer encoder block and h_{α-1} is the encoding result of the previous Transformer encoder block. The output of the relation encoder is the encoding result of the last Transformer encoder block, i.e. the text-subject-object weakly correlated contextual semantic representation h = (h_1, ..., h_{L_2}), where h_i, i ∈ {1, 2, ..., L_2}, is the contextual encoding of token t_i of the subword sequence;
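The recurrence h_α = Trans(h_{α-1}) through N blocks can be illustrated with a toy stand-in. This is not a real Transformer block (a real block applies self-attention plus a feed-forward network); the single nonlinear map below, and all the sizes, are assumptions used only to show the stacking pattern:

```python
import numpy as np

def trans_block(h, W):
    """Toy stand-in for one Transformer encoder block: one nonlinear
    map illustrating the recurrence h_alpha = Trans(h_{alpha-1})."""
    return np.tanh(h @ W)

rng = np.random.default_rng(2)
N, L2, d = 4, 6, 8                        # N blocks, sequence length L_2, width
weights = [rng.standard_normal((d, d)) * 0.3 for _ in range(N)]
h = rng.standard_normal((L2, d))          # embedded input sequence T_2
for W in weights:                         # pass through the N blocks in order
    h = trans_block(h, W)                 # h_alpha = Trans(h_{alpha-1})
print(h.shape)  # (6, 8)
```

The output of the last block, of shape (L_2, d), plays the role of the weakly correlated contextual representation consumed by step 5.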
Step 5: construct a fusion mechanism for the text-subject-object weakly correlated contextual semantic information and the relation information; the fused representation vector is used to capture subject-relation-object triples.
Step 5-1: construct a relation decoder based on a fully connected linear neural network, and for a subject-object pair (sub_i, obj_j), i ≠ j, with the system's input natural language sentence s = {w_1, ..., w_l}, compute the probability of outputting relation r_k ∈ R:
H_sub = MaxPooling(h_{s_1}, ..., h_{s_i})
H_obj = MaxPooling(h_{o_1}, ..., h_{o_j})
H = H_sub + H_obj
p_{i,j,k} = σ(W(H; e(r_k)) + b)
where e(r_k) is the representation vector of relation r_k ∈ R, (h_{s_1}, ..., h_{s_i}) and (h_{o_1}, ..., h_{o_j}) are the values of the encoder's output semantic representation at positions (s_1, ..., s_i) and (o_1, ..., o_j), MaxPooling denotes the max-pooling operation, the output subject representation H_sub and object representation H_obj are added to form the overall entity representation H, and W and b are the learnable weight and bias parameters of the fully connected linear neural network, with σ the sigmoid activation function;
If the computed probability p_{i,j,k} exceeds the preset threshold of 0.6 (a hyperparameter set manually from prior knowledge and hyperparameter experiments), the system judges that the subject-object pair (sub_i, obj_j), i ≠ j, holds relation r_k ∈ R in the natural language sentence s = {w_1, ..., w_l}. The occurrence probability is computed for every entity pair in H = (sub_1, obj_1), ..., (sub_{m×n}, obj_{m×n}) and every relation r_k ∈ R. The system's final output is the set of all triples whose probability exceeds the preset threshold of 0.6, i.e. the relation extraction result Result = ((sub_1, r_1, obj_1), ..., (sub_n, r_n, obj_n)) of the natural language sentence s = {w_1, ..., w_l}, where n is the number of extracted triples.
Taking natural language text as its research object, the method designs a relation extraction approach that fuses entity type representation and relation representation. Starting from semantic representation, entity type representation, and relation representation, it uses a text-subject-object weakly correlated semantic representation mechanism and a relation-feature fusion mechanism to provide a novel relation extraction model that can effectively capture the context of sentences and extract entity pairs and relations from unstructured text.
Although the present invention has been described with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the described embodiments or equivalents substituted for elements thereof; any modification, equivalent substitution, or improvement made without departing from the spirit and principles of the present invention shall fall within its scope.

Claims (3)

1. A relation extraction method integrating entity type representation and relation representation, characterized in that the relation extraction method comprises the following specific steps:

Step S10: for natural language text input to the system, encode the semantic information, entity type information and relation information of the text based on the Word-Piece tokenization method, and output Word-Piece semantic representations, entity type representations and relation representations;

Step S20: based on the output Word-Piece semantic representations, extract the subjects and objects in the text using BERT and a binary (start/end) tagging scheme;

Step S30: replace the lexical-semantic representations of the extracted subjects and objects with the output entity type representations so as to weaken the subject-object semantic association, construct a weakly correlated semantic representation mechanism for the subjects and objects in the text, and generate a new text with weak semantic association between subject and object; the specific steps are as follows:

construct the text/subject-object weakly correlated semantic representation mechanism, and input the entity type information to weaken the subject-object semantic association: for a given subject-object span pair (sub_i, obj_j), i ≠ j, replace the representation vectors of the corresponding spans in the sub-word representation sequence T = (t_1, ..., t_L) with the corresponding entity type representation vectors, where L is the sub-word length of the sentence to be extracted after Word-Piece segmentation; output the new text representation sequence T_2, where L_2 is the length of the sub-word sequence after replacement, and simultaneously output the positions of the type representation vectors e(sub_i) and e(obj_j) in the new text representation sequence T_2, where (s_1, ..., s_m) denotes the subject replacement position sequence, m is the subject replacement length, (o_1, ..., o_n) denotes the object replacement position sequence, and n is the object replacement length;

Step S40: construct a relation encoder based on the BERT representation model, encode the new weakly associated text, extract the high-level abstract semantic information of the text, and output a text/subject-object weakly correlated contextual semantic vector representation by combining bidirectional context information; the specific steps are as follows:

for a subject-object pair (sub_i, obj_j), i ≠ j, construct a relation encoder based on the BERT neural network representation model, take the new text representation sequence T_2 as the input of the system encoder, pass it sequentially through N Transformer encoder blocks, deeply encode the bidirectional context information of each token by fine-tuning the parameters, and output a deep bidirectional language representation vector sequence h_α = Trans(h_{α-1}), where Trans denotes a Transformer encoder block and h_{α-1} denotes the encoding result of the previous Transformer encoder block; the output of the relation encoder is the encoding result of the last Transformer encoder block, i.e. the text/subject-object weakly correlated contextual semantic representation H = (h_1, ..., h_{L_2}), where h_i is the contextual encoding result of the i-th token of the sub-word representation sequence, i ∈ {1, 2, ..., L_2};

Step S50: construct a fusion mechanism of the text/subject-object weakly correlated contextual semantic information and the relation information; the fused representation vector is used to capture subject-relation-object triples; the specific steps are as follows:

construct a relation decoder based on a fully connected linear neural network, and when the natural language text sentence input to the system is s = {w_1, ..., w_l}, compute the probability that the subject-object pair (sub_i, obj_j), i ≠ j, expresses the relation r_k ∈ R′, where R′ is the set of relation types; the formulas are as follows:

H_sub = MaxPooling(h_{s_1}, ..., h_{s_m})

H_obj = MaxPooling(h_{o_1}, ..., h_{o_n})

H = H_sub + H_obj

p_{i,j,k} = σ(W(H; e(r_k)) + b)

where e(r_k) is the representation vector of the relation r_k ∈ R′; (h_{s_1}, ..., h_{s_m}) and (h_{o_1}, ..., h_{o_n}) are the values of the encoder-output semantic representation at the positions (s_1, ..., s_m) and (o_1, ..., o_n) respectively; MaxPooling denotes the max pooling operation, which outputs the subject representation H_sub and the object representation H_obj, and the two are added to form the overall entity representation H; W and b are the learnable weight and bias parameters of the fully connected linear neural network respectively, and σ is the sigmoid activation function;

if the computed probability p_{i,j,k} exceeds a preset threshold, the subject-object pair (sub_i, obj_j), i ≠ j, is considered to hold the relation r_k ∈ R′ in the natural language text sentence s = {w_1, ..., w_l}; this probability is computed for every entity pair in H = (sub_1, obj_1), ..., (sub_{m×n}, obj_{m×n}) and every relation type r ∈ R′;

the final output is the extraction result formed by all triples whose probability exceeds the preset threshold, i.e. the relation extraction result of the natural language text sentence s = {w_1, ..., w_l}: Result = ((sub_1, r_1, obj_1), ..., (sub_n, r_n, obj_n)), where n is the number of extracted triples.
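The step S50 scoring above can be sketched in a few lines; this is a minimal NumPy illustration under stated assumptions (the `score_relation` name, array shapes and weight layout are all illustrative, not the patented implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def score_relation(h, sub_pos, obj_pos, rel_emb, W, b):
    """Compute p_{i,j,k} = sigmoid(W(H; e(r_k)) + b) for one candidate triple.

    h       : (L2, d) encoder output for the type-substituted sentence
    sub_pos : subject replacement positions (s_1, ..., s_m)
    obj_pos : object replacement positions (o_1, ..., o_n)
    rel_emb : representation vector e(r_k) of the candidate relation
    W, b    : learnable weights and bias of the fully connected scoring layer
    """
    H_sub = h[sub_pos].max(axis=0)          # MaxPooling over subject positions
    H_obj = h[obj_pos].max(axis=0)          # MaxPooling over object positions
    H = H_sub + H_obj                       # overall entity representation
    fused = np.concatenate([H, rel_emb])    # (H ; e(r_k))
    return sigmoid(W @ fused + b)           # p_{i,j,k}
```

With all-zero weights the score is exactly 0.5 (the sigmoid midpoint), which is a convenient sanity check before any training.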
2. The relation extraction method integrating entity type representation and relation representation according to claim 1, characterized in that the specific steps of step S10 are as follows:

Step S101: the natural language text input to the system is a word sequence s ∈ {w_1, w_2, ..., w_l}, where w_i denotes the i-th word of the sentence, i ∈ {1, 2, ..., l}, and l is the number of words in the sentence to be extracted; construct a Word-Piece representation model based on BPE (byte pair encoding) to represent the words in the vector space, segment each word of the input sentence into fine-grained sub-words, and output the sub-word representation sequence T = (t_1, ..., t_L), where t_i denotes the representation of the i-th sub-word of the sentence, i ∈ {1, 2, ..., L};

Step S102: the entity types and relation types are input to the system in advance for vector representation, where ε is the set of entity types and R′ is the set of relation types; for any entity type e ∈ ε and any relation type r ∈ R′ input to the system, construct entity type and relation representation models based on multilayer perceptrons respectively, which convert the discrete entity type symbols and relation type symbols into continuous high-dimensional representation vectors, so as to output fine-grained semantic information of the entity types and relation types.

3. The relation extraction method integrating entity type representation and relation representation according to claim 1, characterized in that the specific steps of step S20 are as follows:

Step S201: construct a named entity encoder based on the BERT neural network representation model, take the sub-word representation sequence T as the input of the system encoder, pass it sequentially through N Transformer encoder blocks, deeply encode the bidirectional context information of each token by fine-tuning the parameters, and output a deep bidirectional language representation vector sequence h_α = Trans(h_{α-1}), where Trans denotes a Transformer encoder block and h_{α-1} denotes the encoding result of the previous Transformer encoder block;

Step S202: establish a named entity subject decoder and a named entity object decoder based on fully connected neural networks to extract the candidate subjects and candidate objects in the sub-word representation sequence; take the output of the last encoder block as the input of the decoders, and for each token in the sub-word representation sequence compute the probabilities that the token is a subject span start, a subject span end, an object span start, or an object span end, with the formulas:

p_i^type = σ(W_type · h_i + b_type), type ∈ {start_s, end_s, start_o, end_o}

where W_type and b_type are the learnable weight and bias parameters of the fully connected neural networks respectively, and σ is the sigmoid activation function;

compare each computed probability p_i^type, type ∈ {start_s, end_s, start_o, end_o}, against a preset threshold; if the probability exceeds the threshold, the corresponding decision label d_i^type is assigned 1, otherwise it is assigned 0;

according to these decision labels, output the sequence representations of the subject span starts, subject span ends, object span starts and object span ends: d^{start_s}, d^{end_s}, d^{start_o}, d^{end_o};

Step S203: for a 1 label in the subject start decision sequence d^{start_s}, search rightward in the subject end decision sequence d^{end_s} for the nearest 1 label, so as to form a potential subject span sub_i; perform the same operation on the object decision sequences to output a potential object span obj_j;

perform the above operation on all 1 labels in the subject and object start decision sequences, outputting the potential subject span sequence H_sub = (sub_1, ..., sub_m) and the potential object span sequence H_obj = (obj_1, ..., obj_n) respectively, and combine them pairwise to form the potential subject-object span pair sequence H = (sub_1, obj_1), ..., (sub_{m×n}, obj_{m×n});

where m and n are the numbers of potential subjects and potential objects extracted from the sub-word representation sequence, respectively.
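The start/end tagging and span pairing of steps S202–S203 can be sketched as plain Python; the function names and the 0.5 default threshold are illustrative assumptions, not values fixed by the claims:

```python
def decode_spans(start_probs, end_probs, threshold=0.5):
    """Threshold start/end probabilities into 0/1 decision labels, then pair
    each start label with the nearest end label to its right (step S203)."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]  # d^{start} = 1
    ends = [i for i, p in enumerate(end_probs) if p > threshold]      # d^{end} = 1
    spans = []
    for s in starts:
        right_ends = [e for e in ends if e >= s]  # nearest 1 label to the right
        if right_ends:
            spans.append((s, min(right_ends)))
    return spans

def pair_subjects_objects(sub_spans, obj_spans):
    """Combine the m potential subjects and n potential objects pairwise
    into the m*n candidate subject-object span pairs."""
    return [(s, o) for s in sub_spans for o in obj_spans]
```

For example, start probabilities `[0.9, 0.1, 0.8, 0.1]` and end probabilities `[0.2, 0.95, 0.1, 0.7]` yield the two token spans `(0, 1)` and `(2, 3)`, which are then crossed against the object spans to form the candidate pairs scored in step S50.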
CN202211037489.0A 2022-08-26 2022-08-26 A relation extraction method that integrates entity type representation and relation representation Active CN115391557B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211037489.0A CN115391557B (en) 2022-08-26 2022-08-26 A relation extraction method that integrates entity type representation and relation representation


Publications (2)

Publication Number Publication Date
CN115391557A CN115391557A (en) 2022-11-25
CN115391557B true CN115391557B (en) 2025-12-12

Family

ID=84121638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211037489.0A Active CN115391557B (en) 2022-08-26 2022-08-26 A relation extraction method that integrates entity type representation and relation representation

Country Status (1)

Country Link
CN (1) CN115391557B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116737842B (en) * 2023-05-16 2025-07-15 北京海卓飞网络科技有限公司 Entity relationship display method and device, electronic equipment and computer storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN113051929A (en) * 2021-03-23 2021-06-29 电子科技大学 Entity relationship extraction method based on fine-grained semantic information enhancement
CN113051922A (en) * 2021-04-20 2021-06-29 北京工商大学 Triple extraction method and system based on deep learning

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN111428036B (en) * 2020-03-23 2022-05-27 浙江大学 Entity relationship mining method based on biomedical literature
CN114372454B (en) * 2020-10-14 2024-08-16 腾讯科技(深圳)有限公司 Text information extraction method, model training method, device and storage medium
CN113221567A (en) * 2021-05-10 2021-08-06 北京航天情报与信息研究所 Judicial domain named entity and relationship combined extraction method
CN113887211B (en) * 2021-10-22 2025-02-11 中国人民解放军网络空间部队信息工程大学 Relationship-oriented entity relationship joint extraction method and system
CN114528368B (en) * 2021-11-12 2023-08-25 南京师范大学 Spatial relationship extraction method based on fusion of pre-trained language model and text features
CN114841122A (en) * 2022-01-25 2022-08-02 电子科技大学 Text extraction method combining entity identification and relationship extraction, storage medium and terminal


Also Published As

Publication number Publication date
CN115391557A (en) 2022-11-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant