
CN111008529B - A Chinese Relation Extraction Method Based on Neural Network - Google Patents


Info

Publication number
CN111008529B
CN111008529B
Authority
CN
China
Prior art keywords
bilstma
layer
information
model
chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910669521.9A
Other languages
Chinese (zh)
Other versions
CN111008529A (en)
Inventor
王凯
秦永彬
李婷
陈艳平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Luhao Technology Co ltd
Original Assignee
Guizhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou University filed Critical Guizhou University
Priority to CN201910669521.9A priority Critical patent/CN111008529B/en
Publication of CN111008529A publication Critical patent/CN111008529A/en
Application granted granted Critical
Publication of CN111008529B publication Critical patent/CN111008529B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract


The invention discloses a neural-network-based Chinese relation extraction method that effectively captures the structural and semantic information of sentences. In relation extraction, a single long short-term memory (LSTM) model can only learn features of one particular dimension, whereas a convolutional neural network can use multiple convolution kernels to learn features of different dimensions. Combining these two observations, the invention proposes a multi-layer bidirectional LSTM-attention model: by giving the LSTM hidden layers of different sizes, it automatically extracts abstract, dependency-aware features of different dimensions from the raw input, and it uses an attention mechanism to capture global information. Experiments show that, compared with a multi-kernel convolutional neural network and a single LSTM-attention model, the method significantly improves Chinese relation extraction, achieving an F-score of 71.61% on the ACE RDC 2005 Chinese dataset, which demonstrates its effectiveness.

Description

Chinese relation extraction method based on neural network
Technical Field
The invention relates to the field of information extraction, and in particular to a Chinese relation extraction method based on a neural network. It belongs to the technical fields of natural language processing and machine learning.
Background
With the development of artificial intelligence and the rapid growth of the information extraction field, entity relation extraction has attracted the attention of more and more researchers as an important research topic in that field. The task is to extract the semantic relation between marked entity pairs in a sentence, i.e., on the basis of entity recognition, to determine the relation category between entity pairs in unstructured text and to produce structured data for storage and access. The results of entity relation extraction can be used to construct knowledge graphs or ontology knowledge bases, and can also provide data support for building automatic question-answering systems. In addition, entity relation extraction is of significant research value for semantic-web annotation, discourse understanding, and machine translation.
Early relation extraction was mainly based on grammar rules: the grammatical structures in a sentence were analyzed and used as evidence that a relation occurs. Although such methods achieved good results, recall is hard to improve because the rules are strict, and they require professional grammatical knowledge and a literature basis, so their applicability is limited. As the technology developed, relation extraction methods came to be divided into three types: supervised, semi-supervised, and unsupervised. Given the content of the invention, the supervised relation extraction method is studied here with emphasis. Supervised relation extraction can be regarded as a classification problem, and its methods fall mainly into two groups: shallow structured models and deep learning models.
Shallow structures typically have only one hidden layer, or none at all; examples include support vector machines and maximum entropy models. In relation extraction, shallow structures usually rely on feature engineering or kernel functions. Traditional feature-engineering methods depend on ingeniously designed feature sets produced by language-processing pipelines. Most of these approaches rely on a large number of manually designed features or on carefully designed kernel functions. Despite the help of many excellent NLP tools, performance can still degrade because of errors such as inaccurate word segmentation and syntactic-parsing mistakes. More importantly, the low portability of these hand-crafted features and kernel functions greatly limits their scalability.
In recent years, relation extraction research based on deep learning has advanced greatly. Many extraction methods built on models such as CNNs and RNNs have achieved good results. Numerous neural-network-based approaches demonstrate the advantages of neural networks over traditional shallow structures, but most of these results are obtained on well-balanced English datasets and use many external features as an aid. Chinese grammar, by contrast, has a complex structure and a more severe ambiguity problem.
Disclosure of Invention
The invention provides a Chinese relation extraction method based on a neural network. By giving the long short-term memory model hidden layers of different sizes, the method can automatically extract abstract, dependency-aware features of different dimensions from the raw input, and it captures global information with an attention mechanism. Experiments show that, compared with a multi-kernel convolutional neural network and a single LSTM-attention model, the method significantly improves Chinese relation extraction and obtains a good result on the ACE RDC 2005 Chinese dataset, which demonstrates its effectiveness. The model framework is shown in figure 1.
The technical scheme of the invention is as follows: a neural-network-based Chinese relation extraction method comprising the following steps. Step 1: construct BiLSTMA units and extract the deep semantic information and global dependency information of sentences. Step 2: construct a Multi-BiLSTMA model and acquire semantic information carrying dependency relations of different granularities. Step 3: verify the validity of the method using real data.
Step 1 makes full use of the strength of the bidirectional long short-term memory model (BiLSTM) in handling long-range dependencies and the ability of the attention mechanism (Attention) to capture global dependency information, building BiLSTMA units (BiLSTM-Attention) that extract deep sentence semantics and dependency information.
Step 2 sets hidden layers of different sizes in the BiLSTMA units and combines these differently sized units into a Multi-BiLSTMA model, which can acquire semantic information carrying dependency relations of different granularities.
Step 3 verifies the validity of the method by measuring its recognition performance on the ACE RDC 2005 Chinese dataset.
Advantageous effects
The beneficial effects of the invention are as follows:
The invention emphasizes the ability of a multi-kernel CNN to learn features of different granularities: using BiLSTM and attention mechanisms, a Multi-BiLSTMA model is constructed by setting up BiLSTMs of different sizes, and experiments prove that the method performs excellently on the ACE RDC 2005 Chinese dataset.
The invention provides a Chinese relation extraction method based on a Multi-BiLSTM-Attention neural network model. Experiments prove that the method performs well on the ACE dataset, demonstrating its effectiveness. The method exploits the property of a multi-kernel CNN that features of different granularities can be learned, combines it with BiLSTM, and takes full advantage of a neural network model's ability to extract features automatically. The bidirectional BiLSTM channels are given several hidden layers of different sizes, which prevents feature sparsity to a certain extent, makes effective use of the semantic information of characters, and automatically acquires abstract features of different dimensions. On this basis, an attention mechanism is added: using the local and global features of the sentence, the feature weights are adjusted, noise is reduced, and accuracy is improved.
The method combines the observation that a single LSTM can only learn features of one particular dimension with the fact that the multiple convolution kernels of a CNN learn features of different dimensions, proposing a Multi-BiLSTM-Attention model that obtains excellent results on Chinese relation extraction.
Drawings
FIG. 1 is a neural network model of the Multi-BiLSTM-Attention of the present invention;
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings of the present specification.
Given a sentence with two entities, the relation extraction task is to extract the candidate relation between them. The bidirectional long short-term memory (BiLSTM) model is a variant of the recurrent neural network (RNN); it handles long-range information effectively and avoids exploding gradients, and since BiLSTM and Attention complement each other well, they are used in combination. However, a single, fixed-size BiLSTM can only learn information of one particular dimension, so a Multi-BiLSTMA model is constructed by setting up BiLSTMs of different sizes. The model can then learn dependency-aware features in multiple dimensions.
First, the input layer of the model consists of word vectors obtained by mapping through a randomly initialized lookup table. If the sentence length is L, the vectorized sentence can be expressed as X = [x_1, x_2, ..., x_L], where x_i ∈ R^D is the vector of the i-th word w_i and D is the vector dimension. If the dictionary size is V, the embedding layer can be expressed as X ∈ R^(V×H). This process can be written as X = Embedding(S).
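As a minimal sketch of this embedding step (the function names, vocabulary size, and initialization range are illustrative assumptions, not taken from the patent), a randomly initialized lookup table maps each word index to a D-dimensional vector:

```python
import random

def build_embedding(vocab_size, dim, seed=0):
    """Randomly initialised lookup table: one dim-sized vector per vocabulary entry."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]

def embed(sentence_ids, table):
    """Map a sentence of word indices to the vector sequence X = [x_1, ..., x_L]."""
    return [table[i] for i in sentence_ids]

# Toy usage: a 3-word sentence with D = 100, as in the patent's experiments.
table = build_embedding(vocab_size=5000, dim=100)
X = embed([3, 17, 42], table)
```

In practice this table would be a trainable parameter of the network; here it is a plain list of lists to keep the sketch self-contained.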
Second, the Multi-BiLSTMA layer of the invention is composed of three BiLSTMA units, each consisting of one BiLSTM layer and one Attention layer. As shown in fig. 1(b), the BiLSTMA unit receives the data of the embedding layer, and a forward LSTM and a backward LSTM together form the BiLSTM layer, which extracts deeper features from the embedded input; the two directions are combined by element-wise addition of their hidden states. The Attention layer then merges the information of every time step of the BiLSTM layer and, through its weighting computation, retains the information with the greatest influence on the extraction result.
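The two operations described above, element-wise fusion of the forward and backward hidden states and attention weighting over time steps, can be sketched in plain Python (the scoring function and the parameter vector `w` are simplified assumptions; real BiLSTMA units use learned parameters):

```python
import math

def fuse(forward_h, backward_h):
    """Combine forward and backward LSTM hidden states by element-wise addition."""
    return [[f + b for f, b in zip(ft, bt)] for ft, bt in zip(forward_h, backward_h)]

def attention(H, w):
    """Score each time step, normalise the scores with softmax,
    and return the attention-weighted sum of the hidden states."""
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in H]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]    # attention weights over time steps
    dim = len(H[0])
    return [sum(a * h[d] for a, h in zip(alphas, H)) for d in range(dim)]

# Toy usage: 2 time steps, hidden size 2.
H = fuse([[1.0, 2.0], [0.5, 0.5]], [[0.0, 1.0], [0.5, 1.5]])
r = attention(H, w=[1.0, 0.0])
```

The weighted sum `r` plays the role of the unit's output: time steps whose states score higher under `w` contribute more to it.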
The next component is the fully connected layer of the model. After the outputs of the three BiLSTMA units are concatenated, the learned information is classified by a fully connected (Dense) layer whose hidden size equals the number of relation types, namely 7.
Finally, to obtain a better experimental result, a softmax layer normalizes the output of the fully connected layer and produces the final classification. This process can be summarized as y = softmax(D).
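The classification head, a dense layer producing one logit per relation type followed by softmax, can be sketched as follows (the feature dimension and the weight values are toy assumptions; in the model the input would be the concatenated outputs of the three BiLSTMA units):

```python
import math

def dense(x, W, b):
    """Fully connected layer: one logit per relation type."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(logits):
    """Normalise logits into a probability distribution y = softmax(D)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Toy usage: a 2-dim feature vector classified into 7 relation types
# (the 6 positive types plus "Other"), with illustrative weights.
x = [0.3, -0.1]
W = [[0.1 * k, -0.05 * k] for k in range(7)]
b = [0.0] * 7
y = softmax(dense(x, W, b))
```

The predicted relation is simply the index of the largest probability in `y`.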
The validity of the method is verified with real data: the ACE RDC 2005 standard Chinese dataset is selected, and the data is first preprocessed.
The invention uses the publicly released ACE RDC 2005 Chinese dataset for relation extraction. After non-standard documents are screened out, the experiment uses 628 documents in total. This dataset contains 6 entity relation types (collectively the positive examples): "PART-WHOLE", "PHYS", "ORG-AFF", "GEN-AFF", "PER-SOC", and "ART". The relations in the dataset are directional; for example, if the entity pair (A, B) has an ART relation in the dataset but the pair (B, A) has no marked relation type, then all such unmarked cases are collectively called negative examples and labeled "Other". Because relation extraction is mainly performed at sentence level, the text in the dataset is cut into sentences on five Chinese punctuation marks, including the comma, period, exclamation mark, and question mark. Sentences without entity pairs are discarded, and sentences repeated between the positive and negative examples are removed (since the same sentence cannot be both a positive and a negative example), yielding 101056 sentences in total, comprising 9244 positive sentences and 91812 negative sentences. The ACE RDC 2005 Chinese dataset is unevenly distributed: the relation types are not uniform, and negative examples account for as much as 90.85%. To stay closer to the real situation and reduce the influence of the large amount of negative data, only the positive-example results are evaluated.
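The sentence-splitting and de-duplication steps above can be sketched as follows. The exact set of five punctuation marks is partly garbled in the source text, so the delimiter set below is an illustrative assumption:

```python
import re

# Assumed delimiter set; the patent names five Chinese punctuation marks
# but the source listing is partly garbled.
SENT_DELIMS = "，。！？；"

def split_sentences(text):
    """Cut raw text into sentence-level units on Chinese punctuation marks."""
    parts = re.split("[" + SENT_DELIMS + "]", text)
    return [p for p in parts if p]

def dedup(positives, negatives):
    """Drop negatives that also occur as positives:
    the same sentence cannot be both a positive and a negative example."""
    pos = set(positives)
    return positives, [s for s in negatives if s not in pos]

sents = split_sentences("今天下雨。你带伞了吗？带了！")
kept_pos, kept_neg = dedup(["甲在乙公司工作"], ["甲在乙公司工作", "丙去了丁市"])
```

After this pass, sentences without entity pairs would additionally be filtered out before training.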
Second, for word-vector processing, a randomly initialized lookup table is used and continuously adjusted during training, with the word-vector dimension set to 100. Since a neural network requires fixed-size input, the average sentence length of each relation type is analyzed. To balance extraction quality against training cost, a maximum input length of 50 is chosen: sentences shorter than 50 are padded to 50 with zeros, and sentences longer than 50 are truncated to 50. AdaDelta is selected as the optimization function, with the learning rate left at its default of 1.0. The batch size is set to 50 and the number of iterations to 100. Experiments settle on three BiLSTMA units, with hidden-layer sizes of 100, 200, and 300 respectively.
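The fixed-length input step described above, pad with zeros up to 50 or truncate down to 50, is straightforward to sketch (the pad value 0 matches the patent; the function name is illustrative):

```python
MAX_LEN = 50   # maximum input length chosen in the patent's experiments
PAD_ID = 0     # sentences shorter than MAX_LEN are filled with zeros

def pad_or_truncate(ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Fix every sentence to exactly max_len indices,
    since the neural network requires fixed-size input."""
    if len(ids) >= max_len:
        return ids[:max_len]
    return ids + [pad_id] * (max_len - len(ids))

short = pad_or_truncate([5, 9, 2])          # padded up to length 50
long_s = pad_or_truncate(list(range(80)))   # truncated down to length 50
```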
Finally, to demonstrate the effectiveness of the method, three tasks are designed on the same data. The first task performs relation extraction with a multi-kernel CNN, which serves as the baseline model. The second task uses a single-layer BiLSTMA; experiments show that the combination of BiLSTM and Attention outperforms the plain multi-kernel CNN. The third task uses the Multi-BiLSTMA model, which retains the multi-granularity property of the multi-kernel CNN while fully exploiting the advantages of BiLSTM and Attention, and improves significantly on both of the previous results.
After 5-fold cross-validation experiments, the performance is shown in table 1 (the F-scores of the three models are shown in bold).
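A minimal sketch of the 5-fold split underlying the cross-validation (the round-robin assignment is an illustrative choice; the patent does not specify how folds are drawn):

```python
def k_fold_indices(n, k=5):
    """Split n example indices into k near-equal, disjoint folds.
    Each fold serves once as the test set, the rest as training data."""
    folds = [[] for _ in range(k)]
    for i in range(n):
        folds[i % k].append(i)   # round-robin assignment
    return folds

folds = k_fold_indices(10, k=5)
```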
Table 1 relation extraction task Performance
The number of instances of each relation type is unbalanced, and this is reflected directly in the results of table 1. Overall, types with more data achieve higher scores, which is consistent with the behavior of neural networks: the larger the amount of data, the more fully the model is trained and the less likely it is to overfit, so the results are better under the same data quality and model. Accordingly, the F-scores of the three classes "PART-WHOLE", "ORG-AFF", and "GEN-AFF" are significantly higher than those of the other three positive types, precisely because these three classes have large amounts of data.
Meanwhile, table 1 shows that the single-layer BiLSTMA outperforms the plain multi-kernel CNN: compared with a CNN, BiLSTMA captures the dependency information and key features in a sentence more effectively and therefore extracts relations better. The Multi-BiLSTMA combines the strengths of both, so its performance is clearly superior to either. In summary, the neural-network-based Chinese relation extraction method of the invention performs excellently.
Matters not described in detail in this application are well known to those skilled in the art. Finally, it is noted that the above embodiments only illustrate the technical solution of the invention and do not limit it; although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications and equivalents may be made without departing from the spirit and scope of the technical solution, all of which are intended to be covered by the scope of the claims of the invention.

Claims (2)

1. The Chinese relation extraction method based on the neural network is characterized by comprising the following steps of:
step 1: constructing a BiLSTMA unit, and extracting deep semantic information and global dependency information of sentences; fully utilizing the advantages of the two-way long-short-term memory model in the aspect of processing long-term dependence and the characteristic that the attention mechanism can capture global dependence information, and constructing a BiLSTMA unit to extract sentence deep semantic information and dependence information;
step 2: constructing a Multi-BiLSTMA model, and acquiring semantic information with dependency relations of different granularities; through setting hidden layers with different sizes in BiLSTMA units, combining the BiLSTMA units with different sizes to construct a Multi-BiLSTMA model, wherein the model can acquire semantic information with dependency relations with different granularities, and the specific steps are as follows:
firstly, an input layer of a model consists of word vectors mapped by a lookup table obtained through initialization;
secondly, the Multi-BiLSTMA layer consists of three BiLSTMA units, wherein each BiLSTMA unit consists of a layer of bidirectional LSTM and a layer of Attention, the Attention layer merges information on each time step in the BiLSTM layer, and information with great influence on an extraction result is obtained through calculation; after the outputs of the three BiLSTMA units are spliced, classifying the information learned by the model through a layer of full-connection layer, wherein the size of the hidden layer is the relation type number; normalizing the output result of the full-connection layer by using the softmax layer to obtain a final classification result;
step 3: the validity of the method is verified using the authentic data.
2. The neural-network-based Chinese relation extraction method of claim 1, wherein step 3 uses the ACE RDC 2005 Chinese dataset to verify the recognition effect of the method, thereby verifying its validity.
CN201910669521.9A 2019-07-24 2019-07-24 A Chinese Relation Extraction Method Based on Neural Network Active CN111008529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910669521.9A CN111008529B (en) 2019-07-24 2019-07-24 A Chinese Relation Extraction Method Based on Neural Network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910669521.9A CN111008529B (en) 2019-07-24 2019-07-24 A Chinese Relation Extraction Method Based on Neural Network

Publications (2)

Publication Number Publication Date
CN111008529A CN111008529A (en) 2020-04-14
CN111008529B true CN111008529B (en) 2023-07-21

Family

ID=70111470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910669521.9A Active CN111008529B (en) 2019-07-24 2019-07-24 A Chinese Relation Extraction Method Based on Neural Network

Country Status (1)

Country Link
CN (1) CN111008529B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111831783B (en) * 2020-07-07 2023-12-08 北京北大软件工程股份有限公司 A method for extracting chapter-level relationships
CN114647726B (en) * 2022-03-04 2024-08-06 贵州大学 News webpage information extraction method, system, equipment and medium based on multidimensional text features

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9799327B1 (en) * 2016-02-26 2017-10-24 Google Inc. Speech recognition with attention-based recurrent neural networks
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A textual entailment relation recognition method fused with multi-granularity information
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 Sentiment analysis method based on attention-enhanced bidirectional LSTM model
CN109740148A (en) * 2018-12-16 2019-05-10 北京工业大学 A kind of text emotion analysis method of BiLSTM combination Attention mechanism
CN109858032A (en) * 2019-02-14 2019-06-07 程淑玉 Merge more granularity sentences interaction natural language inference model of Attention mechanism

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3454260A1 (en) * 2017-09-11 2019-03-13 Tata Consultancy Services Limited Bilstm-siamese network based classifier for identifying target class of queries and providing responses thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9799327B1 (en) * 2016-02-26 2017-10-24 Google Inc. Speech recognition with attention-based recurrent neural networks
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A textual entailment relation recognition method fused with multi-granularity information
CN109740148A (en) * 2018-12-16 2019-05-10 北京工业大学 A kind of text emotion analysis method of BiLSTM combination Attention mechanism
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 Sentiment analysis method based on attention-enhanced bidirectional LSTM model
CN109858032A (en) * 2019-02-14 2019-06-07 程淑玉 Merge more granularity sentences interaction natural language inference model of Attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on natural language inference with multi-granularity sentence interaction fused with Attention" (融合Attention多粒度句子交互自然语言推理研究); Cheng Shuyu et al.; Journal of Chinese Computer Systems (《小型微型计算机系统》); 2019-06-14 (Issue 06); sections 2-3 *

Also Published As

Publication number Publication date
CN111008529A (en) 2020-04-14

Similar Documents

Publication Publication Date Title
CN110598713B (en) Intelligent image automatic description method based on deep neural network
Xu et al. Investigation on the Chinese text sentiment analysis based on convolutional neural networks in deep learning.
CN110020438B (en) Method and Device for Disambiguating Chinese Name Entity of Enterprise or Organization Based on Sequence Recognition
CN108984526B (en) Document theme vector extraction method based on deep learning
CN107766324B (en) A Text Consistency Analysis Method Based on Deep Neural Network
CN107133213B (en) A method and system for automatic extraction of text summaries based on algorithm
WO2023065544A1 (en) Intention classification method and apparatus, electronic device, and computer-readable storage medium
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN104915448B (en) A kind of entity based on level convolutional network and paragraph link method
CN111738004A (en) A training method for a named entity recognition model and a method for named entity recognition
CN110321563B (en) Text emotion analysis method based on hybrid supervision model
CN113051886B (en) A test question duplication checking method, device, storage medium and equipment
CN111738003A (en) Named entity recognition model training method, named entity recognition method and medium
CN107301246A (en) Chinese Text Categorization based on ultra-deep convolutional neural networks structural model
CN112417153B (en) Text classification method, device, terminal device and readable storage medium
CN107562784A (en) Short text classification method based on ResLCNN models
CN113220876B (en) A multi-label classification method and system for English text
CN108737406A (en) A kind of detection method and system of abnormal flow data
CN106547739A (en) A kind of text semantic similarity analysis method
WO2023159767A1 (en) Target word detection method and apparatus, electronic device and storage medium
CN113011194A (en) Text similarity calculation method fusing keyword features and multi-granularity semantic features
WO2017193685A1 (en) Method and device for data processing in social network
CN112232053A (en) A text similarity calculation system, method, and storage medium based on multi-keyword pair matching
CN107273913A (en) A kind of short text similarity calculating method based on multi-feature fusion
CN109918507B (en) textCNN (text-based network communication network) improved text classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20241015

Address after: No. 001, Building 3, Computing Power Center, Guiyang Big Data Science and Technology Innovation City, Huchao Township, Gui'an New District, Guiyang City, Guizhou Province, 550000

Patentee after: Guizhou Luhao Technology Co.,Ltd.

Country or region after: China

Address before: Science and Technology Department of Huaxi north campus, Guizhou University, Huaxi District, Guiyang City, Guizhou Province

Patentee before: Guizhou University

Country or region before: China
