
CN116028812B - A method for constructing a pipeline multi-event extraction model - Google Patents

A method for constructing a pipeline multi-event extraction model Download PDF

Info

Publication number
CN116028812B
CN116028812B (application CN202211733205.1A)
Authority
CN
China
Prior art keywords
event
type
data set
model
constructing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211733205.1A
Other languages
Chinese (zh)
Other versions
CN116028812A (en)
Inventor
迟雨桐
冯少辉
张建业
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Iplus Teck Co ltd
Original Assignee
Beijing Iplus Teck Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Iplus Teck Co ltd filed Critical Beijing Iplus Teck Co ltd
Priority to CN202211733205.1A priority Critical patent/CN116028812B/en
Publication of CN116028812A publication Critical patent/CN116028812A/en
Application granted granted Critical
Publication of CN116028812B publication Critical patent/CN116028812B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Machine Translation (AREA)

Abstract


The present invention relates to a method for constructing a pipeline multi-event extraction model, belonging to the technical field of natural language processing. It solves the problem that existing event extraction models are prone to missed recognitions and unmatched event elements when a corpus contains many events or overlapping events, resulting in low accuracy. An event feature data set is built from an original data set, and a training set containing positive and negative samples of event types and event elements is further constructed; a T5 model is then trained on this set, so that the model effectively learns the internal relations among event types, event roles, event elements and trigger words, and in particular gains improved understanding and prediction of multiple events. The overall training process uses prompt information (prompts), which guarantees extraction accuracy and fidelity to a certain extent, yielding an event extraction model with a high recognition rate on event text.

Description

Construction method of pipeline type multi-event extraction model
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method for constructing a pipeline type multi-event extraction model.
Background
Event extraction (EE) is one of the important tasks in the field of natural language processing (NLP). Its purpose is to identify the event types (event type), event trigger words (trigger), event elements (argument), and element roles (argument role) contained in a given corpus. Event extraction technology is now applied very widely: it can efficiently extract useful information from large volumes of text and provides powerful data support for knowledge graph construction.
The mainstream extraction methods of existing event extraction models are sequence labeling, pointer discrimination, and generation. Sequence labeling is essentially multi-label multi-class classification that predicts the possible labels of each token; pointer discrimination extracts events by predicting the start and end positions of the text span corresponding to each label; and the generative method is an end-to-end (end2end) approach that extracts context information with a deeper network and directly outputs event information in text form. All three methods perform well on corpora with a single event or a small number of non-overlapping events, but when a corpus contains many events, and especially when one or more elements overlap, missed recognitions, recognition errors, and unmatched event elements readily occur, so accuracy becomes very low. Because overlapping multiple events are ubiquitous in real corpora, a better method for constructing a multi-event extraction model is needed to solve the low extraction accuracy caused by missed recognitions and unmatched event elements in prior-art multi-event and overlapping multi-event extraction tasks.
Disclosure of Invention
In view of the above analysis, an embodiment of the invention aims to provide a method for constructing a pipeline multi-event extraction model, to solve the low extraction accuracy caused by missed recognitions and unmatched event elements in prior-art multi-event and overlapping multi-event extraction tasks.
In one aspect, an embodiment of the present invention provides a method for constructing a pipeline multi-event extraction model, including the following steps:
Acquiring marked text data as an original data set;
Obtaining an event feature data set based on the original data set, and further constructing an event type positive sample data set D+1, an event element positive sample data set D+2, an event type full negative sample data set D-1 and an event element random negative sample data set D-2, to finally obtain a model training data set D_all;
Training a T5 model with the training data set D_all to obtain a trained pipeline multi-event extraction model M_trained;
And, when extracting multiple events, constructing the prediction sample set of each step in turn, where the trained model M_trained obtains the prediction result of each step from that step's prediction sample set, and the results are integrated into the final extraction result.
Further, the obtaining the noted text data includes:
acquiring original text data;
Annotating the original text data, which includes determining the event types contained in the sentences of the text data, extracting the trigger words, event elements and their positions according to the event types, and labeling the event elements with suitable event role labels.
Further, the event feature data set includes:
A correspondence schema of event types to all their event roles, a set S_type_role of (event type, single event role) pairs, a set S_type of all event types, a set S_trigger of all trigger words, and a set S_argument of all event elements. The schema records every event type in the original data set together with all event roles corresponding to it; S_type_role is derived from the schema and contains the pairwise combinations of each event's type with each of its event roles; S_type records all event types; S_trigger records all trigger words appearing in the original data set; and S_argument records all event elements contained in the original data set.
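The five event feature data sets can be sketched as follows; the annotation layout (`events` entries with `type`, `trigger`, and `arguments`) is an assumption made for illustration, not the patent's exact storage format.

```python
# Sketch: build schema, S_type_role, S_type, S_trigger, S_argument from an
# annotated corpus. The annotation format below is assumed for illustration.
def build_feature_sets(dataset):
    schema = {}            # event type -> ordered list of its event roles
    s_type_role = set()    # (event type, single event role, role position)
    s_trigger, s_argument = set(), set()
    for sample in dataset:
        for event in sample["events"]:
            etype = event["type"]
            roles = [a["role"] for a in event["arguments"]]
            schema.setdefault(etype, roles)
            for pos, role in enumerate(roles, start=1):
                s_type_role.add((etype, role, pos))
            s_trigger.add(event["trigger"])
            s_argument.update(a["text"] for a in event["arguments"])
    s_type = set(schema)   # all event types recorded in the schema
    return schema, s_type_role, s_type, s_trigger, s_argument

corpus = [{"text": "A公司收购B公司",
           "events": [{"type": "收购", "trigger": "收购",
                       "arguments": [{"role": "收购方", "text": "A公司"},
                                     {"role": "被收购方", "text": "B公司"}]}]}]
schema, s_type_role, s_type, s_trigger, s_argument = build_feature_sets(corpus)
```

The schema keeps role order per event type, which the embodiment later relies on when building prompts.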
Further, the model training dataset D all is constructed by the following steps:
Summarizing and sorting the annotation information of the original data set to obtain three event feature data sets: the correspondence schema of event types and all event roles, the set S_type_role of event types and single event roles, and the set S_type of all event types;
constructing the event type positive sample data set D+1 and the event element positive sample data set D+2, together with two further event feature data sets, the set S_trigger of all trigger words and the set S_argument of all event elements occurring in the original data set, from the original data set and the schema;
constructing the event type full negative sample data set D-1 from the event type positive sample data set D+1 and the event type data set S_type;
constructing the event element random negative sample data set D-2 from the event element positive sample data set D+2, the trigger word set S_trigger, the event element set S_argument and the set S_type_role of event types and single event roles;
mixing and shuffling D+1, D+2, D-1 and D-2 to finally obtain the model training data set D_all.
Further, the event type positive sample data set D +1 and the event element positive sample data set D +2 are constructed by the following steps:
A1. Extract, for an event contained in text data text_p of the original data set, its event type e_type, trigger word w_trigger, event roles e_role_1~e_role_n, and corresponding event elements w_arg_1~w_arg_n (n is the number of event roles contained in the event, equal to the number of event elements). The input of the event type positive sample for the event is text_p + e_type + "trigger word" and its output is w_trigger; the input of the event element positive sample is text_p + prompt_arg and its output is the corresponding event element, where the event element prompt prompt_arg for the p-th event role can be obtained by the following formula:
prompt_arg = e_type + w_trigger + w_arg_1 + … + w_arg_(p-1) + e_role_p
A2. Construct an event type positive sample and event element positive samples for every event in the text data text_p by the method of A1, obtaining the event type positive sample data set D+1 and the event element positive sample data set D+2.
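Steps A1-A2 can be sketched as follows; the "-" separator (taken from the Table 2 convention below), the sample layout, and the one-output-per-role reading of the formula are illustrative assumptions rather than the patent's exact implementation.

```python
SEP = "-"  # the embodiment divides prompt elements with "-" (other symbols allowed)

def build_positive_samples(text, event, schema):
    """Build the event-type positive sample and one event-element positive
    sample per role, in the text+prompt input format described above."""
    etype, trigger = event["type"], event["trigger"]
    roles = schema[etype]                       # role order follows the schema
    elements = [event["arguments"][r] for r in roles]
    # Event-type positive sample: text + e_type + "trigger word" -> w_trigger
    type_sample = {"input": SEP.join([text, etype, "触发词"]), "output": trigger}
    # Event-element positive samples: the prompt grows with earlier elements
    element_samples = []
    for i, role in enumerate(roles):
        prompt = SEP.join([etype, trigger] + elements[:i] + [role])
        element_samples.append({"input": text + SEP + prompt,
                                "output": elements[i]})
    return type_sample, element_samples

schema = {"收购": ["收购方", "被收购方"]}
event = {"type": "收购", "trigger": "收购",
         "arguments": {"收购方": "A公司", "被收购方": "B公司"}}
ts, es = build_positive_samples("A公司收购B公司", event, schema)
```

Note how each successive prompt carries the previously known elements before naming the next role to fill.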
Further, the event type full negative sample dataset D -1 is constructed by the following steps:
B1. Change the e_type of an event type positive sample, in turn, to each of the other event types in the event type data set S_type, with the target output set to null, to obtain the event type full negative samples of the event;
B2. Apply the method of B1 to all events in the event type positive sample data set D+1 to construct the event type full negative sample data set D-1.
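A minimal sketch of B1-B2, assuming the same "-" separator as the positive samples; per the later remark in the embodiment, an event with m event types yields m-1 full negative samples.

```python
def build_type_negatives(text, etype, s_type, sep="-"):
    """Event-type full negative samples: swap e_type for every other event
    type in S_type; the target output of each negative sample is empty."""
    return [{"input": sep.join([text, other, "触发词"]), "output": ""}
            for other in sorted(s_type) if other != etype]

negs = build_type_negatives("A公司收购B公司", "收购", {"收购", "上市", "融资"})
```

With m = 3 event types, the single "收购" event produces exactly m - 1 = 2 negatives.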
Further, the event element random negative sample dataset D -2 is constructed by:
(1) Find all event element positive samples of an event in D+2, take all their event element prompts prompt_arg, and form the set S_prompt;
(2) Randomly select a trigger word from S_trigger to obtain w_trigger_random, and randomly select an element from S_type_role to obtain an event type e_type_random, an event role e_role_random and the position p of that event role;
(3) Randomly select p event elements from the event element set S_argument to obtain w_arg_r_1~w_arg_r_p, and combine them in the following format to obtain the random event element prompt prompt_arg_random:
prompt_arg_random = e_type_random + w_trigger_random + w_arg_r_1 + … + w_arg_r_p + e_role_random
(4) Judge whether prompt_arg_random already exists in S_prompt; if it does, repeat steps (2)-(4); if not, construct a negative sample with prompt_arg_random and add prompt_arg_random to S_prompt;
(5) Repeat steps (1)-(4) until 5n event element random negative samples are obtained;
(6) Apply steps (1)-(5) to all event samples in D+2 to construct the event element random negative sample data set D-2.
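Steps (1)-(5) can be sketched as a rejection-sampling loop; the "-" separator and the record layout are illustrative assumptions, and a seeded generator is used so the example is reproducible.

```python
import random

def build_element_negatives(pos_prompts, s_trigger, s_type_role,
                            s_argument, n_neg, rng, sep="-"):
    """Event-element random negative samples: draw random prompts and keep
    only those not already present in S_prompt; outputs are always empty."""
    s_prompt = set(pos_prompts)
    negatives = []
    while len(negatives) < n_neg:
        w_trigger = rng.choice(sorted(s_trigger))
        e_type, e_role, p = rng.choice(sorted(s_type_role))
        args = [rng.choice(sorted(s_argument)) for _ in range(p)]
        prompt = sep.join([e_type, w_trigger] + args + [e_role])
        if prompt in s_prompt:
            continue  # collision with a known prompt: redraw (steps 2-4)
        s_prompt.add(prompt)
        negatives.append({"input_prompt": prompt, "output": ""})
    return negatives

negs = build_element_negatives(
    {"收购-收购-收购方"},            # S_prompt seeded with a positive prompt
    {"收购"}, {("收购", "收购方", 1)},
    {"A公司", "B公司"}, 2, random.Random(0))
```

The de-duplication against S_prompt is what keeps a random prompt from accidentally reproducing a positive sample's input.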
Further, the training of the T5 model includes:
Dividing the model training data set D_all in a certain ratio to obtain a training set D_train, a validation set D_eval and a test set D_test;
fine-tuning the T5 model with the training set D_train for n rounds, validating with the validation set D_eval after each round, taking the round with the best validation result as the final model, and testing with the test set D_test, to finally obtain the trained model M_trained;
Model loss is calculated and parameters are updated during training using the following formula:
Loss = CrossEntropy(x_pred, x_gold)
where x_pred is the prediction result and x_gold is the target output.
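The loss above is the standard token-level cross-entropy between the predicted distribution and the gold sequence. A minimal, framework-agnostic sketch (illustrative only — in practice the loss would come from the T5 training framework):

```python
import math

def cross_entropy(pred_probs, gold_ids):
    """Token-level cross-entropy: mean negative log-probability that the
    model assigns to each gold token of the target sequence."""
    return -sum(math.log(p[g]) for p, g in zip(pred_probs, gold_ids)) / len(gold_ids)

# Two target tokens; the model puts probability 0.5 on each gold token,
# so the loss equals ln 2.
probs = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]
loss = cross_entropy(probs, [0, 1])
```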
Further, the step-by-step construction of the prediction sample set of each step includes:
Constructing the first-step prediction sample set D_step_1 based on the text to be extracted and the event feature data set;
constructing, from the text to be extracted, the event feature data set and the previous step's prediction result of the model M_trained, the prompt information prompt, and building the next step's prediction sample set in the structure text + prompt, so that the prediction sample sets of steps 2 to n+1, D_step_2~D_step_(n+1), are constructed in sequence.
Further, the trained model M_trained obtains the prediction result of each step from that step's prediction sample set, and the results are integrated into the final extraction result, as follows:
inputting D_step_1 into the model M_trained to obtain, as the first-step prediction result, all trigger words p_trigger contained in the text;
constructing the prediction sample sets of steps 2 to n+1, D_step_2~D_step_(n+1), in the format text + prompt_x, and inputting each D_step_x into the model M_trained to obtain, for each trigger word, the (x-1)-th event element corresponding to the (x-1)-th event role of its event type, where x ∈ [2, n+1] and prompt_x takes the form:
prompt_x = e_type + w_trigger + w_arg_1 + … + w_arg_(x-2) + e_role_(x-1)
and combining the prompt information of the last step with the extraction result to obtain the complete event.
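The stepwise extraction can be sketched with a stub standing in for M_trained; the lookup-table "model" and the "-" separator are illustrative assumptions.

```python
def extract_events(text, schema, model, sep="-"):
    """Layer-by-layer extraction: first query a trigger word per event type,
    then query one event role at a time, carrying each predicted element
    into the next step's prompt."""
    events = []
    for etype, roles in schema.items():
        trigger = model(sep.join([text, etype, "触发词"]))
        if not trigger:
            continue  # no output: the text contains no event of this type
        elements = []
        for role in roles:
            prompt = sep.join([etype, trigger] + elements + [role])
            elements.append(model(sep.join([text, prompt])))
        events.append({"type": etype, "trigger": trigger,
                       "arguments": dict(zip(roles, elements))})
    return events

# Stub model: answers from a lookup table keyed by the full input string.
answers = {
    "A公司收购B公司-收购-触发词": "收购",
    "A公司收购B公司-收购-收购-收购方": "A公司",
    "A公司收购B公司-收购-收购-A公司-被收购方": "B公司",
}
result = extract_events("A公司收购B公司",
                        {"收购": ["收购方", "被收购方"], "上市": ["上市公司"]},
                        lambda x: answers.get(x, ""))
```

The "上市" type returns no trigger and is dropped, matching the empty-output behavior learned from the negative samples.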
Compared with the prior art, the invention has at least one of the following beneficial effects:
1. A training set containing positive and negative samples of event types and event elements is constructed from the original data set, and the T5 model is trained on it, so that the model effectively learns the internal relations among event types, event roles, event elements and trigger words; in particular, the model's understanding and prediction of multiple events is improved. The whole training process uses prompt information (prompts), guaranteeing extraction accuracy and fidelity to a certain extent and yielding an event extraction model with a high recognition rate on event text.
2. Event text is extracted with the trained model layer by layer using prompt information (prompts): all event types serve as prompts to extract the corresponding trigger words; then the trigger words and the element roles to be extracted are added to the prompt step by step to extract the event elements; after all event elements of an event type have been extracted, the prompt information of the last step is combined with the extraction result to obtain the complete event.
In the invention, the technical schemes can be mutually combined to realize more preferable combination schemes. Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, like reference numerals being used to refer to like parts throughout the several views.
FIG. 1 is a schematic flow chart of a method for constructing a pipeline multi-event extraction model according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an overall implementation flow of actual prediction in a method for constructing a pipeline multi-event extraction model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a training data constructing process according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of obtaining a prediction result according to an embodiment of the present invention.
Detailed Description
The following detailed description of preferred embodiments of the application is made in connection with the accompanying drawings, which form a part hereof, and together with the description of the embodiments of the application, are used to explain the principles of the application and are not intended to limit the scope of the application.
The invention discloses a method for constructing a pipeline type multi-event extraction model, which is shown in fig. 1 and comprises the following steps:
Step S110, obtaining marked text data as an original data set;
step S120, obtaining an event feature data set based on the original data set, and further constructing an event type positive sample data set D+1, an event element positive sample data set D+2, an event type full negative sample data set D-1 and an event element random negative sample data set D-2, to finally obtain a model training data set D_all;
Step S130, training a T5 model with the training data set D_all to obtain a trained pipeline multi-event extraction model M_trained;
and, when extracting multiple events, constructing the prediction sample set of each step in turn, where the trained model M_trained obtains the prediction result of each step from that step's prediction sample set, and the results are integrated into the final extraction result.
The embodiment of the invention trains the T5 model with a training set containing positive and negative samples of event types and event elements to obtain the multi-event extraction model. Because the training set is constructed from the original data set, the model effectively learns the internal relations among event types, event roles, event elements and trigger words; in particular, its understanding and prediction of multiple events is improved. The whole training process uses prompt information (prompts), guaranteeing extraction accuracy and fidelity to a certain extent and yielding an event extraction model with a high recognition rate on event text.
On the basis of the above embodiment, specifically, the noted text data in the above step S110 is obtained by the following method:
directly using the Baidu event extraction data set; or
annotating acquired original text data: determining the event types contained in the sentences of the text data, extracting the trigger words, event elements and their positions according to the event types, and labeling the event elements with suitable event role labels.
Specifically, the step S120 may be further optimized as the following steps:
Step S210, summarizing and sorting the annotation information of the original data set to obtain three event feature data sets: the correspondence schema of event types and all event roles, the set S_type_role of event types and single event roles, and the set S_type of all event types;
specifically, all event types and event roles in the original data set are summarized to construct the schema, S_type_role and S_type: the schema records every event type in the original data set together with all its corresponding event roles; S_type_role, derived from the schema, contains the pairwise combinations of each event's type with each of its event roles; and S_type records all event types. Preferably, the schema is stored as a JSON file, and both S_type_role and S_type are stored as sets.
For example, for an event of type "acquisition" whose event roles include "acquisition time" and "acquirer", its records in the schema, S_type_role and S_type are shown in Table 1.
Table 1: Records of an event of type "acquisition" in the schema, S_type_role and S_type
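Table 1's body is not reproduced in this text; the following is a plausible rendering of the three records for the "acquisition" (收购) example, with the schema stored as JSON as the embodiment prefers. The exact role names are taken from the example above.

```python
import json

# Hypothetical reconstruction of Table 1's records for an "acquisition"
# event whose roles are "acquisition time" and "acquirer".
schema = {"收购": ["收购时间", "收购方"]}          # event type -> all its roles
s_type_role = {("收购", "收购时间", 1),            # type paired with each
               ("收购", "收购方", 2)}              # single role and its position
s_type = {"收购"}                                  # all event types

schema_json = json.dumps(schema, ensure_ascii=False)  # stored as a JSON file
```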
Step S220, constructing the event type positive sample data set D+1 and the event element positive sample data set D+2, together with two further event feature data sets, the set S_trigger of all trigger words and the set S_argument of all event elements occurring in the original data set, from the original data set and the schema;
specifically, the construction proceeds as follows:
(1) Extract, for an event contained in text data text_p of the original data set, its event type e_type, trigger word w_trigger, event roles e_role_1~e_role_n, and corresponding event elements w_arg_1~w_arg_n (n is the number of event roles contained in the event, equal to the number of event elements). The input of the event type positive sample for the event is text_p + e_type + "trigger word" and its output is w_trigger; the input of the event element positive sample is text_p + prompt_arg and its output is the corresponding event element, where the event element prompt prompt_arg for the p-th event role can be obtained by the following formula:
prompt_arg = e_type + w_trigger + w_arg_1 + … + w_arg_(p-1) + e_role_p
(2) Construct an event type positive sample and event element positive samples for every event in the text data text_p by the method of (1), obtaining the event type positive sample data set D+1 and the event element positive sample data set D+2;
(3) Store the trigger words w_trigger of all events in the text data text_p in the trigger word set S_trigger, and store all event elements w_arg_1~w_arg_n in the event element set S_argument, obtaining the trigger word data set S_trigger and the event element data set S_argument.
Illustratively, for an event of type "acquisition", the constructed event type positive sample and event element positive samples, and the entries saved in S_trigger and S_argument, are shown in Table 2.
Table 2: Event type positive sample and event element positive samples of an event of type "acquisition", with the entries saved in S_trigger and S_argument
In this example, the elements of the prompt information are divided by "-"; other symbols or spaces may also be used. When an event element positive sample is constructed, the order in which the event elements appear in prompt_arg must be consistent with the order recorded in the schema.
For complex events, the output may be multiple event elements; when the input is constructed, an event element prompt needs to be constructed for each. Table 3 shows an example of event element positive samples for multiple events sharing a trigger word:
Table 3: Event element positive samples for multiple events sharing a trigger word
Step S230, constructing the event type full negative sample data set D-1 from the event type positive sample data set D+1 and the event type data set S_type;
specifically, constructing the event type full negative sample data set D-1 includes:
(1) changing the e_type of an event type positive sample, in turn, to each of the other event types in the event type data set S_type, with the target output set to null, to obtain the event type full negative samples of the event;
(2) applying the method of (1) to all events in the event type positive sample data set D+1 to construct the event type full negative sample data set D-1.
For example, if there are m event types in the event type data set S_type, then each event has m-1 event type full negative samples.
For model training, a positive sample is a sample with a target output and a negative sample is a sample with an empty output; adding negative samples during training can effectively improve the recognition accuracy of the model.
Step S240, constructing the event element random negative sample data set D-2 from the event element positive sample data set D+2, the trigger word set S_trigger, the event element set S_argument, and the set S_type_role of event types and single event roles;
the input format of an event element random negative sample is identical to that of an event element positive sample; the difference is that its prompt information differs and its output is always null. For an event, the number of event element random negative samples is generally recommended to be 5 times the number of event element positive samples.
Specifically, the event element random negative sample data set D-2 is constructed as follows:
(1) Find all event element positive samples of an event in D+2, take all their event element prompts prompt_arg, and form the set S_prompt;
(2) Randomly select a trigger word from S_trigger to obtain w_trigger_random, and randomly select an element from S_type_role to obtain an event type e_type_random, an event role e_role_random and the position p of that event role;
(3) Randomly select p event elements from the event element set S_argument to obtain w_arg_r_1~w_arg_r_p, and combine them in the following format to obtain the random event element prompt prompt_arg_random:
prompt_arg_random = e_type_random + w_trigger_random + w_arg_r_1 + … + w_arg_r_p + e_role_random
(4) Judge whether prompt_arg_random already exists in S_prompt; if it does, repeat steps (2)-(4); if not, construct a negative sample with prompt_arg_random and add prompt_arg_random to S_prompt;
(5) Repeat steps (1)-(4) until 5n event element random negative samples are obtained;
(6) Apply steps (1)-(5) to all event samples in D+2 to construct the event element random negative sample data set D-2.
Step S250, mixing and shuffling D+1, D+2, D-1 and D-2 to finally obtain the model training data set D_all;
specifically, training the T5 model in step S130 includes:
dividing the model training data set D_all in a certain ratio, preferably 8:1:1, to obtain a training set D_train, a validation set D_eval and a test set D_test; fine-tuning the T5 model with the training set D_train for n rounds, preferably n = 20, validating with the validation set D_eval after each round, taking the round with the best validation result as the final model, and testing with the test set D_test, to finally obtain the trained model M_trained;
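The 8:1:1 split can be sketched as a shuffle-then-slice over D_all; the seed and function name are illustrative.

```python
import random

def split_dataset(d_all, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle D_all and split it into train / eval / test at the given
    ratios (8:1:1 by default, as the embodiment prefers)."""
    data = list(d_all)
    random.Random(seed).shuffle(data)
    n_train = int(len(data) * ratios[0])
    n_eval = int(len(data) * ratios[1])
    return (data[:n_train],
            data[n_train:n_train + n_eval],
            data[n_train + n_eval:])

d_train, d_eval, d_test = split_dataset(range(100))
```

Shuffling before slicing ensures each split mixes the positive and negative samples from D+1, D+2, D-1 and D-2.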
further, model loss is calculated and parameters are updated during training with the following formula:
Loss = CrossEntropy(x_pred, x_gold)
where x_pred is the prediction result and x_gold is the label.
Furthermore, extracting actual event text with the trained model M_trained includes the following steps:
Step S310, obtaining the text to be extracted, which may be news text data crawled from websites;
Step S320, constructing, step by step in the structure text + prompt, the prediction sample sets of steps 1 to n+1, D_step_1~D_step_(n+1), based on the text to be extracted, the event feature data set obtained from the original data set, and the previous step's prediction result of the model M_trained; and inputting D_step_1~D_step_(n+1) step by step into the model M_trained to obtain its predictions for steps 1 to n+1, where n is the number of event roles of the event type corresponding to the first-step prediction result;
specifically, constructing the prediction sample sets D_step_1~D_step_(n+1) and obtaining the predictions of steps 1 to n+1 includes the following steps:
(A) All event types in S_type are traversed in turn; for each event type e_type_k, the sample text + e_type_k + "trigger word" is added to the first-step prediction sample set D_step_1. After the traversal, D_step_1 contains m samples (m is the number of event types, k ∈ [1, m]).
(B) The first-step prediction sample set D_step_1 is input into M_trained. When a sample produces an output, that output is the trigger word w_trigger_k of event type e_type_k in the text to be extracted. The first event role e_role_k_1 corresponding to e_type_k is found from the schema, and a sample in the format text + prompt_2 is added to the next prediction sample set D_step_2, where prompt_2 = e_type_k + w_trigger_k + e_role_k_1.
For a sample without an output, the trigger word of the event type e_type_k is not contained in the text, i.e. the text contains no event of type e_type_k.
(C) D_step_2 is input into M_trained, which predicts for each trigger word w_trigger_k the event element w_arg_k_1 corresponding to the first event role e_role_k_1. The schema is consulted to judge whether the event type e_type_k has other event roles; if not, step S330 is performed.
If the event type e_type_k has other event roles e_role_k_2~e_role_k_n in the schema, the next prediction sample sets D_step_3~D_step_(n+1) are constructed step by step as above and input step by step into the model M_trained to extract the corresponding event elements w_arg_k_2~w_arg_k_n, until M_trained has extracted the event elements of all event roles contained in e_type_k; then step S330 is performed.
More specifically, the next prediction sample set D_step_x (x ∈ [3, n+1]) is constructed as follows: a sample is built in the format text + prompt_x and added to D_step_x, where prompt_x is obtained from prompt_(x-1) by replacing the final event role e_role_k_(x-2) with its predicted event element w_arg_k_(x-2) and then appending the next event role e_role_k_(x-1); n is the number of event roles of the event type in the schema.
The event elements w_arg_k_j used in the prompt information prompt are taken from the predictions of the previous steps, where j ∈ [1, n-1] and n is the number of event roles of the event type in the schema; if a prediction contains multiple results, the results are split and a prediction sample is constructed for each of them in the format above.
Step S330, integrating the predictions based on the (n+1)-th prediction sample set D_step_(n+1) and the (n+1)-th prediction of the model M_trained to obtain the final recognition result;
specifically, integrating to obtain the final recognition result includes:
sorting D_step_(n+1) and the predicted n-th event elements into the event extraction result:
event type: e_type_k;
trigger word: w_trigger_k;
event roles/event elements: e_role_k_1/w_arg_k_1, …, e_role_k_n/w_arg_k_n.
for example, event extraction results may be integrated using a format as in Table 4.
Table 4: Example of integrating event extraction results
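Table 4's body is not reproduced in this text; the following is a hypothetical integration of the stepwise predictions into one complete event record, combining the last step's prompt information (type, trigger, roles) with the extracted elements.

```python
def integrate_result(etype, trigger, roles, elements):
    """Combine the last-step prompt information with the extraction
    results into one complete event record."""
    return {"event_type": etype,
            "trigger": trigger,
            "arguments": [{"role": r, "element": e}
                          for r, e in zip(roles, elements)]}

event = integrate_result("收购", "收购",
                         ["收购方", "被收购方"], ["A公司", "B公司"])
```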
In summary, the beneficial effects of the embodiment are as follows:
Compared with the prior art, the invention has at least one of the following beneficial effects:
1. An event feature data set is obtained from the original data set, and from it a training set containing positive and negative samples of event types and event elements is constructed; training the T5 model with this training set lets the model effectively learn the internal relations among the event types, event roles, event elements and trigger words, and in particular improves the model's understanding and prediction of multiple events. The whole training process uses the prompt information (prompt) method, which guarantees extraction accuracy and faithfulness to a certain extent, yielding an event extraction model with a high recognition rate on event texts.
2. Event texts are extracted with the trained model in a layer-by-layer progressive manner using prompt information (prompt): first, all event types serve as prompt information to extract the corresponding trigger words; the trigger words and the element roles to be extracted are then added to the prompt step by step to extract the event elements; once all event elements contained in an event type have been extracted, the prompt information of the last step is combined with the extraction result to obtain the complete event.
Those skilled in the art will appreciate that all or part of the flow of the methods of the embodiments described above may be accomplished by a computer program instructing associated hardware, where the program may be stored on a computer-readable storage medium such as a magnetic disk, an optical disk, a read-only memory or a random access memory.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims (4)

1. A method for constructing a pipeline multi-event extraction model, characterized by comprising the following steps:
Acquiring marked text data as an original data set;
Obtaining an event characteristic data set based on the original data set, and further constructing an event type positive sample data set D +1, an event element positive sample data set D +2, an event type full negative sample data set D -1 and an event element random negative sample data set D -2 to finally obtain a model training data set D all;
Training the T5 model by using a training data set D all to obtain a trained pipeline type multi-event extraction model M trained;
When multiple events are extracted, a prediction sample set of each step is gradually constructed, the trained model M trained is used for obtaining a prediction result of each step based on the prediction sample set of each step, and the prediction results are integrated to obtain a final extraction result;
the model training data set D all is constructed by the following steps:
Summarizing and sorting the labeling information of the original dataset to obtain three event characteristic data sets, namely a corresponding relation schema of event types and all event roles, a corresponding set S type_role of event types and single event roles and an all event type set S type;
Constructing an event type positive sample data set D +1 and an event element positive sample data set D +2 by using the original data set and the data set schema, together with two further event feature data sets: the set S trigger of all trigger words and the set S argument of all event elements occurring in the original data set;
Constructing an event type all negative sample data set D -1 using the event type positive sample data set D +1 and the event type data set S type;
constructing an event element random negative sample data set D -2 by using the event element positive sample data set D +2, the trigger word set S trigger, the event element set S argument and the corresponding set S type_role of event types and single event roles;
mixing and disturbing the D +1、D+2、D-1、D-2 to finally obtain a model training data set D all;
The event type positive sample data set D +1 and the event element positive sample data set D +2 are constructed by the following steps:
A1. Extracting the event type e type, the trigger word w trigger, and the event roles e role_1~e role_n with their corresponding event elements w arg_1~w arg_n of a certain event contained in text data text_p of the original data set, wherein n is the number of event roles contained in the event; the input of the event type positive sample constructed for the event is text_p+e type+"trigger word" and its target output is w trigger; the input of an event element positive sample constructed for the event is text_p+prompt arg and its target output is the corresponding event element of w arg_1~w arg_n, wherein the event element prompt prompt arg for the k-th event role can be obtained by the following formula:
prompt arg = e type + w trigger + w arg_1 + … + w arg_(k-1) + e role_k, k ∈ [1, n]
A2. Constructing an event type positive sample and event element positive samples for each event in the text data text_p by using the method in A1, to obtain the event type positive sample data set D +1 and the event element positive sample data set D +2;
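The positive-sample construction of steps A1 and A2 might be sketched as follows. The cumulative layout of the element prompt (previously seen elements followed by the next role) mirrors the prompt structure used elsewhere in the document, but the concrete function and argument names are assumptions.

```python
def build_positive_samples(text, event_type, trigger, roles, elements):
    """Return (input, target) pairs: one event-type sample, n element samples.

    roles/elements are the annotated event roles e_role_1..e_role_n and
    their event elements w_arg_1..w_arg_n for one event in `text`.
    """
    # Event type positive sample: text_p + e_type + "trigger word" -> w_trigger
    samples = [(text + event_type + "trigger word", trigger)]
    for k, (role, element) in enumerate(zip(roles, elements)):
        # prompt_arg = e_type + w_trigger + elements found so far + e_role_k
        prompt_arg = event_type + trigger + "".join(elements[:k]) + role
        samples.append((text + prompt_arg, element))
    return samples
```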
The event type all negative sample data set D -1 is constructed by the following steps:
B1. The e type of an event type positive sample is sequentially changed to each of the other event types in the event type data set S type, with the target output set to null, to obtain the event type full negative samples of the event;
B2. Applying the method in B1 to all events in the event type positive sample data set D +1, constructing the event type full negative sample data set D -1;
The event element random negative sample data set D -2 is constructed by the following steps:
(1) Finding out all event element positive samples of a certain event in D +2, extracting all event element prompts prompt arg from these positive samples, and forming a set S prompt;
(2) Randomly selecting a trigger word from S trigger to obtain w trigger_random, randomly selecting an element from S type_role to obtain an event type e type_random, an event role e role_random and a position p where the event role is located;
(3) Randomly selecting p event elements from the event element set S argument to obtain w arg_r_1~warg_r_p, and combining the event elements according to the following format to obtain event element random prompt arg_random;
promptarg_random=etype_random+wtrigger_random+warg_r_1+…+warg_r_p+erole_random
(4) Judging whether prompt arg_random exists in S prompt; if so, repeating steps (2), (3) and (4); if not, constructing a negative sample by using prompt arg_random and adding prompt arg_random into S prompt;
(5) Repeating the steps (1) - (4) until 5n random negative samples of the event elements are obtained;
(6) Constructing all event samples in the D +2 by using the methods in the steps (1) - (5) to obtain an event element random negative sample data set D -2;
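The rejection-sampling loop of steps (1)–(4) can be sketched as below; the tuple layout of S type_role entries as (event type, event role, position) follows the description, while the function signature is an assumption for illustration.

```python
import random

def sample_negative_prompt(s_trigger, s_type_role, s_argument, s_prompt, rng=None):
    """Draw random prompt components until the result is not a known positive prompt.

    s_type_role entries are assumed to be (event_type, event_role, position) tuples,
    where position p is the position of the event role within its event type.
    """
    rng = rng or random.Random()
    while True:
        trigger = rng.choice(s_trigger)                  # w_trigger_random
        event_type, role, p = rng.choice(s_type_role)    # e_type_random, e_role_random, p
        args = [rng.choice(s_argument) for _ in range(p)]
        # prompt_arg_random = e_type + w_trigger + w_arg_r_1 + ... + w_arg_r_p + e_role
        prompt = event_type + trigger + "".join(args) + role
        if prompt not in s_prompt:   # step (4): reject prompts that match a positive
            s_prompt.add(prompt)
            return prompt
```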
The step-by-step construction of the prediction sample set of each step comprises the following steps:
Constructing a first-step prediction sample set D step_1 based on text to be extracted and an event feature data set;
Constructing prompt information prompt based on the text to be extracted, the event feature data sets and the prediction result of the previous step of the model M trained, building the prediction sample set of the next step in the text+prompt structure, and constructing the prediction sample sets D step_2~D step_(n+1) of steps 2 to n+1 in turn;
The trained model M trained is configured to obtain a prediction result of each step based on the prediction sample set of each step, and integrate to obtain a final extraction result, where the method includes:
inputting D step_1 into a model M trained to obtain all trigger words p trigger contained in the text of the first-step prediction result;
Constructing the prediction sample sets D step_2~D step_(n+1) of steps 2 to n+1 by using the format text+prompt _X, and inputting D step_x into the model M trained to obtain, for each trigger word w trigger, the (x-1)-th event element w arg_(x-1) corresponding to the (x-1)-th event role e role_(x-1) of its event type e type, wherein x ∈ [2, n+1] and prompt _X is expressed as:
prompt _X = e type + w trigger + w arg_1 + … + w arg_(x-2) + e role_(x-1)
and combining the prompting information of the last step with the extraction result to obtain a complete event.
2. The method of claim 1, wherein the obtaining the annotated text data comprises:
acquiring original text data;
Marking the original text data, wherein the marking comprises determining the event types contained in each sentence of the text data, extracting the trigger words, the event elements and their positions according to the event types, and marking the event elements with appropriate event role labels.
3. The method of claim 1, wherein the set of event feature data comprises:
A correspondence schema of event types and all event roles, a correspondence set S type_role of event types and single event roles, a set S type of all event types, a set S trigger of all trigger words, and a set S argument of all event elements, wherein schema records all event types in the original data set together with all the event roles corresponding to each event type; S type_role is obtained from schema and contains the pairwise combinations of each event type with each of the event roles belonging to it in schema; S type records all event types; S trigger records all trigger words appearing in the original data set; and S argument records all event elements contained in the original data set.
4. The method of claim 1, wherein training the T5 model comprises:
Dividing the model training dataset D all according to a certain proportion to obtain a training set D train, a verification set D eval and a testing set D test;
Performing fine-tuning training on the T5 model by using the training set D train for n rounds, performing verification by using the verification set D eval after each round of training, taking the round whose model has the best verification result as the final model, and performing testing by using the testing set D test, to finally obtain the trained model M trained;
Model loss is calculated and parameters are updated during training using the following formula:
Loss=CrossEntropy(xpred,xgold)
where x pred is the prediction result and x gold is the target output.
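For illustration, the cross-entropy loss over a target token sequence can be computed as below. This is a plain-Python sketch of the formula; actual training would use a framework implementation of the loss.

```python
import math

def sequence_cross_entropy(token_probs, gold_ids):
    """Average negative log-likelihood of the gold tokens.

    token_probs[t] is the model's predicted distribution at step t;
    gold_ids[t] is the index of the gold token x_gold at that step.
    """
    losses = [-math.log(token_probs[t][gold_ids[t]])
              for t in range(len(gold_ids))]
    return sum(losses) / len(losses)
```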
CN202211733205.1A 2022-12-30 2022-12-30 A method for constructing a pipeline multi-event extraction model Active CN116028812B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211733205.1A CN116028812B (en) 2022-12-30 2022-12-30 A method for constructing a pipeline multi-event extraction model


Publications (2)

Publication Number Publication Date
CN116028812A CN116028812A (en) 2023-04-28
CN116028812B true CN116028812B (en) 2025-07-01

Family

ID=86070463



Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111597302A (en) * 2020-04-28 2020-08-28 北京中科智加科技有限公司 Text event acquisition method and device, electronic equipment and storage medium

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US8666916B2 (en) * 2011-07-07 2014-03-04 Yahoo! Inc. Method for summarizing event-related texts to answer search queries
CN112559747B (en) * 2020-12-15 2024-05-28 北京百度网讯科技有限公司 Event classification processing method, device, electronic device and storage medium
CN115099235B (en) * 2022-05-13 2024-11-08 清华大学 Text generation method based on entity description
CN115238045B (en) * 2022-09-21 2023-01-24 北京澜舟科技有限公司 Method, system and storage medium for extracting generation type event argument

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN111597302A (en) * 2020-04-28 2020-08-28 北京中科智加科技有限公司 Text event acquisition method and device, electronic equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant