Disclosure of Invention
In view of the above analysis, embodiments of the invention aim to provide a method for constructing a pipeline-type multi-event extraction model, in order to solve the prior-art problem of low extraction accuracy in multi-event and overlapping multi-event extraction tasks, which arises because existing event extraction models have insufficient recognition ability and cannot match event elements to the events they belong to.
In one aspect, an embodiment of the present invention provides a method for constructing a pipeline multi-event extraction model, including the following steps:
Acquiring annotated text data as an original data set;
Obtaining event feature data sets based on the original data set, and further constructing an event type positive sample data set D+1, an event element positive sample data set D+2, an event type all-negative sample data set D-1 and an event element random negative sample data set D-2, to finally obtain a model training data set D_all;
Training a T5 model with the training data set D_all to obtain a trained pipeline-type multi-event extraction model M_trained;
When multiple events are extracted, constructing a prediction sample set for each step in turn, wherein the trained model M_trained produces the prediction result of each step from that step's prediction sample set, and the results are integrated to obtain a final extraction result.
Further, obtaining the annotated text data includes:
acquiring original text data;
annotating the original text data, wherein the annotation includes determining the event types contained in the sentences of the text data, extracting the trigger words and event elements together with their positions according to the event types, and labeling each event element with the appropriate event role label.
Further, the event feature data set includes:
A correspondence schema between event types and all of their event roles, a set S_type_role of correspondences between event types and single event roles, a set S_type of all event types, a set S_trigger of all trigger words and a set S_argument of all event elements, wherein the schema records all event types in the original data set and all event roles corresponding to each event type; S_type_role is derived from the schema and contains the pairwise combinations of each event type in the schema with each event role that belongs to that event type in the schema; S_type records all event types; S_trigger records all trigger words appearing in the original data set; and S_argument records all event elements contained in the original data set.
Further, the model training data set D_all is constructed by the following steps:
Summarizing and organizing the annotation information of the original data set to obtain three event feature data sets, namely the correspondence schema between event types and all event roles, the set S_type_role of correspondences between event types and single event roles, and the set S_type of all event types;
Constructing the event type positive sample data set D+1 and the event element positive sample data set D+2 from the original data set and the schema, together with two further event feature data sets, namely the set S_trigger of all trigger words and the set S_argument of all event elements that occur in the original data set;
Constructing the event type all-negative sample data set D-1 from the event type positive sample data set D+1 and the event type set S_type;
Constructing the event element random negative sample data set D-2 from the event element positive sample data set D+2, the trigger word set S_trigger, the event element set S_argument and the set S_type_role of correspondences between event types and single event roles;
Mixing and shuffling D+1, D+2, D-1 and D-2 to finally obtain the model training data set D_all.
Further, the event type positive sample data set D+1 and the event element positive sample data set D+2 are constructed by the following steps:
A1. For an event contained in text data text_p of the original data set, extracting the event type e_type, the trigger word w_trigger, the event roles e_role_1 ~ e_role_n and the corresponding event elements w_arg_1 ~ w_arg_n (n is the number of event roles contained in the event and equals the number of event elements), wherein the input of the event type positive sample constructed for the event is text_p + e_type + "trigger word" and its output is w_trigger, the input of an event element positive sample constructed for the event is text_p + prompt_arg and the outputs are w_arg_1 ~ w_arg_n, and the event element prompt prompt_arg is obtained by the following formula:
A2. Constructing an event type positive sample and event element positive samples for each event in the text data text_p by the method of A1, to obtain the event type positive sample data set D+1 and the event element positive sample data set D+2.
Further, the event type all-negative sample data set D-1 is constructed by the following steps:
B1. Replacing e_type in an event type positive sample, in turn, with each of the other event types in the event type set S_type, with the target output set to null, to obtain the event type all-negative samples of the event;
B2. Applying the method of B1 to all events in the event type positive sample data set D+1 to construct the event type all-negative sample data set D-1.
Further, the event element random negative sample data set D-2 is constructed by the following steps:
(1) Finding all event element positive samples of an event in D+2, collecting all event element prompts prompt_arg from them, and forming a set S_prompt;
(2) Randomly selecting a trigger word from S_trigger to obtain w_trigger_random, and randomly selecting an element from S_type_role to obtain an event type e_type_random, an event role e_role_random and the position p at which the event role is located;
(3) Randomly selecting p event elements from the event element set S_argument to obtain w_arg_r_1 ~ w_arg_r_p, and combining them in the following format to obtain the random event element prompt prompt_arg_random:
prompt_arg_random = e_type_random + w_trigger_random + w_arg_r_1 + … + w_arg_r_p + e_role_random
(4) Judging whether prompt_arg_random already exists in S_prompt; if so, repeating steps (2), (3) and (4); if not, constructing a negative sample with prompt_arg_random and adding prompt_arg_random to S_prompt;
(5) Repeating steps (1)-(4) until 5n event element random negative samples are obtained;
(6) Applying the method of steps (1)-(5) to all event samples in D+2 to construct the event element random negative sample data set D-2.
Further, the training of the T5 model includes:
Dividing the model training data set D_all in a given proportion to obtain a training set D_train, a validation set D_eval and a test set D_test;
Fine-tuning the T5 model on the training set D_train for n rounds, validating on the validation set D_eval after each round, taking the model from the round with the best validation result as the final model, and testing it on the test set D_test to finally obtain the trained model M_trained;
During training, the model loss is calculated and the parameters are updated using the following formula:
Loss = CrossEntropy(x_pred, x_gold)
where x_pred is the prediction result and x_gold is the target output.
Further, the step-by-step construction of the prediction sample set of each step includes:
Constructing a first-step prediction sample set D_step_1 based on the text to be extracted and the event feature data sets;
Constructing prompt information prompt based on the text to be extracted, the event feature data sets and the prediction result of the model M_trained in the previous step, and constructing the prediction sample set of the next step in the structure text + prompt, so that the prediction sample sets D_step_2 ~ D_step_(n+1) of steps 2 to n+1 are constructed in sequence.
Further, the trained model M_trained obtaining the prediction result of each step based on the prediction sample set of that step, and the integrating to obtain the final extraction result, include:
inputting D_step_1 into the model M_trained to obtain the first-step prediction result, namely all trigger words p_trigger contained in the text;
constructing the prediction sample sets D_step_2 ~ D_step_(n+1) of steps 2 to n+1 in the format text + prompt_x, and inputting D_step_x into the model M_trained to obtain, for each trigger word, the (x-1)-th event element corresponding to the (x-1)-th event role of the corresponding event type, wherein x ∈ [2, n+1] and prompt_x is expressed as:
combining the prompt information of the last step with the extraction result to obtain a complete event.
Compared with the prior art, the invention has at least one of the following beneficial effects:
1. A training set containing positive and negative samples of event types and event elements is constructed from the original data set and used to train the T5 model, so that the model effectively learns the internal relations among event types, event roles, event elements and trigger words; in particular, the model's ability to understand and predict multiple events is improved. The whole training process uses prompt information (prompts), which guarantees extraction accuracy and faithfulness to a certain extent and yields an event extraction model with a higher recognition rate on event texts.
2. Event texts are extracted with the trained model in a layer-by-layer progressive manner using prompt information (prompts): the corresponding trigger words are first extracted with each event type as the prompt information; the trigger words and the event roles to be extracted are then added to the prompt step by step to extract the event elements; and after all event elements contained in the event type have been extracted, the prompt information of the last step is combined with the extraction result to obtain a complete event.
In the invention, the technical schemes can be mutually combined to realize more preferable combination schemes. Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
Detailed Description
The following detailed description of preferred embodiments of the application is made in connection with the accompanying drawings, which form a part hereof, and together with the description of the embodiments of the application, are used to explain the principles of the application and are not intended to limit the scope of the application.
The invention discloses a method for constructing a pipeline-type multi-event extraction model, which, as shown in Fig. 1, comprises the following steps:
Step S110, obtaining annotated text data as an original data set;
Step S120, obtaining event feature data sets based on the original data set, and further constructing an event type positive sample data set D+1, an event element positive sample data set D+2, an event type all-negative sample data set D-1 and an event element random negative sample data set D-2, so as to finally obtain a model training data set D_all;
Step S130, training a T5 model with the training data set D_all to obtain a trained pipeline-type multi-event extraction model M_trained;
When multiple events are extracted, a prediction sample set is constructed for each step in turn, the trained model M_trained produces the prediction result of each step from that step's prediction sample set, and the results are integrated to obtain a final extraction result.
The embodiment of the invention trains the T5 model with a training set containing positive and negative samples of event types and event elements to obtain a multi-event extraction model. This training set is constructed from the original data set, so that the model effectively learns the internal relations among event types, event roles, event elements and trigger words; in particular, the model's ability to understand and predict multiple events is improved. The whole training process uses prompt information (prompts), which guarantees extraction accuracy and faithfulness to a certain extent and yields an event extraction model with a higher recognition rate on event texts.
On the basis of the above embodiment, the annotated text data in step S110 is specifically obtained in one of the following ways:
directly using the Baidu event extraction data set; or
annotating text data, wherein the annotation includes determining the event types contained in the sentences of the text data, extracting the trigger words and event elements together with their positions according to the event types, and labeling each event element with the appropriate event role label;
Specifically, step S120 may be further refined into the following steps:
Step S210, summarizing and organizing the annotation information of the original data set to obtain three event feature data sets, namely the correspondence schema between event types and all event roles, the set S_type_role of correspondences between event types and single event roles, and the set S_type of all event types;
Specifically, all event types and event roles in the original data set are summarized and organized to construct the schema, S_type_role and S_type, wherein the schema records all event types in the original data set and all event roles corresponding to each event type; S_type_role is derived from the schema by pairing each event type in the schema with each event role that belongs to it in the schema; and S_type records all event types. Preferably, the schema is stored in a JSON file, and both S_type_role and S_type are stored as sets.
For example, for an event of type "acquisition" whose event roles include "acquisition time" and "acquirer", the records in schema, S_type_role and S_type are shown in Table 1.
Table 1. Example records of an event of type "acquisition" in schema, S_type_role and S_type
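As an illustration of step S210, the following minimal sketch derives schema, S_type_role and S_type from annotated records. The record layout (the keys "text", "events", "type", "trigger", "arguments"), the function name build_event_features and the sample sentence are hypothetical and introduced only for illustration; the source does not specify them.

```python
import json


def build_event_features(annotated):
    """Derive schema, S_type_role and S_type from annotated records.

    Assumed (hypothetical) record layout:
    {"text": str, "events": [{"type": str, "trigger": str,
                              "arguments": [(role, element), ...]}]}
    """
    schema = {}  # event type -> ordered list of its event roles
    for record in annotated:
        for event in record["events"]:
            roles = schema.setdefault(event["type"], [])
            for role, _element in event["arguments"]:
                if role not in roles:
                    roles.append(role)
    # Pair every event type with each of its own roles.
    s_type_role = {(etype, role) for etype, roles in schema.items() for role in roles}
    s_type = set(schema)
    return schema, s_type_role, s_type


# Illustrative record only; not taken from the source data set.
annotated = [{
    "text": "Company A acquired Company B in 2020.",
    "events": [{
        "type": "acquisition",
        "trigger": "acquired",
        "arguments": [("acquisition time", "2020"), ("acquirer", "Company A")],
    }],
}]

schema, s_type_role, s_type = build_event_features(annotated)
schema_json = json.dumps(schema, ensure_ascii=False)  # schema kept as JSON text
```

Keeping the roles of each event type in an ordered list preserves the role order that later prompt construction relies on, while the two sets support membership tests and random selection.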
Step S220, constructing the event type positive sample data set D+1 and the event element positive sample data set D+2 from the original data set and the schema, together with two further event feature data sets, namely the set S_trigger of all trigger words and the set S_argument of all event elements that appear in the original data set;
Specifically, constructing the event type positive sample data set D+1, the event element positive sample data set D+2, the trigger word set S_trigger and the event element set S_argument includes:
(1) For an event contained in text data text_p of the original data set, extracting the event type e_type, the trigger word w_trigger, the event roles e_role_1 ~ e_role_n and the corresponding event elements w_arg_1 ~ w_arg_n (n is the number of event roles contained in the event and equals the number of event elements), wherein the input of the event type positive sample constructed for the event is text_p + e_type + "trigger word" and its output is w_trigger, the input of an event element positive sample constructed for the event is text_p + prompt_arg and the outputs are w_arg_1 ~ w_arg_n, and the event element prompt prompt_arg is obtained by the following formula:
(2) Constructing an event type positive sample and event element positive samples for each event in the text data text_p by the method of (1), to obtain the event type positive sample data set D+1 and the event element positive sample data set D+2;
(3) Storing the trigger words w_trigger of all events in the text data text_p in the trigger word set S_trigger, and storing all event elements w_arg_1 ~ w_arg_n in the event element set S_argument, to obtain the trigger word set S_trigger and the event element set S_argument.
Illustratively, for an event of type "acquisition", the constructed event type positive sample and event element positive samples, and the corresponding entries stored in S_trigger and S_argument, are shown in Table 2.
Table 2. Example event type positive sample and event element positive samples of an event of type "acquisition", and the corresponding entries in S_trigger and S_argument
In this example, the items of the prompt information are separated by "-"; other symbols or spaces may also be used. When event element positive samples are constructed, the order in which the event elements appear in prompt_arg must be consistent with the order recorded in the schema.
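The positive sample construction can be sketched as follows. Because the formula for prompt_arg is not reproduced in the text, the sketch assumes that the prompt for the i-th role is e_type, w_trigger, the elements of the preceding roles and then the i-th role, joined by "-", mirroring the stated format of prompt_arg_random. That shape, the use of "-" between the text and the prompt, the event dictionary layout and the function name are all assumptions.

```python
SEP = "-"  # separator between prompt items, as in the example above


def build_positive_samples(text, event, schema, sep=SEP):
    """Return (event type positive sample, event element positive samples).

    Assumption: the prompt for the i-th role carries the elements of the
    roles before it, in schema order, followed by the role itself.
    """
    etype, trigger = event["type"], event["trigger"]
    elements = dict(event["arguments"])  # role -> element
    type_sample = {
        "input": sep.join([text, etype, "trigger word"]),
        "output": trigger,
    }
    element_samples, filled = [], []
    for role in schema[etype]:  # keep the order recorded in the schema
        if role not in elements:
            continue
        prompt = sep.join([etype, trigger, *filled, role])
        element_samples.append({"input": text + sep + prompt,
                                "output": elements[role]})
        filled.append(elements[role])
    return type_sample, element_samples


# Illustrative data only.
schema = {"acquisition": ["acquisition time", "acquirer"]}
text = "Company A acquired Company B in 2020."
event = {"type": "acquisition", "trigger": "acquired",
         "arguments": [("acquisition time", "2020"), ("acquirer", "Company A")]}

type_sample, element_samples = build_positive_samples(text, event, schema)

# Feature sets collected alongside the samples.
s_trigger = {event["trigger"]}
s_argument = {element for _role, element in event["arguments"]}
```

Under these assumptions the second event element sample has the prompt "acquisition-acquired-2020-acquirer", so each step sees the elements already extracted for earlier roles.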
For complex events, the output may contain multiple event elements; when the inputs are constructed, a separate event element prompt must be constructed for each. Table 3 shows an example of event element positive samples for multiple events that share a trigger word:
Table 3. Example event element positive samples of multiple events sharing a trigger word
Step S230, constructing the event type all-negative sample data set D-1 from the event type positive sample data set D+1 and the event type set S_type;
Specifically, constructing the event type all-negative sample data set D-1 includes:
(1) Replacing e_type in an event type positive sample, in turn, with each of the other event types in the event type set S_type, with the target output set to null, to obtain the event type all-negative samples of the event;
(2) Applying the method of (1) to all events in the event type positive sample data set D+1 to construct the event type all-negative sample data set D-1.
For example, if there are m event types in the event type set S_type, then each event has m-1 event type all-negative samples;
For model training, a positive sample is a sample that has a target output result and a negative sample is a sample that has none; adding negative samples during training can effectively improve the recognition accuracy of the model.
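A minimal sketch of the all-negative construction for one event follows. The "-" join between the text and the prompt items, the use of an empty string for the null output and the function name are assumptions made for illustration.

```python
def build_type_negatives(text, true_type, s_type, sep="-"):
    """One negative per event type other than the event's own type.

    The target output is empty, standing in for the null output.
    """
    return [
        {"input": sep.join([text, other, "trigger word"]), "output": ""}
        for other in sorted(s_type)  # sorted only to make the order reproducible
        if other != true_type
    ]


# Illustrative event type set with m = 3 entries.
s_type = {"acquisition", "merger", "bankruptcy"}
text = "Company A acquired Company B in 2020."
type_negatives = build_type_negatives(text, "acquisition", s_type)
```

With m = 3 event types the event receives m-1 = 2 negatives, matching the count given above.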
Step S240, constructing the event element random negative sample data set D-2 from the event element positive sample data set D+2, the trigger word set S_trigger, the event element set S_argument and the set S_type_role of correspondences between event types and single event roles;
The input format of an event element random negative sample is identical to that of an event element positive sample; the difference is that the prompt information differs and the output results are all null. For an event, the number of event element random negative samples is generally recommended to be 5 times the number of event element positive samples;
Specifically, the event element random negative sample data set D-2 is constructed by the following steps:
(1) Finding all event element positive samples of an event in D+2, collecting all event element prompts prompt_arg from them, and forming a set S_prompt;
(2) Randomly selecting a trigger word from S_trigger to obtain w_trigger_random, and randomly selecting an element from S_type_role to obtain an event type e_type_random, an event role e_role_random and the position p at which the event role is located;
(3) Randomly selecting p event elements from the event element set S_argument to obtain w_arg_r_1 ~ w_arg_r_p, and combining them in the following format to obtain the random event element prompt prompt_arg_random:
prompt_arg_random = e_type_random + w_trigger_random + w_arg_r_1 + … + w_arg_r_p + e_role_random
(4) Judging whether prompt_arg_random already exists in S_prompt; if so, repeating steps (2), (3) and (4); if not, constructing a negative sample with prompt_arg_random and adding prompt_arg_random to S_prompt;
(5) Repeating steps (1)-(4) until 5n event element random negative samples are obtained;
(6) Applying the method of steps (1)-(5) to all event samples in D+2 to obtain the event element random negative sample data set D-2;
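Steps (1)-(5) for a single event can be sketched as below. The sketch interprets the position p as the number of roles that precede the chosen role in the schema (so a prompt for the first role carries no elements); the text does not state whether p is counted this way, so this is an assumption, as are the max_tries guard, the "-" separator and the function name.

```python
import random


def build_random_negatives(text, positive_prompts, schema, s_type_role,
                           s_trigger, s_argument, factor=5, sep="-",
                           rng=None, max_tries=10000):
    """Draw random prompts that collide with no known prompt; output is empty."""
    rng = rng or random.Random()
    seen = set(positive_prompts)             # S_prompt
    target = factor * len(positive_prompts)  # 5n negatives
    triggers = sorted(s_trigger)
    pairs = sorted(s_type_role)
    elements = sorted(s_argument)
    negatives, tries = [], 0
    # max_tries is an added safeguard (not in the source) against tiny pools.
    while len(negatives) < target and tries < max_tries:
        tries += 1
        trigger = rng.choice(triggers)       # step (2)
        etype, role = rng.choice(pairs)
        p = schema[etype].index(role)        # assumed meaning of position p
        args = rng.choices(elements, k=p)    # step (3); drawn with replacement
        prompt = sep.join([etype, trigger, *args, role])
        if prompt in seen:                   # step (4): redraw on collision
            continue
        seen.add(prompt)
        negatives.append({"input": text + sep + prompt, "output": ""})
    return negatives


# Illustrative pools only.
schema = {"acquisition": ["acquisition time", "acquirer"]}
s_type_role = {("acquisition", "acquisition time"), ("acquisition", "acquirer")}
s_trigger = {"acquired", "bought", "purchased"}
s_argument = {"2020", "2019", "Company A", "Company C"}
text = "Company A acquired Company B in 2020."
positive_prompts = ["acquisition-acquired-acquisition time",
                    "acquisition-acquired-2020-acquirer"]

negatives = build_random_negatives(text, positive_prompts, schema, s_type_role,
                                   s_trigger, s_argument, rng=random.Random(0))
```

Adding each accepted prompt to the seen set, as step (4) prescribes, also prevents duplicate negatives for the same event.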
Step S250, mixing and shuffling D+1, D+2, D-1 and D-2 to finally obtain the model training data set D_all;
Specifically, training the T5 model in step S130 includes:
Dividing the model training data set D_all in a given proportion, preferably 8:1:1, to obtain a training set D_train, a validation set D_eval and a test set D_test; fine-tuning the T5 model on the training set D_train for n rounds, validating on the validation set D_eval after each round, and taking the model from the round with the best validation result as the final model; and testing it on the test set D_test to finally obtain the trained model M_trained. The preferred number of training rounds n is 20;
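The shuffling, the 8:1:1 split and the per-round model selection can be sketched without any training framework. The callables train_round and evaluate are hypothetical placeholders for one round of T5 fine-tuning and for scoring on D_eval; the fixed seed and the function names are likewise assumptions.

```python
import random


def shuffle_and_split(d_pos_type, d_pos_elem, d_neg_type, d_neg_elem,
                      ratios=(8, 1, 1), seed=0):
    """Mix the four sample sets, shuffle them, and split into train/eval/test."""
    d_all = [*d_pos_type, *d_pos_elem, *d_neg_type, *d_neg_elem]
    random.Random(seed).shuffle(d_all)
    total = sum(ratios)
    n_train = len(d_all) * ratios[0] // total
    n_eval = len(d_all) * ratios[1] // total
    return (d_all[:n_train],
            d_all[n_train:n_train + n_eval],
            d_all[n_train + n_eval:])


def train_with_selection(train_round, evaluate, n_rounds=20):
    """Keep the model state from the round with the best validation score."""
    best_state, best_score = None, float("-inf")
    for round_idx in range(n_rounds):
        state = train_round(round_idx)  # hypothetical: one fine-tuning round
        score = evaluate(state)         # hypothetical: score on D_eval
        if score > best_score:
            best_state, best_score = state, score
    return best_state, best_score


# Stand-in samples (integers) to show the split sizes.
d_train, d_eval, d_test = shuffle_and_split(
    list(range(0, 20)), list(range(20, 40)),
    list(range(40, 60)), list(range(60, 100)))

# Stand-in training run: the "model state" is just the round index.
scores = [0.1, 0.5, 0.3]
best_state, best_score = train_with_selection(
    lambda i: i, lambda state: scores[state], n_rounds=3)
```

Any remainder from integer division falls into the test split, so no sample is dropped.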
Further, during training the model loss is calculated and the parameters are updated using the following formula:
Loss = CrossEntropy(x_pred, x_gold)
where x_pred is the prediction result and x_gold is the label.
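For reference, a token-level cross-entropy can be computed as in the sketch below. It assumes that x_pred is given as one probability distribution per output position and x_gold as the target token ids, and that the loss is averaged over positions; the source states only Loss = CrossEntropy(x_pred, x_gold), so both the representation and the averaging are assumptions.

```python
import math


def cross_entropy(pred_probs, gold_ids):
    """Mean negative log-likelihood of the gold tokens.

    pred_probs: list of probability distributions, one per output position
    gold_ids:   list of target token ids, one per output position
    """
    losses = [-math.log(dist[gold]) for dist, gold in zip(pred_probs, gold_ids)]
    return sum(losses) / len(losses)


# Uniform predictions over a 4-token vocabulary give a loss of ln(4).
uniform = [[0.25, 0.25, 0.25, 0.25]] * 3
loss_uniform = cross_entropy(uniform, [0, 2, 3])

# A confident, correct prediction gives a loss close to 0.
loss_confident = cross_entropy([[0.97, 0.01, 0.01, 0.01]], [0])
```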
Furthermore, extracting events from actual text with the trained model M_trained comprises the following steps:
Step S310, obtaining the text to be extracted, which may be news text data crawled from a website;
Step S320, based on the text to be extracted, the event feature data sets obtained from the original data set and the prediction result of the model M_trained in the previous step, constructing the prediction sample sets D_step_1 ~ D_step_(n+1) of steps 1 to n+1 step by step in the structure text + prompt, and inputting D_step_1 ~ D_step_(n+1) step by step into the model M_trained to obtain its prediction results for steps 1 to n+1, wherein n is the number of event roles of the event type corresponding to the first-step prediction result;
Specifically, constructing the prediction sample sets D_step_1 ~ D_step_(n+1) of steps 1 to n+1 and obtaining the prediction results of the model M_trained for steps 1 to n+1 includes the following steps:
(A) All event types e_type in S_type are traversed in turn; for each event type (indexed by k ∈ [1, m]), one sample built from the text and that event type is added to the first-step prediction sample set D_step_1. After the traversal is finished, the number of samples in D_step_1 is m, where m is the number of event types;
(B) The first-step prediction sample set D_step_1 is input into M_trained. When a sample has an output result, the output result is the trigger word, in the text to be extracted, of that sample's event type. The first event role corresponding to the event type is looked up in the schema, and the output result is added to the next prediction sample set D_step_2 in the format text + prompt_2, where prompt_2 combines the event type, the extracted trigger word and the first event role.
For a sample without an output result, the text contains no trigger word of the input event type, i.e. the text contains no event of that event type.
(C) D_step_2 is input into M_trained to predict, for each trigger word, the event element of the corresponding first event role. The schema is then consulted to determine whether the event type has other event roles; if not, step S330 is performed;
If the event type has other event roles in the schema, then for each of those event roles the next prediction sample sets D_step_3 ~ D_step_(n+1) are constructed in turn and input step by step into the model M_trained to extract the corresponding event elements, until the model M_trained has extracted the event elements corresponding to all event roles contained in the event type, after which step S330 is performed;
More specifically, the next prediction sample sets D_step_3 ~ D_step_(n+1) are constructed as follows:
A sample is built in the format text + prompt_x and added to the prediction sample set D_step_x of the next step, wherein prompt_x is obtained from prompt_(x-1) by replacing the event role at its end with the event element predicted for that role and finally appending the next event role, where x ∈ [3, n+1] and n is the number of event roles contained in the event type in the schema;
The event elements used in the prompt information are the event elements predicted in the preceding steps, and are determined as follows:
where j ∈ [1, n-1] and n is the number of event roles contained in the event type in the schema; if the prediction for an event role includes a plurality of results, the results are split and a prediction sample is constructed for each of them in the format of this step.
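The step-by-step prediction of step S320 can be sketched with the model abstracted as a callable from an input string to an output string. The sketch assumes one trigger word per event type and one event element per role, so the splitting of multiple results described above is not handled; the prompt shape (event type, trigger word, elements found so far, next role, joined by "-") and all names are assumptions.

```python
def extract_events(text, schema, model, sep="-"):
    """Run the pipeline: trigger word first, then one event role per step.

    model: callable taking an input string and returning the predicted
    string, with "" meaning that there is no output result.
    """
    events = []
    for etype in schema:  # step 1: one prediction sample per event type
        trigger = model(sep.join([text, etype, "trigger word"]))
        if not trigger:   # no trigger word: no event of this type in the text
            continue
        found = []        # elements extracted so far, in schema order
        for role in schema[etype]:  # steps 2 .. n+1
            prompt = sep.join([etype, trigger, *found, role])
            found.append(model(text + sep + prompt))
        events.append({"type": etype, "trigger": trigger,
                       "arguments": dict(zip(schema[etype], found))})
    return events


# Stub model backed by a lookup table, standing in for M_trained.
text = "Company A acquired Company B in 2020."
schema = {"acquisition": ["acquisition time", "acquirer"],
          "bankruptcy": ["company"]}
answers = {
    text + "-acquisition-trigger word": "acquired",
    text + "-acquisition-acquired-acquisition time": "2020",
    text + "-acquisition-acquired-2020-acquirer": "Company A",
}
events = extract_events(text, schema, lambda s: answers.get(s, ""))
```

Because each prompt embeds the elements already extracted, the prediction for a later role is conditioned on the earlier ones, which is what ties elements to the right event when several events occur in one text.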
Step S330, integrating the (n+1)-th step prediction sample set D_step_(n+1) with the (n+1)-th step prediction result of the model M_trained to obtain a final recognition result;
Specifically, the integrating to obtain the final recognition result includes:
According to D_step_(n+1) and the predicted n-th event element, the event extraction result is arranged as:
Event type:
Trigger word:
Event role / event element:
For example, event extraction results may be integrated using the format shown in Table 4.
Table 4. Example of event extraction result integration
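Combining the last-step prompt with the last prediction can be sketched as follows. The sketch assumes the last prompt has the shape event type, trigger word, the first n-1 event elements and the n-th role, joined by "-", and that no item contains the separator itself; the output keys and the function name are hypothetical.

```python
def integrate_event(last_prompt, last_element, schema, sep="-"):
    """Rebuild a complete event from the final prompt and the final prediction.

    Assumes that no prompt item contains the separator character.
    """
    parts = last_prompt.split(sep)
    etype, trigger = parts[0], parts[1]
    # Elements of the earlier roles sit between the trigger and the last role.
    elements = parts[2:-1] + [last_element]
    return {
        "event_type": etype,
        "trigger": trigger,
        "roles": dict(zip(schema[etype], elements)),
    }


# Illustrative values only.
schema = {"acquisition": ["acquisition time", "acquirer"]}
event = integrate_event("acquisition-acquired-2020-acquirer", "Company A", schema)
```

In practice a separator that cannot occur inside trigger words or event elements would be needed for this split to be reliable.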
In summary, compared with the prior art, this embodiment has at least one of the following beneficial effects:
1. A training set containing positive and negative samples of event types and event elements is constructed from the original data set and used to train the T5 model, so that the model effectively learns the internal relations among event types, event roles, event elements and trigger words; in particular, the model's ability to understand and predict multiple events is improved. The whole training process uses prompt information (prompts), which guarantees extraction accuracy and faithfulness to a certain extent and yields an event extraction model with a higher recognition rate on event texts.
2. Event texts are extracted with the trained model in a layer-by-layer progressive manner using prompt information (prompts): the corresponding trigger words are first extracted with each event type as the prompt information; the trigger words and the event roles to be extracted are then added to the prompt step by step to extract the event elements; and after all event elements contained in the event type have been extracted, the prompt information of the last step is combined with the extraction result to obtain a complete event.
Those skilled in the art will appreciate that all or part of the flow of the methods of the embodiments described above may be implemented by a computer program instructing the associated hardware, where the program may be stored on a computer-readable storage medium. The computer-readable storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, or the like.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.