
CN114328687A - Event extraction model training method and device, event extraction method and device - Google Patents

Event extraction model training method and device, event extraction method and device

Info

Publication number
CN114328687A
Authority
CN
China
Prior art keywords
sample
argument
arguments
role
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111595365.XA
Other languages
Chinese (zh)
Other versions
CN114328687B (en)
Inventor
徐国进
韩翠云
李心雨
黄佳艳
裴明
施茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202111595365.XA priority Critical patent/CN114328687B/en
Publication of CN114328687A publication Critical patent/CN114328687A/en
Application granted granted Critical
Publication of CN114328687B publication Critical patent/CN114328687B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

The present disclosure provides an event extraction model training method and device, relating to artificial intelligence fields such as knowledge graphs and deep learning. The specific implementation is as follows. A first training sample is obtained, which includes a first sample text and first annotation data. A first sub-model is obtained by performing model training on the first training sample. A second training sample is obtained, which includes a second sample text, multiple second sample arguments present in the second sample text, and, for every two of those second sample arguments, the sample probability that the two correspond to the same event. A second sub-model is obtained by performing model training on the second training sample. The event extraction model is determined to include the first sub-model and the second sub-model. The technical solution provided by the present disclosure can effectively improve the accuracy of the event extraction model.

Figure 202111595365

Description

Event extraction model training method and device, event extraction method and device

Technical Field

The present disclosure relates to artificial intelligence fields such as knowledge graphs and deep learning, and in particular to an event extraction model training method and device and an event extraction method and device.

Background

Event extraction refers to extracting the required event information from unstructured text and organizing it into a structured form.

At present, event extraction is usually implemented with an event extraction model, and insufficient annotation information leads to poor accuracy of the event extraction model.

Summary

The present disclosure provides an event extraction model training method and device and an event extraction method and device.

According to a first aspect of the present disclosure, an event extraction model training method is provided, including:

acquiring a first training sample, where the first training sample includes a first sample text and first annotation data, and the first annotation data includes: multiple data packets corresponding to multiple sample arguments in the first sample text, a sample role corresponding to each data packet, and a sample event type corresponding to each data packet, where the sample arguments within any one data packet are identical;

performing model training on the first training sample to obtain a first sub-model, where the first sub-model is used to determine the arguments present in a text, the roles corresponding to the arguments, and the event types corresponding to the arguments;

acquiring a second training sample, where the second training sample includes a second sample text, multiple sample events present in the second sample text, and the second sample arguments included in each sample event;

performing model training on the second training sample to obtain a second sub-model, where the second sub-model is used to determine the events present in a text and the arguments corresponding to each event;

determining an event extraction model based on the first sub-model and the second sub-model.

According to a second aspect of the present disclosure, an event extraction method is provided, including:

acquiring a first text to be processed;

processing the first text with the first sub-model of a pre-trained event extraction model to obtain a first output result, where the first output result includes: the arguments present in the first text, the roles corresponding to the arguments, and the event types corresponding to the arguments;

processing the first output result with the second sub-model of the pre-trained event extraction model to obtain the events present in the first text and the arguments corresponding to each event.
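The two-stage flow of the second aspect can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the disclosed implementation: `ArgumentPrediction`, `Event`, `extract_events`, and both stub sub-models are hypothetical names, and the stubs return hard-coded values instead of running trained networks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ArgumentPrediction:
    """One item of the first output result (hypothetical structure)."""
    argument: str
    role: str
    event_type: str


@dataclass
class Event:
    """An event and the arguments assigned to it by the second stage."""
    event_type: str
    arguments: List[ArgumentPrediction] = field(default_factory=list)


FirstSubModel = Callable[[str], List[ArgumentPrediction]]
SecondSubModel = Callable[[List[ArgumentPrediction]], List[Event]]


def extract_events(text: str,
                   first_sub_model: FirstSubModel,
                   second_sub_model: SecondSubModel) -> List[Event]:
    # Stage 1: arguments, their roles, and their event types.
    first_output = first_sub_model(text)
    # Stage 2: events and the arguments belonging to each event.
    return second_sub_model(first_output)


def stub_first_sub_model(text: str) -> List[ArgumentPrediction]:
    # Assumption: hard-coded predictions standing in for a trained model.
    return [
        ArgumentPrediction("photographer", "victim", "death"),
        ArgumentPrediction("tank", "instrument", "attack"),
    ]


def stub_second_sub_model(preds: List[ArgumentPrediction]) -> List[Event]:
    # Placeholder grouping: one event per event type. A trained second
    # sub-model would instead decide which arguments share an event.
    events: Dict[str, Event] = {}
    for pred in preds:
        events.setdefault(pred.event_type, Event(pred.event_type)).arguments.append(pred)
    return list(events.values())


events = extract_events("example text", stub_first_sub_model, stub_second_sub_model)
```

Passing the sub-models in as callables keeps the pipeline independent of how each one is trained, which mirrors the separation between the two training steps of the first aspect.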

According to a third aspect of the present disclosure, an event extraction model training device is provided, including:

an acquisition module, configured to acquire a first training sample, where the first training sample includes a first sample text and first annotation data, and the first annotation data includes: multiple data packets corresponding to multiple sample arguments in the first sample text, a sample role corresponding to each data packet, and a sample event type corresponding to each data packet, where the sample arguments within any one data packet are identical;

a first processing module, configured to perform model training on the first training sample to obtain a first sub-model, where the first sub-model is used to determine the arguments present in a text, the roles corresponding to the arguments, and the event types corresponding to the arguments;

a second acquisition module, configured to acquire a second training sample, where the second training sample includes a second sample text, multiple sample events present in the second sample text, and the second sample arguments included in each sample event;

a second processing module, configured to perform model training on the second training sample to obtain a second sub-model, where the second sub-model is used to determine the events present in a text and the arguments corresponding to each event;

a determination module, configured to determine an event extraction model based on the first sub-model and the second sub-model.

According to a fourth aspect of the present disclosure, an event extraction device is provided, including:

an acquisition module, configured to acquire a first text to be processed;

a first processing module, configured to process the first text with the first sub-model of a pre-trained event extraction model to obtain a first output result, where the first output result includes: the arguments present in the first text, the roles corresponding to the arguments, and the event types corresponding to the arguments;

a second processing module, configured to process the first output result with the second sub-model of the pre-trained event extraction model to obtain the events present in the first text and the arguments corresponding to each event.

According to a fifth aspect of the present disclosure, an electronic device is provided, including:

at least one processor; and

a memory communicatively connected to the at least one processor; where

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method of the first aspect or the second aspect.

According to a sixth aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, where the computer instructions are used to cause a computer to perform the method of the first aspect or the second aspect.

According to a seventh aspect of the present disclosure, a computer program product is provided. The computer program product includes a computer program stored in a readable storage medium. At least one processor of an electronic device can read the computer program from the readable storage medium, and execution of the computer program by the at least one processor causes the electronic device to perform the method of the first aspect or the second aspect.

The technique of the present disclosure solves the problem of poor accuracy of event extraction models.

It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.

Brief Description of the Drawings

The accompanying drawings are provided for a better understanding of the present solution and do not limit the present disclosure. In the drawings:

FIG. 1 is a schematic diagram of event extraction provided by an embodiment of the present disclosure;

FIG. 2 is a flowchart of the event extraction model training method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of the annotation data provided by an embodiment of the present disclosure;

FIG. 4 is a second flowchart of the event extraction model training method provided by an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of the processing performed by the first sub-model provided by an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of updating the model parameters of the first sub-model provided by an embodiment of the present disclosure;

FIG. 7 is a third flowchart of the event extraction model training method provided by an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of determining the first probability provided by an embodiment of the present disclosure;

FIG. 9 is a schematic diagram of determining candidate windows provided by an embodiment of the present disclosure;

FIG. 10 is a schematic diagram of determining the target window provided by an embodiment of the present disclosure;

FIG. 11 is a flowchart of the event extraction method provided by an embodiment of the present disclosure;

FIG. 12 is a schematic diagram of the processing performed by the event extraction method provided by an embodiment of the present disclosure;

FIG. 13 is a schematic structural diagram of the event extraction model training device according to an embodiment of the present disclosure;

FIG. 14 is a schematic structural diagram of the event extraction device according to an embodiment of the present disclosure;

FIG. 15 is a block diagram of an electronic device used to implement the event extraction model training method and the event extraction method of the embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.

For a better understanding of the technical solutions of the present disclosure, the related technology is first introduced in more detail.

As a form of information, an event is defined as the objective fact that specific people and things interact at a specific time and place, and is generally expressed at the sentence level. In Topic Detection and Tracking (TDT), an event is a set of related descriptions about a topic, where the topic can be formed by classification or clustering.

The elements that make up an event may include at least one of the following: trigger word, event type, event argument, and argument role.

Event trigger word: the core word indicating that the event occurs, usually a verb or a noun;

Event type: one of multiple preset event types. For example, the ACE2005 corpus defines 8 event types and 33 subtypes. Most current event extraction work uses the 33 subtypes as event types, so event recognition is a word-based multi-class classification task with 34 classes (33 event types plus None);

Event argument: a participant in the event, mainly an entity, a value, or a time. A value is a non-entity event participant, such as a job position;

Argument role: the role an event argument plays in the event. There are 35 role classes in total, for example attacker and victim, and these may likewise be preset. Role classification is therefore a multi-class classification task over word pairs with 36 classes (35 role types plus None);

Event extraction technology extracts events of interest to the user from unstructured information and presents them to the user in structured form. The event extraction task can be decomposed into four subtasks: trigger word recognition, event type classification, argument recognition, and role classification. Trigger word recognition and event type classification can be merged into an event recognition task.

Event recognition determines the event type to which each word in a sentence belongs and is a word-based multi-class classification task. Argument recognition and role classification can be merged into an argument role classification task. Role classification is a multi-class classification task over word pairs that determines the role relationship between any trigger word and entity pair in the sentence.

Event extraction can be understood with reference to FIG. 1, which is a schematic diagram of event extraction provided by an embodiment of the present disclosure.

Suppose there is an example sentence "In city A, a photographer died when a tank opened fire on Hotel P". Performing event extraction on this sentence yields, for example, the two events shown in FIG. 1.

The event type of event 1 is death, and its trigger word is "died" in the example sentence. The event arguments of event 1 are the photographer, the tank, and city A, and their argument roles are, in order, victim, instrument, and place.

The event type of event 2 is attack, and its trigger word is "fire" in the example sentence. The event arguments of event 2 are the photographer, Hotel P, the tank, and city A, and their argument roles are, in order, target, target, instrument, and place.

The example shows that event extraction takes the required event information out of unstructured text and organizes it into a structured form.
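As an illustration of what "structured form" can mean for the FIG. 1 example, the two events could be recorded as below. The field names (`event_type`, `trigger`, `arguments`, `text`, `role`) are assumptions chosen for this sketch; the disclosure does not prescribe a storage format.

```python
import json

sentence = "In city A, a photographer died when a tank opened fire on Hotel P"

# Structured form of the two events in FIG. 1 (hypothetical schema).
events = [
    {
        "event_type": "death",
        "trigger": "died",
        "arguments": [
            {"text": "photographer", "role": "victim"},
            {"text": "tank", "role": "instrument"},
            {"text": "city A", "role": "place"},
        ],
    },
    {
        "event_type": "attack",
        "trigger": "fire",
        "arguments": [
            {"text": "photographer", "role": "target"},
            {"text": "Hotel P", "role": "target"},
            {"text": "tank", "role": "instrument"},
            {"text": "city A", "role": "place"},
        ],
    },
]

# Every trigger and argument is a span of the original sentence.
spans_found = all(
    event["trigger"] in sentence
    and all(arg["text"] in sentence for arg in event["arguments"])
    for event in events
)

print(json.dumps(events, indent=2))
```

Note that the same argument (for example the photographer) appears in both events with different roles, which is why roles are stored per event rather than per argument.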

Building on the event extraction described above, document-level event extraction means producing structured records of all events of predefined types from a long document, where each structured event contains the event type and all event arguments under that event type together with their roles. In other words, document-level event extraction applies event extraction to long documents, and it often faces the difficulty that the text is long and contains multiple events of the same type.

At present, event extraction is usually implemented with an event extraction model. Existing document-level event extraction models usually require a supervision signal for every character of the document during training, where the supervision signal may include the argument role, the event type, the position in the text, and so on.

Existing document-level event extraction models therefore depend on per-character supervision signals. A document-level event argument often occurs many times in the document, but the annotation may be incomplete, so the model is easily misled by incorrect supervision signals during training.

A concrete example illustrates this. Suppose a text contains formula a, and formula a occurs 10 times in the text, that is, at 10 different places. In the existing approach, the argument role, the event type, and the position in the text are annotated for formula a. Because the annotation workload is very large, incomplete annotation occurs easily.

In this example, formula a occurs 10 times in the text, but only 4 occurrences are annotated. Supervision signals exist only for those 4 occurrences, and the other 6 are left unannotated. Because model training depends on the annotated supervision signals, the model will conclude that formula a occurs only 4 times in the text, which is not the case. The 4 annotated supervision signals are therefore misleading information for the model.

It follows that if the supervision signals in the text are incompletely annotated, the annotated supervision signals are effectively wrong, and the model is easily misled by them during training.
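The effect of incomplete annotation in the formula a example can be made concrete with a toy sketch. The token list, the occurrence positions, and the `"ARG"`/`"None"` label names are all assumptions made for illustration; only the counts (10 occurrences, 4 annotated) come from the example above.

```python
# Toy document: "formula_a" occurs at every third position, 10 times in total.
tokens = ["formula_a" if i % 3 == 0 else "w" for i in range(30)]

# Only 4 of the 10 occurrences were annotated (hypothetical positions).
annotated_positions = {0, 3, 6, 9}

# Per-token supervision: anything not annotated is treated as "None".
token_labels = ["ARG" if i in annotated_positions else "None"
                for i in range(len(tokens))]

true_positions = [i for i, tok in enumerate(tokens) if tok == "formula_a"]

# Occurrences that really are the argument but are supervised as "None".
misleading = [i for i in true_positions if token_labels[i] == "None"]

print(len(true_positions), len(annotated_positions), len(misleading))
```

Under per-token supervision, the 6 unannotated occurrences act as negative examples for exactly the span the model should learn to detect.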

In addition, in document-level event extraction the text to be processed is long and usually contains multiple complex events of the same type, so event extraction often cannot tell events of the same type apart, which leads to errors in the final extraction result. For example, if event 1 and event 2 are interleaved events of the same type, arguments of event 1 may be assigned to event 2, making the extraction result wrong.

To address these problems in the prior art, the present disclosure proposes the following technical idea. Through multi-instance learning, an event argument that occurs multiple times in the text is optimized as a single packet, which prevents the model from learning incorrectly and provides a training method for the case where the same event argument occurs multiple times in a document but is incompletely annotated. Furthermore, a central argument is determined and used to distinguish multiple events of the same type, and a window is built around the central argument to match the other arguments of the same event, which effectively improves the accuracy of the event extraction result.

On this basis, the event extraction model training method provided by the present disclosure is described below with reference to specific embodiments. It should be noted that the executing entity of each embodiment may be a server, a processor, a microprocessor, or another device with data processing capability. The executing entity is not limited here and may be selected and configured according to actual needs; any device with data processing capability can serve as the executing entity of the embodiments of the present disclosure.

The event extraction model training method is first described with reference to FIG. 2 and FIG. 3. FIG. 2 is a flowchart of the event extraction model training method provided by an embodiment of the present disclosure, and FIG. 3 is a schematic diagram of the annotation data provided by an embodiment of the present disclosure.

As shown in FIG. 2, the method includes:

S201. Acquire a first training sample, where the first training sample includes a first sample text and first annotation data, and the first annotation data includes: multiple data packets corresponding to multiple sample arguments in the first sample text, a sample role corresponding to each data packet, and a sample event type corresponding to each data packet, where the sample arguments within any one data packet are identical.

In this embodiment, the first training sample is the sample used to train the first sub-model, and it may include the first sample text and the first annotation data.

The first sample text is a document, such as a paper, an essay, or an article. This embodiment places no limit on the format or length of the first sample text, which may be chosen according to actual needs.

The first annotation data is the annotation data for the first sample text. It may include: multiple data packets corresponding to multiple sample arguments in the first sample text, the sample role corresponding to each data packet, and the sample event type corresponding to each data packet, where the sample arguments within any one data packet are identical. In other words, this embodiment annotates the arguments contained in the first sample text, the argument role of each argument, and the event type of each argument to obtain the first annotation data.

It should also be noted that, when the arguments in the first sample text are annotated in this embodiment, multiple data packets are determined from the multiple sample arguments in the first sample text. Any data packet may include one or more sample arguments, and the sample arguments within a data packet are identical.

This can be understood with reference to FIG. 3. Suppose the sample text shown in FIG. 3 exists and, for ease of description, that it contains several formulas: formula a, formula b, and formula c. FIG. 3 shows that formula a and formula b each occur more than once in the sample text.

In this embodiment, identical sample arguments are grouped into one data packet, which yields the 3 data packets shown on the right side of FIG. 3. Formula a occurs 4 times in the sample text, so data packet 1 contains 4 instances of formula a. Formula b occurs 2 times, so data packet 2 contains 2 instances of formula b. Formula c occurs only once, so data packet 3 contains 1 instance of formula c. Each data packet is annotated with its corresponding sample role and sample event type.
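The grouping in FIG. 3 can be sketched as follows. This is a minimal illustration that assumes occurrences are located by exact string matching and that each packet carries one role label and one event type label; `find_occurrences`, `build_data_packets`, the toy text, and the label values are all hypothetical.

```python
from typing import Dict, List


def find_occurrences(text: str, argument: str) -> List[int]:
    """Return the start offsets of every occurrence of `argument` in `text`."""
    positions = []
    start = text.find(argument)
    while start != -1:
        positions.append(start)
        start = text.find(argument, start + 1)
    return positions


def build_data_packets(text: str,
                       packet_labels: Dict[str, Dict[str, str]]) -> List[dict]:
    """Group identical arguments into one packet with packet-level labels."""
    packets = []
    for argument, labels in packet_labels.items():
        packets.append({
            "argument": argument,
            "instances": find_occurrences(text, argument),
            "role": labels["role"],
            "event_type": labels["event_type"],
        })
    return packets


# Toy stand-in for the sample text of FIG. 3.
sample_text = ("formula_a x formula_b y formula_a z formula_c "
               "formula_a formula_b formula_a")

# Placeholder labels: one role and one event type per packet.
packet_labels = {
    "formula_a": {"role": "role_1", "event_type": "type_1"},
    "formula_b": {"role": "role_2", "event_type": "type_1"},
    "formula_c": {"role": "role_3", "event_type": "type_2"},
}

packets = build_data_packets(sample_text, packet_labels)
sizes = [len(p["instances"]) for p in packets]
```

The labels attach to the packet as a whole, so no individual occurrence needs its own annotation.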

From the description of FIG. 3, any data packet contains occurrences of the same argument. For any data packet, the sample role corresponding to the packet is therefore the argument role of the argument in that packet, and the sample event type corresponding to the packet is the event type of the argument in that packet.

It follows that, for an argument that occurs multiple times in the sample text, this embodiment treats its occurrences at different places as one data packet, and for an argument that occurs only once, a data packet containing only that argument is likewise determined. The data packets are then annotated, so that the model is correctly informed of the number of times each argument occurs in the sample text.

For data packet 1 in FIG. 3, for example, the model can determine that formula a occurs 4 times in the sample text. Although the model is not told the positions of these 4 occurrences, it is at least told the correct number of occurrences of formula a, which avoids misleading the model through incomplete annotation. The model can then learn the specific position of each argument on its own. By treating a repeated argument as one data packet, annotating the packet, and providing that annotation to the model, this embodiment supplies the model with correct annotation information and avoids the misleading effect of incomplete annotation.
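One common way to train on packet-level labels is a multi-instance loss that scores the packet as a whole rather than each occurrence. The sketch below uses max pooling over per-occurrence probabilities, which is a generic multi-instance learning choice assumed here for illustration; it is not stated to be the loss used in the disclosure, and `bag_probability` and `bag_loss` are hypothetical names.

```python
import math
from typing import Sequence


def bag_probability(instance_probs: Sequence[float]) -> float:
    """Packet-level probability under the max-pooling assumption:
    the packet has the label if its most confident occurrence has it."""
    return max(instance_probs)


def bag_loss(instance_probs: Sequence[float], bag_label: int) -> float:
    """Binary cross-entropy computed once per packet, not once per occurrence."""
    eps = 1e-12
    p = min(max(bag_probability(instance_probs), eps), 1.0 - eps)
    return -math.log(p) if bag_label == 1 else -math.log(1.0 - p)


# Hypothetical model outputs for the 4 occurrences in data packet 1:
# probability that each occurrence plays the annotated role.
packet_1_probs = [0.10, 0.90, 0.20, 0.40]
loss = bag_loss(packet_1_probs, bag_label=1)
```

With this kind of objective, a low score on one occurrence is not penalized as long as some occurrence in the packet supports the packet label, so occurrences without position-level annotation do not become false negatives.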

The data packet division described above is in fact an application of multi-instance learning. The specific implementation of multi-instance learning has been introduced in the above embodiments and is not repeated here.

In actual implementation, the first training sample may include multiple first sample texts, each with its corresponding first annotation data. The number of first sample texts can be selected and set according to actual requirements.

S202. Perform model training with the first training sample to obtain a first sub-model, where the first sub-model is used to determine the arguments existing in a text, the roles corresponding to the arguments, and the event types corresponding to the arguments.

After the first training sample is acquired, model training can be performed with it. For example, an initial first sub-model is trained with the first training sample to obtain the trained first sub-model. The first sub-model in this embodiment is used to determine the arguments existing in a text, together with the argument role and the event type corresponding to each argument.

S203. Acquire a second training sample, where the second training sample includes a second sample text, multiple second sample arguments existing in the second sample text, and, for every two of the second sample arguments, the sample probability that the two correspond to the same event.

This embodiment also uses a second training sample, which is the sample used to train the second sub-model. The second training sample may include a second sample text, multiple second sample arguments existing in the second sample text, and, for every two of the second sample arguments, the sample probability that the two correspond to the same event.

Based on the above, every event argument belongs to some event, so any two event arguments either belong to the same event or belong to different events. In this embodiment, the sample probability that two second sample arguments correspond to the same event may therefore be 0 (the two do not belong to the same event) or 1 (the two belong to the same event).
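As an illustration of these pairwise labels, the sketch below derives a 0/1 sample probability for every pair of arguments. It assumes that each second sample argument comes with a known event identifier; the function name and the index-pair keys are hypothetical.

```python
from itertools import combinations

def pairwise_same_event_labels(argument_events):
    """Return {(i, j): 1.0 or 0.0} for every pair of argument indices i < j.

    `argument_events[i]` is the event that argument i belongs to.
    """
    return {
        (i, j): 1.0 if argument_events[i] == argument_events[j] else 0.0
        for i, j in combinations(range(len(argument_events)), 2)
    }

# Three arguments: the first two belong to event a, the third to event b.
labels = pairwise_same_event_labels(["event a", "event a", "event b"])
```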

In one possible implementation, the training of the first sub-model and the second sub-model may be sequential: the first sub-model is trained first, and the second sub-model is then trained on the output data of the first sub-model. In that case the second sample text in the second training sample may, for example, be the same as the first sample text described above, and the multiple second sample arguments existing in the second sample text may, for example, be the arguments output by the first sub-model. That is, based on the same sample text, the first sub-model is trained first, and the second sub-model is then trained on the arguments extracted by the first sub-model.

Alternatively, in another possible implementation, the training of the first sub-model and the training of the second sub-model may be independent of each other. That is, the process of training the first sub-model and the first training sample have no relationship to the process of training the second sub-model and the second training sample; the training processes and the sample data are completely independent. In actual implementation, the first training sample and the second training sample can be selected and set according to actual requirements, and this embodiment imposes no particular limitation on them.

S204. Perform model training with the second training sample to obtain a second sub-model, where the second sub-model is used to determine the events existing in a text, the arguments corresponding to the events, and the roles corresponding to the arguments.

After the second training sample is acquired, model training can be performed with it. For example, an initial second sub-model is trained with the second training sample to obtain the trained second sub-model. The second sub-model in this embodiment is used to determine the events existing in a text, the arguments corresponding to the events, and the roles corresponding to the arguments.

S205. Determine the event extraction model based on the first sub-model and the second sub-model.

After the first sub-model and the second sub-model have been trained, the event extraction model can be determined based on them, so the event extraction model in this embodiment includes the first sub-model and the second sub-model described above. The event extraction model is used to process an input text and output the events existing in the text, the arguments corresponding to the events, and the roles corresponding to the arguments.

The event extraction model training method provided by this embodiment of the present disclosure includes: acquiring a first training sample, where the first training sample includes a first sample text and first annotation data, and the first annotation data includes multiple data packets corresponding to multiple sample arguments in the first sample text, the sample role corresponding to each data packet, and the sample event type corresponding to each data packet, the sample arguments in any one data packet being identical; performing model training with the first training sample to obtain a first sub-model, which is used to determine the arguments existing in a text, the roles corresponding to the arguments, and the event types corresponding to the arguments; acquiring a second training sample, which includes a second sample text, multiple second sample arguments existing in the second sample text, and, for every two of the second sample arguments, the sample probability that the two correspond to the same event; performing model training with the second training sample to obtain a second sub-model, which is used to determine the events existing in a text, the arguments corresponding to the events, and the roles corresponding to the arguments; and determining the event extraction model based on the first sub-model and the second sub-model.
The first sub-model and the second sub-model of the event extraction model are thus trained with the first training sample and the second training sample respectively. In the first training sample, the sample arguments appearing in the sample text are organized into data packets, the sample arguments in any one data packet are identical, and roles and event types are annotated per packet. This ensures that correct annotation information is provided to the model, avoids misleading the model through insufficient annotation, and can thereby effectively improve the accuracy of the event extraction model.

On the basis of the above embodiment, the training of the first sub-model and the training of the second sub-model are described separately below. The training process of the first sub-model in the event extraction model training method provided by the present disclosure is first described in detail with reference to FIG. 4 to FIG. 6. FIG. 4 is a second flowchart of the event extraction model training method provided by an embodiment of the present disclosure, FIG. 5 is a schematic diagram of the processing performed by the first sub-model provided by an embodiment of the present disclosure, and FIG. 6 is a schematic diagram of updating the model parameters of the first sub-model provided by an embodiment of the present disclosure.

As shown in FIG. 4, the method includes:

S401. Acquire a first training sample, where the first training sample includes a first sample text and first annotation data, and the first annotation data includes multiple data packets corresponding to multiple sample arguments in the first sample text, the sample role corresponding to each data packet, and the sample event type corresponding to each data packet, the sample arguments in any one data packet being identical.

The implementation of S401 is similar to that of S201 above and is not repeated here.

S402. Process the first sample text with the first sub-model to be trained to obtain first prediction data, where the first prediction data includes multiple predicted arguments, the predicted role corresponding to each predicted argument, and the predicted event type corresponding to each predicted argument.

In this embodiment, the first sub-model that has not yet been trained is referred to as the first sub-model to be trained. When the first sub-model is trained with the first training sample, the first sample text may, for example, be processed by the first sub-model to be trained, because the first sub-model in this embodiment is used to determine the arguments existing in a text, the roles of the arguments, and the event types of the arguments.

Referring to FIG. 5, the first sub-model can therefore output first prediction data that includes multiple predicted arguments. The predicted arguments are the arguments that the first sub-model extracts from the first sample text, for example argument a, argument b, and argument c shown in FIG. 5. The first prediction data also includes the predicted role and the predicted event type corresponding to each predicted argument.

In one possible implementation, the first prediction data in this embodiment further includes the predicted positions of the multiple predicted arguments in the first sample text, as well as the probabilities of those predicted positions. The probability can be understood as the confidence output by the model itself, that is, how sure the model is about the predicted position it currently outputs.

In this embodiment, therefore, although the first sub-model is not told the specific positions of identical arguments in the text and is told only how many times each argument occurs, the first sub-model can still, by means of multi-instance learning, output each extracted predicted argument together with its predicted position in the text and the probability of that predicted position.

S403. Determine a first loss according to the first annotation data, the predicted roles corresponding to the predicted arguments, and the predicted event types corresponding to the predicted arguments.

After the first sub-model outputs the first prediction data, the loss of the model can be determined from the first annotation data and the first prediction data, because in this embodiment the first sample text is annotated with the first annotation data.

In one possible implementation, as established above, the first annotation data includes multiple data packets, each data packet contains identical arguments, and each data packet is annotated with an argument role and an event type. The annotation data therefore effectively annotates the argument role and event type of each argument.

The first prediction data, in turn, includes multiple predicted arguments together with the predicted role and the predicted event type corresponding to each predicted argument.

Referring to FIG. 6, the first loss can therefore be determined according to the first annotation data, the predicted roles corresponding to the predicted arguments, and the predicted event types corresponding to the predicted arguments. In one possible implementation, a first loss function may be preset, and the first annotation data, the predicted roles, and the predicted event types are processed by the first loss function to determine the first loss.
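The text does not specify the first loss function. Purely as a sketch, the example below assumes a cross-entropy loss over the role and event-type distributions that the model predicts for each annotated packet; the names `first_loss`, `role_probs`, and `event_type_probs`, and the averaging over packets, are assumptions for illustration.

```python
import math

def cross_entropy(probs, target):
    """Negative log-probability assigned to the annotated label."""
    return -math.log(max(probs.get(target, 0.0), 1e-12))

def first_loss(packet_labels, predictions):
    """Average role + event-type cross-entropy over annotated packets.

    packet_labels: {argument: (role, event_type)} from the annotation data.
    predictions:   {argument: {"role_probs": {...}, "event_type_probs": {...}}}
    """
    total = 0.0
    for argument, (role, event_type) in packet_labels.items():
        pred = predictions[argument]
        total += cross_entropy(pred["role_probs"], role)
        total += cross_entropy(pred["event_type_probs"], event_type)
    return total / len(packet_labels)

labels = {"argument a": ("role 1", "event type 1")}
uncertain = {"argument a": {
    "role_probs": {"role 1": 0.5, "role 2": 0.5},
    "event_type_probs": {"event type 1": 0.5, "event type 2": 0.5},
}}
loss = first_loss(labels, uncertain)  # two terms of -log(0.5) each
```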

S404. Group the multiple predicted arguments to obtain multiple groups of predicted arguments, where the arguments in each group are identical.

In this embodiment, the first prediction data may include multiple predicted arguments and, as established above, the same argument may appear multiple times in a text. Because the data packets inform the model of how many times each sample argument occurs, the model outputs a predicted position for each of the multiple identical arguments.

The multiple predicted arguments can therefore be grouped to obtain multiple groups of predicted arguments, with the arguments in each group being identical. That is, identical predicted arguments are placed in one group.

A group of predicted arguments here is similar to a data packet in the annotation data described above. The difference is that a data packet is obtained by annotation in advance, whereas a group of predicted arguments is obtained by grouping the predicted arguments produced by the first sub-model.
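The grouping in S404 can be sketched as below. The dictionary layout of a prediction (`argument`, `position`, `prob`) is a hypothetical representation chosen for the example, not a format given in the source.

```python
from collections import defaultdict

def group_predictions(predictions):
    """Group predicted arguments so that each group holds identical arguments."""
    groups = defaultdict(list)
    for pred in predictions:
        groups[pred["argument"]].append(pred)
    return dict(groups)

predictions = [
    {"argument": "argument a", "position": 3, "prob": 0.4},
    {"argument": "argument b", "position": 8, "prob": 0.2},
    {"argument": "argument a", "position": 17, "prob": 0.9},
]
groups = group_predictions(predictions)
```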

S405. According to the probabilities of the predicted positions of the multiple predicted arguments in the first sample text, determine a target predicted argument in each group of predicted arguments, where the target predicted argument of a group is the one whose predicted position in the first sample text has the highest probability.

As established above, the first prediction data in this embodiment includes the probabilities of the predicted positions of the multiple predicted arguments in the first sample text, and these probabilities can be understood as the confidence output by the model.

It can also be understood that, because the present disclosure does not tell the model the position of each argument in the text, the model has no reference information for positions during learning and learns them on its own. At the start of training, the probabilities of the predicted positions output by the model are therefore not very high; that is, the model is not very certain about the positions of the predicted arguments it outputs.

To improve the efficiency of model training, this embodiment determines a target predicted argument in each group of predicted arguments, namely the argument in the group whose predicted position has the highest probability. The loss is then calculated from the argument with the highest predicted-position probability, which can effectively improve the effectiveness of model training.
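Selecting the target predicted argument in each group amounts to an argmax over the position probabilities. A minimal sketch, assuming each prediction is a dictionary with hypothetical `position` and `prob` fields:

```python
def select_targets(groups):
    """Pick, per group, the prediction whose position probability is highest."""
    return {
        argument: max(preds, key=lambda p: p["prob"])
        for argument, preds in groups.items()
    }

groups = {
    "argument a": [{"position": 3, "prob": 0.4}, {"position": 17, "prob": 0.9}],
    "argument b": [{"position": 8, "prob": 0.2}],
}
targets = select_targets(groups)
```

When two predictions in a group tie, `max` keeps the first one encountered; the source does not say how ties should be handled.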

S406. Determine a second loss according to the predicted position of the target argument and the actual position of the target argument in the first sample text.

The target argument (that is, the target predicted argument determined above) is itself an argument extracted from the first sample text, so its actual position can be determined from the first sample text. In one possible implementation, the actual position of each argument in the first sample text does not need to be annotated in advance. Instead, after the model outputs the predicted position of a predicted argument, the occurrence of the target argument near the predicted position is located in the first sample text, which gives the actual position of the target argument in the first sample text.

Referring to FIG. 6, the second loss can then be determined according to the predicted position of the target argument and its actual position in the first sample text. In one possible implementation, the predicted position and the actual position of the target argument may, for example, be processed by a preset second loss function to obtain the second loss.
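The lookup of the actual position and the second loss can be sketched as follows. The source does not define "near" or the second loss function, so this example assumes character offsets, takes the occurrence closest to the predicted offset as the actual position, and uses the mean absolute offset error as the loss; all function names are hypothetical.

```python
def occurrences(text, argument):
    """Return the start offsets of every occurrence of `argument` in `text`."""
    positions = []
    start = text.find(argument)
    while start != -1:
        positions.append(start)
        start = text.find(argument, start + 1)
    return positions

def nearest_actual_position(text, argument, predicted_pos):
    """Occurrence of `argument` closest to the predicted offset, or None."""
    positions = occurrences(text, argument)
    if not positions:
        return None
    return min(positions, key=lambda p: abs(p - predicted_pos))

def second_loss(text, targets):
    """Mean absolute offset error over (argument, predicted_pos) targets."""
    errors = []
    for argument, predicted_pos in targets:
        actual = nearest_actual_position(text, argument, predicted_pos)
        if actual is not None:
            errors.append(abs(actual - predicted_pos))
    return sum(errors) / len(errors) if errors else 0.0

text = "ab cd ab ef ab"          # "ab" starts at offsets 0, 6 and 12
loss = second_loss(text, [("ab", 7), ("cd", 3)])
```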

S407. Update the model parameters of the first sub-model according to the first loss and the second loss.

The first loss optimizes the accuracy with which the first sub-model extracts arguments and outputs their argument roles and event types, while the second loss optimizes the accuracy of the argument positions output by the first sub-model. In this embodiment, the model parameters of the first sub-model can therefore be updated based on the first loss and the second loss determined above, thereby training the first sub-model.

It can be understood that, in actual implementation, the operations of determining the losses and updating the model parameters of the first sub-model according to the losses can be iterated multiple times, until a preset number of iterations is reached or until the model converges, at which point the trained first sub-model is obtained.
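The iteration and stopping rule can be sketched as a generic loop. This is only an illustration: the unweighted sum of the two losses and the convergence test (change in total loss below a tolerance) are assumptions, and `step` stands in for one forward pass plus parameter update of a real model.

```python
def train(step, max_iterations=100, tolerance=1e-4):
    """Repeat `step` until the preset iteration count or convergence.

    `step()` performs one parameter update and returns
    (first_loss, second_loss) for that iteration.
    """
    previous = None
    for iteration in range(1, max_iterations + 1):
        first, second = step()
        total = first + second  # assumption: unweighted sum of both losses
        if previous is not None and abs(previous - total) < tolerance:
            return iteration, total  # treated as converged
        previous = total
    return max_iterations, previous

# Toy stand-in for a model update: both losses halve on every step.
state = {"loss": 1.0}

def toy_step():
    state["loss"] *= 0.5
    return state["loss"], state["loss"]

iterations, final_loss = train(toy_step, max_iterations=50, tolerance=1e-3)
```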

In the event extraction model training method provided by this embodiment of the present disclosure, the first sample text is processed by the first sub-model to be trained to output first prediction data. A first loss, used to optimize the accuracy of the determined argument roles and event types, is then determined from the argument roles and event types of the predicted arguments in the first prediction data and the argument roles and event types of the arguments in the first annotation data. In addition, the target argument with the highest probability is selected according to the predicted positions of the predicted arguments in the first sample text and the probabilities of those positions, and a second loss is determined based on the target argument, which can effectively improve the speed and efficiency of model training. Specifically, the second loss, used to optimize the accuracy of the output argument positions, is determined from the predicted position of the target argument and its actual position in the first sample text. The model parameters of the first sub-model are then updated according to the first loss and the second loss. This effectively trains the first sub-model and helps ensure the accuracy of the arguments, argument roles, event types, and argument positions that it outputs.

The above embodiment describes the training of the first sub-model. The training of the second sub-model is described in detail below with reference to specific embodiments and FIG. 7 to FIG. 10. FIG. 7 is a third flowchart of the event extraction model training method provided by an embodiment of the present disclosure, FIG. 8 is a schematic diagram of determining the first probability provided by an embodiment of the present disclosure, FIG. 9 is a schematic diagram of determining candidate windows provided by an embodiment of the present disclosure, and FIG. 10 is a schematic diagram of determining a target window provided by an embodiment of the present disclosure.

As shown in FIG. 7, the method includes:

S701. Acquire a second training sample, where the second training sample includes a second sample text, multiple second sample arguments existing in the second sample text, and, for every two of the second sample arguments, the sample probability that the two correspond to the same event.

The implementation of S701 is similar to that of S203 above and is not repeated here.

S702. Determine multiple second sample roles corresponding to the multiple second sample arguments.

In this embodiment, each sample argument has its own sample role. In one possible implementation, the second sample arguments may, for example, be the predicted arguments output by the first sub-model described above, and, as established above, the first sub-model also outputs the argument role and event type of each argument. In that case, the second sample role corresponding to each second sample argument can be obtained from the output of the first sub-model.

S703. Determine, for each second sample role, a first probability that the second sample arguments under that role correspond to the same event.

In this embodiment, the second sample roles are the sample roles corresponding to the second sample arguments in the second sample text. Each sample argument corresponds to an event, so this embodiment can further determine, for each second sample role, the first probability that the second sample arguments under that role correspond to the same event.

Event types and events need to be distinguished here. Multiple different events can exist under the same event type; for example, event a and event b may both have event type 1.

In this embodiment, the probability that the second sample arguments under each second sample role correspond to the same event may, for example, be determined by analyzing the second sample text.

This can be understood with reference to FIG. 8. As shown in FIG. 8, suppose the second sample text is being analyzed and contains multiple arguments under sample role 1, for example argument 1, argument 2, and argument 3 shown in FIG. 8; that is, the argument role of all three arguments is sample role 1. As established above, the same argument can appear multiple times in a text, and in the example of FIG. 8 argument 2 appears 3 times in the second sample text.

Referring to FIG. 8, argument 1 belongs to event a, the first and second occurrences of argument 2 belong to event a, the third occurrence of argument 2 belongs to event b, and argument 3 belongs to event b. For the example of FIG. 8, of the 5 occurrences of second sample arguments under sample role 1 in the second sample text, 3 belong to event a and 2 belong to event b, so the probability of belonging to event a is 60% (3/5) and the probability of belonging to event b is 40% (2/5).

In this embodiment, the probability that the second sample arguments under a sample role belong to the same event may, for example, be the probability of belonging to the event with the most arguments; that is, the largest of the per-event probabilities is taken as the required probability. For the example of FIG. 8, the probability that the second sample arguments under sample role 1 belong to the same event is therefore 60%.

FIG. 8 illustrates the case in which the sample arguments under a sample role correspond to two different events. In actual implementation, the arguments under a sample role may also correspond to more than two different events. The implementation is similar to that described above: the probability corresponding to the event with the most arguments is taken as the required probability.
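The first probability for one role can be reproduced with a simple count. The sketch below uses the FIG. 8 numbers from the text (three occurrences in event a, two in event b); the function name and the list-of-events input are assumptions for illustration.

```python
from collections import Counter

def first_probability(events):
    """Share of a role's argument occurrences that fall in its largest event.

    `events` lists, for one sample role, the event of each argument occurrence.
    """
    counts = Counter(events)
    return max(counts.values()) / len(events)

# FIG. 8 example for sample role 1: argument 1 -> a; argument 2 -> a, a, b; argument 3 -> b.
role_1_events = ["a", "a", "a", "b", "b"]
p_first = first_probability(role_1_events)
```

The same function covers roles whose arguments span more than two events, since it always takes the largest event share.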

S704. Determine the role coefficient of each second sample role according to the recall and precision of that second sample role.

In this embodiment, the second sample arguments are the arguments output by the first sub-model described above. As established above, the first sub-model can output predicted arguments, the predicted role of each predicted argument, and the predicted event type corresponding to each predicted argument.

Because each predicted argument has its own predicted role, the output of the first sub-model contains many predicted roles, and the second sample roles in this embodiment are in fact the predicted roles output by the first sub-model.

For any second sample role, the recall and precision of that role in the output of the first sub-model can then be computed. Recall and precision can be computed as in the prior art, which is not described further here.

在确定各个第二样本角色的召回率和准确率之后,就可以根据各个第二样本角色的召回率和准确率,来确定各个第二样本角色的角色系数了。本实施例中的角色系数例如可以为综合评价指标(F-Measure)。After the recall rate and accuracy rate of each second sample role are determined, the role coefficient of each second sample role can be determined according to the recall rate and accuracy rate of each second sample role. The role coefficient in this embodiment may be, for example, a comprehensive evaluation index (F-Measure).

在一种可能的实现方式中,例如可以根据预设函数处理第二样本角色的召回率和准确率,得到第二样本角色的角色系数。In a possible implementation manner, for example, the recall rate and the accuracy rate of the second sample character may be processed according to a preset function to obtain the character coefficient of the second sample character.

其中,确定角色系数的预设函数例如可以满足如下的公式一:Wherein, the preset function for determining the role coefficient, for example, may satisfy the following formula 1:

[Formula 1 is reproduced as an image in the original publication: Figure BDA0003430350550000161]

Here, P is the precision, R is the recall, and F is the role coefficient. The role coefficient in this embodiment may specifically be the comprehensive evaluation index.

The above operations can be performed for each second sample role, so as to obtain the role coefficient F corresponding to each second sample role.
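Formula 1 is available here only as an image reference, so its exact form cannot be confirmed from this text. The sketch below assumes the standard balanced F-measure, F = 2PR / (P + R), which is consistent with the description of F as the comprehensive evaluation index of precision P and recall R. The function name and the per-role statistics are hypothetical.

```python
def role_coefficient(precision, recall):
    """Balanced F-measure (assumed form of Formula 1): harmonic mean of P and R."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when a role is never predicted correctly
    return 2 * precision * recall / (precision + recall)


# Hypothetical (precision, recall) of the first sub-model for each predicted role.
role_stats = {"company": (0.9, 0.8), "amount": (0.5, 0.5)}
coefficients = {
    role: role_coefficient(p, r) for role, (p, r) in role_stats.items()
}
print(coefficients)
```

If the original formula weights precision and recall unequally, only the body of `role_coefficient` would change; the per-role loop stays the same.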

S705. Determine the priority of each second sample role according to the first probability corresponding to each second sample role and the role coefficient of each second sample role.

After the first probability and the role coefficient of each second sample role are determined, the priority of each second sample role may be determined, for example, according to the first probability and the role coefficient.

In a possible implementation, for any second sample role, the product of the first probability P corresponding to the second sample role and the role coefficient F of the second sample role may be determined as the priority of that role. Here P denotes the first probability, not the precision used in Formula 1. It can be understood that the larger the product of P and F, the higher the priority of the second sample role.
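As a minimal sketch of the product rule described above, assuming the first probabilities and role coefficients are already available as dictionaries keyed by role (the names and values below are hypothetical):

```python
def role_priorities(first_probability, coefficient):
    """Priority of each role = first probability * role coefficient."""
    return {
        role: first_probability[role] * coefficient[role]
        for role in first_probability
    }


# Hypothetical inputs for three predicted roles.
first_probability = {"company": 0.6, "amount": 1.0, "date": 0.5}
coefficient = {"company": 0.8, "amount": 0.7, "date": 0.9}

priorities = role_priorities(first_probability, coefficient)
# Roles listed from highest to lowest priority.
ranking = sorted(priorities, key=priorities.get, reverse=True)
print(ranking)
```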

S706. Determine candidate central roles among the plurality of second sample roles according to the priority of each second sample role.

After the priority of each second sample role is determined, the candidate central roles may be determined among the plurality of second sample roles according to those priorities.

In a possible implementation, the priority of each second sample role may be compared with a preset threshold. If one or more second sample roles have a priority greater than or equal to the preset threshold, each such second sample role may be determined as a candidate central role. In this case, there may be multiple candidate central roles in this embodiment.

In another possible implementation, if the priorities of all of the second sample roles are smaller than the preset threshold, the second sample role with the highest priority among them may be determined as the candidate central role.
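The two cases above (threshold test, with a fallback to the highest-priority role) can be sketched together. The function name, role names, and threshold values are hypothetical.

```python
def select_candidate_central_roles(priorities, threshold):
    """Roles whose priority reaches the threshold; otherwise the single best role."""
    if not priorities:
        return []
    selected = [role for role, p in priorities.items() if p >= threshold]
    if selected:
        return selected
    # No role reaches the threshold: fall back to the role with the highest priority.
    return [max(priorities, key=priorities.get)]


priorities = {"company": 0.48, "amount": 0.70, "date": 0.45}
print(select_candidate_central_roles(priorities, 0.46))  # several roles may qualify
print(select_candidate_central_roles(priorities, 0.90))  # fallback to the best role
```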

It should be noted here that the higher the priority of a sample role, the stronger the ability of the arguments under that sample role to distinguish events. Determining the candidate central roles based on priority therefore amounts to selecting the argument roles that distinguish events most strongly.

S707. Determine the arguments corresponding to the candidate central roles as central arguments.

After the candidate central roles are determined, the arguments corresponding to the candidate central roles may be determined as central arguments. An argument corresponding to a candidate central role is simply an argument whose argument role is that candidate central role.
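A short sketch of this filtering step, assuming each argument is represented as a dictionary with hypothetical `text` and `role` fields:

```python
def central_arguments(arguments, candidate_central_roles):
    """Keep the arguments whose role is one of the candidate central roles."""
    roles = set(candidate_central_roles)
    return [arg for arg in arguments if arg["role"] in roles]


# Hypothetical arguments output by the first sub-model.
arguments = [
    {"text": "Company A", "role": "company"},
    {"text": "tens of millions of yuan", "role": "amount"},
    {"text": "last week", "role": "date"},
]
print(central_arguments(arguments, ["amount"]))
```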

S708. Determine a target window corresponding to the central argument in the second sample text, where the target window includes a preset number of characters.

There may be multiple central arguments determined in this embodiment. The following description takes any one central argument as an example.

In this embodiment, the target window corresponding to the central argument may be determined in the second sample text. The target window may include a preset number of characters, and the number of characters included in the target window is the length of the window.

In a possible implementation, the target window in this embodiment satisfies the following conditions: it includes a preset number of characters, and it includes the central argument. Among the windows that satisfy these two conditions, the target window is the one that includes the largest number of other arguments whose event type is the same as that of the central argument.

Possible implementations of determining the target window are described below.

For example, a plurality of candidate windows may be determined in the second sample text, where each candidate window includes a preset number of characters and includes the argument corresponding to the candidate central role.

It can be understood that, when the target window is determined in this embodiment, the length of the window is preset and the target window must include the central argument. To ensure that the target window includes the largest number of other arguments whose event type matches that of the central argument, multiple candidate windows may first be determined by sliding a window.

A candidate window in this embodiment includes a preset number of characters, that is, the length of the candidate window is the preset length, and the candidate window includes the argument corresponding to the candidate central role. The argument corresponding to the candidate central role is the central argument described above.

This can be understood with reference to FIG. 9. Assume that 901 is the second sample text, of which FIG. 9 shows part. Assuming that the current central argument is "tens of millions of yuan" (数千万元) in FIG. 9, at least the candidate windows shown at 902, 903, and 904 in FIG. 9 can be determined for this central argument.

From candidate window 902, candidate window 903, and candidate window 904 it can be seen that each candidate window includes the central argument "tens of millions of yuan" and that each candidate window has the preset length. The preset length may be, for example, N characters, where N may be chosen according to actual requirements.

As FIG. 9 also shows, in this embodiment the window may be slid, based on the preset window length and subject to the window including the central argument, so as to determine multiple candidate windows.
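The sliding step can be sketched as enumerating every fixed-length character span that fully covers the central argument. This is an illustration under assumptions: spans are half-open `(start, end)` character offsets, the window slides one character at a time, and the function name and numbers are hypothetical.

```python
def candidate_windows(text_length, window_length, arg_start, arg_end):
    """All (start, end) spans of length window_length that cover [arg_start, arg_end)."""
    # Earliest start that still reaches the end of the central argument.
    first = max(0, arg_end - window_length)
    # Latest start that still begins at or before the central argument
    # and keeps the window inside the text.
    last = min(arg_start, text_length - window_length)
    return [(s, s + window_length) for s in range(first, last + 1)]


# Hypothetical text of 20 characters, window of N = 8 characters,
# central argument occupying characters 5..8.
windows = candidate_windows(text_length=20, window_length=8, arg_start=5, arg_end=9)
print(windows)
```

If the window is shorter than the central argument or longer than the text, the function returns an empty list, so a caller would need its own fallback for those cases.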

After the candidate windows are determined, the number of arguments corresponding to the first event type included in each candidate window may be determined, where the first event type is the event type corresponding to the central argument.

As described above, the target window in this embodiment is the window that includes the largest number of other arguments whose event type is the same as that of the central argument. The number of arguments of the first event type included in each candidate window is therefore determined.

This can be understood with reference to FIG. 10. As shown in FIG. 10, assume that window 1 and window 2 currently exist.

In addition to the central argument, window 1 includes argument 1, argument 2, argument 3, and argument 4. The event type of the central argument is event type a, that of argument 1 is event type b, that of argument 2 is event type a, that of argument 3 is event type b, and that of argument 4 is event type c. The number of other arguments in candidate window 1 whose event type is the same as that of the central argument is therefore 1, namely argument 2 in window 1.

In addition to the central argument, window 2 includes argument 2, argument 3, argument 4, argument 1, and argument 5. The event type of the central argument is event type a, that of argument 2 is event type a, that of argument 3 is event type b, that of argument 4 is event type a, that of argument 1 is event type a, and that of argument 5 is event type c. The number of other arguments in candidate window 2 whose event type is the same as that of the central argument is therefore 3, namely argument 2, argument 4, and argument 1 in window 2.

The target window may then be determined according to the number of arguments of the first event type included in each candidate window.

Because the window that includes the largest number of other arguments whose event type matches that of the central argument is to be determined as the target window, in a possible implementation the candidate window that includes the largest number of arguments of the first event type is determined as the target window.

For instance, in the example of FIG. 10, assume that only candidate window 1 and candidate window 2 exist. Candidate window 2 includes the largest number of other arguments whose event type is the same as that of the central argument, so candidate window 2 may be determined as the target window.
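The FIG. 10 example can be reproduced with a short sketch. The event types listed for each window are taken from the description above; the data layout and function names are hypothetical.

```python
def count_same_type(other_event_types, central_event_type):
    """Number of other arguments in a window that share the central event type."""
    return sum(1 for t in other_event_types if t == central_event_type)


def pick_target_window(windows, central_event_type):
    """Name of the candidate window with the most same-type arguments."""
    return max(
        windows,
        key=lambda name: count_same_type(windows[name], central_event_type),
    )


# Event types of the non-central arguments in each window, as described for FIG. 10.
windows = {
    "window 1": ["b", "a", "b", "c"],        # arguments 1, 2, 3, 4
    "window 2": ["a", "b", "a", "a", "c"],   # arguments 2, 3, 4, 1, 5
}
for name, types in windows.items():
    print(name, count_same_type(types, "a"))
print(pick_target_window(windows, "a"))
```

When two windows tie, `max` returns the first one encountered; the description does not specify a tie-breaking rule, so that choice is an assumption of this sketch.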

S709. Determine a plurality of first arguments present in the target window.

As described above, the target window may include many other arguments in addition to the central argument. In this embodiment, the plurality of first arguments present in the target window may therefore be determined, where the first arguments may be, for example, the arguments in the target window other than the central argument.

S710. Obtain the predicted probability that each of the plurality of first arguments and the central argument correspond to the same event.

After the plurality of first arguments are determined, the predicted probability that each first argument and the central argument correspond to the same event may be obtained, that is, the predicted probability that the first argument and the central argument belong to the same event.

The predicted probability in this embodiment may be output by the second sub-model. That is, the second sub-model may process each first argument together with the central argument and output the probability that the first argument and the central argument correspond to the same event.

In a possible implementation, processing the first arguments and the central argument to output the probability that they correspond to the same event may be only part of the processing performed by the second sub-model. That is, besides outputting this predicted probability, the second sub-model also performs other processing. For example, the process of determining the target window and the central argument described above may also be performed by the second sub-model, in which case the second sample text can be input directly into the second sub-model. The second sub-model subsequently also outputs the events present in the text, the arguments corresponding to each event, and the roles corresponding to those arguments.

Alternatively, the process of determining the target window and the central argument described above may not be performed by the second sub-model but in a preprocessing stage before the second sub-model. In this case, before the second sample text is input into the second sub-model, the preprocessing described above is first performed on it to determine the target window and the central argument. Then, in addition to the second training sample, the first arguments in the target window and the central argument are also input into the second sub-model, so that the second sub-model can output the predicted probability that each first argument and the central argument belong to the same event.

S711. According to the predicted probability corresponding to each first argument, determine each first argument whose predicted probability is greater than or equal to a probability threshold as a target argument.

In this embodiment, each first argument has a predicted probability with respect to the central argument. In a possible implementation, if the predicted probability of a first argument is greater than or equal to the probability threshold, the first argument and the central argument are likely to belong to the same event, so that first argument may be determined as a target argument. A target argument is an argument that belongs to the same event as the central argument.

In another possible implementation, if the predicted probability of a first argument is smaller than the probability threshold, the first argument and the central argument are unlikely to belong to the same event, so that first argument is not a target argument in this embodiment.
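A minimal sketch of the threshold test in S711, assuming the predicted probabilities are available as a dictionary keyed by first argument. The probabilities and the threshold of 0.5 are hypothetical; the description leaves the threshold to be chosen according to actual requirements.

```python
def select_target_arguments(predicted_probability, probability_threshold):
    """First arguments whose same-event probability reaches the threshold."""
    return [
        argument
        for argument, probability in predicted_probability.items()
        if probability >= probability_threshold
    ]


# Hypothetical probabilities that each first argument shares an event
# with the current central argument.
predicted_probability = {"argument 1": 0.9, "argument 2": 0.3, "argument 3": 0.5}
print(select_target_arguments(predicted_probability, 0.5))
```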

It can also be understood that there are in practice multiple central arguments, and the above operations are performed for each of them. A first argument in the current target window that does not belong to the same event as the current central argument may still belong to the same event as one of the other central arguments.

It can therefore be understood that, in this embodiment, the central argument is determined, and the predicted probability of each first argument in the target window of the central argument with respect to the central argument is determined, so as to decide whether the first argument and the central argument belong to the same event. In effect, events that are mixed together are divided on the basis of the central arguments.

In practice, the specific probability threshold may be selected and set according to actual requirements; this embodiment places no limitation on it.

S712. Determine the predicted event corresponding to the central argument, where the predicted event includes the central argument and the target arguments.

After the target arguments are determined for the current central argument, the predicted event corresponding to the central argument can be determined. The specific way of determining the event corresponding to an argument may follow event extraction implementations in the related art, and this embodiment places no limitation on it.

After the predicted event corresponding to the central argument is determined, because the target arguments and the central argument described above have been determined to belong to the same event, it can be determined that the predicted event includes the central argument and each target argument.

In practice, because there are multiple central arguments, the operations described above are performed for each central argument, so that the predicted event corresponding to each central argument can be determined. Each predicted event also includes the target arguments that belong to the same event as its central argument. Multiple predicted events can thus be extracted from the sample text, and each predicted event includes at least one predicted argument.
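Repeating the steps for every central argument can be sketched as a loop that builds one predicted event per central argument. The nested-dictionary layout, the names, and the probabilities are hypothetical; the example also shows a first argument that is rejected by one central argument and accepted by another.

```python
def build_predicted_events(pairwise_probability, probability_threshold):
    """One predicted event per central argument: the central argument plus its targets.

    pairwise_probability maps each central argument to
    {first_argument: probability of sharing an event with that central argument}.
    """
    events = []
    for central, probabilities in pairwise_probability.items():
        targets = [a for a, p in probabilities.items() if p >= probability_threshold]
        events.append({"central_argument": central, "arguments": [central] + targets})
    return events


# Hypothetical output of the second sub-model for two central arguments.
pairwise_probability = {
    "central A": {"argument 1": 0.9, "argument 2": 0.2},
    "central B": {"argument 2": 0.8, "argument 3": 0.1},
}
events = build_predicted_events(pairwise_probability, 0.5)
for event in events:
    print(event)
```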

S713. Determine a third loss according to the predicted arguments in the predicted events and the second sample arguments in the sample events.

After the predicted events are determined, note that the second training sample also includes sample events, each of which includes at least one second sample argument. The sample events serve as the annotation, and the predicted events serve as the model output. The third loss can then be determined according to the predicted arguments in the predicted events and the second sample arguments in the sample events.

In a possible implementation, the sample events and the predicted events may be processed by a third loss function to determine the third loss. The third loss in this embodiment is used to optimize the accuracy of the predicted events output by the second sub-model; more specifically, it optimizes how the second sub-model divides the text into predicted events and the accuracy of the predicted arguments included in each predicted event.

S714. Update the model parameters of the second sub-model according to the third loss.

Because the third loss optimizes the accuracy of the predicted events output by the second sub-model, once the third loss is determined, the model parameters of the second sub-model can be updated according to it, so as to train the second sub-model.

It can also be understood that, in practice, the operations of determining the loss and updating the model parameters of the second sub-model according to the loss may be iterated multiple times, until a preset number of iterations is reached or until the model converges, at which point the trained second sub-model is obtained.
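The two stopping conditions (a preset number of iterations, or convergence) can be sketched with a generic loop. Everything here is a stand-in: the toy quadratic loss replaces the third loss, a single scalar replaces the model parameters of the second sub-model, and "convergence" is assumed to mean that the loss changes by less than a tolerance between iterations, which the description does not specify.

```python
def train(step, max_iterations, tolerance):
    """Call step() until the loss stops changing or max_iterations is reached.

    step() performs one parameter update and returns the current loss.
    """
    previous = None
    for iteration in range(1, max_iterations + 1):
        loss = step()
        if previous is not None and abs(previous - loss) < tolerance:
            return iteration, loss  # converged
        previous = loss
    return max_iterations, previous  # preset iteration count reached


def make_toy_step(learning_rate=0.1):
    """Gradient descent on the toy loss (w - 3)^2, standing in for the third loss."""
    state = {"w": 0.0}

    def step():
        gradient = 2 * (state["w"] - 3.0)
        state["w"] -= learning_rate * gradient
        return (state["w"] - 3.0) ** 2

    return step


iterations, final_loss = train(make_toy_step(), max_iterations=1000, tolerance=1e-9)
print(iterations, final_loss)
```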

In the event extraction model training method provided by the embodiments of the present disclosure, the first probability corresponding to each second sample role in the second training sample and the role coefficient of each second sample role are determined, and the priority of each second sample role is then determined from the role coefficient and the first probability. At least one central sample role is determined according to the priorities, where a central sample role is a sample role with a strong ability to distinguish events. The target window of each central argument under the central sample role is then determined; the target window in this embodiment is the fixed-length window that includes the central argument and the largest number of other arguments of the same event type as the central argument. Next, for the target window of each central argument, the predicted probability that each other argument belongs to the same event as the central argument is determined, and the predicted event corresponding to the central argument is determined according to the predicted probabilities. The multiple events in a complex sentence can thus be divided effectively on the basis of the central arguments and the predicted probabilities, so that multiple predicted events are output, each including at least one predicted argument. The model parameters of the second sub-model are updated according to the predicted events output by the model and the annotated sample events, so that the second sub-model is trained accurately and effectively and can output predicted events together with their corresponding arguments.

The training processes for the first sub-model and the second sub-model have been described above. As can be determined from that description, the first sub-model in the present disclosure extracts arguments and outputs the argument role and event type corresponding to each argument, while the second sub-model determines, from the arguments extracted by the first sub-model, the event to which each argument corresponds, that is, it divides the arguments into events. Applying the first sub-model and the second sub-model together therefore achieves event extraction. Specifically, the events extracted from the text can be obtained, together with the arguments included in each event, the argument role corresponding to each argument, and the event type corresponding to each argument.

On the basis of the above embodiments, the event extraction method provided in the present disclosure is described in further detail below with reference to FIG. 11 and FIG. 12.

FIG. 11 is a flowchart of the event extraction method provided by an embodiment of the present disclosure, and FIG. 12 is a schematic processing diagram of the event extraction method provided by an embodiment of the present disclosure.

As shown in FIG. 11, the method includes:

S1101. Acquire a first text to be processed.

In this embodiment, the first text to be processed is the text from which events are to be extracted. This embodiment places no limitation on the specific content, length, format, and so on of the first text; any text that requires event extraction may serve as the first text.

S1102. Process the first text through the first sub-model in the pre-trained event extraction model to obtain a first output result, where the first output result includes: the arguments present in the first text, the roles corresponding to the arguments, and the event types corresponding to the arguments.

The pre-trained event extraction model in this embodiment may include a first sub-model and a second sub-model. The first sub-model processes the first text and outputs the arguments present in the first text, the argument role corresponding to each argument, and the event type corresponding to each argument.

Referring to FIG. 12, the first text may be input into the first sub-model of the event extraction model, so that the first sub-model outputs the first output result, which includes the arguments present in the first text, the argument roles corresponding to the arguments, and the event types corresponding to the arguments.

S1103. Process the first output result through the second sub-model in the pre-trained event extraction model to obtain the events present in the first text and the arguments corresponding to the events.

The pre-trained event extraction model in this embodiment also includes the second sub-model, which processes the first output result of the first sub-model and outputs the events present in the first text and the arguments corresponding to the events.

As can be determined from the above embodiments, the second sub-model in fact needs only the extracted arguments and the first text to be processed. Therefore, in a possible implementation, referring to FIG. 12, the arguments in the first output result may be obtained.

The arguments and the first text are then input into the second sub-model, so that the second sub-model outputs the events present in the first text and the arguments included in each event.

Referring to FIG. 12, it can thus be determined that the first sub-model outputs the arguments in the first text, the argument role corresponding to each argument, and the event type corresponding to each argument, and that the second sub-model outputs the events present in the first text and the arguments corresponding to each event.

The extraction result shown in FIG. 12 is thereby obtained, namely the events present in the first text, the event type of each event, the arguments included in each event, and the argument role of each argument. In an optional implementation, the second sub-model may also output the trigger word of each event, in which case the extraction result also includes the trigger words, so that events are effectively extracted from the first text.
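The two-stage flow of FIG. 11 and FIG. 12 can be sketched with the two sub-models passed in as plain callables. This is only an illustration of how the outputs might be combined: the stub models, field names, and sample entities are hypothetical, arguments are keyed by their text (so repeated mentions are not distinguished), and taking the most common event type among an event's arguments as the event type is an assumption of this sketch.

```python
from collections import Counter


def extract_events(text, first_sub_model, second_sub_model):
    """Run the first sub-model, then the second, and merge their outputs."""
    # First output result: one dict per argument with its role and event type.
    first_output = first_sub_model(text)
    arguments = [item["argument"] for item in first_output]
    # The second sub-model needs only the text and the extracted arguments;
    # here it returns the arguments grouped by event.
    grouped = second_sub_model(text, arguments)

    by_argument = {item["argument"]: item for item in first_output}
    results = []
    for event_arguments in grouped:
        items = [by_argument[a] for a in event_arguments]
        if not items:
            continue
        event_type = Counter(i["event_type"] for i in items).most_common(1)[0][0]
        results.append({
            "event_type": event_type,
            "arguments": [{"argument": i["argument"], "role": i["role"]} for i in items],
        })
    return results


def stub_first_sub_model(text):
    # Hypothetical stand-in for the trained first sub-model.
    return [
        {"argument": "Company A", "role": "acquirer", "event_type": "acquisition"},
        {"argument": "Company B", "role": "acquired", "event_type": "acquisition"},
        {"argument": "Company C", "role": "investor", "event_type": "investment"},
    ]


def stub_second_sub_model(text, arguments):
    # Hypothetical stand-in for the trained second sub-model.
    return [["Company A", "Company B"], ["Company C"]]


results = extract_events("sample text", stub_first_sub_model, stub_second_sub_model)
for event in results:
    print(event)
```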

In this embodiment, as can be understood from the model training process described above, the pre-trained event extraction model can be trained accurately and effectively even when an argument appears multiple times in the text, and can effectively divide events in complex cases where multiple events are interleaved. Processing the first text with the event extraction model obtained by the above training therefore yields an event extraction result whose accuracy is effectively ensured.

FIG. 13 is a schematic structural diagram of an event extraction model training apparatus according to an embodiment of the present disclosure. As shown in FIG. 13, the event extraction model training apparatus 1300 of this embodiment may include: an acquisition module 1301, a first processing module 1302, a second acquisition module 1303, a second processing module 1304, and a determination module 1305.

The acquisition module 1301 is configured to acquire a first training sample, where the first training sample includes a first sample text and first labeled data, and the first labeled data includes: a plurality of data packets corresponding to a plurality of sample arguments in the first sample text, the sample role corresponding to each data packet, and the sample event type corresponding to each data packet, wherein the sample arguments in any one data packet are the same;

the first processing module 1302 is configured to perform model training through the first training sample to obtain a first sub-model, where the first sub-model is used to determine the arguments existing in a text, the roles corresponding to the arguments, and the event types corresponding to the arguments;

the second acquisition module 1303 is configured to acquire a second training sample, where the second training sample includes a second sample text, a plurality of sample events existing in the second sample text, and the second sample arguments included in each sample event;

the second processing module 1304 is configured to perform model training through the second training sample to obtain a second sub-model, where the second sub-model is used to determine the events existing in a text and the arguments corresponding to the events;

the determination module 1305 is configured to determine an event extraction model based on the first sub-model and the second sub-model.

In a possible implementation, the first processing module 1302 is specifically configured to:

process the first sample text through the first sub-model to be trained to obtain first prediction data, where the first prediction data includes a plurality of predicted arguments, the predicted role corresponding to each predicted argument, and the predicted event type corresponding to each predicted argument;

update the model parameters of the first sub-model according to the first labeled data and the first prediction data.

In a possible implementation, the first prediction data further includes the predicted positions of the plurality of predicted arguments in the first sample text;

the first processing module 1302 is specifically configured to:

determine a first loss according to the first labeled data, the predicted role corresponding to each predicted argument, and the predicted event type corresponding to each predicted argument;

determine a second loss according to the predicted positions of the plurality of predicted arguments and the actual positions of the plurality of predicted arguments in the first sample text;

update the model parameters of the first sub-model according to the first loss and the second loss.
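The two-loss update described above can be sketched as follows. This is only an illustration under stated assumptions: the disclosure does not specify the form of either loss or how they are combined, so cross-entropy and an unweighted sum are assumed here, and all function names are hypothetical.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the labeled class (assumed loss form)."""
    return -math.log(probs[target_index])


def first_loss(role_probs, role_label, type_probs, type_label):
    # First loss: predicted role and predicted event type of an argument
    # compared against the first labeled data.
    return cross_entropy(role_probs, role_label) + cross_entropy(type_probs, type_label)


def second_loss(position_probs, actual_position):
    # Second loss: predicted position of an argument compared against its
    # actual position in the first sample text.
    return cross_entropy(position_probs, actual_position)


def total_loss(loss_1, loss_2):
    # Assumption: the two losses are simply added before the parameter update.
    return loss_1 + loss_2
```

In a real training loop the summed value would be passed to an optimizer; that step is omitted because the disclosure does not name a framework.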

In a possible implementation, the first prediction data further includes the probabilities of the predicted positions of the plurality of predicted arguments in the first sample text;

the first processing module 1302 is specifically configured to:

group the plurality of predicted arguments to obtain multiple groups of predicted arguments, where the arguments in each group are the same;

determine a target predicted argument in each of the groups according to the probabilities of the predicted positions of the plurality of predicted arguments in the first sample text, where the target predicted argument of a group is the one whose predicted position in the first sample text has the highest probability;

determine the second loss according to the predicted position of the target predicted argument and the actual position of the target predicted argument in the first sample text.
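The grouping and selection steps above can be illustrated with a short sketch. The tuple layout `(argument_text, predicted_position, probability)` and the function name are assumptions made for this example and are not taken from the disclosure.

```python
from collections import defaultdict


def select_target_arguments(predictions):
    """Pick, for each distinct argument, its most probable predicted position.

    predictions: list of (argument_text, predicted_position, probability).
    Returns a dict mapping argument_text -> (predicted_position, probability).
    """
    # Group the predictions so that each group holds occurrences of one argument.
    groups = defaultdict(list)
    for text, position, probability in predictions:
        groups[text].append((position, probability))

    # Within each group, keep the occurrence with the highest probability.
    return {
        text: max(candidates, key=lambda c: c[1])
        for text, candidates in groups.items()
    }
```

Only the selected occurrence of each argument would then contribute to the second loss, which is how an argument that appears several times in the text is prevented from being penalized once per occurrence.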

In a possible implementation, the second processing module 1304 is specifically configured to:

process the second sample text and the plurality of second sample arguments through the second sub-model to be trained to obtain at least one predicted event, where the predicted event includes at least one predicted argument;

determine a third loss according to the predicted arguments in the predicted event and the second sample arguments in the sample event;

update the model parameters of the second sub-model according to the third loss.

In a possible implementation, the second processing module 1304 is specifically configured to:

determine a central argument according to the second sample text;

determine, in the second sample text, a target window corresponding to the central argument, where the target window includes a preset number of characters;

determine a plurality of first arguments existing in the target window, and obtain, for each first argument, the predicted probability that the first argument and the central argument correspond to the same event;

determine at least one predicted event according to the central argument, each first argument, and the predicted probability corresponding to each first argument.

In a possible implementation, the second processing module 1304 is specifically configured to:

determine a plurality of candidate windows in the second sample text, where each candidate window includes the preset number of characters and includes an argument corresponding to the candidate central role;

determine the number of arguments corresponding to a first event type included in each candidate window, where the first event type is the event type corresponding to the central argument;

determine the target window according to the number of arguments corresponding to the first event type included in each candidate window.

In a possible implementation, the second processing module 1304 is specifically configured to:

determine, as the target window, the candidate window that includes the largest number of arguments corresponding to the first event type.
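The window selection described above can be sketched as a sliding search. This is a hedged illustration: the disclosure does not say how candidate windows are enumerated, so the sketch assumes every window of the preset size that fully contains the central argument is a candidate, that arguments are given as character spans with an exclusive end, and that ties are broken in favor of the leftmost window. All names are hypothetical.

```python
def choose_target_window(text_len, window_size, center, arguments, first_event_type):
    """Return ((start, end), count) for the best window around the central argument.

    center: (start, end) character span of the central argument, end exclusive.
    arguments: list of (start, end, event_type) spans found in the text.
    first_event_type: the event type corresponding to the central argument.
    """
    c_start, c_end = center
    # A window [w_start, w_start + window_size) contains the central argument
    # when w_start <= c_start and w_start + window_size >= c_end.
    lowest = max(0, c_end - window_size)
    highest = min(c_start, max(0, text_len - window_size))

    best_window, best_count = None, -1
    for w_start in range(lowest, highest + 1):
        w_end = min(w_start + window_size, text_len)
        # Count arguments of the first event type lying entirely in the window.
        count = sum(
            1
            for a_start, a_end, event_type in arguments
            if event_type == first_event_type and a_start >= w_start and a_end <= w_end
        )
        if count > best_count:
            best_window, best_count = (w_start, w_end), count
    return best_window, best_count
```

The sketch assumes the central argument is no longer than the window; if it is longer, no candidate exists and `(None, -1)` is returned.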

In a possible implementation, the second processing module 1304 is specifically configured to:

determine a plurality of second sample roles corresponding to the plurality of second sample arguments;

determine, for each second sample role, a first probability that the second sample arguments under that role correspond to the same event;

determine a role coefficient of each second sample role according to the recall rate and the accuracy rate of that role;

determine the candidate central role among the plurality of second sample roles according to the first probability corresponding to each second sample role and the role coefficient of each second sample role;

determine the argument corresponding to the candidate central role as the central argument.

In a possible implementation, for any one second sample role, the second processing module 1304 is specifically configured to:

process the recall rate and the accuracy rate of the second sample role according to a preset function to obtain the role coefficient of the second sample role.

In a possible implementation, the second processing module 1304 is specifically configured to:

determine the priority of each second sample role according to the first probability corresponding to each second sample role and the role coefficient of each second sample role;

determine the candidate central role among the plurality of second sample roles according to the priority of each second sample role.

In a possible implementation, for any one second sample role, the second processing module 1304 is specifically configured to:

determine, as the priority of the second sample role, the product of the first probability of the second sample role and the role coefficient of the second sample role.

In a possible implementation, the second processing module 1304 is specifically configured to:

if any of the plurality of second sample roles has a priority greater than or equal to a preset threshold, determine each second sample role whose priority is greater than or equal to the preset threshold as the candidate central role;

if the priorities of the plurality of second sample roles are all less than the preset threshold, determine the second sample role with the highest priority among the plurality of second sample roles as the candidate central role.
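The role coefficient, priority, and threshold rules above can be combined into one small sketch. The disclosure only says that a "preset function" maps the recall rate and accuracy rate to a role coefficient; the harmonic mean used below is an assumption chosen for illustration, and the function and parameter names are hypothetical.

```python
def role_coefficient(recall, accuracy):
    # Assumption: the preset function is the harmonic mean of the two rates.
    # The disclosure does not state which function is actually used.
    if recall + accuracy == 0:
        return 0.0
    return 2 * recall * accuracy / (recall + accuracy)


def select_candidate_central_roles(roles, threshold):
    """Select candidate central roles by priority.

    roles: dict mapping role name -> (first_probability, recall, accuracy).
    Returns (selected_roles, priority_by_role).
    """
    # Priority = first probability x role coefficient, as described above.
    priority = {
        role: prob * role_coefficient(recall, accuracy)
        for role, (prob, recall, accuracy) in roles.items()
    }
    # Roles at or above the threshold are all kept as candidates.
    selected = [role for role, value in priority.items() if value >= threshold]
    if not selected:
        # Fallback: no role reaches the threshold, so take the single best one.
        selected = [max(priority, key=priority.get)]
    return selected, priority
```

The fallback branch ensures that a central role is always available, even when every role has a low priority.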

In a possible implementation, the second processing module 1304 is specifically configured to:

determine, according to the predicted probability corresponding to each first argument, each first argument whose predicted probability is greater than or equal to a probability threshold as a target argument;

determine the predicted event corresponding to the central argument, where the predicted event includes the central argument and the target arguments.
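The thresholding step above amounts to a simple filter. The following sketch assumes each first argument is paired with its predicted probability of belonging to the same event as the central argument; the dictionary layout of the returned event is hypothetical.

```python
def build_predicted_event(central_argument, first_arguments, probability_threshold):
    """Assemble the predicted event for one central argument.

    first_arguments: list of (argument, predicted_probability) pairs found in
    the target window of the central argument.
    """
    # Keep only first arguments whose probability reaches the threshold.
    target_arguments = [
        argument
        for argument, probability in first_arguments
        if probability >= probability_threshold
    ]
    # The predicted event contains the central argument and the target arguments.
    return {"arguments": [central_argument] + target_arguments}
```

For example, with a threshold of 0.5, a window holding `("March 3", 0.9)` and `("Company C", 0.2)` yields an event containing the central argument and `"March 3"` only.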

FIG. 14 is a schematic structural diagram of an event extraction apparatus according to an embodiment of the present disclosure. As shown in FIG. 14, the event extraction apparatus 1400 of this embodiment may include: an acquisition module 1401, a first processing module 1402, and a second processing module 1403.

The acquisition module 1401 is configured to acquire a first text to be processed;

the first processing module 1402 is configured to process the first text through the first sub-model of a pre-trained event extraction model to obtain a first output result, where the first output result includes: the arguments existing in the first text, the roles corresponding to the arguments, and the event types corresponding to the arguments;

the second processing module 1403 is configured to process the first output result through the second sub-model of the pre-trained event extraction model to obtain the events existing in the first text and the arguments corresponding to the events.

In a possible implementation, the second processing module 1403 is specifically configured to:

obtain each argument in the first output result;

input the arguments and the first text into the second sub-model, so that the second sub-model outputs the events existing in the first text and the arguments corresponding to the events.
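The two-stage inference flow described above can be sketched with the sub-models treated as opaque callables. The record layout (`"argument"`, `"role"`, `"event_type"`) and the toy stand-in models are assumptions made purely for illustration; they are not the trained sub-models of the disclosure.

```python
def extract_events(text, first_sub_model, second_sub_model):
    """Run the two sub-models in sequence on one text."""
    # Stage 1: arguments together with their roles and event types.
    first_output = first_sub_model(text)
    # Stage 2: the arguments and the original text are passed to the second
    # sub-model, which groups the arguments into events.
    arguments = [item["argument"] for item in first_output]
    events = second_sub_model(text, arguments)
    return first_output, events


def toy_first_sub_model(text):
    # Hypothetical stand-in: returns fixed records instead of real predictions.
    return [
        {"argument": "Company A", "role": "acquirer", "event_type": "acquisition"},
        {"argument": "Company B", "role": "target", "event_type": "acquisition"},
    ]


def toy_second_sub_model(text, arguments):
    # Hypothetical stand-in: places every argument into a single event.
    return [{"arguments": list(arguments)}]
```

Calling `extract_events("Company A acquired Company B.", toy_first_sub_model, toy_second_sub_model)` returns the stage-one records and a single event holding both arguments.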

The present disclosure provides an event extraction model training method and apparatus and an event extraction method and apparatus, which are applied to artificial intelligence fields such as knowledge graphs and deep learning, so as to improve the accuracy of the extraction results of the event extraction model.

It should be noted that the head model in this embodiment is not a head model of a specific user and cannot reflect the personal information of any specific user. It should also be noted that the two-dimensional face images in this embodiment come from a public data set.

In the technical solutions of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of the user personal information involved all comply with relevant laws and regulations and do not violate public order and good customs.

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.

According to an embodiment of the present disclosure, the present disclosure also provides a computer program product, including a computer program stored in a readable storage medium. At least one processor of an electronic device can read the computer program from the readable storage medium, and the at least one processor executes the computer program so that the electronic device performs the solution provided by any of the foregoing embodiments.

FIG. 15 shows a schematic block diagram of an example electronic device 1500 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the disclosure described and/or claimed herein.

As shown in FIG. 15, the device 1500 includes a computing unit 1501 that can perform various appropriate actions and processing according to a computer program stored in a read-only memory (ROM) 1502 or a computer program loaded from a storage unit 1508 into a random access memory (RAM) 1503. The RAM 1503 can also store various programs and data required for the operation of the device 1500. The computing unit 1501, the ROM 1502, and the RAM 1503 are connected to each other through a bus 1504. An input/output (I/O) interface 1505 is also connected to the bus 1504.

A number of components in the device 1500 are connected to the I/O interface 1505, including: an input unit 1506, such as a keyboard or a mouse; an output unit 1507, such as various types of displays and speakers; a storage unit 1508, such as a magnetic disk or an optical disk; and a communication unit 1509, such as a network card, a modem, or a wireless communication transceiver. The communication unit 1509 allows the device 1500 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.

The computing unit 1501 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 1501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The computing unit 1501 performs the methods and processing described above, such as the event extraction model training method or the event extraction method. For example, in some embodiments, the event extraction model training method or the event extraction method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1500 via the ROM 1502 and/or the communication unit 1509. When the computer program is loaded into the RAM 1503 and executed by the computing unit 1501, one or more steps of the event extraction model training method or the event extraction method described above may be performed. Alternatively, in other embodiments, the computing unit 1501 may be configured to perform the event extraction model training method or the event extraction method in any other suitable manner (for example, by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that when the program code is executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are implemented. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described herein may be implemented in a computing system that includes a back-end component (for example, a data server), or a computing system that includes a middleware component (for example, an application server), or a computing system that includes a front-end component (for example, a user computer having a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.

A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the difficult management and weak service scalability of traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server combined with a blockchain.

It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solutions disclosed herein can be achieved; no limitation is imposed here.

The above specific embodiments do not limit the protection scope of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the protection scope of the present disclosure.

Claims (35)

1. An event extraction model training method comprises the following steps:
obtaining a first training sample, wherein the first training sample comprises a first sample text and first labeling data, and the first labeling data comprises: a plurality of data packets corresponding to a plurality of sample arguments in the first sample text, a sample role corresponding to each data packet, and a sample event type corresponding to each data packet, wherein the sample arguments in any data packet are the same;
performing model training through the first training sample to obtain a first sub-model, wherein the first sub-model is used for determining arguments existing in a text, roles corresponding to the arguments and event types corresponding to the arguments;
obtaining a second training sample, wherein the second training sample comprises a second sample text, a plurality of sample events existing in the second sample text, and a second sample argument included in each sample event;
performing model training through the second training sample to obtain a second sub-model, wherein the second sub-model is used for determining events existing in the text and arguments corresponding to the events;
determining an event extraction model based on the first sub-model and the second sub-model.
2. The method of claim 1, wherein performing model training through the first training sample to obtain the first sub-model comprises:
processing the first sample text by the first sub-model to be trained to obtain first prediction data, wherein the first prediction data comprises a plurality of prediction arguments, a prediction role corresponding to the prediction arguments and a prediction event type corresponding to the prediction arguments;
and updating the model parameters of the first sub-model according to the first labeling data and the first prediction data.
3. The method of claim 2, wherein the first prediction data further includes predicted locations of the plurality of prediction arguments in the first sample text;
updating model parameters of the first sub-model according to the first annotation data and the first prediction data, including:
determining a first loss according to the first labeled data, the predicted role corresponding to the predicted argument and the predicted event type corresponding to the predicted argument;
determining a second loss according to the predicted positions of the plurality of predicted arguments and the actual positions of the plurality of predicted arguments in the first sample text;
and updating the model parameters of the first sub-model according to the first loss and the second loss.
4. The method of claim 3, wherein the first prediction data further includes probabilities of predicted locations of the plurality of prediction arguments in the first sample text;
determining a second loss based on the predicted locations of the plurality of predicted arguments and the actual locations of the plurality of predicted arguments in the first sample text, comprising:
grouping the multiple prediction arguments to obtain multiple groups of prediction arguments, wherein the arguments in each group of prediction arguments are the same;
determining target prediction arguments in the plurality of groups of prediction arguments respectively according to the probabilities of the prediction positions of the plurality of prediction arguments in the first sample text, wherein the probability of the prediction position of the target prediction argument in the group of prediction arguments in the first sample text is highest;
and determining the second loss according to the predicted position of the target prediction argument and the actual position of the target prediction argument in the first sample text.
5. The method of any of claims 1-4, wherein performing model training through the second training sample to obtain the second sub-model comprises:
processing the second sample text and the plurality of second sample arguments through the second sub-model to be trained to obtain at least one predicted event, wherein the predicted event comprises at least one predicted argument;
determining a third loss according to the predicted argument in the predicted event and the second sample argument in the sample event;
and updating the model parameters of the second sub-model according to the third loss.
6. The method of claim 5, wherein processing the second sample text and the plurality of second sample arguments with the second submodel to be trained to derive at least one predicted event comprises:
determining a central argument according to the second sample text;
determining a target window corresponding to the central argument in the second sample text, wherein the target window comprises a preset number of characters;
determining a plurality of first arguments existing in the target window, and acquiring, for each first argument, a predicted probability that the first argument and the central argument correspond to the same event;
and determining at least one predicted event according to the central argument, each first argument and the prediction probability corresponding to each first argument.
7. The method of claim 6, wherein determining a target window in the second sample text to which the central argument corresponds comprises:
determining a plurality of candidate windows in the second sample text, wherein each candidate window comprises the preset number of characters and includes the argument corresponding to the candidate center role;
determining, for each candidate window, the number of arguments in the candidate window that correspond to a first event type, wherein the first event type is the event type corresponding to the central argument;
and determining the target window according to the number of arguments corresponding to the first event type in each candidate window.
8. The method of claim 7, wherein determining the target window according to the number of arguments corresponding to the first event type in each candidate window comprises:
determining the candidate window containing the largest number of arguments corresponding to the first event type as the target window.
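The window selection of claims 7 and 8 can be sketched as follows. This is an illustrative sketch under stated assumptions: spans are half-open character offsets, candidate windows are generated by sliding a fixed-size window one character at a time over every position that still contains the central argument, and ties are broken in favor of the leftmost window. None of these details is specified in the claims, and the function names are hypothetical.

```python
def candidate_windows(text_length, window_size, center_start, center_end):
    """Yield (start, end) windows of fixed size that fully contain the central argument."""
    first = max(0, center_end - window_size)
    last = min(center_start, max(0, text_length - window_size))
    for start in range(first, last + 1):
        yield start, start + window_size


def select_target_window(text_length, window_size, center, arguments, event_type):
    """Pick the candidate window holding the most arguments of the given event type.

    center: (start, end) span of the central argument.
    arguments: list of (start, end, event_type) spans found in the text.
    Returns None when no window of the given size can contain the central argument.
    """
    def count(window):
        w_start, w_end = window
        return sum(1 for a_start, a_end, a_type in arguments
                   if a_type == event_type and w_start <= a_start and a_end <= w_end)

    windows = list(candidate_windows(text_length, window_size, *center))
    return max(windows, key=count) if windows else None
```

For example, with a 100-character text, a 20-character window, and a central argument at offsets 30 to 35, the sketch compares the sixteen windows starting at offsets 15 through 30 and returns the one covering the most same-type arguments.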
9. The method of any of claims 6-8, wherein determining a central argument from the second sample text comprises:
determining a plurality of second sample roles corresponding to the plurality of second sample arguments;
determining, for each second sample role, a first probability that the second sample arguments under that second sample role correspond to the same event;
determining the role coefficient of each second sample role according to the recall rate and the accuracy rate of each second sample role;
determining the candidate center role among the plurality of second sample roles according to the first probability corresponding to each second sample role and the role coefficient of each second sample role;
and determining the argument corresponding to the candidate center role as the central argument.
10. The method of claim 9, wherein, for any one second sample role, determining the role coefficient of the second sample role according to the recall rate and the accuracy rate of the second sample role comprises:
processing the recall rate and the accuracy rate of the second sample role according to a preset function to obtain the role coefficient of the second sample role.
11. The method according to claim 9 or 10, wherein determining the candidate center role among the plurality of second sample roles according to the first probability corresponding to each of the second sample roles and the role coefficient of each of the second sample roles includes:
determining the priority of each second sample role according to the corresponding first probability of each second sample role and the role coefficient of each second sample role;
and determining the candidate center role among the plurality of second sample roles according to the priority of each second sample role.
12. The method of claim 11, wherein, for any one second sample role, determining the priority of the second sample role according to the first probability corresponding to the second sample role and the role coefficient of the second sample role comprises:
determining, as the priority of the second sample role, the product of the first probability corresponding to the second sample role and the role coefficient of the second sample role.
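The role coefficient of claim 10 and the priority of claim 12 can be sketched as follows. The claims only require "a preset function" of the recall rate and the accuracy rate; the harmonic mean used here is an assumption chosen for illustration, and the function and parameter names are hypothetical.

```python
def role_coefficient(recall, accuracy):
    """Assumed preset function: harmonic mean of recall rate and accuracy rate."""
    if recall + accuracy == 0:
        return 0.0
    return 2 * recall * accuracy / (recall + accuracy)


def role_priorities(first_probability, recall, accuracy):
    """Map each role to the product of its first probability and its role coefficient.

    All three inputs are dicts keyed by role name.
    """
    return {
        role: first_probability[role] * role_coefficient(recall[role], accuracy[role])
        for role in first_probability
    }
```

A role whose arguments reliably share one event but which the first sub-model extracts poorly receives a low priority, because the product scales the first probability by the coefficient.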
13. The method according to claim 11 or 12, wherein determining the candidate center role among the plurality of second sample roles according to the priority of each of the second sample roles comprises:
if the priority of a second sample role in the plurality of second sample roles is greater than or equal to a preset threshold, determining the second sample role with the priority greater than or equal to the preset threshold as the candidate center role;
and if the priorities of the plurality of second sample roles are all smaller than the preset threshold, determining the second sample role with the highest priority in the plurality of second sample roles as the candidate center role.
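The threshold rule of claim 13 can be sketched as follows. The claim does not say what happens when several roles meet the threshold; returning all of them is an assumption, and the function name is hypothetical.

```python
def select_center_roles(priorities, threshold):
    """Return the roles whose priority meets the threshold.

    If no role meets the threshold, fall back to the single role with the
    highest priority. `priorities` maps role name to priority value.
    """
    if not priorities:
        return []
    selected = [role for role, value in priorities.items() if value >= threshold]
    if selected:
        return selected
    return [max(priorities, key=priorities.get)]
```

The fallback branch guarantees that a non-empty set of roles always yields at least one candidate center role, so a central argument can still be chosen when every priority is low.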
14. The method of any one of claims 6 to 13, wherein said determining at least one predicted event from said central argument, each of said first arguments, and a predicted probability corresponding to each of said first arguments comprises:
determining, as a target argument, each first argument whose corresponding prediction probability is greater than or equal to a probability threshold;
and determining a predicted event corresponding to the central argument, wherein the predicted event comprises the central argument and the target argument.
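The event assembly of claim 14 can be sketched as follows. The dictionary layout of the returned event and the default threshold of 0.5 are assumptions made for illustration; the claim only requires that the predicted event contain the central argument and the target arguments.

```python
def build_predicted_event(central_argument, first_arguments, probabilities, threshold=0.5):
    """Assemble a predicted event around a central argument.

    probabilities maps each first argument to the predicted probability that
    it belongs to the same event as the central argument. The threshold
    default is a hypothetical value, not one given in the source.
    """
    targets = [arg for arg in first_arguments if probabilities[arg] >= threshold]
    return {
        "central_argument": central_argument,
        "arguments": [central_argument] + targets,
    }
```

Arguments in the target window that fall below the threshold are left out of this event and remain available for grouping under a different central argument.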
15. An event extraction method, comprising:
acquiring a first text to be processed;
processing the first text through a first sub-model in a pre-trained event extraction model to obtain a first output result, wherein the first output result comprises: arguments existing in the first text, a role corresponding to each argument, and an event type corresponding to each argument;
and processing the first output result through a second sub-model in the pre-trained event extraction model to obtain an event existing in the first text and an argument corresponding to the event.
16. The method of claim 15, wherein the processing the first output result through a second submodel in the pre-trained event extraction model to obtain an event existing in the first text and an argument corresponding to the event comprises:
obtaining each argument in the first output result;
and inputting each argument and the first text into the second submodel, so that the second submodel outputs events existing in the first text and arguments corresponding to the events.
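The two-stage inference of claims 15 and 16 can be sketched as follows. Both sub-models are represented as plain callables, and the `ExtractedArgument` type and the shape of the returned events are assumptions; the source does not define a programming interface.

```python
from typing import Callable, List, NamedTuple


class ExtractedArgument(NamedTuple):
    text: str        # argument found in the input text
    role: str        # role corresponding to the argument
    event_type: str  # event type corresponding to the argument


def extract_events(
    text: str,
    first_submodel: Callable[[str], List[ExtractedArgument]],
    second_submodel: Callable[[str, List[str]], list],
) -> list:
    """Run the two sub-models in sequence: argument extraction, then event grouping."""
    first_output = first_submodel(text)
    # Claim 16: each argument from the first output is passed on together with the text.
    arguments = [arg.text for arg in first_output]
    return second_submodel(text, arguments)
```

Keeping the sub-models behind callables mirrors the claim structure, in which the second sub-model consumes only the text and the arguments produced by the first.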
17. An event extraction model training device, comprising:
a first obtaining module, configured to obtain a first training sample, where the first training sample includes a first sample text and first labeling data, and the first labeling data includes: a plurality of data packets corresponding to a plurality of sample arguments in the first sample text, a sample role corresponding to each data packet, and a sample event type corresponding to each data packet, wherein the sample arguments in any data packet are the same;
a first processing module, configured to perform model training with the first training sample to obtain a first sub-model, where the first sub-model is used for determining arguments existing in a text, roles corresponding to the arguments, and event types corresponding to the arguments;
a second obtaining module, configured to obtain a second training sample, where the second training sample includes a second sample text, a plurality of sample events existing in the second sample text, and a second sample argument included in each sample event;
a second processing module, configured to perform model training with the second training sample to obtain a second sub-model, where the second sub-model is used for determining events existing in the text and arguments corresponding to the events;
and a determination module, configured to determine an event extraction model based on the first submodel and the second submodel.
18. The apparatus of claim 17, wherein the first processing module is specifically configured to:
processing the first sample text through the first sub-model to be trained to obtain first prediction data, wherein the first prediction data comprises a plurality of predicted arguments, a predicted role corresponding to each predicted argument, and a predicted event type corresponding to each predicted argument;
and updating the model parameters of the first sub-model according to the first labeling data and the first prediction data.
19. The apparatus of claim 18, wherein the first prediction data further comprises predicted positions of the plurality of predicted arguments in the first sample text;
the first processing module is specifically configured to:
determining a first loss according to the first labeled data, the predicted role corresponding to the predicted argument and the predicted event type corresponding to the predicted argument;
determining a second loss according to the predicted positions of the plurality of predicted arguments and the actual positions of the plurality of predicted arguments in the first sample text;
and updating the model parameters of the first submodel according to the first loss and the second loss.
20. The apparatus of claim 19, wherein the first prediction data further comprises probabilities of the predicted positions of the plurality of predicted arguments in the first sample text;
the first processing module is specifically configured to:
grouping the plurality of predicted arguments to obtain a plurality of groups of predicted arguments, wherein the arguments within each group are the same;
determining a target predicted argument in each group of predicted arguments according to the probabilities of the predicted positions of the plurality of predicted arguments in the first sample text, wherein the target predicted argument is the argument in its group whose predicted position in the first sample text has the highest probability;
and determining the second loss according to the predicted position of each target predicted argument and the actual position of that target predicted argument in the first sample text.
21. The apparatus according to any one of claims 17 to 20, wherein the second processing module is specifically configured to:
processing the second sample text and the plurality of second sample arguments through the second submodel to be trained to obtain at least one predicted event, wherein the predicted event comprises at least one predicted argument;
determining a third loss according to the predicted argument in the predicted event and the second sample argument in the sample event;
and updating the model parameters of the second submodel according to the third loss.
22. The apparatus of claim 21, wherein the second processing module is specifically configured to:
determining a central argument according to the second sample text;
determining a target window corresponding to the central argument in the second sample text, wherein the target window comprises a preset number of characters;
determining a plurality of first arguments existing in the target window, and acquiring, for each first argument, a predicted probability that the first argument and the central argument correspond to the same event;
and determining at least one predicted event according to the central argument, each first argument and the prediction probability corresponding to each first argument.
23. The apparatus of claim 22, wherein the second processing module is specifically configured to:
determining a plurality of candidate windows in the second sample text, wherein each candidate window comprises the preset number of characters and includes the argument corresponding to the candidate center role;
determining, for each candidate window, the number of arguments in the candidate window that correspond to a first event type, wherein the first event type is the event type corresponding to the central argument;
and determining the target window according to the number of arguments corresponding to the first event type in each candidate window.
24. The apparatus of claim 23, wherein the second processing module is specifically configured to:
determining the candidate window containing the largest number of arguments corresponding to the first event type as the target window.
25. The apparatus according to any one of claims 22-24, wherein the second processing module is specifically configured to:
determining a plurality of second sample roles corresponding to the plurality of second sample arguments;
determining, for each second sample role, a first probability that the second sample arguments under that second sample role correspond to the same event;
determining the role coefficient of each second sample role according to the recall rate and the accuracy rate of each second sample role;
determining the candidate center role among the plurality of second sample roles according to the first probability corresponding to each second sample role and the role coefficient of each second sample role;
and determining the argument corresponding to the candidate center role as the central argument.
26. The apparatus of claim 25, wherein, for any one second sample role, the second processing module is specifically configured to:
processing the recall rate and the accuracy rate of the second sample role according to a preset function to obtain the role coefficient of the second sample role.
27. The apparatus according to claim 25 or 26, wherein the second processing module is specifically configured to:
determining the priority of each second sample role according to the corresponding first probability of each second sample role and the role coefficient of each second sample role;
and determining the candidate center role among the plurality of second sample roles according to the priority of each second sample role.
28. The apparatus of claim 27, wherein, for any one second sample role, the second processing module is specifically configured to:
determining, as the priority of the second sample role, the product of the first probability corresponding to the second sample role and the role coefficient of the second sample role.
29. The apparatus according to claim 27 or 28, wherein the second processing module is specifically configured to:
if the priority of a second sample role in the plurality of second sample roles is greater than or equal to a preset threshold, determining the second sample role with the priority greater than or equal to the preset threshold as the candidate center role;
and if the priorities of the plurality of second sample roles are all smaller than the preset threshold, determining the second sample role with the highest priority in the plurality of second sample roles as the candidate center role.
30. The apparatus according to any one of claims 22-29, wherein the second processing module is specifically configured to:
determining, as a target argument, each first argument whose corresponding prediction probability is greater than or equal to a probability threshold;
and determining a predicted event corresponding to the central argument, wherein the predicted event comprises the central argument and the target argument.
31. An event extraction device comprising:
an acquisition module, configured to acquire a first text to be processed;
a first processing module, configured to process the first text through a first sub-model in a pre-trained event extraction model to obtain a first output result, where the first output result includes: arguments existing in the first text, a role corresponding to each argument, and an event type corresponding to each argument;
and a second processing module, configured to process the first output result through a second sub-model in the pre-trained event extraction model to obtain an event existing in the first text and an argument corresponding to the event.
32. The apparatus of claim 31, wherein the second processing module is specifically configured to:
obtaining each argument in the first output result;
and inputting each argument and the first text into the second submodel, so that the second submodel outputs events existing in the first text and arguments corresponding to the events.
33. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-14 or claims 15-16.
34. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any of claims 1-14 or claims 15-16.
35. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the method of any one of claims 1 to 14 or claims 15 to 16.
CN202111595365.XA 2021-12-23 2021-12-23 Event extraction model training method and device and event extraction method and device Active CN114328687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111595365.XA CN114328687B (en) 2021-12-23 2021-12-23 Event extraction model training method and device and event extraction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111595365.XA CN114328687B (en) 2021-12-23 2021-12-23 Event extraction model training method and device and event extraction method and device

Publications (2)

Publication Number Publication Date
CN114328687A CN114328687A (en) 2022-04-12
CN114328687B CN114328687B (en) 2023-04-07

Family

ID=81012895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111595365.XA Active CN114328687B (en) 2021-12-23 2021-12-23 Event extraction model training method and device and event extraction method and device

Country Status (1)

Country Link
CN (1) CN114328687B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090222395A1 (en) * 2007-12-21 2009-09-03 Marc Light Systems, methods, and software for entity extraction and resolution coupled with event and relationship extraction
CN102298635A (en) * 2011-09-13 2011-12-28 苏州大学 Method and system for fusing event information
US8667509B1 (en) * 2008-09-30 2014-03-04 Emc Corporation Providing context information for events to an event handling component
WO2015084756A1 (en) * 2013-12-02 2015-06-11 Qbase, LLC Event detection through text analysis using trained event template models
CN107122416A (en) * 2017-03-31 2017-09-01 北京大学 A kind of Chinese event abstracting method
CN109582949A (en) * 2018-09-14 2019-04-05 阿里巴巴集团控股有限公司 Event element abstracting method, calculates equipment and storage medium at device
CN111783394A (en) * 2020-08-11 2020-10-16 深圳市北科瑞声科技股份有限公司 Training method of event extraction model, event extraction method, system and equipment
CN112052682A (en) * 2020-09-02 2020-12-08 平安资产管理有限责任公司 Event entity joint extraction method and device, computer equipment and storage medium
CN112116075A (en) * 2020-09-18 2020-12-22 厦门安胜网络科技有限公司 Event extraction model generation method and device and text event extraction method and device
CN112528625A (en) * 2020-12-11 2021-03-19 北京百度网讯科技有限公司 Event extraction method and device, computer equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEI XIANG ET AL: "A Survey of Event Extraction From Text", 《IEEE ACCESS》 *

Also Published As

Publication number Publication date
CN114328687B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN113836333B (en) Training method of image-text matching model, method and device for realizing image-text retrieval
CN114490998B (en) Text information extraction method and device, electronic equipment and storage medium
CN113408273B (en) Training method and device of text entity recognition model and text entity recognition method and device
CN114444619B (en) Sample generation method, training method, data processing method and electronic device
CN113420822B (en) Model training method and device, text prediction method and device
CN114611532A (en) Language model training method and device, target translation error detection method and device
CN115017898A (en) Recognition method, device, electronic device and storage medium for sensitive text
CN114547270A (en) Text processing method, and training method, device and equipment of text processing model
CN114647727A (en) Model training method, device and equipment applied to entity information recognition
CN114625923A (en) Training method of video retrieval model, video retrieval method, device and equipment
CN117668253A (en) Intelligent question and answer methods and systems based on natural language processing and knowledge graphs
CN112818167A (en) Entity retrieval method, entity retrieval device, electronic equipment and computer-readable storage medium
CN115328956B (en) Data query method and device based on artificial intelligence and storage medium
CN113641724B (en) Knowledge tag mining method and device, electronic equipment and storage medium
CN112528682B (en) Language detection method, device, electronic equipment and storage medium
CN114861059A (en) Resource recommendation method and device, electronic equipment and storage medium
CN114818736A (en) Text processing method, link method, device and storage medium for short text
CN114861676A (en) Paragraph extraction method and device and electronic equipment
CN114330344A (en) Named entity recognition method, training method, device, electronic equipment and medium
CN114254650A (en) An information processing method, device, equipment and medium
CN114282049A (en) A video retrieval method, device, equipment and storage medium
CN113947082A (en) Word segmentation processing method, device, equipment and storage medium
CN114328687B (en) Event extraction model training method and device and event extraction method and device
CN112632999A (en) Named entity recognition model obtaining method, named entity recognition device and named entity recognition medium
CN114116914B (en) Entity retrieval method, device and electronic device based on semantic tags

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant