CN112528641A - Method and device for establishing information extraction model, electronic equipment and readable storage medium - Google Patents

Info

Publication number
CN112528641A
CN112528641A (application CN202011454491.9A)
Authority
CN
China
Prior art keywords
text
processed
subject
training
predicate
Prior art date
Legal status
Pending
Application number
CN202011454491.9A
Other languages
Chinese (zh)
Inventor
贾巍
戴岱
肖欣延
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011454491.9A
Publication of CN112528641A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a method and an apparatus for establishing an information extraction model, relating to the technical fields of natural language processing and deep learning. The scheme adopted for establishing the information extraction model is as follows: acquire a plurality of training data and train a deep learning model a preset number of times using the training data; acquire a plurality of texts to be processed labeled with subjects, and predict the texts using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text; obtain confidence data from each text to be processed, its subject, and the predicate and object corresponding to the subject; after training the deep learning model with the training data and the confidence data, return to the step of acquiring texts to be processed, and repeat the cycle until the training of the deep learning model reaches a termination condition, yielding the information extraction model. The present application can establish an information extraction model that extracts relationship information from texts of different relationship categories.

Description

Method and device for establishing information extraction model, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a method and an apparatus for establishing an information extraction model in the fields of natural language processing and deep learning technologies, an electronic device, and a readable storage medium.
Background
Unstructured text recorded in natural language, such as news, encyclopedia entries, and financial reports, is ubiquitous across industries. There is likewise a general need to extract structured information from such unstructured text, for example relationship information, where the relationship information is a triple (subject, predicate, object) consisting of an entity pair in the text and the relationship between the two entities.
Because there is a large amount of unstructured text, extracting the relationship information manually is impractical, which has given rise to computer-based relationship information extraction software.
However, conventional relationship information extraction software requires the relationship type of the extraction target to be specified in advance, such as founding time, singer, or album, and each piece of software can extract only relationship information of its specific type. Extracting different types of relationship information therefore requires developing multiple pieces of extraction software, making the training steps cumbersome and the training efficiency low.
Disclosure of Invention
To solve the above technical problem, the present application provides a method for establishing an information extraction model, comprising the following steps: acquiring a plurality of training data, and training a deep learning model a preset number of times using the training data; acquiring a plurality of texts to be processed labeled with subjects, and predicting the plurality of texts to be processed using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text to be processed; obtaining confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject; and, after training the deep learning model using the training data and the confidence data, returning to the step of acquiring texts to be processed, repeating the cycle until the training of the deep learning model reaches a termination condition, thereby obtaining an information extraction model. The information extraction model can output, according to an input text and a subject in the text, the predicate and object corresponding to that subject.
To solve the same technical problem, the present application also provides an apparatus for establishing an information extraction model, comprising: a first training unit configured to acquire a plurality of training data and train a deep learning model a preset number of times using the training data; a prediction unit configured to acquire a plurality of texts to be processed labeled with subjects and predict the plurality of texts to be processed using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text to be processed; a processing unit configured to obtain confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject; and a second training unit configured to, after the deep learning model is trained using the training data and the confidence data, return to the step of acquiring texts to be processed, repeating the cycle until the training of the deep learning model reaches a termination condition, thereby obtaining an information extraction model. The information extraction model can output, according to an input text and a subject in the text, the predicate and object corresponding to that subject.
An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above method.
A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the above method.
A computer program product comprising instructions which, when executed by a processor, perform the above method for establishing an information extraction model.
One embodiment of the above application has the following advantages or benefits: it simplifies the training steps of the information extraction model, improves the model's training efficiency, and gives the established information extraction model the ability to extract relationship information from texts of different relationship categories. Because the deep learning model is trained with continuously acquired confidence data, the prior-art limitation that relationship extraction software can only extract relationships of a specific category is overcome.
Other effects of the above-described alternative will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present application;
FIG. 2 is a schematic diagram according to a second embodiment of the present application;
FIG. 3 is a block diagram of an electronic device for implementing a method of creating an information extraction model according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding; these details are to be considered exemplary only. Those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are likewise omitted for clarity and conciseness.
Fig. 1 is a schematic diagram according to a first embodiment of the present application. As shown in fig. 1, the method for establishing an information extraction model according to this embodiment may specifically include the following steps:
S101, acquiring a plurality of training data, and training a deep learning model a preset number of times using the training data;
S102, acquiring a plurality of texts to be processed labeled with subjects, and predicting the plurality of texts to be processed using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text to be processed;
S103, obtaining confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject;
S104, after training the deep learning model using the training data and the confidence data, returning to the step of acquiring texts to be processed, repeating the cycle until the training of the deep learning model reaches a termination condition, thereby obtaining an information extraction model.
In the method for establishing an information extraction model provided by this embodiment, the deep learning model is first trained a preset number of times using the training data. The model is then used to predict the acquired texts to be processed that are labeled with subjects, and confidence data is obtained from each text to be processed, its labeled subject, and the prediction result. The deep learning model is then trained on the training data combined with the confidence data. In this way, the deep learning model autonomously mines relationship information in the texts to be processed and continuously learns new knowledge, which simplifies the training steps of the information extraction model, improves its training efficiency, and enables the established information extraction model to extract relationship information from texts of different relationship categories.
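The following sketch outlines this iterative self-training loop in Python. It is a minimal illustration of steps S101-S104 only; all callables and their signatures are hypothetical stand-ins for the operations described in the text, not an implementation from the application.

    def build_information_extraction_model(model, training_data, preset_steps,
                                           fetch_labeled_texts, predict, train,
                                           build_confidence_data,
                                           reached_termination):
        # S101: train the deep learning model a preset number of times
        # on the labeled training data.
        for _ in range(preset_steps):
            train(model, training_data)

        while not reached_termination(model):
            # S102: acquire subject-labeled texts and predict the predicate
            # and object corresponding to each labeled subject.
            texts = fetch_labeled_texts()
            predictions = [predict(model, text, subject)
                           for text, subject in texts]

            # S103: keep only high-confidence predictions as confidence data.
            confidence_data = build_confidence_data(texts, predictions)

            # S104: retrain on the training data plus the confidence data,
            # then loop back to acquiring new texts.
            train(model, training_data + confidence_data)

        return model  # the resulting information extraction model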
In this embodiment, the training data obtained by executing S101 is labeled data, and each training data is composed of a training text, a subject in the training text, and a predicate and an object corresponding to the subject.
For example, for the training text "The author of the novel 'The Three-Body Problem' is Liu Cixin", the subject is "The Three-Body Problem", the predicate corresponding to that subject is "author", and the object corresponding to that subject is "Liu Cixin". This embodiment therefore takes ("The author of the novel 'The Three-Body Problem' is Liu Cixin", "The Three-Body Problem", "author", "Liu Cixin") as one training data item.
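For illustration, such a training data item could be represented as follows; the field names are ours, not the application's:

    from dataclasses import dataclass

    @dataclass
    class TrainingExample:
        text: str       # the training text
        subject: str    # the subject labeled in the text
        predicate: str  # the predicate corresponding to the subject
        obj: str        # the object corresponding to the subject

    example = TrainingExample(
        text="The author of the novel 'The Three-Body Problem' is Liu Cixin",
        subject="The Three-Body Problem",
        predicate="author",
        obj="Liu Cixin",
    )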
Specifically, the deep learning model in this embodiment consists of a pre-trained model (e.g., ERNIE), a predicate predictor, and an object predictor; the pre-trained model produces a semantic vector, the predicate predictor predicts the predicate from the semantic vector, and the object predictor predicts the object from the semantic vector and the predicted predicate.
Therefore, in this embodiment, when the deep learning model is trained a preset number of times using the training data in step S101, an optional implementation is as follows: for each training data item, convert the training text into a vector sequence and concatenate it with the position information of the subject in the training text; use the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model; use the semantic vector as the input of the predicate predictor, with the predicate corresponding to the subject as the output of the predicate predictor; and use the semantic vector together with the vector in the semantic vector corresponding to the predicate as the input of the object predictor, with the object corresponding to the subject as the output of the object predictor. The preset number of times may be set by the user or may be a fixed number, for example one.
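To make the three-part structure concrete, here is a minimal PyTorch-style sketch under our own assumptions: the pre-trained encoder is stood in by a single linear layer, the predicate predictor scores a fixed label set per token, and the object predictor scores object start/end positions from the semantic vectors concatenated with the vector at the highest-scoring predicate token. None of the dimensions or heads below are specified by the application.

    import torch
    import torch.nn as nn

    class InformationExtractionModel(nn.Module):
        def __init__(self, embed_dim=128, pos_dim=16, hidden_dim=256,
                     num_predicates=50):
            super().__init__()
            # Stand-in for the pre-trained model (e.g. ERNIE): a single
            # linear layer mapping the concatenated input to semantic vectors.
            self.encoder = nn.Linear(embed_dim + pos_dim, hidden_dim)
            # Predicate predictor: scores each candidate predicate label
            # at every token position.
            self.predicate_head = nn.Linear(hidden_dim, num_predicates)
            # Object predictor: consumes the semantic vectors concatenated
            # with the predicate vector; scores object start/end per token.
            self.object_head = nn.Linear(hidden_dim * 2, 2)

        def forward(self, token_vectors, subject_positions):
            # Concatenate the vector sequence with the subject's position
            # information, as described in the embodiment.
            x = torch.cat([token_vectors, subject_positions], dim=-1)
            semantic = torch.tanh(self.encoder(x))      # (batch, seq, hidden)
            predicate_logits = self.predicate_head(semantic)
            # Take the semantic vector at the highest-scoring predicate token
            # as "the vector corresponding to the predicate" (our assumption).
            idx = predicate_logits.max(dim=-1).values.argmax(dim=1)
            pred_vec = semantic[torch.arange(semantic.size(0)), idx]
            pred_vec = pred_vec.unsqueeze(1).expand_as(semantic)
            object_logits = self.object_head(
                torch.cat([semantic, pred_vec], dim=-1))
            return predicate_logits, object_logits

    model = InformationExtractionModel()
    tokens = torch.randn(2, 8, 128)    # batch of 2 texts, 8 tokens each
    subj_pos = torch.randn(2, 8, 16)   # subject position features
    predicate_logits, object_logits = model(tokens, subj_pos)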
In the embodiment, the deep learning model is trained for a preset number of times by using training data, so that the trained deep learning model has certain capability of extracting relationship information, that is, a predicate and an object corresponding to a subject in a text can be predicted according to the input text and the subject in the text.
After performing S101 training for a preset number of times on the deep learning model by using the obtained training data, the present embodiment performs S102 to obtain a plurality of texts to be processed, which are marked with subjects, and predicts the plurality of texts to be processed by using the deep learning model, so as to obtain predicates and objects corresponding to the marked subjects in the texts to be processed.
In this embodiment, when S102 is executed to acquire a plurality of subject-labeled texts to be processed, an optional implementation is as follows: acquire a plurality of texts containing relation words, for example from a network, as the texts to be processed, and label the subject in each text to be processed.
It can be understood that this embodiment may use a preset relation-word list to obtain texts containing relation words from the network, where the list contains words that can serve as relation words. Alternatively, a pre-trained relation-word recognition model may be used to determine whether a text acquired from the network contains relation words; that is, the recognition model can identify whether a text contains a relation word.
Research shows that, for the deep learning model to better mine relationship information in new texts autonomously, the acquired texts need to contain relation words; and by acquiring texts containing relation words, the deep learning model can extract relationship information from texts of different relationship categories. For example, in the text "The president of country A is B", "president" is the relation word between the country "A" and the person name "B".
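As a toy illustration of the preset relation-word list option, the filter below keeps only texts containing a listed relation word; the list contents are invented examples:

    RELATION_WORDS = {"author", "president", "singer", "album", "capital"}

    def contains_relation_word(text: str) -> bool:
        return any(word in text.lower() for word in RELATION_WORDS)

    candidates = [
        "The author of the novel 'The Three-Body Problem' is Liu Cixin",
        "It rained heavily all weekend",
    ]
    texts_to_process = [t for t in candidates if contains_relation_word(t)]
    # keeps only the first sentence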
In this embodiment, when S102 is executed to predict the texts to be processed using the deep learning model and obtain the predicate and object corresponding to the labeled subject in each text, an optional implementation is as follows: for each text to be processed, convert the text into a vector sequence and concatenate it with the position information of the labeled subject; use the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model; use the semantic vector as the input of the predicate predictor, and obtain the predicate corresponding to the subject from the predicate predictor's output; and use the semantic vector together with the vector in the semantic vector corresponding to the obtained predicate as the input of the object predictor, and obtain the object corresponding to the subject from the object predictor's output.
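The application does not specify how the predictors' outputs are decoded into a concrete predicate and object; a simple greedy argmax decoding, shown below with dummy logits, is one plausible reading:

    import torch

    # Dummy outputs for one text of 8 tokens: scores over 5 candidate
    # predicate labels per token, and object start/end scores per token.
    predicate_logits = torch.randn(8, 5)
    object_logits = torch.randn(8, 2)

    predicate_id = predicate_logits.mean(dim=0).argmax().item()
    object_start = object_logits[:, 0].argmax().item()
    object_end = object_logits[:, 1].argmax().item()
    print(predicate_id, (object_start, object_end))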
After S102 uses the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text to be processed, S103 is executed to obtain confidence data according to each text to be processed, the subject in each text, and the predicate and object corresponding to the subject.
It is understood that, when the confidence data is obtained by executing S103, all the texts to be processed, the subject of each text to be processed, and the predicate and the object corresponding to the subject, which are obtained in S102, may be used as the confidence data.
However, until the training of the deep learning model reaches the termination condition, the accuracy of the predictions the model makes from a text and its subject is relatively unstable, so the prediction results for some texts to be processed have low accuracy; using such low-accuracy results for subsequent training would slow the training of the deep learning model and prolong its training time.
To ensure that the confidence data has high accuracy, when S103 is executed in this embodiment to obtain the confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject, an optional implementation is: determine the confidence of each text to be processed; select the texts to be processed whose confidence exceeds a preset threshold; and use the selected texts to be processed, their subjects, and the predicates and objects corresponding to the subjects as the confidence data.
When determining the confidence of a text to be processed, this embodiment may use a pre-trained confidence model, which derives the confidence from the text to be processed, its subject, and at least one of the predicate and the object corresponding to the subject; alternatively, the confidence may be calculated from information such as the number of sources of the text to be processed and the number of occurrences of the subject and of its corresponding predicate and object.
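The occurrence-count option can be sketched as follows; scoring a (subject, predicate, object) triple by how many texts produced it is our reading, and the threshold is a placeholder:

    from collections import Counter

    def select_confidence_data(predictions, threshold=2):
        # predictions: list of (text, subject, predicate, object) tuples
        counts = Counter((s, p, o) for _, s, p, o in predictions)
        return [(t, s, p, o) for t, s, p, o in predictions
                if counts[(s, p, o)] >= threshold]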
It is understood that, after the confidence data is obtained in step S103, the obtained confidence data may be entered into the knowledge base, so that the confidence data may be directly obtained from the knowledge base in the subsequent training process of the deep learning model.
After the confidence data is obtained in step S103, step S104 is executed to train the deep learning model by using the training data and the confidence data, and then the step of obtaining the text to be processed is executed, so that the process is performed in a circulating manner until the training of the deep learning model reaches the termination condition, and the information extraction model is obtained.
Similarly to the training with training data in S101, when S104 trains the deep learning model using the confidence data, an optional implementation is as follows: for each confidence data item, convert the text to be processed into a vector sequence and concatenate it with the position information of the subject in the text; use the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model; use the semantic vector as the input of the predicate predictor, with the predicate corresponding to the subject as the output; and use the semantic vector together with the vector in the semantic vector corresponding to the predicate as the input of the object predictor, with the object corresponding to the subject as the output, thereby training the deep learning model.
The way of performing S104 to train the deep learning model by using the training data in the present embodiment is the same as the way of performing S101 to train the deep learning model by using the training data in the present embodiment, and is not described herein again.
That is to say, in the process of establishing the information extraction model, the deep learning model is trained in two modes. Because the second mode trains on the training data combined with continuously acquired confidence data, the deep learning model continuously learns new knowledge from continuously updated data, so that the established information extraction model can extract relationship information from texts of different relationship categories.
In this embodiment, the termination condition for the training in S104 may be that the number of training iterations of the deep learning model exceeds a termination count, or that the accuracy of the relationship information extracted by the deep learning model exceeds a termination accuracy.
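Rendered directly in code, with illustrative threshold values:

    def reached_termination(train_count, accuracy,
                            termination_count=100, termination_accuracy=0.95):
        # Stop when the training count exceeds the termination count or the
        # extraction accuracy exceeds the termination accuracy.
        return train_count > termination_count or accuracy > termination_accuracy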
The information extraction model obtained in S104 can output, given an input text and a subject in the text, the predicate and object corresponding to that subject. The text input to the information extraction model may be a text containing relation words.
With the method provided by this embodiment, the deep learning model is trained with continuously acquired confidence data, so that it autonomously mines relationship information in different texts and continuously learns new knowledge; this simplifies the training steps of the information extraction model, improves its training efficiency, and gives the established information extraction model the ability to extract relationship information from texts of different relationship categories.
Fig. 2 is a schematic diagram according to a second embodiment of the present application. As shown in fig. 2, the apparatus for creating an information extraction model according to this embodiment includes:
the first training unit 201 is configured to obtain a plurality of training data, and train the deep learning model for a preset number of times by using the training data;
the prediction unit 202 is configured to obtain a plurality of texts to be processed, which are marked with subjects, and predict the plurality of texts to be processed by using the deep learning model to obtain predicates and objects corresponding to the marked subjects in the texts to be processed;
the processing unit 203 is configured to obtain confidence data according to each text to be processed, a subject in each text to be processed, and a predicate and an object corresponding to the subject;
and the second training unit 204 is configured to, after the deep learning model is trained by using the training data and the confidence data, go to a step of acquiring the text to be processed, and perform the steps in a circulating manner until the training of the deep learning model reaches a termination condition, so as to obtain an information extraction model.
The training data acquired by the first training unit 201 in this embodiment is labeled data, and each training data is composed of a training text, a subject in the training text, and a predicate and an object corresponding to the subject.
Specifically, the deep learning model in this embodiment consists of a pre-trained model (e.g., ERNIE), a predicate predictor, and an object predictor; the pre-trained model produces a semantic vector, the predicate predictor predicts the predicate from the semantic vector, and the object predictor predicts the object from the semantic vector and the predicted predicate.
Therefore, when the first training unit 201 in this embodiment trains the deep learning model a preset number of times using the training data, an optional implementation is as follows: for each training data item, convert the training text into a vector sequence and concatenate it with the position information of the subject in the training text; use the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model; use the semantic vector as the input of the predicate predictor, with the predicate corresponding to the subject as the output of the predicate predictor; and use the semantic vector together with the vector in the semantic vector corresponding to the predicate as the input of the object predictor, with the object corresponding to the subject as the output of the object predictor. The preset number of times may be set by the user or may be a fixed number, for example one.
The first training unit 201 in this embodiment trains the deep learning model a preset number of times using training data, so that the trained deep learning model has a certain capability of extracting relationship information, that is, a predicate and an object corresponding to a subject in a text can be predicted from the input text and the subject in the text.
In the present embodiment, after the first training unit 201 performs training on the deep learning model for the preset number of times by using the acquired training data, the prediction unit 202 acquires a plurality of texts to be processed, to which the subjects are labeled, and predicts the plurality of texts to be processed by using the deep learning model, so as to obtain the predicates and objects corresponding to the labeled subjects in each text to be processed.
When the prediction unit 202 in this embodiment acquires a plurality of texts to be processed labeled with subjects, an optional implementation is as follows: acquire a plurality of texts containing relation words, for example from a network, as the texts to be processed, and label the subject in each text to be processed.
It is understood that the prediction unit 202 may obtain the text containing the relation words from the network by presetting a relation word list, where the relation word list may include words that can be used as the relation words; the prediction unit 202 may also determine whether the text obtained from the network is a text containing the relation words by using a relation word recognition model obtained through pre-training, that is, the relation word recognition model can recognize whether the text contains the relation words.
When the prediction unit 202 in this embodiment predicts the plurality of texts to be processed using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text, an optional implementation is as follows: for each text to be processed, convert the text into a vector sequence and concatenate it with the position information of the labeled subject; use the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model; use the semantic vector as the input of the predicate predictor, and obtain the predicate corresponding to the subject from the predicate predictor's output; and use the semantic vector together with the vector in the semantic vector corresponding to the obtained predicate as the input of the object predictor, and obtain the object corresponding to the subject from the object predictor's output.
In the present embodiment, after the predicate and the object corresponding to the subject labeled in the text to be processed are obtained by the prediction unit 202 using the deep learning model, the confidence data is obtained by the processing unit 203 from each text to be processed, the subject in each text to be processed, and the predicate and the object corresponding to the subject.
It is understood that, when obtaining the confidence data, the processing unit 203 in this embodiment may use all the texts to be processed, the subject of each text to be processed, and the predicate and object corresponding to the subject, which are acquired by the prediction unit 202, as the confidence data.
To ensure that the confidence data has high accuracy, when the processing unit 203 in this embodiment obtains the confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject, an optional implementation is: determine the confidence of each text to be processed; select the texts to be processed whose confidence exceeds a preset threshold; and use the selected texts to be processed, their subjects, and the predicates and objects corresponding to the subjects as the confidence data.
When determining the confidence of the text to be processed, the processing unit 203 may use a confidence model obtained by pre-training to obtain the confidence; the processing unit 203 may also calculate the confidence level according to information such as the number of sources of the text to be processed, the subjects in the text to be processed, and the number of occurrences of the predicates and objects corresponding to the subjects.
It is understood that, after obtaining the confidence data, the processing unit 203 in this embodiment may enter the obtained confidence data into the knowledge base, so as to obtain the confidence data directly from the knowledge base in the subsequent training process of the deep learning model.
In this embodiment, after the processing unit 203 obtains the confidence data, the second training unit 204 trains the deep learning model by using the training data and the confidence data, and then the step of obtaining the text to be processed is executed, so that the process is performed in a loop until the training of the deep learning model reaches the termination condition, and the information extraction model is obtained.
Similarly to the way the first training unit 201 trains the deep learning model with training data, when the second training unit 204 in this embodiment trains the deep learning model using the confidence data, an optional implementation is as follows: for each confidence data item, convert the text to be processed into a vector sequence and concatenate it with the position information of the subject in the text; use the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model; use the semantic vector as the input of the predicate predictor, with the predicate corresponding to the subject as the output; and use the semantic vector together with the vector in the semantic vector corresponding to the predicate as the input of the object predictor, with the object corresponding to the subject as the output, thereby training the deep learning model.
The way in which the second training unit 204 trains the deep learning model using the training data in this embodiment is the same as the way in which the first training unit 201 trains the deep learning model using the training data in this embodiment, and is not described herein again.
In this embodiment, the termination condition for the second training unit 204's training of the deep learning model may be that the number of training iterations of the deep learning model exceeds a termination count, or that the accuracy of the relationship information extracted by the deep learning model exceeds a termination accuracy.
In this embodiment, the information extraction model obtained by the second training unit 204 can output, given an input text and a subject in the text, the predicate and object corresponding to that subject. The text input to the information extraction model may be a text containing relation words.
There is also provided, in accordance with an embodiment of the present application, an electronic device, a computer-readable storage medium, and a computer program product.
FIG. 3 illustrates a schematic block diagram of an example electronic device 300 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 3, the device 300 includes a computing unit 301 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 302 or a computer program loaded from a storage unit 308 into a Random Access Memory (RAM) 303. The RAM 303 can also store various programs and data required for the operation of the device 300. The computing unit 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Various components in device 300 are connected to I/O interface 305, including: an input unit 306 such as a keyboard, a mouse, or the like; an output unit 307 such as various types of displays, speakers, and the like; a storage unit 308 such as a magnetic disk, optical disk, or the like; and a communication unit 309 such as a network card, modem, wireless communication transceiver, etc. The communication unit 309 allows the device 300 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 301 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 301 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 301 executes the methods and processes described above, such as the method of establishing an information extraction model. For example, in some embodiments, the method of establishing an information extraction model may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 308. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 300 via the ROM 302 and/or the communication unit 309. When the computer program is loaded into the RAM 303 and executed by the computing unit 301, one or more steps of the method of establishing an information extraction model described above may be performed. Alternatively, in other embodiments, the computing unit 301 may be configured to perform the method of establishing the information extraction model by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that remedies the defects of high management difficulty and weak service expansibility of traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A method for establishing an information extraction model, comprising:
acquiring a plurality of training data, and training a deep learning model a preset number of times using the training data;
acquiring a plurality of texts to be processed labeled with subjects, and predicting the plurality of texts to be processed using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text to be processed;
obtaining confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject; and
after training the deep learning model using the training data and the confidence data, returning to the step of acquiring the texts to be processed, and repeating the cycle until the training of the deep learning model reaches a termination condition, thereby obtaining an information extraction model;
wherein the information extraction model is capable of outputting, according to an input text and a subject in the text, the predicate and object corresponding to the subject.

2. The method according to claim 1, wherein the deep learning model consists of a pre-trained model, a predicate predictor, and an object predictor.

3. The method according to claim 2, wherein training the deep learning model comprises:
converting a text into a vector sequence and concatenating it with position information of the subject in the text, the text being a training text in the training data or a text to be processed in the confidence data;
using the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model;
using the semantic vector as the input of the predicate predictor, with the predicate corresponding to the subject as the output of the predicate predictor; and
using the semantic vector together with the vector in the semantic vector corresponding to the predicate as the input of the object predictor, with the object corresponding to the subject as the output of the object predictor.

4. The method according to claim 1, wherein the acquiring a plurality of texts to be processed labeled with subjects comprises:
acquiring a plurality of texts containing relation words as the plurality of texts to be processed, and labeling the subject in each text to be processed.

5. The method according to claim 2, wherein the predicting the plurality of texts to be processed using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text to be processed comprises:
for each text to be processed, converting the text into a vector sequence and concatenating it with position information of the labeled subject in the text;
using the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model;
using the semantic vector as the input of the predicate predictor, and obtaining the predicate corresponding to the subject according to the output of the predicate predictor; and
using the semantic vector together with the vector in the semantic vector corresponding to the obtained predicate as the input of the object predictor, and obtaining the object corresponding to the subject according to the output of the object predictor.

6. The method according to claim 1, wherein the obtaining confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject comprises:
determining the confidence of each text to be processed;
selecting the texts to be processed whose confidence exceeds a preset threshold; and
using the selected texts to be processed, their subjects, and the predicates and objects corresponding to the subjects as the confidence data.

7. An apparatus for establishing an information extraction model, comprising:
a first training unit configured to acquire a plurality of training data and train a deep learning model a preset number of times using the training data;
a prediction unit configured to acquire a plurality of texts to be processed labeled with subjects, and predict the plurality of texts to be processed using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text to be processed;
a processing unit configured to obtain confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject; and
a second training unit configured to, after the deep learning model is trained using the training data and the confidence data, return to the step of acquiring the texts to be processed, repeating the cycle until the training of the deep learning model reaches a termination condition, thereby obtaining an information extraction model;
wherein the information extraction model is capable of outputting, according to an input text and a subject in the text, the predicate and object corresponding to the subject.

8. The apparatus according to claim 7, wherein the deep learning model consists of a pre-trained model, a predicate predictor, and an object predictor.

9. The apparatus according to claim 8, wherein, when training the deep learning model, the first training unit or the second training unit specifically:
converts a text into a vector sequence and concatenates it with position information of the subject in the text, the text being a training text in the training data or a text to be processed in the confidence data;
uses the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model;
uses the semantic vector as the input of the predicate predictor, with the predicate corresponding to the subject as the output of the predicate predictor; and
uses the semantic vector together with the vector in the semantic vector corresponding to the predicate as the input of the object predictor, with the object corresponding to the subject as the output of the object predictor.

10. The apparatus according to claim 7, wherein, when acquiring a plurality of texts to be processed labeled with subjects, the prediction unit specifically:
acquires a plurality of texts containing relation words as the plurality of texts to be processed, and labels the subject in each text to be processed.

11. The apparatus according to claim 8, wherein, when predicting the plurality of texts to be processed using the deep learning model to obtain the predicate and object corresponding to the labeled subject in each text to be processed, the prediction unit specifically:
for each text to be processed, converts the text into a vector sequence and concatenates it with position information of the labeled subject in the text;
uses the concatenation result as the input of the pre-trained model to obtain the semantic vector output by the pre-trained model;
uses the semantic vector as the input of the predicate predictor, and obtains the predicate corresponding to the subject according to the output of the predicate predictor; and
uses the semantic vector together with the vector in the semantic vector corresponding to the obtained predicate as the input of the object predictor, and obtains the object corresponding to the subject according to the output of the object predictor.

12. The apparatus according to claim 7, wherein, when obtaining confidence data according to each text to be processed, the subject in each text to be processed, and the predicate and object corresponding to the subject, the processing unit specifically:
determines the confidence of each text to be processed;
selects the texts to be processed whose confidence exceeds a preset threshold; and
uses the selected texts to be processed, their subjects, and the predicates and objects corresponding to the subjects as the confidence data.

13. An electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1-6.

14. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are configured to cause a computer to perform the method according to any one of claims 1-6.

15. A computer program product comprising instructions which, when executed by a processor, perform the method for establishing an information extraction model according to any one of claims 1-6.
CN202011454491.9A 2020-12-10 2020-12-10 Method and device for establishing information extraction model, electronic equipment and readable storage medium Pending CN112528641A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011454491.9A CN112528641A (en) 2020-12-10 2020-12-10 Method and device for establishing information extraction model, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011454491.9A CN112528641A (en) 2020-12-10 2020-12-10 Method and device for establishing information extraction model, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN112528641A (en) 2021-03-19

Family

ID=74998870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011454491.9A Pending CN112528641A (en) 2020-12-10 2020-12-10 Method and device for establishing information extraction model, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN112528641A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844348A (en) * 2017-02-13 2017-06-13 哈尔滨工业大学 A kind of Chinese sentence functional component analysis method
CN110807311A (en) * 2018-07-18 2020-02-18 百度在线网络技术(北京)有限公司 Method and apparatus for generating information
CN109271626A (en) * 2018-08-31 2019-01-25 北京工业大学 Text semantic analysis
CN110377694A (en) * 2019-06-06 2019-10-25 北京百度网讯科技有限公司 Text is marked to the method, apparatus, equipment and computer storage medium of logical relation
CN111291185A (en) * 2020-01-21 2020-06-16 京东方科技集团股份有限公司 Information extraction method, device, electronic device and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818138A (en) * 2021-04-19 2021-05-18 中译语通科技股份有限公司 Knowledge graph ontology construction method and device, terminal device and readable storage medium
CN113609847A (en) * 2021-08-10 2021-11-05 北京百度网讯科技有限公司 Information extraction method, device, electronic device and storage medium
CN113609847B (en) * 2021-08-10 2023-10-27 北京百度网讯科技有限公司 Information extraction method, device, electronic equipment and storage medium
CN113836316A (en) * 2021-09-23 2021-12-24 北京百度网讯科技有限公司 Three-tuple data processing method, training method, device, equipment and medium
CN115391450A (en) * 2022-08-26 2022-11-25 百度在线网络技术(北京)有限公司 Inference information generation method, device, equipment, readable storage medium and product
CN115391450B (en) * 2022-08-26 2024-01-09 百度在线网络技术(北京)有限公司 Inference information generation method, device, equipment, readable storage medium and product

Similar Documents

Publication Publication Date Title
US20220318275A1 (en) Search method, electronic device and storage medium
WO2020108063A1 (en) Feature word determining method, apparatus, and server
CN112528641A (en) Method and device for establishing information extraction model, electronic equipment and readable storage medium
CN114724168A (en) Training method of deep learning model, text recognition method, text recognition device and text recognition equipment
CN113963197A (en) Image recognition method and device, electronic equipment and readable storage medium
CN113407610A (en) Information extraction method and device, electronic equipment and readable storage medium
CN112949818A (en) Model distillation method, device, equipment and storage medium
CN113204616A (en) Method and device for training text extraction model and extracting text
EP3992814A2 (en) Method and apparatus for generating user interest profile, electronic device and storage medium
CN118568256B (en) Method and device for evaluating text classification performance of large language model
CN114186681A (en) Method, apparatus and computer program product for generating model clusters
CN113904943A (en) Account detection method, device, electronic device and storage medium
US12007965B2 (en) Method, device and storage medium for deduplicating entity nodes in graph database
CN114549695A (en) Image generation method and device, electronic equipment and readable storage medium
CN113010571A (en) Data detection method, data detection device, electronic equipment, storage medium and program product
CN115658903A (en) Text classification method, model training method, related device and electronic equipment
CN113408280A (en) Negative example construction method, device, equipment and storage medium
CN115328956B (en) Data query method and device based on artificial intelligence and storage medium
CN113051926A (en) Text extraction method, equipment and storage medium
US20250299052A1 (en) Large model-based text generation method, electronic device and storage medium
CN114861059A (en) Resource recommendation method and device, electronic equipment and storage medium
CN114997329A (en) Method, apparatus, device, medium and product for generating a model
CN114218961A (en) Text detection method and device and electronic equipment
CN116524516B (en) Text structured information determination method, device, equipment and storage medium
CN117992569A (en) Method, device, equipment and medium for generating documents based on generative large model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination