Detailed Description
In order to better understand the technical solutions provided by the embodiments of the present application, the following detailed description will be given with reference to the accompanying drawings and specific embodiments.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and in the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In order to facilitate a better understanding of the technical solution of the present application, the following description will explain the basic concepts related to the present application.
1) Question-answering system
A question-answering (QA) system is an advanced form of information retrieval system that can answer questions posed by users in natural language with accurate and concise natural-language responses. Research into question-answering systems is driven mainly by people's need to acquire information quickly and accurately, and such systems are a closely watched research direction with broad development prospects in the fields of artificial intelligence and natural language processing.
2) An account, a first account and a second account
In the embodiments of the application, two different types of accounts are distinguished: a first account refers to an account used for creating the question-answer corpus, and a second account refers to an ordinary account that uses the question-answering system to obtain answers. The first account may be an enterprise-level account in the question-answering system; in the embodiments of the application, the first account is used for inputting basic questions to the question-answering system and receiving an expanded question set returned by the system. The second account may be used for inputting questions to be answered to the question-answering system, which in turn returns answer information related to those questions.
3) Question-answer corpus
The question-answer corpus in the embodiment of the application comprises question sentences and answer information associated with the question sentences.
4) Words, first word and second word
In general, a word refers to one or more characters in a text, and the form of a word is related to the language form of the text. For example, when the language of the text is Chinese, a word may be a single Chinese character or a phrase composed of several Chinese characters; when the language of the text is English, a word may be an English word or a phrase composed of several English words; and when the language of the text is French, Italian, Japanese, Korean, or the like, those skilled in the art can set the form of the corresponding words according to actual requirements;
In the embodiments of the application, the words involved are mainly the words in each question (such as a basic question, a question to be processed, or an expanded question) and the words in a preset word set. For ease of distinction, "first word" refers to a word in the question to be processed, and "second word" refers to a word in the preset word set.
5) Semantic influence value and context association information of words
For a given word in a sentence, the semantic influence value of the word characterizes the degree to which the word influences the semantics of the sentence, and the context association information of the word characterizes the association between the word and the other words in the sentence. The context association information may be determined by, but is not limited to, the degree of association between the word and the word preceding it and the degree of association between the word and the word following it.
6) Bert (Bidirectional Encoder Representations from Transformers) model
The Bert model is the encoder network (Encoder) of a bidirectional Transformer. Its aim is to obtain a semantic representation (Representation) of a text containing rich semantic information by training on large-scale unlabeled corpora, then fine-tune that semantic representation for a specific natural language processing (Natural Language Processing, NLP) task, and finally apply it to that specific NLP task.
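The bidirectionality described above can be illustrated with a minimal numpy sketch of scaled dot-product self-attention, the operation at the heart of such an encoder. Every position attends to positions on both its left and its right. The dimensions, the random token embeddings, and the omission of learned projections are illustrative assumptions; this is not the Bert model itself.

```python
import numpy as np

def self_attention(x):
    """Bidirectional scaled dot-product self-attention over a token sequence.

    Every position attends to all positions (left and right context alike),
    which is what makes the encoder "bidirectional".
    x: (seq_len, d_model) array of token embeddings, used here as query,
    key, and value at once (real models apply learned projections first).
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # (seq_len, seq_len) affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ x                               # context-mixed representations

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 8))      # 5 tokens, 8-dim embeddings (toy values)
contextual = self_attention(tokens)
print(contextual.shape)               # one contextual vector per token
```

Each output row is a mixture of all input rows, so every word's representation already reflects its full sentence context, which is the property the fine-tuning step later exploits.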
7) Artificial intelligence (Artificial Intelligence, AI)
Artificial intelligence is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive discipline of computer science that seeks to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence; it studies the design principles and implementation methods of various intelligent machines so that machines have the functions of perception, reasoning, and decision-making. Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning or deep learning, among other broad directions.
8) Natural language processing (Natural Language Processing, NLP)
Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, i.e., the language people use daily, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robot question answering, knowledge graph techniques, and the like.
The following describes the design concept of the present application.
In the construction of the question-answering field, the question-answer corpus (comprising question sentences and the answers associated with them) is critical. In the related art, however, the question-answer corpus is usually obtained by manually constructing questions or manually expanding existing questions. Manual expansion of the corpus is inefficient: one person can only process a limited amount of data per day, so creating a large question-answer corpus takes a long time. Moreover, the personnel creating the corpus cannot anticipate the question forms actually used by ordinary users, so the accuracy and richness of the manually expanded questions are low. Questions can also be mined from historical user logs, but if the user group is not large enough, the number of captured historical questions is small and their forms tend to be homogeneous, so the quantity and quality of the resulting corpus are clearly deficient. In addition, the accuracy of questions mined from logs in this way can be low, and the mined questions still need to be screened and corrected, which again takes considerable time.
In view of the above, and considering that both manual creation and log mining of questions for a question-answer corpus are time-consuming and inefficient, the inventors designed a question expansion method, apparatus, device, and computer storage medium, in which the questions of the question-answer corpus are obtained by expansion, so that more question-answer corpora can be obtained based on the expanded questions. Specifically, a question to be processed is obtained based on a basic question, and an expanded question set corresponding to the question to be processed (i.e., an expanded question set corresponding to the basic question) is generated based on the semantic influence values of the first words in the question to be processed and the context association information of those first words, where the expanded question set may contain one or more expanded questions whose semantic similarity to the question to be processed is greater than a first preset threshold.
It should be noted that the questions involved in the embodiments of the present application may be, but are not limited to, text information or voice information, and those skilled in the art may set them according to actual requirements.
To make the design concept of the present application clearer, an application scenario of an embodiment of the present application is described by way of example. Referring to fig. 1, a schematic structural diagram of a question-answering system is provided. The system includes a terminal device 100 and a question-answering server 200. A question-answering client 110 may be installed on the terminal device 100 (for example, but not limited to, 100-1 or 100-2 in the figure). The question-answering client 110 is the client of the question-answering system, the question-answering server 200 is the server of the question-answering system, and the question-answering client 110 and the question-answering server 200 communicate with each other.
The question-answering client 110 (such as 110-1 or 110-2 in the figure) can send the basic question indicated by the first account through the fourth display page to the question-answering server 200, can send the question to be answered indicated by the second account through the third display page to the question-answering server 200, and can display the expanded question or corresponding answer information in the display page provided by the question-answering client 110 based on the indication of the question-answering server 200.
The question-answering server 200 may, but is not limited to, obtain a basic question from the question-answering knowledge base 300 or receive a basic question sent by the question-answering client 110, obtain a question to be processed based on the basic question, and generate an expanded question set corresponding to the question to be processed based on the semantic influence value and the context association information of each first word in the question to be processed. The question-answering server 200 may also send the expanded question set corresponding to the question to be processed to the question-answering client 110.
As an embodiment, the question-answering server 200 may also receive the question to be answered sent by the question-answering client 110, and based on the question to be answered, return answer information associated with the target question group including the question to be answered.
The question-answering server 200 may be an independent physical server, a server cluster or distributed system composed of a plurality of physical servers, or a plurality of cloud servers (such as, but not limited to, the server 200-1, the server 200-2, or the server 200-3 illustrated in the figure) providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The functions of the question-answering server 200 may be implemented by one or more cloud servers, or by one or more cloud server clusters.
The terminal device 100 in the embodiments of the present application may be a mobile terminal, a fixed terminal or a portable terminal, such as a mobile handset, a site, a unit, a device, a multimedia computer, a multimedia tablet, an internet node, a communicator, a desktop computer, a laptop computer, a notebook computer, a netbook computer, a tablet computer, a Personal Communication System (PCS) device, a personal navigation device, a Personal Digital Assistant (PDA), an audio/video player, a digital camera/video camera, a positioning device, a television receiver, a radio broadcast receiver, an electronic book device, a game device, or any combination thereof, including accessories and peripherals of these devices or any combination thereof.
Referring to fig. 2, a question extension method according to an embodiment of the present application is applied to the question answering system (i.e. the question answering server 200 or the combination of the question answering server 200 and the question answering client 110), and specifically includes the following steps:
Step S201, acquiring a question to be processed based on the basic question.
As an embodiment, a basic question may be obtained before step S201. A basic question may be obtained from the questions in the question-answer corpus in the question-answering knowledge base 300; for example, one or more questions may be selected at random from the corpus as basic questions, or, according to the number of times each question in the corpus has been recalled, questions whose historical recall count is greater than a first count threshold may be selected from the corpus. The first count threshold is not limited here and may be set by those skilled in the art according to actual needs.
In the embodiments of the application, in order to promote the diversity of the finally obtained expanded question set, some or all of the basic questions can be directly determined as questions to be processed: when one basic question is obtained, that basic question can be determined as the question to be processed; when a plurality of basic questions are obtained, some of them can be screened out and determined as questions to be processed, or all of them can be determined as questions to be processed.
Further, in screening out some questions from the basic questions as questions to be processed, the questions may be screened out at random, or screened out in different ways according to the different sources of the basic questions. If the basic questions are obtained from questions in the corpus, they may be screened according to their historical recall counts; for example, basic questions whose recall count is greater than a second count threshold may be screened out and determined as questions to be processed. If the basic questions are obtained in response to a question input operation, the K basic questions with the lowest semantic similarity to the other basic questions may be screened out as questions to be processed, or the K basic questions earliest in input order may be screened out as questions to be processed, and so on. The second count threshold is not limited here, K is a positive integer, and both may be set by those skilled in the art according to actual requirements.
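The two screening strategies above can be sketched as plain functions. The data layout (question text paired with a recall count) and the Jaccard stand-in for semantic similarity are illustrative assumptions, not part of the claimed method.

```python
def screen_by_recall(questions, second_count_threshold):
    """Keep basic questions whose historical recall count exceeds the threshold."""
    return [q for q, recalls in questions if recalls > second_count_threshold]

def screen_k_least_similar(questions, k, similarity):
    """Keep the K basic questions least similar, on average, to the others."""
    def avg_sim(q):
        others = [o for o in questions if o != q]
        return sum(similarity(q, o) for o in others) / max(len(others), 1)
    return sorted(questions, key=avg_sim)[:k]

def jaccard(a, b):
    """Toy similarity: word-set overlap, standing in for semantic similarity."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)

corpus = [("how do I reset my password", 12), ("what is my balance", 3)]
print(screen_by_recall(corpus, 5))   # only the frequently recalled question survives
qs = ["how do I reset my password", "how can I reset my password", "what is my balance"]
print(screen_k_least_similar(qs, 1, jaccard))  # the outlier question
```

Screening the K least-similar questions favors diversity in the final expanded set, while screening by recall count favors the questions users actually ask most.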
In the embodiments of the application, a trained target neural network model can also be used to obtain a first similar question of a basic question as a question to be processed. The target neural network model may include, but is not limited to, a Bert model, an Ernie model, an Albert model, and the like. The basic questions (some or all of them) together with the first similar questions can be determined as questions to be processed. The first similar questions of one basic question may include one or more questions, where a first similar question is a question whose semantic similarity to the basic question is greater than a third preset threshold. The way in which the target neural network model is obtained is described in detail below. This improves the flexibility of acquiring questions to be processed and increases the diversity of the acquired questions to be processed.
Step S202, obtaining semantic influence values of all first words in a question to be processed, wherein the semantic influence values represent influence degrees of all first words on the semantics of the question to be processed.
Specifically, a first reference value and a second reference value of each first word may be obtained based on the second words in a preset word set, and the first reference value and the second reference value may be normalized to determine the semantic influence value of each first word. The preset word set may be a pre-created word set; it may, for example, include words with a high frequency of use, and those skilled in the art can set the preset word set according to actual requirements;
The first reference value characterizes the probability of using a first word to generate the corresponding word in an expanded question, that is, the copy probability of the first word in the question to be processed when the expanded question is generated. The second reference value characterizes the probability of using a second word to generate the corresponding word in the expanded question, where the second word is a word in the preset word set whose semantic similarity to the first word is greater than a second preset threshold; that is, the second reference value characterizes the generation probability of producing the first word from the second word when the expanded question is generated. The semantic influence value obtained in this way not only characterizes the influence degree of each first word on the semantics of the question to be processed; the first and second reference values involved can also determine whether a word of the question to be processed can be directly copied when the expanded question is generated, thereby improving the efficiency and accuracy of generating expanded questions.
In the embodiments of the application, a question feature vector of the question to be processed and a word feature vector of each first word can be obtained, so that the semantic influence value of each first word can be determined based on the distance between the word feature vector of that first word and the question feature vector. For example, but not limited to, the distance between the word feature vector of each first word and the question feature vector can be used directly to determine the semantic influence value of each first word, or the semantic influence value of each first word can be obtained after weighting that distance. In the embodiments of the application, the semantic influence value of each first word can also be obtained through a Copy mechanism (Copy mechanism, Copy) or the like: the copy probability of directly copying each first word in the process of generating an expanded question and the generation probability of producing each first word from words in the preset word set are obtained, and the copy probability and the generation probability of each first word are then normalized to obtain the semantic influence value of each first word. The Copy mechanism, which copies words directly from the input sequence when generating output, is described in detail below and is not elaborated here.
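As a minimal sketch of the distance-based variant above, the question feature vector is approximated by the mean of the word feature vectors, and each word's influence is a softmax over its distance to that question vector. The vectors, the mean approximation, and the sign choice (treating a word far from the question vector as more question-specific) are illustrative assumptions; the application only states that the influence value is determined based on the distance.

```python
import math

def semantic_influence(word_vectors):
    """word_vectors: list of equal-length feature vectors, one per first word.

    The question feature vector is approximated by the mean of the word
    vectors; a word far from that mean is treated as carrying more
    question-specific meaning, so distances are softmax-normalized into
    influence values that sum to 1.
    """
    dim = len(word_vectors[0])
    question_vec = [sum(v[d] for v in word_vectors) / len(word_vectors)
                    for d in range(dim)]
    dists = [math.dist(v, question_vec) for v in word_vectors]
    exps = [math.exp(d) for d in dists]
    total = sum(exps)
    return [e / total for e in exps]

vals = semantic_influence([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
print(vals)  # the outlier word [0.0, 1.0] receives the largest influence value
```

The softmax here also doubles as the "normalization" step the text mentions, guaranteeing the influence values of the first words in one question sum to 1.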
Step S203, based on the semantic influence value of each first word and the context association information of each first word, generating an expanded question set corresponding to the question to be processed, where the expanded question set includes expanded questions whose semantic similarity to the question to be processed is greater than a first preset threshold.
Specifically, in the embodiment of the present application, each first word is mapped into a word vector, the word vectors are input into a context processing neural network, and the context processing neural network operates on the input word vectors to obtain the context association information of each first word, where the context processing neural network may be, but is not limited to, an LSTM, a BiLSTM, etc.
As an embodiment, the context association information of a word characterizes the relevance between the word and each other word belonging to the same question. The context association information of a word in a question may include, but is not limited to, a first association degree and a second association degree: the first association degree characterizes the degree of association between the current word and the previous word in the question and may be, but is not limited to, the probability that the current word appears after the previous word; the second association degree characterizes the degree of association between the current word and the next word in the question and may be, but is not limited to, the probability that the next word appears after the current word. For ease of understanding, an example of context association information is given here: suppose the current word is "B", the previous word is "A", and the next word is "C"; if the probability that "B" appears after "A" is 0.8 and the probability that "C" appears after "B" is 0.01, the context association information of "B" may be expressed as (0.8, 0.01). In addition, each word may be mapped into a word vector, and the context association information may also be obtained based on the distances between the word vector of the current word and the word vectors of the previous and next words.
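The first and second association degrees above can be estimated from bigram counts over a corpus; the toy corpus and the maximum-likelihood estimate used here are illustrative assumptions.

```python
from collections import Counter

def bigram_probs(sentences):
    """Maximum-likelihood P(next | current) from whitespace-tokenized sentences."""
    pair_counts, word_counts = Counter(), Counter()
    for s in sentences:
        words = s.split()
        word_counts.update(words[:-1])            # words that have a successor
        pair_counts.update(zip(words, words[1:]))  # adjacent word pairs
    return lambda prev, cur: (pair_counts[(prev, cur)] / word_counts[prev]
                              if word_counts[prev] else 0.0)

def context_association(sentence, i, prob):
    """(first, second) association degrees of the i-th word of the sentence:
    P(word_i | word_{i-1}) and P(word_{i+1} | word_i)."""
    w = sentence.split()
    first = prob(w[i - 1], w[i]) if i > 0 else 0.0
    second = prob(w[i], w[i + 1]) if i + 1 < len(w) else 0.0
    return first, second

p = bigram_probs(["how do I reset", "how do you do", "do I dare"])
print(context_association("how do I reset", 1, p))
```

Here "do" always follows "how" in the toy corpus, so its first association degree is 1.0, while "I" follows "do" in two of three cases, giving a second association degree of 2/3.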
As an embodiment, a plurality of expanded question groups are generated in a grouped manner, and the set of expanded questions in these groups is determined as the expanded question set. One expanded question group contains at least one expanded question; the first words of different expanded questions in the same group may be the same, while the first words of expanded questions in different groups may be different. In this way, multiple groups of expanded questions with different first words can be obtained, which greatly improves the number and richness of the expanded questions.
As an embodiment, in step S203 the expanded question set may be obtained using a Diverse Beam Search (DBS) decoding mechanism, and those skilled in the art may flexibly use other decoding mechanisms to implement step S203.
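A drastically simplified sketch of the grouped decoding idea: each group is forced to start from a different first token and then extends greedily under a toy next-word scorer. The scorer, vocabulary, and length limit are illustrative assumptions; a real Diverse Beam Search additionally keeps several beams per group and penalizes, at every step, tokens already chosen by earlier groups.

```python
def grouped_decode(start_scores, next_scores, n_groups, max_len):
    """start_scores: {token: score} for the first position.
    next_scores: {token: {token: score}} toy transition scores.
    Returns one greedy sequence per group, each with a distinct first token.
    """
    firsts = sorted(start_scores, key=start_scores.get, reverse=True)[:n_groups]
    groups = []
    for first in firsts:                 # a distinct first word per group
        seq = [first]
        while len(seq) < max_len:
            options = next_scores.get(seq[-1], {})
            if not options:
                break
            seq.append(max(options, key=options.get))   # greedy extension
        groups.append(seq)
    return groups

start = {"how": 0.9, "what": 0.8, "why": 0.1}
trans = {"how": {"to": 0.7}, "to": {"reset": 0.6},
         "what": {"is": 0.5}, "is": {"reset": 0.4}}
print(grouped_decode(start, trans, n_groups=2, max_len=3))
```

Forcing the groups apart at the first token is the simplest way to realize the "different first words across groups" property described above.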
As an embodiment, the target neural network model involved in step S201 is described in detail below. It may be, but is not limited to, a question generation (Question Generation, QG) model; in the field of natural language processing, a QG model refers to a model that, given a text and a corresponding answer, generates the question (question sentence) corresponding to that answer from these two pieces of information.
The target neural network model may also be an architecture formed by a plurality of neural networks with text processing functions. Referring to fig. 3, an embodiment of the present application provides an architecture of a target neural network model that includes a coding network and a decoding network. The coding network is used to learn, from a question, to generate the semantic vector of each word in the question, where the semantic vector of a word is obtained by merging the text vector of that word with the semantic information of the question; the decoding network is used to learn to generate a question with the same semantics as the input question. The coding network and the decoding network may be formed from a convolutional neural network (Convolutional Neural Network, CNN) or a recurrent neural network (Recurrent Neural Network, RNN); the coding network may also, but is not limited to, be formed from Transformer units, and may be, but is not limited to, a Bert model. The Bert model is a bidirectional language model in which each word can simultaneously use the context information on both sides of it in the question; when such a bidirectional language model is trained, the words to be predicted may be masked so that the model cannot see them directly.
In the following, taking a target neural network model including a coding network and a decoding network formed from Transformer units as an example, the training process for obtaining the target neural network model is described. Referring to fig. 4, in the embodiment of the present application, the coding network may be pre-trained first, and then the coding parameters of the pre-trained coding network and the decoding parameters of the decoding network may be fine-tuned through a Fine-tuning mechanism to obtain the trained target neural network model. Specifically, the coding parameters of the coding network may be obtained by pre-training with a Bert model, and the coding parameters and the decoding parameters may then be jointly adjusted by Fine-tuning. Referring to fig. 5, the training process of the target neural network model specifically includes the following steps:
Step S501, obtaining an initial target neural network model, where the initial target neural network model includes a coding network and a decoding network; the coding network is used to learn, from a question sample, to generate the semantic vectors of the words in the question sample, where the semantic vector of a word is obtained by merging the text vector of that word with the semantic information of the question sample; and the decoding network is used to learn, from the semantic vectors of the words in the question sample, to generate a question with the same semantics as the question sample.
Step S502, adjusting the coding parameters of the coding network by using question samples in the first question sample set.
In step S502, the question samples may be input into the coding network to obtain the predicted word feature vector of each word and the predicted question feature vector output by the coding network; a first prediction deviation of the coding network is determined based on the deviation between the predicted word feature vector of each word and the corresponding word feature vector sample and the deviation between the predicted question feature vector and the question feature vector sample; and the coding parameters of the coding network are adjusted, via the loss function of the coding network, in the direction of reducing the first prediction deviation until a coding network pre-training end condition is met. The pre-training end condition may include, but is not limited to, the training time reaching a first time threshold, the number of coding parameter adjustments reaching a first adjustment count threshold, or the first prediction deviation being smaller than a first prediction deviation threshold, etc.;
Adjusting the coding parameters in this way can improve the accuracy with which the coding network encodes each word in a question into a word feature vector and encodes the question into a question feature vector. Those skilled in the art may also adjust the coding parameters of the coding network in other ways, which is not unduly limited here.
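The adjust-until-an-end-condition-is-met loop of step S502 can be sketched as a toy gradient descent on a squared prediction deviation. The one-parameter "network", learning rate, and thresholds are illustrative assumptions, not the actual Bert pre-training objective.

```python
def pretrain(samples, lr=0.1, max_steps=1000, deviation_threshold=1e-4):
    """samples: (input, target) pairs for a one-parameter linear "network" y = w * x.

    The parameter w is moved in the direction that reduces the squared
    prediction deviation, and training stops when either end condition
    (adjustment-count threshold or deviation threshold) is met.
    """
    w = 0.0
    for step in range(max_steps):
        deviation = sum((w * x - y) ** 2 for x, y in samples) / len(samples)
        if deviation < deviation_threshold:
            break                                   # deviation end condition
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad                              # move toward lower deviation
    return w, deviation

w, dev = pretrain([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(round(w, 3))   # converges near 2.0, the slope that fits the samples
```

The same loop shape, with a richer model and loss, underlies both the pre-training of step S502 and the joint fine-tuning of step S504.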
Step S503, randomly initializing the decoding parameters of the initial decoding network.
Step S504, further adjusting the adjusted coding parameters and the randomly initialized decoding parameters by using question samples in the second question sample set, and obtaining the trained target neural network model based on the further adjusted coding parameters and decoding parameters.
In step S504, the question samples may be input into the neural network model formed by the coding network and the decoding network, and, based on a second prediction deviation between the predicted similar question output by the neural network model and the corresponding similar question sample, the adjusted coding parameters and the randomly initialized decoding parameters may be adjusted by gradient descent in the direction of reducing the second prediction deviation until a training end condition is met. The training end condition may include, but is not limited to, the training time reaching a second time threshold, the number of coding or decoding parameter adjustments reaching a second adjustment count threshold, or the second prediction deviation being smaller than a second prediction deviation threshold, etc.
Through steps S501 to S504, the coding network is first pre-trained, the decoding network is then randomly initialized, and the coding parameters of the coding network and the decoding parameters of the decoding network are then jointly adjusted by Fine-tuning. This increases the accuracy and depth of the adjusted parameters during training of the target neural network model and thereby improves the accuracy of the first similar questions that the trained model generates for a basic question. For example, if a question is "you are so cute", the first similar question generated by the model before training might be merely "oh, cute", while the first similar question generated by the trained model might be "you are really cute" or "how cute you are", and the like.
As an embodiment, the process of obtaining the semantic influence value of each first word and the first and second reference values involved in step S202 is further described below.
In the embodiments of the application, the manner of obtaining the first reference value and the second reference value of each first word can be set flexibly. If a first word is contained only in the question to be processed and not in the preset word set, and no second word whose semantic similarity to the first word is greater than the second preset threshold exists in the preset word set, the first reference value of the first word may be set to 1 and its second reference value to 0. If a first word is contained both in the question to be processed and in the preset word set, its first reference value may be set to 0 and its second reference value to 1. If a first word is contained in the question to be processed and not in the preset word set, but a second word whose semantic similarity to the first word is greater than the second preset threshold does exist in the preset word set, then, based on the semantic similarity between that second word and the first word, the second reference value may be set between 0 and 0.5 and the first reference value between 0.5 and 1.
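The three cases above can be written out directly. The similarity function and the way the third-case values are interpolated from the similarity are illustrative assumptions; the text only fixes the ranges.

```python
def reference_values(first_word, question_words, preset_set, similarity,
                     threshold=0.5):
    """Return (first_reference, second_reference) for one first word.

    question_words: words of the question to be processed.
    preset_set: the preset word set; similarity: semantic similarity in [0, 1].
    """
    in_preset = first_word in preset_set
    best_sim = max((similarity(first_word, w) for w in preset_set), default=0.0)
    if in_preset:
        return 0.0, 1.0          # in both: generate directly from the preset set
    if best_sim <= threshold:
        return 1.0, 0.0          # no usable second word: copy only
    # A sufficiently similar second word exists: split the mass, leaning on copying.
    second = 0.5 * best_sim      # second reference in (0, 0.5]
    return 1.0 - second, second  # first reference in [0.5, 1)

# Hypothetical similarity table for illustration only.
sim = lambda a, b: 1.0 if a == b else (0.8 if (a, b) == ("reset", "restore") else 0.0)
print(reference_values("reset", {"how", "reset"}, {"restore", "how"}, sim))
```

With "restore" similar to "reset" at 0.8, the third case applies: the copy probability stays dominant (0.6) but some mass (0.4) is left for generating from the similar preset word.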
In the present application, the first reference value and the second reference value of each first word may be generated through a Copy mechanism, and the semantic impact value of each first word is then obtained by normalizing the first reference value and the second reference value. The Copy mechanism is a decision layer placed after the decoding network and is used for determining whether each word is copied directly from the original sentence or newly generated. Two modes therefore exist when a word is produced: a word generation mode and a word copying mode; the generation model is a probability model combining the two modes, where the first reference value is the probability of the copy mode and the second reference value is the probability of the generation mode. By adopting the Copy mechanism, the problem of words that are not contained in the preset word set can be solved: when an expanded sentence is generated, such words are copied directly, so that the semantic similarity between the expanded sentence and the question to be processed is improved.
As an embodiment, the manner of normalizing the first reference value and the second reference value in step S202 is not particularly limited, and those skilled in the art may set the normalization of the first reference value and the second reference value according to actual requirements, for example, but not limited to, based on the principle of the following formula (1), formula (2), or formula (3).
P_i = P_i_1 + P_i_2    formula (1)
P_i = P_i_1 × k1 + P_i_2 × k2    formula (2)
In formulas (1) to (3), P_i represents the semantic impact value of the i-th (i is a positive integer) first word in the question to be processed, P_i_1 is the first reference value of the i-th first word, P_i_2 is the second reference value of the i-th first word, k1 is the first weight value of the first reference value, k2 is the second weight value of the second reference value, k3 is the third weight value of the first reference value, and k4 is the fourth weight value of the second reference value. The manner of setting k1 to k4 is not particularly limited and may be set by those skilled in the art according to actual requirements.
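Formulas (1) and (2) can be sketched directly in Python; the weight values k1 and k2 used below are arbitrary example values, not values prescribed by the embodiment.

```python
def semantic_impact_sum(p_i1, p_i2):
    """Formula (1): semantic impact value as a plain sum of the two references."""
    return p_i1 + p_i2

def semantic_impact_weighted(p_i1, p_i2, k1=0.6, k2=0.4):
    """Formula (2): semantic impact value as a weighted sum (k1, k2 are examples)."""
    return p_i1 * k1 + p_i2 * k2
```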
As an embodiment, after the first reference value and the second reference value of each first word are obtained through the Copy mechanism, the first reference value and the second reference value may be processed according to the following formula (4) to obtain the semantic impact value of each first word.
P(y_t | s_t, y_{t-1}, c_t, M) = P(y_t, c | s_t, y_{t-1}, c_t, M) + P(y_t, g | s_t, y_{t-1}, c_t, M)    formula (4)
In formula (4), M is the set of input hidden-layer states in the Copy mechanism, y_t represents the word at time t, c_t is the attention score, s_t is the hidden state at time t, g denotes the generation mode, i.e. producing the corresponding word in the extended question from a second word, and c denotes the copy mode, i.e. producing the corresponding word in the extended question from a first word; accordingly, P(y_t, c | s_t, y_{t-1}, c_t, M) is the first reference value of the word at time t, and P(y_t, g | s_t, y_{t-1}, c_t, M) is the second reference value of the word at time t.
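The combination in formula (4) can be illustrated with a toy sketch that normalizes a joint table of copy-mode and generate-mode scores and sums the two contributions for a word. The score tables and the shared-softmax normalization are illustrative assumptions standing in for the real network outputs.

```python
import math

def output_probability(copy_scores, gen_scores, word):
    """Mix copy-mode and generate-mode scores into one word probability,
    as in formula (4): P(y_t) = P(y_t, c) + P(y_t, g)."""
    def softmax(scores):
        z = sum(math.exp(v) for v in scores.values())
        return {k: math.exp(v) / z for k, v in scores.items()}
    # Build one joint table so both modes share a single normalization.
    joint = {("c", w): v for w, v in copy_scores.items()}
    joint.update({("g", w): v for w, v in gen_scores.items()})
    probs = softmax(joint)
    # Copy-mode term plus generate-mode term for this word.
    return probs.get(("c", word), 0.0) + probs.get(("g", word), 0.0)
```

Because both modes are normalized jointly, the probabilities over all candidate words sum to 1.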
In the related art, when a basic sentence is expanded directly by a QG model, the sentence produced by the QG model may not be fluent, and its semantics may be inconsistent with those of the basic sentence. For example, if the basic sentence is "how to see the playback of XX class", a word such as "XX class" that is not contained in the preset word set may be replaced by an unknown token "unk" in the sentence expanded by the QG model, so that the meaning of the expanded sentence is unclear. By contrast, when the Copy mechanism is adopted, such a word is copied directly into the expanded sentence, so the problem of words outside the preset word set is solved and the accuracy of the expanded sentence is improved.
As an embodiment, the manner of expanding the question to be processed in groups in step S203 is further described below.
Referring to fig. 6, an extended question set corresponding to a question to be processed may be obtained by:
Step S601, determining a threshold N (N is a positive integer) of the number of the extended questions in the extended question set.
As an embodiment, the number threshold N may be preset, and a person skilled in the art may set the number threshold N according to actual needs, for example, but not limited to, setting N to 3, 5, 6, or 9.
Step S602, screening out the first words corresponding to the N largest semantic impact values based on the magnitudes of the semantic impact values of the first words in the question to be processed.
Step S603, respectively determining the N screened-out first words as the first words of the expansion questions in the N expansion question groups.
As an embodiment, when the semantic impact value of a first word is greater than or equal to a semantic impact threshold, the first word can be copied directly when an expansion question is generated. Therefore, when the semantic impact value of a screened-out first word is greater than or equal to the semantic impact threshold, the first word can be used directly as the first word in the corresponding expansion question group; when the semantic impact value of a screened-out first word is smaller than the semantic impact threshold, a second word in the preset word set whose semantic similarity to the first word is greater than the second preset threshold can be used as the first word in the corresponding expansion question group. Since the first words of the expansion questions in different expansion question groups are different, the expansion questions in the obtained plurality of expansion question groups are more varied, and the richness of the obtained plurality of expansion question groups is improved under the condition that the semantics remain relatively similar.
For convenience of understanding, referring to fig. 7, assume that the question to be processed is "i can get a loss of work for several months", N is 3, the 3 first words with the largest semantic impact values are "i", "no" and "able", respectively, the semantic impact values of "i" and "no" are greater than the semantic impact threshold, and the semantic impact value of "able" is smaller than the semantic impact threshold. Then "i" and "no" may be used as the first words of the expansion questions in the 1st expansion question group and the 2nd expansion question group, respectively, and "yes", whose semantic similarity to "able" in the preset word set is greater than the second preset threshold, may be determined as the first word of the expansion question in the 3rd expansion question group.
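The selection of head words in steps S602 and S603, together with the threshold rule above, can be sketched as follows; the impact scores, the 0.5 threshold, and the substitution table are assumed example data, not values taken from the figure.

```python
def head_words(impact, n, impact_threshold, substitutes):
    """Pick the first word for each of the n expansion question groups.

    impact      -> semantic impact value of each first word;
    substitutes -> for a low-impact word, a similar second word from the preset set.
    """
    # Step S602: take the n first words with the largest semantic impact values.
    top = sorted(impact, key=impact.get, reverse=True)[:n]
    heads = []
    for word in top:
        if impact[word] >= impact_threshold:
            heads.append(word)                         # copy the word directly
        else:
            heads.append(substitutes.get(word, word))  # use a similar second word
    return heads
```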
Step S604, for each expansion question group in the N expansion question groups, acquiring the expansion questions in each expansion question group according to the context association information of the first word and the context association information of each word other than the first word.
For ease of understanding, a schematic illustration is given herein. The context association information of a current word in the extended question is determined by a first association degree and a second association degree of the current word: the first association degree characterizes the probability of the current word appearing after the previous word, and the second association degree characterizes the probability of the next word appearing after the current word. Referring to fig. 8, for the 1st expansion question group in fig. 7, if the first word of the extended question is "me", then when the second word of the extended question is generated (i.e. the second word is the current word), suppose the probability of "can" appearing after "me" is 0.9 (i.e. the first association degree of "can" is 0.9) and the probability of the next word appearing after "can" is 0.5 (i.e. the second association degree of "can" is 0.5); the context score of "can" is then 0.9 × 0.5 = 0.45. If the corresponding score of another candidate word is, for example, 0.36, then "can", whose score of 0.45 is larger, is selected as the second word of the extended question. The numerical values here are merely illustrative, and the method for expanding the question provided by the embodiment of the application is not limited thereto.
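The scoring used in this illustration can be written as a one-line sketch; the bigram probability tables are assumed inputs standing in for the model's association degrees.

```python
def word_score(prev_word, word, next_word, first_assoc, second_assoc):
    """Context score of `word` = first association degree (P(word | prev))
    multiplied by second association degree (P(next | word))."""
    return first_assoc[(prev_word, word)] * second_assoc[(word, next_word)]
```

With the example values from the illustration, the score of "can" after "me" is 0.9 × 0.5 = 0.45.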
Step S605, generating an extended question set corresponding to the question to be processed by using the extended questions in each extended question group.
Specifically, part of the extended questions or all of the extended questions can be screened from the extended question groups, and the set of screened-out extended questions is determined to be the extended question set corresponding to the question to be processed. For example, the set of all the extended questions in the extended question groups can be determined to be the extended question set corresponding to the question to be processed, or one extended question can be screened from each extended question group to form the extended question set, and so on; those skilled in the art can set this flexibly according to service requirements.
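The two screening strategies mentioned for step S605 can be sketched as follows; taking the first question of each group for the one-per-group strategy is an assumption made purely for illustration.

```python
def build_question_set(groups, one_per_group=False):
    """Form the extended question set from the expansion question groups.

    one_per_group=False -> keep all extended questions from every group;
    one_per_group=True  -> keep one extended question per group (here: the first).
    """
    if one_per_group:
        return [group[0] for group in groups if group]
    return [q for group in groups for q in group]
```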
As an embodiment, please refer to fig. 9; the embodiment of the present application provides an example of a fourth display page. The first account may, but is not limited to, trigger a question input operation through the fourth display page to indicate a basic question, and the first account may also input answer information associated with the basic question through the fourth display page, and so on. After logging in to the fourth display page 900 through the first account, an avatar, a name, and the like of the first account may be displayed in an account display area 901 in the fourth display page 900; the first account may input the basic question in a first information input box 903 in the question-and-answer management area 902 and input answer information corresponding to the basic question in a second information input box 904, and the second account may also search for a corresponding question or answer information, and the like, through an information search box 905.
An embodiment of the invention includes, after step S203, the steps of displaying the expanded questions in the expanded question set to a first account and grouping the expanded questions based on an indication of the first account. Specifically, at least one expanded question in the expanded question set is displayed on a first display page; further, in response to a question grouping instruction triggered by the first account through a second display page, the expanded questions to be grouped are acquired and added to the question group indicated by the question grouping instruction. The second display page may be a page embedded in the first display page, or may be a page independent of the first display page.
For ease of understanding, please refer to fig. 10; an example of a first display page is provided herein. In the first display page 1000, the basic question is "how the navigation lights of the aircraft are distributed"; an extended question of the question to be processed obtained from the basic question (such as, but not limited to, "where the navigation lights of the aircraft are installed", "how the navigation lights of the aircraft are installed", or "where the navigation lights are distributed on the aircraft") may be displayed in the first display area 1101, and answer information corresponding to the basic question may be displayed in the second display area 1102.
Referring to fig. 11 and fig. 12, another example of a first display page is provided herein. In the first display page 1100, data information to be fused of a basic question is displayed (1 piece of data to be fused is illustrated in the figure, that piece of data being based on the extended question "what you can answer"); when testing a similar question (i.e. an extended question), the first account may click on the data information to be fused and thereby enter the second display page 1200.
The first account may trigger a question grouping instruction through the first control 1201 in the second display page 1200, trigger a grouping operation cancellation instruction for the question grouping operation through the second control, confirm the question grouping instruction or the grouping operation cancellation instruction through the confirmation control 1203, or cancel the question grouping instruction or the grouping operation cancellation instruction through the cancellation control 1204, and so on.
As an embodiment, the obtained question group and the answer information associated with the question group can be stored in a question-answer database; when the first account uses the question-answering system, corresponding answer information can be returned to the first account based on the answer information of the question groups in the question-answer database. Specifically, in response to a question input operation triggered by the first account on a third display page, a question to be answered input by the first account can be obtained, the question group containing the question to be answered is determined, the answer information associated with the determined question group is obtained, and the answer information is displayed on a fifth display page. The fifth display page may be a page embedded in the third display page, or may be a page independent of the third display page, and so on.
Referring to fig. 13, an example of a third display page is provided herein. In the third display page 1300, a second account may input a question to be answered in a question input box 1301 and request corresponding answer information from the question-answering system through a search control 1302; the question-answering system then displays the answer information associated with the target question group containing the question to be answered in an answer display area 1304 in a fifth display page 1303, where the target question group is a question group in the question-answering knowledge base. Further, the second account may export the answer information of the question to be answered from the question-answering system through an answer export control 1305, and the second account may also report that the displayed answer information is wrong through an error feedback control 1306.
As an embodiment, the number of questions to be processed obtained in step S201 may be one, or may be at least two. When at least two questions to be processed are obtained, an extended question set corresponding to each question to be processed may be obtained through steps S202 and S203; alternatively, before step S202, target questions to be processed may be screened from the at least two questions to be processed, and then, in step S202, the semantic impact value of each first word in each target question to be processed is obtained.
Further, in order to improve the accuracy of the obtained extended question set, the target question to be processed may be determined from at least two questions to be processed by, but not limited to, the following manner:
The first question screening mode is to randomly select part of or all of the questions to be processed from the at least two questions to be processed as the target questions to be processed.
And in a second question screening mode, screening out the to-be-processed questions with the semantic similarity with the basic questions being greater than a third preset threshold value from the at least two to-be-processed questions, and determining the to-be-processed questions as the target to-be-processed questions.
The following provides a concrete example of question expansion, which is implemented by a neural network model system as shown in fig. 14, where the neural network model system comprises an encoding network, a decoding network, a Copy mechanism, and a DBS (Diverse Beam Search) decoding mechanism, as follows:
the coding network is used for receiving the basic question, coding the basic question, generating semantic vectors of all words in the basic question, and transmitting the generated semantic vectors of all words to the decoding network;
the decoding network decodes the semantic vector of each word in the basic question to obtain a question to be processed whose semantic similarity to the basic question is greater than the third preset threshold, and transmits the question to be processed to the Copy mechanism;
The Copy mechanism obtains a first reference value and a second reference value of each first word in a question to be processed based on a second word in a preset word set, normalizes the obtained first reference value and second reference value, and determines a semantic influence value of each first word;
the DBS decoding mechanism is used for determining a quantity threshold N in an expansion question set to be generated, generating N expansion question groups corresponding to the questions to be processed based on semantic influence values of the first words and context associated information of the first words, and generating an expansion question set (namely, an expansion question set corresponding to a basic question) corresponding to the questions to be processed based on the expansion questions in the N expansion question groups, wherein a specific mode can be seen in the above content and is not repeated.
Referring to table 1, a comparison is given herein between the effect of generating extended questions of a basic question by using a Beam Search (BS) decoding mechanism and the effect of generating extended questions of a basic question by using the method provided by the embodiment of the present application.
Table 1: Effect comparison of different ways of expanding a question
It is obvious from table 1 that the character similarity among the extended questions 1-5 produced by the BS decoding mechanism is very high and their richness is low, whereas the character similarity among the extended questions 1-5 produced by the technical scheme provided by the embodiment of the application is low and their richness is high.
In the embodiment of the application, a diversity index is adopted to measure the richness of the extended questions (the number of distinct n-grams in a group of results is counted, divided by the total number of words in the group, and the results over all data are then averaged). Please refer to fig. 15 and fig. 16, where the abscissa represents the number of characters and the ordinate represents the diversity index; the richness of the obtained extended questions is obviously higher, whether in the social security field or in the game field.
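The diversity index described above can be sketched as follows; using n = 1 and whitespace tokenization are simplifying assumptions made for illustration.

```python
def diversity_index(result_groups, n=1):
    """Average, over all groups, of (distinct n-grams in the group) / (total words)."""
    scores = []
    for group in result_groups:
        ngrams = set()
        total_words = 0
        for sentence in group:
            words = sentence.split()
            total_words += len(words)
            for i in range(len(words) - n + 1):
                ngrams.add(tuple(words[i:i + n]))
        if total_words:
            scores.append(len(ngrams) / total_words)
    return sum(scores) / len(scores) if scores else 0.0
```

A group with many repeated words scores low, so a higher index indicates richer extended questions.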
Referring to fig. 17 and fig. 18, a comparison graph of experimental results of the semantic accuracy of extended questions obtained by expanding a basic question by different methods in the social security field and the medical field is provided. It can be seen that when the method provided by the embodiment of the application is not used, the overall accuracy of the extended questions is very low, and after the method provided by the embodiment of the application is used in the question expansion process, the accuracy of the extended questions in different fields (such as, but not limited to, the illustrated social security field and game field) is obviously improved.
In summary, in the embodiment of the application, an extended question set corresponding to the question to be processed is automatically generated based on the semantic impact value of each first word in the question to be processed and the context association information of each first word, so that time is saved and the efficiency of expanding questions is improved. Because the situation in which an erroneous question is produced during manual expansion is avoided, the semantic accuracy of the extended questions is improved; furthermore, the questions to be processed are expanded in groups, and the first words of the extended questions in different extended question groups are different, so the richness of the extended questions is further improved.
Referring to fig. 19, based on the same inventive concept, an embodiment of the present application provides a question expansion apparatus 1900, including:
an information receiving unit 1901 for acquiring a question to be processed based on the received basic question;
The word processing unit 1902 is configured to obtain a semantic impact value of each first word in a question to be processed, where the semantic impact value characterizes an impact degree of each first word on the semantic of the question to be processed;
the question expansion unit 1903 is configured to generate an expanded question set corresponding to the question to be processed based on the semantic impact value of each first word and the context associated information of each first word, where the expanded question set includes expanded questions with semantic similarity greater than a first preset threshold value with the question to be processed, and the context associated information characterizes a correlation between the one word and each word belonging to the same question.
As an embodiment, the word processing unit 1902 is specifically configured to:
The first reference value and the second reference value of each first word are obtained based on the second words in a preset word set, where the first reference value represents the probability of generating a corresponding word in an expanded question by using the first word, the second reference value represents the probability of generating the corresponding word in the expanded question by using a second word, and a second word is a word in the preset word set whose semantic similarity to the first word is greater than the second preset threshold; the first reference value and the second reference value are then normalized to determine the semantic impact value of each first word.
The question expansion unit 1903 is specifically configured to: determine a number threshold N of the expanded questions in the expanded question set; screen out the first words corresponding to the N largest semantic impact values based on the magnitudes of the semantic impact values of the first words; respectively determine the N screened-out first words as the first words of the expanded questions in the N expansion question groups; for each expansion question group in the N expansion question groups, obtain the expanded questions in the group according to the context association information of the first word and the context association information of each word other than the first word; and generate the expanded question set corresponding to the question to be processed by using the expanded questions in each expansion question group.
The information receiving unit 1901 is specifically configured to determine, as the question to be processed, a part of or all of the basic questions, or to input the basic questions using a trained target neural network model, and determine, as the question to be processed, a question output by the target neural network model and having a semantic similarity with the basic questions greater than a third preset threshold.
As an embodiment, the information receiving unit 1901 is configured to input the basic question into a trained target neural network model and determine, as the question to be processed, a question that is output by the target neural network model and whose semantic similarity with the basic question is greater than the third preset threshold, where the target neural network model is trained by:
The method comprises the steps of obtaining an initial target neural network model, wherein the initial target neural network model comprises a coding network and a decoding network, the coding network is used for learning and generating semantic vectors of words in a question sample by using a question sample, and the semantic vector of one word is obtained by merging the text vector of the one word with the semantic information of the question sample;
adjusting the coding parameters of the coding network by using question samples in the first question sample set;
Randomly initializing decoding parameters of the initial decoding network;
and further adjusting the adjusted coding parameters and the decoding parameters after random initialization by utilizing question samples in the second question sample set, and obtaining a trained target neural network model based on the further adjusted coding parameters and the decoding parameters.
As an embodiment, the question to be processed includes at least two questions, and the word processing unit 1902 is further configured to determine a target question to be processed from the at least two questions before obtaining the semantic impact value of each first word in the question to be processed;
The word processing unit 1902 is specifically configured to obtain a semantic impact value of each first word in the target question to be processed.
The word processing unit 1902 is specifically configured to randomly select a part of the to-be-processed questions or all the to-be-processed questions from the at least two to-be-processed questions as the target to-be-processed questions, or screen out the to-be-processed questions with semantic similarity with the basic questions greater than a third preset threshold from the at least two to-be-processed questions, and determine that the to-be-processed questions are the target to-be-processed questions.
As an embodiment, the question expansion unit 1903 is further configured to, based on the semantic impact value of each first term and the context-related information of each first term, generate an expanded question set corresponding to the question to be processed, and then display at least one expanded question in the expanded question set on the first display page;
the question expansion unit 1903 is further configured to respond to a question grouping instruction triggered by the first account through the second display page to obtain a to-be-grouped expansion question, and add the to-be-grouped expansion question to a question group indicated by the question grouping instruction.
As an embodiment, the question expansion unit 1903 is further configured to obtain a question to be answered input by the second account in response to a question input operation triggered by the second account on the third display page, determine a target question group including the question to be answered, obtain answer information associated with the target question group, and display the answer information on the fifth display page.
As an example, the apparatus of fig. 19 may be used to implement any of the question expansion methods discussed above.
As an example of a hardware entity, the question expansion apparatus 1900 may be a computer apparatus as shown in fig. 20, which includes a processor 2001, a storage medium 2002, and at least one external communication interface 2003; the processor 2001, the storage medium 2002, and the external communication interface 2003 are all connected by a bus 2004.
The storage medium 2002 has a computer program stored therein;
the processor 2001, when executing the computer program, implements any one of the question expansion methods discussed previously.
One processor 2001 is illustrated in fig. 20, but the number of processors 2001 is not limited in practice.
The storage medium 2002 may be a volatile memory such as a random-access memory (RAM); it may also be a non-volatile memory such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or it may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The storage medium 2002 may also be a combination of the above storage media.
Based on the same inventive concept, an embodiment of the present application provides a terminal device 100, which is described below.
Referring to fig. 21, the terminal device 100 includes a display unit 2140, a processor 2180, and a memory 2120, wherein the display unit 2140 includes a display panel 2141 for displaying information input by a user or information provided to the user, and various operation interfaces and display pages of the question-answering client 110, and is mainly used for displaying interfaces, shortcut windows, and the like of clients installed in the terminal device 100 in the embodiment of the present application.
Alternatively, the display panel 2141 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The processor 2180 is configured to read a computer program and then execute a method defined by the computer program; for example, the processor 2180 reads the application of the question-answering client, thereby running the application on the terminal device 100 and displaying the interface of the application on the display unit 2140. The processor 2180 may include one or more general-purpose processors, and may further include one or more DSPs (digital signal processors) for performing related operations to implement the technical solutions provided by the embodiments of the present application.
The memory 2120 typically includes an internal memory and an external memory; the internal memory may be a random access memory (RAM), a read-only memory (ROM), a cache, or the like, and the external memory may be a hard disk, an optical disc, a USB disk, a floppy disk, a tape drive, or the like. The memory 2120 is used to store computer programs, including the applications corresponding to the clients and the like, and other data, which may include data generated after the operating system or the applications are run, including system data (e.g., configuration parameters of the operating system) and user data. In the embodiment of the present application, the program instructions are stored in the memory 2120, and the processor 2180 executes the program instructions in the memory 2120 to implement any one of the question expansion methods discussed in the previous figures.
In addition, the terminal device 100 may further include a display unit 2140 for receiving input digital information, word information, or touch operation or non-touch gestures, and generating signal inputs related to user settings and function control of the terminal device 100, and the like. Specifically, in the embodiment of the present application, the display unit 2140 may include a display panel 2141. The display panel 2141, such as a touch screen, may collect touch operations thereon or thereabout by a user (e.g., operations of the user on the display panel 2141 or on the display panel 2141 using any suitable object or accessory such as a finger, stylus, etc.), and drive the corresponding connection device according to a predetermined program. Alternatively, the display panel 2141 may include two parts of a touch detection device and a touch controller. The touch controller receives touch information from the touch detection device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 2180, and can receive and execute commands sent by the processor 2180. In the embodiment of the present application, if the user clicks the question-answering client 110, the touch detection device in the display panel 2141 detects a touch operation, and then the touch controller sends a signal corresponding to the detected touch operation, the touch controller converts the signal into touch coordinates and sends the touch coordinates to the processor 2180, and the processor 2180 determines that the user needs to operate the question-answering client 110 according to the received touch coordinates.
The display panel 2141 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the display unit 2140, the terminal device 100 may further include an input unit 2130. The input unit 2130 may include, but is not limited to, an image input device 2131 and other input devices 2132, and the other input devices 2132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
In addition to the above, the terminal device 100 may further include a power supply 2190 for powering the other modules, an audio circuit 2160, a near field communication module 2170, and an RF circuit 2110. The terminal device 100 may also include one or more sensors 2150, such as an acceleration sensor, a light sensor, and a pressure sensor. The audio circuit 2160 specifically includes a speaker 2161, a microphone 2162, and the like; for example, the terminal device 100 may collect the user's voice through the microphone 2162 and perform corresponding operations.
The number of processors 2180 may be one or more, and the processors 2180 and the memory 2120 may be coupled or may be relatively independent.
As an embodiment, the processor 2180 in fig. 21 may be used to implement the functions of the information receiving unit 1901, the word processing unit 1902, and the question expansion unit 1903 as in fig. 19.
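The division of labor among the three units named above can be illustrated with a hedged sketch. The class names follow the unit names in fig. 19, but the internal logic (whitespace segmentation, synonym substitution) is a hypothetical stand-in; the actual word processing and question expansion of the embodiment are described elsewhere in the application.

```python
# Hypothetical sketch of the three functional units implemented by the
# processor 2180. The segmentation and expansion logic here is a
# placeholder, not the method claimed by the application.

class InformationReceivingUnit:
    """Receives the user's question text (unit 1901)."""

    def receive(self, question_text):
        return question_text.strip()


class WordProcessingUnit:
    """Performs word processing on the question (unit 1902)."""

    def segment(self, question_text):
        # Naive whitespace segmentation as a stand-in for real word processing.
        return question_text.split()


class QuestionExpansionUnit:
    """Generates expanded questions from the processed words (unit 1903)."""

    def expand(self, words, synonyms):
        # Produce expanded questions by substituting known synonyms,
        # one substitution per expanded question.
        expanded = []
        for i, word in enumerate(words):
            for alt in synonyms.get(word, []):
                expanded.append(" ".join(words[:i] + [alt] + words[i + 1:]))
        return expanded


recv = InformationReceivingUnit()
wp = WordProcessingUnit()
qe = QuestionExpansionUnit()

words = wp.segment(recv.receive("  how to reset password  "))
out = qe.expand(words, {"reset": ["recover", "change"]})
# out == ["how to recover password", "how to change password"]
```

The point of the sketch is only the pipeline shape: the receiving unit normalizes input, the word processing unit yields tokens, and the expansion unit maps those tokens to a set of expanded questions.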
As one example, the processor 2180 in fig. 21 may be used to implement the question-answering client 110 functions discussed previously.
It will be appreciated by those of ordinary skill in the art that all or part of the steps of the above method embodiments may be implemented by hardware associated with program instructions. The above computer program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The above storage medium includes various media capable of storing program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Alternatively, if implemented in the form of software functional modules and sold or used as a stand-alone product, the integrated units described above may also be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the above-mentioned methods of the embodiments of the present invention. The storage medium includes various media capable of storing program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
Based on the same technical idea, an embodiment of the present application also provides a computer-readable storage medium storing computer instructions that, when run on a computer, cause the computer to perform the question expansion method discussed above.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.