
CN111428471B - Intention recognition method, device, equipment and storage medium based on artificial intelligence - Google Patents

Info

Publication number
CN111428471B
Authority
CN
China
Prior art keywords
question
preset
text
vector
cosine value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010162325.5A
Other languages
Chinese (zh)
Other versions
CN111428471A (en)
Inventor
张智
江炼鑫
莫洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN202010162325.5A
Publication of CN111428471A
Application granted
Publication of CN111428471B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of artificial intelligence and discloses an artificial intelligence-based intention recognition method, device, equipment and storage medium, which are used to improve the accuracy of forum replies and the efficiency of diverting user traffic to a target platform. The method comprises: periodically collecting initial data from a preset forum website; preprocessing the initial data according to a preset service to obtain question text clauses; calculating included angle cosine values for the question text clauses, according to the processed sequence, through the word frequency-inverse text frequency index (TF-IDF) algorithm, and determining a target answer according to the included angle cosine values; performing intention recognition on the question text clauses through a deep neural network text classification model to obtain a template type; and combining the question text clause, the target answer and a preset link according to the template type, and intelligently replying with the spliced content through a preset crawler task, wherein the preset link is used for directing a target user to a target online question-answering system.

Description

Artificial intelligence-based intention recognition method, apparatus, device and storage medium
Technical Field
The present invention relates to the field of deep learning, and in particular, to an artificial intelligence based intention recognition method, apparatus, device, and storage medium.
Background
As competition across industries intensifies, the cost of acquiring resources keeps rising, and many methods of promotion and traffic diversion are in use. Typically, the positioning of the product or service and of the user group is clarified, the scenarios in which target or potential users gather are identified, user preferences are analyzed and mined, and a strategy is formulated to guide users to the target platform, thereby achieving online promotion and traffic diversion.
Methods of diverting traffic through reply posts in forums already exist, but such posts are often low-quality advertisements that are not tailored to user preferences, so reply accuracy is low and users are diverted inefficiently.
Disclosure of Invention
The main purpose of the invention is to solve the technical problems of low forum reply accuracy and low efficiency in diverting user traffic.
A first aspect of the invention provides an artificial intelligence-based intention recognition method, which comprises: periodically collecting initial data from a preset forum website; preprocessing the initial data according to a preset service to obtain question text clauses; calculating included angle cosine values for the question text clauses, according to the processed sequence, through the word frequency-inverse text frequency index (TF-IDF) algorithm, and determining a target answer according to the included angle cosine values; performing intention recognition on the question text clauses through a deep neural network text classification model to obtain a template type; and combining the question text clause, the target answer and a preset link according to the template type, and intelligently replying with the spliced content through a preset crawler task, wherein the preset link is used for directing a target user to a target online question-answering system.
Optionally, in a first implementation manner of the first aspect of the present invention, the timing collection of initial data from a preset forum website includes determining a uniform resource locator address of the preset forum website, accessing the preset forum website at a timing through a preset crawler task and the uniform resource locator address of the preset forum website to obtain web page data, intercepting the web page data according to a preset page identifier to obtain initial data, where the initial data includes an escape character, a space, a web page tag and web page content, and recording the initial data and the uniform resource locator address of the preset forum website into a preset data table.
Optionally, in a second implementation manner of the first aspect of the present invention, the pre-processing the initial data according to the preset service to obtain a question text clause includes determining a keyword of the preset service, extracting the initial data according to the keyword, deleting the escape symbol and the space from the extracted data by a text processing manner, deleting the web page tag to obtain text data, processing the text data, and deleting the clause of the blank string to obtain the question text clause.
Optionally, in a third implementation manner of the first aspect of the present invention, calculating an included angle cosine value of the question text clause according to the processed sequence by a word frequency-inverse text frequency index algorithm, and determining a target answer according to the included angle cosine value, where the method includes preprocessing the question text clause according to the processed sequence to obtain an initial vocabulary and a first question sentence vector, text vectorizing a preset question sentence according to the word frequency-inverse text frequency index algorithm and the initial vocabulary to obtain a second question sentence vector, calculating the first question sentence vector and the second question sentence vector to obtain an included angle cosine value, and setting an answer corresponding to the preset question sentence with the largest included angle cosine value as the target answer.
Optionally, in a fourth implementation manner of the first aspect of the present invention, preprocessing the question text clauses according to the processed sequence to obtain an initial vocabulary and a first question sentence vector includes: writing the plurality of question text clauses to the tail of a preset message queue according to the processed sequence; setting a timeout duration for each question text clause, where the timeout duration is greater than or equal to 0; when the timeout duration corresponding to a question text clause is equal to 0 and the clause is still in a waiting state, adding the clause to a delay task queue, where the delay task queue is used to process question text clauses that have timed out; and when the timeout duration corresponding to a question text clause is greater than 0, performing word segmentation and part-of-speech tagging on the clause to obtain the initial vocabulary and the first question sentence vector.
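The queue handling described above can be sketched as follows. This is a minimal illustration only: the names `QueuedClause` and `dispatch` are hypothetical, the timeout is modeled as a plain number supplied by the caller, and word segmentation itself is left out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueuedClause:
    text: str
    timeout: float      # remaining timeout duration, must be >= 0
    waiting: bool = True


def dispatch(clauses, timeouts):
    """Append clauses to the queue tail in processed order, then route them.

    Returns (to_process, delayed): clauses whose timeout is > 0 go on to word
    segmentation; clauses whose timeout is 0 while still waiting are moved to
    the delay task queue.
    """
    message_queue = deque()
    for text, timeout in zip(clauses, timeouts):
        if timeout < 0:
            raise ValueError("timeout duration must be >= 0")
        message_queue.append(QueuedClause(text, timeout))  # write to the tail

    to_process, delay_task_queue = [], deque()
    while message_queue:
        item = message_queue.popleft()
        if item.timeout == 0 and item.waiting:
            delay_task_queue.append(item.text)   # timed out while waiting
        else:
            item.waiting = False
            to_process.append(item.text)         # ready for segmentation
    return to_process, list(delay_task_queue)
```

For instance, `dispatch(["A", "B", "C"], [5, 0, 2])` keeps clauses A and C for segmentation and moves clause B to the delay task queue.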
Optionally, in a fifth implementation manner of the first aspect of the present invention, performing text vectorization on the preset questions according to the word frequency-inverse text frequency index algorithm and the initial vocabulary to obtain a second question vector includes: obtaining the initial vocabulary, where the initial vocabulary includes a plurality of words; reading the number of preset questions from a preset database; counting, according to the word frequency-inverse text frequency index algorithm, the number of times each of the plurality of words occurs in the question text clause; counting the number of preset questions that contain each of the plurality of words; and performing text vectorization on the preset questions according to a first preset formula to obtain the second question vector.
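The first preset formula itself is not legible in this text. For orientation only, a commonly used TF-IDF weighting built from the same four quantities is shown below. All symbols are assumptions introduced here: $w_i$ is the $i$-th word of the initial vocabulary, $N$ the number of preset questions, $n_i$ the number of occurrences of $w_i$ in the text being vectorized, and $m_i$ the number of preset questions containing $w_i$. The patent's own formula may differ, for example in smoothing or normalization.

```latex
\mathrm{tfidf}(w_i) \;=\; n_i \cdot \log\frac{N}{1 + m_i},
\qquad
\mathbf{v} \;=\; \bigl(\mathrm{tfidf}(w_1),\, \mathrm{tfidf}(w_2),\, \dots,\, \mathrm{tfidf}(w_k)\bigr)
```

The $1 + m_i$ in the denominator is one common smoothing choice that avoids division by zero when a word occurs in no preset question.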
Optionally, in a sixth implementation manner of the first aspect of the present invention, calculating the first question vector and the second question vector to obtain an included angle cosine value, and setting the answer corresponding to the preset question with the maximum included angle cosine value as the target answer, includes: calculating the first question vector and the second question vector according to a second preset formula to obtain the included angle cosine value, where the second preset formula is $\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$, in which $A$ is the first question vector, $B$ is the second question vector, and the included angle cosine value $\cos\theta$ is used to determine the similarity between the first question vector $A$ and the second question vector $B$; sorting the answers corresponding to the preset questions in descending order of included angle cosine value; and setting the answer corresponding to the preset question with the maximum included angle cosine value as the target answer.
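A minimal sketch of this calculation and ranking is given below. The function names `cosine` and `pick_target_answer` are hypothetical, vectors are plain Python lists, and returning 0.0 for an all-zero vector is an assumption made here to avoid division by zero.

```python
import math


def cosine(a, b):
    """Included angle cosine of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def pick_target_answer(first_vector, candidates):
    """candidates: list of (second_vector, answer) pairs.

    Sorts the answers by cosine value from high to low and returns the
    answer of the preset question with the largest cosine value.
    """
    ranked = sorted(candidates,
                    key=lambda pair: cosine(first_vector, pair[0]),
                    reverse=True)
    return ranked[0][1]
```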
A second aspect of the invention provides an artificial intelligence-based intention recognition device, which comprises: an acquisition unit, used for periodically collecting initial data from a preset forum website; a preprocessing unit, used for preprocessing the initial data according to a preset service to obtain question text clauses; a calculation unit, used for calculating included angle cosine values for the question text clauses, according to the processed sequence, through the word frequency-inverse text frequency index algorithm, and determining a target answer according to the included angle cosine values; an intention recognition unit, used for performing intention recognition on the question text clauses through a deep neural network text classification model to obtain a template type; and a reply unit, used for combining the question text clause, the target answer and a preset link according to the template type and intelligently replying with the spliced content through a preset crawler task, wherein the preset link is used for directing a target user to a target online question-answering system.
Optionally, in a first implementation manner of the second aspect of the present invention, the collecting unit is specifically configured to determine a uniform resource locator address of a preset forum website, access the preset forum website at regular time through a preset crawler task and the uniform resource locator address of the preset forum website to obtain web page data, intercept the web page data according to a preset page identifier to obtain initial data, where the initial data includes an escape character, a space, a web page tag and web page content, and record the initial data and the uniform resource locator address of the preset forum website in a preset data table.
Optionally, in a second implementation manner of the second aspect of the present invention, the preprocessing unit is specifically configured to: determine a keyword of the preset service; extract data from the initial data according to the keyword; delete the escape characters and spaces from the extracted data by text processing, and delete the web page tags, to obtain text data; split the text data into clauses; and delete empty-string clauses to obtain the question text clauses.
Optionally, in a third implementation manner of the second aspect of the present invention, the computing unit further includes a preprocessing subunit, configured to preprocess the question text clause according to a processed sequence to obtain an initial vocabulary and a first question vector, a text processing subunit, configured to perform text vectorization on a preset question according to a word frequency-inverse text frequency index algorithm and the initial vocabulary to obtain a second question vector, and a computing subunit, configured to calculate the first question vector and the second question vector to obtain an included angle cosine value, and set an answer corresponding to the preset question with the largest included angle cosine value as a target answer.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the preprocessing subunit is specifically configured to: write the plurality of question text clauses to the tail of a preset message queue according to the processed sequence; set a timeout duration for each question text clause, where the timeout duration is greater than or equal to 0; when the timeout duration corresponding to a question text clause is equal to 0 and the clause is still in a waiting state, add the clause to a delay task queue, where the delay task queue is used to process question text clauses that have timed out; and when the timeout duration corresponding to a question text clause is greater than 0, perform word segmentation and part-of-speech tagging on the clause to obtain an initial vocabulary and a first question sentence vector.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the text processing subunit is specifically configured to: obtain the initial vocabulary, where the initial vocabulary includes a plurality of words; read the number of preset questions from a preset database; count, according to the word frequency-inverse text frequency index algorithm, the number of times each of the plurality of words occurs in the question text clause; count the number of preset questions that contain each of the plurality of words; and perform text vectorization on the preset questions according to a first preset formula to obtain the second question vector.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the computing subunit is specifically configured to: calculate the first question vector and the second question vector according to a second preset formula to obtain the included angle cosine value, where the second preset formula is $\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$, in which $A$ is the first question vector, $B$ is the second question vector, and the included angle cosine value $\cos\theta$ is used to determine the similarity between the first question vector $A$ and the second question vector $B$; sort the answers corresponding to the preset questions in descending order of included angle cosine value; and set the answer corresponding to the preset question with the maximum included angle cosine value as the target answer.
A third aspect of the present invention provides an artificial intelligence based intent recognition device including a memory and at least one processor, the memory having instructions stored therein, the memory and the at least one processor being interconnected by a wire, the at least one processor invoking the instructions in the memory to cause the artificial intelligence based intent recognition device to perform the artificial intelligence based intent recognition method as described in the first aspect above.
A fourth aspect of the invention provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the artificial intelligence based intent recognition method of the first aspect described above.
In the technical scheme provided by the invention, initial data is periodically collected from a preset forum website; the initial data is preprocessed according to a preset service to obtain question text clauses; included angle cosine values are calculated for the question text clauses, according to the processed sequence, through the word frequency-inverse text frequency index algorithm, and a target answer is determined according to the included angle cosine values; intention recognition is performed on the question text clauses through a deep neural network text classification model to obtain a template type; and the question text clause, the target answer and a preset link are combined according to the template type and the spliced content is posted as an intelligent reply through a preset crawler task, where the preset link is used for directing a target user to a target online question-answering system. In the embodiment of the invention, questions are periodically captured from the target forum through a crawler task, the captured questions are assessed and answered by an intelligent question-answering engine, reply posts based on the answer content are published through the crawler task to divert traffic, and users are guided to the target page for intelligent question answering, which improves the timeliness, accuracy and traffic-diversion efficiency of the replies.
Drawings
FIG. 1 is a schematic diagram of one embodiment of an artificial intelligence based intention recognition method in an embodiment of the invention;
FIG. 2 is a schematic diagram of another embodiment of an artificial intelligence based intention recognition method in an embodiment of the invention;
FIG. 3 is a schematic diagram of an embodiment of an artificial intelligence based intention recognition device in an embodiment of the invention;
FIG. 4 is a schematic diagram of another embodiment of an artificial intelligence based intent recognition device in accordance with an embodiment of the invention;
FIG. 5 is a schematic diagram of one embodiment of an artificial intelligence based intent recognition device in an embodiment of the invention.
Detailed Description
The embodiment of the invention provides an artificial intelligence-based intention recognition method, device, equipment and storage medium, which periodically capture questions from a target forum through a crawler task, assess and answer the captured questions through an intelligent question-answering engine, publish reply posts based on the answer content through the crawler task to divert traffic, and guide users to the target page for intelligent question answering, thereby improving the timeliness, accuracy and traffic-diversion efficiency of the replies.
In order to enable those skilled in the art to better understand the present invention, embodiments of the present invention will be described below with reference to the accompanying drawings.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, a specific flow of an embodiment of the present invention is described below with reference to fig. 1, where an embodiment of an artificial intelligence-based intent recognition method according to the embodiment of the present invention includes:
101. Periodically collecting initial data from a preset forum website;
It will be appreciated that the execution subject of the present invention may be an artificial intelligence based intention recognition device, or may be a terminal or a server, and is not limited herein. The embodiment of the invention is described by taking a server as an execution main body as an example.
The server periodically collects initial data from a preset forum website. The initial data comprises the replies and comments posted in the various sections of the preset forum website in response to topics and topic threads, and is represented as HyperText Markup Language (HTML) code.
Further, the server determines the uniform resource locator (URL) address of the preset forum website, and periodically accesses the preset forum website through a preset crawler task and that URL address to obtain web page data. For example, through the preset crawler task the server accesses URL address A of the preset forum website every hour, or every 2 hours in the early morning. The server then intercepts the web page data according to a preset page identifier to obtain the initial data, which comprises escape characters, spaces, web page tags and web page content; for example, the web page tags include <html>, <head>, <body>, <div> and <br>. The interception is performed by matching the web page data against a regular expression. Finally, the server records the initial data and the URL address of the preset forum website in a preset data table.
102. Performing data preprocessing on the initial data according to a preset service to obtain question text clauses;
The server performs data preprocessing on the initial data according to the preset service to obtain the question text clauses. Data preprocessing comprises data extraction, data cleaning and data conversion. Data extraction is the process in which the server extracts data from the initial data according to the preset service; data cleaning is the process of deleting content from the extracted data according to preset rules; and data conversion is the process of turning the cleaned data into question text clauses.
Specifically, when the server detects initial data awaiting processing in the preset data table, it extracts data from the initial data according to the preset service, cleans the extracted data, and converts the cleaned data to obtain the question text clauses.
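The extraction, cleaning and conversion steps can be sketched as below. This is an illustration under several assumptions that the source does not state: extraction is modeled as keeping a record only if it contains one of the service keywords, clauses are split on line breaks and a small set of sentence-ending punctuation marks (which are dropped), and whitespace is collapsed rather than deleted outright so that space-delimited text stays readable. The function name `preprocess` is hypothetical.

```python
import html
import re


def preprocess(initial_data, keywords):
    """Turn one raw HTML record into a list of non-empty text clauses."""
    # Data extraction: keep the record only if it mentions the preset service.
    if not any(keyword in initial_data for keyword in keywords):
        return []

    # Data cleaning: web page tags become line breaks so block boundaries
    # survive, HTML entities such as &nbsp; are decoded, and literal escape
    # sequences and stray backslashes are removed.
    text = re.sub(r"<[^>]+>", "\n", initial_data)
    text = html.unescape(text)
    text = re.sub(r"\\[nrt]", "\n", text)
    text = text.replace("\\", "")

    # Data conversion: split into clauses and delete empty-string clauses.
    parts = re.split(r"[\n。！？!?；;]+", text)
    clauses = [re.sub(r"\s+", " ", part).strip() for part in parts]
    return [clause for clause in clauses if clause]
```

A record that does not mention any keyword yields an empty list, so unrelated forum content never reaches the matching step.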
103. Calculating an included angle cosine value of the question text clause according to the processed sequence through a word frequency-inverse text frequency index algorithm, and determining a target answer according to the included angle cosine value;
The server calculates included angle cosine values for the question text clauses, according to the processed sequence, through the word frequency-inverse text frequency index (TF-IDF) algorithm, and determines a target answer according to the included angle cosine values. Specifically, the server retrieves and processes a plurality of question text clauses through a preset message queue. For example, the server obtains the question text clauses A, B, C and D according to the processed sequence, and processes clause A, clause B, clause C and clause D in that order through the preset message queue.
The server then calculates, through the TF-IDF algorithm, the included angle cosine value between the question text clause and each preset question, and determines the target answer according to those values. Further, using the included angle cosine values, the server finds the preset question that matches the question text clause among the existing question-answer pairs in the preset question-answering center engine, and sets the answer corresponding to that preset question as the target answer. For example, if the included angle cosine values calculated by the server are 0.96, 0.18 and 0.57, the server sets the answer corresponding to the preset question with the value 0.96 as the target answer.
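The retrieval step can be sketched end to end as follows. This is a simplified illustration, not the patented formula: the smoothed IDF `log((1 + N) / (1 + df)) + 1` is one common variant chosen here as an assumption, tokens are supplied by the caller (the source uses word segmentation), and the names `tfidf_vector`, `cosine` and `best_answer` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vector(tokens, vocabulary, doc_freq, n_docs):
    """TF-IDF weights of `tokens` over a fixed vocabulary (smoothed IDF)."""
    counts = Counter(tokens)
    return [counts[w] * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1)
            for w in vocabulary]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def best_answer(clause_tokens, qa_pairs):
    """qa_pairs: list of (preset_question_tokens, answer).

    Returns (cosine_value, answer) for the best-matching preset question.
    """
    vocabulary = sorted(set(clause_tokens))          # initial vocabulary
    n_docs = len(qa_pairs)                           # number of preset questions
    doc_freq = {w: sum(w in question for question, _ in qa_pairs)
                for w in vocabulary}
    first = tfidf_vector(clause_tokens, vocabulary, doc_freq, n_docs)
    scored = [(cosine(first, tfidf_vector(q, vocabulary, doc_freq, n_docs)), a)
              for q, a in qa_pairs]
    return max(scored, key=lambda pair: pair[0])
```

In practice a minimum cosine threshold would probably be needed so that unrelated clauses do not receive an answer; the source does not mention one, so none is applied here.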
104. Performing intention recognition on the question text clause through a deep neural network text classification model to obtain a template type;
The server performs intention recognition on the question text clause through the deep neural network text classification model to obtain the template type. Specifically, the server segments the sentence through the model to obtain a sentence matrix of shape (s, d); for example, for a 7*5 sentence matrix, 7 is the sentence length and 5 is the dimension of each word vector. The server applies convolution layers to the sentence matrix to obtain feature vectors of different lengths; for example, the convolution kernel sizes are set to (2, 3, 4) in order to capture 2-gram, 3-gram and 4-gram features respectively. The server then pools the feature vectors of different lengths to obtain feature vectors of a preset length, concatenates and normalizes them to obtain the probability of each category, and determines the matched template and template type according to the category with the maximum probability.
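The forward pass described above can be illustrated with a small pure-Python sketch. It uses the 7*5 sentence matrix and kernel sizes (2, 3, 4) from the example, but everything else is an assumption made for illustration: the weights are random and untrained, there is one kernel per size, ReLU is used as the activation, and three template types are assumed.

```python
import math
import random


def conv_feature_map(sentence, kernel):
    """Slide a (k x d) kernel over an (s x d) matrix; returns s - k + 1 values."""
    k, d = len(kernel), len(kernel[0])
    feature_map = []
    for start in range(len(sentence) - k + 1):
        total = sum(sentence[start + i][j] * kernel[i][j]
                    for i in range(k) for j in range(d))
        feature_map.append(max(0.0, total))  # ReLU activation (assumed)
    return feature_map


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sentence, kernels, weights):
    # Max pooling turns feature maps of different lengths into one value each,
    # giving a fixed-length vector regardless of kernel size.
    pooled = [max(conv_feature_map(sentence, kernel)) for kernel in kernels]
    scores = [sum(w * p for w, p in zip(row, pooled)) for row in weights]
    return softmax(scores)  # normalized probability of each category


rng = random.Random(0)
s, d = 7, 5  # sentence length and word-vector dimension from the example
sentence = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(s)]
kernels = [[[rng.uniform(-1, 1) for _ in range(d)] for _ in range(k)]
           for k in (2, 3, 4)]
weights = [[rng.uniform(-1, 1) for _ in kernels] for _ in range(3)]  # 3 assumed template types

probs = classify(sentence, kernels, weights)
template_type = probs.index(max(probs))  # category with the maximum probability
```

With a 7-row sentence matrix the three kernels produce feature maps of lengths 6, 5 and 4, which is why pooling is needed before the results can be concatenated.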
Optionally, the server performs text similarity matching on the question text clause through a preset intention recognition model to obtain a plurality of similarities, determines the maximum similarity from the plurality of similarities, and determines a matched template and a template type according to the maximum similarity.
105. Combining the question text clause, the target answer and a preset link according to the template type, and intelligently replying with the spliced content through a preset crawler task, wherein the preset link is used for directing the target user to the target online question-answering system.
The server combines the question text clause, the target answer and the preset link according to the template type, and intelligently replies with the spliced content through the preset crawler task, where the preset link is used for directing the target user to the target online question-answering system. Specifically, the server determines a question-answer template according to the template type, where the question-answer template is a replaceable string set according to the template type; the server combines the text of the question, the target answer and the preset link based on the question-answer template; and the server posts the spliced content as a reply through the preset crawler task. For example, the spliced content may read: "@ I already have social security. Do I still need to buy commercial insurance? The medical expenses reimbursed by social security are limited, with a reimbursement rate of about 50%, and social security alone cannot fully protect against illness, so supplementary coverage is needed."
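A minimal sketch of template-based splicing is shown below, using the standard-library `string.Template` as the replaceable string. The template types `faq` and `advice`, the template wording, and the example link are all hypothetical.

```python
from string import Template

# Hypothetical question-answer templates, one per template type.
TEMPLATES = {
    "faq": Template("@$user $question $answer More details: $link"),
    "advice": Template('@$user Regarding "$question": $answer '
                       "You can ask follow-up questions here: $link"),
}


def build_reply(template_type, user, question, answer, link):
    """Splice the question text, target answer and preset link into a reply."""
    template = TEMPLATES[template_type]
    return template.substitute(user=user, question=question,
                               answer=answer, link=link)
```

`substitute` raises `KeyError` if a placeholder is left unfilled, which helps catch a malformed template before anything is posted.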
Further, when the target user clicks the preset jump link in the preset forum, the target user is guided to the target online question-answering system for the appropriate scene according to the logic configuration information index number carried in the link, and online dialogue and automatic marketing with the target user are carried out through the target online question-answering system.
In the embodiment of the invention, questions are periodically captured from the target forum through a crawler task, the captured questions are assessed and answered by an intelligent question-answering engine, reply posts based on the answer content are published through the crawler task to divert traffic, and users are guided to the target page for intelligent question answering, which improves the timeliness, accuracy and traffic-diversion efficiency of the replies.
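How an index number carried in the link might select a scene can be sketched as follows. The source does not say how the index number is encoded, so the query parameter name `cfg`, the scene names and the fallback scene are all assumptions.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical mapping from configuration index number to scene.
SCENES = {
    "1": "life-insurance-qa",
    "2": "health-insurance-qa",
}


def resolve_scene(link, default="general-qa"):
    """Pick the target question-answering scene from the link's index number."""
    query = parse_qs(urlparse(link).query)
    index = query.get("cfg", [None])[0]  # "cfg" is an assumed parameter name
    return SCENES.get(index, default)
```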
Referring to fig. 2, another embodiment of an artificial intelligence based intention recognition method according to an embodiment of the present invention includes:
201. Periodically collecting initial data from a preset forum website;
The server periodically collects initial data from a preset forum website. The initial data comprises the replies and comments posted in the various sections of the preset forum website in response to topics and topic threads, and is represented as HyperText Markup Language (HTML) code.
Specifically, first, the server determines the uniform resource locator (URL) address of the preset forum website. A URL address represents the location of a resource available on the internet and the way to access it, and is the address of a standard resource on the internet. Each file on the internet has a unique URL address.
Secondly, the server periodically accesses the URL address of the preset forum website through the preset crawler task to obtain access data. Specifically, through the preset crawler task the server periodically sends a request to the URL address of the preset forum website using the GET or POST method, obtains the returned result, and parses it to obtain the web page data.
And thirdly, the server intercepts the webpage data according to the preset page identification to obtain initial data, wherein the initial data comprises an escape character, a space, a webpage label and webpage content. Wherein the web page data includes an escape character "\", a space, a web page tag and specific contents displayed in the web page, the web page tag includes < html >, < head >, < body >, < div >, < br >. The method comprises the steps that a server sets different preset page identifications according to different webpage data and sets first regular expressions for the different preset page identifications, and the server regularly intercepts the webpage data according to the first regular expressions to obtain initial data, wherein the regular expressions are special character sequences and are used for checking whether a character string is matched with a certain pattern or not.
Finally, the server records the initial data and the uniform resource locator address of the preset forum website into a preset data table.
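The interception and recording steps above can be sketched as follows. This is only an illustration under stated assumptions: the `<div id="...">` convention for the preset page identifier, the function names `intercept` and `record`, the example URL, and the in-memory list standing in for the preset data table are all hypothetical and do not come from the patent. Fetching the page over HTTP is left out so that the sketch runs offline.

```python
import re


def intercept(page_html, page_id):
    """Return the inner HTML of every <div> whose id equals page_id.

    Assumption: the "preset page identifier" is modelled as a div id.
    Nested <div> elements are not handled by this simple pattern.
    """
    pattern = re.compile(
        r'<div\s+id="' + re.escape(page_id) + r'"\s*>(.*?)</div>',
        re.DOTALL | re.IGNORECASE,
    )
    return pattern.findall(page_html)


def record(table, url, fragments):
    """Append one row per fragment to a list standing in for the preset data table."""
    for fragment in fragments:
        table.append({"url": url, "initial_data": fragment})


sample_page = (
    '<html><body><div id="post">Do I need&nbsp;insurance?<br></div>'
    '<div id="ad">buy now</div><div id="post">Reply one</div></body></html>'
)
table = []
record(table, "https://forum.example/thread/1", intercept(sample_page, "post"))
```

The initial data deliberately still contains tags and entities at this point; cleaning them is the job of the preprocessing step that follows.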
202. Performing data preprocessing on the initial data according to a preset service to obtain question text clauses;
The server performs data preprocessing on the initial data according to the preset service to obtain question text clauses. The data preprocessing comprises data extraction, data cleaning and data conversion: data extraction is the process in which the server extracts data from the initial data according to the preset service, data cleaning is the process of deleting parts of the extracted data according to preset rules, and data conversion is the process of turning the cleaned data into question text clauses.
Specifically, first, the server determines the keywords of the preset service and extracts data from the initial data according to the keywords. For example, if the server sets "policy" as a keyword, it extracts the data related to policies from the initial data, that is, it deletes the data unrelated to policies.
Secondly, the server deletes the escape characters, spaces and webpage tags from the extracted data by text processing to obtain text data. To do so, the server sets a second regular expression for the escape characters, spaces and webpage tags and processes the extracted data according to it.
Then, the server splits the text data into clauses and deletes the clauses that are empty character strings to obtain the question text clauses. The server sets clause separators, splits the text data with a preset segmentation function and the clause separators to obtain a plurality of clauses, traverses the clauses, and deletes those that are empty character strings. The clause separators include the semicolon, period, exclamation mark and question mark.
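The three preprocessing stages above (extraction, cleaning, clause splitting) might look roughly like this. The keyword value, the regular expressions and the function name `preprocess` are assumptions made for illustration. One deliberate deviation: whitespace is collapsed to a single space rather than deleted, so that the English sample stays readable.

```python
import re

KEYWORD = "policy"                       # hypothetical preset-service keyword
TAG_RE = re.compile(r"<[^>]+>")          # webpage tags such as <div> or <br>
ESCAPE_RE = re.compile(r"\\")            # the escape character "\"
SPACE_RE = re.compile(r"\s+")            # runs of whitespace
CLAUSE_SEP_RE = re.compile(r"[;.!?]")    # semicolon, period, exclamation mark, question mark


def preprocess(fragments):
    """Extract keyword-related fragments, clean them and split them into clauses."""
    clauses = []
    for fragment in fragments:
        if KEYWORD not in fragment:              # data extraction: drop unrelated data
            continue
        text = TAG_RE.sub(" ", fragment)         # data cleaning: remove tags
        text = ESCAPE_RE.sub("", text)           # data cleaning: remove backslashes
        text = SPACE_RE.sub(" ", text).strip()   # collapse whitespace (assumption)
        for clause in CLAUSE_SEP_RE.split(text): # data conversion: clause splitting
            clause = clause.strip()
            if clause:                           # delete empty-string clauses
                clauses.append(clause)
    return clauses


raw = [
    "<div>My policy lapsed \\ last year.<br>Can I renew the policy?</div>",
    "<div>Nice weather today!</div>",
]
clauses = preprocess(raw)
```

Note that splitting on the separators discards them, so the trailing question mark is not part of the resulting clause.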
203. Preprocessing the question text clauses in the order in which they are processed to obtain an initial vocabulary and a first question vector;
The server preprocesses the question text clauses in the order in which they are processed to obtain an initial vocabulary and a first question vector. The preprocessing comprises word segmentation, part-of-speech tagging and stop-word filtering of the preset questions and the question text clauses. Both the preset questions and the question text clauses consist of sentences. Word segmentation decomposes a sentence into a data structure whose units are words; part-of-speech tagging assigns each word a part of speech according to the context of the sentence, the parts of speech including nouns, pronouns, numerals, quantifiers and adjectives; and stop-word filtering removes noise from the word segmentation result, such as function words and interjections.
The server writes the question text clauses to the tail of a preset message queue in the order in which they are processed; the preset message queue is used to process the question text clauses in sequence. The server sets a timeout for each question text clause, the timeout being greater than or equal to 0, and the question text clauses are sent to a preset question-answering center engine in first-in, first-out order. When the timeout corresponding to a question text clause is 0 and the clause is still in a waiting state, the server adds the clause to a delayed task queue, which is used to process question text clauses whose timeout has expired. When the timeout corresponding to a question text clause is greater than 0, the server performs word segmentation and part-of-speech tagging on the clause to obtain the initial vocabulary and the first question vector.
It can be understood that the server uses the message-oriented middleware RabbitMQ as the preset message queue. When sending a question text clause through RabbitMQ, the server sets a message timeout for the clause, which specifies how long the clause may survive in the queue. The timeout is expressed in milliseconds and is greater than or equal to 0; the timeouts of different messages may be the same or different, which is not specifically limited here. While a question text clause is still waiting within its timeout, it is sent to the preset question-answering center engine through RabbitMQ in first-in, first-out order. For example, the server sends three question text clauses A, B and C with timeouts of 10 seconds, 30 seconds and 15 seconds, respectively. After 5 seconds of the timeout periods have elapsed, A, B and C are sent in sequence through RabbitMQ. It should be noted that when RabbitMQ is used to implement the delayed task queue, the delay times of A, B and C must be kept consistent.
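To make the timeout rule concrete, here is a small in-memory simulation of the first-in, first-out queue and the delayed task queue. It does not use RabbitMQ or any RabbitMQ client API; the `Message` class, the `dispatch` function and the idea of passing an elapsed waiting time are hypothetical stand-ins chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    clause: str
    timeout_ms: int      # survival time set when the clause is enqueued, >= 0


def dispatch(queue, elapsed_ms):
    """Drain `queue` in FIFO order after `elapsed_ms` of waiting.

    Clauses with time left go to the question-answering engine; clauses whose
    remaining timeout has reached 0 while waiting go to the delayed task queue.
    """
    to_engine, delayed = [], deque()
    while queue:
        msg = queue.popleft()                            # first in, first out
        remaining = max(msg.timeout_ms - elapsed_ms, 0)
        if remaining > 0:
            to_engine.append(msg.clause)
        else:
            delayed.append(msg.clause)
    return to_engine, delayed


queue = deque()
for clause, timeout in [("A", 10_000), ("B", 30_000), ("C", 15_000)]:
    queue.append(Message(clause, timeout))               # write to the queue tail

to_engine, delayed = dispatch(queue, elapsed_ms=12_000)
```

With the timeouts from the example in the text and an assumed wait of 12 seconds, only A has run out of time, so it lands in the delayed queue while B and C keep their relative order on the way to the engine.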
204. Performing text vectorization on the preset question according to the word frequency-inverse text frequency index algorithm and the initial vocabulary to obtain a second question vector;
The server performs text vectorization on the preset questions according to the word frequency-inverse text frequency index (TF-IDF) algorithm and the initial vocabulary to obtain the second question vector. Specifically, the server obtains the initial vocabulary, which includes a plurality of words; the server reads the number of preset questions from a preset database; according to the TF-IDF algorithm, the server counts the number of times each of the words occurs in the question text clause; the server counts the number of preset questions that contain each of the words; and the server performs text vectorization on the preset questions according to a preset formula to obtain the second question vector. [The symbols for these quantities and the preset formula are not recoverable from the source text.]
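Because the patent's own preset formula is not legible in this text, the sketch below falls back on the textbook TF-IDF weighting, tf × log(N / df), purely as an assumption. The function name `tfidf_vectors`, the tokenized sample questions and the vocabulary are likewise hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(questions, vocabulary):
    """Vectorize tokenized questions over `vocabulary` with tf * log(N / df).

    Assumption: textbook weighting, not necessarily the patent's preset formula.
    Words that occur in no question get weight 0.
    """
    n_questions = len(questions)
    vocab_set = set(vocabulary)
    doc_freq = Counter()
    for tokens in questions:
        doc_freq.update(set(tokens) & vocab_set)   # number of questions containing each word

    vectors = []
    for tokens in questions:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vector = []
        for word in vocabulary:
            tf = counts[word] / total
            idf = math.log(n_questions / doc_freq[word]) if doc_freq[word] else 0.0
            vector.append(tf * idf)
        vectors.append(vector)
    return vectors


vocabulary = ["social", "security", "insurance", "claim"]
preset_questions = [
    ["social", "security", "insurance"],
    ["insurance", "claim"],
]
vectors = tfidf_vectors(preset_questions, vocabulary)
```

A word that appears in every preset question, such as "insurance" here, receives weight 0 under this weighting, which is why smoothed variants are often preferred in practice.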
205. Calculating the first question vector and the second question vector to obtain an included angle cosine value, and setting an answer corresponding to a preset question with the largest included angle cosine value as a target answer;
The server calculates the first question vector and the second question vector to obtain the included angle cosine value. Specifically, the included angle cosine value is calculated as $\cos\theta = \dfrac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$, where $A$ is the first question vector, $B$ is the second question vector, and $\cos\theta$ indicates the similarity of the first question vector $A$ and the second question vector $B$. Further, the server sorts the answers corresponding to the preset questions in descending order of the included angle cosine value, and sets the answer corresponding to the preset question with the largest included angle cosine value as the target answer.
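A minimal sketch of the cosine ranking step follows. The helper names `cosine` and `best_answer`, the sample vectors and the answer strings are illustrative assumptions; returning 0.0 for an all-zero vector is a design choice made here to avoid division by zero, not something the patent specifies.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_answer(first_vector, preset):
    """`preset` is a list of (second_vector, answer) pairs; return the top-ranked answer."""
    ranked = sorted(preset, key=lambda item: cosine(first_vector, item[0]), reverse=True)
    return ranked[0][1]


preset = [
    ([1.0, 0.0, 0.0], "answer about claims"),
    ([0.0, 1.0, 1.0], "answer about social security"),
]
target = best_answer([0.0, 2.0, 1.0], preset)
```

Sorting the full list mirrors the description in the text; if only the top answer is needed, a single `max` call over the same key would do.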
206. Performing intention recognition on the question text clause through a deep neural network text classification model to obtain a template type;
The server performs intention recognition on the question text clause through the deep neural network text classification model to obtain the template type. The deep neural network text classification model includes a text-CNN model. Specifically, the server segments the sentence through the model to obtain a sentence matrix (s, d); for example, in a 7×5 sentence matrix, 7 is the sentence length and 5 is the dimension of the word embedding vector. The server applies a convolution layer to the sentence matrix to obtain feature vectors of different lengths; for example, the convolution kernel sizes are set to (2, 3, 4) to capture 2-gram, 3-gram and 4-gram features, respectively. The server pools the feature vectors of different lengths to obtain feature vectors of a preset length, concatenates and normalizes them to obtain the probability of each category, and determines the matched template and template type according to the category with the maximum probability.
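The text-CNN data flow described above can be traced with a tiny forward pass in plain Python. Everything here is an assumption for illustration: the weights are random and untrained, there is one filter per kernel size, ReLU is used as the activation, and two output categories stand in for the template types. It shows only the shapes and operations (convolution, max-over-time pooling, concatenation, softmax), not a usable classifier.

```python
import math
import random

random.seed(0)
SENT_LEN, EMB_DIM, KERNEL_SIZES, N_CLASSES = 7, 5, (2, 3, 4), 2


def rand_matrix(rows, cols):
    """Random matrix standing in for learned parameters or word embeddings."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def conv_max_pool(sentence, kernel):
    """Slide a (k x EMB_DIM) kernel over the sentence and keep the maximum response."""
    k = len(kernel)
    responses = []
    for start in range(len(sentence) - k + 1):
        window = sentence[start:start + k]
        act = sum(w * x for krow, srow in zip(kernel, window) for w, x in zip(krow, srow))
        responses.append(max(act, 0.0))          # ReLU activation
    return max(responses)                        # max-over-time pooling


def softmax(scores):
    """Normalize scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


sentence = rand_matrix(SENT_LEN, EMB_DIM)                    # 7 x 5 sentence matrix
kernels = [rand_matrix(k, EMB_DIM) for k in KERNEL_SIZES]    # one filter per n-gram width
dense = rand_matrix(N_CLASSES, len(KERNEL_SIZES))            # fully connected layer

features = [conv_max_pool(sentence, kernel) for kernel in kernels]
logits = [sum(w * f for w, f in zip(row, features)) for row in dense]
probabilities = softmax(logits)
template_type = max(range(N_CLASSES), key=probabilities.__getitem__)
```

The kernels of width 2, 3 and 4 produce 6, 5 and 4 responses respectively on a 7-word sentence; pooling reduces each to a single value, which is how feature vectors of different lengths become a vector of fixed length.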
Optionally, the server performs text similarity matching on the question text clause through a preset intention recognition model to obtain a plurality of similarities, determines the maximum similarity from the plurality of similarities, and determines a matched template and a template type according to the maximum similarity.
It should be noted that the purpose of the server for intention recognition is to determine whether the question and the answer have marketing value, and if the question and the answer have no marketing value, discard the text clause of the question. For example, for the question text clause "xxx disease cannot buy yyy insurance", when the server performs intent recognition, it needs to determine whether the yyy insurance belongs to the target enterprise, if the yyy insurance belongs to the target enterprise, then determine the matched template a and template type a, and if the yyy insurance does not belong to the target enterprise, then determine the matched template B and template type B.
207. And combining the question text clause, the target answer and a preset link according to the template type, and intelligently replying the spliced content through a preset crawler task, wherein the preset link is used for indicating the target user to access the target online question-answering system.
The server combines the question text clause, the target answer and the preset link according to the template type, and posts the spliced content as an intelligent reply through the preset crawler task, where the preset link is used to direct the target user to the target online question-answering system. Specifically, the server determines a question-answer template according to the template type, the question-answer template being a set of candidate character strings configured for that template type; the server combines the question text clause, the target answer and the preset link based on the question-answer template; and the server posts the spliced content through the preset crawler task. For example, the spliced content is "@ I already have social security. Do I still need to buy commercial insurance? Medical expenses reimbursed by social security are limited, with a reimbursement limit of about 50%, and social security does not cover illness in full, so it needs to be supplemented."
Further, when it is detected that the target user accesses the target online question-answering system through the preset link, an online conversation is carried out with the target user through the target online question-answering system. When the target user clicks the preset jump link in the preset forum, the target user is directed, through the logic configuration index number carried in the link, to the target online question-answering system for the corresponding scenario, and online dialogue and automatic marketing are carried out with the target user through that system. For example, for insurance topics, online conversations and automatic marketing are conducted with target users through the target online question-answering system.
In this embodiment of the invention, a crawler task periodically captures questions from the target forum, an intelligent question-answering engine evaluates and answers the captured questions, and the crawler task posts traffic-diverting replies based on the answer content, leading users to a target page for intelligent question answering. This improves the timeliness, accuracy and efficiency of the traffic-diverting replies.
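The template-based splicing can be illustrated with ordinary string formatting. The template wording, the template type keys "A" and "B", the user handle and the link are all hypothetical; the patent does not give the actual template strings.

```python
# Hypothetical question-answer templates keyed by template type.
TEMPLATES = {
    "A": "@{user} {question} {answer} Learn more: {link}",
    "B": "@{user} {answer} Details: {link}",
}


def build_reply(template_type, user, question, answer, link):
    """Fill the question-answer template selected by `template_type`."""
    return TEMPLATES[template_type].format(
        user=user, question=question, answer=answer, link=link
    )


reply = build_reply(
    "A",
    user="forum_user",
    question="Do I still need commercial insurance if I have social security?",
    answer="Social security reimbursement is limited, so a supplement is advisable.",
    link="https://qa.example/chat?idx=3",
)
```

Keeping the templates in a mapping keyed by template type means the intention recognition step only has to output a key, and the wording can be changed without touching the code.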
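One way the index number carried in the link could select a scenario is sketched below. The query parameter name `idx`, the scenario table and the fallback value are assumptions introduced for illustration only.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical mapping from logic configuration index number to scenario.
SCENES = {
    "1": "health-insurance chat",
    "3": "social-security supplement chat",
}


def route(link, default="general chat"):
    """Pick the question-answering scenario from the index number in the link."""
    query = parse_qs(urlparse(link).query)
    index = query.get("idx", [None])[0]      # first value of the idx parameter, if any
    return SCENES.get(index, default)


scene = route("https://qa.example/chat?idx=3")
```

Links without an index, or with an unknown one, fall back to the default scenario rather than failing.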
The method for identifying intent based on artificial intelligence in the embodiment of the present invention is described above, and the apparatus for identifying intent based on artificial intelligence in the embodiment of the present invention is described below, referring to fig. 3, an embodiment of the apparatus for identifying intent based on artificial intelligence in the embodiment of the present invention includes:
the acquisition unit 301 is configured to acquire initial data from a preset forum website at regular time;
a preprocessing unit 302, configured to perform data preprocessing on the initial data according to a preset service, so as to obtain a question text clause;
A calculating unit 303, configured to calculate an included angle cosine value for the question text clause and a preset question sentence according to the processed sequence through a word frequency-inverse text frequency index algorithm, and determine a target answer according to the included angle cosine value;
The intention recognition unit 304 is configured to perform intention recognition on the question text clause through the deep neural network text classification model to obtain a template type;
and the reply unit 305 is used for combining the question text clause, the target answer and the preset link according to the template type, intelligently replying the spliced content through the preset crawler task, and indicating the target user to access the target online question-answering system.
In this embodiment of the invention, a crawler task periodically captures questions from the target forum, an intelligent question-answering engine evaluates and answers the captured questions, and the crawler task posts traffic-diverting replies based on the answer content, leading users to a target page for intelligent question answering. This improves the timeliness, accuracy and efficiency of the traffic-diverting replies.
Referring to fig. 4, another embodiment of an artificial intelligence based intention recognition apparatus according to an embodiment of the present invention includes:
the acquisition unit 301 is configured to acquire initial data from a preset forum website at regular time;
a preprocessing unit 302, configured to perform data preprocessing on the initial data according to a preset service, so as to obtain a question text clause;
A calculating unit 303, configured to calculate an included angle cosine value for the question text clause and a preset question sentence according to the processed sequence through a word frequency-inverse text frequency index algorithm, and determine a target answer according to the included angle cosine value;
The intention recognition unit 304 is configured to perform intention recognition on the question text clause through the deep neural network text classification model to obtain a template type;
and the reply unit 305 is used for combining the question text clause, the target answer and the preset link according to the template type, intelligently replying the spliced content through the preset crawler task, and indicating the target user to access the target online question-answering system.
Optionally, the acquisition unit 301 may be further specifically configured to:
determining a uniform resource locator address of a preset forum website;
The preset forum website is accessed at regular time through a preset crawler task and a uniform resource locator address of the preset forum website, and webpage data are obtained;
Intercepting webpage data according to preset page identification to obtain initial data, wherein the initial data comprises an escape character, a space, a webpage label and webpage content;
And recording the initial data and the uniform resource locator address of the preset forum website into a preset data table.
Optionally, the preprocessing unit 302 may be further specifically configured to:
Determining keywords of preset services, and extracting initial data according to the keywords;
deleting the escape symbol and the blank space from the extracted data in a text processing mode, and deleting the webpage label to obtain text data;
splitting the text data into clauses, and deleting the clauses that are empty character strings to obtain the question text clauses.
Optionally, the computing unit 303 may further include:
a preprocessing subunit 3031, configured to preprocess the question text clause according to the processed sequence, to obtain an initial vocabulary and a first question vector;
the text processing subunit 3032 is configured to perform text vectorization on the preset question according to the word frequency-inverse text frequency index algorithm and the initial vocabulary to obtain a second question vector;
The calculating subunit 3033 is configured to calculate the first question vector and the second question vector to obtain an included angle cosine value, and set an answer corresponding to a preset question with the largest included angle cosine value as the target answer.
Optionally, the preprocessing subunit 3031 is further specifically configured to:
writing a plurality of question text clauses to the tail of a preset message queue in the order in which they are processed;
setting time-out time length for the question text clause, wherein the time-out time length is greater than or equal to 0;
When the timeout duration corresponding to the question text clause is equal to 0 and the question text clause is still in a waiting state, adding the question text clause into a delay task queue, wherein the delay task queue is used for processing the overtime question text clause;
When the timeout duration corresponding to the question text clause is greater than 0, word segmentation and part-of-speech tagging are carried out on the question text clause, and an initial vocabulary and a first question sentence vector are obtained.
Optionally, the text processing subunit 3032 is further specifically configured to:
obtaining an initial vocabulary, the initial vocabulary including a plurality of words, the number of words being a positive integer;
reading the number of preset questions from a preset database;
counting, according to the word frequency-inverse text frequency index algorithm, the number of times each of the words occurs in the question text clause;
counting the number of preset questions that include each of the words;
performing text vectorization on the preset questions according to a first preset formula to obtain a second question vector. [The first preset formula is not recoverable from the source text.]
Optionally, the calculating subunit 3033 is further specifically configured to:
denoting the first question vector as $A$;
calculating the first question vector and the second question vector according to a second preset formula to obtain an included angle cosine value, wherein the second preset formula is $\cos\theta = \dfrac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$, where $A$ is the first question vector, $B$ is the second question vector, and $\cos\theta$ indicates the similarity of the first question vector $A$ and the second question vector $B$;
And ordering the answers corresponding to the preset questions according to the order of the cosine values of the included angles from large to small, and setting the answer corresponding to the preset question with the largest cosine value of the included angle as a target answer.
In this embodiment of the invention, a crawler task periodically captures questions from the target forum, an intelligent question-answering engine evaluates and answers the captured questions, and the crawler task posts traffic-diverting replies based on the answer content, leading users to a target page for intelligent question answering. This improves the timeliness, accuracy and efficiency of the traffic-diverting replies.
Fig. 3 and fig. 4 above describe the artificial intelligence-based intention recognition apparatus in the embodiment of the present invention in detail from the point of view of the modularized functional entity, and the artificial intelligence-based intention recognition device in the embodiment of the present invention is described in detail from the point of view of hardware processing below.
Fig. 5 is a schematic structural diagram of an artificial intelligence based intention recognition device 500 according to an embodiment of the present invention. The device 500 may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 501 (e.g., one or more processors), a memory 509, and one or more storage media 508 (e.g., one or more mass storage devices) storing application programs 507 or data 506. The memory 509 and the storage medium 508 may provide transient or persistent storage. The program stored on the storage medium 508 may include one or more modules (not shown), each of which may include a series of instruction operations for the artificial intelligence based intention recognition device. Still further, the processor 501 may be configured to communicate with the storage medium 508 and to execute, on the artificial intelligence based intention recognition device 500, the series of instruction operations stored in the storage medium 508.
The artificial intelligence based intention recognition device 500 may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems 505, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. It will be appreciated by those skilled in the art that the device structure shown in Fig. 5 does not constitute a limitation of the artificial intelligence based intention recognition device, which may include more or fewer components than illustrated, may combine certain components, or may arrange the components differently.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
While the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that the foregoing embodiments may be modified or equivalents may be substituted for some of the features thereof, and that the modifications or substitutions do not depart from the spirit and scope of the embodiments of the invention.

Claims (8)

1. An intention recognition method based on artificial intelligence, characterized by comprising:
collecting initial data from a preset forum website at regular intervals;
performing data preprocessing on the initial data according to a preset service to obtain question text clauses;
calculating an included angle cosine value for the question text clauses, in the order in which they are processed, through a word frequency-inverse text frequency index algorithm, and determining a target answer according to the included angle cosine value;
performing intention recognition on the question text clauses through a deep neural network text classification model to obtain a template type;
combining the question text clauses, the target answer and a preset link according to the template type, and posting the spliced content as an intelligent reply through a preset crawler task, wherein the preset link is used to direct a target user to a target online question-answering system;
wherein the calculating of the included angle cosine value for the question text clauses, in the order in which they are processed, through the word frequency-inverse text frequency index algorithm, and the determining of the target answer according to the included angle cosine value, comprise: preprocessing the question text clauses in the order in which they are processed to obtain an initial vocabulary and a first question vector; performing text vectorization on preset questions according to the word frequency-inverse text frequency index algorithm and the initial vocabulary to obtain a second question vector; and calculating the first question vector and the second question vector to obtain the included angle cosine value, and setting the answer corresponding to the preset question with the largest included angle cosine value as the target answer;
wherein the preprocessing of the question text clauses in the order in which they are processed to obtain the initial vocabulary and the first question vector comprises: writing a plurality of question text clauses to the tail of a preset message queue in the order in which they are processed; setting a timeout for the question text clauses, the timeout being greater than or equal to 0; when the timeout corresponding to a question text clause is equal to 0 and the question text clause is still in a waiting state, adding the question text clause to a delayed task queue, the delayed task queue being used to process overdue question text clauses; and when the timeout corresponding to a question text clause is greater than 0, performing word segmentation and part-of-speech tagging on the question text clause to obtain the initial vocabulary and the first question vector.
2. The artificial intelligence based intention recognition method according to claim 1, characterized in that the collecting of initial data from a preset forum website at regular intervals comprises:
determining the uniform resource locator address of the preset forum website;
periodically accessing the preset forum website through the preset crawler task and the uniform resource locator address of the preset forum website to obtain webpage data;
intercepting the webpage data according to a preset page identifier to obtain the initial data, the initial data including escape characters, spaces, webpage tags and webpage content;
recording the initial data and the uniform resource locator address of the preset forum website in a preset data table.
3. The artificial intelligence based intention recognition method according to claim 2, characterized in that the performing of data preprocessing on the initial data according to the preset service to obtain question text clauses comprises:
determining keywords of the preset service, and extracting data from the initial data according to the keywords;
deleting the escape characters and the spaces from the extracted data by text processing, and deleting the webpage tags, to obtain text data;
splitting the text data into clauses, and deleting the clauses that are empty character strings, to obtain the question text clauses.
4. The artificial intelligence based intention recognition method according to claim 1, characterized in that the performing of text vectorization on the preset questions according to the word frequency-inverse text frequency index algorithm and the initial vocabulary to obtain the second question vector comprises:
obtaining the initial vocabulary, the initial vocabulary including a plurality of words, the number of words being a positive integer;
reading the number of preset questions from a preset database;
counting, according to the word frequency-inverse text frequency index algorithm, the number of times each of the words occurs in the question text clause;
counting the number of preset questions that include each of the words;
performing text vectorization on the preset questions according to a first preset formula to obtain the second question vector, wherein, in the first preset formula, k represents the j-th preset question. [The first preset formula is not recoverable from the source text.]
5. The artificial intelligence based intention recognition method according to claim 1 or 4, characterized in that the calculating of the first question vector and the second question vector to obtain the included angle cosine value, and the setting of the answer corresponding to the preset question with the largest included angle cosine value as the target answer, comprise:
calculating the first question vector and the second question vector according to a second preset formula to obtain the included angle cosine value, wherein the second preset formula is $\cos\theta = \dfrac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$, where $A$ is the first question vector, $B$ is the second question vector, and $\cos\theta$ indicates the similarity of the first question vector $A$ and the second question vector $B$;
sorting the answers corresponding to the preset questions in descending order of the included angle cosine value, and setting the answer corresponding to the preset question with the largest included angle cosine value as the target answer.
6. An artificial intelligence based intention recognition apparatus, characterized in that the artificial intelligence based intention recognition apparatus comprises:
a collection unit, configured to collect initial data from a preset forum website at regular intervals;
a preprocessing unit, configured to perform data preprocessing on the initial data according to a preset service to obtain question text clauses;
a calculation unit, configured to calculate an included angle cosine value for the question text clauses, in the order in which they are processed, through a word frequency-inverse text frequency index algorithm, and to determine a target answer according to the included angle cosine value;
an intention recognition unit, configured to perform intention recognition on the question text clauses through a deep neural network text classification model to obtain a template type;
a reply unit, configured to combine the question text clauses, the target answer and a preset link according to the template type, and to post the spliced content as an intelligent reply through a preset crawler task, wherein the preset link is used to direct a target user to a target online question-answering system;
wherein the calculating of the included angle cosine value for the question text clauses, in the order in which they are processed, through the word frequency-inverse text frequency index algorithm, and the determining of the target answer according to the included angle cosine value, comprise: preprocessing the question text clauses in the order in which they are processed to obtain an initial vocabulary and a first question vector; performing text vectorization on preset questions according to the word frequency-inverse text frequency index algorithm and the initial vocabulary to obtain a second question vector; and calculating the first question vector and the second question vector to obtain the included angle cosine value, and setting the answer corresponding to the preset question with the largest included angle cosine value as the target answer;
wherein the preprocessing of the question text clauses in the order in which they are processed to obtain the initial vocabulary and the first question vector comprises: writing a plurality of question text clauses to the tail of a preset message queue in the order in which they are processed; setting a timeout for the question text clauses, the timeout being greater than or equal to 0; when the timeout corresponding to a question text clause is equal to 0 and the question text clause is still in a waiting state, adding the question text clause to a delayed task queue, the delayed task queue being used to process overdue question text clauses; and when the timeout corresponding to a question text clause is greater than 0, performing word segmentation and part-of-speech tagging on the question text clause to obtain the initial vocabulary and the first question vector.
7. 
An artificial intelligence-based intention recognition device, characterized in that the artificial intelligence-based intention recognition device comprises: a memory and at least one processor, the memory stores instructions, and the memory and the at least one processor are interconnected through a line; 所述至少一个处理器调用所述存储器中的所述指令,以使得所述基于人工智能的意图识别设备执行如权利要求1-5中任意一项所述的基于人工智能的意图识别方法。The at least one processor calls the instructions in the memory so that the artificial intelligence-based intention recognition device executes the artificial intelligence-based intention recognition method as described in any one of claims 1-5. 8.一种计算机可读存储介质,存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现如权利要求1-5中任意一项所述的基于人工智能的意图识别方法。8. A computer-readable storage medium storing a computer program, characterized in that when the computer program is executed by a processor, the artificial intelligence-based intention recognition method as described in any one of claims 1 to 5 is implemented.
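As an illustration of the vectorization and cosine-matching steps recited in claims 4 and 5, the following Python sketch may be helpful. It is not the patented formula: the first and second preset formulas are not reproduced in the source, so the sketch assumes a common smoothed TF-IDF weighting and the standard cosine of the angle between two vectors. All function names, the sample questions and the sample answers are hypothetical.

```python
import math
from collections import Counter


def build_idf(preset_questions, vocabulary):
    """Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1 for each vocabulary word."""
    n = len(preset_questions)
    idf = {}
    for word in vocabulary:
        # df = number of preset questions that contain the word
        df = sum(1 for tokens in preset_questions if word in tokens)
        idf[word] = math.log((1 + n) / (1 + df)) + 1.0
    return idf


def tfidf_vector(tokens, vocabulary, idf):
    """Vectorize one tokenized sentence over the fixed vocabulary order."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty input
    return [counts[word] / total * idf[word] for word in vocabulary]


def cosine(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_answer(question_tokens, preset_questions, answers, vocabulary):
    """Rank preset questions by cosine value and return (score, answer) of the top one."""
    idf = build_idf(preset_questions, vocabulary)
    q_vec = tfidf_vector(question_tokens, vocabulary, idf)
    scored = [
        (cosine(q_vec, tfidf_vector(tokens, vocabulary, idf)), answer)
        for tokens, answer in zip(preset_questions, answers)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # descending cosine value
    return scored[0]


# Hypothetical, already-segmented data for illustration only.
question = ["how", "claim", "insurance"]
presets = [
    ["how", "buy", "insurance"],
    ["how", "claim", "insurance", "payout"],
    ["change", "beneficiary"],
]
preset_answers = ["A-buy", "A-claim", "A-beneficiary"]
vocab = sorted(set(question))  # the vocabulary is derived from the incoming question
score, answer = best_answer(question, presets, preset_answers, vocab)
print(answer, round(score, 3))
```

Because the vocabulary is taken from the incoming question, preset questions that share none of its words map to the zero vector and score 0, which places them last in the ranking.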
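The queue handling recited in claim 6 (append to the tail of a message queue, attach a timeout, divert sentences whose timeout has reached 0 to a delayed task queue, and process the rest) can be sketched as follows. This is a minimal in-memory sketch under stated assumptions: the claims do not name a queue implementation, so `collections.deque` stands in for the preset message queue, the clock is passed in explicitly, and whitespace splitting is a hypothetical placeholder for word segmentation and part-of-speech tagging.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueuedSentence:
    text: str
    deadline: float  # absolute time at which the remaining timeout reaches 0

    def remaining(self, now):
        """Remaining timeout, clamped so that it is never negative."""
        return max(0.0, self.deadline - now)


def enqueue(queue, text, now, timeout):
    """Write a sentence to the tail of the queue with a timeout >= 0."""
    if timeout < 0:
        raise ValueError("timeout must be greater than or equal to 0")
    queue.append(QueuedSentence(text, now + timeout))


def drain(queue, delayed, now, process):
    """Handle waiting sentences in arrival order.

    Sentences whose remaining timeout is 0 go to the delayed task queue;
    the others are passed to `process` and their results collected.
    """
    results = []
    while queue:
        item = queue.popleft()
        if item.remaining(now) == 0:
            delayed.append(item)  # overdue: defer to the delayed task queue
        else:
            results.append(process(item.text))
    return results


def placeholder_segment(text):
    """Hypothetical stand-in for word segmentation and part-of-speech tagging."""
    return text.split()


message_queue = deque()
delayed_queue = deque()
enqueue(message_queue, "how to claim insurance", now=0.0, timeout=5.0)
enqueue(message_queue, "change beneficiary", now=0.0, timeout=0.0)
processed = drain(message_queue, delayed_queue, now=1.0, process=placeholder_segment)
print(processed, [item.text for item in delayed_queue])
```

Passing `now` as a parameter rather than reading the system clock keeps the sketch deterministic; a deployed system would more likely rely on the message broker's own expiry and delay features.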
CN202010162325.5A 2020-03-10 2020-03-10 Intention recognition method, device, equipment and storage medium based on artificial intelligence Active CN111428471B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010162325.5A CN111428471B (en) 2020-03-10 2020-03-10 Intention recognition method, device, equipment and storage medium based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010162325.5A CN111428471B (en) 2020-03-10 2020-03-10 Intention recognition method, device, equipment and storage medium based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN111428471A CN111428471A (en) 2020-07-17
CN111428471B true CN111428471B (en) 2025-04-11

Family

ID=71551540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010162325.5A Active CN111428471B (en) 2020-03-10 2020-03-10 Intention recognition method, device, equipment and storage medium based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN111428471B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114860934A (en) * 2022-05-09 2022-08-05 青岛日日顺乐信云科技有限公司 A Smart Question Answering Method Based on NLP Technology
CN115936011B (en) * 2022-12-28 2023-10-20 南京易米云通网络科技有限公司 Multi-intention semantic recognition method in intelligent dialogue

Citations (1)

Publication number Priority date Publication date Assignee Title
CN109522393A (en) * 2018-10-11 2019-03-26 平安科技(深圳)有限公司 Intelligent answer method, apparatus, computer equipment and storage medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US7392185B2 (en) * 1999-11-12 2008-06-24 Phoenix Solutions, Inc. Speech based learning/training system using semantic decoding
CN108446286B (en) * 2017-02-16 2023-04-25 阿里巴巴集团控股有限公司 Method, device and server for generating natural language question answers

Also Published As

Publication number Publication date
CN111428471A (en) 2020-07-17

Similar Documents

Publication Publication Date Title
CN110968684B (en) Information processing method, device, equipment and storage medium
US8161059B2 (en) Method and apparatus for collecting entity aliases
CN102054015B (en) System and method for organizing community intelligence information using an organic object data model
CN114238573A (en) Information pushing method and device based on text countermeasure sample
US20110112995A1 (en) Systems and methods for organizing collective social intelligence information using an organic object data model
CA2774278C (en) Methods and systems for extracting keyphrases from natural text for search engine indexing
CN110543595B (en) In-station searching system and method
CN108959531B (en) Information search method, device, device and storage medium
CN109241277B (en) Text vector weighting method and system based on news keywords
CN103823824A (en) Method and system for automatically constructing text classification corpus by aid of internet
CN107885793A (en) A kind of hot microblog topic analyzing and predicting method and system
CN103810251B (en) Method and device for extracting text
CN111160019A (en) Public opinion monitoring method, device and system
CN111563382A (en) Text information acquisition method and device, storage medium and computer equipment
CN111428471B (en) Intention recognition method, device, equipment and storage medium based on artificial intelligence
CN109948154B (en) A system and method for character acquisition and relationship recommendation based on mailbox name
US8862586B2 (en) Document analysis system
CN118535978A (en) News analysis method and system based on multi-mode large model
CN112035723A (en) Resource library determination method and device, storage medium and electronic device
CN108446333B (en) Big data text mining processing system and method thereof
CN115640439A (en) Method, system and storage medium for network public opinion monitoring
CN116070024A (en) Article Recommendation Method and Device Based on New Energy Cloud and User Behavior
CN109902230A (en) Method and device for processing news data
Rousseau Graph-of-words: mining and retrieving text with networks of features
CN109597879B (en) Service behavior relation extraction method and device based on 'citation relation' data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant