CN111858850B

CN111858850B - A method for achieving accurate and fast scoring of questions and answers in intelligent customer service

Info

Publication number: CN111858850B
Application number: CN202010609405.0A
Authority: CN
Inventors: 何彦霖; 邬敏健; 胡醒; 周畅
Original assignee: Yinsheng Payment Service Co Ltd
Current assignee: Yinsheng Payment Service Co Ltd
Priority date: 2020-06-29
Filing date: 2020-06-29
Publication date: 2024-12-27
Anticipated expiration: 2040-06-29
Also published as: CN111858850A

Abstract

The embodiment of the present invention provides a method for achieving accurate and fast scoring of questions and answers on intelligent customer service, including the following steps: establishing a knowledge route associated with a knowledge base, matching the corresponding knowledge route according to the characteristics of the question, the knowledge route including a question and answer template, a search engine, and a semantic analysis; segmenting the question through a word segmenter hanlp to obtain segmented words, and matching the segmented words with the routing keywords of the target knowledge base according to the corresponding knowledge route; obtaining the target answer corresponding to the question from the target knowledge base based on the final score obtained by the full-text index score, the semantic similarity score, the word class weighted score, and the matching frequency score, and returning the target answer to the user. The embodiment of the present invention obtains the final score through the full-text index score, the semantic similarity score, the word class weighted score, and the matching frequency score to match the most appropriate answer, thereby achieving accurate answers and fast scoring.

Description

Method for realizing accurate and rapid scoring of questions and answers on intelligent customer service

Technical Field

The invention relates to the technical field of intelligent customer service, in particular to a method for realizing accurate and rapid scoring of questions and answers on intelligent customer service.

Background

With the continuous development of internet financial technology, more and more technologies are applied in the financial field, wherein intelligent customer service in the financial field involves numerous technical applications.

At present, the question-answering function of intelligent customer service is mainly implemented in three ways. The first finds a matching question-and-answer template according to the format of the user's question and returns the corresponding answer. This approach is efficient, but the question format templates must be maintained continuously, and once a question falls outside the range of the templates, no answer is matched. The second uses a search engine: sentences are segmented into words in advance and the keywords are stored; when a user asks a question, the question is segmented as well and then matched by the search engine. The third answers questions through semantic analysis. It is far more flexible and can match the answer to a question through semantic analysis and lexicon training, but it has high performance requirements, is inefficient, and easily produces answers that are irrelevant to the question. For intelligent customer service, matching the answer that the user actually needs is what matters most, so returning that answer accurately and quickly by means of matching scores is essential.

Each of the three question-answering modes has its own defects, and combining the three makes the service more user-friendly. A method for accurately and quickly scoring questions and answers in intelligent customer service is therefore needed on the basis of the prior art, in line with the trend of modernization of the Internet of Things.

Summary of the invention

In order to overcome the defects of the prior art, the invention provides a method for realizing accurate and rapid scoring of questions and answers on intelligent customer service, which improves how quickly and how accurately a user's question is matched to an answer in intelligent customer service.

The technical scheme adopted by the invention to solve the technical problem is a method for realizing accurate and rapid scoring of questions and answers on intelligent customer service, comprising the following steps: establishing knowledge routes associated with a knowledge base, and matching the corresponding knowledge route according to the characteristics of the question, wherein the knowledge routes comprise question-and-answer templates, a search engine and semantic analysis; performing word segmentation on the question through the word segmenter HanLP to obtain segmented words, and matching the segmented words with the routing keywords of a target knowledge base according to the corresponding knowledge route; and obtaining the target answer corresponding to the question from the target knowledge base based on the final score obtained from the full-text index score, the semantic similarity score, the word class weighted score and the matching frequency score, and returning the target answer to the user.

Preferably, the step of obtaining the target answer corresponding to the question from the target knowledge base based on the final score obtained by the full-text index score, the semantic similarity score, the part-of-speech weighted score and the matching frequency score includes:

matching the segmented words through the BM25 algorithm of the Elasticsearch full-text index to obtain the 6 matching results with the highest scores;

acquiring full-text index scores (full score 20) corresponding to the highest 6 matching results;

Score(Q,d) = SUM(Wi * R(qi,d))

Wi = IDF(qi) = log((N - n(qi) + 0.5) / (n(qi) + 0.5))

R(qi,d) = fi * (k1 + 1) / (fi + K)

K = k1 * (1 - b + b * (dl / avg(dl)))

Here the segmented query terms are qi = q1, q2, q3, ..., qn; N is the total number of documents in the index; n(qi) is the number of documents containing the term qi; d is a search result (an indexed document); Wi is the relevance weight of the term qi with respect to the indexed documents; fi is the frequency of qi in d; k1 and b are adjustable parameters of the algorithm; dl is the length of the indexed document d; and avg(dl) is the average length of all texts in the indexed text set D.

calculating semantic similarity for the segmented words using the Python Synonyms framework to obtain the 6 matching results with the highest matching scores;

Obtaining semantic similarity matching scores (full score 20) corresponding to the highest 6 matching results;

d1=max(compare(a1,b1),compare(a1,b2),...compare(a1,bm));

d2=max(compare(a2,b1),compare(a2,b2),...compare(a2,bm));

...

dn=max(compare(an,b1),compare(an,b2),...compare(an,bm));

the semantic similarity matching score = avg(d1, d2, ..., dn);

The word set after word segmentation is Wi = {a1, a2, ..., an}, the word set of the result matched by the search engine is Wj = {b1, b2, ..., bm}, compare(a, b) represents the distance between word a and word b with a value range of [0,1], and d represents the distance between the words.

calculating the 6 matching results with the highest matching scores for the segmented words through the special(a) function;

obtaining the part-of-speech weighting scores of the highest 6 matching results;

the word class weighted score = avg(s1, s2, ..., sn);

where s is a word class weighted score, the word set after word segmentation is Wi = {a1, a2, ..., an}, and special(a) is the specific-noun scoring function: scores for specific nouns are set and managed in the background management system and are obtained through the special(a) function.

obtaining the 6 matching results with the highest matching frequency scores, where the matching frequency score = 20 × (the matching frequency of the question in the database / the highest question matching frequency in the current database);

obtaining the matching frequency scores of the highest 6 matching results.

Preferably, the final score is calculated as follows:

Final score = w1 × full-text index score + w2 × semantic similarity score + w3 × word class weighted score + w4 × matching frequency score;

where w1, w2, w3 and w4 are the weights of the four types of scores. Each has an initial value of 0.5 and a value range of [0,1]; the weights are configurable, and the system administrator adjusts them according to the actual question-answer matching results.

Preferably, obtaining the target answer corresponding to the question from the target knowledge base based on the final score obtained by the full-text index score, the semantic similarity score, the part-of-speech weighting score and the matching frequency score comprises:

when the user asks a question, if the question is routed to the cold library and the cold library has no matching result, obtaining the matching result through a third-party API interface; or

when the user asks a question, if the question is routed to the cold library and the cold library has no matching result, obtaining the matching result through calculation by a neural network algorithm.

Preferably, matching the corresponding knowledge route according to the characteristics of the question includes:

calculating the number di of identical elements between the routing keyword set Ki = {k1, k2, ..., kn} of knowledge base i and the segmented word set W = {a1, a2, ..., am};

obtaining the maximum value max(di) and determining the knowledge base route with the highest matching degree.

Preferably, matching the segmented words with the routing keywords of the target knowledge base according to the corresponding knowledge route includes:

setting a corresponding word segmentation strategy according to the corresponding knowledge route, so that the segmented words are matched with the routing keywords of the target knowledge base.

The method has the beneficial effects that the final score is obtained through the corresponding full-text index score, semantic similarity score, word class weighting score and matching frequency score, so that the accuracy and efficiency of the question matching answer are improved.

Drawings

FIG. 1 is a flow chart of a method for achieving accurate and rapid scoring of questions and answers on intelligent customer service.

FIG. 2 is a scoring schematic diagram of a method for achieving accurate and rapid scoring of questions and answers on intelligent customer service.

Detailed Description

The invention will be further described with reference to the drawings and examples.

The conception, specific structure, and technical effects of the present invention will be described clearly and completely below with reference to the embodiments and the drawings, so that the objects, features, and effects of the present invention can be fully understood. The described embodiments are only some, not all, of the embodiments of the present invention, and other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the present invention. In addition, the coupling/connection relationships referred to in this patent do not mean only that members are directly connected; rather, a better coupling structure can be formed by adding or removing coupling aids depending on the specific implementation. The technical features of the invention can be combined with one another provided that they do not contradict or conflict.

Referring to FIG. 1

S101, establishing a knowledge route associated with a knowledge base, and matching the corresponding knowledge route according to the characteristics of the question, wherein the knowledge route comprises a question-and-answer template, a search engine and semantic analysis;

A knowledge route associated with a knowledge base is established, where the knowledge bases include a cold library, a question-and-answer library, a business library and the like, and the corresponding knowledge route is matched according to the characteristics of the question. The knowledge route comprises a question-and-answer template, a search engine and semantic analysis. When the user asks a question, for example "What fault has occurred on the smart POS machine?", and the question is covered by the question-and-answer template, the answer is matched through that knowledge route. If the question is to be retrieved through semantic analysis, the question is analysed semantically, answers whose questions have the same or a similar meaning are searched for, and the answer is matched against the knowledge base. Providing multiple ways of finding the corresponding answer improves retrieval efficiency.

S102, performing word segmentation on the question through the word segmenter HanLP to obtain segmented words, and matching the segmented words with the routing keywords of a target knowledge base according to the corresponding knowledge route;

When the user asks "What fault has occurred on the smart POS machine?" and neither the question-and-answer template nor semantic analysis can retrieve an answer, the question is retrieved by means of the search engine. The question is segmented by the word segmenter HanLP into words of various parts of speech, and the corresponding knowledge base is searched and matched. The system administrator collects the questions and answers of the business, organizes them into a question-answer mapping, builds the knowledge base in advance and trains it to obtain a trained knowledge base. The corresponding knowledge base is then matched according to the attributes of the question, so that it is found quickly and the efficiency of searching the knowledge base is improved.
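The fallback between the three knowledge routes can be sketched as a simple cascade. This is only an illustration: the function name `retrieve_answer`, the three lookup callables and the exact order of the template and semantic routes are assumptions, not details given in the text.

```python
def retrieve_answer(question, template_lookup, semantic_lookup, search_engine_lookup):
    """Try each knowledge route in turn and return (route name, answer).

    Each lookup is a hypothetical callable that returns an answer string,
    or None when that route finds nothing. The search engine is tried last,
    as described for the case where the other two routes fail.
    """
    routes = (
        ("template", template_lookup),
        ("semantic", semantic_lookup),
        ("search_engine", search_engine_lookup),
    )
    for name, lookup in routes:
        answer = lookup(question)
        if answer is not None:
            return name, answer
    return None, None


# Toy stand-ins for the three routes (hypothetical data).
templates = {"How do I reset my password?": "Open Settings and choose Reset."}
route, answer = retrieve_answer(
    "What fault has occurred on the smart POS machine?",
    template_lookup=templates.get,
    semantic_lookup=lambda q: None,
    search_engine_lookup=lambda q: "Please restart the terminal and check the network.",
)
```

Keeping the routes as plain callables lets each one be replaced independently, for example by a real template store or a search client.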

S103, obtaining a target answer corresponding to the question from the target knowledge base based on the final score obtained from the full-text index score, the semantic similarity score, the word class weighted score and the matching frequency score, and returning the target answer to the user.

When the user asks a question, if the question is routed to the cold library and the cold library has no matching result, the matching result is obtained through a third-party API interface or, alternatively, through calculation by a neural network algorithm, so that accurate and efficient matching and scoring are achieved.
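A minimal sketch of this fallback is shown below. The names `answer_from_cold_library`, `cold_lookup`, `third_party_api` and `neural_model` are hypothetical; the text does not say which third-party service or neural network is used, so both are passed in as optional callables.

```python
def answer_from_cold_library(question, cold_lookup, third_party_api=None, neural_model=None):
    """Return the cold-library answer, falling back when it has no match.

    cold_lookup, third_party_api and neural_model are hypothetical callables
    that take the question and return an answer string or None. The text
    describes the two fallbacks as alternatives; here whichever one is
    configured is used, with the third-party API checked first.
    """
    result = cold_lookup(question)
    if result is not None:
        return result
    if third_party_api is not None:
        return third_party_api(question)
    if neural_model is not None:
        return neural_model(question)
    return None


cold_library = {"hello": "Hello, how can I help you?"}  # toy data
reply = answer_from_cold_library(
    "nice weather today",
    cold_lookup=cold_library.get,
    third_party_api=lambda q: "Yes, it is a lovely day.",
)
```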

Referring to FIG. 2

S201, full text index scoring

obtaining the search engine matching score (full score 20) corresponding to the highest 6 matching results;

Score(Q,d) = SUM(Wi * R(qi,d))

Wi = IDF(qi) = log((N - n(qi) + 0.5) / (n(qi) + 0.5))

R(qi,d) = fi * (k1 + 1) / (fi + K)

K = k1 * (1 - b + b * (dl / avg(dl)))

Here the segmented query terms are qi = q1, q2, q3, ..., qn; N is the total number of documents in the index; n(qi) is the number of documents containing the term qi; d is a search result (an indexed document); Wi is the relevance weight of the term qi with respect to the indexed documents; fi is the frequency of qi in d; k1 and b are adjustable parameters of the algorithm; and dl and avg(dl) are, respectively, the length of the indexed document d and the average length of all texts in the indexed text set D. Scoring by the search engine filters out most questions that are essentially unrelated in meaning, which makes the subsequent processing of the segmented words more efficient.
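The BM25 formulas can be written out directly in Python. The sketch below is a plain re-implementation of the formulas as stated, not Elasticsearch's own scoring code; the function name `bm25_score`, the toy corpus and the parameter values k1 = 1.2 and b = 0.75 are illustrative assumptions. How the raw score is scaled to the 20-point range is not specified in the text and is left out.

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score(Q, d) = SUM(Wi * R(qi, d)) for one tokenized document d."""
    N = len(corpus)                                  # total number of documents
    avgdl = sum(len(doc) for doc in corpus) / N      # avg(dl)
    dl = len(doc_terms)                              # length of document d
    K = k1 * (1 - b + b * (dl / avgdl))
    freqs = Counter(doc_terms)
    score = 0.0
    for q in query_terms:
        n_q = sum(1 for doc in corpus if q in doc)   # n(qi): documents containing qi
        W = math.log((N - n_q + 0.5) / (n_q + 0.5))  # Wi = IDF(qi)
        f = freqs[q]                                 # fi: frequency of qi in d
        score += W * f * (k1 + 1) / (f + K)          # Wi * R(qi, d)
    return score


# Toy index of already-segmented documents (hypothetical data).
corpus = [
    ["smart", "pos", "fault"],
    ["pos", "refund"],
    ["account", "login"],
    ["card", "bind"],
    ["fee", "rate"],
]
query = ["pos", "fault"]
# Rank the documents and keep at most the 6 best, as in the scoring step.
ranked = sorted(corpus, key=lambda d: bm25_score(query, d, corpus), reverse=True)[:6]
```

Note that with this form of IDF a term that occurs in more than half of the documents receives a negative weight, which is one reason production engines often use a slightly different variant.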

S202, semantic similarity scoring: calculating semantic similarity for the segmented words using the Python Synonyms framework to obtain the 6 matching results with the highest matching scores;

obtaining the semantic similarity matching scores (full score 20 points) corresponding to the highest 6 matching results.

d1=max(compare(a1,b1),compare(a1,b2),...compare(a1,bm));

d2=max(compare(a2,b1),compare(a2,b2),...compare(a2,bm));

...

dn=max(compare(an,b1),compare(an,b2),...compare(an,bm));

Semantic similarity matching score = avg(d1, d2, ..., dn);

The word set after word segmentation is Wi = {a1, a2, ..., an}, and the word set of the result matched by the search engine is Wj = {b1, b2, ..., bm}; compare(a, b) represents the distance between word a and word b, with a value range of [0,1], and d represents the distance between the words.
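The averaging of per-word maxima can be sketched as follows. The `compare` function is passed in as a parameter so that the example runs without external libraries; `toy_compare` and its lookup table are hypothetical stand-ins for a real word-similarity function such as the one a synonym library would provide. The result lies in [0,1]; any scaling to the 20-point range is an assumption left to the caller.

```python
def semantic_similarity_score(question_words, result_words, compare):
    """avg(d1, ..., dn) with di = max over bj of compare(ai, bj)."""
    if not question_words or not result_words:
        return 0.0
    best = [max(compare(a, b) for b in result_words) for a in question_words]
    return sum(best) / len(best)


def toy_compare(a, b):
    """Hypothetical similarity in [0, 1]: 1.0 for identical words, else a table lookup."""
    table = {("fault", "broken"): 0.8, ("pos", "terminal"): 0.6}
    if a == b:
        return 1.0
    return table.get((a, b), table.get((b, a), 0.0))


# d1 = max(0.6, 0.0) = 0.6 and d2 = max(0.0, 0.8) = 0.8, so the average is about 0.7.
score = semantic_similarity_score(["pos", "fault"], ["terminal", "broken"], toy_compare)
```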

S203, word class weighting scoring

obtaining the word class weighted scores of the highest 6 matching results.

The word segmentation results of the 6 matching results and of the question are obtained, and word class weighted scoring is performed according to specific nouns (16-20 points) > ordinary nouns (15 points) > verbs (10 points) > other words (5 points), finally giving an average word class weighted score. The distance between Wi and Wj is calculated through the synonym framework Synonyms. The word class weighted score is calculated as follows:

word class weighted score = avg(s1, s2, ..., sn);
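The point values listed for specific nouns, ordinary nouns, verbs and other words can be turned into a small scoring function. In this sketch the table `SPECIAL_NOUN_SCORES` is a hypothetical stand-in for the scores kept in the background management system, and the part-of-speech tags ("n" for nouns, "v" for verbs) are assumed to follow a HanLP-style tag set.

```python
# Hypothetical specific-noun scores (16-20 points), as an administrator might configure them.
SPECIAL_NOUN_SCORES = {"smart POS machine": 20, "settlement": 16}


def special(word):
    """Specific-noun scoring function: the configured score, or None if not a specific noun."""
    return SPECIAL_NOUN_SCORES.get(word)


def word_score(word, pos_tag):
    """Points for one word: specific noun > ordinary noun (15) > verb (10) > other (5)."""
    configured = special(word)
    if configured is not None:
        return configured
    if pos_tag.startswith("n"):
        return 15
    if pos_tag.startswith("v"):
        return 10
    return 5


def word_class_weighted_score(tagged_words):
    """avg(s1, ..., sn) over (word, part-of-speech tag) pairs."""
    if not tagged_words:
        return 0.0
    scores = [word_score(word, tag) for word, tag in tagged_words]
    return sum(scores) / len(scores)


tagged = [("smart POS machine", "n"), ("fault", "n"), ("occur", "v"), ("what", "r")]
average = word_class_weighted_score(tagged)  # (20 + 15 + 10 + 5) / 4
```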

S204, matching frequency scoring

The matching frequency of each of the 6 matching results in the database is obtained, and the matching frequency is scored as 20 × (the matching frequency of the question in the database / the highest question matching frequency in the current database). The more frequently a question has been matched, the higher its score, which raises the probability of matching the answer the user needs.
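As a worked example of the matching frequency score, the ratio can be computed as below. The function name and the sample counts are illustrative assumptions; reading the expression as 20 multiplied by the frequency ratio is also an interpretation of the text.

```python
def matching_frequency_score(question_hits, max_hits, full_score=20):
    """full_score * (hits of this question / hits of the most frequently matched question)."""
    if max_hits <= 0:
        return 0.0  # nothing has been matched yet, so no frequency signal
    return full_score * question_hits / max_hits


# A question matched 30 times, when the most frequently matched question has 120 hits.
score = matching_frequency_score(30, 120)
```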

S205, final scoring

Final score = w1 × full-text index score + w2 × semantic similarity score + w3 × word class weighted score + w4 × matching frequency score.

Here w1, w2, w3 and w4 are the weights of the four types of scores. Each has an initial value of 0.5 and a value range of [0,1]; the weights are configurable, and the system administrator adjusts them according to the actual question-answer matching results. For example, if the system administrator considers the search engine matching score more important for answer matching, the value of w1 is increased or the values of w2, w3 and w4 are correspondingly reduced; if the system administrator considers the semantic similarity score more important, the value of w2 is increased or the values of w1, w3 and w4 are correspondingly reduced; and so on.
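The weighted sum and the effect of adjusting the weights can be illustrated with a short sketch. The `Candidate` class, the function names and the sample scores are hypothetical; only the formula, the default weight of 0.5 and the [0,1] range come from the text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer: str
    full_text: float   # full-text index score (0-20)
    semantic: float    # semantic similarity score (0-20)
    word_class: float  # word class weighted score (0-20)
    frequency: float   # matching frequency score (0-20)


def final_score(c, w1=0.5, w2=0.5, w3=0.5, w4=0.5):
    """w1*full_text + w2*semantic + w3*word_class + w4*frequency, with weights in [0, 1]."""
    for w in (w1, w2, w3, w4):
        if not 0.0 <= w <= 1.0:
            raise ValueError("weights must lie in [0, 1]")
    return w1 * c.full_text + w2 * c.semantic + w3 * c.word_class + w4 * c.frequency


def best_answer(candidates, **weights):
    """Return the answer of the candidate with the highest final score."""
    return max(candidates, key=lambda c: final_score(c, **weights)).answer


candidates = [
    Candidate("Restart the terminal.", 18, 10, 12.5, 5),
    Candidate("Contact your account manager.", 12, 16, 15, 10),
]
default_choice = best_answer(candidates)
# Emphasizing the full-text score by raising w1 and lowering the other weights.
tuned_choice = best_answer(candidates, w1=1.0, w2=0.2, w3=0.2, w4=0.2)
```

With the default weights the second candidate wins; after the administrator emphasizes the full-text score, the first one does, which is the kind of adjustment the paragraph describes.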

In the embodiment of the application, the final score is obtained through the full-text index score, the semantic similarity score, the word class weighting score and the matching frequency score so as to match the most suitable answer, thereby improving the accuracy and efficiency of the question matching answer.

Principle of routing and word segmentation for knowledge base:

Let the routing keyword set of knowledge base i be Ki = {k1, k2, ..., kn}, and let the word set after word segmentation be W = {a1, a2, ..., am}. The number of identical elements between the routing keyword set Ki and the word set W is calculated as di, the maximum value max(di) is obtained, and the knowledge base route with the highest matching degree is determined. If several knowledge base routes have the same score, the route finally selected is determined by the weight values of the knowledge base routes configured in the background. In other words, the question is first matched against a subset of keywords and given a first score, and the knowledge base with the highest score is selected for subsequent matching. This reduces the number of subsequent matching operations and improves retrieval efficiency, and it also avoids matching knowledge that is irrelevant to the question, which improves matching accuracy.
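The routing rule amounts to counting the keywords shared between each knowledge base and the segmented question, then breaking ties by a configured weight. The sketch below assumes a hypothetical `routes` mapping from knowledge base name to a (keyword set, weight) pair; the names and sample keywords are illustrative only.

```python
def route_knowledge_base(words, routes):
    """Pick the knowledge base with the largest di = |Ki ∩ W|; ties go to the higher weight.

    routes maps a knowledge base name to (routing keyword set Ki, configured weight).
    """
    word_set = set(words)

    def rank(name):
        keywords, weight = routes[name]
        return (len(keywords & word_set), weight)

    return max(routes, key=rank)


routes = {
    "cold": ({"hello", "thanks"}, 1),
    "qa": ({"pos", "fault", "refund"}, 2),
    "business": ({"pos", "fee", "rate"}, 3),
}
chosen = route_knowledge_base(["smart", "pos", "fault"], routes)  # "qa" shares two keywords
tie_break = route_knowledge_base(["pos"], routes)                 # one shared keyword each
```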

Different word segmentation strategies are adopted according to the corresponding knowledge route.

For example, the cold library strategy has the widest segmentation scope: nouns, verbs, locative words and other words all take part in segmentation, while particles, articles and punctuation marks do not (for example, "hello"). The question-and-answer strategy keeps only nouns and verbs (for example, "the smart POS machine breaks down"). The business strategy keeps only nouns (for example, "smart POS machine"). The system thus adopts different word segmentation strategies for different knowledge base types, one part based on part of speech and the other on weighted words, and these strategies give the subsequent scoring greater flexibility and accuracy.
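The per-library strategies can be sketched as part-of-speech filters applied after segmentation. The tag letters used here ("n" noun, "v" verb, "d" adverb, "u" particle, "w" punctuation) are assumed to follow a HanLP-style tag set, the input is assumed to be already segmented and tagged, and the exact set of word classes excluded by the cold library strategy is an interpretation of the text.

```python
# One predicate per knowledge base type, deciding which part-of-speech tags are kept.
STRATEGIES = {
    "cold": lambda tag: not tag.startswith(("u", "w")),  # widest scope: drop particles, punctuation
    "qa": lambda tag: tag.startswith(("n", "v")),        # nouns and verbs only
    "business": lambda tag: tag.startswith("n"),         # nouns only
}


def apply_strategy(tagged_words, kb_type):
    """Keep the words whose part-of-speech tag is allowed for this knowledge base type."""
    keep = STRATEGIES[kb_type]
    return [word for word, tag in tagged_words if keep(tag)]


# Hypothetical tagged output of a segmenter for "the smart POS machine has already broken down?"
tagged = [
    ("smart POS machine", "n"),
    ("already", "d"),
    ("break down", "v"),
    ("le", "u"),
    ("?", "w"),
]
cold_words = apply_strategy(tagged, "cold")
qa_words = apply_strategy(tagged, "qa")
business_words = apply_strategy(tagged, "business")
```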

While the preferred embodiment of the present application has been described in detail, the present application is not limited to the embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present application, and the equivalent modifications or substitutions are included in the scope of the present application as defined in the appended claims.

Claims

1. A method for realizing accurate and rapid scoring of questions and answers on intelligent customer service, comprising the following steps:

Establishing knowledge routes associated with a knowledge base, and matching the corresponding knowledge routes according to the characteristics of the questions, wherein the knowledge routes comprise question-answering templates, search engines and semantic analysis;

performing word segmentation on the question through the word segmenter HanLP to obtain segmented words, and matching the segmented words with the routing keywords of a target knowledge base according to the corresponding knowledge route;

and obtaining a target answer corresponding to the question from the target knowledge base based on the final score obtained from the full-text index score, the semantic similarity score, the word class weighting score and the matching frequency score, and returning the target answer to the user.

2. The method for realizing accurate and rapid scoring of questions and answers on intelligent customer service according to claim 1, wherein the step of obtaining the target answers corresponding to the questions from the target knowledge base based on the final score obtained from the full-text index score, the semantic similarity score, the word class weighting score and the matching frequency score comprises:

acquiring full-text index scores corresponding to the highest 6 matching results, wherein the full score is 20 points;

Score(Q,d) = SUM(Wi * R(qi,d))

Wi = IDF(qi) = log((N - n(qi) + 0.5) / (n(qi) + 0.5))

R(qi,d) = fi * (k1 + 1) / (fi + K)

K = k1 * (1 - b + b * (dl / avg(dl)))

wherein the segmented query terms are qi = q1, q2, q3, ..., qn; N is the total number of documents in the index; n(qi) is the number of documents containing the term qi; d is a search result; Wi is the relevance weight of the term qi with respect to the indexed documents; k1 and b are adjustable parameters of the algorithm; dl is the length of the indexed document d; and avg(dl) is the average length of all texts in the indexed text set D.

3. The method for realizing accurate and rapid scoring of questions and answers on intelligent customer service according to claim 1, wherein the step of obtaining the target answers corresponding to the questions from the target knowledge base based on the final score obtained from the full-text index score, the semantic similarity score, the word class weighting score and the matching frequency score comprises:

obtaining semantic similarity matching scores corresponding to the highest 6 matching results, wherein the full score is 20 points;

d1=max(compare(a1,b1),compare(a1,b2),...compare(a1,bm));

d2=max(compare(a2,b1),compare(a2,b2),...compare(a2,bm));

...

dn=max(compare(an,b1),compare(an,b2),...compare(an,bm));

the semantic similarity matching score = avg(d1, d2, ..., dn);

4. The method for realizing accurate and rapid scoring of questions and answers on intelligent customer service according to claim 1, wherein the step of obtaining the target answers corresponding to the questions from the target knowledge base based on the final score obtained from the full-text index score, the semantic similarity score, the word class weighting score and the matching frequency score comprises:

the semantic similarity matching score = avg(s1, s2, ..., sn);

5. The method for realizing accurate and rapid scoring of questions and answers on intelligent customer service according to claim 1, wherein the step of obtaining the target answers corresponding to the questions from the target knowledge base based on the final score obtained from the full-text index score, the semantic similarity score, the word class weighting score and the matching frequency score comprises:

obtaining the matching frequency scores of the highest 6 matching results.

6. A method for achieving accurate and rapid scoring of questions and answers in intelligent customer service as claimed in any one of claims 2 to 5 wherein the final score calculation method comprises:

7. The method for achieving accurate and rapid scoring of questions and answers on intelligent customer service according to claim 1, wherein obtaining target answers corresponding to the questions from the target knowledge base based on final scores obtained from full-text index scores, semantic similarity scores, word class weighting scores, and matching frequency scores comprises:

8. A method for achieving accurate and rapid scoring of questions and answers on intelligent customer service as claimed in claim 1, wherein said matching corresponding knowledge routes based on characteristics of the questions comprises:

9. A method for achieving accurate and rapid scoring of questions and answers on intelligent customer service as claimed in claim 1, wherein said matching corresponding knowledge routes based on characteristics of the questions comprises:

10. The method for realizing accurate and rapid scoring of questions and answers on intelligent customer service according to claim 1, wherein the matching the word after word segmentation with the routing keyword of the target knowledge base according to the corresponding knowledge route comprises: