Disclosure of Invention
The application provides a method, a device and a storage medium for optimizing an intention recognition confidence threshold, which solve the problem of low intention recognition accuracy and recognition rate caused by a fixed threshold.
In a first aspect, the present application discloses a method of optimizing an intent recognition confidence threshold, comprising:
setting n thresholds Ft, 1 ≤ t ≤ n, and calculating the intention recognition recall rate, accuracy rate and overall benefit value corresponding to each threshold Ft;
and taking the intention recognition threshold value Ft corresponding to the maximum overall benefit value as an optimal threshold value Ftmax, and setting the intention recognition confidence coefficient threshold value as the optimal threshold value Ftmax.
According to a further technical scheme, the step of calculating the intention recognition recall rate, the accuracy rate and the overall benefit value corresponding to each threshold Ft comprises the following steps:
step (1): running an intention recognition model, and recording each piece of man-machine conversation content, an intention recognition result of the conversation content and a corresponding intention recognition confidence value Ctx;
step (2): judging whether the actual intention of the user is consistent with the preset intention Ix in the intention recognition model, and analyzing the man-machine conversation result as follows:
if the confidence value Ctx is greater than or equal to the threshold Ft and the actual intention of the user is indeed the preset intention Ix, then ix0 = 1; otherwise ix2 = 1. In this case the actual intention of the user is judged by analyzing the log to determine whether the user continued the conversation according to the preset follow-up operation;
if the confidence value Ctx is less than the threshold Ft and the actual intention of the user is not the preset intention Ix, then ix1 = 1; otherwise ix1 = 0. In this case the actual intention of the user is judged by analyzing the log: if, after the conversation system asks in a follow-up question whether the user's intention is Ix, the user gives a positive answer or carries out the subsequent man-machine interaction according to the basic operation of the preset story, the actual intention of the user is judged to be Ix;
step (3): calculating the intention recognition recall rate Rt through formula (1) and the accuracy rate Pt through formula (2), and then calculating the overall benefit value:
In the formulas, ix0, ix1 and ix2 each default to 0; ix0 = 1 indicates a correct identification above the threshold; ix1 = 1 indicates a correct identification below the threshold; ix2 = 1 indicates an identification error above the threshold;
overall benefit value Bt = Rt × Pt.
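The labeling rules of step (2) can be sketched in Python as follows. This is a minimal illustration, not the formulas of the application: formulas (1) and (2) are not reproduced in this text, so only the accuracy rate is computed, using the denominator described under the advantageous effects (correct plus incorrect identifications above the threshold) and assuming the numerator is the count of correct identifications above the threshold. The `Turn` record, its field names and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One recorded user statement (hypothetical record layout)."""
    confidence: float    # Ctx reported by the intention recognition model
    actual_is_ix: bool   # True if the user's actual intention is the preset intention Ix


def label(turn, ft):
    """Return the indicators (ix0, ix1, ix2) for one turn at threshold ft."""
    ix0 = ix1 = ix2 = 0  # all three indicators default to 0
    if turn.confidence >= ft:
        if turn.actual_is_ix:
            ix0 = 1      # identification above the threshold is correct
        else:
            ix2 = 1      # identification above the threshold is wrong
    elif not turn.actual_is_ix:
        ix1 = 1          # identification below the threshold is correct
    return ix0, ix1, ix2


def accuracy(turns, ft):
    """Assumed accuracy rate: sum(ix0) / (sum(ix0) + sum(ix2))."""
    labels = [label(t, ft) for t in turns]
    correct_above = sum(l[0] for l in labels)
    wrong_above = sum(l[2] for l in labels)
    total_above = correct_above + wrong_above
    return correct_above / total_above if total_above else 0.0


# Invented sample data, for illustration only.
turns = [Turn(0.95, True), Turn(0.85, False), Turn(0.75, True), Turn(0.30, False)]
print(accuracy(turns, 0.8))
```

The recall rate is deliberately left out of the sketch, because its formula cannot be recovered from the surrounding text.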
In a further technical scheme, the method for determining the optimal threshold Ftmax for intention recognition is as follows:
the threshold Ft whose overall benefit value equals the maximum value Bmax is used as the optimal threshold Ftmax, wherein Bmax satisfies the following conditions: the recall rate and the accuracy rate corresponding to Bmax are both higher than their preset minimum values; and Bmax is the maximum among the overall benefit values Bt that satisfy this condition.
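The selection rule above can be sketched as a constrained maximum search. The function name, the dictionary layout and all numbers below are hypothetical and serve only to illustrate the rule; the recall rate Rt and accuracy rate Pt for each threshold are assumed to have been calculated already.

```python
def select_optimal_threshold(metrics, min_recall, min_accuracy):
    """Pick Ftmax from metrics, a dict mapping each threshold Ft to (Rt, Pt).

    Returns (Ftmax, Bmax), or None if no threshold exceeds both minimum values.
    """
    best = None
    for ft, (rt, pt) in metrics.items():
        # Both rates must be higher than the preset minimum values.
        if rt <= min_recall or pt <= min_accuracy:
            continue
        bt = rt * pt  # overall benefit value Bt = Rt x Pt
        if best is None or bt > best[1]:
            best = (ft, bt)
    return best


# Invented figures: threshold 0.6 has the largest raw Bt (0.35),
# but its accuracy rate is below the minimum, so it is skipped.
metrics = {
    0.9: (0.10, 0.98),
    0.8: (0.30, 0.90),
    0.7: (0.40, 0.80),
    0.6: (0.70, 0.50),
}
print(select_optimal_threshold(metrics, min_recall=0.25, min_accuracy=0.7))
```

With these figures the search settles on threshold 0.7, which shows why the minimum-value conditions are checked before the maximum is taken.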
In a second aspect, the present application provides a device for optimizing an intention recognition confidence threshold. The device comprises a memory, a processor, and a computer program stored on the memory and executable on the processor; when executed by the processor, the computer program implements the steps of the method for optimizing an intention recognition confidence threshold according to one or more of the technical schemes of the present application.
In a third aspect, the present application provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method for optimizing an intention recognition confidence threshold according to one or more of the technical schemes of the present application.
Advantageous effects
1) The method can periodically and automatically calculate and update the intention recognition confidence threshold, which prevents the conversation recognition effect from gradually degrading after the threshold has been set once, so that the system stays in a good state.
2) The invention provides a method for estimating whether the current intention identification is correct by judging whether the user's subsequent operations follow the story script during the conversation; the estimate is accurate at a statistical level. With this method, whether an intention recognition result made by the system during the conversation is correct can be judged without manual intervention, so the conversation effect can be optimized automatically by a program.
3) In the accuracy formula, the denominator is the sum of the number of correct identifications above the threshold and the number of incorrect identifications above the threshold. Correct and incorrect identifications below the threshold are thereby excluded, so the accuracy rate reflects the influence of the threshold more precisely.
4) According to the overall benefit value formula, the overall benefit value is positively correlated with both the intention recognition recall rate and the intention recognition accuracy rate, which quantifies the user experience under different confidence thresholds.
5) The invention removes the reliance on personal subjective experience when the intention recognition confidence threshold is set manually, improves the natural language understanding capability of the conversation robot, and enables automatic, periodic threshold updates based on more reasonable data, so that the intention recognition accuracy and recognition rate of the conversation robot are high.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and "comprising", and any variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The application scenario is as follows: the method and the device can be applied to fields related to customer service robots with an automatic question answering function, such as self-service inquiry, self-service order handling and customer service.
In the customer service robot scenario, the actual intention of the user must be recognized first. To do so, the user statement is analyzed, the resulting confidence is compared with a threshold, and the system judges whether the actual intention of the user is consistent with a preset intention of the robot system. In the prior art, the threshold of the customer service robot is a fixed value entered manually, and any optimization of the threshold also relies on subjective human judgment, without a scientific and reasonable reference mechanism. The invention optimizes the threshold automatically and provides a quantifiable optimization mode for the customer service robot. As shown in fig. 2, automatic updating of the threshold can be enabled in the threshold configuration.
Example one
Fig. 1 shows a method for optimizing an intention recognition confidence threshold according to an embodiment of the present application. As shown in fig. 1, the method includes:
setting n thresholds Ft, 1 ≤ t ≤ n, with an interval of sp between adjacent thresholds, and calculating the intention recognition recall rate, accuracy rate and overall benefit value corresponding to each threshold Ft;
and taking the intention recognition threshold value Ft corresponding to the maximum overall benefit value as an optimal threshold value Ftmax, and setting the intention recognition confidence coefficient threshold value as the optimal threshold value Ftmax.
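As a minimal sketch, the n candidate thresholds with interval sp can be generated as below. The embodiment fixes only the count n and the interval sp; the starting value `highest` and the descending order are assumptions made for this illustration, chosen to match the six thresholds used in the worked example of this embodiment.

```python
def threshold_grid(highest, sp, n):
    """Return n thresholds Ft, starting at `highest` and decreasing by sp each step.

    Values are rounded to 10 decimal places to suppress binary floating-point
    noise such as 0.5999999999999999.
    """
    return [round(highest - i * sp, 10) for i in range(n)]


print(threshold_grid(0.9, 0.1, 6))
```

A smaller sp with a larger n gives the finer granularity mentioned for actual use, at the cost of more evaluations per update cycle.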
Step (1): operating an intention recognition model, and recording a man-machine conversation, an intention recognition result in the conversation process and a corresponding intention recognition confidence value Ctx;
step (2): judging whether the actual intention of the user is consistent with the preset intention Ix in the intention recognition model, and analyzing the man-machine conversation result as follows:
if the confidence value Ctx is greater than or equal to the threshold Ft and the actual intention of the user is indeed the preset intention Ix, then ix0 = 1; otherwise ix2 = 1. In this case the actual intention of the user is judged by analyzing the log to determine whether the user continued the conversation according to the preset follow-up operation: if so, the actual intention of the user is Ix; if not, it is not Ix;
if the confidence value Ctx is less than the threshold Ft and the actual intention of the user is not the preset intention Ix, then ix1 = 1; otherwise ix1 = 0. In this case the actual intention of the user is judged by analyzing the log: if, after the conversation system asks in a follow-up question whether the user's intention is Ix, the user gives a positive answer, or, even after giving a negative answer, carries out the subsequent human-computer interaction according to the basic operation of the preset story, the actual intention of the user is judged to be Ix; in all other cases the actual intention is not Ix;
step (3): calculating the intention recognition recall rate Rt through formula (1) and the accuracy rate Pt through formula (2), and then calculating the overall benefit value:
In the formulas, ix0, ix1 and ix2 each default to 0; ix0 = 1 indicates a correct identification above the threshold; ix1 = 1 indicates a correct identification below the threshold; ix2 = 1 indicates an identification error above the threshold;
the results of the recognition of the actual intention of the user are shown in table 1 below:
TABLE 1
Overall benefit value Bt = Rt × Pt.
The method for determining the optimal threshold Ftmax for intention recognition is as follows:
the threshold Ft whose overall benefit value equals the maximum value Bmax is used as the optimal threshold Ftmax, wherein Bmax satisfies the following conditions: the recall rate and the accuracy rate corresponding to Bmax are both higher than their preset minimum values; and Bmax is the maximum among the overall benefit values Bt that satisfy this condition.
Take application in a customer service robot scenario as an example.
Suppose the customer service robot system contains only two intentions: intention 1, "query telephone charge balance", and intention 2, "query package". The threshold is initially set manually to 0.8, the minimum accuracy rate to 0.7, and the minimum recall rate to 0.25.
The method of optimizing the intent recognition confidence threshold is as follows:
Step one: identify the actual intention of each user statement from the conversation log
After the user inputs a sentence, the sentence first enters the intention recognition model, which yields the intention recognition confidence Ctx. After the user completes the whole interaction, a complete conversation log of the man-machine interaction is available.
The actual intention of the user is analyzed from the conversation log. The determination method is as follows:
First: when the confidence value Ctx is greater than or equal to the threshold Ft, the log data is as shown in table 2 below. The conversation system judges whether the user continues the conversation according to the preset follow-up operation; because the user clicked the "query package" button, the real intention of the user statement "telephone fee package selection" is judged to be "query package".
TABLE 2
Second: when the confidence value Ctx is less than the threshold Ft, the log data is as shown in table 3 below. After the conversation system asks in a follow-up question whether the user's intention is "query package" and the user replies positively, the real intention of the user statement "which package flow is more than 2020" is judged to be "query package".
TABLE 3
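The two judgment rules of step one can be sketched as a single function over one log entry. The `LogEntry` fields are hypothetical names for facts that would be extracted from the conversation log, and the confidence values in the usage lines are invented; the actual log layout is the one given in tables 2 and 3.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    """Facts extracted from the log for one user statement (hypothetical fields)."""
    confidence: float                        # Ctx for the statement
    followed_preset_operation: bool = False  # e.g. clicked the "query package" button
    asked_follow_up: bool = False            # system asked "is your intention Ix?"
    user_confirmed: bool = False             # user replied positively to that question
    followed_story: bool = False             # user continued along the preset story


def actual_intention_is_ix(entry, ft):
    """Estimate whether the user's actual intention is the preset intention Ix."""
    if entry.confidence >= ft:
        # At or above the threshold: judged by the preset follow-up operation.
        return entry.followed_preset_operation
    # Below the threshold: judged by the reply to the follow-up question,
    # or by whether the user still continued along the preset story.
    return entry.asked_follow_up and (entry.user_confirmed or entry.followed_story)


above = LogEntry(confidence=0.85, followed_preset_operation=True)
below = LogEntry(confidence=0.60, asked_follow_up=True, user_confirmed=True)
print(actual_intention_is_ix(above, 0.8), actual_intention_is_ix(below, 0.8))
```

The first usage line mirrors the button-click case and the second mirrors the positive-reply case described above; both are judged to be the preset intention.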
Step two: calculate the intention recognition recall rate, accuracy rate and overall benefit value under different thresholds from the user statements in the conversation log.
All user statements in the conversation log are counted, as shown in table 4 below:
TABLE 4
As an example, six thresholds are set: 0.9, 0.8, 0.7, 0.6, 0.5 and 0.4; thresholds of finer granularity can be set in actual use. The recall rate Rt, accuracy rate Pt and overall benefit value Bt under each threshold are calculated according to the formulas. The specific calculation process is shown in table 5 below:
TABLE 5
According to the above calculation results, when the threshold value is 0.7, the overall benefit value Bt is maximum.
Step three: updating a threshold value
Whether to update the threshold is determined from the latest calculation. The current threshold is 0.8. At a threshold of 0.7, the accuracy rate of 0.85 is greater than the minimum accuracy rate of 0.7 and the recall rate of 0.33 is greater than the minimum recall rate of 0.25, so the system threshold is updated to 0.7.
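The update decision of step three can be checked with the figures of this example. The function name is hypothetical, and keeping the current threshold when either minimum is not exceeded is an assumption of this sketch; the embodiment describes only the case in which the update takes place.

```python
def updated_threshold(current, candidate, accuracy, recall, min_accuracy, min_recall):
    """Return the candidate threshold if both rates exceed their minimums,
    otherwise keep the current threshold (assumed fallback)."""
    if accuracy > min_accuracy and recall > min_recall:
        return candidate
    return current


# Figures from the example: current threshold 0.8, best candidate 0.7.
new_threshold = updated_threshold(
    current=0.8, candidate=0.7,
    accuracy=0.85, recall=0.33,
    min_accuracy=0.7, min_recall=0.25,
)
benefit = 0.33 * 0.85  # overall benefit value Bt = Rt x Pt at threshold 0.7
print(new_threshold, round(benefit, 4))
```

With Rt = 0.33 and Pt = 0.85, the overall benefit value at threshold 0.7 is 0.33 × 0.85 = 0.2805.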
Example two
The present application provides an apparatus for optimizing an intent recognition confidence threshold, the apparatus comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the method of embodiment one.
It will be appreciated that in this embodiment, the memory may include both read-only memory and random access memory, and that the portion of memory that provides instructions and data to the processor may also include non-volatile random access memory. For example, the memory may also store device type information.
The processor may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like. A general purpose processor may be a microprocessor or any conventional processor.
The computer program (also called a program, software, software application, or code) includes machine instructions for a programmable processor and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software.
The method in the first embodiment may be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art.
Example three
The present application provides a storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method of the first embodiment.
The storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
Those of ordinary skill in the art will appreciate that the various illustrative methods described in connection with the present embodiments may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.