
CN120579009A - A dynamic intention understanding method based on multimodal fusion - Google Patents

A dynamic intention understanding method based on multimodal fusion

Info

Publication number
CN120579009A
CN120579009A
Authority
CN
China
Prior art keywords
intent
intention
user
conversation
types
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202511088840.2A
Other languages
Chinese (zh)
Other versions
CN120579009B (en)
Inventor
欧阳禄萍
刘继鹏
高建文
赵耘逸
唐杰
唐湘峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhixueyun Beijing Technology Co ltd
Original Assignee
Zhixueyun Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhixueyun Beijing Technology Co ltd filed Critical Zhixueyun Beijing Technology Co ltd
Priority to CN202511088840.2A
Publication of CN120579009A
Application granted
Publication of CN120579009B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/25 Fusion techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G06N5/041 Abduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a dynamic intention understanding method based on multimodal fusion. The method acquires a user's multimodal behavior data in the current dialogue, rewrites the current dialogue text depending on whether the current dialogue is the first dialogue, and provides the rewritten dialogue text together with the other multimodal behavior data to a large model, prompting the large model to identify the intention of the current dialogue. Specifically, the prompt designates the role of the large model as a user dialogue intention understanding expert, specifies understanding skills for typical dialogue patterns, and specifies the intention types to be identified together with their descriptions; the large model is then instructed to interpret the rewritten dialogue and the other multimodal behavior data according to the designated role and information and to judge which of the intention types the current dialogue belongs to. Finally, a corresponding business process is invoked according to the intention recognition result to provide useful information for the current dialogue. The embodiment can improve user intention recognition in online learning scenarios.

Description

Dynamic intention understanding method based on multimodal fusion
Technical Field
Embodiments of the invention relate to the field of artificial intelligence, and in particular to a dynamic intention understanding method based on multimodal fusion.
Background
Intent understanding refers to identifying the purpose behind an individual's behavior; introducing intent understanding into intelligent dialogue makes it easier to provide users with accurate and useful information.
Existing intention recognition technology relies mainly on single-modality (e.g., text) analysis, making it difficult to effectively fuse multidimensional user behavior data, and the specific intention types of online learning scenarios remain unaddressed. Patent application CN115186076A provides a text-understanding intent extraction method, and CN119886352A provides a deep-learning-based method for dialogue intention understanding and explanation text generation; neither solves the above problems.
Disclosure of Invention
The embodiment of the invention provides a dynamic intention understanding method based on multimodal fusion to solve the above technical problems.
In a first aspect, an embodiment of the present invention provides a dynamic intention understanding method based on multimodal fusion, including:
acquiring multimodal behavior data of a user in the current dialogue, the multimodal behavior data including voice, images, and dialogue text;
rewriting the text of the current dialogue according to whether the current dialogue is the first dialogue;
providing the rewritten dialogue text together with the other multimodal behavior data to a large model to prompt the large model to recognize the intention of the current dialogue; specifically, designating in the prompt the role of the large model as a user dialogue intention understanding expert, specifying understanding skills for typical dialogue patterns, specifying the intention types to be recognized and their descriptions, and instructing the large model to interpret the rewritten dialogue and the other multimodal behavior data according to the designated role and information and judge which of the intention types the current dialogue belongs to;
invoking a corresponding business process according to the intention recognition result to provide useful information for the current dialogue;
wherein the intention types include question-and-answer intent, search intent, to-do task intent, learning task intent, teaching plan authoring intent, and PPT authoring intent.
In a second aspect, an embodiment of the present invention provides an electronic device, including:
one or more processors; and
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the dynamic intention understanding method based on multimodal fusion described in any of the embodiments.
In a third aspect, embodiments of the present invention further provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the dynamic intention understanding method based on multimodal fusion according to any of the embodiments.
The embodiment of the invention provides a dynamic intention understanding method based on multimodal fusion. It fuses multimodal information and historical dialogue into intention recognition, improving accuracy, and uses a large model to recognize user intention automatically. The prompt designates the large model's role as a user intention understanding expert, specifies understanding skills for typical dialogue patterns, and specifies the intention types to be recognized and their descriptions, providing the key information the large model needs to recognize user intention. The large model is then instructed to interpret the rewritten dialogue and the other multimodal behavior data according to the designated role and information and judge which of the intention types the current dialogue belongs to, so that basic user intentions are recognized accurately. Finally, a corresponding business process is designed for each intention type, providing effective information to users more conveniently.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a dynamic intent understanding method based on multimodal fusion provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of a reinforcement learning network according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the invention, are within the scope of the invention.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should also be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, integrally connected, mechanically connected, electrically connected, directly connected, indirectly connected through an intermediary, or in communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Fig. 1 is a flowchart of a dynamic intention understanding method based on multimodal fusion according to an embodiment of the present invention. The method is suitable for online learning scenarios and is executed by an electronic device. As shown in fig. 1, the method specifically includes:
S110: acquire multimodal behavior data of the user in the current dialogue, where the multimodal behavior data includes voice, images, and dialogue text.
As described above, this embodiment applies to an online learning scenario, such as an online learning platform. The platform provides the user with an intelligent dialogue window through which the user can obtain useful information and thus complete various tasks on the platform more conveniently.
Optionally, the text the user enters in the current dialogue is obtained first; at the same time, the user's voice data and image data are captured through the camera and audio device of the learning device (such as a computer or mobile phone). These three kinds of data together form the multimodal behavior data of the current dialogue. The voice data can provide tone, mood, and other semantic information not present in the text, and the image data can provide the user's facial expression, mental state, and so on.
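The three data streams of one turn can be modeled as a simple record; this is a minimal sketch, and the field names are illustrative rather than taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class MultimodalTurn:
    """One dialogue turn's multimodal behavior data (hypothetical structure)."""
    text: str               # dialogue text typed into the intelligent dialogue window
    audio: bytes = b""      # voice capture from the learning device's audio hardware
    image: bytes = b""      # camera frame carrying facial expression / mental state
    is_first: bool = False  # True when this turn opens a new session
```

A turn with only text still carries empty slots for the other modalities, so downstream steps can treat every turn uniformly.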
S120: rewrite the text of the current dialogue according to whether the current dialogue is the first dialogue.
Optionally, a "start new session" button is provided in the intelligent dialogue window; when the user clicks this button or logs into the dialogue window for the first time, the current dialogue is considered the first dialogue.
Further, if the current dialogue is the first dialogue, its text is used directly as the rewritten dialogue text in subsequent operations. If it is not the first dialogue, the historical dialogue text between the current dialogue and the most recent first dialogue is extracted, and the history and the current dialogue are merged to form the rewritten dialogue.
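The rewriting rule above can be sketched in a few lines; the newline join is an assumption, since the patent does not fix a concatenation format:

```python
def rewrite_dialogue(current: str, history: list[str], is_first: bool) -> str:
    """Step S120: a first dialogue passes through unchanged; otherwise the
    turns since the most recent first dialogue are merged with the current turn."""
    if is_first or not history:
        return current
    return "\n".join(history + [current])
```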
S130: provide the rewritten dialogue text together with the other multimodal behavior data to the large model, prompting it to recognize the intention of the current dialogue.
This embodiment classifies intention types for online learning scenarios: question-and-answer intent, search intent, to-do task intent, learning task intent, teaching plan authoring intent, and PPT authoring intent. These types cover the broad categories of basic requirements in an online learning scenario.
This embodiment uses a large model to identify dialogue intent, and the intention types above together with their descriptions play a vital role in the large model prompt. The prompt must also contain the following to ensure the accuracy of intention recognition: designate the role of the large model as a user dialogue intention understanding expert; specify understanding skills for typical dialogue patterns; and instruct the large model to interpret the rewritten dialogue and the other multimodal behavior data according to the designated role and information and judge which of the intention types the current dialogue belongs to.
Illustratively, one prompt is as follows:
# Designate role and task
You are an experienced and professional expert in understanding and judging the intent of user questions. You can understand the intent behind a question the user poses and, based on the given intention classification, feed back the user's actual intent in the dialogue.
# Specify understanding skills for typical dialogue patterns
1. If a specific intent word is explicitly mentioned in the user question, return according to the intent the user mentioned;
2. If the intent expressed in the user question coincides closely with the meaning of several intent descriptions in the intention classification, feed back all of the coinciding intention classifications;
3. If there is no explicit intent in the user question, return "question-and-answer intent" by default.
# Specify output format requirements
Output the result according to the format requirement (delimited by <format requirement></format requirement>), ensuring that the output can be parsed correctly by Python json.
# Specify the intention types to be recognized and their descriptions
<intention classification>
Question-and-answer intent: the user mainly hopes to resolve a question quickly and wants you to answer it professionally;
Search intent: the user wants to actively query certain resources and hopes you will perform the search and return results as required;
To-do task intent: the user mainly wants to find the set of to-do items/work to be completed in the future and hopes you can help find the task items to be resolved and their number;
Learning task intent: the user mainly wants to find the set of learning content to be completed in the future, with more emphasis on retrieving the required learning resources;
Teaching plan authoring intent: the user mainly wants to create article content and hopes you will carefully write a teaching article;
PPT authoring intent: the user mainly wants to produce a PPT and hopes you can plan and author a polished PPT.
</intention classification>
# Specify other parameter variables
<user question>
{question}
</user question>
<format requirement>
["the dialogue intent is returned here"]
</format requirement>
Here, # marks annotations within the prompt, { and } mark parameter slots, the preset intention types and their descriptions are filled in between <intention classification> and </intention classification>, and the rewritten user dialogue is filled in between <user question> and </user question>.
Based on this prompt, the large language model can accurately judge which of the intention types the current dialogue belongs to.
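Prompt assembly and result parsing can be sketched as follows; the template is abbreviated relative to the full prompt above, and the function names are illustrative:

```python
import json

# Abbreviated stand-in for the full prompt; most section text is elided here.
PROMPT_TEMPLATE = """# Designate role and task
You are an expert in understanding and judging the intent of user questions.
# Specify the intention types to be recognized and their descriptions
<intention classification>
{intent_types}
</intention classification>
<user question>
{question}
</user question>
<format requirement>
["the dialogue intent is returned here"]
</format requirement>"""

def build_prompt(intent_types: list[str], question: str) -> str:
    """Fill the {…} parameter slots of the template."""
    return PROMPT_TEMPLATE.format(intent_types="\n".join(intent_types),
                                  question=question)

def parse_intents(model_output: str) -> list[str]:
    """The format requirement asks for a JSON list parseable by Python json."""
    return json.loads(model_output)
```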
S140: invoke the corresponding business process according to the intention recognition result to provide useful information for the current dialogue.
In this embodiment, different business processes are designed for different intention types to provide faster and more accurate personalized services.
For the question-and-answer intent, the business process calls the large model to analyze and trace the knowledge points, then retrieves the knowledge points and their sources from the knowledge vector base to obtain useful information. For the search intent, the business process directly queries the knowledge vector base to retrieve matching information while also executing the question-and-answer business process; the two run in parallel, and both the direct search result and the analyzed search result are offered to the user to choose from. For the to-do task intent, the large model is called to parse the to-do dialogue text, the statistics of the to-do tasks are fetched remotely according to the to-do object, and a to-do task list for that object is fed back to the user. For the learning task intent, the large model is called to parse the learning dialogue text, learning resource data is fetched remotely according to the object to be learned, and specific learning resources are fed back to the user. For the teaching plan authoring intent, the large model is called to analyze the authoring points, related texts are retrieved from the knowledge vector base, and the texts are fed back to the large model to be composed into a complete manuscript. For the PPT authoring intent, the large model is called to analyze the authoring points and plan the page layout, related texts and multimedia materials are retrieved from the knowledge vector base, and these are fed back to the large model to be composed into the corresponding PPT.
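The per-intent routing can be sketched as a dispatch table; the flow bodies below are stubs standing in for the knowledge-base and remote-scheduling calls described above:

```python
# Stub flows; the real ones would query the knowledge vector base, call the
# large model, or schedule remote data as described in the text.
def qa_flow(text): return "qa:" + text          # trace knowledge points, query vector base
def search_flow(text): return "search:" + text  # direct query (qa_flow runs in parallel)
def todo_flow(text): return "todo:" + text
def learning_flow(text): return "learning:" + text
def lesson_flow(text): return "lesson:" + text
def ppt_flow(text): return "ppt:" + text

BUSINESS_FLOWS = {
    "question-and-answer intent": qa_flow,
    "search intent": search_flow,
    "to-do task intent": todo_flow,
    "learning task intent": learning_flow,
    "teaching plan authoring intent": lesson_flow,
    "PPT authoring intent": ppt_flow,
}

def dispatch(intent: str, text: str) -> str:
    # Unknown intents fall back to question answering, mirroring the prompt's default.
    return BUSINESS_FLOWS.get(intent, qa_flow)(text)
```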
The above flow is performed for each dialogue. The method fuses multimodal information and historical dialogue, improving the accuracy of intention recognition, and uses a large model to recognize user intention automatically: the prompt designates the large model's role as a user intention understanding expert, specifies understanding skills for typical dialogue patterns, and specifies the intention types to be recognized and their descriptions, giving the large model the key information it needs to recognize user intention; the large model is then instructed to interpret the rewritten dialogue and the other multimodal behavior data according to the designated role and information and judge which of the intention types the current dialogue belongs to, so that basic user intentions are recognized accurately; finally, a corresponding business process is designed for each intention type, providing effective information to users more conveniently.
Further, as noted above, the intention types preset in the prompt are key to ensuring that the large model recognizes intent accurately. To adapt to dynamic changes in users' dialogue intent and adjust the large model's intention recognition capability in time, the preset intention types can be dynamically adjusted. In one embodiment, this process includes the following steps:
Step 1: generate the user's intention analysis chain according to how the user collected useful information in previous dialogues. The chain consists of the intention recognition results of each dialogue preceding the dialogue in which the user adopted the useful information, or preceding the dialogue that terminated the dialogue chain. The intention analysis chain reflects how the large model explored the user's real intent: the shorter the chain, the more accurate the large model's intention recognition.
Optionally, the link points that partition the intention analysis chain are identified first. Specifically, if the user clicks or copies the useful information of some dialogue, the user is judged to have adopted that dialogue's useful information and the dialogue is marked as a link point; if, in some dialogue, the user enters no dialogue text, exits the dialogue interface, or clicks the option to start a new dialogue, the user is judged to have terminated the dialogue chain and that dialogue is likewise marked as a link point. There is one special case: if, after clicking or copying the useful information of a dialogue, the user enters no dialogue text within a set duration, exits the dialogue interface, or performs any of the actions that open a new dialogue, that dialogue is marked as a link point.
After the link points are marked, the intention recognition results of the dialogues between two link points of the same user are arranged in order to form that user's intention analysis chain. Note that the chain between two link points includes the later of the two link points but not the earlier one; that is, each link point counts as the end point of the preceding intention analysis chain and not as part of the next one.
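The segmentation rule, where each link point closes the chain it belongs to, can be sketched as follows (an illustrative sketch; names are not from the patent):

```python
def split_chains(intents, link_flags):
    """Split a per-dialogue intent sequence into intention analysis chains.
    link_flags[i] is True when dialogue i was marked as a link point; a link
    point ends the current chain and is included in it, not the next one."""
    chains, current = [], []
    for intent, is_link in zip(intents, link_flags):
        current.append(intent)
        if is_link:
            chains.append(current)
            current = []
    if current:  # trailing dialogues not yet closed by a link point
        chains.append(current)
    return chains
```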
Step 2: adjust the intention types according to how the intention recognition results change within the intention analysis chain, and apply the adjusted intention types in subsequent large model prompts.
As described above, this embodiment uses the intention analysis chain to reflect how the large model explored the user's real intent. The shorter the chain, the more accurate the recognition: the real intent was found and useful information provided within a few dialogues. The longer the chain, the more rounds of dialogue the large model needed before reaching the real intent and providing useful information, which very likely means the intention classification provided to the large model is imperfect and is hurting the accuracy of intention analysis; the classification should then be adjusted. Note that "real intent" here means the user's actual intent, which is not limited to the intention types set in the prompt: it may be close to those types, a finer subdivision of them, or entirely different. As long as one of the intention types set in the prompt is very close to the real intent, the large model can recognize the real intent more accurately, so the goal of this embodiment is to bring the intention types in the prompt closer to real intents. Optionally, this embodiment provides the following two alternative implementations that dynamically adjust the intention types set in the prompt (hereinafter "intention classification adjustment") to achieve this goal:
The first alternative implementation applies to an intention analysis chain whose end point is the user adopting the useful information of the last dialogue. If the chain's length exceeds a set threshold, the user's intent exploration took too long: the useful information was obtained only after many dialogues. Intention classification adjustment can then proceed in two cases:
Case 1: all intention recognition results in the chain are the same, indicating that the large model recognized the user's intent correctly but failed to provide trustworthy useful information early on. This is most likely because the intention classification is not fine-grained enough, so the user kept being drawn into deeper dialogue. In this case, the shared intention type can be subdivided according to the last two dialogues in the chain. Optionally, keyword analysis is performed on both dialogues, the keyword of the last dialogue with the largest semantic difference from the second-to-last dialogue is determined, and the shared intention type of the current chain is subdivided by that keyword. For example, suppose all intention recognition results in the chain are "search intent", the second-to-last dialogue is "please help me search for the rescue essentials in the flood-fighting and disaster-relief course", and the last dialogue is "please help me search for the rescue practice content in the flood-fighting and disaster-relief course". The keyword of the last dialogue with the largest semantic difference is "practice", so the search intent can be subdivided into "practical search intent" and "theoretical search intent", each with its own description and corresponding business process.
In this way, the large model can attend to practical versus theoretical information as early as possible in each dialogue and recognize the user's more finely subdivided real intent earlier, or present the practical and theoretical search intents to the user early so that the user selects the real subdivided intent directly. The operations from keyword analysis to business process design may be implemented automatically with a large model, by a developer from experience, or by a combination of automation and human experience; this embodiment does not limit them.
Case 2: the intention recognition results in the chain are not all the same, meaning the large model reached the correct intent only after repeatedly exploring the user's intent, and only then provided trustworthy useful information. Here this embodiment focuses on the turning point from wrong intent to correct intent in order to correct the most basic intent direction. Optionally, the continuously repeated segment of the final intention recognition result at the end of the chain is extracted, and the intention type of the last dialogue before that segment is subdivided according to the first dialogue of the segment. For example, given the chain "question-and-answer, search, to-do task, learning task, learning task", the final result "learning task" repeats twice at the end, so the continuously repeated segment is the trailing "learning task, learning task"; its first dialogue is the second-to-last dialogue, and the last dialogue before the segment is the third-to-last dialogue, whose intention type "to-do task" is subdivided according to the content of the second-to-last dialogue. Optionally, keyword analysis is performed on those two dialogues, the keyword of the second-to-last dialogue with the largest semantic difference from the third-to-last dialogue is determined, and the "to-do task" intention type is subdivided by that keyword.
Continuing the example, the third-to-last dialogue is "please help me find the courses I have not completed" and the second-to-last is "please help me preview the content of the courses I have not completed"; the keywords with the largest semantic difference are "preview" and "content". The "to-do task" intent can therefore be subdivided into "to-do task list" and "to-do content preview", each with its own description, e.g., "provide the user with a to-do task list" versus "provide the user with a to-do task list and a content preview thumbnail for each task", to distinguish the two intents. Corresponding business processes are designed likewise; for example, the "to-do content preview" process adds the retrieval of to-do task content and thumbnails on top of the "to-do task list" process. As before, the operations from keyword analysis to business process design may be implemented automatically with a large model, by a developer from experience, or by a combination of the two; this embodiment does not limit them.
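Extracting the trailing repeated segment and the dialogue whose intent type gets subdivided can be sketched as follows (illustrative names, not from the patent):

```python
def trailing_repeat(chain):
    """Return (segment_start, segment): segment is the maximal run of the
    chain's final intent at its end, and the dialogue at segment_start - 1
    is the one whose intent type gets subdivided."""
    last = chain[-1]
    start = len(chain)
    while start > 0 and chain[start - 1] == last:
        start -= 1
    return start, chain[start:]
```

Applied to the example chain, the segment is the two trailing "learning task" results and the dialogue before it carries the "to-do task" intent.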
The second alternative implementation applies to an intention analysis chain whose end point is the user terminating the dialogue chain. If the chain's length exceeds a set threshold, the exploration took too long and no trustworthy useful information was obtained after many dialogues; the intention recognition was then likely inaccurate and the whole series of questions and answers was ineffective. Intention classification adjustment can proceed in two cases:
In the first case, the intent recognition results in the intent analysis chain cover all existing intent types, which indicates that none of the existing intent types captures the user's actual intent; new intent types can then be added according to the user's historical conversation content. Optionally, keyword parsing can be performed on all conversations in the intent analysis chain, association rules can be used to determine the keywords with the highest co-occurrence frequency across those conversations, and new intent types can be added according to these keywords. For example, if the most frequently co-occurring keywords across all conversations are "bidding scheme", "technical indexes", "business indexes", and so on, a new "bidding scheme generation" intent type can be added and a corresponding business process designed. As before, the operations from keyword analysis to business process design may be implemented automatically by the large model, implemented empirically by a developer, or implemented by a combination of the two; this embodiment is not specifically limited in this regard.
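The association-rule step can be sketched with simple pair-support counting. This is an illustrative heuristic only; the per-conversation keyword lists are assumed to come from an upstream keyword parser, and the example data mirrors the "bidding scheme" example above:

```python
from collections import Counter
from itertools import combinations

def top_cooccurring_keywords(dialog_keywords, k=3):
    """Rank keywords by how often they participate in co-occurring pairs:
    count the support of every keyword pair within a conversation, then
    sum each keyword's pair supports and return the top-k."""
    pair_support = Counter()
    for kws in dialog_keywords:
        pair_support.update(combinations(sorted(set(kws)), 2))
    score = Counter()
    for (a, b), c in pair_support.items():
        score[a] += c
        score[b] += c
    return [w for w, _ in score.most_common(k)]

parsed = [
    ["bidding scheme", "technical indexes"],
    ["bidding scheme", "business indexes"],
    ["bidding scheme", "technical indexes", "business indexes"],
]
print(top_cooccurring_keywords(parsed))  # 'bidding scheme' ranks first
```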
In the second case, the intent recognition results in the intent analysis chain do not cover all existing intent types. The user can then be asked about the intent types the chain did not cover, to determine whether the user's intent belongs to one of them. For example, if the "PPT creation intent" does not appear in the current intent analysis chain, the user may be asked "Would you like to create a PPT?".
The above embodiments focus on intent analysis chains in which the user's exploration process is overly long, analyze how the intent classification itself may be blocking recognition of the true intent according to the different exploration paths the chains exhibit, give concrete ways to eliminate those causes, and thereby provide a direction for improving the rationality and accuracy of the intent classification.
However, when intent exploration runs too long, recognition of the true intent may be blocked not only by the intent classification but also by factors such as poorly phrased user conversation content or fluctuations in the user's own intent. Whether an adjusted intent classification, once applied in the large model prompt, genuinely helps identify the true intent therefore needs further verification. For this reason, this embodiment treats the set of intent types after each adjustment as one intent classification scheme and constructs a reinforcement learning network to determine the optimal scheme. For example, the six intent types in the prompt example of S130 constitute one intent classification scheme; after the search intent is subdivided into a practical-operation search intent and a theoretical search intent, the resulting seven intent types (question-answer intent, practical-operation search intent, theoretical search intent, to-do task intent, learning task intent, lesson-plan creation intent and PPT creation intent) constitute another scheme.
In one embodiment, the reinforcement learning network takes the users' intent analysis chains in actual conversations as the state variable and the various intent classification schemes as the action variable, and constructs a reward function from the final results and the average length of the users' intent analysis chains to select the optimal intent classification scheme, which is then applied in subsequent large model prompts. Optionally, the higher the coverage of the currently used intent classification scheme by the end results of all users' intent analysis chains over a period of time, the higher the positive reward; and the shorter the average length of all users' intent analysis chains over that period, the higher the positive reward. By way of example, the following reward function may be constructed:
$$R = \alpha \cdot \frac{N_c}{N_t} - \beta \cdot \bar{L}$$

where $R$ denotes the reward value, $N_c$ denotes the number of intent types in the currently used intent classification scheme that are covered by the end results of all users' intent analysis chains over a period of time, $N_t$ denotes the total number of intent types in the intent classification scheme, $\bar{L}$ denotes the average length of all users' intent analysis chains over that period, and $\alpha$ and $\beta$ are positive scaling factors.
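The reward described here, a coverage ratio minus a penalty on average chain length, can be computed as below; the function name and the coefficient values are illustrative assumptions, not values specified by this embodiment:

```python
def reward(num_covered, num_total, avg_chain_len, alpha=1.0, beta=0.1):
    """Reward = alpha * (covered intent types / total intent types)
    minus beta * average intent-analysis-chain length.
    alpha and beta are positive scaling factors (values here are illustrative)."""
    return alpha * (num_covered / num_total) - beta * avg_chain_len

# e.g. 6 of the scheme's 7 intent types covered, average chain length 3.0
print(round(reward(6, 7, 3.0), 3))  # 0.557
```

Higher coverage raises the reward, longer average chains lower it, matching the two optional criteria above.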
Optionally, the user's intent analysis chain in the state variable may be encoded as a vector whose dimension is greater than the maximum length of any historical intent analysis chain. Each intent type corresponds to a distinct non-zero value: the first dimension of the vector holds the non-zero value corresponding to the first intent recognition result in the chain, the second dimension holds the value corresponding to the second result, and so on, until every intent recognition result in the chain has a corresponding vector element; the remaining vector elements are filled with 0.
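The described encoding can be sketched as follows; the intent-to-value mapping shown is a hypothetical example (any assignment of distinct non-zero values per intent type would do):

```python
def encode_chain(chain, intent_value, dim):
    """Encode an intent analysis chain into a fixed-length vector:
    element i holds the non-zero value of the i-th intent recognition
    result, and the unused tail is zero-padded."""
    if len(chain) > dim:
        raise ValueError("vector dimension must cover the longest chain")
    vec = [0.0] * dim
    for i, intent in enumerate(chain):
        vec[i] = float(intent_value[intent])
    return vec

intent_value = {"question-answer": 1, "search": 2,
                "to-do task": 3, "learning task": 4}  # hypothetical mapping
print(encode_chain(["question-answer", "search", "to-do task"],
                   intent_value, dim=6))
# [1.0, 2.0, 3.0, 0.0, 0.0, 0.0]
```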
Optionally, the reinforcement learning network is structured as shown in fig. 2: it takes the intent analysis chains (i.e., state variables) of all users over a period of time as batch input, and outputs the optimal intent classification scheme (i.e., action variable) to adopt for the next period. The network comprises a feature extraction layer and a Q value calculation layer. The feature extraction layer encodes the input features to obtain corresponding encoded features, and the Q value calculation layer outputs a Q value vector from the encoded features, in which each element is the probability of selecting the corresponding intent classification scheme; the scheme with the highest probability is the action variable finally output by the network. Optionally, both the feature extraction layer and the Q value calculation layer are fully connected structures, where the number of nodes in the feature extraction layer grows layer by layer to enrich the feature information and the number of nodes in the Q value calculation layer shrinks layer by layer to match the dimension of the Q value vector; the specific numbers of layers and nodes can be set flexibly as required. The network updates its Q values as follows:
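The expand-then-contract fully connected structure can be sketched with NumPy as below. The layer widths, the softmax readout, and the random weights are illustrative assumptions; fig. 2 and the actual trained parameters are not reproduced here:

```python
import numpy as np

def q_network_forward(states, sizes, rng=np.random.default_rng(0)):
    """Fully connected stack: widths in `sizes` first grow (feature
    extraction layers) then shrink to the number of schemes (Q value
    calculation layers); a final softmax yields a selection probability
    for each intent classification scheme."""
    x = states
    for i, (d_in, d_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        w = rng.standard_normal((d_in, d_out)) * 0.1  # untrained weights
        x = x @ w
        if i < len(sizes) - 2:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# batch of 4 encoded chains (dim 6), choosing among 3 candidate schemes
probs = q_network_forward(np.zeros((4, 6)), [6, 16, 32, 16, 3])
print(probs.shape)  # (4, 3); each row sums to 1
```

Each row of `probs` is one user's Q value vector over the candidate schemes; `probs.argmax(axis=-1)` would give the selected scheme per input.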
First, the updated Q value $Q'(s_t, a_t)$ is calculated:

$$Q'(s_t, a_t) = Q(s_t, a_t) + \eta \left[ r + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$

where $Q(s_t, a_t)$ is the Q value of taking action $a_t$ in state $s_t$, $Q'(s_t, a_t)$ is the updated Q value, $\max_{a'} Q(s_{t+1}, a')$ is the largest of all Q values in state $s_{t+1}$, and $r$ is the reward obtained by taking action $a_t$ in state $s_t$ and reaching state $s_{t+1}$; $\eta$ and $\gamma$ are adjustable coefficients.
Then, the following loss is constructed from the difference between $Q'(s_t, a_t)$ and $Q(s_t, a_t)$:

$$\mathcal{L} = \left( Q'(s_t, a_t) - Q(s_t, a_t) \right)^2$$

According to $\mathcal{L}$, the network parameters of the feature extraction layer and the Q value calculation layer can be updated.
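The update and loss can be sketched in tabular form as below. In this embodiment the Q values come from the network and the loss drives parameter updates by gradient descent, whereas this illustration simply applies the update to a lookup table; the coefficient values are assumptions:

```python
def q_step(Q, s, a, r, s_next, eta=0.1, gamma=0.9):
    """One Q update: Q'(s,a) = Q(s,a) + eta*(r + gamma*max_a' Q(s',a') - Q(s,a)),
    followed by the squared-difference loss (Q'(s,a) - Q(s,a))**2."""
    q_old = Q[s][a]
    q_new = q_old + eta * (r + gamma * max(Q[s_next]) - q_old)
    loss = (q_new - q_old) ** 2
    Q[s][a] = q_new
    return q_new, loss

Q = {0: [0.0, 0.0], 1: [1.0, 0.0]}  # two states, two candidate schemes
q_new, loss = q_step(Q, s=0, a=0, r=1.0, s_next=1)
print(q_new, loss)  # approximately 0.19 and 0.0361
```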
Based on this algorithm, the reinforcement learning network observes, over continuing user conversations, how each intent classification scheme changes the intent analysis chains after it is put into use, and gradually learns the optimal scheme that genuinely improves intent recognition. Applying that optimal scheme in subsequent large model prompts makes the large model's intent recognition, and hence the outcomes of user conversations, better. Whenever a new intent classification scheme is added, a corresponding node, fully connected to the previous layer, is added to the last layer of the Q value calculation layer, and the network parameters are then updated gradually according to subsequent conversations.
Of course, besides the reinforcement learning network, the various intent classification schemes can also be evaluated according to user feedback, so that intent classification adjustments are assessed in time and degraded or unstable post-adjustment performance is avoided.
Further, in another specific embodiment, the operations of adjusting and evaluating the intent classification scheme may be performed per user: an intent analysis chain is generated for each user according to that user's acceptance of useful information in previous conversations, each user's set of intent types is adjusted according to the changes in the intent recognition results in that user's intent analysis chains, and each user's adjusted intent types are applied in that user's subsequent large model prompts.
Correspondingly, for each user, the set of intent types after each adjustment is treated as one intent classification scheme, and a reinforcement learning network is constructed per user that takes the current user's intent analysis chains as the state variable and the current user's intent classification schemes as the action variable; a reward function is constructed from the final results and the average length of the current user's intent analysis chains to select the optimal intent classification scheme for that user, which is then applied in the large model prompts for that user.
The advantage of adjusting the intent classification scheme separately for each user is that it adapts to each user's personal learning habits and avoids the adverse effects that could arise from applying a single user's intent classification adjustments to all users; the disadvantage is that it requires more storage space and computing resources. In practical applications, the implementation can be chosen flexibly according to the actual situation of the system.
In addition, it should be noted that, the user data (including voice data, image data, and dialogue text data) related to the present application are all information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 3, the device includes a processor 60, a memory 61, an input device 62 and an output device 63. The number of processors 60 in the device may be one or more; one processor 60 is taken as an example in fig. 3. The processor 60, the memory 61, the input device 62 and the output device 63 in the device may be connected by a bus or in other manners; connection by a bus is taken as an example in fig. 3.
The memory 61 is used as a computer readable storage medium for storing software programs, computer executable programs and modules, such as program instructions/modules corresponding to the dynamic intent understanding method based on multimodal fusion in the embodiments of the invention. The processor 60 executes various functional applications of the device and data processing, i.e., implements the above-described dynamic intent understanding method based on multimodal fusion, by running software programs, instructions, and modules stored in the memory 61.
The memory 61 may mainly include a storage program area that may store an operating system, an application program required for at least one function, and a storage data area that may store data created according to the use of the terminal, etc. In addition, the memory 61 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 61 may further comprise memory remotely located relative to processor 60, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 62 may be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output device 63 may comprise a display device such as a display screen.
Embodiments of the present invention also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the dynamic intent understanding method based on multimodal fusion of any of the embodiments.
The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the C language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that the above embodiments are merely illustrative of the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that the technical solutions described in the above embodiments may still be modified, or some or all of their technical features may be equivalently replaced, without such modifications or substitutions causing the essence of the corresponding technical solutions to depart from the technical solutions of the embodiments of the present invention.

Claims (10)

1. A dynamic intent understanding method based on multimodal fusion, comprising:

acquiring multimodal behavior data of a user in a current conversation, the multimodal behavior data including voice, image and conversation text;

rewriting the current conversation text according to whether the current conversation is a first conversation;

providing the rewritten conversation text together with the other multimodal behavior data to a large model, and prompting the large model to identify the intent of the current conversation; specifically, the prompt designates the role of the large model as an expert in understanding user conversation intent, specifies understanding techniques for typical conversation patterns, specifies multiple intent types to be identified and their descriptions, and instructs the large model to understand the rewritten conversation text and the other multimodal behavior data according to the designated role and information and to judge which of the multiple intent types the current conversation belongs to; and

invoking a corresponding business process according to the intent recognition result to provide useful information for the current conversation;

wherein the multiple intent types include a question-answer intent, a search intent, a to-do task intent, a learning task intent, a lesson-plan creation intent and a PPT creation intent.

2. The method according to claim 1, wherein rewriting the current conversation text according to whether the current conversation is a first conversation comprises:

if the current conversation is a first conversation, taking the current conversation text as the rewritten conversation text; and

if the current conversation is not a first conversation, combining the current conversation text with historical conversation text as the rewritten conversation text.

3. The method according to claim 1, further comprising, after invoking the corresponding business process according to the intent recognition result to provide useful information for the current conversation:

generating an intent analysis chain of the user according to the user's acceptance of the useful information in previous conversations, wherein the intent analysis chain consists of the intent recognition results corresponding to the previous conversations before the user accepted the useful information of a conversation or terminated the conversation chain; and

adjusting the multiple intent types according to changes in the intent recognition results in the intent analysis chain, and applying the adjusted intent types in subsequent large model prompts.

4. The method according to claim 3, wherein generating the intent analysis chain of the user according to the user's acceptance of the useful information in previous conversations further comprises:

if the user clicks or copies the useful information of a conversation, judging that the user accepted the useful information of that conversation, and marking that conversation as a chain node;

if the user enters no conversation text for longer than a set duration in a conversation, exits the conversation interface, or clicks the option to start a new conversation, judging that the user terminated the conversation chain, and marking that conversation as a chain node; and

arranging in sequence the intent recognition results corresponding to the conversations of the same user between two chain nodes, to form the intent analysis chain of that user.

5. The method according to claim 3, wherein adjusting the multiple intent types according to changes in the intent recognition results in the intent analysis chain comprises:

when the endpoint of the intent analysis chain is the user accepting the useful information of the last conversation:

if all intent recognition results in the intent analysis chain are the same, subdividing the shared intent type corresponding to those results according to the last two conversations in the intent analysis chain; and

otherwise, extracting the continuous repeated segment of the last intent recognition result at the end of the intent analysis chain, and subdividing the intent type of the last conversation before the continuous repeated segment according to the first conversation of that segment.

6. The method according to claim 3, wherein adjusting the multiple intent types according to changes in the intent recognition results in the intent analysis chain comprises:

when the endpoint of the intent analysis chain is the user terminating the conversation chain:

if the intent recognition results in the intent analysis chain cover the multiple intent types, adding new intent types according to the user's previous conversation content; and

otherwise, asking the user about the intent types not covered by the intent analysis chain, to determine whether the user belongs to an uncovered intent type.

7. The method according to claim 3, further comprising, after adjusting the multiple intent types according to changes in the intent recognition results in the intent analysis chain:

taking the multiple intent types after each adjustment as one intent classification scheme; and

constructing a reinforcement learning network that takes the user's intent analysis chains as the state variable and the various intent classification schemes as the action variable, and constructing a reward function from the final results and the average length of the user's intent analysis chains, to select the optimal intent classification scheme and apply it in subsequent large model prompts.

8. The method according to claim 3, wherein generating the intent analysis chain of the user according to the user's acceptance of the useful information in previous conversations comprises: generating an intent analysis chain for each user according to that user's acceptance of the useful information in previous conversations; and correspondingly, adjusting the multiple intent types according to changes in the intent recognition results in the intent analysis chain and applying the adjusted intent types in subsequent large model prompts comprises: adjusting the multiple intent types of each user according to changes in the intent recognition results in that user's intent analysis chains, and applying each user's adjusted intent types in that user's subsequent large model prompts.

9. An electronic device, comprising:

one or more processors; and

a memory for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the dynamic intent understanding method based on multimodal fusion according to any one of claims 1 to 8.

10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the dynamic intent understanding method based on multimodal fusion according to any one of claims 1 to 8.
CN202511088840.2A 2025-08-05 2025-08-05 Dynamic intention understanding method based on multi-mode fusion Active CN120579009B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202511088840.2A CN120579009B (en) 2025-08-05 2025-08-05 Dynamic intention understanding method based on multi-mode fusion


Publications (2)

Publication Number Publication Date
CN120579009A true CN120579009A (en) 2025-09-02
CN120579009B CN120579009B (en) 2025-12-16

Family

ID=96860376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202511088840.2A Active CN120579009B (en) 2025-08-05 2025-08-05 Dynamic intention understanding method based on multi-mode fusion

Country Status (1)

Country Link
CN (1) CN120579009B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140257793A1 (en) * 2013-03-11 2014-09-11 Nuance Communications, Inc. Communicating Context Across Different Components of Multi-Modal Dialog Applications
CN117370506A (en) * 2023-07-21 2024-01-09 中图科信数智技术(北京)有限公司 An agricultural intelligent question and answer method and system that supports multi-modal and multi-turn dialogue
CN117573834A (en) * 2023-11-30 2024-02-20 北京快牛智营科技有限公司 Multi-robot dialogue method and system for software-oriented instant service platform
CN118246537A (en) * 2024-05-24 2024-06-25 腾讯科技(深圳)有限公司 Question answering method, device, equipment and storage medium based on large model
CN118899006A (en) * 2024-07-09 2024-11-05 中国联合网络通信集团有限公司 A method, device and readable storage medium for identifying user's incoming call intention
CN119003872A (en) * 2024-08-09 2024-11-22 国网安徽省电力有限公司安庆供电公司 Multi-mode intelligent knowledge retrieval and dialogue system and method
CN119623643A (en) * 2024-12-06 2025-03-14 中电信数智科技有限公司 A multimodal intelligent question-answering reasoning method and device based on knowledge distillation
CN119808789A (en) * 2024-12-16 2025-04-11 广州九四智能科技有限公司 Customer intention recognition and response system, method, device and medium based on LLM
CN119961403A (en) * 2025-01-08 2025-05-09 中国联合网络通信集团有限公司 Intention recognition method, electronic device, storage medium and program product
CN119989268A (en) * 2025-01-16 2025-05-13 北京连屏科技股份有限公司 A multimodal large language model dialogue generation method based on natural language understanding
CN120354860A (en) * 2025-06-24 2025-07-22 联通沃悦读科技文化有限公司 Dialog intention recognition method, system, electronic equipment and storage medium


Also Published As

Publication number Publication date
CN120579009B (en) 2025-12-16

Similar Documents

Publication Publication Date Title
CN117056479B (en) Intelligent question-answering interaction system based on semantic analysis engine
CN117332072B (en) Dialogue processing, voice abstract extraction and target dialogue model training method
CN116415650A (en) Method, device and storage medium for generating dialog language model and generating dialog
CN109871807B (en) Face image processing method and device
CN116882450B (en) Question-answering model editing method and device, electronic equipment and storage medium
WO2024066920A1 (en) Processing method and apparatus for dialogue in virtual scene, and electronic device, computer program product and computer storage medium
CN119336867A (en) User question answering method and its device, equipment, and medium
CN119202211A (en) A controllable supervised simulated conversation method and system based on large language model
CN116913278A (en) Speech processing method, device, equipment and storage medium
CN118312593B (en) Artificial intelligent interaction method and device based on multiple analysis models
CN116306524A (en) Image Chinese description method, device, equipment and storage medium
CN118897878A (en) A customer interactive service system based on artificial intelligence
CN116226411B (en) Interactive information processing method and device for interactive project based on animation
CN117271745A (en) An information processing method, device, computing equipment, and storage medium
CN120804271B (en) Multi-mode fusion and reinforcement learning collaborative retrieval enhancement generation method and system
CN115292620B (en) Region information identification method, device, electronic equipment and storage medium
CN120579009B (en) Dynamic intention understanding method based on multi-mode fusion
CN112925894B (en) Method, system and device for matching bid-asking questions in conversation
CN120067252A (en) Training method for dialogue model, artificial intelligence interview method, computing device, storage medium, and program product
CN119149699A (en) Communication method and device thereof
CN117744801A (en) Conversation method, device and equipment of psychology big model based on artificial intelligence
CN120687541A (en) Task processing methods, text processing methods, automatic question-answering methods, task processing model training methods, information processing methods based on task processing models, and cloud training platforms
CN118210881A (en) Multi-round dialogue prediction method and related products
CN116361423A (en) Statement generation method, device, and computer-readable storage medium
CN119357324B (en) A knowledge-guided interview interaction data processing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant