
CN119807383A - A method, device and equipment for calling a function based on a large model - Google Patents


Info

Publication number
CN119807383A
CN119807383A
Authority
CN
China
Prior art keywords
function
text
message
question
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202510284613.0A
Other languages
Chinese (zh)
Other versions
CN119807383B (en)
Inventor
刘德龙 (Liu Delong)
王康 (Wang Kang)
耿嘉诚 (Geng Jiacheng)
赵世灵 (Zhao Shiling)
陈波扬 (Chen Boyang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202510284613.0A
Publication of CN119807383A
Application granted
Publication of CN119807383B
Legal status: Active
Anticipated expiration


Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a function calling method, device, and equipment based on a large model, in the technical field of data processing. The method comprises: generating a question message based on a question text input by a user and the respective description text of each candidate function, where each description text comprises the input parameter information and output parameter information of the corresponding candidate function; performing answer prediction based on the question message through a target large model to obtain a prediction message, where the prediction message comprises function call data corresponding to a target function and a pre-reply template, the pre-reply template comprises an identification field corresponding to the return value of the target function, and the target function is a candidate function related to the question text; calling the target function based on the function call data to obtain a function return value; and replacing the identification field in the pre-reply template with the function return value to obtain an answer text corresponding to the question text. Function call efficiency is thereby improved.

Description

Function calling method, device and equipment based on large model
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method, an apparatus, and a device for function call based on a large model.
Background
With the successful application of ChatGPT (Generative Pre-trained Transformer, a dialog-generation tool developed by OpenAI based on the GPT model), intelligent question-answering systems based on large models are rapidly spreading across many fields. In the use of such a system, large-model-based function calls provide more accurate and diversified services for users.
In the related art, a function call is typically performed by an AI (artificial intelligence) system in a user device, which consists of two parts: an application and a large-model inference engine. The function calling process is as follows: the application combines the question text with function information and inputs them into the inference engine; the inference engine performs inference and outputs a function call request; the application executes the function call based on that request and inputs the resulting function call result, together with the question text, into the inference engine again; the inference engine performs inference again, outputs an answer text, and returns it to the application; and the application feeds the answer text back to the user.
In this process, multiple interactions are needed between the application and the inference engine, and the inference engine must execute multiple inference passes, leading to problems such as low function call efficiency and low interaction efficiency.
Disclosure of Invention
The invention provides a function calling method, device and equipment based on a large model, which are used for solving the problems of low function calling efficiency and interaction efficiency when a function is called based on the large model.
In a first aspect, an embodiment of the present application provides a method for calling a function based on a large model, where the method includes:
generating a question message based on question texts input by a user and respective description texts of the candidate functions, wherein each description text comprises input parameter information and output parameter information of the corresponding candidate function;
performing answer prediction based on the question message through a target large model to obtain a prediction message, wherein the prediction message comprises function call data corresponding to a target function and a pre-reply template, and the pre-reply template comprises an identification field corresponding to a return value of the target function;
Calling the target function based on the function calling data to obtain a function return value;
and replacing the identification field in the pre-reply template based on the function return value to obtain an answer text corresponding to the question text.
According to the method, the large model outputs, in a single pass, a prediction message fusing the function call data and the pre-reply template; after the target function is called, the function return value directly replaces the identification field to yield the answer text. The final output for the user is thus obtained through one large-model inference pass, which improves large-model inference utilization and, in turn, the data interaction efficiency of the function call process.
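As a minimal sketch, the final replacement step can be illustrated as follows; the "<RETURN_VALUE>" marker is an assumed placeholder for the identification field, as the embodiments do not mandate a concrete token:

```python
def fill_pre_reply(pre_reply_template: str, function_return_value) -> str:
    # Substitute the identification field with the actual function return value.
    return pre_reply_template.replace("<RETURN_VALUE>", str(function_return_value))

answer_text = fill_pre_reply("You need to pay <RETURN_VALUE> yuan.", 360.0)
print(answer_text)  # → You need to pay 360.0 yuan.
```

Since a single string substitution replaces the second inference pass of the related art, the answer text is produced without returning to the large model.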
In one possible implementation, generating the question message based on the question text and the respective descriptive text of each candidate function includes:
Generating a first text based on the question text and a first preset template, wherein the first text comprises a first text type for identifying the first text as the question text and text content of the question text;
generating a second text based on the respective description text of each candidate function and a second preset template, wherein the second text comprises a second text type for identifying the second text as the description text and text content of each description text;
and generating the question message based on the first text, the second text, and a preset prompt word, wherein the prompt word is used to instruct the target large model to output a prediction message based on the question message.
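The assembly of the question message from the two preset templates and the prompt word might look as follows; the template formats, field names, and the prompt wording are hypothetical illustrations, not prescribed by the embodiments:

```python
import json

# Hypothetical first/second preset templates: each carries a text type tag
# and a slot for the corresponding text content.
FIRST_TEMPLATE = '{{"type": "question", "content": "{q}"}}'
SECOND_TEMPLATE = '{{"type": "function_descriptions", "content": {descs}}}'
PROMPT_WORD = ("Answer the question; if a function call is needed, output its call "
               "data and a pre-reply template containing <RETURN_VALUE> in one message.")

def build_question_message(question_text, description_texts):
    first_text = FIRST_TEMPLATE.format(q=question_text)
    second_text = SECOND_TEMPLATE.format(descs=json.dumps(description_texts))
    return "\n".join([PROMPT_WORD, first_text, second_text])

msg = build_question_message(
    "How much do I need to pay?",
    [{"name": "calculate_discount", "description": "Compute a discounted price"}])
```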
In one possible implementation, before generating the question message based on the question text and the description text of each candidate function, the method further includes:
acquiring a target function call level corresponding to the artificial intelligence (AI) system, and querying, from a function library, candidate functions matched with the target function call level, wherein the function library stores a plurality of functions and the function call level corresponding to each function; or
acquiring a function set supported by the AI system, and taking each function included in the function set as a candidate function.
In one possible implementation, the predicted message further comprises an identifier for distinguishing function call data from a pre-reply template;
Before the function call data is used for calling the target function, the method further comprises the following steps:
Parsing the function call data from the predicted message based on the identifier;
The function call data comprises the function identification of the target function, the identification and value of each input parameter, and the identification of the output parameter, wherein the value of each input parameter is parsed from the question text by the target large model.
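A sketch of parsing the function call data out of the prediction message; the separator identifier "<CALL_END>" and the JSON encoding of the call data are assumptions made for illustration:

```python
import json

def parse_prediction(message: str):
    # The identifier separates the function call data from the pre-reply template.
    call_part, pre_reply_template = message.split("<CALL_END>", 1)
    return json.loads(call_part), pre_reply_template

call_data, template = parse_prediction(
    '{"name": "get_current_weather", "arguments": {"city": "Hangzhou"}}'
    "<CALL_END>The current weather in Hangzhou is <RETURN_VALUE>.")
```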
In one possible implementation, the target large model is obtained by:
constructing each sample message, wherein the sample message comprises a sample question message and a reference prediction message corresponding to the sample question message;
and performing iterative training on the initial large model using each constructed sample message to obtain the target large model, wherein in one training iteration the following operations are performed: performing answer prediction based on one sample question message through the initial large model, and adjusting the parameters of the current initial large model based on the difference between the obtained prediction message and the reference prediction message of that sample question message.
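The iteration structure above can be illustrated with a deliberately tiny numeric stand-in: a single-parameter "model" whose parameter is adjusted from the difference between prediction and reference. This only mirrors the loop shape; real training of the target large model would use token-level losses over an LLM:

```python
def train(weight, samples, lr=0.1, epochs=50):
    for _ in range(epochs):
        for x, reference in samples:
            predicted = weight * x                  # answer prediction
            grad = 2 * (predicted - reference) * x  # gradient of the squared difference
            weight -= lr * grad                     # adjust current parameters
    return weight

w = train(0.0, [(1.0, 2.0)])  # converges toward 2.0
```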
In one possible implementation, each sample message is constructed by:
inputting a sample question text corresponding to the sample message and input parameter information of each candidate function into an initial large model, and carrying out answer prediction through the initial large model to obtain a function call request;
Performing function call based on the function call request to obtain a function return value, and inputting the function return value and a sample question text into an initial large model to obtain a sample answer text;
Based on the sample question text, the input parameter information and the output parameter information of each candidate function, constructing a sample question message in the sample message;
And replacing the function return value in the sample answer text with the identification field, and constructing a reference prediction message in the sample message based on the replaced sample answer text and the function call data in the function call request.
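A sketch of the sample-construction step, assuming a "<RETURN_VALUE>" identification field and a "<CALL_END>" separator, both illustrative placeholders:

```python
import json

def build_reference_prediction(sample_answer_text, return_value, call_data):
    # Replace the concrete function return value with the identification field.
    templated_answer = sample_answer_text.replace(str(return_value), "<RETURN_VALUE>")
    # Fuse the call data with the templated answer into one reference message.
    return json.dumps(call_data) + "<CALL_END>" + templated_answer

reference = build_reference_prediction(
    "You need to pay 360.0 yuan.", 360.0,
    {"name": "calculate_discount", "arguments": {"price": 400, "discount": 0.9}})
```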
In a second aspect, an embodiment of the present application provides a function calling device based on a large model, where the device includes:
the generating module is used for generating a question message based on a question text input by a user and respective description text of each candidate function, wherein each description text comprises input parameter information and output parameter information of the corresponding candidate function;
The prediction module is used for performing answer prediction based on the question message through the target large model to obtain a prediction message, wherein the prediction message comprises function call data corresponding to a target function and a pre-reply template, and the pre-reply template comprises an identification field corresponding to a return value of the target function;
the calling module is used for calling the target function based on the function calling data to obtain a function return value;
and the replacing module is used for replacing the identification field in the pre-reply template based on the function return value to obtain an answer text corresponding to the question text.
In one possible implementation manner, the generating module is specifically configured to:
Generating a first text based on the question text and a first preset template, wherein the first text comprises a first text type for identifying the first text as the question text and text content of the question text;
generating a second text based on the respective description text of each candidate function and a second preset template, wherein the second text comprises a second text type for identifying the second text as the description text and text content of each description text;
and generating the question message based on the first text, the second text, and a preset prompt word, wherein the prompt word is used to instruct the target large model to output a prediction message based on the question message.
In one possible implementation manner, the generating module is further configured to, before generating the question message based on the question text and the description text of each candidate function:
acquire a target function call level corresponding to the artificial intelligence (AI) system, and query, from a function library, candidate functions matched with the target function call level, wherein the function library stores a plurality of functions and the function call level corresponding to each function; or
acquire a function set supported by the AI system, and take each function included in the function set as a candidate function.
In one possible implementation, the predicted message further comprises an identifier for distinguishing function call data from a pre-reply template;
before the calling module calls the target function based on the function calling data, the calling module is further used for:
Parsing the function call data from the predicted message based on the identifier;
The function call data comprises the function identification of the target function, the identification and value of each input parameter, and the identification of the output parameter, wherein the value of each input parameter is parsed from the question text by the target large model.
In a possible implementation manner, the device in the embodiment of the application further comprises a training module, specifically configured to obtain the target large model in the following manner:
constructing each sample message, wherein the sample message comprises a sample question message and a reference prediction message corresponding to the sample question message;
performing iterative training on the initial large model using each constructed sample message to obtain the target large model, wherein in one training iteration the following operations are performed: performing answer prediction based on one sample question message through the initial large model, and adjusting the parameters of the current initial large model based on the difference between the obtained prediction message and the reference prediction message of that sample question message.
In one possible implementation, the training module is specifically configured to construct each sample message by:
inputting a sample question text corresponding to the sample message and input parameter information of each candidate function into an initial large model, and carrying out answer prediction through the initial large model to obtain a function call request;
Performing function call based on the function call request to obtain a function return value, and inputting the function return value and a sample question text into an initial large model to obtain a sample answer text;
Based on the sample question text, the input parameter information and the output parameter information of each candidate function, constructing a sample question message in the sample message;
And replacing the function return value in the sample answer text with the identification field, and constructing a reference prediction message in the sample message based on the replaced sample answer text and the function call data in the function call request.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements steps in the large model-based function calling method when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above-described large model-based function calling method of the present application.
In a fifth aspect, an embodiment of the present application provides a computer program product, including a computer program stored in a computer readable storage medium; when a processor of an electronic device reads the computer program from the computer readable storage medium and executes it, the electronic device performs the steps of the above-mentioned large model-based function calling method of the present application.
For the technical effects of the second to fifth aspects and of each of their possible implementations, refer to the description of the first aspect and its possible implementations above; the details are not repeated here.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present application, and that other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a diagram illustrating a function call procedure in the related art according to the present application;
FIG. 2 is a schematic diagram of a function call architecture according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a function calling method based on a large model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a large model training process according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of an example of function call provided in an embodiment of the present application;
FIG. 6 is a schematic diagram of a function calling device according to an embodiment of the present application;
fig. 7 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions, and advantages of the present application more apparent, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application. Embodiments of the application and features of the embodiments may be combined with one another arbitrarily provided there is no conflict. Also, while a logical order is depicted in the flowcharts, in some cases the steps shown or described may be performed in a different order than presented.
The terms "first" and "second" in the description, claims, and above-mentioned figures are used to distinguish between different objects, not to describe a particular sequential order. Furthermore, the term "include" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements, but may include other steps or elements not listed or inherent to such process, method, article, or apparatus. The term "plurality" in the present application may mean at least two, for example, two, three, or more, and embodiments of the present application are not limited thereto.
Exemplary embodiments of the present application are described below with reference to the accompanying drawings; the various details of the embodiments included therein facilitate understanding and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that the embodiments of the present application may mention existing industry solutions such as software, components, and models; these are exemplary, serve only to illustrate the feasibility of the technical solution of the present application, and do not imply that the applicant uses or must use those solutions.
In the technical scheme of the application, the acquisition, transmission, storage, use and the like of the data all meet the requirements of national relevant laws and regulations.
In order to facilitate a better understanding of the technical solutions provided by the embodiments of the present application, the following brief description of the related terms is provided below:
(1) Large model: a machine learning model with a very large number of parameters (typically billions or more) and a complex computational structure, which can process massive data to accomplish various complex tasks such as natural language processing and image recognition. Large models include, among others, LLMs (Large Language Models). A large language model is a deep learning model trained on a large amount of text data; it can generate natural language text or understand the meaning of language text, and can handle various natural language tasks such as text classification, question answering, and dialogue, e.g., ChatGPT and similar systems.
(2) Function Call: an important technical means by which an agent realizes its functionality. In the context of large AI models, function calls allow an agent to handle a particular task or problem with predefined functions. These functions may be simple custom functions or functions that encapsulate external tool APIs (Application Programming Interfaces), so that the agent can invoke external resources and services, such as accessing email or retrieving weather information.
In recent years, with the successful application of ChatGPT (Generative Pre-trained Transformer, a dialog-generation tool developed by OpenAI based on the GPT model), intelligent question-answering systems based on large models have rapidly spread across a variety of fields. In the use of such systems, large-model-based function calls provide more accurate and diversified services for users.
In the related art, a function call is generally performed by an AI system in a user device, which consists of two parts: an application and a large-model inference engine. The function calling process is as follows: the application combines the question text with function information and inputs them into the inference engine; the inference engine performs inference and outputs a function call request; the application executes the function call based on that request and inputs the resulting function call result, together with the question text, into the inference engine again; the inference engine performs inference again, outputs an answer text, and returns it to the application; and the application feeds the answer text back to the user, as shown in fig. 1.
In this process, multiple interactions are needed between the application and the inference engine, and the inference engine must execute multiple inference passes, leading to problems such as low function call efficiency and low interaction efficiency.
In this regard, the related art does include schemes that improve instruction processing efficiency by analyzing the dependency relationships between function calls and invoking them in parallel, and schemes that respond to the user promptly and improve user experience by inserting a fixed reply before the function call reply. However, the former applies to scenarios with multiple function calls and does not consider optimizing a single function call, while the latter only improves user perception and does not fundamentally improve processing efficiency.
To solve the above problems, the embodiments of the present application provide a large-model-based function calling method in which the large model outputs, in a single pass, a prediction message that fuses the function call data with a pre-reply template (in which an identification field corresponding to the return value of the target function is embedded). After the target function is called, the identification field is directly replaced by the function return value to obtain the answer text. The final output for the user is thus obtained through one large-model inference pass, which improves large-model inference utilization and, in turn, the data interaction efficiency of the function call process.
Fig. 2 is a schematic diagram of a function call architecture provided in an embodiment of the present application. For ease of understanding, the overall function call flow of the embodiment of the present application is described below with reference to fig. 2.
Referring to fig. 2, the function calling method in the embodiment of the present application is applied to an AI system comprising an application and a large-model inference engine. Optionally, the application is deployed on the user's terminal device and is used to interact with the user (i.e., receive user input and feed information back to the user), invoke the inference engine for large-model inference, and complete function calling tasks; the inference engine is deployed on a server and performs inference when invoked by the application, outputting the corresponding results and returning them to the application.
In some embodiments, the function call flow comprises: the user sends a question text to the application; the application constructs a question message based on the question text and the description texts of the candidate functions and sends it to the inference engine; the large model performs answer prediction based on the question message, outputs a prediction message comprising function call data and a pre-reply template, and returns it to the application; the application calls the target function based on the function call data in the prediction message to obtain a function return value and fills it into the pre-reply template to generate an answer text; and the application feeds the answer text back to the user.
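The application/inference-engine division of fig. 2 could be sketched as below; class and method names are illustrative, and the engine is stubbed with a fixed prediction rather than a real large model:

```python
class InferenceEngine:
    """Stand-in for the server-side large-model inference engine."""
    def predict(self, question_message: str) -> dict:
        # One inference pass returns both the call data and the pre-reply template.
        return {"call": ("calculate_discount", {"price": 400, "discount": 0.9}),
                "pre_reply": "You need to pay <RETURN_VALUE> yuan."}

class Application:
    """Terminal-side application: interacts with the user and calls functions."""
    def __init__(self, engine, functions):
        self.engine, self.functions = engine, functions

    def handle(self, question_text: str) -> str:
        prediction = self.engine.predict(question_text)   # single engine interaction
        name, args = prediction["call"]
        return_value = self.functions[name](**args)       # call the target function
        return prediction["pre_reply"].replace("<RETURN_VALUE>", str(return_value))

app = Application(InferenceEngine(),
                  {"calculate_discount": lambda price, discount: price * discount})
answer = app.handle("An electronic product originally priced at 400 yuan is "
                    "discounted by 10%; how much do I need to pay?")
```

Note that `predict` is invoked exactly once per question, in contrast to the two inference passes of the related art shown in fig. 1.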
The present application is described in further detail below with reference to the above architecture and the remaining drawings. Fig. 3 is a schematic flow chart of a function calling method based on a large model according to an embodiment of the present application. Referring to fig. 3, the function calling method provided in the embodiment of the present application specifically includes the following steps:
Step S301, generating a question message based on a question text input by a user and respective description text of each candidate function;
wherein, each description text comprises input parameter information and output parameter information of the corresponding candidate function;
in some embodiments, the application program in the AI system is provided in the user's terminal device, including but not limited to a mobile phone, a computer, etc., and the user can input question text, such as "10% discount for an original 400-yuan electronic product, i need to pay.
After receiving the question text input by the user, the application triggers the subsequent large-model-based function call flow. Specifically, it first obtains the respective description text of each candidate function, where a candidate function is a function the AI system supports calling; it then generates the question message based on the question text and the description texts. A description text includes, but is not limited to, the identification (name) of the candidate function, its purpose, and its parameter definitions (i.e., input parameter information and output parameter information).
In one possible implementation, the candidate function may be determined in the embodiment of the present application based on the following manner:
acquiring a target function call level corresponding to the artificial intelligence (AI) system, and querying, from a function library, candidate functions matched with the target function call level, wherein the function library stores a plurality of functions and the function call level corresponding to each function; or
acquiring a function set supported by the AI system, and taking each function included in the function set as a candidate function.
In a possible implementation manner, the AI system in the embodiment of the application is provided with a corresponding target function call level, which represents the function call capability of the AI system. When determining the candidate functions, functions matching the target function call level (i.e., whose function call level is not higher than the target level) are looked up in the function library and used as candidate functions. In implementation, real-time modification of the target function call level of the AI system is supported.
In another possible implementation manner, the embodiment of the application configures the function set supported by the AI system in advance; the set may exist in list form, and the functions supported by the AI system (i.e., the candidate functions) are added to it in advance. In implementation, modification, addition, and deletion of the contents of the function set are supported.
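The two candidate-function selection modes described above can be sketched as follows; the library contents and level values are invented for illustration:

```python
# Hypothetical function library mapping each function to its call level.
FUNCTION_LIBRARY = {
    "calculate_discount": 1,
    "get_current_weather": 2,
    "admin_reboot": 3,
}

def candidates_by_level(target_level: int):
    # Mode 1: functions whose call level does not exceed the system's target level.
    return [name for name, level in FUNCTION_LIBRARY.items() if level <= target_level]

# Mode 2: a preconfigured set of supported functions, kept here in list form.
SUPPORTED_FUNCTIONS = ["calculate_discount", "get_current_weather"]

def candidates_by_set():
    return list(SUPPORTED_FUNCTIONS)
```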
In one possible implementation manner, the step of generating the question message based on the question text and the description text of each candidate function specifically includes:
Generating a first text based on the question text and a first preset template, wherein the first text comprises a first text type for identifying the first text as the question text and text content of the question text;
generating a second text based on the respective description text of each candidate function and a second preset template, wherein the second text comprises a second text type for identifying the second text as the description text and text content of each description text;
and generating a question message based on the first text, the second text, and a preset prompt word, wherein the prompt word is used to instruct the target large model to output a prediction message based on the question message.
In some embodiments, after obtaining the question text and the description text of the candidate function, the application assembles the question text and the description text according to the format of the input message supported by the large model to obtain the question message.
In a specific implementation, a first preset template and a second preset template are preset. The first preset template reserves a position for the question text and carries the first text type; the second preset template reserves a position for the description texts and carries the second text type. After the question text is obtained, it is filled into the corresponding position of the first preset template to obtain the first text; similarly, after the description texts are obtained, they are filled into the corresponding position of the second preset template to obtain the second text.
The generation process of the first text and the second text is described below with reference to examples.
Assume the question text entered by the user is "An electronic product originally priced at 400 yuan is discounted by 10%; how much do I need to pay?", and the candidate functions are "calculate_discount" and "get_current_weather". The description information corresponding to each candidate function includes name, description, parameters, required, and returns (the return information, i.e., the output parameter information), and parameters include type, properties, description, and the like.
In one possible implementation, the first text type is a user, and the first preset template is:
{
"role": "user",
"content": "XXXXX (i.e., the position corresponding to the question text)"
}.
Based on the first preset template and the question text, the first text can be obtained as follows:
{
"role": "user",
"content": "An electronic product originally priced at 400 yuan is discounted by 10%; how much do I need to pay?"
}.
In one possible implementation, the second text type is system, and the second preset template is:
{
"role": "system",
"content": "You have the following tools that can be called: XXXXX (i.e., the position corresponding to the description texts)"
}.
Based on the second preset template and each description text, the second text can be obtained as follows:
{
"role": "system",
"content": "You have the following tools that can be called: [{'name': 'calculate_discount', 'description': 'Calculate the discounted price', 'parameters': {'type': 'object', 'properties': {'original_price': {'type': 'number', 'description': 'The original price of the item'}, 'discount_percentage': {'type': 'number', 'description': 'The percentage of discount'}}, 'required': ['original_price', 'discount_percentage']}, 'returns': {'type': 'object', 'properties': {'discounted_price': {'type': 'number', 'description': 'The price after applying the discount'}}}},
{'name': 'get_current_weather', 'description': 'Get the current weather', 'parameters': {'type': 'object', 'properties': {'location': {'type': 'string', 'description': 'The city name'}}, 'required': ['location']}, 'returns': {'type': 'object', 'properties': {'temperature': {'type': 'number', 'description': 'The temperature of the city'}}}}]"
}.
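The template-filling process above can be sketched as follows. The helper names are assumptions for illustration; the dict structures mirror the chat-message format shown in the examples.

```python
import json

def build_first_text(question_text: str) -> dict:
    # First preset template: role "user" identifies the first text type,
    # and the question text fills the reserved content position.
    return {"role": "user", "content": question_text}

def build_second_text(description_texts: list) -> dict:
    # Second preset template: role "system" identifies the second text type,
    # and the candidate-function description texts fill the content position.
    return {"role": "system",
            "content": "You have the following tools that can be called: "
                       + json.dumps(description_texts)}

first = build_first_text("An electronic product originally priced at 400 yuan is "
                         "discounted by 10%; how much do I need to pay?")
second = build_second_text([{"name": "calculate_discount"},
                            {"name": "get_current_weather"}])
question_message = [second, first]  # assembled question message for the large model
```

The question message is simply the list of filled templates, in the input-message format the large model supports.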
In some embodiments, a prompt word is further preset in the embodiment of the present application, where the prompt word is used to instruct the target large model to output a prediction message based on the question message. In a specific implementation, the content of the prompt word may be set as required, for example: "Based on the following content, please output the function call data corresponding to the target function to be called, together with a template of the answer text to reply to the user".
In some embodiments, the function type corresponding to each function in the function library may also be preset, where the function type characterizes the business scenario to which the function belongs. The business scenarios may be set as required, for example, weather, sales, and the like.
When inputting a question text, the user may also input the target business scenario corresponding to the question text. When generating the question message based on the question text and the description texts of the candidate functions, candidate functions matching the target business scenario may first be screened from the candidate functions based on the target business scenario and the business scenario corresponding to each candidate function; the question message is then generated based on the question text and the description texts of the screened candidate functions. This reduces the number of candidate functions in the question message and the computation performed by the large model, thereby improving the efficiency and accuracy of the model.
For example, assume the candidate functions include function 1 (business scenario: weather), function 2 (business scenario: sales), and function 3 (business scenario: sales), the question text is "An electronic product originally priced at 400 yuan is discounted by 10%; how much do I need to pay?", and the target business scenario is sales; then functions 2 and 3 are screened out as the candidate functions used to generate the question message.
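The scenario-based screening can be illustrated with a minimal sketch; the scenario labels and function names are assumptions for this example.

```python
# Map each candidate function to its preset business scenario (illustrative).
candidate_scenes = {
    "function_1": "weather",
    "function_2": "sales",
    "function_3": "sales",
}

def screen_by_scene(target_scene: str) -> list:
    """Keep only the candidate functions whose business scenario matches the
    target scenario supplied alongside the question text."""
    return [name for name, scene in candidate_scenes.items()
            if scene == target_scene]

print(screen_by_scene("sales"))  # ['function_2', 'function_3']
```

Only the description texts of the screened functions are then assembled into the question message, shrinking the model's input.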
Step S302, answer prediction is performed based on the question message through the target large model to obtain a prediction message;
The prediction message comprises function call data corresponding to an objective function and a pre-reply template, wherein the pre-reply template comprises an identification field corresponding to a return value of the objective function;
In some embodiments, the application fine-tunes the initial large model in advance using sample messages to obtain the target large model; the fine-tuned target large model supports predicting, based on the question message, a prediction message that carries both the function call data and the pre-reply template.
After the question message is obtained, it is input into the target large model, which outputs a prediction message. Specifically, the target large model reasons over the context information in the question message to determine, from the plurality of candidate functions, the target function corresponding to the question text (i.e., the one with the highest correlation); it then parses the function call data corresponding to the target function from the question text, predicts the content of the pre-reply template based on the context in the question message, and finally outputs the prediction message.
In some embodiments, an identification field corresponding to the return value of the target function is embedded in the pre-reply template; the identification field may be set as required, for example, as the identifier of the output parameter of the target function. The format of the function call data is not limited in the embodiment of the present application and may be, for example, JSON.
Illustratively, one possible predicted message format is as follows:
{{%% <function call data> %%}} large-model pre-reply {{identification field}} large-model pre-reply, wherein "large-model pre-reply {{identification field}} large-model pre-reply" is the pre-reply template portion.
Taking the question text "An electronic product originally priced at 400 yuan is discounted by 10%; how much do I need to pay?" as an example, the target function is calculate_discount, and the corresponding prediction message may be: {{%% {'name': 'calculate_discount', 'return_vars': ['discounted_price'], 'arguments': {'original_price': 400, 'discount_percentage': 10}} %%}} For this electronic product you should pay {{discounted_price}} yuan.
Wherein {'name': 'calculate_discount', 'return_vars': ['discounted_price'], 'arguments': {'original_price': 400, 'discount_percentage': 10}} is the function call data, "For this electronic product you should pay {{discounted_price}} yuan" is the pre-reply template, and discounted_price is the identification field.
Step S303, calling the target function based on the function calling data to obtain a function return value;
In some embodiments, after the prediction message output by the large model is obtained, it is first parsed into the function call data and the pre-reply template; the target function is then called based on the function call data to obtain the function return value.
In a possible implementation manner, the prediction message further comprises an identifier for distinguishing function call data from a pre-reply template, and the identifier can be set based on requirements.
Before the function call data is used for calling the target function, the method in the embodiment of the application further comprises the following steps:
Parsing the function call data from the predicted message based on the identifier;
The function call data includes the function identifier of the target function, the identifier and value of each input parameter, and the identifier of the output parameter, wherein the value of each input parameter is parsed from the question text by the target large model.
In an embodiment of the present application, the identifier may be set to "{{%%" and "%%}}" for distinguishing the function call data from the pre-reply template. Taking "{{%% <function call data> %%}} large-model pre-reply {{identification field}} large-model pre-reply" as an example, based on the identifiers {{%% and %%}}, the portion inside the identifiers and the portion outside them can be obtained: the portion inside the identifiers is the function call data, and the portion outside them is the pre-reply template.
Taking the prediction message "{{%% {'name': 'calculate_discount', 'return_vars': ['discounted_price'], 'arguments': {'original_price': 400, 'discount_percentage': 10}} %%}} For this electronic product you should pay {{discounted_price}} yuan" as an example, according to the identifiers "{{%%" and "%%}}", the function call data can be identified as {'name': 'calculate_discount', 'return_vars': ['discounted_price'], 'arguments': {'original_price': 400, 'discount_percentage': 10}}, and the pre-reply template as "For this electronic product you should pay {{discounted_price}} yuan".
Further, the function call data may be parsed to obtain the function identifier of the target function, the identifier and value of each input parameter, the identifier of the output parameter, and so on. Continuing the example above, the function identifier of the target function is calculate_discount, the identifiers and values of the input parameters are {'original_price': 400, 'discount_percentage': 10}, and the identifier of the output parameter is discounted_price.
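A minimal sketch of this identifier-based parsing, assuming the "{{%% ... %%}}" convention from the examples above and JSON-formatted call data; the regular expression is an illustrative choice, not the disclosed implementation.

```python
import json
import re

# A prediction message in the format described above (JSON-quoted for parsing).
predicted = ('{{%% {"name": "calculate_discount", '
             '"return_vars": ["discounted_price"], '
             '"arguments": {"original_price": 400, "discount_percentage": 10}} %%}} '
             'For this electronic product you should pay {{discounted_price}} yuan')

# The portion inside the {{%% ... %%}} identifiers is the function call data;
# the portion outside them is the pre-reply template.
match = re.search(r"\{\{%%(.*?)%%\}\}", predicted, re.S)
call_data = json.loads(match.group(1))
pre_reply = predicted[match.end():].strip()

print(call_data["name"])       # calculate_discount
print(call_data["arguments"])  # {'original_price': 400, 'discount_percentage': 10}
print(pre_reply)               # For this electronic product you should pay {{discounted_price}} yuan
```

From `call_data` the function identifier, input parameter values, and output parameter identifier are all directly available for the actual call.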
And step S304, replacing the identification field in the pre-reply template based on the function return value to obtain an answer text corresponding to the question text.
In some embodiments, an identifier for characterizing the identification field may also be set in the pre-reply template; this identifier may likewise be set as required. When step S304 is performed, the identification field in the pre-reply template can be determined based on this identifier.
For example, the identifier corresponding to the identification field may be set to "{{ }}". Taking the pre-reply template "For this electronic product you should pay {{discounted_price}} yuan" as an example, the identification field can be identified as discounted_price via the identifier "{{ }}". The identification field discounted_price in the pre-reply template can then be replaced with the function return value of the target function (assumed to be 360), yielding the answer text corresponding to the question text, namely "For this electronic product you should pay 360 yuan", which is returned to the user.
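The replacement in step S304 can be sketched as follows, assuming the "{{ }}" identifier convention; the helper name is an illustrative assumption.

```python
import re

def fill_pre_reply(template: str, return_values: dict) -> str:
    # Locate each identification field via the "{{ }}" identifier and
    # substitute the corresponding function return value.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: str(return_values[m.group(1)]),
                  template)

answer = fill_pre_reply(
    "For this electronic product you should pay {{discounted_price}} yuan",
    {"discounted_price": 360},
)
print(answer)  # For this electronic product you should pay 360 yuan
```

Because the substitution is keyed by the output-parameter identifier, a template with several identification fields can be filled from several return values in one pass.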
According to the above method, the large model outputs, in a single pass, a prediction message fusing the function call data and the pre-reply template; after the target function is called, the function return value directly replaces the identification field to obtain the answer text. The final output to the user is thus obtained with a single round of large-model inference, which improves the utilization of large-model inference and further improves data interaction efficiency during function calling.
In some embodiments, the solution of the present application includes both model fine-tuning and model inference. As shown in fig. 4, before performing steps S301-S304 described above, the initial large model (i.e., a large model in the related art) first needs to be fine-tuned based on sample messages so that it supports outputting the prediction message in a single pass.
In one possible implementation, the present example obtains the target large model by:
constructing each sample message, wherein the sample message comprises a sample question message and a reference prediction message corresponding to the sample question message;
and performing iterative training on the initial large model using the constructed sample messages to obtain the target large model, wherein in one iteration of training the following operations are performed: answer prediction is performed based on one sample question message through the initial large model, and the parameters of the current initial large model are adjusted based on the difference between the obtained prediction message and the reference prediction message of that sample question message.
In some embodiments, the application constructs a plurality of sample messages based on the specific business scenario and fine-tunes the initial large model based on these sample messages; optionally, conventional fine-tuning frameworks such as llama-factory and DeepSpeed may be used to fine-tune the large model.
After fine tuning the initial large model based on each sample message, the target large model in the above step S302 of the present application is obtained.
In one possible implementation, the present embodiment constructs each sample message by:
step 1, inputting a sample question text corresponding to a sample message and input parameter information of each candidate function into an initial large model, and carrying out answer prediction through the initial large model to obtain a function call request;
Step 2, performing function call based on the function call request to obtain a function return value, and inputting the function return value and a sample question text into an initial large model to obtain a sample answer text;
step 3, based on the sample question text, the input parameter information and the output parameter information of each candidate function, constructing a sample question message in the sample message;
And 4, replacing the function return value in the sample answer text with an identification field, and constructing a reference prediction message in the sample message based on the replaced sample answer text and function call data in the function call request.
In some embodiments, the above process of obtaining the sample answer text based on the initial large model (i.e., steps 1-2) belongs to the related art, and its specific process is not repeated here. It should be noted that, when a plurality of sample messages are constructed, the process described in steps 1-4 may be performed multiple times in succession in the embodiment of the present application; alternatively, after steps 1-2 are performed to obtain a plurality of sample answer texts, the process described in steps 3-4 may be performed based on the obtained results.
In some embodiments, the present application may also manually construct a function-call fine-tuning dataset including the sample question texts, the input parameter information of each candidate function, the sample answer texts, and so on, and then perform the process described in steps 3-4 above to obtain the sample messages used for fine-tuning the large model.
The above process of constructing a sample message is described in detail below in connection with specific examples.
First, in the embodiment of the application, a function-call fine-tuning dataset is generated based on a sample question text through a large model in the related art, i.e., the initial large model (steps 1-2 above). The specific process is as follows:
Assume the sample question text is "A pair of shoes originally priced at 200 yuan is discounted by 15%; how much do I need to pay?". The input message assembled in the format supported by the initial large model is:
{
"role": "system",
"content": "You have the following tools that can be called: [{'name': 'calculate_discount', 'description': 'Calculate the discounted price', 'parameters': {'type': 'object', 'properties': {'original_price': {'type': 'number', 'description': 'The original price of the item'}, 'discount_percentage': {'type': 'number', 'description': 'The percentage of discount'}}, 'required': ['original_price', 'discount_percentage']}}]"
},{
"role": "user",
"content": "A pair of shoes originally priced at 200 yuan is discounted by 15%; how much do I need to pay?"
}。
Answer prediction is carried out through the initial large model, and a function call request can be obtained:
{
"role": "assistant",
"content": "{'name': 'calculate_discount', 'return_vars': ['discounted_price'], 'arguments': {'original_price': 200, 'discount_percentage': 15}}"
}。
A function call is performed based on the function call request to obtain the function return value 170; the return value is then assembled according to the format of input messages supported by the initial large model:
{
"role": "tool",
"content": "170"
}。
The formatted function return value is combined with the sample question text and input into the initial large model; answer prediction is performed based on the initial large model to obtain the sample answer text:
{
"role": "assistant",
"content": "For this pair of shoes you should pay 170 yuan"
}.
Then, in the embodiment of the present application, after the function-call fine-tuning dataset is obtained, it is adjusted to obtain the sample messages for fine-tuning the large model (i.e., steps 3-4 above). The specific process is as follows:
The output parameter information corresponding to the candidate function, namely {'type': 'object', 'properties': {'discounted_price': {'type': 'number', 'description': 'The price after applying the discount'}}}, is obtained and set as the 'returns' field, and is then assembled with the sample question text and the input parameter information of each candidate function, so that the sample question message in the sample message can be constructed:
{
"role": "system",
"content": "You have the following tools that can be called: [{'name': 'calculate_discount', 'description': 'Calculate the discounted price', 'parameters': {'type': 'object', 'properties': {'original_price': {'type': 'number', 'description': 'The original price of the item'}, 'discount_percentage': {'type': 'number', 'description': 'The percentage of discount'}}, 'required': ['original_price', 'discount_percentage']}, 'returns': {'type': 'object', 'properties': {'discounted_price': {'type': 'number', 'description': 'The price after applying the discount'}}}}]"
},{
"role": "user",
"content": "A pair of shoes originally priced at 200 yuan is discounted by 15%; how much do I need to pay?"
}.
The function return value in the sample answer text is replaced with the identification field (taking discounted_price as an example), yielding "For this pair of shoes you should pay {{discounted_price}} yuan"; this is assembled with the function call data in the function call request, so that the reference prediction message in the sample message can be constructed:
{
"role": "assistant",
"content": "{{%% {'name': 'calculate_discount', 'return_vars': ['discounted_price'], 'arguments': {'original_price': 200, 'discount_percentage': 15}} %%}} For this pair of shoes you should pay {{discounted_price}} yuan"
}。
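Steps 3-4 (turning a sample answer text into a reference prediction message) can be sketched as follows; the helper name and message layout follow the examples above and are assumptions for illustration.

```python
def build_reference_message(sample_answer: str, return_value: str,
                            field: str, call_data: str) -> dict:
    # Replace the function return value in the sample answer text with its
    # identification field, then prepend the function call data wrapped in
    # the {{%% ... %%}} identifiers.
    templated = sample_answer.replace(return_value, "{{" + field + "}}")
    return {"role": "assistant",
            "content": "{{%% " + call_data + " %%}} " + templated}

ref = build_reference_message(
    "For this pair of shoes you should pay 170 yuan",
    "170",
    "discounted_price",
    "{'name': 'calculate_discount', 'return_vars': ['discounted_price'], "
    "'arguments': {'original_price': 200, 'discount_percentage': 15}}",
)
print(ref["content"])
```

Pairing this reference prediction message with the corresponding sample question message yields one sample message for fine-tuning.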
In the above manner, a plurality of sample messages corresponding to the required business scenarios can be constructed.
The above-described large model-based function call procedure is described in detail below with reference to fig. 5 by way of specific example, and the procedure specifically includes:
step S501, constructing each sample message;
wherein each sample message includes a sample question message and a reference prediction message corresponding to the sample question message.
Step S502, carrying out iterative training on the initial large model by adopting each constructed sample message to obtain a target large model;
In one iteration of training, the following operations are performed: answer prediction is performed based on one sample question message through the initial large model, and the parameters of the current initial large model are adjusted based on the difference between the obtained prediction message and the reference prediction message of that sample question message.
Step S503, generating a question message based on the question text input by the user and the description text of each candidate function;
wherein, each description text comprises input parameter information and output parameter information of the corresponding candidate function;
Step S504, answer prediction is performed based on the question message through the target large model to obtain a prediction message;
the prediction message comprises function call data corresponding to an objective function and a pre-reply template, wherein the pre-reply template comprises an identification field corresponding to a return value of the objective function;
step S505, analyzing function call data from the predicted message based on the identifier in the predicted message;
the identifier is used for distinguishing function call data and a pre-reply template;
Step S506, calling the target function based on the function calling data to obtain a function return value;
And S507, replacing the identification field in the pre-reply template based on the function return value to obtain an answer text corresponding to the question text.
Based on the same inventive concept, the application also provides a function calling device based on a large model, see fig. 6, which comprises:
a generating module 601, configured to generate a question message based on a question text input by a user and respective description text of each candidate function, where each description text includes input parameter information and output parameter information of the corresponding candidate function;
the prediction module 602 is configured to predict an answer based on a question message through a target big model to obtain a predicted message, where the predicted message includes function call data corresponding to a target function and a pre-reply template, and the pre-reply template includes an identification field corresponding to a return value of the target function;
A calling module 603, configured to call the target function based on the function call data, to obtain a function return value;
And a replacing module 604, configured to replace the identification field in the pre-reply template based on the function return value, so as to obtain an answer text corresponding to the question text.
In a possible implementation manner, the generating module 601 is specifically configured to:
Generating a first text based on the question text and a first preset template, wherein the first text comprises a first text type for identifying the first text as the question text and text content of the question text;
generating a second text based on the respective description text of each candidate function and a second preset template, wherein the second text comprises a second text type for identifying the second text as the description text and text content of each description text;
and generating a question message based on the first text, the second text and a preset prompt word, wherein the prompt word is used for indicating the target big model to output a predicted message based on the question message.
In a possible implementation manner, the generating module 601 is further configured to, before generating the question message based on the question text and the description text of each candidate function:
acquiring a target function call level corresponding to the artificial intelligent AI system, and inquiring candidate functions matched with the target function call level from a function library, wherein the function library stores a plurality of functions and the function call levels corresponding to the functions respectively, or
And acquiring a function set supported by the AI system, and taking each function included in the function set as a candidate function.
In one possible implementation, the predicted message further comprises an identifier for distinguishing function call data from a pre-reply template;
The calling module 603 is further configured to, before making a call to the target function based on the function call data:
Parsing the function call data from the predicted message based on the identifier;
The function call data includes the function identifier of the target function, the identifier and value of each input parameter, and the identifier of the output parameter, wherein the value of each input parameter is parsed from the question text by the target large model.
In a possible implementation manner, the embodiment of the application further comprises a training module, wherein the training module is specifically used for obtaining the target large model by the following modes:
constructing each sample message, wherein the sample message comprises a sample question message and a reference prediction message corresponding to the sample question message;
And carrying out iterative training on the initial large model by adopting each constructed sample message to obtain a target large model, wherein in one iterative training, the following operation is carried out, namely carrying out answer prediction on the basis of one sample questioning message through the initial large model, and adjusting the parameters of the current initial large model on the basis of the difference between the obtained prediction message and the reference prediction message of one sample questioning message.
In one possible implementation, the training module is specifically configured to construct each sample message by:
inputting a sample question text corresponding to the sample message and input parameter information of each candidate function into an initial large model, and carrying out answer prediction through the initial large model to obtain a function call request;
Performing function call based on the function call request to obtain a function return value, and inputting the function return value and a sample question text into an initial large model to obtain a sample answer text;
Based on the sample question text, the input parameter information and the output parameter information of each candidate function, constructing a sample question message in the sample message;
And replacing the function return value in the sample answer text with the identification field, and constructing a reference prediction message in the sample message based on the replaced sample answer text and the function call data in the function call request.
Based on the same inventive concept, the embodiment of the present application further provides an electronic device, where the electronic device may implement the function of the foregoing large model-based function calling device, and referring to fig. 7, the electronic device includes:
at least one processor 701, and a memory 702 connected to the at least one processor 701, in which the specific connection medium between the processor 701 and the memory 702 is not limited in the embodiment of the present application, and in fig. 7, the connection between the processor 701 and the memory 702 through the bus 700 is taken as an example. Bus 700 is shown in bold lines in fig. 7, and the manner in which the other components are connected is illustrated schematically and not by way of limitation. The bus 700 may be divided into an address bus, a data bus, a control bus, etc., and is represented by only one thick line in fig. 7 for convenience of representation, but does not represent only one bus or one type of bus. Alternatively, the processor 701 may be referred to as a controller, and the names are not limited.
In an embodiment of the present application, the memory 702 stores instructions executable by the at least one processor 701, and the at least one processor 701 may perform the large model-based function call method discussed above by executing the instructions stored in the memory 702. The processor 701 may implement the functions of the various modules in the apparatus shown in fig. 6.
The processor 701 is a control center of the apparatus, and may connect various parts of the entire control device using various interfaces and lines, and by executing or executing instructions stored in the memory 702 and invoking data stored in the memory 702, various functions of the apparatus and processing data, thereby performing overall monitoring of the apparatus.
In one possible design, processor 701 may include one or more processing units, and processor 701 may integrate an application processor and a modem processor, wherein the application processor primarily processes operating systems, user interfaces, application programs, and the like, and the modem processor primarily processes wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 701. In some embodiments, processor 701 and memory 702 may be implemented on the same chip, or they may be implemented separately on separate chips in some embodiments.
The processor 701 may be a general purpose processor such as a Central Processing Unit (CPU), digital signal processor, application specific integrated circuit, field programmable gate array or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the large model-based function calling method disclosed in connection with the embodiment of the application can be directly embodied as the execution completion of a hardware processor or the execution completion of the combination execution of hardware and software modules in the processor.
The memory 702, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 702 may include at least one type of storage medium, for example, flash memory, hard disk, multimedia card, card memory, random access memory (Random Access Memory, RAM), static random access memory (Static Random Access Memory, SRAM), programmable read-only memory (Programmable Read-Only Memory, PROM), read-only memory (Read-Only Memory, ROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), magnetic memory, magnetic disk, optical disk, and the like. The memory 702 may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 702 in the embodiments of the present application may also be a circuit or any other device capable of performing a storage function, for storing program instructions and/or data.
By programming the processor 701, the code corresponding to the large model-based function calling method described in the foregoing embodiments may be burned into the chip, so that the chip can execute, at runtime, the steps of the large model-based function calling method of the embodiment shown in fig. 3. How to design and program the processor 701 is a technique well known to those skilled in the art and will not be described in detail herein.
An embodiment of the application also provides a computer-readable storage medium storing the computer-executable instructions required to be executed by the above processor, that is, containing a program for execution by the processor.
In some possible embodiments, aspects of the large model-based function calling method provided by the present application may also be implemented in the form of a program product, which includes program code for causing an electronic device to perform the steps of the large model-based function calling method according to the various exemplary embodiments of the present application as described above in the present specification, when the program product is run on the electronic device.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A method of function invocation based on a large model, the method comprising:
Generating a question message based on a question text input by a user and respective description text of each candidate function, wherein each description text comprises input parameter information and output parameter information of the corresponding candidate function;
performing answer prediction based on the question message through a target large model to obtain a predicted message, wherein the predicted message comprises function call data corresponding to a target function and a pre-reply template, the pre-reply template comprises an identification field corresponding to a return value of the target function, and the target function is a candidate function related to the question text;
Calling the target function based on the function calling data to obtain a function return value;
and replacing the identification field in the pre-reply template based on the function return value to obtain an answer text corresponding to the question text.
2. The method of claim 1, wherein generating a question message based on the question text and the respective descriptive text for each candidate function comprises:
generating a first text based on the question text and a first preset template, wherein the first text comprises a first text type for identifying the first text as the question text and text content of the question text;
Generating a second text based on the respective description text of each candidate function and a second preset template, wherein the second text comprises a second text type for identifying the second text as the description text and text content of each description text;
and generating the question message based on the first text, the second text and a preset prompt word, wherein the prompt word is used for instructing the target large model to output a predicted message based on the question message.
3. The method of claim 1, wherein prior to generating the question message based on the question text and the respective descriptive text for each candidate function, the method further comprises:
Acquiring a target function call level corresponding to an artificial intelligent AI system, and inquiring candidate functions matched with the target function call level from a function library, wherein the function library stores a plurality of functions and function call levels corresponding to the functions respectively, or
And acquiring a function set supported by the AI system, and taking each function included in the function set as the candidate function.
4. The method of claim 1, wherein the predicted message further includes an identifier for distinguishing the function call data from a pre-reply template;
before the calling the target function based on the function calling data, the method further comprises:
Parsing the function call data from the predicted message based on the identifier;
the function call data comprises function identification of the target function, identification and value of each input parameter and identification of output parameter, wherein the value of each input parameter is analyzed from the problem text through the target large model.
5. The method according to any one of claims 1 to 4, wherein the target large model is obtained by:
Constructing each sample message, wherein the sample message comprises a sample question message and a reference prediction message corresponding to the sample question message;
and performing iterative training on an initial large model by using the constructed sample messages to obtain the target large model, wherein in one round of iterative training the following operations are performed: performing answer prediction on one sample question message through the initial large model, and adjusting parameters of the current initial large model based on the difference between the obtained predicted message and the reference predicted message of the one sample question message.
6. The method of claim 5, wherein each sample message is constructed by:
Inputting a sample question text corresponding to the sample message and input parameter information of each candidate function into an initial large model, and carrying out answer prediction through the initial large model to obtain a function call request;
Performing function call based on the function call request to obtain a function return value, and inputting the function return value and the sample question text into the initial large model to obtain a sample answer text;
based on the sample question text, input parameter information and output parameter information of each candidate function, constructing a sample question message in the sample message;
And replacing the function return value in the sample answer text with an identification field, and constructing a reference prediction message in the sample message based on the replaced sample answer text and the function call data in the function call request.
7. A large model-based function call apparatus, the apparatus comprising:
the generating module is used for generating a question message based on a question text input by a user and respective description text of each candidate function, wherein each description text comprises input parameter information and output parameter information of the corresponding candidate function;
the prediction module is used for performing answer prediction based on the question message through a target large model to obtain a predicted message, wherein the predicted message comprises function call data corresponding to a target function and a pre-reply template, and the pre-reply template comprises an identification field corresponding to a return value of the target function;
the calling module is used for calling the target function based on the function calling data to obtain a function return value;
and the replacing module is used for replacing the identification field in the pre-reply template based on the function return value to obtain an answer text corresponding to the question text.
8. An electronic device, comprising:
A memory for storing a computer program;
A processor for implementing the method of any of claims 1-6 when executing a computer program stored on the memory.
9. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the method of any of claims 1-6.
10. A computer program product, characterized in that the computer program product comprises computer program code which, when run on a computer, causes the computer to perform the method as claimed in any of the preceding claims 1-6.
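The single-turn flow recited in claim 1 can be sketched as follows. This is a hypothetical illustration only: the function names, the message layout, the stubbed model output, and the "{{RESULT}}" identification field are assumptions for the sketch, not values specified by the claims.

```python
import json

# Description text for one candidate function: its input and output
# parameter information, as claim 1 requires for the question message.
CANDIDATE_FUNCTIONS = {
    "get_weather": {
        "description": "Query the current weather for a city.",
        "input_params": {"city": "string"},
        "output_params": {"temperature": "string"},
    }
}

def build_question_message(question_text: str) -> str:
    """Combine the user's question text with each candidate function's
    description text into a single question message."""
    return json.dumps({"question": question_text,
                       "functions": CANDIDATE_FUNCTIONS})

def predict(question_message: str) -> dict:
    """Stand-in for the target large model: in the patented scheme the
    model returns function-call data for the target function plus a
    pre-reply template containing an identification field."""
    return {
        "function_call": {"name": "get_weather",
                          "arguments": {"city": "Hangzhou"}},
        "pre_reply_template": "The temperature in Hangzhou is {{RESULT}}.",
    }

def call_function(call_data: dict) -> str:
    """Invoke the target function named in the function-call data
    (stubbed here with a fixed return value)."""
    if call_data["name"] == "get_weather":
        return "21°C"
    raise ValueError("unknown function: " + call_data["name"])

def answer(question_text: str) -> str:
    message = build_question_message(question_text)
    predicted = predict(message)
    return_value = call_function(predicted["function_call"])
    # Replace the identification field in the pre-reply template with the
    # actual function return value to obtain the final answer text.
    return predicted["pre_reply_template"].replace("{{RESULT}}", return_value)

print(answer("What's the weather in Hangzhou?"))
```

Because the model emits the reply template in the same predicted message as the call data, only one model invocation is needed per question; the return value is spliced in locally, which is the efficiency gain the abstract describes.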
CN202510284613.0A 2025-03-11 2025-03-11 Function calling method, device and equipment based on large model Active CN119807383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510284613.0A CN119807383B (en) 2025-03-11 2025-03-11 Function calling method, device and equipment based on large model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510284613.0A CN119807383B (en) 2025-03-11 2025-03-11 Function calling method, device and equipment based on large model

Publications (2)

Publication Number Publication Date
CN119807383A true CN119807383A (en) 2025-04-11
CN119807383B CN119807383B (en) 2025-06-06

Family

ID=95265037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510284613.0A Active CN119807383B (en) 2025-03-11 2025-03-11 Function calling method, device and equipment based on large model

Country Status (1)

Country Link
CN (1) CN119807383B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN121116439A (en) * 2025-11-10 2025-12-12 北京安证通信息科技股份有限公司 Large model function calling method, device and equipment set medium based on permission perception

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190081914A1 (en) * 2017-09-08 2019-03-14 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for generating candidate reply message
CN116483980A (en) * 2023-05-30 2023-07-25 科大讯飞股份有限公司 Human-computer interaction method, device and system
CN118114771A (en) * 2024-04-25 2024-05-31 蚂蚁科技集团股份有限公司 Function tool calling method, device, medium and equipment in trusted execution environment
JP2024178148A (en) * 2023-12-13 2024-12-24 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Answer feedback method, device, electronic device, storage medium, and computer program applicable to large-scale language models
WO2025024326A2 (en) * 2023-07-21 2025-01-30 Istari Digital, Inc Generative artificial intelligence (ai) for digital workflows

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LOVE YOU JOYFULLY: "lmstdio large model — local large-model Python function-call design", Retrieved from the Internet <URL:《https://blog.csdn.net/weixin_60223645/article/details/145887588》> *

Also Published As

Publication number Publication date
CN119807383B (en) 2025-06-06

Similar Documents

Publication Publication Date Title
CN114997401B (en) Adaptive inference acceleration method, apparatus, computer device, and storage medium
CN112149838A (en) Method, device, electronic equipment and storage medium for realizing automatic model building
CN110109750A (en) Virtual resource acquisition methods, device, computer equipment and storage medium
CN110633959A (en) Method, device, equipment and medium for creating approval task based on graph structure
CN119807383B (en) Function calling method, device and equipment based on large model
CN116205286A (en) Task processing model generation method and device, electronic equipment and storage medium
CN118569383A (en) A method and device for constructing an AI large model intelligent agent
CN115239068A (en) Target task decision method and device, electronic equipment and storage medium
CN107967304A (en) Session interaction processing method, device and electronic equipment
CN118227107A (en) Code generation model training method, code generation method, and device
CN118567712B (en) Application publishing method, system and computer storage medium
CN113946363A (en) Execution configuration method, device, computer equipment and storage medium for service data
CN110717537B (en) Method and device for training user classification model and executing user classification prediction
CN113191527A (en) Prediction method and device for population prediction based on prediction model
CN118227123A (en) Form control adjustment method, device and electronic device
CN117472431A (en) Code annotation generation method, device, computer equipment, storage medium and product
CN116910567A (en) Online training sample construction method and related device for recommended service
CN113391810B (en) A parsing method and system based on application scenario graph
CN114493360A (en) Process creative evaluation method, device, equipment and medium based on RPA and AI
CN118260341B (en) Business processing method, device, computer equipment, storage medium and program product
CN112732243A (en) Data processing method and device for generating functional component
CN120560620B (en) Methods, apparatus, equipment and storage media for building business models
CN119814546B (en) Service configuration scheme generation method, device, computer equipment and storage medium
CN118228050B (en) Time series data generation method, storage medium, electronic device, and computer program
CN115167833B (en) Programming method, executable program execution method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant